Claude Code Tool Execution: Scheduling, Streaming, and Governance
How partitionToolCalls batches concurrent-safe operations, StreamingToolExecutor manages backpressure, and why tool execution is a subsystem rather than a callback.
query.ts does not execute tools directly.
It hands tool work to services/tools/ — a subsystem that treats tool execution as a pipeline with scheduling, governance, abort control, progress, and transcript shaping all included.
Once you see that, "the model called a tool" stops being the right mental model. The model emits a tool_use block. The execution subsystem turns that into governed work.
File structure
src/
├── services/
│ └── tools/
│ ├── toolOrchestration.ts
│ ├── toolExecution.ts
│ ├── StreamingToolExecutor.ts
│ └── toolResultPersistence.ts
src/services/tools/toolOrchestration.ts
Batch scheduler for concurrent and serial tool execution
Partitions a batch of tool_use blocks into concurrent and serial groups using isConcurrencySafe, runs each group with the right strategy, applies context modifiers in order after each concurrent batch, and threads updated context forward.
Key Exports
- runTools
- partitionToolCalls
Why It Matters
- Concurrency is determined per-invocation, not per-tool-type.
- Context modifiers from concurrent tools are applied in original order after the batch completes.
- CLAUDE_CODE_MAX_TOOL_USE_CONCURRENCY (default 10) caps parallel execution.
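As a rough illustration of how such a cap can work, here is a minimal sketch: an env-var reader with the documented default of 10, plus a small worker-pool limiter. The helper names and signatures are invented for this sketch; only the variable name CLAUDE_CODE_MAX_TOOL_USE_CONCURRENCY and the default come from the doc.

```typescript
// Hypothetical sketch of the concurrency cap described above. Not the real
// implementation; only the env var name and default value are from the doc.
const DEFAULT_MAX_TOOL_USE_CONCURRENCY = 10;

export function getMaxToolUseConcurrency(env: Record<string, string | undefined>): number {
  const raw = env.CLAUDE_CODE_MAX_TOOL_USE_CONCURRENCY;
  const parsed = raw !== undefined ? Number.parseInt(raw, 10) : NaN;
  // Fall back to the default on missing, non-numeric, or non-positive values.
  return Number.isFinite(parsed) && parsed > 0 ? parsed : DEFAULT_MAX_TOOL_USE_CONCURRENCY;
}

// A minimal worker-pool limiter: runs tasks concurrently, never more than
// `limit` in flight at once, preserving result order by index.
export async function runWithLimit<T>(tasks: (() => Promise<T>)[], limit: number): Promise<T[]> {
  const results: T[] = new Array(tasks.length);
  let next = 0;
  async function worker(): Promise<void> {
    while (next < tasks.length) {
      const i = next++;
      results[i] = await tasks[i]();
    }
  }
  await Promise.all(Array.from({ length: Math.min(limit, tasks.length) }, worker));
  return results;
}
```

The pool pattern (N workers draining a shared index) is one common way to cap parallelism without a dependency; the real scheduler may do this differently.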
src/services/tools/toolExecution.ts
Per-tool control plane
Resolves tool by name (with alias fallback), validates abort state, runs pre-tool hooks, resolves permission decisions, executes with progress and abort handling, runs post-success and post-failure hooks, and emits transcript-safe result messages with telemetry.
Key Exports
- runToolUse
- classifyToolError
Why It Matters
- runToolUse is a generator: it yields message updates as the tool progresses, not a batch at the end.
- Tool name resolution includes alias fallback for deprecated tool names found in old transcripts.
- Unknown tool calls produce error tool_result messages — they do not crash the loop.
src/services/tools/StreamingToolExecutor.ts
Incremental tool scheduler during assistant response streaming
Starts tool work while the assistant response is still streaming. Buffers incomplete inputs, handles streaming fallback with discard and tombstoning, preserves sibling ordering constraints, and yields progress events alongside final results.
Key Exports
- StreamingToolExecutor
Why It Matters
- Claude Code starts executing tools before the full assistant message arrives.
- Streaming fallback requires explicit discard of the old executor and creation of a new one.
- Tombstone messages tell the UI and transcript to remove orphaned partial messages.
The scheduling layer: partitionToolCalls
The most important invariant in toolOrchestration.ts is how concurrency is decided:
concurrency is per-invocation, not per-tool-type.
partitionToolCalls groups adjacent concurrent-safe calls into one batch, then groups the next set of exclusive calls into a serial batch, and so on. The batching respects the original order from the model — it never reorders calls.
export async function* runTools(toolUseMessages, assistantMessages, canUseTool, toolUseContext) {
  let currentContext = toolUseContext
  for (const { isConcurrencySafe, blocks } of partitionToolCalls(toolUseMessages, currentContext)) {
    if (isConcurrencySafe) {
      // Run this batch concurrently — collect context modifiers, apply in order after
      const queuedContextModifiers = {}
      for await (const update of runToolsConcurrently(blocks, ...)) {
        if (update.contextModifier) {
          queuedContextModifiers[update.contextModifier.toolUseID] ??= []
          queuedContextModifiers[update.contextModifier.toolUseID].push(update.contextModifier.modifyContext)
        }
        yield { message: update.message, newContext: currentContext }
      }
      // Apply context modifiers in original block order — not concurrent arrival order
      for (const block of blocks) {
        for (const modifier of queuedContextModifiers[block.id] ?? []) {
          currentContext = modifier(currentContext)
        }
      }
    } else {
      // Serial batch — each tool sees the previous tool's updated context
      for await (const update of runToolsSerially(blocks, ...)) {
        if (update.newContext) currentContext = update.newContext
        yield { message: update.message, newContext: currentContext }
      }
    }
  }
}
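The partition step itself can be sketched as a simple run-length grouping. This is an illustrative reconstruction, not the real function: the block/batch types and the shape of the isConcurrencySafe predicate are assumptions; only the behavior (adjacent same-class calls grouped, original order preserved) comes from the description above.

```typescript
// Sketch of partitionToolCalls: group adjacent blocks with the same
// concurrency classification into runs, never reordering. Types are assumed.
interface ToolUseBlock { id: string; name: string }
interface Batch { isConcurrencySafe: boolean; blocks: ToolUseBlock[] }

export function partitionToolCalls(
  blocks: ToolUseBlock[],
  isConcurrencySafe: (b: ToolUseBlock) => boolean // per-invocation, not per-tool-type
): Batch[] {
  const batches: Batch[] = [];
  for (const block of blocks) {
    const safe = isConcurrencySafe(block);
    const last = batches[batches.length - 1];
    if (last && last.isConcurrencySafe === safe) {
      last.blocks.push(block); // extend the current run
    } else {
      batches.push({ isConcurrencySafe: safe, blocks: [block] }); // start a new run
    }
  }
  return batches;
}
```

So a model output of [read, read, write, read] becomes three batches: a concurrent pair, a serial singleton, then another concurrent singleton, executed strictly in that order.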
The context modifier ordering is the subtle detail: when running concurrently, all tools start with the same currentContext. Their contextModifier callbacks (which mutate the context — e.g., updating file state cache, adding a tracked path) are applied after the batch, in the original block order, not the arrival order of completions. This prevents nondeterministic context state.
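To make the determinism point concrete, here is a toy simulation. All types and names below are invented for illustration; the real ToolUseContext and modifier shapes are richer.

```typescript
// Toy demonstration of block-order application: whichever tool finishes
// first, the final context is identical, because application follows the
// model's original block order, not completion order.
type Ctx = { trackedPaths: string[] };
type Modifier = (c: Ctx) => Ctx;

function applyInBlockOrder(blockOrder: string[], arrived: Map<string, Modifier>, ctx: Ctx): Ctx {
  // Modifiers arrive keyed by toolUseID in completion order (the Map's
  // insertion order), but are applied by walking blockOrder.
  for (const id of blockOrder) {
    const mod = arrived.get(id);
    if (mod) ctx = mod(ctx);
  }
  return ctx;
}

const addPath = (p: string): Modifier => (c) => ({ trackedPaths: [...c.trackedPaths, p] });

// Two different completion orders for the same concurrent batch…
const fastBWins = new Map<string, Modifier>([["b", addPath("b.ts")], ["a", addPath("a.ts")]]);
const fastAWins = new Map<string, Modifier>([["a", addPath("a.ts")], ["b", addPath("b.ts")]]);

// …produce the identical final context, because application follows ["a", "b"].
export const r1 = applyInBlockOrder(["a", "b"], fastBWins, { trackedPaths: [] });
export const r2 = applyInBlockOrder(["a", "b"], fastAWins, { trackedPaths: [] });
```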
The per-tool pipeline: runToolUse
runToolUse is a generator that handles one tool_use block from end to end. It is the stable path for every tool call, from tool_use block to transcript-visible result. In order:
1. Resolve tool by name, with alias fallback (findToolByName / aliases). Looks up the tool in the active registry. If not found, checks alias names for deprecated tool names (e.g. old transcripts calling 'KillShell', now 'TaskStop'). Unknown tools yield an error tool_result — the loop keeps running.
2. Check abort signal (abortController.signal.aborted). If the session abort controller is already signaled, logs a cancellation event and yields no messages — the tool is skipped cleanly.
3. Run pre-tool governance (runPreToolUseHooks / resolveHookPermissionDecision). Validates input schema, runs PreToolUse hooks, and merges hook outcomes with the permission system. A hook returning 'allow' does not bypass deny or safety-check rules.
4. Execute with progress and abort control (checkPermissionsAndCallTool / onProgress). Calls tool.call() with progress callbacks and the shared abort controller. Progress events yield intermediate messages visible in the UI before the tool finishes.
5. Serialize outcome as transcript messages (addToolResult / runPostToolUseHooks / runPostToolUseFailureHooks). Success and failure both become user messages with tool_result content. Images, structured output, large results (persisted to disk), and MCP metadata are all handled here.
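The five steps can be sketched as a generator skeleton. Everything here is an assumed simplification: the message shapes, the lookupTool parameter, and the collapsed governance step are placeholders, and the real runToolUse yields richer progress and telemetry.

```typescript
// Skeletal sketch of the runToolUse pipeline as an async generator.
// All types and helper names are illustrative, not the real API.
interface ToolUse { id: string; name: string; input: unknown }
interface Msg { type: "progress" | "tool_result"; toolUseId: string; content: string; isError?: boolean }
interface ToolLike { call(input: unknown): Promise<string> }

export async function* runToolUseSketch(
  block: ToolUse,
  lookupTool: (name: string) => ToolLike | undefined, // registry + alias fallback elided
  signal: { aborted: boolean }
): AsyncGenerator<Msg> {
  // 1. Resolve tool. Unknown tools become error tool_results, not crashes.
  const tool = lookupTool(block.name);
  if (!tool) {
    yield { type: "tool_result", toolUseId: block.id, content: `Unknown tool: ${block.name}`, isError: true };
    return;
  }
  // 2. Abort already signaled: skip cleanly, yielding nothing.
  if (signal.aborted) return;
  // 3. Governance (schema validation, PreToolUse hooks, permissions) would run here.
  // 4. Execute; progress callbacks would yield intermediate "progress" messages.
  try {
    const output = await tool.call(block.input);
    // 5. Serialize success as a transcript-visible tool_result message.
    yield { type: "tool_result", toolUseId: block.id, content: output };
  } catch (err) {
    // Failure also becomes a tool_result message (post-failure hooks elided).
    yield { type: "tool_result", toolUseId: block.id, content: String(err), isError: true };
  }
}
```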
StreamingToolExecutor: speculation during the stream
The default execution model would be: wait for the full assistant response, then execute all tool calls. StreamingToolExecutor changes this.
As the assistant response streams in and tool_use blocks complete their JSON parsing, StreamingToolExecutor can start executing them immediately. The model's output latency overlaps with tool execution latency.
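The core enabling detail is that a tool_use block's input arrives as partial JSON fragments, and execution can only start once a block's input parses completely. Here is a minimal buffering sketch, assuming parse-on-every-fragment; the real executor's buffering is certainly more involved.

```typescript
// Illustrative buffer (not the real class): accumulate streamed JSON
// fragments for one tool_use input, returning the parsed value once the
// accumulated text is complete, valid JSON.
class StreamingInputBuffer {
  private partial = "";

  /** Append a fragment; returns the parsed input once it is complete JSON. */
  push(fragment: string): unknown | undefined {
    this.partial += fragment;
    try {
      const input = JSON.parse(this.partial); // complete: ready to execute
      this.partial = "";
      return input;
    } catch {
      return undefined; // still incomplete: keep buffering
    }
  }
}

const buf = new StreamingInputBuffer();
export const first = buf.push('{"file_path": "/tmp');  // incomplete -> undefined
export const second = buf.push('/a.txt"}');            // complete -> parsed object
```

The moment push returns a parsed object, an executor could dispatch that tool while later blocks of the same assistant message are still streaming in — which is exactly the latency overlap described above.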
When streaming fails and a fallback model is tried:
- The old StreamingToolExecutor is discarded — discard() marks it as dead
- Orphaned partial messages from the failed attempt are yielded as tombstone events — the UI and transcript remove them
- A fresh executor is created for the fallback attempt
if (streamingFallbackOccurred) {
  for (const msg of assistantMessages) {
    yield { type: 'tombstone', message: msg }
  }
  if (streamingToolExecutor) {
    streamingToolExecutor.discard()
    streamingToolExecutor = new StreamingToolExecutor(tools, canUseTool, toolUseContext)
  }
}
This is why tombstone messages exist: they are not error signals, they are cleanup signals for a specific streaming recovery path.
The hook invariant: hooks cannot override deny or safety checks
One of the most important policy decisions in toolHooks.ts is resolveHookPermissionDecision:
A PreToolUse hook returning allow does not bypass deny rules or safety-check rules.
Hooks operate inside the governance system. They can approve requests that would otherwise prompt the user — but they cannot override rules that the user or administrator explicitly set as deny, and they cannot skip safety-check gates (.git/, .claude/, shell configs).
This keeps hooks powerful without making them a back door around policy.
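The merge order can be sketched as a small decision function. The function name, the Decision type, and the parameter shapes are assumptions for illustration; only the invariant itself (deny rules and safety checks outrank a hook's 'allow') is from the doc.

```typescript
// Sketch of the hook/permission merge invariant. Names are illustrative.
type Decision = "allow" | "deny" | "ask";

export function resolvePermissionSketch(
  hookDecision: Decision | undefined,  // outcome of PreToolUse hooks, if any
  ruleDecision: Decision | undefined,  // user/admin-configured permission rules
  safetyCheckBlocked: boolean          // e.g. writes touching .git/, .claude/, shell configs
): Decision {
  if (safetyCheckBlocked) return "deny";        // safety gates are absolute
  if (ruleDecision === "deny") return "deny";   // explicit deny cannot be hooked around
  if (hookDecision === "deny") return "deny";   // hooks may tighten policy…
  if (hookDecision === "allow") return "allow"; // …or skip the user prompt
  return ruleDecision ?? "ask";                 // otherwise defer to rules, else prompt
}
```

Note the asymmetry: a hook's 'deny' always sticks, but its 'allow' only wins when nothing stricter is in play.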
In Claude Code's tool orchestration, why are context modifiers from a concurrent batch applied in original block order rather than the order tools completed?
Hard. When a batch of concurrency-safe tools runs in parallel, each tool can return a contextModifier that mutates ToolUseContext. The modifiers are collected, then applied after the batch.
- A. For performance: applying in order is faster than sorting by completion time. Incorrect: order of application doesn't affect performance.
- B. To produce deterministic context state regardless of which tool finishes first. Correct: two concurrent tools could both return context modifiers. If applied in completion order, the final context would vary run-to-run depending on scheduling. Applying in original block order makes the result deterministic.
- C. Because context modifiers can only be created by the last tool in the batch. Incorrect: any tool in a concurrent batch can return a contextModifier.
- D. To match the order the model expects tool results to appear in the transcript. Incorrect: transcript order is handled separately from context mutation order.