Claude Code Tool Execution: Scheduling, Streaming, and Governance
How partitionToolCalls batches concurrent-safe operations, StreamingToolExecutor manages backpressure, and why tool execution is a subsystem rather than a callback.
query.ts does not execute tools directly.
It hands tool work to services/tools/ — a subsystem that treats tool execution as a pipeline with scheduling, governance, abort control, progress, and transcript shaping all included.
Once you see that, "the model called a tool" stops being the right mental model. The model emits a tool_use block. The execution subsystem turns that into governed work.
File structure
src/
├── services/
│ └── tools/
│ ├── toolOrchestration.ts
│ ├── toolExecution.ts
│ ├── StreamingToolExecutor.ts
│ └── toolResultPersistence.ts
src/services/tools/toolOrchestration.ts
Batch scheduler for concurrent and serial tool execution
Partitions a batch of tool_use blocks into concurrent and serial groups using isConcurrencySafe, runs each group with the right strategy, applies context modifiers in order after each concurrent batch, and threads updated context forward.
Key Exports
- runTools
- partitionToolCalls
Why It Matters
- Concurrency is determined per-invocation, not per-tool-type.
- Context modifiers from concurrent tools are applied in original order after the batch completes.
- CLAUDE_CODE_MAX_TOOL_USE_CONCURRENCY (default 10) caps parallel execution.
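As a rough illustration of how such a cap can work, here is a minimal sketch: an env-var reader with the documented default of 10, plus a small worker-pool limiter. The helper names and signatures are invented for this sketch; only the variable name CLAUDE_CODE_MAX_TOOL_USE_CONCURRENCY and the default come from the doc.

```typescript
// Hypothetical sketch of the concurrency cap described above. Not the real
// implementation; only the env var name and default value are from the doc.
const DEFAULT_MAX_TOOL_USE_CONCURRENCY = 10;

export function getMaxToolUseConcurrency(env: Record<string, string | undefined>): number {
  const raw = env.CLAUDE_CODE_MAX_TOOL_USE_CONCURRENCY;
  const parsed = raw !== undefined ? Number.parseInt(raw, 10) : NaN;
  // Fall back to the default on missing, non-numeric, or non-positive values.
  return Number.isFinite(parsed) && parsed > 0 ? parsed : DEFAULT_MAX_TOOL_USE_CONCURRENCY;
}

// A minimal worker-pool limiter: runs tasks concurrently, never more than
// `limit` in flight at once, preserving result order by index.
export async function runWithLimit<T>(tasks: (() => Promise<T>)[], limit: number): Promise<T[]> {
  const results: T[] = new Array(tasks.length);
  let next = 0;
  async function worker(): Promise<void> {
    while (next < tasks.length) {
      const i = next++;
      results[i] = await tasks[i]();
    }
  }
  await Promise.all(Array.from({ length: Math.min(limit, tasks.length) }, worker));
  return results;
}
```

The pool pattern (N workers draining a shared index) is one common way to cap parallelism without a dependency; the real scheduler may do this differently.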
src/services/tools/toolExecution.ts
Per-tool control plane
Resolves tool by name (with alias fallback), validates abort state, runs pre-tool hooks, resolves permission decisions, executes with progress and abort handling, runs post-success and post-failure hooks, and emits transcript-safe result messages with telemetry.
Key Exports
- runToolUse
- classifyToolError
Why It Matters
- runToolUse is a generator: it yields message updates as the tool progresses, not a batch at the end.
- Tool name resolution includes alias fallback for deprecated tool names found in old transcripts.
- Unknown tool calls produce error tool_result messages — they do not crash the loop.
src/services/tools/StreamingToolExecutor.ts
Incremental tool scheduler during assistant response streaming
Starts tool work while the assistant response is still streaming. Buffers incomplete inputs, handles streaming fallback with discard and tombstoning, preserves sibling ordering constraints, and yields progress events alongside final results.
Key Exports
- StreamingToolExecutor
Why It Matters
- Claude Code starts executing tools before the full assistant message arrives.
- Streaming fallback requires explicit discard of the old executor and creation of a new one.
- Tombstone messages tell the UI and transcript to remove orphaned partial messages.
The scheduling layer: partitionToolCalls
The most important invariant in toolOrchestration.ts is how concurrency is decided:
concurrency is per-invocation, not per-tool-type.
partitionToolCalls groups adjacent concurrent-safe calls into one batch, then groups the next set of exclusive calls into a serial batch, and so on. The batching respects the original order from the model — it never reorders calls.
export async function* runTools(toolUseMessages, assistantMessages, canUseTool, toolUseContext) {
  let currentContext = toolUseContext
  for (const { isConcurrencySafe, blocks } of partitionToolCalls(toolUseMessages, currentContext)) {
    if (isConcurrencySafe) {
      // Run this batch concurrently — collect context modifiers, apply in order after
      const queuedContextModifiers = {}
      for await (const update of runToolsConcurrently(blocks, ...)) {
        if (update.contextModifier) {
          queuedContextModifiers[update.contextModifier.toolUseID] ??= []
          queuedContextModifiers[update.contextModifier.toolUseID].push(update.contextModifier.modifyContext)
        }
        yield { message: update.message, newContext: currentContext }
      }
      // Apply context modifiers in original block order — not concurrent arrival order
      for (const block of blocks) {
        for (const modifier of queuedContextModifiers[block.id] ?? []) {
          currentContext = modifier(currentContext)
        }
      }
    } else {
      // Serial batch — each tool sees the previous tool's updated context
      for await (const update of runToolsSerially(blocks, ...)) {
        if (update.newContext) currentContext = update.newContext
        yield { message: update.message, newContext: currentContext }
      }
    }
  }
}
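The partition step itself can be sketched as a simple run-length grouping. This is an illustrative reconstruction, not the real function: the block/batch types and the shape of the isConcurrencySafe predicate are assumptions; only the behavior (adjacent same-class calls grouped, original order preserved) comes from the description above.

```typescript
// Sketch of partitionToolCalls: group adjacent blocks with the same
// concurrency classification into runs, never reordering. Types are assumed.
interface ToolUseBlock { id: string; name: string }
interface Batch { isConcurrencySafe: boolean; blocks: ToolUseBlock[] }

export function partitionToolCalls(
  blocks: ToolUseBlock[],
  isConcurrencySafe: (b: ToolUseBlock) => boolean // per-invocation, not per-tool-type
): Batch[] {
  const batches: Batch[] = [];
  for (const block of blocks) {
    const safe = isConcurrencySafe(block);
    const last = batches[batches.length - 1];
    if (last && last.isConcurrencySafe === safe) {
      last.blocks.push(block); // extend the current run
    } else {
      batches.push({ isConcurrencySafe: safe, blocks: [block] }); // start a new run
    }
  }
  return batches;
}
```

So a model output of [read, read, write, read] becomes three batches: a concurrent pair, a serial singleton, then another concurrent singleton, executed strictly in that order.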
The context modifier ordering is the subtle detail: when running concurrently, all tools start with the same currentContext. Their contextModifier callbacks (which mutate the context — e.g., updating file state cache, adding a tracked path) are applied after the batch, in the original block order, not the arrival order of completions. This prevents nondeterministic context state.
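To make the determinism point concrete, here is a toy simulation. All types and names below are invented for illustration; the real ToolUseContext and modifier shapes are richer.

```typescript
// Toy demonstration of block-order application: whichever tool finishes
// first, the final context is identical, because application follows the
// model's original block order, not completion order.
type Ctx = { trackedPaths: string[] };
type Modifier = (c: Ctx) => Ctx;

function applyInBlockOrder(blockOrder: string[], arrived: Map<string, Modifier>, ctx: Ctx): Ctx {
  // Modifiers arrive keyed by toolUseID in completion order (the Map's
  // insertion order), but are applied by walking blockOrder.
  for (const id of blockOrder) {
    const mod = arrived.get(id);
    if (mod) ctx = mod(ctx);
  }
  return ctx;
}

const addPath = (p: string): Modifier => (c) => ({ trackedPaths: [...c.trackedPaths, p] });

// Two different completion orders for the same concurrent batch…
const fastBWins = new Map<string, Modifier>([["b", addPath("b.ts")], ["a", addPath("a.ts")]]);
const fastAWins = new Map<string, Modifier>([["a", addPath("a.ts")], ["b", addPath("b.ts")]]);

// …produce the identical final context, because application follows ["a", "b"].
export const r1 = applyInBlockOrder(["a", "b"], fastBWins, { trackedPaths: [] });
export const r2 = applyInBlockOrder(["a", "b"], fastAWins, { trackedPaths: [] });
```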
The per-tool pipeline: runToolUse
runToolUse is a generator that handles one tool_use block from end to end. It is the stable path for every tool call, from tool_use block to transcript-visible result. In order:
1. Resolve tool by name, with alias fallback (findToolByName / aliases). Looks up the tool in the active registry. If not found, checks alias names for deprecated tool names (e.g. old transcripts calling 'KillShell', now 'TaskStop'). Unknown tools yield an error tool_result — the loop keeps running.
2. Check abort signal (abortController.signal.aborted). If the session abort controller is already signaled, logs a cancellation event and yields no messages — the tool is skipped cleanly.
3. Run pre-tool governance (runPreToolUseHooks / resolveHookPermissionDecision). Validates input schema, runs PreToolUse hooks, and merges hook outcomes with the permission system. A hook returning 'allow' does not bypass deny or safety-check rules.
4. Execute with progress and abort control (checkPermissionsAndCallTool / onProgress). Calls tool.call() with progress callbacks and the shared abort controller. Progress events yield intermediate messages visible in the UI before the tool finishes.
5. Serialize outcome as transcript messages (addToolResult / runPostToolUseHooks / runPostToolUseFailureHooks). Success and failure both become user messages with tool_result content. Images, structured output, large results (persisted to disk), and MCP metadata are all handled here.
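The five steps can be sketched as a generator skeleton. Everything here is an assumed simplification: the message shapes, the lookupTool parameter, and the collapsed governance step are placeholders, and the real runToolUse yields richer progress and telemetry.

```typescript
// Skeletal sketch of the runToolUse pipeline as an async generator.
// All types and helper names are illustrative, not the real API.
interface ToolUse { id: string; name: string; input: unknown }
interface Msg { type: "progress" | "tool_result"; toolUseId: string; content: string; isError?: boolean }
interface ToolLike { call(input: unknown): Promise<string> }

export async function* runToolUseSketch(
  block: ToolUse,
  lookupTool: (name: string) => ToolLike | undefined, // registry + alias fallback elided
  signal: { aborted: boolean }
): AsyncGenerator<Msg> {
  // 1. Resolve tool. Unknown tools become error tool_results, not crashes.
  const tool = lookupTool(block.name);
  if (!tool) {
    yield { type: "tool_result", toolUseId: block.id, content: `Unknown tool: ${block.name}`, isError: true };
    return;
  }
  // 2. Abort already signaled: skip cleanly, yielding nothing.
  if (signal.aborted) return;
  // 3. Governance (schema validation, PreToolUse hooks, permissions) would run here.
  // 4. Execute; progress callbacks would yield intermediate "progress" messages.
  try {
    const output = await tool.call(block.input);
    // 5. Serialize success as a transcript-visible tool_result message.
    yield { type: "tool_result", toolUseId: block.id, content: output };
  } catch (err) {
    // Failure also becomes a tool_result message (post-failure hooks elided).
    yield { type: "tool_result", toolUseId: block.id, content: String(err), isError: true };
  }
}
```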
StreamingToolExecutor: speculation during the stream
The default execution model would be: wait for the full assistant response, then execute all tool calls. StreamingToolExecutor changes this.
As the assistant response streams in and tool_use blocks complete their JSON parsing, StreamingToolExecutor can start executing them immediately. The model's output latency overlaps with tool execution latency.
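The core enabling detail is that a tool_use block's input arrives as partial JSON fragments, and execution can only start once a block's input parses completely. Here is a minimal buffering sketch, assuming parse-on-every-fragment; the real executor's buffering is certainly more involved.

```typescript
// Illustrative buffer (not the real class): accumulate streamed JSON
// fragments for one tool_use input, returning the parsed value once the
// accumulated text is complete, valid JSON.
class StreamingInputBuffer {
  private partial = "";

  /** Append a fragment; returns the parsed input once it is complete JSON. */
  push(fragment: string): unknown | undefined {
    this.partial += fragment;
    try {
      const input = JSON.parse(this.partial); // complete: ready to execute
      this.partial = "";
      return input;
    } catch {
      return undefined; // still incomplete: keep buffering
    }
  }
}

const buf = new StreamingInputBuffer();
export const first = buf.push('{"file_path": "/tmp');  // incomplete -> undefined
export const second = buf.push('/a.txt"}');            // complete -> parsed object
```

The moment push returns a parsed object, an executor could dispatch that tool while later blocks of the same assistant message are still streaming in — which is exactly the latency overlap described above.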
When streaming fails and a fallback model is tried:
- The old StreamingToolExecutor is discarded — discard() marks it as dead
- Orphaned partial messages from the failed attempt are yielded as tombstone events — the UI and transcript remove them
- A fresh executor is created for the fallback attempt
if (streamingFallbackOccurred) {
  for (const msg of assistantMessages) {
    yield { type: 'tombstone', message: msg }
  }
  if (streamingToolExecutor) {
    streamingToolExecutor.discard()
    streamingToolExecutor = new StreamingToolExecutor(tools, canUseTool, toolUseContext)
  }
}
This is why tombstone messages exist: they are not error signals, they are cleanup signals for a specific streaming recovery path.
The hook invariant: hooks cannot override deny or safety checks
One of the most important policy decisions in toolHooks.ts is resolveHookPermissionDecision:
A PreToolUse hook returning allow does not bypass deny rules or safety-check rules.
Hooks operate inside the governance system. They can approve requests that would otherwise prompt the user — but they cannot override rules that the user or administrator explicitly set as deny, and they cannot skip safety-check gates (.git/, .claude/, shell configs).
This keeps hooks powerful without making them a back door around policy.
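The merge order can be sketched as a small decision function. The function name, the Decision type, and the parameter shapes are assumptions for illustration; only the invariant itself (deny rules and safety checks outrank a hook's 'allow') is from the doc.

```typescript
// Sketch of the hook/permission merge invariant. Names are illustrative.
type Decision = "allow" | "deny" | "ask";

export function resolvePermissionSketch(
  hookDecision: Decision | undefined,  // outcome of PreToolUse hooks, if any
  ruleDecision: Decision | undefined,  // user/admin-configured permission rules
  safetyCheckBlocked: boolean          // e.g. writes touching .git/, .claude/, shell configs
): Decision {
  if (safetyCheckBlocked) return "deny";        // safety gates are absolute
  if (ruleDecision === "deny") return "deny";   // explicit deny cannot be hooked around
  if (hookDecision === "deny") return "deny";   // hooks may tighten policy…
  if (hookDecision === "allow") return "allow"; // …or skip the user prompt
  return ruleDecision ?? "ask";                 // otherwise defer to rules, else prompt
}
```

Note the asymmetry: a hook's 'deny' always sticks, but its 'allow' only wins when nothing stricter is in play.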
In Claude Code's tool orchestration, why are context modifiers from a concurrent batch applied in original block order rather than the order tools completed?
Hard. When a batch of concurrency-safe tools runs in parallel, each tool can return a contextModifier that mutates ToolUseContext. The modifiers are collected, then applied after the batch.
- A. For performance: applying in order is faster than sorting by completion time. Incorrect: order of application doesn't affect performance.
- B. To produce deterministic context state regardless of which tool finishes first. Correct: two concurrent tools could both return context modifiers. If applied in completion order, the final context would vary run-to-run depending on scheduling. Applying in original block order makes the result deterministic.
- C. Because context modifiers can only be created by the last tool in the batch. Incorrect: any tool in a concurrent batch can return a contextModifier.
- D. To match the order the model expects tool results to appear in the transcript. Incorrect: transcript order is handled separately from context mutation order.