Claude Code Input Pipeline: Three Paths Before the Model Runs

2 min read · AI Agents

How processUserInput routes text, !bash, and /slash into separate permission paths, and how speculative execution and prompt suggestions reduce perceived latency.

ai-agents · claude-code · input-pipeline · speculation · prompt-suggestions

The agent loop starts when the model runs.

That is the wrong frame.

In Claude Code, a significant amount of work happens before the model is called — routing input, pre-executing likely continuations, and generating suggestions for what the user might type next.

File structure

Files covered in this post (5 files)
src/
└── utils/
    ├── processUserInput/
    │   └── processUserInput.ts
    ├── promptSuggestion/
    │   ├── generatePromptSuggestions.ts
    │   └── promptSuggestionState.ts
    └── speculation/
        ├── speculativeExecute.ts
        └── speculations.ts

src/utils/processUserInput/processUserInput.ts

3-path input router

Critical

606-line router that classifies every user input into one of three paths: plain text (→ query loop), !bash command (→ shell execution), or /slash command (→ command handler). Each path has distinct pre-processing, permission checks, and output routing.

Module
Input Pipeline

Key Exports

  • processUserInput
  • classifyInput
  • InputPath

Why It Matters

  • The 3-path split means the query loop never sees raw !bash or /slash input — it only receives pre-classified text turns.
  • Each path has its own permission check: !bash checks shell permissions, /slash resolves command availability, text goes straight to query.
  • At 606 lines, processUserInput is one of the largest single-responsibility files in the codebase — the routing logic is genuinely complex.

src/utils/speculation/

Speculative turn pre-execution

High

Starts executing likely continuations in the background before the user submits. Uses the current conversation state and prompt suggestion candidates to speculatively call the model, then discards or promotes results based on whether the user's actual input matched the speculation.

Module
Input Pipeline

Key Exports

  • speculativeExecute
  • SpeculationState
  • promoteSpeculation
  • discardSpeculation

Why It Matters

  • Speculation is a latency optimization, not a feature — the user never sees speculative work unless it was promoted.
  • The discard path must be careful not to leave observable side effects (tool calls, file writes) from a speculative turn.
  • AppState tracks SpeculationState so UI can indicate 'pre-computing' without exposing the actual speculative content.
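The promote/discard lifecycle can be sketched as pure state management. This is a minimal illustration, assuming a `SpeculationState` shape consistent with the exports above; `promoteSpeculation` and `discardSpeculation` here are sketches, not the real implementations:

```typescript
// Hypothetical SpeculationState shape, inferred from the exports above.
interface SpeculationState {
  inFlight: boolean
  target: string | null   // the suggestion being speculated
  result: string | null   // completed speculative turn, if any
}

let state: SpeculationState = { inFlight: false, target: null, result: null }

function promoteSpeculation(actualInput: string): string | null {
  // Promote only when the completed speculation targeted exactly this input.
  if (!state.inFlight && state.result !== null && state.target === actualInput) {
    const result = state.result
    state = { inFlight: false, target: null, result: null }
    return result
  }
  return null
}

function discardSpeculation(): void {
  // Read-only speculation leaves nothing to roll back; clearing
  // the state is the entire discard path.
  state = { inFlight: false, target: null, result: null }
}
```

Because speculative turns produce no side effects, the discard path is a single state reset, with nothing in the file system or shell to undo.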

src/utils/promptSuggestion/

Background suggestion generation

Medium

Generates candidate next prompts in the background during idle periods. Suggestions appear as autocomplete candidates in the input box. Generated using the conversation context and a lightweight prediction prompt.

Module
Input Pipeline

Key Exports

  • generatePromptSuggestions
  • PromptSuggestionState

Why It Matters

  • Suggestion generation runs in idle time — it does not block the foreground turn.
  • Suggestions feed the speculation engine: the top suggestion is the most likely target for speculative execution.
  • AppState holds PromptSuggestionState, keeping suggestions synchronized across UI renders without prop drilling.
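Since suggestion generation runs only in idle time, its scheduling can be sketched as a cancellable timer: start after a quiet period, and cancel if the user begins a new foreground turn first. The helper below and its delay are assumptions for illustration, not the actual API:

```typescript
type Suggestions = string[]

// Hypothetical idle-time scheduler: runs `generate` after `idleMs` of quiet,
// unless cancelled first. Cancellation resolves to null so callers can tell
// "no suggestions" apart from a completed generation.
function scheduleSuggestionGeneration(
  generate: () => Promise<Suggestions>,
  idleMs = 500
): { cancel: () => void; done: Promise<Suggestions | null> } {
  let cancelled = false
  const done = new Promise<Suggestions | null>((resolve) => {
    setTimeout(async () => {
      // A new foreground turn cancels the job, so suggestion work
      // never competes with the user's actual input.
      if (cancelled) return resolve(null)
      resolve(await generate())
    }, idleMs)
  })
  return { cancel: () => { cancelled = true }, done }
}
```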

The pre-turn pipeline

What happens before the model runs

The input pipeline stages that execute between user keypress and model API call.

  1. Classify input

    utils/processUserInput/processUserInput.ts

    processUserInput.ts routes the raw input into one of three paths: plain text, !bash, or /slash. Each path gets separate permission checks and pre-processing.

  2. Generate suggestions (background)

    utils/promptSuggestion/

    During idle time between turns, promptSuggestion generates candidate next prompts. These appear as autocomplete in the input box and feed the speculation engine.

  3. Speculative pre-execution (background)

    utils/speculation/

    Using the top suggestion as a target, speculation starts executing the most likely continuation before the user submits. If the actual input matches, the result is promoted instead of re-executed.

  4. Query loop entry

    query.ts

    Classified text input enters query.ts. If speculation matched and was promoted, the turn result is already available. If not, a fresh turn begins.

The 3-path router matters for permissions

The reason processUserInput is 606 lines is not routing complexity alone.

Each path carries its own trust and permission model:

Text path: Goes to the query loop. The model decides what tools to call. Permissions apply at tool execution time.

!bash path: Executes a shell command directly. Requires the user to be in a trust context that allows arbitrary shell commands — a different check than tool permissions.

/slash path: Resolves a command from the command registry. Must verify the command exists and is available in the current context (not all commands are available in non-interactive mode, pipe mode, or bridge sessions).
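The /slash availability check can be sketched as context-aware registry lookup. The registry, `SessionMode` names, and `CommandEntry` shape below are hypothetical; the mode names mirror the contexts mentioned above:

```typescript
// Illustrative session modes, matching the contexts named in the text.
type SessionMode = 'interactive' | 'non-interactive' | 'pipe' | 'bridge'

interface CommandEntry {
  name: string
  availableIn: SessionMode[]
  run: () => string
}

const registry = new Map<string, CommandEntry>()

function registerCommand(entry: CommandEntry): void {
  registry.set(entry.name, entry)
}

function resolveCommand(name: string, mode: SessionMode): CommandEntry {
  const entry = registry.get(name)
  if (!entry) throw new Error(`Unknown command: /${name}`)
  // Existence and availability are separate checks: a command can exist
  // in the registry but be unusable in pipe or bridge sessions.
  if (!entry.availableIn.includes(mode)) {
    throw new Error(`/${name} is not available in ${mode} mode`)
  }
  return entry
}
```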

Merging these paths into one would force a single permission model onto all three: either the least-restrictive model wins, letting /slash and !bash bypass tool permission gates, or the most-restrictive wins, forcing plain text input through unnecessary checks.

The split is the correct design.

Why speculation needs a discard path

Speculative execution is only safe if the discard path is clean.

The risk: a speculative turn calls a tool — writes a file, runs a shell command, makes an API call — and then the user's actual input does not match the speculation.

Claude Code handles this by restricting what speculative turns can do.

Speculation runs a model call but does not allow side-effectful tool execution. Read-only operations (file reads, search) are safe to speculate. Write operations and shell commands are not.
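The read-only restriction can be sketched as a filter over the tool set handed to a speculative turn. The `Tool` shape and the tool names in the test are illustrative assumptions, not the codebase's actual types:

```typescript
// Hypothetical tool descriptor: each tool declares whether it has side effects.
interface Tool {
  name: string
  readOnly: boolean
}

function toolsForSpeculation(allTools: Tool[]): Tool[] {
  // Only side-effect-free tools survive the filter, so a discarded
  // speculation leaves no trace in the file system or shell.
  return allTools.filter((t) => t.readOnly)
}
```

Filtering the tool set up front is simpler and safer than trying to roll back writes after a mismatched speculation.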

The SpeculationState in AppState tracks whether a speculative call is in flight, so the UI can show a loading indicator without revealing which specific completion was being speculatively computed.

Suggestions as latency infrastructure

The prompt suggestion pipeline is easy to dismiss as a UI convenience.

It is actually latency infrastructure.

The suggestion pipeline runs during idle time — after a turn completes but before the user types their next input. If the user's next input matches a suggestion, speculation may have already completed the corresponding turn.

That is the full loop:

  1. turn completes
  2. suggestions generated in background
  3. speculation starts on top suggestion
  4. user submits — if it matches, result is promoted; if not, discard and run fresh

The latency benefit only materializes when the suggestion accurately predicted what the user would type. That accuracy depends on the quality of the suggestion prompt and the consistency of the user's working patterns.
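The four steps above can be sketched end to end. Every helper here (`suggest`, `runTurn`, the promise-valued user input) is hypothetical, and speculation is reduced to its essential branch: promote on a match, discard and run fresh otherwise:

```typescript
// End-to-end sketch of the suggestion → speculation → promote/discard loop.
async function handleTurnEnd(
  suggest: () => Promise<string>,             // step 2: background suggestion
  runTurn: (input: string) => Promise<string>, // model turn (read-only when speculative)
  userInput: Promise<string>                   // step 4: the user's eventual submission
): Promise<string> {
  // Steps 2-3: suggestion and speculation start before the user submits.
  const target = await suggest()
  const speculative = runTurn(target)

  // Step 4: on a match, the promoted result is already in flight;
  // on a miss, the speculation is discarded and a fresh turn runs.
  const actual = await userInput
  if (actual === target) return speculative
  speculative.catch(() => {}) // discarded: ignore its outcome
  return runTurn(actual)
}
```

The latency win is visible in the match branch: the promoted promise was created before the user finished typing.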

Input routing and speculation pattern

The 3-path router and speculation lifecycle, shown as simplified TypeScript.

TypeScript: utils/processUserInput (simplified)

3-path input router — actual pattern from learning-claude-code

type InputPath = 'text' | 'bash' | 'slash'

interface ClassifiedInput {
  path: InputPath
  raw: string
  payload: string  // stripped of prefix for bash/slash
}

function classifyInput(raw: string): ClassifiedInput {
  if (raw.startsWith('!')) {
    return { path: 'bash', raw, payload: raw.slice(1).trim() }
  }
  if (raw.startsWith('/')) {
    return { path: 'slash', raw, payload: raw.slice(1).trim() }
  }
  return { path: 'text', raw, payload: raw }
}

async function processUserInput(
  raw: string,
  context: SessionContext
): Promise<ProcessResult> {
  const classified = classifyInput(raw)

  switch (classified.path) {
    case 'bash': {
      // Check shell permission before executing
      checkPermission(context, 'shell_execute')
      return executeBashCommand(classified.payload, context)
    }
    case 'slash': {
      // Resolve command from registry — fails if not available in this context
      const command = resolveCommand(classified.payload, context)
      return executeSlashCommand(command, context)
    }
    case 'text': {
      // Check if speculative execution already has a result
      const promoted = tryPromoteSpeculation(classified.payload, context)
      if (promoted) return promoted
      // Otherwise enter the query loop
      return enterQueryLoop(classified.payload, context)
    }
  }
}

// Speculation: starts before user submits
async function speculativeExecute(
  suggestion: string,
  context: SessionContext
): Promise<SpeculationResult | null> {
  const state = getSpeculationState()
  if (state.inFlight) return null // only one speculation at a time

  setSpeculationState({ inFlight: true, target: suggestion })

  try {
    // Model call with read-only tool restriction
    const result = await runQueryLoopReadOnly(suggestion, context)
    setSpeculationState({ inFlight: false, result, target: suggestion })
    return result
  } catch (err) {
    setSpeculationState({ inFlight: false, result: null, target: null })
    throw err
  }
}

Claude Code's speculation engine restricts speculative turns to read-only tool operations. Why is this restriction necessary even though the user might eventually submit the speculated prompt?

medium

Speculative execution starts running a turn before the user submits. If the user's actual input matches the speculation, the result is promoted. If it does not match, the speculation is discarded.

  • A. Read-only tools are faster, so speculation completes before the user submits
    Incorrect. Performance is a benefit, but it is not the reason for the restriction. The primary concern is side effects from non-matching speculations.
  • B. Speculative turns that do not match the user's actual input must be discarded without trace — write operations and shell commands cannot be undone if the speculation was wrong
    Correct! If a speculative turn writes a file or runs a shell command, and then the user types something different, those side effects are already in the world. There is no clean discard path for writes. Read-only operations (file reads, search) produce no side effects, so they can be safely discarded if the speculation does not match.
  • C. Write operations require user confirmation, which would reveal that speculation was in progress
    Incorrect. While this is a secondary concern, the primary issue is irreversibility. Even if the user never sees the write, the file system has already been modified.
  • D. The model cannot reliably predict which files to write without seeing the user's exact input
    Incorrect. This describes a prediction quality issue, not the reason for the restriction. The restriction exists because of side-effect irreversibility, not because write prediction is unreliable.