Claude Code Tool Internals: BashTool, FileEditTool, AgentTool, Compact — Blog

Earlier posts covered the tool protocol and execution service. This post covers what tools actually do — the implementation logic behind the agent's most-used capabilities.

Several tools are substantially more complex than their names suggest.

File structure

Files covered in this post: 11 files

src/
├── services/
│   ├── compact/
│   │   ├── autoCompact.ts
│   │   ├── microCompact.ts
├── tools/
│   ├── AgentTool/
│   │   ├── AgentTool.tsx
│   ├── BashTool/
│   │   ├── BashTool.tsx
│   │   ├── bashPermissions.ts
│   │   ├── readOnlyValidation.ts
│   │   ├── pathValidation.ts
│   │   ├── destructiveCommandWarning.ts
│   │   └── shouldUseSandbox.ts
│   ├── FileEditTool/
│   │   └── FileEditTool.ts
│   ├── FileReadTool/
│   │   └── FileReadTool.ts

src/tools/BashTool/BashTool.tsx

Shell execution with five-layer defense, AST-based parsing, and auto-backgrounding

Critical

The most complex tool in the codebase. Wraps shell execution with AST-based command parsing (tree-sitter with 23 regex fallbacks), read-only validation, path validation, destructive command warnings, speculative pre-checks, sandbox routing, output truncation, and auto-backgrounding for long-running commands.

Module: Tool Implementations

Key Exports

BashTool
bashToolHasPermission
splitCommandWithOperators

Why It Matters

BashTool has its own permission subsystem (bashPermissions.ts ~2900 lines) separate from the main permission engine.
Command parsing uses tree-sitter AST analysis, not regex. The 23 regex patterns are a fail-closed fallback for unparseable input.
Speculative pre-checks validate bash command permissions before the approval dialog fires — prevents 'yes to tool, immediately denied by bash' UX.
Commands that run longer than 15s auto-background in assistant mode. npm, yarn, python, and common build tools are detected and backgrounded proactively.

src/tools/FileReadTool/FileReadTool.ts

File reading with device blocking and macOS screenshot handling

High

Reads files with device path blocking (/dev/zero, /dev/stdin, /proc/self/fd/*) to prevent hangs. Detects macOS screenshots by thin-space vs regular space in filenames. Supports offset/limit for large files and image resizing for visual content.

Module: Tool Implementations

Key Exports

FileReadTool
isDevicePath
readFileWithLimits

Why It Matters

Device path blocking is a correctness guard, not a security one — /dev/zero would hang the agent indefinitely.
The macOS screenshot detection (thin-space filename pattern) is an example of platform-specific pragmatism buried in a generic tool.
Image files go through a separate path: resize to fit context window, then encode as base64 for the multimodal API.

src/tools/AgentTool/AgentTool.tsx

Subagent spawning with quota, isolation, and result materialization

Critical

Spawns a sub-agent with a specific task and context. Enforces a subagent nesting depth limit (max 4 levels) to prevent runaway recursion. Materializes the spawned agent's conversation as a structured tool result. The parent agent waits for the subagent to complete before continuing.

Module: Tool Implementations

Key Exports

AgentTool
runAgent
MAX_AGENT_NESTING_DEPTH

Why It Matters

The nesting depth limit (4 levels) is a safety guard against agents recursively spawning agents that spawn agents.
The subagent runs a full agent turn via runAgent() — same query loop, same tool access, same permission context as the parent.
The result is materialized as a structured object that the parent agent reads as a tool result, not as raw text.

src/services/compact/autoCompact.ts

Context window auto-compaction with circuit breaker

High

When the conversation approaches the context limit, autoCompact triggers a compaction pass that summarizes old messages into a compact block. A circuit breaker prevents compaction loops: if compaction itself produces a conversation that is too large, the system stops trying rather than looping.

Module: Core Agent Runtime

Key Exports

autoCompact
CompactCircuitBreaker
shouldAutoCompact

Why It Matters

The circuit breaker is the most important safety guard in the compaction system — without it, a large conversation could trigger endless compaction attempts.
Compaction is triggered at 95% of the context window, not 100%, to leave room for the compaction prompt itself.
The compact output is inserted as a special message type that the query loop recognizes and handles separately from regular messages.

BashTool's five-layer defense model

BashTool is not a thin wrapper around exec. It has five defense layers:

BashTool defense pipeline

Every bash command passes through these five layers before executing. Fail-closed at each step.

1. AST-based command parsing (tree-sitter)
   tools/BashTool/readOnlyValidation.ts
   Command structure is parsed using tree-sitter AST analysis, not regex. This catches nested subshells, heredocs, and operator sequences that regex would misparse. 23 regex patterns serve as a fail-closed fallback for commands tree-sitter cannot parse: if parsing fails, the command is blocked, not allowed.

2. Read-only validation
   tools/BashTool/readOnlyValidation.ts
   Uses the AST parse result to check for blocked destructive patterns: rm, truncate, sed -i write operations. The semantic parse means operator context matters: 'echo foo | sed' and 'sed -i' are distinct cases.

3. Path validation
   tools/BashTool/pathValidation.ts
   pathValidation.ts validates that file paths in the command are within the allowed working directory. Prevents the agent from operating on files outside the project.

4. Destructive command warning
   tools/BashTool/destructiveCommandWarning.ts
   Certain commands (git reset --hard, rm -rf, truncate) trigger a confirmation dialog before execution, even if the agent has shell permission.

5. Sandbox routing
   tools/BashTool/shouldUseSandbox.ts
   When sandbox mode is active, the command is routed through the sandbox runtime with network policies. shouldUseSandbox.ts determines which commands need sandboxing.
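
The "fail-closed at each step" property can be sketched on its own, independent of what each layer checks. The snippet below is a minimal illustration, not the BashTool implementation: LayerResult, Layer, runDefensePipeline, and the two toy layers are hypothetical names invented here. The point it shows is that a layer that throws (for example, a parser that cannot handle the input) is treated as a denial rather than skipped.

```typescript
// Hypothetical types: none of these names come from the Claude Code source.
type LayerResult =
  | { status: "pass" }
  | { status: "block"; reason: string };

interface Layer {
  name: string;
  check: (command: string) => LayerResult;
}

// Run layers in order; the first block wins, and a crashing layer also blocks.
function runDefensePipeline(command: string, layers: Layer[]): LayerResult {
  for (const layer of layers) {
    try {
      const result = layer.check(command);
      if (result.status === "block") {
        return { status: "block", reason: `${layer.name}: ${result.reason}` };
      }
    } catch (err) {
      // Fail closed: an unparseable command or a buggy check is a denial.
      return { status: "block", reason: `${layer.name}: check failed (${String(err)})` };
    }
  }
  return { status: "pass" };
}

// Toy layers for illustration only.
const demoLayers: Layer[] = [
  {
    name: "parse",
    check: (command) => {
      // Stand-in for a real parser: reject unbalanced double quotes.
      const quotes = (command.match(/"/g) ?? []).length;
      if (quotes % 2 !== 0) throw new Error("unbalanced quotes");
      return { status: "pass" };
    },
  },
  {
    name: "read-only",
    check: (command) =>
      /^\s*rm\b/.test(command)
        ? { status: "block", reason: "rm is not read-only" }
        : { status: "pass" },
  },
];
```

With these toy layers, a command the "parse" layer cannot handle is blocked for the same reason a crashing check would be: the pipeline never falls through to "pass" on an error path.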

The speculative pre-check pattern

One of the most interesting patterns in BashTool is speculative permission pre-checking.

The problem: the user approves a tool call in the permission dialog, but the bash permission check inside BashTool immediately denies it. This produces a confusing UX where the user said "yes" and nothing happened.

The solution: clearSpeculativeChecks() in bashPermissions.ts runs the bash permission check ahead of the approval dialog. If the bash layer would deny the command, the approval dialog is not shown — instead, the agent is told immediately that the command is not permitted.

This is permission validation at two levels: the main permission engine asks "is the user allowing this tool category?" and the bash layer asks "is this specific command allowed within that category?" The speculative check ensures both questions are answered before the user sees the dialog.

Auto-backgrounding long-running commands

BashTool has a 15-second budget for foreground execution in assistant mode (ASSISTANT_BLOCKING_BUDGET_MS).

Commands that run longer than 15 seconds auto-background. The agent receives a task handle and can check the result later.

Additionally, common build tools — npm, yarn, python, pytest, cargo — are detected by name and backgrounded proactively without waiting for the 15-second budget to expire.

This is a product decision: build commands are usually long-running, so there is little value in blocking the foreground turn for 15 seconds before deciding to background them.

The model can also request backgrounding explicitly for a specific command by setting run_in_background: true in the tool call.
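
The two backgrounding triggers described in this section (a name-based check and a foreground time budget) can be sketched as follows. This is an illustration under assumptions, not the actual code: ASSISTANT_BLOCKING_BUDGET_MS is the constant named above, but shouldBackgroundImmediately, runWithBudget, RunOutcome, and the rule of matching on the first token of the command are hypothetical.

```typescript
const ASSISTANT_BLOCKING_BUDGET_MS = 15000;

// Assumption: commands are matched by their first whitespace-separated token.
const PROACTIVE_BACKGROUND_COMMANDS = new Set(["npm", "yarn", "python", "pytest", "cargo"]);

function shouldBackgroundImmediately(command: string, runInBackground = false): boolean {
  if (runInBackground) return true; // explicit request from the model
  const firstToken = command.trim().split(/\s+/)[0] ?? "";
  return PROACTIVE_BACKGROUND_COMMANDS.has(firstToken);
}

type RunOutcome<T> =
  | { kind: "completed"; value: T }
  | { kind: "backgrounded"; pending: Promise<T> };

// A cancellable timer that resolves to the marker string "timeout".
function delay(ms: number): { promise: Promise<"timeout">; cancel: () => void } {
  let cancel = () => {};
  const promise = new Promise<"timeout">((resolve) => {
    const id = setTimeout(() => resolve("timeout"), ms);
    cancel = () => clearTimeout(id);
  });
  return { promise, cancel };
}

// Wait for the work up to budgetMs; past that, hand back the still-pending promise.
async function runWithBudget<T>(
  work: Promise<T>,
  budgetMs: number = ASSISTANT_BLOCKING_BUDGET_MS
): Promise<RunOutcome<T>> {
  const timer = delay(budgetMs);
  const completed = work.then((value) => ({ kind: "completed" as const, value }));
  try {
    const winner = await Promise.race([completed, timer.promise]);
    return winner === "timeout" ? { kind: "backgrounded", pending: work } : winner;
  } finally {
    timer.cancel();
  }
}
```

In this sketch the work is never cancelled when the budget expires. The caller simply stops waiting and keeps the pending promise as the task handle, which matches the "check the result later" behavior described above.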

FileReadTool's device blocking

The device path check in FileReadTool is worth examining because it reveals a class of failure that naive implementations miss.

Without the check, the agent could call read_file('/dev/stdin') and wait indefinitely for input that never arrives. Or call read_file('/dev/zero') and read an infinite stream of null bytes until memory is exhausted.

The block list includes:

/dev/zero, /dev/null, /dev/random, /dev/urandom
/dev/stdin, /dev/stdout, /dev/stderr
/proc/self/fd/*

These are not security concerns — they are correctness guards against the agent accidentally hanging itself or exhausting memory.
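
A check of this kind can be very small. The sketch below uses the block list from this section; isDevicePath is the export name listed earlier, but the body is an assumption about how such a check might look, not the actual implementation. It compares raw strings only and does not resolve symlinks or ".." segments.

```typescript
// Exact paths taken from the block list above.
const BLOCKED_DEVICE_PATHS = new Set([
  "/dev/zero",
  "/dev/null",
  "/dev/random",
  "/dev/urandom",
  "/dev/stdin",
  "/dev/stdout",
  "/dev/stderr",
]);

// Prefix that covers /proc/self/fd/*.
const BLOCKED_PREFIX = "/proc/self/fd/";

function isDevicePath(path: string): boolean {
  return BLOCKED_DEVICE_PATHS.has(path) || path.startsWith(BLOCKED_PREFIX);
}
```

A read tool would call this before opening the file and return a tool error instead of blocking on a stream that never ends.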

FileEditTool's correctness guards

This section covers four guards that a minimal implementation would omit: three in FileEditTool itself and one in BashTool's sed -i write path.

Staleness detection: The tool tracks the file's mtime at read time. If the file has been modified between read and edit, the edit is rejected. This prevents the agent from applying a diff to a version of the file that no longer matches what it read. Without this, a file modified externally during a long agent turn could receive a corrupted edit.

UNC path blocking: On Windows, paths like \\server\share\file trigger automatic NTLM authentication negotiation with the remote server. FileEditTool blocks UNC paths explicitly. An agent that edits files over UNC paths would silently leak the user's Windows credentials to any SMB server it contacts.

Team memory secret guard: Edits to team memory files (.claude/CLAUDE.md and similar) are scanned for secret patterns before committing. An agent that writes a secret into team memory would persist it into the repository for all team members to share.

sed -i TOCTOU mitigation: The sed -i write path in BashTool has a notable security property. Rather than executing sed -i and then checking what it did, the tool pre-computes the final file content at permission-preview time, before the user sees the approval dialog. The content shown in the preview is byte-for-byte what will be written if the user approves, because the same computed content is used for both the preview and the actual write. This closes the TOCTOU (time-of-check to time-of-use) window in which the file could change between preview and execution.

Each guard protects against a different failure class: correctness (staleness), security (UNC), accidental secret exposure (team memory), and preview/execution divergence (TOCTOU).
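
The first two guards are simple enough to sketch. The code below is illustrative only: ReadTracker, EditCheck, and isUncPath are hypothetical names, and rejecting edits to files that were never read is an assumption of this sketch. To keep it self-contained, mtimes are passed in as numbers; a real tool would presumably take them from something like fs.stat().mtimeMs.

```typescript
type EditCheck =
  | { status: "ok" }
  | { status: "rejected"; reason: string };

// Remembers the mtime observed when each file was read.
class ReadTracker {
  private readonly mtimes = new Map<string, number>();

  recordRead(path: string, mtimeMs: number): void {
    this.mtimes.set(path, mtimeMs);
  }

  // Compare the mtime seen at read time with the mtime at edit time.
  checkEdit(path: string, currentMtimeMs: number): EditCheck {
    const seen = this.mtimes.get(path);
    if (seen === undefined) {
      // Assumption: an edit without a prior read is rejected.
      return { status: "rejected", reason: "file has not been read yet" };
    }
    if (seen !== currentMtimeMs) {
      return { status: "rejected", reason: "file changed since it was read" };
    }
    return { status: "ok" };
  }
}

// UNC paths start with two separators: \\server\share or //server/share.
function isUncPath(path: string): boolean {
  return path.startsWith("\\\\") || path.startsWith("//");
}
```

Both checks run before any bytes are written, so a rejected edit leaves the file untouched and gives the agent a reason it can act on, typically by re-reading the file.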

GrepTool's pagination design

GrepTool does not return unbounded results. It enforces a default head limit of 250 lines and reports the applied limit and offset back to the model as structured metadata.

This is a deliberate pagination design: the model can request head_limit=0 for unlimited results, or pass an offset to continue from a previous page. The tool result includes appliedLimit and appliedOffset so the model knows what slice it received.

The reason for the default limit is context budget. A grep across a large codebase can return tens of thousands of lines. Injecting all of them into context would consume the budget before the agent could do anything with them. The 250-line default is a token budget decision, not a correctness one.
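
The pagination contract can be sketched as a pure function over an array of match lines. The names appliedLimit and appliedOffset and the 250-line default come from the description above; paginateMatches, GrepPage, and the use of null to represent "no limit applied" are assumptions made for this sketch.

```typescript
const DEFAULT_HEAD_LIMIT = 250;

interface GrepPage {
  lines: string[];
  appliedLimit: number | null; // null means head_limit=0 (unlimited) was requested
  appliedOffset: number;
}

function paginateMatches(
  matches: string[],
  headLimit: number = DEFAULT_HEAD_LIMIT,
  offset: number = 0
): GrepPage {
  const start = Math.max(0, offset);
  if (headLimit === 0) {
    // Unlimited: return everything from the offset onward.
    return { lines: matches.slice(start), appliedLimit: null, appliedOffset: start };
  }
  return {
    lines: matches.slice(start, start + headLimit),
    appliedLimit: headLimit,
    appliedOffset: start,
  };
}
```

Because the result echoes the limit and offset that were actually applied, the model can compute the next page (offset plus the number of lines returned) without guessing.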

microCompact: tool result pruning over time

There are two compaction systems in Claude Code, not one.

autoCompact (covered above) summarizes the entire conversation when it approaches the context limit.

microCompact is a separate system that prunes old tool result bodies over time, while keeping the metadata. It runs before API requests rather than reactively.

The prunable tools are: FILE_READ, BASH, GREP, GLOB, WEB_SEARCH, WEB_FETCH, FILE_EDIT, FILE_WRITE. For each, the content body is removed after a time window, but the tool call record and result status remain in the conversation so the model knows the operation happened.

The distinction is important: autoCompact produces a summary that replaces conversation history. microCompact is surgical — it removes large payloads from individual tool results while leaving the conversation structure intact. A large file read from turn 3 does not need to occupy context in turn 50.
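
The pruning step can be sketched as a map over tool results. The list of prunable tools is taken from this section; everything else (ToolResultMessage, pruneOldToolResults, the placeholder text, and the age threshold passed in by the caller) is hypothetical and chosen only to make the idea concrete.

```typescript
interface ToolResultMessage {
  tool: string;
  status: "ok" | "error";
  content: string;
  timestampMs: number;
}

const PRUNABLE_TOOLS = new Set([
  "FILE_READ", "BASH", "GREP", "GLOB",
  "WEB_SEARCH", "WEB_FETCH", "FILE_EDIT", "FILE_WRITE",
]);

// Assumption: a fixed placeholder stands in for the removed body.
const PRUNED_PLACEHOLDER = "[tool result content pruned]";

// Returns a new array; messages are never mutated or dropped.
function pruneOldToolResults(
  messages: ToolResultMessage[],
  nowMs: number,
  maxAgeMs: number
): ToolResultMessage[] {
  return messages.map((message) => {
    const isOld = nowMs - message.timestampMs > maxAgeMs;
    if (PRUNABLE_TOOLS.has(message.tool) && isOld) {
      // Keep tool name, status, and timestamp; drop only the payload.
      return { ...message, content: PRUNED_PLACEHOLDER };
    }
    return message;
  });
}
```

The output has the same length and order as the input, which is the property that separates this from a summarizing compaction: the conversation structure survives and only the payloads shrink.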

AgentTool's nesting depth limit

The subagent nesting depth limit (MAX_AGENT_NESTING_DEPTH = 4) prevents the most obvious runaway failure mode: an agent that spawns an agent, which spawns an agent, recursively.

The depth counter is passed through the task context and checked at AgentTool invocation time. Attempting to spawn a subagent at depth 4 returns an error immediately rather than spawning.

The limit of 4 is a product judgment: deep enough to support coordinator → worker → sub-worker patterns, but not so deep that a looping agent can cause significant resource consumption before the limit fires.

AutoCompact's circuit breaker

The circuit breaker in autoCompact.ts addresses a failure mode that is easy to overlook:

What happens if the compacted summary is itself too long?

Without a circuit breaker: compact the conversation → the result is still too large → compact again → loop indefinitely.

The circuit breaker tracks how many compaction attempts have been made for the current conversation. After one failed compaction (where the output still exceeds the context budget), the breaker opens and prevents further attempts. The agent instead receives a hard context limit error, which it can handle explicitly.
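
A minimal sketch of that behavior follows. CompactCircuitBreaker is the export name listed earlier, but its internals here are assumed, and tryCompact, CompactOutcome, and the token-count summarizer callback are hypothetical. Token counts are plain numbers so the example stays self-contained.

```typescript
type CompactOutcome =
  | { status: "compacted"; tokens: number }
  | { status: "context_limit_error" };

class CompactCircuitBreaker {
  private failures = 0;
  private readonly maxFailures: number;

  constructor(maxFailures: number = 1) {
    this.maxFailures = maxFailures;
  }

  get isOpen(): boolean {
    return this.failures >= this.maxFailures;
  }

  recordFailure(): void {
    this.failures += 1;
  }
}

// summarize maps a token count to the token count after compaction.
function tryCompact(
  tokens: number,
  budget: number,
  summarize: (tokens: number) => number,
  breaker: CompactCircuitBreaker
): CompactOutcome {
  if (breaker.isOpen) {
    // A previous attempt already failed: do not try again.
    return { status: "context_limit_error" };
  }
  const after = summarize(tokens);
  if (after > budget) {
    breaker.recordFailure();
    return { status: "context_limit_error" };
  }
  return { status: "compacted", tokens: after };
}
```

Once the breaker is open, the summarizer is never called again for that conversation, so the worst case is one wasted compaction pass rather than an unbounded loop.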

BashTool security layers and AgentTool depth guard

The layered validation pipeline and nesting depth enforcement, shown as simplified TypeScript.

TypeScript: BashTool + AgentTool (simplified)

Speculative pre-check and nesting depth guard — actual patterns from learning-claude-code

// Simplified sketch: helper functions and types (validateReadOnly, toolError,
// ToolExecutionContext, and so on) are elided.

// BashTool: layered validation before execution
async function executeBashCommand(
  command: string,
  context: ToolExecutionContext
): Promise<ToolResult> {
  // Layers 1-2: AST parsing feeds read-only validation (semantic, not string matching)
  const readOnlyResult = validateReadOnly(command)
  if (readOnlyResult.blocked) {
    return toolError(`Command blocked: ${readOnlyResult.reason}`)
  }

  // Layer 3: path validation (must be within working directory)
  const pathResult = validatePaths(command, context.workingDir)
  if (pathResult.blocked) {
    return toolError(`Path outside allowed directory: ${pathResult.path}`)
  }

  // Layer 4: destructive command confirmation (fires dialog)
  if (isDestructiveCommand(command)) {
    const confirmed = await showDestructiveWarning(command)
    if (!confirmed) return toolError('User declined destructive command')
  }

  // Layer 5: sandbox routing (when enabled)
  const exec = shouldUseSandbox(command)
    ? sandboxExecutor
    : directExecutor

  // Auto-background for long-running commands
  const timeout = getTimeoutMs(command)  // npm/yarn/pytest → immediate background
  if (timeout === BACKGROUND_IMMEDIATELY || context.runInBackground) {
    return spawnBackgroundTask(command, exec)
  }

  return exec.run(command, { timeout, context })
}

// Speculative pre-check: validate bash permissions BEFORE showing approval dialog
function speculativePreCheck(command: string, context: SessionContext): boolean {
  // If bash layer would deny, don't show the approval dialog at all
  const bashResult = clearSpeculativeChecks(command, context)
  return bashResult.allowed  // false → skip dialog, return denied immediately
}

// AgentTool: nesting depth guard
const MAX_AGENT_NESTING_DEPTH = 4

async function agentTool(
  task: string,
  context: ToolExecutionContext
): Promise<ToolResult> {
  const depth = context.agentNestingDepth ?? 0

  if (depth >= MAX_AGENT_NESTING_DEPTH) {
    return toolError(
      `Maximum agent nesting depth (${MAX_AGENT_NESTING_DEPTH}) exceeded. ` +
      `Cannot spawn a subagent from depth ${depth}.`
    )
  }

  // Run the subagent: same query loop as parent, with incremented depth
  const result = await runAgent(task, {
    ...context,
    agentNestingDepth: depth + 1
  })

  // Materialize conversation as structured tool result
  return { type: 'agent_result', output: result.finalOutput, conversation: result.messages }
}

BashTool runs speculative permission pre-checks before showing the approval dialog. What specific UX failure does this prevent that would occur without the pre-check?

Difficulty: medium

Claude Code has two separate permission layers for bash: the main permission engine (does the user allow this tool category?) and the bash-specific layer (is this specific command allowed within that category?). The approval dialog fires between these two layers.

A. Without pre-checks, the user could accidentally approve a command that writes files outside the project directory.
   Incorrect. Path validation is a separate layer that runs regardless of pre-checks. Pre-checks are about the permission dialog UX, not path security.

B. Without pre-checks, the user approves the tool in the dialog, but the bash permission layer immediately denies the specific command: the user said yes and nothing happened, with no clear explanation.
   Correct. The approval dialog asks 'do you allow this tool?' but the bash permission layer then applies its own more specific rules. Without pre-checks, a user could click Allow on a command, only to see it silently fail because bashPermissions.ts blocked the specific command. The speculative pre-check runs the bash-layer check before showing the dialog. If the bash layer would deny it, the dialog is skipped entirely and the agent is told immediately.

C. Without pre-checks, bash commands could execute before the permission dialog is dismissed.
   Incorrect. The approval dialog blocks execution until dismissed. Pre-checks do not change the execution order; they change whether the dialog is shown at all.

D. Without pre-checks, the bash command could time out while the user is reading the approval dialog.
   Incorrect. Timeout is measured from when execution starts, not from when the dialog is shown. Pre-checks do not affect timeout behavior.
