Blog
Practical writing on AI engineering, infrastructure, backend systems, and production lessons learned.
154 posts found
Reading Series
View series →
Claude Code: Source Reading Series
A working engineer's read through every subsystem in the Claude Code source.
Curated migration review batch
Spotlighted legacy posts, rewritten with the new MDX component system
This batch highlights a few older notes that were worth preserving and upgrading, not just importing. The rest of the archive remains available below.
4 min read
Claude Code: Source Reading Series — Start Here
The entry point for the Claude Code source-reading series. Five layers, sixteen posts. Start here to navigate the series in order.
Read curated article →
4 min read
Building an Agent Client from Claude Code Patterns
What I extracted from the learning-claude-code codebase about sessions, permissions, plans, subagents, and remote transport when building a serious agent client.
Read curated article →
4 min read
OIDC for CI/CD: Replacing Long-Lived Cloud Credentials with Workload Identity
A practical migration guide for using OIDC in CI/CD so pipelines can assume cloud roles without storing long-lived secrets.
Read curated article →
9 min read
How You Actually Get a Google OAuth Refresh Token for Chrome Web Store Automation
A precise walkthrough of the Google OAuth path to a refresh token: why you must start with a client_id, why Google gives you an authorization code first, why localhost redirects are common, and how that turns into Chrome Web Store automation.
Read curated article →
4 min read
Redis Distributed Locks: What They Solve, Where They Break, and How to Use Them Safely
A pragmatic guide to Redis-based distributed locks for high-concurrency systems, including ownership, expiry, contention, and when a lock should be replaced by a better architecture.
Read curated article →
4 min read
Kubernetes vs ECS: Choosing the Right Control Plane for Real Teams
A practical comparison of Kubernetes and Amazon ECS focused on platform ownership, operational complexity, and when each option is the better bet.
Read curated article →
Latest from the archive
Freshly published writing and the broader imported archive continue to live here.
Latest
Claude Code Observability: Dual-Sink Analytics, OTEL Spans, Perfetto Traces
How dual-sink event routing, compile-time PII enforcement, full-turn OTEL instrumentation, and Perfetto/Chrome Trace export give Claude Code production-grade observability from a local-first runtime.
Read article →
Archive
The full imported knowledge archive stays visible so older references remain easy to browse.
147 posts total
Claude Code Architecture: Five Principles for Building Agent Runtimes
5 min read • AI Agents
A synthesis of the full series: capability assembly, recoverable turn loops, tool execution subsystems, session control planes, and promoted detached work, and what these principles mean for building serious agent products.
ai-agents claude-code architecture runtime systems-design multi-agent observability
Claude Code Input Pipeline: Three Paths Before the Model Runs
2 min read • AI Agents
How processUserInput routes text, !bash, and /slash into separate permission paths, and how speculative execution and prompt suggestions reduce perceived latency.
ai-agents claude-code input-pipeline speculation prompt-suggestions
Claude Code LSP Integration: Persistent Servers, Push-Based Diagnostics
2 min read • AI Agents
How Claude Code runs a persistent multi-server LSP stack, collects diagnostics passively via push notifications, and bounds the registry to 10 diagnostics per file and 30 total.
ai-agents claude-code lsp language-server diagnostics
Claude Code MCP Assembly: Seven Config Sources, Three Capability Surfaces
3 min read • AI Agents
How MCP server configuration resolves across seven priority layers, how getMcpServerSignature() prevents duplicate connections, and how MCP tools, commands, and resources expand into the session capability surface.
ai-agents claude-code mcp integrations runtime
Claude Code Memory System: Five Layers from Injection to Consolidation
2 min read • AI Agents
How memdir injection, SessionMemory compaction, ExtractMemories taxonomy, Relevant Memories side-queries, and AutoDream consolidation form a complete agent memory architecture.
ai-agents claude-code memory agent-design persistence
Claude Code REPL: 5000 Lines Binding Every Subsystem
5 min read • AI Agents
How REPL.tsx uses 30+ hooks and the QueryGuard state machine to bind AppState, streaming, overlays, and tool execution into one interactive session, with messagesRef and onSubmitRef as key memory and correctness patterns.
ai-agents claude-code repl runtime react-hooks queryguard
Claude Code Tool Execution: Scheduling, Streaming, and Governance
2 min read • AI Agents
How partitionToolCalls batches concurrent-safe operations, StreamingToolExecutor manages backpressure, and why tool execution is a subsystem rather than a callback.
ai-agents claude-code tools runtime permissions hooks
Claude Code Multi-Agent Runtime: Coordinator, Three Backends, Disjoint Ownership
2 min read • AI Agents
How the coordinator's synthesize-first strategy prevents file conflicts, how three swarm backends (TmuxBackend, ITermBackend, InProcessBackend) share one interface, and what team context tracks.
ai-agents claude-code multi-agent swarm coordinator parallel-execution
Claude Code Task Runtime: Background Work as First-Class Objects
2 min read • AI Agents
How five task types (LocalShell, LocalAgent, RemoteAgent, InProcessTeammate, DreamTask) promote background work into observable, cancellable runtime objects.
ai-agents claude-code tasks runtime background-jobs subagents
Claude Code Subagent Runtime: Context Isolation and State Sharing
3 min read • AI Agents
How runAgent materializes a subagent turn, why subagents receive a no-op setAppState but setAppStateForTasks always pierces to the root store, and how fork agents share prompt cache.
ai-agents claude-code subagents tasks runtime
Claude Code Tool Internals: BashTool, FileEditTool, AgentTool, Compact
5 min read • AI Agents
BashTool's five defense layers including AST-based parsing and TOCTOU mitigation, FileEditTool's staleness detection and UNC path blocking, AgentTool's nesting depth limit, and the two compaction systems (autoCompact + microCompact).
ai-agents claude-code tools bash-tool agent-tool autocompact
Claude Code UI Runtime: External Store, Custom Ink Pipeline, Session Control Plane
2 min read • AI Agents
How AppStateStore (80+ fields) acts as a session control plane, why useSyncExternalStore selectors prevent unnecessary re-renders, and how the custom Ink stack renders frames through a seven-stage pipeline.
ai-agents claude-code ui-runtime ink state-management
Claude Code Command System: Six Sources, One Registry
4 min read • AI Agents
How six command sources assemble into a memoized registry, and what REMOTE_SAFE and BRIDGE_SAFE availability flags actually control.
ai-agents claude-code commands runtime typescript
Claude Code Permission Engine: Seven Steps, Bypass-Immune Gates
4 min read • AI Agents
The seven-step decision pipeline, which checks survive bypassPermissions mode, how Auto mode uses an AI Classifier, and why dangerous rules are stripped and restored rather than deleted.
ai-agents claude-code permissions safety runtime
Claude Code Query Loop: The Recoverable Turn Engine
5 min read • AI Agents
How query.ts implements a streaming AsyncGenerator with seven-layer error recovery, message preprocessing, and two tool execution modes.
ai-agents claude-code query-loop tools runtime
Claude Code Boot Sequence: How a CLI Becomes a Runtime Host
3 min read • AI Agents
How cli.tsx, main.tsx, and replLauncher.tsx assemble a governed agent session from startup signal handlers to first REPL render.
ai-agents claude-code source-code runtime typescript
Claude Code Tool System: 30+ Methods, Fail-Closed Defaults
3 min read • AI Agents
The Tool interface design, buildTool() safety defaults, deferred tool discovery when MCP tools exceed context budget, and how the registry filters at three distinct layers.
ai-agents claude-code tools permissions runtime
EKS in Production: Operator Patterns and the Path to Writing Your Own
8 min read
Practical lessons from running EKS at scale: when to use existing Kubernetes operators, when to build custom ones, and the architecture patterns that keep clusters manageable as complexity grows.
infrastructure kubernetes eks operators platform-engineering
GPU Inference Pipeline: A Visual Guide to Serving LLMs at Scale
5 min read
A media-rich walkthrough of GPU inference infrastructure, from request routing to batching strategies, with architecture diagrams, performance visualizations, and real deployment patterns.
ai infrastructure gpu inference performance
High-Concurrency System Design: The Tradeoffs Nobody Warns You About
11 min read
A practical breakdown of the architectural tradeoffs in high-concurrency backend systems, covering connection management, backpressure, consistency models, and the decisions that determine whether your system degrades gracefully or falls over at scale.
backend system-design concurrency architecture distributed-systems
Building Production RAG + MCP Agents: A Practical Architecture Guide
7 min read
A senior engineer's field guide to designing retrieval-augmented generation systems with Model Context Protocol integration, covering chunking strategies, retrieval pipelines, tool orchestration, and the pitfalls that only show up at scale.
ai agents rag mcp architecture
Hello World
1 min read
Sample blog post for content loader validation
sample blog
Bearer Tokens: What 'Bearer' Means and What It Costs You
3 min read • Web Development
Bearer means whoever holds the token can use it. That security model is simpler than alternatives but requires careful token handling: expiry, storage, and transport matter more than they do with session cookies.
authentication oauth2 jwt security
Terraform jsonencode: Fixing List Interpolation Errors in IAM Policies
1 min read • Cloud Infrastructure
Terraform list variables can't be interpolated directly into JSON string templates; the type mismatch causes 'string required' errors. jsonencode() converts HCL tuples to valid JSON arrays. The better solution is aws_iam_policy_document, which handles type conversions automatically and produces cleaner, validated HCL.
infrastructure-as-code terraform iam