Beyond Single-Agent Thinking
A single AI agent with a powerful model can accomplish remarkable things. But production systems rarely rely on one agent doing everything. The shift to multi-agent orchestration unlocks capabilities that no single agent--however capable--can achieve alone.
This isn't about scaling for its own sake. It's about specialization. Different tasks benefit from different contexts, different instructions, different tool access. Trying to cram everything into one agent's context creates the very overflow problems we explored in A1.
Orchestration is the discipline of coordinating multiple agents to accomplish goals that exceed any single agent's effective scope.
The Routing Problem
When a user makes a request, who handles it? In a single-agent system, the answer is obvious. In a multi-agent system, it's the first architectural decision.
Keyword routing: Simple pattern matching. "Deploy" goes to the deployment agent. "Test" goes to the testing agent. Fast and predictable, but brittle when requests don't fit neat categories.
Classifier routing: A small model that categorizes requests and routes accordingly. More flexible than keywords, but adds latency and another potential failure point.
Model-as-router: A capable model that analyzes the request and decides which agent should handle it. Most flexible, but most expensive and potentially slowest.
Hierarchical routing: Multiple layers of routing, progressively narrowing. First: is this a coding task or a research task? If coding: is this implementation or review? Each layer can use different routing strategies.
The right choice depends on your workload patterns. Predictable request types favor simpler routing. Highly variable requests need more sophisticated classification. Cost-sensitive applications need to minimize routing overhead.
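The routing strategies above can be layered, as hierarchical routing suggests. Here is a minimal sketch in Python: keyword matching handles predictable requests on a fast path, and a stand-in classifier catches everything else. The agent names, keywords, and `classify` function are illustrative assumptions, not a standard API.

```python
# Layered routing sketch: keyword fast path, classifier fallback.
# Agent names and keywords are hypothetical examples.

KEYWORD_ROUTES = {
    "deploy": "deployment_agent",
    "test": "testing_agent",
    "review": "review_agent",
}

def classify(request: str) -> str:
    """Stand-in for a small classifier model; here, a trivial default."""
    return "general_agent"

def route(request: str) -> str:
    lowered = request.lower()
    for keyword, agent in KEYWORD_ROUTES.items():
        if keyword in lowered:
            return agent  # fast path: no model call, no added latency
    return classify(request)  # slow path: ambiguous requests pay for classification
```

In a real system `classify` would invoke a small model, which is exactly where the extra latency and failure point described above enter the picture.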
Handoff Architecture
When one agent passes work to another, what gets transferred? This handoff boundary is where orchestration succeeds or fails.
Full context transfer: Send everything. The receiving agent gets the complete history of what led to this point. Comprehensive but potentially overwhelming--and often exceeds context limits.
Summary transfer: The outgoing agent produces a summary of relevant context. Compact but lossy. Critical details may not make it into the summary.
Structured handoff: A defined interface specifying what information transfers. Like an API contract between agents. Consistent and verifiable, but requires upfront design work.
Reference transfer: Instead of copying context, pass references to shared resources. The receiving agent retrieves what it needs. Efficient but requires shared infrastructure.
Production systems typically combine approaches. Structured handoffs for core information, summaries for history, references for large artifacts. The goal is giving the receiving agent what it needs to continue effectively without overwhelming its context.
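That combined approach can be made concrete as a handoff contract. The sketch below, with assumed field names, pairs a structured core (the task), a lossy summary, and references to large artifacts, with validation at the boundary:

```python
# Structured handoff sketch: a defined contract for what crosses the
# agent boundary. Field names are illustrative, not a standard schema.
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Handoff:
    task: str        # structured core: what the receiving agent must do
    summary: str     # lossy digest of prior history
    artifact_refs: list = field(default_factory=list)  # references, not copies

    def validate(self) -> None:
        # Catch an empty handoff at the boundary instead of downstream.
        if not self.task:
            raise ValueError("handoff missing task")

handoff = Handoff(
    task="run integration tests",
    summary="build 42 passed unit tests; config changed",
    artifact_refs=["artifacts/build-42/diff.patch"],
)
handoff.validate()
```

Because the contract is explicit, a missing field fails loudly at the handoff rather than as a confused downstream agent.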
Coordination Patterns
Multiple agents working simultaneously need coordination. Without it, they duplicate effort, conflict over shared resources, or produce inconsistent results.
Sequential pipeline: Agents execute in order. Output from one becomes input to the next. Simple and predictable, but slow. A testing agent can't start until the coding agent finishes.
Parallel fan-out: Multiple agents work simultaneously on independent subtasks. Results combine at the end. Fast but requires tasks that don't interfere.
Supervisor pattern: A coordinating agent delegates to workers and synthesizes their results. The supervisor doesn't do the work directly but manages who does what.
Consensus pattern: Multiple agents independently attempt the same task. Results are compared and merged. Expensive but catches errors through redundancy.
Blackboard pattern: Agents read from and write to a shared state. Each agent watches for changes relevant to its function and acts accordingly. Flexible but complex to debug.
Real systems often mix patterns. A supervisor might coordinate a sequential pipeline where each step uses parallel fan-out internally. The art is matching patterns to problem structure.
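A small sketch of the supervisor pattern combined with parallel fan-out: the supervisor splits work across workers running concurrently, then gathers their results for synthesis. The worker body is a placeholder for a real agent invocation.

```python
# Supervisor + parallel fan-out sketch. The worker is a stub standing in
# for an agent call; subtask names are illustrative.
from concurrent.futures import ThreadPoolExecutor

def worker(subtask: str) -> str:
    return f"done:{subtask}"  # placeholder for an actual agent invocation

def supervisor(subtasks: list[str]) -> list[str]:
    # Fan out: independent subtasks run concurrently.
    with ThreadPoolExecutor(max_workers=4) as pool:
        results = list(pool.map(worker, subtasks))  # preserves input order
    # Synthesis: a real supervisor would merge and reconcile results here.
    return results

print(supervisor(["lint", "unit-tests", "docs"]))
```

Note that fan-out only pays off when the subtasks are genuinely independent; shared resources reintroduce the coordination problems the pattern was meant to avoid.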
State Management Across Agents
When agents coordinate, where does state live? The answer determines how robust your system is to failures and how much agents can share.
Agent-local state: Each agent maintains its own context and memory. Simple isolation but no sharing. If an agent crashes, its state is lost.
Shared session store: A central store that all agents can read and write. Enables coordination but requires careful access control. Race conditions become possible.
Event log pattern: All state changes are recorded as events. Any agent can reconstruct state by replaying events. Audit-friendly and recoverable, but can be slow for large histories.
Checkpoint pattern: Periodic snapshots of system state. Agents can restore from checkpoints after failures. Good for recovery but adds overhead.
The session layer from the four-layer memory architecture (Chapter 04) naturally fits here. Sessions store structured event logs--exactly what multi-agent systems need for coordination and recovery.
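The event log pattern can be sketched in a few lines: state changes are appended as events, and any agent reconstructs current state by replaying them from the start. The event shape used here is an assumption for illustration.

```python
# Event log sketch: append-only events, state reconstructed by replay.
# The {"key": ..., "value": ...} event shape is illustrative.

def apply_event(state: dict, event: dict) -> dict:
    new_state = dict(state)  # never mutate prior state; events are the record
    new_state[event["key"]] = event["value"]
    return new_state

def replay(events: list[dict]) -> dict:
    state: dict = {}
    for event in events:
        state = apply_event(state, event)
    return state

log = [
    {"key": "status", "value": "building"},
    {"key": "status", "value": "testing"},   # later events win on replay
    {"key": "owner", "value": "agent-7"},
]
assert replay(log) == {"status": "testing", "owner": "agent-7"}
```

Replay cost grows with history length, which is why this pattern is often paired with the checkpoint pattern: replay only the events since the last snapshot.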
Failure Modes in Orchestration
Multi-agent systems introduce failure modes that single-agent systems don't face:
Routing failures: Requests go to the wrong agent. The agent attempts a task it's not equipped for, wasting time and potentially producing incorrect results.
Handoff losses: Critical context doesn't transfer. The receiving agent lacks information needed to continue correctly.
Coordination deadlocks: Agents waiting for each other. Agent A waits for Agent B's output; Agent B waits for Agent A's confirmation. Neither progresses.
State inconsistencies: Agents have different views of the current state. They take actions that conflict or duplicate.
Cascade failures: One agent's failure propagates. Downstream agents receive bad input and fail in turn.
Each failure mode has mitigation patterns. Timeouts prevent deadlocks. Idempotent operations prevent duplication. Validation at handoffs catches inconsistencies early. Circuit breakers prevent cascades.
Building reliable multi-agent systems requires planning for failure as carefully as planning for success.
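Of those mitigations, the circuit breaker is the least familiar to many readers, so here is a minimal sketch: after a threshold of consecutive failures, calls to a downstream agent are short-circuited instead of passed along, stopping a cascade at its source. The threshold and exception type are illustrative choices.

```python
# Circuit breaker sketch: fail fast after repeated downstream failures.
# Threshold and exception naming are assumptions for illustration.

class CircuitOpen(Exception):
    pass

class CircuitBreaker:
    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self.failures = 0  # consecutive failure count

    def call(self, fn, *args):
        if self.failures >= self.threshold:
            raise CircuitOpen("downstream agent disabled")  # short-circuit
        try:
            result = fn(*args)
        except Exception:
            self.failures += 1  # another consecutive failure
            raise
        self.failures = 0  # any success resets the breaker
        return result
```

A production breaker would also reopen itself after a cooldown to probe whether the downstream agent has recovered; this sketch omits that for brevity.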
The Human Organization Parallel
Multi-agent orchestration mirrors how human organizations coordinate. Routing is like an intake process that directs requests to the right team. Handoffs are like project transitions between departments. Coordination patterns map to organizational structures.
The supervisor pattern is a management hierarchy. The blackboard pattern is an open workspace where people notice what colleagues are doing. The pipeline pattern is an assembly line.
This isn't coincidence. Both are solutions to the same underlying problem: how do you coordinate specialized capabilities to accomplish goals beyond any single actor's scope?
The patterns that work for AI agent orchestration have been tested in human organizations for decades. The difference is that with AI systems, you can iterate on the architecture much faster and measure results more precisely.
Design Principles
Start simple: Begin with single-agent or simple sequential designs. Add complexity only when you hit specific limitations.
Make handoffs explicit: Don't rely on implicit context passing. Define what transfers at each boundary.
Plan for failure: Every agent interaction can fail. Every handoff can lose data. Build recovery into the design from the start.
Observe everything: You can't debug what you can't see. Log routing decisions, handoff contents, coordination events.
Test boundaries: Unit tests for agents are necessary but not sufficient. Integration tests that exercise handoffs and coordination catch a different class of bugs.
The goal isn't the most sophisticated orchestration architecture. It's the simplest architecture that reliably accomplishes your goals. Sophistication for its own sake adds failure modes without adding value.
Related: A3 covers how individual agents gain capabilities through tools. A5 explores how to handle failures when they inevitably occur.