Sub-Agents & Delegation

A single loop can hold only so much in its context. Sub-agents let a parent delegate a scoped task to a child that runs its own loop in its own context and returns only the result. This is the runtime mechanism behind hierarchical and orchestrator patterns.

Delegation is context hygiene as much as it is task decomposition. A child agent can read fifty files, reason through them, and hand back a three-line conclusion — the parent pays for the conclusion, not the fifty files. Isolation is the point.

Structure

The child's intermediate work stays in the child's window; only its distilled result crosses back, keeping the parent's context clean.

How It Works

Scope the task — the parent defines a self-contained sub-task with a clear input and an expected output shape.
Spawn with a fresh context — the child gets its own window, its own budget (a slice of the parent's), and only the context it needs.
Run independently — the child executes its own loop, dispatching tools and reasoning without touching the parent's history.
Return a distilled result — the child hands back a conclusion or structured output, not its full transcript.
Synthesize — the parent integrates the result, optionally spawning more children in parallel.
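The five steps above can be sketched in a few lines of Python. This is a minimal illustration, not a real agent framework: `Task`, `Result`, `Agent`, `delegate`, and `review_files` are hypothetical names, and the child's "loop" is a stand-in that only records what it read instead of calling a model.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Task:
    """A scoped sub-task: a clear instruction plus the inputs it needs."""
    instruction: str
    inputs: list[str]


@dataclass
class Result:
    """The distilled result that crosses back to the parent."""
    summary: str
    steps_taken: int  # metadata only; the steps themselves stay in the child


@dataclass
class Agent:
    name: str
    context: list[str] = field(default_factory=list)  # this agent's own window

    def delegate(self, task: Task, child_loop: Callable[["Agent", Task], Result]) -> Result:
        child = Agent(name=f"{self.name}/child")   # spawn with a fresh, empty context
        child.context.append(task.instruction)     # give it only what it needs
        result = child_loop(child, task)           # child runs independently
        self.context.append(result.summary)        # only the summary enters the parent
        return result


def review_files(child: Agent, task: Task) -> Result:
    # Stand-in for the child's loop: every intermediate step lands in the
    # child's context, never in the parent's.
    for path in task.inputs:
        child.context.append(f"read {path}")
    return Result(
        summary=f"Reviewed {len(task.inputs)} files; no blockers found.",
        steps_taken=len(child.context),
    )


parent = Agent(name="parent")
task = Task("Review these files and report blockers.", [f"file_{i}.txt" for i in range(50)])
result = parent.delegate(task, review_files)
print(len(parent.context), result.steps_taken)  # 1 51
```

The child accumulated 51 context entries (one instruction plus fifty reads), while the parent's window grew by a single line. Synthesis is then just the parent reasoning over the summaries it collected.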

Key Characteristics

Isolation protects the parent's window — the child's exploration doesn't pollute the parent's context; only the result does.
Budgets compose downward — each child inherits a bounded slice, so the whole agent tree stays within a global ceiling.
Structured returns beat transcripts — forcing the child to return a structured output makes integration reliable.
Delegation has overhead — spawning a child costs a full loop. Delegate when isolation or parallelism pays for that cost, not reflexively.
Depth should be bounded — children spawning children spawning children is how a tree becomes uncontrollable. Cap nesting.
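Two of these characteristics, downward-composing budgets and bounded depth, can be enforced with a small amount of bookkeeping. The sketch below assumes a token-denominated budget and a nesting cap of 2; `Budget`, `carve`, `MAX_DEPTH`, and `DelegationError` are hypothetical names chosen for illustration, not part of any particular framework.

```python
from dataclasses import dataclass

MAX_DEPTH = 2  # assumed nesting cap; tune per system


class DelegationError(RuntimeError):
    """Raised when a spawn would break the budget or the depth cap."""


@dataclass
class Budget:
    tokens: int      # tokens this agent may still spend or hand down
    depth: int = 0   # 0 for the root agent

    def carve(self, tokens: int) -> "Budget":
        """Reserve a slice of this budget for a child one level deeper."""
        if self.depth + 1 > MAX_DEPTH:
            raise DelegationError("nesting cap reached; do the work in this loop")
        if tokens > self.tokens:
            raise DelegationError("child slice exceeds the parent's remaining budget")
        self.tokens -= tokens  # the slice comes out of the parent's share
        return Budget(tokens=tokens, depth=self.depth + 1)


root = Budget(tokens=100_000)
child = root.carve(30_000)
grandchild = child.carve(10_000)

# Remaining budgets across the tree still sum to the global ceiling.
print(root.tokens + child.tokens + grandchild.tokens)  # 100000
```

Because every slice is subtracted from its parent, no arrangement of spawns can exceed the root's ceiling, and a grandchild that tries to delegate further gets a `DelegationError` instead of a deeper tree.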

In the Wild

Anthropic: Multi-Agent Systems

An orchestrator-worker research system: one lead agent plans, parallel sub-agents execute

Anthropic's Research feature is this page's structure in production. A lead agent (Claude Opus 4) plans the investigation and spawns three to five specialized sub-agents (Claude Sonnet 4) that search in parallel, each with its own isolated context; the lead synthesizes their distilled results, and a separate citation pass grounds the final answer. On Anthropic's internal research eval, this multi-agent configuration outperformed single-agent Claude Opus 4 by 90.2 percent. The economics are the caveat: multi-agent runs consumed about fifteen times the tokens of an ordinary chat, and in Anthropic's analysis of the BrowseComp evaluation, token usage alone explained 80 percent of the performance variance. Much of the win is simply spending more inference, in parallel, on a task that genuinely decomposes.

Parallelism wins when the task decomposes and the value covers the spend — a 90.2% quality gain bought with roughly 15× the tokens of an ordinary chat.
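The orchestrator-worker shape described here can be sketched with Python's standard thread pool. This is an illustration of the pattern only, not Anthropic's implementation: `plan`, `worker`, and `research` are hypothetical functions, the planner is a fixed three-way split, and each worker returns a canned string where a real sub-agent would run its own model loop.

```python
from concurrent.futures import ThreadPoolExecutor


def plan(question: str) -> list[str]:
    # Hypothetical planner: a real lead agent would decide the split itself.
    angles = ("history", "current state", "open problems")
    return [f"{question} (angle: {angle})" for angle in angles]


def worker(subtask: str) -> str:
    # Stand-in for a sub-agent: it would search and reason in its own
    # context, then return only a distilled finding.
    return f"finding for: {subtask}"


def research(question: str) -> str:
    subtasks = plan(question)
    with ThreadPoolExecutor(max_workers=len(subtasks)) as pool:
        # Executor.map runs the workers concurrently and yields results
        # in the same order as the inputs.
        findings = list(pool.map(worker, subtasks))
    return "\n".join(findings)  # the lead synthesizes the distilled results


report = research("sub-agents")
print(report.splitlines()[0])  # finding for: sub-agents (angle: history)
```

Each worker is a full loop of its own, which is why total token spend grows with the number of sub-agents even as wall-clock time drops.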

The counterargument deserves equal weight: Cognition's "Don't Build Multi-Agents" argues that fragmenting context across agents produces conflicting decisions — every action carries implicit decisions the other agents never see — which is why Devin is built single-threaded by design, sharing full context and traces in one loop.

Pitfalls

Agent Sprawl — spawning sub-agents for work that one loop handles fine, multiplying cost and coordination for no gain.
Leaking the full transcript back — returning the child's entire history defeats the context-isolation benefit entirely.
Unbounded recursion — no nesting cap turns a delegation tree into a runaway that exhausts the global budget.
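One way to guard against the transcript-leak pitfall is to validate what the child hands back before it enters the parent's window. The sketch below assumes the child replies with a JSON object; the field names `conclusion` and `evidence`, the 2,000-character cap, and the `parse_child_result` helper are all hypothetical choices for illustration.

```python
import json
from dataclasses import dataclass

MAX_RESULT_CHARS = 2_000                   # assumed cap on what may cross back
ALLOWED_FIELDS = {"conclusion", "evidence"}


@dataclass(frozen=True)
class ChildResult:
    conclusion: str
    evidence: list[str]


def parse_child_result(raw: str) -> ChildResult:
    """Accept a compact structured result; reject anything transcript-shaped."""
    if len(raw) > MAX_RESULT_CHARS:
        raise ValueError("result too large; it may be a transcript, not a conclusion")
    data = json.loads(raw)  # malformed JSON raises a ValueError subclass
    if not isinstance(data, dict) or set(data) != ALLOWED_FIELDS:
        raise ValueError(f"expected exactly the fields {sorted(ALLOWED_FIELDS)}")
    conclusion, evidence = data["conclusion"], data["evidence"]
    if not isinstance(conclusion, str):
        raise ValueError("conclusion must be a string")
    if not isinstance(evidence, list) or not all(isinstance(e, str) for e in evidence):
        raise ValueError("evidence must be a list of strings")
    return ChildResult(conclusion, evidence)


ok = parse_child_result('{"conclusion": "No blockers.", "evidence": ["file_3.txt"]}')
print(ok.conclusion)  # No blockers.
```

A reply that smuggles in extra fields such as a full message history, or that simply runs too long, fails validation, so the parent can ask the child to retry with a tighter summary.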
