The Agent Loop
The beating heart of any harness. A model call is one-shot: prompt in, text out. The agent loop wraps that call in a cycle — assemble context, call the model, execute whatever it asked for, feed the result back, repeat — until the task is done or a limit is hit. Everything else in this section exists to make this loop reliable.
Observe, think, act, repeat — that's the whole loop. The idea is simple; making it reliable under real conditions is the work.
Structure
Each pass through the cycle is a turn. The loop is deterministic harness code; only the model's decision inside it is non-deterministic.
How It Works
- Assemble — build the prompt for this turn from the system prompt, history, and retrieved context (see Context Assembly).
- Call the model — send the assembled context; receive either a final answer or one or more tool calls.
- Branch — if the model returned a final answer, exit the loop and return it. Otherwise, proceed to execute.
- Execute — run the requested tool(s) through the tool dispatch layer.
- Append — add the tool results to the conversation history as the observation for the next turn.
- Check budget — if step/token/time/cost limits remain, loop; otherwise halt with partial results and a reason.
Key Characteristics
- The loop is deterministic; the model is not — keep all control flow (branching, limits, retries) in harness code, never in the prompt. The model decides what to do; the harness decides whether it's allowed and when to stop.
- Turns are the unit of everything — budgets, traces, and checkpoints are all measured per turn. A clean turn boundary is what makes the rest of the harness tractable.
- Tool results are just more context — execution output is appended as the next observation. The loop doesn't care whether it came from a calculator or a sub-agent.
- One loop, many shapes — a single-tool-call-per-turn loop, a parallel-tool-call loop, and a plan-then-execute loop are all the same skeleton with different branch logic. Start with the plainest shape that works: Anthropic's Building Effective Agents makes this the core advice — find the simplest pattern, and add complexity only when it measurably improves outcomes.
- Statelessness is the model's; state is the harness's — the model forgets everything between calls. The loop is what carries history forward.
Pitfalls
- Control flow in the prompt — "stop when you're confident" is a wish, not a guarantee. Halting belongs in code. This is the root of the Infinite Loop anti-pattern.
- No turn ceiling — a loop with no budget is one bad inference away from running forever and burning money.
- Swallowing the "final vs. tool" ambiguity — if the model emits both prose and a tool call, decide explicitly which wins. Silent guessing produces confusing behavior.