Tool Registry & Discovery
As an agent gains capabilities, two problems appear: the harness needs a single source of truth for what tools exist, and the context window can't hold a hundred verbose tool schemas at once. The registry solves the first; discovery — loading only the relevant tools per turn — solves the second.
Every tool definition you put in context costs tokens and dilutes attention. Ten tools is fine; two hundred is a Tool Junk Drawer that degrades every decision. Discovery is how a large toolset stays usable.
Structure
The registry holds every tool. Only the schemas for a relevant subset are surfaced into context, and that subset expands on demand when the model needs more.
How It Works
- Register — each tool declares a name, description, schema, and metadata (cost, permissions, category) in one catalog.
- Scope per turn — instead of exposing all tools, select the subset relevant to the current turn's work.
- Defer schemas — surface lightweight names/summaries first; load full schemas only when a tool is actually selected, keeping the window lean.
- Discover on demand — let the model search the registry for a capability ("find a tool that sends email") and pull its schema in when matched.
- Validate against the registry — dispatch checks every call against the catalog, so unknown or out-of-scope tools are rejected cleanly.
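The five steps above can be sketched as one small class. This is a minimal illustration under stated assumptions, not a reference implementation: `Tool`, `ToolRegistry`, and every method name are hypothetical, scoping is done by a simple category tag, discovery is a naive keyword overlap (a real harness might use embeddings or a ranked index), and validation only checks for required argument names.

```python
import re
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Tool:
    name: str
    description: str
    schema: dict[str, Any]            # JSON-Schema-style parameter object
    handler: Callable[..., Any]
    category: str = "general"         # metadata used for per-turn scoping


def _keywords(text: str) -> set[str]:
    # Naive tokenizer (an assumption, not a real ranker): words of 3+ letters.
    return {w for w in re.findall(r"[a-z]+", text.lower()) if len(w) >= 3}


class ToolRegistry:
    def __init__(self) -> None:
        self._tools: dict[str, Tool] = {}

    def register(self, tool: Tool) -> None:
        # Register: one catalog entry per tool name.
        if tool.name in self._tools:
            raise ValueError(f"duplicate tool: {tool.name}")
        self._tools[tool.name] = tool

    def scope(self, categories: set[str]) -> set[str]:
        # Scope per turn: select only the tools tagged for the current work.
        return {t.name for t in self._tools.values() if t.category in categories}

    def summaries(self, names: set[str]) -> list[dict[str, str]]:
        # Defer schemas: surface names and one-line descriptions only.
        return [{"name": n, "description": self._tools[n].description}
                for n in sorted(names)]

    def full_schema(self, name: str) -> dict[str, Any]:
        # Loaded only once a tool has actually been selected.
        t = self._tools[name]
        return {"name": t.name, "description": t.description, "parameters": t.schema}

    def search(self, query: str) -> list[str]:
        # Discover on demand: rank tools by keyword overlap with the query.
        wanted = _keywords(query)
        scored = []
        for t in self._tools.values():
            score = len(wanted & _keywords(f"{t.name} {t.description}"))
            if score:
                scored.append((-score, t.name))
        return [name for _, name in sorted(scored)]

    def dispatch(self, name: str, args: dict[str, Any], in_scope: set[str]) -> Any:
        # Validate against the registry before running anything.
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        if name not in in_scope:
            raise PermissionError(f"tool not in scope this turn: {name}")
        tool = self._tools[name]
        missing = [k for k in tool.schema.get("required", []) if k not in args]
        if missing:
            raise ValueError(f"{name}: missing arguments {missing}")
        return tool.handler(**args)


registry = ToolRegistry()
registry.register(Tool(
    "send_email", "Send an email message to a recipient",
    {"type": "object", "required": ["to", "body"]},
    handler=lambda to, body: f"sent to {to}", category="comms"))
registry.register(Tool(
    "read_file", "Read a text file from the workspace",
    {"type": "object", "required": ["path"]},
    handler=lambda path: f"contents of {path}", category="files"))

in_scope = registry.scope({"files"})                  # this turn starts with file tools only
found = registry.search("find a tool that sends email")
in_scope |= set(found[:1])                            # pull the best match into scope
```

In this sketch the model would first see only `registry.summaries(in_scope)`, request `registry.full_schema(...)` for the tool it picks, and have every call pass through `dispatch`, which rejects unknown names and anything outside the turn's scope.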
Key Characteristics
- One catalog, many consumers — dispatch, permissions, and tracing all read from the same registry, so a tool is defined once.
- Schemas are context cost — full definitions are expensive; load them lazily. The cheapest token is the one you didn't include.
- Scoping improves decisions — a focused toolset makes the model's choice easier and more accurate than a wall of options.
- Discovery scales the toolset — search-on-demand means adding the 201st tool doesn't degrade the other 200.
- Metadata enables governance — per-tool cost and permission tags let the harness reason about what a call will cost and whether it's allowed.
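To make the governance point concrete, here is a hedged sketch of how per-tool metadata might gate a call before it runs. The names `ToolMeta` and `TurnPolicy`, the permission strings, and the integer cost units are all assumptions made for illustration; a real harness would define its own cost model and permission vocabulary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolMeta:
    name: str
    cost_units: int                    # hypothetical per-call cost estimate
    permissions: frozenset[str]        # permissions the tool requires


@dataclass
class TurnPolicy:
    granted: frozenset[str]            # permissions granted for this session
    budget: int                        # remaining cost units

    def check(self, meta: ToolMeta) -> tuple[bool, str]:
        # Decide from metadata alone, before the tool is ever invoked.
        missing = meta.permissions - self.granted
        if missing:
            return False, f"missing permissions: {sorted(missing)}"
        if meta.cost_units > self.budget:
            return False, "over budget"
        return True, "ok"

    def charge(self, meta: ToolMeta) -> None:
        # Deduct the declared cost once a call has been allowed.
        self.budget -= meta.cost_units


read_file = ToolMeta("read_file", cost_units=1, permissions=frozenset({"fs.read"}))
send_email = ToolMeta("send_email", cost_units=2, permissions=frozenset({"net.send"}))
web_crawl = ToolMeta("web_crawl", cost_units=50, permissions=frozenset({"fs.read"}))

policy = TurnPolicy(granted=frozenset({"fs.read"}), budget=10)
allowed, reason = policy.check(read_file)
if allowed:
    policy.charge(read_file)
```

Because the decision reads only registry metadata, the same check can serve dispatch, permission prompts, and tracing without each one keeping its own copy of the rules.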
Anthropic introduced the Model Context Protocol in November 2024 to standardize how applications expose tools and resources to LLMs. Each tool declares a name, a description, and JSON-Schema parameters — exactly the registry contract, lifted out of any one codebase and onto the wire. Clients can list a server's tools, surface them to the model, and validate calls against the declared schemas, which makes discovery and lazy schema loading natural rather than bespoke. The protocol was adopted across the industry, including OpenAI's announcement of support in March 2025, turning what had been a per-harness internal catalog into a shared ecosystem boundary.
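As a rough illustration of that contract on the wire, the sketch below builds an MCP-style `tools/list` request, parses a response into a local catalog, and validates a call before building a `tools/call` request. The sample response is made up for this example and does not come from a real server, the helper names (`load_catalog`, `summaries`, `build_call`) are hypothetical, and the validation only checks required keys rather than running a full JSON Schema validator.

```python
import json
from typing import Any

# JSON-RPC request a client sends to list a server's tools.
list_request = {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}

# Illustrative response only: the tool shown here is invented for the example.
sample_response = json.loads("""
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "tools": [
      {
        "name": "send_email",
        "description": "Send an email message to a recipient",
        "inputSchema": {
          "type": "object",
          "properties": {"to": {"type": "string"}, "body": {"type": "string"}},
          "required": ["to", "body"]
        }
      }
    ]
  }
}
""")


def load_catalog(response: dict[str, Any]) -> dict[str, dict[str, Any]]:
    # Index the declared tools by name so dispatch can look them up.
    return {tool["name"]: tool for tool in response["result"]["tools"]}


def summaries(catalog: dict[str, dict[str, Any]]) -> list[dict[str, str]]:
    # Lightweight view for the context window: no inputSchema included.
    return [{"name": name, "description": tool.get("description", "")}
            for name, tool in sorted(catalog.items())]


def build_call(catalog: dict[str, dict[str, Any]], name: str,
               arguments: dict[str, Any], request_id: int) -> dict[str, Any]:
    # Reject unknown tools and calls missing required arguments.
    if name not in catalog:
        raise KeyError(f"unknown tool: {name}")
    required = catalog[name]["inputSchema"].get("required", [])
    missing = [key for key in required if key not in arguments]
    if missing:
        raise ValueError(f"{name}: missing arguments {missing}")
    return {"jsonrpc": "2.0", "id": request_id, "method": "tools/call",
            "params": {"name": name, "arguments": arguments}}


catalog = load_catalog(sample_response)
call = build_call(catalog, "send_email", {"to": "a@example.com", "body": "hi"}, 2)
```

Keeping the full `inputSchema` in the local catalog while showing the model only `summaries(catalog)` is one way to get lazy schema loading on top of a server's tool list.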
Pitfalls
- Exposing every tool every turn — floods the window, raises cost, and worsens selection. The Tool Junk Drawer in its purest form.
- No single source of truth — tool definitions scattered across the codebase drift out of sync with what dispatch actually accepts.
- Over-scoping — hiding a tool the task genuinely needs forces the model to improvise badly. Discovery must be able to surface it.