Parallel Fan-Out
Multiple LLM calls or agent tasks run concurrently, and their results are aggregated into a final output. Two variants: sectioning (independent subtasks run in parallel) and voting (the same task runs multiple times for higher confidence).
Structure
A fan-out stage distributes the work. Each branch runs independently with no inter-branch communication. A fan-in stage collects all results and merges them, whether by concatenation, scoring, majority vote, or LLM-powered synthesis.
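The structure can be sketched in a few lines. This is a minimal skeleton using Python's asyncio; `call_llm` is a hypothetical stand-in for whatever async client call your stack provides.

```python
import asyncio

# Hypothetical stand-in for an async LLM call; swap in your client's API.
async def call_llm(prompt: str) -> str:
    await asyncio.sleep(0.01)  # simulates network/model latency
    return f"response to: {prompt}"

async def fan_out_fan_in(prompts: list[str]) -> str:
    # Fan-out: launch every branch concurrently; branches never communicate.
    results = await asyncio.gather(*(call_llm(p) for p in prompts))
    # Fan-in: merge by concatenation (scoring or synthesis are drop-in alternatives).
    return "\n".join(results)

print(asyncio.run(fan_out_fan_in(["task A", "task B", "task C"])))
```

The fan-in here is the simplest possible merge; the same skeleton holds when the final `join` is replaced by a scoring function or a synthesis call.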
How It Works
Sectioning variant:
- Decompose — split the task into independent subtasks
- Fan-out — run all subtasks concurrently
- Fan-in — aggregate results into a unified output
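The three sectioning steps map directly onto code. In this sketch, `review` is a hypothetical per-aspect reviewer (a real version would prompt an LLM with the aspect), and the fan-in is plain aggregation of labeled findings.

```python
import asyncio

# Hypothetical reviewer; a real version would prompt an LLM with the aspect.
async def review(aspect: str, code: str) -> str:
    await asyncio.sleep(0.01)  # placeholder for model latency
    return f"[{aspect}] reviewed {len(code)} chars"

async def section_review(code: str) -> str:
    # Decompose: independent review aspects, one subtask each.
    aspects = ["security", "performance", "style"]
    # Fan-out: all subtasks run concurrently.
    findings = await asyncio.gather(*(review(a, code) for a in aspects))
    # Fan-in: aggregate labeled findings into a unified report.
    return "\n".join(findings)

print(asyncio.run(section_review("def add(a, b): return a + b")))
```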
Voting variant:
- Duplicate — run the same task N times (same or different models)
- Collect — gather all N outputs
- Select — pick the best via majority vote, scoring, or LLM judge
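The voting variant differs only in the fan-in: the branches are duplicates, and selection is a majority vote. A sketch, with `classify` as a hypothetical nondeterministic call (hard-coded here so the example is reproducible; a real model at temperature > 0 would vary per attempt):

```python
import asyncio
from collections import Counter

# Hypothetical classifier call; hard-coded variation stands in for sampling noise.
async def classify(text: str, attempt: int) -> str:
    await asyncio.sleep(0.01)
    return "unsafe" if attempt == 1 else "safe"

async def vote(text: str, n: int = 5) -> str:
    # Duplicate: run the same task n times concurrently.
    outputs = await asyncio.gather(*(classify(text, i) for i in range(n)))
    # Collect + Select: majority vote over the n outputs.
    winner, _count = Counter(outputs).most_common(1)[0]
    return winner

print(asyncio.run(vote("some input")))  # prints "safe" (4 of 5 attempts agree)
```

Swapping `Counter` for a scoring function or an LLM judge changes only the Select step; the Duplicate and Collect steps are unchanged.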
Key Characteristics
- Lower latency — wall-clock time = slowest branch, not sum of all branches
- Higher throughput — N tasks in the time of one
- No inter-branch dependency — branches can't communicate during execution
- Higher cost — N concurrent LLM calls instead of one
- Aggregation is hard — merging conflicting results requires careful design
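The latency claim is easy to verify empirically. In this sketch, `asyncio.sleep` stands in for LLM calls of varying duration:

```python
import asyncio
import time

async def branch(delay: float) -> float:
    await asyncio.sleep(delay)  # stands in for an LLM call of varying length
    return delay

async def timed_fan_out(delays: list[float]) -> float:
    start = time.perf_counter()
    await asyncio.gather(*(branch(d) for d in delays))
    return time.perf_counter() - start

# Three branches of 0.05s, 0.10s, 0.20s: wall-clock tracks the slowest branch
# (~0.20s), not the 0.35s a sequential run would take. Cost, however, is still
# three calls: the latency win does not reduce the bill.
elapsed = asyncio.run(timed_fan_out([0.05, 0.10, 0.20]))
print(f"elapsed: {elapsed:.2f}s")
```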
When to Use
- Subtasks are truly independent (no data dependencies between them)
- Latency is critical and sequential execution is too slow
- You want higher confidence through multiple independent attempts (voting)
- The task naturally splits into parallel concerns (security review + performance review + style review)
- You can afford the cost of N concurrent calls