Parallel Fan-Out

Multiple LLM calls or agent tasks run concurrently, and their results are aggregated into a final output. Two variants: sectioning (independent subtasks run in parallel) and voting (the same task runs multiple times for higher confidence).


Structure

The fan-out distributes work. Each branch runs independently with no inter-branch communication. The fan-in collects all results and merges them — by concatenation, scoring, majority vote, or LLM-powered synthesis.
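A minimal sketch of this structure in Python with asyncio. The `call_llm` stub is an assumption standing in for a real model client, and concatenation is used as the simplest fan-in strategy:

```python
import asyncio

# Placeholder for a real LLM client call -- an assumption, not a specific API.
async def call_llm(prompt: str) -> str:
    await asyncio.sleep(0.01)  # simulate network latency
    return f"answer({prompt})"

async def fan_out_fan_in(prompts: list[str]) -> str:
    # Fan-out: every branch starts concurrently; branches never communicate.
    results = await asyncio.gather(*(call_llm(p) for p in prompts))
    # Fan-in: merge by concatenation -- scoring, majority vote, or an
    # LLM-powered synthesis call could replace this line.
    return "\n".join(results)

print(asyncio.run(fan_out_fan_in(["task A", "task B", "task C"])))
```

`asyncio.gather` preserves input order, so the merged output lines up with the original prompts even though branches finish at different times.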


How It Works

Sectioning variant:

  1. Decompose — split the task into independent subtasks
  2. Fan-out — run all subtasks concurrently
  3. Fan-in — aggregate results into a unified output
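The three sectioning steps can be sketched as follows. The `call_llm` stub and the section names are illustrative assumptions; in practice the decompose step might itself be an LLM call:

```python
import asyncio

# Stub standing in for a real model call -- an assumption for illustration.
async def call_llm(prompt: str) -> str:
    await asyncio.sleep(0.01)
    return f"summary of {prompt!r}"

async def review(document: dict[str, str]) -> str:
    # 1. Decompose: each section of the document is an independent subtask.
    subtasks = [f"{name}: {text}" for name, text in document.items()]
    # 2. Fan-out: run all subtasks concurrently.
    partials = await asyncio.gather(*(call_llm(t) for t in subtasks))
    # 3. Fan-in: aggregate into a unified output.
    return "\n\n".join(partials)

doc = {"intro": "...", "methods": "...", "results": "..."}
print(asyncio.run(review(doc)))
```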

Voting variant:

  1. Duplicate — run the same task N times (same or different models)
  2. Collect — gather all N outputs
  3. Select — pick the best via majority vote, scoring, or LLM judge
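The voting variant maps directly onto a gather-then-count loop. The stub below fakes sampling variance with `random.choice` (an assumption; a real call would query a model, optionally with different temperatures or models per branch):

```python
import asyncio
import random
from collections import Counter

# Stub: a real call would query a model; random output mimics sampling.
async def call_llm(prompt: str) -> str:
    await asyncio.sleep(0.01)
    return random.choice(["yes", "yes", "no"])  # biased toward "yes"

async def vote(prompt: str, n: int = 5) -> str:
    # 1. Duplicate: the same prompt goes out n times.
    # 2. Collect: gather all n outputs.
    outputs = await asyncio.gather(*(call_llm(prompt) for _ in range(n)))
    # 3. Select: majority vote (a scorer or LLM judge could replace this).
    return Counter(outputs).most_common(1)[0][0]

print(asyncio.run(vote("Is this code safe to merge?")))
```

Using an odd `n` avoids ties under a two-way majority vote; with open-ended outputs you would normalize or cluster answers before counting.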

Key Characteristics

  • Lower latency — wall-clock time = slowest branch, not sum of all branches
  • Higher throughput — N tasks in the time of one
  • No inter-branch dependency — branches can't communicate during execution
  • Higher cost — N concurrent LLM calls instead of one
  • Aggregation is hard — merging conflicting results requires careful design
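The latency claim in the first bullet can be checked directly: wall-clock time for concurrent branches tracks the slowest branch, not the sum. Here `asyncio.sleep` stands in for per-call LLM latency:

```python
import asyncio
import time

async def branch(delay: float) -> float:
    await asyncio.sleep(delay)  # stands in for one LLM call's latency
    return delay

async def run_all(delays: list[float]) -> float:
    start = time.perf_counter()
    await asyncio.gather(*(branch(d) for d in delays))
    return time.perf_counter() - start

delays = [0.05, 0.10, 0.20]
elapsed = asyncio.run(run_all(delays))
# elapsed is close to max(delays) = 0.20s, not sum(delays) = 0.35s
print(f"{elapsed:.2f}s for {len(delays)} branches")
```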

When to Use

  • Subtasks are truly independent (no data dependencies between them)
  • Latency is critical and sequential execution is too slow
  • You want higher confidence through multiple independent attempts (voting)
  • The task naturally splits into parallel concerns (security review + performance review + style review)
  • You can afford the cost of N concurrent calls