Parallel Fan-Out

Multiple LLM calls or agent tasks run concurrently, and their results are aggregated into a final output. Two variants: sectioning (independent subtasks run in parallel) and voting (the same task runs multiple times for higher confidence).


Structure

The fan-out distributes work. Each branch runs independently with no inter-branch communication. The fan-in collects all results and merges them — by concatenation, scoring, majority vote, or LLM-powered synthesis.
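A minimal sketch of this structure in Python with asyncio. The `call_llm` stub is an assumption standing in for a real model client, and concatenation is used as the simplest fan-in strategy:

```python
import asyncio

# Placeholder for a real LLM client call -- an assumption, not a specific API.
async def call_llm(prompt: str) -> str:
    await asyncio.sleep(0.01)  # simulate network latency
    return f"answer({prompt})"

async def fan_out_fan_in(prompts: list[str]) -> str:
    # Fan-out: every branch starts concurrently; branches never communicate.
    results = await asyncio.gather(*(call_llm(p) for p in prompts))
    # Fan-in: merge by concatenation -- scoring, majority vote, or an
    # LLM-powered synthesis call could replace this line.
    return "\n".join(results)

print(asyncio.run(fan_out_fan_in(["task A", "task B", "task C"])))
```

`asyncio.gather` preserves input order, so the merged output lines up with the original prompts even though branches finish at different times.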


How It Works

Sectioning variant:

  1. Decompose — split the task into independent subtasks
  2. Fan-out — run all subtasks concurrently
  3. Fan-in — aggregate results into a unified output
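The three sectioning steps can be sketched as follows. The `call_llm` stub and the section names are illustrative assumptions; in practice the decompose step might itself be an LLM call:

```python
import asyncio

# Stub standing in for a real model call -- an assumption for illustration.
async def call_llm(prompt: str) -> str:
    await asyncio.sleep(0.01)
    return f"summary of {prompt!r}"

async def review(document: dict[str, str]) -> str:
    # 1. Decompose: each section of the document is an independent subtask.
    subtasks = [f"{name}: {text}" for name, text in document.items()]
    # 2. Fan-out: run all subtasks concurrently.
    partials = await asyncio.gather(*(call_llm(t) for t in subtasks))
    # 3. Fan-in: aggregate into a unified output.
    return "\n\n".join(partials)

doc = {"intro": "...", "methods": "...", "results": "..."}
print(asyncio.run(review(doc)))
```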

Voting variant:

  1. Duplicate — run the same task N times (same or different models)
  2. Collect — gather all N outputs
  3. Select — pick the best via majority vote, scoring, or LLM judge
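The voting variant maps directly onto a gather-then-count loop. The stub below fakes sampling variance with `random.choice` (an assumption; a real call would query a model, optionally with different temperatures or models per branch):

```python
import asyncio
import random
from collections import Counter

# Stub: a real call would query a model; random output mimics sampling.
async def call_llm(prompt: str) -> str:
    await asyncio.sleep(0.01)
    return random.choice(["yes", "yes", "no"])  # biased toward "yes"

async def vote(prompt: str, n: int = 5) -> str:
    # 1. Duplicate: the same prompt goes out n times.
    # 2. Collect: gather all n outputs.
    outputs = await asyncio.gather(*(call_llm(prompt) for _ in range(n)))
    # 3. Select: majority vote (a scorer or LLM judge could replace this).
    return Counter(outputs).most_common(1)[0][0]

print(asyncio.run(vote("Is this code safe to merge?")))
```

Using an odd `n` avoids ties under a two-way majority vote; with open-ended outputs you would normalize or cluster answers before counting.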

Key Characteristics

  • Lower latency — wall-clock time = slowest branch, not sum of all branches
  • Higher throughput — N tasks in the time of one
  • No inter-branch dependency — branches can't communicate during execution
  • Higher cost — N concurrent LLM calls instead of one
  • Aggregation is hard — merging conflicting results requires careful design
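The latency claim in the first bullet can be checked directly: wall-clock time for concurrent branches tracks the slowest branch, not the sum. Here `asyncio.sleep` stands in for per-call LLM latency:

```python
import asyncio
import time

async def branch(delay: float) -> float:
    await asyncio.sleep(delay)  # stands in for one LLM call's latency
    return delay

async def run_all(delays: list[float]) -> float:
    start = time.perf_counter()
    await asyncio.gather(*(branch(d) for d in delays))
    return time.perf_counter() - start

delays = [0.05, 0.10, 0.20]
elapsed = asyncio.run(run_all(delays))
# elapsed is close to max(delays) = 0.20s, not sum(delays) = 0.35s
print(f"{elapsed:.2f}s for {len(delays)} branches")
```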

When to Use

  • Subtasks are truly independent (no data dependencies between them)
  • Latency is critical and sequential execution is too slow
  • You want higher confidence through multiple independent attempts (voting)
  • The task naturally splits into parallel concerns (security review + performance review + style review)
  • You can afford the cost of N concurrent calls