Orchestrator-Workers
A central agent inspects a goal, invents the subtasks it implies, dispatches workers to handle each, and synthesizes their results.
Understand This First
- Subagent – workers in this pattern are typically subagents with focused scopes.
- Decomposition – the orchestrator must decompose the goal into useful pieces.
- Plan Mode – the orchestrator usually plans before it dispatches.
Context
At the agentic level, Orchestrator-Workers is one of the canonical multi-agent architectures: a single coordinator agent receives a goal, figures out what subtasks it implies, spawns worker agents to handle those subtasks, and stitches their answers back together. The key move is that the orchestrator decides what to dispatch after it has looked at the input. Subtasks aren’t pre-declared in code; the orchestrator invents them per request.
This is the default shape most production coding agents fall into when the work spans an unknown number of files, research threads, or implementation steps. It sits one level up from a single agent and one level below a team of peers that self-organize. A feature request that needs research, design, implementation, and review often maps cleanly to an orchestrator plus four workers — but only because the orchestrator decided those were the right four steps for this particular request.
Problem
You have a goal that breaks into multiple subtasks, but you don’t know in advance what those subtasks are or how many there will be.
A single agent working alone hits context and focus limits. A pre-wired pipeline (step A, then step B, then step C) can’t adapt when the input demands a different shape. A team of peer agents can self-organize, but the overhead of peer coordination is high and unnecessary when one coordinator can direct the work cleanly. You need an architecture that adapts its shape to the input without burning the coordination budget of a full team.
Forces
- Dynamic shape. The number and type of subtasks depend on the specific input, so the structure can’t be hard-coded.
- Context budget. One agent can’t hold every file, every search result, and every piece of generated code in its own window without degrading.
- Coordination cost. Peer coordination multiplies messages: full peer-to-peer coordination among n agents needs on the order of n² channels, while a single coordinator needs only n. A hub is cheaper when the dispatch pattern is hierarchical.
- Synthesis loss. When workers return results, the orchestrator has to integrate them without dropping the important detail.
- Cost and latency. Every worker dispatch is more tokens and often more wall-clock time.
Solution
Structure the agent as one orchestrator plus a set of workers. The orchestrator has three jobs: decide what subtasks the goal requires, dispatch a worker for each, and synthesize the returned results into the final answer.
Decide. When a request arrives, the orchestrator inspects it and produces a plan: these are the subtasks, in this rough order, with these dependencies. The plan isn’t a menu chosen from a fixed list; it’s a fresh decomposition written for this request. If the request is a bug report, the plan might be “reproduce, localize, fix, verify.” If the request is a refactoring task, the plan might be “map the call sites, design the new shape, apply the change, run the tests.” Different inputs get different plans.
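A minimal sketch of what that per-request plan might look like as data, in Python. Subtask and Plan are hypothetical types invented for illustration, not part of any framework; the point is that the structure is built fresh for each request, with dependencies recorded so later dispatch can respect them.

```python
from dataclasses import dataclass, field

@dataclass
class Subtask:
    # Hypothetical type: one dispatchable unit of work.
    name: str
    prompt: str                      # the narrow instruction a worker gets
    depends_on: list[str] = field(default_factory=list)

@dataclass
class Plan:
    # A fresh decomposition written for one specific request.
    subtasks: list[Subtask]

    def ready(self, done: set[str]) -> list[Subtask]:
        # Subtasks whose dependencies are all satisfied and not yet run.
        return [t for t in self.subtasks
                if t.name not in done
                and all(d in done for d in t.depends_on)]

# A bug report might decompose as:
bug_plan = Plan([
    Subtask("reproduce", "Reproduce the reported failure; capture the output."),
    Subtask("localize", "Find the faulty code path.", depends_on=["reproduce"]),
    Subtask("fix", "Apply the minimal fix.", depends_on=["localize"]),
    Subtask("verify", "Re-run the reproduction.", depends_on=["fix"]),
])
```

A refactoring request would produce a different list entirely; only the Plan shape is shared.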
Dispatch. For each subtask, the orchestrator spawns a worker with a narrow prompt, the specific context it needs, and a clear expected output. Each worker runs in its own context window, often on a cheaper model. Workers don’t see each other; they see only what the orchestrator gave them.
Synthesize. As workers return results, the orchestrator integrates them into the running picture and decides what to do next. Sometimes a worker’s output changes the plan: a research worker discovers a hidden dependency, so the orchestrator spawns an extra implementation worker. Sometimes a worker fails and the orchestrator has to decide whether to retry, fall back, or escalate. The synthesis step is where the orchestrator earns its role: it keeps the big picture coherent while the workers stay focused on their fragments.
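The three jobs can be sketched as one loop. All four callables here are hypothetical stand-ins for model calls, named only for this sketch; a real orchestrator would also handle failures and context limits.

```python
def orchestrate(goal, decompose, run_worker, revise, synthesize):
    # Hypothetical callables, standing in for model calls:
    #   decompose(goal)            -> list of task dicts  (Decide)
    #   run_worker(task, context)  -> result string       (Dispatch)
    #   revise(plan, task, result) -> possibly new plan   (replan on new info)
    #   synthesize(goal, results)  -> final answer        (Synthesize)
    plan = decompose(goal)
    results = {}
    while True:
        ready = [t for t in plan
                 if t["name"] not in results
                 and all(d in results for d in t.get("deps", []))]
        if not ready:
            break
        for task in ready:
            # Each worker sees only the outputs its dependencies produced.
            context = {d: results[d] for d in task.get("deps", [])}
            results[task["name"]] = run_worker(task, context)
            # A worker's output may change the plan itself.
            plan = revise(plan, task, results[task["name"]])
    return synthesize(goal, results)
```

Note where revise sits: inside the loop, after every worker returns. That placement is what lets a research result spawn an extra implementation task mid-run.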
The contrast with Parallelization is sharp. Parallelization runs pre-declared independent tasks at the same time (run the same test suite on three branches). Orchestrator-Workers invents the subtasks per request and runs them in parallel or in sequence as dependencies allow.
Keep the orchestrator’s prompt focused on decision-making and synthesis, not on execution. If the orchestrator is doing the actual coding, reading, or reviewing, you’ve collapsed the pattern back into a single agent. Workers exist so the orchestrator can stay high-level.
How It Plays Out
A developer asks an agent to “add a caching layer to the order service.” The orchestrator reads the request and doesn’t yet know which files need to change, whether the project already has a caching library, or how the cache should be invalidated. It writes a three-step plan: research the current order service, design the cache shape, and implement the change.
A research worker goes first. It reports back that the service has four hot endpoints, uses Postgres directly, and already pulls in Redis for session storage. The orchestrator updates the plan (Redis is available, so no new dependency) and spawns a design worker with the research output as context. Once the design lands, an implementation worker builds it, and a review worker checks the diff. Each worker saw only what it needed; the orchestrator stitched the whole thing together.
Now consider an agent asked to summarize a long technical discussion thread with thirty messages across twelve contributors. The orchestrator can’t predict how many named topics will emerge, so it doesn’t write a fixed pipeline. It spawns a scanning worker to cluster the messages into topics, and the worker returns five clusters.
The orchestrator then spawns one summarizer worker per topic, in parallel, each with the relevant message subset. When the summaries come back, the orchestrator writes a top-level overview and appends the five sections. The shape of the output, five topics rather than three or seven, was decided by the orchestrator after looking at the input, not before.
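The fan-out step can be sketched with a thread pool. Here cluster_topics, summarize_worker, and write_overview are hypothetical wrappers around model calls; what matters is that the degree of parallelism is discovered at runtime, not written into the code.

```python
from concurrent.futures import ThreadPoolExecutor

def summarize_thread(messages, cluster_topics, summarize_worker, write_overview):
    # All three callables are assumed stand-ins for model calls; the
    # number of topics is whatever the scanning worker finds.
    clusters = cluster_topics(messages)            # scanning worker
    with ThreadPoolExecutor() as pool:
        # One summarizer worker per discovered topic, run in parallel;
        # pool.map preserves cluster order in the returned summaries.
        summaries = list(pool.map(summarize_worker, clusters))
    return write_overview(clusters, summaries)     # orchestrator synthesis
```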
Here’s what dispatch looks like inside the orchestrator’s loop:
Orchestrator receives: "Refactor the billing module to use the new invoice schema."
Orchestrator plans:
1. Research worker: map all files that reference Invoice, InvoiceLine, or
Billing. Return a list with brief annotations.
2. (Wait for research.)
3. Design worker: given the file list and the new schema, propose the
minimal diff strategy. Return a plan.
4. Implementation workers: one per module boundary the design identifies.
5. Review worker: read the final diff and flag anything the design didn't
anticipate.
Research worker returns: 23 files across 4 modules (api/, billing/core/,
reports/, migrations/).
Orchestrator updates plan:
Design worker will get the 4-module breakdown as its context scope.
Design worker returns: "Change InvoiceLine in billing/core first; api/ and
reports/ follow by reference; migrations/ needs a new version file."
Orchestrator dispatches 3 implementation workers in parallel (core, api
+ reports, migrations) since the design made their independence clear.
All three workers return. Orchestrator dispatches the reviewer.
Reviewer flags one missing call site in reports/templates/. Orchestrator
spawns a follow-up worker to patch it. Done.
Two things worth noticing. The plan changed after the first worker’s output, and a fixed pipeline couldn’t have adapted. And the parallelization decision (three implementation workers at once) was the orchestrator’s call, made because the design worker’s output revealed the modules were independent. A peer team would have had to discover that through coordination messages; a single agent would have serialized the work.
Consequences
Benefits. The orchestrator’s context stays relatively clean because the workers absorb the heavy reading, searching, and generation. The architecture adapts to the specific input, so the same agent handles small and large requests without reconfiguration. Workers can run on cheaper or faster models when their subtasks don’t need the orchestrator’s reasoning strength. Parallelism falls out naturally when the plan reveals independent subtasks.
Liabilities. Orchestrator context saturation is real: as workers report back, their outputs pile up in the orchestrator’s window. On long tasks, the orchestrator needs compaction or externalized state to keep working. Cost can blow out when speculative worker dispatches burn tokens on output that is never used. Synthesis loss happens when the orchestrator summarizes a worker’s report and drops a detail that mattered.
Partial failure is awkward. If one worker of five fails, the orchestrator has to decide whether to retry, substitute, or abandon, and that logic is surprisingly easy to get wrong. The pattern also creates a subtle trust hierarchy (see Delegation Chain): the orchestrator’s authority flows to workers the user never directly approved.
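One way to keep that logic from going wrong is to make the policy explicit rather than improvised per failure. A hedged sketch, assuming a hypothetical WorkerError raised by dispatch; the retry bound and fallback hook are illustrative choices, not a prescribed API:

```python
class WorkerError(Exception):
    # Hypothetical failure type raised by a worker dispatch.
    pass

def dispatch_with_policy(task, run_worker, max_retries=2, fallback=None):
    # One explicit policy: bounded retries, then an optional fallback,
    # then escalation; never silently drop the subtask.
    last_error = None
    for _ in range(max_retries + 1):
        try:
            return run_worker(task)
        except WorkerError as err:
            last_error = err
    if fallback is not None:
        return fallback(task)
    raise last_error  # escalate so the orchestrator can replan or abandon
```

Surfacing the final failure to the orchestrator, rather than swallowing it, is the important part: the orchestrator is the only component with enough context to choose between substitute and abandon.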
Related Patterns
- Depends on: Subagent – workers are typically subagents.
- Depends on: Decomposition – the orchestrator’s core skill is good decomposition.
- Depends on: Plan Mode – the orchestrator usually plans before dispatching.
- Contrasts with: Parallelization – parallelization runs pre-declared independent tasks; orchestrator-workers invents them per request.
- Contrasts with: Agent Teams – agent teams coordinate peer-to-peer; orchestrator-workers is hierarchical.
- Contrasts with: Handoff – handoff transfers a task between agents; orchestrator-workers keeps a central coordinator.
- Uses: Compaction – how the orchestrator manages its own window as workers report back.
- Uses: Model Routing – workers often run on cheaper or faster models than the orchestrator.
- Enables: Generator-Evaluator – one common worker configuration is a generator with an evaluator worker checking its output.
- Crosses: Delegation Chain – the orchestrator’s authority propagates to workers the user never directly approved.
Sources
- Anthropic’s Building Effective Agents (December 2024) named and formalized orchestrator-workers as one of six canonical agentic architectures, alongside prompt chaining, routing, parallelization, evaluator-optimizer, and fully autonomous agents. The article’s framing of “subtasks determined by the orchestrator based on the specific input” is the core definition used here.
- Reid G. Smith’s Contract Net Protocol (1980) is the intellectual ancestor: a coordinator announces a task, receives bids, and awards contracts to specialist workers. The modern agentic version drops the bidding and lets the orchestrator choose workers directly, but the hierarchical coordinator-plus-workers shape is the same.
- The multi-agent systems literature from the 1990s, particularly work by Michael Wooldridge and Nick Jennings, established the vocabulary of coordination, delegation, and task assignment among software agents. That vocabulary underpins the language used across modern agent frameworks.
- The “puppeteer” framing (arXiv:2505.19591, Multi-Agent Collaboration via Evolving Orchestration) extends the pattern by using reinforcement learning to train the orchestrator’s dispatch policy, treating the worker-selection decision as a learned skill rather than a hand-crafted prompt.
Further Reading
- Building Effective Agents – Anthropic’s canonical survey of agentic architectures: https://www.anthropic.com/research/building-effective-agents
- Design Patterns for Effective AI Agents by Pat McGuinness – a practitioner-oriented walkthrough of the same taxonomy with extended examples.
- Multi-Agent Collaboration via Evolving Orchestration (arXiv:2505.19591) – the research direction where the orchestrator’s policy is learned rather than prompted.