Async Subagent

Pattern

A named solution to a recurring problem.

Delegate a task to a child agent without blocking: spawn it, get a handle back immediately, keep working, and collect the result later.

Also known as: Non-Blocking Subagent, Fire-and-Continue Delegation, Asynchronous Delegation

The ordinary Subagent makes the parent wait. The parent dispatches a child, freezes, and resumes only when the child returns. That’s fine when the parent has nothing better to do, but it wastes the parent’s time whenever it has work it could be getting on with. An async subagent removes the wait: the parent spawns the child, gets a task handle back right away, and keeps going. It picks up the result whenever the child finishes, or whenever the parent decides to look.

Understand This First

Subagent — the synchronous case this pattern relaxes; read it first.
Agent — an async subagent is an agent with a non-blocking launch contract.
Thread-per-Task — each async child runs in its own thread.

Context

At the agentic level, delegation comes in two flavors that mirror blocking and non-blocking I/O. A blocking call hands off work and stops until the result comes back. A non-blocking call hands off work and returns immediately with a handle you can poll, leaving you free to do something else in the meantime. An async subagent applies the second flavor to agent delegation.

The parent is itself an agent — not a human. That’s what separates this pattern from a Background Agent, where a person launches an independent session and walks away. Here the launcher stays inside the same workflow, still planning and still acting, with one or more children running concurrently underneath it. The parent never leaves the loop; it just stops waiting inside it.

Problem

How does a parent agent delegate a slow subtask without freezing for the duration?

A synchronous subagent blocks its parent. While a child reviews a large module or runs a slow test suite, the parent sits idle, burning wall-clock time it could spend on independent work. Worse, the parent stops being responsive: if the human wants to steer, ask a question, or change direction, they have to wait out the child too. As agentic pipelines grow from one-thing-at-a-time into concurrent architectures, the parent-freezes-while-waiting constraint becomes the bottleneck. You can’t fix it by making the child faster. You need a way for the parent to delegate and stay live.

Forces

Parent responsiveness. A parent that blocks on every child can’t react to new input or coordinate other work while it waits.
Wall-clock overlap. A child’s runtime should overlap the parent’s, not stack on top of it.
Result timing. A non-blocking result arrives at an unpredictable moment, and the parent has to fold it into context that has moved on since the spawn.
Lifecycle bookkeeping. Every outstanding child is state the parent must track: which handles are live, which finished, which to cancel.
Loss of a natural sync point. Blocking delegation has an obvious “now I have the answer” moment; non-blocking delegation gives that up and must recover it deliberately.

Solution

Spawn the child through a non-blocking call that returns a task handle, then manage the child’s life through explicit lifecycle operations. The parent decides when to check on it, when to steer it, when to collect its result, and when to give up on it. The contract has four moves.

Spawn. The parent calls a delegation tool and gets back a task identifier instead of a result. The child starts running in its own thread with its own context. Control returns to the parent on the next line.

Check. The parent polls the handle to ask whether the child is still running, finished, or failed. Checking is cheap and read-only; it doesn’t block. A parent typically checks at natural pauses in its own work rather than in a tight loop.

Steer. While the child runs, the parent can send it a correction or extra context: “also cover the auth module,” or “stop once the first failing test reproduces.” Not every harness supports mid-flight steering, but when it does, it lets the parent adjust without canceling and restarting.

Collect or cancel. When the child is done, the parent collects its result and folds it into context. This is the handoff that ends the delegation. If the child is no longer needed (the parent found the answer elsewhere, the task changed, the run is drifting), the parent cancels the handle and reclaims the resources.

The discipline the pattern demands is bookkeeping. The parent holds a set of live handles and has to remember what each one is for, because a result can land at any time and has to be matched back to the work that requested it. Keep the set small and the purposes distinct. A parent juggling ten anonymous handles has reinvented the coordination problem that Orchestrator-Workers exists to structure, at which point you want the orchestrator, not a pile of loose async calls.

Warning

A non-blocking result has to land somewhere. If the parent spawns a child, moves three steps down its own plan, and then collects a result that assumes the world as it was at spawn time, the result can be stale or contradictory. Decide up front where each child’s output will be merged, and re-check the relevant context at collection time rather than trusting the result to still fit.

How It Plays Out

A parent agent is planning a refactor and wants a full read of the current code review findings before it commits to an approach, but reading every flagged file itself would fill its context window. It spawns an async review child: “scan the modules under src/payments/ for the issues in this review thread and summarize them by severity.” The spawn returns a task id immediately, and the parent goes back to drafting its migration plan. Two planning steps later it checks the handle, sees the child finished, and collects a severity-ranked summary. It folds the findings into the plan it was already writing. No idle wait, and the two streams of work overlapped in wall-clock time.

A parent doing a multi-file refactor delegates the mechanical edits to children while it writes the tests that will verify them. It spawns one async child per module (“convert this module to the new logging interface”) and, instead of waiting on each, turns to authoring the regression tests for the new behavior. As each child reports back, the parent collects its diff and runs it against the tests it has been writing. One child drifts, touching a file outside its module; the parent cancels that handle, narrows the prompt, and respawns. The test-writing never stopped while any of this happened.

Tip

Give each async child a result that is small and self-describing (a summary, a diff, a short report), not a transcript. The parent has to absorb the result into context that has moved on since the spawn, and a compact, labeled artifact is far easier to fold in correctly than a wall of raw output.

Consequences

Benefits. The parent stays responsive and stays busy. Delegation no longer costs the parent its own wall-clock time, so a single agent can keep planning, keep talking to the human, and keep making progress while slow subtasks run underneath it. This is the move that turns a sequential delegation chain into a concurrent one, and it’s the foundation an Orchestrator-Workers coordinator stands on: workers that don’t block the coordinator. Compared with Parallelization’s fan-out-and-wait, the parent here fans out and keeps working, collecting each result as it lands.

Liabilities. You trade the wait for bookkeeping. The parent now tracks live handles, matches arriving results back to the requests that spawned them, and decides what to cancel: state that didn’t exist in the synchronous case. Result timing becomes a real hazard: an answer can arrive after the parent’s context has moved past the question it answered, so collection has to re-check fit rather than assume it. And the natural “now I have the result” sync point is gone; the parent has to recreate it deliberately by checking and collecting, which is easy to forget and leaves orphaned children running. The pattern pays off when the parent has independent work to overlap with the child’s runtime. When it doesn’t, when the parent’s very next step needs the child’s answer, the plain synchronous Subagent is simpler and just as fast.

Sources

The blocking-versus-non-blocking distinction this pattern borrows comes from concurrent and asynchronous I/O, a foundational idea in operating systems and systems programming long predating agents. The async/await model that made non-blocking delegation ergonomic in mainstream programming traces through C#’s introduction of the keywords and the wider adoption of the future/promise abstraction for “a value that isn’t ready yet.”
The idea of a parent process spawning children it can poll, wait on, or kill is the classic Unix process model: fork, a returned process identifier, waitpid, and signals are the lifecycle operations this pattern re-expresses for agents. W. Richard Stevens’s Advanced Programming in the UNIX Environment is the durable reference for that model.
The non-blocking subagent variant was named explicitly in the agentic-coding practitioner community in mid-2026, as multi-agent frameworks began shipping delegation tools that return a task handle immediately rather than blocking the caller. The vocabulary settled fast: “async subagent” and “non-blocking delegation” converged across independent framework discussions over a span of days, a sign the concept had become common enough to need a shared name.

Keyboard shortcuts