---
slug: ephemeral-environment
type: pattern
summary: "Run each agent task in a short-lived, isolated runtime that is created on demand and destroyed the moment the task finishes."
created: 2026-06-20
updated: 2026-06-20
related:
  sandbox:
    relation: contrasts-with
    note: "A sandbox is the isolation boundary; an ephemeral environment is the lifecycle that creates a runtime per task and destroys it after."
  worktree-isolation:
    relation: complements
    note: "Worktree isolation gives each task its own filesystem branch on one machine; an ephemeral environment gives each task its own runtime, often on a separate microVM."
  async-subagent:
    relation: complements
    note: "Fan-out subagents each need their own runtime, and ephemeral environments are the per-subagent execution substrate that keeps parallel work from colliding."
  background-agent:
    relation: used-by
    note: "A background agent is asynchronous delegation; the ephemeral environment is frequently where that delegated work runs."
  environment:
    relation: refines
    note: "Environment is the ops sense of a configured place to run; the ephemeral environment is the disposable, single-use cousin spun up on demand and never promoted."
  blast-radius:
    relation: enables
    note: "Destroying the runtime after each task bounds the blast radius of a mistake or a poisoned dependency to a single disposable environment."
---

# Ephemeral Environment

> **Pattern**
>
> A named solution to a recurring problem.

*Run each agent task in a short-lived, isolated runtime that is created on demand and destroyed the moment the task finishes.*

*Also known as: Ephemeral Sandbox, Disposable Environment, Per-Task Runtime*

You have already met the disposable version of this idea: the CI job that spins up a fresh container, runs your tests, and throws the container away. Nothing carries over to the next run, so a build can't be poisoned by leftover state from the last one. An ephemeral environment is that same instinct, pointed at coding agents.
Each task an agent picks up gets a clean runtime that exists only for the length of that task, then disappears.

## Understand This First

- [Sandbox](sandbox.md): the isolation boundary that most ephemeral environments are built on.
- [Environment](environment.md): the ops notion of a configured place to run, which this pattern makes disposable.

## Context

At the **agentic** level, the question "where does this agent's code actually run?" stops being an afterthought the moment agents leave a single developer's laptop. An interactive session on your own machine has an obvious answer: it runs where you run. But once agents work in CI, or fan out across many tasks at once, or pick up jobs while you sleep, each one needs a place to read files, run tests, and watch output. Those places must not interfere with each other or with production.

The convenient answer is to give every agent the same place: the shared build server, a long-lived staging box, a developer's workstation. That answer is also the trap. A shared runtime accumulates state, leaks credentials between tasks, and turns one agent's mistake into everyone's problem. The ephemeral environment is the alternative the agent-infrastructure layer has settled on.

## Problem

How do you give an agent a real place to execute code (install dependencies, run a build, read test logs, iterate) without that place becoming a shared liability that carries state, secrets, or damage from one task into the next?

A persistent environment is convenient on the first run and dangerous on every run after. Dependencies from a previous task linger. A half-finished migration leaves the database in a strange state. A credential read into memory for one job is still reachable by the next. And when an agent does something destructive, the damage is scoped to a long-lived box that other work shares.

## Forces

- **Isolation versus provisioning latency.** A truly fresh runtime per task is the safest design and the slowest to start.
Cold starts add seconds to the agent's inner loop.
- **Statelessness versus build caches.** Clean teardown demands that nothing meaningful live inside the environment, but builds want warm dependency and compiler caches to be fast.
- **Per-task cost versus shared-runner cost.** Spinning up a runtime for every task can cost more than one always-on server, especially under heavy fan-out.
- **Convenience versus safety.** Reusing one environment "just this once" is always easier in the moment and is exactly how state leakage creeps back in.

## Solution

**Create a new isolated runtime for each unit of agent work, run the task inside it, and destroy it when the task completes.** The defining property is the lifecycle (create, use, destroy), not the isolation boundary alone. The environment carries no state between runs and has no standing access to host infrastructure or to other tasks.

Make teardown trivial by keeping all durable state outside the runtime. Source comes from version control, results go to object storage or a database, and the environment itself holds nothing worth saving. When a task ends, there is nothing to clean up, because everything that mattered already lives elsewhere. This externalized-state discipline is what lets you treat the runtime as disposable rather than as something to reset.

Pick the substrate against the isolation-versus-latency force. Firecracker microVMs give near-VM isolation with cold starts measured in a second or two; gVisor and Kata Containers trade some startup speed for a stronger boundary than plain containers; an ordinary container is fastest and weakest. Where cold starts hurt the agent's loop, keep a warm pool of pre-booted runtimes and hand one to the next task, resetting it between uses rather than rebuilding from scratch.

Wire the trigger to the unit of work. A pull request, a commit, a tool call, or an agent task each spins up its own environment as a step in the pipeline.
The agent gets a real place to build, test, read failures, and self-correct. When it finishes, the place is gone.

> **💡 Tip**
>
> Make the trigger the task, not the session. An environment created per agent session quietly becomes long-lived when one session runs many tasks back to back, and the state you were trying to avoid creeps right back in.

## How It Plays Out

A platform team lets coding agents open pull requests against their main service. Earlier, each agent worked on a shared staging box; one agent's broken database migration left the box in a state that failed the next three agents' test runs, and nobody could tell whose change was at fault. The team rewires the pipeline so every agent task triggers a Firecracker microVM: the agent clones the repo, installs dependencies, runs the suite, reads the failures, fixes its code, and reruns, all inside a runtime that lasts about ninety seconds and then vanishes. A broken migration now dies with its own microVM and touches nothing else.

Consider the cost trap on the other side. A team turns on aggressive fan-out, launching dozens of agent tasks in parallel, each in its own environment. Throughput climbs, then the cloud bill spikes: runtimes were spinning up faster than finished ones tore down, and idle environments piled up waiting on a slow dependency install. The fix isn't to abandon the pattern. Cap concurrent environments, add a warm pool so cold starts stop dominating, and put a hard timeout on any runtime that outlives its task.

## Consequences

**Benefits.** Each task starts from a known-clean state, so no run can be poisoned by leftovers from the last. Destroying the runtime after each task bounds the [blast radius](blast-radius.md) of an agent's mistake or a poisoned dependency to a single disposable environment. Parallel tasks can't collide, because each has its own runtime. And the model scales: the same per-task lifecycle works for one agent or for hundreds.

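
The create, use, destroy lifecycle behind these benefits, together with the concurrency cap and hard timeout from the cost-trap example, can be sketched in a few lines. This is only an illustration under loud assumptions: a temporary directory stands in for the real runtime (a microVM or container in practice, which a directory does not isolate like), and `ephemeral_environment`, `run_task`, `MAX_CONCURRENT`, and `TASK_TIMEOUT_SECONDS` are hypothetical names, not part of any tool named in this article.

```python
import shutil
import subprocess
import sys
import tempfile
import threading
from contextlib import contextmanager
from pathlib import Path

MAX_CONCURRENT = 4          # hypothetical cap on simultaneous environments
TASK_TIMEOUT_SECONDS = 30   # hypothetical hard timeout per task

# Bounded semaphore: a new environment waits until a slot is free.
_slots = threading.BoundedSemaphore(MAX_CONCURRENT)


@contextmanager
def ephemeral_environment():
    """Create a fresh workspace for one task and destroy it afterwards.

    Assumption: a temp directory stands in for a real isolated runtime.
    """
    with _slots:
        workdir = Path(tempfile.mkdtemp(prefix="task-"))
        try:
            yield workdir
        finally:
            # Teardown runs on success, failure, and timeout alike.
            shutil.rmtree(workdir, ignore_errors=True)


def run_task(command: list[str], timeout: float = TASK_TIMEOUT_SECONDS) -> tuple[int, str]:
    """Run one command in its own environment and return (exit code, stdout).

    Only the returned values survive; the workspace itself is discarded.
    """
    with ephemeral_environment() as workdir:
        result = subprocess.run(
            command,
            cwd=workdir,
            capture_output=True,
            text=True,
            timeout=timeout,  # raises subprocess.TimeoutExpired if exceeded
        )
        return result.returncode, result.stdout


if __name__ == "__main__":
    code, out = run_task([sys.executable, "-c", "print('tests passed')"])
    print(code, out.strip())
```

Because `run_task` hands back only the exit code and captured output, anything durable has to leave the workspace before teardown, which mirrors the externalized-state discipline described in the Solution. Keying the context manager to a single command, rather than to a session, follows the tip about making the task the trigger.
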

**Liabilities.** Cold-start latency degrades the agent's inner loop unless you keep warm pools, and warm pools reintroduce some of the state risk the pattern exists to remove. Per-task cost can blow up under unbounded fan-out, where runtimes spin up faster than they tear down. And the worst failure is the quiet one: an "ephemeral" environment that is actually being reused. A warm pool that isn't reset, or a session-scoped runtime serving many tasks, leaks state exactly where you assumed it couldn't, and the leak stays invisible until a task sees data it should never have had.

## Sources

- The disposable per-build environment is long-standing practice in continuous integration; the discipline of running each job in a fresh, isolated container so no run contaminates the next predates coding agents by years and is the direct ancestor of this pattern.
- [Firecracker](https://firecracker-microvm.github.io/), the microVM monitor Amazon built for AWS Lambda and Fargate, established that VM-grade isolation could start in well under a second, making a fresh runtime per task practical rather than prohibitively slow. Its [design paper](https://www.usenix.org/conference/nsdi20/presentation/agache) (NSDI 2020) documents the isolation-versus-startup tradeoff this pattern balances.
- [gVisor](https://gvisor.dev/) and [Kata Containers](https://katacontainers.io/) developed the middle ground between container speed and VM isolation that the substrate choice in this article draws on.
- The framing of a short-lived, per-task runtime as the execution substrate for coding agents emerged from the agent-infrastructure community in 2025-2026, as practitioners moved agents off laptops into CI and asynchronous fan-out and needed an answer to "where does each agent task run?"

---

- [Next: Compaction](compaction.md)
- [Previous: Background Agent](background-agent.md)