Compound Engineering
“Each feature should make subsequent features easier to build, not harder.” — Dan Shipper
Also known as: Compounding Engineering
Make every shipped unit of work, whether bug fix, feature, code review, or plan revision, convert its lesson into a durable, agent-readable surface before the work closes, so the next feature is genuinely cheaper than the last.
Understand This First
- Instruction File — the primary surface where codified lessons land for the next session.
- Skill — the package format for workflow lessons.
- Memory — the cross-session durability layer for what’s been learned.
- Garbage Collection — the maintenance loop that keeps codified knowledge from rotting.
Context
You’re working on a real codebase with a capable coding agent. Every feature you ship leaves behind a tail of context: which lint rule the project follows and why, which migration sequence is forbidden, how the deploy pipeline reads its env vars, what counts as “done” in this codebase, why the auth module is shaped the way it is. That context is the unwritten payment your team made for the feature, and it can either go to waste or become an asset.
This pattern sits one level above the bricks. The book has the Instruction File, the Skill, the Hook, the Subagent, and the Memory. Compound Engineering is the discipline that says every shipped lesson must end up on one of those surfaces before the work closes. Without that discipline, you have the bricks but no building.
Problem
Without a deliberate practice, the context you paid for evaporates between sessions. Your team re-explains the same conventions to fresh agent contexts. You fix the same recurring class of bug a fourth, fifth, sixth time. Code-review notes from last sprint never become rules, so the agent re-makes the mistake it made then. The marginal cost of the next feature stays flat or rises, even though the agents are getting more capable and the codebase is getting larger. The compounding curve you were promised never shows up.
The promise of agentic engineering was that experience would compound. The default is that it doesn’t.
Forces
- Sessions are stateless by default. What an agent learned in this morning’s correction is gone by this afternoon’s session. The lesson lives only in the developer’s head, until that fades too.
- Codification feels like a tax on the work. When you’ve just fixed the bug, writing the rule that would prevent its return feels like a separate, smaller task you can skip. That’s how every recurring class of bug gets re-fixed forever.
- Different lessons belong on different surfaces. A naming convention belongs in an instruction file; a workflow belongs in a skill; a deterministic check belongs in a hook; a recurring review lens belongs in a subagent. Picking the right surface matters; cramming everything into one document fails differently than codifying nothing, but it still fails.
- Codified knowledge can rot. Rules contradict each other. Skills go out of date. Hooks block work nobody remembers asking for. Without an explicit pruning discipline, the compounding asset turns into a compounding liability.
- Knowledge is repo-local by default. The compounding gain inside one codebase doesn’t automatically transfer to a new project. Teams that don’t plan for portability rebuild the same scaffolding every time.
Solution
Make codification a closing condition for every unit of work, not a separate cleanup pass. Before a bug fix, feature, or review closes, ask: what general lesson did we just learn, and which durable surface should it live on? If the answer is “none,” that’s fine. Most individual fixes don’t generalize. But the question is mandatory; the answer is permitted to be no.
Five canonical surfaces accept the lessons:
- Instruction file rules. When a lesson generalizes to “always do X” or “never do Y” in this codebase, encode it in the project’s instruction file (CLAUDE.md, AGENTS.md, or the equivalent your harness loads). Be specific. “Use 2-space indentation in all markdown files” beats “follow our conventions.”
- Skills. When the lesson is a workflow (“the right way to add a database migration in this repo”), package it as a skill. The next agent invokes it by name and gets the steps, the template, and the quality criteria without re-explanation.
- Hooks. When a lesson must be enforced deterministically and forgetting it costs real money (“never let a commit through if the build fails,” “always run the formatter after edits”), wire it into a hook. The work can’t proceed past the gate, so the lesson can’t be forgotten.
- Subagents. When the lesson is “this kind of review needs a dedicated lens” (security, performance, accessibility, schema-migration safety), encode the lens as a subagent the orchestrator invokes for every relevant change.
- Tests and evals encoding intent. When the lesson is “this behavior must not regress” or “this contract is real,” write a test or eval that fails if the behavior breaks. The test is the lesson made executable.
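The fifth surface is the easiest to sketch concretely. Here is a hypothetical regression test encoding the lesson "every SQL migration must guard its CREATE TABLE with IF NOT EXISTS"; the directory layout and the rule itself are illustrative, not prescribed by the pattern:

```python
# Illustrative regression test: the codified lesson "migrations must
# guard CREATE TABLE with IF NOT EXISTS" made executable. The
# migrations/ layout is an assumption about a hypothetical repo.
import pathlib
import re

MIGRATIONS_DIR = pathlib.Path("migrations")  # assumed layout


def unguarded_creates(sql: str) -> list[str]:
    """Return CREATE TABLE statements missing an IF NOT EXISTS guard."""
    creates = re.findall(r"CREATE TABLE(?: IF NOT EXISTS)?\s+\S+", sql, re.IGNORECASE)
    return [c for c in creates if "IF NOT EXISTS" not in c.upper()]


def test_migrations_are_guarded():
    for path in MIGRATIONS_DIR.glob("*.sql"):
        bad = unguarded_creates(path.read_text())
        assert not bad, f"{path}: missing IF NOT EXISTS guard in {bad}"
```

Once such a test exists, the lesson can no longer be silently forgotten: an unguarded migration fails the suite instead of reaching review.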
The bricks already exist; what compound engineering adds is the closing condition. The cycle isn’t “ship and move on.” It’s “ship, codify, then move on.”
A separate maintenance discipline runs alongside it. Codified knowledge rots. Rules conflict, skills go stale, hooks block work nobody asked for. Treat the codified surfaces the same way you treat the code: prune them on a cadence. The book’s name for this companion is Garbage Collection. Without it, compound engineering turns into compound liability.
Don’t codify too early. The first time you hit something, learn from it. The second time, notice it. The third time, codify it. Lessons that land on a surface after one occurrence tend to be wrong; the team hasn’t yet seen the variations the rule has to cover. The Feedback Flywheel framing of “three corrections from three developers” is a good rule of thumb for when the lesson has stabilized enough to encode.
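The three-occurrence threshold can even be made mechanical. A minimal sketch (the ledger class and the lesson string are hypothetical, not part of any workflow the sources describe):

```python
# Minimal sketch of the "third time, codify it" rule: tally how often a
# correction recurs, and flag it for codification only at the threshold.
# The class and the lesson string are illustrative inventions.
from collections import Counter

CODIFY_THRESHOLD = 3  # first time: learn; second: notice; third: codify


class LessonLedger:
    def __init__(self) -> None:
        self.counts: Counter[str] = Counter()

    def record(self, lesson: str) -> bool:
        """Record one occurrence; True means the lesson is ready to codify."""
        self.counts[lesson] += 1
        return self.counts[lesson] >= CODIFY_THRESHOLD


ledger = LessonLedger()
assert ledger.record("guard migrations with IF NOT EXISTS") is False  # learn
assert ledger.record("guard migrations with IF NOT EXISTS") is False  # notice
assert ledger.record("guard migrations with IF NOT EXISTS") is True   # codify
```

The point is not the tooling; it is that the threshold is explicit, so the team stops codifying on the first occurrence.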
Distinguishing from neighbors
Two patterns are close enough that readers reasonably ask how they differ.
Regenerative Software also inverts the cost curve of engineering, but at the code layer: it treats specifications, boundaries, and evals as durable assets and the code itself as a disposable, regenerable output. Compound Engineering inverts the cost curve at the engineering-knowledge layer: it treats the codified lessons embedded in the agent’s working surface (instruction files, skills, hooks, subagents, tests) as the compounding asset. A team can practice both, and they reinforce each other. A strong eval suite is one of the surfaces compound engineering writes lessons onto, and is also what makes a regeneration safe. But the two patterns operate at different layers.
Feedback Flywheel is the named harvesting loop with first-pass acceptance rate as its leading metric. It’s the canonical mechanism for one specific input (developer corrections) and one specific surface (instruction-file rules). Compound Engineering is the broader discipline: corrections are one input, code reviews are another, plan revisions a third, edge-case discoveries a fourth, and the surfaces include skills and hooks and subagents and tests, not just rule documents. Run a feedback flywheel and you’re practicing compound engineering on one channel; the discipline asks you to run the same loop on the others.
How It Plays Out
A two-engineer team ships an email assistant serving thousands of daily users. Every code review surfaces something specific: “the agent didn’t know that the settings panel uses the existing form component instead of writing a new one”; “the agent generated a migration without an IF NOT EXISTS guard.” Each finding becomes one line in the instruction file that night. Six months later, the file has accumulated sixty rules of that shape. The agent never reaches for a fresh form component. Migrations always include the guard. The marginal cost of the next feature has gone down, not up. The team’s working summary is “we ship more this week than last week, every week,” and the line has held for months.
A different team adopts compound engineering enthusiastically and runs into the failure mode. Two months in, they have 200 instruction-file rules and 40 skills, and they’ve never pruned. Half the rules contradict each other. The agent follows whichever conflicting rule it sees first. Developers spend more time arguing with stale rules than building features. The team’s first response is to blame the discipline. The actual fix is the missing companion: schedule a Garbage Collection pass on the codified surfaces, retire rules that haven’t prevented a correction in months, merge ones that have drifted into near-duplicates. The compounding asset comes back online once the maintenance loop catches up.
A solo developer practices compound engineering in miniature. Every time they correct an agent twice, they ask whether to encode the rule. Most answers are no, because the correction was situational. But over a quarter, they’ve added eighteen rules, three small skills, and one pre-commit hook. None of them is dramatic. Together they’re the difference between an agent that needs steady steering and one that produces shippable output on the first try most of the time. When they take a contract in a fresh codebase six weeks later, the first thing they do is set up the same skeleton (a thin instruction file, the pre-commit hook, the format-and-lint skill), knowing the rest will accumulate the same way.
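The thin starting skeleton that developer carries between codebases might look like this; the file name follows the instruction-file convention named earlier, and every rule shown is illustrative:

```markdown
# CLAUDE.md — starting skeleton (all rules here are illustrative)

## Conventions
- Use 2-space indentation in all markdown files.
- Run the formatter and linter after every edit (see the format-and-lint skill).

## Gates
- A pre-commit hook blocks any commit whose build or tests fail; do not bypass it.
```

The rest of the file accumulates one codified lesson at a time, exactly as it did in the previous codebase.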
The closing question can itself be handed to the agent as a literal end-of-work prompt:

“After we close this fix, list the lessons worth codifying. For each one, recommend a surface (instruction-file rule, skill, hook, subagent, or test) and draft the codified version. Tell me which of these you think are too situational to bother with.”
Consequences
The wins are the ones the pattern’s name promises. Engineering inverts from diminishing returns to compounding returns. Onboarding time for new agents (and new humans) collapses, because the codebase’s tacit knowledge is now explicit. Recurring classes of defect shrink over time instead of cycling. Small teams can run several production products because the cost of “operating a codebase” stops scaling with the codebase’s size.
The costs are honest. Every shipped unit of work now has a documentation tail that must actually get done; teams that skip the closing condition lose the compounding effect quickly. The codified surfaces need their own maintenance loop, and a team without that discipline produces a slow, contradictory mess of rules nobody trusts. Codified knowledge is repo-local by default, so transferring the gain to a new project takes deliberate scaffolding. And the most expensive failure mode is the most subtle one: codifying lessons that aren’t true yet, then watching the agent obediently apply a wrong rule everywhere. Lessons codified too early lock in misunderstandings; the discipline has to include the patience to wait until the lesson has stabilized.
A final caution: the worst version of this pattern is the team that adopts it as a slogan and stops there. Compound engineering doesn’t compound because you said the words. It compounds because every fix, every review, every plan revision actually pays its codification cost before it closes. That’s the whole pattern. Skip the closing condition and you have the bricks but no building.
Related Patterns
| Relationship | Pattern | Note |
|---|---|---|
| Contrasts with | Regenerative Software | Both invert the cost curve of engineering, but Regenerative Software does it at the code layer (specs and evals as durable, code as disposable) while Compound Engineering does it at the engineering-knowledge layer (lessons codified into agent-readable surfaces). |
| Contrasts with | Technical Debt | Technical debt is what accumulates when lessons stay tacit; compound engineering is the discipline that converts the same accumulating pressure into a compounding asset instead. |
| Depends on | Garbage Collection | Without periodic pruning, codified lessons accumulate into contradiction and bloat; garbage collection keeps the compounding asset from turning into a compounding liability. |
| Refines | Feedback Flywheel | The feedback flywheel is one specific compound-engineering loop: capture corrections, distill them into rules, codify them in instruction files. Compound engineering is the broader discipline; the flywheel is the named harvesting mechanism with first-pass acceptance rate as its metric. |
| Specializes | Feedback Loop | Compound engineering is one specific feedback loop: the shipped lesson loops back into the harness as a codified rule, skill, hook, subagent, or test. |
| Uses | Code Review | Code review is one of the dominant inputs to the compounding loop; review findings convert into instruction-file rules, refined subagents, or new hooks. |
| Uses | Eval | Tests and evals encoding intent are one of the five canonical surfaces lessons codify into. |
| Uses | Externalized State | Externalized state holds the plan and intermediate results during the work; compound engineering decides what graduates from those scratch artifacts into a durable, reusable surface. |
| Uses | Hook | Hooks codify lessons that must be enforced deterministically rather than remembered. |
| Uses | Memory | Memory is one of the durable surfaces lessons can land on; compound engineering is the discipline of putting them there. |
| Uses | Progress Log | The progress log is the in-flight record of what the work tried; compound engineering is what happens to those entries after the work closes. |
| Uses | Skill | Skills package workflow lessons ("the right way to do X here") so future agents invoke the expertise by name. |
| Uses | Subagent | Subagents codify a review or task lens (security, performance, accessibility) that must be applied every time. |
Sources
- Dan Shipper and Kieran Klaassen, Compound Engineering: How Every Codes With Agents (December 2025, updated April 2026), is the canonical written treatment. Their working definition, “you expect each feature to make the next feature easier to build,” is the load-bearing reframing this article extends. Klaassen, the general manager of Cora at Every, is the practitioner whose workflow the article describes; Shipper, Every’s CEO, frames the discipline.
- Dan Shipper, public statement on the inversion (X / Twitter, August 2025): “Each feature should make subsequent features easier to build, not harder.” This is the pithiest available formulation of the pattern’s central claim.
- The retrospective-driven institutional learning the pattern depends on has roots in Norm Kerth’s Project Retrospectives: A Handbook for Team Reviews (2001), which established structured team reflection as the engine of organizational learning. Compound engineering applies that engine to a new substrate: the codified surfaces a coding agent reads.
- The flywheel framing (small consistent pushes in a coherent direction compounding into momentum) is Jim Collins’s, from Good to Great (2001). Compound engineering is one realization of that dynamic at the level of agent-readable artifacts.
- The deeper economic claim that knowledge work compounds when it’s externalized into reusable artifacts is older than software. Peter Drucker’s analyses of knowledge work in The Effective Executive (1967) and his later writing on the productivity of knowledge workers prefigure the move from “lessons live in heads” to “lessons live on durable surfaces.”
Further Reading
- Dan Shipper and Kieran Klaassen, Compound Engineering: How Every Codes With Agents — the canonical article, with the Cora case study and the four-step Plan / Work / Review / Compound loop framed in the practitioner’s own voice.
- How Two Engineers Ship Like a Team of 15 With AI Agents — Klaassen on the AI & I podcast, walking through the working setup and the multi-subagent code-review loop in real time. Useful for seeing the discipline in motion rather than as a writeup.