Spec-Driven Development
Spec-Driven Development is a workflow where a written specification is the primary artifact, and the team organizes implementation, review, and evolution around that document.
“Make it a rule never to give a reading that you have not prepared carefully beforehand.” — Richard Feynman
Also known as: SDD
Understand This First
- Specification – SDD is the workflow that forms around the specification artifact.
- Plan Mode – the execution discipline that pairs with SDD.
- Verification Loop – implementation is checked back against the spec.
Context
This is a strategic pattern about how a team organizes work, not about what goes in a document. It applies whenever you are directing one or more agents to build something non-trivial and you want the team’s shared understanding of the system to outlive any single conversation, session, or commit.
A Specification is the artifact. Spec-Driven Development is the workflow that forms around it: who writes the spec, when it gets updated, how it interacts with the code, and what the agent’s relationship to it is across the life of the project. The distinction is the same one that separates a test file from test-driven development. One is a thing; the other is a way of working.
The workflow matters more now than it used to. When a human developer held the system in their head, the spec was optional scaffolding. When agents do most of the typing, the spec is the thing the team reasons against, the artifact a reviewer checks before reading code, and the memory the agent loads when your last conversation drops out of context.
Problem
How should a team organize around a written specification so that the document stays useful as the system grows, agents stay aligned with intent across sessions, and the humans in the loop always know what is true?
A spec written once and forgotten drifts. The code moves; the document doesn’t. Six weeks later nobody trusts it, so nobody reads it, so nobody updates it. A spec treated as a contract that never changes blocks learning: real projects discover requirements by building, and a frozen spec turns that discovery into conflict. And a spec that lives only in one person’s head cannot survive a handoff, whether to a new teammate or to tomorrow’s fresh agent session.
Forces
- Durable reference vs. living document. A spec is most useful when everyone trusts it, which pushes toward careful maintenance. But maintenance takes time and discipline a small team may not have.
- Upfront detail vs. iterative discovery. You can’t write down everything before you start, but starting without writing anything down invites every old planning failure.
- Human authorship vs. agent generation. Agents can draft and update specs quickly, but a spec the humans never wrote is a spec the humans don’t know.
- One spec per project vs. one per change. A single rolling document captures the current state cleanly but buries history; one spec per feature preserves history but fragments the picture.
Solution
Choose a rigor level (how tightly spec and code are coupled) and commit to the workflow that matches it.
Three rigor levels have emerged in practice:
- Spec-first. Write a spec before building. Once the implementation lands, the spec is allowed to go stale. Best for short-lived work where the spec’s job is to align people at the start, not to survive the project.
- Spec-anchored. Keep the spec alongside the code as a living document. Every change to behavior updates the spec in the same pull request. The spec is not the source of truth for the code, but it is the source of truth for intent. This is the workflow most teams settle into.
- Spec-as-source. The spec is the canonical source file. Code is regenerated (in whole or in part) from the spec, and editing the code directly is discouraged or forbidden. Best for narrow, well-understood domains where regeneration is cheap and the cost of manual code drift is high.
Whichever level you pick, four disciplines hold the workflow together:
- A named spec owner. One person is accountable for the document’s correctness, even if many people contribute to it.
- Visibility to the agent. The spec lives at a known path in the repo, and every prompt that changes behavior references it by name.
- Spec before code. You write the change down first. That’s how you confirm you know what you’re changing before the agent starts typing.
- A review gate. A human checks the spec diff at the boundary between intent and implementation. The agent does not run until that check passes.
None of this replaces thinking. It replaces scattered thinking.
How It Plays Out
A four-person team is building an invoicing service. They start spec-first: two pages covering the tax rules, the PDF layout, and the webhook payloads they have to support. The agent ships a working first cut in an afternoon. Month two, they realize tax rules branch by jurisdiction in ways they didn’t anticipate. The spec is still accurate about the first cut, but it’s silent on the new branching. They promote to spec-anchored: every pull request that changes tax behavior now updates the spec in the same commit. The reviewer checks the spec diff before reading the code diff, because the spec diff is shorter and tells them whether the intent of the change makes sense.
A second team runs a small internal tool that converts YAML definitions of business reports into dashboards. They adopt spec-as-source: the YAML is the spec, and the rendering code is regenerated from it on every change. Editing the generated code directly is disallowed. When a new chart type is needed, they extend the schema and regenerate. The discipline pays off because the domain is narrow and the generator is cheap to maintain. It would not pay off on a general-purpose product.
A solo founder tries spec-anchored and fails at it. She keeps writing code first and updating the spec later, when she can remember to. After three weeks the spec is lying to her. She drops back to spec-first for small features and only promotes a feature to spec-anchored once it has settled and she can see it will need to evolve. The rigor level is a tool, not a badge.
Put the spec at a stable path like docs/spec.md or specs/<feature>.md and reference it by path in every prompt: “Read docs/spec.md before making any changes. If your change affects behavior described there, update the spec in the same commit.” This makes the spec the agent’s first stop and makes the workflow self-enforcing: the agent reminds you when you’re about to skip the discipline.
Consequences
Benefits. A durable spec gives every new session, human or agent, a shared starting point. Reviewers can evaluate intent before reading code, which is faster and catches a different class of mistake. The document makes drift visible: when the spec and the code disagree, someone is wrong, and you can see which. Onboarding gets faster because the intent is written down, not stored only in people’s heads.
Liabilities. The workflow has real cost. Writing before coding slows the first mile, and for trivial changes the overhead isn’t worth it. A living spec demands discipline every team doesn’t have; without the discipline the document rots faster than no document at all, because a misleading reference is worse than no reference. And spec-as-source is a sharp tool: it works beautifully in the right domain and fights you everywhere else.
The deepest risk is false confidence. A thick spec feels authoritative, and agents will implement confidently against a wrong document. The remedy is to keep the review gate honest: don’t just check that the code matches the spec; check that the spec matches reality.
Related Patterns
- Uses: Specification – SDD is the workflow built around the spec artifact.
- Depends on: Plan Mode – plan mode is the execution discipline that turns a spec into reviewed changes.
- Enables: Verification Loop – the spec supplies the criteria the verification loop checks against.
- Enables: Acceptance Criteria – a living spec makes it easy to derive testable criteria at each phase boundary.
- Uses: Ubiquitous Language – the spec is the place the shared vocabulary lives and stays consistent.
- Refines: Research, Plan, Implement – RPI is one concrete three-phase shape an SDD workflow can take.
- Complemented by: Architecture Decision Record – ADRs record the reasoning behind decisions the spec captures.
- Contrasts with: ad hoc prompting – SDD replaces scattered context with a single durable reference.
Sources
- The epigraph is Richard Feynman’s advice to his student Leighton on teaching, recorded in “Surely You’re Joking, Mr. Feynman!” (1985). The rule generalizes: you owe your audience the preparation.
- Martin Fowler’s series on Spec-Driven Development articulated the spec-first, spec-anchored, and spec-as-source distinction and set much of the current vocabulary.
- Thoughtworks placed Spec-Driven Development on their Technology Radar, naming it as one of the key emerging engineering practices of 2025-2026 and flagging both its promise and the risk of reverting to heavy upfront specification.
- Addy Osmani’s writing on specifications for AI agents established the working framework for what a good spec covers (commands, testing, project structure, style, git workflow, and boundaries) and made the cost of ambiguity concrete.
- An arXiv paper on Spec-Driven Development (“From Code to Contract in the Age of AI Coding Assistants”, 2026) formalized the three rigor levels and gave the methodology an academic grounding.
- The workflow’s recent popularization emerged from the agentic coding practitioner community through 2025-2026, as teams rediscovered that agents without durable references drift faster than humans do.
Further Reading
- Fowler’s “Exploring Gen AI” series on SDD tools – a tour of how Kiro, Spec Kit, and Tessl each interpret the methodology differently.
- Addy Osmani, “How to write a good spec for AI agents” – concrete guidance on what goes in the document the workflow centers on.