Test-Driven Development
Write the test before the code, and let failing tests drive every line of implementation.
Also known as: TDD
“The act of writing a unit test is more an act of design than of verification.” — Robert C. Martin
Understand This First
- Test, Test Oracle, Harness – TDD requires working test infrastructure.
Context
You’re about to implement a feature or fix a bug. You could write the code first and test it afterward, or you could flip the order and let the tests guide the design. This is a tactical pattern that changes how code gets written, not just how it gets checked. It builds on Tests, Harnesses, and Fixtures, but treats them as a design tool rather than a verification afterthought.
Problem
When you write code first and tests later, the tests tend to confirm what the code already does rather than challenging whether it does the right thing. Tests written after the fact often miss edge cases, because the developer is already thinking in terms of the implementation they just wrote. Worse, “I’ll add tests later” often becomes “I never added tests.” How do you ensure that tests are thorough, that code meets its requirements, and that you write only the code you actually need?
Forces
- Writing tests after code tends to produce tests that mirror the implementation rather than the requirements.
- Without tests as a guide, it’s easy to over-engineer, building features nobody asked for.
- Without tests as a safety net, refactoring is risky.
- Writing tests first feels slow at the start of a task.
- Some designs are hard to test, and discovering this late is expensive.
Solution
Write the test before you write the code. Kent Beck, who formalized TDD as part of Extreme Programming in the late 1990s, described the discipline this way: start by expressing a single, specific behavior you want the system to have, as a Test with a clear Test Oracle. Run the test and watch it fail. Then write the minimum code needed to make it pass. Once it passes, clean up the code through Refactoring. Repeat.
This approach has several effects. First, you never write code without a reason; every line exists to make a failing test pass. Second, you discover design problems early, because code that’s hard to test is usually code with too many dependencies or unclear responsibilities. Third, you accumulate a test suite as a side effect of development, not as a separate chore.
TDD doesn’t require writing all tests first. You write one test at a time, in small increments. The rhythm is what matters: test, code, clean up. The specific mechanics of this rhythm are described in Red/Green TDD.
How It Plays Out
A developer needs to build a function that validates email addresses. Before writing any validation logic, they write a test: assert is_valid_email("alice@example.com") == True. It fails because the function doesn’t exist yet. They create the function, returning True for any input. The test passes. They add another test: assert is_valid_email("not-an-email") == False. It fails. They add the minimum logic to distinguish valid from invalid. Step by step, the test suite and the implementation grow together, each informed by the other.
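The result of those two increments might look like the following sketch. The validation rule is a deliberate simplification for illustration, not a full RFC 5322 check, and the function name comes from the example above:

```python
# Increment 1: assert is_valid_email("alice@example.com") == True
#   -> create the function returning True for any input; test passes.
# Increment 2: assert is_valid_email("not-an-email") == False
#   -> forces the minimum real logic below.

def is_valid_email(address: str) -> bool:
    # Just enough to pass both tests: one "@" separating a non-empty
    # local part from a domain that contains a dot.
    local, at, domain = address.partition("@")
    return bool(at) and bool(local) and "." in domain

# The two tests that drove the implementation, in order:
assert is_valid_email("alice@example.com") == True
assert is_valid_email("not-an-email") == False
```

A third failing test, say assert is_valid_email("a@b@c.com") == False, would expose a case this minimum logic still accepts and drive the next increment.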
In agentic workflows, TDD becomes a potent steering mechanism. Instead of describing what you want in prose, you write a failing test that defines what you want in code. The agent gets an unambiguous target and can iterate autonomously until it reaches green. One subtlety: research on test-driven agentic development (2025-2026) found that telling an agent “practice TDD” without pointing it at specific tests actually increased regressions. The agents performed better when given a concrete map of which tests to run and which dependencies to check. The lesson: don’t just hand the agent a philosophy. Hand it a failing test and the command to run it.
When working with an AI agent, write the tests yourself and let the agent write the implementation. Your tests encode your intent; the agent’s code fulfills it. This division of labor plays to each party’s strengths.
“I’ll write the tests, you write the implementation. Here’s the first test: assert is_valid_email("alice@example.com") == True. Make it pass, then I’ll add the next test.”
Consequences
TDD produces code with high test coverage by construction. Designs tend to come out simpler, because you’re always writing the minimum code to pass the next test. The test suite doubles as a living specification of the system’s behavior, one that stays current because every change starts with a test update.
The cost is discipline. TDD feels unnatural at first; writing a test for code that doesn’t exist yet requires thinking about behavior before implementation. It can also be misapplied. Testing implementation details instead of behavior produces brittle suites that break with every Refactor. The goal is to test what the code does, not how it does it. Teams that lose sight of this distinction end up with thousands of tests that slow them down instead of freeing them up.
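The distinction can be made concrete with a hypothetical Stack class (the class and test names are illustrative). The first test below pins an internal detail and breaks under any change of storage; the second asserts only observable behavior and survives refactoring:

```python
class Stack:
    # Hypothetical class used only to illustrate the distinction.
    def __init__(self):
        self._items = []  # internal representation, free to change

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()

# Brittle: couples the test to the private list, so replacing the
# storage (with a linked list, say) breaks it even though callers
# see no change in behavior.
def test_internal_list_after_push():
    s = Stack()
    s.push(1)
    assert s._items == [1]

# Robust: asserts only what callers can observe, namely that the
# last value pushed is the first value popped.
def test_pop_returns_last_pushed():
    s = Stack()
    s.push(1)
    s.push(2)
    assert s.pop() == 2
```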
Related Patterns
- Depends on: Test, Test Oracle, Harness – TDD requires working test infrastructure.
- Refined by: Red/Green TDD – the specific mechanical loop.
- Enables: Refactor – TDD creates the safety net that makes refactoring safe.
- Contrasts with: Regression – TDD prevents regressions; regression testing detects them after the fact.
- Informed by: Requirement – each test encodes a requirement; TDD makes requirements executable.
- Supports: Verification Loop – a failing test gives an agent a concrete exit condition to loop against.
Sources
- Kent Beck formalized test-driven development as a named practice and described its mechanics in Test-Driven Development: By Example (2003). Beck has noted that he “rediscovered” rather than invented the technique — test-first programming appeared as early as D.D. McCracken’s 1957 programming manual and was used in NASA’s Project Mercury in the early 1960s.
- TDD emerged from the Extreme Programming (XP) community in the late 1990s, where Beck and others applied the XP principle of taking effective practices to their logical extreme. The question “what if we wrote the tests before the code?” became a core XP discipline.
- Robert C. Martin (quoted in the epigraph) championed TDD through Clean Code (2008) and The Clean Coder (2011), and formulated the “Three Laws of TDD” (sometimes called the Three Rules) that many practitioners follow today.
- Martin Fowler’s Refactoring: Improving the Design of Existing Code (1999, 2nd ed. 2018) provided the vocabulary and catalog for the “refactor” step of the red-green-refactor cycle.
- The TDAD (Test-Driven Agentic Development) paper by the authors of arXiv:2603.17973 (2026) demonstrated that AI coding agents given a graph-based test-impact map reduced regressions by 70% on SWE-bench Verified, while agents given only procedural TDD instructions without specific test targets actually performed worse than a vanilla baseline.