Correctness, Testing, and Evolution
Software isn’t a static thing. It changes constantly: new features arrive, bugs get fixed, requirements shift, and the world it operates in evolves. The patterns in this section live at the tactical level. They address how you know your software is correct, how you keep it correct as it changes, and how you detect when something goes wrong.
Correctness starts with knowing what “right” looks like. An Invariant is a condition that must always hold. A Test is an executable claim about behavior. A Test Oracle tells you whether the output you got is the output you should have gotten. Around every test sits a Harness, the machinery that runs it, and within that harness, Fixtures provide the controlled data and environment the test needs.
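These pieces compose in even the smallest test. A hedged sketch in Python using the standard library's `unittest` as the harness; the `slugify` function and its expected outputs are invented for illustration, not taken from the text:

```python
import unittest

def slugify(title: str) -> str:
    """Hypothetical function under test: turn a title into a URL slug."""
    return "-".join(title.lower().split())

class SlugifyTest(unittest.TestCase):
    def setUp(self):
        # Fixture: fixed input data every test method can rely on.
        self.title = "Correctness, Testing, and Evolution"

    def test_slug_is_lowercase_and_hyphenated(self):
        # The expected string is the test oracle: it says what "right" looks like.
        self.assertEqual(slugify(self.title), "correctness,-testing,-and-evolution")

    def test_invariant_no_whitespace(self):
        # Invariant: no slug ever contains a space, for any input.
        for title in ["a b", "  padded  ", "Already-Slugged"]:
            self.assertNotIn(" ", slugify(title))

if __name__ == "__main__":
    unittest.main(exit=False)  # unittest is the harness that discovers and runs the tests
```

The harness, fixture, oracle, and invariant are all visible here in under twenty lines; the same roles exist, at larger scale, in any test suite.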
Testing isn’t just verification; it can drive design itself. Test-Driven Development uses tests as a design tool, and Red/Green TDD gives that idea a tight, repeatable loop. Once tests pass, Refactoring lets you improve internal structure without breaking what works. When something does break unexpectedly, that’s a Regression, and catching regressions early is one of the highest-value activities in software development.
Not all problems announce themselves. Observability is the degree to which you can see what’s happening inside a running system, and Logging is the primary mechanism for achieving it. When a bug resists reading and reasoning, Printf Debugging lets you make runtime values visible with nothing more than a print statement and a hypothesis. Every system has Failure Modes, specific ways it can break, and the most dangerous are Silent Failures, where something goes wrong and nobody notices. Finally, every system operates within a Performance Envelope, the range of conditions under which it still behaves acceptably.
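Printf debugging really is that simple. A hedged Python sketch; the `median` function and its suspected bug are invented for illustration. You form a hypothesis, print the values that would confirm or refute it, and delete the print once you have your answer:

```python
def median(values):
    """Hypothetical function with a suspected bug for even-length input."""
    ordered = sorted(values)
    mid = len(ordered) // 2
    # Hypothesis: for even-length lists we pick the wrong element.
    # Temporary print to make the runtime values visible; remove once answered.
    print(f"values={ordered}, mid={mid}, picked={ordered[mid]}")
    return ordered[mid]

median([1, 2, 3, 4])  # prints values=[1, 2, 3, 4], mid=2, picked=3 — hypothesis confirmed
```

The true median of `[1, 2, 3, 4]` is 2.5, so the printed internals confirm the hypothesis in one run, with no debugger required.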
In an agentic coding world, where AI agents generate and modify code at high speed, these patterns become guardrails. An agent can write a function in seconds, but only tests can tell you whether that function does what it should. The faster you change code, the more you need the safety net these patterns provide.
Defining Correctness
What “right” means: the foundations for knowing whether your software does what it should.
- Invariant — A condition that must remain true for the system to be valid.
- Test — An executable claim about behavior.
- Test Oracle — The source of truth that tells you whether an output is correct.
- Harness — The surrounding machinery used to exercise software in a controlled way.
- Fixture — The fixed setup, data, or environment used by a test or harness.
- Happy Path — The default scenario where everything works as expected; the baseline that gives every other kind of testing its meaning.
- Code Review — Having someone other than the code’s author examine changes before they merge, catching what tests and the author’s own eyes miss.
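The invariant at the top of this list can be made executable rather than left as documentation. A hedged Python sketch; the bounded counter is a hypothetical example of a class that re-checks its own invariant after every state change:

```python
class BoundedCounter:
    """Hypothetical example: the count must always stay within [0, limit]."""

    def __init__(self, limit: int):
        self.limit = limit
        self.count = 0
        self._check_invariant()

    def _check_invariant(self):
        # The invariant: a condition that must hold at every observable moment.
        assert 0 <= self.count <= self.limit, f"invariant violated: {self.count}"

    def increment(self):
        if self.count < self.limit:
            self.count += 1
        self._check_invariant()  # re-verify after every mutation

c = BoundedCounter(limit=2)
c.increment()
c.increment()
c.increment()  # saturates at the limit; the invariant still holds
```

An invariant checked in code fails loudly at the moment it is violated, which is exactly the opposite of a silent failure.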
Test-Driven Workflows
Using tests to drive design and catch breakage before it ships.
- Test-Driven Development — Tests written to define expected behavior before or alongside implementation.
- Red/Green TDD — The core TDD loop: write a failing test (red), then write just enough code to make it pass (green).
- Refactor — Changing internal structure without changing external behavior.
- Regression — A previously working behavior that stops working after a change.
- Test Pyramid — Shape a test suite with many fast unit tests at the base, fewer integration tests in the middle, and a small number of end-to-end tests at the top.
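The red/green loop above can be shown in miniature. A hedged Python sketch; `leading_zeros` and its specification are invented for the example. First the failing test is written (red), then the simplest implementation that makes it pass (green):

```python
# Red: the test is written first, and fails because leading_zeros does not exist yet.
def test_leading_zeros():
    assert leading_zeros("007") == 2
    assert leading_zeros("700") == 0
    assert leading_zeros("000") == 3

# Green: the simplest implementation that makes the test pass.
def leading_zeros(s: str) -> int:
    return len(s) - len(s.lstrip("0"))

test_leading_zeros()  # passes — now it is safe to refactor
```

With the test green, any refactoring that keeps it green preserves external behavior, and any change that turns it red again is, by definition, a regression.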
Observability and Debugging
Seeing what your system is doing, measuring how well it works, and finding out why it broke.
- Observability — The degree to which you can infer internal state from outputs.
- Failure Mode — A specific way a system can break or degrade.
- Silent Failure — A failure that produces no clear signal.
- Performance Envelope — The range of operating conditions within which a system remains acceptable.
- Logging — Record what your software does as it runs, so you can understand its behavior after the fact.
- Printf Debugging — Insert temporary output statements to test a hypothesis about code behavior, then remove them once you’ve found the answer.
- Metric — A quantified signal, tracked over time, that tells you whether your software, team, or process is improving or degrading.
- Feedback Loop — Any arrangement where a system’s output circles back to influence its next action, enabling self-correction or self-reinforcement.
- Service Level Objective — A committed reliability target with a matching error budget that governs how much risk the team can spend on change.
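Several of these patterns meet in a few lines of code. A hedged sketch using Python's standard `logging` module; the billing scenario and names are invented. The point: an exception swallowed without a log line is a silent failure, while one that is logged remains observable:

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("billing")  # hypothetical subsystem name

def charge(account: str, cents: int) -> bool:
    """Hypothetical operation with a known failure mode: bad amounts."""
    try:
        if cents <= 0:
            raise ValueError(f"non-positive amount: {cents}")
        log.info("charged %s %d cents", account, cents)  # observable success
        return True
    except ValueError:
        # A bare `return False` here would be a silent failure.
        # Logging the exception keeps this failure mode observable.
        log.exception("charge failed for %s", account)
        return False

charge("acct-42", 500)   # success, and the log says so
charge("acct-42", -1)    # failure, but a visible one
```

Counting how often the second branch fires over time turns this log line into a metric, and a threshold on that metric is the seed of a service level objective.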
Managing Change
Evolving a system safely over time without breaking what works.
- Technical Debt — Shortcuts in code act like financial debt, letting you ship faster now and charging interest on every future change.
- Strangler Fig — Replace a legacy system incrementally by building new functionality alongside it, routing traffic piece by piece, until the old system can be switched off.
- Parallel Change — Change an interface by adding the new form first, migrating callers at their own pace, and removing the old form last, so consumers never see a breaking change.
- Deprecation — Announce the removal of a feature on a specific future date, keep it working in the meantime, watch who still uses it, and remove it only once usage has actually gone to zero.
- Evolutionary Modernization — Treat modernization as a continuous, guided process of small replacements with working software at every step, rather than a bounded project that ends in a single cutover.
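Parallel change, for instance, is visible even at the scale of a single function signature. A hedged Python sketch; the `resize` function and its parameters are invented. The new keyword form is added first, the old positional form keeps working while warning its remaining callers, and only once usage reaches zero is the old form deleted:

```python
import warnings

def resize(image, size=None, *, width=None, height=None):
    """Hypothetical API mid-migration: old `size` tuple, new width/height kwargs."""
    if size is not None:
        # Expand phase: the old form still works, but tells callers to migrate.
        warnings.warn("pass width= and height= instead of size=",
                      DeprecationWarning, stacklevel=2)
        width, height = size
    # Contract phase (later): delete the `size` branch once no caller uses it.
    return (width, height)  # stand-in for the real resizing work

assert resize(None, size=(640, 480)) == (640, 480)        # old callers: unchanged
assert resize(None, width=640, height=480) == (640, 480)  # migrated callers
```

The same expand-migrate-contract rhythm scales up from one parameter to whole services, which is where it shades into the strangler fig and deprecation patterns above.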