---
slug: rollback
type: pattern
summary: "Returning a system to a previously known-good state after a deployment goes wrong: in effect, a deployment run in reverse."
created: 2026-04-04
updated: 2026-04-05
related:
  continuous-delivery:
    relation: supports
    note: "Frequent small deployments make rollbacks simpler and lower-risk."
  continuous-deployment:
    relation: supports
    note: "Automated rollback is a safety mechanism for continuous deployment."
  deployment:
    relation: depends-on
    note: "Rollback is a deployment in reverse."
  feature-flag:
    relation: supports
    note: "Sometimes disabling a flag is faster than a full rollback."
  git-checkpoint:
    relation: uses
    note: "Checkpoints provide specific rollback targets."
  migration:
    relation: complements
    note: "Reversible migrations make data rollback possible."
  parallel-change:
    relation: contrasts-with
    note: "Rollback recovers from a failed change after it ships; parallel change is the discipline that prevents needing to roll back in the first place."
  runbook:
    relation: documented-by
    note: "Rollback procedures are a common and critical runbook topic."
  service-level-objective:
    relation: enabled-by
    note: "A blown error budget is the most defensible trigger for rolling back a release."
  strangler-fig:
    relation: related
    note: "The routing layer gives you a rollback mechanism at the capability level."
  user-feedback:
    relation: enabled-by
    note: "Good feedback about failures makes it clear when a rollback is needed."
  version-control:
    relation: depends-on
    note: "Version control preserves the previous state to return to."
---
# Rollback

> **Pattern**
>
> A named solution to a recurring problem.

## Understand This First

- [Deployment](deployment.md) -- rollback is a deployment in reverse.
- [Version Control](version-control.md) -- version control preserves the previous state to return to.

## Context

This is an **operational** pattern that provides the safety net for [Deployment](deployment.md). A rollback is the act of returning a system to a previous known-good state after a deployment or change introduces a problem. It is the "undo" button for production.

In agentic coding, rollback capability is what makes rapid iteration safe. When AI agents can generate and deploy changes quickly, the ability to reverse those changes just as quickly isn't a luxury. It's a requirement. The confidence to move fast comes from knowing you can move back.

## Problem

You deploy a new version and something breaks. Users are affected. The clock is ticking. Do you try to fix the problem under pressure, or do you revert to the previous version and fix it calmly? Without a reliable rollback mechanism, you are forced to debug live, under time pressure, with users watching. How do you ensure that any deployment can be safely and quickly reversed?

## Forces

- Speed matters: every minute a broken deployment is live, users are affected.
- Not all changes are easily reversible. Database [migrations](migration.md), deleted data, and external API changes may not have clean rollback paths.
- Rolling back introduces its own risks: the old version may not be compatible with data or schema changes made while the new version was live.
- The pressure of an incident makes complex procedures error-prone.

## Solution

Design your deployment process so that every deployment can be reversed. This means keeping the previous version's artifacts (binaries, container images, bundles) available and having a tested procedure for switching back to them.

For application code, rollback typically means redeploying the previous version. If you use container images, this is as simple as pointing to the previous image tag. If you use compiled artifacts, it means redeploying the previous build. The deployment mechanism should support this natively; "deploy version X" should work for any recent version, not just the latest.
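The idea that "deploy version X" should work for any recent version can be sketched as a small release history. This is a minimal illustration, not a real deployment tool: the `ReleaseHistory` class, its `keep` retention policy, and the version strings are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ReleaseHistory:
    """Tracks deployed versions so any retained one can be redeployed."""

    keep: int = 5  # how many artifacts to retain (an assumed policy)
    versions: list[str] = field(default_factory=list)

    def record(self, version: str) -> None:
        """Record a deployment and prune artifacts beyond `keep`."""
        self.versions.append(version)
        self.versions = self.versions[-self.keep:]

    def current(self) -> str:
        """Return the most recently deployed version."""
        return self.versions[-1]

    def rollback_target(self) -> str:
        """Return the version deployed immediately before the current one."""
        if len(self.versions) < 2:
            raise LookupError("no previous version retained to roll back to")
        return self.versions[-2]

    def can_deploy(self, version: str) -> bool:
        """A version is deployable only while its artifact is retained."""
        return version in self.versions


history = ReleaseHistory(keep=3)
for v in ["v2.3.9", "v2.4.0", "v2.4.1", "v2.5.0"]:
    history.record(v)

print(history.rollback_target())     # the version before the current one
print(history.can_deploy("v2.3.9"))  # pruned artifacts cannot be redeployed
```

The retention window is the design choice that matters here: a rollback target only exists for as long as its artifact is kept.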

For database changes, rollback is harder. This is why [Migration](migration.md) patterns emphasize reversible changes and multi-step transitions. If you added a column, you can drop it, though anything written to it in the meantime is lost. If you dropped a column, the data is gone. Plan your rollback strategy *before* deploying, not during an incident.
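One way to plan the rollback before deploying is to make every migration declare its reverse step and to check for missing ones up front. The sketch below uses Python's built-in `sqlite3` module. The `Migration` class, the helper functions, and the table names are hypothetical, and it creates and drops a table rather than a column only to keep the example portable across SQLite versions.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Migration:
    name: str
    up_sql: str
    down_sql: Optional[str]  # None marks a change with no clean reverse

    @property
    def reversible(self) -> bool:
        return self.down_sql is not None


def irreversible(migrations):
    """Name the migrations that would block a rollback; run before deploying."""
    return [m.name for m in migrations if not m.reversible]


def apply(conn, migration):
    conn.execute(migration.up_sql)


def revert(conn, migration):
    if migration.down_sql is None:
        raise RuntimeError(f"{migration.name} has no rollback path")
    conn.execute(migration.down_sql)


def table_names(conn):
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    return {name for (name,) in rows}


add_audit_log = Migration(
    name="add_audit_log",
    up_sql="CREATE TABLE audit_log (id INTEGER PRIMARY KEY, event TEXT)",
    down_sql="DROP TABLE audit_log",
)
drop_legacy = Migration(
    name="drop_legacy_sessions",
    up_sql="DROP TABLE legacy_sessions",
    down_sql=None,  # the rows are gone once this runs
)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE legacy_sessions (id INTEGER PRIMARY KEY)")

apply(conn, add_audit_log)
revert(conn, add_audit_log)  # additive change: cleanly undone
blockers = irreversible([add_audit_log, drop_legacy])
print(blockers)  # migrations that need a different strategy, such as a backup
```

Running the `irreversible` check in a pre-deploy step surfaces destructive migrations while there is still time to choose another approach.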

For [Configuration](configuration.md) changes, keep previous configurations available. If a config change causes problems, reverting to the previous config should be a one-step operation.
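Keeping previous configurations available can be as simple as storing each applied revision. The following is a minimal in-memory sketch. `ConfigStore` and the setting names are hypothetical, and a real system would persist revisions somewhere durable.

```python
import copy


class ConfigStore:
    """Keeps every applied configuration so the previous one can be restored."""

    def __init__(self, initial):
        self._history = [copy.deepcopy(initial)]

    @property
    def current(self):
        # Return a copy so callers cannot mutate stored revisions.
        return copy.deepcopy(self._history[-1])

    def apply(self, changes):
        """Apply changes as a new revision; earlier revisions stay untouched."""
        new = self.current
        new.update(changes)
        self._history.append(new)

    def revert(self):
        """One-step rollback to the previous revision."""
        if len(self._history) < 2:
            raise LookupError("no previous configuration to revert to")
        self._history.pop()
        return self.current


store = ConfigStore({"pool_size": 10, "timeout_s": 30})
store.apply({"pool_size": 200})  # the change that causes problems
restored = store.revert()        # single call to get back
print(restored)
```

This sketch discards the bad revision on revert. Keeping it in a separate log may be useful for the later investigation.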

Automate what you can. In [Continuous Deployment](continuous-deployment.md) environments, automated health checks should trigger rollback without human intervention. In other environments, make rollback a single command that any authorized team member can execute.
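A health-check-driven rollback might look roughly like the sketch below. The function name, its parameters, and the simulated environment are assumptions made for illustration. A real pipeline would wait between checks and would also verify the restored version.

```python
from typing import Callable


def deploy_with_auto_rollback(
    deploy: Callable[[str], None],
    is_healthy: Callable[[], bool],
    new_version: str,
    previous_version: str,
    checks: int = 3,
) -> str:
    """Deploy new_version; redeploy previous_version if any health check fails.

    Returns the version left running.
    """
    deploy(new_version)
    for _ in range(checks):
        if not is_healthy():
            deploy(previous_version)  # rollback is just another deployment
            return previous_version
    return new_version


# Simulated environment: v2.5.0 is unhealthy, every other version is healthy.
running = {"version": "v2.4.1"}
deploy_log = []


def fake_deploy(version):
    running["version"] = version
    deploy_log.append(version)


def fake_health_check():
    return running["version"] != "v2.5.0"


result = deploy_with_auto_rollback(
    fake_deploy, fake_health_check, new_version="v2.5.0", previous_version="v2.4.1"
)
print(result, deploy_log)
```

Passing `deploy` and `is_healthy` in as functions keeps the rollback decision separate from the deployment tooling, so the same logic can be exercised in a drill without touching production.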

## How It Plays Out

A team deploys a new version that introduces a memory leak. Response times degrade over 30 minutes. The on-call engineer runs `deploy --version=v2.4.1` (the previous version) and the system stabilizes within two minutes. The team debugs the memory leak the next morning at a normal pace, with no user impact beyond the initial degradation.

A developer asks an agent to optimize a database query. The optimization introduces a subtle bug that causes incorrect results for a small percentage of users. Because the code change is a single commit with a [Git Checkpoint](git-checkpoint.md) before it, the team reverts the commit, redeploys, and confirms the correct results are restored, all within 15 minutes.

> **💡 Tip**
>
> Practice rollbacks before you need them. Run a drill: deploy the current version, then immediately roll back. If the rollback procedure does not work smoothly in calm conditions, it will not work during an incident.

> **💡 Example Prompt**
>
> "The latest deploy introduced a memory leak. Roll back to the previous version using deploy --version=v2.4.1, then confirm the system is stable. We'll debug the leak tomorrow."

## Consequences

A reliable rollback capability changes the risk profile of deployment. Deploying becomes a low-stakes action because the downside is limited: if something goes wrong, you can be back to the previous state in minutes. This directly supports frequent deployment, experimentation, and the rapid iteration that agentic workflows enable.

The cost is maintaining rollback infrastructure and discipline. Previous versions must be preserved. Rollback procedures must be tested. Database migrations must be designed with reversibility in mind. And rollback isn't always clean: some changes (sent notifications, processed payments, synced data) can't be undone, so for stateful systems rollback is only a partial remedy.

---

- [Next: Feature Flag](feature-flag.md)
- [Previous: Continuous Deployment](continuous-deployment.md)