---
slug: gitops
type: pattern
summary: "Make a Git repository the single source of truth for system state, and let an automated controller continuously reconcile the running environment to match it."
created: 2026-06-16
updated: 2026-06-16
related:
  version-control:
    relation: uses
    note: "GitOps puts version control at the center: Git becomes the system of record for operations, not just source."
  configuration:
    relation: uses
    note: "The desired state GitOps reconciles toward is declared as versioned configuration."
  environment:
    relation: enables
    note: "Reconciliation targets a specific environment and converges it toward the declared state."
  deployment:
    relation: contrasts-with
    note: "Deployment pushes a new version out; GitOps inverts that, so a controller pulls the declared state in."
  continuous-deployment:
    relation: complements
    note: "Continuous deployment automates the path to production; GitOps gives that path a declarative, auditable control plane."
  rollback:
    relation: enables
    note: "Reverting the commit reverts the system: a git revert is the rollback path."
  runbook:
    relation: contrasts-with
    note: "A runbook documents the steps a human takes; GitOps encodes the desired end state and lets a controller take the steps."
  pipeline-synthesis:
    relation: related
    note: "An agent generating a GitOps manifest is doing pipeline synthesis aimed at the reconciliation loop rather than an imperative script."
  bounded-autonomy:
    relation: related
    note: "The reconciliation controller is a bounded autonomous actor: it acts continuously within the limits the repository declares."
---

# GitOps

> **Pattern**
>
> A named solution to a recurring problem.

*Make a Git repository the single source of truth for what your system should look like, and let an automated controller keep the running system matching it.*

> **📝 Where the name comes from**
>
> The name fuses Git with "Ops."
> The team behind it, Weaveworks, coined it in 2017 to describe how they ran their own Kubernetes clusters: every change to the system went through a Git commit, and a process inside the cluster watched that repository and made the live environment match. The "Ops" half signals that this is an operations model, not a development one. You don't push changes into the system. You change the repository, and the system pulls those changes to itself.

## Understand This First

- [Version Control](version-control.md): GitOps makes the version-control system authoritative over operations, so you need to understand it as the system of record first.
- [Configuration](configuration.md): the desired state GitOps reconciles toward is declared as configuration, separate from code.
- [Deployment](deployment.md): GitOps is a different way to arrive at a deployed system, so it helps to know the conventional act it replaces.

## Context

This is an **operational** pattern. It applies once your system runs somewhere other people depend on (a cloud environment, a Kubernetes cluster, a fleet of servers) and you need a disciplined way to change it without anyone logging in and editing the live system by hand.

The conventional model is push-based. A pipeline runs, builds an artifact, and pushes the change outward into the running environment: it connects to the cluster, applies the new configuration, and restarts what needs restarting. The pipeline holds the credentials to modify production, and the act of deploying is a sequence of imperative commands.

GitOps inverts this. Instead of a pipeline reaching into the environment, a controller that lives *inside* the environment watches a Git repository and pulls changes toward itself. Git holds what the system should be; the controller's job is to make reality match.

In agentic coding, this distinction matters more than it first appears, because it changes what an agent is allowed to touch.

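
The controller's job of making reality match Git can be sketched as a comparison between two dictionaries. This is a minimal illustration, not how any particular controller is implemented: the names `Action`, `plan`, and `reconcile`, and the resource names in the example, are hypothetical, and both the repository and the cluster are stood in for by plain Python dicts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """One correction the controller would make (hypothetical type)."""
    verb: str  # "create", "update", or "delete"
    name: str  # resource name


def plan(desired: dict, live: dict) -> list:
    """Compare declared state with live state and list the corrections."""
    actions = []
    for name, spec in sorted(desired.items()):
        if name not in live:
            actions.append(Action("create", name))
        elif live[name] != spec:
            actions.append(Action("update", name))
    # Anything running that Git does not declare is drift to remove.
    for name in sorted(live):
        if name not in desired:
            actions.append(Action("delete", name))
    return actions


def reconcile(desired: dict, live: dict) -> dict:
    """Apply the plan to a copy of live state and return the result."""
    converged = dict(live)
    for action in plan(desired, live):
        if action.verb == "delete":
            del converged[action.name]
        else:
            converged[action.name] = desired[action.name]
    return converged


# Assumed example data: what Git declares versus what is actually running.
desired = {"web": {"replicas": 3}, "worker": {"replicas": 2}}
live = {"web": {"replicas": 4}, "debug-shell": {"replicas": 1}}

for action in plan(desired, live):
    print(action.verb, action.name)
print("converged:", reconcile(desired, live) == desired)
```

A real controller would run this comparison on a timer or in response to repository events, and `reconcile` would call the platform's API instead of editing a dict. The point of the sketch is only that the controller needs no instructions beyond the declared state: the corrections fall out of the difference.
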
## Problem

You want changes to your running system to be deliberate, reviewable, and reversible. But the conventional ways of changing infrastructure work against all three.

Someone runs a command against production and there's no record of it. A pipeline applies a change, but the live state has since drifted from what anyone believes is deployed. Credentials that can modify production are scattered across CI systems and laptops, each one a way in. When something breaks, no one can say with confidence what the system is *supposed* to look like, let alone get it back there.

How do you make the state of a running system as legible, as reviewable, and as recoverable as your source code already is?

## Forces

- A running system drifts. Manual fixes, emergency patches, and out-of-band changes accumulate until no one knows the true state.
- Credentials that can modify production are dangerous wherever they live. The more places hold them, the larger the attack surface.
- Auditing changes to a system requires a record of who changed what and when, which imperative commands rarely leave behind.
- Recovery demands a known-good state to return to, but "what we had yesterday" is only useful if it was written down.
- Automation is faster than humans but harder to trust; an automated actor that changes production needs tight, legible limits.

## Solution

**Declare the entire desired state of your system in a Git repository, and run a reconciliation controller that continuously converges the live environment to match.**

Four properties make this work, and dropping any one of them weakens the rest.

*Declarative.* Describe what the system should be, not the steps to get there. A manifest says "run three replicas of this service with this config," not "scale up by one." Declarative state can be diffed, reviewed, and reasoned about.

*Versioned and immutable.* The declared state lives in Git, so every change is a commit: authored, reviewed in a pull request, and permanently recorded.
The history is the audit log. Nothing changes the system without first changing the repository.

*Pulled automatically.* A controller inside the environment pulls the desired state from Git, rather than an outside pipeline pushing it in. This is the heart of the inversion. The credentials to modify production stay inside the environment with the controller; nothing external needs them.

*Continuously reconciled.* The controller never stops comparing live reality to declared intent. When they diverge, whether because someone made a manual change or because a node failed and its workloads were rescheduled, it corrects the drift back toward what Git says. The repository isn't a record of the last deploy; it's a continuously enforced contract.

The payoff is that operating the system becomes editing a file and merging a pull request. To change production, you change the repository. To roll back, you revert the commit and let the controller converge. To know the true state, you read the repository, because the controller's whole purpose is to make reality agree with it.

## How It Plays Out

A platform team runs a dozen services on Kubernetes. They stop giving engineers direct cluster access entirely. Every change, whether a new service version, a config tweak, or an extra replica, is a pull request against a manifests repository. A controller in each cluster watches that repository and applies merged changes within a minute. When a midnight incident forces an emergency manual patch, the controller notices the drift at the next reconciliation and either reverts it or flags it, depending on policy. The manual fix doesn't survive unless someone codifies it in Git, which is exactly the discipline the team wanted.

A release goes bad: a new version is crashing on startup. The on-call engineer doesn't open a console or run a rollout command. They revert the offending commit in the manifests repository and merge. The controller sees the previous known-good state in Git and converges the cluster back to it.
The rollback is a `git revert`, with the same review trail as any other change.

> **⚠️ The constraint this places on agents**
>
> GitOps changes what an agent directing your infrastructure is allowed to do. In a push model, an agent might hold cluster credentials and apply changes directly. That's fast, and almost impossible to audit or undo cleanly. Under GitOps, the agent has no path to production except a commit. It edits a manifest, opens a pull request, and the change is reviewed and reconciled like any other. The agent never touches the running system; it changes the declaration of what the system should be. This narrows the agent's blast radius to "things it can express as a reviewed commit," which is a far safer boundary than "things it can do with live credentials."

## Consequences

**Benefits.** The state of your system becomes as legible and recoverable as your code. Every change is reviewed and recorded, so the audit trail is automatic. Drift is detected and corrected instead of silently accumulating. The credentials to modify production stay inside the environment, shrinking the attack surface. Rollback is a revert. And because the repository is authoritative, recovering a destroyed environment can be as simple as pointing a fresh controller at the same Git history.

**Liabilities.** The reconciliation controller is now critical infrastructure: if it misbehaves, it can fight a legitimate manual change or propagate a bad commit across the fleet quickly. The model assumes your system state can be expressed declaratively, which fits cloud-native and container workloads well but maps awkwardly onto stateful resources and one-off operations. There's a learning curve: engineers used to logging in and fixing things have to learn that the only path is a commit, and that discipline can feel slow during an incident. And secrets need careful handling, since a naive setup tempts people to commit credentials straight into the very repository everyone can read.

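
The claim that rollback is a revert can be made concrete with a toy model. The sketch below is a deliberate simplification: each commit stores a full snapshot of desired state, and only the most recent commit can be reverted, whereas a real `git revert` applies the inverse patch of any chosen commit. The names `Repo`, `revert_head`, and `converge` are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Repo:
    """Toy append-only history; each commit is a (message, state) pair."""
    commits: list = field(default_factory=list)

    def commit(self, message: str, state: dict) -> None:
        self.commits.append((message, dict(state)))

    def head(self) -> dict:
        return self.commits[-1][1]

    def revert_head(self) -> None:
        """Add a NEW commit restoring the snapshot before HEAD.

        History is never rewritten, so the bad commit and its revert
        both stay in the audit trail.
        """
        if len(self.commits) < 2:
            raise ValueError("nothing to revert to")
        message, _ = self.commits[-1]
        _, previous = self.commits[-2]
        self.commit(f'Revert "{message}"', previous)


def converge(repo: Repo, live: dict) -> None:
    """Stand-in for the controller: make live state equal to HEAD."""
    live.clear()
    live.update(repo.head())


repo = Repo()
live = {}
repo.commit("deploy api v1", {"api": "v1"})
repo.commit("deploy api v2", {"api": "v2"})
converge(repo, live)  # the environment now runs v2

repo.revert_head()    # the rollback is just another commit
converge(repo, live)  # the controller does the rest

print(live)
print([message for message, _ in repo.commits])
```

The rollback needs no special operational path here: the same `converge` call that rolled v2 out also rolls it back, and the history grows by one entry instead of losing one.
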
## Sources

- The term and the model were introduced by Alexis Richardson and colleagues at Weaveworks in 2017, describing how they operated Kubernetes clusters with Git as the source of truth and an in-cluster agent reconciling against it.
- The Cloud Native Computing Foundation, through its OpenGitOps working group, codified four principles as a vendor-neutral definition: declarative, versioned and immutable, pulled automatically, and continuously reconciled. Argo CD and Flux, both CNCF-graduated projects, are two widely used implementations of the reconciliation controller.
- The reconciliation-loop idea predates the name: it generalizes the control-loop model at the heart of Kubernetes itself, where controllers continuously drive observed state toward desired state.

---

- [Next: Rollback](rollback.md)
- [Previous: Build Provenance](build-provenance.md)