Sweep
Apply one rule uniformly across many files in a single, disciplined pass, so the codebase moves from old convention to new convention without drift or dangling exceptions.
Also known as: Mass Refactoring, Cross-Cutting Change, Codebase-Wide Rewrite
Understand This First
- Refactor — a sweep is often a refactor applied at codebase scale, though not every sweep is behavior-preserving.
- Parallel Change — the migrate phase of a parallel change is typically a sweep of callers from the old form to the new.
- Blast Radius — a sweep has maximal blast radius by definition, which is why it needs its own discipline.
Context
At some point every codebase needs a change that touches many files at once. You rename a function used in 300 places. You replace a deprecated API with its successor. You add a missing license header. You update an import path after a package moves. You normalize casing on dozens of environment variables. The rule itself is simple; the work is in applying it everywhere, consistently, without leaving behind the very inconsistency that motivated the change in the first place.
Before agents, you had three choices for this kind of work. Regex search-and-replace was cheap and fragile. An IDE’s language-aware rename worked inside a single project but fell apart at service boundaries or in a language the IDE didn’t parse. A codemod (an abstract-syntax-tree transformation script like jscodeshift or ast-grep) gave you precision but required writing and debugging the transformation up front. Agents add a fourth option: a reasoning sweep, where the agent holds the rule in context and applies it file-by-file with judgment about the edge cases that would break a purely syntactic transformation.
Problem
You have one rule (a rename, an API replacement, a convention change, a vocabulary update) that needs to land consistently across many locations. If it lands in some places and not others, you now have two conventions in the same codebase, which is worse than either convention on its own. The change itself is mechanical at any single site. The difficulty is coordination: find every site, apply the rule correctly, catch the edge cases, verify nothing regressed, and do it without spending a week manually reviewing three hundred nearly-identical diffs.
How do you apply one transformation uniformly to a large codebase without drift, without missing edge cases, and without detonating a hidden regression that doesn’t surface until production?
Forces
- Consistency matters more than any single site. Missing one call site is often worse than doing none.
- The blast radius is maximal. One bad rule applied to every matching file touches every matching file.
- Some rules are syntactic and some require judgment. Picking the wrong execution mechanism wastes the effort or silently corrupts the result.
- Review cost grows linearly with the number of touched files, so human-scale review is the first thing to collapse.
- Tests are the only check that scales, but only if they actually cover the behavior the sweep could break.
- Rollback must stay cheap, because no one is perfect and a bad sweep needs to un-land fast.
Solution
Define the rule crisply, pick the execution mechanism that matches the rule’s precision needs, and execute in batches small enough that a failing batch is easy to roll back. Each batch is gated on green tests and a diff review. A checkpoint lands before every batch. The sweep isn’t done when the last file is touched; it’s done when the test suite is green, the diff has been reviewed, and you can explain what changed to someone who wasn’t watching.
Three execution modes, with a decision rule:
Regex or search-and-replace. Cheap and fast, but blind to syntax. Use this only when the rule is trivially textual: adding a missing file header, updating a URL, renaming a string constant whose spelling is unambiguous. The moment the rule depends on what the text means (is this occurrence of user a variable name or a word in a comment?), regex is the wrong tool.
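A minimal sketch of the textual mode and its limit, using a hypothetical constant rename (OLD_ENDPOINT is an invented name). Word boundaries are about as much "meaning" as a regex can encode: they stop partial-identifier matches, but the pattern still cannot tell code from comments.

```python
import re

def apply_rule(source: str) -> str:
    # \b word boundaries keep the match from firing inside longer
    # identifiers like OLD_ENDPOINT_V2.
    return re.sub(r"\bOLD_ENDPOINT\b", "NEW_ENDPOINT", source)

print(apply_rule("url = OLD_ENDPOINT"))     # → url = NEW_ENDPOINT
print(apply_rule("url = OLD_ENDPOINT_V2"))  # → untouched: boundary blocks it
# But regex is blind to syntax: the same word inside a comment is
# rewritten too, which may or may not be what the rule intends.
print(apply_rule("# see OLD_ENDPOINT for details"))
```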
Codemod. An AST-based transformation script. Precise, repeatable, and reviewable. This is the right tool when the rule is syntactic but non-trivial: renaming a function and its call sites, replacing one API with another, migrating between two versions of a framework. The cost is writing the transformation, which is often worth it for rules that will run more than once or on a very large codebase.
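To make the codemod mode concrete, here is a toy AST transformation in Python's stdlib ast module, sketching the kind of rename-plus-rewrite rule the article describes (the charge-to-charge_cents rule is illustrative, not a prescribed implementation; real codemods usually use comment-preserving tools like jscodeshift, ast-grep, or libcst, since ast.unparse discards comments and formatting).

```python
import ast

class ChargeToCents(ast.NodeTransformer):
    """Rewrite charge(amount) -> charge_cents(amount * 100)."""
    def visit_Call(self, node):
        self.generic_visit(node)  # rewrite nested calls first
        if isinstance(node.func, ast.Name) and node.func.id == "charge" and node.args:
            node.func.id = "charge_cents"
            node.args[0] = ast.BinOp(
                left=node.args[0], op=ast.Mult(), right=ast.Constant(100)
            )
        return node

tree = ChargeToCents().visit(ast.parse("total = charge(price)"))
print(ast.unparse(ast.fix_missing_locations(tree)))
# → total = charge_cents(price * 100)
```

Because the transformation operates on syntax trees, it never touches the word "charge" in a string or comment, which is exactly the precision the regex mode lacks.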
Agentic sweep. The agent holds the rule in context, reads each file, and applies the rule with judgment. This is the right tool when the rule requires meaning: when some call sites are legitimate exceptions, when nearby comments or tests also need updating, when the rule interacts with local context the transformation script can’t see. An agent can also write the codemod for you as a first step, then switch to direct editing for the sites the codemod can’t handle.
The sweep discipline is the same regardless of mechanism. Write the rule down in plain language before you start. Enumerate the target set with a search you can double-check. Sample three or four candidates by hand to verify the rule actually holds on real code. Then checkpoint, apply the rule to a small batch, run the tests, review the diff, and checkpoint again. Scale the batch size only after the first batch lands clean. The “one sweep at a time” rule holds: if the rule changes mid-sweep, you’re starting a new sweep, not amending the current one.
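The checkpoint-batch-verify loop above can be sketched as a small driver. Everything here is a hypothetical shape, not a real tool's API: the checkpoint, rule, and test hooks are injected so the loop stays mechanism-agnostic (a real driver would call git and the test runner via subprocess).

```python
def run_sweep(targets, apply_rule, checkpoint, tests_pass, batch_size=40):
    """Apply a sweep rule in batches, checkpointing before each batch
    and halting at the first red test run, so rollback is always one
    checkpoint away."""
    done = []
    for i in range(0, len(targets), batch_size):
        batch = targets[i:i + batch_size]
        checkpoint()              # e.g. a git commit before the batch
        for path in batch:
            apply_rule(path)
        if not tests_pass():
            return done, batch    # halt: amend the rule, restart from checkpoint
        done.extend(batch)
    return done, []               # swept everything, no failing batch
```

Note the deliberate asymmetry: a green batch extends the swept set, a red batch stops the whole sweep. Continuing past a failure is how drift gets baked in.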
How It Plays Out
A product team needs to rename a payments function from charge(amount) to charge_cents(amount_cents) across a monorepo. There are 312 call sites across 14 services. They write the rule in a plan doc: every call to charge becomes a call to charge_cents, with the argument multiplied by 100; related variable names change to reflect cents; a handful of test fixtures will need updated expected values. A senior engineer hand-samples six call sites and confirms the rule. Then an agent runs the sweep in batches of 40 files, checkpointing before each batch and running the service-local test suite after. Two batches surface edge cases the rule didn’t cover (a scheduled job that already multiplies by 100, and a legacy integration test that mocks the old signature), and each surfaces as a failing test, not a silent regression. The team pauses the sweep, amends the rule, and restarts from the last checkpoint. Total wall time: two days, most of it waiting on CI. No production incident.
A React shop has 1,400 components still using the deprecated componentWillMount lifecycle. The rule is structural enough that a jscodeshift codemod handles 95% of the sites. For the remaining 5%, the codemod output fails review because the components have side-effect ordering that the syntactic transformation can’t preserve. A human writes a short list of the exceptions, an agent handles the subtle cases one file at a time, and the team ends with a single PR per module rather than one monster PR for the whole codebase.
Ask an agent to walk the target set once before it starts editing. A preflight pass that says “I found 312 matches across 14 services; here are six representative sites and the rule I plan to apply” gives you a chance to correct the rule while the sweep is still cheap to redirect. Editing starts only after the preflight is approved.
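A preflight pass is simple enough to sketch directly. This is an illustrative shape, not a specific tool: files is a mapping of path to contents, and matches_rule is whatever textual or semantic predicate enumerates the target set.

```python
import random

def preflight(files, matches_rule, sample_size=6):
    """Enumerate every matching site and pull a small random sample
    for human review before any editing starts."""
    hits = [(path, line_no, line)
            for path, text in files.items()
            for line_no, line in enumerate(text.splitlines(), 1)
            if matches_rule(line)]
    sample = random.sample(hits, min(sample_size, len(hits)))
    return len(hits), sample

files = {"svc/a.py": "x = charge(p)\ny = 1", "svc/b.py": "z = charge(q)"}
count, sample = preflight(files, lambda line: "charge(" in line, sample_size=2)
print(f"{count} matches; review sample: {sample}")
```

The count is the cheap sanity check: if the preflight reports 280 matches and you expected roughly 312, the rule or the search is wrong, and you have found out before a single edit landed.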
The Encyclopedia itself runs sweeps. When the style guide grew a new prerequisite-link convention, every article needed a small, consistent edit. That work landed as a sweep, not as 230 separate edits, because treating it as one named unit of work forced the discipline: write the rule, enumerate the targets, sample, checkpoint, batch, verify. The name Sweep is how the improve engine’s own planning refers to this kind of change.
When It Fails
Rule ambiguity. The rule looks obvious to you and ambiguous to the agent. The first ten files get it right; the eleventh interprets an edge case the wrong way; by the hundredth file the drift is baked in. Fix: sample before batching. Re-sample after any rule amendment.
Missed targets. Your grep query didn’t catch every form: charge( missed charge ( written with extra whitespace, dynamic calls routed through a registry, and the renamed copy in a vendored dependency. Fix: combine textual and semantic search. Verify the target count matches the expected count before starting.
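The whitespace case is the cheapest of these to fix at the search level; a sketch of the difference, using the article's charge example:

```python
import re

naive = re.compile(r"charge\(")          # misses "charge (" and line-wrapped calls
tolerant = re.compile(r"\bcharge\s*\(")  # survives formatting variation, and the
                                         # \b still rejects recharge( and the \s*\(
                                         # still rejects charge_cents(

line = "total = charge (price)"
print(bool(naive.search(line)), bool(tolerant.search(line)))  # → False True
```

Textual tolerance only goes so far: dynamic dispatch through a registry and vendored copies need semantic search (an AST query, a call-graph tool, or the agent reading the code), which is why the fix says to combine both.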
Silent regressions. The test suite passes but doesn’t exercise the behavior the sweep could break. This is the most dangerous failure because it ships. Fix: before sweeping, confirm the tests cover the surface the rule touches. If coverage is thin, write the tests first. Test-less sweeps are coin flips.
Batches too large to review. A 400-file diff isn’t reviewable in any meaningful sense. The review becomes a ritual. Fix: batch sizes small enough that a human can actually read each diff, typically 20 to 50 files, fewer for subtle rules.
Treating the sweep as idempotent when it isn’t. Running the sweep twice produces a different result than running it once. Fix: either make the rule truly idempotent (the second run is a no-op) or treat each run as a one-shot from a clean checkpoint.
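Idempotence is cheap to check on the hand-picked sample before the sweep starts. A minimal sketch, with two invented rules on the article's charge example: one rule compounds on every run, the other finds nothing left to rewrite the second time.

```python
def assert_idempotent(rule, samples):
    """Pre-sweep check: applying the rule twice must equal applying it once."""
    for src in samples:
        once = rule(src)
        assert rule(once) == once, f"rule is not idempotent on: {src!r}"

# Not idempotent: every run multiplies by 100 again.
bad = lambda s: s.replace("charge(", "charge(100 * ")
# Idempotent: after one run, "charge(" no longer appears.
good = lambda s: s.replace("charge(", "charge_cents(")

assert_idempotent(good, ["pay = charge(x)"])   # passes
```

If this check fails and the rule can't reasonably be made idempotent, the fallback in the fix above applies: treat each run as a one-shot from a clean checkpoint.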
Sweeping before the test suite is reliable. If CI is flaky, you can’t tell whether the sweep broke something or CI is just CI. Fix: stabilize the test suite first. A sweep on a shaky test suite is flying blind at maximum speed.
Consequences
Benefits. The codebase ends in a consistent state, not partially migrated. Readers and future tools (including future agents) see one convention, not two. The discipline of writing the rule down forces clarity about what actually changed and why. Batching keeps the work reviewable and reversible, turning one terrifying diff into a sequence of boring ones. Agentic sweeps unlock changes that were previously too tedious to attempt, so codebases can stay closer to their preferred conventions rather than drifting.
Liabilities. A sweep is more change than most review processes are built for. Even well-batched, the review overhead is real, and reviewers tire. A badly-specified sweep can silently degrade a large part of the codebase before anyone notices. Sweeps also tend to obscure the history; a single commit that renames 300 things makes subsequent git blame harder, so prefer smaller commits per batch and clear commit messages over one giant squash.
There is a coordination cost with other work. While a sweep is in flight, every merge conflict with main is amplified. Schedule sweeps for windows when the rest of the team isn’t landing large changes in the same files, or the sweep will spend more time rebasing than sweeping.
Related Patterns
- Depends on: Refactor — a sweep is often a codebase-wide refactor, and shares its requirement for tests as a safety net.
- Depends on: Parallel Change — the migrate phase of a parallel change is typically a sweep of callers from the old form to the new one.
- Depends on: Git Checkpoint — the checkpoint-per-batch discipline is what makes a sweep safely reversible.
- Depends on: Test — tests are the only oracle that scales to sweep-sized changes.
- Depends on: Blast Radius — naming the blast radius is what motivates the sweep discipline.
- Uses: Approval Policy — sample-then-approve-then-batch is a natural fit for agentic sweeps.
- Uses: Task Decomposition — large sweeps must be decomposed into batches small enough to review and revert.
- Uses: Regression — regression tests are what catch the sweep’s edge cases that the rule missed.
- Related: Deprecation — a deprecation often ends with a final sweep to remove the last users of the old form.
- Related: Strangler Fig — a strangler migration is gradual where a sweep is uniform; a strangler may contain many sweeps inside it.
- Related: Evolutionary Modernization — sweeps are one of its day-to-day mechanisms for moving the codebase toward new conventions.
- Related: Migration — migrations often contain sweeps as sub-steps.
- Related: Coding Convention / Naming — canonical targets for sweep rules.
- Enabled by: Agent / Subagent — the agentic mode of the sweep is only practical once agents exist to carry the rule across files.
- Related: Garbage Collection — each recurring garbage-collection pass is a small, scheduled sweep.
Sources
Martin Fowler’s writing on codemod-based refactoring, particularly Refactoring with Codemods to Automate API Changes, names the deterministic half of this pattern and develops the discipline for applying an AST transformation across a codebase while preserving behavior. The three-mode framing in this article (regex, codemod, agentic) builds on that baseline.
The practice of cross-cutting change has long roots in the Extreme Programming and refactoring communities. William Opdyke’s 1992 PhD thesis at the University of Illinois, Refactoring Object-Oriented Frameworks, established the idea that large structural changes could be decomposed into small, behavior-preserving steps, a direct ancestor of the batch-and-verify discipline in the Solution section.
The jscodeshift and ast-grep tool communities developed the practical mechanics of running deterministic sweeps at scale, including the batch-review patterns that the agentic mode now inherits.
The agentic variant of the pattern emerged from the coding-agent practitioner community in 2024 and 2025, as tools capable of reliably editing many files on a single rule became widely available. The name Sweep for this operation is now in common practitioner use, including as a product name for agent-driven refactoring and as a proposal type inside the Encyclopedia’s own authoring engine.
Further Reading
- Refactoring: Improving the Design of Existing Code (Martin Fowler, 2nd ed. 2018) — the canonical catalog of behavior-preserving transformations, which any sweep rule should draw on.
- Refactoring Databases (Scott Ambler and Pramod Sadalage, 2006) — the schema case of cross-cutting change, which shaped the dual-write and batch discipline that sweep-style migrations inherit.