Agent Sprawl

Antipattern

A recurring trap that causes harm: learn to recognize and escape it.

Agents proliferate faster than governance can keep up, and within months nobody can say how many are running, what they touch, or who owns them.

Symptoms

Nobody can give you a number. Ask “how many agents are we running?” and you get ranges, not answers.
Two teams discover they’ve built the same agent to solve the same problem, with different credentials, different prompts, and different failure modes.
Agents run on personal API keys tied to engineers who left months ago. When the keys finally rotate, things break in places nobody expected.
Each repository has its own CLAUDE.md (or equivalent), and the guardrails drift apart. The same agent behaves one way in the billing service and another in the notifications service, and neither matches policy.
Security can’t draw a map of which agents touch which data stores. When the question comes up in an audit, the honest answer is “we’ll have to grep for API tokens.”
Incident reviews start including a new kind of line: “an internal agent made this change.” Nobody logged the reasoning, and the person who configured the agent isn’t on the incident.

Why It Happens

The cost of creating an agent is near zero. A prompt file, an API key, a shell alias, and you have an autonomous worker running against production systems. That’s a lower bar than any previous wave of shadow IT ever cleared. Shadow servers needed hardware. Shadow SaaS needed a credit card. Shadow agents need a few minutes and a tool that any engineer already has.

Agents also solve real problems fast. A team that’s been waiting six weeks for a platform feature can build an agent that works around the gap in an afternoon. The agent works. It saves time. It doesn’t go through review because review takes longer than the agent took to build, and the work is already done. Every team reaches the same conclusion independently.

The governance side moves in the opposite direction. Registries, policies, and observability are platform work, and platform work lags product work by design. By the time the platform team starts building an agent registry, ten teams that don’t know the registry exists already have agents in production. The platform is building the map; the territory is expanding faster than the map can catch up.

Nothing about this is malicious or careless. It’s the rational response to a fast-moving tool and a slow-moving organization. But the result is a population of autonomous workers that nobody is tracking, and that population compounds.

The Harm

Sprawl doesn’t look dangerous from inside any one team. Each team’s agent is fine. The harm is a system-level property that nobody owns.

The most visible cost is maintenance. Gartner and industry analysts tracking AI-generated code in 2026 report maintenance costs running roughly 4x traditional levels by the second year of heavy agent use. The reason is structural: each agent accretes its own conventions, its own prompts, its own assumed credentials, and its own failure modes. When something drifts, there’s no shared toolchain to fix it. The fleet grows, and so does the per-agent cost of keeping any one of them healthy.

The security cost is worse. Each unregistered agent is an unmonitored attack surface holding real credentials and taking real actions. The 2026 IBM Cost of a Data Breach report put the average breach cost at around $4.6 million, and agent-related exposures are becoming a distinct category in those numbers. An attacker who compromises one shadow agent inherits everything that agent can reach. Because the agent isn’t in any inventory, the compromise isn’t detected by the monitoring the organization does have. It’s detected, if at all, by the downstream damage.

Then there’s the governance cost, which is the quiet one. The book already treats Shadow Agent as the antipattern of a single unregistered agent. Sprawl is what happens when shadow-agent conditions persist at population scale. Bounded Autonomy can’t bound what isn’t listed. Approval Policy can’t gate what doesn’t pass through any gate. Least Privilege can’t constrain credentials it hasn’t seen. The governance patterns the book describes depend on knowing the agents exist. When the population is uncharted, none of them apply.

Industry surveys in 2026 (Paperclipped, RSAC) reported that about 80% of organizations running agents at scale had seen at least one unintended action whose root cause traced to an agent outside the inventory. In regulated industries the harm is even simpler. An auditor asks “what automated systems accessed this customer record in the last ninety days?” The answer has to be complete or the answer is worthless. Sprawl guarantees the answer can’t be complete.

The Way Out

The corrective pattern isn’t eradication. It’s treating the agent fleet as a production system, with the same disciplines any other fleet gets.

Start with a registry, not a policy. You can’t govern what you can’t count. Build a lightweight agent registry before you build enforcement. Every agent gets an entry: what it does, what it accesses, who owns it, and when it was last reviewed. Keep the form short. Make submission faster than the shadow path, or the shadow path wins again. Pair the launch with an amnesty window, the way Shadow Agent describes: invite teams to disclose existing agents without penalty, then enforce after the window closes.
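As a rough illustration of how small a first registry can be, the sketch below stores the four fields named above in memory. The class names, field names, and sample values are assumptions made for this example; a real registry would sit behind a form or an API with persistent storage.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class AgentEntry:
    """One registry record. Field names are illustrative, not a standard."""
    name: str
    purpose: str                # what it does
    accesses: tuple[str, ...]   # what it accesses
    owner: str                  # who owns it
    last_reviewed: date         # when it was last reviewed


class AgentRegistry:
    """Hypothetical in-memory registry keyed by agent name."""

    def __init__(self) -> None:
        self._entries: dict[str, AgentEntry] = {}

    def register(self, entry: AgentEntry) -> None:
        # Refuse entries without a named owner; everything else is accepted
        # so that submission stays faster than the shadow path.
        if not entry.owner.strip():
            raise ValueError(f"agent {entry.name!r} needs a named owner")
        self._entries[entry.name] = entry

    def count(self) -> int:
        """Answer 'how many agents are we running?' with a number."""
        return len(self._entries)

    def touching(self, resource: str) -> list[str]:
        """Names of registered agents that declare access to a resource."""
        return sorted(
            e.name for e in self._entries.values() if resource in e.accesses
        )


# Sample data, invented for the example.
registry = AgentRegistry()
registry.register(AgentEntry(
    name="jira-triage",
    purpose="labels incoming tickets",
    accesses=("jira",),
    owner="team-support",
    last_reviewed=date(2026, 1, 15),
))
registry.register(AgentEntry(
    name="invoice-checker",
    purpose="flags mismatched invoices",
    accesses=("billing-db", "jira"),
    owner="team-billing",
    last_reviewed=date(2026, 2, 1),
))
```

Two queries are enough to start with: `count()` gives the number nobody could give before, and `touching("billing-db")` gives security the beginning of a map.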

Put a platform team on agent operations. Sprawl is a platform problem, not a security problem. Platform as a Product applies directly: a small team owns the agent runtime, provides shared scaffolding (logging, credential vault, guardrails), and makes the supported path cheaper than the unsupported path. This is how Thinnest Viable Platform gets off the ground for agents specifically. It doesn’t have to solve everything. It has to solve enough that teams don’t want to opt out.

Converge observability into one stream. Agents need to emit the same kinds of signals other production systems do: what they did, what they touched, how long it took, what it cost. Route that stream into the organization’s observability stack alongside services and jobs. When the next incident happens, agents should appear in the incident timeline as first-class participants, not as a footnote someone adds after the fact.
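One way to picture a shared signal is a single structured event per agent action, carrying the four facts listed above. The sketch below is an assumption about shape only: the `AgentEvent` name, its fields, and the JSON-lines output are invented for illustration, and a real stack would hand the line to whatever log shipper or tracing library the organization already runs.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class AgentEvent:
    """Hypothetical event schema; adapt the fields to the local stack."""
    agent: str
    action: str            # what it did
    resources: list[str]   # what it touched
    duration_s: float      # how long it took
    cost_usd: float        # what it cost


def to_log_line(event: AgentEvent) -> str:
    """Serialize an event as one JSON line with stable key order."""
    return json.dumps(asdict(event), sort_keys=True)


# Sample event, invented for the example.
line = to_log_line(AgentEvent(
    agent="pr-reviewer",
    action="commented on pull request",
    resources=["git:notifications-service"],
    duration_s=4.2,
    cost_usd=0.03,
))
```

Because every agent emits the same keys, the incident timeline can filter on `agent` the same way it filters on a service name.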

Apply Least Privilege and Trust Boundary to every registered agent. An agent in the registry without scoped credentials is barely better than an agent outside the registry. Scope the credentials. Draw the blast radius. Review on a cadence.
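A registry makes this checkable. The sketch below flags entries with no recorded scopes, wildcard scopes, or an overdue review. The dictionary layout, the `service:permission` scope syntax, and the 90-day cadence are all assumptions chosen for the example, not recommendations from the sources.

```python
from datetime import date, timedelta

REVIEW_INTERVAL = timedelta(days=90)  # assumed cadence; pick your own


def findings(agent: dict, today: date) -> list[str]:
    """Return the least-privilege problems found in one registry entry."""
    problems = []
    scopes = agent.get("scopes", [])
    if not scopes:
        problems.append("no scoped credentials recorded")
    # Assumed scope syntax: "service:permission", with "*" as a wildcard.
    if any(s == "*" or s.endswith(":*") for s in scopes):
        problems.append("wildcard scope")
    if today - agent["last_reviewed"] > REVIEW_INTERVAL:
        problems.append("review overdue")
    return problems


# Sample entries, invented for the example.
today = date(2026, 6, 1)
broad = {"name": "report-bot", "scopes": ["billing:*"],
         "last_reviewed": date(2026, 1, 1)}
tight = {"name": "jira-triage", "scopes": ["jira:read", "jira:label"],
         "last_reviewed": date(2026, 5, 1)}
```

Here `findings(broad, today)` reports a wildcard scope and an overdue review, while `findings(tight, today)` returns an empty list.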

Treat the accumulating drift as debt. Agent sprawl is a form of Technical Debt, and the ways out are the same: make it visible, pay it down continuously, and stop accruing new debt. Rely on Garbage Collection as an ongoing habit. Assign an owner for the fleet and hold them accountable for its health.
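A recurring sweep can be very plain. The sketch below lists agents whose owner is no longer on staff or that have been idle for a while. The record layout, the staff set, and the 30-day idle threshold are assumptions for illustration. The output is a review list, not a kill list, since retiring an agent that workflows quietly depend on can break things.

```python
from datetime import date, timedelta


def retirement_candidates(agents, active_staff, today,
                          idle_after=timedelta(days=30)):
    """Map agent name -> reasons it should be reviewed for retirement."""
    candidates = {}
    for agent in agents:
        reasons = []
        if agent["owner"] not in active_staff:
            reasons.append("owner no longer active")
        if today - agent["last_seen"] > idle_after:
            reasons.append("idle")
        if reasons:
            candidates[agent["name"]] = reasons
    return candidates


# Sample fleet, invented for the example.
fleet = [
    {"name": "weekly-report", "owner": "dana", "last_seen": date(2026, 5, 30)},
    {"name": "oncall-filter", "owner": "sam", "last_seen": date(2026, 5, 29)},
    {"name": "old-migrator", "owner": "lee", "last_seen": date(2026, 2, 1)},
]
sweep = retirement_candidates(fleet, active_staff={"dana", "lee"},
                              today=date(2026, 6, 1))
```

In this sample, `oncall-filter` is flagged because its owner is not in the staff set, and `old-migrator` is flagged as idle.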

Tip

A fast way to estimate sprawl: grep your logs and API gateway for consumers that don’t match any registered service. Each unrecognized consumer is a candidate agent. This exercise almost always returns a larger number than the team expects, and the number itself is the argument for building the registry.
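The estimate can be sketched in a few lines, assuming the gateway log has been reduced to lines whose first whitespace-separated field is a consumer identifier. That format, the function name, and the sample identifiers are assumptions for this example; real gateway logs will need their own parsing.

```python
from collections import Counter


def unrecognized_consumers(log_lines, registered):
    """Count requests per consumer that matches no registered service."""
    counts = Counter()
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        consumer = line.split()[0]  # assumed: consumer id is the first field
        if consumer not in registered:
            counts[consumer] += 1
    return counts


# Sample log lines, invented for the example.
sample_log = [
    "billing-service GET /invoices",
    "key-7f3a POST /tickets",
    "key-7f3a POST /tickets",
    "notifications-service POST /send",
    "key-c91e GET /customers",
]
suspects = unrecognized_consumers(
    sample_log, registered={"billing-service", "notifications-service"}
)
```

Each key in `suspects` is a candidate agent to chase down, and the request counts hint at which ones matter most.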

How It Plays Out

A mid-size SaaS company has adopted agentic coding across three product teams. After six months, the head of engineering asks a simple question at a Monday standup: “how many agents are we running in production?” Silence. The team leads huddle for two days and come back with a list of eleven. Security runs an API key audit over the same period and finds nineteen agents issuing calls that the team leads didn’t know about. Most of their keys are still valid, and several are tied to people who have left the company. Nobody is at fault. Every individual decision made sense at the time. The company spends the next six weeks pulling together a registry, rotating credentials, and shutting down the agents that no longer have an owner. Shutting down two of those agents breaks things nobody expected, because internal workflows had quietly come to depend on them. The team writes “agent sprawl remediation” on the incident postmortems and starts treating the registry as production infrastructure.

A platform team at a financial services firm sees the problem coming and gets ahead of it. They set up a registry, a shared runtime, and a light approval workflow before any of the product teams ship a production agent. The supported path has a single-page form, same-day approval for routine cases, and a pre-wired credential vault that scopes what each agent can reach. Some teams still try to run their own agents outside the system at first. The platform team doesn’t argue. They instrument the API gateway to surface unrecognized consumers, share the list in a monthly operations review, and help the offending teams migrate onto the platform without drama. Within a quarter, everyone is using the supported path because it’s measurably less work. The firm’s auditors get a complete answer to the “which automated systems touched customer data” question in five minutes.

An engineering manager at a startup runs an agent audit after reading Paperclipped’s 2026 piece on rogue agents. She expects to find maybe half a dozen. She finds twenty-seven. Most of them were built by a single engineer who discovered that Claude Code could automate his Jira triage, his on-call noise filtering, his PR reviews, and his weekly report generation: a one-person agent fleet, invisible to everyone else, running against production tokens. The engineer isn’t doing anything wrong. The incentive was to ship. But the audit makes clear that when one person can build twenty-seven agents without anyone noticing, the organization isn’t governing anything. The next week, the company starts an agent registry and signs the engineer up as its first contributor.

Scales up: Shadow Agent – a single unregistered agent is a shadow agent; the population of them, unmanaged over time, is sprawl.
Prevented by: Bounded Autonomy – autonomy boundaries require knowing the agent exists to scope them in the first place.
Prevented by: Approval Policy – approval is only meaningful when every agent routes through the same gates.
Prevented by: Least Privilege – credentials held by unregistered agents are almost never scoped down, because nobody was reviewing them.
Depends on: Observability – without shared observability the fleet is invisible, and sprawl is the default steady state.
Countered by: Garbage Collection – ongoing sweeps that find and retire drifted or abandoned agents before the fleet compounds.
Countered by: Architecture Fitness Function – automated checks that flag agents violating organizational standards as soon as they appear.
Enabled by: Platform as a Product – the supported path that makes registration cheaper than evasion.
Enabled by: Thinnest Viable Platform – the smallest platform that solves the real governance need without over-building.
Related: Trust Boundary – every registered agent sits on a trust boundary that has to be drawn explicitly.
Related: Attack Surface – each unregistered agent is an unmonitored entry point.
Related: Technical Debt – sprawl behaves like debt, compounds like debt, and has to be paid down like debt.
Related: Ownership – sprawl is, at root, a gap in ownership; every agent needs a named steward.
Related: Steering Loop – a well-structured steering loop concentrates work in a focused agent rather than spawning new ones.

Sources

The term “agent sprawl” has crossed into vendor glossaries and industry reporting as a named phenomenon rather than a coined metaphor. Okta’s 2026 glossary entry on agent sprawl frames it as the operational version of identity sprawl, adapted for autonomous workers. Beam.ai’s “AI Agent Sprawl: The New Shadow IT Threatening Enterprises” draws the direct parallel to the historical shadow IT pattern and explains why the agent version scales faster.

Arthur.ai’s “Managing AI Agent Sprawl: Governance That Scales” contributes the platform-team framing used in the Way Out: sprawl is a platform problem first, a security problem second. Unframe’s 2026 piece on the costs of ungoverned agents provided the specific maintenance-cost multiplier and the registry-first recommendation.

Paperclipped’s “AI Agent Sprawl: 1.5 Million Rogue Agents and the Governance Gap” (2026) documents the scale of the phenomenon at enterprise level and the RSAC-reported figure that roughly 80% of organizations running agents had experienced at least one unintended action traceable to an agent outside their inventory. Security Boulevard’s March 2026 column on uncontrolled agent growth in SaaS environments supplied the operational view from an incident-response perspective.

Covasant’s 2026 piece on governance for agentic AI on cloud platforms connects agent sprawl to Architecture Fitness Function and the broader “treat the fleet as production” framing. The connection to Technical Debt follows Cunningham’s original metaphor: unmanaged shortcuts in the agent fleet accrue interest the same way shortcuts in code do.

Encyclopedia of Agentic Coding Patterns
