Agent Gateway

Pattern

A named solution to a recurring problem.

A purpose-built reverse proxy that brokers every tool call between agents and tools, so authentication, authorization, audit, and runtime policy live in one place instead of being re-implemented in every agent.

Understand This First

Least Privilege — what credentials should grant; the gateway decides what’s allowed right now.
Trust Boundary — the gateway is the explicit boundary between the agent network and the tool network.
MCP — the dominant tool protocol the gateway brokers.
Agent Registry — the inventory the gateway uses to identify each agent.

Context

This is a tactical pattern for any team running more than one agent in production. Each agent calls several tools: internal APIs, MCP servers, third-party services, model APIs. Each tool needs credentials. Each call should be logged. Each tool has rate limits, retry semantics, schema versions. In a single-agent prototype, you can hand-wire all of that. In a fleet of fifteen agents calling twenty-five tools, you can’t.

This is the same shape that drove the rise of API gateways for web services in the 2010s. The novelty isn’t the gateway — it’s that the traffic running through it is generated by a probabilistic reasoner that can be talked into making calls its developer never anticipated.

Problem

Without a central broker, every agent ends up with its own credential bundle, its own retry logic, its own observability hooks, and its own ad-hoc enforcement of whatever policy the security team last sent around. The math gets bad fast: N agents times M tools means N times M integrations to write, N times M secrets to rotate, and zero places where central policy can be applied uniformly.

The harder problem is enforcement. A credential says what an agent could call. It can’t say what the agent should call right now, in this context, with this payload. When a prompt-injected agent uses its valid payments credential to send money to an attacker-controlled account, the failure is not at the credential layer. It’s that nothing was sitting on the action path to ask the question “is this call sensible right now?” before it left the building.

Forces

Generous credentials are convenient at development time and dangerous at runtime.
Per-agent integration code is easy to start and brutal to maintain at fleet scale.
A central broker becomes critical infrastructure: outages take down every agent’s tool access at once.
Every gateway hop adds latency on a path that’s already slow.
The temptation to “just put one more check in the gateway” turns the gateway into an undocumented application server.
Policy that lives in code in the gateway needs the same testing, versioning, and rollout discipline as any other production system.

Solution

Put a gateway between every agent and every tool. The gateway is the only endpoint each agent connects to. It holds the upstream tool credentials, brokers each call, and centralizes five concerns that don’t belong scattered across agent code.

Authentication. The agent identifies itself to the gateway with mTLS, a signed JWT, or a registered key tied to its Agent Registry entry. The gateway never trusts an unauthenticated caller.

Authorization. Each tool call is checked against policy: is this agent identity, acting on behalf of this user identity, with this request shape, allowed to call this tool right now? Policy lives in the gateway in an engine like OPA or Cedar, where it can change without redeploying any agent.
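A minimal sketch of the default-deny check described above, written in Python only for illustration. In a real deployment the rule set would more likely live in a policy engine such as OPA or Cedar. The `ToolCall` type, the `POLICY` table, and the agent and tool names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolCall:
    agent_id: str  # identity the agent authenticated with
    user_id: str   # user the agent is acting on behalf of
    tool: str      # tool being invoked
    action: str    # operation requested on that tool


# Hypothetical policy table: (agent, tool) -> set of permitted actions.
POLICY = {
    ("billing-agent", "payments"): {"read_invoice", "create_payment"},
    ("support-agent", "crm"): {"read_ticket"},
}


def is_allowed(call: ToolCall, policy=POLICY) -> bool:
    """Default-deny: permit a call only when an explicit rule matches."""
    if not call.user_id:
        # No user identity attached to the call, so nothing to authorize against.
        return False
    return call.action in policy.get((call.agent_id, call.tool), set())
```

Because the table is data rather than agent code, changing it would not require redeploying any agent, which is the property the paragraph above relies on.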

Audit. Every call produces a structured log entry: agent identity, user identity, tool, request, response, latency, outcome. This is the surface the security team queries when something goes wrong, and the surface the compliance team points at when an auditor asks “show me everywhere this customer’s data was touched.”
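One way the structured entry could be shaped, as a sketch rather than a prescribed schema. The field names follow the list above; the `AuditRecord` type, the JSON-lines encoding, and the `customer_id` request field are assumptions made for the example.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class AuditRecord:
    timestamp: float
    agent_id: str
    user_id: str
    tool: str
    request: dict
    response: dict
    latency_ms: float
    outcome: str  # e.g. "allowed", "denied", "error"


def to_log_line(record: AuditRecord) -> str:
    """Serialize one record as a single JSON line with stable key order."""
    return json.dumps(asdict(record), sort_keys=True)


def calls_touching(lines, customer_id):
    """Return parsed records whose request names the given customer.

    Assumes requests carry a hypothetical "customer_id" field.
    """
    records = (json.loads(line) for line in lines)
    return [r for r in records if r["request"].get("customer_id") == customer_id]
```

With every call in one format, the auditor's question reduces to a filter like `calls_touching(log_lines, "c-42")` over a single log surface.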

Runtime policy enforcement. Beyond static authorization, the gateway can inspect content and deny on anomaly. A database query that exceeds a row-count ceiling. A payment to a counterparty the agent has never paid before. A tool call that pattern-matches a known prompt-injection exfiltration shape. This is the layer that catches what credentials alone cannot.
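Two of the content checks named above, sketched under stated assumptions: calls arrive as dictionaries with `tool` and `args` keys, database calls carry a `limit` argument, and payment calls carry a `counterparty` argument. None of these shapes come from a real gateway product.

```python
def evaluate(call, seen_counterparties, max_rows=1000):
    """Return a list of denial reasons; an empty list means no check fired.

    `max_rows` is an assumed ceiling chosen for the example.
    """
    reasons = []
    args = call["args"]

    if call["tool"] == "database":
        limit = args.get("limit")
        # A query with no explicit limit is treated as exceeding the ceiling.
        if limit is None or limit > max_rows:
            reasons.append("row-count ceiling exceeded")

    if call["tool"] == "payments":
        # Deny payments to any counterparty not previously seen.
        if args.get("counterparty") not in seen_counterparties:
            reasons.append("unseen counterparty")

    return reasons
```

Both checks look at the payload rather than the caller's identity, so they can deny a call that static authorization has already permitted.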

Operational concerns. Per-agent and per-tool rate limits. Retry and circuit-breaker behavior. Schema validation against upstream tool versions. Cost accounting when the upstream is a metered API.
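Per-agent and per-tool rate limits are often implemented with a token bucket. The sketch below assumes one bucket per (agent, tool) pair and an injectable clock so the behavior can be tested; the class names and parameters are illustrative, not taken from any particular gateway.

```python
import time


class TokenBucket:
    def __init__(self, rate_per_sec, capacity, now=time.monotonic):
        self.rate = rate_per_sec
        self.capacity = capacity
        self._now = now
        self.tokens = capacity  # start with a full bucket
        self.updated = now()

    def allow(self) -> bool:
        """Refill based on elapsed time, then spend one token if available."""
        current = self._now()
        elapsed = current - self.updated
        self.updated = current
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class RateLimiter:
    """Keeps one independent bucket per (agent, tool) pair."""

    def __init__(self, rate_per_sec, capacity, now=time.monotonic):
        self._args = (rate_per_sec, capacity, now)
        self._buckets = {}

    def allow(self, agent_id, tool) -> bool:
        key = (agent_id, tool)
        if key not in self._buckets:
            self._buckets[key] = TokenBucket(*self._args)
        return self._buckets[key].allow()
```

Keying on the pair means one noisy agent exhausts only its own budget for a tool, not the budget of other agents calling the same tool.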

The gateway typically supports more than one protocol: MCP for tools, A2A for agent-to-agent calls, plus direct LLM-API brokerage so cost and rate-limit policy applies to model calls too. The agent doesn’t know which upstream it’s hitting; it knows the gateway’s endpoint, and the gateway knows the rest.

The N-by-M-to-1-by-N collapse is why this pattern exists. With N agents and M tools, integrations scale as N times M without a gateway. With a gateway, integrations are 1 times M (gateway-to-tool) plus N times 1 (agent-to-gateway), or N plus M in total. At fifteen agents and twenty-five tools, that’s the difference between 375 integration points and 40.
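The arithmetic for the fifteen-agent, twenty-five-tool case can be checked with a short sketch. The function name is hypothetical.

```python
def integration_points(agents: int, tools: int):
    """Return (without_gateway, with_gateway) integration counts."""
    without_gateway = agents * tools  # every agent wired to every tool
    with_gateway = agents + tools     # each agent and each tool wired once to the gateway
    return without_gateway, with_gateway


print(integration_points(15, 25))  # (375, 40)
```

The first count grows multiplicatively and the second additively, so the gap widens as the fleet grows: at thirty agents and thirty tools it is 900 against 60.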

How It Plays Out

A platform team supports five product teams that started out running fifteen agents against twenty-five internal MCP servers, each agent with its own credentials embedded in a config file. By the time the fleet hit thirty agents and thirty tools, the secrets-rotation calendar took a full week, three different agents had silently stopped working because nobody updated their credentials, and the security team had no answer to “which agents currently have access to the customer-data export tool.” The platform team installs Kong’s Agent Gateway as the single endpoint. Each agent now holds one credential: its identity to the gateway. Each tool is registered once. Rotation happens once. New agents onboard against one endpoint, not thirty.

A finance-domain agent has credentials to call the payments tool, because its job requires it. A prompt-injection attack in a vendor invoice convinces the agent to issue a payment to an attacker-controlled counterparty. The credential alone would have allowed the call: the agent has payments authority, and the destination is a valid account. The gateway’s policy checks every payments call against a “previously seen counterparty” allowlist. The call is denied, the security team is paged, and the human operator confirms within minutes that no legitimate payment to this account was scheduled. The credential was never wrong. The runtime policy was the right question.

Six weeks after a deploy, the legal team asks for evidence that no agent has called the customer-data export tool with a non-allowlisted user identity. Without a gateway, the answer would have been “we’d need to grep five different log formats across three different observability systems and hope nobody silently swallowed an error.” With one structured log surface, the audit closes in an afternoon: one query, one CSV, one signed attestation that the boundary held.

Tip

Treat the gateway as the place where cross-cutting concerns live and only those concerns. Authentication, authorization, audit, rate-limit, schema validation: yes. Anything specific to one tool’s business logic: no. The moment “we’ll just add a small check in the gateway” becomes a habit, the gateway has become a hidden application server and you’ve recreated the problem the gateway was supposed to solve.

Where It Breaks

Single point of failure. The gateway is now critical infrastructure. An outage takes down every agent’s tool access. Mitigate with a highly available deployment, health checks, and a read-only degraded mode for non-mutating tool calls.
Latency tax. Every tool call takes the gateway hop. Co-locate the gateway with agents and tools where you can; cache authorization decisions for repeat calls within the same session.
Schema drift. Upstream tools change; the gateway’s schema definitions don’t update themselves. Pin schema versions per agent, stage upgrades, and treat the gateway as the place where Deprecation windows are enforced for tool versions.
Business-logic creep. The gateway is a tempting place to “just add this one check” specific to a particular tool or agent. Resist. The hard rule: the gateway only enforces cross-cutting concerns. Anything tool-specific stays in the tool. Anything agent-specific stays in the agent.
Policy-engine complexity. Once policy lives in OPA or Cedar, it needs its own CI, its own testing, its own staged rollout. Treat policy as code with the same discipline you’d treat a database migration.
Defense-replaced thinking. “The credentials don’t really matter, the gateway will catch it.” This is exactly backwards. The gateway is defense in depth on top of Least Privilege, not a replacement for it.
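The latency mitigation listed above, caching authorization decisions for repeat calls within a session, might look like the following sketch. The `DecisionCache` class, its time-to-live parameter, and the `evaluate` callback signature are assumptions made for the example.

```python
import time


class DecisionCache:
    """Caches authorization decisions per session for a short time-to-live."""

    def __init__(self, ttl_seconds, now=time.monotonic):
        self.ttl = ttl_seconds
        self._now = now
        self._entries = {}  # key -> (decision, expiry time)

    def get_or_evaluate(self, session_id, agent_id, tool, action, evaluate):
        key = (session_id, agent_id, tool, action)
        hit = self._entries.get(key)
        if hit is not None and hit[1] > self._now():
            return hit[0]  # still fresh: skip the policy engine
        decision = evaluate(agent_id, tool, action)
        self._entries[key] = (decision, self._now() + self.ttl)
        return decision
```

The cache key omits the request payload, so a cache like this would only suit decisions that do not depend on content. Runtime checks that inspect the payload would still need to run on every call.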

Consequences

The wins are concrete. Secrets sprawl collapses to a single rotation surface. Audit becomes one structured log instead of five formats across three systems. Runtime policy gives a real defense-in-depth layer above credentials, with the ability to deny calls that look wrong even when the credential would have allowed them. Central security teams can enforce org-wide policy without per-agent integration. New agents onboard fast because they only need to know one endpoint.

The costs are real and ongoing. The gateway is a piece of infrastructure to deploy and operate. Latency adds up on hot paths. Schema drift between the gateway and upstream tools is recurring maintenance work. Policy-as-code introduces engineering discipline that didn’t exist when each agent enforced its own ad-hoc rules.

There’s also a category of failure worth naming up front: the gateway as a hidden application server. Every successful gateway deployment has to defend against the steady pressure to put more and more business logic in the central broker until it’s the most fragile and least-documented part of the system. The discipline that keeps a gateway useful is the discipline that keeps it small.

Sources

The agent gateway pattern emerged across multiple infrastructure vendors and security-focused practitioners during 2025–2026 as agent fleets started hitting the N-by-M integration wall in production. The architecture borrows directly from the API gateway pattern that became standard in the microservices era. What’s new is the source of the traffic: probabilistic reasoners that can be talked into actions their developers never anticipated.
The runtime-enforcement layer descends from object-capability security and the least-privilege tradition. Mark S. Miller’s Robust Composition: Towards a Unified Approach to Access Control and Concurrency Control (Johns Hopkins PhD thesis, 2006) developed the case that authority should be granted at the moment of action, not as a static property of an identity. The agent gateway operationalizes that argument in 2026 production architecture: credentials describe potential, gateway policy describes permission at the call site.
The N-by-M-to-1-by-N framing comes from the API gateway literature, where it was the original case for centralizing cross-cutting concerns out of individual services. Chris Richardson’s Microservices Patterns (Manning, 2018) is the canonical written treatment in the web-API era; the agent gateway pattern adapts the same accounting to fleets of agents.
OWASP’s Top 10 for Large Language Model Applications names excessive agency as one of the canonical failure modes of agent deployments. The runtime-policy responsibility of the agent gateway is the operational answer to that failure: a checkpoint on the action path that can deny calls a credential would otherwise permit.
The Cisco Agent Gateway Protocol (AGP), referenced in the A2A article, is one of several protocol-layer specifications a gateway might implement. The protocol and the pattern are distinct: AGP defines a wire format for secure agent-to-agent traffic; the gateway pattern names the runtime control plane that brokers it.
