---
slug: approval-policy
type: pattern
summary: "Defines when an agent may act on its own and when it must pause for human confirmation: the contract between the human's trust and the agent's autonomy."
created: 2026-04-04
updated: 2026-05-04
related:
  agent:
    relation: depends-on
    note: "Approval policies govern agent behavior."
  agent-registry:
    relation: enabled-by
    note: "Approval gates need a known set of agents to gate; the registry supplies that set and ties each entry to an owner who can be paged."
  agent-sprawl:
    relation: related
    note: "Sprawl overwhelms approval policies by multiplying the number of agents under governance."
  agentic-payments:
    relation: refined-by
    note: "Payment thresholds are a natural and well-bounded place to insert human review."
  agentops:
    relation: enforced-by
    note: "Policy violations are a first-class AgentOps signal; the operations stream watches whether agents stay inside the policy envelope."
  approval-fatigue:
    relation: related
    note: "Poorly calibrated policies cause approval fatigue."
  bounded-agency:
    relation: refines
    note: "Approval policy is the mechanical enforcement of the approval set in an agency envelope."
  bounded-autonomy:
    relation: refined-by
    note: "Bounded autonomy graduates binary approve/deny gates into a spectrum calibrated to consequence severity."
  dark-factory:
    relation: contrasts-with
    note: "Approval policies gate code-level actions; Dark Factory contrasts by gating only specifications and removing the per-action human in the loop."
  delegation-chain:
    relation: used-by
    note: "Delegation chains thread approval policies through subagents so authority flows down without being assumed from the top."
  harness-agentic:
    relation: depends-on
    note: "The harness enforces approval policies."
  human-in-the-loop:
    relation: enables
    note: "Approval policies define the specific points where the human intervenes."
  ownership:
    relation: informs
    note: "Approval gates enforce ownership by requiring the owner's review before changes merge."
  permission-classifier:
    relation: refined-by
    note: "A runtime classifier decides which permitted actions need a human in the loop right now."
  runtime-governance:
    relation: enforced-by
    note: "Approval policy defines which actions are allowed in principle; runtime governance is the layer that enforces those rules at request time."
  shadow-agent:
    relation: related
    note: "Without registration, agents bypass approval entirely."
  tool:
    relation: uses
    note: "Policies are typically defined per tool or per action type."
  worktree-isolation:
    relation: enables
    note: "Isolation reduces the risk surface, allowing more liberal policies within the worktree."
---
# Approval Policy

> **Pattern**
>
> A named solution to a recurring problem.

## Understand This First

- [Harness (Agentic)](harness-agentic.md) -- the harness enforces approval policies.
- [Agent](agent.md) -- approval policies govern agent behavior.

## Context

At the **agentic** level, an approval policy defines when an [agent](agent.md) may act autonomously and when it must pause for human confirmation. It's the primary governance mechanism in agentic workflows: the contract between the human's trust and the agent's autonomy.

Approval policies exist because agents are powerful enough to cause real damage. An agent with shell access can delete files, an agent with Git access can push to production, and an agent with API access can modify live systems. The question isn't whether agents should have these capabilities (they often must) but under what conditions they may use them without asking.

## Problem

How do you give an agent enough autonomy to be productive while retaining enough control to prevent costly mistakes?

Too little autonomy and the agent is crippled. It pauses for approval on every file read, every shell command, every minor edit, turning a productive workflow into an exhausting approval queue. Too much autonomy and the agent is dangerous. It makes destructive changes, pushes broken code, or modifies systems it shouldn't touch, all without the human knowing until the damage is done.

## Forces

- **Productivity** increases with agent autonomy. Fewer interruptions mean faster work.
- **Risk** increases with agent autonomy. Unsupervised actions can cause damage.
- **Context matters.** Reading a file is low-risk; deleting a database table is high-risk.
- **Trust builds over time.** As you gain confidence in an agent's judgment, the range of actions you're willing to leave unsupervised widens.

## Solution

Define approval policies that match the risk level of each action. A typical policy has three tiers:

**Autonomous (no approval needed).** Low-risk, easily reversible actions: reading files, running tests, searching the codebase, reading documentation. These should never require approval because the interruption cost exceeds the risk.

**Notify and proceed.** Medium-risk actions where the human wants visibility but doesn't need to approve each one: writing files, creating branches, running build commands. The agent proceeds, but the human can review at their convenience.

**Require approval.** High-risk actions that need explicit human confirmation before execution: deleting files, running destructive shell commands, pushing to remote repositories, modifying production systems, installing packages. The agent pauses and waits.
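The three tiers can be sketched as a lookup from action type to tier. This is a minimal illustration, not any harness's actual API: the `Tier` enum, the action names, and the `tier_for` and `may_proceed` helpers are all hypothetical, and sending unlisted actions to the strictest tier is an assumed conservative default.

```python
from enum import Enum


class Tier(Enum):
    AUTONOMOUS = "autonomous"              # proceed silently
    NOTIFY = "notify-and-proceed"          # proceed, but surface for later review
    REQUIRE_APPROVAL = "require-approval"  # pause until a human confirms


# Hypothetical action names, grouped the way the three tiers are described.
POLICY = {
    "read_file": Tier.AUTONOMOUS,
    "run_tests": Tier.AUTONOMOUS,
    "search_codebase": Tier.AUTONOMOUS,
    "write_file": Tier.NOTIFY,
    "create_branch": Tier.NOTIFY,
    "run_build": Tier.NOTIFY,
    "delete_file": Tier.REQUIRE_APPROVAL,
    "push_remote": Tier.REQUIRE_APPROVAL,
    "install_package": Tier.REQUIRE_APPROVAL,
}


def tier_for(action: str) -> Tier:
    """Return the tier for an action; unlisted actions get the strictest tier."""
    return POLICY.get(action, Tier.REQUIRE_APPROVAL)


def may_proceed(action: str, human_approved: bool = False) -> bool:
    """Decide whether the agent may execute the action right now."""
    if tier_for(action) is Tier.REQUIRE_APPROVAL:
        return human_approved
    return True  # AUTONOMOUS and NOTIFY both proceed without waiting
```

In this sketch the only difference between the first two tiers is what happens after the action runs: a real harness would also record or display notify-tier actions, which is omitted here.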

Most [harnesses](harness-agentic.md) let you configure these tiers. Some use deny-lists (these specific commands require approval) while others use allow-lists (only these commands are autonomous). The right choice depends on your risk tolerance and the maturity of your workflow.
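The practical difference between the two styles shows up with commands nobody listed. The sketch below is an assumption-laden illustration: the list contents, the `needs_approval` function, and the `mode` strings are hypothetical, and matching on the program name alone is a deliberate simplification (it cannot tell `git status` from `git push`).

```python
import shlex

# Hypothetical lists; real harnesses define their own rule formats.
ALLOW_LIST = {"ls", "cat", "pytest"}  # allow-list mode: only these run unprompted
DENY_LIST = {"rm", "sudo", "curl"}    # deny-list mode: only these need approval


def needs_approval(command: str, mode: str) -> bool:
    """Return True if the command must wait for human approval."""
    parts = shlex.split(command)
    if not parts:
        raise ValueError("empty command")
    program = parts[0]
    if mode == "allow-list":
        # Anything not explicitly allowed is gated (fails closed).
        return program not in ALLOW_LIST
    if mode == "deny-list":
        # Anything not explicitly denied runs freely (fails open).
        return program in DENY_LIST
    raise ValueError(f"unknown mode: {mode}")
```

An unlisted command such as `dd` is gated under the allow-list but runs unprompted under the deny-list. That asymmetry is why an allow-list is the more conservative choice for a new or untrusted setup.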

Approval policies should evolve. Start conservative: require approval for anything you're uncertain about. As you build confidence in the agent's behavior and your harness's safeguards, gradually expand the autonomous tier.
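One way to ground that expansion in evidence is to review the approval history and look for actions that are always approved. The following is a rough sketch under stated assumptions: the `(action, approved)` log format, the `promotion_candidates` function, and the threshold of 20 approvals are all hypothetical, and the output is a list for a human to consider, not an automatic policy change.

```python
from collections import Counter


def promotion_candidates(history, min_approvals=20):
    """Suggest actions that might move to the autonomous tier.

    history: iterable of (action, approved) tuples, where approved is a bool.
    Returns a sorted list of actions approved at least min_approvals times
    and never denied.
    """
    approved = Counter()
    denied = set()
    for action, was_approved in history:
        if was_approved:
            approved[action] += 1
        else:
            denied.add(action)
    return sorted(
        action
        for action, count in approved.items()
        if count >= min_approvals and action not in denied
    )
```

A single denial disqualifies an action in this sketch, on the assumption that one rejected request is a sign the action still deserves a human look.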

> **⚠️ Warning**
>
> Never set a blanket "approve everything" policy when starting with a new agent, harness, or codebase. One early mistake (a deleted file, a force push, a corrupted database) can cost more than all the time saved by skipping approvals. Earn trust incrementally.

## How It Plays Out

A developer configures their harness with a conservative policy: file reads and test runs are autonomous, file writes notify and proceed, and shell commands require approval. After a week of work, they notice they're approving every `git status` and `git diff` command. They add those read-only commands to the autonomous tier because the risk is negligible. Over time, the policy converges on the right balance for their workflow.

A team running parallel agents in [worktree isolation](worktree-isolation.md) uses a policy where agents can read, write, and test autonomously within their worktrees, but can't push branches or create pull requests without approval. The agents work at full speed within their sandboxes, and the human reviews the results before anything reaches the shared repository.
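A worktree-scoped policy like the one in this scenario can be sketched as a path check plus a short list of gated actions. The action names and the `decide` function are hypothetical, and the sketch assumes Python 3.9 or later for `Path.is_relative_to`.

```python
from pathlib import Path

# Hypothetical action names for this illustration.
LOCAL_ACTIONS = {"read", "write", "test"}
SHARED_ACTIONS = {"push_branch", "create_pull_request"}


def decide(action: str, worktree: str, target: str = "") -> str:
    """Return "autonomous" or "require-approval" for an agent's request."""
    if action in SHARED_ACTIONS:
        # Anything that reaches the shared repository waits for a human.
        return "require-approval"
    if action in LOCAL_ACTIONS:
        root = Path(worktree).resolve()
        # Resolve the target so "../" segments cannot escape the worktree.
        path = (root / target).resolve()
        if path.is_relative_to(root):
            return "autonomous"
    # Unknown actions and paths outside the worktree fall back to approval.
    return "require-approval"
```

For example, `decide("write", "/tmp/wt-a", "src/app.py")` is autonomous, while `decide("write", "/tmp/wt-a", "../other/file")` requires approval because the resolved path leaves the worktree.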

> **💡 Example Prompt**
>
> "Set your approval policy so that file reads, test runs, and lint checks are autonomous. File writes should notify me but proceed. Shell commands that modify system state — package installs, git push, database migrations — require my explicit approval."

## Consequences

Well-calibrated approval policies make agentic workflows both productive and safe. The agent operates at full speed on low-risk actions and pauses only when the stakes justify the interruption. The human stays in control without being buried in approval requests.

The cost is the ongoing effort of calibrating the policy. Too tight and you create friction; too loose and you create risk. A policy that fits one project, team, or task may be wrong for the next. Calibration is never truly finished: tools evolve, team confidence grows, and new categories of risk appear.

## Sources

Jerome Saltzer and Michael Schroeder's [*The Protection of Information in Computer Systems*](https://web.mit.edu/saltzer/www/publications/protection/) (Proceedings of the IEEE, 1975) established the principles of *least privilege* and *fail-safe defaults* that underpin the "deny unless explicitly authorized" posture this pattern recommends. Their argument — that access decisions should be based on permission rather than exclusion — is the reason a conservative starting policy is the default recommendation here, fifty years later.

The tiered model described in the Solution section closely follows the allow/ask/deny permission rules implemented by Anthropic's Claude Code and documented in its [*Configure permissions*](https://platform.claude.com/docs/en/agent-sdk/permissions) guide. Claude Code's evaluation order (deny first, then ask, then allow) and its settings hierarchy (managed, project, user) are the concrete reference implementation behind the abstract tiers in this article.

K. J. Kevin Feng, David W. McDonald, and Amy X. Zhang's [*Levels of Autonomy for AI Agents*](https://arxiv.org/abs/2506.12469) (Knight First Amendment Institute, 2025) frames an agent's autonomy as a deliberate design decision rather than an emergent property of capability. Their five-level taxonomy — operator, collaborator, consultant, approver, observer — offers a finer-grained view than this article's three tiers and is the right next step for readers who want to calibrate approval policy at more points along the autonomy spectrum.

---

- [Next: Permission Classifier](permission-classifier.md)
- [Previous: Agent Governance and Feedback](agent-governance-and-feedback.md)