Approval Policy
Understand This First
- Harness (Agentic) – the harness enforces approval policies.
- Agent – approval policies govern agent behavior.
Context
At the agentic level, an approval policy defines when an agent may act autonomously and when it must pause for human confirmation. It’s the primary governance mechanism in agentic workflows: the contract between the human’s trust and the agent’s autonomy.
Approval policies exist because agents are powerful enough to cause real damage. An agent with shell access can delete files, an agent with Git access can push to production, and an agent with API access can modify live systems. The question isn’t whether agents should have these capabilities (they often must) but under what conditions they may use them without asking.
Problem
How do you give an agent enough autonomy to be productive while retaining enough control to prevent costly mistakes?
Too little autonomy and the agent is crippled. It pauses for approval on every file read, every shell command, every minor edit, turning a productive workflow into an exhausting approval queue. Too much autonomy and the agent is dangerous. It makes destructive changes, pushes broken code, or modifies systems it shouldn’t touch, all without the human knowing until the damage is done.
Forces
- Productivity increases with agent autonomy. Fewer interruptions mean faster work.
- Risk increases with agent autonomy. Unsupervised actions can cause damage.
- Context matters: reading a file is low-risk; deleting a database table is high-risk.
- Trust builds over time. As you gain confidence in an agent’s judgment, the range of actions you’re willing to leave unsupervised widens.
Solution
Define approval policies that match the risk level of each action. A typical policy has three tiers:
Autonomous (no approval needed). Low-risk, easily reversible actions: reading files, running tests, searching the codebase, reading documentation. These should never require approval because the interruption cost exceeds the risk.
Notify and proceed. Medium-risk actions where the human wants visibility but doesn’t need to approve each one: writing files, creating branches, running build commands. The agent proceeds but the human can review at their convenience.
Require approval. High-risk actions that need explicit human confirmation before execution: deleting files, running destructive shell commands, pushing to remote repositories, modifying production systems, installing packages. The agent pauses and waits.
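The three tiers reduce to a lookup with a fail-safe default. A minimal sketch, assuming a hypothetical action vocabulary — real harnesses expose their own configuration surface and action names:

```python
from enum import Enum

class Tier(Enum):
    AUTONOMOUS = "autonomous"   # proceed silently
    NOTIFY = "notify"           # proceed, but surface for later review
    APPROVE = "approve"         # pause and wait for the human

# Hypothetical action-to-tier mapping mirroring the three tiers above.
POLICY = {
    "read_file": Tier.AUTONOMOUS,
    "run_tests": Tier.AUTONOMOUS,
    "search_codebase": Tier.AUTONOMOUS,
    "write_file": Tier.NOTIFY,
    "create_branch": Tier.NOTIFY,
    "delete_file": Tier.APPROVE,
    "git_push": Tier.APPROVE,
    "install_package": Tier.APPROVE,
}

def tier_for(action: str) -> Tier:
    # Fail-safe default: anything unclassified requires approval.
    return POLICY.get(action, Tier.APPROVE)
```

The one design decision worth noting is the default: an action the policy has never seen falls into the approval tier, not the autonomous one.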
Most harnesses let you configure these tiers. Some use deny-lists (these specific commands require approval) while others use allow-lists (only these commands are autonomous). The right choice depends on your risk tolerance and the maturity of your workflow.
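The practical difference between the two list styles is what happens when nothing matches. A sketch, with deliberately simplified prefix matching and made-up command lists:

```python
# Deny-list: autonomous by default, exceptions require approval.
DENY_LIST = {"rm", "git push", "dropdb"}

# Allow-list: approval by default, exceptions are autonomous.
ALLOW_LIST = {"ls", "cat", "pytest", "git status"}

def needs_approval_denylist(command: str) -> bool:
    # Permissive posture: lower friction, but unknown commands run freely.
    return any(command.startswith(d) for d in DENY_LIST)

def needs_approval_allowlist(command: str) -> bool:
    # Conservative posture: unknown commands always pause for the human.
    return not any(command.startswith(a) for a in ALLOW_LIST)
```

An immature workflow favors the allow-list (its failure mode is an unnecessary approval prompt); a mature one can shift to the deny-list (its failure mode is an unsupervised command).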
Approval policies should evolve. Start conservative: require approval for anything you’re uncertain about. As you build confidence in the agent’s behavior and your harness’s safeguards, gradually expand the autonomous tier.
Never set a blanket “approve everything” policy when starting with a new agent, harness, or codebase. One early mistake (a deleted file, a force push, a corrupted database) can cost more than all the time saved by skipping approvals. Earn trust incrementally.
How It Plays Out
A developer configures their harness with a conservative policy: file reads and test runs are autonomous, file writes require notification, and shell commands require approval. After a week of work, they notice they’re approving every npm install and git status command. They add those to the autonomous tier because the risk is negligible. Over time, the policy converges to the right balance for their workflow.
A team running parallel agents in worktree isolation uses a policy where agents can read, write, and test autonomously within their worktrees, but can’t push branches or create pull requests without approval. The agents work at full speed within their sandboxes, and the human reviews the results before anything reaches the shared repository.
A sample instruction that sets such a policy might read: “Set your approval policy so that file reads, test runs, and lint checks are autonomous. File writes should notify me but proceed. Shell commands that modify system state — package installs, git push, database migrations — require my explicit approval.”
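On the enforcement side, such an instruction becomes a gate in the harness’s tool-execution path. A minimal sketch (all names hypothetical; the `ask` callable stands in for however the harness prompts the human):

```python
def gated_execute(action, run, tier, ask):
    """Execute `action` according to its tier.

    `run` performs the action; `ask` asks the human a yes/no
    question and returns True to approve, False to deny.
    """
    if tier == "approve" and not ask(f"Allow {action}?"):
        return None          # denied: the action never executes
    if tier == "notify":
        print(f"[notice] executing {action}")
    return run()
```

The important property is that a denied action is never executed at all; the gate sits before the tool call, not after it.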
Consequences
Well-calibrated approval policies make agentic workflows both productive and safe. The agent operates at full speed on low-risk actions and pauses only when the stakes justify the interruption. The human stays in control without being buried in approval requests.
The cost is the ongoing effort of calibrating the policy. Too tight and you create friction; too loose and you create risk. A policy that fits one project, team, or task may be wrong for the next. Calibration is never truly finished: tools evolve, team confidence grows, and new categories of risk appear.
Related Patterns
- Depends on: Harness (Agentic) – the harness enforces approval policies.
- Depends on: Agent – approval policies govern agent behavior.
- Enables: Human in the Loop – approval policies define the specific points where the human intervenes.
- Uses: Tool – policies are typically defined per tool or per action type.
- Enables: Worktree Isolation – isolation reduces the risk surface, allowing more liberal policies within the worktree.
- Related: Shadow Agent – without registration, agents bypass approval entirely.
- Related: Approval Fatigue – poorly calibrated policies cause approval fatigue.
- Refined by: Bounded Autonomy – bounded autonomy graduates binary approve/deny gates into a spectrum calibrated to consequence severity.
- Related: Agent Sprawl – sprawl overwhelms approval policies by multiplying the number of agents under governance.
Sources
Jerome Saltzer and Michael Schroeder’s “The Protection of Information in Computer Systems” (1975) established the principles of least privilege and fail-safe defaults that underpin the “deny unless explicitly authorized” posture this pattern recommends. Their argument — that access decisions should be based on permission rather than exclusion — is the reason a conservative starting policy is the default recommendation here, fifty years later.
The three-tier allow/ask/deny model described in the Solution section is the one implemented by Anthropic’s Claude Code and documented in its Configure permissions guide. Claude Code’s evaluation order (deny first, then ask, then allow) and its settings hierarchy (managed, project, user) are the concrete reference implementation behind the abstract tiers in this article.
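As a concrete illustration, Claude Code’s project settings file expresses the three lists roughly as below; the rule strings here are examples only, and the exact matching syntax is defined in its permissions documentation:

```json
{
  "permissions": {
    "allow": ["Read", "Bash(npm run test:*)"],
    "ask": ["Bash(git push:*)"],
    "deny": ["Read(./.env)", "Bash(rm -rf:*)"]
  }
}
```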
K. J. Kevin Feng, David W. McDonald, and Amy X. Zhang’s “Levels of Autonomy for AI Agents” (Knight First Amendment Institute, 2025; arXiv 2506.12469) frames an agent’s autonomy as a deliberate design decision rather than an emergent property of capability. Their five-level taxonomy — operator, collaborator, consultant, approver, observer — offers a finer-grained view than this article’s three tiers and is the right next step for readers who want to calibrate approval policy at more points along the autonomy spectrum.