Agent Governance and Feedback
This section is actively being expanded. Entries on drift sensors, architecture fitness functions, supervisory engineering, and other governance patterns are on the way.
This section covers the patterns that govern how agents are controlled, evaluated, and steered toward correct outcomes. Where Agentic Software Construction describes the building blocks of agent-driven workflows, this section describes the control systems that keep those workflows on track.
The core challenge is that AI agents produce plausible output, not provably correct output. They need guardrails before they act, checks after they act, and a closed loop connecting the two. They also need human oversight calibrated to the risk of each action: tight for irreversible operations, loose for safe and reversible ones.
The patterns here form a natural progression. Feedforward controls shape what the agent does before it writes a single line. Feedback Sensors report what happened after it acted. The Steering Loop connects both into a system that converges on correct output. Harnessability describes the codebase properties that make all of this work well. And the governance patterns (Approval Policy, Human in the Loop, Eval) define when humans intervene and how you measure whether the whole system is improving.
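The act-sense-decide-adjust cycle can be sketched in a few lines. This is a minimal, hypothetical illustration, not an implementation from the catalog: the `feedforward`, `agent_act`, and sensor callables are stand-ins the caller would supply, and the attempt budget is an arbitrary choice.

```python
# Hypothetical Steering Loop sketch: feedforward context shapes the first
# attempt; feedback sensors report what failed; the loop adjusts and
# retries until the sensors pass or the attempt budget runs out.
from dataclasses import dataclass, field


@dataclass
class SensorReport:
    passed: bool
    findings: list[str] = field(default_factory=list)


def steering_loop(task, agent_act, sensors, feedforward, max_attempts=3):
    """agent_act, sensors, and feedforward are caller-supplied callables."""
    context = feedforward(task)                   # feedforward: shape work up front
    for _ in range(max_attempts):
        output = agent_act(task, context)         # act
        reports = [sensor(output) for sensor in sensors]  # sense
        failures = [f for r in reports if not r.passed for f in r.findings]
        if not failures:                          # decide: sensors all pass
            return output
        context = context + ["Fix: " + f for f in failures]  # adjust
    raise RuntimeError(f"did not converge after {max_attempts} attempts")
```

The design choice worth noting is that sensors return *findings*, not just pass/fail: the adjust step feeds those findings back into the agent's context, which is what makes the loop convergent rather than a blind retry.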
Human Oversight
When and how humans stay in the loop as agents gain autonomy.
- Approval Policy — When an agent may act autonomously vs. when a human must approve.
- Human in the Loop — A person remains part of the control structure.
- Eval — A repeatable suite to measure agentic workflow performance.
- Bounded Autonomy — Graduated tiers of agent freedom calibrated to the consequence and reversibility of each action.
- Dark Factory — The maximum-autonomy operating model where agents write, test, and ship code while humans work only at the specification and governance layer.
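An Approval Policy built on Bounded Autonomy can be expressed as a routing function over action properties. The sketch below is illustrative: the tier names, the `Action` fields, and the two-property rule (reversibility and production impact) are assumptions for the example, not definitions from the patterns themselves.

```python
# Hypothetical Approval Policy sketch implementing Bounded Autonomy:
# each action is routed to an oversight tier based on how reversible it
# is and whether it touches production. Tier names are illustrative.
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    AUTO = "auto"        # agent proceeds; action is logged
    REVIEW = "review"    # agent proceeds; a human reviews after the fact
    APPROVE = "approve"  # a human must approve before the agent acts


@dataclass
class Action:
    name: str
    reversible: bool
    production: bool


def approval_tier(action: Action) -> Tier:
    if action.production and not action.reversible:
        return Tier.APPROVE  # irreversible in production: tight oversight
    if action.production or not action.reversible:
        return Tier.REVIEW   # one risk factor: loose but audited
    return Tier.AUTO         # safe and reversible: full autonomy
```

Keeping the policy as pure data-in, tier-out logic also makes it easy to evaluate: an Eval suite can assert the tier for a fixed catalog of actions and catch policy drift.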
Control Loops
The feedback and feedforward mechanisms that keep agents converging on correct output.
- Feedforward — Controls placed before the agent acts to steer it toward correct output on the first attempt.
- Feedback Sensor — Checks that run after the agent acts, telling it what went wrong so it can correct course.
- Steering Loop — The closed cycle of act, sense, decide, and adjust that turns feedforward and feedback into a convergent control system.
- Shift-Left Feedback — Move quality checks as close to the point of creation as possible, so agents catch mistakes while they can still fix them cheaply.
- Feedback Flywheel — A cross-session retrospective loop that harvests corrections from AI-assisted work and feeds validated rules back into instruction files.
- AgentOps — The operational discipline of monitoring, costing, and governing agents running in production.
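What distinguishes a Feedback Sensor from a plain pass/fail gate is that it tells the agent *what* went wrong. The sketch below, a toy sensor over Python source, is illustrative only: the specific checks (parseability, leftover TODO markers) are stand-ins for whatever linters, tests, or type checkers a real pipeline would run.

```python
# Hypothetical Feedback Sensor sketch: runs after the agent acts and
# returns actionable findings the agent can use to correct course.
# The two checks here are illustrative stand-ins for real tooling.
import ast


def syntax_sensor(source: str) -> list[str]:
    """Return findings; an empty list means the check passed."""
    findings = []
    try:
        ast.parse(source)
    except SyntaxError as e:
        findings.append(f"syntax error at line {e.lineno}: {e.msg}")
    for i, line in enumerate(source.splitlines(), start=1):
        if "TODO" in line:
            findings.append(f"line {i}: unresolved TODO left in code")
    return findings
```

Because the sensor runs directly on the source as it is produced, it is also an example of Shift-Left Feedback: the mistake is reported at the point of creation, while fixing it is still cheap.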
Codebase Health
Patterns that keep the codebase tractable for agents over time.
- Harnessability — The degree to which a codebase’s structural properties make it tractable for AI agents.
- Garbage Collection — Recurring agent-driven sweeps that find where a codebase has drifted from its standards and fix the drift before it compounds.
- Architecture Fitness Function — An automated check that verifies the system still honors a specific architectural decision.
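An Architecture Fitness Function is just an automated check over the codebase, so it can live in the ordinary test suite. The sketch below guards a common example of an architectural decision, "the domain layer never imports the infrastructure layer"; the layer names and the helper are hypothetical, chosen for illustration.

```python
# Hypothetical Architecture Fitness Function sketch: verify that one
# module honors the decision "domain must not import infrastructure".
# Layer names are illustrative; a real check would walk the source tree.
import ast


def forbidden_imports(source: str, owner: str, forbidden_prefix: str) -> list[str]:
    """Scan one module's source for imports that violate the layering rule."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        names = []
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module:
            names = [node.module]
        for name in names:
            if name.startswith(forbidden_prefix):
                violations.append(f"{owner} imports {name} (line {node.lineno})")
    return violations
```

Run as part of CI, a check like this doubles as a Feedback Sensor for agents: a violation report names the offending import and line, so the drift is fixed in the same session that introduced it rather than left for a later Garbage Collection sweep.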
Antipatterns
What goes wrong when governance fails to keep pace with agent adoption.
- Approval Fatigue — When approval requests arrive faster than a human can read them, oversight collapses into rubber-stamping.
- Shadow Agent — An AI agent operating inside your organization without anyone in governance knowing it exists.
- Delegation Chain — The path authority follows from a human through one or more agents, where each link can amplify, misdirect, or quietly exceed the original intent.
- Agent Sprawl — The population-scale antipattern of shadow agents, where autonomous workers proliferate faster than governance can track them.