Human in the Loop
Understand This First
- Agent — this pattern exists because agents exist.
Context
At the agentic level, human in the loop means that a person remains part of the control structure in an agentic workflow. The agent acts, but the human reviews, approves, corrects, and directs. This isn’t a limitation to be engineered away. It’s a design choice that reflects the current state of AI capability and the nature of software as a product that affects real people.
This pattern sits at the intersection of approval policy, verification loop, and plan mode. Each of those patterns creates specific points where human judgment enters the workflow. Human in the loop is the broader principle that unifies them.
Problem
How do you get the productivity benefits of AI agents while maintaining the judgment, accountability, and contextual understanding that only humans currently provide?
Agents are fast, tireless, and broadly knowledgeable. They’re also confidently wrong, unaware of business context, and unable to take responsibility for their decisions. A fully autonomous agent can produce impressive work and also impressive damage. A fully supervised agent loses most of its productivity advantage. The challenge is finding the right level of human involvement for each task and each stage of the workflow.
Forces
- Agent speed is wasted if every action requires human approval.
- Agent errors, especially subtle ones, require human detection because the agent doesn’t know what it doesn’t know.
- Business context (priorities, politics, user sentiment, regulatory requirements) is often not in the context window.
- Accountability for shipped software rests with humans, not agents.
- Skill development: humans who delegate everything stop learning, which erodes their ability to direct agents effectively.
Solution
Keep humans in the loop at high-leverage points: the moments where human judgment has the greatest impact per minute spent. These typically include:
Task definition. The human decides what to build. This is a product judgment skill that agents can’t yet perform reliably.
Plan review. When the agent proposes a plan in plan mode, the human reviews it for architectural fit, business alignment, and risks the agent may not see.
Code review. The human reviews the agent’s changes before they’re merged. This isn’t rubber-stamping. It requires reading the code critically, checking for AI smells, and verifying that the changes match the intent.
Approval gates. Approval policies define specific actions that require human confirmation: destructive operations, deployments, and changes to critical systems.
Course correction. When the agent goes down the wrong path, the human intervenes early rather than letting the agent waste time on an unproductive approach.
The human role shifts from writing code to directing, reviewing, and deciding. This isn’t less work; it’s different work. It requires deeper understanding of the system, stronger judgment about tradeoffs, and better communication skills (because you’re now communicating through prompts and reviews rather than keystrokes).
“Human in the loop” doesn’t mean “human approves every action.” It means the human is present at the points where their judgment matters most. The goal isn’t maximum oversight but optimal oversight: enough to catch important errors without creating a bottleneck.
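The distinction between maximum and optimal oversight can be sketched as a gate that pauses only for actions a policy flags as risky, letting everything else run unattended. The action names and policy set here are illustrative, not from any real agent framework.

```python
# Illustrative approval gate: the agent executes freely, but actions
# matching the policy pause for human confirmation. The policy set and
# action names are hypothetical examples.
RISKY_ACTIONS = {"deploy", "drop_table", "force_push"}

def execute(action, run, approve):
    """Run an agent action; risky actions must pass the approve() callback.

    run     -- zero-argument callable that performs the action
    approve -- callable taking the action name, returning True/False
               (in practice, a prompt to the human)
    """
    if action in RISKY_ACTIONS:
        if not approve(action):
            return "skipped"  # human declined; agent moves on
    return run()
```

Injecting the approval step as a callback keeps the gate testable and makes the policy, not the agent, decide where the human enters the loop.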
How It Plays Out
A developer uses an agent to implement a new feature. She defines the task, reviews the agent’s plan, and approves it with one modification. The agent implements the feature across three files, running tests at each step. The developer reviews the final diff, catches a naming inconsistency the agent didn’t notice, requests the fix, and approves the merge. The total human time was fifteen minutes. The total agent time was five minutes. The feature is correct, consistent, and reviewed.
A team experiments with fully autonomous agents for routine dependency updates. The agents update versions, run tests, and create pull requests without human involvement. This works well for ninety percent of updates. The other ten percent break in subtle ways that the tests don’t catch (an API behavior change, a performance regression). The team adds a human review step for dependency updates that change more than the version number.
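The team's review trigger, "changes more than the version number", can be sketched as a heuristic over the diff of a dependency-update pull request. The regex and diff format here are assumptions for illustration, not any real tool's behavior.

```python
import re

# Heuristic: a changed line like "-requests==2.31.0" or "+pyyaml==6.0.2"
# is a pure version bump; any other added/removed line triggers review.
# The "name==version" format is an assumption (pip-style requirements).
VERSION_BUMP = re.compile(r"^[+-][\w.\[\],-]+==\d[\w.]*$")

def needs_human_review(diff_lines):
    """Return True if a dependency-update diff touches more than versions."""
    for line in diff_lines:
        if line.startswith(("+++", "---")) or not line.startswith(("+", "-")):
            continue  # file headers and unchanged context don't count
        if not VERSION_BUMP.match(line):
            return True
    return False
```

A bare version bump sails through autonomously; a diff that also adds an import or edits code routes to a human, which matches the team's ninety/ten split.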
A developer can also build review points directly into the prompt:
“Implement this feature across the three files described in the spec. After each file, pause and show me the diff so I can review before you continue to the next.”
Consequences
Human in the loop maintains quality and accountability while capturing agent productivity. It keeps humans engaged with the codebase (understanding what the agent is doing and why), which preserves the knowledge needed to direct agents effectively.
The cost is human time and attention. Every review point is a potential bottleneck if the human is busy or unavailable. There’s also a skill-atrophy risk in the opposite direction: humans who review without engaging deeply become rubber-stampers, providing the appearance of oversight without the substance. The antidote is maintaining personal coding practice alongside agentic workflows, staying sharp enough that your reviews are genuine.
Related Patterns
- Depends on: Agent — this pattern exists because agents exist.
- Uses: Approval Policy — policies define the specific approval points.
- Uses: Plan Mode — plan review is a key human-in-the-loop moment.
- Uses: Verification Loop — some verification steps require human judgment.
- Enables: Smell (AI Smell) — AI smell detection is a human-in-the-loop skill.
- Degraded by: Shadow Agent — no human is in the loop if nobody knows the agent is running.
- Degraded by: Approval Fatigue — fatigue turns meaningful oversight into rubber-stamping.