Human in the Loop

Pattern

A reusable solution you can apply to your work.

Human in the loop keeps a person inside the control structure of an agentic workflow, positioned at the moments where human judgment has the highest leverage.

Understand This First

  • Agent – agents create the need for this pattern.

Context

At the agentic level, human in the loop means that a person remains part of the control structure in an agentic workflow. The agent acts, but the human reviews, approves, corrects, and directs. This isn’t a limitation to be engineered away. It’s a design choice that reflects the current state of AI capability and the nature of software as a product that affects real people.

Approval Policy, Verification Loop, and Plan Mode each create specific points where human judgment enters the workflow. Human in the loop is the broader principle that unifies them.

Problem

How do you get the productivity benefits of AI agents while maintaining the judgment, accountability, and contextual understanding that only humans currently provide?

Agents are fast, tireless, and broadly knowledgeable. They’re also confidently wrong, blind to business context, and unable to take responsibility for their decisions. A fully autonomous agent can produce impressive work and impressive damage in the same session. A fully supervised agent loses most of its productivity advantage. The challenge is calibrating human involvement to each task and each stage of the workflow.

Forces

  • Agent speed is wasted if every action requires human approval.
  • Agent errors, especially subtle ones, require human detection because the agent doesn’t know what it doesn’t know.
  • Business context (priorities, politics, user sentiment, regulatory requirements) is often not in the context window.
  • Accountability for shipped software rests with humans, not agents.
  • Skill development: humans who delegate everything stop learning, which erodes their ability to direct agents effectively.

Solution

Keep humans in the loop at high-leverage points: the moments where human judgment has the greatest impact per minute spent.

Task definition. The human decides what to build. Product judgment requires business context, user empathy, and strategic awareness that agents don’t have.

Plan review. When the agent proposes a plan in plan mode, the human reviews it for architectural fit, business alignment, and risks the agent may not see.

Code review. The human reviews the agent’s changes before they merge. This isn’t rubber-stamping. It means reading the code critically, checking for AI smells, and verifying that the changes match the intent.

Approval gates. Approval policies define which actions require human confirmation: destructive operations, deployments, changes to critical systems.
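As an illustrative sketch (not any particular tool's API), an approval gate can be as simple as a predicate over proposed actions: destructive commands and changes to protected paths pause for a human, everything else proceeds. The names (`Action`, `requires_approval`) and the specific prefixes are assumptions for the example.

```python
# Hypothetical sketch: classify agent actions by risk and gate the
# dangerous ones behind explicit human confirmation.
from dataclasses import dataclass

# Commands that should never run without a human saying yes.
DESTRUCTIVE_PREFIXES = ("rm -rf", "drop table", "git push --force")
# Paths whose changes count as "critical systems" in this sketch.
PROTECTED_PATHS = ("deploy/", "infra/", ".github/workflows/")

@dataclass
class Action:
    command: str
    touched_paths: tuple[str, ...] = ()

def requires_approval(action: Action) -> bool:
    """Return True if this action should pause for human confirmation."""
    cmd = action.command.lower()
    if any(cmd.startswith(p) for p in DESTRUCTIVE_PREFIXES):
        return True
    # Changes to critical systems also need a human.
    return any(
        path.startswith(prefix)
        for path in action.touched_paths
        for prefix in PROTECTED_PATHS
    )
```

The point of the sketch is that the policy is explicit and auditable: the human is invoked by rule, not by the agent's own judgment of what is risky.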

Course correction. When the agent goes down the wrong path, the human intervenes early rather than letting the agent waste time on an unproductive approach.

The human role shifts from writing code to directing, reviewing, and deciding. This isn’t less work; it’s different work. It demands deeper understanding of the system, stronger judgment about tradeoffs, and better communication skills, because you’re now communicating through prompts and reviews rather than keystrokes.

Note

“Human in the loop” doesn’t mean “human approves every action.” It means the human is present at the points where their judgment matters most. The goal is optimal oversight, not maximum oversight: enough to catch important errors without becoming a bottleneck.

How It Plays Out

A developer uses an agent to implement a new feature. She defines the task, reviews the agent’s plan, and approves it with one modification. The agent implements the feature across three files, running tests at each step. The developer reviews the final diff, catches a naming inconsistency the agent didn’t notice, requests the fix, and approves the merge. The total human time was fifteen minutes. The total agent time was five minutes. The feature is correct, consistent, and reviewed.

A team experiments with fully autonomous agents for routine dependency updates. The agents update versions, run tests, and create pull requests without human involvement. This works well for ninety percent of updates. The other ten percent break in subtle ways that the tests don’t catch (an API behavior change, a performance regression). The team adds a human review step for dependency updates that change more than the version number.
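The team's rule can be sketched as a check over the update's diff: auto-merge only when every changed line is a version bump, and route anything else to a human. The diff handling and the version-line pattern here are illustrative assumptions, not a real tool's format.

```python
# Hypothetical sketch: flag dependency updates that change more than
# the version number for human review.
import re

# A "just a version bump" line looks like: name == 1.2.3 (quotes optional).
VERSION_LINE = re.compile(
    r'^[+-]\s*[\w.\-]+\s*(==|=|:)\s*"?v?\d+(\.\d+)*"?\s*,?\s*$'
)

def needs_human_review(diff_lines: list[str]) -> bool:
    """True if the update touches anything beyond version strings."""
    for line in diff_lines:
        if not line.startswith(("+", "-")):
            continue  # unchanged context lines are fine
        if line.startswith(("+++", "---")):
            continue  # file headers
        if not VERSION_LINE.match(line):
            return True  # something other than a version changed
    return False
```

A pure version bump passes through; an update that also edits code, config, or lockfile structure gets a reviewer.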

Example Prompt

“Implement this feature across the three files described in the spec. After each file, pause and show me the diff so I can review before you continue to the next.”
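The workflow this prompt asks for can be sketched as a checkpointed loop: the agent edits one file, surfaces the diff, and blocks until the human approves, requests a fix, or stops the run. `apply_edit`, `show_diff`, and `ask_human` are stand-ins for whatever the agent harness provides, not a real API.

```python
# Hypothetical sketch of a pause-and-review loop: one file per step,
# with an explicit human checkpoint between steps.
def implement_with_checkpoints(files, apply_edit, show_diff, ask_human):
    """Apply edits file by file, pausing for human review after each."""
    approved = []
    for path in files:
        apply_edit(path)
        show_diff(path)  # surface the change for review
        verdict = ask_human(f"Approve changes to {path}? [y/n/fix] ")
        if verdict == "fix":
            apply_edit(path)  # one correction pass, then re-check
            show_diff(path)
            verdict = ask_human(f"Approve changes to {path} now? [y/n] ")
        if verdict != "y":
            return approved  # stop early: the human course-corrects
        approved.append(path)
    return approved
```

The structural choice is that approval sits between steps, so a wrong turn in file one never compounds into files two and three.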

Consequences

Human in the loop maintains quality and accountability while capturing the productivity gains of agents. It keeps humans engaged with the codebase, preserving the knowledge needed to direct agents effectively.

The cost is human time and attention. Every review point is a potential bottleneck when the human is busy or unavailable. And there’s a subtler risk: humans who review without engaging deeply become rubber-stampers, providing the appearance of oversight without the substance. The antidote is maintaining personal coding practice alongside agentic workflows. Stay sharp enough that your reviews are genuine.

  • Depends on: Agent — agents create the need for human oversight.
  • Uses: Approval Policy — policies define the specific approval points.
  • Uses: Plan Mode — plan review is a key human-in-the-loop moment.
  • Uses: Verification Loop — some verification steps require human judgment.
  • Enables: Smell (AI Smell) — AI smell detection is a human-in-the-loop skill.
  • Degraded by: Shadow Agent — no human is in the loop if nobody knows the agent is running.
  • Degraded by: Approval Fatigue — fatigue turns meaningful oversight into rubber-stamping.
  • Related: Agent Trap — some traps specifically target the human oversight checkpoint.
  • Governed by: Bounded Autonomy — bounded autonomy describes how the system decides when to invoke human participation.
  • Triggered by: Steering Loop — the steering loop’s escalation path is what brings a human in at the right moment.

Sources

  • Norbert Wiener’s Cybernetics: Or Control and Communication in the Animal and the Machine (1948) established the foundational idea that human operators are feedback elements in control systems, not bystanders watching from outside. The entire framing of humans participating in a loop of sensing, deciding, and acting traces back to Wiener’s work.
  • Lisanne Bainbridge’s “Ironies of Automation” (1983) identified the paradox this article raises in Consequences: the more you automate, the more demanding the human role becomes, because skills atrophy from disuse at exactly the moment they matter most. Her analysis of industrial process control applies directly to agentic coding, where developers who delegate everything lose the judgment needed to review what agents produce.
  • Ben Shneiderman’s Human-Centered AI (2022) reframed the question from “how do we make AI autonomous?” to “how do we keep humans in control?” His emphasis on comprehensible, predictable, and controllable designs over anthropomorphic autonomy informs the article’s stance that human involvement is a design choice, not a limitation to be engineered away.