Human in the Loop

Pattern

A reusable solution you can apply to your work.

Human in the loop keeps a person inside the control structure of an agentic workflow, positioned at the moments where human judgment has the highest leverage.

Understand This First

  • Agent – agents create the need for this pattern.

Context

At the agentic level, human in the loop means that a person remains part of the control structure in an agentic workflow. The agent acts, but the human reviews, approves, corrects, and directs. This isn’t a limitation to be engineered away. It’s a design choice that reflects the current state of AI capability and the nature of software as a product that affects real people.

Approval Policy, Verification Loop, and Plan Mode each create specific points where human judgment enters the workflow. Human in the loop is the broader principle that unifies them.

Problem

How do you get the productivity benefits of AI agents while maintaining the judgment, accountability, and contextual understanding that only humans currently provide?

Agents are fast, tireless, and broadly knowledgeable. They’re also confidently wrong, blind to business context, and unable to take responsibility for their decisions. A fully autonomous agent can produce impressive work and impressive damage in the same session. A fully supervised agent loses most of its productivity advantage. The challenge is calibrating human involvement to each task and each stage of the workflow.

Forces

  • Agent speed is wasted if every action requires human approval.
  • Agent errors, especially subtle ones, require human detection because the agent doesn’t know what it doesn’t know.
  • Business context (priorities, politics, user sentiment, regulatory requirements) is often not in the context window.
  • Accountability for shipped software rests with humans, not agents.
  • Skill development: humans who delegate everything stop learning, which erodes their ability to direct agents effectively.

Solution

Keep humans in the loop at high-leverage points: the moments where human judgment has the greatest impact per minute spent.

Task definition. The human decides what to build. Product judgment requires business context, user empathy, and strategic awareness that agents don’t have.

Plan review. When the agent proposes a plan in plan mode, the human reviews it for architectural fit, business alignment, and risks the agent may not see.

Code review. The human reviews the agent’s changes before they merge. This isn’t rubber-stamping. It means reading the code critically, checking for AI smells, and verifying that the changes match the intent.

Approval gates. Approval policies define which actions require human confirmation: destructive operations, deployments, changes to critical systems.
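As an illustrative sketch (not any particular tool's API), an approval gate can be as simple as a predicate over proposed actions: destructive commands and changes to protected paths pause for a human, everything else proceeds. The names (`Action`, `requires_approval`) and the specific prefixes are assumptions for the example.

```python
# Hypothetical sketch: classify agent actions by risk and gate the
# dangerous ones behind explicit human confirmation.
from dataclasses import dataclass

# Commands that should never run without a human saying yes.
DESTRUCTIVE_PREFIXES = ("rm -rf", "drop table", "git push --force")
# Paths whose changes count as "critical systems" in this sketch.
PROTECTED_PATHS = ("deploy/", "infra/", ".github/workflows/")

@dataclass
class Action:
    command: str
    touched_paths: tuple[str, ...] = ()

def requires_approval(action: Action) -> bool:
    """Return True if this action should pause for human confirmation."""
    cmd = action.command.lower()
    if any(cmd.startswith(p) for p in DESTRUCTIVE_PREFIXES):
        return True
    # Changes to critical systems also need a human.
    return any(
        path.startswith(prefix)
        for path in action.touched_paths
        for prefix in PROTECTED_PATHS
    )
```

The point of the sketch is that the policy is explicit and auditable: the human is invoked by rule, not by the agent's own judgment of what is risky.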

Course correction. When the agent goes down the wrong path, the human intervenes early rather than letting the agent waste time on an unproductive approach.

The human role shifts from writing code to directing, reviewing, and deciding. This isn’t less work; it’s different work. It demands deeper understanding of the system, stronger judgment about tradeoffs, and better communication skills, because you’re now communicating through prompts and reviews rather than keystrokes.

Note

“Human in the loop” doesn’t mean “human approves every action.” It means the human is present at the points where their judgment matters most. The goal is optimal oversight, not maximum oversight: enough to catch important errors without becoming a bottleneck.

How It Plays Out

A developer uses an agent to implement a new feature. She defines the task, reviews the agent’s plan, and approves it with one modification. The agent implements the feature across three files, running tests at each step. The developer reviews the final diff, catches a naming inconsistency the agent didn’t notice, requests the fix, and approves the merge. The total human time was fifteen minutes. The total agent time was five minutes. The feature is correct, consistent, and reviewed.

A team experiments with fully autonomous agents for routine dependency updates. The agents update versions, run tests, and create pull requests without human involvement. This works well for ninety percent of updates. The other ten percent break in subtle ways that the tests don’t catch (an API behavior change, a performance regression). The team adds a human review step for dependency updates that change more than the version number.
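The team's rule can be sketched as a check over the update's diff: auto-merge only when every changed line is a version bump, and route anything else to a human. The diff handling and the version-line pattern here are illustrative assumptions, not a real tool's format.

```python
# Hypothetical sketch: flag dependency updates that change more than
# the version number for human review.
import re

# A "just a version bump" line looks like: name == 1.2.3 (quotes optional).
VERSION_LINE = re.compile(
    r'^[+-]\s*[\w.\-]+\s*(==|=|:)\s*"?v?\d+(\.\d+)*"?\s*,?\s*$'
)

def needs_human_review(diff_lines: list[str]) -> bool:
    """True if the update touches anything beyond version strings."""
    for line in diff_lines:
        if not line.startswith(("+", "-")):
            continue  # unchanged context lines are fine
        if line.startswith(("+++", "---")):
            continue  # file headers
        if not VERSION_LINE.match(line):
            return True  # something other than a version changed
    return False
```

A pure version bump passes through; an update that also edits code, config, or lockfile structure gets a reviewer.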

Example Prompt

“Implement this feature across the three files described in the spec. After each file, pause and show me the diff so I can review before you continue to the next.”
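The workflow this prompt asks for can be sketched as a checkpointed loop: the agent edits one file, surfaces the diff, and blocks until the human approves, requests a fix, or stops the run. `apply_edit`, `show_diff`, and `ask_human` are stand-ins for whatever the agent harness provides, not a real API.

```python
# Hypothetical sketch of a pause-and-review loop: one file per step,
# with an explicit human checkpoint between steps.
def implement_with_checkpoints(files, apply_edit, show_diff, ask_human):
    """Apply edits file by file, pausing for human review after each."""
    approved = []
    for path in files:
        apply_edit(path)
        show_diff(path)  # surface the change for review
        verdict = ask_human(f"Approve changes to {path}? [y/n/fix] ")
        if verdict == "fix":
            apply_edit(path)  # one correction pass, then re-check
            show_diff(path)
            verdict = ask_human(f"Approve changes to {path} now? [y/n] ")
        if verdict != "y":
            return approved  # stop early: the human course-corrects
        approved.append(path)
    return approved
```

The structural choice is that approval sits between steps, so a wrong turn in file one never compounds into files two and three.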

Consequences

Human in the loop maintains quality and accountability while capturing the productivity gains of agents. It keeps humans engaged with the codebase, preserving the knowledge needed to direct agents effectively.

The cost is human time and attention. Every review point is a potential bottleneck when the human is busy or unavailable. And there’s a subtler risk: humans who review without engaging deeply become rubber-stampers, providing the appearance of oversight without the substance. The antidote is maintaining personal coding practice alongside agentic workflows. Stay sharp enough that your reviews are genuine.

  • Depends on: Agent — agents create the need for human oversight.
  • Uses: Approval Policy — policies define the specific approval points.
  • Uses: Plan Mode — plan review is a key human-in-the-loop moment.
  • Uses: Verification Loop — some verification steps require human judgment.
  • Enables: Smell (AI Smell) — AI smell detection is a human-in-the-loop skill.
  • Degraded by: Shadow Agent — no human is in the loop if nobody knows the agent is running.
  • Degraded by: Approval Fatigue — fatigue turns meaningful oversight into rubber-stamping.
  • Related: Agent Trap — some traps specifically target the human oversight checkpoint.
  • Governed by: Bounded Autonomy — bounded autonomy describes how the system decides when to invoke human participation.
  • Triggered by: Steering Loop — the steering loop’s escalation path is what brings a human in at the right moment.

Sources

  • Norbert Wiener’s Cybernetics: Or Control and Communication in the Animal and the Machine (1948) established the foundational idea that human operators are feedback elements in control systems, not bystanders watching from outside. The entire framing of humans participating in a loop of sensing, deciding, and acting traces back to Wiener’s work.
  • Lisanne Bainbridge’s “Ironies of Automation” (1983) identified the paradox this article raises in Consequences: the more you automate, the more demanding the human role becomes, because skills atrophy from disuse at exactly the moment they matter most. Her analysis of industrial process control applies directly to agentic coding, where developers who delegate everything lose the judgment needed to review what agents produce.
  • Ben Shneiderman’s Human-Centered AI (2022) reframed the question from “how do we make AI autonomous?” to “how do we keep humans in control?” His emphasis on comprehensible, predictable, and controllable designs over anthropomorphic autonomy informs the article’s stance that human involvement is a design choice, not a limitation to be engineered away.