Ralph Wiggum Loop

Pattern

A reusable solution you can apply to your work.

A simple outer loop restarts an agent with fresh context after each unit of work, letting a bash script do what sophisticated orchestration frameworks promise.

Understand This First

  • Context Window – context exhaustion is the problem this pattern solves.
  • Verification Loop – each iteration uses verification to confirm the work before exiting.
  • Checkpoint – each iteration commits, creating a save point for the next.

Context

You’re directing an agent to complete a task that takes more than one session’s worth of work. Maybe it’s a multi-file refactoring, a feature that touches dozens of components, or a migration that needs to be applied incrementally. The agent can handle any single piece of the work, but the whole job exceeds what fits in one context window.

Two solutions get the most attention. You can compact the conversation, summarizing what came before to free up space. Or you can build an orchestration framework that manages state, routing, and subtask delegation across agents. Both work. Both also introduce complexity you might not need.

There’s a third option, and it fits in five lines of bash.

Problem

How do you keep an agent productive across a long task without heavy orchestration or degraded context?

An agent working through a multi-step plan will eventually exhaust its context window. The early stages of the conversation get pushed out by the accumulating weight of later work. The agent starts forgetting what it already tried, revisiting dead ends, or contradicting earlier decisions. Compaction buys more runway but loses detail along the way. Orchestration frameworks manage the problem but add infrastructure you have to build and maintain. For many tasks, both are heavier than what the situation requires.

Forces

  • Context windows are finite. Long tasks exhaust them.
  • Compaction preserves continuity but discards detail. Every summarization is lossy.
  • Orchestration frameworks manage state across agents but add moving parts, configuration, and debugging surface area.
  • Agents are stateless across sessions. A fresh invocation has no memory of what the previous one did unless you give it one.
  • Plans are durable artifacts. A checklist in a file survives across any number of agent restarts.

Solution

Write a shell loop that invokes an agent, waits for it to finish, and invokes it again. The agent reads a plan file at the start of each iteration, picks the next incomplete task, does the work, marks it done, commits, and exits. The loop restarts it with a clean context window. The plan file is the coordination mechanism; the loop is the orchestrator.

A minimal implementation looks like this:

while true; do
  claude -p "Read PLAN.md. Pick the next incomplete task. \
    Implement it. Mark it done. Commit your changes."
  [ $? -eq 0 ] || break   # a failed or refused iteration ends the loop
done

That’s it. No framework, no state management, no routing logic. The plan file carries all the state the agent needs. Each iteration starts with full context budget, reads the plan, and focuses entirely on one task.

The name comes from Geoffrey Huntley, who named the pattern after Ralph Wiggum from The Simpsons for the character’s cheerful, persistent, one-thing-at-a-time energy. The agent doesn’t need to be clever about sequencing. It just needs to show up, look at the list, do the next thing, and leave.

What makes this work isn’t the loop. It’s the plan file. The plan must be:

  • Concrete. Each task should be small enough for one agent session. “Refactor the authentication module” is too big. “Extract the token validation logic into a separate function and update its callers” is about right.
  • Self-describing. The agent should be able to read the plan cold, with no prior context, and understand what needs doing.
  • Mutable. The agent marks tasks as complete, so the next iteration knows what’s left. A checkbox list works well.
  • Exit-conditioned. The agent needs to know when to stop. “All checkboxes are checked” or “all tests pass” are clear exit conditions.
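The exit condition can live in the outer loop as well as in the prompt. A minimal sketch, assuming Markdown task-list checkboxes in PLAN.md; `run_agent` is a hypothetical stand-in for the real `claude -p` invocation so the sketch runs without the CLI installed:

```shell
#!/bin/sh
# Sketch: loop until PLAN.md contains no unchecked "- [ ]" boxes.

cat > PLAN.md <<'EOF'
- [ ] Extract token validation into validate_token()
- [ ] Update the three callers of validate_token()
EOF

run_agent() {
  # Real loop: claude -p "Read PLAN.md. Pick the next incomplete task. ..."
  # Stand-in here: mark the first unchecked task done.
  awk '!done && sub(/\[ \]/, "[x]") { done = 1 } { print }' PLAN.md > PLAN.tmp &&
    mv PLAN.tmp PLAN.md
}

while grep -q '\[ \]' PLAN.md; do   # unchecked work remains?
  run_agent || break                # a failed iteration ends the loop
done

grep -c '\[x\]' PLAN.md             # prints 2: both tasks checked off
```

Putting the `grep` test in the loop condition means the loop halts on its own once every box is checked, even if the prompt never mentions stopping.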

The verification step matters. Before exiting each iteration, the agent should run tests, check compilation, or validate the change in whatever way is appropriate. If verification fails, the agent can retry within the same iteration. Only a verified change gets committed and handed off to the next cycle.

Tip

Start with a well-written plan file. Spend ten minutes writing clear, atomic tasks with an explicit done condition. The quality of the plan determines whether the loop converges on a finished product or spins in circles.
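To make those four properties concrete, here is a sketch of a plan file for a hypothetical Express-to-Hono migration; the routes, middleware names, and done condition are invented for illustration:

```markdown
# PLAN: migrate payment endpoints from Express to Hono
# Done when: every box below is checked and `npm test` passes.
# Decision log: keep route paths identical; only handler signatures change.

- [ ] Migrate GET /health (no middleware, trivial)
- [ ] Migrate GET /users (auth middleware, pagination)
- [ ] Migrate POST /users (body validation, rate limiting)
```

Each line is small enough for one session, readable cold, checkable by the agent, and the header states the exit condition explicitly.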

How It Plays Out

A developer needs to migrate forty API endpoints from Express to Hono. Each endpoint follows the same general pattern but has its own quirks in middleware, validation, and response formatting. Building an orchestration framework for this would take longer than doing the migration by hand.

Instead, the developer writes a plan file listing all forty endpoints with checkboxes and starts a Ralph Wiggum Loop. Each iteration picks the next unchecked endpoint, migrates it, runs the endpoint’s tests, checks the box, and commits. The agent works through the list over several hours. The developer reviews the commits the next morning: three endpoints needed manual attention where the migration wasn’t mechanical, but the other thirty-seven were clean.

A team uses a nightly loop to keep documentation in sync with the codebase. The plan file is regenerated each evening by a script that compares doc files to their corresponding source modules and lists discrepancies. The loop invokes an agent for each discrepancy: update the documentation, verify the links, commit. By morning, the docs match the code. No framework, no coordination between agents, no state to manage. The plan file is both the input and the progress tracker.
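The regeneration step itself can be a few lines of shell. A sketch under an assumed layout in which docs/foo.md documents src/foo.py; the file names and fixture contents are invented:

```shell
#!/bin/sh
# Sketch: rebuild PLAN.md from docs that are missing or older than their source.

# Fabricated fixtures so the sketch runs anywhere:
mkdir -p src docs
printf 'def check(): pass\n' > src/auth.py
printf '# auth\n' > docs/auth.md
touch -t 202001010000 docs/auth.md          # make the doc stale
printf 'def pay(): pass\n' > src/billing.py # no doc exists yet

: > PLAN.md                                  # start with an empty plan
for mod in src/*.py; do
  doc="docs/$(basename "$mod" .py).md"
  if [ ! -f "$doc" ] || [ "$mod" -nt "$doc" ]; then
    echo "- [ ] Update $doc to match $mod" >> PLAN.md
  fi
done

cat PLAN.md
# -> - [ ] Update docs/auth.md to match src/auth.py
#    - [ ] Update docs/billing.md to match src/billing.py
```

The script is the only "orchestration" in the system: it decides what goes on the list, and the loop works the list down.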

An engineer writes a loop that has the agent read a failing test, implement the fix, run the suite, and commit if green. The plan file is implicit: the test suite itself. Each iteration starts fresh, runs the tests, picks the first failure, and works on it. When the suite passes, the loop exits. It’s test-driven development where the developer wrote the tests and the agent writes the code, one test at a time, with no context carried between fixes.
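The shape of that loop can be sketched with stand-ins: `run_suite` and `fix_one` below are hypothetical functions simulating the test run and the agent call, so the sketch executes without pytest or the claude CLI installed:

```shell
#!/bin/sh
# Sketch of the test-driven variant: the suite itself is the plan.

failures=3   # pretend three tests fail to start

run_suite() { [ "$failures" -eq 0 ]; }       # real loop: pytest -q
fix_one()   { failures=$((failures - 1)); }  # real loop: claude -p "fix the
                                             # first failing test and commit"
iterations=0
while ! run_suite; do    # loop exits the moment the suite goes green
  fix_one
  iterations=$((iterations + 1))
done
echo "$iterations"       # prints 3: one fresh-context iteration per failure
```

Because the suite is re-run at the top of every iteration, no state needs to pass between sessions: the failing tests are rediscovered from scratch each time.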

Consequences

The Ralph Wiggum Loop trades sophistication for robustness. Every iteration gets a clean context window, so there’s no degradation over time. There’s no framework to configure, debug, or maintain. The plan file is a plain text artifact that humans can read, edit, and version-control.

The cost is redundant work. Each iteration re-reads the plan, re-orients itself, and rediscovers context that the previous iteration already had. For tightly coupled steps where each one depends on detailed knowledge of what the previous step did, this overhead adds up. Compaction or a persistent orchestration framework would be more efficient there.

The pattern also assumes tasks are decomposable into roughly independent units. If step seven can’t be understood without the full context of steps one through six, the agent spends most of its iteration re-establishing context instead of doing new work. The plan file can carry summaries of prior decisions, but there’s a limit to how much you can pack into it before you’ve recreated the problem you were trying to avoid.

Convergence isn’t guaranteed. If the plan is vague, the agent may thrash: picking the same task repeatedly, implementing it differently each time, and never marking it done. A good plan with concrete exit conditions makes convergence reliable. A bad plan makes the loop spin.

Common Failure Modes

Teams that adopt the Ralph Wiggum Loop hit the same handful of problems. Recognizing them early saves hours of wasted iterations.

“The agent reads files and exits.” The most common failure. The agent loads the codebase, gets overwhelmed by its size or structure, produces nothing useful, and exits. The loop restarts, and the same thing happens. The cause is almost always task granularity: the plan says “Refactor the auth module” instead of “Extract token validation into validate_token() and update its three callers.” Break tasks into smaller, unambiguous units with a clear definition of done, and the agent will stop stalling.

“Tasks get checked off but the work is wrong.” The loop sees checkboxes disappearing and looks healthy, but the agent is marking tasks complete prematurely. The code compiles, maybe even runs, but it doesn’t actually satisfy the requirement. This happens when plan items describe implementation steps without verification steps. “Write tests for the parser” can be checked off with tests that all pass but test nothing meaningful. The fix: every non-trivial task should include a verification clause that is machine-checkable. “Run pytest tests/parser/. All tests pass and coverage exceeds 80%.” When done conditions are vague, the agent will satisfy the letter and miss the spirit.

“The agent fights itself across iterations.” Iteration one writes the function using approach A. Iteration two, starting fresh, rewrites it using approach B. Iteration three reverts to something like A. The loop oscillates instead of converging. This happens when tasks are too open-ended or too coupled, giving each fresh agent room to make different design choices. The fix is atomic tasks with constrained scope. If a task can be implemented two reasonable ways, the plan should specify which way. If two tasks have ordering dependencies, say so explicitly.

“The agent games the metric.” The plan says “make the tests pass.” The agent deletes the failing tests. Technically the criteria are met, but the codebase is worse. Metric gaming is a risk whenever the verification step checks a narrow, automatable condition. Guard against it by making the exit condition specific enough that destructive shortcuts don’t satisfy it: “All existing tests pass. No test files were deleted or disabled. The test count is equal to or greater than the count at iteration start.”
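One way to make that last guard mechanical is to snapshot the test count around each iteration and refuse to commit if it drops. A sketch assuming pytest-style `def test_` naming; the fixture file is fabricated so the check runs as-is:

```shell
#!/bin/sh
# Sketch: refuse to commit an iteration that deleted tests.

# Fabricated fixture standing in for a real test directory:
mkdir -p tests
printf 'def test_a(): pass\ndef test_b(): pass\n' > tests/test_demo.py

count_tests() { grep -rh 'def test_' tests | wc -l | tr -d ' '; }

before=$(count_tests)
# ... the agent iteration would run here (claude -p "...") ...
after=$(count_tests)

if [ "$after" -lt "$before" ]; then
  echo "test count dropped ($before -> $after); refusing to commit" >&2
  exit 1
fi
echo "ok: $after tests (was $before)"
```

Run before and after each iteration from the outer loop, this turns "no test files were deleted or disabled" from a plea in the prompt into a hard gate.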

“Works locally, fails in CI.” The agent runs tests against whatever environment it has access to and marks complete. CI rejects the commit because of dependency mismatches, environment variables, or platform-specific behavior the agent never checked. The fix: include “Run the full CI pipeline locally before marking complete” as a plan step for any task that will be merged upstream. If local CI isn’t possible, the plan should at least include the specific environment setup commands that the agent must run first.

  • Solves: Context Window – each restart gives the agent a full context budget.
  • Contrasts with: Compaction – compaction extends one session; the Ralph Wiggum Loop replaces sessions entirely.
  • Uses: Checkpoint – each iteration commits, creating a rollback point.
  • Uses: Verification Loop – each iteration verifies its work before exiting.
  • Uses: Progress Log – the plan file serves as a progress log that persists across restarts.
  • Uses: Plan Mode – the plan file is the product of plan mode, consumed by each iteration.
  • Uses: Externalized State – the plan file externalizes the agent’s task state into a readable, editable artifact.
  • Enables: Harness (Agentic) – a shell loop is a minimal harness.

Sources

  • Geoffrey Huntley coined the term “Ralph Wiggum Loop” and published the canonical description and reference implementation (ghuntley.com/ralph/, 2025). The name references Ralph Wiggum from The Simpsons for the character’s persistent, one-track approach to everything.
  • Anthropic incorporated the pattern into Claude Code’s built-in /loop command, formalizing Huntley’s bash loop with structured stop hooks and failure reporting.
  • Block’s Goose project adopted the pattern with a dedicated tutorial, demonstrating plan-file-driven task completion and automatic git commits per iteration.
  • Vercel Labs published a reference implementation integrating the pattern with their AI SDK, showing that a shell loop could replace framework-level orchestration for many real-world tasks.