Question Generation
Make the agent interview you before it writes a single line of code.
Understand This First
- Prompt – question generation is a specific kind of prompt pattern.
- Plan Mode – questions come before the plan in the same read-first discipline.
Context
At the agentic level, question generation is the practice of instructing the agent to act as a requirements analyst first and a coder second. Before any plan or code appears, the agent asks you a structured list of clarifying questions, grouped by category, and waits for answers.
This sits at the very front of an agent session, earlier than plan mode. Plan mode is the pause between understanding and action. Question generation is the pause between request and understanding.
Problem
How do you stop an agent from building the wrong thing at full speed?
The default behavior of a coding agent is to generate immediately. You type a vague sentence, the agent fills the gaps with plausible guesses, and three minutes later you have a working feature that isn’t the one you meant. The agent didn’t know it should ask. It assumed, because its training data rewards confidence, and because every unstated assumption has a likely default.
The most expensive bugs in agentic work don’t come from building the thing wrong. They come from building the wrong thing, well. A test suite that passes for a feature nobody needed is worse than one that fails for the right feature, because you cannot tell at a glance that anything is wrong.
Forces
- Agents are biased toward action, and acting feels productive even when it’s premature.
- Every unstated requirement becomes a default assumption, and defaults are usually wrong in interesting ways.
- Questioning costs tokens and attention that could have been spent generating.
- Users get fatigued when asked too many questions, especially questions pitched at the wrong level.
- A single revision cycle is expensive: the more code already exists, the more painful each correction becomes.
Solution
Before the agent generates any plan or code, instruct it to interview you. The interview has three properties that separate it from a generic “ask me questions” prompt:
Questions come in named categories. Scope and goals. Users and use cases. Technical constraints. Edge cases and failure modes. Security and data. The categories force the agent to cover ground it would otherwise skip, and they let you see at a glance whether it has understood the shape of the request.
Only one category at a time. The agent asks about scope, waits for answers, then moves to the next category. This keeps each round short enough to answer without fatigue and lets earlier answers inform later questions.
Every question has a recommended default. The agent offers its best guess alongside each question, so you can confirm most defaults with a single word and spend your attention on the ones that matter. “Should invalid emails return an error or a warning? (Default: error.)” is a question you can answer in one second; “What should the behavior be for invalid emails?” is a question that stalls you.
At the end of the interview, the agent summarizes the answers as a short spec (the thing it now thinks you asked for) and gives you one more chance to correct it before anything is planned or built. Only then does it proceed to plan mode and execution.
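The interview's structure, categories asked one at a time, each question carrying a recommended default, a spec assembled from the answers, can be sketched as a small driver. This is an illustration of the data flow, not any particular agent's API; `Question`, `Category`, and `run_interview` are hypothetical names:

```python
from dataclasses import dataclass

@dataclass
class Question:
    text: str
    default: str  # the recommended answer offered alongside the question

@dataclass
class Category:
    name: str
    questions: list  # list[Question]

def run_interview(categories, answer_fn):
    """Ask one category at a time. Replying 'default' accepts the
    recommended answer; anything else overrides it. The returned dict
    is the short spec the agent summarizes back before planning."""
    spec = {}
    for cat in categories:
        answers = {}
        for q in cat.questions:
            reply = answer_fn(cat.name, q)
            answers[q.text] = q.default if reply.strip().lower() == "default" else reply
        spec[cat.name] = answers
    return spec

# Usage, mirroring the pricing-page scenario below: accept every
# default except one, which gets an explicit override.
categories = [
    Category("Scope", [
        Question("Feedback about the page itself, or about the pricing?", "pricing"),
        Question("Tie feedback to a specific plan?", "yes"),
    ]),
    Category("Data", [
        Question("Where should submissions go?", "database + Slack notification"),
    ]),
]

def answer(category_name, question):
    if "submissions" in question.text:
        return "the Slack channel product leadership watches"
    return "default"

spec = run_interview(categories, answer)
```

The point of the shape is the last line: what comes out is not code but a spec, small enough to read in one pass and correct before anything is built.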
Put the question-generation instruction in your instruction file so you don’t have to retype it every session. A one-line rule like “For any non-trivial task, interview me first: list questions by category, offer defaults, one category at a time” reliably changes the agent’s behavior on day one.
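In an instruction file, the rule might read something like the following; the wording and headings are illustrative, not a required format:

```markdown
## Before coding

For any non-trivial task, interview me before writing a plan or code:

- Group questions by category (scope, users, constraints, edge cases,
  security and data).
- Ask one category at a time and wait for my answers.
- Offer a recommended default with every question.
- Summarize my answers as a short spec and confirm it before proceeding.
```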
How It Plays Out
A product manager asks an agent to “add a feedback form to the pricing page.” Without interviewing, the agent builds a three-field form (name, email, message), stores submissions in a new database table, and sends a thank-you email. It looks great. It’s the wrong feature: the PM wanted to collect structured feedback on specific pricing tiers, not a generic contact form. Two hours of iteration follow, most of them spent explaining what the feature was actually for.
A different session, a different agent, same request, but this time the instruction file says to interview first. The agent comes back with questions grouped by category:
Scope:
1. Is this feedback about the pricing page itself (layout, clarity)
or about the pricing (too expensive, unclear value)? (Default: pricing.)
2. Should feedback tie to a specific plan the user was looking at?
(Default: yes.)
Users:
3. Anonymous feedback, or only from signed-in users? (Default: both.)
4. Do you want to follow up, or is this fire-and-forget?
(Default: optional email field.)
Data:
5. Where should submissions go: database, Slack, email, spreadsheet?
(Default: database + Slack notification.)
6. Retention? (Default: 90 days, then anonymize.)
The PM answers in under a minute, mostly by typing “default” next to each item. One answer surprises the agent: submissions should go to a specific Slack channel that product leadership already watches. The agent updates its plan accordingly. The feature ships in the same afternoon and is the right feature on the first try.
A second scenario: a developer working with an unfamiliar codebase asks the agent to refactor a payment module. The interview surfaces that there’s a second caller in a legacy system the developer forgot about, that the existing tests don’t cover the retry path, and that the team has an undocumented convention that error messages must include a correlation ID. All three facts would have been discovered eventually, but as rework, not as requirements.
Consequences
The first-pass acceptance rate, meaning the percentage of agent output that lands without revision, rises sharply. So does trust: the interview makes the agent’s understanding visible before anything is built, so you catch misunderstandings when they’re still just words.
The cost is a short pause at the start of each session, and some social friction if the agent pitches questions at the wrong level. An interview that asks obvious things (“What programming language?” on a project where the stack is already clear) feels like busywork and trains you to skim. An interview that asks deep architectural questions on a one-line bug fix feels absurd. Calibration is a skill: give the agent examples in your instruction file of when to interview deeply, when to interview lightly, and when to skip the interview and act.
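Calibration rules can sit next to the interview rule in the same instruction file. This fragment is one possible wording, not a prescribed format:

```markdown
## Interview calibration

- New feature, or anything touching data models, auth, or money:
  full interview, all categories.
- Small change to an existing feature: scope and edge cases only.
- One-line bug fix, typo, or rename: skip the interview and act.
```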
Question generation also tends to shift where time is spent. Less time on revision, more time up front. For teams that resist the front-loaded style, this feels like a slowdown, even when the overall cycle is faster. The measurable improvement shows up at the pull-request level, not at the first prompt.
Related Patterns
- Depends on: Prompt – the interview is run through prompts; question generation is a prompt pattern with structure.
- Enables: Plan Mode – the answers from the interview become the inputs plan mode uses to propose a plan.
- Enables: Specification – the interview summary is a lightweight spec the agent can consume on the next run.
- Uses: Instruction File – the rule that triggers interviewing belongs in the project’s instruction file, not in every prompt.
- Uses: Context Engineering – the interview is a deliberate way to front-load the context the agent needs most.
- Contrasts with: Vibe Coding – vibe coding accepts the agent’s first output; question generation insists on agreement before any output at all.
Sources
- Paul Duvall’s “AI Development Patterns” repository named Question Generation as a first-class pattern in April 2026, arguing that a structured interview round costs fewer tokens than a single revision cycle and that the most expensive bugs come from building the wrong thing rather than building it wrong.
- An independent practitioner guide published by BSWEN in April 2026 (“How to Get AI Coding Assistants to Ask Clarifying Questions Before Generating Code”) documented the same pattern under the same framing, walking a design tree one branch at a time, providing recommended answers for each question, and asking one category at a time. Two independent sources converging on the same named behavior is what promoted this from folk wisdom to a named pattern.
- The broader practice of requirements interviewing long predates agentic coding. It grows out of the analyst tradition documented by Donald Gause and Gerald Weinberg in Exploring Requirements (1989), which argued that ambiguity discovered in conversation is cheap and ambiguity discovered in code is expensive. Question generation applies that insight to a new kind of analyst: one that responds in seconds and never gets tired of asking.