Architecture Astronaut
“When you go too far up, abstraction-wise, you run out of oxygen.” — Joel Spolsky, “Don’t Let Architecture Astronauts Scare You”
Designing at an altitude so high that the abstractions stop touching any real problem.
Understand This First
- Abstraction — the tool the astronaut reaches for too early and too often.
- KISS — the heuristic that pulls the design back to the simplest thing that works.
- YAGNI — the heuristic that rejects layers added for hypothetical needs.
The name comes from Joel Spolsky’s 2001 essay, written about a generation of software thinkers who kept generalizing one level past the point where the words still meant anything. A component model abstracts the parts of a program; messaging abstracts what those components do; by the time you reach “patterns of interaction in distributed systems of agents,” the air is thin and the engineering has nowhere to land. Spolsky’s metaphor stuck because every working engineer has watched a meeting climb that ladder. In the agentic era the ladder has a new bottom rung: a fluent model that will gladly produce three more levels of abstraction at the slightest invitation.
Symptoms
- The design uses words like platform, framework, engine, or system for software that has one customer and one workload.
- Code reviews argue about generality before any concrete requirement is on the table.
- The class diagram has more interfaces than implementations.
- A small feature requires touching files in five layers that were introduced to handle scenarios that never arrived.
- The agent produces ports, adapters, use cases, presenters, and factories for a CRUD endpoint and an SQLite file.
- Justifications for structure are forward-looking: “this will let us swap the database,” “this will let us scale to multiple tenants,” “this will let us add another channel later” — none of which have a date.
- A reader needs a diagram to understand a hundred-line program.
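The "more interfaces than implementations" symptom can be checked mechanically, at least roughly. The following is a minimal sketch, assuming a Python codebase in which interfaces are marked by inheriting from `ABC` or `Protocol`; the function name `interface_report` and the sample class names are hypothetical, and the heuristic ignores generic bases such as `Protocol[T]`.

```python
import ast

# Base-class names treated as "this class is an interface".
# This is an assumption; adjust it to the codebase's own conventions.
INTERFACE_MARKERS = {"ABC", "Protocol"}


def _base_names(cls: ast.ClassDef) -> set[str]:
    """Return the simple names of a class's direct bases."""
    names = set()
    for base in cls.bases:
        if isinstance(base, ast.Name):          # class Foo(ABC)
            names.add(base.id)
        elif isinstance(base, ast.Attribute):   # class Foo(abc.ABC)
            names.add(base.attr)
    return names


def interface_report(source: str) -> dict:
    """Count interfaces, concrete implementations, and orphaned interfaces."""
    classes = [n for n in ast.walk(ast.parse(source)) if isinstance(n, ast.ClassDef)]
    interfaces = {c.name for c in classes if _base_names(c) & INTERFACE_MARKERS}
    implementations = 0
    implemented = set()
    for c in classes:
        if c.name in interfaces:
            continue
        hits = _base_names(c) & interfaces
        if hits:
            implementations += 1
            implemented |= hits
    return {
        "interfaces": len(interfaces),
        "implementations": implementations,
        "never_implemented": sorted(interfaces - implemented),
    }


SAMPLE = """
from abc import ABC

class UserRepository(ABC): ...
class Mailer(ABC): ...
class EventBus(ABC): ...
class SqliteUserRepository(UserRepository): ...
"""

report = interface_report(SAMPLE)
```

A report with three interfaces and one implementation does not prove the design is wrong, but it is a cheap prompt to ask what the two orphaned interfaces are for.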
Why It Happens
The astronaut mindset starts with a real virtue. Good engineers learn to see structure, name forces, and pull common pieces into shared shapes. The mistake is treating abstraction as inherently valuable rather than as a tool that pays rent only when it captures a real distinction. The first abstraction often does pay rent. The second tier may or may not. By the fourth tier the design is talking to itself.
The trap is socially reinforced. Senior engineers are rewarded for showing range; conference talks select for grand vocabulary; interview rituals reward candidates who reach for architecture words. None of this is wrong on its face, but it produces a steady cultural pressure to build the impressive shape instead of the small thing that actually works. A program that does its job in two hundred lines feels embarrassing to present; a program that does the same job behind a framework of factories and protocols looks like serious engineering.
Agents make this much cheaper to do badly. A model has read tens of thousands of mature codebases. When you ask for a “production-ready” service or a “scalable” API, the model has seen what those phrases usually look like in code: hexagonal layers, ports and adapters, command/query separation, dependency-inversion containers, and an event bus. It will reproduce that shape on top of a problem that is two database tables and one webhook. The output reads as professional because it borrows the surface of professional work. The actual reasoning — do these layers earn their cost on this codebase? — is the step the model cannot do for you.
There is a quieter cause underneath: discomfort with concreteness. Naming exact column types and writing the actual control flow forces commitments. Talking about “the persistence layer” and “the orchestration plane” defers them. The astronaut posture is sometimes a way to keep moving while never quite landing on the decision that the work requires.
The Harm
The harm is rarely a dramatic failure. It is a steady drag. Every read becomes longer because the eye has to climb through layers to find the line that does the work. Every change becomes a hunt because the place where the behavior lives is one hop away from the place where you’d look. Every new contributor spends a week learning the local cosmology before they can touch anything. The code’s complexity grows independently of the product’s complexity.
The deeper harm is the false floor of sophistication. A reviewer who sees a familiar architecture stops asking whether it fits. A founder who sees a tidy folder tree assumes the system is sound. A team that has invested in elaborate ceremony resists simplifying because the ceremony has acquired the dignity of work already done. Sunk-cost reasoning protects the layers from the only force that would let them be removed: someone willing to read each one and ask what it is for.
In agentic coding, the harm compounds across prompts. Once the first speculative layer lands, the agent treats it as local convention. The next prompt extends it. The third one tests it. The unnecessary interface gets an unnecessary mock; the mock gets brittle tests; the tests look like quality. Months later the system has accreted a scaffold that nobody chose, holds up nothing in particular, and is difficult to take down without breaking something incidental.
There is also a cost the code itself cannot show: opportunity. Time spent designing the meta-system is time not spent talking to the customer, reading the data, or shipping the next thing. The astronaut version of an idea takes longer to build and longer to be proven wrong, because the layers have absorbed the energy a smaller version would have spent on a quick test against reality.
The Way Out
Stay low until altitude pays. The discipline is not “never abstract.” It is “abstract when the second instance exists and the right shape of the abstraction is visible from the first two.” Before the second instance, you are guessing.
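As a small illustration of waiting for the second instance, here is a sketch in Python. The email helpers and their wording are hypothetical; the point is only the order in which the code appears.

```python
# Step 1: one concrete case. No base class, no template registry.
def welcome_email(name: str) -> dict[str, str]:
    return {"subject": "Welcome", "body": f"Hi {name}, thanks for signing up."}


# Step 2: a second concrete case arrives. Write it concretely too.
def reset_email(name: str, link: str) -> dict[str, str]:
    return {
        "subject": "Reset your password",
        "body": f"Hi {name}, reset it here: {link}",
    }


# Step 3: only now is the shared shape visible from two real instances:
# a subject, a greeting, and a message. Nothing more is generalized.
def email(subject: str, name: str, message: str) -> dict[str, str]:
    return {"subject": subject, "body": f"Hi {name}, {message}"}
```

Had the abstraction been written after step 1, it would have been a guess about what varies. After step 2 the varying parts are simply the parts that differ between the two functions.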
Use three checks:
Name the second customer. Every layer of abstraction promises to serve more than one case. Before adding the layer, name the second case concretely. Not “another database someday.” A second database that has a name, a workload, and a schedule. If the second case is hypothetical, you are not building generality; you are building speculative generality. Build the concrete thing and let the second instance, when it arrives, show you the shape.
Demand a falsifiable claim. Each layer should make a falsifiable claim about a force it balances. “Repository pattern isolates persistence” can be tested: if you actually swap the database, the change should be confined to the repository. “Hexagonal architecture decouples the core from frameworks” can be tested: if you actually replace the framework, the core should not move. If the claim cannot be tested by any change you can plausibly make in the next quarter, the layer is decoration. Delete it.
Run the deletion sketch. On paper or in a branch, write out the same code with the topmost layer removed. Read both versions side by side. Which one would you rather debug at 2 a.m.? Which one would you rather hand a new hire? If the simpler version answers both questions, the layer was not pulling weight. A pattern that survives the deletion sketch is one you can defend; a pattern that does not survive it was protecting the design from being read.
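The falsifiable-claim check can be made literal. The sketch below, in Python, tests the claim "the repository isolates persistence" by actually swapping the storage and confirming the core function does not change. The class names, the `register` function, and the `users` table are all hypothetical, chosen only for illustration.

```python
import sqlite3


class SqliteUsers:
    """Storage backed by an SQLite table."""

    def __init__(self, conn: sqlite3.Connection):
        self.conn = conn
        conn.execute("CREATE TABLE IF NOT EXISTS users (email TEXT PRIMARY KEY)")

    def add(self, email: str) -> None:
        self.conn.execute("INSERT INTO users (email) VALUES (?)", (email,))

    def exists(self, email: str) -> bool:
        row = self.conn.execute(
            "SELECT 1 FROM users WHERE email = ?", (email,)
        ).fetchone()
        return row is not None


class InMemoryUsers:
    """The swapped-in storage: same two methods, no database."""

    def __init__(self):
        self.emails = set()

    def add(self, email: str) -> None:
        self.emails.add(email)

    def exists(self, email: str) -> bool:
        return email in self.emails


def register(users, email: str) -> bool:
    """Core logic. The claim under test: this never changes when storage does."""
    normalized = email.strip().lower()
    if users.exists(normalized):
        return False
    users.add(normalized)
    return True


def claim_holds(users) -> bool:
    """First registration succeeds; a duplicate in different case is rejected."""
    return register(users, " Ada@Example.com ") and not register(users, "ada@example.com")
```

If `claim_holds` passes for both storages with `register` untouched, the layer has earned its keep for this one claim. If no second storage will plausibly exist next quarter, the same test shows that the layer is answering a question nobody is asking.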
When you are working with an agent, state altitude explicitly in the prompt. “This service has one caller, one database, and three endpoints. Keep it as small as possible. Do not introduce repositories, factories, dependency-injection containers, or hexagonal layers unless I ask. If you think a layer is justified, name the second concrete case before you add it.” Without that direction the agent will reach for the mature-system shape it has seen most often, regardless of whether your problem has earned that shape.
A useful prompt against an astronaut draft: “Here is the design. Strip out the topmost layer of abstraction and rewrite the code as if that layer never existed. Tell me what got worse and what got better.” The pieces that got worse name the forces the layer was balancing. The pieces that got better name the layers that were never doing real work.
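A deletion sketch can be very short. The Python example below is hypothetical (the pricing classes, the 8 percent tax, and the function names are invented for illustration) and shows a layered draft next to the same behavior with the layers removed.

```python
from abc import ABC, abstractmethod


# --- Before: three layers wrapped around one line of arithmetic ---
class PricingStrategy(ABC):
    @abstractmethod
    def total(self, cents: int) -> int: ...


class FlatTaxPricing(PricingStrategy):
    def __init__(self, tax_percent: int):
        self.tax_percent = tax_percent

    def total(self, cents: int) -> int:
        return cents + cents * self.tax_percent // 100


class PricingStrategyFactory:
    def create(self) -> PricingStrategy:
        return FlatTaxPricing(tax_percent=8)


class CheckoutService:
    def __init__(self, factory: PricingStrategyFactory):
        self.strategy = factory.create()

    def total(self, cents: int) -> int:
        return self.strategy.total(cents)


# --- After: the deletion sketch ---
TAX_PERCENT = 8


def checkout_total(cents: int) -> int:
    return cents + cents * TAX_PERCENT // 100
```

Both versions return the same total for the same input. What got worse in the flat version is the ability to swap pricing rules at runtime; whether that matters depends on whether a second pricing rule has a name and a date.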
How It Plays Out
A two-person startup asks an agent to build “a clean, scalable user-management service.” The agent produces a service with a domain layer, an application layer, an infrastructure layer, ports for persistence and email, adapters around Postgres and SendGrid, a command bus, a query bus, a result-object pattern, and an event publisher. The actual requirement is signup, login, password reset, and email verification, all backed by one Postgres instance. Six weeks later, the founders cannot remember which layer to edit to change the password-reset email’s subject line. They delete most of the structure, keep the four handlers and the database calls, and finish the work in an afternoon.
A senior engineer prompts an agent to refactor a working data pipeline. The pipeline is two hundred lines of SQL and a small Python wrapper. The agent returns a Pipeline Orchestration Framework with abstract base classes for sources, sinks, and transforms, a dependency-injection container, a plugin registry, and a YAML configuration schema. The agent’s design memo says this will let the company plug in new data sources easily. The company has had the same two data sources for three years. The simpler version, with the SQL right there to read, is one file. The framework version is fourteen.
A platform team draws a diagram for a new internal tool. The diagram has a Domain Layer, a Capabilities Plane, an Experience Surface, and a Governance Mesh. Each box has its own design document. Six months in, no team has shipped any feature that touches all four. Anyone who tries gets routed through three reviews and a working group. A new engineer who joined to write code ends up writing position papers about which plane a feature belongs in. The first team to ship anything quietly side-steps the architecture entirely and ships a small service that talks directly to the database. The side-step works and is widely copied. The architecture remains on the wiki, accruing dignity.
An agent is asked to add a small feature to a Rails monolith: an admin page that lists recent payments. The agent decides this is an opportunity to “modernize the read path.” It introduces a query-side abstraction, an event-sourced projection, and a read-model store. The diff is twelve hundred lines and touches forty-three files. The original requirement could have been fifteen lines and one query.
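For scale, here is roughly what "fifteen lines and one query" looks like. The original case is a Rails monolith; this sketch uses Python and the standard-library `sqlite3` module as a stand-in, and the `payments` table and its columns are assumptions made for the example.

```python
import sqlite3


def recent_payments(conn: sqlite3.Connection, limit: int = 20) -> list[tuple]:
    """Return the newest payments first. One query, no read model."""
    return conn.execute(
        "SELECT customer, amount_cents, paid_at "
        "FROM payments ORDER BY paid_at DESC LIMIT ?",
        (limit,),
    ).fetchall()


def demo_connection() -> sqlite3.Connection:
    """Build a throwaway in-memory database with hypothetical rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE payments (customer TEXT, amount_cents INTEGER, paid_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO payments VALUES (?, ?, ?)",
        [
            ("ada", 1200, "2024-05-01T10:00:00"),
            ("bob", 500, "2024-05-03T09:30:00"),
            ("cy", 990, "2024-05-02T18:45:00"),
        ],
    )
    return conn
```

The admin page only has to render what `recent_payments` returns. Anything beyond that needs a second concrete requirement to justify it.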
Sources
- Joel Spolsky’s Don’t Let Architecture Astronauts Scare You (Joel on Software, 2001) named the antipattern and supplied the metaphor of altitude as the failure mode: when you generalize past the level where the words still touch real problems, the air gets thin and the engineering has nowhere to land.
- William J. Brown, Raphael C. Malveau, Hays W. “Skip” McCormick III, and Thomas J. Mowbray’s AntiPatterns (Wiley, 1998) established the antipattern form this article follows and catalogued the related corporate failure modes (Stovepipe Enterprise, Vendor Lock-In) in which abstraction layers accumulate organizational weight without delivering operational value.
- Martin Fowler and Kent Beck’s Refactoring (Addison-Wesley, 2nd ed. 2018) names Speculative Generality as a related but distinct code smell: hooks added for hypothetical future needs. Astronaut work is the same impulse one level up the stack — the smell is at the design and architecture layer rather than at the class and method layer.
- Richard P. Gabriel’s Worse Is Better essay (1991) is the older grounding for the same intuition: simpler designs that touch real problems out-compete more elegant designs that climb too high above them. The astronaut antipattern is what happens when a team forgets the lesson.