State

State is the information a system remembers between operations; the word is what lets a team talk about where that memory lives, how long it lasts, and who can change it.

Concept

Vocabulary that names a phenomenon.

What It Is

State is whatever a system remembers between one operation and the next. The items in a shopping cart, the current step in a checkout workflow, whether a user is logged in, the cursor position in an open file, the contents of a cache, the row counts in a database: all state. A pure function that takes inputs and returns outputs without remembering anything between calls is stateless. Almost no useful software is fully stateless; the interesting question is never whether the system has state but where the state lives, how long it lasts, and who is allowed to change it.

It pays to keep the layers separate, because they get conflated and the conflation is where the bugs come from:

  • Ephemeral state lives for the lifetime of a single request, a single function call, or a single agent turn: local variables, the arguments on the stack, the partial computation in flight. It vanishes when the operation ends and nothing downstream sees it. It is the cheapest state to reason about, because nothing else can observe it going wrong.
  • Session state lives for the lifetime of a user’s session or an agent’s working context — the current logged-in user, an open conversation history, the partial form a user is filling in, the context window the agent is reasoning over. It survives across operations but not across restarts; lose it and the user starts over.
  • Persistent state lives across restarts — rows in a database, files on disk, records in object storage, contents of a build artifact. It is the layer the business cares about. Lose it and the team has lost something they cannot reproduce.
  • Distributed state is the same logical fact replicated across more than one machine — a row in the leader and the same row in a read replica, a cache and the database it caches, a search index and the records it indexes. The replicas can disagree, and reasoning about distributed state is reasoning about that disagreement.

Each layer has its own failure mode. Ephemeral state goes wrong when one function reaches into another’s locals through a global. Session state goes wrong when two browser tabs disagree about the user’s intent. Persistent state goes wrong when a crash interrupts a multi-step write. Distributed state goes wrong when the leader and the replicas drift apart under load. When practitioners say “the system has a state bug,” they almost always mean a bug in one specific layer; the team that hasn’t named the layers is going to spend the first half of every incident review figuring out which one they meant.

The deeper work the word does is to mark which decisions are still open. For every piece of information a system remembers, three questions need answers, and naming the questions is most of the work:

  • Where does it live — which component owns it, which storage backs it, and is there exactly one owner or several pretending to be one?
  • How long does it last — request-scoped, session-scoped, persistent across restarts, or persistent forever and never garbage collected?
  • Who can change it — which code paths are allowed to write, and is the rest of the system reading the written value or a cached copy of it?
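One lightweight way to keep the three answers from living only in people's heads is to record them next to the code. The sketch below is illustrative only: `StateSpec`, `Lifetime`, the inventory entries, and `shared_writers` are hypothetical names invented for this example, not part of any library.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class Lifetime(Enum):
    REQUEST = "request"
    SESSION = "session"
    PERSISTENT = "persistent"


@dataclass(frozen=True)
class StateSpec:
    """The three answers for one piece of state."""
    name: str
    owner: str                 # where it lives: the single owning component
    lifetime: Lifetime         # how long it lasts
    writers: Tuple[str, ...]   # who can change it: the allowed write paths


# Hypothetical inventory for a small shop application.
STATE_INVENTORY = (
    StateSpec("cart_items", "orders database", Lifetime.PERSISTENT, ("cart_service",)),
    StateSpec("checkout_step", "server session", Lifetime.SESSION, ("checkout_handler",)),
    StateSpec("auth_token", "server session", Lifetime.SESSION,
              ("login_handler", "refresh_job")),
)


def shared_writers(inventory):
    """Return the names of state with more than one declared write path."""
    return [spec.name for spec in inventory if len(spec.writers) > 1]
```

A check like `shared_writers(STATE_INVENTORY)` does not prove anything about the running system; it only makes the "several owners pretending to be one" case visible in review.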

A team that has answered the three questions for every important piece of state knows the architecture the system actually has. A team that has not is operating on hope.

Why It Matters

State is the reason programs behave differently when you run them a second time. It’s why “it works on my machine” is a meme. The machine has state the new environment doesn’t. Every piece of state is something that can be in an unexpected condition: stale, corrupted, half-written, or out of sync with another copy of itself. The more state a system carries, the more configurations it has, and the configuration space is where the bugs live.

The cost of treating state as one undifferentiated word is concrete. A team that doesn’t separate ephemeral from session state writes web handlers that quietly mutate module-level singletons and then can’t explain why two concurrent requests interfere. A team that doesn’t separate session from persistent state ships a “save” button that updates the in-memory model but not the database, and the user’s work disappears on the next page load. A team that doesn’t separate persistent from distributed state assumes the read replica is the database, builds a feature on top of it, and discovers under load that the feature reads stale data. None of these are exotic bugs; all of them are the routine consequence of asking one layer to enforce a guarantee that lives in a different layer.

Naming the layers is also what makes testing honest. Pure functions (functions that take inputs and return outputs without reading or writing external state) are easy to test, because the test specifies the inputs and asserts the outputs and there is no hidden third variable. Functions that read or write state are harder, because the test has to set the state up before the call and inspect it after, and any other code path that touches the same state can change the result. A codebase that pushes state to the edges (reading it at the entry, threading pure logic through the middle, and writing the result at the exit) is a codebase whose middle is testable in isolation. A codebase that reads and writes state from every layer is a codebase whose tests have to spin up the world.
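The shape described above, with state read at the entry, pure logic in the middle, and a write at the exit, can be sketched as follows. This is a minimal illustration under stated assumptions: `CartLine`, `cart_total_cents`, `checkout`, and the `store` object with its `load_lines`, `load_discount`, and `save_total` methods are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CartLine:
    unit_price_cents: int
    quantity: int


def cart_total_cents(lines, discount_percent):
    """Pure middle: the result depends only on the arguments."""
    subtotal = sum(line.unit_price_cents * line.quantity for line in lines)
    return subtotal * (100 - discount_percent) // 100


def checkout(store, cart_id):
    """Edge: read state at the entry, call pure logic, write at the exit.

    `store` is a hypothetical storage object supplied by the caller.
    """
    lines = store.load_lines(cart_id)           # read
    discount = store.load_discount(cart_id)     # read
    total = cart_total_cents(lines, discount)   # pure
    store.save_total(cart_id, total)            # write
    return total
```

The test for `cart_total_cents` needs no database and no setup. Only `checkout` needs a store, and a small fake with three methods is enough to exercise it.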

For agentic coding the problem sharpens. Coding agents trained on a decade of public code lean toward whichever patterns are most common in the corpus, and the corpus is heavy on mutable globals, module-level singletons, and “just stash it on the request object” handlers. An agent asked to add a feature won’t, by default, ask whether the new feature should own its own state or inherit state from an existing component; it’ll reach for the nearest convenient place and put the state there. Six months in, the codebase has state in fifteen places that nobody planned. The remedy isn’t to write a longer system prompt; it’s to make the codebase itself name where state lives, so the agent reads the answer rather than guesses it.

How to Recognize It

You’re looking at a state question whenever a function’s output depends on something other than its arguments, or whenever a value the system relies on has more than one place it could be read from. The questions to ask are layer-specific; trying to ask all of them at once is what produces vague design discussions.

Ephemeral and shared in-process state. Look for state that’s modified outside the function that owns it:

  • A module-level dict or list that any handler can mutate, with no rule about who’s allowed to write and no test that catches a stray write.
  • A singleton object that two parts of the codebase initialize in opposite orders depending on which import path runs first.
  • A global counter, log buffer, or configuration object that “for now” is fine and that the next concurrent request is about to step on.
  • A function whose output changes when you call it twice with the same arguments — somewhere in the call chain, hidden state is being read or written, and the next debugging session is going to find it.
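The last sign in the list, a function whose output changes between two identical calls, can be probed directly. The sketch below is a crude illustration, not a general detector; `next_label_hidden`, `next_label_explicit`, and `is_repeatable` are hypothetical names.

```python
_request_count = 0  # module-level state: any caller can observe or change it


def next_label_hidden(prefix):
    """Output depends on hidden state, so equal arguments give different results."""
    global _request_count
    _request_count += 1
    return f"{prefix}-{_request_count}"


def next_label_explicit(prefix, count):
    """Same logic with the state passed in and the new state returned."""
    new_count = count + 1
    return f"{prefix}-{new_count}", new_count


def is_repeatable(fn, *args):
    """Crude probe: call twice with the same arguments and compare the results."""
    return fn(*args) == fn(*args)
```

A probe like `is_repeatable` only catches hidden state that changes on every call. State that changes rarely, or only under concurrency, still has to be found by reading for writes outside the owning function.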

Session state. Look for places where the user’s working memory and the server’s working memory can disagree:

  • A form whose draft is held only on the client, where a refresh loses it and nobody decided that was the policy.
  • A logged-in session that survives in one browser tab but not another, because the auth token was stashed in tab-scoped storage rather than in storage shared across the origin.
  • An agent conversation whose history is held in memory by a long-running process that’s about to be restarted by the orchestrator, and there’s no plan for what happens to the half-completed task.

Persistent state. Look for what the storage actually guarantees and what the application is assuming:

  • A multi-step write where a crash between steps leaves a half-state nobody designed for — a paid invoice with no order, an account debited without a matching credit.
  • A “save” path that writes to the database but doesn’t update the cache, and the next read returns the old value.
  • A migration that touches a million rows and the rollback plan is “we’ll figure it out.”
  • A column the application treats as required but the schema doesn’t (NOT NULL is missing), so the next code path that bypasses the application writes nulls and breaks every downstream reader.
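Two of the items above, the interrupted multi-step write and the constraint the schema fails to enforce, can be illustrated with Python's standard `sqlite3` module. This is a minimal sketch assuming a hypothetical `accounts` table; a production ledger would need more than this.

```python
import sqlite3


def open_ledger():
    """Create an in-memory ledger whose schema enforces its own invariants."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        " id TEXT PRIMARY KEY,"
        " balance_cents INTEGER NOT NULL CHECK (balance_cents >= 0))"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 1000), ("bob", 0)]
    )
    conn.commit()
    return conn


def transfer(conn, src, dst, amount_cents):
    """Credit and debit in one transaction: both apply or neither does."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute(
            "UPDATE accounts SET balance_cents = balance_cents + ? WHERE id = ?",
            (amount_cents, dst),
        )
        conn.execute(
            "UPDATE accounts SET balance_cents = balance_cents - ? WHERE id = ?",
            (amount_cents, src),
        )


def balances(conn):
    """Return {account id: balance} for every account."""
    return dict(conn.execute("SELECT id, balance_cents FROM accounts"))
```

If the debit violates the `CHECK` constraint, the credit that ran before it is rolled back as well, so no half-state is left behind. The `NOT NULL` clause puts the "required" rule in the schema, where code paths that bypass the application still have to obey it.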

Distributed state. Look for the same fact in more than one place and ask which one is canonical:

  • A cache in front of a database, with no rule about how the cache stays current and no metric on how stale it is.
  • A read replica behind a leader, with the application reading from the replica because it’s faster and not realizing the replica lags under load.
  • A search index built off the database, where “the user just created this and immediately searched for it” returns nothing for a beat and the team calls it a bug rather than the indexing pipeline doing its job.
  • A multi-region deployment where two regions can both accept writes for the same record and the conflict-resolution policy is the default the database chose.
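For the first item, a cache with no freshness rule and no staleness metric, the sketch below shows one possible rule (a time-to-live plus invalidation on write) and one possible metric (the age of each value served). `TtlCache` and its parameters are hypothetical; the `clock` argument exists so the behavior can be tested without sleeping.

```python
import time


class TtlCache:
    """Read-through cache that reports how old each served value is."""

    def __init__(self, load, ttl_seconds, clock=time.monotonic):
        self._load = load          # function that reads the canonical store
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}         # key -> (value, loaded_at)

    def get(self, key):
        """Return (value, age_seconds); reload when the entry has reached its TTL."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is None or now - entry[1] >= self._ttl:
            entry = (self._load(key), now)
            self._entries[key] = entry
        return entry[0], now - entry[1]

    def invalidate(self, key):
        """Call on the write path so the next read goes to the canonical store."""
        self._entries.pop(key, None)
```

Returning the age alongside the value makes the staleness explicit to the caller, who can log it or refuse values older than the feature tolerates. The canonical copy is still the store behind `load`; the cache only promises a bound on how far behind it can be.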

A few signs that the team’s vocabulary for state is the thing that’s missing:

  • The same word (“the data,” “the user,” “the cart”) points at different things in different parts of a discussion, and nobody flags it.
  • A bug review that can’t decide whether the bug is in the application, the database, or the cache, because the three layers are being argued about as if they were one.
  • An agent’s “I added the feature” combined with a downstream report that says it doesn’t work; the gap is somewhere in the layers and the agent’s self-report didn’t include which layer it thought it was touching.

Warning

Coding agents are particularly prone to creating hidden state: module-level variables, singletons, mutable globals, “just stash it on request.state” handlers. When reviewing agent-generated code, search for state that’s modified outside the function that owns it. That’s almost always the bug nobody named.

How It Plays Out

A small team is building a SaaS dashboard and ships a “save filter preset” feature. The agent writes a handler that stores the preset in a module-level dict keyed by user ID, the integration test passes against a single-worker dev server, and the code merges. In production the dashboard runs four worker processes behind a load balancer. The user saves a preset on worker 1 and refreshes the page; the request lands on worker 3, and the preset isn’t there. The fix is to move the preset out of the in-process dict and into the database; the deeper fix is for the codebase to acquire a written rule that user-visible preferences live in persistent storage, not in worker memory. The agent wasn’t wrong about how to write a Python dict; it was reading a codebase that hadn’t yet decided where that state lived, and so the agent picked the most convenient place. Naming the layer is what would have made the agent’s choice the right one.
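The difference between the two versions in this scenario can be sketched in a few lines. The class names, the `presets` table, and the use of separate objects to stand in for separate worker processes are all assumptions made for illustration; the scenario does not specify the real handler.

```python
import json
import sqlite3


class InMemoryPresets:
    """The first version: one dict per worker process."""

    def __init__(self):
        self._presets = {}

    def save(self, user_id, preset):
        self._presets[user_id] = preset

    def load(self, user_id):
        return self._presets.get(user_id)


class DatabasePresets:
    """The fix: every worker reads and writes the same persistent table."""

    def __init__(self, conn):
        self._conn = conn
        with conn:
            conn.execute(
                "CREATE TABLE IF NOT EXISTS presets ("
                " user_id TEXT PRIMARY KEY, body TEXT NOT NULL)"
            )

    def save(self, user_id, preset):
        with self._conn:
            self._conn.execute(
                "INSERT OR REPLACE INTO presets VALUES (?, ?)",
                (user_id, json.dumps(preset)),
            )

    def load(self, user_id):
        row = self._conn.execute(
            "SELECT body FROM presets WHERE user_id = ?", (user_id,)
        ).fetchone()
        return json.loads(row[0]) if row else None
```

Two `InMemoryPresets` instances behave like two workers: a preset saved through one is invisible to the other. Two `DatabasePresets` instances that share a database see the same rows, which is the property the user expected.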

A platform team migrates their search feature from a database LIKE query to a real search index. The index is built from the same database that backs the application, but the indexing pipeline is asynchronous and runs every five seconds. A user creates a new record, navigates to the search page, types the record’s name, and sees no results until they refresh a few seconds later. The product manager files a bug; the engineer who debugs it discovers the indexing pipeline is doing exactly what it was designed to do. The fix isn’t to make the indexing synchronous (which would slow every write); the fix is for the search-results page to either show “results may be a few seconds behind for new records” or read straight from the database when the query could match a record the user has just created. What the team learned was specific: the search index is a distributed copy of the database, the staleness has a known bound, and the application has to make a deliberate choice about which read path to use when. The architectural document gains a paragraph; the next feature built on top of search gets the choice right the first time.
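The second option in this scenario, reading the user's own recent writes from the database while the index catches up, might look like the sketch below. The `search` function, its parameters, and the substring matching are hypothetical simplifications; the five-second bound is taken from the scenario.

```python
INDEX_LAG_SECONDS = 5.0  # from the scenario: the indexer runs every five seconds


def search(query, index_hits, recent_writes, now):
    """Merge index results with the caller's own writes the index may lack.

    index_hits: record names returned by the (possibly stale) search index.
    recent_writes: (name, written_at) pairs this user wrote, read from the database.
    now: current time in the same units as written_at.
    """
    results = list(index_hits)
    for name, written_at in recent_writes:
        inside_lag_window = now - written_at <= INDEX_LAG_SECONDS
        matches = query.lower() in name.lower()
        if inside_lag_window and matches and name not in results:
            results.append(name)
    return results
```

The design choice is that the database is consulted only for records young enough that the index might not have them yet, so the extra read is bounded by the known staleness rather than applied to every search.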

A coding agent is asked to add an “undo” feature to a note-taking app. The agent reads the existing model (notes are stored in a single table, edits overwrite the row) and writes the feature by keeping the last five versions of each note in an in-memory cache. The integration tests pass; the demo to the product owner goes well; the feature ships. A week later a user complains that undo doesn’t work after they close and reopen the app. Investigation reveals the cache is process-local and the app’s server restarts nightly. The fix is to move the version history into the database as a proper history table, making the “undo memory” persistent state rather than ephemeral. What the agent missed was the layer: the prompt said “add undo,” the agent built undo at the ephemeral layer, and the user expected it at the persistent layer. A team whose codebase already named where note-related state lived (everything about a note is persistent) would have gotten the right answer the first time; the agent would have read the rule and followed it.
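A history table of the kind this fix describes can be sketched with the standard `sqlite3` module. The table and function names are hypothetical, the in-memory database is used only to keep the example self-contained (a real app would open a file so the history survives restarts), and trimming to the last five versions is left out.

```python
import sqlite3


def open_notes():
    """Create the notes table plus a history table for earlier versions."""
    conn = sqlite3.connect(":memory:")  # a real app would pass a file path
    conn.executescript(
        """
        CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT NOT NULL);
        CREATE TABLE note_history (
            version INTEGER PRIMARY KEY AUTOINCREMENT,
            note_id INTEGER NOT NULL REFERENCES notes(id),
            body TEXT NOT NULL
        );
        """
    )
    return conn


def edit_note(conn, note_id, new_body):
    """Save the current body to history, then overwrite it, in one transaction."""
    with conn:
        conn.execute(
            "INSERT INTO note_history (note_id, body) "
            "SELECT id, body FROM notes WHERE id = ?",
            (note_id,),
        )
        conn.execute("UPDATE notes SET body = ? WHERE id = ?", (new_body, note_id))


def undo(conn, note_id):
    """Restore the most recent saved version; return False if there is none."""
    with conn:
        row = conn.execute(
            "SELECT version, body FROM note_history "
            "WHERE note_id = ? ORDER BY version DESC LIMIT 1",
            (note_id,),
        ).fetchone()
        if row is None:
            return False
        conn.execute("UPDATE notes SET body = ? WHERE id = ?", (row[1], note_id))
        conn.execute("DELETE FROM note_history WHERE version = ?", (row[0],))
        return True
```

Because the history is written in the same transaction as the edit, a crash cannot leave an overwritten note with no saved version to return to.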

Example Prompt

“Refactor this function so it doesn’t read or write the module-level _user_cache dict. Instead, accept the data it needs as parameters, return the new value as output, and let the caller decide where to persist it. The goal is a function whose behavior is determined entirely by its arguments, with no hidden state.”
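The kind of change this prompt asks for might look like the following before-and-after pair. The function bodies are invented for illustration, since the prompt does not show the function being refactored; only the `_user_cache` name comes from the prompt.

```python
_user_cache = {}  # the module-level dict the prompt refers to


def record_login_before(user_id):
    """Before: reads and writes hidden module-level state."""
    user = _user_cache.setdefault(user_id, {"logins": 0})
    user["logins"] += 1
    return user["logins"]


def record_login_after(user):
    """After: takes the record as a parameter and returns a new one.

    The input is not mutated; the caller decides where to persist the result.
    """
    return {**user, "logins": user["logins"] + 1}
```

Calling `record_login_before` twice with the same argument gives two different answers, because the dict remembers the first call. `record_login_after` gives the same answer for the same input every time, and the decision about whether the new record goes to a cache, a session, or a database moves to the caller.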

Consequences

Treating state as a named, layered question, rather than as one word the team uses to mean four different things, changes what the team’s defensive investment is for. The team stops trying to “manage state” in the abstract and starts asking, of each piece of information the system remembers: where does it live, how long does it last, who can change it, and which layer is going to enforce that? Those questions have answers; “how do we manage state?” has only opinions.

Benefits. A team that has separated the layers writes code whose state is locatable. Pure functions sit in the middle of the call graph, where they belong, because the team knows that putting business logic next to a storage call is the move that makes the middle untestable. Storage decisions get written down — this data is persistent, this data is session-scoped, this cache has a known TTL — so the next engineer reading the codebase doesn’t have to reverse-engineer the architecture from the code. Bugs that involve state become diagnosable: “the cache is stale” and “the application wrote the wrong value” are different bugs with different fixes, and the team can tell which one it has. A reviewer can point at any line of the codebase and ask “what state does this function read and write?” and get a short, true answer.

Liabilities. Every layer of state discipline costs something to maintain. Pushing state to the edges means more parameters threaded through more function signatures, and at some point the threading becomes its own readability cost. Concentrating persistent state in one source of truth means more round trips for code that would have been faster reading a local cache. Naming the replica layer explicitly means the architecture document has more paragraphs in it and the new hire has more to learn. A team that doesn’t budget for the cost will reach for the most rigorous discipline everywhere, hit the cost, and quietly relax the discipline in the places where the cost hurts, without writing the relaxations down. Six months later the codebase’s documented architecture and the codebase’s actual architecture have drifted apart, and a feature gets built on top of an assumption that no longer holds.

For agentic workflows the consequence is sharper. An agent will produce code that puts state at the layer the prompt named and quietly puts other state at whichever layer the surrounding code already uses. The team that prompts only at the “make it work” layer will get a feature that works at one layer and leaks state at another; the team whose codebase already names where state belongs will get an agent that places new state in the named place, because the named place is what the agent is reading. The remedy isn’t longer prompts. It’s a codebase whose vocabulary the agent can use: naming conventions, module-level docstrings about where state lives, a written rule in the contributing guide about pure-function preference, an architectural document the agent can grep. The agent is a fast writer of code in the codebase it’s reading. If the codebase names its state layers, the agent’s code will name them too.

Sources

  • The argument that minimizing and isolating state is the central programming-language design move runs from the functional-programming tradition forward. John Backus’s 1977 Turing Award lecture “Can Programming Be Liberated from the von Neumann Style?” named the conventional model — programs as sequences of statements that mutate state in named cells — and argued that “applicative” programming (today: functional programming) could free programmers from reasoning about that mutation. Backus’s specific languages didn’t win; the underlying argument did, and the contemporary preference for pure functions in code reviewed by automated tools is its direct descendant.
  • The vocabulary used here for layered state — ephemeral vs. session vs. persistent vs. distributed — is the working vocabulary of production web engineering and has no single originator. Martin Fowler’s Patterns of Enterprise Application Architecture (2002) organizes the persistent layer (Domain Model, Active Record, Data Mapper, Unit of Work) and the session layer (Server Session State, Client Session State, Database Session State) into the named pieces this article uses; the book remains the canonical reference for that vocabulary.
  • The distributed-state framing — that the same fact replicated across machines is its own architectural problem — is owed to the line of work that runs from Leslie Lamport’s “Time, Clocks, and the Ordering of Events in a Distributed System” (Communications of the ACM, 1978) through the CAP-theorem literature cited in the Consistency article. The point used here — that a multi-replica system has to make a deliberate choice about which copy a reader gets — sits downstream of all of it.
  • The agent-specific framing — that an agent’s code will put state at the layer the surrounding codebase already uses — is implicit in the working literature on coding agents and the broader practitioner conversation around production-grade agent loops. The operational rule used here is that a codebase’s vocabulary is the agent’s vocabulary, so a codebase that names where its state lives gets agent code that respects the naming.