---
slug: side-effect
type: concept
summary: "Any observable change a function makes beyond returning its result, distinguishing code that calculates from code that acts on the world."
created: 2026-04-04
updated: 2026-05-23
related:
  algorithm:
    relation: depends-on
    note: "The pure algorithmic core is where side effects should be absent."
  algorithmic-complexity:
    relation: contrasts-with
    note: "Complexity analysis measures computation cost, while side effects concern observable changes beyond the return value."
  api:
    relation: contrasts-with
    note: "API calls are intentional, visible side effects; the problem is unintentional or hidden ones."
  concurrency:
    relation: enables
    note: "Minimizing shared side effects reduces concurrency bugs."
  determinism:
    relation: refines
    note: "Controlling side effects is the primary technique for achieving deterministic behavior."
  event:
    relation: enables
    note: "From the side effects triggered by it."
---

# Side Effect

*Any observable change a function makes beyond computing and returning its result: the vocabulary by which we distinguish code that calculates from code that acts on the world.*

> **Concept**
>
> Vocabulary that names a phenomenon.

## Understand This First

- [Algorithm](algorithm.md): the pure algorithmic core is where side effects should be absent.

## What It Is

A side effect is anything a function does, while running, that you can observe from outside the function other than its return value. Writing a row to a database is a side effect. Sending an email is a side effect. Modifying a global variable, mutating an argument the caller still holds a reference to, printing to standard output, reading the system clock, generating a random number, raising an exception that escapes the call, sending a packet over the network: all side effects. The return value is the only output a *pure* function produces. Everything else a function does to the surrounding world, or in response to it, falls under this term.

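
As a minimal sketch of the distinction, the Python below sets a function whose only output is its return value beside two that also change something the caller can observe. All names here (`square`, `square_and_record`, `append_total`, `audit_log`) are hypothetical and chosen only for illustration.

```python
# Hypothetical names throughout; a sketch, not a canonical definition.
audit_log: list[str] = []  # module-level state standing in for "the world"


def square(n: int) -> int:
    """Pure: the return value is the only output."""
    return n * n


def square_and_record(n: int) -> int:
    """Side-effectful: returns the same value but also appends to audit_log."""
    audit_log.append(f"squared {n}")
    return n * n


def append_total(items: list[int]) -> int:
    """Side-effectful: mutates an argument the caller still holds."""
    total = sum(items)
    items.append(total)
    return total
```
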
The simplest way into the concept is the contrast with a pure function. A pure function has two properties. Given the same inputs it always produces the same output, and calling it leaves no trace anywhere else in the program or the system. Square a number, sort a list into a new list, parse a string into a tree: pure. Increment a counter that lives outside the function, write the parsed tree to a cache, log the input as it goes by: not pure. The presence of any of those secondary actions makes the function side-effectful, and the term covers both the action itself and the property of the function that performs it.

It helps to hold two related distinctions clearly. The first is *intentional* versus *incidental* side effects. A function called `send_confirmation_email` is announcing in its name that emailing is part of what it does; the side effect is the point. A function called `calculate_shipping_cost` that quietly also writes an audit row, bumps a counter, and emits a metric is producing incidental side effects; the caller has no signal that anything but a calculation is happening. The intellectual content of the term is mostly carried by the incidental case; intentional effects are easy to reason about because they are visible at the call site.

The second distinction is *local* versus *non-local*. Mutating a value the caller passed in is local in the sense that the caller can still see it, but non-local in the sense that the function is editing memory the caller did not ask to have edited. Writing to a file or a database is non-local in both senses: the change persists past the function's return and is visible to other parts of the system, possibly other processes. Different languages and different paradigms draw the line in different places, but the underlying phenomenon (observable behavior beyond the return value) is what the term names.

Several vocabulary terms travel with the concept and are worth holding precisely.
*Purity* is the property of having no side effects. *Referential transparency* is the property that an expression can be replaced by its computed value anywhere it appears without changing the program's meaning; pure functions enable it, and side-effectful ones break it. The *functional core, imperative shell* terminology, coined by Gary Bernhardt in his 2012 *Destroy All Software* screencasts, names a deliberate partition of a codebase into a pure inner region and a thin outer region where all the effects live. Haskell's *IO monad* is a type-level admission that a function performs effects; Rust's *ownership* and *borrowing* rules constrain where and how mutation can happen. Each of these is a different language's way of making the side-effect distinction first-class, but the underlying concept is the same: a function that acts on the world is a different kind of thing than a function that only computes.

## Why It Matters

Code is easier to understand when its name tells you everything it does. A function called `total_with_tax` that only computes a number can be read in isolation; you don't need to know anything about the database, the logging system, the cache, or the email service to reason about whether it produces the right answer. The moment that same function quietly also writes a row, fires an event, and warms a cache, you cannot reason about it without knowing all four of those subsystems. The signature lies about the work. The concept of a side effect is the vocabulary that lets a reviewer say "this function's signature lies about the work" precisely.

Software that runs reliably under change depends on this distinction more than it depends on almost anything else. Pure functions are trivial to test: an input goes in, a value comes out, the assertion is a single equality.
Side-effectful functions need scaffolding before they can be tested at all: mocks, stubs, in-memory replacements for real infrastructure, careful teardown between tests, and a small library of conventions for what counts as a sufficient simulation of the real world. When a codebase concentrates its effects in a small number of well-named places, the rest of the code stays cheap to test and cheap to change. When effects are scattered, every modification ripples outward through whatever subsystems the modified function happens to touch, and the test suite slowly turns into integration tests that nobody trusts.

The concept matters in a sharper way for agentic systems specifically. An AI agent writing code is biased toward "make it work" in the smallest visible diff; it will cheerfully add a database write inside a function called `compute_score` if that is what the failing test or the user's request appears to require. Without a reviewer who is fluent in the concept of side effects, those additions accumulate, and the codebase quickly becomes one in which no function's name can be trusted. The agent is not malicious (it has no concept of long-term maintainability cost), but it also has no built-in pressure toward the functional-core, imperative-shell discipline. The human (or the rubric the human gives the agent) has to supply that pressure explicitly. The vocabulary is how you supply it. "This is a pure calculation; the effects belong in the shell" is a sentence the agent can act on. "Make the code cleaner" is not.

Three failure modes keep recurring when the concept is missing or imprecise. The first is the *hidden effect*: a function whose name and apparent purpose are pure but which secretly mutates, writes, or emits. The cost is usually paid at debugging time, when a behavior that should have been impossible turns out to be coming from a function nobody suspected.
The second is the *effect cascade*: a side-effectful function calling another side-effectful function calling another, with each layer doing some computing and some acting, and no layer reading as a clean unit of behavior. The cost is paid in test setup that grows superlinearly with the depth of the cascade. The third is the *effect at the wrong layer*: domain logic that ought to be pure (pricing rules, validation, scoring) reaches out to touch databases and external services directly, instead of returning a value the surrounding orchestration code can act on. The cost is paid when the same logic needs to run in a context the original layer didn't anticipate (a batch job, a test harness, a different deployment shape), and the code can't be lifted out because the effects are baked into it.

## How to Recognize It

A handful of signs reliably tell you a function is side-effectful, or that what you are designing should be modeled as one:

- **Verbs in the name that describe action on the world.** `save`, `send`, `write`, `delete`, `update`, `publish`, `notify`, `log`, `record`, `commit`, `flush`. These are admissions that the function does something beyond computing. The inverse is also a signal: function names like `compute`, `calculate`, `parse`, `format`, `derive`, `to_*`, `as_*` suggest a pure intent, and any side effect inside a function with such a name is almost certainly accidental.
- **Return value of `void` (or `None`, or unit).** A function that returns nothing exists only for its effects. That is not automatically bad (`void` functions are legitimate when their job is to act), but it is a strong signal that the function is part of the imperative shell rather than the functional core.
- **Arguments mutated in place.** A function that takes a list and changes it, takes an object and edits its fields, or takes a buffer and fills it has side effects whether or not the language enforces the distinction.
  The caller's data has been edited; the change persists past the function's return.
- **Imports that admit the world.** A function that imports a database client, an HTTP library, a file-system module, a logger, a clock, or a random-number generator is almost certainly going to use them. Effects travel with their dependencies; the import list is a reliable indicator of what kinds of effects to expect.
- **Tests that need scaffolding.** If a function cannot be tested with a single equality assertion (if it needs a database fixture, an HTTP mock, a clock injection, a captured-stdout helper, a temporary directory), it is side-effectful, even when nothing in its signature says so. The shape of the test is the diagnostic.

The reverse direction matters too: knowing how to recognize a *pure* function in agent-generated code is what lets you refactor a tangle into something maintainable. A function whose body reads only its parameters and local variables, whose return statement is the only thing that escapes, and whose imports are confined to the standard library's data and math modules is almost certainly pure. That function can be moved into the functional core without ceremony. The side-effectful pieces that surround it become the shell, and the boundary between core and shell becomes visible and defendable.

## How It Plays Out

An AI agent generates a function to process a customer order. The function validates the order, calculates the total, charges the payment, sends a confirmation email, and updates inventory, all in one block. It works, but it can't be tested as a unit: you can't check the pricing logic without also triggering a real payment, and you can't check the inventory update without also charging a card. A developer who is fluent in the concept asks the agent to split the function into a pure calculator that returns `(total, list_of_actions_to_perform)` and a thin orchestrator that takes those actions and executes them.
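
The split the developer asks for might look roughly like the Python sketch below. Everything in it is an assumption made for illustration: the `Order` shape, the flat `TAX_RATE_BPS`, the action tuples, and the `handlers` mapping are hypothetical and not taken from any real codebase.

```python
from dataclasses import dataclass
from typing import Callable

TAX_RATE_BPS = 800  # assumed flat 8% tax in basis points, for illustration only


@dataclass(frozen=True)
class Order:
    order_id: str
    email: str
    items: tuple[tuple[str, int, int], ...]  # (sku, quantity, unit_price_cents)


Action = tuple[str, dict]  # (action name, payload describing what to do)


def plan_order(order: Order) -> tuple[int, list[Action]]:
    """Pure core: computes the total and describes the actions, performing none."""
    subtotal = sum(qty * price for _, qty, price in order.items)
    total = subtotal + subtotal * TAX_RATE_BPS // 10_000
    actions: list[Action] = [
        ("charge_payment", {"order_id": order.order_id, "amount_cents": total}),
        ("send_email", {"to": order.email, "amount_cents": total}),
        ("update_inventory", {"skus": {sku: qty for sku, qty, _ in order.items}}),
    ]
    return total, actions


def execute(actions: list[Action], handlers: dict[str, Callable[[dict], None]]) -> None:
    """Imperative shell: performs each described action via an injected handler."""
    for name, payload in actions:
        handlers[name](payload)
```
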
The pure half becomes testable with simple assertions; the imperative half stays small enough that its tests can use straightforward mocks. The change takes thirty minutes; the test coverage of the pricing logic goes from one fragile integration test to a dozen cheap unit tests.

> **⚠️ Warning**
>
> When reviewing agent-generated code, watch for hidden side effects: logging calls that trigger alerts, database writes buried inside utility functions, HTTP calls inside what looks like a pure calculation. Agents optimize for "the test passes," not for "the effects are visible."

A team tracks down a mysterious bug. A daily report shows incorrect totals, but every relevant calculation function looks correct in isolation. After hours of investigation, they find that a `normalize_items` helper called during the calculation modifies a shared list in place. The function is named like a pure transformer and looks like one at the call site, but its implementation is a mutation. The fix is two lines: return a new list instead of editing the input. The deeper fix is a code-review rule that any helper whose name suggests a transformation must return a new value, and any helper that mutates must say so in its name. The vocabulary of "this function has a hidden side effect" is what makes the rule expressible.

> **💡 Example Prompt**
>
> "Refactor this order-processing function so the pricing calculation is a pure function that returns the computed total and a list of actions to perform: charge payment, send email, update inventory. Move the actions into a separate orchestrator that takes the calculator's output and executes the side effects. The calculator should be testable without touching any external service."

A senior engineer reviewing a colleague's pull request notices that a function called `calculate_eligibility` is making an HTTP call to a third-party scoring service. The reviewer doesn't push back on the integration; the system genuinely needs that score.
They push back on the location: the eligibility *calculation* should be a pure function that takes the score as a parameter; the HTTP call to fetch the score belongs in the orchestrating code that calls the calculator. The reframing turns a function that depends on the network into two functions, one of which is trivially testable and the other of which is a thin adapter. The underlying behavior is unchanged; what changes is which parts of the code can be tested, replayed, and reasoned about in isolation.

## Consequences

Treating side effects as a precise, named concept changes how code is structured, how it is tested, and how it is briefed to an AI agent. The cost is not zero, but the alternative (letting effects scatter wherever the path of least resistance places them) is more expensive almost immediately.

**Benefits.** Pure functions are easy to test, easy to compose, easy to cache, easy to parallelize, and easy to reason about. When the bulk of a codebase's *logic* lives in pure functions, the test suite stays fast and trustworthy, the build stays cheap, and the failure modes stay narrow. When effects are concentrated in a small, well-named shell, the surface area that has to be tested with real or simulated infrastructure stays small. Refactoring becomes safer because the pure parts can be moved, renamed, and recomposed without disturbing any external system. Briefing an agent becomes easier because the unit of work ("write a pure function that takes X and returns Y") is a much sharper target than "implement the feature."

**Liabilities.** The discipline has a real cost. Strict separation often means passing more parameters through more layers, threading dependencies that would otherwise be reached for from inside a function, and writing two pieces of code (the calculator and the orchestrator) where one would have worked. In small programs this overhead can look gratuitous.
The cost rises further when working with AI agents, because the natural shape of agent-generated code is whatever minimally makes the failing test pass, which is usually not the functional-core shape; the human has to supply the structural pressure deliberately, and the agent has to be told what shape to aim for. And the vocabulary itself can be misused: not every mutation is harmful, not every effect is hidden, and code that bends over backwards to avoid all side effects can become harder to read than code that uses them carefully. The discipline is "make effects visible and concentrated," not "eliminate effects." The point is to be able to read a function's signature and trust it, not to write programs that never touch the world.

The practical upshot mirrors the one for [Determinism](determinism.md): side effects are worth naming explicitly whenever they show up in a design conversation. The cost of leaving the concept implicit (a hidden mutation that corrupts a daily report, an agent-written function whose name says "calculate" and whose body sends an email, a test suite that grew into a slow integration battery because effects scattered) is high enough that the discipline of naming pays for itself well inside a single project.

## Sources

- The separation of pure computation from effects is a central idea in functional programming, formalized in Haskell's type system through monadic I/O. Simon Peyton Jones and Philip Wadler set out the foundational treatment in *[Imperative Functional Programming](https://www.microsoft.com/en-us/research/publication/imperative-functional-programming/)* (POPL 1993), which showed how an ostensibly pure language can perform real-world effects while preserving the equational reasoning that purity provides.
- Gary Bernhardt popularized the practical application of the distinction for object-oriented and multi-paradigm codebases as the [functional core, imperative shell](https://www.destroyallsoftware.com/screencasts/catalog/functional-core-imperative-shell) pattern in his *Destroy All Software* screencast series (2012). His framing (a pure inner region for computation, a thin outer region for actions, and a shell that calls into the core while the core never calls out to the shell) has become the standard reference for the partition in practitioner literature.
- The architectural parallel was named earlier by Alistair Cockburn in *[Hexagonal Architecture](https://alistair.cockburn.us/hexagonal-architecture/)* (Ports and Adapters, 2005), in which the domain core has no knowledge of or dependency on the infrastructure that surrounds it. The same shape recurs in Robert C. Martin's *Clean Architecture* (2017) under the name *dependency rule*, and in *Domain-Driven Design* (Eric Evans, 2003) as the separation of the domain model from the infrastructure layer.

---

- [Next: Concurrency](concurrency.md)
- [Previous: Determinism](determinism.md)