Source of Truth

Pattern

A reusable solution you can apply to your work.

Also known as: Single Source of Truth (SSOT), Authoritative Source

Understand This First

  • State – a source of truth is the authoritative location for specific state.
  • Database – the source of truth typically lives in a database.

Context

Any system of meaningful size stores the same information in multiple places. A user’s email address might appear in the authentication database, the email service’s subscriber list, and the analytics platform. This is often unavoidable. But when those copies disagree (and they will), you need to know which one is right. The source of truth is the authoritative location where a given fact is defined and maintained. This is an architectural pattern because it determines how the system resolves contradictions.

Problem

When the same piece of information exists in multiple places and those places disagree, which one do you trust?

Without a designated source of truth, disagreements become permanent. One service says the user’s name is “Jane Smith.” Another says “Jane S. Smith.” A third says “J. Smith.” Nobody knows which is correct because nobody decided where the authoritative version lives. Updates get applied to whichever copy is convenient, and the system slowly drifts into incoherence.

Forces

  • Performance and availability push you to copy data closer to where it is needed (caching, replication, denormalization).
  • Every copy is a potential source of stale or conflicting information.
  • Different teams or services may each assume they own a piece of data.
  • Users expect the system to behave as if there is one coherent truth, even when the internals are distributed.

Solution

For every important piece of information, explicitly designate one system, one table, or one service as the source of truth. All other locations that hold that information are derived — they are caches, replicas, or projections that are populated from the source and refreshed on some schedule or trigger.

The rules are simple. Writes go to the source. If you need to change a user’s email, you change it in the source of truth. Reads prefer the source unless performance requires a cache, in which case the cache is understood to be potentially stale. Conflicts resolve in favor of the source. If the cache says one thing and the source says another, the source wins.
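
The three rules above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the class name `UserEmailStore`, the in-memory dictionaries standing in for the authoritative table and its cache, and the `allow_stale` flag are all assumptions made for the example.

```python
from typing import Dict, Optional


class UserEmailStore:
    """Hypothetical store: `source` is authoritative, `cache` is a derived copy."""

    def __init__(self) -> None:
        self.source: Dict[str, str] = {}  # stands in for the authoritative table
        self.cache: Dict[str, str] = {}   # derived copy; may be stale

    def write(self, user_id: str, email: str) -> None:
        # Rule 1: writes go to the source. The cached copy is invalidated,
        # never edited directly.
        self.source[user_id] = email
        self.cache.pop(user_id, None)

    def read(self, user_id: str, allow_stale: bool = True) -> Optional[str]:
        # Rule 2: reads may use the cache, knowing it can be out of date.
        if allow_stale and user_id in self.cache:
            return self.cache[user_id]
        value = self.source.get(user_id)
        if value is not None:
            self.cache[user_id] = value  # repopulate from the source
        return value

    def reconcile(self, user_id: str) -> Optional[str]:
        # Rule 3: on conflict the source wins, so the cache is overwritten
        # (or dropped if the source has no record).
        value = self.source.get(user_id)
        if value is None:
            self.cache.pop(user_id, None)
        else:
            self.cache[user_id] = value
        return value
```

Callers that cannot tolerate staleness pass `allow_stale=False` and pay the cost of going to the source; everything else reads the cache and accepts that it may lag.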

Document your sources of truth. A simple table (“user profile: users table in the auth database; product catalog: the products service; pricing: the pricing table in the billing database”) prevents months of confusion.
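
One way to keep that table from going stale is to check it into the codebase where tools and reviewers can see it. The sketch below reuses the three entries from the example above; the names `SOURCES_OF_TRUTH` and `source_of_truth_for` are hypothetical.

```python
# Registry of designated sources of truth, keyed by kind of data.
SOURCES_OF_TRUTH = {
    "user profile": "users table in the auth database",
    "product catalog": "the products service",
    "pricing": "pricing table in the billing database",
}


def source_of_truth_for(data_type: str) -> str:
    """Return the designated source, or fail loudly if nobody has decided."""
    try:
        return SOURCES_OF_TRUTH[data_type]
    except KeyError:
        raise LookupError(
            f"No source of truth designated for {data_type!r}"
        ) from None
```

Failing loudly on an unknown data type is a deliberate choice in this sketch: a missing entry means the ownership decision has not been made yet, which is the situation the pattern is meant to surface.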

How It Plays Out

A company runs a marketing email platform and a customer support tool, both of which store customer email addresses. A customer updates their email through the support tool, but the marketing platform still has the old address. Emails bounce. The fix is to designate the authentication database as the source of truth for email addresses, route every email change to it, and have both the marketing platform and the support tool sync from it.
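
A rough sketch of that sync step follows, with plain dictionaries standing in for the authentication database and the two downstream tools. The function name `sync_from_source` is hypothetical, and dropping records that the source no longer holds is an assumption of this example rather than something the scenario specifies.

```python
from typing import Dict, List


def sync_from_source(source: Dict[str, str], derived: Dict[str, str]) -> List[str]:
    """Make `derived` match `source`; return the ids that had to change."""
    # Any record that is missing or different in the derived copy is overwritten.
    changed = [uid for uid, email in source.items() if derived.get(uid) != email]
    for uid in changed:
        derived[uid] = source[uid]
    # Assumption: records the source no longer has are removed downstream.
    for uid in list(derived.keys() - source.keys()):
        del derived[uid]
        changed.append(uid)
    return changed


auth_db = {"c1": "new@example.com"}  # source of truth
marketing = {"c1": "old@example.com", "c9": "gone@example.com"}
support = {"c1": "new@example.com"}

sync_from_source(auth_db, marketing)  # marketing now matches auth_db
sync_from_source(auth_db, support)    # already in agreement, nothing changes
```

The data flows one way only. Neither downstream copy is ever used to update the source, so a disagreement can always be settled by running the sync again.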

In an agentic workflow, the source of truth problem shows up constantly. An AI agent generating code might create a configuration value in both a config file and a constants module. Later, someone changes the config file but not the constants module. The system breaks in a way that is baffling until you realize there were two “sources” and they disagreed. Instructing the agent to “define this value in exactly one place and reference it everywhere else” is applying the source of truth pattern.
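
As a small illustration of "define it once, reference it everywhere", the sketch below keeps a single setting in one frozen object and has every consumer read from it. The setting name, its default value, and the two consumer functions are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    # The only place this value is defined (hypothetical name and default).
    request_timeout_seconds: float = 30.0


SETTINGS = Settings()


def http_client_options() -> dict:
    # References the single definition instead of repeating the literal 30.0.
    return {"timeout": SETTINGS.request_timeout_seconds}


def retry_budget_seconds(attempts: int) -> float:
    # Derived from the same definition, so it cannot drift out of step.
    return attempts * SETTINGS.request_timeout_seconds
```

Because the dataclass is frozen, code elsewhere cannot quietly reassign the value at runtime; changing it means editing the one definition.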

Tip

When directing an AI agent to build a system with multiple data stores (a database, a cache, a search index), explicitly state which store is the source of truth for each type of data. This prevents the agent from creating update paths that bypass the authoritative source.

Example Prompt

“The customer email address must be defined in exactly one place: the auth database. The marketing service and the support tool should both read from there. Don’t create a second copy of the email in either system.”

Consequences

A designated source of truth makes conflicts resolvable and debugging tractable. When data looks wrong, you know exactly where to check. It simplifies synchronization: every derived copy has a clear upstream to refresh from.

The cost is that funneling all writes through one system can create a bottleneck or a single point of failure. It also means accepting that derived copies may be temporarily out of date, which requires the rest of the system to tolerate staleness gracefully. The discipline of always writing to the source is easy to state but hard to maintain across a growing team, especially when a shortcut “just this once” creates a second write path.

  • Uses / Depends on: State — a source of truth is the authoritative location for specific state.
  • Enables: Consistency — designating a source of truth is the first step toward maintaining consistency.
  • Enables: DRY — DRY is the principle; source of truth is the practice of applying it to data.
  • Refined by: Data Normalization / Denormalization — normalization concentrates facts in one place; denormalization intentionally copies them.
  • Uses / Depends on: Database — the source of truth typically lives in a database.
  • Example of: Ubiquitous Language — the domain glossary is a source of truth for naming.