Aggregate

Pattern

A reusable solution you can apply to your work.

An aggregate is a cluster of entities and value objects treated as a single unit for data changes, with one entity — the aggregate root — guarding the boundary.

“Cluster the entities and value objects into aggregates and define boundaries around each. Choose one entity to be the root of each aggregate, and control all access to the objects inside the boundary through the root.” — Eric Evans, Domain-Driven Design

Understand This First

  • Entity – entities carry identity and are the building blocks that aggregates organize.
  • Value Object – value objects carry meaning without identity and live inside aggregates alongside entities.
  • Domain Model – the domain model identifies which concepts belong together in an aggregate.
  • Consistency – aggregates define the boundary within which consistency rules are enforced.

Context

You have a domain model with entities and value objects. Some of these objects form natural clusters. An order has line items. A blog post has comments. A shopping cart has products, quantities, and a shipping address. The objects in each cluster depend on each other: you can’t validate a line item’s discount without knowing the order’s total, and you can’t check the order’s total without knowing its line items.

Choosing these clusters is an architectural decision. What you group into an aggregate determines your transaction boundaries, your API surface, your storage strategy, and what an agent can safely modify without coordination. Inside an aggregate, rules hold after every transaction. Across aggregates, eventual consistency is the norm.

Eric Evans introduced aggregates in Domain-Driven Design (2003) to solve a problem that gets worse as systems grow: when every object can reach every other object through navigation, there’s no obvious place to enforce rules and no clear boundary for transactions. Aggregates draw that boundary. In agentic workflows, the boundary matters even more. An agent generating code that touches an order needs to know whether it should also update the line items in the same operation or whether line items are managed separately. Without aggregate boundaries, the agent guesses.

Problem

How do you keep a group of related objects consistent without locking the entire database or letting any piece of code reach in and modify anything it can find?

A team building an e-commerce system has Order, LineItem, and Payment entities. The business rule is simple: the sum of line item prices must equal the order total, and a payment can’t exceed that total. In early development, everything works. Then the team adds a bulk-discount endpoint that modifies line items directly, and a payment service that reads the order total from a cache. The discount endpoint updates line items without recalculating the order total. The payment service authorizes a payment against a stale total. The customer pays $50 for $80 worth of goods, and nobody notices until the accounting report at month-end.

The root cause isn’t a missing validation check. The system has no boundary defining which objects must change together and which must agree before a transaction commits.
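The failure can be reproduced in a few lines. This is a minimal sketch, not the team's actual code: the class shapes, the bulk_discount function, and the prices are all assumptions made for illustration. It shows a total that is computed once and line items that any caller can edit.

```python
from decimal import Decimal


class LineItem:
    def __init__(self, sku: str, price: Decimal) -> None:
        self.sku = sku
        self.price = price


class Order:
    """No boundary: items are public and the total is never refreshed."""

    def __init__(self, items: list[LineItem]) -> None:
        self.items = items
        self.total = sum((item.price for item in items), Decimal("0"))


def bulk_discount(items: list[LineItem], percent: Decimal) -> None:
    """Hypothetical endpoint logic that edits line items directly."""
    for item in items:
        item.price = item.price * (100 - percent) / 100


order = Order([LineItem("book", Decimal("60")), LineItem("pen", Decimal("20"))])
bulk_discount(order.items, Decimal("25"))

# The stored total and the real sum of line items now disagree.
actual_sum = sum((item.price for item in order.items), Decimal("0"))
```

After the discount runs, order.total still holds the value computed at construction while actual_sum reflects the discounted prices. Nothing in the model notices, because no object is responsible for the rule.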

Forces

  • Related objects need to stay consistent with each other. An order and its line items must agree. A bank account and its transaction history must balance.
  • Locking too many objects in one transaction kills concurrency. If updating a single line item locks the entire product catalog, the system stalls under load.
  • External code that modifies internal objects directly bypasses business rules. If any service can edit a line item without going through the order, the order’s invariants are unguarded.
  • Agents follow whatever access paths the code exposes. If the code lets you reach a line item without going through its order, the agent will do exactly that when it generates new features.

Solution

Draw a boundary around each cluster of objects that must be consistent with each other. Designate one entity as the aggregate root — the single entry point for all reads and modifications. Nothing outside the aggregate touches the internal objects directly. Everything goes through the root.

The root enforces the rules. When you add a line item to an order, you call a method on the Order (the root), not on the LineItem. The Order recalculates the total, checks the discount policy, and ensures its invariants hold before the change is persisted. Loading data means loading the entire aggregate: the root and all its internal objects arrive together in a consistent state. Saving works the same way: the entire aggregate goes into storage in a single transaction.
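A minimal sketch of a root that guards its line items might look like the following. The method names (add_line_item, apply_discount), the validation rules, and the prices are assumptions for illustration; persistence is left out.

```python
from decimal import Decimal


class LineItem:
    """Lives inside the Order aggregate; outside code never mutates it directly."""

    def __init__(self, sku: str, unit_price: Decimal, quantity: int) -> None:
        self.sku = sku
        self.unit_price = unit_price
        self.quantity = quantity

    @property
    def subtotal(self) -> Decimal:
        return self.unit_price * self.quantity


class Order:
    """Aggregate root: the only entry point for changing line items."""

    def __init__(self, order_id: str) -> None:
        self.order_id = order_id
        self._items: dict[str, LineItem] = {}
        self.total = Decimal("0")

    def add_line_item(self, sku: str, unit_price: Decimal, quantity: int) -> None:
        if quantity <= 0 or unit_price < 0:
            raise ValueError("quantity must be positive and price non-negative")
        if sku in self._items:
            raise ValueError(f"{sku} is already on the order")
        self._items[sku] = LineItem(sku, unit_price, quantity)
        self._recalculate()

    def apply_discount(self, sku: str, percent: Decimal) -> None:
        if not Decimal("0") <= percent <= Decimal("100"):
            raise ValueError("percent must be between 0 and 100")
        item = self._items[sku]
        item.unit_price = item.unit_price * (Decimal("100") - percent) / Decimal("100")
        self._recalculate()

    def _recalculate(self) -> None:
        # Invariant: total always equals the sum of line item subtotals.
        self.total = sum((i.subtotal for i in self._items.values()), Decimal("0"))


order = Order("order-1")
order.add_line_item("book", Decimal("30.00"), 2)
order.add_line_item("pen", Decimal("20.00"), 1)
order.apply_discount("book", Decimal("50"))
```

Every path that changes a price ends in _recalculate, so the total cannot drift from the line items. Here the total moves from 80 to 50 when the discount is applied, in the same call that changes the price.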

This gives you three things. First, a consistency boundary: the invariants that span multiple objects are checked in one place, by the root, within a single transaction. Second, a concurrency boundary: two users modifying different aggregates don’t interfere with each other, because each aggregate is its own transaction scope. Third, a navigation boundary: code outside the aggregate can hold a reference to the root but never to an internal object, which means the root can’t be bypassed.
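One common way to get the concurrency boundary is a version number on each aggregate, checked at save time. The text does not prescribe this technique, so treat the following as a sketch under that assumption: Cart, InMemoryCartRepository, and ConcurrencyError are hypothetical names, and the in-memory dictionary stands in for real storage.

```python
import copy


class ConcurrencyError(Exception):
    """Raised when an aggregate changed in storage since it was loaded."""


class Cart:
    """Hypothetical aggregate root carrying a version for optimistic locking."""

    def __init__(self, cart_id: str) -> None:
        self.cart_id = cart_id
        self.version = 0
        self.quantities: dict[str, int] = {}

    def add(self, sku: str, quantity: int) -> None:
        self.quantities[sku] = self.quantities.get(sku, 0) + quantity


class InMemoryCartRepository:
    """Stands in for real storage; one save() models one transaction."""

    def __init__(self) -> None:
        self._stored: dict[str, Cart] = {}

    def load(self, cart_id: str) -> Cart:
        # Return a copy so each caller works on its own snapshot.
        return copy.deepcopy(self._stored[cart_id])

    def save(self, cart: Cart) -> None:
        current = self._stored.get(cart.cart_id)
        if current is not None and current.version != cart.version:
            raise ConcurrencyError(cart.cart_id)
        stored = copy.deepcopy(cart)
        stored.version += 1
        self._stored[cart.cart_id] = stored


repo = InMemoryCartRepository()
repo.save(Cart("cart-a"))
repo.save(Cart("cart-b"))

# Two users on different aggregates: neither save affects the other.
a, b = repo.load("cart-a"), repo.load("cart-b")
a.add("book", 1)
b.add("pen", 2)
repo.save(a)
repo.save(b)

# Two users on the same aggregate: the second save is rejected.
first, second = repo.load("cart-a"), repo.load("cart-a")
first.add("book", 1)
second.add("book", 5)
repo.save(first)
try:
    repo.save(second)
    conflict_detected = False
except ConcurrencyError:
    conflict_detected = True
```

The version check is scoped to one aggregate, so contention only appears when two writers touch the same root. A pessimistic lock per aggregate would give the same boundary with different trade-offs.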

Keep aggregates small. A common mistake is drawing the boundary too wide, pulling in every related entity. An order aggregate contains line items but not the customer. The customer is a separate aggregate, referenced by ID. If the order aggregate included the customer, updating a customer’s address would lock every order that customer ever placed. Vaughn Vernon’s guideline holds up: prefer small aggregates with just the root entity and its value objects, and reference other aggregates by identity rather than by direct object reference.
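Referencing by identity can be sketched as follows. The field names and the choice to copy the address onto the order at placement time are assumptions for illustration, not a prescribed design.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    """A separate aggregate root with its own lifecycle."""
    customer_id: str
    address: str

    def change_address(self, new_address: str) -> None:
        self.address = new_address


@dataclass
class Order:
    """Holds the customer's identity, not the Customer object."""
    order_id: str
    customer_id: str
    shipping_address: str  # snapshot copied when the order was placed


customer = Customer("cust-1", "1 Old Road")
orders = [
    Order("order-1", customer.customer_id, customer.address),
    Order("order-2", customer.customer_id, customer.address),
]

# Changing the customer touches only the Customer aggregate.
customer.change_address("2 New Street")
```

Because each order stores only customer_id, the address change needs no lock on any order, and the orders keep the shipping address they were placed with.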

Document your aggregates explicitly in the project glossary or instruction file. State the boundaries: “Order is an aggregate root. It contains LineItems (entities) and a ShippingAddress (value object). Payment is a separate aggregate, referenced by order_id. All modifications to line items go through Order methods.” When an agent sees this, it generates code that respects the boundaries instead of reaching in through whatever navigation path looks shortest.

How It Plays Out

A healthcare scheduling system manages appointments. Each Appointment is an aggregate root containing a TimeSlot value object and a list of Participant entities (the patient, the doctor, any specialists). The business rule: no participant can be double-booked within the same time slot. When the team directs an agent to add a rescheduling feature, the agent generates code that calls appointment.reschedule(new_slot), checking every participant’s availability before accepting the change. Because participants live inside the aggregate, the check and the update happen atomically. A separate Calendar aggregate exists for each provider, referenced by ID, so rescheduling one appointment doesn’t lock the provider’s entire calendar.
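The rescheduling method described above could be sketched like this. Only appointment.reschedule(new_slot) comes from the scenario; the busy list on Participant, the overlap rule, and the slot helper are assumptions made so the example runs on its own.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass(frozen=True)
class TimeSlot:
    """Value object: defined entirely by its start and end."""
    start: datetime
    end: datetime

    def overlaps(self, other: "TimeSlot") -> bool:
        return self.start < other.end and other.start < self.end


@dataclass
class Participant:
    """Entity inside the aggregate. `busy` is a hypothetical list of the
    participant's other commitments, used only for this sketch."""
    participant_id: str
    busy: list[TimeSlot] = field(default_factory=list)

    def is_free(self, slot: TimeSlot) -> bool:
        return not any(slot.overlaps(b) for b in self.busy)


class Appointment:
    """Aggregate root: rescheduling is only possible through this class."""

    def __init__(self, appointment_id: str, slot: TimeSlot,
                 participants: list[Participant]) -> None:
        self.appointment_id = appointment_id
        self.slot = slot
        self._participants = list(participants)

    def reschedule(self, new_slot: TimeSlot) -> None:
        # Check every participant before changing anything.
        blocked = [p.participant_id for p in self._participants
                   if not p.is_free(new_slot)]
        if blocked:
            raise ValueError("double-booked: " + ", ".join(blocked))
        self.slot = new_slot


def slot(hour: int) -> TimeSlot:
    """Helper for the example: a one-hour slot on a fixed day."""
    return TimeSlot(datetime(2025, 1, 6, hour), datetime(2025, 1, 6, hour + 1))


doctor = Participant("dr-lee", busy=[slot(14)])
patient = Participant("patient-7")
appointment = Appointment("appt-1", slot(9), [doctor, patient])

appointment.reschedule(slot(10))      # everyone is free: accepted
try:
    appointment.reschedule(slot(14))  # the doctor is busy: rejected
except ValueError as error:
    rejection = str(error)
```

The check runs before the assignment, so a rejected request leaves the appointment in its previous slot. In a real system the whole method would execute inside the transaction that loads and saves the aggregate.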

A logistics company tracks shipments. Early in development, Shipment, Package, and Route live in one large aggregate. Adding a package to a shipment locks the route, and rerouting locks all packages. Under load, drivers waiting for route updates stall because another process is adding packages to a different shipment on the same route. The team splits the model: Shipment becomes an aggregate containing Package entities, Route becomes a separate aggregate referenced by ID. Throughput jumps tenfold because shipments and routes no longer contend for the same lock.

Sizing Aggregates

Start with the smallest aggregate that enforces your invariants. If a rule spans two entities, they belong in the same aggregate. If no rule connects them, they don’t. When an agent asks you (or you ask yourself) whether two entities belong together, the test is: does modifying one require checking the other in the same transaction? If yes, same aggregate. If no, separate aggregates linked by ID.

Consequences

Aggregates give you transaction boundaries that match your business rules rather than your database schema. Each aggregate protects its own invariants, and the system can process changes to different aggregates concurrently without interference. APIs and repositories become simpler because they deal in whole aggregates, not individual objects scattered across the model.

The cost is design discipline. Drawing aggregate boundaries requires understanding which invariants span which objects, and that understanding comes from conversations with domain experts, not from staring at a database diagram. Getting the boundary wrong is expensive in both directions. Too wide, and you get contention: unrelated changes block each other. Too narrow, and rules that span two aggregates can only be enforced through eventual-consistency mechanisms or sagas, which are harder to reason about and harder to get right.

Cross-aggregate references by ID feel awkward in object-oriented code. Loading a related aggregate requires an explicit repository call instead of walking a pointer. That friction is the point. It keeps the boundary visible in the code, so neither humans nor agents accidentally couple things that should be independent.
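The explicit lookup might look like the sketch below, which borrows the shipment and route example from earlier. RouteRepository, its methods, and next_stop are hypothetical names, and the dictionary stands in for real storage.

```python
from dataclasses import dataclass


@dataclass
class Route:
    """A separate aggregate root."""
    route_id: str
    stops: list[str]


@dataclass
class Shipment:
    """Aggregate root that knows its route only by identity."""
    shipment_id: str
    route_id: str
    packages: list[str]


class RouteRepository:
    """Hypothetical in-memory repository keyed by route ID."""

    def __init__(self) -> None:
        self._routes: dict[str, Route] = {}

    def add(self, route: Route) -> None:
        self._routes[route.route_id] = route

    def get(self, route_id: str) -> Route:
        return self._routes[route_id]


def next_stop(shipment: Shipment, routes: RouteRepository) -> str:
    # The lookup is explicit, so the boundary crossing is visible at the call site.
    route = routes.get(shipment.route_id)
    return route.stops[0]


routes = RouteRepository()
routes.add(Route("route-9", ["Depot A", "Hub B"]))
shipment = Shipment("ship-1", "route-9", ["pkg-1", "pkg-2"])
stop = next_stop(shipment, routes)
```

There is no shipment.route attribute to walk, so any code that needs route data has to ask the repository for it. That extra call is where a reviewer, or an agent, can see that two aggregates are involved.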

Related Patterns

  • Uses / Depends on: Entity – aggregates are built from entities, with one serving as the root.
  • Uses / Depends on: Value Object – value objects inside an aggregate carry domain meaning without adding identity management.
  • Uses / Depends on: Domain Model – the domain model identifies which clusters of objects form aggregates.
  • Enables: Consistency – the aggregate boundary is the consistency boundary; within it, invariants hold after every transaction.
  • Enables: Transaction – each aggregate defines the scope of a single transaction.
  • Enables: Atomic – changes to an aggregate are persisted atomically.
  • Refined by: Bounded Context – the same real-world concept may participate in different aggregates across different bounded contexts.
  • Contrasts with: Data Model – data models organize storage; aggregates organize consistency. The two often differ.
  • Informed by: Invariant – invariants spanning multiple objects define which objects must live inside the same aggregate.
  • Supported by: Instruction File – documenting aggregate boundaries in the instruction file helps agents respect them.

Sources

  • Eric Evans defined aggregates as a core tactical pattern in Domain-Driven Design: Tackling Complexity in the Heart of Software (2003), Chapter 6. The epigraph and the three-part definition (cluster, boundary, root) come from his treatment. Evans’s key insight was that without explicit boundaries, object graphs become an undifferentiated web where any mutation can violate any rule.
  • Vaughn Vernon refined aggregate design in Implementing Domain-Driven Design (2013), introducing the “small aggregates” guideline that this article follows. His rule of thumb (reference other aggregates by identity, not by object reference) solved the performance and contention problems that plagued early DDD implementations where aggregates were drawn too large.
  • Martin Fowler described aggregates on his bliki (the “DDD Aggregate” entry) and documented the related Repository and Unit of Work patterns in Patterns of Enterprise Application Architecture (2002). His framing of aggregates as transaction boundaries influenced the way this article presents the concurrency benefit.