---
slug: aggregate
type: pattern
summary: "A cluster of entities and value objects treated as a single unit for data changes, with one root entity guarding the boundary."
created: 2026-04-10
updated: 2026-05-04
related:
  atomic:
    relation: enables
    note: "Changes to an aggregate are persisted atomically."
  bounded-context:
    relation: refined-by
    note: "The same real-world concept may participate in different aggregates across different bounded contexts."
  consistency:
    relation: enables
    note: "The aggregate boundary is the consistency boundary; within it, invariants hold after every transaction."
  data-model:
    relation: contrasts-with
    note: "Data models organize storage; aggregates organize consistency. The two often differ."
  domain-model:
    relation: uses
    note: "The domain model identifies which clusters of objects form aggregates."
  entity:
    relation: uses
    note: "Aggregates are built from entities, with one serving as the root."
  instruction-file:
    relation: supported-by
    note: "Documenting aggregate boundaries in the instruction file helps agents respect them."
  invariant:
    relation: informed-by
    note: "Invariants spanning multiple objects define which objects must live inside the same aggregate."
  transaction:
    relation: enables
    note: "Each aggregate defines the scope of a single transaction."
  value-object:
    relation: uses
    note: "Value objects inside an aggregate carry domain meaning without adding identity management."
---

# Aggregate

> **Pattern**
>
> A named solution to a recurring problem.

*An aggregate is a cluster of entities and value objects treated as a single unit for data changes, with one entity, the aggregate root, guarding the boundary.*

> "Cluster the entities and value objects into aggregates and define boundaries around each. Choose one entity to be the root of each aggregate, and control all access to the objects inside the boundary through the root."
> -- Eric Evans, *Domain-Driven Design*

## Understand This First

- [Entity](entity.md) -- entities carry identity and are the building blocks that aggregates organize.
- [Value Object](value-object.md) -- value objects carry meaning without identity and live inside aggregates alongside entities.
- [Domain Model](domain-model.md) -- the domain model identifies which concepts belong together in an aggregate.
- [Consistency](consistency.md) -- aggregates define the boundary within which consistency rules are enforced.

## Context

You have a [domain model](domain-model.md) with [entities](entity.md) and [value objects](value-object.md). Some of these objects form natural clusters. An order has line items. A blog post has comments. A shopping cart has products, quantities, and a shipping address. The objects in each cluster depend on each other: you can't validate a line item's discount without knowing the order's total, and you can't check the order's total without knowing its line items.

This is an **architectural** decision. What you group into an aggregate determines your transaction boundaries, your API surface, your storage strategy, and what an agent can safely modify without coordination. Inside an aggregate, rules hold. Across aggregates, eventual consistency is the norm.

Eric Evans introduced aggregates in *Domain-Driven Design* (2003) to solve a problem that gets worse as systems grow: when every object can reach every other object through navigation, there's no obvious place to enforce rules and no clear boundary for transactions. Aggregates draw that boundary.

In agentic workflows, the boundary matters even more. An agent generating code that touches an order needs to know whether it should also update the line items in the same operation or whether line items are managed separately. Without aggregate boundaries, the agent guesses.

## Problem

How do you keep a group of related objects consistent without locking the entire database or letting any piece of code reach in and modify anything it can find?

A team building an e-commerce system has `Order`, `LineItem`, and `Payment` entities. The business rule is simple: the sum of line item prices must equal the order total, and a payment can't exceed that total. In early development, everything works. Then the team adds a bulk-discount endpoint that modifies line items directly, and a payment service that reads the order total from a cache. The discount endpoint updates line items without recalculating the order total. The payment service authorizes a payment against a stale total. The customer pays $50 for $80 worth of goods, and nobody notices until the accounting report at month-end.

The root cause isn't a missing validation check. The system has no boundary defining which objects must change together and which must agree before a transaction commits.

## Forces

- Related objects need to stay consistent with each other. An order and its line items must agree. A bank account and its transaction history must balance.
- Locking too many objects in one transaction kills concurrency. If updating a single line item locks the entire product catalog, the system stalls under load.
- External code that modifies internal objects directly bypasses business rules. If any service can edit a line item without going through the order, the order's invariants are unguarded.
- Agents follow whatever access paths the code exposes. If the code lets you reach a line item without going through its order, the agent will do exactly that when it generates new features.

## Solution

Draw a boundary around each cluster of objects that must be consistent with each other. Designate one entity as the **aggregate root**: the single entry point for all reads and modifications. Nothing outside the aggregate touches the internal objects directly. Everything goes through the root.

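
A minimal Python sketch of this boundary, using the `Order` and `LineItem` names from the e-commerce example. The method names (`add_line_item`, `apply_discount`), the `sku` field, and the choice to model `LineItem` as an immutable dataclass are assumptions made for illustration, not a prescribed design. The point it tries to show is that the total can only change through the root, so the bulk-discount bug from the Problem section has no path to occur.

```python
from dataclasses import dataclass, field, replace
from decimal import Decimal


@dataclass(frozen=True)
class LineItem:
    """Internal object (hypothetical shape): created and replaced only by Order."""
    sku: str
    unit_price: Decimal
    quantity: int

    @property
    def subtotal(self) -> Decimal:
        return self.unit_price * self.quantity


@dataclass
class Order:
    """Aggregate root: every change to line items goes through these methods."""
    order_id: str
    _items: list[LineItem] = field(default_factory=list, init=False)
    total: Decimal = field(default=Decimal("0"), init=False)

    def add_line_item(self, sku: str, unit_price: Decimal, quantity: int) -> None:
        if quantity <= 0:
            raise ValueError("quantity must be positive")
        self._items.append(LineItem(sku, unit_price, quantity))
        self._recalculate_total()

    def apply_discount(self, sku: str, new_unit_price: Decimal) -> None:
        if new_unit_price < 0:
            raise ValueError("price cannot be negative")
        if not any(item.sku == sku for item in self._items):
            raise KeyError(sku)
        # Replace the affected items, then restore the invariant in the same call.
        self._items = [
            replace(item, unit_price=new_unit_price) if item.sku == sku else item
            for item in self._items
        ]
        self._recalculate_total()

    @property
    def line_items(self) -> tuple[LineItem, ...]:
        # Callers get a read-only snapshot, not the internal list.
        return tuple(self._items)

    def _recalculate_total(self) -> None:
        # Invariant: total always equals the sum of line item subtotals.
        self.total = sum((item.subtotal for item in self._items), Decimal("0"))


order = Order("order-1")
order.add_line_item("widget", Decimal("40.00"), 2)
order.apply_discount("widget", Decimal("25.00"))
```

Python cannot truly hide `_items`, so the leading underscore is a convention rather than enforcement; in this sketch the boundary is kept by exposing only a tuple snapshot and by routing every mutation through a root method that recalculates the total.
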

The root enforces the rules. When you add a line item to an order, you call a method on the `Order` (the root), not on the `LineItem`. The `Order` recalculates the total, checks the discount policy, and ensures its invariants hold before the change is persisted. Loading data means loading the entire aggregate: the root and all its internal objects arrive together in a consistent state. Saving works the same way: the entire aggregate goes into storage in a single transaction.

This gives you three things. First, a **consistency boundary**: the invariants that span multiple objects are checked in one place, by the root, within a single transaction. Second, a **concurrency boundary**: two users modifying different aggregates don't interfere with each other, because each aggregate is its own transaction scope. Third, a **navigation boundary**: code outside the aggregate can hold a reference to the root but never to an internal object, which means the root can't be bypassed.

Keep aggregates small. A common mistake is drawing the boundary too wide, pulling in every related entity. An order aggregate contains line items but not the customer. The customer is a separate aggregate, referenced by ID. If the order aggregate included the customer, updating a customer's address would lock every order that customer ever placed. Vaughn Vernon's guideline holds up: prefer small aggregates with just the root entity and its value objects, and reference other aggregates by identity rather than by direct object reference.

Document your aggregates explicitly in the project glossary or [instruction file](instruction-file.md). State the boundaries: "Order is an aggregate root. It contains LineItems (entities) and a ShippingAddress (value object). Payment is a separate aggregate, referenced by order_id. All modifications to line items go through Order methods."


When an agent sees this, it generates code that respects the boundaries instead of reaching in through whatever navigation path looks shortest.

## How It Plays Out

A healthcare scheduling system manages appointments. Each `Appointment` is an aggregate root containing a `TimeSlot` value object and a list of `Participant` entities (the patient, the doctor, any specialists). The business rule: no participant can be double-booked within the same time slot. When the team directs an agent to add a rescheduling feature, the agent generates code that calls `appointment.reschedule(new_slot)`, checking every participant's availability before accepting the change. Because participants live inside the aggregate, the check and the update happen atomically. A separate `Calendar` aggregate exists for each provider, referenced by ID, so rescheduling one appointment doesn't lock the provider's entire calendar.

A logistics company tracks shipments. Early in development, `Shipment`, `Package`, and `Route` live in one large aggregate. Adding a package to a shipment locks the route, and rerouting locks all packages. Under load, drivers waiting for route updates stall because another process is adding packages to a different shipment on the same route. The team splits the model: `Shipment` becomes an aggregate containing `Package` entities, `Route` becomes a separate aggregate referenced by ID. Throughput jumps tenfold because shipments and routes no longer contend for the same lock.

> **💡 Sizing Aggregates**
>
> Start with the smallest aggregate that enforces your invariants. If a rule spans two entities, they belong in the same aggregate. If no rule connects them, they don't. When an agent asks you (or you ask yourself) whether two entities belong together, the test is: does modifying one require checking the other in the same transaction? If yes, same aggregate. If no, separate aggregates linked by ID.

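
The split in the logistics example could look roughly like the Python sketch below. `Shipment`, `Package`, and `Route` come from the example; the weight-limit rule, the `RouteRepository` class, and all field names are hypothetical, added only to give the sizing test something concrete to check. The weight rule spans a shipment and its packages, so they share an aggregate; no rule here connects a shipment to its route, so the shipment holds only a `route_id`.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Package:
    """Lives inside the Shipment aggregate."""
    package_id: str
    weight_kg: float


@dataclass
class Route:
    """Separate aggregate root with its own transaction scope."""
    route_id: str
    stops: list[str] = field(default_factory=list)

    def reroute(self, stops: list[str]) -> None:
        if not stops:
            raise ValueError("a route needs at least one stop")
        self.stops = list(stops)


@dataclass
class Shipment:
    """Aggregate root: owns its packages, refers to the route by ID only."""
    shipment_id: str
    route_id: str
    max_weight_kg: float  # hypothetical invariant used for illustration
    _packages: list[Package] = field(default_factory=list, init=False)

    @property
    def total_weight_kg(self) -> float:
        return sum(p.weight_kg for p in self._packages)

    def add_package(self, package: Package) -> None:
        # The rule spans the root and its packages, so both sit in one aggregate.
        if self.total_weight_kg + package.weight_kg > self.max_weight_kg:
            raise ValueError("shipment would exceed its weight limit")
        self._packages.append(package)


class RouteRepository:
    """In-memory stand-in for whatever persistence layer a real system uses."""

    def __init__(self) -> None:
        self._routes: dict[str, Route] = {}

    def save(self, route: Route) -> None:
        self._routes[route.route_id] = route

    def get(self, route_id: str) -> Route:
        return self._routes[route_id]


routes = RouteRepository()
routes.save(Route("route-7", ["Depot", "Harbor"]))

shipment = Shipment("ship-1", route_id="route-7", max_weight_kg=100.0)
shipment.add_package(Package("pkg-1", 30.0))

# Crossing the boundary takes an explicit lookup; it never touches the shipment.
routes.get(shipment.route_id).reroute(["Depot", "Airport"])
```

Adding a package and rerouting now touch different objects, which is what lets a real persistence layer commit them in separate transactions.
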
## Consequences

Aggregates give you transaction boundaries that match your business rules rather than your database schema. Each aggregate protects its own [invariants](invariant.md), and the system can process changes to different aggregates concurrently without interference. APIs and repositories become simpler because they deal in whole aggregates, not individual objects scattered across the model.

The cost is design discipline. Drawing aggregate boundaries requires understanding which invariants span which objects, and that understanding comes from conversations with domain experts, not from staring at a database diagram. Getting the boundary wrong is expensive in both directions. Too wide, and you get contention: unrelated changes block each other. Too narrow, and rules that span two aggregates can only be enforced through eventual-[consistency](consistency.md) mechanisms or sagas, which are harder to reason about and harder to get right.

Cross-aggregate references by ID feel awkward in object-oriented code. Loading a related aggregate requires an explicit repository call instead of walking a pointer. That friction is the point. It keeps the boundary visible in the code, so neither humans nor agents accidentally couple things that should be independent.

## Sources

- Eric Evans defined aggregates as a core tactical pattern in *[Domain-Driven Design: Tackling Complexity in the Heart of Software](https://openlibrary.org/works/OL4464385W/Domain-Driven_Design)* (Addison-Wesley, 2003), Chapter 6. The epigraph and the three-part definition (cluster, boundary, root) come from his treatment. Evans's key insight was that without explicit boundaries, object graphs become an undifferentiated web where any mutation can violate any rule.
- Vaughn Vernon refined aggregate design in *[Implementing Domain-Driven Design](https://openlibrary.org/works/OL17392277W)* (Addison-Wesley, 2013), introducing the "small aggregates" guideline that this article follows.
  His rule of thumb (reference other aggregates by identity, not by object reference) solved the performance and contention problems that plagued early DDD implementations where aggregates were drawn too large.
- Martin Fowler describes aggregates in the `DDD_Aggregate` entry of his bliki, and catalogs the related Repository and Unit of Work patterns in *[Patterns of Enterprise Application Architecture](https://martinfowler.com/books/eaa.html)* (Addison-Wesley, 2002). His framing of aggregates as transaction boundaries influenced the way this article presents the concurrency benefit.

---

- [Next: Bounded Context](bounded-context.md)
- [Previous: Coding Convention](coding-convention.md)