a bit about aggregates

Signed-off-by: Gavin King <[email protected]>
hibernate · Oct 23, 2024 · 39736eb
1 parent 3e0568a
commit 39736eb
Showing 2 changed files with 36 additions and 3 deletions.
diff --git a/documentation/src/main/asciidoc/introduction/Entities.adoc b/documentation/src/main/asciidoc/introduction/Entities.adoc
@@ -526,7 +526,7 @@ Hibernate slightly extends this list with the following types:
 |====
 | Classification | Package | Types
 
-| Additional date/time types | `java.time` | `Duration`, `ZoneId`, `ZoneOffset`, `Year`, and even `ZonedDateTime`
+| Additional date/time types | `java.time` | `Duration`, `ZoneId`, `ZoneOffset`, and even `ZonedDateTime`
 | JDBC LOB types | `java.sql` | `Blob`, `Clob`, `NClob`
 | Java class object | `java.lang` | `Class`
 | Miscellaneous types | `java.util` | `Currency`, `URL`, `TimeZone`
@@ -945,7 +945,40 @@ a|
 
 We'll explain the effect of these members as we consider the various types of association mapping.
 
-Let's begin with the most common association multiplicity.
+It's not a requirement to represent _every_ foreign key relationship as an association at the Java level.
+It's perfectly acceptable to replace a `@ManyToOne` mapping with a basic-typed attribute holding an identifier, if it's inconvenient to think of this relationship as an association at the Java level.
+That said, it's possible to take this idea way too far.
+
+
+.💀 Aggregates 💀
+****
+It's come to our attention that a vocal group of people advocate that Java entity classes should be broken up into tiny disconnected islands they call "aggregates". An aggregate--at least as a first approximation--corresponds roughly to what we would usually call a parent/child relationship.
+Simple examples of aggregates might be `Order`/`Item`, or `Product`/`Part`.
+According to this way of thinking, there should be no associations _between_ aggregates: the `Item.product` association should be replaced with `productId`, `Part.manufacturer` should be replaced with `manufacturerId`, and so on.
+(Of course, the word "aggregate" may also be employed in other senses, but this is the sense we're discussing right now.)
+
+In the example we've been using, `Book` would not be permitted to have a collection of entity type `Author`, and should instead hold only the ids of the authors, or perhaps instances of some `BookAuthor` type which duplicates some state of `Author` and is disconnected from the rest of the model.
+
+Let's stipulate that this might be a perfectly natural thing to do in certain contexts, for example, when accessing a document database.
+But one context where it doesn't usually make sense is when accessing a relational database via Hibernate.
+The reason is that Hibernate offers <<association-fetching,rich functionality>> for optimizing access to associated data, including:
+
+- the <<second-level-cache,second level cache>>, and
+- join, batch, and subselect fetching, whether via HQL, <<entity-graph,entity graphs>>, or <<fetch-profiles,fetch profiles>>.
+
+But all this functionality is lost if Hibernate doesn't know it's dealing with an association, inevitably making the application program much more vulnerable to problems with <<association-fetching,N+1 selects>>, just as soon as we encounter a business requirement which involves data from more than one aggregate.
+(Always keep in mind that business requirements change much faster than relational data models!)
+
+To put it mildly: this is not how JPA was ever intended to be used.
+
+It's difficult to respond charitably to most of the arguments in favor of this approach, since most of them don't rise above the level of hand-waving at boxes drawn on whiteboards.
+An argument we _can_ respond to is the concern that transparent lazy fetching can lead to "accidental" fetching of an association and the potential for N+1 selects.
+This is a legit concern, and one we worry about too, but where it's really a problem we have a much better solution: just use a `StatelessSession`, or a Jakarta Data repository, where <<stateless-sessions,association fetching is always an explicit operation>>.
+Indeed, `StatelessSession` even guards against accidental _updates_, since `update()` is always an explicit operation.
+****
+
+Now that we know that associations are actually good and useful, let's see how to model the various kinds of association we might need to map to a relational data model.
+We begin with the most common association multiplicity.
 
 [[many-to-one]]
 === Many-to-one

diff --git a/documentation/src/main/asciidoc/introduction/Introduction.adoc b/documentation/src/main/asciidoc/introduction/Introduction.adoc
@@ -443,7 +443,7 @@ The birth of the Jakarta Data specification has obsoleted our arguments against
 Jakarta Data--as realized by Hibernate Data Repositories--offers a clean but very flexible way to organize code, along with much better compile-time type safety, without getting in the way of direct use of the Hibernate `StatelessSession`.
 
 That said, we reiterate our preference for design which emerges organically from the code itself, via a process of refactoring and iterative abstraction.
-The Extract Method refactoring is a far, far more powerful tool than drawing boxes on whiteboards.
+The Extract Method refactoring is a far, far more powerful tool than drawing boxes and arrows on whiteboards.
 In particular, we hereby give you permission to write code which mixes business logic with persistence logic within the same architectural layer--every architectural layer comes with a high cost in boilerplate, and in many contexts a separate persistence layer is simply unnecessary.
 // In 2025 it no longer makes sense to shoehorn every system into an architecture advocated by some book written in the early 2000's.
 Both of the following architectures are allowed, and each has its place: