-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HHH-18326 Inefficient identity hash code computation tracking persistent instances #9650
base: main
Are you sure you want to change the base?
Conversation
hibernate-core/src/main/java/org/hibernate/internal/util/collections/InstanceIdentityMap.java
Fixed
Show fixed
Hide fixed
hibernate-core/src/main/java/org/hibernate/collection/spi/AbstractPersistentCollection.java
Show resolved
Hide resolved
hibernate-core/src/main/java/org/hibernate/internal/util/collections/InstanceIdentityMap.java
Show resolved
Hide resolved
hibernate-core/src/main/java/org/hibernate/internal/util/collections/InstanceIdentityMap.java
Outdated
Show resolved
Hide resolved
hibernate-core/src/main/java/org/hibernate/internal/util/collections/PagedArray.java
Outdated
Show resolved
Hide resolved
hibernate-core/src/main/java/org/hibernate/engine/internal/EntityEntryContext.java
Fixed
Show fixed
Hide fixed
010c6dc
to
0db7d26
Compare
hibernate-core/src/main/java/org/hibernate/engine/internal/StatefulPersistenceContext.java
Outdated
Show resolved
Hide resolved
hibernate-core/src/main/java/org/hibernate/internal/util/collections/InstanceIdentityMap.java
Show resolved
Hide resolved
Wouldn't 7.0 be a better time than 7.1 to make potentially dangerous changes? cc @sebersole |
Updated the PR accounting for the initial concerns. For anyone interested, here you can find the updated benchmark results. |
} | ||
return this.collectionEntries; | ||
collection.$$_hibernate_setInstanceId( nextCollectionInstanceId() ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
collection.$$_hibernate_setInstanceId( nextCollectionInstanceId() ); | |
assert collection.$$_hibernate_getInstanceId() == -1; | |
collection.$$_hibernate_setInstanceId( nextCollectionInstanceId() ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The collection might have been associated to a different persistence context, we don't reset instance ids.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I'm missing something, but didn't we discuss that we could unset the instance id in unsetSession()
when changing the order of operations that call this method right now?
I'm not saying this check is super critical to have, just that it might be a nice way to capture potential bugs in our code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, for collections I do believe we can reset it without having side-effects, and the pros are we can add this assertion.
For cons, handling -1
adds an if check every time we access the map (instanceId >= 0
) and since we already do the ==
check, we don't strictly need it. Also, for entities that would be not ideal, as we'd need to instrument enhanced entities' constructors and when resetting we'd run into an additional itable stub
call, which impacts performance.
Not sure what I prefer, but keeping both behaviors aligned seems more consistent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The resetting happens in org.hibernate.engine.internal.EntityEntryContext#removeEntityEntry
and there are currently 2 itable callsites there, I suggest we introduce a new method that handles "reset" which manages previous/next entries and the instance id, which will even improve perf further :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A single method for previous/next is a great improvement idea! Why not do that as a separate improvement?
I would not stress too much on resetting ids though, as we've already discussed there might still be cases which we need to handle (with the ==
check), and it will require an additional check on the id when accessing the collection, so I don't see that much additional value by doing that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A single method for previous/next is a great improvement idea! Why not do that as a separate improvement?
Sounds good to me.
I would not stress too much on resetting ids though, as we've already discussed there might still be cases which we need to handle (with the == check), and it will require an additional check on the id when accessing the collection, so I don't see that much additional value by doing that.
Like I wrote before, it's not super critical to have this, it would IMO just be nice because it can be a nice safety belt for us. Since we'd only check the id in assertions, regular code would not be impacted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hibernate-core/src/main/java/org/hibernate/collection/spi/AbstractPersistentCollection.java
Outdated
Show resolved
Hide resolved
if ( immutableManagedEntityXref == null ) { | ||
immutableManagedEntityXref = new IdentityHashMap<>(); | ||
if ( !isReferenceCachingEnabled( entityEntry.getPersister() ) ) { | ||
managed.$$_hibernate_setInstanceId( nextManagedEntityInstanceId() ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
managed.$$_hibernate_setInstanceId( nextManagedEntityInstanceId() ); | |
assert managed.$$_hibernate_getInstanceId() == -1; | |
managed.$$_hibernate_setInstanceId( nextManagedEntityInstanceId() ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The entity might have been associated to a different persistence context, we don't reset instance ids.
hibernate-core/src/main/java/org/hibernate/engine/internal/StatefulPersistenceContext.java
Outdated
Show resolved
Hide resolved
*/ | ||
public interface InstanceIdentity { | ||
/** | ||
* Retrieve the unique identifier of this instance |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* Retrieve the unique identifier of this instance | |
* Retrieve the unique identifier of this instance, or {@code -1} if the instance is not associated to a persistence context. |
@@ -225,6 +225,14 @@ private DynamicType.Builder<?> doEnhance(Supplier<DynamicType.Builder<?>> builde | |||
EnhancerConstants.USE_TRACKER_SETTER_NAME | |||
); | |||
|
|||
builder = addFieldWithGetterAndSetter( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be nice to initialize the field to -1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We'd need to intercept and enhance all entity constructors to do that. Also, since we don't reset instance ids and we have the ==
check to make sure we have the actual same object, IMO setting this to -1
doesn't bring much value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAIU, you just have to call .value( -1 )
after defineField()
e.g.
.defineField( fieldName, type, constants.fieldModifierPRIVATE_TRANSIENT )
.value( -1 )
.annotateField( constants.TRANSIENT_ANNOTATION )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That seems to be only for static
fields, see raphw/byte-buddy#553 (comment).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Resetting the value could happen in org.hibernate.engine.internal.EntityEntryContext#removeEntityEntry
which also manages the entity entry reference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would result in an itable stub
invocation, which would degrade performance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, it seems all constructors would have to initialize this field :/
raphw/byte-buddy#227
if ( page != null ) { | ||
final int offset = toPageOffset( keyIndex ); | ||
final Object k = page.get( offset ); | ||
if ( k == key ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe throw an exception when the keys don't match?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't: in case of "pre-set" instance-ids (see our previous discussion), it's plausible that we find an existing instance in the array at the same index but that isn't the same object. IMO we should just return null
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that is dependent on whether we want/can ensure instance ids are reset.
final int pageOffset = toPageOffset( keyIndex ); | ||
Object k = page.set( pageOffset, null ); | ||
// Check that the provided instance really matches with the key contained in the store | ||
if ( k == key ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe throw an exception when the keys don't match?
final Map.Entry<K, V> entry = page.set( pageOffset, null ); | ||
// Check that the provided instance really matches with the key contained in the map | ||
if ( entry != null ) { | ||
if ( entry.getKey() == key ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe throw an exception when the keys don't match?
*/ | ||
public @Nullable V get(int instanceId, Object key) { | ||
final Entry<K, V> entry = get( instanceId ); | ||
if ( entry != null && entry.getKey() == key ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe throw an exception when the keys don't match?
hibernate-core/src/main/java/org/hibernate/internal/util/collections/AbstractPagedArray.java
Fixed
Show fixed
Hide fixed
hibernate-core/src/main/java/org/hibernate/internal/util/collections/InstanceIdentityMap.java
Fixed
Show fixed
Hide fixed
hibernate-core/src/main/java/org/hibernate/internal/util/collections/InstanceIdentityMap.java
Fixed
Show fixed
Hide fixed
hibernate-core/src/main/java/org/hibernate/internal/util/collections/InstanceIdentityMap.java
Fixed
Show fixed
Hide fixed
hibernate-core/src/main/java/org/hibernate/internal/util/collections/InstanceIdentityMap.java
Fixed
Show fixed
Hide fixed
hibernate-core/src/main/java/org/hibernate/internal/util/collections/InstanceIdentityMap.java
Fixed
Show fixed
Hide fixed
hibernate-core/src/main/java/org/hibernate/internal/util/collections/InstanceIdentityMap.java
Fixed
Show fixed
Hide fixed
hibernate-core/src/main/java/org/hibernate/internal/util/collections/InstanceIdentityMap.java
Fixed
Show fixed
Hide fixed
hibernate-core/src/main/java/org/hibernate/internal/util/collections/InstanceIdentityMap.java
Fixed
Show fixed
Hide fixed
hibernate-core/src/main/java/org/hibernate/internal/util/collections/InstanceIdentityMap.java
Fixed
Show fixed
Hide fixed
entities Also, treat immutable, enhanced, reference-cacheable entities as non-enhanced
https://hibernate.atlassian.net/browse/HHH-18326
This PR tries to address the performance impact of relying on identity hashes to track persistent instances. See the initial benchmark results for the current state here: hibernate/hibernate-orm-benchmark#15.
This strategy is using a unique
int
(but could belong
too) instance id that is assigned once an entity/collection is made persistent, and through that maps instances to a new collection (InstanceIdentityMap
) that simply stores information in an array using the unique id as index. This allows us to avoid relying on identity hash code for:I will add more details on the proposed change along with benchmarks results of using the new strategy in the linked
hibernate-orm-benchmarks
PR, but to synthetize them here:8.4%
more ops/s than before and we align with the result of mutable entities (that do not need to rely on a map at all); increasing the entity count to1_000
the result is similar, with9.5%
more ops/s. Looking at the flamegraphsStatefulPersistenceContext.addEntry
matches are basically aligned between mutabletrue
andfalse
tests.18.3%
more ops/s with a mapping of only 2 collections (increasing to more than50%
for 7+ collections) if we're not initializing the associations (matches forStatefulPersistenceContext.addCollection
go down from17.4%
to1.8%
; if we do initialize the collections lazily (i.e. 1 additional query for each collection) we still get a very relevant performance increase with a 5-10% improvement in ops/s for 2 to 7 mapped collections, dropping to just 1% with 16 mapped collections.Note this solution requires a change in the bytecode-enhanced entity classes, adding a new field to store the required unique ID. Other than that, only the
ManagedEntity
andPersistentCollection
SPI interfaces have been changed. Marking this as draft as it will probably need to wait until after 7.0.0 is released.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license
and can be relicensed under the terms of the LGPL v2.1 license in the future at the maintainers' discretion.
For more information on licensing, please check here.