-
Notifications
You must be signed in to change notification settings - Fork 20.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
trie: optimize memory allocation #30932
base: master
Are you sure you want to change the base?
Conversation
Before I dive into the nitty gritty; can you say anything about what's done here? |
|
CI is failing Investigation needed |
c9b1b6c
to
04eb383
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some comments I made yesterday that got stuck in 'pending'
// do that, since we don't overwrite/reuse keys | ||
// cached.Key = common.CopyBytes(n.Key) | ||
func (h *hasher) hashShortNodeChildren(n *shortNode) *shortNode { | ||
var collapsed shortNode |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is essentially still a copy
. You don't invoke copy()
, but you create a new one and set it's fields. Was it not possible to do an in-place version here?
Just curious
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it's still a copy.
It's not possible to do an in-place modify. If so, the child will be replaced by its hash and the entire trie will be collapsed into a single hash. It's suitable for commit operation, but not for hash operation. The trie being hashed is still available for following usage.
e03fd3d
to
9156cec
Compare
9156cec
to
25ae4e5
Compare
case hashNode: | ||
return n | ||
default: | ||
panic(fmt.Sprintf("%T: unknown node type", n)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
panic(fmt.Sprintf("%T: unknown node type", n)) | |
panic(fmt.Sprintf("%T: unknown node type: %v", n, n)) |
for i := 0; i < 10; i++ { | ||
tr.Update(testrand.Bytes(32), testrand.Bytes(32)) | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tr.Hash() | |
tr.Commit(false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is here missing the Commit
?
I run a benchmark with the master branch code about the
|
@@ -1311,3 +1311,171 @@ func printSet(set *trienode.NodeSet) string { | |||
} | |||
return out.String() | |||
} | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
func BenchmarkTestTrieCopy(b *testing.B) { | |
b.ResetTimer() | |
tr := NewEmpty(nil) | |
for i := 0; i < 256; i++ { | |
tr.Update(testrand.Bytes(32), testrand.Bytes(32)) | |
} | |
for i := 0; i < b.N; i++ { | |
copiedTrie := tr.Copy() | |
tr.Update(testrand.Bytes(32), testrand.Bytes(32)) | |
_ = copiedTrie | |
} | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been reviewing and toying with this code for quite a while now. As far as I can tell, these changes are fine. I'll click "approve", but not necessarily with the intention of "go ahead and merge asap".
The workings of the trie is very core to the correct operations. Before we merge this -- how long has it been running on benchmarkers?
Can't remember honestly, probably one or two days. You are right, that trie is very core and sensitive. I am totally fine to on hold it a bit, let's say to deploy it on benchmark for syncing the whole chain and include it in 1.15.1 maybe. As it's definitely not an urgent thing |
This pull request removes the node copy operation to reduce memory allocation. Key Changes as below:
(a) Use
decodeNodeUnsafe
for decoding nodes retrieved from the trie node readerIn the current implementation of the MPT, once a trie node blob is retrieved, it is passed to
decodeNode
for decoding. However,decodeNode
assumes the supplied byte slice might be mutated later, so it performs a deep copy internally before parsing the node.Given that the node reader is implemented by the path database and the hash database, both of which guarantee the immutability of the returned byte slice. By restricting the node reader interface to explicitly guarantee that the returned byte slice will not be modified, we can safely replace
decodeNode
withdecodeNodeUnsafe
. This eliminates the need for a redundant byte copy during each node resolution.(b) Modify the trie in place
In the current implementation of the MPT, a copy of a trie node is created before any modifications are made. These modifications include:
This mechanism ensures that modifications only affect the live tree, leaving all previously created copies unaffected.
Unfortunately, this property leads to a huge memory allocation requirement. For example, if we want to modify the fullNode for n times, the node will be copied for n times.
In this pull request, all the trie modifications are made in place. In order to make sure all previously created copies are unaffected, the
Copy
function now will deep-copy all the live nodes rather than the root node itself.With this change, while the
Copy
function becomes more expensive, it's totally acceptable as it's not a frequently used one. For the normal trie operations (Get, GetNode, Hash, Commit, Insert, Delete), the node copy is not required anymore.