
7582: Add naive RLP caching for BlockHeader, Transaction, and Withdrawal #7988

Open
wants to merge 10 commits into base: main

Conversation

Matilda-Clerke
Contributor

PR description

Add naive RLP caching for BlockHeader, Transaction, and Withdrawal. This PR currently ignores Block, BlockBody, and TransactionReceipt due to complexity.

Issue

#7582

…ng-during-sync

# Conflicts:
#	ethereum/core/src/main/java/org/hyperledger/besu/ethereum/core/BlockHeader.java
Signed-off-by: Matilda Clerke <[email protected]>
@fab-10
Contributor

fab-10 commented Dec 5, 2024

I see the potential benefit of the caching, but since this is a tradeoff between compute and memory, have you done benchmarks to quantify the performance benefit of the change, and the possible impact on memory?

@Matilda-Clerke
Contributor Author

Matilda-Clerke commented Dec 8, 2024

I see the potential benefit of the caching, but since this is a tradeoff between compute and memory, have you done benchmarks to quantify the performance benefit of the change, and the possible impact on memory?

Fair point. I modified the BlockHeadersMessageTest locally to produce 10000 headers and timed the looped writeTo calls in BlockHeadersMessage. The original code performed the 10000 writeTo calls in 18 to 23ms, while the updated code performed the 10000 writeTo calls in 13 to 18ms. In terms of memory, it's actually just using indexes to the original underlying array, so it should only be a small amount of extra memory used.
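
The idea being timed can be sketched as a memoised encode (hypothetical, simplified names, and a byte-returning method rather than Besu's actual stream-based writeTo): the first call pays for encoding, every later call reuses the cached bytes.

```java
import java.util.Optional;

// Sketch only: hypothetical Header class, not Besu's BlockHeader.
public class CachedRlpExample {
    static int encodeCalls = 0; // counts how often the expensive encoding runs

    static final class Header {
        // rawRlp holds a previously computed encoding, if we have one
        private Optional<byte[]> rawRlp = Optional.empty();

        byte[] writeTo() {
            // reuse the cached encoding when present, otherwise encode and cache
            if (rawRlp.isEmpty()) {
                rawRlp = Optional.of(encode());
            }
            return rawRlp.get();
        }

        private byte[] encode() {
            encodeCalls++;
            return new byte[] {0x01, 0x02}; // stand-in for the real RLP encoding
        }
    }

    public static void main(String[] args) {
        Header h = new Header();
        h.writeTo();
        h.writeTo();
        h.writeTo();
        System.out.println(encodeCalls); // the expensive encode ran only once
    }
}
```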

@fab-10
Contributor

fab-10 commented Dec 9, 2024

So the micro-benchmark confirms there is a gain on the compute side, which is worth exploring further. For that, I suggest running a real sync on mainnet with this change and comparing the results against some control instances.

@ahamlat
Contributor

ahamlat commented Dec 9, 2024

it's actually just using indexes to the original underlying array, so it should only be a small amount of extra memory used.

Agreed, but those references will keep strong links to the underlying byte arrays, preventing them from being garbage collected. I would like a rationale for what this feature is going to improve. It is an interesting improvement, but is there a real use case for it? If so, we need to evaluate how much latency improvement we get for how much memory overhead.
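
A minimal sketch of that concern (SliceView is a hypothetical stand-in, not Besu code): a small cached view keeps the entire backing array strongly reachable even after every other reference to it is dropped.

```java
// Hypothetical view class: a strong reference to a shared backing array
// plus an offset and a length.
final class SliceView {
    final byte[] backing; // strong reference to the WHOLE array, not just the slice
    final int offset;
    final int length;

    SliceView(byte[] backing, int offset, int length) {
        this.backing = backing;
        this.offset = offset;
        this.length = length;
    }
}

public class GcPinningExample {
    public static void main(String[] args) {
        byte[] blockRlp = new byte[2 * 1024 * 1024]; // e.g. a ~2 MB block RLP
        // A 500-byte header view still pins the full 2 MB array in memory:
        SliceView headerRlp = new SliceView(blockRlp, 0, 500);
        blockRlp = null; // dropping the local does not free the array...
        System.out.println(headerRlp.backing.length); // ...the view still reaches all 2 MB
    }
}
```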

@fab-10
Contributor

fab-10 commented Dec 9, 2024

On the implementation side, have you thought about different approaches to the caching?
I am thinking in particular of a generic, flexible approach with an external cache that can keep the RLP-encoded form of many different types, which we could tag as RLP-cacheable. This probably requires a certain amount of refactoring, but it could be the base for a reusable framework in the long term.

@Matilda-Clerke
Contributor Author

From the micro benchmark, we'd expect a time saving of around 10 seconds for 10 million block headers. Realistically, not a noticeable gain on our current 20+ hour sync. However, it should give us a small saving of CPU utilisation.

Memory breakdown is as follows:
4 bytes to reference the Optional
4 bytes to reference ArrayWrappingBytes
4 bytes to reference the array (reference, not a copy)
4 bytes to store the offset into the array at which the ArrayWrappingBytes view starts
4 bytes to store the length of the region the ArrayWrappingBytes view covers
20 bytes per header total additional memory

Regarding garbage collection: If the headers are currently getting garbage collected (I'd expect they are, but not sure), these references won't prevent the underlying array from being garbage collected.
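
To make the "just indexes into the original array" point concrete, here is a minimal sketch (BytesView is a hypothetical stand-in for an ArrayWrappingBytes-style class): the cached rawRlp view and any other view share one backing array, so caching adds a few references and two ints, not a copy of the bytes.

```java
// Hypothetical ArrayWrappingBytes-style view: a reference to a shared
// backing array plus an offset and a length, with no copying.
final class BytesView {
    final byte[] backing;
    final int offset;
    final int length;

    BytesView(byte[] backing, int offset, int length) {
        this.backing = backing;
        this.offset = offset;
        this.length = length;
    }
}

public class SharedBackingExample {
    public static void main(String[] args) {
        byte[] headerRlp = new byte[600];                       // the header's encoded bytes
        BytesView rawRlp = new BytesView(headerRlp, 0, 600);    // cached whole encoding
        BytesView someField = new BytesView(headerRlp, 4, 32);  // a field's sub-range
        // Both views point at the very same array object; no bytes were duplicated:
        System.out.println(rawRlp.backing == someField.backing); // true
    }
}
```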

For this PR, I was looking just for some potential quick wins while continuing to focus mainly on my refactoring of peer tasks. @pinges is working on a more complicated scheme which avoids decoding the RLP for large portions of the block body, in addition to avoiding re-encoding as we're doing here.

@Matilda-Clerke
Contributor Author

Backed out the Withdrawal changes, as we discussed, since we don't believe those changes will ever be used.

@ahamlat
Contributor

ahamlat commented Dec 13, 2024

4 bytes to reference the Optional

Consider using JOL (Java Object Layout) or a heap dump to have more accurate numbers. With CompressedOOPs enabled, each Java object has a header of 12 Bytes if the JVM heap < 32 GiB or 16 Bytes if JVM heap >= 32 GiB. Depending on the total number of bytes for each object, there is internal and external padding to make the size a multiple of 8.
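
As a rough back-of-the-envelope version of that rule (assuming compressed oops, a 12-byte object header, 4-byte references, and 8-byte alignment; JOL would give exact, JVM-verified numbers):

```java
// Shallow-size arithmetic under assumed layout rules: 12-byte object header
// (compressed oops, heap < 32 GiB), then pad the total to a multiple of 8.
public class ShallowSizeExample {
    static long align8(long size) {
        return (size + 7) & ~7L; // round up to the next multiple of 8
    }

    public static void main(String[] args) {
        long header = 12;                            // object header with compressed oops
        long smallFields = 4 + 4 + 4;                // e.g. one reference + two ints
        System.out.println(align8(header + smallFields)); // 24
        // The 20 bytes of fields estimated earlier in the thread would shallow-size to:
        System.out.println(align8(header + 20));          // 32
    }
}
```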

20 bytes per header total additional memory

I think we should focus on the RLP raw data, the underlying byte array, not the headers. The current calculation focuses more on the shallow size because it is deterministic, and takes into account only the headers and the references. With the RLP stored in the transaction, we need to evaluate the impact on the transactionPool in terms of memory usage and garbage collection.

@Matilda-Clerke
Contributor Author

Matilda-Clerke commented Dec 19, 2024

Here's a BlockHeader from a heap dump
[screenshot: a BlockHeader instance from the heap dump]
I'm not sure exactly how to read this (e.g. the integers don't have a size listed, are they included in their parent instance's size?), but it seems like double digits of extra bytes per header or transaction. What do you think?

@Matilda-Clerke
Contributor Author

We can see from the references on the underlying byte array that it is being reused across many ArrayWrappingBytes instances, including the new rawRlp references
[screenshot: references on the underlying byte array]

@ahamlat
Contributor

ahamlat commented Dec 19, 2024

I'm not sure exactly how to read this (e.g. the integers don't have a size listed, are they included in their parent instance's size?), but it seems like double digits of extra bytes per header or transaction. What do you think?

In the screenshot below, we can see that the block header holds a reference to a ~2 MB byte array (rlp), which I guess is the RLP of the block, though that would be a big block; we can see the offset and the length inside that ArrayWrappingBytes.

[screenshot: block header referencing a ~2 MB rlp byte array]

We can see from the references on the underlying byte array that it is being reused across many ArrayWrappingBytes instances, including the new rawRlp references

I guess all the transactions of that specific block, and the header, are referencing the same underlying byte array. Now we need to evaluate the real memory overhead. To do that, we need the number of live blocks in memory during sync and the average size of the RLP. If you can share a heap dump taken during sync with your PR, I will extract the numbers.

Contributor

@ahamlat ahamlat left a comment


I have a suggestion: initialise the raw RLP while decoding the transaction from the RLP. Overall, I think there is not much overhead when I look at the heap dump, as the underlying byte array is already referenced by the transaction's payload and to address fields.

return Transaction.builder()
.copiedFrom(transaction)
.rawRlp(Optional.of(transactionRlp.raw()))
.build();
Contributor


Rather than calling a builder again just to add the RLP, I would do it directly in each transaction decoder, just after the payload, as they're very similar fields:

Contributor


Using the data I collected from the shared heap dump, the payload field is a subset of the rawRlp field.
[heap dump screenshots: the RawRlp field and the Payload field]

Contributor


At the transaction level, as the payload already references the same underlying byte array, the only overhead is the rawRlp reference.

Contributor Author


My only concern with applying this in the transaction encoder/decoder classes is that they seem to populate only a subset of the Transaction fields, so if we instead supply the full original RLP in the encoder, the output may be significantly larger than it is currently.

What do you think?

Contributor


There will be no difference, as rawRlp and payload are just references to the same byte array with an offset and a length; it doesn't change the size of the transaction object. With copiedFrom, we lose the first reference to the transaction as we're creating another one.
The question I didn't investigate is whether we have the (whole) rawRlp of the transaction at this level.

@Matilda-Clerke
Contributor Author

It seems these latest changes aren't quite working right. In particular, a block encoded and written to blockchainStorage is causing RLP errors when read back and decoded. I'm parking this briefly for now to progress some other issues.
