feat(gossipsub): use `Bytes` to cut down on allocations #4751
I purposefully avoided making sig/key bytes at this point just to keep this slim, but I don't think there'd be any blockers there? (Also: I debated an API change to just bytes, though I don't think the current change would cause a clone if you already pass |
This makes sense to me. Couple of things:
- If we just change internals we don't need a changelog entry.
- You can't change generated code.
- I am open to changing the API if we keep it as `impl Into<Bytes>`.
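A rough sketch of the `impl Into<Bytes>` API shape suggested above. To stay std-only, `SharedBytes` is a hypothetical stand-in for `bytes::Bytes`, and `publish` is an illustrative function, not gossipsub's actual signature. The idea is that callers holding a `Vec<u8>` can keep passing it, while callers that already hold the shared type pass it through without any conversion cost:

```rust
use std::sync::Arc;

/// Hypothetical std-only stand-in for `bytes::Bytes`:
/// an immutable, reference-counted byte buffer.
#[derive(Clone, Debug)]
pub struct SharedBytes(Arc<Vec<u8>>);

impl From<Vec<u8>> for SharedBytes {
    fn from(v: Vec<u8>) -> Self {
        // Moves the Vec into the Arc. The payload's heap buffer is reused,
        // not copied; only the small Arc block itself is allocated.
        SharedBytes(Arc::new(v))
    }
}

impl SharedBytes {
    /// Address of the first payload byte, handy for checking that no copy happened.
    pub fn as_ptr(&self) -> *const u8 {
        self.0.as_ptr()
    }

    pub fn as_slice(&self) -> &[u8] {
        &self.0
    }
}

/// Illustrative publish-style entry point (an assumption, not the real API).
/// Accepts anything convertible into the shared buffer type, including
/// `SharedBytes` itself via the standard identity conversion.
pub fn publish(data: impl Into<SharedBytes>) -> SharedBytes {
    data.into()
}
```

With this shape, `publish(vec)` and `publish(shared.clone())` both compile, and in neither case are the payload bytes copied.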
Thanks @thomaseizinger. I've updated the PR to:
|
This should help avoid potentially costly clones of the message as it is processed and published.
Thanks! Two minor comments :)
Okay, updated as per feedback 🙇 . |
Thank you for the work here.
Would you mind sharing some numbers before and after? E.g. a heaptrack profile of a node running with and without this patch. |
Abbbbsolutely. Literally doing a bit of heaptrackery the now. 👍 Thank you guys for all the work. I'm happy to wee PR in where I can! |
Great! A few more suggestions after looking at it closely now :)
Co-authored-by: Thomas Eizinger <[email protected]>
…related improvements
Good catches. Updated 👍 🙇 |
Okay, so here is main-no-libp2p-bytes-heaptrack.safenode.4817.zst.zip. Looking at allocations there (specifically searching for gossip), there are a good amount. This is roughly the same load in both instances (~1k nodes, same folders being uploaded, and so the same quantity of gossip msgs being generated). I've searched through the heaptracks (we have 20 nodes across the network here heaptracking) for worst-case examples. And here it is with the 3 PRs applied together: libp2p-bytes-updates-heaptrack.safenode.6027.zst.zip. The latter case actually has marginally more data go through it, as with main the network degrades to become unusable after not too long. This is where I'm comparing allocations. Total mem used / leaked etc. is about the same in both instances, as you'll see. But I think all the extra allocations cause a fair amount of load that's crippling us here (as we're perhaps sending much more than was originally envisaged over gossip). But allocations wise, the |
Great work, thank you!
You could look into patching https://github.com/tafia/quick-protobuf to use |
Thanks for weighing in here @jxs. @mxinden and I discussed this yesterday and we have some doubts about the optimisations presented in this PR. The issue is, the protobuf files contain owned values of data and thus, sooner or later, we will allocate for each message that is being sent. This can only be fixed if we start to use borrowed data for the protobuf structs which would actually be great. I think the last time we tried this, it didn't work well with
@joshuef With the above reasoning, I cannot explain why you are seeing performance improvement with this PR. Are you sure the two heaptrack snapshots show the same workload? They also show the |
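To illustrate the owned-versus-borrowed point in the comment above, here is a minimal std-only sketch. `OwnedProto` and `BorrowedProto` are hypothetical stand-ins for generated protobuf structs, not the types in rust-libp2p. It only shows why an owned `Vec<u8>` field forces a copy when a message is built for sending, while a borrowed field does not:

```rust
use std::borrow::Cow;

/// Hypothetical generated struct with an owned payload field.
pub struct OwnedProto {
    pub data: Vec<u8>,
}

/// Hypothetical generated struct with a borrowed payload field.
pub struct BorrowedProto<'a> {
    pub data: Cow<'a, [u8]>,
}

/// Building the owned struct from an application payload has to
/// allocate and copy, no matter how the payload was stored upstream.
pub fn to_owned_proto(payload: &[u8]) -> OwnedProto {
    OwnedProto {
        data: payload.to_vec(),
    }
}

/// Building the borrowed struct only records a reference to the
/// caller's buffer; no allocation happens here.
pub fn to_borrowed_proto(payload: &[u8]) -> BorrowedProto<'_> {
    BorrowedProto {
        data: Cow::Borrowed(payload),
    }
}
```

Under this assumption, a cheap-to-clone buffer upstream does not by itself avoid the copy at the point where the owned struct is filled in, which is the concern raised in the comment.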
@thomaseizinger within this PR there are no It also helps prevent / ease any handling of the messages for the consumer too. (I agree though, if we got this into the lower layers that'd be even better) |
But these sites don't show up in your heaptrack screenshots? Perhaps I am misreading it but that mostly shows the codec and the That is what makes me a bit doubtful that we are optimising the right thing here. I am definitely on-board with optimising memory usage. I just want to understand and see the improvement :) |
@joshuef Looking in more detail at the code and these screenshots, I am pretty confident I now know where these allocations are coming from. In the current implementation of
I think I improved on this greatly in #4782. Now, we write to the Currently, these allocations happen in It would be great if you could test the above PR and check if the number of allocations goes down :) |
Ah right, yeh I was just focussing on the gossip allocations there. You have the two heaptracks entirely in the same message there if you want to deep dive into the actual data. For me it's about being able to safely handle/consume/clone these as a user, and also within libp2p, I think. As The RawMessage and so on are cloned about in the libp2p code, as well as
I will absolutely have at that now. I assume in isolation from these PRs? I suspect it'll get allocations down, but we'll still want these changes or similar so clones of |
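The cloning concern in the comment above can be sketched as follows. This is a std-only illustration: `Arc<[u8]>` stands in for `bytes::Bytes`, and `OwnedMessage`, `SharedMessage`, and `fan_out` are hypothetical names rather than gossipsub types. Cloning a message that holds a `Vec<u8>` copies the whole payload each time, whereas cloning one that holds a reference-counted buffer only bumps a counter:

```rust
use std::sync::Arc;

/// Hypothetical message whose payload is deep-copied on every clone.
#[derive(Clone)]
pub struct OwnedMessage {
    pub data: Vec<u8>,
}

/// Hypothetical message whose payload is shared between clones.
/// `Arc<[u8]>` is a std-only stand-in for `bytes::Bytes`.
#[derive(Clone)]
pub struct SharedMessage {
    pub data: Arc<[u8]>,
}

/// Produce one clone of the message per peer, as a forwarding path might.
pub fn fan_out<M: Clone>(msg: &M, peers: usize) -> Vec<M> {
    (0..peers).map(|_| msg.clone()).collect()
}
```

Fanning an `OwnedMessage` out to N peers allocates N new payload buffers, while every `SharedMessage` clone points at the same buffer.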
We can definitely land these changes, I'd just like to see them having an impact first :)
I'd love to but ironically, |
Also, I was just poking about quick-protobuf w/r/t |
You are right! I came to the same conclusion after experimenting a bit now. |
This pull request has merge conflicts. Could you please resolve them @joshuef? 🙏 |
Description
Sets up gossipsub to use `Bytes` internally in place of `Vec<u8>` for message processing. This should help avoid potentially costly clones of the message as it is processed and published.
Notes & open questions
Change checklist