fix(swarm): rewrite NetworkBehaviour macro for more optimal code gen #5303
Numbers for new macro
Compared to #5026.
Wow, thank you for this!
I don't have the capacity to review this in detail, will defer to @jxs. Overall, performance improvements are always welcome. We don't have many explicit tests that ensure the derive macro works correctly though. Especially if we start to generate our own poll impls, I am a bit worried that we are introducing a source of very-hard-to-find bugs.
We could offer the new version behind a feature toggle and let users experiment with it instead of re-writing it right away. For example, we can start using it internally for all our tests etc.
What do you think?
Feature flag sounds very reasonable (since I want to use this in my project). In that case, I can introduce a required field in the behavior that stores the state needed to also improve the main poll implementation. The main idea behind the custom poll is simple:
// previous implementation
#(
    if let std::task::Poll::Ready(event) = self.#fields.poll(cx) {
        return std::task::Poll::Ready(event
            .map_custom(#to_beh::#var_names)
            .map_outbound_open_info(#ooi::#var_names)
            .map_protocol(#ou::#var_names));
    }
)*
// proposed implementation
let mut fuel = #beh_count;
while fuel > 0 {
    // save the poll position to avoid repolling exhausted handlers
    match self.field_index {
        #(#indices => match self.#fields.poll(cx) {
            std::task::Poll::Ready(event) =>
                return std::task::Poll::Ready(event
                    .map_custom(#to_beh::#var_names)
                    .map_outbound_open_info(#ooi::#var_names)
                    .map_protocol(#ou::#var_names)),
            std::task::Poll::Pending => {}
        },)*
        _ => {
            self.field_index = 0;
            continue;
        }
    }
    self.field_index += 1;
    fuel -= 1;
}
In each poll, this polling pattern ensures we don't repoll other behaviors while exhausting events from some point of the hierarchy. Hopefully, the compiler is smart enough to use a branch table for the match. For this to work, I need to maintain an integer between poll calls, thus the extra field.
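To make the effect of the saved index concrete, here is a minimal hand-written sketch of the proposed loop for three fields. It is not the macro output: `Sub`, `Composed`, and `drain` are hypothetical names, the `Context` argument is dropped for brevity, and each sub-behaviour simply counts how often it was polled.

```rust
use std::collections::VecDeque;
use std::task::Poll;

/// Hypothetical stand-in for a sub-behaviour: yields queued events, then `Pending`.
struct Sub {
    queue: VecDeque<u32>,
    polls: usize, // how many times this sub-behaviour was polled
}

impl Sub {
    fn new(events: &[u32]) -> Self {
        Sub { queue: events.iter().copied().collect(), polls: 0 }
    }

    fn poll(&mut self) -> Poll<u32> {
        self.polls += 1;
        match self.queue.pop_front() {
            Some(ev) => Poll::Ready(ev),
            None => Poll::Pending,
        }
    }
}

/// Hand-written approximation of a derived behaviour with three fields.
struct Composed {
    a: Sub,
    b: Sub,
    c: Sub,
    field_index: usize, // the extra state kept between `poll` calls
}

impl Composed {
    fn poll(&mut self) -> Poll<u32> {
        let mut fuel = 3; // one unit of fuel per field
        while fuel > 0 {
            let res = match self.field_index {
                0 => self.a.poll(),
                1 => self.b.poll(),
                2 => self.c.poll(),
                _ => {
                    // wrapped past the last field; start over without spending fuel
                    self.field_index = 0;
                    continue;
                }
            };
            if let Poll::Ready(ev) = res {
                // the index is left unchanged, so the next call resumes at this field
                return Poll::Ready(ev);
            }
            self.field_index += 1;
            fuel -= 1;
        }
        Poll::Pending
    }
}

/// Polls until the composed behaviour reports `Pending`, collecting all events.
fn drain(c: &mut Composed) -> Vec<u32> {
    let mut out = Vec::new();
    while let Poll::Ready(ev) = c.poll() {
        out.push(ev);
    }
    out
}
```

With `a` empty, `b` holding two events, and `c` holding one, draining takes four calls to `Composed::poll`, yet `a` is polled only twice; the previous always-start-from-the-first-field scheme would poll it on every call.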
This pull request has merge conflicts. Could you please resolve them @jakubDoka? 🙏 |
Find myself doing this too lol. My implementation mimics existing |
@jakubDoka we have a rust-libp2p maintainers meeting where we all discuss technical changes/PRs, and I think you should attend. One is starting right now: https://lu.ma/2024-07-02-rust-libp2p and the one after that is in two weeks: https://lu.ma/2024-07-16-rust-libp2p
Hi @jakubDoka, thanks for this and sorry for the delay in reviewing, can you address the conflicts? Thanks!
@dhuseby sorry, I missed the last one, I'll hopefully show up on the 16th |
3f8eb31 to 4e4b594
Well that was an accident |
Description
I have rewritten the NetworkBehaviour derive macro to generate more optimal and faster-to-compile code. When using more behaviours (5, 10, 20), I noticed that performance degrades even though I benchmarked the same load. This is related to #5026.
The new macro implementation generates enums and structs for each type implementing the traits instead of type-level linked lists. In many cases, this makes the resulting types more compact (we store just one enum tag, whereas composed Eithers each need to store tags to make values addressable) and makes the enum dispatch constant. This also opened the opportunity to optimize UpgradeInfoIterator and ConnectionHandler into state machines (they now remember where they stopped polling/iterating and skip exhausted subhandlers/iterators). We could optimize the NetworkBehaviour itself too, but it would require users to put extra fields into the struct (this could be optional for backwards compatibility).
Change checklist
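As a rough illustration of the "one enum tag versus nested Either tags" point in the description, the sketch below compares the two shapes for four payloads. The local `Either`, the `Nested` alias, `Flat`, and `sizes` are hypothetical stand-ins, not the types the macro generates, and the `u64` payloads are an arbitrary choice. Rust's default enum layout is unspecified and the compiler can sometimes fold nested tags into unused bit patterns, so the sketch only reports both sizes rather than promising a particular saving.

```rust
use std::mem::size_of;

/// Minimal two-way sum type (hypothetical; mirrors the shape of `Either`).
#[allow(dead_code)]
enum Either<L, R> {
    Left(L),
    Right(R),
}

/// Four payloads composed as a type-level list of nested `Either`s.
type Nested = Either<Either<Either<u64, u64>, u64>, u64>;

/// The same four payloads as a single flat enum with one variant each.
#[allow(dead_code)]
enum Flat {
    A(u64),
    B(u64),
    C(u64),
    D(u64),
}

/// Returns `(size of the nested form, size of the flat form)` in bytes.
fn sizes() -> (usize, usize) {
    (size_of::<Nested>(), size_of::<Flat>())
}
```

Calling `sizes()` on a given toolchain shows whether nesting costs extra bytes for that payload mix; the flat form also needs only a single `match` to reach any variant, which is what the description calls constant dispatch.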