-
Notifications
You must be signed in to change notification settings - Fork 248
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
opt: reduce identification broadcast from one per farm to one per farmer, also for caches #2945
opt: reduce identification broadcast from one per farm to one per farmer, also for caches #2945
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, I just had some questions about changing how much detail we log.
They are all optional changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for those changes, I'm going to leave this to a more experienced reviewer to do the final approval.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see you have tackled the description of the #2900 literally by reducing number of network messages, but that isn't quite the solution I meant.
We still have a similar amount of network bandwidth used and we still process every single identification message just like we did before, so this probably improves situation a bit, but not fundamentally. It also breaks wire messages for those who upgrade components one by one.
What I expected to happen instead is this:
- farmer and cache application instances generate ephemeral IDs during startup and use them for identification
- controller keeps track of ephemeral farmers/caches instances and their internal farms/caches
- when new farmer/cache ephemeral instance is discovered, stream request is made to retrieve the details, but otherwise details are not sent over the wire at all, saving both bandwidth on the network and compute on the controller since controller needs to do a single check for the whole bunch of potential farms
let ClusterFarmerIdentifyFarmBroadcast { details } = identify_message; | ||
for detail in details { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd extract another function for the loop so the indentation is kept the same before and after for the actual logic
@nazar-pc Oh, so you're saying that we're actually only checking their identity once by creating a stream, right? And then it keeps the stream connected, and pushes the new state when it changes. |
Yeah, it's a better way to do it, and it's still forward compatible. Let me turn to him. |
Not really. The stream will only be used to send individual farm details. But as long as ephemeral farmer ID doesn't change, we know that farmer application didn't restart and don't need to worry about individual farms in it. In fact if we derive ephemeral farmer ID from fingerprints of individual farms, we don't need to send individual fingerprints in farm details anymore and we'll be able to support quick farmer restarts without losing state on the controller. So current farm notifications will be replaced with farmer notifications and farm details will be sent in a stream only if/when necessary. |
4ac9713
to
5506663
Compare
@nazar-pc I'm a bit confused here, do you mean that we include farm information (like some hash) in the farmer ID so that the derived id doesn't change every time the farmer is restarted if the configuration and machine don't change? |
Exactly. See how |
This is the first step towards #2900.
Next I'll try to replace the request with a stream(if its still necessary), and I think we can optimise the ids in the farmer (cache), e.g. by incrementing the first id, so that when we send an identify we only need one id, and the rest can be calculated directly from the index. For very large farmers, this should make the identify message much smaller.
The first commit is some moving and renaming(hope I'm right this time.).
Code contributor checklist: