2652 Membership::set_first_epoch #2654
Conversation
Force-pushed from 516ba06 to 2ff3c8b
Force-pushed from a97d405 to 4bf4660
hotshot-task-impls/src/helpers.rs (Outdated)

    if let Some(decided_leaf_info) = res.leaf_views.last() {
        decide_epoch_root(&decided_leaf_info.leaf, epoch_height, membership).await;
    } else {
        tracing::info!("No decided leaf while a view has been decided.");
    }
}

if let Some(ref decided_upgrade_cert) = res.decided_upgrade_cert {
Can we combine this if let and the one two levels down (i.e. if let (Some(..), Some(..)) = (.., ..) or similar)?
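For illustration, a sketch of that combined tuple pattern, assuming the res fields from the snippet above (the two bodies would still need to be merged by hand):

    if let (Some(decided_upgrade_cert), Some(decided_leaf_info)) =
        (res.decided_upgrade_cert.as_ref(), res.leaf_views.last())
    {
        // ... the body that currently sits two levels down, now with both bindings in scope ...
    }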
hotshot-testing/src/test_runner.rs (Outdated)

    temp_memberships
        .set_first_epoch(<TYPES as NodeType>::Epoch::new(1), INITIAL_DRB_RESULT);
}
I don't think we should do this in hotshot-testing; is there a reason we can't just do this in SystemContext::init?
Ended up removing this entirely as claim_block_with_num_nodes updates the number of nodes already.
hotshot/src/lib.rs (Outdated)

@@ -297,6 +298,16 @@ impl<TYPES: NodeType, I: NodeImplementation<TYPES>, V: Versions> SystemContext<T
            config.epoch_height,
        );

        if let Some(epoch) = epoch {
            // #2652 REVIEW NOTE: This epoch can't be right; what if a node starts up between when we upgrade to epochs
            // and when we're at epoch+2? This would cause the current node to propagate the non-epochs stake table
I'm also not sure this is the right way to do it. Can we maybe move this to start_consensus and do it in the same place we broadcast the genesis QC (~ line 440)? That seems like it might be closer to the semantics we want. I think the check there should be on V::Base::VERSION, not anything else. This won't handle (at all) the case where we're restarting with epochs, but that should be handled by @bfish713's catchup.
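A minimal sketch of that check, assuming the membership handle and constants used elsewhere in this PR are in scope at that point in start_consensus:

    // Assumption: epochs are enabled from genesis whenever the base version already includes them.
    if V::Base::VERSION >= V::Epochs::VERSION {
        membership
            .write()
            .await
            .set_first_epoch(TYPES::Epoch::new(1), INITIAL_DRB_RESULT);
    }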
I feel like we don't need to set this epoch when we start consensus. If we set it to our current epoch (based on the anchor leaf), I think that value will never be correct. I think we should only set this when the epoch upgrade succeeds, and it should be set to a specific epoch (i.e. the upgrade agrees to start epochs at a specific height).
If we start up with epochs enabled for genesis, the Membership trait needs to know that epochs 1 & 2 have no epoch root/DRB. I agree we should not use the anchor leaf, but I think we have to specifically do it for genesis unless I'm missing something.
I think this needs to be handled in a general way: if epochs are enabled when we start, we need to load with an epoch root or roots. I think this should be included in the HotshotInitializer. We should also make sure we can load up again with the stored epoch, like how we do with the anchor leaf.
I agree we can't just ignore this at startup; I was mistaken. I was only considering the mainnet upgrade.
I don't really have any comments. I'll let someone else approve after the discussion about setting the initial first epoch is resolved. I do have a question, but I'll take it offline since it's more of a generic question and I don't think it will affect the PR in any way.
Force-pushed from 681a0b5 to 8e85bb6
Force-pushed from b987c63 to 8710c56
Left some comments. Notably, I think there aren't any methods storing the initializer info; will that be a separate PR? Otherwise I think it looks fine overall. I'll leave it to @bfish713 or @lukaszrzasik to approve (since I accidentally merged the upgrade logic into this branch and I don't want to approve my own code).
hotshot/src/lib.rs (Outdated)

@@ -243,6 +244,53 @@ impl<TYPES: NodeType, I: NodeImplementation<TYPES>, V: Versions> SystemContext<T
        .await
    }

    async fn load_start_epoch_info(
Is there a reason this is a method on SystemContext? It seems like it should either be a method on Membership or a stand-alone function.
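For comparison, a stand-alone shape might look roughly like this (the signature and the EpochInfo type are placeholders, not the actual items from this PR):

    // Hypothetical free-function version; parameter types are illustrative only.
    async fn load_start_epoch_info<TYPES: NodeType>(
        membership: &RwLock<TYPES::Membership>,
        start_epoch_info: &[EpochInfo<TYPES>],
    ) {
        // same body as the current SystemContext method
    }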
hotshot/src/lib.rs (Outdated)

// If the first element in start_epoch_info does not have a block header,
// use that to set_first_epoch instead
if let Some(epoch_info) = start_epoch_info.first() {
    if epoch_info.block_header.is_none() {
        membership
            .write()
            .await
            .set_first_epoch(epoch_info.epoch, INITIAL_DRB_RESULT);
    }
}
I would actually suggest just calling set_first_epoch in the loop, as part of the else on if let Some(block_header). I think doing it the way it's done here is more likely to result in unexpected behaviour, because we can silently discard parts of the initializer. And I don't think there's much to gain from second-guessing the initializer: if it's misconfigured and results in set_first_epoch being called twice, it shouldn't be load_start_epoch_info's job to second-guess it.
By the way, I do think we should log every call to set_first_epoch or add_drb_result, maybe at tracing::debug!().
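A sketch of those two suggestions combined, using the field names from the snippet above (the branch for epochs that already have a block header is elided):

    for epoch_info in &start_epoch_info {
        if let Some(block_header) = &epoch_info.block_header {
            // existing handling for epochs that already have a block header / epoch root
        } else {
            tracing::debug!("calling set_first_epoch for epoch {:?}", epoch_info.epoch);
            membership
                .write()
                .await
                .set_first_epoch(epoch_info.epoch, INITIAL_DRB_RESULT);
        }
    }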
done and done
Addressed. The methods for storing it are part of a separate task.
hotshot-task-impls/src/helpers.rs (Outdated)

let decided_block_number = decided_leaf_info.leaf.block_header().block_number();
let first_epoch_number =
    TYPES::Epoch::new(epoch_from_block_number(decided_block_number, epoch_height));
I thought the plan was to have the block height configured for the first epoch, but this code is getting the epoch from when we decide the upgrade. Is that change for a different PR, or am I misunderstanding/misremembering?
let epoch_height = consensus_reader.epoch_height;
drop(consensus_reader);

if with_epochs && res.new_decided_view_number.is_some() {
FYI, this function stops being called when we upgrade; decide_from_proposal_2 gets activated, so any logic specific to epochs would need to be in there too.
My understanding here was that once we're in epochs we don't want to set_first_epoch any more anyway. Either way, this logic moved due to the next comment.
You are correct, once we're in epochs we don't have to worry about epoch upgrade stuff.
@@ -390,21 +391,44 @@ pub(crate) async fn handle_quorum_proposal_validated<
        .await
    };

    if let Some(cert) = &task_state.staged_epoch_upgrade_certificate {
        if leaf_views.last().unwrap().leaf.height() >= task_state.epoch_upgrade_block_height {
So here we are waiting to apply the upgrade until the configured height, but we already set the first epoch; I think that needs to happen here.
membership
    .write()
    .await
    .set_first_epoch(first_epoch_number, INITIAL_DRB_RESULT);
Set this in the block below:
if let Some(cert) = decided_upgrade_cert.clone() {
if cert.data.new_version == V::Epochs::VERSION {
task_state.staged_epoch_upgrade_certificate = Some(cert);
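A sketch of the combined result (assuming first_epoch_number and membership are still in scope at that point):

    if let Some(cert) = decided_upgrade_cert.clone() {
        if cert.data.new_version == V::Epochs::VERSION {
            task_state.staged_epoch_upgrade_certificate = Some(cert);
            membership
                .write()
                .await
                .set_first_epoch(first_epoch_number, INITIAL_DRB_RESULT);
        }
    }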
@@ -390,21 +391,56 @@ pub(crate) async fn handle_quorum_proposal_validated<
        .await
    };

    if let Some(cert) = &task_state.staged_epoch_upgrade_certificate {
Clear this option after we apply the upgrade so we don't hit this on every proposal.
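One way to do that (a sketch; the height is checked first so the certificate is only taken once it can actually be applied):

    if leaf_views.last().unwrap().leaf.height() >= task_state.epoch_upgrade_block_height {
        // take() clears the staged certificate, so later proposals skip this branch entirely
        if let Some(cert) = task_state.staged_epoch_upgrade_certificate.take() {
            // ... apply the epoch upgrade using `cert` ...
        }
    }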
Closes #2652
This PR:
This PR does not:
Key places to review: