-
Notifications
You must be signed in to change notification settings - Fork 115
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
go/runtime/history: Ensure WatchBlocks emits increasing round sequence (no gaps) #6079
base: master
Are you sure you want to change the base?
go/runtime/history: Ensure WatchBlocks emits increasing round sequence (no gaps) #6079
Conversation
✅ Deploy Preview for oasisprotocol-oasis-core canceled.
|
type History interface { | ||
roothash.BlockHistory | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically this interface (History
) is no longer needed, and clients could reference struct directly ?
Draft: If reviewers agree I would prefer to get this in before #6050, possibly in parallel. :) |
This method was only used for testing, and is redundant since Commit correctly updates LastConsensusHeight.
Fixes existing problem where clients of WatchBlocks, may receive block with non-sequentially increasing rounds.
1044e1a
to
80cfc76
Compare
// If no local storage worker, and not during initial history reindex, | ||
// notify the block watcher that new block is committed. | ||
// Otherwise the storage-sync-checkpoint will do the notification. | ||
if h.hasLocalStorage || !h.reindexDone { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use existing mutex for h.reindexDone
, or don't protect it at all since it is currently only called from one thread/can be only overwritten to true. Not sure what is best practice in such scenario, probably protect everything to prevent future bugs if someone extends? :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that this solution is not simple and we should find a better one. Unless someone convinces me otherwise.
// Passing the special value `RoundLatest` will return results for the latest round. | ||
GetRoundResults(ctx context.Context, round uint64) (*RoundResults, error) | ||
// Calling this methods more then once has no additional side effect. | ||
ReindexFinished() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reindexing is happening one level higher, so the BlockHistory
should know nothing about it. Therefore, this doesn't belong here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We agreed to skip this for now in private and try to rework this sometime in the future. Leaving comment for when we proceed.
I still believe only non-storage related methods can stay in roothash.api.BlockHistory
since roothash service should not know anything about "nodes" and "(local) storage": fffd403 is probably a good step?
I agree ReindexFinished
is probably off, albeit notify
is probably off for the same reason, given we only use it to modify behaviour of WatchBlocks
: 1. WatchBlocks
should not be part of the roothash.api.BlockHistory
since it depends on storage syncing if node has local storage, and 2. it is misleading since in case of notify=false
blocks will still be eventually notified in case of node having a local storage.
return nil | ||
} | ||
h.blocksNotifier.Broadcast(blk) | ||
|
||
return nil | ||
} | ||
|
||
func (h *runtimeHistory) ConsensusCheckpoint(height int64) error { | ||
return h.db.consensusCheckpoint(height) | ||
func (h *runtimeHistory) ReindexFinished() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be private.
@@ -702,6 +703,10 @@ func (sc *serviceClient) processFinalizedEvent( | |||
return fmt.Errorf("failed to reindex blocks: %w", err) | |||
} | |||
tr.reindexDone = true | |||
if !tr.initialReindexDone { | |||
tr.initialReindexDone = true | |||
tr.blockHistory.ReindexFinished() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a simple solution. See how many if
s and flags we have in this method. We should refactor this code to make it more readable, not to make it more complex.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree this is not best, cross-referencing this issue/thread for the next steps.
Technically ReindexFinished
is idempotent, so this 3 lines could be replaced by tr.blockHistory.ReindexFinished()
. #6050 removed some of those flags. Finally, we could also factor out reindexing part to helper function, making code even simpler.
Alternative is to use existing notify=true
for all reindexes after initial one, still not cleanest, likely the simplest.
By far the best option is to refactor as proposed in the issue (?).
What:
After initial history reindex, node without local storage will notify subscribers about every block (even if during subsequent reindex) , ensuring sequential block rounds, just like is the case when node has local storage.**
Finally, we simplify
roothash.BlockHistory
interface as discussed here: #6050 (comment)**For comparison, node with local storage first waits for initial reindex to complete, and then starts syncing rounds with
worker.storage.committe.Node
, fetching missing rounds one by one, and notifying them withStorageSyncCheckpoint
. Subscriber in such scenario always receives blocks with no gaps. Finally, if they subscribe right after first reindex is done, they will still receive most blocks from initial reindex (since storage sync is slower then history reindex).Why:
#6050 (comment) && #6050 (comment)
Possible considerations
We might as well notify during first reindex (?), which would simplify things further. Not sure why current code in
master
was preventing notification during every reindex (if without local storage).