Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

go/runtime/history: Ensure WatchBlocks emits increasing round sequence (no gaps) #6079

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

martintomazic
Copy link
Contributor

@martintomazic martintomazic commented Feb 23, 2025

What:
After initial history reindex, node without local storage will notify subscribers about every block (even if during subsequent reindex) , ensuring sequential block rounds, just like is the case when node has local storage.**

Finally, we simplify roothash.BlockHistory interface as discussed here: #6050 (comment)

**For comparison, node with local storage first waits for initial reindex to complete, and then starts syncing rounds with worker.storage.committe.Node, fetching missing rounds one by one, and notifying them with StorageSyncCheckpoint. Subscriber in such scenario always receives blocks with no gaps. Finally, if they subscribe right after first reindex is done, they will still receive most blocks from initial reindex (since storage sync is slower then history reindex).

Why:
#6050 (comment) && #6050 (comment)

Possible considerations
We might as well notify during first reindex (?), which would simplify things further. Not sure why current code in master was preventing notification during every reindex (if without local storage).

Copy link

netlify bot commented Feb 23, 2025

Deploy Preview for oasisprotocol-oasis-core canceled.

Name Link
🔨 Latest commit 80cfc76
🔍 Latest deploy log https://app.netlify.com/sites/oasisprotocol-oasis-core/deploys/67bb939f7853ef0008477e9d

Comment on lines 30 to 32
type History interface {
roothash.BlockHistory

Copy link
Contributor Author

@martintomazic martintomazic Feb 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically this interface (History) is no longer needed, and clients could reference struct directly ?

@martintomazic
Copy link
Contributor Author

Draft: If reviewers agree I would prefer to get this in before #6050, possibly in parallel. :)

This method was only used for testing, and is redundant since
Commit correctly updates LastConsensusHeight.
Fixes existing problem where clients of WatchBlocks, may
receive block with non-sequentially increasing rounds.
@martintomazic martintomazic force-pushed the martin/internal/rework-roothash-block-history-interface branch from 1044e1a to 80cfc76 Compare February 23, 2025 21:31
@martintomazic martintomazic marked this pull request as ready for review February 23, 2025 22:01
Comment on lines +116 to +119
// If no local storage worker, and not during initial history reindex,
// notify the block watcher that new block is committed.
// Otherwise the storage-sync-checkpoint will do the notification.
if h.hasLocalStorage || !h.reindexDone {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use existing mutex for h.reindexDone, or don't protect it at all since it is currently only called from one thread/can be only overwritten to true. Not sure what is best practice in such scenario, probably protect everything to prevent future bugs if someone extends? :)

@martintomazic martintomazic self-assigned this Feb 25, 2025
Copy link
Contributor

@peternose peternose left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that this solution is not simple and we should find a better one. Unless someone convinces me otherwise.

// Passing the special value `RoundLatest` will return results for the latest round.
GetRoundResults(ctx context.Context, round uint64) (*RoundResults, error)
// Calling this methods more then once has no additional side effect.
ReindexFinished()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reindexing is happening one level higher, so the BlockHistory should know nothing about it. Therefore, this doesn't belong here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We agreed to skip this for now in private and try to rework this sometime in the future. Leaving comment for when we proceed.

I still believe only non-storage related methods can stay in roothash.api.BlockHistory since roothash service should not know anything about "nodes" and "(local) storage": fffd403 is probably a good step?

I agree ReindexFinished is probably off, albeit notify is probably off for the same reason, given we only use it to modify behaviour of WatchBlocks: 1. WatchBlocks should not be part of the roothash.api.BlockHistory since it depends on storage syncing if node has local storage, and 2. it is misleading since in case of notify=false blocks will still be eventually notified in case of node having a local storage.

return nil
}
h.blocksNotifier.Broadcast(blk)

return nil
}

func (h *runtimeHistory) ConsensusCheckpoint(height int64) error {
return h.db.consensusCheckpoint(height)
func (h *runtimeHistory) ReindexFinished() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be private.

@@ -702,6 +703,10 @@ func (sc *serviceClient) processFinalizedEvent(
return fmt.Errorf("failed to reindex blocks: %w", err)
}
tr.reindexDone = true
if !tr.initialReindexDone {
tr.initialReindexDone = true
tr.blockHistory.ReindexFinished()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a simple solution. See how many ifs and flags we have in this method. We should refactor this code to make it more readable, not to make it more complex.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree this is not best, cross-referencing this issue/thread for the next steps.

Technically ReindexFinished is idempotent, so this 3 lines could be replaced by tr.blockHistory.ReindexFinished(). #6050 removed some of those flags. Finally, we could also factor out reindexing part to helper function, making code even simpler.

Alternative is to use existing notify=true for all reindexes after initial one, still not cleanest, likely the simplest.

By far the best option is to refactor as proposed in the issue (?).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants