Fix false positive race tests by gating all R/W access to timers behind synchronization #875

Merged: clemire merged 2 commits into main from crystal/health-check-data-race-fix on Dec 11, 2023

@clemire clemire commented Dec 11, 2023

Changelist

Race tests are detecting a potential conflict between the initialization of the health checker's mutable-state timer and updates to that timer after polling starts. This race is logically impossible, but to prevent the false positive I moved the initialization of the timer into a method where it is gated behind the lock on the mutable state object.

Test Plan

Existing tests should continue to pass + no more race failures.

Author/Reviewer Checklist

  • If this PR has changes that result in a different app state given the same prior state and transaction list, manually add the state-breaking label.
  • If the PR has breaking postgres changes to the indexer add the indexer-postgres-breaking label.
  • If this PR isn't state-breaking but has changes that modify behavior in PrepareProposal or ProcessProposal, manually add the label proposal-breaking.
  • If this PR is one of many that implement a specific feature, manually label them all feature:[feature-name].
  • If you wish for mergify-bot to automatically create a PR to backport your change to a release branch, manually add the label backport/[branch-name].
  • Manually add any of the following labels: refactor, chore, bug.

…er to eliminate false positives in race tests.

coderabbitai bot commented Dec 11, 2023

Walkthrough

The health checking mechanism in the server protocol has been enhanced with the addition of a new method, InitializePolling, to the healthCheckerMutableState type. This method sets up the polling timer with a custom delay for the first poll, improving the initialization process. The StartNewHealthChecker function has been updated to utilize this new method, streamlining the creation of new health checkers by abstracting the timer setup.

Changes

File: .../server/types/health_checker.go
Change summary: Added InitializePolling method to healthCheckerMutableState for initializing polling with a delay. Updated StartNewHealthChecker to use the new method.

@coderabbitai coderabbitai bot left a comment

Review Status

Actionable comments generated: 2

Configuration used: CodeRabbit UI

Commits: Files that changed from the base of the PR and between fb83527 and 9c0fccf.
Files selected for processing (1)
  • protocol/daemons/server/types/health_checker.go (2 hunks)
Additional comments: 2
protocol/daemons/server/types/health_checker.go (2)
  • 115-127: The implementation of InitializePolling correctly ensures that the timer is only initialized if it has not been previously set, which should prevent the race condition false positives as intended by the PR.

  • 221-221: The call to InitializePolling within StartNewHealthChecker correctly schedules the first poll using the new synchronized method, aligning with the PR's objective to synchronize timer initialization.

@@ -112,6 +112,20 @@ func (u *healthCheckerMutableState) SchedulePoll(nextPollDelay time.Duration) {
u.timer.Reset(nextPollDelay)

The SchedulePoll method assumes that the timer is already initialized when calling Reset. It might be prudent to add a check to ensure that the timer is not nil before calling Reset to avoid a potential nil pointer dereference if SchedulePoll is called before InitializePolling.

  func (u *healthCheckerMutableState) SchedulePoll(nextPollDelay time.Duration) {
  	u.lock.Lock()
  	defer u.lock.Unlock()

  	// Don't schedule a poll if the health checker has been stopped.
  	if u.stopped {
  		return
  	}

+ 	// Ensure the timer is initialized before resetting it.
+ 	if u.timer == nil {
+ 		return
+ 	}
  	u.timer.Reset(nextPollDelay)
  }

// The first poll is scheduled after a custom delay to allow the service to initialize.
u.timer = time.AfterFunc(firstPollDelay, pollFunc)
}

// Stop stops the health checker. This method is synchronized.
func (u *healthCheckerMutableState) Stop() {

The Stop method should check if the timer is not nil before calling Stop on it to prevent a potential nil pointer dereference if Stop is called before the timer is initialized.

  func (u *healthCheckerMutableState) Stop() {
  	u.lock.Lock()
  	defer u.lock.Unlock()

  	// Don't stop the health checker if it has already been stopped.
  	if u.stopped {
  		return
  	}

+ 	// Ensure the timer is initialized before stopping it.
+ 	if u.timer == nil {
+ 		return
+ 	}
  	u.timer.Stop()
  	u.stopped = true
  }
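
One consequence of the early return in the suggestion above is that a Stop call arriving before the timer exists leaves `stopped` unset, so a later initialization could still start polling. A possible variation is sketched below. It is not part of the PR or the review: the `stopped` check inside `InitializePolling`, the reordered `Stop` body, and the `stopBeforeInit` helper are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Simplified stand-in for the PR's mutable state type (assumed field set).
type healthCheckerMutableState struct {
	lock    sync.Mutex
	timer   *time.Timer
	stopped bool
}

// InitializePolling refuses to create a timer once the checker is stopped.
// The stopped check is an assumption, not confirmed by the PR.
func (u *healthCheckerMutableState) InitializePolling(firstPollDelay time.Duration, pollFunc func()) {
	u.lock.Lock()
	defer u.lock.Unlock()

	if u.stopped || u.timer != nil {
		return
	}
	u.timer = time.AfterFunc(firstPollDelay, pollFunc)
}

// Stop records the stopped state first, then stops the timer only if one
// exists, so the nil case is safe and still marks the checker as stopped.
func (u *healthCheckerMutableState) Stop() {
	u.lock.Lock()
	defer u.lock.Unlock()

	if u.stopped {
		return
	}
	u.stopped = true
	if u.timer != nil {
		u.timer.Stop()
	}
}

// stopBeforeInit (hypothetical helper) reports whether a timer was created
// even though Stop ran before InitializePolling.
func stopBeforeInit() bool {
	u := &healthCheckerMutableState{}
	u.Stop()
	u.InitializePolling(time.Hour, func() {})

	u.lock.Lock()
	defer u.lock.Unlock()
	return u.timer != nil
}

func main() {
	fmt.Println("timer created after Stop:", stopBeforeInit())
}
```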

@clemire clemire merged commit 0b6a974 into main Dec 11, 2023
15 of 16 checks passed
@clemire clemire deleted the crystal/health-check-data-race-fix branch December 11, 2023 21:14