Add retry mechanism to test_ucx_config_w_env_var #377

Merged
merged 5 commits into from
Feb 25, 2025

pentschev
Member

Add a retry mechanism so that the Dask scheduler binds to a free port, retrying if the chosen port is not free. We still use `distributed.utils.open_port` for the first attempt, but that port seems not to be reliable for UCX, in which case `UCXXBusyError` is raised. We capture that, increment the port and retry.

This test seldom fails in CI, and when it does it is for the reason stated above.
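
For illustration only, the retry-and-increment pattern described above could look roughly like the sketch below. This is not the code from this PR: `PortBusyError`, `start_with_port_retry` and `fake_start` are hypothetical names chosen so the example runs without UCXX or Dask installed. In the real test, the first port would presumably come from `distributed.utils.open_port` and the exception caught would be `UCXXBusyError`.

```python
class PortBusyError(Exception):
    """Hypothetical stand-in for UCXXBusyError, so the sketch runs without UCXX."""


def start_with_port_retry(start, first_port, busy_exc=PortBusyError, max_attempts=5):
    """Call start(port), moving on to the next port whenever busy_exc is raised.

    Returns a (port, result) tuple for the first port that works.
    """
    port = first_port
    for attempt in range(max_attempts):
        try:
            return port, start(port)
        except busy_exc:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last busy error
            port += 1  # try the next port number


# Usage: a fake starter for which ports 9000 and 9001 are already taken.
taken = {9000, 9001}


def fake_start(port):
    if port in taken:
        raise PortBusyError(f"port {port} is busy")
    return f"scheduler listening on {port}"


port, handle = start_with_port_retry(fake_start, first_port=9000)
print(port, handle)  # prints: 9002 scheduler listening on 9002
```

Bounding the number of attempts means a persistent failure still surfaces as an error rather than looping forever.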

@pentschev pentschev added bug Something isn't working non-breaking Introduces a non-breaking change labels Feb 24, 2025
@pentschev pentschev requested a review from a team as a code owner February 24, 2025 13:43
Member

@madsbk madsbk left a comment

Looks good; I only have some minor suggestions.

Co-authored-by: Mads R. B. Kristensen <[email protected]>
@pentschev
Member Author

Thanks @madsbk, I pushed the changes you suggested and will merge when CI passes.

@pentschev
Member Author

/merge

@rapids-bot rapids-bot bot merged commit 4f72edd into rapidsai:branch-0.43 Feb 25, 2025
59 checks passed

2 participants