implement ephemeral port reservation for k3d #297

Open
wants to merge 1 commit into
base: main
from

Conversation

chrisseto
Contributor

Prior to this commit we discovered that CI can flake when standing up multiple k3d clusters due to port conflicts.

This commit attempts to implement "port reservation" so that pkg/k3d can always create k3d clusters without port conflicts and without requiring coordination across different processes.

It's unclear whether this will actually work, since engineering an intentional failure is not feasible.

@chrisseto chrisseto force-pushed the chris/p/port-reservation branch 2 times, most recently from c639d66 to 8e942a3 Compare November 8, 2024 21:13
@RafalKorepta RafalKorepta force-pushed the chris/p/port-reservation branch from 8e942a3 to e870e14 Compare November 15, 2024 10:27
Contributor

@RafalKorepta RafalKorepta left a comment


I like this improvement even if #296 fixed other issues.

@chrisseto
Contributor Author

So, fun story with this: I can't actually prove or showcase that this will work. The implementation on Linux may differ from macOS/Darwin/BSD, but when testing on macOS, port conflicts still seem to occur when running this test with -count=50, even though the sockets/ports do get put into the TIME_WAIT state.

I've not been able to track down any information about how ephemeral port assignment works on macOS. It seems that ports in TIME_WAIT are left alone until the ephemeral range nears exhaustion, after which they get reassigned.
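
For readers unfamiliar with the TIME_WAIT trick mentioned above, here is a minimal sketch of the general technique. It is not the code from this PR: `reservePort` is a hypothetical helper, and the sketch assumes the usual behavior where the side that closes a TCP connection first ends up in TIME_WAIT, and where a later listener can still bind that port because Go's `net.Listen` sets `SO_REUSEADDR` on Unix-like systems. Whether the kernel then avoids handing the port out as an ephemeral port is OS-dependent, as noted above for macOS.

```go
package main

import (
	"fmt"
	"net"
)

// reservePort is a hypothetical helper illustrating the TIME_WAIT technique.
// It returns a port that the OS chose and that now has a socket in TIME_WAIT.
func reservePort() (int, error) {
	// Port 0 asks the kernel to pick a free ephemeral port.
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return 0, err
	}
	defer ln.Close()

	// Connect to our own listener so that a full TCP connection exists.
	client, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		return 0, err
	}
	server, err := ln.Accept()
	if err != nil {
		client.Close()
		return 0, err
	}

	// Close the accepted (server-side) socket first. It becomes the active
	// closer, so the socket whose local port is the listener's port is the
	// one that enters TIME_WAIT.
	server.Close()
	client.Close()

	return ln.Addr().(*net.TCPAddr).Port, nil
}

func main() {
	port, err := reservePort()
	if err != nil {
		panic(err)
	}
	fmt.Println("reserved port:", port)

	// A later consumer binds the port explicitly. This relies on SO_REUSEADDR,
	// which Go sets on listening sockets on Unix-like systems.
	ln, err := net.Listen("tcp", fmt.Sprintf("127.0.0.1:%d", port))
	if err != nil {
		fmt.Println("rebind failed:", err)
		return
	}
	defer ln.Close()
	fmt.Println("rebound on:", ln.Addr())
}
```

The reserved port number can then be passed to a tool that needs a fixed port up front. The reservation only lasts as long as the TIME_WAIT period, so this is best effort rather than a guarantee.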

@birdayz
Contributor

birdayz commented Nov 29, 2024

Thanks, this is an interesting approach!

Copying this approach to cloud's internal repo as well.
Whenever possible, we try to allocate with address ":", so the OS assigns a free port to our application, and then we pass along this address.
However, sometimes this is hard, e.g. if third-party dependencies don't support this approach. So it's great to have a working fallback; so far we have only allocated ports randomly and hoped that no other test picks the same port.

@chrisseto
Contributor Author

Let me know if this actually solves any issues in cloud. On macOS at least, this doesn't seem to do much 😓

3 participants