Local IPs with Robots server on vSwitch not working #676

Open
mglants opened this issue Jul 2, 2024 · 6 comments · May be fixed by #851

Comments

@mglants

mglants commented Jul 2, 2024

TL;DR

The node is not added if the kubelet IP is set to an internal network address via vSwitch.

Expected behavior

The node is successfully updated by the node controller.

Observed behavior

The node is broken and stays uninitialized.

Minimal working example

No response

Log output

E0702 15:17:50.125761       1 node_controller.go:240] error syncing 'kn2-stage.htzn': failed to get node modifiers from cloud provider: provided node ip for node "kn2-stage.htzn" is not valid: failed to get node address from cloud provider that matches ip: 192.168.0.131, requeuing
I0702 15:17:50.135997       1 node_controller.go:431] Initializing node kn2-stage.htzn with cloud provider

Additional information

NAME             STATUS   ROLES           AGE    VERSION   INTERNAL-IP     EXTERNAL-IP   OS-IMAGE         KERNEL-VERSION   CONTAINER-RUNTIME
km1-stage        Ready    control-plane   14d    v1.27.2   192.168.0.3     <none>        Talos (v1.7.0)   6.6.28-talos     containerd://1.7.15
km2-stage        Ready    control-plane   14d    v1.27.2   192.168.0.5     <none>        Talos (v1.7.0)   6.6.28-talos     containerd://1.7.15
km3-stage        Ready    control-plane   18d    v1.27.2   192.168.0.4     <none>        Talos (v1.7.0)   6.6.28-talos     containerd://1.7.15
kn1-stage.htzn   Ready    <none>          155m   v1.27.2   192.168.0.130   <none>        Talos (v1.7.4)   6.6.32-talos     containerd://1.7.16
kn2-stage        Ready    egress-proxy    13d    v1.27.2   192.168.0.131   <none>        Talos (v1.7.4)   6.6.32-talos     containerd://1.7.16
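
Editor's note: the error in the log output above suggests the node controller checks that the IP provided by kubelet is among the addresses the cloud provider reports for that node. A minimal Go sketch of that kind of check follows. The type `NodeAddress`, the function `matchProvidedIP`, and the public IP `203.0.113.10` are hypothetical and chosen for illustration; this is not the actual controller code.

```go
package main

import (
	"fmt"
	"net"
)

// NodeAddress is a hypothetical stand-in for the address records a cloud
// provider reports for a node.
type NodeAddress struct {
	Type    string // e.g. "InternalIP" or "ExternalIP"
	Address string
}

// matchProvidedIP returns the provider-reported address equal to providedIP,
// or an error if the provider does not report that IP for the node.
func matchProvidedIP(providedIP string, addrs []NodeAddress) (NodeAddress, error) {
	want := net.ParseIP(providedIP)
	if want == nil {
		return NodeAddress{}, fmt.Errorf("invalid node ip %q", providedIP)
	}
	for _, a := range addrs {
		if ip := net.ParseIP(a.Address); ip != nil && ip.Equal(want) {
			return a, nil
		}
	}
	return NodeAddress{}, fmt.Errorf("no provider address matches ip: %s", providedIP)
}

func main() {
	// Assumed scenario: the provider only knows the server's public IP,
	// while kubelet was started with a vSwitch address as its node IP.
	addrs := []NodeAddress{{Type: "ExternalIP", Address: "203.0.113.10"}}
	if _, err := matchProvidedIP("192.168.0.131", addrs); err != nil {
		fmt.Println("sync would fail:", err)
	}
}
```

Under this assumption, the vSwitch address never appears in the provider's list, so the check fails on every sync and the node stays uninitialized.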

@mglants mglants added the bug Something isn't working label Jul 2, 2024
@apricote apricote self-assigned this Aug 30, 2024
@apricote
Member

apricote commented Sep 2, 2024

Hey @mglants,

using IPs assigned through the vSwitch is not supported right now. We only support Private Networks in Cloud-only clusters.

@mglants mglants closed this as completed Sep 7, 2024
@mglants
Author

mglants commented Sep 7, 2024

Solved by specifying the hrobot ID on every node.

@codablock

codablock commented Jan 10, 2025

I think I have found a good way to support this use case and will open a PR after running it privately for some time. The solution is to introduce a new hccm config option that instructs it to treat the --node-ip from kubelet as the internal IP when the node is a Robot node, which fixes the error shown in this issue's description. I assume this feature would also fix a few other issues I found in the GitHub repo.

@mglants can you maybe re-open this issue so that we can use it to track this? @apricote Or is there already a better place/issue for this?
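
Editor's note: the proposal above could look roughly like the Go sketch below. It is an illustration under assumptions, not the code from the eventual PR. The option `useNodeIPAsInternal`, the type `NodeAddress`, the function `robotNodeAddresses`, and the public IP are all hypothetical.

```go
package main

import "fmt"

// NodeAddress is a hypothetical stand-in for a node address record.
type NodeAddress struct {
	Type    string // "InternalIP" or "ExternalIP"
	Address string
}

// robotNodeAddresses builds the address list for a Robot node.
// publicIP is the address known for the server, kubeletNodeIP is the value
// kubelet was given as --node-ip ("" if unset), and useNodeIPAsInternal is
// the hypothetical new config option described in the comment above.
func robotNodeAddresses(publicIP, kubeletNodeIP string, useNodeIPAsInternal bool) []NodeAddress {
	addrs := []NodeAddress{{Type: "ExternalIP", Address: publicIP}}
	// Only trust the kubelet-provided IP when the option is enabled and the
	// IP is actually a different address from the public one.
	if useNodeIPAsInternal && kubeletNodeIP != "" && kubeletNodeIP != publicIP {
		addrs = append(addrs, NodeAddress{Type: "InternalIP", Address: kubeletNodeIP})
	}
	return addrs
}

func main() {
	// With the option enabled, the vSwitch IP is reported as InternalIP.
	for _, a := range robotNodeAddresses("203.0.113.10", "192.168.0.131", true) {
		fmt.Printf("%s=%s\n", a.Type, a.Address)
	}
}
```

With the option disabled, the sketch returns only the public address, which matches the behavior that produces the error in the issue description.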

@mglants mglants reopened this Jan 10, 2025
@mglants
Author

mglants commented Jan 10, 2025

Reopened due to community feedback from @codablock.

@lukasmetzner
Contributor

Hey,

@codablock in the coming weeks, we plan to explore this topic in greater detail. There are already some open PRs and issues related to it. If you have a solution, feel free to submit a PR. We will review and evaluate all contributions as we delve deeper into the subject. I'd prefer closing this issue, though.

Best Regards,
Lukas

@codablock

codablock commented Jan 23, 2025

@lukasmetzner Thanks for the response. I created the PR (#851) with the proposed solution. If you want, you can close this issue again and we can switch to the PR for discussion.
