
[occm] Ability to delay node removal #2213

Closed
zifeo opened this issue Apr 23, 2023 · 6 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature.

Comments

@zifeo
Contributor

zifeo commented Apr 23, 2023

/kind feature

What happened:

Currently, occm monitors nodes with a providerID and deletes a node from the cluster when the corresponding OpenStack instance no longer exists.

What you expected to happen:

In most of our setups, we persist the Kubernetes data directory (RKE2 in our case) to handle worst-case scenarios and/or single-master clusters. This means that during a node update, the OpenStack instance is destroyed and a new one is respawned with the same disk/data. The new node then cannot rejoin the cluster because it is identified as a deleted node (e.g. etcd does not accept a deleted member).

How to reproduce it:

NA

Anything else we need to know?:

The best implementation would be a configurable delay before a node is deleted, along with support for updating the providerID.

Environment:

  • openstack-cloud-controller-manager(or other related binary) version: -
  • OpenStack version: -
  • Others: -
@k8s-ci-robot k8s-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Apr 23, 2023
@mdbooth
Contributor

mdbooth commented Apr 24, 2023

How does this work? Which component is going to force the update of providerID on a node which already exists? The providerID has the property of being unique, so once set it's treated as canonical everywhere I'm aware of. What mechanism would we use to indicate that this should not happen?

Incidentally, anything involving a delay isn't going to be a robust solution. If this use case is accepted then we would need something explicit.

The single-node master case is interesting, though. I wonder if there could be some way to cause the node lifecycle manager to ignore specific nodes, allowing the administrator to delete them manually if required. Looking at the code there doesn't seem to be one currently: https://github.com/kubernetes/kubernetes/blob/08755fe2490beb91b5c2866c3413dec043fd72f9/staging/src/k8s.io/cloud-provider/controllers/nodelifecycle/node_lifecycle_controller.go#L130-L197

This is not an OpenStack-specific issue. I don't think we would address it in occm. There appears to be some related discussion in kubernetes/cloud-provider#35. I would recommend closing this issue and re-opening the discussion there.

@zifeo
Contributor Author

zifeo commented Apr 24, 2023

@mdbooth Thanks for the feedback. I dug deeper into the topic:

  • providerID is immutable once set
  • delay is indeed not robust enough for all potential use cases

Instead, I would suggest that occm not remove nodes annotated with occm.openstack.org/automatic-node-removal=false or similar. This would need to be fixed here in occm, since occm is the component implementing that deletion logic.

kubernetes/cloud-provider#35 sounds different to me (nodes with the suggested annotations should still be supervised, just not deleted).
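As a minimal Go sketch of the proposed opt-out (the annotation name is the one suggested above, but the `Node` type and the `shouldDeleteNode` helper are simplified stand-ins for the real `corev1.Node` and controller logic, not actual occm code):

```go
package main

import "fmt"

// skipRemovalAnnotation is the hypothetical annotation proposed above.
const skipRemovalAnnotation = "occm.openstack.org/automatic-node-removal"

// Node is a minimal stand-in for corev1.Node, carrying only what the
// sketch needs.
type Node struct {
	Name        string
	Annotations map[string]string
}

// shouldDeleteNode sketches the proposed check: a node whose backing
// instance is gone is normally removed, unless the administrator has
// opted it out via the annotation.
func shouldDeleteNode(node Node, instanceExists bool) bool {
	if instanceExists {
		return false // backing instance is still there; nothing to do
	}
	if node.Annotations[skipRemovalAnnotation] == "false" {
		return false // administrator asked to keep this node
	}
	return true
}

func main() {
	pinned := Node{Name: "master-0", Annotations: map[string]string{skipRemovalAnnotation: "false"}}
	plain := Node{Name: "worker-1"}
	fmt.Println(shouldDeleteNode(pinned, false)) // false: kept despite missing instance
	fmt.Println(shouldDeleteNode(plain, false))  // true: normal removal
}
```

An administrator would then opt a node out with something like `kubectl annotate node master-0 occm.openstack.org/automatic-node-removal=false` (hypothetical annotation, for illustration only).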

@jichenjc
Contributor

occm.openstack.org/automatic-node-removal=false

I think you are proposing a way to keep the original node even though the underlying instance is gone (with the flag above set to false). Then, after a new VM instance corresponding to the node is created, the flag could be switched back to true (optional). The overall goal is to keep the node for further use and avoid the rejoin issue?

The new node cannot then rejoin the cluster as it is identified as a deleted node (e.g. etcd does not accept deleted member).

But I don't fully understand why we would need to rejoin as a deleted member; perhaps the new node has exactly the same member name, etc.? Kubernetes allows creating a new resource with the same name as a previously deleted one, doesn't it?

@mdbooth
Contributor

mdbooth commented Apr 25, 2023

@mdbooth Thanks for the feedback. I dug deeper into the topic:

  • providerID is immutable once set
  • delay is indeed not robust enough for all potential use cases

Instead, I would suggest that occm not remove nodes annotated with occm.openstack.org/automatic-node-removal=false or similar. This would need to be fixed here in occm, since occm is the component implementing that deletion logic.

kubernetes/cloud-provider#35 sounds different to me (nodes with the suggested annotations should still be supervised, just not deleted).

We don't actually implement most of the 'architectural' behaviour in occm. For consistency between cloud providers the majority of the controller, including all its high level control flow, is implemented in cloud-provider. We essentially just implement some interfaces that provide cloud-specific behaviour. You can see how that works in the code I linked above. This is the node lifecycle control loop, and you can see that it calls cloud-specific functions via an interface.

This use case does sound interesting, but it's not specific to OpenStack. The logic to implement it would almost certainly have to be in cloud-provider, and would apply to all cloud providers. It might have to extend beyond just 'don't delete this node', because for the cloud provider to continue managing node addresses after the replacement is created we would have to change provider ID, and we'd also need a mechanism to safely do that.
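The division of labour described above can be illustrated with a reduced model: the generic reconcile loop lives in cloud-provider, and occm only supplies the cloud-specific existence check through an interface. The names below are simplified for the sketch; the real interfaces in k8s.io/cloud-provider are richer (e.g. they take a context and return node metadata):

```go
package main

import "fmt"

// Instances is a reduced stand-in for the interface that each cloud
// provider implements; occm supplies the OpenStack implementation.
type Instances interface {
	InstanceExistsByProviderID(providerID string) (bool, error)
}

// fakeOpenStack answers existence queries from an in-memory set,
// mimicking what occm would do against the Nova API.
type fakeOpenStack struct {
	servers map[string]bool
}

func (f fakeOpenStack) InstanceExistsByProviderID(id string) (bool, error) {
	return f.servers[id], nil
}

// reconcileNodes models the generic control flow that lives in
// cloud-provider, not in occm: it only calls through the interface and
// returns the names of nodes whose backing instance is gone.
func reconcileNodes(cloud Instances, nodes map[string]string) []string {
	var deleted []string
	for name, providerID := range nodes {
		exists, err := cloud.InstanceExistsByProviderID(providerID)
		if err != nil {
			continue // the real controller would retry on error
		}
		if !exists {
			deleted = append(deleted, name)
		}
	}
	return deleted
}

func main() {
	cloud := fakeOpenStack{servers: map[string]bool{"openstack:///vm-a": true}}
	nodes := map[string]string{
		"node-a": "openstack:///vm-a",
		"node-b": "openstack:///vm-b",
	}
	fmt.Println(reconcileNodes(cloud, nodes)) // only node-b is slated for removal
}
```

This is why an annotation-based opt-out would most naturally be added to the shared loop in cloud-provider, where it would benefit every cloud provider at once.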

It's also possible that with more discussion at the right level it might be possible to do what you need at a different level.

If you find the right place to ask the question (I suggest trying cloud-provider next, but they may forward you again), I wouldn't be surprised to discover that there's already an accepted way to achieve it, or at least a number of other people with the same requirements.

@jichenjc
Contributor

It's also possible that with more discussion at the right level it might be possible to do what you need at a different level.

I think this is the right direction and suggestion

@zifeo
Contributor Author

zifeo commented Apr 26, 2023

but I don't fully understand why we need rejoin as deleted member, maybe the new node has exactly same member name etc?

The root issue is that a removed etcd member cannot rejoin the same etcd cluster. In short, this rules out any persistence at the node-replacement level.

I agree this is out of scope for this repo. However, going further with that architecture, I am not convinced the need is strong enough to open an upstream issue; I will rather adapt my logic to work around it. Thanks for the comments.
