Mode where nfd-worker updates the labels #2022

Open
ivelichkovich opened this issue Jan 15, 2025 · 1 comment
Labels
kind/feature Categorizes issue or PR as related to a new feature.

Comments

@ivelichkovich

What would you like to be added:

A mode where nfd-worker can update labels without needing to run nfd-master with an informer cache of all nodefeatures. Maybe keep only some subset map of nodefeatures just for GC, if this issue is implemented: #2021, i.e. just store nodefeatures by name/node.

Why is this needed:

nodefeature CRs can be a footgun if users list them, e.g. k get nodefeatures in a high-scale environment.

nfd-master also uses a ton of memory at scale.

If nfd-worker just handled the labels for its own node, it would alleviate a lot of scale concerns.

ivelichkovich added the kind/feature label Jan 15, 2025
@marquiz
Contributor

marquiz commented Jan 22, 2025

One big issue with the nodefeature objects currently is their size. Quickly thinking, I can see two big culprits adding to that. One is the "managed fields metadata": basically every feature (like kernel.enabledmodule.e1000) is listed there. I'm not sure how we could avoid listing every possible member of the CRD there, or alternatively filter out this metadata in listers. The second one is the kernel.config feature, which lists every kernel config option; there's a ton of those, and nobody is interested in most of them. We could start building a deny list for filtering out the uninteresting ones, or something similar.
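To make the deny-list idea a bit more concrete, here is a rough Go sketch. It is not NFD code: filterKernelConfig, the prefix-based matching, and the sample option names are all assumptions chosen for illustration, and a real implementation might well match on exact names or patterns instead.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// filterKernelConfig is a hypothetical helper (not part of NFD). It returns a
// copy of attrs without the entries whose key starts with any of denyPrefixes.
func filterKernelConfig(attrs map[string]string, denyPrefixes []string) map[string]string {
	out := make(map[string]string, len(attrs))
	for k, v := range attrs {
		denied := false
		for _, p := range denyPrefixes {
			if strings.HasPrefix(k, p) {
				denied = true
				break
			}
		}
		if !denied {
			out[k] = v
		}
	}
	return out
}

func main() {
	// Sample data only; the real kernel.config feature has far more entries.
	attrs := map[string]string{
		"NO_HZ":         "y",
		"DEBUG_INFO":    "y",
		"SND_HDA_INTEL": "m",
	}
	// Assumed deny list: drop debug and sound options.
	kept := filterKernelConfig(attrs, []string{"DEBUG_", "SND_"})

	// Sort the keys so the printed order is stable.
	keys := make([]string, 0, len(kept))
	for k := range kept {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Printf("%s=%s\n", k, kept[k])
	}
}
```

Filtering like this would shrink the object before it is published, at the cost of having to maintain the list of what counts as uninteresting.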

Second improvement, which I think we really need (and which I have thought about for a long time), would be sharding of nfd-master, i.e. distributing the nodes across multiple instances of nfd-master. E.g. calculate a checksum of the node name and take it modulo the number of shards to get the instance (shard) which is responsible for that node.
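A minimal Go sketch of that checksum-mod-shards mapping, under some assumptions: shardFor is a hypothetical name, FNV-1a is just one possible checksum (the comment above does not name one), and the node names are made up.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// shardFor is a hypothetical helper (not part of NFD). It maps a node name to
// a shard index in [0, numShards). numShards must be greater than zero.
func shardFor(nodeName string, numShards uint32) uint32 {
	h := fnv.New32a() // FNV-1a is an assumption; any stable checksum would do
	h.Write([]byte(nodeName))
	return h.Sum32() % numShards
}

func main() {
	const numShards = 3
	for _, n := range []string{"node-a", "node-b", "node-c", "node-d"} {
		fmt.Printf("%s -> shard %d\n", n, shardFor(n, numShards))
	}
}
```

Each nfd-master instance would then only handle the nodes whose index matches its own. One thing to keep in mind with plain modulo is that changing the number of shards reassigns most nodes.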

Splitting the functionality into two daemons is deliberate, e.g. for security considerations.

Thoughts?
