Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KEP-4943 #4947

Draft
wants to merge 2 commits into
base: master
Choose a base branch
from
Draft

KEP-4943 #4947

wants to merge 2 commits into from

Conversation

bernot-dev
Copy link

@bernot-dev bernot-dev commented Nov 4, 2024

  • One-line PR description: Adding new KEP
  • Other comments:

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Nov 4, 2024
@k8s-ci-robot k8s-ci-robot added the kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory label Nov 4, 2024
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: bernot-dev
Once this PR has been reviewed and has the lgtm label, please assign maciekpytel for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added sig/autoscaling Categorizes an issue or PR as relevant to SIG Autoscaling. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Nov 4, 2024
@k8s-ci-robot
Copy link
Contributor

Hi @bernot-dev. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Nov 4, 2024
@bernot-dev bernot-dev marked this pull request as draft November 4, 2024 18:16
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 4, 2024
@adrianmoisey
Copy link
Member

I know this is very much still in draft, but I thought I'd ask now. Does this work depend on in-place pod resizing?

@bernot-dev
Copy link
Author

@adrianmoisey This does not depend on in-place pod resizing any more than existing VPA. However, it will benefit from it when that feature lands.

@raywainman
Copy link

This is super cool! I think this could definitely tackle a gap that has existed in vertically scaling daemon sets.

Some questions from me:

  • What is some of your high level thinking here around how to configure this?
  • Would we keep a histogram of usage per replica in this case?
  • What do we do about new pods being added to the daemonset?

(Feel free to also tell me to wait until this is captured in the KEP :), I realize this is still a draft proposal!)

@bernot-dev
Copy link
Author

@raywainman Thanks for the questions! These are closely aligned with the first questions I am looking to tackle as I start filling in the details. I am going to be looking for ways to leverage existing code to whatever extent makes sense. I will have to make myself a bit more familiar with existing code before I can find those opportunities.

I'm hoping to sidestep the question of, "What do we do about new pods being added to the daemonset?"

Retaining existing behavior seems to be the best way forward for now. I can imagine reasonable arguments for a variety of behaviors in different situations. Perhaps it could become configurable in the future, but I would rather keep that out of scope for now.

@adrianmoisey
Copy link
Member

adrianmoisey commented Nov 10, 2024

@adrianmoisey This does not depend on in-place pod resizing any more than existing VPA. However, it will benefit from it when that feature lands.

Good to know. I know I may be jumping the gun here, so apologies for that, I know you are still working on the KEP.

The reason I asked about in-place is because I just assumed the solution you would use, so let me take a step back.

At the moment VPA admission-controller receives a Pod before it's scheduled to a node (to my understanding, please correct me if I am wrong), meaning that the VPA admission-controller won't know which node a specific Pod will land on.
Do you have any ideas on how to get around this limitation for this KEP?

EDIT: I'm entirely wrong! Turns out that DaemonSets use node affinity to tell the scheduler where to schedule a Pod. I didn't know that!
See https://github.com/kubernetes/kubernetes/blob/v1.31.2/pkg/controller/daemon/daemon_controller.go#L1014-L1019
and
https://github.com/kubernetes/kubernetes/blob/v1.31.2/pkg/controller/daemon/util/daemonset_util.go#L173-L221

So I guess that's the answer, the node can be fetched from there.

SECOND EDIT: What about other workloads, besides DaemonSets?

@omerap12
Copy link
Member

We're only talking about DaemonSets, right? Because if we're dealing with Deployments, it could cause some issues. For example, when you change the resources of a pod, the new pod might be scheduled on a different node or even in a different zone, depending on how the Deployment is configured. This could impact the load distribution across pods. If Deployments are also in scope, we might need to consider these factors.

@adrianmoisey
Copy link
Member

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jan 20, 2025
@k8s-ci-robot
Copy link
Contributor

@bernot-dev: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-enhancements-verify 137f0f1 link true /test pull-enhancements-verify

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory ok-to-test Indicates a non-member PR verified by an org member that is safe to test. sig/autoscaling Categorizes an issue or PR as relevant to SIG Autoscaling. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants