Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OCPCLOUD-2500: Update to recognize both upstream and openshift ScaleFromZero annotations #335

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

racheljpg
Copy link

What type of PR is this?

What this PR does / why we need it:

Hello! This is a PR to update the autoscaler to recognize both the upstream and openshift ScaleFromZero annotations, while favouring the upstream annotations if they exist. Thanks!

Does this PR introduce a user-facing change?


Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@openshift-ci-robot
Copy link

openshift-ci-robot commented Jan 17, 2025

@racheljpg: This pull request references OCPCLOUD-2500 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.19.0" version, but no target version was set.

In response to this:

What type of PR is this?

What this PR does / why we need it:

Hello! This is a PR to update the autoscaler to recognize both the upstream and openshift ScaleFromZero annotations, while favouring the upstream annotations if they exist. Thanks!

Does this PR introduce a user-facing change?


Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Jan 17, 2025
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 17, 2025
Copy link

openshift-ci bot commented Jan 17, 2025

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@racheljpg
Copy link
Author

Hey @elmiko, just wanted to make a small PR early - is this along the lines of what you were thinking? Thanks!

@racheljpg racheljpg changed the title OCPCLOUD-2500: Update cpu and memory key annotations OCPCLOUD-2500: Update to recognize both upstream and openshift ScaleFromZero annotations Jan 17, 2025
@racheljpg racheljpg force-pushed the upstreamannotations branch 2 times, most recently from 10e2b62 to f76c1b2 Compare January 24, 2025 16:08
@racheljpg racheljpg marked this pull request as ready for review January 24, 2025 16:12
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 24, 2025
@openshift-ci openshift-ci bot requested review from elmiko and JoelSpeed January 24, 2025 16:13
@racheljpg racheljpg force-pushed the upstreamannotations branch from b43d21a to 12f347c Compare January 24, 2025 16:59
@racheljpg racheljpg force-pushed the upstreamannotations branch 2 times, most recently from 953ddc2 to 1245144 Compare January 27, 2025 13:38
Copy link

@elmiko elmiko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left a few cleanup comments and a suggestion for the failing unit test. this is looking good though, very close to ready.

@racheljpg racheljpg force-pushed the upstreamannotations branch 2 times, most recently from 49b595c to a448aa8 Compare January 27, 2025 15:35
@elmiko
Copy link

elmiko commented Jan 27, 2025

i think this is looking good, let's wait to see how the tests go then i'm happy to add an approve. let's get someone else to review as well if we can.

@racheljpg
Copy link
Author

/retest

@racheljpg
Copy link
Author

racheljpg commented Jan 27, 2025

Updating the value of the memoryKey annotation to the upstream value has messed up some of the other unit tests that were expecting the old downstream value - I've updated the TestNodeGroupTemplateNodeInfo test that broke due to this, but a bit confused about the output below:

    clusterapi_unstructured_test.go:402: expected {{1024 0} {<nil>} 1024 DecimalSI}, got {{1024 0} {<nil>} 1024 DecimalSI}
--- FAIL: TestAnnotations (0.20s)

Is it me, or are these values the same? 😬 Will look deeper into updating this suite, but @elmiko (and maybe @JoelSpeed) any thoughts, when you have a moment? :) Thanks!

// Checking both sets of annotations. If the upstream annotation doesn't exist,
// get the value of the deprecated one. If neither exist, return nil.
if val, err := parseKey(annotations, cpuKey); val != zeroQuantity && err == nil {
return val, err

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If err is nil here, return nil explicitly please

Suggested change
return val, err
return val, nil

if val, err := parseKey(annotations, cpuKey); val != zeroQuantity && err == nil {
return val, err
} else if err != nil {
return zeroQuantity.DeepCopy(), err

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we wrap the error with context to make it easy to find which line produced the error?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks - did some tidying up on the errors being returned

@elmiko
Copy link

elmiko commented Jan 29, 2025

the test failures are happening due to our corrections for the original upstream unit test, https://github.com/openshift/kubernetes-autoscaler/blob/master/cluster-autoscaler/cloudprovider/clusterapi/clusterapi_unstructured_test.go#L379-L382

if we remove the logic that converts the mem value to bytes then it will work properly.

@elmiko
Copy link

elmiko commented Jan 29, 2025

this made it work

diff --git a/cluster-autoscaler/cloudprovider/clusterapi/clusterapi_unstructured_test.go b/cluster-autoscaler/cloudprovider/clusterapi/clusterapi_unstructured_test.go
index 48e633436..8ed42fad5 100644
--- a/cluster-autoscaler/cloudprovider/clusterapi/clusterapi_unstructured_test.go
+++ b/cluster-autoscaler/cloudprovider/clusterapi/clusterapi_unstructured_test.go
@@ -28,7 +28,6 @@ import (
        "k8s.io/apimachinery/pkg/api/resource"
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
        "k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
-       "k8s.io/autoscaler/cluster-autoscaler/utils/units"
        "k8s.io/client-go/tools/cache"
 )

@@ -376,11 +375,6 @@ func TestAnnotations(t *testing.T) {
                labelsKey:       "key3=value3,key4=value4,key5=value5",
        }

-       // convert the initial memory value from Mebibytes to bytes as this conversion happens internally
-       // when we use InstanceMemoryCapacity()
-       memVal, _ := memQuantity.AsInt64()
-       memQuantityAsBytes := resource.NewQuantity(memVal*units.MiB, resource.DecimalSI)
-
        test := func(t *testing.T, testConfig *testConfig, testResource *unstructured.Unstructured) {
                controller, stop := mustCreateTestController(t, testConfig)
                defer stop()
@@ -398,7 +392,7 @@ func TestAnnotations(t *testing.T) {

                if mem, err := sr.InstanceMemoryCapacityAnnotation(); err != nil {
                        t.Fatal(err)
-               } else if memQuantityAsBytes.Cmp(mem) != 0 {
+               } else if memQuantity.Cmp(mem) != 0 {
                        t.Errorf("expected %v, got %v", memQuantity, mem)
                }

@racheljpg racheljpg force-pushed the upstreamannotations branch from 0822cdd to 20bfa3e Compare January 30, 2025 13:27
@racheljpg
Copy link
Author

Thanks @elmiko, that's fixed it for the upstream annotation - but after updating the units so that they test both sets of annotations, it's still failing for the same reason with the downstream value, so the problem has just moved 😬

t.Fatal(err)
} else if deprecatedMemQuantityAsBytes.Cmp(deprecatedMem) != 0 {
t.Errorf("expected %v, got %v", deprecatedMemQuantity, deprecatedMem)
} */
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@elmiko I am talking about here! it's still failing with expected {i:{value:1024 scale:0} d:{Dec:<nil>} s:1024 Format:DecimalSI}, got {i:{value:1024 scale:0} d:{Dec:<nil>} s:1024 Format:DecimalSI}. Going to step away from it briefly and take a look again shortly

@racheljpg racheljpg force-pushed the upstreamannotations branch from 20bfa3e to 86e5935 Compare January 30, 2025 16:34
Copy link

@elmiko elmiko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think this looks good, i inspected the test failures and they don't seem to be related to this change.

/approve

Copy link

openshift-ci bot commented Jan 30, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: elmiko

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 30, 2025
Copy link

openshift-ci bot commented Jan 30, 2025

@racheljpg: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-aws 86e5935 link true /test e2e-aws
ci/prow/e2e-aws-operator 86e5935 link true /test e2e-aws-operator
ci/prow/e2e-aws-periodic-pre 86e5935 link false /test e2e-aws-periodic-pre
ci/prow/e2e-azure-periodic-pre 86e5935 link false /test e2e-azure-periodic-pre

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

if val, err := parseKey(annotations, cpuKey); val != zeroQuantity && err == nil {
return val, nil
} else if err != nil {
//return zeroQuantity.DeepCopy(), err

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this a comment? I think you meant to remove it

// Checking both sets of annotations. If the upstream annotation doesn't exist,
// get the value of the deprecated one. If neither exist, return nil.
if val, err := parseKey(annotations, memoryKey); val != zeroQuantity && err == nil {
return val, err

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Err should be nil here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants