Expected behavior
I expect the controller's behavior to be stable and predictable.
Version
# kubectl version
Client Version: v1.31.1
Kustomize Version: v5.4.2
Server Version: v1.31.1
# Argo Rollouts Chart/App version
2.37.7/v1.7.2
# Keda Chart/App version
2.15.1/2.15.1
Logs
Logs indicate the following:
When scaling down from 3 to 2 replicas, there are no issues: the status patch containing HPAReplicas is applied and no new replicas are created.
When scaling down from 2 to 1 replicas, there are issues: a new replica is created (unsuccessfully), the HPAReplicas status patch is absent, and a conditions patch marking the rollout as not healthy and not available is applied instead.
➜ ~ kubectl -n argo-rollouts logs deployment/argo-rollouts | grep "namespace=staging" | grep "rollout=app-api"
...
# generation=153 / scale down 3->2
time="2024-11-07T11:20:24Z" level=info msg="Started syncing rollout" generation=153 namespace=staging resourceVersion=585378104 rollout=app-api
time="2024-11-07T11:20:24Z" level=info msg="Syncing replicas only due to scaling event" namespace=staging rollout=app-api
time="2024-11-07T11:20:24Z" level=info msg="Reconciling stable ReplicaSet 'app-api-54cc7b4b4d'" namespace=staging rollout=app-api
time="2024-11-07T11:20:24Z" level=info msg="Scaled down ReplicaSet app-api-54cc7b4b4d (revision 50) from 3 to 2" event_reason=ScalingReplicaSet namespace=staging rollout=app-api
time="2024-11-07T11:20:24Z" level=info msg="Conflict when updating replicaset app-api-54cc7b4b4d, falling back to patch" namespace=staging rollout=app-api
time="2024-11-07T11:20:24Z" level=info msg="Patching replicaset with patch: {\"metadata\":{\"annotations\":{\"rollout.argoproj.io/desired-replicas\":\"2\",\"rollout.argoproj.io/revision\":\"50\"},\"labels\":{\"rollouts-pod-template-hash\":\"54cc7b4b4d\"}},\"spec\":{\"replicas\":2,\"selector\":{\"matchLabels\":{\"rollouts-pod-template-hash\":\"54cc7b4b4d\"}},\"template\":{\"metadata\":{\"annotations\":{\"vector-format\":\"pod-app\"},\"labels\":{\"app\":\"app-api\",\"app.kubernetes.io/instance\":\"app\",\"app.kubernetes.io/managed-by\":\"Helm\",\"app.kubernetes.io/name\":\"app\",\"app.kubernetes.io/version\":\"5ae9aef4fcd097a2059bf31802273518b2e3cde8\",\"helm.sh/chart\":\"app-v1.0.0-5ae9aef4fcd097a2059bf31802273518b2e3cde8\",\"rollouts-pod-template-hash\":\"54cc7b4b4d\"}}}}}" namespace=staging rollout=app-api
time="2024-11-07T11:20:24Z" level=info msg="Scaled down ReplicaSet app-api-54cc7b4b4d (revision 50) from 3 to 2" event_reason=ScalingReplicaSet namespace=staging rollout=app-api
time="2024-11-07T11:20:24Z" level=info msg="Patched: {\"status\":{\"observedGeneration\":\"153\"}}" generation=153 namespace=staging resourceVersion=585378104 rollout=app-api
time="2024-11-07T11:20:24Z" level=info msg="persisted to informer" generation=153 namespace=staging resourceVersion=585378118 rollout=app-api
time="2024-11-07T11:20:24Z" level=info msg="Reconciliation completed" generation=153 namespace=staging resourceVersion=585378104 rollout=app-api time_ms=85.632779
time="2024-11-07T11:20:24Z" level=info msg="Started syncing rollout" generation=153 namespace=staging resourceVersion=585378118 rollout=app-api
time="2024-11-07T11:20:24Z" level=info msg="Reconciling stable ReplicaSet 'app-api-54cc7b4b4d'" namespace=staging rollout=app-api
time="2024-11-07T11:20:24Z" level=info msg="Patched: {\"status\":{\"HPAReplicas\":2,\"availableReplicas\":2,\"readyReplicas\":2,\"replicas\":2,\"updatedReplicas\":2}}" generation=153 namespace=staging resourceVersion=585378118 rollout=app-api
time="2024-11-07T11:20:24Z" level=info msg="persisted to informer" generation=153 namespace=staging resourceVersion=585378119 rollout=app-api
time="2024-11-07T11:20:24Z" level=info msg="Reconciliation completed" generation=153 namespace=staging resourceVersion=585378118 rollout=app-api time_ms=26.638623
time="2024-11-07T11:20:24Z" level=info msg="Started syncing rollout" generation=153 namespace=staging resourceVersion=585378119 rollout=app-api
time="2024-11-07T11:20:24Z" level=info msg="Reconciling stable ReplicaSet 'app-api-54cc7b4b4d'" namespace=staging rollout=app-api
time="2024-11-07T11:20:24Z" level=info msg="No status changes. Skipping patch" generation=153 namespace=staging resourceVersion=585378119 rollout=app-api
time="2024-11-07T11:20:24Z" level=info msg="Reconciliation completed" generation=153 namespace=staging resourceVersion=585378119 rollout=app-api time_ms=3.438866
time="2024-11-07T11:20:24Z" level=info msg="Started syncing rollout" generation=153 namespace=staging resourceVersion=585378119 rollout=app-api
time="2024-11-07T11:20:24Z" level=info msg="Reconciling stable ReplicaSet 'app-api-54cc7b4b4d'" namespace=staging rollout=app-api
time="2024-11-07T11:20:24Z" level=info msg="No status changes. Skipping patch" generation=153 namespace=staging resourceVersion=585378119 rollout=app-api
time="2024-11-07T11:20:24Z" level=info msg="Reconciliation completed" generation=153 namespace=staging resourceVersion=585378119 rollout=app-api time_ms=3.333507
# generation=154 / scale down 2->1
time="2024-11-07T11:25:24Z" level=info msg="Started syncing rollout" generation=154 namespace=staging resourceVersion=585381067 rollout=app-api
time="2024-11-07T11:25:24Z" level=info msg="Syncing replicas only due to scaling event" namespace=staging rollout=app-api
time="2024-11-07T11:25:24Z" level=info msg="Reconciling stable ReplicaSet 'app-api-54cc7b4b4d'" namespace=staging rollout=app-api
time="2024-11-07T11:25:24Z" level=info msg="Scaled down ReplicaSet app-api-54cc7b4b4d (revision 50) from 2 to 1" event_reason=ScalingReplicaSet namespace=staging rollout=app-api
time="2024-11-07T11:25:24Z" level=info msg="Conflict when updating replicaset app-api-54cc7b4b4d, falling back to patch" namespace=staging rollout=app-api
time="2024-11-07T11:25:24Z" level=info msg="Patching replicaset with patch: {\"metadata\":{\"annotations\":{\"rollout.argoproj.io/desired-replicas\":\"1\",\"rollout.argoproj.io/revision\":\"50\"},\"labels\":{\"rollouts-pod-template-hash\":\"54cc7b4b4d\"}},\"spec\":{\"replicas\":1,\"selector\":{\"matchLabels\":{\"rollouts-pod-template-hash\":\"54cc7b4b4d\"}},\"template\":{\"metadata\":{\"annotations\":{\"vector-format\":\"pod-app\"},\"labels\":{\"app\":\"app-api\",\"app.kubernetes.io/instance\":\"app\",\"app.kubernetes.io/managed-by\":\"Helm\",\"app.kubernetes.io/name\":\"app\",\"app.kubernetes.io/version\":\"5ae9aef4fcd097a2059bf31802273518b2e3cde8\",\"helm.sh/chart\":\"app-v1.0.0-5ae9aef4fcd097a2059bf31802273518b2e3cde8\",\"rollouts-pod-template-hash\":\"54cc7b4b4d\"}}}}}" namespace=staging rollout=app-api
time="2024-11-07T11:25:24Z" level=info msg="Scaled down ReplicaSet app-api-54cc7b4b4d (revision 50) from 2 to 1" event_reason=ScalingReplicaSet namespace=staging rollout=app-api
time="2024-11-07T11:25:24Z" level=info msg="Patched: {\"status\":{\"observedGeneration\":\"154\"}}" generation=154 namespace=staging resourceVersion=585381067 rollout=app-api
time="2024-11-07T11:25:24Z" level=info msg="persisted to informer" generation=154 namespace=staging resourceVersion=585381080 rollout=app-api
time="2024-11-07T11:25:24Z" level=info msg="Reconciliation completed" generation=154 namespace=staging resourceVersion=585381067 rollout=app-api time_ms=85.818662
time="2024-11-07T11:25:24Z" level=info msg="Started syncing rollout" generation=154 namespace=staging resourceVersion=585381080 rollout=app-api
time="2024-11-07T11:25:24Z" level=info msg="Reconciling stable ReplicaSet 'app-api-54cc7b4b4d'" namespace=staging rollout=app-api
time="2024-11-07T11:25:24Z" level=info msg="New RS 'app-api-54cc7b4b4d' is not ready to pause" namespace=staging rollout=app-api
time="2024-11-07T11:25:24Z" level=info msg="skipping active service switch: New RS 'app-api-54cc7b4b4d' is not fully saturated" namespace=staging rollout=app-api
time="2024-11-07T11:25:24Z" level=info msg="Patched: {\"status\":{\"conditions\":[{\"lastTransitionTime\":\"2024-11-05T14:48:24Z\",\"lastUpdateTime\":\"2024-11-05T14:48:24Z\",\"message\":\"RolloutCompleted\",\"reason\":\"RolloutCompleted\",\"status\":\"True\",\"type\":\"Completed\"},{\"lastTransitionTime\":\"2024-11-07T11:25:24Z\",\"lastUpdateTime\":\"2024-11-07T11:25:24Z\",\"message\":\"Rollout is not healthy\",\"reason\":\"RolloutHealthy\",\"status\":\"False\",\"type\":\"Healthy\"},{\"lastTransitionTime\":\"2024-11-06T18:19:49Z\",\"lastUpdateTime\":\"2024-11-07T11:25:24Z\",\"message\":\"Rollout does not have minimum availability\",\"reason\":\"ReplicaSetNotAvailable\",\"status\":\"True\",\"type\":\"Progressing\"},{\"lastTransitionTime\":\"2024-11-07T11:25:24Z\",\"lastUpdateTime\":\"2024-11-07T11:25:24Z\",\"message\":\"Rollout does not have minimum availability\",\"reason\":\"AvailableReason\",\"status\":\"False\",\"type\":\"Available\"}]}}" generation=154 namespace=staging resourceVersion=585381080 rollout=app-api
time="2024-11-07T11:25:24Z" level=info msg="persisted to informer" generation=154 namespace=staging resourceVersion=585381081 rollout=app-api
time="2024-11-07T11:25:24Z" level=info msg="Reconciliation completed" generation=154 namespace=staging resourceVersion=585381080 rollout=app-api time_ms=28.417491
time="2024-11-07T11:25:24Z" level=info msg="Started syncing rollout" generation=154 namespace=staging resourceVersion=585381081 rollout=app-api
time="2024-11-07T11:25:24Z" level=info msg="Reconciling stable ReplicaSet 'app-api-54cc7b4b4d'" namespace=staging rollout=app-api
time="2024-11-07T11:25:24Z" level=info msg="New RS 'app-api-54cc7b4b4d' is not ready to pause" namespace=staging rollout=app-api
time="2024-11-07T11:25:24Z" level=info msg="skipping active service switch: New RS 'app-api-54cc7b4b4d' is not fully saturated" namespace=staging rollout=app-api
time="2024-11-07T11:25:24Z" level=info msg="Timed out (false) [last progress check: 2024-11-07 11:25:24 +0000 UTC - now: 2024-11-07 11:25:24.564589362 +0000 UTC m=+779860.795748433]" namespace=staging rollout=app-api
time="2024-11-07T11:25:24Z" level=info msg="No status changes. Skipping patch" generation=154 namespace=staging resourceVersion=585381081 rollout=app-api
time="2024-11-07T11:25:24Z" level=info msg="Queueing up rollout for a progress after 599s" namespace=staging rollout=app-api
time="2024-11-07T11:25:24Z" level=info msg="Reconciliation completed" generation=154 namespace=staging resourceVersion=585381081 rollout=app-api time_ms=3.115118
...
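To compare the two generations without reading every line, the status patches can be pulled out of the controller log. The following is only a sketch: `summarize_patches` is a hypothetical helper name, and it assumes the log lines keep the `msg="Patched: ..."` and `generation=N` layout shown above.

```shell
# Hypothetical helper (not part of Argo Rollouts or kubectl).
# Reads controller log lines on stdin and prints, for each "Patched:" line,
# the generation and which status field the patch touched.
summarize_patches() {
  grep 'msg="Patched:' | while IFS= read -r line; do
    # Extract the number from the " generation=N " field of the log line.
    gen=$(printf '%s\n' "$line" | sed -n 's/.* generation=\([0-9][0-9]*\) .*/\1/p')
    case "$line" in
      *HPAReplicas*)        kind="replicas" ;;
      *conditions*)         kind="conditions" ;;
      *observedGeneration*) kind="observedGeneration" ;;
      *)                    kind="other" ;;
    esac
    printf 'generation=%s patched=%s\n' "$gen" "$kind"
  done
}

# Possible usage against the same log stream as above:
#   kubectl -n argo-rollouts logs deployment/argo-rollouts \
#     | grep "namespace=staging" | grep "rollout=app-api" | summarize_patches
```

On the excerpt above, generation 153 should show a `replicas` patch, while generation 154 should show a `conditions` patch and no `replicas` patch.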
How to reproduce the bug
Create a Rollout object (blue-green strategy) with one service.
Add a KEDA ScaledObject with Prometheus metrics.
Apply load until the HPA is triggered, then stop the load and wait for the replica count to go from max back to min (repeat until you catch the "Degraded" phase).
For easier analysis, you can run this snippet in another terminal window.
while sleep 10; do v=$(kubectl -n staging get rollout app-api -o jsonpath='{.status.HPAReplicas}/{.spec.replicas} ({.status.phase})'); echo "${v} at $(date)"; done
If HPAReplicas == replicas, everything is OK, e.g.: 1/1 (Healthy) at Mon Nov 11 09:46:21 AM UTC 2024
Otherwise the Degraded phase will follow soon, e.g.: 2/1 (Degraded) at Mon Nov 11 09:46:21 AM UTC 2024
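The comparison described above can also be automated so the mismatch is flagged as soon as it appears. This is a minimal sketch: `check_sample` is a hypothetical helper, and it assumes the sample string has the `HPAReplicas/replicas (phase)` shape produced by the jsonpath expression in the snippet.

```shell
# Hypothetical helper: takes one sample such as "2/1 (Degraded)" and reports
# whether status.HPAReplicas equals spec.replicas.
check_sample() {
  sample="$1"
  hpa=${sample%%/*}   # text before the first "/"  -> HPAReplicas
  rest=${sample#*/}   # text after the first "/"   -> "replicas (phase)"
  spec=${rest%% *}    # up to the first space      -> replicas
  if [ "$hpa" = "$spec" ]; then
    echo "ok: $sample"
  else
    echo "mismatch: $sample"
    return 1
  fi
}

# Possible usage inside the polling loop, stopping at the first mismatch:
#   while sleep 10; do
#     v=$(kubectl -n staging get rollout app-api \
#       -o jsonpath='{.status.HPAReplicas}/{.spec.replicas} ({.status.phase})')
#     check_sample "$v" || break
#   done
```

Returning a non-zero status on mismatch lets the loop stop right at the moment worth inspecting in the controller logs.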
Message from the maintainers:
Impacted by this bug? Give it a 👍. We prioritize the issues with the most 👍.
Hello there 👋🏻
Checklist:
Describe the bug
From time to time (yes, it is an intermittent issue), my "kind:Rollout" applications end up in a Degraded status.
After examining the logs and the application's state, I discovered the following:
The HPA processes are managed by the KEDA controller kind:ScaledObject
The current number of replicas is determined by the above kind:ScaledObject
Current status (Degraded)