
avoid duplicates at PromQL level #11

Open
rptaylor opened this issue May 27, 2021 · 3 comments

Comments

@rptaylor
Owner

rptaylor commented May 27, 2021

The 'instance' and 'pod' labels identify the KSM instance and need to be filtered out to avoid mismatch errors ("many-to-many matching not allowed: matching labels must be unique on one side") in situations where KSM restarts (or runs more than one replica).

This works for one side of the CPU query:

max without (instance,pod) (max_over_time(kube_pod_container_resource_requests_cpu_cores{node != ""}[48h]))

But no matter what I tried, I could not get a vector result for the other side; this is where the mismatch error comes from:

max_over_time(kube_pod_completion_time[48h]) - on (exported_pod) max_over_time(kube_pod_start_time[48h])

I don't understand that, because 'on (exported_pod)' should mean that only the exported_pod label is used for matching.

Particularly odd: both of these queries work, each of which ignores just one label:

max_over_time(kube_pod_completion_time[48h]) - ignoring(pod) max_over_time(kube_pod_start_time[48h])

max_over_time(kube_pod_completion_time[48h]) - ignoring(instance) max_over_time(kube_pod_start_time[48h])

But ignoring both labels causes a many-to-many matching error, just like using on(exported_pod):

max_over_time(kube_pod_completion_time[48h]) - ignoring (pod, instance) max_over_time(kube_pod_start_time[48h])

And this works, but it has problematic duplicate entries:

max_over_time(kube_pod_completion_time[48h]) - max_over_time(kube_pod_start_time[48h])

So that is how it works currently, and it relies on the rearrange function ignoring duplicates.
It might be preferable to avoid duplicates at the PromQL level instead of in the Python code; however, the Prometheus queries are subject to complex vagaries and occasional syntax changes, so deduplicating in Python may be safer.
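
A possible way to deduplicate at the PromQL level (an untested sketch, not what the code currently does) would be to aggregate away the KSM-identity labels on each side before subtracting, following the same without() pattern already used on the CPU side, so that each side has at most one series per exported_pod:

max without (instance, pod) (max_over_time(kube_pod_completion_time[48h])) - max without (instance, pod) (max_over_time(kube_pod_start_time[48h]))

That would also be consistent with the errors above: with more than one KSM instance, each side has several series per exported_pod that differ only in instance and pod, so excluding both labels from matching (via on(exported_pod) or ignoring(pod, instance)) leaves duplicates on both sides and the match becomes many-to-many.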

@rptaylor
Owner Author

Currently the cputime query is somewhat redundant anyway with the endtime, starttime, and cores queries.
They are used as a check against each other, to ensure that cputime = (endtime - starttime) * cores.
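
As a toy numerical illustration of that identity (hypothetical values, not taken from any real query result):

# Hypothetical values, only to illustrate the cross check described above.
starttime = 1622100000.0   # kube_pod_start_time (unix seconds)
endtime = 1622110800.0     # kube_pod_completion_time (unix seconds)
cores = 2.0                # requested CPU cores
walltime = endtime - starttime   # 10800 s = 3 hours
cputime = walltime * cores       # 21600 core-seconds = 6 core-hours
assert cputime == (endtime - starttime) * cores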

It is good to have a cross-check, though it takes a bit more time to run the extra queries.

@rptaylor
Owner Author

rptaylor commented May 31, 2022

With the updates for KSM 2.0 (#22), exported_pod is gone and pod is the right label to use, so it would only be necessary to use ignoring(instance).
However, the Prometheus behaviour seems different now.

max_over_time(kube_pod_completion_time[48h]) - ignoring(instance) max_over_time(kube_pod_start_time[48h])

Shortly after redeploying KSM to trigger the multi-instance issue, this now gives the error "many-to-many matching not allowed: matching labels must be unique on one side". Or maybe I didn't test it correctly before.
So the situation remains the same: it still relies on the rearrange function in Python to remove the duplicates.
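
For reference, here is a minimal sketch of the kind of deduplication this relies on (this is not the project's actual rearrange function; the function name and result structure are assumptions based on the Prometheus HTTP API format). It drops result series that differ only in the KSM 'instance' label:

def drop_ksm_duplicates(results):
    # 'results' is assumed to be the 'result' list returned by the Prometheus
    # query API: dicts with a 'metric' label set and a 'value'/'values' payload.
    seen = {}
    for series in results:
        labels = dict(series["metric"])
        labels.pop("instance", None)  # ignore the KSM-identity label
        key = tuple(sorted(labels.items()))
        seen.setdefault(key, series)  # keep the first copy, skip duplicates
    return list(seen.values())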

@rptaylor
Owner Author

Here is an example of the same pod record, duplicated for each KSM instance that Prometheus collected the record from:

{container="kube-state-metrics", endpoint="http", instance="10.233.96.214:8080", job="kube-prometheus-kube-state-metrics", namespace="harvester", pod="grid-job-14869085-4hdv5", service="kube-prometheus-kube-state-metrics", uid="97ff9f95-3c7c-4306-9509-7838147fce65"}

{container="kube-state-metrics", endpoint="http", instance="10.233.87.156:8080", job="kube-prometheus-kube-state-metrics", namespace="harvester", pod="grid-job-14869085-4hdv5", service="kube-prometheus-kube-state-metrics", uid="97ff9f95-3c7c-4306-9509-7838147fce65"}
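
Since these two series differ only in the instance label, an aggregation along the lines of the following (untested sketch) should collapse them into a single series per pod before any matching is done:

max without (instance) (max_over_time(kube_pod_completion_time[48h]))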

rptaylor changed the title from "improve PromQL query to avoid duplicates" to "avoid duplicates at PromQL level" on Feb 23, 2024