Replies: 1 comment
You can experiment with lower push rates to see if things improve. Generally, the Pushgateway is designed for rarely running batch jobs that push at the end of their runtime, so more like a handful of pushes per hour or per day, not regular pushes every 10 seconds. If you are pushing that often, you might investigate other ways of delivering your metrics, such as enabling the usual pull-based scrape somehow (e.g. via https://github.com/prometheus-community/PushProx) or using remote write to a long-term storage.
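To illustrate the "push once at the end of the run" pattern, here is a minimal sketch using only the Python standard library. The gateway address, job name, and metric names are made up for the example, and all values are exposed as gauges for simplicity. It builds a single PUT to the Pushgateway's `/metrics/job/<job>` endpoint in the text exposition format; the actual send is left commented out so the snippet runs without a gateway.

```python
import time
import urllib.parse
import urllib.request


def build_push_request(gateway, job, metrics):
    """Build (but do not send) one PUT that replaces the job's metric group.

    `metrics` maps metric name -> numeric value; every metric is declared
    as a gauge, which is an assumption made to keep the sketch short.
    """
    lines = []
    for name, value in metrics.items():
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    # The text exposition format requires a trailing newline.
    body = ("\n".join(lines) + "\n").encode("utf-8")
    url = f"{gateway.rstrip('/')}/metrics/job/{urllib.parse.quote(job, safe='')}"
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "text/plain; version=0.0.4"},
    )


def run_batch_job():
    """Hypothetical batch job: do the work, then return its final metrics."""
    start = time.time()
    records = sum(1 for _ in range(1000))  # stand-in for real work
    return {
        "batch_records_processed": records,
        "batch_duration_seconds": time.time() - start,
        "batch_last_success_timestamp_seconds": time.time(),
    }


if __name__ == "__main__":
    # Hypothetical gateway address and job name.
    request = build_push_request(
        "http://pushgateway.example:9091", "nightly_import", run_batch_job()
    )
    print(request.get_method(), request.full_url)
    # Uncomment to actually push, once, at the end of the run:
    # urllib.request.urlopen(request, timeout=5)
```

With this shape, each job contributes one write per run rather than one every 10 seconds, which is the usage the Pushgateway is intended for.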
Our Pushgateway deployment is restarted periodically (once every 1-2 hours). The liveness probe uses the HTTP endpoint `/-/healthy` with a timeout of 10 seconds.
Does that mean our write queue (with a hardcoded capacity of 1000 in the code) is full?
Our clients send metric families every 10 seconds, and there are up to 130 running jobs. This results in ~13 `WriteRequest`s per second.
The CPU consumption of our Pushgateway deployment is ~0.4 cores according to k8s metrics; the container allocation bounds are [1, 4] (min, max).
What could be the reason why the Pushgateway cannot keep up with inbound requests?
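For reference, the numbers above can be worked through as a rough back-of-the-envelope sketch. It assumes each push becomes exactly one queued write and, as a worst case, that nothing is drained from the queue while it fills; neither assumption is confirmed by the Pushgateway code, and the function names are just for illustration.

```python
def arrival_rate(jobs, push_interval_s):
    """Writes per second if every job pushes once per interval."""
    return jobs / push_interval_s


def seconds_to_fill(queue_capacity, rate_per_s):
    """Time to fill an empty queue at the given rate with no draining (worst case)."""
    return queue_capacity / rate_per_s


if __name__ == "__main__":
    rate = arrival_rate(jobs=130, push_interval_s=10)
    fill = seconds_to_fill(queue_capacity=1000, rate_per_s=rate)
    during_probe = rate * 10  # writes arriving within one 10 s probe timeout
    print(f"arrival rate: {rate:.1f} writes/s")
    print(f"time to fill 1000 slots with no draining: {fill:.1f} s")
    print(f"writes arriving during one probe timeout: {during_probe:.0f}")
```

Under these assumptions, the queue would only fill if processing stalled completely for well over a minute, far longer than the 10-second probe timeout.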