Remove timestamp from the metrics #43
Comments
I've also got a similar issue... Any advice or help about what to grab for diagnosis would be great, but this really feels like a bug of some sort, induced by an unexpected state...
Same here. I'm getting the following metrics with timestamps from 2 months ago:
1624932662650
This causes Prometheus to drop those metrics. Not sure why MCAC doesn't update the timestamp. Please advise.
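For diagnosis, it can help to decode that trailing number: in the Prometheus text exposition format, the optional sample timestamp is given in milliseconds since the Unix epoch. The following is a minimal Python sketch, not part of MCAC or Prometheus; the helper names `decode_timestamp_ms` and `sample_age` are made up for illustration.

```python
from datetime import datetime, timedelta, timezone


def decode_timestamp_ms(ts_ms: int) -> datetime:
    """Convert a millisecond Unix timestamp to an aware UTC datetime."""
    # Integer arithmetic avoids float rounding on the millisecond part.
    seconds, millis = divmod(ts_ms, 1000)
    return datetime.fromtimestamp(seconds, tz=timezone.utc) + timedelta(milliseconds=millis)


def sample_age(ts_ms: int, now: datetime) -> timedelta:
    """Return how far the sample timestamp lags behind `now`."""
    return now - decode_timestamp_ms(ts_ms)


if __name__ == "__main__":
    ts = 1624932662650  # the value quoted in the comment above
    print(decode_timestamp_ms(ts).isoformat())
    print(sample_age(ts, datetime.now(timezone.utc)))
```

Comparing the decoded value with the clock on the scraping host shows how stale the sample is.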
We have the same issue. It was working fine, but after running for a couple of days, MCAC is reporting the wrong time, causing Prometheus to fail. Please fix.
I'm unable to reproduce the issue on GKE. I've left the cluster running for a few days and Prometheus isn't complaining about metrics that are too old.
Hi @adejanovski, there is no drift, and both the Prometheus and Cassandra containers report the same time (UTC). I did notice that with the fix for 'out-of-order timestamps' (#969), I had no problems with the timestamps as long as I had a smaller number of tables (~100) in the DB. After our production upgrade, I now have 326 tables spread across keyspaces and the problem has reappeared. Our dev env also has a similar number of tables, so it appears that this happens when you have a large number of tables in your DB, but that is just an observation...
Hi @tah-mas, that's an interesting observation. Each table comes with a large set of metrics and this could mean that they take too long to be ingested and end up being ingested once they're outside of the accepted timestamp range.
Thank you @adejanovski! Much appreciated.
@adejanovski Any ETA on making the default config usable? We just switched from the instaclustr exporter to MCAC and are winding up with no metrics/blank dashboards from our main cluster due to this issue, despite it working fine on a smaller cluster with fewer tables.
Hi @eriksw, we actually merged the changes a while ago to let you filter metrics more easily. Check this commit for some examples.
@adejanovski Glad to see some rules documented here! I had looked around and found https://github.com/k8ssandra/k8ssandra/pull/1149/files and derived the following rule set:
The bad news: with those rules, on our main cluster we still ran into wildly out-of-date metric timestamps and all the other issues of #39. Has MCAC ever been used in actual production on a cluster with >300 tables on 60 nodes? If so, how?
Hi everyone. The interval between Prometheus warnings has increased: I went from getting a scrape warning every minute to one every 3-4 minutes. I'm testing MCAC in a 3-node cluster with 100+ tables.
Hi everyone
@Miles-Garnsey can you investigate this? Could this be related to #73?
The timestamp in the metrics is 2 hours behind the system time.
Here's the system time and the timestamp it translates to:
The time reported in the metrics is 2 hours behind, and I can't figure out how to disable the timestamp in the metrics.
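As a rough illustration of what "removing the timestamp" means at the exposition-format level, here is a minimal Python sketch that drops the optional trailing timestamp from each sample line. It is not an MCAC feature and nothing in this thread confirms such a step was used; the function names and the sample metric name `example_metric` are hypothetical, and the parsing assumes the plain Prometheus text format (`name{labels} value [timestamp]`).

```python
import re

# An optional sign followed by digits: the shape of an exposition-format timestamp.
_INT_RE = re.compile(r"-?\d+")


def _is_float(token: str) -> bool:
    """Return True if the token parses as a sample value (including NaN, +Inf)."""
    try:
        float(token)
        return True
    except ValueError:
        return False


def strip_timestamp(line: str) -> str:
    """Drop a trailing timestamp from one exposition-format sample line."""
    stripped = line.strip()
    # Comments (# HELP, # TYPE) and blank lines pass through untouched.
    if not stripped or stripped.startswith("#"):
        return line
    parts = stripped.rsplit(None, 2)
    # A timestamped sample ends with "<value> <integer timestamp>".
    if len(parts) == 3 and _INT_RE.fullmatch(parts[2]) and _is_float(parts[1]):
        return f"{parts[0]} {parts[1]}"
    return line


def strip_timestamps(payload: str) -> str:
    """Apply strip_timestamp to every line of a scrape payload."""
    return "\n".join(strip_timestamp(line) for line in payload.splitlines())


if __name__ == "__main__":
    # Hypothetical sample line, not taken from real MCAC output.
    print(strip_timestamp('example_metric{table="t1"} 42.0 1624932662650'))
```

On the Prometheus side, the scrape configuration also has an `honor_timestamps` option; setting it to `false` makes Prometheus ignore timestamps exposed by the target. Whether that is acceptable for this setup is not established in this thread.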
This is causing the following error when scraping in Prometheus: