Prometheus ec2 monitoring #182
…ess from monitoring server to clickhouse proxy server
Terraform Run Output 🤖
Format and Style 🖌
| Pusher | @LDiazN |
| Action | pull_request |
| Environment | dev |
| Workflow | .github/workflows/check_terraform.yml |
| Last updated | Thu, 13 Feb 2025 15:23:03 GMT |
Ansible Run Output 🤖
Ansible Playbook Recap 🔍
Ansible playbook output 📖
| Pusher | @LDiazN |
| Action | pull_request |
| Working Directory | |
| Workflow | .github/workflows/check_ansible.yml |
| Last updated | Thu, 13 Feb 2025 15:23:43 GMT |
I left two really minor comments, but otherwise this looks really great! Thanks for putting it together. I would suggest fixing those now in this PR, but either way let's be sure to do it prior to prod rollout.
LGTM 🐋
Originally this PR was about adding Prometheus monitoring to services and nodes in the ECS cluster. We then realized that application-level metrics are trickier to implement: tasks deployed on ECS are assigned a random port, while the standard EC2 discovery settings in Prometheus require the port to be provided in advance. For this reason, this PR only sets up node-level metrics, since each node can run a node exporter process on a fixed port (see #179).
Since the ECS nodes are not reachable through the Internet, we added a proxy server to forward scrape requests from the monitoring server to the actual nodes.
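As a rough illustration of the setup described above, a Prometheus scrape job using EC2 service discovery with a fixed port and a proxy might look like the sketch below. This is not the configuration from this PR: the job name, region, tag filter, and proxy address are all hypothetical, and it assumes the proxy server behaves as an HTTP forward proxy that Prometheus can reach through `proxy_url`. Port 9100 is the node exporter default.

```yaml
scrape_configs:
  - job_name: ecs-nodes                      # hypothetical job name
    # Assumption: the proxy accepts forwarded HTTP scrape requests.
    proxy_url: http://proxy.example.internal:9200   # hypothetical address
    ec2_sd_configs:
      - region: eu-central-1                 # assumed region
        port: 9100                           # fixed port, node exporter default
        filters:
          - name: tag:MonitoringActive       # hypothetical instance tag
            values: ["true"]
    relabel_configs:
      # Keep the EC2 instance ID as a label on every scraped series.
      - source_labels: [__meta_ec2_instance_id]
        target_label: instance_id
```

With EC2 discovery, the scrape address defaults to the instance's private IP plus the configured `port`, which is why a fixed node exporter port works here while randomly assigned ECS task ports do not.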
To achieve node level metrics scraping, we:
This PR solves #171 and #172 and depends on #179.