Metrics can be used to alert team members and to display system performance on dashboards. For details on the collected metrics, refer to the reference metrics list.
In Prometheus, the following exporters are enabled to collect the required system-level metrics for Chef Automate HA.
The Node exporter provides system-level metrics such as CPU, disk, and memory usage. To learn its full capabilities, refer to the Node exporter documentation.
For a Chef-managed Automate HA deployment, refer to the Node exporter service configuration file.
This section explains the process of monitoring Automate HA application services.
- Install the Blackbox exporter on the following servers:
  - All Chef Automate Frontend servers
  - All Chef Infra Frontend servers
  - Any other server used to monitor the load balancers
- Refer to the Blackbox exporter service and Blackbox configuration files.
- Configure the prometheus.yml file on the Prometheus server to capture metrics for the various Chef services. Refer to prometheus.yml for the following job configurations:
  - chef-server-url: Monitors the elastic load balancer for the Chef Infra Frontend servers.
  - chef-automate-url: Monitors the elastic load balancer for the Chef Automate Frontend servers.
  - chef-server-services.*: Monitors all services running on each Chef Infra Frontend server.
  - automate-services.*: Monitors all services running on each Chef Automate Frontend server.
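As a rough illustration of what one of the jobs listed above might look like, the sketch below writes a Blackbox-style scrape job to a scratch file so it can be reviewed before anything is merged into prometheus.yml. The load balancer URL, the `http_2xx` module name, and the exporter address `127.0.0.1:9115` are assumptions for illustration; the prometheus.yml referenced above remains the authoritative source.

```shell
# Scratch location so nothing under /etc/prometheus is touched.
scratch_dir=$(mktemp -d)
job_file="$scratch_dir/blackbox-job.yml"

# Hypothetical values: replace the target URL, module, and exporter address with your own.
cat > "$job_file" <<'EOF'
scrape_configs:
  - job_name: chef-automate-url
    metrics_path: /probe
    params:
      # Assumed module name; it must exist in your Blackbox configuration file.
      module: [http_2xx]
    static_configs:
      - targets:
          # Hypothetical load balancer URL.
          - https://automate.example.com
    relabel_configs:
      # Pass the target URL to the exporter as the ?target= parameter.
      - source_labels: [__address__]
        target_label: __param_target
      # Keep the probed URL as the instance label.
      - source_labels: [__param_target]
        target_label: instance
      # Send the scrape itself to the Blackbox exporter (assumed address).
      - target_label: __address__
        replacement: 127.0.0.1:9115
EOF

echo "Wrote $job_file"
```

The relabeling step is what lets one exporter probe many URLs: Prometheus scrapes the exporter, and the exporter probes the URL passed in the `target` parameter.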
The PostgreSQL exporter is installed on each node running PostgreSQL. To learn more about its capabilities, refer to the PostgreSQL exporter documentation. For its setup, refer to the PostgreSQL exporter service and PostgreSQL exporter config files.
The following steps guide you through configuring Prometheus Alertmanager.
- Change the current working directory to the home directory:
cd ~
- Refer to the Alertmanager Download Page for the latest version of Alertmanager.
- Execute the following commands to download and install Alertmanager:
curl -LO https://github.com/prometheus/alertmanager/releases/download/v0.25.0/alertmanager-0.25.0.linux-amd64.tar.gz
tar -xvf alertmanager-0.25.0.linux-amd64.tar.gz
mv alertmanager-0.25.0.linux-amd64/alertmanager /usr/local/bin
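Because the version number appears three times in the download step above, it can help to parameterize it. The following is a minimal sketch, assuming the release archive naming stays the same as for v0.25.0; the function names `alertmanager_url` and `install_alertmanager` are hypothetical helpers, not part of Alertmanager.

```shell
# Build the release archive URL for a given Alertmanager version and architecture.
alertmanager_url() {
  version=$1
  arch=${2:-amd64}
  echo "https://github.com/prometheus/alertmanager/releases/download/v${version}/alertmanager-${version}.linux-${arch}.tar.gz"
}

# Download, unpack, and install one version. Needs write access to /usr/local/bin.
install_alertmanager() {
  version=$1
  arch=${2:-amd64}
  curl -fLO "$(alertmanager_url "$version" "$arch")" &&
    tar -xf "alertmanager-${version}.linux-${arch}.tar.gz" &&
    mv "alertmanager-${version}.linux-${arch}/alertmanager" /usr/local/bin/
}

# Print the URL that would be downloaded for the version used in this guide.
alertmanager_url 0.25.0
```

Running `install_alertmanager 0.25.0` as root would then perform the same three steps as the commands above.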
- The following steps prepare the Alertmanager configuration and the receivers to which Alertmanager sends alerts.
- Create the Alertmanager configuration directory and open the configuration file:
mkdir /etc/alertmanager/
vi /etc/alertmanager/alertmanager.yml
- Depending on the alert integration (Slack, MS Teams, or PagerDuty), add the matching sections under the receivers section. The route section names the default receiver:
route:
  # A default receiver
  receiver: slack
- Refer to the Prometheus Alertmanager configuration documentation for detailed options.
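To show how the default route and a receiver fit together, here is a minimal sketch of an alertmanager.yml with a single Slack receiver, written to a scratch file rather than to /etc/alertmanager. The webhook URL and channel are placeholders, and the `group_by` setting is an illustrative choice; none of these values come from this guide.

```shell
scratch_dir=$(mktemp -d)
am_file="$scratch_dir/alertmanager.yml"

cat > "$am_file" <<'EOF'
route:
  # A default receiver
  receiver: slack
  # Illustrative grouping: one notification per alert name.
  group_by: [alertname]

receivers:
  - name: slack
    slack_configs:
      # Placeholder webhook URL and hypothetical channel; replace both.
      - api_url: https://hooks.slack.com/services/REPLACE/WITH/YOURS
        channel: '#alerts'
        send_resolved: true
EOF

# Validate with amtool when it is installed (it ships in the Alertmanager release archive).
if command -v amtool >/dev/null 2>&1; then
  amtool check-config "$am_file"
fi
```

The name under `receivers` has to match the name used in `route.receiver`; otherwise Alertmanager rejects the configuration at startup.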
Refer to the following integration sections for guidance on integrating with Alertmanager:
- Create an Alertmanager systemd service by opening the following file and adding the unit definition shown after the command:
vi /etc/systemd/system/alertmanager.service
[Unit]
Description=Alertmanager Server Service
Wants=network-online.target
After=network-online.target
[Service]
User=root
Group=root
Type=simple
ExecStart=/usr/local/bin/alertmanager \
--config.file /etc/alertmanager/alertmanager.yml
[Install]
WantedBy=multi-user.target
- Run the following commands to start and enable the service.
systemctl daemon-reload
systemctl start alertmanager
systemctl status alertmanager
systemctl enable alertmanager
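After the service starts, it can be useful to confirm that Alertmanager actually answers HTTP requests rather than relying on the systemctl status output alone. The sketch below polls an endpoint until it responds; `wait_for_http` is a hypothetical helper, and the defaults of 10 attempts and a 2-second delay are arbitrary choices.

```shell
# Poll an HTTP endpoint until it returns a successful status or the attempts run out.
# Usage: wait_for_http URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_http() {
  url=$1
  attempts=${2:-10}
  delay=${3:-2}
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl fail on HTTP errors; --max-time bounds each attempt.
    if curl -fsS -o /dev/null --max-time 5 "$url"; then
      return 0
    fi
    # Wait before the next attempt, but not after the last one.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}
```

For example, `wait_for_http http://localhost:9093/-/healthy && echo "Alertmanager is up"` checks Alertmanager's health endpoint on its default port, the same port used for the Prometheus alerting target in this guide.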
- Add the following configuration to Prometheus:
vi /etc/prometheus/prometheus.yml
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - localhost:9093
rule_files:
  - "system_rules.yml"
- Create all the system-specific rules in the following file:
vi /etc/prometheus/system_rules.yml
groups:
  - name: node-exporter
    rules:
      - alert: InstanceDown
        expr: up == 0
        for: 1m
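The InstanceDown rule above fires without any labels or message text. As a sketch of how it could be extended, the following writes a variant with a severity label and annotations to a scratch file and checks it with promtool when available. The `severity: critical` value and the annotation wording are illustrative assumptions, not values from this guide.

```shell
scratch_dir=$(mktemp -d)
rules_file="$scratch_dir/system_rules.yml"

# The quoted 'EOF' keeps the shell from expanding the {{ $labels... }} templates.
cat > "$rules_file" <<'EOF'
groups:
  - name: node-exporter
    rules:
      - alert: InstanceDown
        expr: up == 0
        for: 1m
        labels:
          # Hypothetical label; Alertmanager routes can match on it.
          severity: critical
        annotations:
          summary: "Instance {{ $labels.instance }} is down"
          description: "{{ $labels.job }} target {{ $labels.instance }} has been unreachable for more than 1 minute."
EOF

# Check the rule syntax when promtool is installed.
if command -v promtool >/dev/null 2>&1; then
  promtool check rules "$rules_file"
fi
```

Labels such as `severity` travel with the alert to Alertmanager, where a route can use them to pick a receiver; annotations only carry text for the notification.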
Execute the following command to validate the configuration:
promtool check config /etc/prometheus/prometheus.yml
Output:
Checking /etc/prometheus/prometheus.yml
SUCCESS: 1 rule file found
SUCCESS: /etc/prometheus/prometheus.yml is a valid Prometheus config file syntax
Checking /etc/prometheus/system_rules.yml
SUCCESS: 1 rule found
Then reload systemd and start Prometheus:
systemctl daemon-reload
systemctl start prometheus
systemctl status prometheus