Metrics and Alertmanager Configuration

Enabling and Configuring Metrics

Metrics drive the alerts sent to team members and show system performance on dashboards. For details on the collected metrics, refer to the reference metrics list.

Configure System Level Monitoring

Prometheus uses the following exporter to collect the required system-level metrics for Chef Automate HA.

The Node exporter provides system-level metrics such as CPU, disk, and memory usage. To learn its full capabilities, refer to the node exporter documentation.

For a Chef-managed Automate HA deployment, refer to the node exporter service configuration file.

Configure Application Level Monitoring

This section explains the process of monitoring Automate HA application services.

  1. Install the Blackbox exporter on the following servers.

    • All Chef Automate Frontend Servers

    • All Chef Infra Frontend Servers

    • Any other server used to monitor the load balancers

  2. Refer to the Blackbox Exporter Service and Blackbox configuration files.

  3. Configure the prometheus.yml file on the Prometheus server to capture metrics for the Chef services. Refer to the prometheus.yml for the following job configurations.

    • chef-server-url: Monitors the elastic load balancer for the Chef Infra Frontend servers.

    • chef-automate-url: Monitors the elastic load balancer for the Chef Automate Frontend servers.

    • chef-server-services.*: Monitors all services running on each Chef Infra Frontend server.

    • automate-services.*: Monitors all services running on each Chef Automate Frontend server.
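As a rough illustration of what a URL-probing job such as chef-automate-url can look like, the sketch below writes an example scrape job to a scratch file for review. It follows the relabeling pattern commonly used with the Blackbox exporter. The target URL, the `http_2xx` module name, and the exporter address `localhost:9115` are assumptions made for illustration, not values taken from the referenced prometheus.yml.

```shell
# Sketch only: write an example Blackbox scrape job to a scratch file.
# Assumed (hypothetical) values: the target URL, the http_2xx module,
# and the exporter address localhost:9115.
fragment="$(mktemp)"
cat > "$fragment" <<'EOF'
scrape_configs:
  - job_name: chef-automate-url
    metrics_path: /probe
    params:
      module: [http_2xx]
    static_configs:
      - targets:
          - https://automate.example.com
    relabel_configs:
      # Pass the listed target to the exporter as the ?target= parameter.
      - source_labels: [__address__]
        target_label: __param_target
      # Keep the probed URL as the instance label.
      - source_labels: [__param_target]
        target_label: instance
      # Scrape the Blackbox exporter itself rather than the target.
      - target_label: __address__
        replacement: localhost:9115
EOF
echo "Example job written to $fragment"
```

Review the fragment against the referenced prometheus.yml before merging anything into a live configuration.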

Configure PostgreSQL Metrics

The PostgreSQL exporter is installed on each node running PostgreSQL. Refer to the PostgreSQL exporter documentation to learn more about its capabilities. Refer to the PostgreSQL exporter services and PostgreSQL exporter config files.

Installing Alertmanager

The following steps guide you through installing Prometheus Alertmanager.

  1. Change the current working directory to the home directory:

    cd ~
  2. Refer to the Alertmanager Download Page for the updated version of Alertmanager.

  3. Execute the following commands to download and install Alertmanager:

    curl -LO  https://github.com/prometheus/alertmanager/releases/download/v0.25.0/alertmanager-0.25.0.linux-amd64.tar.gz
    tar -xvf alertmanager-0.25.0.linux-amd64.tar.gz
    mv alertmanager-0.25.0.linux-amd64/alertmanager /usr/local/bin
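Because the version on the download page changes over time, it can help to build the download URL from a version variable. The helper below is a minimal sketch that follows the URL pattern of the command above. The function name `alertmanager_url` and the default architecture argument are hypothetical.

```shell
# Sketch: build the release URL from a version and an architecture.
# alertmanager_url is a hypothetical helper; the URL pattern is copied
# from the download command in this guide.
alertmanager_url() {
    version="$1"
    arch="${2:-linux-amd64}"
    printf 'https://github.com/prometheus/alertmanager/releases/download/v%s/alertmanager-%s.%s.tar.gz\n' \
        "$version" "$version" "$arch"
}

# Example usage (not executed here):
#   curl -LO "$(alertmanager_url 0.25.0)"
```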

Configure Alertmanager

Prerequisites

  • The following steps guide you through preparing the receivers to which Alertmanager sends alerts, and through the Alertmanager configuration itself.

Configuration

  1. Create the Alertmanager configuration directory and open the configuration file:

    mkdir /etc/alertmanager/
    vi /etc/alertmanager/alertmanager.yml
  2. Based on your alert integration (Slack, MS Teams, or PagerDuty), add the following route section, which names the default receiver.

    route:
        # A default receiver
        receiver: slack
  3. Refer to the Prometheus Alertmanager configuration documentation for detailed options.

Refer to the following sections for guidance on integrating each receiver with Alertmanager:

  1. Slack Integration

  2. PagerDuty Integration

  3. MS Teams Integration
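To show how the route and a receiver fit together, the sketch below writes a small example alertmanager.yml to a scratch file and validates it with `amtool check-config` when amtool is available. The webhook URL, the channel name, and the grouping intervals are placeholder assumptions. The integration sections listed above remain the authoritative source for receiver settings.

```shell
# Sketch: example Alertmanager config with a default route and one Slack
# receiver. The webhook URL, channel, and intervals are placeholders.
config="$(mktemp)"
cat > "$config" <<'EOF'
route:
  # A default receiver
  receiver: slack
  group_by: ['alertname']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h
receivers:
  - name: slack
    slack_configs:
      - api_url: https://hooks.slack.com/services/REPLACE/WITH/WEBHOOK
        channel: '#alerts'
        send_resolved: true
EOF

# Validate only if amtool is installed on this machine.
if command -v amtool >/dev/null 2>&1; then
    amtool check-config "$config"
else
    echo "amtool not found; skipping validation of $config"
fi
```

The receiver named under `route` must match a `name` entry under `receivers`, otherwise Alertmanager rejects the file.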

  • Create an Alertmanager service by opening the unit file and adding the following content:

    vi /etc/systemd/system/alertmanager.service
    [Unit]
    Description=Alertmanager Server Service
    Wants=network-online.target
    After=network-online.target
    [Service]
    User=root
    Group=root
    Type=simple
    ExecStart=/usr/local/bin/alertmanager \
        --config.file /etc/alertmanager/alertmanager.yml
    [Install]
    WantedBy=multi-user.target
  • Run the following commands to start and enable the service.

    systemctl daemon-reload
    systemctl start alertmanager
    systemctl status alertmanager
    systemctl enable alertmanager
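Once the service is running, Alertmanager's `/-/healthy` endpoint can confirm that it is serving requests. The function below is a minimal sketch. The name `check_alertmanager` is hypothetical, and the default address `http://localhost:9093` assumes Alertmanager listens on the same port used later in this guide.

```shell
# Sketch: probe the Alertmanager health endpoint.
# check_alertmanager is a hypothetical helper; the default base URL is
# an assumption based on the localhost:9093 target used in this guide.
check_alertmanager() {
    base_url="${1:-http://localhost:9093}"
    curl --silent --fail --max-time 5 "$base_url/-/healthy" >/dev/null
}

if check_alertmanager; then
    echo "Alertmanager is healthy"
else
    echo "Alertmanager did not respond on the health endpoint"
fi
```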

Configuring Alertmanager with Prometheus

  1. Add the following configuration to the Prometheus configuration file:

    vi /etc/prometheus/prometheus.yml
    alerting:
        alertmanagers:
            - static_configs:
                - targets:
                    - localhost:9093
    rule_files:
        - "system_rules.yml"
  2. Create the system-specific rules in the following file:

    vi /etc/prometheus/system_rules.yml
    groups:
        - name: node-exporter
          rules:
            - alert: InstanceDown
              expr: up == 0
              for: 1m
  3. Execute the following command to validate the configuration:

    promtool check config /etc/prometheus/prometheus.yml

    Output:

    Checking /etc/prometheus/prometheus.yml
        SUCCESS: 1 rule file found
        SUCCESS: /etc/prometheus/prometheus.yml is a valid Prometheus config file syntax
    Checking /etc/prometheus/system_rules.yml
        SUCCESS: 1 rule found

  4. Reload systemd and start Prometheus:

    systemctl daemon-reload
    systemctl start prometheus
    systemctl status prometheus
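Alert rules are often easier to route and read when they carry labels and annotations. The sketch below writes an extended version of the InstanceDown rule to a scratch file and checks it with `promtool check rules` when promtool is available. The `severity` label value and the annotation text are illustrative assumptions, not part of the rule file described above.

```shell
# Sketch: InstanceDown rule with an assumed severity label and summary
# annotation, written to a scratch file rather than /etc/prometheus.
rules="$(mktemp)"
cat > "$rules" <<'EOF'
groups:
  - name: node-exporter
    rules:
      - alert: InstanceDown
        expr: up == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "Instance {{ $labels.instance }} is down"
EOF

# Validate only if promtool is installed on this machine.
if command -v promtool >/dev/null 2>&1; then
    promtool check rules "$rules"
else
    echo "promtool not found; skipping validation of $rules"
fi
```

The heredoc delimiter is quoted so the shell leaves the `{{ $labels.instance }}` template untouched for Prometheus to expand.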