This repository has been archived by the owner on Apr 19, 2024. It is now read-only.

poor Dashboard->Alarms performance when there are a lot of alarms under the same section (family) #309

Open
ilyam8 opened this issue Sep 16, 2021 · 0 comments

ilyam8 commented Sep 16, 2021

@ilyam8 commented on Mon Apr 05 2021

Bug report summary

It takes 6-7 seconds to load netdata alarms when there are a lot of alarms under the same section.

Screenshot 2021-04-05 at 15 39 57

The exact number of alarms in my test is 100:

[pc ilyam]# curl -s http://127.0.0.1:19999/api/v1/alarms?all | jq ".alarms" | jq length
100
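
To confirm that the alarms really are concentrated in one section, the same endpoint can be grouped by family. This is only a minimal sketch: it assumes the response has an `alarms` object keyed by alarm name and that each entry carries a `family` field. The helper names `count_alarms_by_family` and `fetch_alarms` are hypothetical, not part of netdata.

```python
import json
from collections import Counter
from urllib.request import urlopen

# URL taken from the curl command above; adjust host/port for your agent.
ALARMS_URL = "http://127.0.0.1:19999/api/v1/alarms?all"


def count_alarms_by_family(payload):
    """Return a Counter mapping family -> number of alarms.

    Assumption: payload["alarms"] is an object keyed by alarm name and each
    entry has a "family" field. Entries without one are counted as "(none)".
    """
    alarms = payload.get("alarms", {})
    return Counter(entry.get("family", "(none)") for entry in alarms.values())


def fetch_alarms(url=ALARMS_URL):
    """Fetch and decode the alarms JSON from a running agent."""
    with urlopen(url) as response:
        return json.load(response)
```

Calling `count_alarms_by_family(fetch_alarms()).most_common()` against a running agent lists the families with the most alarms first.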

I was trying to add alarms for the state of systemd units. The systemdunits collector creates one chart per unit type. The number of dimensions in each chart equals the number of units of that type (up to 90 on my server).

I added the following alarm:

template: systemd_service_units_state
      on: systemd.service_units_state
  lookup: max -1s foreach *
   units: ok/failed
   every: 10s
    warn: $this != nan AND $this == 5
   delay: down 5m multiplier 1.5 max 1h
    info: systemd service crashed
      to: sysadmin

And noticed performance problems.

OS / Environment
Linux pc 5.10.23-1-MANJARO #1 SMP PREEMPT Thu Mar 11 18:47:18 UTC 2021 x86_64 GNU/Linux
release
/etc/arch-release:Manjaro Linux
/etc/lsb-release:DISTRIB_ID=ManjaroLinux
/etc/lsb-release:DISTRIB_RELEASE=21.0
/etc/lsb-release:DISTRIB_CODENAME=Ornara
/etc/lsb-release:DISTRIB_DESCRIPTION="Manjaro Linux"
/etc/manjaro-release:Manjaro Linux
/etc/os-release:NAME="Manjaro Linux"
/etc/os-release:ID=manjaro
/etc/os-release:ID_LIKE=arch
/etc/os-release:BUILD_ID=rolling
/etc/os-release:PRETTY_NAME="Manjaro Linux"
/etc/os-release:ANSI_COLOR="32;1;24;144;200"
/etc/os-release:HOME_URL="https://manjaro.org/"
/etc/os-release:DOCUMENTATION_URL="https://wiki.manjaro.org/"
/etc/os-release:SUPPORT_URL="https://manjaro.org/"
/etc/os-release:BUG_REPORT_URL="https://bugs.manjaro.org/"
/etc/os-release:LOGO=manjarolinux
Netdata version

Version: netdata v1.30.0-3-g0068b7c1

buildinfo
Configure options:  '--prefix=/opt/netdata/usr' '--sysconfdir=/opt/netdata/etc' '--localstatedir=/opt/netdata/var' '--libexecdir=/opt/netdata/usr/libexec' '--libdir=/opt/netdata/usr/lib' '--with-zlib' '--with-math' '--with-user=netdata' '--disable-cloud' 'CFLAGS=-O2' 'LDFLAGS='
Features:
    dbengine:                YES
    Native HTTPS:            YES
    Netdata Cloud:           NO (by user request)
    TLS Host Verification:   YES
Libraries:
    jemalloc:                NO
    JSON-C:                  YES
    libcap:                  YES
    libcrypto:               YES
    libm:                    YES
    LWS:                     YES shared-lib
    mosquitto:               YES
    tcalloc:                 NO
    zlib:                    YES
Plugins:
    apps:                    YES
    cgroup Network Tracking: YES
    CUPS:                    YES
    EBPF:                    YES
    IPMI:                    NO
    NFACCT:                  NO
    perf:                    YES
    slabinfo:                YES
    Xen:                     NO
    Xen VBD Error Tracking:  NO
Exporters:
    AWS Kinesis:             NO
    GCP PubSub:              NO
    MongoDB:                 NO
    Prometheus Remote Write: YES
Installation method

From source.

Component Name

web/gui

Steps To Reproduce

Easy to reproduce using the go.d/example collector:

  1. Activate the go.d/example collector:
# go.d.conf

modules:
  example: yes

  2. Configure a go.d/example job:
# go.d/example.conf

jobs:
  - name: myexample
    charts:
      num: 5
      dimensions: 20

  3. Create an alarm:
# /etc/netdata/health.d/example.conf

template: my_example
      on: example.random
  lookup: max -1s foreach *
   units: ok/failed
   every: 10s
    warn: $this != nan AND $this == nan
   delay: down 5m multiplier 1.5 max 1h
    info: example
      to: sysadmin

  4. Restart the netdata service and go to the Dashboard->Alarms section.

Expected behavior

Loading netdata alarms should not take 6-7 seconds when there are a lot of alarms under the same section. I expect it to be fast (less than a second?).


@ilyam8 commented on Mon Apr 05 2021

@jacekkolasa I'm not 100% sure, but I think this is a web-related issue, so I'm assigning it to you. If you need help reproducing the problem (see Steps To Reproduce), let me know.
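
One way to check whether the delay comes from the API or from dashboard rendering is to time the API call on its own. This is a rough sketch, not a definitive benchmark: `fetch_bytes` and `time_calls` are hypothetical helper names, and the URL is the one from the curl command in the report.

```python
import time
from urllib.request import urlopen

# Same endpoint as in the report; adjust host/port for your agent.
ALARMS_URL = "http://127.0.0.1:19999/api/v1/alarms?all"


def fetch_bytes(url=ALARMS_URL):
    """Download the raw alarms response from a running agent."""
    with urlopen(url) as response:
        return response.read()


def time_calls(fetch, repeats=5):
    """Call fetch() `repeats` times and return each duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fetch()
        durations.append(time.perf_counter() - start)
    return durations
```

If `time_calls(fetch_bytes)` reports durations far below the 6-7 seconds seen in the browser, that would point at the web UI rather than the agent.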
