Feat: Improve Prometheus Metrics (#1338)
* Feat: Improve Prometheus Metrics

* Remove console.log

* Fix test

* Fix broken test

* Feat: Updated Metrics

* Add docs for default metrics

* Clarify docs

* Reverse the FAILED_ASSERTION_ALERT hacks
dennypradipta authored Jan 31, 2025
1 parent ffc8e9b commit 7a84661
Showing 9 changed files with 211 additions and 93 deletions.
22 changes: 15 additions & 7 deletions docs/src/pages/guides/cli-options.md
@@ -253,13 +253,21 @@ Then you can scrape the metrics from `http://localhost:3001/metrics`.

Monika exposes [Prometheus default metrics](https://prometheus.io/docs/instrumenting/writing_clientlibs/#standard-and-runtime-collectors), [Node.js specific metrics](https://github.com/siimon/prom-client/tree/master/lib/metrics), and Monika probe metrics below.

| Metric Name | Type | Purpose | Label |
| -------------------------------------- | --------- | -------------------------------------------- | ------------------------------------------- |
| `monika_probes_total` | Gauge | Collect total probe | - |
| `monika_request_status_code_info` | Gauge | Collect HTTP status code | `id`, `name`, `url`, `method` |
| `monika_request_response_time_seconds` | Histogram | Collect duration of probe request in seconds | `id`, `name`, `url`, `method`, `statusCode` |
| `monika_request_response_size_bytes` | Gauge | Collect size of response size in bytes | `id`, `name`, `url`, `method`, `statusCode` |
| `monika_alert_total` | Counter | Collect total alert triggered | `id`, `name`, `url`, `method`, `alertQuery` |
| Metric Name | Type | Purpose | Labels |
| -------------------------------------- | --------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------- |
| `monika_alerts_triggered` | Counter | Indicates the count of incident alerts triggered | `id`, `name`, `url`, `method`, `alertQuery` |
| `monika_alerts_triggered_total` | Counter | Indicates the cumulative count of incident alerts triggered | - |
| `monika_probes_running` | Gauge | Indicates whether a probe is running (1) or idle (0). Running means the probe is currently sending requests, while idle means the probe is waiting for the next request to be sent. |
| `monika_probes_running_total` | Gauge | Indicates the total count of probes that are currently running. Running means the probe is currently sending requests. | - |
| `monika_probes_status` | Gauge | Indicates whether a probe is healthy (1) or is having an incident (0) | `id`, `name`, `url`, `method` |
| `monika_probes_total` | Gauge | Total count of all probes configured | - |
| `monika_request_response_size_bytes` | Gauge | Indicates the size of probe request's response in bytes | `id`, `name`, `url`, `method`, `statusCode`, `result` |
| `monika_request_response_time_seconds` | Histogram | Indicates the duration of the probe request in seconds | `id`, `name`, `url`, `method`, `statusCode`, `result` |
| `monika_request_status_code_info` | Gauge | Indicates the HTTP status code of the probe requests' response(s) | `id`, `name`, `url`, `method` |
| `monika_notifications_triggered` | Counter | Indicates the count of notifications triggered | `type`, `status` |
| `monika_notifications_triggered_total` | Counter | Indicates the cumulative count of notifications triggered | - |

Aside from the above metrics, Monika also exposes [Prometheus default metrics](https://prometheus.io/docs/instrumenting/writing_clientlibs/#standard-and-runtime-collectors) and [Node.js specific metrics](https://github.com/siimon/prom-client/tree/master/lib/metrics).

## Repeat
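
Reviewer note: to make the label columns in the table above more concrete, here is a minimal sketch of how one labelled sample is written in the Prometheus text exposition format that a `/metrics` endpoint serves. `formatSample` and `escapeLabelValue` are hypothetical helpers written for this note; they are not part of Monika or prom-client, and the label values used are illustrative only.

```typescript
// Hypothetical helpers for illustration; not part of Monika or prom-client.
type Labels = Record<string, string>

// The text exposition format requires backslash, double quote, and newline
// to be escaped inside label values.
function escapeLabelValue(value: string): string {
  return value
    .replace(/\\/g, '\\\\')
    .replace(/"/g, '\\"')
    .replace(/\n/g, '\\n')
}

// Render one sample line: name, optional {label="value",...} block, then the value.
function formatSample(name: string, labels: Labels, value: number): string {
  const pairs = Object.entries(labels).map(
    ([key, labelValue]) => `${key}="${escapeLabelValue(labelValue)}"`
  )
  const labelPart = pairs.length > 0 ? `{${pairs.join(',')}}` : ''
  return `${name}${labelPart} ${value}`
}
```

Under these assumptions, `formatSample('monika_notifications_triggered', { type: 'slack', status: 'success' }, 3)` produces `monika_notifications_triggered{type="slack",status="success"} 3`, while a metric without labels, such as `monika_probes_total`, is rendered as just the name and the value.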

16 changes: 10 additions & 6 deletions packages/notification/index.ts
@@ -34,29 +34,33 @@ async function sendNotifications(
notifications: Notification[],
message: NotificationMessage,
sender?: InputSender
): Promise<void> {
): Promise<{ type: string; success: boolean }[]> {
if (sender) {
updateSender(sender)
}

await Promise.all(
// Map notifications to an array of results
const results = await Promise.all(
notifications.map(async ({ data, type }) => {
const channel = channels[type]

try {
if (!channel) {
throw new Error('Notification channel is not available')
}

await channel.send(data, message)
return { type, success: true }
} catch (error: unknown) {
const message = getErrorMessage(error)
throw new Error(
`Failed to send message using ${type}, please check your ${type} notification config.\nMessage: ${message}`
const errorMessage = getErrorMessage(error)
console.error(
`Failed to send message using ${type}, please check your ${type} notification config.\nMessage: ${errorMessage}`
)
return { type, success: false }
}
})
)

return results
}

export { sendNotifications }
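
Reviewer note: the change above stops `sendNotifications` from rejecting on the first failed channel and instead returns one result per notification. A minimal, self-contained sketch of that pattern follows. The `Notification`, `Channel`, and `SendResult` types, the `channels` registry, and the `sendAll` name are simplified stand-ins assumed for this note, not the package's real definitions.

```typescript
// Hypothetical, simplified types and channels for illustration only.
type Notification = { type: string; data: unknown }
type Channel = { send: (data: unknown, message: string) => Promise<void> }
type SendResult = { type: string; success: boolean }

const channels: Record<string, Channel | undefined> = {
  ok: { send: async () => {} },
  broken: {
    send: async () => {
      throw new Error('simulated failure')
    },
  },
}

async function sendAll(
  notifications: Notification[],
  message: string
): Promise<SendResult[]> {
  return Promise.all(
    notifications.map(async ({ data, type }): Promise<SendResult> => {
      try {
        const channel = channels[type]
        if (!channel) {
          throw new Error('Notification channel is not available')
        }

        await channel.send(data, message)
        return { type, success: true }
      } catch {
        // Each failure is converted into a result, so Promise.all never
        // rejects and the remaining channels still report their outcome.
        return { type, success: false }
      }
    })
  )
}
```

Because every mapper resolves, the returned array always has the same length and order as the input, which is what lets a caller count successes and failures per channel type.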
10 changes: 9 additions & 1 deletion src/components/notification/index.ts
@@ -22,11 +22,13 @@
* SOFTWARE. *
**********************************************************************************/

import { getEventEmitter } from '../../utils/events'
import { ValidatedResponse } from '../../plugins/validate-response'
import getIp from '../../utils/ip'
import { getMessageForAlert } from './alert-message'
import { sendNotifications } from '@hyperjumptech/monika-notification'
import type { Notification } from '@hyperjumptech/monika-notification'
import events from '../../events'

type SendAlertsProps = {
probeID: string
@@ -54,5 +56,11 @@ export async function sendAlerts({
response: validation.response,
})

return sendNotifications(notifications, message)
const results = await sendNotifications(notifications, message)
for (const result of results) {
getEventEmitter().emit(events.notifications.sent, {
type: result.type,
status: result.success ? 'success' : 'failed',
})
}
}
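
Reviewer note: the loop added to `sendAlerts` turns each send result into one `NOTIFICATIONS_SENT` event. The sketch below isolates that step using Node's built-in `EventEmitter`. The `emitNotificationResults` function and the two types are hypothetical names introduced for this note; only the event name and the payload shape are taken from the diff.

```typescript
import { EventEmitter } from 'node:events'

// Hypothetical types mirroring the shapes used in the diff.
type SendResult = { type: string; success: boolean }
type NotificationSentEvent = { type: string; status: 'success' | 'failed' }

const NOTIFICATIONS_SENT = 'NOTIFICATIONS_SENT'

// Emit one event per result; listeners run synchronously inside emit().
function emitNotificationResults(
  emitter: EventEmitter,
  results: SendResult[]
): void {
  for (const result of results) {
    const payload: NotificationSentEvent = {
      type: result.type,
      status: result.success ? 'success' : 'failed',
    }
    emitter.emit(NOTIFICATIONS_SENT, payload)
  }
}
```

Emitting per result rather than once per batch keeps the payload flat, so a listener can use `type` and `status` directly as metric labels.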
14 changes: 13 additions & 1 deletion src/components/probe/prober/http/index.ts
@@ -159,6 +159,12 @@ export class HTTPProber extends BaseProber {
response,
})

getEventEmitter().emit(events.probe.status.changed, {
probe: this.probeConfig,
requestIndex,
status: 'up',
})

this.logMessage(
true,
getProbeResultMessage({
Expand Down Expand Up @@ -226,10 +232,16 @@ export class HTTPProber extends BaseProber {
}
const alertId = getAlertID(url, validation, probeID)

getEventEmitter().emit(events.probe.status.changed, {
probe: this.probeConfig,
requestIndex,
status: 'down',
})

getEventEmitter().emit(events.probe.alert.triggered, {
probe: this.probeConfig,
requestIndex,
alertQuery: '',
alertQuery: triggeredAlert,
})

addIncident({
6 changes: 6 additions & 0 deletions src/components/probe/prober/index.ts
@@ -134,6 +134,7 @@ export abstract class BaseProber implements Prober {

// this probe is definitely in incident state because of fail assertion, so send notification, etc.
this.handleFailedProbe(probeResults)

return
}

@@ -148,6 +149,11 @@ export abstract class BaseProber implements Prober {
requestIndex: index,
response: requestResponse,
})
getEventEmitter().emit(events.probe.status.changed, {
probe: this.probeConfig,
requestIndex: index,
status: 'up',
})
logResponseTime(requestResponse.responseTime)

if (
6 changes: 6 additions & 0 deletions src/events/index.ts
@@ -34,6 +34,9 @@ export default {
sanitized: 'CONFIG_SANITIZED',
updated: 'CONFIG_UPDATED',
},
notifications: {
sent: 'NOTIFICATIONS_SENT',
},
probe: {
alert: {
triggered: 'PROBE_ALERT_TRIGGERED',
@@ -46,5 +49,8 @@ export default {
notification: {
willSend: 'PROBE_NOTIFICATION_WILL_SEND',
},
status: {
changed: 'PROBE_STATUS_CHANGED',
},
},
}
4 changes: 4 additions & 0 deletions src/loaders/index.ts
@@ -82,6 +82,8 @@ function initPrometheus(prometheusPort: number) {
decrementProbeRunningTotal,
incrementProbeRunningTotal,
resetProbeRunningTotal,
collectProbeStatus,
collectNotificationSentMetrics,
} = new PrometheusCollector()

// collect prometheus metrics
@@ -93,6 +95,8 @@ function initPrometheus(prometheusPort: number) {
eventEmitter.on(events.probe.ran, incrementProbeRunningTotal)
eventEmitter.on(events.probe.finished, decrementProbeRunningTotal)
eventEmitter.on(events.config.updated, resetProbeRunningTotal)
eventEmitter.on(events.probe.status.changed, collectProbeStatus)
eventEmitter.on(events.notifications.sent, collectNotificationSentMetrics)

startPrometheusMetricsServer(prometheusPort)
}
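
Reviewer note: `initPrometheus` destructures collector methods from a `PrometheusCollector` instance and registers them as event listeners. The sketch below shows one way such a listener could maintain a per-probe status value (1 for `up`, 0 for `down`). `ProbeStatusGauge` is a hypothetical in-memory stand-in written for this note; the real collector in this commit is not shown in the excerpt and may be implemented differently.

```typescript
import { EventEmitter } from 'node:events'

// Payload shape taken from the diff, with the probe reduced to its id (an assumption).
type ProbeStatusEvent = {
  probe: { id: string }
  requestIndex: number
  status: 'up' | 'down'
}

// Hypothetical in-memory stand-in for a labelled gauge.
class ProbeStatusGauge {
  private readonly values = new Map<string, number>()

  // Arrow-function property: it keeps `this` bound even when the method is
  // destructured from the instance and passed around as a bare listener.
  collect = ({ probe, status }: ProbeStatusEvent): void => {
    this.values.set(probe.id, status === 'up' ? 1 : 0)
  }

  get(probeId: string): number | undefined {
    return this.values.get(probeId)
  }
}

const statusEmitter = new EventEmitter()
const statusGauge = new ProbeStatusGauge()
const { collect } = statusGauge
statusEmitter.on('PROBE_STATUS_CHANGED', collect)

statusEmitter.emit('PROBE_STATUS_CHANGED', {
  probe: { id: 'probe-1' },
  requestIndex: 0,
  status: 'up',
})
statusEmitter.emit('PROBE_STATUS_CHANGED', {
  probe: { id: 'probe-1' },
  requestIndex: 0,
  status: 'down',
})
```

The binding detail matters for this wiring style: a regular prototype method that reads `this` would fail once destructured, so a collector used this way needs arrow-function properties, explicit binding, or methods that do not touch `this`.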