Send troubleshooting metrics (#142)
* Implements POC

* Fix

* Encapsulates metric harvester so it only sends metrics when enabled

* Adds payloadSize metric

* Implements sending troubleshooting metrics based on optional sendMetrics configuration option

* Fix type alias removed by mistake

* Fix type alias removed by mistake

* Fixes error that returned a nil metrics client instead of a noop one if no Metrics API URL mapping existed

* Include hasError dimension in relevant metrics

* Updates readme with new related documentation for the sendMetrics configuration option

* Adds Dashboard template and README notes

* Set timeseries to max resolution

* Bump version

* Include metrics module in Dockerfiles copy operations
jsubirat authored Dec 7, 2023
1 parent fb0a33f commit 64dc438
Showing 472 changed files with 115,783 additions and 120,040 deletions.
1 change: 1 addition & 0 deletions Dockerfile
@@ -4,6 +4,7 @@ WORKDIR /go/src/github.com/newrelic/newrelic-fluent-bit-output

COPY Makefile go.* *.go /go/src/github.com/newrelic/newrelic-fluent-bit-output/
COPY config/ /go/src/github.com/newrelic/newrelic-fluent-bit-output/config
COPY metrics/ /go/src/github.com/newrelic/newrelic-fluent-bit-output/metrics
COPY nrclient/ /go/src/github.com/newrelic/newrelic-fluent-bit-output/nrclient
COPY record/ /go/src/github.com/newrelic/newrelic-fluent-bit-output/record
COPY utils/ /go/src/github.com/newrelic/newrelic-fluent-bit-output/utils
1 change: 1 addition & 0 deletions Dockerfile.windows
@@ -36,6 +36,7 @@ RUN setx PATH "C:\ProgramData\chocolatey\lib\mingw\tools\install\mingw\bin;%PATH
# Compile the newrelic-fluent-bit-output plugin
COPY Makefile go.* *.go ./
COPY config/ ./config
COPY metrics/ ./metrics
COPY nrclient/ ./nrclient
COPY record/ ./record
COPY utils/ ./utils
1 change: 1 addition & 0 deletions Dockerfile_debug
@@ -4,6 +4,7 @@ WORKDIR /go/src/github.com/newrelic/newrelic-fluent-bit-output

COPY Makefile go.* *.go /go/src/github.com/newrelic/newrelic-fluent-bit-output/
COPY config/ /go/src/github.com/newrelic/newrelic-fluent-bit-output/config
COPY metrics/ /go/src/github.com/newrelic/newrelic-fluent-bit-output/metrics
COPY nrclient/ /go/src/github.com/newrelic/newrelic-fluent-bit-output/nrclient
COPY record/ /go/src/github.com/newrelic/newrelic-fluent-bit-output/record
COPY utils/ /go/src/github.com/newrelic/newrelic-fluent-bit-output/utils
1 change: 1 addition & 0 deletions Dockerfile_firelens
@@ -4,6 +4,7 @@ WORKDIR /go/src/github.com/newrelic/newrelic-fluent-bit-output

COPY Makefile go.* *.go /go/src/github.com/newrelic/newrelic-fluent-bit-output/
COPY config/ /go/src/github.com/newrelic/newrelic-fluent-bit-output/config
COPY metrics/ /go/src/github.com/newrelic/newrelic-fluent-bit-output/metrics
COPY nrclient/ /go/src/github.com/newrelic/newrelic-fluent-bit-output/nrclient
COPY record/ /go/src/github.com/newrelic/newrelic-fluent-bit-output/record
COPY utils/ /go/src/github.com/newrelic/newrelic-fluent-bit-output/utils
22 changes: 20 additions & 2 deletions README.md
@@ -57,18 +57,19 @@ and one space between keys and values.
The plugin supports the following configuration parameters (apart from the ones provided out-of-the-box by Fluent Bit for the output plugins, such as the [retry options](#retry-logic)). Note that it's **mandatory to supply either `apiKey` or `licenseKey`**.
| Key | Description | Default |
| ------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | ------------------------------------- |
|--------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------------------------------|
| endpoint | The endpoint you send data to. By default, it sends it to the US (`endpoint=https://log-api.newrelic.com/log/v1`). Set it to `https://log-api.eu.newrelic.com/log/v1` to send it to the EU region. | `https://log-api.newrelic.com/log/v1` |
| apiKey | Your New Relic Insights Insert key. For information on how to find your New Relic Insights Insert key, take a look at the documentation [here](https://docs.newrelic.com/docs/insights/insights-data-sources/custom-data/send-custom-events-event-api#register). | (none) |
| licenseKey | Your New Relic License key | (none) |
| httpClientTimeout | Http Client timeout for sending the logs (in seconds) | 5 |
| httpClientTimeout | Http Client timeout for sending the logs (in seconds) | 5 |
| maxBufferSize | **[Deprecated since 1.3.0]** The maximum size the payloads sent in bytes | 256000 |
| maxRecords | **[Deprecated since 1.3.0]** The maximum number of records to send at a time | 1024 |
| proxy | Optional proxy to communicate with New Relic, overrides any environment-defined one. Must follow the format `https://user:password@hostname:port`. Can be HTTP or HTTPS. | (none) |
| ignoreSystemProxy | Ignore any proxy defined via the `HTTP_PROXY` or `HTTPS_PROXY` environment variables. Note that if a proxy has been defined using the `proxy` parameter, this one has no effect. | false |
| caBundleFile | **[LINUX HTTPS ONLY]** Specifies the Certificate Authority certificate to use for validating HTTPS connections against the proxy. Useful when the proxy uses a self-signed certificate. **The certificate file must be in the PEM format**. If not specified, then the operating system's CA list is used. Only used when `validateProxyCerts` is `true`. | (none) |
| caBundleDir | **[LINUX HTTPS ONLY]** Specifies a folder containing one or more Certificate Authority certificates to use for validating HTTPS connections against the proxy. Useful when the proxy uses a self-signed certificate. **Only certificate files in the PEM format and \*.pem extension will be considered**. If not specified, then the operating system's CA list is used. Only used when `validateProxyCerts` is `true`. | (none) |
| validateProxyCerts | **[HTTPS ONLY]** When using a HTTPS proxy, the proxy certificates are validated by default when establishing a HTTPS connection. To disable the proxy certificate validation, set `validateProxyCerts` to `false` (insecure) | true |
| sendMetrics | Set to `true` to send plugin troubleshooting metrics to the Metrics event type. See [this section](#troubleshooting-metrics) for more details. | false |
#### Proxy support
@@ -136,6 +137,23 @@ Fluent Bit provides an out-of-the-box retry logic, configurable via the `Retry_L
| Retry_Limit | N | Integer value to set the maximum number of retries allowed. N must be >= 1 (default: 1) |
| Retry_Limit | False | When Retry_Limit is set to False, there is no limit on the number of retries that the Scheduler can perform. |

#### Troubleshooting metrics
Set the `sendMetrics` option to `true` if you want to send troubleshooting metrics to your Metrics event type via the [Metrics API](https://docs.newrelic.com/docs/data-apis/ingest-apis/metric-api/introduction-metric-api/). Please note that **enabling this option will incur extra ingestion costs** due to the data size of the metrics stored in your New Relic account.

Please note that the **metrics reported by this plugin must not be considered a stable API: their names or dimensions can change at any time in newer plugin versions**. That is, **no critical alerts or dashboards should be built on them**. The sole purpose of these metrics is to let you troubleshoot a malfunctioning Fluent Bit installation.

The following are the metrics currently reported by the plugin:

| Metric name | Dimensions | Description | Units |
|---------------------------|-----------------------------------|-----------------------------------------------------------------------------------------------------------|---------------|
| logs.fb.packaging.time | hasError (bool) | Time used to package a Fluent Bit chunk into one or more <=1MB compressed New Relic payloads | milliseconds |
| logs.fb.payload.count | - | Amount of <=1MB compressed New Relic payloads that a single Fluent Bit chunk was divided into | integer count |
| logs.fb.total.send.time | - | Time used to send a single Fluent Bit chunk consisting of one or more <=1MB compressed New Relic payloads | milliseconds |
| logs.fb.payload.send.time | statusCode (int), hasError (bool) | Time used to send an individual <=1MB compressed New Relic payload | milliseconds |
| logs.fb.payload.size | statusCode (int), hasError (bool) | Compressed size of an individual <=1MB compressed New Relic payload | bytes |

For convenience, we have included a dashboard in JSON format (`troubleshooting-dashboard.json.template`) that you can import into your New Relic account. **To use it, search for "YOUR_ACCOUNT_ID" and replace it with your New Relic Account ID before importing it as JSON.** The dashboard displays the above metrics along with guidance to help you quickly detect problems in your installation. As mentioned above, use this dashboard when troubleshooting a malfunctioning installation, but do not rely on it in the long term, because any of the metrics it uses or their dimensions could change at any time.
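
If you prefer to script the placeholder replacement rather than edit the template by hand, the step can be sketched in Go as follows. This is a minimal sketch, not part of the plugin: `fillAccountID` is a hypothetical helper, and the sample JSON in `main` is invented for illustration and does not reflect the real template's contents.

```go
package main

import (
	"fmt"
	"strings"
)

// Placeholder string named in the README.
const accountIDPlaceholder = "YOUR_ACCOUNT_ID"

// fillAccountID is a hypothetical helper: it replaces every occurrence of the
// placeholder with the given account ID, and fails if the placeholder is absent
// so that an already-edited or wrong file is not imported by mistake.
func fillAccountID(template, accountID string) (string, error) {
	if !strings.Contains(template, accountIDPlaceholder) {
		return "", fmt.Errorf("placeholder %q not found in template", accountIDPlaceholder)
	}
	return strings.ReplaceAll(template, accountIDPlaceholder, accountID), nil
}

func main() {
	// Invented sample; the real template would be read from
	// troubleshooting-dashboard.json.template.
	sample := `{"name":"Fluent Bit troubleshooting","accountId":"YOUR_ACCOUNT_ID"}`

	dashboard, err := fillAccountID(sample, "1234567")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(dashboard)
}
```

In practice you would read the template with `os.ReadFile`, pass its contents through the helper, and paste the result into the dashboard import dialog.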

## Docker Image

This plugin also comes packaged in a Docker image, available [here](https://hub.docker.com/r/newrelic/newrelic-fluentbit-output). To use it, you just need to pull the image and run it with your desired configuration:
4 changes: 4 additions & 0 deletions config/config.go
@@ -20,6 +20,7 @@ type NRClientConfig struct {
LicenseKey string
UseApiKey bool
TimeoutSeconds int
SendMetrics bool
}

type DataFormatConfig struct {
@@ -86,6 +87,9 @@ func parseNRClientConfig(ctx unsafe.Pointer) (cfg NRClientConfig, err error) {
cfg.UseApiKey = len(cfg.ApiKey) > 0

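cfg.TimeoutSeconds, err = optInt(ctx, "httpClientTimeout", 5)
// Return early so the next assignment does not overwrite a timeout parsing error.
if err != nil {
return
}

cfg.SendMetrics, err = optBool(ctx, "sendMetrics", false)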

return
}
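
The diff does not show how `optBool` is implemented. As a rough illustration of how an optional boolean setting with a default might be parsed, here is a self-contained sketch. `parseOptionalBool` is a hypothetical stand-in that takes the raw string value directly; it is not the plugin's actual helper and does not use the Fluent Bit context pointer.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseOptionalBool is a hypothetical stand-in for an optBool-style helper.
// An empty value yields the default; anything else must be a valid boolean
// as understood by strconv.ParseBool ("true", "false", "1", "0", ...).
func parseOptionalBool(raw string, defaultValue bool) (bool, error) {
	trimmed := strings.TrimSpace(raw)
	if trimmed == "" {
		return defaultValue, nil
	}
	value, err := strconv.ParseBool(trimmed)
	if err != nil {
		return defaultValue, fmt.Errorf("invalid boolean %q: %w", raw, err)
	}
	return value, nil
}

func main() {
	for _, raw := range []string{"", "true", "maybe"} {
		sendMetrics, err := parseOptionalBool(raw, false)
		fmt.Printf("raw=%q sendMetrics=%v err=%v\n", raw, sendMetrics, err)
	}
}
```

Returning the error alongside the default lets the caller decide whether a malformed value should abort plugin initialization or merely be logged.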

20 changes: 19 additions & 1 deletion go.mod
@@ -1,10 +1,28 @@
module github.com/newrelic/newrelic-fluent-bit-output

go 1.14
go 1.20

require (
github.com/fluent/fluent-bit-go v0.0.0-20200729034236-b9c0d6a20853
github.com/newrelic/newrelic-telemetry-sdk-go v0.8.1
github.com/onsi/ginkgo v1.8.0
github.com/onsi/gomega v1.5.0
github.com/sirupsen/logrus v1.8.1
github.com/stretchr/testify v1.8.4
)

require (
github.com/davecgh/go-spew v1.1.1 // indirect
github.com/golang/protobuf v1.2.0 // indirect
github.com/hpcloud/tail v1.0.0 // indirect
github.com/pmezard/go-difflib v1.0.0 // indirect
github.com/stretchr/objx v0.5.0 // indirect
github.com/ugorji/go/codec v1.1.7 // indirect
golang.org/x/net v0.0.0-20180906233101-161cd47e91fd // indirect
golang.org/x/sys v0.14.0 // indirect
golang.org/x/text v0.3.0 // indirect
gopkg.in/fsnotify.v1 v1.4.7 // indirect
gopkg.in/tomb.v1 v1.0.0-20141024135613-dd632973f1e7 // indirect
gopkg.in/yaml.v2 v2.2.1 // indirect
gopkg.in/yaml.v3 v3.0.1 // indirect
)
19 changes: 16 additions & 3 deletions go.sum
@@ -1,3 +1,4 @@
github.com/davecgh/go-spew v1.1.0/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c=
github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
github.com/fluent/fluent-bit-go v0.0.0-20200729034236-b9c0d6a20853 h1:QapwPpeMtcxkG6WjcQQneyGtVSsr4meJ0Fz4B2T3j/c=
@@ -8,6 +9,8 @@ github.com/golang/protobuf v1.2.0 h1:P3YflyNX/ehuJFLhxviNdFxQPkGK5cDcApsge1SqnvM
github.com/golang/protobuf v1.2.0/go.mod h1:6lQm79b+lXiMfvg/cZm0SGofjICqVBUtrP5yJMmIC1U=
github.com/hpcloud/tail v1.0.0 h1:nfCOvKYfkgYP8hkirhJocXT2+zOD8yUNjXaWfTlyFKI=
github.com/hpcloud/tail v1.0.0/go.mod h1:ab1qPbhIpdTxEkNHXyeSf5vhxWSCs/tWer42PpOxQnU=
github.com/newrelic/newrelic-telemetry-sdk-go v0.8.1 h1:6OX5VXMuj2salqNBc41eXKz6K+nV6OB/hhlGnAKCbwU=
github.com/newrelic/newrelic-telemetry-sdk-go v0.8.1/go.mod h1:2kY6OeOxrJ+RIQlVjWDc/pZlT3MIf30prs6drzMfJ6E=
github.com/onsi/ginkgo v1.6.0/go.mod h1:lLunBs/Ym6LB5Z9jYTR76FiuTmxDTDusOGeTQH+WWjE=
github.com/onsi/ginkgo v1.8.0 h1:VkHVNpR4iVnU8XQR6DBm8BqYjN7CRzw+xKUbVVbbW9w=
github.com/onsi/ginkgo v1.8.0/go.mod h1:lLunBs/Ym6LB5Z9jYTR76FiuTmxDTDusOGeTQH+WWjE=
@@ -17,9 +20,15 @@ github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZb
github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4=
github.com/sirupsen/logrus v1.8.1 h1:dJKuHgqk1NNQlqoA6BTlM1Wf9DOH3NBjQyu0h9+AZZE=
github.com/sirupsen/logrus v1.8.1/go.mod h1:yWOB1SBYBC5VeMP7gHvWumXLIWorT60ONWic61uBYv0=
github.com/stretchr/testify v1.2.2 h1:bSDNvY7ZPG5RlJ8otE/7V6gMiyenm9RtJ7IUVIAoJ1w=
github.com/stretchr/objx v0.1.0/go.mod h1:HFkY916IF+rwdDfMAkV7OtwuqBVzrE8GR6GFx+wExME=
github.com/stretchr/objx v0.4.0/go.mod h1:YvHI0jy2hoMjB+UWwv71VJQ9isScKT/TqJzVSSt89Yw=
github.com/stretchr/objx v0.5.0 h1:1zr/of2m5FGMsad5YfcqgdqdWrIhu+EBEJRhR1U7z/c=
github.com/stretchr/objx v0.5.0/go.mod h1:Yh+to48EsGEfYuaHDzXPcE3xhTkx73EhmCGUpEOglKo=
github.com/stretchr/testify v1.2.2/go.mod h1:a8OnRcib4nhh0OaRAV+Yts87kKdq0PP7pXfy6kDkUVs=
github.com/ugorji/go v1.1.7 h1:/68gy2h+1mWMrwZFeD1kQialdSzAb432dtpeJ42ovdo=
github.com/stretchr/testify v1.7.1/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/h/Wwjteg=
github.com/stretchr/testify v1.8.0/go.mod h1:yNjHg4UonilssWZ8iaSj1OCr/vHnekPRkoO+kdMU+MU=
github.com/stretchr/testify v1.8.4 h1:CcVxjf3Q8PM0mHUKJCdn+eZZtm5yQwehR5yeSVQQcUk=
github.com/stretchr/testify v1.8.4/go.mod h1:sz/lmYIOXD/1dqDmKjjqLyZ2RngseejIcXlSw2iwfAo=
github.com/ugorji/go v1.1.7/go.mod h1:kZn38zHttfInRq0xu/PH0az30d+z6vm202qpg1oXVMw=
github.com/ugorji/go/codec v1.1.7 h1:2SvQaVZ1ouYrrKKwoSk2pzd4A9evlKJb9oTL+OaLUSs=
github.com/ugorji/go/codec v1.1.7/go.mod h1:Ax+UKWsSmolVDwsd+7N3ZtXu+yMGCf907BLYF3GoBXY=
@@ -28,8 +37,9 @@ golang.org/x/net v0.0.0-20180906233101-161cd47e91fd/go.mod h1:mL1N/T3taQHkDXs73r
golang.org/x/sync v0.0.0-20180314180146-1d60e4601c6f h1:wMNYb4v58l5UBM7MYRLPG6ZhfOqbKu7X5eyFl8ZhKvA=
golang.org/x/sync v0.0.0-20180314180146-1d60e4601c6f/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sys v0.0.0-20180909124046-d0be0721c37e/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=
golang.org/x/sys v0.0.0-20191026070338-33540a1f6037 h1:YyJpGZS1sBuBCzLAR1VEpK193GlqGZbnPFnPV/5Rsb4=
golang.org/x/sys v0.0.0-20191026070338-33540a1f6037/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.14.0 h1:Vz7Qs629MkJkGyHxUlRHizWJRG2j8fbQKjELVSNhy7Q=
golang.org/x/sys v0.14.0/go.mod h1:/VUhepiaJMQUp4+oa/7Zr1D23ma6VTLIYjOOTFZPUcA=
golang.org/x/text v0.3.0 h1:g61tztE5qeGQ89tm6NTjjM9VPIm088od1l6aSorWRWg=
golang.org/x/text v0.3.0/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ=
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405 h1:yhCVgyC4o1eVCa2tZl7eS0r+SDo693bJlVdllGtEeKM=
@@ -40,3 +50,6 @@ gopkg.in/tomb.v1 v1.0.0-20141024135613-dd632973f1e7 h1:uRGJdciOHaEIrze2W8Q3AKkep
gopkg.in/tomb.v1 v1.0.0-20141024135613-dd632973f1e7/go.mod h1:dt/ZhP58zS4L8KSrWDmTeBkI65Dw0HsyUHuEVlX15mw=
gopkg.in/yaml.v2 v2.2.1 h1:mUhvW9EsL+naU5Q3cakzfE91YhliOondGd6ZrsDBHQE=
gopkg.in/yaml.v2 v2.2.1/go.mod h1:hI93XBmqTisBFMUTm0b8Fm+jr3Dg1NNxqwp+5A1VGuI=
gopkg.in/yaml.v3 v3.0.0-20200313102051-9f266ea9e77c/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA=
gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
28 changes: 28 additions & 0 deletions metrics/constants.go
@@ -0,0 +1,28 @@
package metrics

// Metrics sent from this plugin
const (
PackagingTime = "logs.fb.packaging.time"
PayloadSendTime = "logs.fb.payload.send.time"
TotalSendTime = "logs.fb.total.send.time"
PayloadCountPerChunk = "logs.fb.payload.count"
PayloadSize = "logs.fb.payload.size"
)

// API URLs
const (
metricsUsProdUrl = "https://metric-api.newrelic.com/metric/v1"
metricsEuProdUrl = "https://metric-api.eu.newrelic.com/metric/v1"
metricsStagingUrl = "https://staging-metric-api.newrelic.com/metric/v1"
logsUsProdUrl = "https://log-api.newrelic.com/log/v1"
logsEuProdUrl = "https://log-api.eu.newrelic.com/log/v1"
logsStagingUrl = "https://staging-log-api.newrelic.com/log/v1"
)

// Maps each Logs API URL to the Metrics API URL of the same environment.
// Looking up an unknown Logs API URL yields the empty string (and ok == false), not nil.
var logsToMetricsUrlMapping = map[string]string{
logsUsProdUrl: metricsUsProdUrl,
logsEuProdUrl: metricsEuProdUrl,
logsStagingUrl: metricsStagingUrl,
}
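
The mapping above is best consulted with Go's comma-ok idiom, so that an unrecognized Logs API URL can be told apart from a valid one. The following is a minimal sketch under that assumption: `metricsUrlFor` and `describeTarget` are hypothetical helpers, and the constants are copied from the diff (the staging entries are omitted for brevity). The commit message mentions falling back to a no-op metrics client when no mapping exists; that behaviour is only described in a string here, not implemented.

```go
package main

import "fmt"

// Constants copied from metrics/constants.go in this commit.
const (
	metricsUsProdUrl = "https://metric-api.newrelic.com/metric/v1"
	metricsEuProdUrl = "https://metric-api.eu.newrelic.com/metric/v1"
	logsUsProdUrl    = "https://log-api.newrelic.com/log/v1"
	logsEuProdUrl    = "https://log-api.eu.newrelic.com/log/v1"
)

var logsToMetricsUrlMapping = map[string]string{
	logsUsProdUrl: metricsUsProdUrl,
	logsEuProdUrl: metricsEuProdUrl,
}

// metricsUrlFor is a hypothetical helper: it returns the Metrics API URL for
// the given Logs API URL, and false when the Logs API URL is not recognized.
func metricsUrlFor(logsUrl string) (string, bool) {
	metricsUrl, ok := logsToMetricsUrlMapping[logsUrl]
	return metricsUrl, ok
}

// describeTarget is a hypothetical helper that reports what a caller might do
// with the lookup result.
func describeTarget(logsUrl string) string {
	if metricsUrl, ok := metricsUrlFor(logsUrl); ok {
		return fmt.Sprintf("send metrics to %s", metricsUrl)
	}
	return fmt.Sprintf("no mapping for %s, use a no-op metrics client", logsUrl)
}

func main() {
	fmt.Println(describeTarget(logsEuProdUrl))
	fmt.Println(describeTarget("https://example.invalid/log/v1"))
}
```

Checking the boolean rather than comparing against the empty string keeps the intent explicit and avoids building a client around an empty URL.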