[receiver/netflow] Netflow receiver implementation - PR 2 #36865

Open: wants to merge 40 commits into base: main

Conversation

dlopes7 (Contributor) commented Dec 16, 2024

#34164 added the skeleton for the netflow receiver; this PR adds the implementation along with tests.

  • Implement the receiver construction, along with the UDP listener that receives the packets
  • Implement the producer, the function that is responsible for parsing the data and sending it to the consumers
  • Add tests

Link to tracking Issue: #32732


// Construct the actual log record based on the otel semantic conventions
// see https://opentelemetry.io/docs/specs/semconv/general/attributes/
otelMessage := OtelNetworkMessage{

Contributor

Most of this looks like metadata on the log to me. Instead of creating this struct, could we create a plog.LogRecord and set these values as attributes using the corresponding semconv key for each value? We can probably use the network semantic conventions here. I'm having trouble linking to the Go registry copy of the docs, but you can use the go.opentelemetry.io/collector/semconv module to get constants for these, as seen here.

For the timestamps, we could just set those on the log record. I believe the body will be empty since the incoming messages don't seem to include any kind of equivalent arbitrary-text field.

Contributor

I dug into it some more and I see that the network semconv isn't necessarily a 1:1 fit here. I think we should do as much as we can, and we can create attributes for the pieces that don't fit so long as they're documented.
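
To illustrate the suggestion, here is a minimal sketch of the mapping. It uses only the standard library: `logRecord` is a hypothetical stand-in for `plog.LogRecord`, `flowRecord` is a hypothetical, trimmed-down decoded flow message, and the `flow.*` keys are invented here to show where documented custom attributes would go. The `source.*`, `destination.*`, and `network.transport` keys come from the OpenTelemetry semantic conventions. Which flow timestamp ends up in which record field is an assumption.

```go
package main

import "fmt"

// flowRecord is a hypothetical, trimmed-down decoded flow message.
type flowRecord struct {
	SrcAddr, DstAddr string
	SrcPort, DstPort uint16
	Transport        string // for example "tcp" or "udp"
	StartNs, EndNs   uint64 // flow start and end, nanoseconds since epoch
	ReceivedNs       uint64 // when the receiver saw the packet
	SamplingRate     uint64 // no semantic-convention equivalent
}

// logRecord stands in for plog.LogRecord (assumption): timestamps live on
// the record itself, everything else is an attribute, the body stays empty.
type logRecord struct {
	Timestamp         uint64
	ObservedTimestamp uint64
	Body              string
	Attributes        map[string]any
}

// toLogRecord maps a flow onto attributes instead of a custom struct.
func toLogRecord(f flowRecord) logRecord {
	return logRecord{
		Timestamp:         f.StartNs,
		ObservedTimestamp: f.ReceivedNs,
		Attributes: map[string]any{
			// Keys defined by the OpenTelemetry semantic conventions.
			"source.address":      f.SrcAddr,
			"source.port":         int64(f.SrcPort),
			"destination.address": f.DstAddr,
			"destination.port":    int64(f.DstPort),
			"network.transport":   f.Transport,
			// Hypothetical custom keys for fields that do not fit the
			// conventions; these would need to be documented.
			"flow.end":           int64(f.EndNs),
			"flow.sampling_rate": int64(f.SamplingRate),
		},
	}
}

func main() {
	rec := toLogRecord(flowRecord{
		SrcAddr: "10.0.0.1", SrcPort: 51234,
		DstAddr: "10.0.0.2", DstPort: 443,
		Transport: "tcp", StartNs: 100, EndNs: 150, ReceivedNs: 200,
		SamplingRate: 10,
	})
	for k, v := range rec.Attributes {
		fmt.Printf("%s=%v\n", k, v)
	}
}
```

Keeping the custom keys in one place like this makes it easier to list them in the README alongside the standard ones.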

Contributor Author

Implemented here

case errors.Is(err, debug.PanicError):
var pErrMsg *debug.PanicErrorMessage
if errors.As(err, &pErrMsg) {

Contributor

Could we remove a few of these (mostly the stacktrace), or if they're all useful, put a few of them at the "debug" log level? Ideally these should just inform users that a given message couldn't be parsed, and maybe give them some info to help figure out what in the message caused the issue.

Also, if possible, it would be nice to avoid using "panic" anywhere in here. From what I understand, we're essentially handling malformed incoming messages, and I think "panic" makes this sound more alarming than it needs to be.

Contributor Author

Implemented in this commit

I found it hard to avoid the word "panic" because of the wrapper and types we are using from GoFlow2.

Comment on lines 164 to 165
}
nr.logger.Error("receiver panic", zap.Error(err))

Contributor

Suggested change
- }
- nr.logger.Error("receiver panic", zap.Error(err))
+ } else {
+ nr.logger.Error("could not retrieve message parsing error from GoFlow2, this is an error in the Netflow receiver", zap.Error(err))
+ }

I think we should try to make it a bit clearer that it's an issue with the underlying library or how we're reading from it. Realistically we should never hit this case. Also this should be in an else statement.
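
The branch structure being discussed can be sketched with the standard library alone. Everything here is an assumption for illustration: `errDecodePanic` and `decodePanicMessage` are hypothetical stand-ins for GoFlow2's `debug.PanicError` and `debug.PanicErrorMessage`, and `logEntry` replaces the zap logger so the outcome can be inspected. The stack trace goes to the "debug" level, and the fallback message sits in the `else` branch.

```go
package main

import (
	"errors"
	"fmt"
)

// errDecodePanic is a hypothetical stand-in for GoFlow2's debug.PanicError.
var errDecodePanic = errors.New("panic")

// decodePanicMessage is a hypothetical stand-in for debug.PanicErrorMessage.
type decodePanicMessage struct {
	Msg        string
	Stacktrace []byte
}

func (m *decodePanicMessage) Error() string { return m.Msg }

// Unwrap lets errors.Is match the sentinel error.
func (m *decodePanicMessage) Unwrap() error { return errDecodePanic }

// logEntry replaces a real logger so the result is easy to inspect.
type logEntry struct {
	Level   string
	Message string
}

// describeDecodeError decides what to log for a decoding failure.
func describeDecodeError(err error) []logEntry {
	if !errors.Is(err, errDecodePanic) {
		return []logEntry{{"error", "could not decode message: " + err.Error()}}
	}
	var pm *decodePanicMessage
	if errors.As(err, &pm) {
		return []logEntry{
			// Users only need to know the message could not be parsed.
			{"error", "could not parse incoming message: " + pm.Msg},
			// The stack trace is kept, but only at debug level.
			{"debug", "stacktrace: " + string(pm.Stacktrace)},
		}
	} else {
		// Should never happen: the sentinel matched but the details are missing.
		return []logEntry{{"error", "could not retrieve message parsing error from GoFlow2, this is an error in the Netflow receiver"}}
	}
}

func main() {
	wrapped := fmt.Errorf("decode: %w", &decodePanicMessage{Msg: "short packet", Stacktrace: []byte("...")})
	for _, e := range describeDecodeError(wrapped) {
		fmt.Printf("[%s] %s\n", e.Level, e.Message)
	}
}
```

Returning entries rather than logging directly is only a convenience for this sketch; in the receiver the same branches would call the logger.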

Contributor Author

Implemented in this commit

var decodeFunc utils.DecoderFunc
var p utils.FlowPipe
switch nr.config.Scheme {

Contributor

Could we explain in the readme what protocol support users can expect when configuring one of these three schemes? I had to dig into the GoFlow2 source to understand what is happening here.

Contributor Author

I've documented it here and I've decided to remove the confusing flow scheme.

That scheme was just netflow + sflow together: it tried to figure out the packet type before forwarding to the actual decoder. For clarity, it is easier if we just allow users to select either netflow or sflow instead.
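
With only two schemes left, the configuration check becomes a plain two-way switch. The sketch below assumes a hypothetical `Config` type with a `Scheme` field and a `Validate` method; the field name follows the `nr.config.Scheme` snippet quoted above, and the rest is illustrative, not the receiver's actual code.

```go
package main

import "fmt"

// Config is a hypothetical subset of the receiver configuration.
type Config struct {
	Scheme string
}

// Validate accepts only the two remaining schemes. The removed "flow"
// scheme is rejected like any other unknown value.
func (c Config) Validate() error {
	switch c.Scheme {
	case "netflow", "sflow":
		return nil
	default:
		return fmt.Errorf("scheme must be netflow or sflow, got %q", c.Scheme)
	}
}

func main() {
	for _, s := range []string{"netflow", "sflow", "flow"} {
		if err := (Config{Scheme: s}).Validate(); err != nil {
			fmt.Println("invalid:", err)
			continue
		}
		fmt.Println("valid:", s)
	}
}
```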

dlopes7 and others added 12 commits January 7, 2025 23:33
…opentelemetry-collector-contrib into netflow-receiver-implementation
* Log empty sflow packets, which can happen if a packet only contains counter samples, which are not yet supported
The flow scheme was strange as it was just netflow + sflow in a single pipe which checked the protocol before forwarding to the actual pipe

We remove that option for clarity, so users can only choose netflow or sflow as the scheme
dlopes7 (Contributor, Author) commented Jan 8, 2025

@evan-bradley would you mind taking a look at the changes after your review please?

Eventually we want to be more explicit on how we parse the network packets instead of relying on the goflow2 proto producer.

This way we can parse things like counters from sflow (which are probably better as metrics instead of logs)

For now I've documented the supported schemes and what they can do. Thanks for the in-depth review!

3 participants