Enhance Kafka exporter to respect max message size #36982
Pinging code owners:
See Adding Labels via Comments if you do not have permissions to add labels yourself.
Hey @yurishkuro, I would like to contribute to this issue. Could you please assign it to me?
Hey @yurishkuro, I’m tied up with another project and it’s taking longer than expected. I won't be able to pick this up. Thanks for your understanding.
Hi @yurishkuro, is this task still available? I’d like to give it a try; please assign it to me.
@LZiHaN, are you working on this?
Yes, I'm working on it. |
Hi @yurishkuro, I’m working on implementing this feature and wanted to confirm whether the approach I’m considering for message splitting and reassembly is feasible.
Is this approach viable, and would it work seamlessly with the Kafka producer/consumer setup? Or are there any potential issues with storing this information in the headers and reassembling the message on the consumer side? Looking forward to your feedback.
@LZiHaN this is a possible approach, but it would be a breaking change, since a consumer that does not understand this chunking may not be able to reassemble the message. My idea was that we instead split the spans from the payload into multiple payloads, such that each payload fits within MaxMessageSize when serialized. It's not quite simple to implement, because the payload may contain one huge span, but if we can split it this way then it's a fully backwards-compatible solution.
@yurishkuro this seems to introduce some performance tradeoffs. The marshaller creates a message, which is then exported to Kafka using the sarama client. Irrespective of the Kafka broker configuration, this export would fail for sizes that exceed the client (exporter) configuration. One solution could be to chunk the received traces based on the configured size, as you suggested. However, it looks like we need to calculate the size first and then do the chunking based on the resulting size.
Another approach could be appending spans from a given trace while calculating the size after each append; once the limit is reached, build a new resource packet and start appending to it. For this, a new size-aware SplitTraces could be created. Or maybe some other approach; WDYT would be a better implementation? The size check is done in the sarama client, and its ByteSize function can be used in the kafka exporter to calculate the size.
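A rough sketch of what such a size pre-check could look like on the exporter side, mirroring the comparison the sarama client makes against Producer.MaxMessageBytes. The overhead constant and function names below are illustrative assumptions, not sarama's exact values or an existing exporter API:

```go
package kafkaexporter

// assumedRecordOverhead approximates the per-record protocol overhead that
// sarama adds on top of the key and value bytes. The real overhead depends
// on the Kafka protocol version, so this constant is illustrative only.
const assumedRecordOverhead = 128

// wouldExceedLimit reports whether a serialized payload with the given key
// would trip the producer-side MaxMessageBytes check before being sent.
func wouldExceedLimit(key, value []byte, maxMessageBytes int) bool {
	return len(key)+len(value)+assumedRecordOverhead > maxMessageBytes
}
```

If this check trips, the exporter could fall back to chunking instead of handing an oversized message to the client.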
To avoid perf issues the marshaling can be optimistic: try to marshal the whole thing first, and only if the result is larger than the max message size try to chunk the spans. As for the chunking algorithm, you don't need to be stuck in analysis paralysis, just write something that works correctly. However you implement it, it will be an improvement over the current state, since right now the message just gets dropped.
Btw, binary search sounds like a reasonable approach - keep dividing the spans in half until an acceptable size is produced. Other methods would be hard, since the marshalers do not allow concatenation of serialized parts.
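A minimal sketch of that optimistic-then-halve strategy, assuming the pdata ptrace API; marshalWithLimit and splitInHalf are illustrative names, not an existing exporter function:

```go
package kafkaexporter

import (
	"go.opentelemetry.io/collector/pdata/ptrace"
)

// marshalWithLimit marshals td optimistically; only when the result exceeds
// maxBytes does it split the spans in half and recurse (the binary-search
// style chunking discussed above).
func marshalWithLimit(td ptrace.Traces, m ptrace.Marshaler, maxBytes int) ([][]byte, error) {
	buf, err := m.MarshalTraces(td)
	if err != nil {
		return nil, err
	}
	if len(buf) <= maxBytes || td.SpanCount() <= 1 {
		// Either it fits, or it is a single huge span that cannot be split further.
		return [][]byte{buf}, nil
	}
	left, right := splitInHalf(td)
	msgs, err := marshalWithLimit(left, m, maxBytes)
	if err != nil {
		return nil, err
	}
	rightMsgs, err := marshalWithLimit(right, m, maxBytes)
	if err != nil {
		return nil, err
	}
	return append(msgs, rightMsgs...), nil
}

// splitInHalf copies roughly half of the spans into each output while
// preserving Resource and Scope data. Simplified for clarity: it creates a
// ResourceSpans/ScopeSpans pair per span; a real implementation would group
// spans that share the same resource and scope.
func splitInHalf(td ptrace.Traces) (ptrace.Traces, ptrace.Traces) {
	half := td.SpanCount() / 2
	first, second := ptrace.NewTraces(), ptrace.NewTraces()
	copied := 0
	rss := td.ResourceSpans()
	for i := 0; i < rss.Len(); i++ {
		rs := rss.At(i)
		for j := 0; j < rs.ScopeSpans().Len(); j++ {
			ss := rs.ScopeSpans().At(j)
			for k := 0; k < ss.Spans().Len(); k++ {
				dst := first
				if copied >= half {
					dst = second
				}
				dstRS := dst.ResourceSpans().AppendEmpty()
				rs.Resource().CopyTo(dstRS.Resource())
				dstRS.SetSchemaUrl(rs.SchemaUrl())
				dstSS := dstRS.ScopeSpans().AppendEmpty()
				ss.Scope().CopyTo(dstSS.Scope())
				dstSS.SetSchemaUrl(ss.SchemaUrl())
				ss.Spans().At(k).CopyTo(dstSS.Spans().AppendEmpty())
				copied++
			}
		}
	}
	return first, second
}
```

Each returned chunk would then be sent as its own producer message; a single span that is by itself larger than the limit still cannot be split and would have to be dropped or rejected.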
Component(s)
exporter/kafka
Is your feature request related to a problem? Please describe.
The exporter has a config option MaxMessageBytes, but the exporter itself does not respect it: it attempts to send the full serialized payload to the Kafka driver, which may reject it based on this setting.
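For reference, that limit is set under the exporter's producer settings; a minimal config along these lines (field names as documented in the kafka exporter README, values shown only for illustration):

```yaml
exporters:
  kafka:
    brokers: ["localhost:9092"]
    producer:
      # Messages serialized larger than this are rejected client-side.
      max_message_bytes: 1000000
```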
Describe the solution you'd like
Most payloads can be safely split into chunks of a "safe" size that will be accepted by the Kafka driver. For example, the Jaeger integration tests include a test that writes a trace with 10k spans, which is 3 MB in size when serialized as JSON. That trace can be trivially split into multiple messages that would each fit in the default 1 MB size limit.
Describe alternatives you've considered
No response
Additional context
jaegertracing/jaeger#6437 (comment)