We've observed a problematic case where we had two events in the same partition.
Because the batch size is 1, the two events are handed to the StreamObserver's onNext() method in separate batches.
The processing of the first event caused an exception (in our code), which was logged by nakadi-java as StreamBatchRecordSubscriber.detected_retryable_exception, without committing any cursor changes.
But then the StreamObserver's onNext() method was called again with the second event (in a new 1-event batch). (Our max_uncommitted_events is set higher than 1; the default is 10, I think.) This event was processed without problems, and our code committed its cursor. Because the new cursor lies after the first event's cursor, both events were now committed, and Nakadi won't resend either of them. The first, failed event is effectively lost.
This seems not to happen if there is no later event in the partition – then the first event is retried a bit later.
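To make the sequence above concrete, here is a small self-contained toy model of cumulative cursor commits on a single partition. It does not use nakadi-java at all: the class CursorCommitDemo, the method lostOffsets and the numeric offsets are made up for illustration, and the only assumption taken from the report is that committing a cursor also acknowledges every earlier event in the same partition.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of cumulative cursor commits on one partition; not nakadi-java code. */
public class CursorCommitDemo {

    /**
     * Walks offsets 0..eventCount-1 one single-event batch at a time.
     * An offset listed in failing throws in "our code" and is not committed;
     * every other offset is committed.
     * Returns the offsets that failed but are still covered by the last
     * committed cursor, i.e. the events that will not be resent.
     */
    static List<Long> lostOffsets(int eventCount, List<Long> failing) {
        long committed = -1; // nothing committed yet
        List<Long> failed = new ArrayList<>();
        for (long offset = 0; offset < eventCount; offset++) {
            if (failing.contains(offset)) {
                failed.add(offset);  // exception: no commit for this batch
            } else {
                committed = offset;  // cumulative: acknowledges everything <= offset
            }
        }
        List<Long> lost = new ArrayList<>();
        for (long f : failed) {
            if (f <= committed) {
                lost.add(f);
            }
        }
        return lost;
    }

    public static void main(String[] args) {
        // First event fails, second succeeds and is committed.
        System.out.println(lostOffsets(2, List.of(0L))); // prints [0]
        // Same failure, but no later event in the partition.
        System.out.println(lostOffsets(1, List.of(0L))); // prints []
    }
}
```

The second call matches the observation that nothing is lost when there is no later event: without a later commit, the failed offset is never covered by a committed cursor.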
(I didn't manage to dig far enough into nakadi-java's code to see what happens when a retryable exception is caught and more events are available on the same partition.)
Is this behavior expected? What should we have done differently?
@ePaul thanks for reporting; let me do some digging, this one might be tricky to debug. In the meantime can you add the stream connection parameters as details?
I guess a workaround would be to always set max_uncommitted_events to 1, but this will reduce the possible throughput quite a lot (no parallelization possible).
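Another option that might avoid dropping max_uncommitted_events to 1 is to guard commits on the application side, so that a cursor is never committed while an earlier event in the same partition is still marked as failed. The sketch below is only an illustration under assumptions: PartitionCommitGuard and its methods are hypothetical and not part of nakadi-java, offsets are modelled as plain long values, and it is assumed that the observer can learn the partition and offset of each batch it handles.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;

/** Hypothetical helper: blocks commits that would skip past a failed offset. */
public class PartitionCommitGuard {

    // Failed, not yet successfully reprocessed offsets, per partition.
    private final Map<String, TreeSet<Long>> failed = new HashMap<>();

    /** Call when processing the event at this offset threw. */
    public synchronized void markFailed(String partition, long offset) {
        failed.computeIfAbsent(partition, p -> new TreeSet<>()).add(offset);
    }

    /** Call when a previously failed offset was reprocessed successfully. */
    public synchronized void markRecovered(String partition, long offset) {
        TreeSet<Long> offsets = failed.get(partition);
        if (offsets != null) {
            offsets.remove(offset);
        }
    }

    /** True only if committing this offset would not cover a failed one. */
    public synchronized boolean mayCommit(String partition, long offset) {
        TreeSet<Long> offsets = failed.get(partition);
        return offsets == null || offsets.isEmpty() || offset < offsets.first();
    }
}
```

In the observer, the commit call would then be skipped whenever mayCommit returns false. Note that this only prevents the premature commit: uncommitted events still count against max_uncommitted_events, and whether the failed batch is actually delivered again depends on client behavior that has not been established in this thread.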
We are using nakadi-java-client 0.9.17.