[ https://issues.apache.org/jira/browse/KAFKA-9552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17308175#comment-17308175 ]
A. Sophie Blee-Goldman commented on KAFKA-9552:
-----------------------------------------------

Hey guys, what's the current take on OutOfOrderSequenceException handling in Streams? It looks like we currently catch it and rethrow it as a TaskMigrated, but the discussion above seems to conclude that this is not the appropriate response and that we should actually try to shut down _all_ threads across the app instead. At the time this may not have been possible, but thanks to the new SHUTDOWN_APPLICATION feature we can now implement the proper handling and kill all the threads.

[~guozhang] [~mjsax] [~vvcephei] WDYT? Was there ever a consensus on whether we can just recover from OutOfOrderSequence by aborting the txn and triggering a rebalance, or are there some cases where we must fail fast to avoid further processing on potentially corrupt data, as you described above?

> Stream should handle OutOfSequence exception thrown from Producer
> -----------------------------------------------------------------
>
>                 Key: KAFKA-9552
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9552
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>    Affects Versions: 2.5.0
>            Reporter: Boyang Chen
>            Priority: Major
>             Fix For: 2.6.0
>
>
> As of today the stream thread could die from an OutOfSequence error:
> {code:java}
> [2020-02-12T07:14:35-08:00]
> (streams-soak-2-5-eos_soak_i-03f89b1e566ac95cc_streamslog)
> org.apache.kafka.common.errors.OutOfOrderSequenceException: The broker
> received an out of order sequence number.
> [2020-02-12T07:14:35-08:00]
> (streams-soak-2-5-eos_soak_i-03f89b1e566ac95cc_streamslog) [2020-02-12
> 15:14:35,185] ERROR
> [stream-soak-test-546f8754-5991-4d62-8565-dbe98d51638e-StreamThread-1]
> stream-thread
> [stream-soak-test-546f8754-5991-4d62-8565-dbe98d51638e-StreamThread-1] Failed
> to commit stream task 3_2 due to the following error:
> (org.apache.kafka.streams.processor.internals.AssignedStreamsTasks)
> [2020-02-12T07:14:35-08:00]
> (streams-soak-2-5-eos_soak_i-03f89b1e566ac95cc_streamslog)
> org.apache.kafka.streams.errors.StreamsException: task [3_2] Abort sending
> since an error caught with a previous record (timestamp 1581484094825) to
> topic stream-soak-test-KSTREAM-AGGREGATE-STATE-STORE-0000000049-changelog due
> to org.apache.kafka.common.errors.OutOfOrderSequenceException: The broker
> received an out of order sequence number.
> at
> org.apache.kafka.streams.processor.internals.RecordCollectorImpl.recordSendError(RecordCollectorImpl.java:154)
> at
> org.apache.kafka.streams.processor.internals.RecordCollectorImpl.access$500(RecordCollectorImpl.java:52)
> at
> org.apache.kafka.streams.processor.internals.RecordCollectorImpl$1.onCompletion(RecordCollectorImpl.java:214)
> at
> org.apache.kafka.clients.producer.KafkaProducer$InterceptorCallback.onCompletion(KafkaProducer.java:1353)
> {code}
> Although this is a fatal exception for the Producer, Streams should treat it
> as an opportunity to reinitialize by doing a rebalance, instead of killing
> the computation resource.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
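
To make the fail-fast option raised in the comment concrete, here is a rough sketch of the decision it describes: look through the cause chain of an uncaught error and request an application-wide shutdown when an out-of-order sequence error is found. Everything in it is an assumption for illustration. The nested Response enum and OutOfOrderSequenceException class are local stand-ins named after their Kafka counterparts so that the snippet compiles without Kafka on the classpath; SequenceErrorPolicy, decide, and the REPLACE_THREAD fallback are hypothetical and are not what Streams does today.

```java
/**
 * Sketch only: maps an uncaught stream-thread error to a shutdown decision.
 * The nested types are local stand-ins, NOT the real Kafka classes.
 */
final class SequenceErrorPolicy {

    /** Stand-in mirroring the names of Kafka's StreamThreadExceptionResponse values. */
    enum Response { REPLACE_THREAD, SHUTDOWN_CLIENT, SHUTDOWN_APPLICATION }

    /** Stand-in for org.apache.kafka.common.errors.OutOfOrderSequenceException. */
    static class OutOfOrderSequenceException extends RuntimeException {
        OutOfOrderSequenceException(String message) {
            super(message);
        }
    }

    /**
     * Walks the cause chain in case the producer error arrives wrapped, as the
     * StreamsException in the quoted log suggests. The depth cap guards
     * against a cyclic cause chain.
     */
    static boolean causedByOutOfOrderSequence(Throwable error) {
        int depth = 0;
        for (Throwable t = error; t != null && depth < 32; t = t.getCause(), depth++) {
            if (t instanceof OutOfOrderSequenceException) {
                return true;
            }
        }
        return false;
    }

    /**
     * Fail fast across the whole application on a sequence error. Replacing
     * the thread for every other error is an assumption of this sketch.
     */
    static Response decide(Throwable error) {
        return causedByOutOfOrderSequence(error)
                ? Response.SHUTDOWN_APPLICATION
                : Response.REPLACE_THREAD;
    }

    public static void main(String[] args) {
        Throwable wrapped = new RuntimeException(
                "task [3_2] Abort sending",
                new OutOfOrderSequenceException("out of order sequence number"));
        System.out.println(decide(wrapped));
    }
}
```

In a real application the same check would presumably live inside a StreamsUncaughtExceptionHandler registered through KafkaStreams#setUncaughtExceptionHandler, returning StreamThreadExceptionResponse.SHUTDOWN_APPLICATION. Whether that is safer than aborting the txn and rebalancing is exactly the open question above.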