Replied on StackOverflow: https://stackoverflow.com/questions/67158317/apache-kafka-streams-out-of-order-messages
-Matthias

On 4/20/21 4:21 PM, Neeraj Vaidya wrote:
> Hi,
>
> I have asked this on StackOverflow, but will ask it here as well.
>
> I have an Apache Kafka 2.6 Producer which writes to topic-A (TA). I also have
> a Kafka Streams application which consumes from TA and writes to topic-B
> (TB). In the streams application, I have a custom timestamp extractor which
> extracts the timestamp from the message payload.
>
> For one of my failure handling test cases, I shut down the Kafka cluster while
> my applications are running.
>
> When the producer application tries to write messages to TA, it cannot
> because the cluster is down and hence (I assume) buffers the messages. Let's
> say it receives 4 messages m1, m2, m3, m4 in increasing time order (i.e. m1 is
> first and m4 is last).
>
> When I bring the Kafka cluster back online, the producer sends the buffered
> messages to the topic, but they are not in order. I receive, for example, m2,
> then m3, then m1, and then m4.
>
> Why is that? Is it because the buffering in the producer is multi-threaded,
> with each thread producing to the topic at the same time?
>
> I assumed that the custom timestamp extractor would help in ordering messages
> when consuming them. But it does not. Or maybe my understanding of the
> timestamp extractor is wrong.
>
> If so, then what are the specific uses of the timestamp extractor? Just to
> associate a timestamp with an event?
>
> I got one solution from SO here, to just stream all events from TA to another
> intermediate topic (say TA') which will use the timestamp extractor to
> another topic. But I am not sure if this will cause the events to get
> reordered based on the extracted timestamp.
>
> Regards,
> Neeraj
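
A possible cause worth checking, stated here as an assumption and not as a summary of the StackOverflow answer: with the Kafka 2.6 producer defaults (retries enabled, up to 5 in-flight requests per connection, idempotence off), a batch that fails can be retried after a later batch has already succeeded, which reorders records within a partition. Below is a minimal sketch of producer settings that avoid this kind of reordering. The class name, method name, and broker address are hypothetical; the configuration keys are the standard producer ones, written as plain strings so the snippet needs no Kafka dependency.

```java
import java.util.Properties;

public class OrderedProducerConfig {

    // Builds producer properties that preserve per-partition send order across retries.
    // The bootstrapServers value (e.g. "localhost:9092") is an assumed placeholder.
    static Properties orderedProducerProps(String bootstrapServers) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        // The idempotent producer keeps ordering within a partition even when batches are retried.
        props.put("enable.idempotence", "true");
        // Idempotence requires acks=all and at most 5 in-flight requests per connection.
        props.put("acks", "all");
        props.put("max.in.flight.requests.per.connection", "5");
        return props;
    }

    public static void main(String[] args) {
        // Print the resulting settings; pass them to a KafkaProducer in a real application.
        orderedProducerProps("localhost:9092")
            .forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

Without idempotence, setting `max.in.flight.requests.per.connection` to 1 also prevents reordering on retry, at some cost in throughput. Note that a timestamp extractor only assigns an event time to each record; Kafka Streams still reads each partition in offset order, so it does not reorder records by the extracted timestamp.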