Replied on StackOverflow:
https://stackoverflow.com/questions/67158317/apache-kafka-streams-out-of-order-messages


-Matthias



On 4/20/21 4:21 PM, Neeraj Vaidya wrote:
> Hi,
> I have asked this on StackOverflow, but will ask it here as well.
> 
> I have an Apache Kafka 2.6 Producer which writes to topic-A (TA). I also have 
> a Kafka Streams application which consumes from TA and writes to topic-B 
> (TB). In the streams application, I have a custom timestamp extractor which 
> extracts the timestamp from the message payload.
> 
> For one of my failure-handling test cases, I shut down the Kafka cluster 
> while my applications are running.
> 
> When the producer application tries to write messages to TA, it cannot 
> because the cluster is down, and hence (I assume) it buffers the messages. 
> Let's say it receives 4 messages m1, m2, m3, m4 in increasing time order 
> (i.e. m1 is first and m4 is last).
> 
> When I bring the Kafka cluster back online, the producer sends the buffered 
> messages to the topic, but they are not in order. For example, I receive m2, 
> then m3, then m1, and then m4.
> 
> Why is that? Is it because the buffering in the producer is multi-threaded, 
> with each thread producing to the topic at the same time?
> 
> I assumed that the custom timestamp extractor would help in ordering messages 
> when consuming them. But it does not. Or maybe my understanding of the 
> timestamp extractor is wrong.
> 
> If it does not reorder messages, what is the timestamp extractor 
> specifically used for? Just to associate a timestamp with an event?
> 
> One solution I got on SO is to first stream all events from TA to an 
> intermediate topic (say TA'), and then stream from TA' to another topic 
> using the timestamp extractor. But I am not sure whether this will cause the 
> events to be reordered based on the extracted timestamp.
> 
> Regards,
> Neeraj
> 
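
On the producer-side question: one common cause of reordering after a broker outage (not confirmed as the cause in this thread) is retried batches overtaking each other when several requests are in flight and idempotence is off, which is the default in 2.6. A minimal sketch of producer settings that preserve per-partition order follows. The config keys are the standard producer config names; the class name, helper method, serializers and bootstrap address are hypothetical. Note that ordering is only guaranteed within a single partition.

```java
import java.util.Properties;

public class OrderedProducerConfig {

    // Hypothetical helper: builds producer properties intended to keep
    // per-partition ordering even when batches are retried.
    public static Properties orderedProducerProps(String bootstrapServers) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        // Serializers are an assumption; use whatever the application needs.
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        // Idempotence makes the broker reject out-of-sequence batches, so
        // retries cannot reorder records within a partition.
        props.put("enable.idempotence", "true");
        // Idempotence requires acks=all.
        props.put("acks", "all");
        // With idempotence enabled, up to 5 in-flight requests still
        // preserve order; without it, this would have to be 1.
        props.put("max.in.flight.requests.per.connection", "5");
        return props;
    }

    public static void main(String[] args) {
        // "localhost:9092" is a placeholder address.
        Properties props = orderedProducerProps("localhost:9092");
        props.forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

These properties would be passed to the KafkaProducer constructor. If idempotence cannot be enabled, setting max.in.flight.requests.per.connection to 1 is the alternative, at some cost in throughput.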
