Rohit Bobade created KAFKA-17380:
------------------------------------

             Summary: Kafka Streams few partition stuck in processing - fixed 
after restart
                 Key: KAFKA-17380
                 URL: https://issues.apache.org/jira/browse/KAFKA-17380
             Project: Kafka
          Issue Type: Bug
          Components: streams
    Affects Versions: 2.6.2
            Reporter: Rohit Bobade


We are using Kafka Streams 2.6.2 and running stateful aggregations with 
exactly-once semantics.

The processing logic is: 

consume input records -> compute an intermediate aggregate and buffer it in a 
state store backed by a changelog topic -> punctuate every 15 seconds: flush 
the state store and send the aggregated records downstream -> perform the 
final aggregation and send the result to the output topic
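
The buffer-and-flush step above can be modelled roughly as follows. This is a 
plain-Java sketch with no Kafka dependency, not the application's actual code: 
the class name BufferAndFlushSketch, the String/long record types, and the 
"sum per key" aggregation are all assumptions made for illustration. In a real 
topology the map would be a changelog-backed state store and punctuate() would 
be registered through ProcessorContext#schedule with a 15-second interval.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of "buffer in a state store, flush on punctuate".
public class BufferAndFlushSketch {
    // Stand-in for the changelog-backed state store (assumption).
    private final Map<String, Long> buffer = new LinkedHashMap<>();

    // Fold one input record into the intermediate aggregate.
    // Summing per key is an assumed aggregation, not taken from the report.
    public void process(String key, long value) {
        buffer.merge(key, value, Long::sum);
    }

    // Called on each punctuation: return the buffered aggregates that would
    // be forwarded downstream, then clear the buffer.
    public Map<String, Long> punctuate() {
        Map<String, Long> flushed = new LinkedHashMap<>(buffer);
        buffer.clear();
        return flushed;
    }

    public static void main(String[] args) {
        BufferAndFlushSketch sketch = new BufferAndFlushSketch();
        sketch.process("a", 1);
        sketch.process("a", 2);
        sketch.process("b", 5);
        System.out.println(sketch.punctuate()); // prints {a=3, b=5}
        System.out.println(sketch.punctuate()); // prints {} (buffer was cleared)
    }
}
```

With exactly-once enabled, the forwarded records, the changelog writes, and the 
input offsets are committed in one transaction, which is why a fenced producer 
matters for this step.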



Since we use spot instances, one of the pods was restarted and a rebalance was 
triggered.

We then noticed ProducerFencedException errors:


{quote}org.apache.kafka.common.errors.ProducerFencedException: Producer 
attempted an operation with an old epoch. Either there is a newer producer with 
the same transactionalId, or the producer's transaction has been expired by the 
broker.
{quote}

After this, a few partitions were stuck and no records were processed until we 
restarted the application.


We had configured:
 
transaction.timeout.ms to 30 seconds

session.timeout.ms to 30 seconds
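
For reference, the settings above might be expressed roughly like this. It is 
only a sketch: the two 30-second values come from this report, but the class 
name, the "exactly_once" guarantee value (2.6 also offers "exactly_once_beta"), 
and the use of the "producer." and "consumer." prefixes are assumptions about 
how the application is configured. Required settings such as application.id 
and bootstrap.servers are omitted.

```java
import java.util.Properties;

// Hypothetical holder for the Streams settings mentioned in this report.
public class StreamsEosConfigSketch {
    public static Properties buildConfig() {
        Properties props = new Properties();
        // Exactly-once processing; the exact value used is an assumption.
        props.put("processing.guarantee", "exactly_once");
        // Producer transaction timeout: 30 seconds, as stated in the report.
        props.put("producer.transaction.timeout.ms", "30000");
        // Consumer session timeout: 30 seconds, as stated in the report.
        props.put("consumer.session.timeout.ms", "30000");
        return props;
    }

    public static void main(String[] args) {
        // Print the resulting key/value pairs for inspection.
        buildConfig().forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

The resulting Properties object would then be passed to the KafkaStreams 
constructor together with the topology.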



Could you please advise whether there is a known fix for this edge case?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
