Guozhang Wang created KAFKA-6634:
------------------------------------

             Summary: Delay initiating the txn on producers until 
initializeTopology with EOS turned on
                 Key: KAFKA-6634
                 URL: https://issues.apache.org/jira/browse/KAFKA-6634
             Project: Kafka
          Issue Type: Improvement
          Components: streams
            Reporter: Guozhang Wang
            Assignee: Guozhang Wang


In the Streams EOS implementation, the producer created for each task initiates
a txn immediately upon creation, in the constructor of `StreamTask`.
However, because of the restoration process, the task may not process any data,
and hence the producer may not send any records for that started txn, for a
long time. With the default transaction timeout (`transaction.timeout.ms`) of
60 seconds, this means that if the restoration takes longer than that, then
once the task starts processing the producer will immediately get an error
that its producer epoch is already old.

To fix this, we should consider initiating the txn only after the
restoration phase is done, in `initializeTopology`. One caveat is that if the
producer is already fenced, it will not be notified until that point.
But I think this should not be a correctness issue, since during the
restoration process we do not make any changes to the processing state.
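
A minimal sketch of the proposed ordering, for illustration only. It does not
use the real Kafka classes: `TxnProducer`, `RecordingProducer`,
`DeferredTxnTask` and `restoreState` are hypothetical stand-ins, and only the
method names `beginTransaction` and `initializeTopology` mirror names from the
actual code. The point is that the constructor no longer begins a txn, so the
transaction timeout only starts counting once restoration has finished.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the transactional part of a producer (not the Kafka API). */
interface TxnProducer {
    void beginTransaction();
}

/** Records each call in a shared event log so the ordering can be inspected. */
class RecordingProducer implements TxnProducer {
    private final List<String> events;

    RecordingProducer(List<String> events) {
        this.events = events;
    }

    @Override
    public void beginTransaction() {
        events.add("beginTransaction");
    }
}

/** Toy task: the txn is begun in initializeTopology(), not in the constructor. */
class DeferredTxnTask {
    private final TxnProducer producer;
    private final List<String> events;
    private boolean transactionInFlight = false;

    DeferredTxnTask(TxnProducer producer, List<String> events) {
        this.producer = producer;
        this.events = events;
        // Intentionally no producer.beginTransaction() here.
    }

    /** Placeholder for the (possibly long) restoration phase; no txn is open yet. */
    void restoreState() {
        events.add("restore");
    }

    /** Called once restoration is done; begins the txn at most once. */
    void initializeTopology() {
        if (!transactionInFlight) {
            producer.beginTransaction();
            transactionInFlight = true;
        }
        events.add("initializeTopology");
    }

    boolean transactionInFlight() {
        return transactionInFlight;
    }
}
```

With this ordering, calling `restoreState()` and then `initializeTopology()`
records the events restore, beginTransaction, initializeTopology, in that
order, so no txn is open while restoration is running.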



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
