[ https://issues.apache.org/jira/browse/KAFKA-5364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16036092#comment-16036092 ]
Apurva Mehta commented on KAFKA-5364:
-------------------------------------

I am leaving the fix version as 0.11.0.0 for now. Since PR 3202 was merged, however, this failure has become very rare, so the most common case appears to be solved. The concurrent read tests (KAFKA-5366) fail all the time because of data inconsistency in the concurrent reader, so I am going to focus on those before getting to this, as they are the more common issue.

> Producer attempts to send transactional messages before adding partitions to transaction
> ----------------------------------------------------------------------------------------
>
>                 Key: KAFKA-5364
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5364
>             Project: Kafka
>          Issue Type: Sub-task
>          Components: clients, core, producer
>   Affects Versions: 0.11.0.0
>            Reporter: Apurva Mehta
>            Assignee: Apurva Mehta
>            Priority: Blocker
>              Labels: exactly-once
>             Fix For: 0.11.0.0
>
>         Attachments: KAFKA-5364.tar.gz
>
>
> Due to a race condition between the sender thread and producer.send(), the following is possible:
> # In KafkaProducer.doSend(), we add partitions to the transaction and then do accumulator.append.
> # In Sender.run(), we check whether there are transactional requests. If there are, we send them and wait for the response.
> # If there aren't, we drain the accumulator queue and send the produce requests.
> # The problem is that the sequence step 2, 1, 3 is entirely possible. This means that we won't send the 'AddPartitions' request, yet we will still try to send the produce data. This results in a fatal error and requires the producer to close.
> The solution is that in accumulator.drain, we should check again whether there are pending add partitions requests, and if so, not drain anything.
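
The fix proposed in the description, re-checking for pending add-partitions requests at drain time, can be illustrated with a small single-threaded replay of the 2, 1, 3 interleaving. This is only a sketch under stated assumptions: TxnState, Accumulator, DrainGuardSketch and all of their methods are hypothetical stand-ins invented for illustration, not Kafka's actual classes or signatures, and the real patch may look quite different.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Queue;
import java.util.Set;

/** Replays the problematic ordering (step 2, then 1, then 3) deterministically. */
class DrainGuardSketch {
    public static void main(String[] args) {
        TxnState txn = new TxnState();
        Accumulator acc = new Accumulator();

        // Step 2 (sender thread): no transactional requests are pending yet.
        boolean sawPending = txn.hasPendingAddPartitions();

        // Step 1 (producer thread): register the partition, then append the record.
        txn.maybeAddPartition("topic-0");
        acc.append("record-1");

        // Step 3 (sender thread): the drain re-checks and returns nothing.
        System.out.println("pending seen in step 2: " + sawPending);
        System.out.println("first drain:  " + acc.drain(txn));

        // Once the (simulated) AddPartitions response arrives, draining proceeds.
        txn.completeAddPartitions();
        System.out.println("second drain: " + acc.drain(txn));
    }
}

/** Hypothetical stand-in for the producer's transaction bookkeeping. */
class TxnState {
    private final Set<String> pendingPartitions = new HashSet<>();

    void maybeAddPartition(String partition) {
        pendingPartitions.add(partition);
    }

    boolean hasPendingAddPartitions() {
        return !pendingPartitions.isEmpty();
    }

    void completeAddPartitions() {
        pendingPartitions.clear();
    }
}

/** Hypothetical stand-in for the record accumulator. */
class Accumulator {
    private final Queue<String> records = new ArrayDeque<>();

    void append(String record) {
        records.add(record);
    }

    /** The guard: drain nothing while add-partitions requests are still pending. */
    List<String> drain(TxnState txn) {
        List<String> batch = new ArrayList<>();
        if (txn.hasPendingAddPartitions()) {
            return batch;
        }
        while (!records.isEmpty()) {
            batch.add(records.poll());
        }
        return batch;
    }
}
```

In this replay the first drain yields an empty batch even though the sender's earlier check in step 2 saw nothing pending, and the record is released only after the simulated AddPartitions round trip completes. The sketch omits all locking, so it demonstrates the ordering guarantee of the guard rather than a thread-safe implementation.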