Jason Gustafson created KAFKA-9350:
--------------------------------------

             Summary: IllegalStateException when materializing transactional offset commits
                 Key: KAFKA-9350
                 URL: https://issues.apache.org/jira/browse/KAFKA-9350
             Project: Kafka
          Issue Type: Bug
    Affects Versions: 2.4.0
            Reporter: Jason Gustafson
            Assignee: Jason Gustafson


We have caught this exception a few times in the log:

{code}
java.lang.IllegalStateException: Trying to complete a transactional offset commit for producerId 16031 and groupId foo even though the offset commit record itself hasn't been appended to the log.
        at kafka.coordinator.group.GroupMetadata.$anonfun$completePendingTxnOffsetCommit$2(GroupMetadata.scala:595)
        at scala.collection.mutable.HashMap.$anonfun$foreach$1(HashMap.scala:149)
        at scala.collection.mutable.HashTable.foreachEntry(HashTable.scala:237)
        at scala.collection.mutable.HashTable.foreachEntry$(HashTable.scala:230)
        at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:44)
        at scala.collection.mutable.HashMap.foreach(HashMap.scala:149)
        at kafka.coordinator.group.GroupMetadata.$anonfun$completePendingTxnOffsetCommit$1(GroupMetadata.scala:592)
        at kafka.coordinator.group.GroupMetadata.$anonfun$completePendingTxnOffsetCommit$1$adapted(GroupMetadata.scala:591)
        at scala.Option.foreach(Option.scala:274)
        at kafka.coordinator.group.GroupMetadata.completePendingTxnOffsetCommit(GroupMetadata.scala:591)
        at kafka.coordinator.group.GroupMetadataManager.$anonfun$handleTxnCompletion$2(GroupMetadataManager.scala:828)
        at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:253)
        at kafka.coordinator.group.GroupMetadata.inLock(GroupMetadata.scala:209)
        at kafka.coordinator.group.GroupMetadataManager.$anonfun$handleTxnCompletion$1(GroupMetadataManager.scala:827)
        at scala.collection.mutable.HashSet.foreach(HashSet.scala:79)
        at kafka.coordinator.group.GroupMetadataManager.handleTxnCompletion(GroupMetadataManager.scala:824)
        at kafka.coordinator.group.GroupMetadataManager.$anonfun$scheduleHandleTxnCompletion$1(GroupMetadataManager.scala:819)
        at kafka.utils.KafkaScheduler.$anonfun$schedule$2(KafkaScheduler.scala:116)
        at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:65)
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
        at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:834)
{code}
It seems that the end transaction marker callback is being triggered before 
the offset commit callback. This is puzzling because transaction completion 
should be tied to a successful TxnOffsetCommit response, which in turn depends 
on completion of the offset commit callback. So either there is some case we're 
missing in the broker, or there is a bug in the client. I looked through the 
logic on both sides and found no obvious problem.

In any case, it probably makes sense to let the broker behave more defensively 
since there is no guarantee that a client won't send EndTxn before receiving a 
successful TxnOffsetCommit response.
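
To make the idea of "more defensive" handling concrete, here is a minimal 
sketch in Java. It is not Kafka's GroupMetadata code: the class 
PendingTxnOffsets, its methods, and the single-offset model are all 
hypothetical names invented for illustration. It assumes one possible policy, 
namely that a transaction marker arriving before the offset commit record has 
been appended causes the pending commit to be skipped rather than raising 
IllegalStateException. Whether skipping is the right policy is a separate 
design question.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical, simplified model of pending transactional offset commits.
// None of these names come from the Kafka code base.
class PendingTxnOffsets {
    // A pending transactional offset commit for a single partition.
    static final class PendingCommit {
        final long offset;
        boolean appendedToLog; // true once the offset commit record is in the log
        PendingCommit(long offset) { this.offset = offset; }
    }

    private final Map<Long, PendingCommit> pendingByProducerId = new HashMap<>();
    private Long committedOffset = null;

    // Called when a transactional offset commit request is received.
    synchronized void prepareTxnOffsetCommit(long producerId, long offset) {
        pendingByProducerId.put(producerId, new PendingCommit(offset));
    }

    // Called from the (assumed) log append callback for the offset commit record.
    synchronized void onOffsetCommitAppended(long producerId) {
        PendingCommit pending = pendingByProducerId.get(producerId);
        if (pending != null) {
            pending.appendedToLog = true;
        }
    }

    // Called when the end transaction marker is processed.
    // Returns true only if the pending offset was materialized.
    synchronized boolean completeTxn(long producerId, boolean isCommit) {
        PendingCommit pending = pendingByProducerId.remove(producerId);
        if (pending == null || !isCommit) {
            return false; // nothing pending, or the transaction was aborted
        }
        if (!pending.appendedToLog) {
            // Defensive path (assumed policy): the marker arrived before the
            // append completed, so skip the commit instead of throwing.
            return false;
        }
        committedOffset = pending.offset;
        return true;
    }

    synchronized Optional<Long> committedOffset() {
        return Optional.ofNullable(committedOffset);
    }
}
```

In this sketch, calling completeTxn before onOffsetCommitAppended leaves the 
committed offset unchanged and returns false, so the caller can log the 
condition without failing the scheduler task.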

Note that the impact of this bug tends to go unnoticed because there is usually 
a subsequent offset commit which succeeds. However, in the worst case, it can 
violate EOS guarantees because it could cause the consumer to revert to a 
previously committed offset.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
