Jason Gustafson created KAFKA-9350:
--------------------------------------
Summary: IllegalStateException when materializing transactional offset commits
Key: KAFKA-9350
URL: https://issues.apache.org/jira/browse/KAFKA-9350
Project: Kafka
Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Jason Gustafson
Assignee: Jason Gustafson
We have seen this exception a few times in the broker log:
{code}
java.lang.IllegalStateException: Trying to complete a transactional offset commit for producerId 16031 and groupId foo even though the offset commit record itself hasn't been appended to the log.
    at kafka.coordinator.group.GroupMetadata.$anonfun$completePendingTxnOffsetCommit$2(GroupMetadata.scala:595)
    at scala.collection.mutable.HashMap.$anonfun$foreach$1(HashMap.scala:149)
    at scala.collection.mutable.HashTable.foreachEntry(HashTable.scala:237)
    at scala.collection.mutable.HashTable.foreachEntry$(HashTable.scala:230)
    at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:44)
    at scala.collection.mutable.HashMap.foreach(HashMap.scala:149)
    at kafka.coordinator.group.GroupMetadata.$anonfun$completePendingTxnOffsetCommit$1(GroupMetadata.scala:592)
    at kafka.coordinator.group.GroupMetadata.$anonfun$completePendingTxnOffsetCommit$1$adapted(GroupMetadata.scala:591)
    at scala.Option.foreach(Option.scala:274)
    at kafka.coordinator.group.GroupMetadata.completePendingTxnOffsetCommit(GroupMetadata.scala:591)
    at kafka.coordinator.group.GroupMetadataManager.$anonfun$handleTxnCompletion$2(GroupMetadataManager.scala:828)
    at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:253)
    at kafka.coordinator.group.GroupMetadata.inLock(GroupMetadata.scala:209)
    at kafka.coordinator.group.GroupMetadataManager.$anonfun$handleTxnCompletion$1(GroupMetadataManager.scala:827)
    at scala.collection.mutable.HashSet.foreach(HashSet.scala:79)
    at kafka.coordinator.group.GroupMetadataManager.handleTxnCompletion(GroupMetadataManager.scala:824)
    at kafka.coordinator.group.GroupMetadataManager.$anonfun$scheduleHandleTxnCompletion$1(GroupMetadataManager.scala:819)
    at kafka.utils.KafkaScheduler.$anonfun$schedule$2(KafkaScheduler.scala:116)
    at kafka.utils.CoreUtils$$anon$1.run(CoreUtils.scala:65)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:834)
{code}
It seems that the end transaction marker callback is being triggered before
the offset commit callback. This is puzzling because transaction completion
should be tied to a successful TxnOffsetCommit response, which in turn depends
on completion of the offset commit callback. So either there is some case we
are missing in the broker, or there is a bug in the client. I looked through
the logic on both sides and found no obvious problem.
In any case, it probably makes sense to let the broker behave more defensively
since there is no guarantee that a client won't send EndTxn before receiving a
successful TxnOffsetCommit response.
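
As a rough illustration of what "more defensively" could mean, here is a minimal Java sketch. It is not the broker's code: the class `PendingTxnOffsets` and its methods are hypothetical, and it only models the idea that completing a transaction skips (and reports) any pending offset commit whose log append has not been confirmed, instead of throwing IllegalStateException.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified model of a group's pending transactional offset commits. */
final class PendingTxnOffsets {

    /** A pending commit; appendedToLog becomes true once the append callback has fired. */
    static final class PendingCommit {
        final long offset;
        boolean appendedToLog;

        PendingCommit(long offset) {
            this.offset = offset;
        }
    }

    // producerId -> (partition name -> pending commit)
    private final Map<Long, Map<String, PendingCommit>> pending = new HashMap<>();
    // partition name -> materialized (visible) committed offset
    private final Map<String, Long> committed = new HashMap<>();

    /** Registers a transactional offset commit before its record is appended. */
    void prepare(long producerId, String partition, long offset) {
        pending.computeIfAbsent(producerId, id -> new HashMap<>())
               .put(partition, new PendingCommit(offset));
    }

    /** Marks a pending commit as appended (the offset commit callback in this model). */
    void onAppended(long producerId, String partition) {
        Map<String, PendingCommit> byPartition = pending.get(producerId);
        if (byPartition != null) {
            PendingCommit commit = byPartition.get(partition);
            if (commit != null) {
                commit.appendedToLog = true;
            }
        }
    }

    /**
     * Completes the transaction for producerId. On commit, only offsets whose
     * append was confirmed are materialized; the rest are skipped rather than
     * treated as a fatal error. Returns the skipped partitions so the caller
     * can log them.
     */
    List<String> complete(long producerId, boolean isCommit) {
        List<String> skipped = new ArrayList<>();
        Map<String, PendingCommit> byPartition = pending.remove(producerId);
        if (byPartition == null || !isCommit) {
            return skipped; // nothing pending, or the transaction was aborted
        }
        for (Map.Entry<String, PendingCommit> entry : byPartition.entrySet()) {
            if (entry.getValue().appendedToLog) {
                committed.put(entry.getKey(), entry.getValue().offset);
            } else {
                skipped.add(entry.getKey());
            }
        }
        return skipped;
    }

    /** Returns the materialized offset for a partition, or null if none exists. */
    Long committedOffset(String partition) {
        return committed.get(partition);
    }
}
```

In this model, calling `complete(16031L, true)` after only some of the prepared commits were confirmed materializes the confirmed ones and returns the others. Whether skipping, deferring, or failing the transaction is the right policy for the real broker is a separate question that this sketch does not settle.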
Note that this bug tends to go unnoticed because a subsequent offset commit
usually succeeds. In the worst case, however, it can violate EOS guarantees
because it could cause the consumer to revert to a previously committed
offset.