Hello,

We have a PR that I believe addresses exactly the problem you observed: https://github.com/apache/kafka/pull/8278
Guozhang

On Wed, Mar 11, 2020 at 11:57 PM Guru C G <mailguruchaita...@gmail.com> wrote:

> We have come across an issue in which FATAL messages are logged in the
> broker.
>
> FATAL kafka.coordinator.transaction.TransactionMetadata:
> TransactionMetadata(transactionalId=tx-id-1, producerId=96011,
> producerEpoch=51, txnTimeoutMs=60000, state=CompleteCommit,
> pendingState=Some(Ongoing), topicPartitions=Set(),
> txnStartTimestamp=1580894482199, txnLastUpdateTimestamp=1580894482292)'s
> transition to TxnTransitMetadata(producerId=96011, producerEpoch=51,
> txnTimeoutMs=60000, txnState=Ongoing, topicPartitions=Set(topic1-0),
> txnStartTimestamp=1580894480766, txnLastUpdateTimestamp=1580894480766)
> failed: this should not happen
>
> On close inspection, we found that the message is logged because the
> completed transaction has a newer timestamp
> (txnStartTimestamp=1580894482199) than the current timestamp of
> TxnTransitMetadata (txnStartTimestamp=1580894480766). We also found that
> the clocks on the brokers may be out of sync by a few seconds.
>
> https://github.com/apache/kafka/blob/b526528cafe4142b73df8c930473b0cddc84ca9d/core/src/main/scala/kafka/coordinator/transaction/TransactionMetadata.scala#L382
>
> The scenario in general is acknowledged and partially addressed below.
> However, it does not cover the case where the start time of the Ongoing
> transaction is older than the start time of the completed/aborted one.
>
> https://issues.apache.org/jira/browse/KAFKA-5415?focusedCommentId=16045170&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16045170
>
> Is this deliberate? Do we need that check there?

--
-- Guozhang
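
For readers following the thread, the comparison described in the quoted report can be sketched roughly as below. This is a simplified illustration, not the Kafka source: the class and method names (TxnStartTimestampCheck, startTimestampRegressed, skewMs) are hypothetical, and the form of the check (stored start timestamp greater than the incoming one) is assumed from the report's description. Only the two timestamp values are taken from the log message.

```java
// Simplified illustration only; this is NOT the actual Kafka code.
// All names here are hypothetical and chosen for the example.
public class TxnStartTimestampCheck {

    // Assumed form of the check: the transition is rejected when the
    // start timestamp already stored in the metadata is newer than the
    // start timestamp carried by the incoming transition.
    static boolean startTimestampRegressed(long storedStartMs, long transitStartMs) {
        return storedStartMs > transitStartMs;
    }

    // How far the incoming start timestamp lags behind the stored one.
    static long skewMs(long storedStartMs, long transitStartMs) {
        return storedStartMs - transitStartMs;
    }

    public static void main(String[] args) {
        // Values copied from the FATAL log message in the report.
        long stored = 1580894482199L;   // TransactionMetadata.txnStartTimestamp
        long transit = 1580894480766L;  // TxnTransitMetadata.txnStartTimestamp

        System.out.println("regressed = " + startTimestampRegressed(stored, transit));
        System.out.println("skew (ms) = " + skewMs(stored, transit));
    }
}
```

With the logged values the stored timestamp is ahead of the incoming one by 1433 ms, which is consistent with the reported clock skew of a few seconds.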