Hello,

We have a PR that I believe addresses exactly the problem you observed:
https://github.com/apache/kafka/pull/8278
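
For reference, the condition described in the report quoted below can be sketched roughly as follows. This is only an illustration built from the values in the quoted log line: the `TxnSnapshot` class and the `start_timestamp_goes_backwards` function are hypothetical names, not the broker's actual code (that lives in TransactionMetadata.scala), and the real check involves more fields than the start timestamp.

```python
from dataclasses import dataclass


@dataclass
class TxnSnapshot:
    """Hypothetical, reduced stand-in for the fields shown in the log line."""
    producer_epoch: int
    txn_timeout_ms: int
    txn_start_timestamp: int  # milliseconds since the Unix epoch


def start_timestamp_goes_backwards(current: TxnSnapshot, proposed: TxnSnapshot) -> bool:
    """Return True when the proposed Ongoing state starts before the current one.

    Assumption: this mirrors the comparison the report describes, where the
    stored start timestamp is newer than the one in the transit metadata.
    """
    return current.txn_start_timestamp > proposed.txn_start_timestamp


# Values copied from the quoted FATAL message.
current = TxnSnapshot(producer_epoch=51, txn_timeout_ms=60000,
                      txn_start_timestamp=1580894482199)
proposed = TxnSnapshot(producer_epoch=51, txn_timeout_ms=60000,
                       txn_start_timestamp=1580894480766)

# Difference between the two start timestamps, in milliseconds.
skew_ms = current.txn_start_timestamp - proposed.txn_start_timestamp

if start_timestamp_goes_backwards(current, proposed):
    print(f"proposed start is {skew_ms} ms older than the stored one")
```

With the reported values the two start timestamps differ by about 1.4 seconds, which fits the suspected clock skew of a few seconds.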


Guozhang

On Wed, Mar 11, 2020 at 11:57 PM Guru C G <mailguruchaita...@gmail.com>
wrote:

> We have come across an issue where FATAL messages are logged by the
> broker.
>
> FATAL kafka.coordinator.transaction.TransactionMetadata:
> TransactionMetadata(transactionalId=tx-id-1, producerId=96011,
> producerEpoch=51, txnTimeoutMs=60000, state=CompleteCommit,
> pendingState=Some(Ongoing), topicPartitions=Set(),
> txnStartTimestamp=1580894482199, txnLastUpdateTimestamp=1580894482292)'s
> transition to TxnTransitMetadata(producerId=96011, producerEpoch=51,
> txnTimeoutMs=60000, txnState=Ongoing, topicPartitions=Set(topic1-0),
> txnStartTimestamp=1580894480766, txnLastUpdateTimestamp=1580894480766)
> failed: this should not happen
>
> On close inspection, we found that the message is logged because the
> completed transaction has a newer start timestamp
> (txnStartTimestamp=1580894482199) than the TxnTransitMetadata it is
> transitioning to (txnStartTimestamp=1580894480766). We also found that the
> broker clocks may be out of sync by a few seconds.
>
>
> https://github.com/apache/kafka/blob/b526528cafe4142b73df8c930473b0cddc84ca9d/core/src/main/scala/kafka/coordinator/transaction/TransactionMetadata.scala#L382
>
>
> The general scenario is acknowledged and partially addressed in the JIRA
> comment linked below. However, that does not cover the case where the start
> time of the Ongoing transaction is older than the start time of the
> completed/aborted one.
>
> https://issues.apache.org/jira/browse/KAFKA-5415?focusedCommentId=16045170&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16045170
>
> Is this deliberate? Do we need that check there?
>
>

-- 
-- Guozhang
