[
https://issues.apache.org/jira/browse/KAFKA-20090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18060098#comment-18060098
]
sanghyeok An edited comment on KAFKA-20090 at 2/22/26 2:29 AM:
---------------------------------------------------------------
*TL;DR:*
There is still a discrepancy between the {{producerId}} / {{producerEpoch}}
values across the {{PREPARE_ABORT}} → {{COMPLETE_ABORT}} transition. However,
this appears to be by design, and even if a client sends {{EndTxn(ABORT)}}
during this transition window, it does not seem to cause any issues. So the
minimal change that simply removes {{!isEpochFence()}} seems sufficient.
I also considered concurrency during the transition from *#5* to {*}#6{*}.
In this window, the client could send an *EndTxn(ABORT)* request.
However, I do not think this would cause a client-visible issue.
The reason is that at *#5* a pending transition is in progress and
*pendingState is COMPLETE_ABORT.*
In that case, unless the pendingState is the special-case PREPARE_EPOCH_FENCE,
EndTxn(ABORT) is handled by the path that responds with
*CONCURRENT_TRANSACTIONS.* Also, in that error path, *the response callback is
invoked with NO_PRODUCER_ID and NO_PRODUCER_EPOCH (that is, -1 and -1),* so
there is no path where the client receives producerId and producerEpoch as a
successful response. Therefore, even if producerId and producerEpoch appear to
differ internally between PREPARE_ABORT and COMPLETE_ABORT at the epoch
boundary, I do not expect it to lead to a direct client-visible problem.
That said, I cannot fully rule out the possibility that there is an implicit
semantic or invariant somewhere requiring producerId and producerEpoch to be
identical between PREPARE_ABORT and COMPLETE_ABORT. If any logic relies on that
invariant, this boundary condition could potentially introduce nondeterministic
behavior. *However, considering the comment I found that suggests, in TV2, a
producer may use the new producerId in the pending state during overflow
transitions, this kind of mismatch also seems consistent with the intended
design and may be acceptable.
([https://github.com/apache/kafka/blob/b974ccd8d2c0371246aa227952e86f6055557e4c/transaction-coordinator/src/main/java/org/apache/kafka/coordinator/transaction/TransactionMetadata.java#L264-L269])*
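The overflow behavior that the linked comment describes can be sketched as follows. The method and record names are illustrative, and the exhaustion threshold is my assumption, not the exact {{TransactionMetadata}} logic:

{code:java}
public class EpochOverflowSketch {
    record TransitIds(long producerId, short producerEpoch) {}

    // Assumed threshold: the epoch is "exhausted" when one more bump would
    // reach or pass Short.MAX_VALUE.
    static boolean isEpochExhausted(short epoch) {
        return epoch >= Short.MAX_VALUE - 1;
    }

    // On overflow, the pending (transit) metadata already carries the new
    // producerId with epoch 0, which is why PREPARE_ABORT and COMPLETE_ABORT
    // can legitimately show different producerId/epoch pairs at the boundary.
    static TransitIds nextTransitIds(long currentProducerId, short currentEpoch, long nextProducerId) {
        if (isEpochExhausted(currentEpoch)) {
            return new TransitIds(nextProducerId, (short) 0);
        }
        return new TransitIds(currentProducerId, (short) (currentEpoch + 1));
    }
}
{code}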
If we do want to prevent producerId and producerEpoch from diverging between
PREPARE_ABORT and COMPLETE_ABORT altogether, one approach would be to allow PID
rotation in prepareFenceProducerEpoch when the epoch is exhausted. But that
could broaden the semantics of epoch fencing, and we would likely need a wider
review of the TransactionCoordinator flow, including rolling-upgrade
considerations.
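A rough shape of that idea, assuming a much-simplified prepareFenceProducerEpoch (the real method's signature and state handling differ, and the allocation of the new producerId is elided here):

{code:java}
public class FenceRotationSketch {
    record FencedIds(long producerId, short producerEpoch) {}

    // Current behavior (sketch): fencing only bumps the epoch, so at
    // Short.MAX_VALUE - 1 the bump lands on MAX_VALUE with no new producerId.
    // Proposed change (sketch): when the epoch is exhausted, rotate to a
    // freshly allocated producerId so PREPARE_ABORT and COMPLETE_ABORT agree
    // on producerId/epoch even at the boundary.
    static FencedIds prepareFenceProducerEpoch(long producerId, short epoch,
                                               long newlyAllocatedProducerId) {
        if (epoch >= Short.MAX_VALUE - 1) {
            return new FencedIds(newlyAllocatedProducerId, (short) 0);
        }
        return new FencedIds(producerId, (short) (epoch + 1));
    }
}
{code}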
So I see two options.
*Option 1: Remove only the !isEpochFence condition*
* Pros: Small change and simple implementation
* Cons: Internally, producerId and producerEpoch may still differ between
PREPARE_ABORT and COMPLETE_ABORT; if any code depends on them being identical,
that could be risky
*Option 2: Allow PID rotation in prepareFenceProducerEpoch when the epoch is
exhausted*
* Pros: Provides stronger consistency of producerId and producerEpoch between
PREPARE_ABORT and COMPLETE_ABORT at the boundary
* Cons: Expands the fencing behavior and increases the scope of implementation
and validation. Also, transactions may already exist near epoch 32767, so we
may need to consider workarounds and rolling-upgrade impact
*Given that the PREPARE_ABORT vs COMPLETE_ABORT divergence appears to be
permitted by design (based on the comment I mentioned), I still believe Option
1, the minimal change, is likely sufficient.*
Sorry for the long message.
I wanted to make this boundary case explicit and discuss it clearly!
What do you think?
> TV2 can allow for ongoing transactions with max epoch that never complete
> -------------------------------------------------------------------------
>
> Key: KAFKA-20090
> URL: https://issues.apache.org/jira/browse/KAFKA-20090
> Project: Kafka
> Issue Type: Task
> Reporter: Justine Olshan
> Assignee: sanghyeok An
> Priority: Critical
>
> When transaction version 2 was introduced, epoch bumps happen on every
> transaction.
> The original EndTransaction logic considers retries, and because of epoch
> bumps we wanted to be careful not to fence ourselves. This means that for
> EndTransaction retries, we have to check whether the epoch has been bumped
> to recognize a retry.
> The original logic returns the current producer ID and epoch in the
> transaction metadata when a retry has been identified. The normal end
> transaction case with max epoch - 1 was considered and accounted for – the
> state there is safe to return to the producer.
> However, we didn't consider the case of fencing epoch bumps at max
> epoch - 1, where we also bump the epoch but don't create a new producer ID
> and epoch. In this scenario the producer was expected to be fenced and call
> init producer ID, so this isn't a problem by itself, but it is a problem if
> we return that state to the producer.
> There is a scenario where a transactional timeout races an end-transaction
> abort at max epoch - 1: we can consider the EndTxn request a "retry" and
> return max epoch as the current producer's epoch instead of fencing.
> 1. The fencing abort on transactional timeout bumps the epoch to max
> 2. The EndTxn request with max epoch - 1 is considered a "retry" and we
> return max epoch
> 3. The producer can start a transaction since we don't check epochs on
> starting transactions
> 4. We cannot commit this transaction with TV2 and we cannot time out the
> transaction. It is stuck in Ongoing forever.
> I modified
> [https://github.com/apache/kafka/blob/aad33e4e41aaa94b06f10a5be0094b717b98900f/core/src/test/scala/unit/kafka/coordinator/transaction/TransactionCoordinatorTest.scala#L1329]
> to capture this behavior. I added the following code to the end:
> {code:java}
> // Transition to COMPLETE_ABORT since we can't do it via the writing-markers
> // response callback
> txnMetadata.completeTransitionTo(new TxnTransitMetadata(
>   txnMetadata.producerId(),
>   txnMetadata.prevProducerId(),
>   txnMetadata.nextProducerId(),
>   Short.MaxValue,
>   (Short.MaxValue - 1).toShort,
>   txnTimeoutMs,
>   txnMetadata.pendingState().get(),
>   new util.HashSet[TopicPartition](),
>   txnMetadata.txnLastUpdateTimestamp(),
>   txnMetadata.txnLastUpdateTimestamp(),
>   TV_2))
>
> coordinator.handleEndTransaction(transactionalId, producerId, epochAtMaxBoundary,
>   TransactionResult.ABORT, TV_2, endTxnCallback)
>
> assertEquals(10, newProducerId)
> assertEquals(Short.MaxValue, newEpoch)
> assertEquals(Errors.NONE, error)
> {code}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)