artemlivshits commented on code in PR #19751:
URL: https://github.com/apache/kafka/pull/19751#discussion_r2094234325


##########
core/src/main/scala/kafka/coordinator/transaction/TransactionCoordinator.scala:
##########
@@ -772,135 +793,159 @@ class TransactionCoordinator(txnConfig: TransactionConfig,
           val coordinatorEpoch = epochAndTxnMetadata.coordinatorEpoch
 
           txnMetadata.inLock {
-            producerIdCopy = txnMetadata.producerId
-            producerEpochCopy = txnMetadata.producerEpoch
-            // PrepareEpochFence has slightly different epoch bumping logic so don't include it here.
-            // Note that, it can only happen when the current state is Ongoing.
-            isEpochFence = txnMetadata.pendingState.contains(TransactionState.PREPARE_EPOCH_FENCE)
-            // True if the client retried a request that had overflowed the epoch, and a new producer ID is stored in the txnMetadata
-            val retryOnOverflow = !isEpochFence && txnMetadata.prevProducerId == producerId &&
-              producerEpoch == Short.MaxValue - 1 && txnMetadata.producerEpoch == 0
-            // True if the client retried an endTxn request, and the bumped producer epoch is stored in the txnMetadata.
-            val retryOnEpochBump = !isEpochFence && txnMetadata.producerEpoch == producerEpoch + 1
-
-            val isValidEpoch = {
-              if (!isEpochFence) {
-                // With transactions V2, state + same epoch is not sufficient to determine if a retry transition is valid. If the epoch is the
-                // same it actually indicates the next endTransaction call. Instead, we want to check the epoch matches with the epoch in the retry conditions.
-                // Return producer fenced even in the cases where the epoch is higher and could indicate an invalid state transition.
-                // Use the following criteria to determine if a v2 retry is valid:
-                txnMetadata.state match {
-                  case TransactionState.ONGOING | TransactionState.EMPTY | TransactionState.DEAD | TransactionState.PREPARE_EPOCH_FENCE =>
-                    producerEpoch == txnMetadata.producerEpoch
-                  case TransactionState.PREPARE_COMMIT | TransactionState.PREPARE_ABORT =>
-                    retryOnEpochBump
-                  case TransactionState.COMPLETE_COMMIT | TransactionState.COMPLETE_ABORT =>
-                    retryOnEpochBump || retryOnOverflow || producerEpoch == txnMetadata.producerEpoch
+            // These checks are performed first so that we don't have to deal with intermediate states
+            // (including PrepareCommit and PrepareAbort) in already complex state machine.
+            if (txnMetadata.pendingTransitionInProgress && !txnMetadata.pendingState.contains(TransactionState.PREPARE_EPOCH_FENCE)) {
+              Left(Errors.CONCURRENT_TRANSACTIONS)
+            } else if (txnMetadata.state == TransactionState.PREPARE_COMMIT || txnMetadata.state == TransactionState.PREPARE_ABORT) {

Review Comment:
   Here we eagerly reject requests when the transaction is in the PrepareCommit / PrepareAbort state. Previously the code tried to validate the state and return a fenced error on mismatch, yet even when those checks passed the request was still rejected with a concurrent-transactions error. The checks grew extremely complex with the new `nextProducerEpoch` state, so we avoid that complexity altogether by just returning the concurrent-transactions error for this transient state.
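   
   To make the intent concrete, here is a minimal, self-contained sketch of the resulting control flow. The state and error names mirror the diff; `TxnMetadata` and `checkTransientState` are simplified stand-ins for illustration, not the coordinator's real types:
   
   ```scala
   object EagerRejectSketch {
     sealed trait TransactionState
     object TransactionState {
       case object ONGOING extends TransactionState
       case object PREPARE_COMMIT extends TransactionState
       case object PREPARE_ABORT extends TransactionState
       case object PREPARE_EPOCH_FENCE extends TransactionState
       case object COMPLETE_COMMIT extends TransactionState
     }
     sealed trait Errors
     object Errors { case object CONCURRENT_TRANSACTIONS extends Errors }
   
     // Stub of the metadata: the current state plus a pending transition, if any.
     final case class TxnMetadata(state: TransactionState,
                                  pendingState: Option[TransactionState]) {
       def pendingTransitionInProgress: Boolean = pendingState.isDefined
     }
   
     def checkTransientState(txnMetadata: TxnMetadata): Either[Errors, Unit] = {
       if (txnMetadata.pendingTransitionInProgress &&
           !txnMetadata.pendingState.contains(TransactionState.PREPARE_EPOCH_FENCE))
         // A transition is mid-flight: tell the client to retry later.
         Left(Errors.CONCURRENT_TRANSACTIONS)
       else if (txnMetadata.state == TransactionState.PREPARE_COMMIT ||
                txnMetadata.state == TransactionState.PREPARE_ABORT)
         // PrepareCommit / PrepareAbort are transient too: instead of proving a
         // retry valid under every epoch combination, reject eagerly.
         Left(Errors.CONCURRENT_TRANSACTIONS)
       else
         Right(()) // stable states continue into the normal epoch validation
     }
   }
   ```
   
   Both branches return the same retriable error, which is what lets the complex per-state epoch checks disappear for these transient states: the client simply retries once the transition settles.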



##########
core/src/main/scala/kafka/coordinator/transaction/TransactionMetadata.scala:
##########
@@ -109,7 +125,7 @@ private[transaction] class TransactionMetadata(val transactionalId: String,
   def prepareIncrementProducerEpoch(newTxnTimeoutMs: Int,
                                     expectedProducerEpoch: Option[Short],
                                     updateTimestamp: Long): Either[Errors, TxnTransitMetadata] = {
-    if (isProducerEpochExhausted)
+    if (TransactionMetadata.isEpochExhausted(producerEpoch))

Review Comment:
   This changed because `isProducerEpochExhausted` used to check `producerEpoch`, but it now checks `clientProducerEpoch` (which can sometimes be `nextProducerEpoch`). In this place we still want to check `producerEpoch`, hence the explicit call.
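   
   A minimal sketch of the distinction, under assumed shapes: the exhaustion threshold is inferred from the `Short.MaxValue - 1` overflow check in the coordinator diff above, and `nextProducerEpoch` is modeled as an `Option` purely for illustration; only the names `isEpochExhausted`, `producerEpoch`, `clientProducerEpoch`, and `nextProducerEpoch` come from the change itself:
   
   ```scala
   object TransactionMetadataSketch {
     // Companion-style helper: an epoch is exhausted once bumping it again
     // would overflow Short (threshold assumed, see lead-in above).
     def isEpochExhausted(epoch: Short): Boolean = epoch >= Short.MaxValue - 1
   
     final class Metadata(val producerEpoch: Short,
                          val nextProducerEpoch: Option[Short]) {
       // What the client currently observes: the pending bumped epoch if set.
       def clientProducerEpoch: Short = nextProducerEpoch.getOrElse(producerEpoch)
   
       // The instance method is now keyed off clientProducerEpoch ...
       def isProducerEpochExhausted: Boolean = isEpochExhausted(clientProducerEpoch)
     }
   
     // ... so a caller that needs the stored epoch specifically makes the
     // explicit call, as prepareIncrementProducerEpoch does in the diff.
     def wantsStoredEpoch(m: Metadata): Boolean = isEpochExhausted(m.producerEpoch)
   }
   ```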


