Calvin Liu created KAFKA-17877:
----------------------------------

             Summary: IllegalStateException: missing producer id from the 
WriteTxnMarkersResponse
                 Key: KAFKA-17877
                 URL: https://issues.apache.org/jira/browse/KAFKA-17877
             Project: Kafka
          Issue Type: Bug
            Reporter: Calvin Liu
            Assignee: Calvin Liu


{code:java}
java.lang.IllegalStateException: WriteTxnMarkerResponse for 
lkc-devcv9jg9n_transaction-bench-transaction-id-72UwIuNVQkOxl4y_OEBAlA does not 
contain expected error map for producer id 8308
{code}
[https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/coordinator/transaction/TransactionMarkerRequestCompletionHandler.scala#L100]

------

This is a bug on the data partition side. The partition leader may return the 
response early, before all producer IDs have been included in it.

 

Consider the following case:
 # We have 2 markers to append, one for producer-0 and one for producer-1.
 # When we first process producer-0, it appends a marker to 
__consumer_offsets.
 # The __consumer_offsets append finishes very quickly because the group 
coordinator is no longer the leader, so the coordinator directly returns 
NOT_LEADER_OR_FOLLOWER. In its callback, it calls {{maybeComplete()}} for 
the first time, and because there is only one partition to append, it is able 
to go further, call {{maybeSendResponseCallback()}} and decrement 
{{numAppends}}.
 # Then it calls the replica manager append with nothing to append. In that 
callback, it calls {{maybeComplete()}} for the second time. This time, it also 
decrements {{numAppends}}.

Remember that because we only have 2 markers, the initial value of 
{{numAppends}} is also 2. So by step 4 the counter has been decremented twice 
and the request finishes without producer-1 ever being processed. This leaves 
producer-1 missing from the WriteTxnMarkers response.
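
The double decrement described in the steps above can be reproduced with a small standalone model. This is only a sketch under stated assumptions, not the actual broker code: the class {{NumAppendsSketch}}, the method {{simulate}} and the {{fastFailProducerId}} parameter are hypothetical names introduced for illustration, and the only things taken from the real code are the {{numAppends}} counter and the idea that the completion callback can run twice for a single marker.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical, simplified model of the counting logic described in this ticket.
// It is NOT the real KafkaApis code; all names except numAppends are assumptions.
class NumAppendsSketch {

    /**
     * Processes one marker per producer ID, in order, and returns the producer IDs
     * that were in the error map at the moment the response was "sent".
     * The marker of fastFailProducerId triggers the completion callback twice:
     * once from the fast-failing __consumer_offsets append and once from the
     * (empty) replica manager append.
     */
    static List<Long> simulate(long[] producerIds, long fastFailProducerId) {
        // One expected completion per marker, as in the ticket description.
        AtomicInteger numAppends = new AtomicInteger(producerIds.length);
        Map<Long, String> errors = new LinkedHashMap<>();
        List<Long> response = new ArrayList<>();

        for (long producerId : producerIds) {
            // Stand-in for maybeComplete() followed by maybeSendResponseCallback().
            Runnable maybeComplete = () -> {
                errors.putIfAbsent(producerId,
                        producerId == fastFailProducerId ? "NOT_LEADER_OR_FOLLOWER" : "NONE");
                if (numAppends.decrementAndGet() == 0) {
                    // The response is built from whatever has been recorded so far.
                    response.addAll(errors.keySet());
                }
            };

            if (producerId == fastFailProducerId) {
                maybeComplete.run(); // step 3: coordinator is not the leader, fails fast
            }
            maybeComplete.run();     // step 4: replica manager append callback
        }
        return response;
    }

    public static void main(String[] args) {
        // Two markers; producer 0 hits the fast-fail path and decrements twice.
        System.out.println(simulate(new long[]{0L, 1L}, 0L));
        // Same markers with no fast-fail path: one decrement per marker.
        System.out.println(simulate(new long[]{0L, 1L}, -1L));
    }
}
```

In this model the first call completes the "response" with only producer 0 recorded, while the second call completes it only after both producers have been recorded. A fix would need to ensure that each marker contributes exactly one decrement of {{numAppends}}, however many callbacks fire for it.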

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
