[jira] [Updated] (KAFKA-18905) Avoid out of order sequence errors from multiple in-flight batches

Sean Quah (Jira) Fri, 28 Feb 2025 11:09:10 -0800


     [ https://issues.apache.org/jira/browse/KAFKA-18905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Sean Quah updated KAFKA-18905:
------------------------------
    Description: 
Consider a similar setup to KAFKA-9199:

The broker attempts to cache the state of the last 5 batches in order to enable 
duplicate detection. It is possible in some cases for this to result in a 
sequence such as the following:
 # Send batch n
 # Batch n successfully written, but response is delayed
 # Send batch n+1, receive successful response for n+1
 # Send batch n+2, receive successful response for n+2
 # Send batch n+3, receive successful response for n+3
 # Send batch n+4, receive successful response for n+4
 # Send batch n+5, receive successful response for n+5
 # Batch n times out, or we receive a {{NOT_LEADER_OR_FOLLOWER}} response
 # Retry batch n; the broker no longer has it cached, so the response is 
{{OUT_OF_ORDER_SEQUENCE}}, and each further retry gets the same response
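
The sequence above can be reproduced with a small toy model. This is only a 
sketch under stated assumptions, not Kafka's broker code: {{BrokerModel}}, 
{{run_scenario}} and the returned strings are hypothetical names, and each 
batch is treated as a single sequence number.

```python
from collections import deque

CACHE_SIZE = 5  # the broker retains the last 5 batches, per the description


class BrokerModel:
    """Toy per-producer duplicate detection (hypothetical, not Kafka code)."""

    def __init__(self):
        # Sequence numbers of the most recently written batches.
        self.recent = deque(maxlen=CACHE_SIZE)
        self.last_seq = -1

    def append(self, seq):
        if seq in self.recent:
            # Retry of a batch that is still cached: recognized as a duplicate.
            return "DUPLICATE_OK"
        if seq != self.last_seq + 1:
            # Not the next expected sequence and not a known duplicate.
            return "OUT_OF_ORDER_SEQUENCE"
        self.recent.append(seq)  # evicts the oldest entry once 5 are cached
        self.last_seq = seq
        return "OK"


def run_scenario(batches_after_n):
    """Write batch n, then batches_after_n more batches, then retry batch n."""
    broker = BrokerModel()
    n = 0
    assert broker.append(n) == "OK"  # written, but the response is delayed
    for k in range(1, batches_after_n + 1):
        assert broker.append(n + k) == "OK"
    return broker.append(n)  # retry after the timeout or error response


print(run_scenario(4))  # DUPLICATE_OK: batch n is still in the cache
print(run_scenario(5))  # OUT_OF_ORDER_SEQUENCE: batch n was evicted
```

In this model the retry is only recognized as a duplicate while at most four 
later batches have been written after batch n.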

This situation could happen with a {{max.in.flight.requests.per.connection}} as 
low as 2.
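
As a rough check of that claim, the sketch below replays the send and response 
events from the list above and counts how many batches are outstanding at any 
point. The event encoding and the {{max_in_flight}} helper are hypothetical, 
chosen only for illustration.

```python
def max_in_flight(events):
    """Return the peak number of batches awaiting a response."""
    in_flight = set()
    peak = 0
    for kind, batch in events:
        if kind == "send":
            in_flight.add(batch)
        else:  # "response"
            in_flight.discard(batch)
        peak = max(peak, len(in_flight))
    return peak


# Batch n is modelled as 0 and batches n+1..n+5 as 1..5.
events = [("send", 0)]  # batch n is sent; its response is delayed
for k in range(1, 6):
    events += [("send", k), ("response", k)]  # each later batch completes
events.append(("response", 0))  # the timeout or error for batch n arrives last

print(max_in_flight(events))  # 2
```

Batch n stays outstanding the whole time and only one other batch is ever in 
flight alongside it, so the peak is 2.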

To prevent this, the producer can hold back batch n+5 while batch n is still 
in flight.
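
One possible shape for such a check is sketched below. It is an assumption 
about how the rule could be expressed, not the actual producer change: 
{{can_send}} and {{CACHE_SIZE}} are hypothetical names, and batches are 
numbered with one sequence number each.

```python
CACHE_SIZE = 5  # number of batches the broker caches, per the description


def can_send(next_seq, in_flight):
    """Return True if sending next_seq cannot push the oldest in-flight
    batch out of the broker's cache of the last CACHE_SIZE batches."""
    if not in_flight:
        return True
    oldest = min(in_flight)
    # With batch n in flight, batches up to n+4 are allowed and n+5 is not.
    return next_seq - oldest < CACHE_SIZE


n = 0
print(can_send(n + 4, {n}))      # True
print(can_send(n + 5, {n}))      # False: batch n is still in flight
print(can_send(n + 5, {n + 1}))  # True: batch n has completed
```

The check depends only on the oldest outstanding batch, so it applies for any 
value of {{max.in.flight.requests.per.connection}}.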


  was:
Consider a similar setup to KAFKA-9199:

The broker attempts to cache the state of the last 5 batches in order to enable 
duplicate detection. It is possible in some cases for this to result in a 
sequence such as the following:
 # Send batch n
 # Batch n successfully written, but response is delayed
 # Send Batch n+1, receive successful response for n+1
 # Send Batch n+2, receive successful response for n+2
 # Send Batch n+3, receive successful response for n+3
 # Send Batch n+4, receive successful response for n+4
 # Send Batch n+5, receive successful response for n+5
 # Batch n times out, or we receive a {{NOT_LEADER_OR_FOLLOWER}} response
 # Retry batch n, the response is {{OUT_OF_ORDER_SEQUENCE}}, retry again and 
again

This situation could happen with a {{max.in.flight.requests.per.connection}} as 
low as 2.

To avoid this situation, we can avoid putting batch n+5 in flight while batch n 
is in flight.



> Avoid out of order sequence errors from multiple in-flight batches
> ------------------------------------------------------------------
>
>                 Key: KAFKA-18905
>                 URL: https://issues.apache.org/jira/browse/KAFKA-18905
>             Project: Kafka
>          Issue Type: Bug
>          Components: producer 
>            Reporter: Sean Quah
>            Priority: Minor



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
