[ 
https://issues.apache.org/jira/browse/KAFKA-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16101044#comment-16101044
 ] 

Sumant Tambe commented on KAFKA-5621:
-------------------------------------

I'm unsure about the suggestion to share the same bucket of retry tokens 
between the accumulator and the network client. I find it comforting that the 
Sender will retry a few times at the network level before giving up on a record 
as not making progress. If the same bucket of tokens is shared, the 
network-level retry mechanism may not get enough chances. If/when the broker 
is overloaded and takes a long time to respond to a produce request, the 
producer might pessimistically give up on the record. I would be OK with 
doubling the bound on failure notification rather than cutting the network 
client short in some cases.
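
To make the concern concrete, here is a minimal toy model of shared versus 
separate retry budgets. It is only a sketch and not Kafka code: RetryBudget, 
network_attempts, and the token counts are hypothetical names and numbers 
chosen for illustration.

```python
# Toy model, not Kafka code: every name and number here is hypothetical.

class RetryBudget:
    """A bucket of retry tokens; each retry consumes one token."""

    def __init__(self, tokens):
        self.tokens = tokens

    def try_consume(self):
        """Take one token if any remain; report whether a retry is allowed."""
        if self.tokens > 0:
            self.tokens -= 1
            return True
        return False


def network_attempts(accumulator_expiries, accumulator_budget, network_budget):
    """Count the network-level retries a record gets before it is failed.

    The batch first expires `accumulator_expiries` times in the accumulator,
    each expiry consuming a token from `accumulator_budget`. After that,
    every send fails and consumes a token from `network_budget`.
    """
    for _ in range(accumulator_expiries):
        if not accumulator_budget.try_consume():
            return 0  # the record was failed before reaching the network
    attempts = 0
    while network_budget.try_consume():
        attempts += 1
    return attempts


# Shared bucket: 3 tokens in total, 2 of them spent on accumulator expiries.
shared = RetryBudget(3)
shared_result = network_attempts(2, shared, shared)

# Separate buckets: 3 tokens each for the accumulator and the network client.
separate_result = network_attempts(2, RetryBudget(3), RetryBudget(3))

print(shared_result, separate_result)  # prints "1 3"
```

In this model the shared bucket leaves the network client a single retry, 
while separate buckets keep all three network retries at the cost of up to 
twice as many retries overall, which is the doubled bound on failure 
notification mentioned above.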

I'm also not a fan of never expiring a batch (with or without the idempotent 
producer). KMM (Kafka MirrorMaker) at LinkedIn today stops replication if a 
batch fails because of expiration (or any error, for that matter). While KMM 
does not care (much) about the latency of successful replication, it does care 
about how long to wait when there's a logjam of records in its accumulator. 
This usually happens in catch-up mode. KMM would want to terminate itself 
if/when partitions are drained too slowly. I.e., it has a bound on failure 
notification. 


> The producer should retry expired batches when retries are enabled
> ------------------------------------------------------------------
>
>                 Key: KAFKA-5621
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5621
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Apurva Mehta
>             Fix For: 1.0.0
>
>
> Today, when a batch is expired in the accumulator, a {{TimeoutException}} is 
> raised to the user.
> It might be better for the producer to retry the expired batch up to the 
> configured number of retries. This is more intuitive from the user's point of 
> view. 
> Further, the proposed behavior makes it easier for applications like mirror 
> maker to provide ordering guarantees even when batches expire. Today, they 
> would resend the expired batch and it would get added to the back of the 
> queue, causing the output ordering to be different from the input ordering.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
