[ 
https://issues.apache.org/jira/browse/KAFKA-5621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16102313#comment-16102313
 ] 

Apurva Mehta commented on KAFKA-5621:
-------------------------------------

I think the core dichotomy is that we have mirror-maker-like use cases and 
application use cases.
 
In the mirror maker use case, each partition is truly independent. If a subset 
of partitions is down, we still want to process the rest. So we want to expire 
batches and raise errors to the application (mirror maker in this case) as soon 
as possible. 

On the other hand, for an application, partitions are not really independent 
(and especially so if you use transactions). If one partition is down, it makes 
sense to wait for it to be ready before continuing. So we would want to handle 
as many errors internally as possible. It would mean blocking sends once the 
queue is too large and not expiring batches in the queue. This simplifies the 
application programming model. 

I think we should optimize the defaults for applications, yet still enable 
tools like mirror maker to get the desired behavior by setting the right configs.

Assuming that we complete [KAFKA-5494], we could apply retries to expired 
batches only when the idempotent producer is enabled. This way the default 
behavior is the simplest one for the application. 

KMM and other such tools could continue to use the producer without idempotence 
enabled and keep the existing behavior. Of course, we get into the same 
quandary if KMM wants to enable idempotence, but this is the best compromise 
without introducing an additional config. 
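To make the first option concrete, here is a rough sketch of the two producer profiles it implies. The keys `enable.idempotence` and `retries` are existing producer configs; the class `ProducerProfiles`, its method names, and the rule in `retriesExpiredBatches` are hypothetical and only restate the proposal above. It assumes idempotence is off unless set explicitly.

```java
import java.util.Properties;

// Illustration only: not producer code, just the proposed rule written down.
final class ProducerProfiles {

    // Application profile: idempotence on, so under the proposal expired
    // batches would be retried internally (assumes KAFKA-5494 is done).
    static Properties application() {
        Properties p = new Properties();
        p.setProperty("enable.idempotence", "true");
        p.setProperty("retries", String.valueOf(Integer.MAX_VALUE));
        return p;
    }

    // KMM-like profile: idempotence off, existing behavior is kept and an
    // expired batch surfaces as an error to the tool.
    static Properties mirrorMakerLike() {
        Properties p = new Properties();
        p.setProperty("enable.idempotence", "false");
        return p;
    }

    // Hypothetical helper: the proposed rule is "retry expired batches only
    // when the idempotent producer is enabled".
    static boolean retriesExpiredBatches(Properties p) {
        return Boolean.parseBoolean(p.getProperty("enable.idempotence", "false"));
    }

    public static void main(String[] args) {
        System.out.println(retriesExpiredBatches(application()));
        System.out.println(retriesExpiredBatches(mirrorMakerLike()));
    }
}
```

The quandary mentioned above shows up directly: a KMM-like tool that sets `enable.idempotence` to true would fall into the application branch whether it wants to or not.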

Another option is to introduce the 'queue.time.ms' config. The default would be 
infinite. When it is specified, we would not retry expired batches regardless 
of whether idempotence is enabled. So KMM-like tooling could specify a value 
and most application developers could ignore it. 
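The second option could be sketched as a small decision function. Note that `queue.time.ms` is only a proposed config here, and `ExpiredBatchPolicy`, `Action`, and `onExpiry` are hypothetical names. The unset case is an assumption taken from the issue title: with an infinite queue time, an expired batch is retried whenever retries are enabled.

```java
import java.util.OptionalLong;

// Illustration only: models the proposed 'queue.time.ms' semantics.
final class ExpiredBatchPolicy {

    enum Action { RETRY, FAIL_WITH_TIMEOUT }

    // queueTimeMs: empty means the config was not specified (infinite).
    // idempotenceEnabled is accepted only to show that it does not matter
    // once queue.time.ms is specified.
    static Action onExpiry(OptionalLong queueTimeMs,
                           boolean idempotenceEnabled,
                           int retries) {
        if (queueTimeMs.isPresent()) {
            // Specified: never retry expired batches, idempotent or not.
            return Action.FAIL_WITH_TIMEOUT;
        }
        // Assumption: unset means retry, as long as retries are enabled.
        return retries > 0 ? Action.RETRY : Action.FAIL_WITH_TIMEOUT;
    }

    public static void main(String[] args) {
        System.out.println(onExpiry(OptionalLong.of(30_000L), true, 5));
        System.out.println(onExpiry(OptionalLong.empty(), false, 5));
    }
}
```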

I am not a fan of introducing new configs for a very narrow use case though, so 
I will continue to think of more alternatives.

> The producer should retry expired batches when retries are enabled
> ------------------------------------------------------------------
>
>                 Key: KAFKA-5621
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5621
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Apurva Mehta
>             Fix For: 1.0.0
>
>
> Today, when a batch is expired in the accumulator, a {{TimeoutException}} is 
> raised to the user.
> It might be better for the producer to retry the expired batch up to the 
> configured number of retries. This is more intuitive from the user's point of 
> view. 
> Further, the proposed behavior makes it easier for applications like mirror 
> maker to provide ordering guarantees even when batches expire. Today, they 
> would resend the expired batch and it would get added to the back of the 
> queue, causing the output ordering to be different from the input ordering.
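The ordering problem described in the issue can be shown with a toy model. This is not the real accumulator: `ExpiryOrdering` and `deliver` are hypothetical, batches are plain strings, and the named batch is assumed to expire exactly once.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Toy model: one queue of batches, one batch expires once.
final class ExpiryOrdering {

    // retryInPlace = true  models the proposed internal retry: the expired
    //                      batch is retried before any later batch.
    // retryInPlace = false models today's behavior: the application resends
    //                      and the batch lands at the back of the queue.
    static List<String> deliver(List<String> batches, String expiring,
                                boolean retryInPlace) {
        Deque<String> queue = new ArrayDeque<>(batches);
        List<String> delivered = new ArrayList<>();
        boolean expiredOnce = false;
        while (!queue.isEmpty()) {
            String batch = queue.pollFirst();
            if (batch.equals(expiring) && !expiredOnce) {
                expiredOnce = true;
                if (retryInPlace) {
                    queue.addFirst(batch);
                } else {
                    queue.addLast(batch);
                }
                continue;
            }
            delivered.add(batch);
        }
        return delivered;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("b1", "b2", "b3");
        System.out.println(deliver(input, "b2", false)); // [b1, b3, b2]
        System.out.println(deliver(input, "b2", true));  // [b1, b2, b3]
    }
}
```

With the resend path the output order differs from the input order; with the in-place retry it is preserved.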



