[jira] [Commented] (KAFKA-5494) Idempotent producer should not require max.in.flight.requests.per.connection=1 and acks=all

Apurva Mehta (JIRA) Tue, 25 Jul 2017 17:09:36 -0700

    [ 
https://issues.apache.org/jira/browse/KAFKA-5494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16100976#comment-16100976
 ]


Apurva Mehta commented on KAFKA-5494:
-------------------------------------

I created a separate JIRA to track the progress on making {{acks=all}} the 
default. You can find it here: https://issues.apache.org/jira/browse/KAFKA-5640

> Idempotent producer should not require 
> max.in.flight.requests.per.connection=1 and acks=all
> -------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-5494
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5494
>             Project: Kafka
>          Issue Type: Sub-task
>    Affects Versions: 0.11.0.0
>            Reporter: Apurva Mehta
>            Assignee: Apurva Mehta
>              Labels: exactly-once
>             Fix For: 1.0.0
>
>
> Currently, the idempotent producer (and hence transactional producer) 
> requires max.in.flight.requests.per.connection=1.
> This was due to simplifying the implementation on the client and server. With 
> some additional work, we can satisfy the idempotent guarantees even with any 
> number of in flight requests. The changes on the client be summarized as 
> follows:
>  
> # We increment sequence numbers when batches are drained.
> # If for some reason, a batch fails with a retriable error, we know that all 
> future batches would fail with an out of order sequence exception. 
> # As such, the client should treat some OutOfOrderSequence errors as 
> retriable. In particular, we should maintain the 'last acked sequnece'. If 
> the batch succeeding the last ack'd sequence has an OutOfOrderSequence, that 
> is a fatal error. If a future batch fails with OutOfOrderSequence they should 
> be reenqeued.
> # With the changes above, the the producer queues should become priority 
> queues ordered by the sequence numbers. 
> # The partition is not ready unless the front of the queue has the next 
> expected sequence.
> With the changes above, we would get the benefits of multiple inflights in 
> normal cases. When there are failures, we automatically constrain to a single 
> inflight until we get back in sequence. 
> With multiple inflights, we now have the possibility of getting duplicates 
> for batches other than the last appended batch. In order to return the record 
> metadata (including offset) of the duplicates inside the log, we would 
> require a log scan at the tail to get the metadata at the tail. This can be 
> optimized by caching the metadata for the last 'n' batches. For instance, if 
> the default max.inflight is 5, we could cache the record metadata of the last 
> 5 batches, and fall back to a scan if the duplicate is not within those 5. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

[jira] [Commented] (KAFKA-5494) Idempotent producer should not require max.in.flight.requests.per.connection=1 and acks=all

Reply via email to