Re: [DISCUSS] KIP-185: Make exactly once in order delivery per partition the default producer setting

Becket Qin Tue, 15 Aug 2017 22:50:20 -0700

Hi Apurva,

Thanks for the clarification of the definition. The definitions are clear
and helpful.


It seems the scope of this KIP is just the producer-side configuration
change, not an attempt to achieve exactly-once semantics with all default
settings out of the box. The broker still needs to be configured
appropriately to achieve exactly-once semantics. If so, the current
proposal sounds reasonable to me. Apologies if I misunderstood the goal of
this KIP.
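
To make the client/broker split concrete, here is a minimal sketch of the settings being discussed. The config keys are the standard Kafka names; the class name, the method names, the bootstrap address and the min.insync.replicas value are only illustrative assumptions, not something the KIP specifies.

```java
import java.util.Properties;

// Sketch only: class and method names are hypothetical.
public class ProducerSettingsSketch {

    // Client-side settings under discussion in this thread.
    static Properties producerProps(String bootstrapServers) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("enable.idempotence", "true");
        props.setProperty("acks", "all");
        // The cap discussed in this thread when idempotence is enabled.
        props.setProperty("max.in.flight.requests.per.connection", "5");
        return props;
    }

    // Broker/topic-side setting that the producer config cannot control.
    // The value "2" is an assumed example for a topic with replication factor 3.
    static Properties topicProps() {
        Properties props = new Properties();
        props.setProperty("min.insync.replicas", "2");
        return props;
    }

    public static void main(String[] args) {
        producerProps("localhost:9092")
            .forEach((k, v) -> System.out.println(k + "=" + v));
        topicProps()
            .forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

The producer properties alone are the necessary part; without suitable replication and min.insync.replicas on the broker side they are not sufficient.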

Regarding max.in.flight.requests.per.connection, I don't think we have to
support an infinite number of in-flight requests. But admittedly there are
use cases where people would want a reasonably high number of in-flight
requests. Given that we need to make code changes to support idempotence
with in.flight.request > 1, it would be nice to see if we can cover those
use cases now instead of doing that later. We can discuss this in a
separate thread.
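
As a rough illustration of the cap being debated, the check could look like the sketch below. This is not the actual producer or broker code; the class name, the constant and the validate method are hypothetical, and the limit of 5 is simply the number proposed in this thread.

```java
// Sketch only: names are hypothetical, not taken from the Kafka code base.
public class InFlightCapSketch {

    // Cap proposed in this thread for idempotent producers.
    static final int MAX_IN_FLIGHT_WITH_IDEMPOTENCE = 5;

    // Rejects configurations that exceed the cap while idempotence is on.
    // Without idempotence, any positive value is accepted.
    static void validate(boolean idempotenceEnabled, int maxInFlight) {
        if (maxInFlight < 1) {
            throw new IllegalArgumentException(
                "max.in.flight.requests.per.connection must be at least 1");
        }
        if (idempotenceEnabled && maxInFlight > MAX_IN_FLIGHT_WITH_IDEMPOTENCE) {
            throw new IllegalArgumentException(
                "max.in.flight.requests.per.connection must be at most "
                    + MAX_IN_FLIGHT_WITH_IDEMPOTENCE
                    + " when idempotence is enabled");
        }
    }

    public static void main(String[] args) {
        validate(true, 5);    // accepted
        validate(false, 500); // accepted: the cap applies only with idempotence
        try {
            validate(true, 500);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

A check like this on the client does not protect the broker from other client implementations, so the same rule would need to live on the broker side as well if we want it enforced.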

Thanks,

Jiangjie (Becket) Qin


On Tue, Aug 15, 2017 at 1:46 PM, Guozhang Wang <[email protected]> wrote:

> Hi Jay,
>
> I chatted with Apurva offline, and we think the key question in this
> discussion, as summarized in the updated KIP wiki, is whether we should
> consider replication a necessary condition for at-least-once, and of course
> also for exactly-once. Originally I thought replication was not a necessary
> condition for at-least-once, since the scope of failures that we should be
> covering is different in my definition; if we claim that "even for
> at-least-once, you should have replication factor larger than 2, let alone
> exactly-once", then I agree that having acks=all on the client side should
> also be a necessary condition for at-least-once, and for exactly-once as
> well. Then this KIP would just be providing the necessary but not
> sufficient conditions, from client-side configs, to achieve EOS, while you
> also need the broker-side configs to really support it.
>
> Guozhang
>
>
> On Tue, Aug 15, 2017 at 1:15 PM, Jay Kreps <[email protected]> wrote:
>
> > Hey Guozhang,
> >
> > I think the argument is that with acks=1 the message could be lost and
> > hence you aren't guaranteeing exactly once delivery.
> >
> > -Jay
> >
> > On Mon, Aug 14, 2017 at 1:36 PM, Guozhang Wang <[email protected]> wrote:
> >
> > > Just want to clarify that regarding 1), I'm fine with changing it to
> > > `all`, but I wanted to argue that it does not necessarily correlate
> > > with exactly-once semantics, but rather with the persistence vs.
> > > availability trade-off, so I'd like to discuss them separately.
> > >
> > > Regarding 2), one minor concern I had is that the enforcement is on the
> > > client side while the parts it affects are on the broker side. I.e. the
> > > broker code would assume at most 5 in-flight requests when idempotence
> > > is turned on, but this is not enforced at the broker; it relies on the
> > > client side's sanity. So other client implementations that do not obey
> > > this limit may break the broker code. If we do enforce this, we'd
> > > better enforce it at the broker side. Also, I'm wondering if we have
> > > considered the approach of having brokers read the logs to get the
> > > starting offset when it does not know about it in its snapshot, and
> > > whether that is worthwhile if we assume that such issues are very rare?
> > >
> > >
> > > Guozhang
> > >
> > >
> > >
> > > On Mon, Aug 14, 2017 at 11:01 AM, Apurva Mehta <[email protected]> wrote:
> > >
> > > > Hello,
> > > >
> > > > I just want to summarize where we are in this discussion.
> > > >
> > > > There are two major points of contention: should we have acks=1 or
> > > > acks=all by default, and how should we cap
> > > > max.in.flight.requests.per.connection?
> > > >
> > > > 1) acks=1 vs acks=all
> > > >
> > > > Here are the tradeoffs of each:
> > > >
> > > > If you have replication-factor=N, your data is resilient to N-1 disk
> > > > failures. For N>1, here is the tradeoff between acks=1 and acks=all.
> > > >
> > > > With the proposed defaults and acks=all, the stock Kafka producer and
> > > > the default broker settings would guarantee that ack'd messages would
> > > > be in the log exactly once.
> > > >
> > > > With the proposed defaults and acks=1, the stock Kafka producer and the
> > > > default broker settings would guarantee that 'retained ack'd messages
> > > > would be in the log exactly once. But all ack'd messages may not be
> > > > retained'.
> > > >
> > > > If you leave replication-factor=1, acks=1 and acks=all have identical
> > > > semantics and performance, but you are resilient to 0 disk failures.
> > > >
> > > > I think the measured cost (again the performance details are in the
> > > > wiki) of acks=all is well worth the much clearer semantics. What does
> > > > the rest of the community think?
> > > >
> > > > 2) capping max.in.flight at 5 when idempotence is enabled.
> > > >
> > > > We need to limit max.in.flight for the broker to de-duplicate messages
> > > > properly. The limitation would only apply when idempotence is enabled.
> > > > The shared numbers show that when the client-broker latency is low,
> > > > there is no performance gain for max.in.flight > 2.
> > > >
> > > > Further, it is highly debatable that max.in.flight=500 is significantly
> > > > better than max.in.flight=5 for a really high latency client-broker
> > > > link, and so far there are no hard numbers one way or another. However,
> > > > assuming that max.in.flight=500 is significantly better than
> > > > max.in.flight=5 in some niche use case, the user would have to
> > > > sacrifice idempotence for throughput. In this extreme corner case, I
> > > > think it is an acceptable tradeoff.
> > > >
> > > > What does the community think?
> > > >
> > > > Thanks,
> > > > Apurva
> > > >
> > >
> > >
> > >
> > > --
> > > -- Guozhang
> > >
> >
>
>
>
> --
> -- Guozhang
>
