Hi, Andrey,

One potential benefit of keeping the per-partition limit is for Kafka
Streams. When reading messages from different partitions, KStream prefers to
read from partitions with smaller timestamps first and only advances the
KStream timestamp when it sees at least one message from every partition.
Being able to fill up multiple partitions in a single fetch response can
help KStream advance the timestamp more quickly when there is a backlog in
the input. So, it's probably better if we just add a new response limit
while keeping the per-partition limit.
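
To make the point above concrete, here is a minimal sketch of that selection
rule. It is not Kafka Streams code: the function name next_record and the
buffers layout are hypothetical, and the rule is simplified to "no progress
until every partition has at least one buffered record".

```python
from collections import deque


def next_record(buffers):
    """Pick the next record to process, or None if we have to wait.

    buffers (hypothetical layout) maps a partition name to a deque of
    (timestamp, value) records already fetched for that partition.
    """
    # If any partition has nothing buffered, the timestamp cannot safely
    # advance, so we wait for the next fetch response.
    if not buffers or any(len(queue) == 0 for queue in buffers.values()):
        return None
    # Otherwise read from the partition whose head record is oldest.
    partition = min(buffers, key=lambda p: buffers[p][0][0])
    return partition, buffers[partition].popleft()


buffers = {
    "p0": deque([(5, "a"), (9, "b")]),
    "p1": deque([(7, "c")]),
}
first = next_record(buffers)   # p0 holds the smallest head timestamp
second = next_record(buffers)  # now p1 (7) is older than p0's head (9)
third = next_record(buffers)   # p1 is empty, so processing stalls
```

In this sketch a fetch response that fills only p0 leaves the consumer
stalled, which is why filling several partitions per response helps.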

Also, for fairness across partitions, you mentioned "The solution is to
start fetching from first empty partition in round-robin fashion or to
perform random shuffle of partitions." It seems the former is more
deterministic. Did you use that in your implementation, and should we
recommend it for non-Java clients as well?
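
For reference, this is how I read the round-robin option, as a rough sketch
only. The helper name rotate_from_first_empty is made up, and I am assuming
"first empty partition" means the first partition that got no data in the
previous response; the PR may well do it differently.

```python
def rotate_from_first_empty(partitions, filled):
    """Reorder partitions for the next fetch request.

    partitions: partition names in the order used by the previous request.
    filled: set of partitions that received data in the previous response.
    The next request starts at the first partition that came back empty,
    so partitions late in the list are not starved by the response limit.
    """
    partitions = list(partitions)
    for index, partition in enumerate(partitions):
        if partition not in filled:
            return partitions[index:] + partitions[:index]
    # Every partition received data, so the order can stay as it is.
    return partitions


order = rotate_from_first_empty(["p0", "p1", "p2", "p3"], {"p0", "p1"})
```

Unlike a random shuffle, the resulting order depends only on the previous
response, which is what makes it reproducible.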

Thanks,

Jun

On Wed, Aug 10, 2016 at 10:55 AM, Andrey L. Neporada <
anepor...@yandex-team.ru> wrote:

> Hi, Jun!
>
> Thanks for the feedback!
>
> > On 10 Aug 2016, at 17:42, Jun Rao <j...@confluent.io> wrote:
> >
> > Hi, Andrey,
> >
> > Thanks for the reply. A couple more comments inline below.
> >
> > On Wed, Aug 10, 2016 at 3:56 AM, Andrey L. Neporada <
> > anepor...@yandex-team.ru> wrote:
> >
> >>
> >> Yes, such cooperative configuration for fetch request may look a bit weird.
> >> But I don’t see other options if we want to remove partition limits from
> >> fetch request.
> >> In this case we need some server-side configuration for partition limits.
> >>
> >>
> > What if we keep the current partition level limit in the fetch request and
> > just add an additional response level limit? The default partition limit
> > can be much smaller than the max message size and will only be used for
> > fairness across partitions.
> >
>
> Yes, we can just add a global response limit and leave partition limits as is.
> In fact, my initial implementation
> (https://github.com/apache/kafka/pull/1683) of this KIP preserves
> per-partition limits.
> However, judging from the KAFKA-2063 discussion, some people prefer to
> deprecate the partition level limit.
> I have no real opinion on this topic - hope we can choose the best option here.
>
> ...
> >>
> >> No, I mean that the actual response size can be bigger than limit_bytes, but
> >> less than limit_bytes + message.max.bytes.
> >> This behaviour is a result of the algorithm proposed in the KIP (and in the PR).
> >>
> >>
> > Got it. An alternative is to only add a partition's data to the response up
> > to the remaining response limit. The only exception is when this is the
> > first partition and the first message in that partition is larger than the
> > response limit. Then the bound will be max(limit_bytes, message.max.bytes),
> > which is tighter.
> >
>
> Yes, this one looks better.
>
>
> Thanks,
> Andrey.
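
For reference, the alternative agreed on in the quoted exchange above could
look roughly like the sketch below. This is not the code from the PR: the
function name fill_response and the use of plain message sizes are
assumptions made for illustration, and the per-partition limit is left out
to keep it short.

```python
def fill_response(partitions, limit_bytes):
    """Select messages for one fetch response under a response-level limit.

    partitions: list of (name, [message sizes in bytes]) in fetch order.
    Returns a dict mapping partition name to the sizes included.
    """
    response = {}
    remaining = limit_bytes
    for index, (name, sizes) in enumerate(partitions):
        taken = []
        for size in sizes:
            if size <= remaining:
                taken.append(size)
                remaining -= size
                continue
            # The one exception: the very first message of the first
            # partition is larger than the whole limit. Return it anyway
            # so the consumer can make progress, then stop adding data.
            if index == 0 and not taken:
                taken.append(size)
                remaining = 0
            break
        if taken:
            response[name] = taken
    return response


normal = fill_response([("p0", [40, 40, 40]), ("p1", [30, 30])], 100)
oversized = fill_response([("p0", [150, 10]), ("p1", [10])], 100)
```

Here normal stays within the 100 byte limit, while oversized contains only
the single 150 byte message, so the response size is bounded by
max(limit_bytes, message.max.bytes) as long as no message exceeds
message.max.bytes.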
