Re: [DISCUSS] KIP-74: Add Fetch Response Size Limit in Bytes

Andrey L. Neporada Fri, 12 Aug 2016 09:18:07 -0700

Hi!

> On 12 Aug 2016, at 18:32, Jun Rao <j...@confluent.io> wrote:
> 
> Hi, Andrey,
> 
> Why shouldn't the client library do reordering? It seems that if
> ReplicaFetcher thread does round-robin, the consumer client should do that
> too?
>

IMHO the client library is not a good place to implement such logic.
For example, if we want to put round-robin logic in client lib, it will become 
stageful (since it should remember first empty partition received from server). 
It should also remember some kind of context to distinguish one end-user (or 
partition set) from another, etc. 

Personally, I'd expect from client library to submit requests as is.
Anyway, I think it is difficult to come up with reordering logic that will be 
good for all clients.

So, I propose:

1) preserve per-partition limit in fetch request (as discussed)
2) make default limit of fetch response to be 0 (which means no limit at all). 
So, by default, new fetch request should behave exactly like old one
3) implement round-robin logic in ReplicaFetcherThread and use non-zero default 
fetch response limit there.
4) document that all end-users who want to use FetchRequest with actual 
response limit to implement some kind of round-robin to ensure fairness (if 
they care about it)

> Thanks,
> 
> Jun
> 

Thanks,
Andrey.

> On Fri, Aug 12, 2016 at 3:56 AM, Andrey L. Neporada <
> anepor...@yandex-team.ru> wrote:
> 
>> Hi!
>> 
>>> On 12 Aug 2016, at 04:29, Jun Rao <j...@confluent.io> wrote:
>>> 
>>> Hi, Andrey,
>>> 
>>> One potential benefit of keeping the per partition limit is for Kafka
>>> stream. When reading messages from different partitions, KStream prefers
>> to
>>> read from partitions with smaller timestamps first and only advances the
>>> KStream timestamp when it sees at least one message from every partition.
>>> Being able to fill up multiple partitions in a single fetch response can
>>> help KStream advance the timestamp quicker when there is backlog from the
>>> input. So, it's probably better if we just add a new response limit while
>>> keeping the per partition limit.
>>> 
>> 
>> Yes, this makes sense for me.
>> 
>>> Also, for fairness across partitions, you mentioned "The solution is to
>>> start fetching from first empty partition in round-robin fashion or to
>>> perform random shuffle of partitions.".  It seems the former is more
>>> deterministic. Did you use that in your implementation and should we
>>> recommend that for non-java clients as well?
>>> 
>> 
>> In my initial implementation I just did random shuffling on server side.
>> Now I plan to use deterministic approach - do round-robin in
>> ReplicaFetcher thread - I believe the client library itself shouldn’t do
>> any form of reordering.
>> 
>> But we should definitely recommend some form of shuffling if client wants
>> to limit response size.
>> Not sure which shuffling method is better.
>> 
>> 
>>> Thanks,
>>> 
>>> Jun
>>> 
>> 
>> Thanks,
>> Andrey.
>> 
>>

Re: [DISCUSS] KIP-74: Add Fetch Response Size Limit in Bytes

Reply via email to