Re: [DISCUSS] KIP-932: Queues for Kafka

Andrew Schofield Tue, 30 May 2023 06:25:42 -0700

Hi Adam,
Thanks for your question.

With a share group, each fetch is able to grab available records from any
partition. So, it alleviates the “head-of-line” blocking problem where a slow
consumer gets in the way. There’s no actual stealing from a slow consumer, but
it can be overtaken and must complete its processing within the timeout.
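
As a rough illustration of that behaviour, here is a toy in-memory model in
Python. It is only a sketch: SharePartition, fetch, ack and lock_timeout are
hypothetical names, not part of the KIP or of any Kafka API, and it assumes
that a record whose timeout expires simply becomes available to other
consumers again.

```python
from dataclasses import dataclass, field


@dataclass
class SharePartition:
    """Toy model of one partition read through a share group (not Kafka's API)."""
    records: list
    lock_timeout: float
    # offset -> (consumer, deadline) for records currently handed out
    acquired: dict = field(default_factory=dict)
    acked: set = field(default_factory=set)

    def fetch(self, consumer, now, max_records=1):
        # Assumption: records whose timeout has passed become available again.
        for off, (_, deadline) in list(self.acquired.items()):
            if now >= deadline:
                del self.acquired[off]
        batch = []
        for off in range(len(self.records)):
            if len(batch) == max_records:
                break
            if off in self.acked or off in self.acquired:
                continue  # skip past records held by another consumer
            self.acquired[off] = (consumer, now + self.lock_timeout)
            batch.append(off)
        return batch

    def ack(self, consumer, off, now):
        owner = self.acquired.get(off)
        if owner is None or owner[0] != consumer or now >= owner[1]:
            return False  # too late: the record is no longer held by this consumer
        del self.acquired[off]
        self.acked.add(off)
        return True


p = SharePartition(records=["a", "b", "c"], lock_timeout=10.0)
slow = p.fetch("slow", now=0.0)         # slow consumer takes offset 0
fast1 = p.fetch("fast", now=1.0)        # fast consumer overtakes it
ok1 = p.ack("fast", fast1[0], now=2.0)
fast2 = p.fetch("fast", now=3.0)
ok2 = p.ack("fast", fast2[0], now=4.0)
retry = p.fetch("fast", now=11.0)       # offset 0 timed out and is handed out again
late = p.ack("slow", 0, now=12.0)       # the slow consumer's ack is rejected
```

The slow consumer never blocks the others from offsets 1 and 2, and nothing is
taken from it until its own timeout has expired.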


The way I see this working is that when a consumer joins a share group, it
receives a set of assigned share-partitions. To start with, every consumer will
be assigned all partitions. We can be smarter than that, but I think that’s
really a question of writing a smarter assignor, just as has occurred over the
years with consumer groups.
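
The "assign everything to everyone" starting point could be sketched roughly
like this in Python. The function name assign_all and the shape of its inputs
are assumptions made for illustration only; this is not the KIP's assignor
interface.

```python
def assign_all(subscriptions, partitions_by_topic):
    """Give each member every partition of every topic it subscribes to.

    subscriptions: member id -> set of topic names (hypothetical shape)
    partitions_by_topic: topic name -> number of partitions (hypothetical shape)
    """
    return {
        member: sorted(
            (topic, partition)
            for topic in topics
            for partition in range(partitions_by_topic.get(topic, 0))
        )
        for member, topics in subscriptions.items()
    }


assignment = assign_all(
    {"c1": {"orders"}, "c2": {"orders", "payments"}},
    {"orders": 2, "payments": 1},
)
```

A smarter assignor would replace the body of this function, for example by
using load or lag to steer more members towards busier partitions.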

Only a small proportion of Kafka workloads are super high throughput. Share
groups would struggle with those, I’m sure. Share groups do not diminish the
value of consumer groups for streaming. They just give another option for
situations where a different style of consumption is more appropriate.

Thanks,
Andrew

> On 29 May 2023, at 17:18, Adam Warski <a...@warski.org> wrote:
>
> Hello,
>
> thank you for the proposal! A very interesting read.
>
> I do have one question, though. When you subscribe to a topic using consumer 
> groups, it might happen that one consumer has processed all messages from its 
> partitions, while another one still has a lot of work to do (this might be 
> due to unbalanced partitioning, long processing times etc.). In a 
> message-queue approach, it would be great to solve this problem - so that a 
> consumer that is free can steal work from other consumers. Is this somehow 
> covered by share groups?
>
> Maybe this is planned as "further work", as indicated here:
>
> "
> It manages the topic-partition assignments for the share-group members. An 
> initial, trivial implementation would be to give each member the list of all 
> topic-partitions which matches its subscriptions and then use the pull-based 
> protocol to fetch records from all partitions. A more sophisticated 
> implementation could use topic-partition load and lag metrics to distribute 
> partitions among the consumers as a kind of autonomous, self-balancing 
> partition assignment, steering more consumers to busier partitions, for 
> example. Alternatively, a push-based fetching scheme could be used. Protocol 
> details will follow later.
> "
>
> but I’m not sure if I understand this correctly. A fully-connected graph 
> seems like a lot of connections, and I’m not sure if this would play well 
> with streaming.
>
> This also seems as one of the central problems - a key differentiator between 
> share and consumer groups (the other one being persisting state of messages). 
> And maybe the exact way we’d want to approach this would, to a certain 
> degree, dictate the design of the queueing system?
>
> Best,
> Adam Warski
>
> On 2023/05/15 11:55:14 Andrew Schofield wrote:
>> Hi,
>> I would like to start a discussion thread on KIP-932: Queues for Kafka. This 
>> KIP proposes an alternative to consumer groups to enable cooperative 
>> consumption by consumers without partition assignment. You end up with queue 
>> semantics on top of regular Kafka topics, with per-message acknowledgement 
>> and automatic handling of messages which repeatedly fail to be processed.
>>
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-932%3A+Queues+for+Kafka
>>
>> Please take a look and let me know what you think.
>>
>> Thanks.
>> Andrew
>
