Re: [DISCUSS] KIP-932: Queues for Kafka

Andrew Schofield Tue, 30 May 2023 09:15:43 -0700

Yes, that’s it. I imagine something similar to KIP-848 for managing the share 
group
membership, and consumers that fetch records from their assigned partitions and
acknowledge when delivery completes.


Thanks,
Andrew

> On 30 May 2023, at 16:52, Adam Warski <a...@warski.org> wrote:
>
> Thanks for the explanation!
>
> So effectively, a share group is subscribed to each partition - but the data 
> is not pushed to the consumer, but only sent on demand. And when demand is 
> signalled, a batch of messages is sent?
> Hence it would be up to the consumer to prefetch a sufficient number of 
> batches to ensure, that it will never be "bored"?
>
> Adam
>
>> On 30 May 2023, at 15:25, Andrew Schofield <andrew_schofi...@live.com> wrote:
>>
>> Hi Adam,
>> Thanks for your question.
>>
>> With a share group, each fetch is able to grab available records from any 
>> partition. So, it alleviates
>> the “head-of-line” blocking problem where a slow consumer gets in the way. 
>> There’s no actual
>> stealing from a slow consumer, but it can be overtaken and must complete its 
>> processing within
>> the timeout.
>>
>> The way I see this working is that when a consumer joins a share group, it 
>> receives a set of
>> assigned share-partitions. To start with, every consumer will be assigned 
>> all partitions. We
>> can be smarter than that, but I think that’s really a question of writing a 
>> smarter assignor
>> just as has occurred over the years with consumer groups.
>>
>> Only a small proportion of Kafka workloads are super high throughput. Share 
>> groups would
>> struggle with those I’m sure. Share groups do not diminish the value of 
>> consumer groups
>> for streaming. They just give another option for situations where a 
>> different style of
>> consumption is more appropriate.
>>
>> Thanks,
>> Andrew
>>
>>> On 29 May 2023, at 17:18, Adam Warski <a...@warski.org> wrote:
>>>
>>> Hello,
>>>
>>> thank you for the proposal! A very interesting read.
>>>
>>> I do have one question, though. When you subscribe to a topic using 
>>> consumer groups, it might happen that one consumer has processed all 
>>> messages from its partitions, while another one still has a lot of work to 
>>> do (this might be due to unbalanced partitioning, long processing times 
>>> etc.). In a message-queue approach, it would be great to solve this problem 
>>> - so that a consumer that is free can steal work from other consumers. Is 
>>> this somehow covered by share groups?
>>>
>>> Maybe this is planned as "further work", as indicated here:
>>>
>>> "
>>> It manages the topic-partition assignments for the share-group members. An 
>>> initial, trivial implementation would be to give each member the list of 
>>> all topic-partitions which matches its subscriptions and then use the 
>>> pull-based protocol to fetch records from all partitions. A more 
>>> sophisticated implementation could use topic-partition load and lag metrics 
>>> to distribute partitions among the consumers as a kind of autonomous, 
>>> self-balancing partition assignment, steering more consumers to busier 
>>> partitions, for example. Alternatively, a push-based fetching scheme could 
>>> be used. Protocol details will follow later.
>>> "
>>>
>>> but I’m not sure if I understand this correctly. A fully-connected graph 
>>> seems like a lot of connections, and I’m not sure if this would play well 
>>> with streaming.
>>>
>>> This also seems as one of the central problems - a key differentiator 
>>> between share and consumer groups (the other one being persisting state of 
>>> messages). And maybe the exact way we’d want to approach this would, to a 
>>> certain degree, dictate the design of the queueing system?
>>>
>>> Best,
>>> Adam Warski
>>>
>>> On 2023/05/15 11:55:14 Andrew Schofield wrote:
>>>> Hi,
>>>> I would like to start a discussion thread on KIP-932: Queues for Kafka. 
>>>> This KIP proposes an alternative to consumer groups to enable cooperative 
>>>> consumption by consumers without partition assignment. You end up with 
>>>> queue semantics on top of regular Kafka topics, with per-message 
>>>> acknowledgement and automatic handling of messages which repeatedly fail 
>>>> to be processed.
>>>>
>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-932%3A+Queues+for+Kafka
>>>>
>>>> Please take a look and let me know what you think.
>>>>
>>>> Thanks.
>>>> Andrew
>>>
>>
>
> --
> Adam Warski
>
> https://www.softwaremill.com/
> https://twitter.com/adamwarski

Re: [DISCUSS] KIP-932: Queues for Kafka

Reply via email to