Yes, that’s it. I imagine something similar to KIP-848 for managing the share group membership, and consumers that fetch records from their assigned partitions and acknowledge when delivery completes.
Thanks, Andrew > On 30 May 2023, at 16:52, Adam Warski <a...@warski.org> wrote: > > Thanks for the explanation! > > So effectively, a share group is subscribed to each partition - but the data > is not pushed to the consumer, but only sent on demand. And when demand is > signalled, a batch of messages is sent? > Hence it would be up to the consumer to prefetch a sufficient number of > batches to ensure, that it will never be "bored"? > > Adam > >> On 30 May 2023, at 15:25, Andrew Schofield <andrew_schofi...@live.com> wrote: >> >> Hi Adam, >> Thanks for your question. >> >> With a share group, each fetch is able to grab available records from any >> partition. So, it alleviates >> the “head-of-line” blocking problem where a slow consumer gets in the way. >> There’s no actual >> stealing from a slow consumer, but it can be overtaken and must complete its >> processing within >> the timeout. >> >> The way I see this working is that when a consumer joins a share group, it >> receives a set of >> assigned share-partitions. To start with, every consumer will be assigned >> all partitions. We >> can be smarter than that, but I think that’s really a question of writing a >> smarter assignor >> just as has occurred over the years with consumer groups. >> >> Only a small proportion of Kafka workloads are super high throughput. Share >> groups would >> struggle with those I’m sure. Share groups do not diminish the value of >> consumer groups >> for streaming. They just give another option for situations where a >> different style of >> consumption is more appropriate. >> >> Thanks, >> Andrew >> >>> On 29 May 2023, at 17:18, Adam Warski <a...@warski.org> wrote: >>> >>> Hello, >>> >>> thank you for the proposal! A very interesting read. >>> >>> I do have one question, though. When you subscribe to a topic using >>> consumer groups, it might happen that one consumer has processed all >>> messages from its partitions, while another one still has a lot of work to >>> do (this might be due to unbalanced partitioning, long processing times >>> etc.). In a message-queue approach, it would be great to solve this problem >>> - so that a consumer that is free can steal work from other consumers. Is >>> this somehow covered by share groups? >>> >>> Maybe this is planned as "further work", as indicated here: >>> >>> " >>> It manages the topic-partition assignments for the share-group members. An >>> initial, trivial implementation would be to give each member the list of >>> all topic-partitions which matches its subscriptions and then use the >>> pull-based protocol to fetch records from all partitions. A more >>> sophisticated implementation could use topic-partition load and lag metrics >>> to distribute partitions among the consumers as a kind of autonomous, >>> self-balancing partition assignment, steering more consumers to busier >>> partitions, for example. Alternatively, a push-based fetching scheme could >>> be used. Protocol details will follow later. >>> " >>> >>> but I’m not sure if I understand this correctly. A fully-connected graph >>> seems like a lot of connections, and I’m not sure if this would play well >>> with streaming. >>> >>> This also seems as one of the central problems - a key differentiator >>> between share and consumer groups (the other one being persisting state of >>> messages). And maybe the exact way we’d want to approach this would, to a >>> certain degree, dictate the design of the queueing system? >>> >>> Best, >>> Adam Warski >>> >>> On 2023/05/15 11:55:14 Andrew Schofield wrote: >>>> Hi, >>>> I would like to start a discussion thread on KIP-932: Queues for Kafka. >>>> This KIP proposes an alternative to consumer groups to enable cooperative >>>> consumption by consumers without partition assignment. You end up with >>>> queue semantics on top of regular Kafka topics, with per-message >>>> acknowledgement and automatic handling of messages which repeatedly fail >>>> to be processed. >>>> >>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-932%3A+Queues+for+Kafka >>>> >>>> Please take a look and let me know what you think. >>>> >>>> Thanks. >>>> Andrew >>> >> > > -- > Adam Warski > > https://www.softwaremill.com/ > https://twitter.com/adamwarski