Hi Paul, My question is, which of the above benefits you mention *cannot* be achieved with plan consumers and no groups? If I, as a developer, knew the number of partitions in advance and limiting to Kubernetes as the platform:
1, 2. I can launch as many consumers as partitions. A failed consumer will be restarted by Kubernetes. 3. Agree. 4. I can assign multiple partitions in my consumers. Best regards. On Thu, Jan 30, 2025 at 2:46 PM Brebner, Paul <paul.breb...@netapp.com.invalid> wrote: > Hi again – interesting discussion thanks. Coming from a fairly old school > background of MOM and JMS in the 00’s I guess I found Kafka consumer groups > “interesting” because: > > 1 – they enable message fan-out to multiple consumers – i.e. each group > gets a copy of each message – this isn’t “normal” for traditional > pub-sub/MOM > 2 – they enable the sharing of messages for scalability and concurrency – > you can easily scale consumer processing by increasing the number of > consumers (and partitions) > 3 – message processing order – order is only guaranteed per partition > 4 – check out the parallel consumer for higher throughput with less > consumers/partitions – there are more options for parallel processing and > ordering > > Regards, Paul > > From: Artem Livshits <alivsh...@confluent.io.INVALID> > Date: Friday, 24 January 2025 at 5:29 am > To: users@kafka.apache.org <users@kafka.apache.org> > Subject: Re: JoinGroup API response timing. > [You don't often get email from alivsh...@confluent.io.invalid. Learn why > this is important at https://aka.ms/LearnAboutSenderIdentification ] > > EXTERNAL EMAIL - USE CAUTION when clicking links or attachments > > > > > > I wanted to understand the motivation for consumer group > > The motivation for the consumer group is to let Kafka manage the > workload distribution among consumers. The application then can just run > multiple uncoordinated instances and not worry about partitions that have > no consumer assigned or consumers that are overloaded / idle -- the > consumer group coordinator would coordinate partition assignment across > consumers in the same consumer group. If the application doesn't need this > functionality it can use KafkaConsumer.assign method to manually assign > partitions to consumers and run instance coordination logic to manage > liveness / workload balance by itself. > > > As for offsets, let them be saved locally by consumer or Redis, etc. > > This can work, unless we need exactly-once semantics (make sure that if we > consume a record from a topic and produce a corresponding record to another > topic, we either atomically do both or not do both). For exactly-once > semantics, the application would need to use Kafka transactions and store > offsets in Kafka. > > -Artem > > On Wed, Jan 22, 2025 at 5:56 PM Chain Head <mrchainh...@gmail.com> wrote: > > > I wanted to understand the motivation for consumer group. > > > > *If number of partitions are known in advance*, then each consumer can > > subscribe to individual topic-partition. If such a consumer fails, let > > kubernetes reschedule it. (Elsewhere, similar restart mechanism may > exist.) > > Sharing the load of a failed consumer assumes that data across partitions > > are "same". Else, such sharing is needless burden. As for offsets, let > them > > be saved locally by consumer or Redis, etc. > > > > The concept of consumer group shifts the responsibility of partition > > assignment to broker because only broker knows the number of partitions. > > > > Best regards. > > > > On Thu, 23 Jan, 2025, 06:25 Brebner, Paul, <paul.breb...@netapp.com > > .invalid> > > wrote: > > > > > Hi – short answer is consumers can read from a specific partition, but > in > > > general for a consumer group you want to balance the partitions across > > the > > > available consumers for high throughput – if a consumer fails or is > > kicked > > > off the group because it times out etc then the remainder of the > > consumers > > > are rebalanced across the partitions etc. Paul > > > > > > From: Chain Head <mrchainh...@gmail.com> > > > Date: Wednesday, 22 January 2025 at 5:19 pm > > > To: users@kafka.apache.org <users@kafka.apache.org> > > > Subject: Re: JoinGroup API response timing. > > > EXTERNAL EMAIL - USE CAUTION when clicking links or attachments > > > > > > > > > > > > > > > Thanks for the explanation. > > > > > > Slight digression and perhaps a silly question w.r.t. consumers. > > > > > > Since multiple groups are possible, at a high level, the broker > > effectively > > > sends data of a given topic-partition to multiple consumers while > keeping > > > track of offsets. So, why not let consumers specify the partition ID > they > > > want to consume? Is the concept of consumer groups only because the > > > consumers wouldn't know in advance how many partitions exist in a > topic? > > > > > > Best regards. > > > > > > On Wed, Jan 22, 2025 at 7:41 AM Greg Harris > <greg.har...@aiven.io.invalid > > > > > > wrote: > > > > > > > Hi, > > > > > > > > Thanks for the follow up. > > > > > > > > By "classic" I meant the protocol implemented by the > > SyncGroup/JoinGroup > > > > API [1]. It's a general group protocol that is still fully supported > in > > > > Kraft, and at this time has no intention of being deprecated. > > > > It's called "classic" to distinguish it from the newer KIP-848 [2] > > > > ConsumerGroupHeartbeat API and the share group protocol used in > KIP-932 > > > > [3]. It is my understanding that these other protocols are _not_ a > > > > synchronizing barrier in the same way the "classic" protocol is. > > > > > > > > Hope this helps, > > > > Greg > > > > > > > > [1] > > > > > > > > > > > > > > https://urldefense.com/v3/__https://github.com/apache/kafka/blob/adb033211497e539725366960e1013a4638de59f/group-coordinator/src/main/java/org/apache/kafka/coordinator/group/classic/ClassicGroup.java__;!!Nhn8V6BzJA!RUMrj1UUerrRHrSLYNqS64hcjRVWOfQEBJYb0SYerdVNtC92K2MjTB8p00y7-OkYac1rhR0rYeBbneBwtCWqMDDfrw$ > < > https://urldefense.com/v3/__https:/github.com/apache/kafka/blob/adb033211497e539725366960e1013a4638de59f/group-coordinator/src/main/java/org/apache/kafka/coordinator/group/classic/ClassicGroup.java__;!!Nhn8V6BzJA!RUMrj1UUerrRHrSLYNqS64hcjRVWOfQEBJYb0SYerdVNtC92K2MjTB8p00y7-OkYac1rhR0rYeBbneBwtCWqMDDfrw$ > > > > > < > > > > > > https://urldefense.com/v3/__https:/github.com/apache/kafka/blob/adb033211497e539725366960e1013a4638de59f/group-coordinator/src/main/java/org/apache/kafka/coordinator/group/classic/ClassicGroup.java__;!!Nhn8V6BzJA!RUMrj1UUerrRHrSLYNqS64hcjRVWOfQEBJYb0SYerdVNtC92K2MjTB8p00y7-OkYac1rhR0rYeBbneBwtCWqMDDfrw$ > > > > > > > > [2] > > > > > > > > > > > > > > https://urldefense.com/v3/__https://cwiki.apache.org/confluence/display/KAFKA/KIP-848*3A*The*Next*Generation*of*the*Consumer*Rebalance*Protocol__;JSsrKysrKysr!!Nhn8V6BzJA!RUMrj1UUerrRHrSLYNqS64hcjRVWOfQEBJYb0SYerdVNtC92K2MjTB8p00y7-OkYac1rhR0rYeBbneBwtCVcPWDEEg$ > < > https://urldefense.com/v3/__https:/cwiki.apache.org/confluence/display/KAFKA/KIP-848*3A*The*Next*Generation*of*the*Consumer*Rebalance*Protocol__;JSsrKysrKysr!!Nhn8V6BzJA!RUMrj1UUerrRHrSLYNqS64hcjRVWOfQEBJYb0SYerdVNtC92K2MjTB8p00y7-OkYac1rhR0rYeBbneBwtCVcPWDEEg$ > > > > > < > > > > > > https://urldefense.com/v3/__https:/cwiki.apache.org/confluence/display/KAFKA/KIP-848*3A*The*Next*Generation*of*the*Consumer*Rebalance*Protocol__;JSsrKysrKysr!!Nhn8V6BzJA!RUMrj1UUerrRHrSLYNqS64hcjRVWOfQEBJYb0SYerdVNtC92K2MjTB8p00y7-OkYac1rhR0rYeBbneBwtCVcPWDEEg$ > > > > > > > > [3] > > > > > > > > > > > > > > https://urldefense.com/v3/__https://cwiki.apache.org/confluence/display/KAFKA/KIP-932*3A*Queues*for*Kafka__;JSsrKw!!Nhn8V6BzJA!RUMrj1UUerrRHrSLYNqS64hcjRVWOfQEBJYb0SYerdVNtC92K2MjTB8p00y7-OkYac1rhR0rYeBbneBwtCWxDptKnA$ > < > https://urldefense.com/v3/__https:/cwiki.apache.org/confluence/display/KAFKA/KIP-932*3A*Queues*for*Kafka__;JSsrKw!!Nhn8V6BzJA!RUMrj1UUerrRHrSLYNqS64hcjRVWOfQEBJYb0SYerdVNtC92K2MjTB8p00y7-OkYac1rhR0rYeBbneBwtCWxDptKnA$ > > > > > < > > > > > > https://urldefense.com/v3/__https:/cwiki.apache.org/confluence/display/KAFKA/KIP-932*3A*Queues*for*Kafka__;JSsrKw!!Nhn8V6BzJA!RUMrj1UUerrRHrSLYNqS64hcjRVWOfQEBJYb0SYerdVNtC92K2MjTB8p00y7-OkYac1rhR0rYeBbneBwtCWxDptKnA$ > > > > > > > > > > > > On Tue, Jan 21, 2025 at 4:41 PM Chain Head <mrchainh...@gmail.com> > > > wrote: > > > > > > > > > Thanks. > > > > > > > > > > By "classic" you mean pre-KRaft consensus? What is the "current" > > Kafka > > > > > Group protocol? > > > > > > > > > > Best regards. > > > > > > > > > > On Tue, Jan 21, 2025 at 9:55 PM Greg Harris > > > <greg.har...@aiven.io.invalid > > > > > > > > > > wrote: > > > > > > > > > > > Hi, > > > > > > > > > > > > Yes you are correct. The "classic" Kafka Group Protocol is a > > > > > synchronizing > > > > > > barrier for all members. > > > > > > All JoinGroup member responses are returned after all JoinGroup > > > member > > > > > > requests are received. > > > > > > > > > > > > Thanks, > > > > > > Greg > > > > > > > > > > > > On Tue, Jan 21, 2025 at 7:10 AM Chain Head < > mrchainh...@gmail.com> > > > > > wrote: > > > > > > > > > > > > > Assume that three consumers of a certain group want to connect > > to a > > > > > > broker > > > > > > > for a topic with 3 partitions. After the FindCoordinator API is > > > done, > > > > > the > > > > > > > consumers send JoinGroup. Since the broker cannot know in > advance > > > how > > > > > > many > > > > > > > consumers are expected to join, it waits > > > > > > group.initial.rebalance.delay.ms > > > > > > > before starting a rebalance. > > > > > > > > > > > > > > Therefore, does this mean the JoinGroup API response of each > > > request > > > > is > > > > > > > "held" until the waiting period is over? > > > > > > > > > > > > > > Best regards. > > > > > > > > > > > > > > > > > > > > > > > > > > > >