Hi again – interesting discussion thanks.  Coming from a fairly old school 
background of MOM and JMS in the 00’s I guess I found Kafka consumer groups 
“interesting” because:

1 – they enable message fan-out to multiple consumers – i.e. each group gets a 
copy of each message – this isn’t “normal” for traditional pub-sub/MOM
2 – they enable the sharing of messages for scalability and concurrency – you 
can easily scale consumer processing by increasing the number of consumers (and 
partitions)
3 – message processing order – order is only guaranteed per partition
4 – check out the parallel consumer for higher throughput with less 
consumers/partitions – there are more options for parallel processing and 
ordering

Regards, Paul

From: Artem Livshits <alivsh...@confluent.io.INVALID>
Date: Friday, 24 January 2025 at 5:29 am
To: users@kafka.apache.org <users@kafka.apache.org>
Subject: Re: JoinGroup API response timing.
[You don't often get email from alivsh...@confluent.io.invalid. Learn why this 
is important at https://aka.ms/LearnAboutSenderIdentification ]

EXTERNAL EMAIL - USE CAUTION when clicking links or attachments




>  I wanted to understand the motivation for consumer group

The motivation for the consumer group is to let Kafka manage the
workload distribution among consumers.  The application then can just run
multiple uncoordinated instances and not worry about partitions that have
no consumer assigned or consumers that are overloaded / idle -- the
consumer group coordinator would coordinate partition assignment across
consumers in the same consumer group.  If the application doesn't need this
functionality it can use KafkaConsumer.assign method to manually assign
partitions to consumers and run instance coordination logic to manage
liveness / workload balance by itself.

> As for offsets, let them be saved locally by consumer or Redis, etc.

This can work, unless we need exactly-once semantics (make sure that if we
consume a record from a topic and produce a corresponding record to another
topic, we either atomically do both or not do both).  For exactly-once
semantics, the application would need to use Kafka transactions and store
offsets in Kafka.

-Artem

On Wed, Jan 22, 2025 at 5:56 PM Chain Head <mrchainh...@gmail.com> wrote:

> I wanted to understand the motivation for consumer group.
>
> *If number of partitions are known in advance*, then each consumer can
> subscribe to individual topic-partition. If such a consumer fails, let
> kubernetes reschedule it. (Elsewhere, similar restart mechanism may exist.)
> Sharing the load of a failed consumer assumes that data across partitions
> are "same". Else, such sharing is needless burden. As for offsets, let them
> be saved locally by consumer or Redis, etc.
>
> The concept of consumer group shifts the responsibility of partition
> assignment to broker because only broker knows the number of partitions.
>
> Best regards.
>
> On Thu, 23 Jan, 2025, 06:25 Brebner, Paul, <paul.breb...@netapp.com
> .invalid>
> wrote:
>
> > Hi – short answer is consumers can read from a specific partition, but in
> > general for a consumer group you want to balance the partitions across
> the
> > available consumers for high throughput – if a consumer fails or is
> kicked
> > off the group because it times out etc then the remainder of the
> consumers
> > are rebalanced across the partitions etc.   Paul
> >
> > From: Chain Head <mrchainh...@gmail.com>
> > Date: Wednesday, 22 January 2025 at 5:19 pm
> > To: users@kafka.apache.org <users@kafka.apache.org>
> > Subject: Re: JoinGroup API response timing.
> > EXTERNAL EMAIL - USE CAUTION when clicking links or attachments
> >
> >
> >
> >
> > Thanks for the explanation.
> >
> > Slight digression and perhaps a silly question w.r.t. consumers.
> >
> > Since multiple groups are possible, at a high level, the broker
> effectively
> > sends data of a given topic-partition to multiple consumers while keeping
> > track of offsets. So, why not let consumers specify the partition ID they
> > want to consume? Is the concept of consumer groups only because the
> > consumers wouldn't know in advance how many partitions exist in a topic?
> >
> > Best regards.
> >
> > On Wed, Jan 22, 2025 at 7:41 AM Greg Harris <greg.har...@aiven.io.invalid
> >
> > wrote:
> >
> > > Hi,
> > >
> > > Thanks for the follow up.
> > >
> > > By "classic" I meant the protocol implemented by the
> SyncGroup/JoinGroup
> > > API [1]. It's a general group protocol that is still fully supported in
> > > Kraft, and at this time has no intention of being deprecated.
> > > It's called "classic" to distinguish it from the newer KIP-848 [2]
> > > ConsumerGroupHeartbeat API and the share group protocol used in KIP-932
> > > [3]. It is my understanding that these other protocols are _not_ a
> > > synchronizing barrier in the same way the "classic" protocol is.
> > >
> > > Hope this helps,
> > > Greg
> > >
> > > [1]
> > >
> > >
> >
> https://urldefense.com/v3/__https://github.com/apache/kafka/blob/adb033211497e539725366960e1013a4638de59f/group-coordinator/src/main/java/org/apache/kafka/coordinator/group/classic/ClassicGroup.java__;!!Nhn8V6BzJA!RUMrj1UUerrRHrSLYNqS64hcjRVWOfQEBJYb0SYerdVNtC92K2MjTB8p00y7-OkYac1rhR0rYeBbneBwtCWqMDDfrw$<https://urldefense.com/v3/__https:/github.com/apache/kafka/blob/adb033211497e539725366960e1013a4638de59f/group-coordinator/src/main/java/org/apache/kafka/coordinator/group/classic/ClassicGroup.java__;!!Nhn8V6BzJA!RUMrj1UUerrRHrSLYNqS64hcjRVWOfQEBJYb0SYerdVNtC92K2MjTB8p00y7-OkYac1rhR0rYeBbneBwtCWqMDDfrw$>
> > <
> >
> https://urldefense.com/v3/__https:/github.com/apache/kafka/blob/adb033211497e539725366960e1013a4638de59f/group-coordinator/src/main/java/org/apache/kafka/coordinator/group/classic/ClassicGroup.java__;!!Nhn8V6BzJA!RUMrj1UUerrRHrSLYNqS64hcjRVWOfQEBJYb0SYerdVNtC92K2MjTB8p00y7-OkYac1rhR0rYeBbneBwtCWqMDDfrw$
> > >
> > > [2]
> > >
> > >
> >
> https://urldefense.com/v3/__https://cwiki.apache.org/confluence/display/KAFKA/KIP-848*3A*The*Next*Generation*of*the*Consumer*Rebalance*Protocol__;JSsrKysrKysr!!Nhn8V6BzJA!RUMrj1UUerrRHrSLYNqS64hcjRVWOfQEBJYb0SYerdVNtC92K2MjTB8p00y7-OkYac1rhR0rYeBbneBwtCVcPWDEEg$<https://urldefense.com/v3/__https:/cwiki.apache.org/confluence/display/KAFKA/KIP-848*3A*The*Next*Generation*of*the*Consumer*Rebalance*Protocol__;JSsrKysrKysr!!Nhn8V6BzJA!RUMrj1UUerrRHrSLYNqS64hcjRVWOfQEBJYb0SYerdVNtC92K2MjTB8p00y7-OkYac1rhR0rYeBbneBwtCVcPWDEEg$>
> > <
> >
> https://urldefense.com/v3/__https:/cwiki.apache.org/confluence/display/KAFKA/KIP-848*3A*The*Next*Generation*of*the*Consumer*Rebalance*Protocol__;JSsrKysrKysr!!Nhn8V6BzJA!RUMrj1UUerrRHrSLYNqS64hcjRVWOfQEBJYb0SYerdVNtC92K2MjTB8p00y7-OkYac1rhR0rYeBbneBwtCVcPWDEEg$
> > >
> > > [3]
> > >
> > >
> >
> https://urldefense.com/v3/__https://cwiki.apache.org/confluence/display/KAFKA/KIP-932*3A*Queues*for*Kafka__;JSsrKw!!Nhn8V6BzJA!RUMrj1UUerrRHrSLYNqS64hcjRVWOfQEBJYb0SYerdVNtC92K2MjTB8p00y7-OkYac1rhR0rYeBbneBwtCWxDptKnA$<https://urldefense.com/v3/__https:/cwiki.apache.org/confluence/display/KAFKA/KIP-932*3A*Queues*for*Kafka__;JSsrKw!!Nhn8V6BzJA!RUMrj1UUerrRHrSLYNqS64hcjRVWOfQEBJYb0SYerdVNtC92K2MjTB8p00y7-OkYac1rhR0rYeBbneBwtCWxDptKnA$>
> > <
> >
> https://urldefense.com/v3/__https:/cwiki.apache.org/confluence/display/KAFKA/KIP-932*3A*Queues*for*Kafka__;JSsrKw!!Nhn8V6BzJA!RUMrj1UUerrRHrSLYNqS64hcjRVWOfQEBJYb0SYerdVNtC92K2MjTB8p00y7-OkYac1rhR0rYeBbneBwtCWxDptKnA$
> > >
> > >
> > > On Tue, Jan 21, 2025 at 4:41 PM Chain Head <mrchainh...@gmail.com>
> > wrote:
> > >
> > > > Thanks.
> > > >
> > > > By "classic" you mean pre-KRaft consensus? What is the "current"
> Kafka
> > > > Group protocol?
> > > >
> > > > Best regards.
> > > >
> > > > On Tue, Jan 21, 2025 at 9:55 PM Greg Harris
> > <greg.har...@aiven.io.invalid
> > > >
> > > > wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > Yes you are correct. The "classic" Kafka Group Protocol is a
> > > > synchronizing
> > > > > barrier for all members.
> > > > > All JoinGroup member responses are returned after all JoinGroup
> > member
> > > > > requests are received.
> > > > >
> > > > > Thanks,
> > > > > Greg
> > > > >
> > > > > On Tue, Jan 21, 2025 at 7:10 AM Chain Head <mrchainh...@gmail.com>
> > > > wrote:
> > > > >
> > > > > > Assume that three consumers of a certain group want to connect
> to a
> > > > > broker
> > > > > > for a topic with 3 partitions. After the FindCoordinator API is
> > done,
> > > > the
> > > > > > consumers send JoinGroup. Since the broker cannot know in advance
> > how
> > > > > many
> > > > > > consumers are expected to join, it waits
> > > > > group.initial.rebalance.delay.ms
> > > > > > before starting a rebalance.
> > > > > >
> > > > > > Therefore, does this mean the JoinGroup API response of each
> > request
> > > is
> > > > > > "held" until the waiting period is over?
> > > > > >
> > > > > > Best regards.
> > > > > >
> > > > >
> > > >
> > >
> >
>

Reply via email to