Re: JoinGroup API response timing.

Chain Head Thu, 30 Jan 2025 22:26:02 -0800

Hi Paul,
My question is, which of the above benefits you mention *cannot* be
achieved with plan consumers and no groups? If I, as a developer, knew the
number of partitions in advance and limiting to Kubernetes as the platform:


1, 2. I can launch as many consumers as partitions. A failed consumer will
be restarted by Kubernetes.
3. Agree.
4. I can assign multiple partitions in my consumers.

Best regards.

On Thu, Jan 30, 2025 at 2:46 PM Brebner, Paul
<paul.breb...@netapp.com.invalid> wrote:

> Hi again – interesting discussion thanks.  Coming from a fairly old school
> background of MOM and JMS in the 00’s I guess I found Kafka consumer groups
> “interesting” because:
>
> 1 – they enable message fan-out to multiple consumers – i.e. each group
> gets a copy of each message – this isn’t “normal” for traditional
> pub-sub/MOM
> 2 – they enable the sharing of messages for scalability and concurrency –
> you can easily scale consumer processing by increasing the number of
> consumers (and partitions)
> 3 – message processing order – order is only guaranteed per partition
> 4 – check out the parallel consumer for higher throughput with less
> consumers/partitions – there are more options for parallel processing and
> ordering
>
> Regards, Paul
>
> From: Artem Livshits <alivsh...@confluent.io.INVALID>
> Date: Friday, 24 January 2025 at 5:29 am
> To: users@kafka.apache.org <users@kafka.apache.org>
> Subject: Re: JoinGroup API response timing.
> [You don't often get email from alivsh...@confluent.io.invalid. Learn why
> this is important at https://aka.ms/LearnAboutSenderIdentification ]
>
> EXTERNAL EMAIL - USE CAUTION when clicking links or attachments
>
>
>
>
> >  I wanted to understand the motivation for consumer group
>
> The motivation for the consumer group is to let Kafka manage the
> workload distribution among consumers.  The application then can just run
> multiple uncoordinated instances and not worry about partitions that have
> no consumer assigned or consumers that are overloaded / idle -- the
> consumer group coordinator would coordinate partition assignment across
> consumers in the same consumer group.  If the application doesn't need this
> functionality it can use KafkaConsumer.assign method to manually assign
> partitions to consumers and run instance coordination logic to manage
> liveness / workload balance by itself.
>
> > As for offsets, let them be saved locally by consumer or Redis, etc.
>
> This can work, unless we need exactly-once semantics (make sure that if we
> consume a record from a topic and produce a corresponding record to another
> topic, we either atomically do both or not do both).  For exactly-once
> semantics, the application would need to use Kafka transactions and store
> offsets in Kafka.
>
> -Artem
>
> On Wed, Jan 22, 2025 at 5:56 PM Chain Head <mrchainh...@gmail.com> wrote:
>
> > I wanted to understand the motivation for consumer group.
> >
> > *If number of partitions are known in advance*, then each consumer can
> > subscribe to individual topic-partition. If such a consumer fails, let
> > kubernetes reschedule it. (Elsewhere, similar restart mechanism may
> exist.)
> > Sharing the load of a failed consumer assumes that data across partitions
> > are "same". Else, such sharing is needless burden. As for offsets, let
> them
> > be saved locally by consumer or Redis, etc.
> >
> > The concept of consumer group shifts the responsibility of partition
> > assignment to broker because only broker knows the number of partitions.
> >
> > Best regards.
> >
> > On Thu, 23 Jan, 2025, 06:25 Brebner, Paul, <paul.breb...@netapp.com
> > .invalid>
> > wrote:
> >
> > > Hi – short answer is consumers can read from a specific partition, but
> in
> > > general for a consumer group you want to balance the partitions across
> > the
> > > available consumers for high throughput – if a consumer fails or is
> > kicked
> > > off the group because it times out etc then the remainder of the
> > consumers
> > > are rebalanced across the partitions etc.   Paul
> > >
> > > From: Chain Head <mrchainh...@gmail.com>
> > > Date: Wednesday, 22 January 2025 at 5:19 pm
> > > To: users@kafka.apache.org <users@kafka.apache.org>
> > > Subject: Re: JoinGroup API response timing.
> > > EXTERNAL EMAIL - USE CAUTION when clicking links or attachments
> > >
> > >
> > >
> > >
> > > Thanks for the explanation.
> > >
> > > Slight digression and perhaps a silly question w.r.t. consumers.
> > >
> > > Since multiple groups are possible, at a high level, the broker
> > effectively
> > > sends data of a given topic-partition to multiple consumers while
> keeping
> > > track of offsets. So, why not let consumers specify the partition ID
> they
> > > want to consume? Is the concept of consumer groups only because the
> > > consumers wouldn't know in advance how many partitions exist in a
> topic?
> > >
> > > Best regards.
> > >
> > > On Wed, Jan 22, 2025 at 7:41 AM Greg Harris
> <greg.har...@aiven.io.invalid
> > >
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > Thanks for the follow up.
> > > >
> > > > By "classic" I meant the protocol implemented by the
> > SyncGroup/JoinGroup
> > > > API [1]. It's a general group protocol that is still fully supported
> in
> > > > Kraft, and at this time has no intention of being deprecated.
> > > > It's called "classic" to distinguish it from the newer KIP-848 [2]
> > > > ConsumerGroupHeartbeat API and the share group protocol used in
> KIP-932
> > > > [3]. It is my understanding that these other protocols are _not_ a
> > > > synchronizing barrier in the same way the "classic" protocol is.
> > > >
> > > > Hope this helps,
> > > > Greg
> > > >
> > > > [1]
> > > >
> > > >
> > >
> >
> https://urldefense.com/v3/__https://github.com/apache/kafka/blob/adb033211497e539725366960e1013a4638de59f/group-coordinator/src/main/java/org/apache/kafka/coordinator/group/classic/ClassicGroup.java__;!!Nhn8V6BzJA!RUMrj1UUerrRHrSLYNqS64hcjRVWOfQEBJYb0SYerdVNtC92K2MjTB8p00y7-OkYac1rhR0rYeBbneBwtCWqMDDfrw$
> <
> https://urldefense.com/v3/__https:/github.com/apache/kafka/blob/adb033211497e539725366960e1013a4638de59f/group-coordinator/src/main/java/org/apache/kafka/coordinator/group/classic/ClassicGroup.java__;!!Nhn8V6BzJA!RUMrj1UUerrRHrSLYNqS64hcjRVWOfQEBJYb0SYerdVNtC92K2MjTB8p00y7-OkYac1rhR0rYeBbneBwtCWqMDDfrw$
> >
> > > <
> > >
> >
> https://urldefense.com/v3/__https:/github.com/apache/kafka/blob/adb033211497e539725366960e1013a4638de59f/group-coordinator/src/main/java/org/apache/kafka/coordinator/group/classic/ClassicGroup.java__;!!Nhn8V6BzJA!RUMrj1UUerrRHrSLYNqS64hcjRVWOfQEBJYb0SYerdVNtC92K2MjTB8p00y7-OkYac1rhR0rYeBbneBwtCWqMDDfrw$
> > > >
> > > > [2]
> > > >
> > > >
> > >
> >
> https://urldefense.com/v3/__https://cwiki.apache.org/confluence/display/KAFKA/KIP-848*3A*The*Next*Generation*of*the*Consumer*Rebalance*Protocol__;JSsrKysrKysr!!Nhn8V6BzJA!RUMrj1UUerrRHrSLYNqS64hcjRVWOfQEBJYb0SYerdVNtC92K2MjTB8p00y7-OkYac1rhR0rYeBbneBwtCVcPWDEEg$
> <
> https://urldefense.com/v3/__https:/cwiki.apache.org/confluence/display/KAFKA/KIP-848*3A*The*Next*Generation*of*the*Consumer*Rebalance*Protocol__;JSsrKysrKysr!!Nhn8V6BzJA!RUMrj1UUerrRHrSLYNqS64hcjRVWOfQEBJYb0SYerdVNtC92K2MjTB8p00y7-OkYac1rhR0rYeBbneBwtCVcPWDEEg$
> >
> > > <
> > >
> >
> https://urldefense.com/v3/__https:/cwiki.apache.org/confluence/display/KAFKA/KIP-848*3A*The*Next*Generation*of*the*Consumer*Rebalance*Protocol__;JSsrKysrKysr!!Nhn8V6BzJA!RUMrj1UUerrRHrSLYNqS64hcjRVWOfQEBJYb0SYerdVNtC92K2MjTB8p00y7-OkYac1rhR0rYeBbneBwtCVcPWDEEg$
> > > >
> > > > [3]
> > > >
> > > >
> > >
> >
> https://urldefense.com/v3/__https://cwiki.apache.org/confluence/display/KAFKA/KIP-932*3A*Queues*for*Kafka__;JSsrKw!!Nhn8V6BzJA!RUMrj1UUerrRHrSLYNqS64hcjRVWOfQEBJYb0SYerdVNtC92K2MjTB8p00y7-OkYac1rhR0rYeBbneBwtCWxDptKnA$
> <
> https://urldefense.com/v3/__https:/cwiki.apache.org/confluence/display/KAFKA/KIP-932*3A*Queues*for*Kafka__;JSsrKw!!Nhn8V6BzJA!RUMrj1UUerrRHrSLYNqS64hcjRVWOfQEBJYb0SYerdVNtC92K2MjTB8p00y7-OkYac1rhR0rYeBbneBwtCWxDptKnA$
> >
> > > <
> > >
> >
> https://urldefense.com/v3/__https:/cwiki.apache.org/confluence/display/KAFKA/KIP-932*3A*Queues*for*Kafka__;JSsrKw!!Nhn8V6BzJA!RUMrj1UUerrRHrSLYNqS64hcjRVWOfQEBJYb0SYerdVNtC92K2MjTB8p00y7-OkYac1rhR0rYeBbneBwtCWxDptKnA$
> > > >
> > > >
> > > > On Tue, Jan 21, 2025 at 4:41 PM Chain Head <mrchainh...@gmail.com>
> > > wrote:
> > > >
> > > > > Thanks.
> > > > >
> > > > > By "classic" you mean pre-KRaft consensus? What is the "current"
> > Kafka
> > > > > Group protocol?
> > > > >
> > > > > Best regards.
> > > > >
> > > > > On Tue, Jan 21, 2025 at 9:55 PM Greg Harris
> > > <greg.har...@aiven.io.invalid
> > > > >
> > > > > wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > Yes you are correct. The "classic" Kafka Group Protocol is a
> > > > > synchronizing
> > > > > > barrier for all members.
> > > > > > All JoinGroup member responses are returned after all JoinGroup
> > > member
> > > > > > requests are received.
> > > > > >
> > > > > > Thanks,
> > > > > > Greg
> > > > > >
> > > > > > On Tue, Jan 21, 2025 at 7:10 AM Chain Head <
> mrchainh...@gmail.com>
> > > > > wrote:
> > > > > >
> > > > > > > Assume that three consumers of a certain group want to connect
> > to a
> > > > > > broker
> > > > > > > for a topic with 3 partitions. After the FindCoordinator API is
> > > done,
> > > > > the
> > > > > > > consumers send JoinGroup. Since the broker cannot know in
> advance
> > > how
> > > > > > many
> > > > > > > consumers are expected to join, it waits
> > > > > > group.initial.rebalance.delay.ms
> > > > > > > before starting a rebalance.
> > > > > > >
> > > > > > > Therefore, does this mean the JoinGroup API response of each
> > > request
> > > > is
> > > > > > > "held" until the waiting period is over?
> > > > > > >
> > > > > > > Best regards.
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: JoinGroup API response timing.

Reply via email to