Hello everyone,

Since this KIP has been fully discussed, I will initiate a vote for it next
Monday.
Thank you and have a nice weekend.

Best regards,
TengYao
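
For readers skimming the thread, the client-side behaviour discussed below (generate the member ID once per consumer instance, reuse it for every heartbeat, and rejoin with the same ID and member epoch 0 after FENCED_MEMBER_EPOCH) can be sketched roughly as follows. This is only an illustrative Python sketch, not Kafka client code: ConsumerMembership, new_member_id and the dictionary keys are hypothetical names, and the 22-character URL-safe Base64 rendering is an assumption about how Kafka's Uuid prints its 16 bytes. The KIP remains the authority on the encoding.

```python
import base64
import uuid


def new_member_id() -> str:
    """Return a random member ID as a 22-character URL-safe Base64 string.

    Assumption: 16 random bytes rendered as unpadded URL-safe Base64.
    """
    raw = uuid.uuid4().bytes  # 16 bytes from a random (version 4) UUID
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


class ConsumerMembership:
    """Hypothetical holder for one consumer instance's member ID and epoch."""

    def __init__(self) -> None:
        # Generated once; kept until the consumer instance stops or dies.
        self.member_id = new_member_id()
        # Epoch 0 marks a heartbeat that (re)joins the group.
        self.member_epoch = 0

    def heartbeat_fields(self) -> dict:
        """Fields this sketch would place in the next heartbeat request."""
        return {"memberId": self.member_id, "memberEpoch": self.member_epoch}

    def on_heartbeat_response(self, member_epoch: int) -> None:
        """Adopt the epoch returned by the group coordinator."""
        self.member_epoch = member_epoch

    def on_fenced(self) -> None:
        """After FENCED_MEMBER_EPOCH: keep the same ID, rejoin with epoch 0."""
        self.member_epoch = 0


if __name__ == "__main__":
    membership = ConsumerMembership()
    print(membership.heartbeat_fields())  # first heartbeat, epoch 0
    membership.on_heartbeat_response(3)   # coordinator assigned an epoch
    membership.on_fenced()                # fenced: same ID, epoch back to 0
    print(membership.heartbeat_fields())
```

Because the ID is created in the constructor, the consumer knows it before the first response arrives, so a leave heartbeat sent during an early shutdown can name the member.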

TengYao Chi <kiting...@gmail.com> wrote on Thu, Sep 5, 2024 at 2:19 PM:

> Hello everyone,
>
> KT2: It looks like everyone who has expressed an opinion supports the
> second option: “Document a recommendation for clients to use UUIDs as
> member IDs, without strictly enforcing it.”
> I have updated the KIP accordingly.
> Please take a look, and let me know if you have any thoughts or feedback.
>
> Thank you!
>
> Best regards,
> TengYao
>
> Chia-Ping Tsai <chia7...@gmail.com> wrote on Fri, Aug 30, 2024 at 9:56 PM:
>
>> hi TengYao
>>
>> KT2: +1 to second approach
>>
>> Best,
>> Chia-Ping
>>
>>
>> David Jacot <dja...@confluent.io.invalid> wrote on Fri, Aug 30, 2024 at 9:15 PM:
>>
>> > Hi TengYao,
>> >
>> > KT2: I don't think that we can realistically validate the UUID on the
>> > server. It is basically a string of chars. So I lean towards having a
>> > good recommendation in the KIP and in the documentation of the field in
>> > the RPC's definition.
>> >
>> > Best,
>> > David
>> >
>> > On Fri, Aug 30, 2024 at 3:02 PM TengYao Chi <kiting...@gmail.com>
>> wrote:
>> >
>> > > Hello Kirk!
>> > >
>> > > Thank you for your comments!
>> > >
>> > > KT1: Yes, you are correct. The issue is not unique to the initial
>> > > heartbeat; there can always be cases where the broker might lose
>> > > connection with a member.
>> > >
>> > > KT2: Currently, if the client doesn't have a member ID and the
>> > > memberEpoch equals 0, the coordinator will generate a UUID as the
>> > > member ID for the client. However, at the RPC level, the member ID is
>> > > sent as a literal string, meaning there are no restrictions on the
>> > > format at this level.
>> > > This also reminds me that we haven't reached a final conclusion on how
>> > > to enforce the use of UUIDs.
>> > > From our previous discussions, I recall two possible approaches:
>> > > The first is to validate the UUID on the server side, and if it's not
>> > > valid, throw an exception to the client.
>> > > The second is to document a recommendation for clients to use UUIDs as
>> > > member IDs, without strictly enforcing it.
>> > > I think it's time to decide on the approach we want to take.
>> > >
>> > > KT3: Yes, "session" can be considered synonymous with "membership" in
>> > this
>> > > context.
>> > >
>> > > KT4: Thank you for pointing that out. I will update the wording to
>> > > specifically say this behavior is for consumers.
>> > >
>> > > Thanks again for your comments.
>> > >
>> > > Best regards,
>> > > TengYao
>> > >
>> > > Kirk True <k...@kirktrue.pro> wrote on Fri, Aug 30, 2024 at 12:39 AM:
>> > >
>> > > > Hi TengYao!
>> > > >
>> > > > Sorry for being late to the discussion...
>> > > >
>> > > > After reading the thread and then the KIP, I had a few
>> > > questions/comments:
>> > > >
>> > > > KT1: In Motivation, it states: "This scenario can result in the
>> broker
>> > > > registering a new member for which it will never receive a proper
>> leave
>> > > > request.” Just to be clear, the broker will always have cases where
>> it
>> > > > might lose connection with a member. That’s not unique to the
>> initial
>> > > > heartbeat, right?
>> > > >
>> > > > KT2: There was a bit of back and forth about format of the member
>> ID.
>> > > From
>> > > > what I gathered in the thread, the member ID is still defined in the
>> > RPC
>> > > as
>> > > > a string and not a UUID, right? The KIP states that the “client must
>> > > > generate a UUID as the member ID” and that the “server will validate
>> > > that a
>> > > > valid UUID is provided.” Is that a change for the server, or is it
>> > > already
>> > > > enforced as a UUID?
>> > > >
>> > > > KT3: Lianet mentioned some confusion over the use of the word
>> > “session.”
>> > > > Isn’t “session” synonymous with “membership?”
>> > > >
>> > > > KT4: Under “Member ID Lifecycle,” it states: "The client should
>> reuse
>> > the
>> > > > same UUID as the member ID for all heartbeats and rejoin attempts to
>> > > > maintain continuity within the group.” Could we change the first
>> part
>> > of
>> > > > that to “The Consumer instance should…” We do have lifetimes that
>> > extend
>> > > > past the lifetime of a client instance (such as the transaction ID).
>> > > >
>> > > > Thanks,
>> > > > Kirk
>> > > >
>> > > > > On Aug 29, 2024, at 1:28 AM, TengYao Chi <kiting...@gmail.com>
>> > wrote:
>> > > > >
>> > > > > Hi David,
>> > > > >
>> > > > > Thank you for pointing that out.
>> > > > > I have updated the content of the KIP based on Lianet's and your
>> > > > feedback.
>> > > > > Please take a look and let me know your thoughts.
>> > > > >
>> > > > > Best regards,
>> > > > > TengYao
>> > > > >
>> > > > > David Jacot <dja...@confluent.io.invalid> wrote on Thu, Aug 29, 2024 at 3:20 PM:
>> > > > >
>> > > > >> Hi TengYao,
>> > > > >>
>> > > > >> Thanks for the update. I haven't fully read it yet but I will
>> soon.
>> > > > >>
>> > > > >> LM4: This is incorrect. The consumer must keep its member ID during
>> > > > >> its entire lifetime (until the process stops or dies). The protocol
>> > > > >> stipulates that a member must rejoin with the same member ID and the
>> > > > >> member epoch set to zero when a FENCED_MEMBER_EPOCH error occurs. This
>> > > > >> allows the member to resynchronize itself. We should not change this
>> > > > >> behavior. I think that we should see the client-side generated ID as
>> > > > >> an incarnation ID of the application. It is generated once and kept
>> > > > >> until the application stops or dies.
>> > > > >>
>> > > > >> Best,
>> > > > >> David
>> > > > >>
>> > > > >> On Thu, Aug 29, 2024 at 6:21 AM TengYao Chi <kiting...@gmail.com
>> >
>> > > > wrote:
>> > > > >>
>> > > > >>> Hello Lianet!
>> > > > >>>
>> > > > >>> Thanks for the reviews and suggestions!
>> > > > >>>
>> > > > >>> LM1: Indeed, we plan to enforce client-side ID generation in the
>> > > > future,
>> > > > >>> and it is not an alternative. I will change the title
>> accordingly.
>> > > > >>>
>> > > > >>> LM2: Yes, that's the expectation. I will add that statement to
>> the
>> > > > public
>> > > > >>> interface section.
>> > > > >>>
>> > > > >>> LM3: Thank you for the high-level perspective review. I think
>> > you're
>> > > > >> right;
>> > > > >>> our intention isn't very clear since it was placed at the end of
>> > the
>> > > > >>> section. I will try to rephrase that section to make it more
>> > obvious.
>> > > > >>>
>> > > > >>> LM4: Regarding the definition of "session" in this KIP, I
>> believe
>> > it
>> > > > >> refers
>> > > > >>> to the period between the *first-time heartbeat* and when the
>> > > *consumer
>> > > > >>> leaves the group* (whether through a graceful shutdown or a
>> > heartbeat
>> > > > >>> timeout). The consumer should reuse its UUID if it has been
>> > generated
>> > > > >>> before. The only situation in which it will regenerate the UUID
>> is
>> > if
>> > > > the
>> > > > >>> coordinator finds that there is already a consumer with the same
>> > > UUID.
>> > > > >>> IIRC, the coordinator should compare the member epochs, and the
>> > > > >>> later-joined consumer should be fenced off by the coordinator
>> due
>> > to
>> > > > >> having
>> > > > >>> a lower member epoch. Once the consumer receives a
>> > > > `FENCED_MEMBER_EPOCH`
>> > > > >>> error, it will generate a new UUID and attempt to rejoin. I will
>> > > > clarify
>> > > > >>> this in the KIP.
>> > > > >>>
>> > > > >>> Thanks again for your reviews, I really appreciate it.
>> > > > >>>
>> > > > >>> Best regards,
>> > > > >>> TengYao
>> > > > >>>
>> > > > >>> Lianet M. <liane...@gmail.com> wrote on Wed, Aug 28, 2024 at 7:12 PM:
>> > > > >>>
>> > > > >>>> Hello TengYao! Thanks for taking on this issue, we've been
>> going
>> > > > around
>> > > > >>> it
>> > > > >>>> for a while.
>> > > > >>>>
>> > > > >>>> LM1: About the title of the KIP: "Enable ID Generation for
>> Clients
>> > > > over
>> > > > >>> the
>> > > > >>>> ConsumerGroupHeartbeat RPC". I find it confusing because it
>> hints
>> > > that
>> > > > >>>> we're adding it as an alternative (which was discussed and
>> > > discarded,
>> > > > >> in
>> > > > >>>> favour of really enforcing it). It's also missing the core
>> change
>> > > imo,
>> > > > >>>> which is "where" the generation happens. So, maybe more to the
>> > point
>> > > > >> with
>> > > > >>>> something along the lines of "Client-side generated ID for
>> clients
>> > > > over
>> > > > >>>> ConsumerGroupHeartbeat RPC"?
>> > > > >>>>
>> > > > >>>> LM2: On the public interfaces section, the KIP states that "the
>> > > server
>> > > > >>> will
>> > > > >>>> reject the request", but we should agree on the specific error
>> > > type. I
>> > > > >>>> expect it should fail with an InvalidRequestException, is that
>> the
>> > > > >>>> intention? (This was also suggested in the discussion thread
>> > before
>> > > > but
>> > > > >>> is
>> > > > >>>> not in the KIP).
>> > > > >>>>
>> > > > >>>> LM3. Related to my previous point, I find that to be the true
>> > > > >>> public-facing
>> > > > >>>> change (member ID mandatory at the protocol level), but it's
>> only
>> > at
>> > > > >> the
>> > > > >>>> end of the Public interfaces changes, kind of lost among
>> details
>> > of
>> > > > how
>> > > > >>>> we're going to do it. Should we rephrase that section with the
>> > > actual
>> > > > >>>> change first, and the hows after (ex. Bumping the version is
>> not
>> > the
>> > > > >>>> public-facing change in this case, it's just the mechanism to
>> > > properly
>> > > > >>>> introduce our change)
>> > > > >>>>
>> > > > >>>> LM4. Regarding the lifetime of the UUID: the KIP states we will
>> > > > "Verify
>> > > > >>>> that the UUID remains consistent across all subsequent
>> heartbeats
>> > > > >> during
>> > > > >>>> the session". What is this "session" referring to here? I would
>> > > expect
>> > > > >>> that
>> > > > >>>> the UUID is associated to a consumer instance (generated for
>> the
>> > > > >> consumer
>> > > > >>>> the first time it needs to send a HB if it doesn't have the
>> UUID
>> > > yet.
>> > > > >>> From
>> > > > >>>> there on, every time it needs to send a "first HB" again, it
>> will
>> > > > reuse
>> > > > >>> its
>> > > > >>>> UUID, is that the intention? Note that we should consider that
>> the
>> > > > same
>> > > > >>>> consumer instance may have many "first heartbeats", meaning
>> > > heartbeats
>> > > > >> to
>> > > > >>>> join the group when it's not part of it (ex. consumer
>> unsubscribe
>> > +
>> > > > >>>> subscribe, fenced, stale). Is this the intention or are you
>> > > > considering
>> > > > >>> the
>> > > > >>>> lifetime differently? We should clarify it in the KIP.
>> > > > >>>>
>> > > > >>>> Thanks!
>> > > > >>>>
>> > > > >>>> Lianet
>> > > > >>>>
>> > > > >>>> On Tue, Aug 27, 2024 at 2:27 AM TengYao Chi <
>> kiting...@gmail.com>
>> > > > >> wrote:
>> > > > >>>>
>> > > > >>>>> Hi everyone,
>> > > > >>>>>
>> > > > >>>>> I have revised this KIP multiple times based on the feedback
>> from
>> > > our
>> > > > >>>>> discussions.
>> > > > >>>>> I would greatly appreciate it if you could review it when you
>> > have
>> > > > >> the
>> > > > >>>>> time.
>> > > > >>>>> If there are no further comments or suggestions, I plan to
>> > proceed
>> > > > >> with
>> > > > >>>>> initiating a vote soon.
>> > > > >>>>>
>> > > > >>>>> Best regards,
>> > > > >>>>> TengYao
>> > > > >>>>>
>> > > > >>>>> TengYao Chi <kiting...@gmail.com> wrote on Fri, Aug 23, 2024 at 2:43 PM:
>> > > > >>>>>
>> > > > >>>>>> Hi Andrew,
>> > > > >>>>>> Thank you for your previous feedback and insights.
>> > > > >>>>>> Your contributions have added valuable perspectives to the
>> > > > >>> discussions.
>> > > > >>>>>> And we also benefit from the comparison of different
>> solutions.
>> > > > >>>>>> I’m also looking forward to seeing an initial version in
>> > KIP-932,
>> > > > >> as
>> > > > >>> it
>> > > > >>>>>> will provide a good reference for future implementations.
>> > > > >>>>>>
>> > > > >>>>>> Regarding your comment on AS2, I wanted to clarify that my
>> > > > >>>> specification
>> > > > >>>>>> references org.apache.kafka.common.Uuid.
>> > > > >>>>>> I believe we’re referring to the same class, and it might
>> just
>> > be
>> > > a
>> > > > >>>> small
>> > > > >>>>>> oversight due to the busy schedule.
>> > > > >>>>>>
>> > > > >>>>>> I want to express my gratitude once again for your many
>> > insightful
>> > > > >>>>>> comments, which have helped the discussion progress smoothly.
>> > > > >>>>>>
>> > > > >>>>>> Best regards,
>> > > > >>>>>> TengYao
>> > > > >>>>>>
>> > > > >>>>>>
>> > > > >>>>>> Andrew Schofield <andrew_schofi...@live.com> wrote on Thu, Aug 22, 2024 at 11:28 PM:
>> > > > >>>>>>
>> > > > >>>>>>> Hi TengYao,
>> > > > >>>>>>> I’ve been reading through the comments and I’m happy that
>> the
>> > > > >> lobby
>> > > > >>>>>>> approach has not gained support.
>> > > > >>>>>>>
>> > > > >>>>>>> Assuming that this KIP is voted, I will be happy to change
>> > > KIP-932
>> > > > >>> so
>> > > > >>>>>>> that it only supports client-side member ID generation.
>> Because
>> > > > >> that
>> > > > >>>> KIP
>> > > > >>>>>>> is still
>> > > > >>>>>>> under development, I can do this in the first version of
>> > > > >>>>>>> ShareGroupHeartbeat.
>> > > > >>>>>>>
>> > > > >>>>>>> AS2: For the encoding section, I suppose the specific
>> encoding
>> > > > >> which
>> > > > >>>>>>> is used is what org.apache.kafka.utils.Uuid uses.
>> > > > >>>>>>>
>> > > > >>>>>>> Thanks,
>> > > > >>>>>>> Andrew
>> > > > >>>>>>>
>> > > > >>>>>>>> On 14 Aug 2024, at 17:00, TengYao Chi <kiting...@gmail.com
>> >
>> > > > >>> wrote:
>> > > > >>>>>>>>
>> > > > >>>>>>>> Hello Apoorv,
>> > > > >>>>>>>> Thank you for your feedback.
>> > > > >>>>>>>> Regarding the questions you raised, unfortunately, this KIP
>> > > > >> cannot
>> > > > >>>>>>>> guarantee the order of heartbeats. As with many classic
>> > > > >>> distributed
>> > > > >>>>>>> system
>> > > > >>>>>>>> challenges, what we can do is make our best effort to
>> ensure
>> > > > >> that
>> > > > >>>>> there
>> > > > >>>>>>> are
>> > > > >>>>>>>> no idle members or stale assignments under normal
>> > circumstances.
>> > > > >>>>>>>>
>> > > > >>>>>>>> As for the lobby approach, I’m not a fan of it because it
>> > > > >> requires
>> > > > >>>>>>> adding a
>> > > > >>>>>>>> mechanism to maintain client state within the
>> ConsumerGroup,
>> > > > >>> which,
>> > > > >>>> in
>> > > > >>>>>>> my
>> > > > >>>>>>>> view, resembles something like a two-phase commit. This
>> would
>> > > > >>>>> introduce
>> > > > >>>>>>>> more complexity than the proposal in this KIP, which is
>> > > > >> something
>> > > > >>> we
>> > > > >>>>>>> want
>> > > > >>>>>>>> to avoid. KIP-848 aims to simplify the existing protocol,
>> and
>> > > > >>> while
>> > > > >>>>> the
>> > > > >>>>>>>> lobby approach is a good one, I believe it is not the right
>> > fit
>> > > > >>> for
>> > > > >>>>> this
>> > > > >>>>>>>> particular situation.
>> > > > >>>>>>>>
>> > > > >>>>>>>> Best regards,
>> > > > >>>>>>>> TengYao
>> > > > >>>>>>>>
>> > > > >>>>>>>> TengYao Chi <kiting...@gmail.com> wrote on Wed, Aug 14, 2024 at 11:45 PM:
>> > > > >>>>>>>>
>> > > > >>>>>>>>> Hi David,
>> > > > >>>>>>>>>
>> > > > >>>>>>>>> I really appreciate your review and suggestions. As I am
>> > still
>> > > > >>>>> gaining
>> > > > >>>>>>>>> experience in writing KIPs, your input has been incredibly
>> > > > >>>> helpful. I
>> > > > >>>>>>> am
>> > > > >>>>>>>>> currently applying your suggestions to the KIP and will
>> > > > >> complete
>> > > > >>> it
>> > > > >>>>> as
>> > > > >>>>>>> soon
>> > > > >>>>>>>>> as possible.
>> > > > >>>>>>>>> Regarding the UUID part, I think we haven’t reached a conclusion
>> > > > >>>>>>>>> yet (so far, according to this thread).
>> > > > >>>>>>>>> However, I will review the current implementation in the
>> > Kafka
>> > > > >>>> `Uuid`
>> > > > >>>>>>>>> class and include a brief specification in the KIP.
>> > > > >>>>>>>>>
>> > > > >>>>>>>>> Once again, thank you so much for your help.
>> > > > >>>>>>>>>
>> > > > >>>>>>>>> Best regards,
>> > > > >>>>>>>>> TengYao
>> > > > >>>>>>>>>
>> > > > >>>>>>>>> Chia-Ping Tsai <chia7...@gmail.com> wrote on Wed, Aug 14, 2024 at 11:14 PM:
>> > > > >>>>>>>>>
>> > > > >>>>>>>>>> hi Apoorv
>> > > > >>>>>>>>>>
>> > > > >>>>>>>>>>> As the memberId is now known to the client, and client
>> > might
>> > > > >>> send
>> > > > >>>>> the
>> > > > >>>>>>>>>> leave
>> > > > >>>>>>>>>> group heartbeat on shutdown prior to receiving the
>> initial
>> > > > >>>> heartbeat
>> > > > >>>>>>>>>> response. If that's true then how do we guarantee that
>> the 2
>> > > > >>>>> requests
>> > > > >>>>>>> to
>> > > > >>>>>>>>>> join and leave will be processed in order, which could
>> still
>> > > > >>> leave
>> > > > >>>>>>> stale
>> > > > >>>>>>>>>> members or throw unknown member id exceptions?
>> > > > >>>>>>>>>>
>> > > > >>>>>>>>>> This is definitely a good question. The short answer: no
>> > > > >>>>>>>>>> guarantee, but best effort.
>> > > > >>>>>>>>>>
>> > > > >>>>>>>>>> Please note that the root cause is "we do not have enough time to
>> > > > >>>>>>>>>> wait for the member ID (response) when closing the consumer".
>> > > > >>>>>>>>>> Sadly, we can't guarantee the request order for the same reason.
>> > > > >>>>>>>>>>
>> > > > >>>>>>>>>> However, in contrast to the previous behavior, there is one big
>> > > > >>>>>>>>>> benefit of the new approach: we can try STONITH because we know
>> > > > >>>>>>>>>> the member ID.
>> > > > >>>>>>>>>>
>> > > > >>>>>>>>>> Best,
>> > > > >>>>>>>>>> Chia-Ping
>> > > > >>>>>>>>>>
>> > > > >>>>>>>>>>
>> > > > >>>>>>>>>> Apoorv Mittal <apoorvmitta...@gmail.com> wrote on Wed, Aug 14, 2024 at 8:55 PM:
>> > > > >>>>>>>>>>
>> > > > >>>>>>>>>>> Hi TengYao,
>> > > > >>>>>>>>>>> Thanks for the KIP. Continuing on the point which Andrew
>> > > > >>>> mentioned
>> > > > >>>>> as
>> > > > >>>>>>>>>> AS1.
>> > > > >>>>>>>>>>>
>> > > > >>>>>>>>>>> As the memberId is now known to the client, and client
>> > might
>> > > > >>> send
>> > > > >>>>> the
>> > > > >>>>>>>>>> leave
>> > > > >>>>>>>>>>> group heartbeat on shutdown prior to receiving the
>> initial
>> > > > >>>>> heartbeat
>> > > > >>>>>>>>>>> response. If that's true then how do we guarantee that
>> the
>> > 2
>> > > > >>>>>>> requests to
>> > > > >>>>>>>>>>> join and leave will be processed in order, which could
>> > still
>> > > > >>>> leave
>> > > > >>>>>>> stale
>> > > > >>>>>>>>>>> members or throw unknown member id exceptions?
>> > > > >>>>>>>>>>>
>> > > > >>>>>>>>>>> Client-side member ID generation is helpful in that the client
>> > > > >>>>>>>>>>> and the broker then share the same view of the group. But I think
>> > > > >>>>>>>>>>> the major concern we want to solve here is stale partition
>> > > > >>>>>>>>>>> assignments, which might still exist with the new approach. I am
>> > > > >>>>>>>>>>> leaning towards the suggestion mentioned by Andrew, where
>> > > > >>>>>>>>>>> partition assignment is triggered on the subsequent heartbeat,
>> > > > >>>>>>>>>>> once the client acknowledges the initial heartbeat (delayed
>> > > > >>>>>>>>>>> partition assignment).
>> > > > >>>>>>>>>>>
>> > > > >>>>>>>>>>> On a separate note, I have a different question. What happens
>> > > > >>>>>>>>>>> when a client sends the initial heartbeat without a memberId,
>> > > > >>>>>>>>>>> then crashes and restarts? I think we would re-trigger
>> > > > >>>>>>>>>>> assignments and expire the member only after the heartbeat
>> > > > >>>>>>>>>>> session timeout? If that's true, could delayed partition
>> > > > >>>>>>>>>>> assignment help us in this situation as well?
>> > > > >>>>>>>>>>>
>> > > > >>>>>>>>>>> Regards,
>> > > > >>>>>>>>>>> Apoorv Mittal
>> > > > >>>>>>>>>>>
>> > > > >>>>>>>>>>>
>> > > > >>>>>>>>>>> On Wed, Aug 14, 2024 at 12:51 PM David Jacot
>> > > > >>>>>>>>>> <dja...@confluent.io.invalid>
>> > > > >>>>>>>>>>> wrote:
>> > > > >>>>>>>>>>>
>> > > > >>>>>>>>>>>> Hi Andrew,
>> > > > >>>>>>>>>>>>
>> > > > >>>>>>>>>>>> Personally, I don't like the lobby approach. It makes
>> > things
>> > > > >>>> more
>> > > > >>>>>>>>>>>> complicated and it would require changing the records
>> on
>> > the
>> > > > >>>>> server
>> > > > >>>>>>>>>> too.
>> > > > >>>>>>>>>>>> This is why I initially suggested the rejected
>> alternative
>> > > > >> #2
>> > > > >>>>> which
>> > > > >>>>>>> is
>> > > > >>>>>>>>>>>> pretty close but also not perfect.
>> > > > >>>>>>>>>>>>
>> > > > >>>>>>>>>>>> I'd like to clarify one thing. The
>> ConsumerGroupHeartbeat
>> > > > >> API
>> > > > >>>>>>> already
>> > > > >>>>>>>>>>>> supports generating the member id on the client so we
>> > don't
>> > > > >>> need
>> > > > >>>>> any
>> > > > >>>>>>>>>>>> conditional logic on the client side. This is actually
>> > what
>> > > > >> we
>> > > > >>>>>>> wanted
>> > > > >>>>>>>>>> to
>> > > > >>>>>>>>>>> do
>> > > > >>>>>>>>>>>> in the first place but the idea got pushed back by
>> Magnus
>> > > > >> back
>> > > > >>>>> then
>> > > > >>>>>>>>>>> because
>> > > > >>>>>>>>>>>> generating uuid from librdkafka required a new
>> dependency.
>> > > > >> It
>> > > > >>>>> turns
>> > > > >>>>>>>>>> out
>> > > > >>>>>>>>>>>> that librdkafka has that dependency today. In
>> retrospect,
>> > we
>> > > > >>>>> should
>> > > > >>>>>>>>>> have
>> > > > >>>>>>>>>>>> pushed back on this. Long story short, we can just do
>> it.
>> > > > >> The
>> > > > >>>>>>>>>> proposal in
>> > > > >>>>>>>>>>>> this KIP is to make the member id required in future
>> > > > >> versions.
>> > > > >>>> We
>> > > > >>>>>>>>>> could
>> > > > >>>>>>>>>>>> also decide not to do it and to keep supporting both
>> > > > >>>> approaches. I
>> > > > >>>>>>>>>> would
>> > > > >>>>>>>>>>>> also be fine with this.
>> > > > >>>>>>>>>>>>
>> > > > >>>>>>>>>>>> Best,
>> > > > >>>>>>>>>>>> David
>> > > > >>>>>>>>>>>>
>> > > > >>>>>>>>>>>> On Wed, Aug 14, 2024 at 12:30 PM Andrew Schofield <
>> > > > >>>>>>>>>>>> andrew_schofi...@live.com>
>> > > > >>>>>>>>>>>> wrote:
>> > > > >>>>>>>>>>>>
>> > > > >>>>>>>>>>>>> Hi TengYao,
>> > > > >>>>>>>>>>>>> Thanks for your response. I’ll have just one more try
>> to
>> > > > >>>>> persuade.
>> > > > >>>>>>>>>>>>> I feel that I will need to follow the approach with
>> > KIP-932
>> > > > >>>> when
>> > > > >>>>>>>>>> we’ve
>> > > > >>>>>>>>>>>>> made a decision, so I do have more than a passing
>> > interest
>> > > > >> in
>> > > > >>>>> this.
>> > > > >>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>> A group member in the lobby is in the group, but it
>> does
>> > > > >> not
>> > > > >>>> have
>> > > > >>>>>>>>>> any
>> > > > >>>>>>>>>>>>> assignments. A member of a consumer group can have no
>> > > > >>> assigned
>> > > > >>>>>>>>>>>>> partitions (such as 5 CG members subscribed to a topic
>> > > > >> with 4
>> > > > >>>>>>>>>>>> partitions),
>> > > > >>>>>>>>>>>>> so it’s a situation that consumer group members
>> already
>> > > > >>> expect.
>> > > > >>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>> One of Kafka’s strengths is the way that we handle API
>> > > > >>>>> versioning.
>> > > > >>>>>>>>>>>>> But, there is a cost - the behaviour is different
>> > depending
>> > > > >>> on
>> > > > >>>>> the
>> > > > >>>>>>>>>> RPC
>> > > > >>>>>>>>>>>>> version. KIP-848 is on the cusp of completion, but
>> we’re
>> > > > >>>> already
>> > > > >>>>>>>>>> adding
>> > > > >>>>>>>>>>>>> conditional logic for v0/v1 for
>> ConsumerGroupHeartbeat.
>> > > > >>> That’s
>> > > > >>>> a
>> > > > >>>>>>>>>> pity.
>> > > > >>>>>>>>>>>>> Only a minor issue, but it’s unfortunate.
>> > > > >>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>> Thanks,
>> > > > >>>>>>>>>>>>> Andrew
>> > > > >>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>> On 14 Aug 2024, at 08:47, TengYao Chi <
>> > > > >> kiting...@gmail.com>
>> > > > >>>>>>>>>> wrote:
>> > > > >>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>> Hello Andrew
>> > > > >>>>>>>>>>>>>> Thank you for your thoughtful suggestions and getting
>> > the
>> > > > >>>>>>>>>> discussion
>> > > > >>>>>>>>>>>>> going.
>> > > > >>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>> To AS1:
>> > > > >>>>>>>>>>>>>> In the current scenario where the server generates
>> the
>> > > > >> UUID,
>> > > > >>>> if
>> > > > >>>>>>>>>> the
>> > > > >>>>>>>>>>>>> client
>> > > > >>>>>>>>>>>>>> shuts down before receiving the memberId generated by
>> > the
>> > > > >> GC
>> > > > >>>>>>>>>>>> (regardless
>> > > > >>>>>>>>>>>>> of
>> > > > >>>>>>>>>>>>>> whether it’s a graceful shutdown or not), the GC will
>> > > > >> still
>> > > > >>>> have
>> > > > >>>>>>>>>> to
>> > > > >>>>>>>>>>>> wait
>> > > > >>>>>>>>>>>>>> for the heartbeat timeout because the client doesn’t
>> > know
>> > > > >>> its
>> > > > >>>>>>>>>>> memberId.
>> > > > >>>>>>>>>>>>>> This KIP indeed cannot completely resolve the
>> > idempotency
>> > > > >>>> issue,
>> > > > >>>>>>>>>> but
>> > > > >>>>>>>>>>> it
>> > > > >>>>>>>>>>>>> can
>> > > > >>>>>>>>>>>>>> better handle shutdown scenarios under normal
>> > > > >> circumstances
>> > > > >>>>>>>>>> because
>> > > > >>>>>>>>>>> the
>> > > > >>>>>>>>>>>>>> client always knows its memberId. Even if the client
>> > shuts
>> > > > >>>> down
>> > > > >>>>>>>>>>>>> immediately
>> > > > >>>>>>>>>>>>>> after the initial heartbeat, as long as it performs a
>> > > > >>> graceful
>> > > > >>>>>>>>>>> shutdown
>> > > > >>>>>>>>>>>>> and
>> > > > >>>>>>>>>>>>>> sends a leave heartbeat, the GC can manage the
>> situation
>> > > > >> and
>> > > > >>>>>>>>>> remove
>> > > > >>>>>>>>>>> the
>> > > > >>>>>>>>>>>>>> member. Therefore, the goal of this KIP is to address
>> > the
>> > > > >>>> issue
>> > > > >>>>>>>>>> where
>> > > > >>>>>>>>>>>> the
>> > > > >>>>>>>>>>>>>> GC has to wait for the heartbeat timeout due to the
>> > client
>> > > > >>>>> leaving
>> > > > >>>>>>>>>>>>> without
>> > > > >>>>>>>>>>>>>> knowing its memberId, which leads to reduced
>> throughput
>> > > > >> and
>> > > > >>>>>>>>>> limited
>> > > > >>>>>>>>>>>>>> scalability.
>> > > > >>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>> The solution you suggest has also been proposed by
>> > David.
>> > > > >>> The
>> > > > >>>>>>>>>> concern
>> > > > >>>>>>>>>>>>> with
>> > > > >>>>>>>>>>>>>> this approach is that it introduces additional
>> > complexity
>> > > > >>> for
>> > > > >>>>>>>>>>>>>> compatibility, as the new server would not
>> immediately
>> > add
>> > > > >>> the
>> > > > >>>>>>>>>> member
>> > > > >>>>>>>>>>>> to
>> > > > >>>>>>>>>>>>>> the group, while the old server would. This requires
>> > > > >> clients
>> > > > >>>> to
>> > > > >>>>>>>>>>>>>> differentiate whether their memberId has been added
>> to
>> > the
>> > > > >>>> group
>> > > > >>>>>>>>>> or
>> > > > >>>>>>>>>>>> not,
>> > > > >>>>>>>>>>>>>> which could result in unexpected logs.
>> > > > >>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>> Best Regards,
>> > > > >>>>>>>>>>>>>> TengYao
>> > > > >>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>> Andrew Schofield <andrew_schofi...@live.com> wrote on Wed, Aug 14, 2024 at 12:29 AM:
>> > > > >>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>> Hi TengYao,
>> > > > >>>>>>>>>>>>>>> Thanks for the KIP. I wonder if there’s a different
>> way
>> > > > >> to
>> > > > >>>>> close
>> > > > >>>>>>>>>>> what
>> > > > >>>>>>>>>>>>>>> is quite a small window.
>> > > > >>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>> AS1: It is true that the initial heartbeat is not
>> > > > >>> idempotent,
>> > > > >>>>> but
>> > > > >>>>>>>>>>> this
>> > > > >>>>>>>>>>>>>>> remains
>> > > > >>>>>>>>>>>>>>> true with this KIP. It’s just differently not
>> > idempotent.
>> > > > >>> If
>> > > > >>>>> the
>> > > > >>>>>>>>>>>> client
>> > > > >>>>>>>>>>>>>>> makes its
>> > > > >>>>>>>>>>>>>>> own member ID, sends a request and dies, the GC will
>> > > > >> still
>> > > > >>>> have
>> > > > >>>>>>>>>>> added
>> > > > >>>>>>>>>>>>>>> the member to the group and it will hang around
>> until
>> > the
>> > > > >>>>> session
>> > > > >>>>>>>>>>>>> expires.
>> > > > >>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>> I wonder if the GC could still generate the member ID in
>> > > > >>>>>>>>>>>>>>> response to the first heartbeat, and put the member in a
>> > > > >>>>>>>>>>>>>>> special PENDING state with no assignments until the client
>> > > > >>>>>>>>>>>>>>> sends the next heartbeat, thus confirming it has received
>> > > > >>>>>>>>>>>>>>> the member ID. This would not be a protocol change at all,
>> > > > >>>>>>>>>>>>>>> just a change to the GC to keep a new member in the lobby
>> > > > >>>>>>>>>>>>>>> until it’s confirmed it knows its member ID.
>> > > > >>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>> Thanks,
>> > > > >>>>>>>>>>>>>>> Andrew
>> > > > >>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>> On 13 Aug 2024, at 15:59, TengYao Chi <
>> > > > >>> kiting...@gmail.com>
>> > > > >>>>>>>>>> wrote:
>> > > > >>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>> Hi Chia-Ping,
>> > > > >>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>> Thanks for review and suggestions.
>> > > > >>>>>>>>>>>>>>>> I have updated the content of KIP accordingly.
>> > > > >>>>>>>>>>>>>>>> Please take a look.
>> > > > >>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>> Best regards,
>> > > > >>>>>>>>>>>>>>>> TengYao
>> > > > >>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>> Chia-Ping Tsai <chia7...@apache.org> wrote on Tue, Aug 13, 2024 at 9:45 PM:
>> > > > >>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>>> hi TengYao
>> > > > >>>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>>> thanks for this KIP.
>> > > > >>>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>>> 1) Could you please describe the before/after behavior in
>> > > > >>>>>>>>>>>>>>>>> the "Proposed Changes" section? IIRC, the current RPC
>> > > > >>>>>>>>>>>>>>>>> allows the HB to carry a member ID generated by the
>> > > > >>>>>>>>>>>>>>>>> client, right? If the HB has no member ID, the server will
>> > > > >>>>>>>>>>>>>>>>> generate one and then return it. The new behavior will
>> > > > >>>>>>>>>>>>>>>>> enforce that the HB "must" have a member ID.
>> > > > >>>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>>> 2) Could you please write the version number explicitly in
>> > > > >>>>>>>>>>>>>>>>> the KIP?
>> > > > >>>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>>> 3) How does new client code handle the old HB? Does it
>> > > > >>>>>>>>>>>>>>>>> always generate the member ID on the client side even
>> > > > >>>>>>>>>>>>>>>>> though that is not enforced?
>> > > > >>>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>>> Best,
>> > > > >>>>>>>>>>>>>>>>> Chia-Ping
>> > > > >>>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>>> On 2024/08/13 06:20:42 TengYao Chi wrote:
>> > > > >>>>>>>>>>>>>>>>>> Hello everyone,
>> > > > >>>>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>>>> I would like to start a discussion thread on
>> > KIP-1082,
>> > > > >>>> which
>> > > > >>>>>>>>>>>> proposes
>> > > > >>>>>>>>>>>>>>>>>> enabling id generation for clients over the
>> > > > >>>>>>>>>>> ConsumerGroupHeartbeat
>> > > > >>>>>>>>>>>>> RPC.
>> > > > >>>>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>>>> Here is the KIP Link: KIP-1082
>> > > > >>>>>>>>>>>>>>>>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-1082%3A+Enable+ID+Generation+for+Clients+over+the+ConsumerGroupHeartbeat+RPC>
>> > > > >>>>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>>>> Please take a look and let me know what you
>> think,
>> > > > >> and I
>> > > > >>>>> would
>> > > > >>>>>>>>>>>>>>> appreciate
>> > > > >>>>>>>>>>>>>>>>>> any suggestions and feedback.
>> > > > >>>>>>>>>>>>>>>>>>
>> > > > >>>>>>>>>>>>>>>>>> Best regards,
>> > > > >>>>>>>>>>>>>>>>>> TengYao
>> > > > >>>>>>>>>>>>>>>>>>
