Thanks for the KIP, Boyang.

I guess I am missing something, but I am still learning more details
about the rebalance protocol, so maybe you can help me out?

Assume a client sends UNKNOWN_MEMBER_ID in its first joinGroup request.
The broker generates a `member.id` and sends it back via a
`MEMBER_ID_REQUIRED` error response. This response might never reach the
client, or the client might fail before it can send the second joinGroup
request. In that case, the client would need to start over with
UNKNOWN_MEMBER_ID in a new joinGroup request, and the broker would need
to generate yet another `member.id`.
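
To make the scenario concrete, here is a toy Python sketch of it. This is
not the actual coordinator code: `ToyCoordinator`, `join_group`, `simulate`
and the `pending` set are made-up names, and the sketch assumes the broker
remembers each generated `member.id` until the client uses it in a second
joinGroup request.

```python
import uuid

# Kafka represents an unknown member id as the empty string.
UNKNOWN_MEMBER_ID = ""


class ToyCoordinator:
    """Hypothetical stand-in for the group coordinator; not Kafka code."""

    def __init__(self):
        # Assumption: ids handed out but not yet used in a second join.
        self.pending = set()
        # Ids that completed the second joinGroup request.
        self.members = set()

    def join_group(self, member_id):
        if member_id == UNKNOWN_MEMBER_ID:
            # First join: generate an id and ask the client to rejoin with it.
            new_id = f"consumer-{uuid.uuid4()}"
            self.pending.add(new_id)
            return ("MEMBER_ID_REQUIRED", new_id)
        # Second join: the client presents the id it was given.
        self.pending.discard(member_id)
        self.members.add(member_id)
        return ("NONE", member_id)


def simulate(lost_responses):
    """Lose `lost_responses` first-join responses, then join successfully."""
    coordinator = ToyCoordinator()
    for _ in range(lost_responses):
        # The response is lost or the client restarts, so the generated id
        # never reaches the client and is never used again.
        coordinator.join_group(UNKNOWN_MEMBER_ID)
    _, member_id = coordinator.join_group(UNKNOWN_MEMBER_ID)
    coordinator.join_group(member_id)
    return coordinator


c = simulate(3)
print(len(c.pending), len(c.members))
```

Under that assumption, every lost response leaves one more unused id
behind on the broker, even though only one member ever joins.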

So it seems the problem is moved, but not resolved? The motivation of
the KIP is:

> The edge case is that if initial join group request keeps failing due to 
> connection timeout, or the consumer keeps restarting,

From my understanding, this KIP moves the issue from the first to the
second joinGroup request (or the broker's joinGroup response).

But maybe I am missing something. Can you help me out?


-Matthias


On 11/27/18 6:00 PM, Boyang Chen wrote:
> Thanks Stanislav and Jason for the suggestions!
> 
> 
>> Thanks for the KIP. Looks good overall. I think we will need to bump the
>> version of the JoinGroup protocol in order to indicate compatibility with
>> the new behavior. The coordinator needs to know when it is safe to assume
>> the client will handle the error code.
>>
>> Also, I was wondering if we could reuse the REBALANCE_IN_PROGRESS error
>> code. When the client sees this error code, it will take the memberId from
>> the response and rejoin. We'd still need the protocol bump since older
>> consumers do not have this logic.
> 
> I will add the join group protocol version change to the KIP. Meanwhile I
> feel for understandability it's better to define a separate error code
> since REBALANCE_IN_PROGRESS is not the actual cause of the returned error.
> 
>> One small question I have is now that we have one and a half round-trips
>> needed to join in a rebalance (1 full RT addition), is it worth it to
>> consider increasing the default value of `group.initial.rebalance.delay.ms`?
> I guess we could keep it for now. After KIP-345 and incremental cooperative
> rebalancing work we should be safe to deprecate
> `group.initial.rebalance.delay.ms`. Also one round trip shouldn't increase
> the latency too much IMO.
> 
> Best,
> Boyang
> ________________________________
> From: Stanislav Kozlovski <stanis...@confluent.io>
> Sent: Wednesday, November 28, 2018 2:32 AM
> To: dev@kafka.apache.org
> Subject: Re: [DISCUSS] KIP-394: Require member.id for initial join group 
> request
> 
> Hi Boyang,
> 
> The KIP looks very good.
> One small question I have is now that we have one and a half round-trips
> needed to join in a rebalance (1 full RT addition), is it worth it to
> consider increasing the default value of `group.initial.rebalance.delay.ms`?
> 
> Best,
> Stanislav
> 
> On Tue, Nov 27, 2018 at 5:39 PM Jason Gustafson <ja...@confluent.io> wrote:
> 
>> Hi Boyang,
>>
>> Thanks for the KIP. Looks good overall. I think we will need to bump the
>> version of the JoinGroup protocol in order to indicate compatibility with
>> the new behavior. The coordinator needs to know when it is safe to assume
>> the client will handle the error code.
>>
>> Also, I was wondering if we could reuse the REBALANCE_IN_PROGRESS error
>> code. When the client sees this error code, it will take the memberId from
>> the response and rejoin. We'd still need the protocol bump since older
>> consumers do not have this logic.
>>
>> Thanks,
>> Jason
>>
>> On Mon, Nov 26, 2018 at 5:47 PM Boyang Chen <bche...@outlook.com> wrote:
>>
>>> Hey friends,
>>>
>>>
>>> I would like to start a discussion thread for KIP-394 which is trying to
>>> mitigate broker cache bursting issue due to anonymous join group requests:
>>>
>>>
>>>
>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-394%3A+Require+member.id+for+initial+join+group+request
>>>
>>>
>>> Thanks!
>>>
>>> Boyang
>>>
>>
> 
> 
> --
> Best,
> Stanislav
> 
