Hi Luke,

Thanks for the explanation.

I don't see any description of how the broker decides whether to use the new 
version of ConsumerProtocolSubscription. This probably needs to be gated on a 
new IBP version.

One scenario that we need to consider is what happens during a rolling upgrade. 
If the coordinator moves back and forth between brokers with different IBPs, it 
seems that the same epoch numbers could be reused for a group, if things are 
done in the obvious manner (old IBP = don't read or write epoch, new IBP = do).
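
To make the scenario concrete, here is a rough sketch of how the reuse could 
happen. It is only an illustration under assumptions: StoredGroup, Coordinator 
and simulate are hypothetical names, not Kafka classes, and it assumes a 
new-IBP broker falls back to 0 when it finds no stored epoch.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StoredGroup:
    """Hypothetical persisted group state; epoch is None if the writer recorded none."""
    epoch: Optional[int] = None


class Coordinator:
    """Hypothetical coordinator; supports_epoch stands in for 'is on the new IBP'."""

    def __init__(self, supports_epoch: bool):
        self.supports_epoch = supports_epoch

    def rebalance(self, stored: StoredGroup) -> StoredGroup:
        if not self.supports_epoch:
            # Old IBP: neither reads nor writes the epoch, so the field is dropped.
            return StoredGroup(epoch=None)
        # New IBP: read the epoch (assumed to default to 0 if absent), bump it, write it.
        previous = stored.epoch if stored.epoch is not None else 0
        return StoredGroup(epoch=previous + 1)


def simulate(ibp_sequence: List[bool]) -> List[Optional[int]]:
    """Run one rebalance per coordinator move and record the epoch after each."""
    state = StoredGroup()
    seen = []
    for supports_epoch in ibp_sequence:
        state = Coordinator(supports_epoch).rebalance(state)
        seen.append(state.epoch)
    return seen


# new, new, old, new, new: the coordinator bounces through one old-IBP broker.
print(simulate([True, True, False, True, True]))  # [1, 2, None, 1, 2]
```

Under these assumptions, epochs 1 and 2 are handed out twice for the same 
group once the coordinator has passed through an old-IBP broker.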

best,
Colin


On Fri, Dec 3, 2021, at 18:46, Luke Chen wrote:
> Hi Colin,
> Thanks for your comment.
>
>> How are we going to avoid the situation where the broker restarts, and
>> the same generation number is reused?
>
> Actually, this KIP doesn't have anything to do with the brokers. The
> "generation" field I added is in the subscription metadata, which is not
> deserialized by brokers. The metadata is only deserialized by the consumer
> leader. And the only thing the consumer leader cares about is the highest
> generation of the ownedPartitions among all the consumers. With the highest
> generation of the ownedPartitions, the consumer leader can distribute the
> partitions as stickily as possible and, most importantly, without errors.
>
> That is, after this KIP, if the broker restarts and the same generation
> number is reused, it won't break the current rebalance behavior. The new
> field just helps the consumer leader do the sticky assignments correctly.
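
For reference, the rule described above (keep the claim with the highest 
generation for each partition) could be sketched roughly as follows. This is 
not the actual assignor code; resolve_owners and the input shape are 
hypothetical, and ties are assumed to go to whichever member is seen first.

```python
from typing import Dict, List, Tuple


def resolve_owners(subscriptions: Dict[str, Tuple[int, List[str]]]) -> Dict[str, str]:
    """Map each partition to the member that claims it with the highest generation.

    subscriptions maps member id -> (generation, owned partitions). This shape is
    an assumption for illustration, not the real subscription metadata format.
    """
    best: Dict[str, Tuple[int, str]] = {}
    for member, (generation, owned) in subscriptions.items():
        for partition in owned:
            # Keep a claim only if it is strictly newer than the one recorded so far.
            if partition not in best or generation > best[partition][0]:
                best[partition] = (generation, member)
    return {partition: member for partition, (_, member) in best.items()}


claims = {
    "consumer-a": (5, ["t-0", "t-1"]),
    "consumer-b": (4, ["t-1", "t-2"]),  # stale claim on t-1 from an older generation
}
owners = resolve_owners(claims)
print(owners)  # {'t-0': 'consumer-a', 't-1': 'consumer-a', 't-2': 'consumer-b'}
```

Only the relative order of generations across members matters here, which is 
the point made above about the leader needing just the highest generation.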
>
> Thank you.
> Luke
>
> On Fri, Dec 3, 2021 at 6:30 AM Colin McCabe <co...@cmccabe.xyz> wrote:
>
>> How are we going to avoid the situation where the broker restarts, and the
>> same generation number is reused?
>>
>> best,
>> Colin
>>
>> On Tue, Nov 30, 2021, at 16:36, Luke Chen wrote:
>> > Hi all,
>> >
>> > I'd like to start the vote for KIP-792: Add "generation" field into
>> > consumer protocol.
>> >
>> > The goal of this KIP is to give the assignor/consumer coordinator a way
>> > to identify out-of-date members/assignments, to avoid stuck-rebalance
>> > issues in the current protocol.
>> >
>> > Detailed description can be found here:
>> >
>> > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191336614
>> >
>> > Any feedback is welcome.
>> >
>> > Thank you.
>> > Luke
>>
