Thanks for the explanation, Luke. That makes sense. best, Colin
On Thu, Dec 9, 2021, at 13:31, Guozhang Wang wrote: > Thanks Luke, in that case I'm +1 on this KIP. > > On Thu, Dec 9, 2021 at 1:46 AM Luke Chen <show...@gmail.com> wrote: > >> Hi Guozhang, >> >> Thanks for your comment. >> >> > we need to make sure the old-versioned leader would be able to ignore the >> new >> field during an upgrade e.g. without crashing. >> >> Yes, I understand. I'll be careful to make sure it won't crash the old >> versioned leader. >> But basically, it won't, because we appended the new field into the last of >> the ConsumerProtocolSubscription, which means, when read/deserialize the >> Subscription metadata, the old versioned leader will just read the head >> part of the data. >> >> Thanks for the reminder! >> >> Luke >> >> On Thu, Dec 9, 2021 at 4:00 AM Guozhang Wang <wangg...@gmail.com> wrote: >> >> > Hi Luke, >> > >> > Thanks for the KIP. >> > >> > One thing I'd like to double check is that, since the >> > ConsumerProtocolSubscription is not auto generated from the json file, we >> > need to make sure the old-versioned leader would be able to ignore the >> new >> > field during an upgrade e.g. without crashing. Other than that, the KIP >> > lgtm. >> > >> > Guozhang >> > >> > On Tue, Dec 7, 2021 at 6:16 PM Luke Chen <show...@gmail.com> wrote: >> > >> > > Hi Colin, >> > > >> > > I'm not quite sure if I understand your thoughts correctly. >> > > If I was wrong, please let me know. >> > > >> > > Also, I'm not quite sure how I could lock this feature to a new IBP >> > > version. >> > > I saw "KIP-584: Versioning scheme for features" is still under >> > development. >> > > Not sure if I need to lock the IBP version, how should I do? >> > > >> > > Thank you. >> > > Luke >> > > >> > > On Tue, Dec 7, 2021 at 9:41 PM Luke Chen <show...@gmail.com> wrote: >> > > >> > > > Hi Colin, >> > > > >> > > > Thanks for your comments. I've updated the KIP to mention about the >> KIP >> > > > won't affect current broker side behavior. >> > > > >> > > > > One scenario that we need to consider is what happens during a >> > rolling >> > > > upgrade. If the coordinator moves back and forth between brokers with >> > > > different IBPs, it seems that the same epoch numbers could be reused >> > for >> > > a >> > > > group, if things are done in the obvious manner (old IBP = don't read >> > or >> > > > write epoch, new IBP = do) >> > > > >> > > > I think this KIP doesn't care about the group epoch number at all. >> The >> > > > subscription metadata is passed from each member to group >> coordinator, >> > > and >> > > > then the group coordinator pass all of them back to the consumer >> lead. >> > So >> > > > even if the epoch number is reused in a group, it should be fine. On >> > the >> > > > other hand, the group coordinator will have no idea if the join group >> > > > request sent from consumer containing the new subscription >> "generation" >> > > > field or not, because group coordinator won't deserialize the >> metadata. >> > > > >> > > > I've added also added them into the KIP. >> > > > >> > > > Thank you. >> > > > Luke >> > > > >> > > > On Mon, Dec 6, 2021 at 10:39 AM Colin McCabe <cmcc...@apache.org> >> > wrote: >> > > > >> > > >> Hi Luke, >> > > >> >> > > >> Thanks for the explanation. >> > > >> >> > > >> I don't see any description of how the broker decides to use the new >> > > >> version of ConsumerProtocolSubscription or not. This probably needs >> to >> > > be >> > > >> locked to a new IBP version. >> > > >> >> > > >> One scenario that we need to consider is what happens during a >> rolling >> > > >> upgrade. If the coordinator moves back and forth between brokers >> with >> > > >> different IBPs, it seems that the same epoch numbers could be reused >> > > for a >> > > >> group, if things are done in the obvious manner (old IBP = don't >> read >> > or >> > > >> write epoch, new IBP = do). >> > > >> >> > > >> best, >> > > >> Colin >> > > >> >> > > >> >> > > >> On Fri, Dec 3, 2021, at 18:46, Luke Chen wrote: >> > > >> > Hi Colin, >> > > >> > Thanks for your comment. >> > > >> > >> > > >> >> How are we going to avoid the situation where the broker >> restarts, >> > > and >> > > >> > the same generation number is reused? >> > > >> > >> > > >> > Actually, this KIP doesn't have anything to do with the brokers. >> The >> > > >> > "generation" field I added, is in the subscription metadata, which >> > > will >> > > >> not >> > > >> > be deserialized by brokers. The metadata is only deserialized by >> > > >> consumer >> > > >> > lead. And for the consumer lead, the only thing the lead cared >> > about, >> > > is >> > > >> > the highest generation of the ownedPartitions among all the >> > consumers. >> > > >> With >> > > >> > the highest generation of the ownedPartitions, the consumer lead >> can >> > > >> > distribute the partitions as sticky as possible, and most >> > importantly, >> > > >> > without errors. >> > > >> > >> > > >> > That is, after this KIP, if the broker restarts, and the same >> > > generation >> > > >> > number is reused, it won't break current rebalance behavior. But >> > it'll >> > > >> help >> > > >> > the consumer lead do the sticky assignments correctly. >> > > >> > >> > > >> > Thank you. >> > > >> > Luke >> > > >> > >> > > >> > On Fri, Dec 3, 2021 at 6:30 AM Colin McCabe <co...@cmccabe.xyz> >> > > wrote: >> > > >> > >> > > >> >> How are we going to avoid the situation where the broker >> restarts, >> > > and >> > > >> the >> > > >> >> same generation number is reused? >> > > >> >> >> > > >> >> best, >> > > >> >> Colin >> > > >> >> >> > > >> >> On Tue, Nov 30, 2021, at 16:36, Luke Chen wrote: >> > > >> >> > Hi all, >> > > >> >> > >> > > >> >> > I'd like to start the vote for KIP-792: Add "generation" field >> > into >> > > >> >> > consumer protocol. >> > > >> >> > >> > > >> >> > The goal of this KIP is to allow the assignor/consumer >> > coordinator >> > > to >> > > >> >> have >> > > >> >> > a way to identify the out-of-date members/assignments, to avoid >> > > >> rebalance >> > > >> >> > stuck issues in current protocol. >> > > >> >> > >> > > >> >> > Detailed description can be found here: >> > > >> >> > >> > > >> >> >> > > >> >> > > >> > >> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191336614 >> > > >> >> > >> > > >> >> > Any feedback is welcome. >> > > >> >> > >> > > >> >> > Thank you. >> > > >> >> > Luke >> > > >> >> >> > > >> >> > > > >> > > >> > >> > >> > -- >> > -- Guozhang >> > >> > > > -- > -- Guozhang