Thanks Sagar, I made a quick read on your doc, and am also leaning towards having a separate API for connect. Would also love to hear @Randall Hauch <rha...@gmail.com> @Konstantin 's opinion about that.
One thing is that for Kafka Connect, their compatibility story, especially downgrading is more flexible than consumer (you can find that from https://cwiki.apache.org/confluence/display/KAFKA/KIP-415%3A+Incremental+Cooperative+Rebalancing+in+Kafka+Connect) so maybe we need to think a bit differently for the compatibility. Guozhang On Wed, Aug 3, 2022 at 5:39 AM Sagar <sagarmeansoc...@gmail.com> wrote: > Hi Guozhang/David, > > I created a confluence page to discuss how Connect would need to change > based on the new rebalance protocol. Here's the page: > > https://cwiki.apache.org/confluence/display/KAFKA/%5BDRAFT%5DIntegrating+Kafka+Connect+With+New+Consumer+Rebalance+Protocol > > It's also pretty longish and I have tried to keep a format similar to > KIP-848. Let me know what you think. Also, do you think this should be > moved to a separate discussion thread or is this one fine? > > Thanks! > Sagar. > > On Tue, Jul 26, 2022 at 7:37 AM Sagar <sagarmeansoc...@gmail.com> wrote: > > > Hello Guozhang, > > > > Thank you so much for the doc on Kafka Streams. Sure, I would do the > > analysis and come up with such a document. > > > > Thanks! > > Sagar. > > > > On Tue, Jul 26, 2022 at 4:47 AM Guozhang Wang <wangg...@gmail.com> > wrote: > > > >> Hello Sagar, > >> > >> It would be great if you could come back with some analysis on how to > >> implement the Connect side integration with the new protocol; so far > >> besides leveraging on the new "protocol type" we did not yet think > through > >> the Connect side implementations. For Streams here's a draft of > >> integration > >> plan: > >> > >> > https://docs.google.com/document/d/17PNz2sGoIvGyIzz8vLyJTJTU2rqnD_D9uHJnH9XARjU/edit#heading=h.pdgirmi57dvn > >> just FYI for your analysis on Connect. > >> > >> On Tue, Jul 19, 2022 at 10:48 PM Sagar <sagarmeansoc...@gmail.com> > wrote: > >> > >> > Hi David, > >> > > >> > Thank you for your response. The reason I thought connect can also fit > >> into > >> > this new scheme is that even today the connect uses a > WorkerCoordinator > >> > extending from AbstractCoordinator to empower rebalances of > >> > tasks/connectors. The WorkerCoordinator sets the protocolType() to > >> connect > >> > and uses the metadata() method by plumbing into > >> JoinGroupRequestProtocol. > >> > > >> > I think the changes to support connect would be similar at a high > level > >> to > >> > the changes in streams mainly because of the Client side assignors > being > >> > used in both. At an implementation level, we might need to make a lot > of > >> > changes to get onto this new assignment protocol like enhancing the > >> > JoinGroup request/response and SyncGroup and using > >> ConsumerGroupHeartbeat > >> > API etc again on similar lines to streams (or there might be > >> deviations). I > >> > would try to perform a detailed analysis of the same and we can have > a > >> > separate discussion thread for that as that would derail this > discussion > >> > thread. Let me know if that sounds good to you. > >> > > >> > Thanks! > >> > Sagar. > >> > > >> > > >> > > >> > On Fri, Jul 15, 2022 at 5:47 PM David Jacot > <dja...@confluent.io.invalid > >> > > >> > wrote: > >> > > >> > > Hi Sagar, > >> > > > >> > > Thanks for your comments. > >> > > > >> > > 1) Yes. That refers to `Assignment#error`. Sure, I can mention it. > >> > > > >> > > 2) The idea is to transition C from his current assignment to his > >> > > target assignment when he can move to epoch 3. When that happens, > the > >> > > member assignment is updated and persisted with all its assigned > >> > > partitions even if they are not all revoked yet. In other words, the > >> > > member assignment becomes the target assignment. This is basically > an > >> > > optimization to avoid having to write all the changes to the log. > The > >> > > examples are based on the persisted state so I understand the > >> > > confusion. Let me see if I can improve this in the description. > >> > > > >> > > 3) Regarding Connect, it could reuse the protocol with a client side > >> > > assignor if it fits in the protocol. The assignment is about > >> > > topicid-partitions + metadata, could Connect fit into this? > >> > > > >> > > Best, > >> > > David > >> > > > >> > > On Fri, Jul 15, 2022 at 1:55 PM Sagar <sagarmeansoc...@gmail.com> > >> wrote: > >> > > > > >> > > > Hi David, > >> > > > > >> > > > Thanks for the KIP. I just had minor observations: > >> > > > > >> > > > 1) In the Assignment Error section in Client Side mode Assignment > >> > > process, > >> > > > you mentioned => `In this case, the client side assignor can > return > >> an > >> > > > error to the group coordinator`. In this case are you referring to > >> the > >> > > > Assignor returning an AssignmentError that's listed down towards > the > >> > end? > >> > > > If yes, do you think it would make sense to mention this > explicitly > >> > here? > >> > > > > >> > > > 2) In the Case Studies section, I have a slight confusion, not > sure > >> if > >> > > > others have the same. Consider this step: > >> > > > > >> > > > When B heartbeats, the group coordinator transitions him to epoch > 3 > >> > > because > >> > > > B has no partitions to revoke. It persists the change and reply. > >> > > > > >> > > > - Group (epoch=3) > >> > > > - A > >> > > > - B > >> > > > - C > >> > > > - Target Assignment (epoch=3) > >> > > > - A - partitions=[foo-0] > >> > > > - B - partitions=[foo-2] > >> > > > - C - partitions=[foo-1] > >> > > > - Member Assignment > >> > > > - A - epoch=2, partitions=[foo-0, foo-1] > >> > > > - B - epoch=3, partitions=[foo-2] > >> > > > - C - epoch=3, partitions=[foo-1] > >> > > > > >> > > > When C heartbeats, it transitions to epoch 3 but cannot get foo-1 > >> yet. > >> > > > > >> > > > Here,it's mentioned that member C can't get the foo-1 partition > yet, > >> > but > >> > > > based on the description above, it seems it already has it. Do you > >> > think > >> > > it > >> > > > would be better to remove it and populate it only when it actually > >> gets > >> > > it? > >> > > > I see this in a lot of other places, so have I understood it > >> > incorrectly > >> > > ? > >> > > > > >> > > > > >> > > > Regarding connect , it might be out of scope of this discussion, > but > >> > from > >> > > > what I understood it would probably be running in client side > >> assignor > >> > > mode > >> > > > even on the new rebalance protocol as it has its own Custom > >> > > Assignors(Eager > >> > > > and IncrementalCooperative). > >> > > > > >> > > > Thanks! > >> > > > > >> > > > Sagar. > >> > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> > > > > >> > > > On Fri, Jul 15, 2022 at 5:00 PM David Jacot > >> > <dja...@confluent.io.invalid > >> > > > > >> > > > wrote: > >> > > > > >> > > > > Thanks Hector! Our goal is to move forward with specialized API > >> > > > > instead of relying on one generic API. For Connect, we can apply > >> the > >> > > > > exact same pattern and reuse/share the core implementation on > the > >> > > > > server side. For the schema registry, I think that we should > >> consider > >> > > > > having a tailored API to do simple membership/leader election. > >> > > > > > >> > > > > Best, > >> > > > > David > >> > > > > > >> > > > > On Fri, Jul 15, 2022 at 10:22 AM Ismael Juma <ism...@juma.me.uk > > > >> > > wrote: > >> > > > > > > >> > > > > > Three quick comments: > >> > > > > > > >> > > > > > 1. Regarding java.util.regex.Pattern vs > >> com.google.re2j.Pattern, we > >> > > > > should > >> > > > > > document the differences in more detail before deciding one > way > >> or > >> > > > > another. > >> > > > > > That said, if people pass java.util.regex.Pattern, they expect > >> > their > >> > > > > > semantics to be honored. If we are doing something different, > >> then > >> > we > >> > > > > > should consider adding an overload with our own Pattern class > (I > >> > > don't > >> > > > > > think we'd want to expose re2j's at this point). > >> > > > > > 2. Regarding topic ids, any major new protocol should > integrate > >> > fully > >> > > > > with > >> > > > > > it and should handle the topic recreation case correctly. > That's > >> > the > >> > > main > >> > > > > > part we need to handle. I agree with David that we'd want to > add > >> > > topic > >> > > > > ids > >> > > > > > to the relevant protocols that don't have it yet and that we > can > >> > > probably > >> > > > > > focus on the internals versus adding new APIs to the Java > >> Consumer > >> > > > > (unless > >> > > > > > we find that adding new APIs is required for reasonable > >> semantics). > >> > > > > > 3. I am still not sure about the coordinator storing the > >> configs. > >> > > It's > >> > > > > > powerful for configs to be centralized in the metadata log for > >> > > various > >> > > > > > reasons (auditability, visibility, consistency, etc.). > >> Similarly, I > >> > > am > >> > > > > not > >> > > > > > sure about automatically deleting configs in a way that they > >> cannot > >> > > be > >> > > > > > recovered. A good property for modern systems is to minimize > the > >> > > number > >> > > > > of > >> > > > > > unrecoverable data loss scenarios. > >> > > > > > > >> > > > > > Ismael > >> > > > > > > >> > > > > > On Wed, Jul 13, 2022 at 3:47 PM David Jacot > >> > > <dja...@confluent.io.invalid > >> > > > > > > >> > > > > > wrote: > >> > > > > > > >> > > > > > > Thanks Guozhang. My answers are below: > >> > > > > > > > >> > > > > > > > 1) the migration path, especially the last step when > clients > >> > > flip the > >> > > > > > > flag > >> > > > > > > > to enable the new protocol, in which we would have a > window > >> > where > >> > > > > both > >> > > > > > > new > >> > > > > > > > protocols / rpcs and old protocols / rpcs are used by > >> members > >> > of > >> > > the > >> > > > > same > >> > > > > > > > group. How the coordinator could "mimic" the old behavior > >> while > >> > > > > using the > >> > > > > > > > new protocol is something we need to present about. > >> > > > > > > > >> > > > > > > Noted. I just published a new version of KIP which includes > >> more > >> > > > > > > details about this. See the "Supporting Online Consumer > Group > >> > > Upgrade" > >> > > > > > > and the "Compatibility, Deprecation, and Migration Plan". I > >> think > >> > > that > >> > > > > > > I have to think through a few cases now but the overall idea > >> and > >> > > > > > > mechanism should be understandable. > >> > > > > > > > >> > > > > > > > 2) the usage of topic ids. So far as KIP-516 the topic ids > >> are > >> > > only > >> > > > > used > >> > > > > > > as > >> > > > > > > > part of RPCs and admin client, but they are not exposed > via > >> any > >> > > > > public > >> > > > > > > APIs > >> > > > > > > > to consumers yet. I think the question is, first should we > >> let > >> > > the > >> > > > > > > consumer > >> > > > > > > > client to be maintaining the names -> ids mapping itself > to > >> > fully > >> > > > > > > leverage > >> > > > > > > > on all the augmented existing RPCs and the new RPCs with > the > >> > > topic > >> > > > > ids; > >> > > > > > > and > >> > > > > > > > secondly, should we ever consider exposing the topic ids > in > >> the > >> > > > > consumer > >> > > > > > > > public APIs as well (both subscribe/assign, as well as in > >> the > >> > > > > rebalance > >> > > > > > > > listener for cases like topic deletion-and-recreation). > >> > > > > > > > >> > > > > > > a) Assuming that we would include converting all the offsets > >> > > related > >> > > > > > > RPCs to using topic ids in this KIP, the consumer would be > >> able > >> > to > >> > > > > > > fully operate with topic ids. That being said, it still has > to > >> > > provide > >> > > > > > > the topics names in various APIs so having a mapping in the > >> > > consumer > >> > > > > > > seems inevitable to me. > >> > > > > > > b) I don't have a strong opinion on this. Here I wonder if > >> this > >> > > goes > >> > > > > > > beyond the scope of this KIP. I would rather focus on the > >> > internals > >> > > > > > > here and we can consider this separately if we see value in > >> doing > >> > > it. > >> > > > > > > > >> > > > > > > Coming back to Ismael's point about using topic ids in the > >> > > > > > > ConsumerGroupHeartbeatRequest, I think that there is one > >> > advantage > >> > > in > >> > > > > > > favour of it. The consumer will have the opportunity to > >> validate > >> > > that > >> > > > > > > the topics exists before passing them into the group > rebalance > >> > > > > > > protocol. Obviously, the coordinator will also notice it but > >> it > >> > > does > >> > > > > > > not really have a way to reject an invalid topic in the > >> response. > >> > > > > > > > >> > > > > > > > I'm agreeing with David on all other minor questions > except > >> for > >> > > the > >> > > > > > > > `subscribe(Pattern)` question: personally I think it's not > >> > > necessary > >> > > > > to > >> > > > > > > > deprecate the subscribe API with Pattern, but instead we > >> still > >> > > use > >> > > > > > > Pattern > >> > > > > > > > while just documenting that our subscription may be > >> rejected by > >> > > the > >> > > > > > > server. > >> > > > > > > > Since the incompatible case is a very rare scenario I felt > >> > using > >> > > an > >> > > > > > > > overloaded `String` based subscription may be more > >> vulnerable > >> > to > >> > > > > various > >> > > > > > > > invalid regexes. > >> > > > > > > > >> > > > > > > That could work. I have to look at the differences between > the > >> > two > >> > > > > > > engines to better understand the potential issues. My > >> > > understanding is > >> > > > > > > that would work for all the basic regular expressions. The > >> > > differences > >> > > > > > > between the two are mainly about the various character > >> classes. I > >> > > > > > > wonder what other people think about this. > >> > > > > > > > >> > > > > > > Best, > >> > > > > > > David > >> > > > > > > > >> > > > > > > On Tue, Jul 12, 2022 at 11:28 PM Guozhang Wang < > >> > wangg...@gmail.com > >> > > > > >> > > > > wrote: > >> > > > > > > > > >> > > > > > > > Thanks David! I think on the high level there are two meta > >> > > points we > >> > > > > need > >> > > > > > > > to concretize a bit more: > >> > > > > > > > > >> > > > > > > > 1) the migration path, especially the last step when > clients > >> > > flip the > >> > > > > > > flag > >> > > > > > > > to enable the new protocol, in which we would have a > window > >> > where > >> > > > > both > >> > > > > > > new > >> > > > > > > > protocols / rpcs and old protocols / rpcs are used by > >> members > >> > of > >> > > the > >> > > > > same > >> > > > > > > > group. How the coordinator could "mimic" the old behavior > >> while > >> > > > > using the > >> > > > > > > > new protocol is something we need to present about. > >> > > > > > > > 2) the usage of topic ids. So far as KIP-516 the topic ids > >> are > >> > > only > >> > > > > used > >> > > > > > > as > >> > > > > > > > part of RPCs and admin client, but they are not exposed > via > >> any > >> > > > > public > >> > > > > > > APIs > >> > > > > > > > to consumers yet. I think the question is, first should we > >> let > >> > > the > >> > > > > > > consumer > >> > > > > > > > client to be maintaining the names -> ids mapping itself > to > >> > fully > >> > > > > > > leverage > >> > > > > > > > on all the augmented existing RPCs and the new RPCs with > the > >> > > topic > >> > > > > ids; > >> > > > > > > and > >> > > > > > > > secondly, should we ever consider exposing the topic ids > in > >> the > >> > > > > consumer > >> > > > > > > > public APIs as well (both subscribe/assign, as well as in > >> the > >> > > > > rebalance > >> > > > > > > > listener for cases like topic deletion-and-recreation). > >> > > > > > > > > >> > > > > > > > I'm agreeing with David on all other minor questions > except > >> for > >> > > the > >> > > > > > > > `subscribe(Pattern)` question: personally I think it's not > >> > > necessary > >> > > > > to > >> > > > > > > > deprecate the subscribe API with Pattern, but instead we > >> still > >> > > use > >> > > > > > > Pattern > >> > > > > > > > while just documenting that our subscription may be > >> rejected by > >> > > the > >> > > > > > > server. > >> > > > > > > > Since the incompatible case is a very rare scenario I felt > >> > using > >> > > an > >> > > > > > > > overloaded `String` based subscription may be more > >> vulnerable > >> > to > >> > > > > various > >> > > > > > > > invalid regexes. > >> > > > > > > > > >> > > > > > > > > >> > > > > > > > Guozhang > >> > > > > > > > > >> > > > > > > > On Tue, Jul 12, 2022 at 5:23 AM David Jacot > >> > > > > <dja...@confluent.io.invalid > >> > > > > > > > > >> > > > > > > > wrote: > >> > > > > > > > > >> > > > > > > > > Hi Ismael, > >> > > > > > > > > > >> > > > > > > > > Thanks for your feedback. Let me answer your questions > >> > inline. > >> > > > > > > > > > >> > > > > > > > > > 1. I think it's premature to talk about target > versions > >> for > >> > > > > > > deprecation > >> > > > > > > > > and > >> > > > > > > > > > removal of the existing group protocol. Unlike KRaft, > >> this > >> > > > > affects a > >> > > > > > > core > >> > > > > > > > > > client protocol and hence deprecation/removal will be > >> > heavily > >> > > > > > > dependent > >> > > > > > > > > on > >> > > > > > > > > > how quickly applications migrate to the new protocol. > >> > > > > > > > > > >> > > > > > > > > That makes sense. I will remove it. > >> > > > > > > > > > >> > > > > > > > > > 2. The KIP says we intend to release this in 4.x, but > it > >> > > wasn't > >> > > > > made > >> > > > > > > > > clear > >> > > > > > > > > > why. If we added that as a way to estimate when we'd > >> > > deprecate > >> > > > > and > >> > > > > > > remove > >> > > > > > > > > > the group protocol, I also suggest removing this part. > >> > > > > > > > > > >> > > > > > > > > Let me explain my reasoning. As explained, I plan to > >> rewrite > >> > > the > >> > > > > group > >> > > > > > > > > coordinator in Java while we implement the new protocol. > >> This > >> > > means > >> > > > > > > > > that the internals will be slightly different (e.g. > >> threading > >> > > > > model). > >> > > > > > > > > Therefore, I wanted to tighten the switch from the old > >> group > >> > > > > > > > > coordinator to the new group coordinator to a major > >> release. > >> > > The > >> > > > > > > > > alternative would be to use a flag to do the switch > >> instead > >> > of > >> > > > > relying > >> > > > > > > > > on the software upgrade. > >> > > > > > > > > > >> > > > > > > > > > 3. We need to flesh out the details of the migration > >> story. > >> > > It > >> > > > > sounds > >> > > > > > > > > like > >> > > > > > > > > > we're saying we will support online migrations. Is > that > >> > > correct? > >> > > > > We > >> > > > > > > > > should > >> > > > > > > > > > explain this in detail. It could also be done as a > >> separate > >> > > KIP, > >> > > > > if > >> > > > > > > it's > >> > > > > > > > > > easier. > >> > > > > > > > > > >> > > > > > > > > Yes, we will support online migrations for the group. > That > >> > > means > >> > > > > that > >> > > > > > > > > a group using the old protocol will be able to switch to > >> the > >> > > new > >> > > > > > > > > protocol. > >> > > > > > > > > > >> > > > > > > > > Let me briefly explain how that will work though. It is > >> > > basically a > >> > > > > > > > > four step process: > >> > > > > > > > > > >> > > > > > > > > 1. The cluster must be upgraded or rolled to a software > >> > > supporting > >> > > > > the > >> > > > > > > > > new group coordinator. Both the old and the new > >> coordinator > >> > > will > >> > > > > > > > > support the old protocol and rely on the same persisted > >> > > metadata so > >> > > > > > > > > they can work together. This point is an offline > >> migration. > >> > We > >> > > > > cannot > >> > > > > > > > > do this one live because it would require shutting down > >> the > >> > > current > >> > > > > > > > > coordinator and starting up the new one and that would > >> cause > >> > > > > > > > > unavailabilities. > >> > > > > > > > > 2. The cluster's metadata version/IBP must be upgraded > to > >> X > >> > in > >> > > > > order > >> > > > > > > > > to enable the new protocol. This cannot be done before > 1) > >> is > >> > > > > > > > > terminated because the old coordinator doesn't support > the > >> > new > >> > > > > > > > > protocol. > >> > > > > > > > > 3. The consumers must be upgraded to a version > supporting > >> the > >> > > > > online > >> > > > > > > > > migration (must have KIP-792). If the consumer is > already > >> > > there. > >> > > > > > > > > Nothing must be done at this point. > >> > > > > > > > > 4. The consumers must be rolled with the feature flag > >> turned > >> > > on. > >> > > > > The > >> > > > > > > > > consumer group is automatically converted when the first > >> > > consumer > >> > > > > > > > > using the new protocol joins the group. While the > members > >> > > using the > >> > > > > > > > > old protocol are being upgraded, the old protocol is > >> proxied > >> > > into > >> > > > > the > >> > > > > > > > > new one. > >> > > > > > > > > > >> > > > > > > > > Let me clarify all of this in the KIP. > >> > > > > > > > > > >> > > > > > > > > > 4. I am happy that we are pushing the pattern > >> subscriptions > >> > > to > >> > > > > the > >> > > > > > > > > server, > >> > > > > > > > > > but it seems like there could be some tricky > >> compatibility > >> > > > > issues. > >> > > > > > > Will > >> > > > > > > > > we > >> > > > > > > > > > have a mechanism for users to detect that they need to > >> > update > >> > > > > their > >> > > > > > > regex > >> > > > > > > > > > before switching to the new protocol? > >> > > > > > > > > > >> > > > > > > > > I think that I am a bit more optimistic than you on this > >> > > point. I > >> > > > > > > > > believe that the majority of the cases are simple > regexes > >> > which > >> > > > > should > >> > > > > > > > > work with the new engine. The coordinator will verify > the > >> > regex > >> > > > > anyway > >> > > > > > > > > and reject the consumer if the regex is not valid. > Coming > >> > back > >> > > to > >> > > > > the > >> > > > > > > > > migration path, in the worst case, the first upgraded > >> > consumer > >> > > > > joining > >> > > > > > > > > the group will be rejected. This should be used as the > >> last > >> > > > > defence, I > >> > > > > > > > > would say. > >> > > > > > > > > > >> > > > > > > > > One way for customers to validate their regex before > >> > upgrading > >> > > > > their > >> > > > > > > > > prod would be to test them with another group. For > >> instance, > >> > > that > >> > > > > > > > > could be done in a pre-prod environment. Another way > >> would be > >> > > to > >> > > > > > > > > extend the consumer-group tool to provide a regex > >> validation > >> > > > > > > > > mechanism. Would this be enough in your opinion? > >> > > > > > > > > > >> > > > > > > > > > 5. Related to the last question, will the Java client > >> allow > >> > > the > >> > > > > > > users to > >> > > > > > > > > > stick with the current regex engine for compatibility > >> > > reasons? > >> > > > > For > >> > > > > > > > > example, > >> > > > > > > > > > it may be handy to keep using client based regex at > >> first > >> > to > >> > > keep > >> > > > > > > > > > migrations simple and then migrate to server based > >> regexes > >> > > as a > >> > > > > > > second > >> > > > > > > > > step. > >> > > > > > > > > > >> > > > > > > > > I understand your point but I am concerned that this > would > >> > > allow > >> > > > > users > >> > > > > > > > > to actually stay in this mode. That would go against our > >> goal > >> > > of > >> > > > > > > > > simplifying the client because we would have to continue > >> > > monitoring > >> > > > > > > > > the metadata on the client side. I would rather not do > >> this. > >> > > > > > > > > > >> > > > > > > > > > 6. When we say that the group coordinator will be > >> > > responsible for > >> > > > > > > storing > >> > > > > > > > > > the configurations and that the configurations will be > >> > > deleted > >> > > > > when > >> > > > > > > the > >> > > > > > > > > > group is deleted. Will a transition to DEAD trigger > >> > deletion > >> > > of > >> > > > > > > > > > configurations? > >> > > > > > > > > > >> > > > > > > > > That's right. The configurations will be deleted when > the > >> > > group is > >> > > > > > > > > deleted. They go together. > >> > > > > > > > > > >> > > > > > > > > > 7. Will the choice to store the configs in the group > >> > > coordinator > >> > > > > > > make it > >> > > > > > > > > > harder to list all cluster configs and their values? > >> > > > > > > > > > >> > > > > > > > > I don't think so. The group configurations are overrides > >> of > >> > > cluster > >> > > > > > > > > configs. If you want to know all the overrides though, > you > >> > > would > >> > > > > have > >> > > > > > > > > to ask all the group coordinators. You cannot rely on > the > >> > > metadata > >> > > > > log > >> > > > > > > > > for instance. > >> > > > > > > > > > >> > > > > > > > > > 8. How would someone configure a group before starting > >> the > >> > > > > consumers? > >> > > > > > > > > Have > >> > > > > > > > > > we considered allowing the explicit creation of > groups? > >> > > > > > > Alternatively, > >> > > > > > > > > the > >> > > > > > > > > > configs could be decoupled from the group lifecycle. > >> > > > > > > > > > >> > > > > > > > > Yes. The group will be automatically created in this > case. > >> > > However, > >> > > > > > > > > the configs will be lost after the retention period of > the > >> > > group > >> > > > > > > > > passes. > >> > > > > > > > > > >> > > > > > > > > > 9. Will the Consumer.subscribe method for the Java > >> client > >> > > still > >> > > > > take > >> > > > > > > a > >> > > > > > > > > > `java.util.regex.Pattern` of do we have to introduce > an > >> > > overload? > >> > > > > > > > > > >> > > > > > > > > That's a very group question. I forgot about that one. > As > >> the > >> > > > > > > > > `java.util.regex.Pattern` is not fully compatible with > the > >> > > engine > >> > > > > that > >> > > > > > > > > we plan to use, it might be better to deprecate it and > >> use an > >> > > > > overload > >> > > > > > > > > which takes a string. We would rely on the server side > >> > > validation. > >> > > > > > > > > During the migration, I think that we could still try to > >> > > toString > >> > > > > the > >> > > > > > > > > regex and use it. That should work, I think, in the > >> majority > >> > > of the > >> > > > > > > > > cases. > >> > > > > > > > > > >> > > > > > > > > > 10. I agree with Justine that we should be clearer > about > >> > the > >> > > > > reason > >> > > > > > > to > >> > > > > > > > > > switch to IBP/metadata.version from the feature flag. > >> Maybe > >> > > we > >> > > > > mean > >> > > > > > > that > >> > > > > > > > > we > >> > > > > > > > > > can switch the default for the feature flag to true > >> based > >> > on > >> > > the > >> > > > > > > > > > metadata.version once we want to make it the default. > >> > > > > > > > > > >> > > > > > > > > My plan was to use that feature flag mainly during the > >> > > development > >> > > > > > > > > phase. I should not have mentioned it, I think, because > we > >> > > could > >> > > > > use > >> > > > > > > > > an internal config for it. > >> > > > > > > > > > >> > > > > > > > > > 11. Some of the protocol APIs don't mention the > required > >> > > ACLs, it > >> > > > > > > would > >> > > > > > > > > be > >> > > > > > > > > > good to add that for consistency. > >> > > > > > > > > > >> > > > > > > > > Noted. > >> > > > > > > > > > >> > > > > > > > > > 12. It is a bit odd that ConsumerGroupHeartbeat > requires > >> > > "Read > >> > > > > Group" > >> > > > > > > > > even > >> > > > > > > > > > though it seems to do more than reading. > >> > > > > > > > > > >> > > > > > > > > I agree. This is how the current protocol works though. > We > >> > only > >> > > > > > > > > require "Read Group" to join a group. We could consider > >> > > changing > >> > > > > this > >> > > > > > > > > but I am not sure that it is worth it. > >> > > > > > > > > > >> > > > > > > > > > 13. How is topic recreation handled by the consumer > with > >> > the > >> > > new > >> > > > > > > group > >> > > > > > > > > > protocol? It would be good to have a section on this. > >> > > > > > > > > > >> > > > > > > > > Noted. From a protocol perspective, the new topic will > >> have a > >> > > new > >> > > > > > > > > topic id so it will treat it like a topic with a > different > >> > > name. > >> > > > > The > >> > > > > > > > > only issue is that the fetch/commit offsets APIs do not > >> > support > >> > > > > topic > >> > > > > > > > > IDs so the consumer would reuse the offsets based on the > >> > same. > >> > > I > >> > > > > think > >> > > > > > > > > that we should update those APIs as well in order to be > >> > > consistent > >> > > > > end > >> > > > > > > > > to end. That would strengthen the semantics of the > >> consumer. > >> > > > > > > > > > >> > > > > > > > > > 14. The KIP mentions we will write the new coordinator > >> in > >> > > Java. > >> > > > > Even > >> > > > > > > > > though > >> > > > > > > > > > this is an implementation detail, do we plan to have a > >> new > >> > > gradle > >> > > > > > > module > >> > > > > > > > > > for it? > >> > > > > > > > > > >> > > > > > > > > Yes. > >> > > > > > > > > > >> > > > > > > > > > 15. Do we have a scalability goal when it comes to how > >> many > >> > > > > members > >> > > > > > > the > >> > > > > > > > > new > >> > > > > > > > > > group protocol can support? > >> > > > > > > > > > >> > > > > > > > > We don't have numbers at the moment. The protocol should > >> > > support > >> > > > > 1000s > >> > > > > > > > > of members per group. We will measure this when we have > a > >> > first > >> > > > > > > > > implementation. Note that we might have other > bottlenecks > >> > down > >> > > the > >> > > > > > > > > road (e.g. offset commits). > >> > > > > > > > > > >> > > > > > > > > > 16. Did we consider having SubscribedTopidIds instead > of > >> > > > > > > > > > SubscribedTopicNames in ConsumerGroupHeartbeatRequest? > >> Is > >> > the > >> > > > > idea > >> > > > > > > that > >> > > > > > > > > > since we have to resolve the regex on the server, we > >> can do > >> > > the > >> > > > > same > >> > > > > > > for > >> > > > > > > > > > the topic name? The difference is that sending the > >> regex is > >> > > more > >> > > > > > > > > efficient > >> > > > > > > > > > whereas sending the topic names is less efficient. > >> > > Furthermore, > >> > > > > > > delete > >> > > > > > > > > and > >> > > > > > > > > > recreation is easier to handle if we have topic ids. > >> > > > > > > > > > >> > > > > > > > > The idea was to consolidate the metadata lookup on the > >> server > >> > > for > >> > > > > both > >> > > > > > > > > paths but I do agree with your point. As a second > though, > >> > using > >> > > > > topic > >> > > > > > > > > ids may be better here for the delete and recreation > case. > >> > > Also, I > >> > > > > > > > > suppose that we may allow users to subscribe with topic > >> ids > >> > in > >> > > the > >> > > > > > > > > future because that is the only way to be really robust > to > >> > > topic > >> > > > > > > > > re-creation. > >> > > > > > > > > > >> > > > > > > > > Best, > >> > > > > > > > > David > >> > > > > > > > > > >> > > > > > > > > On Tue, Jul 12, 2022 at 1:38 PM David Jacot < > >> > > dja...@confluent.io> > >> > > > > > > wrote: > >> > > > > > > > > > > >> > > > > > > > > > Hi Justine, > >> > > > > > > > > > > >> > > > > > > > > > Thanks for your comments. Please find my answers > below. > >> > > > > > > > > > > >> > > > > > > > > > - Yes, the new protocol relies on topic IDs with the > >> > > exception > >> > > > > of the > >> > > > > > > > > > topic names based in the > ConsumerGroupHeartbeatRequest. > >> I > >> > am > >> > > not > >> > > > > sure > >> > > > > > > > > > if using topic names is the right call here. I need to > >> > think > >> > > > > about it > >> > > > > > > > > > a little more. Obviously, the KIP does not change the > >> > > > > fetch/commit > >> > > > > > > > > > offsets RPCs to use topic IDs. This may be something > >> that > >> > we > >> > > > > should > >> > > > > > > > > > include though as it would give better overall > >> guarantee in > >> > > the > >> > > > > > > > > > producer. > >> > > > > > > > > > - You're right. I think that I should not have > mentioned > >> > this > >> > > > > flag at > >> > > > > > > > > > all. I will remove it. We can use an internal > >> configuration > >> > > while > >> > > > > > > > > > developing the feature. > >> > > > > > > > > > - Both cluster types will be supported. The change is > >> > > > > orthogonal. The > >> > > > > > > > > > only requirement is that the cluster uses topic IDs. > >> > > > > > > > > > > >> > > > > > > > > > Best, > >> > > > > > > > > > David > >> > > > > > > > > > > >> > > > > > > > > > On Mon, Jul 11, 2022 at 9:53 PM Guozhang Wang < > >> > > > > wangg...@gmail.com> > >> > > > > > > > > wrote: > >> > > > > > > > > > > > >> > > > > > > > > > > Hi Ismael, > >> > > > > > > > > > > > >> > > > > > > > > > > Thanks for the feedback. Here are some replies > inlined > >> > > below: > >> > > > > > > > > > > > >> > > > > > > > > > > On Sat, Jul 9, 2022 at 2:53 AM Ismael Juma < > >> > > ism...@juma.me.uk> > >> > > > > > > wrote: > >> > > > > > > > > > > > >> > > > > > > > > > > > Thanks for the KIP. This has the potential to be a > >> > great > >> > > > > > > > > improvement. A few > >> > > > > > > > > > > > initial questions/comments: > >> > > > > > > > > > > > > >> > > > > > > > > > > > 1. I think it's premature to talk about target > >> versions > >> > > for > >> > > > > > > > > deprecation and > >> > > > > > > > > > > > removal of the existing group protocol. Unlike > >> KRaft, > >> > > this > >> > > > > > > affects a > >> > > > > > > > > core > >> > > > > > > > > > > > client protocol and hence deprecation/removal will > >> be > >> > > heavily > >> > > > > > > > > dependent on > >> > > > > > > > > > > > how quickly applications migrate to the new > >> protocol. > >> > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > Yeah I agree with you. I think we can remove the > >> proposed > >> > > > > timeline > >> > > > > > > in > >> > > > > > > > > the > >> > > > > > > > > > > `Compatibility, Deprecation, and Migration Plan` and > >> > > instead > >> > > > > just > >> > > > > > > state > >> > > > > > > > > > > that we will decide in the future about when we > would > >> > > > > deprecate old > >> > > > > > > > > > > protocol and behaviors. > >> > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > > 2. The KIP says we intend to release this in 4.x, > >> but > >> > it > >> > > > > wasn't > >> > > > > > > made > >> > > > > > > > > clear > >> > > > > > > > > > > > why. If we added that as a way to estimate when > we'd > >> > > > > deprecate > >> > > > > > > and > >> > > > > > > > > remove > >> > > > > > > > > > > > the group protocol, I also suggest removing this > >> part. > >> > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > I think that's not specifically related to the > >> > > > > deprecation/removal > >> > > > > > > > > timeline > >> > > > > > > > > > > plan, but it's more for client upgrades. I.e. the > >> > > broker-side > >> > > > > > > > > > > implementation may be done first, and then the > client > >> > side, > >> > > > > and we > >> > > > > > > > > would > >> > > > > > > > > > > only mark it as "released" by the time clients > >> > > implementations > >> > > > > are > >> > > > > > > > > done. At > >> > > > > > > > > > > that time, to enable the feature the clients need to > >> > first > >> > > > > swap-in > >> > > > > > > the > >> > > > > > > > > > > bytecode with a rolling bounce and then set the flag > >> > with a > >> > > > > second > >> > > > > > > > > rolling > >> > > > > > > > > > > bounce, and hence we feel it's better to be released > >> in a > >> > > major > >> > > > > > > > > version. > >> > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > > 3. We need to flesh out the details of the > migration > >> > > story. > >> > > > > It > >> > > > > > > > > sounds like > >> > > > > > > > > > > > we're saying we will support online migrations. Is > >> that > >> > > > > correct? > >> > > > > > > We > >> > > > > > > > > should > >> > > > > > > > > > > > explain this in detail. It could also be done as a > >> > > separate > >> > > > > KIP, > >> > > > > > > if > >> > > > > > > > > it's > >> > > > > > > > > > > > easier. > >> > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > Yes I think that's the part we can be more concrete > >> about > >> > > for > >> > > > > sure > >> > > > > > > (and > >> > > > > > > > > > > this is related to your question 2) above). We will > >> work > >> > on > >> > > > > making > >> > > > > > > it > >> > > > > > > > > more > >> > > > > > > > > > > explicit in parallel as we solicit more feedback. > >> > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > > 4. I am happy that we are pushing the pattern > >> > > subscriptions > >> > > > > to > >> > > > > > > the > >> > > > > > > > > server, > >> > > > > > > > > > > > but it seems like there could be some tricky > >> > > compatibility > >> > > > > > > issues. > >> > > > > > > > > Will we > >> > > > > > > > > > > > have a mechanism for users to detect that they > need > >> to > >> > > update > >> > > > > > > their > >> > > > > > > > > regex > >> > > > > > > > > > > > before switching to the new protocol? > >> > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > Yes I think we need some tooling for non-java client > >> > users > >> > > to > >> > > > > sort > >> > > > > > > of > >> > > > > > > > > > > "dry-run" the client before switching to the new > >> > protocol. > >> > > I > >> > > > > do not > >> > > > > > > > > have a > >> > > > > > > > > > > specific idea on top of my head though, maybe others > >> like > >> > > @Matt > >> > > > > > > > > Howlett can > >> > > > > > > > > > > chime-in here? > >> > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > > 5. Related to the last question, will the Java > >> client > >> > > allow > >> > > > > the > >> > > > > > > > > users to > >> > > > > > > > > > > > stick with the current regex engine for > >> compatibility > >> > > > > reasons? > >> > > > > > > For > >> > > > > > > > > example, > >> > > > > > > > > > > > it may be handy to keep using client based regex > at > >> > > first to > >> > > > > keep > >> > > > > > > > > > > > migrations simple and then migrate to server based > >> > > regexes > >> > > > > as a > >> > > > > > > > > second > >> > > > > > > > > > > > step. > >> > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > Honestly I have not thought about that for java > >> clients, > >> > > and > >> > > > > we can > >> > > > > > > > > discuss > >> > > > > > > > > > > that. What kind of compatibility issues do you have > in > >> > > mind? > >> > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > > 6. When we say that the group coordinator will be > >> > > > > responsible for > >> > > > > > > > > storing > >> > > > > > > > > > > > the configurations and that the configurations > will > >> be > >> > > > > deleted > >> > > > > > > when > >> > > > > > > > > the > >> > > > > > > > > > > > group is deleted. Will a transition to DEAD > trigger > >> > > deletion > >> > > > > of > >> > > > > > > > > > > > configurations? > >> > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > Yes, since the DEAD state is an ending state (we > would > >> > only > >> > > > > > > transit to > >> > > > > > > > > that > >> > > > > > > > > > > state when the group is EMPTY and also all of its > >> > metadata > >> > > are > >> > > > > > > gone), > >> > > > > > > > > once > >> > > > > > > > > > > it's transited to DEAD this group would never be > >> revived. > >> > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > > 7. Will the choice to store the configs in the > group > >> > > > > coordinator > >> > > > > > > > > make it > >> > > > > > > > > > > > harder to list all cluster configs and their > values? > >> > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > That's a good question, and our thoughts are that > the > >> > > so-called > >> > > > > > > "group > >> > > > > > > > > > > configurations" are overrides of the cluster-level > >> > > > > configurations > >> > > > > > > > > > > customized per group so when an admin list cluster > >> > configs > >> > > it's > >> > > > > > > okay to > >> > > > > > > > > > > list just the cluster-level "defaults", not showing > >> any > >> > > > > per-group > >> > > > > > > > > > > customizations. > >> > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > > 8. How would someone configure a group before > >> starting > >> > > the > >> > > > > > > > > consumers? Have > >> > > > > > > > > > > > we considered allowing the explicit creation of > >> groups? > >> > > > > > > > > Alternatively, the > >> > > > > > > > > > > > configs could be decoupled from the group > lifecycle. > >> > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > The configs can be created before the group itself > as > >> an > >> > > > > > > independent > >> > > > > > > > > entity > >> > > > > > > > > > > --- of course, this requires the corresponding > >> request to > >> > > be > >> > > > > > > routed to > >> > > > > > > > > the > >> > > > > > > > > > > right coordinator based on the group id --- the only > >> > thing > >> > > that > >> > > > > > > > > differs is, > >> > > > > > > > > > > when the group itself is gone we also check if there > >> are > >> > > any > >> > > > > > > > > configuration > >> > > > > > > > > > > entities related to that group and delete as well. > >> > > > > > > > > > > > >> > > > > > > > > > > Admittedly this indeed introduces an asymmetry on > the > >> > > creation > >> > > > > / > >> > > > > > > > > deletion > >> > > > > > > > > > > lifecycles of the config entities, and we would like > >> to > >> > > hear > >> > > > > > > everyone's > >> > > > > > > > > > > feelings whether we should aim for symmetry i.e. > >> totally > >> > > > > decouple > >> > > > > > > group > >> > > > > > > > > > > configs and hence not delete them at all when the > >> group > >> > is > >> > > > > gone, > >> > > > > > > but > >> > > > > > > > > always > >> > > > > > > > > > > require explicit deletion operations by themselves. > >> > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > > 9. Will the Consumer.subscribe method for the Java > >> > client > >> > > > > still > >> > > > > > > take > >> > > > > > > > > a > >> > > > > > > > > > > > `java.util.regex.Pattern` of do we have to > >> introduce an > >> > > > > overload? > >> > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > I think we do not need to introduce an overload, but > >> I'm > >> > > all > >> > > > > ears > >> > > > > > > if > >> > > > > > > > > there > >> > > > > > > > > > > may be some compatibility issues that we may > overlook. > >> > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > > 10. I agree with Justine that we should be clearer > >> > about > >> > > the > >> > > > > > > reason > >> > > > > > > > > to > >> > > > > > > > > > > > switch to IBP/metadata.version from the feature > >> flag. > >> > > Maybe > >> > > > > we > >> > > > > > > mean > >> > > > > > > > > that we > >> > > > > > > > > > > > can switch the default for the feature flag to > true > >> > > based on > >> > > > > the > >> > > > > > > > > > > > metadata.version once we want to make it the > >> default. > >> > > > > > > > > > > > >> > > > > > > > > > > 11. Some of the protocol APIs don't mention the > >> required > >> > > ACLs, > >> > > > > it > >> > > > > > > > > would be > >> > > > > > > > > > > > good to add that for consistency. > >> > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > Ack, we can certainly do that. > >> > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > > 12. It is a bit odd that ConsumerGroupHeartbeat > >> > requires > >> > > > > "Read > >> > > > > > > > > Group" even > >> > > > > > > > > > > > though it seems to do more than reading. > >> > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > I had that thought myself as well, but in the end we > >> > could > >> > > not > >> > > > > > > find a > >> > > > > > > > > > > better alternative: adding Write Group seems an > >> overkill > >> > > here > >> > > > > > > since we > >> > > > > > > > > do > >> > > > > > > > > > > not have it elsewhere (we only have Read / Delete > and > >> > > Describe > >> > > > > on > >> > > > > > > > > groups so > >> > > > > > > > > > > far). Would like to hear others thoughts. > >> > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > > 13. How is topic recreation handled by the > consumer > >> > with > >> > > the > >> > > > > new > >> > > > > > > > > group > >> > > > > > > > > > > > protocol? It would be good to have a section on > >> this. > >> > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > You mean with regex subscription right? Yes we can > >> add a > >> > > > > section > >> > > > > > > about > >> > > > > > > > > > > that, but basically the idea is that consumer would > be > >> > > totally > >> > > > > > > > > agnostic in > >> > > > > > > > > > > the new protocol as it's handled all by the brokers. > >> > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > > 14. The KIP mentions we will write the new > >> coordinator > >> > in > >> > > > > Java. > >> > > > > > > Even > >> > > > > > > > > though > >> > > > > > > > > > > > this is an implementation detail, do we plan to > >> have a > >> > > new > >> > > > > gradle > >> > > > > > > > > module > >> > > > > > > > > > > > for it? > >> > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > We have not thought about that. But I think the > answer > >> > > should > >> > > > > be > >> > > > > > > yes. > >> > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > > 15. Do we have a scalability goal when it comes to > >> how > >> > > many > >> > > > > > > members > >> > > > > > > > > the new > >> > > > > > > > > > > > group protocol can support? > >> > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > Within a group, I think we should shoot for 1000s of > >> > > members. > >> > > > > But > >> > > > > > > that > >> > > > > > > > > > > scalability goals also depend on the offset > management > >> > > (commit, > >> > > > > > > fetch) > >> > > > > > > > > > > capabilities of the coordinator which we did not > >> cover in > >> > > this > >> > > > > > > KIP, so > >> > > > > > > > > it's > >> > > > > > > > > > > hard to give a number that applies universally. > >> > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > > 16. Did we consider having SubscribedTopidIds > >> instead > >> > of > >> > > > > > > > > > > > SubscribedTopicNames in > >> ConsumerGroupHeartbeatRequest? > >> > > Is the > >> > > > > > > idea > >> > > > > > > > > that > >> > > > > > > > > > > > since we have to resolve the regex on the server, > we > >> > can > >> > > do > >> > > > > the > >> > > > > > > same > >> > > > > > > > > for > >> > > > > > > > > > > > the topic name? The difference is that sending the > >> > regex > >> > > is > >> > > > > more > >> > > > > > > > > efficient > >> > > > > > > > > > > > whereas sending the topic names is less efficient. > >> > > > > Furthermore, > >> > > > > > > > > delete and > >> > > > > > > > > > > > recreation is easier to handle if we have topic > ids. > >> > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > The main reason to still let the clients send names > >> is to > >> > > keep > >> > > > > the > >> > > > > > > > > > > reasoning of names -> ids on the broker / admin > client > >> > > only. > >> > > > > Note > >> > > > > > > that > >> > > > > > > > > > > although we added topic id in KIP-516, we never > >> > > implemented the > >> > > > > > > logic > >> > > > > > > > > on > >> > > > > > > > > > > consumer/producers leveraging the related newer > >> versioned > >> > > RPCs, > >> > > > > > > > > instead we > >> > > > > > > > > > > just set the topic id as empty UUID. We want to keep > >> the > >> > > > > > > > > consumer/producer > >> > > > > > > > > > > to be thin and only delegate the reasoning on broker > >> and > >> > > > > > > potentially > >> > > > > > > > > admin > >> > > > > > > > > > > clients. > >> > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > > > >> > > > > > > > > > > > Thanks, > >> > > > > > > > > > > > Ismael > >> > > > > > > > > > > > > >> > > > > > > > > > > > On Wed, Jul 6, 2022 at 10:45 AM David Jacot > >> > > > > > > > > <dja...@confluent.io.invalid> > >> > > > > > > > > > > > wrote: > >> > > > > > > > > > > > > >> > > > > > > > > > > > > Hi all, > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > I would like to start a discussion thread on > >> KIP-848: > >> > > The > >> > > > > Next > >> > > > > > > > > > > > > Generation of the Consumer Rebalance Protocol. > >> With > >> > > this > >> > > > > KIP, > >> > > > > > > we > >> > > > > > > > > aim > >> > > > > > > > > > > > > to make the rebalance protocol (for consumers) > >> more > >> > > > > reliable, > >> > > > > > > more > >> > > > > > > > > > > > > scalable, easier to implement for clients, and > >> easier > >> > > to > >> > > > > debug > >> > > > > > > for > >> > > > > > > > > > > > > operators. > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > The KIP is here: > >> > > > > https://cwiki.apache.org/confluence/x/HhD1D. > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > Please take a look and let me know what you > think. > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > Best, > >> > > > > > > > > > > > > David > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > PS: I will be away from July 18th to August 8th. > >> That > >> > > gives > >> > > > > > > you a > >> > > > > > > > > bit > >> > > > > > > > > > > > > of time to read and digest this long KIP. > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > -- > >> > > > > > > > > > > -- Guozhang > >> > > > > > > > > > >> > > > > > > > > >> > > > > > > > > >> > > > > > > > -- > >> > > > > > > > -- Guozhang > >> > > > > > > > >> > > > > > >> > > > >> > > > >> > > >> > >> > >> -- > >> -- Guozhang > >> > > > -- -- Guozhang