Hi Ismael, Thanks for the feedback. Here are some replies inlined below:
On Sat, Jul 9, 2022 at 2:53 AM Ismael Juma <ism...@juma.me.uk> wrote: > Thanks for the KIP. This has the potential to be a great improvement. A few > initial questions/comments: > > 1. I think it's premature to talk about target versions for deprecation and > removal of the existing group protocol. Unlike KRaft, this affects a core > client protocol and hence deprecation/removal will be heavily dependent on > how quickly applications migrate to the new protocol. > Yeah I agree with you. I think we can remove the proposed timeline in the `Compatibility, Deprecation, and Migration Plan` and instead just state that we will decide in the future about when we would deprecate old protocol and behaviors. > 2. The KIP says we intend to release this in 4.x, but it wasn't made clear > why. If we added that as a way to estimate when we'd deprecate and remove > the group protocol, I also suggest removing this part. > I think that's not specifically related to the deprecation/removal timeline plan, but it's more for client upgrades. I.e. the broker-side implementation may be done first, and then the client side, and we would only mark it as "released" by the time clients implementations are done. At that time, to enable the feature the clients need to first swap-in the bytecode with a rolling bounce and then set the flag with a second rolling bounce, and hence we feel it's better to be released in a major version. > 3. We need to flesh out the details of the migration story. It sounds like > we're saying we will support online migrations. Is that correct? We should > explain this in detail. It could also be done as a separate KIP, if it's > easier. > Yes I think that's the part we can be more concrete about for sure (and this is related to your question 2) above). We will work on making it more explicit in parallel as we solicit more feedback. > 4. I am happy that we are pushing the pattern subscriptions to the server, > but it seems like there could be some tricky compatibility issues. Will we > have a mechanism for users to detect that they need to update their regex > before switching to the new protocol? > Yes I think we need some tooling for non-java client users to sort of "dry-run" the client before switching to the new protocol. I do not have a specific idea on top of my head though, maybe others like @Matt Howlett can chime-in here? > 5. Related to the last question, will the Java client allow the users to > stick with the current regex engine for compatibility reasons? For example, > it may be handy to keep using client based regex at first to keep > migrations simple and then migrate to server based regexes as a second > step. > Honestly I have not thought about that for java clients, and we can discuss that. What kind of compatibility issues do you have in mind? > 6. When we say that the group coordinator will be responsible for storing > the configurations and that the configurations will be deleted when the > group is deleted. Will a transition to DEAD trigger deletion of > configurations? > Yes, since the DEAD state is an ending state (we would only transit to that state when the group is EMPTY and also all of its metadata are gone), once it's transited to DEAD this group would never be revived. > 7. Will the choice to store the configs in the group coordinator make it > harder to list all cluster configs and their values? > That's a good question, and our thoughts are that the so-called "group configurations" are overrides of the cluster-level configurations customized per group so when an admin list cluster configs it's okay to list just the cluster-level "defaults", not showing any per-group customizations. > 8. How would someone configure a group before starting the consumers? Have > we considered allowing the explicit creation of groups? Alternatively, the > configs could be decoupled from the group lifecycle. > The configs can be created before the group itself as an independent entity --- of course, this requires the corresponding request to be routed to the right coordinator based on the group id --- the only thing that differs is, when the group itself is gone we also check if there are any configuration entities related to that group and delete as well. Admittedly this indeed introduces an asymmetry on the creation / deletion lifecycles of the config entities, and we would like to hear everyone's feelings whether we should aim for symmetry i.e. totally decouple group configs and hence not delete them at all when the group is gone, but always require explicit deletion operations by themselves. > 9. Will the Consumer.subscribe method for the Java client still take a > `java.util.regex.Pattern` of do we have to introduce an overload? > I think we do not need to introduce an overload, but I'm all ears if there may be some compatibility issues that we may overlook. > 10. I agree with Justine that we should be clearer about the reason to > switch to IBP/metadata.version from the feature flag. Maybe we mean that we > can switch the default for the feature flag to true based on the > metadata.version once we want to make it the default. 11. Some of the protocol APIs don't mention the required ACLs, it would be > good to add that for consistency. > Ack, we can certainly do that. > 12. It is a bit odd that ConsumerGroupHeartbeat requires "Read Group" even > though it seems to do more than reading. > I had that thought myself as well, but in the end we could not find a better alternative: adding Write Group seems an overkill here since we do not have it elsewhere (we only have Read / Delete and Describe on groups so far). Would like to hear others thoughts. > 13. How is topic recreation handled by the consumer with the new group > protocol? It would be good to have a section on this. > You mean with regex subscription right? Yes we can add a section about that, but basically the idea is that consumer would be totally agnostic in the new protocol as it's handled all by the brokers. > 14. The KIP mentions we will write the new coordinator in Java. Even though > this is an implementation detail, do we plan to have a new gradle module > for it? > We have not thought about that. But I think the answer should be yes. > 15. Do we have a scalability goal when it comes to how many members the new > group protocol can support? > Within a group, I think we should shoot for 1000s of members. But that scalability goals also depend on the offset management (commit, fetch) capabilities of the coordinator which we did not cover in this KIP, so it's hard to give a number that applies universally. > 16. Did we consider having SubscribedTopidIds instead of > SubscribedTopicNames in ConsumerGroupHeartbeatRequest? Is the idea that > since we have to resolve the regex on the server, we can do the same for > the topic name? The difference is that sending the regex is more efficient > whereas sending the topic names is less efficient. Furthermore, delete and > recreation is easier to handle if we have topic ids. > The main reason to still let the clients send names is to keep the reasoning of names -> ids on the broker / admin client only. Note that although we added topic id in KIP-516, we never implemented the logic on consumer/producers leveraging the related newer versioned RPCs, instead we just set the topic id as empty UUID. We want to keep the consumer/producer to be thin and only delegate the reasoning on broker and potentially admin clients. > > Thanks, > Ismael > > On Wed, Jul 6, 2022 at 10:45 AM David Jacot <dja...@confluent.io.invalid> > wrote: > > > Hi all, > > > > I would like to start a discussion thread on KIP-848: The Next > > Generation of the Consumer Rebalance Protocol. With this KIP, we aim > > to make the rebalance protocol (for consumers) more reliable, more > > scalable, easier to implement for clients, and easier to debug for > > operators. > > > > The KIP is here: https://cwiki.apache.org/confluence/x/HhD1D. > > > > Please take a look and let me know what you think. > > > > Best, > > David > > > > PS: I will be away from July 18th to August 8th. That gives you a bit > > of time to read and digest this long KIP. > > > -- -- Guozhang