Andrew, thanks for the KIP! This is a pretty exciting effort. I've finally made it through the KIP, still trying to grok the whole thing. Sorry if some of my questions are basic :)

Concepts:

70. Does the Group Coordinator communicate with the Share Coordinator over RPC or directly in-process?

71. For preventing name collisions with regular consumer groups, could we define a reserved share group prefix? E.g., the operator defines "sg_" as a prefix for share groups only, and if a regular consumer group tries to use that name it fails.

72. When a consumer tries to use a share group, or a share consumer tries to use a regular group, would INVALID_GROUP_ID make more sense than INCONSISTENT_GROUP_PROTOCOL?

--------

Share Group Membership:

73. What goes in the Metadata field for TargetAssignment#Member and Assignment?

74. Under "Trigger a rebalance", it says we rebalance when the partition metadata changes. Would this be for any change, or just certain ones? For example, if a follower drops out of the ISR and comes back, we probably don't need to rebalance.

75. "For a share group, the group coordinator does *not* persist the assignment"
Can you explain why this is not needed?

76. "If the consumer just failed to heartbeat due to a temporary pause, it could in theory continue to fetch and acknowledge records. When it finally sends a heartbeat and realises it’s been kicked out of the group, it should stop fetching records because its assignment has been revoked, and rejoin the group."
A consumer with a long pause might still deliver some buffered records, but if the share group coordinator has expired its session, it wouldn't accept acknowledgments for that share consumer. In such a case, is any kind of error raised to the application, like "hey, I know we gave you these records, but really we shouldn't have"?

-----

Record Delivery and acknowledgement

77. If we guarantee that a ShareCheckpoint is written at least every so often, could we add a new log compactor that avoids compacting ShareDeltas that are still "active" (i.e., not yet superseded by a new ShareCheckpoint)? Mechanically, this could be done by keeping the LSO no greater than the oldest "active" ShareCheckpoint. This might let us remove the DeltaIndex thing.

78. Instead of the State in the ShareDelta/Checkpoint records, how about MessageState? (State is kind of overloaded/ambiguous.)

79. One possible limitation with the current persistence model is that all the share state is stored in one topic. It seems like we are going to be storing a lot more state than we do in __consumer_offsets, since we're dealing with message-level acks. With aggressive checkpointing and compaction we can mitigate the storage requirements, but the throughput could be a limiting factor. Have we considered other possibilities for persistence?

Cheers,
David
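
P.S. To make 77 a bit more concrete, here is a rough sketch of the bound I have in mind. It is only an illustration and not how the log cleaner works today. The class, the Entry type, the share-partition key strings and the method name are all made up for this example, and I am assuming that each share-partition's latest ShareCheckpoint is the "active" one and that anything older for that key is safe to clean.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only: none of these types exist in Kafka.
public class ShareStateCleanBound {
    enum Kind { CHECKPOINT, DELTA }

    // One record in the share-state topic partition (illustrative shape).
    static final class Entry {
        final long offset;
        final String shareKey; // assumed key, e.g. "group:topic-partition"
        final Kind kind;

        Entry(long offset, String shareKey, Kind kind) {
            this.offset = offset;
            this.shareKey = shareKey;
            this.kind = kind;
        }
    }

    // Returns the offset below which every record has been superseded by a
    // newer ShareCheckpoint for its key. The cleaner would stay below this.
    static long cleanableUpperBound(List<Entry> log, long logEndOffset) {
        Map<String, Long> activeStart = new HashMap<>();
        for (Entry e : log) {
            if (e.kind == Kind.CHECKPOINT) {
                // A new checkpoint supersedes everything earlier for this key.
                activeStart.put(e.shareKey, e.offset);
            } else {
                // Deltas with no checkpoint yet are active from the first one.
                activeStart.putIfAbsent(e.shareKey, e.offset);
            }
        }
        long bound = logEndOffset;
        for (long start : activeStart.values()) {
            bound = Math.min(bound, start); // the oldest "active" checkpoint wins
        }
        return bound;
    }

    public static void main(String[] args) {
        List<Entry> log = List.of(
            new Entry(0, "g1:t-0", Kind.CHECKPOINT),
            new Entry(1, "g1:t-0", Kind.DELTA),
            new Entry(2, "g1:t-1", Kind.CHECKPOINT),
            new Entry(3, "g1:t-0", Kind.CHECKPOINT),
            new Entry(4, "g1:t-1", Kind.DELTA),
            new Entry(5, "g1:t-0", Kind.DELTA));
        System.out.println(cleanableUpperBound(log, 6L)); // prints 2
    }
}
```

In this example the latest checkpoints sit at offsets 3 (t-0) and 2 (t-1), so only offsets 0 and 1 would be eligible for cleaning. A single share-partition that checkpoints rarely would hold the bound back for the whole topic partition, which is why the "at least every so often" guarantee matters.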