Hi David, A few more questions: 1. We will store the "targetAssignment" into log now. But as we know, there's max batch size limit (default 1MB), which means, we cannot support 1M partitions in one group (actually, it should be less than 60k partitions since we'll store {topicID+partition id}) by default now. How will we handle that? Do we expect users to adjust the max batch size to support large partitions in groups, which we don't need this change for old protocol?
I'm wondering why we should persist the "targetAssignment" data? If we want to work for coordinator failover, could the new coordinator try to request for currently owned partitions from each consumer when failed over? I'm not sure if the consumer will auto send owned partitions to the new coordinator. If not, maybe we can return an error to client ConsumerGroupHeartbeat API with REQUIRE_OWNED_PARTITION error, and ask client to append the currently owned partitions to new coordinates for new assignment computation. Does that make sense? Luke On Fri, Oct 14, 2022 at 12:22 AM Jun Rao <j...@confluent.io.invalid> wrote: > Hi, David, > > Thanks for the reply and the updated KIP. A few more comments on the > interfaces and the protocols. > > 60. On the consumer side, do we need both PartitionAssignor.onAssignment > and ConsumerRebalanceListener.onPartitionsAssigned? My understanding is > that the former was added for cooperative rebalance, which is now handled > by the coordinator. If we do need both, should we make them more consistent > (e.g. topic name vs topic id, list vs set vs collection)? > > 61. group.local.assignors: Could we make it clear that it's the full class > name that implements PartitionAssignor? > > 62. MemberAssignment: It currently has the following method. > public Set<TopicPartition> topicPartitions() > We are adding List<TopicIdPartition> ownedPartitions. Should we keep the > naming and the return type consistent? > > 63. MemberAssignment.error: should that be reason? > > 64. group.remote.assignor: The client may not know what assignors the > broker supports. Should we default this to what the broker determines (e.g. > first assignor listed in group.consumer.assignors)? > > 65. After the text "When A heartbeats again and acknowledges the > revocation, the group coordinator transitions him to epoch 2 and releases > foo-2.", we have the following. > B - epoch=2, partitions=[foo-2], pending-partitions=[] > Should foo-2 be in pending-partitions? > > 66. In the Online Migration example, is the first occurence of "C - > epoch=23, partitions=[foo-2, foo-5, foo-4], pending-partitions=[]" correct? > It seems that should happen after C receives a SyncGroup response? If so, > subsequent examples have the same issue. > > 67. ConsumerGroupHeartbeatRequest.RebalanceTimeoutMs : Which config > controls this? How is this used by the group coordinator since there is no > sync barrier anymore? > > 68. ConsumerGroupHeartbeatResponse: > 68.1 AssignedTopicPartitions and PendingTopicPartitions are of type > []TopicPartition. Should they be TopicPartitions? > 68.2 Assignment.error. Should we also have an errorMessage field? > > 69. ConsumerGroupPrepareAssignmentResponse.Members.Assignor: Should it > include the selected assignor name? > > 70. ConsumerGroupInstallAssignmentRequest.GroupEpoch: Should we let the > client set this? Intuitively, it seems the coordinator should manage the > group epoch. > > 71. ConsumerGroupDescribeResponse: > 71.1 Members.Assignment.Partitions. Should we include the topic name too > since it's convenient for building tools? Ditto for TargetAssignment. > 71.2 Members: Should we include SubscribedTopicRegex too? > > 72. OffsetFetchRequest: Is GenerationIdOrMemberEpoch needed since tools may > also want to issue this request? > > Thanks, > > Jun >