AndrewJSchofield commented on code in PR #18864:
URL: https://github.com/apache/kafka/pull/18864#discussion_r1970219786

##########
group-coordinator/src/main/java/org/apache/kafka/coordinator/group/assignor/SimpleAssignor.java:
##########
@@ -67,42 +71,239 @@ private GroupAssignment assignHomogenous(
         GroupSpec groupSpec,
         SubscribedTopicDescriber subscribedTopicDescriber
     ) {
-        Set<Uuid> subscribeTopicIds = groupSpec.memberSubscription(groupSpec.memberIds().iterator().next())
+        Set<Uuid> subscribedTopicIds = groupSpec.memberSubscription(groupSpec.memberIds().iterator().next())
             .subscribedTopicIds();
-        if (subscribeTopicIds.isEmpty())
-            return new GroupAssignment(Collections.emptyMap());
+        if (subscribedTopicIds.isEmpty())
+            return new GroupAssignment(Map.of());
 
-        Map<Uuid, Set<Integer>> targetPartitions = computeTargetPartitions(
-            subscribeTopicIds, subscribedTopicDescriber);
+        // Subscribed topic partitions for the share group.
+        List<TopicIdPartition> targetPartitions = computeTargetPartitions(
+            subscribedTopicIds, subscribedTopicDescriber);
 
-        return new GroupAssignment(groupSpec.memberIds().stream().collect(Collectors.toMap(
-            Function.identity(), memberId -> new MemberAssignmentImpl(targetPartitions))));
+        // The current assignment from topic partition to members.
+        Map<TopicIdPartition, List<String>> currentAssignment = currentAssignment(groupSpec);
+        return newAssignmentHomogeneous(groupSpec, subscribedTopicIds, targetPartitions, currentAssignment);
     }
 
     private GroupAssignment assignHeterogeneous(
         GroupSpec groupSpec,
         SubscribedTopicDescriber subscribedTopicDescriber
     ) {
-        Map<String, MemberAssignment> members = new HashMap<>();
+        Map<String, List<TopicIdPartition>> memberToPartitionsSubscription = new HashMap<>();
         for (String memberId : groupSpec.memberIds()) {
             MemberSubscription spec = groupSpec.memberSubscription(memberId);
             if (spec.subscribedTopicIds().isEmpty())
                 continue;
 
-            Map<Uuid, Set<Integer>> targetPartitions = computeTargetPartitions(
+            // Subscribed topic partitions for the share group member.
+            List<TopicIdPartition> targetPartitions = computeTargetPartitions(
                 spec.subscribedTopicIds(), subscribedTopicDescriber);
+            memberToPartitionsSubscription.put(memberId, targetPartitions);
+        }
+
+        // The current assignment from topic partition to members.
+        Map<TopicIdPartition, List<String>> currentAssignment = currentAssignment(groupSpec);
+        return newAssignmentHeterogeneous(groupSpec, memberToPartitionsSubscription, currentAssignment);
+    }
+
+    /**
+     * Get the current assignment by topic partitions.
+     * @param groupSpec - The group metadata specifications.
+     * @return the current assignment for subscribed topic partitions to memberIds.
+     */
+    private Map<TopicIdPartition, List<String>> currentAssignment(GroupSpec groupSpec) {
+        Map<TopicIdPartition, List<String>> assignment = new HashMap<>();
 
-            members.put(memberId, new MemberAssignmentImpl(targetPartitions));
+        for (String member : groupSpec.memberIds()) {
+            Map<Uuid, Set<Integer>> assignedTopicPartitions = groupSpec.memberAssignment(member).partitions();
+            assignedTopicPartitions.forEach((topicId, partitions) -> partitions.forEach(
+                partition -> assignment.computeIfAbsent(new TopicIdPartition(topicId, partition), k -> new ArrayList<>()).add(member)));
         }
+        return assignment;
+    }
+
+    /**
+     * This function computes the new assignment for a homogeneous group.
+     * @param groupSpec - The group metadata specifications.
+     * @param subscribedTopicIds - The set of all the subscribed topic ids for the group.
+     * @param targetPartitions - The list of all topic partitions that need assignment.
+     * @param currentAssignment - The current assignment for subscribed topic partitions to memberIds.
+     * @return the new partition assignment for the members of the group.
+     */
+    private GroupAssignment newAssignmentHomogeneous(
+        GroupSpec groupSpec,
+        Set<Uuid> subscribedTopicIds,
+        List<TopicIdPartition> targetPartitions,
+        Map<TopicIdPartition, List<String>> currentAssignment
+    ) {
+        Map<TopicIdPartition, List<String>> newAssignment = new HashMap<>();
+
+        // Step 1: Hash member IDs to topic partitions.
+        memberHashAssignment(targetPartitions, groupSpec.memberIds(), newAssignment);
+
+        // Step 2: Round-robin assignment for unassigned partitions which do not have members already assigned in the current assignment.
+        List<TopicIdPartition> unassignedPartitions = targetPartitions.stream()
+            .filter(targetPartition -> !newAssignment.containsKey(targetPartition))
+            .filter(targetPartition -> !currentAssignment.containsKey(targetPartition))
+            .toList();
+
+        roundRobinAssignment(groupSpec.memberIds(), unassignedPartitions, newAssignment);
+
+        // Step 3: We combine current assignment and new assignment.
+        Map<String, Set<TopicIdPartition>> finalAssignment = new HashMap<>();
+
+        // As per the KIP, we should revoke the assignments from current assignment for partitions that were assigned by step 1
+        // in the new assignment and have members in current assignment by step 2. But we haven't implemented it to avoid the
+        // complexity in both the implementation and the run time complexity. This step was mentioned in the KIP to reduce
+        // the burden of certain members of the share groups. This can be achieved with the help of limiting the max
+        // no. of partitions assignment for every member(KAFKA-18788). Hence, the potential problem of burdening
+        // the share consumers will be addressed in a future PR.
+

Review Comment:
   Doesn't the following do the job a bit better?
   ```
           newAssignment.forEach((targetPartition, members) -> members.forEach(member ->
                   finalAssignment.computeIfAbsent(member, k -> new HashSet<>()).add(targetPartition)));
           currentAssignment.forEach((targetPartition, members) -> {
               if (subscribedTopicIds.contains(targetPartition.topicId())) {
                   members.forEach(member -> {
                       if (groupSpec.memberIds().contains(member) && !newAssignment.containsKey(targetPartition))
                           finalAssignment.computeIfAbsent(member, k -> new HashSet<>()).add(targetPartition);
                   });
               }
           });
   ```
   
   The problem with the current code is that it assigns all partitions to the first member and then, as other members join, leaves all of those partitions with the first member even though it also assigns them to the other members.
   
   The snippet above essentially gives precedence to the new assignment, and copies over from the current assignment only the information that augments it, that is, partitions that have no members in the new assignment. It's still not perfect, because the round-robin reassignment is not sophisticated enough, but I think it's probably better.
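
To make the merge rule concrete, here is a minimal, self-contained sketch of the same precedence logic on toy data. It is not the `SimpleAssignor` code: `MergeAssignmentSketch`, `Partition` and `merge` are hypothetical names, `Partition` is a stand-in for Kafka's `TopicIdPartition` with a plain string topic id, and the member ids and assignments are made up for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class MergeAssignmentSketch {

    // Hypothetical stand-in for TopicIdPartition; the topic id is a plain string here.
    record Partition(String topicId, int partition) { }

    // Combines the two partition-to-members maps into a member-to-partitions map,
    // giving precedence to newAssignment.
    static Map<String, Set<Partition>> merge(
        Set<String> memberIds,
        Set<String> subscribedTopicIds,
        Map<Partition, List<String>> newAssignment,
        Map<Partition, List<String>> currentAssignment
    ) {
        Map<String, Set<Partition>> finalAssignment = new HashMap<>();

        // Everything in the new assignment is taken as-is.
        newAssignment.forEach((partition, members) -> members.forEach(member ->
            finalAssignment.computeIfAbsent(member, k -> new HashSet<>()).add(partition)));

        // A current entry is kept only if the topic is still subscribed, the member is
        // still in the group, and the new assignment says nothing about the partition.
        currentAssignment.forEach((partition, members) -> {
            if (subscribedTopicIds.contains(partition.topicId()) && !newAssignment.containsKey(partition)) {
                members.forEach(member -> {
                    if (memberIds.contains(member))
                        finalAssignment.computeIfAbsent(member, k -> new HashSet<>()).add(partition);
                });
            }
        });
        return finalAssignment;
    }

    public static void main(String[] args) {
        Partition t0 = new Partition("t", 0);
        Partition t1 = new Partition("t", 1);
        Partition t2 = new Partition("t", 2);

        // Assumed scenario: m1 held every partition before m2 joined.
        Map<Partition, List<String>> current =
            Map.of(t0, List.of("m1"), t1, List.of("m1"), t2, List.of("m1"));
        // Assumed output of the hashing step after m2 joins; t2 is not covered by it.
        Map<Partition, List<String>> fresh =
            Map.of(t0, List.of("m1"), t1, List.of("m2"));

        System.out.println(merge(Set.of("m1", "m2"), Set.of("t"), fresh, current));
    }
}
```

In this scenario `m1` ends up with partitions 0 and 2 of topic `t` and `m2` with partition 1, so `m1` no longer keeps partition 1 once the new assignment hands it to `m2`.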



