dajac commented on code in PR #18020:
URL: https://github.com/apache/kafka/pull/18020#discussion_r1872790894


##########
group-coordinator/src/main/java/org/apache/kafka/coordinator/group/modern/consumer/ConsumerGroup.java:
##########
@@ -798,6 +827,60 @@ private void validateMemberEpoch(
         }
     }
 
+    /**
+     * Computes the subscription type based on the provided information.
+     *
+     * @param subscribedRegularExpressions  The subscribed regular expression count.
+     * @param subscribedTopicNames          The subscribed topic name count.
+     * @param numberOfMembers               The number of members in the group.
+     *
+     * @return The subscription type.
+     */
+    public static SubscriptionType subscriptionType(
+        Map<String, Integer> subscribedRegularExpressions,
+        Map<String, SubscriptionCount> subscribedTopicNames,
+        int numberOfMembers
+    ) {
+        if (subscribedRegularExpressions.isEmpty()) {
+            // If the members do not use regular expressions, the subscription is
+            // considered as homogeneous if all the members are subscribed to the
+            // same topics. Otherwise, it is considered as heterogeneous.
+            for (SubscriptionCount subscriberCount : subscribedTopicNames.values()) {
+                if (subscriberCount.byNameCount != numberOfMembers) {
+                    return HETEROGENEOUS;
+                }
+            }
+            return HOMOGENEOUS;
+        } else {
+            int count = subscribedRegularExpressions.values().iterator().next();
+            if (count == numberOfMembers) {
+                // If all the members are subscribed to a single regular expressions
+                // and none of them are subscribed to topic names, the subscription
+                // is considered as homogeneous. If some members are subscribed to
+                // topic names too, the subscription is considered as heterogeneous.
+                for (SubscriptionCount subscriberCount : subscribedTopicNames.values()) {
+                    if (subscriberCount.byRegexCount != 1 || subscriberCount.byNameCount > 0) {
+                        return HETEROGENEOUS;
+                        return HETEROGENEOUS;

Review Comment:
   What counts as homogeneous is not that well defined. I think that we have the choice between two definitions:
   1) All the members use the same subscription; or
   2) All the members are subscribed to the same topics.
   In this patch, I suggest using 1), although I agree that 2) would be the best.
   
   The challenge with 2) is that it is not easy to compute. Imagine the following:
   * 6 members, 2 topics `foo` and `fooo`
   * 3 members subscribed via name `foo`
   * 1 member subscribed via regex `foo.*`
   * 1 member subscribed via regex `fo.*`
   * 1 member subscribed via regex `.*` and via name `foo`
   
   It should be homogeneous too because they are all subscribed to `foo` and `fooo`. However, it is hard to compute it based on the information that we have in memory. Our data model makes our life hard here. I am open to suggestions though.
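
To make the difference between the two definitions concrete, here is a minimal sketch. It is not the code from the PR: `SubscriptionDefinitions`, `MemberSubscription`, `sameSubscription` and `sameResolvedTopics` are hypothetical names, the per-member view is an assumption (the comment above notes that the in-memory data only holds aggregated counts), and `java.util.regex` is a stand-in regex engine. It uses a smaller two-member example rather than the six-member one above.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

final class SubscriptionDefinitions {
    // Hypothetical per-member view: explicit topic names plus an optional
    // regular expression (null when the member does not use one).
    record MemberSubscription(Set<String> topicNames, String regex) { }

    // Definition 1: all the members declare exactly the same subscription.
    static boolean sameSubscription(List<MemberSubscription> members) {
        return members.stream().distinct().count() <= 1;
    }

    // Definition 2: all the members end up subscribed to the same topics
    // once each regular expression is resolved against the existing topics.
    static boolean sameResolvedTopics(
        List<MemberSubscription> members,
        Set<String> existingTopics
    ) {
        Set<Set<String>> resolved = new HashSet<>();
        for (MemberSubscription member : members) {
            Set<String> topics = new HashSet<>(member.topicNames());
            if (member.regex() != null) {
                Pattern pattern = Pattern.compile(member.regex());
                for (String topic : existingTopics) {
                    if (pattern.matcher(topic).matches()) {
                        topics.add(topic);
                    }
                }
            }
            resolved.add(topics);
        }
        return resolved.size() <= 1;
    }

    public static void main(String[] args) {
        Set<String> existingTopics = Set.of("foo", "fooo");
        List<MemberSubscription> members = List.of(
            new MemberSubscription(Set.of("foo", "fooo"), null), // by name
            new MemberSubscription(Set.of(), "foo.*")            // by regex
        );
        System.out.println(sameSubscription(members));
        System.out.println(sameResolvedTopics(members, existingTopics));
    }
}
```

With these two members, definition 1 reports a mismatch because the declared subscriptions differ, while definition 2 reports a match because both resolve to `foo` and `fooo`. Definition 2 needs every member's resolved topic set, which is what makes it costly to derive from aggregated counts alone.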
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
