dajac commented on code in PR #13550:
URL: https://github.com/apache/kafka/pull/13550#discussion_r1168638509


##########
clients/src/main/java/org/apache/kafka/clients/consumer/internals/AbstractCoordinator.java:
##########
@@ -835,6 +835,7 @@ public void handle(SyncGroupResponse syncResponse,
                 } else if (error == Errors.REBALANCE_IN_PROGRESS) {
                     log.info("SyncGroup failed: The group began another rebalance. Need to re-join the group. " +
                                  "Sent generation was {}", sentGeneration);
+                    resetStateAndGeneration("member missed the rebalance", true);

Review Comment:
   @philipnee I am trying to convince myself of this change. It basically 
means that whenever a member is late for the sync-group phase, it will abandon 
all its partitions. Here, "late" means that the member sends its sync-group 
request after the next rebalance has started. I wonder how common this is, 
especially in large groups.
   
   My understanding is that all pending sync-group requests are completed when 
the leader sends the assignment. Once they complete, the members with 
partitions to be revoked revoke them and re-join more or less immediately 
(because we don't commit offsets in the cooperative mode, I think). This makes 
me think that this will happen regularly in medium to large groups.
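   
   To make the sequence concrete, here is a toy sketch of the race as I 
understand it. It is not Kafka code: `ToyCoordinator`, `State`, `Result`, and 
`slowMemberOutcome` are hypothetical names, and the model assumes that a 
sync-group request arriving while the next rebalance is being prepared is 
answered with `REBALANCE_IN_PROGRESS`.

```java
// Toy model only: none of these types are Kafka classes.
public class LateSyncSketch {
    public enum State { PREPARING_REBALANCE, COMPLETING_REBALANCE, STABLE }
    public enum Result { OK, REBALANCE_IN_PROGRESS }

    // Hypothetical coordinator with just enough state to show the race.
    public static final class ToyCoordinator {
        State state = State.COMPLETING_REBALANCE;

        // The leader's assignment arrives: pending sync-group requests complete.
        void completeSync() { state = State.STABLE; }

        // A member re-joining (for example right after revoking) starts the next rebalance.
        void rejoin() { state = State.PREPARING_REBALANCE; }

        // A sync-group request that arrives during the next rebalance is rejected.
        Result sync() {
            return state == State.PREPARING_REBALANCE
                    ? Result.REBALANCE_IN_PROGRESS
                    : Result.OK;
        }
    }

    // Replays the sequence and returns what the slow member observes.
    public static Result slowMemberOutcome(boolean fastMemberRejoinsFirst) {
        ToyCoordinator coordinator = new ToyCoordinator();
        coordinator.completeSync();          // fast members receive their assignment
        if (fastMemberRejoinsFirst) {
            coordinator.rejoin();            // a fast member revokes and re-joins at once
        }
        return coordinator.sync();           // the slow member's sync-group arrives now
    }

    public static void main(String[] args) {
        System.out.println(slowMemberOutcome(true));   // prints REBALANCE_IN_PROGRESS
        System.out.println(slowMemberOutcome(false));  // prints OK
    }
}
```

   In this model the slow member did nothing wrong other than being slower than 
the fastest re-joiner, yet with the proposed change it would give up all of its 
partitions.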
   
   Could you elaborate a bit more on the reasoning behind this conservative 
change? 


