[ https://issues.apache.org/jira/browse/KAFKA-14016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17625187#comment-17625187 ]
Shawn Wang edited comment on KAFKA-14016 at 10/27/22 4:08 PM:
--------------------------------------------------------------

[~ableegoldman] Yes, we experienced KAFKA-13891. Here is the full history of our case:

We have several consumer groups, each with more than 2000 consumers, running Kafka Broker 2.3 and Kafka Client 2.5. We enabled the built-in StaticMembership and Cooperative Rebalance to avoid stop-the-world (STW) time during rebalances.

# We found that in some cases a partition would be assigned to more than one consumer, so we patched KAFKA-12984, KAFKA-12983 and KAFKA-13406.
# After we deployed online, we found that some consumer groups would rebalance for a long time (2 hours) before finally becoming Stable, so we then patched KAFKA-13891.
# After that deployment, we experienced more partition lag whenever a rebalance happened. That is when I created this issue to ask for advice.
# We eventually worked around it by ignoring the generation value when the leader calculates the assignment (i.e. setting every member's generation to unknown). After running this online for more than 2 months, it looks good now.

Regarding "The sticky assignment algorithm should account for this and IIRC will basically consider whoever has the highest valid generation for a partition as its previous owner": I don't think the code implements it this way; see [https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/internals/AbstractStickyAssignor.java#L127]. Please kindly correct me if I'm wrong. In that code we clear all previously owned partitions whenever we encounter a higher generation, so only the ownedPartitions from the single highest generation remain valid (sketched below). I think making the assignor actually "consider whoever has the highest valid generation for a partition as its previous owner" would also be a fix for this.
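A minimal sketch of the behavior I am describing, with simplified stand-in types (this is not the actual Kafka source; the real logic lives in AbstractStickyAssignor):

{code:java}
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified illustration of "clear all previously owned partitions when a
// higher generation is seen": only members on the single highest generation
// keep their claimed partitions; everyone else is treated as owning nothing.
class StickyGenerationSketch {

    static Map<String, List<String>> validOwnedPartitions(
            Map<String, Integer> memberGeneration,          // member -> reported generation
            Map<String, List<String>> memberOwnedPartitions // member -> claimed partitions
    ) {
        Map<String, List<String>> validOwned = new HashMap<>();
        int maxGeneration = -1; // -1 stands in for "unknown" / DEFAULT_GENERATION

        for (Map.Entry<String, Integer> e : memberGeneration.entrySet()) {
            int generation = e.getValue();
            if (generation > maxGeneration) {
                // A higher generation invalidates everything collected so far.
                // This is the step that drops ownership information from
                // members stuck on an older (but still meaningful) generation.
                validOwned.clear();
                maxGeneration = generation;
            }
            if (generation == maxGeneration)
                validOwned.put(e.getKey(), memberOwnedPartitions.get(e.getKey()));
            // Members below maxGeneration are ignored entirely.
        }
        return validOwned;
    }
}
{code}

Our workaround of setting every member's generation to unknown makes all generations compare equal, so the clearing step above never discards anyone's owned partitions.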
> Revoke more partitions than expected in Cooperative rebalance
> -------------------------------------------------------------
>
>                 Key: KAFKA-14016
>                 URL: https://issues.apache.org/jira/browse/KAFKA-14016
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients
>    Affects Versions: 3.3.0
>            Reporter: Shawn Wang
>            Priority: Major
>              Labels: new-rebalance-should-fix
>
> In https://issues.apache.org/jira/browse/KAFKA-13419 we found that some consumers didn't reset their generation and state after a sync group failed with the REBALANCE_IN_PROGRESS error. We fixed that by resetting the generationId (but not the memberId) when sync group fails with REBALANCE_IN_PROGRESS. That change missed part of the reset, so a follow-up change in https://issues.apache.org/jira/browse/KAFKA-13891 made it work.
> After applying this change, we found that a consumer will sometimes revoke almost 2/3 of its partitions even with cooperative rebalancing enabled: if one consumer does a very quick re-join, the other consumers get REBALANCE_IN_PROGRESS in syncGroup and revoke their partitions before re-joining. Example:
> # Consumers A1-A10 (ten consumers) joined and synced the group successfully with generation 1.
> # A new consumer B1 joined and started a rebalance.
> # All consumers joined successfully; A1 then needed to revoke a partition to transfer it to B1.
> # Because it had revoked a partition, A1 did a very quick syncGroup and re-join.
> # A2-A10 had not yet sent syncGroup when A1 re-joined, so when they did send syncGroup they got REBALANCE_IN_PROGRESS.
> # A2-A10 then revoked their partitions and re-joined.
> So in this rebalance almost every partition was revoked, which largely defeats the benefit of Cooperative rebalance.
> I think instead of "*resetStateAndRejoin* when *RebalanceInProgressException* errors happen in *sync group*" we need another way to fix it.
> Here is my proposal (a client-side sketch follows below):
> # Revert the change in https://issues.apache.org/jira/browse/KAFKA-13891.
> # In the server-side Coordinator's handleSyncGroup, when the generationId has been checked and the group state is PreparingRebalance, send the assignment along with the REBALANCE_IN_PROGRESS error code. (I think this is safe since the generation was verified first.)
> # When the client gets the REBALANCE_IN_PROGRESS error, apply the assignment first, then set rejoinNeeded = true so it re-joins immediately.
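> A minimal sketch of the client-side handling in step 3, with hypothetical names (handleSyncGroupResponse, applyAssignment and rejoinNeeded are illustrative stand-ins for the real ConsumerCoordinator internals, not actual Kafka code):
> {code:java}
> // Sketch only: illustrates "apply the assignment, then re-join" on
> // REBALANCE_IN_PROGRESS, assuming the coordinator now ships the
> // assignment with that error as proposed above.
> class SyncGroupSketch {
>     static final short NONE = 0;
>     static final short REBALANCE_IN_PROGRESS = 27; // Kafka protocol error code 27
>
>     boolean rejoinNeeded = false;
>
>     void handleSyncGroupResponse(short errorCode, byte[] assignment) {
>         if (errorCode == NONE) {
>             applyAssignment(assignment);
>         } else if (errorCode == REBALANCE_IN_PROGRESS) {
>             // The generation was already verified server-side, so the
>             // assignment is safe to install instead of revoking all
>             // owned partitions up front.
>             if (assignment != null && assignment.length > 0)
>                 applyAssignment(assignment);
>             // Re-join immediately; the next cooperative cycle would revoke
>             // only the partitions that actually move.
>             rejoinNeeded = true;
>         }
>     }
>
>     void applyAssignment(byte[] assignment) {
>         // Stand-in for decoding and installing the assignment.
>     }
> }
> {code}
> This would keep the cooperative guarantee: members no longer revoke everything on REBALANCE_IN_PROGRESS, but keep working with the verified assignment and participate in the next round.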