[ 
https://issues.apache.org/jira/browse/KAFKA-14016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17625187#comment-17625187
 ] 

Shawn Wang edited comment on KAFKA-14016 at 10/27/22 4:08 PM:
--------------------------------------------------------------

[~ableegoldman] 

Yes, we experienced KAFKA-13891

Here is the full history of our case:

We have several consumer groups with more than 2000 consumers each. We are 
using Kafka Broker 2.3 and Kafka Client 2.5. We enabled the built-in 
StaticMembership and Cooperative Rebalance to avoid stop-the-world (STW) time 
during rebalances.
 # We found that in some cases a partition would be assigned to two consumers 
at the same time, so we patched KAFKA-12984, KAFKA-12983 and KAFKA-13406.
 # After we deployed online, we found that some consumer groups would rebalance 
for a long time (about 2 hours) before finally getting Stable, so we then 
patched KAFKA-13891.
 # After that deployment, we experienced more partition lag when a rebalance 
happens, so I created this issue to get some advice.
 # Actually we worked around it by 'ignoring the generation value when the 
leader calculates the assignment' (just set every member's generation to 
unknown; see the sketch after this list). After running online for more than 
2 months, it looks good now.
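
To illustrate, here is a rough sketch of the workaround. The class, field and 
method names are hypothetical and only for illustration; this is not the actual 
patch we run, which lives inside our patched assignor:

{code:java}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Illustrative sketch of the workaround: before the leader computes the sticky
// assignment, drop every member's reported generation so that no member's
// ownedPartitions are discarded as stale.
public class IgnoreGenerationWorkaround {

    static final int UNKNOWN_GENERATION = -1;

    // Simplified stand-in for the per-member data the assignor sees.
    static class MemberData {
        final String memberId;
        final List<String> ownedPartitions;
        int generation;

        MemberData(String memberId, List<String> ownedPartitions, int generation) {
            this.memberId = memberId;
            this.ownedPartitions = ownedPartitions;
            this.generation = generation;
        }
    }

    // The workaround itself: force every generation to "unknown" so the assignor
    // treats all ownedPartitions as equally valid previous ownership.
    static void ignoreGenerations(Collection<MemberData> members) {
        for (MemberData member : members) {
            member.generation = UNKNOWN_GENERATION;
        }
    }

    public static void main(String[] args) {
        List<MemberData> members = new ArrayList<>();
        members.add(new MemberData("A1", Arrays.asList("t-0", "t-1"), 5));
        members.add(new MemberData("A2", Arrays.asList("t-2"), 4)); // would otherwise be treated as stale
        ignoreGenerations(members);
        members.forEach(m -> System.out.println(m.memberId + " generation=" + m.generation));
    }
}
{code}

The point is only that once every generation is unknown, no member's 
ownedPartitions get thrown away during the assignment computation.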

 

For "The sticky assignment algorithm should account for this and IIRC will 
basically consider whoever has the highest valid generation for a partition as 
its previous owner", i think in the code does't implement in this way 
[https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/internals/AbstractStickyAssignor.java#L127]

Please kindly correct me if i'm wrong. In this code we clear all previous owned 
parittions if we got a higer geneartion, so only the ownedPartitions with 
highest generation will be valid
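
To illustrate what I mean, here is a simplified sketch of that clearing behavior. 
This is illustrative logic only, not the actual AbstractStickyAssignor code:

{code:java}
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified sketch of the behavior described above: while collecting
// ownedPartitions from subscriptions, seeing a higher generation clears
// everything gathered so far, so only partitions reported with the single
// highest generation end up with a "previous owner".
public class HighestGenerationOnly {

    public static void main(String[] args) {
        // memberId -> (generation, ownedPartitions); purely illustrative data
        Map<String, Map.Entry<Integer, List<String>>> subscriptions = new LinkedHashMap<>();
        subscriptions.put("A1", Map.entry(5, Arrays.asList("t-0", "t-1")));
        subscriptions.put("A2", Map.entry(4, Arrays.asList("t-2", "t-3")));
        subscriptions.put("A3", Map.entry(5, Arrays.asList("t-4")));

        int maxGeneration = Integer.MIN_VALUE;
        Map<String, String> previousOwner = new HashMap<>(); // partition -> member

        for (Map.Entry<String, Map.Entry<Integer, List<String>>> entry : subscriptions.entrySet()) {
            int generation = entry.getValue().getKey();
            if (generation > maxGeneration) {
                // A higher generation invalidates everything collected so far.
                previousOwner.clear();
                maxGeneration = generation;
            }
            if (generation == maxGeneration) {
                for (String partition : entry.getValue().getValue()) {
                    previousOwner.put(partition, entry.getKey());
                }
            }
        }

        // A2's generation-4 partitions t-2 and t-3 end up with no previous owner,
        // so nothing stops the assignor from moving them away from A2.
        System.out.println(previousOwner);
    }
}
{code}

With the generations ignored (as in the workaround above), A2's t-2 and t-3 would 
also be treated as previously owned, so they would not be moved unnecessarily.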

 

I think "The sticky assignment algorithm should account for this and IIRC will 
basically consider whoever has the highest valid generation for a partition as 
its previous owner" is also a fix for this.

 



> Revoke more partitions than expected in Cooperative rebalance
> -------------------------------------------------------------
>
>                 Key: KAFKA-14016
>                 URL: https://issues.apache.org/jira/browse/KAFKA-14016
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients
>    Affects Versions: 3.3.0
>            Reporter: Shawn Wang
>            Priority: Major
>              Labels: new-rebalance-should-fix
>
> In https://issues.apache.org/jira/browse/KAFKA-13419 we found that some 
> consumers didn't reset generation and state after sync group failed with a 
> REBALANCE_IN_PROGRESS error.
> So we fixed it by resetting the generationId (but not the memberId) when sync 
> group fails with a REBALANCE_IN_PROGRESS error.
> But this change missed the reset part, so another change made in 
> https://issues.apache.org/jira/browse/KAFKA-13891 makes this work.
> After applying this change, we found that sometimes consumers will revoke 
> almost 2/3 of the partitions with cooperative rebalancing enabled, because if 
> a consumer does a very quick re-join, other consumers will get 
> REBALANCE_IN_PROGRESS in syncGroup and revoke their partitions before 
> re-joining. Example:
>  # Consumers A1-A10 (ten consumers) joined and synced the group successfully 
> with generation 1.
>  # New consumer B1 joined and started a rebalance.
>  # All consumers joined successfully, and then A1 needed to revoke a partition 
> to transfer it to B1.
>  # A1 did a very quick syncGroup and re-join, because it had revoked a partition.
>  # A2-A10 didn't send syncGroup before A1's re-join, so after they send 
> syncGroup they will get REBALANCE_IN_PROGRESS.
>  # A2-A10 will revoke their partitions and re-join.
> So in this rebalance almost every partition is revoked, which greatly reduces 
> the benefit of Cooperative rebalance.
> I think instead of "{*}resetStateAndRejoin{*} when 
> *RebalanceInProgressException* errors happen in {*}sync group{*}" we need 
> another way to fix it.
> Here is my proposal:
>  # Revert the change in https://issues.apache.org/jira/browse/KAFKA-13891.
>  # In the server-side Coordinator's handleSyncGroup, when the generationId has 
> been checked and the group state is PreparingRebalance, we can send the 
> assignment along with the error code REBALANCE_IN_PROGRESS (I think it's safe 
> since we verified the generation first).
>  # When the client gets the REBALANCE_IN_PROGRESS error, try to apply the 
> assignment first and then set rejoinNeeded = true to make it re-join 
> immediately.


