showuon commented on pull request #11451:
URL: https://github.com/apache/kafka/pull/11451#issuecomment-964156875


   @guozhangwang , when investigating the broken tests, I found my change will 
cause the "normal rebalance" slower. Here's why:
   Before my change, the rebalance with 2 consumers will be like this:
   1. consumer A joined group G, and joinGroup completed
   2. consumer B joined group G, the group G state change to 
`preparingRebalance`
   3. consumer A send syncGroup, and got `REBALANCE_IN_PROGRESS` error
   4. consumer A rejoin group, send joinGroup to group G
   5. consumer B send joinGroup to group G
   6. consumer A and B complete the joinGroup and syncGroup successfully
   
   It looks great. But after my change in this PR, it'll become (the change is 
highlighted in **bold**)
   1. consumer A joined group G, and joinGroup completed
   2. consumer B joined group G, the group G state change to 
`preparingRebalance`
   3. consumer A send syncGroup, and got `REBALANCE_IN_PROGRESS` error
   **4. consumer A reset generation and state** 
   **5. consumer A rejoin group, send joinGroup with a new member ID to group 
G**
   6. consumer B send joinGroup to group G
   **7. waiting until rebalance timeout to kick out the old consumer A(old 
member ID)** 
   8. consumer A and B complete the joinGroup and syncGroup successfully
   
   That's why this change causes the rebalance slower. We can explicitly leave 
group when sync group with `REBALANCE_IN_PROGRESS`, but I think we have to fix 
it from the root!
   
   ===
   Currently, all the issue we faced (i.e. KAFKA-12984, KAFKA-13406), is due to 
the `ownedPartitions` data in the subscription message is out-of-date, and we 
don't have a good way to identify it. So in KAFKA-12984, we have to put 
`generation` info into `userData` in Subscription message in 
`CooperativeStickyAssignor`. (And we have no way to workaround for custom 
cooperative assignors). But then, we found we forgot the cooperative assignment 
validation will also use the `ownedPartitions` data in KAFKA-13406, so we have 
to workaround it again.  
   
   Therefore, I think we should add an additional field `generation` into 
Subscription message, to help `CooperativeStickyAssignor` and custom assignors 
leverage `ownedPartitions + generation` to do "correct" assignment. 
[KIP-792](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=191336614)
 is drafted. I'd like to get your feedback before I send it to dev group 
discussion. Thank you.
   
   So, back to your original comment about 2 places to fix:
   1. Inside the `ConsumerCoordinator#subscriptions`, where we save the 
currently assigned partitions.
   2. Inside the `Assignor#userData`, where for (cooperative) sticky assignor 
where we also encode the generation and the prev-assigned partitions as 
memberAssignment (note that in join-group request we do not encode the 
generation id).
   
   Well, KIP-792 is still focusing on 1) above. For 2), I've thought about it 
for some days, and I think we can ignore it, because in stickyAssignor (not 
cooperative one), we put both `ownedPartitions` and `generation` info into 
`userData`, which means, even the ownedPartition is out-of-date, we can still 
identify it. For custom assignors with old bytecode, I think they can use the 
same way to achieve the same goal. WDYT?
   
   Thank you.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Reply via email to