dajac opened a new pull request, #12910:
URL: https://github.com/apache/kafka/pull/12910

   We recently had a bug that caused the JoinGroup callback to throw an exception 
(https://github.com/apache/kafka/pull/12909). When this happens, the exception is 
propagated to the caller and the JoinGroup callback is never completed. To make 
matters worse, the member whose callback failed becomes a zombie because the group 
coordinator does not expire members with a pending callback.
   
   This patch catches exceptions thrown when invoking the JoinGroup and SyncGroup 
callbacks and retries completing them with an `UNKNOWN_SERVER_ERROR` error if 
they fail.
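
   The catch-and-retry idea can be sketched roughly as follows. This is only an illustration, not the code in this patch: `SafeCallbackSketch`, `Result`, and `completeSafely` are hypothetical names, and the error is modelled as a plain string rather than Kafka's own error type.

```java
import java.util.function.Consumer;

final class SafeCallbackSketch {
    // Hypothetical stand-in for a JoinGroup/SyncGroup result; not Kafka's actual class.
    static final class Result {
        final String error;
        Result(String error) { this.error = error; }
    }

    static final String UNKNOWN_SERVER_ERROR = "UNKNOWN_SERVER_ERROR";

    /**
     * Invokes the callback with the given result. If the callback throws,
     * retries once with an UNKNOWN_SERVER_ERROR result so that the callback
     * is not left pending. Returns true if either attempt completed.
     */
    static boolean completeSafely(Consumer<Result> callback, Result result) {
        try {
            callback.accept(result);
            return true;
        } catch (RuntimeException e) {
            try {
                // Second attempt: complete with an error instead of the original result.
                callback.accept(new Result(UNKNOWN_SERVER_ERROR));
                return true;
            } catch (RuntimeException retryFailure) {
                // Both attempts failed; the caller decides how to log or handle this.
                return false;
            }
        }
    }
}
```

   In this sketch the exception is swallowed instead of being propagated to the caller, so a member is not left with a callback that never completes.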
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
