showuon commented on a change in pull request #10973: URL: https://github.com/apache/kafka/pull/10973#discussion_r667340973
########## File path: clients/src/main/java/org/apache/kafka/clients/admin/internals/AlterConsumerGroupOffsetsHandler.java ########## @@ -136,21 +142,28 @@ private void handleError( ) { switch (error) { case GROUP_AUTHORIZATION_FAILED: - log.error("Received authorization failure for group {} in `OffsetCommit` response", groupId, - error.exception()); + log.error("Received authorization failure for group {} in `{}` response", groupId, + apiName(), error.exception()); failed.put(groupId, error.exception()); break; case COORDINATOR_LOAD_IN_PROGRESS: + // If the coordinator is in the middle of loading, then we just need to retry + log.debug("`{}` request for group {} failed because the coordinator" + + " is still in the process of loading state. Will retry.", apiName(), groupId); + break; case COORDINATOR_NOT_AVAILABLE: case NOT_COORDINATOR: - log.debug("OffsetCommit request for group {} returned error {}. Will retry", groupId, error); + // If the coordinator is unavailable or there was a coordinator change, then we unmap + // the key so that we retry the `FindCoordinator` request + log.debug("`{}` request for group {} returned error {}. " + + "Will attempt to find the coordinator again and retry.", apiName(), groupId, error); unmapped.add(groupId); break; default: - log.error("Received unexpected error for group {} in `OffsetCommit` response", - groupId, error.exception()); - failed.put(groupId, error.exception( - "Received unexpected error for group " + groupId + " in `OffsetCommit` response")); + final String unexpectedErrorMsg = String.format("Received unexpected error for group %s in `%s` response", + groupId, apiName()); + log.error(unexpectedErrorMsg, error.exception()); + failed.put(groupId, error.exception(unexpectedErrorMsg)); Review comment: @dajac , thanks for your comment. > It seems to me that we should differentiate the group level errors from the partition level errors here or we should consider all of them as partition level errors. What do you think? I agree with you. I think what KIP-699 did, is trying to not break existing tests. It indeed needs improvement. > Also, I think that we should handle all the expected errors here. The default error message here is wrong. There are many errors which expect but which are not handled. You're right. I'm thinking we can handle them in separate PR, and open a Jira ticket to track it. And due to V3.0 is released, maybe that improvement can go into next release. What do you think? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org