aiquestion commented on code in PR #12349:
URL: https://github.com/apache/kafka/pull/12349#discussion_r922707608


##########
clients/src/main/java/org/apache/kafka/clients/consumer/internals/ConsumerCoordinator.java:
##########
@@ -740,24 +743,45 @@ private void validateCooperativeAssignment(final Map<String, List<TopicPartition
     }
 
     @Override
-    protected boolean onJoinPrepare(int generation, String memberId) {
+    protected boolean onJoinPrepare(Timer timer, int generation, String memberId) {
         log.debug("Executing onJoinPrepare with generation {} and memberId {}", generation, memberId);
-        boolean onJoinPrepareAsyncCommitCompleted = false;
+        if (joinPrepareTimer == null) {
+            joinPrepareTimer = time.timer(rebalanceConfig.rebalanceTimeoutMs);
+        }
         // async commit offsets prior to rebalance if auto-commit enabled
-        RequestFuture<Void> future = maybeAutoCommitOffsetsAsync();
-        // return true when
-        // 1. future is null, which means no commit request sent, so it is still considered completed
-        // 2. offset commit completed
-        // 3. offset commit failed with non-retriable exception
-        if (future == null)
-            onJoinPrepareAsyncCommitCompleted = true;
-        else if (future.succeeded())
-            onJoinPrepareAsyncCommitCompleted = true;
-        else if (future.failed() && !future.isRetriable()) {
-            log.error("Asynchronous auto-commit of offsets failed: {}", future.exception().getMessage());
-            onJoinPrepareAsyncCommitCompleted = true;
+        if (autoCommitEnabled && autoCommitOffsetRequestFuture == null) {
+            autoCommitOffsetRequestFuture = maybeAutoCommitOffsetsAsync();
+        }
+
+        // wait for commit offset response before timer.
+        if (autoCommitOffsetRequestFuture != null) {
+            Timer pollTimer = timer.remainingMs() < joinPrepareTimer.remainingMs() ?
+                   timer : joinPrepareTimer;
+            client.poll(autoCommitOffsetRequestFuture, pollTimer);
         }
 
+        // return false when:
+        //   1. offset commit haven't done
+        //   2. offset commit failed with retriable exception and joinPrepare haven't expired
+        boolean onJoinPrepareAsyncCommitCompleted = true;
+        if (autoCommitOffsetRequestFuture != null) {
+            if (!autoCommitOffsetRequestFuture.isDone()) {
+                onJoinPrepareAsyncCommitCompleted = false;
+            } else if (autoCommitOffsetRequestFuture.failed() && autoCommitOffsetRequestFuture.isRetriable()) {
+                onJoinPrepareAsyncCommitCompleted = joinPrepareTimer.isExpired();
+            } else if (autoCommitOffsetRequestFuture.failed() && autoCommitOffsetRequestFuture.isRetriable()) {
+                log.error("Asynchronous auto-commit of offsets failed: {}", autoCommitOffsetRequestFuture.exception().getMessage());
+            } else if (joinPrepareTimer != null && joinPrepareTimer.isExpired()) {
+                log.error("Asynchronous auto-commit of offsets failed: joinPrepare timeout");
+            }
+            if (autoCommitOffsetRequestFuture.isDone()) {
+                autoCommitOffsetRequestFuture = null;
+            }
+        }
+        if (!onJoinPrepareAsyncCommitCompleted) {
+            timer.sleep(rebalanceConfig.retryBackoffMs);

Review Comment:
   If we don't back off here, I think another offset commit request will be sent 
with no delay when the user simply calls consumer.poll() in a while loop. So I 
changed it to `Math.min(pollTimer.remainingMs(), rebalanceConfig.retryBackoffMs)`.
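
A minimal sketch of the capped backoff described above. The class and method names are hypothetical and not part of Kafka; the only idea taken from the comment is that the sleep is the smaller of the remaining poll time and the retry backoff. The clamp to zero is an added assumption so that an already expired timer never produces a negative sleep.

```java
public final class CappedBackoff {

    private CappedBackoff() { }

    /**
     * Hypothetical helper (not a Kafka API): how long to sleep before the
     * next commit attempt. The result never exceeds the caller's remaining
     * poll time and is never negative.
     */
    public static long backoffMs(long remainingMs, long retryBackoffMs) {
        return Math.max(0L, Math.min(remainingMs, retryBackoffMs));
    }

    public static void main(String[] args) {
        System.out.println(backoffMs(30L, 100L));   // prints 30: poll time is the limit
        System.out.println(backoffMs(500L, 100L));  // prints 100: retry backoff is the limit
        System.out.println(backoffMs(0L, 100L));    // prints 0: timer already expired
    }
}
```

With this cap, a caller that passes a short timeout to consumer.poll() is not held longer than that timeout, while a caller with a long timeout still waits one full backoff between commit attempts.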
   
   But I think we still face the issue in 
[KAFKA-13310](https://issues.apache.org/jira/browse/KAFKA-13310): if the offset 
commit request fails with `UnknownTopicOrPartitionException`, it will keep 
retrying the commit until the rebalance timeout is reached. (The only 
difference is that consumer.poll() will return once its timer expires.)
   Since we cannot be sure that the `UnknownTopicOrPartitionException` is caused 
by topic deletion (as noted in 
[KAFKA-13310](https://issues.apache.org/jira/browse/KAFKA-13310)), do you 
think waiting up to the rebalance timeout when the offset commit fails is acceptable here?
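
To make the concern concrete, here is a rough simulation of the retry behaviour, not the actual `ConsumerCoordinator` logic. It assumes a simulated clock, that every commit attempt ends the same way, and that request latency is zero; all names (`CommitRetrySimulation`, `Outcome`, `elapsedUntilComplete`) are hypothetical.

```java
public final class CommitRetrySimulation {

    /** Hypothetical outcome of a single offset commit attempt. */
    public enum Outcome { SUCCESS, RETRIABLE_FAILURE, FATAL_FAILURE }

    private CommitRetrySimulation() { }

    /**
     * Returns the simulated time in ms that passes before join preparation
     * is treated as complete, assuming every attempt ends with the given
     * outcome and each attempt itself takes no time.
     */
    public static long elapsedUntilComplete(Outcome everyAttempt,
                                            long rebalanceTimeoutMs,
                                            long retryBackoffMs) {
        if (retryBackoffMs <= 0) {
            throw new IllegalArgumentException("retryBackoffMs must be positive");
        }
        long nowMs = 0L;
        while (true) {
            // Success or a non-retriable failure ends the wait immediately.
            if (everyAttempt != Outcome.RETRIABLE_FAILURE) {
                return nowMs;
            }
            // A retriable failure is retried only until the deadline.
            if (nowMs >= rebalanceTimeoutMs) {
                return nowMs;
            }
            // Back off, but never past the deadline.
            nowMs += Math.min(retryBackoffMs, rebalanceTimeoutMs - nowMs);
        }
    }

    public static void main(String[] args) {
        // A persistent retriable error uses up the whole timeout.
        System.out.println(elapsedUntilComplete(Outcome.RETRIABLE_FAILURE, 1000L, 100L)); // prints 1000
        System.out.println(elapsedUntilComplete(Outcome.FATAL_FAILURE, 1000L, 100L));     // prints 0
    }
}
```

Under these assumptions, a retriable error that never clears (such as an `UnknownTopicOrPartitionException` for a topic that has really been deleted) delays the rejoin by the full rebalance timeout, which is the trade-off raised in the question above.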
   


