pnowojski commented on a change in pull request #13044: URL: https://github.com/apache/flink/pull/13044#discussion_r466961539
########## File path: flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CheckpointCoordinator.java ########## @@ -530,15 +530,22 @@ private void startTriggeringCheckpoint(CheckpointTriggerRequest request) { request.getOnCompletionFuture()), timer); - final CompletableFuture<?> masterStatesComplete = pendingCheckpointCompletableFuture - .thenCompose(this::snapshotMasterState); - final CompletableFuture<?> coordinatorCheckpointsComplete = pendingCheckpointCompletableFuture .thenComposeAsync((pendingCheckpoint) -> OperatorCoordinatorCheckpoints.triggerAndAcknowledgeAllCoordinatorCheckpointsWithCompletion( coordinatorsToCheckpoint, pendingCheckpoint, timer), timer); + // We have to take the snapshot of the master hooks after the coordinator checkpoints has completed. + // This is to ensure the tasks are checkpointed after the OperatorCoordinators in case + // ExternallyInducedSource is used. + final CompletableFuture<?> masterStatesComplete = coordinatorCheckpointsComplete + .thenComposeAsync(ignored -> { Review comment: nit: in that case for the future, I wouldn't mix this change (refactor) with the bug fix, but do this in a separate commit, since for the same reasons: > the code that completes coordinatorCheckpointsComplete needs a few clicks to see it is done in the timer as well. it's not that easy to figure out that's indeed just a pure refactor and has nothing to do with the actual bug fix. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org