ifndef-SleePy commented on a change in pull request #11347: [FLINK-14971][checkpointing] Make all the non-IO operations in CheckpointCoordinator single-threaded
URL: https://github.com/apache/flink/pull/11347#discussion_r392856449
##########
File path: flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/PendingCheckpoint.java
##########
@@ -311,25 +315,32 @@ public CheckpointException getFailureCause() {
     try (CheckpointMetadataOutputStream out = targetLocation.createMetadataOutputStream()) {
         Checkpoints.storeCheckpointMetadata(savepoint, out);
         finalizedLocation = out.closeAndFinalizeCheckpoint();
+    }
+    CompletedCheckpoint completed = new CompletedCheckpoint(
+            jobId,
+            checkpointId,
+            checkpointTimestamp,
+            System.currentTimeMillis(),
+            operatorStates,
+            masterStates,
+            props,
+            finalizedLocation);
+
+    try {
+        completedCheckpointStore.addCheckpoint(completed);
+    } catch (Throwable t) {
+        completed.discardOnFailedStoring();

Review comment:
   Good question!

   Actually, `completedCheckpointStore.addCheckpoint` should be called before `finalizedLocationFuture.thenApplyAsync((completed)`. That callback completes `onCompletionPromise`, reports the completed-checkpoint statistics, and disposes the pending checkpoint. But if `completedCheckpointStore.addCheckpoint` fails afterwards, has the checkpoint succeeded? I don't think so, yet `onCompletionPromise` has already been completed in that scenario. That is inconsistent.

   So the right approach is to call `completedCheckpointStore.addCheckpoint` first and complete `onCompletionPromise` only afterwards.

   I was planning to do this as a follow-up issue. However, since we have decided to combine finalization and adding to `completedCheckpointStore` in order to simplify the interaction between the IO threads and the main thread, I think this is a good opportunity to do it.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

With regards,
Apache Git Services
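
For illustration, the ordering argued for in the review comment above (store the checkpoint first, complete the promise only afterwards) can be sketched in isolation. This is a minimal sketch and not the Flink implementation: `FinalizeOrderingSketch`, `Completed`, `Store`, and `finalizeCheckpoint` are hypothetical stand-ins, and only the method names `addCheckpoint` and `discardOnFailedStoring` and the promise name `onCompletionPromise` are taken from the diff and the comment.

```java
import java.util.concurrent.CompletableFuture;

public class FinalizeOrderingSketch {

    /** Hypothetical stand-in for CompletedCheckpoint. */
    static final class Completed {
        final long checkpointId;
        boolean discarded;

        Completed(long checkpointId) {
            this.checkpointId = checkpointId;
        }

        // Mirrors the method name used in the diff; here it only sets a flag.
        void discardOnFailedStoring() {
            discarded = true;
        }
    }

    /** Hypothetical stand-in for CompletedCheckpointStore. */
    interface Store {
        void addCheckpoint(Completed completed) throws Exception;
    }

    /**
     * Adds the checkpoint to the store first. The promise is completed
     * successfully only if that step worked; otherwise the checkpoint is
     * discarded and the promise fails, so callers never observe a
     * "completed" checkpoint that the store does not know about.
     */
    static Completed finalizeCheckpoint(
            long checkpointId,
            Store store,
            CompletableFuture<Completed> onCompletionPromise) {

        Completed completed = new Completed(checkpointId);
        try {
            store.addCheckpoint(completed);
        } catch (Throwable t) {
            completed.discardOnFailedStoring();
            onCompletionPromise.completeExceptionally(t);
            return completed;
        }
        // Reached only after the store accepted the checkpoint.
        onCompletionPromise.complete(completed);
        return completed;
    }

    public static void main(String[] args) {
        CompletableFuture<Completed> promise = new CompletableFuture<>();
        // A store that always fails, to exercise the failure path.
        Completed c = finalizeCheckpoint(
                1L,
                cp -> { throw new IllegalStateException("store unavailable"); },
                promise);
        System.out.println("discarded=" + c.discarded
                + ", promiseFailed=" + promise.isCompletedExceptionally());
    }
}
```

In this sketch a failing store leaves the promise completed exceptionally and the checkpoint discarded, which is the consistency property the comment asks for. The real code additionally reports statistics and disposes the pending checkpoint, which is omitted here.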