ifndef-SleePy commented on a change in pull request #11347: [FLINK-14971][checkpointing] Make all the non-IO operations in CheckpointCoordinator single-threaded
URL: https://github.com/apache/flink/pull/11347#discussion_r392859883
##########
File path: flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/PendingCheckpoint.java
##########
@@ -311,25 +315,32 @@ public CheckpointException getFailureCause() {
     try (CheckpointMetadataOutputStream out = targetLocation.createMetadataOutputStream()) {
         Checkpoints.storeCheckpointMetadata(savepoint, out);
         finalizedLocation = out.closeAndFinalizeCheckpoint();
+    }
+    CompletedCheckpoint completed = new CompletedCheckpoint(
+        jobId,
+        checkpointId,
+        checkpointTimestamp,
+        System.currentTimeMillis(),
+        operatorStates,
+        masterStates,
+        props,
+        finalizedLocation);
+
+    try {
+        completedCheckpointStore.addCheckpoint(completed);
+    } catch (Throwable t) {
+        completed.discardOnFailedStoring();

Review comment:
However, this comment did lead me to a real problem, nice job! A cancellation or shutdown can happen while the checkpoint is being finalized. That case is not handled well yet. I will update the PR later to cover this scenario.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

With regards,
Apache Git Services
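
For illustration only, the scenario raised in the review comment (a shutdown or cancellation arriving while a checkpoint is still being finalized) could be guarded against roughly as sketched below. This is a minimal sketch under stated assumptions, not the fix applied in the PR: `FinalizeRaceSketch`, `Coordinator`, `Completed`, and `onFinalized` are hypothetical names that do not exist in Flink, and the sketch assumes that the result of finalization is handed back to a single coordinator thread, which re-checks its own state before storing the checkpoint.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified model; none of these types are Flink classes. */
public class FinalizeRaceSketch {

    /** Stand-in for a completed checkpoint; it only tracks whether it was discarded. */
    static final class Completed {
        final long checkpointId;
        boolean discarded;

        Completed(long checkpointId) {
            this.checkpointId = checkpointId;
        }

        void discard() {
            discarded = true;
        }
    }

    /** Stand-in for the coordinator; every method is assumed to run on one thread. */
    static final class Coordinator {
        final List<Completed> store = new ArrayList<>();
        boolean shutDown;

        void shutdown() {
            shutDown = true;
        }

        /**
         * Assumed callback, invoked on the coordinator thread after finalization finished.
         * Returns true if the checkpoint was stored, false if it was discarded instead.
         */
        boolean onFinalized(Completed completed) {
            if (shutDown) {
                // Shutdown raced with finalization: nothing will own the result, so clean it up.
                completed.discard();
                return false;
            }
            store.add(completed);
            return true;
        }
    }

    public static void main(String[] args) {
        Coordinator coordinator = new Coordinator();

        // Normal case: finalization completes while the coordinator is still running.
        Completed first = new Completed(1L);
        System.out.println("stored #1: " + coordinator.onFinalized(first));

        // Race: shutdown arrives while checkpoint #2 is still being finalized.
        Completed second = new Completed(2L);
        coordinator.shutdown();
        System.out.println("stored #2: " + coordinator.onFinalized(second));
        System.out.println("#2 discarded: " + second.discarded);
    }
}
```

The point of the sketch is only that the state check and the store update happen in the same single-threaded step, so a checkpoint finalized after shutdown is discarded rather than leaked. How the real PendingCheckpoint and CheckpointCoordinator should handle this is left to the PR update mentioned in the comment.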