StefanRRichter commented on a change in pull request #7571: [FLINK-10724] Refactor failure handling in check point coordinator
URL: https://github.com/apache/flink/pull/7571#discussion_r275338980
##########
File path: flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CheckpointCoordinator.java
##########
@@ -666,10 +671,11 @@ else if (!props.forceCheckpoint()) {
  * Receives a {@link DeclineCheckpoint} message for a pending checkpoint.
  *
  * @param message Checkpoint decline from the task manager
+ * @return <code>true</code> if should fail the job
  */
- public void receiveDeclineMessage(DeclineCheckpoint message) {
+ public boolean receiveDeclineMessage(DeclineCheckpoint message) {

Review comment:
Alright, I had another look at the design doc, and it talks about introducing a checkpoint failure manager in a second step. So my assumption would be that this PR is mainly about introducing the unified `PendingCheckpoint#abort` and `CheckpointAbortReason`. Then, as a second step, I would expect the introduction of the failure manager. And only as a third step would I expect the code that wires everything together and changes the behaviour.

Otherwise, as it is done now, we already have the changed behaviour but not all of the refactorings completely in place. In order to get the behaviour changed, you introduced an intermediate state with a flawed abstraction that can linger in the code for a while. Can we separate this as I suggested and only complete the behavioural change at the very end, when the manager is also in place?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

With regards,
Apache Git Services