Myasuka commented on code in PR #21281:
URL: https://github.com/apache/flink/pull/21281#discussion_r1018741944

##########
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CheckpointFailureManager.java:
##########
@@ -204,7 +204,8 @@ private void checkFailureAgainstCounter(
         if (continuousFailureCounter.get() > tolerableCpFailureNumber) {
             clearCount();
             errorHandler.accept(
-                    new FlinkRuntimeException(EXCEEDED_CHECKPOINT_TOLERABLE_FAILURE_MESSAGE));
+                    new FlinkRuntimeException(
+                            EXCEEDED_CHECKPOINT_TOLERABLE_FAILURE_MESSAGE, exception));

Review Comment:
   The job fails because the continuous failure counter exceeds the tolerable number, and the only cause we can attach is the exception from the last failed checkpoint. However, this could lead users to think that all of the checkpoints failed because of that last exception. The proper way is for users to check the JobManager logs or the checkpoint UI to see what happened in each of the recent checkpoints. From my point of view, I am +0 on this proposal.
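
To make the concern concrete, here is a minimal sketch of the behavior under discussion. It is not Flink's code: `FailureCounterSketch`, `onCheckpointFailure`, `reportedCause`, the message string, and the three failure messages are all hypothetical stand-ins, and a plain `RuntimeException` is assumed in place of `FlinkRuntimeException`. It only shows that when the counter trips, the chained cause reflects the last failure and says nothing about the earlier ones.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the failure-counting logic; not Flink's actual class.
class FailureCounterSketch {
    private final int tolerableFailures;
    private final AtomicInteger continuousFailures = new AtomicInteger(0);

    FailureCounterSketch(int tolerableFailures) {
        this.tolerableFailures = tolerableFailures;
    }

    // Records one failed checkpoint; throws once the counter exceeds the limit.
    void onCheckpointFailure(Exception cause) {
        if (continuousFailures.incrementAndGet() > tolerableFailures) {
            continuousFailures.set(0);
            // Only the most recent failure is available to attach as the cause.
            throw new RuntimeException("Exceeded tolerable checkpoint failures (illustrative message)", cause);
        }
    }

    // Feeds three different failures into a counter that tolerates two,
    // then returns the message of the cause chained to the final exception.
    static String reportedCause() {
        FailureCounterSketch counter = new FailureCounterSketch(2);
        try {
            counter.onCheckpointFailure(new IllegalStateException("checkpoint 1: declined by task"));
            counter.onCheckpointFailure(new IllegalStateException("checkpoint 2: expired before completing"));
            counter.onCheckpointFailure(new IllegalStateException("checkpoint 3: storage unavailable"));
        } catch (RuntimeException e) {
            return e.getCause().getMessage();
        }
        return null;
    }

    public static void main(String[] args) {
        // Prints only the third failure's message; the first two are not visible here.
        System.out.println(reportedCause());
    }
}
```

In this sketch the three checkpoints fail for three different reasons, yet the exception that fails the job carries only the third one, which is why the logs or the checkpoint UI are still needed to see the full history.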