Ming Li created FLINK-31474:
-------------------------------

             Summary: [Flink] Add failure information for out-of-order checkpoints
                 Key: FLINK-31474
                 URL: https://issues.apache.org/jira/browse/FLINK-31474
             Project: Flink
          Issue Type: Improvement
          Components: Runtime / Checkpointing
            Reporter: Ming Li


At present, when a checkpoint barrier arrives out of order, the only trace is a 
log message on the {{Task}} side. The {{JM}} is not told why the checkpoint was 
aborted, so there the checkpoint can only fail by timeout and the real cause 
cannot be confirmed.

Therefore, I think we should report failure information to the JM for 
out-of-order checkpoints. The current Task-side handling only logs the 
condition and aborts locally:
{code:java}
if (lastCheckpointId >= metadata.getCheckpointId()) {
    LOG.info(
            "Out of order checkpoint barrier (aborted previously?): {} >= {}",
            lastCheckpointId,
            metadata.getCheckpointId());
    channelStateWriter.abort(
            metadata.getCheckpointId(), new CancellationException(), true);
    checkAndClearAbortedStatus(metadata.getCheckpointId());
    return;
} {code}
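
To make the proposal concrete, here is a minimal, self-contained sketch of the idea. It does not use Flink's classes: {{DeclineReporter}}, {{Coordinator}} and {{onBarrier}} are hypothetical names made up for illustration, and the reporter only stands in for whatever channel the Task would use to tell the JM that a checkpoint was declined. The actual change would have to hook into the existing checkpoint decline path.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; none of these types are Flink classes. */
public class OutOfOrderCheckpointSketch {

    /** Hypothetical stand-in for the Task-to-JM channel that reports a declined checkpoint. */
    interface DeclineReporter {
        void declineCheckpoint(long checkpointId, String reason);
    }

    /** Hypothetical, simplified stand-in for the Task-side checkpoint handling. */
    static final class Coordinator {
        private final DeclineReporter reporter;
        private long lastCheckpointId = -1L;

        Coordinator(DeclineReporter reporter) {
            this.reporter = reporter;
        }

        /** Returns true if the barrier is accepted, false if it is out of order. */
        boolean onBarrier(long checkpointId) {
            if (lastCheckpointId >= checkpointId) {
                // Besides logging locally, pass the reason on so that the
                // receiver does not have to wait for a timeout.
                reporter.declineCheckpoint(
                        checkpointId,
                        String.format(
                                "Out of order checkpoint barrier: %d >= %d",
                                lastCheckpointId, checkpointId));
                return false;
            }
            lastCheckpointId = checkpointId;
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> declined = new ArrayList<>();
        Coordinator coordinator =
                new Coordinator((id, reason) -> declined.add(id + ": " + reason));

        coordinator.onBarrier(5); // accepted
        coordinator.onBarrier(3); // out of order, reported with a reason

        System.out.println(declined);
    }
}
```

In this sketch the second barrier is rejected and the reporter receives both the checkpoint id and a readable reason, which is the information the JM side is missing today.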



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
