fanrui created FLINK-28474:
------------------------------

             Summary: ChannelStateWriteResult may not fail after checkpoint abort
                 Key: FLINK-28474
                 URL: https://issues.apache.org/jira/browse/FLINK-28474
             Project: Flink
          Issue Type: Bug
          Components: Runtime / Checkpointing
    Affects Versions: 1.15.1, 1.14.5
            Reporter: fanrui
             Fix For: 1.16.0, 1.15.2, 1.14.6
         Attachments: image-2022-07-09-22-21-24-417.png

After a checkpoint is aborted, its ChannelStateWriteResult should fail. However, if _channelStateWriter.start(id, checkpointOptions);_ is executed after the abort, the ChannelStateWriteResult will not fail.

h2. Cause Analysis:

When a checkpoint is aborted, _channelStateWriter.start(id, checkpointOptions);_ may not have been executed yet. The IDs of such checkpoints are stored in the abortedCheckpointIds of SubtaskCheckpointCoordinatorImpl, and when checkpointState is called, it checks whether the checkpointId should be aborted. _ChannelStateWriter.abort(checkpointId, exception, true) should also be executed here._

!image-2022-07-09-22-21-24-417.png|width=803,height=307!
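
The ordering problem described above can be reproduced in a small standalone model. This is only a sketch under stated assumptions: ToyChannelStateWriter, ToyCoordinator and resultFailed are hypothetical stand-ins written for illustration, not Flink's real classes, and the write result is modelled as a plain CompletableFuture. It shows that an abort arriving before start leaves the result unfailed, and that an additional writer abort inside checkpointState fails it.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.CompletableFuture;

/** Simplified, hypothetical model of the abort-before-start ordering; not Flink code. */
public class AbortBeforeStartSketch {

    /** Stand-in for the channel state writer: one result future per started checkpoint. */
    static class ToyChannelStateWriter {
        private final Map<Long, CompletableFuture<Void>> results = new HashMap<>();

        void start(long checkpointId) {
            results.putIfAbsent(checkpointId, new CompletableFuture<>());
        }

        /** The cleanup flag only mirrors the call shape in the report; this toy ignores it. */
        void abort(long checkpointId, Throwable cause, boolean cleanup) {
            CompletableFuture<Void> result = results.get(checkpointId);
            if (result != null) {
                result.completeExceptionally(cause);
            }
            // If start() has not run yet, there is no result to fail here.
        }

        CompletableFuture<Void> getResult(long checkpointId) {
            return results.get(checkpointId);
        }
    }

    /** Stand-in for the subtask checkpoint coordinator. */
    static class ToyCoordinator {
        final ToyChannelStateWriter writer = new ToyChannelStateWriter();
        private final Set<Long> abortedCheckpointIds = new HashSet<>();
        private final boolean abortWriterInCheckpointState;

        ToyCoordinator(boolean abortWriterInCheckpointState) {
            this.abortWriterInCheckpointState = abortWriterInCheckpointState;
        }

        void abortCheckpoint(long checkpointId, Throwable cause) {
            // Remember the abort so a later checkpointState() call can skip the checkpoint.
            abortedCheckpointIds.add(checkpointId);
            writer.abort(checkpointId, cause, true);
        }

        void checkpointState(long checkpointId) {
            if (abortedCheckpointIds.remove(checkpointId)) {
                if (abortWriterInCheckpointState) {
                    // The extra step proposed in the report: fail the writer's result here too.
                    writer.abort(checkpointId, new RuntimeException("checkpoint aborted"), true);
                }
                return;
            }
            // The normal snapshot path is omitted from this sketch.
        }
    }

    /** Runs abort -> start -> checkpointState and reports whether the result failed. */
    public static boolean resultFailed(boolean abortWriterInCheckpointState) {
        ToyCoordinator coordinator = new ToyCoordinator(abortWriterInCheckpointState);
        long checkpointId = 1L;
        coordinator.abortCheckpoint(checkpointId, new RuntimeException("declined"));
        coordinator.writer.start(checkpointId); // start arrives after the abort
        coordinator.checkpointState(checkpointId);
        return coordinator.writer.getResult(checkpointId).isCompletedExceptionally();
    }

    public static void main(String[] args) {
        System.out.println("without extra abort: failed = " + resultFailed(false));
        System.out.println("with extra abort:    failed = " + resultFailed(true));
    }
}
```

In this model, resultFailed(false) leaves the future incomplete, which corresponds to the reported behaviour, while resultFailed(true) completes it exceptionally. How the real ChannelStateWriter tracks its results may differ from this toy.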