rkhachatryan commented on a change in pull request #8693: URL: https://github.com/apache/flink/pull/8693#discussion_r425673237
##########
File path: flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/SubtaskCheckpointCoordinatorImpl.java
##########
@@ -183,6 +235,42 @@ public void notifyCheckpointComplete(long checkpointId, OperatorChain<?, ?> oper
         env.getTaskStateManager().notifyCheckpointComplete(checkpointId);
     }
 
+    @Override
+    public void notifyCheckpointAborted(long checkpointId, OperatorChain<?, ?> operatorChain, Supplier<Boolean> isRunning) throws Exception {
+
+        if (isRunning.get()) {
+            LOG.debug("Notification of aborted checkpoint for task {}", taskName);
+            // only happens when the task always received checkpoints to abort but never trigger or executing.
+            if (abortedCheckpointIds.size() >= maxRecordAbortedCheckpoints) {
+                abortedCheckpointIds.pollFirst();
+            }
+
+            channelStateWriter.abort(checkpointId, new NotifiedCheckpointAbortedException(checkpointId));
+            boolean canceled = asyncCheckpointRunnableRegistry.cancelAsyncCheckpointRunnable(checkpointId);
+
+            if (!canceled) {
+                if (checkpointId > lastCheckpointId) {
+                    // only record checkpoints that have not triggered on task side.
+                    abortedCheckpointIds.add(checkpointId);
+                }
+            }
+
+            for (StreamOperatorWrapper<?, ?> operatorWrapper : operatorChain.getAllOperators(true)) {
+                operatorWrapper.getStreamOperator().notifyCheckpointAborted(checkpointId);

Review comment:
What if `notifyCheckpointAborted` fails for one operator, should we continue to the next?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: us...@infra.apache.org
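To make the question in the review comment concrete, here is one possible shape for a loop that keeps notifying the remaining operators after a failure. It is only a sketch in plain Java, not code from the PR: `AbortNotificationSketch`, `AbortListener`, and `notifyAllListeners` are hypothetical names, with `AbortListener` standing in for whatever the operator wrapper exposes. The assumption is that we want every operator to see the abort, and then rethrow the first failure with any later ones attached as suppressed exceptions.

```java
import java.util.ArrayList;
import java.util.List;

public class AbortNotificationSketch {

    /** Hypothetical stand-in for an operator that can be told a checkpoint was aborted. */
    public interface AbortListener {
        void notifyCheckpointAborted(long checkpointId) throws Exception;
    }

    /**
     * Notifies every listener, even if some of them fail. The first failure is
     * rethrown after the loop; later failures are attached to it as suppressed.
     */
    public static void notifyAllListeners(List<AbortListener> listeners, long checkpointId) throws Exception {
        Exception firstFailure = null;
        for (AbortListener listener : listeners) {
            try {
                listener.notifyCheckpointAborted(checkpointId);
            } catch (Exception e) {
                if (firstFailure == null) {
                    firstFailure = e;
                } else {
                    firstFailure.addSuppressed(e);
                }
            }
        }
        if (firstFailure != null) {
            throw firstFailure;
        }
    }

    public static void main(String[] args) {
        List<Long> notified = new ArrayList<>();
        List<AbortListener> listeners = new ArrayList<>();
        // The first listener fails; the second one should still be notified.
        listeners.add(id -> { throw new IllegalStateException("first operator failed"); });
        listeners.add(id -> notified.add(id));

        try {
            notifyAllListeners(listeners, 42L);
        } catch (Exception e) {
            System.out.println("failed: " + e.getMessage() + ", notified: " + notified);
        }
    }
}
```

Whether this is the right behaviour depends on what the task does with the rethrown exception; if a failed abort notification should fail the task immediately, the existing fail-fast loop may be preferable.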