[ https://issues.apache.org/jira/browse/FLINK-20672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17783868#comment-17783868 ]
Yun Tang edited comment on FLINK-20672 at 11/8/23 3:01 AM:
-----------------------------------------------------------

[~Zakelly] Thanks for picking up the stale tickets. However, I think this is not true after FLINK-23654 is resolved.

was (Author: yunta):
[~Zakelly] Thanks for picking up the stale tickets. However, I think this is not true after FLINK-20672 is resolved.

> notifyCheckpointAborted RPC failure can fail JM
> -----------------------------------------------
>
>                 Key: FLINK-20672
>                 URL: https://issues.apache.org/jira/browse/FLINK-20672
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Checkpointing
>    Affects Versions: 1.11.3, 1.12.0
>            Reporter: Roman Khachatryan
>            Assignee: Zakelly Lan
>            Priority: Not a Priority
>              Labels: auto-deprioritized-major, auto-deprioritized-minor, pull-request-available
>
> Introduced in FLINK-8871, aborted RPC notifications are sent asynchronously:
>
> {code}
> private void sendAbortedMessages(long checkpointId, long timeStamp) {
>     // send notification of aborted checkpoints asynchronously.
>     executor.execute(() -> {
>         // send the "abort checkpoint" messages to necessary vertices.
>         // ..
>     });
> }
> {code}
> However, the executor that eventually executes this request is created as follows:
> {code}
> final ScheduledExecutorService futureExecutor =
>     Executors.newScheduledThreadPool(
>         Hardware.getNumberCPUCores(),
>         new ExecutorThreadFactory("jobmanager-future"));
> {code}
> ExecutorThreadFactory uses an UncaughtExceptionHandler that exits the JVM on error.
> cc: [~yunta]

--
This message was sent by Atlassian Jira
(v8.20.10#820010)