Hi,
While cancelling a job with a savepoint, the job itself was cancelled
successfully, but the JobManager went down immediately afterwards. I am not
able to deduce anything from the stack trace below. Any help is appreciated.
2021-09-24 11:50:44,182 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - Stopping checkpoint coordinator for job 1f764a51996d206b28721aa4a1420bea.
2021-09-24 11:50:44,182 INFO org.apache.flink.runtime.checkpoint.DefaultCompletedCheckpointStore [] - Shutting down
2021-09-24 11:50:44,240 INFO org.apache.flink.runtime.zookeeper.ZooKeeperStateHandleStore [] - Removing /flink/default_ns/checkpoints/1f764a51996d206b28721aa4a1420bea from ZooKeeper
2021-09-24 11:50:44,243 INFO org.apache.flink.runtime.checkpoint.ZooKeeperCheckpointIDCounter [] - Shutting down.
2021-09-24 11:50:44,243 INFO org.apache.flink.runtime.checkpoint.ZooKeeperCheckpointIDCounter [] - Removing /checkpoint-counter/1f764a51996d206b28721aa4a1420bea from ZooKeeper
2021-09-24 11:50:44,249 INFO org.apache.flink.runtime.dispatcher.StandaloneDispatcher [] - Job 1f764a51996d206b28721aa4a1420bea reached globally terminal state CANCELED.
2021-09-24 11:50:44,249 ERROR org.apache.flink.runtime.util.FatalExitExceptionHandler [] - FATAL: Thread 'cluster-io-thread-16' produced an uncaught exception. Stopping the process...
java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@54a5137c rejected from java.util.concurrent.ScheduledThreadPoolExecutor@37ee0790[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 4513]
    at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2063) ~[?:1.8.0_232]
    at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830) ~[?:1.8.0_232]
    at java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:326) ~[?:1.8.0_232]
    at java.util.concurrent.ScheduledThreadPoolExecutor.schedule(ScheduledThreadPoolExecutor.java:533) ~[?:1.8.0_232]
    at java.util.concurrent.ScheduledThreadPoolExecutor.execute(ScheduledThreadPoolExecutor.java:622) ~[?:1.8.0_232]
    at java.util.concurrent.Executors$DelegatedExecutorService.execute(Executors.java:668) ~[?:1.8.0_232]
    at org.apache.flink.runtime.concurrent.ScheduledExecutorServiceAdapter.execute(ScheduledExecutorServiceAdapter.java:64) ~[flink-dist_2.12-1.12.1.jar:1.12.1]
    at org.apache.flink.runtime.checkpoint.CheckpointCoordinator.scheduleTriggerRequest(CheckpointCoordinator.java:1290) ~[flink-dist_2.12-1.12.1.jar:1.12.1]
    at org.apache.flink.runtime.checkpoint.CheckpointsCleaner.lambda$cleanCheckpoint$0(CheckpointsCleaner.java:66) ~[flink-dist_2.12-1.12.1.jar:1.12.1]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_232]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_232]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_232]
Regards,
Puneet