Exceeded Checkpoint tolerable failure threshold Exception

Robert Cullen Thu, 07 Oct 2021 09:49:01 -0700

I have Flink set up with two TaskManagers and one JobManager. I've allocated
25 GB of JVM heap and 15 GB of Flink managed memory. I have two jobs
running. After 3 hours, this exception was thrown. How can I configure
Flink to prevent this from happening?


2021-10-07 12:38:50
org.apache.flink.util.FlinkRuntimeException: Exceeded checkpoint tolerable failure threshold.
    at org.apache.flink.runtime.checkpoint.CheckpointFailureManager.handleCheckpointException(CheckpointFailureManager.java:98)
    at org.apache.flink.runtime.checkpoint.CheckpointFailureManager.handleJobLevelCheckpointException(CheckpointFailureManager.java:67)
    at org.apache.flink.runtime.checkpoint.CheckpointCoordinator.abortPendingCheckpoint(CheckpointCoordinator.java:1934)
    at org.apache.flink.runtime.checkpoint.CheckpointCoordinator.abortPendingCheckpoint(CheckpointCoordinator.java:1906)
    at org.apache.flink.runtime.checkpoint.CheckpointCoordinator.access$600(CheckpointCoordinator.java:96)
    at org.apache.flink.runtime.checkpoint.CheckpointCoordinator$CheckpointCanceller.run(CheckpointCoordinator.java:1990)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

-- 
Robert Cullen
240-475-4490
