Hi, 

I’m using Flink 1.5.3 and failed to trigger a savepoint for a Flink on YARN job. 
The stack trace shows that an exception occurred while triggering the 
checkpoint, but the job's regular checkpoints are completing normally.

What could be causing this? Thanks a lot!

The stack trace is as follows:

org.apache.flink.util.FlinkException: Triggering a savepoint for the job 1ca7d429484c64eb64fa646672389a74 failed.
        at org.apache.flink.client.cli.CliFrontend.triggerSavepoint(CliFrontend.java:695)
        at org.apache.flink.client.cli.CliFrontend.lambda$savepoint$7(CliFrontend.java:673)
        at org.apache.flink.client.cli.CliFrontend.runClusterAction(CliFrontend.java:960)
        at org.apache.flink.client.cli.CliFrontend.savepoint(CliFrontend.java:670)
        at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1040)
        at org.apache.flink.client.cli.CliFrontend.lambda$main$9(CliFrontend.java:1101)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1692)
        at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
        at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1101)
Caused by: java.util.concurrent.CompletionException: java.util.concurrent.CompletionException: org.apache.flink.runtime.checkpoint.CheckpointTriggerException: Failed to trigger savepoint. Decline reason: An Exception occurred while triggering the checkpoint.
        at org.apache.flink.runtime.jobmaster.JobMaster.lambda$triggerSavepoint$13(JobMaster.java:955)
        at java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:870)
        at java.util.concurrent.CompletableFuture.uniExceptionallyStage(CompletableFuture.java:884)
        at java.util.concurrent.CompletableFuture.exceptionally(CompletableFuture.java:2196)
        at org.apache.flink.runtime.jobmaster.JobMaster.triggerSavepoint(JobMaster.java:951)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:247)
        at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:162)
        at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:70)
        at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.onReceive(AkkaRpcActor.java:142)
        at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.onReceive(FencedAkkaRpcActor.java:40)
        at akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(UntypedActor.scala:165)
        at akka.actor.Actor$class.aroundReceive(Actor.scala:502)
        at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:95)
        at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526)
        at akka.actor.ActorCell.invoke(ActorCell.scala:495)
        at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)
        at akka.dispatch.Mailbox.run(Mailbox.scala:224)
        at akka.dispatch.Mailbox.exec(Mailbox.scala:234)
        at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
        at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
        at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
        at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Caused by: java.util.concurrent.CompletionException: org.apache.flink.runtime.checkpoint.CheckpointTriggerException: Failed to trigger savepoint. Decline reason: An Exception occurred while triggering the checkpoint.
        at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
        at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)
        at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:593)
        at java.util.concurrent.CompletableFuture.uniApplyStage(CompletableFuture.java:614)
        at java.util.concurrent.CompletableFuture.thenApply(CompletableFuture.java:1983)
        at org.apache.flink.runtime.jobmaster.JobMaster.triggerSavepoint(JobMaster.java:943)
        ... 21 more
Caused by: org.apache.flink.runtime.checkpoint.CheckpointTriggerException: Failed to trigger savepoint. Decline reason: An Exception occurred while triggering the checkpoint.
        at org.apache.flink.runtime.checkpoint.CheckpointCoordinator.triggerSavepoint(CheckpointCoordinator.java:377)
        at org.apache.flink.runtime.jobmaster.JobMaster.triggerSavepoint(JobMaster.java:942)
        ... 21 more


Best,
Paul Lam
