Hi Flink Community,

Our Flink jobs run on version 1.11, and we use this command to trigger a savepoint:
$ bin/flink savepoint :jobId [:targetDirectory]
We successfully get the trigger ID along with the savepoint path.

However, we saw the following errors when querying the savepoint status endpoint:
https://ci.apache.org/projects/flink/flink-docs-stable/ops/rest_api.html#jobs-jobid-savepoints-triggerid
e.g. application_id/jobs/job_id/savepoints/trigger_id

{
  "errors": [
    "org.apache.flink.runtime.rest.NotFoundException: Operation not found
under key:
org.apache.flink.runtime.rest.handler.job.AsynchronousJobOperationKey@8893e196\n\tat
org.apache.flink.runtime.rest.handler.async.AbstractAsynchronousOperationHandlers$StatusHandler.handleRequest(AbstractAsynchronousOperationHandlers.java:167)\n\tat
org.apache.flink.runtime.rest.handler.job.savepoints.SavepointHandlers$SavepointStatusHandler.handleRequest(SavepointHandlers.java:193)\n\tat
org.apache.flink.runtime.rest.handler.AbstractRestHandler.respondToRequest(AbstractRestHandler.java:73)\n\tat
org.apache.flink.runtime.rest.handler.AbstractHandler.respondAsLeader(AbstractHandler.java:178)\n\tat
org.apache.flink.runtime.rest.handler.LeaderRetrievalHandler.lambda$channelRead0$0(LeaderRetrievalHandler.java:81)\n\tat
java.util.Optional.ifPresent(Optional.java:159)\n\tat
org.apache.flink.util.OptionalConsumer.ifPresent(OptionalConsumer.java:46)\n\tat
org.apache.flink.runtime.rest.handler.LeaderRetrievalHandler.channelRead0(LeaderRetrievalHandler.java:78)\n\tat
org.apache.flink.runtime.rest.handler.LeaderRetrievalHandler.channelRead0(LeaderRetrievalHandler.java:49)\n\tat
org.apache.flink.shaded.netty4.io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)\n\tat
org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374)\n\tat
org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360)\n\tat
org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352)\n\tat
org.apache.flink.runtime.rest.handler.router.RouterHandler.routed(RouterHandler.java:110)\n\tat
org.apache.flink.runtime.rest.handler.router.RouterHandler.channelRead0(RouterHandler.java:89)\n\tat
org.apache.flink.runtime.rest.handler.router.RouterHandler.channelRead0(RouterHandler.java:54)\n\tat
org.apache.flink.shaded.netty4.io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)\n\tat
org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374)\n\tat
org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360)\n\tat
org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352)\n\tat
org.apache.flink.shaded.netty4.io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)\n\tat
org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374)\n\tat
org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360)\n\tat
org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352)\n\tat
org.apache.flink.runtime.rest.FileUploadHandler.channelRead0(FileUploadHandler.java:174)\n\tat
org.apache.flink.runtime.rest.FileUploadHandler.channelRead0(FileUploadHandler.java:68)\n\tat
org.apache.flink.shaded.netty4.io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)\n\tat
org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374)\n\tat
org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360)\n\tat
org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352)\n\tat
org.apache.flink.shaded.netty4.io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelRead(CombinedChannelDuplexHandler.java:438)\n\tat
org.apache.flink.shaded.netty4.io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:328)\n\tat
org.apache.flink.shaded.netty4.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:302)\n\tat
org.apache.flink.shaded.netty4.io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:253)\n\tat
org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374)\n\tat
org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360)\n\tat
org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:352)\n\tat
org.apache.flink.shaded.netty4.io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1421)\n\tat
org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:374)\n\tat
org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:360)\n\tat
org.apache.flink.shaded.netty4.io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:930)\n\tat
org.apache.flink.shaded.netty4.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163)\n\tat
org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:697)\n\tat
org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:632)\n\tat
org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:549)\n\tat
org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:511)\n\tat
org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:918)\n\tat
org.apache.flink.shaded.netty4.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)\n\tat
java.lang.Thread.run(Thread.java:748)\nCaused by:
org.apache.flink.runtime.rest.handler.async.UnknownOperationKeyException:
No ongoing operation for
org.apache.flink.runtime.rest.handler.job.AsynchronousJobOperationKey@8893e196\n\tat
org.apache.flink.runtime.rest.handler.async.CompletedOperationCache.get(CompletedOperationCache.java:134)\n\tat
org.apache.flink.runtime.rest.handler.async.AbstractAsynchronousOperationHandlers$StatusHandler.handleRequest(AbstractAsynchronousOperationHandlers.java:165)\n\t...
48 more\n"
  ]
}

{
"status": {
"id": "COMPLETED"
},
"operation": {
"failure-cause": {
"class": "java.util.concurrent.CompletionException",
"stack-trace": "java.util.concurrent.CompletionException:
java.util.concurrent.CompletionException:
org.apache.flink.runtime.checkpoint.CheckpointException: Not all required
tasks are currently running.\n\tat
org.apache.flink.runtime.scheduler.SchedulerBase.lambda$triggerSavepoint$3(SchedulerBase.java:764)\n\tat
java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:822)\n\tat
java.util.concurrent.CompletableFuture$UniHandle.tryFire(CompletableFuture.java:797)\n\tat
java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)\n\tat
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:402)\n\tat
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:195)\n\tat
org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:74)\n\tat
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:152)\n\tat
akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:24)\n\tat
akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:20)\n\tat
scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)\n\tat
akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:20)\n\tat
scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170)\n\tat
scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)\n\tat
scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)\n\tat
akka.actor.Actor$class.aroundReceive(Actor.scala:539)\n\tat
akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:227)\n\tat
akka.actor.ActorCell.receiveMessage(ActorCell.scala:612)\n\tat
akka.actor.ActorCell.invoke(ActorCell.scala:581)\n\tat
akka.dispatch.Mailbox.processMailbox(Mailbox.scala:268)\n\tat
akka.dispatch.Mailbox.run(Mailbox.scala:229)\n\tat
akka.dispatch.Mailbox.exec(Mailbox.scala:241)\n\tat
akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)\n\tat
akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)\n\tat
akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)\n\tat
akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)\nCaused
by: java.util.concurrent.CompletionException:
org.apache.flink.runtime.checkpoint.CheckpointException: Not all required
tasks are currently running.\n\tat
java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)\n\tat
java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)\n\tat
java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:593)\n\tat
java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)\n\tat
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)\n\tat
java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)\n\tat
org.apache.flink.runtime.checkpoint.CheckpointCoordinator.lambda$null$0(CheckpointCoordinator.java:467)\n\tat
java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)\n\tat
java.util.concurrent.CompletableFuture.uniWhenCompleteStage(CompletableFuture.java:778)\n\tat
java.util.concurrent.CompletableFuture.whenComplete(CompletableFuture.java:2140)\n\tat
org.apache.flink.runtime.checkpoint.CheckpointCoordinator.lambda$triggerSavepointInternal$1(CheckpointCoordinator.java:463)\n\tat
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)\n\tat
java.util.concurrent.FutureTask.run(FutureTask.java:266)\n\tat
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)\n\tat
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)\n\tat
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)\n\tat
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)\n\tat
java.lang.Thread.run(Thread.java:748)\nCaused by:
org.apache.flink.runtime.checkpoint.CheckpointException: Not all required
tasks are currently running.\n\tat
org.apache.flink.runtime.checkpoint.CheckpointCoordinator.getTriggerExecutions(CheckpointCoordinator.java:1723)\n\tat
org.apache.flink.runtime.checkpoint.CheckpointCoordinator.startTriggeringCheckpoint(CheckpointCoordinator.java:510)\n\tat
java.util.Optional.ifPresent(Optional.java:159)\n\tat
org.apache.flink.runtime.checkpoint.CheckpointCoordinator.triggerCheckpoint(CheckpointCoordinator.java:500)\n\tat
org.apache.flink.runtime.checkpoint.CheckpointCoordinator.lambda$triggerSavepointInternal$1(CheckpointCoordinator.java:458)\n\t...
7 more\n",
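
To make the difference between the two responses above concrete, here is a minimal Python sketch of how they could be told apart. It assumes the response shapes shown above ("errors" at the top level, or "status"/"operation" with "failure-cause") and, per the REST API docs, that a successful savepoint reports "location" under "operation". The helper name classify_savepoint_response is hypothetical and not part of Flink.

```python
import json


def classify_savepoint_response(body: str) -> str:
    """Classify a savepoint status response body (hypothetical helper).

    Returns one of: "error", "in_progress", "failed", "succeeded", "unknown".
    """
    payload = json.loads(body)

    # First response above: the handler could not find the trigger ID at all.
    if "errors" in payload:
        return "error"

    status = payload.get("status", {}).get("id")
    if status == "IN_PROGRESS":
        return "in_progress"

    if status == "COMPLETED":
        operation = payload.get("operation", {})
        # Second response above: COMPLETED only means the operation finished;
        # a "failure-cause" entry means the savepoint itself failed.
        if "failure-cause" in operation:
            return "failed"
        # Assumption from the REST API docs: success carries a "location".
        if "location" in operation:
            return "succeeded"

    return "unknown"
```

So a "COMPLETED" status alone does not mean the savepoint succeeded; in our second response it carries a "failure-cause".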

Any idea what could cause the savepoint to fail?
Any suggestions would be appreciated.
Best regards,
Rainie
