Xintong Song created FLINK-23676:
------------------------------------

             Summary: ReduceOnNeighborMethodsITCase.testSumOfOutNeighborsNoValue fails due to AskTimeoutException
                 Key: FLINK-23676
                 URL: https://issues.apache.org/jira/browse/FLINK-23676
             Project: Flink
          Issue Type: Bug
          Components: Runtime / Coordination
    Affects Versions: 1.13.2
            Reporter: Xintong Song
             Fix For: 1.13.3

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=21696&view=logs&j=c5f0071e-1851-543e-9a45-9ac140befc32&t=1fb1a56f-e8b5-5a82-00a0-a2db7757b4f5&l=7075

{code}
Aug 06 13:54:59 [ERROR] Tests run: 8, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 18.547 s <<< FAILURE! - in org.apache.flink.graph.scala.test.operations.ReduceOnNeighborMethodsITCase
Aug 06 13:54:59 [ERROR] testSumOfOutNeighborsNoValue[Execution mode = CLUSTER](org.apache.flink.graph.scala.test.operations.ReduceOnNeighborMethodsITCase) Time elapsed: 11.25 s <<< ERROR!
Aug 06 13:54:59 org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
Aug 06 13:54:59     at org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:144)
Aug 06 13:54:59     at org.apache.flink.runtime.minicluster.MiniClusterJobClient.lambda$getJobExecutionResult$3(MiniClusterJobClient.java:137)
Aug 06 13:54:59     at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:616)
Aug 06 13:54:59     at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:591)
Aug 06 13:54:59     at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
Aug 06 13:54:59     at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975)
Aug 06 13:54:59     at org.apache.flink.runtime.rpc.akka.AkkaInvocationHandler.lambda$invokeRpc$0(AkkaInvocationHandler.java:237)
Aug 06 13:54:59     at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)
Aug 06 13:54:59     at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750)
Aug 06 13:54:59     at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
Aug 06 13:54:59     at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975)
Aug 06 13:54:59     at org.apache.flink.runtime.concurrent.FutureUtils$1.onComplete(FutureUtils.java:1081)
Aug 06 13:54:59     at akka.dispatch.OnComplete.internal(Future.scala:264)
Aug 06 13:54:59     at akka.dispatch.OnComplete.internal(Future.scala:261)
Aug 06 13:54:59     at akka.dispatch.japi$CallbackBridge.apply(Future.scala:191)
Aug 06 13:54:59     at akka.dispatch.japi$CallbackBridge.apply(Future.scala:188)
Aug 06 13:54:59     at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)
Aug 06 13:54:59     at org.apache.flink.runtime.concurrent.Executors$DirectExecutionContext.execute(Executors.java:73)
Aug 06 13:54:59     at scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:44)
Aug 06 13:54:59     at scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:252)
Aug 06 13:54:59     at akka.pattern.PromiseActorRef.$bang(AskSupport.scala:572)
Aug 06 13:54:59     at akka.pattern.PipeToSupport$PipeableFuture$$anonfun$pipeTo$1.applyOrElse(PipeToSupport.scala:22)
Aug 06 13:54:59     at akka.pattern.PipeToSupport$PipeableFuture$$anonfun$pipeTo$1.applyOrElse(PipeToSupport.scala:21)
Aug 06 13:54:59     at scala.concurrent.Future$$anonfun$andThen$1.apply(Future.scala:436)
Aug 06 13:54:59     at scala.concurrent.Future$$anonfun$andThen$1.apply(Future.scala:435)
Aug 06 13:54:59     at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)
Aug 06 13:54:59     at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)
Aug 06 13:54:59     at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:91)
Aug 06 13:54:59     at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)
Aug 06 13:54:59     at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91)
Aug 06 13:54:59     at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72)
Aug 06 13:54:59     at akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:90)
Aug 06 13:54:59     at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40)
Aug 06 13:54:59     at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(ForkJoinExecutorConfigurator.scala:44)
Aug 06 13:54:59     at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
Aug 06 13:54:59     at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
Aug 06 13:54:59     at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
Aug 06 13:54:59     at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Aug 06 13:54:59 Caused by: org.apache.flink.runtime.JobException: Recovery is suppressed by NoRestartBackoffTimeStrategy
Aug 06 13:54:59     at org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.handleFailure(ExecutionFailureHandler.java:138)
Aug 06 13:54:59     at org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.getFailureHandlingResult(ExecutionFailureHandler.java:82)
Aug 06 13:54:59     at org.apache.flink.runtime.scheduler.DefaultScheduler.handleTaskFailure(DefaultScheduler.java:216)
Aug 06 13:54:59     at org.apache.flink.runtime.scheduler.DefaultScheduler.maybeHandleTaskFailure(DefaultScheduler.java:206)
Aug 06 13:54:59     at org.apache.flink.runtime.scheduler.DefaultScheduler.updateTaskExecutionStateInternal(DefaultScheduler.java:197)
Aug 06 13:54:59     at org.apache.flink.runtime.scheduler.SchedulerBase.updateTaskExecutionState(SchedulerBase.java:682)
Aug 06 13:54:59     at org.apache.flink.runtime.scheduler.UpdateSchedulerNgOnInternalFailuresListener.notifyTaskFailure(UpdateSchedulerNgOnInternalFailuresListener.java:51)
Aug 06 13:54:59     at org.apache.flink.runtime.executiongraph.DefaultExecutionGraph.notifySchedulerNgAboutInternalTaskFailure(DefaultExecutionGraph.java:1462)
Aug 06 13:54:59     at org.apache.flink.runtime.executiongraph.Execution.processFail(Execution.java:1140)
Aug 06 13:54:59     at org.apache.flink.runtime.executiongraph.Execution.processFail(Execution.java:1080)
Aug 06 13:54:59     at org.apache.flink.runtime.executiongraph.Execution.markFailed(Execution.java:911)
Aug 06 13:54:59     at org.apache.flink.runtime.executiongraph.Execution.lambda$deploy$5(Execution.java:623)
Aug 06 13:54:59     at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)
Aug 06 13:54:59     at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750)
Aug 06 13:54:59     at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:456)
Aug 06 13:54:59     at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:440)
Aug 06 13:54:59     at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:208)
Aug 06 13:54:59     at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:77)
Aug 06 13:54:59     at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:158)
Aug 06 13:54:59     at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
Aug 06 13:54:59     at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
Aug 06 13:54:59     at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
Aug 06 13:54:59     at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
Aug 06 13:54:59     at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170)
Aug 06 13:54:59     at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
Aug 06 13:54:59     at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
Aug 06 13:54:59     at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
Aug 06 13:54:59     at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
Aug 06 13:54:59     at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
Aug 06 13:54:59     at akka.actor.ActorCell.invoke(ActorCell.scala:561)
Aug 06 13:54:59     at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
Aug 06 13:54:59     at akka.dispatch.Mailbox.run(Mailbox.scala:225)
Aug 06 13:54:59     at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
Aug 06 13:54:59     ... 4 more
Aug 06 13:54:59 Caused by: java.util.concurrent.CompletionException: java.util.concurrent.TimeoutException: Invocation of public abstract java.util.concurrent.CompletableFuture org.apache.flink.runtime.taskexecutor.TaskExecutorGateway.submitTask(org.apache.flink.runtime.deployment.TaskDeploymentDescriptor,org.apache.flink.runtime.jobmaster.JobMasterId,org.apache.flink.api.common.time.Time) timed out.
Aug 06 13:54:59     at java.util.concurrent.CompletableFuture.encodeRelay(CompletableFuture.java:326)
Aug 06 13:54:59     at java.util.concurrent.CompletableFuture.completeRelay(CompletableFuture.java:338)
Aug 06 13:54:59     at java.util.concurrent.CompletableFuture.uniRelay(CompletableFuture.java:925)
Aug 06 13:54:59     at java.util.concurrent.CompletableFuture$UniRelay.tryFire(CompletableFuture.java:913)
Aug 06 13:54:59     at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
Aug 06 13:54:59     at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990)
Aug 06 13:54:59     at org.apache.flink.runtime.rpc.akka.AkkaInvocationHandler.lambda$invokeRpc$0(AkkaInvocationHandler.java:234)
Aug 06 13:54:59     at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)
Aug 06 13:54:59     at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750)
Aug 06 13:54:59     at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
Aug 06 13:54:59     at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990)
Aug 06 13:54:59     at org.apache.flink.runtime.concurrent.FutureUtils$1.onComplete(FutureUtils.java:1079)
Aug 06 13:54:59     at akka.dispatch.OnComplete.internal(Future.scala:263)
Aug 06 13:54:59     at akka.dispatch.OnComplete.internal(Future.scala:261)
Aug 06 13:54:59     at akka.dispatch.japi$CallbackBridge.apply(Future.scala:191)
Aug 06 13:54:59     at akka.dispatch.japi$CallbackBridge.apply(Future.scala:188)
Aug 06 13:54:59     at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)
Aug 06 13:54:59     at org.apache.flink.runtime.concurrent.Executors$DirectExecutionContext.execute(Executors.java:73)
Aug 06 13:54:59     at scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:44)
Aug 06 13:54:59     at scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:252)
Aug 06 13:54:59     at akka.pattern.PromiseActorRef$$anonfun$1.apply$mcV$sp(AskSupport.scala:644)
Aug 06 13:54:59     at akka.actor.Scheduler$$anon$4.run(Scheduler.scala:205)
Aug 06 13:54:59     at scala.concurrent.Future$InternalCallbackExecutor$.unbatchedExecute(Future.scala:601)
Aug 06 13:54:59     at scala.concurrent.BatchingExecutor$class.execute(BatchingExecutor.scala:109)
Aug 06 13:54:59     at scala.concurrent.Future$InternalCallbackExecutor$.execute(Future.scala:599)
Aug 06 13:54:59     at akka.actor.LightArrayRevolverScheduler$TaskHolder.executeTask(LightArrayRevolverScheduler.scala:328)
Aug 06 13:54:59     at akka.actor.LightArrayRevolverScheduler$$anon$4.executeBucket$1(LightArrayRevolverScheduler.scala:279)
Aug 06 13:54:59     at akka.actor.LightArrayRevolverScheduler$$anon$4.nextTick(LightArrayRevolverScheduler.scala:283)
Aug 06 13:54:59     at akka.actor.LightArrayRevolverScheduler$$anon$4.run(LightArrayRevolverScheduler.scala:235)
Aug 06 13:54:59     at java.lang.Thread.run(Thread.java:748)
Aug 06 13:54:59 Caused by: java.util.concurrent.TimeoutException: Invocation of public abstract java.util.concurrent.CompletableFuture org.apache.flink.runtime.taskexecutor.TaskExecutorGateway.submitTask(org.apache.flink.runtime.deployment.TaskDeploymentDescriptor,org.apache.flink.runtime.jobmaster.JobMasterId,org.apache.flink.api.common.time.Time) timed out.
Aug 06 13:54:59     at org.apache.flink.runtime.jobmaster.RpcTaskManagerGateway.submitTask(RpcTaskManagerGateway.java:60)
Aug 06 13:54:59     at org.apache.flink.runtime.executiongraph.Execution.lambda$deploy$4(Execution.java:599)
Aug 06 13:54:59     at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
Aug 06 13:54:59     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
Aug 06 13:54:59     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
Aug 06 13:54:59     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
Aug 06 13:54:59     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
Aug 06 13:54:59     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
Aug 06 13:54:59     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
Aug 06 13:54:59     ... 1 more
Aug 06 13:54:59 Caused by: akka.pattern.AskTimeoutException: Ask timed out on [Actor[akka://flink/user/rpc/taskmanager_0#595501750]] after [10000 ms]. Message of type [org.apache.flink.runtime.rpc.messages.LocalRpcInvocation]. A typical reason for `AskTimeoutException` is that the recipient actor didn't send a reply.
Aug 06 13:54:59     at akka.pattern.PromiseActorRef$$anonfun$2.apply(AskSupport.scala:635)
Aug 06 13:54:59     at akka.pattern.PromiseActorRef$$anonfun$2.apply(AskSupport.scala:635)
Aug 06 13:54:59     at akka.pattern.PromiseActorRef$$anonfun$1.apply$mcV$sp(AskSupport.scala:648)
Aug 06 13:54:59     at akka.actor.Scheduler$$anon$4.run(Scheduler.scala:205)
Aug 06 13:54:59     at scala.concurrent.Future$InternalCallbackExecutor$.unbatchedExecute(Future.scala:601)
Aug 06 13:54:59     at scala.concurrent.BatchingExecutor$class.execute(BatchingExecutor.scala:109)
Aug 06 13:54:59     at scala.concurrent.Future$InternalCallbackExecutor$.execute(Future.scala:599)
Aug 06 13:54:59     at akka.actor.LightArrayRevolverScheduler$TaskHolder.executeTask(LightArrayRevolverScheduler.scala:328)
Aug 06 13:54:59     at akka.actor.LightArrayRevolverScheduler$$anon$4.executeBucket$1(LightArrayRevolverScheduler.scala:279)
Aug 06 13:54:59     at akka.actor.LightArrayRevolverScheduler$$anon$4.nextTick(LightArrayRevolverScheduler.scala:283)
Aug 06 13:54:59     at akka.actor.LightArrayRevolverScheduler$$anon$4.run(LightArrayRevolverScheduler.scala:235)
Aug 06 13:54:59     ... 1 more
{code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)