[ https://issues.apache.org/jira/browse/FLINK-22662?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17364755#comment-17364755 ]
Xintong Song commented on FLINK-22662:
--------------------------------------

[~fly_in_gis] and I have looked into this instability. Here are our findings.

The expected behavior of {{testKillYarnSessionClusterEntrypoint}} is as follows:
1) AM_1 is running
2) Kill the AM
3) Yarn brings up AM_2
4) Signal the job to finish (via temp file)

The problem is that 3) and 4) can happen before 2) has finished. This is because a Yarn container runs as a process tree, in which the Flink java process is brought up by a wrapping {{launch_container.sh}} process. Yarn can detect the termination of AM_1 and start bringing up AM_2 as soon as the wrapping process has terminated, while the Flink process might still be running. Consequently, the signal from 4) is received by AM_1, and the job finishes before AM_1 has completely shut down. When AM_2 is started, there is no job to recover, hence the "could not find job" exception.

To fix this, we need to make sure AM_1 has completely terminated before proceeding to 4). This can be achieved by checking for PID changes.

Besides, a ZK outage is occasionally observed right after the AM failover. Because no ZK logs are available, we could not determine the cause of this outage. However, given that the outage is only observed together with the problem described above, we tend to see them as related.

> YARNHighAvailabilityITCase.testKillYarnSessionClusterEntrypoint fail
> --------------------------------------------------------------------
>
> Key: FLINK-22662
> URL: https://issues.apache.org/jira/browse/FLINK-22662
> Project: Flink
> Issue Type: Bug
> Components: Deployment / YARN
> Affects Versions: 1.13.0
> Reporter: Guowei Ma
> Assignee: Xintong Song
> Priority: Major
> Labels: test-stability
>
> {code:java}
> 2021-05-14T00:24:57.8487649Z May 14 00:24:57 [ERROR] > testKillYarnSessionClusterEntrypoint(org.apache.flink.yarn.YARNHighAvailabilityITCase) > Time elapsed: 34.667 s <<< ERROR! 
> 2021-05-14T00:24:57.8488567Z May 14 00:24:57 > java.util.concurrent.ExecutionException: > 2021-05-14T00:24:57.8489301Z May 14 00:24:57 > org.apache.flink.runtime.rest.util.RestClientException: > [org.apache.flink.runtime.rest.handler.RestHandlerException: > org.apache.flink.runtime.messages.FlinkJobNotFoundException: Could not find > Flink job (610ed4b159ece04c8ee2ec40e7d0c143) > 2021-05-14T00:24:57.8493142Z May 14 00:24:57 at > org.apache.flink.runtime.rest.handler.job.JobExecutionResultHandler.propagateException(JobExecutionResultHandler.java:94) > 2021-05-14T00:24:57.8495823Z May 14 00:24:57 at > org.apache.flink.runtime.rest.handler.job.JobExecutionResultHandler.lambda$handleRequest$1(JobExecutionResultHandler.java:84) > 2021-05-14T00:24:57.8496733Z May 14 00:24:57 at > java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:884) > 2021-05-14T00:24:57.8497640Z May 14 00:24:57 at > java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:866) > 2021-05-14T00:24:57.8498491Z May 14 00:24:57 at > java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) > 2021-05-14T00:24:57.8499222Z May 14 00:24:57 at > java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990) > 2021-05-14T00:24:57.8500003Z May 14 00:24:57 at > org.apache.flink.runtime.rpc.akka.AkkaInvocationHandler.lambda$invokeRpc$0(AkkaInvocationHandler.java:234) > 2021-05-14T00:24:57.8500872Z May 14 00:24:57 at > java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774) > 2021-05-14T00:24:57.8501702Z May 14 00:24:57 at > java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750) > 2021-05-14T00:24:57.8502662Z May 14 00:24:57 at > java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) > 2021-05-14T00:24:57.8503472Z May 14 00:24:57 at > java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990) 
> 2021-05-14T00:24:57.8504269Z May 14 00:24:57 at > org.apache.flink.runtime.concurrent.FutureUtils$1.onComplete(FutureUtils.java:1079) > 2021-05-14T00:24:57.8504892Z May 14 00:24:57 at > akka.dispatch.OnComplete.internal(Future.scala:263) > 2021-05-14T00:24:57.8505565Z May 14 00:24:57 at > akka.dispatch.OnComplete.internal(Future.scala:261) > 2021-05-14T00:24:57.8506062Z May 14 00:24:57 at > akka.dispatch.japi$CallbackBridge.apply(Future.scala:191) > 2021-05-14T00:24:57.8506819Z May 14 00:24:57 at > akka.dispatch.japi$CallbackBridge.apply(Future.scala:188) > 2021-05-14T00:24:57.8507418Z May 14 00:24:57 at > scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36) > 2021-05-14T00:24:57.8508373Z May 14 00:24:57 at > org.apache.flink.runtime.concurrent.Executors$DirectExecutionContext.execute(Executors.java:73) > 2021-05-14T00:24:57.8509144Z May 14 00:24:57 at > scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:44) > 2021-05-14T00:24:57.8509972Z May 14 00:24:57 at > scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:252) > 2021-05-14T00:24:57.8510675Z May 14 00:24:57 at > akka.pattern.PromiseActorRef.$bang(AskSupport.scala:572) > 2021-05-14T00:24:57.8511376Z May 14 00:24:57 at > akka.pattern.PipeToSupport$PipeableFuture$$anonfun$pipeTo$1.applyOrElse(PipeToSupport.scala:23) > 2021-05-14T00:24:57.8512222Z May 14 00:24:57 at > akka.pattern.PipeToSupport$PipeableFuture$$anonfun$pipeTo$1.applyOrElse(PipeToSupport.scala:21) > 2021-05-14T00:24:57.8513090Z May 14 00:24:57 at > scala.concurrent.Future$$anonfun$andThen$1.apply(Future.scala:436) > 2021-05-14T00:24:57.8513835Z May 14 00:24:57 at > scala.concurrent.Future$$anonfun$andThen$1.apply(Future.scala:435) > 2021-05-14T00:24:57.8514576Z May 14 00:24:57 at > scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36) > 2021-05-14T00:24:57.8515344Z May 14 00:24:57 at > akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55) > 
2021-05-14T00:24:57.8516317Z May 14 00:24:57 at > akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:91) > 2021-05-14T00:24:57.8517537Z May 14 00:24:57 at > akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91) > 2021-05-14T00:24:57.8518525Z May 14 00:24:57 at > akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91) > 2021-05-14T00:24:57.8519372Z May 14 00:24:57 at > scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72) > 2021-05-14T00:24:57.8520060Z May 14 00:24:57 at > akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:90) > 2021-05-14T00:24:57.8520845Z May 14 00:24:57 at > akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40) > 2021-05-14T00:24:57.8521684Z May 14 00:24:57 at > akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(ForkJoinExecutorConfigurator.scala:44) > 2021-05-14T00:24:57.8522646Z May 14 00:24:57 at > akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) > 2021-05-14T00:24:57.8523285Z May 14 00:24:57 at > akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) > 2021-05-14T00:24:57.8524046Z May 14 00:24:57 at > akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) > 2021-05-14T00:24:57.8524892Z May 14 00:24:57 at > akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) > 2021-05-14T00:24:57.8525798Z May 14 00:24:57 Caused by: > java.util.concurrent.CompletionException: > org.apache.flink.runtime.messages.FlinkJobNotFoundException: Could not find > Flink job (610ed4b159ece04c8ee2ec40e7d0c143) > 2021-05-14T00:24:57.8526988Z May 14 00:24:57 at > java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292) > 2021-05-14T00:24:57.8527951Z May 14 00:24:57 at > java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308) > 2021-05-14T00:24:57.8528731Z May 
14 00:24:57 at > java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:957) > 2021-05-14T00:24:57.8529606Z May 14 00:24:57 at > java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:940) > 2021-05-14T00:24:57.8530207Z May 14 00:24:57 ... 34 more > 2021-05-14T00:24:57.8530805Z May 14 00:24:57 Caused by: > org.apache.flink.runtime.messages.FlinkJobNotFoundException: Could not find > Flink job (610ed4b159ece04c8ee2ec40e7d0c143) > 2021-05-14T00:24:57.8531746Z May 14 00:24:57 at > org.apache.flink.runtime.dispatcher.Dispatcher.lambda$requestJobStatus$14(Dispatcher.java:596) > 2021-05-14T00:24:57.8532553Z May 14 00:24:57 at > java.util.Optional.orElseGet(Optional.java:267) > 2021-05-14T00:24:57.8533222Z May 14 00:24:57 at > org.apache.flink.runtime.dispatcher.Dispatcher.requestJobStatus(Dispatcher.java:590) > 2021-05-14T00:24:57.8533857Z May 14 00:24:57 at > sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > 2021-05-14T00:24:57.8534597Z May 14 00:24:57 at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > 2021-05-14T00:24:57.8535203Z May 14 00:24:57 at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > 2021-05-14T00:24:57.8535733Z May 14 00:24:57 at > java.lang.reflect.Method.invoke(Method.java:498) > 2021-05-14T00:24:57.8536250Z May 14 00:24:57 at > org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:305) > 2021-05-14T00:24:57.8536861Z May 14 00:24:57 at > org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:212) > 2021-05-14T00:24:57.8537578Z May 14 00:24:57 at > org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:77) > 2021-05-14T00:24:57.8538242Z May 14 00:24:57 at > org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:158) > 2021-05-14T00:24:57.8538791Z May 14 00:24:57 at > 
akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26) > 2021-05-14T00:24:57.8539269Z May 14 00:24:57 at > akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21) > 2021-05-14T00:24:57.8539781Z May 14 00:24:57 at > scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123) > 2021-05-14T00:24:57.8540296Z May 14 00:24:57 at > akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21) > 2021-05-14T00:24:57.8541002Z May 14 00:24:57 at > scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170) > 2021-05-14T00:24:57.8541519Z May 14 00:24:57 at > scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171) > 2021-05-14T00:24:57.8542125Z May 14 00:24:57 at > scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171) > 2021-05-14T00:24:57.8542696Z May 14 00:24:57 at > akka.actor.Actor$class.aroundReceive(Actor.scala:517) > 2021-05-14T00:24:57.8543188Z May 14 00:24:57 at > akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225) > 2021-05-14T00:24:57.8543673Z May 14 00:24:57 at > akka.actor.ActorCell.receiveMessage(ActorCell.scala:592) > 2021-05-14T00:24:57.8544141Z May 14 00:24:57 at > akka.actor.ActorCell.invoke(ActorCell.scala:561) > 2021-05-14T00:24:57.8544612Z May 14 00:24:57 at > akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258) > 2021-05-14T00:24:57.8545040Z May 14 00:24:57 at > akka.dispatch.Mailbox.run(Mailbox.scala:225) > 2021-05-14T00:24:57.8545475Z May 14 00:24:57 at > akka.dispatch.Mailbox.exec(Mailbox.scala:235) > 2021-05-14T00:24:57.8545802Z May 14 00:24:57 ... 
4 more > 2021-05-14T00:24:57.8546046Z May 14 00:24:57 ] > 2021-05-14T00:24:57.8546439Z May 14 00:24:57 at > java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) > 2021-05-14T00:24:57.8546964Z May 14 00:24:57 at > java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1928) > 2021-05-14T00:24:57.8547666Z May 14 00:24:57 at > org.apache.flink.yarn.YARNHighAvailabilityITCase.waitForJobTermination(YARNHighAvailabilityITCase.java:324) > 2021-05-14T00:24:57.8548439Z May 14 00:24:57 at > org.apache.flink.yarn.YARNHighAvailabilityITCase.lambda$testKillYarnSessionClusterEntrypoint$0(YARNHighAvailabilityITCase.java:180) > 2021-05-14T00:24:57.8549084Z May 14 00:24:57 at > org.apache.flink.yarn.YarnTestBase.runTest(YarnTestBase.java:287) > 2021-05-14T00:24:57.8549712Z May 14 00:24:57 at > org.apache.flink.yarn.YARNHighAvailabilityITCase.testKillYarnSessionClusterEntrypoint(YARNHighAvailabilityITCase.java:156) > 2021-05-14T00:24:57.8550288Z May 14 00:24:57 at > sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > 2021-05-14T00:24:57.8550789Z May 14 00:24:57 at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > 2021-05-14T00:24:57.8551369Z May 14 00:24:57 at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > 2021-05-14T00:24:57.8551872Z May 14 00:24:57 at > java.lang.reflect.Method.invoke(Method.java:498) > 2021-05-14T00:24:57.8552476Z May 14 00:24:57 at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > 2021-05-14T00:24:57.8553062Z May 14 00:24:57 at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > 2021-05-14T00:24:57.8553631Z May 14 00:24:57 at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > 2021-05-14T00:24:57.8554204Z May 14 00:24:57 at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > 2021-05-14T00:24:57.8554798Z 
May 14 00:24:57 at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298) > 2021-05-14T00:24:57.8555446Z May 14 00:24:57 at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292) > 2021-05-14T00:24:57.8556007Z May 14 00:24:57 at > java.util.concurrent.FutureTask.run(FutureTask.java:266) > 2021-05-14T00:24:57.8556423Z May 14 00:24:57 at > java.lang.Thread.run(Thread.java:748) > 2021-05-14T00:24:57.8557001Z May 14 00:24:57 Suppressed: > java.lang.AssertionError: There is at least one application on the cluster > that is not finished.[App application_1620949990638_0002 is in state RUNNING.] > 2021-05-14T00:24:57.8557683Z May 14 00:24:57 at > org.junit.Assert.fail(Assert.java:88) > 2021-05-14T00:24:57.8558224Z May 14 00:24:57 at > org.apache.flink.yarn.YarnTestBase$CleanupYarnApplication.close(YarnTestBase.java:324) > 2021-05-14T00:24:57.8558785Z May 14 00:24:57 at > org.apache.flink.yarn.YarnTestBase.runTest(YarnTestBase.java:288) > 2021-05-14T00:24:57.8559259Z May 14 00:24:57 ... 
13 more > 2021-05-14T00:24:57.8560003Z May 14 00:24:57 Caused by: > org.apache.flink.runtime.rest.util.RestClientException: > [org.apache.flink.runtime.rest.handler.RestHandlerException: > org.apache.flink.runtime.messages.FlinkJobNotFoundException: Could not find > Flink job (610ed4b159ece04c8ee2ec40e7d0c143) > 2021-05-14T00:24:57.8561068Z May 14 00:24:57 at > org.apache.flink.runtime.rest.handler.job.JobExecutionResultHandler.propagateException(JobExecutionResultHandler.java:94) > 2021-05-14T00:24:57.8561817Z May 14 00:24:57 at > org.apache.flink.runtime.rest.handler.job.JobExecutionResultHandler.lambda$handleRequest$1(JobExecutionResultHandler.java:84) > 2021-05-14T00:24:57.8562552Z May 14 00:24:57 at > java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:884) > 2021-05-14T00:24:57.8563157Z May 14 00:24:57 at > java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:866) > 2021-05-14T00:24:57.8563754Z May 14 00:24:57 at > java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) > 2021-05-14T00:24:57.8564321Z May 14 00:24:57 at > java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990) > 2021-05-14T00:24:57.8564967Z May 14 00:24:57 at > org.apache.flink.runtime.rpc.akka.AkkaInvocationHandler.lambda$invokeRpc$0(AkkaInvocationHandler.java:234) > 2021-05-14T00:24:57.8565600Z May 14 00:24:57 at > java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774) > 2021-05-14T00:24:57.8566171Z May 14 00:24:57 at > java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750) > 2021-05-14T00:24:57.8566756Z May 14 00:24:57 at > java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) > 2021-05-14T00:24:57.8567389Z May 14 00:24:57 at > java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990) > 2021-05-14T00:24:57.8567983Z May 14 00:24:57 at > 
org.apache.flink.runtime.concurrent.FutureUtils$1.onComplete(FutureUtils.java:1079) > 2021-05-14T00:24:57.8568577Z May 14 00:24:57 at > akka.dispatch.OnComplete.internal(Future.scala:263) > 2021-05-14T00:24:57.8569023Z May 14 00:24:57 at > akka.dispatch.OnComplete.internal(Future.scala:261) > 2021-05-14T00:24:57.8569488Z May 14 00:24:57 at > akka.dispatch.japi$CallbackBridge.apply(Future.scala:191) > 2021-05-14T00:24:57.8569963Z May 14 00:24:57 at > akka.dispatch.japi$CallbackBridge.apply(Future.scala:188) > 2021-05-14T00:24:57.8570430Z May 14 00:24:57 at > scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36) > 2021-05-14T00:24:57.8570995Z May 14 00:24:57 at > org.apache.flink.runtime.concurrent.Executors$DirectExecutionContext.execute(Executors.java:73) > 2021-05-14T00:24:57.8571561Z May 14 00:24:57 at > scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:44) > 2021-05-14T00:24:57.8572108Z May 14 00:24:57 at > scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:252) > 2021-05-14T00:24:57.8572705Z May 14 00:24:57 at > akka.pattern.PromiseActorRef.$bang(AskSupport.scala:572) > 2021-05-14T00:24:57.8573248Z May 14 00:24:57 at > akka.pattern.PipeToSupport$PipeableFuture$$anonfun$pipeTo$1.applyOrElse(PipeToSupport.scala:23) > 2021-05-14T00:24:57.8573879Z May 14 00:24:57 at > akka.pattern.PipeToSupport$PipeableFuture$$anonfun$pipeTo$1.applyOrElse(PipeToSupport.scala:21) > 2021-05-14T00:24:57.8574448Z May 14 00:24:57 at > scala.concurrent.Future$$anonfun$andThen$1.apply(Future.scala:436) > 2021-05-14T00:24:57.8574949Z May 14 00:24:57 at > scala.concurrent.Future$$anonfun$andThen$1.apply(Future.scala:435) > 2021-05-14T00:24:57.8575451Z May 14 00:24:57 at > scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36) > 2021-05-14T00:24:57.8575978Z May 14 00:24:57 at > akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55) > 2021-05-14T00:24:57.8576611Z May 14 00:24:57 at > 
akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:91) > 2021-05-14T00:24:57.8577417Z May 14 00:24:57 at > akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91) > 2021-05-14T00:24:57.8578029Z May 14 00:24:57 at > akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:91) > 2021-05-14T00:24:57.8578753Z May 14 00:24:57 at > scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72) > 2021-05-14T00:24:57.8579296Z May 14 00:24:57 at > akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:90) > 2021-05-14T00:24:57.8579809Z May 14 00:24:57 at > akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40) > 2021-05-14T00:24:57.8580396Z May 14 00:24:57 at > akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(ForkJoinExecutorConfigurator.scala:44) > 2021-05-14T00:24:57.8580980Z May 14 00:24:57 at > akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) > 2021-05-14T00:24:57.8581481Z May 14 00:24:57 at > akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) > 2021-05-14T00:24:57.8582006Z May 14 00:24:57 at > akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) > 2021-05-14T00:24:57.8582606Z May 14 00:24:57 at > akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) > 2021-05-14T00:24:57.8583333Z May 14 00:24:57 Caused by: > java.util.concurrent.CompletionException: > org.apache.flink.runtime.messages.FlinkJobNotFoundException: Could not find > Flink job (610ed4b159ece04c8ee2ec40e7d0c143) > 2021-05-14T00:24:57.8584065Z May 14 00:24:57 at > java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292) > 2021-05-14T00:24:57.8584634Z May 14 00:24:57 at > java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308) > 2021-05-14T00:24:57.8585207Z May 14 00:24:57 at > 
java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:957) > 2021-05-14T00:24:57.8585775Z May 14 00:24:57 at > java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:940) > 2021-05-14T00:24:57.8586183Z May 14 00:24:57 ... 34 more > 2021-05-14T00:24:57.8586682Z May 14 00:24:57 Caused by: > org.apache.flink.runtime.messages.FlinkJobNotFoundException: Could not find > Flink job (610ed4b159ece04c8ee2ec40e7d0c143) > 2021-05-14T00:24:57.8587424Z May 14 00:24:57 at > org.apache.flink.runtime.dispatcher.Dispatcher.lambda$requestJobStatus$14(Dispatcher.java:596) > 2021-05-14T00:24:57.8587950Z May 14 00:24:57 at > java.util.Optional.orElseGet(Optional.java:267) > 2021-05-14T00:24:57.8588522Z May 14 00:24:57 at > org.apache.flink.runtime.dispatcher.Dispatcher.requestJobStatus(Dispatcher.java:590) > 2021-05-14T00:24:57.8589039Z May 14 00:24:57 at > sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > 2021-05-14T00:24:57.8589522Z May 14 00:24:57 at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > 2021-05-14T00:24:57.8590107Z May 14 00:24:57 at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > 2021-05-14T00:24:57.8590624Z May 14 00:24:57 at > java.lang.reflect.Method.invoke(Method.java:498) > 2021-05-14T00:24:57.8591147Z May 14 00:24:57 at > org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:305) > 2021-05-14T00:24:57.8591755Z May 14 00:24:57 at > org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:212) > 2021-05-14T00:24:57.8592437Z May 14 00:24:57 at > org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:77) > 2021-05-14T00:24:57.8593065Z May 14 00:24:57 at > org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:158) > 2021-05-14T00:24:57.8593604Z May 14 00:24:57 at > 
akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26) > 2021-05-14T00:24:57.8594079Z May 14 00:24:57 at > akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21) > 2021-05-14T00:24:57.8594581Z May 14 00:24:57 at > scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123) > 2021-05-14T00:24:57.8595090Z May 14 00:24:57 at > akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21) > 2021-05-14T00:24:57.8595684Z May 14 00:24:57 at > scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170) > 2021-05-14T00:24:57.8596201Z May 14 00:24:57 at > scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171) > 2021-05-14T00:24:57.8596780Z May 14 00:24:57 at > scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171) > 2021-05-14T00:24:57.8597354Z May 14 00:24:57 at > akka.actor.Actor$class.aroundReceive(Actor.scala:517) > 2021-05-14T00:24:57.8597844Z May 14 00:24:57 at > akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225) > 2021-05-14T00:24:57.8598405Z May 14 00:24:57 at > akka.actor.ActorCell.receiveMessage(ActorCell.scala:592) > 2021-05-14T00:24:57.8598872Z May 14 00:24:57 at > akka.actor.ActorCell.invoke(ActorCell.scala:561) > 2021-05-14T00:24:57.8599329Z May 14 00:24:57 at > akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258) > 2021-05-14T00:24:57.8599760Z May 14 00:24:57 at > akka.dispatch.Mailbox.run(Mailbox.scala:225) > 2021-05-14T00:24:57.8600187Z May 14 00:24:57 at > akka.dispatch.Mailbox.exec(Mailbox.scala:235) > 2021-05-14T00:24:57.8600519Z May 14 00:24:57 ... 
4 more > 2021-05-14T00:24:57.8600763Z May 14 00:24:57 ] > 2021-05-14T00:24:57.8601150Z May 14 00:24:57 at > org.apache.flink.runtime.rest.RestClient.parseResponse(RestClient.java:486) > 2021-05-14T00:24:57.8601707Z May 14 00:24:57 at > org.apache.flink.runtime.rest.RestClient.lambda$submitRequest$3(RestClient.java:466) > 2021-05-14T00:24:57.8602346Z May 14 00:24:57 at > java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:966) > 2021-05-14T00:24:57.8603013Z May 14 00:24:57 at > java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:940) > 2021-05-14T00:24:57.8603578Z May 14 00:24:57 at > java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:456) > 2021-05-14T00:24:57.8604123Z May 14 00:24:57 at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > 2021-05-14T00:24:57.8604648Z May 14 00:24:57 at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > 2021-05-14T00:24:57.8605135Z May 14 00:24:57 ... 1 more > 2021-05-14T00:24:57.8605372Z May 14 00:24:57 > 2021-05-14T00:24:57.8605812Z May 14 00:24:57 [ERROR] > testClusterClientRetrieval(org.apache.flink.yarn.YARNHighAvailabilityITCase) > Time elapsed: 1,800.422 s <<< ERROR! 
> 2021-05-14T00:24:57.8606429Z May 14 00:24:57 > org.junit.runners.model.TestTimedOutException: test timed out after 1800000 > milliseconds > 2021-05-14T00:24:57.8606847Z May 14 00:24:57 at > java.lang.Thread.sleep(Native Method) > 2021-05-14T00:24:57.8607394Z May 14 00:24:57 at > org.apache.flink.yarn.YarnClusterDescriptor.startAppMaster(YarnClusterDescriptor.java:1223) > 2021-05-14T00:24:57.8607988Z May 14 00:24:57 at > org.apache.flink.yarn.YarnClusterDescriptor.deployInternal(YarnClusterDescriptor.java:593) > 2021-05-14T00:24:57.8608620Z May 14 00:24:57 at > org.apache.flink.yarn.YarnClusterDescriptor.deploySessionCluster(YarnClusterDescriptor.java:418) > 2021-05-14T00:24:57.8609254Z May 14 00:24:57 at > org.apache.flink.yarn.YARNHighAvailabilityITCase.deploySessionCluster(YARNHighAvailabilityITCase.java:356) > 2021-05-14T00:24:57.8609932Z May 14 00:24:57 at > org.apache.flink.yarn.YARNHighAvailabilityITCase.lambda$testClusterClientRetrieval$2(YARNHighAvailabilityITCase.java:224) > 2021-05-14T00:24:57.8610525Z May 14 00:24:57 at > org.apache.flink.yarn.YARNHighAvailabilityITCase$$Lambda$250/2101231496.run(Unknown > Source) > 2021-05-14T00:24:57.8611024Z May 14 00:24:57 at > org.apache.flink.yarn.YarnTestBase.runTest(YarnTestBase.java:287) > 2021-05-14T00:24:57.8611610Z May 14 00:24:57 at > org.apache.flink.yarn.YARNHighAvailabilityITCase.testClusterClientRetrieval(YARNHighAvailabilityITCase.java:219) > 2021-05-14T00:24:57.8612136Z May 14 00:24:57 at > sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > 2021-05-14T00:24:57.8612692Z May 14 00:24:57 at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > 2021-05-14T00:24:57.8613238Z May 14 00:24:57 at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > 2021-05-14T00:24:57.8613831Z May 14 00:24:57 at > java.lang.reflect.Method.invoke(Method.java:498) > 2021-05-14T00:24:57.8614333Z May 14 00:24:57 at > 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > 2021-05-14T00:24:57.8614959Z May 14 00:24:57 at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > 2021-05-14T00:24:57.8615521Z May 14 00:24:57 at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > 2021-05-14T00:24:57.8616071Z May 14 00:24:57 at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > 2021-05-14T00:24:57.8616633Z May 14 00:24:57 at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298) > 2021-05-14T00:24:57.8617325Z May 14 00:24:57 at > org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292) > 2021-05-14T00:24:57.8617864Z May 14 00:24:57 at > java.util.concurrent.FutureTask.run(FutureTask.java:266) > 2021-05-14T00:24:57.8618360Z May 14 00:24:57 at > java.lang.Thread.run(Thread.java:748) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
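
For illustration, the idea from the comment above (wait until the AM_1 Flink process is really gone before signalling the job to finish) could be sketched roughly as below. This is only a minimal sketch under assumptions, not the change that was made in Flink: the class and method names are hypothetical, the PID of the AM_1 java process is assumed to be known to the test already, and {{ProcessHandle}} requires Java 9 or later, so the real test may well use a different mechanism.

```java
import java.time.Duration;
import java.util.function.LongPredicate;

// Hypothetical helper, not part of Flink: blocks until a process has terminated.
final class AmTerminationWait {

    private AmTerminationWait() {}

    // Liveness check based on the JDK ProcessHandle API (Java 9+).
    // An unknown PID is treated as "not alive".
    static boolean isProcessAlive(long pid) {
        return ProcessHandle.of(pid).map(ProcessHandle::isAlive).orElse(false);
    }

    // Polls the liveness check until the process is gone or the timeout expires.
    // Returns true if the process terminated in time, false on timeout or interrupt.
    static boolean waitUntilTerminated(
            long pid, LongPredicate isAlive, Duration timeout, Duration pollInterval) {
        final long deadline = System.nanoTime() + timeout.toNanos();
        while (isAlive.test(pid)) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // still alive when the timeout expired
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
        return true;
    }
}
```

In a test, a call such as {{AmTerminationWait.waitUntilTerminated(am1Pid, AmTerminationWait::isProcessAlive, Duration.ofSeconds(30), Duration.ofMillis(100))}} would sit between steps 2) and 4), where {{am1Pid}} is a hypothetical variable holding the PID of the old AM process. The liveness check is passed in as a parameter so that the polling logic can be exercised without starting real processes.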