Hi Gary: I have tried the 1.5.6 version, it shows the same error.
org.apache.flink.client.program.ProgramInvocationException: Could not retrieve the execution result. at org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:258) at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:464) at org.apache.flink.streaming.api.environment.StreamContextEnvironment.execute(StreamContextEnvironment.java:66) at org.apache.flink.streaming.api.scala.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.scala:654) at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12) at scala.App$$anonfun$main$1.apply(App.scala:76) at scala.App$$anonfun$main$1.apply(App.scala:76) at scala.collection.immutable.List.foreach(List.scala:392) at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:35) at scala.App$class.main(App.scala:76) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:528) at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:420) at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:404) at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:798) at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:289) at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:215) at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1035) at org.apache.flink.client.cli.CliFrontend.lambda$main$9(CliFrontend.java:1111) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1754) at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41) at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1111) Caused by: org.apache.flink.runtime.client.JobSubmissionException: Failed to submit JobGraph. at org.apache.flink.client.program.rest.RestClusterClient.lambda$submitJob$8(RestClusterClient.java:371) at java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:870) at java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:852) at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977) at org.apache.flink.runtime.concurrent.FutureUtils.lambda$retryOperationWithDelay$5(FutureUtils.java:203) at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760) at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736) at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977) at org.apache.flink.runtime.concurrent.FutureUtils$Timeout.run(FutureUtils.java:795) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: java.util.concurrent.TimeoutException Best! Sen > 在 2019年2月27日,下午9:30,Gary Yao <g...@ververica.com> 写道: > > Hi, > > How did you determine "jmhost" and "port"? Actually you do not need to specify > these manually. If the client is using the same configuration as your cluster, > the client will look up the leading JM from ZooKeeper. > > If you have already tried omitting the "-m" parameter, you can check in the > client logs which host is used for the job submission [1]. Note that you need > to enable logging on DEBUG level. > > The root cause in your stacktrace is a TimeoutException. I would debug this by > checking if you can establish a TCP connection – from the machine you are > submitting the job from, to the target host/port [2]. > > Moreover, you are using a quite dated Flink version. The newest version in the > 1.5 major release is 1.5.6 – so consider upgrading to that or even to 1.7. > > Best, > Gary > > [1] > https://github.com/apache/flink/blob/3488f8b144a2127497c39b8ed5a48a65b551c57d/flink-runtime/src/main/java/org/apache/flink/runtime/rest/RestClient.java#L185 > > <https://github.com/apache/flink/blob/3488f8b144a2127497c39b8ed5a48a65b551c57d/flink-runtime/src/main/java/org/apache/flink/runtime/rest/RestClient.java#L185> > [2] > https://stackoverflow.com/questions/4922943/test-from-shell-script-if-remote-tcp-port-is-open > > <https://stackoverflow.com/questions/4922943/test-from-shell-script-if-remote-tcp-port-is-open> > > On Wed, Feb 27, 2019 at 8:09 AM 孙森 <senny...@163.com > <mailto:senny...@163.com>> wrote: > Hi all: > > I run flink (1.5.1 with hadoop 2.7) on yarn ,and submit job by > “/usr/local/flink/bin/flink run -m jmhost:port my.jar”, but the submission is > failed. > The HA configuration is : > high-availability: zookeeper > high-availability.storageDir: hdfs:///flink/ha/ <> > high-availability.zookeeper.quorum: hdp1:2181,hdp2:2181,hdp3:2181 > yarn.application-attempts: 2 > The info showed int the client log: > > 2019-02-27 11:48:38,651 INFO org.apache.flink.runtime.rest.RestClient > - Shutting down rest endpoint. > 2019-02-27 11:48:38,659 INFO org.apache.flink.runtime.rest.RestClient > - Rest endpoint shutdown complete. > 2019-02-27 11:48:38,662 INFO > org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - > Stopping ZooKeeperLeaderRetrievalService /leader/rest_server_lock. > 2019-02-27 11:48:38,665 INFO > org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - > Stopping ZooKeeperLeaderRetrievalService /leader/dispatcher_lock. > 2019-02-27 11:48:38,670 INFO > org.apache.flink.shaded.curator.org.apache.curator.framework.imps.CuratorFrameworkImpl > - backgroundOperationsLoop exiting > 2019-02-27 11:48:38,689 INFO > org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ZooKeeper - Session: > 0x2679c52880c00ee closed > 2019-02-27 11:48:38,689 INFO > org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ClientCnxn - > EventThread shut down for session: 0x2679c52880c00ee > 2019-02-27 11:48:38,690 ERROR org.apache.flink.client.cli.CliFrontend > - Error while running the command. > org.apache.flink.client.program.ProgramInvocationException: Could not > retrieve the execution result. > at > org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:257) > at > org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:464) > at > org.apache.flink.streaming.api.environment.StreamContextEnvironment.execute(StreamContextEnvironment.java:66) > at > org.apache.flink.streaming.api.scala.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.scala:654) > at scala.Function0$class.apply$mcV$sp(Function0.scala:34) > at > scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12) > at scala.App$$anonfun$main$1.apply(App.scala:76) > at scala.App$$anonfun$main$1.apply(App.scala:76) > at scala.collection.immutable.List.foreach(List.scala:381) > at > scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:35) > at scala.App$class.main(App.scala:76) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:528) > at > org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:420) > at > org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:404) > at > org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:785) > at > org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:279) > at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:214) > at > org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1025) > at > org.apache.flink.client.cli.CliFrontend.lambda$main$9(CliFrontend.java:1101) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1754) > at > org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41) > at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1101) > Caused by: org.apache.flink.runtime.client.JobSubmissionException: Failed to > submit JobGraph. > at > org.apache.flink.client.program.rest.RestClusterClient.lambda$submitJob$8(RestClusterClient.java:370) > at > java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:870) > at > java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:852) > at > java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) > at > java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977) > at > org.apache.flink.runtime.concurrent.FutureUtils.lambda$retryOperationWithDelay$5(FutureUtils.java:214) > at > java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760) > at > java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736) > at > java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) > at > java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977) > at > org.apache.flink.runtime.concurrent.FutureUtils$Timeout.run(FutureUtils.java:834) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.util.concurrent.CompletionException: > org.apache.flink.runtime.concurrent.FutureUtils$RetryException: Could not > complete the operation. Exception is not retryable. > at > java.util.concurrent.CompletableFuture.encodeRelay(CompletableFuture.java:326) > at > java.util.concurrent.CompletableFuture.completeRelay(CompletableFuture.java:338) > at > java.util.concurrent.CompletableFuture.uniRelay(CompletableFuture.java:911) > at > java.util.concurrent.CompletableFuture$UniRelay.tryFire(CompletableFuture.java:899) > ... 15 more > Caused by: org.apache.flink.runtime.concurrent.FutureUtils$RetryException: > Could not complete the operation. Exception is not retryable. > ... 13 more > Caused by: java.util.concurrent.CompletionException: > java.util.concurrent.TimeoutException > at > java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292) > at > java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308) > at > java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:593) > at > java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577) > ... 10 more > Caused by: java.util.concurrent.TimeoutException > ... 8 more > > ------------------------------------------------------------ > The program finished with the following exception: > > org.apache.flink.client.program.ProgramInvocationException: Could not > retrieve the execution result. > at > org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:257) > at > org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:464) > at > org.apache.flink.streaming.api.environment.StreamContextEnvironment.execute(StreamContextEnvironment.java:66) > at > org.apache.flink.streaming.api.scala.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.scala:654) > at > edp.wormhole.flinkx.eventflow.WormholeFlinkMainProcess.process(WormholeFlinkMainProcess.scala:114) > at > edp.wormhole.flinkx.eventflow.WormholeFlinkxStarter$.delayedEndpoint$edp$wormhole$flinkx$eventflow$WormholeFlinkxStarter$1(WormholeFlinkxStarter.scala:40) > at > edp.wormhole.flinkx.eventflow.WormholeFlinkxStarter$delayedInit$body.apply(WormholeFlinkxStarter.scala:29) > at scala.Function0$class.apply$mcV$sp(Function0.scala:34) > at > scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12) > at scala.App$$anonfun$main$1.apply(App.scala:76) > at scala.App$$anonfun$main$1.apply(App.scala:76) > at scala.collection.immutable.List.foreach(List.scala:381) > at > scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:35) > at scala.App$class.main(App.scala:76) > at > edp.wormhole.flinkx.eventflow.WormholeFlinkxStarter$.main(WormholeFlinkxStarter.scala:29) > at > edp.wormhole.flinkx.eventflow.WormholeFlinkxStarter.main(WormholeFlinkxStarter.scala) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:528) > at > org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:420) > at > org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:404) > at > org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:785) > at > org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:279) > at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:214) > at > org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1025) > at > org.apache.flink.client.cli.CliFrontend.lambda$main$9(CliFrontend.java:1101) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1754) > at > org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41) > at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1101) > Caused by: org.apache.flink.runtime.client.JobSubmissionException: Failed to > submit JobGraph. > at > org.apache.flink.client.program.rest.RestClusterClient.lambda$submitJob$8(RestClusterClient.java:370) > at > java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:870) > at > java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:852) > at > java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) > at > java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977) > at > org.apache.flink.runtime.concurrent.FutureUtils.lambda$retryOperationWithDelay$5(FutureUtils.java:214) > at > java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760) > at > java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736) > at > java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) > at > java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977) > at > org.apache.flink.runtime.concurrent.FutureUtils$Timeout.run(FutureUtils.java:834) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) > at > java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.util.concurrent.CompletionException: > org.apache.flink.runtime.concurrent.FutureUtils$RetryException: Could not > complete the operation. Exception is not retryable. > at > java.util.concurrent.CompletableFuture.encodeRelay(CompletableFuture.java:326) > at > java.util.concurrent.CompletableFuture.completeRelay(CompletableFuture.java:338) > at > java.util.concurrent.CompletableFuture.uniRelay(CompletableFuture.java:911) > at > java.util.concurrent.CompletableFuture$UniRelay.tryFire(CompletableFuture.java:899) > ... 15 more > Caused by: org.apache.flink.runtime.concurrent.FutureUtils$RetryException: > Could not complete the operation. Exception is not retryable. > ... 13 more > Caused by: java.util.concurrent.CompletionException: > java.util.concurrent.TimeoutException > at > java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292) > at > java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308) > at > java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:593) > at > java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577) > ... 10 more > Caused by: java.util.concurrent.TimeoutException > ... 8 more >