I get the following error when trying to savepoint a job, for example:

The program finished with the following exception:

org.apache.flink.util.FlinkException: Could not connect to the leading JobManager. Please check that the JobManager is running.
    at org.apache.flink.client.program.ClusterClient.getJobManagerGateway(ClusterClient.java:960)
    at org.apache.flink.client.program.ClusterClient.triggerSavepoint(ClusterClient.java:737)
    at org.apache.flink.client.cli.CliFrontend.triggerSavepoint(CliFrontend.java:771)
    at org.apache.flink.client.cli.CliFrontend.lambda$checkpoint$10(CliFrontend.java:760)
    at org.apache.flink.client.cli.CliFrontend.runClusterAction(CliFrontend.java:1044)
    at org.apache.flink.client.cli.CliFrontend.checkpoint(CliFrontend.java:759)
    at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1127)
    at org.apache.flink.client.cli.CliFrontend.lambda$main$12(CliFrontend.java:1188)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
    at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
    at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1188)
Caused by: org.apache.flink.runtime.leaderretrieval.LeaderRetrievalException: Could not retrieve the leader gateway.
    at org.apache.flink.runtime.util.LeaderRetrievalUtils.retrieveLeaderGateway(LeaderRetrievalUtils.java:83)
    at org.apache.flink.client.program.ClusterClient.getJobManagerGateway(ClusterClient.java:955)
    ... 12 more
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [20000 milliseconds]
    at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:223)
    at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:227)
    at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:190)
    at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
    at scala.concurrent.Await$.result(package.scala:190)
    at scala.concurrent.Await.result(package.scala)
    at org.apache.flink.runtime.util.LeaderRetrievalUtils.retrieveLeaderGateway(LeaderRetrievalUtils.java:81)
    ... 13 more

There is no error when trying the same operation with the 1.7 client on a 1.6 (legacy execution) job.

This looks like a firewall issue, so I'm trying to restrict the ports to the open ranges, but I'm not sure what I have to change.

Gyula

Gyula Fóra <gyula.f...@gmail.com> wrote (on Tue, Dec 4, 2018, 15:11):

> Hi!
>
> We have been running Flink on Yarn for quite some time and historically we
> specified port ranges so that the client can access the cluster:
>
> yarn.application-master.port: 100-200
>
> Now we updated to flink 1.7 and try to migrate away from the legacy
> execution mode but we run into a problem that we cannot connect to the
> running job from the command line client like before.
>
> What is the equivalent port config that would make sure that ports that
> are needed to be accessible from the client land between 100 and 200?
>
> Thanks,
> Gyula
>