Hi,

Using Flink 1.2.0, I ran into the issue described in https://issues.apache.org/jira/browse/FLINK-6117. This issue is fixed in version 1.3.0, but I have reasons to look for a workaround instead.
Here is what I did:

1. Changed the source according to https://github.com/apache/flink/commit/eef85e095a8a0e4c4553631b74ba7b9f173cebf0
2. Replaced $FLINK_HOME/lib/flink-dist_2.11-1.2.0.jar with the patched build
3. Set "zookeeper.sasl.disable: true" in flink-conf.yaml
4. Ran yarn-session.sh

The original problem (authentication failed) seems to be resolved, but now I get this error:

Exception in thread "main" java.lang.RuntimeException: Failed to retrieve JobManager address
    at org.apache.flink.client.program.ClusterClient.getJobManagerAddress(ClusterClient.java:248)
    at org.apache.flink.yarn.cli.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:627)
    at org.apache.flink.yarn.cli.FlinkYarnSessionCli$1.call(FlinkYarnSessionCli.java:476)
    at org.apache.flink.yarn.cli.FlinkYarnSessionCli$1.call(FlinkYarnSessionCli.java:473)
    at org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
    at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)
    at org.apache.flink.yarn.cli.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:473)
Caused by: org.apache.flink.runtime.leaderretrieval.LeaderRetrievalException: Could not retrieve the leader address and leader session ID.
    at org.apache.flink.runtime.util.LeaderRetrievalUtils.retrieveLeaderConnectionInfo(LeaderRetrievalUtils.java:175)
    at org.apache.flink.client.program.ClusterClient.getJobManagerAddress(ClusterClient.java:242)
    ... 9 more
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [60000 milliseconds]
    at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
    at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
    at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:190)
    at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
    at scala.concurrent.Await$.result(package.scala:190)
    at scala.concurrent.Await.result(package.scala)
    at org.apache.flink.runtime.util.LeaderRetrievalUtils.retrieveLeaderConnectionInfo(LeaderRetrievalUtils.java:173)
    ... 10 more

I believe the related settings (Flink, Hadoop, ZooKeeper) are correct, because yarn-session works smoothly with Flink 1.3.2 in the same environment.

Does anyone have an idea what could cause this error?

Thanks.