Hi Dominique,

Could you tell us the version / build commit of Flink that you’re using?
Cheers,
Gordon

On 30 May 2017 at 4:29:08 PM, Dominique Rondé (dominique.ro...@allsecur.de) wrote:

Hi folks,

I have just run into the need to bring Flink onto a YARN system that is configured with Kerberos. According to the documentation, I changed flink-conf.yaml as follows:

security.kerberos.login.use-ticket-cache: true
security.kerberos.login.contexts: Client

I know that providing a keytab is the preferred approach, but I have to make a special request to receive one. ;-)

After startup, provisioning stops with this error:

2017-05-30 16:16:48,684 INFO org.apache.flink.yarn.YarnClusterClient - Waiting until all TaskManagers have connected
Waiting until all TaskManagers have connected
2017-05-30 16:16:48,685 INFO org.apache.flink.yarn.YarnClusterClient - Starting client actor system.
2017-05-30 16:16:52,099 WARN org.apache.flink.runtime.net.ConnectionUtils - Could not connect to lfrar255.srv.allianz/10.17.24.162:56659. Selecting a local address using heuristics.
2017-05-30 16:16:52,473 INFO akka.event.slf4j.Slf4jLogger - Slf4jLogger started
2017-05-30 16:16:52,512 INFO Remoting - Starting remoting
2017-05-30 16:16:52,670 INFO Remoting - Remoting started; listening on addresses :[akka.tcp://fl...@sla09037.srv.allianz:34579]
Exception in thread "main" java.lang.RuntimeException: Unable to get ClusterClient status from Application Client
    at org.apache.flink.yarn.YarnClusterClient.getClusterStatus(YarnClusterClient.java:248)
    at org.apache.flink.yarn.YarnClusterClient.waitForClusterToBeReady(YarnClusterClient.java:520)
    at org.apache.flink.yarn.cli.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:660)
    at org.apache.flink.yarn.cli.FlinkYarnSessionCli$1.call(FlinkYarnSessionCli.java:476)
    at org.apache.flink.yarn.cli.FlinkYarnSessionCli$1.call(FlinkYarnSessionCli.java:473)
    at org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
    at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)
    at org.apache.flink.yarn.cli.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:473)
Caused by: org.apache.flink.runtime.leaderretrieval.LeaderRetrievalException: Could not retrieve the leader gateway
    at org.apache.flink.runtime.util.LeaderRetrievalUtils.retrieveLeaderGateway(LeaderRetrievalUtils.java:141)
    at org.apache.flink.client.program.ClusterClient.getJobManagerGateway(ClusterClient.java:691)
    at org.apache.flink.yarn.YarnClusterClient.getClusterStatus(YarnClusterClient.java:242)
    ... 10 more
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [10000 milliseconds]
    at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
    at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
    at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:190)
    at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
    at scala.concurrent.Await$.result(package.scala:190)
    at scala.concurrent.Await.result(package.scala)
    at org.apache.flink.runtime.util.LeaderRetrievalUtils.retrieveLeaderGateway(LeaderRetrievalUtils.java:139)
    ... 12 more
2017-05-30 16:17:02,690 INFO org.apache.flink.yarn.YarnClusterClient - Shutting down YarnClusterClient from the client shutdown hook
2017-05-30 16:17:02,691 INFO org.apache.flink.yarn.YarnClusterClient - Disconnecting YarnClusterClient from ApplicationMaster
2017-05-30 16:17:03,693 INFO akka.remote.RemoteActorRefProvider$RemotingTerminator - Shutting down remote daemon.
2017-05-30 16:17:03,696 INFO akka.remote.RemoteActorRefProvider$RemotingTerminator - Remote daemon shut down; proceeding with flushing remote transports.
2017-05-30 16:17:03,744 INFO akka.remote.RemoteActorRefProvider$RemotingTerminator - Remoting shut down.
Does anyone have an idea what is going wrong?

Best wishes,
Dominique
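
For reference, the keytab-based setup that the message calls the preferred option might look roughly like the fragment below in flink-conf.yaml. This is only a sketch: the keytab path and principal are hypothetical placeholders, not values from this thread, and whether switching to a keytab would resolve the timeout above is not established here.

```yaml
# Sketch of a keytab-based Kerberos login in flink-conf.yaml.
# The path and principal below are placeholders (assumptions), not real values.
security.kerberos.login.use-ticket-cache: false
security.kerberos.login.keytab: /path/to/flink.keytab
security.kerberos.login.principal: flink-user@EXAMPLE.COM
# Same login context as in the original configuration.
security.kerberos.login.contexts: Client
```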