Hi Dominique,

Could you tell us the version / build commit of Flink that you’re using?

Cheers,
Gordon


On 30 May 2017 at 4:29:08 PM, Dominique Rondé (dominique.ro...@allsecur.de) wrote:

Hi folks,

I now need to run Flink on a YARN cluster that is secured with Kerberos. 
Following the documentation, I changed flink-conf.yaml as follows:

security.kerberos.login.use-ticket-cache: true
security.kerberos.login.contexts: Client

I know that providing a keytab is the preferred approach, but I would have to 
file a special request to receive one. ;-)
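
For comparison, a keytab-based setup would look roughly like the following 
sketch. The key names assume the same security.kerberos.login.* naming used 
above (older Flink releases may name these keys differently); the keytab path 
and the principal are placeholders, not values from this cluster.

```yaml
# Sketch only: path and principal below are hypothetical placeholders.
security.kerberos.login.use-ticket-cache: false
security.kerberos.login.keytab: /path/to/flink.keytab
security.kerberos.login.principal: flink-user@EXAMPLE.COM
security.kerberos.login.contexts: Client
```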

After startup, provisioning stops with this error:

2017-05-30 16:16:48,684 INFO  org.apache.flink.yarn.YarnClusterClient - Waiting until all TaskManagers have connected
Waiting until all TaskManagers have connected
2017-05-30 16:16:48,685 INFO  org.apache.flink.yarn.YarnClusterClient - Starting client actor system.
2017-05-30 16:16:52,099 WARN  org.apache.flink.runtime.net.ConnectionUtils - Could not connect to lfrar255.srv.allianz/10.17.24.162:56659. Selecting a local address using heuristics.
2017-05-30 16:16:52,473 INFO  akka.event.slf4j.Slf4jLogger - Slf4jLogger started
2017-05-30 16:16:52,512 INFO  Remoting - Starting remoting
2017-05-30 16:16:52,670 INFO  Remoting - Remoting started; listening on addresses :[akka.tcp://fl...@sla09037.srv.allianz:34579]
Exception in thread "main" java.lang.RuntimeException: Unable to get ClusterClient status from Application Client
        at org.apache.flink.yarn.YarnClusterClient.getClusterStatus(YarnClusterClient.java:248)
        at org.apache.flink.yarn.YarnClusterClient.waitForClusterToBeReady(YarnClusterClient.java:520)
        at org.apache.flink.yarn.cli.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:660)
        at org.apache.flink.yarn.cli.FlinkYarnSessionCli$1.call(FlinkYarnSessionCli.java:476)
        at org.apache.flink.yarn.cli.FlinkYarnSessionCli$1.call(FlinkYarnSessionCli.java:473)
        at org.apache.flink.runtime.security.HadoopSecurityContext$1.run(HadoopSecurityContext.java:43)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
        at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:40)
        at org.apache.flink.yarn.cli.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:473)
Caused by: org.apache.flink.runtime.leaderretrieval.LeaderRetrievalException: Could not retrieve the leader gateway
        at org.apache.flink.runtime.util.LeaderRetrievalUtils.retrieveLeaderGateway(LeaderRetrievalUtils.java:141)
        at org.apache.flink.client.program.ClusterClient.getJobManagerGateway(ClusterClient.java:691)
        at org.apache.flink.yarn.YarnClusterClient.getClusterStatus(YarnClusterClient.java:242)
        ... 10 more
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [10000 milliseconds]
        at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
        at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
        at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:190)
        at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
        at scala.concurrent.Await$.result(package.scala:190)
        at scala.concurrent.Await.result(package.scala)
        at org.apache.flink.runtime.util.LeaderRetrievalUtils.retrieveLeaderGateway(LeaderRetrievalUtils.java:139)
        ... 12 more
2017-05-30 16:17:02,690 INFO  org.apache.flink.yarn.YarnClusterClient - Shutting down YarnClusterClient from the client shutdown hook
2017-05-30 16:17:02,691 INFO  org.apache.flink.yarn.YarnClusterClient - Disconnecting YarnClusterClient from ApplicationMaster
2017-05-30 16:17:03,693 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator - Shutting down remote daemon.
2017-05-30 16:17:03,696 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator - Remote daemon shut down; proceeding with flushing remote transports.
2017-05-30 16:17:03,744 INFO  akka.remote.RemoteActorRefProvider$RemotingTerminator - Remoting shut down.
 
Does anyone have an idea what is going wrong?

Best wishes

Dominique
