Hi Max!

I'm using Flink 0.10.1, and indeed the cluster seems to be created fine: everything in the JobManager web UI looks good.
It seems like the JobManager initiates the connection with my VM and cannot reach it. It could be that this is similar to the problem here:

http://apache-spark-user-list.1001560.n3.nabble.com/spark-with-docker-errors-with-akka-NAT-td7702.html

I probably have to make some changes to the networking configuration of my VM so it can be reached by the JobManager despite using a different port each time.

- Pieter

2016-02-06 14:05 GMT+01:00 Maximilian Michels <[email protected]>:

> Hi Pieter,
>
> Which version of Flink are you using? It appears you've created a
> Flink YARN cluster but you can't reach the JobManager afterwards.
>
> Cheers,
> Max
>
> On Sat, Feb 6, 2016 at 1:42 PM, Pieter Hameete <[email protected]> wrote:
> > Hi Robert,
> >
> > unfortunately there are no signs of what is going wrong in the logs. The
> > last log messages are about successful registration of the TaskManagers.
> >
> > I'm also fairly sure it must be something in my VM that is causing this,
> > because when I start the yarn-session from a login node that is on the
> > same network as the Hadoop cluster there are no problems registering with
> > the JobManager. I did also notice the following message in the local
> > console:
> >
> > 12:30:27,173 WARN Remoting
> > - Tried to associate with unreachable remote address
> > [akka.tcp://[email protected]:41539]. Address is now gated for 5000 ms,
> > all messages to this address will be delivered to dead letters. Reason:
> > connection timed out: /145.100.41.13:41539
> >
> > I can ping the JobManager fine from within the VM. Could there be some
> > invalid or missing configuration on my side?
> >
> > Cheers,
> >
> > Pieter
> >
> > 2016-02-06 12:54 GMT+01:00 Robert Metzger <[email protected]>:
> >>
> >> Hi,
> >>
> >> did you check the logs of the JobManager itself? Maybe it'll tell us
> >> already what's going on.
> >>
> >> On Sat, Feb 6, 2016 at 12:14 PM, Pieter Hameete <[email protected]> wrote:
> >>>
> >>> Hi Guys!
> >>>
> >>> I'm attempting to run Flink on YARN, but I run into an issue. I'm
> >>> starting the Flink YARN session from an Ubuntu 14.04 VM. All goes well
> >>> until after the JobManager web UI is started:
> >>>
> >>> JobManager web interface address
> >>> http://head05.hathi.surfsara.nl:8088/proxy/application_1452780322684_10532/
> >>> Waiting until all TaskManagers have connected
> >>> 11:09:51,557 INFO org.apache.flink.yarn.ApplicationClient
> >>> - Notification about new leader address
> >>> akka.tcp://[email protected]:35666/user/jobmanager with session ID null.
> >>> No status updates from the YARN cluster received so far. Waiting ...
> >>> 11:09:51,578 INFO org.apache.flink.yarn.ApplicationClient
> >>> - Received address of new leader
> >>> akka.tcp://[email protected]:35666/user/jobmanager with session ID null.
> >>> 11:09:51,583 INFO org.apache.flink.yarn.ApplicationClient
> >>> - Disconnect from JobManager null.
> >>> 11:09:51,595 INFO org.apache.flink.yarn.ApplicationClient
> >>> - Trying to register at JobManager
> >>> akka.tcp://[email protected]:35666/user/jobmanager.
> >>> No status updates from the YARN cluster received so far. Waiting ...
> >>> No status updates from the YARN cluster received so far. Waiting ...
> >>>
> >>> It then hangs on these last steps (trying to register, no status
> >>> updates).
> >>>
> >>> I'm sure there must be a problem on my side that is causing me not to
> >>> be able to register at the JobManager. What could cause such connection
> >>> problems?
> >>>
> >>> Any tips are very welcome :-)
> >>>
> >>> Cheers and have a good weekend!
> >>>
> >>> - Pieter
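
The thread describes a host that answers ping while the Akka connection to a specific port times out. One way to narrow that down is to test plain TCP reachability of a host and port. The following is a minimal Python sketch and not part of Flink; the function name tcp_reachable and its default timeout are made up for illustration.

```python
import socket


def tcp_reachable(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration only; it opens a plain TCP
    connection and closes it again, without speaking the Akka protocol.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and unresolvable host names.
        return False
```

Called from the VM as tcp_reachable("145.100.41.13", 41539), using the address from the warning quoted above, it would show whether that port is reachable over TCP even though ping succeeds. If the NAT explanation is right, the same check run from a cluster node against the VM's address and the client's port would be the more telling direction, assuming that port is known.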
