Hi,

I am running into some strange issues on YARN with Flink 1.1.3 and 1.1.4. For some reason I started getting the error shown in the logs below. The JobManager starts and the application is in the ACCEPTED state, but it cannot seem to communicate with the scheduler (the address 0.0.0.0:8030 looks strange).
I didn't change anything on the YARN cluster, and this worked previously, but I just can't get it to work now. The yarn-site.xml contains the proper RM addresses. Does anybody have any ideas where to go from here?

Cheers,
Gyula

JM log:

2016-11-12 11:56:06,894 DEBUG org.apache.hadoop.ipc.Client - The ping interval is 60000 ms.
2016-11-12 11:56:06,894 DEBUG org.apache.hadoop.ipc.Client - Connecting to /0.0.0.0:8030
2016-11-12 11:56:06,899 DEBUG org.apache.hadoop.ipc.Client - closing ipc connection to 0.0.0.0/0.0.0.0:8030: Connection refused
java.net.ConnectException: Call From splat24.sto.midasplayer.com/172.25.86.166 to 0.0.0.0:8030 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
    at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:783)
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:730)
    at org.apache.hadoop.ipc.Client.call(Client.java:1410)
    at org.apache.hadoop.ipc.Client.call(Client.java:1359)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
    at com.sun.proxy.$Proxy8.registerApplicationMaster(Unknown Source)
    at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.registerApplicationMaster(ApplicationMasterProtocolPBClientImpl.java:106)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy9.registerApplicationMaster(Unknown Source)
    at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.registerApplicationMaster(AMRMClientImpl.java:196)
    at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl.registerApplicationMaster(AMRMClientAsyncImpl.java:138)
    at org.apache.flink.yarn.YarnFlinkResourceManager.initialize(YarnFlinkResourceManager.java:259)
    at org.apache.flink.runtime.clusterframework.FlinkResourceManager.preStart(FlinkResourceManager.java:185)
    at akka.actor.Actor$class.aroundPreStart(Actor.scala:470)
    at akka.actor.UntypedActor.aroundPreStart(UntypedActor.scala:97)
    at akka.actor.ActorCell.create(ActorCell.scala:580)
    at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:456)
    at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478)

Client:

2016-11-12 12:31:31,080 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
2016-11-12 12:31:31,080 INFO org.apache.flink.yarn.cli.FlinkYarnSessionCli - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
2016-11-12 12:31:31,101 INFO org.apache.flink.yarn.YarnClusterDescriptor - Using values:
2016-11-12 12:31:31,101 INFO org.apache.flink.yarn.YarnClusterDescriptor - TaskManager count = 1
2016-11-12 12:31:31,101 INFO org.apache.flink.yarn.YarnClusterDescriptor - JobManager memory = 1024
2016-11-12 12:31:31,102 INFO org.apache.flink.yarn.YarnClusterDescriptor - TaskManager memory = 11000
2016-11-12 12:31:31,119 INFO org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at /0.0.0.0:8032
2016-11-12 12:31:31,394 WARN org.apache.flink.yarn.YarnClusterDescriptor - The file system scheme is 'file'. This indicates that the specified Hadoop configuration path is wrong and the system is using the default Hadoop configuration values.The Flink YARN client needs to store its files in a distributed file system
2016-11-12 12:31:31,457 INFO org.apache.flink.yarn.Utils - Copying from file:/fjord/sites/flink-1.1.3/conf/log4j.properties to file:/fjord/splat/flink/yarn/.flink/application_1478896050022_0013/log4j.properties
2016-11-12 12:31:42,321 INFO org.apache.flink.yarn.Utils - Copying from file:/fjord/sites/flink-1.1.3/lib to file:/fjord/splat/flink/yarn/.flink/application_1478896050022_0013/lib
2016-11-12 12:32:18,457 INFO org.apache.flink.yarn.Utils - Copying from file:/fjord/sites/rbea/rbea-on-flink-1.0-SNAPSHOT.jar to file:/fjord/splat/flink/yarn/.flink/application_1478896050022_0013/rbea-on-flink-1.0-SNAPSHOT.jar
2016-11-12 12:32:39,725 INFO org.apache.flink.yarn.Utils - Copying from file:/fjord/sites/flink-1.1.3/lib/flink-dist_2.10-1.1.4.jar to file:/fjord/splat/flink/yarn/.flink/application_1478896050022_0013/flink-dist_2.10-1.1.4.jar
2016-11-12 12:32:58,154 INFO org.apache.flink.yarn.Utils - Copying from /fjord/sites/flink-1.1.3/conf/flink-conf.yaml to file:/fjord/splat/flink/yarn/.flink/application_1478896050022_0013/flink-conf.yaml
2016-11-12 12:33:02,218 INFO org.apache.flink.yarn.YarnClusterDescriptor - Submitting application master application_1478896050022_0013
2016-11-12 12:33:02,256 INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl - Submitted application application_1478896050022_0013
2016-11-12 12:33:02,257 INFO org.apache.flink.yarn.YarnClusterDescriptor - Waiting for the cluster to be allocated
2016-11-12 12:33:02,259 INFO org.apache.flink.yarn.YarnClusterDescriptor - Deploying cluster, current state ACCEPTED
2016-11-12 12:34:02,485 INFO org.apache.flink.yarn.YarnClusterDescriptor - Deployment took more than 60 seconds. Please check if the requested resources are available in the YARN cluster
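
A note on the addresses in these logs: 0.0.0.0:8030 and 0.0.0.0:8032 are what YARN's shipped defaults resolve to when no yarn-site.xml overrides them, which fits the client's warning that the default Hadoop configuration values are in use. One way to check this could be to resolve the two ResourceManager keys from the yarn-site.xml in the directory the Flink client actually reads. The following is only a rough sketch under several assumptions: the helper names (`load_site_properties`, `resolve`, `report`) are made up for illustration, the `${...}` expansion is a simplified stand-in for Hadoop's own (it ignores system properties), and ResourceManager HA settings are not handled.

```python
import os
import re
import xml.etree.ElementTree as ET

# Defaults as shipped in Hadoop's yarn-default.xml for the keys of interest.
YARN_DEFAULTS = {
    "yarn.resourcemanager.hostname": "0.0.0.0",
    "yarn.resourcemanager.address": "${yarn.resourcemanager.hostname}:8032",
    "yarn.resourcemanager.scheduler.address": "${yarn.resourcemanager.hostname}:8030",
}


def load_site_properties(xml_text):
    """Collect the <property><name>/<value> pairs of a Hadoop *-site.xml."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None and value is not None:
            props[name.strip()] = value.strip()
    return props


def resolve(key, site_props):
    """Site value wins over the default; then expand ${other.key} references.

    Simplified assumption: only references to keys known here are expanded.
    """
    merged = dict(YARN_DEFAULTS)
    merged.update(site_props)
    value = merged[key]
    for _ in range(20):  # bounded, so a self-reference cannot loop forever
        match = re.search(r"\$\{([^}]+)\}", value)
        if match is None or match.group(1) not in merged:
            break
        value = value.replace(match.group(0), merged[match.group(1)])
    return value


def report(conf_dir):
    """Print the resolved ResourceManager addresses for conf_dir/yarn-site.xml."""
    path = os.path.join(conf_dir, "yarn-site.xml")
    if os.path.isfile(path):
        with open(path, encoding="utf-8") as handle:
            site_props = load_site_properties(handle.read())
    else:
        print("no yarn-site.xml in %r, only defaults apply" % conf_dir)
        site_props = {}
    for key in ("yarn.resourcemanager.address",
                "yarn.resourcemanager.scheduler.address"):
        print(key, "=", resolve(key, site_props))
```

Calling `report(os.environ.get("HADOOP_CONF_DIR", ""))` (or the same with `YARN_CONF_DIR`) in the shell that launches the Flink client would show whether that directory yields the real RM addresses or falls back to 0.0.0.0. If it prints 0.0.0.0:8030, the configuration directory seen by the client is probably not the one that holds the correct yarn-site.xml.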