Hi everyone, I'm currently experiencing a weird situation and I hope you can help me out with it.
I've cloned and built from master, then edited the default config file to add my Hadoop config path, exported the HADOOP_CONF_DIR env var and ran:

bin/yarn-session.sh -n 1 -s 2 -jm 2048 -tm 2048

The first thing I noticed is that I had to pass "-s 2", otherwise the task managers get created with -1 slots (!) by default. With "-s 2" in place, the YARN session startup hangs when trying to register the task managers. I stopped the session, aggregated the logs and found a lot (several thousand) of the messages I attach at the bottom. Any idea what this may be?

Thank you a lot in advance!

2016-04-19 12:15:59,507 INFO org.apache.flink.yarn.YarnTaskManager - Trying to register at JobManager akka.tcp://flink@172.31.20.101:57379/user/jobmanager (attempt 1, timeout: 500 milliseconds)
2016-04-19 12:15:59,649 ERROR org.apache.flink.yarn.YarnTaskManager - The registration at JobManager Some(akka.tcp://flink@172.31.20.101:57379/user/jobmanager) was refused, because: java.lang.IllegalStateException: Resource ResourceID{resourceId='container_e02_1461077293721_0016_01_000002'} not registered with resource manager.. Retrying later...
2016-04-19 12:16:00,025 INFO org.apache.flink.yarn.YarnTaskManager - Trying to register at JobManager akka.tcp://flink@172.31.20.101:57379/user/jobmanager (attempt 2, timeout: 1000 milliseconds)
2016-04-19 12:16:00,033 ERROR org.apache.flink.yarn.YarnTaskManager - The registration at JobManager Some(akka.tcp://flink@172.31.20.101:57379/user/jobmanager) was refused, because: java.lang.IllegalStateException: Resource ResourceID{resourceId='container_e02_1461077293721_0016_01_000002'} not registered with resource manager.. Retrying later...
2016-04-19 12:16:01,045 INFO org.apache.flink.yarn.YarnTaskManager - Trying to register at JobManager akka.tcp://flink@172.31.20.101:57379/user/jobmanager (attempt 3, timeout: 2000 milliseconds)
2016-04-19 12:16:01,053 ERROR org.apache.flink.yarn.YarnTaskManager - The registration at JobManager Some(akka.tcp://flink@172.31.20.101:57379/user/jobmanager) was refused, because: java.lang.IllegalStateException: Resource ResourceID{resourceId='container_e02_1461077293721_0016_01_000002'} not registered with resource manager.. Retrying later...
2016-04-19 12:16:03,064 INFO org.apache.flink.yarn.YarnTaskManager - Trying to register at JobManager akka.tcp://flink@172.31.20.101:57379/user/jobmanager (attempt 4, timeout: 4000 milliseconds)
2016-04-19 12:16:03,072 ERROR org.apache.flink.yarn.YarnTaskManager - The registration at JobManager Some(akka.tcp://flink@172.31.20.101:57379/user/jobmanager) was refused, because: java.lang.IllegalStateException: Resource ResourceID{resourceId='container_e02_1461077293721_0016_01_000002'} not registered with resource manager.. Retrying later...
2016-04-19 12:16:07,085 INFO org.apache.flink.yarn.YarnTaskManager - Trying to register at JobManager akka.tcp://flink@172.31.20.101:57379/user/jobmanager (attempt 5, timeout: 8000 milliseconds)
2016-04-19 12:16:07,092 ERROR org.apache.flink.yarn.YarnTaskManager - The registration at JobManager Some(akka.tcp://flink@172.31.20.101:57379/user/jobmanager) was refused, because: java.lang.IllegalStateException: Resource ResourceID{resourceId='container_e02_1461077293721_0016_01_000002'} not registered with resource manager.. Retrying later...
2016-04-19 12:16:09,664 INFO org.apache.flink.yarn.YarnTaskManager - Trying to register at JobManager akka.tcp://flink@172.31.20.101:57379/user/jobmanager (attempt 1, timeout: 500 milliseconds)

--
BR,
Stefano Baghino
Software Engineer @ Radicalbit