Hi all,

I tried a couple of ways to start spark-shell on YARN, but couldn't get it to work.

The following seems to be what the online documentation
(http://spark.apache.org/docs/latest/running-on-yarn.html) suggests:
SPARK_JAR=hdfs://test/user/spark/share/lib/spark-assembly-1.0.0-hadoop2.2.0.jar
YARN_CONF_DIR=/opt/hadoop/conf ./spark-shell --master yarn-client

The spark-shell help output seems to suggest "--master yarn
--deploy-mode cluster" instead.

But either way, I am seeing the following messages:
14/06/01 00:33:20 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
14/06/01 00:33:21 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/06/01 00:33:22 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)

My guess is that spark-shell is trying to talk to the ResourceManager to set
up the Spark master/worker nodes, but I am not sure where 0.0.0.0:8032 came
from. I am running CDH5 with two ResourceManagers in HA mode. Their
IP/port should be in /opt/hadoop/conf/yarn-site.xml. I tried both
HADOOP_CONF_DIR and YARN_CONF_DIR, but that info isn't picked up.
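
For reference, 0.0.0.0:8032 is the stock default for
yarn.resourcemanager.address in yarn-default.xml, so seeing it suggests the
site file was not read at all. One way to check what the config directory
actually defines is to list its yarn.resourcemanager.* entries. Below is a
minimal Python sketch, assuming the usual Hadoop <configuration>/<property>
layout; the function name resource_manager_properties is made up for
illustration and is not part of Spark or Hadoop.

```python
import os
import xml.etree.ElementTree as ET


def resource_manager_properties(conf_dir):
    """Return the yarn.resourcemanager.* properties found in conf_dir/yarn-site.xml.

    Hypothetical helper for illustration only. It reads the file directly and
    does not apply Hadoop's defaults or variable substitution.
    """
    path = os.path.join(conf_dir, "yarn-site.xml")
    root = ET.parse(path).getroot()
    props = {}
    for prop in root.findall("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name.startswith("yarn.resourcemanager."):
            props[name] = value
    return props


if __name__ == "__main__":
    # Falls back to the directory used in this post when the variable is unset.
    conf_dir = os.environ.get("YARN_CONF_DIR", "/opt/hadoop/conf")
    if os.path.isfile(os.path.join(conf_dir, "yarn-site.xml")):
        for name, value in sorted(resource_manager_properties(conf_dir).items()):
            print(f"{name} = {value}")
    else:
        print(f"no yarn-site.xml found in {conf_dir}")
```

With HA enabled I would expect entries such as yarn.resourcemanager.ha.rm-ids
and per-id yarn.resourcemanager.address.<rm-id> to show up here. If they do,
the file itself is fine and the question is why spark-shell is not reading it.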

Any ideas? Thanks.
-Simon
