I want to make Zeppelin work with a remote YARN cluster.

When I run any Spark paragraph, the Spark job is submitted to the YARN cluster
and accepted, but it never transitions to the RUNNING state.

Please refer to my configuration below.

1. zeppelin-env.sh

export MASTER=yarn-client
export SPARK_YARN_JAR=/home/style95/lib/spark-assembly-1.2.1-hadoop2.4.0.jar
export HADOOP_CONF_DIR=/home/style95/hadoop-conf
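
To isolate Zeppelin from the Spark-on-YARN setup, the same submission can be
tried outside Zeppelin with spark-shell under the identical environment
(SPARK_HOME below is an assumed path to the Spark 1.2.1 installation, not one
taken from the configuration above):

```shell
# Assumed paths -- adjust to the actual layout.
export HADOOP_CONF_DIR=/home/style95/hadoop-conf
export SPARK_HOME=/home/style95/spark-1.2.1

# If this also sticks in ACCEPTED, the problem is in the
# Spark/YARN configuration rather than in Zeppelin itself.
$SPARK_HOME/bin/spark-shell --master yarn-client
```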


2. hadoop-conf/core-site.xml

<configuration>
    <property>
        <name>fs.default.name</name>
        <value>hdfs://10.251.53.42:8020</value>
        <final>true</final>
    </property>
</configuration>
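
As a side note, fs.default.name is deprecated in Hadoop 2.x in favor of
fs.defaultFS. The old key still works, but the equivalent modern entry (same
address as above) would be:

```xml
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://10.251.53.42:8020</value>
</property>
```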


3. hadoop-conf/yarn-site.xml

<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>

    <property>
        <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>10.251.53.42</value>
    </property>
</configuration>
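
Note that the yarn.nodemanager.aux-services entries only take effect on the
NodeManagers themselves; for a client-side HADOOP_CONF_DIR, what matters is the
ResourceManager address. On Hadoop 2.x, yarn.resourcemanager.hostname should be
enough, since the client derives yarn.resourcemanager.address as hostname:8032
by default. If the ResourceManager listens on a non-default port, it would have
to be set explicitly, e.g.:

```xml
<property>
    <name>yarn.resourcemanager.address</name>
    <value>10.251.53.42:8032</value>
</property>
```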


Those are the only two files in the hadoop-conf directory.

Am I missing anything?
Do I have to declare any other environment variables?

No permissions are configured on the YARN cluster.
The YARN cluster version is hadoop-2.6.0.

I have already tried building with various Hadoop and Spark profiles, without
success.
Tried versions:
 - Spark 1.2 with Hadoop 2.4
 - Spark 1.3 with Hadoop 2.4
 - Spark 1.3 with Hadoop 2.6


Since my PC is behind NAT, it is not reachable from the YARN cluster.
Could that be a factor?
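
(In yarn-client mode the driver runs on the local machine and the
ApplicationMaster and executors must connect back to it, so an unreachable
driver can leave an application stuck. The cluster side usually records the
reason; it can be inspected with standard YARN commands, where <appId> below is
a placeholder for the id reported by the first command:)

```shell
# List applications and their current states (ACCEPTED / RUNNING / ...).
yarn application -list

# Fetch the ApplicationMaster logs for a stuck application.
yarn logs -applicationId <appId>
```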

Please help me get out of this.

Thanks
Regards
Dongkyoung.
