Sandy, thank you so much — that was indeed my omission!
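
(For anyone finding this in the archives: the missing piece was setting the worker count before launching the shell, i.e. something like

    export SPARK_WORKER_INSTANCES=100

next to the other exports in the script quoted below; the exact number will of course depend on your cluster.)
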
Eric

On May 19, 2014, at 10:14 AM, Sandy Ryza <sandy.r...@cloudera.com> wrote:

> Hi Eric,
> 
> Have you tried setting the SPARK_WORKER_INSTANCES env variable before running 
> spark-shell?
> http://spark.apache.org/docs/0.9.0/running-on-yarn.html
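> 
> Something along these lines before launching the shell should do it (the count is just an example):
> 
>     export SPARK_WORKER_INSTANCES=100
>     export MASTER=yarn-client
>     $SPARK_HOME/bin/spark-shell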
> 
> -Sandy
> 
> 
> On Mon, May 19, 2014 at 8:08 AM, Eric Friedman <e...@spottedsnake.net> wrote:
> Hi
> 
> I am working with a Cloudera 5 cluster with 192 nodes and can’t work out how 
> to get the Spark REPL to use more than 2 nodes in an interactive session.
> 
> So, this works, but is non-interactive (using yarn-client as MASTER)
> 
> /opt/cloudera/parcels/CDH-5.0.0-1.cdh5.0.0.p0.47/lib/spark/bin/spark-class \
>   org.apache.spark.deploy.yarn.Client \
>   --jar /opt/cloudera/parcels/CDH-5.0.0-1.cdh5.0.0.p0.47/lib/spark/examples/lib/spark-examples_2.10-0.9.0-cdh5.0.0.jar \
>   --class org.apache.spark.examples.SparkPi \
>   --args yarn-standalone \
>   --args 10 \
>   --num-workers 100
> 
> There does not appear to be an (obvious?) way to get more than 2 nodes 
> involved from the repl.
> 
> I am running the REPL like this:
> 
> #!/bin/sh
> 
> . /etc/spark/conf.cloudera.spark/spark-env.sh
> 
> export SPARK_JAR=hdfs://nameservice1/user/spark/share/lib/spark-assembly.jar
> export SPARK_WORKER_MEMORY=512m
> export MASTER=yarn-client
> 
> exec $SPARK_HOME/bin/spark-shell
> 
> Now if I comment out the line with `export SPARK_JAR=…’ and run this again, I 
> get an error like this:
> 
> 14/05/19 08:03:41 ERROR Client: Error: You must set SPARK_JAR environment variable!
> Usage: org.apache.spark.deploy.yarn.Client [options]
> Options:
>   --jar JAR_PATH             Path to your application's JAR file (required in yarn-cluster mode)
>   --class CLASS_NAME         Name of your application's main class (required)
>   --args ARGS                Arguments to be passed to your application's main class.
>                              Mutliple invocations are possible, each will be passed in order.
>   --num-workers NUM          Number of workers to start (Default: 2)
>   […]
> 
> But none of those options are exposed at the `spark-shell’ level.
> 
> Thanks in advance for your guidance.
> 
> Eric
> 
