The idea behind YARN is that you can run different application types like MapReduce, Storm and Spark.
I would recommend that you build your Spark job in the main method without hard-coding how it is deployed. Then use spark-submit to tell Spark how you want to deploy it, with yarn-cluster as the master.

The key point here is that once you have YARN set up, the Spark client connects to it using the configuration in $HADOOP_CONF_DIR, which contains the resource manager address. In particular, that directory needs to be on the classpath of the submitter, since the client implicitly reads it when it instantiates a YarnConfiguration instance. If you want more details, read org.apache.spark.deploy.yarn.Client.scala.

You should be able to download a standalone YARN cluster from any of the Hadoop providers like Cloudera or Hortonworks. Once you have that, the Spark programming guide describes what I mention above in sufficient detail for you to proceed.

Thanks,
Ron

> On Jul 9, 2014, at 8:31 AM, John Omernik <j...@omernik.com> wrote:
>
> I am trying to get my head around using Spark on Yarn from a perspective of a
> cluster. I can start a Spark Shell no issues in Yarn. Works easily. This is
> done in yarn-client mode and it all works well.
>
> In multiple examples, I see instances where people have set up Spark Clusters
> in Stand Alone mode, and then in the examples they "connect" to this cluster
> in Stand Alone mode. This is done often times using the spark:// string for
> connection. Cool.
>
> But what I don't understand is how do I set up a Yarn instance that I can
> "connect" to? I.e. I tried running Spark Shell in yarn-cluster mode and it
> gave me an error, telling me to use yarn-client. I see information on using
> spark-class or spark-submit. But what I'd really like is an instance I can
> connect a spark-shell to, and have the instance stay up. I'd like to be able
> to run other things on that instance etc. Is that possible with Yarn? I know
> there may be long running job challenges with Yarn, but I am just testing, I
> am just curious if I am looking at something completely bonkers here, or just
> missing something simple.
>
> Thanks!
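
For illustration only, here is a minimal Python sketch of the two steps in the reply above: checking which resource manager address the configuration in $HADOOP_CONF_DIR points at, and assembling a spark-submit command that uses yarn-cluster as the master. It is not Spark's own lookup logic (that lives in Client.scala and YarnConfiguration). The helper names, the com.example.MyJob class, and my-job.jar are hypothetical placeholders. It also assumes the address is set explicitly as yarn.resourcemanager.address in yarn-site.xml, which may not hold on clusters that only set the resource manager hostname.

```python
import os
import xml.etree.ElementTree as ET


def read_hadoop_property(conf_dir, filename, key):
    """Return the value of `key` from a Hadoop-style XML config file, or None.

    Hypothetical helper: it only inspects one file and does not apply
    Hadoop's defaults or variable substitution.
    """
    path = os.path.join(conf_dir, filename)
    if not os.path.isfile(path):
        return None
    root = ET.parse(path).getroot()
    for prop in root.findall("property"):
        name = (prop.findtext("name") or "").strip()
        if name == key:
            return (prop.findtext("value") or "").strip()
    return None


def build_submit_command(main_class, app_jar, app_args=()):
    """Assemble a spark-submit invocation that targets YARN in cluster mode.

    The job itself does not name a master; the deployment choice is made
    here, on the command line.
    """
    return [
        "spark-submit",
        "--class", main_class,
        "--master", "yarn-cluster",
        app_jar,
        *app_args,
    ]


if __name__ == "__main__":
    # HADOOP_CONF_DIR must be set in the environment of whoever submits.
    conf_dir = os.environ.get("HADOOP_CONF_DIR", "")
    address = read_hadoop_property(
        conf_dir, "yarn-site.xml", "yarn.resourcemanager.address"
    )
    print("ResourceManager address from HADOOP_CONF_DIR:", address)
    # Placeholder class and jar names, for illustration only.
    print(" ".join(build_submit_command("com.example.MyJob", "my-job.jar")))
```

If the first line prints None, the submitter probably cannot see the YARN configuration, and that is worth fixing before looking at anything on the Spark side.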