Mayur Rustagi
Ph: +919632149971
http://www.sigmoidanalytics.com
https://twitter.com/mayur_rustagi



On Mon, Feb 24, 2014 at 10:22 PM, polkosity <polkos...@gmail.com> wrote:

> Is there any difference in the performance of Spark standalone mode and
> YARN when it comes to initializing a new Spark job?
>
Yes. YARN is a much more complex cluster manager than the one provided by
Spark standalone.

>
> In my application, response time is absolutely critical, and I'm hoping to
> have the executors working within a few seconds of submitting the job.
>
> Both options ran quickly for me (running the SparkPi example) in a single
> node cluster, only a couple of seconds until executors began work.  On my
> 10 node cluster it takes YARN over 10 seconds before the executors actually
> begin work.  Could I expect Spark standalone to get going any quicker?  If
> so I will take the time to configure it on 10 node cluster.
>
Yes, Spark standalone is much faster and is preferable if you are not
running any other applications (like Hive, HBase, etc.) on the cluster. I
get very responsive 2-3 second response times in standalone mode with 10
machines.
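
If you want to compare the two setups on your own cluster before committing
to one, a rough sketch like the following could help. It only times how long
a launch command takes end to end, so it is a proxy for startup overhead
rather than a measurement of when executors begin work, and that proxy is
only reasonable for a tiny job such as SparkPi. The helper name
`time_command` and the default placeholder command are my own assumptions;
substitute whatever command you already use to launch the job against each
cluster manager.

```python
import subprocess
import sys
import time


def time_command(cmd, runs=3):
    """Run `cmd` `runs` times and return the wall-clock seconds of each run."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises CalledProcessError if the command exits non-zero.
        subprocess.run(
            cmd,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        timings.append(time.perf_counter() - start)
    return timings


if __name__ == "__main__":
    # Placeholder: pass your own job launch command as arguments.
    # With no arguments this just times a no-op Python process.
    cmd = sys.argv[1:] or [sys.executable, "-c", "pass"]
    timings = time_command(cmd)
    print("min %.2fs  max %.2fs" % (min(timings), max(timings)))
```

Run it once with the command that targets YARN and once with the one that
targets the standalone master, and compare the minimums; the first run is
often slower because of cold caches.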

>
> Why does the example run so much quicker on my local single node cluster
> than on my 10 EC2 m1.larges?


> Aside from YARN being able to schedule Spark, MRv2 and other job types, are
> there any major differences between Spark standalone and YARN?
>
YARN has much more granular control over the cluster resources. You can
also look into Mesos for cluster management, which is much faster than YARN
for now.

>
> Thanks.
> - Dan
>
>
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Job-initialization-performance-of-Spark-standalone-mode-vs-YARN-tp2016.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
