Mayur Rustagi
Ph: +919632149971
http://www.sigmoidanalytics.com
https://twitter.com/mayur_rustagi
On Mon, Feb 24, 2014 at 10:22 PM, polkosity <polkos...@gmail.com> wrote:

> Is there any difference in the performance of Spark standalone mode and YARN
> when it comes to initializing a new Spark job?

Yes. YARN is a much more complex cluster manager than the one provided by Spark standalone.

> In my application, response time is absolutely critical, and I'm hoping to
> have the executors working within a few seconds of submitting the job.
>
> Both options ran quickly for me (running the SparkPi example) in a single
> node cluster, only a couple of seconds until executors began work. On my 10
> node cluster it takes YARN over 10 seconds before the executors actually
> begin work. Could I expect Spark standalone to get going any quicker? If
> so I will take the time to configure it on 10 node cluster.

Yes, Spark standalone is much faster and can be preferred if you are not running any other applications (like Hive, HBase, etc.) on the cluster. I get a very responsive 2-3 second response time in standalone mode with 10 machines.

> Why does the example run so much quicker on my local single node cluster
> than on my 10 EC2 m1.larges?
> Aside from YARN being able to schedule Spark, MRv2 and other job types, are
> there any major differences between Spark standalone and YARN?

YARN has much more granular control over the cluster resources. You can also look into Mesos for cluster management, which for now will be much faster than YARN.

> Thanks.
> - Dan
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Job-initialization-performance-of-Spark-standalone-mode-vs-YARN-tp2016.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
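
If you want to compare the startup latency of the two cluster managers on your own 10 node cluster, a rough probe along these lines might help. It is only a sketch, not a benchmark: the helper names (`time_to_first_result`, `spark_context_factory`, `trivial_job`) and the app name are made up for illustration, and the master URL you pass in is an assumption you would adjust for your setup. It times context creation and the first tiny job separately, since the delay before executors begin work can show up in either.

```python
import time


def time_to_first_result(make_context, run_trivial_job):
    """Time context creation and the first job separately, in seconds."""
    start = time.perf_counter()
    ctx = make_context()          # connects to the cluster manager
    created = time.perf_counter()
    run_trivial_job(ctx)          # forces at least one task to actually run
    done = time.perf_counter()
    return {
        "context_startup_s": created - start,
        "first_job_s": done - created,
        "total_s": done - start,
    }


def spark_context_factory(master_url, app_name="startup-latency-probe"):
    """Return a zero-argument callable that builds a SparkContext (needs pyspark)."""
    def make():
        # Imported lazily so the timing helper itself works without pyspark.
        from pyspark import SparkContext
        return SparkContext(master=master_url, appName=app_name)
    return make


def trivial_job(sc):
    # A tiny job split into 10 partitions, so it produces 10 tasks.
    return sc.parallelize(range(100), 10).count()
```

For standalone mode you would call something like `time_to_first_result(spark_context_factory("spark://master-host:7077"), trivial_job)`, where `master-host` is a placeholder for your own master. For YARN the master string depends on the Spark release (`yarn-client` in older releases, `yarn` in later ones), so check the documentation for the version you run.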