Prashant,

In another email thread several weeks ago, it was mentioned that YARN support is considered beta until Spark 1.0. Is that not the case?

-Suren

On Tue, Apr 15, 2014 at 8:38 AM, Prashant Sharma <scrapco...@gmail.com> wrote:

> Hi Ishaaq,
>
> Answers inline from what I know; I'd like to be corrected, though.
>
> On Tue, Apr 15, 2014 at 5:58 PM, ishaaq <ish...@gmail.com> wrote:
>
>> Hi all,
>> I am evaluating Spark to use here at my work.
>>
>> We have an existing Hadoop 1.x install which I am planning to upgrade to
>> Hadoop 2.3.
>
> This is not really a requirement for Spark. If you are doing it for some
> other reason, great!
>
>> I am trying to work out whether I should install YARN or simply just set
>> up a Spark standalone cluster. We already use ZooKeeper, so it isn't a
>> problem to set up HA. I am puzzled, however, as to how the Spark nodes can
>> coordinate on data locality, i.e., assuming I install the nodes on the same
>> machines as the DFS data nodes, I don't understand how Spark can work out
>> which nodes should get which splits of the jobs?
>
> This happens exactly the same way Hadoop's MapReduce figures out data
> locality, since we support Hadoop's InputFormats (which also carry the
> information on how the data is partitioned). So having Spark workers share
> the same nodes as your DFS is a good idea.
>
>> Anyway, my bigger question remains: YARN or standalone? Which is the more
>> stable option currently? Which is the more future-proof option?
>
> Well, I think standalone is stable enough for all purposes, and Spark's YARN
> support has been keeping up with the latest Hadoop versions too. It depends:
> if you are already using YARN and don't want the hassle of setting up
> another cluster manager, you would probably prefer YARN.
>
>> Thanks,
>> Ishaaq
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/standalone-vs-YARN-tp4271.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.

-- 
SUREN HIRAMAN, VP TECHNOLOGY
Velos
Accelerating Machine Learning

440 NINTH AVENUE, 11TH FLOOR
NEW YORK, NY 10001
O: (917) 525-2466 ext. 105
F: 646.349.4063
E: suren.hiraman@velos.io <suren.hira...@sociocast.com>
W: www.velos.io
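
P.S. On the data locality question in the quoted thread: the idea is that each input split reports the hosts that store its blocks (in Hadoop this comes from InputSplit.getLocations()), and the scheduler prefers a worker running on one of those hosts. Below is a toy sketch of that idea in plain Python. It is not Spark's actual scheduler; the function name assign_splits, the host names, and the least-loaded tie-break are all assumptions made up for illustration.

```python
from typing import Dict, List


def assign_splits(split_hosts: Dict[str, List[str]],
                  workers: List[str]) -> Dict[str, str]:
    """Toy locality-aware assignment (hypothetical, not Spark's scheduler).

    split_hosts maps a split name to the hosts that store its data.
    workers lists the hosts that run a worker process.
    """
    load = {w: 0 for w in workers}  # tasks assigned to each worker so far
    assignment = {}
    for split, hosts in split_hosts.items():
        # Prefer workers that sit on a host holding the split's data.
        local = [w for w in workers if w in hosts]
        # Fall back to any worker when no data-local worker exists.
        candidates = local if local else workers
        # Assumed tie-break: pick the least-loaded candidate.
        chosen = min(candidates, key=lambda w: load[w])
        assignment[split] = chosen
        load[chosen] += 1
    return assignment


# Made-up example: three splits, three workers co-located with some data nodes.
split_hosts = {
    "part-0": ["node1", "node2"],
    "part-1": ["node2", "node3"],
    "part-2": ["node4"],  # no worker runs on node4, so this read is remote
}
workers = ["node1", "node2", "node3"]
assignment = assign_splits(split_hosts, workers)
```

With workers on the same machines as the DFS data nodes, most splits find a local worker; only splits whose hosts run no worker (part-2 here) are read over the network.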