1. If we add more executors to a cluster where data is already cached (the RDDs already live on the existing executors), will the job run tasks on the new executors even though the cached RDD partitions are not present there? If yes, what is the performance like on those new executors?
2. What is the replication factor for Spark's in-memory storage (for Hadoop the default is 3), and can we change it for Spark as well?

On Tue, Apr 15, 2014 at 9:53 PM, Manoj Samel <manojsamelt...@gmail.com> wrote:

> Thanks Aaron, this is useful!
>
> - Manoj
>
>
> On Mon, Apr 14, 2014 at 8:12 PM, Aaron Davidson <ilike...@gmail.com> wrote:
>
>> Launching drivers inside the cluster was a feature added in 0.9, for
>> standalone cluster mode:
>> http://spark.apache.org/docs/latest/spark-standalone.html#launching-applications-inside-the-cluster
>>
>> Note the "supervise" flag, which will cause the driver to be restarted if
>> it fails. This is a rather low-level mechanism which by default will just
>> cause the whole job to rerun from the beginning. Special recovery would
>> have to be implemented by hand, via some sort of state checkpointing into a
>> globally visible storage system (e.g., HDFS), which, for example, Spark
>> Streaming already does.
>>
>> Currently, this feature is not supported in YARN or Mesos fine-grained
>> mode.
>>
>>
>> On Mon, Apr 14, 2014 at 2:08 PM, Manoj Samel <manojsamelt...@gmail.com> wrote:
>>
>>> Could you please elaborate how drivers can be restarted automatically?
>>>
>>> Thanks,
>>>
>>>
>>> On Mon, Apr 14, 2014 at 10:30 AM, Aaron Davidson <ilike...@gmail.com> wrote:
>>>
>>>> Master and slave are somewhat overloaded terms in the Spark ecosystem
>>>> (see the glossary:
>>>> http://spark.apache.org/docs/latest/cluster-overview.html#glossary).
>>>> Are you actually asking about the Spark "driver" and "executors", or the
>>>> standalone cluster "master" and "workers"?
>>>>
>>>> To briefly answer for either possibility:
>>>> (1) Drivers are not fault tolerant but can be restarted automatically.
>>>> Executors may be removed at any point without failing the job (though
>>>> losing an Executor may slow the job significantly), and Executors may be
>>>> added at any point and will be immediately used.
>>>> (2) Standalone cluster Masters are fault tolerant; a failure will only
>>>> temporarily stall new jobs from starting or getting new resources, but
>>>> does not affect currently running jobs. Workers can fail and will simply
>>>> cause jobs to lose their current Executors. New Workers can be added at
>>>> any point.
>>>>
>>>>
>>>> On Mon, Apr 14, 2014 at 11:00 AM, Ian Ferreira <ianferre...@hotmail.com> wrote:
>>>>
>>>>> Folks,
>>>>>
>>>>> I was wondering what the failure support modes were for Spark while
>>>>> running jobs:
>>>>>
>>>>> 1. What happens when a master fails?
>>>>> 2. What happens when a slave fails?
>>>>> 3. Can you add and remove slaves mid-job?
>>>>>
>>>>> Regarding the install on Mesos: if I understand correctly, the Spark
>>>>> master is behind a ZooKeeper quorum, so that isolates the slaves from a
>>>>> master failure, but what about the masters behind the quorum?
>>>>>
>>>>> Cheers
>>>>> - Ian
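On question 2 above: unlike HDFS, Spark does not replicate cached blocks by default (each cached partition has a single in-memory copy), but replication can be requested per-RDD through the `StorageLevel` passed to `persist()`. A minimal sketch, assuming an already-created `SparkContext` named `sc` and a reasonably recent Spark release:

```scala
import org.apache.spark.storage.StorageLevel

// cache() is shorthand for persist(StorageLevel.MEMORY_ONLY):
// one deserialized in-memory copy per partition, replication factor 1.
val rdd = sc.parallelize(1 to 1000)
rdd.cache()

// The "_2" storage levels (MEMORY_ONLY_2, MEMORY_AND_DISK_2, ...) keep
// two copies of each block on different executors, so losing a single
// executor does not force those partitions to be recomputed from lineage.
val replicated = sc.parallelize(1 to 1000)
replicated.persist(StorageLevel.MEMORY_ONLY_2)
```

Note that even without replication, cached data is not lost permanently when an executor dies: Spark recomputes the missing partitions from the RDD lineage, which costs time rather than correctness. This also relates to question 1: tasks scheduled on newly added executors will recompute (or fetch) the partitions they need, so the first pass on a new executor is slower than on one that already holds the cache.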