Re: Single point of failure with Driver host crashing

2016-08-12 Thread Jacek Laskowski
Hi, I'm fairly sure that cluster deploy mode would solve it. It would then be up to the cluster to re-execute the driver, wouldn't it? Regards, Jacek Laskowski https://medium.com/@jaceklaskowski/ Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark Follow me at https://twitter.com/jacekla
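A minimal sketch of the submission Jacek is describing, assuming a standalone master (the host name, main class, and jar path are placeholders):

    # Cluster deploy mode: the driver runs on one of the cluster's workers,
    # not on the submitting host. --supervise asks the standalone master to
    # restart the driver if it exits with a non-zero status.
    spark-submit \
      --master spark://master-host:7077 \
      --deploy-mode cluster \
      --supervise \
      --class com.example.MyApp \
      /path/to/myapp.jar

With --supervise, the standalone master rather than the submitting host becomes responsible for keeping the driver alive, which is the hand-off Jacek is pointing at.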

Re: Single point of failure with Driver host crashing

2016-08-11 Thread Mich Talebzadeh
Thanks Ted. In this case we were using Standalone mode, with the Standalone master started on another node. The app was started on a node other than the master node, and the master node was not affected. The node in question was the edge node (the one running spark-submit). From the link I was not sure this matter would have been addressed.
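One quick check on the setup Mich describes, assuming the job was submitted in client deploy mode (the default): in that mode the driver lives inside the SparkSubmit JVM on the edge node itself, so losing that host kills the application even though the master and workers are fine.

    # On the edge node: in client mode the driver runs inside the
    # SparkSubmit JVM, so this process's host is the single point of failure.
    jps -l | grep -i SparkSubmit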

Re: Single point of failure with Driver host crashing

2016-08-11 Thread Ted Yu
Have you read https://spark.apache.org/docs/latest/spark-standalone.html#high-availability ? FYI
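The core of the linked section, as a sketch (the ZooKeeper quorum address and znode directory are placeholders): standby masters coordinate leader election through ZooKeeper, and each master is started with recovery options in its daemon JVM flags.

    # Set on every master host before starting it (e.g. in spark-env.sh):
    export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER \
      -Dspark.deploy.zookeeper.url=zk1:2181,zk2:2181,zk3:2181 \
      -Dspark.deploy.zookeeper.dir=/spark"

Note that this covers failover of the master; it does not by itself restart a driver whose host has died, which is the distinction Mich raises in his reply.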

Single point of failure with Driver host crashing

2016-08-11 Thread Mich Talebzadeh
Hi, Although Spark is fault tolerant when nodes go down, like below:

    FROM tmp
    [Stage 1:===>                                      (20 + 10) / 100]
    16/08/11 20:21:34 ERROR TaskSchedulerImpl: Lost executor 3 on xx.xxx.197.216: worker lost
    [Stage 1:>
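For context, a minimal sketch of the kind of submission that exposes this single point of failure (host, class, and jar path are placeholders): the lost executor above is recoverable because the scheduler re-runs its tasks elsewhere, but the driver has no such fallback.

    # Client mode (the default): executors can be lost and their tasks
    # re-run, but the driver itself runs on the submitting edge node,
    # so a crash of that host takes the whole application down.
    spark-submit \
      --master spark://master-host:7077 \
      --deploy-mode client \
      --class com.example.MyApp \
      /path/to/myapp.jar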