Re: Spark Streaming in Production

2014-12-12 Thread Tathagata Das
Yes, your understanding of the separation between the cluster manager and the application driver (which creates the SparkContext) is correct. There are HA solutions for both. Let me explain, assuming the cluster manager is Spark Standalone. The master of the Spark standalone cluster manager can be made HA

Re: Spark Streaming in Production

2014-12-12 Thread francois . garillot
IIUC, Receivers run on workers, colocated with other tasks. The Driver, on the other hand, can run either on the machine that submits the application (client mode) or on a worker (cluster mode). - FG

Re: Spark Streaming in Production

2014-12-12 Thread twizansk
Thanks for the reply. I might be misunderstanding something basic. As far as I can tell, the cluster manager (e.g. Mesos) manages the master and worker nodes but not the drivers or receivers; those are external to the Spark cluster: http://spark.apache.org/docs/latest/cluster-overview.html I

Re: Spark Streaming in Production

2014-12-12 Thread rahulkumar-aws
Run the Spark cluster managed by Apache Mesos. Mesos can run in high-availability mode, in which multiple Mesos masters run simultaneously. - Software Developer SigmoidAnalytics, Bangalore

Re: Spark Streaming in Production

2014-12-11 Thread Tathagata Das
Spark Streaming takes care of restarting receivers if they fail. Regarding the fault-tolerance properties and deployment options, we made some improvements in the upcoming Spark 1.2. Here is a staged version of the Spark Streaming programming guide that you can read for the up-to-date explanation of