Yes, your understanding of the separation between the cluster manager
and the application driver (which creates the SparkContext) is correct.
There are HA solutions for both. Let me explain assuming the cluster
manager is Spark Standalone: the master of a Spark Standalone cluster
can be made HA with ZooKeeper-based standby masters.
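For reference, here is a minimal sketch of that standby-master setup;
the ZooKeeper hosts and directory below are placeholder values you
would replace with your own:

    # spark-env.sh on each master node
    export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER \
      -Dspark.deploy.zookeeper.url=zk1:2181,zk2:2181,zk3:2181 \
      -Dspark.deploy.zookeeper.dir=/spark"

If the active master dies, a standby registered with the same ZooKeeper
ensemble takes over and recovers the registered workers and applications.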
IIUC, receivers run on workers, colocated with other tasks. The driver,
on the other hand, can either run on the submitting machine (client
mode) or on a worker node (cluster mode).
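To illustrate the two modes, a hedged sketch of the corresponding
spark-submit invocations (the master URL and app.jar are placeholders):

    # client mode: driver runs on the submitting machine
    spark-submit --master spark://master:7077 --deploy-mode client app.jar

    # cluster mode: driver runs on a worker;
    # --supervise restarts it automatically if it fails (standalone only)
    spark-submit --master spark://master:7077 --deploy-mode cluster \
      --supervise app.jar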
—
FG
On Fri, Dec 12, 2014 at 4:49 PM, twizansk wrote:
> Thanks for the reply. I might be misunderstanding something basic. As
> far as I can tell, the cluster manager (e.g. Mesos) manages the master
> and worker nodes but not the drivers or receivers; those are external
> to the Spark cluster:
> http://spark.apache.org/docs/latest/cluster-overview.html
Run a Spark cluster managed by Apache Mesos. Mesos can run in
high-availability mode, in which multiple Mesos masters run
simultaneously.
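As a rough sketch (host names and quorum size are made-up values), an
HA Mesos setup registers each master with a ZooKeeper quorum:

    # run on each of the three Mesos master hosts
    mesos-master --zk=zk://zk1:2181,zk2:2181,zk3:2181/mesos \
      --quorum=2 --work_dir=/var/lib/mesos

Spark then points at the quorum rather than a single master, e.g.
--master mesos://zk://zk1:2181,zk2:2181,zk3:2181/mesos.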
-
Software Developer
SigmoidAnalytics, Bangalore
Spark Streaming takes care of restarting receivers if they fail.
Regarding the fault-tolerance properties and deployment options, we
made some improvements in the upcoming Spark 1.2. There is a staged
version of the Spark Streaming programming guide that you can read for
the up-to-date explanation of these fault-tolerance semantics.
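As a hedged illustration of the driver-recovery pattern those docs
describe, here is a minimal Scala sketch using checkpointing; the
checkpoint directory, app name, and batch interval are placeholder
values:

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val checkpointDir = "hdfs:///tmp/streaming-checkpoint"  // placeholder path

    // Builds a fresh context on first run; on driver restart,
    // getOrCreate reconstructs the context from the checkpoint instead.
    def createContext(): StreamingContext = {
      val conf = new SparkConf().setAppName("ResilientStreamingApp")
      val ssc = new StreamingContext(conf, Seconds(10))
      ssc.checkpoint(checkpointDir)
      // ... define input DStreams and transformations here ...
      ssc
    }

    val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)
    ssc.start()
    ssc.awaitTermination()

Combined with --supervise (standalone) or an HA cluster manager, this
lets a restarted driver pick up from where the failed one left off.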