On 14 Jul 2015, at 23:26, Tathagata Das <t...@databricks.com> wrote:

> Just to be clear, you mean the Spark Standalone cluster manager's "master" 
> and not the application's "driver", right?

Sorry, I understand by now that the driver app does not necessarily run on the 
master node, and that my not making that distinction made the question kind of 
hard to answer :-)
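
To illustrate the distinction, here is a minimal sketch (Python, standard 
library only) that assembles a spark-submit command for a standalone cluster. 
The helper function, host names and jar name are made up for illustration; the 
only real names in it are the --master and --deploy-mode options and the 
connector's spark.cassandra.connection.host property. With client deploy mode, 
the driver runs on whatever machine launches the command, which need not be the 
master.

```python
def spark_submit_args(master_host, app_jar, cassandra_host, master_port=7077):
    """Build a spark-submit argument list for a standalone cluster.

    All argument values are hypothetical placeholders supplied by the caller.
    """
    return [
        "spark-submit",
        # The standalone master only coordinates; it is addressed by URL.
        "--master", f"spark://{master_host}:{master_port}",
        # Client mode: the driver runs on the machine issuing this command.
        "--deploy-mode", "client",
        # Connector setting telling the app which C* node to contact first.
        "--conf", f"spark.cassandra.connection.host={cassandra_host}",
        app_jar,
    ]


# Example with made-up host and jar names.
args = spark_submit_args("master.example", "streaming-app.jar", "127.0.0.1")
print(" ".join(args))
```

The master host appears only in the URL, so the command can be launched from 
any node that can reach it.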

So far I have understood that for a Spark Streaming app that uses the Cassandra 
connector (and also needs checkpointing):

slaves: need Spark, C*, the connector and access to a distributed file system 
for the checkpointing
master: needs Spark (configured as master) but none of the rest
the node where the driver runs: needs Spark, C*, the connector and access to a 
distributed file system for the checkpointing

Correct?
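
To make the checklist above easier to verify node by node, it can be written 
down as a small lookup table. This is only a sketch of my current 
understanding as stated above, not something confirmed by the Spark or 
connector documentation; the component names and the helper function are 
labels I made up for illustration.

```python
# Components each node role needs, exactly as listed in this message.
# These sets encode an assumption that is still being asked about.
REQUIREMENTS = {
    "master": {"spark"},
    "worker": {"spark", "cassandra", "connector", "shared_fs"},  # the slaves
    "driver": {"spark", "cassandra", "connector", "shared_fs"},
}


def missing_components(role, installed):
    """Return, sorted by name, what a node of the given role still lacks."""
    return sorted(REQUIREMENTS[role] - set(installed))


# A master with only Spark installed lacks nothing under this checklist.
print(missing_components("master", ["spark"]))
# A worker with only Spark and the connector still lacks two components.
print(missing_components("worker", ["spark", "connector"]))
```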

(And thanks to everyone for the replies)


Jan



> In that case, the earlier responses are correct. 
> 
> TD
> 
> On Tue, Jul 14, 2015 at 11:26 AM, Mohammed Guller <moham...@glassbeam.com> 
> wrote:
> The master node does not have to be similar to the worker nodes. It can be a 
> smaller machine.
> 
> In the case of C*, again, you don't need to have C* on the master node. You 
> need C* and the Spark workers co-located. The master can be on one of the C* 
> nodes or on a non-C* node.
> 
> Mohammed
> 
> 
> -----Original Message-----
> From: algermissen1971 [mailto:algermissen1...@icloud.com]
> Sent: Sunday, July 12, 2015 12:35 PM
> To: Spark User
> Subject: Master vs. Slave Nodes Clarification
> 
> Hi,
> 
> I have a question that I am really having trouble figuring out for myself:
> 
> Does the master node in a Spark cluster need to be a node similar to the 
> slave nodes, or should I rather view it as a coordinating node that does not 
> need much computing or storage power?
> 
> For example, when using Spark Streaming and checkpointing, would the master 
> node need access to the shared file system (e.g. HDFS)? Or do I only need to 
> mount that on the slaves?
> (Likewise, if I use the Cassandra connector, does that (and C*) need to be 
> installed on the master node, too?)
> 
> Or, in other words: is the master just one of several similar cluster nodes, 
> or is it merely a 'small control node', for which pretty much any small VM 
> would do?
> 
> Jan
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional 
> commands, e-mail: user-h...@spark.apache.org

