On 14 Jul 2015, at 23:26, Tathagata Das <t...@databricks.com> wrote:
> Just to be clear, you mean the Spark Standalone cluster manager's "master"
> and not the application's "driver", right?

Sorry, by now I have understood that I would not necessarily put the driver app on the master node, and that not making that distinction made my question kind of hard to answer :-)

So far I have understood that for a Spark Streaming app that uses the Cassandra connector (and also needs checkpointing):

slaves: need Spark, C*, the connector and access to a distributed file system for the checkpointing
master: needs Spark (configured as master) but none of the rest
the node where the driver runs: needs Spark, C*, the connector and access to a distributed file system for the checkpointing

Correct?

(And thanks to everyone for the replies)

Jan

> In that case, the earlier responses are correct.
>
> TD
>
> On Tue, Jul 14, 2015 at 11:26 AM, Mohammed Guller <moham...@glassbeam.com>
> wrote:
> The master node does not have to be similar to the worker nodes. It can be a
> smaller machine.
>
> In case of C*, again you don't need to have C* on the master node. You need
> C* and Spark workers co-located. Master can be on one of the C* nodes or a
> non-C* node.
>
> Mohammed
>
>
> -----Original Message-----
> From: algermissen1971 [mailto:algermissen1...@icloud.com]
> Sent: Sunday, July 12, 2015 12:35 PM
> To: Spark User
> Subject: Master vs. Slave Nodes Clarification
>
> Hi,
>
> I have a question that I really have problems with figuring out for myself:
>
> Does the master node in a Spark cluster need to be a node similar to the
> slave nodes or should I rather view it as a coordinating node, that does not
> need much computing or storage power?
>
> For example, when using Spark Streaming and Checkpointing, would the master
> node need access to the shared file system (e.g. HDFS)? Or do I only need to
> mount that on the slaves?
> (likewise, if I use the Cassandra-Connector, does that (and C*) need to be
> installed on the master node, too?)
>
> Or, in other words: is the master just one node of similar cluster nodes, or
> is it merely a 'small control node', for which sort of any small VM would do?
>
> Jan

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
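
The per-role requirements that Jan lists in his latest message can be written down as a small lookup table. This is only a sketch of that summary as posted: the message ends with "Correct?" and the list is not confirmed anywhere in this thread. The role names, component labels and helper functions below are made up for illustration and are not part of Spark or the Cassandra connector.

```python
# Hypothetical summary of the per-role requirements listed in Jan's message.
# Role and component names are illustrative labels, not Spark settings.
REQUIREMENTS = {
    "worker": {"spark", "cassandra", "connector", "shared_fs"},
    "master": {"spark"},  # Spark configured as master, none of the rest
    "driver_node": {"spark", "cassandra", "connector", "shared_fs"},
}


def roles_needing(component):
    """Return the sorted list of roles that need the given component."""
    return sorted(role for role, needs in REQUIREMENTS.items() if component in needs)


def missing(role, installed):
    """Return the sorted components a node of the given role still lacks."""
    return sorted(REQUIREMENTS[role] - set(installed))


if __name__ == "__main__":
    # Which roles need access to the distributed file system for checkpointing?
    print(roles_needing("shared_fs"))
    # What is still missing on a small master VM that only has Spark installed?
    print(missing("master", ["spark"]))
```

Under this summary, a small VM with only Spark installed is enough for the master, which matches Mohammed's earlier reply that the master can be a smaller machine and does not need C*.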