YARN and Mesos are better suited to production clusters of "non-trivial"
size that run mixed job types for multiple users, because they manage
resources more intelligently and dynamically. They also support the other
services you probably need, like HDFS, databases, workflow tools, etc.

Standalone is fine, though, if you have a limited number of jobs competing
for resources, for example a small cluster dedicated to ingesting or
processing a specific kind of data, or for a dev/QA cluster. Standalone
mode has much lower overhead, but you have to manage the daemon services
yourself, including configuration of ZooKeeper if you need master failover.
Hence, you don't see it often in production scenarios.
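
One practical place the choice shows up is the master URL you hand to
spark-submit. Here's a rough sketch of how that differs per cluster
manager. The helper name master_url and the host names are made up for
illustration, and the ports are just the usual defaults (7077 for a
Standalone master, 5050 for Mesos). Recent Spark releases accept plain
"yarn"; older 1.x releases used "yarn-client" or "yarn-cluster" instead.

```python
# Illustrative helper, not part of Spark: builds the value for --master.
DEFAULT_PORTS = {"standalone": 7077, "mesos": 5050}  # usual default ports
SCHEMES = {"standalone": "spark", "mesos": "mesos"}


def master_url(manager, hosts=()):
    """Return a master URL string for the given cluster manager."""
    if manager == "yarn":
        # The ResourceManager address is read from the Hadoop configuration
        # (HADOOP_CONF_DIR or YARN_CONF_DIR), so no host goes in the URL.
        return "yarn"
    if manager not in SCHEMES:
        raise ValueError(f"unknown cluster manager: {manager}")
    if not hosts:
        raise ValueError(f"{manager} needs at least one master host")
    port = DEFAULT_PORTS[manager]
    # With ZooKeeper-based failover in Standalone mode, list every master,
    # comma-separated, so the application can find whichever is the leader.
    return f"{SCHEMES[manager]}://" + ",".join(f"{h}:{port}" for h in hosts)


if __name__ == "__main__":
    # Hypothetical host names.
    print(master_url("standalone", ["master1", "master2"]))
    print(master_url("mesos", ["mesos-master"]))
    print(master_url("yarn"))
```

The result is what you'd pass as, e.g., spark-submit --master <url>. Note
that only the Standalone case makes you track the master hosts yourself,
which is part of the extra management burden mentioned above.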

The Spark page on cluster deployments has more details:
http://spark.apache.org/docs/latest/cluster-overview.html

dean

Dean Wampler, Ph.D.
Author: Programming Scala, 2nd Edition
<http://shop.oreilly.com/product/0636920033073.do> (O'Reilly)
Typesafe <http://typesafe.com>
@deanwampler <http://twitter.com/deanwampler>
http://polyglotprogramming.com

On Wed, Jul 22, 2015 at 6:56 PM, Dogtail Ray <[email protected]> wrote:

> Hi,
>
> I am very curious about the differences between Standalone mode and YARN
> mode. According to
> http://blog.cloudera.com/blog/2014/05/apache-spark-resource-management-and-yarn-app-models/,
> it seems that YARN mode is always better than Standalone mode. Is that the
> case? Or should I choose different modes according to my specific
> requirements? Thanks!
>
