We are experimenting with running Spark on Mesos after running it
successfully in Standalone mode for a few months. With the Standalone
resource manager (as well as YARN), you have the option to define the
number of cores, the number of executors, and the memory per executor.
On Mesos, however, it appears that you cannot specify the number of
executors, even in coarse-grained mode. If that is the case, how do
you control how many executors your job runs with?
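
For reference, here is roughly how we express this today, as a sketch
using property names from the Spark configuration docs
(spark.executor.instances and spark.executor.cores are honored in
YARN mode; availability may vary by version):

    import org.apache.spark.SparkConf

    // Standalone: cap the total cores; memory is set per executor
    val standaloneConf = new SparkConf()
      .set("spark.cores.max", "256")        // total cores across the cluster
      .set("spark.executor.memory", "30g")  // memory per executor

    // YARN: the executor count itself is explicit
    val yarnConf = new SparkConf()
      .set("spark.executor.instances", "16")  // number of executors
      .set("spark.executor.cores", "16")      // cores per executor
      .set("spark.executor.memory", "30g")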

Here's an example of why this matters (to us). Let's say we have the
following cluster:

num nodes: 8
num cores: 256 (32 per node)
total memory: 512GB (64GB per node)

If I set my job to require 256 cores and the per-executor memory to
30GB, then Mesos will schedule a single executor per machine (8
executors total), and each executor will get 32 cores to work with.
That means we have 8 executors * 30GB each, for a total of 240GB of
cluster memory in use, less than half of what is available. If we
actually want 16 executors in order to increase the amount of memory
in use across the cluster, how can we do this with Mesos? It seems a
parameter is missing (or I haven't found it yet) that would let me
tune this for Mesos:
 * the number of executors per n cores, OR
 * the total number of executors
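
To make that concrete, here is a minimal sketch of what we can
express today in coarse-grained mode versus what we would like to
express. The commented-out per-executor core cap at the end is the
missing knob: spark.executor.cores exists on YARN, but I don't see it
honored on Mesos, so treat it as hypothetical here:

    import org.apache.spark.SparkConf

    // What we can express today (coarse-grained Mesos):
    val conf = new SparkConf()
      .setAppName("sizing-example")
      .setMaster("mesos://host:5050")        // placeholder master URL
      .set("spark.mesos.coarse", "true")     // coarse-grained mode
      .set("spark.cores.max", "256")         // total cores across the cluster
      .set("spark.executor.memory", "30g")   // memory per executor
    // Result: Mesos packs 32 cores per node -> 8 executors * 30GB = 240GB used.

    // What we'd like to express (hypothetical on Mesos):
    //   .set("spark.executor.cores", "16")  // cap cores per executor
    // A 16-core cap would yield 256 / 16 = 16 executors (2 per node),
    // using 16 * 30GB = 480GB of the 512GB available.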

Furthermore, in fine-grained mode on Mesos, how are executors
started and allocated? That is, since Spark tasks map to Mesos tasks,
when and how are executors launched? If they are transient and an
executor is created per task, does that mean we cannot have cached
RDDs?
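
For context, this is the caching pattern we depend on. Cached
partitions live in executor memory, so they only help if the executor
outlives the task that built them (parse below is a stand-in for our
own code):

    // Partitions are cached in executor memory on the first action and
    // reused only while those executors remain alive.
    val data = sc.textFile("hdfs://.../events").map(parse).cache()
    data.count()  // first pass computes and caches the partitions
    data.count()  // cheap only if the caching executors still exist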

Thanks for any advice or pointers,

Josh
