On Tue, Jul 14, 2015 at 12:03 PM, Shushant Arora <shushantaror...@gmail.com>
wrote:

> Can a container have multiple JVMs running in YARN?
>

Yes and no. A container runs a single command, but that process can start
other processes, and those also count towards the resource usage of the
container (mostly memory). For example, pyspark will spawn python processes
from the main JVM.
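Because those spawned python workers live outside the JVM heap, their memory
has to fit in the container's non-heap headroom. A sketch of how you might
account for that on YARN (the overhead flag name is the Spark 1.x one; the
app name and file are hypothetical):

```shell
# Reserve extra non-JVM memory in the YARN container so pyspark worker
# processes don't push the container over its limit and get it killed.
spark-submit \
  --master yarn-cluster \
  --executor-memory 4g \
  --conf spark.yarn.executor.memoryOverhead=1024 \
  my_job.py
```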

But if you're asking about executors (ignoring pyspark and other
non-Scala/Java backends), there will be a single JVM. Spark will allow a
number of concurrent tasks to run that matches the number of vcores you
requested for the executor.
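Concretely, an executor requested like this runs at most 4 tasks at a time
inside its one JVM (standard spark-submit YARN options; the class and jar
names are made up):

```shell
# Each of the 10 executors is a single JVM; up to --executor-cores (4)
# tasks run concurrently as threads inside each one.
spark-submit \
  --master yarn-cluster \
  --num-executors 10 \
  --executor-cores 4 \
  --class com.example.MyJob \
  my-job.jar
```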


> 1. Is the difference that in a Hadoop MapReduce job - say I specify 20
> reducers and my job uses 10 map tasks - it needs 30 containers or 30 vcores
> in total?
>
>

It's not that simple, and trying to compare that to Spark is kinda
misleading: MapReduce asks YARN for one container per map or reduce task,
while Spark runs many tasks inside a small number of long-lived executor
containers.

-- 
Marcelo
