Nice explanation... Thanks!
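
For anyone else who lands on this thread, here is a rough sketch of the
arithmetic Sandy walks through below. The 384 MB overhead and the 0.6
spark.storage.memoryFraction are the defaults he cites; the 1024 MB minimum
allocation is only an assumed value for yarn.scheduler.minimum-allocation-mb
(and 8192 MB is the container max from Xu's cluster), so substitute your own
settings. This is back-of-the-envelope math, not Spark's actual internal code.

// Sketch of the executor sizing arithmetic (assumed cluster settings below).
object ExecutorMemoryMath {
  val overheadMb      = 384    // fixed overhead Spark adds on top of the heap
  val yarnMinAllocMb  = 1024   // yarn.scheduler.minimum-allocation-mb (assumed)
  val yarnMaxAllocMb  = 8192   // container max on the cluster in this thread
  val storageFraction = 0.6    // spark.storage.memoryFraction default

  // Memory Spark requests from YARN for a given --executor-memory (in MB).
  def requestedMb(executorMb: Int): Int = executorMb + overheadMb

  // YARN rounds the request up to the next multiple of the minimum allocation.
  def grantedMb(executorMb: Int): Int =
    ((requestedMb(executorMb) + yarnMinAllocMb - 1) / yarnMinAllocMb) * yarnMinAllocMb

  // Cache space the Spark UI reports: a fraction of the executor heap.
  def cacheMb(executorMb: Int): Int = (executorMb * storageFraction).toInt

  def main(args: Array[String]): Unit = {
    for (gb <- Seq(7, 8)) {
      val mb = gb * 1024
      val granted = grantedMb(mb)
      val outcome =
        if (granted <= yarnMaxAllocMb)
          s"YARN grants a $granted MB container; cache space is ~${cacheMb(mb)} MB"
        else
          s"rounded request ($granted MB) exceeds the $yarnMaxAllocMb MB max, so no container starts"
      println(s"--executor-memory ${gb}g: requests ${requestedMb(mb)} MB; $outcome")
    }
  }
}

Running it for 7g and 8g reproduces the numbers in the thread: 8g asks for
8576 MB and won't fit, while 7g asks for 7552 MB, lands in an 8192 MB
container, and leaves roughly 4.2 GB of heap for caching.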

On Thu, Jun 5, 2014 at 5:50 PM, Sandy Ryza <sandy.r...@cloudera.com> wrote:

> Hi Xu,
>
> As crazy as it might sound, this all makes sense.
>
> There are a few different quantities at play here:
> * the heap size of the executor (controlled by --executor-memory)
> * the amount of memory Spark requests from YARN (the heap size plus
> 384 MB to account for fixed memory costs outside of the heap)
> * the amount of memory yarn grants to the container (yarn rounds up to
> the nearest multiple of yarn.scheduler.minimum-allocation-mb or
> yarn.scheduler.fair.increment-allocation-mb, depending on the
> scheduler used)
> * the amount of memory spark uses for caching on each executor, which
> is spark.storage.memoryFraction (default 0.6) of the executor heap
> size
>
> So, with --executor-memory 8g, Spark requests 8g + 384m from YARN,
> which doesn't fit into its container max.  With --executor-memory 7g,
> Spark requests 7g + 384m from YARN, which does fit into its container
> max and gets rounded up to 8g by the YARN scheduler.  7g is still used
> as the executor heap size, and 0.6 of that is about 4g, which is shown
> as the cache space in the Spark UI.
>
> -Sandy
>
> > On Jun 5, 2014, at 9:44 AM, "Xu (Simon) Chen" <xche...@gmail.com> wrote:
> >
> > I am slightly confused about the "--executor-memory" setting. My yarn
> > cluster has a maximum container memory of 8192MB.
> >
> > When I specify "--executor-memory 8G" in my spark-shell, no container
> > can be started at all. It only works when I lower the executor memory to
> > 7G. But then, on yarn, I see 2 containers per node, using 16G of memory.
> >
> > Then on the spark UI, it shows that each worker has 4GB of memory,
> > rather than 7.
> >
> > Can someone explain the relationship among the numbers I see here?
> >
> > Thanks.
>
