Nice explanation... Thanks!
On Thu, Jun 5, 2014 at 5:50 PM, Sandy Ryza <sandy.r...@cloudera.com> wrote:
> Hi Xu,
>
> As crazy as it might sound, this all makes sense.
>
> There are a few different quantities at play here:
> * the heap size of the executor (controlled by --executor-memory)
> * the amount of memory Spark requests from YARN (the heap size plus
> 384 MB to account for fixed memory costs outside of the heap)
> * the amount of memory YARN grants to the container (YARN rounds up to
> the nearest multiple of yarn.scheduler.minimum-allocation-mb or
> yarn.scheduler.fair.increment-allocation-mb, depending on the
> scheduler used)
> * the amount of memory Spark uses for caching on each executor, which
> is spark.storage.memoryFraction (default 0.6) of the executor heap
> size
>
> So, with --executor-memory 8g, Spark requests 8g + 384m from YARN,
> which doesn't fit into its container max. With --executor-memory 7g,
> Spark requests 7g + 384m from YARN, which fits into its container max.
> This gets rounded up to 8g by the YARN scheduler. 7g is still used
> as the executor heap size, and 0.6 of this is about 4g, shown as the
> cache space in the Spark UI.
>
> -Sandy
>
> On Jun 5, 2014, at 9:44 AM, "Xu (Simon) Chen" <xche...@gmail.com> wrote:
>
> > I am slightly confused about the "--executor-memory" setting. My YARN
> > cluster has a maximum container memory of 8192 MB.
> >
> > When I specify "--executor-memory 8G" in my spark-shell, no container
> > can be started at all. It only works when I lower the executor memory
> > to 7G. But then, on YARN, I see 2 containers per node, using 16G of
> > memory.
> >
> > Then on the Spark UI, it shows that each worker has 4GB of memory,
> > rather than 7.
> >
> > Can someone explain the relationship among the numbers I see here?
> >
> > Thanks.
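
For anyone who wants to sanity-check the arithmetic, here is a rough sketch you can paste into a Scala REPL or spark-shell. The 384 MB overhead and the 0.6 memoryFraction come from Sandy's explanation above; the 1024 MB increment is an assumed value for yarn.scheduler.increment-allocation-mb, not something read from an actual config:

val overheadMb     = 384   // fixed off-heap overhead Spark adds to its request
val incrementMb    = 1024  // assumed yarn.scheduler.increment-allocation-mb
val maxContainerMb = 8192  // YARN max container memory from this thread
val memoryFraction = 0.6   // spark.storage.memoryFraction default

def requestedMb(heapMb: Int) = heapMb + overheadMb   // what Spark asks YARN for
def grantedMb(heapMb: Int) =                         // what YARN rounds the request up to
  math.ceil(requestedMb(heapMb).toDouble / incrementMb).toInt * incrementMb

Seq(8192, 7168).foreach { heapMb =>
  println(s"heap=${heapMb}m request=${requestedMb(heapMb)}m " +
    s"fits=${requestedMb(heapMb) <= maxContainerMb} " +
    s"granted=${grantedMb(heapMb)}m cache=${(heapMb * memoryFraction).toInt}m")
}

// heap=8192m request=8576m fits=false granted=9216m cache=4915m
// heap=7168m request=7552m fits=true  granted=8192m cache=4300m

So an 8g executor heap requests 8576 MB, which exceeds the 8192 MB container max, while a 7g heap requests 7552 MB, gets rounded up to an 8192 MB container, and shows roughly 4.2 GB of cache space in the UI.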