Interesting, I was not aware of spark.yarn.am.nodeLabelExpression.

We do use YARN node labels on EMR; each node is automatically labeled with
its type (MASTER, CORE, or TASK), and we set
yarn.app.mapreduce.am.labels=CORE in yarn-site.xml, but we do not set
spark.yarn.am.nodeLabelExpression.
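
For reference, a rough sketch of how that entry looks in yarn-site.xml
(standard Hadoop property syntax; CORE is just the label the MapReduce AMs
get pinned to):

  <property>
    <name>yarn.app.mapreduce.am.labels</name>
    <value>CORE</value>
  </property>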

Does Spark not actually honor yarn.app.mapreduce.am.labels? It seems odd
that Spark would have its own similar-sounding property
(spark.yarn.am.nodeLabelExpression). If Spark uses
spark.yarn.am.nodeLabelExpression and ignores yarn.app.mapreduce.am.labels,
I could be wrong about Spark AMs only running on CORE instances in EMR.

My guess, though, is that spark.yarn.am.nodeLabelExpression simply overrides
yarn.app.mapreduce.am.labels, so the latter is treated as a default when it
is set and spark.yarn.am.nodeLabelExpression is not. Is that correct?

In short, Alex, you should not need to set any of the label-related
properties yourself if you follow my earlier suggestion of using small CORE
instances and large TASK instances. But if you want to do something
different, you could also add a TASK instance group with small nodes,
configure it with a new label, and then set
spark.yarn.am.nodeLabelExpression to that label.
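
For example, if you gave that instance group a label called AM_NODES (just a
placeholder name here), a sketch of passing it at submit time would be:

  spark-submit \
    --conf spark.yarn.am.nodeLabelExpression=AM_NODES \
    ... (rest of your usual arguments)

or, equivalently, set spark.yarn.am.nodeLabelExpression to AM_NODES in
spark-defaults.conf.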

Thanks, Marcelo, for pointing out spark.yarn.am.nodeLabelExpression!

~ Jonathan

On Tue, Feb 9, 2016 at 9:54 AM Marcelo Vanzin <van...@cloudera.com> wrote:

> You should be able to use spark.yarn.am.nodeLabelExpression if your
> version of YARN supports node labels (and you've added a label to the
> node where you want the AM to run).
>
> On Tue, Feb 9, 2016 at 9:51 AM, Alexander Pivovarov
> <apivova...@gmail.com> wrote:
> > The AM container starts first, and YARN selects a random node to run it.
> >
> > Is it possible to configure YARN so that it selects a small node for the
> > AM container?
> >
> > On Feb 9, 2016 12:40 AM, "Sean Owen" <so...@cloudera.com> wrote:
> >>
> >> If it's too small to run an executor, I'd think it would be chosen for
> >> the AM as the only way to satisfy the request.
> >>
> >> On Tue, Feb 9, 2016 at 8:35 AM, Alexander Pivovarov
> >> <apivova...@gmail.com> wrote:
> >> > If I add an additional small box to the cluster, can I configure YARN
> >> > to select the small box to run the AM container?
> >> >
> >> >
> >> > On Mon, Feb 8, 2016 at 10:53 PM, Sean Owen <so...@cloudera.com> wrote:
> >> >>
> >> >> Typically YARN is there because you're mediating resource requests
> >> >> from things besides Spark, so yeah using every bit of the cluster is a
> >> >> little bit of a corner case. There's not a good answer if all your
> >> >> nodes are the same size.
> >> >>
> >> >> I think you can let YARN over-commit RAM though, and allocate more
> >> >> memory than it actually has. It may be beneficial to let them all
> >> >> think they have an extra GB, and let one node running the AM
> >> >> technically be overcommitted, a state which won't hurt at all unless
> >> >> you're really really tight on memory, in which case something might
> >> >> get killed.
> >> >>
> >> >> On Tue, Feb 9, 2016 at 6:49 AM, Jonathan Kelly <jonathaka...@gmail.com>
> >> >> wrote:
> >> >> > Alex,
> >> >> >
> >> >> > That's a very good question that I've been trying to answer myself
> >> >> > recently too. Since you've mentioned before that you're using EMR, I
> >> >> > assume you're asking this because you've noticed this behavior on
> >> >> > emr-4.3.0.
> >> >> >
> >> >> > In this release, we made some changes to the
> >> >> > maximizeResourceAllocation setting (which you may or may not be
> >> >> > using, but either way this issue is present), including the
> >> >> > accidental introduction of a bug that makes it not reserve any space
> >> >> > for the AM, which ultimately results in one of the nodes being
> >> >> > utilized only by the AM and not an executor.
> >> >> >
> >> >> > However, as you point out, the only viable fix seems to be to
> >> >> > reserve enough memory for the AM on *every single node*, which in
> >> >> > some cases might actually be worse than wasting a lot of memory on a
> >> >> > single node.
> >> >> >
> >> >> > So yeah, I also don't like either option. Is this just the price
> >> >> > you pay for running on YARN?
> >> >> >
> >> >> >
> >> >> > ~ Jonathan
> >> >> >
> >> >> > On Mon, Feb 8, 2016 at 9:03 PM Alexander Pivovarov
> >> >> > <apivova...@gmail.com> wrote:
> >> >> >>
> >> >> >> Let's say that YARN has 53GB of memory available on each slave.
> >> >> >>
> >> >> >> The Spark AM container needs 896MB (512MB + 384MB overhead).
> >> >> >>
> >> >> >> I see two options for configuring Spark:
> >> >> >>
> >> >> >> 1. Configure Spark executors to use 52GB and leave 1GB free on each
> >> >> >> box, so one of the boxes will also run the AM container. Then 1GB of
> >> >> >> memory will go unused on all slaves but one.
> >> >> >>
> >> >> >> 2. Configure Spark to use all 53GB and add an additional 53GB box
> >> >> >> that will run only the AM container. Then 52GB on this additional
> >> >> >> box will do nothing.
> >> >> >>
> >> >> >> I don't like either option. Is there a better way to configure
> >> >> >> YARN/Spark?
> >> >> >>
> >> >> >>
> >> >> >> Alex
> >> >
> >> >
>
>
>
> --
> Marcelo
>
