You can set custom per-instance-group configurations (e.g.,
[{"Classification":"yarn-site","Properties":{"yarn.nodemanager.labels":"SPARKAM"}}])
using the Configurations parameter of
http://docs.aws.amazon.com/ElasticMapReduce/latest/API/API_InstanceGroupConfig.html.
Unfortunately, it's not currently possible to specify per-instance-group
configurations via the CLI, only cluster-wide configurations.
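
For reference, a complete instance group entry with a per-group yarn-site
override might look roughly like this (only the Configurations part is the
new piece; the SPARKAM label name is just the example from above, and the
group name/instance type are taken from your CLI snippet below):

  {
    "Name": "sparkAM",
    "InstanceRole": "TASK",
    "InstanceType": "m3.xlarge",
    "InstanceCount": 1,
    "Configurations": [
      {
        "Classification": "yarn-site",
        "Properties": { "yarn.nodemanager.labels": "SPARKAM" }
      }
    ]
  }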

~ Jonathan

On Tue, Feb 9, 2016 at 12:36 PM Alexander Pivovarov <apivova...@gmail.com>
wrote:

> Thanks Jonathan
>
> Actually I'd like to use maximizeResourceAllocation.
>
> Ideally for me would be to add a new instance group with a single small box
> labelled as AM.
> I'm not sure "aws emr create-cluster" supports setting custom labels; the
> only settings available are:
>
> InstanceCount=1,BidPrice=0.5,Name=sparkAM,InstanceGroupType=TASK,InstanceType=m3.xlarge
>
>
> How can I specify the YARN label AM for that box?
>
>
>
> On Tue, Feb 9, 2016 at 12:16 PM, Jonathan Kelly <jonathaka...@gmail.com>
> wrote:
>
>> Interesting, I was not aware of spark.yarn.am.nodeLabelExpression.
>>
>> We do use YARN labels on EMR; each node is automatically labeled with its
>> type (MASTER, CORE, or TASK). And we do
>> set yarn.app.mapreduce.am.labels=CORE in yarn-site.xml, but we do not set
>> spark.yarn.am.nodeLabelExpression.
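>>
>> (For reference, the yarn.app.mapreduce.am.labels=CORE setting is roughly
>> this in yarn-site.xml:
>>
>>   <property>
>>     <name>yarn.app.mapreduce.am.labels</name>
>>     <value>CORE</value>
>>   </property>
>>
>> with CORE being one of the automatic node labels mentioned above.)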
>>
>> Does Spark somehow not actually honor this? It seems weird that Spark
>> would have its own similar-sounding property
>> (spark.yarn.am.nodeLabelExpression). If spark.yarn.am.nodeLabelExpression
>> is used and yarn.app.mapreduce.am.labels ignored, I could be wrong about
>> Spark AMs only running on CORE instances in EMR.
>>
>> I'm guessing though that spark.yarn.am.nodeLabelExpression would simply
>> override yarn.app.mapreduce.am.labels, so yarn.app.mapreduce.am.labels
>> would be treated as a default when it is set and
>> spark.yarn.am.nodeLabelExpression is not. Is that correct?
>>
>> In short, Alex, you should not need to set any of the label-related
>> properties yourself if you do what I suggested regarding using small CORE
>> instances and large TASK instances. But if you want to do something
>> different, it would also be possible to add a TASK instance group with
>> small nodes and configured with some new label. Then you could set
>> spark.yarn.am.nodeLabelExpression to that label.
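>>
>> (For example, if that new label were SPARKAM, something roughly like this
>> in spark-defaults.conf should do it; this is just a sketch I have not
>> tested, and the label name is only an example:
>>
>>     spark.yarn.am.nodeLabelExpression  SPARKAM
>>
>> or, per job, spark-submit --conf spark.yarn.am.nodeLabelExpression=SPARKAM.)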
>>
>> Thanks, Marcelo, for pointing out spark.yarn.am.nodeLabelExpression!
>>
>> ~ Jonathan
>>
>> On Tue, Feb 9, 2016 at 9:54 AM Marcelo Vanzin <van...@cloudera.com>
>> wrote:
>>
>>> You should be able to use spark.yarn.am.nodeLabelExpression if your
>>> version of YARN supports node labels (and you've added a label to the
>>> node where you want the AM to run).
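>>>
>>> (Roughly, on YARN 2.6+ with node labels enabled on the ResourceManager
>>> (yarn.node-labels.enabled=true plus a label store directory configured),
>>> the label can be created and assigned to the small node with something
>>> like the following; the SPARKAM name and host placeholder are just
>>> examples:
>>>
>>>   yarn rmadmin -addToClusterNodeLabels "SPARKAM"
>>>   yarn rmadmin -replaceLabelsOnNode "<small-node-hostname>=SPARKAM"
>>>
>>> and then point spark.yarn.am.nodeLabelExpression at that label.)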
>>>
>>> On Tue, Feb 9, 2016 at 9:51 AM, Alexander Pivovarov
>>> <apivova...@gmail.com> wrote:
>>> > The AM container starts first and YARN selects a random computer to run
>>> > it.
>>> >
>>> > Is it possible to configure YARN so that it selects a small computer for
>>> > the AM container?
>>> >
>>> > On Feb 9, 2016 12:40 AM, "Sean Owen" <so...@cloudera.com> wrote:
>>> >>
>>> >> If it's too small to run an executor, I'd think it would be chosen for
>>> >> the AM as the only way to satisfy the request.
>>> >>
>>> >> On Tue, Feb 9, 2016 at 8:35 AM, Alexander Pivovarov
>>> >> <apivova...@gmail.com> wrote:
>>> >> > If I add an additional small box to the cluster, can I configure YARN
>>> >> > to select the small box to run the AM container?
>>> >> >
>>> >> >
>>> >> > On Mon, Feb 8, 2016 at 10:53 PM, Sean Owen <so...@cloudera.com>
>>> >> > wrote:
>>> >> >>
>>> >> >> Typically YARN is there because you're mediating resource requests
>>> >> >> from things besides Spark, so yeah using every bit of the cluster is
>>> >> >> a little bit of a corner case. There's not a good answer if all your
>>> >> >> nodes are the same size.
>>> >> >>
>>> >> >> I think you can let YARN over-commit RAM though, and allocate more
>>> >> >> memory than it actually has. It may be beneficial to let them all
>>> >> >> think they have an extra GB, and let one node running the AM
>>> >> >> technically be overcommitted, a state which won't hurt at all unless
>>> >> >> you're really really tight on memory, in which case something might
>>> >> >> get killed.
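>>> >> >>
>>> >> >> (Concretely, that would mean setting yarn.nodemanager.resource.memory-mb
>>> >> >> in yarn-site.xml a bit above what the node really has, e.g. 55296 (54GB)
>>> >> >> rather than 54272 (53GB) in Alex's example, so the extra GB absorbs the
>>> >> >> 896MB AM container; the numbers here are only illustrative.)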
>>> >> >>
>>> >> >> On Tue, Feb 9, 2016 at 6:49 AM, Jonathan Kelly <jonathaka...@gmail.com>
>>> >> >> wrote:
>>> >> >> > Alex,
>>> >> >> >
>>> >> >> > That's a very good question that I've been trying to answer myself
>>> >> >> > recently too. Since you've mentioned before that you're using EMR, I
>>> >> >> > assume you're asking this because you've noticed this behavior on
>>> >> >> > emr-4.3.0.
>>> >> >> >
>>> >> >> > In this release, we made some changes to the maximizeResourceAllocation
>>> >> >> > setting (which you may or may not be using, but either way this issue
>>> >> >> > is present), including accidentally introducing a bug that makes it not
>>> >> >> > reserve any space for the AM, which ultimately results in one of the
>>> >> >> > nodes being utilized only by the AM and not an executor.
>>> >> >> >
>>> >> >> > However, as you point out, the only viable fix seems to be to reserve
>>> >> >> > enough memory for the AM on *every single node*, which in some cases
>>> >> >> > might actually be worse than wasting a lot of memory on a single node.
>>> >> >> >
>>> >> >> > So yeah, I also don't like either option. Is this just the price you
>>> >> >> > pay for running on YARN?
>>> >> >> >
>>> >> >> >
>>> >> >> > ~ Jonathan
>>> >> >> >
>>> >> >> > On Mon, Feb 8, 2016 at 9:03 PM Alexander Pivovarov
>>> >> >> > <apivova...@gmail.com>
>>> >> >> > wrote:
>>> >> >> >>
>>> >> >> >> Let's say that YARN has 53GB of memory available on each slave.
>>> >> >> >>
>>> >> >> >> The Spark AM container needs 896MB (512 + 384).
>>> >> >> >>
>>> >> >> >> I see two options to configure Spark:
>>> >> >> >>
>>> >> >> >> 1. Configure Spark executors to use 52GB and leave 1GB on each box.
>>> >> >> >> Some box will also run the AM container, so 1GB of memory will go
>>> >> >> >> unused on all slaves but one.
>>> >> >> >>
>>> >> >> >> 2. Configure Spark to use all 53GB and add an additional 53GB box
>>> >> >> >> which will run only the AM container. So 52GB on this additional box
>>> >> >> >> will do nothing.
>>> >> >> >>
>>> >> >> >> I do not like either option. Is there a better way to configure
>>> >> >> >> yarn/spark?
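>>> >> >> >>
>>> >> >> >> (For concreteness, option 1 in spark-defaults terms would mean roughly,
>>> >> >> >> assuming one executor per box and the overhead set explicitly:
>>> >> >> >>
>>> >> >> >>   spark.executor.memory                47g
>>> >> >> >>   spark.yarn.executor.memoryOverhead   5120
>>> >> >> >>
>>> >> >> >> so each executor container asks YARN for 52GB and the leftover 1GB on
>>> >> >> >> one box can hold the 896MB AM container; the exact numbers are only
>>> >> >> >> illustrative.)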
>>> >> >> >>
>>> >> >> >>
>>> >> >> >> Alex
>>> >> >
>>> >> >
>>>
>>>
>>>
>>> --
>>> Marcelo
>>>
>>
>
