If it's too small to run an executor, I'd think it would be chosen for
the AM as the only way to satisfy the request.

On Tue, Feb 9, 2016 at 8:35 AM, Alexander Pivovarov
<apivova...@gmail.com> wrote:
> If I add an additional small box to the cluster, can I configure YARN to
> select that small box to run the AM container?
>
>
> On Mon, Feb 8, 2016 at 10:53 PM, Sean Owen <so...@cloudera.com> wrote:
>>
>> Typically YARN is there because you're mediating resource requests
>> from things besides Spark, so yeah using every bit of the cluster is a
>> little bit of a corner case. There's not a good answer if all your
>> nodes are the same size.
>>
>> I think you can let YARN over-commit RAM though, and allocate more
>> memory than it actually has. It may be beneficial to let them all
>> think they have an extra GB, and let one node running the AM
>> technically be overcommitted, a state which won't hurt at all unless
>> you're really really tight on memory, in which case something might
>> get killed.
>>
>> On Tue, Feb 9, 2016 at 6:49 AM, Jonathan Kelly <jonathaka...@gmail.com>
>> wrote:
>> > Alex,
>> >
>> > That's a very good question that I've been trying to answer myself
>> > recently too. Since you've mentioned before that you're using EMR, I
>> > assume you're asking this because you've noticed this behavior on
>> > emr-4.3.0.
>> >
>> > In this release, we made some changes to the maximizeResourceAllocation
>> > setting (which you may or may not be using, but either way this issue
>> > is present), accidentally including something of a bug that makes it
>> > not reserve any space for the AM, which ultimately results in one of
>> > the nodes being utilized only by the AM and not by an executor.
>> >
>> > However, as you point out, the only viable fix seems to be to reserve
>> > enough memory for the AM on *every single node*, which in some cases
>> > might actually be worse than wasting a lot of memory on a single node.
>> >
>> > So yeah, I also don't like either option. Is this just the price you
>> > pay for running on YARN?
>> >
>> >
>> > ~ Jonathan
>> >
>> > On Mon, Feb 8, 2016 at 9:03 PM Alexander Pivovarov
>> > <apivova...@gmail.com>
>> > wrote:
>> >>
>> >> Let's say that YARN has 53GB of memory available on each slave.
>> >>
>> >> The Spark AM container needs 896MB (512 + 384).
>> >>
>> >> I see two options to configure spark:
>> >>
>> >> 1. Configure Spark executors to use 52GB and leave 1GB free on each
>> >> box, so that some box can also run the AM container. 1GB of memory
>> >> will then go unused on every slave but one.
>> >>
>> >> 2. Configure Spark to use all 53GB and add an additional 53GB box that
>> >> will run only the AM container. 52GB on this additional box will then
>> >> do nothing.
>> >>
>> >> I do not like either option. Is there a better way to configure
>> >> YARN/Spark?
>> >>
>> >>
>> >> Alex
>
>
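
For anyone weighing the two options from the original question, here is a
rough sketch of the arithmetic. It is not from any of the posters. It only
counts idle YARN memory, assumes one executor per node, and ignores any
rounding YARN applies to container requests. The function names and the
cluster sizes in the loop are hypothetical; the 53GB, 1GB and 896MB
figures are the ones quoted in the thread.

```python
# Illustrative arithmetic only. Function names and node counts are
# hypothetical; memory figures come from the thread.
MB_PER_GB = 1024


def idle_mb_reserve_on_every_node(node_count, reserve_mb, am_mb):
    """Option 1: every node leaves reserve_mb free; the AM uses am_mb of
    that reserve on exactly one node."""
    return node_count * reserve_mb - am_mb


def idle_mb_dedicated_am_node(node_mb, am_mb):
    """Option 2: one extra node of node_mb runs nothing but the AM."""
    return node_mb - am_mb


am_mb = 512 + 384            # AM container size quoted in the thread
node_mb = 53 * MB_PER_GB     # YARN memory available per slave
reserve_mb = 1 * MB_PER_GB   # per-node reserve in option 1

for node_count in (10, 53, 100):  # hypothetical cluster sizes
    option1 = idle_mb_reserve_on_every_node(node_count, reserve_mb, am_mb)
    option2 = idle_mb_dedicated_am_node(node_mb, am_mb)
    print(f"{node_count} nodes: option 1 idles {option1} MB, "
          f"option 2 idles {option2} MB")
```

Under these assumptions the two options idle the same amount of memory at
53 executor nodes. Below that, reserving 1GB per node idles less; above
it, the dedicated AM box idles less, though it also means paying for one
more machine.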

