Hmm... I did try increasing it to a few GB, but I have not gotten a successful
run yet...

Any idea what a typical spark.yarn.executor.memoryOverhead would be if I am
using, say, 40 executors with 16 GB each, for roughly 100M x 10M matrices with
a few billion ratings?
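
For concreteness, this is a rough sketch of the kind of settings I mean (the
2048 MB overhead below is just an illustrative guess, not a value known to
work for this data size):

    import org.apache.spark.SparkConf

    // Executor count (e.g. --num-executors 40) is passed on the spark-submit
    // command line; the memory settings can go in SparkConf.
    val conf = new SparkConf()
      .setAppName("ALS-on-YARN")
      .set("spark.executor.memory", "16g")  // heap per executor
      // Off-heap headroom (in MB) that YARN grants on top of the 16 GB heap;
      // the idea is to raise it in steps until containers stop being killed.
      .set("spark.yarn.executor.memoryOverhead", "2048")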

On Tue, Sep 9, 2014 at 10:49 AM, Sandy Ryza <sandy.r...@cloudera.com> wrote:

> Hi Deb,
>
> The current state of the art is to increase
> spark.yarn.executor.memoryOverhead until the job stops failing.  We do have
> plans to try to automatically scale this based on the amount of memory
> requested, but it will still just be a heuristic.
>
> -Sandy
>
> On Tue, Sep 9, 2014 at 7:32 AM, Debasish Das <debasish.da...@gmail.com>
> wrote:
>
>> Hi Sandy,
>>
>> Any resolution for YARN failures ? It's a blocker for running spark on
>> top of YARN.
>>
>> Thanks.
>> Deb
>>
>> On Tue, Aug 19, 2014 at 11:29 PM, Xiangrui Meng <men...@gmail.com> wrote:
>>
>>> Hi Deb,
>>>
>>> I think this may be the same issue as described in
>>> https://issues.apache.org/jira/browse/SPARK-2121 . We know that the
>>> container got killed by YARN because it used much more memory than it
>>> requested. But we haven't figured out the root cause yet.
>>>
>>> +Sandy
>>>
>>> Best,
>>> Xiangrui
>>>
>>> On Tue, Aug 19, 2014 at 8:51 PM, Debasish Das <debasish.da...@gmail.com>
>>> wrote:
>>> > Hi,
>>> >
>>> > During the 4th ALS iteration, I am noticing that one of the executors
>>> > gets disconnected:
>>> >
>>> > 14/08/19 23:40:00 ERROR network.ConnectionManager: Corresponding
>>> > SendingConnectionManagerId not found
>>> >
>>> > 14/08/19 23:40:00 INFO cluster.YarnClientSchedulerBackend: Executor 5
>>> > disconnected, so removing it
>>> >
>>> > 14/08/19 23:40:00 ERROR cluster.YarnClientClusterScheduler: Lost executor 5
>>> > on tblpmidn42adv-hdp.tdc.vzwcorp.com: remote Akka client disassociated
>>> >
>>> > 14/08/19 23:40:00 INFO scheduler.DAGScheduler: Executor lost: 5 (epoch 12)
>>> >
>>> > Any idea if this is a bug related to Akka on YARN?
>>> >
>>> > I am using master
>>> >
>>> > Thanks.
>>> > Deb
>>>
>>
>>
>
