Hi,

Yes, this is a known problem, and I'm not aware of any simple workarounds
(or complex ones, for that matter). There are people working on a fix;
you can follow progress here:
https://issues.apache.org/jira/browse/SPARK-1239
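
For reference, the failure mode is roughly this: each executor's request
for the map output locations causes the driver to serialize its own copy
of the statuses, so the transient heap usage scales linearly with the
executor count. A back-of-envelope sketch in Scala, using the figures
from your message below (both numbers are your estimates, not
measurements):

    // Each executor's request allocates a fresh serialized copy of the
    // map output statuses on the driver, so transient memory is roughly
    // numExecutors * statusSize.
    val statusSize   = 10L * 1024 * 1024          // ~10 MB of output statuses
    val numExecutors = 500                        // all requesting at once
    val bytesNeeded  = numExecutors * statusSize
    println(f"${bytesNeeded / math.pow(1024, 3)}%.1f GB of HeapByteBuffers")
    // => ~4.9 GB, easily enough to exhaust a 4 GB driver heap

That also matches the HeapByteBuffer <init> allocation you saw in the
stack trace.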

On Tue, Sep 9, 2014 at 2:54 PM, jbeynon <jbey...@gmail.com> wrote:
> I'm running on YARN with relatively small instances (4 GB of memory). I'm not
> caching any data, but when the map stage ends and shuffling begins, all of the
> executors request the map output locations at the same time, which seems to
> kill the driver once the number of executors is turned up.
>
> For example, the "size of output statuses" is about 10 MB, and with 500
> executors the driver appears to be making 500 copies of this data (about
> 5 GB) to send out, and running out of memory. When I turn down the number of
> executors, everything runs fine.
>
> Has anyone else run into this? Maybe I'm misunderstanding the underlying
> cause. I don't have a copy of the stack trace handy but can recreate it if
> necessary; it pointed somewhere in the <init> of HeapByteBuffer. Any advice
> would be helpful.
>



-- 
Marcelo

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
