Re: In-Memory Only Spark Shuffle

Hyukjin Kwon Fri, 15 Apr 2016 08:06:17 -0700

This reminds me of this JIRA issue,
https://issues.apache.org/jira/browse/SPARK-3376, and this pull request,
https://github.com/apache/spark/pull/5403.

AFAIK, in-memory-only shuffle is not supported and won't be.
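
A workaround that is sometimes suggested, sketched here as an assumption
rather than a tested recipe, is to leave spilling alone and point Spark's
scratch directory at a RAM-backed filesystem such as tmpfs, so shuffle files
and spills land in memory instead of on a physical disk. The path and the
off-heap size below are hypothetical values, and cluster managers may
override spark.local.dir through SPARK_LOCAL_DIRS or LOCAL_DIRS:

    # spark-defaults.conf (illustrative values only)
    # Scratch space for shuffle files; /dev/shm is assumed to be a tmpfs mount.
    spark.local.dir                /dev/shm/spark-scratch
    # Off-heap execution memory (1.6.x); the size is given in bytes (512 GiB here).
    spark.memory.offHeap.enabled   true
    spark.memory.offHeap.size      549755813888

This does not make the shuffle memory-only inside Spark; it only keeps the
files it writes off the IO subsystem, and the job can still fail if the
tmpfs mount fills up.
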
On 2 Apr 2016 4:13 a.m., "slavitch" <slavi...@gmail.com> wrote:

> Hello;
>
> I’m working on Spark with very large memory systems (2TB+) and notice that
> Spark spills to disk in shuffle. Is there a way to force Spark to stay
> exclusively in memory when doing shuffle operations? The goal is to keep
> the shuffle data either in the heap or in off-heap memory (in 1.6.x) and
> never touch the IO subsystem. I am willing to have the job fail if it runs
> out of RAM.
>
> Setting spark.shuffle.spill to false is deprecated in 1.6 and does not
> work with the Tungsten sort in 1.5.x:
>
> "WARN UnsafeShuffleManager: spark.shuffle.spill was set to false, but this
> is ignored by the tungsten-sort shuffle manager; its optimized shuffles
> will continue to spill to disk when necessary."
>
> If this is impossible via configuration changes, what code changes would
> be needed to accomplish this?
