FWIW, I've seen correctness errors with spark.shuffle.spill on 0.9.0 and have it disabled now. The specific symptom was that a join would consistently return one row count with spill enabled and a different row count with it disabled.
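If it helps, this is roughly how I'm disabling it — a minimal sketch using SparkConf (the app name is just a placeholder; in 0.9 you can equivalently set it as a system property, e.g. -Dspark.shuffle.spill=false):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Disable shuffle spilling for the whole application.
val conf = new SparkConf()
  .setAppName("join-job") // placeholder name
  .set("spark.shuffle.spill", "false")
val sc = new SparkContext(conf)
```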
Sent from my mobile phone

On Mar 22, 2014 1:52 PM, "Kane" <[email protected]> wrote:
> But i was wrong - map also fails on big file and setting
> spark.shuffle.spill doesn't help. Map fails with the same error.
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/distinct-on-huge-dataset-tp3025p3039.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
