Hi Reynold,

It took me some time, but I've finally worked out that spilling behaves
differently on the map side and the reduce side of a shuffle. Spilling to
disk on the map side happens by default (through the spillToPartitionFiles
call from insertAll in ExternalSorter; I don't know yet why the number of
calls differs, though). Spilling on the reduce side (through the
maybeSpillCollection call from insertAll in ExternalSorter) is optional and
depends on the memory budget derived from spark.shuffle.memoryFraction and
the total memory available. In my case I was only seeing the map-side
spilling, and I had not realized that this is supposed to happen regardless
of the memory settings.
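
For anyone who finds this thread later, here is a rough sketch of how I
understand the reduce-side budget to be derived. It is plain Python
arithmetic, not Spark code: shuffle_memory_budget is a hypothetical helper,
and the 0.2 and 0.8 defaults are my assumptions for
spark.shuffle.memoryFraction and spark.shuffle.safetyFraction in the 1.x
line, so please check them against the version you run.

```python
# Hypothetical helper, not part of Spark: estimates the memory an executor
# sets aside for shuffle data before reduce-side spilling becomes likely.
def shuffle_memory_budget(max_heap_bytes, memory_fraction=0.2, safety_fraction=0.8):
    # Assumed formula: heap size * spark.shuffle.memoryFraction
    #                            * spark.shuffle.safetyFraction
    return int(max_heap_bytes * memory_fraction * safety_fraction)


heap_bytes = 4 * 1024 ** 3  # example executor heap of 4 GiB
budget_bytes = shuffle_memory_budget(heap_bytes)
print("approx. shuffle budget: %.0f MiB" % (budget_bytes / 1024.0 ** 2))
```

As far as I can tell this budget is shared by the tasks running in the same
executor, so the amount a single task can use before it spills is smaller
than the figure printed here.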

Thanks for your help,

Tom


