Well, for what it's worth, I found the issue after spending the whole night
running experiments ;).

Basically, I needed to pass a higher number of partitions to groupByKey.
I was simply using the default, which produced only 4 partitions, so the
whole thing blew up.
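
For anyone hitting the same wall: groupByKey takes an explicit numPartitions
argument. Here is a minimal, simplified sketch of the change (not my actual
job; the RDD and the value 200 are just illustrative stand-ins):

    import org.apache.spark.{SparkConf, SparkContext}
    // Needed on older Spark versions for the pair-RDD implicits.
    import org.apache.spark.SparkContext._

    object GroupByKeyPartitions {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(
          new SparkConf().setAppName("groupByKey-partitions").setMaster("local[4]"))

        // Hypothetical (key, value) RDD standing in for the real data set.
        val pairs = sc.parallelize(1 to 1000000).map(i => (i % 1000, i))

        // Default: groupByKey() reuses the parent RDD's partition count, so
        // with only 4 partitions each one must materialize huge groups in
        // memory at once:
        // val grouped = pairs.groupByKey()

        // Fix: request more partitions explicitly so each stays small.
        // 200 is an illustrative value, not a recommendation.
        val grouped = pairs.groupByKey(200)

        println(grouped.partitions.length) // => 200
        sc.stop()
      }
    }

The same idea applies to the other shuffle operations (reduceByKey, join,
etc.), which all accept a numPartitions argument.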



