Re: How does shuffle work in spark ?

shahid Mon, 19 Oct 2015 06:55:09 -0700

@all I ran partitionBy with the default hash partitioner on data of the form
[(1,data),(2,data),...,(n,data)].
The total data was approx 3.5, yet it showed a shuffle write of 50G, and on the
next action, e.g. count, it shows a shuffle read of 50G. I don't understand this
behaviour, and I think performance is slowing down with so much shuffle
read on the subsequent transformation operations.
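
To make the setup concrete, here is a rough pure-Python sketch of what hash partitioning does to keyed records like the ones described. It is only an illustration, not Spark's own code: hash_partition, num_partitions and the sample records are hypothetical names and values, and the bucket is simply hash(key) modulo the partition count. Every record has to be placed in some bucket, which is why partitionBy involves a shuffle over the whole dataset.

```python
from collections import defaultdict


def hash_partition(records, num_partitions):
    """Group (key, value) records into buckets by hash(key) % num_partitions.

    Illustrative only: this mimics the idea of a hash partitioner,
    it does not call or reproduce any Spark API.
    """
    partitions = defaultdict(list)
    for key, value in records:
        # Python's % is non-negative for a positive modulus,
        # so the bucket index is always in [0, num_partitions).
        bucket = hash(key) % num_partitions
        partitions[bucket].append((key, value))
    return dict(partitions)


# Hypothetical sample shaped like [(1,data),(2,data),...,(n,data)]
records = [(k, "data") for k in range(1, 9)]
parts = hash_partition(records, 4)

for bucket in sorted(parts):
    print(bucket, [k for k, _ in parts[bucket]])
```

With small integer keys the placement is easy to follow, because the hash of a small int in Python is the int itself.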



