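flatMap would have to shuffle data only if the output RDD is expected to be partitioned by some key. Its shape is RDD[X].flatMap(X => TraversableOnce[Y]), giving an RDD[Y]: the function returns a collection of Y for each element, not an RDD, and it is applied within each partition. If it has to shuffle, the shuffle should be local.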
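
One way to check whether a given chain of transformations shuffles at all is to walk the RDD lineage and look for a non-narrow dependency. The following is only a sketch, assuming a local Spark installation on the classpath; the object name FlatMapLineage, the helper hasShuffle, and the sample data are made up for illustration and are not from the original job.

```scala
import org.apache.spark.{NarrowDependency, SparkConf, SparkContext}
// Brings in the pair-RDD implicits on Spark versions before 1.3; harmless on later ones.
import org.apache.spark.SparkContext._
import org.apache.spark.rdd.RDD

object FlatMapLineage {
  // Hypothetical helper: true if any dependency in the lineage of rdd is not narrow,
  // i.e. some ancestor step introduces a shuffle boundary.
  def hasShuffle(rdd: RDD[_]): Boolean =
    rdd.dependencies.exists { dep =>
      !dep.isInstanceOf[NarrowDependency[_]] || hasShuffle(dep.rdd)
    }

  def main(args: Array[String]): Unit = {
    // Assumption: local mode with two worker threads, purely for illustration.
    val conf = new SparkConf().setMaster("local[2]").setAppName("flatmap-lineage")
    val sc = new SparkContext(conf)
    try {
      val input = sc.parallelize(1 to 8, 4)

      // flatMap followed by mapPartitions, as in the question.
      val expanded = input.flatMap(x => Seq(x, x * 10))
      val blocked = expanded.mapPartitions(it => Iterator(it.sum))
      println(hasShuffle(blocked)) // false: both steps are narrow

      // Partitioning the output by key is what introduces a shuffle.
      val byKey = expanded.map(x => (x % 2, x)).groupByKey()
      println(hasShuffle(byKey)) // true

      // The lineage printout shows the same thing in text form.
      println(byKey.toDebugString)
    } finally {
      sc.stop()
    }
  }
}
```

If hasShuffle returns false for the real job's RDD, any disk activity seen during the flatMap stage is more likely to come from an earlier stage or from caching/spilling than from flatMap itself.
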

Mayur Rustagi
Ph: +1 (760) 203 3257
http://www.sigmoidanalytics.com
@mayur_rustagi <https://twitter.com/mayur_rustagi>

On Thu, Nov 13, 2014 at 7:31 AM, Debasish Das <debasish.da...@gmail.com> wrote:

> Hi,
>
> I am doing a flatMap followed by mapPartitions to do some blocked
> operation...flatMap is shuffling data but this shuffle is strictly
> shuffling to disk and not over the network right?
>
> Thanks.
> Deb