flatMap would have to shuffle data only if the output RDD is expected to be
partitioned by some key.
RDD[X].flatMap(X => TraversableOnce[Y])
If it does have to shuffle, the shuffle should be local.
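
One way to check this on your own job is to look at the RDD lineage. The
sketch below is only an illustration, not your actual code: the object name
FlatMapLineage, the helper hasShuffle, the local[2] master and the toy data
are all assumptions made up for the example. It walks the dependencies of an
RDD and reports whether any of them is not a narrow dependency, which would
mean a shuffle.

```scala
import org.apache.spark.{NarrowDependency, SparkConf, SparkContext}
import org.apache.spark.SparkContext._ // pair-RDD implicits, needed on Spark 1.x
import org.apache.spark.rdd.RDD

object FlatMapLineage {
  // Hypothetical helper: true if any step in the lineage is not a narrow
  // dependency, i.e. the RDD depends on a parent through a shuffle.
  def hasShuffle(rdd: RDD[_]): Boolean =
    rdd.dependencies.exists { dep =>
      !dep.isInstanceOf[NarrowDependency[_]] || hasShuffle(dep.rdd)
    }

  def main(args: Array[String]): Unit = {
    // Assumed local setup with toy data, purely for illustration.
    val conf = new SparkConf().setMaster("local[2]").setAppName("flatmap-lineage")
    val sc = new SparkContext(conf)
    try {
      val input = sc.parallelize(1 to 8, 4)

      // flatMap followed by mapPartitions, as in the question.
      val expanded = input.flatMap(x => Seq(x, x * 10))
      val blocked = expanded.mapPartitions(iter => Iterator(iter.sum))

      println(blocked.toDebugString)
      println(s"flatMap + mapPartitions has shuffle: ${hasShuffle(blocked)}")

      // Repartitioning the output by key is what introduces a shuffle.
      val keyed = expanded.map(x => (x % 3, x)).reduceByKey(_ + _)
      println(s"after reduceByKey has shuffle: ${hasShuffle(keyed)}")
    } finally {
      sc.stop()
    }
  }
}
```

For the flatMap + mapPartitions chain, hasShuffle should report no shuffle,
and toDebugString should show a single stage. The reduceByKey variant should
report one. If your real job does show a shuffle, it most likely comes from a
keyed operation somewhere else in the lineage rather than from flatMap.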

Mayur Rustagi
Ph: +1 (760) 203 3257
http://www.sigmoidanalytics.com
@mayur_rustagi <https://twitter.com/mayur_rustagi>


On Thu, Nov 13, 2014 at 7:31 AM, Debasish Das <debasish.da...@gmail.com>
wrote:

> Hi,
>
> I am doing a flatMap followed by mapPartitions to do some blocked
> operation... flatMap is shuffling data, but this shuffle is strictly
> to disk and not over the network, right?
>
> Thanks.
> Deb
>
