I am new to Spark and would like to know how RDD partitioning can be
controlled during shuffling. I have searched for examples but have not
found many concrete ones, even though there are many good tutorials on
Hadoop's shuffling and partitioners.

Can anybody point me to good tutorials explaining the shuffle process in
Spark, as well as examples of how to use a custom partitioner?


Best,
Tao
