I am new to Spark and need to know how RDD partitioning can be controlled during shuffling. I have searched for examples but haven't found many concrete ones, whereas there are many good tutorials about Hadoop's shuffling and partitioners.
Can anybody point me to good tutorials explaining the shuffle process in Spark, as well as examples of how to use a custom partitioner? Best, Tao