Hi, is there a way to increase the number of partitions of an RDD without causing a shuffle? I found JIRA issue https://issues.apache.org/jira/browse/SPARK-5997, but there is no implementation yet.
Just in case it matters: I am reading data from ~300 large binary files, which results in 300 partitions. I then need to sort my RDD, but the sort crashes with an OutOfMemoryError. If I first repartition to 2000 partitions, the sort works fine, but the repartition itself takes a long time because of the shuffle. Best regards, Alexander