Is there any way to let Spark know ahead of time what size of RDD to expect as the result of a flatMap() operation? And would that help performance? For instance, if I have an RDD of 1 million rows and I know that my flatMap() will produce 100 million rows, is there a way to indicate that to Spark, i.e. to "reserve" space for the resulting RDD?
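For concreteness, here is a rough sketch of the kind of job I mean (Scala here as an assumption; the identifiers and the 1-to-100 fan-out ratio are just illustrative, and the pattern would be the same in PySpark):

```scala
import org.apache.spark.{SparkConf, SparkContext}

object FlatMapFanOut {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("FlatMapFanOut").setMaster("local[*]")
    val sc   = new SparkContext(conf)

    // ~1 million input rows
    val input = sc.parallelize(1L to 1000000L)

    // Each input row fans out to 100 output rows, so the result is
    // ~100 million rows. I know this ratio up front, but I don't see
    // a way to declare it to Spark before the flatMap runs.
    val expanded = input.flatMap(id => (1 to 100).map(i => (id, i)))

    println(expanded.count()) // ~100,000,000
    sc.stop()
  }
}
```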
Thanks, Jeff