Hi Jaonary,

I believe you should map the n folds to n keys in Spark using a map 
function. You can then reduce the returned PairRDD by key to get your metric 
for each fold. I don't fully understand partitions, but from what I do 
understand, they aren't required in your scenario.
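
Here is a minimal sketch of that idea, written in plain Python rather than Spark so it runs without a cluster. The names fold_of and mean_metric_over_folds, the round-robin fold assignment, and the per-record metric are all assumptions made for illustration. In Spark, the first step would correspond to a map that emits (key, value) pairs and the second to reduceByKey on the resulting pair RDD.

```python
from collections import defaultdict


def fold_of(index, n_folds):
    # Hypothetical fold assignment: round-robin on the record index.
    return index % n_folds


def mean_metric_over_folds(records, n_folds, metric):
    # "map" step: emit (fold_key, (metric_value, 1)) for every record.
    pairs = [(fold_of(i, n_folds), (metric(r), 1)) for i, r in enumerate(records)]

    # "reduce by key" step: sum the metric values and the counts per fold.
    totals = defaultdict(lambda: (0.0, 0))
    for key, (value, count) in pairs:
        running_sum, running_count = totals[key]
        totals[key] = (running_sum + value, running_count + count)

    # Per-fold metric, then the mean of those metrics over all folds.
    per_fold = {key: s / c for key, (s, c) in totals.items()}
    overall = sum(per_fold.values()) / len(per_fold)
    return per_fold, overall


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    per_fold, overall = mean_metric_over_folds(data, 3, lambda x: x)
    print(per_fold, overall)
```

Carrying a (sum, count) pair through the reduction keeps the reduce step associative, which is what a distributed reduce by key needs. A real cross-validation metric would usually depend on a model trained on the other folds, which this sketch leaves out.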

Regards,
Sanjay



On Friday, 21 March 2014 7:03 PM, Jaonary Rabarisoa <jaon...@gmail.com> wrote:
 
Hi

I need to partition my data, represented as an RDD, into n folds, run a metrics 
computation on each fold, and finally compute the mean of my metrics over all 
the folds.
Can Spark do the data partitioning out of the box, or do I need to implement 
it myself? I know that RDD has a partitions method and mapPartitions, but I 
really don't understand the purpose and meaning of a partition here.



Cheers,

Jaonary
