Re: Collect method in Spark

2014-11-06 Thread Akhil Das
Once you call .collect, it brings the data from the worker machines to the driver program (the master node here). If the dataset is too large to fit in the driver's memory, the driver will go down. This will return an array of (key, 0) pairs: *val rdd2 = rdd1.mapValues(v => 0).collect* Thanks Best Regards On Fri, Nov 7, 2014 at 10:41 A
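
To make the point above concrete, here is a minimal sketch, assuming Spark 1.3 or later running in local mode. The object name CollectSketch, the helper zeroValues, and the sample data are made up for illustration; only mapValues, collect, take, and count come from the Spark RDD API. It contrasts collect, which pulls every element to the driver, with take and count, which return only a bounded result.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

// Hypothetical object name, used only for this illustration.
object CollectSketch {

  // Replace every value with 0 and keep the keys.
  // This is a transformation: the result is still a distributed RDD.
  // (On Spark versions before 1.3, mapValues also needs
  //  `import org.apache.spark.SparkContext._` for the pair-RDD implicits.)
  def zeroValues(rdd1: RDD[(String, Int)]): RDD[(String, Int)] =
    rdd1.mapValues(_ => 0)

  def main(args: Array[String]): Unit = {
    // Assumption: local mode with 2 threads, just to keep the sketch runnable.
    val conf = new SparkConf().setAppName("collect-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      // Tiny made-up data set standing in for the real rdd1.
      val rdd1: RDD[(String, Int)] = sc.parallelize(Seq("a" -> 1, "b" -> 2, "c" -> 3))
      val rdd2 = zeroValues(rdd1)

      // collect() is an action: all elements are copied into driver memory.
      val everything: Array[(String, Int)] = rdd2.collect()

      // take(n) copies at most n elements to the driver.
      val firstTwo: Array[(String, Int)] = rdd2.take(2)

      // count() is computed on the workers; only a Long comes back.
      val total: Long = rdd2.count()

      println(s"collected ${everything.length}, took ${firstTwo.length}, counted $total")
    } finally {
      sc.stop()
    }
  }
}
```

On a small RDD like this one, collect is harmless. On a large one, take, count, or writing the result out from the workers keeps the full data set off the driver.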

Collect method in Spark

2014-11-06 Thread Deep Pradhan
Hi, The collect method returns an Array. If I have a huge set of data and I do something like the following: *val rdd2 = rdd1.mapValues(v => 0).collect* // where rdd1 is some key-value pair RDD As per my understanding, this will return an Array[(String, Int)], and if my data is huge this will return