Once you call .collect, Spark brings all the data from the worker machines
to the driver (master) node. If the dataset is too large to fit in the
driver's memory, the driver will run out of memory and go down.
This will return an array of (key, 0) pairs:
*val rdd2 = rdd1.mapValues(v => 0).collect*
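If the full array is not needed on the driver, one option is to keep the result as an RDD and pull back only a bounded amount of data. The following is a minimal sketch, not a definitive recipe: the object name CollectSketch, the helper run, the sample data, and the local[2] master are all assumptions made for illustration, and it assumes a Spark version where pair-RDD operations such as mapValues are available without extra imports.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical example object; names and data are illustrative only.
object CollectSketch {

  // Returns a small sample of the transformed pairs plus the total count.
  def run(sc: SparkContext): (Array[(String, Int)], Long) = {
    // Stand-in for rdd1: a small key-value pair RDD.
    val rdd1 = sc.parallelize(Seq("a" -> 10, "b" -> 20, "c" -> 30))

    // mapValues is a transformation: the result stays distributed
    // and nothing is shipped to the driver yet.
    val zeroed = rdd1.mapValues(_ => 0)

    // take(n) sends at most n elements to the driver,
    // unlike collect, which sends every element.
    val sample = zeroed.take(2)

    // count() returns only a single number to the driver.
    val total = zeroed.count()

    (sample, total)
  }

  def main(args: Array[String]): Unit = {
    // Local master is an assumption so the sketch runs on one machine.
    val conf = new SparkConf().setAppName("collect-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val (sample, total) = run(sc)
      println(s"sampled ${sample.length} of $total pairs")
    } finally {
      sc.stop()
    }
  }
}
```

If the whole result has to be kept, writing it out from the workers (for example with saveAsTextFile) avoids routing it through the driver at all.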
Thanks
Best Regards
On Fri, Nov 7, 2014 at 10:41 A
Hi,
The collect method returns an Array. If I have a huge set of data and I do
something like the following:
*val rdd2 = rdd1.mapValues(v => 0).collect* // where rdd1 is some key-value
pair RDD
As per my understanding, this will return an Array[(String, Int)], and if my
data is huge this will return