Thanks for Ye Xianjin's suggestions.
The SizeOf.jar may indeed have some problems. I did a simple test as
follows; the code is:
val n = 1 // also tested with 5, 10, 100, 1000
val arr1 = new Array[(Int, Array[Int])](n)
for (i <- 0 until arr1.length) {
  arr1(i) = (i, new Array[Int](43))
}
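As a cross-check against SizeOf.jar, a rough sketch of measuring the same
structure with Spark's own org.apache.spark.util.SizeEstimator (assuming
it is accessible in your Spark version and on your classpath) could look
like this:

import org.apache.spark.util.SizeEstimator

// Estimate the deep size (in bytes) of the test array for several n values.
// SizeEstimator walks the whole object graph reachable from arr1.
for (n <- Seq(1, 5, 10, 100, 1000)) {
  val arr1 = new Array[(Int, Array[Int])](n)
  for (i <- 0 until arr1.length) {
    arr1(i) = (i, new Array[Int](43))
  }
  println(s"n = $n, estimated size = ${SizeEstimator.estimate(arr1)} bytes")
}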
1. Why does Spark choose such a poor data structure, Tuple2, for key-value
pairs? Is there a better data structure for storing (key, value) pairs
with lower memory cost?
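For illustration only (this is not something the RDD API offers directly),
one lower-overhead layout for these pairs is two parallel arrays, which
drops the per-pair Tuple2 wrapper (an object header plus references) and
keeps the Int keys in a primitive int[] rather than possibly boxed fields:

// Hypothetical alternative layout for a memory comparison only.
val n = 1000
val keys   = new Array[Int](n)            // primitive int[] keys, no boxing
val values = new Array[Array[Int]](n)     // payload arrays
for (i <- 0 until n) {
  keys(i) = i
  values(i) = new Array[Int](43)
}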
2. Given a dataset of size M, how many times that size does Spark
generally need in memory to handle it?
Best,
Landmark