Thanks for Ye Xianjin's suggestions.
SizeOf.jar may indeed have some problems. I did a simple test as
follows. The code is:
val n = 1 // also tried 5, 10, 100, 1000
val arr1 = new Array[(Int, Array[Int])](n)
for (i <- 0 until arr1.length) {
  arr1(i) = (i, new Array[Int](43))
}
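As a rough cross-check (my own back-of-envelope, assuming a 64-bit
HotSpot JVM with compressed oops; exact layouts vary by JVM): each
new Array[Int](43) costs a 16-byte array header plus 43 * 4 = 172
bytes of data, padded to 192 bytes, and the enclosing Tuple2 adds an
object header with two references (~24 bytes) plus a boxed
java.lang.Integer for the Int (~16 bytes), so each element is roughly
232 bytes, well above the 172 bytes of raw int data.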
Hi,
I believe SizeOf.jar may calculate the wrong size for you.
Spark has a utility called SizeEstimator, located at
org.apache.spark.util.SizeEstimator, and someone has extracted it into
a standalone library:
https://github.com/phatak-dev/java-sizeof/blob/master/src/main/scala/com/madhukaraphatak/sizeof/SizeEstimator.scala
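For example, here is a minimal sketch of measuring the test array with
the extracted library (assuming its SizeEstimator.estimate entry point,
which returns an estimated deep size in bytes; the Spark-internal util
has the same shape):

    import com.madhukaraphatak.sizeof.SizeEstimator

    object SizeCheck {
      def main(args: Array[String]): Unit = {
        val n = 1000
        val arr1 = new Array[(Int, Array[Int])](n)
        for (i <- 0 until arr1.length) {
          arr1(i) = (i, new Array[Int](43))
        }
        // Walks the object graph and estimates the deep size in bytes
        println(s"estimated bytes: ${SizeEstimator.estimate(arr1)}")
      }
    }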
A number of comments:
310GB is probably too large for an executor. You probably want many
smaller executors per machine. But this is not your problem.
You didn't say where the OutOfMemoryError occurred. Executor or driver?
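For instance (a hedged sketch; the right numbers depend on your
cluster, and --num-executors applies on YARN), smaller executors plus
an explicit driver heap might look like:

    spark-submit \
      --executor-memory 30g \
      --executor-cores 5 \
      --num-executors 10 \
      --driver-memory 8g \
      your-app.jar

A driver-side OutOfMemoryError points at --driver-memory (or at
collecting too much data back to the driver), while an executor-side
one points at --executor-memory.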
Tuple2 is a Scala type, and a general one. It is appropriate for
general