Thanks a lot for the reply.

Actually, I am running the SparkPageRank example with a 160GB heap (I am sure
the problem is not GC, because the excess time is being spent in Java code
only).

What I have observed in the JProfiler and OProfile output is that the amount
of time spent in the following two methods increases substantially with
increasing N:

1) java.io.ObjectOutputStream.writeObject0
2) scala.Tuple2.hashCode
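
For reference, here is a condensed sketch of the loop I am profiling (a toy
graph stands in for the real input, and the Kryo line is only an illustration
of one way to take ObjectOutputStream out of the shuffle path, not something
I have tested here):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._  // pair-RDD implicits (pre-1.3 style)

object PageRankSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("PageRankSketch")
      .setMaster("local[4]")  // N cores in a single JVM, as in my runs
      // Illustration only: Kryo replaces java.io.ObjectOutputStream for
      // shuffle data, which should shrink the writeObject0 hotspot.
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
    val sc = new SparkContext(conf)

    // Toy link graph standing in for the real input.
    val links = sc.parallelize(Seq(
      ("a", Seq("b", "c")), ("b", Seq("c")), ("c", Seq("a"))
    )).cache()
    var ranks = links.mapValues(_ => 1.0)

    for (_ <- 1 to 10) {
      // join + reduceByKey shuffle Tuple2 records: each record is routed
      // to a partition via Tuple2.hashCode and serialized on the map side,
      // which is exactly where the profilers show the time going.
      val contribs = links.join(ranks).values.flatMap {
        case (urls, rank) => urls.map(url => (url, rank / urls.size))
      }
      ranks = contribs.reduceByKey(_ + _).mapValues(0.15 + 0.85 * _)
    }

    ranks.collect().foreach(println)
    sc.stop()
  }
}

Each iteration's join/reduceByKey shuffle pushes every record through
scala.Tuple2.hashCode to pick a partition and through Java serialization on
the map side, which matches the two hotspots above.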

I don't think the Linux file system could be causing the issue, as my
machine has 256GB of RAM and I am using a tmpfs for java.io.tmpdir. So I
don't think there is much disk access involved, if that is what you meant.
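
(One thing I still want to rule out: Spark writes its shuffle scratch files
under spark.local.dir, which defaults to /tmp, rather than under
java.io.tmpdir. A minimal check, assuming the setting would be passed as a
-Dspark.local.dir system property, which SparkConf picks up:)

object ScratchDirCheck {
  def main(args: Array[String]): Unit = {
    // Where the JVM puts its own temp files (my tmpfs mount).
    println("java.io.tmpdir  = " + System.getProperty("java.io.tmpdir"))
    // Where Spark puts shuffle scratch files, if set as a -D property.
    println("spark.local.dir = " +
      sys.props.getOrElse("spark.local.dir", "<unset; Spark defaults to /tmp>"))
  }
}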

Regards,
Lokesh


