Thanks a lot for the reply. I am running the SparkPageRank example with a 160GB heap (I am confident the problem is not GC, since the excess time is spent entirely in Java code rather than in the collector).
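In case it helps, here is a rough sketch of the setup I mean (illustrative code, not my actual driver; the JVM itself is started with -Xmx160g, and java.io.tmpdir points at a tmpfs):

import org.apache.spark.{SparkConf, SparkContext}

// Illustrative sketch of the setup described above.
// n is the core count being varied between runs; the value here is arbitrary.
val n = 8
val conf = new SparkConf()
  .setMaster(s"local[$n]")   // all cores within a single JVM
  .setAppName("SparkPageRank")
val sc = new SparkContext(conf)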
What I have observed in the JProfiler and OProfile output is that the time spent in the following two methods grows substantially as N (the core count) increases:

1) java.io.ObjectOutputStream.writeObject0
2) scala.Tuple2.hashCode

I don't think the Linux file system is causing the issue: the machine has 256GB of RAM, and I am using a tmpfs for java.io.tmpdir, so there should be little disk access involved, if that is what you meant.

Regards,
Lokesh
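P.S. One diagnostic I am considering, to check whether the writeObject0 time is specific to Java serialization: rerunning with Kryo enabled. The serializer config and class name below are the standard ones from the Spark documentation; the rest of the snippet is illustrative only.

import org.apache.spark.{SparkConf, SparkContext}

// Same setup, but with Kryo instead of the default Java serialization.
// If ObjectOutputStream.writeObject0 drops out of the profile while total
// runtime stays flat, serialization itself is probably not the bottleneck.
val conf = new SparkConf()
  .setMaster("local[8]")   // illustrative core count
  .setAppName("SparkPageRank-Kryo")
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
val sc = new SparkContext(conf)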