If we do cache() + count() after say every 50 iterations. The whole process becomes very slow. I have tried checkpoint() , cache() + count(), saveAsObjectFiles(). Nothing works. Materializing RDD's lead to drastic decrease in performance & if we don't materialize, we face stackoverflowerror.
----- Lalit Yadav la...@sigmoidanalytics.com -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/java-lang-StackOverflowError-when-calling-count-tp5649p5699.html Sent from the Apache Spark User List mailing list archive at Nabble.com.