Yes, I've tried. The problem is that a new broadcast object is generated at every step, and these accumulate until they eat up all of the memory.
I solved it by using RDD.checkpoint to remove the dependencies on the old broadcast objects, and spark.cleaner.ttl to clean those broadcast objects up automatically. If there's a more elegant way to solve this problem, please tell me :)
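
For reference, here is a minimal sketch of what I mean, using Spark's Scala API. The RDD contents, iteration count, TTL value, and checkpoint directory are all made up for illustration; the point is the combination of spark.cleaner.ttl plus periodic checkpointing:

import org.apache.spark.{SparkConf, SparkContext}

object IterativeBroadcast {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("iterative-broadcast")
      // Let Spark periodically clean up metadata (including old
      // broadcast objects) older than the given TTL, in seconds.
      // The value 3600 here is just an example.
      .set("spark.cleaner.ttl", "3600")
    val sc = new SparkContext(conf)
    // Checkpointing writes RDD data to reliable storage and truncates
    // the lineage, dropping the references to broadcasts from old steps.
    sc.setCheckpointDir("/tmp/spark-checkpoints")

    var data = sc.parallelize(1 to 1000000).map(_.toDouble)

    for (i <- 1 to 50) {
      // Each step creates a fresh broadcast object; without cleanup
      // these pile up in memory across iterations.
      val model = sc.broadcast(data.take(100))
      data = data.map(x => x + model.value.sum / model.value.length)

      // Every few iterations, checkpoint to cut the dependency chain
      // back to the old broadcast objects, so the TTL cleaner can
      // actually reclaim them.
      if (i % 5 == 0) {
        data.checkpoint()
        data.count()  // action to force materialization of the checkpoint
      }
    }
    sc.stop()
  }
}

Note that checkpoint() is lazy, which is why the count() above is needed: the checkpoint only happens when an action runs on the RDD afterwards.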