I am writing a Spark application that runs many iterations. I plan to checkpoint every Nth iteration to truncate the lineage of my RDD and clear out the shuffle files from earlier iterations. I would also like to be able to restart the application completely from the last checkpoint.
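
Roughly, the loop I have in mind looks like this (just a sketch; the transformation, checkpoint directory, iteration count, and the value of N are placeholders):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object IterativeJob {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("iterative-job"))
    // Reliable storage (e.g. HDFS) so the checkpoint outlives the executors.
    sc.setCheckpointDir("hdfs:///tmp/my-app-checkpoints")

    val checkpointEvery = 10                  // "N" -- placeholder value
    var rdd: RDD[Int] = sc.parallelize(1 to 1000000)

    for (i <- 1 to 100) {
      rdd = rdd.map(_ + 1)                    // placeholder for the real per-iteration work
      if (i % checkpointEvery == 0) {
        rdd.checkpoint()                      // mark the RDD; lineage is truncated once it is materialized
        rdd.count()                           // an action forces the checkpoint files to be written now
      }
    }

    println(rdd.count())
    sc.stop()
  }
}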
I understand that a regular checkpoint will work within the same application, but how can I read the checkpointed RDD when I launch a new application? In Spark Streaming there seems to be support for recreating the full context from a checkpoint (the pattern I mean is in the P.S. below), but I can't figure out how to do the same for a non-streaming Spark job.

Many thanks,
Harel
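
P.S. For reference, the Spark Streaming recovery I am referring to is StreamingContext.getOrCreate, roughly as below (the checkpoint directory, app name, and batch interval are placeholders). I am looking for the equivalent for a plain SparkContext/RDD job:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object StreamingRecovery {
  val checkpointDir = "hdfs:///tmp/streaming-checkpoints"  // placeholder path

  // Only invoked when no checkpoint exists yet; otherwise the whole context,
  // including its DStream graph, is rebuilt from the checkpoint data.
  def createContext(): StreamingContext = {
    val ssc = new StreamingContext(new SparkConf().setAppName("streaming-recovery"), Seconds(10))
    ssc.checkpoint(checkpointDir)
    // ... DStream definitions would go here ...
    ssc
  }

  def main(args: Array[String]): Unit = {
    val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)
    ssc.start()
    ssc.awaitTermination()
  }
}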