I am writing a Spark application that runs many iterations.
I plan to checkpoint every Nth iteration to truncate the lineage graph of my
RDD and clear the previous shuffle files.
I would also like to be able to restart the application completely from the
last checkpoint.
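
For reference, the checkpointing loop I have in mind looks roughly like the
sketch below (the dataset, the step() transformation and the paths are just
placeholders):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object IterativeJob {

  // Placeholder for the real per-iteration computation
  def step(rdd: RDD[Double]): RDD[Double] = rdd.map(_ * 1.0001)

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("iterative-job"))
    // Durable directory so the checkpoint files survive the application
    sc.setCheckpointDir("hdfs:///checkpoints/iterative-job")

    var current: RDD[Double] = sc.parallelize(1 to 1000000).map(_.toDouble)
    val checkpointInterval = 10 // N

    for (i <- 1 to 100) {
      current = step(current)
      if (i % checkpointInterval == 0) {
        current.checkpoint() // truncate the lineage at this point
        current.count()      // force materialization so the files get written
        // getCheckpointFile tells me where the checkpoint data was saved
        current.getCheckpointFile.foreach(p => println(s"checkpointed to $p"))
      }
    }
    sc.stop()
  }
}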

I understand that a regular checkpoint works within the same application, but
how can I read the checkpointed RDD if I launch a new application?
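
To make the question concrete, here is roughly what I would like the second
application to be able to do; readCheckpointedRDD is a made-up name for the
part I am missing, since I have not found a public API that does this:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object RestartedJob {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("restarted-job"))
    sc.setCheckpointDir("hdfs:///checkpoints/iterative-job")

    // Hypothetical: load the RDD that the previous application checkpointed.
    // "readCheckpointedRDD" does not exist as far as I know; this is exactly
    // the operation I am asking about.
    // val recovered: RDD[Double] =
    //   readCheckpointedRDD[Double](sc, "hdfs:///checkpoints/iterative-job/...")

    // ...then continue the iterations from `recovered` instead of from scratch.

    sc.stop()
  }
}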

In Spark Streaming there seems to be support for recreating the full context
from a checkpoint, but I can't figure out how to do the same for non-streaming
Spark.
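
For comparison, this is the Spark Streaming recovery path I am referring to:
StreamingContext.getOrCreate rebuilds the whole context from the checkpoint
directory when one exists, and I am looking for something equivalent for a
plain batch job:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object StreamingRecovery {

  def createContext(): StreamingContext = {
    val conf = new SparkConf().setAppName("streaming-job")
    val ssc = new StreamingContext(conf, Seconds(10))
    ssc.checkpoint("hdfs:///checkpoints/streaming-job")
    // ...define the DStream computation (sources, transformations, output)...
    ssc
  }

  def main(args: Array[String]): Unit = {
    // Recreates the context, including the DStream graph, from the checkpoint
    // directory if it exists; otherwise calls createContext() to build it.
    val ssc = StreamingContext.getOrCreate("hdfs:///checkpoints/streaming-job",
                                           createContext _)
    ssc.start()
    ssc.awaitTermination()
  }
}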

Many thanks,
Harel.



