Re: RDD which was checkpointed is not checkpointed

2020-08-19 Thread Ivan Petrov
Awesome, thanks for explaining it. On Wed, 19 Aug 2020 at 16:29, Russell Spitzer wrote: > It determines whether it can use the checkpoint at runtime, so you'll be > able to see it in the UI but not in the plan since you are looking at the > plan > before the job is actually running when it checks to se

Re: RDD which was checkpointed is not checkpointed

2020-08-19 Thread Russell Spitzer
It determines at runtime whether it can use the checkpoint, so you'll be able to see it in the UI but not in the plan: you are looking at the plan before the job is actually running, and it is only when the job runs that it checks to see if it can use the checkpoint in the lineage. Here is a two-stage job for example: *scala

Re: RDD which was checkpointed is not checkpointed

2020-08-19 Thread Ivan Petrov
I did it and see lineage change BEFORE calling action. No success.
Job$ - isCheckpointed? false, getCheckpointFile: None
Job$ - recordsRDD.toDebugString:
(2) MapPartitionsRDD[7] at map at Job.scala:112 []
 |  MapPartitionsRDD[6] at map at Job.scala:111 []
 |  MapPartitionsRDD[5] at map at s

Re: RDD which was checkpointed is not checkpointed

2020-08-19 Thread Jacob Lynn
Hi Ivan,
Unlike cache/persist, checkpoint does not operate in-place but requires the result to be assigned to a new variable. In your case:
val recordsRDD = convertToRecords(anotherRDD).checkpoint()
Best, Jacob
On Wed, 19 Aug 2020 at 14:39, Ivan Petrov wrote: > Hi! > Seems like I do something wron
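
For readers following the thread, here is a minimal sketch of the RDD checkpointing behaviour being discussed. It is an illustration under stated assumptions, not code from any of the posters: it assumes Spark is on the classpath and runs in local mode, and the object name, the temporary checkpoint directory and the sample data are all hypothetical. As far as I know, RDD.checkpoint() returns Unit and only marks the RDD for checkpointing, so the "assign the result to a new variable" pattern applies to Dataset.checkpoint(), which does return a new Dataset, rather than to RDDs. The checkpoint itself is written when the first action runs, which would be consistent with isCheckpointed reporting false beforehand.

```scala
import java.nio.file.Files
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical example object; none of these names come from the thread.
object CheckpointSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("checkpoint-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      // Assumption: a local temporary directory is enough for a local-mode run.
      // On a cluster this would normally be a path on a shared filesystem such as HDFS.
      sc.setCheckpointDir(Files.createTempDirectory("spark-checkpoint").toString)

      // Stand-in for convertToRecords(anotherRDD) from the thread.
      val recordsRDD = sc.parallelize(1 to 100, 2).map(_ * 2)

      // Marks the RDD for checkpointing. Returns Unit, so there is nothing to assign.
      // It has to be called before any job has been run on this RDD.
      recordsRDD.checkpoint()

      // No job has run yet, so the checkpoint has not been materialized at this point.
      println(s"before action: isCheckpointed=${recordsRDD.isCheckpointed}, " +
        s"getCheckpointFile=${recordsRDD.getCheckpointFile}")

      // The first action runs the job; the checkpoint is written as part of it.
      recordsRDD.count()

      println(s"after action: isCheckpointed=${recordsRDD.isCheckpointed}, " +
        s"getCheckpointFile=${recordsRDD.getCheckpointFile}")

      // Printing the lineage after the action shows what later jobs will read from.
      println(recordsRDD.toDebugString)
    } finally {
      sc.stop()
    }
  }
}
```

Comparing the two println lines, and the toDebugString output, before and after count() is a quick way to confirm whether the checkpoint was picked up. Calling recordsRDD.persist() before checkpoint() is commonly suggested as well, since otherwise the RDD may be computed a second time when the checkpoint is written.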