Re: Impact of .localCheckpoint() and executor dying

2021-01-06 Thread Jacek Laskowski
Hi,

> impact of an executor dying after a localCheckpoint is taken.

My memory is a bit vague on this, but I wouldn't be surprised if the localCheckpoint-ed RDD were "broken" and any action simply threw an exception about missing partitions or similar. There's no way back. I wish myself t

Re: Impact of .localCheckpoint() and executor dying

2021-01-06 Thread Brett Larson
Jacek,

Thanks for your response. I am still trying to understand the impact of an executor dying after a localCheckpoint is taken. Would the entire Spark application fail in this case because of the broken lineage? Or would the jobs associated with that executor need to be recomputed from scratch? T

Re: Impact of .localCheckpoint() and executor dying

2021-01-06 Thread Jacek Laskowski
Hi,

> My understanding is that .localCheckpoint() breaks the lineage of the RDD

True.

> and this requires the entire RDD to be rebuilt instead of being able to recompute lost partitions.

In a sense, it's as if you saved the partitions to executors and re-read them back as source data (for
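To make the "saved to executors and re-read as source data" picture concrete, here is a toy model in plain Python. It is only a sketch of the behaviour discussed in this thread, not Spark code: `ToyRDD`, `local_checkpoint`, and `lose_executor` are hypothetical names invented for illustration, and the failure mode (an exception on the next action) is the one guessed at above, not something verified against Spark internals.

```python
class ToyRDD:
    """Toy stand-in for an RDD (hypothetical, not Spark's implementation)."""

    def __init__(self, num_partitions, compute, parent=None):
        self.num_partitions = num_partitions
        self._compute = compute     # function: (parent_rows or None, index) -> list
        self.parent = parent        # lineage pointer; None means "source data"
        self.local_blocks = None    # partition index -> rows held on an "executor"

    def map(self, f):
        # Child RDD whose partitions are derived from this one (lineage kept).
        return ToyRDD(self.num_partitions,
                      lambda rows, i: [f(x) for x in rows],
                      parent=self)

    def partition(self, i):
        if self.local_blocks is not None:
            # Checkpointed: the stored block is the only source for partition i.
            if i not in self.local_blocks:
                raise RuntimeError(f"partition {i} lost and lineage is truncated")
            return self.local_blocks[i]
        # Not checkpointed: recompute from the parent on demand.
        parent_rows = self.parent.partition(i) if self.parent else None
        return self._compute(parent_rows, i)

    def collect(self):
        return [x for i in range(self.num_partitions) for x in self.partition(i)]

    def local_checkpoint(self):
        # Materialize every partition, then drop the parent (truncate lineage).
        self.local_blocks = {i: self.partition(i)
                             for i in range(self.num_partitions)}
        self.parent = None
        return self

    def lose_executor(self, i):
        # Simulate the executor that holds partition i dying.
        if self.local_blocks is not None:
            self.local_blocks.pop(i, None)


source = ToyRDD(2, lambda _rows, i: [i * 10, i * 10 + 1])

# Lineage intact: nothing is stored, so a lost executor costs only a recompute.
plain = source.map(lambda x: x * 2)
plain.lose_executor(1)
recomputed = plain.collect()

# Lineage truncated: the stored blocks are the only copy of the data.
checkpointed = source.map(lambda x: x * 2).local_checkpoint()
checkpointed.lose_executor(1)
try:
    checkpointed.collect()
    failure = None
except RuntimeError as err:
    failure = str(err)
```

In this model the un-checkpointed RDD survives the lost executor because every partition can be rebuilt from its parent, while the checkpointed one has no parent left to fall back on, so the next action fails. Whether real Spark fails only that job or the whole application is exactly the open question in this thread.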

Impact of .localCheckpoint() and executor dying

2021-01-06 Thread brettplarson
Hello,

I am wondering what the impact would be of using .localCheckpoint() and then having the executor die. My understanding is that .localCheckpoint() breaks the lineage of the RDD, and that this requires the entire RDD to be rebuilt instead of being able to recompute lost partitions. Does each exec