Spark’s cache is fault-tolerant – if any partition of an RDD is lost, it will automatically be recomputed using the transformations that originally created it.
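To make that concrete, here is a minimal Scala sketch (the input path, app name, and local master are made up for illustration) that persists an RDD with StorageLevel.MEMORY_ONLY_SER and then inspects what the driver knows about it. If a cached partition is lost, for example because the executor holding it dies, the next action just recomputes that partition from the lineage; nothing is "appended" to the cache and nothing has to be cleared by hand.

import org.apache.spark.SparkContext
import org.apache.spark.storage.StorageLevel

object CacheCheckSketch {
  def main(args: Array[String]): Unit = {
    // hypothetical local setup for the sketch
    val sc = new SparkContext("local[*]", "cache-check-sketch")

    val parsed = sc.textFile("/tmp/events.log")   // hypothetical input path
      .map(_.split(","))
      .persist(StorageLevel.MEMORY_ONLY_SER)      // serialized in-memory cache

    // The driver tracks which RDDs are marked as persisted.
    println(parsed.getStorageLevel)               // storage level this RDD was marked with
    println(sc.getPersistentRDDs.keys)            // ids of RDDs currently marked as persisted

    parsed.count()                                // first action materializes the cached blocks
    parsed.count()                                // later actions reuse cached blocks; any lost
                                                  // partitions are recomputed from lineage

    parsed.unpersist()                            // explicitly drop the cached blocks
    sc.stop()
  }
}

That also roughly answers the "how does Spark know" question: the driver keeps track of which RDD ids are marked persisted (getPersistentRDDs), and when a task runs it checks the block manager for an already cached block before falling back to recomputing the partition.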
> On Mar 23, 2017, at 4:11 AM, nayan sharma <nayansharm...@gmail.com> wrote:
>
> In case of task failures, does Spark clear the persisted RDD (StorageLevel.MEMORY_ONLY_SER) and recompute it again when the task attempt starts from the beginning, or will the cached RDD be appended to?
>
> How does Spark check whether the RDD has been cached, and skip the caching step for a particular task?
>
>> On 23-Mar-2017, at 3:36 PM, Artur R <ar...@gpnxgroup.com> wrote:
>>
>> I am not quite sure, but:
>> - if the RDD is persisted in memory, then on task failure the executor JVM process fails too, so the memory is released
>> - if the RDD is persisted on disk, then on task failure Spark's shutdown hook just wipes the temp files
>>
>>> On Thu, Mar 23, 2017 at 10:55 AM, Jörn Franke <jornfra...@gmail.com> wrote:
>>>
>>> What do you mean by clear? What is the use case?
>>>
>>>> On 23 Mar 2017, at 10:16, nayan sharma <nayansharm...@gmail.com> wrote:
>>>>
>>>> Does Spark clear the persisted RDD in case the task fails?
>>>>
>>>> Regards,
>>>>
>>>> Nayan