Cached RDDs do not survive SparkContext shutdown (they are scoped on a per-SparkContext basis). I am not sure what you mean by disk-based cache eviction; if you cache more RDDs than you have disk space for, the result will not be very pretty :)
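To illustrate, here is a minimal sketch (the input path, app name, and master are just placeholders) showing that cached blocks belong to one SparkContext, and that RDD.unpersist() is the explicit way to drop them before the context is stopped:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

object CacheScopeDemo {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("cache-scope").setMaster("local[2]"))

    // Cache an RDD on disk; the cached blocks live inside this SparkContext only.
    val rdd = sc.textFile("hdfs:///data/input.txt")   // placeholder path
      .persist(StorageLevel.DISK_ONLY)

    println(rdd.count())   // first action materializes and caches the partitions
    println(rdd.count())   // served from this context's disk cache

    // Explicit invalidation: drop the cached blocks without stopping the context.
    rdd.unpersist()

    // Once the context is stopped, all of its cached blocks are released;
    // a new application (new SparkContext) must recompute the RDD from the source.
    sc.stop()
  }
}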
Mayur Rustagi
Ph: +1 (760) 203 3257
http://www.sigmoidanalytics.com
@mayur_rustagi <https://twitter.com/mayur_rustagi>

On Wed, Sep 10, 2014 at 4:43 AM, Vladimir Rodionov <vrodio...@splicemachine.com> wrote:
> Hi, users
>
> 1. Disk-based cache eviction policy? The same LRU?
>
> 2. What is the scope of a cached RDD? Does it survive the application? What
> happens if I run the Java app next time? Will the RDD be created or read from cache?
>
> If the answer is YES, then ...
>
> 3. Is there any way to invalidate a cached RDD automatically? RDD
> partitions? Some API like: RDD.isValid()?
>
> 4. HadoopRDD is InputFormat-based. Some partitions (splits) may become
> invalid in cache. Can we reload only those partitions? Into cache?
>
> -Vladimir