Tachyon is another option. It backs the "off heap" StorageLevel (StorageLevel.OFF_HEAP) that you can specify when persisting RDDs: http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.storage.StorageLevel
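
For illustration, here is a minimal sketch of persisting an RDD at the off-heap storage level. It assumes Spark has already been configured to reach a Tachyon deployment. The object name, app name, and sample data are hypothetical. It shows only the persist call, not how a second application would pick the data up.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD
import org.apache.spark.storage.StorageLevel

object OffHeapExample {
  // Marks an RDD for off-heap storage. Persistence is lazy: nothing is
  // stored until the first action runs on the returned RDD.
  def persistOffHeap(sc: SparkContext): RDD[Int] = {
    val numbers = sc.parallelize(1 to 1000) // placeholder data
    numbers.persist(StorageLevel.OFF_HEAP)
  }

  def main(args: Array[String]): Unit = {
    // The master URL is expected to come from spark-submit;
    // the app name is hypothetical.
    val sc = new SparkContext(new SparkConf().setAppName("OffHeapExample"))
    val cached = persistOffHeap(sc)
    println(cached.count()) // the first action materializes the blocks
    sc.stop()
  }
}
```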
Or just use HDFS. This requires subsequent applications/SparkContexts to reload the data from disk, of course.

On Tue, Jun 3, 2014 at 6:58 AM, Gerard Maas <[email protected]> wrote:

> I don't think that's supported by default as when the standalone context
> will close, the related RDDs will be GC'ed
>
> You should explore Spark-Job Server, which allows to cache RDDs by name
> and reuse them within a context.
>
> https://github.com/ooyala/spark-jobserver
>
> -kr, Gerard.
>
>
> On Tue, Jun 3, 2014 at 3:45 PM, Oleg Proudnikov <[email protected]>
> wrote:
>
>> Hi All,
>>
>> Is it possible to run a standalone app that would compute and
>> persist/cache an RDD and then run other standalone apps that would gain
>> access to that RDD?
>>
>> --
>> Thank you,
>> Oleg
