Hi,

After running some tests, it appears that unpersist takes effect as soon as it is reached, so any tasks that use the RDD later on have to recalculate it. This is fine for simple programs, but when an RDD is created and persisted inside a function, its reference is lost on return while its children continue to be used, and the persist/unpersist pattern stops working effectively.
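A minimal sketch of the pattern I mean (local mode; the names and numbers are illustrative):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object UnpersistScope {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("unpersist-scope").setMaster("local[*]"))

    // The parent is persisted inside this function and its reference is
    // lost on return, so the only place it can be unpersisted is in here.
    // But unpersist takes effect immediately, before the child is used.
    def buildChild(): RDD[Int] = {
      val parent = sc.parallelize(1 to 1000000).map(_ * 2) // stand-in for expensive work
      parent.persist()
      parent.count()      // action: materialises the cache
      val child = parent.filter(_ % 2 == 0)
      parent.unpersist()  // eager: the cached blocks are dropped right away
      child
    }

    val child = buildChild()
    println(child.count()) // recomputes parent from its lineage; the cache is gone
    sc.stop()
  }
}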
Thanks,

Jem

On Thu, 2 Jul 2015 at 08:18, Akhil Das <ak...@sigmoidanalytics.com> wrote:

> rdd's which are no longer required will be removed from memory by spark
> itself (which you can consider as lazy?).
>
> Thanks
> Best Regards
>
> On Wed, Jul 1, 2015 at 7:48 PM, Jem Tucker <jem.tuc...@gmail.com> wrote:
>
>> Hi,
>>
>> The current behavior of rdd.unpersist() appears to not be lazily executed
>> and therefore must be placed after an action. Is there any way to emulate
>> lazy execution of this function so it is added to the task queue?
>>
>> Thanks,
>>
>> Jem
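PS: to expand on the workaround implied in my original question above, keeping the parent in scope and only unpersisting after the last action does behave as expected. Sketched for a spark-shell session (sc is the shell's SparkContext; names are illustrative). Note that unpersist(blocking = false) only makes the call asynchronous; removal still starts immediately, so it is not lazy either:

// Keep the parent reference in scope and unpersist only after
// the last action that needs the cached data has run.
val parent = sc.parallelize(1 to 1000000).map(_ * 2)
parent.persist()

val evens = parent.filter(_ % 2 == 0)
val odds  = parent.filter(_ % 2 != 0)
println(evens.count())  // both actions read the cached blocks
println(odds.count())

// Returns without waiting for the blocks to be removed, but removal is
// still initiated right away: asynchronous, not lazy.
parent.unpersist(blocking = false)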