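Hello Aditya,

Once the action that needs an RDD has run, you can call rdd.unpersist() to let Spark know that this RDD is no longer required. Note that this only has an effect on RDDs you persisted yourself with cache() or persist(); an RDD that was never persisted is not held in memory as a whole, since its partitions are computed as they are needed.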
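A minimal sketch of that pattern, in case it helps. The assumptions here are mine, not from your mail: I treat each line as "key,rest" so that join has key/value pairs to work on, I run with a local master, and the object name and the keyed helper are made up for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

// Hypothetical object name, for illustration only.
object UnpersistSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: local master for a quick test; drop setMaster when using spark-submit.
    val conf = new SparkConf().setAppName("unpersist-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // Assumption: each line looks like "key,rest". join is only defined on
    // RDDs of (key, value) pairs, so the raw lines are keyed first.
    def keyed(path: String) =
      sc.textFile(path).map { line =>
        val parts = line.split(",", 2)
        (parts(0), if (parts.length > 1) parts(1) else "")
      }

    // Paths are taken as-is from the original question.
    val textFile = keyed("/user/emp.txt")
    val textFile1 = keyed("/user/emp1.xt")

    // Persist only the RDD that is reused by later steps.
    val joined = textFile.join(textFile1).persist(StorageLevel.MEMORY_ONLY)

    // The first action computes the join and caches its partitions.
    println(joined.count())

    // A second action reuses the cached partitions instead of re-reading the files.
    println(joined.keys.distinct().count())

    // Release the cached partitions once nothing else needs them.
    joined.unpersist()

    sc.stop()
  }
}
```

Here textFile and textFile1 are never persisted, so there is nothing to unpersist for them; only joined is cached and later released.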
Thanks,
-Hanu

On Thu, Sep 22, 2016 at 7:54 AM, Aditya <aditya.calangut...@augmentiq.co.in> wrote:

> Hi,
>
> Suppose I have two RDDs
> val textFile = sc.textFile("/user/emp.txt")
> val textFile1 = sc.textFile("/user/emp1.xt")
>
> Later I perform a join operation on the above two RDDs
> val join = textFile.join(textFile1)
>
> And there are subsequent transformations without including textFile and
> textFile1 further and an action to start the execution.
>
> When the action is called, textFile and textFile1 will be loaded in memory
> first. Later the join will be performed and kept in memory.
> My question is: once join is in memory and is used for subsequent
> execution, what happens to the textFile and textFile1 RDDs? Are they still kept
> in memory until the full lineage graph is completed, or are they destroyed
> once their use is over? If they are kept in memory, is there any way I can
> explicitly remove them from memory to free the memory?