Hi,
Suppose I have two RDDs:
val textFile = sc.textFile("/user/emp.txt")
val textFile1 = sc.textFile("/user/emp1.txt")
Later I perform a join on these two RDDs:
val join = textFile.join(textFile1)
There are then further transformations that do not reference textFile
or textFile1 again, followed by an action that starts the execution.
As I understand it, when the action is called, textFile and textFile1
are loaded into memory first. The join is then performed and its result
is kept in memory.
My question is: once the join result is in memory and is being used by
the subsequent steps, what happens to the textFile and textFile1 RDDs?
Are they kept in memory until the full lineage graph has been executed,
or are they destroyed once they are no longer needed? If they are kept
in memory, is there any way I can explicitly remove them to free the
memory?
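
For reference, here is a minimal sketch of the scenario above that can
be run locally. It rests on several assumptions that are not in the
snippets above: join is only defined on RDDs of key/value pairs, so each
line is assumed to look like "key,rest" and is keyed on its first field
by a hypothetical splitKey helper; a few made-up in-memory rows stand in
for the two files; and local[2] is used as the master. It also calls
cache() and unpersist(), the standard way to explicitly mark an RDD for
storage and to release it again. An RDD that was never persisted has no
cached blocks to release.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object JoinMemorySketch {
  // Hypothetical helper: assumes a line of the form "key,rest".
  // A line without a comma gets an empty value.
  def splitKey(line: String): (String, String) = {
    val parts = line.split(",", 2)
    (parts(0), if (parts.length > 1) parts(1) else "")
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("join-memory-sketch")
      .setMaster("local[2]") // assumption: local test run
    val sc = new SparkContext(conf)
    try {
      // Stand-ins for the two sc.textFile(...) inputs (made-up rows).
      val textFile: RDD[String]  = sc.parallelize(Seq("1,alice", "2,bob"))
      val textFile1: RDD[String] = sc.parallelize(Seq("1,sales", "2,ops"))

      // join needs (key, value) pairs, so key both inputs first.
      val left  = textFile.map(splitKey)
      val right = textFile1.map(splitKey)

      // cache() only marks the RDD; blocks are stored when an action runs.
      val joined = left.join(right).cache()
      println(s"joined rows: ${joined.count()}")

      // Lists the ids of RDDs currently marked as persistent.
      // The inputs were never persisted, so only `joined` is tracked.
      println(s"persistent RDD ids: ${sc.getPersistentRDDs.keySet}")

      // Explicitly release the cached blocks of `joined`.
      joined.unpersist()
      println(s"after unpersist: ${sc.getPersistentRDDs.keySet}")
    } finally {
      sc.stop()
    }
  }
}
```

sc.getPersistentRDDs is used here only to inspect which RDDs Spark is
tracking as persisted at each step; it is not required for the join
itself.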