Hi Datta,
Thanks for the reply.
If I haven't cached any RDD and the data being loaded into memory
after performing some operations exceeds the available memory, how is it
handled by Spark?
Are previously loaded RDDs removed from memory to free it up for
subsequent steps in the DAG?
I am running in
Hi Aditya,
If you cache the RDDs - like textFile.cache(), textFile1.cache() - then
Spark will not load the data again from the file system.
Once you are done with the related operations, it is recommended to
unpersist the RDDs to manage memory efficiently and avoid its exhaustion.
Note caching operation is with ma
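For example, a minimal sketch of that cache/unpersist pattern (reusing the
emp.txt path from the earlier mail purely for illustration):

val textFile = sc.textFile("/user/emp.txt")  // illustrative path from the thread
textFile.cache()                             // marks the RDD for caching; nothing is materialized yet
textFile.count()                             // first action reads from the file system and fills the cache
textFile.count()                             // later actions reuse the cached blocks instead of re-reading the file
textFile.unpersist()                         // once done, release the storage memory it occupies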
Thanks for the reply.
One more question.
How does Spark handle data if it does not fit in memory? The answer I
got is that it spills the data to disk and thereby handles the memory issue.
Also, consider the example below:
val textFile = sc.textFile("/user/emp.txt")
val textFile1 = sc.textFile("/user/emp1.txt")
v
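For instance, if I understand correctly, one could persist with
MEMORY_AND_DISK (just a sketch, assuming emp1.txt is the intended second
file) so that partitions which do not fit in memory are kept on disk:

import org.apache.spark.storage.StorageLevel

// textFile and textFile1 are the RDDs defined above.
// With MEMORY_AND_DISK, partitions that do not fit in memory are written
// to local disk and read back from there, instead of being recomputed.
textFile.persist(StorageLevel.MEMORY_AND_DISK)
textFile1.persist(StorageLevel.MEMORY_AND_DISK)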
Hi,
unpersist works on storage memory, not execution memory. So I do not think
you can flush an RDD out of memory if you have not cached it in the first
place, using cache() or something like the following:
s.persist(org.apache.spark.storage.StorageLevel.MEMORY_ONLY)
s.unpersist
I believe the recent versions o
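To make the storage-memory point concrete, here is a minimal sketch (the
RDD name s and the path are only illustrative) showing the storage level
before and after unpersist:

import org.apache.spark.storage.StorageLevel

val s = sc.textFile("/user/emp.txt")       // illustrative path taken from the thread
s.persist(StorageLevel.MEMORY_ONLY)        // request storage memory for this RDD
s.count()                                  // an action materializes the cached blocks

println(s.getStorageLevel)                 // reports MEMORY_ONLY while the RDD is cached
println(sc.getPersistentRDDs.size)         // the RDD now appears among the persistent RDDs

s.unpersist()                              // releases the storage memory only
println(s.getStorageLevel)                 // back to StorageLevel.NONE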
Hello Aditya,
After an intermediate action has been applied, you might want to call
rdd.unpersist() to let Spark know that the RDD is no longer required.
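For example (just a sketch, with an illustrative path and variable names):

val raw = sc.textFile("/user/emp.txt")   // illustrative path from your mail
raw.cache()                              // cache because raw is reused below
val cleaned = raw.filter(_.nonEmpty)     // derived RDD that depends on raw
cleaned.count()                          // intermediate action that benefits from the cache
raw.unpersist()                          // raw is no longer required after this point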
Thanks,
-Hanu
On Thu, Sep 22, 2016 at 7:54 AM, Aditya wrote:
> Hi,
>
> Suppose I have two RDDs
> val textFile = sc.textFile("/user/emp.txt")