No, Spark does not refresh cached data automatically.

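Spark works on RDDs. When you run an "action" on an RDD, Spark computes it
along with the parents it depends on. Only RDDs you have marked with cache()
or persist() are kept in memory; later calls that need them read from the
cache. When memory runs short, the least recently used partitions are dropped
(LRU). So "old" means least recently accessed, not stale, and a dropped
partition is recomputed from its lineage the next time it is needed.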
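
To illustrate what LRU eviction means, here is a minimal toy sketch in plain
Python. It is not Spark code and not how Spark's storage layer is implemented;
the class name LruPartitionCache, the capacity of 2, and the partition keys
are all made up for the example.

```python
from collections import OrderedDict


class LruPartitionCache:
    """Toy model of LRU eviction (hypothetical, not Spark's implementation)."""

    def __init__(self, capacity):
        self.capacity = capacity      # maximum number of partitions held
        self.blocks = OrderedDict()   # least recently used first, most recent last

    def get(self, key, compute):
        if key in self.blocks:
            # Cache hit: mark this partition as the most recently used.
            self.blocks.move_to_end(key)
            return self.blocks[key]
        # Cache miss: recompute the partition and store it.
        value = compute()
        self.blocks[key] = value
        if len(self.blocks) > self.capacity:
            # Over capacity: drop the least recently used partition.
            self.blocks.popitem(last=False)
        return value


cache = LruPartitionCache(capacity=2)
cache.get("p0", lambda: [1, 2])
cache.get("p1", lambda: [3, 4])
cache.get("p0", lambda: [1, 2])   # touching p0 makes p1 the least recently used
cache.get("p2", lambda: [5, 6])   # cache is full, so p1 is evicted
remaining = list(cache.blocks)
```

Nothing in the sketch looks at how fresh the data is. Eviction depends only on
access order, and an evicted partition is simply recomputed on the next get().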

On Mon, Apr 20, 2015 at 11:07 PM, Tash Chainar <[email protected]> wrote:

> Hi all,
>
> On https://spark.apache.org/docs/latest/programming-guide.html
> under the "RDD Persistence > Removing Data", it states
>
> "Spark automatically monitors cache usage on each node and drops out old
> data partitions in a least-recently-used (LRU) fashion."
>
>
> Can it be understood that the cache will be automatically refreshed with
> new data? If yes, when and how? How does Spark determine the old data?
>
> Regards.
>



-- 
Best Regards,
Ayan Guha
