You can remove cached RDDs by calling unpersist() on them.
You can also use SparkContext.getRDDStorageInfo to get info on cache usage,
though this is a developer API, so it may change in future versions. We will add
a standard API eventually, but the current one is closely tied to framework
internals.
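
For what it's worth, here is a rough sketch of how the two calls above could be combined. It assumes that the RDDInfo entries returned by getRDDStorageInfo expose id, numPartitions and numCachedPartitions, which, being a developer API, may differ between versions. The isFullyCached helper, the object name and the local[2] master are made up for illustration and are not part of Spark.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object CacheCheck {
  // Hypothetical helper (not a Spark API): true only if every partition
  // of `rdd` is currently reported as cached. An RDD with no storage
  // info entry is treated as not cached.
  def isFullyCached(sc: SparkContext, rdd: RDD[_]): Boolean =
    sc.getRDDStorageInfo
      .find(_.id == rdd.id)
      .exists(info => info.numCachedPartitions == info.numPartitions)

  def main(args: Array[String]): Unit = {
    // Local master chosen only so the sketch can run on one machine.
    val conf = new SparkConf().setAppName("cache-check").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val rdd = sc.parallelize(1 to 1000, 4).cache()
      rdd.count() // caching is lazy; an action materializes the partitions

      // Branch on the cache state to pick one of two code paths.
      if (isFullyCached(sc, rdd)) println("fully cached")
      else println("not fully cached")

      // Drop the cached blocks before caching a different RDD.
      rdd.unpersist(blocking = true)
    } finally {
      sc.stop()
    }
  }
}
```

Note that an RDD may end up only partially cached if the executors run short of storage memory, which is the case the partition-count comparison is meant to catch.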
Hi,
Is there a programmatic way of checking whether an RDD has been 100% cached or
not? I'd like to do this so that I can take one of two different code paths.
Additionally, how do you clear the cache (e.g. if you want to cache different
RDDs and need to clear an existing cached RDD first)?
Thanks!