On Wed, Dec 3, 2014 at 10:52 AM, shahab <shahab.mok...@gmail.com> wrote:

> Hi,
>
> I noticed that rdd.cache() does not take effect immediately; because of
> Spark's lazy evaluation, the data is only cached at the moment you perform
> some map/reduce action. Is this true?
>

Yes, this is correct.

> If this is the case, how can I force Spark to cache the data immediately at
> the cache() statement? I need this for some benchmarking, and I need to
> separate the RDD caching time from the RDD transformation/action processing time.
>

The typical solution, I think, is to run rdd.foreach(_ => ()) to trigger the
computation and materialize the cache.
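
For example, something like the following (a minimal sketch; the local
SparkContext setup and the example RDD are just placeholders for your own
data and job):

    import org.apache.spark.{SparkConf, SparkContext}

    object EagerCacheExample {
      def main(args: Array[String]): Unit = {
        // Illustrative local context; substitute your own setup.
        val conf = new SparkConf().setAppName("eager-cache").setMaster("local[*]")
        val sc = new SparkContext(conf)

        // Placeholder RDD standing in for whatever you want to benchmark.
        val rdd = sc.parallelize(1 to 1000000).map(_ * 2)

        // cache() only marks the RDD for caching; nothing is computed yet.
        rdd.cache()

        // A no-op action forces the RDD to be computed and cached, so the
        // caching cost can be timed separately from later processing.
        val cacheStart = System.nanoTime()
        rdd.foreach(_ => ())
        println(s"caching took ${(System.nanoTime() - cacheStart) / 1e6} ms")

        // Subsequent actions now read from the cache.
        val sum = rdd.map(_.toLong).reduce(_ + _)
        println(s"sum = $sum")

        sc.stop()
      }
    }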
