Re: Spark on Yarn : Map outputs lifetime ?

Imran Rashid Mon, 18 May 2015 18:06:49 -0700

Neither of those two.  Instead, the shuffle data is cleaned up when the
stage it came from gets GC'ed by the JVM.  That happens once you are no
longer holding any references to anything that points to the old stage and
an appropriate GC event occurs.

The data is not cleaned up right after the stage completes, because it
might be used again later (e.g., if the stage is retried).
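
To make the lifetime rule concrete, here is a minimal sketch in plain
Python.  It is only a model of the idea, not Spark code: ShuffleStore and
Stage are hypothetical names invented for illustration, and Python's
weakref.finalize stands in for the JVM-side cleanup (in Spark itself I
believe this is handled by the ContextCleaner on the driver).  It shows the
outputs surviving while a reference to the stage is held, and going away
only after the reference is dropped and a collection runs.

```python
import gc
import weakref


class ShuffleStore:
    """Hypothetical stand-in for the shuffle files kept per stage."""

    def __init__(self):
        self.files = {}  # stage_id -> list of map output names

    def write(self, stage_id, outputs):
        self.files[stage_id] = list(outputs)

    def remove(self, stage_id):
        self.files.pop(stage_id, None)


class Stage:
    """Hypothetical driver-side handle; its lifetime controls the cleanup."""

    def __init__(self, stage_id, store):
        self.stage_id = stage_id
        store.write(stage_id, [f"map_{stage_id}_{i}" for i in range(3)])
        # Register a cleanup that fires only once this handle is collected.
        # The callback holds the store and the id, never the Stage itself.
        weakref.finalize(self, store.remove, stage_id)


store = ShuffleStore()
stage = Stage(0, store)

# The stage has "completed", but a reference is still held, so the
# outputs stay available (for example, for a retry).
gc.collect()
still_there = 0 in store.files

# Drop the last reference and let the collector run.
del stage
gc.collect()
cleaned_up = 0 not in store.files
```

One difference to keep in mind: the JVM decides on its own when to collect,
so in a real application the cleanup can lag well behind the moment the
last reference goes away.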

On Tue, May 12, 2015 at 6:50 PM, Ashwin Shankar <ashwinshanka...@gmail.com>
wrote:

> Hi,
> In Spark on YARN, when running spark_shuffle as an auxiliary service on
> the node manager, do the map spills of a stage get cleaned up once the
> next stage completes, OR
> are they preserved until the app completes (i.e., until all the stages
> complete)?
>
> --
> Thanks,
> Ashwin
