Shuffle files are cleaned up when they are no longer referenced, i.e. once the corresponding shuffle dependency has been garbage-collected on the driver. See https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/ContextCleaner.scala
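
To illustrate what "no longer referenced" means, here is a minimal sketch of reference-tracked cleanup in plain Python. It is not Spark code: `ShuffleDependency` and `Cleaner` are hypothetical stand-ins, and the real ContextCleaner is written in Scala and works with JVM weak references. The sketch only shows the general idea that cleanup fires once the last reference to the owning object goes away.

```python
import gc
import weakref


class ShuffleDependency:
    """Hypothetical stand-in for a driver-side object that owns a shuffle."""

    def __init__(self, shuffle_id):
        self.shuffle_id = shuffle_id


class Cleaner:
    """Hypothetical cleaner that tracks shuffles through weak references."""

    def __init__(self):
        self.live_shuffles = set()
        self.cleaned = []

    def register(self, dep):
        self.live_shuffles.add(dep.shuffle_id)
        # The callback gets only the id, never the object itself, so the
        # cleaner does not keep the dependency alive.
        weakref.finalize(dep, self._cleanup, dep.shuffle_id)

    def _cleanup(self, shuffle_id):
        # In a real system this is where the shuffle files would be deleted.
        self.live_shuffles.discard(shuffle_id)
        self.cleaned.append(shuffle_id)


cleaner = Cleaner()
dep = ShuffleDependency(shuffle_id=0)
cleaner.register(dep)
still_live = 0 in cleaner.live_shuffles  # the shuffle is tracked while dep is referenced

del dep       # drop the last reference
gc.collect()  # make sure the finalizer has run
```

After the last reference is dropped, `cleaner.cleaned` contains the shuffle id and it is gone from `cleaner.live_shuffles`. The practical consequence in Spark is that cleanup depends on garbage collection happening on the driver, so files can stay on disk for a while after their last use; as far as I know, `spark.cleaner.periodicGC.interval` controls how often the driver triggers a GC for this purpose.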

On Mon, Mar 27, 2017 at 12:38 PM, Ashwin Sai Shankar <ashan...@netflix.com.invalid> wrote:

> Hi!
>
> In spark on yarn, when are shuffle files on local disk removed? (Is it
> when the app completes or once all the shuffle files are fetched or
> end of the stage?)
>
> Thanks,
> Ashwin