I'm experiencing the same behaviour, with shuffle data being orphaned on disk (Spark 2.0.1 with Spark Streaming).
We are using AWS R4 EC2 instances with 300GB EBS volumes attached. Most spilled shuffle data is eventually cleaned up by the ContextCleaner within 10 minutes. We run on Mesos and do not use the external shuffle service. Occasionally, some shuffle files are never removed until the application is shut down gracefully or dies due to lack of disk space. I am confident the orphaned shuffle data is not in use by any job after 5 minutes (the batch duration).

Do you know of any possible causes for this shuffle data not being cleaned up and being left orphaned on disk?
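
For anyone trying to reproduce this, here is a minimal sketch of how the leftover files could be listed on an executor host. It is not part of Spark: the function name, the directory argument and the age threshold are all illustrative assumptions. It relies only on the shuffle_<shuffleId>_<mapId>_<reduceId>.data / .index naming that Spark uses for shuffle output files, and it treats a file as a candidate orphan when its modification time is older than the threshold you pass in.

```python
import os
import re
import time

# Spark writes shuffle output as shuffle_<shuffleId>_<mapId>_<reduceId>.data
# plus a matching .index file, under the blockmgr-* subdirectories of the
# executor's local directory.
SHUFFLE_FILE = re.compile(r"^shuffle_(\d+)_(\d+)_(\d+)\.(data|index)$")


def find_stale_shuffle_files(local_dir, max_age_seconds, now=None):
    """Return (path, shuffle_id, age_seconds) tuples for shuffle files
    whose last modification is older than max_age_seconds.

    local_dir is an assumption: pass whatever directory your executors
    actually use for local storage.
    """
    now = time.time() if now is None else now
    stale = []
    for root, _dirs, files in os.walk(local_dir):
        for name in files:
            match = SHUFFLE_FILE.match(name)
            if not match:
                continue  # not a shuffle file (e.g. cached RDD blocks)
            path = os.path.join(root, name)
            try:
                age = now - os.path.getmtime(path)
            except OSError:
                continue  # file was removed while we were scanning
            if age > max_age_seconds:
                stale.append((path, int(match.group(1)), age))
    # Sort by shuffle id so files from the same shuffle are listed together.
    return sorted(stale, key=lambda item: item[1])
```

Running it with a threshold somewhat above the 10 minutes the ContextCleaner normally needs (for example 1800 seconds) should list only the files that were never cleaned, together with the shuffle ids they belong to, which can then be matched against the driver logs.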