Re: How to clear spark Shuffle files

2020-09-14 Thread lsn248
Our use case is as follows: we repartition 6 months' worth of data for each client on clientId and recordcreationdate, so that Spark writes one file per partition. Our output is partitioned on clientId and recordcreationdate. The job fills up the disk after it has processed, say, 30 of the 50 tenants. I am looking
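For readers following the thread, the write pattern described above might look roughly like the sketch below. It is not the poster's actual code. The column names clientId and recordcreationdate come from the message; the function name, the Parquet format, the append mode, and the output path are assumptions made for illustration.

```python
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only needed for type hints; a real job would have PySpark installed.
    from pyspark.sql import DataFrame


def write_client_partitions(df: DataFrame, output_path: str) -> None:
    """Write one file per (clientId, recordcreationdate) partition.

    Hypothetical helper. Repartitioning on the same columns used in
    partitionBy sends all rows for a given key pair to a single task,
    so each output directory receives one file from this write.
    The repartition step is what triggers the shuffle.
    """
    (
        df.repartition("clientId", "recordcreationdate")
        .write.partitionBy("clientId", "recordcreationdate")
        .mode("append")  # assumption: tenants are written one after another
        .parquet(output_path)  # assumption: Parquet output
    )
```

In a job that loops over tenants, a helper like this would be called once per tenant, for example write_client_partitions(tenant_df, "/data/out"), where tenant_df and the path are placeholders. Each call performs its own shuffle, which is consistent with the disk usage growing as more tenants are processed.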

How to clear spark Shuffle files

2020-09-14 Thread lsn248
Hi, I have a long-running application, and Spark seems to fill up the disk with shuffle files. Eventually the job fails because it runs out of disk space. Is there a way for me to clean up the shuffle files? Thanks -- Sent from: http://apache-spark-developers-list.1001551.n3.nabble.com/