Re: Limit Spark Shuffle Disk Usage

2015-06-17 Thread Al M
Thanks Himanshu and RahulKumar! The databricks forum post was extremely useful. It is great to see an article that clearly details how and when shuffles are cleaned up.

Re: Limit Spark Shuffle Disk Usage

2015-06-16 Thread Himanshu Mehra
Hi Al M, You should try providing more main memory to the shuffle process; it might reduce the spill to disk. The default shuffle memory fraction is 20% of the safe memory, which means 16% of the overall heap memory. So when we set executor memory, only a small fraction of it is used in th
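The 20%-of-safe-memory arithmetic above can be sketched as follows. This assumes the Spark 1.x defaults `spark.shuffle.memoryFraction=0.2` and `spark.shuffle.safetyFraction=0.8` (the safety fraction is implied by the "16% of overall heap" figure in the message); the helper function name is just for illustration.

```python
# Sketch of how the Spark 1.x shuffle memory defaults combine.
# spark.shuffle.memoryFraction (default 0.2) applies to the "safe"
# portion of the heap, spark.shuffle.safetyFraction (default 0.8),
# so shuffle gets roughly 0.2 * 0.8 = 16% of the executor heap.
SHUFFLE_MEMORY_FRACTION = 0.2
SHUFFLE_SAFETY_FRACTION = 0.8

def shuffle_memory_bytes(executor_heap_bytes):
    """Approximate heap bytes available to shuffle before it spills to disk."""
    return executor_heap_bytes * SHUFFLE_SAFETY_FRACTION * SHUFFLE_MEMORY_FRACTION

# With a 10 GiB executor heap, only about 1.6 GiB is used for shuffle:
heap = 10 * 1024**3
print(shuffle_memory_bytes(heap) / 1024**3)  # ~1.6
```

Raising `spark.shuffle.memoryFraction` (at the cost of the cache fraction) is therefore one way to keep more of the shuffle in memory and spill less.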

Re: Limit Spark Shuffle Disk Usage

2015-06-15 Thread rahulkumar-aws
Check this link https://forums.databricks.com/questions/277/how-do-i-avoid-the-no-space-left-on-device-error.html Hope this will solve your problem. - Software Developer Sigmoid (SigmoidA
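For readers who hit the "no space left on device" error from the linked post, the usual workaround is to point Spark's shuffle scratch space at a volume with more room. A minimal `spark-defaults.conf` sketch (the path is a placeholder, not from the thread):

```
# spark-defaults.conf (sketch; the path below is a placeholder)
# Directory (or comma-separated list) used for shuffle spill files.
spark.local.dir    /mnt/bigdisk/spark-tmp
```

On YARN this setting is overridden by the node manager's local directories, so the change would need to be made there instead.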

Re: Limit Spark Shuffle Disk Usage

2015-06-12 Thread Akhil Das
You can disable shuffle spill (spark.shuffle.spill) if you have enough memory to hold that much data. Otherwise, I believe adding more resources would be your only choice. Thanks Best Regards On Thu, Jun 11, 2015 at 9:46 PM, Al
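The suggestion above amounts to a one-line configuration change. A sketch for `spark-defaults.conf`, applicable to the Spark 1.x line this thread is about (the setting is deprecated and ignored in later releases, where spilling is always enabled):

```
# spark-defaults.conf (sketch): keep shuffle data in memory, no spill files.
# Only safe when executors have enough memory to hold the shuffle data;
# otherwise tasks will fail with OutOfMemoryError instead of spilling.
spark.shuffle.spill    false
```

The same flag can be passed per job with `spark-submit --conf spark.shuffle.spill=false`.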