Set the ulimit for open files quite high in root mode and that should resolve it.
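For example, something along these lines on each worker node before restarting the Spark workers (a rough sketch, assuming Linux boxes and that "sparkuser" is the account running the executors; the limit value is just an example):

    # check the current per-process limit on open file descriptors
    ulimit -n
    # raise it for the shell that launches the workers
    ulimit -n 65536
    # or make it persistent in /etc/security/limits.conf, e.g.:
    #   sparkuser  soft  nofile  65536
    #   sparkuser  hard  nofile  65536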

Mayur Rustagi
Ph: +1 (760) 203 3257
http://www.sigmoidanalytics.com
@mayur_rustagi <https://twitter.com/mayur_rustagi>



On Mon, May 26, 2014 at 7:48 PM, Matt Kielo <mki...@oculusinfo.com> wrote:

> Hello,
>
> I currently have a task that always fails with
> "java.io.FileNotFoundException: [...]/shuffle_0_257_2155 (Too many open
> files)" when I run sorting operations such as distinct, sortByKey, or
> reduceByKey on a large number of partitions.
>
> I'm working with 365 GB of data, which is being split into 5959 partitions.
> The cluster I'm using has over 1000 GB of memory, with 20 GB per node.
>
> I have tried adding .set("spark.shuffle.consolidate.files", "true") when
> making my Spark context, but it doesn't seem to make a difference.
>
>  Has anyone else had similar problems?
>
> Best regards,
>
> Matt
>
>
