Re: IOException and appcache FileNotFoundException in Spark 1.02

2014-10-14 Thread Ilya Ganelin
Hello all. Does anyone else have any suggestions? Even understanding where this error comes from would help a lot. On Oct 11, 2014 12:56 AM, "Ilya Ganelin" wrote: > Hi Akhil - I tried your suggestions and tried varying my partition sizes. > Reducing the number of partitions led to memory errors (pre

Re: IOException and appcache FileNotFoundException in Spark 1.02

2014-10-10 Thread Ilya Ganelin
Hi Akhil - I tried your suggestions and tried varying my partition sizes. Reducing the number of partitions led to memory errors (presumably - I saw IOExceptions much sooner). With the settings you provided the program ran for longer but ultimately crashed in the same way. I would like to understa

Re: IOException and appcache FileNotFoundException in Spark 1.02

2014-10-10 Thread Ilya Ganelin
Thank you - I will try this. If I drop the partition count am I not more likely to hit memory issues? Especially if the dataset is rather large? On Oct 10, 2014 3:19 AM, "Akhil Das" wrote: > You could be hitting this issue > (or similar). You can

Re: IOException and appcache FileNotFoundException in Spark 1.02

2014-10-10 Thread Akhil Das
You could be hitting this issue (or similar). You can try the following workarounds: sc.set("spark.core.connection.ack.wait.timeout", "600") sc.set("spark.akka.frameSize", "50") Also reduce the number of partitions; you could be hitting the kernel's
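
The two workaround settings named in this message can also be supplied through configuration instead of code. What follows is a minimal sketch, assuming a Spark 1.x installation that reads conf/spark-defaults.conf; the file location is an assumption, while the property names and values are the ones quoted in the message. As far as I know, these properties are normally set on a SparkConf before the context is created, or in this file, so treat the fragment as illustrative only.

```
# conf/spark-defaults.conf (assumed location)
# Seconds to wait for a connection ack before giving up (Spark 1.x default: 60)
spark.core.connection.ack.wait.timeout   600
# Maximum Akka message size in MB (Spark 1.x default: 10)
spark.akka.frameSize                     50
```

Both values take effect only for applications started after the change, and whether they resolve the exceptions discussed in this thread is not established here.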

IOException and appcache FileNotFoundException in Spark 1.02

2014-10-09 Thread Ilya Ganelin
Hi all – I could use some help figuring out a couple of exceptions I’ve been getting regularly. I have been running on a fairly large dataset (150 gigs). With smaller datasets I don't have any issues. My sequence of operations is as follows – unless otherwise specified, I am not caching: Map a 3

IOException and appcache FileNotFoundException in Spark 1.02

2014-10-09 Thread Ilya Ganelin
On Oct 9, 2014 10:18 AM, "Ilya Ganelin" wrote: Hi all – I could use some help figuring out a couple of exceptions I’ve been getting regularly. I have been running on a fairly large dataset (150 gigs). With smaller datasets I don't have any issues. My sequence of operations is as follows – unles