Re: Spark executor Memory profiling

2016-03-01 Thread Nirav Patel
Thanks Nilesh, thanks for sharing those docs. I have come across most of those tunings in the past and, believe me, I have tuned the heck out of this job. What I can't believe is that Spark needs 4x more resources than MapReduce to run the same job (for datasets of magnitude >100GB). I was able to run my job

Re: Spark executor Memory profiling

2016-02-20 Thread Kuchekar
Hi Nirav, I recently attended Spark Summit East 2016, and almost every talk about errors faced by the community and/or tuning topics for Spark mentioned this as the main problem (executor lost and JVM out of memory). Check out these blogs that explain how to tune
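
For diagnosing the executor-lost/OOM pattern mentioned above, one option is to enable GC logging on the executors and read their logs back through YARN. A minimal PySpark sketch (spark.executor.extraJavaOptions is the Spark 1.x property; the JVM flags are standard HotSpot options, and the app name is just a placeholder):

    from pyspark import SparkConf, SparkContext

    # Emit GC details in each executor's stderr so heap pressure can be
    # inspected afterwards (e.g. via `yarn logs -applicationId <appId>`).
    conf = (SparkConf()
            .setAppName("gc-logging-sketch")
            .set("spark.executor.extraJavaOptions",
                 "-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps"))
    sc = SparkContext(conf=conf)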

Re: Spark executor Memory profiling

2016-02-20 Thread Nirav Patel
.run(DFSOutputStream.java:745) > Kindly help me understand the conf. > Thanks in advance. > Regards, Arun. > ------ > From: Kuchekar [kuchekar.nil...@gmail.com] > Sent: 11 February 2016 09:42 > To: Nirav Patel

Re: Spark executor Memory profiling

2016-02-20 Thread Nirav Patel
Thanks Nilesh. I don't think there's heavy communication between the driver and executors. However, I'll try the settings you suggested. I cannot replace groupBy with reduceByKey as it's not an associative operation. It is very frustrating, to be honest. It was a piece of cake with MapReduce compared to am
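
When the per-key logic can be restated around an associative intermediate, aggregateByKey sidesteps the full shuffle-and-materialize that groupByKey performs. A minimal, generic sketch (mean per key; an illustration of the pattern, not code from this thread):

    from pyspark import SparkContext

    sc = SparkContext(appName="assoc-sketch")
    pairs = sc.parallelize([("a", 1.0), ("a", 3.0), ("b", 5.0)])

    # Mean is not associative, so it cannot be passed to reduceByKey directly.
    # Restating it as an associative (sum, count) accumulation avoids pulling
    # every value for a key into memory the way groupByKey does.
    sums = pairs.aggregateByKey(
        (0.0, 0),                                   # zero value: (sum, count)
        lambda acc, v: (acc[0] + v, acc[1] + 1),    # fold a value within a partition
        lambda a, b: (a[0] + b[0], a[1] + b[1]))    # merge partitions (associative)
    means = sums.mapValues(lambda t: t[0] / t[1])
    print(means.collect())  # e.g. [('a', 2.0), ('b', 5.0)]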

Re: Spark executor Memory profiling

2016-02-11 Thread Rishabh Wadhawan
> Regards, Arun. > From: Kuchekar [kuchekar.nil...@gmail.com] > Sent: 11 February 2016 09:42 > To: Nirav Patel > Cc: spark users > Subject: Re: Spark executor Memory profiling > Hi Nirav, > I faced a similar issue with Yar

RE: Spark executor Memory profiling

2016-02-11 Thread ARUN.BONGALE
r.run(DFSOutputStream.java:745) Kindly help me understand the conf. Thanks in advance. Regards, Arun. From: Kuchekar [kuchekar.nil...@gmail.com] Sent: 11 February 2016 09:42 To: Nirav Patel Cc: spark users Subject: Re: Spark executor Memory profiling Hi Nirav,

Re: Spark executor Memory profiling

2016-02-10 Thread Kuchekar
Hi Nirav, I faced a similar issue with YARN, EMR 1.5.2, and the following Spark conf helped me. You can set the values accordingly: conf = (SparkConf().set("spark.master", "yarn-client").setAppName("HalfWay").set("spark.driver.memory", "15G").set("spark.yarn.am.memory", "15G")) conf=conf
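
The conf chain above is cut off by the archive at "conf=conf". A runnable reconstruction of the same setup might look like the sketch below; everything after the truncation point is an illustrative assumption, not Kuchekar's actual continuation:

    from pyspark import SparkConf, SparkContext

    conf = (SparkConf()
            .set("spark.master", "yarn-client")
            .setAppName("HalfWay")
            .set("spark.driver.memory", "15G")
            .set("spark.yarn.am.memory", "15G"))
    # The original message is truncated after "conf=conf"; the executor-side
    # settings below are plausible continuations (assumed, not from the thread).
    conf = (conf
            .set("spark.executor.memory", "15G")
            .set("spark.yarn.executor.memoryOverhead", "2048"))  # MB, Spark 1.x name
    sc = SparkContext(conf=conf)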