Re: spark 1.4 GC issue

2015-11-15 Thread PhuDuc Nguyen
You can try this for G1GC:

.../spark-submit --conf "spark.executor.extraJavaOptions=-XX:+UseG1GC -XX:+UseCompressedOops -XX:-UseGCOverheadLimit" ...

However, I would suggest ensuring your job is properly tuned. If you're experiencing 60% GC in a task it's likely garbage collection is not the probl
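For anyone scripting their submissions, here is a minimal Python sketch of how the executor JVM flags from this message could be packed into a single --conf value. The helper name build_submit_command and the application name my_job.jar are hypothetical, and the elided parts of the original command (the "..." pieces) are left out rather than guessed. The GC logging flags are the ones the Spark tuning guide suggests for measuring GC; they are an optional addition, not part of the original command.

```python
import shlex

# JVM flags quoted from the message above.
G1_FLAGS = ["-XX:+UseG1GC", "-XX:+UseCompressedOops", "-XX:-UseGCOverheadLimit"]

# GC logging flags suggested by the Spark tuning guide (optional addition).
GC_LOG_FLAGS = ["-verbose:gc", "-XX:+PrintGCDetails", "-XX:+PrintGCTimeStamps"]


def build_submit_command(app, jvm_flags, spark_submit="spark-submit"):
    """Return an argv list for spark-submit with executor JVM flags set.

    Hypothetical helper: other spark-submit options are intentionally omitted.
    """
    # All executor JVM flags travel in one space-separated --conf value.
    conf = "spark.executor.extraJavaOptions=" + " ".join(jvm_flags)
    return [spark_submit, "--conf", conf, app]


# "my_job.jar" is a placeholder application name, not from the thread.
cmd = build_submit_command("my_job.jar", G1_FLAGS + GC_LOG_FLAGS)

# shlex.join quotes the --conf value so it can be pasted into a shell.
print(shlex.join(cmd))
```

With the logging flags enabled, GC activity is written to the executors' stdout, which makes it easier to check whether a collector change actually reduces GC time.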

Re: spark 1.4 GC issue

2015-11-15 Thread Ted Yu
Please take a look at http://www.infoq.com/articles/tuning-tips-G1-GC

Cheers

On Sat, Nov 14, 2015 at 10:03 PM, Renu Yadav wrote:
> I have tried G1 GC. Could anyone share their GC settings?
> At the code level I am:
> 1. reading an ORC table using a DataFrame
> 2. mapping the DataFrame to an RDD of my cas

Re: spark 1.4 GC issue

2015-11-14 Thread Renu Yadav
I have tried G1 GC. Could anyone share their GC settings?

At the code level I am:
1. reading an ORC table using a DataFrame
2. mapping the DataFrame to an RDD of my case class
3. converting that RDD to a pair RDD
4. applying combineByKey
5. saving the result to an ORC file

Please suggest.

Regards,
Renu Yadav

On Fr
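Since the steps in this message hinge on combineByKey, a plain-Python model of its three-function contract may help when reasoning about how many objects the job allocates. This is a sketch that does not use Spark: the function combine_by_key, the sample data, and the (sum, count) combiner are all assumptions for illustration, and the poster's actual combiner functions are not shown in the thread.

```python
def combine_by_key(partitions, create_combiner, merge_value, merge_combiners):
    """Plain-Python model of the combineByKey contract (no Spark involved)."""
    merged = {}
    for partition in partitions:
        # Map side: fold each partition's values into one combiner per key.
        local = {}
        for key, value in partition:
            if key in local:
                local[key] = merge_value(local[key], value)
            else:
                local[key] = create_combiner(value)
        # Reduce side: merge per-partition combiners that share a key.
        for key, combiner in local.items():
            if key in merged:
                merged[key] = merge_combiners(merged[key], combiner)
            else:
                merged[key] = combiner
    return merged


# Hypothetical data: two partitions of (key, value) pairs.
partitions = [[("a", 1), ("b", 4), ("a", 3)], [("a", 5), ("b", 6)]]

# A compact (sum, count) combiner keeps one small tuple per key
# instead of collecting every value into a growing list.
sums = combine_by_key(
    partitions,
    create_combiner=lambda v: (v, 1),
    merge_value=lambda acc, v: (acc[0] + v, acc[1] + 1),
    merge_combiners=lambda x, y: (x[0] + y[0], x[1] + y[1]),
)
averages = {k: s / n for k, (s, n) in sums.items()}
print(averages)
```

If the real combiner accumulates every value into a collection rather than a small running aggregate, each key holds many live objects, which is one plausible source of GC pressure worth ruling out before changing collectors.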

Re: spark 1.4 GC issue

2015-11-13 Thread Gaurav Kumar
Please have a look at http://spark.apache.org/docs/1.4.0/tuning.html

You may also want to use the latest build of JDK 7/8 and use G1GC instead. I saw considerable reductions in GC time just by doing that. The rest of the tuning parameters are better explained in the link above.

Best Regards,
Gaurav

spark 1.4 GC issue

2015-11-13 Thread Renu Yadav
I am using Spark 1.4, and my application spends around 60-70% of each task's time in GC. I am using the parallel GC. Could somebody please help as soon as possible?

Thanks,
Renu