Re: [External Sender] How to debug Spark job

2018-09-08 Thread Sonal Goyal
You could also try profiling your program on the executor or driver with jvisualvm or YourKit to see whether there is any memory/CPU optimization you could do. Thanks, Sonal, Nube Technologies
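
For jvisualvm to attach to a running driver you typically need JMX enabled on its JVM. A minimal sketch of doing that from PySpark, assuming client mode where Python itself launches the JVM; the app name, port 9010, and the disabled auth/SSL are illustrative assumptions only:

    from pyspark.sql import SparkSession

    # Illustrative JMX flags so jvisualvm can attach to the driver JVM.
    # Do not disable authentication/SSL like this on a shared cluster.
    jmx_opts = (
        "-Dcom.sun.management.jmxremote "
        "-Dcom.sun.management.jmxremote.port=9010 "
        "-Dcom.sun.management.jmxremote.authenticate=false "
        "-Dcom.sun.management.jmxremote.ssl=false"
    )

    spark = (
        SparkSession.builder
        .appName("profile-me")
        # Driver JVM options only take effect if applied before the JVM
        # starts; with spark-submit, pass them via --driver-java-options.
        .config("spark.driver.extraJavaOptions", jmx_opts)
        # The same idea works for executors via
        # spark.executor.extraJavaOptions, though a fixed port will clash
        # if several executors share a host.
        .getOrCreate()
    )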

Re: [External Sender] How to debug Spark job

2018-09-07 Thread James Starks
Got the root cause eventually: it throws java.lang.OutOfMemoryError: Java heap space. Increasing --driver-memory temporarily fixes the problem. Thanks.
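
For reference, the driver heap is sized when the JVM launches. A minimal sketch, assuming a standalone script where Python starts the JVM itself (the 8g figure is an arbitrary example); with spark-submit, use the --driver-memory flag on the command line instead, since the setting is ignored once the JVM is already running:

    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("heap-example")
        # Only takes effect if set before the driver JVM starts; under
        # spark-submit, pass --driver-memory 8g on the command line.
        .config("spark.driver.memory", "8g")
        .getOrCreate()
    )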

Re: [External Sender] How to debug Spark job

2018-09-07 Thread Femi Anthony
One way I would go about this would be to run new_df.show(n, truncate=False) for a few rows before you try writing to parquet, to force computation of new_df and see whether the hang occurs at that point or during the write. You could also try a new_df.count().
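
A minimal sketch of that bisection, assuming a PySpark DataFrame named new_df and a placeholder output path:

    # Force evaluation before the write so a hang can be attributed to the
    # upstream transformations rather than the parquet output step.
    new_df.show(20, truncate=False)   # materializes the first 20 rows
    print(new_df.count())             # forces a full pass over the data

    # If both of the above return promptly, the problem is in the write.
    new_df.write.parquet("/tmp/new_df_debug")  # placeholder path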