I think most Spark technical support folks would recommend upgrading to Spark 2.0+ for starters, but I understand that's not always possible. In that case, I would double-check that you don't have a join key with a very large number of records associated with it in one or both datasets. That kind of skew pushes all of those records into a single partition, which can then OOM the executor when that partition gets processed in the next stage. A quick way to check for it is sketched below.
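Something like the following (a minimal sketch against the Spark 1.6 DataFrame API; leftDF, the joinKey column, and the input path are placeholders for your own data) will surface the heaviest keys:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.functions.desc

object SkewCheck {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("skew-check"))
    val sqlContext = new SQLContext(sc)

    // Placeholder input; point this at one side of the problematic join.
    val leftDF = sqlContext.read.parquet("hdfs:///path/to/left")

    // Count rows per join key and list the heaviest keys first. A handful of
    // keys with counts orders of magnitude above the rest usually explains
    // a single shuffle partition blowing up during the join.
    leftDF.groupBy("joinKey")
      .count()
      .orderBy(desc("count"))
      .show(20, false)

    sc.stop()
  }
}

If a few keys dominate, salting those keys (or handling them separately) before the join tends to help more than just adding memory.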
On Wed, Jan 9, 2019 at 4:24 PM William Shen <wills...@marinsoftware.com> wrote:

> Thank you for the tips. We are running Spark 1.6 (Scala), and the OOM happens
> with SparkSQL trying to join a few large datasets together for
> processing/transformation...
>
> On Wed, Jan 9, 2019 at 3:42 PM Ramandeep Singh <rs5...@nyu.edu> wrote:
>
>> Hi,
>>
>> Here are a few suggestions that you can try.
>>
>> OOM issues that I have faced with Spark:
>> *Not enough shuffle partitions*: increase them.
>> *Low memory overhead settings*: boost the overhead to around 12 percent. You
>> usually see this as an error message in your executors.
>> *Large executor configs*: they can be problematic; smaller, more numerous
>> executors are preferred over larger and fewer executors.
>> Changing the GC algorithm.
>>
>> http://orastack.com/spark-scaling-to-large-datasets.html
>>
>> Here are a few tips
>>
>> On Wed, Jan 9, 2019 at 1:55 PM Dillon Dukek
>> <dillon.du...@placed.com.invalid> wrote:
>>
>>> Hi William,
>>>
>>> Just to get started, can you describe the Spark version you are using
>>> and the language? It doesn't sound like you are using PySpark, but
>>> problems arising from that can be different, so I just want to be sure.
>>> Also, can you talk through the scenario under which you are hitting
>>> this error, i.e. the order of operations for the transformations you are
>>> applying?
>>>
>>> If you're set on getting a heap dump, probably the easiest way
>>> would be to monitor an active application through the Spark UI, then
>>> grab a heap dump from the executor Java process when you notice one that's
>>> having problems.
>>>
>>> On Wed, Jan 9, 2019 at 10:18 AM William Shen <wills...@marinsoftware.com>
>>> wrote:
>>>
>>>> Hi there,
>>>>
>>>> We've encountered Spark executor Java OOM issues in our Spark
>>>> application. Any tips on how to troubleshoot and identify what objects are
>>>> occupying the heap? In the past, dealing with JVM OOM, we've worked with
>>>> analyzing heap dumps, but we are having a hard time locating a Spark
>>>> heap dump after a crash, and we also anticipate that these heap dumps will
>>>> be huge (since our nodes have a large memory allocation) and may be
>>>> difficult to analyze locally. Can someone share their experience dealing
>>>> with Spark OOM?
>>>>
>>>> Thanks!
>>>
>>
>> --
>> Regards,
>> Ramandeep Singh
>> Blog: http://ramannanda.blogspot.com
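P.S. On the heap dump question from the original message: one option (the values and paths below are only placeholders to adapt) is to have the executor JVMs write a dump when they hit OOM, alongside raising the shuffle partitions and memory overhead mentioned above, e.g.:

spark-submit \
  --conf spark.sql.shuffle.partitions=2000 \
  --conf spark.yarn.executor.memoryOverhead=4096 \
  --conf "spark.executor.extraJavaOptions=-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/executor-dumps" \
  ... your usual submit arguments ...

Keep in mind the dump is written to the worker node's local disk, so you'll want to copy it off before the node or container gets cleaned up, and a dump from a large executor heap can be tens of GB.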