Thank you. Sure, if I find something I will post it.

Regards,
Raghava.

On Wed, Jun 22, 2016 at 7:43 PM, Nirav Patel <npa...@xactlycorp.com> wrote:

I believe it would be task, partition, and task-status information. I do not know the exact details, but I had an OOM on the driver with 512MB and increasing it did help. Someone else might be able to give a better answer on the driver's exact memory usage.

You also seem to use broadcast, which means sending something from the driver JVM. You can try taking a memory dump when your driver memory is about full, or set JVM args to take one automatically on an OutOfMemoryError. Analyze it and share your findings :)
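(For reference, a minimal sketch of the driver-memory and heap-dump-on-OOM settings mentioned above, shown as SparkConf keys mainly to name them. These are standard Spark configuration keys, but the values are placeholders, and in client mode they have to be supplied before the driver JVM starts, typically via spark-defaults.conf or the spark-submit flags --driver-memory and --driver-java-options rather than in code.)

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Sketch only: spark.driver.* settings must be in place before the driver JVM
// starts, so in client mode set them in spark-defaults.conf or on the
// spark-submit command line instead of in SparkConf. App name, memory value,
// and dump path below are placeholders.
val conf = new SparkConf()
  .setAppName("tuple-closure")
  .set("spark.driver.memory", "12g")
  .set("spark.driver.extraJavaOptions",
    "-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/driver-oom.hprof")
val sc = new SparkContext(conf)
```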
On Wed, Jun 22, 2016 at 4:33 PM, Raghava Mutharaju <m.vijayaragh...@gmail.com> wrote:

Ok. Would you be able to shed more light on what exact metadata it manages and what its relationship is to the number of partitions/nodes?

There is one executor running on each node -- so there are 64 executors in total. Each executor, including the driver, is given 12GB, which is the maximum available limit. So the other options are

1) Separate the driver from the master, i.e., run them on two separate nodes
2) Increase the RAM capacity on the driver/master node.

Regards,
Raghava.

On Wed, Jun 22, 2016 at 7:05 PM, Nirav Patel <npa...@xactlycorp.com> wrote:

Yes, the driver keeps a fair amount of metadata to manage scheduling across all your executors. I assume with 64 nodes you have more executors as well. A simple way to test is to increase the driver memory.

On Wed, Jun 22, 2016 at 10:10 AM, Raghava Mutharaju <m.vijayaragh...@gmail.com> wrote:

It is an iterative algorithm which uses map, mapPartitions, join, union, filter, broadcast and count. The goal is to compute a set of tuples, and in each iteration a few tuples are added to it. The outline is given below, with a rough code sketch after the list.

1) Start with an initial set of tuples, T
2) In each iteration compute deltaT, and add it to T, i.e., T = T + deltaT
3) Stop when the current size (count) of T is the same as the previous size, i.e., deltaT is 0.
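(The sketch, just to make the pattern concrete; the tuple type and computeDelta are placeholders for the actual application logic, which would use map, mapPartitions, join, filter and broadcast internally.)

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.storage.StorageLevel

// Fixpoint loop: keep adding deltaT to T until the count stops growing.
// Element type and computeDelta are placeholders, not the real application code.
def computeFixpoint(initial: RDD[(Long, Long)],
                    computeDelta: RDD[(Long, Long)] => RDD[(Long, Long)]): RDD[(Long, Long)] = {
  var t = initial.persist(StorageLevel.MEMORY_AND_DISK)
  var previous = -1L
  var current = t.count()                  // only a Long comes back to the driver
  while (current != previous) {
    val delta = computeDelta(t)
    t = t.union(delta).distinct()          // T = T + deltaT, deduplicated
         .persist(StorageLevel.MEMORY_AND_DISK)
    previous = current
    current = t.count()
  }
  t
}
```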
Do you think something happens on the driver, due to the application logic, when the partitions are increased?

Regards,
Raghava.

On Wed, Jun 22, 2016 at 12:33 PM, Sonal Goyal <sonalgoy...@gmail.com> wrote:

What does your application do?

Best Regards,
Sonal
Founder, Nube Technologies <http://www.nubetech.co>
Reifier at Strata Hadoop World <https://www.youtube.com/watch?v=eD3LkpPQIgM>
Reifier at Spark Summit 2015 <https://spark-summit.org/2015/events/real-time-fuzzy-matching-with-spark-and-elastic-search/>
<http://in.linkedin.com/in/sonalgoyal>

On Wed, Jun 22, 2016 at 9:57 PM, Raghava Mutharaju <m.vijayaragh...@gmail.com> wrote:

Hello All,

We have a Spark cluster where the driver and the master are running on the same node. We are using the Spark Standalone cluster manager. If the number of nodes (and partitions) is increased, the same dataset that used to run to completion on a smaller number of nodes now gives an out-of-memory error on the driver.

For example, a dataset that runs on 32 nodes with the number of partitions set to 256 completes, whereas the same dataset run on 64 nodes with the number of partitions set to 512 gives an OOM on the driver side.

From what I read in the Spark documentation and other articles, the following are the responsibilities of the driver/master.

1) create the spark context
2) build the DAG of operations
3) schedule the tasks

I am guessing that 1) and 2) should not change with the number of nodes/partitions. So is it that, since the driver has to keep track of a lot more tasks, it gives an OOM?

What could be the possible reasons behind the driver-side OOM when the number of partitions is increased?

Regards,
Raghava.

--
Regards,
Raghava
http://raghavam.github.io