Thank you. Sure, if I find something I will post it.

Regards,
Raghava.

On Wed, Jun 22, 2016 at 7:43 PM, Nirav Patel <npa...@xactlycorp.com> wrote:

I believe it would be task, partition, and task-status information. I do not know the exact details, but I had an OOM on the driver with 512MB and increasing it did help. Someone else might be able to give a better answer on the driver's exact memory usage.

You also seem to use broadcast, which means sending something from the driver JVM. You can try taking a memory dump when your driver memory is about full, or set JVM args to take one automatically on an OutOfMemoryError. Analyze it and share your findings :)
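(For reference, a minimal sketch of the driver-memory and heap-dump-on-OOM settings mentioned above, shown as SparkConf keys mainly to name them. These are standard Spark configuration keys, but the values are placeholders, and in client mode they have to be supplied before the driver JVM starts, typically via spark-defaults.conf or the spark-submit flags --driver-memory and --driver-java-options rather than in code.)

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Sketch only: spark.driver.* settings must be in place before the driver JVM
// starts, so in client mode set them in spark-defaults.conf or on the
// spark-submit command line instead of in SparkConf. App name, memory value,
// and dump path below are placeholders.
val conf = new SparkConf()
  .setAppName("tuple-closure")
  .set("spark.driver.memory", "12g")
  .set("spark.driver.extraJavaOptions",
    "-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/driver-oom.hprof")
val sc = new SparkContext(conf)
```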
On Wed, Jun 22, 2016 at 4:33 PM, Raghava Mutharaju <m.vijayaragh...@gmail.com> wrote:

Ok. Would you be able to shed more light on what exact metadata it manages and what its relationship is to the number of partitions/nodes?

There is one executor running on each node -- so there are 64 executors in total. Each executor, including the driver, is given 12GB, which is the maximum available limit. So the other options are

1) Separate the driver from the master, i.e., run them on two separate nodes
2) Increase the RAM capacity on the driver/master node.

Regards,
Raghava.

On Wed, Jun 22, 2016 at 7:05 PM, Nirav Patel <npa...@xactlycorp.com> wrote:

Yes, the driver keeps a fair amount of metadata to manage scheduling across all your executors. I assume with 64 nodes you have more executors as well. A simple way to test is to increase the driver memory.

On Wed, Jun 22, 2016 at 10:10 AM, Raghava Mutharaju <m.vijayaragh...@gmail.com> wrote:

It is an iterative algorithm which uses map, mapPartitions, join, union, filter, broadcast and count. The goal is to compute a set of tuples, and in each iteration a few tuples are added to it. The outline is given below, with a rough code sketch after the list.

1) Start with an initial set of tuples, T
2) In each iteration compute deltaT, and add it to T, i.e., T = T + deltaT
3) Stop when the current size (count) of T is the same as the previous size, i.e., deltaT is 0.
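(The sketch, just to make the pattern concrete; the tuple type and computeDelta are placeholders for the actual application logic, which would use map, mapPartitions, join, filter and broadcast internally.)

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.storage.StorageLevel

// Fixpoint loop: keep adding deltaT to T until the count stops growing.
// Element type and computeDelta are placeholders, not the real application code.
def computeFixpoint(initial: RDD[(Long, Long)],
                    computeDelta: RDD[(Long, Long)] => RDD[(Long, Long)]): RDD[(Long, Long)] = {
  var t = initial.persist(StorageLevel.MEMORY_AND_DISK)
  var previous = -1L
  var current = t.count()                  // only a Long comes back to the driver
  while (current != previous) {
    val delta = computeDelta(t)
    t = t.union(delta).distinct()          // T = T + deltaT, deduplicated
         .persist(StorageLevel.MEMORY_AND_DISK)
    previous = current
    current = t.count()
  }
  t
}
```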
Do you think something happens on the driver, due to the application logic, when the partitions are increased?

Regards,
Raghava.

On Wed, Jun 22, 2016 at 12:33 PM, Sonal Goyal <sonalgoy...@gmail.com> wrote:

What does your application do?

Best Regards,
Sonal
Founder, Nube Technologies <http://www.nubetech.co>
Reifier at Strata Hadoop World <https://www.youtube.com/watch?v=eD3LkpPQIgM>
Reifier at Spark Summit 2015 <https://spark-summit.org/2015/events/real-time-fuzzy-matching-with-spark-and-elastic-search/>
<http://in.linkedin.com/in/sonalgoyal>

On Wed, Jun 22, 2016 at 9:57 PM, Raghava Mutharaju <m.vijayaragh...@gmail.com> wrote:

Hello All,

We have a Spark cluster where the driver and the master are running on the same node. We are using the Spark Standalone cluster manager. If the number of nodes (and partitions) is increased, the same dataset that used to run to completion on a smaller number of nodes now gives an out-of-memory error on the driver.

For example, a dataset that runs on 32 nodes with the number of partitions set to 256 completes, whereas the same dataset run on 64 nodes with the number of partitions set to 512 gives an OOM on the driver side.

From what I read in the Spark documentation and other articles, the following are the responsibilities of the driver/master.

1) create the spark context
2) build the DAG of operations
3) schedule the tasks

I am guessing that 1) and 2) should not change with the number of nodes/partitions. So is it that, since the driver has to keep track of a lot more tasks, it gives an OOM?

What could be the possible reasons behind the driver-side OOM when the number of partitions is increased?

Regards,
Raghava.

--
Regards,
Raghava
http://raghavam.github.io