Re: Executor memory requirement for reduceByKey

2016-05-17 Thread Raghavendra Pandey
Even though it may not sound intuitive, reduceByKey expects all the values for a particular key within a partition to fit in memory. So once you increase the number of partitions, you can run the job.
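
For illustration, a minimal sketch (RDD name and partition count hypothetical) of passing a larger partition count directly to reduceByKey, so fewer values land in any single partition:

    import org.apache.spark.rdd.RDD

    // Hypothetical pair RDD; the second argument sets the number of
    // result partitions for the shuffle, spreading each partition's
    // values thinner. 400 is illustrative, not a universal answer.
    def sumByKey(pairs: RDD[(String, Long)]): RDD[(String, Long)] =
      pairs.reduceByKey(_ + _, 400)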

Re: Executor memory requirement for reduceByKey

2016-05-13 Thread Sung Hwan Chung
OK, that worked flawlessly after I upped the number of partitions from 40 to 400. Thanks!

Re: Executor memory requirement for reduceByKey

2016-05-13 Thread Sung Hwan Chung
I'll try that; as of now I have a small number of partitions, on the order of 20~40. It would be great if there were some documentation on the memory requirement with respect to the number of keys and the number of partitions per executor (i.e., Spark's internal memory requirement outside of user space).

Re: Executor memory requirement for reduceByKey

2016-05-13 Thread Ted Yu
Have you taken a look at SPARK-11293? Consider using repartition to increase the number of partitions.
> Hello, I'm using Spark version 1.6.0 and have trouble with memory when trying to do reduceByKey on a dataset with as many...
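
A hedged sketch of the repartition suggestion (input RDD hypothetical). Note that repartition adds a shuffle of its own; passing the count directly to reduceByKey, as in the earlier sketch, achieves the same spread in a single shuffle:

    import org.apache.spark.rdd.RDD

    // Hypothetical pair RDD that currently has too few partitions.
    def reduceWithMorePartitions(pairs: RDD[(String, Long)]): RDD[(String, Long)] =
      pairs
        .repartition(400)    // illustrative count; tune to the data volume
        .reduceByKey(_ + _)  // each reduce task now sees fewer values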

Re: Executor memory allocations

2015-06-18 Thread Richard Marscher
It would be the "40%", although it's probably better to think of it as shuffle vs. data cache, with the remainder going to tasks. As the documentation for the shuffle memory fraction setting makes clear, it takes memory at the expense of the storage/data-cache fraction: spark.shuffle.memoryFraction.
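
As a sketch, this is how the two legacy fractions were set under Spark 1.x static memory management (values illustrative, not recommendations):

    import org.apache.spark.SparkConf

    // Legacy (pre-1.6) static-memory-management knobs.
    val conf = new SparkConf()
      .set("spark.shuffle.memoryFraction", "0.2") // shuffle/aggregation buffers
      .set("spark.storage.memoryFraction", "0.6") // RDD cache
    // The remaining ~20% of the heap is left for task/user objects.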

Re: Executor memory in web UI

2015-04-17 Thread Sean Owen
This is the fraction available for caching, which is 60% * 90% of the total by default.
> Hi, I am a bit confused with the executor-memory option. I am running applications with the Standalone cluster manager, with 8 workers, each with 4 GB memory and 2 cores...
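
The arithmetic worked out, assuming the Spark 1.x defaults behind the quoted fractions (spark.storage.memoryFraction = 0.6 times spark.storage.safetyFraction = 0.9) and a 4 GB executor as in the question:

    // Approximate storage capacity the web UI reports for a 4 GB executor.
    val heapGB = 4.0
    val storageGB = heapGB * 0.6 * 0.9 // ~2.16 GB available for caching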

Re: Executor memory

2014-12-16 Thread Pala M Muthaia
Thanks for the clarifications. I misunderstood what the number on the UI meant.

Re: Executor memory

2014-12-15 Thread Sean Owen
I believe this corresponds to the 0.6 of the whole heap that is allocated for caching partitions. See spark.storage.memoryFraction on http://spark.apache.org/docs/latest/configuration.html. 0.6 of 4 GB is about 2.3 GB. The note there is important: you probably don't want to exceed the JVM old generation size.

Re: Executor memory

2014-12-15 Thread sandy.ryza
Hi Pala, Spark executors only reserve spark.storage.memoryFraction (default 0.6) of their spark.executor.memory for caching RDDs. The Spark UI displays this fraction. spark.executor.memory controls the executor heap size. spark.yarn.executor.memoryOverhead controls the extra that's tacked on top of the heap when requesting containers from YARN, to cover off-heap memory use.
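
A sketch of the three settings side by side, with illustrative values:

    import org.apache.spark.SparkConf

    val conf = new SparkConf()
      .set("spark.executor.memory", "4g")               // executor heap size
      .set("spark.storage.memoryFraction", "0.6")       // share reserved for the RDD cache
      .set("spark.yarn.executor.memoryOverhead", "512") // extra MB requested from YARN for off-heap use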

Re: Executor Memory, Task hangs

2014-08-19 Thread Laird, Benjamin
(Reply quoting Sean Owen's message below; only the mail headers survived in this archived snippet.)

Re: Executor Memory, Task hangs

2014-08-19 Thread Sean Owen
Given a fixed amount of memory allocated to your workers, more memory per executor means fewer executors can execute in parallel. This means it takes longer to finish all of the tasks. Set it high enough and your executors can find no worker with enough memory, so they are all stuck waiting for resources.
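
A worked example of that scheduling constraint (numbers hypothetical):

    // A worker advertising 8 GB can host two 4 GB executors, one 8 GB
    // executor, and zero 16 GB executors; in the last case the app
    // waits for resources that never become available.
    val executorsPerWorker = (workerGB: Int, executorGB: Int) => workerGB / executorGB
    executorsPerWorker(8, 4)  // 2
    executorsPerWorker(8, 16) // 0 -> tasks hang waiting for executors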

Re: Executor Memory, Task hangs

2014-08-19 Thread Akhil Das
Looks like 1 worker is doing the job. Can you repartition the RDD? Also, what is the number of cores that you allocated? Things like this you can easily identify by looking at the workers' web UI (default worker:8081).
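
A small diagnostic sketch (helper name hypothetical) for confirming whether one partition, and hence one worker, holds most of the data:

    import org.apache.spark.rdd.RDD

    // Count the records in each partition; one disproportionately large
    // entry means a single worker is doing essentially all of the work.
    def partitionSizes[T](rdd: RDD[T]): Array[Long] =
      rdd.mapPartitions(it => Iterator.single(it.size.toLong)).collect()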