Thank you Imran!! I was able to solve the issue by setting "spark.storage.blockManagerSlaveTimeoutMs=300000".
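For anyone finding this thread later, here is a minimal sketch of one way such a setting might be passed on the spark-submit command line using its --conf option. Only the property name and value come from this thread; the helper name `conf_flags` and the script name `my_job.py` are hypothetical and used purely for illustration.

```python
# Hypothetical helper: render Spark properties as spark-submit --conf flags.
def conf_flags(props):
    """Return a flat list like ["--conf", "key=value", ...], sorted by key."""
    flags = []
    for key, value in sorted(props.items()):
        flags += ["--conf", "{}={}".format(key, value)]
    return flags


# The property and value reported in this thread.
settings = {"spark.storage.blockManagerSlaveTimeoutMs": 300000}

# "my_job.py" is a placeholder application name, not something from the thread.
command = ["spark-submit"] + conf_flags(settings) + ["my_job.py"]
print(" ".join(command))
```

The same key/value pair could presumably also go into spark-defaults.conf or be set on a SparkConf in the application; the command-line form is shown only because it needs no code changes.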
I was seeing some block manager timeouts on the master, so I updated this
setting. It fixed the timeout issue as well as the OOM errors on the workers.
I am not really sure how it fixed the OOM, but it is now working for me.

Thanks
Ankur

On Mon, Apr 13, 2015 at 8:09 PM, Imran Rashid <iras...@cloudera.com> wrote:

> broadcast variables count towards "spark.storage.memoryFraction", so they
> use the same "pool" of memory as cached RDDs.
>
> That being said, I'm really not sure why you are running into problems, it
> seems like you have plenty of memory available. Most likely it's got
> nothing to do with broadcast variables or caching -- it's just whatever
> logic you are applying in your transformations that is causing lots of GC
> to occur during the computation. Hard to say without knowing more details.
>
> You could try increasing the timeout for the failed askWithReply by
> increasing "spark.akka.lookupTimeout" (defaults to 30), but that would most
> likely be treating a symptom, not the root cause.
>
> On Fri, Mar 27, 2015 at 4:52 PM, Ankur Srivastava <
> ankur.srivast...@gmail.com> wrote:
>
>> Hi All,
>>
>> I am running a spark cluster on EC2 instances of type: m3.2xlarge. I have
>> given 26gb of memory with all 8 cores to my executors. I can see that in
>> the logs too:
>>
>> *15/03/27 21:31:06 INFO AppClient$ClientActor: Executor added:
>> app-20150327213106-0000/0 on worker-20150327212934-10.x.y.z-40128
>> (10.x.y.z:40128) with 8 cores*
>>
>> I am not caching any RDD so I have set "spark.storage.memoryFraction" to
>> 0.2. I can see on SparkUI under executors tab Memory used is 0.0/4.5 GB.
>>
>> I am now confused by these logs:
>>
>> *15/03/27 21:31:08 INFO BlockManagerMasterActor: Registering block
>> manager 10.77.100.196:58407 with 4.5 GB RAM,
>> BlockManagerId(4, 10.x.y.z, 58407)*
>>
>> I am broadcasting a large object of 3 gb and after that when I am
>> creating an RDD, I see logs which show this 4.5 GB memory getting full and
>> then I get OOM.
>>
>> How can I make block manager use more memory?
>>
>> Is there any other fine tuning I need to do for broadcasting large
>> objects?
>>
>> And does broadcast variable use cache memory or rest of the heap?
>>
>> Thanks
>>
>> Ankur
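
A note on the 4.5 GB figure in the log above: as far as I know, Spark 1.x sizes the block manager's pool as roughly (max heap) x spark.storage.memoryFraction x spark.storage.safetyFraction, with the safety fraction defaulting to 0.9. The sketch below is just that arithmetic in Python. The function names are hypothetical, and the 25 GB "heap as the JVM reports it" value is my assumption (the JVM usually reports somewhat less than the configured 26 GB), not a number from the thread.

```python
def storage_pool_gb(heap_gb, memory_fraction, safety_fraction=0.9):
    """Estimate the block manager pool size in GB (assumed Spark 1.x formula)."""
    return heap_gb * memory_fraction * safety_fraction


def fraction_for_pool(target_pool_gb, heap_gb, safety_fraction=0.9):
    """Estimate the spark.storage.memoryFraction needed for a target pool size."""
    return target_pool_gb / (heap_gb * safety_fraction)


# 26 GB is the executor memory from the thread; 0.2 is the fraction that was set.
print("pool with 26 GB heap: %.2f GB" % storage_pool_gb(26, 0.2))

# Assumption: the JVM reports about 25 GB of usable heap for a 26 GB executor.
print("pool with 25 GB heap: %.2f GB" % storage_pool_gb(25, 0.2))

# Fraction that would be needed for a hypothetical 8 GB pool on that heap.
print("fraction for an 8 GB pool: %.3f" % fraction_for_pool(8, 25))
```

Under these assumptions a 0.2 fraction lands close to the 4.5 GB the block manager registered with, so a 3 GB broadcast would take up most of the pool. Raising spark.storage.memoryFraction should enlarge the pool, at the cost of leaving less of the heap for the transformations themselves.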