I have 5 workers, each with 8 GB of executor memory; my driver memory is 8 GB as well. They are all 8-core machines.
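(For reference, a minimal sketch of how that setup might be passed to spark-submit on a standalone cluster; the master URL, class name, and jar are placeholders:)

    # Minimal sketch; master URL, class and jar are placeholders.
    # --total-executor-cores 40 = 5 workers x 8 cores.
    spark-submit \
      --master spark://master-host:7077 \
      --executor-memory 8g \
      --driver-memory 8g \
      --total-executor-cores 40 \
      --class com.example.HmmJob \
      hmm-job.jar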
To answer Imran's question, my configuration is as follows: executor_total_max_heapsize = 18GB. This problem happens at the end of my program; I don't have to run many jobs to see this behaviour, and I can see my output correctly in HDFS. I will give it one more try after increasing the master's memory (from the default 296 MB to 512 MB).

..Manas

On Thu, Feb 12, 2015 at 2:14 PM, Arush Kharbanda <ar...@sigmoidanalytics.com>
wrote:

> How many nodes do you have in your cluster, how many cores, and what is
> the size of the memory?
>
> On Fri, Feb 13, 2015 at 12:42 AM, Manas Kar <manasdebashis...@gmail.com>
> wrote:
>
>> Hi Arush,
>> Mine is CDH 5.3 with Spark 1.2.
>> The only change to my Spark programs is
>> -Dspark.driver.maxResultSize=3g -Dspark.akka.frameSize=1000.
>>
>> ..Manas
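(Those two properties could equivalently be passed as spark-submit --conf flags; a minimal sketch, with the jar name a placeholder:)

    # Equivalent --conf form; the jar name is a placeholder.
    # spark.akka.frameSize is in MB in Spark 1.2.
    spark-submit \
      --conf spark.driver.maxResultSize=3g \
      --conf spark.akka.frameSize=1000 \
      hmm-job.jar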
>> On Thu, Feb 12, 2015 at 2:05 PM, Arush Kharbanda <
>> ar...@sigmoidanalytics.com> wrote:
>>
>>> What is your cluster configuration? Did you try looking at the Web UI?
>>> There are many tips here:
>>>
>>> http://spark.apache.org/docs/1.2.0/tuning.html
>>>
>>> Did you try these?
>>>
>>> On Fri, Feb 13, 2015 at 12:09 AM, Manas Kar <manasdebashis...@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>> I have a Hidden Markov Model running with 200 MB of data.
>>>> Once the program finishes (i.e. all stages/jobs are done), it hangs
>>>> for 20 minutes or so before the master dies.
>>>>
>>>> In the Spark master the following log appears:
>>>>
>>>> 2015-02-12 13:00:05,035 ERROR akka.actor.ActorSystemImpl: Uncaught
>>>> fatal error from thread [sparkMaster-akka.actor.default-dispatcher-31]
>>>> shutting down ActorSystem [sparkMaster]
>>>> java.lang.OutOfMemoryError: GC overhead limit exceeded
>>>>     at scala.collection.immutable.List$.newBuilder(List.scala:396)
>>>>     at scala.collection.generic.GenericTraversableTemplate$class.genericBuilder(GenericTraversableTemplate.scala:69)
>>>>     at scala.collection.AbstractTraversable.genericBuilder(Traversable.scala:105)
>>>>     at scala.collection.generic.GenTraversableFactory$GenericCanBuildFrom.apply(GenTraversableFactory.scala:58)
>>>>     at scala.collection.generic.GenTraversableFactory$GenericCanBuildFrom.apply(GenTraversableFactory.scala:53)
>>>>     at scala.collection.TraversableLike$class.builder$1(TraversableLike.scala:239)
>>>>     at scala.collection.TraversableLike$class.map(TraversableLike.scala:243)
>>>>     at scala.collection.AbstractTraversable.map(Traversable.scala:105)
>>>>     at org.json4s.MonadicJValue$$anonfun$org$json4s$MonadicJValue$$findDirectByName$1.apply(MonadicJValue.scala:26)
>>>>     at org.json4s.MonadicJValue$$anonfun$org$json4s$MonadicJValue$$findDirectByName$1.apply(MonadicJValue.scala:22)
>>>>     at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251)
>>>>     at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251)
>>>>     at scala.collection.immutable.List.foreach(List.scala:318)
>>>>     at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:251)
>>>>     at scala.collection.AbstractTraversable.flatMap(Traversable.scala:105)
>>>>     at org.json4s.MonadicJValue.org$json4s$MonadicJValue$$findDirectByName(MonadicJValue.scala:22)
>>>>     at org.json4s.MonadicJValue.$bslash(MonadicJValue.scala:16)
>>>>     at org.apache.spark.util.JsonProtocol$.taskStartFromJson(JsonProtocol.scala:450)
>>>>     at org.apache.spark.util.JsonProtocol$.sparkEventFromJson(JsonProtocol.scala:423)
>>>>     at org.apache.spark.scheduler.ReplayListenerBus$$anonfun$replay$2$$anonfun$apply$1.apply(ReplayListenerBus.scala:71)
>>>>     at org.apache.spark.scheduler.ReplayListenerBus$$anonfun$replay$2$$anonfun$apply$1.apply(ReplayListenerBus.scala:69)
>>>>     at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>>>>     at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>>>>     at org.apache.spark.scheduler.ReplayListenerBus$$anonfun$replay$2.apply(ReplayListenerBus.scala:69)
>>>>     at org.apache.spark.scheduler.ReplayListenerBus$$anonfun$replay$2.apply(ReplayListenerBus.scala:55)
>>>>     at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
>>>>     at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:34)
>>>>     at org.apache.spark.scheduler.ReplayListenerBus.replay(ReplayListenerBus.scala:55)
>>>>     at org.apache.spark.deploy.master.Master.rebuildSparkUI(Master.scala:726)
>>>>     at org.apache.spark.deploy.master.Master.removeApplication(Master.scala:675)
>>>>     at org.apache.spark.deploy.master.Master.finishApplication(Master.scala:653)
>>>>     at org.apache.spark.deploy.master.Master$$anonfun$receiveWithLogging$1$$anonfun$applyOrElse$29.apply(Master.scala:399)
>>>>
>>>> Can anyone help?
>>>>
>>>> ..Manas
>>>
>>> --
>>>
>>> *Arush Kharbanda* || Technical Teamlead
>>> ar...@sigmoidanalytics.com || www.sigmoidanalytics.com
>
> --
>
> *Arush Kharbanda* || Technical Teamlead
> ar...@sigmoidanalytics.com || www.sigmoidanalytics.com
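(The trace shows the standalone Master running out of memory in Master.rebuildSparkUI while replaying the finished application's event log. Below is a minimal sketch of the master-memory increase mentioned at the top, assuming a plain standalone deployment configured through conf/spark-env.sh; on CDH the equivalent heap setting may instead be managed through Cloudera Manager:)

    # conf/spark-env.sh on the master host (standalone mode).
    # SPARK_DAEMON_MEMORY sizes the master and worker daemon JVMs themselves.
    export SPARK_DAEMON_MEMORY=512m

(If the event-log replay itself is the culprit, submitting with --conf spark.eventLog.enabled=false avoids the replay entirely, at the cost of losing the post-mortem web UI for finished applications.)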