The driver maintains the complete metadata of the application (scheduling executors and handling the messaging that controls execution). This code appears to be failing in exactly that code path. That said, there is JVM overhead on the driver that grows with the number of executors, stages, and tasks in your app. Do you know your driver heap size and your application structure (number of stages and tasks)?
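To make the question about application structure concrete, here is a rough back-of-the-envelope sketch in Python. It assumes, purely for illustration, that the driver tracks about one byte of map output metadata per (map task, reduce partition) pair. The function names, the one-byte figure, and the example task counts are all assumptions, not numbers taken from this job or from Spark's documentation.

```python
def estimate_map_output_metadata_bytes(num_map_tasks, num_reduce_partitions,
                                       bytes_per_entry=1):
    """Rough size of the per-shuffle map output metadata held by the driver.

    Assumption (hypothetical): one entry per (map task, reduce partition)
    pair, each costing `bytes_per_entry` bytes before any compression.
    """
    if num_map_tasks < 0 or num_reduce_partitions < 0 or bytes_per_entry < 0:
        raise ValueError("counts and entry size must be non-negative")
    return num_map_tasks * num_reduce_partitions * bytes_per_entry


def to_mib(num_bytes):
    """Convert a byte count to mebibytes."""
    return num_bytes / (1024 * 1024)


# Hypothetical job shapes, chosen only to show how the product grows.
for maps, reduces in [(1_000, 1_000), (10_000, 10_000), (50_000, 20_000)]:
    size = estimate_map_output_metadata_bytes(maps, reduces)
    print(f"{maps} map tasks x {reduces} reduce partitions ~ {to_mib(size):.1f} MiB")
```

If an estimate like this lands anywhere near the driver heap size, it may be worth raising the driver memory or reducing the number of partitions before looking elsewhere.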
Ashish

On Saturday, May 7, 2016, Nirav Patel <npa...@xactlycorp.com> wrote:

> Right but this logs from spark driver and spark driver seems to use Akka.
>
> ERROR [sparkDriver-akka.actor.default-dispatcher-17] akka.actor.ActorSystemImpl: Uncaught fatal error from thread [sparkDriver-akka.remote.default-remote-dispatcher-5] shutting down ActorSystem [sparkDriver]
>
> I saw following logs before above happened.
>
> 2016-05-06 09:49:17,813 INFO [sparkDriver-akka.actor.default-dispatcher-17] org.apache.spark.MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 1 to hdn6.xactlycorporation.local:44503
>
> As far as I know driver is just driving shuffle operation but not actually
> doing anything within its own system that will cause memory issue. Can you
> explain in what circumstances I could see this error in driver logs? I
> don't do any collect or any other driver operation that would cause this.
> It fails when doing aggregateByKey operation but that should happen in
> executor JVM NOT in driver JVM.
>
> Thanks
>
> On Sat, May 7, 2016 at 11:58 AM, Ted Yu <yuzhih...@gmail.com> wrote:
>
>> bq. at akka.serialization.JavaSerializer.toBinary(Serializer.scala:129)
>>
>> It was Akka which uses JavaSerializer
>>
>> Cheers
>>
>> On Sat, May 7, 2016 at 11:13 AM, Nirav Patel <npa...@xactlycorp.com> wrote:
>>
>>> Hi,
>>>
>>> I thought I was using kryo serializer for shuffle. I could verify it
>>> from spark UI - Environment tab that
>>>
>>> spark.serializer org.apache.spark.serializer.KryoSerializer
>>> spark.kryo.registrator com.myapp.spark.jobs.conf.SparkSerializerRegistrator
>>>
>>> But when I see following error in Driver logs it looks like spark is
>>> using JavaSerializer
>>>
>>> 2016-05-06 09:49:26,490 ERROR [sparkDriver-akka.actor.default-dispatcher-17] akka.actor.ActorSystemImpl: Uncaught fatal error from thread [sparkDriver-akka.remote.default-remote-dispatcher-6] shutting down ActorSystem [sparkDriver]
>>> java.lang.OutOfMemoryError: Java heap space
>>>     at java.util.Arrays.copyOf(Arrays.java:2271)
>>>     at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:113)
>>>     at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
>>>     at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:140)
>>>     at java.io.ObjectOutputStream$BlockDataOutputStream.drain(ObjectOutputStream.java:1876)
>>>     at java.io.ObjectOutputStream$BlockDataOutputStream.setBlockDataMode(ObjectOutputStream.java:1785)
>>>     at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1188)
>>>     at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:347)
>>>     at akka.serialization.JavaSerializer$$anonfun$toBinary$1.apply$mcV$sp(Serializer.scala:129)
>>>     at akka.serialization.JavaSerializer$$anonfun$toBinary$1.apply(Serializer.scala:129)
>>>     at akka.serialization.JavaSerializer$$anonfun$toBinary$1.apply(Serializer.scala:129)
>>>     at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
>>>     at akka.serialization.JavaSerializer.toBinary(Serializer.scala:129)
>>>     at akka.remote.MessageSerializer$.serialize(MessageSerializer.scala:36)
>>>     at akka.remote.EndpointWriter$$anonfun$serializeMessage$1.apply(Endpoint.scala:843)
>>>     at akka.remote.EndpointWriter$$anonfun$serializeMessage$1.apply(Endpoint.scala:843)
>>>     at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
>>>     at akka.remote.EndpointWriter.serializeMessage(Endpoint.scala:842)
>>>     at akka.remote.EndpointWriter.writeSend(Endpoint.scala:743)
>>>     at akka.remote.EndpointWriter$$anonfun$2.applyOrElse(Endpoint.scala:718)
>>>     at akka.actor.Actor$class.aroundReceive(Actor.scala:467)
>>>     at akka.remote.EndpointActor.aroundReceive(Endpoint.scala:411)
>>>     at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
>>>     at akka.actor.ActorCell.invoke(ActorCell.scala:487)
>>>     at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
>>>     at akka.dispatch.Mailbox.run(Mailbox.scala:220)
>>>     at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
>>>     at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>>>     at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>>>     at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>>>     at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>>>
>>> What I am missing here?
>>>
>>> Thanks