Yes, I am using YARN client mode, which is why I specified the AM settings too. What do you mean by "akka is moved out of the picture"? I am using Spark 1.5.1.
Sent from my iPhone

> On May 8, 2016, at 6:39 AM, Ted Yu <[email protected]> wrote:
>
> Are you using YARN client mode ?
>
> See
> https://spark.apache.org/docs/latest/running-on-yarn.html
>
> In cluster mode, spark.yarn.am.memory is not effective.
>
> For Spark 2.0, akka is moved out of the picture.
> FYI
>
>> On Sat, May 7, 2016 at 8:24 PM, Nirav Patel <[email protected]> wrote:
>> I have 20 executors, 6 cores each. Total 5 stages. It fails on 5th one. All
>> of them have 6474 tasks. 5th task is a count operations and it also performs
>> aggregateByKey as a part of it lazy evaluation.
>> I am setting:
>> spark.driver.memory=10G, spark.yarn.am.memory=2G and
>> spark.driver.maxResultSize=9G
>>
>> On a side note, could it be something to do with java serialization library,
>> ByteArrayOutputStream using byte array? Can it be replaced by some better
>> serializing library?
>>
>> https://bugs.openjdk.java.net/browse/JDK-8055949
>> https://bugs.openjdk.java.net/browse/JDK-8136527
>>
>> Thanks
>>
>>> On Sat, May 7, 2016 at 4:51 PM, Ashish Dubey <[email protected]> wrote:
>>> Driver maintains the complete metadata of application ( scheduling of
>>> executor and maintaining the messaging to control the execution )
>>> This code seems to be failing in that code path only. With that said there
>>> is Jvm overhead based on num of executors , stages and tasks in your app.
>>> Do you know your driver heap size and application structure ( num of stages
>>> and tasks )
>>>
>>> Ashish
>>>
>>>> On Saturday, May 7, 2016, Nirav Patel <[email protected]> wrote:
>>>> Right but this logs from spark driver and spark driver seems to use Akka.
>>>>
>>>> ERROR [sparkDriver-akka.actor.default-dispatcher-17]
>>>> akka.actor.ActorSystemImpl: Uncaught fatal error from thread
>>>> [sparkDriver-akka.remote.default-remote-dispatcher-5] shutting down
>>>> ActorSystem [sparkDriver]
>>>>
>>>> I saw following logs before above happened.
>>>>
>>>> 2016-05-06 09:49:17,813 INFO
>>>> [sparkDriver-akka.actor.default-dispatcher-17]
>>>> org.apache.spark.MapOutputTrackerMasterEndpoint: Asked to send map output
>>>> locations for shuffle 1 to hdn6.xactlycorporation.local:44503
>>>>
>>>> As far as I know driver is just driving shuffle operation but not actually
>>>> doing anything within its own system that will cause memory issue. Can you
>>>> explain in what circumstances I could see this error in driver logs? I
>>>> don't do any collect or any other driver operation that would cause this.
>>>> It fails when doing aggregateByKey operation but that should happen in
>>>> executor JVM NOT in driver JVM.
>>>>
>>>> Thanks
>>>>
>>>>> On Sat, May 7, 2016 at 11:58 AM, Ted Yu <[email protected]> wrote:
>>>>> bq. at akka.serialization.JavaSerializer.toBinary(Serializer.scala:129)
>>>>>
>>>>> It was Akka which uses JavaSerializer
>>>>>
>>>>> Cheers
>>>>>
>>>>>> On Sat, May 7, 2016 at 11:13 AM, Nirav Patel <[email protected]>
>>>>>> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I thought I was using kryo serializer for shuffle.
>>>>>> I could verify it from spark UI - Environment tab that
>>>>>> spark.serializer org.apache.spark.serializer.KryoSerializer
>>>>>> spark.kryo.registrator
>>>>>> com.myapp.spark.jobs.conf.SparkSerializerRegistrator
>>>>>>
>>>>>> But when I see following error in Driver logs it looks like spark is
>>>>>> using JavaSerializer
>>>>>>
>>>>>> 2016-05-06 09:49:26,490 ERROR
>>>>>> [sparkDriver-akka.actor.default-dispatcher-17]
>>>>>> akka.actor.ActorSystemImpl: Uncaught fatal error from thread
>>>>>> [sparkDriver-akka.remote.default-remote-dispatcher-6] shutting down
>>>>>> ActorSystem [sparkDriver]
>>>>>>
>>>>>> java.lang.OutOfMemoryError: Java heap space
>>>>>>     at java.util.Arrays.copyOf(Arrays.java:2271)
>>>>>>     at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:113)
>>>>>>     at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
>>>>>>     at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:140)
>>>>>>     at java.io.ObjectOutputStream$BlockDataOutputStream.drain(ObjectOutputStream.java:1876)
>>>>>>     at java.io.ObjectOutputStream$BlockDataOutputStream.setBlockDataMode(ObjectOutputStream.java:1785)
>>>>>>     at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1188)
>>>>>>     at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:347)
>>>>>>     at akka.serialization.JavaSerializer$$anonfun$toBinary$1.apply$mcV$sp(Serializer.scala:129)
>>>>>>     at akka.serialization.JavaSerializer$$anonfun$toBinary$1.apply(Serializer.scala:129)
>>>>>>     at akka.serialization.JavaSerializer$$anonfun$toBinary$1.apply(Serializer.scala:129)
>>>>>>     at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
>>>>>>     at akka.serialization.JavaSerializer.toBinary(Serializer.scala:129)
>>>>>>     at akka.remote.MessageSerializer$.serialize(MessageSerializer.scala:36)
>>>>>>     at akka.remote.EndpointWriter$$anonfun$serializeMessage$1.apply(Endpoint.scala:843)
>>>>>>     at akka.remote.EndpointWriter$$anonfun$serializeMessage$1.apply(Endpoint.scala:843)
>>>>>>     at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
>>>>>>     at akka.remote.EndpointWriter.serializeMessage(Endpoint.scala:842)
>>>>>>     at akka.remote.EndpointWriter.writeSend(Endpoint.scala:743)
>>>>>>     at akka.remote.EndpointWriter$$anonfun$2.applyOrElse(Endpoint.scala:718)
>>>>>>     at akka.actor.Actor$class.aroundReceive(Actor.scala:467)
>>>>>>     at akka.remote.EndpointActor.aroundReceive(Endpoint.scala:411)
>>>>>>     at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
>>>>>>     at akka.actor.ActorCell.invoke(ActorCell.scala:487)
>>>>>>     at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
>>>>>>     at akka.dispatch.Mailbox.run(Mailbox.scala:220)
>>>>>>     at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397)
>>>>>>     at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>>>>>>     at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>>>>>>     at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>>>>>>     at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>>>>>>
>>>>>> What I am missing here?
>>>>>>
>>>>>> Thanks
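
For anyone reading the stack trace quoted above: the frames from ObjectOutputStream.writeObject down to Arrays.copyOf are plain Java serialization writing into an in-memory ByteArrayOutputStream. When that buffer runs out of room it allocates a larger array and copies the old contents over, so the old and the new array are both live for a moment, and the whole serialized message has to fit on the heap of the JVM doing the serialization. Below is a minimal standalone sketch of that same pattern. It is not Spark or Akka code; the class name JavaSerializationSketch, the Payload type and the array size are made up for illustration only.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class JavaSerializationSketch {

    // Hypothetical stand-in for a large message; NOT a Spark or Akka class.
    public static final class Payload implements Serializable {
        private static final long serialVersionUID = 1L;
        public final long[] values;

        public Payload(int n) {
            this.values = new long[n];
        }
    }

    // Same shape as the frames in the trace: an ObjectOutputStream writing
    // into a ByteArrayOutputStream, so the entire message is built on the heap.
    public static byte[] toBinary(Object message) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(message);
        }
        // toByteArray() returns a copy of the internal buffer, which is
        // one more allocation of the full message size.
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        Payload payload = new Payload(1_000_000); // 8 MB of raw long data (assumed size)
        byte[] bytes = toBinary(payload);
        System.out.println("raw data bytes:   " + (8L * payload.values.length));
        System.out.println("serialized bytes: " + bytes.length);
    }
}
```

The serialized size is slightly larger than the raw data because of the stream header and class descriptor. The point of the sketch is only that the memory is consumed in whichever JVM calls writeObject, which in the trace above is the driver.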
