Hi Christian can you please try if 30seconds works for your case .I think your batches are getting queued up .Regards Shahbaz
On Tuesday 31 May 2016, Dancuart, Christian <[email protected]> wrote: > While it has heap space, batches run well below 15 seconds. > > > > Once it starts to run out of space, processing time takes about 1.5 > minutes. Scheduling delay is around 4 minutes and total delay around 5.5 > minutes. I usually shut it down at that point. > > > > The number of stages (and pending stages) does seem to be quite high and > increases over time. > > > > 4584 foreachRDD at HDFSPersistence.java:52 2016/05/30 16:23:52 1.9 > min 36/36 (4964 skipped) 285/285 (28026 skipped) > > 4586 transformToPair at SampleCalculator.java:88 2016/05/30 > 16:25:02 0.2 s 1/1 4/4 > > 4585 (Unknown Stage Name) 2016/05/30 16:23:52 1.2 > min 1/1 1/1 > > 4582 (Unknown Stage Name) 2016/05/30 16:21:51 48 s 1/1 > (4063 skipped) 12/12 (22716 skipped) > > 4583 (Unknown Stage Name) 2016/05/30 16:21:51 48 s > 1/1 1/1 > > 4580 (Unknown Stage Name) 2016/05/30 16:16:38 4.0 > min 36/36 (4879 skipped) 285/285 (27546 skipped) > > 4581 (Unknown Stage Name) 2016/05/30 16:16:38 0.1 s > 1/1 4/4 > > 4579 (Unknown Stage Name) 2016/05/30 16:15:53 45 s > 1/1 1/1 > > 4578 (Unknown Stage Name) 2016/05/30 16:14:38 1.3 > min 1/1 (3993 skipped) 12/12 (22326 skipped) > > 4577 (Unknown Stage Name) 2016/05/30 16:14:37 0.8 s > 1/1 1/1Is this what you mean by pending stages? > > > > I have taken a few heap dumps but I’m not sure what I am looking at for > the problematic classes. > > > > *From:* Shahbaz [mailto:[email protected] > <javascript:_e(%7B%7D,'cvml','[email protected]');>] > *Sent:* 2016, May, 30 3:25 PM > *To:* Dancuart, Christian > *Cc:* user > *Subject:* Re: Spark Streaming heap space out of memory > > > > Hi Christian, > > > > - What is the processing time of each of your Batch,is it exceeding 15 > seconds. > - How many jobs are queued. > - Can you take a heap dump and see which objects are occupying the > heap. > > > > Regards, > > Shahbaz > > > > > > On Tue, May 31, 2016 at 12:21 AM, [email protected] > <javascript:_e(%7B%7D,'cvml','[email protected]');> < > [email protected] > <javascript:_e(%7B%7D,'cvml','[email protected]');>> wrote: > > Hi All, > > We have a spark streaming v1.4/java 8 application that slows down and > eventually runs out of heap space. The less driver memory, the faster it > happens. > > Appended is our spark configuration and a snapshot of the of heap taken > using jmap on the driver process. The RDDInfo, $colon$colon and [C objects > keep growing as we observe. We also tried to use G1GC, but it acts the > same. > > Our dependency graph contains multiple updateStateByKey() calls. For each, > we explicitly set the checkpoint interval to 240 seconds. > > We have our batch interval set to 15 seconds; with no delays at the start > of > the process. > > Spark configuration (Spark Driver Memory: 6GB, Spark Executor Memory: 2GB): > spark.streaming.minRememberDuration=180s > spark.ui.showConsoleProgress=false > spark.streaming.receiver.writeAheadLog.enable=true > spark.streaming.unpersist=true > spark.streaming.stopGracefullyOnShutdown=true > spark.streaming.ui.retainedBatches=10 > spark.ui.retainedJobs=10 > spark.ui.retainedStages=10 > spark.worker.ui.retainedExecutors=10 > spark.worker.ui.retainedDrivers=10 > spark.sql.ui.retainedExecutions=10 > spark.serializer=org.apache.spark.serializer.KryoSerializer > spark.kryoserializer.buffer.max=128m > > num #instances #bytes class name > ---------------------------------------------- > 1: 8828200 565004800 org.apache.spark.storage.RDDInfo > 2: 20794893 499077432 scala.collection.immutable.$colon$colon > 3: 9646097 459928736 [C > 4: 9644398 231465552 java.lang.String > 5: 12760625 204170000 java.lang.Integer > 6: 21326 111198632 [B > 7: 556959 44661232 [Lscala.collection.mutable.HashEntry; > 8: 1179788 37753216 > java.util.concurrent.ConcurrentHashMap$Node > 9: 1169264 37416448 java.util.Hashtable$Entry > 10: 552707 30951592 org.apache.spark.scheduler.StageInfo > 11: 367107 23084712 [Ljava.lang.Object; > 12: 556948 22277920 scala.collection.mutable.HashMap > 13: 2787 22145568 > [Ljava.util.concurrent.ConcurrentHashMap$Node; > 14: 116997 12167688 org.apache.spark.executor.TaskMetrics > 15: 360425 8650200 > java.util.concurrent.LinkedBlockingQueue$Node > 16: 360417 8650008 > org.apache.spark.deploy.history.yarn.HandleSparkEvent > 17: 8332 8478088 [Ljava.util.Hashtable$Entry; > 18: 351061 8425464 scala.collection.mutable.ArrayBuffer > 19: 116963 8421336 org.apache.spark.scheduler.TaskInfo > 20: 446136 7138176 scala.Some > 21: 211968 5087232 > io.netty.buffer.PoolThreadCache$MemoryRegionCache$Entry > 22: 116963 4678520 > org.apache.spark.scheduler.SparkListenerTaskEnd > 23: 107679 4307160 > org.apache.spark.executor.ShuffleWriteMetrics > 24: 72162 4041072 > org.apache.spark.executor.ShuffleReadMetrics > 25: 117223 3751136 scala.collection.mutable.ListBuffer > 26: 81473 3258920 org.apache.spark.executor.InputMetrics > 27: 125903 3021672 org.apache.spark.rdd.RDDOperationScope > 28: 91455 2926560 java.util.HashMap$Node > 29: 89 2917776 > [Lscala.concurrent.forkjoin.ForkJoinTask; > 30: 116957 2806968 > org.apache.spark.scheduler.SparkListenerTaskStart > 31: 2122 2188568 [Lorg.apache.spark.scheduler.StageInfo; > 32: 16411 1819816 java.lang.Class > 33: 87862 1405792 > org.apache.spark.scheduler.SparkListenerUnpersistRDD > 34: 22915 916600 org.apache.spark.storage.BlockStatus > 35: 5887 895568 [Ljava.util.HashMap$Node; > 36: 480 855552 > [Lio.netty.buffer.PoolThreadCache$MemoryRegionCache$Entry; > 37: 7569 834968 [I > 38: 9626 770080 org.apache.spark.rdd.MapPartitionsRDD > 39: 31748 761952 java.lang.Long > > > > > -- > View this message in context: > http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-heap-space-out-of-memory-tp27050.html > Sent from the Apache Spark User List mailing list archive at Nabble.com. > > --------------------------------------------------------------------- > To unsubscribe, e-mail: [email protected] > <javascript:_e(%7B%7D,'cvml','[email protected]');> > For additional commands, e-mail: [email protected] > <javascript:_e(%7B%7D,'cvml','[email protected]');> > > > > _______________________________________________________________________ > > If you received this email in error, please advise the sender (by return > email or otherwise) immediately. You have consented to receive the attached > electronically at the above-noted email address; please retain a copy of > this confirmation for future reference. > > Si vous recevez ce courriel par erreur, veuillez en aviser l'expéditeur > immédiatement, par retour de courriel ou par un autre moyen. Vous avez > accepté de recevoir le(s) document(s) ci-joint(s) par voie électronique à > l'adresse courriel indiquée ci-dessus; veuillez conserver une copie de > cette confirmation pour les fins de reference future. > >
