While the application has heap space, batches complete in well under 15 seconds.
Once it starts to run out of space, processing time rises to about 1.5 minutes,
with a scheduling delay of around 4 minutes and a total delay of around 5.5
minutes. I usually shut it down at that point.
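The figures above fit the relationship shown in the Spark Streaming UI, where total delay is scheduling delay plus processing time (4 min + 1.5 min = 5.5 min). The following is only a rough sketch of why the delay keeps climbing once processing time exceeds the 15-second batch interval. It assumes batches are processed one at a time and that every batch takes the same time, which is a simplification; the function names are made up for illustration.

```python
def total_delay(scheduling_delay_s, processing_time_s):
    """Total delay as reported by the Streaming UI: waiting time plus run time."""
    return scheduling_delay_s + processing_time_s


def scheduling_delay_after(n_batches, batch_interval_s, processing_time_s):
    """Scheduling delay seen by batch number n (0-based).

    Assumption: batches run strictly one after another and each takes
    processing_time_s. Every batch that overruns the interval pushes the
    next one back by the overrun, so the delay grows linearly.
    """
    overrun_s = max(0.0, processing_time_s - batch_interval_s)
    return n_batches * overrun_s


# Figures from the message: 4 min scheduling delay, 1.5 min processing time.
print(total_delay(240, 90) / 60.0, "minutes total delay")

# With a 15 s interval and 90 s batches, each batch adds 75 s of backlog.
for n in range(5):
    print(n, scheduling_delay_after(n, 15, 90), "s scheduling delay")
```

Under these assumptions the scheduling delay passes 4 minutes after only four slow batches, so the application cannot recover unless processing time drops back below the batch interval.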
The number of stages (and pending stages) does seem to be quite high and
increases over time.
4584  foreachRDD at HDFSPersistence.java:52        2016/05/30 16:23:52  1.9 min  36/36 (4964 skipped)  285/285 (28026 skipped)
4586  transformToPair at SampleCalculator.java:88  2016/05/30 16:25:02  0.2 s    1/1                   4/4
4585  (Unknown Stage Name)                         2016/05/30 16:23:52  1.2 min  1/1                   1/1
4582  (Unknown Stage Name)                         2016/05/30 16:21:51  48 s     1/1 (4063 skipped)    12/12 (22716 skipped)
4583  (Unknown Stage Name)                         2016/05/30 16:21:51  48 s     1/1                   1/1
4580  (Unknown Stage Name)                         2016/05/30 16:16:38  4.0 min  36/36 (4879 skipped)  285/285 (27546 skipped)
4581  (Unknown Stage Name)                         2016/05/30 16:16:38  0.1 s    1/1                   4/4
4579  (Unknown Stage Name)                         2016/05/30 16:15:53  45 s     1/1                   1/1
4578  (Unknown Stage Name)                         2016/05/30 16:14:38  1.3 min  1/1 (3993 skipped)    12/12 (22326 skipped)
4577  (Unknown Stage Name)                         2016/05/30 16:14:37  0.8 s    1/1                   1/1
Is this what you mean by pending stages?
I have taken a few heap dumps, but I’m not sure what to look for to identify
the problematic classes.
From: Shahbaz [mailto:[email protected]]
Sent: 2016, May, 30 3:25 PM
To: Dancuart, Christian
Cc: user
Subject: Re: Spark Streaming heap space out of memory
Hi Christian,
* What is the processing time of each batch? Is it exceeding 15 seconds?
* How many jobs are queued?
* Can you take a heap dump and see which objects are occupying the heap?
Regards,
Shahbaz
On Tue, May 31, 2016 at 12:21 AM, [email protected] wrote:
Hi All,
We have a Spark Streaming v1.4 / Java 8 application that slows down and
eventually runs out of heap space. The less driver memory it has, the faster
this happens.
Appended are our Spark configuration and a snapshot of the heap taken
using jmap on the driver process. The RDDInfo, $colon$colon and [C objects
keep growing as we observe. We also tried G1GC, but it behaves the same way.
Our dependency graph contains multiple updateStateByKey() calls. For each,
we explicitly set the checkpoint interval to 240 seconds.
Our batch interval is set to 15 seconds, and there are no delays at the start
of the process.
Spark configuration (Spark Driver Memory: 6GB, Spark Executor Memory: 2GB):
spark.streaming.minRememberDuration=180s
spark.ui.showConsoleProgress=false
spark.streaming.receiver.writeAheadLog.enable=true
spark.streaming.unpersist=true
spark.streaming.stopGracefullyOnShutdown=true
spark.streaming.ui.retainedBatches=10
spark.ui.retainedJobs=10
spark.ui.retainedStages=10
spark.worker.ui.retainedExecutors=10
spark.worker.ui.retainedDrivers=10
spark.sql.ui.retainedExecutions=10
spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.kryoserializer.buffer.max=128m
 num     #instances         #bytes  class name
----------------------------------------------
   1:       8828200      565004800  org.apache.spark.storage.RDDInfo
   2:      20794893      499077432  scala.collection.immutable.$colon$colon
   3:       9646097      459928736  [C
   4:       9644398      231465552  java.lang.String
   5:      12760625      204170000  java.lang.Integer
   6:         21326      111198632  [B
   7:        556959       44661232  [Lscala.collection.mutable.HashEntry;
   8:       1179788       37753216  java.util.concurrent.ConcurrentHashMap$Node
   9:       1169264       37416448  java.util.Hashtable$Entry
  10:        552707       30951592  org.apache.spark.scheduler.StageInfo
  11:        367107       23084712  [Ljava.lang.Object;
  12:        556948       22277920  scala.collection.mutable.HashMap
  13:          2787       22145568  [Ljava.util.concurrent.ConcurrentHashMap$Node;
  14:        116997       12167688  org.apache.spark.executor.TaskMetrics
  15:        360425        8650200  java.util.concurrent.LinkedBlockingQueue$Node
  16:        360417        8650008  org.apache.spark.deploy.history.yarn.HandleSparkEvent
  17:          8332        8478088  [Ljava.util.Hashtable$Entry;
  18:        351061        8425464  scala.collection.mutable.ArrayBuffer
  19:        116963        8421336  org.apache.spark.scheduler.TaskInfo
  20:        446136        7138176  scala.Some
  21:        211968        5087232  io.netty.buffer.PoolThreadCache$MemoryRegionCache$Entry
  22:        116963        4678520  org.apache.spark.scheduler.SparkListenerTaskEnd
  23:        107679        4307160  org.apache.spark.executor.ShuffleWriteMetrics
  24:         72162        4041072  org.apache.spark.executor.ShuffleReadMetrics
  25:        117223        3751136  scala.collection.mutable.ListBuffer
  26:         81473        3258920  org.apache.spark.executor.InputMetrics
  27:        125903        3021672  org.apache.spark.rdd.RDDOperationScope
  28:         91455        2926560  java.util.HashMap$Node
  29:            89        2917776  [Lscala.concurrent.forkjoin.ForkJoinTask;
  30:        116957        2806968  org.apache.spark.scheduler.SparkListenerTaskStart
  31:          2122        2188568  [Lorg.apache.spark.scheduler.StageInfo;
  32:         16411        1819816  java.lang.Class
  33:         87862        1405792  org.apache.spark.scheduler.SparkListenerUnpersistRDD
  34:         22915         916600  org.apache.spark.storage.BlockStatus
  35:          5887         895568  [Ljava.util.HashMap$Node;
  36:           480         855552  [Lio.netty.buffer.PoolThreadCache$MemoryRegionCache$Entry;
  37:          7569         834968  [I
  38:          9626         770080  org.apache.spark.rdd.MapPartitionsRDD
  39:         31748         761952  java.lang.Long
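Since the question in this thread is which classes keep growing, one way to read the histogram is to take two `jmap -histo` snapshots some minutes apart and rank classes by the increase in bytes. The script below is a minimal sketch of that idea, not part of the original setup. The helper names `parse_histo` and `growth` are made up for illustration, and the sample rows are just the first three lines of the histogram above.

```python
import re

# Matches data rows such as
# "   1:       8828200      565004800  org.apache.spark.storage.RDDInfo"
ROW = re.compile(r"^\s*(\d+):\s+(\d+)\s+(\d+)\s+(\S+)")


def parse_histo(text):
    """Return {class_name: (instances, bytes)} from `jmap -histo` output.

    Header, separator and any other non-matching lines are ignored.
    """
    result = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            result[match.group(4)] = (int(match.group(2)), int(match.group(3)))
    return result


def growth(before, after, top=5):
    """Rank classes by the increase in bytes between two parsed snapshots."""
    deltas = []
    for name, (_, bytes_after) in after.items():
        bytes_before = before.get(name, (0, 0))[1]
        deltas.append((bytes_after - bytes_before, name))
    deltas.sort(reverse=True)
    return [(name, delta) for delta, name in deltas[:top]]


# First three rows of the histogram posted in this thread.
sample = """\
 num     #instances         #bytes  class name
----------------------------------------------
   1:       8828200      565004800  org.apache.spark.storage.RDDInfo
   2:      20794893      499077432  scala.collection.immutable.$colon$colon
   3:       9646097      459928736  [C
"""
snapshot = parse_histo(sample)
print(snapshot["org.apache.spark.storage.RDDInfo"])
```

To compare two real snapshots, parse each saved jmap output with `parse_histo` and pass the earlier and later results to `growth`. Classes that stay at the top across several comparisons are the ones worth tracing in a full heap dump.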
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-heap-space-out-of-memory-tp27050.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.