Try setting --driver-memory xg, where x is as large as you can set.
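The "dag-scheduler-event-loop" thread in the trace below runs in the driver, so the driver heap is the one to raise. A rough sketch of the submit command, assuming you launch with spark-submit; the 8g value, the class name, and the jar name are placeholders I made up, not taken from your setup:

```shell
# Sketch only: raise the driver heap when submitting the job.
# 8g, com.example.WriteToKafka and my-job.jar are hypothetical placeholders;
# use the largest value your driver host can actually spare.
spark-submit \
  --driver-memory 8g \
  --executor-memory 10g \
  --class com.example.WriteToKafka \
  my-job.jar
```

The --executor-memory value just repeats the 10G you already use.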
On Monday, July 18, 2016 6:31 PM, Saurav Sinha <[email protected]>
wrote:
Hi,
I am running a Spark job.
Master memory: 5G
Executor memory: 10G (running on 4 nodes)
My job is getting killed as the number of partitions increases to 20K.
16/07/18 14:53:13 INFO DAGScheduler: Got job 17 (foreachPartition at WriteToKafka.java:45) with 13524 output partitions (allowLocal=false)
16/07/18 14:53:13 INFO DAGScheduler: Final stage: ResultStage 640(foreachPartition at WriteToKafka.java:45)
16/07/18 14:53:13 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 518, ShuffleMapStage 639)
16/07/18 14:53:23 INFO DAGScheduler: Missing parents: List()
16/07/18 14:53:23 INFO DAGScheduler: Submitting ResultStage 640 (MapPartitionsRDD[271] at map at BuildSolrDocs.java:209), which has no missing parents
16/07/18 14:53:23 INFO MemoryStore: ensureFreeSpace(8248) called with curMem=41923262, maxMem=2778778828
16/07/18 14:53:23 INFO MemoryStore: Block broadcast_90 stored as values in memory (estimated size 8.1 KB, free 2.5 GB)
Exception in thread "dag-scheduler-event-loop" java.lang.OutOfMemoryError: Java heap space
        at org.apache.spark.util.io.ByteArrayChunkOutputStream.allocateNewChunkIfNeeded(ByteArrayChunkOutputStream.scala:66)
        at org.apache.spark.util.io.ByteArrayChunkOutputStream.write(ByteArrayChunkOutputStream.scala:55)
        at org.xerial.snappy.SnappyOutputStream.dumpOutput(SnappyOutputStream.java:294)
        at org.xerial.snappy.SnappyOutputStream.flush(SnappyOutputStream.java:273)
        at org.apache.spark.io.SnappyOutputStreamWrapper.flush(CompressionCodec.scala:197)
        at java.io.ObjectOutputStream$BlockDataOutputStream.flush(ObjectOutputStream.java:1822)
Help needed.
--
Thanks and Regards,
Saurav Sinha
Contact: 9742879062