Hi,

We are trying to use Spark DataFrames for our use case and are hitting the exception below.
The configuration parameters we changed are listed after the stack trace; kindly suggest if we are missing something.
We are on Spark 1.3.1, and the JIRA for this issue is still marked Open:
https://issues.apache.org/jira/browse/SPARK-4105
Kindly suggest if there is a workaround.

Exception:
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 88 in stage 40.0 failed 4 times, most recent failure: Lost task 88.3 in stage 40.0: java.io.IOException: FAILED_TO_UNCOMPRESS(5)
    at org.xerial.snappy.SnappyNative.throw_error(SnappyNative.java:78)
    at org.xerial.snappy.SnappyNative.rawUncompress(Native Method)
    at org.xerial.snappy.Snappy.rawUncompress(Snappy.java:391)
    at org.xerial.snappy.Snappy.uncompress(Snappy.java:427)
    at org.xerial.snappy.SnappyInputStream.readFully(SnappyInputStream.java:127)
    at org.xerial.snappy.SnappyInputStream.readHeader(SnappyInputStream.java:88)
    at org.xerial.snappy.SnappyInputStream.<init>(SnappyInputStream.java:58)
    at org.apache.spark.io.SnappyCompressionCodec.compressedInputStream(CompressionCodec.scala:160)
    at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$7.apply(TorrentBroadcast.scala:213)
    at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$7.apply(TorrentBroadcast.scala:213)
    at scala.Option.map(Option.scala:145)
    at org.apache.spark.broadcast.TorrentBroadcast$.unBlockifyObject(TorrentBroadcast.scala:213)
    at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:177)
    at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1153)
    at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:164)
    at org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:64)
    at org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:64)
    at org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:87)
    at org.apache.spark.broadcast.Broadcast.value(Broadcast.scala:70)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:61)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:64)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

Parameters changed (a sketch of how these can be set follows the list):
spark.akka.frameSize=50
spark.shuffle.memoryFraction=0.4
spark.storage.memoryFraction=0.5
spark.worker.timeout=120000
spark.storage.blockManagerSlaveTimeoutMs=120000
spark.akka.heartbeat.pauses=6000
spark.akka.heartbeat.interval=1000
spark.ui.port=21000
spark.port.maxRetries=50
spark.executor.memory=10G
spark.executor.instances=100
spark.driver.memory=8G
spark.executor.cores=2
spark.shuffle.compress=true
spark.io.compression.codec=snappy
spark.broadcast.compress=true
spark.rdd.compress=true
spark.worker.cleanup.enabled=true
spark.worker.cleanup.interval=600
spark.worker.cleanup.appDataTtl=600
spark.shuffle.consolidateFiles=true
spark.yarn.preserve.staging.files=false
spark.yarn.driver.memoryOverhead=1024
spark.yarn.executor.memoryOverhead=1024
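
For reference, below is a minimal sketch, assuming the settings are applied through SparkConf in the driver (they could equally be passed with spark-submit --conf or spark-defaults.conf); the object name, app name, and the subset of parameters shown are placeholders, not our actual job:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Minimal sketch for Spark 1.3.1; only a representative subset of the
// parameters listed above is shown, and the names here are placeholders.
object DataFrameJob {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("dataframe-job")                  // placeholder application name
      .set("spark.io.compression.codec", "snappy")  // codec in use when FAILED_TO_UNCOMPRESS(5) is thrown
      .set("spark.broadcast.compress", "true")      // broadcast blocks are snappy-compressed
      .set("spark.shuffle.compress", "true")
      .set("spark.rdd.compress", "true")
      .set("spark.executor.memory", "10G")
      .set("spark.executor.cores", "2")
      .set("spark.yarn.executor.memoryOverhead", "1024")

    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)

    // DataFrame work goes here; the failure surfaces on the executors while
    // they decompress the TorrentBroadcast blocks carrying the task binary
    // for a shuffle map stage.
    sc.stop()
  }
}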

Best Regards,
Prashant Singh Thakur
Mobile: +91-9740266522

