Which version of Spark are you using? Did you try setting spark.shuffle.blockTransferService=nio?
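For reference, that property can be passed straight through spark-submit (just a sketch; if I remember right the property only exists on the 1.2 through 1.5 releases, where netty is the default and nio is the legacy fallback, and the application jar and remaining flags are elided here):

    # fall back to the NIO-based shuffle block transfer service (pre-1.6 releases only)
    spark-submit --conf spark.shuffle.blockTransferService=nio ...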
Thanks
Best Regards

On Sat, Apr 18, 2015 at 11:14 PM, roy <rp...@njit.edu> wrote:
> Hi,
>
> My Spark job is failing with the following error message:
>
> org.apache.spark.shuffle.FetchFailedException:
> /mnt/ephemeral12/yarn/nm/usercache/abc/appcache/application_1429353954024_1691/spark-local-20150418132335-0723/28/shuffle_3_1_0.index
> (No such file or directory)
>         at org.apache.spark.shuffle.hash.BlockStoreShuffleFetcher$.org$apache$spark$shuffle$hash$BlockStoreShuffleFetcher$$unpackBlock$1(BlockStoreShuffleFetcher.scala:67)
>         at org.apache.spark.shuffle.hash.BlockStoreShuffleFetcher$$anonfun$3.apply(BlockStoreShuffleFetcher.scala:83)
>         at org.apache.spark.shuffle.hash.BlockStoreShuffleFetcher$$anonfun$3.apply(BlockStoreShuffleFetcher.scala:83)
>         at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
>         at org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:32)
>         at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
>         at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
>         at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
>         at org.apache.spark.graphx.impl.EdgePartition.updateVertices(EdgePartition.scala:89)
>         at org.apache.spark.graphx.impl.ReplicatedVertexView$$anonfun$2$$anonfun$apply$1.apply(ReplicatedVertexView.scala:75)
>         at org.apache.spark.graphx.impl.ReplicatedVertexView$$anonfun$2$$anonfun$apply$1.apply(ReplicatedVertexView.scala:73)
>         at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>         at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
>         at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:210)
>         at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:65)
>         at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
>         at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>         at org.apache.spark.scheduler.Task.run(Task.scala:56)
>         at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.FileNotFoundException:
> /mnt/ephemeral12/yarn/nm/usercache/abc/appcache/application_1429353954024_1691/spark-local-20150418132335-0723/28/shuffle_3_1_0.index
> (No such file or directory)
>         at java.io.FileInputStream.open(Native Method)
>         at java.io.FileInputStream.<init>(FileInputStream.java:146)
>         at org.apache.spark.shuffle.IndexShuffleBlockManager.getBlockData(IndexShuffleBlockManager.scala:109)
>         at org.apache.spark.storage.BlockManager.getBlockData(BlockManager.scala:305)
>         at org.apache.spark.storage.ShuffleBlockFetcherIterator.fetchLocalBlocks(ShuffleBlockFetcherIterator.scala:235)
>         at org.apache.spark.storage.ShuffleBlockFetcherIterator.initialize(ShuffleBlockFetcherIterator.scala:268)
>         at org.apache.spark.storage.ShuffleBlockFetcherIterator.<init>(ShuffleBlockFetcherIterator.scala:115)
>         at org.apache.spark.shuffle.hash.BlockStoreShuffleFetcher$.fetch(BlockStoreShuffleFetcher.scala:76)
>         at org.apache.spark.shuffle.hash.HashShuffleReader.read(HashShuffleReader.scala:40)
>         at org.apache.spark.rdd.ShuffledRDD.compute(ShuffledRDD.scala:92)
>         at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:263)
>         at org.apache.spark.rdd.RDD.iterator(RDD.scala:230)
>         at org.apache.spark.rdd.ZippedPartitionsRDD2.compute(ZippedPartitionsRDD.scala:88)
>         at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:263)
>         at org.apache.spark.rdd.RDD.iterator(RDD.scala:230)
>         at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
>         at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:263)
>         at org.apache.spark.rdd.RDD.iterator(RDD.scala:230)
>         ... 7 more
>
> My job conf was:
>
> --num-executors 30 --driver-memory 3G --executor-memory 5G \
> --conf spark.yarn.executor.memoryOverhead=1000 \
> --conf spark.yarn.driver.memoryOverhead=1000 \
> --conf spark.akka.frameSize=100000000 \
> --executor-cores 3
>
> Any idea why it's failing on shuffle fetch?
>
> thanks
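One more thing stands out in that conf: spark.akka.frameSize is specified in MB, not bytes, so 100000000 is asking for roughly 95 TB frames and is almost certainly not doing what you intended. FetchFailedException on YARN is also often a symptom of an executor being killed for exceeding its memory limits, so raising spark.yarn.executor.memoryOverhead is worth a try. A revised submit might look like this (a sketch only; the sizes are illustrative guesses to tune for your cluster, not known-good values):

    # frameSize and the memoryOverhead settings are all in MB
    spark-submit \
      --num-executors 30 --executor-cores 3 \
      --driver-memory 3G --executor-memory 5G \
      --conf spark.akka.frameSize=128 \
      --conf spark.yarn.executor.memoryOverhead=2048 \
      --conf spark.yarn.driver.memoryOverhead=1024 \
      ...   # application jar and arguments elided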