Hi all, I'm using Spark 1.3.1 and ran the following code:
sc.textFile(path)
  .map(line => (getEntId(line), line))
  .persist(StorageLevel.MEMORY_AND_DISK)
  .groupByKey
  .flatMap(x => func(x))
  .reduceByKey((a, b) => (a + b).toShort)

I get the following error in the flatMap() stage, and it's very hard to understand what's happening. Can someone please help me with this?

Also, how do I force Spark not to use /tmp? I tried changing java.io.tmpdir and spark.local.dir, but without success.

Thanks!

java.io.FileNotFoundException: /tmp/spark-f6dd2621-6be0-4ddb-a31f-97e0f77eae87/spark-8391db48-fb2b-44a2-b34a-153b16b97292/spark-03e37c8c-95c7-47d4-9ac7-a69fbdfdf3a6/blockmgr-ed70dbea-5747-4765-8942-30dd0b1a4258/16/shuffle_2_31_0.data (No such file or directory)
        at java.io.FileOutputStream.open(Native Method)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:221)
        at org.apache.spark.storage.DiskBlockObjectWriter.open(BlockObjectWriter.scala:130)
        at org.apache.spark.storage.DiskBlockObjectWriter.write(BlockObjectWriter.scala:201)
        at org.apache.spark.util.collection.ExternalSorter$$anonfun$writePartitionedFile$5$$anonfun$apply$2.apply(ExternalSorter.scala:759)
        at org.apache.spark.util.collection.ExternalSorter$$anonfun$writePartitionedFile$5$$anonfun$apply$2.apply(ExternalSorter.scala:758)
        at scala.collection.Iterator$class.foreach(Iterator.scala:727)
        at org.apache.spark.util.collection.ExternalSorter$IteratorForPartition.foreach(ExternalSorter.scala:823)
        at org.apache.spark.util.collection.ExternalSorter$$anonfun$writePartitionedFile$5.apply(ExternalSorter.scala:758)
        at org.apache.spark.util.collection.ExternalSorter$$anonfun$writePartitionedFile$5.apply(ExternalSorter.scala:754)
        at scala.collection.Iterator$class.foreach(Iterator.scala:727)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
        at org.apache.spark.util.collection.ExternalSorter.writePartitionedFile(ExternalSorter.scala:754)
        at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:71)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
        at org.apache.spark.scheduler.Task.run(Task.scala:64)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Shuffle-strange-error-tp23179.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
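P.S. For reference, this is roughly how I tried to point spark.local.dir away from /tmp (the path and app name below are just placeholders, not my real values). My understanding from the docs is that this setting is only read when the SparkContext starts, so it has to be set up front in SparkConf or spark-defaults.conf, and that on YARN/Mesos it is overridden by the cluster manager's own local-dir settings:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Sketch only: "/data/spark-scratch" and the app name are placeholders.
// spark.local.dir is read at SparkContext startup, so setting it later
// (or via System.setProperty at runtime) has no effect on the executors.
val conf = new SparkConf()
  .setAppName("shuffle-local-dir-test")
  .set("spark.local.dir", "/data/spark-scratch")
val sc = new SparkContext(conf)
```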