Hi all,
My code was working fine in Spark 1.0.2, but after upgrading to 1.1.0 it
throws exceptions and tasks fail.
The code contains some map and filter transformations followed by a groupByKey
(a reduceByKey in another job); a rough sketch of the shape is below.
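This is only a minimal sketch of the job structure, not the actual code: the
input path, the keying on the first field, and the filter predicate are
placeholders.

  import org.apache.spark.{SparkConf, SparkContext}
  import org.apache.spark.SparkContext._  // pair-RDD implicits in 1.x

  val conf = new SparkConf().setAppName("ShuffleTest")
  val sc = new SparkContext(conf)

  val pairs = sc.textFile("hdfs:///path/to/input")   // placeholder input path
    .map(line => (line.split(",")(0), 1L))           // placeholder: key on first field
    .filter { case (key, _) => key.nonEmpty }        // placeholder predicate

  val counts = pairs.reduceByKey(_ + _)              // the other job uses groupByKey here
  counts.count()                                     // triggers the shuffle; tasks fail at this stage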
What I could find out is that the code works fine until the groupByKey or
reduceByKey in both versions, but after that the following errors show up in
Spark 1.1.0:
java.io.FileNotFoundException:
/tmp/spark-local-20141006173014-4178/35/shuffle_6_0_5161 (Too many open files)
        at java.io.FileOutputStream.openAppend(Native Method)
        at java.io.FileOutputStream.<init>(FileOutputStream.java:210)
        at org.apache.spark.storage.DiskBlockObjectWriter.open(BlockObjectWriter.scala:123)
        at org.apache.spark.storage.DiskBlockObjectWriter.write(BlockObjectWriter.scala:192)
        at org.apache.spark.shuffle.hash.HashShuffleWriter$$anonfun$write$1.apply(HashShuffleWriter.scala:67)
        at org.apache.spark.shuffle.hash.HashShuffleWriter$$anonfun$write$1.apply(HashShuffleWriter.scala:65)
        at scala.collection.Iterator$class.foreach(Iterator.scala:727)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
        at org.apache.spark.shuffle.hash.HashShuffleWriter.write(HashShuffleWriter.scala:65)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
        at org.apache.spark.scheduler.Task.run(Task.scala:54)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:701)
I cleaned my /tmp directory and changed Spark's local directory
(spark.local.dir) to another folder, set as shown below, but nothing helped.
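For reference, this is how I pointed the scratch directory away from /tmp (the
path here is just an example, not my real one):

  import org.apache.spark.SparkConf

  val conf = new SparkConf()
    .setAppName("ShuffleTest")
    .set("spark.local.dir", "/data/spark-tmp")  // example path; default was /tmp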
Can anyone say what could be the reason?
Thanks & Regards,
Meethu M