A central location, such as NFS? If the files are temporary, produced for further job processing, you'll want to keep them local to each node in the cluster, i.e., in /tmp. If they are centralized, you won't be able to take advantage of data locality, and the central file store will become a bottleneck for further processing.
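As a minimal sketch (the app name and scratch path here are placeholders), you can point Spark's scratch space at a node-local disk via spark.local.dir, the configuration counterpart of SPARK_LOCAL_DIRS; note that if SPARK_LOCAL_DIRS is set in the environment it takes precedence over this config key:

    from pyspark import SparkConf, SparkContext

    # Point Spark's shuffle/spill scratch space at a node-local disk.
    # "/tmp/spark-scratch" is a placeholder; use a fast local volume.
    conf = (SparkConf()
            .setAppName("local-scratch-example")  # hypothetical app name
            .set("spark.local.dir", "/tmp/spark-scratch"))
    sc = SparkContext(conf=conf)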
If /tmp isn't an option because you want to be able to monitor the file outputs as they occur, you can also use HDFS (assuming your Spark nodes are also HDFS members, they will benefit from data locality). It looks like the problem you are seeing is that a file lock cannot be acquired on the central file system: "Function not implemented" from FileChannel.lock generally means the underlying file system (NFS is a common culprit) doesn't support that locking operation, and the traceback shows Spark's Utils.fetchFile taking such a lock in the local dir.

On Wed Feb 11 2015 at 11:55:55 AM TJ Klein <tjkl...@gmail.com> wrote:

> Hi,
>
> Using Spark 1.2 I ran into issues setting SPARK_LOCAL_DIRS to a different
> path than the local directory.
>
> On our cluster we have a folder for temporary files (in a central file
> system), which is called /scratch.
>
> When setting SPARK_LOCAL_DIRS=/scratch/<node name>
>
> I get:
>
> An error occurred while calling
> z:org.apache.spark.api.python.PythonRDD.newAPIHadoopFile.
> : org.apache.spark.SparkException: Job aborted due to stage failure: Task 0
> in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0
> (TID 3, XXXXXXX): java.io.IOException: Function not implemented
>         at sun.nio.ch.FileDispatcherImpl.lock0(Native Method)
>         at sun.nio.ch.FileDispatcherImpl.lock(FileDispatcherImpl.java:91)
>         at sun.nio.ch.FileChannelImpl.lock(FileChannelImpl.java:1022)
>         at java.nio.channels.FileChannel.lock(FileChannel.java:1052)
>         at org.apache.spark.util.Utils$.fetchFile(Utils.scala:379)
>
> Using SPARK_LOCAL_DIRS=/tmp, however, works perfectly. Any idea?
>
> Best,
> Tassilo
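P.S. To illustrate the HDFS suggestion above, a rough PySpark sketch (the namenode address and paths are made up for the example):

    # Assumes HDFS is reachable at this (hypothetical) namenode and that
    # `sc` is an existing SparkContext.
    rdd = sc.textFile("hdfs://namenode:8020/data/input")
    counts = rdd.map(lambda line: (line, 1)).reduceByKey(lambda a, b: a + b)
    # Part files appear under the output directory as tasks commit, so
    # progress can be watched with `hdfs dfs -ls` while the job runs.
    counts.saveAsTextFile("hdfs://namenode:8020/data/output")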