Thanks. Yes, I think it might not always make sense to lock files,
particularly if every executor is getting its own path.
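
In the meantime, a quick way to confirm that it's the underlying file system
rejecting the lock (rather than anything Spark-specific) is a small standalone
probe. This is only a sketch, and the path argument is assumed to point at the
/scratch mount:

    import java.io.{File, RandomAccessFile}

    object LockProbe {
      def main(args: Array[String]): Unit = {
        // Directory to probe, e.g. the /scratch mount (defaults to /tmp)
        val dir = new File(args.headOption.getOrElse("/tmp"))
        val probe = new File(dir, "spark-lock-probe.tmp")
        val raf = new RandomAccessFile(probe, "rw")
        try {
          // Same call Spark's Utils.fetchFile makes; on file systems without
          // lock support this throws java.io.IOException: Function not implemented
          val lock = raf.getChannel().lock()
          println(s"Lock acquired on $probe -- locking is supported here")
          lock.release()
        } finally {
          raf.close()
          probe.delete()
        }
      }
    }

If that throws the same "Function not implemented" IOException, it would point
to the Lustre mount not supporting the FileChannel locks used in Utils.fetchFile.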

On Wed, Feb 11, 2015 at 2:31 PM, Charles Feduke <charles.fed...@gmail.com>
wrote:

> And just glancing at the Spark source code around where the stack trace
> originates:
>
> val lockFile = new File(localDir, lockFileName)
> val raf = new RandomAccessFile(lockFile, "rw")
> // Only one executor entry.
> // The FileLock is only used to control synchronization for executors download file,
> // it's always safe regardless of lock type (mandatory or advisory).
> val lock = raf.getChannel().lock()
> val cachedFile = new File(localDir, cachedFileName)
> try {
>   if (!cachedFile.exists()) {
>     doFetchFile(url, localDir, cachedFileName, conf, securityMgr, hadoopConf)
>   }
> } finally {
>   lock.release()
> }
>
> I think Spark is making assumptions about the underlying file system that
> aren't safe to make (locking? I don't know enough about POSIX to know
> whether locking is part of the spec). Maybe file a bug report after someone
> from the dev team chimes in on this issue.
>
>
> On Wed Feb 11 2015 at 2:20:34 PM Charles Feduke <charles.fed...@gmail.com>
> wrote:
>
>> Take a look at this:
>>
>> http://wiki.lustre.org/index.php/Running_Hadoop_with_Lustre
>>
>> Particularly: http://wiki.lustre.org/images/1/1b/Hadoop_wp_v0.4.2.pdf
>> (linked from that article)
>>
>> to get a better idea of what your options are.
>>
>> If it's possible to avoid writing to [any] disk I'd recommend that route,
>> since that's the performance advantage Spark has over vanilla Hadoop.
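>>
>> For example, caching intermediate results in executor memory instead of
>> materializing them on a shared file system (just a sketch, assuming an
>> existing RDD named rdd):
>>
>>     import org.apache.spark.storage.StorageLevel
>>
>>     // keep partitions in executor memory only; nothing is written to disk
>>     rdd.persist(StorageLevel.MEMORY_ONLY)
>>     // later actions reuse the cached partitions instead of re-reading
>>     // them from a shared file system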
>>
>> On Wed Feb 11 2015 at 2:10:36 PM Tassilo Klein <tjkl...@gmail.com> wrote:
>>
>>> Thanks for the info. The file system in use is a Lustre file system.
>>>
>>> Best,
>>>  Tassilo
>>>
>>> On Wed, Feb 11, 2015 at 12:15 PM, Charles Feduke <
>>> charles.fed...@gmail.com> wrote:
>>>
>>>> A central location, such as NFS?
>>>>
>>>> If they are temporary for the purpose of further job processing you'll
>>>> want to keep them local to the node in the cluster, i.e., in /tmp. If they
>>>> are centralized you won't be able to take advantage of data locality and
>>>> the central file store will become a bottleneck for further processing.
>>>>
>>>> If /tmp isn't an option because you want to be able to monitor the file
>>>> outputs as they occur you can also use HDFS (assuming your Spark nodes are
>>>> also HDFS members they will benefit from data locality).
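>>>>
>>>> For what it's worth, the node-local directory can also be set per
>>>> application through SparkConf instead of the environment variable (just a
>>>> sketch; /tmp here is an example path, and SPARK_LOCAL_DIRS set on the
>>>> workers takes precedence over this property):
>>>>
>>>>     import org.apache.spark.{SparkConf, SparkContext}
>>>>
>>>>     val conf = new SparkConf()
>>>>       .setAppName("local-dirs-example")
>>>>       // node-local scratch space used for shuffle output and spills
>>>>       .set("spark.local.dir", "/tmp")
>>>>     val sc = new SparkContext(conf)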
>>>>
>>>> It looks like the problem you are seeing is that a lock cannot be
>>>> acquired on the output file in the central file system.
>>>>
>>>> On Wed Feb 11 2015 at 11:55:55 AM TJ Klein <tjkl...@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> Using Spark 1.2 I ran into issues setting SPARK_LOCAL_DIRS to a
>>>>> path other than the local directory.
>>>>>
>>>>> On our cluster we have a folder for temporary files (in a central file
>>>>> system), which is called /scratch.
>>>>>
>>>>> When setting SPARK_LOCAL_DIRS=/scratch/<node name>
>>>>>
>>>>> I get:
>>>>>
>>>>>  An error occurred while calling
>>>>> z:org.apache.spark.api.python.PythonRDD.newAPIHadoopFile.
>>>>> : org.apache.spark.SparkException: Job aborted due to stage failure:
>>>>> Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3
>>>>> in stage 0.0 (TID 3, XXXXXXX): java.io.IOException: Function not implemented
>>>>>         at sun.nio.ch.FileDispatcherImpl.lock0(Native Method)
>>>>>         at sun.nio.ch.FileDispatcherImpl.lock(FileDispatcherImpl.java:91)
>>>>>         at sun.nio.ch.FileChannelImpl.lock(FileChannelImpl.java:1022)
>>>>>         at java.nio.channels.FileChannel.lock(FileChannel.java:1052)
>>>>>         at org.apache.spark.util.Utils$.fetchFile(Utils.scala:379)
>>>>>
>>>>> Using SPARK_LOCAL_DIRS=/tmp, however, works perfectly. Any idea?
>>>>>
>>>>> Best,
>>>>>  Tassilo
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>
