Re: SPARK_LOCAL_DIRS Issue

2015-02-11 Thread Tassilo Klein
Thanks. Yes, I think it might not always make sense to lock files, particularly if every executor is getting its own path. On Wed, Feb 11, 2015 at 2:31 PM, Charles Feduke wrote: > And just glancing at the Spark source code around where the stack trace > originates: > > val lockFile = new File(lo

Re: SPARK_LOCAL_DIRS Issue

2015-02-11 Thread Charles Feduke
And just glancing at the Spark source code around where the stack trace originates:

val lockFile = new File(localDir, lockFileName)
val raf = new RandomAccessFile(lockFile, "rw")
// Only one executor entry.
// The FileLock is only used to control synchronization for executors dow
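The FileLock in that snippet maps down to the kernel's advisory file locking, and on some network filesystems (Lustre included, depending on mount options such as flock/localflock) advisory locks are unsupported or behave differently, which is one plausible source of the error. A rough sketch of the same mechanism from the shell, using flock(1) from util-linux on a Linux node (the lock file path is hypothetical):

```shell
# Create a lock file in a scratch dir (hypothetical path).
LOCKFILE=/tmp/spark_demo.lock
touch "$LOCKFILE"

# Hold an exclusive advisory lock while running a command;
# a second flock on the same file would block until this one is released.
flock "$LOCKFILE" -c 'echo "lock acquired"'

# On filesystems without working advisory locks (e.g. some network
# mounts), the lock call itself can fail -- the analogue of the failure
# Spark hits when its local dirs point at a shared filesystem.
```

This is only an illustration of the locking primitive, not Spark's actual code path.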

Re: SPARK_LOCAL_DIRS Issue

2015-02-11 Thread Tassilo Klein
Thanks a lot. I will have a look at it. On Wed, Feb 11, 2015 at 2:20 PM, Charles Feduke wrote: > Take a look at this: > > http://wiki.lustre.org/index.php/Running_Hadoop_with_Lustre > > Particularly: http://wiki.lustre.org/images/1/1b/Hadoop_wp_v0.4.2.pdf > (linked from that article) > > to get

Re: SPARK_LOCAL_DIRS Issue

2015-02-11 Thread Charles Feduke
Take a look at this: http://wiki.lustre.org/index.php/Running_Hadoop_with_Lustre Particularly: http://wiki.lustre.org/images/1/1b/Hadoop_wp_v0.4.2.pdf (linked from that article) to get a better idea of what your options are. If it's possible to avoid writing to [any] disk I'd recommend that rout

Re: SPARK_LOCAL_DIRS Issue

2015-02-11 Thread Tassilo Klein
Thanks for the info. The file system in use is a Lustre file system. Best, Tassilo On Wed, Feb 11, 2015 at 12:15 PM, Charles Feduke wrote: > A central location, such as NFS? > > If they are temporary for the purpose of further job processing you'll > want to keep them local to the node in the

Re: SPARK_LOCAL_DIRS Issue

2015-02-11 Thread Charles Feduke
A central location, such as NFS? If they are temporary for the purpose of further job processing you'll want to keep them local to the node in the cluster, i.e., in /tmp. If they are centralized you won't be able to take advantage of data locality and the central file store will become a bottlenec
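To keep shuffle and spill files node-local as suggested above, SPARK_LOCAL_DIRS can point at one or more directories on local disks; Spark accepts a comma-separated list and spreads temporary I/O across the entries. A minimal spark-env.sh fragment (the disk paths are hypothetical):

```shell
# spark-env.sh -- scratch space on node-local disks, not a shared mount.
# Comma-separated entries let Spark stripe temp files across disks.
export SPARK_LOCAL_DIRS=/mnt/disk1/spark-tmp,/mnt/disk2/spark-tmp
```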

SPARK_LOCAL_DIRS Issue

2015-02-11 Thread TJ Klein
Hi, Using Spark 1.2 I ran into issues setting SPARK_LOCAL_DIRS to a path other than the local directory. On our cluster we have a folder for temporary files (in a central file system), which is called /scratch. When setting SPARK_LOCAL_DIRS=/scratch/ I get: An error occurred while calling z:o

Re: set SPARK_LOCAL_DIRS issue

2014-08-11 Thread Andrew Ash
// assuming Spark 1.0 Hi Baoqiang, In my experience for the standalone cluster you need to set SPARK_WORKER_DIR not SPARK_LOCAL_DIRS to control where shuffle files are written. I think this is a documentation issue that could be improved, as http://spark.apache.org/docs/latest/spark-standalone.h
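Following the advice above for a standalone cluster, the work/shuffle directory is governed by SPARK_WORKER_DIR in spark-env.sh rather than SPARK_LOCAL_DIRS. A sketch of the distinction between the two settings (paths are hypothetical):

```shell
# spark-env.sh on each worker node (standalone mode)

# Where the standalone worker puts per-application work dirs,
# including shuffle output and executor logs:
export SPARK_WORKER_DIR=/mnt/data/spark-work

# Scratch space for map output files and RDDs spilled to disk:
export SPARK_LOCAL_DIRS=/mnt/data/spark-tmp
```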

set SPARK_LOCAL_DIRS issue

2014-08-09 Thread Baoqiang Cao
Hi I'm trying to use a specific dir as Spark's working directory since I have limited space at /tmp. I tried: 1) export SPARK_LOCAL_DIRS="/mnt/data/tmp" or 2) SPARK_LOCAL_DIRS="/mnt/data/tmp" in spark-env.sh But neither worked; the output of Spark still says ERROR DiskBlockObjectWrit