Re: Configuring shuffle write directory

2014-03-27 Thread Tsai Li Ming
Hi, thanks! I found out that I wasn't setting SPARK_JAVA_OPTS correctly. I took a look at the process table and saw that the "org.apache.spark.executor.CoarseGrainedExecutorBackend" process didn't have -Dspark.local.dir set.
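The process-table check described above can be sketched as a shell one-liner. The sample command line and the path /data/spark-tmp are hypothetical stand-ins for what `ps -ef` would actually show on a worker:

```shell
# Hypothetical executor command line, as it might appear in `ps -ef` output
# on a worker (classpath elided; /data/spark-tmp is a made-up path).
cmdline='java -Dspark.local.dir=/data/spark-tmp org.apache.spark.executor.CoarseGrainedExecutorBackend'

# Extract the spark.local.dir system property, if present; no output would
# mean the -D flag never reached the executor JVM.
echo "$cmdline" | grep -o 'spark.local.dir=[^ ]*'
```

On a real worker you would pipe `ps -ef` through the same grep instead of echoing a canned string.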

Re: Configuring shuffle write directory

2014-03-27 Thread Matei Zaharia
I see, are you sure that was in spark-env.sh instead of spark-env.sh.template? You need to copy the template to a file named just spark-env.sh, and make sure the file is executable. Try doing println(sc.getConf.toDebugString) in your driver program and see what properties it prints. As far as I can tell, spark.local
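The debugging step suggested above, pasted into spark-shell (this is a REPL fragment, not a standalone program — `sc` is the SparkContext the shell provides):

```scala
// Inside spark-shell, or any driver with a SparkContext named sc:
// toDebugString dumps every explicitly-set Spark property, one per line,
// so you can check whether spark.local.dir made it into the driver's config.
println(sc.getConf.toDebugString)
```

If spark.local.dir does not appear in the output, the driver never picked it up and the executors will fall back to the default.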

Re: Configuring shuffle write directory

2014-03-27 Thread Tsai Li Ming
Yes, I have tried that by adding it to the Worker. I can see the "app-20140328124540-000" directory in the local Spark directory of the worker, but the "spark-local" directories are always written to /tmp. Is that because the default spark.local.dir is taken from java.io.tmpdir?

Re: Configuring shuffle write directory

2014-03-27 Thread Matei Zaharia
Yes, the problem is that the driver program is overriding it. Have you set it manually in the driver? Or how did you try setting it in the workers? You should set it by adding export SPARK_JAVA_OPTS="-Dspark.local.dir=whatever" to conf/spark-env.sh on those workers. Matei
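The suggestion above as a spark-env.sh fragment; /data/spark-tmp is a placeholder for whatever per-node directory each worker should use:

```shell
# conf/spark-env.sh on each worker node (copy from spark-env.sh.template
# and make the file executable). /data/spark-tmp is a hypothetical path;
# it can differ from node to node, which is how each worker gets its own
# shuffle directory.
export SPARK_JAVA_OPTS="-Dspark.local.dir=/data/spark-tmp"
```

Because each worker reads its own copy of conf/spark-env.sh, pointing this at a different directory on each machine gives every executor its own spark.local.dir — provided the driver does not set spark.local.dir itself and override it.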

Re: Configuring shuffle write directory

2014-03-27 Thread Tsai Li Ming
Can anyone help? How can I configure a different spark.local.dir for each executor?

Configuring shuffle write directory

2014-03-22 Thread Tsai Li Ming
Hi, Each of my worker nodes has its own unique spark.local.dir. However, when I run spark-shell, the shuffle writes are always written to /tmp despite the setting being applied when the worker node is started. Does specifying spark.local.dir in the driver program override the executors? Is there