I've had this issue too, running Spark 1.0.0 on YARN with HDFS: Spark defaults to a working directory under hdfs:///user/$USERNAME, and it's not clear how to change that location.
In the case where HDFS has a non-standard directory structure (i.e., home directories located in hdfs:///users/) Spark jobs will fail. The MapReduce setting is "mapreduce.job.working.dir". Is there a Spark equivalent?

On Thu, Aug 14, 2014 at 7:24 PM, Yana Kadiyska <[email protected]> wrote:
> Hi all, trying to change defaults of where stuff gets written.
>
> I've set "-Dspark.local.dir=/spark/tmp" and I can see that the setting is
> used when the executor is started.
>
> I do indeed see directories like spark-local-20140815004454-bb3f in this
> desired location but I also see undesired stuff under /tmp
>
> usr@executor:~# ls /tmp/spark-93f4d44c-ff4d-477d-8930-5884b10b065f/
> files jars
>
> usr@driver: ls /tmp/spark-7e456342-a58c-4439-ab69-ff8e6d6b56a5/
> files jars
>
> Is there a way to move these directories off of /tmp? I am running 0.9.1
> (SPARK_WORKER_DIR is also exported on all nodes though all that I see there
> are executor logs)
>
> Thanks
