Hello, I have a setup with Spark 1.5.1 on top of Mesos with one master and 4 slaves. I am submitting a Spark job whose output (3 parquet folders representing 3 dataframes) should be written to a shared NFS folder. However, I keep getting this error:

15/11/25 10:07:22 WARN TaskSetManager: Lost task 4.0 in stage 12.0 (TID 711, remoteMachineHost): java.io.IOException: Mkdirs failed to create file:/some/shared/folder/my.parquet/_temporary/0/_temporary/attempt_201511251007_0012_m_000004_0 (exists=false, cwd=file:/project/mesos/work/slaves/8ef6b8ae-95fc-4963-b8c1-718edb988f3f-S7/frameworks/8ef6b8ae-95fc-4963-b8c1-718edb988f3f-0085/executors/0/runs/efd17cf8-30eb-4981-8d61-4dbb36a27dc7)
        at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:442)
        at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:428)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:908)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:889)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:786)
        at org.apache.parquet.hadoop.ParquetFileWriter.<init>(ParquetFileWriter.java:176)
        at org.apache.parquet.hadoop.ParquetFileWriter.<init>(ParquetFileWriter.java:160)
        at org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:289)
        at org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:262)
        at org.apache.spark.sql.execution.datasources.parquet.ParquetOutputWriter.<init>(ParquetRelation.scala:94)
        at org.apache.spark.sql.execution.datasources.parquet.ParquetRelation$$anon$3.newInstance(ParquetRelation.scala:272)
        at org.apache.spark.sql.execution.datasources.DefaultWriterContainer.writeRows(WriterContainer.scala:233)
        at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelation$$anonfun$run$1$$anonfun$apply$mcV$sp$3.apply(InsertIntoHadoopFsRelation.scala:150)
        at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelation$$anonfun$run$1$$anonfun$apply$mcV$sp$3.apply(InsertIntoHadoopFsRelation.scala:150)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
        at org.apache.spark.scheduler.Task.run(Task.scala:88)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

The remoteMachineHost that throws the error has write access to the specific folder. Any thoughts?

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-on-Mesos-with-Centos-6-6-NFS-tp25489.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.