Spark on Mesos with Centos 6.6 NFS

leonidas Wed, 25 Nov 2015 23:52:58 -0800

Hello,
I have a setup with Spark 1.5.1 on top of Mesos, with one master and 4
slaves. I am submitting a Spark job whose output (3 parquet folders that
will represent 3 dataframes) should be written to a shared NFS folder.
However, I keep getting the following error:


15/11/25 10:07:22 WARN TaskSetManager: Lost task 4.0 in stage 12.0 (TID 711, remoteMachineHost): java.io.IOException: Mkdirs failed to create file:/some/shared/folder/my.parquet/_temporary/0/_temporary/attempt_201511251007_0012_m_000004_0 (exists=false, cwd=file:/project/mesos/work/slaves/8ef6b8ae-95fc-4963-b8c1-718edb988f3f-S7/frameworks/8ef6b8ae-95fc-4963-b8c1-718edb988f3f-0085/executors/0/runs/efd17cf8-30eb-4981-8d61-4dbb36a27dc7)
        at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:442)
        at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:428)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:908)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:889)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:786)
        at org.apache.parquet.hadoop.ParquetFileWriter.<init>(ParquetFileWriter.java:176)
        at org.apache.parquet.hadoop.ParquetFileWriter.<init>(ParquetFileWriter.java:160)
        at org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:289)
        at org.apache.parquet.hadoop.ParquetOutputFormat.getRecordWriter(ParquetOutputFormat.java:262)
        at org.apache.spark.sql.execution.datasources.parquet.ParquetOutputWriter.<init>(ParquetRelation.scala:94)
        at org.apache.spark.sql.execution.datasources.parquet.ParquetRelation$$anon$3.newInstance(ParquetRelation.scala:272)
        at org.apache.spark.sql.execution.datasources.DefaultWriterContainer.writeRows(WriterContainer.scala:233)
        at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelation$$anonfun$run$1$$anonfun$apply$mcV$sp$3.apply(InsertIntoHadoopFsRelation.scala:150)
        at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelation$$anonfun$run$1$$anonfun$apply$mcV$sp$3.apply(InsertIntoHadoopFsRelation.scala:150)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
        at org.apache.spark.scheduler.Task.run(Task.scala:88)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

The remoteMachineHost that throws the error has write access to the specific
folder.
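
One way the write-access claim could be double-checked is with a small probe that tries to create a nested directory under the shared folder, roughly mirroring the `_temporary/0/_temporary/attempt_...` layout from the log. This is only a sketch: the helper name `can_mkdirs` and the probe directory names are hypothetical, and it assumes the script is run on the remote machine as the same OS user the Mesos executor runs as, which may differ from the user used for a manual check.

```python
import os
import shutil
import sys
import tempfile
import uuid


def can_mkdirs(base_dir):
    """Try to create, then remove, a nested directory under base_dir.

    Returns (True, None) on success or (False, error_message) on failure.
    """
    if not os.path.isdir(base_dir):
        return False, "not a directory: %s" % base_dir

    # Hypothetical probe path, shaped like the nested layout in the log.
    probe_root = os.path.join(base_dir, "_probe_" + uuid.uuid4().hex)
    nested = os.path.join(probe_root, "0", "_temporary", "attempt_probe")
    try:
        os.makedirs(nested)
    except OSError as exc:
        return False, str(exc)
    finally:
        # Clean up whatever part of the probe tree was created.
        shutil.rmtree(probe_root, ignore_errors=True)
    return True, None


if __name__ == "__main__":
    # Pass the shared folder as the first argument; defaults to the temp dir.
    target = sys.argv[1] if len(sys.argv) > 1 else tempfile.gettempdir()
    ok, err = can_mkdirs(target)
    print("%s: %s" % (target, "mkdirs OK" if ok else "mkdirs FAILED (%s)" % err))
```

Running it on every slave against /some/shared/folder would show whether nested directory creation succeeds on each host, not only on the one named in the log.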
Any thoughts?




