spark on Windows 2008 failed to save RDD to windows shared folder

Wang, Ningjun (LNG-NPV) Fri, 22 May 2015 13:58:25 -0700

I am running a Spark standalone cluster on Windows 2008. I keep getting the 
following error when trying to save an RDD to a Windows shared folder:


rdd.saveAsObjectFile("file:///T:/lab4-win02/IndexRoot01/tobacco-07/myrdd.obj")

15/05/22 16:49:05 ERROR Executor: Exception in task 0.0 in stage 12.0 (TID 12)
java.io.IOException: Mkdirs failed to create file:/T:/lab4-win02/IndexRoot01/tobacco-07/tmp/docs-150522204904805.op/_temporary/0/_temporary/attempt_201505221649_0012_m_000000_12
            at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:438)
            at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:424)
            at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:906)
            at org.apache.hadoop.io.SequenceFile$Writer.<init>(SequenceFile.java:1071)
            at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:270)
            at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:527)
            at org.apache.hadoop.mapred.SequenceFileOutputFormat.getRecordWriter(SequenceFileOutputFormat.java:63)
            at org.apache.spark.SparkHadoopWriter.open(SparkHadoopWriter.scala:90)
            at org.apache.spark.rdd.PairRDDFunctions$$anonfun$13.apply(PairRDDFunctions.scala:1068)
            at org.apache.spark.rdd.PairRDDFunctions$$anonfun$13.apply(PairRDDFunctions.scala:1059)
            at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
            at org.apache.spark.scheduler.Task.run(Task.scala:64)
            at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
            at java.lang.Thread.run(Thread.java:745)

The T: drive is mapped to a Windows shared folder, e.g. T: -> \\10.196.119.230\myshare

The account running Spark does have write permission to this folder. The save 
works most of the time but fails occasionally.
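
Since the failure is intermittent, one way to narrow it down might be to probe the mapped drive directly from the machine where the failing executor runs, outside of Spark. The following is only a minimal sketch in plain Java, not part of my job: the class name SharedFolderProbe and the method canCreateDirs are hypothetical, and it only checks whether the current process can create and remove a subdirectory under a given path, which is roughly the operation that "Mkdirs failed to create" complains about.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical diagnostic helper (not part of Spark or Hadoop).
public class SharedFolderProbe {

    // Returns true if a uniquely named subdirectory can be created under
    // `base` and then deleted again by the current process.
    static boolean canCreateDirs(Path base) {
        Path probe = base.resolve("_probe_" + System.nanoTime());
        try {
            // Creates `probe` and any missing parent directories.
            Files.createDirectories(probe);
            // Only the probe directory itself is removed afterwards.
            Files.delete(probe);
            return true;
        } catch (IOException e) {
            System.err.println("Cannot create directory under " + base + ": " + e);
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumption: the path to test is passed as the first argument;
        // the JVM temp directory is used as a fallback.
        String target = args.length > 0 ? args[0] : System.getProperty("java.io.tmpdir");
        System.out.println(target + " writable: " + canCreateDirs(Paths.get(target)));
    }
}
```

Running this under the same account and in the same way the Spark worker is started (for example with T:/lab4-win02/IndexRoot01 as the argument) would show whether the mapped drive is visible and writable to that process at the time of the check.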

Can anybody tell me what the problem is here?

Please advise. Thanks.
