Re: RDD saveAsTextFile() to local disk

Vijay Pawnarkar Wed, 08 Jul 2015 20:34:55 -0700

Thanks for the help.

These are the folders I was trying to write to:


saveAsTextFile("file:///home/someuser/test2/testupload/20150708/0/")
saveAsTextFile("file:///home/someuser/test2/testupload/20150708/1/")
saveAsTextFile("file:///home/someuser/test2/testupload/20150708/2/")
saveAsTextFile("file:///home/someuser/test2/testupload/20150708/3/")


The folder name "test2" was causing the issue; for whatever reason the API
does not recognize file:///home/someuser/test2 as a directory.

Once the folder name was changed to
file:///home/someuser/batch/testupload/20150708/0/, it has been working well.
I am able to reproduce the issue consistently with the folder name "test2".
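
For anyone hitting the same ParentNotDirectoryException, here is a minimal
sketch of how the path could be checked. It assumes (this is not confirmed
anywhere in this thread) that something other than a directory, for example a
regular file named test2, already exists at that location on the node that ran
the failing task. The helper name firstNonDirectoryAncestor is hypothetical; it
uses only java.nio.file and is not part of Spark or Hadoop.

```scala
import java.nio.file.{Files, Path, Paths}

// Hypothetical diagnostic helper (not a Spark or Hadoop API).
// Walks from the given path up to the filesystem root and returns the first
// entry that exists but is not a directory, or None if there is no such entry.
def firstNonDirectoryAncestor(target: String): Option[Path] = {
  val start = Paths.get(target).toAbsolutePath.normalize()
  Iterator
    .iterate(start)(_.getParent)   // path, parent, grandparent, ...
    .takeWhile(_ != null)          // getParent returns null above the root
    .find(p => Files.exists(p) && !Files.isDirectory(p))
}

// Example call with the plain local path (no file:// scheme):
// firstNonDirectoryAncestor("/home/someuser/test2/testupload/20150708/0/")
```

Tasks write file:// output to the local disk of the executor that runs them,
so the check is only meaningful when run on the worker host named in the
exception. A result of None means every existing ancestor is a directory, in
which case this assumption does not explain the failure.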







On Jul 8, 2015 8:31 PM, "canan chen" <[email protected]> wrote:

> It works for me using the following code. Could you share your code?
>
>
> val data = sc.parallelize(List(1, 2, 3))
> data.saveAsTextFile("file:////Users/chen/Temp/c")
>
> On Thu, Jul 9, 2015 at 4:05 AM, spok20nn <[email protected]> wrote:
>
>> I am getting an exception when writing an RDD to local disk using the
>> following call:
>>
>>  saveAsTextFile("file:////home/someuser/dir2/testupload/20150708/")
>>
>> The dir (/home/someuser/dir2/testupload/) was created before running the
>> job. The error message is misleading.
>>
>>
>> org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in
>> stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0
>> (TID 6, xxx.yyy.com): org.apache.hadoop.fs.ParentNotDirectoryException:
>> Parent path is not a directory: file:/home/someuser/dir2
>>         at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:418)
>>         at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:426)
>>         at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:426)
>>         at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:426)
>>         at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:426)
>>         at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:426)
>>         at org.apache.hadoop.fs.ChecksumFileSystem.mkdirs(ChecksumFileSystem.java:588)
>>         at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:439)
>>         at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:426)
>>         at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:906)
>>         at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:799)
>>         at org.apache.hadoop.mapred.TextOutputFormat.getRecordWriter(TextOutputFormat.java:123)
>>         at org.apache.spark.SparkHadoopWriter.open(SparkHadoopWriter.scala:90)
>>         at org.apache.spark.rdd.PairRDDFunctions$$anonfun$13.apply(PairRDDFunctions.scala:1060)
>>         at org.apache.spark.rdd.PairRDDFunctions$$anonfun$13.apply(PairRDDFunctions.scala:1051)
>>         at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
>>         at org.apache.spark.scheduler.Task.run(Task.scala:56)
>>         at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:200)
>>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>         at java.lang.Thread.run(Thread.java:745)
>>
>
