That won't be it, since you can see from the directory listing that there are no data files under test -- only "_" files and dirs. The output looks like it was written, or at least partially written, but the job didn't finish: the part-* files were never moved to the target directory. I don't know why, but that, at least, is the nature of the final problem.
On Thu, Aug 7, 2014 at 5:14 PM, Ashish Rangole <arang...@gmail.com> wrote:
> Specify a folder instead of a file name for input and output code, as in:
>
> Output:
> s3n://your-bucket-name/your-data-folder
>
> Input: (when consuming the above output)
> s3n://your-bucket-name/your-data-folder/*
>
> On May 6, 2014 5:19 PM, "kamatsuoka" <ken...@gmail.com> wrote:
>>
>> I have a Spark app that writes out a file, s3://mybucket/mydir/myfile.txt.
>>
>> Behind the scenes, the S3 driver creates a bunch of files like
>> s3://mybucket/mydir/myfile.txt/part-0000, as well as block files like
>> s3://mybucket/block_3574186879395643429.
>>
>> How do I construct a URL to use this file as input to another Spark app?
>> I tried all the variations of s3://mybucket/mydir/myfile.txt, but none of
>> them work.
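
For anyone hitting this thread later, here is a minimal sketch of the folder-based pattern Ashish describes, using the Spark 1.x Scala API. The bucket and folder names are just the placeholders from his reply; adapt them to your own setup:

    import org.apache.spark.{SparkConf, SparkContext}

    object S3MultipartExample {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(
          new SparkConf().setAppName("S3MultipartExample"))

        // Write to a *folder*: Spark creates part-NNNNN files inside it,
        // so the path you pass is a directory, not a single file.
        sc.parallelize(Seq("a", "b", "c"))
          .saveAsTextFile("s3n://your-bucket-name/your-data-folder")

        // Read it back by globbing the folder's contents rather than
        // naming one file; textFile accepts wildcard paths.
        val lines = sc.textFile("s3n://your-bucket-name/your-data-folder/*")
        println(lines.count())

        sc.stop()
      }
    }

The key point is that the output "file" is really a directory of part files, so the consuming job must glob the directory (or pass the directory itself) instead of trying to open one part by name.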