That won't be it, since you can see from the directory listing that there are no data files under test -- only "_" files and dirs. The output looks like it was written, or at least partially written, but the job didn't finish: the part-* files were never moved to the target directory. I don't know why, but that, at least, is the nature of the final problem.
On Thu, Aug 7, 2014 at 5:14 PM, Ashish Rangole <arang...@gmail.com> wrote:
> Specify a folder instead of a file name for input and output code, as in:
>
> Output:
> s3n://your-bucket-name/your-data-folder
>
> Input: (when consuming the above output)
> s3n://your-bucket-name/your-data-folder/*
>
> On May 6, 2014 5:19 PM, "kamatsuoka" <ken...@gmail.com> wrote:
>>
>> I have a Spark app that writes out a file, s3://mybucket/mydir/myfile.txt.
>>
>> Behind the scenes, the S3 driver creates a bunch of files like
>> s3://mybucket/mydir/myfile.txt/part-0000, as well as block files like
>> s3://mybucket/block_3574186879395643429.
>>
>> How do I construct a URL to use this file as input to another Spark app?
>> I tried all the variations of s3://mybucket/mydir/myfile.txt, but none of
>> them work.
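
For anyone hitting this thread later, here is a minimal sketch of the folder-based pattern Ashish describes, using the Spark 1.x Scala API. The bucket and folder names are just the placeholders from his reply; adapt them to your own setup:

    import org.apache.spark.{SparkConf, SparkContext}

    object S3MultipartExample {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(
          new SparkConf().setAppName("S3MultipartExample"))

        // Write to a *folder*: Spark creates part-NNNNN files inside it,
        // so the path you pass is a directory, not a single file.
        sc.parallelize(Seq("a", "b", "c"))
          .saveAsTextFile("s3n://your-bucket-name/your-data-folder")

        // Read it back by globbing the folder's contents rather than
        // naming one file; textFile accepts wildcard paths.
        val lines = sc.textFile("s3n://your-bucket-name/your-data-folder/*")
        println(lines.count())

        sc.stop()
      }
    }

The key point is that the output "file" is really a directory of part files, so the consuming job must glob the directory (or pass the directory itself) instead of trying to open one part by name.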