Matei Zaharia wrote
> If you use s3n:// for both, you should be able to pass the exact same file
> to load as you did to save.
I'm trying to write a file to s3n in one Spark app and to read it in another using the same file name, but without luck.

Writing data to s3n as

    val data = Array(1.0, 1.0, 1.0)
    sc.parallelize(data).saveAsTextFile("s3n://<access_key>:<secret_access_key>@<bucket-name>/test")

creates the following files:

    test/_SUCCESS
    test/_temporary/0/task_201408071147_m_000000_$folder$
    test/_temporary/0/task_201408071147_m_000000/part-00000
    test/_temporary/0/task_201408071147_m_000001_$folder$
    test/_temporary/0/task_201408071147_m_000001/part-00001

When trying to read the file back as

    val data2 = sc.textFile("s3n://<access_key>:<secret_access_key>@<bucket-name>/test")

data2 is an empty array:

    scala> data2.collect
    14/08/07 11:49:56 INFO mapred.FileInputFormat: Total input paths to process : 0
    14/08/07 11:49:56 INFO spark.SparkContext: Starting job: collect at <console>:15
    14/08/07 11:49:56 INFO spark.SparkContext: Job finished: collect at <console>:15, took 3.7227E-5 s
    res5: Array[String] = Array()

I'm using Spark 1.0.0.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/How-to-read-a-multipart-s3-file-tp5463p11643.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
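The log line "Total input paths to process : 0" is consistent with Hadoop's default input-path filter: if I read FileInputFormat correctly, it skips any path whose base name starts with "_" or ".". Since the only files in the listing besides _SUCCESS sit below the _temporary directory (which suggests the write was never committed), everything under test/ gets filtered out. A plain-Scala sketch of that filter logic, for illustration only (HiddenFilter is an illustrative name, not a Hadoop API):

```scala
// Sketch of the default hidden-path filter used by Hadoop's
// FileInputFormat when listing input paths: base names beginning
// with '_' or '.' are skipped, so _SUCCESS and anything inside
// _temporary/ never count as input.
object HiddenFilter {
  def accepts(path: String): Boolean = {
    val base = path.split('/').last
    !base.startsWith("_") && !base.startsWith(".")
  }

  def main(args: Array[String]): Unit = {
    println(accepts("test/part-00000")) // a committed part file would be read
    println(accepts("test/_SUCCESS"))   // skipped: starts with '_'
    println(accepts("test/_temporary")) // skipped, so files below it are never listed
  }
}
```

Note the filter is applied per directory level: because test/_temporary is itself rejected, the part-0000x files underneath it are never even considered.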