When you repartition, the ordering can get lost, because repartition shuffles the data across partitions. You would need to sort after repartitioning.
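Something along these lines might work. It is only a rough PySpark sketch, not tested against S3. It assumes "original order" means the order of the RDD at the point where zipWithIndex is called, and it replaces repartition(1) with a sort into a single partition. The function name save_in_original_order and the command-line output path are made up for illustration.

```python
import sys


def save_in_original_order(rdd, path):
    """Write `rdd` to `path` as one part file, keeping its current element order.

    `rdd` is assumed to be a PySpark RDD; `path` is any URI Spark can write to.
    """
    (
        rdd.zipWithIndex()  # pair each element with its current position
        .sortBy(lambda pair: pair[1], numPartitions=1)  # sort by position into one partition
        .keys()  # drop the position again
        .saveAsTextFile(path)
    )


if __name__ == "__main__":
    # Demo only: needs a Spark installation; the output path comes from argv.
    from pyspark import SparkContext

    sc = SparkContext(appName="ordered-output-sketch")
    result = sc.parallelize(["first", "second", "third", "fourth"], 4)
    save_in_original_order(result, sys.argv[1])
    sc.stop()
```

If the records already carry a natural sort key, sorting on that key with numPartitions=1 would avoid the extra zipWithIndex pass.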
Aniket

On Tue, Jan 20, 2015, 7:08 AM anny9699 <anny9...@gmail.com> wrote:

> Hi,
>
> I am using Spark on AWS and want to write the output to S3. It is a
> relatively small file and I don't want them to output as multiple parts. So
> I use
>
> result.repartition(1).saveAsTextFile("s3://...")
>
> However as long as I am using the saveAsTextFile method, the output doesn't
> keep the original order. But if I use BufferedWriter in Java to write the
> output, I could only write to the master machine instead of S3 directly. Is
> there a way that I could write to S3 and the same time keep the order?
>
> Thanks a lot!
> Anny
>
> --
> View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/How-to-output-to-S3-and-keep-the-order-tp21246.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.