I'm wondering why you need order preserved. We've had situations where keeping the source as an artificial field in the dataset was important, and I had to go through contortions to inject it (in that case the data source had no unique key).
Is this similar?

On 13 September 2017 at 10:46, Suzen, Mehmet <su...@acm.org> wrote:
> But what happens if one of the partitions fails? How does fault tolerance
> recover elements in other partitions?
>
> On 13 Sep 2017 18:39, "Ankit Maloo" <ankitmaloo1...@gmail.com> wrote:
>
>> AFAIK, the order of an RDD is maintained across a partition for map
>> operations. There is no way a map operation can change sequence across a
>> partition, as a partition is local and computation happens one record at a
>> time.
>>
>> On 13-Sep-2017 9:54 PM, "Suzen, Mehmet" <su...@acm.org> wrote:
>>
>> I think the order has no meaning in RDDs. See this post, especially the zip
>> methods:
>> https://stackoverflow.com/questions/29268210/mind-blown-rdd-zip-method
>>
>> ---------------------------------------------------------------------
>> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
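
P.S. To make the "artificial field" idea concrete, here is a minimal sketch in plain Python. It is only a model of the situation, not Spark code: the helper names (`tag_with_position`, `split_into_partitions`, `map_partitions`) and the `"file-1"` source label are hypothetical. It tags each record with a (source, position) key before processing, applies a map one record at a time inside each partition, and then restores the original order from the key even if the partitions come back in a different order. In Spark itself, `RDD.zipWithIndex` is probably the closest built-in for attaching such an index.

```python
import random


def tag_with_position(records, source):
    """Attach an artificial (source, position) key to every record."""
    return [((source, i), rec) for i, rec in enumerate(records)]


def split_into_partitions(items, num_partitions):
    """Split a list into contiguous chunks, standing in for partitions."""
    size = max(1, -(-len(items) // num_partitions))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def map_partitions(partitions, fn):
    """Apply fn to each value, one record at a time, inside each partition."""
    return [[(key, fn(value)) for key, value in part] for part in partitions]


records = ["a", "b", "c", "d", "e", "f", "g"]
tagged = tag_with_position(records, source="file-1")  # hypothetical source name
partitions = split_into_partitions(tagged, 3)
mapped = map_partitions(partitions, str.upper)

# Model partitions being collected in an arbitrary order.
shuffled = list(mapped)
random.Random(0).shuffle(shuffled)

# The artificial key lets us restore the original order regardless.
restored = [value for _, value in sorted(rec for part in shuffled for rec in part)]
print(restored)
```

The point of the design is that the order lives in the data rather than in an assumption about how partitions are processed or recomputed, which is what helped in the no-unique-key case.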