Re: ensuring RDD indices remain immutable

2014-12-01 Thread rok
true though I was hoping to avoid having to sort... maybe there's no way around it. Thanks!

Re: ensuring RDD indices remain immutable

2014-12-01 Thread Sean Owen
I think the robust thing to do is sort the RDD, and then zipWithIndex. Even if the RDD is recomputed, the ordering and thus assignment of IDs should be the same.

On Mon, Dec 1, 2014 at 2:36 PM, rok wrote:
> I have an RDD that serves as a feature look-up table downstream in my
> analysis. I create
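
To illustrate the sort-then-index suggestion, here is a minimal sketch in plain Python rather than Spark itself. The helpers `zip_with_index` and `recompute` and the sample feature names are hypothetical stand-ins, not Spark APIs: `recompute` imitates a recomputation that returns the same records in a different order. In PySpark the analogous call would presumably be something like `rdd.sortBy(lambda x: x).zipWithIndex()`, assuming the records have a total ordering.

```python
import random


def zip_with_index(records):
    """Pair each record with its position, mimicking what zipWithIndex does."""
    return [(rec, i) for i, rec in enumerate(records)]


def recompute(records, seed):
    """Hypothetical stand-in for a recomputation: same records, different order."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    return shuffled


# Made-up feature names, used only for illustration.
features = ["delta", "alpha", "charlie", "bravo"]

# Sort before indexing, so the IDs depend only on the values,
# not on the order in which the records happened to arrive.
first = dict(zip_with_index(sorted(recompute(features, seed=1))))
second = dict(zip_with_index(sorted(recompute(features, seed=2))))

# Both "recomputations" assign the same ID to every feature.
stable = first == second
```

The sort costs an extra pass over the data, which is the trade-off mentioned earlier in the thread, but it makes the index a function of the record values alone.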