help plz! how to use zipWithIndex to each subset of a RDD

askformore Wed, 29 Jul 2015 19:13:41 -0700

I have some data like this:

RDD[(String, String)] = ((key-1, a), (key-1, b), (key-2, a), (key-2, c), (key-3, b), (key-4, d))

I want to group the data by key and, within each group, add an index field to each
group member, so that the data ends up like this:

RDD[(String, Int, String)] = ((key-1, 1, a), (key-1, 2, b), (key-2, 1, a), (key-2, 2, c), (key-3, 1, b), (key-4, 1, d))

I tried groupByKey first, which gave me an RDD[(String, Iterable[String])], but I
don't know how to apply the zipWithIndex function to each Iterable. Thanks.
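
One possible way to do this, as a minimal sketch rather than a definitive answer: each Iterable produced by groupByKey is an ordinary Scala collection, so Scala's own zipWithIndex can be called on it inside a flatMap. The object name IndexWithinGroups, the helper indexGroup, the app name, and the local[*] master are all made up for illustration, and the sketch assumes spark-core is on the classpath.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object IndexWithinGroups {
  // Hypothetical helper: number the members of one group starting at 1
  // (Scala's zipWithIndex itself starts counting at 0).
  def indexGroup(key: String, values: Iterable[String]): Iterable[(String, Int, String)] =
    values.zipWithIndex.map { case (value, i) => (key, i + 1, value) }

  def main(args: Array[String]): Unit = {
    // App name and local master are assumptions for a standalone test run.
    val conf = new SparkConf().setAppName("index-within-groups").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val data = sc.parallelize(Seq(
        ("key-1", "a"), ("key-1", "b"), ("key-2", "a"),
        ("key-2", "c"), ("key-3", "b"), ("key-4", "d")))

      // groupByKey yields RDD[(String, Iterable[String])]; flatMap flattens
      // each indexed group back into a single RDD[(String, Int, String)].
      val indexed = data.groupByKey().flatMap { case (key, values) => indexGroup(key, values) }

      // collect() gives no ordering guarantee across keys, so sort for display.
      indexed.collect().sortBy(t => (t._1, t._2)).foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

Two caveats on this sketch. The order of values inside each group after groupByKey is not guaranteed to match the input order, so if the index must follow a particular order, sort first (for example values.toSeq.sorted.zipWithIndex). Also, groupByKey holds every value of a key in memory at once, which may be a problem for very large groups.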




--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/help-plz-how-to-use-zipWithIndex-to-each-subset-of-a-RDD-tp24071.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
