You could try using zipWithIndex (links to the API docs below). For example, in Python:
items = ['a', 'b', 'c']
items2 = sc.parallelize(items)
print(items2.first())                        # 'a'
items3 = items2.map(lambda x: (x, x + "!"))  # pair each item with a decorated copy
print(items3.first())                        # ('a', 'a!')
items4 = items3.zipWithIndex()               # append each element's index
print(items4.first())                        # (('a', 'a!'), 0)
items5 = items4.map(lambda x: (x[1], x[0]))  # swap so the index comes first
print(items5.first())                        # (0, ('a', 'a!'))

This will give you an output of (0, ('a', 'a!')), where the 0 is the index. You could also use a map to shift the indexes up by a value (e.g. if you wanted to count from 1); see the sketch after the links below.

Links
http://spark.apache.org/docs/latest/api/python/index.html
http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.rdd.RDD
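A minimal sketch of that offset, continuing from the items5 RDD above (the offset of 1 is just an assumed example value, not anything required by the API):

items6 = items5.map(lambda x: (x[0] + 1, x[1]))  # assumed offset of 1, so counting starts from 1
print(items6.first())                            # (1, ('a', 'a!'))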