Would monotonicallyIncreasingId <https://github.com/apache/spark/blob/d4c7a7a3642a74ad40093c96c4bf45a62a470605/sql/core/src/main/scala/org/apache/spark/sql/functions.scala#L637> work for you?
Best,
Burak

On Tue, Jul 21, 2015 at 4:55 PM, Srikanth <srikanth...@gmail.com> wrote:

> Hello,
>
> I'm creating dataframes from three CSV files using the spark-csv package. I
> want to add a unique ID for each row in the dataframe. I'm not sure how
> withColumn() can be used to achieve this. I need a Long value, not a UUID.
>
> One option I found was to create an RDD and use zipWithUniqueId:
>
>     sc.textFile(file).
>       zipWithUniqueId().
>       map { case (d, i) => i.toString + delimiter + d }.
>       map(_.split(delimiter)).
>       map(s => caseclass(...)).
>       toDF().select("field1", "field2")
>
> It's a bit hacky. Is there an easier way to do this on dataframes and use
> spark-csv?
>
> Srikanth
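For reference, the suggestion above could look roughly like this. This is a minimal sketch, assuming Spark 1.4+ (where monotonicallyIncreasingId was added to org.apache.spark.sql.functions) and the com.databricks:spark-csv package; the file path and the "id" column name are placeholders, not anything from the original thread:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.functions.monotonicallyIncreasingId

val sc = new SparkContext(
  new SparkConf().setMaster("local[2]").setAppName("add-row-ids"))
val sqlContext = new SQLContext(sc)

// Load the CSV through the spark-csv data source (placeholder path).
val df = sqlContext.read
  .format("com.databricks.spark.csv")
  .option("header", "true")
  .load("data.csv")

// Add a unique Long per row. The generated values are unique and
// monotonically increasing within each partition, but NOT consecutive:
// the partition id is encoded in the upper bits of the 64-bit value.
val withId = df.withColumn("id", monotonicallyIncreasingId())
```

Note that if you need consecutive ids (0, 1, 2, ...) rather than merely unique ones, this function is not a fit and the zipWithUniqueId/zipWithIndex route on an RDD remains the workaround.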