Hello, I'm creating DataFrames from three CSV files using the spark-csv package. I want to add a unique ID to each row in a DataFrame, but I'm not sure how withColumn() can be used to achieve this. I need a Long value, not a UUID.
One option I found was to create an RDD and use zipWithUniqueId:

    sc.textFile(file).
      zipWithUniqueId().
      map { case (d, i) => i.toString + delimiter + d }.
      map(_.split(delimiter)).
      map(s => caseclass(...)).
      toDF().select("field1", "field2")

It's a bit hacky. Is there an easier way to do this on DataFrames while still using spark-csv?

Srikanth
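
For reference, the zipWithUniqueId idea above can also be applied to the RDD underneath a DataFrame that spark-csv has already loaded, which avoids the manual split and case class. The following is only a minimal sketch, assuming a Spark 1.x SQLContext; the object name UniqueId, the helper withUniqueId, and the column name "id" are illustrative and not part of Spark or spark-csv.

```scala
import org.apache.spark.sql.{DataFrame, Row, SQLContext}
import org.apache.spark.sql.types.{LongType, StructField, StructType}

object UniqueId {
  // Hypothetical helper: prepends a unique Long column to an existing DataFrame.
  def withUniqueId(sqlContext: SQLContext, df: DataFrame, idCol: String = "id"): DataFrame = {
    // zipWithUniqueId pairs every Row with a Long that is unique
    // across the RDD, though not necessarily consecutive.
    val rowsWithId = df.rdd.zipWithUniqueId().map {
      case (row, id) => Row.fromSeq(id +: row.toSeq)
    }
    // Put a non-nullable Long field in front of the original schema.
    val schema = StructType(
      StructField(idCol, LongType, nullable = false) +: df.schema.fields)
    sqlContext.createDataFrame(rowsWithId, schema)
  }
}
```

Usage would look roughly like this, assuming Spark 1.4 or later for sqlContext.read, with a placeholder file name:

```scala
val df = sqlContext.read
  .format("com.databricks.spark.csv")
  .option("header", "true")
  .load("file1.csv") // placeholder path
val dfWithId = UniqueId.withUniqueId(sqlContext, df)
```

The IDs are unique within one DataFrame only. If the three files must not share IDs, one option under the same assumptions is to union the DataFrames before calling the helper.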