Re: Add row IDs column to data frame

akbar501 Wed, 11 Jan 2017 23:56:17 -0800

RDDs, DataFrames, and Datasets are all immutable, so you cannot edit any of
them in place. Instead, call transformation functions on the
RDD/DataFrame/Dataset: RDD transformations return a new RDD, DataFrame
transformations return a new DataFrame, and so on.
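
For the row ID case in the subject line, a minimal sketch of that idea might
look like the following. It assumes Spark 2.x run locally; the app name,
column names, and sample data are made up for illustration, and
monotonically_increasing_id() is just one way to get IDs (they are unique
and increasing, but not guaranteed to be consecutive).

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.monotonically_increasing_id

// Assumption: a local Spark 2.x session; the app name is hypothetical.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("row-id-sketch")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample data with a single column named "value".
val df = Seq("a", "b", "c").toDF("value")

// withColumn is a transformation: it returns a new DataFrame
// and leaves df itself unchanged.
val withId = df.withColumn("row_id", monotonically_increasing_id())

df.columns      // still only "value"
withId.columns  // "value" and "row_id"
```

If you need consecutive IDs, going through the RDD with zipWithIndex is
another option, at the cost of converting back to a DataFrame afterwards.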


Essentially, you chain a series of transformations together, and then apply
an action. The action will cause Spark to actually run a computation.

For example:

```scala
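// sc is the SparkContext that spark-shell provides.
// Transformations are lazy, so nothing runs until an action is called.
val lines = sc.textFile("./README.md")
// flatMap yields one element per word (map would yield one array per line)
val words = lines.flatMap(_.split(" "))
// ...continue with more transformations
// now call an action to cause the computation to run
words.count()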
```



-----
Delixus.com - Spark Consulting
--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Append-column-to-Data-Frame-or-RDD-tp22385p28299.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
