Hi Arun,
This documentation may be helpful:
The 2.0-preview Scala doc for the Dataset class:
http://spark.apache.org/docs/2.0.0-preview/api/scala/index.html#org.apache.spark.sql.Dataset
Note that the Dataset API has completely changed from 1.6.
In 2.0, there is no separate DataFrame class. Rather, i
>
> 1) What does this really mean to an Application developer?
>
It means there are fewer concepts to learn.
> 2) Why this unification was needed in Spark 2.0?
>
To simplify the API and reduce the number of concepts that need to be
learned. The only reason we didn't do it in 1.6 is that we didn't want t
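To make the unification concrete, here is a minimal sketch against the Spark 2.x Scala API, where DataFrame is declared as a type alias for Dataset[Row]. The Person case class, the sample values, the local master setting, and the application name are illustrative assumptions, not something taken from this thread.

```scala
import org.apache.spark.sql.{DataFrame, Dataset, Row, SparkSession}

object UnificationSketch {
  // Hypothetical record type, used only for illustration.
  case class Person(name: String, age: Long)

  def main(args: Array[String]): Unit = {
    // Assumes Spark 2.x is on the classpath; runs in local mode.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("unification-sketch")
      .getOrCreate()
    import spark.implicits._

    // A strongly typed Dataset built from a local collection.
    val people: Dataset[Person] = Seq(Person("Ann", 34L), Person("Bob", 19L)).toDS()

    // Converting to the untyped view.
    val df: DataFrame = people.toDF()

    // This assignment compiles without any conversion because, in 2.x,
    // DataFrame and Dataset[Row] are the same type.
    val rows: Dataset[Row] = df

    // Going back to the typed view.
    val typed: Dataset[Person] = rows.as[Person]

    // The same filter, once with a typed lambda and once with a column expression.
    typed.filter(_.age > 21).show()
    df.filter($"age" > 21).show()

    spark.stop()
  }
}
```

The toDS(), toDF(), and as[T] calls also exist in 1.6, which may be why converting between the two looks the same in both versions. The difference is in the types: in 1.6, DataFrame and Dataset were separate classes, so the direct assignment to Dataset[Row] above would not be expected to compile there.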
Can anyone answer these questions, please?
On Mon, Jun 13, 2016 at 6:51 PM, Arun Patel wrote:
Thanks, Michael.
I went through these slides already but could not find answers to these
specific questions.
I created a Dataset and converted it to a DataFrame in both 1.6 and 2.0,
and I don't see any difference between the two. So I got confused and
asked these questions about unification.
Appreciat
Here's a talk I gave on the topic:
https://www.youtube.com/watch?v=i7l3JQRx7Qw
http://www.slideshare.net/SparkSummit/structuring-spark-dataframes-datasets-and-streaming-by-michael-armbrust
On Mon, Jun 13, 2016 at 4:01 AM, Arun Patel wrote:
> In Spark 2.0, DataFrames and Datasets are unified. Da