Why can DataFrame be more efficient than Dataset?

Shiyuan Sat, 08 Apr 2017 11:15:41 -0700

Hi Spark-users,
    I came across a few sources that say a DataFrame can be more
efficient than a Dataset. I can see why this holds when a Dataset uses
functional transformations, since Catalyst cannot look inside them and
hence cannot optimize them well. But can a DataFrame still be more
efficient than a Dataset if we only use relational transformations on
the Dataset? If so, can anyone explain why? Are there any benchmarks
comparing Dataset vs. DataFrame? Thank you!
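
For illustration, here is a minimal sketch of the two styles the question contrasts. The Person case class, the sample rows, and the object name are made up for this example and do not come from any benchmark. The first filter is relational (a column expression Catalyst can inspect); the second is functional (a Scala lambda Catalyst treats as opaque). Comparing the two explain outputs is one way to see how the plans differ; the exact plan text depends on the Spark version.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical record type, used only for this illustration.
case class Person(name: String, age: Long)

object DatasetVsDataFrameSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("dataset-vs-dataframe-sketch")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data.
    val people = Seq(Person("alice", 30L), Person("bob", 17L)).toDS()

    // Relational transformation: the predicate is a Column expression,
    // so Catalyst can analyze and optimize it.
    val relational = people.filter($"age" > 18)

    // Functional transformation: the predicate is an arbitrary Scala
    // function, which Catalyst cannot look into.
    val functional = people.filter((p: Person) => p.age > 18)

    // Print the logical and physical plans for comparison.
    relational.explain(true)
    functional.explain(true)

    spark.stop()
  }
}
```

Both variants are typed as Dataset[Person] and return the same rows; the open question is whether the first one still pays any cost relative to the equivalent untyped DataFrame.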


Shiyuan
