Hi,
Since Dataset will be the major API in Spark 2.0, why will MLlib be
DataFrame-based, given that 'future development will focus on the DataFrame-based API'?
Is there any plan to change MLlib from DataFrame-based to Dataset-based?
Thanks,
lujinhong
--
DataFrame is a special case of Dataset (in Spark 2.0 the Scala API defines
DataFrame as a type alias for Dataset[Row]), so the two terms refer to the same API.
In fact, the ML pipeline API will accept Dataset[_] instead of DataFrame in
Spark 2.0.
More accurately, we can say that MLlib will focus on the Dataset-based API
for further development.
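
To illustrate the point, here is a minimal sketch, assuming Spark 2.0 with the spark-sql and spark-mllib artifacts on the classpath and a local master. The object name, the sample rows, and the choice of LogisticRegression are only illustrative assumptions; any Estimator should behave the same way with respect to its input type.

```scala
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.feature.LabeledPoint
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.sql.{DataFrame, Dataset, Row, SparkSession}

// Hypothetical example object; the name and the data are made up for illustration.
object DatasetVsDataFrame {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]") // assumption: run locally
      .appName("dataset-vs-dataframe")
      .getOrCreate()
    import spark.implicits._

    // A typed Dataset of labeled points (columns "label" and "features").
    val typed: Dataset[LabeledPoint] = Seq(
      LabeledPoint(0.0, Vectors.dense(0.0, 1.1)),
      LabeledPoint(0.0, Vectors.dense(0.2, 0.9)),
      LabeledPoint(1.0, Vectors.dense(2.0, 1.0)),
      LabeledPoint(1.0, Vectors.dense(2.2, 1.3))
    ).toDS()

    // The untyped view of the same data.
    val untyped: DataFrame = typed.toDF()

    // This assignment compiles because DataFrame is an alias for Dataset[Row].
    val sameThing: Dataset[Row] = untyped

    // Estimator.fit is declared to take Dataset[_], so both forms are accepted.
    val lr = new LogisticRegression().setMaxIter(5)
    val modelFromTyped = lr.fit(typed)
    val modelFromUntyped = lr.fit(sameThing)

    // Transformer.transform also takes Dataset[_] and returns a DataFrame.
    val predictions: DataFrame = modelFromTyped.transform(typed)
    predictions.select("label", "prediction").show()

    spark.stop()
  }
}
```

Note that the output of transform is still a DataFrame, i.e. a Dataset[Row], which is why the guide can keep calling the API "DataFrame-based" without contradiction.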
Thanks
Yanbo
2016-07-10 20:35 GMT-0