mllib based on dataset or dataframe

2016-07-10 Thread jinhong lu
Hi, since Dataset will be the major API in Spark 2.0, why will MLlib be DataFrame-based, given the statement that 'future development will focus on the DataFrame-based API'? Is there any plan to change MLlib from DataFrame-based to Dataset-based? Thanks, lujinhong

Re: mllib based on dataset or dataframe

2016-07-10 Thread Yanbo Liang
DataFrame is a special case of Dataset (in the Spark 2.0 Scala API, DataFrame is a type alias for Dataset[Row]), so in this context they mean the same thing. In fact, the ML pipeline API will accept Dataset[_] instead of DataFrame in Spark 2.0. To be more accurate, we can say that MLlib will focus on the Dataset-based API for further development. Thanks Yanbo 2016-07-10 20:35 GMT-0
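
To make the point in the reply concrete, here is a minimal Scala sketch, assuming Spark 2.x with spark-sql and spark-mllib on the classpath. The `Point` case class, the column names, the sample values, and the object name are hypothetical and chosen only for illustration. It shows that a DataFrame can be assigned to a Dataset[Row] without conversion, and that a pipeline can be fitted on a typed Dataset directly.

```scala
import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.sql.{DataFrame, Dataset, Row, SparkSession}

// Hypothetical record type, used only for this illustration.
case class Point(x1: Double, x2: Double, label: Double)

object DatasetPipelineSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("DatasetPipelineSketch")
      .master("local[*]") // assumption: local run for demonstration
      .getOrCreate()
    import spark.implicits._

    // A typed Dataset[Point] built from made-up sample rows.
    val typed: Dataset[Point] = Seq(
      Point(0.0, 1.1, 0.0),
      Point(2.0, 1.0, 1.0),
      Point(0.1, 1.3, 0.0),
      Point(2.2, 0.9, 1.0)
    ).toDS()

    // DataFrame is an alias for Dataset[Row], so this assignment
    // compiles without any conversion call.
    val untyped: DataFrame = typed.toDF()
    val sameThing: Dataset[Row] = untyped

    val assembler = new VectorAssembler()
      .setInputCols(Array("x1", "x2"))
      .setOutputCol("features")
    val lr = new LogisticRegression().setMaxIter(10)
    val pipeline = new Pipeline().setStages(Array(assembler, lr))

    // fit is declared on Dataset[_], so the typed Dataset[Point]
    // is accepted directly.
    val model = pipeline.fit(typed)

    // transform also takes Dataset[_], but its result is a DataFrame.
    val predictions: DataFrame = model.transform(sameThing)
    predictions.select("x1", "x2", "label", "prediction").show()

    spark.stop()
  }
}
```

Note that the stages still look columns up by name ("x1", "x2", "label"), so a typed Dataset passed to fit is effectively treated through its untyped, column-based view.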