Re: how to retain part of the features in LogisticRegressionModel (spark2.0)

2017-03-19 Thread jinhong lu
Thanks Dhanesh, and how about the features question? > On March 19, 2017, at 19:08, Dhanesh Padmanabhan wrote: > > Dhanesh Thanks, lujinhong

Re: how to retain part of the features in LogisticRegressionModel (spark2.0)

2017-03-19 Thread jinhong lu
By the way, I found that in Spark 2.1 I can use setFamily() to choose between binomial and multinomial, but how can I do the same thing in Spark 2.0.2? If it is not supported, which one is used in Spark 2.0.2: binomial or multinomial? > On March 19, 2017, at 18:12, jinhong lu wrote: > > > I train my LogisticReg

how to retain part of the features in LogisticRegressionModel (spark2.0)

2017-03-19 Thread jinhong lu
I train my LogisticRegressionModel like this. I want my model to retain only some of the features (e.g. 500 of them), not all of them. What should I do? I use .setElasticNetParam(1.0), but all the features are still in lrModel.coefficients. import org.apache.spark.ml.classifi
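For illustration only, and not taken from the thread: in Spark ML the elastic-net mixing parameter has an effect only when regParam is greater than 0, and the training code above is truncated, so it is unclear whether regParam was set. Even with an L1 penalty, the number of non-zero coefficients depends on the penalty strength and is not fixed in advance. If exactly k features must remain, one possible workaround is to keep the k coefficients with the largest absolute value and zero the rest. The sketch below does this on a plain array (such as the result of lrModel.coefficients.toArray). The names CoefficientPruning and keepTopK are hypothetical and are not part of Spark.

```scala
// Hypothetical helper, not a Spark API: prunes a coefficient array to its
// k largest-magnitude entries.
object CoefficientPruning {

  /** Returns a copy of `coefficients` in which only the `k` entries with the
    * largest absolute value are kept; every other entry is set to 0.0.
    * Ties are broken in favour of the lower index (sortBy is stable).
    */
  def keepTopK(coefficients: Array[Double], k: Int): Array[Double] = {
    require(k >= 0, "k must be non-negative")

    // Indices of the k coefficients with the largest absolute value.
    val keep: Set[Int] = coefficients.indices
      .sortBy(i => -math.abs(coefficients(i)))
      .take(k)
      .toSet

    // Keep the selected coefficients, zero out the others.
    coefficients.zipWithIndex.map { case (c, i) => if (keep(i)) c else 0.0 }
  }
}
```

Zeroing coefficients after training changes the model's predictions, so the pruned model would need to be re-evaluated. Selecting the features before training is an alternative.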

Re: how to construct parameter for model.transform() from datafile

2017-03-13 Thread jinhong lu
Can anyone help? > On March 13, 2017, at 19:38, jinhong lu wrote: > > After training the model, I got a result that looks like this: > > > scala> predictionResult.show() > > +-++++--+ >

Re: how to construct parameter for model.transform() from datafile

2017-03-13 Thread jinhong lu
ents of x. A: 144109, x: 804202 at scala.Predef$.require(Predef.scala:224) at org.apache.spark.ml.linalg.BLAS$.gemv(BLAS.scala:521) at org.apache.spark.ml.linalg.Matrix$class.multiply(Matrices.scala:110) at org.apache.spark.ml.linalg.DenseMatrix.multiply(Matrices.scala:176) wh

mllib based on dataset or dataframe

2016-07-10 Thread jinhong lu
Hi, Since Dataset will be the major API in Spark 2.0, why will MLlib be DataFrame-based, with 'future development will focus on the DataFrame-based API'? Is there any plan to change MLlib from DataFrame-based to Dataset-based? Thanks, lujinhong