I am new to MLlib and to Spark (I use Scala).
I'm trying to understand how LogisticRegressionWithLBFGS and
LogisticRegressionWithSGD work.
I usually use R for logistic regression, but now I'm doing it on Spark
so I can analyze big data.
The model only returns weights and an intercept. My problem is that
I'm getting poor performance from the classification methods
in Spark MLlib (worse than R in terms of AUC).
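
For reference, here is a minimal sketch of how the two classes are invoked. The training set (training) is a hypothetical RDD[LabeledPoint] prepared elsewhere:

import org.apache.spark.mllib.classification.{LogisticRegressionWithLBFGS, LogisticRegressionWithSGD}
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.rdd.RDD

// Hypothetical training set; in practice built from your own data
val training: RDD[LabeledPoint] = ???

// L-BFGS: a batch quasi-Newton optimizer, usually the more robust choice
val lbfgsModel = new LogisticRegressionWithLBFGS()
  .setNumClasses(2)
  .run(training)

// SGD: stochastic gradient descent, sensitive to numIterations and stepSize
val sgdModel = LogisticRegressionWithSGD.train(training, 100)

// Both return a LogisticRegressionModel exposing only weights and intercept
println(lbfgsModel.weights, lbfgsModel.intercept)
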
I am trying to set my own parameters rather than use the defaults.
Here is the method I want to use:
train(RDD&lt;LabeledPoint&gt; input,
      int numIterations,
      double stepSize)
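
In Scala, that overload is called on the LogisticRegressionWithSGD companion object. A minimal sketch, reusing the hypothetical training RDD and imports from the snippet above:

// Explicit hyperparameters instead of the defaults
val numIterations = 200
val stepSize = 0.5

val model = LogisticRegressionWithSGD.train(training, numIterations, stepSize)
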
I've tried several different pairs of parameters for
LogisticRegressionWithSGD, and here are my results.
numIterations varies from 100 to 500 in steps of 50, and stepSize varies
from 0.1 to 1 in steps of 0.1.
The last row shows the maximum of each column, and the last column the
maximum of each row.
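
For context, the AUC values in that grid come from an evaluation loop along these lines. This is a sketch: the data set (data) and the 70/30 split are hypothetical, and BinaryClassificationMetrics computes the AUC on the held-out part:

import org.apache.spark.mllib.classification.LogisticRegressionWithSGD
import org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.rdd.RDD

// Hypothetical labelled data set, split into train and test
val data: RDD[LabeledPoint] = ???
val Array(trainSet, testSet) = data.randomSplit(Array(0.7, 0.3), seed = 11L)

def auc(numIterations: Int, stepSize: Double): Double = {
  val model = LogisticRegressionWithSGD.train(trainSet, numIterations, stepSize)
  model.clearThreshold() // predict raw scores instead of 0/1 labels
  val scoreAndLabels = testSet.map(p => (model.predict(p.features), p.label))
  new BinaryClassificationMetrics(scoreAndLabels).areaUnderROC()
}

// Grid: numIterations 100 to 500 by 50, stepSize 0.1 to 1.0 by 0.1
for (n <- 100 to 500 by 50; s <- (1 to 10).map(_ / 10.0))
  println(s"numIterations=$n stepSize=$s AUC=${auc(n, s)}")
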
I'm trying to read a JSON file that looks like this:
[
  {"IFAM":"EQR","KTM":143000640,"COL":21,"DATA":[
    {"MLrate":"30","Nrout":"0","up":null,"Crate":"2"},
    {"MLrate":"30","Nrout":"0","up":null,"Crate":"2"},
    {"MLrate":"30","Nrout":"0","up":null,"Crate":"2"},
    ...
  ]}
]
I've imported a JSON file that has this schema:
sqlContext.read.json("filename").printSchema
root
 |-- COL: long (nullable = true)
 |-- DATA: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- Crate: string (nullable = true)
 |    |    |-- MLrate: string (nullable = true)
 |    |    |-- Nrout: string (nullable = true)
 |    |    |-- up: string (nullable = true)
 |-- IFAM: string (nullable = true)
 |-- KTM: long (nullable = true)
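
To work with the nested DATA array, one option is to flatten it with explode from org.apache.spark.sql.functions. A sketch, with column names taken from the schema above and the file name kept as the placeholder used earlier:

import org.apache.spark.sql.functions.explode

val df = sqlContext.read.json("filename")

// One row per element of the DATA struct array, keeping the top-level keys
val flat = df.select(df("IFAM"), df("KTM"), df("COL"), explode(df("DATA")).as("d"))
  .select("IFAM", "KTM", "COL", "d.MLrate", "d.Nrout", "d.Crate")

flat.show()
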