I'm new to Spark and I'm getting poor results with the classification methods
in Spark MLlib (a lower AUC than I get in R).
I am trying to set my own parameters rather than rely on the defaults.
Here is the method I want to use:
    train(RDD<LabeledPoint> input,
          int numIterations,
          double stepSize,
          double miniBatchFraction,
          Vector initialWeights)
How should I choose "numIterations" and "stepSize"?
What does "miniBatchFraction" mean?
Is "initialWeights" necessary to get a good model? If so, how should I choose it?
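
To make the questions concrete, here is a minimal plain-Java sketch of the role these four parameters usually play in mini-batch SGD for logistic regression. It does not call Spark: the class MiniBatchSgdSketch and its train helper are hypothetical names, and the step-size decay (stepSize divided by the square root of the iteration number) is an assumption about how such a trainer might behave, not a statement of what MLlib does internally.

```java
import java.util.Random;

public class MiniBatchSgdSketch {

    static double sigmoid(double z) {
        return 1.0 / (1.0 + Math.exp(-z));
    }

    // Hypothetical stand-in for the train(...) signature above, on plain arrays.
    static double[] train(double[][] features, double[] labels,
                          int numIterations, double stepSize,
                          double miniBatchFraction, double[] initialWeights) {
        // initialWeights is only the starting point; it is copied, not modified.
        double[] weights = initialWeights.clone();
        Random rng = new Random(42L); // fixed seed so runs are repeatable

        for (int iter = 1; iter <= numIterations; iter++) {
            double[] gradient = new double[weights.length];
            int batchSize = 0;

            for (int i = 0; i < features.length; i++) {
                // miniBatchFraction: each example is used in this iteration
                // with this probability (1.0 means the full data set).
                if (rng.nextDouble() >= miniBatchFraction) {
                    continue;
                }
                batchSize++;
                double margin = 0.0;
                for (int j = 0; j < weights.length; j++) {
                    margin += weights[j] * features[i][j];
                }
                // Gradient of the logistic loss for one example.
                double error = sigmoid(margin) - labels[i];
                for (int j = 0; j < weights.length; j++) {
                    gradient[j] += error * features[i][j];
                }
            }
            if (batchSize == 0) {
                continue; // nothing sampled this round
            }

            // Assumed decay schedule: the step shrinks as iterations go on.
            double iterStep = stepSize / Math.sqrt(iter);
            for (int j = 0; j < weights.length; j++) {
                weights[j] -= iterStep * gradient[j] / batchSize;
            }
        }
        return weights;
    }

    public static void main(String[] args) {
        // Toy data: first column is a constant bias feature.
        double[][] x = {{1.0, -2.0}, {1.0, -1.0}, {1.0, 1.0}, {1.0, 2.0}};
        double[] y = {0.0, 0.0, 1.0, 1.0};
        double[] w = train(x, y, 100, 1.0, 1.0, new double[] {0.0, 0.0});
        System.out.println(w[0] + " " + w[1]);
    }
}
```

In this sketch numIterations bounds how many updates are made, stepSize scales each update, miniBatchFraction trades gradient noise against the cost of each iteration, and initialWeights only sets where the search starts.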
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/How-to-get-the-best-performance-with-LogisticRegressionWithSGD-tp23053.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.