I used the CrossValidator tool for tuning the regularization parameter. My code is here:
from pyspark.ml.classification import LogisticRegression
from pyspark.ml.tuning import ParamGridBuilder, CrossValidator
from pyspark.ml.evaluation import BinaryClassificationEvaluator

reg = 100.0  # regularization strength to try
lr = LogisticRegression(maxIter=100, regParam=reg)  # maxIter=100 is a placeholder value
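
For reference, the rest of the tuning setup might look roughly like this. It is only a
sketch continuing the snippet above: the DataFrame name data_train_df, the grid values,
and numFolds=3 are assumptions, not the original code.

grid = (ParamGridBuilder()
        .addGrid(lr.regParam, [0.01, 0.1, 1.0, 10.0, 100.0])  # candidate regParam values
        .build())

evaluator = BinaryClassificationEvaluator()  # defaults to areaUnderROC

cv = CrossValidator(estimator=lr,
                    estimatorParamMaps=grid,
                    evaluator=evaluator,
                    numFolds=3)

cv_model = cv.fit(data_train_df)   # data_train_df: training DataFrame with "features"/"label"
best_model = cv_model.bestModel    # LogisticRegressionModel refit with the best parameters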
Spark MLlib provides a cross-validation toolkit for selecting
hyperparameters. I think you'll find the documentation quite helpful:
http://spark.apache.org/docs/latest/ml-tuning.html#example-model-selection-via-cross-validation
There is actually a Python example for logistic regression there.
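
Once the cross-validator has been fit, you can also see how each parameter combination
scored on average. A rough sketch, assuming cv_model is the fitted CrossValidatorModel
and grid is the list of parameter maps built with ParamGridBuilder (both names are
assumptions):

# Average evaluator metric for each parameter map, across the folds.
for params, metric in zip(grid, cv_model.avgMetrics):
    print({p.name: v for p, v in params.items()}, metric)

best_model = cv_model.bestModel  # model refit with the best-scoring parameters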
OK, so I tried setting the regParam and tried lowering it. How do I evaluate
which regParam is best? Do I have to do it by trial and error? I am
currently calculating the log_loss for the model. Is that a good way to find the
best regParam value? Here is my code:
from math import exp, log  # used to compute the log loss by hand
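
For what it's worth, here is a minimal sketch of computing the log loss from a
predictions DataFrame; the column names ("label", "probability") and the epsilon
clipping are assumptions:

from math import log

def log_loss(pred_df, eps=1e-15):
    # pred_df: DataFrame with a numeric "label" column and a "probability"
    # vector column, as produced by LogisticRegressionModel.transform().
    def row_loss(row):
        p = float(row["probability"][1])          # predicted P(label = 1)
        p = min(max(p, eps), 1.0 - eps)           # clip to avoid log(0)
        y = float(row["label"])
        return -(y * log(p) + (1.0 - y) * log(1.0 - p))
    return pred_df.select("label", "probability").rdd.map(row_loss).mean()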
Thank you, Anurag Verma, for replying. I tried increasing the iterations.
However, I still get underfitted results. I am checking the model's
predictions by seeing how many pairs of labels and predictions it gets right:
data_predict_with_model = best_model.transform(data_test_df)
final_pred_df = data_predict_with_model.select("label", "prediction")  # assumed column selection
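
If it helps, here is a rough sketch of counting correct label/prediction pairs and of
getting the accuracy directly from an evaluator; the DataFrame and column names follow
the snippet above and are otherwise assumptions:

from pyspark.ml.evaluation import MulticlassClassificationEvaluator

predictions = best_model.transform(data_test_df)

# Fraction of rows where the predicted label matches the true label.
accuracy = MulticlassClassificationEvaluator(
    labelCol="label", predictionCol="prediction", metricName="accuracy"
).evaluate(predictions)

# Equivalent manual count of correct label/prediction pairs.
n_correct = predictions.filter(predictions.label == predictions.prediction).count()
n_total = predictions.count()
print(accuracy, n_correct / float(n_total))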
Your regularization parameter is probably set too high. Try regParam=0.1 or
0.2. Also, you should probably increase the number of iterations to something
like 500. Additionally, you can specify elasticNetParam (between 0 and 1).
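
For example, a minimal sketch of those settings (the exact values are only suggestions
to start from):

from pyspark.ml.classification import LogisticRegression

lr = LogisticRegression(maxIter=500,          # more iterations so the solver can converge
                        regParam=0.1,         # much weaker regularization than 100.0
                        elasticNetParam=0.5)  # 0 = pure L2 (ridge), 1 = pure L1 (lasso)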