Spark MLlib provides a cross-validation toolkit for selecting
hyperparameters. I think you'll find the documentation quite helpful:

http://spark.apache.org/docs/latest/ml-tuning.html#example-model-selection-via-cross-validation

There is actually a Python example for logistic regression there. If you
still have questions after reading it, please post back.
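
Roughly, the idea is to let CrossValidator fit the model over a grid of
regParam values and keep the one that scores best on held-out folds, rather
than hand-tuning by trial and error. A minimal sketch along those lines
(untested; data_train_df stands for whatever training DataFrame you already
have, and the grid values are arbitrary placeholders):

from pyspark.ml.classification import LogisticRegression
from pyspark.ml.evaluation import BinaryClassificationEvaluator
from pyspark.ml.tuning import CrossValidator, ParamGridBuilder

lr = LogisticRegression(maxIter=500, elasticNetParam=0.5)

# Candidate regParam values; CrossValidator fits one model per value per fold.
grid = (ParamGridBuilder()
        .addGrid(lr.regParam, [0.001, 0.01, 0.02, 0.1])
        .build())

# Default metric is areaUnderROC; swap in a different evaluator if you prefer.
evaluator = BinaryClassificationEvaluator()

cv = CrossValidator(estimator=lr,
                    estimatorParamMaps=grid,
                    evaluator=evaluator,
                    numFolds=3)

cv_model = cv.fit(data_train_df)   # data_train_df: your existing training DataFrame
best_model = cv_model.bestModel    # refit with the regParam that scored best

You can still compute your own log loss on the validation set from
best_model's predictions if that is the metric you care about.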

Hope that helps.

On Thu, Oct 13, 2016 at 12:58 PM, aditya1702 <adityavya...@gmail.com> wrote:

> OK, so I tried setting regParam and lowering it. How do I evaluate which
> regParam is best? Do I have to do it by trial and error? I am currently
> calculating the log_loss for the model. Is that a good way to find the
> best regParam value? Here is my code:
>
> from math import exp, log
>
> epsilon = 1e-16
> def sigmoid_log_loss(w, b, x):
>   # Predicted probability: sigmoid of (weights . features + intercept).
>   ans = float(1 / (1 + exp(-(w.dot(x.features) + b))))
>   # Clamp away from 0 and 1 so the log() calls below stay finite.
>   if ans == 0:
>     ans = ans + epsilon
>   if ans == 1:
>     ans = ans - epsilon
>   log_loss = -(x.label * log(ans) + (1 - x.label) * log(1 - ans))
>   return ((ans, x.label), log_loss)
>
> -------------------------------------------------------
> reg = 0.02
> from pyspark.ml.classification import LogisticRegression
> lr = LogisticRegression(regParam=reg, maxIter=500, standardization=True,
>                         elasticNetParam=0.5)
> model = lr.fit(data_train_df)
>
> w = model.coefficients
> intercept = model.intercept
> # Map over the rows via the underlying RDD (DataFrame.map is not available
> # in PySpark 2.x), passing the intercept along with the coefficients.
> data_predicted = data_val_df.rdd.map(lambda x: sigmoid_log_loss(w, intercept, x))
> log_loss = data_predicted.map(lambda x: x[1]).mean()
> print(log_loss)
>
