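Spark 1.4 should be available next month, but I'm not sure of the exact date.

Your interpretation of high lambda is reasonable: with L1 regularization, a larger lambda pushes more coefficients to exactly zero, so only a few variables stay in the model. What counts as "high" is really data-dependent, so there is no single value I can recommend; you have to try a range of values on your own data.

"lambda" is the same as "regParam" in Spark. It is available in all recent Spark versions, so you do not need to wait for 1.4 to use it.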
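To make the "high lambda gives lots of zeros" point concrete, here is a minimal sketch in plain Scala of the soft-thresholding step that many L1 solvers apply to the coefficients. It is only an illustration, not Spark's implementation: the object name, the helper functions, and the example weights are all hypothetical, and the lambda values are arbitrary.

```scala
object L1ShrinkageSketch {
  // Soft-thresholding: move w toward zero by lambda, and set it to exactly
  // zero when its magnitude is not larger than lambda.
  def softThreshold(w: Double, lambda: Double): Double =
    if (math.abs(w) <= lambda) 0.0 else w - math.signum(w) * lambda

  // Apply the shrinkage to every coefficient.
  def shrink(weights: Seq[Double], lambda: Double): Seq[Double] =
    weights.map(w => softThreshold(w, lambda))

  // Number of coefficients that were removed (set to exactly zero).
  def countZeros(weights: Seq[Double]): Int = weights.count(_ == 0.0)

  def main(args: Array[String]): Unit = {
    // Hypothetical unregularized coefficients, made up for illustration.
    val weights = Seq(2.5, -0.8, 0.05, -0.3, 1.2)
    for (lambda <- Seq(0.0, 0.1, 0.5, 1.0)) {
      val shrunk = shrink(weights, lambda)
      println(s"lambda=$lambda zeros=${countZeros(shrunk)} weights=${shrunk.mkString(", ")}")
    }
  }
}
```

With these made-up weights, the number of zeroed coefficients grows from 0 to 3 as lambda goes from 0.0 to 1.0, while the large coefficients survive. Which lambda gives a useful model still depends on the scale of your features and on your data.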

On Fri, May 29, 2015 at 5:35 AM, mélanie gallois <melanie.galloi...@gmail.com> wrote:

> When will Spark 1.4 be available exactly?
> To answer to "Model selection can be achieved through high
> lambda resulting lots of zero in the coefficients" : Do you mean that
> putting a high lambda as a parameter of the logistic regression keeps only
> a few significant variables and "deletes" the others with a zero in the
> coefficients? What is a high lambda for you?
> Is the lambda a parameter available in Spark 1.4 only or can I see it in
> Spark 1.3?
>
> 2015-05-23 0:04 GMT+02:00 Joseph Bradley <jos...@databricks.com>:
>
>> If you want to select specific variable combinations by hand, then you
>> will need to modify the dataset before passing it to the ML algorithm. The
>> DataFrame API should make that easy to do.
>>
>> If you want to have an ML algorithm select variables automatically, then
>> I would recommend using L1 regularization for now and possibly elastic net
>> after 1.4 is release, per DB's suggestion.
>>
>> If you want detailed model statistics similar to what R provides, I've
>> created a JIRA for discussing how we should add that functionality to
>> MLlib. Those types of stats will be added incrementally, but feedback
>> would be great for prioritization:
>> https://issues.apache.org/jira/browse/SPARK-7674
>>
>> To answer your question: "How are the weights calculated: is there a
>> correlation calculation with the variable of interest?"
>> --> Weights are calculated as with all logistic regression algorithms, by
>> using convex optimization to minimize a regularized log loss.
>>
>> Good luck!
>> Joseph
>>
>> On Fri, May 22, 2015 at 1:07 PM, DB Tsai <dbt...@dbtsai.com> wrote:
>>
>>> In Spark 1.4, Logistic Regression with elasticNet is implemented in ML
>>> pipeline framework. Model selection can be achieved through high
>>> lambda resulting lots of zero in the coefficients.
>>>
>>> Sincerely,
>>>
>>> DB Tsai
>>> -------------------------------------------------------
>>> Blog: https://www.dbtsai.com
>>>
>>>
>>> On Fri, May 22, 2015 at 1:19 AM, SparknewUser
>>> <melanie.galloi...@gmail.com> wrote:
>>> > I am new in MLlib and in Spark.(I use Scala)
>>> >
>>> > I'm trying to understand how LogisticRegressionWithLBFGS and
>>> > LogisticRegressionWithSGD work.
>>> > I usually use R to do logistic regressions but now I do it on Spark
>>> > to be able to analyze Big Data.
>>> >
>>> > The model only returns weights and intercept. My problem is that I have no
>>> > information about which variable is significant and which variable I had
>>> > better to delete to improve my model. I only have the confusion matrix and
>>> > the AUC to evaluate the performance.
>>> >
>>> > Is there any way to have information about the variables I put in my model?
>>> > How can I try different variable combinations, do I have to modify the
>>> > dataset of origin (e.g. delete one or several columns?)
>>> > How are the weights calculated: is there a correlation calculation with the
>>> > variable of interest?
>>> >
>>> > --
>>> > View this message in context:
>>> > http://apache-spark-user-list.1001560.n3.nabble.com/MLlib-how-to-get-the-best-model-with-only-the-most-significant-explanatory-variables-in-LogisticRegr-tp22993.html
>>> > Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>> >
>>> > ---------------------------------------------------------------------
>>> > To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>>> > For additional commands, e-mail: user-h...@spark.apache.org
>>>
>>
>
> --
> *Mélanie*