The regularization is handled after the objective function of the data is computed. See line 348 of https://github.com/apache/spark/blob/6a827d5d1ec520f129e42c3818fe7d0d870dcbef/mllib/src/main/scala/org/apache/spark/ml/regression/LinearRegression.scala for L2.
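In other words, once the data part of the objective, 1/(2n) * ||A w - y||^2, and its gradient have been accumulated over the partitions, the L2 piece is simply added on top before everything is handed to the optimizer; the full elastic-net objective is 1/(2n) * ||A w - y||^2 + regParam * ((1 - elasticNetParam)/2 * ||w||_2^2 + elasticNetParam * ||w||_1). Here is a rough sketch of the L2 part of that idea, with made-up names (this is not the actual Spark code, just an illustration):

// Rough sketch only -- not the actual Spark source. All names here
// (dataLoss, dataGradient, weights, effectiveL2RegParam) are made up
// for illustration.
object L2PenaltySketch {
  // Given the already-accumulated data loss 1/(2n) * ||A w - y||^2 and its
  // gradient, fold in the L2 penalty before handing both to the optimizer.
  def withL2Penalty(
      dataLoss: Double,
      dataGradient: Array[Double],
      weights: Array[Double],
      effectiveL2RegParam: Double): (Double, Array[Double]) = {
    // L2 adds 0.5 * lambda2 * ||w||^2 to the loss ...
    val l2Loss = 0.5 * effectiveL2RegParam * weights.map(w => w * w).sum
    // ... and lambda2 * w to the gradient, component-wise.
    val gradient = dataGradient.zip(weights).map { case (g, w) =>
      g + effectiveL2RegParam * w
    }
    (dataLoss + l2Loss, gradient)
  }
}

Doing it this way keeps the distributed pass over the data independent of the regularization settings; only the final step that builds the loss and gradient for the optimizer needs to know about the penalty.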
For L1, it's handled by OWLQN, so you don't see it explicitly, but the code is in line 128; a toy sketch of how OWLQN is typically driven follows at the end of this message.

Sincerely,

DB Tsai
----------------------------------------------------------
Blog: https://www.dbtsai.com
PGP Key ID: 0xAF08DF8D

On Tue, Jun 23, 2015 at 3:14 PM, Wei Zhou <zhweisop...@gmail.com> wrote:
> Hi DB Tsai,
>
> Thanks for your reply. I went through the source code of
> LinearRegression.scala. The algorithm minimizes the squared error
> L = 1/(2n) * ||A weights - y||^2. I cannot match this with the elastic-net
> loss function found here http://web.stanford.edu/~hastie/glmnet/glmnet_alpha.html,
> which is the sum of squared error plus the L1 and L2 penalties.
>
> I am able to follow the rest of the mathematical derivation in the code
> comments. I am hoping you could point me to any references that can fill
> this knowledge gap.
>
> Best,
> Wei
>
> 2015-06-19 12:35 GMT-07:00 DB Tsai <dbt...@dbtsai.com>:
>>
>> Hi Wei,
>>
>> I don't think ML is meant for single-node computation; the
>> algorithms in ML are designed for the pipeline framework.
>>
>> In short, the lasso regression in ML is a new algorithm implemented from
>> scratch; it's faster and converges to the same solution as R's
>> glmnet, but with scalability. Here is the talk I gave at Spark Summit
>> about the new elastic-net feature in ML. I encourage you to try
>> the one in ML.
>>
>> http://www.slideshare.net/dbtsai/2015-06-largescale-lasso-and-elasticnet-regularized-generalized-linear-models-at-spark-summit
>>
>> Sincerely,
>>
>> DB Tsai
>> ----------------------------------------------------------
>> Blog: https://www.dbtsai.com
>> PGP Key ID: 0xAF08DF8D
>>
>> On Fri, Jun 19, 2015 at 11:38 AM, Wei Zhou <zhweisop...@gmail.com> wrote:
>> > Hi Spark experts,
>> >
>> > I see lasso regression / elastic net implementations under both MLlib
>> > and ML; does anyone know what the difference between the two
>> > implementations is?
>> >
>> > At Spark Summit, one of the keynote speakers mentioned that ML is meant
>> > for single-node computation; could anyone elaborate on this?
>> >
>> > Thanks.
>> >
>> > Wei
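Here is the toy OWLQN sketch mentioned above (again, illustrative only, not the Spark source). The point is that only the smooth part of the objective (data loss plus L2) is exposed to the optimizer as a DiffFunction; the L1 weight is passed to OWLQN's constructor, and OWLQN applies the L1 penalty and the corresponding soft-thresholding internally, which is why the L1 term never shows up explicitly in the objective code. The constructor shape below roughly mirrors the one LinearRegression.scala uses (maxIter, history size, per-coordinate L1 weight, tolerance):

import breeze.linalg.{DenseVector => BDV}
import breeze.optimize.{DiffFunction, OWLQN}

// Toy illustration only -- not the Spark source. Minimize 0.5 * (w - 3)^2
// plus an L1 penalty of 0.5 * |w|; only the smooth part is given to the
// optimizer, the L1 weight goes into OWLQN's constructor.
object OwlqnSketch {
  def main(args: Array[String]): Unit = {
    val smoothLoss = new DiffFunction[BDV[Double]] {
      def calculate(w: BDV[Double]): (Double, BDV[Double]) = {
        val r = w(0) - 3.0
        (0.5 * r * r, BDV(r)) // value and gradient of the smooth part only
      }
    }

    val l1RegParam = 0.5
    // maxIter = 100, L-BFGS history size = 10, per-coordinate L1 weight, tolerance.
    val owlqn = new OWLQN[Int, BDV[Double]](100, 10, (_: Int) => l1RegParam, 1e-6)

    val wOpt = owlqn.minimize(smoothLoss, BDV(0.0))
    // The L1 penalty soft-thresholds the solution: expect roughly 3.0 - 0.5 = 2.5.
    println(s"solution = ${wOpt(0)}")
  }
}

Running this should print a solution close to 2.5, i.e., the unpenalized minimizer 3.0 shrunk by the L1 weight 0.5, even though the L1 term never appears in the DiffFunction itself.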