Hi DB Tsai,

Thanks for your reply. I went through the source code of LinearRegression.scala. The algorithm minimizes the squared error L = 1/(2n) ||A w - y||^2. I cannot match this with the elastic-net loss function described here, http://web.stanford.edu/~hastie/glmnet/glmnet_alpha.html, which is the sum of squared errors plus the L1 and L2 penalties.
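To make the comparison concrete, here is my reading of the glmnet objective from the vignette linked above, sketched in NumPy. The function name and formulation are mine, purely for illustration; this is not Spark's actual implementation:

```python
import numpy as np

# Sketch of the glmnet-style elastic-net objective (illustrative only):
#   L(w) = 1/(2n) * ||A w - y||^2
#          + lambda * ( alpha * ||w||_1 + (1 - alpha)/2 * ||w||_2^2 )
def elastic_net_loss(A, y, w, lam, alpha):
    n = A.shape[0]
    residual = A @ w - y
    sq_err = (residual @ residual) / (2.0 * n)  # averaged squared error
    l1 = np.abs(w).sum()                        # L1 penalty ||w||_1
    l2 = (w @ w) / 2.0                          # L2 penalty ||w||_2^2 / 2
    return sq_err + lam * (alpha * l1 + (1.0 - alpha) * l2)
```

With lambda = 0 this reduces to exactly the 1/(2n) squared-error term I see in LinearRegression.scala, so my question is really about where the penalty terms enter.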
I am able to follow the rest of the mathematical derivation in the code comments. I hope you can point me to any references that would fill this knowledge gap.

Best,
Wei

2015-06-19 12:35 GMT-07:00 DB Tsai <dbt...@dbtsai.com>:
> Hi Wei,
>
> I don't think ML is meant for single node computation, and the
> algorithms in ML are designed for the pipeline framework.
>
> In short, the lasso regression in ML is a new algorithm implemented from
> scratch; it's faster and converges to the same solution as R's glmnet,
> but with scalability. Here is the talk I gave at Spark Summit about the
> new elastic-net feature in ML. I encourage you to try the one in ML.
>
> http://www.slideshare.net/dbtsai/2015-06-largescale-lasso-and-elasticnet-regularized-generalized-linear-models-at-spark-summit
>
> Sincerely,
>
> DB Tsai
> ----------------------------------------------------------
> Blog: https://www.dbtsai.com
> PGP Key ID: 0xAF08DF8D
>
>
> On Fri, Jun 19, 2015 at 11:38 AM, Wei Zhou <zhweisop...@gmail.com> wrote:
> > Hi Spark experts,
> >
> > I see lasso regression / elastic net implementations under both MLlib
> > and ML; does anyone know what the difference between the two
> > implementations is?
> >
> > At Spark Summit, one of the keynote speakers mentioned that ML is
> > meant for single node computation; could anyone elaborate on this?
> >
> > Thanks.
> >
> > Wei