The regularization is applied after the data part of the objective
function is computed. See
https://github.com/apache/spark/blob/6a827d5d1ec520f129e42c3818fe7d0d870dcbef/mllib/src/main/scala/org/apache/spark/ml/regression/LinearRegression.scala
line 348 for the L2 term.

For L1, it's handled by the OWLQN optimizer, so you won't see it
explicitly in the objective, but the relevant code is at line 128.
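To make the decomposition concrete, here is a hedged, self-contained sketch (not Spark's actual implementation; the object name and the tiny dense-array setup are illustrative only) of the glmnet-style elastic-net objective that the squared-error term plugs into. The penalty is added on top of the data loss, which mirrors the split described above:

```scala
// Sketch of the elastic-net objective in glmnet's parameterization:
//   L(w) = 1/(2n) * ||A w - y||^2
//          + lambda * (alpha * ||w||_1 + (1 - alpha)/2 * ||w||_2^2)
// The first term is the data loss derived in the code comments; the
// penalty is added afterwards (L2 explicitly, L1 handled by OWLQN).
object ElasticNetLoss {
  def loss(a: Array[Array[Double]], y: Array[Double], w: Array[Double],
           lambda: Double, alpha: Double): Double = {
    val n = a.length
    // data term: 1/(2n) * sum_i (a_i . w - y_i)^2
    val sqErr = a.zip(y).map { case (row, yi) =>
      val r = row.zip(w).map { case (x, wi) => x * wi }.sum - yi
      r * r
    }.sum / (2.0 * n)
    // penalty: alpha-weighted L1 plus (1 - alpha)/2-weighted squared L2
    val l1 = w.map(math.abs).sum
    val l2 = w.map(wi => wi * wi).sum
    sqErr + lambda * (alpha * l1 + (1.0 - alpha) / 2.0 * l2)
  }
}
```

With alpha = 1 this reduces to the lasso, with alpha = 0 to ridge regression, matching the loss on the glmnet page referenced below.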

Sincerely,

DB Tsai
----------------------------------------------------------
Blog: https://www.dbtsai.com
PGP Key ID: 0xAF08DF8D


On Tue, Jun 23, 2015 at 3:14 PM, Wei Zhou <zhweisop...@gmail.com> wrote:
> Hi DB Tsai,
>
> Thanks for your reply. I went through the source code of
> LinearRegression.scala. The algorithm minimizes the squared error L = 1/2n
> ||A weights - y||^2. I cannot match this with the elastic-net loss function
> found here http://web.stanford.edu/~hastie/glmnet/glmnet_alpha.html, which
> is the sum of squared errors plus the L1 and L2 penalties.
>
> I am able to follow the rest of the mathematical derivation in the code
> comments. I am hoping you could point me to any references that can fill
> this knowledge gap.
>
> Best,
> Wei
>
>
>
> 2015-06-19 12:35 GMT-07:00 DB Tsai <dbt...@dbtsai.com>:
>>
>> Hi Wei,
>>
>> I don't think ML is meant for single-node computation; the
>> algorithms in ML are designed for the pipeline framework.
>>
>> In short, the lasso regression in ML is a new algorithm implemented from
>> scratch; it is faster and converges to the same solution as R's glmnet,
>> but with scalability. Here is the talk I gave at Spark Summit about the
>> new elastic-net feature in ML. I encourage you to try the ML one.
>>
>>
>> http://www.slideshare.net/dbtsai/2015-06-largescale-lasso-and-elasticnet-regularized-generalized-linear-models-at-spark-summit
>>
>> Sincerely,
>>
>> DB Tsai
>> ----------------------------------------------------------
>> Blog: https://www.dbtsai.com
>> PGP Key ID: 0xAF08DF8D
>>
>>
>> On Fri, Jun 19, 2015 at 11:38 AM, Wei Zhou <zhweisop...@gmail.com> wrote:
>> > Hi Spark experts,
>> >
>> > I see lasso regression / elastic-net implementations under both MLlib
>> > and ML. Does anyone know what the difference between the two
>> > implementations is?
>> >
>> > At Spark Summit, one of the keynote speakers mentioned that ML is meant
>> > for single-node computation. Could anyone elaborate on this?
>> >
>> > Thanks.
>> >
>> > Wei
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
