Ah yeah, I see.

Yes, it's right that many algorithms perform quite differently
depending on the kind of regularization. The same holds for cutting
plane algorithms, which reduce to either linear or quadratic programs
depending on whether the regularizer is L1 or L2. Generally speaking, I
think this is also not surprising, as L1 is not differentiable
everywhere, so you'd have to use different optimization methods.
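
To make the differentiability point a bit more concrete, here is a
rough sketch in plain Python. The function names are made up for
illustration and are not from any particular library. The L2 penalty
has an ordinary gradient everywhere, while the L1 penalty has a kink at
zero and is commonly handled with a proximal (soft-thresholding) step
instead, which is one possible choice among several.

```python
def l2_grad(w, lam):
    """Gradient of (lam / 2) * ||w||_2^2, which is defined everywhere."""
    return [lam * wi for wi in w]


def soft_threshold(x, t):
    """Shrink x towards zero by t, clamping to exactly zero inside [-t, t]."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0


def l1_prox(w, step, lam):
    """Proximal step for lam * ||w||_1 with step size `step`.

    Used in place of a gradient step because |w_i| has no gradient at 0.
    """
    t = step * lam
    return [soft_threshold(wi, t) for wi in w]
```

With this, an L2-regularized update is just a gradient step on the
penalty, whereas the L1 update sets small weights exactly to zero.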

So it probably makes sense to separate the regularization from the
cost function (which is then defined only by the model and the loss
function), and handle the regularization as an extra term.
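
Roughly what I have in mind, as a minimal sketch in plain Python. All
names here (`cost`, `objective`, `linear_model`, and so on) are
hypothetical and only meant to show the structure, not an actual API.

```python
def squared_loss(pred, y):
    """Loss for a single prediction."""
    return 0.5 * (pred - y) ** 2


def linear_model(w, x):
    """Model: plain dot product of weights and features."""
    return sum(wi * xi for wi, xi in zip(w, x))


def cost(w, data, model, loss):
    """Data-fit term only: average loss of the model over the data."""
    return sum(loss(model(w, x), y) for x, y in data) / len(data)


def l1(w):
    """L1 penalty: sum of absolute weights."""
    return sum(abs(wi) for wi in w)


def l2(w):
    """L2 penalty: half the squared Euclidean norm."""
    return 0.5 * sum(wi * wi for wi in w)


def objective(w, data, model, loss, regularizer, lam):
    """Full objective: cost plus a separately supplied regularizer."""
    return cost(w, data, model, loss) + lam * regularizer(w)
```

The optimizer can then look at the regularizer it was given and pick a
suitable strategy, while the cost function stays the same for both.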

-M

-- 
Mikio Braun - http://blog.mikiobraun.de, http://twitter.com/mikiobraun
