Hi Deb,

In which library or paper did you find this loss function used for an SVM?
But I would prefer the implementation in LIBLINEAR, which uses a coordinate
descent optimizer. Thanks.

A quick standalone sketch of what the smoothed loss and gradient would look
like is at the bottom of this mail.

On Sun, Dec 17, 2017 at 6:52 AM, Yanbo Liang <yblia...@gmail.com> wrote:
> Hello Deb,
>
> Optimizing a non-smooth function with LBFGS really needs to be considered
> carefully.
> Is there any literature showing that changing max to soft-max behaves well?
> I'm more than happy to see some benchmarks if you have them.
>
> + Yuhao, who made a similar effort in this PR:
> https://github.com/apache/spark/pull/17862
>
> Regards
> Yanbo
>
> On Dec 13, 2017, at 12:20 AM, Debasish Das <debasish.da...@gmail.com>
> wrote:
>
> Hi,
>
> I looked into the LinearSVC flow and found the gradient for hinge as
> follows:
>
> Our loss function with {0, 1} labels is max(0, 1 - (2y - 1) f_w(x)).
> The (sub)gradient is therefore -(2y - 1) x when 1 - (2y - 1) f_w(x) > 0,
> and 0 otherwise.
>
> max is a non-smooth function.
>
> Have we tried using a ReLU/soft-max function to smooth the hinge loss?
>
> The loss function would change to SoftMax(0, 1 - (2y - 1) f_w(x)).
>
> Since this function is smooth, the gradient is well defined everywhere and
> LBFGS/OWLQN should behave well.
>
> Please let me know if this has been tried already. If not, I can run some
> benchmarks.
>
> We already have soft-max in multinomial regression, and it can be reused
> for the LinearSVC flow.
>
> Thanks.
> Deb
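For concreteness, here is a rough standalone sketch (plain Scala, illustrative
names, not the actual LinearSVC internals) of the hinge subgradient and its
smoothed counterpart. Note that SoftMax(0, z) = log(exp(0) + exp(z)) =
log(1 + exp(z)), i.e. the softplus, so the smoothed gradient is just the hinge
direction scaled by a sigmoid:

  object SmoothedHingeSketch {
    // Plain dot product; a real implementation would use MLlib vectors.
    private def dot(a: Array[Double], b: Array[Double]): Double =
      a.indices.map(i => a(i) * b(i)).sum

    // Hinge loss with {0, 1} labels, as in the thread:
    //   loss = max(0, 1 - (2y - 1) * w.x)
    def hingeLoss(w: Array[Double], x: Array[Double], y: Double): Double =
      math.max(0.0, 1.0 - (2 * y - 1) * dot(w, x))

    // Subgradient: -(2y - 1) * x when the margin term is positive, else 0.
    def hingeGrad(w: Array[Double], x: Array[Double], y: Double): Array[Double] = {
      val s = 2 * y - 1                      // map {0, 1} label to {-1, +1}
      if (1.0 - s * dot(w, x) > 0) x.map(-s * _)
      else Array.fill(w.length)(0.0)
    }

    // SoftMax(0, z) = log(1 + exp(z)), i.e. softplus, computed stably.
    def softplus(z: Double): Double =
      math.max(z, 0.0) + math.log1p(math.exp(-math.abs(z)))

    // Smoothed loss: log(1 + exp(1 - (2y - 1) * w.x)).
    def smoothedHingeLoss(w: Array[Double], x: Array[Double], y: Double): Double =
      softplus(1.0 - (2 * y - 1) * dot(w, x))

    // Gradient: -(2y - 1) * sigmoid(margin) * x, defined everywhere,
    // so an LBFGS/OWLQN line search never hits a kink.
    def smoothedHingeGrad(w: Array[Double], x: Array[Double], y: Double): Array[Double] = {
      val s = 2 * y - 1
      val margin = 1.0 - s * dot(w, x)
      val sigma = 1.0 / (1.0 + math.exp(-margin))
      x.map(-s * sigma * _)
    }
  }

As the margin goes to +/- infinity the sigmoid goes to 1/0, so the smoothed
gradient recovers the hinge subgradient away from the kink; whether the
smoothing actually helps convergence is exactly what a benchmark would have
to show.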