Hi DB and Xiangrui,

L-BFGS is really useful. I think it would be great if we could refactor the
interface of runMiniBatchSGD so that new optimizers can be plugged in more
easily and in a compatible way. What do you think?
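
For example, something along these lines, as a rough sketch only (the trait
and method names below are placeholders for discussion, not the current
MLlib API):

import org.apache.spark.rdd.RDD

// Hypothetical common interface that both the existing mini-batch SGD
// and a new L-BFGS implementation could share.
trait Optimizer extends Serializable {
  /** Run the optimizer on (label, features) pairs and return the final weights. */
  def optimize(data: RDD[(Double, Array[Double])],
               initialWeights: Array[Double]): Array[Double]
}

// The training algorithms would then depend only on the trait:
//   val weights = optimizer.optimize(trainingData, initialWeights)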


2014-02-23 3:49 GMT+08:00 Xiangrui Meng <men...@gmail.com>:

> Hi DB,
>
> It is great to have the L-BFGS optimizer in MLlib, and thank you for
> taking care of the license issue. I looked through your PR briefly. It
> contains a Java translation of the L-BFGS implementation that is part of
> the RISO package. Would it be possible to ask its author to make a
> release on Maven Central, so that we could add it as a dependency
> instead of including the code directly?
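>
> Then pulling it in would just be a one-line build change, for example
> (the coordinates below are purely hypothetical until such a release
> exists):
>
>   // build.sbt - hypothetical groupId/artifactId/version, for illustration only
>   libraryDependencies += "riso" % "riso-numerical" % "x.y.z"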
>
> Best,
> Xiangrui
>
>
> On Sat, Feb 22, 2014 at 4:28 AM, DB Tsai <dbt...@alpinenow.com> wrote:
>
> > Hi guys,
> >
> > First of all, we would like to thank the whole Spark community for
> > building such a great platform for big data processing. We built
> > multinomial logistic regression with an L-BFGS optimizer in Spark.
> > L-BFGS is a limited-memory version of the quasi-Newton method, which
> > allows us to train on very high-dimensional data without computing
> > the Hessian matrix that Newton's method requires.
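> >
> > For intuition, the update looks roughly like this (notation loose,
> > just a sketch):
> >
> >   Newton:  w_{k+1} = w_k - H_k^{-1} \nabla f(w_k)   (needs the d x d Hessian H_k)
> >   L-BFGS:  w_{k+1} = w_k - alpha_k B_k \nabla f(w_k), where B_k approximates
> >            H_k^{-1} using only the last m pairs (s_i = w_{i+1} - w_i,
> >            y_i = \nabla f(w_{i+1}) - \nabla f(w_i)), so storage is
> >            O(m d) instead of O(d^2).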
> >
> > At the Strata Conference, we did a great demo using Spark with our
> > MLOR to train on the mnist8m dataset. We were able to train the model
> > in 5 minutes with 50 iterations and reach 86% accuracy. The first
> > iteration took 19.8s, and the remaining iterations took about 5~7s each.
> >
> > We did a comparison between L-BFGS and SGD, and we often saw 10x fewer
> > steps with L-BFGS, while the cost per step is the same (just computing
> > the gradient).
> >
> > The following is a paper from Prof. Ng's group at Stanford comparing
> > different optimizers, including L-BFGS and SGD. They use them in the
> > context of deep learning, but it is worth reading as a reference.
> >
> http://cs.stanford.edu/~jngiam/papers/LeNgiamCoatesLahiriProchnowNg2011.pdf
> >
> > We would like to break our MLOR-with-L-BFGS work into three patches to
> > contribute to the community.
> >
> > 1) L-BFGS optimizer - it can be used in logistic regression and linear
> > regression, or to replace any algorithm currently using SGD.
> > The core L-BFGS Java implementation we use underneath is from the RISO
> > project, and its author, Robert, was kind enough to relicense it under
> > a GPL and Apache 2 dual license.
> >
> > We're almost ready to submit a PR for L-BFGS; see our GitHub fork:
> > https://github.com/AlpineNow/incubator-spark/commits/dbtsai-LBFGS
> >
> > However, we don't use Updater in L-BFGS, since it is designed for
> > gradient descent; for L-BFGS we don't need stepSize, an adaptive
> > learning rate, etc. Since it seems difficult to fit the L-BFGS update
> > logic into the current framework (in the lbfgs library, the new
> > weights are returned given the old weights, loss, and gradient), I was
> > thinking of abstracting out the code that computes the gradient and
> > the regularization terms of the loss into a separate place, so that
> > other optimizers can also use it, as sketched below. Any suggestions
> > on this?
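> >
> > Something like the following is what I have in mind - just a rough
> > sketch, and all of the names here are placeholders rather than
> > existing MLlib classes:
> >
> > import org.apache.spark.rdd.RDD
> >
> > // One place that, given the current weights, returns the (loss, gradient)
> > // summed over the data plus an L2 regularization term, so that SGD-style
> > // and L-BFGS-style optimizers can both call it.
> > class CostFun(
> >     data: RDD[(Double, Array[Double])],
> >     // per-example (loss, gradient) for the chosen model, e.g. MLOR
> >     pointGradient: (Double, Array[Double], Array[Double]) => (Double, Array[Double]),
> >     regParam: Double) extends Serializable {
> >
> >   def calculate(weights: Array[Double]): (Double, Array[Double]) = {
> >     val localGradient = pointGradient  // avoid capturing `this` in the closure
> >     val (lossSum, gradSum) = data
> >       .map { case (label, features) => localGradient(label, features, weights) }
> >       .reduce { case ((l1, g1), (l2, g2)) =>
> >         (l1 + l2, g1.zip(g2).map { case (a, b) => a + b })
> >       }
> >     // L2 regularization added on the driver
> >     val regLoss = 0.5 * regParam * weights.map(w => w * w).sum
> >     val regGrad = weights.map(_ * regParam)
> >     (lossSum + regLoss, gradSum.zip(regGrad).map { case (a, b) => a + b })
> >   }
> > }
> >
> > Then the L-BFGS driver would only need to call costFun.calculate(w) in
> > its line search, and mini-batch SGD could reuse the same gradient piece
> > together with its Updater for the step.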
> >
> > 2) and 3): we will add the MLOR gradient to MLlib and add a few
> > examples. Finally, we will make a tweak to use mapPartitions instead
> > of map to further improve performance, roughly as sketched below.
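> >
> > The mapPartitions idea is roughly the following - again only a sketch,
> > with placeholder names:
> >
> > import org.apache.spark.rdd.RDD
> >
> > // Accumulate one gradient array per partition instead of emitting one
> > // small object per example, which cuts down on allocation and the
> > // amount of data the reduce has to combine.
> > def aggregateGradient(
> >     data: RDD[(Double, Array[Double])],
> >     weights: Array[Double],
> >     pointGradient: (Double, Array[Double], Array[Double]) => (Double, Array[Double]))
> >   : (Double, Array[Double]) = {
> >   val dim = weights.length
> >   val perPartition = data.mapPartitions { iter =>
> >     val gradAcc = new Array[Double](dim)
> >     var lossAcc = 0.0
> >     iter.foreach { case (label, features) =>
> >       val (loss, grad) = pointGradient(label, features, weights)
> >       lossAcc += loss
> >       var i = 0
> >       while (i < dim) { gradAcc(i) += grad(i); i += 1 }
> >     }
> >     Iterator((lossAcc, gradAcc))
> >   }
> >   perPartition.reduce { case ((l1, g1), (l2, g2)) =>
> >     var i = 0
> >     while (i < dim) { g1(i) += g2(i); i += 1 }
> >     (l1 + l2, g1)
> >   }
> > }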
> >
> > Thanks.
> >
> > Sincerely,
> >
> > DB Tsai
> > Machine Learning Engineer
> > Alpine Data Labs
> > --------------------------------------
> > Web: http://alpinenow.com/
> >
>



-- 
Best Regards
-----------------------------------
Xusen Yin    尹绪森
Beijing Key Laboratory of Intelligent Telecommunications Software and
Multimedia
Beijing University of Posts & Telecommunications
Intel Labs China
Homepage: http://yinxusen.github.io/
