Hi DB and Xiangrui,

L-BFGS is really useful. I think it would be cool if we could refactor the
interface of runMiniBatchSGD so that new optimizers can be plugged in more
easily while staying compatible with the existing API. What do you think of
that? A rough sketch of the kind of interface I have in mind follows.
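This is only a minimal, self-contained sketch in plain Scala (no Spark/RDD
types), and the names Optimizer, Gradient, LeastSquaresGradient, and
GradientDescent here are illustrative assumptions rather than the actual
MLlib API. The point is just to show the gradient-plus-loss computation
factored out so that both a GD-style and an L-BFGS-style optimizer could
share it:

    object OptimizerSketch {

      type Weights = Array[Double]

      // Shared piece: given the current weights, produce (gradient, loss)
      // over the data. Both SGD-style and L-BFGS-style optimizers could
      // reuse this, with any regularization terms folded into the result.
      trait Gradient {
        def compute(data: Seq[(Double, Array[Double])], weights: Weights): (Weights, Double)
      }

      // The pluggable optimizer interface: an optimizer only needs the data,
      // a Gradient, and initial weights; step sizes, learning rates, etc.
      // stay inside each implementation.
      trait Optimizer {
        def optimize(data: Seq[(Double, Array[Double])],
                     gradient: Gradient,
                     initialWeights: Weights): Weights
      }

      // Least-squares gradient as one concrete example.
      class LeastSquaresGradient extends Gradient {
        def compute(data: Seq[(Double, Array[Double])], w: Weights): (Weights, Double) = {
          val grad = new Array[Double](w.length)
          var loss = 0.0
          data.foreach { case (y, x) =>
            val err = x.zip(w).map { case (xi, wi) => xi * wi }.sum - y
            loss += 0.5 * err * err
            for (i <- w.indices) grad(i) += err * x(i)
          }
          (grad.map(_ / data.size), loss / data.size)
        }
      }

      // Plain gradient descent as one implementation of the interface.
      // An L-BFGS optimizer would implement the same trait, handing the
      // (loss, gradient) pair to the underlying L-BFGS routine each iteration.
      class GradientDescent(stepSize: Double, iterations: Int) extends Optimizer {
        def optimize(data: Seq[(Double, Array[Double])],
                     gradient: Gradient,
                     initialWeights: Weights): Weights = {
          var w = initialWeights.clone()
          for (_ <- 1 to iterations) {
            val (grad, _) = gradient.compute(data, w)
            w = w.zip(grad).map { case (wi, gi) => wi - stepSize * gi }
          }
          w
        }
      }

      def main(args: Array[String]): Unit = {
        // Tiny usage example: fit y = 2 * x.
        val data = Seq((2.0, Array(1.0)), (4.0, Array(2.0)), (6.0, Array(3.0)))
        val w = new GradientDescent(stepSize = 0.1, iterations = 200)
          .optimize(data, new LeastSquaresGradient, Array(0.0))
        println(s"learned weight: ${w(0)}")  // should be close to 2.0
      }
    }

With something along these lines, runMiniBatchSGD would just be one
Optimizer implementation among others, and L-BFGS could plug in without
needing the Updater/stepSize machinery.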
2014-02-23 3:49 GMT+08:00 Xiangrui Meng <men...@gmail.com>:

> Hi DB,
>
> It is great to have the L-BFGS optimizer in MLlib, and thank you for taking
> care of the license issue. I looked through your PR briefly. It contains a
> Java translation of the L-BFGS implementation, which is part of the RISO
> package. Is it possible that we ask its author to make a release on Maven
> Central and then add it as a dependency instead of including the code
> directly?
>
> Best,
> Xiangrui
>
>
> On Sat, Feb 22, 2014 at 4:28 AM, DB Tsai <dbt...@alpinenow.com> wrote:
>
> > Hi guys,
> >
> > First of all, we would like to thank the Spark community for building
> > such a great platform for big data processing. We built multinomial
> > logistic regression with an L-BFGS optimizer in Spark. L-BFGS is a
> > limited-memory quasi-Newton method, which allows us to train on very
> > high-dimensional data without computing the Hessian matrix that
> > Newton's method requires.
> >
> > At the Strata Conference, we gave a demo using Spark with our MLOR to
> > train the mnist8m dataset. We were able to train the model in 5 minutes
> > with 50 iterations and reach 86% accuracy. The first iteration takes
> > 19.8s, and the remaining iterations take about 5~7s.
> >
> > We compared L-BFGS and SGD, and we often saw about 10x fewer steps with
> > L-BFGS while the cost per step is the same (just computing the gradient).
> >
> > The following is a paper by Prof. Ng's group at Stanford comparing
> > different optimizers, including L-BFGS and SGD. They use them in the
> > context of deep learning, but it is worth reading as a reference.
> > http://cs.stanford.edu/~jngiam/papers/LeNgiamCoatesLahiriProchnowNg2011.pdf
> >
> > We would like to break our MLOR with L-BFGS into three patches to
> > contribute to the community.
> >
> > 1) L-BFGS optimizer - which can be used in logistic regression, linear
> > regression, or to replace any algorithm currently using SGD.
> > The core L-BFGS Java implementation we use is from the RISO project,
> > and its author, Robert, was kind enough to relicense it under a GPL and
> > Apache 2 dual license.
> >
> > We're almost ready to submit a PR for L-BFGS; see our GitHub fork:
> > https://github.com/AlpineNow/incubator-spark/commits/dbtsai-LBFGS
> >
> > However, we don't use Updater in L-BFGS since it is designed for GD,
> > and for L-BFGS we don't need stepSize, an adaptive learning rate, etc.
> > Since it seems difficult to fit the Updater logic into L-BFGS (in the
> > L-BFGS library, the new weights are returned given the old weights,
> > loss, and gradient), I was thinking of abstracting the code that
> > computes the gradient and the regularization terms of the loss into a
> > separate place so that other optimizers can also use it. Any
> > suggestions about this?
> >
> > 2) and 3): we will add the MLOR gradient to MLlib and add a few
> > examples. Finally, we will do some tweaking using mapPartitions
> > instead of map to further improve performance.
> >
> > Thanks.
> >
> > Sincerely,
> >
> > DB Tsai
> > Machine Learning Engineer
> > Alpine Data Labs
> > --------------------------------------
> > Web: http://alpinenow.com/
> >

--
Best Regards
-----------------------------------
Xusen Yin 尹绪森
Beijing Key Laboratory of Intelligent Telecommunications Software and Multimedia
Beijing University of Posts & Telecommunications
Intel Labs China
Homepage: http://yinxusen.github.io/