Hi Yinxusen,

I agree. I found a hack to use the current Updater.scala (it contains not
only the regularization logic but also the adaptive learning rate for SGD,
which other optimizers generally will not use), but I think we still need
a better design to deal with it.
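For example, the separation I have in mind could look roughly like this
(a sketch only; none of these names exist in MLlib, and squared L2 is just
the example penalty):

// Hypothetical: pulls the penalty term's loss/gradient out of Updater so
// that optimizers other than SGD can reuse it without the step-size logic.
trait Regularizer extends Serializable {
  // Loss contribution of the penalty term for the given weights.
  def loss(weights: Array[Double]): Double
  // Gradient contribution of the penalty term for the given weights.
  def gradient(weights: Array[Double]): Array[Double]
}

// The usual squared-L2 penalty: loss 0.5 * regParam * ||w||^2,
// gradient regParam * w.
class L2Regularizer(regParam: Double) extends Regularizer {
  def loss(weights: Array[Double]): Double =
    0.5 * regParam * weights.map(w => w * w).sum
  def gradient(weights: Array[Double]): Array[Double] =
    weights.map(_ * regParam)
}

With something like this, L-BFGS can simply add the regularizer's loss and
gradient to the data terms it feeds to the line search, while the SGD
Updater keeps the stepSize and adaptive-learning-rate logic to itself.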
Sincerely,

DB Tsai
Machine Learning Engineer
Alpine Data Labs
--------------------------------------
Web: http://alpinenow.com/

On Mon, Feb 24, 2014 at 6:46 PM, 尹绪森 <yinxu...@gmail.com> wrote:
> Hi DB and Xiangrui,
>
> L-BFGS is really useful. I think it would be cool if we could refactor
> the interface of runMiniBatchSGD so that new optimizers can be plugged
> in more easily and kept compatible. What do you think?
>
>
> 2014-02-23 3:49 GMT+08:00 Xiangrui Meng <men...@gmail.com>:
>
>> Hi DB,
>>
>> It is great to have the L-BFGS optimizer in MLlib, and thank you for
>> taking care of the license issue. I looked through your PR briefly. It
>> contains a Java translation of the L-BFGS implementation, which is part
>> of the RISO package. Is it possible to ask its author to make a release
>> on Maven Central so that we can add it as a dependency instead of
>> including the code directly?
>>
>> Best,
>> Xiangrui
>>
>>
>> On Sat, Feb 22, 2014 at 4:28 AM, DB Tsai <dbt...@alpinenow.com> wrote:
>>
>> > Hi guys,
>> >
>> > First of all, we would like to thank the whole Spark community for
>> > building such a great platform for big data processing. We built
>> > multinomial logistic regression with an L-BFGS optimizer in Spark.
>> > L-BFGS is a limited-memory quasi-Newton method, which allows us to
>> > train on very high-dimensional data without computing the Hessian
>> > matrix that Newton's method requires.
>> >
>> > At the Strata Conference, we gave a demo using Spark with our MLOR
>> > to train the mnist8m dataset. We were able to train the model in 5
>> > minutes with 50 iterations and reach 86% accuracy. The first
>> > iteration takes 19.8s, and the remaining iterations take about 5~7s
>> > each.
>> >
>> > We compared L-BFGS with SGD, and we often saw 10x fewer steps with
>> > L-BFGS, while the cost per step is the same (just computing the
>> > gradient).
>> >
>> > The following paper from Prof. Ng's group at Stanford compares
>> > different optimizers, including L-BFGS and SGD. They use them in the
>> > context of deep learning, but it is worth reading as a reference.
>> >
>> > http://cs.stanford.edu/~jngiam/papers/LeNgiamCoatesLahiriProchnowNg2011.pdf
>> >
>> > We would like to break our MLOR with L-BFGS into three patches to
>> > contribute to the community.
>> >
>> > 1) L-BFGS optimizer, which can be used in logistic regression and
>> > linear regression, or to replace SGD in any algorithm that uses it.
>> > The core L-BFGS Java implementation we use is from the RISO project,
>> > and its author, Robert, was kind enough to relicense it under a GPL
>> > and Apache 2 dual license.
>> >
>> > We're almost ready to submit a PR for L-BFGS; see our github fork,
>> > https://github.com/AlpineNow/incubator-spark/commits/dbtsai-LBFGS
>> >
>> > However, we don't use Updater in L-BFGS, since it is designed for
>> > GD; for L-BFGS we don't need stepSize, an adaptive learning rate,
>> > etc. Since it seems difficult to fit L-BFGS into the current Updater
>> > framework (in the lbfgs library, the new weights are returned given
>> > the old weights, the loss, and the gradient), I was thinking of
>> > abstracting the code that computes the gradient and loss terms of
>> > the regularization out into a separate place, so that different
>> > optimizers can also use it. Any suggestions about this?
>> >
>> > 2) and 3): we will add the MLOR gradient to MLlib and add a few
>> > examples. Finally, we will do some tweaking, using mapPartitions
>> > instead of map, to further improve performance.
>> >
>> > Thanks.
>> >
>> > Sincerely,
>> >
>> > DB Tsai
>> > Machine Learning Engineer
>> > Alpine Data Labs
>> > --------------------------------------
>> > Web: http://alpinenow.com/
>
>
> --
> Best Regards
> -----------------------------------
> Xusen Yin 尹绪森
> Beijing Key Laboratory of Intelligent Telecommunications Software and
> Multimedia
> Beijing University of Posts & Telecommunications
> Intel Labs China
> Homepage: http://yinxusen.github.io/
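P.S. To make Yinxusen's suggestion above concrete, the refactoring could
start from something as small as this (again only a sketch; none of these
signatures exist in MLlib today):

import org.apache.spark.rdd.RDD

// Hypothetical: a minimal optimizer contract, so that runMiniBatchSGD
// becomes one implementation among others rather than the interface itself.
trait Optimizer extends Serializable {
  // Returns the final weights given (label, features) pairs and a starting
  // point. How the data is traversed (mini-batches for SGD, full gradients
  // for L-BFGS) is left to each implementation.
  def optimize(
      data: RDD[(Double, Array[Double])],
      initialWeights: Array[Double]): Array[Double]
}

A GradientDescent implementation would wrap the existing runMiniBatchSGD
logic, and an LBFGS implementation would wrap the RISO code, so algorithms
like logistic regression only need to be handed an Optimizer.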