Re: Contributing to MLlib on GLM

2014-07-07 Thread Gang Bai
Poisson and Gamma regressions for modeling count data are definitely important in spark.mllib.regression. So don’t worry. Let’s change the updater to SquaredL2Updater as we discussed in the PR. Then we can ask Jenkins to run the test. On Jul 8, 2014, at 3:00 AM, xwei wrote: > Hi Gang, > > No

Re: Contributing to MLlib on GLM

2014-07-07 Thread xwei
Hi Gang, No admin is looking at our patch :( Do you have any suggestions for getting our patch noticed by an admin? Best regards, Xiaokai On Mon, Jun 30, 2014 at 8:18 PM, Gang Bai [via Apache Spark Developers List] wrote: > Thanks Xiaokai, > > I’ve created a pull request to merge featur

Re: Contributing to MLlib on GLM

2014-06-30 Thread Gang Bai
Thanks Xiaokai, I’ve created a pull request to merge features in my PR to your repo. Please review it here: https://github.com/xwei-datageek/spark/pull/2 . As for GLMs, here at Sina, we are solving the problem of predicting the number of visitors who read a particular news article or watch an o

Re: Contributing to MLlib on GLM

2014-06-28 Thread xwei
Hi Gang, No worries! I agree LBFGS would converge faster and your test suite is more comprehensive. I'd like to merge my branch with yours. I also agree with your viewpoint on the redundancy issue. For different GLMs, usually they only differ in gradient calculation but the regression.sca

Re: Contributing to MLlib on GLM

2014-06-27 Thread 白刚
Hi Xiaokai, My bad. I didn't notice this before I created another PR for Poisson regression. The mails were buried in junk by the corp mail master. Also, thanks for considering my comments and advice in your PR. Adding my two cents here: * PoissonRegressionModel and GammaRegressionModel have t

Re: Contributing to MLlib on GLM

2014-06-26 Thread xwei
Yes, that's what we did: adding two gradient functions to Gradient.scala and creating PoissonRegression and GammaRegression using these gradients. We made a PR on this. -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Contributing-to-MLlib-on-GLM-tp7033p7

Re: Contributing to MLlib on GLM

2014-06-25 Thread Sung Hwan Chung
Well, as you said, MLlib already supports GLMs in a sense, except that it only supports two link functions: identity (linear regression) and logit (logistic regression). It should not be too hard to add other link functions, as all you have to do is add a different gradient function for Poisson/Gamma,
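
To illustrate the point about gradient functions, here is a minimal sketch in plain Python (not MLlib's Scala API) of the per-example gradient for Poisson regression with a log link. The function name poisson_gradient and its signature are hypothetical, chosen only for this example, and the loss omits the constant log(y!) term since it does not depend on the weights.

```python
import math


def poisson_gradient(weights, features, label):
    """Per-example gradient and loss for Poisson regression with a log link.

    Hypothetical helper for illustration only; it is not part of MLlib.
    The model assumes mean = exp(w . x), so the negative log-likelihood
    (dropping the constant log(label!) term) is exp(w . x) - label * (w . x).
    """
    # Linear predictor w . x
    margin = sum(w * x for w, x in zip(weights, features))
    # Predicted mean under the log link
    mean = math.exp(margin)
    # Negative log-likelihood without the weight-independent constant
    loss = mean - label * margin
    # d(loss)/dw = (mean - label) * x
    gradient = [(mean - label) * x for x in features]
    return gradient, loss


# Usage: with zero weights the predicted mean is exp(0) = 1
grad, loss = poisson_gradient([0.0, 0.0], [1.0, 2.0], 3.0)
print(grad, loss)
```

Compared with least squares, where the gradient is (w . x - y) * x, only the mean function changes here, which matches the observation in the thread that different GLMs mostly differ in the gradient calculation.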

Re: Contributing to MLlib on GLM

2014-06-17 Thread Andrew Ash
Hi Xiaokai, Also take a look through Xiangrui's slides from HadoopSummit a few weeks back: http://www.slideshare.net/xrmeng/m-llib-hadoopsummit The roadmap starting at slide 51 will probably be interesting to you. Andrew On Tue, Jun 17, 2014 at 7:37 PM, Sandy Ryza wrote: > Hi Xiaokai, > > I

Re: Contributing to MLlib on GLM

2014-06-17 Thread Sandy Ryza
Hi Xiaokai, I think MLlib is definitely interested in supporting additional GLMs. I'm not aware of anybody working on this at the moment. -Sandy On Tue, Jun 17, 2014 at 5:00 PM, Xiaokai Wei wrote: > Hi, > > I am an intern at PalantirTech and we are building some stuff on top of > MLlib. In P