Hi Deb,

For K possible outcomes in multinomial logistic regression, we can train K-1
independent binary logistic regression models, in which one outcome is chosen
as a "pivot" and the other K-1 outcomes are separately regressed against the
pivot outcome. See my presentation for the technical details:
http://www.slideshare.net/dbtsai/2014-0501-mlor
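Concretely, with outcome 0 chosen as the pivot, and with w_k and b_k denoting
the weight vector and intercept of class k, the model probabilities are

    P(y = 0 | x) = 1 / (1 + sum_{k=1..K-1} exp(w_k . x + b_k))
    P(y = k | x) = exp(w_k . x + b_k) / (1 + sum_{j=1..K-1} exp(w_j . x + b_j)),
                   for k = 1..K-1

so the pivot class carries an implicit numerator of 1, which is why the
denominator in the code below starts at 1.0.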
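To train it, an L-BFGS solver only needs the total loss and gradient over the
data. Here is a rough driver sketch that sums the per-point contributions
using the computeGradient method given further down (the names and the local
Array layout are only for illustration; in MLlib you would aggregate over an
RDD instead):

// Hypothetical driver: accumulates log-likelihood and gradient over a small
// local dataset and returns the totals in the form a minimizer expects.
def totalLossAndGradient(
    data: Array[(Double, Array[Double])], // (label, features); labels in 0 .. K-1
    w: Array[Array[Double]],              // K-1 weight vectors
    b: Array[Double]                      // K-1 intercepts
  ): (Double, Array[Double]) = {
  val gradient = Array.ofDim[Double](b.length * (data(0)._2.length + 1))
  var logLikelihood = 0.0
  data.foreach { case (label, features) =>
    val (loglike, _) = computeGradient(label, features, 0.0, w, b, gradient)
    logLikelihood += loglike
  }
  // computeGradient accumulates the gradient of the log-likelihood (an ascent
  // direction), so negate both totals for a minimizer like L-BFGS.
  (-logLikelihood, gradient.map(-_))
}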
Since MLlib only supports one linear model per classification model, there
will be some infrastructure work to support MLOR in MLlib. But if you want to
implement it yourself with the L-BFGS solver in MLlib, you can follow the
equations in my slides, and it will not be too difficult. I can give you the
gradient method for multinomial logistic regression; you just need to put the
K-1 intercepts in the right place.

// Accumulates the gradient of the log-likelihood for a single data point
// into `gradient`, and returns the point's log-likelihood together with the
// predicted class. Class 0 is the pivot. The gradient layout is (classes - 1)
// blocks of (x.length + 1) values, each block starting with the intercept.
def computeGradient(
    y: Double,
    x: Array[Double],
    lambda: Double, // note: unused here; add your regularization separately
    w: Array[Array[Double]],
    b: Array[Double],
    gradient: Array[Double]): (Double, Int) = {
  val classes = b.length + 1
  val yy = y.toInt

  def alpha(i: Int): Int = if (i == 0) 1 else 0
  def delta(i: Int, j: Int): Int = if (i == j) 1 else 0

  var denominator: Double = 1.0
  val numerators: Array[Double] = Array.ofDim[Double](b.length)
  var predicted = 1

  {
    var i = 0
    var j = 0
    var acc: Double = 0
    while (i < b.length) {
      acc = b(i)
      j = 0
      while (j < x.length) {
        acc += x(j) * w(i)(j)
        j += 1
      }
      numerators(i) = math.exp(acc)
      if (i > 0 && numerators(i) > numerators(predicted - 1)) {
        predicted = i + 1
      }
      denominator += numerators(i)
      i += 1
    }
    // The pivot class wins when every other numerator is below its implicit
    // numerator of 1.
    if (numerators(predicted - 1) < 1) {
      predicted = 0
    }
  }

  {
    // gradient has dim of (classes - 1) * (x.length + 1)
    var i = 0
    var m1: Int = 0
    var l1: Int = 0
    while (i < (classes - 1) * (x.length + 1)) {
      m1 = i % (x.length + 1)        // m1 == 0 is the intercept
      l1 = (i - m1) / (x.length + 1) // l1 + 1 is the class
      if (m1 == 0) {
        gradient(i) += (1 - alpha(yy)) * delta(yy, l1 + 1) -
          numerators(l1) / denominator
      } else {
        gradient(i) += ((1 - alpha(yy)) * delta(yy, l1 + 1) -
          numerators(l1) / denominator) * x(m1 - 1)
      }
      i += 1
    }
  }

  val loglike: Double = math.round(y).toInt match {
    case 0 => math.log(1.0 / denominator)
    case _ => math.log(numerators(math.round(y - 1).toInt) / denominator)
  }

  (loglike, predicted)
}

Sincerely,

DB Tsai
-------------------------------------------------------
My Blog: https://www.dbtsai.com
LinkedIn: https://www.linkedin.com/in/dbtsai


On Tue, May 13, 2014 at 4:08 AM, Debasish Das <debasish.da...@gmail.com> wrote:

> Hi,
>
> Is there a PR for multinomial logistic regression which does one-vs-all
> and compares it to the other possibilities?
>
> @dbtsai, in your Strata presentation you used one-vs-all? Did you add some
> constraints on the fact that you penalize if mispredicted labels are not
> very far from the true label?
>
> Thanks.
> Deb