Re: Problem in running MLlib SVM

2015-12-01 Thread Robert Dodier
Tarek, On looking at the code in SVM.scala, I see that SVMWithSGD.predictPoint first computes dot(w, x) + b where w is the SVM weight vector, x is the input vector, and b is a constant. If there is a threshold defined, then the output is 1 if that's greater than the threshold and 0 otherwise. If t
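
The prediction rule described above can be sketched in plain Java. This is a minimal illustration, not the Spark source: the class and method names are hypothetical, and a `null` threshold stands in for "no threshold set". The no-threshold branch returning the raw margin is an assumption based on how MLlib's `SVMModel` behaves after `clearThreshold()`, as far as I know.

```java
// Sketch only: mirrors the logic described in the thread, not the Spark source.
// SvmPredictSketch, margin and predict are hypothetical names.
final class SvmPredictSketch {

    /** Raw margin: dot(w, x) + b. */
    static double margin(double[] w, double[] x, double b) {
        if (w.length != x.length) {
            throw new IllegalArgumentException("w and x must have the same length");
        }
        double sum = b;
        for (int i = 0; i < w.length; i++) {
            sum += w[i] * x[i];
        }
        return sum;
    }

    /**
     * With a threshold: 1.0 if the margin is greater than the threshold, else 0.0.
     * With threshold == null (assumed to model "no threshold set"): the raw margin.
     */
    static double predict(double[] w, double[] x, double b, Double threshold) {
        double m = margin(w, x, b);
        if (threshold == null) {
            return m;
        }
        return m > threshold ? 1.0 : 0.0;
    }
}
```

Under this sketch, the same model yields either a 0/1 label or an unbounded real-valued score depending only on whether a threshold is set, which is why the meaning of "score" matters in the rest of the thread.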

Re: Problem in running MLlib SVM

2015-12-01 Thread Joseph Bradley
Oh, sorry about that. I forgot that's the behavior when the threshold is not set. My guess would be that you need more iterations, or that the regParam needs to be tuned. I'd recommend testing on some of the LibSVM datasets. They have a lot, and you can find existing examples (and results) for

Re: Problem in running MLlib SVM

2015-11-30 Thread Joseph Bradley
model.predict should return a 0/1 predicted label. The example code is misleading when it calls the prediction a "score." On Mon, Nov 30, 2015 at 9:13 AM, Fazlan Nazeem wrote: > You should never use the training data to measure your prediction > accuracy. Always use a fresh dataset (test data)

Re: Problem in running MLlib SVM

2015-11-30 Thread Fazlan Nazeem
You should never use the training data to measure your prediction accuracy. Always use a fresh dataset (test data) for this purpose. On Sun, Nov 29, 2015 at 8:36 AM, Jeff Zhang wrote: > I think this should represent the label of LabeledPoint (0 means negative, 1 > means positive) > http://spark.ap
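
One way to follow the advice above is to hold out part of the data before training. The sketch below is a plain-Java illustration with hypothetical names (`HoldoutSplitSketch`, `split`); in a Spark job, `randomSplit` on the RDD serves the same purpose.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Sketch only: a seeded shuffle-and-cut holdout split (hypothetical helper).
final class HoldoutSplitSketch {

    /** Returns [train, test]; trainFraction must lie strictly between 0 and 1. */
    static <T> List<List<T>> split(List<T> data, double trainFraction, long seed) {
        if (trainFraction <= 0.0 || trainFraction >= 1.0) {
            throw new IllegalArgumentException("trainFraction must be in (0, 1)");
        }
        // Shuffle a copy so the caller's list is left untouched.
        List<T> shuffled = new ArrayList<>(data);
        Collections.shuffle(shuffled, new Random(seed));

        int cut = (int) Math.round(trainFraction * shuffled.size());
        List<T> train = new ArrayList<>(shuffled.subList(0, cut));
        List<T> test = new ArrayList<>(shuffled.subList(cut, shuffled.size()));
        return Arrays.asList(train, test);
    }
}
```

The model is then fitted on the first list only, and accuracy is measured on the second.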

Re: Problem in running MLlib SVM

2015-11-28 Thread Jeff Zhang
I think this should represent the label of LabeledPoint (0 means negative, 1 means positive) http://spark.apache.org/docs/latest/mllib-data-types.html#labeled-point The document you mention is for the mathematical formula, not the implementation. On Sun, Nov 29, 2015 at 9:13 AM, Tarek Elgamal wrot

Re: Problem in running MLlib SVM

2015-11-28 Thread Tarek Elgamal
According to the documentation, by default, if w^T x ≥ 0 then the outcome is positive, and negative otherwise. I suppose that w^T x is the "score" in my case. If score is more than 0 and the label is positive, then I return 1 which is correc

Re: Problem in running MLlib SVM

2015-11-28 Thread Jeff Zhang
if ((score >= 0 && label == 1) || (score < 0 && label == 0)) {
    return 1; // correct classification
} else
    return 0;

I suspect score is always between 0 and 1. On Sat, Nov 28, 2015 at 10:39 AM, Tarek Elgamal wrote: > Hi, >
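
To illustrate why the sign-based check quoted above can under-count correct predictions: if the predicted value is already a 0/1 label rather than a raw margin, it is never negative, so the `score < 0` branch never fires and every correctly predicted negative example is scored as wrong. The sketch below (hypothetical class and method names, plain Java) contrasts that check with a direct label comparison.

```java
// Sketch only: AccuracySketch and its methods are hypothetical names.
final class AccuracySketch {

    /** Sign-based check from the thread; meaningful only if score is a raw margin. */
    static boolean correctByRawScore(double score, double label) {
        return (score >= 0 && label == 1.0) || (score < 0 && label == 0.0);
    }

    /** Direct comparison; appropriate when the prediction is already a 0/1 label. */
    static boolean correctByLabel(double prediction, double label) {
        return prediction == label;
    }

    /** Fraction of 0/1 predictions that match their labels. */
    static double accuracy(double[] predictions, double[] labels) {
        if (predictions.length != labels.length || predictions.length == 0) {
            throw new IllegalArgumentException("arrays must be non-empty and equal in length");
        }
        int correct = 0;
        for (int i = 0; i < predictions.length; i++) {
            if (correctByLabel(predictions[i], labels[i])) {
                correct++;
            }
        }
        return (double) correct / predictions.length;
    }
}
```

For a predicted label of 0 and a true label of 0, `correctByRawScore` returns false while `correctByLabel` returns true, which is the discrepancy at issue.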