Currently, SVMs don't have built-in multiclass support. Logistic Regression supports multiclass, as do trees and random forests. It would be great to add multiclass support for SVMs as well.
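In the meantime, a naive one-vs-rest reduction can be assembled on top of the existing binary SVMWithSGD API. A rough sketch follows (the helper names and the relabel/argmax wiring are mine, not MLlib APIs):

import org.apache.spark.mllib.classification.{SVMModel, SVMWithSGD}
import org.apache.spark.mllib.linalg.Vector
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.rdd.RDD

// One binary SVM per class: class k's points are relabeled 1.0, the rest 0.0.
def trainOneVsRest(data: RDD[LabeledPoint],
                   numClasses: Int,
                   numIterations: Int = 100): Seq[SVMModel] =
  (0 until numClasses).map { k =>
    val binary = data.map(p =>
      LabeledPoint(if (p.label == k) 1.0 else 0.0, p.features)).cache()
    val model = SVMWithSGD.train(binary, numIterations)
    binary.unpersist()
    model.clearThreshold() // predict() now returns the raw margin, not 0.0/1.0
  }

// Assign the class whose model reports the largest raw margin.
def predictOneVsRest(models: Seq[SVMModel], features: Vector): Int =
  models.map(_.predict(features)).zipWithIndex.maxBy(_._1)._2

Training cost grows linearly with the number of classes, since each class gets its own pass over the relabeled data, and comparing uncalibrated margins across models is exactly where calibration becomes important.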
There is some ongoing work on generic multiclass-to-binary reductions:
https://issues.apache.org/jira/browse/SPARK-7015

I agree that naive one-vs-all reductions might not work that well, but the
raw scores could be calibrated using the scaling you mentioned, or other
methods.

Joseph

On Mon, May 4, 2015 at 6:29 AM, Driesprong, Fokko <fo...@driesprong.frl>
wrote:

> Hi Robert,
>
> I would say that taking the sign of the number gives the class of the
> input vector. What kind of data are you using, and what kind of training
> set do you use? Fundamentally, an SVM can only separate two classes; you
> can do one-vs-the-rest, as you mentioned.
>
> I don't see how LVQ can benefit the SVM classifier. I would say that this
> is more an SVM problem than a Spark one.
>
> 2015-05-04 15:22 GMT+02:00 Robert Musters <robert.must...@openindex.io>:
>
>> Hi all,
>>
>> I am trying to understand the output of the SVM classifier.
>>
>> Right now, my output looks like this:
>>
>> -18.841544889249917 0.0
>> 168.32916035523283 1.0
>> 420.67763915879794 1.0
>> -974.1942589201286 0.0
>> 71.73602841256813 1.0
>> 233.13636224524993 1.0
>> -1000.5902168199027 0.0
>>
>> The documentation is unclear about what these numbers mean
>> <https://spark.apache.org/docs/0.9.2/api/mllib/index.html#org.apache.spark.mllib.regression.LabeledPoint>.
>> I think it is the signed distance to the hyperplane.
>>
>> My main question is: how can I convert distances from hyperplanes to
>> probabilities in a multi-class one-vs-all approach?
>>
>> LIBSVM <http://www.csie.ntu.edu.tw/~cjlin/libsvm/> has this functionality
>> and refers to the process of obtaining the probabilities as "Platt scaling"
>> <http://www.researchgate.net/profile/John_Platt/publication/2594015_Probabilistic_Outputs_for_Support_Vector_Machines_and_Comparisons_to_Regularized_Likelihood_Methods/links/004635154cff5262d6000000.pdf>.
>>
>> I think this functionality should be in MLlib, but I can't find it.
>> Do you think Platt scaling makes sense?
>>
>> Forming clusters with Learning Vector Quantization, modelling the spread
>> of each cluster with a Gaussian, and then reading off the probability
>> makes a lot more sense, in my opinion. Taking the distances to the
>> hyperplanes of several SVM classifiers and then trying to derive a
>> probability from those distances does not make sense, because the
>> distribution of the data points belonging to a cluster is not taken into
>> account.
>> Does anyone see a fallacy in my reasoning?
>>
>> With kind regards,
>>
>> Robert
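To make the calibration point concrete: Platt scaling fits a sigmoid P(y = 1 | f) = 1 / (1 + exp(A*f + B)) to the raw margins f on held-out data. MLlib has no built-in routine for this, but the fit can be approximated by running MLlib's own logistic regression on the one-dimensional margin, which learns the same sigmoid family. This is a sketch under that assumption (the helper names are illustrative, and it skips the label smoothing from Platt's paper):

import org.apache.spark.mllib.classification.{LogisticRegressionModel,
  LogisticRegressionWithLBFGS, SVMModel}
import org.apache.spark.mllib.linalg.{Vector, Vectors}
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.rdd.RDD

// Fit the sigmoid on held-out data: the SVM's raw margin becomes the single
// input feature of a logistic regression, whose weight and intercept play
// the roles of Platt's A and B (up to sign).
def fitPlatt(svm: SVMModel,
             calibration: RDD[LabeledPoint]): LogisticRegressionModel = {
  svm.clearThreshold() // raw margins instead of 0.0/1.0 predictions
  val margins = calibration.map(p =>
    LabeledPoint(p.label, Vectors.dense(svm.predict(p.features))))
  // clearThreshold() makes the calibrated model return probabilities.
  new LogisticRegressionWithLBFGS().run(margins).clearThreshold()
}

// Probability that `features` belongs to the positive class.
def probability(svm: SVMModel, platt: LogisticRegressionModel,
                features: Vector): Double =
  platt.predict(Vectors.dense(svm.predict(features)))

One caveat from Platt's paper: the sigmoid should be fit on data not used to train the SVM, otherwise the margins are optimistically large and the probabilities come out overconfident.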