Tarek,

Looking at the code in SVM.scala, I see that SVMWithSGD.predictPoint first computes dot(w, x) + b, where w is the SVM weight vector, x is the input vector, and b is a constant (the intercept). If a threshold is defined, the output is 1 if that value is greater than the threshold and 0 otherwise. If no threshold is defined, it just returns dot(w, x) + b. There is no requirement that the output be constrained to a specific range.
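Just to make the decision rule concrete, here is a self-contained sketch of the logic described above (my own paraphrase, not the actual SVM.scala source; the names and signature are hypothetical):

    // Hedged sketch of the predictPoint behavior described above.
    // margin = dot(w, x) + b; with a threshold set, predict 1.0/0.0,
    // otherwise return the raw (unbounded) margin.
    def predictPoint(w: Array[Double], x: Array[Double], b: Double,
                     threshold: Option[Double]): Double = {
      val margin = w.zip(x).map { case (wi, xi) => wi * xi }.sum + b
      threshold match {
        case Some(t) => if (margin > t) 1.0 else 0.0
        case None    => margin // no range constraint here
      }
    }

This is why clearing the threshold (as in the snippet below) exposes raw margins of any magnitude.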
For a little problem I was working on, I investigated the outputs a bit; here's a snippet of some stuff you could put in spark-shell (note the Vectors import, which is needed for Vectors.dense):

    model.clearThreshold()
    val foo = x.map(p => (p.label, model.predict(p.features)))
    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.stat.Statistics
    val summary = Statistics.colStats(foo.map { case (a, b) => Vectors.dense(a, b) })
    summary.mean
    summary.min
    summary.max

When I tried that, I found a very large range of outputs -- something like -6*10^6 to -400, with a mean of about -30000. If you look into it, let us know what you find; I would be interested to hear about it.

best,

Robert Dodier

--
View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Problem-in-running-MLlib-SVM-tp15380p15416.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.