Tarek,

Looking at the code in SVM.scala, I see that SVMWithSGD.predictPoint
first computes the margin dot(w, x) + b, where w is the SVM weight vector,
x is the input vector, and b is the intercept. If a threshold is defined,
the output is 1 if that margin exceeds the threshold and 0 otherwise. If no
threshold is defined, it just returns dot(w, x) + b directly. Nothing
constrains that raw output to any specific range.
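For reference, here is a standalone sketch of that logic as I understand it -- plain arrays instead of MLlib Vectors so it runs outside Spark, and the names are mine, not the actual Spark source:

```scala
// Hedged sketch of the predictPoint behavior described above.
// Assumption: "threshold" works like SVMModel's optional threshold --
// Some(t) gives a 0/1 label, None gives the raw margin.
def predictPoint(weights: Array[Double], intercept: Double,
                 features: Array[Double], threshold: Option[Double]): Double = {
  // margin = dot(w, x) + b; nothing bounds this to any range
  val margin = weights.zip(features).map { case (w, x) => w * x }.sum + intercept
  threshold match {
    case Some(t) => if (margin > t) 1.0 else 0.0 // thresholded: binary label
    case None    => margin                       // raw, unbounded score
  }
}

// e.g. predictPoint(Array(1.0, -2.0), 0.5, Array(3.0, 1.0), None)
//      returns the raw margin 1*3 + (-2)*1 + 0.5 = 1.5
```

That None branch is why clearing the threshold (as below) exposes the raw margins.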

For a little problem I was working on, I investigated the outputs a little
bit; here's a snippet of some stuff you could put in spark-shell:

    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.stat.Statistics

    model.clearThreshold()
    val foo = x.map(p => (p.label, model.predict(p.features)))
    val summary = Statistics.colStats(foo.map { case (a, b) => Vectors.dense(a, b) })
    summary.mean
    summary.min
    summary.max

When I tried that, I found a very large range of outputs -- something like
-6*10^6 to -400, with a mean of about -30000. If you look into it, let us
know what you find; I would be interested to hear about it.

best,

Robert Dodier



--
View this message in context: 
http://apache-spark-developers-list.1001551.n3.nabble.com/Problem-in-running-MLlib-SVM-tp15380p15416.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.