Re: return probability \ confidence instead of actual class

Aris Wed, 24 Sep 2014 16:08:08 -0700

Greetings, Adamantios Korais... if that is indeed your name.


Just to follow up on Liquan, you might be interested in removing the
threshold and treating the predictions as raw confidence scores. SVM with
the linear kernel is a straightforward linear classifier, so with
model.clearThreshold() you can just get the raw predicted scores, removing
the threshold which simply translates them into a positive/negative class.
Note that for SVMModel these scores are unbounded margins (w.dot(x) + b),
not probabilities in 0..1.
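
For what it's worth, here is a minimal sketch of that idea in Scala. It
assumes you already have a trained SVMModel and an RDD of feature vectors;
the helper name confidentScores and the minMargin cutoff are made up for
illustration and are not part of MLlib. After clearThreshold(), predict()
returns the raw score w.dot(x) + b, so the sketch keeps only the points
whose score is far from zero on either side.

```scala
import org.apache.spark.mllib.classification.SVMModel
import org.apache.spark.mllib.linalg.Vector
import org.apache.spark.rdd.RDD

// Hypothetical helper (not an MLlib API): pair each point with its raw SVM
// score and keep only those at least `minMargin` away from the boundary.
def confidentScores(
    model: SVMModel,
    points: RDD[Vector],
    minMargin: Double): RDD[(Vector, Double)] = {
  // Note: clearThreshold() mutates the model; predict() now returns the raw
  // margin w.dot(x) + b instead of a 0.0/1.0 label.
  model.clearThreshold()
  points
    .map(x => (x, model.predict(x)))
    .filter { case (_, score) => math.abs(score) >= minMargin }
}
```

A positive score corresponds to the positive class and a negative score to
the negative class. What counts as a large enough minMargin depends on your
data, so it is probably worth looking at the score distribution on a
held-out set before picking a value.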

API is here
http://yhuai.github.io/site/api/scala/index.html#org.apache.spark.mllib.classification.SVMModel

Enjoy!
Aris

On Sun, Sep 21, 2014 at 11:50 PM, Liquan Pei <liquan...@gmail.com> wrote:

> Hi Adamantios,
>
> For your first question, after you train the SVM, you get a model with a
> vector of weights w and an intercept b. Points x such that w.dot(x) + b = 1
> or w.dot(x) + b = -1 lie on the margin boundaries, and the decision
> boundary itself is w.dot(x) + b = 0. The quantity w.dot(x) + b for a point
> x is a confidence measure of the classification.
>
> Code-wise, suppose you trained your model via
>
> val model = SVMWithSGD.train(...)
>
> You can then call
>
> model.setThreshold(your threshold here)
>
> to set the threshold that separates positive predictions from negative
> predictions.
>
> For more info, please take a look at
> http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.mllib.classification.SVMModel
>
> For your second question, SVMWithSGD only supports binary classification.
>
> Hope this helps,
>
> Liquan
>
> On Sun, Sep 21, 2014 at 11:22 PM, Adamantios Corais <
> adamantios.cor...@gmail.com> wrote:
>
>> Nobody?
>>
>> If that's not supported already, can someone please, at least, give me a
>> few hints on how to implement it?
>>
>> Thanks!
>>
>>
>> On Fri, Sep 19, 2014 at 7:43 PM, Adamantios Corais <
>> adamantios.cor...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I am working with the SVMWithSGD classification algorithm on Spark. It
>>> works fine for me; however, I would like to distinguish the instances that
>>> are classified with high confidence from those with low confidence. How do
>>> we define the threshold here? Ultimately, I want to keep only those for
>>> which the algorithm is very *very* certain about its decision! How can I
>>> do that? Is this feature already supported by any MLlib algorithm? What if
>>> I had multiple categories?
>>>
>>> Any input is highly appreciated!
>>>
>>
>>
>
>
> --
> Liquan Pei
> Department of Physics
> University of Massachusetts Amherst
>
