Re: How to compute the probability of each class in Naive Bayes

2015-09-10 Thread Sean Owen
Yes, https://github.com/apache/spark/blob/v1.5.0/mllib/src/main/scala/org/apache/spark/mllib/classification/NaiveBayes.scala#L158 is the method you are interested in. It does normalize the probabilities and return them to non-log-space. So you can use predictProbabilities to get the actual posteri

Re: How to compute the probability of each class in Naive Bayes

2015-09-10 Thread Adamantios Corais
Thanks Sean. As far as I can see, the probabilities are NOT normalized; the denominator isn't implemented in either v1.1.0 or v1.5.0 (by denominator, I mean the probability of feature X). So, for a given lambda, how do I compute the denominator? FYI: https://github.com/apache/spark/blob/v1.5.0/mllib/src/mai
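
For reference, the denominator asked about here can be written out as a short derivation. This is a sketch under assumptions, not a description of what either Spark version implements: the symbols \pi_c and \theta_c are taken to stand for the log prior and the log conditional probabilities held in model.pi and model.theta, and s_c is a name introduced here for the per-class score. The smoothing parameter lambda is assumed to enter only at training time, when theta is estimated, so it does not appear separately in the denominator.

```latex
% Bayes' rule for class c given feature vector x
P(c \mid x) = \frac{P(c)\, P(x \mid c)}{P(x)},
\qquad
P(x) = \sum_{c'} P(c')\, P(x \mid c')

% With the unnormalized log score s_c = \pi_c + \theta_c \cdot x :
\log P(c \mid x) = s_c - \log \sum_{c'} \exp(s_{c'})
```

Any factor that depends only on x (such as the multinomial coefficient) is the same for every class and cancels between numerator and denominator.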

Re: How to compute the probability of each class in Naive Bayes

2015-09-10 Thread Sean Owen
The log probabilities are unlikely to be very large, though the probabilities may be very small. The direct answer is to exponentiate brzPi + brzTheta * testData.toBreeze -- apply exp(x). I have forgotten whether the probabilities are normalized already though. If not you'll have to normalize to g
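
The exponentiate-then-normalize step described above can be sketched as follows. This is a minimal illustration in plain Python rather than Spark or Breeze code; the function names `log_scores` and `posteriors_from_log_scores` and the toy numbers are hypothetical, and `pi` and `theta` are assumed to hold log-probabilities as discussed in this thread. Subtracting the maximum score before calling exp is a common way to avoid underflow when the log scores are very negative.

```python
import math


def log_scores(pi, theta, x):
    """Per-class unnormalized log scores: pi[c] + dot(theta[c], x).

    Assumption: pi and theta hold log-probabilities, one entry/row per class.
    """
    return [p + sum(t * v for t, v in zip(row, x)) for p, row in zip(pi, theta)]


def posteriors_from_log_scores(scores):
    """Turn unnormalized log scores into probabilities that sum to 1."""
    m = max(scores)                           # shift so the largest exponent is 0
    exps = [math.exp(s - m) for s in scores]  # safe: every exponent is <= 0
    total = sum(exps)                         # the normalizing denominator (shifted)
    return [e / total for e in exps]


# Toy model with two classes and two features (hypothetical numbers).
pi = [math.log(0.5), math.log(0.5)]
theta = [
    [math.log(0.8), math.log(0.2)],
    [math.log(0.3), math.log(0.7)],
]
x = [2.0, 1.0]

probs = posteriors_from_log_scores(log_scores(pi, theta, x))
print(probs)
```

Exponentiating the raw scores directly and dividing by their sum gives the same result whenever nothing underflows; the shift only matters for long documents where every score is far below zero.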

Re: How to compute the probability of each class in Naive Bayes

2015-09-10 Thread Adamantios Corais
Great. So, given that *model.theta* holds the log-probabilities (and hence the result of *brzPi + brzTheta * testData.toBreeze* is a big number too), how can I get back the *non-*log-probabilities, which are apparently bounded between *0.0 and 1.0*? *// Adamantios* On Tue, Sep 1, 2

Re: How to compute the probability of each class in Naive Bayes

2015-09-01 Thread Sean Owen
(pedantic: it's the log-probabilities) On Tue, Sep 1, 2015 at 10:48 AM, Yanbo Liang wrote: > Actually > brzPi + brzTheta * testData.toBreeze > is the probabilities of the input Vector on each class, however it's a > Breeze Vector. > Pay attention: the indices of this Vector need to be mapped to the corres

Re: How to compute the probability of each class in Naive Bayes

2015-09-01 Thread Yanbo Liang
Actually, brzPi + brzTheta * testData.toBreeze gives the probabilities of the input Vector for each class; however, it is a Breeze Vector. Note that the indices of this Vector need to be mapped to the corresponding label indices. 2015-08-28 20:38 GMT+08:00 Adamantios Corais : > Hi, > > I am trying to change t
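
The index-to-label mapping mentioned above can be sketched like this. It is an illustration in plain Python, not Spark code: `label_probabilities` is a hypothetical helper, and it assumes the model exposes its labels in the same order as the entries of the per-class score vector (the example label values are made up).

```python
def label_probabilities(labels, values):
    """Pair each class label with the value at the same index.

    Assumption: labels[i] is the label of the class whose score or
    probability sits at position i of the per-class vector.
    """
    if len(labels) != len(values):
        raise ValueError("labels and values must have the same length")
    return dict(zip(labels, values))


# Hypothetical labels and already-normalized probabilities.
labels = [0.0, 1.0, 5.0]
values = [0.2, 0.7, 0.1]

by_label = label_probabilities(labels, values)
best_label = max(by_label, key=by_label.get)  # label with the highest probability
print(by_label, best_label)
```

Keeping the result keyed by label rather than by position avoids mistakes when the labels are not simply 0, 1, 2, and so on.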

How to compute the probability of each class in Naive Bayes

2015-08-28 Thread Adamantios Corais
Hi, I am trying to change the following code so as to get the probabilities of the input Vector on each class (instead of the class itself with the highest probability). I know that this is already available as part of the most recent release of Spark but I have to use Spark 1.1.0. Any help is ap