Yes, you certainly need to consider how and when you update the model with new information; the principle is the same. Low or high posteriors aren't wrong per se. In fact it's normal for one class to be more probable than the others, sometimes much more.
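
If it helps, here is a rough, untested sketch of how you could recover normalized posteriors yourself from the MLlib NaiveBayesModel (which exposes labels, pi as log class priors, and theta as log feature conditionals) and then gate predictions on a per-class threshold. The thresholdFor lookup is hypothetical; you'd calibrate those values on held-out data:

    import org.apache.spark.mllib.classification.NaiveBayesModel
    import org.apache.spark.mllib.linalg.Vector

    // Sketch: normalized posterior probabilities for one example under
    // multinomial NB. pi(k) is the log prior for class k; theta(k) holds
    // the log conditional probabilities of each feature given class k.
    def posteriors(model: NaiveBayesModel, features: Vector): Array[Double] = {
      val x = features.toArray
      val logJoint = model.pi.zip(model.theta).map { case (logPrior, logTheta) =>
        logPrior + x.zip(logTheta).map { case (xi, lp) => xi * lp }.sum
      }
      // Normalize in log space (log-sum-exp) to avoid underflow.
      val m = logJoint.max
      val unnorm = logJoint.map(lj => math.exp(lj - m))
      val z = unnorm.sum
      unnorm.map(_ / z)
    }

    // Accept the top prediction only if it clears that class's threshold;
    // thresholdFor is a hypothetical per-class lookup, not an MLlib API.
    def classify(model: NaiveBayesModel, features: Vector,
                 thresholdFor: Double => Double): Option[Double] = {
      val probs = posteriors(model, features)
      val best = probs.indices.maxBy(i => probs(i))
      val label = model.labels(best)
      if (probs(best) >= thresholdFor(label)) Some(label) else None
    }

Returning None when no class clears its threshold gives you an explicit "not confident enough" outcome rather than a forced prediction.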
On Thu, Nov 20, 2014 at 10:31 AM, jatinpreet <jatinpr...@gmail.com> wrote:
> Thanks a lot Sean. You are correct in assuming that my examples fall
> under a single category.
>
> It is interesting to see that the posterior probability can actually be
> treated as something stable enough to have a constant threshold value on
> a per-class basis. It would, I assume, keep changing for a sample as I
> add/remove documents in the training set, and thus warrant a
> corresponding change in the threshold.
>
> Also, I have seen the class prediction probabilities range from 0.003 to
> 0.8 for correct classifications in my sample data. This is a wide
> spectrum, so is there a way to change that? Maybe by replicating the
> samples for the classes I get low-confidence but accurate
> classifications for.
>
> Thanks,
> Jatin