On 2/18/2014 6:05 PM, Amir Caspi wrote:
On Feb 18, 2014, at 3:58 PM, John Hardin <jhar...@impsec.org> wrote:
Is there some reason the Bayes scores can't/shouldn't be static?

Indeed, I am wondering why Bayes would be auto-scored at all. By definition, 
Bayes high scores should match only on spam, low scores should match only on 
ham. That's not perfect, of course, but it is basically by definition of how 
Bayes learns.

Given that, it seems to me that the Bayes scores should be static, and my 
experience suggests that 99 or 999 should be scored pretty heavily. (I'd say 00 
should be scored heavily negative, but I get enough FNs with 00 that I don't 
like that idea... though it probably means my DB is borked or my ham is full of 
spammy tokens.)
Actually, it's a bit the opposite, especially if using autolearn, where scoring too high on the 99% end can cause low-percentage corpora to swing heavily towards the high score too rapidly.

Regards,
KAm
