On 2/18/2014 8:49 PM, Kevin A. McGrail wrote:
> On 2/18/2014 6:05 PM, Amir Caspi wrote:
>> On Feb 18, 2014, at 3:58 PM, John Hardin <jhar...@impsec.org> wrote:
>>> Is there some reason the Bayes scores can't/shouldn't be static?
>>
>> Indeed, I am wondering why Bayes would be auto-scored at all. By definition,
>> high Bayes scores should match only on spam, and low scores should match only
>> on ham. That's not perfect, of course, but it basically follows from how
>> Bayes learns.
>>
>> Given that, it seems to me that the Bayes scores should be static, and my
>> experience suggests that 99 or 999 should be scored pretty heavily. (I'd say 00
>> should get a heavily negative score, but I get enough FNs with 00 that I don't
>> like that idea... though it probably means my DB is borked or my ham is full of
>> spammy tokens.)
>
> Actually, it's a bit the opposite, especially if using autolearn, where
> scoring too high on the 99% end can cause low-percentage corpora to swing
> heavily towards the high score too rapidly.

Bayes scores are not included when determining what to autolearn, so
changing the Bayes scores should have no effect on autolearning.
Or am I missing something?
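
To make the point concrete, here is a rough sketch of the idea, not SpamAssassin's actual code. It assumes Bayes rules can be recognised by a "BAYES_" name prefix; the function names, the RULE_A/RULE_B rules, and the threshold values are placeholders I made up for illustration.

```python
# Illustrative sketch only; this is not how SpamAssassin is implemented.
# Assumptions: Bayes rules are identified by the "BAYES_" prefix, and the
# thresholds below are placeholder values.

def autolearn_score(hits, scores):
    """Sum the scores of the rules that hit, skipping Bayes rules."""
    return sum(scores[rule] for rule in hits if not rule.startswith("BAYES_"))

def autolearn_decision(hits, scores, ham_threshold=0.1, spam_threshold=12.0):
    """Return "spam", "ham", or None (do not learn) from the non-Bayes score."""
    score = autolearn_score(hits, scores)
    if score > spam_threshold:
        return "spam"
    if score < ham_threshold:
        return "ham"
    return None

# Hypothetical message: one Bayes rule and two made-up non-Bayes rules hit.
hits = ["BAYES_99", "RULE_A", "RULE_B"]
modest = {"BAYES_99": 3.5, "RULE_A": 7.0, "RULE_B": 6.0}
heavy = {"BAYES_99": 50.0, "RULE_A": 7.0, "RULE_B": 6.0}

# The decision is identical no matter how heavily BAYES_99 is scored.
print(autolearn_decision(hits, modest), autolearn_decision(hits, heavy))
```

Under those assumptions, raising or lowering the BAYES_99 score changes the message's total score but not the number that is compared against the autolearn thresholds.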
--
Bowie