Re: How can I view bayes score for individual words?

Matt Kettler Fri, 24 Jul 2009 18:37:22 -0700

snowweb wrote:
> I tried to view the files bayes.toks, bayes.journal, bayes.seen and
> autowhitelist but they just look like gibberish when opened in a unix editor.
> What's the solution to this?
>   
The bayes database stores truncated SHA1 hashes of the words, so it cannot
be turned back into human-readable text using the database alone. This is
done for performance reasons (fixed-size tokens = faster random access),
but it has the side benefit of keeping your bayes DB from containing words
that may imply things about your confidential emails.
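
To make the hashing point concrete, here is a minimal Python sketch of what
a truncated SHA1 token key looks like. The 5-byte length, the choice of the
trailing bytes, and the name token_key are assumptions for illustration
only; the real tokenizer does more than hash bare words.

```python
import hashlib

TOKEN_BYTES = 5  # assumed truncation length, for illustration only


def token_key(word: str) -> bytes:
    """Return a fixed-size key: the trailing bytes of the word's SHA1 digest."""
    digest = hashlib.sha1(word.encode("utf-8")).digest()  # always 20 bytes
    return digest[-TOKEN_BYTES:]


# Every key has the same length regardless of how long the word is,
# and the key alone does not reveal which word produced it.
for word in ("mortgage", "unsubscribe", "a"):
    print(word, token_key(word).hex())
```

The only way to map a key back to a word is to hash candidate words and
compare, which is why the files look like noise in an editor.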


However, if you run a message through spamassassin with -D bayes=9 it
should dump all the tokens in that message along with their scores from
the bayes DB.
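
If you want to collect that output from a script, something like the
following Python sketch might do. It assumes spamassassin is on your PATH,
that the message is read from standard input, and that the debug output
goes to stderr; the helper names and the simple "bayes" substring filter
are my own assumptions, not anything SpamAssassin defines.

```python
import subprocess


def bayes_debug_lines(stderr_text: str) -> list[str]:
    """Keep only the debug lines that mention bayes (assumed substring filter)."""
    return [line for line in stderr_text.splitlines() if "bayes" in line.lower()]


def dump_bayes_debug(message_path: str) -> list[str]:
    """Run one saved message through spamassassin with the -D bayes=9 flag."""
    with open(message_path, "rb") as msg:
        result = subprocess.run(
            ["spamassassin", "-D", "bayes=9"],
            stdin=msg,            # the message is fed on standard input
            capture_output=True,  # debug output is expected on stderr
        )
    return bayes_debug_lines(result.stderr.decode("utf-8", errors="replace"))
```

Calling dump_bayes_debug("message.eml") (a hypothetical file name) would
return the bayes-related debug lines for that one message.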

>  I was hoping to be able to tweak some of the
> scores and add certain words etc.
That would be a very misguided thing to do. Bayes is a statistical
system, and statistics work better with real measurements, not biased
numbers based on your own guesswork.

The reality of things is that a learning statistics system based on
email is really gathering statistics based on human behavior. Human
behavior is *way* more complex than you think it is. :-)

If you really want to tweak the score of some words, create static rules
for them. Leave bayes to do its own exacting measurements.
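
As a rough sketch, a static rule for a single word could look like the
fragment below, typically placed in local.cf. The rule name, the word, and
the score are all made up for illustration; pick your own.

```
# Hypothetical rule name, word, and score; adjust all three for your site.
body     LOCAL_EXAMPLE_WORD   /\bexampleword\b/i
describe LOCAL_EXAMPLE_WORD   Body contains a word I want to score by hand
score    LOCAL_EXAMPLE_WORD   1.5
```

A rule like this adds a fixed score whenever the word matches, independent
of whatever bayes has learned about it.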
