snowweb wrote:
> I tried to view the files bayes.toks, bayes.journal, bayes.seen and
> autowhitelist but they just look jibberish when opened in a unix editor.
> What's the solution to this?
>

The bayes database stores truncated SHA1 hashes of the words, so it cannot be reversed back to human-readable text using the database alone. This is done for performance reasons (fixed-size tokens = faster random access), but it has the side benefit of keeping your bayes DB from containing words that may imply things about your confidential emails.
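
To illustrate why the files look like binary noise, here is a rough sketch of what "truncated SHA1 hash of a word" means. It is not SpamAssassin's actual code: the function name bayes_token, the 5-byte length, and the choice to keep the trailing bytes are all assumptions made for the example.

```python
import hashlib

# Assumption for illustration: how many bytes of the hash are kept per token.
TOKEN_BYTES = 5


def bayes_token(word: str, keep: int = TOKEN_BYTES) -> bytes:
    """Return a fixed-size, truncated SHA1 digest of a word (hypothetical helper)."""
    digest = hashlib.sha1(word.encode("utf-8")).digest()  # always 20 bytes
    # Assumption: keep the trailing bytes; which end is kept does not
    # matter for the point being made here.
    return digest[-keep:]


if __name__ == "__main__":
    for word in ["meeting", "invoice", "unsubscribe"]:
        # Every token is the same length regardless of the word's length,
        # and the word itself cannot be read back out of the bytes.
        print(word, bayes_token(word).hex())
```

Every word maps to the same number of raw bytes, which is what makes fixed-size lookups cheap, and it is also why a text editor shows nothing readable.
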
However, if you run a message through spamassassin with -D bayes=9 it should dump all the tokens in the message with their score from the bayes DB.

> I was hoping to be able to tweak some of the
> scores and add certain words etc.

That would be a very misguided thing to do. Bayes is a statistical system, and statistics work better with real measurements, not biased numbers based on your own guesswork.

The reality of things is that a learning statistics system based on email is really gathering statistics based on human behavior. Human behavior is *way* more complex than you think it is. :-)

If you really want to tweak the score of some words, create static rules for them. Leave bayes to do its own exacting measurements.