Re: sa-stats and Spamtagging

Chris St. Pierre Tue, 13 Feb 2007 07:39:43 -0800

On Tue, 13 Feb 2007, LuKreme wrote:

Now, perhaps I am misunderstanding, but BAYES_99 is hitting on 5% of ham? and AWL on 35% of spam?


Keep in mind that AWL is slightly misnamed; it doesn't just whitelist,
it adjusts scores (both positively and negatively) based on previous
history.  So the fact that it's hitting on 35% of your spam is pretty
meaningless, really.
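
To make that concrete, here is a rough Python sketch of a history-based
adjustment of this kind.  It assumes the adjustment pulls the current
score toward the sender's historical mean by a fixed factor (0.5 here,
which I believe matches the default auto_whitelist_factor, but treat
that as an assumption).  The function name and arguments are made up
for illustration; this is not SA's actual code.

```python
def awl_adjustment(current_score, history_total, history_count, factor=0.5):
    """Return the score delta an AWL-style rule would contribute.

    Hypothetical helper: history_total and history_count stand for the
    summed scores and message count previously seen from this sender.
    """
    if history_count == 0:
        # No history for this sender, so nothing to adjust.
        return 0.0
    mean = history_total / history_count
    # Move the score part of the way toward the historical mean.
    return (mean - current_score) * factor


# A sender with a clean history sending a spammy-looking message:
# the adjustment is negative (pushes the score down).
print(awl_adjustment(10.0, history_total=4.0, history_count=2))

# A sender with a spammy history sending a clean-looking message:
# the adjustment is positive (pushes the score up).
print(awl_adjustment(1.0, history_total=20.0, history_count=2))
```

Under that assumption the rule fires in both directions, which is why
a raw hit percentage for AWL tells you very little on its own.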

sa-stats counts a message as spam only if SA marked it as spam.  So
the fact that BAYES_99 is hitting on 5% of ham means (roughly) that 5%
of your unmarked mail hit either BAYES_99 alone, or BAYES_99 plus too
few other rules to push it over the spam threshold.  That means either
that you need to train your Bayes better, or that your Bayesian
component is very well trained and you need to turn up the score for
BAYES_99.  The only way to know the difference is to look at the
messages that are getting tagged with BAYES_99 but are not marked as
spam.  If Bayes is right about them, turn up your scoring; if not,
continue training.
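
One way to pull out the messages worth looking at is to scan the
X-Spam-Status headers for BAYES_99 hits on mail that was not marked.
A minimal Python sketch follows, assuming the usual header layout
("No, score=... required=... tests=RULE_A,RULE_B ...").  The helper
names are hypothetical, and you would still need to feed it the header
values from your own mail store.

```python
import re


def parse_spam_status(value):
    """Return (is_spam, set_of_rule_names) for one X-Spam-Status value."""
    # Unfold continuation lines and close up the comma-separated list.
    text = " ".join(value.split())
    text = re.sub(r",\s+", ",", text)
    is_spam = text.lower().startswith("yes")
    match = re.search(r"tests=(\S*)", text)
    tests = set(t for t in match.group(1).split(",") if t) if match else set()
    return is_spam, tests


def bayes99_not_marked(status_values):
    """Keep the header values that hit BAYES_99 but were not marked as spam."""
    hits = []
    for value in status_values:
        is_spam, tests = parse_spam_status(value)
        if "BAYES_99" in tests and not is_spam:
            hits.append(value)
    return hits


sample_headers = [
    "No, score=3.5 required=5.0 tests=BAYES_99,HTML_MESSAGE autolearn=no",
    "Yes, score=9.1 required=5.0 tests=BAYES_99,\n\tURIBL_BLACK autolearn=no",
    "No, score=0.1 required=5.0 tests=HTML_MESSAGE autolearn=no",
]
for header in bayes99_not_marked(sample_headers):
    print(header)
```

Only the first sample header is kept: the second was marked as spam,
and the third never hit BAYES_99.  Those kept messages are the ones to
read by hand before deciding between rescoring and retraining.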

This is where a user feedback loop -- such as spam/ham reporting links
in your webmail client, or the equivalent training for desktop client
users -- can be really useful.

Chris St. Pierre
Unix Systems Administrator
Nebraska Wesleyan University
----------------------------
Never send mail to [EMAIL PROTECTED]
