Re: Bayes Auto Learn

Matt Kettler Thu, 17 May 2007 04:33:13 -0700

Daniel Aquino wrote:
> Is SpamAssassin smart enough to not auto-learn (bayesian) spam if the
> default tests "already" detect it as spam... ?
No, and in fact skipping those messages is exactly what you DO NOT want to do.


Bayes training doesn't apply to just the one message being learned. Tokens
learned from one spam get applied when scoring other spams.

> What I'm wondering is
> if the other tests have already deemed it to be spam, then why would
> you want to increase the size of your bayesian db...
Training on caught spam won't make the bayes DB grow without bound. SA
automatically expires tokens that haven't been used recently in order to keep
the token count below a specified limit. (see bayes_expiry_max_db_size in the
conf docs)
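
To illustrate the idea only, here is a toy sketch in Python of a token store
with a cap. This is NOT SA's actual expiry code; the TokenStore class, the
max_tokens cap and the logical clock are all made up for the example. The
point is just that learning new tokens pushes out the stalest ones, so the
total stays bounded.

```python
# Toy model of a size-capped token store; NOT SpamAssassin's real expiry logic.
class TokenStore:
    def __init__(self, max_tokens):
        self.max_tokens = max_tokens   # hypothetical cap on the token count
        self.last_used = {}            # token -> logical time it was last seen
        self.clock = 0                 # simple counter standing in for a timestamp

    def touch(self, tokens):
        """Record that these tokens were just learned or seen, then expire."""
        self.clock += 1
        for tok in tokens:
            self.last_used[tok] = self.clock
        self.expire()

    def expire(self):
        """Drop the least recently used tokens until the count is back at the cap."""
        excess = len(self.last_used) - self.max_tokens
        if excess > 0:
            oldest = sorted(self.last_used, key=self.last_used.get)[:excess]
            for tok in oldest:
                del self.last_used[tok]


store = TokenStore(max_tokens=4)
store.touch(["viagra", "cheap", "pills"])
store.touch(["cheap", "meds", "online"])   # "cheap" is refreshed, "viagra" goes stale
```

After the second call the store still holds only four tokens, and the one that
was dropped is the stale "viagra", not the recently refreshed "cheap".
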
> Bayesian I
> believe would be better applied to messages that appear to be slipping
> past the other tests...
That is misguided. It is certainly more important to train on messages
that are missed, but at the same time it is also important to train on
fresh spam that is caught.

You have to consider that spam is a mutating thing. Even if a spam is
caught, and even if it already hits BAYES_99, it can still contain new
tokens caused by these mutations.

So, if you avoid training the new mutations, and wait until there are
enough of them that the whole family of spam starts getting missed, you'll
have to play catch-up.

On the other hand, if you consistently train on spam, then as it mutates it
will continue to get high bayes scores, and will likely never be missed at all.
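
A rough way to see the catch-up effect is the toy simulation below. It is a
sketch under heavy assumptions, not SA's Bayes implementation: ToyBayes uses a
simplified Graham-style product to combine token probabilities, and the token
list, the ham message and the "one token swapped per generation" mutation
model are all invented for the example.

```python
from collections import Counter
from math import prod


class ToyBayes:
    """Tiny Graham-style token classifier; a simplification, not SA's Bayes code."""

    def __init__(self):
        self.spam, self.ham = Counter(), Counter()
        self.n_spam = self.n_ham = 0

    def learn(self, tokens, is_spam):
        if is_spam:
            self.spam.update(set(tokens))
            self.n_spam += 1
        else:
            self.ham.update(set(tokens))
            self.n_ham += 1

    def token_prob(self, tok):
        s = self.spam[tok] / max(self.n_spam, 1)
        h = self.ham[tok] / max(self.n_ham, 1)
        if s + h == 0:
            return 0.5                           # never-seen token: neutral
        return min(max(s / (s + h), 0.01), 0.99)  # clamp away from 0 and 1

    def score(self, tokens):
        probs = [self.token_prob(t) for t in set(tokens)]
        spammy = prod(probs)
        hammy = prod(1 - p for p in probs)
        return spammy / (spammy + hammy)


# Hypothetical spam family: each generation swaps one old token for a new one.
VOCAB = ["v1agra", "che4p", "p1lls", "0nline", "rx", "m3ds", "d1scount", "ph4rm"]
HAM = ["meeting", "agenda", "tomorrow", "lunch"]


def generation(g, width=4):
    return VOCAB[g:g + width]


frozen = ToyBayes()    # learns only the first generation, then stops
trained = ToyBayes()   # keeps learning every generation it sees
for clf in (frozen, trained):
    clf.learn(HAM, is_spam=False)
    clf.learn(generation(0), is_spam=True)

frozen_scores, trained_scores = [], []
for g in range(1, 5):
    msg = generation(g)
    frozen_scores.append(frozen.score(msg))
    trained_scores.append(trained.score(msg))
    trained.learn(msg, is_spam=True)   # train on it even though it was caught
```

In this toy setup the frozen classifier keeps scoring the family high for a
few generations, then drops to a neutral 0.5 once the last token it knew has
mutated away. The one that kept training never loses track of the family.
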
