Re: SA efficiency degrades quickly

Robert Menschel Tue, 21 Jun 2005 18:29:43 -0700

Hello Mailing,

Tuesday, June 21, 2005, 10:48:44 AM, you wrote:


MLANC> Hi!
MLANC>  I have a little problem with spam recognition. I have retrained
MLANC> SpamAssassin (deleting the old files from the ".spamassassin" directory
MLANC> to clear old information) and it worked really nicely... but after a few
MLANC> days, the efficiency of SpamAssassin degrades from >90% of spam correctly
MLANC> identified to about 60%... I tried to train it again with new,
MLANC> unrecognized spam (and with all new ham, to keep roughly a 1:1 ratio
MLANC> of spam:ham) but without any result.

My experience is the opposite -- after wiping a Bayes database SA is
initially 70%-80% accurate, and then rises steadily to 95% and better
(better = with SARE rules).

I'm guessing you may have auto-learn enabled with the default limits,
so spam that sneaks by with a score of 0.0 or 0.1 is learned as
non-spam, polluting your database.

If you have reliable negative-scoring ham rules (which generally are
domain- or user-specific), then set your auto-learn ham threshold to
some negative score (-0.2 or -0.5 or something like that).  If you
have no reliable negative-scoring ham rules, then turn off auto-learn
and ONLY use sa-learn manually as you describe above.
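
As a rough sketch, the two options might look like this in your
SpamAssassin configuration (local.cf, for example).  The -0.5 value is
only an illustration taken from the range above, and I'm assuming the
standard option names; check the documentation for your version before
copying any of it.

    # Option 1: you have reliable negative-scoring ham rules.
    # Only auto-learn a message as ham when it scores below zero.
    bayes_auto_learn 1
    bayes_auto_learn_threshold_nonspam -0.5

    # Option 2: no reliable negative-scoring ham rules.
    # Disable auto-learn and train only by hand with sa-learn.
    # bayes_auto_learn 0

Use one option or the other, not both.  With option 2, training is
done entirely with "sa-learn --spam" and "sa-learn --ham" on mail you
have sorted by hand.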

That may take care of your problem.

Alternatively, are you using SARE rules?  Start with the most reliable
SARE rules files, expand slowly, and they'll probably help you avoid
Bayes degradation.

Bob Menschel
