Bayes Autolearn Threshold - different scoring?

greg 11 Mar 2005 17:00:27 -0000

Hello all,
Let me start out by saying I've been searching for a couple of days on the
web on this subject but to no avail, so I would appreciate any help.


I have been using SA for more than a year and right now I'm running 3.0.1
on Linux (bayes corpus size: nspam = 19482, nham = 3249). My filter
behaves very well: I only get about one false positive a month and 2-3
false negatives (averaging about 100 spams a day,
http://www.amnesiak.com/spam/ if you're curious). I'm invoking SA through
procmailrc with | spamassassin -p /home/greg/.spamassassin/user_prefs .

My problem is this: I'm using squirrelmail, and to keep an eye on false
negatives (I define those as real mails that get shuttled to spam, just to
keep things clear) I have a 'spam' folder. As anyone who uses squirrelmail
knows, it gets very slow when any folder contains more than a few hundred
messages. Since my filter is trained very well, I'd like to send
autolearned spams to /mail/Trash (ultimately to /dev/null) so I don't have
to deal with those. I figured just setting bayes_auto_learn_threshold_spam
to 6 would work great, but it does not do much of anything. I've decreased
it to 3, and to 1, but it really doesn't make a difference. I found these
relevant lines in a debug:

debug: running full-text regexp tests; score so far=4.648
debug: auto-learn: currently using scoreset 3, recomputing score based on scoreset 1.
debug: auto-learn: message score: 4.648, computed score for autolearn: 3.987
debug: auto-learn? ham=0.1, spam=1, body-points=0, head-points=-2.82, learned-points=1.886
debug: auto-learn? no: scored as spam but too few body points (0 < 3)
debug: is spam? score=4.648 required=1

What, exactly, is going on here? The head points I can explain (this is a
spam I saved that had already come to me) but the body points - I don't
understand. It also wasn't clear to me until this debug that the autolearn
had its own scoring system.
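
Read literally, the debug lines above describe a decision along the
following lines. This is only a minimal Python sketch reconstructed from
that output, not SpamAssassin's actual code: the function name
autolearn_decision and its parameters are hypothetical, the order of the
checks is a guess, and the minimum of 3 head points is an assumption (the
log only shows the minimum of 3 body points). The score passed in is the
recomputed one (3.987), not the normal message score (4.648).

```python
def autolearn_decision(score, body_points, head_points,
                       ham_threshold=0.1, spam_threshold=1.0,
                       min_body=3.0, min_head=3.0):
    """Hypothetical reconstruction of the auto-learn check seen in the log.

    score         -- the score recomputed for auto-learning
    body_points   -- points contributed by body tests
    head_points   -- points contributed by header tests
    min_body      -- taken from the "(0 < 3)" in the debug output
    min_head      -- ASSUMED to mirror min_body; not shown in the log
    """
    if score < ham_threshold:
        return "ham"
    if score < spam_threshold:
        return "no: between thresholds"
    # Scored high enough to be a spam candidate, but the points must
    # also be spread across body and header tests.
    if body_points < min_body:
        return f"no: too few body points ({body_points:g} < {min_body:g})"
    if head_points < min_head:
        return f"no: too few head points ({head_points:g} < {min_head:g})"
    return "spam"


# Values from the debug output: spam threshold 1, recomputed score 3.987.
print(autolearn_decision(3.987, body_points=0, head_points=-2.82))
# prints: no: too few body points (0 < 3)
```

If the check really works this way, lowering
bayes_auto_learn_threshold_spam cannot help with this message, because
the body-point minimum is tested separately from the threshold.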

Any help or clarification would be great!

Thanks,
-Greg
