Dave Koontz wrote:
I guess mileage varies. Auto-Learn has been a life saver for us and has
drastically reduced the false positives we used to get with emails to our
College's Health Care & Research departments. We pass all local user email
through SA as well, so this really helps the system learn what is 'good'
email.
I'd suggest that everyone should at least try it and monitor the results.
I have found autolearn to be quite a valuable function here as well.
Keep in mind that I have adjusted the autolearn threshold values to
prevent things from being autolearned incorrectly. I would suggest
others do the same if they use autolearn. IMO, with the default scores,
it is too easy for false learning to occur. I use:
bayes_auto_learn_threshold_nonspam -0.1
bayes_auto_learn_threshold_spam 10.0
-Jim
-----Original Message-----
From: Nigel Frankcom [mailto:[EMAIL PROTECTED]
Sent: Friday, December 29, 2006 11:17 AM
To: users@spamassassin.apache.org
Subject: Re: sa-learn explained
On Fri, 29 Dec 2006 09:51:05 -0500, Andy Figueroa
<[EMAIL PROTECTED]> wrote:
I still feel like a tyro with SpamAssassin, but my installation is
catching better than 99% with perhaps 0.1% false positives (thanks in
large part to things I've learned from this list), and I think I can
tell you a couple of things better than just "read the manual". (But, do
read the manual!) My initial experience with SpamAssassin about a year
ago was through a large web hosting company and I was limited to
playing with SpamAssassin through cpanel, though till they moved
SpamAssassin to its own server, I could also edit my own user
preferences directly. The problem was, this big company never could
get it right, so now I'm running my own mailserver(s) out of what
seemed like necessity. I'm running Gentoo with SA 3.1.7.
sa-learn is used to train the Bayesian database and keep it up-to-date.
So, turn on autolearn in your /etc/mail/spamassassin/local.cf so the
line reads:
bayes_auto_learn 1
(should be on by default).
This will cause selected spam and ham that you get to be used
automagically to keep the bayesian database up-to-date.
I'm using maildir and have two subdirectories in my .maildir called:
2-learn-spam
2-learn-ham
I put missed spam in 2-learn-spam and ham misclassified as spam in
2-learn-ham. Then, whenever I have a few messages in one of those
directories, I run one of the following scripts:
learnspam.scr, which contains this line:
sa-learn --spam --progress /home/figueroa/.maildir/.2-learn-spam/cur
learnham.scr which contains this line:
sa-learn --ham --progress /home/figueroa/.maildir/.2-learn-ham/cur
This is on my personal mailserver. On the mailserver I run at a
school, I run those scripts on each user's 2-learn-spam/ham directories
every night under crontab.
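A rough sketch of what such a nightly job could look like, for anyone who wants a starting point. This is not the script actually in use at the school. The /home/*/.maildir layout is guessed from the paths quoted above, and HOME_BASE, SA_LEARN and learn_all are names made up for this example. --progress is left out because nobody watches cron output.

```shell
#!/bin/sh
# Nightly training sketch: feed every user's 2-learn-spam and 2-learn-ham
# folders to sa-learn. HOME_BASE and the folder layout are assumptions
# modelled on the paths quoted above; adjust them to your system.

HOME_BASE="${HOME_BASE:-/home}"
SA_LEARN="${SA_LEARN:-sa-learn}"   # set SA_LEARN=echo for a dry run

learn_all() {
    for maildir in "$HOME_BASE"/*/.maildir; do
        # An unmatched glob stays literal, so check that it is a directory.
        [ -d "$maildir" ] || continue
        for kind in spam ham; do
            dir="$maildir/.2-learn-$kind/cur"
            # Skip users who have not created the training folder.
            [ -d "$dir" ] || continue
            "$SA_LEARN" "--$kind" "$dir"
        done
    done
}

learn_all
```

Running it once with SA_LEARN=echo prints the sa-learn arguments instead of training, which is a cheap way to check the paths. Note that sa-learn trains the Bayes database of whichever user runs it, so this loop only makes sense as written if the site uses one shared database (for example via bayes_path in local.cf); with per-user databases the sa-learn call would need to run as each user instead.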
Run an up-to-date version of SpamAssassin. I was having pretty good
results with 3.1.3 (the unmasked version in Gentoo), but got
immediately better results when I upgraded to the current version.
Also, to keep your RULES up-to-date, run sa-update as root from
time-to-time.
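If "from time-to-time" turns into a cron job, something along these lines may help. It is only a sketch: update_rules and SA_UPDATE are names invented here, and the messages are placeholders. It relies on sa-update's documented exit codes (0 when new rules were installed, 1 when nothing new was available, higher values for errors).

```shell
#!/bin/sh
# Sketch of a wrapper around sa-update that reports what happened.
# update_rules and SA_UPDATE are illustrative names, not part of SpamAssassin.

update_rules() {
    status=0
    "${SA_UPDATE:-sa-update}" || status=$?
    case $status in
        # New rules were downloaded and installed. spamd only reads rules
        # at startup, so restart it with whatever your system uses.
        0) echo "rules updated; restart spamd to load them" ;;
        # Exit code 1 is not an error: no fresh updates were available.
        1) echo "rules already current" ;;
        *) echo "sa-update failed with status $status" >&2
           return "$status" ;;
    esac
}
```

Call update_rules from a script in root's crontab, and add the restart command for your own init system in the first branch once you have checked it by hand.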
Good luck! Happy spamassassaning!
Personally, I'd disagree with auto-learn; having used SA in a production
environment for some years I've found manual training to be a better
solution.
YMMV
Just my 2 (pick your currency) worth.
Nigel