RE: sa-learn explained

vertito Fri, 29 Dec 2006 11:25:05 -0800

Personally, attended sa-learn works better for me than unattended
auto-learn. As they always say, one man's spam is another man's ham.


2 cents here.

-----Original Message-----
From: Jim Maul [mailto:[EMAIL PROTECTED] 
Sent: Friday, December 29, 2006 6:28 PM
To: users@spamassassin.apache.org
Subject: Re: sa-learn explained

Dave Koontz wrote:
>  
> I guess mileage varies.  Auto-Learn has been a life saver for us and
> has drastically reduced the false positives we used to get with emails to
> our College's Health Care & Research departments.  We pass all local
> user email through SA as well, so this really helps the system learn
> what is 'good' email.
> 
> I'd suggest that everyone should at least try it and monitor the results.
> 
> 

I have found autolearn to be quite a valuable function here as well.
Keep in mind that I have adjusted the autolearn threshold values to
prevent things from being autolearned incorrectly.  I would suggest
others do the same if they use autolearn.  IMO, with the default
scores, it is too easy for false learning to occur. I use:

bayes_auto_learn_threshold_nonspam -0.1
bayes_auto_learn_threshold_spam 10.0

-Jim


> -----Original Message-----
> From: Nigel Frankcom [mailto:[EMAIL PROTECTED]
> Sent: Friday, December 29, 2006 11:17 AM
> To: users@spamassassin.apache.org
> Subject: Re: sa-learn explained
> 
> On Fri, 29 Dec 2006 09:51:05 -0500, Andy Figueroa 
> <[EMAIL PROTECTED]> wrote:
> 
>> I still feel like a tyro with SpamAssassin, but my installation is 
>> catching better than 99% with perhaps 0.1% false positives (thanks in 
>> large part to things I've learned from this list), and I think I can 
>> tell you a couple of things better than just read the manual.  (But, 
>> do read the manual!)  My initial experience with SpamAssassin about a 
>> year ago was through a large web hosting company and I was limited to 
>> playing with SpamAssassin through cpanel, though until they moved 
>> SpamAssassin to its own server, I could also edit my own user 
>> preferences directly.  The problem was, this big company never could 
>> get it right, so now I'm running my own mailserver(s) out of what 
>> seemed like necessity.  I'm running Gentoo with SA 3.1.7.
>>
>> sa-learn is used to train and keep up-to-date the bayesian database.  
>> So, turn on autolearn in your /etc/mail/spamassassin/local.cf so the 
>> line reads:
>> bayes_auto_learn 1
>> (should be on by default).
>> This will cause selected spam and ham that you get to be used 
>> automagically to keep the bayesian database up-to-date.
>>
>> I'm using maildir and have two subdirectories in my .maildir called:
>> 2-learn-spam
>> 2-learn-ham
>>
>> I put missed spam in 2-learn-spam and ham misclassified as spam in 
>> 2-learn-ham.  Then, whenever I have a few messages in one of those 
>> directories, I run one of the following scripts:
>>
>> learnspam.scr, which contains this line:
>> sa-learn --spam --progress /home/figueroa/.maildir/.2-learn-spam/cur
>>
>> learnham.scr which contains this line:
>> sa-learn --ham --progress /home/figueroa/.maildir/.2-learn-ham/cur
>>
>> This is on my personal mailserver.  On the mailserver I run at a 
>> school, I run that script on each user's 2-learn-spam/ham directories 
>> every night under crontab.
>>
>> Run an up-to-date version of SpamAssassin.  I was having pretty good 
>> results with 3.1.3 (the unmasked version in Gentoo), but got 
>> immediately better results when I upgraded to the current version.
>>
>> Also, to keep your RULES up-to-date, run sa-update as root from 
>> time-to-time.
>>
>> Good luck!  Happy spamassassaning!
> 
> 
> Personally, I'd disagree with auto-learn; having used SA in a 
> production environment for some years I've found manual training to be 
> a better solution.
> 
> YMMV
> 
> Just my 2 (pick your currency) worth.
> 
> Nigel
> 
> 
> 
>
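
The nightly per-user training run mentioned in the quoted message could look
roughly like the sketch below. It is not Andy's actual script. The mail root
(/home/<user>/.maildir), the MAIL_ROOT and SA_LEARN variables, the train_all
function and the --run switch are all assumptions made for illustration; only
the folder names and the sa-learn --spam / --ham options come from the thread.
The --progress option is left out here because nobody watches a cron job.

```shell
#!/bin/sh
# Sketch only: train Bayes from every user's 2-learn-spam / 2-learn-ham folders.
# Assumed layout (hypothetical): $MAIL_ROOT/<user>/.maildir/.2-learn-{spam,ham}/cur
# Assumed setup: the account running this script owns the Bayes database to train.

MAIL_ROOT="${MAIL_ROOT:-/home}"    # hypothetical mail root
SA_LEARN="${SA_LEARN:-sa-learn}"   # overridable, e.g. SA_LEARN=echo for a dry run

train_all() {
    for maildir in "$MAIL_ROOT"/*/.maildir; do
        # An unmatched glob stays literal, so check that the directory exists
        [ -d "$maildir" ] || continue
        for kind in spam ham; do
            dir="$maildir/.2-learn-$kind/cur"
            # Skip training folders that are missing or empty
            [ -d "$dir" ] || continue
            [ -n "$(ls -A "$dir")" ] || continue
            # Expands to e.g.: sa-learn --spam /home/alice/.maildir/.2-learn-spam/cur
            $SA_LEARN "--$kind" "$dir"
        done
    done
}

# Only act when asked to, so the function can be sourced and tried out safely
if [ "${1:-}" = "--run" ]; then
    train_all
fi
```

Running it first with SA_LEARN=echo prints the sa-learn arguments it would use
without touching the Bayes database. Once that looks right, a crontab entry
that calls the script with --run would give the nightly schedule. Messages
stay in the folders after learning, so you may want to move or delete them
afterwards; sa-learn skips messages it has already learned, so leaving them
costs time but should not double-count them.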
