Re: sa-learn

Gene Heskett Mon, 20 Apr 2009 22:21:36 -0700

On Monday 20 April 2009, alexus wrote:
>I'm trying to teach my SA what's spam.
>
>It's a brand new, out-of-the-box SA. I have a few domains that get
>nothing but spam, and on top of that it all seems to come from the same
>spammers: they "picked" addresses they thought would be good to spam
>and keep on spamming them.
>
>So I do sa-learn --spam *
>and after a while it says something like
>
>Learned tokens from 52 message(s) (52 message(s) examined)
>
>Yet when more of much the same email comes in, it still can't
>determine whether it's spam or not...
>
>Am I doing something wrong? Or isn't sa-learn supposed to work the way
>I thought it would?


You need to have it learn at least 200 messages each of 'ham' and 'spam'
before the Bayes classifier has enough data to start scoring mail.  Spam
alone won't get it there.  So sort your mail into separate directories, and
have it learn a clean inbox as ham and an all-spam directory as spam.

Once it has learned those, it keeps track and will not learn those
particular emails again, so you can clean out the spam box: just delete its
contents.

I even use a cleaned-up mailing list, sorted into its own directory, as ham
just so it knows stuff from that list is generally ham.  I had one list where
I never figured out what was spammy about it, and since the corpus of that
list went back several years, I fed the whole thing to SA as ham.  It took
several hours, but there are no more problems with that list's messages now.

Now, the spam that does get through goes into a spam dir, and a daily cron
job learns it, then deletes it.  I'm lazy, and repetitive tasks are to be
done by a cron-fired script around this camp. :)
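
For what it's worth, a learn-then-delete job like that could look roughly like the sketch below.  This is not my actual script, just a minimal illustration: the spam directory path and the function name learn_spam_and_clean are made up for the example, and it assumes one message per file (maildir style).  It only deletes the messages if sa-learn reports success.

```shell
#!/bin/sh
# Hypothetical location of the "spam that got through" folder; adjust to
# wherever your mail client or filter actually files it.
SPAM_DIR="${SPAM_DIR:-$HOME/Maildir/.Spam/cur}"

# Allow the sa-learn binary to be overridden (handy for a dry run).
SA_LEARN="${SA_LEARN:-sa-learn}"

# Learn every message in the given directory as spam, then empty it.
# The files are removed only when sa-learn exits successfully.
learn_spam_and_clean() {
    dir="$1"

    # Nothing to do if the directory does not exist.
    [ -d "$dir" ] || return 0

    if "$SA_LEARN" --spam "$dir"; then
        # Delete the learned messages but keep the directory itself.
        find "$dir" -type f -exec rm -f {} +
    fi
}

learn_spam_and_clean "$SPAM_DIR"
```

Saved under some name of your choosing and made executable, it can be run once a day from the crontab of the user that owns the Bayes database.  The initial training is the same idea without the delete: one sa-learn --ham run on the clean inbox and one sa-learn --spam run on the spam directory.  To see whether you have reached 200 of each, sa-learn --dump magic prints the nspam and nham counts.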

-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Any two philosophers can tell each other all they know in two hours.
                -- Oliver Wendell Holmes, Jr.
