I'm a bit confused here.  As I understand it (and I could very well be mistaken, 
mind you), the reason it only autolearns spam scoring over 15 points by default is to 
make darned sure that it never learns a false positive.  One would then augment that 
learning by feeding missed spams through sa-learn.  The only reason I can think of 
NOT to feed low-scoring spams through sa-learn would be a belief that a spam scoring 
5.x points has no interesting tokens.  Quite the opposite is true; that's why we feed 
it a corpus of known spam in the first place, rather than a corpus of known spam that 
has been run through spamassassin manually with the under-15 spams weeded out.  The 
same goes for hand-feeding hams that score 4.x points, on the theory that there's a 
fixed probability that a ham from that source will at some point trigger another 
test and be tripped over the threshold.
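
To make the gap I'm describing concrete, here is a rough Python sketch.  It is not 
SpamAssassin's code; the function names are made up for illustration, and the ham 
cutoff of 0.0 is a placeholder I picked, not a documented default.  Only the 15-point 
spam figure comes from the discussion above.

```python
# Illustrative sketch only: not SpamAssassin's implementation.
# All names here are hypothetical.

SPAM_AUTOLEARN = 15.0  # spam autolearn figure quoted in this thread
HAM_AUTOLEARN = 0.0    # placeholder assumption, not a documented default


def autolearn_decision(score, spam_cutoff=SPAM_AUTOLEARN, ham_cutoff=HAM_AUTOLEARN):
    """Return 'spam', 'ham', or None when the message is not autolearned."""
    if score > spam_cutoff:
        return "spam"
    if score < ham_cutoff:
        return "ham"
    return None


def needs_hand_feeding(scores):
    """Return the scores that fall in the gap where nothing is autolearned."""
    return [s for s in scores if autolearn_decision(s) is None]


if __name__ == "__main__":
    # A 5.x spam and a 4.x ham both land in the gap; 16.2 and -1.0 do not.
    print(needs_hand_feeding([5.4, 16.2, 4.7, -1.0]))
```

Under those assumed cutoffs, anything left in the gap is only ever learned if a human 
checks it and passes it to sa-learn --spam or sa-learn --ham.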

Perhaps I misunderstand.  If so, I'd appreciate alternate viewpoints and discussion.

-tom

-----Original Message-----
From: Tony Earnshaw [mailto:[EMAIL PROTECTED]
Sent: Tuesday, June 10, 2003 3:10 PM
To: Simon Crowther
Cc: [EMAIL PROTECTED]
Subject: Re: [SAtalk] Removing headers etc.. to feed Bayes correctly


Simon Crowther wrote:

> I wish to start feeding some of these low scoring spams using SA 
> Learn.

Don't. Have patience; trust me.

Tony

-- 
Tony Earnshaw

There's none so daft as them as will not learn

http://j-walk.com/blog/docs/conference.htm
http://www.billy.demon.nl
Mail: [EMAIL PROTECTED]


_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
