I'm kind of confused here. The way I see it (which could very well be a misunderstanding, mind you), the reason it only autolearns spam that scores over 15 points by default is to make darned sure it doesn't learn a false positive. One would then augment its learning by feeding missed spams through sa-learn.

The only reason I can think of NOT to feed low-scoring spams through sa-learn is having decided that a spam scoring 5.x points has no interesting tokens. Quite the opposite is true: that's why we feed it a corpus of known spam in the first place, rather than a corpus of known spam that has been run through spamassassin manually with the under-15 spams weeded out. The same goes for hand-feeding hams that score 4.x points, on the theory that there's a fixed probability that a ham from that source will at some point trigger one more test and get tripped over the threshold.
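
To make the distinction concrete, here is a minimal sketch (Python, not SpamAssassin code) of the split I have in mind. The names `Message` and `triage` are made up for illustration, the 15.0 cutoff is the default mentioned above, and it only models the spam side of autolearning. Anything a human has classified that autolearn would not have picked up goes into a list to be hand-fed to sa-learn.

```python
from dataclasses import dataclass

# Assumption: 15.0 is the autolearn spam cutoff discussed in this thread.
AUTOLEARN_SPAM_THRESHOLD = 15.0


@dataclass
class Message:
    name: str      # hypothetical identifier, e.g. a file name
    score: float   # score the filter assigned to the message
    is_spam: bool  # verdict from a human, not from the filter


def triage(messages, threshold=AUTOLEARN_SPAM_THRESHOLD):
    """Split hand-classified mail into three lists of names:
    spam that autolearn would already have taken, spam to hand-feed,
    and ham to hand-feed."""
    auto_spam, manual_spam, manual_ham = [], [], []
    for m in messages:
        if m.is_spam and m.score >= threshold:
            auto_spam.append(m.name)    # already learned automatically
        elif m.is_spam:
            manual_spam.append(m.name)  # missed or low-scoring spam
        else:
            manual_ham.append(m.name)   # ham, including near-threshold ham
    return auto_spam, manual_spam, manual_ham


if __name__ == "__main__":
    sample = [
        Message("a", 22.3, True),
        Message("b", 5.4, True),
        Message("c", 4.1, False),
    ]
    print(triage(sample))
```

The point of the sketch is that the second and third lists are exactly the mail whose tokens the Bayes database would otherwise never see.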
Perhaps I misunderstand. If so, I'd appreciate alternate viewpoints and discussion.

-tom

-----Original Message-----
From: Tony Earnshaw [mailto:[EMAIL PROTECTED]
Sent: Tuesday, June 10, 2003 3:10 PM
To: Simon Crowther
Cc: [EMAIL PROTECTED]
Subject: Re: [SAtalk] Removing headers etc.. to feed Bayes correctly

Simon Crowther wrote:

> I wish to start feeding some of these low scoring spams using SA
> Learn.

Don't. Have patience; trust me.

Tony

--
Tony Earnshaw

There's none so daft as them as will not learn

http://j-walk.com/blog/docs/conference.htm
http://www.billy.demon.nl

Mail: [EMAIL PROTECTED]

_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk