The biggest problem with -S is the ordering of the rule checks. If all of the negative rules (or at least the _large_ negative rules) were processed first, it would probably be OK. But right now (or at least with 2.20), if you enable it, the whitelisting never gets used, because the score reaches the threshold before the whitelist check runs.
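
To make the ordering problem concrete, here is a minimal sketch in Python. It is not SpamAssassin's actual code: the rule names, scores, threshold, and the function score_with_early_exit are all made up for illustration. It only shows how stopping at the threshold can skip a large negative rule unless the negative rules run first.

```python
# Hypothetical rules as (name, score) pairs, in the order they would be checked.
# These are illustrative values, not SpamAssassin's real rule set or API.
RULES = [
    ("SPAMMY_SUBJECT", 3.0),
    ("SPAMMY_BODY", 2.5),
    ("USER_IN_WHITELIST", -100.0),
]


def score_with_early_exit(rules, hits, threshold=5.0):
    """Add up scores of matching rules in order; stop once the threshold is reached.

    Returns the running total and the list of rule names that were checked.
    """
    total = 0.0
    checked = []
    for name, score in rules:
        checked.append(name)
        if name in hits:
            total += score
        if total >= threshold:
            break  # early termination: later rules are never looked at
    return total, checked


# A message from a whitelisted sender that also trips two spam rules.
hits = {"SPAMMY_SUBJECT", "SPAMMY_BODY", "USER_IN_WHITELIST"}

# Checked in the listed order, the whitelist rule is never reached.
in_order = score_with_early_exit(RULES, hits)

# Checked with the most negative scores first, the whitelist rule applies.
negatives_first = score_with_early_exit(sorted(RULES, key=lambda r: r[1]), hits)

print(in_order)
print(negatives_first)
```

With the listed order the loop stops after the second rule, so the whitelist entry has no effect. Sorting by score is just one simple way to put the large negative rules first.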
I don't know if that has been changed, but that made it pretty useless for me.

-- Nathan

------------------------------------------------------------
Nathan Neulinger                       EMail:  [EMAIL PROTECTED]
University of Missouri - Rolla         Phone: (573) 341-4841
Computing Services                       Fax: (573) 341-4216

> -----Original Message-----
> From: Sidney Markowitz [mailto:[EMAIL PROTECTED]]
> Sent: Thursday, May 02, 2002 1:11 PM
> To: [EMAIL PROTECTED]
> Subject: Re: [SAtalk] AWL verses early-terminate
>
>
> On Thu, 2002-05-02 at 09:16, Charlie Watts wrote:
> > It has just occurred to me that this will adjust the AWL math because
> > I won't be getting "big" positive numbers into the AWL any more.
>
> The fact that the -S option is reasonable points out that the scoring is
> not a linear measure of spamminess. The function P(s) of the probability
> that a message with score s is spam stays near 0 until some small
> positive s, then asymptotically approaches 1 somewhere around where you
> want to set the spam threshold. This means that a message with score 20
> and one with score 70 are both certainly spam and should not contribute
> different weights to the AWL calculation. What we really want is some
> measure of the probability that a message from somewhere is spam based
> on our past experience with messages from the same place. That indicates
> that rather than a linear average of the score we should be averaging
> something that approximates the probability of being spam, i.e., convert
> the score into a "spamminess" level that is 0 below some threshold, 1
> above some threshold, and a few values in between for spam scores that
> are not considered by themselves to be certain spam or non-spam. Of
> course the "1" can be something larger so the whole thing can be scaled
> to integers if that seems more aesthetic.
>
> This gives me another idea: If you consider the AWL as being a way of
> assigning an a priori probability of spamminess to a message based on
> local experience with messages with the same From: header, we can
> generalize that to keep track of experience with messages that are
> similar based on other criteria. Is there a reason not to track any
> other headers, such as the return-path or the first or second received
> header? Would it make sense to have a configurable AWL that tracks
> criteria that are more useful at a local site? A local spam phrase or
> non-spam phrase list?
>
> -- sidney

_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
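
For what it's worth, Sidney's suggestion quoted above (average a clamped "spamminess" level rather than the raw score, keyed on whatever header a site cares about) could be sketched roughly as follows in Python. This is only an illustration under stated assumptions, not how the AWL is implemented: the names spamminess and ClampedAWL, the thresholds 0 and 5, and the linear ramp between them are all hypothetical. The original message only asks for "a few values in between".

```python
def spamminess(score, low=0.0, high=5.0):
    """Map a raw score to [0, 1].

    Returns 0 at or below `low`, 1 at or above `high`, and a linear ramp in
    between. Both thresholds and the ramp shape are assumptions.
    """
    if score <= low:
        return 0.0
    if score >= high:
        return 1.0
    return (score - low) / (high - low)


class ClampedAWL:
    """Running mean of spamminess per key.

    The key can be a From: address, a Return-Path, or any other header value
    a site chooses to track.
    """

    def __init__(self):
        self.totals = {}
        self.counts = {}

    def add(self, key, score):
        # Store the clamped level, not the raw score.
        self.totals[key] = self.totals.get(key, 0.0) + spamminess(score)
        self.counts[key] = self.counts.get(key, 0) + 1

    def mean(self, key):
        # None means no history for this key.
        n = self.counts.get(key, 0)
        return self.totals[key] / n if n else None


awl = ClampedAWL()
awl.add("spammer@example.com", 20)   # certainly spam
awl.add("spammer@example.com", 70)   # also certainly spam, same weight
awl.add("friend@example.com", -3)
awl.add("friend@example.com", 2.5)

print(awl.mean("spammer@example.com"))
print(awl.mean("friend@example.com"))
```

The scores 20 and 70 contribute the same weight here, which is the point of the clamping. Multiplying the levels by a constant would give the integer scaling mentioned in the quoted message.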