The right way is actually to have the AWL form a prediction based on a more sophisticated predictive model, including a zero-frequency estimate for senders who are not in the whitelist. The weight provided by the AWL in cases where there is a-priori data about a sender should depend on the number of messages sent so far: instead of always using the a-priori mean as 50% of the final score, the percentage should shift as more messages from that sender accumulate.

There are well-known ways of doing this optimally, but my reference book on the subject is up in Sausalito, and I'm down here in Menlo Park, 60-odd miles away. I suppose I could go look things up online... If only I hadn't smoked all that pot as a youngster, my memory might be good enough now to do it without reference books :) Or if I were less mathematically lazy, I could probably derive it from stuff I do remember.
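
To make the count-dependent weighting concrete, here is a rough sketch in Python. It is not the optimal method alluded to above, just a plain pseudo-count (shrinkage) blend. The function name and the `prior_strength` parameter are made up for illustration and are not part of SpamAssassin. With no history the weight is 0, so an unknown sender's message simply keeps its own score; a proper zero-frequency estimate would do better than that, and this sketch does not attempt one.

```python
def awl_adjusted_score(message_score, history_mean, history_count,
                       prior_strength=5.0):
    """Blend a message's own score with the sender's historical mean.

    Hypothetical sketch, not SpamAssassin code. The weight given to the
    history grows with the number of messages seen from the sender:

        weight = n / (n + prior_strength)

    so it is 0 for an unknown sender, 0.5 when n == prior_strength
    (the same as a fixed 50% blend), and approaches 1 as n grows.
    """
    if history_count < 0:
        raise ValueError("history_count must be non-negative")
    weight = history_count / (history_count + prior_strength)
    return weight * history_mean + (1.0 - weight) * message_score


if __name__ == "__main__":
    # Illustrative numbers only: a message scoring 33 from a sender
    # whose earlier mail averaged 1.0.
    for n in (0, 1, 5, 50):
        print(n, round(awl_adjusted_score(33.0, 1.0, n), 2))
```

The choice of `prior_strength` controls how quickly the history takes over; 5 is an arbitrary assumption here and would need tuning against real mail.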
C

Daniel Quinlan wrote:

DQ> Theo Van Dinter <[EMAIL PROTECTED]> writes:
DQ>
DQ> > Well, SA does that by (default) adding a -100 points to the message score.
DQ> > So this spam, listed as from "concord.net" in the header gets -100,
DQ> > then the actual spam scores brought it up to -67.
DQ>
DQ> Yuck.  How about moderating the whitelist modification by the
DQ> pre-whitelist score?  For example, divide the AWL by (score/5).
DQ>
DQ> So, since this message had an AWL of -100 and a pre-AWL score of 33.
DQ>
DQ>   awl = -100/(33/5)
DQ>   awl = -15
DQ>
DQ>   final = 33 - 15
DQ>   final = 18
DQ>
DQ> The right way would probably be to search for AWL failures (really
DQ> spammy mail gets through because of AWL) and determine a formula to
DQ> eliminate those without any additional false positives.

_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk