John Ackermann N8UR wrote:
> Hi --
>
> I reinstalled SA 2.20 after running and then having to disable an earlier
> version some time ago. The new version doesn't seem to be catching as many
> spams as the old one, and as I was perusing the report on those it did
> find, I was surprised at some of the negative scores I saw assigned to some
> tests.
>
> Grepping through 50_scores.cf, I found:
>
> score ALL_CAPS_SUBJECT -0.274
> score BE_AMAZED -0.260
> score COPYRIGHT_CLAIMED -1.568
> score DEAR_SOMEBODY -0.468
> score EXCUSE_6 -0.110
> score GAPPY_TEXT -1.237
> score HTML_WITH_BGCOLOR -0.546
> score IN_REP_TO -4.431
> score JAVASCRIPT_URI -1.607
> score LINES_OF_YELLING_3 -1.518
> score MAILTO_WITH_SUBJ -0.310
> score NO_EXPERIENCE -1.063
> score NO_QS_ASKED -0.773
> score OPPORTUNITY -1.010
> score PGP_SIGNATURE -2.095
> score PORN_8 -4.248
> score RATWARE -0.703
> score REAL_THING -0.148
>
> and it seems odd to me that some of these should lower the score. Why
> would "NO_QS_ASKED", "NO_EXPERIENCE", or "HTML_WITH_BGCOLOR" subtract from
> the total score? I can see that a small score might be appropriate, but
> I'd think that a negative score would be a sign that the message was
> legitimate.

The scores are generated by running a genetic algorithm over a corpus of known spam and non-spam. A negative score usually means that the rule matched more non-spam than spam in that corpus. Often this means it's a bad rule, though sometimes the rule is genuinely there to detect non-spam.

Matt.

_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
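
To make the reply's point about negative scores concrete, here is a rough sketch of why a rule's sign tends to follow its hit ratio. This is not the genetic algorithm SpamAssassin uses; it swaps in a simple smoothed log-odds weight purely for illustration. The rule names, hit counts, corpus sizes, and the `naive_rule_weight` helper are all hypothetical.

```python
import math


def naive_rule_weight(spam_hits, ham_hits, n_spam, n_ham, smoothing=1.0):
    """Smoothed log-odds of a rule firing on spam versus non-spam.

    Illustration only: a stand-in for a real score optimizer, not
    SpamAssassin's genetic algorithm.
    """
    p_spam = (spam_hits + smoothing) / (n_spam + 2 * smoothing)
    p_ham = (ham_hits + smoothing) / (n_ham + 2 * smoothing)
    return math.log(p_spam / p_ham)


# Hypothetical corpus sizes and per-rule hit counts (spam_hits, ham_hits).
N_SPAM = 1000
N_HAM = 1000
hits = {
    "RULE_MOSTLY_SPAM": (300, 10),   # fires far more often on spam
    "RULE_MOSTLY_HAM": (20, 200),    # fires far more often on non-spam
    "RULE_EVEN": (50, 50),           # carries no information either way
}

weights = {
    name: naive_rule_weight(spam, ham, N_SPAM, N_HAM)
    for name, (spam, ham) in hits.items()
}

for name, weight in weights.items():
    print(f"{name:18s} {weight:+.3f}")
```

Under these made-up counts, the rule that fires mostly on non-spam comes out negative even if it was written with spam in mind, which is the situation the reply describes as "a bad rule".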