In the corpus, LINE_OF_YELLING appears almost 9000 times in spam, and about 1300 times in nonspam. So I'm guessing that when it shows up in nonspam, there are other telltales that the message isn't really spam, and those rules have been assigned negative scores by the GA. There are only 562 false positives from the whole corpus, so at least 800 or so LINE_OF_YELLING nonspam messages made it through OK. This does raise an interesting issue though, which is to maybe take a look at the false positives and false negatives after a GA run, and see which rules are most commonly triggered in each set -- maybe 500 of those 562 false positives are triggering on LINE_OF_YELLING or something (though then I would imagine the GA would get a little smarter about scoring it so high). I'll take a look at that though.
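For what it's worth, here is a rough sketch in Python of the kind of tally I mean. It is only an illustration, not anything in the SA tree: the function name, the (is_spam, score, rules) tuple layout, and the 5.0 cutoff are all assumptions made for the example.

```python
from collections import Counter

def rule_hits_in_errors(messages, threshold=5.0):
    """Count which rules fire in false positives and in false negatives.

    messages: iterable of (is_spam, score, rules) tuples, where is_spam is
    the hand-assigned label, score is the total score the message got, and
    rules is the list of rule names that triggered. This layout and the
    default threshold are assumptions for the sketch.
    """
    fp_rules = Counter()  # rules seen in nonspam that was flagged as spam
    fn_rules = Counter()  # rules seen in spam that slipped through
    for is_spam, score, rules in messages:
        flagged = score >= threshold
        if flagged and not is_spam:
            fp_rules.update(set(rules))  # count each rule once per message
        elif is_spam and not flagged:
            fn_rules.update(set(rules))
    return fp_rules, fn_rules
```

Calling `fp_rules.most_common(10)` on the first result would then list the rules that show up most often among the false positives, which is where a rule like LINE_OF_YELLING would stand out if it were doing most of the damage.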
C

Daniel Rogers wrote:
> Date: Wed, 27 Feb 2002 11:35:03 -0800
> From: Daniel Rogers <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Subject: [SAtalk] LINE_OF_YELLING
>
> LINE_OF_YELLING seems to have jumped from a score of 0.70 in SA 2.01 to a
> score of 5.442 in SA 2.1. This strikes me as rather a lot. Aren't there
> still people who still write their messages all in caps because they don't
> know any better?
>
> Also, any mail that uses a line of all caps as a title (such as NTK) would
> get immediately marked as spam.
>
> Dan.
>
> _______________________________________________
> Spamassassin-talk mailing list
> [EMAIL PROTECTED]
> https://lists.sourceforge.net/lists/listinfo/spamassassin-talk