My test set is 1789 messages, about 25% of them spam. Only a few of the tests below look usable, but I figure it's good to share some of the failures too.
1. Subject =~ /^\s*Re:/i lacking "In-Reply-To" or "References" header
   (idea for test from http://linuxconf.unixtech.be/configurations/mutt/mutt.color.index.html)

   RESULT: 60 matches, only 15% were spam (terrible, no rule)

   What if we exempt the two most prevalent guilty mailers
   ("Internet Mail Service" and "Lotus Notes")?

   RESULT: 25 matches, 9 were spam (36% spam, not a great rule)

2. Message-Id tests
   (idea for test from http://linuxconf.unixtech.be/configurations/mutt/mutt.color.index.html)

   As the author notes, it might be good to also check the RFC.

   Key:
     TEST   = the rule
     MATCH  = number matched out of 1789 messages
     MSG    = number of messages already flagged by current SA Message-Id tests
     BAD    = number of spam in MATCH
     GOOD   = number of non-spam in MATCH
     RESULT = my assessment of the test

   (sorted by RESULT)

   TEST                  MATCH  MSG  BAD  GOOD  RESULT
   =~ /[{:%#|/]/            23    2   22     1  great test
   =~ /[.]>/                 2    0    2     0  good test
   !~ /@.*[.]/             163   11   87    76  so-so test
   =~ /@>/                   8    8    8     0  duplicates existing
   =~ /<.*</                 1    1    1     0  duplicates existing
   !~ /@/                    2    2    2     0  duplicates existing
   !~ /</                   12   12   12     0  duplicates existing
   =~ /<>/                   0    0    0     0  bad test
   =~ /<.* .*>/             29   29    3    26  bad test
   =~ /localhost/           65    0    1    64  bad test
   =~ /localdomain/         78    0    0    78  bad test
   =~ /[.][a-z]>/            0    0    0     0  bad test
   =~ /[.][a-z]{4,}>/       76    0    2    74  bad test

   I further revised the first test to be simply:

     Message-Id =~ /[#/,:]/

   which removed the false positive. The '%' is used by a large
   corporation. I never saw '|' or '{', so I removed them for now.

3. Lots of "X-" headers
   (idea for test from http://silenroc.com/angel/filter2.html)

     3 or more: 593 matched, 142 spam
     4 or more: 370 matched,  55 spam
     5 or more: 194 matched,  18 spam

   Hmmm... this does not seem to be working. What about the reverse?

     none:        341 matched, 202 spam
     1 or fewer:  900 matched, 258 spam

   Weird. The no "X-" header test is not too bad. Worth trying, I think.
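For anyone who wants to repeat this kind of tally on their own mail, here is a rough Python sketch of how the MATCH/BAD/GOOD counts and the "X-" header counts could be computed. It is not the script used for the numbers above. The names evaluate_rule, count_x_headers and SAMPLE_CORPUS are made up for illustration, and the four sample messages are invented, not taken from the 1789-message set.

```python
import re
from email import message_from_string

# Hypothetical toy corpus: (raw message text, is_spam). Invented for
# illustration only; not drawn from the 1789-message test set.
SAMPLE_CORPUS = [
    ("Message-Id: <12:34#x@mail.example>\nX-Mailer: foo\n\nbuy now", True),
    ("Message-Id: <abc@mail.example.com>\nX-Mailer: a\nX-Priority: 3\n"
     "X-Foo: 1\n\nhi", False),
    ("Message-Id: <a/b@host.example>\n\nspam body", True),
    ("Message-Id: <ok.1@lists.example.org>\nSubject: Re: test\n\nhello", False),
]


def evaluate_rule(messages, header, pattern, negate=False):
    """Return (match, bad, good) for a regex applied to one header.

    negate=True mimics a '!~' test. A missing header is treated as an
    empty string, so it counts as a hit for a negated rule.
    """
    rx = re.compile(pattern)
    bad = good = 0
    for raw, is_spam in messages:
        value = message_from_string(raw).get(header, "")
        hit = bool(rx.search(value))
        if negate:
            hit = not hit
        if hit:
            if is_spam:
                bad += 1
            else:
                good += 1
    return bad + good, bad, good


def count_x_headers(raw):
    """Count header fields whose name starts with 'X-' (case-insensitive)."""
    msg = message_from_string(raw)
    return sum(1 for name in msg.keys() if name.lower().startswith("x-"))


if __name__ == "__main__":
    match, bad, good = evaluate_rule(SAMPLE_CORPUS, "Message-Id", r"[#/,:]")
    print(f"MATCH={match} BAD={bad} GOOD={good}")
    for raw, is_spam in SAMPLE_CORPUS:
        print(count_x_headers(raw), "X- headers, spam" if is_spam else "X- headers, ham")
```

Running each candidate pattern through something like evaluate_rule keeps the BAD and GOOD columns honest, since BAD + GOOD always equals MATCH by construction.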
I now believe I actually misinterpreted the web page: it was supposed to be "X-x", but we already have a test for that and it seems to work okay. Still, the no "X-" header test might be worth trying.

Dan