Well, the bugs in SA 2.0 just got to be too annoying, so I've upgraded one box to SA 2.1 -- never mind what people say about the new scores in 2.1. However, I decided to do a little experiment to see how the new scores stack up against my own personal spam corpus. It isn't much: on the one hand, I have 59 spam messages caught by SA on this system since I started using SA about a month ago; on the other, I have ~200 spam messages that I have saved up since 1999. The last 27 of those are the most interesting, since they are the false negatives that have leaked through SA in the last month. (Hmmm: 59 caught out of 86 means SA only catches about 2/3 of my spam, which isn't so hot. Hope 2.1 can do better!)
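
The experiment itself is simple: feed each saved message to SA and flag anything scoring under the default threshold of 5.0. Here's a minimal sketch of such a pass -- not my actual harness; it assumes one message per file under corpus/, and the "X-Spam-Status: ... hits=N.N ..." header (and the -t flag) may differ between SA versions:

    # Feed each saved message to spamassassin in test mode and report
    # any that score under the default threshold of 5.0 (false negatives).
    for msg in corpus/*; do
        hits=$(spamassassin -t < "$msg" |
               sed -n 's/^X-Spam-Status:.*hits=\([-0-9.]*\).*/\1/p' |
               head -1)
        echo "${hits:-0} $msg"    # no X-Spam-Status header counts as a miss
    done | awk '$1 < 5.0 { print $2 ": scored " $1 }'
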
First, I ran the 59 existing true positives through SA 2.1, looking for any that scored < 5. There were four: three slipped by because of DEAR_SOMEBODY scoring -4.4, and the fourth because of CASHCASHCASH scoring -3.7. I changed both of those scores to 1.0, and now SA 2.1 flags all of the spam flagged by SA 2.0.

Next, I ran my mini-archive of 207 historical spam messages through SA 2.1, again looking for false negatives (score < 5). There were 32. I looked at the report for each of them; several were obviously scored artificially low because of the bizarre negative scores SA 2.1 assigns to many obvious spam-markers. I adjusted a bunch of those scores and reran the 32 false negatives; now there are only 17, which I can live with. (Especially since most of them are pretty tricky to detect automatically: e.g. headhunters who scoured CPAN for addresses, one-line-URL-only spam, that sort of thing.)

Here are my corrected scores, in no particular order. They were derived using a highly sophisticated natural intelligence algorithm, namely gut instinct:

    score DEAR_SOMEBODY            1.0   # was -4.4
    score CASHCASHCASH             1.0   # was -3.7
    score TO_BE_REMOVED_REPLY      2.0   # was -2.0
    score MAILTO_WITH_SUBJ_REMOVE  2.5   # was -3.7
    score MAILTO_TO_REMOVE         2.5   # was -1.1
    score MSGID_HAS_NO_AT          1.0   # was -0.7
    score MAILTO_LINK              1.0   # was -0.5
    score EXCUSE_10                1.0   # was -1.4
    score COPYRIGHT_CLAIMED        0.0   # was -3.1
    score TO_UNSUB_REPLY           2.0   # was -2.3
    score SUSPICIOUS_RECIPS        1.5   # was  0.9
    score VERY_SUSP_RECIPS         2.5   # was  0.9
    score LARGE_HEX                1.5   # was -5.9

Executive summary: I think that unconstraining the GA was an interesting idea, but not 100% successful. Perhaps the GA needs hints as to which tests are spam markers and which are not, so it can keep spam-marker scores out of negative territory. Maybe it should also warn about spam-marker tests that want to go negative: that would indicate something wrong with either the test or the corpus.

Greg
--
Greg Ward - software developer                        [EMAIL PROTECTED]
MEMS Exchange                              http://www.mems-exchange.org