Well, the bugs in SA 2.0 just got to be too annoying, so I've upgraded
one box to SA 2.1 -- never mind what people say about the new scores in
2.1.  However, I decided to do a little experiment to see how the new
scores stack up against my own personal spam corpus.  It isn't much; on
the one hand I have 59 spam messages caught by SA on this system since I
started using SA about a month ago; and on the other hand I have ~200
spam messages that I have saved up since 1999.  The last 27 of these are
most interesting, as they are the false negatives that have leaked
through SA in the last month.  (Hmmm: 59 caught plus 27 missed means
SA only catches 59/86 -- about 69% -- of my spam, which isn't so hot.
Hope 2.1 can do better!)

First, I ran the 59 existing true positives through SA 2.1, looking for
any that scored < 5.  There were four; three of them slipped by because
of DEAR_SOMEBODY scoring -4.4, and the fourth because of CASHCASHCASH
scoring -3.7.  I changed both of these scores to 1.0, and now SA 2.1
flags all of the spam flagged by SA 2.0.
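
(In case anyone wants to repeat the check on their own corpus: here's
a rough sketch of a harness in Python -- not my exact setup.  It
assumes one message per file under corpus/spam/ (that path is made
up), and pulls the hits=N.N figure out of the X-Spam-Status header
that "spamassassin -t" adds.)

  #!/usr/bin/env python
  # Run each saved message through "spamassassin -t" and print the
  # ones that come out below the spam threshold (false negatives).
  import glob
  import re
  import subprocess

  THRESHOLD = 5.0

  for path in sorted(glob.glob("corpus/spam/*")):
      with open(path, "rb") as msg:
          out = subprocess.run(["spamassassin", "-t"], stdin=msg,
                               stdout=subprocess.PIPE).stdout.decode("latin-1")
      m = re.search(r"hits=(-?\d+(?:\.\d+)?)", out)
      if m and float(m.group(1)) < THRESHOLD:
          print("%-40s hits=%s" % (path, m.group(1)))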

Next, I ran my mini-archive of 207 historical spam messages through SA
2.1, again looking for false negatives (score < 5).  There were 32.  I
looked at the report for each of these 32 messages; several were
clearly scored artificially low because of the bizarre negative scores
that SA 2.1 assigns to many obvious spam markers.  I adjusted a bunch of these
scores accordingly and reran the 32 false negatives; now there are only
17 false negs, which I can live with.  (Especially since most of them
are pretty tricky to detect automatically: e.g. headhunters who scoured
CPAN for addresses, one-line-URL-only spam, that type of thing.)
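
(Rather than eyeballing all 32 reports, you can also tally which tests
fire across the false negatives and let the worst offenders float to
the top.  A quick sketch, assuming you've saved the X-Spam-Status line
from each false negative into falseneg-status.txt -- that filename is
just an example:)

  # Count how often each test fires across the false negatives, to
  # spot the negative-scoring rules dragging spam under the threshold.
  # SA 2.x reports the rules that fired in a "tests=..." list.
  import re

  counts = {}
  for line in open("falseneg-status.txt"):
      m = re.search(r"tests=([A-Z0-9_,]+)", line)
      if not m:
          continue
      for test in m.group(1).split(","):
          counts[test] = counts.get(test, 0) + 1

  for test, n in sorted(counts.items(), key=lambda kv: -kv[1]):
      print("%-28s %3d" % (test, n))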

Here are my corrected scores, in no particular order.  These scores were
derived using a highly sophisticated natural intelligence algorithm,
namely gut instinct:

  score DEAR_SOMEBODY           1.0  # was -4.4
  score CASHCASHCASH            1.0  # was -3.7
  score MAILTO_WITH_SUBJ_REMOVE 2.5  # was -3.7
  score MAILTO_TO_REMOVE        2.5  # was -1.1
  score MSGID_HAS_NO_AT         1.0  # was -0.7
  score MAILTO_LINK             1.0  # was -0.5
  score EXCUSE_10               1.0  # was -1.4
  score COPYRIGHT_CLAIMED       0.0  # was -3.1
  score TO_BE_REMOVED_REPLY     2.0  # was -2.0
  score TO_UNSUB_REPLY          2.0  # was -2.3
  score SUSPICIOUS_RECIPS       1.5  # was  0.9
  score VERY_SUSP_RECIPS        2.5  # was  0.9
  score LARGE_HEX               1.5  # was -5.9
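
(If you want to try these yourself: score lines like the above can go
straight into ~/.spamassassin/user_prefs, or the site-wide local.cf,
where they override the scores shipped with SA -- at least, that's how
my 2.x setup behaves.)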

Executive summary: I think that unconstraining the GA was an interesting
idea, but not 100% successful.  Perhaps the GA needs hints as to which
tests are spam markers and which are not, so it can keep spam markers
out of negative territory.  Maybe it should warn about spam-marker tests
that want to be negative: that indicates that there's something wrong
either with the test or with the corpus.
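
That warning could be a trivial post-run check over the evolved
scores.  A sketch of the idea -- the SPAM_MARKERS set is hypothetical,
since SA rules don't carry a polarity tag today, which is exactly the
hint the GA is missing:

  # Flag any known spam marker that the GA pushed negative.  Expects
  # a scores file in "score NAME n.n" format, e.g. rules/50_scores.cf.
  import re
  import sys

  SPAM_MARKERS = {"DEAR_SOMEBODY", "CASHCASHCASH", "LARGE_HEX",
                  "TO_BE_REMOVED_REPLY"}   # illustrative subset

  for line in open(sys.argv[1]):
      m = re.match(r"\s*score\s+(\S+)\s+(-?\d+(?:\.\d+)?)", line)
      if m and m.group(1) in SPAM_MARKERS and float(m.group(2)) < 0:
          print("WARNING: spam marker %s scored %s"
                % (m.group(1), m.group(2)))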

        Greg
-- 
Greg Ward - software developer                [EMAIL PROTECTED]
MEMS Exchange                            http://www.mems-exchange.org
