Hi Jeremy,

Interesting article.  I think you're wrong that this is the last stand for
spam filters (you should read up on boosting as a method of chaining
multiple filters), but the Bayes-attack tool is an interesting approach.
I plan to blog about it on http://taint.org/ when I get a chance.

In the meantime, you might like to see what SpamAssassin 2.50 made of it
with my personal training set; a square-ish 0.49 with chi2.
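
For anyone unfamiliar with chi2 combining, here is a minimal Python sketch of
the Fisher-style scheme popularised by Gary Robinson and SpamBayes.  It is an
assumption that SpamAssassin 2.50 computes exactly this, and the names
chi2q() and chi2_combine() are made up for illustration.

```python
import math

def chi2q(x2, dof):
    """Upper-tail probability of a chi-squared value (even dof only)."""
    m = x2 / 2.0
    term = total = math.exp(-m)
    for i in range(1, dof // 2):
        term *= m / i
        total += term
    # Guard against rounding pushing the sum slightly above 1.
    return min(total, 1.0)

def chi2_combine(probs):
    """Combine per-token spam probabilities (each strictly between 0 and 1)."""
    n = len(probs)
    if n == 0:
        return 0.5  # no evidence either way
    ln_ham = sum(math.log(p) for p in probs)
    ln_spam = sum(math.log(1.0 - p) for p in probs)
    spamminess = 1.0 - chi2q(-2.0 * ln_spam, 2 * n)
    hamminess = 1.0 - chi2q(-2.0 * ln_ham, 2 * n)
    # Strong evidence on both sides cancels out and lands near 0.5.
    return (spamminess - hamminess + 1.0) / 2.0
```

The point for this kind of attack is the last line: a message stuffed with
both very hammy and very spammy tokens ends up near the middle, not at
either extreme.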

(I had to make a few mods to your message; namely, I cut and pasted the
headers of another 419 spam, and changed the From addr to match the name,
in traditional 419 spam style.  SpamAssassin doesn't filter messages
without headers.)

The bayes tokens used in the calculation can be seen in the debug
lines like this:

  debug: bayes token 'captain' => 0.00453686200378072
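
If you want to pull those tokens out of the debug output yourself, something
like the following Python sketch should do.  It assumes the line format is
exactly as shown above; the regexp and the function name are mine, not
anything SpamAssassin ships.

```python
import re

# Greedy match on the token so embedded apostrophes (e.g. d'Ivoire) survive.
TOKEN_RE = re.compile(r"^debug: bayes token '(.*)' => ([0-9.eE+-]+)\s*$")

def parse_bayes_tokens(lines):
    """Return (token, probability) pairs, strongest evidence first."""
    tokens = []
    for line in lines:
        m = TOKEN_RE.match(line)
        if m:
            tokens.append((m.group(1), float(m.group(2))))
    # Sort by distance from the neutral 0.5, largest first.
    tokens.sort(key=lambda t: abs(t[1] - 0.5), reverse=True)
    return tokens
```

Tokens with a probability close to 0 count as hammy and those close to 1 as
spammy; anything else in the debug output is ignored.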

Based on Bayes alone, this would have been an 'unsure'; with the forged
set of headers, SpamAssassin would have caught it with its own rules.  (We
have a good set of rules that catch most 419s now, since they mostly seem
to use one particular spamware tool.)
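
To make that arithmetic explicit, here is a small Python sketch that adds up
the per-rule points from the report included further down in this message.
The figures are the rounded ones the report prints, and the variable names
are just for illustration.

```python
# Rule hits and points as printed in the "Content analysis details" report.
hits = {
    "FROM_ENDS_IN_NUMS": 0.6,
    "RATWARE_OE_MALFORMED": 2.9,
    "US_DOLLARS_3": 1.1,
    "BAYES_44": 0.0,
    "DATE_IN_FUTURE_96_XX": 1.6,
    "SEMIFORGED_HOTMAIL_RCVD": 2.1,
    "FORGED_MUA_OUTLOOK": 3.9,
}
REQUIRED = 5.0  # the "required" threshold shown in the report

total = round(sum(hits.values()), 1)
bayes_points = round(sum(s for name, s in hits.items()
                         if name.startswith("BAYES_")), 1)
is_spam = total >= REQUIRED

print(f"total={total} bayes={bayes_points} spam={is_spam}")
```

The Bayes rule contributes nothing here; every point comes from the header
and body rules.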

BTW maybe this spammer -- http://taint.org/2003/02/07/141629a.html -- has
already read your article?  Who'd have thought 'concupiscent' would get a
high spamprob for me ;)

--j.

Delivered-To: [EMAIL PROTECTED]
Received: from localhost (jalapeno [127.0.0.1])
        by jmason.org (Postfix) with ESMTP id 002B016F16
        for <jm@localhost>; Fri,  7 Feb 2003 15:58:47 +0000 (GMT)
Received: from jalapeno [127.0.0.1]
        by localhost with IMAP (fetchmail-5.9.0)
        for jm@localhost (single-drop); Fri, 07 Feb 2003 15:58:47 +0000 (GMT)
Received: from mail.hivelocity.net (mail.hivelocity.net [65.59.189.58]) by
    dogma.slashnull.org (8.11.6/8.11.6) with SMTP id h0E4ERv29190 for
    <[EMAIL PROTECTED]>; Tue, 14 Jan 2003 04:14:31 GMT
From: "Mrs. SANDRA MANI" <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
X-Mailer: Microsoft Outlook Express 5.00.2919.6900 DM
MIME-Version: 1.0
Message-Id: <[EMAIL PROTECTED]>
Content-Type: text/plain; charset="us-ascii"
Subject: CONFIDENTIAL
Date: Fri, 7 Feb 2003 15:23:30 +0000

I mean to be writing you this sensitive message believing that you won't
violate the trust I'm to about impose on you.

By way of introduction, I am Sandra Mani the wife of Hanis Mani Mani previous
chief of defense-staff of (Republic of Guinea Bissau).

I heard about you through a good businessman who told me I can freely deal with
you that you are truthful and also your good ability of dealing with this. He
made me to surmise that you must be such an erudite businessman of ingenuity
and much compassion.

My loving husband was killed not long ago in an attack last December for his
role as a brave subversive rebel captain against the previous evil totalitarian
government of guinea Bissau. His sad absence has affected me more then I can
express.

Subsequent to this political crisis, I was forced to flee to the good land of
Cote d'Ivoire for my very life.

In Abidjan he stored ONE METALLIC CRATE in a safe storage location. He marked
it as an African Artworks as belonging to his American friend who would come
with the keys for the claim of the consignment. He did not disclose to the
storage people the real contents of the crate.

The crate contain almost $ 18,000.000.00. To be truthful with you, this is the
only legacy that my husband left for me. I have the proof of ownership and
other requisite proofs for that deposit, but I am not an American, which is
what they are expecting.

I'd like you to behave as the true claiment of the container and claim it for
wiring to your Checking for use in your hometown. I've decided to render to you
the contribution of 5% of the final quantity and 2% for other mecellaneous
expenditures you may cause while you do that. Should you elect to help me, I'll
tell you the procedure we should use to make certain the liberation of the
cache isn't difficult. It should just be a matter of meeting the formalities.

Hopefully.

Sandra Mani

NB: see that you call me when you get this message for more briefing.

debug: Score set 0 chosen.
debug: using "/home/jm/.spamassassin" for user state dir
debug: bayes: tie-ing to DB file R/O /home/jm/.spamassassin/bayes_toks
debug: bayes: tie-ing to DB file R/O /home/jm/.spamassassin/bayes_seen
debug: Score set 2 chosen.
debug: using "./rules" for default rules dir
debug: using "/etc/mail/spamassassin" for site rules dir
debug: using "/home/jm/.spamassassin" for user state dir
debug: using "/home/jm/.spamassassin/user_prefs" for user prefs file
debug: Initialising learner
debug: running header regexp tests; score so far=0
debug: running body-text per-line regexp tests; score so far=3.5
debug: bayes corpus size: nspam = 3941, nham = 19145
debug: tokenize: header tokens for *F = ""Mrs. SANDRA MANI" <[EMAIL PROTECTED]>"
debug: tokenize: header tokens for To = "[EMAIL PROTECTED]"
debug: tokenize: header tokens for *x = "Microsoft Outlook Express 5.00.2919.6900 DM"
debug: tokenize: header tokens for MIME-Version = ""
debug: tokenize: header tokens for *m = " 200301140414 h0E4ERv29190 dogma slashnull org "
debug: tokenize: header tokens for *c = "/plain; charset="us-ascii""
debug: tokenize: header tokens for *r = "  mail.hivelocity.net (mail.hivelocity.net [65.59.189]) by dogma.slashnull.org (8.11.6/8.11.6)         <[EMAIL PROTECTED]>; "
debug: tokenize: header tokens for *r = "  mail.hivelocity.net (mail.hivelocity.net [65.59.189]) by dogma.slashnull.org (8.11.6/8.11.6)         <[EMAIL PROTECTED]>;    jalapeno [127.0.0] by localhost   IMAP (fetchmail-5.9.0)   jm@localhost (single-drop); "
debug: bayes token 'H*x:5.00.2919.6900' => 0.999
debug: bayes token 'behave' => 0.00222428174235403
debug: bayes token 'captain' => 0.00453686200378072
debug: bayes token 'totalitarian' => 0.00594059405940594
debug: bayes token 'Abidjan' => 0.993013100436681
debug: bayes token 'hometown' => 0.0094488188976378
debug: bayes token 'd'Ivoire' => 0.987596899224806
debug: bayes token 'H*F:Mrs' => 0.987596899224806
debug: bayes token 'formalities' => 0.987596899224806
debug: bayes token 'liberation' => 0.0155844155844156
debug: bayes token 'requisite' => 0.0186046511627907
debug: bayes token 'crate' => 0.0230769230769231
debug: bayes token 'evil' => 0.0289795078191229
debug: bayes token 'Bissau' => 0.97037037037037
debug: bayes token 'Cote' => 0.964149922197416
debug: bayes token 'CONFIDENTIAL' => 0.959005630527377
debug: bayes token 'surmise' => 0.0444444444444444
debug: bayes token 'consignment' => 0.945057374972883
debug: bayes token 'ONE' => 0.943633128358777
debug: bayes token 'i'd' => 0.0595303472728287
debug: bayes: score = 0.496731078390794
debug: using "/home/jm/.spamassassin" for user state dir
debug: bayes: untie-ing
debug: bayes: untie-ing db_toks
debug: bayes: untie-ing db_seen
debug: running raw-body-text per-line regexp tests; score so far=4.6
debug: running uri tests; score so far=4.6
debug: uri tests: Done uriRE
debug: running full-text regexp tests; score so far=4.6
debug: local tests only, ignoring Pyzor
debug: all '*To' addrs: [EMAIL PROTECTED] [EMAIL PROTECTED]
debug: all '*From' addrs: [EMAIL PROTECTED]
debug: is DNS available? 0
debug: forged_rcvd_trail: entry 0: by=jmason.org from=(undef) mismatches=0
debug: forged_rcvd_trail: entry 1: by=(undef) from=(undef) mismatches=0
debug: running meta tests; score so far=8.3
debug: auto-learn? safety=+/-4, body-hits=1.1, head-hits=7.2
debug: auto-learn: recomputing score based on scoreset 0
debug: Score set 0 chosen.
debug: auto-learn: original score: 12.2, recomputed score: 12.174
debug: Score set 2 chosen.
debug: auto-learn? no: inside auto-learn thresholds or safety zone around required_hits
debug: is spam? score=12.2 required=5 tests=BAYES_44,DATE_IN_FUTURE_96_XX,FORGED_MUA_OUTLOOK,FROM_ENDS_IN_NUMS,RATWARE_OE_MALFORMED,SEMIFORGED_HOTMAIL_RCVD,US_DOLLARS_3
Received: from localhost [127.0.0.1] by jalapeno
        with SpamAssassin (2.50-cvs 1.167-2003-02-03-exp);
        Fri, 07 Feb 2003 19:24:12 +0000
From: "Mrs. SANDRA MANI" <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Subject: CONFIDENTIAL
Date: Fri, 7 Feb 2003 15:23:30 +0000
Message-Id: <[EMAIL PROTECTED]>
X-Spam-Flag: YES
X-Spam-Status: Yes, hits=12.2 required=5.0
        tests=BAYES_44,DATE_IN_FUTURE_96_XX,FORGED_MUA_OUTLOOK,
              FROM_ENDS_IN_NUMS,RATWARE_OE_MALFORMED,
              SEMIFORGED_HOTMAIL_RCVD,US_DOLLARS_3
        version=2.50-cvs
X-Spam-Level: ************
X-Spam-Checker-Version: SpamAssassin 2.50-cvs 1.167-2003-02-03-exp
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="----------=_3E4407DC.C8909ED8"

This is a multi-part message in MIME format.

------------=_3E4407DC.C8909ED8
Content-Type: text/plain
Content-Disposition: inline
Content-Transfer-Encoding: 8bit

This mail is probably spam.  The original message has been attached
along with this report, so you can recognize or block similar unwanted
mail in future.  See http://spamassassin.org/tag/ for more details.

Content preview:  I mean to be writing you this sensitive message
  believing that you won't violate the trust I'm to about impose on you.
  By way of introduction, I am Sandra Mani the wife of Hanis Mani Mani
  previous chief of defense-staff of (Republic of Guinea Bissau). [...] 

Content analysis details:   (12.20 points, 5 required)
FROM_ENDS_IN_NUMS  (0.6 points)  From: ends in numbers
RATWARE_OE_MALFORMED (2.9 points)  X-Mailer contains malformed Outlook Express version
US_DOLLARS_3       (1.1 points)  BODY: Nigerian scam key phrase ($NN,NNN,NNN.NN)
BAYES_44           (0.0 points)  BODY: Bayesian classifier says spam probability is 44 to 50%
                   [score: 0.4967]
DATE_IN_FUTURE_96_XX (1.6 points)  Date: is 96 hours or more after Received: date
SEMIFORGED_HOTMAIL_RCVD (2.1 points)  hotmail.com 'From' address, but no 'Received:'
FORGED_MUA_OUTLOOK (3.9 points)  Forged mail pretending to be from MS Outlook



------------=_3E4407DC.C8909ED8
Content-Type: message/rfc822
Content-Description: original message before SpamAssassin
Content-Disposition: inline
Content-Transfer-Encoding: 8bit

Delivered-To: [EMAIL PROTECTED]
Received: from localhost (jalapeno [127.0.0.1])
        by jmason.org (Postfix) with ESMTP id 002B016F16
        for <jm@localhost>; Fri,  7 Feb 2003 15:58:47 +0000 (GMT)
Received: from jalapeno [127.0.0.1]
        by localhost with IMAP (fetchmail-5.9.0)
        for jm@localhost (single-drop); Fri, 07 Feb 2003 15:58:47 +0000 (GMT)
Received: from mail.hivelocity.net (mail.hivelocity.net [65.59.189.58]) by
    dogma.slashnull.org (8.11.6/8.11.6) with SMTP id h0E4ERv29190 for
    <[EMAIL PROTECTED]>; Tue, 14 Jan 2003 04:14:31 GMT
From: "Mrs. SANDRA MANI" <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
X-Mailer: Microsoft Outlook Express 5.00.2919.6900 DM
MIME-Version: 1.0
Message-Id: <[EMAIL PROTECTED]>
Content-Type: text/plain; charset="us-ascii"
Subject: CONFIDENTIAL
Date: Fri, 7 Feb 2003 15:23:30 +0000

I mean to be writing you this sensitive message believing that you won't
violate the trust I'm to about impose on you.

By way of introduction, I am Sandra Mani the wife of Hanis Mani Mani previous
chief of defense-staff of (Republic of Guinea Bissau).

I heard about you through a good businessman who told me I can freely deal with
you that you are truthful and also your good ability of dealing with this. He
made me to surmise that you must be such an erudite businessman of ingenuity
and much compassion.

My loving husband was killed not long ago in an attack last December for his
role as a brave subversive rebel captain against the previous evil totalitarian
government of guinea Bissau. His sad absence has affected me more then I can
express.

Subsequent to this political crisis, I was forced to flee to the good land of
Cote d'Ivoire for my very life.

In Abidjan he stored ONE METALLIC CRATE in a safe storage location. He marked
it as an African Artworks as belonging to his American friend who would come
with the keys for the claim of the consignment. He did not disclose to the
storage people the real contents of the crate.

The crate contain almost $ 18,000.000.00. To be truthful with you, this is the
only legacy that my husband left for me. I have the proof of ownership and
other requisite proofs for that deposit, but I am not an American, which is
what they are expecting.

I'd like you to behave as the true claiment of the container and claim it for
wiring to your Checking for use in your hometown. I've decided to render to you
the contribution of 5% of the final quantity and 2% for other mecellaneous
expenditures you may cause while you do that. Should you elect to help me, I'll
tell you the procedure we should use to make certain the liberation of the
cache isn't difficult. It should just be a matter of meeting the formalities.

Hopefully.

Sandra Mani

NB: see that you call me when you get this message for more briefing.


------------=_3E4407DC.C8909ED8--