I got 179 Nigerian scam message bodies (though not headers) from 
http://www.quatloos.com/cm-niger/cm-niger.htm, and used them to test how 
SA handles them.  With the default 2.2 setup (taken from CVS today), 
72 of the 179 are correctly tagged as spam.  Overriding some of the weird 
negative scores the GA made, especially DEAR_SOMEBODY, filtered out 19 of the 
remaining 108 messages.  Then I made some modifications and additions that 
filtered out 74 of the remaining 89.

First I modified the rules in 20_body_tests.cf that deal with Nigerian scams. 
Here's a diff:

<<<<<
Index: 20_body_tests.cf
===================================================================
RCS file: /cvsroot/spamassassin/spamassassin/rules/20_body_tests.cf,v
retrieving revision 1.50
diff -u -3 -p -r1.50 20_body_tests.cf
--- 20_body_tests.cf    24 Feb 2002 11:03:23 -0000      1.50
+++ 20_body_tests.cf    1 Mar 2002 06:03:54 -0000
@@ -522,10 +522,10 @@ body NIGERIAN_SCAM                /BASED ON INFORMATIO
 describe NIGERIAN_SCAM         Nigerian scam, cf http://www.snopes2.com/inboxer/scams/nigeria.htm
 
 # (contrib: skod)
-body NIGERIAN_SCAM_2           /(?:Government of Nigeria|NIGERIAN? NATIONAL|Nigerian? Government)/
+body NIGERIAN_SCAM_2           /(?:Government of Nigeria|NIGERIAN? NATIONAL|Nigerian? Government|Federal Republic of Nigeria)/
 describe NIGERIAN_SCAM_2       Mutated Nigerian scams
 
-body US_DOLLARS                        /Million\b.{0,40}\b(?:United States Dollars|USD)/i
+body US_DOLLARS                        /Million\b.{0,40}\b(?:United States Dollars|USD|U\. ?S\. Dollars)/i
 describe US_DOLLARS            Nigerian scam key phrase
 
 rawbody UNNEEDED_HTML_ENCODING /font=3E/i
>>>>>>>>>

I added "Federal Republic of Nigeria" to NIGERIAN_SCAM_2 and "U\. ?S\. 
Dollars" (matches "U.S. Dollars" or "U. S. Dollars") to US_DOLLARS.
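
As a quick sanity check, the two amended patterns can be exercised outside 
SpamAssassin.  The sketch below is not part of the patch: it uses Python's re 
module as a stand-in for Perl regexes, and the sample phrases and the helper 
name hits() are made up for illustration (they are not from the quatloos 
archive).

```python
import re

# Patterns copied from the amended rules; Python's re is only a stand-in
# for the Perl regex engine that SpamAssassin actually uses.
AMENDED_RULES = {
    "NIGERIAN_SCAM_2": re.compile(
        r"(?:Government of Nigeria|NIGERIAN? NATIONAL|Nigerian? Government"
        r"|Federal Republic of Nigeria)"
    ),
    "US_DOLLARS": re.compile(
        r"Million\b.{0,40}\b(?:United States Dollars|USD|U\. ?S\. Dollars)",
        re.IGNORECASE,  # the rule carries the /i flag
    ),
}


def hits(text):
    """Return the set of rule names whose pattern matches the text."""
    return {name for name, rx in AMENDED_RULES.items() if rx.search(text)}


# Hypothetical sample phrases, invented for this check.
SAMPLES = [
    "a contract with the Federal Republic of Nigeria",
    "Twenty Million U.S. Dollars",
    "Twenty Million U. S. Dollars",
    "Twenty Million US Dollars",
]

if __name__ == "__main__":
    for text in SAMPLES:
        print(f"{text!r}: {sorted(hits(text))}")
```

Note that the last sample, "US Dollars" with no dots, matches neither the old 
nor the new alternation, so a further variant may still be worth adding.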

Then I created these new rules:

<<<<<<<<
body NIGERIAN_SCAM_3            /(?:Bank of Nigeria|Nigerian? National Petroleum)/i
describe NIGERIAN_SCAM_3        Nigerian Bank or Petroleum scam

body NIGERIAN_SCAM_4            /\b(?:closed?|freeze|frozen?)\b.{0,10}bank account/i
describe NIGERIAN_SCAM_4        Some poor Nigerian got his bank account frozen

body NIGERIAN_SCAM_5            /\b(wife|widow|son|husband)\b.{0,60}Sann?i Abacha/
describe NIGERIAN_SCAM_5        Nigerian widow needs your help...

body NIGERIAN_SCAM_6            /Sann?i Abacha.{0,60}\b(wife|widow|son|husband)\b/
describe NIGERIAN_SCAM_6        Nigerian widow needs your help...

body US_DOLLARS_2               /(?:\$|usd)\d{2,3}(?:\.\d)?m\b/i
describe US_DOLLARS_2           Nigerian scam key phrase ($NN.Nm)

body US_DOLLARS_3               /(?:\$|usd ?)\d{1,3},\d{3},\d{3}(?:\.\d\d)?/i
describe US_DOLLARS_3           Nigerian scam key phrase ($NN,NNN,NNN.NN)

score NIGERIAN_SCAM_3 7.050
score NIGERIAN_SCAM_4 7.050
score NIGERIAN_SCAM_5 7.050
score NIGERIAN_SCAM_6 7.050
score US_DOLLARS_2    3.339
score US_DOLLARS_3    3.339
>>>>>>>>>>>>>

Anything matching "Bank of Nigeria" or "Nigerian National Petroleum" is one 
of the scams.  Anything talking about closed or frozen bank accounts is also 
probably a scam, though I suppose that rule is more likely to match 
legitimate mail.  The next two rules look for people claiming to be a 
relative of the late General Sani Abacha.  Finally, there are two rules that 
look for descriptions of millions of dollars in the form of "$12.3m" or 
"$1,234,567.89".
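
The new rules can be spot-checked the same way before trusting them on real 
mail.  This is a rough sketch only: it assumes Python's re behaves like Perl 
for these particular patterns, it ignores how SpamAssassin preprocesses body 
text, and the sample phrases and the helper name matching_rules() are 
invented for illustration.

```python
import re

I = re.IGNORECASE  # used for the rules that carry the /i flag

# Patterns copied from the new rules; names match the rule names.
NEW_RULES = {
    "NIGERIAN_SCAM_3": re.compile(
        r"(?:Bank of Nigeria|Nigerian? National Petroleum)", I),
    "NIGERIAN_SCAM_4": re.compile(
        r"\b(?:closed?|freeze|frozen?)\b.{0,10}bank account", I),
    "NIGERIAN_SCAM_5": re.compile(
        r"\b(wife|widow|son|husband)\b.{0,60}Sann?i Abacha"),
    "NIGERIAN_SCAM_6": re.compile(
        r"Sann?i Abacha.{0,60}\b(wife|widow|son|husband)\b"),
    "US_DOLLARS_2": re.compile(
        r"(?:\$|usd)\d{2,3}(?:\.\d)?m\b", I),
    "US_DOLLARS_3": re.compile(
        r"(?:\$|usd ?)\d{1,3},\d{3},\d{3}(?:\.\d\d)?", I),
}


def matching_rules(text):
    """Return the set of new rule names whose pattern matches the text."""
    return {name for name, rx in NEW_RULES.items() if rx.search(text)}


# Hypothetical sample phrases, invented for this check.
SAMPLES = [
    "the Central Bank of Nigeria",
    "his frozen bank account",
    "I am the widow of the late General Sani Abacha",
    "General Sanni Abacha's son",
    "a sum of $12.3m",
    "USD 1,234,567.89",
    "a sum of $5m",
]

if __name__ == "__main__":
    for text in SAMPLES:
        print(f"{text!r}: {sorted(matching_rules(text))}")
```

The last sample matches nothing because US_DOLLARS_2 requires at least two 
digits before the "m", so single-digit amounts such as "$5m" slip through.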

If anyone wants the entire archive of bodies I have, or just the ones that 
are missed, please mail me.

-- 
http://dmoz.org                  | Give a man a match, and he'll be warm for a
                                 | minute, but light him on fire, and he'll be
The world's largest human-edited | warm for the rest of his life.
web directory                    | ICQ: 132152059
