Hello,

I recently installed SpamAssassin (SA) 3.0.1 and found that some junk
was getting through with scores too low for my liking, especially before
the URLs made it into SURBL. I've put together a few rules to catch some
of it that you might find interesting.

They are:

Rolex and "Want Watch?" messages (there must be loads of rules out there
to do this, I guess, but the default installation doesn't seem to
include any?)

header    UOLCC_ROLEX_SUB1   Subject =~ /\brolex\b/i
describe  UOLCC_ROLEX_SUB1   Subject contains the word 'rolex'
score     UOLCC_ROLEX_SUB1   0.5

header    UOLCC_ROLEX_SUB2   Subject =~ /\br.{1,2}o.{1,2}l.{1,2}e.{1,2}x\b/i
describe  UOLCC_ROLEX_SUB2   Subject contains a gappy version of 'rolex'
score     UOLCC_ROLEX_SUB2   1.5

body      UOLCC_ROLEX_BODY1  /\brolex\b/i
describe  UOLCC_ROLEX_BODY1  Body contains the word 'rolex'
score     UOLCC_ROLEX_BODY1  0.5

body      UOLCC_ROLEX_BODY2  /\br.{1,2}o.{1,2}l.{1,2}e.{1,2}x\b/i
describe  UOLCC_ROLEX_BODY2  Body contains a gappy version of 'rolex'
score     UOLCC_ROLEX_BODY2  1.5
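
As a rough check outside SpamAssassin, the plain and gappy patterns can
be tried with Python's re module, which treats these particular
constructs the same way Perl does. The sample subjects are made up for
illustration. Because each `.{1,2}` needs at least one character, the
gappy pattern does not fire on a plain "rolex", so the two scores do not
stack on the same word:

```python
import re

# Patterns copied from the UOLCC_ROLEX_* rules; /i becomes re.IGNORECASE.
PLAIN = re.compile(r"\brolex\b", re.IGNORECASE)
GAPPY = re.compile(r"\br.{1,2}o.{1,2}l.{1,2}e.{1,2}x\b", re.IGNORECASE)

# Hypothetical subject lines, not taken from real mail.
samples = [
    "Cheap Rolex replicas",
    "R.o.l.e.x for less",
    "R o l e x",
    "Rolexes",
]


def classify(text):
    """Return (plain_hit, gappy_hit) for one subject or body line."""
    return (bool(PLAIN.search(text)), bool(GAPPY.search(text)))


results = {s: classify(s) for s in samples}

for subject, (plain, gappy) in results.items():
    print(f"{subject!r}: plain={plain} gappy={gappy}")
```

Note that "Rolexes" matches neither pattern because of the trailing `\b`.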

rawbody   UOLCC_WATCH_BODY   /^(Do you )?[Ww]ant (a )?(cheap )?([Ww]ristw|W)atch\?\s*$/m
describe  UOLCC_WATCH_BODY   Body asks if you want a watch
score     UOLCC_WATCH_BODY   2
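
A quick sketch of what the watch pattern accepts, again using Python's re
module as a stand-in for Perl. The pattern is written on one line with a
space after "cheap" (my reading of the rule), and the sample lines are
invented. One thing it shows is that a lower-case "watch" on its own is
not matched, only "Watch", "wristwatch" or "Wristwatch":

```python
import re

# UOLCC_WATCH_BODY on a single line; /m becomes re.MULTILINE.
WATCH = re.compile(
    r"^(Do you )?[Ww]ant (a )?(cheap )?([Ww]ristw|W)atch\?\s*$",
    re.MULTILINE,
)


def asks_about_watch(body):
    """True if any line of the body is one of the 'want a watch?' forms."""
    return WATCH.search(body) is not None


# Hypothetical message bodies for illustration only.
hits = {
    "capital_watch": asks_about_watch("Hi there\nWant Watch?\nbye"),
    "wristwatch": asks_about_watch("Do you want a cheap wristwatch?"),
    "lowercase_watch": asks_about_watch("Do you want a cheap watch?"),
    "mid_sentence": asks_about_watch("You want Watch?"),
}

for name, hit in hits.items():
    print(name, hit)
```

If lower-case "watch" should be caught too, the last group could
presumably become `([Ww]ristw|[Ww])atch`, though I have not tested that
against real mail.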

Messages containing two consecutive lines made up only of b, B, space
and 1 (8 to 20 characters each). This seems to be some sort of code used
in spam, maybe:

full      UOLCC_BBONE        /\n[bB1 ]{8,20}\n[bB1 ]{8,20}\n/s
describe  UOLCC_BBONE        Contains two code lines with b, B and 1
score     UOLCC_BBONE        2
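
To see what the pattern does and does not catch, here is a small sketch
in Python (the messages are invented, and LF line endings are assumed).
It also shows one possible source of false positives: two consecutive
lines of nothing but 8 to 20 spaces satisfy the character class as well:

```python
import re

# Pattern copied from UOLCC_BBONE. The /s flag only affects ".", which the
# pattern does not use, so it is omitted here.
BBONE = re.compile(r"\n[bB1 ]{8,20}\n[bB1 ]{8,20}\n")


def has_code_lines(message):
    """True if the message has two adjacent lines of only b, B, 1 and space."""
    return BBONE.search(message) is not None


# Hypothetical messages for illustration only.
two_code_lines = "Subject: hi\n\nhello\nbB1b B1bB\n1bB b1Bb\nbye\n"
one_code_line = "hello\nbB1b B1bB\nbye\n"
too_short = "hello\nbB1b\n1bB1\nbye\n"
only_spaces = "a\n" + " " * 8 + "\n" + " " * 8 + "\nb"

checks = {
    "two_code_lines": has_code_lines(two_code_lines),
    "one_code_line": has_code_lines(one_code_line),
    "too_short": has_code_lines(too_short),
    "only_spaces": has_code_lines(only_spaces),
}
print(checks)
```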

One particular type of spam with a fixed layout: a URL (following a
certain pattern and ending in .htm), a blank line, a line of proverb or
something similar, a blank line, a line with a name, a blank line, then
the exact same URL with "l" on the end (i.e. ending in .html). I guess
rules should be smaller than this, but this one has picked up loads of
spam for me:

full      UOLCC_HTM_HTML_URL /\n(http:\/\/[a-z]+\.[a-z]{3,4}\/[0-9a-f]{5,35}\/[[:alnum:]]{5,20}=?\.htm)\s\n\s*\n[[:alnum:]\?\.',\s:,-]+\n\s*\n[^\s,.]+(\s[^\s,.]+){0,15}\n\s*\n\1l/s
describe  UOLCC_HTM_HTML_URL Matches pattern of spam mail (.htm .html)
score     UOLCC_HTM_HTML_URL 3.5
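
The layout this rule looks for can be sketched in Python. This is an
approximation, not the rule itself: Python's re module has no POSIX
classes, so `[[:alnum:]]` is replaced by `[A-Za-z0-9]` here, and the
message, URL and name are all made up. As written, the pattern needs one
whitespace character between ".htm" and the end of that line, so the
sample URL line carries a trailing space:

```python
import re

# Approximation of UOLCC_HTM_HTML_URL with [[:alnum:]] spelled out.
HTM_HTML = re.compile(
    r"\n(http://[a-z]+\.[a-z]{3,4}/[0-9a-f]{5,35}/[A-Za-z0-9]{5,20}=?\.htm)"
    r"\s\n\s*\n"                       # rest of URL line, then a blank line
    r"[A-Za-z0-9?.',\s:-]+\n\s*\n"     # proverb line, then a blank line
    r"[^\s,.]+(\s[^\s,.]+){0,15}\n"    # name line: up to 16 words
    r"\s*\n\1l",                       # blank line, same URL plus "l"
    re.DOTALL,
)

URL = "http://example.com/0a1b2c3d/abcde12345.htm"  # hypothetical URL


def build(first_url_suffix, second_url):
    """Assemble a sample message (LF line endings assumed)."""
    return (
        "Subject: hi\n\n"
        + URL + first_url_suffix + "\n\n"
        + "A stitch in time saves nine.\n\n"
        + "John Smith\n\n"
        + second_url + "\n"
    )


matches = {
    "same_url": bool(HTM_HTML.search(build(" ", URL + "l"))),
    "different_url": bool(HTM_HTML.search(build(" ", URL.replace("abcde", "zzzzz") + "l"))),
    "no_trailing_space": bool(HTM_HTML.search(build("", URL + "l"))),
}
print(matches)
```

The backreference `\1` is what ties the two URLs together, so a message
whose second link differs at all is not matched.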

Finally, a run of words (15 or more here) that each begin with a capital
letter followed by at least three lower-case letters, with no
punctuation between them (I'm only testing this one at the moment, hence
the low score):

body      UOLCC_CAPWORD_TEST /([A-Z][a-z]{3,}\s{1,2}){15,}/s
describe  UOLCC_CAPWORD_TEST String of words that all begin with caps letter
score     UOLCC_CAPWORD_TEST 0.1
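
A short sketch of where this pattern's limits lie, using Python's re
module and invented text. Every word, including the last, must be
followed by whitespace, and a short word such as "The" or any comma
breaks the run. The last sample is a hypothetical legitimate message, a
plain list of names, which the pattern also matches:

```python
import re

# Pattern copied from UOLCC_CAPWORD_TEST; /s has no effect without ".".
CAPWORDS = re.compile(r"([A-Z][a-z]{3,}\s{1,2}){15,}")


def run(n, word="Lorem"):
    """Build n capitalised words, each followed by one space."""
    return (word + " ") * n


# Hypothetical legitimate text: fifteen names separated by spaces.
names = (
    "John Smith Mary Jones Peter Brown Sarah White David Green "
    "Laura Black James Clark Helen "
)

found = {
    "fifteen_words": bool(CAPWORDS.search(run(15))),
    "fourteen_words": bool(CAPWORDS.search(run(14))),
    "short_word_in_middle": bool(CAPWORDS.search(run(8) + "The " + run(8))),
    "comma_in_middle": bool(CAPWORDS.search(run(8).rstrip() + ", " + run(8))),
    "list_of_names": bool(CAPWORDS.search(names)),
}
print(found)
```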


Hope these are of use to someone. If anyone can show me that they are
likely to pick up false positives, I'd be most grateful.

Thanks,

-- 
Matthew Newton <[EMAIL PROTECTED]>

UNIX Systems Administrator, Network Support Section,
Computer Centre, University of Leicester,
Leicester LE1 7RH, United Kingdom
