Hi

On Mon, Dec 13, 2004 at 04:43:28PM -0800, jdow wrote:
> > I've seen another variant about by Matthew Newton that makes a bunch of
> > rules for both subject and body separately. I generally don't do this as
> > the body rules will match the subject line, so there's really no need,
> > other than as a score amplifier. I usually only make subject rules when a
> > body rule isn't appropriate. He's also done separate regular and
> > gappy-text rules, but doesn't pick up on character-sub obfuscations. It
> > is a decent set however.
> >
> > One good rule I've seen that Matthew Newton wrote is this one:
> >
> > rawbody   UOLCC_WATCH_BODY   /^(Do you )?[Ww]ant (a )?(cheap )?([Ww]ristw|[Ww])atch\?\s*$/m
> > describe  UOLCC_WATCH_BODY   Body asks if you want a watch
> > score     UOLCC_WATCH_BODY   1.5
> >
> > Very targeted, but effective with low risk of FPs.
> 
> Here is the full set of his stuff I am running. So far it has hit no ham.

I've recently updated some of these to try to match a few that were
slipping through. UOLCC_WATCH_BODY has now been modified to accept
"rolex" in place of "cheap", as one like that arrived the other day.
UOLCC_HTM_HTML_URL is slightly less picky about which characters
can appear in the "proverb" line and the "name" line, just looking for
more than 8 "words" and fewer than 15 "words". I figured out that it's
more the repeated URLs that will be unique to the spam, rather than the
formatting of the two text lines. Oh, and the URL can now contain 0-9
and -, too.
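
As a quick sanity check of the widened watch rule, the sketch below runs
the same pattern through Python's re module rather than SpamAssassin
itself. The sample lines are made up for illustration, and Python's regex
engine is only assumed to behave like Perl's for a pattern this simple, so
treat it as a rough check and not a substitute for testing against real
mail.

```python
import re

# Pattern copied from the UOLCC_WATCH_BODY rule; the /m flag becomes re.MULTILINE.
WATCH_BODY = re.compile(
    r"^(Do\syou\s)?[Ww]ant\s(a\s)?(rolex\s|cheap\s)?[Ww](ristw)?atch\?\s*$",
    re.MULTILINE,
)

# Hypothetical sample lines, not taken from real spam.
samples = [
    "Do you want a rolex watch?",
    "Want a cheap Wristwatch?",
    "want watch?",
    "Do you want to watch a film?",
    "I want a rolex watch? Maybe.",
]

# Map each sample line to whether the rule's pattern matches it.
results = {line: bool(WATCH_BODY.search(line)) for line in samples}
for line, hit in results.items():
    print(f"{'HIT ' if hit else 'miss'}  {line}")
```

Because the pattern is anchored at both ends of the line, the question has
to stand on a line of its own, which is what keeps ordinary sentences that
merely mention wanting a watch from matching.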

Didn't realise that the body test checks the subject, too, but I don't
suppose it can hurt with both tests.

Current set below.

Matthew


---------------------------------------------------------------------

header    UOLCC_ROLEX_SUB1   Subject =~ /\brolex\b/i
describe  UOLCC_ROLEX_SUB1   Subject contains the word 'rolex'
score     UOLCC_ROLEX_SUB1   0.5

header    UOLCC_ROLEX_SUB2   Subject =~ /\br.{1,2}o.{1,2}l.{1,2}e.{1,2}x\b/i
describe  UOLCC_ROLEX_SUB2   Subject contains a gappy version of 'rolex'
score     UOLCC_ROLEX_SUB2   1.5

body      UOLCC_ROLEX_BODY1  /\brolex\b/i
describe  UOLCC_ROLEX_BODY1  Body contains the word 'rolex'
score     UOLCC_ROLEX_BODY1  0.5

body      UOLCC_ROLEX_BODY2  /\br.{1,2}o.{1,2}l.{1,2}e.{1,2}x\b/i
describe  UOLCC_ROLEX_BODY2  Body contains a gappy version of 'rolex'
score     UOLCC_ROLEX_BODY2  1.5

rawbody   UOLCC_WATCH_BODY  /^(Do\syou\s)?[Ww]ant\s(a\s)?(rolex\s|cheap\s)?[Ww](ristw)?atch\?\s*$/m
describe  UOLCC_WATCH_BODY  Body asks if you want a watch
score     UOLCC_WATCH_BODY  2

full      UOLCC_HTM_HTML_URL /\n(http:\/\/[a-z0-9-]+\.[a-z]{3,4}\/[0-9a-f]{5,35}\/[[:alnum:]]{5,20}=?\.htm)\s*\n\s*\n\s*([^\s]+)(\s+[^\s]+){6,}\n\s*\n[^\s,.]+(\s[^\s,.]+){0,15}\n\s*\n\1l/s
describe  UOLCC_HTM_HTML_URL Matches pattern of spam mail (.htm .html)
score     UOLCC_HTM_HTML_URL 3.5

full      UOLCC_BBONE        /\n[bB1 ]{8,20}\n[bB1 ]{8,20}\n/s
describe  UOLCC_BBONE        Contains two code lines with b, B and 1
score     UOLCC_BBONE        2

body      UOLCC_CAPWORD_TEST /([A-Z][a-z]{3,}\s{1,2}){15,}/s
describe  UOLCC_CAPWORD_TEST String of words that all begin with caps letter
score     UOLCC_CAPWORD_TEST 1.2

---------------------------------------------------------------------
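
To illustrate how the plain and gappy 'rolex' patterns divide the work,
here is a small sketch, again using Python's re module as a stand-in for
SpamAssassin and with invented sample strings. The classify() helper is
hypothetical and exists only for this example. Each .{1,2} needs at least
one character, so the gappy pattern does not fire on the unobfuscated
spelling, and a character substitution such as "r0lex" is caught by
neither pattern.

```python
import re

# Patterns copied from UOLCC_ROLEX_BODY1 and UOLCC_ROLEX_BODY2; /i becomes re.IGNORECASE.
PLAIN = re.compile(r"\brolex\b", re.IGNORECASE)
GAPPY = re.compile(r"\br.{1,2}o.{1,2}l.{1,2}e.{1,2}x\b", re.IGNORECASE)


def classify(text):
    """Return (plain_hit, gappy_hit) for a piece of text (hypothetical helper)."""
    return bool(PLAIN.search(text)), bool(GAPPY.search(text))


# Hypothetical sample strings for illustration only.
for text in ["Cheap Rolex here", "R.o.l.e.x replicas", "r  o  l  e  x", "r0lex deals"]:
    print(f"{text!r}: plain={classify(text)[0]} gappy={classify(text)[1]}")
```
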

-- 
Matthew Newton <[EMAIL PROTECTED]>

UNIX Systems Administrator, Network Support Section,
Computer Centre, University of Leicester,
Leicester LE1 7RH, United Kingdom
