-----Original Message-----
| From: Carl Chipman [mailto:[EMAIL PROTECTED]
| Sent: Friday, January 09, 2004 10:32 AM
| To: Smart,Dan
| Subject: RE: [SAtalk] detecting large collections of random words
|
| Btw, why not rename your rules CP_RANDOMWORDS_10 and _15 so
| that the name is more accurate descrip
15 2.5
## EOF
| -----Original Message-----
| From: McWhirter,Julia [mailto:[EMAIL PROTECTED]
| Sent: Friday, January 09, 2004 5:45 AM
| To: Chris Petersen; [EMAIL PROTECTED]
| Subject: RE: [SAtalk] detecting large collections of random words
|
| I have tried this and still SA doe
Subject: RE: [SAtalk] detecting large collections of random words
> Looks good. just running this over a ham mail box with about 500 messages
> and a spam mail box with the same, and not decoding base64 and such, I
> see the following:
what about something like:
/(?:\b(?!=(?:from|even|
> Negative look-ahead is (?!...), not (?!=...). In your version,
> the equals sign is part of the pattern to match, and since
> anything that matches [a-z] can't be '=', the negative look-
> ahead ends up doing nothing.
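
The point about (?!=...) versus (?!...) can be checked with a small sketch. It uses Python's re module as a stand-in for the Perl regex engine behind SpamAssassin (an assumption; look-ahead and \b behave the same way in both for these patterns), and the sample strings are made up for illustration.

```python
import re

STOPWORDS = "from|even|more|were|with"

# As posted: "(?!=" is a negative look-ahead for a literal "=" followed by a
# stop word. A character matched by [a-z] is never "=", so it never rejects.
broken = re.compile(r"(?:\b(?!=(?:%s)\b)[a-z]{4,12}\s+){12}" % STOPWORDS)

# Corrected: "(?!" followed directly by the alternation of stop words.
fixed = re.compile(r"(?:\b(?!(?:%s)\b)[a-z]{4,12}\s+){12}" % STOPWORDS)

# Hypothetical samples: twelve lowercase words each. The first contains the
# stop word "with" in the middle; the second contains no stop words.
with_stopword = ("alpha bravo charlie delta echo foxtrot with "
                 "golf hotel india juliet kilo ")
no_stopword = ("alpha bravo charlie delta echo foxtrot xray "
               "golf hotel india juliet kilo ")

for name, pattern in (("broken", broken), ("fixed", fixed)):
    print(name,
          "stopword sample:", bool(pattern.search(with_stopword)),
          "clean sample:", bool(pattern.search(no_stopword)))
```

With the stray "=" the stop-word list is ignored and both samples match. With the corrected look-ahead, the stop word breaks the run of twelve words, so only the clean sample matches.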
Oops.. Shows how long it's been since I've done serious regex stuff
(and I
Chris Petersen <[EMAIL PROTECTED]> wrote:
> what about something like:
>
> /(?:\b(?!=(?:from|even|more|were|with)\b)[a-z]{4,12}\s+){12}/
Negative look-ahead is (?!...), not (?!=...). In your version,
the equals sign is part of the pattern to match, and since
anything that matches [a-z] can't
> Looks good. just running this over a ham mail box with about 500 messages
> and a spam mail box with the same, and not decoding base64 and such, I
> see the following:
what about something like:
/(?:\b(?!=(?:from|even|more|were|with)\b)[a-z]{4,12}\s+){12}/
I'm trying to think of extremely comm
> Slightly better might be:
> /(?:(\b[a-z]{4,12}\s+){12,})/
The surrounding (?:) doesn't actually do anything - you're just grouping
the whole regex itself. Thus /(\b[a-z]{4,12}\s+){12}/
would work just as well, or /(?:\b[a-z]{4,12}\s+){12}/ if you wanted to
make a slight optimization and not g
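
A small sketch of the grouping point, using Python's re module as a stand-in for Perl (an assumption; grouping works the same way for these patterns). The sample text is made up. All three forms find a match in the same text; they differ only in how many capture groups the engine has to track.

```python
import re

# Hypothetical sample: twelve lowercase words of 4 to 12 letters each.
text = ("lorem ipsum dolor amet consectetur adipiscing elit "
        "eiusmod tempor incididunt labore dolore ")

wrapped = re.compile(r"(?:(\b[a-z]{4,12}\s+){12,})")   # outer (?:) is redundant
capturing = re.compile(r"(\b[a-z]{4,12}\s+){12}")      # one capture group
non_capturing = re.compile(r"(?:\b[a-z]{4,12}\s+){12}")  # no capture groups

for name, pattern in (("wrapped", wrapped),
                      ("capturing", capturing),
                      ("non_capturing", non_capturing)):
    # .groups is the number of capture groups in the compiled pattern.
    print(name, "matches:", bool(pattern.search(text)),
          "capture groups:", pattern.groups)
```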
Here's a rule I wrote for just this sort of spam:
rawbody WORDWORD /[a-z]{4,12} [a-z]{4,12} [a-z]{4,12} [a-z]{4,12} [a-z]{4,12} [a-z]{4,12} [a-z]{4,12} [a-z]{4,12} [a-z]{4,12} [a-z]{4,12} /
describe WORDWORD long string of random words
score WORDWORD 2.0
(Sorry if it wraps,
> From: Chris Petersen
[...]
>
> Yes, though I used:
>
> /(\b[a-z]{4,12}\s+){12}/
>
> Notice the initial \b, and there's no need to make SA continue to search
> beyond the "minimum" match, so leave off the comma in the last {} cluster.
>
Looks good. just running this over a ham mail box with about
> -----Original Message-----
> From: [EMAIL PROTECTED]
> Sent: Thursday, January 08, 2004 12:57 PM
>
> Would this regex make more sense?
>
> /([a-z]{4,12}\s){12,}/
Slightly better might be:
/(?:(\b[a-z]{4,12}\s+){12,})/
BTW, I'm already seeing some random-word spam with random punctuation thrown in as well...
Pierre Thomson
-----Original Message-----
From: Chris Petersen [mailto:[EMAIL PROTECTED]
Sent: Thursday, January 08, 2004 4:18 PM
To: [EMAIL PROTECTED]
Subject: RE: [SAtalk] detecting large collec
> Would this regex make more sense?
> /([a-z]{4,12}\s){12,}/
Yes, though I used:
/(\b[a-z]{4,12}\s+){12}/
Notice the initial \b, and there's no need to make SA continue to search
beyond the "minimum" match, so leave off the comma in the last {} cluster.
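
A small sketch of both points, the word-boundary anchor \b and {12} versus {12,}. It uses Python's re module as a stand-in for Perl (an assumption; \b and counted repetition behave the same way for these patterns), and the sample strings are made up.

```python
import re

loose = re.compile(r"([a-z]{4,12}\s){12,}")             # no \b, open-ended count
anchored = re.compile(r"(\b[a-z]{4,12}\s+){12}")        # \b, exactly 12
anchored_open = re.compile(r"(\b[a-z]{4,12}\s+){12,}")  # \b, 12 or more

# Hypothetical sample: one 20-letter word followed by only 11 short words.
long_first = "internationalization " + "word " * 11

# Without \b the match can begin inside the long word, so its tail counts as
# the twelfth "word". With \b every word must start at a word boundary.
print("loose:", bool(loose.search(long_first)))
print("anchored:", bool(anchored.search(long_first)))

# Hypothetical sample: twenty short words. Both counts report a hit; the
# open-ended form just keeps consuming words after the twelfth.
twenty = "word " * 20
min_end = anchored.search(twenty).end()
open_end = anchored_open.search(twenty).end()
print("end offset with {12}:", min_end, "with {12,}:", open_end)
```

Since a rule only needs to know whether the pattern hit, stopping at the twelfth word gives the same verdict with less scanning.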
--
Chris Petersen
Programmer / Web Designe
Would this regex make more sense?
/([a-z]{4,12}\s){12,}/
Andrew Hoying
[EMAIL PROTECTED] wrote on 01/08/2004 01:37:49 PM:
> Here's a rule I wrote for just this sort of spam:
>
> rawbody WORDWORD/[a-z]{4,12} [a-z]{4,12} [a-z]{4,12} [a-z]
> {4,12} [a-z]{4,12} [a-z]{4,12} [a-z]{4,12} [
Here's a rule I wrote for just this sort of spam:
rawbody WORDWORD /[a-z]{4,12} [a-z]{4,12} [a-z]{4,12} [a-z]{4,12} [a-z]{4,12} [a-z]{4,12} [a-z]{4,12} [a-z]{4,12} [a-z]{4,12} [a-z]{4,12} [a-z]{4,12} [a-z]{4,12} /
describe WORDWORD long string of random words
score WORDWORD