So I take it that the final rewrite of this rule would be:
---- cpeterson.cf ---------
## I've noticed that a lot of spams recently have been following the random-words technique,
## with very little "spam" content - often just an image or some obfuscated text. Has anyone
## given any thought to writing up a rule that detects a LACK of punctuation, or a lack of
## short words like a/and/the? It'd be easy for spammers to get around, but at least it would
## keep them out of inboxes for a while.
rawbody CP_WORDWORD_10 /(?:\b(?!(?:from|even|more|were|with)\b)[a-z]{4,12}\s+){10}/
describe CP_WORDWORD_10 string of 10+ random words
score CP_WORDWORD_10 0.5
rawbody CP_WORDWORD_15 /(?:\b(?!(?:from|even|more|were|with)\b)[a-z]{4,12}\s+){15}/
describe CP_WORDWORD_15 string of 15+ random words
score CP_WORDWORD_15 2.5
## EOF
---------------------------
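
For anyone who wants to sanity-check the pattern outside SpamAssassin, here is a rough Python sketch (not part of the rule file). It assumes Python's re module treats \b, (?!...) and {n} the same way Perl does for this pattern, which I believe holds for these constructs. The helper name wordword and both sample strings are made up for illustration. It also shows why the "(?!=" spelling in the quoted version below matters: that form only rejects a word preceded by a literal "=", so the common-word exclusion never takes effect on normal text.

```python
import re

# Word list and length bounds copied from the rule above.
COMMON = r"(?:from|even|more|were|with)"

def wordword(count, lookahead="(?!"):
    """Hypothetical helper: build the rule's pattern for `count` words in a row."""
    return re.compile(
        r"(?:\b" + lookahead + COMMON + r"\b)[a-z]{4,12}\s+){" + str(count) + "}"
    )

# Made-up samples: 11 unrelated words, and a plain sentence using the common words.
random_text = "lamp river stone cloud orbit fence grape tiger piano delta quilt "
ordinary_text = ("these words were taken from some plain text "
                 "with more than enough filler inside ")

print(wordword(10).search(random_text) is not None)            # True: 10-word rule hits
print(wordword(15).search(random_text) is not None)            # False: only 11 words
print(wordword(10).search(ordinary_text) is not None)          # False: common words break the run
print(wordword(10, "(?!=").search(ordinary_text) is not None)  # True: "(?!=" excludes nothing here
```

Note that [a-z] is lowercase only, so a capitalized word also breaks a run unless the rule is given the /i modifier.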
<<Dan>>
| -----Original Message-----
| From: McWhirter,Julia [mailto:[EMAIL PROTECTED]
| Sent: Friday, January 09, 2004 5:45 AM
| To: Chris Petersen; [EMAIL PROTECTED]
| Subject: RE: [SAtalk] detecting large collections of random words
|
| I have tried this and SA still does not pick up the messages
| as spam. I have put it in a .cf file under
| /etc/mail/spamassassin with my other ones.
|
| Running
| Solaris 8
| Spamassassin 2.6
| Mimedefang 2.39
| Sendmail 8.12.10
|
|
| Regards
| Julia McWhirter
| IT Manager
|
| SuperH (UK) Ltd
| Network House
| 2410 Aztec West
| Almondsbury
| Bristol
| BS32 4QX
|
| Tel : 01454 465661
| Fax : 01454 465601
| Mobile : 07979 913494
| Email : [EMAIL PROTECTED]
| Web : www.superh.com
|
|
| -----Original Message-----
| From: [EMAIL PROTECTED]
| [mailto:[EMAIL PROTECTED] On
| Behalf Of Chris Petersen
| Sent: 09 January 2004 02:49
| To: [EMAIL PROTECTED]
| Subject: RE: [SAtalk] detecting large collections of random words
|
| > Looks good. Just running this over a ham mail box with about 500 messages
| > and a spam mail box with the same, and not decoding base64 and such, I
| > see the following:
|
| what about something like:
|
| /(?:\b(?!=(?:from|even|more|were|with)\b)[a-z]{4,12}\s+){12}/
|
| I'm trying to think of extremely common 4-letter words, so
| this is probably just a quick example.
|
| > I tend to like the idea of weighting the 10 sequence low, say 0.5, and
| > the 13 sequence would get an extra bump of 2.0 more (making a total of
| > 2.5).
|
| That makes sense. Though I'd probably go with 10 low, and 15 high (like
| 3 or more). But that's just me:
|
| rawbody WORDWORD_10 /(?:\b(?!=(?:from|even|more|were|with)\b)[a-z]{4,12}\s+){10}/
| describe WORDWORD_10 string of 10+ random words
| score WORDWORD_10 .5
|
|
| rawbody WORDWORD_15 /(?:\b(?!=(?:from|even|more|were|with)\b)[a-z]{4,12}\s+){15}/
| describe WORDWORD_15 string of 15+ random words
| score WORDWORD_15 2.5
|
|
|
| --
| Chris Petersen
| Programmer / Web Designer
| Silicon Mechanics: http://www.siliconmechanics.com/
| Blade Servers: http://www.siliconmechanics.com/c292/blade-server.php
| 1U Servers: http://www.siliconmechanics.com/c272/1u-server.php
|
|
|
|
| -------------------------------------------------------
| This SF.net email is sponsored by: Perforce Software.
| Perforce is the Fast Software Configuration Management System offering
| advanced branching capabilities and atomic changes on 50+ platforms.
| Free Eval! http://www.perforce.com/perforce/loadprog.html
| _______________________________________________
| Spamassassin-talk mailing list
| [EMAIL PROTECTED]
| https://lists.sourceforge.net/lists/listinfo/spamassassin-talk