So I take it that the final rewrite of this rule would be:

---- cpeterson.cf ---------
## I've noticed that a lot of spams recently have been following the random-words technique,
## with very little "spam" content - often just an image or some obfuscated text. Has anyone
## given any thought to writing up a rule that detects a LACK of punctuation, or a lack of
## short words like a/and/the? It'd be easy for spammers to get around, but at least it would
## keep them out of inboxes for awhile.

rawbody CP_WORDWORD_10 /(?:\b(?!(?:from|even|more|were|with)\b)[a-z]{4,12}\s+){10}/
describe CP_WORDWORD_10 string of 10+ random words
score CP_WORDWORD_10 0.5

rawbody CP_WORDWORD_15 /(?:\b(?!(?:from|even|more|were|with)\b)[a-z]{4,12}\s+){15}/
describe CP_WORDWORD_15 string of 15+ random words
score CP_WORDWORD_15 2.5
## EOF ---------------------------

<<Dan>>

| -----Original Message-----
| From: McWhirter,Julia [mailto:[EMAIL PROTECTED]
| Sent: Friday, January 09, 2004 5:45 AM
| To: Chris Petersen; [EMAIL PROTECTED]
| Subject: RE: [SAtalk] detecting large collections of random words
|
| I have tried this and still SA does not pick up the messages
| as spam. I have put it in a .cf files under
| /etc/mail/spamassassin with my other ones.
|
| Running
| Solaris 8
| Spamassassin 2.6
| Mimedefang 2.39
| Sendmail 8.12.10
|
| Regards
| Julia McWhirter
| IT Manager
|
| SuperH (UK) Ltd
| Network House
| 2410 Aztec West
| Almondsbury
| Bristol
| BS32 4QX
|
| Tel : 01454 465661
| Fax : 01454 465601
| Mobile : 07979 913494
| Email : [EMAIL PROTECTED]
| Web : www.superh.com
|
| -----Original Message-----
| From: [EMAIL PROTECTED]
| [mailto:[EMAIL PROTECTED] On Behalf Of Chris Petersen
| Sent: 09 January 2004 02:49
| To: [EMAIL PROTECTED]
| Subject: RE: [SAtalk] detecting large collections of random words
|
| > Looks good.
| > just running this over a ham mail box with about 500 messages
| > and a spam mail box with the same, and not decoding base64 and such, I
| > see the following:
|
| what about something like:
|
| /(?:\b(?!=(?:from|even|more|were|with)\b)[a-z]{4,12}\s+){12}/
|
| I'm trying to think of extremely common 4-letter words, so
| this is probably just a quick example.
|
| > I tend to like the idea of weighting the 10 sequence low, say 0.5, and
| > the 13 sequence would get an extra bump of 2.0 more (making a total of
| > 2.5).
|
| That makes sense. Though I'd probably go with 10 low, and 15 high (like
| 3 or more). But that's just me:
|
| rawbody WORDWORD_10 /(?:\b(?!=(?:from|even|more|were|with)\b)[a-z]{4,12}\s+){10}/
| describe WORDWORD_10 string of 10+ random words
| score WORDWORD_10 .5
|
| rawbody WORDWORD_15 /(?:\b(?!=(?:from|even|more|were|with)\b)[a-z]{4,12}\s+){15}/
| describe WORDWORD_15 string of 15+ random words
| score WORDWORD_15 2.5
|
| --
| Chris Petersen
| Programmer / Web Designer
| Silicon Mechanics: http://www.siliconmechanics.com/
| Blade Servers: http://www.siliconmechanics.com/c292/blade-server.php
| 1U Servers: http://www.siliconmechanics.com/c272/1u-server.php
|
| -------------------------------------------------------
| This SF.net email is sponsored by: Perforce Software.
| Perforce is the Fast Software Configuration Management System offering
| advanced branching capabilities and atomic changes on 50+ platforms.
| Free Eval! http://www.perforce.com/perforce/loadprog.html
| _______________________________________________
| Spamassassin-talk mailing list
| [EMAIL PROTECTED]
| https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
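
For anyone comparing the two versions of the pattern in this thread, here is a minimal sketch in Python rather than SpamAssassin itself. It assumes that Python's re module treats these particular constructs (\b, lookaheads, counted groups) the same way Perl does, and the two sample strings are made up for illustration. It only shows how "(?!=" and "(?!" differ: the first is a negative lookahead for a literal "=" followed by a stop word, so a word like "from" still counts toward the run, while the second actually excludes the stop words.

```python
import re

# Pattern as posted in the .cf above: plain negative lookahead "(?!".
FIXED_10 = re.compile(
    r"(?:\b(?!(?:from|even|more|were|with)\b)[a-z]{4,12}\s+){10}"
)

# Pattern from the earlier quoted draft: "(?!=" only rejects a literal
# "=" followed by a stop word, so the stop words themselves are accepted.
DRAFT_10 = re.compile(
    r"(?:\b(?!=(?:from|even|more|were|with)\b)[a-z]{4,12}\s+){10}"
)

# Hypothetical sample text: eleven lowercase words, no stop words.
RANDOM_WORDS = (
    "plume garnet twisty marble candle harbor violet pepper "
    "saddle timber rocket "
)

# Same words with the stop word "from" inserted after the fourth word,
# which leaves no run of ten consecutive non-stop words.
WITH_STOP_WORD = (
    "plume garnet twisty marble from candle harbor violet pepper "
    "saddle timber rocket "
)

for label, text in (("random words", RANDOM_WORDS),
                    ("with stop word", WITH_STOP_WORD)):
    fixed_hit = FIXED_10.search(text) is not None
    draft_hit = DRAFT_10.search(text) is not None
    print(f"{label}: fixed={fixed_hit} draft={draft_hit}")
```

Both patterns hit on the first sample. On the second, only the draft pattern hits, because "from" slips past the "(?!=" lookahead. To check the real rule file rather than this approximation, running spamassassin --lint should flag syntax problems in the .cf.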