At 7/25/03 01:41 PM, Daniel Carrera wrote:

Here is a thought:  We could test for n consecutive consonants.  The more
consecutive consonants, the more likely it is to be spam.
[snip]
Any thoughts on these rules?  I guess I'm assuming that you don't get
emails in German.

I also see these sorts of things a lot at the ends of Subject: lines, often preceded by a large number of spaces. I think something like:


header SUBJECT_SPAM_ID_ENDING Subject =~ /\s{3,}[^aeiou]{3,}$/

might catch a decent amount of spam. Can anyone see any ways it might false-positive?

I note that using [^aeiou] instead of [bcdfghjklmnpqrstvwxyz] means it will also fire on things like "!!!" or "###" (or even "$#@&!", as in "mock swearing"). It could also match leet-speak expressions, like "l337" itself... but only when the run is preceded by three or more spaces. The people I routinely correspond with don't tend to put extra spaces into their Subject: lines. (And someone ending their Subject: with three spaces and then "!!!" or "###" probably *is* a spammer.)
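
For anyone who wants to poke at the false-positive question before loading the rule, here is a rough sketch that runs the same pattern over a few Subject: lines. It is not SpamAssassin itself: it assumes Python's re module treats this simple expression the same way Perl does, the STRICT variant is just the explicit consonant class mentioned above, and the sample subjects are made up for illustration.

```python
import re

# Pattern copied from the proposed rule. Assumption: Python's re behaves
# like Perl's regex engine for this simple expression.
LOOSE = re.compile(r"\s{3,}[^aeiou]{3,}$")

# Stricter variant using the explicit lowercase consonant class.
STRICT = re.compile(r"\s{3,}[bcdfghjklmnpqrstvwxyz]{3,}$")

# Hypothetical Subject: lines, invented for this test.
SAMPLES = [
    "Lose weight fast          xkqzrt",  # spaces + consonant run
    "Meeting notes   !!!",               # spaces + punctuation
    "Re: patch for the build",           # ordinary subject
    "l337 hax0rs",                       # leet-speak, no run of spaces
    "Quarterly report   Q3 FY03",        # spaces + capitals and digits
]


def fires(pattern, subject):
    """Return True if the pattern matches anywhere in the subject."""
    return pattern.search(subject) is not None


if __name__ == "__main__":
    # Columns: LOOSE result, STRICT result, subject.
    for s in SAMPLES:
        print(f"{fires(LOOSE, s)!s:5} {fires(STRICT, s)!s:5} {s!r}")
```

One thing this shows: because the pattern is case-sensitive and [^aeiou] excludes only the lowercase vowels, a subject ending in three spaces followed by capitals or digits (the last sample) also matches the loose version, while the strict consonant class does not.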

                                                --Kai MacTane
----------------------------------------------------------------------
"Hey, sister Moonshine, hold me 'til the break of dawn,
 Hold me long,
 Hold me hard,
 Hold me 'til the shadows fade away..."
                                                --The Mission UK,
                                                 "Paradise (Will Shine
                                                  Like the Moon)"



_______________________________________________
Spamassassin-talk mailing list
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
