I wonder if a better way to do this would be to add an extra field to
the rule (or maybe change BODY to BODY_STRIPPED or HEADER_STRIPPED)
which removes everything that is *not* a letter before doing the regexp
check.  I.e., it does an s/[^a-zA-Z]//g on the body/header before checking
the rule.  I can't see it being too useful on the body, but it would be
great for catching those "Per\scri;ption" subject lines.
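
A rough sketch of the strip-then-match idea, in Python rather than
SpamAssassin's own Perl. The names strip_non_letters and stripped_match are
made up for illustration and are not SpamAssassin functions or rule types.

```python
import re

# Everything that is not an ASCII letter, as in s/[^a-zA-Z]//g.
NON_LETTERS = re.compile(r"[^a-zA-Z]")


def strip_non_letters(text: str) -> str:
    """Remove every character that is not an ASCII letter."""
    return NON_LETTERS.sub("", text)


def stripped_match(pattern: str, text: str) -> bool:
    """Hypothetical HEADER_STRIPPED check: strip first, then run the regexp."""
    return re.search(pattern, strip_non_letters(text), re.IGNORECASE) is not None


subject = r"Cheap Per\scri;ption drugs"
print(strip_non_letters(subject))
print(stripped_match(r"p(?:re|er)scription", subject))
```

Note that stripping also removes spaces, so a rule written for the stripped
text cannot rely on word boundaries, and it still has to allow for
misspellings such as "Perscription".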


Rich Puhek wrote:

Roger Merchberger wrote:

> Rumor has it that Charles Gregory may have mentioned these words:
> 
>> [snippety]
>> Rule:
>> BODY RULENAME /a string/i
>>
>> Coded Rule:
>> BODY RULENAME /a{1,3} s{1,3}t{1,3}r{1,3}i{1,3}n{1,3}g{1,3}/i
>>
>> You get the idea. This could be quite burdensome to implement manually,
>> but an easy enough thing to automate 'behind the scenes'.
> 
> 
> However, if one were to do this with every body ruleset that exists,
> it quite possibly could crush the SA server, as it would multiply the amount
> of CPU used to do a match like that, quite possibly exponentially. [1]
> 
> If there was a way of optimizing the search (or at least only doing it
> on the subject of the mail, not the body) it wouldn't be a bad idea, but
> [[ as always with this type of
> measure/countermeasure/countercountermeasure war ]] as soon as it was 
> widespread, the spammers would stop this yet again, and move onto the 
> next useful (for them) obfuscation scheme... :-/
> 
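
For reference, the "behind the scenes" expansion quoted above could be
automated along these lines. This is only a sketch in Python. expand_repeats
is a hypothetical helper, not part of SpamAssassin, and the {1,3} bound is
simply the one from the quoted example.

```python
import re


def expand_repeats(phrase: str, max_repeat: int = 3) -> str:
    """Turn each letter or digit into 'x{1,N}'; escape everything else."""
    parts = []
    for ch in phrase:
        if ch.isalnum():
            # Allow the character to appear one to max_repeat times.
            parts.append(f"{ch}{{1,{max_repeat}}}")
        else:
            # Spaces and punctuation are matched literally.
            parts.append(re.escape(ch))
    return "".join(parts)


pattern = expand_repeats("a string")
print(pattern)
print(bool(re.search(pattern, "aa sttrinng", re.IGNORECASE)))
```

Whether the expanded rules are cheap enough to run against every body is the
open question raised above, and this sketch does not measure it.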

Would something like "excessive" instances of /(\w)\1/ work? Obviously 
such patterns are fairly common in regular English, but perhaps looking 
for an excessive quantity in an email could be an indication of the 
above problem.
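
One way to put a number on "excessive" is the fraction of words that contain
a doubled character. The sketch below is Python, not SpamAssassin code. The
function names and the 0.5 threshold are arbitrary assumptions and would need
tuning against real mail.

```python
import re

# Same idea as /(\w)\1/: any word character immediately repeated.
DOUBLED = re.compile(r"(\w)\1")


def doubled_ratio(text: str) -> float:
    """Fraction of whitespace-separated words containing a doubled character."""
    words = text.split()
    if not words:
        return 0.0
    hits = sum(1 for word in words if DOUBLED.search(word))
    return hits / len(words)


def looks_obfuscated(text: str, threshold: float = 0.5) -> bool:
    """Flag text whose doubled-character ratio exceeds the assumed threshold."""
    return doubled_ratio(text) > threshold


print(doubled_ratio("buy cheap pills"))
print(doubled_ratio("buuy cheeap piills"))
```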

Another possible solution might be to preprocess the mail with something
like: s/(\w)\1+/$1/g in order to cull out the crap.
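
A minimal Python sketch of that preprocessing step, assuming the intent is to
collapse every run of a repeated character throughout the text. collapse_runs
is a made-up name, not an existing SpamAssassin hook.

```python
import re

# A word character followed by one or more copies of itself.
RUNS = re.compile(r"(\w)\1+")


def collapse_runs(text: str) -> str:
    """Replace each run of a repeated word character with a single copy."""
    return RUNS.sub(r"\1", text)


print(collapse_runs("Viiaagraa"))
print(collapse_runs("free pills"))
```

Legitimate doubles get collapsed as well ("free pills" loses its "ee" and
"ll"), so any rule run afterwards would have to be written against the
collapsed spelling.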

But... like you said, it's an arms race. Fortunately, Bayes should eat up
the double-letter obfuscations...

--Rich






_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
