>-----Original Message-----
>From: Sven Riedel [mailto:[EMAIL PROTECTED]
>Sent: Tuesday, June 07, 2005 8:58 AM
>To: users@spamassassin.apache.org
>Subject: Would a normalization plugin make sense?
>
>
>Hi,
>since a lot of spam nowadays tries to get past the filters
>by multiplying random letters, wouldn't it make sense to
>introduce normalization plugins to spamassassin?
>
>These would run over the mail once before the actual scanning
>starts, and perform transformations on the decoded mail body.
>
>Some functions I could think of off the top of my head would
>be:
>- reducing multiple consecutive instances of a letter to one
>occurrence of that letter
>
>- transforming HTML entities to their Latin-letter equivalents
>
>- removing all non-alphanumeric characters from the mail body
>
>This would require a new rule type (e.g. normalbody), to avoid
>breaking existing rulesets.
>
>Would this make sense? Can this be included in SpamAssassin, or
>are the current internals structured in a way that makes the
>introduction of such plugins hard/impossible?

Or one could do as Theo does and strip all HTML content from the emails. :)
If I didn't have such retarded starfish here, I would do it. 

The problem with normalization is the same as with anything else: one man's
ham is another's spam. Repeated letters show up in item codes, code snippets,
fubar'd uuencoding, etc.
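
A rough sketch of the three proposed transformations shows both sides. It is
written in Python for brevity, not as a SpamAssassin plugin (those are Perl),
and every function name here is made up for illustration. It also assumes
whitespace is kept when non-alphanumerics are removed, which the proposal
does not specify.

```python
import html
import re

def decode_entities(text):
    # Turn HTML entities such as &#105; or &amp; into plain characters.
    return html.unescape(text)

def strip_non_alphanumerics(text):
    # Assumption: keep ASCII letters, digits and whitespace, drop the rest.
    return re.sub(r"[^A-Za-z0-9\s]", "", text)

def collapse_repeats(text):
    # Reduce any run of the same letter (case-sensitive) to one occurrence.
    return re.sub(r"([A-Za-z])\1+", r"\1", text)

def normalize(text):
    # Hypothetical pipeline: entities first, then stripping, then collapsing.
    return collapse_repeats(strip_non_alphanumerics(decode_entities(text)))

print(normalize("V&#105;iiaaagra!!!"))   # Viagra
print(normalize("AAA-1002-BB"))          # A1002B
print(normalize("committee"))            # comite
```

The first line is the win the proposal is after. The other two are the cost:
an item code and an ordinary word both come out mangled, so any rule written
against the normalized body has to expect that.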

It would also void a lot of pre-existing rules that look for some of
these filter-bypassing tricks.

I always try to turn their attempts to bypass the filters into spam flags.
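
As a minimal sketch of that idea, again in Python and with made-up signal
names and thresholds (none of these are real SpamAssassin rules), the
obfuscation itself can be detected and reported instead of being normalized
away:

```python
import re

# Hypothetical signals; the patterns and thresholds are illustrative only.
SIGNALS = {
    # Four or more of the same letter in a row.
    "long_letter_run": re.compile(r"([A-Za-z])\1{3,}"),
    # A decimal numeric entity that encodes a plain letter (A-Z is 65-90, a-z is 97-122).
    "numeric_entity_letter": re.compile(
        r"&#(?:6[5-9]|[78][0-9]|90|9[7-9]|1[01][0-9]|12[0-2]);"
    ),
    # Single letters separated by punctuation, e.g. V.i.a.g.r.a
    "punct_between_letters": re.compile(r"\b(?:[A-Za-z][.\-_*]){3,}[A-Za-z]\b"),
}

def obfuscation_flags(body):
    # Return the sorted names of every signal that fires on the raw body.
    return sorted(name for name, pat in SIGNALS.items() if pat.search(body))

print(obfuscation_flags("Buy V.i.a.g.r.a nooooow"))
# ['long_letter_run', 'punct_between_letters']
print(obfuscation_flags("cl&#105;ck here"))
# ['numeric_entity_letter']
print(obfuscation_flags("Order AAA-1002-BB today"))
# []
```

Because the raw body is left untouched, existing rules keep working, and the
item code in the last line passes without tripping anything.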

Chris Santerre 
System Admin and SARE/URIBL Ninja
http://www.rulesemporium.com 
http://www.uribl.com
