First, I want to give credit where credit is due.  Jennifer Wheeler authored
a set of rules to catch text being obfuscated by HTML and script tags.  They
are housed at http://spamhammers.nxtek.net/ and moving (?) to
http://www.merchantsoverseas.com/wwwroot/gorilla/sa_rules.htm.  The long
thread about Popcorn has been an exploration of regex (thanks to everyone's
input - especially yours) and of methodology for attacking the problem.


> -----Original Message-----
> From: Keith C. Ivey
> Sent: Wednesday, October 15, 2003 8:16 PM

> Kai MacTane <[EMAIL PROTECTED]> wrote:
> 
> > /[>\s]\w{1}<[-\w\s\$&!]{0,150}>\w{1}\W/
> 
> Writing '{1}' serves no purpose.  It means "exactly one of the 
> preceding thing", but the thing by itself already matches 
> exactly once if there's no quantifier.  So you can write it 
> this way:
> 
>    /[>\s]\w<[-\w\s\$&!]{0,150}>\w\W/
> 
> I must admit I'm puzzled about why Larry wants to limit the 
> pattern to having only one letter on each side of the angle-
> bracketed stuff.
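
A quick way to check Keith's point is to run both forms of the pattern over
the same text and compare the matches.  This is only a minimal sketch: it
uses Python's re module rather than Perl (an assumption that the two regex
flavours agree for these patterns), and the sample strings are made up for
illustration.

```python
import re

# The two patterns from the quoted message, as Python raw strings.
# Assumption: Python's `re` treats these the same way Perl does.
WITH_QUANTIFIER = re.compile(r"[>\s]\w{1}<[-\w\s\$&!]{0,150}>\w{1}\W")
WITHOUT_QUANTIFIER = re.compile(r"[>\s]\w<[-\w\s\$&!]{0,150}>\w\W")

# Made-up sample strings, not taken from any real message.
SAMPLES = [
    "Buy a<junk>b now",
    "no markup here",
    "x >c<!-- hidden -->d!",
]


def spans(pattern, text):
    """Return the (start, end) span of every match of pattern in text."""
    return [m.span() for m in pattern.finditer(text)]


def same_matches(text):
    """True if both forms of the pattern match exactly the same spans."""
    return spans(WITH_QUANTIFIER, text) == spans(WITHOUT_QUANTIFIER, text)


if __name__ == "__main__":
    for sample in SAMPLES:
        print(repr(sample), same_matches(sample))
```

If the two forms ever disagreed on a sample, same_matches() would return
False for it.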

That was only one example; there is a series of similar rules that expand
the permutations of wordcharacter{1,5}<stuff>wordcharacters{1,7}, and the
above example is one of those permutations.  In this specific case, even
though the '{1}' may not serve a purpose, it makes the pattern easier to see
amongst the other rules.  So the rules will catch stuff like:

I< junk1 > lo< junk2 >ve spa< junk3 >mas<junk4>sass<junk5>in.
Di<ju$nk>dn't<!-- junkpile -->yo<!-- junk&pile -->u alr<!-- junkpile
/-->eady kno<!-- junkpile2 -->w th<!-- junkpile3 -->at?
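
To make the permutation idea concrete, here is a rough sketch that builds
one pattern per (letters-before, letters-after) pair and reports which pairs
fire on the first sample line above.  It is not Jennifer's actual rule set:
the build_rule() and rules_fired() helpers are hypothetical, the patterns
are simply the quoted one with the two counts varied, and Python's re module
stands in for Perl.

```python
import re

# Character class for the "stuff" between the angle brackets, copied from
# the pattern quoted earlier in the thread.
TAG_BODY = r"<[-\w\s\$&!]{0,150}>"


def build_rule(before, after):
    """Hypothetical helper: one permutation with exact letter counts."""
    return re.compile(
        r"[>\s]\w{%d}%s\w{%d}\W" % (before, TAG_BODY, after)
    )


# One pattern for every pair in wordcharacter{1,5}<stuff>wordcharacters{1,7}.
RULES = {
    (before, after): build_rule(before, after)
    for before in range(1, 6)
    for after in range(1, 8)
}


def rules_fired(text):
    """Return the set of (before, after) pairs whose pattern matches text."""
    return {pair for pair, rule in RULES.items() if rule.search(text)}


SAMPLE = "I< junk1 > lo< junk2 >ve spa< junk3 >mas<junk4>sass<junk5>in."

if __name__ == "__main__":
    for pair in sorted(rules_fired(SAMPLE)):
        print(pair)
```

Each distinct pair that matches counts as a separate rule hit, so text that
is chopped up in several different ways trips several rules at once.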

Her methodology is excellent in that using many rules with low scores
buffers against false positives, while messages littered with the
obfuscation get slammed. (Go Jennifer go!)
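
A back-of-the-envelope sketch of why that works, assuming each rule adds its
score once when it fires.  The per-rule score and the threshold below are
made-up numbers for illustration, not values from Jennifer's rules or from
any particular SpamAssassin configuration.

```python
# Hypothetical numbers, chosen only to illustrate the idea.
HYPOTHETICAL_SCORE_PER_RULE = 0.5
HYPOTHETICAL_THRESHOLD = 5.0


def total_score(rules_fired):
    """Score contributed by the obfuscation rules alone."""
    return rules_fired * HYPOTHETICAL_SCORE_PER_RULE


def is_flagged(rules_fired):
    """True if the obfuscation rules alone reach the threshold."""
    return total_score(rules_fired) >= HYPOTHETICAL_THRESHOLD


if __name__ == "__main__":
    # A legitimate message that happens to trip one rule vs. a message
    # littered with obfuscation that trips a dozen.
    for count in (1, 12):
        print(count, total_score(count), is_flagged(count))
```

With these numbers a single accidental hit costs very little, while a dozen
hits push the message over the line on their own.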



--Larry



_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
