First, I want to give credit where credit is due. Jennifer Wheeler authored a set of rules to catch text being obfuscated by HTML and script tags. They are housed at http://spamhammers.nxtek.net/ and are moving (?) to http://www.merchantsoverseas.com/wwwroot/gorilla/sa_rules.htm. The long thread about Popcorn has been an exploration of both regex (thanks to everyone's input - especially yours) and methodology for attacking the problem.

> -----Original Message-----
> From: Keith C. Ivey
> Sent: Wednesday, October 15, 2003 8:16 PM
>
> Kai MacTane <[EMAIL PROTECTED]> wrote:
>
> > /[>\s]\w{1}<[-\w\s\$&!]{0,150}>\w{1}\W/
>
> Writing '{1}' serves no purpose. It means "exactly one of the
> preceding thing", but the thing by itself already matches
> exactly once if there's no quantifier. So you can write it
> this way:
>
> /[>\s]\w<[-\w\s\$&!]{0,150}>\w\W/
>
> I must admit I'm puzzled about why Larry wants to limit the
> pattern to having only one letter on each side of the angle-
> bracketed stuff.

That was only one example. There is a series of similar rules that expand the permutations of wordcharacter{1,5}<stuff>wordcharacters{1,7}, and the example above is one of those permutations. In that specific case the '{1}' may serve no purpose to the regex engine, but it makes the pattern easier to see among the other rules.

So the rules will catch stuff like:

I< junk1 > lo< junk2 >ve spa< junk3 >mas<junk4>sass<junk5>in.
Di<ju$nk>dn't<!-- junkpile -->yo<!-- junk&pile -->u alr<!-- junkpile /-->eady kno<!-- junkpile2 -->w th<!-- junkpile3 -->at?

Her methodology is excellent: spreading the match across many rules with low scores buffers against false positives, while messages littered with this obfuscation still get slammed. (Go Jennifer go!)

--Larry

_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
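
For anyone who wants to poke at the permutation idea outside of SpamAssassin, here is a minimal Python sketch. It is not Jennifer's rule set: the helper names build_rule and hits are made up for illustration, the 1-5 and 1-7 ranges are taken from the description above, and Python's \w is Unicode-aware by default, so it is slightly broader than Perl's.

```python
import re

# Angle-bracketed "junk" part, copied from the rule quoted in the thread.
JUNK = r"<[-\w\s\$&!]{0,150}>"


def build_rule(left, right):
    """Hypothetical helper: compile one permutation of the rule family,
    with exactly `left` word characters before the junk and `right` after."""
    return re.compile(r"[>\s]\w{%d}%s\w{%d}\W" % (left, JUNK, right))


def hits(text, max_left=5, max_right=7):
    """Return every (left, right) permutation that matches somewhere in text."""
    return [
        (m, n)
        for m in range(1, max_left + 1)
        for n in range(1, max_right + 1)
        if build_rule(m, n).search(text)
    ]


# Sample text taken from the message above.
SAMPLE = (
    "I< junk1 > lo< junk2 >ve spa< junk3 >mas<junk4>sass<junk5>in. "
    "Di<ju$nk>dn't<!-- junkpile -->yo<!-- junk&pile -->u "
    "alr<!-- junkpile /-->eady kno<!-- junkpile2 -->w th<!-- junkpile3 -->at?"
)

if __name__ == "__main__":
    print("obfuscated:", hits(SAMPLE))
    print("clean:     ", hits("I love spamassassin."))
```

Each permutation that hits would be a separate low-scoring rule in a real setup, so a heavily obfuscated message collects many small scores while a single stray match costs very little.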