On Sat, 20 Nov 2010, David B Funk wrote:
The idea was that almost all legit 3-character HTML tags, such as '<div>', contained at least one of those letters ([dpry]). So a purported tag that had none of them was not legit and thus probably bogus spammer spoor. With the evolution of HTML (XML, etc.) that's no longer a safe assumption, so that rule probably FPs.
The presence of multiple empty tag pairs might still be useful...

Off the top of my head and untested:

  rawbody  __EMPTY_HTML_TAG  m,<([a-z]+)></\1>,i
  tflags   __EMPTY_HTML_TAG  multiple
  meta     MANY_EMPTY_TAGS   __EMPTY_HTML_TAG > 9

This might already be a rule, I didn't look.

-- 
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
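
For anyone who wants to check what the pattern in the proposed rule would count before trying it in SpamAssassin, here is a rough Python sketch of the same logic. It is only an illustration: the function names and the sample input are made up for this example, and it scans the whole body as one string, whereas SpamAssassin applies rawbody rules to the body in chunks, so real hit counts may differ.

```python
import re

# Same pattern as the proposed rawbody rule: an opening tag with no
# attributes, immediately followed by its matching closing tag.
EMPTY_TAG_PAIR = re.compile(r"<([a-z]+)></\1>", re.IGNORECASE)


def count_empty_tag_pairs(raw_body: str) -> int:
    """Count non-overlapping empty tag pairs (hypothetical helper)."""
    return sum(1 for _ in EMPTY_TAG_PAIR.finditer(raw_body))


def many_empty_tags(raw_body: str, threshold: int = 9) -> bool:
    """Mirror the meta rule: fire when there are more than `threshold` hits."""
    return count_empty_tag_pairs(raw_body) > threshold


if __name__ == "__main__":
    sample = "<p>Hello</p>" + "<b></b><i></i>" * 5  # made-up test input
    print(count_empty_tag_pairs(sample), many_empty_tags(sample))
```

Note that a tag pair with any content or attributes, such as '<b>text</b>' or '<a href="x"></a>', is not counted, which matches what the regex in the rule does.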