On Wed, 15 Jul 2009, Karsten Bräckelmann wrote:
> body =~ /(?!www\.[a-z]{2,3}[0-9]{2,3}\.(com|net|org))
> ^^^^^^^^
> This is invalid.

Please ignore. I use a generator. To avoid needless discussion of its
syntax, here are the actual rules from my generated .cf file...
body LOC_09061901 /(?!www\.[a-z]{2,3}[0-9]{2,3}\.(?:com|net|org))
www(?:[^\.][^a-z0-9]{0,9}|\.[^a-z0-9]{1,9})(?:[a-z]{2,3}|pill|meds|shop)
[0-9]{2,3}(?:[^\.][^a-z0-9]{0,9}|\.[^a-z0-9]{1,9})
(?:c\s?o\s?m|n\s?e\s?t|o\s?r\s?g)/i
describe LOC_09061901 HWCNBL obfuscated xxx99 url in body
score LOC_09061901 6
body LOC_09061905 /
www(?:[^\.][^a-z0-9]{0,9}|\.[^a-z0-9]{1,9})(?:[a-z]{2,3}|pill|meds|shop)
[0-9]{2,3}(?:[^\.][^a-z0-9]{0,9}|\.[^a-z0-9]{1,9})
(?:c\s?o\s?m|n\s?e\s?t|o\s?r\s?g)/i
describe LOC_09061905 HWCNBL obfuscated xxx99 url in body
score LOC_09061905 1
Notice I've arranged the line-breaks in this mail to emphasize
that the two rules are identical except for the negative look-ahead.

> The RE, if properly used as a body rule, works for me -- both with and
> without the negative look-ahead exonerating a non-obfuscated URI.

Very mysterious..... I've retyped it here, and.www .te21. net
should end up registering on *both* rules, but for me, it only
triggers the second one.

Now that you have the absolutely live code, please test again. If it
works for you, would anyone care to venture why it wouldn't for me?
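
For anyone who wants to check the regex logic outside SpamAssassin, here is
a minimal sketch. It makes several assumptions: it uses Python's re module
rather than the Perl engine SpamAssassin runs on, it joins each rule onto a
single line, and the sample string is my reading of the retyped example
above. It does not reproduce any preprocessing SpamAssassin applies to body
text before rules run, so it only tests the patterns themselves.

```python
import re

# Shared part of both rules, copied from the .cf excerpt and joined into one line.
TAIL = (
    r"www(?:[^\.][^a-z0-9]{0,9}|\.[^a-z0-9]{1,9})(?:[a-z]{2,3}|pill|meds|shop)"
    r"[0-9]{2,3}(?:[^\.][^a-z0-9]{0,9}|\.[^a-z0-9]{1,9})"
    r"(?:c\s?o\s?m|n\s?e\s?t|o\s?r\s?g)"
)

# Negative look-ahead that is present only in LOC_09061901.
LOOKAHEAD = r"(?!www\.[a-z]{2,3}[0-9]{2,3}\.(?:com|net|org))"

RULES = {
    "LOC_09061901": re.compile(LOOKAHEAD + TAIL, re.IGNORECASE),
    "LOC_09061905": re.compile(TAIL, re.IGNORECASE),
}


def hits(text):
    """Return, per rule name, whether the pattern matches anywhere in text."""
    return {name: rx.search(text) is not None for name, rx in RULES.items()}


# Assumed sample: the obfuscated string retyped in this mail.
SAMPLE = "www .te21. net"
print(hits(SAMPLE))
```

In this sketch both patterns match the sample string, and neither matches
the plain form www.te21.net. If that also holds under Perl, the difference
I am seeing may come from somewhere other than the regex itself, but that
is a guess, not a diagnosis.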
- C