On Sun, 26 Jul 2009, Karsten Bräckelmann wrote:
On Sun, 2009-07-26 at 17:19 +0200, Karsten Bräckelmann wrote:
On Sat, 2009-07-25 at 16:07 -0500, McDonald, Dan wrote:
... (?:c\s?o\s?m|n\s?e\s?t|o\s?r\s?g)[[:punct:]]?\b/i
^^^^^^^^^^^^
That part is superfluous. If it matches a punctuation char, its
optional variant (matching no char) will make the \b word boundary
match as well.
Crap, that's actually wrong. :/ There is exactly *one* char in the POSIX
punct char class, that also is a word char -- the underscore...
So that translates to "with an optional underscore after the TLD".
Sorry, my bad.
That's an inefficient and confusing way to deal with the fact that
underscore breaks \b, and doesn't cover all cases. I prefer:
Before text: \b_*
After text: _*\b
As in: /\b_*www_*\b/
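
For anyone who wants to poke at this, here is a rough sketch in Python rather than the Perl regexes used above. The sample strings, the PUNCT stand-in (Python's re has no [[:punct:]], so string.punctuation is assumed to be a close enough substitute for ASCII) and the matched() helper are all made up for illustration. It only tries to show two things: the optional punct class changes the outcome solely for a trailing underscore, and \b_* ... _*\b tolerates underscores on either side.

```python
import re
import string

# Assumption: string.punctuation approximates POSIX [[:punct:]] for ASCII.
PUNCT = "[" + re.escape(string.punctuation) + "]"

tld_plain = re.compile(r"com\b", re.I)                 # TLD followed by \b only
tld_punct = re.compile(r"com" + PUNCT + r"?\b", re.I)  # the form under discussion
www_plain = re.compile(r"\bwww\b", re.I)               # naive word-boundary match
www_padded = re.compile(r"\b_*www_*\b", re.I)          # the suggested form


def matched(pattern, text):
    """Hypothetical helper: return the matched text, or None if no match."""
    m = pattern.search(text)
    return m.group(0) if m else None


# Underscore is a word char, so there is no boundary between "m" and "_".
print(matched(tld_plain, "example.com_"))   # None
# The optional punct char eats the underscore, then \b matches at the end.
print(matched(tld_punct, "example.com_"))   # com_
# For other punctuation the optional char is not needed for \b to match.
print(matched(tld_punct, "example.com."))   # com

# Same problem in front of and behind the text.
print(matched(www_plain, "_www_.example.com"))   # None
print(matched(www_padded, "_www_.example.com"))  # _www_
print(matched(www_padded, "www.example.com"))    # www
```

Perl's \b treats underscore as a word character in the same way, so the behaviour should carry over, but check it against your actual rules before relying on it.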
--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
jhar...@impsec.org FALaholic #11174 pgpk -a jhar...@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79