unexpected wrote:
> > \b matches the beginning/end of a word (characters a-zA-Z_0-9).
> > So that regex will match e.g. MULTX-FOO but not MULTX-.
> >
> > So is there a way to get \b to include - ?

No, but you can get the behaviour you want using a negative lookahead. The following pattern behaves like the end-of-word side of \b with - treated as a word character:

pattern = r"(?![a-zA-Z0-9_-])"

It is a zero-width assertion: it succeeds at any position where the next character is not in the class [a-zA-Z0-9_-] (or where the string ends), and it does not consume that character. The start-of-word side can be handled the same way with the negative lookbehind (?<![a-zA-Z0-9_-]). For example:

>>> import re
>>> p = re.compile(r".*?(?![a-zA-Z0-9_-])(.*)")
>>> s = "aabbcc_d-f-.XXX YYY"
>>> m = p.search(s)
>>> print(m.group(1))
.XXX YYY

Note that the regex recognises the '.' as the end of the word but doesn't use it up in the match, so it is present in the final capturing group. Contrast it with:

>>> p = re.compile(r".*?[^a-zA-Z0-9_-](.*)")
>>> s = "aabbcc_d-f-.XXX YYY"
>>> m = p.search(s)
>>> print(m.group(1))
XXX YYY

Here "[^a-zA-Z0-9_-]" still marks the end of the word, but this time it consumes the character, so the '.' doesn't appear in the captured group.
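
To apply this to the token from the question, here is a minimal sketch. It assumes the goal is to match a literal such as MULTX- only when it is not glued to other word characters, with - counted as a word character. The original regex was not shown, so the names WORD_CHARS and whole_word are hypothetical and used only for illustration.

```python
import re

# Characters treated as part of a word: the usual \w set plus the hyphen.
# (Assumption: ASCII-only words, as in the class used in the post.)
WORD_CHARS = r"[a-zA-Z0-9_-]"


def whole_word(literal):
    """Hypothetical helper: compile a pattern that matches `literal` only
    when it is not preceded or followed by a character in WORD_CHARS."""
    return re.compile(
        r"(?<!%s)%s(?!%s)" % (WORD_CHARS, re.escape(literal), WORD_CHARS)
    )


p = whole_word("MULTX-")
print(bool(p.search("see MULTX- here")))     # True: token stands alone
print(bool(p.search("see MULTX-FOO here")))  # False: part of a longer word
print(bool(p.search("MULTX-")))              # True: string edges count as boundaries
```

Both assertions are zero-width, so the surrounding characters stay available to the rest of the pattern, just as the '.' did in the first example above.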