Bugs item #1611131, was opened at 2006-12-07 23:44 Message generated for change (Tracker Item Submitted) made by Item Submitter You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1611131&group_id=5470
Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: Regular Expressions Group: Python 2.5 Status: Open Resolution: None Priority: 5 Private: No Submitted By: akaihola (akaihola) Assigned to: Gustavo Niemeyer (niemeyer) Summary: \b in unicode regex gives strange results Initial Comment: The problem: This doesn't give a match: >>> re.match(r'ä\b', 'ä ', re.UNICODE) This works ok and gives a match: >>> re.match(r'.\b', 'ä ', re.UNICODE) Both of these work as well: >>> re.match(r'a\b', 'a ', re.UNICODE) >>> re.match(r'.\b', 'a ', re.UNICODE) Docs say \b is defined as an empty string between \w and \W. These do match accordingly: >>> re.match(r'\w', 'ä', re.UNICODE) >>> re.match(r'\w', 'a', re.UNICODE) >>> re.match(r'\W', ' ', re.UNICODE) So something strange happens in my first example, and I can't help but assume it's a bug. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1611131&group_id=5470 _______________________________________________ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com