Björn Lindqvist <bjou...@gmail.com> added the comment:

In my app, I need to transform the regexp created from user input so that Unicode characters and their ASCII equivalents match each other. For example, if someone searches for "el nino", that should match the string "el ñino". Similarly, searching for "el ñino" should match "el nino".

The code to transform the regexp looks like this:

    s = re.escape(user_input)
    s = re.sub(u'n|ñ', u'[n|ñ]', s)
    matches = list(re.findall(s, data, re.IGNORECASE|re.UNICODE))

It doesn't work because the ñ in user_input is escaped with a backslash. My workaround, to compensate for re.escape's too-eager escaping, is to escape the re.sub pattern as well:

    s = re.sub(u'\\\\n|\\\\ñ', u'[\\\\n|\\\\ñ]', s)

It works but is not very nice. It would have been much better if re.escape worked the way one would expect in the first place.

----------
nosy: +bjourne

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue2650>
_______________________________________
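
For illustration only, here is a minimal sketch of one way to sidestep the problem described in the comment: build the pattern one character at a time, so that the result does not depend on whether re.escape puts a backslash in front of ñ. This is not the reporter's code. The names EQUIVALENTS and build_pattern are hypothetical, and the equivalence table covers only the n/ñ pair from the example.

```python
import re

# Hypothetical equivalence table (an assumption, not from the report):
# each key maps to the set of characters that should match one another.
EQUIVALENTS = {"n": "nñ", "ñ": "nñ"}


def build_pattern(user_input):
    """Return a regexp source string that matches user_input literally,
    except that characters listed in EQUIVALENTS match any equivalent."""
    parts = []
    for ch in user_input:
        group = EQUIVALENTS.get(ch.lower())
        if group:
            # Emit a character class directly; nothing here needs escaping.
            parts.append("[" + group + "]")
        else:
            # Escape every other character on its own, so the class
            # brackets above are never touched by re.escape.
            parts.append(re.escape(ch))
    return "".join(parts)


data = "El ñino and el nino"
pattern = build_pattern("el nino")
matches = re.findall(pattern, data, re.IGNORECASE | re.UNICODE)
```

Because each character is escaped separately, no second substitution pass over an already-escaped string is needed, and searching for "el ñino" produces the same pattern as searching for "el nino".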