[issue2650] re.escape should not escape underscore

Björn Lindqvist Sat, 12 Sep 2009 09:21:21 -0700

Björn Lindqvist <bjou...@gmail.com> added the comment:

In my app, I need to transform the regexp created from user input so
that it matches Unicode characters and their ASCII equivalents
interchangeably. For example, if someone searches for "el nino", that
should match the string "el ñino". Similarly, searching for "el ñino"
should match "el nino".


The code to transform the regexp looks like this:

s = re.escape(user_input)
s = re.sub(u'n|ñ', u'[nñ]', s)
matches = re.findall(s, data, re.IGNORECASE|re.UNICODE)

It doesn't work because the ñ in user_input is escaped with a
backslash. My workaround, to compensate for re.escape's too eager
escaping, is to escape the re.sub pattern as well:

s = re.sub(u'n|\\\\ñ', u'[nñ]', s)

It works but is not very nice. It would have been much better if
re.escape worked as one would expect in the first place.
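
One way to avoid depending on exactly which characters re.escape
chooses to escape is to build the pattern one character at a time, so
the substitution never has to search through already escaped text. The
following is only a minimal sketch in Python 3 syntax: the names
EQUIVALENTS and build_pattern are hypothetical, and the table lists
just the n/ñ pair from the example above.

```python
import re

# Hypothetical table: each character maps to the character class that
# should match it. Only the n/ñ pair from the example is listed.
EQUIVALENTS = {u'n': u'[nñ]', u'ñ': u'[nñ]'}

def build_pattern(user_input):
    """Build a regexp that treats the listed characters as interchangeable."""
    parts = []
    for ch in user_input:
        cls = EQUIVALENTS.get(ch.lower())
        if cls is not None:
            # Insert the character class directly; it is never escaped.
            parts.append(cls)
        else:
            # Everything else is escaped on its own, so whatever
            # re.escape does to it cannot interfere with the classes.
            parts.append(re.escape(ch))
    return u''.join(parts)

pattern = build_pattern(u'el nino')
matches = re.findall(pattern, u'El ñino and el nino',
                     re.IGNORECASE | re.UNICODE)
```

Because every character is handled separately, the result should not
depend on whether a given Python version escapes ñ or leaves it alone.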

----------
nosy: +bjourne

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue2650>
_______________________________________