Bugs item #1500179, was opened at 2006-06-03 21:32
Message generated for change (Settings changed) made by blep
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1500179&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.

Category: Regular Expressions
Group: Python 2.4
>Status: Closed
>Resolution: Accepted
Priority: 5
Submitted By: Baptiste Lepilleur (blep)
Assigned to: Gustavo Niemeyer (niemeyer)
Summary: re.escape incorrectly escape literal.

Initial Comment:
Using Python 2.4.2.

Here is a small program excerpt that reproduces the issue (attached):
---
import re
literal = r'E:\prg\vc'
print 'Expected:', literal
print 'Actual:', re.sub('a', re.escape(literal), 'a' )
assert re.sub('a', re.escape(literal), 'a' ) == literal
---

And the output of the sample:
---
Expected: E:\prg\vc
Actual : E\:\prg\vc
Traceback (most recent call last):
  File "re_escape_bug.py", line 5, in ?
    assert re.sub('a', re.escape(literal), 'a' ) == literal
AssertionError
---

Looking at the regular expression syntax in the Python documentation,
I don't see why ':' is escaped as '\:'.

Baptiste.

----------------------------------------------------------------------

>Comment By: Baptiste Lepilleur (blep)
Date: 2006-06-03 23:45

Message:
Logged In: YES
user_id=196852

You are correct. However, the 'repl' string parameter is not a literal
string: it is interpreted. The correct way to escape a literal so that it
is preserved is literal.replace('\\','\\\\'), not re.escape(). This
prevents any interpretation of the repl pattern.

I believe this fact should be clearly stated in the documentation, as it
is not that obvious.

The following assertion passes:
---
import re
literal = r'e:\prg\vc\1'
assert re.sub( '(a+)', literal.replace('\\','\\\\'), 'aabac' ) == (literal+'b'+literal+'c')
---

In the above example, neither \v nor \1 is interpreted.

Regards,
Baptiste.

----------------------------------------------------------------------

Comment By: A.M. Kuchling (akuchling)
Date: 2006-06-03 22:27

Message:
Logged In: YES
user_id=11375

The assertion is wrong, I think.
The signature is re.sub(pattern, replacement, string), so the assertion is
replacing 'a' with re.escape(literal), which is obviously not going to
equal literal.

re.escape() puts a backslash in front of all non-alphanumeric characters;
':' is non-alphanumeric, so it will be escaped. The regex parser will
ignore unknown escapes, so \: is the same as : -- the redundant escaping
is harmless.

----------------------------------------------------------------------
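
The point made in the later comment, that the repl argument of re.sub() is interpreted and so needs its backslashes doubled, can be sketched as follows. This is a minimal illustration, not part of the original report: the helper name escape_replacement is hypothetical, and the callable form of repl is an alternative that the thread does not mention. It relies on re.sub() inserting the return value of a function replacement without processing escapes. The sketch is written so that it also runs unchanged on current Python 3.

```python
import re


def escape_replacement(literal):
    # Hypothetical helper (name not from the thread): double every
    # backslash so re.sub() inserts `literal` as plain text.
    return literal.replace('\\', '\\\\')


literal = r'e:\prg\vc\1'

# Approach 1: escape the backslashes, as suggested in the comment.
result_escaped = re.sub('(a+)', escape_replacement(literal), 'aabac')

# Approach 2 (assumption, not from the thread): pass a callable as repl;
# its return value is inserted verbatim, so no escaping is needed.
result_callable = re.sub('(a+)', lambda match: literal, 'aabac')

assert result_escaped == literal + 'b' + literal + 'c'
assert result_callable == result_escaped
```

Both forms leave \v and \1 untouched in the output. The callable form avoids the escaping step entirely, at the cost of one function call per match.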