New submission from Aaron Sherman <a...@ajs.com>:

I tested this under 2.6 and 3.1. Under both, a common mistake, which I'm sure 
many others have made and which cost me quite some time today, is:

 re.sub(r'(foo)bar', '\1baz', 'foobar')

It's obvious, I'm sure, to many reading this that the second "r" was left out 
before the replacement spec, so '\1' is read as an octal escape for the 
character U+0001 rather than as a backreference to group 1. It's probably 
obvious that this is going to happen quite a lot, and there are many edge 
cases which are equally baffling to the uninitiated (e.g. \8, \418 and \1111).
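
A minimal sketch of the failure, assuming a current Python 3 interpreter (the 
names "wrong" and "right" are only illustrative). The \8 case is left out 
because newer interpreters warn about that literal at compile time:

```python
import re

# Without the r prefix, '\1' is the single character U+0001,
# so re.sub never sees a backreference.
wrong = re.sub(r'(foo)bar', '\1baz', 'foobar')

# With the r prefix, the backslash reaches re.sub and \1 means group 1.
right = re.sub(r'(foo)bar', r'\1baz', 'foobar')

print(repr(wrong))    # '\x01baz'
print(repr(right))    # 'foobaz'

# Two of the edge cases: an octal escape stops at the first non-octal
# digit, or after three digits, and the rest is taken literally.
print(repr('\418'))   # '!8'  (\41 is '!', then the digit 8)
print(repr('\1111'))  # 'I1'  (\111 is 'I', then the digit 1)
```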

In order to avoid this, I'd like to request that such usage be deprecated, 
leaving only numeric escapes of the form matched by r'\\[0-7][0-7][0-7]?(?!\d)' 
as valid, non-deprecated uses (e.g. \01 or \111 are fine). Let's look at what 
that would do:

Right now, the standard library uses escape sequences with \n where n is a 
single digit in a handful of places like sndhdr.py and difflib.py. These are 
not widespread enough to consider this a common usage, but those few would 
have to change to add a leading zero before the digit.

OK, so the specific requested feature is that \xxx produces a warning where xxx 
is:

* any single digit or
* any invalid sequence of two or three digits (e.g. containing 8 or 9) or
* any sequence of 4 or more digits

... guiding the user to the more explicit \01, \x01 or, if they intended a 
literal backslash, the r notation.
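
The three rules above could be checked roughly as follows. This is only a 
sketch of the proposed rule, not of anything the compiler does today: the 
function name deprecated_digit_escapes is hypothetical, it works on the raw 
source text between the quotes of a non-raw literal, and it assumes Python 3.4 
or later for re.fullmatch.

```python
import re

# A backslash followed either by a run of digits or by any other character.
# Consuming the "other character" keeps an escaped backslash (\\) from being
# mistaken for the start of a digit escape.
ESCAPE = re.compile(r'\\(?:([0-9]+)|.)', re.DOTALL)

# The only digit escapes the proposal leaves alone: two or three octal digits.
VALID = re.compile(r'[0-7]{2,3}')

def deprecated_digit_escapes(literal_body):
    """Return the digit escapes in literal_body that the proposal would flag."""
    flagged = []
    for match in ESCAPE.finditer(literal_body):
        digits = match.group(1)
        if digits is not None and not VALID.fullmatch(digits):
            flagged.append('\\' + digits)
    return flagged

# Raw strings are used here only to spell out the source text being checked.
print(deprecated_digit_escapes(r'\1baz'))           # ['\\1']
print(deprecated_digit_escapes(r'\01 \111'))        # []
print(deprecated_digit_escapes(r'\8 \418 \1111'))   # ['\\8', '\\418', '\\1111']
```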

If you wish to go a step further, I'd suggest adding a no-op escape \e such 
that:

 \41\e1

would print "!1". Otherwise, there's no clean way to halt the interpretation of 
a digit-based escape sequence.
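
For comparison, these are the spellings I know of that produce "!1" today 
without \e. Whether any of them counts as clean is a judgment call; the 
variable names are only illustrative:

```python
padded = '\0411'     # three-digit octal escape \041, then the digit 1
hexed = '\x211'      # \x takes exactly two hex digits, then the digit 1
joined = '\41' '1'   # each literal is parsed on its own, then concatenated

print(padded, hexed, joined)
```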

----------
components: Regular Expressions, Unicode
messages: 103640
nosy: Aaron.Sherman
severity: normal
status: open
title: Backreferences vs. escapes: a silent failure solved
type: feature request
versions: Python 2.6, Python 3.1

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue8465>
_______________________________________
