Bugs item #1591319, was opened at 2006-11-06 11:49
Message generated for change (Comment added) made by niemeyer
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1591319&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Regular Expressions
Group: Python 2.4
>Status: Closed
>Resolution: Works For Me
Priority: 5
Private: No
Submitted By: Thomas K. (tomek74)
Assigned to: Gustavo Niemeyer (niemeyer)
Summary: replace groups doesn't work in this special case

Initial Comment:
If you have a regular expression like this:
([0-9])([a-z])?
matching this string:
1 1a
and replacing with this:
yx
you get what is expected:
yx yx

BUT:
If you replace with this:
\1\2
you get nothing replaced, because the group \2 doesn't exist
for the string "1".
But it does exist for the string "1a"!

We have multiple possibilities here:
1.) The string "1" gives no result, because \2 doesn't exist.
The string "1a" gives a result, so the output should be: 1a
2.) The string "1" gives a result, because \2 is handled like
an empty string. The string "1a" gives a result, so the output
should be: 1 1a
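
Possibility 2 can be sketched with a replacement function instead of
a template string. This is only an illustration, not how the re
module handles templates internally; the names "pattern" and
"join_groups" are made up for the example. Because the function
checks for None itself, it does not depend on how a given Python
version treats unmatched groups in a template.

```python
import re

# Pattern from the report: group 2 is optional and may not participate.
pattern = re.compile(r"([0-9])([a-z])?")

def join_groups(match):
    # match.group(2) is None when the optional group did not participate;
    # fall back to an empty string in that case (possibility 2).
    return "<" + match.group(1) + (match.group(2) or "") + ">"

result, count = pattern.subn(join_groups, "1 1a")
print(result, count)
```

The angle brackets are only there to make each replacement visible in
the output.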


I think the case where the string "1" has no result, but
affects the string "1a", which would normally have a result,
is bad.

What are your thoughts on it?


Test code:
import re

# common variables

rawstr = r"""([0-9])([a-z])?"""
embedded_rawstr = r"""([0-9])([a-z])?"""
matchstr = """1 1a"""

# method 1: using a compile object
compile_obj = re.compile(rawstr)
match_obj = compile_obj.search(matchstr)

# method 2: using search function (w/ external flags)
match_obj = re.search(rawstr, matchstr)

# method 3: using search function (w/ embedded flags)
match_obj = re.search(embedded_rawstr, matchstr)

# Retrieve group(s) from match_obj
all_groups = match_obj.groups()

# Retrieve group(s) by index
group_1 = match_obj.group(1)
group_2 = match_obj.group(2)

# Replace string
newstr = compile_obj.subn('\1\2', 0)


----------------------------------------------------------------------

>Comment By: Gustavo Niemeyer (niemeyer)
Date: 2006-11-06 12:17

Message:
Logged In: YES 
user_id=7887

Hello Thomas,

I don't understand exactly what you mean here.

This doesn't work:

  >>> re.compile("([0-9])([a-z])?").subn(r"<\1\2>", "1 1a")
  Traceback (most recent call last):
  ...
  sre_constants.error: unmatched group

And this works fine:

  >>> re.compile("([0-9])([a-z]?)").subn(r"<\1\2>", "1 1a")
  ('<1> <1a>', 2)
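
The difference between the two patterns above is where the "?" sits.
A small sketch, with variable names chosen only for illustration:
when the "?" is outside the parentheses the whole group may be
skipped, so it has no value; when it is inside, the group always
takes part and can match an empty string.

```python
import re

optional_group = re.compile(r"([0-9])([a-z])?")  # "?" outside the group
optional_char = re.compile(r"([0-9])([a-z]?)")   # "?" inside the group

# Match the bare digit, where no letter follows.
groups_outside = optional_group.match("1").groups()
groups_inside = optional_char.match("1").groups()

print(groups_outside)  # ('1', None): group 2 did not participate
print(groups_inside)   # ('1', ''): group 2 matched the empty string
```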

The example code you provided doesn't run here, because 'subn()'
is being provided bad data (check
http://docs.python.org/lib/node46.html for docs). It's also being
passed '\1\2', which is really '\x01\x02', and won't do what you
want.
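
To illustrate both points, here is a minimal sketch of the
replacement step with a raw string and with the subject string as
the second argument to subn(). It uses the second pattern from
above, and the variable names are only for the example.

```python
import re

# Without the r prefix, "\1" and "\2" are octal escapes for the
# control characters 0x01 and 0x02, not group references.
plain = '\1\2'
raw = r'\1\2'
print(plain == '\x01\x02', len(plain), len(raw))

pattern = re.compile(r"([0-9])([a-z]?)")
matchstr = "1 1a"

# subn(repl, string): the second argument is the string to search.
newstr, count = pattern.subn(raw, matchstr)
print(repr(newstr), count)

# With the non-raw string, each match is replaced by the two
# control characters instead of the captured groups.
mangled, _ = pattern.subn(plain, matchstr)
print(repr(mangled))
```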


----------------------------------------------------------------------
