New submission from Χρήστος Γεωργίου (Christos Georgiou) 
<t...@users.sourceforge.net>:

This is based on this Stack Overflow answer: 
http://stackoverflow.com/questions/3957164/3963443#3963443. It also applies to 
Python 2.6.

Searching for a regular expression that satisfies the mentioned SO question (a 
regular expression that matches strings with an initial A and/or final Z and 
returns everything except said initial A and final Z), I discovered something 
that I consider a bug. I've tried to thoroughly verify that this is not a 
PEBCAK before reporting the issue here.

Given:

>>> import re
>>> text= 'A***Z'

then:

>>> re.compile('(?<=^A).*(?=Z$)').search(text).group(0) # regex_1
'***'
>>> re.compile('(?<=^A).*').search(text).group(0) # regex_2
'***Z'
>>> re.compile('.*(?=Z$)').search(text).group(0) # regex_3
'A***'
>>> re.compile('(?<=^A).*(?=Z$)|(?<=^A).*').search(text).group(0) # regex_1|regex_2
'***'
>>> re.compile('(?<=^A).*(?=Z$)|.*(?=Z$)').search(text).group(0) # regex_1|regex_3
'A***'
>>> re.compile('(?<=^A).*|.*(?=Z$)').search(text).group(0) # regex_2|regex_3
'A***'
>>> re.compile('(?<=^A).*(?=Z$)|(?<=^A).*|.*(?=Z$)').search(text).group(0) # regex_1|regex_2|regex_3
'A***'

regex_1 returns '***'. Based on the documentation 
(http://docs.python.org/py3k/library/re.html#regular-expression-syntax), I 
assert that, likewise, '***' should be returned by:

regex_1|regex_2
regex_1|regex_3
regex_1|regex_2|regex_3

And yet regex_3 (".*(?=Z$)") seems to take precedence over both regex_1 and 
regex_2, even though it is the last alternative.
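
For anyone reproducing this, a small diagnostic sketch that reports where each match starts as well as what it matched. The helper name match_span and the patterns dictionary are illustrative only and not part of any library. Since re.search is documented as returning the first location in the string where the pattern produces a match, the start offsets seem worth comparing alongside the matched text.

```python
import re

text = 'A***Z'

# The three patterns from the report.
regex_1 = r'(?<=^A).*(?=Z$)'
regex_2 = r'(?<=^A).*'
regex_3 = r'.*(?=Z$)'

# Every single pattern and every alternation tried in the session above.
patterns = {
    'regex_1': regex_1,
    'regex_2': regex_2,
    'regex_3': regex_3,
    'regex_1|regex_2': '|'.join([regex_1, regex_2]),
    'regex_1|regex_3': '|'.join([regex_1, regex_3]),
    'regex_2|regex_3': '|'.join([regex_2, regex_3]),
    'regex_1|regex_2|regex_3': '|'.join([regex_1, regex_2, regex_3]),
}


def match_span(pattern, string):
    """Return ((start, end), matched_text) for the first match, or None."""
    m = re.search(pattern, string)
    if m is None:
        return None
    return m.span(), m.group(0)


results = {name: match_span(p, text) for name, p in patterns.items()}
for name, result in results.items():
    print(name, result)
```

On 'A***Z', regex_1 and regex_2 can only match starting at offset 1 (the lookbehind needs the A before them), while regex_3 matches starting at offset 0, so the spans differ in their start position and not only in their text.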

The behaviour is the same even if I substitute "(?:regex_n)" for every 
"regex_n", so it's not a matter of operator precedence.
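
The grouping check can be sketched as follows. The helper name alternation is illustrative only; it simply joins the alternatives with "|", optionally wrapping each one in a non-capturing group first.

```python
import re

text = 'A***Z'

# regex_1, regex_2 and regex_3 from the report, in that order.
parts = [r'(?<=^A).*(?=Z$)', r'(?<=^A).*', r'.*(?=Z$)']


def alternation(alternatives, grouped):
    """Join alternatives with '|', wrapping each in (?:...) if requested."""
    if grouped:
        alternatives = ['(?:%s)' % a for a in alternatives]
    return '|'.join(alternatives)


# Search once with bare alternatives and once with each one grouped.
plain = re.search(alternation(parts, grouped=False), text).group(0)
wrapped = re.search(alternation(parts, grouped=True), text).group(0)

print(repr(plain), repr(wrapped))
```

Both searches return the same text, which is why grouping does not appear to be the explanation.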

I really hope that this is a PEBCAK; if it is, I apologize for any time 
anyone loses on the issue, but I really don't think it is.

----------
components: Regular Expressions
messages: 119088
nosy: tzot
priority: normal
severity: normal
status: open
title: regex A|B : both A and B match, but B is wrongly preferred
type: behavior
versions: Python 3.1

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue10139>
_______________________________________