[issue32211] Document the bug in re.findall() and re.finditer() in 2.7 and 3.6

Serhiy Storchaka Mon, 04 Dec 2017 02:08:30 -0800

New submission from Serhiy Storchaka:

>>> re.findall(r'^|\w+', 'two words')
['', 'wo', 'words']


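Note the missing "t" in the second item: the character following an empty
match is not included in the next match. It seems the current behavior was
documented incorrectly in issue732120.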
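
The result above is consistent with a scanner that, after an empty match,
resumes the search one character further on. A minimal sketch of that
stepping rule follows. finditer_legacy is a hypothetical helper written for
illustration only; it approximates the 2.7/3.6 behavior and is not the actual
implementation in the re module:

```python
import re

def finditer_legacy(pattern, string):
    """Yield matches using the pre-3.7 stepping rule (illustrative helper)."""
    regex = re.compile(pattern)
    pos = 0
    while pos <= len(string):
        match = regex.search(string, pos)
        if match is None:
            break
        yield match
        # After an empty match the old scanner skipped one character,
        # so that character could never start the next match.
        if match.start() == match.end():
            pos = match.end() + 1
        else:
            pos = match.end()

legacy = [m.group() for m in finditer_legacy(r'^|\w+', 'two words')]
print(legacy)  # ['', 'wo', 'words']
```

On 3.7 and later, re.findall(r'^|\w+', 'two words') returns
['', 'two', 'words'] instead, because a non-empty match may start where an
empty match was just found.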

It will be fixed in 3.7 (see issue1647489, issue25054), but I hesitate to 
backport the fix to 3.6 and 2.7 because it could break user code. For 
example:

In Python 3.6:

>>> list(re.finditer(r'(?m)^\s*?$', 'foo\n\n\nbar'))
[<_sre.SRE_Match object; span=(4, 4), match=''>, <_sre.SRE_Match object; span=(5, 5), match=''>]

In Python 3.7:

>>> list(re.finditer(r'(?m)^\s*?$', 'foo\n\n\nbar'))
[<re.Match object; span=(4, 4), match=''>, <re.Match object; span=(4, 5), match='\n'>, <re.Match object; span=(5, 5), match=''>]

(This is a real pattern used in the doctest module, but with re.sub()).
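
To illustrate why the extra non-empty match matters for re.sub(), here is a
small sketch for 3.7 or later. The names lazy and strict are made up for this
example, and the strict pattern is an assumption of mine (it is not taken from
this issue): it requires at least one non-newline whitespace character, so it
can never produce an empty match and should not depend on how empty matches
are stepped over.

```python
import re

text = 'foo\n\n\nbar'

# Pattern quoted in the issue; it can match the empty string.
lazy = re.compile(r'(?m)^\s*?$')

# Hypothetical alternative (an assumption, not from the issue): matches only
# lines made of one or more non-newline whitespace characters.
strict = re.compile(r'(?m)^[^\S\n]+$')

# Spans reported by the running interpreter; 3.7+ includes (4, 5).
spans = [m.span() for m in lazy.finditer(text)]

# With an empty replacement, the non-empty match of '\n' removes a line break.
lazy_result = lazy.sub('', text)

# The strict pattern strips whitespace-only lines but keeps every line break.
strict_result = strict.sub('', 'foo\n \n\t\nbar')

print(spans)
print(repr(lazy_result))
print(repr(strict_result))
```

On 3.7 and later the lazy pattern turns three line breaks into two, which is
the kind of change that could surprise existing code.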

The proposed PR documents the current weird behavior in 2.7 and 3.6.

----------
assignee: docs@python
components: Documentation, Regular Expressions
messages: 307546
nosy: docs@python, ezio.melotti, mrabarnett, rhettinger, serhiy.storchaka
priority: normal
severity: normal
status: open
title: Document the bug in re.findall() and re.finditer() in 2.7 and 3.6
type: enhancement
versions: Python 2.7, Python 3.6

_______________________________________
Python tracker
<https://bugs.python.org/issue32211>
_______________________________________