New submission from Serhiy Storchaka <storchaka+cpyt...@gmail.com>:

>>> re.findall(r'^|\w+', 'two words')
['', 'wo', 'words']
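
To see where the 'wo' comes from, it can help to look at match spans rather than matched strings. The following is only a minimal sketch; the helper name match_spans is hypothetical and not part of the re module. On 2.7 and 3.6 the empty match at position 0 makes the scanner resume at position 1, so the second span starts at 1 and the 't' is skipped. With the 3.7 change the second span should start at 0 instead.

```python
import re
import sys


def match_spans(pattern, text):
    """Return (start, end, matched_text) for every match re.finditer() reports.

    Hypothetical helper for illustration; it only wraps re.finditer().
    """
    return [(m.start(), m.end(), m.group()) for m in re.finditer(pattern, text)]


if __name__ == "__main__":
    # Print the interpreter version next to the spans so that runs on
    # different Python versions can be compared side by side.
    print(sys.version_info[:2], match_spans(r'^|\w+', 'two words'))
```

Running the same script under each interpreter makes the difference in the start of the second span visible directly.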

Seems the current behavior was documented incorrectly in issue732120. It will be fixed in 3.7 (see issue1647489, issue25054), but I hesitate to backport the fix to 3.6 and 2.7 because this can break user code. For example:

In Python 3.6:

>>> list(re.finditer(r'(?m)^\s*?$', 'foo\n\n\nbar'))
[<_sre.SRE_Match object; span=(4, 4), match=''>, <_sre.SRE_Match object; span=(5, 5), match=''>]

In Python 3.7:

>>> list(re.finditer(r'(?m)^\s*?$', 'foo\n\n\nbar'))
[<re.Match object; span=(4, 4), match=''>, <re.Match object; span=(4, 5), match='\n'>, <re.Match object; span=(5, 5), match=''>]

(This is a real pattern used in the docstring module, but with re.sub().)

The proposed PR documents the current weird behavior in 2.7 and 3.6.

----------
assignee: docs@python
components: Documentation, Regular Expressions
messages: 307546
nosy: docs@python, ezio.melotti, mrabarnett, rhettinger, serhiy.storchaka
priority: normal
severity: normal
status: open
title: Document the bug in re.findall() and re.finditer() in 2.7 and 3.6
type: enhancement
versions: Python 2.7, Python 3.6

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue32211>
_______________________________________