New submission from Serhiy Storchaka:

Currently re.split does not split on zero-width matches. There are a number of 
issues about this (issue852532, issue988761, issue3262, issue22817). This is 
definitely a bug, but fixing it will likely break existing code that uses 
regular expressions which can match an empty string (e.g. re.split('(:*)', 'ab')).

I propose to deprecate splitting on regular expressions that can match an 
empty string. Such expressions either do not work as expected at all (r'\b' 
never splits) or can be rewritten so that they do not match an empty string 
('(:*)' to '(:+)').
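
To illustrate the rewrite, here is a minimal sketch. The helper name
has_zero_width_match is hypothetical (it is not part of the re module or of the
attached patch); it only reports whether a pattern produces an empty match
somewhere in one given string. The split calls use only the rewritten pattern,
which should behave the same regardless of how zero-width matches are handled.

```python
import re

def has_zero_width_match(pattern, text):
    """Hypothetical helper: True if pattern matches an empty string in text."""
    return any(m.start() == m.end() for m in re.finditer(pattern, text))

# '(:*)' can match the empty string, so it would be affected by the proposal.
risky = has_zero_width_match(r'(:*)', 'ab')

# r'\b' is always zero-width, so it is affected as well.
boundary = has_zero_width_match(r'\b', 'a b')

# '(:+)' needs at least one colon, so it never produces an empty match.
rewritten = has_zero_width_match(r'(:+)', 'a::b')

# Splitting with the rewritten pattern keeps the captured delimiter.
parts = re.split(r'(:+)', 'a::b')

# With no colon in the input, the string comes back unsplit.
unsplit = re.split(r'(:+)', 'ab')
```

The check is per input string, so it is a heuristic for auditing existing
calls rather than a proof that a pattern can never match empty.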

In the next release (3.6) we can convert the deprecation warning to an 
exception, and then, after a transitional period, change the behavior to handle 
zero-width matches more correctly without breaking backward compatibility.
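
Code that wants to find affected calls before the warning becomes an exception
could escalate warnings itself. The sketch below assumes nothing about the
warning category the patch uses, so it turns every warning raised during the
call into an error. The name strict_split is hypothetical.

```python
import re
import warnings

def strict_split(pattern, text, maxsplit=0):
    """Hypothetical wrapper: run re.split with all warnings raised as errors."""
    with warnings.catch_warnings():
        # Assumption: the category is unknown here, so escalate everything.
        warnings.simplefilter("error")
        return re.split(pattern, text, maxsplit=maxsplit)

# A pattern that cannot match the empty string passes through unchanged.
fields = strict_split(r'(:+)', 'a::b:c')
```

On an interpreter that warns about possibly empty matches, calling
strict_split with a pattern such as '(:*)' would raise instead of warning.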

----------
components: Extension Modules, Regular Expressions
files: re_deprecate_split_zero_width.patch
keywords: patch
messages: 230843
nosy: ezio.melotti, mrabarnett, pitrou, serhiy.storchaka
priority: normal
severity: normal
stage: patch review
status: open
title: Deprecate splitting on possible zero-width re patterns
type: behavior
versions: Python 3.5
Added file: http://bugs.python.org/file37148/re_deprecate_split_zero_width.patch

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue22818>
_______________________________________