Bob Kline <bkl...@rksystems.com> added the comment:

Ah, this is worse than I first thought. 2to3 is not just adding extra
backslashes to regular expression strings. In that case, at least the regular
expression engine will do what the original code was asking the Python parser
to do (unless user code checks for and enforces limits on regular expression
string lengths, so even that case can break). 2to3 is also mangling ordinary
strings, where the extra backslashes change the behavior (that is, break it).
2to3 wants to change

    if c not in ".-_:\u00B7\u0e87":

to

    if c not in ".-_:\\u00B7\\u0e87":

Not the same thing at all, as illustrated here:

$ python
Python 3.7.3 (default, Jun 19 2019, 07:38:49)
[Clang 10.0.1 (clang-1001.0.46.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> len("\u00B7")
1
>>> len("\\u00B7")
6
>>>

That breaks the original code. This is a serious bug.
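
To make the difference concrete, here is a minimal sketch of how the two
literals behave, both in a membership test and as a regular expression. The
constant and function names are made up for this example and are not taken
from the affected code; the regex part assumes Python 3.3 or later, where the
re module understands \uXXXX escapes in str patterns.

```python
import re

# Hypothetical names for illustration only.
ORIGINAL = ".-_:\u00B7\u0e87"      # what the source code said
REWRITTEN = ".-_:\\u00B7\\u0e87"   # what 2to3 wants to turn it into


def rejected_by_original(c):
    """Mirror of the original test: True if c is not an allowed character."""
    return c not in ORIGINAL


def rejected_by_rewritten(c):
    """Same test against the string produced by the 2to3 rewrite."""
    return c not in REWRITTEN


middle_dot = "\u00B7"

# The rewritten literal holds backslash, 'u' and hex digits instead of the
# two non-ASCII characters, so the strings differ in length and content.
print(len(ORIGINAL), len(REWRITTEN))

# MIDDLE DOT was allowed before the rewrite and is rejected after it.
print(rejected_by_original(middle_dot), rejected_by_rewritten(middle_dot))

# Conversely, a plain "u" was rejected before and is allowed after.
print(rejected_by_original("u"), rejected_by_rewritten("u"))

# As a regex pattern, the doubled backslash is resolved by the re module,
# so both forms still match MIDDLE DOT even though their lengths differ.
print(bool(re.fullmatch("\u00B7", middle_dot)),
      bool(re.fullmatch("\\u00B7", middle_dot)))
```

The last line is the one case where the rewrite is harmless; the membership
tests are where the meaning changes.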

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue37996>
_______________________________________