Bob Kline <bkl...@rksystems.com> added the comment:
Ah, this is worse than I first thought. 2to3 is not just adding extra backslashes to regular expression strings. In that case, at least the regular expression engine ends up doing what the original code was asking the Python parser to do (unless user code checks for and enforces limits on regular expression string lengths, in which case even that is broken). It is also mangling ordinary strings in places where the change alters behavior (that is, breaks the code).

2to3 wants to change

    if c not in ".-_:\u00B7\u0e87":

to

    if c not in ".-_:\\u00B7\\u0e87":

Not the same thing at all, as illustrated here:

    $ python
    Python 3.7.3 (default, Jun 19 2019, 07:38:49)
    [Clang 10.0.1 (clang-1001.0.46.4)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>> len("\u00B7")
    1
    >>> len("\\u00B7")
    6
    >>>

That breaks the original code. This is a serious bug.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue37996>
_______________________________________
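For illustration, here is a minimal sketch of how the doubled backslashes change the membership test quoted in the comment, assuming the code runs under Python 3. The variable names `original` and `rewritten` are made up for this example; only the two string literals come from the report.

```python
# The literal as written in the original code: two non-ASCII characters
# (U+00B7 and U+0E87) follow the four ASCII punctuation characters.
original = ".-_:\u00B7\u0e87"

# The literal after the backslashes are doubled, as in the reported change.
# Each "\\uXXXX" is now six separate characters: a backslash, "u", and four
# hex digits.
rewritten = ".-_:\\u00B7\\u0e87"

print(len(original))    # 6
print(len(rewritten))   # 16

# The character the original check was written to accept is no longer found.
print("\u00B7" in original)    # True
print("\u00B7" in rewritten)   # False

# Characters that were never meant to be accepted now are.
print("u" in original)    # False
print("u" in rewritten)   # True
```

So the rewritten test both rejects the characters it used to accept and accepts a backslash, "u", and several digits and letters that it used to reject.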