New submission from Emanuel Barry:

The attached patch deprecates invalid escape sequences in unicode strings. 
The goal is to prevent issues such as #27356 (and possibly similar ones) in 
the future.

Without the patch:

>>> "hello \world"
'hello \\world'

With the patch:

>>> "hello \world"
DeprecationWarning: invalid escape sequence 'w'
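For anyone who wants to check the behaviour programmatically rather than at 
the prompt, here is a minimal sketch. It assumes the warning is raised at 
compile time and can be captured with the standard warnings module. The helper 
name compile_and_collect is made up for this example and is not part of the 
patch. The exact warning category and message text may differ between Python 
versions, so the sketch only reports what it sees.

```python
import warnings


def compile_and_collect(source):
    """Compile and evaluate `source`; return (value, warning categories).

    Hypothetical helper for illustration only, not part of the patch.
    """
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning, even ones that are ignored by default.
        warnings.simplefilter("always")
        code = compile(source, "<example>", "eval")
    return eval(code), [w.category for w in caught]


# The source text passed in is "hello \world" with a single backslash;
# it is written with a doubled backslash here so this file itself is clean.
value, categories = compile_and_collect('"hello \\world"')
print(repr(value))
print([c.__name__ for c in categories])
```

The resulting string is unchanged either way; only the warning is new.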

I'll need some help (patch isn't mergeable yet):

test_doctest fails on my machine with the patch applied (and -W), and I don't 
know how to fix it. test_ast fails an assertion (!PyErr_Occurred() in 
PyObject_Call in abstract.c) when -W is on; I don't know how to fix that 
either, or even what causes it.

Of course, I went ahead and fixed all instances of invalid escape sequences in 
the stdlib (that I could find) so that no DeprecationWarning is encountered.
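
As a rough illustration of what such a fix looks like (a generic sketch, not 
taken from the patch itself), an invalid escape can be rewritten either as a 
raw string or with a doubled backslash. Both spellings produce the same 
string, and neither relies on the deprecated behaviour:

```python
import re

# Raw string: the backslash is kept literally, no escape processing.
raw = r"hello \world"
# Doubled backslash: an explicit, valid escape for a single backslash.
doubled = "hello \\world"
assert raw == doubled

# Regular expressions are the most common case; a raw string avoids
# writing "\d" in a normal string literal, which is an invalid escape.
pattern = re.compile(r"\d+")
match = pattern.search("issue 27364")
print(match.group())
```

Raw strings are usually the more readable choice for regular expressions and 
Windows-style paths.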

Lastly, I considered doing the same for bytes, but I ran into issues with 
some invalid escapes such as \u, and _codecs.escape_decode would trigger the 
warning when passed br"\8" (for example). In the end I decided to leave bytes 
alone for now, since that is mostly on the lower-level side of things. If 
there's interest, I can add it back.

----------
components: Interpreter Core, Library (Lib), Unicode
files: deprecate_invalid_unicode_escapes.patch
keywords: patch
messages: 269022
nosy: ebarry, ezio.melotti, haypo
priority: normal
severity: normal
stage: patch review
status: open
title: Deprecate invalid unicode escape sequences
type: behavior
versions: Python 3.6
Added file: 
http://bugs.python.org/file43499/deprecate_invalid_unicode_escapes.patch

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue27364>
_______________________________________