Serhiy Storchaka <storch...@gmail.com> added the comment:
This is both a crasher and a leaker. When Python does not crash, there is
garbage (i.e. leaked data) at the end of the decoded string. In fact, I see
English text in some versions of Python.
There are many other errors in the utf-16 decoder (see, for example,
b'\xD8\x00\xDC'.decode('utf-16be')). I'm now finishing work on a new decoder,
and after that I will take up the bug fixing in 3.2.
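
For reference, a minimal sketch of how a well-behaved strict utf-16be decoder is expected to treat the inputs mentioned here. The helper name try_decode and the sample labels are hypothetical and were not part of the original report. On an affected interpreter the results may differ.

```python
def try_decode(data, encoding="utf-16be", errors="strict"):
    """Return ('ok', text) on success or ('error', exception name) on failure."""
    try:
        return ("ok", data.decode(encoding, errors))
    except UnicodeDecodeError as exc:
        return ("error", type(exc).__name__)


# Hypothetical sample inputs built from the byte strings in this comment.
samples = {
    "valid surrogate pair": b"\xD8\x00\xDC\x00",  # encodes U+10000
    "truncated pair": b"\xD8\x00\xDC",            # second code unit cut short
    "lone low surrogate": b"\xDC\x00",            # no preceding high surrogate
}

for label, data in samples.items():
    # ascii() keeps the output printable on any console encoding.
    print(label, ascii(try_decode(data)))
```

In strict mode, only the first sample should decode. The other two should be rejected with UnicodeDecodeError rather than decoded silently.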
----------
Added file: http://bugs.python.org/file25276/utf16crasher.py
_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue14579>
_______________________________________
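# k is the number of str code units used by one non-BMP character
# (2 on narrow builds, 1 on wide builds).
k = len(b'\x00\x01\x00\x00'.decode('utf-32be'))
for i in range(1000):
    # i valid surrogate pairs, then a lone low surrogate, then '>>'.
    data = b'\xD8\x00\xDC\x00' * i + b'\xDC\x00' + b'\x00>' * 2
    # A correct decoder should leave only '>>' after the first i characters.
    print(i, ascii(data.decode('utf-16be', 'ignore')[i * k:]))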
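
A self-checking variant of the script above may be easier to run unattended. This is only a sketch: the name leaked_tail is hypothetical, and it assumes that a correct decoder with the 'ignore' handler drops the lone low surrogate and leaves exactly '>>' after the i decoded characters. On an affected interpreter it may crash instead of reporting.

```python
def leaked_tail(i, k=1):
    """Decode the crafted input and return everything after the first i characters."""
    # i valid surrogate pairs, one lone low surrogate, then '>>' in utf-16be.
    data = b"\xD8\x00\xDC\x00" * i + b"\xDC\x00" + b"\x00>" * 2
    return data.decode("utf-16be", "ignore")[i * k:]


# Code units per non-BMP character: 2 on narrow builds, 1 otherwise.
k = len(b"\x00\x01\x00\x00".decode("utf-32be"))

# Collect every repeat count whose tail is not the expected '>>'.
bad = [i for i in range(1000) if leaked_tail(i, k) != ">>"]
print("affected repeat counts:", bad)
```

An empty list means no garbage was observed for any of the tested lengths.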