[issue11322] encoding package's normalize_encoding() function is too slow

STINNER Victor Thu, 15 Dec 2016 01:53:46 -0800

STINNER Victor added the comment:

It seems like encodings.normalize_encoding() currently has no unit test! Before 
modifying it, I would prefer to see a few unit tests:


* " utf 8 "
* "UtF 8"
* "utf8\xE9"
* etc.
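
A minimal sketch of what such tests might look like, using unittest. The class and method names are hypothetical. The expected values for the two ASCII inputs assume the pure-Python implementation, which collapses runs of non-alphanumeric characters into a single underscore, drops them at both ends, and preserves case. The non-ASCII input is only checked loosely, because how "\xE9" should be handled is exactly what a real test would need to pin down.

```python
import encodings
import unittest


class NormalizeEncodingTests(unittest.TestCase):
    # Hypothetical test class; the names are illustrative only.

    def test_surrounding_and_inner_spaces(self):
        # Leading and trailing spaces are dropped, the inner one becomes "_".
        self.assertEqual(encodings.normalize_encoding(" utf 8 "), "utf_8")

    def test_case_is_preserved(self):
        # normalize_encoding() does not lowercase its input.
        self.assertEqual(encodings.normalize_encoding("UtF 8"), "UtF_8")

    def test_non_ascii(self):
        # Assumption: only the ASCII prefix is asserted here, because the
        # handling of "\xE9" may differ between Python versions.
        result = encodings.normalize_encoding("utf8\xE9")
        self.assertTrue(result.startswith("utf8"))
```

The tests can be run with "python -m unittest" once saved in a module.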

Since we are talking about an optimization, I would like to see a benchmark 
result before/after. I also would like to test Marc-Andre's idea of exposing 
the C function _Py_normalize_encoding().
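
A before/after comparison could be gathered with something like the following timeit sketch. The bench() helper and the SAMPLES list are hypothetical, not an existing benchmark; the idea is to run it once on the unpatched function and once on the patched one and compare the numbers.

```python
import encodings
import timeit

# Hypothetical sample inputs; a real benchmark should use names seen in practice.
SAMPLES = ["utf-8", " utf 8 ", "UtF 8", "ISO-8859-1", "latin_1"]


def bench(func, number=10_000, repeat=5):
    """Return the best observed time, in seconds per call, of func over SAMPLES."""
    timer = timeit.Timer(lambda: [func(name) for name in SAMPLES])
    # Take the minimum of several runs to reduce noise from the system.
    best = min(timer.repeat(repeat=repeat, number=number))
    return best / (number * len(SAMPLES))


if __name__ == "__main__":
    per_call = bench(encodings.normalize_encoding)
    print(f"normalize_encoding: {per_call * 1e9:.0f} ns per call")
```

Passing a different callable to bench() gives the figure for the alternative implementation under the same inputs.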

_Py_normalize_encoding() works on a byte string encoded to Latin1. To implement 
encodings.normalize_encoding(), we might rewrite the function to work on 
Py_UCS4 characters, or have a fast version for char* and a more generic version 
for UCS2 and UCS4?

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue11322>
_______________________________________