STINNER Victor added the comment: It seems like encodings.normalize_encoding() currently has no unit test! Before modifying it, I would prefer to see a few unit tests:
* " utf 8 " * "UtF 8" * "utf8\xE9" * etc. Since we are talking about an optimmization, I would like to see a benchmark result before/after. I also would like to test Marc-Andre's idea of exposing the C function _Py_normalize_encoding(). _Py_normalize_encoding() works on a byte string encoded to Latin1. To implement encodings.normalize_encoding(), we might rewrite the function to work on Py_UCS4 character, or have a fast version on char*, and a more generic version for UCS2 and UCS4? ---------- _______________________________________ Python tracker <rep...@bugs.python.org> <http://bugs.python.org/issue11322> _______________________________________ _______________________________________________ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com