STINNER Victor <victor.stin...@haypocalc.com> added the comment:

We should first implement the same algorithm in all 3 normalization functions and add tests for them (at least for the function in normalization):
- normalize_encoding() in encodings: it doesn't convert to lowercase and keeps non-ASCII letters
- normalize_encoding() in unicodeobject.c
- normalizestring() in codecs.c

normalize_encoding() in encodings is more lax than the other two functions: it normalizes " utf 8 " to 'utf_8'. But it doesn't convert to lowercase and it keeps non-ASCII letters: "UTF-8é" is normalized to "UTF_8é".

I don't know whether the normalization functions should be more or less strict, but I think they should all give the same result.

----------
nosy: +haypo

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue11322>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
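
To make the comment above concrete, here is a minimal sketch of the kind of shared test it asks for. It is not the code of any of the three functions: normalize_lax() is a hypothetical pure-Python stand-in that only reproduces the two behaviours quoted in the comment (" utf 8 " gives 'utf_8', "UTF-8é" gives "UTF_8é"), normalize_strict() is an assumed stricter variant (lowercase, ASCII only), and find_disagreements() is an illustrative helper name.

```python
def normalize_lax(name):
    """Hypothetical stand-in: collapse runs of separators into '_',
    drop them at both ends, keep case and non-ASCII letters."""
    chars = []
    pending_sep = False
    for c in name:
        if c.isalnum() or c == ".":
            # Emit a single '_' for any run of separators between two names parts.
            if pending_sep and chars:
                chars.append("_")
            chars.append(c)
            pending_sep = False
        else:
            pending_sep = True
    return "".join(chars)


def normalize_strict(name):
    """Assumed stricter variant: drop non-ASCII characters, then lowercase."""
    ascii_only = "".join(c for c in name if c.isascii())
    return normalize_lax(ascii_only).lower()


def find_disagreements(samples, *normalizers):
    """Return {sample: [results]} for samples where the normalizers differ."""
    disagreements = {}
    for sample in samples:
        results = [normalize(sample) for normalize in normalizers]
        if len(set(results)) > 1:
            disagreements[sample] = results
    return disagreements


if __name__ == "__main__":
    samples = [" utf 8 ", "UTF-8é"]
    for sample, results in find_disagreements(
        samples, normalize_lax, normalize_strict
    ).items():
        print(repr(sample), "->", results)
```

Running the same table of inputs through every normalizer and reporting only the mismatches keeps the test independent of which behaviour is eventually chosen as the reference; the real functions could be plugged in place of the two stand-ins.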