New submission from STINNER Victor <vstin...@python.org>:

bpo-37751 changed codecs.lookup() in a subtle way: non-ASCII characters are now ignored, whereas they were previously copied unmodified.
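
To make the two behaviors concrete, here is a minimal pure-Python sketch of an "ignore" and a "copy" strategy for non-ASCII characters. It is not CPython's implementation: normalize_name() and its keep_non_ascii parameter are hypothetical names used only for illustration, and the lowercasing and underscore handling are assumptions chosen to mirror the kind of normalization discussed in this issue.

```python
def normalize_name(name: str, keep_non_ascii: bool) -> str:
    """Hypothetical normalizer, for illustration only.

    Lowercases the name and collapses each run of punctuation into a
    single '_'. Leading punctuation produces no separator.
    """
    out = []
    pending_sep = False
    for ch in name:
        is_word_char = ch.isalnum() or ch == "."
        if is_word_char and not ch.isascii() and not keep_non_ascii:
            # "ignore" strategy (assumption): a non-ASCII character is
            # dropped and treated like punctuation.
            is_word_char = False
        if is_word_char:
            if pending_sep and out:
                out.append("_")
            pending_sep = False
            out.append(ch.lower())
        else:
            pending_sep = True
    return "".join(out)


if __name__ == "__main__":
    for keep in (True, False):
        print(keep, repr(normalize_name("КОИ-8", keep_non_ascii=keep)))
```

For ASCII-only names such as "UTF-8" both strategies agree in this sketch; they diverge only once the name contains non-ASCII characters, which is the inconsistency at issue here.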

I would prefer that codecs.lookup() and encodings.normalize_encoding() behave the same: either always ignore non-ASCII characters, or always copy them.

Moreover, it seems that there is no test for how encoding names are normalized in codecs.register(). I recall that using codecs.register() in a unit test causes trouble, since there is no API to unregister a search function. Maybe we should just add a private function for testing in _testcapi.

Serhiy Storchaka wrote an example on my PR:
https://github.com/python/cpython/pull/17997/files

> There are other differences. For example, normalize_encoding("КОИ-8") returns
> "кои_8", but codecs.lookup normalizes it to "8".
> The comment in the sources is also not correct.

----------
components: Library (Lib)
messages: 360004
nosy: lemburg, serhiy.storchaka, vstinner
priority: normal
severity: normal
status: open
title: codecs.lookup() ignores non-ASCII characters, whereas encodings.normalize_encoding() copies them
versions: Python 3.9

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue39337>
_______________________________________