Ezio Melotti <ezio.melo...@gmail.com> added the comment:
If we can't fix the behavior, it should at least be documented. Currently the docs say "This function returns a list of (decoded_string, charset) pairs containing each of the decoded parts of the header.". One could assume that this means that a Unicode string is returned, but as far as I can tell, "decoded_string" means decoded from the format used by the header, not from bytes; in fact, the example below it shows a byte string.

#24797 suggests an alternative solution, but there is no indication of it in the docs except an easy-to-miss note about the new API at the top. Coincidentally, as I was reporting this issue I also found the recently opened #37139. There are also a few other reports: #24797, #37139, #32975, #6302, #4661.

If this method is not actually deprecated, I would document the current behavior (i.e. sometimes it returns bytes, sometimes str, with bonus points if there's a simple rule to predict which one), explain that it exists for legacy/backward-compatibility reasons, and point to the alternatives.

FWIW here are 3 more samples that show the inconsistency:
>>> from email.header import decode_header
>>> # str + None
>>> h = '\x80SOKCrGxsbw===== <he...@example.com>'; decode_header(h)
[('\x80SOKCrGxsbw===== <he...@example.com>', None)]
>>> # bytes + '', bytes + None
>>> h = '=??b?SOKCrGxsbw=====?= <he...@example.com>'; decode_header(h)
[(b'H\xe2\x82\xacllo', ''), (b' <he...@example.com>', None)]
>>> # bytes + 'utf8', bytes + None
>>> h = '=?utf8?b?SOKCrGxsbw==?= <he...@example.com>'; decode_header(h)
[(b'H\xe2\x82\xacllo', 'utf8'), (b' <he...@example.com>', None)]

----------
assignee:  -> docs@python
components: +Documentation
nosy: +docs@python, ezio.melotti, louis.abra...@yahoo.fr
resolution: duplicate ->
stage: resolved -> needs patch
status: closed -> open
type: behavior -> enhancement

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue21492>
_______________________________________
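To illustrate what callers currently have to do about the mixed return types shown in the samples, here is a minimal sketch of a wrapper that always yields a str. The name decode_header_to_str, the "ascii" fallback for parts whose charset is None or empty, and the use of errors="replace" are all assumptions made for this example, not anything the email package provides or recommends. The address in the usage lines is made up.

```python
from email.header import decode_header


def decode_header_to_str(header, fallback_charset="ascii"):
    """Hypothetical helper: join decode_header() output into a single str."""
    pieces = []
    for value, charset in decode_header(header):
        if isinstance(value, bytes):
            # When the header contains an encoded word, every part comes
            # back as bytes. charset may be None or '' for some parts, so
            # an assumed fallback charset is used in that case.
            try:
                value = value.decode(charset or fallback_charset, errors="replace")
            except LookupError:
                # Unknown charset name: this sketch also falls back here.
                value = value.decode(fallback_charset, errors="replace")
        # A header without any encoded word is already returned as str.
        pieces.append(value)
    return "".join(pieces)


# Illustrative inputs; someone@example.org is a made-up address.
decoded = decode_header_to_str("=?utf8?b?SOKCrGxsbw==?= <someone@example.org>")
plain = decode_header_to_str("no encoded words here")
```

The isinstance check is the whole point: it is the branch the documentation would need to explain, since nothing in the current wording tells the caller which of the two types to expect. Decoding with errors="replace" is lossy for non-ASCII text outside encoded words, so this is only a sketch of the legacy workaround.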