New submission from monson:

In /cpython/Lib/zipfile.py, there is code like this:

    if flags & 0x800:
        # UTF-8 file names extension
        filename = filename.decode('utf-8')
    else:
        # Historical ZIP filename encoding
        filename = filename.decode('cp437')

But there is actually no single "historical ZIP filename encoding", because zip files contain no charset info. In English-speaking countries this is usually not a big deal. But if the files were zipped on a non-cp437-based system (especially in China or Japan), the filename is encoded in a charset such as gb18030, while ZipFile decodes the byte stream as cp437. The member names then come out garbled, and it is hard for people to find the reason.

This problem is new in py3k; I found it on Python 3.2 and Python 3.4.

I suggest returning the filename as a bytes object, or adding a decoding parameter when opening the zip file.

----------
components: Library (Lib)
messages: 167760
nosy: monson
priority: normal
severity: normal
status: open
title: zipfile: wrong encoding charset of member filename
type: behavior
versions: Python 3.2

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue15602>
_______________________________________
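
For anyone hitting this in the meantime, here is a minimal sketch of a workaround. It assumes you already know which charset the archive was created with (gb18030 in this example). It relies on Python's cp437 codec mapping every byte value to a character, so the wrongly decoded name can be encoded back to the original bytes and decoded again. The helper name recover_member_name is hypothetical and is not part of zipfile.

```python
def recover_member_name(name, flag_bits, actual_encoding):
    """Undo the cp437 decoding applied when the UTF-8 flag (0x800) is unset.

    Hypothetical helper for illustration; not part of the zipfile module.
    `actual_encoding` is whatever charset the archive creator really used,
    which the caller has to know or guess.
    """
    if flag_bits & 0x800:
        # The name was stored as UTF-8 and has already been decoded correctly.
        return name
    # Recover the original bytes, then decode them with the real charset.
    return name.encode('cp437').decode(actual_encoding)


# Simulate what happens to a gb18030-encoded member name.
raw = '中文.txt'.encode('gb18030')   # bytes as stored in the archive
garbled = raw.decode('cp437')        # what the cp437 fallback produces
recovered = recover_member_name(garbled, 0, 'gb18030')
```

With a real archive, the same helper could be applied to each entry from ZipFile.infolist(), passing info.filename and info.flag_bits. This only repairs the names after the fact; it does not change how zipfile reads the archive.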