New submission from Laurent Mazuel:

Hello,
Considering a zip file which contains UTF-8 filenames (such as an uploaded zip file), the following code fails if launched in a POSIX shell.

>>> with zipfile.ZipFile("test_ut8.zip") as fd:
...     fd.extractall()
...
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/opt/python/3.3/lib/python3.3/zipfile.py", line 1225, in extractall
    self.extract(zipinfo, path, pwd)
  File "/opt/python/3.3/lib/python3.3/zipfile.py", line 1213, in extract
    return self._extract_member(member, path, pwd)
  File "/opt/python/3.3/lib/python3.3/zipfile.py", line 1276, in _extract_member
    open(targetpath, "wb") as target:
UnicodeEncodeError: 'ascii' codec can't encode characters in position 10-14: ordinal not in range(128)

With shell:

$ locale
LANG=POSIX
...

But the filesystem is not encoding-dependent. On a Unix system, filenames are only bytes, so there is no reason to refuse to unzip a zip file (in fact, the "unzip" command-line tool does not fail to unzip the file in a POSIX shell).

Since "open" can take a "bytes" filename, changing line 1276 from
> open(targetpath)
to:
> open(targetpath.encode("utf-8"))
fixes the problem.

zipfile should not care about the encoding of the filename and should use the filename byte sequence extracted directly from the bytes of the zip file. Having "ZipInfo.filename" as a string (and not bytes) is great for an API, but it is not needed to open or write a file on disk. ZipInfo should therefore store the raw byte sequence of the filename in a "bytes_filename" field and use it in the "open" call of "extract".
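Until something like this lands in zipfile itself, the workaround above can be approximated from user code. The following is only a minimal sketch under stated assumptions: "extract_with_bytes_names" is a hypothetical helper, not part of zipfile, and it re-encodes the already decoded "ZipInfo.filename" (UTF-8 by default) rather than reusing the raw bytes stored in the archive. It also ignores passwords and file permissions.

```python
import os
import shutil
import zipfile


def extract_with_bytes_names(zip_path, dest_dir, encoding="utf-8"):
    """Extract all members using bytes paths (hypothetical helper, not a zipfile API)."""
    dest = os.path.abspath(os.fsencode(dest_dir))
    written = []
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            # Encode the member name explicitly instead of letting open()
            # use the locale-dependent filesystem encoding.
            name = info.filename.encode(encoding)
            target = os.path.normpath(os.path.join(dest, name))
            # Refuse members that would land outside dest_dir (e.g. "../x").
            if os.path.commonpath([dest, target]) != dest:
                raise ValueError("unsafe member name: %r" % info.filename)
            if info.filename.endswith("/"):
                # Directory entry: create it and move on.
                os.makedirs(target, exist_ok=True)
                continue
            os.makedirs(os.path.dirname(target), exist_ok=True)
            # open() accepts a bytes path, so no encoding step can fail here.
            with zf.open(info) as source, open(target, "wb") as out:
                shutil.copyfileobj(source, out)
            written.append(target)
    return written
```

Calling extract_with_bytes_names("test_ut8.zip", "out") would then return the list of bytes paths that were written.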
In addition, considering the patch for bug 10614, the right patch could use the new "ZipInfo.encoding" field:
> open(targetpath.encode(member.encoding))

----------
components: Extension Modules
files: test_ut8.zip
messages: 208648
nosy: Laurent.Mazuel
priority: normal
severity: normal
status: open
title: zipfile.extractall fails in Posix shell with utf-8 filename
type: behavior
versions: Python 3.3
Added file: http://bugs.python.org/file33589/test_ut8.zip

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue20329>
_______________________________________