New submission from Laurent Mazuel:

Hello,

Given a zip file that contains UTF-8 filenames (such as the attached zip file), 
the following code fails when run from a shell that uses the POSIX locale.

>>> import zipfile
>>> with zipfile.ZipFile("test_ut8.zip") as fd:
...     fd.extractall()
... 
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/opt/python/3.3/lib/python3.3/zipfile.py", line 1225, in extractall
    self.extract(zipinfo, path, pwd)
  File "/opt/python/3.3/lib/python3.3/zipfile.py", line 1213, in extract
    return self._extract_member(member, path, pwd)
  File "/opt/python/3.3/lib/python3.3/zipfile.py", line 1276, in _extract_member
    open(targetpath, "wb") as target:
UnicodeEncodeError: 'ascii' codec can't encode characters in position 10-14: 
ordinal not in range(128)

The shell's locale is:
$ locale
LANG=POSIX
...

But the filesystem is not encoding-dependent. On a Unix system, filenames are 
just bytes, so there is no reason to refuse to unzip the file (in fact, the 
"unzip" command-line tool extracts the same file without error in a POSIX shell).
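
To illustrate the point about filenames being plain bytes, here is a minimal 
sketch, assuming a Unix-like system. The file name and the temporary directory 
are illustrative only and do not come from the attached zip file.

```python
import os
import tempfile

# Work entirely with bytes paths so no locale-dependent encoding is involved.
workdir = os.fsencode(tempfile.mkdtemp())

# An illustrative name: "ete.txt" with accents, spelled out as raw UTF-8 bytes.
raw_name = b"\xc3\xa9t\xc3\xa9.txt"

# open() accepts a bytes path and hands the bytes to the OS unchanged.
with open(os.path.join(workdir, raw_name), "wb") as fh:
    fh.write(b"hello")

# Passing a bytes directory makes os.listdir() return bytes names.
names = os.listdir(workdir)
```

Because every path here is a bytes object, the result should not depend on 
LANG or on the filesystem encoding Python has selected.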

Since "open" accepts a "bytes" filename, changing line 1276 from
> open(targetpath, "wb")
to:
> open(targetpath.encode("utf-8"), "wb")

fixes the problem.
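
Until zipfile itself changes, the same idea can be applied from user code. The 
following is only a sketch, assuming a Unix-like system: 
"extract_all_as_bytes" is a hypothetical helper (not part of zipfile), the 
default encoding of "utf-8" is an assumption, and the helper does less 
sanitizing of member names than zipfile's own extraction does.

```python
import os
import shutil
import zipfile


def extract_all_as_bytes(zip_path, dest_dir, encoding="utf-8"):
    """Hypothetical helper: extract members using bytes paths only."""
    dest = os.path.abspath(os.fsencode(dest_dir))
    written = []
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            # Encode the member name explicitly instead of relying on the locale.
            name = info.filename.encode(encoding)
            target = os.path.normpath(os.path.join(dest, name))
            # Refuse members that would land outside the destination directory.
            if not target.startswith(os.path.join(dest, b"")):
                raise ValueError("unsafe member name: %r" % info.filename)
            if info.filename.endswith("/"):
                # Directory entry: create it and move on.
                os.makedirs(target, exist_ok=True)
                continue
            os.makedirs(os.path.dirname(target), exist_ok=True)
            # Stream the member's data to a file opened with a bytes path.
            with zf.open(info) as src, open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            written.append(target)
    return written
```

Calling extract_all_as_bytes("test_ut8.zip", ".") would then write the members 
with UTF-8 encoded names, whatever the shell locale is.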

zipfile should not care about the encoding of the filename; it should use the 
filename bytes taken directly from the zip file. Having "ZipInfo.filename" as 
a string (and not bytes) is great for the API, but it is not needed to open or 
write a file on disk. ZipInfo should therefore store the raw bytes of the 
filename in a "bytes_filename" field and use that field in the "open" call of 
"extract".
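
As a rough approximation of the proposed "bytes_filename" field, the stored 
bytes can often be recovered from an existing ZipInfo. By default zipfile 
decodes a member name as UTF-8 when general-purpose flag bit 11 (0x800) is 
set, and as cp437 otherwise. The helper name "stored_name_bytes" below is 
hypothetical, and the result may differ from the stored bytes if zipfile has 
normalized the name while reading the archive.

```python
import zipfile

UTF8_FLAG = 0x800  # general-purpose bit 11: the name is stored as UTF-8


def stored_name_bytes(info):
    """Hypothetical helper: re-encode a ZipInfo name to its stored bytes."""
    # Mirror the codec zipfile uses by default when decoding member names.
    codec = "utf-8" if info.flag_bits & UTF8_FLAG else "cp437"
    return info.filename.encode(codec)
```

The bytes returned for each entry of ZipFile.infolist() could then be joined 
to a bytes target directory and passed straight to "open".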

In addition, given the patch for bug 10614, the right fix could use the 
new "ZipInfo.encoding" field:
> open(targetpath.encode(member.encoding), "wb")

----------
components: Extension Modules
files: test_ut8.zip
messages: 208648
nosy: Laurent.Mazuel
priority: normal
severity: normal
status: open
title: zipfile.extractall fails in Posix shell with utf-8 filename
type: behavior
versions: Python 3.3
Added file: http://bugs.python.org/file33589/test_ut8.zip

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue20329>
_______________________________________