Serhiy Storchaka <storch...@gmail.com> added the comment:

This is not because the zipfile module is unbuffered. It is the difference 
between an expensive function call and cheap bytes slicing. Replace 
`zf.open(namelist[0])` with `io.BufferedReader(zf.open(namelist[0]))` to see 
the effect of good buffering. In 3.2 zipfile's read() is not implemented 
optimally, so it is about twice as slow, but in 3.3 it will be almost as fast 
as using io.BufferedReader. It is still several times slower than bytes 
slicing, but there is nothing that can be done about that.
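
For illustration, the suggested replacement could look roughly like the sketch below. The helper name `count_bytes` and its `buffered` flag are hypothetical and not part of zipfile or of the patch; the sketch only assumes that the object returned by `zf.open()` can be wrapped in io.BufferedReader, as described above.

```python
import io
import zipfile


def count_bytes(zip_source, member, buffered=True):
    """Read one archive member a byte at a time and return its length.

    Hypothetical helper for illustration only. zip_source is a path or a
    seekable file object; member is the name of an entry in the archive.
    """
    total = 0
    with zipfile.ZipFile(zip_source) as zf:
        stream = zf.open(member)
        if buffered:
            # Small read(1) calls are then served from the wrapper's buffer
            # rather than each one going through the zipfile read() path.
            stream = io.BufferedReader(stream)
        with stream as f:
            while f.read(1):
                total += 1
    return total
```

Both variants return the same count; only the cost of each read(1) call differs, so timing the two against the same archive shows the effect of the buffering.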

Here is a patch that speeds up (+20%) reading from a zip file in small 
chunks. Microbenchmark:

./python -m zipfile -c test.zip python
./python -m timeit -n 1 -s "import zipfile;zf=zipfile.ZipFile('test.zip')" \
  "with zf.open('python') as f:" "  while f.read(1):pass"

Python 3.3 (vanilla):  1 loops, best of 3: 36.4 sec per loop
Python 3.3 (patched):  1 loops, best of 3: 30.1 sec per loop
Python 3.3 (with io.BufferedReader):  1 loops, best of 3: 30.2 sec per loop
And, for comparison, Python 3.2:  1 loops, best of 3: 74.5 sec per loop

----------
components:  -Documentation
keywords: +patch
versions:  -Python 2.7, Python 3.2
Added file: http://bugs.python.org/file25530/zipfile_optimize_read.patch

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue10376>
_______________________________________