Nadeem Vawda added the comment:

> I agree that making lzma.open() wrap its return value in a BufferedReader
> (or BufferedWriter, as appropriate) is the way to go.

On second thoughts, there's no need to change the behavior for mode='wb'.
We can just return a BufferedReader for mode='rb', and leave the current
behavior (returning a raw LZMAFile) in place for mode='wb'.
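
As a rough illustration of that proposal, here is a minimal sketch of the behavior written as a standalone helper. The name open_buffered is hypothetical and not part of the lzma module; it only shows the idea of wrapping the read-mode result in io.BufferedReader and returning the LZMAFile unchanged for other modes.

```python
import io
import lzma


def open_buffered(filename, mode="rb"):
    """Hypothetical helper (not part of lzma): buffer reads, leave writes as-is."""
    f = lzma.open(filename, mode)
    if mode in ("r", "rb"):
        # Binary read mode: wrap the LZMAFile so that line iteration and
        # small reads are served from BufferedReader's buffer.
        return io.BufferedReader(f)
    # Any other mode (e.g. 'wb'): return the LZMAFile unchanged.
    return f
```

Closing the BufferedReader also closes the underlying LZMAFile, so callers can use the returned object in a with statement in either mode.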


I also ran some additional benchmarks for the bz2 and gzip modules. It
looks like those two modules would also benefit from having their open()
functions use io.BufferedReader:

[lzma]

  $ time xzcat src.xz | wc -l
  1057980

  real    0m0.543s
  user    0m0.556s
  sys     0m0.024s
  $ ../cpython/python -m timeit -s 'import lzma, io' 'f = lzma.open("src.xz", "r")' 'for line in f: pass'
  10 loops, best of 3: 2.01 sec per loop
  $ ../cpython/python -m timeit -s 'import lzma, io' 'f = io.BufferedReader(lzma.open("src.xz", "r"))' 'for line in f: pass'
  10 loops, best of 3: 795 msec per loop

[bz2]

  $ time bzcat src.bz2 | wc -l
  1057980

  real    0m1.322s
  user    0m1.324s
  sys     0m0.044s
  $ ../cpython/python -m timeit -s 'import bz2, io' 'f = bz2.open("src.bz2", "r")' 'for line in f: pass'
  10 loops, best of 3: 3.71 sec per loop
  $ ../cpython/python -m timeit -s 'import bz2, io' 'f = io.BufferedReader(bz2.open("src.bz2", "r"))' 'for line in f: pass'
  10 loops, best of 3: 2.04 sec per loop

[gzip]

  $ time zcat src.gz | wc -l
  1057980

  real    0m0.310s
  user    0m0.296s
  sys     0m0.028s
  $ ../cpython/python -m timeit -s 'import gzip, io' 'f = gzip.open("src.gz", "r")' 'for line in f: pass'
  10 loops, best of 3: 1.94 sec per loop
  $ ../cpython/python -m timeit -s 'import gzip, io' 'f = io.BufferedReader(gzip.open("src.gz", "r"))' 'for line in f: pass'
  10 loops, best of 3: 556 msec per loop
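
For anyone who wants to repeat the comparison from a script rather than the shell, the following is a minimal sketch using the timeit module. The function name time_line_iteration and its parameters are assumptions made for illustration. Unlike the one-liners above, it closes the file after each run, and the numbers it produces will depend on the machine and the Python version.

```python
import io
import timeit


def time_line_iteration(opener, path, buffered, number=1, repeat=3):
    """Return the best per-run time (seconds) for iterating over all lines.

    opener is a module-level open() function such as lzma.open, bz2.open
    or gzip.open. This helper is hypothetical, for illustration only.
    """
    def run():
        f = opener(path, "r")
        if buffered:
            # Same wrapping as in the second timeit command of each group.
            f = io.BufferedReader(f)
        with f:
            for line in f:
                pass

    # Take the minimum over the repeats, as timeit's "best of N" does.
    return min(timeit.repeat(run, number=number, repeat=repeat)) / number
```

It could be called as time_line_iteration(lzma.open, "src.xz", buffered=True), and likewise with bz2.open or gzip.open and the matching file.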

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue18003>
_______________________________________