Michael Fox added the comment:

io.BufferedReader works well for me. Thanks for the good suggestion. With it, Python 3.3 and 3.4 perform about the same as each other, and both are only about 2x slower than pyliblzma.
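For anyone reading along, the wrapping discussed in this issue can be sketched roughly as below. This is only an illustration, not the lzmaperf.py script timed in this message: the helper names count_lines and make_sample are made up for the example, and the sample file is generated on the fly instead of using words.xz.

```python
import io
import lzma
import os
import tempfile


def make_sample(path, n_lines=1000):
    """Write a small .xz file with n_lines short lines (hypothetical helper)."""
    with lzma.open(path, "wb") as out:
        for i in range(n_lines):
            out.write(("word%d\n" % i).encode("ascii"))


def count_lines(path, buffered=True):
    """Count lines in an .xz file, optionally reading through io.BufferedReader."""
    raw = lzma.LZMAFile(path, "r")
    # The BufferedReader pulls larger chunks from the LZMAFile, so iterating
    # by line does not have to go back to the decompressor for every line.
    f = io.BufferedReader(raw) if buffered else raw
    with f:  # closing the wrapper also closes the underlying LZMAFile
        return sum(1 for _ in f)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sample = os.path.join(tmp, "sample.xz")
        make_sample(sample)
        print(count_lines(sample, buffered=False))
        print(count_lines(sample, buffered=True))
```

Both calls should report the same line count; only the time taken is expected to differ, and how much depends on the Python version and the data.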
From my perspective, wrapping with io.BufferedReader by default is a great idea. I can't think of anyone who would suffer. Maybe someone who wants to open thousands of simultaneous streams wouldn't appreciate the memory overhead. If that person exists, they would want an option to turn it off.

m@air:~/q/topaz/parse_datalog$ time python2 lzmaperf.py
102368

real 0m0.049s
user 0m0.040s
sys 0m0.008s

m@air:~/q/topaz/parse_datalog$ time python3 lzmaperf.py
102368

real 0m0.109s
user 0m0.092s
sys 0m0.020s

m@air:~/q/topaz/parse_datalog$ time ~/tmp/cpython-23836f17e4a2/bin/python3 lzmaperf.py
102368

real 0m0.101s
user 0m0.084s
sys 0m0.012s

On Sun, May 19, 2013 at 7:07 AM, Antoine Pitrou <rep...@bugs.python.org> wrote:
>
> Antoine Pitrou added the comment:
>
> I second Serhiy here. Wrapping the LZMAFile in a BufferedReader is the simple
> solution to the performance problem:
>
> ./python -m timeit -s "import lzma, io" "f=lzma.LZMAFile('words.xz', 'r')"
> "for line in f: pass"
> 10 loops, best of 3: 148 msec per loop
>
> $ ./python -m timeit -s "import lzma, io"
> "f=io.BufferedReader(lzma.LZMAFile('words.xz', 'r'))" "for line in f: pass"
> 10 loops, best of 3: 44.3 msec per loop
>
> $ time xzcat words.xz | wc -l
> 99156
>
> real 0m0.021s
> user 0m0.016s
> sys 0m0.004s
>
>
> Perhaps the top-level lzma.open() should do the wrapping for you, though.
> Interestingly, opening in text (i.e. unicode) mode is almost as fast as with
> a BufferedReader:
>
> $ ./python -m timeit -s "import lzma, io" "f=lzma.open('words.xz', 'rt')"
> "for line in f: pass"
> 10 loops, best of 3: 51.1 msec per loop
>
> ----------
> nosy: +pitrou
>
> _______________________________________
> Python tracker <rep...@bugs.python.org>
> <http://bugs.python.org/issue18003>
> _______________________________________

--
- Michael