Antoine Pitrou added the comment:

I second Serhiy here. Wrapping the LZMAFile in a BufferedReader is the simple solution to the performance problem:
$ ./python -m timeit -s "import lzma, io" "f=lzma.LZMAFile('words.xz', 'r')" "for line in f: pass"
10 loops, best of 3: 148 msec per loop
$ ./python -m timeit -s "import lzma, io" "f=io.BufferedReader(lzma.LZMAFile('words.xz', 'r'))" "for line in f: pass"
10 loops, best of 3: 44.3 msec per loop
$ time xzcat words.xz | wc -l
99156

real 0m0.021s
user 0m0.016s
sys 0m0.004s

Perhaps the top-level lzma.open() should do the wrapping for you, though.
Interestingly, opening in text (i.e. unicode) mode is almost as fast as with a BufferedReader:

$ ./python -m timeit -s "import lzma, io" "f=lzma.open('words.xz', 'rt')" "for line in f: pass"
10 loops, best of 3: 51.1 msec per loop

----------
nosy: +pitrou

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue18003>
_______________________________________
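
For reference, the BufferedReader wrapping timed in the comment can be written out as a short script. This is only a sketch: the helper names count_lines and make_sample and the file sample.xz are hypothetical, introduced here so the example is self-contained, and the script only checks that both ways of reading yield the same lines. It makes no timing claims, since the numbers in the comment apply to the interpreter build that was measured there.

```python
import io
import lzma
import os
import tempfile


def count_lines(path, buffered=True):
    """Count lines in an .xz file, optionally through io.BufferedReader."""
    raw = lzma.LZMAFile(path, "r")
    # The wrapping from the comment: BufferedReader around the LZMAFile.
    f = io.BufferedReader(raw) if buffered else raw
    with f:  # closing the BufferedReader also closes the wrapped LZMAFile
        return sum(1 for _ in f)


def make_sample(path, n_lines=1000):
    """Write a small made-up word list (hypothetical stand-in for words.xz)."""
    with lzma.open(path, "wt", encoding="ascii") as out:
        for i in range(n_lines):
            out.write("word%d\n" % i)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        sample = os.path.join(tmp, "sample.xz")
        make_sample(sample)
        # Both calls should report the same number of lines.
        print(count_lines(sample, buffered=False),
              count_lines(sample, buffered=True))
```

To compare speeds on a given build, the two variants can be passed to timeit in the same way as the commands in the comment.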