New submission from Michael Fox:

import lzma

count = 0
f = lzma.LZMAFile('bigfile.xz', 'r')
for line in f:
    count += 1
print(count)
Comparing python2 with pyliblzma to python3.3.1 with the new lzma module:

m@air:~/q/topaz/parse_datalog$ time python lzmaperf.py
102368

real    0m0.062s
user    0m0.056s
sys     0m0.004s

m@air:~/q/topaz/parse_datalog$ time python3 lzmaperf.py
102368

real    0m7.506s
user    0m7.484s
sys     0m0.012s

Profiling shows most of the time is spent here:

   102371    6.881    0.000    6.972    0.000 lzma.py:247(_read_block)

I also notice that reading the entire file into memory with f.read() is perfectly fast. I think it has something to do with the lack of buffering.

----------
components: Library (Lib)
messages: 189488
nosy: Michael.Fox, nadeem.vawda
priority: normal
severity: normal
status: open
title: New lzma crazy slow with line-oriented reading.
type: performance
versions: Python 3.3

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue18003>
_______________________________________
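A possible workaround, as a minimal sketch and not part of the original report: assuming the slowdown really does come from LZMAFile serving each readline() with small unbuffered reads, wrapping the file in io.BufferedReader lets line iteration be fed from large buffered reads instead.

# Sketch only: assumes 'bigfile.xz' exists and that the slowness is due to
# unbuffered line-oriented reads from lzma.LZMAFile.
import io
import lzma

count = 0
# io.BufferedReader handles readline()/iteration over big chunked reads
# from the underlying LZMAFile, avoiding one small read per line.
with io.BufferedReader(lzma.LZMAFile('bigfile.xz', 'r')) as f:
    for line in f:
        count += 1
print(count)

With this wrapper the loop body is unchanged, so it is easy to compare timings against the unwrapped version above.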