Nadeem Vawda added the comment: I've tried reimplementing LZMAFile in terms of the decompress_into() method, and it has ended up not being any faster than the existing implementation. (It is _slightly_ faster for readinto() with a large buffer size, but in all other cases it was either of equal performance or significantly slower.)
In addition, decompress_into() is more complicated to work with than I had
expected, so I withdraw my objection to the approach based on
max_length/unconsumed_tail.

> unconsumed_tail should be private hidden attribute, which automatically
> prepends any consumed data.

I don't think this is a good idea. In order to have predictable memory
usage, the caller needs to ensure that the current input is fully
decompressed before passing in the next block of compressed data. This can
be done more simply with the interface used by zlib. Compare:

    while not d.eof:
        output = d.decompress(b'', 8192)
        if not output:
            compressed = f.read(8192)
            if not compressed:
                raise ValueError('End-of-stream marker not found')
            output = d.decompress(compressed, 8192)
        # <process output>

with:

    # Using zlib's interface
    while not d.eof:
        compressed = d.unconsumed_tail or f.read(8192)
        if not compressed:
            raise ValueError('End-of-stream marker not found')
        output = d.decompress(compressed, 8192)
        # <process output>

A related, but orthogonal proposal: we might want to make unconsumed_tail
a memoryview (provided the input data is known to be immutable), to avoid
creating an unnecessary copy of the data.

----------
_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue15955>
_______________________________________
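For what it's worth, the zlib-style loop above can already be exercised with zlib's own decompressobj(), whose unconsumed_tail attribute holds input left over when a max_length cap is hit. A minimal sketch (the chunk sizes and sample data are arbitrary, chosen only to force multiple iterations):

    import io
    import zlib

    # Build a sample compressed stream and wrap it in a file-like object
    # so it can be read in small chunks.
    data = b'spam and eggs ' * 1000
    f = io.BytesIO(zlib.compress(data))

    d = zlib.decompressobj()
    result = bytearray()
    while not d.eof:
        # Feed any leftover input first, as in the zlib-style loop above.
        compressed = d.unconsumed_tail or f.read(64)
        if not compressed:
            raise ValueError('End-of-stream marker not found')
        # The second argument caps the output size per call, so memory
        # usage stays bounded regardless of the compression ratio.
        result += d.decompress(compressed, 8192)

    assert bytes(result) == data

The point of the comparison is that the caller drains unconsumed_tail explicitly before reading more input, which keeps memory usage predictable without the decompressor having to buffer consumed data internally.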