[issue20962] Rather modest chunk size in gzip.GzipFile

Skip Montanaro Mon, 28 Apr 2014 12:24:59 -0700

Skip Montanaro added the comment:

On Mon, Apr 28, 2014 at 1:59 PM, Antoine Pitrou <rep...@bugs.python.org> wrote:
> Well, I think that compressed files in general would benefit from a
> larger buffer size than plain binary I/O, but that's just a hunch.


I agree. When writing my patch, my (perhaps specious) thinking went like this:

* We have a big-ass file, so we compress it.
* On average, when seeking to another point in that file, we probably
want to go a long way.
* It's possible that operating system read-ahead semantics will make
read performance relatively high.
* That would put more burden on the Python code to be efficient.
* Larger buffer sizes will reduce the amount of Python bytecode which
must be executed.

So, if I have a filesystem block size of 8192 bytes, that would in
principle represent some sort of "optimal" chunk size. In practice,
though, I think operating system read-ahead and post-read processing
of the bytes read will tend to favor larger chunk sizes. Hence my
naive choice of 16k bytes for _CHUNK_SIZE in my patch.
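
To make the bytecode argument concrete, here is a minimal sketch. It is
not the patch itself and not how gzip.GzipFile is implemented; the
helper name decompress_in_chunks and the test payload are hypothetical,
made up for illustration. It only counts how many passes through a
Python-level read loop are needed for a given chunk size.

```python
import gzip
import io
import os
import zlib


def decompress_in_chunks(compressed, chunk_size):
    """Decompress gzip data, reading chunk_size compressed bytes per pass.

    Returns (decompressed_bytes, number_of_loop_iterations).
    Hypothetical helper, for illustration only.
    """
    src = io.BytesIO(compressed)
    # wbits = 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    decomp = zlib.decompressobj(16 + zlib.MAX_WBITS)
    pieces = []
    iterations = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        pieces.append(decomp.decompress(chunk))
        iterations += 1
    pieces.append(decomp.flush())
    return b"".join(pieces), iterations


# Random bytes barely compress, so the compressed stream stays large
# enough for the chunk size to make a visible difference.
payload = os.urandom(200000)
compressed = gzip.compress(payload)

for size in (1024, 8192, 16 * 1024):
    data, n = decompress_in_chunks(compressed, size)
    assert data == payload
    print(size, n)
```

The iteration count is just the compressed length divided by the chunk
size, rounded up, so going from 1k to 16k cuts the number of loop passes
by roughly a factor of 16. Whether that translates into a measurable
speedup on real files is exactly what would need benchmarking.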

Skip

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue20962>
_______________________________________
