Inada Naoki <songofaca...@gmail.com> added the comment:
> desbma <dutch...@gmail.com> added the comment:
>
> If you do a benchmark by reading from a file, and then writing to /dev/null
> several times, without clearing caches, you are measuring *only* the syscall
> overhead:
>
> * input data is read from the Linux page cache, not the file on your SSD
>   itself

Yes. I am measuring the syscall overhead to determine a reasonable buffer
size. shutil may be used when the page cache is warm.

> * no data is written (obviously because output is /dev/null)

As I said before, my SSD doesn't have stable write performance (which is
typical for a consumer SSD), so this is intentional. And there are use cases
that copy from/to io.BytesIO or other file-like objects.

> Your current command line also measures open/close timings, without that I
> think the speed should linearly increase when doubling buffer size, but of
> course this is misleading, because its a synthetic benchmark.

I'm not measuring the speed of my cheap SSD. The goal of this benchmark is
to find a reasonable buffer size. Real usage varies widely, so reducing the
syscall overhead with a reasonable buffer size is worthwhile on its own.

> Also if you clear caches in between tests, and write the output file to the
> SSD itself, sendfile will be used, and should be even faster.

No. sendfile is not used by shutil.copyfileobj, even when dst is a real file
on disk.

> So again I'm not sure this means much compared to real world usage.

"Real world usage" varies. Sometimes it is affected, sometimes it is not.
On the other hand, what are the cons of changing 16 KiB to 64 KiB? Windows
already uses 1 MiB, and the CPython runtime itself uses a few MB of memory.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue36103>
_______________________________________
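For concreteness, here is a minimal sketch of the benchmark style described
above: the source file is read once to warm the page cache, then copied to
/dev/null with shutil.copyfileobj at several buffer sizes, so the timing is
dominated by read()/write() syscall overhead rather than disk speed. The
path /tmp/testfile, the repeat count, and the buffer sizes tried are
illustrative assumptions, not the exact command used in this issue.

    import shutil
    import time

    SRC = "/tmp/testfile"  # hypothetical test file, e.g. a few hundred MiB

    def bench(bufsize, repeat=10):
        # Warm the Linux page cache so subsequent reads never touch the disk.
        with open(SRC, "rb") as f:
            while f.read(1024 * 1024):
                pass
        best = float("inf")
        for _ in range(repeat):
            with open(SRC, "rb") as src, open("/dev/null", "wb") as dst:
                # Time only the copy loop, not open()/close().
                t0 = time.perf_counter()
                shutil.copyfileobj(src, dst, bufsize)
                best = min(best, time.perf_counter() - t0)
        return best

    for size in (16 * 1024, 64 * 1024, 1024 * 1024):
        print(f"{size // 1024:5d} KiB buffer: {bench(size):.4f} s (best of 10)")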
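On the sendfile point: shutil.copyfileobj is essentially a plain
read()/write() loop, roughly the first simplified sketch below, so each
chunk costs one read and one write syscall, and a larger buffer directly
means fewer syscalls. A zero-copy path would instead call os.sendfile() on
the underlying file descriptors (shutil.copyfile gained such a Linux fast
path in bpo-33671, but copyfileobj never takes it); the second function is
an illustrative sketch of that approach, not CPython's implementation.

    import os

    def copyfileobj_sketch(fsrc, fdst, length=16 * 1024):
        # Roughly what shutil.copyfileobj does: a plain read()/write() loop,
        # i.e. one read and one write syscall per `length`-sized chunk.
        while True:
            buf = fsrc.read(length)
            if not buf:
                break
            fdst.write(buf)

    def sendfile_copy_sketch(fsrc, fdst):
        # Illustrative zero-copy alternative (Linux): the kernel moves the
        # data between the two descriptors with no Python-level buffer.
        infd, outfd = fsrc.fileno(), fdst.fileno()
        offset = 0
        remaining = os.fstat(infd).st_size
        while remaining > 0:
            sent = os.sendfile(outfd, infd, offset, remaining)
            if sent == 0:
                break
            offset += sent
            remaining -= sent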