On Wed, March 29, 2017 10:01, Joerg Schilling wrote:
> Paul Eggert <egg...@cs.ucla.edu> wrote:
>
>> On 03/27/2017 07:02 AM, Carlo Alberto Ferraris wrote:
>> > This is a PoC patch that improves archive creation performance at
>> > least in certain configurations
>>
>> What configuration performs poorly with sequential access? How much
>> improvement do you see with the patch, and why?
>
> I doubt that such methods will help to speed up archiving. I did many
> tests with similar approaches with star since aprox. 1997 and I did
> never see any performance win on any modern OS.

Carlo's patch calls

    posix_fadvise(fd, offset, len, POSIX_FADV_WILLNEED|POSIX_FADV_SEQUENTIAL|POSIX_FADV_NOREUSE);

According to
http://pubs.opengroup.org/onlinepubs/009695399/functions/posix_fadvise.html

"The advice to be applied to the data is specified by the advice
parameter and may be one of the following values: [lists various
POSIX_FADV_xxx definitions]"

The advice parameter takes exactly one of those values; they are not bit
flags, so you can't bitwise OR several POSIX_FADV_xxx values together when
calling posix_fadvise(). Instead you would need to call posix_fadvise()
three times:

    posix_fadvise(fd, offset, len, POSIX_FADV_WILLNEED);
    posix_fadvise(fd, offset, len, POSIX_FADV_SEQUENTIAL);
    posix_fadvise(fd, offset, len, POSIX_FADV_NOREUSE);

Looking at Linux include/uapi/linux/fadvise.h bears that out;
POSIX_FADV_NOREUSE is 5, the same value as
POSIX_FADV_DONTNEED | POSIX_FADV_RANDOM.

Whether or not the OS does anything with posix_fadvise() hints is up to
the OS. But I seem to remember reading that Linux uses a larger read-ahead
if told that a file will be read sequentially.

Both POSIX_FADV_WILLNEED and POSIX_FADV_SEQUENTIAL probably won't do any
harm, since they match the way in which tar reads files.

For POSIX_FADV_NOREUSE, Linux currently treats that as a no-op, though a
patch to change that was proposed a few years ago.

https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/tree/mm/fadvise.c?id=refs/tags/v4.10.6
https://lwn.net/Articles/480930/

It might be a good idea to add an option to tar to use
POSIX_FADV_DONTNEED, since that could reduce tar's impact on other
processes (less filling the page cache with file data and evicting the
working set of other programs).