Paul,
any chance of having this pulled in? To recap, this simply uses posix_fadvise 
to provide a hint to the OS that we’re going to perform a sequential read of 
the source files when creating an archive.
In our testing on Linux 4.4, creating a tar archive from source files on an 
ext4 filesystem (on a SAN volume), this patch doubles tar's throughput. This is 
because, when Linux is given the POSIX_FADV_SEQUENTIAL hint, it doubles the 
readahead on the underlying block device.
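
For reference, here is a minimal sketch of the kind of hint involved. It is
not the code from the patch: the helper name advise_sequential_read is made
up for illustration, and it simply mirrors the two posix_fadvise calls quoted
further down in this thread, passing the length explicitly.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* expose posix_fadvise */
#endif
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>

/* Hypothetical helper, not the function from the patch: tell the OS
   that bytes [offset, offset + len) of FD will be read sequentially
   and soon.  Returns 0 on success or an error number; the calls are
   only advisory, so a failure does not affect correctness.  */
int
advise_sequential_read (int fd, off_t offset, off_t len)
{
  assert (fd >= 0);   /* caller must pass an open descriptor */

  /* Nothing to read.  Returning early also avoids passing len == 0,
     which old kernels took to mean "zero bytes".  */
  if (len <= 0)
    return 0;

  /* posix_fadvise returns the error number directly, not via errno.  */
  int err = posix_fadvise (fd, offset, len, POSIX_FADV_SEQUENTIAL);
  if (err != 0)
    return err;

  return posix_fadvise (fd, offset, len, POSIX_FADV_WILLNEED);
}
```

Because the calls are purely advisory, a caller can ignore a nonzero return
and go on to read the file as usual.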

Any feedback is welcome.

Carlo

> On Apr 13, 2017, at 1:32 PM, Carlo Alberto Ferraris <ca...@strayorange.com> 
> wrote:
> 
> Paul,
> friendly ping.
> 
> Carlo
> 
>> On Apr 5, 2017, at 10:16 AM, Carlo Alberto Ferraris <ca...@strayorange.com> wrote:
>> 
>> Just as a comment about why my patch passes len explicitly instead of 0: 
>> it’s to work around a bug in Linux versions before 2.6.6.
>> 
>> > In kernels before 2.6.6, if len was specified as 0, then this was 
>> > interpreted literally as "zero bytes", rather than as meaning "all 
>> > bytes through to the end of the file". 
>> > http://man7.org/linux/man-pages/man2/posix_fadvise.2.html#BUGS
>> 
>> Since 0 is supposed to mean “until the end of file”, explicitly passing 
>> the length of the file (which we already have) should be 
>> semantically the same.
>> 
>> Carlo
>> 
>>> On Apr 4, 2017, at 10:14 PM, Mark <ma...@clara.co.uk> wrote:
>>> 
>>> On Mon, April 3, 2017 03:17, Paul Eggert wrote:
>>>> I've lost context. I prefer not having this depend on an environment
>>>> variable.
>>>> 
>>>> Can't the filesystem in question be fixed to have decent performance in
>>>> the typical case where applications access files sequentially? It's not
>>>> like 'tar' is a special case. I'd hate to have to modify lots of
>>>> programs just to work around a lame filesystem.
>>> 
>>> I think you're confusing two things:
>>> - Carlo's patch
>>> - The suggestion to allow the user to tell tar to use POSIX_FADV_NOREUSE
>>> and/or POSIX_FADV_DONTNEED. In certain scenarios one, both or neither of
>>> those could perform best. [On Linux POSIX_FADV_NOREUSE is currently a
>>> no-op.]
>>> 
>>> Let's ignore the second point for now.
>>> 
>>> Carlo's patch at
>>> https://github.com/CAFxX/tar/commit/8b3ccb099c6ddf9f03d12d1f7c433c7927b964d5
>>> uses
>>>  posix_fadvise(fd, offset, len, POSIX_FADV_SEQUENTIAL);
>>>  posix_fadvise(fd, offset, len, POSIX_FADV_WILLNEED);
>>> to give a hint to the OS/filesystem about how the file will be accessed.
>>> There shouldn't be any down-side to doing that. On Linux for example,
>>> POSIX_FADV_SEQUENTIAL causes the filesystem read-ahead amount to be
>>> doubled.
>>> 
>>> I'm not qualified to say whether the patch should be committed as-is, but
>>> the principle is sound. [I might choose a different name for the
>>> prefetch() function though.]
>>> 
>>> 
>> 
> 
