Hi,

Pavel Raiskup wrote on Wed, Jan 10, 2018 at 12:50:06PM +0100:
> > This check benefitted only one unlikely case (large files containing
> > only zeroes, on systems that do not support SEEK_HOLE)
>
> It drops the optimization even for situations when SEEK_HOLE is not
> available, which is not 100% necessary. I'm not proposing doing otherwise
> (I actually proposed this in [1]), but I'm rather CCing Andreas once more,
> as that's the original requester, the use-cases with lustre were
> definitely not unlikely and the question is whether SEEK_HOLE covers them
> nowadays.

Jumping in for Lustre: it currently has only a trivial SEEK_HOLE
implementation that checks file size boundaries, but I'd like to implement
it properly soonish, so I don't think tar should wait for Lustre there.
(Not sure how Andreas feels about that though, will let him speak up.)

I actually recently asked on the GNU coreutils mailing list whether cp
should look at st_blocks like GNU tar does in case fiemap fails, and was
told they plan on changing the sparse file implementation from fiemap to
SEEK_HOLE, so I'd tend to want tar to move in the same direction if
possible.

For what it's worth, the formula Jörg gave earlier today also looks good
to me (st_size > (st_blocks * DEV_BSIZE) + DEV_BSIZE), and it should be
easy enough to use that when SEEK_HOLE isn't available.
I don't think there is any point in using both when SEEK_HOLE does work:
I personally don't think the speedup of not opening the file is worth it,
precisely because of the problems listed in the mail you quoted wrt. recent
modifications not necessarily being propagated immediately...

Thanks,
-- 
Dominique Martinet
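
For illustration, here is a minimal sketch of the ordering suggested above:
try SEEK_HOLE first, and fall back to the st_blocks formula only when
SEEK_HOLE is unavailable or fails. The helper names looks_sparse_by_blocks
and file_looks_sparse are hypothetical and are not taken from the tar
sources; the fallback definition of DEV_BSIZE as 512 is likewise an
assumption for systems whose headers do not provide it.

```c
#define _GNU_SOURCE          /* exposes SEEK_HOLE on glibc */
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/param.h>       /* DEV_BSIZE on Linux and the BSDs */
#include <unistd.h>

#ifndef DEV_BSIZE
# define DEV_BSIZE 512       /* assumption: st_blocks counts 512-byte units */
#endif

/* Hypothetical helper: the st_blocks heuristic quoted in the mail.
   The extra DEV_BSIZE of slack avoids flagging files whose allocation
   is only slightly smaller than their size.  */
static int
looks_sparse_by_blocks (off_t size, blkcnt_t blocks)
{
  return size > (off_t) blocks * DEV_BSIZE + DEV_BSIZE;
}

/* Hypothetical helper: 1 if the file looks sparse, 0 if not, -1 if
   fstat fails.  SEEK_HOLE is preferred; the st_blocks formula is used
   only when SEEK_HOLE is missing at build time or lseek reports an
   error at run time.  */
static int
file_looks_sparse (int fd)
{
  struct stat st;

  if (fstat (fd, &st) != 0)
    return -1;

#ifdef SEEK_HOLE
  {
    /* A trivial implementation reports the only hole at end of file,
       so hole == st.st_size means "no hole found".  */
    off_t hole = lseek (fd, 0, SEEK_HOLE);
    if (hole >= 0)
      return hole < st.st_size;
  }
#endif

  return looks_sparse_by_blocks (st.st_size, st.st_blocks);
}
```

Note that the SEEK_HOLE path needs an open file descriptor, whereas the
st_blocks path could run on stat data alone; the sketch opens the file in
both cases for simplicity.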