On 08.02.2017 15:06, Stephane Chazelas wrote:
> 2017-02-08 00:43:18 +0100, Max Reitz:
> [...]
>> OTOH, it may make sense to offer a way for the user to disable
>> lseek(SEEK_{DATA,HOLE}) in our "file" block driver. That way your issue
>> would be solved, too, I guess. I'll look into it.
> [...]
>
> Thanks Max,
>
> Yes, that would work for me and other users of ZFS. What I do
> for now is recompile with those lseek(SEEK_{DATA,HOLE}) calls disabled
> in the code, and it's working fine.
>
> As I already hinted, something that would also possibly work for
> me and could benefit everyone (well, at least Linux users on
> filesystems supporting hole punching) is, instead of checking
> beforehand whether the file is allocated, to do a
> fallocate(FALLOC_FL_PUNCH_HOLE), or IOW, tell the underlying
> layer to deallocate the data.
>
> That would be those two lseek() calls replaced by one fallocate(),
> and some extra disk space being saved.
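For illustration only, a minimal sketch of the suggestion above -- this is not code from qemu, and the helper name is hypothetical. Instead of probing allocation with lseek() before writing zeroes, the range is unconditionally deallocated:

```c
/*
 * Sketch of the suggestion above -- NOT actual qemu code; the helper
 * name is hypothetical. Instead of probing with lseek(SEEK_DATA) /
 * lseek(SEEK_HOLE) before writing zeroes, unconditionally ask the
 * filesystem to deallocate the range.
 */
#define _GNU_SOURCE
#include <fcntl.h>      /* fallocate(), FALLOC_FL_* */
#include <sys/types.h>

/* Deallocate [offset, offset + len) in fd. Reads of the range then
 * return zeroes, and the backing blocks are freed, on filesystems
 * that support hole punching; otherwise -1 is returned with errno
 * set (e.g. EOPNOTSUPP), and the caller falls back to writing. */
static int punch_hole(int fd, off_t offset, off_t len)
{
    return fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                     offset, len);
}
```

FALLOC_FL_PUNCH_HOLE must be combined with FALLOC_FL_KEEP_SIZE, so the file length is unchanged; only the blocks backing the range are released.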
When using qcow2, however, qcow2 will try to take care of that by
discarding clusters or using special zero clusters; the lseek() thing
just tries to mitigate the effect of writing less than a cluster of
zeroes. Yes, we could punch a hole, but is that faster than, or at
least as fast as, lseek(SEEK_{DATA,HOLE}) on all filesystems? Also, is
it the same speed on all protocols? We not only support files as image
storage, but also network protocols etc.

But this is just generally speaking for the "write zeroes" case. With
detect-zeroes=unmap (as opposed to detect-zeroes=on), things are a bit
different. It may indeed make sense to fall through to the protocol
level and punch holes there, even if that may be slower than lseek().

> One may argue that's what one would expect to be done when
> using detect-zeroes=unmap.

A bit of a stupid question, but: how is your performance when using
detect-zeroes=off?

> I suppose that would be quite significant work, as that would
> imply a framework to pass those "deallocates" down, and you'd
> probably have to differentiate "deallocates" that zero (like
> hole punching in a regular file) and those that don't (like
> BLKDISCARD on an SSD).

Well, we internally already have different functions for writing
zeroes and discarding. The thing is, though, that qcow2 is supposed to
handle those deallocations and not hand them down, because qcow2 can
handle them -- but not if they're not aligned to whole qcow2 clusters
(which, by default, are 64 kB in size).

Max

> I also suppose that could cause fragmentation that would be
> unwanted in some contexts, so maybe it should be tunable as
> well.