Re: [Qemu-devel] [RFC PATCH 14/17] block: support FALLOC_FL_PUNCH_HOLE trimming

Christoph Hellwig Sat, 24 Mar 2012 08:40:56 -0700

On Fri, Mar 09, 2012 at 02:36:50PM -0600, Richard Laager wrote:
> I'm not sure if fallocate() and/or BLKDISCARD always guarantee that the
> discard has made it to stable storage. If they don't, does O_DIRECT or
> O_DSYNC on open() cause them to make any such guarantee? If not, should
> you be calling fdatasync() or fsync() when the user has specified
> cache=direct or cache=writethrough?
> 
> Note that the Illumos implementation (see below) has a flag to ask for
> either behavior.

For fallocate the current Linux behaviour is that you need an fdatasync
or the O_DSYNC flag to guarantee that it makes it's way to disk.  For
XFS the history implementation of both XFS_IOC_FREESP and hole punching
was that it always made it to disk, and for all other filesystems that
historic behaviour was that O_DSYNC was ignored, and depending on the
fs a fdatasync might or might not be able to give your a guarantee either.

For BLKDISCARD you'd strictly speaking need to issue a cache flush via
fdatasync, too, but I'm not sure if any actual devices require that.

> On Fri, 2012-03-09 at 00:35 -0800, Chris Wedgwood wrote:
> > Simplest still compare the blocks allocated by the file
> > to it's length (ie. stat.st_blocks != stat.st_size>>9).
> 
> I thought of this as well. It covers "99%" of the cases, but there's one
> case where it breaks down. Imagine I have a sparse file backing my
> virtual disk. In the guest, I fill the virtual disk 100%. Then, I
> restart QEMU. Now it thinks that sparse file is non-sparse and stops
> issuing hole punches.

Note that due to speculative preallocation in the filesystem stat.st_blocks
can easily be bigger than stat.st_size>>9, so the above comparism should
at least be a < instead of !=.

Re: [Qemu-devel] [RFC PATCH 14/17] block: support FALLOC_FL_PUNCH_HOLE trimming

Reply via email to