Re: [Qemu-devel] [RFC PATCH 06/17] block: use bdrv_{co, aio}_discard for write_zeroes operations

Paolo Bonzini Fri, 09 Mar 2012 10:06:55 -0800

Il 09/03/2012 17:37, Kevin Wolf ha scritto:
>> > Remove the bdrv_co_write_zeroes callback.  Instead use the discard
>> > information from bdrv_get_info to choose between bdrv_co_discard
>> > and a normal write.
>> > 
>> > Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
> I'm not sure if this a good idea.
> 
> The goal of discard is to remove data from the image (or not add it if
> it isn't there yet) and ideally deallocate the used clusters. The goal
> of write_zeroes is to mark space as zero and explicitly allocate it for
> this purpose.
> 
> From a guest point of view these are pretty similar, but from a host
> perspective I'd say there's a difference.


True.  However, we need to present a uniform view to the guests,
including the granularity, or discard can never be enabled.  The
granularity must be 512 on IDE (though it can be higher on SCSI), so
there are problems mapping block layer discard straight down to the guest.

There are basically three ways to do this:

1) we could cheat and present a discard_granularity that is smaller than
what the underlying format/protocol supports.  This is fine but forces
discard_zeroes_data to be false.  That's a pity because Linux 3.4 will
start using efficient zero write operations (WRITE SAME on SCSI, but
could be extended to UNMAP/TRIM if discard_zeroes_data is true).

2) we can make an emulated discard that always supports 512 bytes
granularity and always zeroes data.  This patch series takes this route.

3) we can let the user choose between (1) and (2).  I didn't choose this
because of laziness mostly---co_write_zeroes support is not really
complete, for example there's no aio version to use in device
models---and also because I doubt anyone would really use the option.

Paolo

Re: [Qemu-devel] [RFC PATCH 06/17] block: use bdrv_{co, aio}_discard for write_zeroes operations

Reply via email to