On 09/03/2012 17:37, Kevin Wolf wrote:
>> > Remove the bdrv_co_write_zeroes callback. Instead use the discard
>> > information from bdrv_get_info to choose between bdrv_co_discard
>> > and a normal write.
>> >
>> > Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
> I'm not sure if this is a good idea.
>
> The goal of discard is to remove data from the image (or not add it if
> it isn't there yet) and ideally deallocate the used clusters. The goal
> of write_zeroes is to mark space as zero and explicitly allocate it for
> this purpose.
>
> From a guest point of view these are pretty similar, but from a host
> perspective I'd say there's a difference.

True. However, we need to present a uniform view to the guests,
including the granularity, or discard can never be enabled. The
granularity must be 512 on IDE (though it can be higher on SCSI), so
there are problems mapping block layer discard straight down to the
guest. There are basically three ways to do this:

1) we could cheat and present a discard_granularity that is smaller
than what the underlying format/protocol supports. This is fine but
forces discard_zeroes_data to be false. That's a pity because Linux 3.4
will start using efficient zero write operations (WRITE SAME on SCSI,
but could be extended to UNMAP/TRIM if discard_zeroes_data is true).

2) we can make an emulated discard that always supports 512-byte
granularity and always zeroes data. This patch series takes this route.

3) we can let the user choose between (1) and (2). I didn't choose this
mostly because of laziness (co_write_zeroes support is not really
complete; for example, there's no aio version to use in device models)
and also because I doubt anyone would really use the option.

Paolo
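
To illustrate option (2) above, here is a minimal sketch of how an emulated discard could split a guest request at 512-byte sector granularity into a part that is aligned to the host's discard granularity and unaligned edges that are written as zeroes. This is not the code from the patch series: the DiscardSplit type, the split_discard function, and the convention that a granularity of 0 means "host discard is unusable" are all hypothetical names and assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: how one guest request is divided up. */
typedef struct {
    int64_t head_zero;  /* sectors to zero-write before the aligned part */
    int64_t discard;    /* sectors that can be passed to a real discard */
    int64_t tail_zero;  /* sectors to zero-write after the aligned part */
} DiscardSplit;

/*
 * Split a guest discard of nb_sectors starting at sector_num.
 * granularity is the host discard granularity in 512-byte sectors.
 * Assumption for this sketch: granularity <= 0 means the host cannot
 * discard (or its discard does not zero data), so the whole request
 * falls back to writing zeroes.
 */
static DiscardSplit split_discard(int64_t sector_num, int64_t nb_sectors,
                                  int64_t granularity)
{
    DiscardSplit s = { 0, 0, 0 };
    int64_t end, first, last;

    assert(sector_num >= 0 && nb_sectors >= 0);
    end = sector_num + nb_sectors;

    if (granularity <= 0) {
        s.head_zero = nb_sectors;
        return s;
    }

    /* Round the start up and the end down to the host granularity. */
    first = (sector_num + granularity - 1) / granularity * granularity;
    last = end / granularity * granularity;

    if (first >= last) {
        /* No fully aligned chunk inside the request: zero-write it all. */
        s.head_zero = nb_sectors;
        return s;
    }

    s.head_zero = first - sector_num;
    s.discard = last - first;
    s.tail_zero = end - last;
    return s;
}
```

With a host granularity of 8 sectors, a guest request for sectors 3..22 would be served as 5 zero-written sectors, one discarded 8-sector chunk, and 7 more zero-written sectors, so the guest always sees 512-byte granularity and zeroed data regardless of what the host supports.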