On Tue, Aug 06, 2013 at 10:38:32AM +0800, Asias He wrote:
> On Tue, Aug 06, 2013 at 10:02:22AM +0800, Fam Zheng wrote:
> > On Tue, 08/06 09:53, Asias He wrote:
> > > From: MORITA Kazutaka <morita.kazut...@lab.ntt.co.jp>
> > >
> > > While Asias is debugging an issue creating qcow2 images on top of
> > > non-file protocols. It boils down to this example using NBD:
> > >
> > > $ qemu-io -c 'open -g nbd+unix:///?socket=/tmp/nbd.sock' -c 'read -v 0 512'
> > >
> > > Notice the open -g option to set bs->growable. This means you can
> > > read/write beyond end of file. Reading beyond end of file is supposed
> > > to produce zeroes.
> > >
> > > We rely on this behavior in qcow2_create2() during qcow2 image
> > > creation. We create a new file and then write the qcow2 header
> > > structure using bdrv_pwrite(). Since QCowHeader is not a multiple of
> > > sector size, block.c first uses bdrv_read() on the empty file to fetch
> > > the first sector (should be all zeroes).
> > >
> > > Here is the output from the qemu-io NBD example above:
> > >
> > > $ qemu-io -c 'open -g nbd+unix:///?socket=/tmp/nbd.sock' -c 'read -v 0 512'
> > > 00000000: ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab ................
> > > 00000010: ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab ................
> > > 00000020: ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab ................
> > > ...
> > >
> > > We are not zeroing the buffer! As a result qcow2 image creation on top
> > > of protocols is not guaranteed to work even when file creation is
> > > supported by the protocol.
> >
> > It seems to me that "read beyond EOF" is more protocol defined than to
> > be forced in block layer. Is this behavior common to all protocols, is
> > it reasonable if some protocol wants other values than zero?
>
> The reason to do this in block layer is that we do not want to duplicate
> the memset in all protocols.
>
> Do we actually have protocols that really want values other than zero?

I think we rely on zeroes when bs->growable is true, because
bdrv_pwrite() handles sub-sector I/O using read-modify-write. So it
reads beyond EOF first (expects to get zeroes), then copies the
sub-sector data from the caller, then writes it back.

If we don't zero beyond-EOF data then we would write unexpected values
to the image.

Stefan
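
For illustration only, the read-modify-write sequence described above can
be modelled with a small Python toy. This is not QEMU code: ToyDevice,
pwrite_subsector() and the zero_beyond_eof flag are hypothetical names
invented for the sketch, and the 0xab fill simply mirrors the bytes seen
in the qemu-io dump quoted earlier. It assumes a write that fits inside a
single 512-byte sector.

```python
SECTOR_SIZE = 512


class ToyDevice:
    """Hypothetical stand-in for a growable protocol driver (not QEMU code)."""

    def __init__(self, zero_beyond_eof):
        self.data = bytearray()          # current "file" contents
        self.zero_beyond_eof = zero_beyond_eof

    def read_sector(self, sector_num):
        # The caller's buffer starts out holding stale bytes (0xab here).
        buf = bytearray(b"\xab" * SECTOR_SIZE)
        start = sector_num * SECTOR_SIZE
        chunk = self.data[start:start + SECTOR_SIZE]
        buf[:len(chunk)] = chunk
        if self.zero_beyond_eof:
            # Zero whatever part of the request lies beyond end of file.
            buf[len(chunk):] = bytes(SECTOR_SIZE - len(chunk))
        return buf

    def write_sector(self, sector_num, buf):
        start = sector_num * SECTOR_SIZE
        end = start + SECTOR_SIZE
        if len(self.data) < end:
            # Grow the file so the whole sector fits.
            self.data.extend(bytes(end - len(self.data)))
        self.data[start:end] = buf


def pwrite_subsector(dev, offset, payload):
    """Read-modify-write for a payload that fits inside one sector."""
    sector_num, within = divmod(offset, SECTOR_SIZE)
    assert within + len(payload) <= SECTOR_SIZE
    buf = dev.read_sector(sector_num)               # 1. read (maybe beyond EOF)
    buf[within:within + len(payload)] = payload     # 2. copy caller's data
    dev.write_sector(sector_num, buf)               # 3. write the sector back


header = b"QFI\xfb"  # illustrative 4-byte payload, far smaller than a sector

good = ToyDevice(zero_beyond_eof=True)
pwrite_subsector(good, 0, header)

bad = ToyDevice(zero_beyond_eof=False)
pwrite_subsector(bad, 0, header)
```

With zero_beyond_eof=True the rest of the first sector is written back as
zeroes; with it disabled, the stale 0xab bytes end up in the image after
the 4-byte payload, which is the "unexpected values" case.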