On Mon, Apr 29, 2019 at 04:30:14PM +0200, Kevin Wolf wrote: > Am 29.04.2019 um 16:04 hat Stefano Garzarella geschrieben: > > On Mon, Apr 29, 2019 at 12:25:10PM +0200, Kevin Wolf wrote: > > > Am 11.04.2019 um 12:50 hat Stefano Garzarella geschrieben: > > > > RBD APIs don't allow us to write more than the size set with > > > > rbd_create() > > > > or rbd_resize(). > > > > In order to support growing images (eg. qcow2), we resize the image > > > > before RW operations that exceed the current size. > > > > > > > > Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1171007 > > > > Signed-off-by: Stefano Garzarella <sgarz...@redhat.com> > > > > --- > > > > block/rbd.c | 25 +++++++++++++++++++++++++ > > > > 1 file changed, 25 insertions(+) > > > > > > > > diff --git a/block/rbd.c b/block/rbd.c > > > > index 0c549c9935..228658e20a 100644 > > > > --- a/block/rbd.c > > > > +++ b/block/rbd.c > > > > @@ -102,6 +102,7 @@ typedef struct BDRVRBDState { > > > > rbd_image_t image; > > > > char *image_name; > > > > char *snap; > > > > + uint64_t image_size; > > > > } BDRVRBDState; > > > > > > Can't we use bs->total_sectors instead of adding a new image_size field? > > > > I'm not sure we can use bs->total_sectors. IIUC, for example, it doesn't > > take care of bytes used by QCOW2 metadata. > > bs->total_sectors for the rbd BLockDriverState is the image file size, > not the virtual disk size. > > The only reason not to use it would be if we need byte granularity > rather than 512 byte granularity. But I don't think it's a problem to > round up offsets to the next 512 bytes (BDRV_SECTOR_SIZE) boundary. >
I tried and it works as you told me :) I'll remove the image_size in the v2. > > > > static int qemu_rbd_connect(rados_t *cluster, rados_ioctx_t *io_ctx, > > > > @@ -777,6 +778,14 @@ static int qemu_rbd_open(BlockDriverState *bs, > > > > QDict *options, int flags, > > > > goto failed_open; > > > > } > > > > > > > > + r = rbd_get_size(s->image, &s->image_size); > > > > + if (r < 0) { > > > > + error_setg_errno(errp, -r, "error reading image size from %s", > > > > + s->image_name); > > > > + rbd_close(s->image); > > > > + goto failed_open; > > > > + } > > > > + > > > > /* If we are using an rbd snapshot, we must be r/o, otherwise > > > > * leave as-is */ > > > > if (s->snap != NULL) { > > > > @@ -921,6 +930,20 @@ static BlockAIOCB *rbd_start_aio(BlockDriverState > > > > *bs, > > > > rcb->buf = acb->bounce; > > > > } > > > > > > > > + /* > > > > + * RBD APIs don't allow us to write more than actual size, so in > > > > order > > > > + * to support growing images, we resize the image before RW > > > > operations > > > > + * that exceed the current size. > > > > + */ > > > > + if (s->image_size < off + size) { > > > > + r = rbd_resize(s->image, off + size); > > > > + if (r < 0) { > > > > + goto failed; > > > > + } > > > > + > > > > + s->image_size = off + size; > > > > + } > > > > > > This doesn't check the request type, so it's actually not limited to RW > > > operations, but even reads will try to resize the image. This is at > > > least surprising. For regular files, file-posix extends the file for > > > write requests, but for reads it returns a zeroed buffer without > > > actually changing the file size. > > > > Yes, I'll change the behaviour in the v2. > > > > I did some tries (i.e. using qemu-io and reading more than bytes used) and > > the RBD driver didn't receive 'read' requests that exceed the current > > size, maybe because it is managed in the QCOW2 protocol, but of course > > I'll handle also in the RBD driver. > > I don't remember the exact scenario where it happened, but I know I > implemented it for file-posix to fix a bug. Maybe it actually doesn't > happen any more because we have made other changes in the meantime, but > I'm not sure. > Thanks for the details, I'll check better if we can avoid it, otherwise I'll take care of this in the RBD driver. Cheers, Stefano