On Thu, May 09, 2019 at 02:07:34PM +0200, Markus Armbruster wrote: > Stefano Garzarella <sgarz...@redhat.com> writes: > > > On Wed, May 08, 2019 at 01:44:27PM +0200, Markus Armbruster wrote: > >> Stefano Garzarella <sgarz...@redhat.com> writes: > >> > >> > On Tue, May 07, 2019 at 08:34:51AM +0200, Markus Armbruster wrote: > >> >> Cc: Peter for a libvirt perspective. > >> >> > >> >> Stefano Garzarella <sgarz...@redhat.com> writes: > >> >> > >> >> > This patch adds the support of preallocation (off/full) for the RBD > >> >> > block driver. > >> >> > If available, we use rbd_writesame() to quickly fill the image when > >> >> > full preallocation is required. > >> >> > > >> >> > Signed-off-by: Stefano Garzarella <sgarz...@redhat.com> > >> >> > --- > >> >> > block/rbd.c | 149 > >> >> > ++++++++++++++++++++++++++++++++++++++----- > >> >> > qapi/block-core.json | 4 +- > >> >> > 2 files changed, 136 insertions(+), 17 deletions(-) > >> >> > > >> >> > diff --git a/block/rbd.c b/block/rbd.c > >> >> > index 0c549c9935..29dd1bb040 100644 > >> >> > --- a/block/rbd.c > >> >> > +++ b/block/rbd.c > >> >> > @@ -13,6 +13,7 @@ > >> >> > > >> >> > #include "qemu/osdep.h" > >> >> > > >> >> > +#include "qemu/units.h" > >> >> > #include <rbd/librbd.h> > >> >> > #include "qapi/error.h" > >> >> > #include "qemu/error-report.h" > >> >> > @@ -331,6 +332,110 @@ static void qemu_rbd_memset(RADOSCB *rcb, > >> >> > int64_t offs) > >> >> > } > >> >> > } > >> >> > > >> >> > +static int qemu_rbd_do_truncate(rbd_image_t image, int64_t offset, > >> >> > + PreallocMode prealloc, Error **errp) > >> >> > +{ > >> >> > + uint64_t current_length; > >> >> > + char *buf = NULL; > >> >> > + int ret; > >> >> > + > >> >> > + ret = rbd_get_size(image, ¤t_length); > >> >> > + if (ret < 0) { > >> >> > + error_setg_errno(errp, -ret, "Failed to get file length"); > >> >> > + goto out; > >> >> > + } > >> >> > + > >> >> > + if (current_length > offset && prealloc != PREALLOC_MODE_OFF) { > >> >> > + error_setg(errp, "Cannot use preallocation for shrinking > >> >> > files"); > >> >> > + ret = -ENOTSUP; > >> >> > + goto out; > >> >> > + } > >> >> > + > >> >> > + switch (prealloc) { > >> >> > + case PREALLOC_MODE_FULL: { > >> >> [...] > >> >> > + case PREALLOC_MODE_OFF: > >> >> [...] > >> >> > + default: > >> >> > + error_setg(errp, "Unsupported preallocation mode: %s", > >> >> > + PreallocMode_str(prealloc)); > >> >> > + ret = -ENOTSUP; > >> >> > + goto out; > >> >> > + } > >> >> > >> >> Other block drivers also accept only some values of PreallocMode. Okay. > >> >> > >> >> I wonder whether management applications need to know which values are > >> >> supported. > >> > > >> > Good point! > >> > >> We can continue to assume they don't until somebody tells us otherwise. > >> > >> >> Let me review support in drivers: > >> >> > >> >> * file (file-win32.c) > >> >> * iscsi > >> >> * nfs > >> >> * qed > >> >> * ssh > >> >> > >> >> - Reject all but PREALLOC_MODE_OFF > >> >> > >> >> * copy-on-read > >> >> * luks (crypto.c) > >> >> * raw > >> >> > >> >> - Pass through only > >> >> > >> >> * file host_cdrom host_device (file-posix.c) > >> >> > >> >> - Reject all but PREALLOC_MODE_OFF when shrinking and for non-regular > >> >> files > >> >> - Reject PREALLOC_MODE_FALLOC unless CONFIG_POSIX_FALLOCATE > >> >> - Reject PREALLOC_MODE_METADATA > >> >> > >> >> * gluster > >> >> > >> >> - Reject all but PREALLOC_MODE_OFF when shrinking > >> >> - Reject PREALLOC_MODE_FALLOC unless CONFIG_GLUSTERFS_FALLOCATE > >> >> - Reject PREALLOC_MODE_FULL unless CONFIG_GLUSTERFS_ZEROFILL > >> >> - Reject PREALLOC_MODE_METADATA > >> >> > >> >> * qcow2 > >> >> > >> >> - Reject all but PREALLOC_MODE_OFF when shrinking and with a backing > >> >> file > >> >> > >> >> * rbd with this patch > >> >> > >> >> - Reject all but PREALLOC_MODE_OFF when shrinking > >> >> - Reject PREALLOC_MODE_METADATA and PREALLOC_MODE_FALLOC > >> >> > >> >> * sheepdog > >> >> > >> >> - Reject PREALLOC_MODE_METADATA and PREALLOC_MODE_FALLOC > >> >> - Doesn't support shrinking > >> >> > >> >> * vdi > >> >> > >> >> - Reject PREALLOC_MODE_FALLOC and PREALLOC_MODE_FULL > >> >> - Doesn't support shrinking > >> >> > >> >> * blkdebug > >> >> * blklogwrites > >> >> * blkverify > >> >> * bochs > >> >> * cloop > >> >> * dmg > >> >> * ftp > >> >> * ftps > >> >> * http > >> >> * https > >> >> * luks > >> >> * nbd > >> >> * null-aio > >> >> * null-co > >> >> * nvme > >> >> * parallels > >> >> * qcow > >> >> * quorum > >> >> * replication > >> >> * throttle > >> >> * vhdx > >> >> * vmdk > >> >> * vpc > >> >> * vvfat > >> >> * vxhs > >> >> > >> >> - These appear not to use PreallocMode: they don't implement > >> >> .bdrv_co_truncate(), and either don't implement .bdrv_co_create() or > >> >> implement it without a prealloc parameter. > >> >> > >> >> Looks good to me. > >> >> > >> > > >> > Thanks for the analysis! > >> > > >> >> > + > >> >> > + ret = 0; > >> >> > + > >> >> > +out: > >> >> > + g_free(buf); > >> >> > + return ret; > >> >> > +} > >> >> > + > >> >> > static QemuOptsList runtime_opts = { > >> >> > .name = "rbd", > >> >> > .head = QTAILQ_HEAD_INITIALIZER(runtime_opts.head), > >> >> [...] > >> >> > diff --git a/qapi/block-core.json b/qapi/block-core.json > >> >> > index 7ccbfff9d0..db25a4065b 100644 > >> >> > --- a/qapi/block-core.json > >> >> > +++ b/qapi/block-core.json > >> >> > @@ -4277,13 +4277,15 @@ > >> >> > # point to a snapshot. > >> >> > # @size Size of the virtual disk in bytes > >> >> > # @cluster-size RBD object size > >> >> > +# @preallocation Preallocation mode (allowed values: off, full) > >> >> > # > >> >> > # Since: 2.12 > >> >> > ## > >> >> > { 'struct': 'BlockdevCreateOptionsRbd', > >> >> > 'data': { 'location': 'BlockdevOptionsRbd', > >> >> > 'size': 'size', > >> >> > - '*cluster-size' : 'size' } } > >> >> > + '*cluster-size' : 'size', > >> >> > + '*preallocation': 'PreallocMode' } } > >> >> > > >> >> > ## > >> >> > # @BlockdevVmdkSubformat: > >> >> > >> >> The non-support of values 'metadata' and 'falloc' is not visible in > >> >> introspection, only in documentation. No reason to block this patch, as > >> >> the other block drivers have the same introspection weakness (only > >> >> sheepdog and vdi bother to document). > >> >> > >> >> Should we address the introspection weakness? Only if there's a use for > >> >> the information, I think. > >> > > >> > If the management applications will use that information (or maybe also > >> > our help pages), could be useful to have an array of 'PreallocMode' > >> > supported per-driver. > >> > >> Ideally, query-qmp-schema would show only the supported values. > >> > >> Not hard to do, just tedious: we'd get a number of sub-enums in addition > >> to the full one, and we'd have to map from sub-enum to the full one. > >> > >> QAPI language support for sub-enums would remove most of the tedium. > >> Not worthwhile unless the need for sub-enums is actually common. > > > > I should study better the QMP and QAPI to understand how to implement > > the sub-enums. > > Sub-enums of > > { 'enum': 'PreallocMode', > 'data': [ 'off', 'metadata', 'falloc', 'full' ] } > > done the obvious way: > > { 'enum': 'PreallocModeOff', > 'data': [ 'off' ] } > { 'enum': 'PreallocModeOffPosix', > 'data': [ 'off', 'metadata', > { 'name': 'falloc', 'if': 'defined(CONFIG_POSIX_FALLOCATE)' > }, > 'full' ] } > > and so forth. > > This generates a bunch of different C enum types in addition to > PreallocMode: PreallocModeOff, PreallocModePosix, ... > > Common C code continues to use just PreallocMode. The QMP command > handlers using sub-enums will have to map between the sub-enums and > PreallocMode. > > Tedious. > > With QAPI language support for sub-enums, we could eliminate the > additional C enums. >
Okay, I understood your idea. Thanks for the explanation! > > If you agree, I'll put it as a background task, until somebody from > > management applications tell us his interest. > > Only act if there's a compelling use case. Sure. Thanks, Stefano