Re: QEMU RBD is slow with QCOW2 images

Kevin Wolf Thu, 04 Mar 2021 04:06:19 -0800

On 03.03.2021 at 18:40, Stefano Garzarella wrote:
> Hi Jason,
> as reported in this BZ [1], when qemu-img creates a QCOW2 image on RBD,
> writing data is very slow compared to a raw file.
> 
> Comparing raw vs QCOW2 image creation with RBD, I found that we use a
> different object size: for the raw file I see '4 MiB objects', for QCOW2 I
> see '64 KiB objects', as reported in comment 14 [2].
> This is probably the main cause of the slowness; indeed, forcing a 4 MiB
> object size in the code for QCOW2 as well increased the speed a lot.
> 
> Looking closer, I discovered that for raw files we call rbd_create() with
> obj_order = 0 (if the 'cluster_size' option is not defined), so the default
> object size is used.
> For QCOW2, instead, we use obj_order = 16, since the default 'cluster_size'
> defined for QCOW2 is 64 KiB.
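
For reference, the arithmetic behind the two numbers quoted above can be sketched as follows. This is only an illustration of the relationship "object size = 2^obj_order" using the sizes mentioned in the thread; the helper name object_size_to_order() is made up for this example and is not a QEMU or librbd function.

```python
KiB = 1024
MiB = 1024 * KiB


def object_size_to_order(size_bytes: int) -> int:
    """Return log2(size_bytes) for a power-of-two object size.

    Hypothetical helper for illustration only.
    """
    if size_bytes <= 0 or size_bytes & (size_bytes - 1):
        raise ValueError("object size must be a power of two")
    return size_bytes.bit_length() - 1


# 64 KiB (the qcow2 default cluster size) corresponds to order 16.
order_64k = object_size_to_order(64 * KiB)

# 4 MiB (the object size seen for the raw image) corresponds to order 22.
order_4m = object_size_to_order(4 * MiB)

# Number of 64 KiB objects needed to cover what one 4 MiB object covers.
objects_per_4m = (4 * MiB) // (64 * KiB)

print(order_64k, order_4m, objects_per_4m)
```

So the same amount of data is spread over 64 times as many objects in the QCOW2 case.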


Hm, the QemuOpts-based image creation is messy, but why does the rbd
driver even see the cluster_size option?

The first thing qcow2_co_create_opts() does is split the passed
QemuOpts into options it will process on the qcow2 layer and options
that are passed to the protocol layer. So if you pass a cluster_size
option, qcow2 should take it for itself and not pass it to rbd.

If it is passed to rbd, I think that's a bug in the qcow2 driver.
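
To make the expected behaviour concrete, here is a toy model of that split. It is not the QEMU code (which is C and works on QemuOpts/QDict); the function name, the key set and the "some_protocol_opt" key are assumptions made up for the illustration.

```python
# Hypothetical: the set of option names the format layer claims for itself.
FORMAT_LAYER_KEYS = {"cluster_size"}


def split_create_opts(opts, format_keys):
    """Split creation options into (format_opts, protocol_opts).

    Every key claimed by the format layer is removed from what the
    protocol layer gets to see.
    """
    format_opts = {k: v for k, v in opts.items() if k in format_keys}
    protocol_opts = {k: v for k, v in opts.items() if k not in format_keys}
    return format_opts, protocol_opts


user_opts = {"cluster_size": "2M", "some_protocol_opt": "x"}
format_opts, protocol_opts = split_create_opts(user_opts, FORMAT_LAYER_KEYS)

# The protocol layer must not see cluster_size at all, neither the
# user-supplied value nor a format-layer default.
print(format_opts)
print(protocol_opts)
```

In this model the protocol layer has no cluster_size entry, so it would fall back to its own default. The behaviour described in the report differs from this, which is why it looks like a bug.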

> Using '-o cluster_size=2M' with qemu-img changed only the qcow2 cluster
> size, since in qcow2_co_create_opts() we remove 'cluster_size' from
> QemuOpts by calling qemu_opts_to_qdict_filtered().
> For some reason that I have yet to understand, after this deletion the
> default value of 'cluster_size' for qcow2 (64 KiB) still remains in
> QemuOpts, and it is used in qemu_rbd_co_create_opts().

So it seems you came to a similar conclusion. We need to find out where
the 64k comes from and just fix that so that rbd uses its default.

> At this point my doubts are:
> Does it make sense to use the same cluster_size as qcow2 as object_size in
> RBD?
> If we want to keep the 2 options separated, how can it be done? Should we
> rename the option in block/rbd.c?

My lazy answer is that you could just use QMP blockdev-create, where you
create layer by layer separately.
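
A rough sketch of what such a layer-by-layer sequence could look like, written as a small Python script that only builds and prints the QMP commands. The field names are given as I remember them from the QAPI schema (qapi/block-core.json) and should be checked against it; the pool, image, node and job names and the sizes are placeholders. Waiting for each job to conclude and dismissing it is left out.

```python
import json

KiB = 1024
GiB = 1024 ** 3


def build_create_sequence(pool, image, protocol_size, virtual_size):
    """Return QMP commands that create an rbd image, then qcow2 on top.

    All names and sizes are placeholders for illustration.
    """
    rbd_node = "rbd0"  # hypothetical node name
    return [
        # 1. Create the protocol layer. No cluster-size is given here,
        #    so rbd is free to use its own default object size.
        {"execute": "blockdev-create",
         "arguments": {"job-id": "create-rbd",
                       "options": {"driver": "rbd",
                                   "location": {"pool": pool, "image": image},
                                   "size": protocol_size}}},
        # 2. Open the new rbd image as a block node.
        {"execute": "blockdev-add",
         "arguments": {"driver": "rbd", "node-name": rbd_node,
                       "pool": pool, "image": image}},
        # 3. Format that node as qcow2; cluster-size applies to qcow2 only.
        {"execute": "blockdev-create",
         "arguments": {"job-id": "create-qcow2",
                       "options": {"driver": "qcow2", "file": rbd_node,
                                   "size": virtual_size,
                                   "cluster-size": 64 * KiB}}},
    ]


commands = build_create_sequence("rbd", "test.qcow2", 1 * GiB, 1 * GiB)
for cmd in commands:
    print(json.dumps(cmd))
```

The point is only that each layer gets its own options object, so the qcow2 cluster size never reaches the rbd driver.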

What could possibly be done for QemuOpts is to use the dotted syntax,
as for opening, so you could specify file.cluster_size=... for the
protocol layer (or data_file.cluster_size=... for the external data
file, etc.).
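
As a sketch of the idea only: this models how dotted keys could be routed to layers. It describes a possible design, not something image creation does today, and the function name and the "format" label for unprefixed keys are made up for the example.

```python
def route_dotted_opts(opts):
    """Group options by layer based on a dotted prefix.

    Keys without a prefix go to the format layer (labelled "format"
    here); "file.x" goes to the protocol layer, "data_file.x" to the
    external data file, and so on.
    """
    layers = {}
    for key, value in opts.items():
        prefix, dot, name = key.rpartition(".")
        layer = prefix if dot else "format"
        layers.setdefault(layer, {})[name] = value
    return layers


routed = route_dotted_opts({
    "cluster_size": "2M",            # qcow2 cluster size
    "file.cluster_size": "4M",       # protocol layer (e.g. rbd object size)
    "data_file.cluster_size": "4M",  # external data file
})
print(routed)
```

With such a scheme the two cluster_size options could coexist without renaming anything in block/rbd.c.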

Kevin
