On Mon, 30 Jan 2023 at 07:33, Daniel P. Berrangé <berra...@redhat.com> wrote: > > On Sun, Jan 29, 2023 at 06:39:49PM +0800, Sam Li wrote: > > This patch extends virtio-blk emulation to handle zoned device commands > > by calling the new block layer APIs to perform zoned device I/O on > > behalf of the guest. It supports Report Zone, four zone oparations (open, > > close, finish, reset), and Append Zone. > > > > The VIRTIO_BLK_F_ZONED feature bit will only be set if the host does > > support zoned block devices. Regular block devices(conventional zones) > > will not be set. > > > > The guest os can use blktests, fio to test those commands on zoned devices. > > Furthermore, using zonefs to test zone append write is also supported. > > > > Signed-off-by: Sam Li <faithilike...@gmail.com> > > --- > > hw/block/virtio-blk-common.c | 2 + > > hw/block/virtio-blk.c | 394 +++++++++++++++++++++++++++++++++++ > > 2 files changed, 396 insertions(+) > > > > > @@ -949,6 +1311,30 @@ static void virtio_blk_update_config(VirtIODevice > > *vdev, uint8_t *config) > > blkcfg.write_zeroes_may_unmap = 1; > > virtio_stl_p(vdev, &blkcfg.max_write_zeroes_seg, 1); > > } > > + if (bs->bl.zoned != BLK_Z_NONE) { > > + switch (bs->bl.zoned) { > > + case BLK_Z_HM: > > + blkcfg.zoned.model = VIRTIO_BLK_Z_HM; > > + break; > > + case BLK_Z_HA: > > + blkcfg.zoned.model = VIRTIO_BLK_Z_HA; > > + break; > > + default: > > + g_assert_not_reached(); > > + } > > + > > + virtio_stl_p(vdev, &blkcfg.zoned.zone_sectors, > > + bs->bl.zone_size / 512); > > + virtio_stl_p(vdev, &blkcfg.zoned.max_active_zones, > > + bs->bl.max_active_zones); > > + virtio_stl_p(vdev, &blkcfg.zoned.max_open_zones, > > + bs->bl.max_open_zones); > > + virtio_stl_p(vdev, &blkcfg.zoned.write_granularity, blk_size); > > + virtio_stl_p(vdev, &blkcfg.zoned.max_append_sectors, > > + bs->bl.max_append_sectors); > > So these are all ABI sensitive frontend device settings, but they are > not exposed as tunables on the virtio-blk device, instead they are > implicitly set from the backend. > > We have done this kind of thing before in QEMU, but several times it > has bitten QEMU maintainers/users, as having a backend affect the > frontend ABI is not to typical. It wouldn't be immediately obvious > when starting QEMU on a target host that the live migration would > be breaking ABI if the target host wasn't using a zoned device with > exact same settings. > > This also limits mgmt flexibility across live migration, if the > mgmt app wants/needs to change the storage backend. eg maybe they > need to evacuate the host for an emergency, but don't have spare > hosts with same kind of storage. It might be desirable to migrate > and switch to a plain block device or raw/qcow2 file, rather than > let the VM die. > > Can we make these virtio setting be explicitly controlled on the > virtio-blk device. If not specified explicitly they could be > auto-populated from the backend for ease of use, but if specified > then simply validate the backend is a match. libvirt would then > make sure these are always explicitly set on the frontend.
I think this is a good idea, especially if we streamline the file-posix.c driver by merging --blockdev zoned_host_device into --blockdev host_device. It won't be obvious from the command-line whether this is a zoned or non-zoned device. There should be a --device virtio-blk-pci,drive=drive0,zoned=on option that fails when drive0 isn't zoned. It should probably be on/off/auto where auto is the default and doesn't check anything, on requires a zoned device, and off requires a non-zoned device. That will prevent accidental migration between zoned/non-zoned devices. I want to point out that virtio-blk doesn't have checks for the disk size or other details, so what you're suggesting for zone_sectors, etc is stricter than what QEMU does today. Since the virtio-blk parameters you're proposing are optional, I think it doesn't hurt though. Stefan