Although QEMU's virtio-blk is already quite fast, there is still some room for improvement. Disk latency can be reduced if we handle virtio-blk requests in the host kernel, so that we avoid a lot of syscalls and context switches.
The biggest disadvantage of this vhost-blk flavor is that it only supports raw-format images. Luckily, Kirill Thai proposed a device-mapper driver for the QCOW2 format to attach files as block devices:
https://www.spinics.net/lists/kernel/msg4292965.html

Also, by using a kernel module we can bypass the iothread limitation and finally scale block requests with the number of CPUs for high-performance devices. This is planned to be implemented in the next version.

Linux kernel module part:
https://lore.kernel.org/kvm/20220725202753.298725-1-andrey.zhadche...@virtuozzo.com/

Test setup and results:
fio --direct=1 --rw=randread --bs=4k --ioengine=libaio --iodepth=128
QEMU drive options: cache=none
filesystem: xfs

SSD:
               | randread, IOPS | randwrite, IOPS |
Host           |          95.8k |          85.3k  |
QEMU virtio    |          57.5k |          79.4k  |
QEMU vhost-blk |          95.6k |          84.3k  |

RAMDISK (vq == vcpu):
                 | randread, IOPS | randwrite, IOPS |
virtio, 1vcpu    |      123k      |      129k       |
virtio, 2vcpu    |      253k (??) |      250k (??)  |
virtio, 4vcpu    |      158k      |      154k       |
vhost-blk, 1vcpu |      110k      |      113k       |
vhost-blk, 2vcpu |      247k      |      252k       |
vhost-blk, 4vcpu |      576k      |      567k       |

Andrey Zhadchenko (1):
  block: add vhost-blk backend

 configure                     |  13 ++
 hw/block/Kconfig              |   5 +
 hw/block/meson.build          |   1 +
 hw/block/vhost-blk.c          | 395 ++++++++++++++++++++++++++++++++++
 hw/virtio/meson.build         |   1 +
 hw/virtio/vhost-blk-pci.c     | 102 +++++++++
 include/hw/virtio/vhost-blk.h |  44 ++++
 linux-headers/linux/vhost.h   |   3 +
 8 files changed, 564 insertions(+)
 create mode 100644 hw/block/vhost-blk.c
 create mode 100644 hw/virtio/vhost-blk-pci.c
 create mode 100644 include/hw/virtio/vhost-blk.h

-- 
2.31.1