On Wed, Oct 05, 2022 at 01:28:14PM +0300, Andrey Zhadchenko wrote:
> 
> 
> On 10/4/22 21:26, Stefan Hajnoczi wrote:
> > On Mon, Jul 25, 2022 at 11:55:26PM +0300, Andrey Zhadchenko wrote:
> > > Although QEMU virtio-blk is quite fast, there is still some room for
> > > improvement. Disk latency can be reduced if we handle virtio-blk
> > > requests in the host kernel, so we avoid a lot of syscalls and
> > > context switches.
> > > 
> > > The biggest disadvantage of this vhost-blk flavor is the raw format.
> > > Luckily Kirill Thai proposed a device mapper driver for the QCOW2
> > > format to attach files as block devices:
> > > https://www.spinics.net/lists/kernel/msg4292965.html
> > > 
> > > Also, by using kernel modules we can bypass the iothread limitation
> > > and finally scale block requests with cpus for high-performance
> > > devices. This is planned to be implemented in the next version.
> > > 
> > > Linux kernel module part:
> > > https://lore.kernel.org/kvm/20220725202753.298725-1-andrey.zhadche...@virtuozzo.com/
> > > 
> > > test setups and results:
> > > fio --direct=1 --rw=randread --bs=4k --ioengine=libaio --iodepth=128
> > > QEMU drive options: cache=none
> > > filesystem: xfs
> > 
> > Please post the full QEMU command-line so it's clear exactly what this
> > is benchmarking.
> 
> The full command for vhost is this:
> qemu-system-x86_64 \
> -kernel bzImage -nographic -append "console=ttyS0 root=/dev/sdb rw systemd.unified_cgroup_hierarchy=0 nokaslr" \
> -m 1024 -s --enable-kvm -smp $2 \
> -drive id=main_drive,file=debian_sid.img,media=disk,format=raw \
> -drive id=vhost_drive,file=$1,media=disk,format=raw,if=none \
No cache=none because vhost-blk directly submits bios in the kernel?

> -device vhost-blk-pci,drive=vhost_drive,num-threads=$3
> 
> (the num-threads option for vhost-blk-pci was not used)
> 
> For virtio I used this:
> qemu-system-x86_64 \
> -kernel bzImage -nographic -append "console=ttyS0 root=/dev/sdb rw systemd.unified_cgroup_hierarchy=0 nokaslr" \
> -m 1024 -s --enable-kvm -smp $2 \
> -drive file=debian_sid.img,media=disk \
> -drive file=$1,media=disk,if=virtio,cache=none,if=none,id=d1,aio=threads \
> -device virtio-blk-pci,drive=d1
> 
> > A preallocated raw image file is a good baseline with:
> > 
> >    --object iothread,id=iothread0 \
> >    --blockdev file,filename=test.img,cache.direct=on,aio=native,node-name=drive0 \
> >    --device virtio-blk-pci,drive=drive0,iothread=iothread0
> 
> The image I used was a preallocated qcow2 image set up with dm-qcow2,
> because this vhost-blk version directly uses the bio interface and can't
> work with regular files.

I see.

> > (BTW QEMU's default vq size is 256 descriptors and the number of vqs is
> > the number of vCPUs.)
> > 
> > > SSD:
> > >                | randread, IOPS | randwrite, IOPS |
> > > Host           |     95.8k      |     85.3k       |
> > > QEMU virtio    |     57.5k      |     79.4k       |
> 
> Adding iothread0 and using a raw file instead of the qcow2 + dm-qcow2
> setup brings the numbers to
>                  |     60.4k      |     84.3k       |
> 
> > > QEMU vhost-blk |     95.6k      |     84.3k       |
> > > 
> > > RAMDISK (vq == vcpu):
> > 
> > With fio numjobs=vcpu here?
> 
> Yes
> 
> > >                  | randread, IOPS | randwrite, IOPS |
> > > virtio, 1vcpu    |      123k      |      129k       |
> > > virtio, 2vcpu    |      253k (??) |      250k (??)  |
> > 
> > QEMU's aio=threads (default) gets around the single IOThread. It beats
> > aio=native for this reason in some cases. Were you using aio=native or
> > aio=threads?
> 
> At some point I started to specify aio=threads (before that I did not use
> this option). I am not sure exactly when. I will re-measure all cases for
> the next submission.

aio=native is usually recommended; aio=threads is less optimized.
aio=native should have lower latency than aio=threads, although it scales
worse on hosts with free CPUs because it is limited to a single thread.

> > > virtio, 4vcpu    |      158k      |      154k       |
> > > vhost-blk, 1vcpu |      110k      |      113k       |
> > > vhost-blk, 2vcpu |      247k      |      252k       |
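
For the re-measurements, the simplest virtio-blk baseline to compare against
vhost-blk would be something along these lines (an untested sketch that just
combines your guest options with the --blockdev syntax quoted above; $1/$2
are used the same way as in your script, with $1 pointing at a preallocated
raw image file):

# baseline sketch: raw preallocated image, O_DIRECT, Linux AIO, one IOThread
qemu-system-x86_64 \
  -kernel bzImage -nographic \
  -append "console=ttyS0 root=/dev/sdb rw systemd.unified_cgroup_hierarchy=0 nokaslr" \
  -m 1024 --enable-kvm -smp $2 \
  -drive id=main_drive,file=debian_sid.img,media=disk,format=raw \
  --object iothread,id=iothread0 \
  --blockdev file,filename=$1,cache.direct=on,aio=native,node-name=drive0 \
  --device virtio-blk-pci,drive=drive0,iothread=iothread0

That keeps cache.direct=on and aio=native and pins the data plane to a single
IOThread, so the comparison with the in-kernel path is apples to apples.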
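
It would also help to state the exact fio invocation used for the RAMDISK
runs in the next cover letter, e.g. something like this (a sketch only:
/dev/vdb and $NUM_VCPUS are placeholders for the benchmark disk inside the
guest and the vCPU count):

fio --name=bench --filename=/dev/vdb --direct=1 --rw=randread --bs=4k \
    --ioengine=libaio --iodepth=128 --numjobs=$NUM_VCPUS --group_reporting

so it's clear how numjobs scales with the number of vCPUs/virtqueues.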