Hello, I'm running an x86 Ubuntu VM with a GlusterFS-backed /, e.g.
qemu-system-x86_64 -enable-kvm -smp 8 -m 4096 -drive \
    file=gluster://192.168.1.xxx/test/bionic.qcow2,if=virtio

and the following fio (https://github.com/axboe/fio) benchmark inside the VM:

[global]
name=fio-rand-write-libaio
filename=fio-rand-write
ioengine=libaio
create_on_open=1
rw=randwrite
direct=0
numjobs=8
time_based=1
runtime=60

[test-4-kbytes]
bs=4k
size=1G
iodepth=8

(i.e. 8 jobs issue 4K-sized random writes across a 1G file inside the VM).

Tracing the I/O path down to qemu_gluster_co_rw() (block/gluster.c), I've found
that glfs_pwritev_async() is asked to write I/O vectors of very different
lengths: mostly 1, with an average of around 15-20 and a maximum of around 150.
Since the workload generated by fio is regular and stable over the run time,
this looks suboptimal: at first glance, long batches of I/O vectors of similar
length should give better performance.

Next, looking through virtio_blk_submit_multireq() shows that non-sequential
requests are not merged. IIUC, this approach assumes that the virtual block
device is always backed by a physical one, but what if it is not? In the
GlusterFS API, each glfs_xxx() call costs at least one round trip to the
server, so merging more (even non-sequential) requests into longer I/O vectors
may be really useful.

So, 1) is my understanding of the whole picture correct, and 2) is there
something to tune in an attempt to improve I/O performance in my case?

Thanks,
Dmitry
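
P.S. For reference, the niov numbers above can be collected with a trivial
local counter fed with qiov->niov right before the glfs_pwritev_async() call
in qemu_gluster_co_rw(). Below is a rough, self-contained sketch of what I
mean (illustrative only; the account_niov()/dump_niov_hist() helpers are
made-up names, and QEMU's trace events would be the cleaner way to do this in
practice):

#include <stdio.h>

/*
 * Illustrative sketch only: count how many write requests reach the gluster
 * driver with a given number of iovec entries.  In block/gluster.c this
 * would be fed with qiov->niov just before glfs_pwritev_async() is called.
 */
#define NIOV_MAX 256            /* clamp anything larger into the last bucket */

static unsigned long niov_hist[NIOV_MAX + 1];

static void account_niov(int niov)
{
    if (niov < 0) {
        return;
    }
    if (niov > NIOV_MAX) {
        niov = NIOV_MAX;
    }
    niov_hist[niov]++;
}

static void dump_niov_hist(void)
{
    for (int i = 0; i <= NIOV_MAX; i++) {
        if (niov_hist[i]) {
            fprintf(stderr, "niov=%3d: %lu requests\n", i, niov_hist[i]);
        }
    }
}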
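
And to make the virtio_blk_submit_multireq() point concrete, my reading is
that two requests are merged only when the second starts exactly where the
first one ends, i.e. a condition along the lines of the simplified paraphrase
below (not the actual QEMU code, just the shape of the check as I understand
it, with size/iovec limits omitted):

#include <stdbool.h>
#include <stdint.h>

/*
 * Simplified paraphrase of the merge criterion as I understand it: requests
 * are merged only when they are strictly sequential on the virtual disk.
 */
static bool reqs_are_sequential(uint64_t sector_num, unsigned int nb_sectors,
                                uint64_t next_sector_num)
{
    return sector_num + nb_sectors == next_sector_num;
}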