A guest with 100 virtio-blk-pci,num-queues=32 devices only reaches 10k IOPS
while a guest with a single device reaches 105k IOPS
(rw=randread,bs=4k,iodepth=1,ioengine=libaio).

The bottleneck is that aio_poll() userspace polling iterates over all
AioHandlers to invoke their ->io_poll() callbacks. All AioHandlers are polled
even if only one of them was recently active. Therefore a guest with many
disks is slower than a guest with a single disk even when the workload only
accesses a single disk.

This patch series solves this scalability problem so that IOPS is unaffected
by the number of devices. The trick is to poll only AioHandlers that were
recently active so that userspace polling scales well.

Unfortunately it's not possible to accomplish this with the existing epoll(7)
fd monitoring implementation. This patch series adds a Linux io_uring fd
monitoring implementation. The critical feature is that io_uring can check
the readiness of file descriptors through userspace polling. This makes it
possible to safely poll a subset of AioHandlers from userspace without risk of
starving the other AioHandlers.

Stefan Hajnoczi (7):
  aio-posix: completely stop polling when disabled
  aio-posix: move RCU_READ_LOCK() into run_poll_handlers()
  aio-posix: extract ppoll(2) and epoll(7) fd monitoring
  aio-posix: simplify FDMonOps->update() prototype
  aio-posix: add io_uring fd monitoring implementation
  aio-posix: support userspace polling of fd monitoring
  aio-posix: remove idle poll handlers to improve scalability

 MAINTAINERS           |   2 +
 configure             |   5 +
 include/block/aio.h   |  70 ++++++-
 util/Makefile.objs    |   3 +
 util/aio-posix.c      | 449 ++++++++++++++----------------------------
 util/aio-posix.h      |  81 ++++++++
 util/fdmon-epoll.c    | 155 +++++++++++++++
 util/fdmon-io_uring.c | 332 +++++++++++++++++++++++++++++++
 util/fdmon-poll.c     | 107 ++++++++++
 util/trace-events     |   2 +
 10 files changed, 898 insertions(+), 308 deletions(-)
 create mode 100644 util/aio-posix.h
 create mode 100644 util/fdmon-epoll.c
 create mode 100644 util/fdmon-io_uring.c
 create mode 100644 util/fdmon-poll.c

-- 
2.24.1
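
For readers who want a concrete picture of "poll only AioHandlers that were
recently active", here is a minimal standalone sketch. It is not the code from
this series. Handler, Ctx, mark_active(), poll_active(), poll_flag() and
IDLE_TIMEOUT_NS are hypothetical names, and the timeout value is an arbitrary
assumption. The sketch keeps a separate poll list so that the cost of one
polling pass depends on the number of active handlers rather than on the total
number of handlers. It assumes that some fd monitoring mechanism calls
mark_active() when a dropped handler sees activity again, which is what keeps
idle handlers from being starved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed idle threshold (10 ms); the value is illustrative only. */
#define IDLE_TIMEOUT_NS INT64_C(10000000)

/* Hypothetical, simplified stand-in for an AioHandler. */
typedef struct Handler Handler;
struct Handler {
    bool (*io_poll)(void *opaque); /* returns true if progress was made */
    void *opaque;
    int64_t last_active_ns;        /* last time the handler was active */
    Handler *next_poll;            /* link in the poll list */
    bool on_poll_list;             /* true while polled from userspace */
};

/* Hypothetical event loop context holding only the active handlers. */
typedef struct {
    Handler *poll_list;
} Ctx;

/* Example ->io_poll() callback: report and clear a pending flag. */
static bool poll_flag(void *opaque)
{
    bool *ready = opaque;
    bool was_ready = *ready;

    *ready = false;
    return was_ready;
}

/*
 * Assumed to be called when fd monitoring reports activity on a handler:
 * refresh its timestamp and put it on the poll list if it is not there.
 */
static void mark_active(Ctx *ctx, Handler *h, int64_t now_ns)
{
    assert(h->io_poll != NULL);
    h->last_active_ns = now_ns;
    if (!h->on_poll_list) {
        h->next_poll = ctx->poll_list;
        ctx->poll_list = h;
        h->on_poll_list = true;
    }
}

/*
 * Poll only handlers on the poll list. A handler that makes no progress and
 * has been idle for longer than IDLE_TIMEOUT_NS is unlinked from the list.
 * Returns the number of ->io_poll() callbacks invoked in this pass.
 */
static size_t poll_active(Ctx *ctx, int64_t now_ns)
{
    size_t polled = 0;
    Handler **link = &ctx->poll_list;

    while (*link) {
        Handler *h = *link;

        polled++;
        if (h->io_poll(h->opaque)) {
            h->last_active_ns = now_ns;
        } else if (now_ns - h->last_active_ns > IDLE_TIMEOUT_NS) {
            *link = h->next_poll;   /* unlink the idle handler */
            h->next_poll = NULL;
            h->on_poll_list = false;
            continue;
        }
        link = &h->next_poll;
    }
    return polled;
}
```

With 100 handlers of which one is busy, poll_active() in this sketch settles
at one callback per pass once the other 99 have timed out, instead of 100
callbacks per pass.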
