On Wed, Nov 27 2013, Stefan Hajnoczi wrote:
> I finally got around to reading the Linux multiqueue block layer paper
> and wanted to share some thoughts about how it relates to QEMU and
> dataplane/QContext:
> http://kernel.dk/blk-mq.pdf
>
> I think Jens has virtio-blk multiqueue patches. So let's imagine that
> the virtio-blk device has multiple virtqueues. (virtio-scsi is
> already multiqueue BTW.)
>
> The paper focusses on two queue mappings: 1 queue per core and 1 queue
> per node. In both cases the idea is to keep the block I/O code path
> localized. This makes block I/O scale as the number of CPUs
> increases.
>
> In QEMU we'd want to set up a mapping for the virtio-blk mq device:
> each guest vcpu or guest node has a virtio-blk virtqueue which is
> serviced by a dataplane/QContext thread.
>
> QEMU would then process requests across these queues in parallel,
> although currently BlockDriverState is not thread-safe. At least for
> raw we should be able to submit requests in parallel from QEMU.
>
> Unfortunately there are some complications in the QEMU block layer:
> QEMU's own accounting, request tracking, and throttling features are
> global. We'd need to eventually do something similar to the
> multiqueue block layer changes in the kernel to detangle this state.
>
> Doing multiqueue for image formats is much more challenging - we'd
> have to tackle thread-safety in qcow2 and friends. For network block
> drivers like Gluster or NBD it's also not 100% clear what the best
> approach is. But I think the target here is local SSDs that are
> capable of high IOPs together with an SMP guest.
>
> At the end of all this we'd arrive at the following architecture:
> 1. Guest virtio device has multiple queues (1 per node or vcpu).
> 2. QEMU has multiple dataplane/QContext threads that process virtqueue
>    kicks, they are bound to host CPUs/nodes.
> 3. Linux kernel has multiqueue block I/O.

I think that sounds very reasonable. Let me know if there's anything you
need help or advice with.

> Jens: when experimenting with multiqueue virtio-blk, how far did you
> modify QEMU to eliminate global request processing state from block.c?

I did very little scaling testing on virtio-blk, it was more a demo case
for conversion than anything else. So probably not of much use to what
you are looking for...

-- 
Jens Axboe
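
For illustration only, the two queue mappings discussed in the quoted text (one virtqueue per vcpu, or one per node) could be sketched roughly as below. This is not QEMU code: the function names, the policy strings, and the assumption that vcpus are numbered contiguously within each node are all hypothetical, chosen only to make the mapping concrete.

```python
from collections import defaultdict


def queue_for_vcpu(vcpu, policy, vcpus_per_node=1):
    """Return the index of the virtqueue a guest vcpu would submit to.

    Hypothetical helper: assumes vcpus are numbered contiguously per node,
    so vcpus 0..N-1 sit on node 0, N..2N-1 on node 1, and so on.
    """
    if policy == "per-vcpu":
        # One virtqueue per guest vcpu.
        return vcpu
    if policy == "per-node":
        # One virtqueue shared by all vcpus on the same guest node.
        return vcpu // vcpus_per_node
    raise ValueError("unknown policy: %r" % (policy,))


def build_mapping(num_vcpus, policy, vcpus_per_node=1):
    """Group vcpus by the virtqueue that would serve them.

    Each key stands for one virtqueue, which in the proposed architecture
    would be serviced by its own dataplane thread.
    """
    mapping = defaultdict(list)
    for vcpu in range(num_vcpus):
        mapping[queue_for_vcpu(vcpu, policy, vcpus_per_node)].append(vcpu)
    return dict(mapping)


if __name__ == "__main__":
    print(build_mapping(4, "per-vcpu"))
    print(build_mapping(4, "per-node", vcpus_per_node=2))
```

With 4 vcpus, the per-vcpu policy yields four queues with one vcpu each, while the per-node policy with 2 vcpus per node yields two queues with two vcpus each. Either way the point is the same as in the paper: each request stays on one queue and one servicing thread from submission to completion.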