On 08/10/2012 13:39, Stefan Hajnoczi wrote:
> This series looks useful - it compartmentalizes aio.c so there can be
> multiple event loops.  In order to get a performance benefit (hooking
> up virtio-blk ioeventfd to a non-QEMU mutex thread) we need two more
> things:
>
> 1. Block layer association with an AioContext (perhaps
>    BlockDriverState <-> AioContext attaching).

Right.  This mostly governs which AioContext the bottom halves are
created on.  In addition, all block devices along the ->file and
->backing_hd chains need to belong to the same AioContext.
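Something along these lines is what I have in mind.  It is only a
sketch; bdrv_set_aio_context and the bs->aio_context field are names I
am making up here, nothing like this exists yet:

    /* Sketch only: neither this function nor bs->aio_context exist yet.
     * Attach a BlockDriverState and its whole ->file/->backing_hd chain
     * to one AioContext, so that bottom halves and fd handlers for the
     * device are created on that context instead of the main loop's.
     */
    void bdrv_set_aio_context(BlockDriverState *bs, AioContext *ctx)
    {
        bs->aio_context = ctx;          /* hypothetical new field */
        if (bs->file) {
            bdrv_set_aio_context(bs->file, ctx);
        }
        if (bs->backing_hd) {
            bdrv_set_aio_context(bs->backing_hd, ctx);
        }
    }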
> 2. Thread pool for dispatching I/O requests outside the QEMU global
>    mutex.

I looked at this in the past and it feels like a dead end to me.  I had
a lot of special code in the thread pool to mimic yield/enter of
thread-pool work items.  It was needed mostly for I/O throttling, but
also because it feels unsafe to swap a CoMutex with a Mutex---the
waiting I/O operations can starve the thread pool.

I now think it is simpler to keep cooperative coroutine-based
multitasking in the general case.  At the same time you can ensure that
the AIO formats (both Linux AIO and posix-aio-compat) get a suitable
no-coroutine fast path in the common case of no copy-on-read, no
throttling, etc.---which can be done in the current code too.

Another important step would be to add bdrv_drain.  Kevin pointed out
to me that only ->file and ->backing_hd need to be drained.  Well,
there may be other BlockDriverStates for vmdk extents or similar cases
(Benoit's quorum device, for example)... these need to be handled the
same way for bdrv_flush, bdrv_reopen and bdrv_drain, so perhaps it is
useful to add a common way to enumerate them.

And you need a lock on the AioContext, too.  The block device can then
use the AioContext lock to synchronize multiple threads working on the
block device.  The lock will effectively block the ioeventfd thread, so
that bdrv_lock+bdrv_drain+...+bdrv_unlock becomes a replacement for the
current usage of bdrv_drain_all within the QEMU lock.

> I'm starting to work on these steps and will send RFCs.  This series
> looks good to me.

Thanks!  A lot of the next steps can be done in parallel, and more
importantly none of them (roughly) blocks the others... so I'm eager to
look at your stuff! :)

Paolo
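P.S. To make the last point concrete, the pattern I have in mind looks
roughly like this.  bdrv_lock, bdrv_unlock and the per-AioContext lock
are all made-up names at this point, not existing API:

    /* Sketch only, none of these functions exist yet.  Taking the
     * AioContext lock stops the ioeventfd thread from dispatching
     * handlers for this context; draining under that lock then replaces
     * today's bdrv_drain_all() under the big QEMU lock.
     */
    static void do_something_exclusive(BlockDriverState *bs)
    {
        bdrv_lock(bs);      /* acquire the lock of bs's AioContext */
        bdrv_drain(bs);     /* wait for requests on bs, ->file and
                               ->backing_hd to complete */

        /* ... reopen, flush, change the backing file, etc. ... */

        bdrv_unlock(bs);    /* let the ioeventfd thread run again */
    }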