On 09.10.2012 17:02, Anthony Liguori wrote:
> Stefan Hajnoczi <stefa...@gmail.com> writes:
>
>> On Mon, Oct 08, 2012 at 03:00:04PM +0200, Paolo Bonzini wrote:
>>> Another important step would be to add bdrv_drain. Kevin pointed out
>>> to me that only ->file and ->backing_hd need to be drained. Well,
>>> there may be other BlockDriverStates for vmdk extents or similar cases
>>> (Benoit's quorum device for example)... these need to be handled the
>>> same way for bdrv_flush, bdrv_reopen and bdrv_drain, so perhaps it is
>>> useful to add a common way to get them.
>>>
>>> And you need a lock on the AioContext, too. Then the block device can
>>> use the AioContext lock to synchronize multiple threads working on the
>>> block device. The lock will effectively block the ioeventfd thread, so
>>> that bdrv_lock+bdrv_drain+...+bdrv_unlock is a replacement for the
>>> current usage of bdrv_drain_all within the QEMU lock.
>>>
>>>> I'm starting to work on these steps and will send RFCs. This series
>>>> looks good to me.
>>>
>>> Thanks! A lot of the next steps can be done in parallel and, more
>>> importantly, none of them blocks the others (roughly)... so I'm eager
>>> to look at your stuff! :)
>>
>> Some notes on moving virtio-blk processing out of the QEMU global
>> mutex:
>>
>> 1. A dedicated thread for virtio ioeventfd processing outside the QEMU
>>    mutex. The point of this thread is to process without the QEMU
>>    global mutex, using only fine-grained locks. (In the future this
>>    thread can be integrated back into the QEMU iothread once the global
>>    mutex has been eliminated.)
>>
>>    The dedicated thread must hold a reference to the virtio-blk device
>>    so it will not be destroyed. Hot unplug requires asking ioeventfd
>>    processing threads to release their references.
>>
>> 2. Versions of virtqueue_pop() and virtqueue_push() that execute
>>    outside the global QEMU mutex. Look at the memory API and threaded
>>    device dispatch.
>>
>>    The virtio device itself must have a lock so its vring-related state
>>    can be modified safely.
>>
>> Here are the steps that have been mentioned:
>>
>> 1. aio fastpath - for raw-posix and other aio block drivers, can we
>>    reduce I/O request latency by skipping block layer coroutines? This
>>    can be prototyped (hacked) easily to scope out how much benefit we
>>    get. It's completely independent of the global mutex work.
>
> We've previously discussed having an additional layer on top of the
> block API.
>
> One problem with the block API today is that it doesn't distinguish
> between device access and internal access. I think this is an
> opportunity to introduce a device-only API.
>
> In the very short term, I can imagine an aio fastpath that is
> implemented only in terms of the device API. We could have a slow path
> that acquires the BQL.
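A minimal sketch of the bdrv_lock+bdrv_drain+...+bdrv_unlock pattern
Paolo describes above. Note the assumptions: bdrv_lock()/bdrv_unlock()
are assumed names for taking and releasing the AioContext lock, and
bdrv_drain() is the proposed per-device drain; none of these exist in
QEMU today:

    /* Assumed helpers: bdrv_lock()/bdrv_unlock() wrap the AioContext
     * lock; bdrv_drain() is the proposed per-device replacement for
     * draining everything via bdrv_drain_all(). */
    static void reconfigure_locked(BlockDriverState *bs)
    {
        bdrv_lock(bs);    /* blocks the ioeventfd thread for this device */
        bdrv_drain(bs);   /* wait for requests on ->file/->backing_hd */

        /* ... reopen, snapshot or otherwise modify bs safely here ... */

        bdrv_unlock(bs);  /* allow dataplane processing to resume */
    }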
FWIW, I think we'll automatically get two APIs with the
BlockDriverState/BlockBackend separation. However, I'm not entirely sure
whether it's exactly what you're imagining, because BlockBackend (the
"device API") wouldn't be used only by devices, but also by qemu-img/io,
libqblock and probably block jobs, too.

Kevin
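A rough sketch of the BlockDriverState/BlockBackend split mentioned
above; all names and fields are illustrative assumptions, not a final
QEMU API:

    /* Hypothetical "device API" layer sitting on top of the internal
     * BlockDriverState graph. */
    typedef struct BlockBackend {
        BlockDriverState *bs;  /* root of the graph behind this backend */
        void *user;            /* guest device, qemu-img/io, libqblock,
                                * block job, ... */
    } BlockBackend;

    /* External users go through the backend... */
    static int blk_read(BlockBackend *blk, int64_t sector_num,
                        uint8_t *buf, int nb_sectors)
    {
        return bdrv_read(blk->bs, sector_num, buf, nb_sectors);
    }

    /* ...while internal code (format drivers, graph manipulation)
     * keeps calling bdrv_*() on BlockDriverState directly. */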