On Tue, Apr 04, 2023 at 03:43:20PM +0200, Paolo Bonzini wrote: > On 4/3/23 20:29, Stefan Hajnoczi wrote: > > The aio_disable_external() API temporarily suspends file descriptor > > monitoring > > in the event loop. The block layer uses this to prevent new I/O requests > > being > > submitted from the guest and elsewhere between bdrv_drained_begin() and > > bdrv_drained_end(). > > > > While the block layer still needs to prevent new I/O requests in drained > > sections, the aio_disable_external() API can be replaced with > > .drained_begin/end/poll() callbacks that have been added to BdrvChildClass > > and > > BlockDevOps. > > > > This newer .bdrained_begin/end/poll() approach is attractive because it > > works > > without specifying a specific AioContext. The block layer is moving towards > > multi-queue and that means multiple AioContexts may be processing I/O > > simultaneously. > > > > The aio_disable_external() was always somewhat hacky. It suspends all file > > descriptors that were registered with is_external=true, even if they have > > nothing to do with the BlockDriverState graph nodes that are being drained. > > It's better to solve a block layer problem in the block layer than to have > > an > > odd event loop API solution. > > > > That covers the motivation for this change, now on to the specifics of this > > series: > > > > While it would be nice if a single conceptual approach could be applied to > > all > > is_external=true file descriptors, I ended up looking at callers on a > > case-by-case basis. There are two general ways I migrated code away from > > is_external=true: > > > > 1. Block exports are typically best off unregistering fds in > > .drained_begin() > > and registering them again in .drained_end(). The .drained_poll() > > function > > waits for in-flight requests to finish using a reference counter. > > > > 2. Emulated storage controllers like virtio-blk and virtio-scsi are a little > > simpler. They can rely on BlockBackend's request queuing during drain > > feature. Guest I/O request coroutines are suspended in a drained > > section and > > resume upon the end of the drained section. > > Sorry, I disagree with this. > > Request queuing was shown to cause deadlocks; Hanna's latest patch is piling > another hack upon it, instead in my opinion we should go in the direction of > relying _less_ (or not at all) on request queuing. > > I am strongly convinced that request queuing must apply only after > bdrv_drained_begin has returned, which would also fix the IDE TRIM bug > reported by Fiona Ebner. The possible livelock scenario is generally not a > problem because 1) outside an iothread you have anyway the BQL that prevents > a vCPU from issuing more I/O operations during bdrv_drained_begin 2) in > iothreads you have aio_disable_external() instead of .drained_begin(). > > It is also less tidy to start a request during the drained_begin phase, > because a request that has been submitted has to be completed (cancel > doesn't really work). > > So in an ideal world, request queuing would not only apply only after > bdrv_drained_begin has returned, it would log a warning and .drained_begin() > should set up things so that there are no such warnings.
That's fine, I will give .drained_begin/end/poll() a try with virtio-blk and virtio-scsi in the next revision. Stefan
signature.asc
Description: PGP signature