Am 29.06.2018 um 14:40 hat Denis Plotnikov geschrieben: > Fixes the problem of ide request appearing when the BDS is in > the "drained section". > > Without the patch the request can come and be processed by the main > event loop, as the ide requests are processed by the main event loop > and the main event loop doesn't stop when its context is in the > "drained section". > The request execution is postponed until the end of "drained section". > > The patch doesn't modify ide specific code, as well as any other > device code. Instead, it modifies the infrastructure of asynchronous > Block Backend requests, in favor of postponing the requests arisen > when in "drained section" to remove the possibility of request appearing > for all the infrastructure clients. > > This approach doesn't make vCPU processing the request wait untill > the end of request processing. > > Signed-off-by: Denis Plotnikov <dplotni...@virtuozzo.com>
I generally agree with the idea that requests should be queued during a drained section. However, I think there are a few fundamental problems with the implementation in this series: 1) aio_disable_external() is already a layering violation and we'd like to get rid of it (by replacing it with a BlockDevOps callback from BlockBackend to the devices), so adding more functionality there feels like a step in the wrong direction. 2) Only blk_aio_* are fixed, while we also have synchronous public interfaces (blk_pread/pwrite) as well as coroutine-based ones (blk_co_*). They need to be postponed as well. blk_co_preadv/pwritev() are the common point in the call chain for all of these variants, so this is where the fix needs to live. 3) Within a drained section, you want requests from other users to be blocked, but not your own ones (essentially you want exclusive access). We don't have blk_drained_begin/end() yet, so this is not something to implement right now, but let's keep this requirement in mind and choose a design that allows this. I believe the whole logic should be kept local to BlockBackend, and blk_root_drained_begin/end() should be the functions that start queuing requests or let queued requests resume. As we are already in coroutine context in blk_co_preadv/pwritev(), after checking that blk->quiesce_counter > 0, we can enter the coroutine object into a list and yield. blk_root_drained_end() calls aio_co_wake() for each of the queued coroutines. This should be all that we need to manage. Kevin