On 17.01.2019 17:23, Kevin Wolf wrote: > Am 17.01.2019 um 13:57 hat Denis Plotnikov geschrieben: >> Kevin, >> >> could you please take a look at my last comments? > > I read it, and what it told me is essentially that I need to work on it > myself to fully understand the problem and possible acceptable solutions > because you can't seem to find one yourself. I will, but I can't > guarantee when I can find the time for it. > > Kevin ok. Thanks!
Denis > >> On 15.01.2019 10:22, Denis Plotnikov wrote: >>> ping ping ping ping!!!! >>> >>> On 09.01.2019 11:18, Denis Plotnikov wrote: >>>> ping ping!!! >>>> >>>> On 18.12.2018 11:53, Denis Plotnikov wrote: >>>>> ping ping >>>>> >>>>> On 14.12.2018 14:54, Denis Plotnikov wrote: >>>>>> >>>>>> >>>>>> On 13.12.2018 15:20, Kevin Wolf wrote: >>>>>>> Am 13.12.2018 um 12:07 hat Denis Plotnikov geschrieben: >>>>>>>> On 12.12.2018 15:24, Kevin Wolf wrote: >>>>>>>>> Am 11.12.2018 um 17:55 hat Denis Plotnikov geschrieben: >>>>>>>>>>> Why involve the AioContext at all? This could all be kept at the >>>>>>>>>>> BlockBackend level without extending the layering violation that >>>>>>>>>>> aio_disable_external() is. >>>>>>>>>>> >>>>>>>>>>> BlockBackends get notified when their root node is drained, so >>>>>>>>>>> hooking >>>>>>>>>>> things up there should be as easy, if not even easier than in >>>>>>>>>>> AioContext. >>>>>>>>>> >>>>>>>>>> Just want to make sure that I understood correctly what you >>>>>>>>>> meant by >>>>>>>>>> "BlockBackends get notified". Did you mean that bdrv_drain_end >>>>>>>>>> calls >>>>>>>>>> child's role callback blk_root_drained_end by calling >>>>>>>>>> bdrv_parent_drained_end? >>>>>>>>> >>>>>>>>> Yes, blk_root_drained_begin/end calls are all you need. >>>>>>>>> Specifically, >>>>>>>>> their adjustments to blk->quiesce_counter that are already there, >>>>>>>>> and in >>>>>>>>> the 'if (--blk->quiesce_counter == 0)' block of >>>>>>>>> blk_root_drained_end() >>>>>>>>> we can resume the queued requests. >>>>>>>> Sounds it should be so, but it doesn't work that way and that's why: >>>>>>>> when doing mirror we may resume postponed coroutines too early >>>>>>>> when the >>>>>>>> underlying bs is protected from writing at and thus we encounter the >>>>>>>> assert on a write request execution at bdrv_co_write_req_prepare when >>>>>>>> resuming the postponed coroutines. >>>>>>>> >>>>>>>> The thing is that the bs is protected for writing before execution of >>>>>>>> bdrv_replace_node at mirror_exit_common and bdrv_replace_node calls >>>>>>>> bdrv_replace_child_noperm which, in turn, calls >>>>>>>> child->role->drained_end >>>>>>>> where one of the callbacks is blk_root_drained_end which check >>>>>>>> if(--blk->quiesce_counter == 0) and runs the postponed requests >>>>>>>> (coroutines) if the coundition is true. >>>>>>> >>>>>>> Hm, so something is messed up with the drain sections in the mirror >>>>>>> driver. We have: >>>>>>> >>>>>>> bdrv_drained_begin(target_bs); >>>>>>> bdrv_replace_node(to_replace, target_bs, &local_err); >>>>>>> bdrv_drained_end(target_bs); >>>>>>> >>>>>>> Obviously, the intention was to keep the BlockBackend drained during >>>>>>> bdrv_replace_node(). So how could blk->quiesce_counter ever get to 0 >>>>>>> inside bdrv_replace_node() when target_bs is drained? >>>>>>> >>>>>>> Looking at bdrv_replace_child_noperm(), it seems that the function has >>>>>>> a bug: Even if old_bs and new_bs are both drained, the quiesce_counter >>>>>>> for the parent reaches 0 for a moment because we call .drained_end for >>>>>>> the old child first and .drained_begin for the new one later. >>>>>>> >>>>>>> So it seems the fix would be to reverse the order and first call >>>>>>> .drained_begin for the new child and then .drained_end for the old >>>>>>> child. Sounds like a good new testcase for tests/test-bdrv-drain.c, >>>>>>> too. >>>>>> Yes, it's true, but it's not enough... >>>>>> In mirror_exit_common() we actively manipulate with block driver >>>>>> states. >>>>>> When we replaced a node in the snippet you showed we can't allow the >>>>>> postponed coroutines to run because the block tree isn't ready to >>>>>> receive the requests yet. >>>>>> To be ready, we need to insert a proper block driver state to the >>>>>> block backend which is done here >>>>>> >>>>>> blk_remove_bs(bjob->blk); >>>>>> blk_set_perm(bjob->blk, 0, BLK_PERM_ALL, &error_abort); >>>>>> blk_insert_bs(bjob->blk, mirror_top_bs, &error_abort); << << << << >>>>>> >>>>>> bs_opaque->job = NULL; >>>>>> >>>>>> bdrv_drained_end(src); >>>>>> >>>>>> If the tree isn't ready and we resume the coroutines, we'll end up >>>>>> with the request landed in a wrong block driver state. >>>>>> >>>>>> So, we explicitly should stop all activities on all the driver states >>>>>> and its parents and allow the activities when everything is ready to >>>>>> go. >>>>>> >>>>>> Why explicitly, because the block driver states may belong to >>>>>> different block backends at the moment of the manipulation beginning. >>>>>> >>>>>> So, it seems we need to disable all their contexts until the >>>>>> manipulation ends. >>>>>> >>>>>> Please, correct me if I'm wrong. >>>>>> >>>>>>> >>>>>>>> In seems that if the external requests disabled on the context we >>>>>>>> can't >>>>>>>> rely on anything or should check where the underlying bs and its >>>>>>>> underlying nodes are ready to receive requests which sounds quite >>>>>>>> complicated. >>>>>>>> Please correct me if still don't understand something in that >>>>>>>> routine. >>>>>>> >>>>>>> I think the reason why reyling on aio_disable_external() works is >>>>>>> simply >>>>>>> because src is also drained, which keeps external events in the >>>>>>> AioContext disabled despite the bug in draining the target node. >>>>>>> >>>>>>> The bug would become apparent even with aio_disable_external() if we >>>>>>> didn't drain src, or even if we just supported src and target being in >>>>>>> different AioContexts. >>>>>> >>>>>> Why don't we disable all those contexts involved until the end of >>>>>> the block device tree reconstruction? >>>>>> >>>>>> Thanks! >>>>>> >>>>>> Denis >>>>>>> >>>>>>> Kevin >>>>>>> >>>>>> >>>>> >>>> >>> >> >> -- >> Best, >> Denis -- Best, Denis