On 08.02.2019 14:03, Kevin Wolf wrote: > Am 24.01.2019 um 08:48 hat Denis Plotnikov geschrieben: >> When there is a Backup Block Job running and shutdown command is sent to >> a guest, the guest crushes due to assert(!bs->walking_aio_notifiers). >> >> Call stack: >> >> 0 __GI_raise >> 1 __GI_abort >> 2 __assert_fail_base >> 3 __GI___assert_fail >> 4 bdrv_detach_aio_context (bs=0x55f54d65c000) <<< >> 5 bdrv_detach_aio_context (bs=0x55f54fc8a800) >> 6 bdrv_set_aio_context (bs=0x55f54fc8a800, ...) >> 7 block_job_attached_aio_context >> 8 bdrv_attach_aio_context (bs=0x55f54d65c000, ...) <<< >> 9 bdrv_set_aio_context (bs=0x55f54d65c000) >> 10 blk_set_aio_context >> 11 virtio_blk_data_plane_stop >> 12 virtio_bus_stop_ioeventfd >> 13 virtio_vmstate_change >> 14 vm_state_notify (running=0, state=RUN_STATE_SHUTDOWN) >> 15 do_vm_stop (state=RUN_STATE_SHUTDOWN, send_stop=true) >> 16 vm_stop (state=RUN_STATE_SHUTDOWN) >> 17 main_loop_should_exit >> 18 main_loop >> 19 main >> >> This happens because of "new" context attachment to VM disk bds. >> When attaching a new context the corresponding aio context handler is >> called for each of aio_notifiers registered on the VM disk bds context. >> Among those handlers there is the block_job_attached_aio_context handler >> which sets a new aio context for the block job bds. When doing so, >> the old context is detached from all the block job bds children and one of >> them is the VM disk bds, serving as backing store for the blockjob bds, >> although the VM disk bds is actually the initializer of that process. >> Since the VM disk bds is protected with walking_aio_notifiers flag >> from double processing in recursive calls, the assert fires. >> >> The patch fixes the problem by skipping the bds-es in recursive calls >> which have started attachment/detachment already. >> >> Signed-off-by: Denis Plotnikov <dplotni...@virtuozzo.com> > > I'm not sure if this fix is actually correct in more than just your > special case. It assumes without checking that the nested attach is for > the same context as the outer attach. (Of course, I'm not sure what you > would do if they were for different contexts! It's a case to avoid, and > it should fail very visibly instead of moving nodes to unexpected > contexts.) > > bdrv_set_aio_context() is pretty messy, so it may still turn out that > leaving the greater problem unadressed and just fixing what breaks in > practice is what we need to do. > > But did you consider just forbidding nested bdrv_set_aio_context() > instead? Essentially, instead of your checks you would assert > !walking_aio_notifiers. This would simplify the interface instead of > adding more complexity. > > If I understand the code correctly, at the time of the nested call, the > respective nodes have already been moved to the new context, so avoiding > the nested calls might be as easy as adding a shortcut to > bdrv_set_aio_context() for this case: > > if (ctx == new_context) { > return; > } > Yes, this is probably the best solution. I'll send the patch shortly. > Most importantly, though, please write a test case for this. We will > probably need to rework bdrv_set_aio_context() fundamentally (as Berto > recently noticed), and we will want to avoid regressing on this bug. > Will do > Kevin >
Denis -- Best, Denis