On 03/01/2017 08:53 PM, Paolo Bonzini wrote: > > > On 01/03/2017 17:08, Halil Pasic wrote: >> applied I do not see the problem any more. I will most likely >> turn this into a patch tomorrow. I would like to give it some more testing >> and >> thinking (see questions below) until tomorrow. >> >> I should probably cc stable, or? > > Yes, please do! > >> >> Q1. For this to work correctly, it seems to me, we need to be sure that >> virtio_blk_req_complete can not be happen between the newly added >> notify_guest_bh(s); >> and >> vblk->dataplane_started = false; >> becomes visible. How is this ensured? > > blk_set_aio_context drains the block device, and the event notifiers are > not active anymore so draining the block device coincides with the last > call to virtio_blk_req_complete. > > Please add a comment - it's a good observation. > >> Q2. The virtio_blk_data_plane_stop should be from the thread/context >> associated with the main event loop, and with that >> vblk->dataplane_started = false too. But I think dataplane_started >> may end up being used form a different thread (e.g. req_complete). > > 1) virtio_queue_aio_set_host_notifier_handler stops the event notifiers > > 2) virtio_bus_set_host_notifier invokes them one last time before exiting > > Note that this could call again virtio_queue_notify_vq and hence > virtio_device_start_ioeventfd, but dataplane won't be reactivated > because vblk->dataplane_started is still true. > >> How does the sequencing work there and/or is it even important? > > It is important and not really easy to get right---as shown by the bug > you found, in fact. >
Thank you very much for the explanations. I have just sent a patch based on what we discussed here. I think I roughly understand now, how this is supposed to work regarding concurrency, but I guess I will have to just trust you to some extent.