On 23/02/2016 13:49, Fam Zheng wrote:
> On Tue, 02/23 11:43, Paolo Bonzini wrote:
>>
>>
>> On 23/02/2016 06:57, Fam Zheng wrote:
>>>>>> +        qed_cancel_need_check_timer(s);
>>>>>> +        qed_need_check_timer_cb(s);
>>>>>> +    }
>>>>>
>>>>> What if an allocating write is queued (the else branch case)? Its
>>>>> completion will be in bdrv_drain and it could arm the
>>>>> need_check_timer which is wrong.
>>>>>
>>>>> We need to drain the allocating_write_reqs queue before checking the
>>>>> timer.
>>>>
>>>> You're right, but how? That's what bdrv_drain(bs) does, it's a
>>>> chicken-and-egg problem.
>>>
>>> Maybe use an aio_poll loop before the if?
>>
>> That would not change the fact that you're reimplementing bdrv_drain
>> inside bdrv_qed_drain.
>
> But it fulfills the contract of .bdrv_drain. This is the easy way, the
> hard way would be iterating through the allocating_write_reqs list and
> process reqs one by one synchronously, which still involves aio_poll
> indirectly.

The easy way would be better then. Stefan, any second opinion?

Paolo

>> Perhaps for now it's simplest to just remove the QED .bdrv_drain
>> callback, if you think this patch is not a good stopgap measure to avoid
>> the segmentation faults.
>
> OK, I'm fine with this as a stopgap measure.
>
>> Once the bdrv_drain rework is in, we can move the callback _after_ I/O
>> is drained on bs and before it is drained on bs->file->bs.
>
> Sounds good.
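
For illustration only, the ordering discussed in this thread (poll until the
queued allocating writes have completed, and only then cancel and run the
need-check timer) can be modelled with a toy state machine. This is not QEMU
code and not the patch under review: ToyQED, toy_poll() and toy_drain() are
hypothetical stand-ins, and toy_poll() only imitates what an aio_poll
iteration would do for this one driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, NOT QEMU code: every name here is a hypothetical stand-in. */
typedef struct {
    int queued_allocating_writes; /* models the allocating_write_reqs queue */
    bool timer_armed;             /* models need_check_timer being pending */
    bool header_clean;            /* true once the timer callback has run */
} ToyQED;

/* One event-loop iteration: complete one queued allocating write.
 * Completing an allocating write re-arms the timer, which is the hazard
 * raised in the thread. Returns true if any progress was made. */
static bool toy_poll(ToyQED *s)
{
    if (s->queued_allocating_writes == 0) {
        return false;
    }
    s->queued_allocating_writes--;
    s->timer_armed = true;
    s->header_clean = false;
    return true;
}

/* Models the timer callback: mark the image header clean. */
static void toy_need_check_timer_cb(ToyQED *s)
{
    s->header_clean = true;
}

/* Models the "easy way": poll first, then check the timer. */
void toy_drain(ToyQED *s)
{
    /* Drain the queue before looking at the timer, so that no completion
     * can arm the timer after we have checked it. */
    while (toy_poll(s)) {
        /* keep polling until no progress is made */
    }
    assert(s->queued_allocating_writes == 0);

    if (s->timer_armed) {
        s->timer_armed = false;      /* cancel the pending timer */
        toy_need_check_timer_cb(s);  /* and run its callback right away */
    }
}
```

Calling toy_drain() on a state with three queued writes leaves the queue
empty, the timer disarmed and the header clean. Swapping the loop and the
"if" in toy_drain() would leave the timer armed after the drain, which is
the problem Fam pointed out.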