On Tue, Apr 05, 2016 at 09:27:39AM +0800, Fam Zheng wrote: > On Mon, 04/04 12:57, Stefan Hajnoczi wrote: > > On Fri, Apr 01, 2016 at 09:57:38PM +0800, Fam Zheng wrote: > > > + BdrvCoDrainData data; > > > + > > > + assert(qemu_in_coroutine()); > > > + data = (BdrvCoDrainData) { > > > + .co = qemu_coroutine_self(), > > > + .bs = bs, > > > + .done = false, > > > + .bh = aio_bh_new(bdrv_get_aio_context(bs), bdrv_co_drain_bh_cb, > > > &data), > > > + }; > > > + qemu_bh_schedule(data.bh); > > > + > > > + qemu_coroutine_yield(); > > > + /* If we are resumed from some other event (such as an aio > > > completion or a > > > + * timer callback), it is a bug in the caller that should be fixed. > > > */ > > > + assert(data.done); > > > +} > > > + > > > /* > > > * Wait for pending requests to complete on a single BlockDriverState > > > subtree, > > > * and suspend block driver's internal I/O until next request arrives. > > > @@ -269,6 +306,10 @@ void bdrv_drain(BlockDriverState *bs) > > > bool busy = true; > > > > > > bdrv_drain_recurse(bs); > > > + if (qemu_in_coroutine()) { > > > + bdrv_co_drain(bs); > > > + return; > > > + } > > > while (busy) { > > > /* Keep iterating */ > > > bdrv_flush_io_queue(bs); > > > -- > > > 2.7.4 > > > > block/mirror.c should call bdrv_co_drain() explicitly and bdrv_drain() > > should assert(!qemu_in_coroutine()). > > > > The reason why existing bdrv_read() and friends detect coroutine context > > at runtime is because it is meant for legacy code that runs in both > > coroutine and non-coroutine contexts. > > blk_unref/bdrv_unref are called in coroutine (.bdrv_create implementations), > and this doesn't just work with the assertion. Should I clean up this "legacy" > code first, i.e. move bdrv_unref calls to BHs in the callers and > assert(!qemu_in_coroutine()) there too? I didn't think this because it > complicates the code somehow.
This is a messy problem. In general I don't like introducing yields into non-coroutine_fn functions because it can lead to bugs when the caller didn't expect a yield point. For example, I myself wouldn't have expected bdrv_unref() to be a yield point. So maybe coroutine code I've written would be vulnerable to interference (I won't call it a race condition) from another coroutine across the bdrv_unref() call. This could mean that another coroutine now sees intermediate state that would never be visible without the new yield point. I think attempting to invoke qemu_co_queue_run_restart() directly instead of scheduling a BH and yielding does not improve the situation. It's also a layering violation since qemu_co_queue_run_restart() is just meant for the core coroutine code and isn't a public interface. Anyway, let's consider bdrv_drain() legacy code that can call if (qemu_in_coroutine()) but please make bdrv_co_drain() public so block/mirror.c can at least call it directly. Stefan
signature.asc
Description: PGP signature