On 05.09.2012 09:41, Bharata B Rao wrote:
> On Thu, Aug 09, 2012 at 06:32:16PM +0530, Bharata B Rao wrote:
>> +static void qemu_gluster_complete_aio(GlusterAIOCB *acb)
>> +{
>> +    int ret;
>> +
>> +    if (acb->canceled) {
>> +        qemu_aio_release(acb);
>> +        return;
>> +    }
>> +
>> +    if (acb->ret == acb->size) {
>> +        ret = 0; /* Success */
>> +    } else if (acb->ret < 0) {
>> +        ret = acb->ret; /* Read/Write failed */
>> +    } else {
>> +        ret = -EIO; /* Partial read/write - fail it */
>> +    }
>> +    acb->common.cb(acb->common.opaque, ret);
>
> The .cb() here is bdrv_co_io_em_complete(). It does qemu_coroutine_enter(),
> handles the return value and comes back here.

Right. .cb is set by qemu_gluster_aio_rw/flush(), and the only way these
can be called is through bdrv_co_io_em() and bdrv_co_flush(), which both
set bdrv_co_io_em_complete as the callback.

> But if the bdrv_read or bdrv_write or bdrv_flush was called from a
> coroutine context (as against they themselves creating a new coroutine),
> the above .cb() call above doesn't return to this point.

Why? A coroutine that yields before it has completed must be reentered,
no matter whether it was created for a single request or already existed.
Conversely, a coroutine that you enter always yields at some point; then
qemu_coroutine_enter() returns and you get back to this line of code.

If you never come back to this point, there's a bug somewhere.

> Hence I won't
> be able to release the acb and decrement the qemu_aio_count.
>
> What could be the issue here ? In general, how do I ensure that my
> aio calls get completed correctly in such scenarios where bdrv_read etc
> are called from coroutine context rather than from main thread context ?
>
> Creating qcow2 image would lead to this scenario where
> ->bdrv_create (=qcow2_create) will create a coroutine and subsequently
> read and write are called within qcow2_create in coroutine context itself.

Can you describe in more detail what code paths it's taking and at which
point you're thinking it's wrong?

Kevin
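
The enter/yield rule described above can be illustrated with a minimal
sketch. It uses POSIX ucontext instead of QEMU's coroutine implementation,
and co_enter(), co_yield(), request_fn() and run_demo() are hypothetical
names invented for this illustration, not QEMU functions. It only shows that
each enter call returns to its caller once the coroutine yields or finishes.

```c
#include <assert.h>
#include <ucontext.h>

/* Hypothetical stand-ins for qemu_coroutine_enter()/qemu_coroutine_yield(),
 * built on POSIX ucontext. This is not QEMU code. */
static ucontext_t caller_ctx, co_ctx;
static char co_stack[64 * 1024];
static int step;

/* Switch from the coroutine back to whoever entered it. */
static void co_yield(void)
{
    swapcontext(&co_ctx, &caller_ctx);
}

/* Switch into the coroutine; returns when it yields or terminates. */
static void co_enter(void)
{
    swapcontext(&caller_ctx, &co_ctx);
}

/* Models a request: submit, wait for completion, handle the result. */
static void request_fn(void)
{
    step = 1;     /* request submitted */
    co_yield();   /* wait until the completion path reenters us */
    step = 2;     /* return value handled */
    /* Returning here resumes uc_link, i.e. the caller of co_enter(). */
}

/* Returns the last step reached by the coroutine (2 if it ran to the end). */
int run_demo(void)
{
    if (getcontext(&co_ctx) != 0) {
        return -1;
    }
    co_ctx.uc_stack.ss_sp = co_stack;
    co_ctx.uc_stack.ss_size = sizeof(co_stack);
    co_ctx.uc_link = &caller_ctx;
    makecontext(&co_ctx, request_fn, 0);

    co_enter();           /* first entry: runs until co_yield() */
    assert(step == 1);    /* control came back to the caller */

    co_enter();           /* "completion callback": reenter the coroutine */
    assert(step == 2);    /* control came back here again */

    return step;
}
```

Calling run_demo() from a main() exercises both entries. The second
co_enter() plays the role of the .cb() call in the completion path: the
statement after it is reached as soon as the coroutine finishes or yields
again.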