On 15.08.2012 07:21, Bharata B Rao wrote:
> On Tue, Aug 14, 2012 at 10:29:26AM +0200, Kevin Wolf wrote:
>>>>> +static void gluster_finish_aiocb(struct glfs_fd *fd, ssize_t ret, void *arg)
>>>>> +{
>>>>> +    GlusterAIOCB *acb = (GlusterAIOCB *)arg;
>>>>> +    BDRVGlusterState *s = acb->common.bs->opaque;
>>>>> +
>>>>> +    acb->ret = ret;
>>>>> +    if (qemu_gluster_send_pipe(s, acb) < 0) {
>>>>> +        /*
>>>>> +         * Gluster AIO callback thread failed to notify the waiting
>>>>> +         * QEMU thread about IO completion. Nothing much can be done
>>>>> +         * here but to abruptly abort.
>>>>> +         *
>>>>> +         * FIXME: Check if the read side of the fd handler can somehow
>>>>> +         * be notified of this failure paving the way for a graceful exit.
>>>>> +         */
>>>>> +        error_report("Gluster failed to notify QEMU about IO completion");
>>>>> +        abort();
>>>>
>>>> In the extreme case you may choose to make this disk inaccessible
>>>> (something like bs->drv = NULL), but abort() kills the whole VM and
>>>> should only be called when there is a bug.
>>>
>>> There have been concerns raised about this earlier too. I settled for this
>>> since I couldn't see a better way out, and I could see a precedent
>>> for this in posix-aio-compat.c.
>>>
>>> So I could just do the necessary cleanup, set bs->drv to NULL and return
>>> from here? But how do I wake up the QEMU thread that is waiting on the
>>> read side of the pipe? Without that, the QEMU thread that waits on the
>>> read side of the pipe is still hung.
>>
>> There is no other thread. But you're right, you should probably
>> unregister the aio_fd_handler and any other pending callbacks.
>
> As I clarified in the other mail, this (gluster_finish_aiocb) is called
> from gluster thread context, and hence the QEMU thread that issued the
> original read/write request is still blocked on qemu_aio_wait().
>
> I tried the following cleanup instead of an abrupt abort:
>
> close(read_fd); /* This will wake up the QEMU thread blocked on
>                    select(read_fd...) */
> close(write_fd);
> qemu_aio_set_fd_handler(read_fd, NULL, NULL, NULL, NULL);
> qemu_aio_release(acb);
> s->qemu_aio_count--;
> bs->drv = NULL;
>
> I tested this by manually injecting faults into qemu_gluster_send_pipe().
> With the above cleanup, the guest kernel crashes with IO errors.
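For reference, this is roughly how that cleanup would sit inside
gluster_finish_aiocb() itself. A rough sketch only: the pipe fd field names
(s->fd_read, s->fd_write) are a guess at what BDRVGlusterState provides, and
all of this still runs in gluster thread context:

    static void gluster_finish_aiocb(struct glfs_fd *fd, ssize_t ret, void *arg)
    {
        GlusterAIOCB *acb = (GlusterAIOCB *)arg;
        BlockDriverState *bs = acb->common.bs;
        BDRVGlusterState *s = bs->opaque;

        acb->ret = ret;
        if (qemu_gluster_send_pipe(s, acb) < 0) {
            error_report("Gluster failed to notify QEMU about IO completion");

            /* s->fd_read / s->fd_write: assumed names for the two ends of
             * the notification pipe. Stop watching the pipe, then close
             * both ends; closing the read end wakes up the QEMU thread
             * blocked in select(). */
            qemu_aio_set_fd_handler(s->fd_read, NULL, NULL, NULL, NULL);
            close(s->fd_read);
            close(s->fd_write);

            qemu_aio_release(acb);
            s->qemu_aio_count--;

            /* Make only this disk inaccessible instead of killing the VM */
            bs->drv = NULL;
        }
    }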
What does "crash" really mean? IO errors certainly shouldn't cause a
kernel to crash.

> Is there anything else that I need to do, or do differently, to keep the
> VM running without disk access?
>
> I thought of completing the aio callback by doing
> acb->common.cb(acb->common.opaque, -EIO);
> but that would do a coroutine enter from the gluster thread, which I don't
> think should be done.

You would have to take the global qemu mutex at least. I agree it's not
a good thing to do.

Kevin
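PS: For concreteness, a minimal sketch of what "take the global qemu mutex"
would mean if the completion callback really were invoked from the gluster
thread. The helper name is made up, and even with the lock this re-enters
block layer code (and possibly a coroutine) from a foreign thread, which is
why it is not a recommended approach:

    /* Hypothetical helper, only to illustrate the locking requirement. */
    static void gluster_complete_from_callback_thread(GlusterAIOCB *acb, int error)
    {
        qemu_mutex_lock_iothread();                 /* the global QEMU mutex */
        acb->common.cb(acb->common.opaque, error);  /* e.g. -EIO */
        /* The rest of the normal completion (releasing the acb, decrementing
         * the in-flight count) would also have to happen under the lock. */
        qemu_mutex_unlock_iothread();
    }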