On Wed, Aug 21, 2013 at 10:02:47AM +0800, Asias He wrote:
> In block/gluster.c, we have
>
>     gluster_finish_aiocb
>     {
>         if (retval != sizeof(acb)) {
>             qemu_mutex_lock_iothread(); /* We are in gluster thread context */
>             ...
>             qemu_mutex_unlock_iothread();
>         }
>     }
>
> qemu tools, e.g. qemu-img, might race here because
> qemu_mutex_{lock,unlock}_iothread are a nop operation and
> gluster_finish_aiocb is in the gluster thread context.
>
> To fix, we introduce our own mutex for qemu tools.
I think we need to look more closely at the error code path:

    acb->ret = ret;
    retval = qemu_write_full(s->fds[GLUSTER_FD_WRITE], &acb, sizeof(acb));
    if (retval != sizeof(acb)) {
        /*
         * Gluster AIO callback thread failed to notify the waiting
         * QEMU thread about IO completion.
         *
         * Complete this IO request and make the disk inaccessible for
         * subsequent reads and writes.
         */
        error_report("Gluster failed to notify QEMU about IO completion");

        qemu_mutex_lock_iothread(); /* We are in gluster thread context */
        acb->common.cb(acb->common.opaque, -EIO);
        qemu_aio_release(acb);
        close(s->fds[GLUSTER_FD_READ]);
        close(s->fds[GLUSTER_FD_WRITE]);

Is it safe to close the fds?  There is a race here:

1. Another thread opens a new file descriptor and gets GLUSTER_FD_READ's or
   GLUSTER_FD_WRITE's old fd value.

2. Another gluster thread invokes the callback and does
   qemu_write_full(s->fds[GLUSTER_FD_WRITE], ...).

Since the mutex doesn't protect s->fds[], this is possible.

Maybe a simpler solution for request completion is:

1. Linked list of completed acbs.
2. Mutex to protect the linked list.
3. EventNotifier to signal the iothread.

Then this function becomes:

static void gluster_finish_aiocb(struct glfs_fd *fd, ssize_t ret, void *arg)
{
    GlusterAIOCB *acb = arg;
    BlockDriverState *bs = acb->common.bs;
    BDRVGlusterState *s = bs->opaque;

    acb->ret = ret;

    qemu_mutex_lock(&s->finish_list_lock);
    QSIMPLEQ_INSERT_TAIL(&s->finish_list, acb, list);
    qemu_mutex_unlock(&s->finish_list_lock);

    event_notifier_set(&s->finish_list_notifier);
}
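For what it's worth, the iothread side could then be a small drain function
along these lines.  This is only a sketch, not tested code: it assumes the
notifier is registered with qemu_aio_set_event_notifier() at open time and
that GlusterAIOCB gains a QSIMPLEQ_ENTRY(GlusterAIOCB) list field; the
function name is made up here.

/* Sketch only: runs in the iothread when finish_list_notifier fires.
 * Detach the completed requests under the lock, then invoke the
 * callbacks without holding it. */
static void gluster_finish_list_read(EventNotifier *e)
{
    BDRVGlusterState *s = container_of(e, BDRVGlusterState,
                                       finish_list_notifier);
    QSIMPLEQ_HEAD(, GlusterAIOCB) completed =
        QSIMPLEQ_HEAD_INITIALIZER(completed);
    GlusterAIOCB *acb;

    event_notifier_test_and_clear(e);

    qemu_mutex_lock(&s->finish_list_lock);
    QSIMPLEQ_CONCAT(&completed, &s->finish_list);
    qemu_mutex_unlock(&s->finish_list_lock);

    while ((acb = QSIMPLEQ_FIRST(&completed)) != NULL) {
        QSIMPLEQ_REMOVE_HEAD(&completed, list);
        acb->common.cb(acb->common.opaque, acb->ret);
        qemu_aio_release(acb);
    }
}

Stefan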