Kevin,

Thanks for your review. I will address all of your comments in the next
iteration, but have a few questions/comments on some of them...
On Mon, Aug 13, 2012 at 02:50:29PM +0200, Kevin Wolf wrote:
> > +static int parse_server(GlusterURI *uri, char *server)
> > +{
> > +    int ret = -EINVAL;
> > +    char *token, *saveptr;
> > +    char *p, *q = server;
> > +
> > +    p = strchr(server, '[');
> > +    if (p) {
> > +        /* [ipv6] */
> > +        if (p != server) {
> > +            /* [ not in the beginning */
> > +            goto out;
> > +        }
> > +        q++;
> > +        p = strrchr(p, ']');
> > +        if (!p) {
> > +            /* No matching ] */
> > +            goto out;
> > +        }
> > +        *p++ = '\0';
> > +        uri->server = g_strdup(q);
> > +
> > +        if (*p) {
> > +            if (*p != ':') {
> > +                /* [ipv6] followed by something other than : */
> > +                goto out;
> > +            }
> > +            uri->port = strtoul(++p, NULL, 0);
> > +            if (uri->port < 0) {
> > +                goto out;
> > +            }
>
> This accepts inputs where the colon isn't followed by any number.

Yes, and that will result in port=0, which is the default. This is to
cater for cases like gluster://[1:2:3:4:5]:/volname/image

In any case, let me see if I can get rid of this altogether and reuse
qemu-sockets.c:inet_parse().

> > +        if (token) {
> > +            uri->port = strtoul(token, NULL, 0);
> > +            if (uri->port < 0) {
> > +                goto out;
> > +            }
> > +        } else {
> > +            uri->port = 0;
> > +        }
>
> The port parsing code is duplicated in IPv4 and IPv6, even though it's
> really the same.

Since it is such a small piece of code, I didn't think it deserved to be
made into a function of its own and reused. But if you really want it
that way, I can do that.

> > +static struct glfs *qemu_gluster_init(GlusterURI *uri, const char *filename)
> > +{
> > +    struct glfs *glfs = NULL;
> > +    int ret;
> > +
> > +    ret = qemu_gluster_parseuri(uri, filename);
> > +    if (ret < 0) {
> > +        error_report("Usage: file=gluster://server[:port]/volname/image"
> > +                     "[?transport=socket]");
>
> Is 'socket' really the only valid transport and will it stay like this
> without changes to qemu?

There are others, like 'unix' and 'rdma'. I will fix this error message
to reflect that. However, QEMU needn't change for such transport types
because I am not interpreting the transport type in QEMU but instead
passing it on directly to GlusterFS.

> > +
> > +static void qemu_gluster_aio_event_reader(void *opaque)
> > +{
> > +    BDRVGlusterState *s = opaque;
> > +    GlusterAIOCB *event_acb;
> > +    int event_reader_pos = 0;
> > +    ssize_t ret;
> > +
> > +    do {
> > +        char *p = (char *)&event_acb;
> > +
> > +        ret = read(s->fds[GLUSTER_FD_READ], p + event_reader_pos,
> > +                   sizeof(event_acb) - event_reader_pos);
>
> So you're reading in a pointer address from a pipe? This is fun.
>
> > +        if (ret > 0) {
> > +            event_reader_pos += ret;
> > +            if (event_reader_pos == sizeof(event_acb)) {
> > +                event_reader_pos = 0;
> > +                qemu_gluster_complete_aio(event_acb);
> > +                s->qemu_aio_count--;
> > +            }
> > +        }
> > +    } while (ret < 0 && errno == EINTR);
>
> In case of a short read the read data is just discarded? Maybe
> event_reader_pos was supposed to static?

In earlier versions event_reader_pos was part of BDRVGlusterState; I made
it local in subsequent versions and that is what causes this problem.
Will fix.
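For reference, here is roughly the fix I have in mind. This is an
untested sketch that assumes both event_acb and event_reader_pos move
(back) into BDRVGlusterState, so a short read can resume where the
previous invocation of the fd handler left off:

static void qemu_gluster_aio_event_reader(void *opaque)
{
    BDRVGlusterState *s = opaque;
    ssize_t ret;

    do {
        /* Continue filling the partially read acb pointer, if any */
        char *p = (char *)&s->event_acb;

        ret = read(s->fds[GLUSTER_FD_READ], p + s->event_reader_pos,
                   sizeof(s->event_acb) - s->event_reader_pos);
        if (ret > 0) {
            s->event_reader_pos += ret;
            if (s->event_reader_pos == sizeof(s->event_acb)) {
                /* A complete GlusterAIOCB pointer has arrived */
                s->event_reader_pos = 0;
                qemu_gluster_complete_aio(s->event_acb);
                s->qemu_aio_count--;
            }
        }
    } while (ret < 0 && errno == EINTR);
}

With the position kept in the state, a short read simply returns and the
next invocation picks up from where it stopped, instead of discarding
the partial data.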
> > +
> > +static void qemu_gluster_aio_cancel(BlockDriverAIOCB *blockacb)
> > +{
> > +    GlusterAIOCB *acb = (GlusterAIOCB *)blockacb;
> > +
> > +    acb->common.cb(acb->common.opaque, -ECANCELED);
> > +    acb->canceled = true;
> > +}
>
> After having called acb->common.cb you must not write any longer to the
> memory pointed at by the qiov. Either you can really cancel the request,
> or you need to wait until it completes.

I don't think I can cancel a request that has already been sent to the
gluster server. block/qed.c introduces acb->finished and waits on it in
qed_aio_cancel(). Do you suggest I follow that model? (If so, I have
appended a rough sketch of what that might look like here, after my
sign-off.)

> > +
> > +static AIOPool gluster_aio_pool = {
> > +    .aiocb_size = sizeof(GlusterAIOCB),
> > +    .cancel = qemu_gluster_aio_cancel,
> > +};
> > +
> > +static int qemu_gluster_send_pipe(BDRVGlusterState *s, GlusterAIOCB *acb)
> > +{
> > +    int ret = 0;
> > +    while (1) {
> > +        fd_set wfd;
> > +        int fd = s->fds[GLUSTER_FD_WRITE];
> > +
> > +        ret = write(fd, (void *)&acb, sizeof(acb));
> > +        if (ret >= 0) {
> > +            break;
> > +        }
> > +        if (errno == EINTR) {
> > +            continue;
> > +        }
> > +        if (errno != EAGAIN) {
> > +            break;
> > +        }
> > +
> > +        FD_ZERO(&wfd);
> > +        FD_SET(fd, &wfd);
> > +        do {
> > +            ret = select(fd + 1, NULL, &wfd, NULL, NULL);
> > +        } while (ret < 0 && errno == EINTR);
>
> What's the idea behind this? While we're hanging in this loop noone will
> read anything from the pipe, so it's unlikely that it magically becomes
> ready.

I write to the pipe and wait for the reader to read it. The reader
(qemu_gluster_aio_event_reader) is already waiting on the other end of
the pipe.

> > +    }
> > +    return ret;
> > +}
> > +
> > +static void gluster_finish_aiocb(struct glfs_fd *fd, ssize_t ret, void *arg)
> > +{
> > +    GlusterAIOCB *acb = (GlusterAIOCB *)arg;
> > +    BDRVGlusterState *s = acb->common.bs->opaque;
> > +
> > +    acb->ret = ret;
> > +    if (qemu_gluster_send_pipe(s, acb) < 0) {
> > +        /*
> > +         * Gluster AIO callback thread failed to notify the waiting
> > +         * QEMU thread about IO completion. Nothing much can be done
> > +         * here but to abruptly abort.
> > +         *
> > +         * FIXME: Check if the read side of the fd handler can somehow
> > +         * be notified of this failure paving the way for a graceful exit.
> > +         */
> > +        error_report("Gluster failed to notify QEMU about IO completion");
> > +        abort();
>
> In the extreme case you may choose to make this disk inaccessible
> (something like bs->drv = NULL), but abort() kills the whole VM and
> should only be called when there is a bug.

There have been concerns raised about this earlier too. I settled for
this since I couldn't see a better way out, and there is precedent for
it in posix-aio-compat.c.

So I could just do the necessary cleanup, set bs->drv to NULL and return
from here? But how do I wake up the QEMU thread that is waiting on the
read side of the pipe? Without that, it remains hung.

Regards,
Bharata.
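As mentioned above, here is a rough, untested sketch of what following
the block/qed.c model could look like in this driver. It assumes
GlusterAIOCB grows a 'bool *finished' member, and that the completion
path (qemu_gluster_complete_aio) sets *acb->finished = true after
invoking the callback, the way qed_aio_complete_bh() does:

static void qemu_gluster_aio_cancel(BlockDriverAIOCB *blockacb)
{
    GlusterAIOCB *acb = (GlusterAIOCB *)blockacb;
    bool finished = false;

    /*
     * A request already submitted to the gluster server cannot really
     * be cancelled, so wait until it completes instead.
     */
    acb->finished = &finished;
    while (!finished) {
        qemu_aio_wait();
    }
}

With that, acb->common.cb is only ever invoked from the normal
completion path, and cancel simply blocks until the request has really
finished.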