On 2015/2/9 17:23, Paolo Bonzini wrote:
>
>
> On 07/02/2015 10:51, w00214312 wrote:
>> From: Bin Wu <wu.wu...@huawei.com>
>>
>> When we test the drive_mirror between different hosts by nbd devices,
>> we find that, during the cancel phase the qemu process crashes sometimes.
>> By checking the crash core file, we find the stack as follows, which means
>> a coroutine re-enter error occurs:
>
> This bug probably can be fixed simply by delaying the setting of
> recv_coroutine.
>
> What are the symptoms if you only apply your "qemu-coroutine-lock: fix
> co_queue multi-adding bug" patch but not "qemu-coroutine: fix
> qemu_co_queue_run_restart error"?
>
> Can you try the patch below?  (Compile-tested only).
>

Yes, I think this patch can solve the problem too. I will try the patch later.

> diff --git a/block/nbd-client.c b/block/nbd-client.c
> index 6e1c97c..23d6a71 100644
> --- a/block/nbd-client.c
> +++ b/block/nbd-client.c
> @@ -104,10 +104,21 @@ static int nbd_co_send_request(NbdClientSession *s,
>      QEMUIOVector *qiov, int offset)
>  {
>      AioContext *aio_context;
> -    int rc, ret;
> +    int rc, ret, i;
>
>      qemu_co_mutex_lock(&s->send_mutex);
> +
> +    for (i = 0; i < MAX_NBD_REQUESTS; i++) {
> +        if (s->recv_coroutine[i] == NULL) {
> +            s->recv_coroutine[i] = qemu_coroutine_self();
> +            break;
> +        }
> +    }
> +
> +    assert(i < MAX_NBD_REQUESTS);
> +    request->handle = INDEX_TO_HANDLE(s, i);
>      s->send_coroutine = qemu_coroutine_self();
> +
>      aio_context = bdrv_get_aio_context(s->bs);
>      aio_set_fd_handler(aio_context, s->sock,
>                         nbd_reply_ready, nbd_restart_write, s);
> @@ -164,8 +175,6 @@ static void nbd_co_receive_reply(NbdClientSession *s,
>  static void nbd_coroutine_start(NbdClientSession *s,
>      struct nbd_request *request)
>  {
> -    int i;
> -
>      /* Poor man semaphore.  The free_sema is locked when no other request
>       * can be accepted, and unlocked after receiving one reply.  */
>      if (s->in_flight >= MAX_NBD_REQUESTS - 1) {
> @@ -174,15 +183,7 @@ static void nbd_coroutine_start(NbdClientSession *s,
>      }
>      s->in_flight++;
>
> -    for (i = 0; i < MAX_NBD_REQUESTS; i++) {
> -        if (s->recv_coroutine[i] == NULL) {
> -            s->recv_coroutine[i] = qemu_coroutine_self();
> -            break;
> -        }
> -    }
> -
> -    assert(i < MAX_NBD_REQUESTS);
> -    request->handle = INDEX_TO_HANDLE(s, i);
> +    /* s->recv_coroutine[i] is set as soon as we get the send_lock.  */
>  }
>
>  static void nbd_coroutine_end(NbdClientSession *s,
>
>

--
Bin Wu