On Thu, Aug 24, 2017 at 5:21 PM, Paolo Bonzini <pbonz...@redhat.com> wrote: > On 24/08/2017 17:33, Stefan Hajnoczi wrote: >> This patch enters read_reply_co directly in >> nbd_client_attach_aio_context(). This is safe because new_context is >> acquired by the caller. This ensures that read_reply_co reaches its >> first yield point and its ctx is set up. > > I'm not very confident with this patch. aio_context_acquire/release is > going to go away, and this then becomes possible > > main context new_context > qemu_aio_coroutine_enter > send request > wait for reply > read first reply > wake coroutine > > where the "wake coroutine" part thinks it's running in new_context, and > thus simply enters the coroutine instead of using the bottom half. > > But blk_co_preadv() should need the read_reply_co itself, in order to be > woken up after reading the reply header. The core issue here is that > nbd_co_receive_reply was never called, I suspect. And if it was never > called, read_reply_co should not be woken up by nbd_coroutine_end. > > So the fix is: > > 1) assign NULL to s->recv_coroutine[i] when nbd_co_send_request fails > > 2) move this to nbd_co_receive_reply: > > s->recv_coroutine[i] = NULL; > > /* Kick the read_reply_co to get the next reply. */ > if (s->read_reply_co) { > aio_co_wake(s->read_reply_co); > } > > Does this make sense? (Note that the read_reply_co idea actually came > from you, or from my recollections of your proposed design :)).
Yes, I think that would work. Stefan