On 03/04/2012 02:41 PM, Michael Tokarev wrote:
> Since the whole block (bdrv) layer is now implemented using
> coroutines, I thought I'd give it a try.  But I immediately
> hit a question to which I don't know a good answer.
>
> Suppose we have some networked block device (like NBD) and
> want to be able to support reconnection - this is actually
> a very useful feature, in order to be able to reboot/restart
> the NBD server without a need to restart all the clients.
>
> For this to work, we should have an ability to reconnect
> to the server and re-issue all requests which were waiting
> for a reply.
>
> Traditionally, in an asynchronous event-loop-based scheme, this
> is implemented as a queue of requests linked to the block
> driver state structure, and in case of reconnection we just
> walk over all requests and requeue these.
>
> But if the block driver is implemented as a set of coroutines
> (like nbd currently is), I see no sane/safe way to restart
> the requests.  Setjmp/longjmp can be used with extra care
> there, but with these it is extremely fragile.
>
> Any hints on how to do that?
>

From the block layer's point of view, the requests should still be
pending.  For example, if a read request sees a dropped connection, it
adds itself to a list of coroutines waiting for a reconnect, wakes up a
connection manager coroutine (or thread), and sleeps.  The connection
manager periodically tries to connect, and if it succeeds, it wakes up
the coroutines waiting for a reconnection.

It's important to implement request cancellation correctly here, or we
can end up with a device that cannot be unplugged or a guest that
cannot be shut down.
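
A rough sketch of that control flow, written with Python's asyncio
rather than QEMU's coroutine API, purely to illustrate the idea.
Everything here is made up for the example (FlakyServer,
ReconnectingClient, retry_delay, demo); none of it is QEMU or NBD code,
and the "connect" step is simulated by polling a flag.

```python
import asyncio


class FlakyServer:
    """Hypothetical stand-in for a server that can go down and come back."""

    def __init__(self):
        self.up = True

    async def read(self, offset):
        if not self.up:
            raise ConnectionError("server down")
        return f"data@{offset}"


class ReconnectingClient:
    """Hypothetical client: requests park themselves until reconnect."""

    def __init__(self, server, retry_delay=0.01):
        self.server = server
        self.retry_delay = retry_delay
        self.waiters = []              # futures of requests awaiting reconnect
        self.wakeup = asyncio.Event()  # kicks the connection manager

    async def manager(self):
        """Connection manager: sleep until kicked, retry, wake the waiters."""
        while True:
            await self.wakeup.wait()
            self.wakeup.clear()
            while not self.server.up:  # periodic "connect" attempt (simulated)
                await asyncio.sleep(self.retry_delay)
            waiters, self.waiters = self.waiters, []
            for fut in waiters:
                if not fut.done():     # skip requests cancelled meanwhile
                    fut.set_result(None)

    async def read(self, offset):
        """The request stays pending for the caller across a reconnect."""
        while True:
            try:
                return await self.server.read(offset)
            except ConnectionError:
                fut = asyncio.get_running_loop().create_future()
                self.waiters.append(fut)   # join the wait list
                self.wakeup.set()          # wake the connection manager
                try:
                    await fut              # sleep until reconnected
                except asyncio.CancelledError:
                    # Cancellation must not leave a stale entry behind.
                    if fut in self.waiters:
                        self.waiters.remove(fut)
                    raise


async def demo():
    server = FlakyServer()
    client = ReconnectingClient(server)
    mgr = asyncio.create_task(client.manager())

    server.up = False                        # connection drops
    survivor = asyncio.create_task(client.read(0))
    doomed = asyncio.create_task(client.read(512))
    await asyncio.sleep(0.03)                # both requests are now parked
    parked = len(client.waiters)

    doomed.cancel()                          # e.g. the device is unplugged
    await asyncio.gather(doomed, return_exceptions=True)
    left = len(client.waiters)

    server.up = True                         # server is back
    result = await survivor                  # re-issued after the wakeup
    mgr.cancel()
    await asyncio.gather(mgr, return_exceptions=True)
    return parked, left, result, doomed.cancelled()
```

Running asyncio.run(demo()) parks two reads while the server is down,
cancels one of them, and lets the other complete after the simulated
reconnect.  The point of the cancellation path is that a parked request
can always be torn down without waiting for the server to come back.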