On Tue, Sep 12 2006, Peter Zijlstra wrote:
> @@ -463,10 +465,13 @@ static void do_nbd_request(request_queue
>
>  error_out:
>                 req->errors++;
> -               spin_unlock(q->queue_lock);
> -               nbd_end_request(req);
> -               spin_lock(q->queue_lock);
> +               __nbd_end_request(req);
>         }
> +       /*
> +        * q->queue_lock has been dropped, this opens up a race
> +        * plug the device to close it.
> +        */
> +       blk_plug_device(q);
>         return;
>  }

This looks wrong, I wonder if this only fixes things for you because it
happens to reinvoke the request handler after the timeout occurs? Your
comment doesn't really describe what you think is going on, please
describe in detail what you think is happening here that the plugging
supposedly solves.

Generally the block device rule is that once you are invoked due to an
unplug (or whatever) event, it is the responsibility of the block device
to run the queue until it's done. So if you bail out of queue handling
for whatever reason (might be resource starvation in hard- or software),
you must make sure to reenter queue handling since the device will not
get replugged while it has requests pending. Unless you run into some
software resource shortage, running of the queue is done
deterministically when you know resources are available (i.e. an io
completes). The device plugging itself is only ever done when you
encounter a shortage outside of your control (memory shortage, for
instance) _and_ you don't already have pending work where you can invoke
queueing from again.

-- 
Jens Axboe
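
For illustration only, the rule described in this reply can be modelled in a few lines of user-space C. This is a rough sketch, not kernel code: `struct toy_dev`, `toy_run_queue()`, `toy_complete_one()` and the `alloc_fails` flag are all hypothetical names invented for the example, and the "plugged" flag merely stands in for what `blk_plug_device()` would do. It assumes a device with a fixed number of in-flight slots. The handler bails out when the slots are full, the completion path reenters it, and plugging happens only on an external shortage with nothing in flight.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy user-space model; none of these names are kernel APIs. */
struct toy_dev {
	int pending;       /* requests still waiting on the queue */
	int in_flight;     /* requests issued and not yet completed */
	int max_in_flight; /* resource limit under the driver's control */
	int completed;     /* requests finished so far */
	bool plugged;      /* stand-in for plugging the device */
};

/*
 * Request handler: keep dispatching until the queue is empty or we
 * have to bail out. alloc_fails models a shortage outside the
 * driver's control (for instance a failed memory allocation).
 */
static void toy_run_queue(struct toy_dev *d, bool alloc_fails)
{
	d->plugged = false;

	while (d->pending > 0) {
		if (alloc_fails) {
			/*
			 * External shortage. Only plug when no completion
			 * is outstanding that could reenter the handler.
			 */
			if (d->in_flight == 0)
				d->plugged = true;
			return;
		}
		if (d->in_flight == d->max_in_flight)
			return; /* a completion will reenter the handler */

		d->pending--;
		d->in_flight++;
	}
}

/* Completion path: a slot is free again, so rerun the queue from here. */
static void toy_complete_one(struct toy_dev *d)
{
	if (d->in_flight == 0)
		return;

	d->in_flight--;
	d->completed++;
	toy_run_queue(d, false);
}
```

With five pending requests and two slots, the first call to `toy_run_queue()` issues two requests and returns without plugging. Each `toy_complete_one()` then pulls the next request off the queue, so the queue drains without any timer-driven unplug.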