On Tue, Jan 08, 2008 at 02:33:18PM -0500, Jeff Layton wrote: > ...and only have lockd exit when the last reference is dropped. > > The problem is this: > > When a lock that a client is blocking on comes free, lockd does this in > nlmsvc_grant_blocked(): > > nlm_async_call(block->b_call, NLMPROC_GRANTED_MSG, &nlmsvc_grant_ops); > > the callback from this call is nlmsvc_grant_callback(). That function > does this at the end to wake up lockd: > > svc_wake_up(block->b_daemon); > > However there is no guarantee that lockd will be up when this happens. > If someone shuts down or restarts lockd before the async call completes, > then the b_daemon pointer will point to freed memory and the kernel may > oops. > > I first noticed this on older kernels and had mistakenly thought that > newer kernels weren't susceptible, but that's not correct. There's a bit > of a race to make sure that the nlm_host is bound when the async call is > done, but I can now reproduce this at will on current kernels. > > This patch is based on Trond's suggestion to add a new reference counter > to lockd, and only allows lockd to go down when it reaches 0. With this > change we can't use kthread_stop here. nlmsvc_unlink_block is called by > lockd and a kthread can't call kthread_stop on itself. So the patch > changes lockd to check the refcount itself and to return if it goes to > 0. We do the checking and exit while holding the nlmsvc_mutex to make > sure that a new lockd is not started until the old one is down.
I don't like this signals/kthread mixture at all. Why can't we simply call kthread_stop when the refcount hits zero and keep all the nice kthread helpers? -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/