Re: Crash due to mutex genl_lock called from RCU context

Eric Dumazet Sat, 26 Nov 2016 18:26:43 -0800

On Sat, 2016-11-26 at 18:08 -0800, Cong Wang wrote:
> On Fri, Nov 25, 2016 at 8:54 PM, Eric Dumazet <eric.duma...@gmail.com> wrote:
> >
> > Oh well, this wont work, since sk->sk_destruct will be called from RCU
> > callback.
> >
> > Grabbing the mutex should not be done from netlink_sock_destruct() but
> > from netlink_release()
> 
> But you also change the behavior of cb.done(), currently it is called when the
> last sock ref is gone, with your patch it is called when the first
> sock is closed.


No. It will be called when last refcount on the socket is released,
sk_refcnt transitions to final 0.


My patch changes the sock_hold() to the variant that makes sure
sk_refcnt is not 0 before increase, otherwise a race can happen and
release could be called twice.

Classic refcounting stuff coupled to rcu rules.

> No?


Are you telling me inet_release() is called when we close() the first
file descriptor ?

fd1 = socket()
fd2 = dup(fd1);
close(fd2) -> release() ???


> 
> I don't see why we need to get genl_lock in ->done() here, because we are
> already the last sock using it and module ref protects the ops from being
> removed via module, seems we are pretty safe without any lock.

Well, at least this exposes a real bug in Thomas patch.

Removing the lock might be done for net-next, not stable branches.

Re: Crash due to mutex genl_lock called from RCU context

Reply via email to