syzbot reported a circular locking dependency in rds_tcp_tune() where
sk_net_refcnt_upgrade() is called while holding the socket lock:

======================================================
WARNING: possible circular locking dependency detected
======================================================
kworker/u10:8/15040 is trying to acquire lock:
ffffffff8e9aaf80 (fs_reclaim){+.+.}-{0:0},
at: __kmalloc_cache_noprof+0x4b/0x6f0

but task is already holding lock:
ffff88805a3c1ce0 (k-sk_lock-AF_INET6){+.+.}-{0:0},
at: rds_tcp_tune+0xd7/0x930

The issue occurs because sk_net_refcnt_upgrade() performs memory
allocation (via get_net_track() -> ref_tracker_alloc()) while the
socket lock is held, creating a circular dependency with fs_reclaim.

Fix this by moving sk_net_refcnt_upgrade() outside the socket lock
critical section. This is safe because the fields modified by the
sk_net_refcnt_upgrade() call (sk_net_refcnt, ns_tracker) are not
accessed by any concurrent code path at this point.

v2:
  - Corrected fixes tag
  - check patch line wrap nits
  - ai commentary nits

Reported-by: [email protected]
Closes: https://syzkaller.appspot.com/bug?extid=2e2cf5331207053b8106
Fixes: 3a58f13a881e ("net: rds: acquire refcount on TCP sockets")
Signed-off-by: Allison Henderson <[email protected]>
---
 net/rds/tcp.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/net/rds/tcp.c b/net/rds/tcp.c
index 04f310255692..654e23d13e3d 100644
--- a/net/rds/tcp.c
+++ b/net/rds/tcp.c
@@ -490,18 +490,24 @@ bool rds_tcp_tune(struct socket *sock)
        struct rds_tcp_net *rtn;
 
        tcp_sock_set_nodelay(sock->sk);
-       lock_sock(sk);
        /* TCP timer functions might access net namespace even after
         * a process which created this net namespace terminated.
         */
        if (!sk->sk_net_refcnt) {
-               if (!maybe_get_net(net)) {
-                       release_sock(sk);
+               if (!maybe_get_net(net))
                        return false;
-               }
+               /*
+                * sk_net_refcnt_upgrade() must be called before lock_sock()
+                * because it does a GFP_KERNEL allocation, which can trigger
+                * fs_reclaim and create a circular lock dependency with the
+                * socket lock.  The fields it modifies (sk_net_refcnt,
+                * ns_tracker) are not accessed by any concurrent code path
+                * at this point.
+                */
                sk_net_refcnt_upgrade(sk);
                put_net(net);
        }
+       lock_sock(sk);
        rtn = net_generic(net, rds_tcp_netid);
        if (rtn->sndbuf_size > 0) {
                sk->sk_sndbuf = rtn->sndbuf_size;
-- 
2.43.0


Reply via email to