We're seeing a memory leak when we run an infinite loop that loads and unloads rds-tcp and runs some traffic between each load/unload.
Analysis shows that this is happening for the following reason: inet_accept -> sock_graft does

	parent->sk = sk

but if the parent->sk was previously pointing at some other struct sock "old_sk" (happens in the case of rds_tcp_accept_one() which has historically called sock_create_kern() to set up the new_sock), we need to sock_put(old_sk), else we'd leak it.

In general, sock_graft() is cutting loose the parent->sk, so it looks like it needs to release its refcnt on it?

Patch below takes care of the leak in our case, but I could use some input on other locking considerations, and if this is ok with other modules that use sock_graft()

-----------------------patch below---------------------------------

diff --git a/include/net/sock.h b/include/net/sock.h
index 5374c0d..014ad56 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1686,12 +1686,19 @@ static inline void sock_orphan(struct sock *sk)
 
 static inline void sock_graft(struct sock *sk, struct socket *parent)
 {
+	struct sock *old_sk;
+
 	write_lock_bh(&sk->sk_callback_lock);
 	sk->sk_wq = parent->wq;
+	old_sk = parent->sk;
 	parent->sk = sk;
 	sk_set_socket(sk, parent);
 	security_sock_graft(sk, parent);
 	write_unlock_bh(&sk->sk_callback_lock);
+	if (old_sk) {
+		sock_orphan(old_sk);
+		sock_put(old_sk);
+	}
 }
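
To make the refcount accounting described above concrete, here is a small userspace sketch. It is only a toy model, not kernel code: every toy_* name is hypothetical, the refcount is a plain int rather than the kernel's atomic sk_refcnt, and locking and the security hook are left out. It models a socket that already has a sock attached (as after sock_create_kern()) and then gets a second sock grafted onto it.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for struct sock / struct socket (hypothetical, minimal). */
struct toy_socket;
struct toy_sock { int refcnt; struct toy_socket *socket; };
struct toy_socket { struct toy_sock *sk; };

static int toy_live_socks;	/* socks whose refcount has not dropped to 0 */

static void toy_sock_init(struct toy_sock *sk)
{
	sk->refcnt = 1;
	sk->socket = NULL;
	toy_live_socks++;
}

static void toy_sock_put(struct toy_sock *sk)
{
	if (--sk->refcnt == 0)
		toy_live_socks--;	/* stands in for freeing the sock */
}

/* Models the unpatched graft: the old parent->sk is simply overwritten. */
static void toy_graft_unpatched(struct toy_sock *sk, struct toy_socket *parent)
{
	parent->sk = sk;
	sk->socket = parent;
}

/* Models the patched graft: orphan the old sock and drop its reference. */
static void toy_graft_patched(struct toy_sock *sk, struct toy_socket *parent)
{
	struct toy_sock *old_sk = parent->sk;

	parent->sk = sk;
	sk->socket = parent;
	if (old_sk) {
		old_sk->socket = NULL;	/* rough analogue of sock_orphan() */
		toy_sock_put(old_sk);
	}
}

/* Releasing the socket only drops the sock it currently points at. */
static void toy_release(struct toy_socket *parent)
{
	struct toy_sock *sk = parent->sk;

	if (sk) {
		parent->sk = NULL;
		sk->socket = NULL;
		toy_sock_put(sk);
	}
}

/* One accept cycle; returns how many socks were left behind. */
static int toy_accept_cycle(int patched)
{
	struct toy_sock old_sk, new_sk;
	struct toy_socket parent = { NULL };
	int before = toy_live_socks;

	toy_sock_init(&old_sk);		/* sock set up at socket creation */
	parent.sk = &old_sk;
	old_sk.socket = &parent;

	toy_sock_init(&new_sk);		/* sock for the accepted connection */
	if (patched)
		toy_graft_patched(&new_sk, &parent);
	else
		toy_graft_unpatched(&new_sk, &parent);

	toy_release(&parent);
	return toy_live_socks - before;
}
```

In this model toy_accept_cycle(0) leaves one sock behind per cycle, because nothing holds a pointer to old_sk once it has been overwritten, while toy_accept_cycle(1) leaves none. It says nothing about whether dropping the reference outside sk_callback_lock is safe in the real kernel, which is the open question above.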