> -----Original Message----- > From: Eric Dumazet [mailto:eric.duma...@gmail.com] > Sent: Sunday, November 1, 2015 6:37 PM > To: David Miller <da...@davemloft.net> > Cc: Haiyang Zhang <haiya...@microsoft.com>; eduma...@google.com; > netdev@vger.kernel.org; KY Srinivasan <k...@microsoft.com> > Subject: [PATCH v2 net-next] net: make skb_set_owner_w() more robust > > From: Eric Dumazet <eduma...@google.com> > > skb_set_owner_w() is called from various places that assume > skb->sk always point to a full blown socket (as it changes > sk->sk_wmem_alloc) > > We'd like to attach skb to request sockets, and in the future > to timewait sockets as well. For these kind of pseudo sockets, > we need to take a traditional refcount and use sock_edemux() > as the destructor. > > It is now time to un-inline skb_set_owner_w(), being too big. > > Fixes: ca6fb0651883 ("tcp: attach SYNACK messages to request sockets > instead of listener") > Signed-off-by: Eric Dumazet <eduma...@google.com> > Bisected-by: Haiyang Zhang <haiya...@microsoft.com> > --- > v2: sock_edemux() must be guarded by CONFIG_INET > > include/net/sock.h | 17 ++--------------- > net/core/sock.c | 22 ++++++++++++++++++++++ > net/ipv4/tcp_output.c | 4 +--- > 3 files changed, 25 insertions(+), 18 deletions(-) > > diff --git a/include/net/sock.h b/include/net/sock.h > index aeed5c95f3ca..f570e75e3da9 100644 > --- a/include/net/sock.h > +++ b/include/net/sock.h > @@ -1951,6 +1951,8 @@ static inline void skb_set_hash_from_sk(struct > sk_buff *skb, struct sock *sk) > } > } > > +void skb_set_owner_w(struct sk_buff *skb, struct sock *sk); > + > /* > * Queue a received datagram if it will fit. Stream and sequenced > * protocols can't normally use this as they need to fit buffers in > @@ -1959,21 +1961,6 @@ static inline void skb_set_hash_from_sk(struct > sk_buff *skb, struct sock *sk) > * Inlined as it's very short and called for pretty much every > * packet ever received. > */ > - > -static inline void skb_set_owner_w(struct sk_buff *skb, struct sock *sk) > -{ > - skb_orphan(skb); > - skb->sk = sk; > - skb->destructor = sock_wfree; > - skb_set_hash_from_sk(skb, sk); > - /* > - * We used to take a refcount on sk, but following operation > - * is enough to guarantee sk_free() wont free this sock until > - * all in-flight packets are completed > - */ > - atomic_add(skb->truesize, &sk->sk_wmem_alloc); > -} > - > static inline void skb_set_owner_r(struct sk_buff *skb, struct sock *sk) > { > skb_orphan(skb); > diff --git a/net/core/sock.c b/net/core/sock.c > index 0ef30aa90132..7529eb9463be 100644 > --- a/net/core/sock.c > +++ b/net/core/sock.c > @@ -1656,6 +1656,28 @@ void sock_wfree(struct sk_buff *skb) > } > EXPORT_SYMBOL(sock_wfree); > > +void skb_set_owner_w(struct sk_buff *skb, struct sock *sk) > +{ > + skb_orphan(skb); > + skb->sk = sk; > +#ifdef CONFIG_INET > + if (unlikely(!sk_fullsock(sk))) {
Thanks for the fix! For some driver, like ours, this condition may not be "unlikely". So could you remove the "unlikely"? Thanks, - Haiyang