On 13.07.2015 10:23, Herbert Xu wrote:
On Fri, Jul 10, 2015 at 02:51:41PM +0300, Konstantin Khlebnikov wrote:
This fixes a race between non-atomic updates of adjacent bit-fields:
the update of skb->cloned could be lost because netlink broadcast clones
the skb after sending it to the first listener, who sets skb->peeked on
the same skb. As a result, atomic refcounting of the skb header stays
disabled and skb_release_data() frees it twice. The race leads to a
double-free in kmalloc-xxx.

Signed-off-by: Konstantin Khlebnikov <khlebni...@yandex-team.ru>
Fixes: b19372273164 ("net: reorganize sk_buff for faster __copy_skb_header()")
---
  net/netlink/af_netlink.c |    6 ++++++
  1 file changed, 6 insertions(+)

diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index dea925388a5b..921e0d8dfe3a 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -2028,6 +2028,12 @@ int netlink_broadcast_filtered(struct sock *ssk, struct sk_buff *skb, u32 portid
        info.tx_filter = filter;
        info.tx_data = filter_data;

+       /* Enable atomic refcounting in skb_release_data() before first send:
+        * non-atomic set of that bit-field in __skb_clone() could race with
+        * __skb_recv_datagram() which touches the same set of bit-fields.
+        */
+       skb->cloned = 1;
+
        /* While we sleep in clone, do not allow to change socket list */

        netlink_lock_table();
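
As an aside, the lost update described in the changelog can be replayed
deterministically in userspace. The sketch below is only a model under
stated assumptions: struct toy_flags and both function names are
hypothetical, the layout is not the real struct sk_buff, and the explicit
struct copies stand in for the load/modify/store sequence a compiler may
emit for a plain bit-field assignment.

```c
/* Toy stand-in for two adjacent sk_buff flags (hypothetical layout).
 * Both bit-fields live in the same storage unit, so writing one of
 * them rewrites the whole unit. */
struct toy_flags {
	unsigned int cloned:1;
	unsigned int peeked:1;
};

/* Replays one bad interleaving of two non-atomic bit-field writes. */
static struct toy_flags racy_interleaving(void)
{
	struct toy_flags shared = { 0, 0 };
	struct toy_flags cpu_a = shared;	/* sender loads the unit   */
	struct toy_flags cpu_b = shared;	/* receiver loads the unit */

	cpu_a.cloned = 1;	/* sender side:   skb->cloned = 1 */
	cpu_b.peeked = 1;	/* receiver side: skb->peeked = 1 */

	shared = cpu_a;		/* sender stores first             */
	shared = cpu_b;		/* receiver's stale copy overwrites */
	return shared;
}

/* Models setting cloned before the buffer is visible to any receiver:
 * the receiver's load already contains the bit, so its store keeps it. */
static struct toy_flags set_before_publish(void)
{
	struct toy_flags shared = { 0, 0 };
	struct toy_flags cpu_b;

	shared.cloned = 1;	/* done before the first send */

	cpu_b = shared;		/* receiver loads the unit    */
	cpu_b.peeked = 1;
	shared = cpu_b;		/* store preserves cloned     */
	return shared;
}
```

racy_interleaving() ends with cloned cleared and peeked set, which is the
lost update. set_before_publish() mirrors the idea of the patch above:
once cloned is set before any receiver can touch the flags, later
read-modify-write cycles on the same unit carry the bit along.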

Your effort in finding this bug is wonderful.  However I think
the fix is a bit dirty.

The real issue here is that the recv path no longer handles shared
skbs.  So either we need to fix the recv path to not touch skbs
without cloning them, or we need to get rid of the use of shared
skbs in netlink.

I don't think the recv path should care about shared skbs: an skb can be
delivered to only one socket anyway.


A less dirty fix for that: do not send the original skb at all.
That adds one extra clone but makes the code much cleaner.


--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -1957,17 +1957,16 @@ static void do_one_broadcast(struct sock *sk,
        }

        sock_hold(sk);
-       if (p->skb2 == NULL) {
-               if (skb_shared(p->skb)) {
-                       p->skb2 = skb_clone(p->skb, p->allocation);
-               } else {
-                       p->skb2 = skb_get(p->skb);
-                       /*
-                        * skb ownership may have been set when
-                        * delivered to a previous socket.
-                        */
-                       skb_orphan(p->skb2);
-               }
+       if (p->skb2 == NULL || skb_shared(p->skb2)) {
+               kfree_skb(p->skb2);
+               p->skb2 = skb_clone(p->skb, p->allocation);
+       } else {
+               skb_get(p->skb2);
+               /*
+                * skb ownership may have been set when
+                * delivered to a previous socket.
+                */
+               skb_orphan(p->skb2);
        }
        if (p->skb2 == NULL) {
                netlink_overrun(sk);
@@ -1997,7 +1996,6 @@ static void do_one_broadcast(struct sock *sk,
        } else {
                p->congested |= val;
                p->delivered = 1;
-               p->skb2 = NULL;
        }
 out:
        sock_put(sk);




In fact it looks like I introduced the bug way back in

commit a59322be07c964e916d15be3df473fb7ba20c41e
Author: Herbert Xu <herb...@gondor.apana.org.au>
Date:   Wed Dec 5 01:53:40 2007 -0800

     [UDP]: Only increment counter on first peek/recv

I will try to mend this error :)

Cheers,



--
Konstantin
