On Tue, 17 Feb 2009, Mike Tancsa wrote:

Do you have any other details about these issues ? Were the fixes ever MFC'd

Earlier today I handed off some patches for Pete to test (attached below), which he's running alongside the patches in kern/130652. When I run with the patches, basically an MFC of a subset of Kip's routing improvements in 8.x, I can no longer reproduce the lock reversal, which will hopefully mean Pete can no longer reproduce the hang. I plan to merge these in a couple of days once (with any luck) he's confirmed that is the case. We may want to get a subset of this patch on the errata note path, if we can get the ICMP redirect fix down to a very short patch.

Robert N M Watson
Computer Laboratory
University of Cambridge


Merge r185747, r185774, r185807, r185849, r185964, r185965, r186051,
r186052 from head to stable/7; note that only the locking fixes and
invariants checking are added from r185747, but not the move to an
rwlock which would modify the kernel binary interface, nor the move
to a non-recursible lock, which is still seeing problem reports in
head.  This corrects, among other things, a deadlock that may occur
when processing incoming ICMP redirects.

r185747:

  - convert radix node head lock from mutex to rwlock
  - make radix node head lock not recursive
  - fix LOR in rtexpunge
  - fix LOR in rtredirect

  Reviewed by:  sam

r185774:

  - avoid recursively locking the radix node head lock
  - assert that it is held if RTF_RNH_LOCKED is not passed

r185807:

  Fix a bug introduced in r185747: rather than dereferencing an
  uninitialized *rt to something undefined, use the fibnum that came in as
  function argument.

  Found with:   Coverity Prevent(tm)
  CID:          4168

r185849:

  fix a reported panic when adding a route and one hit here when deleting a
  route

  - pass RTF_RNH_LOCKED to rtalloc1_fib in 2 cases where the lock is held
  - make sure the rnh lock is held across rt_setgate and rt_getifa_fib

r185964:

  Pass RTF_RNH_LOCKED to rtalloc1 sunce the node head is locked, this avoids
  a recursive lock panic on inet6 detach.

  Reviewed by:  kmacy

r185965:

  RTF_RNH_LOCKED needs to be passed in the flags arg not report,
  apologies to thompsa

r186051:

  in6_addroute is called through rnh_addadr which is always called with the
  radix node head lock held exclusively. Pass RTF_RNH_LOCKED to rtalloc so
  that rtalloc1_fib will not try to re-acquire the lock.

r186052:

  don't acquire lock recursively

All original commits to head were by Kip Macy <kmacy>, except r185964 by
<thompsa>.

Reviewed by:    bz
Tested by:      Pete French <petefrench at ticketswitch com>



Property changes on: sys
___________________________________________________________________
Modified: svn:mergeinfo
   Merged /head/sys:r185747,185774,185807,185849,185964-185965,186051-186052

Index: sys/netinet/in_rmx.c
===================================================================
--- sys/netinet/in_rmx.c        (revision 188767)
+++ sys/netinet/in_rmx.c        (working copy)
@@ -111,7 +111,7 @@
                 * ARP entry and delete it if so.
                 */
                rt2 = in_rtalloc1((struct sockaddr *)sin, 0,
-                   RTF_CLONING, rt->rt_fibnum);
+                   RTF_CLONING|RTF_RNH_LOCKED, rt->rt_fibnum);
                if (rt2) {
                        if (rt2->rt_flags & RTF_LLINFO &&
                            rt2->rt_flags & RTF_HOST &&

Property changes on: sys/dev/cxgb
___________________________________________________________________
Modified: svn:mergeinfo
   Merged 
/head/sys/dev/cxgb:r185747,185774,185807,185849,185964-185965,186051-186052


Property changes on: sys/dev/ath/ath_hal
___________________________________________________________________
Modified: svn:mergeinfo
   Merged 
/head/sys/dev/ath/ath_hal:r185747,185774,185807,185849,185964-185965,186051-186052

Index: sys/net/route.c
===================================================================
--- sys/net/route.c     (revision 188767)
+++ sys/net/route.c     (working copy)
@@ -277,6 +277,7 @@
        struct rt_addrinfo info;
        u_long nflags;
        int err = 0, msgtype = RTM_MISS;
+       int needlock;

        KASSERT((fibnum < rt_numfibs), ("rtalloc1_fib: bad fibnum"));
        if (dst->sa_family != AF_INET)       /* Only INET supports > 1 fib now 
*/
@@ -290,7 +291,13 @@
                rtstat.rts_unreach++;
                goto miss2;
        }
-       RADIX_NODE_HEAD_LOCK(rnh);
+       needlock = !(ignflags & RTF_RNH_LOCKED);
+       if (needlock)
+               RADIX_NODE_HEAD_LOCK(rnh);
+#ifdef INVARIANTS
+       else
+               RADIX_NODE_HEAD_LOCK_ASSERT(rnh);
+#endif
        if ((rn = rnh->rnh_matchaddr(dst, rnh)) &&
            (rn->rn_flags & RNF_ROOT) == 0) {
                /*
@@ -343,7 +350,8 @@
                        RT_LOCK(newrt);
                        RT_ADDREF(newrt);
                }
-               RADIX_NODE_HEAD_UNLOCK(rnh);
+               if (needlock)
+                       RADIX_NODE_HEAD_UNLOCK(rnh);
        } else {
                /*
                 * Either we hit the root or couldn't find any match,
@@ -352,7 +360,8 @@
                 */
                rtstat.rts_unreach++;
        miss:
-               RADIX_NODE_HEAD_UNLOCK(rnh);
+               if (needlock)
+                       RADIX_NODE_HEAD_UNLOCK(rnh);
        miss2:  if (report) {
                        /*
                         * If required, report the failure to the supervising
@@ -482,6 +491,8 @@
        short *stat = NULL;
        struct rt_addrinfo info;
        struct ifaddr *ifa;
+       struct radix_node_head *rnh =
+           rt_tables[fibnum][dst->sa_family];

        /* verify the gateway is directly reachable */
        if ((ifa = ifa_ifwithnet(gateway)) == NULL) {
@@ -531,6 +542,8 @@
                        info.rti_info[RTAX_NETMASK] = netmask;
                        info.rti_ifa = ifa;
                        info.rti_flags = flags;
+                       if (rt0 != NULL)
+                               RT_UNLOCK(rt0); /* drop lock to avoid LOR with 
RNH */
                        error = rtrequest1_fib(RTM_ADD, &info, &rt, fibnum);
                        if (rt != NULL) {
                                RT_LOCK(rt);
@@ -538,7 +551,7 @@
                                flags = rt->rt_flags;
                        }
                        if (rt0)
-                               RTFREE_LOCKED(rt0);
+                               RTFREE(rt0);

                        stat = &rtstat.rts_dynamic;
                } else {
@@ -554,8 +567,12 @@
                        /*
                         * add the key and gateway (in one malloc'd chunk).
                         */
+                       RT_UNLOCK(rt);
+                       RADIX_NODE_HEAD_LOCK(rnh);
+                       RT_LOCK(rt);
                        rt_setgate(rt, rt_key(rt), gateway);
-                       gwrt = rtalloc1(gateway, 1, 0);
+                       gwrt = rtalloc1(gateway, 1, RTF_RNH_LOCKED);
+                       RADIX_NODE_HEAD_UNLOCK(rnh);
                        EVENTHANDLER_INVOKE(route_redirect_event, rt, gwrt, 
dst);
                        RTFREE_LOCKED(gwrt);
                }
@@ -641,7 +658,7 @@
        if (ifa == NULL)
                ifa = ifa_ifwithnet(gateway);
        if (ifa == NULL) {
-               struct rtentry *rt = rtalloc1_fib(gateway, 0, 0UL, fibnum);
+               struct rtentry *rt = rtalloc1_fib(gateway, 0, RTF_RNH_LOCKED, 
fibnum);
                if (rt == NULL)
                        return (NULL);
                /*
@@ -788,7 +805,9 @@
        struct ifaddr *ifa;
        int error = 0;

+       rnh = rt_tables[rt->rt_fibnum][rt_key(rt)->sa_family];
        RT_LOCK_ASSERT(rt);
+       RADIX_NODE_HEAD_LOCK_ASSERT(rnh);
 #if 0
        /*
         * We cannot assume anything about the reference count
@@ -804,8 +823,6 @@
        if (rnh == NULL)
                return (EAFNOSUPPORT);

-       RADIX_NODE_HEAD_LOCK(rnh);
-
        /*
         * Remove the item from the tree; it should be there,
         * but when callers invoke us blindly it may not (sigh).
@@ -826,7 +843,7 @@
         * Now search what's left of the subtree for any cloned
         * routes which might have been formed from this node.
         */
-       if ((rt->rt_flags & RTF_CLONING) && rt_mask(rt))
+       if ((rt->rt_flags & RTF_CLONING) && rt_mask(rt))
                rnh->rnh_walktree_from(rnh, rt_key(rt), rt_mask(rt),
                                       rt_fixdelete, rt);

@@ -860,7 +877,6 @@
         */
        rttrash++;
 bad:
-       RADIX_NODE_HEAD_UNLOCK(rnh);
        return (error);
 }

@@ -874,7 +890,7 @@
 rtrequest1_fib(int req, struct rt_addrinfo *info, struct rtentry **ret_nrt,
                                u_int fibnum)
 {
-       int error = 0;
+       int error = 0, needlock = 0;
        register struct rtentry *rt;
        register struct radix_node *rn;
        register struct radix_node_head *rnh;
@@ -891,7 +907,12 @@
        rnh = rt_tables[fibnum][dst->sa_family];
        if (rnh == NULL)
                return (EAFNOSUPPORT);
-       RADIX_NODE_HEAD_LOCK(rnh);
+       needlock = ((flags & RTF_RNH_LOCKED) == 0);
+       flags &= ~RTF_RNH_LOCKED;
+       if (needlock)
+               RADIX_NODE_HEAD_LOCK(rnh);
+       else
+               RADIX_NODE_HEAD_LOCK_ASSERT(rnh);
        /*
         * If we are adding a host route then we don't want to put
         * a netmask in the tree, nor do we want to clone it.
@@ -1036,7 +1057,7 @@
                         * then we just blow it away and retry the insertion
                         * of the new one.
                         */
-                       rt2 = rtalloc1_fib(dst, 0, 0, fibnum);
+                       rt2 = rtalloc1_fib(dst, 0, RTF_RNH_LOCKED, fibnum);
                        if (rt2 && rt2->rt_parent) {
                                rtexpunge(rt2);
                                RT_UNLOCK(rt2);
@@ -1125,7 +1146,8 @@
                error = EOPNOTSUPP;
        }
 bad:
-       RADIX_NODE_HEAD_UNLOCK(rnh);
+       if (needlock)
+               RADIX_NODE_HEAD_UNLOCK(rnh);
        return (error);
 #undef senderr
 }
@@ -1153,7 +1175,7 @@
        if (rt->rt_parent == rt0 &&
            !(rt->rt_flags & (RTF_PINNED | RTF_CLONING))) {
                return rtrequest_fib(RTM_DELETE, rt_key(rt), NULL, rt_mask(rt),
-                                rt->rt_flags, NULL, rt->rt_fibnum);
+                                rt->rt_flags|RTF_RNH_LOCKED, NULL, 
rt->rt_fibnum);
        }
        return 0;
 }
@@ -1230,6 +1252,7 @@

 again:
        RT_LOCK_ASSERT(rt);
+       RADIX_NODE_HEAD_LOCK_ASSERT(rnh);

        /*
         * A host route with the destination equal to the gateway
@@ -1256,7 +1279,7 @@
                struct rtentry *gwrt;

                RT_UNLOCK(rt);          /* XXX workaround LOR */
-               gwrt = rtalloc1_fib(gate, 1, 0, rt->rt_fibnum);
+               gwrt = rtalloc1_fib(gate, 1, RTF_RNH_LOCKED, rt->rt_fibnum);
                if (gwrt == rt) {
                        RT_REMREF(rt);
                        return (EADDRINUSE); /* failure */
@@ -1327,12 +1350,8 @@

                arg.rnh = rnh;
                arg.rt0 = rt;
-               RT_UNLOCK(rt);          /* XXX workaround LOR */
-               RADIX_NODE_HEAD_LOCK(rnh);
-               RT_LOCK(rt);
                rnh->rnh_walktree_from(rnh, rt_key(rt), rt_mask(rt),
                                       rt_fixchange, &arg);
-               RADIX_NODE_HEAD_UNLOCK(rnh);
        }

        return 0;
Index: sys/net/route.h
===================================================================
--- sys/net/route.h     (revision 188767)
+++ sys/net/route.h     (working copy)
@@ -172,6 +172,7 @@
 #define        RTF_BROADCAST   0x400000        /* route represents a bcast 
address */
 #define        RTF_MULTICAST   0x800000        /* route represents a mcast 
address */
                                        /* 0x1000000 and up unassigned */
+#define        RTF_RNH_LOCKED  0x40000000      /* radix node head locked by 
caller */

 /* Mask of RTF flags that are allowed to be modified by RTM_CHANGE. */
 #define RTF_FMASK      \
Index: sys/net/rtsock.c
===================================================================
--- sys/net/rtsock.c    (revision 188767)
+++ sys/net/rtsock.c    (working copy)
@@ -628,9 +628,11 @@
                             !sa_equal(info.rti_info[RTAX_IFA],
                                       rt->rt_ifa->ifa_addr))) {
                                RT_UNLOCK(rt);
+                               RADIX_NODE_HEAD_LOCK(rnh);
                                if ((error = rt_getifa_fib(&info,
                                    rt->rt_fibnum)) != 0)
                                        senderr(error);
+                               RADIX_NODE_HEAD_UNLOCK(rnh);
                                RT_LOCK(rt);
                        }
                        if (info.rti_ifa != NULL &&
@@ -642,8 +644,14 @@
                                IFAFREE(rt->rt_ifa);
                        }
                        if (info.rti_info[RTAX_GATEWAY] != NULL) {
-                               if ((error = rt_setgate(rt, rt_key(rt),
-                                       info.rti_info[RTAX_GATEWAY])) != 0) {
+                               RT_UNLOCK(rt);
+                               RADIX_NODE_HEAD_LOCK(rnh);
+                               RT_LOCK(rt);
+ + error = rt_setgate(rt, rt_key(rt),
+                                   info.rti_info[RTAX_GATEWAY]);
+                               RADIX_NODE_HEAD_UNLOCK(rnh);
+                               if (error != 0) {
                                        RT_UNLOCK(rt);
                                        senderr(error);
                                }
Index: sys/netinet6/in6_ifattach.c
===================================================================
--- sys/netinet6/in6_ifattach.c (revision 188767)
+++ sys/netinet6/in6_ifattach.c (working copy)
@@ -823,7 +823,7 @@
        /* XXX grab lock first to avoid LOR */
        if (rt_tables[0][AF_INET6] != NULL) {
                RADIX_NODE_HEAD_LOCK(rt_tables[0][AF_INET6]);
-               rt = rtalloc1((struct sockaddr *)&sin6, 0, 0UL);
+               rt = rtalloc1((struct sockaddr *)&sin6, 0, RTF_RNH_LOCKED);
                if (rt) {
                        if (rt->rt_ifp == ifp)
                                rtexpunge(rt);
Index: sys/netinet6/in6_rmx.c
===================================================================
--- sys/netinet6/in6_rmx.c      (revision 188767)
+++ sys/netinet6/in6_rmx.c      (working copy)
@@ -154,7 +154,7 @@
                 * Find out if it is because of an
                 * ARP entry and delete it if so.
                 */
-               rt2 = rtalloc1((struct sockaddr *)sin6, 0, RTF_CLONING);
+               rt2 = rtalloc1((struct sockaddr *)sin6, 0, 
RTF_RNH_LOCKED|RTF_CLONING);
                if (rt2) {
                        if (rt2->rt_flags & RTF_LLINFO &&
                                rt2->rt_flags & RTF_HOST &&
@@ -181,7 +181,7 @@
                 *      net route entry, 3ffe:0501:: -> if0.
                 *      This case should not raise an error.
                 */
-               rt2 = rtalloc1((struct sockaddr *)sin6, 0, RTF_CLONING);
+               rt2 = rtalloc1((struct sockaddr *)sin6, 0, 
RTF_RNH_LOCKED|RTF_CLONING);
                if (rt2) {
                        if ((rt2->rt_flags & (RTF_CLONING|RTF_HOST|RTF_GATEWAY))
                                        == RTF_CLONING

Property changes on: sys/contrib/pf
___________________________________________________________________
Modified: svn:mergeinfo
   Merged 
/head/sys/contrib/pf:r185747,185774,185807,185849,185964-185965,186051-186052

_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"

Reply via email to