On Wed, Jun 10, 2015 at 8:28 PM, Andy Gospodarek <go...@cumulusnetworks.com> wrote:
> On Wed, Jun 10, 2015 at 07:53:59PM -0700, Scott Feldman wrote:
>> On Wed, Jun 10, 2015 at 7:37 PM, Andy Gospodarek
>> <go...@cumulusnetworks.com> wrote:
>>
>> > @@ -1129,7 +1142,15 @@ int fib_sync_down_dev(struct net_device *dev, int force)
>> >  				dead++;
>> >  			else if (nexthop_nh->nh_dev == dev &&
>> >  				 nexthop_nh->nh_scope != scope) {
>> > -				nexthop_nh->nh_flags |= RTNH_F_DEAD;
>> > +				switch (event) {
>> > +				case NETDEV_DOWN:
>> > +				case NETDEV_UNREGISTER:
>> > +					nexthop_nh->nh_flags |= RTNH_F_DEAD;
>> > +					/* fall through */
>> > +				case NETDEV_CHANGE:
>> > +					nexthop_nh->nh_flags |= RTNH_F_LINKDOWN;
>> > +					break;
>> > +				}
>> >  #ifdef CONFIG_IP_ROUTE_MULTIPATH
>> >  				spin_lock_bh(&fib_multipath_lock);
>> >  				fi->fib_power -= nexthop_nh->nh_power;
>> > @@ -1139,14 +1160,22 @@ int fib_sync_down_dev(struct net_device *dev, int force)
>> >  				dead++;
>> >  			}
>> >  #ifdef CONFIG_IP_ROUTE_MULTIPATH
>> > -			if (force > 1 && nexthop_nh->nh_dev == dev) {
>> > +			if (event == NETDEV_UNREGISTER && nexthop_nh->nh_dev == dev) {
>> >  				dead = fi->fib_nhs;
>> >  				break;
>> >  			}
>> >  #endif
>> >  		} endfor_nexthops(fi)
>> >  		if (dead == fi->fib_nhs) {
>> > -			fi->fib_flags |= RTNH_F_DEAD;
>> > +			switch (event) {
>> > +			case NETDEV_DOWN:
>> > +			case NETDEV_UNREGISTER:
>> > +				fi->fib_flags |= RTNH_F_DEAD;
>> > +				/* fall through */
>> > +			case NETDEV_CHANGE:
>> > +				fi->fib_flags |= RTNH_F_LINKDOWN;
>>
>> RTNH_F_LINKDOWN is meant to mark linkdown nexthop devs, so why is the
>> route fi being marked RTNH_F_LINKDOWN?
>>
>> The RTNH_F_LINKDOWN comment says:
>>
>> #define RTNH_F_LINKDOWN	16	/* carrier-down on nexthop */
>
> This is done with the dead flag already. I'm actually following the
> precedent already set there.
>
>> It's a per-nh flag, not a per-route flag, correct?
>>
>> Can you show an ECMP example with only a subset of the nexthop devs
>> linkdowned?
>> Show the ip route output after going thru some link
>> down/up events on some of the nexthop devs.
>
> Sure! This is exactly what I've been using for testing.
>
> # ip route show
> 70.0.0.0/24 dev p7p1 proto kernel scope link src 70.0.0.1
> 80.0.0.0/24 dev p8p1 proto kernel scope link src 80.0.0.1
> 90.0.0.0/24 via 70.0.0.2 dev p7p1
> 90.0.0.0/24 via 80.0.0.2 dev p8p1 metric 10
> 100.0.0.0/24
> 	nexthop via 70.0.0.2 dev p7p1 weight 1
> 	nexthop via 80.0.0.2 dev p8p1 weight 1
> 192.168.56.0/24 dev p2p1 proto kernel scope link src 192.168.56.2
> # # take p8p1 link down
> # ip route show
> 70.0.0.0/24 dev p7p1 proto kernel scope link src 70.0.0.1
> 80.0.0.0/24 dev p8p1 proto kernel scope link src 80.0.0.1 dead linkdown
> 90.0.0.0/24 via 70.0.0.2 dev p7p1
> 90.0.0.0/24 via 80.0.0.2 dev p8p1 metric 10 dead linkdown
> 100.0.0.0/24
> 	nexthop via 70.0.0.2 dev p7p1 weight 1
> 	nexthop via 80.0.0.2 dev p8p1 weight 1 dead linkdown
> 192.168.56.0/24 dev p2p1 proto kernel scope link src 192.168.56.2
> # ip route get 100.0.0.2
> 100.0.0.2 via 70.0.0.2 dev p7p1 src 70.0.0.1
>     cache
> # ip route get 100.0.0.2
> 100.0.0.2 via 70.0.0.2 dev p7p1 src 70.0.0.1
>     cache
> # # take p8p1 link up
> # ip route show
> 70.0.0.0/24 dev p7p1 proto kernel scope link src 70.0.0.1
> 80.0.0.0/24 dev p8p1 proto kernel scope link src 80.0.0.1
> 90.0.0.0/24 via 70.0.0.2 dev p7p1
> 90.0.0.0/24 via 80.0.0.2 dev p8p1 metric 10
> 100.0.0.0/24
> 	nexthop via 70.0.0.2 dev p7p1 weight 1
> 	nexthop via 80.0.0.2 dev p8p1 weight 1
> 192.168.56.0/24 dev p2p1 proto kernel scope link src 192.168.56.2
> # ip route get 100.0.0.2
> 100.0.0.2 via 70.0.0.2 dev p7p1 src 70.0.0.1
>     cache
> # ip route get 100.0.0.2
> 100.0.0.2 via 80.0.0.2 dev p8p1 src 80.0.0.1
>     cache
> # ip route get 100.0.0.2
> 100.0.0.2 via 70.0.0.2 dev p7p1 src 70.0.0.1
>     cache
> # ip route get 100.0.0.2
> 100.0.0.2 via 80.0.0.2 dev p8p1 src 80.0.0.1
>     cache
> # # you can see the round robin happening
> # # take all ports p8p1 and p7p1 down
> # ip route show
> 70.0.0.0/24 dev p7p1 proto kernel scope link src 70.0.0.1 dead linkdown
> 80.0.0.0/24 dev p8p1 proto kernel scope link src 80.0.0.1 dead linkdown
> 90.0.0.0/24 via 70.0.0.2 dev p7p1 dead linkdown
> 90.0.0.0/24 via 80.0.0.2 dev p8p1 metric 10 dead linkdown
> 100.0.0.0/24
> 	nexthop via 70.0.0.2 dev p7p1 weight 1 dead linkdown
> 	nexthop via 80.0.0.2 dev p8p1 weight 1 dead linkdown
> 192.168.56.0/24 dev p2p1 proto kernel scope link src 192.168.56.2
> # ip route get 100.0.0.2
> RTNETLINK answers: Network is unreachable
> # ip route get 80.0.0.2
> RTNETLINK answers: Network is unreachable
> # ip route get 80.0.0.1
> local 80.0.0.1 dev lo src 80.0.0.1
>     cache <local>
> # ip route get 70.0.0.1
> local 70.0.0.1 dev lo src 70.0.0.1
>     cache <local>
> # # local addrs are still reachable
Perfect, looks good, thanks.