Hello, On Fri, 1 Apr 2016, David Ahern wrote:
> v3 > - Julian comments: changed use of dead in documentation to failed, > init state to NUD_REACHABLE which simplifies fib_good_nh, use of > nh_dev for neighbor lookup, fallback to first entry which is what > current logic does > > v2 > - use rcu locking to avoid refcnts per Eric's suggestion > - only consider neighbor info for nh_scope == RT_SCOPE_LINK per Julian's > comment > - drop the 'state == NUD_REACHABLE' from the state check since it is > part of NUD_VALID (comment from Julian) > - wrapped the use of the neigh in a sysctl > > Documentation/networking/ip-sysctl.txt | 10 ++++++++++ > include/net/netns/ipv4.h | 3 +++ > net/ipv4/fib_semantics.c | 32 ++++++++++++++++++++++++++++---- > net/ipv4/sysctl_net_ipv4.c | 11 +++++++++++ > 4 files changed, 52 insertions(+), 4 deletions(-) > > diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c > index d97268e8ff10..e08abf96824a 100644 > --- a/net/ipv4/fib_semantics.c > +++ b/net/ipv4/fib_semantics.c > @@ -1559,21 +1559,45 @@ int fib_sync_up(struct net_device *dev, unsigned int > nh_flags) > } > > #ifdef CONFIG_IP_ROUTE_MULTIPATH > +static bool fib_good_nh(const struct fib_nh *nh) > +{ > + int state = NUD_REACHABLE; > + > + if (nh->nh_scope == RT_SCOPE_LINK) { > + struct neighbour *n = NULL; NULL is not needed anymore. > + > + rcu_read_lock_bh(); > + > + n = __neigh_lookup_noref(&arp_tbl, &nh->nh_gw, nh->nh_dev); > + if (n) > + state = n->nud_state; > + > + rcu_read_unlock_bh(); > + } > + > + return !!(state & NUD_VALID); > +} > > void fib_select_multipath(struct fib_result *res, int hash) > { > struct fib_info *fi = res->fi; > + struct net *net = fi->fib_net; > + unsigned char first_nhsel = 0; Looking at fib_table_lookup() res->nh_sel is not 0 in all cases. I personally don't like that we do not fallback properly but to make this logic more correct we can use something like this: bool first = false; > > for_nexthops(fi) { > if (hash > atomic_read(&nh->nh_upper_bound)) > continue; > > - res->nh_sel = nhsel; > - return; > + if (!net->ipv4.sysctl_fib_multipath_use_neigh || > + fib_good_nh(nh)) { > + res->nh_sel = nhsel; > + return; > + } > + if (!first_nhsel) > + first_nhsel = nhsel; if (!first) { res->nh_sel = nhsel; first = true; } > } endfor_nexthops(fi); > > - /* Race condition: route has just become dead. */ > - res->nh_sel = 0; > + res->nh_sel = first_nhsel; And then this is not needed anymore. Even setting to 0 was not needed because 0 is not better than current nh_sel when both are DEAD/LINKDOWN. Regards -- Julian Anastasov <j...@ssi.bg>