On 6/8/17 11:55 PM, Cong Wang wrote:
> On Thu, Jun 8, 2017 at 2:27 PM, Ben Greear <gree...@candelatech.com> wrote:
>>
>> As far as I can tell, the patch did not help, or at least we still reproduce
>> the crash easily.
>
> netlink dump is serialized by nlk->cb_mutex so I don't think that
> patch makes any sense w.r.t race condition.

From what I can see, fn_sernum should be accessed under the table lock, so when
saving and checking it during a walk, make sure the lock is held. That has
nothing to do with the netlink dump; it is about the table changing during a
walk.

>> (gdb) l *(fib6_walk_continue+0x76)
>> 0x188c6 is in fib6_walk_continue
>> (/home/greearb/git/linux-2.6/net/ipv6/ip6_fib.c:1593).
>> 1588                 if (fn == w->root)
>> 1589                         return 0;
>> 1590                 pn = fn->parent;
>> 1591                 w->node = pn;
>> 1592 #ifdef CONFIG_IPV6_SUBTREES
>> 1593                 if (FIB6_SUBTREE(pn) == fn) {
>
> Apparently fn->parent is NULL here for some reason, but
> I don't know if that is expected or not. If a simple NULL check
> is not enough here, we have to trace why it is NULL.

From my understanding, parent should not be NULL, hence the attempts to fix
access to table nodes under a lock, i.e., to figure out why it is NULL here.
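
To make the fn_sernum point concrete, here is a minimal user-space sketch of
the pattern, not kernel code. The names (struct table, struct walker,
table_modify, walk_resume) are hypothetical, and a pthread mutex stands in for
the table lock. The only thing it tries to show is that the saved serial
number is compared and refreshed while the lock is held, so a writer cannot
change the table between the check and the walk.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a routing table; not the kernel's fib6 table. */
struct table {
	pthread_mutex_t lock;	/* plays the role of the table lock */
	unsigned int sernum;	/* bumped on every modification */
};

/* Hypothetical walker state kept between partial walks. */
struct walker {
	unsigned int saved_sernum;	/* serial number seen on the last run */
	int restarts;			/* how often the walk went back to the root */
};

/* Writer side: modify the table and bump the serial number under the lock. */
static void table_modify(struct table *t)
{
	pthread_mutex_lock(&t->lock);
	t->sernum++;
	pthread_mutex_unlock(&t->lock);
}

/*
 * Walker side: check and re-save the serial number under the same lock
 * that protects the walk itself. Returns true if the table changed since
 * the last run and the walk has to restart from the root.
 */
static bool walk_resume(struct table *t, struct walker *w)
{
	bool restarted = false;

	pthread_mutex_lock(&t->lock);
	if (w->saved_sernum != t->sernum) {
		w->saved_sernum = t->sernum;
		w->restarts++;
		restarted = true;
	}
	/* ... the node walk would continue here, still under the lock ... */
	pthread_mutex_unlock(&t->lock);

	return restarted;
}
```

If the comparison were done before taking the lock, a writer could slip in
between the check and the walk, and the walker would continue from a saved
node that may no longer be valid.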