On Monday, June 23, 2014 1:12:54 pm Roger Pau Monné wrote: > On 23/06/14 18:49, Alexander V. Chernikov wrote: > > On 23.06.2014 20:39, Alexander V. Chernikov wrote: > >> On 23.06.2014 19:32, John Baldwin wrote: > >>> On Friday, June 20, 2014 11:25:51 am Roger Pau Monné wrote: > >>>> Hello, > >>>> > >>>> I've stumbled across the following panic when testing Xen netback with > >>>> if_bridge: > >>>> > >>>> Kernel page fault with the following non-sleepable locks held: > >>>> exclusive sleep mutex if_bridge (if_bridge) r = 0 (0xfffff80006306c18) > >>> locked @ /usr/src/sys/m > >>>> KDB: stack backtrace: > >>>> X_db_symbol_values() at X_db_symbol_values+0x10b/frame 0xfffffe0000213490 > >>>> kdb_backtrace() at kdb_backtrace+0x39/frame 0xfffffe0000213540 > >>>> witness_warn() at witness_warn+0x4a8/frame 0xfffffe0000213600 > >>>> trap() at trap+0xc9d/frame 0xfffffe00002136a0 > >>>> trap() at trap+0x669/frame 0xfffffe00002138b0 > >>>> calltrap() at calltrap+0x8/frame 0xfffffe00002138b0 > >>>> --- trap 0xc, rip = 0xffffffff8221a0ef, rsp = 0xfffffe0000213970, rbp = > >>> 0xfffffe00002139e0 --- > >>>> bridge_input() at bridge_input+0x5ff/frame 0xfffffe00002139e0 > >>>> ether_vlanencap() at ether_vlanencap+0x4a3/frame 0xfffffe0000213a10 > >>>> netisr_dispatch_src() at netisr_dispatch_src+0x90/frame > >>>> 0xfffffe0000213a80 > >>>> ether_ifattach() at ether_ifattach+0x19f/frame 0xfffffe0000213ab0 > >>>> ath_dfs_get_thresholds() at ath_dfs_get_thresholds+0x81ce/frame > >>> 0xfffffe0000213b30 > >>>> intr_event_execute_handlers() at intr_event_execute_handlers+0x93/frame > >>> 0xfffffe0000213b70 > >>>> db_dump_intr_event() at db_dump_intr_event+0x796/frame 0xfffffe0000213bb0 > >>>> fork_exit() at fork_exit+0x84/frame 0xfffffe0000213bf0 > >>>> fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0000213bf0 > >>>> --- trap 0, rip = 0, rsp = 0xfffffe0000213cb0, rbp = 0 --- > >>>> > >>>> I've tracked this down to if_detach_internal setting ifp->if_addr to > >>>> NULL before calling EVENTHANDLER_INVOKE(ifnet_departure_event..., which > >>>> causes a panic in GRAB_OUR_PACKETS in the if_bridge code when it tries > >>>> to perform IF_LLADDR on an interface that's in the process of being > >>>> destroyed (ifp->if_addr set to NULL, but the ifnet_departure_event event > >>>> has not fired yet). > >>>> > >>>> I have the following naive patch that moves the firing of the event > >>>> before if_addr is set to NULL, but I'm not familiar with the ordering > >>>> in if_detach_internal, so I'm not sure if this might cause problems in > >>>> other parts of the code, could someone familiar with the net stuff > >>>> comment on the best way to deal with it? > >> > >> We should notify kernel customers only when we are really taking this > >> interface down and every other subsystem cannot add any new state to the > >> interface. > >> > >> In this patch you're sending notification before taking ifnet down, > >> removing its L3 addresses, routes, and so on. > >> > >> This can easily lead to panic in, for example, BPF subsystem (since BPF > >> state is freed in bpf_ifdetach() handler). > >> > >> Addintionally, this will introduce ifaddr / iface messages reversal for > >> rtsock. > > Whoops. I misread the patch. > > It should be OK. > > > >> > >> It looks like we'd better fix if_bridge (and it is still using mutexes, > >> what a shame!). > >> > >> Can you send me trace with line numbers? > > However, these two still stands. > > (And I'm wondering how you're getting any traffic on down/dying interface). > > I'm not getting the traffic from the dying interface, I'm getting the > traffic from another interface on the bridge (a physical bce interface), > which injects traffic into the bridge, that calls bridge_input, which > tries to read ifp->if_addr->ifa_addr from the dying interface, and that > leads to the panic. > > Line numbers: > > /usr/src/sys/modules/if_bridge/../../net/if_bridge.c:2410 (bridge_input) > /usr/src/sys/net/if_ethersubr.c:543 (ether_input_internal) > /usr/src/sys/net/netisr.c:972 (netisr_dispatch_src) > /usr/src/sys/net/if_ethersubr.c:674 (ether_input) > /usr/src/sys/dev/bce/if_bce.c:6861 (bce_rx_intr)
I think this certainly suggests moving at least the eventhandler up so that things like vlans and bridges can detach from an interface while it is still constructed. I do think it would be ideal to move all three notifications to the same place though. (So your original patch plus moving the routing socket message.) -- John Baldwin _______________________________________________ freebsd-net@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-net To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"