On Tue, Dec 8, 2020 at 3:35 PM Jay Vosburgh <jay.vosbu...@canonical.com> wrote: > > Jarod Wilson <ja...@redhat.com> wrote: > > >I'm seeing a system get stuck unable to bring a downed interface back up > >when it's got an updelay value set, behavior which ceased when logging > >spew was removed from bond_miimon_inspect(). I'm monitoring logs on this > >system over another network connection, and it seems that the act of > >spewing logs at all there increases rtnl lock contention, because > >instrumented code showed bond_mii_monitor() never able to succeed in it's > >attempts to call rtnl_trylock() to actually commit link state changes, > >leaving the downed link stuck in BOND_LINK_DOWN. The system in question > >appears to be fine with the log spew being moved to > >bond_commit_link_state(), which is called after the successful > >rtnl_trylock(). I'm actually wondering if perhaps we ultimately need/want > >some bond-specific lock here to prevent racing with bond_close() instead > >of using rtnl, but this shift of the output appears to work. I believe > >this started happening when de77ecd4ef02 ("bonding: improve link-status > >update in mii-monitoring") went in, but I'm not 100% on that. > > We use RTNL not to avoid deadlock with bonding itself, but > because the "commit" side undertakes actions which require RTNL, e.g., > various events will eventually call netdev_lower_state_changed. > > However, the RTNL acquisition is a trylock to avoid the deadlock > with bond_close. Moving that out of line here (e.g., putting the commit > into another work queue event or the like) has the same problem, in that > bond_close needs to wait for all of the work queue events to finish, and > it holds RTNL.
Ah, okay, it wasn't clear to me that we actually do need RTNL here, I'd thought it was just for the deadlock avoidance with bond_close, based on the comments in the source. > Also, a dim memory says that the various notification messages > were mostly placed in the "inspect" phase and not the "commit" phase to > avoid doing printk-like activities with RTNL held. As a general > principle, I don't think we want to add more verbiage under RTNL. Yeah, that's fair. > >The addition of a case BOND_LINK_BACK in bond_miimon_inspect() is somewhat > >separate from the fix for the actual hang, but it eliminates a constant > >"invalid new link 3 on slave" message seen related to this issue, and it's > >not actually an invalid state here, so we shouldn't be reporting it as an > >error. > > Do you mean bond_miimon_commit here and not bond_miimon_inspect > (which already has a case for BOND_LINK_BACK)? Whoops, yes. > In principle, bond_miimon_commit should not see _BACK or _FAIL > state as a new link state, because those states should be managed at the > bond_miimon_inspect level (as they are the result of updelay and > downdelay). These states should not be "committed" in the sense of > causing notifications or doing actions that require RTNL. > > My recollection is that the "invalid new link" messages were the > result of a bug in de77ecd4ef02, which was fixed in 1899bb325149 > ("bonding: fix state transition issue in link monitoring"), but maybe > the RTNL problem here induces that in some other fashion. > > Either way, I believe this message is correct as-is. Hm, okay, definitely seeing this message pop up regularly when a link recovers, using a fairly simple reproducer: slave1=p6p1 slave2=p6p2 modprobe -rv bonding modprobe -v bonding mode=2 miimon=100 updelay=200 ip link set bond0 up ifenslave bond0 $slave1 $slave2 sleep 10 while : do ip link set $slave1 down sleep 1 ip link set $slave1 up sleep 1 done I wasn't actually seeing the problem until I was running a 'watch -n 1 "dmesg | tail -n 50"' or similar in a remote ssh session on the host. I should add the caveat that this was also initially seen on an older kernel, but with a fairly up-to-date bonding driver, which does include both de77ecd4ef02 and 1899bb325149. I'm going to keep prodding w/a more recent upstreamier kernel, and see if I can get a better idea of what's actually going on. -- Jarod Wilson ja...@redhat.com