On Tue, Dec 8, 2020 at 3:35 PM Jay Vosburgh <jay.vosbu...@canonical.com> wrote:
>
> Jarod Wilson <ja...@redhat.com> wrote:
...
> >The addition of a case BOND_LINK_BACK in bond_miimon_commit() is somewhat
> >separate from the fix for the actual hang, but it eliminates a constant
> >"invalid new link 3 on slave" message seen related to this issue, and it's
> >not actually an invalid state here, so we shouldn't be reporting it as an
> >error.
...
>         In principle, bond_miimon_commit should not see _BACK or _FAIL
> state as a new link state, because those states should be managed at the
> bond_miimon_inspect level (as they are the result of updelay and
> downdelay).  These states should not be "committed" in the sense of
> causing notifications or doing actions that require RTNL.
>
>         My recollection is that the "invalid new link" messages were the
> result of a bug in de77ecd4ef02, which was fixed in 1899bb325149
> ("bonding: fix state transition issue in link monitoring"), but maybe
> the RTNL problem here induces that in some other fashion.
>
>         Either way, I believe this message is correct as-is.
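For anyone reading along, the "3" in that message is the numeric link state, which per the patch description above corresponds to BOND_LINK_BACK. A minimal sketch of a decoder follows, assuming the BOND_LINK_* values run 0 to 3 in the order UP, FAIL, DOWN, BACK (worth checking against the headers in your tree). bond_link_name is a hypothetical helper name used only for illustration; it is not part of the kernel or any tool.

```shell
#!/bin/sh
# Hypothetical helper: map a bonding link-state number, as printed in
# "invalid new link N on slave", to its BOND_LINK_* name.
# Assumed values: UP=0, FAIL=1, DOWN=2, BACK=3.
bond_link_name() {
        case "$1" in
        0) echo BOND_LINK_UP ;;    # link is up and running
        1) echo BOND_LINK_FAIL ;;  # link just went down, downdelay pending
        2) echo BOND_LINK_DOWN ;;  # link is down
        3) echo BOND_LINK_BACK ;;  # link came back, updelay pending
        *) echo unknown ;;
        esac
}
```

Calling `bond_link_name 3` gives the name of the state that bond_miimon_commit() is complaining about.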
For reference, with 5.10.1 and this script:

#!/bin/sh

slave1=ens4f0
slave2=ens4f1

modprobe -rv bonding
modprobe -v bonding mode=2 miimon=100 updelay=200
ip link set bond0 up
ifenslave bond0 $slave1 $slave2
sleep 5

while :
do
        ip link set $slave1 down
        sleep 1
        ip link set $slave1 up
        sleep 1
done

I get this repeating log output:

[ 9488.262291] sfc 0000:05:00.0 ens4f0: link up at 10000Mbps full-duplex (MTU 1500)
[ 9488.339508] bond0: (slave ens4f0): link status up, enabling it in 200 ms
[ 9488.339511] bond0: (slave ens4f0): invalid new link 3 on slave
[ 9488.547643] bond0: (slave ens4f0): link status definitely up, 10000 Mbps full duplex
[ 9489.276614] bond0: (slave ens4f0): link status definitely down, disabling slave
[ 9490.273830] sfc 0000:05:00.0 ens4f0: link up at 10000Mbps full-duplex (MTU 1500)
[ 9490.315540] bond0: (slave ens4f0): link status up, enabling it in 200 ms
[ 9490.315543] bond0: (slave ens4f0): invalid new link 3 on slave
[ 9490.523641] bond0: (slave ens4f0): link status definitely up, 10000 Mbps full duplex
[ 9491.356526] bond0: (slave ens4f0): link status definitely down, disabling slave
[ 9492.285249] sfc 0000:05:00.0 ens4f0: link up at 10000Mbps full-duplex (MTU 1500)
[ 9492.291522] bond0: (slave ens4f0): link status up, enabling it in 200 ms
[ 9492.291523] bond0: (slave ens4f0): invalid new link 3 on slave
[ 9492.499604] bond0: (slave ens4f0): link status definitely up, 10000 Mbps full duplex
[ 9493.331594] bond0: (slave ens4f0): link status definitely down, disabling slave

"invalid new link 3 on slave" is there every single time.

Side note: I'm not actually able to reproduce the repeating "link status
up, enabling it in 200 ms" and never recovering from a downed link on this
host, no clue why it's so reproducible w/another system.

-- 
Jarod Wilson
ja...@redhat.com