Thomas Falcon <tlfal...@linux.ibm.com> wrote:

>The following behavior has been observed when testing logical partition
>migration of LACP-bonded VNIC devices in a PowerVM pseries environment.
>
>1. When performing the migration, the bond master detects that a slave has
>   lost its link, deactivates the LACP port, and sets the port's
>   is_enabled flag to false.
>2. The slave device then updates its carrier state to off while it resets
>   itself. This update triggers a NETDEV_CHANGE notification, which performs
>   a speed and duplex update. The device does not return a valid speed
>   and duplex, so the master sets the slave link state to BOND_LINK_FAIL.
>3. When the slave VNIC device(s) are active again, some operations, such
>   as setting the port's is_enabled flag, are not performed when transitioning
>   the link state back to BOND_LINK_UP from BOND_LINK_FAIL, though the state
>   prior to the speed check was BOND_LINK_DOWN.
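
To make the sequence in steps 1 and 2 concrete, here is a minimal standalone sketch, not kernel code. It assumes, as the last sentence of the description suggests, that step 1 leaves the slave in BOND_LINK_DOWN before the speed check runs. The names `slave_model`, `netdev_change` and `state_after_reset` are hypothetical stand-ins, and the `patched` flag models the extra `slave->link == BOND_LINK_UP` test proposed in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the bonding link states named in the thread. */
enum link_state { LINK_UP, LINK_FAIL, LINK_DOWN };

/* Simplified model of the per-slave fields the hunk reads and writes. */
struct slave_model {
	enum link_state link;	/* models slave->link */
	bool last_link_up;	/* models a nonzero slave->last_link_up */
	bool speed_valid;	/* device currently reports a valid speed/duplex */
};

/*
 * Models the 802.3ad branch of the NETDEV_CHANGE handling in the hunk.
 * patched == false: an invalid speed/duplex always rewrites the link state.
 * patched == true:  only a slave whose link is currently up is touched.
 */
static void netdev_change(struct slave_model *s, bool patched)
{
	assert(s);

	if (s->speed_valid)
		return;		/* speed/duplex update succeeded */
	if (patched && s->link != LINK_UP)
		return;		/* proposed extra condition */

	s->link = s->last_link_up ? LINK_FAIL : LINK_DOWN;
}

/* Replays steps 1 and 2 and returns the link state the slave ends up in. */
static enum link_state state_after_reset(bool patched)
{
	struct slave_model s = {
		.link = LINK_UP,
		.last_link_up = true,
		.speed_valid = true,
	};

	/* Step 1 (assumed): the master notices the lost link. */
	s.link = LINK_DOWN;

	/* Step 2: the device resets and cannot report speed/duplex. */
	s.speed_valid = false;
	netdev_change(&s, patched);

	return s.link;
}
```

Under these assumptions, `state_after_reset(false)` yields `LINK_FAIL` (the _DOWN to _FAIL change described in step 2), while `state_after_reset(true)` leaves the slave at `LINK_DOWN`, so the later transition back to up would start from _DOWN rather than _FAIL.
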

Just to make sure I'm understanding correctly, with regard to "the
state prior to the speed check was BOND_LINK_DOWN," do you mean that during
step 1, the slave link is set to BOND_LINK_DOWN, and then in step 2 changed
from _DOWN to _FAIL?

>Affected devices are therefore not utilized in the aggregation though they
>are operational. The simplest way to fix this seems to be to restrict the
>link state change to devices that are currently up and running.

This sounds similar to an issue from last fall; can you confirm
that you're running with a kernel that includes:

1899bb325149 bonding: fix state transition issue in link monitoring

	-J

>CC: Jay Vosburgh <j.vosbu...@gmail.com>
>CC: Veaceslav Falico <vfal...@gmail.com>
>CC: Andy Gospodarek <a...@greyhouse.net>
>Signed-off-by: Thomas Falcon <tlfal...@linux.ibm.com>
>---
> drivers/net/bonding/bond_main.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>index 2e70e43c5df5..d840da7cd379 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c
>@@ -3175,7 +3175,8 @@ static int bond_slave_netdev_event(unsigned long event,
> 		 * speeds/duplex are available.
> 		 */
> 		if (bond_update_speed_duplex(slave) &&
>-		    BOND_MODE(bond) == BOND_MODE_8023AD) {
>+		    BOND_MODE(bond) == BOND_MODE_8023AD &&
>+		    slave->link == BOND_LINK_UP) {
> 			if (slave->last_link_up)
> 				slave->link = BOND_LINK_FAIL;
> 			else
>-- 
>2.18.2
>

---
	-Jay Vosburgh, jay.vosbu...@canonical.com