Forgot to CC list...

> Try doing ifconfig eth1 down instead.  This will take carrier down on
> the NIC, causing the upstream switch to flush its learning table.
> This is more realistic too; bonds don't typically fail over when there
> isn't a problem, so connectivity loss is expected when using
> set-active-slave as you're doing.
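(For context, a quick sketch of the two failover methods being compared
here -- the bond name "bond0" below is illustrative, not taken from my
actual config:)

    # administrative failover: carrier stays up, switch keeps its table
    ovs-appctl bond/set-active-slave bond0 eth0

    # carrier-down failover: the upstream switch sees link loss
    ip link set eth1 down    # equivalent to: ifconfig eth1 down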
In a production environment, there might be cases where doing a manual
failover has its purposes:
 - maintenance, e.g. when you want to bring down a switch for maintenance
   and you want to do a controlled failover
 - because you want to spread the load to another switch (when the active
   and failback interfaces are on different switches)
 - because you want to have 2 servers on the same switch, ...

I followed your suggestion to bring down eth1 (which was the active one).
Can this flushing behavior be switch-dependent?

The device was disabled and a failover took place.  ethtool reports the
link to be down (Link detected: no).  There was no interruption in network
communication to the host...

bond_mode: active-backup
bond-hash-basis: 0
updelay: 200 ms
downdelay: 200 ms
lacp_negotiated: false

slave eth1: disabled
    may_enable: false

slave eth0: enabled
    active slave
    may_enable: true

But the 4 running kvm guests were unavailable for some time (I had fping
running to the guests during the test from a host several switches away):

During the switch, 100% loss:

Fri Jun 22 09:17:10 CEST 2012
sles111-flcapp-chico : xmt/rcv/%loss = 1/0/100%
sles111-flwapp-aka : xmt/rcv/%loss = 1/0/100%
sles111-repapp-ribet : xmt/rcv/%loss = 1/0/100%
centos6-jmsdb-jajan : xmt/rcv/%loss = 1/0/100%

1 second later, the first guest responds:

Fri Jun 22 09:17:11 CEST 2012
sles111-flcapp-chico : xmt/rcv/%loss = 1/0/100%
sles111-flwapp-aka : xmt/rcv/%loss = 1/1/0%, min/avg/max = 0.60/0.60/0.60
sles111-repapp-ribet : xmt/rcv/%loss = 1/0/100%
centos6-jmsdb-jajan : xmt/rcv/%loss = 1/0/100%

16 seconds later, 2 other guests start responding:

Fri Jun 22 09:17:26 CEST 2012
sles111-flcapp-chico : xmt/rcv/%loss = 1/1/0%, min/avg/max = 0.53/0.53/0.53
sles111-flwapp-aka : xmt/rcv/%loss = 1/1/0%, min/avg/max = 0.49/0.49/0.49
sles111-repapp-ribet : xmt/rcv/%loss = 1/1/0%, min/avg/max = 0.54/0.54/0.54
centos6-jmsdb-jajan : xmt/rcv/%loss = 1/0/100%

29 seconds later, the 4th guest starts responding:

Fri Jun 22 09:17:39 CEST 2012
sles111-flcapp-chico : xmt/rcv/%loss = 1/1/0%, min/avg/max = 0.29/0.29/0.29
sles111-flwapp-aka : xmt/rcv/%loss = 1/1/0%, min/avg/max = 0.45/0.45/0.45
sles111-repapp-ribet : xmt/rcv/%loss = 1/1/0%, min/avg/max = 0.46/0.46/0.46
centos6-jmsdb-jajan : xmt/rcv/%loss = 1/1/0%, min/avg/max = 0.80/0.80/0.80

> Something like this could be added to OVS as we currently do it for
> SLB bonds.  But the situation you're testing is unrealistic (due to the
> carrier not dropping) so I'd prefer to avoid it until there's a
> real-world use case.

Does it have other drawbacks besides extra arp communication?

Perhaps the test above is more realistic.  Yesterday I also did the test by
shutting down the port on the switch, and if I recall correctly, the same
behavior was seen.  The problem is that I don't have admin rights on the
switch, so I can't test it quickly.

Thanks for your response,
Frido Roose
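P.S.  As a possible workaround for the guest outage, I am considering
sending a gratuitous ARP for each guest address right after the failover,
so the upstream switches relearn the guest MACs on the new path.  A rough
sketch using arping from iputils (the interface name and address below are
illustrative):

    # run inside each guest (or automate from the host) after failover
    # -A: gratuitous ARP reply, -c 3: send three, -I eth0: guest NIC
    arping -c 3 -A -I eth0 192.0.2.10    # the guest's own IP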