Hi,

tl;dr: 1) does using balance-tcp prevent ovs from using megaflows? 2) the balance-slb documentation is unclear when it's combined with lacp? -> full questions at the end
== background ==

I'm using openvswitch in its default out-of-the-box mode, as a mac learning L2 switch with multiple vlans for virtual machines on Debian/Xen hypervisors. OpenvSwitch on each physical server is connected with 2x1G ethernet ports to a 2x cisco 3750-X switch stack, using a bond with lacp. Average traffic levels are measured in Mb/s, not Gb/s: traffic between the physical servers and the switch averages between ~20 and ~150 Mbit/s.

Most of the servers run Debian Jessie (linux 3.16-ckt), Xen 4.4 and OpenvSwitch 2.3.0. A few of them are still running Debian Wheezy (linux 3.2), Xen 4.1 and OpenvSwitch 1.4.2.

== traffic interruptions ==

The last few weeks I've been investigating short network interruptions in our production network, lasting a few seconds each, which started to occur a few times a week during the last two months. Symptoms were flapping vrrp on routers and flapping load balancer health checks.

The first issue found was short bursts of unicast flooding on the network, mainly caused by a specific case of asymmetric routing. The asymmetric routing caused asymmetric L2 traffic, which made mac-address-to-port mappings expire, with unicast flooding as a result. It seems that ovs having to duplicate a stream of traffic into 100+ virtual nics does not help network stability(tm).

The asymmetric routing case could quite easily be solved by adding routing policies that send traffic back the way it came, instead of letting it take a shortcut to a connected network (a sketch of the kind of policy follows at the end of this section). This solved almost all of the unicast flooding.

== balance-tcp, flow counts, misses and lost flows ==

While investigating, I also started having a closer look at the behaviour of openvswitch itself. I added a little plugin to our monitoring (munin) to graph counters from the output of ovs-dpctl show (a sketch of the plugin is included below, just before the results). Especially the "lost" counter, which was increasing from time to time in several places, caught my attention.

When looking at the flows themselves (with ovs-dpctl dump-flows), I found out that the "megaflow" optimization that was introduced in ovs a long time ago did not seem to be applied at all in our case. A significant number of flows are related to dns resolver traffic, and they match on all possible fields, including source and destination ports. Since the client source port differs for every request, almost every packet is a miss that has to go to userspace, and the resulting flow will never be reused. Example:

  in_port(1),eth(src=02:00:52:5e:bc:05,dst=02:00:52:5e:bc:03),
  eth_type(0x8100),vlan(vid=10,pcp=0),encap(eth_type(0x0800),
  ipv4(src=82.94.188.6,dst=82.94.240.117,proto=17,tos=0,ttl=64,frag=no),
  udp(src=53,dst=50464)), packets:0, bytes:0, used:never,
  actions:pop_vlan,213

So, in a worst case scenario, when I do a dns request to a resolver that's behind a load balancer, with two routers in between, and all of them living on different physical servers, about 14 of these extremely specific flows need to be set up and torn down again. If I do 1000 requests, that's 14000, etc...

When searching for information about this, I came across an old mailing list post from Ben, http://openvswitch.org/pipermail/discuss/2014-January/012769.html, suggesting that using balance-tcp prevents the use of megaflows. That made sense when I read it: the hashing algorithm needs the L4 information, so the datapath flows cannot wildcard those fields.

== balance-slb ==

In a test environment, I tried to see what happens when changing the hashing method from balance-tcp to balance-slb. After all, with our traffic volumes, spreading the traffic a bit based on mac/vlan alone is sufficient.
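First, the promised sketch of the routing policy fix from the "traffic interruptions" section above. This only illustrates the kind of change; the addresses and the table number are made up:

  # Send replies from this network back via the router the traffic
  # arrived through, instead of via a directly connected shortcut.
  ip route add default via 192.0.2.1 table 100
  ip rule add from 192.0.2.0/24 lookup 100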
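The balance-slb change itself is a single command. A minimal sketch, where "bond0" stands in for our real bond port name:

  ovs-vsctl set port bond0 bond_mode=balance-slb

Afterwards, the LACP negotiation state and the hashing of traffic over the slaves can be checked with:

  ovs-appctl lacp/show bond0
  ovs-appctl bond/show bond0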
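For reference, the counters in the graphs below are gathered by (roughly) the following munin plugin. This is a minimal sketch, untested as pasted here, and the field names are my own:

#!/bin/sh
# Munin plugin sketch: graph datapath counters from "ovs-dpctl show".
if [ "$1" = "config" ]; then
    cat <<EOF
graph_title OVS datapath flows and lookups
graph_category network
flows.label flows
hit.label lookups hit
hit.type DERIVE
hit.min 0
missed.label lookups missed
missed.type DERIVE
missed.min 0
lost.label lookups lost
lost.type DERIVE
lost.min 0
EOF
    exit 0
fi
# "ovs-dpctl show" prints lines like:
#   lookups: hit:1218820 missed:12610 lost:0
#   flows: 23
ovs-dpctl show | awk '
    /lookups:/ { for (i = 2; i <= NF; i++) { split($i, a, ":"); print a[1] ".value", a[2] } }
    /flows:/ { print "flows.value", $2 }'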
The results were a dramatic decrease in flow misses, lost flows and flow counts in general, visible in all graphs. Yesterday, I applied the change to the production network, with the same results. Some graphs:

  https://syrinx.knorrie.org/~knorrie/keep/ovs/

The switch from balance-tcp to balance-slb is the point where most of the lines show a steep drop. The lost flows (yellow) seen yesterday but not today are the result of fixing a case of asymmetric traffic spikes. The wheezy boxes (the ones with the greek letter names, running the older ovs) do not show improvement, but we're emptying them anyway; there are a few busy routers (traffic complexity, not volume) on them which need to move first.

== questions ==

1. Does using balance-tcp hashing on a bond disable megaflows? If so, why isn't there a huge warning in the man page about this and about the significant performance hit that results?

2. The documentation about balance-slb is confusing when lacp is involved. In "Bonding Configuration", the text suggests that balance-slb and active-backup require: "On the upstream switch, do not configure the interfaces as a bond". The vswitchd/INTERNALS file, on the other hand, lists them under "Bond Balance Modes". It feels like the two separate concepts of choosing which ports are active (the bond mode) and choosing which traffic to send to the active ports (the hashing algorithm) are used interchangeably, in a confusing way. Am I right to assume that all the potential problems listed under "SLB Bonding" in vswitchd/INTERNALS do not apply to my situation, if I use LACP with the balance-slb hashing method on top? There's a single line of hope in the documentation which seems to suggest this: "after LACP negotiation is complete, there is no need for special handling of received packets".

Thanks,
Hans van Kranenburg