SLB bonds are very strange beasts.  It's taken OVS a while to figure out
how they should really work.  Way back in the mists of time, when we were
in the midst of this process, we noticed that the following could happen:

        1. Local VM sends a packet to the OVS bridge.
        2. OVS bridge learns VM's MAC, forwards packets to SLB bond.
        3. Remote switch hasn't learned packet's destination and
           forwards packet back to other interfaces in the SLB bond.

Normally nothing bad happens in this scenario because OVS has already
learned the local port for the VM's MAC in step 2 and the rules for SLB
bonding (this is rule #2 in vswitchd/INTERNALS).  But at the time we were
implementing this, OVS didn't yet use active flows to keep MAC learning
entries alive; only new flows prevented a MAC entry from aging out.  So
in steady state (e.g. just "ping" traffic) OVS would regularly forget MAC
addresses.  If the remote switch also happened to forward a packet back to
one of the SLB bond interfaces, then OVS would learn the VM's address on
the bond, with the result that any traffic coming in from the remote switch
would be black-holed until the VM sent a new packet.  This was not good.

The fix we applied at the time was commit 2416b8ecea (bridge: Eject NORMAL
flows without a learning entry from datapath.) followed by a small
refinement in commit e96a4d8035 (bridge: Feed flow stats into learning
table.).  This fix causes flows that don't have a learning entry to be
ejected from the datapath if revalidation occurs.  This forced the next
packet in the flow to go to userspace, which in turn caused learning to
happen, fixing the problem.

However, this isn't a good solution for several reasons:

        * It forces more packets to userspace, which is expensive.

        * It doesn't just affect the cases where it helps, those where an
          SLB bond is actually involved.  (This could be fixed, but it is
          not worth it.)

        * It means that flow installability becomes nondeterministic.  When
          the first packet shows up for a flow, we install it.  But later
          if we revalidate it, we have to uninstall it.  That doesn't make
          sense; a flow should be either installable or not installable,
          not some weird mix.

Fortunately, the situation has improved since this fix was originally
designed.  First, active flows now keep MAC learning entries alive, since
commit e96a4d8035367 (bridge: Feed flow stats into learning table.)
Second, gratuitous ARP locking, added in commit 7febb9100b (bridge: Filte
some gratuitous ARPs on bond slaves.) means that gratuitous ARPs reflected
on bond slaves don't cause confusion (this is rule #4 in
vswitchd/INTERNALS).

These improvements mean that it is no longer necessary to have this
strange special case at all.  Therefore, this commit removes it.

I found this while investigating reports from code that I added to
occasionally check that flow actions were correct.

Signed-off-by: Ben Pfaff <b...@nicira.com>
---
 ofproto/ofproto-dpif.c |    8 --------
 1 files changed, 0 insertions(+), 8 deletions(-)

diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
index 56c3baf..1a399c0 100644
--- a/ofproto/ofproto-dpif.c
+++ b/ofproto/ofproto-dpif.c
@@ -5306,14 +5306,6 @@ xlate_normal(struct action_xlate_ctx *ctx)
         if (mac->port.p != in_bundle) {
             output_normal(ctx, mac->port.p, vlan);
         }
-    } else if (!ctx->packet && !eth_addr_is_multicast(ctx->flow.dl_dst)) {
-        /* If we are revalidating but don't have a learning entry then eject
-         * the flow.  Installing a flow that floods packets opens up a window
-         * of time where we could learn from a packet reflected on a bond and
-         * blackhole packets before the learning table is updated to reflect
-         * the correct port. */
-        ctx->may_set_up_flow = false;
-        return;
     } else {
         struct ofbundle *bundle;
 
-- 
1.7.2.5

_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Reply via email to