The "ovn -- 2 HVs, 3 LRs connected via LS, static routes"
test case currently exhibits frequent failures. These failures
occur because, at the time that the test packets are sent to
verify forwarding, no flows have been installed in the vswitch
for one of the hypervisors.

Investigation shows that, in the failing case, the ofctrl state
machine has not yet transitioned to the S_UPDATE_FLOWS state.
This occurrs when ofctrl_run() is called and:
   1) The state is S_TLV_TABLE_MOD_SENT.
   2) An OFPTYPE_NXT_TLV_TABLE_REPLY message is queued for reception.
   3) No event (other than SB probe timer expiration) is expected
      that would unblock poll_block() in the main ovn-controller
      loop.

In this scenario, ofctrl_run() will move state to S_CLEAR_FLOWS
and return, without having executed run_S_CLEAR_FLOWS() which
would have immediately transitioned the state to S_UPDATE_FLOWS
which is needed in order for ovn-controller to configure flows
in ovs-vswitchd. After a delay of about 5 seconds (the default
SB probe timer interval), ofctrl_run() would be called again
to make the transition to S_UPDATE_FLOWS, but by this time
the test case has already failed.

Fix by expanding the state machine's "while state != old_state"
loop to include processing of receive events. Without this
fix, around 40 failures are seen out of 100 attempts, with
this fix no failures have been observed in several hundred
attempts.

Signed-off-by: Lance Richardson <lrich...@redhat.com>
---
 ovn/controller/ofctrl.c | 46 +++++++++++++++++++++++-----------------------
 1 file changed, 23 insertions(+), 23 deletions(-)

diff --git a/ovn/controller/ofctrl.c b/ovn/controller/ofctrl.c
index 4c410da..7cd60d6 100644
--- a/ovn/controller/ofctrl.c
+++ b/ovn/controller/ofctrl.c
@@ -410,36 +410,36 @@ ofctrl_run(const struct ovsrec_bridge *br_int)
         default:
             OVS_NOT_REACHED();
         }
-    } while (state != old_state);
 
-    for (int i = 0; state == old_state && i < 50; i++) {
-        struct ofpbuf *msg = rconn_recv(swconn);
-        if (!msg) {
-            break;
-        }
+        for (int i = 0; state == old_state && i < 50; i++) {
+            struct ofpbuf *msg = rconn_recv(swconn);
+            if (!msg) {
+                break;
+            }
 
-        const struct ofp_header *oh = msg->data;
-        enum ofptype type;
-        enum ofperr error;
+            const struct ofp_header *oh = msg->data;
+            enum ofptype type;
+            enum ofperr error;
 
-        error = ofptype_decode(&type, oh);
-        if (!error) {
-            switch (state) {
+            error = ofptype_decode(&type, oh);
+            if (!error) {
+                switch (state) {
 #define STATE(NAME) case NAME: recv_##NAME(oh, type); break;
-                STATES
+                    STATES
 #undef STATE
-            default:
-                OVS_NOT_REACHED();
+                default:
+                    OVS_NOT_REACHED();
+                }
+            } else {
+                char *s = ofp_to_string(oh, ntohs(oh->length), 1);
+                VLOG_WARN("could not decode OpenFlow message (%s): %s",
+                          ofperr_to_string(error), s);
+                free(s);
             }
-        } else {
-            char *s = ofp_to_string(oh, ntohs(oh->length), 1);
-            VLOG_WARN("could not decode OpenFlow message (%s): %s",
-                      ofperr_to_string(error), s);
-            free(s);
-        }
 
-        ofpbuf_delete(msg);
-    }
+            ofpbuf_delete(msg);
+        }
+    } while (state != old_state);
 
     return (state == S_CLEAR_FLOWS || state == S_UPDATE_FLOWS
             ? mff_ovn_geneve : 0);
-- 
2.5.5

_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Reply via email to