On Fri, Jul 22, 2016 at 11:04:47PM -0400, Lance Richardson wrote:
> ----- Original Message -----
> > From: "Ben Pfaff" <b...@ovn.org>
> > To: "Lance Richardson" <lrich...@redhat.com>
> > Cc: dev@openvswitch.org
> > Sent: Friday, July 22, 2016 6:47:14 PM
> > Subject: Re: [ovs-dev] [PATCH v2] ovn-controller: eliminate stall in ofctrl 
> > state machine
> > 
> > On Thu, Jul 07, 2016 at 08:31:08PM -0400, Lance Richardson wrote:
> > > The "ovn -- 2 HVs, 3 LRs connected via LS, static routes"
> > > test case currently exhibits frequent failures. These failures
> > > occur because, at the time that the test packets are sent to
> > > verify forwarding, no flows have been installed in the vswitch
> > > for one of the hypervisors.
> > > 
> > > Investigation shows that, in the failing case, the ofctrl state
> > > machine has not yet transitioned to the S_UPDATE_FLOWS state.
> > > This occurrs when ofctrl_run() is called and:
> > >    1) The state is S_TLV_TABLE_MOD_SENT.
> > >    2) An OFPTYPE_NXT_TLV_TABLE_REPLY message is queued for reception.
> > >    3) No event (other than SB probe timer expiration) is expected
> > >       that would unblock poll_block() in the main ovn-controller
> > >       loop.
> > > 
> > > In this scenario, ofctrl_run() will move state to S_CLEAR_FLOWS
> > > and return, without having executed run_S_CLEAR_FLOWS() which
> > > would have immediately transitioned the state to S_UPDATE_FLOWS
> > > which is needed in order for ovn-controller to configure flows
> > > in ovs-vswitchd. After a delay of about 5 seconds (the default
> > > SB probe timer interval), ofctrl_run() would be called again
> > > to make the transition to S_UPDATE_FLOWS, but by this time
> > > the test case has already failed.
> > > 
> > > Fix by expanding the state machine's "while state != old_state"
> > > loop to include processing of receive events, with a maximum
> > > iteration limit to prevent excessive looping in pathological
> > > cases. Without this fix, around 40 failures are seen out of
> > > 100 attempts, with this fix no failures have been observed in
> > > several hundred attempts.
> > > 
> > > Signed-off-by: Lance Richardson <lrich...@redhat.com>
> > > ---
> > >  v2: Added maximum iteration limit to state machine loop.
> > 
> > Thanks for investigating this.
> > 
> > I understand why your patch fixes a problem, but I don't yet understand
> > the root cause of the problem, and I'd like to know more before I commit
> > the patch.  I have two questions:
> > 
> > First, I don't understand why, if there's a message queued for reception
> > as you describe in 2), the main loop would wait for a timer expiration.
> > Before poll_block(), the main loop should call ofctrl_wait(), which
> > calls rconn_recv_wait(), which should cause poll_block() to wake up if
> > there's anything incoming from the OpenFlow connection.
> > 
> > Second, I don't know why the test "state == old_state" is here in
> > ofctrl_run().  I think that we can delete it.  It could be a culprit,
> > but on the other hand it seems like it's not a big deal if poll_block()
> > properly wakes up when a message is received:
> > 
> >     for (int i = 0; state == old_state && i < 50; i++) {
> >         struct ofpbuf *msg = rconn_recv(swconn);
> >         if (!msg) {
> >             break;
> >         }
> > 
> > Thanks,
> > 
> > Ben.
> > 
> As currently implemented, when ofctrl_run() is called,
> the run_*() functions will be called to transition through states that
> don't depend on external events until the state no longer changes, then
> any pending receive events are processed until the arbitrary limit of 50
> events is processed, all receive events have been processed, or the state
> changes (which does seem odd).
> 
> <This is the important bit:>
> If the receive event handling happens to transition to a state for which
> receive events are not expected, i.e. where the run_*() needs to be called
> in order to transition to the next state, then the ofctrl_run() will return
> and the transition will not be made until an event (timer expiration or 
> receive) occurs.
> 
> So the "message queued for reception" I referred to in the description is the
> one that transitions into S_CLEAR_FLOWS state.  ofctrl_run() returns after
> making this transition, despite the fact there is no receive event expected 
> which would
> cause ofctrl_run() to be called again. The state machine is stuck
> in this state until the next event (timer expiration) occurs to unblock
> the poll loop and cause ofctrl_run() to be called again.  When ofctrl_run()
> is called, the run_S_CLEAR_FLOWS() function immediately transitions into
> S_UPDATE_FLOWS state.
> 
> Another way to describe this: if a receive event causes a state transition,
> the run_*() function for the new state needs to be called in order to
> execute the actions for the new state (including possibly transitioning
> into another state).  If this doesn't occur, the state machine stalls
> until an event occurs to unblock the poll loop.
> 
> I can't say I'm particularly happy with the solution proposed in my patch,
> there's probably a better way. If the above explanation still doesn't help,
> let me know and I'll make another attempt when I'm a little less tired ;-)

Now that you explained it so clearly, it's obvious ;-)  Sorry I didn't
catch on quickly.  Your patience is admirable.

Your version will definitely work.  However, in the furtherance of a
long personal tradition of screwing with people's code, here's my
version.

--8<--------------------------cut here-------------------------->8--

From: Lance Richardson <lrich...@redhat.com>
Date: Thu, 7 Jul 2016 20:31:08 -0400
Subject: [PATCH] ovn-controller: eliminate stall in ofctrl state machine

The "ovn -- 2 HVs, 3 LRs connected via LS, static routes"
test case currently exhibits frequent failures. These failures
occur because, at the time that the test packets are sent to
verify forwarding, no flows have been installed in the vswitch
for one of the hypervisors.

The state machine implemented by ofctrl_run() is intended to
iterate as long as progress is being made, either as long as
the state continues to change or as long as packets are being
received.  Unfortunately, the code had a bug: if receiving a
packet caused the state to change, it didn't call the state's
run function again to try to see if it would change the state.
This caused a real problem in the following case:

   1) The state is S_TLV_TABLE_MOD_SENT.
   2) An OFPTYPE_NXT_TLV_TABLE_REPLY message is received.
   3) No event (other than SB probe timer expiration) is expected
      that would unblock poll_block() in the main ovn-controller
      loop.

In such a case, ofctrl_run() would receive the packet and
advance the state, but not call the run function for the new
state, and then leave the state machine paused until the next
event (e.g. a timer event) occurred.

This commit fixes the problem by continuing to iterate the state
machine until the state remains the same and no packet is
received in the same iteration.  Without this fix, around 40
failures are seen out of 100 attempts, with this fix no failures
have been observed in several hundred attempts (using an earlier
version of this patch).

Signed-off-by: Lance Richardson <lrich...@redhat.com>
Signed-off-by: Ben Pfaff <b...@ovn.org>
---
 ovn/controller/ofctrl.c | 56 ++++++++++++++++++++++++++++---------------------
 1 file changed, 32 insertions(+), 24 deletions(-)

diff --git a/ovn/controller/ofctrl.c b/ovn/controller/ofctrl.c
index 184e86f..f0451b7 100644
--- a/ovn/controller/ofctrl.c
+++ b/ovn/controller/ofctrl.c
@@ -35,6 +35,7 @@
 #include "openvswitch/vlog.h"
 #include "ovn-controller.h"
 #include "ovn/lib/actions.h"
+#include "poll-loop.h"
 #include "physical.h"
 #include "rconn.h"
 #include "socket-util.h"
@@ -409,9 +410,10 @@ ofctrl_run(const struct ovsrec_bridge *br_int)
         state = S_NEW;
     }
 
-    enum ofctrl_state old_state;
-    do {
-        old_state = state;
+    bool progress = true;
+    for (int i = 0; progress && i < 50; i++) {
+        /* Allow the state machine to run. */
+        enum ofctrl_state old_state = state;
         switch (state) {
 #define STATE(NAME) case NAME: run_##NAME(); break;
             STATES
@@ -419,35 +421,41 @@ ofctrl_run(const struct ovsrec_bridge *br_int)
         default:
             OVS_NOT_REACHED();
         }
-    } while (state != old_state);
 
-    for (int i = 0; state == old_state && i < 50; i++) {
+        /* Try to process a received packet. */
         struct ofpbuf *msg = rconn_recv(swconn);
-        if (!msg) {
-            break;
-        }
-
-        const struct ofp_header *oh = msg->data;
-        enum ofptype type;
-        enum ofperr error;
+        if (msg) {
+            const struct ofp_header *oh = msg->data;
+            enum ofptype type;
+            enum ofperr error;
 
-        error = ofptype_decode(&type, oh);
-        if (!error) {
-            switch (state) {
+            error = ofptype_decode(&type, oh);
+            if (!error) {
+                switch (state) {
 #define STATE(NAME) case NAME: recv_##NAME(oh, type); break;
-                STATES
+                    STATES
 #undef STATE
-            default:
-                OVS_NOT_REACHED();
+                default:
+                    OVS_NOT_REACHED();
+                }
+            } else {
+                char *s = ofp_to_string(oh, ntohs(oh->length), 1);
+                VLOG_WARN("could not decode OpenFlow message (%s): %s",
+                          ofperr_to_string(error), s);
+                free(s);
             }
-        } else {
-            char *s = ofp_to_string(oh, ntohs(oh->length), 1);
-            VLOG_WARN("could not decode OpenFlow message (%s): %s",
-                      ofperr_to_string(error), s);
-            free(s);
+
+            ofpbuf_delete(msg);
         }
 
-        ofpbuf_delete(msg);
+        /* If we did some work, plan to go around again. */
+        progress = old_state != state || msg;
+    }
+    if (progress) {
+        /* We bailed out to limit the amount of work we do in one go, to allow
+         * other code a chance to run.  We were still making progress at that
+         * point, so ensure that we come back again without waiting. */
+        poll_immediate_wake();
     }
 
     return (state == S_CLEAR_FLOWS || state == S_UPDATE_FLOWS
-- 
2.1.3

_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev

Reply via email to