----- Original Message ----- > From: "Lance Richardson" <lrich...@redhat.com> > To: dev@openvswitch.org > Sent: Tuesday, July 5, 2016 8:58:24 AM > Subject: [ovs-dev] [PATCH] ovn-controller: eliminate stall in ofctrl state > machine > > The "ovn -- 2 HVs, 3 LRs connected via LS, static routes" > test case currently exhibits frequent failures. These failures > occur because, at the time that the test packets are sent to > verify forwarding, no flows have been installed in the vswitch > for one of the hypervisors. > > Investigation shows that, in the failing case, the ofctrl state > machine has not yet transitioned to the S_UPDATE_FLOWS state. > This occurrs when ofctrl_run() is called and: > 1) The state is S_TLV_TABLE_MOD_SENT. > 2) An OFPTYPE_NXT_TLV_TABLE_REPLY message is queued for reception. > 3) No event (other than SB probe timer expiration) is expected > that would unblock poll_block() in the main ovn-controller > loop. > > In this scenario, ofctrl_run() will move state to S_CLEAR_FLOWS > and return, without having executed run_S_CLEAR_FLOWS() which > would have immediately transitioned the state to S_UPDATE_FLOWS > which is needed in order for ovn-controller to configure flows > in ovs-vswitchd. After a delay of about 5 seconds (the default > SB probe timer interval), ofctrl_run() would be called again > to make the transition to S_UPDATE_FLOWS, but by this time > the test case has already failed. > > Fix by expanding the state machine's "while state != old_state" > loop to include processing of receive events. Without this > fix, around 40 failures are seen out of 100 attempts, with > this fix no failures have been observed in several hundred > attempts. > > Signed-off-by: Lance Richardson <lrich...@redhat.com> > ---
Note that this change looks much larger than it really is because of whitespace changes, it boils down to moving this line to include the for loop and re-indenting the for loop: > - } while (state != old_state); > > - for (int i = 0; state == old_state && i < 50; i++) { > - struct ofpbuf *msg = rconn_recv(swconn); > - if (!msg) { > - break; > - } > + for (int i = 0; state == old_state && i < 50; i++) { > + struct ofpbuf *msg = rconn_recv(swconn); > + if (!msg) { > + break; > + } > > - const struct ofp_header *oh = msg->data; > - enum ofptype type; > - enum ofperr error; > + const struct ofp_header *oh = msg->data; > + enum ofptype type; > + enum ofperr error; > > - error = ofptype_decode(&type, oh); > - if (!error) { > - switch (state) { > + error = ofptype_decode(&type, oh); > + if (!error) { > + switch (state) { > #define STATE(NAME) case NAME: recv_##NAME(oh, type); break; > - STATES > + STATES > #undef STATE > - default: > - OVS_NOT_REACHED(); > + default: > + OVS_NOT_REACHED(); > + } > + } else { > + char *s = ofp_to_string(oh, ntohs(oh->length), 1); > + VLOG_WARN("could not decode OpenFlow message (%s): %s", > + ofperr_to_string(error), s); > + free(s); > } > - } else { > - char *s = ofp_to_string(oh, ntohs(oh->length), 1); > - VLOG_WARN("could not decode OpenFlow message (%s): %s", > - ofperr_to_string(error), s); > - free(s); > - } > > - ofpbuf_delete(msg); > - } > + ofpbuf_delete(msg); > + } > + } while (state != old_state); > > return (state == S_CLEAR_FLOWS || state == S_UPDATE_FLOWS > ? mff_ovn_geneve : 0); > -- > 2.5.5 > > _______________________________________________ > dev mailing list > dev@openvswitch.org > http://openvswitch.org/mailman/listinfo/dev > _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev