On Fri, Jul 22, 2016 at 09:54:26PM +0000, Ryan Moats wrote: > [1] reported increased failure rates in certain tests > with incremental processing (the numbers are the number of failures > seen in 100 tests): > > 2 ovn -- vtep: 3 HVs, 1 VIFs/HV, 1 GW, 1 LS > 10 ovn -- 2 HVs, 2 LS, 1 lport/LS, 2 peer LRs > 52 ovn -- 1 HV, 1 LS, 2 lport/LS, 1 LR > 45 ovn -- 1 HV, 2 LSs, 1 lport/LS, 1 LR > 23 ovn -- 2 HVs, 3 LS, 1 lport/LS, 2 peer LRs, static routes > 53 ovn -- 2 HVs, 3 LRs connected via LS, static routes > 32 ovn -- 2 HVs, 2 LRs connected via LS, gateway router > 50 ovn -- icmp_reply: 1 HVs, 2 LSs, 1 lport/LS, 1 LR > > These failures were caused by a combination of problems in > handling physical changes: > > 1. When a vif was removed, the localvif_to_ofport entry was not > removed. > 2. When a physical change was detected, ovn-controller would wait > a poll cycle before processing the logical flow table. > > This patch set addresses both of these issues while simultaneously > cleaning up the code in physical.c. A side effect is a modification > of where OF flows are dumped in the gateway router case that allowed > the root causes of this issue to be found. > > With these changes, all of the above tests had a 100/100 success rate. > > [1] http://openvswitch.org/pipermail/dev/2016-July/075803.html > > Signed-off-by: Ryan Moats <rmo...@us.ibm.com>
Thank you! This made a major improvement for me, too. I'm not seeing 100/100 but I also run my tests with TESTSUITEFLAGS=-j10, which is really stressful. I folded in the following changes (most importantly, to make it obvious why we're calling poll_immediate_wake(), which the commit message explained but the code didn't) and applied this to master. diff --git a/ovn/controller/physical.c b/ovn/controller/physical.c index dc4ef77..11b2c2b 100644 --- a/ovn/controller/physical.c +++ b/ovn/controller/physical.c @@ -55,8 +55,6 @@ physical_register_ovs_idl(struct ovsdb_idl *ovs_idl) static struct simap localvif_to_ofport = SIMAP_INITIALIZER(&localvif_to_ofport); -static struct simap new_localvif_to_ofport = - SIMAP_INITIALIZER(&new_localvif_to_ofport); static struct hmap tunnels = HMAP_INITIALIZER(&tunnels); /* UUID to identify OF flows not associated with ovsdb rows. */ @@ -606,6 +604,8 @@ physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve, uuid_generate(hc_uuid); } + struct simap new_localvif_to_ofport = + SIMAP_INITIALIZER(&new_localvif_to_ofport); for (int i = 0; i < br_int->n_ports; i++) { const struct ovsrec_port *port_rec = br_int->ports[i]; if (!strcmp(port_rec->name, br_int->name)) { @@ -674,8 +674,10 @@ physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve, tun->ofport = u16_to_ofp(ofport); tun->type = tunnel_type; full_binding_processing = true; - lflow_reset_processing(); binding_reset_processing(); + + /* Reprocess logical flow table immediately. */ + lflow_reset_processing(); poll_immediate_wake(); break; } else { @@ -712,6 +714,8 @@ physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve, } if (localvif_map_changed) { full_binding_processing = true; + + /* Reprocess logical flow table immediately. */ lflow_reset_processing(); poll_immediate_wake(); } @@ -853,9 +857,10 @@ physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve, ofctrl_add_flow(OFTABLE_DROP_LOOPBACK, 0, &match, &ofpacts, hc_uuid); ofpbuf_uninit(&ofpacts); - simap_clear(&new_localvif_to_ofport); HMAP_FOR_EACH_POP (tun, hmap_node, &tunnels) { free(tun); } hmap_clear(&tunnels); + + simap_destroy(&new_localvif_to_ofport); } _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev