On Sat, Jul 02, 2016 at 04:11:14PM -0400, Lance Richardson wrote: > > > ----- Original Message ----- > > From: "Ben Pfaff" <b...@ovn.org> > > To: "Lance Richardson" <lrich...@redhat.com> > > Cc: dev@openvswitch.org > > Sent: Saturday, July 2, 2016 3:23:35 PM > > Subject: Re: [ovs-dev] [PATCH] ovn-controller: process lport bindings only > > when transaction is possible > > > > On Sat, Jul 02, 2016 at 01:56:21PM -0400, Lance Richardson wrote: > > > As currently implemented, binding_run() normally updates the set of > > > locally owned logical ports on each call. When changes to the > > > membership of this set are detected (i.e. when locally bound > > > logical ports are added or deleted), additional processing to > > > update the sb database with lport binding is performed. > > > > > > However, the sb database can only be updated when a transaction to > > > the sb database is possible (that is, when ctx->ovnsb_idl_txn is > > > non-NULL). If a new logical port is detected while ctx->ovnsb_idl_txn > > > happens to be NULL, its binding information will not be updated in > > > the the sb database until another change to the set of locally-owned > > > logical ports changes. If no such change ever occurs, the sb database > > > is never updated with the appropriate binding information. > > > > > > Eliminate this issue by only updating the set of locally owned logical > > > ports when an sb database transaction is possible. This addresses > > > a cause of occasional failures in the "3 HVs, 3 LS, 3 lports/LS, 1 LR" > > > test case. > > > > > > Signed-off-by: Lance Richardson <lrich...@redhat.com> > > > --- > > > ovn/controller/binding.c | 2 +- > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > diff --git a/ovn/controller/binding.c b/ovn/controller/binding.c > > > index 8b439a6..369f8f2 100644 > > > --- a/ovn/controller/binding.c > > > +++ b/ovn/controller/binding.c > > > @@ -266,7 +266,7 @@ binding_run(struct controller_ctx *ctx, const struct > > > ovsrec_bridge *br_int, > > > } > > > > > > if (br_int) { > > > - if (get_local_iface_ids(br_int, &lports)) { > > > + if (ctx->ovnsb_idl_txn && get_local_iface_ids(br_int, &lports)) { > > > process_full_binding = true; > > > } > > > } else { > > > > At a glance, get_local_iface_ids() doesn't seem to directly modify the > > southbound database, but it does modify data structures that later code > > depends on. Based on that, it's not obvious for me to see that this is > > a correct change. Can you point out anything relevant? > > > > Thanks, > > > > Ben. > > > > Hi Ben, > > The failing scenario goes like this: > 1) Test case logical network setup is complete. > 2) The last physical network port is added via > as hv3 ovs-vsctl --add-port ... --set Interface vif333 > external-ids:iface-id=lp333 > 3) hv3 ovn-controller receives update from hv3 ovsdb-server with above > mapping, > binding_run() is called, and ctx->ovnsb_idl_txn happens to be NULL. > 4) binding_run() calls get_local_iface_ids(), which recognizes the new > local port as matching a logical port, so the lp333 is added to the > global ssets "lports" and "all_lports". This means lp333 will not be > treated > as a new logical port on subsequent calls. Because getLocal_iface_ids() > has discovered a new lport, it returns changed = true. > 5) Because get_local_iface_ids() returned true, binding_run() sets > process_full_binding > to true. > 6) Because process_full_binding is true, binding_run() calls > consider_local_datapath() > for each logical port in shash_lports (which now includes lp333). > 7) consider_local_datapath() processing returns without calling > sbrec_port_binding_set_chassis() because ctx->ovnsb_idl_txn is NULL. > 8) There are subsequent calls to binding_run() with non-NULL > ctx->ovnsb_idl, > but because lp333 is already in the "lports" sset, get_local_iface_ids() > returns changed=false, so process_full_binding is false, which means > consider_local_datapath() is not called for lp333. > 9) Because consider_local_datapath() is not called for lp333, the sb > database > is not updated with the lport/chassis binding. > > Hopefully the above is intelligible. Another way of looking at it would be > to say the condition for calling consider_local_datapath() is an "edge > trigger", > this change suppresses the trigger until the necessary actions can be > performed.
That is an amazingly detailed rationale. I appended all of it to the commit message and applied this to master. > One thing I have a doubt about is whether there's any guarantee that, after a > call > with ctx->ovnsb_idl_txn==NULL, there will be another call with non-NULL > ctx->ovnsb_idl_txn, > but I suspect that other things would not work without this being the case. This should be guaranteed. _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev