On Tue, Mar 08, 2016 at 11:01:21PM -0600, Ryan Moats wrote: > > > > Ben Pfaff <b...@ovn.org> wrote on 03/08/2016 10:07:19 PM: > > > From: Ben Pfaff <b...@ovn.org> > > To: Ryan Moats/Omaha/IBM@IBMUS > > Cc: dev@openvswitch.org > > Date: 03/08/2016 10:07 PM > > Subject: Re: [ovs-dev] What looks like a bug in OVSDB change seqno > tracking > > > > On Tue, Mar 08, 2016 at 08:35:10PM -0600, Ryan Moats wrote: > > > While working on finishing up the next version of the incremental flow > > > processing patch, I stubbed > > > my toe on a situation where the port column of rows in the multicast > group > > > table of ovnsb were > > > being updated without the row's change seqno being updated. > > > > > > A quick look at the code base shows that weak references (like the port > > > column) are updated via > > > calls to ovsdb_txn_row_modify while row change seqnos are updated via > calls > > > to ovsdb_idl_row_* > > > and the two code paths don't appear to intersect. > > > > ovsdb_txn_row_modify() is part of ovsdb-server, and ovsdb_idl_row_*() is > > part of ovsdb-client, so they don't intersect per se. > > > > > While I consider such behavior a bug (because it means change seqnos > can't > > > be trusted), I wanted > > > to ping the list to confirm my cursory analysis before delving in to > trying > > > to find a way to address > > > this. > > > > Can you explain how to reproduce the issue? > > > > This gets a little complex, but I'll give it a try: > > Preparation: > > (1) Turn on change tracking for the entire SB database in > ovn/controller/ovn-controller.c > (2) In ovn/controller/physical.c, change SBREC_MULTICAST_GROUP_FOR_EACH to > SBREC_MULTICAST_GROUP_FOR_EACH_TRACKED, lookup the seqnos for each row and > dump them to the log > (3) In ovn/controller/ofctrl.c, change ovn_flow_log to always log messages > and add a line to log the flow that is passed to ofctrl_add_flow > > [If need be, I can see about extracting out a patch set that will make all > of the above changes] > > Now, run ovn E2E test case with 3 HVs, 1 VIFs/HV, 1 GW, 1LS (case 2014 in > my > setup) - it should pass, but if one looks at the output of > hv1/ovn-controller.log, one should see the following signatures: > > Initially, the first multicast row will produce: > > 2016-03-09T04:38:17.721Z|00105|physical|INFO|multicast row with 0 0 2 > 2016-03-09T04:38:17.721Z|00106|ofctrl|INFO|considering flow: table_id=33, > priority=100, reg7=0xffff,metadata=0x1, actions=set_field:0x1-> > reg5,set_field:0x1->reg7,resubmit(,34),set_field:0xffff->reg7 > > After the first tunnel port appears, the signature changes to: > > 2016-03-09T04:38:17.799Z|00150|physical|INFO|multicast row with 0 0 2 > 2016-03-09T04:38:17.799Z|00151|ofctrl|INFO|considering flow: table_id=33, > priority=100, reg7=0xffff,metadata=0x1, actions=set_field:0x1-> > reg5,set_field:0x1->reg7,resubmit(,34),set_field:0xffff->reg7 > 2016-03-09T04:38:17.799Z|00152|ofctrl|INFO|considering flow: table_id=32, > priority=100, reg7=0xffff,metadata=0x1, actions=set_field:0x1/0xffffff-> > tun_id,set_field:0xffff/0xffffffff->tun_metadata0,move:NXM_NX_REG6[0..14]-> > NXM_NX_TUN_METADATA0[16..30],output:2,resubmit(,33) > > After the second tunnel port appears, the signature changes to: > > 2016-03-09T04:38:20.028Z|00203|physical|INFO|multicast row with 0 0 2 > 2016-03-09T04:38:20.028Z|00204|ofctrl|INFO|considering flow: table_id=33, > priority=100, reg7=0xffff,metadata=0x1, actions=set_field:0x1-> > reg5,set_field:0x1->reg7,resubmit(,34),set_field:0xffff->reg7 > 2016-03-09T04:38:20.028Z|00205|ofctrl|INFO|considering flow: table_id=32, > priority=100, reg7=0xffff,metadata=0x1, actions=set_field:0x1/0xffffff-> > tun_id,output:3,set_field:0x1/0xffffff-> > tun_id,set_field:0xffff/0xffffffff->tun_metadata0,move:NXM_NX_REG6[0..14]-> > NXM_NX_TUN_METADATA0[16..30],output:2,resubmit(,33) > > Looking at the multicast code, it processes all the values in the > ports column of the multicast group row. The above signature says > to me that the the values in this column have changed, but the > change seqnos for that row didn't change (specifically the mod_seqno > was never set), which leads me to the conclusion that these seqnos > can't be trusted to reveal when rows of a table change... > > I'm hoping for a different interpretation...
As far as I can tell those rows get added once and never change, so that the change_seqnos are correct. I suspect that the problem here is that your instructions don't ever call ovsdb_idl_track_clear(). I'm appending the diff that I used. --8<--------------------------cut here-------------------------->8-- diff --git a/ovn/controller/ofctrl.c b/ovn/controller/ofctrl.c index 3297fb3..791175f 100644 --- a/ovn/controller/ofctrl.c +++ b/ovn/controller/ofctrl.c @@ -1,4 +1,4 @@ -/* Copyright (c) 2015 Nicira, Inc. +/* Copyright (c) 2015, 2016 Nicira, Inc. * * Licensed under the Apache License, Version 2.0 (the "License"); * you may not use this file except in compliance with the License. @@ -484,6 +484,8 @@ ofctrl_add_flow(struct hmap *desired_flows, f->ofpacts_len = actions->size; f->hmap_node.hash = ovn_flow_hash(f); + ovn_flow_log(f, "considering flow"); + if (ovn_flow_lookup(desired_flows, f)) { static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 5); if (!VLOG_DROP_INFO(&rl)) { diff --git a/ovn/controller/ovn-controller.c b/ovn/controller/ovn-controller.c index 5852e35..bd27ab2 100644 --- a/ovn/controller/ovn-controller.c +++ b/ovn/controller/ovn-controller.c @@ -259,6 +259,7 @@ main(int argc, char *argv[]) char *ovnsb_remote = get_ovnsb_remote(ovs_idl_loop.idl); struct ovsdb_idl_loop ovnsb_idl_loop = OVSDB_IDL_LOOP_INITIALIZER( ovsdb_idl_create(ovnsb_remote, &sbrec_idl_class, true, true)); + ovsdb_idl_track_add_all(ovnsb_idl_loop.idl); ovsdb_idl_get_initial_snapshot(ovnsb_idl_loop.idl); /* Initialize connection tracking zones. */ @@ -338,6 +339,7 @@ main(int argc, char *argv[]) } ovsdb_idl_loop_commit_and_wait(&ovnsb_idl_loop); ovsdb_idl_loop_commit_and_wait(&ovs_idl_loop); + ovsdb_idl_track_clear(ovnsb_idl_loop.idl); poll_block(); if (should_service_stop()) { exiting = true; diff --git a/ovn/controller/physical.c b/ovn/controller/physical.c index 657c3e2..35ca727 100644 --- a/ovn/controller/physical.c +++ b/ovn/controller/physical.c @@ -1,4 +1,4 @@ -/* Copyright (c) 2015 Nicira, Inc. +/* Copyright (c) 2015, 2016 Nicira, Inc. * * Licensed under the Apache License, Version 2.0 (the "License"); * you may not use this file except in compliance with the License. @@ -487,6 +487,15 @@ physical_run(struct controller_ctx *ctx, enum mf_field_id mff_ovn_geneve, const struct sbrec_multicast_group *mc; struct ofpbuf remote_ofpacts; ofpbuf_init(&remote_ofpacts, 0); + SBREC_MULTICAST_GROUP_FOR_EACH_TRACKED (mc, ctx->ovnsb_idl) { + struct ds s = DS_EMPTY_INITIALIZER; + for (int i = 0; i < OVSDB_IDL_CHANGE_MAX; i++) { + ds_put_format(&s, " %u", mc->header_.change_seqno[i]); + } + VLOG_INFO("multicast row %x, n_ports=%u, with%s", mc->header_.uuid.parts[0], mc->n_ports, ds_cstr(&s)); + ds_destroy(&s); + + } SBREC_MULTICAST_GROUP_FOR_EACH (mc, ctx->ovnsb_idl) { struct sset remote_chassis = SSET_INITIALIZER(&remote_chassis); struct match match; _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev