On Mon, Jul 18, 2016 at 11:34:54AM -0700, Guru Shetty wrote: > On 18 July 2016 at 10:35, Ben Pfaff <b...@ovn.org> wrote: > > > On Mon, Jul 18, 2016 at 12:00:30AM -0700, Gurucharan Shetty wrote: > > > We should not use "peer" while connecting a router to a switch. > > > (Doing so, will cause ovn-northd to constantly create and destroy > > > port_binding records which causes CPU utilization of ovn-controller to > > > spike up.) > > > > > > Fixes: 31114af758c7e6 ("ovn-nbctl: Update logical router port commands.") > > > Signed-off-by: Gurucharan Shetty <g...@ovn.org> > > > > I updated the commit message to correctly say: > We should not use "peer" while connecting a router to a switch. > (Doing so, will cause ovn-northd to constantly create and destroy > logical_flow records which causes CPU utilization of ovn-controller to > spike up.) > > > > This seems like a bug in ovn-northd. Did you investigate why it > > happens? > > > I think I know ( I haven't put a debugger to confirm). We create a logical > flow incorrectly and add it via ovn_lflow_add() with a switch's datapath > and a router's pipeline stage (S_ROUTER_IN_ARP_RESOLVE) here (This is when > we incorrectly set router's peer as a switch): > https://github.com/openvswitch/ovs/blob/master/ovn/northd/ovn-northd.c#L2614
Oh, good spotting, can we fix it something like this? diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c index 7ce509d..97e3271 100644 --- a/ovn/northd/ovn-northd.c +++ b/ovn/northd/ovn-northd.c @@ -715,7 +736,10 @@ join_logical_ports(struct northd_context *ctx, sizeof *op->od->router_ports * (op->od->n_router_ports + 1)); op->od->router_ports[op->od->n_router_ports++] = op; } else if (op->nbr && op->nbr->peer) { - op->peer = ovn_port_find(ports, op->nbr->peer); + struct ovn_port *peer = ovn_port_find(ports, op->nbr->peer); + if (peer && peer->nbr) { + op->peer = peer; + } } } } > In build_lflows, we go through each record of SB database's logical flows > and then do a ovn_lflow_find() which returns a negative if the data was > wrongly inserted, so it goes ahead and deleted the record from SB database. > > In the next block, it will insert it back into the SB database. I will send > one additional fix. But, I think we should also assert in ovn_lflow_add if > we add a datapath with a different datapath's pipeline - not sure what is a > nice way to do that. Here's an idea (it compiles but I didn't test it): diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c index 7ce509d..905f6a7 100644 --- a/ovn/northd/ovn-northd.c +++ b/ovn/northd/ovn-northd.c @@ -175,6 +175,20 @@ ovn_stage_to_str(enum ovn_stage stage) default: return "<unknown>"; } } + +/* Returns the type of the datapath to which a flow with the given 'stage' may + * be added. */ +static enum ovn_datapath_type +ovn_stage_to_datapath_type(enum ovn_stage stage) +{ + switch (stage) { +#define PIPELINE_STAGE(DP_TYPE, PIPELINE, STAGE, TABLE, NAME) \ + case S_##DP_TYPE##_##PIPELINE##_##STAGE: return DP_##DP_TYPE; + PIPELINE_STAGES +#undef PIPELINE_STAGE + default: OVS_NOT_REACHED(); + } +} static void usage(void) @@ -303,6 +317,13 @@ ovn_datapath_destroy(struct hmap *datapaths, struct ovn_datapath *od) } } +/* Returns 'od''s datapath type. */ +static enum ovn_datapath_type +ovn_datapath_get_type(const struct ovn_datapath *od) +{ + return od->nbs ? DP_SWITCH : DP_ROUTER; +} + static struct ovn_datapath * ovn_datapath_find(struct hmap *datapaths, const struct uuid *uuid) { @@ -985,6 +1006,8 @@ ovn_lflow_add(struct hmap *lflow_map, struct ovn_datapath *od, enum ovn_stage stage, uint16_t priority, const char *match, const char *actions) { + ovs_assert(ovn_stage_to_datapath_type(stage) == ovn_datapath_get_type(od)); + struct ovn_lflow *lflow = xmalloc(sizeof *lflow); ovn_lflow_init(lflow, od, stage, priority, xstrdup(match), xstrdup(actions)); Thanks, Ben. _______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev