Thanks Ben for the late night review an improving the commit message.
On Sat, Aug 3, 2013 at 12:18 AM, Ben Pfaff <b...@nicira.com> wrote: > Thanks, Andy and Jesse, for the confirmation. I rewrote the commit > message and applied this to master and branch-1.11 as follows. > > I'm really bleary-eyed tired. I hope the following is correct. > > Thanks, > > Ben. > > --8<--------------------------cut here-------------------------->8-- > > From: Andy Zhou <az...@nicira.com> > Date: Fri, 2 Aug 2013 20:22:17 -0700 > Subject: [PATCH] ofproto-dpif: avoid losing track of kernel flows upon > reinstallation > > This commit fixes a problem whereby userspace can lose track of a > flow installed in the kernel, instead believing that the flow is > not installed. The most visible consequence of this bug was a > message in the ovs-vswitchd log warning about an unexpected flow > in the kernel. Other possible consequences included loss of > statistics and failure to updates actions when the OpenFlow flow > table changed. > > The problem arose in the following scenario. Suppose userspace > sets up a kernel flow due to an arriving packet. Before kernel > flow setup completes, another packet for that flow arrives. The > kernel sends the new packet to userspace after userspace has > completed processing the batch of packets that set up the flow. > Userspace then attempts to reinstall the kernel flow. This fails > with EEXIST, so userspace then marked the flow as not-installed, > even though it was successfully installed before and remains > installed. The next time userspace dumped the kernel flow > table to gather statistics, it would complain about an unexpected > flow and delete it. > > In practice, we have seen these messages with netperf TCP_CRR tests and > UDP stream tests. > > This patch fixes the problem by changing userspace so that, once > it successfully installs a flow in the kernel, it will not reinstall > it when it sees another packet for the flow in userspace. This > has the downside that, if something goes wrong and a flow > disappears from the kernel (e.g. ovs-dpctl del-flows), then userspace > won't reinstall it (until it tries to delete it). (This is in fact > the reason why until now userspace reinstalled flows it knew it > already installed.) > > Some more background may be warranted. There are two EEXIST error > cases: > > 1. A subfacet was installed successfully in a previous (recent) > batch. Now we've attempted to reinstall exactly the same > subfacet in this batch. > > 2. A subfacet was installed successfully in a previous (recent) > batch or earlier in the current batch. We've attempted to > install a subfacet for an overlapping megaflow. > > Before megaflows, installation errors were ignored completely. > Since megaflows were introduced, they have been handled by > considering on any installation error that the given subfacet is > not installed. This works well for case #2 but causes case #1 to > yield unexpected flows, as described at the top of the commit > message. > > This commit adds the wrinkle that we never try to reinstall > exactly the same subfacet that we know we installed successfully > earlier (and haven't deleted) unless its actions change. This > ought to work just as well for case #2, and avoids the problem > with case #1. > > Prepared with assistance from Ethan. > > Signed-off-by: Andy Zhou <az...@nicira.com> > [b...@nicira.com rewrote the commit message] > Signed-off-by: Ben Pfaff <b...@nicira.com> > --- > ofproto/ofproto-dpif.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c > index 3bf4fe3..29a93e6 100644 > --- a/ofproto/ofproto-dpif.c > +++ b/ofproto/ofproto-dpif.c > @@ -3460,7 +3460,7 @@ handle_flow_miss_with_facet(struct flow_miss *miss, > struct facet *facet, > subfacet_update_stats(subfacet, stats); > } > > - if (miss->upcall_type == DPIF_UC_MISS || subfacet->path != want_path) > { > + if (subfacet->path != want_path) { > struct flow_miss_op *op = &ops[(*n_ops)++]; > struct dpif_flow_put *put = &op->dpif_op.u.flow_put; > > -- > 1.7.10.4 > > > > On Sat, Aug 03, 2013 at 12:01:51AM -0700, Andy Zhou wrote: > > Yes, this is my understanding as well. > > I did not mention the implication of mega flow in the commit message > > because it is not the problem. But it is important in discussing about > the > > solution. > > > > Your summary provides a better and more complete picture. > > > > thanks for the review. > > > > > > On Fri, Aug 2, 2013 at 11:36 PM, Ben Pfaff <b...@nicira.com> wrote: > > > > > On Fri, Aug 02, 2013 at 10:27:14PM -0700, Andy Zhou wrote: > > > > In the case of mega flow would this cause multiple subfacets within a > > > facet > > > > to be considered installed? > > > > > > > > > > > > On Fri, Aug 2, 2013 at 10:21 PM, Ben Pfaff <b...@nicira.com> wrote: > > > > > > > > > On Fri, Aug 02, 2013 at 08:22:17PM -0700, Andy Zhou wrote: > > > > > > This patch prevents the same subfacet from trying to install the > > > kernel > > > > > > flow more than once. > > > > > > > > > > > > This is how a kernel flow can become unexpected to the user > space. > > > > > > When a incoming packet did not match any flow in the kernel, it > > > > > > would be sent up to the user space. A subfacet would be created > and > > > > > > a corresponding kenrel flow installed. So far so good. > > > > > > > > > > > > Just before the kernel flow was installed, another packet of the > same > > > > > > flow were to arrive at the kernel, it would also be queued up > > > > > > for user space. This time, user space would find the subfacet > just > > > > > > created (subfacet_create()) due of the first packet, and attempt > > > > > > to install the same kernel flow again, but could not (kernel > returns > > > > > EEXIST). > > > > > > User space would then mark the subfacet as not installed > > > > > (SF_NOT_INSTALLED). > > > > > > > > > > Can we just ignore EEXIST in the error handling loop at the bottom > of > > > > > handle_miss_upcalls()? > > > > > > > > > > > OK, now I'm starting to see the whole picture. > > > > > > There are two EEXIST error cases: > > > > > > 1. A subfacet was installed successfully in a previous (recent) > > > batch. Now we've attempted to reinstall exactly the same > > > subfacet in this batch. > > > > > > 2. A subfacet was installed successfully in a previous (recent) > > > batch or earlier in the current batch. We've attempted to > > > install a subfacet for an overlapping megaflow. > > > > > > So far we have two ideas for handling these: > > > > > > a) The current way, which is to consider on any installation > > > error that a given subfacet is not installed. This works > well > > > for case #2 but causes case #1 to yield unexpected flows, as > > > described in the commit message here. > > > > > > b) The way proposed in this patch, which is the same as a) with > > > the added wrinkle that we never try to reinstall exactly the > > > same subfacet that we know we installed successfully earlier > > > (and haven't deleted) unless its actions change. This ought > > > to work just as well as a) for case #2, and avoids the > problem > > > with case #1. > > > > > > The (minor) flaw in b) is that if something goes wrong and a flow > > > disappears from the kernel (e.g. ovs-dpctl del-flows), then userspace > > > won't reinstall it (until it tries to delete it). > > > > > > If I understand the patch properly now, then I'm willing to commit it. > > > Is that right? > > > > > > Thanks, > > > > > > Ben. > > > >
_______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev