On Thu, Dec 15, 2011 at 3:46 PM, Ben Pfaff <b...@nicira.com> wrote:
> On Thu, Dec 15, 2011 at 03:24:38PM -0800, Jesse Gross wrote:
>> On Thu, Dec 15, 2011 at 1:34 PM, Ben Pfaff <b...@nicira.com> wrote:
>> > On Thu, Dec 15, 2011 at 01:00:00PM -0800, Jesse Gross wrote:
>> >> On Thu, Dec 15, 2011 at 12:34 PM, Ben Pfaff <b...@nicira.com> wrote:
>> >> > A workaround would be to call synchronize_rcu() and send the genl
>> >> > reply from some context that doesn't hold genl_lock, but that's hardly
>> >> > elegant.  Also it means that the real reply would come after the one
>> >> > generated automatically by af_netlink.c when NLM_F_ACK is used, which
>> >> > violates the normal rules.
>> >>
>> >> Isn't that almost exactly the same as sending the message from the RCU
>> >> callback (including the Netlink ordering issue)?
>> >
>> > In the "send from RCU callback" case, I intended that the normal
>> > Netlink reply would be sent synchronously just after deletion, just as
>> > it is now.  The ordering would therefore stay just as it is now.  Only
>> > the broadcast reply would be deferred.
>>
>> What do you consider the "normal Netlink reply"?  There are basically
>> 3 different pieces:
>>
>> * Multicast flow information to the relevant group (excluding the
>>   requester if NLM_F_ECHO is specified).
>> * Unicast flow information to the requester if NLM_F_ECHO is specified.
>> * Unicast Netlink acknowledgement if NLM_F_ACK is specified or there
>>   was an error.
>>
>> Are you talking about doing the last two as-is and moving the
>> multicast to the flow reclamation (but actually sending it to everyone
>> at that time)?
>
> Yes, that's roughly what I had in mind, and perhaps exactly, depending
> on what "but actually sending it to everyone" means.
> The idea is that
> the immediate, synchronous reply for NLM_F_ECHO would give approximate
> statistics, the later multicast would give exact statistics, and
> receiving the multicast could be used as an indicator that any upcalls
> generated by the flow have already been queued to userspace.
>
> I assumed one socket would be used for sending the request and
> receiving the unicast reply and another socket would be used for
> receiving the multicasts, since we'd want to get both separately.
I think we're talking about the same thing now, although I would be
somewhat uncomfortable with this as it seems a pretty non-standard use
of Netlink, where the unicast and multicast replies have different
values and can be inferred to mean different things.

>> > The "use synchronize_rcu() on the side then send the reply" case would
>> > change the message ordering.
>>
>> I guess I don't see why you couldn't use exactly the same strategy as
>> you choose above.
>
> I assumed that an intended advantage of using synchronize_rcu() was
> that all three forms

I think you could do it either way in both - as I said, they seem
pretty much equivalent to me.

>> > Could we just use the existing spinlock in sw_flow plus the 'dead'
>> > variable to ensure that no packets go through a deleted flow after
>> > it's deleted?  On the delete side, take the spinlock and set 'dead'.
>> > On the fast path, take the spinlock and check 'dead' before executing
>> > the actions, release the spinlock after executing the actions.  We
>> > already have to take the spinlock for every packet anyway to update
>> > the stats, so it's not an additional cost.  I doubt that there's any
>> > parallel work for a given flow anyway (that would imply packet
>> > reordering).  We would have to avoid recursive locking somehow.
>>
>> Hmm, I think that would probably work.  It's slightly convoluted but
>> certainly much less so than the alternatives.  How would you get
>> recursive locking?
>
> Flow loops e.g. through patch ports was what came to mind.

We don't execute any actions while the spin lock is held and trips
through a patch port will appear as different flows, so I don't think
that there is a possibility of recursive locking.
I think you basically do:

For flow processing:
 * Extract and lookup flow
 * Take spin lock
 * Check to see if flow is dead
 * Else update stats
 * Release spin lock
 * If dead, act like you didn't find the flow
 * Else execute actions

For flow deletion:
 * Take spin lock
 * Read stats
 * Mark flow dead
 * Release spin lock

_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev