On 11 July 2014 21:03, Thomas Graf <tg...@noironetworks.com> wrote:
> > Correct. My main concern with doing it the first time it is required, is
> > how it's synchronised. Flow dump is RCU. I don't really know what the
> > threadsafety requirements are for this code, or what options are available
> > to address this. Is it viable to check the pointer is NULL, allocate the
> > cache, perform an atomic swap, and free the (potentially NULL) returned
> > pointer?
>
> Right, you would need the cmpxchg + RCU protection or take ovs_lock().
> Given the Netlink buffer is sized large enough, the cost of taking the
> lock once for a large number of flows might get amortized compared to
> the expensive translation. A benchmark would help.
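For reference, the cmpxchg + RCU variant would look roughly like the sketch
below. This is only a sketch under assumptions: struct nl_cache, the
flow->nl_cache field and build_nl_cache() are hypothetical names, not what
the patch actually adds.

static struct nl_cache *flow_get_nl_cache(struct sw_flow *flow)
{
        struct nl_cache *cache, *old;

        cache = rcu_dereference(flow->nl_cache);
        if (cache)
                return cache;

        /* Flow dump runs under rcu_read_lock(), so the allocation inside
         * this (hypothetical) helper would have to be GFP_ATOMIC. */
        cache = build_nl_cache(flow, GFP_ATOMIC);
        if (!cache)
                return NULL;

        /* Publish the cache; cmpxchg() implies a full barrier, so readers
         * see fully initialised contents.  If another dumper won the race,
         * free ours and use theirs. */
        old = cmpxchg(&flow->nl_cache, NULL, cache);
        if (old) {
                kfree(cache);
                return old;
        }
        return cache;
}

The cache itself would then be freed along with the flow from the existing
destruction path, i.e. after the usual RCU grace period.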
I'm skeptical of taking the ovs_lock(). The current patch performs the
key/mask cache construction inside ovs_lock(), as part of the critical
section for flow install. If we perform it during flow dump instead, but
extend the ovs_lock() to cover the entire dump operation, we risk making
things worse due to the added contention: if I follow correctly, the
mspin_lock() bottleneck that remains after the current patch is applied is
already measuring contention on the ovs_mutex when revalidators attempt to
delete flows, and that is with the flow dump done under RCU.

Another sidenote:

> Looking at the code, OVS does not handle NLM_F_DUMP_INTR in user space yet
> and the kernel dump does not call genl_dump_check_consistent() yet to
> provide the flag. So what can currently happen even without your patch is
> that if the dump requires more than one Netlink message, the RCU critical
> section is left and thus a list deletion or insertion of a flow may occur
> in the middle of the dump and thus provide a slightly incorrect set of
> flows.

Yes, I believe this is done deliberately for performance reasons. OVS
userspace doesn't assume that it gets a consistent snapshot of the flow
table; it handles cases where flows are duplicated or missing in the dump.
An idea has been floated to keep track of flow adds/deletes in the kernel
to counteract this, but I got distracted by this idea first :-) (I've put a
rough sketch of what wiring up the consistency check might look like at the
bottom of this mail.)

> > The alternative that I mentioned below the commit message is that we
> > directly copy the key/mask from the original netlink message, and adjust
> > the mask for any fields that have wildcarded bits where the kernel only
> > supports full masking. This should be cheaper than converting to sw_flow
> > and back again as this patch does currently. Furthermore, it shouldn't
> > require locking, so the critical section can remain the same size as
> > currently.
>
> Right, needs verification of your assumption.

The kernel module is free to ignore the mask completely, or to install a
flow that matches more specifically than the given mask:

https://github.com/openvswitch/ovs/blob/branch-2.3/datapath/README#L107-L113

So it's just a matter of figuring out which fields lack support for
wildcarding. This option also needs to consider the case where userspace
doesn't specify a mask at all.
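Concretely, I imagine the mask adjustment looking something like the sketch
below, where exact_match_only() is just a placeholder for whatever list of
non-wildcardable attributes we end up deriving from the flow translation
code:

/* Widen a cached copy of the userspace-supplied mask so that it reports
 * exact matches for fields the kernel cannot wildcard. */
static void fixup_cached_mask(struct nlattr *mask_attr)
{
        struct nlattr *a;
        int rem;

        nla_for_each_nested(a, mask_attr, rem) {
                if (exact_match_only(nla_type(a)))
                        /* The kernel installs an exact match here, so the
                         * dumped mask should read back as all-ones. */
                        memset(nla_data(a), 0xff, nla_len(a));
        }
}

If userspace supplied no OVS_FLOW_ATTR_MASK at all, the cached mask would
have to be synthesized as an all-exact mask covering the key, so that case
needs its own path.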
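Going back to the NLM_F_DUMP_INTR sidenote, wiring the consistency check up
would be roughly the following. Again only a sketch under assumptions:
dp->table_gen is an invented per-datapath generation counter bumped under
ovs_lock() on every flow add/delete and sampled into cb->seq by the dump
callback, and genl_dump_check_consistent()'s exact signature differs between
kernel versions (older ones also take the genl_family).

static int flow_dump_fill_one(struct sk_buff *skb, struct netlink_callback *cb,
                              const struct sw_flow *flow, u32 portid)
{
        void *hdr;

        hdr = genlmsg_put(skb, portid, cb->nlh->nlmsg_seq,
                          &dp_flow_genl_family, NLM_F_MULTI, OVS_FLOW_CMD_NEW);
        if (!hdr)
                return -EMSGSIZE;

        /* If the table generation no longer matches cb->seq, genetlink sets
         * NLM_F_DUMP_INTR on this message so user space can tell that the
         * dump may be inconsistent. */
        genl_dump_check_consistent(cb, hdr);

        /* ... fill OVS_FLOW_ATTR_KEY / _MASK / _ACTIONS as today ... */

        genlmsg_end(skb, hdr);
        return 0;
}

User space would then have to notice NLM_F_DUMP_INTR and restart the dump,
which is the piece we don't implement today.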