Re: [ovs-dev] [PATCH] util: fix compile warnings
On Fri, Jul 04, 2014 at 12:33:03PM -0700, Ansis Atteka wrote: > This patch fixes two compile warnings introduced by commit > 64b73291 ("util: create a copy of program_name"): > 1. ../lib/util.c:457:5: error: passing argument 1 of 'free' >discards 'const' qualifier from pointer target type; And > 2. ../lib/util.c:463:5: error: ISO C90 forbids mixed declarations >and code [-Werror=declaration-after-statement] (affected only >branch-2.3 that is C90 compliant and not the master) > > Reported-By: Joe Stringer > Reported-By: Lorand Jakab > Signed-Off-By: Ansis Atteka I don't see any warnings on current master. Did Yamamoto-san's commit (util: Suppress a warning by adding CONST_CAST) fix them? ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH 2/2] lib/hash: Use CRC32 for hashing.
On Fri, Jul 04, 2014 at 11:55:30AM -0700, Jarno Rajahalme wrote: > > On Jun 12, 2014, at 1:31 PM, Ben Pfaff wrote: > > > On Wed, Jun 04, 2014 at 04:32:21PM -0700, Jarno Rajahalme wrote: > >> Use CRC32 intrinsics for hash computations when building for > >> X86_64 with SSE4_2. > >> > >> Signed-off-by: Jarno Rajahalme > > > > Did you consider using __builtin_constant_p() to inline the hash > > computation only when the number of words is a constant? (On MSVC, > > which doesn't have __builtin_constant_p(), you could default to always > > inlining or never inlining, since this is not a correctness issue.) > > You mean like this: Right, that's what I mean.
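[Editor's note: the dispatch Ben suggests, inlining the hash only when the word count is a compile-time constant, can be sketched as below. The function names and the placeholder mixing step are illustrative only, not the actual lib/hash.h API or the CRC32 intrinsic path.]

```c
#include <stddef.h>
#include <stdint.h>

/* Placeholder mixing step standing in for the CRC32 intrinsic. */
static inline uint32_t hash_word(uint32_t x, uint32_t basis)
{
    uint32_t h = basis ^ x;
    h *= 0x85ebca6b;
    return h ^ (h >> 13);
}

/* Out-of-line version used when 'n' is only known at run time. */
static uint32_t hash_words_generic(const uint32_t *p, size_t n, uint32_t basis)
{
    uint32_t h = basis;
    for (size_t i = 0; i < n; i++) {
        h = hash_word(p[i], h);
    }
    return h;
}

/* Inline the loop only when the compiler can see 'n' as a constant and
 * fully unroll it; otherwise fall back to the out-of-line version.  On
 * compilers without __builtin_constant_p() (e.g. MSVC) this collapses to
 * always calling the generic version, which is correct either way, as
 * Ben notes this is a performance choice, not a correctness issue. */
static inline uint32_t hash_words(const uint32_t *p, size_t n, uint32_t basis)
{
#ifdef __GNUC__
    if (__builtin_constant_p(n) && n <= 8) {
        uint32_t h = basis;
        for (size_t i = 0; i < n; i++) {
            h = hash_word(p[i], h);
        }
        return h;
    }
#endif
    return hash_words_generic(p, n, basis);
}
```

Both paths compute the same value, so callers never need to know which one was chosen.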
Re: [ovs-dev] [PATCH] tests: Fix race in 'balance-tcp bonding' test.
On Fri, Jul 04, 2014 at 10:58:28AM +1200, Joe Stringer wrote: > Occasionally, this test would fail, with some of the ports reporting > "may_enable: false" in the bond/show output. This commit fixes the race > condition by waiting for may_enable to be true for all bond ports. > > Signed-off-by: Joe Stringer > --- > Running the test in a tight loop could cause this failure to happen > after about 5 runs. I'm guessing that the LACP negotiation doesn't > finish before we start sending packets. I would put that into the commit message. It is relevant. Acked-by: Ben Pfaff
Re: [ovs-dev] [PATCHv3] dpif: Support fetching flow mask via dpif_flow_get().
On Fri, Jul 04, 2014 at 12:37:12PM +1200, Joe Stringer wrote: > Change the interface to allow implementations to pass back a buffer, and > allow callers to specify which of actions, mask, and stats they wish to > receive. This will be used in the next commit. > > Signed-off-by: Joe Stringer > --- > v3: Set *bufp to NULL before calling ->flow_get(). > Allocate the correct size for the buffer in dpif_netdev_flow_get(). > Don't overwrite mask with actions in dpif_netdev_flow_get(). > Remove unneeded *bufp = NULL in dpif_netdev_flow_get(). > v2: First post. This is for branch-2.3, I think. Acked-by: Ben Pfaff
Re: [ovs-dev] [PATCH] Prepare ground for extensions to sFlow export
I took a look at the patch. It's not necessary to submit a pull-request. The change seems reasonable to me. Will you submit a followup patch to make use of the new statistics? I do not think it makes sense to apply this until they are used. On Fri, Jun 27, 2014 at 11:34:11AM -0700, Neil McKee wrote: > OK, I forked openvswitch/ovs on github to create this repo: > > https://github.com/sflow/ovs > > and then pushed the patch in there (with "Signed-off-by" in the commit > comment). > > https://github.com/sflow/ovs/commit/7aff910325fa3a4a11d363f09e06f83c64209485 > > Should I submit a pull-request? > > Regards, > Neil > > > -- > Neil McKee > InMon Corp. > http://www.inmon.com > > > On Mon, Jun 23, 2014 at 1:20 PM, Ben Pfaff wrote: > > > On Fri, Jun 13, 2014 at 12:01:50PM -0700, Neil McKee wrote: > > > Standard LACP counters are added to the LACP module, and > > > the sFlow library and test modules are extended to support the > > > export of those LACP counters as well as tunnel and OpenFlow > > > related structures. None of these structures are actually > > > exported yet, so this patch should have no discernible > > > effect. Hence no changes to the unit tests. > > > > > > Signed-off-by: Neil McKee > > > > Hi Neil. This patch is badly whitespace damaged. Could you repost > > it? (If you can't make that work, then it's also acceptable to push > > it to a publicly accessible Git repo and point to it.) > > > > Thanks, > > > > Ben.
Re: [ovs-dev] [PATCH v4 0/3] Support for layer 3 ports/flows
On Fri, Jun 27, 2014 at 04:21:52PM +0300, Lorand Jakab wrote: > This series implements support for layer 3 ports, of which we have one example > so far, the LISP vport. LISP support is currently implemented with a hack, by > adding/removing the Ethernet header within the datapath/vport-lisp.c file. By > removing the assumption that all packets/flows have an Ethernet header, this > series adds generic support for layer 3 ports in OVS, and thus it is expected > that the user/kernel space API for LISP support will not change. This will > allow the upstreaming of the LISP vport, resulting in a decreased delta > against the Linux kernel module, which is the goal that started this work. It > will also allow supporting layer 3 GRE tunnels, for which patches have been > proposed based on this work. > > The patch set doesn't change the current behavior when a packet from a LISP > port is sent to a layer 2 port or the other way around, so it's not necessary > to change existing flow rules. The implementation will automatically add the > appropriate pop_eth and push_eth actions to datapath flows. This may change in > the future when OpenFlow support for these actions will be added, since > discussions on EXT-112 in the ONF showed preference for explicitly requiring > the pop_eth and push_eth actions in OpenFlow rules. I notice that this series tends to use tabs for indentation in userspace code, whereas it should use spaces. I'm looking at other issues too of course.
Re: [ovs-dev] [PATCH v4 0/3] Support for layer 3 ports/flows
On 7/7/14, 7:03 PM, Ben Pfaff wrote:
> On Fri, Jun 27, 2014 at 04:21:52PM +0300, Lorand Jakab wrote:
>> This series implements support for layer 3 ports, of which we have one example
>> so far, the LISP vport. LISP support is currently implemented with a hack, by
>> adding/removing the Ethernet header within the datapath/vport-lisp.c file. By
>> removing the assumption that all packets/flows have an Ethernet header, this
>> series adds generic support for layer 3 ports in OVS, and thus it is expected
>> that the user/kernel space API for LISP support will not change. This will
>> allow the upstreaming of the LISP vport, resulting in a decreased delta
>> against the Linux kernel module, which is the goal that started this work. It
>> will also allow supporting layer 3 GRE tunnels, for which patches have been
>> proposed based on this work.
>>
>> The patch set doesn't change the current behavior when a packet from a LISP
>> port is sent to a layer 2 port or the other way around, so it's not necessary
>> to change existing flow rules. The implementation will automatically add the
>> appropriate pop_eth and push_eth actions to datapath flows. This may change in
>> the future when OpenFlow support for these actions will be added, since
>> discussions on EXT-112 in the ONF showed preference for explicitly requiring
>> the pop_eth and push_eth actions in OpenFlow rules.
>
> I notice that this series tends to use tabs for indentation in userspace
> code, whereas it should use spaces.

Sorry about that, I have vim set up for spaces; not sure how those tabs slipped in.

> I'm looking at other issues too of course.

Thanks for reviewing.
Re: [ovs-dev] [PATCH v4 1/3] userspace: add support for pop_eth and push_eth actions
On Fri, Jun 27, 2014 at 04:21:53PM +0300, Lorand Jakab wrote: > These actions will allow L2->L3 and L3->L2 switching, and are supposed > to be added to flows installed in the datapath transparently by > ovs-vswitchd. > > Signed-off-by: Lorand Jakab Looks good to me, needs approval of Jesse or Pravin.
Re: [ovs-dev] [PATCH v4 2/3] userspace: add layer 3 flow and switching support
On Fri, Jun 27, 2014 at 04:21:54PM +0300, Lorand Jakab wrote: > This commit relaxes the assumption that all packets have an Ethernet > header, and adds support for layer 3 flows. For each packet received on > the Linux kernel datapath the l2 and l3 members of struct ofpbuf are > initialized appropriately, and some functions now expect this (notably > flow_extract()), in order to differentiate between layer 2 and layer 3 > packets. struct flow now has a new 'base_layer' member, because we > cannot assume that a flow has no Ethernet header when eth_src and > eth_dst are 0. For layer 3 packets, the protocol type is still stored > in the eth_type member. > > Switching L2->L3 and L3->L2 are both implemented by adding the pop_eth > and push_eth actions respectively when a transition is detected. The > push_eth action puts 0s on both source and destination MACs. These > addresses can be modified with mod_dl_dst and mod_dl_src actions. > > Added new prerequisite MFP_ETHERNET for fields MFF_ETH_SRC, MFF_ETH_DST, > MFF_VLAN_TCI, MFF_DL_VLAN, MFF_VLAN_VID and MFF_DL_VLAN_PCP. > > Signed-off-by: Lorand Jakab

"sparse" warnings:

../lib/meta-flow.c:1148:26: warning: incorrect type in assignment (different base types)
../lib/meta-flow.c:1148:26:    expected unsigned int [unsigned] [usertype] base_layer
../lib/meta-flow.c:1148:26:    got restricted __be32 [usertype]
../lib/odp-util.c:3484:26: warning: incorrect type in assignment (different base types)
../lib/odp-util.c:3484:26:    expected unsigned int [unsigned] [usertype] base_layer
../lib/odp-util.c:3484:26:    got restricted __be32

Otherwise this looks good to me.
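[Editor's note: the sparse warning above comes from assigning a network-byte-order (`__be32`) value to a host-order field. The usual fix is an explicit `ntohl()` at the assignment. A minimal illustration with toy types follows; the names `wire_be32`, `toy_flow`, and `set_base_layer` are invented for this sketch, not the actual meta-flow.c code.]

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Toy stand-in: sparse would annotate the wire type as __be32 (OVS
 * spells it ovs_be32).  Here it is a plain uint32_t so the example
 * compiles without sparse's attribute machinery. */
typedef uint32_t wire_be32;

struct toy_flow {
    uint32_t base_layer;    /* host byte order */
};

/* Assigning 'value' to the field directly would mix a __be32 with a
 * host-order integer and trigger sparse's "incorrect type in
 * assignment (different base types)" warning.  Converting with
 * ntohl() silences the warning and is the semantically correct thing
 * on both big- and little-endian hosts. */
static void set_base_layer(struct toy_flow *flow, wire_be32 value)
{
    flow->base_layer = ntohl(value);
}
```

The same pattern (an `ntohl()`/`htonl()` at every boundary between wire and host representations) is what keeps sparse's endianness checking useful.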
Re: [ovs-dev] [PATCH 05/12] lib/classifier: Remove unused typedef cls_cb_func.
On Mon, Jun 30, 2014 at 08:17:21AM -0700, Jarno Rajahalme wrote: > Signed-off-by: Jarno Rajahalme Acked-by: Ben Pfaff
Re: [ovs-dev] [PATCH 03/12] lib/ovs-atomic: Add atomic compare_exchange.
On Fri, Jul 04, 2014 at 06:03:53AM -0700, Jarno Rajahalme wrote: > > On Jul 3, 2014, at 2:26 PM, Ben Pfaff wrote: > > > On Mon, Jun 30, 2014 at 08:17:19AM -0700, Jarno Rajahalme wrote: > >> Add support for atomic compare_exchange operations. > >> > >> Add ovs_refcount_try_ref(), which takes a reference only if the > >> refcount is non-zero and returns true if a reference was taken, false > >> otherwise. This can be used in combined RCU/refcount scenarios where > >> we have an RCU protected reference to an refcounted object, but which > >> may be unref'ed at any time. If ovs_refcount_try_ref() fails, the > >> object may still be safely used until the current thread quiesces. > >> > >> Signed-off-by: Jarno Rajahalme > > > > I doubt that "relaxed" memory order is good enough here, given the > > commentary on the previous patch. > > I split this into two parts for v2, so that it is clear what we are > referring to. The new ovs_refcount_try_ref() never unrefs, so it seems > to me the commentary on taking a reference applies. Hence relaxed > memory model should be appropriate. The caller already has protected > access to the object via RCU, otherwise calling this on an object > whose reference count is zero would be accessing freed memory. Maybe > we should enforce RCU for this at least by naming the function as > 'ovs_refcount_try_ref_rcu()'? OK, that makes sense. I was reading the code without the mental context that it was always used in an RCU environment, but that's the only place that it really makes sense anyway. I think that your naming suggestion is appropriate, too. > > Not every architecture supports compare-and-exchange, but it sounds > > like probably the ones we care about do (do we need to support ARMv6 > > or PA-RISC?). > > Do you know how GCC and clang are handling these? Do they provide some > fallback implementations (maybe with locks?) for these platforms? > > Jarno
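[Editor's note: the try-ref operation under discussion, take a reference only if the count is non-zero, retrying on contention, can be sketched with C11 atomics as below. The names are illustrative of the idea, not copied from lib/ovs-atomic.h.]

```c
#include <stdatomic.h>
#include <stdbool.h>

struct refcount {
    atomic_uint count;
};

/* Returns true if a reference was taken, false if the count was zero.
 *
 * Relaxed ordering suffices because, as Jarno argues above, the caller
 * already holds safe (RCU-protected) access to the object; the CAS only
 * has to be atomic, it does not need to order any other memory
 * accesses. */
static bool refcount_try_ref(struct refcount *rc)
{
    unsigned int old = atomic_load_explicit(&rc->count, memory_order_relaxed);
    do {
        if (old == 0) {
            return false;       /* Object already on its way to destruction. */
        }
        /* On failure the CAS reloads 'old' for us, so we just loop. */
    } while (!atomic_compare_exchange_weak_explicit(
                 &rc->count, &old, old + 1,
                 memory_order_relaxed, memory_order_relaxed));
    return true;
}
```

The weak form of the CAS may fail spuriously, which is why it belongs in a loop; on architectures without a native compare-and-exchange the compiler's atomics runtime typically falls back to a lock-based implementation.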
Re: [ovs-dev] [PATCH v2] lib/classifier: Lockless lookups.
It looks like you posted a v2 of just the beginning of the series also? I'll review those, then we can look at the rest of the series. On Mon, Jun 30, 2014 at 04:40:06PM -0700, Jarno Rajahalme wrote: > Just to be clear, this is a v2 on the patch 12/12 of the series, due to the > batching changes Ethan just pushed. The 1-11 seem to still apply. > > Jarno > > On Jun 30, 2014, at 4:37 PM, Jarno Rajahalme wrote: > > > Now that all the relevant classifier structures use RCU and internal > > mutual exclusion for modifications, we can remove the fat-rwlock and > > thus make the classifier lookups lockless. > > > > As the readers are operating concurrently with the writers, a > > concurrent reader may or may not see a new rule being added by a > > writer, depending on how the concurrent events overlap with each > > other. Overall, this is no different from the former locked behavior, > > but there the visibility of the new rule only depended on the timing > > of the locking functions. > > > > A new rule is first added to the segment indices, so the readers may > > find the rule in the indices before the rule is visible in the > > subtables 'rules' map. This may result in us losing the opportunity > > to quit lookups earlier, resulting in sub-optimal wildcarding. This > > will be fixed by forthcoming revalidation always scheduled after flow > > table changes. > > > > Similar behavior may happen due to us removing the overlapping rule > > (if any) from the indices only after the corresponding new rule has > > been added. > > > > The subtable's max priority is updated only after a rule is inserted > > to the maps, so the concurrent readers may not see the rule, as the > > updated priority ordered subtable list will only be visible after the > > subtable's max priority is updated. 
> > > > Similarly, the classifier's partitions are updated by the caller after > > the rule is inserted to the maps, so the readers may keep skipping the > > subtable until they see the updated partitions. > > > > Signed-off-by: Jarno Rajahalme > > --- > > v2: Rebase to master (classifier lookup batching). > > > > lib/classifier.c| 35 --- > > lib/classifier.h| 45 - > > lib/dpif-netdev.c | 30 -- > > ofproto/ofproto-dpif.c |2 -- > > ofproto/ofproto.c | 33 ++--- > > ofproto/ofproto.h |3 +++ > > tests/test-classifier.c | 36 +--- > > utilities/ovs-ofctl.c |4 > > 8 files changed, 58 insertions(+), 130 deletions(-) > > > > diff --git a/lib/classifier.c b/lib/classifier.c > > index 8ba6867..fc05efe 100644 > > --- a/lib/classifier.c > > +++ b/lib/classifier.c > > @@ -481,7 +481,6 @@ classifier_init(struct classifier *cls_, const uint8_t > > *flow_segments) > > { > > struct cls_classifier *cls = xmalloc(sizeof *cls); > > > > -fat_rwlock_init(&cls_->rwlock); > > ovs_mutex_init(&cls->mutex); > > > > ovs_mutex_lock(&cls->mutex); > > @@ -506,7 +505,8 @@ classifier_init(struct classifier *cls_, const uint8_t > > *flow_segments) > > } > > > > /* Destroys 'cls'. Rules within 'cls', if any, are not freed; this is the > > - * caller's responsibility. */ > > + * caller's responsibility. > > + * May only be called after all the readers have been terminated. */ > > void > > classifier_destroy(struct classifier *cls_) > > OVS_EXCLUDED(cls_->cls->mutex) > > @@ -517,7 +517,6 @@ classifier_destroy(struct classifier *cls_) > > struct cls_subtable *subtable, *next_subtable; > > int i; > > > > -fat_rwlock_destroy(&cls_->rwlock); > > if (!cls) { > > return; > > } > > @@ -682,7 +681,10 @@ classifier_is_empty(const struct classifier *cls) > > /* Returns the number of rules in 'cls'. 
*/ > > int > > classifier_count(const struct classifier *cls) > > +OVS_NO_THREAD_SAFETY_ANALYSIS > > { > > +/* n_rules is an int, so in the presence of concurrent writers this > > will > > + * return either the old or a new value. */ > > return cls->cls->n_rules; > > } > > > > @@ -1711,6 +1713,33 @@ find_equal(struct cls_subtable *subtable, const > > struct miniflow *flow, > > return NULL; > > } > > > > +/* > > + * As the readers are operating concurrently with the modifications, a > > + * concurrent reader may or may not see the new rule, depending on how > > + * the concurrent events overlap with each other. This is no > > + * different from the former locked behavior, but there the visibility > > + * of the new rule only depended on the timing of the locking > > + * functions. > > + * > > + * The new rule is first added to the segment indices, so the readers > > + * may find the rule in the indices before the rule is visible in the > > + * subtables 'rules' map. This may result in us losing the > > + * opportunity to quit lookups earlier, result
Re: [ovs-dev] [PATCH v2] lib/classifier: Lockless lookups.
Yes, I figured it might be better pipeline the revision process to get things through faster :-) Jarno On Jul 7, 2014, at 9:31 AM, Ben Pfaff wrote: > It looks like you posted a v2 of just the beginning of the series also? > I'll review those, then we can look at the rest of the series. > > On Mon, Jun 30, 2014 at 04:40:06PM -0700, Jarno Rajahalme wrote: >> Just to be clear, this is a v2 on the patch 12/12 of the series, due to the >> batching changes Ethan just pushed. The 1-11 seem to still apply. >> >> Jarno >> >> On Jun 30, 2014, at 4:37 PM, Jarno Rajahalme wrote: >> >>> Now that all the relevant classifier structures use RCU and internal >>> mutual exclusion for modifications, we can remove the fat-rwlock and >>> thus make the classifier lookups lockless. >>> >>> As the readers are operating concurrently with the writers, a >>> concurrent reader may or may not see a new rule being added by a >>> writer, depending on how the concurrent events overlap with each >>> other. Overall, this is no different from the former locked behavior, >>> but there the visibility of the new rule only depended on the timing >>> of the locking functions. >>> >>> A new rule is first added to the segment indices, so the readers may >>> find the rule in the indices before the rule is visible in the >>> subtables 'rules' map. This may result in us losing the opportunity >>> to quit lookups earlier, resulting in sub-optimal wildcarding. This >>> will be fixed by forthcoming revalidation always scheduled after flow >>> table changes. >>> >>> Similar behavior may happen due to us removing the overlapping rule >>> (if any) from the indices only after the corresponding new rule has >>> been added. >>> >>> The subtable's max priority is updated only after a rule is inserted >>> to the maps, so the concurrent readers may not see the rule, as the >>> updated priority ordered subtable list will only be visible after the >>> subtable's max priority is updated. 
>>> >>> Similarly, the classifier's partitions are updated by the caller after >>> the rule is inserted to the maps, so the readers may keep skipping the >>> subtable until they see the updated partitions. >>> >>> Signed-off-by: Jarno Rajahalme >>> --- >>> v2: Rebase to master (classifier lookup batching). >>> >>> lib/classifier.c| 35 --- >>> lib/classifier.h| 45 - >>> lib/dpif-netdev.c | 30 -- >>> ofproto/ofproto-dpif.c |2 -- >>> ofproto/ofproto.c | 33 ++--- >>> ofproto/ofproto.h |3 +++ >>> tests/test-classifier.c | 36 +--- >>> utilities/ovs-ofctl.c |4 >>> 8 files changed, 58 insertions(+), 130 deletions(-) >>> >>> diff --git a/lib/classifier.c b/lib/classifier.c >>> index 8ba6867..fc05efe 100644 >>> --- a/lib/classifier.c >>> +++ b/lib/classifier.c >>> @@ -481,7 +481,6 @@ classifier_init(struct classifier *cls_, const uint8_t >>> *flow_segments) >>> { >>>struct cls_classifier *cls = xmalloc(sizeof *cls); >>> >>> -fat_rwlock_init(&cls_->rwlock); >>>ovs_mutex_init(&cls->mutex); >>> >>>ovs_mutex_lock(&cls->mutex); >>> @@ -506,7 +505,8 @@ classifier_init(struct classifier *cls_, const uint8_t >>> *flow_segments) >>> } >>> >>> /* Destroys 'cls'. Rules within 'cls', if any, are not freed; this is the >>> - * caller's responsibility. */ >>> + * caller's responsibility. >>> + * May only be called after all the readers have been terminated. */ >>> void >>> classifier_destroy(struct classifier *cls_) >>>OVS_EXCLUDED(cls_->cls->mutex) >>> @@ -517,7 +517,6 @@ classifier_destroy(struct classifier *cls_) >>>struct cls_subtable *subtable, *next_subtable; >>>int i; >>> >>> -fat_rwlock_destroy(&cls_->rwlock); >>>if (!cls) { >>>return; >>>} >>> @@ -682,7 +681,10 @@ classifier_is_empty(const struct classifier *cls) >>> /* Returns the number of rules in 'cls'. 
*/ >>> int >>> classifier_count(const struct classifier *cls) >>> +OVS_NO_THREAD_SAFETY_ANALYSIS >>> { >>> +/* n_rules is an int, so in the presence of concurrent writers this >>> will >>> + * return either the old or a new value. */ >>>return cls->cls->n_rules; >>> } >>> >>> @@ -1711,6 +1713,33 @@ find_equal(struct cls_subtable *subtable, const >>> struct miniflow *flow, >>>return NULL; >>> } >>> >>> +/* >>> + * As the readers are operating concurrently with the modifications, a >>> + * concurrent reader may or may not see the new rule, depending on how >>> + * the concurrent events overlap with each other. This is no >>> + * different from the former locked behavior, but there the visibility >>> + * of the new rule only depended on the timing of the locking >>> + * functions. >>> + * >>> + * The new rule is first added to the segment indices, so the readers >>> + * may find the rule in the
Re: [ovs-dev] [PATCH] util: fix compile warnings
Yes, Yamamoto-san's patch fixed the warning by using CONST_CAST. I think changing "program_name" type to "char*" could be a cleaner solution. However, it does not address the second issue for branch-2.3 (ISO C90 forbids mixed declarations and code). This same issue is not present in master though. On Mon, Jul 7, 2014 at 8:09 AM, Ben Pfaff wrote: > On Fri, Jul 04, 2014 at 12:33:03PM -0700, Ansis Atteka wrote: >> This patch fixes two compile warnings introduced by commit >> 64b73291 ("util: create a copy of program_name"): >> 1. ../lib/util.c:457:5: error: passing argument 1 of 'free' >>discards 'const' qualifier from pointer target type; And >> 2. ../lib/util.c:463:5: error: ISO C90 forbids mixed declarations >>and code [-Werror=declaration-after-statement] (affected only >>branch-2.3 that is C90 compliant and not the master) >> >> Reported-By: Joe Stringer >> Reported-By: Lorand Jakab >> Signed-Off-By: Ansis Atteka > > I don't see any warnings on current master. Did Yamamoto-san's commit > (util: Suppress a warning by adding CONST_CAST) fix them?
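[Editor's note: the second warning comes from C90's rule that all declarations in a block must precede the first statement. A minimal illustration follows; this is not the actual util.c code, just the shape of the error and its fix.]

```c
/* Compiled with gcc -std=c90 -Werror=declaration-after-statement, the
 * commented-out form below is rejected; hoisting the declaration to the
 * top of the block, as the v2 patch does for 'basename', fixes it. */
static int c90_style(const char *argv0)
{
    const char *slash;      /* declarations first... */

    (void) argv0;           /* ...statements afterwards */
    slash = argv0;
    return slash != 0;

    /* The rejected C90 form would have been:
     *
     *     (void) argv0;                    -- statement
     *     const char *slash = argv0;       -- declaration after statement
     */
}
```

This is why the fix moves `char *basename;` above the `#ifdef` blocks instead of declaring it mid-function.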
Re: [ovs-dev] [PATCH v2 2/5] lib/ovs-atomic: Add atomic compare_exchange.
On Fri, Jul 04, 2014 at 07:21:16AM -0700, Jarno Rajahalme wrote: > Add support for atomic compare_exchange operations. > > Signed-off-by: Jarno Rajahalme Acked-by: Ben Pfaff
Re: [ovs-dev] [PATCH v2 3/5] lib/ovs-atomic: Add ovs_refcount_unref_relaxed(), ovs_refcount_try_ref_rcu().
On Fri, Jul 04, 2014 at 07:21:17AM -0700, Jarno Rajahalme wrote: > When a reference counted object is also RCU protected the deletion of > the object's memory is always postponed. This allows > memory_order_relaxed to be used also for unreferencing, as RCU > quiescing provides a full memory barrier (it has to, or otherwise > there could be lingering accesses to objects after they are recycled). > > Also, when access to the reference counted object is protected via a > mutex or a lock, the locking primitives provide the required memory > barrier functionality. > > Also, add ovs_refcount_try_ref_rcu(), which takes a reference only if > the refcount is non-zero and returns true if a reference was taken, > false otherwise. This can be used in combined RCU/refcount scenarios > where we have an RCU protected reference to an refcounted object, but > which may be unref'ed at any time. If ovs_refcount_try_ref_rcu() > fails, the object may still be safely used until the current thread > quiesces. > > Signed-off-by: Jarno Rajahalme > +/* Increments 'refcount', but only if it is non-zero. > + * > + * This may only be called an object which is RCU protected during > + * this call. This implies that its possible desctruction is "destruction" > + * postponed until all current RCU threads quiesce. > + * > + * Returns false if the refcount was zero. In this case the object may > + * be safely accessed until the current thread quiesces, but no additional Acked-by: Ben Pfaff
Re: [ovs-dev] [PATCH v2 1/5] ovs-atomic: Use explicit memory order for ovs_refcount.
On Fri, Jul 04, 2014 at 07:21:15AM -0700, Jarno Rajahalme wrote: > Use explicit variants of atomic operations for the ovs_refcount to > avoid the overhead of the default memory_order_seq_cst. > > Adding a reference requires no memory ordering, as the calling thread > is already assumed to have protected access to the object being > reference counted. Hence, memory_order_relaxed is used for > ovs_refcount_ref(). ovs_refcount_read() does not change the reference > count, so it can also use memory_order_relaxed. > > Unreferencing an object needs a release barrier, so that none of the > accesses to the protected object are reordered after the atomic > decrement operation. Additionally, an explicit acquire barrier is > needed before the object is recycled, to keep the subsequent accesses > to the object's memory from being reordered before the atomic > decrement operation. > > This patch follows the memory ordering and argumentation discussed > here: > > http://www.chaoticmind.net/~hcb/projects/boost.atomic/doc/atomic/usage_examples.html > > Signed-off-by: Jarno Rajahalme Acked-by: Ben Pfaff
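[Editor's note: the ordering argued for in the commit message, relaxed for ref and read, release on the decrement, acquire before recycling, looks roughly like this in C11 atomics. This is a sketch of the pattern, not lib/ovs-atomic.h verbatim.]

```c
#include <stdatomic.h>

struct refcnt {
    atomic_uint count;
};

/* The calling thread is already assumed to have protected access to
 * the object, so taking another reference needs no ordering at all. */
static void refcnt_ref(struct refcnt *rc)
{
    atomic_fetch_add_explicit(&rc->count, 1, memory_order_relaxed);
}

/* Release ordering keeps this thread's accesses to the protected
 * object from being reordered after the decrement; the acquire fence
 * taken only on the last unref keeps the destructor's accesses from
 * being reordered before it.  Returns the pre-decrement count, so a
 * return value of 1 means "this was the last reference". */
static unsigned int refcnt_unref(struct refcnt *rc)
{
    unsigned int old = atomic_fetch_sub_explicit(&rc->count, 1,
                                                 memory_order_release);
    if (old == 1) {
        atomic_thread_fence(memory_order_acquire);
        /* Safe to free/recycle the object's memory here. */
    }
    return old;
}
```

Splitting the acquire out into a fence on the last-unref path, rather than making the decrement acq_rel, means the common (non-final) unref pays only for the release.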
Re: [ovs-dev] [PATCH v2 4/5] Use ovs_refcount_unref_relaxed.
On Fri, Jul 04, 2014 at 07:21:18AM -0700, Jarno Rajahalme wrote: > After a quick analysis, in most cases the access to refcounted objects > is clearly protected either with an explicit lock/mutex, or RCU. There > are only a few places where I left a call to ovs_refcount_unref(). > Upon closer analysis it may well be that those could also use the > relaxed form. > > Signed-off-by: Jarno Rajahalme I didn't look at these closely. Should I? Acked-by: Ben Pfaff
Re: [ovs-dev] [PATCH v2 5/5] ofproto-dpif: Use ovs_refcount_try_ref_rcu().
On Fri, Jul 04, 2014 at 07:21:19AM -0700, Jarno Rajahalme wrote: > This is a prerequisite step in making the classifier lookups lockless. > If taking a reference fails, we do the lookup again, as a new (lower > priority) rule may now match instead. > > Also remove unwildcarding dl_type and nw_frag, as these are already > taken care of by xlate_actions(). > > Signed-off-by: Jarno Rajahalme I'd be inclined to write the retry loop in rule_dpif_lookup_in_table() as an actual "do-while" loop. Acked-by: Ben Pfaff
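[Editor's note: the retry Ben suggests writing as an actual do-while: if the matching rule's refcount drops to zero between the lookup and the try-ref, look up again, since a different (possibly lower-priority) rule may now match. The toy classifier below is purely illustrative; only the loop shape corresponds to the discussion.]

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_rule {
    int priority;
    atomic_uint ref;        /* 0 means the rule is being destroyed. */
};

/* Relaxed try-ref, as in the earlier patch in this series. */
static bool toy_try_ref(struct toy_rule *rule)
{
    unsigned int old = atomic_load_explicit(&rule->ref, memory_order_relaxed);
    do {
        if (old == 0) {
            return false;
        }
    } while (!atomic_compare_exchange_weak_explicit(
                 &rule->ref, &old, old + 1,
                 memory_order_relaxed, memory_order_relaxed));
    return true;
}

/* Stand-in for the classifier lookup: highest-priority rule that is
 * still live (refcount non-zero). */
static struct toy_rule *toy_lookup(struct toy_rule *rules, size_t n)
{
    struct toy_rule *best = NULL;
    for (size_t i = 0; i < n; i++) {
        if (atomic_load_explicit(&rules[i].ref, memory_order_relaxed)
            && (!best || rules[i].priority > best->priority)) {
            best = &rules[i];
        }
    }
    return best;
}

/* The do-while shape: retry the lookup until we either find nothing or
 * succeed in taking a reference on what we found. */
static struct toy_rule *toy_lookup_and_ref(struct toy_rule *rules, size_t n)
{
    struct toy_rule *rule;
    do {
        rule = toy_lookup(rules, n);
    } while (rule && !toy_try_ref(rule));
    return rule;
}
```

The loop terminates because a rule whose refcount has reached zero stays dead: each retry sees a strictly smaller set of candidate rules.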
Re: [ovs-dev] [PATCH v4 2/3] userspace: add layer 3 flow and switching support
On 7/7/14, 7:23 PM, Ben Pfaff wrote:
> On Fri, Jun 27, 2014 at 04:21:54PM +0300, Lorand Jakab wrote:
>> This commit relaxes the assumption that all packets have an Ethernet
>> header, and adds support for layer 3 flows. For each packet received on
>> the Linux kernel datapath the l2 and l3 members of struct ofpbuf are
>> initialized appropriately, and some functions now expect this (notably
>> flow_extract()), in order to differentiate between layer 2 and layer 3
>> packets. struct flow now has a new 'base_layer' member, because we
>> cannot assume that a flow has no Ethernet header when eth_src and
>> eth_dst are 0. For layer 3 packets, the protocol type is still stored
>> in the eth_type member.
>>
>> Switching L2->L3 and L3->L2 are both implemented by adding the pop_eth
>> and push_eth actions respectively when a transition is detected. The
>> push_eth action puts 0s on both source and destination MACs. These
>> addresses can be modified with mod_dl_dst and mod_dl_src actions.
>>
>> Added new prerequisite MFP_ETHERNET for fields MFF_ETH_SRC, MFF_ETH_DST,
>> MFF_VLAN_TCI, MFF_DL_VLAN, MFF_VLAN_VID and MFF_DL_VLAN_PCP.
>>
>> Signed-off-by: Lorand Jakab
>
> "sparse" warnings:
>
> ../lib/meta-flow.c:1148:26: warning: incorrect type in assignment (different base types)
> ../lib/meta-flow.c:1148:26:    expected unsigned int [unsigned] [usertype] base_layer
> ../lib/meta-flow.c:1148:26:    got restricted __be32 [usertype]
> ../lib/odp-util.c:3484:26: warning: incorrect type in assignment (different base types)
> ../lib/odp-util.c:3484:26:    expected unsigned int [unsigned] [usertype] base_layer
> ../lib/odp-util.c:3484:26:    got restricted __be32

Fixed in my personal repo, fixes will be in v5.

> Otherwise this looks good to me.

Thanks for reviewing.
Re: [ovs-dev] [PATCH] util: fix compile warnings
On Sun, Jul 6, 2014 at 3:37 PM, Joe Stringer wrote: > I feel like this is easier to follow if there's only one #ifdef block. Thanks for pointing this out. This can actually be factorized out even more by taking out free(program_name) out of both #ifdefs. Will send v2 patch soon. > > We could combine the blocks and make the two platforms more aligned with > something like this (based against master): > > diff --git a/lib/util.c b/lib/util.c > index 4f9b079..c65051d 100644 > --- a/lib/util.c > +++ b/lib/util.c > @@ -43,7 +43,7 @@ VLOG_DEFINE_THIS_MODULE(util); > COVERAGE_DEFINE(util_xalloc); > > /* argv[0] without directory names. */ > -const char *program_name; > +char *program_name; > > /* Name for the currently running thread or process, for log messages, > process > * listings, and debuggers. */ > @@ -455,25 +455,26 @@ void > set_program_name__(const char *argv0, const char *version, const char > *date, > const char *time) > { > -free(program_name); > +char *basename; > > #ifdef _WIN32 > -char *basename; > size_t max_len = strlen(argv0) + 1; > > SetErrorMode(GetErrorMode() | SEM_NOGPFAULTERRORBOX); > _set_output_format(_TWO_DIGIT_EXPONENT); > > +free(program_name); > basename = xmalloc(max_len); > _splitpath_s(argv0, NULL, 0, NULL, 0, basename, max_len, NULL, 0); > -assert_single_threaded(); > -program_name = basename; > #else > const char *slash = strrchr(argv0, '/'); > -assert_single_threaded(); > -program_name = xstrdup(slash ? slash + 1 : argv0); > + > +free(program_name); > +basename = xstrdup(slash ? 
slash + 1 : argv0); #endif > > +assert_single_threaded(); > +program_name = basename; > free(program_version); > > if (!strcmp(version, VERSION)) { > diff --git a/lib/util.h b/lib/util.h > index b626c0b..dc34ee5 100644 > --- a/lib/util.h > +++ b/lib/util.h > @@ -86,7 +86,7 @@ void ovs_assert_failure(const char *, const char *, const > char *) NO_RETURN; > ((void) sizeof ((int) ((POINTER) == (TYPE) (POINTER))), \ > (TYPE) (POINTER)) > > -extern const char *program_name; > +extern char *program_name; > > #define __ARRAY_SIZE_NOCHECK(ARRAY) (sizeof(ARRAY) / sizeof((ARRAY)[0])) > #ifdef __GNUC__
[ovs-dev] [PATCHv2] util: fix compile warnings
This patch fixes two compile warnings introduced by commit 64b73291 ("util: create a copy of program_name"): 1. ../lib/util.c:457:5: error: passing argument 1 of 'free' discards 'const' qualifier from pointer target type; And 2. ../lib/util.c:463:5: error: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement] (affected only branch-2.3 that is C90 compliant and not the master) Reported-By: Joe Stringer Reported-By: Lorand Jakab Signed-Off-By: Ansis Atteka --- lib/util.c | 14 ++ lib/util.h |2 +- 2 files changed, 7 insertions(+), 9 deletions(-) diff --git a/lib/util.c b/lib/util.c index 1a30757..26ec174 100644 --- a/lib/util.c +++ b/lib/util.c @@ -42,7 +42,7 @@ VLOG_DEFINE_THIS_MODULE(util); COVERAGE_DEFINE(util_xalloc); /* argv[0] without directory names. */ -const char *program_name; +char *program_name; /* Name for the currently running thread or process, for log messages, process * listings, and debuggers. */ @@ -454,24 +454,22 @@ void set_program_name__(const char *argv0, const char *version, const char *date, const char *time) { -free(CONST_CAST(char *, program_name)); - -#ifdef _WIN32 char *basename; +#ifdef _WIN32 size_t max_len = strlen(argv0) + 1; SetErrorMode(GetErrorMode() | SEM_NOGPFAULTERRORBOX); basename = xmalloc(max_len); _splitpath_s(argv0, NULL, 0, NULL, 0, basename, max_len, NULL, 0); -assert_single_threaded(); -program_name = basename; #else const char *slash = strrchr(argv0, '/'); -assert_single_threaded(); -program_name = xstrdup(slash ? slash + 1 : argv0); +basename = xstrdup(slash ? 
slash + 1 : argv0); #endif +assert_single_threaded(); +free(program_name); +program_name = basename; free(program_version); if (!strcmp(version, VERSION)) { diff --git a/lib/util.h b/lib/util.h index 4d0ba76..4282007 100644 --- a/lib/util.h +++ b/lib/util.h @@ -86,7 +86,7 @@ void ovs_assert_failure(const char *, const char *, const char *) NO_RETURN; ((void) sizeof ((int) ((POINTER) == (TYPE) (POINTER))), \ (TYPE) (POINTER)) -extern const char *program_name; +extern char *program_name; #define __ARRAY_SIZE_NOCHECK(ARRAY) (sizeof(ARRAY) / sizeof((ARRAY)[0])) #ifdef __GNUC__ -- 1.7.9.5 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH v2] lib/hash: Use CRC32 for hashing.
Use CRC32 intrinsics for hash computations when building for X86_64 with SSE4_2. Add a new hash_words64() and change hash_words() to be inlined when 'n_words' is a compile-time constant. Signed-off-by: Jarno Rajahalme --- v2: Changed hash_words to be inlined only when 'n_words' is known to be a compile time constant. lib/hash.c| 27 --- lib/hash.h| 208 + tests/test-hash.c | 11 ++- 3 files changed, 216 insertions(+), 30 deletions(-) diff --git a/lib/hash.c b/lib/hash.c index 349f54a..71cd74c 100644 --- a/lib/hash.c +++ b/lib/hash.c @@ -50,21 +50,6 @@ hash_bytes(const void *p_, size_t n, uint32_t basis) return hash_finish(hash, orig_n); } -/* Returns the hash of the 'n' 32-bit words at 'p', starting from 'basis'. - * 'p' must be properly aligned. */ -uint32_t -hash_words(const uint32_t p[], size_t n_words, uint32_t basis) -{ -uint32_t hash; -size_t i; - -hash = basis; -for (i = 0; i < n_words; i++) { -hash = hash_add(hash, p[i]); -} -return hash_finish(hash, n_words * 4); -} - uint32_t hash_double(double x, uint32_t basis) { @@ -74,3 +59,15 @@ hash_double(double x, uint32_t basis) memcpy(value, &x, sizeof value); return hash_3words(value[0], value[1], basis); } + +uint32_t +hash_words__(const uint32_t p[], size_t n_words, uint32_t basis) +{ +return hash_words_inline(p, n_words, basis); +} + +uint32_t +hash_words64__(const uint64_t p[], size_t n_words, uint64_t basis) +{ +return hash_words64_inline(p, n_words, basis); +} diff --git a/lib/hash.h b/lib/hash.h index 1e19f45..f8bbada 100644 --- a/lib/hash.h +++ b/lib/hash.h @@ -32,7 +32,6 @@ hash_rot(uint32_t x, int k) return (x << k) | (x >> (32 - k)); } -uint32_t hash_words(const uint32_t data[], size_t n_words, uint32_t basis); uint32_t hash_bytes(const void *, size_t n_bytes, uint32_t basis); static inline uint32_t hash_int(uint32_t x, uint32_t basis); @@ -83,6 +82,9 @@ static inline uint32_t mhash_finish(uint32_t hash, uint32_t n_bytes) return hash; } +#if !(defined(__SSE4_2__) && defined(__x86_64__)) +/* Mhash-based 
implemantation. */ + static inline uint32_t hash_add(uint32_t hash, uint32_t data) { return mhash_add(hash, data); @@ -93,23 +95,29 @@ static inline uint32_t hash_finish(uint32_t hash, uint32_t final) return mhash_finish(hash, final); } -static inline uint32_t hash_string(const char *s, uint32_t basis) +/* Returns the hash of the 'n' 32-bit words at 'p', starting from 'basis'. + * 'p' must be properly aligned. + * + * This is inlined for the compiler to have access to the 'n_words', which + * in many cases is a constant. */ +static inline uint32_t +hash_words_inline(const uint32_t p[], size_t n_words, uint32_t basis) { -return hash_bytes(s, strlen(s), basis); -} +uint32_t hash; +size_t i; -static inline uint32_t hash_int(uint32_t x, uint32_t basis) -{ -return hash_2words(x, basis); +hash = basis; +for (i = 0; i < n_words; i++) { +hash = hash_add(hash, p[i]); +} +return hash_finish(hash, n_words * 4); } -/* An attempt at a useful 1-bit hash function. Has not been analyzed for - * quality. */ -static inline uint32_t hash_boolean(bool x, uint32_t basis) +static inline uint32_t +hash_words64_inline(const uint64_t p[], size_t n_words, uint64_t basis) { -const uint32_t P0 = 0xc2b73583; /* This is hash_int(1, 0). */ -const uint32_t P1 = 0xe90f1258; /* This is hash_int(2, 0). */ -return (x ? P0 : P1) ^ hash_rot(basis, 1); +return hash_words_inline((uint32_t *)p, n_words * 2, + (uint32_t)basis ^ basis >> 32); } static inline uint32_t hash_pointer(const void *p, uint32_t basis) @@ -140,6 +148,180 @@ static inline uint32_t hash_uint64_basis(const uint64_t x, { return hash_3words((uint32_t)(x >> 32), (uint32_t)x, basis); } + +#else /* __SSE4_2__ && __x86_64__ */ +#include + +static inline uint32_t hash_add(uint32_t hash, uint32_t data) +{ +return _mm_crc32_u32(hash, data); +} + +static inline uint32_t hash_finish(uint64_t hash, uint64_t final) +{ +/* The finishing multiplier 0x805204f3 has been experimentally + * derived to pass the testsuite hash tests. 
*/ +hash = _mm_crc32_u64(hash, final) * 0x805204f3; +return hash ^ (uint32_t)hash >> 16; /* Increase entropy in LSBs. */ +} + +/* Returns the hash of the 'n' 32-bit words at 'p_', starting from 'basis'. + * We access 'p_' as a uint64_t pointer, which is fine for __SSE_4_2__. + * + * This is inlined for the compiler to have access to the 'n_words', which + * in many cases is a constant. */ +static inline uint32_t +hash_words_inline(const uint32_t p_[], size_t n_words, uint32_t basis) +{ +const uint64_t *p = (const void *)p_; +uint64_t hash1 = basis; +uint64_t hash2 = 0; +uint64_t hash3 = n_words; +const uint32_t *endp = (const uint32_t *)p + n_words; +const uint64_t *limit
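The __builtin_constant_p() dispatch that motivated this v2 can be shown in miniature. Everything below is an illustrative toy (the names and the hash function are stand-ins, not the OVS CRC32 code): a macro picks the inline path only when the compiler can prove 'n_words' is a compile-time constant, and otherwise calls an out-of-line copy, which is also the sensible fallback on compilers without __builtin_constant_p(), as discussed earlier in the thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy accumulate step, standing in for mhash_add()/CRC32. */
static inline uint32_t toy_hash_add(uint32_t hash, uint32_t data)
{
    return (hash ^ data) * 0x5bd1e995u;
}

/* Inline version: when 'n_words' is a constant, the compiler can fully
 * unroll this loop at each call site. */
static inline uint32_t
toy_hash_words_inline(const uint32_t p[], size_t n_words, uint32_t basis)
{
    uint32_t hash = basis;

    for (size_t i = 0; i < n_words; i++) {
        hash = toy_hash_add(hash, p[i]);
    }
    return hash ^ (uint32_t)(n_words * 4);
}

/* Out-of-line version keeps code size down when 'n_words' is dynamic. */
uint32_t
toy_hash_words__(const uint32_t p[], size_t n_words, uint32_t basis)
{
    return toy_hash_words_inline(p, n_words, basis);
}

#ifdef __GNUC__
/* Inline only when the count is provably constant; this is not a
 * correctness question, just a code-size/speed trade-off. */
#define toy_hash_words(P, N_WORDS, BASIS)            \
    (__builtin_constant_p(N_WORDS)                   \
     ? toy_hash_words_inline(P, N_WORDS, BASIS)      \
     : toy_hash_words__(P, N_WORDS, BASIS))
#else
#define toy_hash_words(P, N_WORDS, BASIS) toy_hash_words__(P, N_WORDS, BASIS)
#endif
```

With GCC or Clang, a call like toy_hash_words(p, 5, basis) compiles to the unrolled inline loop, while calls with a runtime count all share the single out-of-line copy.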
Re: [ovs-dev] [PATCH v2 5/5] ofproto-dpif: Use ovs_refcount_try_ref_rcu().
Thanks for the reviews, series pushed with suggested changes (up to this patch), Jarno On Jul 7, 2014, at 9:47 AM, Ben Pfaff wrote: > On Fri, Jul 04, 2014 at 07:21:19AM -0700, Jarno Rajahalme wrote: >> This is a prerequisite step in making the classifier lookups lockless. >> If taking a reference fails, we do the lookup again, as a new (lower >> priority) rule may now match instead. >> >> Also remove unwildcarding dl_type and nw_frag, as these are already >> taken care of by xlate_actions(). >> >> Signed-off-by: Jarno Rajahalme > > I'd be inclined to write the retry loop in rule_dpif_lookup_in_table() > as an actual "do-while" loop. Yeah, definitely cleaner this way, thanks for the hint: @@ -3383,20 +3383,15 @@ rule_dpif_lookup_in_table(struct ofproto_dpif *ofproto, uint8_t table_id, } } -retry: -fat_rwlock_rdlock(&cls->rwlock); -cls_rule = classifier_lookup(cls, flow, wc); -fat_rwlock_unlock(&cls->rwlock); - -rule = rule_dpif_cast(rule_from_cls_rule(cls_rule)); -if (rule && take_ref) { -if (!rule_dpif_try_ref(rule)) { -/* The rule was released before we got the ref, so it - * cannot be in the classifier any more. Do another - * lookup to find another rule, if any. */ -goto retry; -} -} +do { +fat_rwlock_rdlock(&cls->rwlock); +cls_rule = classifier_lookup(cls, flow, wc); +fat_rwlock_unlock(&cls->rwlock); + +rule = rule_dpif_cast(rule_from_cls_rule(cls_rule)); + +/* Try again if the rule was released before we get the reference. */ +} while (rule && take_ref && !rule_dpif_try_ref(rule)); return rule; } > > Acked-by: Ben Pfaff
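Stripped of the classifier details, the do-while pattern is: look up, try to take a reference, and redo the lookup if the object was released in between. A single-threaded toy model of that shape (hypothetical names; the real code also takes cls->rwlock around each lookup, and the race only exists with concurrent writers):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_rule {
    int refcount;               /* 0 means the rule has been released. */
};

/* Mimics rule_dpif_try_ref(): taking a reference fails once the last
 * reference is gone. */
bool toy_try_ref(struct toy_rule *rule)
{
    if (rule->refcount == 0) {
        return false;
    }
    rule->refcount++;
    return true;
}

/* Walk a NULL-terminated list of candidate matches, best first, and return
 * the first one we can still take a reference on.  A failed try-ref just
 * means "look up again", exactly as in the do-while loop above. */
struct toy_rule *
toy_lookup_ref(struct toy_rule **candidates)
{
    struct toy_rule *rule;
    size_t i = 0;

    do {
        rule = candidates[i++];   /* Stand-in for classifier_lookup(). */
    } while (rule && !toy_try_ref(rule));

    return rule;
}
```

The loop terminates because each retry re-runs the lookup against the current state: a released rule cannot be returned again, so eventually either a referencable rule or no rule is found.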
Re: [ovs-dev] [PATCH v2 4/5] Use ovs_refcount_unref_relaxed.
On Jul 7, 2014, at 9:47 AM, Ben Pfaff wrote: > On Fri, Jul 04, 2014 at 07:21:18AM -0700, Jarno Rajahalme wrote: >> After a quick analysis, in most cases the access to refcounted objects >> is clearly protected either with an explicit lock/mutex, or RCU. There >> are only a few places where I left a call to ovs_refcount_unref(). >> Upon closer analysis it may well be that those could also use the >> relaxed form. >> >> Signed-off-by: Jarno Rajahalme > > I didn't look at these closely. Should I? Actually, it would be kind of nice if we could eliminate the “normal” form of ovs_refcount_unref() and rename the ovs_refcount_unref_relaxed() as ovs_refcount_unref(), and also document that the access to the object being refcounted needs to be protected by a lock or a mutex, or its destruction needs to be RCU-postponed. Having two variants with different rules may be a bit complicated. To this end, it would be good if another pair of eyes checked/confirmed that a) the _relaxed() users actually conform to the above, and b) if/how the remaining users of ovs_refcount_unref() could be changed to conform to the new semantics. Nonetheless, I just pushed this together with the rest of the new series you just reviewed. Jarno > > Acked-by: Ben Pfaff
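The single-variant semantics proposed here can be modeled with C11 atomics. This is only a sketch of the argument, not OVS's ovs-atomic code: a relaxed decrement is enough when every destroy path is already serialized by a mutex or an RCU grace period.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct toy_refobj {
    atomic_uint ref;
};

/* Relaxed ordering suffices because, under the documented rule, the
 * caller only destroys the object while holding the protecting mutex or
 * after an RCU grace period, either of which supplies the happens-before
 * edge that a release/acquire pair would otherwise have to provide.
 * Returns true when the caller must destroy the object. */
bool toy_unref_relaxed(struct toy_refobj *obj)
{
    return atomic_fetch_sub_explicit(&obj->ref, 1,
                                     memory_order_relaxed) == 1;
}
```

Without that external serialization, the relaxed form would be unsafe: the thread that sees the count hit zero could free the object while another thread's plain accesses to it are still in flight.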
Re: [ovs-dev] [PATCH 05/12] lib/classifier: Remove unused typedef cls_cb_func.
Pushed, thanks for the review! Jarno On Jul 7, 2014, at 9:26 AM, Ben Pfaff wrote: > On Mon, Jun 30, 2014 at 08:17:21AM -0700, Jarno Rajahalme wrote: >> Signed-off-by: Jarno Rajahalme > > Acked-by: Ben Pfaff
Re: [ovs-dev] [PATCH v2 5/5] ofproto-dpif: Use ovs_refcount_try_ref_rcu().
This small change made the last patch of the original series not apply any more, I’ll send a rebased v3 series out in a moment to keep things simple. No other changes, so if reviews are underway, they should apply as-is. Jarno On Jul 7, 2014, at 1:40 PM, Jarno Rajahalme wrote: > Thanks for the reviews, series pushed with suggested changes (upto this > patch), > > Jarno > > On Jul 7, 2014, at 9:47 AM, Ben Pfaff wrote: > >> On Fri, Jul 04, 2014 at 07:21:19AM -0700, Jarno Rajahalme wrote: >>> This is a prerequisite step in making the classifier lookups lockless. >>> If taking a reference fails, we do the lookup again, as a new (lower >>> priority) rule may now match instead. >>> >>> Also remove unwildcarding dl_type and nw_frag, as these are already >>> taken care of by xlate_actions(). >>> >>> Signed-off-by: Jarno Rajahalme >> >> I'd be inclined to write the retry loop in rule_dpif_lookup_in_table() >> as an actual "do-while" loop. > > Yeah, definitely cleaner this way, thanks for the hint: > > @@ -3383,20 +3383,15 @@ rule_dpif_lookup_in_table(struct ofproto_dpif > *ofproto, uint8_t table_id, > } > } > > -retry: > -fat_rwlock_rdlock(&cls->rwlock); > -cls_rule = classifier_lookup(cls, flow, wc); > -fat_rwlock_unlock(&cls->rwlock); > - > -rule = rule_dpif_cast(rule_from_cls_rule(cls_rule)); > -if (rule && take_ref) { > -if (!rule_dpif_try_ref(rule)) { > -/* The rule was released before we got the ref, so it > - * cannot be in the classifier any more. Do another > - * lookup to find another rule, if any. */ > -goto retry; > -} > -} > +do { > +fat_rwlock_rdlock(&cls->rwlock); > +cls_rule = classifier_lookup(cls, flow, wc); > +fat_rwlock_unlock(&cls->rwlock); > + > +rule = rule_dpif_cast(rule_from_cls_rule(cls_rule)); > + > +/* Try again if the rule was released before we get the reference. 
*/ > +} while (rule && take_ref && !rule_dpif_try_ref(rule)); > > return rule; > } > > >> >> Acked-by: Ben Pfaff > ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH v2 5/5] ofproto-dpif: Use ovs_refcount_try_ref_rcu().
Great, thanks. On Mon, Jul 07, 2014 at 02:16:29PM -0700, Jarno Rajahalme wrote: > This small change made the last patch of the original series not > apply any more, I?ll send a rebased v3 series out in a moment to > keep things simple. No other changes, so if reviews are underway, > they should apply as-is. > > Jarno > > On Jul 7, 2014, at 1:40 PM, Jarno Rajahalme wrote: > > > Thanks for the reviews, series pushed with suggested changes (upto this > > patch), > > > > Jarno > > > > On Jul 7, 2014, at 9:47 AM, Ben Pfaff wrote: > > > >> On Fri, Jul 04, 2014 at 07:21:19AM -0700, Jarno Rajahalme wrote: > >>> This is a prerequisite step in making the classifier lookups lockless. > >>> If taking a reference fails, we do the lookup again, as a new (lower > >>> priority) rule may now match instead. > >>> > >>> Also remove unwildcarding dl_type and nw_frag, as these are already > >>> taken care of by xlate_actions(). > >>> > >>> Signed-off-by: Jarno Rajahalme > >> > >> I'd be inclined to write the retry loop in rule_dpif_lookup_in_table() > >> as an actual "do-while" loop. > > > > Yeah, definitely cleaner this way, thanks for the hint: > > > > @@ -3383,20 +3383,15 @@ rule_dpif_lookup_in_table(struct ofproto_dpif > > *ofproto, uint8_t table_id, > > } > > } > > > > -retry: > > -fat_rwlock_rdlock(&cls->rwlock); > > -cls_rule = classifier_lookup(cls, flow, wc); > > -fat_rwlock_unlock(&cls->rwlock); > > - > > -rule = rule_dpif_cast(rule_from_cls_rule(cls_rule)); > > -if (rule && take_ref) { > > -if (!rule_dpif_try_ref(rule)) { > > -/* The rule was released before we got the ref, so it > > - * cannot be in the classifier any more. Do another > > - * lookup to find another rule, if any. 
*/ > > -goto retry; > > -} > > -} > > +do { > > +fat_rwlock_rdlock(&cls->rwlock); > > +cls_rule = classifier_lookup(cls, flow, wc); > > +fat_rwlock_unlock(&cls->rwlock); > > + > > +rule = rule_dpif_cast(rule_from_cls_rule(cls_rule)); > > + > > +/* Try again if the rule was released before we get the reference. > > */ > > +} while (rule && take_ref && !rule_dpif_try_ref(rule)); > > > > return rule; > > } > > > > > >> > >> Acked-by: Ben Pfaff > > > ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH v3 1/7] lib/classifier: Use cmap.
Use cmap instead of hmap & hindex in classifier. Performance impact with current locking strategy is not yet tested. Later patches will introduce RCU into the classifier. Signed-off-by: Jarno Rajahalme Acked-by: Ben Pfaff --- v3: No change, but will push this patch only with the rest of the series. lib/classifier.c| 194 +-- lib/classifier.h|3 + tests/test-classifier.c | 10 +-- 3 files changed, 112 insertions(+), 95 deletions(-) diff --git a/lib/classifier.c b/lib/classifier.c index a50e897..69b5abd 100644 --- a/lib/classifier.c +++ b/lib/classifier.c @@ -22,8 +22,7 @@ #include "dynamic-string.h" #include "flow.h" #include "hash.h" -#include "hindex.h" -#include "hmap.h" +#include "cmap.h" #include "list.h" #include "odp-util.h" #include "ofp-util.h" @@ -58,25 +57,25 @@ struct cls_classifier { uint8_t n_flow_segments; uint8_t flow_segments[CLS_MAX_INDICES]; /* Flow segment boundaries to use * for staged lookup. */ -struct hmap subtables_map; /* Contains "struct cls_subtable"s. */ +struct cmap subtables_map; /* Contains "struct cls_subtable"s. */ struct pvector subtables; -struct hmap partitions; /* Contains "struct cls_partition"s. */ +struct cmap partitions; /* Contains "struct cls_partition"s. */ struct cls_trie tries[CLS_MAX_TRIES]; /* Prefix tries. */ unsigned int n_tries; }; /* A set of rules that all have the same fields wildcarded. */ struct cls_subtable { -struct hmap_node hmap_node; /* Within struct cls_classifier 'subtables_map' - * hmap. */ -struct hmap rules; /* Contains "struct cls_rule"s. */ +struct cmap_node cmap_node; /* Within struct cls_classifier + * 'subtables_map'. */ +struct cmap rules; /* Contains "struct cls_rule"s. */ int n_rules;/* Number of rules, including duplicates. */ unsigned int max_priority; /* Max priority of any rule in the subtable. */ unsigned int max_count; /* Count of max_priority rules. */ tag_type tag; /* Tag generated from mask for partitioning. */ uint8_t n_indices; /* How many indices to use. 
*/ uint8_t index_ofs[CLS_MAX_INDICES]; /* u32 flow segment boundaries. */ -struct hindex indices[CLS_MAX_INDICES]; /* Staged lookup indices. */ +struct cmap indices[CLS_MAX_INDICES]; /* Staged lookup indices. */ unsigned int trie_plen[CLS_MAX_TRIES]; /* Trie prefix length in 'mask'. */ int ports_mask_len; struct trie_node *ports_trie; /* NULL if none. */ @@ -88,8 +87,8 @@ struct cls_subtable { * field) with tags for the "cls_subtable"s that contain rules that match that * metadata value. */ struct cls_partition { -struct hmap_node hmap_node; /* In struct cls_classifier's 'partitions' - * hmap. */ +struct cmap_node cmap_node; /* In struct cls_classifier's 'partitions' + * map. */ ovs_be64 metadata; /* metadata value for this partition. */ tag_type tags; /* OR of each flow's cls_subtable tag. */ struct tag_tracker tracker; /* Tracks the bits in 'tags'. */ @@ -98,9 +97,9 @@ struct cls_partition { /* Internal representation of a rule in a "struct cls_subtable". */ struct cls_match { struct cls_rule *cls_rule; -struct hindex_node index_nodes[CLS_MAX_INDICES]; /* Within subtable's - * 'indices'. */ -struct hmap_node hmap_node; /* Within struct cls_subtable 'rules'. */ +struct cmap_node index_nodes[CLS_MAX_INDICES]; /* Within subtable's +* 'indices'. */ +struct cmap_node cmap_node; /* Within struct cls_subtable 'rules'. */ unsigned int priority; /* Larger numbers are higher priorities. */ struct cls_partition *partition; struct list list; /* List of identical, lower-priority rules. 
*/ @@ -461,9 +460,9 @@ classifier_init(struct classifier *cls_, const uint8_t *flow_segments) cls_->cls = cls; cls->n_rules = 0; -hmap_init(&cls->subtables_map); +cmap_init(&cls->subtables_map); pvector_init(&cls->subtables); -hmap_init(&cls->partitions); +cmap_init(&cls->partitions); cls->n_flow_segments = 0; if (flow_segments) { while (cls->n_flow_segments < CLS_MAX_INDICES @@ -494,18 +493,17 @@ classifier_destroy(struct classifier *cls_) trie_destroy(cls->tries[i].root); } -HMAP_FOR_EACH_SAFE (subtable, next_subtable, hmap_node, +CMAP_FOR_EACH_SAFE (subtable, next_subtable, cmap_node, &cls->subtables_map) { destroy_subtable(cls, subtable); } -hmap_destroy(&cls->subtables_map); +cmap_
[ovs-dev] [PATCH v3 4/7] lib/classifier: Stylistic change.
Rename 'nbits' as 'n_bits' to be more consistent with other count-like fields. Signed-off-by: Jarno Rajahalme --- v3: No change. lib/classifier.c | 60 +++--- 1 file changed, 30 insertions(+), 30 deletions(-) diff --git a/lib/classifier.c b/lib/classifier.c index 5380c80..21605c8 100644 --- a/lib/classifier.c +++ b/lib/classifier.c @@ -169,9 +169,9 @@ static void trie_remove(struct cls_trie *, const struct cls_rule *, int mlen); static void trie_remove_prefix(struct trie_node **, const ovs_be32 *prefix, int mlen); static void mask_set_prefix_bits(struct flow_wildcards *, uint8_t be32ofs, - unsigned int nbits); + unsigned int n_bits); static bool mask_prefix_bits_set(const struct flow_wildcards *, - uint8_t be32ofs, unsigned int nbits); + uint8_t be32ofs, unsigned int n_bits); /* flow/miniflow/minimask/minimatch utilities. * These are only used by the classifier, so place them here to allow @@ -1701,7 +1701,7 @@ next_rule_in_list(struct cls_match *rule) /* A longest-prefix match tree. */ struct trie_node { uint32_t prefix; /* Prefix bits for this node, MSB first. */ -uint8_t nbits;/* Never zero, except for the root node. */ +uint8_t n_bits; /* Never zero, except for the root node. */ unsigned int n_rules; /* Number of rules that have this prefix. */ struct trie_node *edges[2]; /* Both NULL if leaf. */ }; @@ -1748,15 +1748,15 @@ trie_get_prefix(const ovs_be32 pr[], unsigned int ofs, unsigned int plen) return raw_get_prefix(pr, ofs, plen) & ~0u << (32 - plen); } -/* Return the number of equal bits in 'nbits' of 'prefix's MSBs and a 'value' +/* Return the number of equal bits in 'n_bits' of 'prefix's MSBs and a 'value' * starting at "MSB 0"-based offset 'ofs'. 
*/ static unsigned int -prefix_equal_bits(uint32_t prefix, unsigned int nbits, const ovs_be32 value[], +prefix_equal_bits(uint32_t prefix, unsigned int n_bits, const ovs_be32 value[], unsigned int ofs) { -uint64_t diff = prefix ^ raw_get_prefix(value, ofs, nbits); +uint64_t diff = prefix ^ raw_get_prefix(value, ofs, n_bits); /* Set the bit after the relevant bits to limit the result. */ -return raw_clz64(diff << 32 | UINT64_C(1) << (63 - nbits)); +return raw_clz64(diff << 32 | UINT64_C(1) << (63 - n_bits)); } /* Return the number of equal bits in 'node' prefix and a 'prefix' of length @@ -1765,7 +1765,7 @@ static unsigned int trie_prefix_equal_bits(const struct trie_node *node, const ovs_be32 prefix[], unsigned int ofs, unsigned int plen) { -return prefix_equal_bits(node->prefix, MIN(node->nbits, plen - ofs), +return prefix_equal_bits(node->prefix, MIN(node->n_bits, plen - ofs), prefix, ofs); } @@ -1795,7 +1795,7 @@ trie_branch_create(const ovs_be32 *prefix, unsigned int ofs, unsigned int plen, node->prefix = trie_get_prefix(prefix, ofs, plen); if (plen <= TRIE_PREFIX_BITS) { -node->nbits = plen; +node->n_bits = plen; node->edges[0] = NULL; node->edges[1] = NULL; node->n_rules = n_rules; @@ -1805,7 +1805,7 @@ trie_branch_create(const ovs_be32 *prefix, unsigned int ofs, unsigned int plen, plen - TRIE_PREFIX_BITS, n_rules); int bit = get_bit_at(subnode->prefix, 0); -node->nbits = TRIE_PREFIX_BITS; +node->n_bits = TRIE_PREFIX_BITS; node->edges[bit] = subnode; node->edges[!bit] = NULL; node->n_rules = 0; @@ -1837,35 +1837,35 @@ trie_is_leaf(const struct trie_node *trie) static void mask_set_prefix_bits(struct flow_wildcards *wc, uint8_t be32ofs, - unsigned int nbits) + unsigned int n_bits) { ovs_be32 *mask = &((ovs_be32 *)&wc->masks)[be32ofs]; unsigned int i; -for (i = 0; i < nbits / 32; i++) { +for (i = 0; i < n_bits / 32; i++) { mask[i] = OVS_BE32_MAX; } -if (nbits % 32) { -mask[i] |= htonl(~0u << (32 - nbits % 32)); +if (n_bits % 32) { +mask[i] |= htonl(~0u << (32 - 
n_bits % 32)); } } static bool mask_prefix_bits_set(const struct flow_wildcards *wc, uint8_t be32ofs, - unsigned int nbits) + unsigned int n_bits) { ovs_be32 *mask = &((ovs_be32 *)&wc->masks)[be32ofs]; unsigned int i; ovs_be32 zeroes = 0; -for (i = 0; i < nbits / 32; i++) { +for (i = 0; i < n_bits / 32; i++) { zeroes |= ~mask[i]; } -if (nbits % 32) { -zeroes |= ~mask[i] & htonl(~0u << (32 - nbits % 32)); +if (n_bits % 32) { +zeroes |= ~mask[i] & htonl(~0u <<
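As an aside for readers of the (archive-truncated) hunk above, the partial-word branch guarded by `n_bits % 32` exists because the shift is only defined for a nonzero count remainder. A standalone version of that expression (hypothetical helper name):

```c
#include <assert.h>
#include <stdint.h>

/* Top-'n_bits' prefix mask of a 32-bit word, n_bits in [0, 32].  The
 * n_bits == 0 case must be special-cased: '~0u << 32' is undefined
 * behavior in C, which is why mask_set_prefix_bits() only performs the
 * shift when the remainder 'n_bits % 32' is nonzero.  This assumes a
 * 32-bit 'unsigned int', as OVS does. */
uint32_t toy_prefix_mask32(unsigned int n_bits)
{
    return n_bits ? ~0u << (32 - n_bits) : 0;
}
```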
[ovs-dev] [PATCH v3 3/7] lib/ovs-rcu: Export ovsrcu_synchronize().
The following patch will add the first external user. Signed-off-by: Jarno Rajahalme --- v3: No change. lib/ovs-rcu.c |3 +-- lib/ovs-rcu.h |4 2 files changed, 5 insertions(+), 2 deletions(-) diff --git a/lib/ovs-rcu.c b/lib/ovs-rcu.c index 050a2ef..a052d6c 100644 --- a/lib/ovs-rcu.c +++ b/lib/ovs-rcu.c @@ -60,7 +60,6 @@ static void ovsrcu_flush_cbset(struct ovsrcu_perthread *); static void ovsrcu_unregister__(struct ovsrcu_perthread *); static bool ovsrcu_call_postponed(void); static void *ovsrcu_postpone_thread(void *arg OVS_UNUSED); -static void ovsrcu_synchronize(void); static struct ovsrcu_perthread * ovsrcu_perthread_get(void) @@ -153,7 +152,7 @@ ovsrcu_is_quiescent(void) return pthread_getspecific(perthread_key) == NULL; } -static void +void ovsrcu_synchronize(void) { unsigned int warning_threshold = 1000; diff --git a/lib/ovs-rcu.h b/lib/ovs-rcu.h index 366367c..2c7d1ea 100644 --- a/lib/ovs-rcu.h +++ b/lib/ovs-rcu.h @@ -203,4 +203,8 @@ void ovsrcu_quiesce_end(void); void ovsrcu_quiesce(void); bool ovsrcu_is_quiescent(void); +/* Synchronization. Waits for all non-quiescent threads to quiesce at least + * once. This can block for a relatively long time. */ +void ovsrcu_synchronize(void); + #endif /* ovs-rcu.h */ -- 1.7.10.4 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
[ovs-dev] [PATCH v3 2/7] lib/classifier: Simplify iteration with C99 declaration.
Hide the cursor from the classifier iteration users and move locking to the iterators. This will make following RCU changes simpler, as the call sites of the iterators need not be changed at that point. Signed-off-by: Jarno Rajahalme --- v3: No change. lib/classifier.c| 70 +-- lib/classifier.h| 66 ++-- ofproto/ofproto-dpif.c |7 + ofproto/ofproto.c | 42 +--- tests/test-classifier.c | 67 - utilities/ovs-ofctl.c | 23 6 files changed, 156 insertions(+), 119 deletions(-) diff --git a/lib/classifier.c b/lib/classifier.c index 69b5abd..5380c80 100644 --- a/lib/classifier.c +++ b/lib/classifier.c @@ -588,11 +588,12 @@ trie_init(struct cls_classifier *cls, int trie_idx, } } -/* Returns true if 'cls' contains no classification rules, false otherwise. */ +/* Returns true if 'cls' contains no classification rules, false otherwise. + * Checking the cmap requires no locking. */ bool classifier_is_empty(const struct classifier *cls) { -return cls->cls->n_rules == 0; +return cmap_is_empty(&cls->cls->subtables_map); } /* Returns the number of rules in 'cls'. */ @@ -1154,7 +1155,8 @@ search_subtable(const struct cls_subtable *subtable, return NULL; } -/* Initializes 'cursor' for iterating through rules in 'cls': +/* Initializes 'cursor' for iterating through rules in 'cls', and returns the + * first matching cls_rule via '*pnode', or NULL if there are no matches. * * - If 'target' is null, the cursor will visit every rule in 'cls'. * @@ -1162,45 +1164,68 @@ search_subtable(const struct cls_subtable *subtable, * such that cls_rule_is_loose_match(rule, target) returns true. * * Ignores target->priority. */ -void -cls_cursor_init(struct cls_cursor *cursor, const struct classifier *cls, -const struct cls_rule *target) -{ -cursor->cls = cls->cls; -cursor->target = target && !cls_rule_is_catchall(target) ? target : NULL; -} - -/* Returns the first matching cls_rule in 'cursor''s iteration, or a null - * pointer if there are no matches. 
*/ -struct cls_rule * -cls_cursor_first(struct cls_cursor *cursor) +struct cls_cursor cls_cursor_init(const struct classifier *cls, + const struct cls_rule *target, + void **pnode, const void *offset, bool safe) +OVS_NO_THREAD_SAFETY_ANALYSIS { +struct cls_cursor cursor; struct cls_subtable *subtable; +struct cls_rule *cls_rule = NULL; -CMAP_CURSOR_FOR_EACH (subtable, cmap_node, &cursor->subtables, - &cursor->cls->subtables_map) { -struct cls_match *rule = search_subtable(subtable, cursor); +cursor.safe = safe; +cursor.cls = cls; +cursor.target = target && !cls_rule_is_catchall(target) ? target : NULL; + +/* Find first rule. */ +fat_rwlock_rdlock(&cursor.cls->rwlock); +CMAP_CURSOR_FOR_EACH (subtable, cmap_node, &cursor.subtables, + &cursor.cls->cls->subtables_map) { +struct cls_match *rule = search_subtable(subtable, &cursor); if (rule) { -cursor->subtable = subtable; -return rule->cls_rule; +cursor.subtable = subtable; +cls_rule = rule->cls_rule; +break; } } +*pnode = (char *)cls_rule + (ptrdiff_t)offset; -return NULL; +/* Leave locked if requested and have a rule. */ +if (safe || !cls_rule) { +fat_rwlock_unlock(&cls->rwlock); +} +return cursor; +} + +static void +cls_cursor_next_unlock(struct cls_cursor *cursor, struct cls_rule *rule) +OVS_NO_THREAD_SAFETY_ANALYSIS +{ +/* Release the lock if no rule, or 'safe' mode. */ +if (!rule || cursor->safe) { +fat_rwlock_unlock(&cursor->cls->rwlock); +} } /* Returns the next matching cls_rule in 'cursor''s iteration, or a null * pointer if there are no more matches. */ struct cls_rule * cls_cursor_next(struct cls_cursor *cursor, const struct cls_rule *rule_) +OVS_NO_THREAD_SAFETY_ANALYSIS { struct cls_match *rule = CONST_CAST(struct cls_match *, rule_->cls_match); const struct cls_subtable *subtable; struct cls_match *next; +/* Lock if not locked already. 
*/ +if (cursor->safe) { +fat_rwlock_rdlock(&cursor->cls->rwlock); +} + next = next_rule_in_list__(rule); if (next->priority < rule->priority) { +cls_cursor_next_unlock(cursor, next->cls_rule); return next->cls_rule; } @@ -1210,6 +1235,7 @@ cls_cursor_next(struct cls_cursor *cursor, const struct cls_rule *rule_) rule = next; CMAP_CURSOR_FOR_EACH_CONTINUE (rule, cmap_node, &cursor->rules) { if (rule_matches(rule, cursor->target)) { +cls_cursor_next_unlock(cursor, rule->cls_rule);
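The "C99 declaration" of the patch title is the idiom of declaring the iteration variable inside the for statement of a macro, so the call site never names or manages a cursor. The pattern in miniature (toy list type, not the classifier's structures):

```c
#include <assert.h>
#include <stddef.h>

struct toy_node {
    int value;
    struct toy_node *next;
};

/* The iteration state (here just NODE) is declared in the for statement,
 * so its scope is exactly the loop body -- the same idiom the classifier's
 * iteration macros adopt in this series, where the hidden state is a
 * cls_cursor rather than a single pointer. */
#define TOY_FOR_EACH(NODE, HEAD) \
    for (struct toy_node *NODE = (HEAD); NODE; NODE = NODE->next)

/* Usage example: sum a list without any visible cursor at the call site. */
int toy_sum(struct toy_node *head)
{
    int sum = 0;

    TOY_FOR_EACH (node, head) {
        sum += node->value;
    }
    return sum;
}
```

This requires C99 (declarations inside `for`), which is why the patch title calls the change out explicitly.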
[ovs-dev] [PATCH v3 7/7] lib/classifier: Lockless lookups.
Now that all the relevant classifier structures use RCU and internal mutual exclusion for modifications, we can remove the fat-rwlock and thus make the classifier lookups lockless. As the readers are operating concurrently with the writers, a concurrent reader may or may not see a new rule being added by a writer, depending on how the concurrent events overlap with each other. Overall, this is no different from the former locked behavior, but there the visibility of the new rule only depended on the timing of the locking functions. A new rule is first added to the segment indices, so the readers may find the rule in the indices before the rule is visible in the subtables 'rules' map. This may result in us losing the opportunity to quit lookups earlier, resulting in sub-optimal wildcarding. This will be fixed by forthcoming revalidation always scheduled after flow table changes. Similar behavior may happen due to us removing the overlapping rule (if any) from the indices only after the corresponding new rule has been added. The subtable's max priority is updated only after a rule is inserted to the maps, so the concurrent readers may not see the rule, as the updated priority ordered subtable list will only be visible after the subtable's max priority is updated. Similarly, the classifier's partitions are updated by the caller after the rule is inserted to the maps, so the readers may keep skipping the subtable until they see the updated partitions. Signed-off-by: Jarno Rajahalme --- v3: Rebased to master. 
lib/classifier.c| 35 --- lib/classifier.h| 45 - lib/dpif-netdev.c | 25 - ofproto/ofproto-dpif.c |2 -- ofproto/ofproto.c | 33 ++--- ofproto/ofproto.h |3 +++ tests/test-classifier.c | 36 +--- utilities/ovs-ofctl.c |4 8 files changed, 58 insertions(+), 125 deletions(-) diff --git a/lib/classifier.c b/lib/classifier.c index ad035db..fe38a55 100644 --- a/lib/classifier.c +++ b/lib/classifier.c @@ -481,7 +481,6 @@ classifier_init(struct classifier *cls_, const uint8_t *flow_segments) { struct cls_classifier *cls = xmalloc(sizeof *cls); -fat_rwlock_init(&cls_->rwlock); ovs_mutex_init(&cls->mutex); ovs_mutex_lock(&cls->mutex); @@ -506,7 +505,8 @@ classifier_init(struct classifier *cls_, const uint8_t *flow_segments) } /* Destroys 'cls'. Rules within 'cls', if any, are not freed; this is the - * caller's responsibility. */ + * caller's responsibility. + * May only be called after all the readers have been terminated. */ void classifier_destroy(struct classifier *cls_) OVS_EXCLUDED(cls_->cls->mutex) @@ -517,7 +517,6 @@ classifier_destroy(struct classifier *cls_) struct cls_subtable *subtable, *next_subtable; int i; -fat_rwlock_destroy(&cls_->rwlock); if (!cls) { return; } @@ -682,7 +681,10 @@ classifier_is_empty(const struct classifier *cls) /* Returns the number of rules in 'cls'. */ int classifier_count(const struct classifier *cls) +OVS_NO_THREAD_SAFETY_ANALYSIS { +/* n_rules is an int, so in the presence of concurrent writers this will + * return either the old or a new value. */ return cls->cls->n_rules; } @@ -1711,6 +1713,33 @@ find_equal(struct cls_subtable *subtable, const struct miniflow *flow, return NULL; } +/* + * As the readers are operating concurrently with the modifications, a + * concurrent reader may or may not see the new rule, depending on how + * the concurrent events overlap with each other. This is no + * different from the former locked behavior, but there the visibility + * of the new rule only depended on the timing of the locking + * functions. 
+ * + * The new rule is first added to the segment indices, so the readers + * may find the rule in the indices before the rule is visible in the + * subtables 'rules' map. This may result in us losing the + * opportunity to quit lookups earlier, resulting in sub-optimal + * wildcarding. This will be fixed by forthcoming revalidation always + * scheduled after flow table changes. + * + * Similar behavior may happen due to us removing the overlapping rule + * (if any) from the indices only after the new rule has been added. + * + * The subtable's max priority is updated only after the rule is + * inserted, so the concurrent readers may not see the rule, as the + * updated priority ordered subtable list will only be visible after + * the subtable's max priority is updated. + * + * Similarly, the classifier's partitions for new rules are updated by + * the caller after this function, so the readers may keep skipping + * the subtable until they see the updated partitions. + */ static struct cls_match * insert_rule(struct cls_classifier *cls, struct cls_subtable *subtable,
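The visibility argument in the commit message above — a reader may see a new rule in a segment index before it appears in the subtable's 'rules' map, which costs only an early-exit opportunity, never correctness — can be sketched with plain C11 atomics. This is a hypothetical miniature, not the real OVS RCU API; the names `insert_rule` and `lookup` and the two single-slot "maps" are invented for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct rule { int priority; };

/* One segment-index slot and one 'rules'-map slot, published RCU-style:
 * release stores by the writer, acquire loads by readers. */
static _Atomic(struct rule *) index_slot;
static _Atomic(struct rule *) rules_slot;

/* Writer: the rule becomes visible in the index first, then in the
 * 'rules' map, mirroring the insertion order described above. */
static void insert_rule(struct rule *r)
{
    atomic_store_explicit(&index_slot, r, memory_order_release);
    atomic_store_explicit(&rules_slot, r, memory_order_release);
}

/* Reader: the index can only grant or deny an early exit; the answer
 * always comes from the 'rules' map, so observing the index entry
 * before the map entry is merely sub-optimal, never wrong. */
static struct rule *lookup(void)
{
    if (!atomic_load_explicit(&index_slot, memory_order_acquire)) {
        return NULL;            /* early exit: definitely no match */
    }
    return atomic_load_explicit(&rules_slot, memory_order_acquire);
}
```

A concurrent reader that runs between the two stores in `insert_rule()` sees the index hint but a still-empty 'rules' slot, which is exactly the transient state the commit message argues is harmless.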
[ovs-dev] [PATCH v3 5/7] lib/classifier: Use internal mutex.
Add an internal mutex to the struct cls_classifier, and reorganize classifier internal structures according to the user of each field, marking the fields that need to be protected by the mutex. This makes locking requirements easier to track, and may make lookup more memory efficient. After this patch there is some double locking, as the caller is taking the fat-rwlock, and we take a mutex internally. A following patch will remove the classifier fat-rwlock, removing the (double) locking overhead. Signed-off-by: Jarno Rajahalme --- v3: No change. lib/classifier.c| 151 --- lib/classifier.h| 16 ++--- tests/test-classifier.c |8 ++- 3 files changed, 118 insertions(+), 57 deletions(-) diff --git a/lib/classifier.c b/lib/classifier.c index 21605c8..d6125d3 100644 --- a/lib/classifier.c +++ b/lib/classifier.c @@ -53,7 +53,8 @@ enum { }; struct cls_classifier { -int n_rules;/* Total number of rules. */ +struct ovs_mutex mutex; +int n_rules OVS_GUARDED;/* Total number of rules. */ uint8_t n_flow_segments; uint8_t flow_segments[CLS_MAX_INDICES]; /* Flow segment boundaries to use * for staged lookup. */ @@ -66,20 +67,30 @@ struct cls_classifier { /* A set of rules that all have the same fields wildcarded. */ struct cls_subtable { +/* The fields are only used by writers and iterators. */ struct cmap_node cmap_node; /* Within struct cls_classifier * 'subtables_map'. */ -struct cmap rules; /* Contains "struct cls_rule"s. */ -int n_rules;/* Number of rules, including duplicates. */ -unsigned int max_priority; /* Max priority of any rule in the subtable. */ -unsigned int max_count; /* Count of max_priority rules. */ -tag_type tag; /* Tag generated from mask for partitioning. */ -uint8_t n_indices; /* How many indices to use. */ -uint8_t index_ofs[CLS_MAX_INDICES]; /* u32 flow segment boundaries. */ -struct cmap indices[CLS_MAX_INDICES]; /* Staged lookup indices. */ -unsigned int trie_plen[CLS_MAX_TRIES]; /* Trie prefix length in 'mask'. 
*/ -int ports_mask_len; -struct trie_node *ports_trie; /* NULL if none. */ -struct minimask mask; /* Wildcards for fields. */ + +/* The fields are only used by writers. */ +int n_rules OVS_GUARDED;/* Number of rules, including + * duplicates. */ +unsigned int max_priority OVS_GUARDED; /* Max priority of any rule in + * the subtable. */ +unsigned int max_count OVS_GUARDED; /* Count of max_priority rules. */ + +/* These fields are accessed by readers who care about wildcarding. */ +tag_type tag; /* Tag generated from mask for partitioning (const). */ +uint8_t n_indices; /* How many indices to use (const). */ +uint8_t index_ofs[CLS_MAX_INDICES]; /* u32 segment boundaries (const). */ +unsigned int trie_plen[CLS_MAX_TRIES]; /* Trie prefix length in 'mask' + * (runtime configurable). */ +int ports_mask_len; /* (const) */ +struct cmap indices[CLS_MAX_INDICES]; /* Staged lookup indices. */ +struct trie_node *ports_trie; /* NULL if none. */ + +/* These fields are accessed by all readers. */ +struct cmap rules; /* Contains "struct cls_rule"s. */ +struct minimask mask; /* Wildcards for fields (const). */ /* 'mask' must be the last field. */ }; @@ -91,18 +102,24 @@ struct cls_partition { * map. */ ovs_be64 metadata; /* metadata value for this partition. */ tag_type tags; /* OR of each flow's cls_subtable tag. */ -struct tag_tracker tracker; /* Tracks the bits in 'tags'. */ +struct tag_tracker tracker OVS_GUARDED; /* Tracks the bits in 'tags'. */ }; /* Internal representation of a rule in a "struct cls_subtable". */ struct cls_match { -struct cls_rule *cls_rule; +/* Accessed only by writers and iterators. */ +struct list list OVS_GUARDED; /* List of identical, lower-priority rules. */ + +/* Accessed only by writers. */ +struct cls_partition *partition OVS_GUARDED; + +/* Accessed by readers interested in wildcarding. */ +unsigned int priority; /* Larger numbers are higher priorities. */ struct cmap_node index_nodes[CLS_MAX_INDICES]; /* Within subtable's * 'indices'. 
*/ +/* Accessed by all readers. */ struct cmap_node cmap_node; /* Within struct cls_subtable 'rules'. */ -unsigned int priority; /* Larger numbers are higher priorities. */ -struct cls_partition *partition; -
[ovs-dev] [PATCH v3 6/7] lib/classifier: RCUify prefix trie code.
cls_set_prefix_fields() now synchronizes explicitly with the readers, waiting for them to finish using the old configuration before changing to the new configuration. Signed-off-by: Jarno Rajahalme --- v3: No change. lib/classifier.c| 214 +-- lib/classifier.h|2 +- ofproto/ofproto.c |6 +- tests/test-classifier.c | 65 +++--- 4 files changed, 208 insertions(+), 79 deletions(-) diff --git a/lib/classifier.c b/lib/classifier.c index d6125d3..ad035db 100644 --- a/lib/classifier.c +++ b/lib/classifier.c @@ -42,10 +42,12 @@ struct trie_ctx; #define TP_PORTS_OFS32 (offsetof(struct flow, tp_src) / 4) BUILD_ASSERT_DECL(TP_PORTS_OFS32 == offsetof(struct flow, tp_dst) / 4); +typedef OVSRCU_TYPE(struct trie_node *) rcu_trie_ptr; + /* Prefix trie for a 'field' */ struct cls_trie { const struct mf_field *field; /* Trie field, or NULL. */ -struct trie_node *root; /* NULL if none. */ +rcu_trie_ptr root;/* NULL if none. */ }; enum { @@ -86,7 +88,7 @@ struct cls_subtable { * (runtime configurable). */ int ports_mask_len; /* (const) */ struct cmap indices[CLS_MAX_INDICES]; /* Staged lookup indices. */ -struct trie_node *ports_trie; /* NULL if none. */ +rcu_trie_ptr ports_trie;/* NULL if none. */ /* These fields are accessed by all readers. */ struct cmap rules; /* Contains "struct cls_rule"s. 
*/ @@ -180,16 +182,16 @@ static void trie_init(struct cls_classifier *cls, int trie_idx, OVS_REQUIRES(cls->mutex); static unsigned int trie_lookup(const struct cls_trie *, const struct flow *, unsigned int *checkbits); -static unsigned int trie_lookup_value(const struct trie_node *, +static unsigned int trie_lookup_value(const rcu_trie_ptr *, const ovs_be32 value[], unsigned int value_bits, unsigned int *checkbits); -static void trie_destroy(struct trie_node *); +static void trie_destroy(rcu_trie_ptr *); static void trie_insert(struct cls_trie *, const struct cls_rule *, int mlen); -static void trie_insert_prefix(struct trie_node **, const ovs_be32 *prefix, +static void trie_insert_prefix(rcu_trie_ptr *, const ovs_be32 *prefix, int mlen); static void trie_remove(struct cls_trie *, const struct cls_rule *, int mlen); -static void trie_remove_prefix(struct trie_node **, const ovs_be32 *prefix, +static void trie_remove_prefix(rcu_trie_ptr *, const ovs_be32 *prefix, int mlen); static void mask_set_prefix_bits(struct flow_wildcards *, uint8_t be32ofs, unsigned int n_bits); @@ -522,7 +524,7 @@ classifier_destroy(struct classifier *cls_) ovs_mutex_lock(&cls->mutex); for (i = 0; i < cls->n_tries; i++) { -trie_destroy(cls->tries[i].root); +trie_destroy(&cls->tries[i].root); } CMAP_FOR_EACH_SAFE (subtable, next_subtable, cmap_node, @@ -548,7 +550,7 @@ classifier_destroy(struct classifier *cls_) BUILD_ASSERT_DECL(MFF_N_IDS <= 64); /* Set the fields for which prefix lookup should be performed. 
*/ -void +bool classifier_set_prefix_fields(struct classifier *cls_, const enum mf_field_id *trie_fields, unsigned int n_fields) @@ -556,10 +558,12 @@ classifier_set_prefix_fields(struct classifier *cls_, { struct cls_classifier *cls = cls_->cls; uint64_t fields = 0; -int i, trie; +const struct mf_field * new_fields[CLS_MAX_TRIES]; +int i, n_tries = 0; +bool changed = false; ovs_mutex_lock(&cls->mutex); -for (i = 0, trie = 0; i < n_fields && trie < CLS_MAX_TRIES; i++) { +for (i = 0; i < n_fields && n_tries < CLS_MAX_TRIES; i++) { const struct mf_field *field = mf_from_id(trie_fields[i]); if (field->flow_be32ofs < 0 || field->n_bits % 32) { /* Incompatible field. This is the only place where we @@ -576,18 +580,57 @@ classifier_set_prefix_fields(struct classifier *cls_, } fields |= UINT64_C(1) << trie_fields[i]; -if (trie >= cls->n_tries || field != cls->tries[trie].field) { -trie_init(cls, trie, field); +new_fields[n_tries] = NULL; +if (n_tries >= cls->n_tries || field != cls->tries[n_tries].field) { +new_fields[n_tries] = field; +changed = true; +} +n_tries++; +} + +if (changed || n_tries < cls->n_tries) { +struct cls_subtable *subtable; + +/* Trie configuration needs to change. Disable trie lookups + * for the tries that are changing and wait all the current readers + * with the old configuration to be
Re: [ovs-dev] [PATCHv2] util: fix compile warnings
Even better. Thanks! Acked-by: Joe Stringer
Re: [ovs-dev] [valgrind PATCH] ovsdb: Frees database memory on ovsdb process cleanup.
On Wed, Jul 02, 2014 at 03:00:16PM -0700, Ryan Wilson wrote: > This fixes valgrind errors. > > Signed-off-by: Ryan Wilson Applied, thanks!
Re: [ovs-dev] [PATCH v2] lib/hash: Use CRC32 for hashing.
On Mon, Jul 07, 2014 at 01:16:59PM -0700, Jarno Rajahalme wrote: > Use CRC32 intrinsics for hash computations when building for > X86_64 with SSE4_2. > > Add a new hash_words64() and change hash_words() to be inlined when > 'n_words' is a compile-time constant. > > Signed-off-by: Jarno Rajahalme I only build-tested this, only in the non-SSE4.2 case. Acked-by: Ben Pfaff
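The `__builtin_constant_p()` dispatch discussed in this thread — inline the hash only when `n_words` is a compile-time constant — can be sketched as follows. This is an illustrative stand-in, not the actual OVS code: a portable bit-at-a-time CRC-32C replaces the SSE4.2 `_mm_crc32_u32` intrinsic so the example builds without `-msse4.2`, and the function names are invented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Software stand-in for the CRC32 instruction: one 32-bit word per
 * step, CRC-32C (Castagnoli) polynomial, reflected form. */
static inline uint32_t crc32c_u32(uint32_t crc, uint32_t data)
{
    crc ^= data;
    for (int i = 0; i < 32; i++) {
        crc = (crc >> 1) ^ (0x82f63b78 & -(crc & 1));
    }
    return crc;
}

static inline uint32_t
hash_words_inline(const uint32_t *p, size_t n_words, uint32_t basis)
{
    uint32_t hash = basis;
    for (size_t i = 0; i < n_words; i++) {
        hash = crc32c_u32(hash, p[i]);
    }
    return hash;
}

/* Out-of-line version for the non-constant case. */
static uint32_t
hash_words__(const uint32_t *p, size_t n_words, uint32_t basis)
{
    return hash_words_inline(p, n_words, basis);
}

/* Inline only when 'n_words' folds at compile time (GCC/Clang's
 * __builtin_constant_p).  Both paths hash identically, so — as noted
 * in the review — this is a code-size/speed trade-off, not a
 * correctness issue, and a compiler without the builtin can simply
 * pick one path. */
#define hash_words(p, n_words, basis)               \
    (__builtin_constant_p(n_words)                  \
     ? hash_words_inline(p, n_words, basis)         \
     : hash_words__(p, n_words, basis))
```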
Re: [ovs-dev] [PATCHv2] util: fix compile warnings
I pushed this to master and branch-2.3. On 8 July 2014 09:53, Joe Stringer wrote: > Even better. Thanks! > > Acked-by: Joe Stringer
Re: [ovs-dev] [PATCH] tests: Fix race in 'balance-tcp bonding' test.
Thanks, I adjusted the commit message and pushed to master: Running the test in a tight loop could cause this test to fail after about 5 runs, with some of the ports reporting "may_enable: false" in the "ovs-appctl bond/show" output. This commit fixes the race condition by waiting for may_enable to be true for all bond ports. I suspect that LACP negotiation finishes, but the main thread doesn't have a chance to enable the ports before we send the test packets.
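The fix replaces a one-shot assertion with polling until LACP negotiation has converged. A generic retry helper in that spirit — the real testsuite uses the `OVS_WAIT_UNTIL` macro; this standalone `wait_until` is a simplified sketch with made-up limits — looks like:

```shell
# Run the given command until it succeeds, giving up after a bounded
# number of attempts instead of asserting the state exactly once.
wait_until() {
    tries=0
    until "$@"; do
        tries=$((tries + 1))
        if [ "$tries" -ge 50 ]; then
            echo "condition never became true: $*" >&2
            return 1
        fi
        sleep 0.1
    done
}

# For this test, something along the lines of:
#   wait_until sh -c '! ovs-appctl bond/show | grep -q "may_enable: false"'
```

The point is that a slow LACP handshake now only delays the test instead of failing it, while a genuinely broken bond still fails once the retry budget is exhausted.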
[ovs-dev] [PATCHv4 1/2] dpif: Support fetching flow mask via dpif_flow_get().
Change the interface to allow implementations to pass back a buffer, and allow callers to specify which of actions, mask, and stats they wish to receive. This will be used in the next commit. Signed-off-by: Joe Stringer --- v4: Rebase against master. v3: Set *bufp to NULL before calling ->flow_get(). Allocate the correct size for the buffer in dpif_netdev_flow_get(). Don't overwrite mask with actions in dpif_netdev_flow_get(). Remove unneeded *bufp = NULL in dpif_netdev_flow_get(). v2: First post. --- lib/dpif-linux.c| 23 +-- lib/dpif-netdev.c | 30 +++--- lib/dpif-provider.h | 21 - lib/dpif.c | 49 + lib/dpif.h |3 ++- 5 files changed, 83 insertions(+), 43 deletions(-) diff --git a/lib/dpif-linux.c b/lib/dpif-linux.c index 0eac3e7..9e0aa29 100644 --- a/lib/dpif-linux.c +++ b/lib/dpif-linux.c @@ -1047,24 +1047,27 @@ dpif_linux_flow_get__(const struct dpif_linux *dpif, static int dpif_linux_flow_get(const struct dpif *dpif_, const struct nlattr *key, size_t key_len, -struct ofpbuf **actionsp, struct dpif_flow_stats *stats) +struct ofpbuf **bufp, +struct nlattr **maskp, size_t *mask_len, +struct nlattr **actionsp, size_t *actions_len, +struct dpif_flow_stats *stats) { const struct dpif_linux *dpif = dpif_linux_cast(dpif_); struct dpif_linux_flow reply; -struct ofpbuf *buf; int error; -error = dpif_linux_flow_get__(dpif, key, key_len, &reply, &buf); +error = dpif_linux_flow_get__(dpif, key, key_len, &reply, bufp); if (!error) { -if (stats) { -dpif_linux_flow_get_stats(&reply, stats); +if (maskp) { +*maskp = CONST_CAST(struct nlattr *, reply.mask); +*mask_len = reply.mask_len; } if (actionsp) { -ofpbuf_set_data(buf, CONST_CAST(struct nlattr *, reply.actions)); -ofpbuf_set_size(buf, reply.actions_len); -*actionsp = buf; -} else { -ofpbuf_delete(buf); +*actionsp = CONST_CAST(struct nlattr *, reply.actions); +*actions_len = reply.actions_len; +} +if (stats) { +dpif_linux_flow_get_stats(&reply, stats); } } return error; diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index 
b6d6b2e..974b5e1 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -1212,7 +1212,10 @@ dpif_netdev_flow_from_nlattrs(const struct nlattr *key, uint32_t key_len, static int dpif_netdev_flow_get(const struct dpif *dpif, const struct nlattr *nl_key, size_t nl_key_len, - struct ofpbuf **actionsp, struct dpif_flow_stats *stats) + struct ofpbuf **bufp, + struct nlattr **maskp, size_t *mask_len, + struct nlattr **actionsp, size_t *actions_len, + struct dpif_flow_stats *stats) { struct dp_netdev *dp = get_dp_netdev(dpif); struct dp_netdev_flow *netdev_flow; @@ -1231,11 +1234,32 @@ dpif_netdev_flow_get(const struct dpif *dpif, get_dpif_flow_stats(netdev_flow, stats); } -if (actionsp) { +if (maskp || actionsp) { struct dp_netdev_actions *actions; +size_t len = 0; actions = dp_netdev_flow_get_actions(netdev_flow); -*actionsp = ofpbuf_clone_data(actions->actions, actions->size); +len += maskp ? sizeof(struct odputil_keybuf) : 0; +len += actionsp ? actions->size : 0; + +*bufp = ofpbuf_new(len); +if (maskp) { +struct flow_wildcards wc; + +minimask_expand(&netdev_flow->cr.match.mask, &wc); +odp_flow_key_from_mask(*bufp, &wc.masks, &netdev_flow->flow, + odp_to_u32(wc.masks.in_port.odp_port), + SIZE_MAX, true); +*maskp = ofpbuf_data(*bufp); +*mask_len = ofpbuf_size(*bufp); +} +if (actionsp) { +struct dp_netdev_actions *actions; + +actions = dp_netdev_flow_get_actions(netdev_flow); +*actionsp = ofpbuf_put(*bufp, actions->actions, actions->size); +*actions_len = actions->size; +} } } else { error = ENOENT; diff --git a/lib/dpif-provider.h b/lib/dpif-provider.h index b762ac0..2331cbf 100644 --- a/lib/dpif-provider.h +++ b/lib/dpif-provider.h @@ -246,16 +246,27 @@ struct dpif_class { * Returns 0 if successful. If no flow matches, returns ENOENT. On other * failure, returns a positive errno value. * - * If 'actionsp' is nonnull, then on success '*actionsp' must be set to an -
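The new `dpif_flow_get()` contract described above — the implementation allocates one buffer, and the optional mask/actions outputs are pointers into it, so the caller frees only the buffer — can be modeled with a toy function. The types and "MASK"/"ACTS" payloads here are invented for illustration; the real API deals in netlink attributes and `struct ofpbuf`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct buf { char *data; size_t size; };

/* Toy flow_get(): allocates '*bufp' once; '*maskp' and '*actionsp'
 * point into that buffer, so the caller frees only the buffer.
 * Either output pointer may be NULL when the caller does not want
 * that piece, mirroring the dpif interface change. */
static int
flow_get(struct buf **bufp, const char **maskp, size_t *mask_len,
         const char **actionsp, size_t *actions_len)
{
    static const char mask[] = "MASK";
    static const char acts[] = "ACTS";
    struct buf *b = malloc(sizeof *b);
    size_t len = (maskp ? sizeof mask : 0) + (actionsp ? sizeof acts : 0);
    char *p = malloc(len ? len : 1);

    b->data = p;
    b->size = len;
    if (maskp) {
        *maskp = memcpy(p, mask, sizeof mask);
        *mask_len = sizeof mask;
        p += sizeof mask;
    }
    if (actionsp) {
        *actionsp = memcpy(p, acts, sizeof acts);
        *actions_len = sizeof acts;
    }
    *bufp = b;                  /* caller owns the buffer */
    return 0;
}
```

One allocation and one free regardless of how many pieces the caller asked for is what makes this shape attractive for per-flow dump paths.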
[ovs-dev] [PATCHv4 2/2] revalidator: Revalidate missed flows.
If the datapath doesn't dump a flow for some reason, and the current dump is expected to revalidate all flows in the datapath, then perform revalidation for those flows by fetching them during the sweep phase. If revalidation is not required, then leave the flow in the datapath and don't revalidate it. Signed-off-by: Joe Stringer --- v4: Minor adjustments to base against master. v3: No change. v2: Make the revalidator_sweep__() logic easier to read. Rebase. RFC: First post. --- ofproto/ofproto-dpif-upcall.c | 34 +- 1 file changed, 29 insertions(+), 5 deletions(-) diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c index 2e58d94..df33643 100644 --- a/ofproto/ofproto-dpif-upcall.c +++ b/ofproto/ofproto-dpif-upcall.c @@ -46,6 +46,7 @@ VLOG_DEFINE_THIS_MODULE(ofproto_dpif_upcall); COVERAGE_DEFINE(upcall_duplicate_flow); +COVERAGE_DEFINE(revalidate_missed_dp_flow); /* A thread that reads upcalls from dpif, forwards each upcall's packet, * and possibly sets up a kernel flow as a cache. */ @@ -1440,6 +1441,27 @@ revalidate(struct revalidator *revalidator) dpif_flow_dump_thread_destroy(dump_thread); } +/* Called with exclusive access to 'revalidator' and 'ukey'. */ +static bool +handle_missed_revalidation(struct revalidator *revalidator, + struct udpif_key *ukey) +OVS_NO_THREAD_SAFETY_ANALYSIS +{ +struct udpif *udpif = revalidator->udpif; +struct dpif_flow flow; +struct ofpbuf *buf; +bool keep = false; + +COVERAGE_INC(revalidate_missed_dp_flow); + +if (!dpif_flow_get(udpif->dpif, ukey->key, ukey->key_len, &buf, &flow)) { +keep = revalidate_ukey(udpif, ukey, &flow); +ofpbuf_delete(buf); +} + +return keep; +} + static void revalidator_sweep__(struct revalidator *revalidator, bool purge) OVS_NO_THREAD_SAFETY_ANALYSIS @@ -1455,18 +1477,20 @@ revalidator_sweep__(struct revalidator *revalidator, bool purge) /* During garbage collection, this revalidator completely owns its ukeys * map, and therefore doesn't need to do any locking. 
*/ HMAP_FOR_EACH_SAFE (ukey, next, hmap_node, revalidator->ukeys) { -if (!ukey->flow_exists) { -ukey_delete(revalidator, ukey); -} else if (purge || ukey->dump_seq != dump_seq) { +if (ukey->flow_exists +&& (purge +|| (ukey->dump_seq != dump_seq +&& revalidator->udpif->need_revalidate +&& !handle_missed_revalidation(revalidator, ukey { struct dump_op *op = &ops[n_ops++]; -/* If we have previously seen a flow in the datapath, but it - * hasn't been seen in the current dump, delete it. */ dump_op_init(op, ukey->key, ukey->key_len, ukey); if (n_ops == REVALIDATE_MAX_BATCH) { push_dump_ops(revalidator, ops, n_ops); n_ops = 0; } +} else if (!ukey->flow_exists) { +ukey_delete(revalidator, ukey); } } -- 1.7.10.4 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
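The reworked sweep condition in this patch packs several cases into one boolean. A standalone predicate — with shortened, hypothetical names; `reval_keeps` stands in for `handle_missed_revalidation()` succeeding — makes the truth table easy to check:

```c
#include <assert.h>
#include <stdbool.h>

struct ukey_mock { bool flow_exists; unsigned dump_seq; };

/* Delete the datapath flow when it exists and is either purged, or
 * was missed by the last dump while revalidation was required and the
 * on-demand fetch-and-revalidate could not keep it.  A missed flow
 * with no revalidation pending is simply left alone, per the commit
 * message. */
static bool
should_delete(const struct ukey_mock *k, bool purge, unsigned dump_seq,
              bool need_revalidate, bool reval_keeps)
{
    return k->flow_exists
           && (purge
               || (k->dump_seq != dump_seq
                   && need_revalidate
                   && !reval_keeps));
}
```

In the real function, a ukey whose flow no longer exists takes the separate `ukey_delete()` path instead.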
[ovs-dev] [PATCHv4 0/2] Revalidate missed flows - master
This is a rebase of the following series against master: http://patchwork.openvswitch.org/patch/4895/ http://patchwork.openvswitch.org/patch/4891/ The way that revalidators handle dpif_flows is a bit different on master, so this version has a different dpif_flow_get() interface from the version for branch-2.3. Note that I have dropped the flow_get batching code from this version, as the revalidator code doesn't make use of this feature. v4: Rebase against master. v3: Address buffer safety feedback. v2: Drop batching. RFC: Initial post. Joe Stringer (2): dpif: Support fetching flow mask via dpif_flow_get(). revalidator: Revalidate missed flows. lib/dpif-linux.c | 23 ++- lib/dpif-netdev.c | 30 ++--- lib/dpif-provider.h | 21 +- lib/dpif.c| 49 + lib/dpif.h|3 ++- ofproto/ofproto-dpif-upcall.c | 34 +++- 6 files changed, 112 insertions(+), 48 deletions(-) -- 1.7.10.4 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCHv3] dpif: Support fetching flow mask via dpif_flow_get().
> > This is for branch-2.3, I think. > > Acked-by: Ben Pfaff > Yes, thanks. I've pushed this to branch-2.3. A version of this series for master is available here: http://mail.openvswitch.org/pipermail/dev/2014-July/042507.html
Re: [ovs-dev] [PATCHv2 2/2] revalidator: Revalidate missed flows.
Thanks, I added that change and pushed to branch-2.3. A version of this series for master is available here: http://mail.openvswitch.org/pipermail/dev/2014-July/042507.html
[ovs-dev] [PATCHv2] daemon: restart child process if it died before signaling its readiness
The child process (the one being monitored) could die before it was able to call fork_notify_startup() function. If such situation arises, then parent process (the one monitoring child process) would also terminate with a fatal log message: ...|EMER|fork child died before signaling startup (killed (...)) This patch changes that behavior by always restarting child process if it was able to start up at least once in the past. However, if child was not able to start up even once, then the monitor process would still terminate, because that would most likely indicate a persistent programming or system error. To reproduce use following script: while : ; do kill -SIGSEGV `cat /var/run/openvswitch/ovs-vswitchd.pid`; done Signed-Off-By: Ansis Atteka VMware-BZ: 1273550 --- lib/daemon-unix.c | 63 ++--- 1 file changed, 41 insertions(+), 22 deletions(-) diff --git a/lib/daemon-unix.c b/lib/daemon-unix.c index 676a7f3..3d08222 100644 --- a/lib/daemon-unix.c +++ b/lib/daemon-unix.c @@ -214,19 +214,23 @@ fork_and_clean_up(void) /* Forks, then: * * - In the parent, waits for the child to signal that it has completed its - * startup sequence. Then stores -1 in '*fdp' and returns the child's pid. + * startup sequence. Then stores -1 in '*fdp' and returns the child's + * pid in '*child_pid' argument. * - * - In the child, stores a fd in '*fdp' and returns 0. The caller should - * pass the fd to fork_notify_startup() after it finishes its startup - * sequence. + * - In the child, stores a fd in '*fdp' and returns 0 through '*child_pid' + * argument. The caller should pass the fd to fork_notify_startup() after + * it finishes its startup sequence. * - * If something goes wrong with the fork, logs a critical error and aborts the - * process. */ -static pid_t -fork_and_wait_for_startup(int *fdp) + * Returns 0 on success. If something goes wrong and child process was not + * able to signal its readiness by calling fork_notify_startup(), then this + * function returns -1. 
However, even in case of failure it still sets child + * process id in '*child_pid'. */ +static int +fork_and_wait_for_startup(int *fdp, pid_t *child_pid) { int fds[2]; pid_t pid; +int ret = 0; xpipe(fds); @@ -252,8 +256,9 @@ fork_and_wait_for_startup(int *fdp) exit(WEXITSTATUS(status)); } else { char *status_msg = process_status_msg(status); -VLOG_FATAL("fork child died before signaling startup (%s)", - status_msg); +VLOG_ERR("fork child died before signaling startup (%s)", + status_msg); +ret = -1; } } else if (retval < 0) { VLOG_FATAL("waitpid failed (%s)", ovs_strerror(errno)); @@ -268,8 +273,8 @@ fork_and_wait_for_startup(int *fdp) close(fds[0]); *fdp = fds[1]; } - -return pid; +*child_pid = pid; +return ret; } static void @@ -317,6 +322,7 @@ monitor_daemon(pid_t daemon_pid) time_t last_restart; char *status_msg; int crashes; +bool child_ready = true; set_subprogram_name("monitor"); status_msg = xstrdup("healthy"); @@ -329,13 +335,16 @@ monitor_daemon(pid_t daemon_pid) proctitle_set("monitoring pid %lu (%s)", (unsigned long int) daemon_pid, status_msg); -do { -retval = waitpid(daemon_pid, &status, 0); -} while (retval == -1 && errno == EINTR); +if (child_ready) { +do { +retval = waitpid(daemon_pid, &status, 0); +} while (retval == -1 && errno == EINTR); +if (retval == -1) { +VLOG_FATAL("waitpid failed (%s)", ovs_strerror(errno)); +} +} -if (retval == -1) { -VLOG_FATAL("waitpid failed (%s)", ovs_strerror(errno)); -} else if (retval == daemon_pid) { +if (!child_ready || retval == daemon_pid) { char *s = process_status_msg(status); if (should_restart(status)) { free(status_msg); @@ -372,8 +381,11 @@ monitor_daemon(pid_t daemon_pid) last_restart = time(NULL); VLOG_ERR("%s, restarting", status_msg); -daemon_pid = fork_and_wait_for_startup(&daemonize_fd); -if (!daemon_pid) { +child_ready = !fork_and_wait_for_startup(&daemonize_fd, + &daemon_pid); +if (child_ready && !daemon_pid) { +/* Child process needs to break out of monitoring + * loop. 
*/ break; } } else { @@ -403,7 +415,12 @@ daemonize_start(void) daemonize_fd = -1; if (detach) { -if (fork_and
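The distinction the patch relies on — "child died before signaling startup" versus "child started, then died" — comes from a pipe: the child writes one byte once its startup is done, so EOF on the read end before any byte means the child never came up. A stripped-down sketch (error handling and the monitor's retry loop omitted; this is not the real `fork_and_wait_for_startup()`):

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 0 if the child wrote its readiness byte, -1 if it exited
 * first.  'die_early' forces the failure path for demonstration. */
static int
wait_for_startup(pid_t *child_pid, int die_early)
{
    int fds[2];
    pid_t pid;
    char c;
    ssize_t n;

    if (pipe(fds)) {
        return -1;
    }
    pid = fork();
    if (pid == 0) {                      /* child */
        close(fds[0]);
        if (die_early) {
            _exit(1);                    /* dies before signaling */
        }
        if (write(fds[1], "", 1) != 1) { /* fork_notify_startup() */
            _exit(1);
        }
        _exit(0);
    }
    close(fds[1]);                       /* parent */
    n = read(fds[0], &c, 1);             /* 0 (EOF) = child died first */
    close(fds[0]);
    *child_pid = pid;
    return n == 1 ? 0 : -1;
}
```

With the return code split out this way, the monitor can restart a child that was ready at least once, and give up only when startup never succeeded.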
Re: [ovs-dev] [PATCH] daemon: restart child process if it died before signaling its readiness
On Thu, Jul 3, 2014 at 1:23 PM, Ben Pfaff wrote: > On Sun, Jun 29, 2014 at 08:37:25PM -0700, Ansis Atteka wrote: >> From: Ansis > > Fix up the patch author? > >> The child process (the one being monitored) could die before it was able >> to call fork_notify_startup() function. If such situation arises, then >> parent process (the one monitoring child process) would also terminate >> with a fatal log message: >> >> ...|EMER|fork child died before signaling startup (killed (...)) >> >> This patch changes that beahvior by always restarting child process >> if it was able to start up at least once in the past. However, if >> child was not able to start up even the first time, then monitor process >> would still terminate, because that would most likely indicate a >> persistent programming error. >> >> To reproduce use following script: >> >> while : ; do kill -SIGSEGV `cat /var/run/openvswitch/ovs-vswitchd.pid`; done >> >> Signed-Off-By: Ansis Atteka >> Issue: 1273550 > > Clang does not like this: > > ../lib/daemon-unix.c:338:13: error: variable 'retval' is used > uninitialized > whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized] > if (child_ready) { > ^~~ > ../lib/daemon-unix.c:347:13: note: uninitialized use occurs here > if (retval == daemon_pid || !child_ready) { > ^~ > ../lib/daemon-unix.c:338:9: note: remove the 'if' if its condition is > always > true > if (child_ready) { > ^ > ../lib/daemon-unix.c:332:19: note: initialize the variable 'retval' to > silence > this warning > int retval; > ^ >= 0 > 1 error generated. > > I can kind of see its point. This use of uninitialized data will > never be a real problem, because !child_ready will always be true if > retval is uninitialized, but it still reads better (and pleases Clang) > if the test condition is reversed: > if (!child_ready || retval == daemon_pid) { > > I think that the previous code logged something at VLL_EMER level if > the monitor process exited. 
I think the new version does not; it > would be nice if it did. You might use VLOG_FATAL instead of exit(1) > to get that effect. I sent V2 patch that addresses all your comments. Thanks for review! ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
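The reversed test that satisfies Clang works because `||` short-circuits left to right: when `child_ready` is false, the possibly-uninitialized `retval` is never read. A tiny mock of the control flow (with `waitpid()` replaced by a parameter; names are illustrative, not the real `monitor_daemon()`):

```c
#include <assert.h>
#include <stdbool.h>

/* 'waitpid_result' stands in for the value waitpid() would have
 * stored in 'retval' when the child was ever started. */
static bool
needs_restart(bool child_ready, int waitpid_result, int daemon_pid)
{
    int retval;                  /* deliberately assigned only when used */

    if (child_ready) {
        retval = waitpid_result; /* i.e. retval = waitpid(...) */
    }
    /* Reversed order: '!child_ready' is tested first, so 'retval' is
     * read only on paths where it was assigned — which is exactly
     * what Clang's -Wsometimes-uninitialized can now prove. */
    if (!child_ready || retval == daemon_pid) {
        return true;             /* child gone (or never started) */
    }
    return false;
}
```

Writing `retval == daemon_pid || !child_ready` instead would be equally safe at runtime here, but the analysis sees a path that reads `retval` before checking `child_ready`, hence the warning on the original ordering.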
[ovs-dev] [PATCH] tests: Fix "megaflow disabled" test.
The previous 'grep' logic would occasionally catch unintended flows. There's no reason to check for these flows separately, so combine the two checks. Signed-off-by: Joe Stringer --- tests/ofproto-dpif.at |4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at index 038eb07..d05b4ef 100644 --- a/tests/ofproto-dpif.at +++ b/tests/ofproto-dpif.at @@ -5273,10 +5273,8 @@ AT_CHECK([cat ovs-vswitchd.log | FILTER_FLOW_INSTALL | STRIP_USED], [0], [dnl skb_priority(0),skb_mark(0),recirc_id(0),dp_hash(0),in_port(1),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0a),eth_type(0x0800),ipv4(src=10.0.0.2,dst=10.0.0.1,proto=1,tos=0,ttl=64,frag=no),icmp(type=8,code=0), actions:2 skb_priority(0),skb_mark(0),recirc_id(0),dp_hash(0),in_port(1),eth(src=50:54:00:00:00:0b,dst=50:54:00:00:00:0c),eth_type(0x0800),ipv4(src=10.0.0.4,dst=10.0.0.3,proto=1,tos=0,ttl=64,frag=no),icmp(type=8,code=0), actions:drop ]) -AT_CHECK([cat ovs-vswitchd.log | grep '00:09.*packets:3' | FILTER_FLOW_DUMP], [0], [dnl +AT_CHECK([cat ovs-vswitchd.log | FILTER_FLOW_DUMP | grep 'packets:3'], [0], [dnl skb_priority(0),skb_mark(0),recirc_id(0),dp_hash(0),in_port(1),eth(src=50:54:00:00:00:09,dst=50:54:00:00:00:0a),eth_type(0x0800),ipv4(src=10.0.0.2,dst=10.0.0.1,proto=1,tos=0,ttl=64,frag=no),icmp(type=8,code=0), packets:3, bytes:180, used:0.0s, actions:2 -]) -AT_CHECK([cat ovs-vswitchd.log | grep '00:0b.*packets:3' | FILTER_FLOW_DUMP], [0], [dnl skb_priority(0),skb_mark(0),recirc_id(0),dp_hash(0),in_port(1),eth(src=50:54:00:00:00:0b,dst=50:54:00:00:00:0c),eth_type(0x0800),ipv4(src=10.0.0.4,dst=10.0.0.3,proto=1,tos=0,ttl=64,frag=no),icmp(type=8,code=0), packets:3, bytes:180, used:0.0s, actions:drop ]) OVS_VSWITCHD_STOP -- 1.7.10.4 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH] tests: Fix "megaflow disabled" test.
On Mon, Jul 7, 2014 at 6:01 PM, Joe Stringer wrote: > The previous 'grep' logic would occasionally catch unintended flows. > There's no reason to check for these flows separately, so combine the > two checks. > > Signed-off-by: Joe Stringer > --- > tests/ofproto-dpif.at |4 +--- > 1 file changed, 1 insertion(+), 3 deletions(-) > Acked-by: Ansis Atteka
[ovs-dev] [PATCH] dpif-linux: Recheck the socket pointer existence before getting its pid.
This commit fixes a race between port deletion and flow miss handling. More specifically, a port could be removed while handling the flow miss from it. If the flow requires slow path action, the handler thread will try querying a pid from port's sockets. Since the port has been deleted, the query will cause a dereference of NULL socket pointer. This commit makes the handler thread recheck the socket pointer before dereferencing it. VMware-BZ: 1251981 Reported-by: Pratap Reddy Signed-off-by: Alex Wang --- lib/dpif-linux.c |6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/lib/dpif-linux.c b/lib/dpif-linux.c index 0eac3e7..b7495ce 100644 --- a/lib/dpif-linux.c +++ b/lib/dpif-linux.c @@ -831,7 +831,11 @@ dpif_linux_port_get_pid__(const struct dpif_linux *dpif, odp_port_t port_no, uint32_t idx = port_idx >= dpif->uc_array_size ? 0 : port_idx; struct dpif_handler *h = &dpif->handlers[hash % dpif->n_handlers]; -pid = nl_sock_pid(h->channels[idx].sock); +/* Needs to check in case the socket pointer is changed in between + * of holding the locks. */ +if (h->channels[idx].sock) { +pid = nl_sock_pid(h->channels[idx].sock); +} } return pid; -- 1.7.9.5 ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
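The shape of the fix is a plain check-before-dereference under the existing locking. A minimal model (hypothetical structs, not the real dpif-linux code, where `nl_sock_pid()` reads the socket's Netlink pid):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct nl_sock { uint32_t pid; };
struct channel { struct nl_sock *sock; };   /* NULLed on port deletion */

static uint32_t
port_get_pid(const struct channel *ch)
{
    uint32_t pid = 0;            /* 0 = "no pid available" */

    /* Recheck: the port (and therefore its socket) may have been
     * deleted between the flow-miss lookup and this point, so the
     * pointer must be tested before dereferencing. */
    if (ch->sock) {
        pid = ch->sock->pid;
    }
    return pid;
}
```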
Re: [ovs-dev] [PATCH] tests: Fix "megaflow disabled" test.
Thanks, I applied this to master and branch-2.3. On 8 July 2014 13:17, Ansis Atteka wrote: > On Mon, Jul 7, 2014 at 6:01 PM, Joe Stringer > wrote: > > The previous 'grep' logic would occasionally catch unintended flows. > > There's no reason to check for these flows separately, so combine the > > two checks. > > > > Signed-off-by: Joe Stringer > > --- > > tests/ofproto-dpif.at |4 +--- > > 1 file changed, 1 insertion(+), 3 deletions(-) > > > Acked-by: Ansis Atteka
Re: [ovs-dev] [PATCH v3 7/7] lib/classifier: Lockless lookups.
> Now that all the relevant classifier structures use RCU and internal > mutual exclusion for modifications, we can remove the fat-rwlock and > thus make the classifier lookups lockless. > > As the readers are operating concurrently with the writers, a > concurrent reader may or may not see a new rule being added by a > writer, depending on how the concurrent events overlap with each > other. Overall, this is no different from the former locked behavior, > but there the visibility of the new rule only depended on the timing > of the locking functions. > > A new rule is first added to the segment indices, so the readers may > find the rule in the indices before the rule is visible in the > subtables 'rules' map. This may result in us losing the opportunity > to quit lookups earlier, resulting in sub-optimal wildcarding. This > will be fixed by forthcoming revalidation always scheduled after flow > table changes. > > Similar behavior may happen due to us removing the overlapping rule > (if any) from the indices only after the corresponding new rule has > been added. > > The subtable's max priority is updated only after a rule is inserted > to the maps, so the concurrent readers may not see the rule, as the > updated priority ordered subtable list will only be visible after the > subtable's max priority is updated. > > Similarly, the classifier's partitions are updated by the caller after > the rule is inserted to the maps, so the readers may keep skipping the > subtable until they see the updated partitions. > > Signed-off-by: Jarno Rajahalme comments in ofproto-provider.h need an update. otherwise, looks good to me. how about unifying classifier and cls_classifier? Acked-by: YAMAMOTO Takashi ___ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev
Re: [ovs-dev] [PATCH v3 2/7] lib/classifier: Simplify iteration with C99 declaration.
> Hide the cursor from the classifier iteration users and move locking to
> the iterators. This will make following RCU changes simpler, as the call
> sites of the iterators need not be changed at that point.
>
> Signed-off-by: Jarno Rajahalme

Acked-by: YAMAMOTO Takashi
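The idea of hiding the cursor can be sketched with a toy list (hypothetical names, not the classifier API): the iteration macro declares its own cursor via a C99 for-statement declaration, so call sites never touch a cursor object — which is what lets the iterator's internals (locking, and later RCU) change without touching the callers.

```c
#include <stddef.h>

/* Illustrative miniature of cursor-hiding iteration.  The cursor
 * 'NODE' is declared inside the macro's for statement (requires C99),
 * so it lives only for the duration of the loop and the caller never
 * declares or manages it. */

struct mini_node {
    int value;
    struct mini_node *next;
};

struct mini_list {
    struct mini_node *head;
};

#define MINI_LIST_FOR_EACH(NODE, LIST)                          \
    for (struct mini_node *NODE = (LIST)->head; NODE != NULL;   \
         NODE = NODE->next)

/* Example caller: sums all values without ever seeing a cursor type. */
static int
mini_list_sum(const struct mini_list *list)
{
    int sum = 0;
    MINI_LIST_FOR_EACH (node, list) {
        sum += node->value;
    }
    return sum;
}
```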
Re: [ovs-dev] [PATCH v3 1/7] lib/classifier: Use cmap.
> Use cmap instead of hmap & hindex in classifier. Performance impact
> with current locking strategy is not yet tested. Later patches will
> introduce RCU into the classifier.
>
> Signed-off-by: Jarno Rajahalme
> Acked-by: Ben Pfaff

Acked-by: YAMAMOTO Takashi
Re: [ovs-dev] [PATCH v3 3/7] lib/ovs-rcu: Export ovsrcu_synchronize().
> The following patch will add the first external user.
>
> Signed-off-by: Jarno Rajahalme

Acked-by: YAMAMOTO Takashi
Re: [ovs-dev] [PATCH v3 4/7] lib/classifier: Stylistic change.
> Rename 'nbits' as 'n_bits' to be more consistent with other count-like
> fields.
>
> Signed-off-by: Jarno Rajahalme

Acked-by: YAMAMOTO Takashi
Re: [ovs-dev] [PATCH v3 6/7] lib/classifier: RCUify prefix trie code.
> cls_set_prefix_fields() now synchronizes explicitly with the readers,
> waiting for them to finish using the old configuration before changing
> to the new configuration.
>
> Signed-off-by: Jarno Rajahalme

Acked-by: YAMAMOTO Takashi
Re: [ovs-dev] [PATCH v3 5/7] lib/classifier: Use internal mutex.
> Add an internal mutex to the struct cls_classifier, and reorganize
> classifier internal structures according to the user of each field,
> marking the fields that need to be protected by the mutex. This makes
> locking requirements easier to track, and may make lookup more memory
> efficient.
>
> After this patch there is some double locking, as the caller is taking
> the fat-rwlock, and we take a mutex internally. A following patch
> will remove the classifier fat-rwlock, removing the (double) locking
> overhead.
>
> Signed-off-by: Jarno Rajahalme

Acked-by: YAMAMOTO Takashi
Re: [ovs-dev] [GIT net] Open vSwitch
From: Pravin B Shelar
Date: Mon, 30 Jun 2014 23:33:16 -0700

> A set of fixes for net.
> The first bug is related to flow-table management. The second one is in
> the sample action. The third is related to flow stats, and the last one
> adds a gre-err handler for ovs.

Pulled, thanks Pravin.
Re: [ovs-dev] [PATCH] dpif-linux: Recheck the socket pointer existence before getting its pid.
On 8 July 2014 13:20, Alex Wang wrote:
> +/* Needs to check in case the socket pointer is changed in between
> + * of holding the locks. */

Might be worth mentioning here that a known case is when the port is
deleted while handling an upcall from that port.

Acked-by: Joe Stringer
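For readers following along, the pattern under discussion can be sketched in isolation (illustrative names only, not the dpif-linux code): any path that found the socket pointer earlier but dropped the lock must re-check the pointer after re-taking the lock, since a concurrent port deletion — e.g. while an upcall from that port is still in flight — may have cleared it in the meantime.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative sketch of the re-check pattern.  'sock' may be torn
 * down concurrently, so code that re-takes the mutex must not assume
 * the pointer it saw earlier is still there. */

struct port_sock {
    uint32_t pid;
};

static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static struct port_sock *sock;          /* Protected by 'mutex'. */

/* Installs or clears the socket (e.g. on port add/delete). */
static void
port_sock_set(struct port_sock *new_sock)
{
    pthread_mutex_lock(&mutex);
    sock = new_sock;
    pthread_mutex_unlock(&mutex);
}

/* Returns the socket's pid, or 0 if the socket vanished in between. */
static uint32_t
port_sock_pid(void)
{
    uint32_t pid = 0;

    pthread_mutex_lock(&mutex);
    if (sock != NULL) {                 /* Re-check: may be gone now. */
        pid = sock->pid;
    }
    pthread_mutex_unlock(&mutex);
    return pid;
}
```

The re-check inside `port_sock_pid()` is the whole point: taking the lock alone is not enough, because the pointer's validity could have changed while the lock was not held.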
Re: [ovs-dev] [PATCH] dpif-linux: Recheck the socket pointer existence before getting its pid.
Thanks for the review, applied to master and branch-2.3.

On Mon, Jul 7, 2014 at 8:15 PM, Joe Stringer wrote:
> On 8 July 2014 13:20, Alex Wang wrote:
>> +/* Needs to check in case the socket pointer is changed in between
>> + * of holding the locks. */
>
> Might be worth mentioning here that a known case is when the port is
> deleted while handling an upcall from that port.
>
> Acked-by: Joe Stringer