On May 20, 2014, at 2:03 PM, Ben Pfaff <b...@nicira.com> wrote: > On Tue, May 20, 2014 at 01:22:45PM -0700, Jarno Rajahalme wrote: >> >> On May 20, 2014, at 9:48 AM, Ben Pfaff <b...@nicira.com> wrote: >> >>> On Tue, May 20, 2014 at 09:12:23AM -0700, Ben Pfaff wrote: >>>> On Fri, May 16, 2014 at 04:38:02PM -0700, Jarno Rajahalme wrote: >>>>>> + if (b->nodes[slot] == node) { >>>>>> + cmap_set_bucket(b, slot, cmap_node_next_protected(node), hash); >>>>> >>>>> ?hash? is not changing here, so could just set the nodes: >>>>> >>>>> b->nodes[slot] = cmap_node_next_protected(node); >>>>> >>>>> Btw, what is the rationale that the nodes pointers are not RCU >>>>> pointers? If they were, it would feel possible to combine this special >>>>> case with the loop below. >>>> >>>> Good points. I'll work on that for a v3. >>> >>> After thinking a little further, I am not sure that it would become >>> possible to combine them, because I think that the cases are a little >>> different: >>> >>> * If we are removing the last node with a hash, which is usually >>> the case if node == b->nodes[slot], then we want to make sure >>> that from the viewpoint of any reader that this is atomic >>> (that is, the change to ->hashes[] and ->nodes[]), by >>> incrementing the counter around the change. I am not >>> absolutely certain that this is required, but the cost is >>> minimal so, lacking confidence, I prefer to do it. >> >> My point was that the hash need not be changed, if the dup insertion >> code is also changed to not care about the node value (see the other >> comment I just sent). > > OK. Like this? > > diff --git a/lib/cmap.c b/lib/cmap.c > index 30a6e2d..d291ec5 100644 > --- a/lib/cmap.c > +++ b/lib/cmap.c > @@ -648,7 +648,8 @@ cmap_remove__(struct cmap_impl *impl, struct cmap_node > *node, > } > > if (b->nodes[slot] == node) { > - cmap_set_bucket(b, slot, cmap_node_next_protected(node), hash); > + b->nodes[slot] = cmap_node_next_protected(node); > + atomic_thread_fence(memory_order_release);
Yes. Would an ovsrcu_set() include the release barrier, if nodes[] were RCU pointers? (This relating to my earlier question about that.) > } else { > struct cmap_node *iter = b->nodes[slot]; > for (;;) { > >>> >>> * Otherwise, we are shortening a linked list, but not >>> eliminating its slot, which does not affect readers in the >>> same way, so an ordinary RCU store should suffice. >>> >>> What I'm increasingly uncertain about is whether cmap_find() is correct. >>> The intention is that the atomic reads of the counters before and after >>> checking the nodes and the hashes should ensure that the cache lines >>> occupied by the buckets are stable. I think that's going to be true in >>> practice, with current compiler technology. But I am not sure that the >>> atomic_reads on the counters actually means that the node and hash reads >>> can't be moved outside the counter reads. If not, though, making the >>> node reads atomic (via RCU) and even the hash reads atomic (by making >>> them atomic_uint32s) wouldn't help. I think that the only thing that >>> would help would be adding explicit acquire and release barriers. That >>> might actually, in conjunction with good comments, be clearer than what >>> we have now. >>> >>> What do you think? >> >> atomic_read() implies memory_order_seq_cst, which is stronger than >> memory_order_acquire. An memory_order_acquire should guarantee that >> no memory operations after the atomic_read are moved before it, so >> we read the data only after reading an even counter. When we re-read >> the counter to verify it has not changed, an acquire barrier would >> guarantee no memory operations after the check are moved before it, >> but it would be possible for the memory operations before it to be >> moved after it. So the check needs a release barrier, even if we are >> only reading, to guarantee that memory operations before the check >> are not moved after it. The memory_order_seq_cst implied by >> atomic_read() does that, but is too strong, a memory_order_acq_rel >> should suffice, or even memory_order_acquire for the >> read_even_counter, and a memory_order_release for a >> ?check_counter()?. Makes sense? > > Yes. Like this? > > diff --git a/lib/cmap.c b/lib/cmap.c > index 30a6e2d..d291ec5 100644 > --- a/lib/cmap.c > +++ b/lib/cmap.c > @@ -245,11 +245,11 @@ cmap_is_empty(const struct cmap *cmap) > } > > static uint32_t > -read_counter(struct cmap_bucket *bucket) > +read_counter(struct cmap_bucket *bucket, memory_order order) > { > uint32_t counter; > > - atomic_read(&bucket->counter, &counter); > + atomic_read_explicit(&bucket->counter, &counter, order); > return counter; > } > > @@ -259,7 +259,7 @@ read_even_counter(struct cmap_bucket *bucket) > uint32_t counter; > > do { > - counter = read_counter(bucket); > + counter = read_counter(bucket, memory_order_acquire); > } while (OVS_UNLIKELY(counter & 1)); > > return counter; > @@ -291,7 +291,7 @@ retry: > for (i = 0; i < CMAP_K; i++) { > struct cmap_node *node = b1->nodes[i]; > if (node && b1->hashes[i] == hash) { > - if (OVS_UNLIKELY(read_counter(b1) != c1)) { > + if (OVS_UNLIKELY(read_counter(b1, memory_order_release) != c1)) { I would find the code more balanced if a read_even_counter() was paired with a “read_same_counter()”, rather than a raw read_counter. To be more specific, the explicit barrier should be visible or hidden by an utility function the same way in both cases. > goto retry; > } > return node; > @@ -303,15 +303,15 @@ retry: > for (i = 0; i < CMAP_K; i++) { > struct cmap_node *node = b2->nodes[i]; > if (node && b2->hashes[i] == hash) { > - if (OVS_UNLIKELY(read_counter(b2) != c2)) { > + if (OVS_UNLIKELY(read_counter(b2, memory_order_release) != c2)) { > goto retry; > } > return node; > } > } > > - if (OVS_UNLIKELY(read_counter(b1) != c1) || > - OVS_UNLIKELY(read_counter(b2) != c2)) { > + if (OVS_UNLIKELY(read_counter(b1, memory_order_release) != c1) || > + OVS_UNLIKELY(read_counter(b2, memory_order_release) != c2)) { > goto retry; > } > return NULL; > > I realize that all these incrementals are a big mess. I've pushed the > overall series to my "ovs-reviews" repo at > https://github.com/blp/ovs-reviews in the "cmap" branch. I'll happily > repost a v3 if that is easiest for you. I’d rather you tell me that you have addresses the questions I made and push it :-) I think it would be best at this point get this in and gain some experience with it. We can then enhance it as we see opportunities for it. Jarno
_______________________________________________ dev mailing list dev@openvswitch.org http://openvswitch.org/mailman/listinfo/dev