On Fri, May 16, 2014 at 04:38:02PM -0700, Jarno Rajahalme wrote:
> Acked-by: Jarno Rajahalme <jrajaha...@nicira.com>
Thanks.

> Some comments below, though,

Responses below.  (It would be easier to spot your comments if you
trimmed more carefully.)

> On May 8, 2014, at 4:50 PM, Ben Pfaff <b...@nicira.com> wrote:
> > +/* The implementation of a concurrent hash map. */
> > +struct cmap_impl {
> > +    unsigned int n;             /* Number of in-use elements. */
> > +    unsigned int max_n;         /* Max elements before enlarging. */
> > +    unsigned int max_load;      /* Max load as fraction of UINT32_MAX. */
>
> This seems to be a constant, or is this changeable through the API?  I
> see that space saving is not an issue here, so maybe this does not
> matter.

You're right, it is now a constant.  The previously posted version had
a user-controllable max load.  I removed that for this version but
forgot to remove this and a few other bits that were necessary to make
it user-controllable.  I folded in the following incremental:

diff --git a/lib/cmap.c b/lib/cmap.c
index c974a13..91c7d04 100644
--- a/lib/cmap.c
+++ b/lib/cmap.c
@@ -147,18 +147,17 @@ BUILD_ASSERT_DECL(sizeof(struct cmap_bucket) == CACHE_LINE_SIZE);
 /* Default maximum load factor (as a fraction of UINT32_MAX + 1) before
  * enlarging a cmap.  Reasonable values lie between about 75% and 93%.  Smaller
  * values waste memory; larger values increase the average insertion time. */
-#define CMAP_DEFAULT_MAX_LOAD ((uint32_t) (UINT32_MAX * .85))
+#define CMAP_MAX_LOAD ((uint32_t) (UINT32_MAX * .85))
 
 /* The implementation of a concurrent hash map. */
 struct cmap_impl {
     unsigned int n;             /* Number of in-use elements. */
     unsigned int max_n;         /* Max elements before enlarging. */
-    unsigned int max_load;      /* Max load as fraction of UINT32_MAX. */
     uint32_t mask;              /* Number of 'buckets', minus one. */
     uint32_t basis;             /* Basis for rehashing client's hash values. */
 
     /* Padding to make cmap_impl exactly one cache line long. */
-    uint8_t pad[CACHE_LINE_SIZE - sizeof(unsigned int) * 5];
+    uint8_t pad[CACHE_LINE_SIZE - sizeof(unsigned int) * 4];
 
     struct cmap_bucket buckets[];
 };
@@ -190,13 +189,13 @@ cmap_get_impl(const struct cmap *cmap)
 }
 
 static uint32_t
-calc_max_n(uint32_t mask, uint32_t max_load)
+calc_max_n(uint32_t mask)
 {
-    return ((uint64_t) (mask + 1) * CMAP_K * max_load) >> 32;
+    return ((uint64_t) (mask + 1) * CMAP_K * CMAP_MAX_LOAD) >> 32;
 }
 
 static struct cmap_impl *
-cmap_impl_create(uint32_t mask, uint32_t max_load)
+cmap_impl_create(uint32_t mask)
 {
     struct cmap_impl *impl;
 
@@ -205,8 +204,7 @@ cmap_impl_create(uint32_t mask, uint32_t max_load)
     impl = xzalloc_cacheline(sizeof *impl
                              + (mask + 1) * sizeof *impl->buckets);
     impl->n = 0;
-    impl->max_n = calc_max_n(mask, max_load);
-    impl->max_load = max_load;
+    impl->max_n = calc_max_n(mask);
     impl->mask = mask;
     impl->basis = random_uint32();
 
@@ -217,7 +215,7 @@ cmap_impl_create(uint32_t mask, uint32_t max_load)
 void
 cmap_init(struct cmap *cmap)
 {
-    ovsrcu_set(&cmap->impl, cmap_impl_create(0, CMAP_DEFAULT_MAX_LOAD));
+    ovsrcu_set(&cmap->impl, cmap_impl_create(0));
 }
 
 /* Destroys 'cmap'.
@@ -687,7 +685,7 @@ cmap_rehash(struct cmap *cmap, uint32_t mask)
     struct cmap_impl *old = cmap_get_impl(cmap);
     struct cmap_impl *new;
 
-    new = cmap_impl_create(mask, old->max_load);
+    new = cmap_impl_create(mask);
     ovs_assert(old->n < new->max_n);
 
     while (!cmap_try_rehash(old, new)) {
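(An aside for readers following along: the fixed-point arithmetic in
calc_max_n() can look opaque.  Here is a standalone sketch, not part of
the patch, of how the multiply-and-shift computes the 85% limit; the
value of CMAP_K below is an assumption for illustration, since the real
value depends on the bucket layout:

    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    #define CMAP_K 5      /* Assumed slots per bucket, for illustration. */
    #define CMAP_MAX_LOAD ((uint32_t) (UINT32_MAX * .85))

    static uint32_t
    calc_max_n(uint32_t mask)
    {
        /* CMAP_MAX_LOAD is the fraction .85 scaled by 2**32, so a 64-bit
         * multiply followed by a 32-bit right shift computes
         * (mask + 1) * CMAP_K * .85 in pure integer arithmetic. */
        return ((uint64_t) (mask + 1) * CMAP_K * CMAP_MAX_LOAD) >> 32;
    }

    int
    main(void)
    {
        /* 1024 buckets * 5 slots = 5120 slots; 85% of 5120 is 4352, and
         * the fixed-point version prints 4351 due to truncation. */
        printf("%"PRIu32"\n", calc_max_n(1023));
        return 0;
    }

So with a mask of 1023, that is, 1024 buckets, enlarging kicks in at
about 4352 elements.)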
> > +/* Returns the rehashed value for 'hash' within 'impl'.  (See also "Hash
> > + * Functions" at the top of this file.) */
> > +static uint32_t
> > +rehash(const struct cmap_impl *impl, uint32_t hash)
> > +{
> > +    return mhash_finish(impl->basis, hash);
>
> I thought about this more in terms of "finishing" the user-provided
> (unfinished) hash with the current (temporally constant, for lack of a
> better way of putting it) "finisher" value.  This would yield:
>
>     return mhash_finish(hash, impl->basis);
>
> This should not make a real difference, though.

I guess that you're right.  As you say, it shouldn't really matter,
though, and since I did my performance testing with this approach, I
think I'll leave it that way.

> > +}
> > +
> > +static struct cmap_impl *
> > +cmap_get_impl(const struct cmap *cmap)
> > +{
> > +    return ovsrcu_get(struct cmap_impl *, &cmap->impl);
> > +}
> > +
> > +static uint32_t
> > +calc_max_n(uint32_t mask, uint32_t max_load)
>
> As said, 'max_load' seems to be constant.

Fixed, see above.

> > +static bool
> > +cmap_insert_dup(struct cmap_node *new_node, uint32_t hash,
> > +                struct cmap_bucket *b)
> > +{
> > +    int i;
> > +
> > +    for (i = 0; i < CMAP_K; i++) {
> > +        struct cmap_node *node = b->nodes[i];
> > +        if (b->hashes[i] == hash && node) {
> > +            for (;;) {
> > +                struct cmap_node *next = cmap_node_next_protected(node);
> > +                if (!next) {
> > +                    ovsrcu_set(&node->next, new_node);
> > +                    return true;
> > +                }
> > +                node = next;
> > +            }
>
> It would be faster to add the dup to the beginning of the list instead:
>
>     for (i = 0; i < CMAP_K; i++) {
>         if (b->hashes[i] == hash) {
>             ovsrcu_set(&new_node->next, b->nodes[i]);
>             b->nodes[i] = new_node;
>             return true;
>         }
>     }
>
> Syncing via the counter should be unnecessary, as the hash value was
> already set.

That's a good point about the counter.  That's the reason that I
switched to this method in v2.  I'll work on this for a v3.

> > +        }
> > +    }
> > +    return false;
> > +}
> > +
> > +/* Searches 'b' for an empty slot.  If successful, stores 'node' and 'hash' in
> > + * the slot and returns true.  Otherwise, returns false. */
> > +static bool
> > +cmap_insert_bucket(struct cmap_node *node, uint32_t hash,
> > +                   struct cmap_bucket *b)
> > +{
> > +    int i;
> > +
> > +    for (i = 0; i < CMAP_K; i++) {
> > +        if (!b->nodes[i]) {
> > +            cmap_set_bucket(b, i, node, hash);
> > +            return true;
> > +        }
> > +    }
> > +    return false;
> > +}
> > +
> > +/* Returns the other bucket that b->nodes[slot] could occupy in 'impl'.  (This
> > + * might be the same as 'b'.) */
> > +static struct cmap_bucket *
> > +other_bucket(struct cmap_impl *impl, struct cmap_bucket *b, int slot)
> > +{
> > +    uint32_t h1 = rehash(impl, b->hashes[slot]);
> > +    uint32_t h2 = other_hash(h1);
> > +    uint32_t b_idx = b - impl->buckets;
> > +    uint32_t other_h = (h1 & impl->mask) == b_idx ? h2 : h1;
> > +
> > +    return &impl->buckets[other_h & impl->mask];
> > +}
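(Another aside: other_bucket() relies on other_hash() being an
involution, that is, other_hash(other_hash(h)) == h, so that each node
has a stable pair of candidate buckets and either one leads back to the
other.  The patch's actual definition of other_hash() is not quoted in
this thread; the rotate-by-16 below is an assumed stand-in that has the
required property:

    #include <assert.h>
    #include <stdint.h>

    /* Assumed stand-in for other_hash(): rotating a 32-bit value by 16
     * bits swaps its halves, and doing it twice swaps them back. */
    static uint32_t
    other_hash(uint32_t hash)
    {
        return (hash << 16) | (hash >> 16);
    }

    int
    main(void)
    {
        uint32_t h;

        for (h = 0; h < 1000000; h++) {
            assert(other_hash(other_hash(h)) == h);
        }
        return 0;
    }

Any involution would do; the point is only that the "other" relation is
symmetric.)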
> > + * successful breadth-first search always finds a *shortest* path through the
> > + * hash table, and a shortest path will never contain loops, so it avoids that
> > + * problem entirely.
> > + */
> > +static bool
> > +cmap_insert_bfs(struct cmap_impl *impl, struct cmap_node *new_node,
> > +                uint32_t hash, struct cmap_bucket *b1, struct cmap_bucket *b2)
> > +{
> > +    enum { MAX_PATH = 4 };
> > +
> > +    /* A path from 'start' to 'end' via the 'n' steps in 'slots[]'.
> > +     *
> > +     * One can follow the path via:
> > +     *
> > +     *     struct cmap_bucket *b;
> > +     *     int i;
> > +     *
> > +     *     b = path->start;
> > +     *     for (i = 0; i < path->n; i++) {
> > +     *         b = other_bucket(impl, b, path->slots[i]);
> > +     *     }
> > +     *     ovs_assert(b == path->end);
> > +     */
> > +    struct cmap_path {
> > +        struct cmap_bucket *start; /* First bucket along the path. */
> > +        struct cmap_bucket *end;   /* Last bucket on the path. */
> > +        uint8_t slots[MAX_PATH];   /* Slots used for each hop. */
> > +        int n;                     /* Number of slots[]. */
> > +    };
> > +
> > +    enum { MAX_QUEUE = 500 };
>
> Is MAX_QUEUE a function of MAX_PATH and CMAP_K?  (7^4 == 2401, 5^4 == 625)

I added a comment:

    /* We need to limit the amount of work we do trying to find a path.  It
     * might actually be impossible to rearrange the cmap, and after some time
     * it is likely to be easier to rehash the entire cmap.
     *
     * This value of MAX_QUEUE is an arbitrary limit suggested by one of the
     * references.  Empirically, it seems to work OK. */

I couldn't quickly find the particular reference that suggested this,
but I remember seeing 500 suggested in one of them.  It seems to work
OK: you can see that rehashing is rare if you instrument test-cmap to
print something when it rehashes.

> > +    struct cmap_path queue[MAX_QUEUE];
> > +    int head = 0;
> > +    int tail = 0;
> > +
> > +    /* Add 'b1' and 'b2' as starting points for the search. */
> > +    queue[head].start = b1;
> > +    queue[head].end = b1;
> > +    queue[head].n = 0;
> > +    head++;
> > +    if (b1 != b2) {
> > +        queue[head].start = b2;
> > +        queue[head].end = b2;
> > +        queue[head].n = 0;
> > +        head++;
> > +    }
> > +
> > +    while (tail < head) {
>
> I feel that 'head' and 'tail' are reversed here.  A (FIFO) queue is
> serviced at the head, and new entries are added at the tail.  We're
> done when the head reaches the tail.

I have always informally reasoned that the tail follows behind the
head, as the tail of an animal follows behind the head of the animal.
I did not know that there was disagreement on this.  According to
Wikipedia, there are reasonable people on both sides of the discussion:
    http://en.wikipedia.org/wiki/FIFO#Head_or_tail_first
Since OVS already contains other queues that follow the convention
used here (see e.g. lib/byteq.h), I prefer to be consistent and leave
it as is.

> > +        if (path->n < MAX_PATH && head < MAX_QUEUE) {
>
> Was it so that the v1 queue was circular, and you decided to drop that
> for clarity?  I might remember it wrong as well.

Yes, the v1 queue was circular.  I reasoned that, at most, a couple of
items would wrap around circularly, so I decided that it was slightly
simpler to just stop when we would run out of room.
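(To make that convention concrete, here is a minimal sketch, mine
rather than code from the patch, of a queue in the style described
above: new entries go in at 'head', the next entry to service is at
'tail', and the queue is empty when the tail catches up with the head:

    #include <assert.h>

    enum { MAX_QUEUE = 500 };

    struct queue {
        int entries[MAX_QUEUE];
        int head;               /* Where the next entry will be added. */
        int tail;               /* The next entry to be serviced. */
    };

    /* Adds 'value' at the head.  As in cmap_insert_bfs(), the queue is
     * not circular, so the caller must ensure there is room. */
    static void
    enqueue(struct queue *q, int value)
    {
        assert(q->head < MAX_QUEUE);
        q->entries[q->head++] = value;
    }

    /* Removes and returns the oldest entry, at the tail. */
    static int
    dequeue(struct queue *q)
    {
        assert(q->tail < q->head);
        return q->entries[q->tail++];
    }

    int
    main(void)
    {
        struct queue q = { .head = 0, .tail = 0 };

        enqueue(&q, 1);
        enqueue(&q, 2);
        assert(dequeue(&q) == 1);
        assert(dequeue(&q) == 2);
        assert(q.tail == q.head);   /* Empty again. */
        return 0;
    }

The tail follows behind the head, which matches the byteq convention
mentioned above.)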
> > +static bool
> > +cmap_try_insert(struct cmap_impl *impl, struct cmap_node *node, uint32_t hash)
> > +{
> > +    uint32_t h1 = rehash(impl, hash);
> > +    uint32_t h2 = other_hash(h1);
> > +    struct cmap_bucket *b1 = &impl->buckets[h1 & impl->mask];
> > +    struct cmap_bucket *b2 = &impl->buckets[h2 & impl->mask];
> > +
>
> The likelihood of b1 == b2 is low enough that it is not worth avoiding
> calling the insert functions twice for the same bucket?

Yes, that's my reasoning.

> > +        if (b->nodes[slot] == node) {
> > +            cmap_set_bucket(b, slot, cmap_node_next_protected(node), hash);
>
> 'hash' is not changing here, so this could just set the node:
>
>     b->nodes[slot] = cmap_node_next_protected(node);
>
> Btw, what is the rationale that the node pointers are not RCU
> pointers?  If they were, it would feel possible to combine this
> special case with the loop below.

Good points.  I'll work on that for a v3.

Thanks,

Ben.
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev