> -----Original Message-----
> From: Richardson, Bruce
> Sent: Thursday, September 27, 2018 1:28 PM
> To: Ananyev, Konstantin <konstantin.anan...@intel.com>
> Cc: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>; Wang, Yipeng1
> <yipeng1.w...@intel.com>; dev@dpdk.org;
> mic...@digirati.com.br
> Subject: Re: [dpdk-dev] [PATCH v2 5/7] hash: add extendable bucket feature
>
> On Thu, Sep 27, 2018 at 12:27:21PM +0100, Ananyev, Konstantin wrote:
> >
> >
> > > -----Original Message-----
> > > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Bruce Richardson
> > > Sent: Thursday, September 27, 2018 12:15 PM
> > > To: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
> > > Cc: Wang, Yipeng1 <yipeng1.w...@intel.com>; dev@dpdk.org;
> > > mic...@digirati.com.br
> > > Subject: Re: [dpdk-dev] [PATCH v2 5/7] hash: add extendable bucket feature
> > >
> > > On Thu, Sep 27, 2018 at 04:23:48AM +0000, Honnappa Nagarahalli wrote:
> > > >
> > > >
> > > > > -----Original Message-----
> > > > > From: Yipeng Wang <yipeng1.w...@intel.com>
> > > > > Sent: Friday, September 21, 2018 12:18 PM
> > > > > To: bruce.richard...@intel.com
> > > > > Cc: dev@dpdk.org; yipeng1.w...@intel.com; mic...@digirati.com.br;
> > > > > Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
> > > > > Subject: [PATCH v2 5/7] hash: add extendable bucket feature
> > > > >
> > > > > In use cases where hash table capacity needs to be guaranteed,
> > > > > the extendable bucket feature can be used to store extra keys in
> > > > > linked lists when a conflict happens. This is a similar concept
> > > > > to the extendable bucket hash table in the packet framework.
> > > > >
> > > > > This commit adds the extendable bucket feature. Users can turn
> > > > > it on or off through the extra flag field at table creation
> > > > > time.
> > > > >
> > > > > The extendable bucket table is composed of buckets that can be
> > > > > linked, as lists, to the main table. When extendable buckets are
> > > > > enabled, the table utilization can always achieve 100%.
> > > > IMO, referring to this as 'table utilization' suggests an
> > > > efficiency of memory utilization. Please consider changing this to
> > > > indicate that all of the configured number of entries will be
> > > > accommodated.
> > > >
> > > > > Although keys ending up in the ext buckets may have longer
> > > > > lookup times, such keys should be rare thanks to the cuckoo
> > > > > algorithm.
> > > > >
> > > > > Signed-off-by: Yipeng Wang <yipeng1.w...@intel.com>
> > > > > ---
> > > > >  lib/librte_hash/rte_cuckoo_hash.c | 326 +++++++++++++++++++++++++++++++++-----
> > > > >  lib/librte_hash/rte_cuckoo_hash.h |   5 +
> > > > >  lib/librte_hash/rte_hash.h        |   3 +
> > > > >  3 files changed, 292 insertions(+), 42 deletions(-)
> > > > >
> > > > > diff --git a/lib/librte_hash/rte_cuckoo_hash.c
> > > > > b/lib/librte_hash/rte_cuckoo_hash.c
> > > > > index f7b86c8..616900b 100644
> > > > > --- a/lib/librte_hash/rte_cuckoo_hash.c
> > > > > +++ b/lib/librte_hash/rte_cuckoo_hash.c
> > > > > @@ -31,6 +31,10 @@
> > > > > #include "rte_hash.h"
> > > > > #include "rte_cuckoo_hash.h"
> > > > >
> > > > > +#define FOR_EACH_BUCKET(CURRENT_BKT, START_BUCKET)      \
> > > > > +	for (CURRENT_BKT = START_BUCKET;                \
> > > > > +		CURRENT_BKT != NULL;                    \
> > > > > +		CURRENT_BKT = CURRENT_BKT->next)
> > > > >
> > > > > TAILQ_HEAD(rte_hash_list, rte_tailq_entry);
> > > > >
> > > > > @@ -63,6 +67,14 @@ rte_hash_find_existing(const char *name)
> > > > > return h;
> > > > > }
> > > > >
> > > > > +static inline struct rte_hash_bucket *
> > > > > +rte_hash_get_last_bkt(struct rte_hash_bucket *lst_bkt)
> > > > > +{
> > > > > +	while (lst_bkt->next != NULL)
> > > > > +		lst_bkt = lst_bkt->next;
> > > > > +	return lst_bkt;
> > > > > +}
> > > > > +
> > > > > void rte_hash_set_cmp_func(struct rte_hash *h, rte_hash_cmp_eq_t func)
> > > > > {
> > > > > 	h->cmp_jump_table_idx = KEY_CUSTOM;
> > > > > @@ -85,13 +97,17 @@ rte_hash_create(const struct rte_hash_parameters
> > > > > *params)
> > > > > struct rte_tailq_entry *te = NULL;
> > > > > struct rte_hash_list *hash_list;
> > > > > struct rte_ring *r = NULL;
> > > > > +struct rte_ring *r_ext = NULL;
> > > > > char hash_name[RTE_HASH_NAMESIZE];
> > > > > void *k = NULL;
> > > > > void *buckets = NULL;
> > > > > +void *buckets_ext = NULL;
> > > > > char ring_name[RTE_RING_NAMESIZE];
> > > > > +char ext_ring_name[RTE_RING_NAMESIZE];
> > > > > unsigned num_key_slots;
> > > > > unsigned i;
> > > > > unsigned int hw_trans_mem_support = 0, multi_writer_support = 0;
> > > > > +unsigned int ext_table_support = 0;
> > > > > unsigned int readwrite_concur_support = 0;
> > > > >
> > > > > rte_hash_function default_hash_func = (rte_hash_function)rte_jhash;
> > > > > @@ -124,6 +140,9 @@ rte_hash_create(const struct rte_hash_parameters
> > > > > *params)
> > > > > multi_writer_support = 1;
> > > > > }
> > > > >
> > > > > +if (params->extra_flag & RTE_HASH_EXTRA_FLAGS_EXT_TABLE)
> > > > > +ext_table_support = 1;
> > > > > +
> > > > > /* Store all keys and leave the first entry as a dummy entry for
> > > > > lookup_bulk */
> > > > > if (multi_writer_support)
> > > > > /*
> > > > > @@ -145,6 +164,24 @@ rte_hash_create(const struct rte_hash_parameters
> > > > > *params)
> > > > > goto err;
> > > > > }
> > > > >
> > > > > +	const uint32_t num_buckets = rte_align32pow2(params->entries) /
> > > > > +				RTE_HASH_BUCKET_ENTRIES;
> > > > > +
> > > > > +	snprintf(ext_ring_name, sizeof(ext_ring_name), "HT_EXT_%s",
> > > > > +		params->name);
> > > > Can be inside the if statement below.
> > > >
> > > > > +	/* Create ring for extendable buckets. */
> > > > > +	if (ext_table_support) {
> > > > > +		r_ext = rte_ring_create(ext_ring_name,
> > > > > +				rte_align32pow2(num_buckets + 1),
> > > > > +				params->socket_id, 0);
> > > > > +
> > > > > +		if (r_ext == NULL) {
> > > > > +			RTE_LOG(ERR, HASH, "ext buckets memory allocation "
> > > > > +				"failed\n");
> > > > > +			goto err;
> > > > > +		}
> > > > > +	}
> > > > > +
> > > > > snprintf(hash_name, sizeof(hash_name), "HT_%s", params->name);
> > > > >
> > > > > rte_rwlock_write_lock(RTE_EAL_TAILQ_RWLOCK);
> > > > > @@ -177,18 +214,34 @@ rte_hash_create(const struct rte_hash_parameters
> > > > > *params)
> > > > > goto err_unlock;
> > > > > }
> > > > >
> > > > > -	const uint32_t num_buckets = rte_align32pow2(params->entries)
> > > > > -		/ RTE_HASH_BUCKET_ENTRIES;
> > > > > -
> > > > > 	buckets = rte_zmalloc_socket(NULL,
> > > > > 			num_buckets * sizeof(struct rte_hash_bucket),
> > > > > 			RTE_CACHE_LINE_SIZE, params->socket_id);
> > > > >
> > > > > 	if (buckets == NULL) {
> > > > > -		RTE_LOG(ERR, HASH, "memory allocation failed\n");
> > > > > +		RTE_LOG(ERR, HASH, "buckets memory allocation failed\n");
> > > > > 		goto err_unlock;
> > > > > 	}
> > > > >
> > > > > +/* Allocate same number of extendable buckets */
> > > > IMO, we are allocating too much memory to support this feature,
> > > > especially when we claim that keys ending up in the extendable
> > > > table are a rare occurrence. By doubling the memory we are
> > > > effectively saying that the main table might have 50% utilization.
> > > > It will also significantly increase the cycles required to iterate
> > > > the complete hash table (in the rte_hash_iterate API), even when
> > > > we expect that the extendable table contains very few entries.
> > > >
> > > > I am wondering if we can provide options to control the amount of
> > > > extra memory that gets allocated, and make the memory allocation
> > > > dynamic (on an on-demand basis). I think this also goes well with
> > > > the general direction DPDK is taking - allocate resources as
> > > > needed rather than allocating all the resources during
> > > > initialization.
> > > >
> > >
> > > Given that adding new entries should not normally be a fast-path function,
> >
> > Umm, I don't think I agree with that.
> > There are plenty of cases where add/delete speed is important.
> > Konstantin
>
> True, I suppose.
> Perhaps then the best approach is to give a couple of options to the
> developer. Allow specifying an initial amount of memory, and then an option
> to allow the memory to grow or not. If the cost of memory allocation is a
> problem for a specific app, then they can provide a large default and not
> allow growing, while other apps, for whom speed is not that critical, can
> provide a small default and allow growing.
>
> Does that seem reasonable?
Yes, I think it does.
Konstantin