[dpdk-dev] [PATCH v2 5/7] hash: add extendable bucket feature

Bruce Richardson bruce.richardson at intel.com
Thu Sep 27 13:15:02 CEST 2018


On Thu, Sep 27, 2018 at 04:23:48AM +0000, Honnappa Nagarahalli wrote:
> 
> 
> > -----Original Message-----
> > From: Yipeng Wang <yipeng1.wang at intel.com>
> > Sent: Friday, September 21, 2018 12:18 PM
> > To: bruce.richardson at intel.com
> > Cc: dev at dpdk.org; yipeng1.wang at intel.com; michel at digirati.com.br;
> > Honnappa Nagarahalli <Honnappa.Nagarahalli at arm.com>
> > Subject: [PATCH v2 5/7] hash: add extendable bucket feature
> >
> > In use cases that hash table capacity needs to be guaranteed, the extendable
> > bucket feature can be used to contain extra keys in linked lists when conflict
> > happens. This is similar concept to the extendable bucket hash table in packet
> > framework.
> >
> > This commit adds the extendable bucket feature. User can turn it on or off
> > through the extra flag field during table creation time.
> >
> > Extendable bucket table composes of buckets that can be linked list to current
> > main table. When extendable bucket is enabled, the table utilization can
> > always acheive 100%.
> IMO, referring to this as 'table utilization' indicates an efficiency about memory utilization. Please consider changing this to indicate that all of the configured number of entries will be accommodated?
> 
> > Although keys ending up in the ext buckets may have longer look up time, they
> > should be rare due to the cuckoo algorithm.
> >
> > Signed-off-by: Yipeng Wang <yipeng1.wang at intel.com>
> > ---
> >  lib/librte_hash/rte_cuckoo_hash.c | 326
> > +++++++++++++++++++++++++++++++++-----
> >  lib/librte_hash/rte_cuckoo_hash.h |   5 +
> >  lib/librte_hash/rte_hash.h        |   3 +
> >  3 files changed, 292 insertions(+), 42 deletions(-)
> >
> > diff --git a/lib/librte_hash/rte_cuckoo_hash.c
> > b/lib/librte_hash/rte_cuckoo_hash.c
> > index f7b86c8..616900b 100644
> > --- a/lib/librte_hash/rte_cuckoo_hash.c
> > +++ b/lib/librte_hash/rte_cuckoo_hash.c
> > @@ -31,6 +31,10 @@
> >  #include "rte_hash.h"
> >  #include "rte_cuckoo_hash.h"
> >
> > +#define FOR_EACH_BUCKET(CURRENT_BKT, START_BUCKET)
> > \
> > +for (CURRENT_BKT = START_BUCKET;                                      \
> > +CURRENT_BKT != NULL;                                          \
> > +CURRENT_BKT = CURRENT_BKT->next)
> >
> >  TAILQ_HEAD(rte_hash_list, rte_tailq_entry);
> >
> > @@ -63,6 +67,14 @@ rte_hash_find_existing(const char *name)
> >  return h;
> >  }
> >
> > +static inline struct rte_hash_bucket *
> > +rte_hash_get_last_bkt(struct rte_hash_bucket *lst_bkt) {
> > +while (lst_bkt->next != NULL)
> > +lst_bkt = lst_bkt->next;
> > +return lst_bkt;
> > +}
> > +
> >  void rte_hash_set_cmp_func(struct rte_hash *h, rte_hash_cmp_eq_t func)  {
> >  h->cmp_jump_table_idx = KEY_CUSTOM;
> > @@ -85,13 +97,17 @@ rte_hash_create(const struct rte_hash_parameters
> > *params)
> >  struct rte_tailq_entry *te = NULL;
> >  struct rte_hash_list *hash_list;
> >  struct rte_ring *r = NULL;
> > +struct rte_ring *r_ext = NULL;
> >  char hash_name[RTE_HASH_NAMESIZE];
> >  void *k = NULL;
> >  void *buckets = NULL;
> > +void *buckets_ext = NULL;
> >  char ring_name[RTE_RING_NAMESIZE];
> > +char ext_ring_name[RTE_RING_NAMESIZE];
> >  unsigned num_key_slots;
> >  unsigned i;
> >  unsigned int hw_trans_mem_support = 0, multi_writer_support = 0;
> > +unsigned int ext_table_support = 0;
> >  unsigned int readwrite_concur_support = 0;
> >
> >  rte_hash_function default_hash_func = (rte_hash_function)rte_jhash;
> > @@ -124,6 +140,9 @@ rte_hash_create(const struct rte_hash_parameters
> > *params)
> >  multi_writer_support = 1;
> >  }
> >
> > +if (params->extra_flag & RTE_HASH_EXTRA_FLAGS_EXT_TABLE)
> > +ext_table_support = 1;
> > +
> >  /* Store all keys and leave the first entry as a dummy entry for
> > lookup_bulk */
> >  if (multi_writer_support)
> >  /*
> > @@ -145,6 +164,24 @@ rte_hash_create(const struct rte_hash_parameters
> > *params)
> >  goto err;
> >  }
> >
> > +const uint32_t num_buckets = rte_align32pow2(params->entries) /
> > +RTE_HASH_BUCKET_ENTRIES;
> > +
> > +snprintf(ext_ring_name, sizeof(ext_ring_name), "HT_EXT_%s",
> > +params-
> > >name);
> Can be inside the if statement below.
> 
> > +/* Create ring for extendable buckets. */
> > +if (ext_table_support) {
> > +r_ext = rte_ring_create(ext_ring_name,
> > +rte_align32pow2(num_buckets + 1),
> > +params->socket_id, 0);
> > +
> > +if (r_ext == NULL) {
> > +RTE_LOG(ERR, HASH, "ext buckets memory allocation
> > "
> > +"failed\n");
> > +goto err;
> > +}
> > +}
> > +
> >  snprintf(hash_name, sizeof(hash_name), "HT_%s", params->name);
> >
> >  rte_rwlock_write_lock(RTE_EAL_TAILQ_RWLOCK);
> > @@ -177,18 +214,34 @@ rte_hash_create(const struct rte_hash_parameters
> > *params)
> >  goto err_unlock;
> >  }
> >
> > -const uint32_t num_buckets = rte_align32pow2(params->entries)
> > -/ RTE_HASH_BUCKET_ENTRIES;
> > -
> >  buckets = rte_zmalloc_socket(NULL,
> >  num_buckets * sizeof(struct rte_hash_bucket),
> >  RTE_CACHE_LINE_SIZE, params->socket_id);
> >
> >  if (buckets == NULL) {
> > -RTE_LOG(ERR, HASH, "memory allocation failed\n");
> > +RTE_LOG(ERR, HASH, "buckets memory allocation failed\n");
> >  goto err_unlock;
> >  }
> >
> > +/* Allocate same number of extendable buckets */
> IMO, we are allocating too much memory to support this feature. Especially, when we claim that keys ending up in the extendable table is a rare occurrence. By doubling the memory we are effectively saying that the main table might have 50% utilization. It will also significantly increase the cycles required to iterate the complete hash table (in rte_hash_iterate API) even when we expect that the extendable table contains very few entries.
> 
> I am wondering if we can provide options to control the amount of extra memory that gets allocated and make the memory allocation dynamic (or on demand basis). I think this also goes well with the general direction DPDK is taking - allocate resources as needed rather than allocating all the resources during initialization.
> 

Given that adding new entries should not normally be a fast-path function,
how about allowing memory allocation in add itself. Why not initialize with
a fairly small number of extra bucket entries, and then each time they are
all used, double the number of entries. That will give efficient resource
scaling, I think.

/Bruce


More information about the dev mailing list