[dpdk-dev] [PATCH v2 5/7] hash: add extendable bucket feature

Ananyev, Konstantin konstantin.ananyev at intel.com
Thu Sep 27 13:27:21 CEST 2018



> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Bruce Richardson
> Sent: Thursday, September 27, 2018 12:15 PM
> To: Honnappa Nagarahalli <Honnappa.Nagarahalli at arm.com>
> Cc: Wang, Yipeng1 <yipeng1.wang at intel.com>; dev at dpdk.org; michel at digirati.com.br
> Subject: Re: [dpdk-dev] [PATCH v2 5/7] hash: add extendable bucket feature
> 
> On Thu, Sep 27, 2018 at 04:23:48AM +0000, Honnappa Nagarahalli wrote:
> >
> >
> > > -----Original Message-----
> > > From: Yipeng Wang <yipeng1.wang at intel.com>
> > > Sent: Friday, September 21, 2018 12:18 PM
> > > To: bruce.richardson at intel.com
> > > Cc: dev at dpdk.org; yipeng1.wang at intel.com; michel at digirati.com.br;
> > > Honnappa Nagarahalli <Honnappa.Nagarahalli at arm.com>
> > > Subject: [PATCH v2 5/7] hash: add extendable bucket feature
> > >
> > > In use cases where the hash table capacity needs to be guaranteed, the
> > > extendable bucket feature can be used to store extra keys in linked lists
> > > when a conflict happens. This is similar in concept to the extendable
> > > bucket hash table in the packet framework.
> > >
> > > This commit adds the extendable bucket feature. The user can turn it on
> > > or off through the extra flag field at table creation time.
> > >
> > > The extendable bucket table is composed of buckets that can be linked to
> > > the buckets of the main table. When extendable buckets are enabled, the
> > > table utilization can always achieve 100%.
> > IMO, referring to this as 'table utilization' suggests memory-utilization
> > efficiency. Please consider rewording it to say that all of the configured
> > number of entries will be accommodated.
> >
> > > Although keys that end up in the ext buckets may have a longer lookup
> > > time, such cases should be rare thanks to the cuckoo algorithm.
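
For illustration, a minimal sketch of enabling the feature at creation time.
The flag name RTE_HASH_EXTRA_FLAGS_EXT_TABLE is taken from this patch; the
parameter values below are only examples, not a recommended configuration:

#include <rte_hash.h>
#include <rte_jhash.h>

/* Sketch only: create a hash table with the extendable bucket feature
 * turned on through the extra_flag field. Parameter values are
 * illustrative. */
static struct rte_hash *
create_guaranteed_capacity_table(void)
{
	struct rte_hash_parameters params = {
		.name = "ext_bkt_tbl",
		.entries = 1024,		/* capacity to be guaranteed */
		.key_len = sizeof(uint32_t),
		.hash_func = rte_jhash,
		.hash_func_init_val = 0,
		.socket_id = 0,
		/* New flag from this patch: keys that cannot be placed by
		 * cuckoo displacement go to linked ext buckets instead. */
		.extra_flag = RTE_HASH_EXTRA_FLAGS_EXT_TABLE,
	};

	return rte_hash_create(&params);
}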
> > >
> > > Signed-off-by: Yipeng Wang <yipeng1.wang at intel.com>
> > > ---
> > >  lib/librte_hash/rte_cuckoo_hash.c | 326 +++++++++++++++++++++++++++++++++-----
> > >  lib/librte_hash/rte_cuckoo_hash.h |   5 +
> > >  lib/librte_hash/rte_hash.h        |   3 +
> > >  3 files changed, 292 insertions(+), 42 deletions(-)
> > >
> > > diff --git a/lib/librte_hash/rte_cuckoo_hash.c b/lib/librte_hash/rte_cuckoo_hash.c
> > > index f7b86c8..616900b 100644
> > > --- a/lib/librte_hash/rte_cuckoo_hash.c
> > > +++ b/lib/librte_hash/rte_cuckoo_hash.c
> > > @@ -31,6 +31,10 @@
> > >  #include "rte_hash.h"
> > >  #include "rte_cuckoo_hash.h"
> > >
> > > +#define FOR_EACH_BUCKET(CURRENT_BKT, START_BUCKET)                    \
> > > +	for (CURRENT_BKT = START_BUCKET;                              \
> > > +		CURRENT_BKT != NULL;                                  \
> > > +		CURRENT_BKT = CURRENT_BKT->next)
> > >
> > >  TAILQ_HEAD(rte_hash_list, rte_tailq_entry);
> > >
> > > @@ -63,6 +67,14 @@ rte_hash_find_existing(const char *name)
> > >  	return h;
> > >  }
> > >
> > > +static inline struct rte_hash_bucket *
> > > +rte_hash_get_last_bkt(struct rte_hash_bucket *lst_bkt)
> > > +{
> > > +	while (lst_bkt->next != NULL)
> > > +		lst_bkt = lst_bkt->next;
> > > +	return lst_bkt;
> > > +}
> > > +
> > >  void rte_hash_set_cmp_func(struct rte_hash *h, rte_hash_cmp_eq_t func)
> > >  {
> > >  	h->cmp_jump_table_idx = KEY_CUSTOM;
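
As an aside, a small usage sketch of the bucket-chain helpers added above.
The counting function itself is hypothetical (not part of the patch) and
would have to live in rte_cuckoo_hash.c next to the macro, since both
FOR_EACH_BUCKET and struct rte_hash_bucket are internal:

/* Hypothetical helper, for illustration only: count how many buckets
 * (the primary bucket plus any chained ext buckets) a chain contains,
 * using the FOR_EACH_BUCKET macro introduced by this patch. */
static unsigned int
count_bucket_chain(struct rte_hash_bucket *primary_bkt)
{
	struct rte_hash_bucket *cur_bkt;
	unsigned int n = 0;

	FOR_EACH_BUCKET(cur_bkt, primary_bkt)
		n++;

	return n;
}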
> > > @@ -85,13 +97,17 @@ rte_hash_create(const struct rte_hash_parameters *params)
> > >  	struct rte_tailq_entry *te = NULL;
> > >  	struct rte_hash_list *hash_list;
> > >  	struct rte_ring *r = NULL;
> > > +	struct rte_ring *r_ext = NULL;
> > >  	char hash_name[RTE_HASH_NAMESIZE];
> > >  	void *k = NULL;
> > >  	void *buckets = NULL;
> > > +	void *buckets_ext = NULL;
> > >  	char ring_name[RTE_RING_NAMESIZE];
> > > +	char ext_ring_name[RTE_RING_NAMESIZE];
> > >  	unsigned num_key_slots;
> > >  	unsigned i;
> > >  	unsigned int hw_trans_mem_support = 0, multi_writer_support = 0;
> > > +	unsigned int ext_table_support = 0;
> > >  	unsigned int readwrite_concur_support = 0;
> > >
> > >  	rte_hash_function default_hash_func = (rte_hash_function)rte_jhash;
> > > @@ -124,6 +140,9 @@ rte_hash_create(const struct rte_hash_parameters *params)
> > >  		multi_writer_support = 1;
> > >  	}
> > >
> > > +	if (params->extra_flag & RTE_HASH_EXTRA_FLAGS_EXT_TABLE)
> > > +		ext_table_support = 1;
> > > +
> > >  	/* Store all keys and leave the first entry as a dummy entry for
> > >  	 * lookup_bulk */
> > >  	if (multi_writer_support)
> > >  		/*
> > > @@ -145,6 +164,24 @@ rte_hash_create(const struct rte_hash_parameters *params)
> > >  		goto err;
> > >  	}
> > >
> > > +	const uint32_t num_buckets = rte_align32pow2(params->entries) /
> > > +					RTE_HASH_BUCKET_ENTRIES;
> > > +
> > > +	snprintf(ext_ring_name, sizeof(ext_ring_name), "HT_EXT_%s",
> > > +						params->name);
> > Can be inside the if statement below.
> >
> > > +	/* Create ring for extendable buckets. */
> > > +	if (ext_table_support) {
> > > +		r_ext = rte_ring_create(ext_ring_name,
> > > +				rte_align32pow2(num_buckets + 1),
> > > +				params->socket_id, 0);
> > > +
> > > +		if (r_ext == NULL) {
> > > +			RTE_LOG(ERR, HASH, "ext buckets memory allocation "
> > > +							"failed\n");
> > > +			goto err;
> > > +		}
> > > +	}
> > > +
> > >  	snprintf(hash_name, sizeof(hash_name), "HT_%s", params->name);
> > >
> > >  	rte_rwlock_write_lock(RTE_EAL_TAILQ_RWLOCK);
> > > @@ -177,18 +214,34 @@ rte_hash_create(const struct rte_hash_parameters *params)
> > >  		goto err_unlock;
> > >  	}
> > >
> > > -	const uint32_t num_buckets = rte_align32pow2(params->entries)
> > > -		/ RTE_HASH_BUCKET_ENTRIES;
> > > -
> > >  	buckets = rte_zmalloc_socket(NULL,
> > >  			num_buckets * sizeof(struct rte_hash_bucket),
> > >  			RTE_CACHE_LINE_SIZE, params->socket_id);
> > >
> > >  	if (buckets == NULL) {
> > > -		RTE_LOG(ERR, HASH, "memory allocation failed\n");
> > > +		RTE_LOG(ERR, HASH, "buckets memory allocation failed\n");
> > >  		goto err_unlock;
> > >  	}
> > >
> > > +	/* Allocate same number of extendable buckets */
> > IMO, we are allocating too much memory to support this feature, especially
> > when we claim that keys ending up in the extendable table are a rare
> > occurrence. By doubling the memory we are effectively saying that the main
> > table might have only 50% utilization. It will also significantly increase
> > the cycles required to iterate the complete hash table (via the
> > rte_hash_iterate API) even when we expect the extendable table to contain
> > very few entries.
> >
> > I am wondering if we can provide options to control the amount of extra
> > memory that gets allocated, and make the memory allocation dynamic (i.e.
> > on demand). I think this also goes well with the general direction DPDK is
> > taking - allocate resources as needed rather than allocating all the
> > resources during initialization.
> >
> 
> Given that adding new entries should not normally be a fast-path operation,

Umm, I don't think I agree with that.
There are plenty of cases where add/delete speed is important.
Konstantin

> how about allowing memory allocation in add itself? Why not initialize with
> a fairly small number of extra bucket entries, and then, each time they are
> all used, double the number of entries? That will give efficient resource
> scaling, I think.
> 
> /Bruce
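
To make the on-demand idea concrete, a rough sketch of such a doubling
scheme. All of the names here (ext_bkt_pool, ext_bkt_alloc) are made up for
illustration and are not part of the patch or of DPDK; the sketch only relies
on the bucket's next pointer added by this patch, and a real implementation
would use socket-aware, cache-aligned allocation (rte_malloc) and hold the
table's writer lock while growing:

#include <stdlib.h>

#include "rte_cuckoo_hash.h"	/* internal: struct rte_hash_bucket */

/* Illustrative only: a free list of ext buckets that is grown by
 * doubling whenever it runs dry, instead of pre-allocating one ext
 * bucket per primary bucket up front. */
struct ext_bkt_pool {
	struct rte_hash_bucket *free_list;	/* unused ext buckets */
	unsigned int allocated;			/* buckets allocated so far */
};

static struct rte_hash_bucket *
ext_bkt_alloc(struct ext_bkt_pool *pool)
{
	struct rte_hash_bucket *bkt;
	unsigned int i;

	if (pool->free_list == NULL) {
		/* Double the pool; start small on the first growth. */
		unsigned int grow = pool->allocated ? pool->allocated : 4;
		struct rte_hash_bucket *blk = calloc(grow, sizeof(*blk));

		if (blk == NULL)
			return NULL;
		/* Push the fresh buckets onto the free list. */
		for (i = 0; i < grow; i++) {
			blk[i].next = pool->free_list;
			pool->free_list = &blk[i];
		}
		pool->allocated += grow;
	}

	bkt = pool->free_list;
	pool->free_list = bkt->next;
	bkt->next = NULL;
	return bkt;
}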

