[dpdk-dev] [PATCH v4 0/4] hash: add extendable bucket and partial key hashing

Dharmik Thakkar Dharmik.Thakkar at arm.com
Wed Oct 3 21:05:21 CEST 2018


Tested OK on Qualcomm Centriq 2400.

> On Sep 28, 2018, at 12:23 PM, Yipeng Wang <yipeng1.wang at intel.com> wrote:
> 
> This patch set made two major optimizations over the current rte_hash
> library.
> 
> First, it adds Extendable Bucket Table feature: a new structure that can
> accommodate keys that failed to get inserted into the main hash table due to
> the unlikely event of excessive hash collisions. The hash table buckets will
> get extended using a linked list to host these keys. This new design will
> guarantee insertion of 100% of the keys for a given hash table size with
> minimal overhead. A new flag value is added for user to indicate if the
> extendable bucket feature should be enabled or not. The linked list buckets is
> similar concept to the extendable bucket hash table in packet framework.
> In details, for insertion, the linked buckets will be used to store the keys
> that fail to get in the primary and the secondary bucket and the cuckoo path
> could not find an empty location for the maximum path length (small
> probability). For lookup, the key is checked first in the primary, then the
> secondary, then if the secondary is extended the linked list is traversed
> for a possible match.
> 
> Second, the patch set changes the current hashing algorithm to be "partial-key
> hashing". Partial-key hashing is the concept from Bin Fan, et al.'s paper
> "MemC3: Compact and Concurrent MemCache with Dumber Caching and Smarter
> Hashing". Instead of storing both 32-bit signature and alternative signature
> in the bucket, we only store a small 16-bit signature and calculate the
> alternative bucket index by XORing the signature with the current bucket index.
> This doubles the hash table memory efficiency since now one bucket
> only occupies one cache line instead of two in the original design.
> 
> v3->v4:
> 1. hash: Revise commit message to be more clear for "utilization" (Honnappa)
> 2. hash: in delete key function, return bucket change to use rte_ring_sp_enqueue
> instead of rte_ring_mp_enqueue, since it is already protected inside locks.
> 3. hash: update rte_hash_iterate comments (Honnappa)
> 4. hash: Add a new commit to fix race condition in the rte_hash_iterate (Honnappa)
> 5. hash/test: during utilization test, double check rte_hash_cnt returns correct
> value (Honnappa)
> 6. hash: for partial-key-hashing commit, break the get_buckets_index function
> into three. It may make future extension easier (Honnappa)
> 7. hash: change the comment for typedef uint32_t hash_sig_t to be more clear
> to users (Honnappa)
> 
> v2->v3:
> The first four commits were separated from this patch set as another
> independent patch set:
> https://mails.dpdk.org/archives/dev/2018-September/113118.html
> 1. hash: move snprintf for ext_ring name under the ext_table condition.
> 2. hash: fix memory leak by freeing ext_buckets in rte_hash_free.
> 3. hash: after failing cuckoo path, search not only ext buckets, but also the
> secondary bucket first to see if there may be an empty location now.
> 4. hash: totally rewrote the key deleting function logic. If the deleted key was
> not in the last bucket of the linked list when ext table enabled, the last
> entry in the linked list will be placed in the vacant slot from the deleted
> key. The purpose is to compact the entries in the linked list to be more close
> to the main table. This is to make sure that not many extendable buckets are
> wasted with only one or two entries after some time of running, also benefit
> lookup speed.
> 5. Other minor coding style/comments improvements.
> 
> V1->V2:
> 1. hash: Rewrite rte_hash_get_last_bkt to be more concise.
> 2. hash: Reorder the rte_hash struct to align cache line better.
> 3. test: Minor changes in auto test to add key insertion failure check during
> iteration test.
> 4. test: Add new commit to fix read-write test non-consecutive core issue.
> 4. hash: Add a new commit to remove unnecessary code introduced by previous
> patches.
> 5. hash: Comments improvement and coding style improvements over multiple
> places.
> 
> Signed-off-by: Yipeng Wang <yipeng1.wang at intel.com>
> 
> Yipeng Wang (4):
>  hash: fix race condition in iterate
>  hash: add extendable bucket feature
>  test/hash: implement extendable bucket hash test
>  hash: use partial-key hashing
> 
> lib/librte_hash/rte_cuckoo_hash.c | 585 ++++++++++++++++++++++++++++----------
> lib/librte_hash/rte_cuckoo_hash.h |  11 +-
> lib/librte_hash/rte_hash.h        |   8 +-
> test/test/test_hash.c             | 159 ++++++++++-
> test/test/test_hash_perf.c        | 114 ++++++--
> 5 files changed, 683 insertions(+), 194 deletions(-)
> 
> -- 
> 2.7.4
> 
Acked-by: Dharmik Thakkar <dharmik.thakkar at arm.com>



More information about the dev mailing list