[dpdk-dev] [PATCH] hash table: add a bucket iterator function

Fu, Qiaobin qiaobinf at bu.edu
Tue Jul 31 08:09:02 CEST 2018


Hi Yipeng,

Thanks for the feedbacks!

> On Jul 30, 2018, at 4:24 PM, Wang, Yipeng1 <yipeng1.wang at intel.com> wrote:
> 
> Hi, Qiaobin, 
> 
> Thanks for the patch. If I understand correctly your use case is to use hash table as a "cache" that new entries should evict stale ones automatically. Could you be more specific on your use scenarios so that I can have more context? 
> 

Actually, we didn’t use hash table as a “cache”, and there is no automation to evict stale ones. Instead, the functionrte_hash_bucket_iterate() allows the callers to iterate over the hash table in an incremental way, i.e., one bucket by another bucket, so that the caller doesn’t have to iterate over the whole hash table in one time. This way can avoid creating hiccups and achieve better cache performance. One possible use scenario is when DDoS attacks happen, one may want to take more CPU time to process the packets, thus iterating over the hash table incrementally is desirable.

> We are actually working on an extendable version of rte_hash which will link extra buckets to current table if the table is full. As long as the table is sized appropriately, there will be no conflict issues thus you don’t need to replace old entries. Will this solve your issue? Also, if the “cache” mode is what you want, we have the membership library “rte_member”. Is it applicable to your case? 

I agree that adding extra buckets to current table when the table is full can alleviate this scenario. Indeed, we also considered this solution before coming up our proposed solution. However, it’s still highly desirable to have a bucket iterator function. Considering the scenario where the machine’s memory is limited and cannot always allocate new memory to the hash table (e.g., DDoS attacks, switch measurement tasks, etc.), a solution is to allow the callers evict some less important (the criteria for the eviction is based on the caller’s needs) entries.

Indeed, we don’t have the “cache” mode, our implementation is a way to achieve better cache performance. So, the rte_member won’t help here.

> 
> w.r.t. the patch set you proposed, my concern is that the new API will expose too much internals to the user, e.g. the bucket/entries structure should be implementation dependent. Exposing internal structures would prevent improving the implementation down the road unless we keep changing API which is not recommended.  
> 

The functions we add here give a way for the callers to iterate over the specific bucket, which potentially contains the entry. These functions can be made general enough to allow callers to heuristically evict some entries, and add the new ones to the hash table. Otherwise, there is no way to evict some less important entries.

Hope this clarifies the patch.

Best,
Qiaobin

> 
>> -----Original Message-----
>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Wiles, Keith
>> Sent: Sunday, July 29, 2018 6:17 AM
>> To: Qiaobin Fu <qiaobinf at bu.edu>
>> Cc: Richardson, Bruce <bruce.richardson at intel.com>; De Lara Guarch, Pablo <pablo.de.lara.guarch at intel.com>; dev at dpdk.org;
>> michel at digirati.com.br; doucette at bu.edu
>> Subject: Re: [dpdk-dev] [PATCH] hash table: add a bucket iterator function
>> 
>> 
>> 
>>> On Jul 28, 2018, at 12:48 PM, Qiaobin Fu <qiaobinf at bu.edu> wrote:
>>> 
>>> Function rte_hash_bucket_iterate() enables callers to
>>> incrementally iterate over the hash table bucket by bucket,
>>> so that it can avoid creating hiccups and thrashing the cache
>>> of the processor.
>>> 
>>> This patch mainly deals with cases in which
>>> the hash table is full and one needs to decide if the incoming
>>> entry is more valuable than the entries already in the hash table.
>>> 
>>> Signed-off-by: Qiaobin Fu <qiaobinf at bu.edu>
>>> Reviewed-by: Cody Doucette <doucette at bu.edu>
>>> Reviewed-by: Michel Machado <michel at digirati.com.br>
>>> ---
>>> lib/librte_hash/rte_cuckoo_hash.c    | 60 +++++++++++++++++++++++++
>>> lib/librte_hash/rte_hash.h           | 66 ++++++++++++++++++++++++++++
>>> lib/librte_hash/rte_hash_version.map |  4 ++
>>> 3 files changed, 130 insertions(+)
>>> 
>>> diff --git a/lib/librte_hash/rte_cuckoo_hash.c b/lib/librte_hash/rte_cuckoo_hash.c
>>> index a07543a29..2b90f3593 100644
>>> --- a/lib/librte_hash/rte_cuckoo_hash.c
>>> +++ b/lib/librte_hash/rte_cuckoo_hash.c
>>> @@ -1160,3 +1160,63 @@ rte_hash_iterate(const struct rte_hash *h, const void **key, void **data, uint32
>>> 
>>> 	return position - 1;
>>> }
>>> +
>>> +int
>>> +rte_hash_get_num_buckets(struct rte_hash *h)
>>> +{
>>> +	RETURN_IF_TRUE((h == NULL), -EINVAL);
>>> +	return h->num_buckets;
>>> +}
>>> +
>>> +uint32_t
>>> +rte_hash_get_primary_bucket(const struct rte_hash *h, hash_sig_t sig)
>>> +{
>>> +	return sig & h->bucket_bitmask;
>>> +}
>>> +
>>> +uint32_t
>>> +rte_hash_get_secondary_bucket(const struct rte_hash *h, hash_sig_t sig)
>>> +{
>>> +	return rte_hash_secondary_hash(sig) & h->bucket_bitmask;
>>> +}
>> 
>> The above two functions should have the RETURN_IF_TRUE((h == NULL), -EINVAL); statements just like the first one.
>> 
>>> +
>>> +int32_t
>>> +rte_hash_bucket_iterate(const struct rte_hash *h,
>>> +	const void **key, void **data, uint32_t *bidx, uint32_t *next)
>>> +{
>>> +	uint32_t position;
>>> +	struct rte_hash_key *next_key;
>>> +
>>> +	RETURN_IF_TRUE(((h == NULL) || (key == NULL) || (data == NULL) ||
>>> +		(bidx == NULL) || (next == NULL)), -EINVAL);
>>> +
>>> +	/* Out of bounds. */
>>> +	if (*bidx >= h->num_buckets || *next >= RTE_HASH_BUCKET_ENTRIES)
>>> +		goto next_bucket;
>>> +
>>> +	/* If current position is empty, go to the next one. */
>>> +	while (h->buckets[*bidx].key_idx[*next] == EMPTY_SLOT) {
>>> +		(*next)++;
>>> +		/* End of bucket. */
>>> +		if (*next >= RTE_HASH_BUCKET_ENTRIES)
>>> +			goto next_bucket;
>>> +	}
>>> +
>>> +	/* Get position of entry in key table. */
>>> +	position = h->buckets[*bidx].key_idx[*next];
>>> +	next_key = (struct rte_hash_key *) ((char *)h->key_store +
>>> +				position * h->key_entry_size);
>>> +	/* Return key and data. */
>>> +	*key = next_key->key;
>>> +	*data = next_key->pdata;
>>> +
>>> +	/* Increment iterator. */
>>> +	(*next)++;
>>> +
>>> +	return position - 1;
>>> +
>>> +next_bucket:
>>> +	*bidx = (*bidx + 1) >= h->num_buckets ? 0 : *bidx + 1;
>>> +	*next = 0;
>> 
>> If this is an error case why are we setting *next to zero and incrementing *bidx ?
>> I would expect we should not change the values on an error because they cannot debug the values on return.
>> 
>>> +	return -ENOENT;
>>> +}
>>> diff --git a/lib/librte_hash/rte_hash.h b/lib/librte_hash/rte_hash.h
>>> index f71ca9fbf..329bf42d6 100644
>>> --- a/lib/librte_hash/rte_hash.h
>>> +++ b/lib/librte_hash/rte_hash.h
>>> @@ -94,6 +94,17 @@ rte_hash_create(const struct rte_hash_parameters *params);
>>> */
>>> void rte_hash_set_cmp_func(struct rte_hash *h, rte_hash_cmp_eq_t func);
>>> 
>>> +/**
>>> + * Get the number of buckets in the hash table.
>>> + * @param h
>>> + *   Hash table for which the function queries.
>>> + * @return
>>> + *   The number of buckets in the hash table h.
>>> + *    - EINVAL - invalid parameter passed to function
>>> + */
>>> +int
>> 
>> Needs to change to ‘int __rte_experimental’
>> 
>>> +rte_hash_get_num_buckets(struct rte_hash *h);
>>> +
>>> /**
>>> * Find an existing hash table object and return a pointer to it.
>>> *
>>> @@ -261,6 +272,34 @@ int
>>> rte_hash_get_key_with_position(const struct rte_hash *h, const int32_t position,
>>> 			       void **key);
>>> 
>>> +/**
>>> + * Get the primary bucket index given the precomputed hash value.
>>> + * This operation is multi-thread safe.
>>> + *
>>> + * @param h
>>> + *   Hash table to get the key from.
>>> + * @param sig
>>> + *   Precomputed hash value.
>>> + * @return
>>> + *   The primary bucket index.
>> 
>> Comment the error case here.
>>> + */
>>> +uint32_t
>> 
>> Needs to change to ‘uint32_t __rte_experimental’
>>> +rte_hash_get_primary_bucket(const struct rte_hash *h, hash_sig_t sig);
>>> +
>>> +/**
>>> + * Get the secondary bucket index given the precomputed hash value.
>>> + * This operation is multi-thread safe.
>>> + *
>>> + * @param h
>>> + *   Hash table to get the key from.
>>> + * @param sig
>>> + *   Precomputed hash value.
>>> + * @return
>>> + *   The primary bucket index.
>> 
>> Comment the error case here.
>>> + */
>>> +uint32_t
>> 
>> Needs to change to ‘uint32_t __rte_experimental’
>>> +rte_hash_get_secondary_bucket(const struct rte_hash *h, hash_sig_t sig);
>>> +
>>> /**
>>> * Find a key-value pair in the hash table.
>>> * This operation is multi-thread safe.
>>> @@ -419,6 +458,33 @@ rte_hash_lookup_bulk(const struct rte_hash *h, const void **keys,
>>> */
>>> int32_t
>>> rte_hash_iterate(const struct rte_hash *h, const void **key, void **data, uint32_t *next);
>>> +
>>> +/**
>>> + * Iterate through the buckets in hash table, returning key-value pairs.
>>> + *
>>> + * @param h
>>> + *   Hash table to iterate
>>> + * @param key
>>> + *   Output containing the key where current iterator
>>> + *   was pointing at
>>> + * @param data
>>> + *   Output containing the data associated with key.
>>> + *   Returns NULL if data was not stored.
>>> + * @param bidx
>>> + *   Pointer to the current bucket index. Should be 0 to start
>>> + *   iterating the hash table. Iterator is incremented after
>>> + *   RTE_HASH_BUCKET_ENTRIES calls of this function.
>> 
>> The wording of the last line seem odd to me, should the ‘of’ be removed?
>> 
>>> + * @param next
>>> + *   Pointer to iterator. Should be 0 to start iterating the bucket.
>>> + *   Iterator is incremented after each call of this function.
>>> + * @return
>>> + *   Position where key was stored, if successful.
>>> + *   - -EINVAL if the parameters are invalid.
>>> + *   - -ENOENT if end of the bucket.
>>> + */
>>> +int32_t
>> 
>> Needs to change to ‘int32_t __rte_experimental’ and why use ‘int32_t' when everywhere else we are using ‘int'?
>> 
>>> +rte_hash_bucket_iterate(const struct rte_hash *h,
>>> +	const void **key, void **data, uint32_t *bidx, uint32_t *next);
>> 
>> Should have a blank line here before the #ifdef.
>>> #ifdef __cplusplus
>>> }
>>> #endif
>>> diff --git a/lib/librte_hash/rte_hash_version.map b/lib/librte_hash/rte_hash_version.map
>>> index 52a2576f9..7d48f32c9 100644
>>> --- a/lib/librte_hash/rte_hash_version.map
>>> +++ b/lib/librte_hash/rte_hash_version.map
>>> @@ -15,6 +15,10 @@ DPDK_2.0 {
>>> 	rte_hash_lookup;
>>> 	rte_hash_lookup_bulk;
>>> 	rte_hash_lookup_with_hash;
>>> +	rte_hash_get_num_buckets;
>>> +	rte_hash_get_primary_bucket;
>>> +	rte_hash_get_secondary_bucket;
>>> +	rte_hash_bucket_iterate;
>> 
>> These must be added to a EXPERIMENTAL clause in the file and later you can remove the experimental indictor.
>> 
>>> 
>>> 	local: *;
>>> };
>>> --
>>> 2.17.1
>>> 
>> 
>> Regards,
>> Keith
> 



More information about the dev mailing list