[dpdk-dev] [PATCH 0/3] add new Double Word Key hash table
Medvedkin, Vladimir
vladimir.medvedkin at intel.com
Thu Mar 26 18:28:29 CET 2020
Hi Yipeng, Stephen, all,
On 17/03/2020 19:52, Wang, Yipeng1 wrote:
>> -----Original Message-----
>> From: Stephen Hemminger <stephen at networkplumber.org>
>> Sent: Monday, March 16, 2020 12:33 PM
>> To: Medvedkin, Vladimir <vladimir.medvedkin at intel.com>
>> Cc: Morten Brørup <mb at smartsharesystems.com>; dev at dpdk.org;
>> Ananyev, Konstantin <konstantin.ananyev at intel.com>; Wang, Yipeng1
>> <yipeng1.wang at intel.com>; Gobriel, Sameh <sameh.gobriel at intel.com>;
>> Richardson, Bruce <bruce.richardson at intel.com>; Suanming Mou
>> <suanmingm at mellanox.com>; Olivier Matz <olivier.matz at 6wind.com>;
>> Xueming(Steven) Li <xuemingl at mellanox.com>; Andrew Rybchenko
>> <arybchenko at solarflare.com>; Asaf Penso <asafp at mellanox.com>; Ori Kam
>> <orika at mellanox.com>
>> Subject: Re: [dpdk-dev] [PATCH 0/3] add new Double Word Key hash table
>>
>> On Mon, 16 Mar 2020 18:27:40 +0000
>> "Medvedkin, Vladimir" <vladimir.medvedkin at intel.com> wrote:
>>
>>> Hi Morten,
>>>
>>>
>>> On 16/03/2020 14:39, Morten Brørup wrote:
>>>>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Vladimir
>>>>> Medvedkin
>>>>> Sent: Monday, March 16, 2020 2:38 PM
>>>>>
>>>>> Currently DPDK has a special implementation of a hash table for
>>>>> 4 byte keys which is called FBK hash. Unfortunately its main
>>>>> drawback is that it only supports 2 byte values.
>>>>> The new implementation called DWK (double word key) hash supports 8
>>>>> byte values, which is enough to store a pointer.
>>>>>
>>>>> It would also be nice to get feedback on whether to leave the old
>>>>> FBK and new DWK implementations, or whether to deprecate the old
>> one?
>>>> <rant on>
>>>>
>>>> Who comes up with these names?!?
>>>>
>>>> FBK (Four Byte Key) and DWK (Double Word Key) is supposed to mean
>> the same. Could you use 32 somewhere in the name instead, like in int32_t,
>> instead of using a growing list of creative synonyms for the same thing?
>> Pretty please, with a cherry on top!
>>>
>>> That's true, at first I named it as fbk2, but then it was decided to
>>> rename it "dwk", so that there was no confusion with the existing FBK
>>> library. Naming suggestions are welcome!
>>>
>>>> And if the value size is fixed too, perhaps the name should also indicate
>> the value size.
>>>> <rant off>
>>>>
>>>> It's a shame we don't have C++ class templates available in DPDK...
>>>>
>>>> In other news, Mellanox has sent an RFC for an "indexed memory pool"
>> library [1] to conserve memory by using uintXX_t instead of pointers, so
>> perhaps a variant of a 32 bit key hash library with 32 bit values (in addition to
>> 16 bit values in FBK and 64 bit in DWK) would be nice combination with that
>> library.
>>>> [1]: http://mails.dpdk.org/archives/dev/2019-October/147513.html
>>>>
>>>>
>>>> Med venlig hilsen / kind regards
>>>> - Morten Brørup
>>>>
>> Why is this different (or better) than existing rte_hash.
>> Having more flavors is not necessarily a good thing (except in Gelato)
> [Wang, Yipeng]
> Hi, Vladimir,
> As Stephen mentioned, I think it is good idea to explain the benefit of this new type of hash table more explicitly such as
> Specific use cases, differences with current rte_hash, and performance numbers, etc.
The main reason for this new hash library is performance. As I mentioned
earlier, the current rte_fbk implementation is pretty fast but it has a
number of drawbacks such as 2 byte values and limited collision
resolving capabilities. On the other hand, rte_hash (cuckoo hash)
doesn't have this drawbacks but at the cost of lower performance
comparing to rte_fbk.
If I understand correctly, performance penalty are due to :
1. Load two buckets
2. First compare signatures
3. If signature comparison hits get a key index and find memory location
with a key itself and get the key
4. Using indirect call to memcmp() to compare two uint32_t.
The new proposed 4 byte key hash table doesn't have rte_fbk drawbacks
while offers the same performance as rte_fbk.
Regarding use cases, in rte_ipsec_sad we are using rte_hash with 4 byte
key size. Replacing it with a new implementation gives about 30% in
performance.
The main disadvantage comparing to rte_hash is some performance
degradation with high average table utilization due to chain resolving
for 5th and subsequent collision.
--
Regards,
Vladimir
More information about the dev
mailing list