[dpdk-users] How to find an IPv4 and IPv6 hash function
gaetan.rivet at 6wind.com
Wed Oct 3 00:02:27 CEST 2018
Routing will generally perform the same operation, matching IP addresses
and gathering results marked for these addresses.
Usually, hashing is done to map large sets to a limited number of
buckets. It works well at scale for choosing RX queues, but you will
have poor data locality in a naive hash-table, as it would usually be
implemented with a linked-list.
You could then either design a hash-table with good data-locality even with
a lot of collisions, but it would probably be simpler to use a trie, as
you suggested I think by referencing dynamic arrays. You did not say
whether your monitoring was done on src or dst addresses, but if you
suggest using a trie, I assume you are only using one and not both.
If you need to match both, then you will be better off by hashing the
addresses and taking care of data locality in your hash-table.
Otherwise, you will find an extensive litterature going back decades on
matching IP addresses, this has been studied for a long time now.
The classic solution is generally a trie if you only need to match one IP
address, with path-compression used to avoid using too much memory,
which will kill performances if you do a lot of insertion and deletion
in your trie. So you need to know how long those entries will be live in
your monitor and how often they would disappear. If you are never
removing any, path compression seems a good point to start.
You can look into librte_lpm. I have never used it, but "Longest Prefix
Match" is referring to the operation of matching an IP address.
On Tue, Oct 02, 2018 at 07:20:39PM +0000, Ali Volkan Atli wrote:
> First of all, thanks you Pavel
> Actually I didn't ask the question properly. I have already used the l3fwd application as basis for 5-tuple "flows" in my application. Now I just need a data structure to keep 32-bit IPv4 address or 128-bit IPv6 address, just one IP address. I can use jenkins hash, murmurhash, fast fibonacci etc.. Also I have just found out that DPDK has the jenkins hash functions. Before asking the question to this e-mail group, I was thinking of why I did not use a dynamic array for 32-bit IPv4 address instead of hash table. Thus I could fetch the data in one shot. It still makes sense but of course it doesn't work for IPv6. Anyway I asked the question here because DPDK library designed for network and maybe someone could suggest a paper, blog, a DPDK API for just one IP-based data-structure.
> I apologize if the e-mail group is the wrong place.
> Best regards
> - Volkan
> From: Pavel Chuprikov [pschuprikov at gmail.com]
> Sent: Tuesday, October 2, 2018 8:32 PM
> To: Ali Volkan Atli
> Cc: users at dpdk.org
> Subject: Re: [dpdk-users] How to find an IPv4 and IPv6 hash function
> The l3fwd application  might serve as a starting point and in particular l3fwd_em.c .
> : http://doc.dpdk.org/guides/sample_app_ug/l3_forward.html
> : http://doc.dpdk.org/api/examples_2l3fwd_2l3fwd_em_8c-example.html
> Hope it helps,
> пн, 1 окт. 2018 г. в 14:37, Ali Volkan Atli <Volkan.Atli at argela.com.tr<mailto:Volkan.Atli at argela.com.tr>>:
> Dear DPDK users
> I am trying to write a code to monitor user data traffic by using DPDK. Each user has one IP address which can be IPv4 or IPv6. Therefore I'm thinking of creating two hash tables for IPv4 and IPv6. The key of these hash tables will be an IPv4 or IPv6 address and the data of these hash tables will be user traffic (uplink and downlink)
> First, is there any API, example etc.. to calculate IPv4 hash and IPv6 hash on DPDK library? If not, does anyone know where I might find any paper or writing how I can calculate IPv4 and IPv6 hash? Also is there a more appropriate data structure than the hash table for the problem?
> Thanks in advance.
> - Volkan
More information about the users