[dpdk-dev] 2.3 Roadmap

Matthew Hall mhall at mhcomputing.net
Wed Dec 2 16:47:41 CET 2015


On Wed, Dec 02, 2015 at 12:35:16PM +0000, Bruce Richardson wrote:
> Hi Matthew,
> 
> thanks for the info, but I'm not sure I understand it correctly. It seems to
> me that you are mostly referring to the depths/sizes of the tables being used,
> rather than to the "data-size" being stored in each entry, which was actually
> what I was asking about. Is that correct? If so, it seems that - looking initially
> at IPv4 LPM only - you are more looking for an increase in the number of tbl8's
> for lookup, rather than necessarily an increase in the 8-bit user data being stored
> with each entry. [And assuming similar interest for v6] Am I right in 
> thinking this?
> 
> Thanks,
> /Bruce

This question comes from a difference in perspective between routing / 
networking and security. I actually do need to increase the size of the 
user data, as I did in my patches.

1. There is an assumption, when LPM is used for routing, that many millions of 
inputs might map to a smaller number of outputs.

2. This assumption is not true in the security ecosystem. If I have several 
million CIDR blocks and bad IPs, I need a separate user data value output for 
each value input.

3. This is because every time I have a bad IP, CIDR, Domain, URL, or Email, I 
create a security indicator tracking struct for it. In the IP and CIDR case I 
find the struct using rte_lpm (or possibly rte_hash for single IPs).

For Domain, URL, and Email, rte_hash cannot be used, because it assumes all 
keys have the same fixed length. So I had to use a different hash table.
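
To illustrate the variable-length key point, here is a minimal sketch of hashing 
keys of arbitrary length. It uses 64-bit FNV-1a purely as an example; the post 
does not say which hash table or hash function was actually used, and the names 
fnv1a64 and bucket_for are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 64-bit FNV-1a over a byte string of any length (illustrative choice only). */
static uint64_t fnv1a64(const void *key, size_t len)
{
    const unsigned char *p = key;
    uint64_t h = 14695981039346656037ULL;   /* FNV-1a 64-bit offset basis */

    assert(key != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 1099511628211ULL;              /* FNV 64-bit prime */
    }
    return h;
}

/* Map a NUL-terminated key (domain, URL, email) to a bucket index.
 * nbuckets must be a nonzero power of two. */
static uint32_t bucket_for(const char *key, uint32_t nbuckets)
{
    assert(nbuckets != 0 && (nbuckets & (nbuckets - 1)) == 0);
    return (uint32_t)(fnv1a64(key, strlen(key)) & (nbuckets - 1));
}
```

The key length is taken from the string itself on every call, so a short domain 
and a long URL can share one table without padding to a common size.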

4. The struct contains things such as a unique 64-bit unsigned integer for 
each separate IP or CIDR triggered, to allow looking up contextual data about 
the threat it represents. These IDs are defined by upstream threat databases, 
so I can't crunch them down to fit inside rte_lpm. The struct also includes 
stats regarding how many times an indicator is seen, what kind of security 
threat it represents, etc. Without these you can't do the security enrichment 
needed to respond to any events generated.
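
A rough sketch of what such a tracking struct might look like, and of how a 
lookup result could lead to it. The field names, the indicator_hit helper, and 
the idea of treating the LPM user data as an array index are all assumptions 
made for illustration, not the layout from the actual patches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-indicator record; field names are illustrative only. */
struct indicator {
    uint64_t upstream_id;  /* ID assigned by the upstream threat database */
    uint64_t hits;         /* how many times this indicator has been seen */
    uint8_t  threat_type;  /* kind of threat; encoding left to the application */
};

/* Treat the value returned by a prefix lookup as an index into the table,
 * count the hit, and return the record. Returns NULL if out of range. */
static struct indicator *
indicator_hit(struct indicator *table, uint32_t count, uint32_t user_data)
{
    assert(table != NULL);
    if (user_data >= count)
        return NULL;
    table[user_data].hits++;
    return &table[user_data];
}
```

The 64-bit upstream ID stays inside the struct, so the lookup table only has to 
carry an index, but that index still needs one distinct value per indicator.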

5. This means, if I want to support X million security indicators, regardless 
of whether they are IP, CIDR, Domain, URL, or Email, then I need X million 
distinct user data values to look up all the context that goes with them.
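
The arithmetic behind this can be sketched as follows. The helpers bits_needed 
and fits are hypothetical names, and the 8-bit figure is the user data width 
mentioned in the quoted question above.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest number of bits able to represent n distinct values (n >= 1). */
static unsigned bits_needed(uint64_t n)
{
    unsigned bits = 0;

    assert(n >= 1);
    while (bits < 64 && (UINT64_C(1) << bits) < n)
        bits++;
    return bits;
}

/* Nonzero if a field of field_bits bits can give each of n indicators
 * its own distinct user data value. */
static int fits(uint64_t n, unsigned field_bits)
{
    return bits_needed(n) <= field_bits;
}
```

An 8-bit field gives 256 distinct values, while one million indicators need at 
least 20 bits, so fits(1000000, 8) is false and fits(1000000, 24) is true.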

Matthew.

