[dpdk-users] HW cache utilisation w OVS-DPDK

Van Haaren, Harry harry.van.haaren at intel.com
Wed May 1 12:35:15 CEST 2019


Hi,

Some more details inline.

Please reply "inline" on mailing lists; it makes it easier for others to read back in future.
Also plain text is usually preferred to HTML formatted email.

> From: Avi Cohen [mailto:acohen at ves.io] 
> Sent: Tuesday, April 30, 2019 9:08 PM
> To: Van Haaren, Harry <harry.van.haaren at intel.com>
> Cc: Sara Gittlin <sara.gittlin at gmail.com>; users at dpdk.org
> Subject: Re: [dpdk-users] HW cache utilisation w OVS-DPDK
>
> Thank you Harry very much for your detailed answer.

Welcome!


> I think that with OVS where the data plane is a kernel module (i.e. not OVS-DPDK), the flow table is somehow always in the cache,
> hence there is an eviction function in user space that sends commands to the kernel to delete non-active flows in
> order to make room in this expensive cache memory. But I don't know how it is with OVS-DPDK. I know that generally cache
> memory is transparent to software, but with the OVS kernel module the flow table is always stored in cache.

Aha - perhaps we need to clear up a few terminology issues.

HW caches like L1, L2 and LLC are present in the CPU silicon, and are transparent to software (same as before).
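
As a small illustration of that transparency, here is a minimal sketch in C (the function names and the grid size are made up for this example, they are not from OvS or DPDK). Neither function mentions L1 or L2 anywhere, and both compute the same sum. The only difference is the order in which memory is touched: the row-major loop walks consecutive addresses, which usually suits the CPU caches much better than the strided column-major loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sizes, chosen only for illustration. */
#define ROWS 1024u
#define COLS 1024u

/* Fill a flat ROWS x COLS grid with a simple repeating pattern. */
void fill_grid(uint32_t *grid)
{
    assert(grid != NULL);
    for (size_t i = 0; i < (size_t)ROWS * COLS; i++) {
        grid[i] = (uint32_t)(i & 0xffu);
    }
}

/* Visit elements in the order they are laid out in memory. */
uint64_t sum_row_major(const uint32_t *grid)
{
    uint64_t sum = 0;

    assert(grid != NULL);
    for (size_t r = 0; r < ROWS; r++) {
        for (size_t c = 0; c < COLS; c++) {
            sum += grid[r * COLS + c];
        }
    }
    return sum;
}

/* Visit the same elements, but jump COLS elements between accesses. */
uint64_t sum_col_major(const uint32_t *grid)
{
    uint64_t sum = 0;

    assert(grid != NULL);
    for (size_t c = 0; c < COLS; c++) {
        for (size_t r = 0; r < ROWS; r++) {
            sum += grid[r * COLS + c];
        }
    }
    return sum;
}
```

Timing the two functions on a given machine is the easiest way to see the effect; how large the difference is depends on the CPU and on the grid size.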

However, we can build data-structures in software that act as a "cache" for packet flows etc. These are also referred to as caches, however they are not the CPU caches :)

The term "cache" is used for both - a search tells me it technically means to "store away in hiding or for future use."


Based on your question about flow-table caching, I now understand that it is not the HW CPU caches you are interested in, but the software data-structures that provide packet-flow caching.

Correct, there are a number of flow-cache structures in OvS, depending on the exact configuration and options you enable.
I am not very familiar with the kernel data-path; however, the EMC (Exact Match Cache) and SMC (Signature Match Cache) are two flow-caches available in OvS-DPDK. These are software data-structures that enable faster lookups for packets, in order to identify the "rule" to apply to that packet flow.

The flows are "cached" in this software data-structure, but can also be removed again to make more space for other flows.
Often the cost of looking up items in a software cache grows with how full the cache is, due to hash-table collisions becoming more frequent as it fills up (and hence having to dig "deeper" into the hashtable to find the cached entry, costing more CPU cycles). Hence, removing flows which are no longer active can be a good idea.
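
To make this concrete, here is a minimal sketch in C of a software flow cache. It is not the OvS EMC or SMC code: the struct layout, the hash function, the sizes and all names (flow_key, cache_lookup, MAX_PROBES, ...) are assumptions made up for illustration. Each key may live in one of a few consecutive slots, the lookup reports how many slots it had to inspect, and removing an inactive flow frees its slot for others.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_ENTRIES 8u                 /* power of two, tiny on purpose */
#define CACHE_MASK    (CACHE_ENTRIES - 1u)
#define MAX_PROBES    4u                 /* candidate slots per key */

struct flow_key {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
};

struct cache_entry {
    struct flow_key key;
    int rule_id;                         /* the "rule" for this flow */
    bool in_use;
};

struct flow_cache {
    struct cache_entry entries[CACHE_ENTRIES];
};

/* Toy hash for illustration only, not the hash OvS uses. */
uint32_t flow_hash(const struct flow_key *k)
{
    uint32_t h = k->src_ip * 2654435761u;

    h ^= k->dst_ip * 2246822519u;
    h ^= ((uint32_t)k->src_port << 16) | k->dst_port;
    return h ^ (h >> 15);
}

bool key_equal(const struct flow_key *a, const struct flow_key *b)
{
    return a->src_ip == b->src_ip && a->dst_ip == b->dst_ip &&
           a->src_port == b->src_port && a->dst_port == b->dst_port;
}

/* On a hit, store the rule id; *probes receives the slots inspected. */
bool cache_lookup(const struct flow_cache *c, const struct flow_key *k,
                  int *rule_id, unsigned *probes)
{
    uint32_t h = flow_hash(k);

    for (unsigned i = 0; i < MAX_PROBES; i++) {
        const struct cache_entry *e = &c->entries[(h + i) & CACHE_MASK];

        if (e->in_use && key_equal(&e->key, k)) {
            *rule_id = e->rule_id;
            *probes = i + 1;
            return true;
        }
    }
    *probes = MAX_PROBES;
    return false;
}

/* Update an existing entry, else use a free slot, else evict the first. */
void cache_insert(struct flow_cache *c, const struct flow_key *k, int rule_id)
{
    uint32_t h = flow_hash(k);
    struct cache_entry *target = NULL;

    assert(c != NULL && k != NULL);
    for (unsigned i = 0; i < MAX_PROBES; i++) {
        struct cache_entry *e = &c->entries[(h + i) & CACHE_MASK];

        if (e->in_use && key_equal(&e->key, k)) {
            target = e;
            break;
        }
        if (!e->in_use && target == NULL) {
            target = e;
        }
    }
    if (target == NULL) {
        target = &c->entries[h & CACHE_MASK];   /* all candidates busy */
    }
    target->key = *k;
    target->rule_id = rule_id;
    target->in_use = true;
}

/* Remove a flow that is no longer active; returns false if not cached. */
bool cache_remove(struct flow_cache *c, const struct flow_key *k)
{
    uint32_t h = flow_hash(k);

    for (unsigned i = 0; i < MAX_PROBES; i++) {
        struct cache_entry *e = &c->entries[(h + i) & CACHE_MASK];

        if (e->in_use && key_equal(&e->key, k)) {
            e->in_use = false;
            return true;
        }
    }
    return false;
}
```

In a nearly empty cache an entry sits in its first candidate slot and is found with one probe. As the cache fills, new entries land in later candidate slots (or evict older ones), so lookups inspect more slots on average. A zero-initialised struct flow_cache is a valid empty cache.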


> Thanks again Harry
> -Sara

I hope the above goes some way towards answering the software caching question that I think you are interested in.

Regards, -Harry


> On Tue, Apr 30, 2019, 21:33, Van Haaren, Harry <harry.van.haaren at intel.com> wrote:
> > -----Original Message-----
> > From: users [mailto:users-bounces at dpdk.org] On Behalf Of Sara Gittlin
> > Sent: Tuesday, April 30, 2019 6:15 PM
> > To: users at dpdk.org
> > Subject: [dpdk-users] HW cache utilisation w OVS-DPDK
> > 
> > Hello All
>
> Hi Sara,
>
> > It is a naive and maybe a stupid question, but do we use the HW caches (L1/L2
> > etc.) with OVS-DPDK?
>
> The hardware CPU caches (L1, L2 and LLC) are transparent to software.
>
> Another way to say that is that when writing code, the software doesn't
> have to explicitly use L1 or L2; the memory being used (from libc malloc() or stack memory)
> is cached by the CPU without any software involvement.
>
> In short, software uses L1/L2/etc without "knowing" it as such...
>
> However, just because we (as C software developers) cannot directly access cache,
> does not mean that we should ignore it! In particular designing cache-conscious
> data-structures can have a *huge* impact on runtime performance.
>
> I recommend some of the CppCon talks on software performance, particularly
> the one titled "Efficiency with Algorithms, Performance with Data Structures".
>
>
> > For example, where is the flow-table stored? In the HW cache or in RAM?
>
> This is a good question - and the answer is like so many engineering questions - it depends :)
>
> If a part of the flow-table has been recently accessed, it is likely to be in the HW-cache.
> If the flow-table has been initialized but not used recently, it is likely to be in ordinary RAM.
>
> From a performance point of view, this is quite interesting, as certain flow-table accesses
> are expected to be cheap (in cache) while others might take longer (RAM).
>
>
> > Thank you
> > -Sara
>
> Hope that helps! Regards, -Harry
