[dpdk-users] Why not prefetch the second cache line of struct rte_mbuf for better performance ?

Van Haaren, Harry harry.van.haaren at intel.com
Tue Mar 26 11:53:48 CET 2019


> -----Original Message-----
> From: users [mailto:users-bounces at dpdk.org] On Behalf Of Dell Will
> Sent: Tuesday, March 26, 2019 9:04 AM
> To: users <users at dpdk.org>
> Subject: [dpdk-users] Why not prefetch the second cache line of struct
> rte_mbuf for better performance ?
> 
> Hello, everybody

Hi,

> I find that many codes in DPDK only prefetch the first cache line of struct
> rte_mbuf.
> The struct rte_mbuf has 2 cache lines.
> Why not prefetch the second line ?

One reason the second cache line is not always prefetched is that it is not
always going to be used.

For example, the packet RX routines modify only the first cache line and do
not require the second to be available.
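
To make the trade-off concrete, here is a minimal sketch in plain C. It is not
DPDK code: struct demo_mbuf, its field layout, the 64-byte CACHE_LINE and all
function names are assumptions made up for illustration, and
__builtin_prefetch is the GCC/Clang builtin. The RX-style loop reads only
fields in the first line, so it prefetches only that line; the segment walk
follows a pointer placed in the second line, so there a second-line prefetch
can pay off.

```c
#include <assert.h>   /* static_assert */
#include <stdalign.h> /* alignas */
#include <stddef.h>   /* offsetof, NULL */
#include <stdint.h>

#define CACHE_LINE 64 /* assumed cache-line size in bytes */

/* Hypothetical stand-in for struct rte_mbuf: exactly two cache lines. */
struct demo_mbuf {
    /* Line 0: fields a receive path touches. */
    alignas(CACHE_LINE) void *buf_addr;
    uint32_t pkt_len;
    uint16_t data_len;
    /* Line 1: fields only some code paths touch. */
    alignas(CACHE_LINE) struct demo_mbuf *next;
    void *pool;
};

static_assert(sizeof(struct demo_mbuf) == 2 * CACHE_LINE, "two cache lines");
static_assert(offsetof(struct demo_mbuf, next) == CACHE_LINE, "next is in line 1");

/* Hint that the first cache line of m will be read soon. */
static inline void prefetch_line0(const struct demo_mbuf *m)
{
    __builtin_prefetch(m, 0, 3);
}

/* Hint that the second cache line of m will be read soon. */
static inline void prefetch_line1(const struct demo_mbuf *m)
{
    __builtin_prefetch(&m->next, 0, 3);
}

/* RX-style loop: uses only line 0, so only line 0 is prefetched. */
uint32_t rx_total_len(struct demo_mbuf **pkts, unsigned int n)
{
    uint32_t total = 0;

    for (unsigned int i = 0; i < n; i++) {
        if (i + 1 < n)
            prefetch_line0(pkts[i + 1]);
        total += pkts[i]->pkt_len;
    }
    return total;
}

/* Segment walk: follows next (line 1), so line 1 is prefetched as well. */
unsigned int count_segments(struct demo_mbuf **pkts, unsigned int n)
{
    unsigned int segs = 0;

    for (unsigned int i = 0; i < n; i++) {
        if (i + 1 < n)
            prefetch_line1(pkts[i + 1]);
        for (const struct demo_mbuf *m = pkts[i]; m != NULL; m = m->next)
            segs++;
    }
    return segs;
}
```

In DPDK itself, rte_mbuf_prefetch_part1() and rte_mbuf_prefetch_part2() in
rte_mbuf.h play the role of the two helpers above. Whether the extra prefetch
helps depends on the workload, so it is worth measuring rather than assuming.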


> Is it hinted that the CPU (x64 or ARM) always automatically prefetch the
> next immediately followed cache line ?

Some details on the x86-64 hardware prefetchers are here; the "Adjacent Cache-Line Prefetch" is of particular interest:
https://software.intel.com/en-us/articles/optimizing-application-performance-on-intel-coret-microarchitecture-using-hardware-implemented-prefetchers

[Side note: "x64" is generally used as another name for x86-64; the genuinely different 64-bit Intel architecture is IA-64 (Itanium).]


> Thanks a lot !

Hope that helps, -Harry

