[dpdk-dev] TX performance regression caused by the mbuf cacheline split

Marc Sune marc.sune at bisdn.de
Tue May 12 02:28:23 CEST 2015



On 12/05/15 01:18, Paul Emmerich wrote:
> Found a really simple solution that almost restores the original 
> performance: just add a prefetch on alloc. For some reason, I assumed 
> that this was already done since the troublesome commit I investigated 
> mentioned something about prefetching... I guess the commit referred 
> to the hardware prefetcher in the CPU.
>
> Adding an explicit prefetch command in the mbuf alloc function gives a 
> throughput of 12.7/10.35 Mpps in my benchmark with the 
> simple/full-featured tx path.
>
> DPDK 1.7.1 was at 14.1/10.7 Mpps. I guess I can live with that, since 
> I'm primarily interested in the full-featured path and the drop from 
> 10.7 to ~10.4 was due to another change.

Maybe a stupid question:

Does the performance of v1.7.1 also improve if you backport this patch 
to it?
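
To make sure we mean the same thing by "prefetch on alloc", here is a 
rough sketch of the idea. It is not the actual patch and does not use 
the DPDK API: struct toy_mbuf, struct toy_pool and the toy_* functions 
are hypothetical names made up for illustration, the 64-byte cache line 
is an assumption, and __builtin_prefetch is the GCC/Clang builtin.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64   /* assumed cache line size */
#define POOL_SIZE  256  /* arbitrary toy pool size */

/* Hypothetical buffer header split over two cache lines (NOT rte_mbuf). */
struct toy_mbuf {
    /* first cache line */
    void    *buf_addr;
    uint16_t data_len;
    uint8_t  pad0[CACHE_LINE - sizeof(void *) - sizeof(uint16_t)];
    /* second cache line: fields written after alloc / on tx */
    struct toy_mbuf *next;
    uint8_t  pad1[CACHE_LINE - sizeof(struct toy_mbuf *)];
} __attribute__((aligned(CACHE_LINE)));

_Static_assert(offsetof(struct toy_mbuf, next) == CACHE_LINE,
               "next must start the second cache line");
_Static_assert(sizeof(struct toy_mbuf) == 2 * CACHE_LINE,
               "header must span exactly two cache lines");

/* Hypothetical pool: a fixed array of buffers plus a stack of free pointers. */
struct toy_pool {
    struct toy_mbuf  bufs[POOL_SIZE];
    struct toy_mbuf *free_list[POOL_SIZE];
    size_t           n_free;
};

static void toy_pool_init(struct toy_pool *p)
{
    assert(p != NULL);
    for (size_t i = 0; i < POOL_SIZE; i++)
        p->free_list[i] = &p->bufs[i];
    p->n_free = POOL_SIZE;
}

/* Pop a buffer and hint the CPU to fetch its second cache line for writing. */
static struct toy_mbuf *toy_alloc(struct toy_pool *p)
{
    if (p->n_free == 0)
        return NULL;
    struct toy_mbuf *m = p->free_list[--p->n_free];
    /* rw = 1 (prefetch for write), locality = 3 (keep in all cache levels) */
    __builtin_prefetch(&m->next, 1, 3);
    return m;
}

/* Push a buffer back onto the free stack. */
static void toy_free(struct toy_pool *p, struct toy_mbuf *m)
{
    assert(p->n_free < POOL_SIZE);
    p->free_list[p->n_free++] = m;
}
```

The prefetch is only a hint and does not change what the code computes. 
It can only help if the caller does some other work between the alloc 
and the first access to the second cache line.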

Marc

>
> Patch: https://github.com/dpdk-org/dpdk/pull/2
> I also sent an email to the mailing list.
>
> I think the rx path could also benefit from prefetching 
> somewhere.
>
>
> Paul
>
