[dpdk-dev] TX performance regression caused by the mbuf cache line split

Marc Sune marc.sune at bisdn.de
Tue May 12 02:38:33 CEST 2015



On 12/05/15 02:28, Marc Sune wrote:
>
>
> On 12/05/15 01:18, Paul Emmerich wrote:
>> Found a really simple solution that almost restores the original 
>> performance: just add a prefetch on alloc. For some reason, I assumed 
>> that this was already done since the troublesome commit I 
>> investigated mentioned something about prefetching... I guess the 
>> commit referred to the hardware prefetcher in the CPU.
>>
>> Adding an explicit prefetch command in the mbuf alloc function gives 
>> a throughput of 12.7/10.35 Mpps in my benchmark with the 
>> simple/full-featured tx path.
>>
>> DPDK 1.7.1 was at 14.1/10.7 Mpps. I guess I can live with that, since 
>> I'm primarily interested in the full-featured path and the drop from 
>> 10.7 to ~10.4 was due to another change.
>
> Maybe a stupid question;
>
> Does the performance of v1.7.1 also improve if you backport this patch 
> to it?

Answering myself: the split was done in 1.8, so the question is indeed stupid.

Marc
>
> Marc
>
>>
>> Patch: https://github.com/dpdk-org/dpdk/pull/2
>> I also sent an email to the mailing list.
>>
>> I think that the rx-path could also benefit from prefetching 
>> somewhere.
>>
>>
>> Paul
>>
>
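
For readers following the thread, here is a minimal sketch of what "prefetch on alloc" could look like. It is not the patch linked above and it does not use the real DPDK API: struct toy_mbuf, struct toy_pool and the toy_* functions are hypothetical stand-ins, the 64-byte cache line is an assumption, and prefetching the second cache line specifically is my reading of the discussion rather than something stated in the quoted mail. __builtin_prefetch is the GCC/Clang builtin; the struct layout needs C11 for _Alignas.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64  /* assumed cache line size in bytes */
#define POOL_SIZE  8   /* arbitrary size for this toy pool */

/* Hypothetical buffer header split over two cache lines.
 * This is NOT the real struct rte_mbuf, only an illustration. */
struct toy_mbuf {
    /* first cache line */
    _Alignas(CACHE_LINE) void *buf_addr;
    uint16_t data_len;
    /* second cache line (assumed to be touched on the tx path) */
    _Alignas(CACHE_LINE) struct toy_mbuf *next;
    uint64_t tx_offload;
};

/* Trivial LIFO pool standing in for a real mempool. */
struct toy_pool {
    struct toy_mbuf  bufs[POOL_SIZE];
    struct toy_mbuf *free_list[POOL_SIZE];
    size_t           n_free;
};

static void toy_pool_init(struct toy_pool *p)
{
    for (size_t i = 0; i < POOL_SIZE; i++)
        p->free_list[i] = &p->bufs[i];
    p->n_free = POOL_SIZE;
}

/* Allocate a buffer and hint the CPU to pull in its second cache
 * line, so that a later write to m->next is less likely to miss. */
static struct toy_mbuf *toy_alloc(struct toy_pool *p)
{
    if (p->n_free == 0)
        return NULL;
    struct toy_mbuf *m = p->free_list[--p->n_free];
    /* rw = 1: expect a write; locality = 3: keep in all cache levels */
    __builtin_prefetch(&m->next, 1, 3);
    return m;
}

static void toy_free(struct toy_pool *p, struct toy_mbuf *m)
{
    assert(p->n_free < POOL_SIZE);
    p->free_list[p->n_free++] = m;
}
```

The prefetch is only a hint and does not change what the allocator returns, so whether it helps has to be measured on the workload in question, as with the Mpps numbers quoted above.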


