[dpdk-dev] [Patch 1/2] i40e RX Bulk Alloc: Larger list size (33 to 128) throughput optimization

Bruce Richardson bruce.richardson at intel.com
Wed Oct 28 12:15:06 CET 2015


On Tue, Oct 27, 2015 at 08:56:36PM +0000, Polehn, Mike A wrote:
> Combined two subroutines into one, with a single read operation followed by a
> buffer allocate-and-load loop.
> 
> Eliminated the staging queue and its subroutine, which removed extra pointer list movements
> and reduced the number of active variable cache pages during the call.
> 
> Reduced the queue position variables to just two: the next read point and the last NIC RX
> descriptor position. The NIC descriptor table also no longer always needs to be filled.
> 
> Moved the NIC register update write from once per loop to once per driver call, to minimize
> CPU stalls waiting on multiple SMB synchronization points and on earlier NIC register writes,
> which often take large cycle counts to complete. For example, with an input packet list of 33
> and the default loop size of 32, the second NIC register write occurs just after RX processing
> of a single packet, resulting in a large CPU stall time.
> 
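
To illustrate the register-write point above, here is a minimal sketch in C that only counts tail-register writes for one burst. It is a simplified model, not the i40e code: `fake_nic`, `fake_tail_write`, `rx_write_per_chunk` and `rx_write_per_call` are hypothetical names, and the chunk size of 32 is simply the loop size from the example in the description.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a NIC: records the tail value and counts writes. */
struct fake_nic {
    uint32_t tail;
    unsigned tail_writes;
};

/* Models one (expensive) MMIO write to the RX tail register. */
static void fake_tail_write(struct fake_nic *nic, uint32_t value)
{
    nic->tail = value;
    nic->tail_writes++;
}

static unsigned min_u(unsigned a, unsigned b)
{
    return a < b ? a : b;
}

/* Per-loop scheme: the tail register is written after every chunk. */
static unsigned rx_write_per_chunk(struct fake_nic *nic, unsigned nb_pkts,
                                   unsigned chunk)
{
    unsigned done = 0;

    assert(chunk > 0);
    while (done < nb_pkts) {
        done += min_u(nb_pkts - done, chunk);
        fake_tail_write(nic, done);   /* one write per loop iteration */
    }
    return done;
}

/* Per-call scheme: chunks are processed, the tail is written once at the end. */
static unsigned rx_write_per_call(struct fake_nic *nic, unsigned nb_pkts,
                                  unsigned chunk)
{
    unsigned done = 0;

    assert(chunk > 0);
    while (done < nb_pkts)
        done += min_u(nb_pkts - done, chunk);
    if (done > 0)
        fake_tail_write(nic, done);   /* single write per driver call */
    return done;
}
```

With `nb_pkts = 33` and `chunk = 32`, the per-chunk version issues two tail writes, the second one after a single packet, while the per-call version issues one.
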
> Eliminated the initial RX packet-present test before the RX processing loop, since the loop
> performs the same check and less free time is generally available when packets are present
> than when no input packets are being processed.
> 
> Used some standard variables to help reduce overhead of non-standard variable sizes.
> 
> Reduced the number of variables, reordered the variable structure to put the most active
> variables in the first cache line, made better use of the bytes inside that cache line, and
> reduced the active cache line count to 1 during the processing call. Other RX subroutine
> sets might still use more than 1 variable cache line.
> 
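
As a rough illustration of the cache-line point above, the sketch below groups the frequently used RX fields at the start of a structure and checks at compile time that they fit in one cache line. The structure, its field names, and the 64-byte line size are assumptions made for the example; this is not the actual i40e queue structure.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64   /* assumed cache line size in bytes */

/* Hypothetical RX queue layout: hot fields first, cold fields after. */
struct demo_rx_queue {
    /* Hot: touched on every receive call. */
    alignas(CACHE_LINE) void *desc_ring;  /* NIC descriptor ring */
    void **sw_ring;                       /* buffer pointers, one per descriptor */
    uint16_t next_read;                   /* next descriptor to read */
    uint16_t last_desc;                   /* last descriptor given to the NIC */
    uint16_t nb_desc;                     /* ring size */
    uint16_t ring_mask;                   /* nb_desc - 1 for power-of-two rings */

    /* Cold: setup-time or rarely updated data. */
    uint64_t alloc_failures;
    uint16_t port_id;
    uint16_t queue_id;
};

/* Fails the build if a hot field is pushed past the first cache line. */
static_assert(offsetof(struct demo_rx_queue, alloc_failures) <= CACHE_LINE,
              "hot RX fields spill past the first cache line");

/* Number of bytes occupied by the hot region, including padding. */
static size_t hot_region_bytes(void)
{
    return offsetof(struct demo_rx_queue, alloc_failures);
}
```

The `alignas` only guarantees alignment for static and automatic objects; a heap-allocated queue would also need a cache-line-aligned allocator for the hot fields to actually land in a single line.
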
> Signed-off-by: Mike A. Polehn <mike.a.polehn at intel.com>

Hi Mike,

Thanks for the contribution.

However, this patch seems to contain a lot of changes to the i40e code. Since you have
multiple optimizations listed above in the description it would be good if you
could submit this patch as multiple patches, one for each optimization. That
would make it far easier for us to review and test. The same would apply to
patch 2 of this set, which looks to have multiple changes in a single patch too.
Also, each patch should have a unique title stating very briefly what the one
change in that patch is.

Regards,
/Bruce


