[dpdk-dev] [PATCH v2 0/4] DPDK memcpy optimization

Liang, Cunming cunming.liang at intel.com
Tue Feb 10 04:06:38 CET 2015



> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Zhihong Wang
> Sent: Thursday, January 29, 2015 10:39 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v2 0/4] DPDK memcpy optimization
> 
> This patch set optimizes memcpy for DPDK for both SSE and AVX platforms.
> It also extends memcpy test coverage with unaligned cases and more test points.
> 
> Optimization techniques are summarized below:
> 
> 1. Utilize full cache bandwidth
> 
> 2. Enforce aligned stores
> 
> 3. Apply load address alignment based on architecture features
> 
> 4. Make load/store address available as early as possible
> 
> 5. General optimization techniques such as inlining, branch reduction, and
> prefetch-friendly access patterns
> 
> --------------
> Changes in v2:
> 
> 1. Reduced constant test cases in app/test/test_memcpy_perf.c for fast build
> 
> 2. Modified macro definition for better code readability & safety
> 
> Zhihong Wang (4):
>   app/test: Disabled VTA for memcpy test in app/test/Makefile
>   app/test: Removed unnecessary test cases in app/test/test_memcpy.c
>   app/test: Extended test coverage in app/test/test_memcpy_perf.c
>   lib/librte_eal: Optimized memcpy in arch/x86/rte_memcpy.h for both SSE
>     and AVX platforms
> 
>  app/test/Makefile                                  |   6 +
>  app/test/test_memcpy.c                             |  52 +-
>  app/test/test_memcpy_perf.c                        | 220 ++++---
>  .../common/include/arch/x86/rte_memcpy.h           | 680 +++++++++++++++------
>  4 files changed, 654 insertions(+), 304 deletions(-)
> 
> --
> 1.9.3 

Acked-by: Cunming Liang <cunming.liang at intel.com>
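
For reference, techniques 2 and 3 in the quoted summary (aligned stores, with
load alignment left flexible) can be illustrated with a small sketch. This is
not the code from the patch: copy_aligned_stores() is a hypothetical helper
written only for illustration, it assumes non-overlapping buffers, and it uses
plain SSE2 intrinsics with a memcpy() fallback when SSE2 is not available.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/*
 * Hypothetical illustration, not part of the patch set.
 * Copies n bytes from src to dst (buffers must not overlap) so that
 * every store in the main loop hits a 16-byte aligned address, while
 * loads are allowed to be unaligned.
 */
static void *copy_aligned_stores(void *dst, const void *src, size_t n)
{
#if defined(__SSE2__)
    uint8_t *d = (uint8_t *)dst;
    const uint8_t *s = (const uint8_t *)src;
    size_t off;

    /* Small copies are not worth the alignment work. */
    if (n < 16)
        return memcpy(dst, src, n);

    /* Head: one unaligned 16-byte copy covers everything up to the
     * first 16-byte aligned destination address. */
    _mm_storeu_si128((__m128i *)d,
                     _mm_loadu_si128((const __m128i *)s));

    /* Bytes to skip so that d + off is 16-byte aligned (1..16). */
    off = 16 - ((uintptr_t)d & 15);

    /* Body: unaligned loads, aligned stores. */
    for (; off + 16 <= n; off += 16)
        _mm_store_si128((__m128i *)(d + off),
                        _mm_loadu_si128((const __m128i *)(s + off)));

    /* Tail: one unaligned copy that ends exactly at byte n. It may
     * rewrite bytes already copied, which is harmless because the
     * values are identical. */
    if (off < n)
        _mm_storeu_si128((__m128i *)(d + n - 16),
                         _mm_loadu_si128((const __m128i *)(s + n - 16)));

    return dst;
#else
    /* Assumption: on non-SSE2 targets, defer to the C library. */
    return memcpy(dst, src, n);
#endif
}
```

The overlapping head and tail copies replace a byte-by-byte prologue and
epilogue, which is one way to reduce branches as mentioned in technique 5.
A real implementation would also pick wider vectors where the platform
supports them.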



