[dpdk-dev] [PATCH 0/4] DPDK memcpy optimization

Neil Horman nhorman at tuxdriver.com
Mon Jan 19 14:02:21 CET 2015


On Mon, Jan 19, 2015 at 09:53:30AM +0800, zhihong.wang at intel.com wrote:
> This patch set optimizes memcpy for DPDK for both SSE and AVX platforms.
> It also extends memcpy test coverage with unaligned cases and more test points.
> 
> Optimization techniques are summarized below:
> 
> 1. Utilize full cache bandwidth
> 
> 2. Enforce aligned stores
> 
> 3. Apply load address alignment based on architecture features
> 
> 4. Make load/store address available as early as possible
> 
> 5. General optimization techniques such as inlining, branch reduction, and prefetch-friendly access patterns
> 
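
For readers skimming the list above, here is a minimal sketch of what item 2
("enforce aligned stores") can look like. It is not taken from the patch. It
assumes x86 with SSE2 and a 16-byte vector width, and the function name
copy_aligned_stores is hypothetical. The idea is to copy the first 16 bytes
with an unaligned store, advance the destination to the next 16-byte boundary,
and then use only aligned stores in the main loop while the loads stay
unaligned.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not part of DPDK: copy n bytes so that every store
 * in the main loop is 16-byte aligned. Like memcpy, it does not support
 * overlapping buffers.
 */
static void *copy_aligned_stores(void *dst, const void *src, size_t n)
{
    uint8_t *d = dst;
    const uint8_t *s = src;

    /* Small copies: not worth aligning, defer to memcpy. */
    if (n < 16) {
        memcpy(d, s, n);
        return dst;
    }

    /* Head: one unaligned 16-byte store covers everything up to the
     * next 16-byte boundary of the destination. */
    _mm_storeu_si128((__m128i *)d, _mm_loadu_si128((const __m128i *)s));
    size_t head = 16 - ((uintptr_t)d & 15);  /* 1..16 bytes */
    d += head;
    s += head;
    n -= head;

    /* Main loop: loads may be unaligned, stores are always aligned. */
    while (n >= 16) {
        _mm_store_si128((__m128i *)d, _mm_loadu_si128((const __m128i *)s));
        d += 16;
        s += 16;
        n -= 16;
    }

    /* Tail: fewer than 16 bytes remain. */
    memcpy(d, s, n);
    return dst;
}
```

The head step rewrites a few bytes that the first aligned store writes again;
that redundancy is what buys a branch-free way to reach an aligned
destination.
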
> Zhihong Wang (4):
>   Disabled VTA for memcpy test in app/test/Makefile
>   Removed unnecessary test cases in test_memcpy.c
>   Extended test coverage in test_memcpy_perf.c
>   Optimized memcpy in arch/x86/rte_memcpy.h for both SSE and AVX
>     platforms
> 
>  app/test/Makefile                                  |   6 +
>  app/test/test_memcpy.c                             |  52 +-
>  app/test/test_memcpy_perf.c                        | 238 +++++---
>  .../common/include/arch/x86/rte_memcpy.h           | 664 +++++++++++++++------
>  4 files changed, 656 insertions(+), 304 deletions(-)
> 
> -- 
> 1.9.3
> 
> 
Are you able to compile this with gcc 4.9.2?  The compilation of
test_memcpy_perf is taking forever for me.  It appears hung.
Neil