[dpdk-dev] [PATCH 0/4] DPDK memcpy optimization
Neil Horman
nhorman at tuxdriver.com
Mon Jan 19 14:02:21 CET 2015
On Mon, Jan 19, 2015 at 09:53:30AM +0800, zhihong.wang at intel.com wrote:
> This patch set optimizes memcpy for DPDK for both SSE and AVX platforms.
> It also extends memcpy test coverage with unaligned cases and more test points.
>
> Optimization techniques are summarized below:
>
> 1. Utilize full cache bandwidth
>
> 2. Enforce aligned stores
>
> 3. Apply load address alignment based on architecture features
>
> 4. Make load/store address available as early as possible
>
> 5. General optimization techniques such as inlining, branch reduction, and prefetch-friendly access patterns
>
> Zhihong Wang (4):
> Disabled VTA for memcpy test in app/test/Makefile
> Removed unnecessary test cases in test_memcpy.c
> Extended test coverage in test_memcpy_perf.c
> Optimized memcpy in arch/x86/rte_memcpy.h for both SSE and AVX
> platforms
>
> app/test/Makefile | 6 +
> app/test/test_memcpy.c | 52 +-
> app/test/test_memcpy_perf.c | 238 +++++---
> .../common/include/arch/x86/rte_memcpy.h | 664 +++++++++++++++------
> 4 files changed, 656 insertions(+), 304 deletions(-)
>
> --
> 1.9.3
>
>
Are you able to compile this with gcc 4.9.2? The compilation of
test_memcpy_perf is taking forever for me. It appears hung.
Neil
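
For readers unfamiliar with item 2 of the quoted list ("Enforce aligned stores"), the idea can be sketched roughly as below. This is a minimal illustration and not the code from the patch: the function name copy_aligned_stores is hypothetical, the 16-byte SSE2 width is an assumption chosen for brevity, and a plain memcpy stands in for the vector copy on targets without SSE2. The head copy brings the destination up to a 16-byte boundary, after which the main loop pairs unaligned loads with aligned stores.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper, NOT DPDK's rte_memcpy: copies n bytes so that every
 * 16-byte store in the main loop targets a 16-byte-aligned address. */
static void *copy_aligned_stores(void *dst, const void *src, size_t n)
{
    uint8_t *d = dst;
    const uint8_t *s = src;

    /* Head: number of bytes needed to reach the next 16-byte boundary
     * of the destination (0 if it is already aligned). */
    size_t head = (size_t)(-(uintptr_t)d & 15);
    if (head > n)
        head = n;
    memcpy(d, s, head);
    d += head;
    s += head;
    n -= head;

    /* Body: the source may be misaligned, so loads are unaligned;
     * the destination is now aligned, so stores are aligned. */
    while (n >= 16) {
#if defined(__SSE2__)
        __m128i v = _mm_loadu_si128((const __m128i *)s);
        _mm_store_si128((__m128i *)d, v);
#else
        memcpy(d, s, 16); /* portable stand-in when SSE2 is unavailable */
#endif
        d += 16;
        s += 16;
        n -= 16;
    }

    /* Tail: fewer than 16 bytes remain. */
    memcpy(d, s, n);
    return dst;
}
```

A call such as copy_aligned_stores(buf + 3, src, len) behaves like memcpy for non-overlapping buffers. The sketch leaves out the per-size dispatch, wider AVX paths, and load-side alignment that the cover letter also lists.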