[PATCH v8] eal/x86: improve rte_memcpy const size 16 performance
Konstantin Ananyev
konstantin.ananyev at huawei.com
Mon Jun 10 15:40:12 CEST 2024
> When the rte_memcpy() size is 16, the same 16 bytes are copied twice.
> In the case where the size is known to be 16 at build time, omit the
> duplicate copy.
>
> Reduced the amount of effectively copy-pasted code by using #ifdef
> inside functions instead of outside functions.
>
> Suggested-by: Stephen Hemminger <stephen at networkplumber.org>
> Signed-off-by: Morten Brørup <mb at smartsharesystems.com>
> Acked-by: Bruce Richardson <bruce.richardson at intel.com>
> ---
> Depends-on: series-31578 ("provide toolchain abstracted __builtin_constant_p")
>
> v8:
> * Keep trying to fix that CI does not understand the dependency...
>   Depend on series instead of patch. Github only understands series.
> * Fix typo in patch description.
> v7:
> * Keep trying to fix that CI does not understand the dependency...
>   Depend on patch instead of series.
>   Move dependency out of the patch description itself, and down to the
>   version log.
> v6:
> * Trying to fix CI not understanding dependency...
>   Don't wrap dependency line.
> v5:
> * Fix for building with MSVC:
>   Use __rte_constant() instead of __builtin_constant_p().
>   Add dependency on patch providing __rte_constant().
> v4:
> * There are no problems compiling AVX2, only AVX. (Bruce Richardson)
> v3:
> * AVX2 is a superset of AVX;
>   for a block of AVX code, testing for AVX suffices. (Bruce Richardson)
> * Define RTE_MEMCPY_AVX if AVX is available, to avoid copy-pasting the
>   check for older GCC version. (Bruce Richardson)
> v2:
> * For GCC, version 11 is required for proper AVX handling;
>   if older GCC version, treat AVX as SSE.
>   Clang does not have this issue.
>   Note: Original code always treated AVX as SSE, regardless of compiler.
> * Do not add copyright. (Stephen Hemminger)
Acked-by: Konstantin Ananyev <konstantin.ananyev at huawei.com>
The code change itself - LGTM.
Out of interest - do you expect any perf diff with these changes?
On my box I didn’t see any with 'memcpy_perf_autotest'.
Konstantin
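
For readers following the thread, the duplicate copy described in the quoted
patch description can be sketched roughly as below. This is only an
illustration under stated assumptions, not the DPDK code: sketch_mov16() and
sketch_copy_16_to_32() are hypothetical names, plain memcpy() stands in for
the 16-byte vector move, and __builtin_constant_p() (GCC/Clang) stands in for
the __rte_constant() macro the patch uses for MSVC compatibility.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a single 16-byte vector load/store. */
static inline void sketch_mov16(uint8_t *dst, const uint8_t *src)
{
    memcpy(dst, src, 16);
}

/*
 * Hypothetical helper: copy n bytes, 16 <= n <= 32, using two 16-byte
 * moves that may overlap. The second move covers the last 16 bytes.
 */
static inline void *sketch_copy_16_to_32(void *dst, const void *src, size_t n)
{
    uint8_t *d = dst;
    const uint8_t *s = src;

    assert(n >= 16 && n <= 32);

    sketch_mov16(d, s);

    /*
     * With n == 16 the second move would rewrite the same 16 bytes.
     * If the compiler can prove that at build time, return early so
     * the redundant move is never emitted.
     */
    if (__builtin_constant_p(n) && n == 16)
        return dst;

    sketch_mov16(d + n - 16, s + n - 16);
    return dst;
}
```

The result is the same whether or not the early return is taken; only the
number of moves differs. Whether the check folds to true depends on inlining
and optimization level (without optimization the builtin typically evaluates
to 0 for a function parameter), so any gain would show up only in optimized
builds with a literal size of 16 at the call site.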