[dpdk-dev] [PATCH v4 0/3] run-time Linking support
xiaoyun.li at intel.com
Mon Oct 2 18:13:13 CEST 2017
This patchset dynamically selects functions at run time based on the CPU flags
that the current machine supports. It modifies memcpy, the memcpy perf test
and x86 EFD to use function pointers that are bound at constructor time.
In a cloud environment, users can then compile once for a minimum target
such as 'haswell' (not 'native'), run on different platforms (haswell or
newer) and still get ISA optimizations based on the running CPU.
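The binding of function pointers at constructor time can be sketched roughly as
follows. This is a minimal illustration, not the code from the patches: the
names copy_baseline, copy_avx2, copy_avx512f, copy_select, demo_memcpy and
demo_memcpy_impl are hypothetical, the ISA-specific variants are stand-ins that
simply call memcpy, and the GCC/clang builtin __builtin_cpu_supports() is
assumed here in place of whatever CPU flag query the patches actually use.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void *(*copy_fn_t)(void *dst, const void *src, size_t n);

/* Baseline implementation, always available (hypothetical name). */
static void *copy_baseline(void *dst, const void *src, size_t n)
{
	return memcpy(dst, src, n);
}

#if defined(__x86_64__) || defined(__i386__)
/* Stand-ins for ISA-specific variants. In a real build each one would
 * live in its own .c file compiled with the matching -m flags. */
static void *copy_avx2(void *dst, const void *src, size_t n)
{
	return memcpy(dst, src, n);
}

static void *copy_avx512f(void *dst, const void *src, size_t n)
{
	return memcpy(dst, src, n);
}
#endif

/* The pointer starts at the baseline, so it is valid even before
 * the constructor has run. */
static copy_fn_t copy_ptr = copy_baseline;
static const char *copy_impl = "baseline";

/* Runs before main(): pick the best variant the running CPU supports. */
__attribute__((constructor))
static void copy_select(void)
{
#if defined(__x86_64__) || defined(__i386__)
	__builtin_cpu_init();
	if (__builtin_cpu_supports("avx512f")) {
		copy_ptr = copy_avx512f;
		copy_impl = "avx512f";
	} else if (__builtin_cpu_supports("avx2")) {
		copy_ptr = copy_avx2;
		copy_impl = "avx2";
	}
#endif
}

/* Public entry point: one indirect call through the bound pointer. */
void *demo_memcpy(void *dst, const void *src, size_t n)
{
	assert(copy_ptr != NULL);
	return copy_ptr(dst, src, n);
}

/* Reports which variant was bound, for diagnostics. */
const char *demo_memcpy_impl(void)
{
	return copy_impl;
}
```

With this shape a binary built for a minimum target still picks a wider
variant when the running CPU offers one; callers only ever see demo_memcpy().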
Xiaoyun Li (3):
eal/x86: run-time dispatch over memcpy
app/test: run-time dispatch over memcpy perf test
efd: run-time dispatch over x86 EFD functions
* Use gcc function multi-versioning to avoid compilation issues.
* Add macros for AVX512 and AVX2. The AVX512 code is compiled only if users
enable AVX512 and the compiler supports it. The AVX2 code is compiled only
if the compiler supports AVX2.
* Reduce function calls by keeping only rte_memcpy_xxx.
* Add a condition so that small copy sizes use the inline code path and
all other sizes use the dynamic code path.
* The target attribute requires a clang version greater than 3.7. With older
versions, the SSE/AVX code path is chosen, the same as before.
* Move two macro functions to the top of the code since they are used in
both the inline and the dynamic SSE/AVX code.
* Split rte_memcpy.h into several .c files and modify the makefiles to
compile the AVX2 and AVX512 files.
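The split between an inline path for small copies and a dynamic path for larger
ones, mentioned in the changelog above, might look roughly like this. It is
only a sketch under assumptions: the threshold value SMALL_COPY_THRESHOLD, the
names hybrid_memcpy and copy_dynamic, and the call counter are all hypothetical
and are not taken from the patches.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed cut-off; the value used by the patches may differ. */
#define SMALL_COPY_THRESHOLD 128

/* Counts calls that went through the function pointer (demo only). */
static size_t dynamic_calls;

/* Default target of the dynamic path. A constructor could rebind
 * copy_dynamic to an ISA-specific variant at start-up. */
static void *copy_dynamic_default(void *dst, const void *src, size_t n)
{
	dynamic_calls++;
	return memcpy(dst, src, n);
}

static void *(*copy_dynamic)(void *, const void *, size_t) =
	copy_dynamic_default;

/* Small sizes are copied inline, so the indirect call is avoided;
 * larger sizes go through the pointer chosen at run time. */
static inline void *hybrid_memcpy(void *dst, const void *src, size_t n)
{
	assert(n == 0 || (dst != NULL && src != NULL));

	if (n <= SMALL_COPY_THRESHOLD) {
		unsigned char *d = dst;
		const unsigned char *s = src;
		size_t i;

		for (i = 0; i < n; i++)
			d[i] = s[i];
		return dst;
	}
	return copy_dynamic(dst, src, n);
}
```

The point of the threshold is that, for a copy of a few bytes, the indirect
call can cost more than the copy itself, while for large copies the call
overhead is amortized by the wider vector instructions.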
lib/librte_eal/bsdapp/eal/Makefile | 17 +
.../common/include/arch/x86/rte_memcpy.c | 59 ++
.../common/include/arch/x86/rte_memcpy.h | 861 +------------------
.../common/include/arch/x86/rte_memcpy_avx2.c | 291 +++++++
.../common/include/arch/x86/rte_memcpy_avx512f.c | 316 +++++++
.../common/include/arch/x86/rte_memcpy_internal.h | 909 +++++++++++++++++++++
.../common/include/arch/x86/rte_memcpy_sse.c | 585 +++++++++++++
lib/librte_eal/linuxapp/eal/Makefile | 17 +
lib/librte_efd/rte_efd_x86.h | 41 +-
mk/rte.cpuflags.mk | 14 +
test/test/test_memcpy_perf.c | 40 +-
11 files changed, 2288 insertions(+), 862 deletions(-)
create mode 100644 lib/librte_eal/common/include/arch/x86/rte_memcpy.c
create mode 100644 lib/librte_eal/common/include/arch/x86/rte_memcpy_avx2.c
create mode 100644 lib/librte_eal/common/include/arch/x86/rte_memcpy_avx512f.c
create mode 100644 lib/librte_eal/common/include/arch/x86/rte_memcpy_internal.h
create mode 100644 lib/librte_eal/common/include/arch/x86/rte_memcpy_sse.c