[dpdk-dev] [PATCH v4 00/14] acl: introduce AVX512 classify methods
Konstantin Ananyev
konstantin.ananyev at intel.com
Tue Oct 6 17:03:02 CEST 2020
This patch series introduces support for an AVX512-specific classify
implementation in the ACL library.
It adds two new algorithms:
- RTE_ACL_CLASSIFY_AVX512X16 - can process up to 16 flows in parallel.
It uses 256-bit width instructions/registers only
(to avoid frequency level change).
On my SKX box test-acl shows ~15-30% improvement
(depending on rule-set and input burst size)
when switching from AVX2 to AVX512X16 classify algorithms.
- RTE_ACL_CLASSIFY_AVX512X32 - can process up to 32 flows in parallel.
It uses 512-bit width instructions/registers and provides higher
performance than AVX512X16, but can cause frequency level change.
On my SKX box test-acl shows ~50-70% improvement
(depending on rule-set and input burst size)
when switching from AVX2 to AVX512X32 classify algorithms.
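To make the "up to 16" and "up to 32" flows per iteration concrete, here is a small arithmetic sketch of how an input burst divides into full groups and a remainder for each width. It is only an illustration: split_burst() and struct burst_split are hypothetical names, and how the library actually handles the leftover flows is not described in this cover letter.

```c
#include <stdint.h>

/* Hypothetical helper type, not part of librte_acl. */
struct burst_split {
	uint32_t full_groups; /* iterations that fill every lane */
	uint32_t tail_flows;  /* flows left over after the full groups */
};

/*
 * Split a burst of num_flows into groups of 'lanes' flows
 * (16 for AVX512X16, 32 for AVX512X32). 'lanes' must be non-zero.
 */
static struct burst_split
split_burst(uint32_t num_flows, uint32_t lanes)
{
	struct burst_split s;

	s.full_groups = num_flows / lanes;
	s.tail_flows = num_flows % lanes;
	return s;
}
```

For a burst of 40 flows this gives 2 full groups plus 8 leftover flows at 16 lanes, and 1 full group plus 8 leftover flows at 32 lanes, which is one reason the measured gain can vary with burst size.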
Testing on ICX and CLX showed a similar level of speedup.
The current AVX512 classify implementation is supported on x86_64 only.
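The trade-off between the two new methods can be sketched as a simple selection policy: prefer the 512-bit variant only when the caller accepts a possible frequency level change, otherwise use the 256-bit variant, and fall back to older methods when AVX512 is not available. This is a minimal sketch under stated assumptions, not the default selection logic of the series: the enum below is a local stand-in whose names mirror the methods mentioned above, and struct cpu_caps and pick_classify_alg() are hypothetical.

```c
#include <stdbool.h>

/*
 * Local stand-in for the library's classify method enum.
 * Names follow the cover letter; values are arbitrary (assumption).
 */
enum classify_alg {
	ALG_SCALAR,
	ALG_AVX2,
	ALG_AVX512X16, /* 256-bit registers, up to 16 flows */
	ALG_AVX512X32, /* 512-bit registers, up to 32 flows */
};

/* Hypothetical description of the platform and caller preference. */
struct cpu_caps {
	bool has_avx2;
	bool has_avx512;   /* AVX512 features needed by the classify code */
	bool allow_512bit; /* caller accepts a possible frequency change */
};

/* Pick the widest method that the platform and the caller allow. */
static enum classify_alg
pick_classify_alg(const struct cpu_caps *caps)
{
	if (caps->has_avx512)
		return caps->allow_512bit ? ALG_AVX512X32 : ALG_AVX512X16;
	if (caps->has_avx2)
		return ALG_AVX2;
	return ALG_SCALAR;
}
```

With has_avx512 set and allow_512bit cleared, the sketch returns ALG_AVX512X16; setting allow_512bit switches it to ALG_AVX512X32. On a platform without AVX512 (for example, anything other than x86_64) it falls back to ALG_AVX2 or ALG_SCALAR.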
Note that this series introduces a formal ABI incompatibility
with previous versions of the ACL library.
Depends-on: patch-79310 ("eal/x86: introduce AVX 512-bit type")
v3 -> v4:
Fix problems with meson 0.47
Updates to conform to the latest changes in the mainline
(removal of RTE_MACHINE_CPUFLAG_*)
Fix checkpatch warnings
v2 -> v3:
Fix checkpatch warnings
Split AVX512 algorithm into two and deduplicate common code
v1 -> v2:
Deduplicated 8/16 code paths as much as possible
Updated default algorithm selection
Removed library constructor to make it easier to integrate with
https://patches.dpdk.org/project/dpdk/list/?series=11831
Updated docs
Konstantin Ananyev (14):
acl: fix x86 build when compiler doesn't support AVX2
doc: fix missing classify methods in ACL guide
acl: remove of unused enum value
acl: remove library constructor
app/acl: few small improvements
test/acl: expand classify test coverage
acl: add infrastructure to support AVX512 classify
acl: introduce 256-bit width AVX512 classify implementation
acl: update default classify algorithm selection
acl: introduce 512-bit width AVX512 classify implementation
acl: for AVX512 classify use 4B load whenever possible
acl: deduplicate AVX512 code paths
test/acl: add AVX512 classify support
app/acl: add AVX512 classify support
app/test-acl/main.c | 23 +-
app/test/test_acl.c | 105 ++--
config/x86/meson.build | 3 +-
.../prog_guide/packet_classif_access_ctrl.rst | 20 +
doc/guides/rel_notes/deprecation.rst | 4 -
doc/guides/rel_notes/release_20_11.rst | 12 +
lib/librte_acl/acl.h | 16 +
lib/librte_acl/acl_bld.c | 34 ++
lib/librte_acl/acl_gen.c | 2 +-
lib/librte_acl/acl_run_avx512.c | 164 ++++++
lib/librte_acl/acl_run_avx512_common.h | 477 ++++++++++++++++++
lib/librte_acl/acl_run_avx512x16.h | 341 +++++++++++++
lib/librte_acl/acl_run_avx512x8.h | 253 ++++++++++
lib/librte_acl/meson.build | 48 ++
lib/librte_acl/rte_acl.c | 212 ++++++--
lib/librte_acl/rte_acl.h | 4 +-
16 files changed, 1618 insertions(+), 100 deletions(-)
create mode 100644 lib/librte_acl/acl_run_avx512.c
create mode 100644 lib/librte_acl/acl_run_avx512_common.h
create mode 100644 lib/librte_acl/acl_run_avx512x16.h
create mode 100644 lib/librte_acl/acl_run_avx512x8.h
--
2.17.1