[dpdk-dev] [PATCH v4 00/14] acl: introduce AVX512 classify methods

Konstantin Ananyev konstantin.ananyev at intel.com
Tue Oct 6 17:03:02 CEST 2020


This patch series introduces AVX512-specific classify
implementations for the ACL library.
It adds two new algorithms:
 - RTE_ACL_CLASSIFY_AVX512X16 - can process up to 16 flows in parallel.
   It uses 256-bit width instructions/registers only
   (to avoid frequency level change).
   On my SKX box test-acl shows ~15-30% improvement
   (depending on rule-set and input burst size)
   when switching from AVX2 to AVX512X16 classify algorithms.
 - RTE_ACL_CLASSIFY_AVX512X32 - can process up to 32 flows in parallel.
   It uses 512-bit width instructions/registers and provides higher
   performance than AVX512X16, but can cause frequency level change.
   On my SKX box test-acl shows ~50-70% improvement
   (depending on rule-set and input burst size)
   when switching from AVX2 to AVX512X32 classify algorithms.
   ICX and CLX testing showed a similar level of speedup.
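
The trade-off between the two methods can be sketched as a small selection
helper. This is only an illustration under stated assumptions, not the
library's default selection logic: the enum values and sketch_pick_alg() are
hypothetical stand-ins for the RTE_ACL_CLASSIFY_* values named above, and the
two boolean inputs are assumed to be supplied by the application.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-ins for the classify methods named in this cover
 * letter; a real application would use the RTE_ACL_CLASSIFY_* values.
 */
enum sketch_alg {
	SKETCH_ALG_AVX2,
	SKETCH_ALG_AVX512X16,	/* 256-bit registers, up to 16 flows */
	SKETCH_ALG_AVX512X32,	/* 512-bit registers, up to 32 flows */
};

/*
 * Assumed policy (not the library's own): without AVX512 stay on AVX2;
 * with AVX512 use the 512-bit method only when the caller accepts a
 * possible frequency level change, otherwise use the 256-bit method.
 */
static enum sketch_alg
sketch_pick_alg(bool cpu_has_avx512, bool allow_512bit_regs)
{
	if (!cpu_has_avx512)
		return SKETCH_ALG_AVX2;

	return allow_512bit_regs ? SKETCH_ALG_AVX512X32 :
		SKETCH_ALG_AVX512X16;
}
```

For example, sketch_pick_alg(true, false) picks the 256-bit method, which
matches the case where a frequency level change has to be avoided.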

Current AVX512 classify implementation is only supported on x86_64.
Note that this series introduces a formal ABI incompatibility
with previous versions of ACL library.

Depends-on: patch-79310 ("eal/x86: introduce AVX 512-bit type")

v3 -> v4:
  Fix problems with meson 0.47
  Update to conform to the latest changes in the mainline
  (removal of RTE_MACHINE_CPUFLAG_*)
  Fix checkpatch warnings

v2 -> v3:
  Fix checkpatch warnings
  Split AVX512 algorithm into two and deduplicate common code

v1 -> v2:
  Deduplicated 8/16 code paths as much as possible
  Updated default algorithm selection
  Removed library constructor to make it easier to integrate with
  https://patches.dpdk.org/project/dpdk/list/?series=11831
  Updated docs


Konstantin Ananyev (14):
  acl: fix x86 build when compiler doesn't support AVX2
  doc: fix missing classify methods in ACL guide
  acl: remove of unused enum value
  acl: remove library constructor
  app/acl: few small improvements
  test/acl: expand classify test coverage
  acl: add infrastructure to support AVX512 classify
  acl: introduce 256-bit width AVX512 classify implementation
  acl: update default classify algorithm selection
  acl: introduce 512-bit width AVX512 classify implementation
  acl: for AVX512 classify use 4B load whenever possible
  acl: deduplicate AVX512 code paths
  test/acl: add AVX512 classify support
  app/acl: add AVX512 classify support

 app/test-acl/main.c                           |  23 +-
 app/test/test_acl.c                           | 105 ++--
 config/x86/meson.build                        |   3 +-
 .../prog_guide/packet_classif_access_ctrl.rst |  20 +
 doc/guides/rel_notes/deprecation.rst          |   4 -
 doc/guides/rel_notes/release_20_11.rst        |  12 +
 lib/librte_acl/acl.h                          |  16 +
 lib/librte_acl/acl_bld.c                      |  34 ++
 lib/librte_acl/acl_gen.c                      |   2 +-
 lib/librte_acl/acl_run_avx512.c               | 164 ++++++
 lib/librte_acl/acl_run_avx512_common.h        | 477 ++++++++++++++++++
 lib/librte_acl/acl_run_avx512x16.h            | 341 +++++++++++++
 lib/librte_acl/acl_run_avx512x8.h             | 253 ++++++++++
 lib/librte_acl/meson.build                    |  48 ++
 lib/librte_acl/rte_acl.c                      | 212 ++++++--
 lib/librte_acl/rte_acl.h                      |   4 +-
 16 files changed, 1618 insertions(+), 100 deletions(-)
 create mode 100644 lib/librte_acl/acl_run_avx512.c
 create mode 100644 lib/librte_acl/acl_run_avx512_common.h
 create mode 100644 lib/librte_acl/acl_run_avx512x16.h
 create mode 100644 lib/librte_acl/acl_run_avx512x8.h

-- 
2.17.1


