[PATCH v5 0/4] add pointer compression API

Paul Szczepanek paul.szczepanek at arm.com
Thu Feb 22 09:15:38 CET 2024


For some reason your email is not visible to me, even though it's in the 
archive.

On 02/11/2024 16:32, Konstantin Ananyev konstantin.v.ananyev wrote:

> From one side the code itself is very small and straightforward,
> from other side - it is not clear to me what is intended usage for it
> within DPDK and its appliances?
> Konstantin

The intended usage is explained in the cover email (see below) and demonstrated
in the test supplied in the following patch: sending arrays of pointers
between cores, as happens in a forwarding example.

On 01/11/2023 18:12, Paul Szczepanek wrote:

> This patchset is proposing adding a new EAL header with utility functions
> that allow compression of arrays of pointers.
>
> When passing caches full of pointers between threads, memory containing
> the pointers is copied multiple times which is especially costly between
> cores. A compression method will allow us to shrink the memory size
> copied.
>
> The compression takes advantage of the fact that pointers are usually
> located in a limited memory region (like a mempool). We can compress them
> by converting them to offsets from a base memory address.
>
> Offsets can be stored in fewer bytes (dictated by the memory region size
> and alignment of the pointer). For example: an 8 byte aligned pointer
> which is part of a 32GB memory pool can be stored in 4 bytes. The API is
> very generic and does not assume mempool pointers, any pointer can be
> passed in.
>
> Compression relies on a few fast operations and, especially when vector
> instructions are used, adds minimal overhead.
>
> The API accepts and returns arrays because the overhead means compression
> is only worth it when done in bulk.
>
> A test is added that shows the potential performance gain from compression.
> In this test an array of pointers is passed through a ring between two
> cores. The gain depends on the bulk operation size. In this synthetic test,
> run on an Ampere Altra, a substantial (up to 25%) performance gain is seen
> with bulk sizes larger than 32. At 32 it breaks even, and smaller sizes
> create a small (less than 5%) slowdown due to overhead.
>
> In a more realistic mock application running the L3 forwarding DPDK
> example that works in pipeline mode on two cores, this translated into a
> ~5% throughput increase on an Ampere Altra.
>
> v2:
> * addressed review comments (style, explanations and typos)
> * lowered bulk iterations closer to original numbers to keep runtime short
> * fixed pointer size warning on 32-bit arch
> v3:
> * added 16-bit versions of compression functions and tests
> * added documentation of these new utility functions in the EAL guide
> v4:
> * added unit test
> * fixed bug in NEON implementation of 32-bit decompress
> v5:
> * disabled NEON and SVE implementation on AARCH32 due to wrong pointer size
>
> Paul Szczepanek (4):
>    eal: add pointer compression functions
>    test: add pointer compress tests to ring perf test
>    docs: add pointer compression to the EAL guide
>    test: add unit test for ptr compression
>
>   .mailmap                                      |   1 +
>   app/test/meson.build                          |   1 +
>   app/test/test_eal_ptr_compress.c              | 108 ++++++
>   app/test/test_ring.h                          |  94 ++++-
>   app/test/test_ring_perf.c                     | 354 ++++++++++++------
>   .../prog_guide/env_abstraction_layer.rst      | 142 +++++++
>   lib/eal/include/meson.build                   |   1 +
>   lib/eal/include/rte_ptr_compress.h            | 266 +++++++++++++
>   8 files changed, 843 insertions(+), 124 deletions(-)
>   create mode 100644 app/test/test_eal_ptr_compress.c
>   create mode 100644 lib/eal/include/rte_ptr_compress.h
>
> --
> 2.25.1
>
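
To make the offset scheme from the cover letter above concrete, here is a
minimal scalar sketch. The function names and signatures are hypothetical,
chosen only for illustration; they are not the ones in rte_ptr_compress.h,
and the real header also has vectorised paths that this omits. It assumes
every pointer lies at or above the base address and is aligned to
(1 << shift) bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; not the rte_ptr_compress.h API. */

/*
 * Compress n pointers into 32-bit offsets from base. The low `shift`
 * bits of each offset are assumed to be zero (alignment) and are dropped.
 */
static inline void
sketch_ptr_compress_32(const void *base, void * const *src, uint32_t *dst,
		size_t n, unsigned int shift)
{
	for (size_t i = 0; i < n; i++) {
		uintptr_t off = (uintptr_t)src[i] - (uintptr_t)base;

		/* The offset must be aligned and fit in 32 bits once shifted. */
		assert((off & (((uintptr_t)1 << shift) - 1)) == 0);
		assert((off >> shift) <= UINT32_MAX);
		dst[i] = (uint32_t)(off >> shift);
	}
}

/* Rebuild the original pointers from the 32-bit offsets. */
static inline void
sketch_ptr_decompress_32(const void *base, const uint32_t *src, void **dst,
		size_t n, unsigned int shift)
{
	for (size_t i = 0; i < n; i++)
		dst[i] = (void *)((uintptr_t)base + ((uintptr_t)src[i] << shift));
}
```

With shift = 3 (8 byte alignment) a 32-bit offset covers 2^32 * 8 bytes,
which is the 32GB region mentioned in the cover letter, so each pointer
takes 4 bytes instead of 8 on a 64-bit target.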

