[RFC v5 1/6] eal: add static per-lcore memory allocation facility
Konstantin Ananyev
konstantin.ananyev at huawei.com
Tue Mar 19 13:52:47 CET 2024
Hi Mattias,
> Introduce DPDK per-lcore id variables, or lcore variables for short.
>
> An lcore variable has one value for every current and future lcore
> id-equipped thread.
>
> The primary <rte_lcore_var.h> use case is for statically allocating
> small chunks of often-used data, which is related logically, but where
> there are performance benefits to reap from having updates being local
> to an lcore.
>
> Lcore variables are similar to thread-local storage (TLS, e.g., C11
> _Thread_local), but decoupling the values' life time with that of the
> threads.
>
> Lcore variables are also similar in terms of functionality provided by
> FreeBSD kernel's DPCPU_*() family of macros and the associated
> build-time machinery. DPCPU uses linker scripts, which effectively
> prevents the reuse of its, otherwise seemingly viable, approach.
>
> The currently-prevailing way to solve the same problem as lcore
> variables is to keep a module's per-lcore data as RTE_MAX_LCORE-sized
> array of cache-aligned, RTE_CACHE_GUARDed structs. The benefit of
> lcore variables over this approach is that data related to the same
> lcore now is close (spatially, in memory), rather than data used by
> the same module, which in turn avoid excessive use of padding,
> polluting caches with unused data.
Thanks for the RFC, very interesting one.
Few comments/questions below.
> RFC v5:
> * In Doxygen, consistenly use @<cmd> (and not \<cmd>).
> * The RTE_LCORE_VAR_GET() and SET() convience access macros
> covered an uncommon use case, where the lcore value is of a
> primitive type, rather than a struct, and is thus eliminated
> from the API. (Morten Brørup)
> * In the wake up GET()/SET() removeal, rename RTE_LCORE_VAR_PTR()
> RTE_LCORE_VAR_VALUE().
> * The underscores are removed from __rte_lcore_var_lcore_ptr() to
> signal that this function is a part of the public API.
> * Macro arguments are documented.
>
> RFV v4:
> * Replace large static array with libc heap-allocated memory. One
> implication of this change is there no longer exists a fixed upper
> bound for the total amount of memory used by lcore variables.
> RTE_MAX_LCORE_VAR has changed meaning, and now represent the
> maximum size of any individual lcore variable value.
> * Fix issues in example. (Morten Brørup)
> * Improve access macro type checking. (Morten Brørup)
> * Refer to the lcore variable handle as "handle" and not "name" in
> various macros.
> * Document lack of thread safety in rte_lcore_var_alloc().
> * Provide API-level assurance the lcore variable handle is
> always non-NULL, to all applications to use NULL to mean
> "not yet allocated".
> * Note zero-sized allocations are not allowed.
> * Give API-level guarantee the lcore variable values are zeroed.
>
> RFC v3:
> * Replace use of GCC-specific alignof(<expression>) with alignof(<type>).
> * Update example to reflect FOREACH macro name change (in RFC v2).
>
> RFC v2:
> * Use alignof to derive alignment requirements. (Morten Brørup)
> * Change name of FOREACH to make it distinct from <rte_lcore.h>'s
> *per-EAL-thread* RTE_LCORE_FOREACH(). (Morten Brørup)
> * Allow user-specified alignment, but limit max to cache line size.
>
> Signed-off-by: Mattias Rönnblom <mattias.ronnblom at ericsson.com>
> Acked-by: Morten Brørup <mb at smartsharesystems.com>
> ---
> config/rte_config.h | 1 +
> doc/api/doxy-api-index.md | 1 +
> lib/eal/common/eal_common_lcore_var.c | 68 +++++
> lib/eal/common/meson.build | 1 +
> lib/eal/include/meson.build | 1 +
> lib/eal/include/rte_lcore_var.h | 368 ++++++++++++++++++++++++++
> lib/eal/version.map | 4 +
> 7 files changed, 444 insertions(+)
> create mode 100644 lib/eal/common/eal_common_lcore_var.c
> create mode 100644 lib/eal/include/rte_lcore_var.h
>
> diff --git a/config/rte_config.h b/config/rte_config.h
> index d743a5c3d3..0dac33d3b9 100644
> --- a/config/rte_config.h
> +++ b/config/rte_config.h
> @@ -41,6 +41,7 @@
> /* EAL defines */
> #define RTE_CACHE_GUARD_LINES 1
> #define RTE_MAX_HEAPS 32
> +#define RTE_MAX_LCORE_VAR 1048576
> #define RTE_MAX_MEMSEG_LISTS 128
> #define RTE_MAX_MEMSEG_PER_LIST 8192
> #define RTE_MAX_MEM_MB_PER_LIST 32768
> diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
> index 8c1eb8fafa..a3b8391570 100644
> --- a/doc/api/doxy-api-index.md
> +++ b/doc/api/doxy-api-index.md
> @@ -99,6 +99,7 @@ The public API headers are grouped by topics:
> [interrupts](@ref rte_interrupts.h),
> [launch](@ref rte_launch.h),
> [lcore](@ref rte_lcore.h),
> + [lcore-varible](@ref rte_lcore_var.h),
> [per-lcore](@ref rte_per_lcore.h),
> [service cores](@ref rte_service.h),
> [keepalive](@ref rte_keepalive.h),
> diff --git a/lib/eal/common/eal_common_lcore_var.c b/lib/eal/common/eal_common_lcore_var.c
> new file mode 100644
> index 0000000000..5c353ebd46
> --- /dev/null
> +++ b/lib/eal/common/eal_common_lcore_var.c
> @@ -0,0 +1,68 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2024 Ericsson AB
> + */
> +
> +#include <inttypes.h>
> +
> +#include <rte_common.h>
> +#include <rte_debug.h>
> +#include <rte_log.h>
> +
> +#include <rte_lcore_var.h>
> +
> +#include "eal_private.h"
> +
> +#define LCORE_BUFFER_SIZE (RTE_MAX_LCORE_VAR * RTE_MAX_LCORE)
> +
> +static void *lcore_buffer;
> +static size_t offset = RTE_MAX_LCORE_VAR;
> +
> +static void *
> +lcore_var_alloc(size_t size, size_t align)
> +{
> + void *handle;
> + void *value;
> +
> + offset = RTE_ALIGN_CEIL(offset, align);
> +
> + if (offset + size > RTE_MAX_LCORE_VAR) {
> + lcore_buffer = aligned_alloc(RTE_CACHE_LINE_SIZE,
> + LCORE_BUFFER_SIZE);
Hmm... do I get it right: if offset is <= then RTE_MAX_LCORE_VAR, and offset + size > RTE_MAX_LCORE_VAR
we simply overwrite lcore_buffer with newly allocated buffer of the same size?
I understand that you expect it just never to happen (total size of all lcore vars never exceed 1MB), but still
I think we need to handle it in a some better way then just ignoring such possibility...
Might be RTE_VERIFY() at least?
As a more generic question - do we need to support LCORE_VAR for dlopen()s that could happen after rte_eal_init()
is called and LCORE threads were created?
Because, if no, then we probably can make this construction much more flexible:
one buffer per LCORE, allocate on demand, etc.
> + RTE_VERIFY(lcore_buffer != NULL);
> +
> + offset = 0;
> + }
> +
> + handle = RTE_PTR_ADD(lcore_buffer, offset);
> +
> + offset += size;
> +
> + RTE_LCORE_VAR_FOREACH_VALUE(value, handle)
> + memset(value, 0, size);
> +
> + EAL_LOG(DEBUG, "Allocated %"PRIuPTR" bytes of per-lcore data with a "
> + "%"PRIuPTR"-byte alignment", size, align);
> +
> + return handle;
> +}
> +
> +void *
> +rte_lcore_var_alloc(size_t size, size_t align)
> +{
> + /* Having the per-lcore buffer size aligned on cache lines
> + * assures as well as having the base pointer aligned on cache
> + * size assures that aligned offsets also translate to alipgned
> + * pointers across all values.
> + */
> + RTE_BUILD_BUG_ON(RTE_MAX_LCORE_VAR % RTE_CACHE_LINE_SIZE != 0);
> + RTE_ASSERT(align <= RTE_CACHE_LINE_SIZE);
> + RTE_ASSERT(size <= RTE_MAX_LCORE_VAR);
> +
> + /* '0' means asking for worst-case alignment requirements */
> + if (align == 0)
> + align = alignof(max_align_t);
> +
> + RTE_ASSERT(rte_is_power_of_2(align));
> +
> + return lcore_var_alloc(size, align);
> +}
More information about the dev
mailing list