[RFC v3 2/3] lib: add fastmem library
Mattias Rönnblom
hofors at lysator.liu.se
Sat May 30 18:22:53 CEST 2026
On 5/28/26 21:56, Morten Brørup wrote:
>> From: Varghese, Vipin [mailto:Vipin.Varghese at amd.com]
>> Sent: Thursday, 28 May 2026 16.45
>>
>> Public
>>
>> Hi @Morten Brørup
>>
>> <snipped>
>>
>>>
>>>> +/**
>>>> + * Pre-reserve backing memory.
>>>> + *
>>>> + * Ensures that at least @p size bytes of memzone-backed memory are
>>>> + * available to the allocator on @p socket_id, reserving additional
>>>> + * memzones from EAL as needed to reach that total. Subsequent
>>>> + * allocations served from the pre-reserved memory do not incur
>>>> + * memzone-reservation cost.
>>>> + *
>>>> + * The reservation is cumulative: repeated calls to
>>>> + * rte_fastmem_reserve() with the same @p socket_id grow the
>>>> + * reservation monotonically. Reserved memory is never returned to
>>>> + * the system during the allocator's lifetime.
>>>> + *
>>>> + * A typical use is to call rte_fastmem_reserve() once at
>>>> + * application startup, with a size chosen to cover the expected
>>>> + * steady-state working set. Allocations and frees during
>>>> + * steady-state operation then avoid memzone reservations entirely.
>>>> + *
>>>> + * @param size
>>>> + * The minimum amount of backing memory, in bytes, to make
>>>> + * available on @p socket_id. The allocator may reserve more than
>>>> + * the requested amount due to internal rounding (e.g., to memzone
>>>> + * or block granularity).
>>>> + *
>>>> + * @param socket_id
>>>> + * The NUMA socket on which to reserve memory, or SOCKET_ID_ANY
>>>> + * to leave the choice to the allocator. With SOCKET_ID_ANY, the
>>>> + * allocator starts on the calling lcore's socket (or the first
>>>> + * configured socket if the caller is not bound to one) and falls
>>>> + * back to other sockets if the preferred socket cannot satisfy
>>>> + * the reservation.
>>>> + *
>>>> + * @return
>>>> + * - 0: Success.
>>>> + * - -ENOMEM: Insufficient huge-page memory to satisfy the request.
>>>> + * - -EINVAL: Invalid @p socket_id.
>>>> + */
>>>> +__rte_experimental
>>>> +int
>>>> +rte_fastmem_reserve(size_t size, int socket_id);
>>>
>>> @Bruce,
>>> I vaguely recall that we discussed something about busses and sockets
>>> a long time ago, but I cannot remember the details.
>>> Is socket_id the right type (and parameter name) to identify a memory bus?
>>>
>>> @Vipin,
>>> You have been working on topology awareness. Same question to you:
>>> Is socket_id the right type (and parameter name) to identify a memory bus?
>>
>> Short answer: socket_id is no longer a precise or sufficient
>> abstraction to represent a memory bus.
>> Based on the topology work with libhwloc, we’ve observed the following
>> across Ampere, Intel, and AMD platforms:
>>
>> Features like SNC (Sub-NUMA Clustering) on Intel and NPS (NUMA Per
>> Socket) on AMD change how socket_id maps to hardware.
>> In these modes:
>>
>> 1) A single physical socket can expose multiple NUMA domains.
>> 2) These NUMA domains align more closely with memory controller
>> groupings (i.e., memory buses) rather than the full socket.
>>
>>
>> Depending on the architecture:
>> a) Memory controllers may be collocated with compute cores or placed on
>> separate tiles.
>> b) As a result, socket_id can represent different scopes (full socket
>> vs. sub-socket domains), making it inconsistent.
>>
>>
>>
>> Hence practically: In some configurations, socket_id ≈ memory domain.
>> In others, it is coarser than the actual memory bus topology.
>>
>> To address this ambiguity, in the topology patches (v5/v6), we are
>> moving toward clearer separation:
>>
>> a. Cache domains (L1/L2/L3/L4) for compute locality
>> b. NUMA domains (memory + IO) as the unit for allocation locality
>>
>> This direction better reflects real hardware and avoids overloading
>> socket_id with multiple meanings.
>>
>> Happy to align this with the topology model we’re introducing so the
>> abstraction remains consistent going forward.
>> Thanks,
>> Vipin
>
> Thank you for the quick and detailed response, Vipin!
>
> I haven't looked deeply into the v5/v6 topology patches yet (it's on my TODO list).
>
> The rte_fastmem library builds on top of the rte_memzone library.
>
> So, if the rte_memzone library is updated so that its "socket_id" parameter carries a NUMA domain identifier (and is preferably renamed, e.g. to "numa_domain_id"), then the rte_fastmem library could remain unaffected, and its "socket_id" parameter would be passed on directly as the rte_memzone library's "numa_domain_id"?
>
What is a "domain" here? Is it the same as what is usually (in my
experience) referred to as a NUMA node?
I would just call it "node_id".

> This is my conclusion: At this point, proper support for allocating memory in specific NUMA domains is an rte_memzone library issue, and nothing to worry about for the rte_fastmem library - it will be automatically supported in rte_fastmem when supported by rte_memzone.
>
>
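To make the pass-through argument concrete, here is a minimal sketch of
a wrapper that forwards the identifier untouched to the layer below, so
that a change in what the identifier means only has to be handled in one
place. Everything named sketch_* / SKETCH_* is hypothetical and exists
only for illustration; none of it is the real rte_memzone or the
proposed rte_fastmem API, and the lower layer is a stub that merely
records the id it was given.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical constants, for illustration only. */
#define SKETCH_NODE_ID_ANY (-1)
#define SKETCH_MAX_NODES 4

/* Last node id accepted by the stub below; -2 means "none yet". */
static int sketch_last_node_id = -2;

/*
 * Stand-in for the memzone layer (assumption: NOT the real API).
 * It only validates the id and remembers it. Interpreting the id
 * as a socket, a NUMA node, or something else would live here.
 */
static int
sketch_memzone_reserve(size_t size, int node_id)
{
	(void)size; /* the stub does not allocate anything */

	if (node_id != SKETCH_NODE_ID_ANY &&
	    (node_id < 0 || node_id >= SKETCH_MAX_NODES))
		return -EINVAL;

	sketch_last_node_id = node_id;
	return 0;
}

/*
 * Stand-in for a fastmem-style reserve call. It does not inspect
 * or translate the id; it passes it straight down, so it needs no
 * change when the lower layer redefines what the id identifies.
 */
static int
sketch_fastmem_reserve(size_t size, int node_id)
{
	return sketch_memzone_reserve(size, node_id);
}
```

With this shape, a caller doing sketch_fastmem_reserve(1024, 2) ends up
with the stub seeing node id 2, and an out-of-range id is rejected by
the lower layer alone.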