[RFC v2 2/3] lib: add fastmem library
Mattias Rönnblom
hofors at lysator.liu.se
Wed May 27 13:17:28 CEST 2026
On 5/27/26 12:18, Bruce Richardson wrote:
> On Wed, May 27, 2026 at 12:12:19PM +0200, Mattias Rönnblom wrote:
>> On 5/26/26 15:23, Stephen Hemminger wrote:
>>> On Tue, 26 May 2026 10:57:42 +0200
>>> Mattias Rönnblom <hofors at lysator.liu.se> wrote:
>>>
>>>> +__rte_experimental
>>>> +void *
>>>> +rte_fastmem_alloc(size_t size, size_t align, unsigned int flags)
>>>> + __rte_alloc_size(1) __rte_alloc_align(2);
>>>
>>> Should also add attribute __rte_malloc which tells compiler
>>> that pointer returned cannot alias other memory
>>>
>>> And add __rte_dealloc(rte_fastmem_free, 1)
>>> which tells compiler that the returned pointer should only go
>>> back to fastmem (not free, rte_free, etc).
>>
>> Done. Only works for the single-object ops (not bulk) though.
>>
>> I've had a look at how to extend fastmem to support larger allocations
>> (without suggesting this is the way to go).
>>
>> Seems to me that implementation should be something like
>> a) slab allocator for small objects.
>> b) a cache-less per-socket page run allocator for mid-sized objects.
>> c) per-object memzones for large objects.
>>
>> If one would implement that, you would essentially have a plug-in
>> replacement for rte_malloc.h (maybe minus some debug and some more esoteric
>> DPDK heap features).
>>
>> Should fastmem be an outright replacement, or something that at least
>> initially lives alongside the regular heap, maybe with a run- or
>> compile-time option to make rte_malloc.h functions delegate to fastmem? This
>> is unclear to me at this point. I fear the more ambitious, cleaner and more
>> risky DPDK heap replacement path will go the way my attempts to replace
>> rte_memcpy or rte_timer went.
>>
>> I would agree with anyone saying that we should have only one heap-like API
>> for memory allocations. rte_malloc.h obviously needs to stay, for backward
>> compatibility reasons, if nothing else. I would like to add bulk alloc/free,
>> and allow for smaller alignments than 64, since slabs can do that
>> efficiently (DPDK heap per-object header is 128 bytes!). One could either go
>> about that by extending rte_malloc.h or deprecating that API and starting
>> anew. In the latter case, one could do many more minor tweaks, like removing
>> the type pointers (only a nuance), removing the validate function, changing
>> and extending the stats interface, etc.
>>
> +1 for replacing rte_malloc. For a replacement, I'd tend towards aiming for
> compatibility over trying to fix too many little things at once. While
> changing a couple of things is ok, I'd rather not force applications to
> make too many updates to their code when moving from one DPDK version to
> another.
>
What one could attempt to do is to be fully backward compatible with
rte_malloc.h (maybe minus some debug features that require per-object
headers?) and then expose a new API which is functionally a superset of
rte_malloc.h.
The new API would be something like a hybrid of rte_malloc.h, the
mempool APIs, and the kind of API you find on an in-kernel memory
manager (e.g., Solaris'), tuned for DPDK lcore use.
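
A minimal sketch of that layering, in plain C with no DPDK dependency.
All names here (newapi_alloc, newapi_free, compat_malloc) are
hypothetical and not part of the patch. aligned_alloc() stands in for
the real allocator, and mapping align == 0 to a 64-byte cache line is
an assumption of the sketch, not documented rte_malloc behavior. The
point is only that the legacy-shaped call can be a thin wrapper around
a call that takes an explicit, possibly smaller, alignment plus flags.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define CACHE_LINE 64 /* assumed cache line size */

/* Hypothetical new-style call: explicit alignment and flags. */
static void *
newapi_alloc(size_t size, size_t align, unsigned int flags)
{
	size_t rounded;

	(void)flags; /* unused in this sketch */

	/* Reject zero sizes and alignments that are not a power of two. */
	if (size == 0 || align == 0 || (align & (align - 1)) != 0)
		return NULL;
	if (size > SIZE_MAX - align)
		return NULL;

	/* C11 aligned_alloc() wants size to be a multiple of align. */
	rounded = (size + align - 1) & ~(align - 1);

	return aligned_alloc(align, rounded);
}

/* Hypothetical counterpart to newapi_alloc(). */
static void
newapi_free(void *ptr)
{
	free(ptr);
}

/* Legacy-shaped wrapper with the same parameter list as rte_malloc(). */
static void *
compat_malloc(const char *type, size_t size, unsigned int align)
{
	(void)type; /* the type string is ignored in this sketch */

	return newapi_alloc(size, align == 0 ? CACHE_LINE : align, 0);
}
```

With this shape, existing callers keep passing (type, size, align),
while new code can ask for e.g. 8-byte alignment directly through the
newer call.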
> /Bruce