rte_malloc() and alignment

Stephen Hemminger stephen at networkplumber.org
Wed Feb 7 05:46:22 CET 2024


On Tue, 6 Feb 2024 17:17:31 +0100
Mattias Rönnblom <hofors at lysator.liu.se> wrote:

> The rte_malloc() API documentation has the following to say about the 
> align parameter:
> 
> "If 0, the return is a pointer that is suitably aligned for any kind of 
> variable (in the same manner as malloc()). Otherwise, the return is a 
> pointer that is a multiple of align. In this case, it must be a power of 
> two. (Minimum alignment is the cacheline size, i.e. 64-bytes)"
> 
> After reading this, one might be left with the impression that the 
> parenthesis refers to only the "otherwise" (non-zero-align) case, since 
> surely, cache line alignment should be sufficient for any kind of 
> variable and its semantics would be "in the same manner as malloc()".
> 
> However, in the actual RTE malloc implementation, any align parameter 
> value less than RTE_CACHE_LINE_SIZE results in an alignment of 
> RTE_CACHE_LINE_SIZE, unless I'm missing something.
> 
> Is there any conceivable scenario where passing a non-zero align 
> parameter is useful?
> 
> Would it be an improvement to rephrase the documentation to:
> 
> "The alignment of the allocated memory meets all of the following criteria:
> 1) able to hold any built-in type.
> 2) be at least as large as the align parameter.
> 3) be at least as large as RTE_CACHE_LINE_SIZE.
> 
> The align parameter must be a power-of-2 or 0.
> "
> 
> ...so it actually describes what is implemented? And also adds the 
> theoretical (?) case of a built-in type requiring > RTE_CACHE_LINE_SIZE 
> amount of alignment.

My reading is that an align of 0 means that rte_malloc() should act
the same as malloc() and give alignment suitable for the largest type.

Walking through the code, the real work happens in
malloc_heap_alloc_on_heap_id(), and by this point an align of 0 has
been converted to 1.

/*
 * Iterates through the freelist for a heap to find a free element with the
 * biggest size and requested alignment. Will also set size to whatever element
 * size that was found.
 * Returns null on failure, or pointer to element on success.
 */
static struct malloc_elem *
find_biggest_element(struct malloc_heap *heap, size_t *size,
		unsigned int flags, size_t align, bool contig)


Then the elements are examined with:

size_t
malloc_elem_find_max_iova_contig(struct malloc_elem *elem, size_t align)

But I don't see anywhere that 0 converts to being aligned on sizeof(double)
which is the largest type.
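
To make the question concrete, here is a minimal sketch of the behaviour
described in this thread. It is not DPDK code: SKETCH_CACHE_LINE_SIZE,
sketch_effective_align() and sketch_covers_builtin_types() are
hypothetical names, the 64-byte cache line is the value quoted from the
documentation, and the rounding up to the cache line size follows
Mattias's reading of the implementation rather than anything I have
verified line by line. For comparison, standard C expresses the malloc()
guarantee through alignof(max_align_t), which can be larger than
alignof(double) on some platforms.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* Assumption: 64-byte cache line, as quoted from the API documentation. */
#define SKETCH_CACHE_LINE_SIZE 64

static_assert((SKETCH_CACHE_LINE_SIZE & (SKETCH_CACHE_LINE_SIZE - 1)) == 0,
	"cache line size must be a power of two");

/*
 * Hypothetical helper, not a DPDK function.
 * Models the behaviour discussed above: an align of 0 is treated as 1,
 * and anything below the cache line size is raised to the cache line size.
 */
static size_t
sketch_effective_align(size_t align)
{
	if (align == 0)
		align = 1;

	return align < SKETCH_CACHE_LINE_SIZE ? SKETCH_CACHE_LINE_SIZE : align;
}

/*
 * Hypothetical helper: returns nonzero if the modelled alignment is at
 * least what standard C malloc() guarantees, i.e. alignof(max_align_t).
 */
static int
sketch_covers_builtin_types(size_t align)
{
	return sketch_effective_align(align) >= alignof(max_align_t);
}
```

If this model is right, an align of 0 and an align of 8 give the same
64-byte result, and the malloc()-style guarantee holds only as a side
effect of the cache line minimum, not because 0 is mapped to the
alignment of any particular type.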

Not sure who has expertise here? The allocator is a bit of a problem child.
It is complex, slow and critical.

