[PATCH v4] net/af_xdp: fix umem map size for zero copy

Ferruh Yigit ferruh.yigit at amd.com
Thu May 23 15:31:31 CEST 2024


On 5/23/2024 10:22 AM, Morten Brørup wrote:
>> From: Frank Du [mailto:frank.du at intel.com]
>> Sent: Thursday, 23 May 2024 10.08
>>
>> The current calculation assumes that the mbufs are contiguous. However,
>> this assumption is incorrect when the mbuf memory spans across huge page.
>> To ensure that each mbuf resides exclusively within a single page, there
>> are deliberate spacing gaps when allocating mbufs across the boundaries.
> 
> A agree that this patch is an improvement of what existed previously.
> But I still don't understand the patch description. To me, it looks like the patch adds a missing check for contiguous memory, and the patch itself has nothing to do with huge pages. Anyway, if the maintainer agrees with the description, I don't mind not grasping it. ;-)
> 
> However, while trying to understand what is happening, I think I found one more (already existing) bug.
> I will show through an example inline below.
> 
>>
>> Correct to directly read the size from the mempool memory chunk.
>>
>> Fixes: d8a210774e1d ("net/af_xdp: support unaligned umem chunks")
>> Cc: stable at dpdk.org
>>
>> Signed-off-by: Frank Du <frank.du at intel.com>
>>
>> ---
>> v2:
>> * Add virtual contiguous detect for for multiple memhdrs
>> v3:
>> * Use RTE_ALIGN_FLOOR to get the aligned addr
>> * Add check on the first memhdr of memory chunks
>> v4:
>> * Replace the iterating with simple nb_mem_chunks check
>> ---
>>  drivers/net/af_xdp/rte_eth_af_xdp.c | 33 +++++++++++++++++++++++------
>>  1 file changed, 26 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c
>> b/drivers/net/af_xdp/rte_eth_af_xdp.c
>> index 6ba455bb9b..d0431ec089 100644
>> --- a/drivers/net/af_xdp/rte_eth_af_xdp.c
>> +++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
>> @@ -1040,16 +1040,32 @@ eth_link_update(struct rte_eth_dev *dev __rte_unused,
>>  }
>>
>>  #if defined(XDP_UMEM_UNALIGNED_CHUNK_FLAG)
>> -static inline uintptr_t get_base_addr(struct rte_mempool *mp, uint64_t
>> *align)
>> +static inline uintptr_t
>> +get_memhdr_info(const struct rte_mempool *mp, uint64_t *align, size_t *len)
>>  {
>>  	struct rte_mempool_memhdr *memhdr;
>>  	uintptr_t memhdr_addr, aligned_addr;
>>
>> +	if (mp->nb_mem_chunks != 1) {
>> +		/*
>> +		 * The mempool with multiple chunks is not virtual contiguous but
>> +		 * xsk umem only support single virtual region mapping.
>> +		 */
>> +		AF_XDP_LOG(ERR, "The mempool contain multiple %u memory
>> chunks\n",
>> +				   mp->nb_mem_chunks);
>> +		return 0;
>> +	}
>> +
>> +	/* Get the mempool base addr and align from the header now */
>>  	memhdr = STAILQ_FIRST(&mp->mem_list);
>> +	if (!memhdr) {
>> +		AF_XDP_LOG(ERR, "The mempool is not populated\n");
>> +		return 0;
>> +	}
>>  	memhdr_addr = (uintptr_t)memhdr->addr;
>> -	aligned_addr = memhdr_addr & ~(getpagesize() - 1);
>> +	aligned_addr = RTE_ALIGN_FLOOR(memhdr_addr, getpagesize());
>>  	*align = memhdr_addr - aligned_addr;
>> -
>> +	*len = memhdr->len;
>>  	return aligned_addr;
> 
> On x86_64, the page size is 4 KB = 0x1000.
> 
> Let's look at an example where memhdr->addr is not aligned to the page size:
> 
> In the example,
> memhdr->addr is 0x700100, and
> memhdr->len is 0x20000.
> 
> Then
> aligned_addr becomes 0x700000,
> *align becomes 0x100, and
> *len becomes 0x20000.
> 
>>  }
>>
>> @@ -1126,6 +1142,7 @@ xsk_umem_info *xdp_umem_configure(struct pmd_internals
>> *internals,
>>  	void *base_addr = NULL;
>>  	struct rte_mempool *mb_pool = rxq->mb_pool;
>>  	uint64_t umem_size, align = 0;
>> +	size_t len = 0;
>>
>>  	if (internals->shared_umem) {
>>  		if (get_shared_umem(rxq, internals->if_name, &umem) < 0)
>> @@ -1157,10 +1174,12 @@ xsk_umem_info *xdp_umem_configure(struct pmd_internals
>> *internals,
>>  		}
>>
>>  		umem->mb_pool = mb_pool;
>> -		base_addr = (void *)get_base_addr(mb_pool, &align);
>> -		umem_size = (uint64_t)mb_pool->populated_size *
>> -				(uint64_t)usr_config.frame_size +
>> -				align;
>> +		base_addr = (void *)get_memhdr_info(mb_pool, &align, &len);
>> +		if (!base_addr) {
>> +			AF_XDP_LOG(ERR, "The memory pool can't be mapped as
>> umem\n");
>> +			goto err;
>> +		}
>> +		umem_size = (uint64_t)len + align;
> 
> Here, umem_size becomes 0x20100.
> 
>>
>>  		ret = xsk_umem__create(&umem->umem, base_addr, umem_size,
>>  				&rxq->fq, &rxq->cq, &usr_config);
> 
> Here, xsk_umem__create() is called with the base_address (0x700000) preceding the address of the memory chunk (0x700100).
> It looks like a bug, causing a buffer underrun. I.e. will it access memory starting at base_address?
> 

I already asked for this on v2, Frank mentioned that area is not
accessed and having gap is safe.

> If I'm correct, the code should probably do this for alignment instead:
> 
> aligned_addr = RTE_ALIGN_CEIL(memhdr_addr, getpagesize());
> *align = aligned_addr - memhdr_addr;
> umem_size = (uint64_t)len - align;
> 
> 
> Disclaimer: I don't know much about the AF_XDP implementation, so maybe I just don't understand what is going on.
> 
>> --
>> 2.34.1
> 



More information about the dev mailing list