|SUCCESS| [v4] mempool: improve cache behaviour and performance

dpdklab at iol.unh.edu dpdklab at iol.unh.edu
Sat Apr 18 14:13:33 CEST 2026


Test-Label: ai-code-review
Test-Status: SUCCESS
https://dpdk.org/patch/163182

_AI Code Review_

=== Patch Review: series.patch (via Claude) ===
Review date: 2026-04-18

Reviewing the patch against DPDK coding standards and contribution guidelines.

commit 7f8a3b2e9d1c4a5b6e7f8a9b0c1d2e3f4a5b6c7d
Author: Morten Brorup <mb at smartsharesystems.com>

    mempool: improve cache behaviour and performance

This patch refactors the mempool cache to eliminate some unexpected
behaviour and reduce the mempool cache miss rate.

The patch changes the mempool cache algorithm to use half-size
flush/refill operations instead of full cache operations, and adjusts
bounce buffer limits to be based on cache->size/2 instead of
RTE_MEMPOOL_CACHE_MAX_SIZE.

> diff --git a/lib/mempool/rte_mempool.h b/lib/mempool/rte_mempool.h
> index 2e54fc4466..4ece89cb47 100644
> --- a/lib/mempool/rte_mempool.h
> +++ b/lib/mempool/rte_mempool.h

[ ... ]

> @@ -1390,24 +1397,27 @@ rte_mempool_do_generic_put(struct rte_mempool *mp, void * const *obj_table,
>  	RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_bulk, 1);
>  	RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_objs, n);
>  
> -	__rte_assume(cache->flushthresh <= RTE_MEMPOOL_CACHE_MAX_SIZE * 2);
> -	__rte_assume(cache->len <= RTE_MEMPOOL_CACHE_MAX_SIZE * 2);
> -	__rte_assume(cache->len <= cache->flushthresh);
> -	if (likely(cache->len + n <= cache->flushthresh)) {
> +	__rte_assume(cache->size <= RTE_MEMPOOL_CACHE_MAX_SIZE);
> +	__rte_assume(cache->len <= RTE_MEMPOOL_CACHE_MAX_SIZE);
> +	__rte_assume(cache->len <= cache->size);
> +	if (likely(cache->len + n <= cache->size)) {
>  		/* Sufficient room in the cache for the objects. */
>  		cache_objs = &cache->objs[cache->len];
>  		cache->len += n;
> -	} else if (n <= cache->flushthresh) {
> +	} else if (n <= cache->size / 2) {

Does this code correctly handle odd cache sizes? The doc comment in
rte_mempool_get_bulk() suggests cache_size should be even, but does the
code enforce this? If cache->size is odd, operations using cache->size/2
could lead to unexpected behavior due to integer division truncation.

> +		/* Fill the cache from the backend; fetch (size / 2) objects. */
> +		ret = rte_mempool_ops_dequeue_bulk(mp, cache->objs, cache->size / 2);

In rte_mempool_do_generic_get(), after this dequeue_bulk succeeds:

> +	__rte_assume(cache->size / 2 <= RTE_MEMPOOL_CACHE_MAX_SIZE / 2);
> +	__rte_assume(remaining <= RTE_MEMPOOL_CACHE_MAX_SIZE / 2);
> +	__rte_assume(remaining <= cache->size / 2);
> +	cache_objs = &cache->objs[cache->size / 2];
> +	cache->len = cache->size / 2 - remaining;

Does this logic correctly handle the case where remaining equals
cache->size/2? In that case cache->len becomes 0, which appears correct,
but it relies on cache->objs[0..cache->size/2-1] having just been
populated by the dequeue_bulk call, with all of those objects consumed
immediately. Does the backward iteration in the loop below correctly
consume from this range?

> @@ -1046,12 +1048,17 @@ rte_mempool_free(struct rte_mempool *mp);
>   * @param cache_size
>   *   If cache_size is non-zero, the rte_mempool library will try to
>   *   limit the accesses to the common lockless pool, by maintaining a
> - *   per-lcore object cache. This argument must be lower or equal to
> - *   RTE_MEMPOOL_CACHE_MAX_SIZE and n / 1.5.
> + *   per-lcore object cache. This argument must be an even number,
> + *   lower or equal to RTE_MEMPOOL_CACHE_MAX_SIZE and n.

The documentation now requires cache_size to be even. Does
rte_mempool_create_empty() validate this requirement? Looking at the code
changes in rte_mempool.c:

> @@ -850,7 +840,7 @@ rte_mempool_create_empty(const char *name, unsigned n, unsigned elt_size,
>  
>  	/* asked cache too big */
>  	if (cache_size > RTE_MEMPOOL_CACHE_MAX_SIZE ||
> -	    CALC_CACHE_FLUSHTHRESH(cache_size) > n) {
> +	    cache_size > n) {

Does this validation check enforce the even number requirement documented
above? Should there be a check like (cache_size & 1)?

> diff --git a/drivers/net/intel/idpf/idpf_common_rxtx_avx512.c b/drivers/net/intel/idpf/idpf_common_rxtx_avx512.c
> index 9af275cd9d..67893f6b07 100644
> --- a/drivers/net/intel/idpf/idpf_common_rxtx_avx512.c
> +++ b/drivers/net/intel/idpf/idpf_common_rxtx_avx512.c
> @@ -148,14 +148,20 @@ idpf_singleq_rearm(struct idpf_rx_queue *rxq)
>  	/* Can this be satisfied from the cache? */
>  	if (cache->len < IDPF_RXQ_REARM_THRESH) {
>  		/* No. Backfill the cache first, and then fill from it */
> -		uint32_t req = IDPF_RXQ_REARM_THRESH + (cache->size -
> -							cache->len);
>  
> -		/* How many do we require i.e. number to fill the cache + the request */
> +		/* Backfill would exceed the cache bounce buffer limit? */
> +		__rte_assume(cache->size / 2 <= RTE_MEMPOOL_CACHE_MAX_SIZE / 2);
> +		if (unlikely(cache->size / 2 < IDPF_RXQ_REARM_THRESH)) {
> +			idpf_singleq_rearm_common(rxq);
> +			return;
> +		}
> +
> +		/* Backfill the cache from the backend; fetch (size / 2) objects. */
> +		__rte_assume(cache->len < cache->size / 2);
>  		int ret = rte_mempool_ops_dequeue_bulk
> -				(rxq->mp, &cache->objs[cache->len], req);
> +				(rxq->mp, &cache->objs[cache->len], cache->size / 2);

In idpf_singleq_rearm() and idpf_splitq_rearm(), does this code handle the
case where cache->len is non-zero before the backfill? The objects are
dequeued to &cache->objs[cache->len], then cache->len is increased by
cache->size/2. Does this preserve existing cached objects correctly, or
could it lead to gaps in the cache->objs array?

