|SUCCESS| [v4] mempool: improve cache behaviour and performance
dpdklab at iol.unh.edu
Sat Apr 18 15:15:35 CEST 2026
Test-Label: ai-code-review
Test-Status: SUCCESS
https://dpdk.org/patch/163182
_AI Code Review_
=== Patch Review: series.patch (via Claude) ===
Review date: 2026-04-18
This patch implements significant changes to the mempool cache algorithm. I'll review it systematically for correctness and compliance issues.
git sha: Not specified in patch
Author: Morten Brorup <mb at smartsharesystems.com>
Subject: [PATCH v4] mempool: improve cache behaviour and performance
Brief summary:
This patch refactors the mempool cache algorithm to match the configured
cache size (eliminating the previous 1.5x behavior), improves cache
bounce buffer limits, and implements a half-size flush/refill strategy to
reduce cache miss rates.
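To make the half-size flush strategy concrete, here is a minimal model of the put-side decision, for illustration only. The names (model_cache, model_put, CACHE_MAX_SIZE) are hypothetical stand-ins; the real logic lives in rte_mempool_do_generic_put() in rte_mempool.h.

```c
/* Minimal model of the half-size flush strategy (hypothetical names). */
#include <assert.h>

#define CACHE_MAX_SIZE 512  /* stand-in for RTE_MEMPOOL_CACHE_MAX_SIZE */

struct model_cache {
	unsigned int size;  /* configured cache size, <= CACHE_MAX_SIZE */
	unsigned int len;   /* current number of cached objects, <= size */
};

/* Put n objects into the cache; returns how many get flushed to the
 * backend as a side effect. */
static unsigned int
model_put(struct model_cache *c, unsigned int n)
{
	if (c->len + n <= c->size) {
		c->len += n;                    /* enough room in the cache */
		return 0;
	}
	if (n <= c->size / 2) {
		/* Flush half the cache, then keep the new objects. */
		c->len = c->len - c->size / 2 + n;
		return c->size / 2;
	}
	return n;  /* request too big: bypass the cache entirely */
}
```

With size = 256 and len = 254, putting 8 objects flushes 128 and leaves len = 134, so after a flush both the put and get paths have roughly half the cache of headroom, which is the stated cache-miss-rate rationale.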
> diff --git a/drivers/net/intel/idpf/idpf_common_rxtx_avx512.c b/drivers/net/intel/idpf/idpf_common_rxtx_avx512.c
> index 9af275cd9d..67893f6b07 100644
> --- a/drivers/net/intel/idpf/idpf_common_rxtx_avx512.c
> +++ b/drivers/net/intel/idpf/idpf_common_rxtx_avx512.c
> @@ -148,14 +148,20 @@ idpf_singleq_rearm(struct idpf_rx_queue *rxq)
> /* Can this be satisfied from the cache? */
> if (cache->len < IDPF_RXQ_REARM_THRESH) {
> /* No. Backfill the cache first, and then fill from it */
> - uint32_t req = IDPF_RXQ_REARM_THRESH + (cache->size -
> - cache->len);
>
> - /* How many do we require i.e. number to fill the cache + the request */
> + /* Backfill would exceed the cache bounce buffer limit? */
> + __rte_assume(cache->size / 2 <= RTE_MEMPOOL_CACHE_MAX_SIZE / 2);
> + if (unlikely(cache->size / 2 < IDPF_RXQ_REARM_THRESH)) {
> + idpf_singleq_rearm_common(rxq);
> + return;
> + }
> +
> + /* Backfill the cache from the backend; fetch (size / 2) objects. */
> + __rte_assume(cache->len < cache->size / 2);
Can this assumption be violated if the cache is used concurrently, or
after a partial dequeue? The assumption requires cache->len <
cache->size / 2, yet the enclosing branch only established cache->len <
IDPF_RXQ_REARM_THRESH. Does IDPF_RXQ_REARM_THRESH, together with the
early-return guard just above, guarantee this relationship?
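For what it is worth, the two guards can be checked exhaustively. The sketch below (hypothetical function name; thresh stands in for IDPF_RXQ_REARM_THRESH) encodes exactly the conditions on this code path:

```c
/* Brute-force check of the questioned assumption (hypothetical names).
 * Claim: whenever control reaches the __rte_assume(), the two earlier
 * conditions imply len < size / 2. */
#include <assert.h>

static int
assume_holds_on_path(unsigned int len, unsigned int size,
                     unsigned int thresh)
{
	if (!(len < thresh))
		return 1;           /* rearm branch not entered at all */
	if (size / 2 < thresh)
		return 1;           /* early return to the common path */
	/* Here: len < thresh && thresh <= size / 2  =>  len < size / 2 */
	return len < size / 2;
}
```

Exhausting small values confirms the implication, so the assumption cannot be violated by these guards alone; concurrent use of the same per-lcore cache would be a separate, pre-existing contract violation.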
> int ret = rte_mempool_ops_dequeue_bulk
> - (rxq->mp, &cache->objs[cache->len], req);
> + (rxq->mp, &cache->objs[cache->len], cache->size / 2);
Can this dequeue overflow cache->objs[] if cache->len + cache->size / 2
exceeds the allocated array size? The structure definition sizes the
array at RTE_MEMPOOL_CACHE_MAX_SIZE * 2, but what happens when
cache->len is near its maximum and another cache->size / 2 objects are
appended?
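A bound sketch for the write in question (hypothetical names; MAX stands in for RTE_MEMPOOL_CACHE_MAX_SIZE): given the assumptions in effect at the dequeue, size <= MAX and len < size / 2, the highest index written is len + size / 2 - 1, which stays below size and hence below MAX.

```c
/* Bound check for the backfill write (hypothetical names). */
#include <assert.h>

#define MAX 512  /* stand-in for RTE_MEMPOOL_CACHE_MAX_SIZE */

static int
backfill_fits(unsigned int len, unsigned int size)
{
	if (size > MAX || len >= size / 2)
		return 1;   /* preconditions not met; vacuously fine */
	/* The dequeue writes objs[len .. len + size/2 - 1]. */
	return len + size / 2 <= size;
}
```

So under those assumptions the write fits even in a cache array of only RTE_MEMPOOL_CACHE_MAX_SIZE entries, let alone one of twice that size.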
> if (ret == 0) {
> - cache->len += req;
> + cache->len += cache->size / 2;
[ ... ]
> @@ -565,14 +571,20 @@ idpf_splitq_rearm(struct idpf_rx_queue *rx_bufq)
> /* Can this be satisfied from the cache? */
> if (cache->len < IDPF_RXQ_REARM_THRESH) {
> /* No. Backfill the cache first, and then fill from it */
> - uint32_t req = IDPF_RXQ_REARM_THRESH + (cache->size -
> - cache->len);
>
> - /* How many do we require i.e. number to fill the cache + the request */
> + /* Backfill would exceed the cache bounce buffer limit? */
> + __rte_assume(cache->size / 2 <= RTE_MEMPOOL_CACHE_MAX_SIZE / 2);
> + if (unlikely(cache->size / 2 < IDPF_RXQ_REARM_THRESH)) {
> + idpf_splitq_rearm_common(rx_bufq);
> + return;
> + }
> +
> + /* Backfill the cache from the backend; fetch (size / 2) objects. */
> + __rte_assume(cache->len < cache->size / 2);
The same question applies here: can this assumption safely be made when
the conditional only checks cache->len against IDPF_RXQ_REARM_THRESH?
[ ... ]
> diff --git a/lib/mempool/rte_mempool.c b/lib/mempool/rte_mempool.c
> index 3042d94c14..805b52cc58 100644
> --- a/lib/mempool/rte_mempool.c
> +++ b/lib/mempool/rte_mempool.c
> @@ -850,7 +840,7 @@ rte_mempool_create_empty(const char *name, unsigned n, unsigned elt_size,
>
> /* asked cache too big */
> if (cache_size > RTE_MEMPOOL_CACHE_MAX_SIZE ||
> - CALC_CACHE_FLUSHTHRESH(cache_size) > n) {
> + cache_size > n) {
This change permits larger caches relative to the pool size than
before: the old check required roughly 1.5 * cache_size <= n, while the
new one only requires cache_size <= n. Does the mempool code correctly
handle the edge cases where cache_size approaches n, specifically in
rte_mempool_do_generic_put() and rte_mempool_do_generic_get()?
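To illustrate what the relaxed check admits (hypothetical names; the 1.5x factor mirrors the old CALC_CACHE_FLUSHTHRESH macro): a configuration such as cache_size = 64, n = 80 was rejected before and is accepted now.

```c
/* Old vs. new cache_size validity checks (hypothetical names). */
#include <assert.h>

#define MAX 512  /* stand-in for RTE_MEMPOOL_CACHE_MAX_SIZE */

static int
old_check_ok(unsigned int cache_size, unsigned int n)
{
	unsigned int flushthresh = cache_size + cache_size / 2; /* ~1.5x */
	return cache_size <= MAX && flushthresh <= n;
}

static int
new_check_ok(unsigned int cache_size, unsigned int n)
{
	return cache_size <= MAX && cache_size <= n;
}
```

Under the new check a single lcore's full cache may hold the entire pool (cache_size == n), which is why the edge-case behaviour in the put/get fast paths deserves scrutiny.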
[ ... ]
> diff --git a/lib/mempool/rte_mempool.h b/lib/mempool/rte_mempool.h
> index 2e54fc4466..4ece89cb47 100644
> --- a/lib/mempool/rte_mempool.h
> +++ b/lib/mempool/rte_mempool.h
[ ... ]
> @@ -1390,24 +1397,27 @@ rte_mempool_do_generic_put(struct rte_mempool *mp, void * const *obj_table,
> RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_bulk, 1);
> RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_objs, n);
>
> - __rte_assume(cache->flushthresh <= RTE_MEMPOOL_CACHE_MAX_SIZE * 2);
> - __rte_assume(cache->len <= RTE_MEMPOOL_CACHE_MAX_SIZE * 2);
> - __rte_assume(cache->len <= cache->flushthresh);
> - if (likely(cache->len + n <= cache->flushthresh)) {
> + __rte_assume(cache->size <= RTE_MEMPOOL_CACHE_MAX_SIZE);
> + __rte_assume(cache->len <= RTE_MEMPOOL_CACHE_MAX_SIZE);
> + __rte_assume(cache->len <= cache->size);
> + if (likely(cache->len + n <= cache->size)) {
> /* Sufficient room in the cache for the objects. */
> cache_objs = &cache->objs[cache->len];
> cache->len += n;
> - } else if (n <= cache->flushthresh) {
> + } else if (n <= cache->size / 2) {
> /*
> - * The cache is big enough for the objects, but - as detected by
> - * the comparison above - has insufficient room for them.
> - * Flush the cache to make room for the objects.
> + * The number of objects is within the cache bounce buffer limit,
> + * but - as detected by the comparison above - the cache has
> + * insufficient room for them.
> + * Flush the cache to the backend to make room for the objects;
> + * flush (size / 2) objects.
> */
> - cache_objs = &cache->objs[0];
> - rte_mempool_ops_enqueue_bulk(mp, cache_objs, cache->len);
> - cache->len = n;
> + __rte_assume(cache->len > cache->size / 2);
> + cache_objs = &cache->objs[cache->len - cache->size / 2];
> + cache->len = cache->len - cache->size / 2 + n;
> + rte_mempool_ops_enqueue_bulk(mp, cache_objs, cache->size / 2);
Does this code correctly maintain cache->len after the flush? After the
enqueue_bulk, cache->len is set to cache->len - cache->size / 2 + n,
and the flushed objects are the cache->size / 2 entries starting at
index cache->len - cache->size / 2, i.e. the top of the stack. Are the
n new objects then copied into cache_objs at that same index, or does
this leave a hole in the cache array between the retained objects and
the new ones?
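A small model of the flush-then-store sequence can settle the contiguity question (hypothetical names; the assumption here is that the code following this hunk copies the n new objects to cache_objs, as the pre-patch code did):

```c
/* Model of the flush-half put path (hypothetical names). */
#include <assert.h>
#include <string.h>

#define MAX 512

static unsigned int
flush_half_then_store(int *objs, unsigned int len, unsigned int size,
                      const int *new_objs, unsigned int n)
{
	unsigned int half = size / 2;
	/* Enqueue objs[len - half .. len - 1] to the backend; model the
	 * flushed slots as stale by overwriting them with -1. */
	memset(&objs[len - half], 0xff, half * sizeof(*objs));
	/* The subsequent copy stores the new objects at the same index,
	 * overwriting the start of the flushed range. */
	memcpy(&objs[len - half], new_objs, n * sizeof(*new_objs));
	return len - half + n;  /* the new cache->len */
}
```

With len = 250, size = 256, n = 8: the flushed range is objs[122..249], the new objects land in objs[122..129], and the returned length is 130, so objs[0..129] remains contiguous with no hole, provided the store really targets cache_objs.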
> } else {
> - /* The request itself is too big for the cache. */
> + /* The request itself is too big. */
[ ... ]
> @@ -1524,7 +1534,7 @@ rte_mempool_do_generic_get(struct rte_mempool *mp, void **obj_table,
> /* The cache is a stack, so copy will be in reverse order. */
> cache_objs = &cache->objs[cache->len];
>
> - __rte_assume(cache->len <= RTE_MEMPOOL_CACHE_MAX_SIZE * 2);
> + __rte_assume(cache->len <= RTE_MEMPOOL_CACHE_MAX_SIZE);
> if (likely(n <= cache->len)) {
> /* The entire request can be satisfied from the cache. */
> RTE_MEMPOOL_CACHE_STAT_ADD(cache, get_success_bulk, 1);
> @@ -1548,13 +1558,13 @@ rte_mempool_do_generic_get(struct rte_mempool *mp, void **obj_table,
> for (index = 0; index < len; index++)
> *obj_table++ = *--cache_objs;
>
> - /* Dequeue below would overflow mem allocated for cache? */
> - if (unlikely(remaining > RTE_MEMPOOL_CACHE_MAX_SIZE))
> + /* Dequeue below would exceed the cache bounce buffer limit? */
> + __rte_assume(cache->size / 2 <= RTE_MEMPOOL_CACHE_MAX_SIZE / 2);
> + if (unlikely(remaining > cache->size / 2))
> goto driver_dequeue;
>
> - /* Fill the cache from the backend; fetch size + remaining objects. */
> - ret = rte_mempool_ops_dequeue_bulk(mp, cache->objs,
> - cache->size + remaining);
> + /* Fill the cache from the backend; fetch (size / 2) objects. */
> + ret = rte_mempool_ops_dequeue_bulk(mp, cache->objs, cache->size / 2);
> if (unlikely(ret < 0)) {
> /*
> * We are buffer constrained, and not able to fetch all that.
> @@ -1568,10 +1578,11 @@ rte_mempool_do_generic_get(struct rte_mempool *mp, void **obj_table,
> RTE_MEMPOOL_CACHE_STAT_ADD(cache, get_success_bulk, 1);
> RTE_MEMPOOL_CACHE_STAT_ADD(cache, get_success_objs, n);
>
> - __rte_assume(cache->size <= RTE_MEMPOOL_CACHE_MAX_SIZE);
> - __rte_assume(remaining <= RTE_MEMPOOL_CACHE_MAX_SIZE);
> - cache_objs = &cache->objs[cache->size + remaining];
> - cache->len = cache->size;
> + __rte_assume(cache->size / 2 <= RTE_MEMPOOL_CACHE_MAX_SIZE / 2);
> + __rte_assume(remaining <= RTE_MEMPOOL_CACHE_MAX_SIZE / 2);
> + __rte_assume(remaining <= cache->size / 2);
> + cache_objs = &cache->objs[cache->size / 2];
> + cache->len = cache->size / 2 - remaining;
Does this logic correctly handle the case where remaining equals zero?
When remaining is 0, cache->len becomes cache->size / 2, and we just
fetched cache->size / 2 objects into cache->objs[0 .. cache->size/2 - 1],
which seems consistent; note the ret < 0 path above already bails out
before this point, so the dequeue has succeeded when cache->len is set.
Also, when remaining > 0, does cache_objs point to the correct
location? It is set to &cache->objs[cache->size / 2] and then
decremented in the loop, walking backwards from index
cache->size / 2 - 1. Is this intentional, matching the "cache is a
stack, so copy will be in reverse order" comment earlier in the
function?
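The index arithmetic can be modelled directly (hypothetical names; the backend dequeue is simulated by filling the refilled range with its own indices):

```c
/* Model of the get-path refill and hand-off (hypothetical names). */
#include <assert.h>

#define MAX 512

static unsigned int
refill_then_take(int *objs, unsigned int size, unsigned int remaining,
                 int *out)
{
	unsigned int half = size / 2;
	unsigned int i;
	int *cache_objs;

	for (i = 0; i < half; i++)
		objs[i] = (int)i;       /* simulated backend dequeue */

	cache_objs = &objs[half];       /* one past the refilled range */
	for (i = 0; i < remaining; i++)
		*out++ = *--cache_objs; /* pop from the top, stack order */

	return half - remaining;        /* the new cache->len */
}
```

With size = 256 and remaining = 3, the caller receives objects 127, 126, 125 and the cache keeps objs[0..124] (len = 125); remaining = 0 degenerates cleanly to len = size / 2. So the backwards walk appears intentional, consistent with the stack semantics.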