[PATCH v2 1/1] mempool: implement index-based per core cache
Morten Brørup
mb at smartsharesystems.com
Thu Jan 20 09:21:40 CET 2022
+CC Beilei as i40e maintainer
> From: Dharmik Thakkar [mailto:dharmik.thakkar at arm.com]
> Sent: Thursday, 13 January 2022 06.37
>
> Current mempool per core cache implementation stores pointers to mbufs
> On 64b architectures, each pointer consumes 8B
> This patch replaces it with index-based implementation,
> where in each buffer is addressed by (pool base address + index)
> It reduces the amount of memory/cache required for per core cache
>
> L3Fwd performance testing reveals minor improvements in the cache
> performance (L1 and L2 misses reduced by 0.60%)
> with no change in throughput
>
> Suggested-by: Honnappa Nagarahalli <honnappa.nagarahalli at arm.com>
> Signed-off-by: Dharmik Thakkar <dharmik.thakkar at arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang at arm.com>
> ---
> lib/mempool/rte_mempool.h | 150 +++++++++++++++++++++++++-
> lib/mempool/rte_mempool_ops_default.c | 7 ++
> 2 files changed, 156 insertions(+), 1 deletion(-)
>
> diff --git a/lib/mempool/rte_mempool.h b/lib/mempool/rte_mempool.h
> index 1e7a3c15273c..f2403fbc97a7 100644
> --- a/lib/mempool/rte_mempool.h
> +++ b/lib/mempool/rte_mempool.h
> @@ -50,6 +50,10 @@
> #include <rte_memcpy.h>
> #include <rte_common.h>
>
> +#ifdef RTE_MEMPOOL_INDEX_BASED_LCORE_CACHE
> +#include <rte_vect.h>
> +#endif
> +
> #include "rte_mempool_trace_fp.h"
>
> #ifdef __cplusplus
> @@ -239,6 +243,9 @@ struct rte_mempool {
> int32_t ops_index;
>
> struct rte_mempool_cache *local_cache; /**< Per-lcore local cache
> */
> +#ifdef RTE_MEMPOOL_INDEX_BASED_LCORE_CACHE
> + void *pool_base_value; /**< Base value to calculate indices */
> +#endif
>
> uint32_t populated_size; /**< Number of populated
> objects. */
> struct rte_mempool_objhdr_list elt_list; /**< List of objects in
> pool */
> @@ -1314,7 +1321,22 @@ rte_mempool_cache_flush(struct rte_mempool_cache
> *cache,
> if (cache == NULL || cache->len == 0)
> return;
> rte_mempool_trace_cache_flush(cache, mp);
> +
> +#ifdef RTE_MEMPOOL_INDEX_BASED_LCORE_CACHE
> + unsigned int i;
> + unsigned int cache_len = cache->len;
> + void *obj_table[RTE_MEMPOOL_CACHE_MAX_SIZE * 3];
> + void *base_value = mp->pool_base_value;
> + uint32_t *cache_objs = (uint32_t *) cache->objs;
Hi Dharmik and Honnappa,
The essence of this patch is based on recasting the type of the objs field in the rte_mempool_cache structure from an array of pointers to an array of uint32_t.
However, this effectively breaks the ABI, because the rte_mempool_cache structure is public and part of the API.
Some drivers [1] even bypass the mempool API and access the rte_mempool_cache structure directly, assuming that the objs array in the cache is an array of pointers. So you cannot recast the fields in the rte_mempool_cache structure the way this patch requires.
Although I do consider bypassing an API's accessor functions "spaghetti code", this driver's behavior is formally acceptable as long as the rte_mempool_cache structure is not marked as internal.
I really liked your idea of using indexes instead of pointers, so I'm very sorry to shoot it down. :-(
[1]: E.g. the Intel i40e PMD, http://code.dpdk.org/dpdk/latest/source/drivers/net/i40e/i40e_rxtx_vec_avx512.c#L25
-Morten
More information about the dev
mailing list