[dpdk-dev] [PATCH] mempool: allow for user-owned mempool caches

Olivier Matz olivier.matz at 6wind.com
Mon Mar 21 13:22:25 CET 2016


Hi Lazaros,

Thanks for this patch. To me, this is a valuable enhancement.
Please find some comments inline.

On 03/10/2016 03:44 PM, Lazaros Koromilas wrote:
> The mempool cache is only available to EAL threads as a per-lcore
> resource. Change this so that the user can create and provide their own
> cache on mempool get and put operations. This works with non-EAL threads
> too. This commit introduces new API calls with the 'with_cache' suffix,
> while the current ones default to the per-lcore local cache.
> 
> Signed-off-by: Lazaros Koromilas <l at nofutznetworks.com>
> ---
>  lib/librte_mempool/rte_mempool.c |  65 +++++-
>  lib/librte_mempool/rte_mempool.h | 442 ++++++++++++++++++++++++++++++++++++---
>  2 files changed, 467 insertions(+), 40 deletions(-)
> 
> diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
> index f8781e1..cebc2b7 100644
> --- a/lib/librte_mempool/rte_mempool.c
> +++ b/lib/librte_mempool/rte_mempool.c
> @@ -375,6 +375,43 @@ rte_mempool_xmem_usage(void *vaddr, uint32_t elt_num, size_t elt_sz,
>  	return usz;
>  }
>  
> +#if RTE_MEMPOOL_CACHE_MAX_SIZE > 0

I wonder if this wouldn't conflict with Keith's patch that removes
some of the RTE_MEMPOOL_CACHE_MAX_SIZE #ifdefs.
See: http://www.dpdk.org/dev/patchwork/patch/10492/

As this patch is already acked for 16.07, I think your v2 could be
rebased on top of it to avoid conflicts when Thomas applies it.

By the way, I also encourage you to have a look at other works in
progress in mempool:
http://www.dpdk.org/ml/archives/dev/2016-March/035107.html
http://www.dpdk.org/ml/archives/dev/2016-March/035201.html


> +static void
> +mempool_cache_init(struct rte_mempool_cache *cache, uint32_t size)
> +{
> +	cache->size = size;
> +	cache->flushthresh = CALC_CACHE_FLUSHTHRESH(size);
> +	cache->len = 0;
> +}
> +
> +/*
> + * Creates and initializes a cache for objects that are retrieved from and
> + * returned to an underlying mempool. This structure is identical to the
> + * structure included inside struct rte_mempool.
> + */

On top of Keith's patch, this comment may be reworked, as the cache
structure is no longer included in the mempool structure.

nit: I think the imperative form is preferred ("Create and initialize
a cache ...")


> +struct rte_mempool_cache *
> +rte_mempool_cache_create(uint32_t size)
> +{
> +	struct rte_mempool_cache *cache;
> +
> +	if (size > RTE_MEMPOOL_CACHE_MAX_SIZE) {
> +		rte_errno = EINVAL;
> +		return NULL;
> +	}
> +
> +	cache = rte_zmalloc("MEMPOOL_CACHE", sizeof(*cache), RTE_CACHE_LINE_SIZE);
> +	if (cache == NULL) {
> +		RTE_LOG(ERR, MEMPOOL, "Cannot allocate mempool cache!\n");

I would remove the '!'



> @@ -587,10 +624,18 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size,
>  	mp->elt_size = objsz.elt_size;
>  	mp->header_size = objsz.header_size;
>  	mp->trailer_size = objsz.trailer_size;
> -	mp->cache_size = cache_size;
> -	mp->cache_flushthresh = CALC_CACHE_FLUSHTHRESH(cache_size);
> +	mp->cache_size = cache_size; /* Keep this for backwards compat. */

I'm wondering if this should be kept only for backwards compatibility,
or if it still makes sense at all. The question is: do we want the
cache_size to be a parameter of the mempool, or should it be a
parameter of the cache?

I think we could remove this field from the mempool structure.


> @@ -673,13 +718,17 @@ rte_mempool_dump_cache(FILE *f, const struct rte_mempool *mp)
>  #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0
>  	unsigned lcore_id;
>  	unsigned count = 0;
> +	unsigned cache_size;
>  	unsigned cache_count;
>  
>  	fprintf(f, "  cache infos:\n");
> -	fprintf(f, "    cache_size=%"PRIu32"\n", mp->cache_size);
>  	for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
> +		cache_size = mp->local_cache[lcore_id].size;
> +		fprintf(f, "    cache_size[%u]=%"PRIu32"\n",
> +			lcore_id, cache_size);
>  		cache_count = mp->local_cache[lcore_id].len;
> -		fprintf(f, "    cache_count[%u]=%u\n", lcore_id, cache_count);
> +		fprintf(f, "    cache_count[%u]=%"PRIu32"\n",
> +			lcore_id, cache_count);
>  		count += cache_count;

Does it still make sense to dump only the per-lcore caches here,
given that external caches may exist?

If we want to list all caches, we could imagine a sort of cache
registration before using a cache structure. Example:

   cache = rte_mempool_add_cache()
   rte_mempool_del_cache()

All the caches could be browsed internally using a list.
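
For instance, a rough sketch of what I have in mind (the names, the
list and the lock are just for illustration, not a definitive API;
this assumes a TAILQ_ENTRY is added to struct rte_mempool_cache):

	/* hypothetical: register a user cache on a per-mempool list */
	struct rte_mempool_cache *
	rte_mempool_add_cache(struct rte_mempool *mp, uint32_t size)
	{
		struct rte_mempool_cache *cache;

		cache = rte_mempool_cache_create(size);
		if (cache == NULL)
			return NULL;

		rte_spinlock_lock(&mp->cache_list_lock);
		TAILQ_INSERT_TAIL(&mp->cache_list, cache, next);
		rte_spinlock_unlock(&mp->cache_list_lock);
		return cache;
	}

rte_mempool_dump_cache() could then walk this list and account for
all caches, not only the per-lcore ones.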

Thoughts?


> --- a/lib/librte_mempool/rte_mempool.h
> +++ b/lib/librte_mempool/rte_mempool.h
> @@ -95,19 +95,19 @@ struct rte_mempool_debug_stats {
>  } __rte_cache_aligned;
>  #endif
>  
> -#if RTE_MEMPOOL_CACHE_MAX_SIZE > 0
>  /**
>   * A structure that stores a per-core object cache.
>   */
>  struct rte_mempool_cache {
> -	unsigned len; /**< Cache len */
> +	uint32_t size;        /**< Size of the cache */
> +	uint32_t flushthresh; /**< Threshold before we flush excess elements */
> +	uint32_t len;         /**< Current cache count */
>  	/*
>  	 * Cache is allocated to this size to allow it to overflow in certain
>  	 * cases to avoid needless emptying of cache.
>  	 */
>  	void *objs[RTE_MEMPOOL_CACHE_MAX_SIZE * 3]; /**< Cache objects */
>  } __rte_cache_aligned;
> -#endif /* RTE_MEMPOOL_CACHE_MAX_SIZE > 0 */
>  
>  /**
>   * A structure that stores the size of mempool elements.
> @@ -185,8 +185,6 @@ struct rte_mempool {
>  	int flags;                       /**< Flags of the mempool. */
>  	uint32_t size;                   /**< Size of the mempool. */
>  	uint32_t cache_size;             /**< Size of per-lcore local cache. */
> -	uint32_t cache_flushthresh;
> -	/**< Threshold before we flush excess elements. */
>  
>  	uint32_t elt_size;               /**< Size of an element. */
>  	uint32_t header_size;            /**< Size of header (before elt). */

Adding and removing these fields in the structure will break the ABI.
Could you please send a deprecation notice for it for 16.04, following
the same model as http://dpdk.org/dev/patchwork/patch/11553/

The process is to get 3 acks for the deprecation notice, but I don't
think this will be an issue: several changes that break the ABI are
already scheduled in mempool for 16.07, so it's the right time to do
it.



>  /**
> + * Put several objects back in the mempool (multi-producers safe).
> + * Use a user-provided mempool cache.
> + *
> + * @param mp
> + *   A pointer to the mempool structure.
> + * @param obj_table
> + *   A pointer to a table of void * pointers (objects).
> + * @param n
> + *   The number of objects to add in the mempool from the obj_table.
> + * @param cache
> + *   A pointer to a mempool cache structure. May be NULL if not needed.
> + */
> +static inline void __attribute__((always_inline))
> +rte_mempool_mp_put_bulk_with_cache(struct rte_mempool *mp,
> +				   void * const *obj_table, unsigned n,
> +				   struct rte_mempool_cache *cache)
> +{
> +	__mempool_check_cookies(mp, obj_table, n, 0);
> +	__mempool_put_bulk_with_cache(mp, obj_table, n, cache, 1);
> +}
> +
> +/**
> + * Put several objects back in the mempool (NOT multi-producers safe).
> + * Use a user-provided mempool cache.
> + *
> + * @param mp
> + *   A pointer to the mempool structure.
> + * @param obj_table
> + *   A pointer to a table of void * pointers (objects).
> + * @param n
> + *   The number of objects to add in the mempool from obj_table.
> + * @param cache
> + *   A pointer to a mempool cache structure. May be NULL if not needed.
> + */
> +static inline void
> +rte_mempool_sp_put_bulk_with_cache(struct rte_mempool *mp,
> +				   void * const *obj_table, unsigned n,
> +				   struct rte_mempool_cache *cache)
> +{
> +	__mempool_check_cookies(mp, obj_table, n, 0);
> +	__mempool_put_bulk_with_cache(mp, obj_table, n, cache, 0);
> +}
>
> [...]

Many functions are added here. As all of these functions are inline,
what would you think about having instead a single new function that
covers all cases:

static inline void
rte_mempool_generic_put(struct rte_mempool *mp,
		void * const *obj_table, unsigned n,
		struct rte_mempool_cache *cache,
		unsigned flags)

We could then consider deprecating some of them. Today, without your
patch, we have the following for put():

	rte_mempool_mp_put_bulk(mp, obj_table, n)
	rte_mempool_sp_put_bulk(mp, obj_table, n)
	rte_mempool_put_bulk(mp, obj_table, n)
	rte_mempool_mp_put(mp, obj)
	rte_mempool_sp_put(mp, obj)
	rte_mempool_put(mp, obj)

	__mempool_put_bulk(mp, obj_table, n, is_mp)  /* internal */

Maybe we should only keep:

	rte_mempool_put()
	rte_mempool_put_bulk(mp, obj_table, n)
	rte_mempool_generic_put(mp, obj_table, n, cache, flags)

and same for *get().
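
To make the idea concrete, the kept entry points could then become
thin wrappers on top of the generic function, along these lines
(sketch only; rte_mempool_default_cache() is a hypothetical helper
returning the per-lcore cache, and the sp/mp choice would come from
the pool flags):

	static inline void
	rte_mempool_put_bulk(struct rte_mempool *mp,
			void * const *obj_table, unsigned n)
	{
		struct rte_mempool_cache *cache;

		cache = rte_mempool_default_cache(mp, rte_lcore_id());
		rte_mempool_generic_put(mp, obj_table, n, cache, mp->flags);
	}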



> +/**
> + * Create a user-owned mempool cache. This can be used by non-EAL threads
> + * to enable caching when they interact with a mempool.

nit: the doxygen format needs a one-line title
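
Something like this, for instance:

	/**
	 * Create a user-owned mempool cache.
	 *
	 * This can be used by non-EAL threads to enable caching when
	 * they interact with a mempool.
	 */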

> + *
> + * @param size
> + *   The size of the mempool cache. See rte_mempool_create()'s cache_size
> + *   parameter description for more information. The same limits and
> + *   considerations apply here too.
> + */
> +struct rte_mempool_cache *
> +rte_mempool_cache_create(uint32_t size);
> +

I think we should also consider a free() function.
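
Something as simple as this should do, mirroring the create() above
(sketch):

	void
	rte_mempool_cache_free(struct rte_mempool_cache *cache)
	{
		rte_free(cache);
	}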

Maybe the help of this function should also explain that a mempool
already embeds a per-lcore cache by default.

Also, it would be great to have a test in app/test/test_mempool.c to
validate the feature and provide an example of use.
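
For instance, something along these lines (sketch only; CACHE_SIZE is
illustrative, and the get() counterpart is assumed to follow the same
naming as the put() functions in this patch):

	struct rte_mempool_cache *cache;
	void *objs[CACHE_SIZE];

	/* non-EAL thread: use a user-owned cache instead of the
	 * per-lcore one */
	cache = rte_mempool_cache_create(CACHE_SIZE);
	if (cache == NULL)
		return -1;

	if (rte_mempool_mp_get_bulk_with_cache(mp, objs, CACHE_SIZE,
					       cache) < 0)
		return -1;
	rte_mempool_mp_put_bulk_with_cache(mp, objs, CACHE_SIZE, cache);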

And last thing, this is probably out of scope for now, but I'm
wondering if in the future this external cache could be used to
transparently share a pool (typically a mbuf pool), for instance in
the case of a secondary process. Each process would have its own
cache, referencing the same shared mempool structure. This would
probably require updating the ethdev and mbuf APIs, so it's certainly
quite a large modification. If you have any ideas or plans going in
this direction, I would be interested.


Regards,
Olivier



