[PATCH v2] mempool: micro-optimize put function

Morten Brørup mb at smartsharesystems.com
Wed Nov 16 16:59:54 CET 2022

Previous message (by thread): [PATCH v2] mempool: micro-optimize put function
Next message (by thread): [PATCH v2] mempool: micro-optimize put function
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

> From: Honnappa Nagarahalli [mailto:Honnappa.Nagarahalli at arm.com]
> Sent: Wednesday, 16 November 2022 16.51
> 
> <snip>
> >
> > Micro-optimization:
> > Reduced the most likely code path in the generic put function by
> moving an
> > unlikely check out of the most likely code path and further down.
> >
> > Also updated the comments in the function.
> >
> > v2 (feedback from Andrew Rybchenko):
> > * Modified comparison to prevent overflow if n is really huge and len
> is
> >   non-zero.
> > * Added assertion about the invariant preventing overflow in the
> >   comparison.
> > * Crossing the threshold is not extremely unlikely, so removed
> likely()
> >   from that comparison.
> >   The compiler will generate code with optimal static branch
> prediction
> >   here anyway.
> >
> > Signed-off-by: Morten Brørup <mb at smartsharesystems.com>
> > ---
> >  lib/mempool/rte_mempool.h | 36 ++++++++++++++++++++----------------
> >  1 file changed, 20 insertions(+), 16 deletions(-)
> >
> > diff --git a/lib/mempool/rte_mempool.h b/lib/mempool/rte_mempool.h
> > index 9f530db24b..dd1a3177d6 100644
> > --- a/lib/mempool/rte_mempool.h
> > +++ b/lib/mempool/rte_mempool.h
> > @@ -1364,32 +1364,36 @@ rte_mempool_do_generic_put(struct
> > rte_mempool *mp, void * const *obj_table,  {
> >  	void **cache_objs;
> >
> > -	/* No cache provided */
> > +	/* No cache provided? */
> >  	if (unlikely(cache == NULL))
> >  		goto driver_enqueue;
> >
> > -	/* increment stat now, adding in mempool always success */
> > +	/* Increment stats now, adding in mempool always succeeds. */
> >  	RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_bulk, 1);
> >  	RTE_MEMPOOL_CACHE_STAT_ADD(cache, put_objs, n);
> >
> > -	/* The request itself is too big for the cache */
> > -	if (unlikely(n > cache->flushthresh))
> > -		goto driver_enqueue_stats_incremented;
> > -
> > -	/*
> > -	 * The cache follows the following algorithm:
> > -	 *   1. If the objects cannot be added to the cache without
> crossing
> > -	 *      the flush threshold, flush the cache to the backend.
> > -	 *   2. Add the objects to the cache.
> > -	 */
> > +	/* Assert the invariant preventing overflow in the comparison
> below.
> > */
> > +	RTE_ASSERT(cache->len <= cache->flushthresh);
> >
> > -	if (cache->len + n <= cache->flushthresh) {
> > +	if (n <= cache->flushthresh - cache->len) {
> > +		/*
> > +		 * The objects can be added to the cache without crossing
> the
> > +		 * flush threshold.
> > +		 */
> >  		cache_objs = &cache->objs[cache->len];
> >  		cache->len += n;
> > -	} else {
> > +	} else if (likely(n <= cache->flushthresh)) {
> IMO, this is a misconfiguration on the application part. In the PMDs I
> have looked at, max value of 'n' is controlled by compile time
> constants. Application could do a compile time check on the cache
> threshold or we could have another RTE_ASSERT on this.

There could be applications using a mempool for something else than mbufs.

In that case, the application should be allowed to get/put many objects in one transaction.

> 
> > +		/*
> > +		 * The request itself fits into the cache.
> > +		 * But first, the cache must be flushed to the backend, so
> > +		 * adding the objects does not cross the flush threshold.
> > +		 */
> >  		cache_objs = &cache->objs[0];
> >  		rte_mempool_ops_enqueue_bulk(mp, cache_objs, cache-
> > >len);
> >  		cache->len = n;
> > +	} else {
> > +		/* The request itself is too big for the cache. */
> > +		goto driver_enqueue_stats_incremented;
> >  	}
> >
> >  	/* Add the objects to the cache. */
> > @@ -1399,13 +1403,13 @@ rte_mempool_do_generic_put(struct
> > rte_mempool *mp, void * const *obj_table,
> >
> >  driver_enqueue:
> >
> > -	/* increment stat now, adding in mempool always success */
> > +	/* Increment stats now, adding in mempool always succeeds. */
> >  	RTE_MEMPOOL_STAT_ADD(mp, put_bulk, 1);
> >  	RTE_MEMPOOL_STAT_ADD(mp, put_objs, n);
> >
> >  driver_enqueue_stats_incremented:
> >
> > -	/* push objects to the backend */
> > +	/* Push the objects to the backend. */
> >  	rte_mempool_ops_enqueue_bulk(mp, obj_table, n);  }
> >
> > --
> > 2.17.1

Previous message (by thread): [PATCH v2] mempool: micro-optimize put function
Next message (by thread): [PATCH v2] mempool: micro-optimize put function
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

More information about the dev mailing list