[PATCH] mempool: optimize get objects with constant n

Morten Brørup mb at smartsharesystems.com
Tue Apr 18 18:05:36 CEST 2023


> From: Morten Brørup
> Sent: Tuesday, 18 April 2023 13.30
> 
> > From: Bruce Richardson [mailto:bruce.richardson at intel.com]
> > Sent: Tuesday, 18 April 2023 13.07
> >
> > On Tue, Apr 11, 2023 at 08:48:45AM +0200, Morten Brørup wrote:

[...]

> > > +		/*
> > > +		 * The request size is known at build time, and
> > > +		 * the entire request can be satisfied from the cache,
> > > +		 * so let the compiler unroll the fixed length copy loop.
> > > +		 */
> > > +		cache->len -= n;
> > > +		for (index = 0; index < n; index++)
> > > +			*obj_table++ = *--cache_objs;
> > > +
> >
> > This loop looks a little awkward to me. Would it be clearer (and perhaps
> > easier for compilers to unroll efficiently) if it was rewritten as:
> >
> > 	cache->len -= n;
> > 	cache_objs = &cache->objs[cache->len];
> > 	for (index = 0; index < n; index++)
> > 		obj_table[index] = cache_objs[index];
> 
> The mempool cache is a stack, so the copy loop needs to get the objects in
> decrementing order. I.e. the source index decrements and the destination
> index increments.
> 
> Regardless, your point here is still valid! I expected that any
> unrolling capable compiler can unroll *dst++ = *--src; but I can
> experiment with different compilers on godbolt.org to see if dst[index]
> = src[-index] is better.

Just for the record: I have now experimented with the alternative, dst[index] = src[-index], and it makes no difference.
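
For reference, the two loop forms under discussion can be sketched as below. This is a standalone illustration, not the patch itself: BURST, get_ptr_walk(), get_indexed() and forms_agree() are hypothetical names, and a plain array of pointers stands in for the mempool cache. Both forms assume the stack holds at least BURST objects.

```c
#include <stddef.h>

#define BURST 4	/* hypothetical request size, known at build time */

/* Pointer-walking form, as in the patch: *dst++ = *--src. */
static inline void
get_ptr_walk(void **obj_table, void **cache_objs, size_t *len)
{
	size_t index;
	void **src = &cache_objs[*len];	/* one past the top of the stack */

	*len -= BURST;
	for (index = 0; index < BURST; index++)
		*obj_table++ = *--src;
}

/* Indexed form: destination index increments, source index decrements. */
static inline void
get_indexed(void **obj_table, void **cache_objs, size_t *len)
{
	size_t index;
	void **top = &cache_objs[*len - 1];	/* top of the stack */

	*len -= BURST;
	for (index = 0; index < BURST; index++)
		obj_table[index] = top[-(ptrdiff_t)index];
}

/* Returns 1 if both forms pop the same objects in LIFO order, else 0. */
static int
forms_agree(void)
{
	int values[8];	/* only the addresses are used */
	void *stack_a[8], *stack_b[8];
	void *out_a[BURST], *out_b[BURST];
	size_t len_a = 8, len_b = 8, i;

	for (i = 0; i < 8; i++)
		stack_a[i] = stack_b[i] = &values[i];

	get_ptr_walk(out_a, stack_a, &len_a);
	get_indexed(out_b, stack_b, &len_b);

	if (len_a != 8 - BURST || len_b != 8 - BURST)
		return 0;
	for (i = 0; i < BURST; i++) {
		/* The most recently pushed object must come out first. */
		if (out_a[i] != (void *)&values[7 - i] || out_a[i] != out_b[i])
			return 0;
	}
	return 1;
}
```

Since BURST is a compile-time constant, both loops have a fixed trip count, so either one is a candidate for full unrolling. Whether the generated code actually matches depends on the compiler and flags.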


