[PATCH v4] mempool: fix mempool cache flushing algorithm
Morten Brørup
mb at smartsharesystems.com
Thu Apr 7 13:36:17 CEST 2022
> From: Bruce Richardson [mailto:bruce.richardson at intel.com]
> Sent: Thursday, 7 April 2022 12.44
>
> On Thu, Apr 07, 2022 at 11:32:12AM +0100, Bruce Richardson wrote:
> > On Thu, Apr 07, 2022 at 11:26:53AM +0200, Morten Brørup wrote:
> > > > From: Bruce Richardson [mailto:bruce.richardson at intel.com]
> > > > Sent: Thursday, 7 April 2022 11.14
> > > >
> > > > On Thu, Apr 07, 2022 at 11:04:53AM +0200, Morten Brørup wrote:
> > > > > > From: Morten Brørup [mailto:mb at smartsharesystems.com]
> > > > > > Sent: Wednesday, 2 February 2022 11.34
> > > > > >
> > > > > > This patch fixes the rte_mempool_do_generic_put() caching
> > > > > > algorithm, which was fundamentally wrong, causing multiple
> > > > > > performance issues when flushing.
> > > > > >
> > > > >
> > > > > [...]
> > > > >
> > > > > Olivier,
> > > > >
> > > > > Will you please consider this patch [1] and the other one [2].
> > > > >
> > > > > The primary bug here is this: When a mempool cache becomes full
> > > > > (i.e. exceeds the "flush threshold"), and is flushed to the backing
> > > > > ring, it is still full afterwards; but it should be empty
> > > > > afterwards. It is not flushed entirely, only the elements exceeding
> > > > > "size" are flushed.
> > > > >
> > > >
> > > > I don't believe it should be flushed entirely, there should always be
> > > > some elements left so that even after flush we can still allocate an
> > > > additional burst. We want to avoid the situation where a flush of all
> > > > elements is immediately followed by a refill of new elements.
> > > > However, we can flush to maybe size/2, and improve things. In short,
> > > > this not emptying is by design rather than a bug, though we can look
> > > > to tweak the behaviour.
> > > >
> > >
> > > I initially agreed with you about flushing to size/2.
> > >
> > > However, I did think further about it when I wrote the patch, and came
> > > to this conclusion: If an application thread repeatedly puts objects
> > > into the mempool, and does it so often that the cache overflows (i.e.
> > > reaches the flush threshold) and needs to be flushed, it is far more
> > > likely that the application thread will continue doing that, rather
> > > than start getting objects from the mempool. This speaks for flushing
> > > the cache entirely.
> > >
> > > Both solutions are better than flushing to size, so if there is a
> > > preference for keeping some objects in the cache after flushing, I can
> > > update the patch accordingly.
I forgot to mention some details here...
The cache is a stack, so leaving objects in it after flushing can be done in one of two ways:
1. Flush the top objects, and leave the bottom objects, which are extremely cold.
2. Flush the bottom objects, and move the objects from the top to the bottom, which is a costly operation.
Theoretically, there is a third option: Make the stack a circular buffer, so its "bottom pointer" can be moved around, instead of copying objects from the top to the bottom after flushing. However, this would add complexity when copying arrays to/from the stack in the normal cases, i.e. to/from the application, and it introduces requirements on the cache size. So I quickly discarded the idea when it first came to me.
The provided patch flushes the entire cache, and then stores the newly added objects (the ones causing the flush) in the cache. So it is not completely empty after flushing. It contains some (but not many) objects, and they are hot.
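To make the described behaviour concrete, here is a toy sketch (hypothetical names and sizes, not the actual DPDK code) of a put that, on reaching the flush threshold, flushes the entire cache to the backing ring and then stores the newly added, hot objects in the cache:

```c
#include <assert.h>
#include <string.h>

#define CACHE_SIZE       8
#define FLUSH_THRESHOLD  12   /* 1.5 * size, as in rte_mempool; toy values */

struct toy_cache {
    unsigned int len;
    /* Large enough for threshold-sized content plus one burst. */
    void *objs[FLUSH_THRESHOLD + CACHE_SIZE];
};

static unsigned int ring_len;     /* stand-in for the backing ring */

static void ring_enqueue(void * const *objs, unsigned int n)
{
    (void)objs;
    ring_len += n;                /* toy ring just counts objects */
}

static void toy_put(struct toy_cache *c, void * const *objs, unsigned int n)
{
    if (c->len + n > FLUSH_THRESHOLD) {
        /* Flush the ENTIRE cache to the ring... */
        ring_enqueue(c->objs, c->len);
        c->len = 0;
    }
    /* ...then keep only the new, hot objects in the cache. */
    memcpy(&c->objs[c->len], objs, n * sizeof(*objs));
    c->len += n;
}
```

With the toy values above, four bursts of 4 objects fill the cache to the threshold of 12; the fourth burst triggers a full flush of those 12 to the ring, leaving only the 4 new objects cached.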
> > >
> >
> > Would it be worth looking at adding per-core hinting to the mempool?
> > Indicate for a core that it allocates-only, i.e. RX thread, frees-only,
> > i.e. TX-thread, or does both alloc and free (the default)? That hint
> > could be used only on flush or refill to specify whether to flush all or
> > partial, and similarly to refill to max possible or just to size.
> >
> Actually, taking the idea further, we could always track per-core whether
> a core has ever done a flush/refill and use that as the hint instead. It
> could even be done in a branch-free manner if we want. For example:
>
> on flush:
> keep_entries = (size >> 1) & (never_refills - 1);
>
> which will set the entries to keep to be 0 if we have never had to
> refill, or half of size, if the thread has previously done refills.
>
Your suggestion is a good idea for a performance improvement.
We would also need "mostly" variants in addition to the "only" variants; otherwise, the automatic detection will cause problems if it is triggered by some rare event.
And applications using the "service cores" concept will just fall back to the default alloc-free balanced variant.
Perhaps we should fix the current bugs (my term, not consensus) first, and then look at further performance improvements. It's already uphill getting Acks for my fixes as they are.
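For reference, the quoted branch-free expression can be sketched as follows (a toy sketch: `never_refills` is assumed to be per-lcore bookkeeping holding 1 while the core has never refilled from the ring, and 0 once it has):

```c
#include <assert.h>

/* Branch-free "entries to keep on flush":
 * never_refills == 1: (1 - 1) == 0, so the mask zeroes the result
 *                     and the whole cache is flushed.
 * never_refills == 0: (0u - 1u) is all ones, so size/2 entries
 *                     are kept for a subsequent allocation. */
static unsigned int keep_entries(unsigned int size, unsigned int never_refills)
{
    return (size >> 1) & (never_refills - 1u);
}
```

The unsigned underflow to all ones is well defined in C, which is what makes the trick safe here.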
Another performance improvement could be hard-coding the mempool cache size to RTE_MEMPOOL_CACHE_MAX_SIZE, so the copying between the cache and the backing ring can be done by a fixed-size, optimized memcpy using vector instructions.
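The point of the hard-coded size is that a compile-time-constant length lets the compiler unroll and vectorize the copy. A minimal sketch, with a stand-in constant (the real value of RTE_MEMPOOL_CACHE_MAX_SIZE is not assumed here):

```c
#include <assert.h>
#include <string.h>

#define CACHE_MAX 512   /* stand-in for RTE_MEMPOOL_CACHE_MAX_SIZE */

/* Because the length is a compile-time constant, the compiler can
 * expand this memcpy into unrolled, vectorized loads/stores instead
 * of calling a generic variable-length routine. */
static void cache_copy_fixed(void **dst, void * const *src)
{
    memcpy(dst, src, CACHE_MAX * sizeof(*src));
}
```
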
Obviously, whatever we do to optimize this, we should ensure optimal handling of the common case, where objects are only moved between the application and the mempool cache, without touching the backing ring.
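That common-case fast path, a get served entirely from the cache stack, can be sketched like this (toy types and names, not the actual DPDK implementation):

```c
#include <assert.h>
#include <string.h>

struct obj_stack {
    unsigned int len;
    void *objs[16];
};

/* Fast path: pop n objects off the top of the cache stack.
 * Returns 0 on a cache hit; -1 means the slow path (a refill
 * from the backing ring) would be needed. */
static int stack_get(struct obj_stack *s, void **objs, unsigned int n)
{
    if (n > s->len)
        return -1;
    s->len -= n;
    memcpy(objs, &s->objs[s->len], n * sizeof(*objs));
    return 0;
}
```

Note that the hot objects most recently put are the ones returned first, which is the main argument for the stack layout.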