[dpdk-dev] [EXT] Re: [dpdk-stable] [PATCH v3] mempool: fix mempool obj alignment for non x86
Jerin Jacob Kollanukkaran
jerinj at marvell.com
Mon Jan 13 12:46:13 CET 2020
> -----Original Message-----
> From: David Marchand <david.marchand at redhat.com>
> Sent: Monday, January 13, 2020 3:17 PM
> To: Jerin Jacob Kollanukkaran <jerinj at marvell.com>
> Cc: dev <dev at dpdk.org>; Thomas Monjalon <thomas at monjalon.net>; Olivier
> Matz <olivier.matz at 6wind.com>; Andrew Rybchenko
> <arybchenko at solarflare.com>; Bruce Richardson
> <bruce.richardson at intel.com>; Ananyev, Konstantin
> <konstantin.ananyev at intel.com>; Hemant Agrawal
> <hemant.agrawal at nxp.com>; Shahaf Shuler <shahafs at mellanox.com>;
> Honnappa Nagarahalli <honnappa.nagarahalli at arm.com>; Gavin Hu
> <gavin.hu at arm.com>; viktorin at rehivetech.com; David Christensen
> <drc at linux.vnet.ibm.com>; Burakov, Anatoly <anatoly.burakov at intel.com>;
> dpdk stable <stable at dpdk.org>; Kevin Traynor <ktraynor at redhat.com>; Luca
> Boccassi <bluca at debian.org>
> Subject: [EXT] Re: [dpdk-stable] [dpdk-dev] [PATCH v3] mempool: fix mempool
> obj alignment for non x86
>
> On Mon, Jan 13, 2020 at 7:49 AM <jerinj at marvell.com> wrote:
> >
> > From: Jerin Jacob <jerinj at marvell.com>
> >
> > The existing optimize_object_size() function addresses the memory object
> > alignment constraint on x86 for better performance.
> >
> > Different (micro) architectures may have different memory alignment
> > constraints for better performance, and these are not the same as the
> > one handled by the existing optimize_object_size().
> >
> > Some use an XOR (kind of CRC) scheme to enable DRAM channel distribution
> > based on the address, and some may have a different formula.
> >
> > Introduce an arch_mem_object_align() function to abstract the
> > differences between (micro) architectures and avoid wasting
> > memory on mempool object alignment for architectures that do
> > not require it.
> >
> > Details on the amount of memory saving:
> >
> > Currently, arm64 based architectures use the default (nchan=4,
> > nrank=1). The worst case is for an object whose size (including the
> > mempool header) is 2 cache lines, where it is optimized to 3 cache
> > lines (+50%).
> >
> > Examples for cache line size = 64:
> >   orig    optimized
> >     64 ->   64   +0%
> >    128 ->  192   +50%
> >    192 ->  192   +0%
> >    256 ->  320   +25%
> >    320 ->  320   +0%
> >    384 ->  448   +16%
> >   ...
> >   2304 -> 2368   +2.7% (~mbuf size)
> >
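For anyone checking the table above, here is a minimal sketch of how those numbers can be reproduced, assuming the padding rule is "round up to whole cache lines, then add lines until the line count is coprime with nchan * nrank". All function names below are hypothetical and this is not the code from the patch or from rte_mempool.c. The cache line size of 64 and the defaults nchan=4, nrank=1 are the values quoted in the commit message, and the compiler macros used to detect x86 are an assumption of this sketch.

```c
/* Greatest common divisor (Euclid's algorithm). */
static unsigned int
sketch_gcd(unsigned int a, unsigned int b)
{
	while (b != 0) {
		unsigned int t = a % b;

		a = b;
		b = t;
	}
	return a;
}

/*
 * Hypothetical stand-in for the x86 spreading rule: round the object up
 * to whole cache lines, then grow it one line at a time until the line
 * count shares no factor with nchan * nrank.
 * line_size must be non-zero.
 */
static unsigned int
sketch_spread_object_size(unsigned int obj_size, unsigned int line_size,
			  unsigned int nchan, unsigned int nrank)
{
	unsigned int lines;

	/* Fall back to the defaults quoted in the commit message. */
	if (nchan == 0)
		nchan = 4;
	if (nrank == 0)
		nrank = 1;

	lines = (obj_size + line_size - 1) / line_size;
	while (sketch_gcd(lines, nchan * nrank) != 1)
		lines++;

	return lines * line_size;
}

/*
 * Hypothetical per-architecture hook in the spirit of
 * arch_mem_object_align(): pad on x86 builds only, and return the size
 * unchanged elsewhere so no memory is spent on padding.
 */
static unsigned int
sketch_arch_mem_object_align(unsigned int obj_size)
{
#if defined(__x86_64__) || defined(__i386__)
	return sketch_spread_object_size(obj_size, 64, 4, 1);
#else
	return obj_size;
#endif
}
```

With line_size=64, nchan=4 and nrank=1, a 128-byte object is 2 lines, which shares a factor with 4, so it grows to 3 lines (192 bytes). A 2304-byte object is 36 lines and grows to 37 lines (2368 bytes). On a non-x86 build the hook returns the size as-is, which is where the saving comes from.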
> > Additional details:
> > https://www.mail-archive.com/dev@dpdk.org/msg149157.html
> >
> > Fixes: af75078fece3 ("first public release")
>
> Weird to flag this as a problem in this sha1.
> x86 was the only architecture supported at the time.
> Either we mark the introduction of new architectures as the point of backport,
> or we remove this tag and just keep Cc: stable at dpdk.org
While committing, the maintainer can take either decision. I have no issues or opinion on this.
>
> > Cc: stable at dpdk.org
>
> It seems more an optimisation than a fix to me, but in any case, the stable
> maintainers will be the judges.
OK. No issues.
>
>
> >
> > Signed-off-by: Jerin Jacob <jerinj at marvell.com>
> > Reviewed-by: Gavin Hu <gavin.hu at arm.com>
> > ---
> > v3:
> > - Change comment for MEMPOOL_F_NO_SPREAD flag as " Spreading among
> > memory channels not required." (Stephen Hemminger)
> >
> > v2:
> > - Changed the return type of arch_mem_object_align() to "unsigned int" from
> > "unsigned" to fix the checkpatch issues (Olivier Matz)
> > - Updated the comments for MEMPOOL_F_NO_SPREAD (Olivier Matz)
> > - Update the git comments to share the memory saving details.
> >
> > doc/guides/prog_guide/mempool_lib.rst | 6 +++---
> > lib/librte_mempool/rte_mempool.c | 17 +++++++++++++----
> > lib/librte_mempool/rte_mempool.h | 3 ++-
> > 3 files changed, 18 insertions(+), 8 deletions(-)
> >
> > diff --git a/doc/guides/prog_guide/mempool_lib.rst b/doc/guides/prog_guide/mempool_lib.rst
> > index 3bb84b0a6..eea7a2906 100644
> > --- a/doc/guides/prog_guide/mempool_lib.rst
> > +++ b/doc/guides/prog_guide/mempool_lib.rst
> > @@ -27,10 +27,10 @@ In debug mode (CONFIG_RTE_LIBRTE_MEMPOOL_DEBUG is enabled),
> > statistics about get from/put in the pool are stored in the mempool structure.
> > Statistics are per-lcore to avoid concurrent access to statistics counters.
> >
> > -Memory Alignment Constraints
> > -----------------------------
> > +Memory Alignment Constraints on X86 architecture
> > +------------------------------------------------
>
> Nit: afaics in the docs, x86 is preferred to X86.
>
>
> >
> > -Depending on hardware memory configuration, performance can be greatly improved by adding a specific padding between objects.
> > +Depending on hardware memory configuration on X86 architecture, performance can be greatly improved by adding a specific padding between objects.
> > The objective is to ensure that the beginning of each object starts on a different channel and rank in memory so that all channels are equally loaded.
> >
> > This is particularly true for packet buffers when doing L3 forwarding or flow classification.
>
>
> --
> David Marchand