[dpdk-dev] Mbuf memory alignment constraints for (micro)architectures

Gavin Hu (Arm Technology China) Gavin.Hu at arm.com
Tue Nov 12 03:36:45 CET 2019


Hi Jerin,

> -----Original Message-----
> From: dev <dev-bounces at dpdk.org> On Behalf Of Gavin Hu (Arm Technology
> China)
> Sent: Monday, November 11, 2019 10:01 PM
> To: jerinj at marvell.com; dev at dpdk.org
> Cc: Olivier Matz <olivier.matz at 6wind.com>; Andrew Rybchenko
> <arybchenko at solarflare.com>; David Christensen <drc at linux.vnet.ibm.com>;
> bruce.richardson at intel.com; konstantin.ananyev at intel.com;
> hemant.agrawal at nxp.com; Shahaf Shuler <shahafs at mellanox.com>;
> Honnappa Nagarahalli <Honnappa.Nagarahalli at arm.com>;
> viktorin at rehivetech.com; anatoly.burakov at intel.com; Steve Capper
> <Steve.Capper at arm.com>; Ola Liljedahl <Ola.Liljedahl at arm.com>; nd
> <nd at arm.com>
> Subject: Re: [dpdk-dev] Mbuf memory alignment constraints for
> (micro)architectures
> 
> Hi Jerin,
> 
> > -----Original Message-----
> > From: Jerin Jacob Kollanukkaran <jerinj at marvell.com>
> > Sent: Thursday, October 31, 2019 2:02 AM
> > To: dev at dpdk.org
> > Cc: Olivier Matz <olivier.matz at 6wind.com>; Andrew Rybchenko
> > <arybchenko at solarflare.com>; David Christensen
> <drc at linux.vnet.ibm.com>;
> > bruce.richardson at intel.com; konstantin.ananyev at intel.com;
> > hemant.agrawal at nxp.com; Shahaf Shuler <shahafs at mellanox.com>;
> > Honnappa Nagarahalli <Honnappa.Nagarahalli at arm.com>; Gavin Hu (Arm
> > Technology China) <Gavin.Hu at arm.com>; viktorin at rehivetech.com;
> > anatoly.burakov at intel.com
> > Subject: Mbuf memory alignment constraints for (micro)architectures
> >
> > CC:  Arch and platform maintainers
> >
> > While reviewing the mempool object allocation requirements in the code,
> >
> > A) It was found that, in the default case, mempool objects get padding
> > in the object trailer so that the start addresses of objects are spread
> > across the different channels,
> > which loads the DRAM channels equally and gives better performance.
> >
> > # More documentation is here
> > https://doc.dpdk.org/guides/prog_guide/mempool_lib.html
> > in section 8.3. Memory Alignment Constraints
> >
> > B) optimize_object_size() implements the channel distribution requirement
> > with the following formula:
> >
> >         new_obj_size = (obj_size + RTE_MEMPOOL_ALIGN_MASK) / RTE_MEMPOOL_ALIGN;
> >         while (get_gcd(new_obj_size, nrank * nchan) != 1)
> >                new_obj_size++;
> >
> >
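For readers who want to try the formula in (B) outside DPDK, here is a minimal standalone sketch. The sketch_ names, the 64-byte alignment, and the fallback to 4 channels / 1 rank when a count is 0 are assumptions of this sketch, not the library's code. The final multiplication converts the result from alignment units back to bytes.

```c
#include <assert.h>
#include <limits.h>

/* Assumed object alignment in bytes; stands in for RTE_MEMPOOL_ALIGN. */
#define SKETCH_ALIGN 64u
#define SKETCH_ALIGN_MASK (SKETCH_ALIGN - 1u)

/* Greatest common divisor by Euclid's algorithm. */
static unsigned sketch_gcd(unsigned a, unsigned b)
{
	while (b != 0) {
		unsigned t = a % b;
		a = b;
		b = t;
	}
	return a;
}

/*
 * Pad obj_size so that its size, counted in alignment units, is coprime
 * with nrank * nchan. Consecutive objects then start on different channels.
 */
static unsigned sketch_optimize_object_size(unsigned obj_size,
					    unsigned nchan, unsigned nrank)
{
	unsigned units;

	/* Guard the round-up below against unsigned overflow. */
	assert(obj_size <= UINT_MAX - SKETCH_ALIGN_MASK);

	/* Assumed defaults for this sketch when the counts are unknown. */
	if (nchan == 0)
		nchan = 4;
	if (nrank == 0)
		nrank = 1;

	/* Round up to whole alignment units, then grow until coprime. */
	units = (obj_size + SKETCH_ALIGN_MASK) / SKETCH_ALIGN;
	while (sketch_gcd(units, nrank * nchan) != 1)
		units++;

	return units * SKETCH_ALIGN;
}
```

With these assumptions, a 2048-byte object on 4 channels and 1 rank is 32 units; gcd(32, 4) is 4, so it grows to 33 units, i.e. 2112 bytes, which is 64 bytes of padding per object.
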
> > C) The formula in (B) is NOT generic. At least on the octeontx2 SoC,
> > the memory/DDR controller works in a different way:
> > # It XORs some of the physical address lines (not the user space
> > VA address)
> > to compute a hash, and that hash function defines the actual channel.
> >
> > The XOR (CRC-like) scheme is useful because it gives a natural channel
> > distribution
> > based on the address, i.e. there is no need for padding that wastes memory.
> >
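To illustrate the kind of XOR hashing described in (C), here is a small sketch. The bit positions in sketch_chan_masks and the two-bit channel number are made up for illustration; they are not the octeontx2 mapping, which is not given in this thread.

```c
#include <stdint.h>

/* Parity (XOR of all bits) of a 64-bit value. */
static unsigned sketch_parity64(uint64_t v)
{
	v ^= v >> 32;
	v ^= v >> 16;
	v ^= v >> 8;
	v ^= v >> 4;
	v ^= v >> 2;
	v ^= v >> 1;
	return (unsigned)(v & 1u);
}

/*
 * Hypothetical mapping: each channel-select bit is the XOR of a chosen
 * set of physical address bits. These masks are invented for the sketch.
 */
static const uint64_t sketch_chan_masks[2] = {
	(1ull << 7) | (1ull << 13) | (1ull << 19),
	(1ull << 8) | (1ull << 14) | (1ull << 20),
};

/* Channel number (0..3) selected for a physical address. */
static unsigned sketch_xor_channel(uint64_t phys_addr)
{
	unsigned chan = 0;

	for (unsigned i = 0; i < 2; i++)
		chan |= sketch_parity64(phys_addr & sketch_chan_masks[i]) << i;
	return chan;
}
```

In this made-up mapping, higher address bits also feed the hash, so unpadded objects at a 2048-byte stride do not all land on one channel: offsets 0, 4 * 2048, 8 * 2048 and 12 * 2048 map to channels 0, 1, 2 and 3.
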
> > So, in short, the padding scheme is not needed on some SoCs. I am trying to
> > send a patch
> > to fix it. So the question is:
> >
> > # Do PPC and other ARM SoCs use formula (B) to compute DRAM channel
> > distribution? Or
> > is it specific to x86? That would define where the hooks need to be added
> > to have a proper fix.
> Reading through some documents, for both x86 and Arm, and after internal
> discussion,
> it looks like this is specific to x86. x86 spreads adjacent virtual addresses
> within a page across multiple memory devices,
> and the interleaving is done per one or two cache lines:
> https://software.intel.com/en-us/articles/how-memory-is-accessed
> 
> Arm leaves this to implementations: there is no fixed pattern for
> interleaving, so it can hardly be generalized.
Same conclusion, but in more words on this topic (from Arm internally):
"Interleaving (or striping) happens at the interconnect/memory controller level, so on Arm-based systems it's going to be highly dependent on the given SoC's integration and probably the system configuration too. Arm's own interconnect and DMC IPs generally offer various options to support striping, but even then it's the integrator's choice how to use them, and obviously there are multitudes of alternative third-party IPs too.
In summary, this really depends on the system's interconnect and memory controller capabilities and how it has been configured."
/Gavin


