[dpdk-dev] [PATCH v12 09/14] build: optional NUMA and cpu counts detection
Bruce Richardson
bruce.richardson at intel.com
Thu Nov 19 13:19:47 CET 2020
On Wed, Nov 18, 2020 at 03:23:13PM +0000, Juraj Linkeš wrote:
>
>
> > -----Original Message-----
> > From: Thomas Monjalon <thomas at monjalon.net>
> > Sent: Wednesday, November 18, 2020 3:43 PM
> > To: Bruce Richardson <bruce.richardson at intel.com>; Juraj Linkeš
> > <juraj.linkes at pantheon.tech>
> > Cc: Ruifeng.Wang at arm.com; Honnappa.Nagarahalli at arm.com;
> > Phil.Yang at arm.com; vcchunga at amazon.com; Dharmik.Thakkar at arm.com;
> > jerinjacobk at gmail.com; hemant.agrawal at nxp.com;
> > ajit.khaparde at broadcom.com; ferruh.yigit at intel.com; dev at dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v12 09/14] build: optional NUMA and cpu counts
> > detection
> >
> > 18/11/2020 15:19, Juraj Linkeš:
> > > From: Thomas Monjalon <thomas at monjalon.net>
> > > > 16/11/2020 10:13, Bruce Richardson:
> > > > > On Mon, Nov 16, 2020 at 08:24:48AM +0100, Thomas Monjalon wrote:
> > > > > > 13/11/2020 15:31, Juraj Linkeš:
> > > > > > > > +option('max_lcores', type: 'integer', value: 0,
> > > > > > > > +	description: 'maximum number of cores/threads supported by EAL. Set to positive integer to overwrite per-arch or cross-compilation defaults. Set to -1 to detect the number of cores on the build machine.')
> > > > > > > > +option('max_numa_nodes', type: 'integer', value: 0,
> > > > > > > > +	description: 'maximum number of NUMA nodes supported by EAL. Set to positive integer to overwrite per-arch or cross-compilation defaults. Set to -1 to detect the number of numa nodes on the build machine.')
> > > > > >
> > > > > > First comment: I don't like having so long description.
> > > > > > Second: I don't understand.
> > > > > >
> > > > > > It is said the default value is 0 so I expect it means automatic detection.
> > > > > > But later it is said -1 is for detection. So ?
> > > > > >
> > > > > Zero is for the "per-arch or cross-compilation default". This was
> > > > > > discussed quite a bit in previous versions and this was the best
> > > > > compromise we could come up with. Having a default of auto-detect
> > > > > is definitely not something I think we should go with - just
> > > > > thinking of all the build CI jobs running on
> > > > > 2 or 4 core VMs! However, Juraj really felt there was value in
> > > > > having auto-detection, so it's set as a -1 value, which I'm ok with.
> > > >
> > > > The problem is that I don't understand what 0 means.
> > > >
> > >
> > > There are three pieces of information which we need to convey:
> > > 1. The default value (0) indicates that per-arch or cross-compilation defaults
> > will be used.
> > > 2. Positive integer values will be used instead of these defaults.
> >
> > Where do these positive values come from?
> >
>
> From the user - they will have the option to set it to whatever they like if they don't want to use defaults.
>
> > > 3. Detected values will be used for native build when the value is -1.
> >
> > Why not detect for any native build set up with 0 (default)?
> >
>
> I'll let Bruce explain this, but I'll just say that we wanted to make the detection the default for native builds, so we're in agreement.
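
For reference, the three cases listed in the quoted text above (0, a positive
integer, -1) can be sketched as below. This is only an illustration in Python,
not the meson code from the patch; the function name, its parameters and the
fallback when detection fails are all assumptions.

```python
import os


def resolve_max_lcores(option_value, arch_default, detect=os.cpu_count):
    """Hypothetical helper mirroring the three cases from the thread.

    option_value -- value of the 'max_lcores' build option
    arch_default -- per-arch or cross-compilation default (assumed input)
    detect       -- callable returning the core count of the build machine
    """
    if option_value > 0:
        # Case 2: an explicit positive value from the user wins.
        return option_value
    if option_value == 0:
        # Case 1: the default, use the per-arch/cross-compilation value.
        return arch_default
    if option_value == -1:
        # Case 3: detect on the build machine; falling back to the
        # default when detection fails is an assumption of this sketch.
        return detect() or arch_default
    raise ValueError("max_lcores must be -1, 0 or a positive integer")
```

With this layout a CI job on a small VM that leaves the option at 0 still gets
the per-arch default, and only an explicit -1 ties the result to the build
machine.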
I think most of us agree that differing understandings of the term
"native build" are the cause of many of the disagreements and points of
dispute on this thread. From my viewpoint, the term "native" can refer to:
1. what meson considers a native build, i.e. one not using a cross-file
2. a build for the same machine architecture as that of the build
machine (this largely overlaps with #1, except that e.g. a 32-bit build on
a 64-bit machine may be considered a cross-build in this case).
3. a build tailored exactly for the build machine itself, i.e. in both ISA
and things like core counts.
4. a flag passed to the compiler to indicate the uarch level of the
instruction set to be used, e.g. on x86, AVX2, AVX-512 etc., based on
that of the build machine.
Historically, IIRC, in DPDK the "RTE_MACHINE" value was originally #4 since
that was its use on x86 in the first versions of DPDK. With the move from
make to meson, that aspect was kept, but the meaning of #1 (I think we can
ignore #2) also came into play. Finally, while for the x86 architecture the
idea of #4 still holds, for ARM use #3 is of major concern.
Is this a fair summary?
Based on this, my thinking is that the current "machine" value really needs
to be either renamed or split into two. We need to separate out the idea of
the "platform" (apologies if this is not the right term), from the
"instruction set"/"uarch" to make it clear what the value refers to. The
default "platform" value should probably be "generic", and the default
"instruction set" should be "default", which means it's set by the
"platform" value.
This, I believe, should give us the flexibility we need: to tune to the
native machine (case #3 above), set the platform to "native"; to get
behaviour #4, i.e. adjust only the ISA level while staying generic in terms
of other values, set the "instruction set" value. In other words, for x86
the "machine" value as currently used becomes the "instruction set" one,
while for ARM (if I understand the requirements correctly) the "machine"
value becomes the "platform" one.
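
To make the proposed split a bit more concrete, here is a rough sketch in
Python rather than meson. Nothing here exists in DPDK: the function, the table
and the "baseline" ISA name are made up for illustration, and treating a
"native" platform as the trigger for core/NUMA count detection is my reading
of case #3, not a settled design.

```python
# Hypothetical mapping: the ISA level each platform implies when the
# "instruction set" option is left at "default".
PLATFORM_ISA = {
    "generic": "baseline",  # assumed name for a lowest-common ISA level
    "native": "native",     # tune the ISA to the build machine
}


def resolve_build(platform="generic", instruction_set="default"):
    """Resolve the two proposed options into effective build settings."""
    if platform not in PLATFORM_ISA:
        raise ValueError("unknown platform: %s" % platform)
    if instruction_set == "default":
        # "default" means the ISA level is set by the platform value.
        isa = PLATFORM_ISA[platform]
    else:
        # An explicit ISA changes only the ISA level (behaviour #4).
        isa = instruction_set
    return {
        "platform": platform,
        "instruction_set": isa,
        # Assumption: only a native platform detects core/NUMA counts.
        "detect_counts": platform == "native",
    }
```

Under these assumptions, resolve_build(instruction_set="native") keeps the
generic platform settings while tuning the ISA (the x86 usage), whereas
resolve_build(platform="native") tailors everything to the build machine (the
ARM usage).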
Thoughts on this?
/Bruce