[dpdk-dev] [PATCH v4 02/10] eal: add power management intrinsics

Bruce Richardson bruce.richardson at intel.com
Fri Oct 9 13:36:06 CEST 2020


On Fri, Oct 09, 2020 at 12:12:56PM +0100, Burakov, Anatoly wrote:
> On 09-Oct-20 11:48 AM, Ananyev, Konstantin wrote:
> > > On 09-Oct-20 11:17 AM, Thomas Monjalon wrote:
> > > > 09/10/2020 12:03, Burakov, Anatoly:
> > > > > On 09-Oct-20 10:54 AM, Jerin Jacob wrote:
> > > > > > On Fri, Oct 9, 2020 at 3:10 PM Burakov, Anatoly
> > > > > > <anatoly.burakov at intel.com> wrote:
> > > > > > > On 09-Oct-20 10:29 AM, Thomas Monjalon wrote:
> > > > > > > > 09/10/2020 11:25, Burakov, Anatoly:
> > > > > > > > > On 09-Oct-20 6:42 AM, Jerin Jacob wrote:
> > > > > > > > > > On Thu, Oct 8, 2020 at 10:38 PM Ananyev, Konstantin
> > > > > > > > > > <konstantin.ananyev at intel.com> wrote:
> > > > > > > > > > > > On Thu, Oct 8, 2020 at 6:57 PM Burakov, Anatoly
> > > > > > > > > > > > <anatoly.burakov at intel.com> wrote:
> > > > > > > > > > > > > On 08-Oct-20 9:44 AM, Jerin Jacob wrote:
> > > > > > > > > > > > > > On Thu, Oct 8, 2020 at 2:04 PM Thomas Monjalon <thomas at monjalon.net> wrote:
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > Add two new power management intrinsics, and provide an implementation
> > > > > > > > > > > > > > > > in eal/x86 based on UMONITOR/UMWAIT instructions. The instructions
> > > > > > > > > > > > > > > > are implemented as raw byte opcodes because there is not yet widespread
> > > > > > > > > > > > > > > > compiler support for these instructions.
> > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > The power management instructions provide an architecture-specific
> > > > > > > > > > > > > > > > function to either wait until a specified TSC timestamp is reached, or
> > > > > > > > > > > > > > > > optionally wait until either a TSC timestamp is reached or a memory
> > > > > > > > > > > > > > > > location is written to. The monitor function also provides an optional
> > > > > > > > > > > > > > > > comparison, to avoid sleeping when the expected write has already
> > > > > > > > > > > > > > > > happened, and no more writes are expected.
> > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > For more details, Please reference Intel SDM Volume 2.
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > I really would like to see feedbacks from other arch maintainers.
> > > > > > > > > > > > > > > Unfortunately they were not Cc'ed.
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > Shared the feedback from the arm64 perspective here. Yet to get a reply on this.
> > > > > > > > > > > > > > http://mails.dpdk.org/archives/dev/2020-September/181646.html
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > Also please mark the new functions as experimental.
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Hi Jerin,
> > > > > > > > > > > > 
> > > > > > > > > > > > Hi Anatoly,
> > > > > > > > > > > > 
> > > > > > > > > > > > > 
> > > > > > > > > > > > >       > IMO, We must introduce some arch feature-capability _get_ scheme to tell
> > > > > > > > > > > > >       > the consumer of this API is only supported on x86. Probably as
> > > > > > > > > > > > > functions[1]
> > > > > > > > > > > > >       > or macro flags scheme and have a stub for the other architectures as the
> > > > > > > > > > > > >       > API marked as generic ie rte_power_* not rte_x86_..
> > > > > > > > > > > > >       >
> > > > > > > > > > > > >       > This will help the consumer to create workers based on the
> > > > > > > > > > > > > instruction features
> > > > > > > > > > > > >       > which can NOT be abstracted as a generic feature across the
> > > > > > > > > > > > > architectures.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > I'm not entirely sure what you mean by that.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > I mean, yes, we should have added stubs for other architectures, and we
> > > > > > > > > > > > > will add those in future revisions, but what does your proposed runtime
> > > > > > > > > > > > > check accomplish that cannot currently be done with CPUID flags?
> > > > > > > > > > > > 
> > > > > > > > > > > > 
> > > > > > > > > > > > RTE_CPUFLAG_WAITPKG  flag definition is not available in other architectures.
> > > > > > > > > > > > i.e RTE_CPUFLAG_WAITPKG defined in lib/librte_eal/x86/include/rte_cpuflags.h
> > > > > > > > > > > > and it is used in http://patches.dpdk.org/patch/79540/ as generic API.
> > > > > > > > > > > > I doubt http://patches.dpdk.org/patch/79540/  would compile on non-x86.
> > > > > > > > > > > 
> > > > > > > > > > > 
> > > > > > > > > > > I am agree with Jerin, that we need some generic way to
> > > > > > > > > > > figure-out does platform supports power_monitor() or not.
> > > > > > > > > > > Though not sure do we need to create a new feature-get framework here...
> > > > > > > > > > 
> > > > > > > > > > That's works too. Some means of generic probing is fine. Following
> > > > > > > > > > schemed needs
> > > > > > > > > > more documentation on that usage, as, it is not straight forward compare to
> > > > > > > > > > feature-get framework. Also, on the other thread, we are adding the
> > > > > > > > > > new instructions like
> > > > > > > > > > demote cacheline etc, maybe if the user wants to KNOW if the arch
> > > > > > > > > > supports it then
> > > > > > > > > > the feature-get framework is good.
> > > > > > > > > > If we think, there is no other usecase for generic arch feature-get
> > > > > > > > > > framework then
> > > > > > > > > > we can keep the below scheme else generic arch feature is better for
> > > > > > > > > > more forward
> > > > > > > > > > looking use cases.
> > > > > > > > > > 
> > > > > > > > > > > Might be just something like:
> > > > > > > > > > >       rte_power_monitor(...) == -ENOTSUP
> > > > > > > > > > > be enough indication for that?
> > > > > > > > > > > So user can just do:
> > > > > > > > > > > if (rte_power_monitor(NULL, 0, 0, 0, 0) == -ENOTSUP) {
> > > > > > > > > > >              /* not supported  path */
> > > > > > > > > > > }
> > > > > > > > > > > 
> > > > > > > > > > > To check is that feature supported or not.
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > Looking at CLDEMOTE patches, CLDEMOTE is a noop on other archs. I think
> > > > > > > > > we can safely make this intrinsic as a noop on other archs as well, as
> > > > > > > > > it's functionally identical to waking up immediately.
> > > > > > > > > 
> > > > > > > > > If we're not creating this for CLDEMOTE, we don't need it here as well.
> > > > > > > > > If we do need it for this, then we arguably need it for CLDEMOTE too.
> > > > > > > > 
> > > > > > > > Sorry I don't understand what you mean, too many "it" and "this" :)
> > > > > > > > 
> > > > > > > 
> > > > > > > Sorry, i meant "the generic feature-get framework". CLDEMOTE doesn't
> > > > > > > exist on other archs, this doesn't too, so it's a fairly similar
> > > > > > > situation. Stubbing UMWAIT with a noop is a valid approach because it's
> > > > > > > equivalent to sleeping and then immediately waking up (which can happen
> > > > > > > for a host of reasons unrelated to the code itself).
> > > > > > 
> > > > > > If we are keeping the following return in the public API then it can not be NOP
> > > > > > + * @return
> > > > > > + *   - 1 if wakeup was due to TSC timeout expiration.
> > > > > > + *   - 0 if wakeup was due to memory write or other reasons.
> > > > > > + */
> > > > > > 
> > > > > 
> > > > > In the generic header, it is specified that return value is
> > > > > implementation-defined (i.e. arch-specific).
> > > > 
> > > > Obviously an API definition should *never* be "implementation-defined".
> > > 
> > > If there isn't a meaningful return value, we could either make it a
> > > void, or return 0/-ENOTSUP so. I'm OK with either as nop is a valid
> > > result for a UMWAIT, and there are no side-effects to the intrinsic
> > > itself (it's basically a fancy rte_pause).
> > > 
> > > > 
> > > > 
> > > > > I guess we could remove
> > > > > that and set return value to either 0 or -ENOTSUP if that would resolve
> > > > > the issue?
> > > > > 
> > > > > > Also, we need to fix compilation issue if any with
> > > > > > http://patches.dpdk.org/patch/79540/
> > > > > > as it has direct reference to if
> > > > > > (!rte_cpu_get_flag_enabled(RTE_CPUFLAG_WAITPKG)) {
> > > > > > Either we need to add -ENOTSUP return or generic feature-get framework.
> > > > > 
> > > > > IIRC power library isn't compiled on anything other than x86, so this
> > > > > code wouldn't get compiled.
> > > > 
> > > > It is not call "power-x86", so we must assume it could work
> > > > on any architecture.
> > > 
> > > #ifdef it is!
> > > 
> > > > 
> > > > 
> > > > > > > I'm not against a generic feature-get framework, i'm just pointing out
> > > > > > > that if this is what's preventing the merge, it should prevent the merge
> > > > > > > of CLDEMOTE as well
> > 
> > I wouldn't consider these two as totally equal.
> > Yes, both are just hints to CPU, that can be ignored,
> > but if not, then consequences of executing are quite different.
> > If UMWAIT is not supported by cpu at all, then user might want to use some
> > different power saving mechanism (pause, frequence scaling, etc.).
> > Without information is UMWAIT supported or not, user can't make
> > such choice.
> 
> After some attempts at implementing this, i actually came to the conclusion
> that some generic way to check support for this feature is necessary,
> because we end up with a usability inconsistency:
> 
> 1) on non-x86, if you call the function, it returns -ENOTSUP
> 2) on x86, since we're not checking CPUID flags on every single call, it'll
> either succeed, or crash the process - the burden is on the user to check
> for CPUID flags, but it can't be done in an arch agnostic way because the
> CPUID flags are only defined for x86, thus requiring a special code path for
> x86
> 
> Where would be the best place to add such an infrastructure? I'm thinking
> rte_cpuflags.h?
> 
Time to relook at some of the contents of patchset:
http://patches.dpdk.org/project/dpdk/list/?series=4811&archive=both&state=*

The idea of that set (IIRC) was to replace the per-architecture enums with
just strings to avoid situations like this - or at least make them less
awkward.

/Bruce


More information about the dev mailing list