[dpdk-dev] [PATCH v4 02/10] eal: add power management intrinsics

Burakov, Anatoly anatoly.burakov at intel.com
Fri Oct 9 12:22:58 CEST 2020


On 09-Oct-20 11:17 AM, Thomas Monjalon wrote:
> 09/10/2020 12:03, Burakov, Anatoly:
>> On 09-Oct-20 10:54 AM, Jerin Jacob wrote:
>>> On Fri, Oct 9, 2020 at 3:10 PM Burakov, Anatoly
>>> <anatoly.burakov at intel.com> wrote:
>>>> On 09-Oct-20 10:29 AM, Thomas Monjalon wrote:
>>>>> 09/10/2020 11:25, Burakov, Anatoly:
>>>>>> On 09-Oct-20 6:42 AM, Jerin Jacob wrote:
>>>>>>> On Thu, Oct 8, 2020 at 10:38 PM Ananyev, Konstantin
>>>>>>> <konstantin.ananyev at intel.com> wrote:
>>>>>>>>> On Thu, Oct 8, 2020 at 6:57 PM Burakov, Anatoly
>>>>>>>>> <anatoly.burakov at intel.com> wrote:
>>>>>>>>>> On 08-Oct-20 9:44 AM, Jerin Jacob wrote:
>>>>>>>>>>> On Thu, Oct 8, 2020 at 2:04 PM Thomas Monjalon <thomas at monjalon.net> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Add two new power management intrinsics, and provide an implementation
>>>>>>>>>>>>> in eal/x86 based on UMONITOR/UMWAIT instructions. The instructions
>>>>>>>>>>>>> are implemented as raw byte opcodes because there is not yet widespread
>>>>>>>>>>>>> compiler support for these instructions.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The power management instructions provide an architecture-specific
>>>>>>>>>>>>> function to either wait until a specified TSC timestamp is reached, or
>>>>>>>>>>>>> optionally wait until either a TSC timestamp is reached or a memory
>>>>>>>>>>>>> location is written to. The monitor function also provides an optional
>>>>>>>>>>>>> comparison, to avoid sleeping when the expected write has already
>>>>>>>>>>>>> happened, and no more writes are expected.
>>>>>>>>>>>>>
>>>>>>>>>>>>> For more details, Please reference Intel SDM Volume 2.
>>>>>>>>>>>>
>>>>>>>>>>>> I really would like to see feedbacks from other arch maintainers.
>>>>>>>>>>>> Unfortunately they were not Cc'ed.
>>>>>>>>>>>
>>>>>>>>>>> Shared the feedback from the arm64 perspective here. Yet to get a reply on this.
>>>>>>>>>>> http://mails.dpdk.org/archives/dev/2020-September/181646.html
>>>>>>>>>>>
>>>>>>>>>>>> Also please mark the new functions as experimental.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Hi Jerin,
>>>>>>>>>
>>>>>>>>> Hi Anatoly,
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>      > IMO, We must introduce some arch feature-capability _get_ scheme to tell
>>>>>>>>>>      > the consumer of this API is only supported on x86. Probably as
>>>>>>>>>> functions[1]
>>>>>>>>>>      > or macro flags scheme and have a stub for the other architectures as the
>>>>>>>>>>      > API marked as generic ie rte_power_* not rte_x86_..
>>>>>>>>>>      >
>>>>>>>>>>      > This will help the consumer to create workers based on the
>>>>>>>>>> instruction features
>>>>>>>>>>      > which can NOT be abstracted as a generic feature across the
>>>>>>>>>> architectures.
>>>>>>>>>>
>>>>>>>>>> I'm not entirely sure what you mean by that.
>>>>>>>>>>
>>>>>>>>>> I mean, yes, we should have added stubs for other architectures, and we
>>>>>>>>>> will add those in future revisions, but what does your proposed runtime
>>>>>>>>>> check accomplish that cannot currently be done with CPUID flags?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> RTE_CPUFLAG_WAITPKG  flag definition is not available in other architectures.
>>>>>>>>> i.e RTE_CPUFLAG_WAITPKG defined in lib/librte_eal/x86/include/rte_cpuflags.h
>>>>>>>>> and it is used in http://patches.dpdk.org/patch/79540/ as generic API.
>>>>>>>>> I doubt http://patches.dpdk.org/patch/79540/  would compile on non-x86.
>>>>>>>>
>>>>>>>>
>>>>>>>> I am agree with Jerin, that we need some generic way to
>>>>>>>> figure-out does platform supports power_monitor() or not.
>>>>>>>> Though not sure do we need to create a new feature-get framework here...
>>>>>>>
>>>>>>> That's works too. Some means of generic probing is fine. Following
>>>>>>> schemed needs
>>>>>>> more documentation on that usage, as, it is not straight forward compare to
>>>>>>> feature-get framework. Also, on the other thread, we are adding the
>>>>>>> new instructions like
>>>>>>> demote cacheline etc, maybe if the user wants to KNOW if the arch
>>>>>>> supports it then
>>>>>>> the feature-get framework is good.
>>>>>>> If we think, there is no other usecase for generic arch feature-get
>>>>>>> framework then
>>>>>>> we can keep the below scheme else generic arch feature is better for
>>>>>>> more forward
>>>>>>> looking use cases.
>>>>>>>
>>>>>>>> Might be just something like:
>>>>>>>>      rte_power_monitor(...) == -ENOTSUP
>>>>>>>> be enough indication for that?
>>>>>>>> So user can just do:
>>>>>>>> if (rte_power_monitor(NULL, 0, 0, 0, 0) == -ENOTSUP) {
>>>>>>>>             /* not supported  path */
>>>>>>>> }
>>>>>>>>
>>>>>>>> To check is that feature supported or not.
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> Looking at CLDEMOTE patches, CLDEMOTE is a noop on other archs. I think
>>>>>> we can safely make this intrinsic as a noop on other archs as well, as
>>>>>> it's functionally identical to waking up immediately.
>>>>>>
>>>>>> If we're not creating this for CLDEMOTE, we don't need it here as well.
>>>>>> If we do need it for this, then we arguably need it for CLDEMOTE too.
>>>>>
>>>>> Sorry I don't understand what you mean, too many "it" and "this" :)
>>>>>
>>>>
>>>> Sorry, i meant "the generic feature-get framework". CLDEMOTE doesn't
>>>> exist on other archs, this doesn't too, so it's a fairly similar
>>>> situation. Stubbing UMWAIT with a noop is a valid approach because it's
>>>> equivalent to sleeping and then immediately waking up (which can happen
>>>> for a host of reasons unrelated to the code itself).
>>>
>>> If we are keeping the following return in the public API then it can not be NOP
>>> + * @return
>>> + *   - 1 if wakeup was due to TSC timeout expiration.
>>> + *   - 0 if wakeup was due to memory write or other reasons.
>>> + */
>>>
>>
>> In the generic header, it is specified that return value is
>> implementation-defined (i.e. arch-specific).
> 
> Obviously an API definition should *never* be "implementation-defined".

If there isn't a meaningful return value, we could either make it a 
void, or return 0/-ENOTSUP so. I'm OK with either as nop is a valid 
result for a UMWAIT, and there are no side-effects to the intrinsic 
itself (it's basically a fancy rte_pause).

> 
> 
>> I guess we could remove
>> that and set return value to either 0 or -ENOTSUP if that would resolve
>> the issue?
>>
>>> Also, we need to fix compilation issue if any with
>>> http://patches.dpdk.org/patch/79540/
>>> as it has direct reference to if
>>> (!rte_cpu_get_flag_enabled(RTE_CPUFLAG_WAITPKG)) {
>>> Either we need to add -ENOTSUP return or generic feature-get framework.
>>
>> IIRC power library isn't compiled on anything other than x86, so this
>> code wouldn't get compiled.
> 
> It is not call "power-x86", so we must assume it could work
> on any architecture.

#ifdef it is!

> 
> 
>>>> I'm not against a generic feature-get framework, i'm just pointing out
>>>> that if this is what's preventing the merge, it should prevent the merge
>>>> of CLDEMOTE as well, yet Jerin has acked that one and has explicitly
>>>> stated that he's OK with leaving CLDEMOTE as a noop on other architectures.
> 
> CLDEMOTE is used for optimization, while UMWAIT can be used in a logic,
> that's why the expectations may be different.
> 

UMWAIT is a best-effort mechanism with no side-effects. It's perfectly 
legal for a UMWAIT to not sleep at all, thus rendering it effectively a 
noop. So i don't think it's all that different.

-- 
Thanks,
Anatoly


More information about the dev mailing list