[dpdk-dev] [PATCH v4 02/10] eal: add power management intrinsics

Burakov, Anatoly anatoly.burakov at intel.com
Fri Oct 9 18:59:17 CEST 2020


On 09-Oct-20 5:56 PM, Ananyev, Konstantin wrote:
> 
>>
>> On 09-Oct-20 4:39 PM, Ananyev, Konstantin wrote:
>>>
>>>> On 08-Oct-20 6:15 PM, Ananyev, Konstantin wrote:
>>>>>>
>>>>>> Add two new power management intrinsics, and provide an implementation
>>>>>> in eal/x86 based on UMONITOR/UMWAIT instructions. The instructions
>>>>>> are implemented as raw byte opcodes because there is not yet widespread
>>>>>> compiler support for these instructions.
>>>>>>
>>>>>> The power management instructions provide an architecture-specific
>>>>>> function to either wait until a specified TSC timestamp is reached, or
>>>>>> optionally wait until either a TSC timestamp is reached or a memory
>>>>>> location is written to. The monitor function also provides an optional
>>>>>> comparison, to avoid sleeping when the expected write has already
>>>>>> happened, and no more writes are expected.
>>>>>
>>>>> I think what this API is missing is a function to wake up a sleeping core.
>>>>> If user can/should use some system call to achieve that, then at least
>>>>> it has to be clearly documented, even better some wrapper provided.
>>>>
>>>> I don't think it's possible to do that without severely overcomplicating
>>>> the intrinsic and its usage, because AFAIK the only way to wake up a
>>>> sleeping core would be to send some kind of interrupt to the core, or
>>>> trigger a write to the cache-line in question.
>>>>
>>>
>>> Yes, I think we either need a syscall that would do an IPI for us
>>> (off the top of my head, membarrier() does that; there might be other syscalls too),
>>> or something hand-made. For hand-made, I wonder would something like that
>>> be safe and sufficient:
>>> uint64_t val = atomic_load(addr);
>>> CAS(addr, val, &val);
>>> ?
>>> Anyway, one way or another - I think the ability to wake up a core we put to sleep
>>> has to be an essential part of this feature.
>>> As I understand it, the Linux kernel limits the maximum sleep time for these instructions:
>>> https://lwn.net/Articles/790920/
>>> But relying on that alone seems too fragile to me:
>>> - the user can adjust that value
>>> - it wouldn't apply to older kernels or non-Linux cases
>>> Konstantin
>>>
>>
>> This implies knowing the value the core is sleeping on.
> 
> You don't need the value to wait for, you just need an address.
> And you can make wakeup function to accept address as a parameter,
> same as monitor() does.

Sorry, I meant the address. We don't know the address we're sleeping on.

> 
>> That's not
>> always the case - with this particular PMD power management scheme, we
>> get the address from the PMD and it stays inside the callback.
> 
> That's fine - you can store the address inside your callback metadata
> and do wakeup as part of _disable_ function.
> 

The address may differ from call to call, and by the time we access the
stored address it may have become stale, so I don't see how that would
help unless you're suggesting some kind of synchronization mechanism there.
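
To make the hand-made wakeup idea from earlier in the thread concrete, here
is a minimal sketch in C11 atomics. The helper name wakeup_by_rewrite() is
hypothetical and not part of the patch. It only rewrites the current value
at the given address; whether that store is sufficient to wake a core
sleeping in UMWAIT is exactly the open question above, and the sketch does
nothing about the address going stale.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patch): store the current value of
 * *addr back to itself, so that the monitored cache line sees a write
 * without its contents changing.
 */
static void
wakeup_by_rewrite(_Atomic uint64_t *addr)
{
	uint64_t val;

	assert(addr != NULL);

	val = atomic_load(addr);
	/*
	 * If *addr still equals val, the CAS stores val again. If it fails,
	 * another writer has just modified the location, which is a write to
	 * the same line anyway, so no retry is attempted.
	 */
	atomic_compare_exchange_strong(addr, &val, val);
}
```

A disable path that had the address stored in its callback metadata would
call wakeup_by_rewrite(addr), but it would still need some way to know that
addr is the address currently being monitored.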

-- 
Thanks,
Anatoly

