[dpdk-dev] [PATCH v4 02/10] eal: add power management intrinsics

Ananyev, Konstantin konstantin.ananyev at intel.com
Fri Oct 9 18:56:54 CEST 2020


> 
> On 09-Oct-20 4:39 PM, Ananyev, Konstantin wrote:
> >
> >> On 08-Oct-20 6:15 PM, Ananyev, Konstantin wrote:
> >>>>
> >>>> Add two new power management intrinsics, and provide an implementation
> >>>> in eal/x86 based on UMONITOR/UMWAIT instructions. The instructions
> >>>> are implemented as raw byte opcodes because there is not yet widespread
> >>>> compiler support for these instructions.
> >>>>
> >>>> The power management instructions provide an architecture-specific
> >>>> function to either wait until a specified TSC timestamp is reached, or
> >>>> optionally wait until either a TSC timestamp is reached or a memory
> >>>> location is written to. The monitor function also provides an optional
> >>>> comparison, to avoid sleeping when the expected write has already
> >>>> happened, and no more writes are expected.
> >>>
> >>> I think what this API is missing is a function to wake up a sleeping core.
> >>> If the user can/should use some system call to achieve that, then at least
> >>> it has to be clearly documented, or even better, some wrapper provided.
> >>
> >> I don't think it's possible to do that without severely overcomplicating
> >> the intrinsic and its usage, because AFAIK the only way to wake up a
> >> sleeping core would be to send some kind of interrupt to the core, or
> >> trigger a write to the cache-line in question.
> >>
> >
> > Yes, I think we either need a syscall that would do an IPI for us
> > (off the top of my head, membarrier() does that; there may be other syscalls too),
> > or something hand-made. For hand-made, I wonder whether something like this
> > would be safe and sufficient:
> > uint64_t val = atomic_load(addr);
> > CAS(addr, val, &val);
> > ?
> > Anyway, one way or another, I think the ability to wake up a core we put to sleep
> > has to be an essential part of this feature.
> > As I understand it, the Linux kernel will limit the maximum sleep time for these instructions:
> > https://lwn.net/Articles/790920/
> > But relying just on that seems too vague to me:
> > - the user can adjust that value
> > - it wouldn't apply to older kernels and non-Linux cases
> > Konstantin
> >
> 
> This implies knowing the value the core is sleeping on.

You don't need the value to wait for, you just need an address.
And you can make the wakeup function accept an address as a parameter,
same as monitor() does.
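
A rough sketch of such a wakeup function, built on the load + CAS idea
quoted above. The name wakeup_touch() and the use of C11 atomics are
assumptions made for illustration, not part of the patch. It also assumes
the CAS compiles to a locked instruction that writes to the line; whether
a same-value write is enough to end UMWAIT is still the open question.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: touch the monitored
 * cache line without changing the value stored there.
 */
static inline void
wakeup_touch(_Atomic uint64_t *addr)
{
	/* Only the address is needed; the current value is read here. */
	uint64_t val = atomic_load_explicit(addr, memory_order_relaxed);

	/*
	 * Swap in the value we just read. If another writer got in first,
	 * the CAS fails, but then that other write has already hit the
	 * line, so the stored value is left intact either way.
	 */
	(void)atomic_compare_exchange_strong(addr, &val, val);
}
```

The caller passes the same address that was given to monitor(), and
never has to know what value the sleeping core is waiting for.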

> That's not
> always the case - with this particular PMD power management scheme, we
> get the address from the PMD and it stays inside the callback.

That's fine - you can store the address inside your callback metadata
and do the wakeup as part of the _disable_ function.
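
Something along these lines, as a sketch only. The struct, its field
names, and both function names are hypothetical and do not come from the
patch set; the same-value CAS used as the "kick" rests on the same
unverified assumption as the hand-made wakeup discussed earlier.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef _Atomic uint64_t mon_word_t;

/* Hypothetical per-queue callback metadata; names are illustrative. */
struct pmgmt_cb_meta {
	mon_word_t *_Atomic wait_addr; /* address obtained from the PMD, NULL if none */
	atomic_bool enabled;           /* cleared by the disable path */
};

/* Called from the Rx callback once the PMD has reported its address. */
static inline void
pmgmt_record_addr(struct pmgmt_cb_meta *m, mon_word_t *addr)
{
	atomic_store(&m->wait_addr, addr);
}

/* Disable path: stop further sleeps, then touch the line to kick a sleeper. */
static inline void
pmgmt_disable(struct pmgmt_cb_meta *m)
{
	mon_word_t *addr;

	atomic_store(&m->enabled, false);
	addr = atomic_load(&m->wait_addr);
	if (addr != NULL) {
		uint64_t val = atomic_load(addr);
		/* Same-value CAS: leaves the stored value unchanged. */
		(void)atomic_compare_exchange_strong(addr, &val, val);
	}
}
```

The address never leaves the callback's own metadata, so nothing extra
has to be exposed to the application.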

