[dpdk-dev] [RFC 1/6] eal: add power management intrinsics

Honnappa Nagarahalli Honnappa.Nagarahalli at arm.com
Wed Jun 3 08:22:57 CEST 2020


<snip>

> 
> On Thu, May 28, 2020 at 9:08 PM Ananyev, Konstantin
> <konstantin.ananyev at intel.com> wrote:
> >
> >
> > > > Hi Anatoly,
> > > >
> > > >>
> > > >> Add two new power management intrinsics, and provide an
> > > >> implementation in eal/x86 based on UMONITOR/UMWAIT instructions.
> > > >> The instructions are implemented as raw byte opcodes because
> > > >> there is not yet widespread compiler support for these instructions.
> > > >>
> > > >> The power management instructions provide an
> > > >> architecture-specific function to either wait until a specified
> > > >> TSC timestamp is reached, or optionally wait until either a TSC
> > > >> timestamp is reached or a memory location is written to. The
> > > >> monitor function also provides an optional comparison, to avoid
> > > >> sleeping when the expected write has already happened, and no more
> > > >> writes are expected.
> > > >
> > > > Recently ARM guys introduced new generic API for similar (as I
> > > > understand) purposes: rte_wait_until_equal_(16|32|64).
> > > > Probably would make sense to unite both APIs into something common
> > > > and HW transparent.
> > > > Konstantin
> > >
> > > Hi Konstantin,
> > >
> > > That's not really similar purpose. This is monitoring a cacheline
> > > for writes, not waiting on a specific value.
> >
> > I understand that.
> >
> > > The "expected" value is there
> > > as basically a hack to get around the race condition due to the fact
> > > that by the time you enter monitoring state, the write you're
> > > waiting for may have already happened.
> >
> > AFAIK, current rte_wait_until_equal_* does pretty much the same thing:
> >
> > LDXR memaddr, $reg  // an address to monitor for
> > if ($reg != expected_value)
> >    SEVL      //     arm monitor
> >    do {
> >        WFE     //      waits for write to that memory address
> >        LDXR memaddr, $reg
> >    } while ($reg != expected_value);
> >
> > Looks pretty similar to what rte_power_monitor() does, except you
> > don't have a loop for checking the new value.
> > Plus rte_power_monitor() provides extra options to the user -
> > timestamp and power save mode to enter.
> > Also I don't know what is the granularity of such events on ARM, is it
> > a cache-line or more/less.
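For reference, the wait-until-equal pattern in the pseudocode above can be sketched in portable C11 atomics. This is only an illustration under stated assumptions: the names wait_until_equal_32 and wait_until_equal_32_bounded are hypothetical and are not the DPDK API, and the loop busy-polls where an arm64 implementation would sleep in WFE.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not the DPDK API: spin until *addr == expected.
 * A real arm64 implementation would sleep in WFE instead of spinning. */
static inline void
wait_until_equal_32(_Atomic uint32_t *addr, uint32_t expected)
{
	while (atomic_load_explicit(addr, memory_order_acquire) != expected)
		; /* busy-wait; a pause/yield hint would normally go here */
}

/* Bounded variant (also hypothetical): give up after max_spins loads.
 * Returns true if the expected value was observed. */
static inline bool
wait_until_equal_32_bounded(_Atomic uint32_t *addr, uint32_t expected,
			    uint64_t max_spins)
{
	for (uint64_t i = 0; i < max_spins; i++) {
		if (atomic_load_explicit(addr, memory_order_acquire) == expected)
			return true;
	}
	return false;
}
```

The point of contrast with the monitor intrinsic is the loop: this pattern keeps re-checking until the value matches, whereas a monitor-style wait returns after a single wakeup and leaves the re-check to the caller.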
> 
> As I understand it, the granularity is per cache line,
> i.e. a load-exclusive (LDXR) followed by WFE will wait in a low-power state until
> the cache line is written.
The architecture allows the monitored region to be anywhere from 16B to 2048B. Typically, implementations use cache-line granularity.
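
One practical consequence is that a write to unrelated data in the same granule can wake the waiter. A minimal sketch of keeping a monitored word in a granule of its own follows. The 64-byte MONITOR_GRANULE value and the struct name are assumptions for illustration only; the real granule is implementation-specific and may be larger.

```c
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption for illustration: a 64-byte (cache-line-sized) granule.
 * The architecture permits anything from 16B to 2048B. */
#define MONITOR_GRANULE 64

/* Hypothetical type: a flag that occupies a whole granule by itself,
 * so writes to neighbouring data do not touch the monitored region. */
struct monitored_flag {
	alignas(MONITOR_GRANULE) volatile uint32_t value;
	uint8_t pad[MONITOR_GRANULE - sizeof(uint32_t)];
};

_Static_assert(sizeof(struct monitored_flag) == MONITOR_GRANULE,
	       "flag must fill exactly one granule");
_Static_assert(alignof(struct monitored_flag) == MONITOR_GRANULE,
	       "flag must start on a granule boundary");
```

With this layout, adjacent elements of an array of struct monitored_flag sit in different granules, assuming the granule really is 64 bytes on the target.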

> 
> But I see UMONITOR as a bit different: even _without_ another core signaling it
> to wake up from the wait state, it can wake on TSC expiry. I think that is the
> main primitive of this feature. Right?
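
To make the semantics under discussion concrete, here is a rough software emulation of a monitor-style wait: skip the wait if the expected value is already present, otherwise return on the first observed change or when a deadline passes. Everything here is a sketch under assumptions. monitor_wait, wake_reason, fake_clock and fake_ticks are hypothetical names, the clock is a fake counter standing in for a TSC read, and polling can only see a change of value, whereas the hardware monitor reacts to any write to the monitored range.

```c
#include <stdatomic.h>
#include <stdint.h>

typedef uint64_t (*clock_fn)(void); /* hypothetical stand-in for a TSC read */

enum wake_reason { WAKE_ALREADY_EXPECTED, WAKE_WRITE, WAKE_DEADLINE };

/* Fake monotonic counter, for illustration only. */
static uint64_t fake_ticks;
static uint64_t fake_clock(void) { return fake_ticks++; }

/* Hypothetical emulation of a monitor-style wait, not the RFC's API. */
static enum wake_reason
monitor_wait(_Atomic uint64_t *p, uint64_t expected, uint64_t mask,
	     uint64_t deadline, clock_fn now)
{
	/* "Arm" the monitor by sampling the current value. */
	uint64_t armed = atomic_load_explicit(p, memory_order_acquire);

	/* Optional comparison: the awaited write may already have happened. */
	if (mask != 0 && (armed & mask) == expected)
		return WAKE_ALREADY_EXPECTED;

	/* Poll until the value changes or the deadline is reached.
	 * No loop on the new value: the caller re-checks after wakeup. */
	while (now() < deadline) {
		if (atomic_load_explicit(p, memory_order_acquire) != armed)
			return WAKE_WRITE;
	}
	return WAKE_DEADLINE;
}
```

The deadline path is what needs no other core: the call returns on its own once the counter reaches the given value.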
> 
> WFE can also wake on timer stream events (roughly analogous to the TSC on
> x86), but I think there is a configuration bit, defined by EL1 (the Linux
> kernel), that determines whether this scheme is allowed in userspace (EL0).
Timer stream events are not per CPU core. They are system-wide streams.

> I am planning to spend time on this after understanding the value addition of
> the feature/usecase[1]
>
> [1] http://mails.dpdk.org/archives/dev/2020-May/168888.html
> 
> > Maybe ARM people can comment/correct me here.
> > Konstantin

