ARM v8 rte_power_pause

Wathsala Vithanage wathsala.vithanage at arm.com
Wed Jun 17 13:57:04 CEST 2026


Hi Morten and Hemant,

YIELD is a NOP on non-SMT CPUs, such as Neoverse.

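WFE is universally available on AArch64, but it comes with a caveat: the
CPU can remain in a low-power state indefinitely unless an event is
triggered. That event can be generated explicitly by a different CPU
executing SEV (SEVL only sets the event register of the local CPU), or
implicitly through address monitoring: a load-exclusive such as LDAXR
arms the exclusive monitor, and a write to the monitored location then
generates the event.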
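
To illustrate the implicit path, here is a minimal sketch of a wait loop built on LDAXR and WFE. The helper name wait_until_equal_32 is hypothetical, the code is untested on real hardware, and the non-AArch64 branch is only a plain spin so the sketch compiles elsewhere.

```c
#include <stdint.h>

/* Hypothetical helper: block until *addr == expected. */
static inline void wait_until_equal_32(volatile uint32_t *addr,
				       uint32_t expected)
{
#if defined(__aarch64__)
	uint32_t value;

	/* LDAXR reads the value and arms the exclusive monitor for addr. */
	__asm__ volatile("ldaxr %w0, [%1]"
			 : "=&r"(value) : "r"(addr) : "memory");
	while (value != expected) {
		/*
		 * Sleep until the monitor is cleared by a write to the
		 * monitored location, or until any other wake-up event.
		 */
		__asm__ volatile("wfe" ::: "memory");
		/* Re-read the value and re-arm the monitor. */
		__asm__ volatile("ldaxr %w0, [%1]"
				 : "=&r"(value) : "r"(addr) : "memory");
	}
#else
	/* Portable fallback (assumption): plain acquire-load spin. */
	while (__atomic_load_n(addr, __ATOMIC_ACQUIRE) != expected)
		;
#endif
}
```

Note that nothing bounds the sleep here: if no other CPU ever writes to the location, the core stays in WFE, which is exactly the caveat described above.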

WFET is the safer variant because it includes a timeout, so explicit or 
implicit event-register manipulation is not required.
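
A rough sketch of a WFET-based pause, under several assumptions: WFET is only present on CPUs implementing FEAT_WFxT (Armv8.7 onwards), so the caller must detect it and pass have_wfxt accordingly; the helper names pause_ns and ns_to_ticks are hypothetical; and WFET is written in its system-register form so that assemblers without the mnemonic may still accept it. Untested on hardware.

```c
#include <stdint.h>

/* Hypothetical helper: convert nanoseconds to generic-timer ticks. */
static inline uint64_t ns_to_ticks(uint64_t ns, uint64_t freq_hz)
{
	return ns * freq_hz / 1000000000ULL;
}

/*
 * Hypothetical helper: pause for roughly `ns` nanoseconds.
 * have_wfxt must be nonzero only if the caller has confirmed that the
 * CPU implements FEAT_WFxT; WFET is undefined otherwise.
 */
static inline void pause_ns(uint64_t ns, int have_wfxt)
{
#if defined(__aarch64__)
	uint64_t freq, now, deadline;

	__asm__ volatile("mrs %0, cntfrq_el0" : "=r"(freq));
	__asm__ volatile("mrs %0, cntvct_el0" : "=r"(now));
	deadline = now + ns_to_ticks(ns, freq);

	while (now < deadline) {
		if (have_wfxt) {
			/* WFET Xt: sleep until an event or until the
			 * virtual counter reaches `deadline`. */
			__asm__ volatile("msr s0_3_c1_c0_0, %0"
					 : : "r"(deadline) : "memory");
		} else {
			/* Fallback: spin with YIELD hints. */
			__asm__ volatile("yield" ::: "memory");
		}
		__asm__ volatile("mrs %0, cntvct_el0" : "=r"(now));
	}
#else
	/* Not AArch64: nothing to do in this sketch. */
	(void)ns;
	(void)have_wfxt;
#endif
}
```

The loop re-checks the counter after every wake-up because WFET can still return early on an unrelated event; the timeout only guarantees an upper bound on the sleep.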

--wathsala

On 6/12/26 01:11, Hemant Agrawal wrote:
> Hi Morten,
> On Cortex‑A72 (ARMv8), the only architectural primitives available are YIELD, WFE, and WFI:
>
> 	YIELD is the only deterministic, low-overhead option (pure CPU relax, no entry into low-power state)
> 	WFE can be used as a low-power idle hint, but it is event-driven and not time-based (it may return immediately)
> 	WFI depends on interrupt wakeup and is therefore not suitable for tight latency loops
>
> For ~1 µs latency targets, the practical approach is a hybrid strategy:
>
> Short waits → spin using YIELD
> Slightly longer waits → opportunistically use WFE for power reduction
>
> A simple implementation could look like (not tested):
>
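> static inline void rte_armv8_pause(unsigned int iters)
> {
> 	if (iters < 64) {
> 		/* Short wait: spin with YIELD hints. */
> 		for (unsigned int i = 0; i < iters; i++)
> 			asm volatile("yield");
> 	} else {
> 		/* SEVL sets the local event register, so the WFE that
> 		 * follows consumes it and returns without sleeping. */
> 		asm volatile("sevl");
> 		asm volatile("wfe");
> 	}
> }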
>
> @Wathsala Vithanage — would appreciate your thoughts, especially if there are any micro-architectural nuances we should consider.
>
> Regards,
> Hemant
>
>> -----Original Message-----
>> From: Morten Brørup <mb at smartsharesystems.com>
>> Sent: 03 June 2026 17:26
>> To: Wathsala Vithanage <wathsala.vithanage at arm.com>; Hemant Agrawal
>> <hemant.agrawal at nxp.com>; Sachin Saxena (OSS)
>> <sachin.saxena at oss.nxp.com>
>> Cc: dev at dpdk.org; Maxime Leroy <maxime at leroys.fr>
>> Subject: ARM v8 rte_power_pause
>> Importance: High
>>
>> Hi Wathsala, Hemant and Sachin,
>>
>> Over at the Grout project, we are discussing power management in the
>> context of 100 Gbit/s latency deadlines [1].
>>
>> rte_power_pause() is not implemented for ARM v8 / Cortex-A72.
>> Syscalls such as nanosleep() have too much overhead, and cannot be used.
>>
>> Any suggestions for a power-reducing method to make a CPU core "sleep" (i.e.
>> do nothing) for durations in the order of 1 microsecond?
>>
>> [1]:
>> https://github.com/DPDK/grout/pull/624#issuecomment-4602036364
>>
>> -Morten

