[dpdk-dev] [PATCH v3 0/5] use WFE for locks and ring on aarch64
Honnappa Nagarahalli
Honnappa.Nagarahalli at arm.com
Tue Jul 23 21:15:40 CEST 2019
Hi Gavin,
I think this should have been V1 (that is, no version tag, just 'PATCH'), since it has only now been converted from an RFC to a patch. I think we should be able to resend it as V1 and mark this V3 as 'superseded'.
Hi Thomas,
Please let us know whether fixing the version is worthwhile or helpful.
Thanks,
Honnappa
> -----Original Message-----
> From: Gavin Hu <gavin.hu at arm.com>
> Sent: Tuesday, July 23, 2019 10:44 AM
> To: dev at dpdk.org
> Cc: nd <nd at arm.com>; thomas at monjalon.net;
> stephen at networkplumber.org; jerinj at marvell.com;
> pbhagavatula at marvell.com; Honnappa Nagarahalli
> <Honnappa.Nagarahalli at arm.com>; Gavin Hu (Arm Technology China)
> <Gavin.Hu at arm.com>
> Subject: [PATCH v3 0/5] use WFE for locks and ring on aarch64
>
> DPDK has multiple use cases where the core repeatedly polls a location in
> memory. This polling results in many cache and memory transactions.
>
> The Arm architecture provides the WFE (Wait For Event) instruction, which
> allows the CPU core to enter a low-power state until woken up by an update
> to the memory location being polled, thus reducing the cache and memory
> transactions.
>
> x86 has the PAUSE hint instruction to reduce such overhead.
>
> The rte_wait_until_equal_xxx APIs abstract the functionality of 'polling for a
> memory location to become equal to a given value'.
>
> For non-Arm platforms, these APIs are just wrappers around a do-while loop
> with rte_pause, so there are no performance differences.
>
> For Arm platforms, use of WFE can be configured using
> CONFIG_RTE_USE_WFE option. It is disabled by default.
>
> Currently, use of WFE is supported only for aarch64 platforms. armv7
> platforms do support the WFE instruction, but they require explicit wake-up
> events (SEV) and are less performant.
>
> Testing shows that performance varies across different platforms, with some
> showing degradation.
>
> CONFIG_RTE_USE_WFE should be enabled depending on the performance on
> the target platforms.
>
> V3:
> * Convert RFCs to patches
> V2:
> * Use inline functions instead of macros
> * Add load and compare in the beginning of the APIs
> * Fix some style errors in asm inline
> V1:
> * Add the new APIs and use them for ring and locks
>
> Gavin Hu (5):
> eal: add the APIs to wait until equal
> ticketlock: use new API to reduce contention on aarch64
> ring: use wfe to wait for ring tail update on aarch64
> spinlock: use wfe to reduce contention on aarch64
> config: add WFE config entry for aarch64
>
> config/arm/meson.build | 1 +
> config/common_armv8a_linux | 6 ++
> .../common/include/arch/arm/rte_atomic_64.h | 4 +
> .../common/include/arch/arm/rte_pause_64.h | 106 +++++++++++++++++++++
> .../common/include/arch/arm/rte_spinlock.h | 25 +++++
> lib/librte_eal/common/include/generic/rte_pause.h | 39 +++++++-
> .../common/include/generic/rte_spinlock.h | 2 +-
> .../common/include/generic/rte_ticketlock.h | 3 +-
> lib/librte_ring/rte_ring_c11_mem.h | 4 +-
> lib/librte_ring/rte_ring_generic.h | 3 +-
> 10 files changed, 185 insertions(+), 8 deletions(-)
>
> --
> 2.7.4
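
For readers following the thread, the generic (non-WFE) behaviour described in the cover letter, a loop that polls a memory location with a pause hint until it equals a given value, might look roughly like the sketch below. It is only an illustration and is not taken from the patches: the names cpu_relax and wait_until_equal_32 are hypothetical stand-ins for rte_pause and the rte_wait_until_equal_xxx family, the acquire ordering is an assumption, and the inline assembly assumes GCC or Clang.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for rte_pause(): emits a spin-wait hint on
 * x86 and aarch64, and compiles to nothing on other targets. */
static inline void cpu_relax(void)
{
#if defined(__x86_64__) || defined(__i386__)
	__asm__ volatile("pause" ::: "memory");
#elif defined(__aarch64__)
	__asm__ volatile("yield" ::: "memory");
#endif
}

/* Hypothetical helper: spin until *addr equals expected.
 * The value is loaded and compared before the first pause, so the
 * function returns immediately when the condition already holds. */
static inline void wait_until_equal_32(_Atomic uint32_t *addr, uint32_t expected)
{
	while (atomic_load_explicit(addr, memory_order_acquire) != expected)
		cpu_relax();
}
```

A lock or ring implementation would call such a helper wherever it currently open-codes a polling loop, which is what would let an aarch64 build substitute a WFE-based wait without touching the callers.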