[dpdk-dev] [PATCH v4 0/6] use WFE for locks and ring on aarch64

David Marchand david.marchand at redhat.com
Wed Oct 16 10:08:44 CEST 2019


Hello guys,

This series got a lot of attention from ARM people and it seems ready
for integration.
But I did not see any comments from maintainers of other architectures,
could you have a look please?


Thanks.
-- 
David Marchand


On Thu, Aug 22, 2019 at 8:13 AM Gavin Hu <gavin.hu at arm.com> wrote:
>
> DPDK has multiple use cases where the core repeatedly polls a location in
> memory. This polling results in many cache and memory transactions.
>
> The Arm architecture provides the WFE (Wait For Event) instruction, which
> allows the cpu core to enter a low power state until woken up by an update
> to the memory location being polled, thus reducing the cache and memory
> transactions.
>
> x86 has the PAUSE hint instruction to reduce such overhead.
>
> The rte_wait_until_equal_xxx APIs abstract the functionality of 'polling
> for a memory location to become equal to a given value'.
>
> For non-Arm platforms, these APIs are just wrappers around a do-while loop
> with rte_pause, so there are no performance differences.
>
> For Arm platforms, use of WFE can be configured using the
> CONFIG_RTE_ARM_USE_WFE option. It is disabled by default.
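
For readers from other architectures, the behaviour described above can be
sketched roughly as follows. This is only an illustration under stated
assumptions, not the code from the patches: sketch_pause(),
sketch_load_exclusive_32(), sketch_wait_until_equal_32() and the
SKETCH_USE_WFE macro are hypothetical names standing in for rte_pause, the
rte_wait_until_equal_xxx family and the WFE config option.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for rte_pause(): a spin-loop hint, or a plain
 * compiler barrier where no hint instruction is assumed. */
static inline void sketch_pause(void)
{
#if defined(__x86_64__) || defined(__i386__)
	__builtin_ia32_pause();
#elif defined(__aarch64__)
	__asm__ volatile("yield" ::: "memory");
#else
	__asm__ volatile("" ::: "memory");
#endif
}

#if defined(__aarch64__) && defined(SKETCH_USE_WFE)
/* Load-exclusive arms the exclusive monitor for *addr, so that a later
 * write to that location generates the event that wakes up WFE. */
static inline uint32_t
sketch_load_exclusive_32(volatile uint32_t *addr, int memorder)
{
	uint32_t value;

	if (memorder == __ATOMIC_RELAXED)
		__asm__ volatile("ldxr %w[v], [%x[a]]"
				 : [v] "=&r"(value) : [a] "r"(addr) : "memory");
	else
		__asm__ volatile("ldaxr %w[v], [%x[a]]"
				 : [v] "=&r"(value) : [a] "r"(addr) : "memory");
	return value;
}
#endif

/* Sketch of "wait until *addr becomes equal to expected". */
static inline void
sketch_wait_until_equal_32(volatile uint32_t *addr, uint32_t expected,
			   int memorder)
{
	assert(memorder == __ATOMIC_ACQUIRE || memorder == __ATOMIC_RELAXED);

#if defined(__aarch64__) && defined(SKETCH_USE_WFE)
	/* Load and compare first, so the fast path never executes WFE. */
	if (__atomic_load_n(addr, memorder) != expected) {
		/* SEVL sets the local event so the first WFE falls through. */
		__asm__ volatile("sevl" ::: "memory");
		do {
			__asm__ volatile("wfe" ::: "memory");
		} while (sketch_load_exclusive_32(addr, memorder) != expected);
	}
#else
	/* Generic path: plain polling loop with a pause hint. */
	while (__atomic_load_n(addr, memorder) != expected)
		sketch_pause();
#endif
}
```

A waiter would call sketch_wait_until_equal_32(&flag, 1, __ATOMIC_ACQUIRE)
while another thread stores 1 to flag. With SKETCH_USE_WFE left undefined
(mirroring the disabled-by-default option), every platform takes the polling
path, which is why no performance difference is expected there.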
>
> Currently, use of WFE is supported only for aarch64 platforms. armv7
> platforms do support the WFE instruction, but they require explicit wake up
> events (sev) and are less performant.
>
> Testing shows that performance varies across different platforms, with
> some showing degradation.
>
> CONFIG_RTE_ARM_USE_WFE should be enabled depending on the performance
> benchmarking on the target platforms. Power saving should be a bonus,
> but currently we don't have ways to characterize that.
>
> V4:
> - rename the config as CONFIG_RTE_ARM_USE_WFE to indicate it applies to arm only
> - introduce a macro for assembly skeleton to reduce the duplication of code
> - add one patch for nxp fslmc to address a compiling error
> V3:
> - Convert RFCs to patches
> V2:
> - Use inline functions instead of macros
> - Add load and compare in the beginning of the APIs
> - Fix some style errors in asm inline
> V1:
> - Add the new APIs and use it for ring and locks
>
> Gavin Hu (6):
>   bus/fslmc: fix the conflicting dmb function
>   eal: add the APIs to wait until equal
>   ticketlock: use new API to reduce contention on aarch64
>   ring: use wfe to wait for ring tail update on aarch64
>   spinlock: use wfe to reduce contention on aarch64
>   config: add WFE config entry for aarch64
>
>  config/arm/meson.build                             |  1 +
>  config/common_base                                 |  6 +++++
>  drivers/bus/fslmc/mc/fsl_mc_sys.h                  | 10 +++++---
>  drivers/bus/fslmc/mc/mc_sys.c                      |  3 +--
>  .../common/include/arch/arm/rte_pause_64.h         | 30 ++++++++++++++++++++++
>  .../common/include/arch/arm/rte_spinlock.h         | 25 ++++++++++++++++++
>  lib/librte_eal/common/include/generic/rte_pause.h  | 26 ++++++++++++++++++-
>  .../common/include/generic/rte_ticketlock.h        |  3 +--
>  lib/librte_ring/rte_ring_c11_mem.h                 |  4 +--
>  lib/librte_ring/rte_ring_generic.h                 |  3 +--
>  10 files changed, 99 insertions(+), 12 deletions(-)
>
> --
> 2.7.4
>
