Hi Gavin,
        I think this should have been V1 (that is, no version tag, just 'PATCH'),
since the RFC has now been converted to a patch. We should be able to resend it
as V1 and mark this V3 as 'superseded'.

Hi Thomas,
        Please let us know whether fixing the version is worthwhile.

Thanks,
Honnappa

> -----Original Message-----
> From: Gavin Hu <gavin...@arm.com>
> Sent: Tuesday, July 23, 2019 10:44 AM
> To: dev@dpdk.org
> Cc: nd <n...@arm.com>; tho...@monjalon.net;
> step...@networkplumber.org; jer...@marvell.com;
> pbhagavat...@marvell.com; Honnappa Nagarahalli
> <honnappa.nagaraha...@arm.com>; Gavin Hu (Arm Technology China)
> <gavin...@arm.com>
> Subject: [PATCH v3 0/5] use WFE for locks and ring on aarch64
> 
> DPDK has multiple use cases where the core repeatedly polls a location in
> memory. This polling results in many cache and memory transactions.
> 
> The Arm architecture provides the WFE (Wait For Event) instruction, which
> allows the CPU core to enter a low-power state until woken up by an update
> to the memory location being polled, thus reducing the cache and memory
> transactions.
> 
> x86 has the PAUSE hint instruction to reduce such overhead.
> 
> The rte_wait_until_equal_xxx APIs abstract the functionality of 'polling for a
> memory location to become equal to a given value'.
> 
> For non-Arm platforms, these APIs are just wrappers around a do-while loop
> with rte_pause, so there are no performance differences.
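
To illustrate the generic fallback described above, here is a minimal sketch of a "wait until equal" helper built on a do-while loop with a pause hint. The names sketch_pause and sketch_wait_until_equal_32 are hypothetical and are not the signatures from the patch; the sketch assumes C11 atomics and a GCC/Clang-style compiler, and it checks the value once up front, as the V2 notes mention.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for rte_pause(): PAUSE hint on x86, no-op elsewhere. */
static inline void
sketch_pause(void)
{
#if defined(__x86_64__) || defined(__i386__)
	__builtin_ia32_pause();
#endif
}

/*
 * Hypothetical generic helper: spin until *addr == expected.
 * Assumption: acquire ordering is what the caller wants.
 */
static inline void
sketch_wait_until_equal_32(_Atomic uint32_t *addr, uint32_t expected)
{
	assert(addr != NULL);

	/* Load and compare first, so an already-matching value returns at once. */
	if (atomic_load_explicit(addr, memory_order_acquire) == expected)
		return;

	/* Otherwise poll, issuing the pause hint between loads. */
	do {
		sketch_pause();
	} while (atomic_load_explicit(addr, memory_order_acquire) != expected);
}
```

A caller such as a ticket lock could pass the address of its "current" counter and its own ticket number as the expected value. On aarch64 the polling loop is where a WFE-based variant would replace the pause hint.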
> 
> For Arm platforms, use of WFE can be configured using the
> CONFIG_RTE_USE_WFE option. It is disabled by default.
> 
> Currently, use of WFE is supported only for aarch64 platforms. armv7
> platforms do support the WFE instruction, but they require explicit wake-up
> events (sev) and are less performant.
> 
> Testing shows that performance varies across different platforms, with some
> showing degradation.
> 
> CONFIG_RTE_USE_WFE should be enabled depending on the performance on
> the target platforms.
> 
> V3:
> * Convert RFCs to patches
> V2:
> * Use inline functions instead of macros
> * Add load and compare in the beginning of the APIs
> * Fix some style errors in asm inline
> V1:
> * Add the new APIs and use them for ring and locks
> 
> Gavin Hu (5):
>   eal: add the APIs to wait until equal
>   ticketlock: use new API to reduce contention on aarch64
>   ring: use wfe to wait for ring tail update on aarch64
>   spinlock: use wfe to reduce contention on aarch64
>   config: add WFE config entry for aarch64
> 
>  config/arm/meson.build                             |   1 +
>  config/common_armv8a_linux                         |   6 ++
>  .../common/include/arch/arm/rte_atomic_64.h        |   4 +
>  .../common/include/arch/arm/rte_pause_64.h         | 106 +++++++++++++++++++++
>  .../common/include/arch/arm/rte_spinlock.h         |  25 +++++
>  lib/librte_eal/common/include/generic/rte_pause.h  |  39 +++++++-
>  .../common/include/generic/rte_spinlock.h          |   2 +-
>  .../common/include/generic/rte_ticketlock.h        |   3 +-
>  lib/librte_ring/rte_ring_c11_mem.h                 |   4 +-
>  lib/librte_ring/rte_ring_generic.h                 |   3 +-
>  10 files changed, 185 insertions(+), 8 deletions(-)
> 
> --
> 2.7.4
