The most expensive ordering for hwsync to provide is the store-load barrier, because all prior stores have to be drained to the caches before subsequent instructions can complete.
stsync just orders stores which means it can just be a barrer that goes down the store queue and orders draining, and does not prevent completion of subsequent instructions. So it should be faster than hwsync. Use stsync for wmb(). Older processors that don't recognise the SC field should treat this as hwsync. Signed-off-by: Nicholas Piggin <npig...@gmail.com> --- arch/powerpc/include/asm/barrier.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/include/asm/barrier.h b/arch/powerpc/include/asm/barrier.h index f0ff5737b0d8..95e637c1a3b6 100644 --- a/arch/powerpc/include/asm/barrier.h +++ b/arch/powerpc/include/asm/barrier.h @@ -39,7 +39,7 @@ */ #define __mb() __asm__ __volatile__ ("sync" : : : "memory") #define __rmb() __asm__ __volatile__ ("sync" : : : "memory") -#define __wmb() __asm__ __volatile__ ("sync" : : : "memory") +#define __wmb() __asm__ __volatile__ (PPC_STSYNC : : : "memory") /* The sub-arch has lwsync */ #if defined(CONFIG_PPC64) || defined(CONFIG_PPC_E500MC) -- 2.40.1