On Thu, Oct 18, 2018 at 09:00:46PM +0200, Daniel Borkmann wrote:
> On 10/18/2018 05:33 PM, Alexei Starovoitov wrote:
> > On Thu, Oct 18, 2018 at 05:04:34PM +0200, Daniel Borkmann wrote:
> >>  #endif /* _TOOLS_LINUX_ASM_IA64_BARRIER_H */
> >> diff --git a/tools/arch/powerpc/include/asm/barrier.h
> >> b/tools/arch/powerpc/include/asm/barrier.h
> >> index a634da0..905a2c6 100644
> >> --- a/tools/arch/powerpc/include/asm/barrier.h
> >> +++ b/tools/arch/powerpc/include/asm/barrier.h
> >> @@ -27,4 +27,20 @@
> >>  #define rmb()  __asm__ __volatile__ ("sync" : : : "memory")
> >>  #define wmb()  __asm__ __volatile__ ("sync" : : : "memory")
> >>
> >> +#if defined(__powerpc64__)
> >> +#define smp_lwsync()   __asm__ __volatile__ ("lwsync" : : : "memory")
> >> +
> >> +#define smp_store_release(p, v)                \
> >> +do {                                           \
> >> +       smp_lwsync();                           \
> >> +       WRITE_ONCE(*p, v);                      \
> >> +} while (0)
> >> +
> >> +#define smp_load_acquire(p)                    \
> >> +({                                             \
> >> +       typeof(*p) ___p1 = READ_ONCE(*p);       \
> >> +       smp_lwsync();                           \
> >> +       ___p1;                                  \
> >
> > I don't like this proliferation of asm.
> > Why do we think that we can do better job than compiler?
> > can we please use gcc builtins instead?
> > https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html
> > __atomic_load_n(ptr, __ATOMIC_ACQUIRE);
> > __atomic_store_n(ptr, val, __ATOMIC_RELEASE);
> > are done specifically for this use case if I'm not mistaken.
> > I think it pays to learn what compiler provides.
>
> But are you sure the C11 memory model matches exact same model as kernel?
> Seems like last time Will looked into it [0] it wasn't the case ...

I'm only suggesting that __atomic_load_n(ptr, __ATOMIC_ACQUIRE) is
equivalent to the kernel's smp_load_acquire().
I've seen a bunch of user space ring buffer implementations built on
the __atomic_load_n() primitives.
But let's ask experts who live in both worlds.

Paul, what would you recommend? Should we copy-paste smp_store_release()
and smp_load_acquire() from the kernel for use in user space
libraries/tools, or use the __atomic_*() builtins instead?

> The above was pulled in and slightly adapted from kernel side of arch
> asm barriers. Hm, it would probably be safest if an arch decides to adapt
> C11 barriers first from kernel side and user space could then use the
> exact same matching builtin functions for scenarios like these as well.
>
> [0] https://lore.kernel.org/lkml/20170308174300.gl20...@arm.com/
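
To make the suggestion above concrete, here is a rough sketch of what the
builtin-based variant could look like in a user space single-producer,
single-consumer ring buffer. This is only an illustration, not the patch
under review: struct ring, RING_SIZE, ring_push() and ring_pop() are
made-up names, and mapping smp_load_acquire()/smp_store_release() onto
the gcc builtins is exactly the assumption being debated in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: the gcc/clang __atomic builtins stand in for the kernel
 * macros. Whether they match the kernel memory model is the open question.
 */
#define smp_load_acquire(p)     __atomic_load_n((p), __ATOMIC_ACQUIRE)
#define smp_store_release(p, v) __atomic_store_n((p), (v), __ATOMIC_RELEASE)

#define RING_SIZE 8u /* hypothetical size, must be a power of two */

struct ring {
	uint32_t head;            /* written by the producer only */
	uint32_t tail;            /* written by the consumer only */
	uint64_t data[RING_SIZE];
};

/* Producer side: returns 0 on success, -1 if the ring is full. */
static int ring_push(struct ring *r, uint64_t v)
{
	uint32_t head, tail;

	assert(r != NULL);
	/* Only the producer writes head, so a relaxed load is enough here. */
	head = __atomic_load_n(&r->head, __ATOMIC_RELAXED);
	/* Acquire pairs with the consumer's release store to tail. */
	tail = smp_load_acquire(&r->tail);

	if (head - tail == RING_SIZE)
		return -1;

	r->data[head & (RING_SIZE - 1)] = v;
	/* Release publishes the slot contents before the new head. */
	smp_store_release(&r->head, head + 1);
	return 0;
}

/* Consumer side: returns 0 on success, -1 if the ring is empty. */
static int ring_pop(struct ring *r, uint64_t *v)
{
	uint32_t head, tail;

	assert(r != NULL && v != NULL);
	/* Only the consumer writes tail, so a relaxed load is enough here. */
	tail = __atomic_load_n(&r->tail, __ATOMIC_RELAXED);
	/* Acquire pairs with the producer's release store to head. */
	head = smp_load_acquire(&r->head);

	if (head == tail)
		return -1;

	*v = r->data[tail & (RING_SIZE - 1)];
	/* Release hands the slot back only after it has been read. */
	smp_store_release(&r->tail, tail + 1);
	return 0;
}
```

With a zero-initialized struct ring, the producer thread calls
ring_push() and the consumer thread calls ring_pop(). The compiler picks
the barrier instruction per architecture, so no per-arch asm is needed
in tools/.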