On Tue, Apr 04, 2023 at 08:43:01AM -0700, Tyler Retzlaff wrote:
> On Tue, Apr 04, 2023 at 09:53:21AM +0100, Bruce Richardson wrote:
> > On Mon, Apr 03, 2023 at 02:52:25PM -0700, Tyler Retzlaff wrote:
> > > Inline assembly is not supported for msvc x64 instead use
> > > _{Read,Write,ReadWrite}Barrier() intrinsics.
> > >
> > > Signed-off-by: Tyler Retzlaff <roret...@linux.microsoft.com>
> > > ---
> > >  lib/eal/include/generic/rte_atomic.h |  4 ++++
> > >  lib/eal/x86/include/rte_atomic.h     | 10 +++++++++-
> > >  2 files changed, 13 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/lib/eal/include/generic/rte_atomic.h b/lib/eal/include/generic/rte_atomic.h
> > > index 234b268..e973184 100644
> > > --- a/lib/eal/include/generic/rte_atomic.h
> > > +++ b/lib/eal/include/generic/rte_atomic.h
> > > @@ -116,9 +116,13 @@
> > >   * Guarantees that operation reordering does not occur at compile time
> > >   * for operations directly before and after the barrier.
> > >   */
> > > +#ifndef RTE_TOOLCHAIN_MSVC
> > >  #define rte_compiler_barrier() do { \
> > >  asm volatile ("" : : : "memory"); \
> > >  } while(0)
> > > +#else
> > > +#define rte_compiler_barrier() _ReadWriteBarrier()
> >
> > Does this actually add a full memory barrier? If so, that's really not what we
> > want, and will slow things down.
>
> for background MSVC when targeting amd64/arm64 do not permit inline
> assembly. The main reason is inline assembly can't be optimized.
> instead it provides intrinsics (that are known) that can participate in
> optimization.
>
> specific answer to your question. yes, it implements only a "compiler
> barrier" not a full memory barrier preventing processor reordering.
>
> https://learn.microsoft.com/en-us/cpp/intrinsics/readwritebarrier?view=msvc-170
> "Limits the compiler optimizations that can reorder memory accesses
> across the point of the call."
>
> note: ignore the caution on the documentation it only applies to C++

Thanks for clarifying. In that case, I think we need a different
macro/barrier for the rte_smp_mb() case. When mixing reads and writes on
x86, there are cases where we actually do need a full memory barrier/mfence,
rather than just a compiler barrier.

/Bruce
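
For illustration only, here is a minimal sketch of the distinction discussed in this thread. It is not the patch itself. All `sketch_*` names are hypothetical, the MSVC branch assumes an x86/x64 target where `_mm_mfence()` is available through `<intrin.h>`, and the non-MSVC branch assumes a GCC-compatible compiler. A store followed by another store only needs the compiler kept in line on x86, while a store followed by a load of a different location is the case that needs a real fence.

```c
#if defined(_MSC_VER)
#include <intrin.h>
/* Compiler-only barrier: restricts compiler reordering, emits no instruction. */
#define sketch_compiler_barrier() _ReadWriteBarrier()
/* Full barrier (assumption: x86/x64 target): MFENCE, bracketed by
 * compiler barriers as a precaution. */
#define sketch_full_barrier() do { \
	_ReadWriteBarrier(); \
	_mm_mfence(); \
	_ReadWriteBarrier(); \
} while (0)
#else
/* GCC-compatible compilers: empty asm with a "memory" clobber. */
#define sketch_compiler_barrier() __asm__ volatile ("" : : : "memory")
/* Sequentially consistent fence: orders both compiler and CPU. */
#define sketch_full_barrier() __atomic_thread_fence(__ATOMIC_SEQ_CST)
#endif

/* Hypothetical shared variables, for illustration only. */
static int sketch_data;
static int sketch_ready;
static int sketch_flag_mine;   /* written by this thread */
static int sketch_flag_theirs; /* written by a hypothetical peer thread */

/* Store -> store: x86 does not reorder ordinary stores with each other,
 * so only the compiler has to be prevented from swapping them. */
static void sketch_publish(int value)
{
	sketch_data = value;
	sketch_compiler_barrier();
	sketch_ready = 1;
}

/* Store -> load of a different location: x86 may let the load complete
 * before the earlier store is globally visible, so a compiler barrier is
 * not sufficient and a full barrier is used. */
static int sketch_announce_then_check(void)
{
	sketch_flag_mine = 1;
	sketch_full_barrier();
	return sketch_flag_theirs;
}
```

In a two-thread handshake where each side calls something like `sketch_announce_then_check()` on its own flag, replacing the full barrier with the compiler barrier could let both sides read 0, which is the kind of case the mixed read/write comment above refers to.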