On Tue, Apr 04, 2023 at 08:43:01AM -0700, Tyler Retzlaff wrote:
> On Tue, Apr 04, 2023 at 09:53:21AM +0100, Bruce Richardson wrote:
> > On Mon, Apr 03, 2023 at 02:52:25PM -0700, Tyler Retzlaff wrote:
> > > Inline assembly is not supported for MSVC x64; instead use the
> > > _{Read,Write,ReadWrite}Barrier() intrinsics.
> > > 
> > > Signed-off-by: Tyler Retzlaff <roret...@linux.microsoft.com>
> > > ---
> > >  lib/eal/include/generic/rte_atomic.h |  4 ++++
> > >  lib/eal/x86/include/rte_atomic.h     | 10 +++++++++-
> > >  2 files changed, 13 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/lib/eal/include/generic/rte_atomic.h b/lib/eal/include/generic/rte_atomic.h
> > > index 234b268..e973184 100644
> > > --- a/lib/eal/include/generic/rte_atomic.h
> > > +++ b/lib/eal/include/generic/rte_atomic.h
> > > @@ -116,9 +116,13 @@
> > >   * Guarantees that operation reordering does not occur at compile time
> > >   * for operations directly before and after the barrier.
> > >   */
> > > +#ifndef RTE_TOOLCHAIN_MSVC
> > >  #define  rte_compiler_barrier() do {             \
> > >   asm volatile ("" : : : "memory");       \
> > >  } while(0)
> > > +#else
> > > +#define rte_compiler_barrier() _ReadWriteBarrier()
> > 
> > Does this actually add a full memory barrier? If so, that's really not
> > what we want, and will slow things down.
> 
> For background: MSVC does not permit inline assembly when targeting
> amd64/arm64. The main reason is that inline assembly can't be
> optimized; instead, the compiler provides known intrinsics that can
> participate in optimization.
> 
> The specific answer to your question: yes, it implements only a
> "compiler barrier", not a full memory barrier, so it does not prevent
> processor reordering.
> 
> https://learn.microsoft.com/en-us/cpp/intrinsics/readwritebarrier?view=msvc-170
>   "Limits the compiler optimizations that can reorder memory accesses
>    across the point of the call."
> 
>    note: ignore the caution in the documentation; it only applies to C++
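
For reference, both toolchains express the same compiler-only fence
roughly like this (a sketch, not the DPDK source; the macro name is
illustrative):

  /* Stops the *compiler* from reordering accesses across the barrier;
   * neither form emits an instruction that constrains the *CPU*. */
  #ifdef _MSC_VER
  #include <intrin.h>    /* _ReadWriteBarrier() */
  #define compiler_barrier() _ReadWriteBarrier()
  #else
  #define compiler_barrier() asm volatile("" : : : "memory")
  #endif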

Thanks for clarifying. In that case, I think we need a different
macro/barrier for the rte_smp_mb() case. When mixing reads and writes on
x86, there are cases where we actually do need a full memory
barrier/mfence, rather than just a compiler barrier.
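
For illustration, the classic Dekker-style store-then-load sequence is
one such case: x86 can let the load complete before the store becomes
globally visible, so a compiler barrier alone is not enough. This is a
hypothetical example, not DPDK code; _mm_mfence() is the SSE2 full-fence
intrinsic available to both GCC and MSVC via <emmintrin.h>:

  #include <emmintrin.h>   /* _mm_mfence() */

  volatile int flag0, flag1;

  /* Thread 0: announce intent, then check the peer. Without a full
   * fence the store to flag0 can sit in the store buffer while the
   * load of flag1 completes, letting both threads "enter" at once. */
  int thread0_try_enter(void)
  {
          flag0 = 1;
          _mm_mfence();       /* full barrier: forbids store-load reordering */
          return flag1 == 0;  /* proceed only if the peer has not entered */
  }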

/Bruce
