On Tue, Apr 04, 2023 at 05:23:07PM +0100, Bruce Richardson wrote:
> On Tue, Apr 04, 2023 at 08:43:01AM -0700, Tyler Retzlaff wrote:
> > On Tue, Apr 04, 2023 at 09:53:21AM +0100, Bruce Richardson wrote:
> > > On Mon, Apr 03, 2023 at 02:52:25PM -0700, Tyler Retzlaff wrote:
> > > > Inline assembly is not supported by MSVC when targeting x64; instead,
> > > > use the _{Read,Write,ReadWrite}Barrier() intrinsics.
> > > > 
> > > > Signed-off-by: Tyler Retzlaff <roret...@linux.microsoft.com>
> > > > ---
> > > >  lib/eal/include/generic/rte_atomic.h |  4 ++++
> > > >  lib/eal/x86/include/rte_atomic.h     | 10 +++++++++-
> > > >  2 files changed, 13 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/lib/eal/include/generic/rte_atomic.h 
> > > > b/lib/eal/include/generic/rte_atomic.h
> > > > index 234b268..e973184 100644
> > > > --- a/lib/eal/include/generic/rte_atomic.h
> > > > +++ b/lib/eal/include/generic/rte_atomic.h
> > > > @@ -116,9 +116,13 @@
> > > >   * Guarantees that operation reordering does not occur at compile time
> > > >   * for operations directly before and after the barrier.
> > > >   */
> > > > +#ifndef RTE_TOOLCHAIN_MSVC
> > > >  #define        rte_compiler_barrier() do {             \
> > > >         asm volatile ("" : : : "memory");       \
> > > >  } while(0)
> > > > +#else
> > > > +#define rte_compiler_barrier() _ReadWriteBarrier()
> > > 
> > > Does this actually add a full memory barrier? If so, that's really not
> > > what we want, and will slow things down.
> > 
> > for background: MSVC does not permit inline assembly when targeting
> > amd64/arm64. the main reason is that inline assembly can't be optimized;
> > instead MSVC provides (known) intrinsics that can participate in
> > optimization.
> > 
> > to answer your question specifically: yes, _ReadWriteBarrier() implements
> > only a "compiler barrier", not a full memory barrier, so it does not
> > prevent processor reordering.
> > 
> > https://learn.microsoft.com/en-us/cpp/intrinsics/readwritebarrier?view=msvc-170
> >   "Limits the compiler optimizations that can reorder memory accesses
> >    across the point of the call."
> > 
> >    note: ignore the caution in the documentation; it only applies to C++
> 
> Thanks for clarifying. In that case, I think we need a different
> macro/barrier for the rte_smp_mb() case. When mixing reads and writes on
> x86, there are cases where we actually do need a full memory
> barrier/mfence, rather than just a compiler barrier.

yes, unfortunately i got distracted by the expansion of the rte_smp_{w,r}mb()
macros in lib/eal/x86/include/rte_atomic.h and assumed they were all compiler
barriers. i can see now i should have been looking at the documentation
comments on the inline function prototypes in
lib/eal/include/generic/rte_atomic.h

> 
> /Bruce
