On Mon, Apr 10, 2023 at 01:58:48PM -0700, Tyler Retzlaff wrote:
> On Mon, Apr 10, 2023 at 09:02:00PM +0100, Konstantin Ananyev wrote:
> > 06/04/2023 01:07, Tyler Retzlaff пишет:
> > >On Wed, Apr 05, 2023 at 10:57:02AM +0000, Konstantin Ananyev wrote:
> > >>
> > >>>>>>>Inline assembly is not supported for msvc x64 instead use
> > >>>>>>>_{Read,Write,ReadWrite}Barrier() intrinsics.
> > >>>>>>>
> > >>>>>>>Signed-off-by: Tyler Retzlaff <roret...@linux.microsoft.com>
> > >>>>>>>---
> > >>>>>>>  lib/eal/include/generic/rte_atomic.h |  4 ++++
> > >>>>>>>  lib/eal/x86/include/rte_atomic.h     | 10 +++++++++-
> > >>>>>>>  2 files changed, 13 insertions(+), 1 deletion(-)
> > >>>>>>>
> > >>>>>>>diff --git a/lib/eal/include/generic/rte_atomic.h 
> > >>>>>>>b/lib/eal/include/generic/rte_atomic.h
> > >>>>>>>index 234b268..e973184 100644
> > >>>>>>>--- a/lib/eal/include/generic/rte_atomic.h
> > >>>>>>>+++ b/lib/eal/include/generic/rte_atomic.h
> > >>>>>>>@@ -116,9 +116,13 @@
> > >>>>>>>   * Guarantees that operation reordering does not occur at compile 
> > >>>>>>> time
> > >>>>>>>   * for operations directly before and after the barrier.
> > >>>>>>>   */
> > >>>>>>>+#ifndef RTE_TOOLCHAIN_MSVC
> > >>>>>>>  #define    rte_compiler_barrier() do {             \
> > >>>>>>>     asm volatile ("" : : : "memory");       \
> > >>>>>>>  } while(0)
> > >>>>>>>+#else
> > >>>>>>>+#define rte_compiler_barrier() _ReadWriteBarrier()
> > >>>>>>>+#endif
> > >>>>>>>
> > >>>>>>>  /**
> > >>>>>>>   * Synchronization fence between threads based on the specified 
> > >>>>>>> memory order.
> > >>>>>>>diff --git a/lib/eal/x86/include/rte_atomic.h 
> > >>>>>>>b/lib/eal/x86/include/rte_atomic.h
> > >>>>>>>index f2ee1a9..5cce9ba 100644
> > >>>>>>>--- a/lib/eal/x86/include/rte_atomic.h
> > >>>>>>>+++ b/lib/eal/x86/include/rte_atomic.h
> > >>>>>>>@@ -27,9 +27,13 @@
> > >>>>>>>
> > >>>>>>>  #define    rte_rmb() _mm_lfence()
> > >>>>>>>
> > >>>>>>>+#ifndef RTE_TOOLCHAIN_MSVC
> > >>>>>>>  #define rte_smp_wmb() rte_compiler_barrier()
> > >>>>>>>-
> > >>>>>>>  #define rte_smp_rmb() rte_compiler_barrier()
> > >>>>>>>+#else
> > >>>>>>>+#define rte_smp_wmb() _WriteBarrier()
> > >>>>>>>+#define rte_smp_rmb() _ReadBarrier()
> > >>>>>>>+#endif
> > >>>>>>>
> > >>>>>>>  /*
> > >>>>>>>   * From Intel Software Development Manual; Vol 3;
> > >>>>>>>@@ -66,11 +70,15 @@
> > >>>>>>>  static __rte_always_inline void
> > >>>>>>>  rte_smp_mb(void)
> > >>>>>>>  {
> > >>>>>>>+#ifndef RTE_TOOLCHAIN_MSVC
> > >>>>>>>  #ifdef RTE_ARCH_I686
> > >>>>>>>     asm volatile("lock addl $0, -128(%%esp); " ::: "memory");
> > >>>>>>>  #else
> > >>>>>>>     asm volatile("lock addl $0, -128(%%rsp); " ::: "memory");
> > >>>>>>>  #endif
> > >>>>>>>+#else
> > >>>>>>>+    rte_compiler_barrier();
> > >>>>>>>+#endif
> > >>>>>>
> > >>>>>>It doesn't look right to me: compiler_barrier is not identical to 
> > >>>>>>LOCK-ed operation,
> > >>>>>>and is not enough to serve as a proper memory barrier for SMP.
> > >>>>>
> > >>>>>i think i'm confused by the macro naming here.  i'll take another look
> > >>>>>thank you for raising it.
> > >>>>>
> > >>>>>>
> > >>>>>>Another ore generic comment - do we really need to pollute all that 
> > >>>>>>code with RTE_TOOLCHAIN_MSVC ifdefs?
> > >>>>>>Right now we have ability to have subdir per arch (x86/arm/etc.).
> > >>>>>>Can we treat x86+windows+msvc as a special arch?
> > >>>>>
> > >>>>>i asked this question previously and confirmed in the technical board
> > >>>>>meeting. the answer i received was that the community did not want new
> > >>>>>directory/headers introduced for compiler support matrix and i should
> > >>>>>use #ifdef in the existing headers.
> > >>>>
> > >>>>Ok, can I then ask at least to minimize number of ifdefs to absolute
> > >>>>minimum?
> > >>>
> > >>>in principal no objection at all, one question though is what to do with
> > >>>comment based documentation attached to macros? e.g.
> > >>>
> > >>>#ifdef SOME_FOO
> > >>>/* some documentation */
> > >>>#define some_macro
> > >>>#else
> > >>>#define some_macro
> > >>>#endif
> > >>>
> > >>>#ifdef SOME_FOO
> > >>>/* some documentation 2 */
> > >>>#define some_macro2
> > >>>#else
> > >>>#define some_macro2
> > >>>#endif
> > >>>
> > >>>i can either duplicate the documentation for every define so it stays
> > >>>"attached" or i can only document the first expansion. let me know what
> > >>>you expect.
> > >>>
> > >>>so something like this?
> > >>>
> > >>>#ifdef SOME_FOO
> > >>>/* some documentation */
> > >>>#define some_macro
> > >>>/* some documentation 2 */
> > >>>#define some_macro2
> > >>>#else
> > >>>#define some_macro
> > >>>#define some_macro2
> > >>>#endif
> > >>>
> > >>>or should all documentation be duplicated? which can become a teadious
> > >>>redundancy for anyone maintaining it. keep in mind we might have to make
> > >>>an exception for rte_common.h because it seems doing this would be
> > >>>really ugly there. take a look let me know.
> > >>
> > >>My personal preference would be to keep one documentation block for both 
> > >>cases
> > >>(yes, I suppose it needs to be updated if required):
> > >>
> > >>/* some documentation, probably for both SOME_FOO on/off */
> > >>#ifdef SOME_FOO
> > >>#define some_macro
> > >>#else
> > >>#define some_macro
> > >>#endif
> > >>
> > >>
> > >>>
> > >>>>It is really hard to read an follow acode that is heavily ifdefed.
> > >>>>Let say above we probably don't need to re-define
> > >>>>rte_smp_rmb/rte_smp_wmb, as both are boiled down to
> > >>>>compiler_barrier(), which is already redefined.
> > >>>
> > >>>can you take a look at v2 of the patch and re-prescribe your advise
> > >>>here? in v2 only the intel macros expand to the compiler barrier. though
> > >>>i find this vexing since as you pointed out it seems they aren't
> > >>>supposed to be compiler only barriers according to the documentation in
> > >>>generic/rte_atomic.h they are intended to be memory barriers.
> > >>
> > >>Commented, pls check if I explained my thoughts clear enough there.
> > >>>
> > >>>please help me if i've goofed up in this regard.
> > >>>
> > >>>>Another question - could it be visa-versa approach:
> > >>>>can we replace some inline assembly with common instincts whenever 
> > >>>>possible?
> > >>>
> > >>>msvc has only intrinsics and the conditional expansion for msvc is to
> > >>>use those intrinsics, gcc doesn't generally define intrinsics for 
> > >>>processor
> > >>>specific code does it?
> > >>
> > >>AFAIK latest gcc (and clang) versions do support majority of these 
> > >>instincts: __rdtsc, _xbegin/_xend, etc.
> > >>So my thought was - might be we can use same instincts for all 
> > >>compilers...
> > >>One implication I can think about - older versions of gcc.
> > >>But might be we can re-order things and have inlines only for these 
> > >>oldere gcc versions?
> > >
> > >i'm going to propose if we do this we do it as a separate change later.
> > >
> > >i fear it could turn into the following dance which seems not a lot
> > >better given i'm sure some people will argue there is no benefit to
> > >removing inline assembly for gcc/clang. my preference is not to get
> > >side-tracked on refactoring with the short merge window.
> > >
> > >#if (defined(__clang__) && clang version < x) ||
> > >     (defined(__GNUC__) && gcc version < x)
> > >__asm(whatever...
> > >#else
> > >__rdtsc()
> > >#endif
> > 
> > 
> > Played a bit with https://godbolt.org/.
> > It seems that __rdtsc()  and RMT builtins (xbegin/xend/xtest) are
> > supported all way down to gcc 4.8.1.
> 
> Do you know what version of clang comes with RHEL 7. Because for 23.07
> release at least I need to know it won't break clang compile on RHEL 7
> either (which is what makes this painful).
> 

I don't know if we ever stated that we need to support clang from RHEL 7. I
always assumed that it was just gcc 4.8 compiler, but I suppose we never
stated either that we didn't need to support it....

> I played around with clang 7 (which i'm not sure is the right version)
> it had __rdtsc() but it did not have xbegin/xend/xtest. For the various
> instructions they're appearing in different versions of the compilers.
> 

Running Centos 7 in a VM, I see Clang v3.4.2.

> I'm starting to wonder if i should drop the intrinsics changes into a
> separate series?
> 

Reply via email to