On Mon, Apr 10, 2023 at 01:58:48PM -0700, Tyler Retzlaff wrote: > On Mon, Apr 10, 2023 at 09:02:00PM +0100, Konstantin Ananyev wrote: > > 06/04/2023 01:07, Tyler Retzlaff пишет: > > >On Wed, Apr 05, 2023 at 10:57:02AM +0000, Konstantin Ananyev wrote: > > >> > > >>>>>>>Inline assembly is not supported for msvc x64 instead use > > >>>>>>>_{Read,Write,ReadWrite}Barrier() intrinsics. > > >>>>>>> > > >>>>>>>Signed-off-by: Tyler Retzlaff <roret...@linux.microsoft.com> > > >>>>>>>--- > > >>>>>>> lib/eal/include/generic/rte_atomic.h | 4 ++++ > > >>>>>>> lib/eal/x86/include/rte_atomic.h | 10 +++++++++- > > >>>>>>> 2 files changed, 13 insertions(+), 1 deletion(-) > > >>>>>>> > > >>>>>>>diff --git a/lib/eal/include/generic/rte_atomic.h > > >>>>>>>b/lib/eal/include/generic/rte_atomic.h > > >>>>>>>index 234b268..e973184 100644 > > >>>>>>>--- a/lib/eal/include/generic/rte_atomic.h > > >>>>>>>+++ b/lib/eal/include/generic/rte_atomic.h > > >>>>>>>@@ -116,9 +116,13 @@ > > >>>>>>> * Guarantees that operation reordering does not occur at compile > > >>>>>>> time > > >>>>>>> * for operations directly before and after the barrier. > > >>>>>>> */ > > >>>>>>>+#ifndef RTE_TOOLCHAIN_MSVC > > >>>>>>> #define rte_compiler_barrier() do { \ > > >>>>>>> asm volatile ("" : : : "memory"); \ > > >>>>>>> } while(0) > > >>>>>>>+#else > > >>>>>>>+#define rte_compiler_barrier() _ReadWriteBarrier() > > >>>>>>>+#endif > > >>>>>>> > > >>>>>>> /** > > >>>>>>> * Synchronization fence between threads based on the specified > > >>>>>>> memory order. > > >>>>>>>diff --git a/lib/eal/x86/include/rte_atomic.h > > >>>>>>>b/lib/eal/x86/include/rte_atomic.h > > >>>>>>>index f2ee1a9..5cce9ba 100644 > > >>>>>>>--- a/lib/eal/x86/include/rte_atomic.h > > >>>>>>>+++ b/lib/eal/x86/include/rte_atomic.h > > >>>>>>>@@ -27,9 +27,13 @@ > > >>>>>>> > > >>>>>>> #define rte_rmb() _mm_lfence() > > >>>>>>> > > >>>>>>>+#ifndef RTE_TOOLCHAIN_MSVC > > >>>>>>> #define rte_smp_wmb() rte_compiler_barrier() > > >>>>>>>- > > >>>>>>> #define rte_smp_rmb() rte_compiler_barrier() > > >>>>>>>+#else > > >>>>>>>+#define rte_smp_wmb() _WriteBarrier() > > >>>>>>>+#define rte_smp_rmb() _ReadBarrier() > > >>>>>>>+#endif > > >>>>>>> > > >>>>>>> /* > > >>>>>>> * From Intel Software Development Manual; Vol 3; > > >>>>>>>@@ -66,11 +70,15 @@ > > >>>>>>> static __rte_always_inline void > > >>>>>>> rte_smp_mb(void) > > >>>>>>> { > > >>>>>>>+#ifndef RTE_TOOLCHAIN_MSVC > > >>>>>>> #ifdef RTE_ARCH_I686 > > >>>>>>> asm volatile("lock addl $0, -128(%%esp); " ::: "memory"); > > >>>>>>> #else > > >>>>>>> asm volatile("lock addl $0, -128(%%rsp); " ::: "memory"); > > >>>>>>> #endif > > >>>>>>>+#else > > >>>>>>>+ rte_compiler_barrier(); > > >>>>>>>+#endif > > >>>>>> > > >>>>>>It doesn't look right to me: compiler_barrier is not identical to > > >>>>>>LOCK-ed operation, > > >>>>>>and is not enough to serve as a proper memory barrier for SMP. > > >>>>> > > >>>>>i think i'm confused by the macro naming here. i'll take another look > > >>>>>thank you for raising it. > > >>>>> > > >>>>>> > > >>>>>>Another ore generic comment - do we really need to pollute all that > > >>>>>>code with RTE_TOOLCHAIN_MSVC ifdefs? > > >>>>>>Right now we have ability to have subdir per arch (x86/arm/etc.). > > >>>>>>Can we treat x86+windows+msvc as a special arch? > > >>>>> > > >>>>>i asked this question previously and confirmed in the technical board > > >>>>>meeting. the answer i received was that the community did not want new > > >>>>>directory/headers introduced for compiler support matrix and i should > > >>>>>use #ifdef in the existing headers. > > >>>> > > >>>>Ok, can I then ask at least to minimize number of ifdefs to absolute > > >>>>minimum? > > >>> > > >>>in principal no objection at all, one question though is what to do with > > >>>comment based documentation attached to macros? e.g. > > >>> > > >>>#ifdef SOME_FOO > > >>>/* some documentation */ > > >>>#define some_macro > > >>>#else > > >>>#define some_macro > > >>>#endif > > >>> > > >>>#ifdef SOME_FOO > > >>>/* some documentation 2 */ > > >>>#define some_macro2 > > >>>#else > > >>>#define some_macro2 > > >>>#endif > > >>> > > >>>i can either duplicate the documentation for every define so it stays > > >>>"attached" or i can only document the first expansion. let me know what > > >>>you expect. > > >>> > > >>>so something like this? > > >>> > > >>>#ifdef SOME_FOO > > >>>/* some documentation */ > > >>>#define some_macro > > >>>/* some documentation 2 */ > > >>>#define some_macro2 > > >>>#else > > >>>#define some_macro > > >>>#define some_macro2 > > >>>#endif > > >>> > > >>>or should all documentation be duplicated? which can become a teadious > > >>>redundancy for anyone maintaining it. keep in mind we might have to make > > >>>an exception for rte_common.h because it seems doing this would be > > >>>really ugly there. take a look let me know. > > >> > > >>My personal preference would be to keep one documentation block for both > > >>cases > > >>(yes, I suppose it needs to be updated if required): > > >> > > >>/* some documentation, probably for both SOME_FOO on/off */ > > >>#ifdef SOME_FOO > > >>#define some_macro > > >>#else > > >>#define some_macro > > >>#endif > > >> > > >> > > >>> > > >>>>It is really hard to read an follow acode that is heavily ifdefed. > > >>>>Let say above we probably don't need to re-define > > >>>>rte_smp_rmb/rte_smp_wmb, as both are boiled down to > > >>>>compiler_barrier(), which is already redefined. > > >>> > > >>>can you take a look at v2 of the patch and re-prescribe your advise > > >>>here? in v2 only the intel macros expand to the compiler barrier. though > > >>>i find this vexing since as you pointed out it seems they aren't > > >>>supposed to be compiler only barriers according to the documentation in > > >>>generic/rte_atomic.h they are intended to be memory barriers. > > >> > > >>Commented, pls check if I explained my thoughts clear enough there. > > >>> > > >>>please help me if i've goofed up in this regard. > > >>> > > >>>>Another question - could it be visa-versa approach: > > >>>>can we replace some inline assembly with common instincts whenever > > >>>>possible? > > >>> > > >>>msvc has only intrinsics and the conditional expansion for msvc is to > > >>>use those intrinsics, gcc doesn't generally define intrinsics for > > >>>processor > > >>>specific code does it? > > >> > > >>AFAIK latest gcc (and clang) versions do support majority of these > > >>instincts: __rdtsc, _xbegin/_xend, etc. > > >>So my thought was - might be we can use same instincts for all > > >>compilers... > > >>One implication I can think about - older versions of gcc. > > >>But might be we can re-order things and have inlines only for these > > >>oldere gcc versions? > > > > > >i'm going to propose if we do this we do it as a separate change later. > > > > > >i fear it could turn into the following dance which seems not a lot > > >better given i'm sure some people will argue there is no benefit to > > >removing inline assembly for gcc/clang. my preference is not to get > > >side-tracked on refactoring with the short merge window. > > > > > >#if (defined(__clang__) && clang version < x) || > > > (defined(__GNUC__) && gcc version < x) > > >__asm(whatever... > > >#else > > >__rdtsc() > > >#endif > > > > > > Played a bit with https://godbolt.org/. > > It seems that __rdtsc() and RMT builtins (xbegin/xend/xtest) are > > supported all way down to gcc 4.8.1. > > Do you know what version of clang comes with RHEL 7. Because for 23.07 > release at least I need to know it won't break clang compile on RHEL 7 > either (which is what makes this painful). >
I don't know if we ever stated that we need to support clang from RHEL 7. I always assumed that it was just gcc 4.8 compiler, but I suppose we never stated either that we didn't need to support it.... > I played around with clang 7 (which i'm not sure is the right version) > it had __rdtsc() but it did not have xbegin/xend/xtest. For the various > instructions they're appearing in different versions of the compilers. > Running Centos 7 in a VM, I see Clang v3.4.2. > I'm starting to wonder if i should drop the intrinsics changes into a > separate series? >