06/04/2023 01:07, Tyler Retzlaff пишет:
On Wed, Apr 05, 2023 at 10:57:02AM +0000, Konstantin Ananyev wrote:
Inline assembly is not supported for msvc x64 instead use
_{Read,Write,ReadWrite}Barrier() intrinsics.
Signed-off-by: Tyler Retzlaff <roret...@linux.microsoft.com>
---
lib/eal/include/generic/rte_atomic.h | 4 ++++
lib/eal/x86/include/rte_atomic.h | 10 +++++++++-
2 files changed, 13 insertions(+), 1 deletion(-)
diff --git a/lib/eal/include/generic/rte_atomic.h
b/lib/eal/include/generic/rte_atomic.h
index 234b268..e973184 100644
--- a/lib/eal/include/generic/rte_atomic.h
+++ b/lib/eal/include/generic/rte_atomic.h
@@ -116,9 +116,13 @@
* Guarantees that operation reordering does not occur at compile time
* for operations directly before and after the barrier.
*/
+#ifndef RTE_TOOLCHAIN_MSVC
#define rte_compiler_barrier() do { \
asm volatile ("" : : : "memory"); \
} while(0)
+#else
+#define rte_compiler_barrier() _ReadWriteBarrier()
+#endif
/**
* Synchronization fence between threads based on the specified memory order.
diff --git a/lib/eal/x86/include/rte_atomic.h b/lib/eal/x86/include/rte_atomic.h
index f2ee1a9..5cce9ba 100644
--- a/lib/eal/x86/include/rte_atomic.h
+++ b/lib/eal/x86/include/rte_atomic.h
@@ -27,9 +27,13 @@
#define rte_rmb() _mm_lfence()
+#ifndef RTE_TOOLCHAIN_MSVC
#define rte_smp_wmb() rte_compiler_barrier()
-
#define rte_smp_rmb() rte_compiler_barrier()
+#else
+#define rte_smp_wmb() _WriteBarrier()
+#define rte_smp_rmb() _ReadBarrier()
+#endif
/*
* From Intel Software Development Manual; Vol 3;
@@ -66,11 +70,15 @@
static __rte_always_inline void
rte_smp_mb(void)
{
+#ifndef RTE_TOOLCHAIN_MSVC
#ifdef RTE_ARCH_I686
asm volatile("lock addl $0, -128(%%esp); " ::: "memory");
#else
asm volatile("lock addl $0, -128(%%rsp); " ::: "memory");
#endif
+#else
+ rte_compiler_barrier();
+#endif
It doesn't look right to me: compiler_barrier is not identical to LOCK-ed
operation,
and is not enough to serve as a proper memory barrier for SMP.
i think i'm confused by the macro naming here. i'll take another look
thank you for raising it.
Another ore generic comment - do we really need to pollute all that code with
RTE_TOOLCHAIN_MSVC ifdefs?
Right now we have ability to have subdir per arch (x86/arm/etc.).
Can we treat x86+windows+msvc as a special arch?
i asked this question previously and confirmed in the technical board
meeting. the answer i received was that the community did not want new
directory/headers introduced for compiler support matrix and i should
use #ifdef in the existing headers.
Ok, can I then ask at least to minimize number of ifdefs to absolute
minimum?
in principal no objection at all, one question though is what to do with
comment based documentation attached to macros? e.g.
#ifdef SOME_FOO
/* some documentation */
#define some_macro
#else
#define some_macro
#endif
#ifdef SOME_FOO
/* some documentation 2 */
#define some_macro2
#else
#define some_macro2
#endif
i can either duplicate the documentation for every define so it stays
"attached" or i can only document the first expansion. let me know what
you expect.
so something like this?
#ifdef SOME_FOO
/* some documentation */
#define some_macro
/* some documentation 2 */
#define some_macro2
#else
#define some_macro
#define some_macro2
#endif
or should all documentation be duplicated? which can become a teadious
redundancy for anyone maintaining it. keep in mind we might have to make
an exception for rte_common.h because it seems doing this would be
really ugly there. take a look let me know.
My personal preference would be to keep one documentation block for both cases
(yes, I suppose it needs to be updated if required):
/* some documentation, probably for both SOME_FOO on/off */
#ifdef SOME_FOO
#define some_macro
#else
#define some_macro
#endif
It is really hard to read an follow acode that is heavily ifdefed.
Let say above we probably don't need to re-define
rte_smp_rmb/rte_smp_wmb, as both are boiled down to
compiler_barrier(), which is already redefined.
can you take a look at v2 of the patch and re-prescribe your advise
here? in v2 only the intel macros expand to the compiler barrier. though
i find this vexing since as you pointed out it seems they aren't
supposed to be compiler only barriers according to the documentation in
generic/rte_atomic.h they are intended to be memory barriers.
Commented, pls check if I explained my thoughts clear enough there.
please help me if i've goofed up in this regard.
Another question - could it be visa-versa approach:
can we replace some inline assembly with common instincts whenever possible?
msvc has only intrinsics and the conditional expansion for msvc is to
use those intrinsics, gcc doesn't generally define intrinsics for processor
specific code does it?
AFAIK latest gcc (and clang) versions do support majority of these instincts:
__rdtsc, _xbegin/_xend, etc.
So my thought was - might be we can use same instincts for all compilers...
One implication I can think about - older versions of gcc.
But might be we can re-order things and have inlines only for these oldere gcc
versions?
i'm going to propose if we do this we do it as a separate change later.
i fear it could turn into the following dance which seems not a lot
better given i'm sure some people will argue there is no benefit to
removing inline assembly for gcc/clang. my preference is not to get
side-tracked on refactoring with the short merge window.
#if (defined(__clang__) && clang version < x) ||
(defined(__GNUC__) && gcc version < x)
__asm(whatever...
#else
__rdtsc()
#endif
Played a bit with https://godbolt.org/.
It seems that __rdtsc() and RMT builtins (xbegin/xend/xtest) are
supported all way down to gcc 4.8.1.