> From: Tyler Retzlaff [mailto:roret...@linux.microsoft.com] > Sent: Tuesday, 8 August 2023 22.50 > > On Tue, Aug 08, 2023 at 10:22:09PM +0200, Morten Brørup wrote: > > > From: Tyler Retzlaff [mailto:roret...@linux.microsoft.com] > > > Sent: Tuesday, 8 August 2023 21.20 > > > > > > On Tue, Aug 08, 2023 at 07:23:41PM +0100, Bruce Richardson wrote: > > > > On Tue, Aug 08, 2023 at 10:53:03AM -0700, Tyler Retzlaff wrote: > > > > > Hi folks, > > > > > > > > > > Moving this discussion to the dev mailing list for broader > comment. > > > > > > > > > > Unfortunately, we've hit a roadblock with integrating C11 > atomics > > > > > for DPDK. The main issue is that GNU C++ prior to -std=c++23 > > > explicitly > > > > > cannot be integrated with C11 stdatomic.h. Basically, you can't > > > include > > > > > the header and you can't use `_Atomic' type specifier to declare > > > atomic > > > > > types. This is not a problem with LLVM or MSVC as they both > allow > > > > > integration with C11 stdatomic.h, but going forward with C11 > atomics > > > > > would break using DPDK in C++ programs when building with GNU > g++. > > > > > > > > > > Essentially you cannot compile the following with g++. > > > > > > > > > > #include <stdatomic.h> > > > > > > > > > > int main(int argc, char *argv[]) { return 0; } > > > > > > > > > > In file included from atomic.cpp:1: > > > > > /usr/lib/gcc/x86_64-pc-cygwin/11/include/stdatomic.h:40:9: > error: > > > > > ‘_Atomic’ does not name a type > > > > > 40 | typedef _Atomic _Bool atomic_bool; > > > > > > > > > > ... more errors of same ... > > > > > > > > > > It's also acknowledged as something known and won't fix by GNU > g++ > > > > > maintainers. > > > > > > > > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60932 > > > > > > > > > > Given the timeframe I would like to propose the minimally > invasive, > > > > > lowest risk solution as follows. > > > > > > > > > > 1. Adopt stdatomic.h for all Windows targets, leave all > Linux/BSD > > > targets > > > > > using GCC builtin C++11 memory model atomics. > > > > > 2. Introduce a macro that allows _Atomic type specifier to be > > > applied to > > > > > function parameter, structure field types and variable > > > declarations. > > > > > > > > > > * The macro would expand empty for Linux/BSD targets. > > > > > * The macro would expand to C11 _Atomic keyword for Windows > > > targets. > > > > > > > > > > 3. Introduce basic macro that allows __atomic_xxx for > normalized > > > use > > > > > internal to DPDK. > > > > > > > > > > * The macro would not be defined for Linux/BSD targets. > > > > > * The macro would expand __atomic_xxx to corresponding > > > stdatomic.h > > > > > atomic_xxx operations for Windows targets. > > > > > > > > > Regarding naming of these macros (suggested in 2. and 3.), they should > probably bear the rte_ prefix instead of overlapping existing names, so > applications can also use them directly. > > > > E.g.: > > #define rte_atomic for _Atomic or nothing, > > #define rte_atomic_fetch_add() for atomic_fetch_add() or > __atomic_fetch_add(), and > > #define RTE_MEMORY_ORDER_SEQ_CST for memory_order_seq_cst or > __ATOMIC_SEQ_CST. > > > > Maybe that is what you meant already. I'm not sure of the scope and > details of your suggestion here. > > I'm shy to do anything in the rte_ namespace because I don't want to > formalize it as an API. > > I was envisioning the following. > > Internally DPDK code just uses __atomic_fetch_add directly, the macros > are provided for Windows targets to expand to __atomic_fetch_add. > > Externally DPDK applications that don't care about being portable may > use __atomic_fetch_add (BSD/Linux) or atomic_fetch_add (Windows) > directly. > > Externally DPDK applications that care to be portable may do what is > done Internally and <<use>> the __atomic_fetch_add directly. By > including say rte_stdatomic.h indirectly (Windows) gets the macros > expanded to atomic_fetch_add and for BSD/Linux it's a noop include. > > Basically I'm placing a little ugly into Windows built code and in trade > we don't end up with a bunch of rte_ APIs that were strongly objected to > previously. > > It's a compromise.
OK, we probably need to offer a public header file to wrap the atomics, using either names prefixed with rte_ or names similar to the gcc builtin atomics. I guess the objections were based on the assumption that we were switching to C11 atomics with DPDK 23.11, so the rte_ prefixed atomic APIs would be very short lived (DPDK 23.07 to 23.11 only). But with this new information about GNU C++ incompatibility, that seems not to be the case, so the naming discussion can be reopened. If we don't introduce such a wrapper header, all portable code needs to surround the use of atomics with #ifdef USE_STDATOMIC_H. BTW: Can the compilers that understand both builtin atomics and C11 stdatomics.h handle code with #define __atomic_fetch_add atomic_fetch_add and #define __ATOMIC_SEQ_CST memory_order_seq_cst? If not, then we need to use rte_ prefixed atomics. And what about C++ atomics... Do we want (or need?) a third variant using C++ atomics, e.g. "atomic<int> x;" instead of "_Atomic int x;"? (I hope not!) For reference, the "atomic_int" type is "_Atomic int" in C, but "std::atomic<int>" in C++. C++23 provides the C11 compatibility macro "_Atomic(T)", which means "_Atomic T" in C and "std::atomic<T>" in C++. Perhaps we can somewhat rely on this, and update our coding standards to require using e.g. "_Atomic(int)" for atomic types, and disallow using "_Atomic int". > > > > > > > > 4. We re-evaluate adoption of C11 atomics and corresponding > > > requirement of > > > > > -std=c++23 compliant compiler at the next long term ABI > promise > > > release. > > > > > > > > > > Q: Why not define macros that look like the standard and expand > > > those > > > > > names to builtins? > > > > > A: Because introducing the names is a violation of the C > standard, > > > we > > > > > can't / shouldn't define atomic_xxx names in the applications > > > namespace > > > > > as we are not ``the implementation''. > > > > > A: Because the builtins offer a subset of stdatomic.h capability > > > they > > > > > can only operate on pointer and integer types. If we > presented > > > the > > > > > stdatomic.h names there might be some confusion attempting to > > > perform > > > > > atomic operations on e.g. _Atomic specified struct would fail > but > > > only > > > > > on BSD/Linux builds (with the proposed solution). > > > > > > > > > > > > > Out of interest, rather than splitting on Windows vs *nix OS for > the > > > > atomics, what would it look like if we split behaviour based on C > vs > > > C++ > > > > use? Would such a thing work? > > > > > > Unfortunately no. The reason is binary packages and we don't know > which > > > toolchain consumes them. > > > > > > For example. > > > > > > Assume we build libeal-dev package with gcc. We'll end up with > headers > > > that contain the _Atomic specifier. > > > > > > Now we write an application and build it with > > > * gcc, sure works fine it knows about _Atomic > > > * clang, same as gcc > > > * clang++, works but is implementation detail that it works (it > isn't > > > standard) > > > * g++, does not work > > > > > > So the LCD is build package without _Atomic i.e. what we already > have > > > today > > > on BSD/Linux. > > > > > > > I agree with Tyler's conceptual solution as proposed in the first > email in this thread, but with a twist: > > > > Instead of splitting Windows vs. Linux/BSD, the split should be a > build time configuration parameter, e.g. USE_STDATOMIC_H. This would be > default true for Windows, and default false for Linux/BSD distros - i.e. > behave exactly as Tyler described. > > Interesting, so the intention here is default stdatomic off for > BSD/Linux and default on for Windows. Binary packagers could then choose > if they wanted to build binary packages incompatible with g++ < - > std=c++23 > by overriding the default and enabling stdatomic. > > I don't object to this if noone else does and it does seem to give more > options to packagers and users to decide for their distribution > channels. One note I'll make is that we would only commit to testing the > defaults in the CI to avoid blowing out the test matrix with non-default > options. Yes, I think everyone agrees about this for CI. Another thing: I just learned that FreeBSD uses CLANG as its default compiler: https://docs.freebsd.org/en/books/developers-handbook/tools/#tools-compiling So should the default be Windows/BSD use stdatomic.h, and only Linux using GCC builtin atomics? We are using "the default compiler in the distro" as the argument, so I think yes. > > > > > Having a build time configuration parameter would also allow the use > of stdatomic.h for applications that build DPDK from scratch, instead of > using the DPDK included with the distro. This could be C applications > built with the distro's C compiler or some other C compiler, or C++ > applications built with a newer GNU C++ compiler or CLANG++. > > > > It might also allow building C++ applications using an old GNU C++ > compiler on Windows (where the application is built with DPDK from > scratch). Not really an argument, just mentioning it. > > Yes, it seems like this would solve that problem in that on Windows the > default could be similarly overridden and turn stdatomic off if building > with GNU g++ on Windows. > > > > > > > Also, just wondering about the scope of the changes here. How many > > > header > > > > files are affected where we publicly expose atomics? > > > > > > So what is impacted is roughly what is in my v4 series that raised > my > > > attention to the issue. > > > > > > https://patchwork.dpdk.org/project/dpdk/list/?series=29086 > > > > > > We really can't solve the problem by not talking about atomics in > the > > > API because of the performance requirements of the APIs in question. > > > > > > e.g. It is stuff like rte_rwlock, rte_spinlock, rte_pause all stuff > that > > > can't have additional levels of indirection because of the overhead > > > involved. > > > > I strongly support this position. Hiding all atomic operations in non- > inline functions will have an unacceptable performance impact! (I know > Bruce wasn't suggesting this with his question; but someone once > suggested this on a techboard meeting, arguing that the overhead of > calling a function is very small.) > > Yeah, I think Bruce is aware but was just curious. > > The overhead of calling a function is in fact not-small on ports that > have > security features similar to Windows control flow guard (which is > required > to be enabled for some of our customers including our own (Microsoft) > shipped code). Very good to know. We should keep this in mind when someone suggests de-inlining fast path functions. Approximately how many CPU cycles does it cost to call a simple function with this security feature enabled (vs. inlining the function)? > > > > > > > > > > > > > > Thanks, > > > > /Bruce