On Thu, Apr 25, 2024 at 10:58:47AM +0200, Mattias Rönnblom wrote: > This patch set represent an attempt to improve and extend the RTE > bitops API, in particular for functions that operate on individual > bits. > > All new functionality is exposed to the user as generic selection > macros, delegating the actual work to private (__-marked) static > inline functions. Public functions (e.g., rte_bit_set32()) would just > be bloating the API. Such generic selection macros will here be > referred to as "functions", although technically they are not.
> > The legacy <rte_bitops.h> rte_bit_relaxed_*() family of functions is > replaced with three families: > > rte_bit_[test|set|clear|assign]() which provides no memory ordering or > atomicity guarantees and no read-once or write-once semantics (e.g., > no use of volatile), but does provide the best performance. The > performance degradation resulting from the use of volatile (e.g., > forcing loads and stores to actually occur and in the number > specified) and atomic (e.g., LOCK-prefixed instructions on x86) may be > significant. > > rte_bit_once_*() which guarantees program-level load and stores > actually occurring (i.e., prevents certain optimizations). The primary > use of these functions are in the context of memory mapped > I/O. Feedback on the details (semantics, naming) here would be greatly > appreciated, since the author is not much of a driver developer. > > rte_bit_atomic_*() which provides atomic bit-level operations, > including the possibility to specifying memory ordering constraints > (or the lack thereof). > > The atomic functions take non-_Atomic pointers, to be flexible, just > like the GCC builtins and default <rte_stdatomic.h>. The issue with > _Atomic APIs is that it may well be the case that the user wants to > perform both non-atomic and atomic operations on the same word. > > Having _Atomic-marked addresses would complicate supporting atomic > bit-level operations in the bitset API (proposed in a different RFC > patchset), and potentially other APIs depending on RTE bitops for > atomic bit-level ops). Either one needs two bitset variants, one > _Atomic bitset and one non-atomic one, or the bitset code needs to > cast the non-_Atomic pointer to an _Atomic one. Having a separate > _Atomic bitset would be bloat and also prevent the user from both, in > some situations, doing atomic operations against a bit set, while in > other situations (e.g., at times when MT safety is not a concern) > operating on the same objects in a non-atomic manner. understood. i think the only downside is that if you do have an _Atomic-specified type you'll have to cast the qualification away to use the function like macro. as a suggestion the _Generic legs could include both _Atomic-specified and non-_Atomic-specified types where an intermediate inline function could strip the qualification to use your core inline implementations. _Generic((v), int *: __foo32, RTE_ATOMIC(int) *: __foo32_unqual)(v)) static inline void __foo32(int *a) { ... } static inline void __foo32_unqual(RTE_ATOMIC(int) *a) { __foo32((int *)(uintptr_t)(a)); } there is some similar prior art in newer ISO C23 with typeof_unqual. https://en.cppreference.com/w/c/language/typeof > > Unlike rte_bit_relaxed_*(), individual bits are represented by bool, > not uint32_t or uint64_t. The author found the use of such large types > confusing, and also failed to see any performance benefits. > > A set of functions rte_bit_*_assign() are added, to assign a > particular boolean value to a particular bit. > > All new functions have properly documented semantics. > > All new functions (or more correctly, generic selection macros) > operate on both 32 and 64-bit words, with type checking. > > _Generic allow the user code to be a little more impact. Have a > type-generic atomic test/set/clear/assign bit API also seems > consistent with the "core" (word-size) atomics API, which is generic > (both GCC builtins and <rte_stdatomic.h> are). ack, can you confirm _Generic is usable from a C++ TU? i may be making a mistake locally but using g++ version 11.4.0 -std=c++20 it wasn't accepting it. i think using _Generic is ideal, it eliminates mistakes when handling the different integer sizes so if it turns out C++ doesn't want to cooperate the function like macro can conditionally expand to a C++ template this will need to be done for MSVC since i can confirm _Generic does not work with MSVC C++. > > The _Generic versions avoids having explicit unsigned long versions of > all functions. If you have an unsigned long, it's safe to use the > generic version (e.g., rte_set_bit()) and _Generic will pick the right > function, provided long is either 32 or 64 bit on your platform (which > it is on all DPDK-supported ABIs). > > The generic rte_bit_set() is a macro, and not a function, but > nevertheless has been given a lower-case name. That's how C11 does it > (for atomics, and other _Generic), and <rte_stdatomic.h>. Its address > can't be taken, but it does not evaluate its parameters more than > once. > > Things that are left out of this patch set, that may be included > in future versions: > > * Have all functions returning a bit number have the same return type > (i.e., unsigned int). > * Harmonize naming of some GCC builtin wrappers (i.e., rte_fls_u32()). > * Add __builtin_ffsll()/ffs() wrapper and potentially other wrappers > for useful/used bit-level GCC builtins. > * Eliminate the MSVC #ifdef-induced documentation duplication. > * _Generic versions of things like rte_popcount32(). (?) it would be nice to see them all converted, at the time i added them we still hadn't adopted C11 so was limited. but certainly not asking for it as a part of this series. > > Mattias Rönnblom (6): > eal: extend bit manipulation functionality > eal: add unit tests for bit operations > eal: add exactly-once bit access functions > eal: add unit tests for exactly-once bit access functions > eal: add atomic bit operations > eal: add unit tests for atomic bit access functions > > app/test/test_bitops.c | 319 +++++++++++++++++- > lib/eal/include/rte_bitops.h | 624 ++++++++++++++++++++++++++++++++++- > 2 files changed, 925 insertions(+), 18 deletions(-) > > -- Series-acked-by: Tyler Retzlaff <roret...@linux.microsoft.com> > 2.34.1