> From: Mattias Rönnblom [mailto:hof...@lysator.liu.se] > Sent: Wednesday, 31 January 2024 19.46 > > On 2024-01-31 17:06, Stephen Hemminger wrote: > > On Wed, 31 Jan 2024 14:13:01 +0100 > > Mattias Rönnblom <mattias.ronnb...@ericsson.com> wrote:
[...] > > FYI - the linux kernel has a similar but more complete set of > operations. > > It might be more efficient to use unsigned long rather than requiring > > the elements to be uint64_t. Thinking of the few 32 bit platforms. > > > > Keeping it 64-bit avoids a popcount-related #ifdef. DPDK doesn't have > an > equivalent to __builtin_popcountl(). > > How much do we need to care about 32-bit ISA performance? At the 2023 DPDK Summit I talked to someone at a very well known network equipment vendor using 32 bit CPUs in some of their products; some sort of CPE, IIRC. 32 bit CPUs are still out there, and 32-bit CPU support has not been deprecated in DPDK. For the bitset parameter to functions, you could either use "unsigned long*" (as suggested by Stephen), or "void*" (followed by type casting inside the functions). If only using this library for the command line argument parser and similar, performance is irrelevant. If we foresee using it in the fast path, e.g. with the htimer library, we shouldn't tie its API tightly to 64 bit. > > I'll go through the below API and some other APIs to see if there's > something obvious missing. > > When I originally wrote this code there were a few potential features > where I wasn't sure to what extent they were useful. One example was > the > shift operation. Any input is appreciated. Start off with what you already have. If we need more operations, they can always be added later. > > > Also, what if any thread safety guarantees? or atomic. > > > > Currently, it's all MT unsafe. > > An atomic set and get/test would make sense, and maybe other operations > would as well. > > Bringing in atomicity into the design makes it much less obvious: > > Would the atomic operations imply some memory ordering, or be > "relaxed". > I would lean toward relaxed, but then shouldn't bit-level atomics be > consistent with the core DPDK atomics API? With that in mind, memory > ordering should be user-configurable. > > If the code needs to be pure C11 atomics-wise, the words that makes up > the bitset must be _Atomic uint64_t. Then you need to be careful or end > up with "lock"-prefixed instructions if you manipulate the bitset > words. > Just a pure words[N] = 0; gives you a mov+mfence on x86, for example, > plus all the fun memory_order_seq_cst in terms of preventing > compiler-level optimizations. So you definitely can't have the bitset > always using _Atomic uint64_t, since would risk non-shared use cases. > You could have a variant I guess. Just duplicate the whole thing, or > something with macros. It seems like MT unsafe suffices for the near term use cases. We can add an atomic variant of the library later, if the need should arise. > > With GCC C11 builtins, you can both have the atomic cake and eat it, in > that you both access the data non-atomically/normally, and in an atomic > manner. Yep. And we care quite a lot about performance, so we are likely to keep using those until the compilers offer similar performance for C11 standard atomics.