> From: Mattias Rönnblom [mailto:hof...@lysator.liu.se] > Sent: Sunday, 28 April 2024 17.38 > > On 2024-04-26 14:00, Morten Brørup wrote: > >> From: Mattias Rönnblom [mailto:hof...@lysator.liu.se] > >> Sent: Friday, 26 April 2024 11.39 > >> > >> On 2024-04-25 18:18, Morten Brørup wrote: > >>>> From: Mattias Rönnblom [mailto:hof...@lysator.liu.se] > >>>> Sent: Thursday, 25 April 2024 16.36 > >>>> > >>>> On 2024-04-25 12:25, Morten Brørup wrote: > >>>>>> +#define rte_bit_atomic_test(addr, nr, memory_order) > >>>> \ > >>>>>> + _Generic((addr), > >>>>>> \ > >>>>>> + uint32_t *: __rte_bit_atomic_test32, > >>>>>> \ > >>>>>> + uint64_t *: __rte_bit_atomic_test64)(addr, nr, > >>>> memory_order) > >>>>> > >>>>> I wonder if these should have RTE_ATOMIC qualifier: > >>>>> > >>>>> + RTE_ATOMIC(uint32_t) *: __rte_bit_atomic_test32, > >>>> \ > >>>>> + RTE_ATOMIC(uint64_t) *: __rte_bit_atomic_test64)(addr, > nr, > >>>> memory_order) > >>>>> > >>>>> > >>>>>> +#define __RTE_GEN_BIT_ATOMIC_TEST(size) > >>>> \ > >>>>>> + static inline bool > >>>>>> \ > >>>>>> + __rte_bit_atomic_test ## size(const uint ## size ## _t *addr, > >>>> \ > >>>>> > >>>>> I wonder if the "addr" parameter should have RTE_ATOMIC qualifier: > >>>>> > >>>>> + __rte_bit_atomic_test ## size(const RTE_ATOMIC(uint ## size ## > _t) > >>>> *addr, \ > >>>>> > >>>>> instead of casting into a_addr. > >>>>> > >>>> > >>>> Check the cover letter for the rationale for the cast. > >>> > >>> Thanks, that clarifies it. Then... > >>> For the series: > >>> Acked-by: Morten Brørup <m...@smartsharesystems.com> > >>> > >>>> > >>>> Where I'm at now is that I think C11 _Atomic is rather poor design. The > >>>> assumption that an object which allows for atomic access always should > >>>> require all operations upon it to be atomic, regardless of where it is > >>>> in its lifetime, and which thread is accessing it, does not hold, in the > >>>> general case. > >>> > >>> It might be slow, but I suppose the C11 standard prioritizes correctness > >> over performance. > >>> > >> > >> That's a false dichotomy, in this case. You can have both. > > > > In theory you shouldn't need non-atomic access to atomic variables. > > In reality, we want it anyway, because real CPUs are faster at non-atomic > operations. > > > >> > >>> It seems locks are automatically added to _Atomic types larger than what > is > >> natively supported by the architecture. > >>> E.g. MSVC adds locks to _Atomic types larger than 8 byte. [1] > >>> > >>> [1]: https://devblogs.microsoft.com/cppblog/c11-atomics-in-visual-studio- > >> 2022-version-17-5-preview-2/ > >>> > >>>> > >>>> The only reason for _Atomic being as it is, as far as I can see, is to > >>>> accommodate for ISAs which does not have the appropriate atomic machine > >>>> instructions, and thus require a lock or some other data associated with > >>>> the actual user-data-carrying bits. Neither GCC nor DPDK supports any > >>>> such ISAs, to my knowledge. I suspect neither never will. So the cast > >>>> will continue to work. > >>> > >>> I tend to agree with you on this. > >>> > >>> We should officially decide that DPDK treats RTE_ATOMIC types as a union > of > >> _Atomic and non-atomic, i.e. operations on RTE_ATOMIC types can be both > atomic > >> and non-atomic. > >>> > >> > >> I think this is a subject which needs to be further explored. > > > > Yes. It's easier exploring and deciding now, when our options are open, than > after we have locked down the affected APIs. > > > >> > >> Objects that can be accessed both atomically and non-atomically should > >> be without _Atomic. With my current understanding of this issue, that > >> seems like the best option. > > > > Agree. > > > > The alterative described below is certainly no good! > > > > It would be nice if they were marked as sometimes-atomic by a qualifier or > special type, like rte_be32_t marks the network byte order variant of an > uint32_t. > > > > Furthermore, large atomic objects need the _Atomic qualifier for the > compiler to add (and use) the associated lock. > > If you have larger objects than the ISA can handle, you wouldn't want to > leave the choice of the synchronization primitive to use to the > compiler. I don't see how it could possibly know, which one is the most > appropriate, especially in a DPDK context. It would for example need to > know if the contending threads are non-preemptable or not. > > In some situations a sequence lock may well be your best option. Will > the compiler generate one for you? > > If "lock" means std::mutex, it would be a disaster, performance-wise.
Considering that the atomic functions, e.g. atomic_fetch_add(), without _explicit(..., memory_order) means memory_order_seq_cst, I think it does. This makes it relatively straightforward to use atomic types, at the cost of performance. There's a good description here: https://en.cppreference.com/w/c/language/atomic Note that accessing members of an _Atomic struct/union is undefined behavior. For those, you need to have a non-atomic type, used as "value" to void atomic_store( volatile _Atomic struct mytype * obj, const struct mytype value ), and return value from atomic_load( const volatile _Atomic struct mytype * obj ). In other words, for structs/unions, _Atomic variables are only accessed through accessor functions taking pointers to them, and thereby transformed from/to values of similar non-atomic type. I think that this concept also supports your suggestion above: Objects that can be accessed both atomically and non-atomically should be without _Atomic. But I still think it would be a good idea to mark them as sometimes-atomic, for source code readability/review purposes. E.g. the mbuf's refcnt field is of the type RTE_ATOMIC(uint16_t). If it is not only accessed through atomic_ accessor functions, should it still be marked RTE_ATOMIC()? In the future, compilers might warn or error when an _Atomic variable (of any type) is being accessed directly. The extreme solution would be not to mix atomic and non-atomic access to variables. But that seems unrealistic (at this time). If we truly want to support C11 atomics, we need to understand and follow the concepts in the standard. > > > Alternatively, we could specify that sometimes-atomic objects cannot be > larger than 8 byte, which is what MSVC can handle without locking. > > > >> > >> You could turn it around as well, and have such marked _Atomic and have > >> explicit casts to their non-_Atomic cousins when operated upon by > >> non-atomic functions. Not sure how realistic that is, since > >> non-atomicity is the norm. All generic selection-based "functions" must > >> take this into account. > >> > >>>> > >>>>>> + unsigned int nr, int > >>>>>> memory_order) \ > >>>>>> + { > >>>>>> \ > >>>>>> + RTE_ASSERT(nr < size); > >>>>>> \ > >>>>>> + > >>>>>> \ > >>>>>> + const RTE_ATOMIC(uint ## size ## _t) *a_addr = > >>>>>> \ > >>>>>> + (const RTE_ATOMIC(uint ## size ## _t) *)addr; > >>>>>> \ > >>>>>> + uint ## size ## _t mask = (uint ## size ## _t)1 << nr; > \ > >>>>>> + return rte_atomic_load_explicit(a_addr, memory_order) & > >>>> mask; \ > >>>>>> + } > >>>>> > >>>>> > >>>>> Similar considerations regarding volatile qualifier for the "once" > >>>> operations. > >>>>>