RE: [RFC v2 5/6] eal: add atomic bit operations

Morten Brørup Mon, 29 Apr 2024 00:24:59 -0700

> From: Mattias Rönnblom [mailto:[email protected]]
> Sent: Sunday, 28 April 2024 17.38
> 
> On 2024-04-26 14:00, Morten Brørup wrote:
> >> From: Mattias Rönnblom [mailto:[email protected]]
> >> Sent: Friday, 26 April 2024 11.39
> >>
> >> On 2024-04-25 18:18, Morten Brørup wrote:
> >>>> From: Mattias Rönnblom [mailto:[email protected]]
> >>>> Sent: Thursday, 25 April 2024 16.36
> >>>>
> >>>> On 2024-04-25 12:25, Morten Brørup wrote:
> >>>>>> +#define rte_bit_atomic_test(addr, nr, memory_order)
> >>>>  \
> >>>>>> +      _Generic((addr),                                                
> >>>>>> \
> >>>>>> +               uint32_t *: __rte_bit_atomic_test32,                   
> >>>>>> \
> >>>>>> +               uint64_t *: __rte_bit_atomic_test64)(addr, nr,
> >>>> memory_order)
> >>>>>
> >>>>> I wonder if these should have RTE_ATOMIC qualifier:
> >>>>>
> >>>>> +                RTE_ATOMIC(uint32_t) *: __rte_bit_atomic_test32,
> >>>>          \
> >>>>> +                RTE_ATOMIC(uint64_t) *: __rte_bit_atomic_test64)(addr,
> nr,
> >>>> memory_order)
> >>>>>
> >>>>>
> >>>>>> +#define __RTE_GEN_BIT_ATOMIC_TEST(size)
> >>>>  \
> >>>>>> +      static inline bool                                              
> >>>>>> \
> >>>>>> +      __rte_bit_atomic_test ## size(const uint ## size ## _t *addr,
> >>>>  \
> >>>>>
> >>>>> I wonder if the "addr" parameter should have RTE_ATOMIC qualifier:
> >>>>>
> >>>>> +       __rte_bit_atomic_test ## size(const RTE_ATOMIC(uint ## size ##
> _t)
> >>>> *addr,   \
> >>>>>
> >>>>> instead of casting into a_addr.
> >>>>>
> >>>>
> >>>> Check the cover letter for the rationale for the cast.
> >>>
> >>> Thanks, that clarifies it. Then...
> >>> For the series:
> >>> Acked-by: Morten Brørup <[email protected]>
> >>>
> >>>>
> >>>> Where I'm at now is that I think C11 _Atomic is rather poor design. The
> >>>> assumption that an object which allows for atomic access always should
> >>>> require all operations upon it to be atomic, regardless of where it is
> >>>> in its lifetime, and which thread is accessing it, does not hold, in the
> >>>> general case.
> >>>
> >>> It might be slow, but I suppose the C11 standard prioritizes correctness
> >> over performance.
> >>>
> >>
> >> That's a false dichotomy, in this case. You can have both.
> >
> > In theory you shouldn't need non-atomic access to atomic variables.
> > In reality, we want it anyway, because real CPUs are faster at non-atomic
> operations.
> >
> >>
> >>> It seems locks are automatically added to _Atomic types larger than what
> is
> >> natively supported by the architecture.
> >>> E.g. MSVC adds locks to _Atomic types larger than 8 byte. [1]
> >>>
> >>> [1]: https://devblogs.microsoft.com/cppblog/c11-atomics-in-visual-studio-
> >> 2022-version-17-5-preview-2/
> >>>
> >>>>
> >>>> The only reason for _Atomic being as it is, as far as I can see, is to
> >>>> accommodate for ISAs which does not have the appropriate atomic machine
> >>>> instructions, and thus require a lock or some other data associated with
> >>>> the actual user-data-carrying bits. Neither GCC nor DPDK supports any
> >>>> such ISAs, to my knowledge. I suspect neither never will. So the cast
> >>>> will continue to work.
> >>>
> >>> I tend to agree with you on this.
> >>>
> >>> We should officially decide that DPDK treats RTE_ATOMIC types as a union
> of
> >> _Atomic and non-atomic, i.e. operations on RTE_ATOMIC types can be both
> atomic
> >> and non-atomic.
> >>>
> >>
> >> I think this is a subject which needs to be further explored.
> >
> > Yes. It's easier exploring and deciding now, when our options are open, than
> after we have locked down the affected APIs.
> >
> >>
> >> Objects that can be accessed both atomically and non-atomically should
> >> be without _Atomic. With my current understanding of this issue, that
> >> seems like the best option.
> >
> > Agree.
> >
> > The alterative described below is certainly no good!
> >
> > It would be nice if they were marked as sometimes-atomic by a qualifier or
> special type, like rte_be32_t marks the network byte order variant of an
> uint32_t.
> >
> > Furthermore, large atomic objects need the _Atomic qualifier for the
> compiler to add (and use) the associated lock.
> 
> If you have larger objects than the ISA can handle, you wouldn't want to
> leave the choice of the synchronization primitive to use to the
> compiler. I don't see how it could possibly know, which one is the most
> appropriate, especially in a DPDK context. It would for example need to
> know if the contending threads are non-preemptable or not.
> 
> In some situations a sequence lock may well be your best option. Will
> the compiler generate one for you?
> 
> If "lock" means std::mutex, it would be a disaster, performance-wise.


Considering that the atomic functions, e.g. atomic_fetch_add(), without 
_explicit(..., memory_order) means memory_order_seq_cst, I think it does. This 
makes it relatively straightforward to use atomic types, at the cost of 
performance.

There's a good description here:
https://en.cppreference.com/w/c/language/atomic

Note that accessing members of an _Atomic struct/union is undefined behavior.
For those, you need to have a non-atomic type, used as "value" to void 
atomic_store( volatile _Atomic struct mytype * obj, const struct mytype value 
), and return value from atomic_load( const volatile _Atomic struct mytype * 
obj ).

In other words, for structs/unions, _Atomic variables are only accessed through 
accessor functions taking pointers to them, and thereby transformed from/to 
values of similar non-atomic type.
I think that this concept also supports your suggestion above: Objects that can 
be accessed both atomically and non-atomically should be without _Atomic.

But I still think it would be a good idea to mark them as sometimes-atomic, for 
source code readability/review purposes.

E.g. the mbuf's refcnt field is of the type RTE_ATOMIC(uint16_t). If it is not 
only accessed through atomic_ accessor functions, should it still be marked 
RTE_ATOMIC()?

In the future, compilers might warn or error when an _Atomic variable (of any 
type) is being accessed directly.
The extreme solution would be not to mix atomic and non-atomic access to 
variables. But that seems unrealistic (at this time).

If we truly want to support C11 atomics, we need to understand and follow the 
concepts in the standard.

> 
> > Alternatively, we could specify that sometimes-atomic objects cannot be
> larger than 8 byte, which is what MSVC can handle without locking.
> >
> >>
> >> You could turn it around as well, and have such marked _Atomic and have
> >> explicit casts to their non-_Atomic cousins when operated upon by
> >> non-atomic functions. Not sure how realistic that is, since
> >> non-atomicity is the norm. All generic selection-based "functions" must
> >> take this into account.
> >>
> >>>>
> >>>>>> +                                    unsigned int nr, int 
> >>>>>> memory_order) \
> >>>>>> +      {                                                               
> >>>>>> \
> >>>>>> +              RTE_ASSERT(nr < size);                                  
> >>>>>> \
> >>>>>> +                                                                      
> >>>>>> \
> >>>>>> +              const RTE_ATOMIC(uint ## size ## _t) *a_addr =          
> >>>>>> \
> >>>>>> +                      (const RTE_ATOMIC(uint ## size ## _t) *)addr;   
> >>>>>> \
> >>>>>> +              uint ## size ## _t mask = (uint ## size ## _t)1 << nr;
>       \
> >>>>>> +              return rte_atomic_load_explicit(a_addr, memory_order) &
> >>>> mask; \
> >>>>>> +      }
> >>>>>
> >>>>>
> >>>>> Similar considerations regarding volatile qualifier for the "once"
> >>>> operations.
> >>>>>

RE: [RFC v2 5/6] eal: add atomic bit operations

Reply via email to