Added other platform owners.

> > > > > @@ -208,4 +209,25 @@ static inline void
> > > > > rte_atomic64_clear(rte_atomic64_t
> > > > > *v)  }  #endif
> > > > >
> > > > > +static inline int
> > > > > +rte_atomic128_cmpset(volatile uint64_t *dst, uint64_t *exp,
> > > > > +uint64_t
> > > > > +*src) {
> > > > The API name suggests it is a 128b operation. 'dst', 'exp' and 'src'
> > > > should be pointers to 128b (__int128)? Or we could define our own
> > > > data
> > > type.
> > >
> > > I agree, I'm not a big fan of the 64b pointers here. I avoided
> > > __int128 originally because it fails to compile with -pedantic, but
> > > on second thought (and with your suggestion of a separate data
> > > type), we can resolve that with this typedef:
> > >
> > > typedef struct {
> > >         RTE_STD_C11 __int128 val;
> > > } rte_int128_t;
> > ok
> >
> > >
> > > > Since, it is a new API, can we define it with memory orderings
> > > > which will be more conducive to relaxed memory ordering based
> architectures?
> > > > You can refer to [1] and [2] for guidance.
> > >
> > > I certainly see the value in controlling the operation's memory
> > > ordering, like in the __atomic intrinsics, but I'm not sure this
> > > patchset is the right place to address that. I see that work going a
> > > couple
> > ways:
> > > 1. Expand the existing rte_atomicN_* interfaces with additional
> > > arguments. In that case, I'd prefer this be done in a separate
> > > patchset that addresses all the atomic operations, not just cmpset,
> > > so the interface changes are chosen according to the needs of the
> > > full set of atomic operations. If this approach is taken then
> > > there's no need to solve this while rte_atomic128_cmpset is
> > > experimental, since all the
> > other functions are non-experimental anyway.
> > >
> > > - Or -
> > >
> > > 2. Don't modify the existing rte_atomicN_* interfaces (or their
> > > strongly ordered behavior), and instead create new versions of them
> > > that take additional arguments. In this case, we can implement
> > > rte_atomic128_cmpset() as is and create a more flexible version in a
> > > later
> > patchset.
> > >
> > > Either way, I think the current interface (w.r.t. memory ordering
> > > options) can work and still leaves us in a good position for future
> > changes/improvements.
> > >
> > I do not see the need to modify/extend the existing rte_atomicN_* APIs
> > as the corresponding __atomic intrinsics serve as replacements. I
> > expect that at some point, DPDK code base will not be using
> rte_atomicN_* APIs.
> > However, __atomic intrinsics do not support 128b wide parameters.
> > Hence
> 
> I don't think that's correct. From the GCC docs:
> 
> "16-byte integral types are also allowed if `__int128' (see __int128) is
> supported by the architecture."
> 
> This works with x86 -64 -- I assume aarch64 also, but haven't confirmed.
> 
> Source: https://gcc.gnu.org/onlinedocs/gcc-4.7.0/gcc/_005f_005fatomic-
> Builtins.html
(following is based on my reading, not based on experiments)
My understanding is that the __atomic_load_n/store_n were implemented using a 
compare-and-swap HW instruction (due to lack of HW 128b atomic load and store 
instructions). This introduced the side effect or store/load respectively. 
Where as the user does not expect such side effects. As suggested in [1], it 
looks like GCC delegated the implementation to libatomic which 'it seems' uses 
locks to implement 128b __atomic intrinsics (needs to be verified)

If __atomic intrinsics, for any of the supported platforms, do not have an 
optimal implementation, I prefer to add a DPDK API as an abstraction. A given 
platform can choose to implement such an API using __atomic intrinsics if it 
wants. The DPDK API can be similar to __atomic_compare_exchange_n.

[1] https://patchwork.ozlabs.org/patch/721686/
[2] https://gcc.gnu.org/ml/gcc/2017-01/msg00167.html

> 
> > DPDK needs to write its own. Since this is the first API in that
> > regard, I prefer that we start with a signature that resembles
> > __atomic intrinsics which have been proven to provide best flexibility for
> all the platforms supported by DPDK.

Reply via email to