<snip>

> >>
> >>> Subject: Re: [dpdk-dev] Arm roadmap for 20.05
> >>>
> >>> On 2020-03-10 17:42, Honnappa Nagarahalli wrote:
> >>>> Hello,
> >>>> Following are the work items planned for 20.05:
> >>>>
> >>>> 1) Use C11 atomic APIs in timer library
> >>>> 2) Use C11 atomic APIs in service cores
> >>>> 3) Use C11 atomics in VirtIO split ring
> >>>> 4) Performance optimizations in i40e and MLX drivers for Arm platforms
> >>>> 5) RCU defer API
> >>>> 6) Enable Travis CI with no huge-page tests - ~25 test cases
> >>>>
> >>>> Thank you,
> >>>> Honnappa
> >>> Maybe you should have a look at legacy DPDK atomics as well?
> >>> Avoiding a full barrier for the add operation, for example.
> >> By legacy, I believe you meant rte_atomic APIs. Those APIs do not take
> >> memory order as a parameter. So, it is difficult to change the
> >> implementation for those APIs. For ex: the add operation could take a
> >> RELEASE or RELAXED order depending on the use case.
> >> So, the proposal is to deprecate the rte_atomic APIs and use C11 APIs
> >> directly. The proposal is here:
> >> https://protect2.fireeye.com/v1/url?k=2e04311e-72d039b7-2e047185-865b3b1e120b-91a0698f69ff0d1f&q=1&e=976056f3-f089-4fa8-86b2-aa5e88331555&u=https%3A%2F%2Fpatches.dpdk.org%2Fcover%2F66745%2F
> > Even though rte_atomic lacks the flexibility of C11 atomics, there
> > might still be areas of improvement. Such improvements will have an
> > instant effect, as opposed to waiting for all the rte_atomic users to
> > change.
> >
> >
> > The rte_atomic API leaves ordering unspecified, unfortunately. In the
> > Linux kernel, from which DPDK seems to borrow much of the atomics and
> > memory order related semantics, an atomic add doesn't imply any memory
> > barriers. The current __sync_fetch_and_add()-based implementation
> > implies a full barrier (ldadd+dmb) or release (ldaddal, on v8.1-a). If
> > you would use C11 atomics to implement rte_atomic in ARM, you could
> > use a relaxed memory order on rte_atomic*_add() (assuming you agree
> > those are the implicit semantics of the legacy API) and just get an
> > ldadd instruction. An alternative would be to implement the same thing
> > in assembler, of course.
> >
> >
> Another approach might be to just scrap all of the intrinsics and inline
> assembler used for all the functions in rte_atomic, on all architectures,
> and use C11 atomics instead.

Yes, this is the approach we are taking. But, it does not solve the use of rte_atomic APIs in the applications.
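
To make the suggestion above concrete, here is a rough sketch of a legacy-style add built on C11 atomics with a relaxed order. The type and function names (legacy_atomic32_t, legacy_atomic32_add and so on) are made up for illustration and are not the actual rte_atomic definitions. It also assumes the legacy add is only expected to be atomic, with no ordering guarantee, which is the open question in this thread.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a legacy counter type; NOT the real rte_atomic32_t. */
typedef struct {
    _Atomic int32_t cnt;
} legacy_atomic32_t;

/* Set the initial value; no ordering is implied. */
static inline void legacy_atomic32_init(legacy_atomic32_t *v, int32_t val)
{
    atomic_init(&v->cnt, val);
}

/*
 * Legacy-style add with no memory-order parameter.
 * Assumption: callers rely only on atomicity, so a relaxed
 * read-modify-write is sufficient and no barrier is requested.
 */
static inline void legacy_atomic32_add(legacy_atomic32_t *v, int32_t inc)
{
    atomic_fetch_add_explicit(&v->cnt, inc, memory_order_relaxed);
}

/*
 * C11-style variant: the caller picks the order per use case
 * (e.g. memory_order_release or memory_order_relaxed).
 * Returns the value after the addition.
 */
static inline int32_t legacy_atomic32_add_return(legacy_atomic32_t *v,
                                                 int32_t inc,
                                                 memory_order order)
{
    return atomic_fetch_add_explicit(&v->cnt, inc, order) + inc;
}

/* Relaxed read of the current value. */
static inline int32_t legacy_atomic32_read(legacy_atomic32_t *v)
{
    return atomic_load_explicit(&v->cnt, memory_order_relaxed);
}
```

Whether the relaxed version actually compiles down to a single ldadd depends on the compiler and on the target flags (for example, whether the v8.1-a atomics are enabled), so the generated code should be checked rather than assumed. The second function shows why calling the C11 APIs directly is more flexible: the order becomes a per-call decision instead of a property baked into the wrapper.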
>