+ Erik as there are similar changes to timer library <snip>
> > > > Subject: [PATCH v3 00/12] generic rte atomic APIs deprecate proposal > > > > > > DPDK provides generic rte_atomic APIs to do several atomic operations. > > > These APIs are using the deprecated __sync built-ins and enforce > > > full memory barriers on aarch64. However, full barriers are not > > > necessary in many use cases. In order to address such use cases, C > > > language offers > > > C11 atomic APIs. The C11 atomic APIs provide finer memory barrier > > > control by making use of the memory ordering parameter provided by > > > the > > user. > > > Various patches submitted in the past [2] and the patches in this > > > series indicate significant performance gains on multiple aarch64 > > > CPUs and no performance loss on x86. > > > > > > But the existing rte_atomic API implementations cannot be changed as > > > the APIs do not take the memory ordering parameter. The only choice > > > available is replacing the usage of the rte_atomic APIs with C11 > > > atomic APIs. In order to make this change, the following steps are > proposed: > > > > > > [1] deprecate rte_atomic APIs so that future patches do not use > > > rte_atomic APIs (a script is added to flag the usages). > > > [2] refactor the code that uses rte_atomic APIs to use c11 atomic APIs. > > > > On [1] above, I feel deprecating DPDKs atomic functions and failing > > checkpatch is a bit sudden. Perhaps noting that in a future release > > (20.11?) DPDK will move to a > > C11 based atomics model is a more gradual step to achieving the goal, > > and at that point add a checkpatch warning for additions of rte_atomic*? > We have been working on changing existing usages of rte_atomic APIs in DPDK > to use C11 atomics. Usually, the x.11 releases have significant amount of > changes (not sure how many would use rte_atomic APIs). I would prefer that > in 20.11 no additional code is added using rte_atomics APIs. However, I am > open to suggestions on the exact time frame. > Once we decide on the release, I think it makes sense to add a 'warning' in > the > checkpatch to indicate the deprecation timeline and add an 'error' after the > release. > > > > > More on [2] in context below. > > > > The above is my point-of-view, of course I'd like more people from the > > DPDK community to provide their input too. > > > > > > > This patchset contains: > > > 1) the checkpatch script changes to flag rte_atomic API usage in patches. > > > 2) changes to programmer guide describing writing efficient code for > > aarch64. > > > 3) changes to various libraries to make use of c11 atomic APIs. > > > > > > We are planning to replicate this idea across all the other > > > libraries, drivers, examples, test applications. In the next phase, > > > we will add changes to the mbuf, the EAL interrupts and the event > > > timer adapter > > libraries. > > > > About ~6/12 patches of this C11 set are targeting the Service Cores > > area of DPDK. I have some concerns over increased complexity of C11 > > implementation vs the (already complex) rte_atomic implementation today. > I agree that it C11 changes are complex, especially if one is starting out to > understand what these APIs provide. From my experience, once few > underlying concepts are understood, reviewing or making changes do not take > too much time. > > > I see other patchsets enabling C11 across other DPDK components, so > > maybe we should also discuss C11 enabling in a wider context that just > service cores? > Yes, agree. We are in the process of making changes to other areas as well. > > > > > I don't think it fair to expect all developers to be well versed in > > C11 atomic semantics like understanding the complex interactions > > between the various > > C11 RELEASE, AQUIRE barriers requires. > C11 has been around from sometime now. To my surprise, OVS already uses > C11 APIs extensively. VPP has been accepting C11 related changes from past > couple of years. Having said that, I agree in general that not everyone is > well > versed. > > > > > As maintainer of Service Cores I'm hesitant to accept the large-scale > > refactor > Right now, the patches are split into multiple commits. If required I can > host a > call to go over simple C11 API usages (sufficient to cover the usage in > service > core) and the changes in this patch. If you find that particular areas need > more > understanding I can work on providing additional information such as memory > order ladder diagrams. Please let me know what you think. When I started working with C11 APIs, I had referred to the following blogs. https://preshing.com/20120913/acquire-and-release-semantics/ https://preshing.com/20130702/the-happens-before-relation/ https://preshing.com/20130823/the-synchronizes-with-relation/ These will be helpful to understand the changes. > > > of atomic-implementation, as it could lead to racey bugs that are > > likely extremely difficult to track down. (The recent race-on-exit has > > proven the difficulty in reproducing, and that's with an atomics model > > I'm quite familiar with). > > > > Let me be very clear: I don't wish to block a C11 atomic > > implementation, but I'd like to discuss how we (DPDK community) can > > best port-to and maintain a complex multi-threaded service library > > with best-in-class performance for the workload. > > > > To put some discussions/solutions on the table: > > - Shared Maintainership of a component? > > Split in functionality and C11 atomics implementation > > Obviously there would be collaboration required in such a case. > > - Maybe shared maintainership is too much? > > A gentlemans/womans agreement of "helping out" with C11 atomics > > debug is enough? > I think shared maintainer ship could be too much as there are many changes. > But, I and other engineers from Arm (I would include Arm ecosystem as well) > can definitely help out on debug and reviews involving C11 APIs. (I see > Thomas's reply on this topic). > > > > > > > Hope my concerns are understandable, and of course input/feedback > > welcomed! -Harry > No worries at all. We are here to help and make this as easy as possible. > > > > > > > PS: Apologies for the delay in reply - was OOO on Irish national holiday. > Same here, was on vacation for 3 days. > > > > > > > > v3: > > > add libatomic dependency for 32-bit clang > > > > > > v2: > > > 1. fix Clang '-Wincompatible-pointer-types' WARNING. > > > 2. fix typos. > > > > > > Honnappa Nagarahalli (2): > > > service: avoid race condition for MT unsafe service > > > service: identify service running on another core correctly > > > > > > Phil Yang (10): > > > doc: add generic atomic deprecation section > > > devtools: prevent use of rte atomic APIs in future patches > > > eal/build: add libatomic dependency for 32-bit clang > > > build: remove redundant code > > > vhost: optimize broadcast rarp sync with c11 atomic > > > ipsec: optimize with c11 atomic for sa outbound sqn update > > > service: remove rte prefix from static functions > > > service: remove redundant code > > > service: optimize with c11 one-way barrier > > > service: relax barriers with C11 atomic operations > > > > > > devtools/checkpatches.sh | 9 ++ > > > doc/guides/prog_guide/writing_efficient_code.rst | 60 +++++++- > > > drivers/event/octeontx/meson.build | 5 - > > > drivers/event/octeontx2/meson.build | 5 - > > > drivers/event/opdl/meson.build | 5 - > > > lib/librte_eal/common/rte_service.c | 175 > > > ++++++++++++---------- > > > - > > > lib/librte_eal/meson.build | 6 + > > > lib/librte_ipsec/ipsec_sqn.h | 3 +- > > > lib/librte_ipsec/sa.h | 2 +- > > > lib/librte_rcu/meson.build | 5 - > > > lib/librte_vhost/vhost.h | 2 +- > > > lib/librte_vhost/vhost_user.c | 7 +- > > > lib/librte_vhost/virtio_net.c | 16 ++- > > > 13 files changed, 181 insertions(+), 119 deletions(-) > > > > > > -- > > > 2.7.4 >