Hi Phil & Honnappa,

> -----Original Message-----
> From: Phil Yang <phil.y...@arm.com>
> Sent: Tuesday, March 17, 2020 1:18 AM
> To: tho...@monjalon.net; Van Haaren, Harry <harry.van.haa...@intel.com>;
> Ananyev, Konstantin <konstantin.anan...@intel.com>;
> step...@networkplumber.org; maxime.coque...@redhat.com; dev@dpdk.org
> Cc: david.march...@redhat.com; jer...@marvell.com; hemant.agra...@nxp.com;
> honnappa.nagaraha...@arm.com; gavin...@arm.com; ruifeng.w...@arm.com;
> joyce.k...@arm.com; n...@arm.com
> Subject: [PATCH v3 00/12] generic rte atomic APIs deprecate proposal
> 
> DPDK provides generic rte_atomic APIs to do several atomic operations.
> These APIs are using the deprecated __sync built-ins and enforce full
> memory barriers on aarch64. However, full barriers are not necessary
> in many use cases. In order to address such use cases, C language offers
> C11 atomic APIs. The C11 atomic APIs provide finer memory barrier control
> by making use of the memory ordering parameter provided by the user.
> Various patches submitted in the past [2] and the patches in this series
> indicate significant performance gains on multiple aarch64 CPUs and no
> performance loss on x86.
> 
> But the existing rte_atomic API implementations cannot be changed as the
> APIs do not take the memory ordering parameter. The only choice available
> is replacing the usage of the rte_atomic APIs with C11 atomic APIs. In
> order to make this change, the following steps are proposed:
> 
> [1] deprecate rte_atomic APIs so that future patches do not use rte_atomic
> APIs (a script is added to flag the usages).
> [2] refactor the code that uses rte_atomic APIs to use c11 atomic APIs.

On [1] above, I feel deprecating DPDKs atomic functions and failing checkpatch 
is
a bit sudden. Perhaps noting that in a future release (20.11?) DPDK will move 
to a
C11 based atomics model is a more gradual step to achieving the goal, and at 
that
point add a checkpatch warning for additions of rte_atomic*?

More on [2] in context below.

The above is my point-of-view, of course I'd like more people from the DPDK 
community
to provide their input too.


> This patchset contains:
> 1) the checkpatch script changes to flag rte_atomic API usage in patches.
> 2) changes to programmer guide describing writing efficient code for aarch64.
> 3) changes to various libraries to make use of c11 atomic APIs.
> 
> We are planning to replicate this idea across all the other libraries,
> drivers, examples, test applications. In the next phase, we will add
> changes to the mbuf, the EAL interrupts and the event timer adapter libraries.

About ~6/12 patches of this C11 set are targeting the Service Cores area of 
DPDK. I have some concerns
over increased complexity of C11 implementation vs the (already complex) 
rte_atomic implementation today.
I see other patchsets enabling C11 across other DPDK components, so maybe we 
should also discuss C11
enabling in a wider context that just service cores?

I don't think it fair to expect all developers to be well versed in C11 atomic 
semantics like
understanding the complex interactions between the various C11 RELEASE, AQUIRE 
barriers requires.

As maintainer of Service Cores I'm hesitant to accept the large-scale refactor 
of atomic-implementation,
as it could lead to racey bugs that are likely extremely difficult to track 
down. (The recent race-on-exit
has proven the difficulty in reproducing, and that's with an atomics model I'm 
quite familiar with).

Let me be very clear: I don't wish to block a C11 atomic implementation, but 
I'd like to discuss how we
(DPDK community) can best port-to and maintain a complex multi-threaded service 
library with best-in-class
performance for the workload.

To put some discussions/solutions on the table:
- Shared Maintainership of a component?
     Split in functionality and C11 atomics implementation
     Obviously there would be collaboration required in such a case.
- Maybe shared maintainership is too much?
     A gentlemans/womans agreement of "helping out" with C11 atomics debug is 
enough?


Hope my concerns are understandable, and of course input/feedback welcomed! 
-Harry


PS: Apologies for the delay in reply - was OOO on Irish national holiday.


> v3:
> add libatomic dependency for 32-bit clang
> 
> v2:
> 1. fix Clang '-Wincompatible-pointer-types' WARNING.
> 2. fix typos.
> 
> Honnappa Nagarahalli (2):
>   service: avoid race condition for MT unsafe service
>   service: identify service running on another core correctly
> 
> Phil Yang (10):
>   doc: add generic atomic deprecation section
>   devtools: prevent use of rte atomic APIs in future patches
>   eal/build: add libatomic dependency for 32-bit clang
>   build: remove redundant code
>   vhost: optimize broadcast rarp sync with c11 atomic
>   ipsec: optimize with c11 atomic for sa outbound sqn update
>   service: remove rte prefix from static functions
>   service: remove redundant code
>   service: optimize with c11 one-way barrier
>   service: relax barriers with C11 atomic operations
> 
>  devtools/checkpatches.sh                         |   9 ++
>  doc/guides/prog_guide/writing_efficient_code.rst |  60 +++++++-
>  drivers/event/octeontx/meson.build               |   5 -
>  drivers/event/octeontx2/meson.build              |   5 -
>  drivers/event/opdl/meson.build                   |   5 -
>  lib/librte_eal/common/rte_service.c              | 175 ++++++++++++----------
> -
>  lib/librte_eal/meson.build                       |   6 +
>  lib/librte_ipsec/ipsec_sqn.h                     |   3 +-
>  lib/librte_ipsec/sa.h                            |   2 +-
>  lib/librte_rcu/meson.build                       |   5 -
>  lib/librte_vhost/vhost.h                         |   2 +-
>  lib/librte_vhost/vhost_user.c                    |   7 +-
>  lib/librte_vhost/virtio_net.c                    |  16 ++-
>  13 files changed, 181 insertions(+), 119 deletions(-)
> 
> --
> 2.7.4

Reply via email to