> -----Original Message----- > From: Van Haaren, Harry <harry.van.haa...@intel.com> > Sent: Friday, April 3, 2020 7:58 PM > To: Phil Yang <phil.y...@arm.com>; tho...@monjalon.net; Ananyev, > Konstantin <konstantin.anan...@intel.com>; > step...@networkplumber.org; maxime.coque...@redhat.com; > dev@dpdk.org > Cc: david.march...@redhat.com; jer...@marvell.com; > hemant.agra...@nxp.com; Honnappa Nagarahalli > <honnappa.nagaraha...@arm.com>; Gavin Hu <gavin...@arm.com>; > Ruifeng Wang <ruifeng.w...@arm.com>; Joyce Kong > <joyce.k...@arm.com>; nd <n...@arm.com> > Subject: RE: [PATCH v3 11/12] service: optimize with c11 one-way barrier > > > -----Original Message----- > > From: Phil Yang <phil.y...@arm.com> > > Sent: Tuesday, March 17, 2020 1:18 AM > > To: tho...@monjalon.net; Van Haaren, Harry > <harry.van.haa...@intel.com>; > > Ananyev, Konstantin <konstantin.anan...@intel.com>; > > step...@networkplumber.org; maxime.coque...@redhat.com; > dev@dpdk.org > > Cc: david.march...@redhat.com; jer...@marvell.com; > hemant.agra...@nxp.com; > > honnappa.nagaraha...@arm.com; gavin...@arm.com; > ruifeng.w...@arm.com; > > joyce.k...@arm.com; n...@arm.com > > Subject: [PATCH v3 11/12] service: optimize with c11 one-way barrier > > > > The num_mapped_cores and execute_lock are synchronized with > rte_atomic_XX > > APIs which is a full barrier, DMB, on aarch64. This patch optimized it with > > c11 atomic one-way barrier. > > > > Signed-off-by: Phil Yang <phil.y...@arm.com> > > Reviewed-by: Ruifeng Wang <ruifeng.w...@arm.com> > > Reviewed-by: Gavin Hu <gavin...@arm.com> > > Reviewed-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com> > > Based on discussion on-list, it seems the consensus is to not use > GCC builtins, but instead use C11 APIs "proper"? If my conclusion is > correct, the v+1 of this patchset would require updates to that style API. > > Inline comments for context below, -Harry > > > > --- > > lib/librte_eal/common/rte_service.c | 50 > ++++++++++++++++++++++++++---------- > > - > > 1 file changed, 35 insertions(+), 15 deletions(-) > > > > diff --git a/lib/librte_eal/common/rte_service.c > > b/lib/librte_eal/common/rte_service.c > > index 0843c3c..c033224 100644 > > --- a/lib/librte_eal/common/rte_service.c > > +++ b/lib/librte_eal/common/rte_service.c > > @@ -42,7 +42,7 @@ struct rte_service_spec_impl { > > * running this service callback. When not set, a core may take the > > * lock and then run the service callback. > > */ > > - rte_atomic32_t execute_lock; > > + uint32_t execute_lock; > > > > /* API set/get-able variables */ > > int8_t app_runstate; > > @@ -54,7 +54,7 @@ struct rte_service_spec_impl { > > * It does not indicate the number of cores the service is running > > * on currently. > > */ > > - rte_atomic32_t num_mapped_cores; > > + int32_t num_mapped_cores; > > Any reason why "int32_t" or "uint32_t" is used over another? > execute_lock is a uint32_t above, num_mapped_cores is an int32_t?
It should be uint32_t for num_mapped_cores. This value will not be negative after __atomic_sub_fetch operation, because of the sequence of writer and reader accesses are guaranteed by the memory ordering. I will update it in the next version. Thanks, Phil <snip>