> -----Original Message----- > From: Mattias Rönnblom <mattias.ronnb...@ericsson.com> > Sent: Tuesday, September 6, 2022 5:14 PM > To: Van; Haaren; Van Haaren, Harry <harry.van.haa...@intel.com> > Cc: dev@dpdk.org; Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>; > Morten Brørup <m...@smartsharesystems.com>; nd <n...@arm.com>; > mattias.ronnblom <mattias.ronnb...@ericsson.com> > Subject: [PATCH 1/6] service: reduce statistics overhead for parallel services > > Move the statistics from the service data structure to the per-lcore > struct. This eliminates contention for the counter cache lines, which > decreases the producer-side statistics overhead for services deployed > across many lcores.
Agree with the approach, good stuff. One comment below, but no changes necessary. > Prior to this patch, enabling statistics for a service with a > per-service function call latency of 1000 clock cycles deployed across > 16 cores on a Intel Xeon 6230N @ 2,3 GHz would incur a cost of ~10000 > core clock cycles per service call. After this patch, the statistics > overhead is reduce to 22 clock cycles per call. > > Signed-off-by: Mattias Rönnblom <mattias.ronnb...@ericsson.com> Tested by applying the two patch fixes first (http://patches.dpdk.org/project/dpdk/list/?series=23959) and this patchset from Mattias. Acked-by: Harry van Haaren <harry.van.haa...@intel.com> <snip> > static uint32_t rte_service_count; > @@ -138,13 +130,16 @@ rte_service_finalize(void) > rte_service_library_initialized = 0; > } > > -/* returns 1 if service is registered and has not been unregistered > - * Returns 0 if service never registered, or has been unregistered > - */ > -static inline int > +static inline bool > +service_registered(uint32_t id) > +{ > + return rte_services[id].internal_flags & SERVICE_F_REGISTERED; > +} Before we had a !! to flip any value (e.g. binary 100 to just binary 1). SERVICE_F_REGISTERED is binary 1,and internal_flags is u8, so its fine as is. <snip>