<snip>

> > > This commit fixes a potential racey-add that could occur if multiple
> > > service-lcores were executing the same MT-safe service at the same
> > > time, with service statistics collection enabled.
> > >
> > > Because multiple threads can run and execute the service, the stats
> > > values can have multiple writer threads, resulting in the requirement
> > > of using atomic addition for correctness.
> > >
> > > Note that when a MT unsafe service is executed, a spinlock is held,
> > > so the stats increments are protected. This fact is used to avoid
> > > executing atomic add instructions when not required.
> > >
> > > This patch causes a 1.25x increase in cycle-cost for polling a MT
> > > safe service when statistics are enabled. No change was seen for MT
> > > unsafe services, or when statistics are disabled.
> > >
> > > Reported-by: Mattias Rönnblom <mattias.ronnb...@ericsson.com>
> > > Suggested-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
> > > Suggested-by: Morten Brørup <m...@smartsharesystems.com>
> > > Signed-off-by: Harry van Haaren <harry.van.haa...@intel.com>
> > >
> > > ---
> > >  lib/eal/common/rte_service.c | 10 ++++++++--
> > >  1 file changed, 8 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/lib/eal/common/rte_service.c b/lib/eal/common/rte_service.c
> > > index ef31b1f63c..f045e74ef3 100644
> > > --- a/lib/eal/common/rte_service.c
> > > +++ b/lib/eal/common/rte_service.c
> > > @@ -363,9 +363,15 @@ service_runner_do_callback(struct rte_service_spec_impl *s,
> > >  		uint64_t start = rte_rdtsc();
> > >  		s->spec.callback(userdata);
> > >  		uint64_t end = rte_rdtsc();
> > > -		s->cycles_spent += end - start;
> > > +		uint64_t cycles = end - start;
> > >  		cs->calls_per_service[service_idx]++;
> > > -		s->calls++;
> > > +		if (service_mt_safe(s)) {
> > > +			__atomic_fetch_add(&s->cycles_spent, cycles, __ATOMIC_RELAXED);
> > > +			__atomic_fetch_add(&s->calls, 1, __ATOMIC_RELAXED);
> > > +		} else {
> > > +			s->cycles_spent += cycles;
> > > +			s->calls++;
> >
> > This is still a problem from a reader perspective. It is possible that
> > the writes could be split while a reader is reading the stats. These
> > need to be atomic adds.
>
> I don't understand what you suggest can go wrong here, Honnappa. If you
> are talking about 64 bit counters on 32 bit architectures, then I
> understand the problem (and have many years of direct experience with it
> myself). Otherwise, I hope you can elaborate or direct me to educational
> material about the issue, considering this a learning opportunity. :-)

I am thinking of the case where the 64b write is split into two 32b (or
more) write operations either by the compiler or the micro-architecture.
If this were to happen, it causes race conditions with the reader.

As far as I understand, the compiler does not provide any guarantees on
generating non-tearing stores unless an atomic builtin/function is used.
If we have to ensure the micro-architecture does not generate split
writes, we need to be careful that future code additions do not change
the alignment of the stats.

> > > +		}
> > >  	} else
> > >  		s->spec.callback(userdata);
> > >  }
> > > --
> > > 2.32.0