<snip>
> > >
> > > This commit fixes a potential racy add that could occur if
> > > multiple service-lcores were executing the same MT-safe service at
> > > the same time, with service statistics collection enabled.
> > >
> > > Because multiple threads can run and execute the service, the
> > > stats values can have multiple writer threads, which requires
> > > atomic addition for correctness.
> > >
> > > Note that when an MT-unsafe service is executed, a spinlock is
> > > held, so the stats increments are protected. This fact is used to
> > > avoid executing atomic add instructions when they are not required.
> > >
> > > This patch causes a 1.25x increase in cycle cost for polling an
> > > MT-safe service when statistics are enabled. No change was seen
> > > for MT-unsafe services, or when statistics are disabled.
> > >
> > > Reported-by: Mattias Rönnblom <mattias.ronnb...@ericsson.com>
> > > Suggested-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
> > > Suggested-by: Morten Brørup <m...@smartsharesystems.com>
> > > Signed-off-by: Harry van Haaren <harry.van.haa...@intel.com>
> > >
> > > ---
> > >  lib/eal/common/rte_service.c | 10 ++++++++--
> > >  1 file changed, 8 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/lib/eal/common/rte_service.c b/lib/eal/common/rte_service.c
> > > index ef31b1f63c..f045e74ef3 100644
> > > --- a/lib/eal/common/rte_service.c
> > > +++ b/lib/eal/common/rte_service.c
> > > @@ -363,9 +363,15 @@ service_runner_do_callback(struct rte_service_spec_impl *s,
> > >           uint64_t start = rte_rdtsc();
> > >           s->spec.callback(userdata);
> > >           uint64_t end = rte_rdtsc();
> > > -         s->cycles_spent += end - start;
> > > +         uint64_t cycles = end - start;
> > >           cs->calls_per_service[service_idx]++;
> > > -         s->calls++;
> > > +         if (service_mt_safe(s)) {
> > > +                 __atomic_fetch_add(&s->cycles_spent, cycles, __ATOMIC_RELAXED);
> > > +                 __atomic_fetch_add(&s->calls, 1, __ATOMIC_RELAXED);
> > > +         } else {
> > > +                 s->cycles_spent += cycles;
> > > +                 s->calls++;
> > This is still a problem from a reader perspective. It is possible that
> > the writes could be split while a reader is reading the stats. These
> > need to be atomic adds.
> 
> I don't understand what you suggest can go wrong here, Honnappa. If
> you are talking about 64-bit counters on 32-bit architectures, then I
> understand the problem (and have many years of direct experience with
> it myself). Otherwise, I hope you can elaborate or direct me to
> educational material about the issue, considering this a learning
> opportunity. :-)
I am thinking of the case where the 64-bit write is split into two 32-bit
(or more) write operations, either by the compiler or the
micro-architecture. If this were to happen, it would cause race
conditions with the reader.
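
For example (an illustrative sketch, not code from the patch; the
writer/reader functions are hypothetical), on a 32-bit target:

#include <stdint.h>

uint64_t calls; /* shared 64-bit statistics counter */

void writer(void)
{
        /* This plain increment may be emitted as two 32-bit stores:
         * low half first, then high half with carry. */
        calls++;
}

uint64_t reader(void)
{
        /* A plain load racing with the writer can land between the
         * two halves and observe a torn 64-bit value. */
        return calls;
}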

As far as I understand, the compiler does not provide any guarantees on 
generating non-tearing stores unless an atomic builtin/function is used. If we 
have to ensure the micro-architecture does not generate split writes, we need 
to be careful that future code additions do not change the alignment of the 
stats.
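
As a sketch of what non-tearing accesses look like with the same GCC
builtins the patch uses (the struct and helper names here are made up
for illustration):

#include <stdint.h>

struct stats {
        uint64_t calls;
        uint64_t cycles_spent;
};

/* Writer side: relaxed atomic adds prevent both lost updates (from
 * multiple writers) and store tearing. */
static void stats_update(struct stats *st, uint64_t cycles)
{
        __atomic_fetch_add(&st->calls, 1, __ATOMIC_RELAXED);
        __atomic_fetch_add(&st->cycles_spent, cycles, __ATOMIC_RELAXED);
}

/* Reader side: a relaxed atomic load guarantees the load itself is
 * not torn; the stores must be non-tearing as well. */
static uint64_t stats_read_calls(const struct stats *st)
{
        return __atomic_load_n(&st->calls, __ATOMIC_RELAXED);
}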

> 
> >
> > > +         }
> > >   } else
> > >           s->spec.callback(userdata);
> > >  }
> > > --
> > > 2.32.0
