On Wed, 15 May 2024 11:30:45 +0200 Morten Brørup <m...@smartsharesystems.com> wrote:
> With a long term perspective, I consider this patch very useful.
> And its 32 bit implementation can be optimized for various
> architectures/compilers later.
>
>
> In addition, it would be "nice to have" if reset() and fetch() could be
> called from another thread than the thread adding to the counter.
>
> As previously discussed [1], I think it can be done without significantly
> affecting fast path add() performance, by using an "offset" with
> Release-Consume ordering.
>
> [1]:
> https://inbox.dpdk.org/dev/98cbd80474fa8b44bf855df32c47dc35e9f...@smartserver.smartshare.dk/
>

Without a specific driver use case, I am not sure why this added complexity
is needed. If there is a specific example, it can be added later.

Any atomic operation ends up impacting the speculative execution pipeline
on modern CPUs. This version ends up being just a single add instruction
on ARM and x86 64 bit.
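
To make the trade-off concrete, here is a rough sketch of the two ideas being
compared: a plain non-atomic add on the fast path, and an "offset" that
reset() records instead of zeroing the counter. All names (sketch_counter,
sketch_add, sketch_fetch, sketch_reset) are made up for illustration and are
not the API from the patch. The sketch is single-threaded; calling fetch() or
reset() from another thread would additionally need atomic accesses with
suitable memory ordering, which is the extra complexity under discussion.

```c
#include <stdint.h>

/* Hypothetical counter layout, for illustration only. */
struct sketch_counter {
	uint64_t count;  /* only ever written by the owning thread */
	uint64_t offset; /* snapshot of count taken at the last reset */
};

/* Fast path: a plain, non-atomic add with no barriers. */
static inline void
sketch_add(struct sketch_counter *c, uint64_t n)
{
	c->count += n;
}

/* Reported value is what was added since the last reset. */
static inline uint64_t
sketch_fetch(const struct sketch_counter *c)
{
	return c->count - c->offset;
}

/* Reset never writes count, so it does not race with the add itself. */
static inline void
sketch_reset(struct sketch_counter *c)
{
	c->offset = c->count;
}
```

With this layout, add() stays untouched by reset(); the cost of supporting a
second thread would fall on how count is read and how offset is published,
not on the add itself.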