On Sun, 26 May 2024 09:21:55 +0200 Mattias Rönnblom <hof...@lysator.liu.se> wrote:
> On 2024-05-08 17:23, Stephen Hemminger wrote: > > On Wed, 8 May 2024 09:19:02 +0200 > > Mattias Rönnblom <hof...@lysator.liu.se> wrote: > > > >> On 2024-05-04 00:00, Stephen Hemminger wrote: > >>> On Fri, 3 May 2024 16:45:47 +0100 > >>> Ferruh Yigit <ferruh.yi...@amd.com> wrote: > >>> > >>>> For stats reset, use an offset instead of zeroing out actual stats > >>>> values, > >>>> get_stats() displays diff between stats and offset. > >>>> This way stats only updated in datapath and offset only updated in stats > >>>> reset function. This makes stats reset function more reliable. > >>>> > >>>> As stats only written by single thread, we can remove 'volatile' > >>>> qualifier > >>>> which should improve the performance in datapath. > >>>> > >>>> While updating around, 'igb_stats' parameter renamed as 'stats'. > >>>> > >>>> Signed-off-by: Ferruh Yigit <ferruh.yi...@amd.com> > >>>> --- > >>>> Cc: Mattias Rönnblom <mattias.ronnb...@ericsson.com> > >>>> Cc: Stephen Hemminger <step...@networkplumber.org> > >>>> Cc: Morten Brørup <m...@smartsharesystems.com> > >>>> > >>>> This update triggered by mail list discussion [1]. > >>>> > >>>> [1] > >>>> https://inbox.dpdk.org/dev/3b2cf48e-2293-4226-b6cd-5f4dd3969...@lysator.liu.se/ > >>>> > >>> > >>> > >>> NAK > >>> > >>> I did not hear a good argument why atomic or volatile was necessary in > >>> the first place. > >>> Why? > >>> > >> > >> On the reader side, loads should be atomic. > >> On the writer side, stores should be atomic. > >> > >> Updates (stores) should actually occur in a timely manner. The complete > >> read-modify-write cycle need not be atomic, since we only have a single > >> writer. All this for the per-lcore counter case. > >> > >> If load or store tearing occurs, the counter values may occasionally > >> take totally bogus values. I think that should be avoided. Especially > >> since it will likely come at a very reasonable cost. > >> > >> From what it seems to me, load or store tearing may well occur. GCC may > >> generate two 32-bit stores for a program-level 64-bit store on 32-bit > >> x86. If you have constant and immediate-data store instructions, > >> constant writes may also be end up teared. The kernel documentation has > >> some example of this. Add LTO, it's not necessarily going to be all that > >> clear what is storing-a-constant and what is not. > >> > >> Maybe you care a little less if statistics are occasionally broken, or > >> some transient, inconsistent state, but generally they should work, and > >> they should never have some totally bogus values. So, statistics aren't > >> snow flakes, mostly just business as usual. > >> > >> We can't both have a culture that promotes C11-style parallel > >> programming, or, at the extreme, push the C11 APIs as-is, and the say > >> "and btw you don't have to care about the standard when it comes to > >> statistics". > >> > >> We could adopt the Linux kernel's rules, programming model, and APIs > >> (ignoring legal issues). That would be very old school, maybe somewhat > >> over-engineered for our purpose, include a fair amount of inline > >> assembler, and also and may well depend on GCC or GCC-like compilers, > >> just like what I believe the kernel does. > >> > >> We could use something in-between, heavily inspired by C11 but still > >> with an opportunity to work around compiler issues, library issues, and > >> extend the API for our use case. > >> > >> I agree we shouldn't have to mark statistics _Atomic, or RTE_ATOMIC(), > >> rte_atomic64_t, or rte_sometimes_atomic_and_sometimes_not64_t. Just > >> keeping the usual C integer types seems like a better option to me. > >> > >>> Why is this driver special (a snowflake) compared to all the other > >>> drivers doing software > >>> statistics (tap, virtio, xdp, ring, memif, netvsc, vmware)? > >> > >> If a broken piece of code has been copied around, one place is going to > >> be the first to be fixed. > > > > > > I dislike when any driver does something completely different than valid > > precedent. > > No other driver in DPDK, Vpp, FreeBSD, Linux (and probably Windows) uses > > atomic for > > updating statistics. We even got performance benefit at MS from removing > > atomic > > increment of staistics in internal layers. > > > > All of those are using atomic stores when updating the statistics, I'm > sure. Assuring a store being atomic is one thing, and assuring the whole > read-modify-write cycle is atomic is a completely different (and very > much more expensive) thing. > > > The idea of load tearing is crazy talk of integral types. It would break so > > many things. > > It is the kind of stupid compiler thing that would send Linus on a rant and > > get > > the GCC compiler writers in trouble. > > > > On 32-bit x86, store tearing for 64-bit integral types is the order of > the day. > > For <64 bit types, I agree. The only cases (like the one listed in the > kernel documentation), are going to be rare indeed. > > > The DPDK has always favored performance over strict safety guard rails > > everywhere. > > Switching to making every statistic an atomic operation is not in the > > spirit of > > what is required. There is no strict guarantee necessary here. > > > > I think we agree but just use different terminology. > Don't think any real conclusion was reached on this. Af_packet is a less used driver, more concerned about virtio and xdp drivers. Marking this patch as changes requested, since still under discussion.