On Sun, 26 May 2024 09:21:55 +0200
Mattias Rönnblom <hof...@lysator.liu.se> wrote:

> On 2024-05-08 17:23, Stephen Hemminger wrote:
> > On Wed, 8 May 2024 09:19:02 +0200
> > Mattias Rönnblom <hof...@lysator.liu.se> wrote:
> >   
> >> On 2024-05-04 00:00, Stephen Hemminger wrote:  
> >>> On Fri, 3 May 2024 16:45:47 +0100
> >>> Ferruh Yigit <ferruh.yi...@amd.com> wrote:
> >>>      
> >>>> For stats reset, use an offset instead of zeroing out actual stats 
> >>>> values,
> >>>> get_stats() displays diff between stats and offset.
> >>>> This way stats only updated in datapath and offset only updated in stats
> >>>> reset function. This makes stats reset function more reliable.
> >>>>
> >>>> As stats only written by single thread, we can remove 'volatile' 
> >>>> qualifier
> >>>> which should improve the performance in datapath.
> >>>>
> >>>> While updating around, 'igb_stats' parameter renamed as 'stats'.
> >>>>
> >>>> Signed-off-by: Ferruh Yigit <ferruh.yi...@amd.com>
> >>>> ---
> >>>> Cc: Mattias Rönnblom <mattias.ronnb...@ericsson.com>
> >>>> Cc: Stephen Hemminger <step...@networkplumber.org>
> >>>> Cc: Morten Brørup <m...@smartsharesystems.com>
> >>>>
> >>>> This update triggered by mail list discussion [1].
> >>>>
> >>>> [1]
> >>>> https://inbox.dpdk.org/dev/3b2cf48e-2293-4226-b6cd-5f4dd3969...@lysator.liu.se/
> >>>>   
> >>>
> >>>
> >>> NAK
> >>>
> >>> I did not hear a good argument why atomic or volatile was necessary in 
> >>> the first place.
> >>> Why?
> >>>      
> >>
> >> On the reader side, loads should be atomic.
> >> On the writer side, stores should be atomic.
> >>
> >> Updates (stores) should actually occur in a timely manner. The complete
> >> read-modify-write cycle need not be atomic, since we only have a single
> >> writer. All this for the per-lcore counter case.
> >>
> >> If load or store tearing occurs, the counter values may occasionally
> >> take totally bogus values. I think that should be avoided. Especially
> >> since it will likely come at a very reasonable cost.
> >>
> >>   From what it seems to me, load or store tearing may well occur. GCC may
> >> generate two 32-bit stores for a program-level 64-bit store on 32-bit
> >> x86. If you have constant and immediate-data store instructions,
> >> constant writes may also be end up teared. The kernel documentation has
> >> some example of this. Add LTO, it's not necessarily going to be all that
> >> clear what is storing-a-constant and what is not.
> >>
> >> Maybe you care a little less if statistics are occasionally broken, or
> >> some transient, inconsistent state, but generally they should work, and
> >> they should never have some totally bogus values. So, statistics aren't
> >> snow flakes, mostly just business as usual.
> >>
> >> We can't both have a culture that promotes C11-style parallel
> >> programming, or, at the extreme, push the C11 APIs as-is, and the say
> >> "and btw you don't have to care about the standard when it comes to
> >> statistics".
> >>
> >> We could adopt the Linux kernel's rules, programming model, and APIs
> >> (ignoring legal issues). That would be very old school, maybe somewhat
> >> over-engineered for our purpose, include a fair amount of inline
> >> assembler, and also and may well depend on GCC or GCC-like compilers,
> >> just like what I believe the kernel does.
> >>
> >> We could use something in-between, heavily inspired by C11 but still
> >> with an opportunity to work around compiler issues, library issues, and
> >> extend the API for our use case.
> >>
> >> I agree we shouldn't have to mark statistics _Atomic, or RTE_ATOMIC(),
> >> rte_atomic64_t, or rte_sometimes_atomic_and_sometimes_not64_t. Just
> >> keeping the usual C integer types seems like a better option to me.
> >>  
> >>> Why is this driver special (a snowflake) compared to all the other 
> >>> drivers doing software
> >>> statistics (tap, virtio, xdp, ring, memif, netvsc, vmware)?  
> >>
> >> If a broken piece of code has been copied around, one place is going to
> >> be the first to be fixed.  
> > 
> > 
> > I dislike when any driver does something completely different than valid 
> > precedent.
> > No other driver in DPDK, Vpp, FreeBSD, Linux (and probably Windows) uses 
> > atomic for
> > updating statistics. We even got performance benefit at MS from removing 
> > atomic
> > increment of staistics in internal layers.
> >   
> 
> All of those are using atomic stores when updating the statistics, I'm 
> sure. Assuring a store being atomic is one thing, and assuring the whole 
> read-modify-write cycle is atomic is a completely different (and very 
> much more expensive) thing.
> 
> > The idea of load tearing is crazy talk of integral types. It would break so 
> > many things.
> > It is the kind of stupid compiler thing that would send Linus on a rant and 
> > get
> > the GCC compiler writers in trouble.
> >   
> 
> On 32-bit x86, store tearing for 64-bit integral types is the order of 
> the day.
> 
> For <64 bit types, I agree. The only cases (like the one listed in the 
> kernel documentation), are going to be rare indeed.
> 
> > The DPDK has always favored performance over strict safety guard rails 
> > everywhere.
> > Switching to making every statistic an atomic operation is not in the 
> > spirit of
> > what is required. There is no strict guarantee necessary here.
> >   
> 
> I think we agree but just use different terminology.
> 

Don't think any real conclusion was reached on this.
Af_packet is a less used driver, more concerned about virtio and xdp drivers.
Marking this patch as changes requested, since still under discussion.

Reply via email to