On Thu, May 09, 2024 at 09:43:16AM +0200, Morten Brørup wrote:
> > From: Stephen Hemminger [mailto:step...@networkplumber.org]
> > Sent: Wednesday, 8 May 2024 22.54
> > 
> > On Wed, 8 May 2024 20:48:06 +0100
> > Ferruh Yigit <ferruh.yi...@amd.com> wrote:
> > 
> > > >
> > > > The idea of load tearing is crazy talk of integral types. It would
> > > > break so many things.
> > > > It is the kind of stupid compiler thing that would send Linus on a
> > > > rant and get the GCC compiler writers in trouble.
> > > >
> > > > The DPDK has always favored performance over strict safety guard
> > > > rails everywhere.
> > > > Switching to making every statistic an atomic operation is not in
> > > > the spirit of what is required. There is no strict guarantee
> > > > necessary here.
> > > >
> > >
> > > I kind of agree with Stephen.
> > >
> > > Thanks Mattias, Morten & Stephen, it was informative discussion. But
> > > for *SW drivers* stats update and reset is not core functionality and I
> > > think we can be OK to get hit on corner cases, instead of
> > > over-engineering or making code more complex.
> > 
> > 
> > I forgot the case of 64 bit values on 32 bit platforms!
> > Mostly because haven't cared about 32 bit for years...
> > 
> > The Linux kernel uses some wrappers to handle this.
> > On 64 bit platforms they become noop.
> > On 32 bit platform, they are protected by a seqlock and updates are
> > wrapped by the sequence count.
> > 
> > If we go this way, then doing similar Noop on 64 bit and atomic or
> > seqlock
> > on 32 bit should be done, but in common helper.
> > 
> > Looking inside FreeBSD, it looks like that has changed over the years as
> > well.
> > 
> >     if_inc_counter
> >             counter_u64_add
> >                     atomic_add_64
> > But the counters are always per-cpu in this case. So although it does
> > use
> > locked operation, will always be uncontended.
> > 
> > 
> > PS: Does DPDK still actually support 32 bit on x86? Can it be dropped
> > this cycle?
> 
> We cannot drop 32 bit architecture support altogether.
> 
> But, unlike the Linux kernel, DPDK doesn't need to support ancient 32 bit 
> architectures.
> If the few 32 bit architectures supported by DPDK provide non-tearing 64 bit 
> loads/stores, we don't need locks (in the fast path) for 64 bit counters.
> 
> In addition to 32 bit x86, DPDK supports ARMv7-A (a 32 bit architecture) and 
> 32 bit ARMv8.
> I don't think DPDK supports any other 32 bit architectures.
> 
> 
> As Mattias mentioned, 32 bit x86 can use xmm registers to provide 64 bit 
> non-tearing load/store.
> 

Testing this a little in godbolt, I see gcc using xmm registers on 32-bit
when updating 64-bit counters. Clang doesn't seem to do so; it instead does
two stores when writing back the 64-bit value. (I tried with both volatile
and non-volatile 64-bit values, just to see whether volatile would encourage
clang to do a single store.)

GCC: https://godbolt.org/z/9eqKfT3hz
Clang: https://godbolt.org/z/PT5EqKn4c
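
For anyone wanting to repeat the comparison locally, here is a minimal
sketch of the kind of function such a test compiles. It is not the exact
source behind the links above; the function names are made up for
illustration. Building it for 32-bit x86 with optimization (e.g. -m32 -O2)
and reading the generated assembly shows whether the write-back is one
64-bit store or two 32-bit ones.

```c
#include <stdint.h>

/* Hypothetical test functions, not taken from DPDK or from the links. */

/* Plain 64-bit counter update: the compiler is free to split the store. */
void count_plain(uint64_t *ctr, uint64_t n)
{
    *ctr += n;
}

/* Same update through a volatile pointer. volatile forces the access to
 * happen, but it does not promise a single 64-bit store on a 32-bit
 * target. */
void count_volatile(volatile uint64_t *ctr, uint64_t n)
{
    *ctr += n;
}
```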

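The "common helper" idea from earlier in the thread could look roughly like
the sketch below. This is only an illustration under assumptions: the
sw_counter_* names are hypothetical and not an existing DPDK API, it assumes
a single writer per counter, and it uses the GCC/Clang __atomic builtins
with relaxed ordering rather than a seqlock. On 64-bit targets relaxed
64-bit loads and stores normally compile to plain moves; on 32-bit targets
the compiler has to pick some non-tearing sequence, which may cost more and
on some toolchains may need libatomic.

```c
#include <stdint.h>

/* Hypothetical counter type; aligned to 8 bytes because the 32-bit x86
 * ABI would otherwise only guarantee 4-byte alignment inside a struct. */
typedef struct {
    _Alignas(8) uint64_t v;
} sw_counter_t;

/* Single-writer add: not an atomic read-modify-write, only the individual
 * load and store are guaranteed not to tear. */
static inline void sw_counter_add(sw_counter_t *c, uint64_t n)
{
    uint64_t cur = __atomic_load_n(&c->v, __ATOMIC_RELAXED);
    __atomic_store_n(&c->v, cur + n, __ATOMIC_RELAXED);
}

/* Reader side: may run on another thread and sees either the old or the
 * new value, never a mix of halves. */
static inline uint64_t sw_counter_read(sw_counter_t *c)
{
    return __atomic_load_n(&c->v, __ATOMIC_RELAXED);
}
```

Counter reset from a second thread is not covered by this sketch; it would
race with the writer's load/add/store sequence.
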
/Bruce
