> From: Stephen Hemminger [mailto:step...@networkplumber.org]
> Sent: Wednesday, 8 May 2024 22.54
> 
> On Wed, 8 May 2024 20:48:06 +0100
> Ferruh Yigit <ferruh.yi...@amd.com> wrote:
> 
> > >
> > > The idea of load tearing is crazy talk of integral types. It would
> break so many things.
> > > It is the kind of stupid compiler thing that would send Linus on a
> rant and get
> > > the GCC compiler writers in trouble.
> > >
> > > The DPDK has always favored performance over strict safety guard
> rails everywhere.
> > > Switching to making every statistic an atomic operation is not in
> the spirit of
> > > what is required. There is no strict guarantee necessary here.
> > >
> >
> > I kind of agree with Stephen.
> >
> > Thanks Mattias, Morten & Stephen, it was informative discussion. But
> for
> > *SW drivers* stats update and reset is not core functionality and I
> > think we can be OK to get hit on corner cases, instead of
> > over-engineering or making code more complex.
> 
> 
> I forgot the case of 64 bit values on 32 bit platforms!
> Mostly because haven't cared about 32 bit for years...
> 
> The Linux kernel uses some wrappers to handle this.
> On 64 bit platforms they become noop.
> On 32 bit platform, they are protected by a seqlock and updates are
> wrapped by the sequence count.
> 
> If we go this way, then doing similar Noop on 64 bit and atomic or
> seqlock
> on 32 bit should be done, but in common helper.
> 
> Looking inside FreeBSD, it looks like that has changed over the years as
> well.
> 
>       if_inc_counter
>               counter_u64_add
>                       atomic_add_64
> But the counters are always per-cpu in this case. So although it does
> use
> locked operation, will always be uncontended.
> 
> 
> PS: Does DPDK still actually support 32 bit on x86? Can it be dropped
> this cycle?

We cannot drop 32 bit architecture support altogether.

But, unlike the Linux kernel, DPDK doesn't need to support ancient 32 bit 
architectures.
If the few 32 bit architectures supported by DPDK provide non-tearing 64 bit 
loads/stores, we don't need locks (in the fast path) for 64 bit counters.

In addition to 32 bit x86, DPDK supports ARMv7-A (a 32 bit architecture) and 32 
bit ARMv8.
I don't think DPDK supports any other 32 bit architectures.


As Mattias mentioned, 32 bit x86 can use xmm registers to provide 64 bit 
non-tearing load/store.

Looking at ARMv7-A documentation, this architecture offers 64 bit non-tearing 
load/store by using two 32-bit registers and double-word Exclusive load and 
store instructions, LDREXD and STREXD. I don't know how costly they are, 
performance wise.

Supporting 64 bit counters has a much broader scope than SW drivers.
Providing a "DPDK standard" design pattern with some utility functions would be 
useful.
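
To make that concrete, here is a rough sketch of what such utility functions 
might look like for a counter with a single writer thread. The names 
counter64_t, counter64_add and counter64_read are made up for illustration and 
are not existing DPDK APIs. It assumes C11 atomics, and it assumes the 
toolchain lowers relaxed 64 bit atomic loads/stores to non-tearing accesses on 
the 32 bit targets; I have not verified what is actually generated there.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical helper type for illustration; not an existing DPDK API. */
typedef struct {
	_Atomic uint64_t v;
} counter64_t;

/*
 * Add to a counter that has exactly one writer thread.
 * Uses a relaxed load followed by a relaxed store instead of an atomic
 * read-modify-write, so no locked instruction is required on x86.
 * Not safe if more than one thread calls it on the same counter.
 */
static inline void
counter64_add(counter64_t *c, uint64_t n)
{
	uint64_t cur = atomic_load_explicit(&c->v, memory_order_relaxed);

	atomic_store_explicit(&c->v, cur + n, memory_order_relaxed);
}

/* Read the counter from any thread; the relaxed load must not tear. */
static inline uint64_t
counter64_read(counter64_t *c)
{
	return atomic_load_explicit(&c->v, memory_order_relaxed);
}
```

On 64 bit platforms this should compile to ordinary loads and stores. On 32 bit 
platforms the cost depends on how the compiler implements the 64 bit atomic 
load/store.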

The af_packet driver could serve as a reference use case.
It maintains both per-thread (per-queue) counters and the 
dev->data->rx_mbuf_alloc_failed counter shared by multiple threads.
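
As a rough illustration of the two cases, assuming C11 atomics: the struct and 
function names below are made up and do not reflect the actual af_packet code. 
The per-queue counter has one writer and can use plain relaxed load/store, 
while the shared counter has multiple writers and needs an atomic 
read-modify-write.

```c
#include <stdatomic.h>
#include <stdint.h>

#define NB_QUEUES 4 /* illustrative value only */

/* Hypothetical layout; not the real af_packet structures. */
struct queue_stats {
	_Atomic uint64_t rx_pkts; /* written only by the queue's own thread */
};

struct dev_stats {
	struct queue_stats q[NB_QUEUES];
	_Atomic uint64_t rx_mbuf_alloc_failed; /* written by multiple threads */
};

/* Single writer per queue: relaxed load + store, no atomic RMW. */
static inline void
queue_rx_add(struct dev_stats *d, unsigned int q, uint64_t n)
{
	uint64_t cur = atomic_load_explicit(&d->q[q].rx_pkts,
					    memory_order_relaxed);

	atomic_store_explicit(&d->q[q].rx_pkts, cur + n, memory_order_relaxed);
}

/* Multiple writers: an atomic add is needed to avoid lost updates. */
static inline void
alloc_failed_inc(struct dev_stats *d)
{
	atomic_fetch_add_explicit(&d->rx_mbuf_alloc_failed, 1,
				  memory_order_relaxed);
}

/* Reader (e.g. a stats_get callback): sum the per-queue counters. */
static inline uint64_t
dev_rx_total(struct dev_stats *d)
{
	uint64_t sum = 0;
	unsigned int i;

	for (i = 0; i < NB_QUEUES; i++)
		sum += atomic_load_explicit(&d->q[i].rx_pkts,
					    memory_order_relaxed);
	return sum;
}
```

The sum is not a consistent snapshot across queues, which is normally 
acceptable for statistics.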
