22/05/2026 00:42, Stephen Hemminger:
> On Thu, 21 May 2026 18:56:31 +0000
> Morten Brørup <[email protected]> wrote:
>
> > The implementation for copying up to 64 bytes does not depend on address
> > alignment with the size of the CPU's vector registers. Nonetheless, the
> > exact same code for copying up to 64 bytes was present in both the aligned
> > copy function and all the CPU vector register size specific variants of
> > the unaligned copy functions.
> > With this patch, the implementation for copying up to 64 bytes was
> > consolidated into one instance, located in the common copy function,
> > before checking alignment requirements.
> > This provides three benefits:
> > 1. No copy-paste in the source code.
> > 2. A performance gain for copying up to 64 bytes, because the
> > address alignment check is avoided in this case.
> > 3. Reduced instruction memory footprint, because the compiler only
> > generates one instance of the function for copying up to 64 bytes, instead
> > of two instances (one in the unaligned copy function, and one in the
> > aligned copy function).
> >
> > Furthermore, __rte_restrict was added to source and destination addresses.
> >
> > Also, the missing implementation of rte_mov48() was added.
> >
> > Until recently, some drivers required disabling stringop-overflow warnings
> > when using rte_memcpy().
> > For some strange reason, these warnings were disabled in the rte_memcpy
> > header file, instead of in the problematic drivers.
> > With series-38174 ("remove use of rte_memcpy from net/intel"), the
> > problematic drivers were updated to use memcpy() instead of rte_memcpy(),
> > so disabling these warnings is no longer required, and was removed.
> >
> > Regarding performance...
> > The memcpy performance test (cache-to-cache copy) shows:
> > Copying up to 15 bytes takes ca. 4.5 cycles, versus ca. 6.5 cycles before.
> > Copying 8 bytes takes 4 cycles, versus 7 cycles before.
> > Copying 16 bytes takes 2 cycles, versus 4 cycles before.
> > Copying 64 bytes takes 4 cycles, versus 7 cycles before.
> >
> > Depends-on: series-38174 ("remove use of rte_memcpy from net/intel")
> >
> > Signed-off-by: Morten Brørup <[email protected]>
> > Acked-by: Bruce Richardson <[email protected]>
> > Acked-by: Konstantin Ananyev <[email protected]>
>
> Here is the full wordy all providers reviews.

[...]

> Summary across 4 provider(s): clean=0 warnings=1 errors=3 failed=0

What is the followup? Do we target DPDK 26.07?
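
For anyone reading the thread without the patch at hand, the consolidation described in the quoted commit message can be sketched roughly as follows. This is only an illustration and not the code from the patch: the names sketch_memcpy(), sketch_copy_le64(), sketch_copy_aligned(), sketch_copy_unaligned() and SKETCH_VEC_SIZE are hypothetical, plain C restrict stands in for __rte_restrict, the two large-copy paths are placeholders that simply call memcpy(), and the overlapping fixed-size moves are one common way to handle small copies, assumed here for the sake of the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical vector register size in bytes; an assumption for this sketch. */
#define SKETCH_VEC_SIZE 32

/*
 * Hypothetical helper: copy n <= 64 bytes with fixed-size moves.
 * The second move of each pair may overlap the first one inside the
 * destination; source and destination themselves must not overlap.
 * Nothing here depends on the alignment of dst or src.
 */
static inline void *
sketch_copy_le64(void *restrict dst, const void *restrict src, size_t n)
{
	uint8_t *d = dst;
	const uint8_t *s = src;

	if (n >= 32) {
		memcpy(d, s, 32);
		memcpy(d + n - 32, s + n - 32, 32);
	} else if (n >= 16) {
		memcpy(d, s, 16);
		memcpy(d + n - 16, s + n - 16, 16);
	} else if (n >= 8) {
		memcpy(d, s, 8);
		memcpy(d + n - 8, s + n - 8, 8);
	} else if (n >= 4) {
		memcpy(d, s, 4);
		memcpy(d + n - 4, s + n - 4, 4);
	} else if (n >= 2) {
		memcpy(d, s, 2);
		memcpy(d + n - 2, s + n - 2, 2);
	} else if (n == 1) {
		d[0] = s[0];
	}
	return dst;
}

/* Placeholder for the aligned large-copy path (not the real implementation). */
static inline void *
sketch_copy_aligned(void *restrict dst, const void *restrict src, size_t n)
{
	return memcpy(dst, src, n);
}

/* Placeholder for the unaligned large-copy path (not the real implementation). */
static inline void *
sketch_copy_unaligned(void *restrict dst, const void *restrict src, size_t n)
{
	return memcpy(dst, src, n);
}

/*
 * Common entry point: the small-copy case is handled once, before the
 * alignment check, so only one instance of it exists and copies of up
 * to 64 bytes never pay for the check.
 */
static inline void *
sketch_memcpy(void *restrict dst, const void *restrict src, size_t n)
{
	if (n <= 64)
		return sketch_copy_le64(dst, src, n);

	if ((((uintptr_t)dst | (uintptr_t)src) & (SKETCH_VEC_SIZE - 1)) == 0)
		return sketch_copy_aligned(dst, src, n);

	return sketch_copy_unaligned(dst, src, n);
}
```

The point of the sketch is only the ordering in sketch_memcpy(): the size test comes first, and the alignment test is reached only for copies larger than 64 bytes.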

