On 6/25/24 21:27, Mattias Rönnblom wrote:
On Tue, Jun 25, 2024 at 05:29:35PM +0200, Maxime Coquelin wrote:
Hi Mattias,

On 6/20/24 19:57, Mattias Rönnblom wrote:
This patch set makes DPDK library, driver, and application code use the
compiler/libc memcpy() by default when functions in <rte_memcpy.h> are
invoked.

The various custom DPDK rte_memcpy() implementations may be retained
by means of a build-time option.
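
A minimal sketch of how such a build-time switch could look in C. The
names EXAMPLE_USE_CC_MEMCPY and example_memcpy() are hypothetical and
not taken from the patch set, and the byte loop only stands in for an
architecture-specific custom implementation:

```c
#include <stddef.h>
#include <string.h>

/*
 * EXAMPLE_USE_CC_MEMCPY is a hypothetical macro for this sketch; a build
 * system would define it when the compiler/libc memcpy() is selected.
 */
#ifdef EXAMPLE_USE_CC_MEMCPY

/* Forward to the compiler/libc memcpy(). */
static inline void *
example_memcpy(void *dst, const void *src, size_t n)
{
	return memcpy(dst, src, n);
}

#else

/* Stand-in for a custom copy routine: a plain byte-by-byte loop. */
static inline void *
example_memcpy(void *dst, const void *src, size_t n)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	while (n--)
		*d++ = *s++;

	return dst;
}

#endif
```

Callers use the same function name in both configurations, so only the
build option decides which copy routine ends up in the binary.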

This patch set only makes a difference on x86, PPC and ARM. Loongarch
and RISCV already used compiler/libc memcpy().

It indeed makes a difference on x86!

Just tested latest main with and without your series on
Intel(R) Xeon(R) Gold 6438N.

The test is a simple IO loop between a Vhost PMD and a Virtio-user PMD:
# dpdk-testpmd -l 4-6   --file-prefix=virtio1 --no-pci --vdev 
'net_virtio_user0,mac=00:01:02:03:04:05,path=./vhost-net,server=1,mrg_rxbuf=1,in_order=1'
--single-file-segments -- -i
testpmd> start

# dpdk-testpmd -l 8-10   --file-prefix=vhost1 --no-pci --vdev
'net_vhost0,iface=vhost-net,client=1'   --single-file-segments -- -i
testpmd> start tx_first 32

Latest main: 14.5Mpps
Latest main + this series: 10Mpps


I ran the above benchmark on my Raptor Lake desktop (locked to 3.2
GHz), using GCC 12.3.0.

Core use_cc_memcpy Mpps
E    false         9.5
E    true          9.7
P    false         16.4
P    true          13.5

On the P-cores, there's a significant performance regression (about
18%), although not as bad as the one you see on your Sapphire Rapids
Xeon (about 31%). On the E-cores, there's actually a slight
performance gain (about 2%).

The virtio PMD does not directly invoke rte_memcpy() or anything else
from <rte_memcpy.h>, but rather uses memcpy(), so I'm not sure I
understand what's going on here. Does the virtio driver delegate some
performance-critical task to some module that in turn uses
rte_memcpy()?

This is because Vhost is the bottleneck here, not the Virtio driver.
Indeed, the virtqueue memory belongs to the Virtio driver and the
descriptor buffers are Virtio's mbufs, so not many memcpy() calls are
made there.

Vhost, however, is a heavy memcpy() user, as all the descriptor buffers
are copied to/from its mbufs.
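
To illustrate why the copy routine dominates on that side, here is a
rough sketch of a per-packet copy of a descriptor chain into a packet
buffer. It is not the Vhost library's actual data path; struct
example_desc and example_copy_desc_chain() are made up for this example:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified descriptor: a buffer address and its length. */
struct example_desc {
	const void *addr;
	uint32_t len;
};

/*
 * Copy a chain of descriptor buffers into one contiguous packet buffer.
 * Returns the number of bytes copied, or 0 if the chain does not fit
 * into the available room.
 */
static inline size_t
example_copy_desc_chain(void *pkt, size_t pkt_room,
			const struct example_desc *descs, size_t nb_descs)
{
	size_t off = 0;
	size_t i;

	for (i = 0; i < nb_descs; i++) {
		/* off never exceeds pkt_room, so the subtraction cannot wrap. */
		if (descs[i].len > pkt_room - off)
			return 0;

		/* One copy per descriptor, for every packet. */
		memcpy((uint8_t *)pkt + off, descs[i].addr, descs[i].len);
		off += descs[i].len;
	}

	return off;
}
```

With at least one such copy per packet, small differences in memcpy()
cost for typical packet sizes translate directly into the Mpps figures.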

So for me, use_cc_memcpy should be disabled by default.

Regards,
Maxime

This patch set includes a number of fixes in drivers and libraries
which erroneously relied on <rte_memcpy.h> including header files
(i.e., <rte_vect.h>) required by its implementation.

Mattias Rönnblom (13):
    net/i40e: add missing vector API header include
    net/iavf: add missing vector API header include
    net/ice: add missing vector API header include
    net/ixgbe: add missing vector API header include
    net/ngbe: add missing vector API header include
    net/txgbe: add missing vector API header include
    net/virtio: add missing vector API header include
    net/fm10k: add missing vector API header include
    event/dlb2: include headers for vector and memory copy APIs
    net/octeon_ep: add missing vector API header include
    distributor: add missing vector API header include
    fib: add missing vector API header include
    eal: provide option to use compiler memcpy instead of RTE

   config/meson.build                          |  1 +
   doc/guides/rel_notes/release_24_07.rst      | 21 +++++++
   drivers/event/dlb2/dlb2.c                   |  2 +
   drivers/net/fm10k/fm10k_rxtx_vec.c          |  3 +-
   drivers/net/i40e/i40e_rxtx_vec_sse.c        |  3 +-
   drivers/net/iavf/iavf_rxtx_vec_sse.c        |  3 +-
   drivers/net/ice/ice_rxtx_vec_sse.c          |  2 +-
   drivers/net/ixgbe/ixgbe_rxtx_vec_sse.c      |  3 +-
   drivers/net/ngbe/ngbe_rxtx_vec_sse.c        |  3 +-
   drivers/net/octeon_ep/otx_ep_ethdev.c       |  2 +
   drivers/net/txgbe/txgbe_rxtx_vec_sse.c      |  3 +-
   drivers/net/virtio/virtio_rxtx_simple_sse.c |  3 +-
   lib/distributor/rte_distributor.c           |  1 +
   lib/eal/arm/include/rte_memcpy.h            | 10 ++++
   lib/eal/include/generic/rte_memcpy.h        | 61 ++++++++++++++++++---
   lib/eal/loongarch/include/rte_memcpy.h      | 53 ++----------------
   lib/eal/ppc/include/rte_memcpy.h            | 10 ++++
   lib/eal/riscv/include/rte_memcpy.h          | 53 ++----------------
   lib/eal/x86/include/meson.build             |  1 +
   lib/eal/x86/include/rte_memcpy.h            | 11 +++-
   lib/fib/trie.c                              |  1 +
   meson_options.txt                           |  2 +
   22 files changed, 131 insertions(+), 121 deletions(-)



