Re: DPDK patch for Amston Lake SGMII <> GPY215
On 5/24/2024 6:40 AM, Jack.Chen wrote: > Dear DPDK Dev . > > This is PM from Advantech ENPD. We are working on Intel Amston Lake > CPU’s SGMII <> GPY215 PHY for DPDK test but fail. > > We consulted with Intel support team and they suggested we should > consult DPDK community and it should have the patch or code change for > Amston Lake <> GYP215 available for DPDK. > > Could you kindly suggest us the direction of it? I also keep my > Engineering team in this mail loop for further discussion. > > > > Thank you so much > > > > The error message while we testing DPDK > > SoC 2.5G LAN (BIOS set to 1G) with dpdk 24.03.0. It can run testpmd > test, and error message as follows : > > root@fwa-1214-efi:~/dpdk/dpdk-24.03/build/app# ./dpdk-testpmd -c 0xf -n > 1 -a 00:1e.4 --socket-mem=2048,0 -- -i --mbcache=512 --numa > --port-numa-config=0,0 --socket-num=0 --coremask=0x2 --nb-cores=1 > --rxq=1 --txq=1 --portmask=0x1 --rxd=2048 --rxfreet=64 --rxpt=64 > --rxht=8 --rxwt=0 --txd=2048 --txfreet=64 --txpt=64 --txht=0 --txwt=0 > --burst=64 --txrst=64 --rss-ip -a > > EAL: Detected CPU lcores: 4 > > EAL: Detected NUMA nodes: 1 > > EAL: Detected static linkage of DPDK > > EAL: Multi-process socket /var/run/dpdk/rte/mp_socket > > EAL: Selected IOVA mode 'PA' > > TELEMETRY: No legacy callbacks, legacy socket not created > > testpmd: No probed ethernet devices > > Interactive-mode selected > > Fail: input rxq (1) can't be greater than max_rx_queues (0) of port 0 > > EAL: Error - exiting with code: 1 > > Cause: rxq 1 invalid - must be >= 0 && <= 0 > > Hi Jack, According above log device is not detected. What is the Ehternet controller connected to the "GPY215 PHY" and do you know if it has required driver in DPDK for it? If device sits on PCIe bus, you can check it via `lspci`.
RE: [PATCH v3] test/crypto: fix enqueue dequeue callback case
Tested cryptodev_qat_autotest and cryptodev_null_autotest this patch along with https://patches.dpdk.org/project/dpdk/patch/20240416081222.3002268-1-ganapati.kundap...@intel.com/, callbacks are getting called for both NULL pmd and qat pmd. Acked-by: Ganapati Kundapura Thanks, Ganapati > -Original Message- > From: Akhil Goyal > Sent: Friday, May 24, 2024 10:43 PM > To: dev@dpdk.org > Cc: Kundapura, Ganapati ; Gujjar, > Abhinandan S ; fanzhang@gmail.com; > ano...@marvell.com; Akhil Goyal ; sta...@dpdk.org > Subject: [PATCH v3] test/crypto: fix enqueue dequeue callback case > > The enqueue/dequeue callback test cases were using the > test_null_burst_operation() for doing enqueue/dequeue. > But this function is only designed to be run for NULL PMD. > Hence for other PMDs, the callback was not getting called. > Now, separate processing thread is removed, instead NULL crypto operation is > created and processed so that callbacks are called. > Also added a check on a global static variable to verify that the callback is > actually called and fail the case if it is not getting called. > > Fixes: 5523a75af539 ("test/crypto: add case for enqueue/dequeue callbacks") > Cc: sta...@dpdk.org > > Signed-off-by: Akhil Goyal > --- > -v3: replaced AES-SHA1 with NULL crypto and removed separate thread. > > app/test/test_cryptodev.c | 106 -- > > 1 file changed, 89 insertions(+), 17 deletions(-) > > diff --git a/app/test/test_cryptodev.c b/app/test/test_cryptodev.c index > 1703ebccf1..b644e87106 100644 > --- a/app/test/test_cryptodev.c > +++ b/app/test/test_cryptodev.c > @@ -199,6 +199,8 @@ post_process_raw_dp_op(void *user_data, > uint32_t index __rte_unused, > static struct crypto_testsuite_params testsuite_params = { NULL }; struct > crypto_testsuite_params *p_testsuite_params = &testsuite_params; static > struct crypto_unittest_params unittest_params; > +static bool enq_cb_called; > +static bool deq_cb_called; > > int > process_sym_raw_dp_op(uint8_t dev_id, uint16_t qp_id, @@ -14556,6 > +14558,7 @@ test_enq_callback(uint16_t dev_id, uint16_t qp_id, struct > rte_crypto_op **ops, > RTE_SET_USED(ops); > RTE_SET_USED(user_param); > > + enq_cb_called = true; > printf("crypto enqueue callback called\n"); > return nb_ops; > } > @@ -14569,21 +14572,58 @@ test_deq_callback(uint16_t dev_id, uint16_t > qp_id, struct rte_crypto_op **ops, > RTE_SET_USED(ops); > RTE_SET_USED(user_param); > > + deq_cb_called = true; > printf("crypto dequeue callback called\n"); > return nb_ops; > } > > /* > - * Thread using enqueue/dequeue callback with RCU. > + * Process enqueue/dequeue NULL crypto request to verify callback with > RCU. > */ > static int > -test_enqdeq_callback_thread(void *arg) > +test_enqdeq_callback_null_cipher(void) > { > - RTE_SET_USED(arg); > - /* DP thread calls rte_cryptodev_enqueue_burst()/ > - * rte_cryptodev_dequeue_burst() and invokes callback. > - */ > - test_null_burst_operation(); > + struct crypto_testsuite_params *ts_params = &testsuite_params; > + struct crypto_unittest_params *ut_params = &unittest_params; > + > + /* Setup Cipher Parameters */ > + ut_params->cipher_xform.type = RTE_CRYPTO_SYM_XFORM_CIPHER; > + ut_params->cipher_xform.next = &ut_params->auth_xform; > + > + ut_params->cipher_xform.cipher.algo = RTE_CRYPTO_CIPHER_NULL; > + ut_params->cipher_xform.cipher.op = > RTE_CRYPTO_CIPHER_OP_ENCRYPT; > + > + /* Setup HMAC Parameters */ > + ut_params->auth_xform.type = RTE_CRYPTO_SYM_XFORM_AUTH; > + ut_params->auth_xform.next = NULL; > + > + ut_params->auth_xform.auth.algo = RTE_CRYPTO_AUTH_NULL; > + ut_params->auth_xform.auth.op = > RTE_CRYPTO_AUTH_OP_GENERATE; > + > + /* Create Crypto session*/ > + ut_params->sess = rte_cryptodev_sym_session_create(ts_params- > >valid_devs[0], > + &ut_params->auth_xform, ts_params- > >session_mpool); > + TEST_ASSERT_NOT_NULL(ut_params->sess, "Session creation failed"); > + > + ut_params->op = rte_crypto_op_alloc(ts_params->op_mpool, > RTE_CRYPTO_OP_TYPE_SYMMETRIC); > + TEST_ASSERT_NOT_NULL(ut_params->op, "Failed to allocate > symmetric > +crypto op"); > + > + /* Generate an operation for each mbuf in burst */ > + ut_params->ibuf = rte_pktmbuf_alloc(ts_params->mbuf_pool); > + TEST_ASSERT_NOT_NULL(ut_params->ibuf, "Failed to allocate mbuf"); > + > + /* Append some random data */ > + TEST_ASSERT_NOT_NULL(rte_pktmbuf_append(ut_params->ibuf, > sizeof(unsigned int)), > + "no room to append data"); > + > + rte_crypto_op_attach_sym_session(ut_params->op, ut_params- > >sess); > + > + ut_params->op->sym->m_src = ut_params->ibuf; > + > + /* Process crypto operation */ > + TEST_ASSERT_NOT_NULL(process_crypto_request(ts_params- > >valid_devs[0], ut_params->op), > +
[RFC] eal: provide option to use compiler memcpy instead of RTE
Provide build option to have functions in delegate to the standard compiler/libc memcpy(), instead of using the various traditional, handcrafted, per-architecture rte_memcpy() implementations. A new meson build option 'use_cc_memcpy' is added. The default is true. It's not obvious what should be the default, but compiler memcpy() is enabled by default in this RFC so any tests run with this patch use the new approach. One purpose of this RFC is to make it easy to evaluate the costs and benefits of a switch. Only ARM and x86 is implemented. Signed-off-by: Mattias Rönnblom --- config/meson.build | 1 + lib/eal/arm/include/rte_memcpy.h | 10 + lib/eal/include/generic/rte_memcpy.h | 62 lib/eal/x86/include/meson.build | 6 ++- lib/eal/x86/include/rte_memcpy.h | 11 - meson_options.txt| 2 + 6 files changed, 83 insertions(+), 9 deletions(-) diff --git a/config/meson.build b/config/meson.build index 8c8b019c25..456056628e 100644 --- a/config/meson.build +++ b/config/meson.build @@ -353,6 +353,7 @@ endforeach # set other values pulled from the build options dpdk_conf.set('RTE_MAX_ETHPORTS', get_option('max_ethports')) dpdk_conf.set('RTE_LIBEAL_USE_HPET', get_option('use_hpet')) +dpdk_conf.set('RTE_USE_CC_MEMCPY', get_option('use_cc_memcpy')) dpdk_conf.set('RTE_ENABLE_STDATOMIC', get_option('enable_stdatomic')) dpdk_conf.set('RTE_ENABLE_TRACE_FP', get_option('enable_trace_fp')) dpdk_conf.set('RTE_PKTMBUF_HEADROOM', get_option('pkt_mbuf_headroom')) diff --git a/lib/eal/arm/include/rte_memcpy.h b/lib/eal/arm/include/rte_memcpy.h index 47dea9a8cc..e8aff722df 100644 --- a/lib/eal/arm/include/rte_memcpy.h +++ b/lib/eal/arm/include/rte_memcpy.h @@ -5,10 +5,20 @@ #ifndef _RTE_MEMCPY_ARM_H_ #define _RTE_MEMCPY_ARM_H_ +#include + +#ifdef RTE_USE_CC_MEMCPY + +#include + +#else + #ifdef RTE_ARCH_64 #include #else #include #endif +#endif /* RTE_USE_CC_MEMCPY */ + #endif /* _RTE_MEMCPY_ARM_H_ */ diff --git a/lib/eal/include/generic/rte_memcpy.h b/lib/eal/include/generic/rte_memcpy.h index e7f0f8eaa9..f2f66f372d 100644 --- a/lib/eal/include/generic/rte_memcpy.h +++ b/lib/eal/include/generic/rte_memcpy.h @@ -5,12 +5,20 @@ #ifndef _RTE_MEMCPY_H_ #define _RTE_MEMCPY_H_ +#ifdef __cplusplus +extern "C" { +#endif + /** * @file * * Functions for vectorised implementation of memcpy(). */ +#include +#include +#include + /** * Copy 16 bytes from one location to another using optimised * instructions. The locations should not overlap. @@ -35,8 +43,6 @@ rte_mov16(uint8_t *dst, const uint8_t *src); static inline void rte_mov32(uint8_t *dst, const uint8_t *src); -#ifdef __DOXYGEN__ - /** * Copy 48 bytes from one location to another using optimised * instructions. The locations should not overlap. @@ -49,8 +55,6 @@ rte_mov32(uint8_t *dst, const uint8_t *src); static inline void rte_mov48(uint8_t *dst, const uint8_t *src); -#endif /* __DOXYGEN__ */ - /** * Copy 64 bytes from one location to another using optimised * instructions. The locations should not overlap. @@ -87,8 +91,6 @@ rte_mov128(uint8_t *dst, const uint8_t *src); static inline void rte_mov256(uint8_t *dst, const uint8_t *src); -#ifdef __DOXYGEN__ - /** * Copy bytes from one location to another. The locations must not overlap. * @@ -111,6 +113,52 @@ rte_mov256(uint8_t *dst, const uint8_t *src); static void * rte_memcpy(void *dst, const void *src, size_t n); -#endif /* __DOXYGEN__ */ +#ifdef RTE_USE_CC_MEMCPY +static inline void +rte_mov16(uint8_t *dst, const uint8_t *src) +{ + memcpy(dst, src, 16); +} + +static inline void +rte_mov32(uint8_t *dst, const uint8_t *src) +{ + memcpy(dst, src, 32); +} + +static inline void +rte_mov48(uint8_t *dst, const uint8_t *src) +{ + memcpy(dst, src, 48); +} + +static inline void +rte_mov64(uint8_t *dst, const uint8_t *src) +{ + memcpy(dst, src, 64); +} + +static inline void +rte_mov128(uint8_t *dst, const uint8_t *src) +{ + memcpy(dst, src, 128); +} + +static inline void +rte_mov256(uint8_t *dst, const uint8_t *src) +{ + memcpy(dst, src, 256); +} + +static inline void * +rte_memcpy(void *dst, const void *src, size_t n) +{ + return memcpy(dst, src, n); +} +#endif /* RTE_USE_CC_MEMCPY */ + +#ifdef __cplusplus +} +#endif #endif /* _RTE_MEMCPY_H_ */ diff --git a/lib/eal/x86/include/meson.build b/lib/eal/x86/include/meson.build index 52d2f8e969..cf851df60d 100644 --- a/lib/eal/x86/include/meson.build +++ b/lib/eal/x86/include/meson.build @@ -7,7 +7,6 @@ arch_headers = files( 'rte_cpuflags.h', 'rte_cycles.h', 'rte_io.h', -'rte_memcpy.h', 'rte_pause.h', 'rte_power_intrinsics.h', 'rte_prefetch.h', @@ -16,6 +15,11 @@ arch_headers = files( 'rte_spinlock.h', 'rte_vect.h', ) + +if not get_option('use_cc_memcpy') +arch_headers += 'rte_memcpy.h' +endif + ar
Re: [PATCH] net/mlx5/hws: add support for NVGRE matching
Hi, From: Bill Zhou Sent: Monday, May 20, 2024 9:25 AM To: Alex Vesker; Dariusz Sosnowski; Slava Ovsiienko; Ori Kam; Suanming Mou; Matan Azrad Cc: dev@dpdk.org; NBU-Contact-Thomas Monjalon (EXTERNAL); Raslan Darawsheh Subject: [PATCH] net/mlx5/hws: add support for NVGRE matching Add HWS support for RTE_FLOW_ITEM_TYPE_NVGRE item all fields. Signed-off-by: Dong Zhou Acked-by: Alex Vesker Patch applied to next-net-mlx, Kindest regards Raslan Darawsheh
Re: [PATCH] net/mlx5: fix Rx Hash queue resource release in sample flow
Hi, From: Jiawei(Jonny) Wang Sent: Monday, May 20, 2024 6:07 PM To: Bing Zhao; Suanming Mou; Dariusz Sosnowski; Slava Ovsiienko; Ori Kam; Matan Azrad Cc: dev@dpdk.org; Raslan Darawsheh; sta...@dpdk.org Subject: [PATCH] net/mlx5: fix Rx Hash queue resource release in sample flow While the queue/rss action was added to sample action lists, the rx hash queue resource was allocated in the sample action translation to create the sample DR action later. While there's a failure in the flow creation, the Rx hash queue resource of the sample action list was destroyed in the wrong place. This patch adds the checking to release the Rx hash queue resource after the sample action release, to avoid one more extra release if there's a failure. Fixes: ca5eb60ecd5b ("net/mlx5: fix resource release for mirror flow") Cc: sta...@dpdk.org Signed-off-by: Jiawei Wang Reviewed-by: Bing Zhao Patch applied to next-net-mlx, Kindest regards, Raslan Darawsheh
Re: [PATCH] net/mlx5: remove redundant macro
Hi, From: Thomas Monjalon Sent: Friday, May 24, 2024 4:42 PM To: dev@dpdk.org Cc: Dariusz Sosnowski; Slava Ovsiienko; Ori Kam; Suanming Mou; Matan Azrad Subject: [PATCH] net/mlx5: remove redundant macro The macro MLX5_BITSHIFT() is not used anymore, and is redundant with RTE_BIT64(), so it can be removed. Signed-off-by: Thomas Monjalon Patch applied to next-net-mlx, Kindest regards, Raslan Darawsheh
Re: [PATCH] net/mlx5: fix indirect action template error handling
Hi, From: Maayan Kashani Sent: Sunday, May 26, 2024 4:22 PM To: dev@dpdk.org Cc: Maayan Kashani; Suanming Mou; Raslan Darawsheh; sta...@dpdk.org; Dariusz Sosnowski; Slava Ovsiienko; Ori Kam; Matan Azrad Subject: [PATCH] net/mlx5: fix indirect action template error handling For indirect action type, on error case the function jumped to err but returned zero cause rte_errno was not initialized before the jump. It caused no error in table creation. In case reaching an error, if rte_errno is not initialized, it will be set to EINVAL. Now table creation should fail if the translate of the action fails. Added driver log warnings so it can be easy to track failure on shared actions translate. Fixes: 7ab3962d2d2b ("net/mlx5: add indirect HW steering action") Cc: sta...@dpdk.org Signed-off-by: Maayan Kashani Acked-by: Suanming Mou Patch applied to next-net-mlx, Kindest regards, Raslan Darawsheh
Re: [PATCH] doc: add firmware update links in mlx5 guide
Hi, From: Thomas Monjalon Sent: Thursday, May 23, 2024 2:28 PM To: dev@dpdk.org Cc: Dariusz Sosnowski; Slava Ovsiienko; Ori Kam; Suanming Mou; Matan Azrad Subject: [PATCH] doc: add firmware update links in mlx5 guide If using upstream kernel and libraries, there was no direct link to download the firmware and update tool. Firmware update explanations are reorganized and include all links. Signed-off-by: Thomas Monjalon Patch applied to next-net-mlx, Kindest regards, Raslan Darawsheh
RE: [PATCH 1/2] eal: provide macro for GCC builtin constant intrinsic
PING for Review/ACK. Come on fellow reviewers, it's only 5 lines of code! The mempool library cannot build with MSVC without this patch series. Other patches are also being held back, waiting for this MSVC compatible DPDK macro for __builtin_constant_p(). The macro for MSVC can be improved as suggested by Stephen later. > From: Morten Brørup [mailto:m...@smartsharesystems.com] > Sent: Monday, 1 April 2024 10.35 > > > From: Stephen Hemminger [mailto:step...@networkplumber.org] > > Sent: Monday, 1 April 2024 00.03 > > > > On Wed, 20 Mar 2024 14:33:35 -0700 > > Tyler Retzlaff wrote: > > > > > +#ifdef RTE_TOOLCHAIN_MSVC > > > +#define __rte_constant(e) 0 > > > +#else > > > +#define __rte_constant(e) __extension__(__builtin_constant_p(e)) > > > +#endif > > > + > > > > > > I did some looking around and some other project have macros > > for expressing constant expression vs constant. > > > > Implementing this with some form of sizeof math is possible. > > For example in linux/compiler.h > > > > /* > > * This returns a constant expression while determining if an argument > > is > > * a constant expression, most importantly without evaluating the > > argument. > > * Glory to Martin Uecker > > * > > * Details: > > * - sizeof() return an integer constant expression, and does not > > evaluate > > * the value of its operand; it only examines the type of its operand. > > * - The results of comparing two integer constant expressions is also > > * an integer constant expression. > > * - The first literal "8" isn't important. It could be any literal > > value. > > * - The second literal "8" is to avoid warnings about unaligned > > pointers; > > * this could otherwise just be "1". > > * - (long)(x) is used to avoid warnings about 64-bit types on 32-bit > > * architectures. > > * - The C Standard defines "null pointer constant", "(void *)0", as > > * distinct from other void pointers. > > * - If (x) is an integer constant expression, then the "* 0l" resolves > > * it into an integer constant expression of value 0. Since it is cast > > to > > * "void *", this makes the second operand a null pointer constant. > > * - If (x) is not an integer constant expression, then the second > > operand > > * resolves to a void pointer (but not a null pointer constant: the > > value > > * is not an integer constant 0). > > * - The conditional operator's third operand, "(int *)8", is an object > > * pointer (to type "int"). > > * - The behavior (including the return type) of the conditional > > operator > > * ("operand1 ? operand2 : operand3") depends on the kind of > > expressions > > * given for the second and third operands. This is the central > > mechanism > > * of the macro: > > * - When one operand is a null pointer constant (i.e. when x is an > > integer > > * constant expression) and the other is an object pointer (i.e. our > > * third operand), the conditional operator returns the type of the > > * object pointer operand (i.e. "int *). Here, within the sizeof(), > > we > > * would then get: > > * sizeof(*((int *)(...)) == sizeof(int) == 4 > > * - When one operand is a void pointer (i.e. when x is not an integer > > * constant expression) and the other is an object pointer (i.e. our > > * third operand), the conditional operator returns a "void *" type. > > * Here, within the sizeof(), we would then get: > > * sizeof(*((void *)(...)) == sizeof(void) == 1 > > * - The equality comparison to "sizeof(int)" therefore depends on (x): > > * sizeof(int) == sizeof(int) (x) was a constant expression > > * sizeof(int) != sizeof(void)(x) was not a constant expression > > */ > > #define __is_constexpr(x) \ > > (sizeof(int) == sizeof(*(8 ? ((void *)((long)(x) * 0l)) : (int > > *)8))) > > Nice! > If the author is willing to license it under the BSD license, we can copy it > as is. > > We might want to add a couple of build time checks to verify that it does what > is expected; to catch any changes in compiler behavior.
Re: [PATCH 1/2] eal: provide macro for GCC builtin constant intrinsic
On Wed, Mar 20, 2024 at 02:33:35PM -0700, Tyler Retzlaff wrote: > MSVC does not have a __builtin_constant_p intrinsic so provide > __rte_constant(e) that expands false for MSVC and to the intrinsic for > GCC. > > Signed-off-by: Tyler Retzlaff > --- > lib/eal/include/rte_common.h | 6 ++ > 1 file changed, 6 insertions(+) > > diff --git a/lib/eal/include/rte_common.h b/lib/eal/include/rte_common.h > index 298a5c6..d520be6 100644 > --- a/lib/eal/include/rte_common.h > +++ b/lib/eal/include/rte_common.h > @@ -44,6 +44,12 @@ > #endif > #endif > > +#ifdef RTE_TOOLCHAIN_MSVC > +#define __rte_constant(e) 0 > +#else > +#define __rte_constant(e) __extension__(__builtin_constant_p(e)) > +#endif > + Acked-by: Bruce Richardson
RE: [PATCH v12 1/2] mempool cache: add zero-copy get and put functions
> From: Dharmik Thakkar [mailto:dharmikjayesh.thak...@arm.com] > Sent: Friday, 21 July 2023 18.29 > > From: Morten Brørup > > Zero-copy access to mempool caches is beneficial for PMD performance. > Furthermore, having a zero-copy mempool API is considered a precondition > for fixing a certain category of bugs, present in some PMDs: For > performance reasons, some PMDs had bypassed the mempool API in order to > achieve zero-copy access to the mempool cache. This can only be fixed > in those PMDs without a performance regression if the mempool library > offers zero-copy access APIs, so the PMDs can use the proper mempool > API instead of copy-pasting code from the mempool library. > Furthermore, the copy-pasted code in those PMDs has not been kept up to > date with the improvements of the mempool library, so when they bypass > the mempool API, mempool trace is missing and mempool statistics is not > updated. > > Bugzilla ID: 1052 > > Signed-off-by: Morten Brørup > Signed-off-by: Kamalakshitha Aligeri > Signed-off-by: Dharmik Thakkar > Reviewed-by: Ruifeng Wang > Acked-by: Konstantin Ananyev > Acked-by: Chengwen Feng > > --- Patchwork shows this series as failing: https://patchwork.dpdk.org/project/dpdk/list/?series=29003 Please fix and resubmit the series as v13, so we can get it into DPDK.
[PATCH v2] dma/cnxk: add higher chunk size support
From: Pavan Nikhilesh Add support to configure higher chunk size by using the new OPEN_V2 mailbox, this improves performance as the number of mempool allocs are reduced. Add timeout when polling for queue idle timeout. Signed-off-by: Pavan Nikhilesh Signed-off-by: Amit Prakash Shukla --- v2 Changes: - Update release notes. - Use timeout when polling for queue idle state. doc/guides/rel_notes/release_24_07.rst | 6 +++ drivers/common/cnxk/roc_dpi.c | 72 ++ drivers/common/cnxk/roc_dpi.h | 3 ++ drivers/common/cnxk/roc_dpi_priv.h | 3 ++ drivers/common/cnxk/version.map| 2 + drivers/dma/cnxk/cnxk_dmadev.c | 37 - drivers/dma/cnxk/cnxk_dmadev.h | 1 + 7 files changed, 101 insertions(+), 23 deletions(-) diff --git a/doc/guides/rel_notes/release_24_07.rst b/doc/guides/rel_notes/release_24_07.rst index a69f24cf99..60b92e4842 100644 --- a/doc/guides/rel_notes/release_24_07.rst +++ b/doc/guides/rel_notes/release_24_07.rst @@ -55,6 +55,12 @@ New Features Also, make sure to start the actual text at the margin. === +* **Updated Marvell CNXK DMA driver.** + + * Updated DMA driver internal pool to use higher chunk size, effectively +reducing the number of mempool allocs needed, thereby increasing DMA +performance. + Removed Items - diff --git a/drivers/common/cnxk/roc_dpi.c b/drivers/common/cnxk/roc_dpi.c index 1ee777d779..892685d185 100644 --- a/drivers/common/cnxk/roc_dpi.c +++ b/drivers/common/cnxk/roc_dpi.c @@ -38,6 +38,24 @@ send_msg_to_pf(struct plt_pci_addr *pci_addr, const char *value, int size) return 0; } +int +roc_dpi_wait_queue_idle(struct roc_dpi *roc_dpi) +{ + const uint64_t cyc = (DPI_QUEUE_IDLE_TMO_MS * plt_tsc_hz()) / 1E3; + const uint64_t start = plt_tsc_cycles(); + uint64_t reg; + + /* Wait for SADDR to become idle */ + reg = plt_read64(roc_dpi->rbase + DPI_VDMA_SADDR); + while (!(reg & BIT_ULL(63))) { + reg = plt_read64(roc_dpi->rbase + DPI_VDMA_SADDR); + if (plt_tsc_cycles() - start == cyc) + return -ETIMEDOUT; + } + + return 0; +} + int roc_dpi_enable(struct roc_dpi *dpi) { @@ -57,7 +75,6 @@ roc_dpi_configure(struct roc_dpi *roc_dpi, uint32_t chunk_sz, uint64_t aura, uin { struct plt_pci_device *pci_dev; dpi_mbox_msg_t mbox_msg; - uint64_t reg; int rc; if (!roc_dpi) { @@ -68,9 +85,9 @@ roc_dpi_configure(struct roc_dpi *roc_dpi, uint32_t chunk_sz, uint64_t aura, uin pci_dev = roc_dpi->pci_dev; roc_dpi_disable(roc_dpi); - reg = plt_read64(roc_dpi->rbase + DPI_VDMA_SADDR); - while (!(reg & BIT_ULL(63))) - reg = plt_read64(roc_dpi->rbase + DPI_VDMA_SADDR); + rc = roc_dpi_wait_queue_idle(roc_dpi); + if (rc) + return rc; plt_write64(0x0, roc_dpi->rbase + DPI_VDMA_REQQ_CTL); plt_write64(chunk_base, roc_dpi->rbase + DPI_VDMA_SADDR); @@ -87,6 +104,45 @@ roc_dpi_configure(struct roc_dpi *roc_dpi, uint32_t chunk_sz, uint64_t aura, uin if (mbox_msg.s.wqecsoff) mbox_msg.s.wqecs = 1; + rc = send_msg_to_pf(&pci_dev->addr, (const char *)&mbox_msg, sizeof(dpi_mbox_msg_t)); + if (rc < 0) + plt_err("Failed to send mbox message %d to DPI PF, err %d", mbox_msg.s.cmd, rc); + + return rc; +} + +int +roc_dpi_configure_v2(struct roc_dpi *roc_dpi, uint32_t chunk_sz, uint64_t aura, uint64_t chunk_base) +{ + struct plt_pci_device *pci_dev; + dpi_mbox_msg_t mbox_msg; + int rc; + + if (!roc_dpi) { + plt_err("roc_dpi is NULL"); + return -EINVAL; + } + + pci_dev = roc_dpi->pci_dev; + + roc_dpi_disable(roc_dpi); + + rc = roc_dpi_wait_queue_idle(roc_dpi); + if (rc) + return rc; + + plt_write64(0x0, roc_dpi->rbase + DPI_VDMA_REQQ_CTL); + plt_write64(chunk_base, roc_dpi->rbase + DPI_VDMA_SADDR); + mbox_msg.u[0] = 0; + mbox_msg.u[1] = 0; + /* DPI PF driver expects vfid starts from index 0 */ + mbox_msg.s.vfid = roc_dpi->vfid; + mbox_msg.s.cmd = DPI_QUEUE_OPEN_V2; + mbox_msg.s.csize = chunk_sz / 8; + mbox_msg.s.aura = aura; + mbox_msg.s.sso_pf_func = idev_sso_pffunc_get(); + mbox_msg.s.npa_pf_func = idev_npa_pffunc_get(); + rc = send_msg_to_pf(&pci_dev->addr, (const char *)&mbox_msg, sizeof(dpi_mbox_msg_t)); if (rc < 0) @@ -116,13 +172,11 @@ roc_dpi_dev_fini(struct roc_dpi *roc_dpi) { struct plt_pci_device *pci_dev = roc_dpi->pci_dev; dpi_mbox_msg_t mbox_msg; - uint64_t reg; int rc; - /* Wait for SADDR to become idle */ - reg = plt_read64(roc_dpi->rbase + DPI_VDMA_SADDR); - while (!(reg & BIT_ULL(63))) -
FW: [PATCH] mempool: dump includes list of memory chunks
PING for review. @Paul, @Du and @Ferruh, if you think the information provided by this patch would have been useful for your recent work with the mempool, please Review or ACK it. -Morten From: Morten Brørup [mailto:m...@smartsharesystems.com] Sent: Thursday, 16 May 2024 11.00 Added information about the memory chunks holding the objects in the mempool when dumping the status of the mempool to a file. Signed-off-by: Morten Brørup --- lib/mempool/rte_mempool.c | 10 ++ 1 file changed, 10 insertions(+) diff --git a/lib/mempool/rte_mempool.c b/lib/mempool/rte_mempool.c index 12390a2c81..e9a8a5b411 100644 --- a/lib/mempool/rte_mempool.c +++ b/lib/mempool/rte_mempool.c @@ -1230,6 +1230,7 @@ rte_mempool_dump(FILE *f, struct rte_mempool *mp) #endif struct rte_mempool_memhdr *memhdr; struct rte_mempool_ops *ops; + unsigned int n; unsigned common_count; unsigned cache_count; size_t mem_len = 0; @@ -1264,6 +1265,15 @@ rte_mempool_dump(FILE *f, struct rte_mempool *mp) (long double)mem_len / mp->size); } + fprintf(f, " mem_list:\n"); + n = 0; + STAILQ_FOREACH(memhdr, &mp->mem_list, next) { + fprintf(f, "addr[%u]=%p\n", n, memhdr->addr); + fprintf(f, "iova[%u]=0x%" PRIx64 "\n", n, memhdr->iova); + fprintf(f, "len[%u]=%zu\n", n, memhdr->len); + n++; + } + cache_count = rte_mempool_dump_cache(f, mp); common_count = rte_mempool_ops_get_count(mp); if ((cache_count + common_count) > mp->size) -- 2.17.1
More reviewing, please
Dear DPDK community, Many non-PMD patches don't get sufficient reviews/acks, and get stuck in limbo. At a recent DPDK tech board meeting, it was mentioned that: "On the Linux Kernel mailing list, patches are met with discussion, on the DPDK mailing list, patches are met with silence." We need to improve this situation. Please, let's all actively encourage our colleagues to review more patches, and encourage managers to approve spending more time on reviewing. -Morten
[PATCH v4] eal/x86: improve rte_memcpy const size 16 performance
When the rte_memcpy() size is 16, the same 16 bytes are copied twice. In the case where the size is known to be 16 at build tine, omit the duplicate copy. Reduced the amount of effectively copy-pasted code by using #ifdef inside functions instead of outside functions. Suggested-by: Stephen Hemminger Signed-off-by: Morten Brørup Acked-by: Bruce Richardson --- v4: * There are no problems compiling AVX2, only AVX. (Bruce Richardson) v3: * AVX2 is a superset of AVX; for a block of AVX code, testing for AVX suffices. (Bruce Richardson) * Define RTE_MEMCPY_AVX if AVX is available, to avoid copy-pasting the check for older GCC version. (Bruce Richardson) v2: * For GCC, version 11 is required for proper AVX handling; if older GCC version, treat AVX as SSE. Clang does not have this issue. Note: Original code always treated AVX as SSE, regardless of compiler. * Do not add copyright. (Stephen Hemminger) --- lib/eal/x86/include/rte_memcpy.h | 239 +-- 1 file changed, 64 insertions(+), 175 deletions(-) diff --git a/lib/eal/x86/include/rte_memcpy.h b/lib/eal/x86/include/rte_memcpy.h index 72a92290e0..d687aa7756 100644 --- a/lib/eal/x86/include/rte_memcpy.h +++ b/lib/eal/x86/include/rte_memcpy.h @@ -27,6 +27,16 @@ extern "C" { #pragma GCC diagnostic ignored "-Wstringop-overflow" #endif +/* + * GCC older than version 11 doesn't compile AVX properly, so use SSE instead. + * There are no problems with AVX2. + */ +#if defined __AVX2__ +#define RTE_MEMCPY_AVX +#elif defined __AVX__ && !(defined(RTE_TOOLCHAIN_GCC) && (GCC_VERSION < 11)) +#define RTE_MEMCPY_AVX +#endif + /** * Copy bytes from one location to another. The locations must not overlap. * @@ -91,14 +101,6 @@ rte_mov15_or_less(void *dst, const void *src, size_t n) return ret; } -#if defined __AVX512F__ && defined RTE_MEMCPY_AVX512 - -#define ALIGNMENT_MASK 0x3F - -/** - * AVX512 implementation below - */ - /** * Copy 16 bytes from one location to another, * locations should not overlap. @@ -119,10 +121,15 @@ rte_mov16(uint8_t *dst, const uint8_t *src) static __rte_always_inline void rte_mov32(uint8_t *dst, const uint8_t *src) { +#if defined RTE_MEMCPY_AVX __m256i ymm0; ymm0 = _mm256_loadu_si256((const __m256i *)src); _mm256_storeu_si256((__m256i *)dst, ymm0); +#else /* SSE implementation */ + rte_mov16((uint8_t *)dst + 0 * 16, (const uint8_t *)src + 0 * 16); + rte_mov16((uint8_t *)dst + 1 * 16, (const uint8_t *)src + 1 * 16); +#endif } /** @@ -132,10 +139,15 @@ rte_mov32(uint8_t *dst, const uint8_t *src) static __rte_always_inline void rte_mov64(uint8_t *dst, const uint8_t *src) { +#if defined __AVX512F__ && defined RTE_MEMCPY_AVX512 __m512i zmm0; zmm0 = _mm512_loadu_si512((const void *)src); _mm512_storeu_si512((void *)dst, zmm0); +#else /* AVX2, AVX & SSE implementation */ + rte_mov32((uint8_t *)dst + 0 * 32, (const uint8_t *)src + 0 * 32); + rte_mov32((uint8_t *)dst + 1 * 32, (const uint8_t *)src + 1 * 32); +#endif } /** @@ -156,12 +168,18 @@ rte_mov128(uint8_t *dst, const uint8_t *src) static __rte_always_inline void rte_mov256(uint8_t *dst, const uint8_t *src) { - rte_mov64(dst + 0 * 64, src + 0 * 64); - rte_mov64(dst + 1 * 64, src + 1 * 64); - rte_mov64(dst + 2 * 64, src + 2 * 64); - rte_mov64(dst + 3 * 64, src + 3 * 64); + rte_mov128(dst + 0 * 128, src + 0 * 128); + rte_mov128(dst + 1 * 128, src + 1 * 128); } +#if defined __AVX512F__ && defined RTE_MEMCPY_AVX512 + +/** + * AVX512 implementation below + */ + +#define ALIGNMENT_MASK 0x3F + /** * Copy 128-byte blocks from one location to another, * locations should not overlap. @@ -231,12 +249,22 @@ rte_memcpy_generic(void *dst, const void *src, size_t n) /** * Fast way when copy size doesn't exceed 512 bytes */ + if (__builtin_constant_p(n) && n == 32) { + rte_mov32((uint8_t *)dst, (const uint8_t *)src); + return ret; + } if (n <= 32) { rte_mov16((uint8_t *)dst, (const uint8_t *)src); + if (__builtin_constant_p(n) && n == 16) + return ret; /* avoid (harmless) duplicate copy */ rte_mov16((uint8_t *)dst - 16 + n, (const uint8_t *)src - 16 + n); return ret; } + if (__builtin_constant_p(n) && n == 64) { + rte_mov64((uint8_t *)dst, (const uint8_t *)src); + return ret; + } if (n <= 64) { rte_mov32((uint8_t *)dst, (const uint8_t *)src); rte_mov32((uint8_t *)dst - 32 + n, @@ -313,80 +341,13 @@ rte_memcpy_generic(void *dst, const void *src, size_t n) goto COPY_BLOCK_128_BACK63; } -#elif defined __AVX2__ - -#define ALIGNMENT_MASK 0x1F - -/** - * AVX2 implementation below - */ - -/** - * Copy 16 bytes from one location to another
[PATCH v5] eal/x86: improve rte_memcpy const size 16 performance
When the rte_memcpy() size is 16, the same 16 bytes are copied twice. In the case where the size is known to be 16 at build tine, omit the duplicate copy. Reduced the amount of effectively copy-pasted code by using #ifdef inside functions instead of outside functions. Depends-on: series-31578 ("provide toolchain abstracted __builtin_constant_p") Suggested-by: Stephen Hemminger Signed-off-by: Morten Brørup Acked-by: Bruce Richardson --- v5: * Fix for building with MSVC: Use __rte_constant() instead of __builtin_constant_p(). Add dependency on patch providing __rte_constant(). v4: * There are no problems compiling AVX2, only AVX. (Bruce Richardson) v3: * AVX2 is a superset of AVX; for a block of AVX code, testing for AVX suffices. (Bruce Richardson) * Define RTE_MEMCPY_AVX if AVX is available, to avoid copy-pasting the check for older GCC version. (Bruce Richardson) v2: * For GCC, version 11 is required for proper AVX handling; if older GCC version, treat AVX as SSE. Clang does not have this issue. Note: Original code always treated AVX as SSE, regardless of compiler. * Do not add copyright. (Stephen Hemminger) --- lib/eal/x86/include/rte_memcpy.h | 239 +-- 1 file changed, 64 insertions(+), 175 deletions(-) diff --git a/lib/eal/x86/include/rte_memcpy.h b/lib/eal/x86/include/rte_memcpy.h index 72a92290e0..1619a8f296 100644 --- a/lib/eal/x86/include/rte_memcpy.h +++ b/lib/eal/x86/include/rte_memcpy.h @@ -27,6 +27,16 @@ extern "C" { #pragma GCC diagnostic ignored "-Wstringop-overflow" #endif +/* + * GCC older than version 11 doesn't compile AVX properly, so use SSE instead. + * There are no problems with AVX2. + */ +#if defined __AVX2__ +#define RTE_MEMCPY_AVX +#elif defined __AVX__ && !(defined(RTE_TOOLCHAIN_GCC) && (GCC_VERSION < 11)) +#define RTE_MEMCPY_AVX +#endif + /** * Copy bytes from one location to another. The locations must not overlap. * @@ -91,14 +101,6 @@ rte_mov15_or_less(void *dst, const void *src, size_t n) return ret; } -#if defined __AVX512F__ && defined RTE_MEMCPY_AVX512 - -#define ALIGNMENT_MASK 0x3F - -/** - * AVX512 implementation below - */ - /** * Copy 16 bytes from one location to another, * locations should not overlap. @@ -119,10 +121,15 @@ rte_mov16(uint8_t *dst, const uint8_t *src) static __rte_always_inline void rte_mov32(uint8_t *dst, const uint8_t *src) { +#if defined RTE_MEMCPY_AVX __m256i ymm0; ymm0 = _mm256_loadu_si256((const __m256i *)src); _mm256_storeu_si256((__m256i *)dst, ymm0); +#else /* SSE implementation */ + rte_mov16((uint8_t *)dst + 0 * 16, (const uint8_t *)src + 0 * 16); + rte_mov16((uint8_t *)dst + 1 * 16, (const uint8_t *)src + 1 * 16); +#endif } /** @@ -132,10 +139,15 @@ rte_mov32(uint8_t *dst, const uint8_t *src) static __rte_always_inline void rte_mov64(uint8_t *dst, const uint8_t *src) { +#if defined __AVX512F__ && defined RTE_MEMCPY_AVX512 __m512i zmm0; zmm0 = _mm512_loadu_si512((const void *)src); _mm512_storeu_si512((void *)dst, zmm0); +#else /* AVX2, AVX & SSE implementation */ + rte_mov32((uint8_t *)dst + 0 * 32, (const uint8_t *)src + 0 * 32); + rte_mov32((uint8_t *)dst + 1 * 32, (const uint8_t *)src + 1 * 32); +#endif } /** @@ -156,12 +168,18 @@ rte_mov128(uint8_t *dst, const uint8_t *src) static __rte_always_inline void rte_mov256(uint8_t *dst, const uint8_t *src) { - rte_mov64(dst + 0 * 64, src + 0 * 64); - rte_mov64(dst + 1 * 64, src + 1 * 64); - rte_mov64(dst + 2 * 64, src + 2 * 64); - rte_mov64(dst + 3 * 64, src + 3 * 64); + rte_mov128(dst + 0 * 128, src + 0 * 128); + rte_mov128(dst + 1 * 128, src + 1 * 128); } +#if defined __AVX512F__ && defined RTE_MEMCPY_AVX512 + +/** + * AVX512 implementation below + */ + +#define ALIGNMENT_MASK 0x3F + /** * Copy 128-byte blocks from one location to another, * locations should not overlap. @@ -231,12 +249,22 @@ rte_memcpy_generic(void *dst, const void *src, size_t n) /** * Fast way when copy size doesn't exceed 512 bytes */ + if (__rte_constant(n) && n == 32) { + rte_mov32((uint8_t *)dst, (const uint8_t *)src); + return ret; + } if (n <= 32) { rte_mov16((uint8_t *)dst, (const uint8_t *)src); + if (__rte_constant(n) && n == 16) + return ret; /* avoid (harmless) duplicate copy */ rte_mov16((uint8_t *)dst - 16 + n, (const uint8_t *)src - 16 + n); return ret; } + if (__rte_constant(n) && n == 64) { + rte_mov64((uint8_t *)dst, (const uint8_t *)src); + return ret; + } if (n <= 64) { rte_mov32((uint8_t *)dst, (const uint8_t *)src); rte_mov32((uint8_t *)dst - 32 + n, @@ -313,80 +341,13 @@ rte_memcpy_generic(void *dst, const void
RE: [PATCH v5] eal/x86: improve rte_memcpy const size 16 performance
Recheck-request: iol-testing
RE: [PATCH v2] doc: add dma perf feature details
Hi, Gentle Ping. Thanks, Amit Shukla > -Original Message- > From: Amit Prakash Shukla > Sent: Tuesday, March 19, 2024 12:16 AM > To: Cheng Jiang ; Chengwen Feng > > Cc: dev@dpdk.org; Jerin Jacob ; Vamsi Krishna Attunuru > ; Anoob Joseph ; > Gowrishankar Muthukrishnan ; Amit Prakash > Shukla > Subject: [PATCH v2] doc: add dma perf feature details > > Update dma perf test document with below support features: > 1. Memory-to-device and device-to-memory copy. > 2. Skip support. > 3. Scatter-gather support. > > Signed-off-by: Amit Prakash Shukla > --- > v2: > - Rebased the patch. > > doc/guides/tools/dmaperf.rst | 89 ++- > - > 1 file changed, 64 insertions(+), 25 deletions(-) >
Re: [PATCH] common/cnxk: restore segregation of logs based on module
On Tue, Apr 23, 2024 at 4:15 PM Anoob Joseph wrote: > > Originally the logs were segregated under various labels which could be > selectively enabled. It was changed to use 'pmd.common.cnxk' while > changing the macro used for registering logging. Address the same by > restoring the segregation. > > Current logs: > ... > logtype3 > pmd.common.cnxk > pmd.common.iavf > ... > > Changed to: > ... > logtype3 > pmd.common.cnxk.base > pmd.common.cnxk.crypto > pmd.common.cnxk.dpi > pmd.common.cnxk.esw > pmd.common.cnxk.event > pmd.common.cnxk.flow > pmd.common.cnxk.mbox > pmd.common.cnxk.mempool > pmd.common.cnxk.ml > pmd.common.cnxk.nix > pmd.common.cnxk.ree > pmd.common.cnxk.rep > pmd.common.cnxk.timer > pmd.common.cnxk.tm > pmd.common.iavf > ... > > Updated documentation also to reflect the same. > > Fixes: 233692f550a1 ("dma/cnxk: rework DMA driver") > > Signed-off-by: Anoob Joseph Applied to dpdk-next-net-mrvl/for-main. Thanks
[PATCH v4 0/7] Add ODM DMA device
Add Odyssey ODM DMA device. This PMD abstracts ODM hardware unit on Odyssey SoC which can perform mem to mem copies. The hardware unit can support upto 32 queues (vchan) and 16 VFs. It supports 'fill' operation with specific values. It also supports SG mode of operation with upto 4 src pointers and 4 destination pointers. The PMD is tested with both unit tests and performance applications. Changes in v4 - Added release notes - Addressed review comments from Jerin Changes in v3 - Addressed build failure with stdatomic stage in CI Changes in v2 - Addressed build failure in CI - Moved update to usertools as separate patch Anoob Joseph (2): dma/odm: add framework for ODM DMA device dma/odm: add hardware defines Gowrishankar Muthukrishnan (3): dma/odm: add dev init and fini dma/odm: add device ops dma/odm: add stats Vidya Sagar Velumuri (2): dma/odm: add copy and copy sg ops dma/odm: add remaining ops MAINTAINERS| 7 + doc/guides/dmadevs/index.rst | 1 + doc/guides/dmadevs/odm.rst | 92 doc/guides/rel_notes/release_24_07.rst | 4 + drivers/dma/meson.build| 1 + drivers/dma/odm/meson.build| 14 + drivers/dma/odm/odm.c | 237 drivers/dma/odm/odm.h | 203 +++ drivers/dma/odm/odm_dmadev.c | 717 + drivers/dma/odm/odm_priv.h | 49 ++ 10 files changed, 1325 insertions(+) create mode 100644 doc/guides/dmadevs/odm.rst create mode 100644 drivers/dma/odm/meson.build create mode 100644 drivers/dma/odm/odm.c create mode 100644 drivers/dma/odm/odm.h create mode 100644 drivers/dma/odm/odm_dmadev.c create mode 100644 drivers/dma/odm/odm_priv.h -- 2.45.1
[PATCH v4 1/7] dma/odm: add framework for ODM DMA device
Add framework for Odyssey ODM DMA device. Signed-off-by: Anoob Joseph Signed-off-by: Gowrishankar Muthukrishnan Signed-off-by: Vidya Sagar Velumuri --- MAINTAINERS | 6 +++ drivers/dma/meson.build | 1 + drivers/dma/odm/meson.build | 14 +++ drivers/dma/odm/odm.h| 29 ++ drivers/dma/odm/odm_dmadev.c | 74 5 files changed, 124 insertions(+) create mode 100644 drivers/dma/odm/meson.build create mode 100644 drivers/dma/odm/odm.h create mode 100644 drivers/dma/odm/odm_dmadev.c diff --git a/MAINTAINERS b/MAINTAINERS index c9adff9846..b581207a9a 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -1269,6 +1269,12 @@ T: git://dpdk.org/next/dpdk-next-net-mrvl F: drivers/dma/cnxk/ F: doc/guides/dmadevs/cnxk.rst +Marvell Odyssey ODM DMA +M: Gowrishankar Muthukrishnan +M: Vidya Sagar Velumuri +T: git://dpdk.org/next/dpdk-next-net-mrvl +F: drivers/dma/odm/ + NXP DPAA DMA M: Gagandeep Singh M: Sachin Saxena diff --git a/drivers/dma/meson.build b/drivers/dma/meson.build index 582654ea1b..358132759a 100644 --- a/drivers/dma/meson.build +++ b/drivers/dma/meson.build @@ -8,6 +8,7 @@ drivers = [ 'hisilicon', 'idxd', 'ioat', +'odm', 'skeleton', ] std_deps = ['dmadev'] diff --git a/drivers/dma/odm/meson.build b/drivers/dma/odm/meson.build new file mode 100644 index 00..227b10c890 --- /dev/null +++ b/drivers/dma/odm/meson.build @@ -0,0 +1,14 @@ +# SPDX-License-Identifier: BSD-3-Clause +# Copyright(C) 2024 Marvell. + +if not is_linux or not dpdk_conf.get('RTE_ARCH_64') +build = false +reason = 'only supported on 64-bit Linux' +subdir_done() +endif + +deps += ['bus_pci', 'dmadev', 'eal', 'mempool', 'pci'] + +sources = files('odm_dmadev.c') + +pmd_supports_disable_iova_as_pa = true diff --git a/drivers/dma/odm/odm.h b/drivers/dma/odm/odm.h new file mode 100644 index 00..aeeb6f9e9a --- /dev/null +++ b/drivers/dma/odm/odm.h @@ -0,0 +1,29 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(C) 2024 Marvell. + */ + +#ifndef _ODM_H_ +#define _ODM_H_ + +#include + +extern int odm_logtype; + +#define odm_err(...) \ + rte_log(RTE_LOG_ERR, odm_logtype, \ + RTE_FMT("%s(): %u" RTE_FMT_HEAD(__VA_ARGS__, ), __func__, __LINE__,\ + RTE_FMT_TAIL(__VA_ARGS__, ))) +#define odm_info(...) \ + rte_log(RTE_LOG_INFO, odm_logtype, \ + RTE_FMT("%s(): %u" RTE_FMT_HEAD(__VA_ARGS__, ), __func__, __LINE__,\ + RTE_FMT_TAIL(__VA_ARGS__, ))) + +struct __rte_cache_aligned odm_dev { + struct rte_pci_device *pci_dev; + uint8_t *rbase; + uint16_t vfid; + uint8_t max_qs; + uint8_t num_qs; +}; + +#endif /* _ODM_H_ */ diff --git a/drivers/dma/odm/odm_dmadev.c b/drivers/dma/odm/odm_dmadev.c new file mode 100644 index 00..cc3342cf7b --- /dev/null +++ b/drivers/dma/odm/odm_dmadev.c @@ -0,0 +1,74 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(C) 2024 Marvell. + */ + +#include + +#include +#include +#include +#include +#include +#include + +#include "odm.h" + +#define PCI_VENDOR_ID_CAVIUM0x177D +#define PCI_DEVID_ODYSSEY_ODM_VF 0xA08C +#define PCI_DRIVER_NAME dma_odm + +static int +odm_dmadev_probe(struct rte_pci_driver *pci_drv __rte_unused, struct rte_pci_device *pci_dev) +{ + char name[RTE_DEV_NAME_MAX_LEN]; + struct odm_dev *odm = NULL; + struct rte_dma_dev *dmadev; + + if (!pci_dev->mem_resource[0].addr) + return -ENODEV; + + memset(name, 0, sizeof(name)); + rte_pci_device_name(&pci_dev->addr, name, sizeof(name)); + + dmadev = rte_dma_pmd_allocate(name, pci_dev->device.numa_node, sizeof(*odm)); + if (dmadev == NULL) { + odm_err("DMA device allocation failed for %s", name); + return -ENOMEM; + } + + odm_info("DMA device %s probed", name); + + return 0; +} + +static int +odm_dmadev_remove(struct rte_pci_device *pci_dev) +{ + char name[RTE_DEV_NAME_MAX_LEN]; + + memset(name, 0, sizeof(name)); + rte_pci_device_name(&pci_dev->addr, name, sizeof(name)); + + return rte_dma_pmd_release(name); +} + +static const struct rte_pci_id odm_dma_pci_map[] = { + { + RTE_PCI_DEVICE(PCI_VENDOR_ID_CAVIUM, PCI_DEVID_ODYSSEY_ODM_VF) + }, + { + .vendor_id = 0, + }, +}; + +static struct rte_pci_driver odm_dmadev = { + .id_table = odm_dma_pci_map, + .drv_flags = RTE_PCI_DRV_NEED_MAPPING, + .probe = odm_dmadev_probe, + .remove = odm_dmadev_remove, +}; +
[PATCH v4 2/7] dma/odm: add hardware defines
Add ODM registers and structures. Add mailbox structs as well. Signed-off-by: Anoob Joseph Signed-off-by: Gowrishankar Muthukrishnan Signed-off-by: Vidya Sagar Velumuri --- drivers/dma/odm/odm.h | 106 + drivers/dma/odm/odm_priv.h | 49 + 2 files changed, 155 insertions(+) create mode 100644 drivers/dma/odm/odm_priv.h diff --git a/drivers/dma/odm/odm.h b/drivers/dma/odm/odm.h index aeeb6f9e9a..8cc3e0de44 100644 --- a/drivers/dma/odm/odm.h +++ b/drivers/dma/odm/odm.h @@ -9,6 +9,46 @@ extern int odm_logtype; +/* ODM VF register offsets from VF_BAR0 */ +#define ODM_VDMA_EN(x) (0x00 | (x << 3)) +#define ODM_VDMA_REQQ_CTL(x) (0x80 | (x << 3)) +#define ODM_VDMA_DBELL(x) (0x100 | (x << 3)) +#define ODM_VDMA_RING_CFG(x) (0x180 | (x << 3)) +#define ODM_VDMA_IRING_BADDR(x) (0x200 | (x << 3)) +#define ODM_VDMA_CRING_BADDR(x) (0x280 | (x << 3)) +#define ODM_VDMA_COUNTS(x) (0x300 | (x << 3)) +#define ODM_VDMA_IRING_NADDR(x) (0x380 | (x << 3)) +#define ODM_VDMA_CRING_NADDR(x) (0x400 | (x << 3)) +#define ODM_VDMA_IRING_DBG(x) (0x480 | (x << 3)) +#define ODM_VDMA_CNT(x)(0x580 | (x << 3)) +#define ODM_VF_INT (0x1000) +#define ODM_VF_INT_W1S (0x1008) +#define ODM_VF_INT_ENA_W1C (0x1010) +#define ODM_VF_INT_ENA_W1S (0x1018) +#define ODM_MBOX_VF_PF_DATA(i) (0x2000 | (i << 3)) +#define ODM_MBOX_RETRY_CNT (0xfff) +#define ODM_MBOX_ERR_CODE_MAX (0xf) +#define ODM_IRING_IDLE_WAIT_CNT (0xfff) + +/* + * Enumeration odm_hdr_xtype_e + * + * ODM Transfer Type Enumeration + * Enumerates the pointer type in ODM_DMA_INSTR_HDR_S[XTYPE] + */ +#define ODM_XTYPE_INTERNAL 2 +#define ODM_XTYPE_FILL0 4 +#define ODM_XTYPE_FILL1 5 + +/* + * ODM Header completion type enumeration + * Enumerates the completion type in ODM_DMA_INSTR_HDR_S[CT] + */ +#define ODM_HDR_CT_CW_CA 0x0 +#define ODM_HDR_CT_CW_NC 0x1 + +#define ODM_MAX_QUEUES_PER_DEV 16 + #define odm_err(...) \ rte_log(RTE_LOG_ERR, odm_logtype, \ RTE_FMT("%s(): %u" RTE_FMT_HEAD(__VA_ARGS__, ), __func__, __LINE__,\ @@ -18,6 +58,72 @@ extern int odm_logtype; RTE_FMT("%s(): %u" RTE_FMT_HEAD(__VA_ARGS__, ), __func__, __LINE__,\ RTE_FMT_TAIL(__VA_ARGS__, ))) +/* + * Structure odm_instr_hdr_s for ODM + * + * ODM DMA Instruction Header Format + */ +union odm_instr_hdr_s { + uint64_t u; + struct odm_instr_hdr { + uint64_t nfst : 3; + uint64_t reserved_3 : 1; + uint64_t nlst : 3; + uint64_t reserved_7_9 : 3; + uint64_t ct : 2; + uint64_t stse : 1; + uint64_t reserved_13_28 : 16; + uint64_t sts : 1; + uint64_t reserved_30_49 : 20; + uint64_t xtype : 3; + uint64_t reserved_53_63 : 11; + } s; +}; + +/* ODM Completion Entry Structure */ +union odm_cmpl_ent_s { + uint32_t u; + struct odm_cmpl_ent { + uint32_t cmp_code : 8; + uint32_t rsvd : 23; + uint32_t valid : 1; + } s; +}; + +/* ODM DMA Ring Configuration Register */ +union odm_vdma_ring_cfg_s { + uint64_t u; + struct { + uint64_t isize : 8; + uint64_t rsvd_8_15 : 8; + uint64_t csize : 8; + uint64_t rsvd_24_63 : 40; + } s; +}; + +/* ODM DMA Instruction Ring DBG */ +union odm_vdma_iring_dbg_s { + uint64_t u; + struct { + uint64_t dbell_cnt : 32; + uint64_t offset : 16; + uint64_t rsvd_48_62 : 15; + uint64_t iwbusy : 1; + } s; +}; + +/* ODM DMA Counts */ +union odm_vdma_counts_s { + uint64_t u; + struct { + uint64_t dbell : 32; + uint64_t buf_used_cnt : 9; + uint64_t rsvd_41_43 : 3; + uint64_t rsvd_buf_used_cnt : 3; + uint64_t rsvd_47_63 : 17; + } s; +}; + struct __rte_cache_aligned odm_dev { struct rte_pci_device *pci_dev; uint8_t *rbase; diff --git a/drivers/dma/odm/odm_priv.h b/drivers/dma/odm/odm_priv.h new file mode 100644 index 00..1878f4d9a6 --- /dev/null +++ b/drivers/dma/odm/odm_priv.h @@ -0,0 +1,49 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(C) 2024 Marvell. + */ + +#ifndef _ODM_PRIV_H_ +#define _ODM_PRIV_H_ + +#define ODM_MAX_VFS16 +#define ODM_MAX_QUEUES 32 + +#define ODM_CMD_QUEUE_SIZE 4096 + +#define ODM_DEV_INIT 0x1 +#define ODM_DEV_CLOSE 0x2 +#define ODM_QUEUE_OPEN 0x3 +#define ODM_QUEUE_CLOSE 0x4 +#define ODM_REG_DUMP 0x5 + +struct odm_mbox_dev_msg { + /* Response code */ + uint64_t rsp : 8; + /* Number of VFs */
[PATCH v4 3/7] dma/odm: add dev init and fini
From: Gowrishankar Muthukrishnan Add ODM device init and fini. Signed-off-by: Anoob Joseph Signed-off-by: Gowrishankar Muthukrishnan Signed-off-by: Vidya Sagar Velumuri --- drivers/dma/odm/meson.build | 2 +- drivers/dma/odm/odm.c| 97 drivers/dma/odm/odm.h| 10 drivers/dma/odm/odm_dmadev.c | 13 + 4 files changed, 121 insertions(+), 1 deletion(-) create mode 100644 drivers/dma/odm/odm.c diff --git a/drivers/dma/odm/meson.build b/drivers/dma/odm/meson.build index 227b10c890..d597762d37 100644 --- a/drivers/dma/odm/meson.build +++ b/drivers/dma/odm/meson.build @@ -9,6 +9,6 @@ endif deps += ['bus_pci', 'dmadev', 'eal', 'mempool', 'pci'] -sources = files('odm_dmadev.c') +sources = files('odm_dmadev.c', 'odm.c') pmd_supports_disable_iova_as_pa = true diff --git a/drivers/dma/odm/odm.c b/drivers/dma/odm/odm.c new file mode 100644 index 00..c0963da451 --- /dev/null +++ b/drivers/dma/odm/odm.c @@ -0,0 +1,97 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(C) 2024 Marvell. + */ + +#include + +#include + +#include + +#include "odm.h" +#include "odm_priv.h" + +static void +odm_vchan_resc_free(struct odm_dev *odm, int qno) +{ + RTE_SET_USED(odm); + RTE_SET_USED(qno); +} + +static int +send_mbox_to_pf(struct odm_dev *odm, union odm_mbox_msg *msg, union odm_mbox_msg *rsp) +{ + int retry_cnt = ODM_MBOX_RETRY_CNT; + union odm_mbox_msg pf_msg; + + msg->d.err = ODM_MBOX_ERR_CODE_MAX; + odm_write64(msg->u[0], odm->rbase + ODM_MBOX_VF_PF_DATA(0)); + odm_write64(msg->u[1], odm->rbase + ODM_MBOX_VF_PF_DATA(1)); + + pf_msg.u[0] = 0; + pf_msg.u[1] = 0; + pf_msg.u[0] = odm_read64(odm->rbase + ODM_MBOX_VF_PF_DATA(0)); + + while (pf_msg.d.rsp == 0 && retry_cnt > 0) { + pf_msg.u[0] = odm_read64(odm->rbase + ODM_MBOX_VF_PF_DATA(0)); + --retry_cnt; + } + + if (retry_cnt <= 0) + return -EBADE; + + pf_msg.u[1] = odm_read64(odm->rbase + ODM_MBOX_VF_PF_DATA(1)); + + if (rsp) { + rsp->u[0] = pf_msg.u[0]; + rsp->u[1] = pf_msg.u[1]; + } + + if (pf_msg.d.rsp == msg->d.err && pf_msg.d.err != 0) + return -EBADE; + + return 0; +} + +int +odm_dev_init(struct odm_dev *odm) +{ + struct rte_pci_device *pci_dev = odm->pci_dev; + union odm_mbox_msg mbox_msg; + uint16_t vfid; + int rc; + + odm->rbase = pci_dev->mem_resource[0].addr; + vfid = ((pci_dev->addr.devid & 0x1F) << 3) | (pci_dev->addr.function & 0x7); + vfid -= 1; + odm->vfid = vfid; + odm->num_qs = 0; + + mbox_msg.u[0] = 0; + mbox_msg.u[1] = 0; + mbox_msg.q.vfid = odm->vfid; + mbox_msg.q.cmd = ODM_DEV_INIT; + rc = send_mbox_to_pf(odm, &mbox_msg, &mbox_msg); + if (!rc) + odm->max_qs = 1 << (4 - mbox_msg.d.nvfs); + + return rc; +} + +int +odm_dev_fini(struct odm_dev *odm) +{ + union odm_mbox_msg mbox_msg; + int qno, rc = 0; + + mbox_msg.u[0] = 0; + mbox_msg.u[1] = 0; + mbox_msg.q.vfid = odm->vfid; + mbox_msg.q.cmd = ODM_DEV_CLOSE; + rc = send_mbox_to_pf(odm, &mbox_msg, &mbox_msg); + + for (qno = 0; qno < odm->num_qs; qno++) + odm_vchan_resc_free(odm, qno); + + return rc; +} diff --git a/drivers/dma/odm/odm.h b/drivers/dma/odm/odm.h index 8cc3e0de44..0bf0c6345b 100644 --- a/drivers/dma/odm/odm.h +++ b/drivers/dma/odm/odm.h @@ -5,6 +5,10 @@ #ifndef _ODM_H_ #define _ODM_H_ +#include + +#include +#include #include extern int odm_logtype; @@ -49,6 +53,9 @@ extern int odm_logtype; #define ODM_MAX_QUEUES_PER_DEV 16 +#define odm_read64(addr) rte_read64_relaxed((volatile void *)(addr)) +#define odm_write64(val, addr) rte_write64_relaxed((val), (volatile void *)(addr)) + #define odm_err(...) \ rte_log(RTE_LOG_ERR, odm_logtype, \ RTE_FMT("%s(): %u" RTE_FMT_HEAD(__VA_ARGS__, ), __func__, __LINE__,\ @@ -132,4 +139,7 @@ struct __rte_cache_aligned odm_dev { uint8_t num_qs; }; +int odm_dev_init(struct odm_dev *odm); +int odm_dev_fini(struct odm_dev *odm); + #endif /* _ODM_H_ */ diff --git a/drivers/dma/odm/odm_dmadev.c b/drivers/dma/odm/odm_dmadev.c index cc3342cf7b..bef335c10c 100644 --- a/drivers/dma/odm/odm_dmadev.c +++ b/drivers/dma/odm/odm_dmadev.c @@ -23,6 +23,7 @@ odm_dmadev_probe(struct rte_pci_driver *pci_drv __rte_unused, struct rte_pci_dev char name[RTE_DEV_NAME_MAX_LEN]; struct odm_dev *odm = NULL; struct rte_dma_dev *dmadev; + int rc; if (!pci_dev->mem_resource[0].addr) return -ENODEV; @@ -37,8 +38,20 @@ odm_dmadev_probe(struct rte_pci_driver *pci_drv __rte_unused,
[PATCH v4 4/7] dma/odm: add device ops
From: Gowrishankar Muthukrishnan Add DMA device control ops. Signed-off-by: Anoob Joseph Signed-off-by: Gowrishankar Muthukrishnan Signed-off-by: Vidya Sagar Velumuri --- drivers/dma/odm/odm.c| 144 ++- drivers/dma/odm/odm.h| 54 + drivers/dma/odm/odm_dmadev.c | 85 + 3 files changed, 281 insertions(+), 2 deletions(-) diff --git a/drivers/dma/odm/odm.c b/drivers/dma/odm/odm.c index c0963da451..270808f4df 100644 --- a/drivers/dma/odm/odm.c +++ b/drivers/dma/odm/odm.c @@ -7,6 +7,7 @@ #include #include +#include #include "odm.h" #include "odm_priv.h" @@ -14,8 +15,15 @@ static void odm_vchan_resc_free(struct odm_dev *odm, int qno) { - RTE_SET_USED(odm); - RTE_SET_USED(qno); + struct odm_queue *vq = &odm->vq[qno]; + + rte_memzone_free(vq->iring_mz); + rte_memzone_free(vq->cring_mz); + rte_free(vq->extra_ins_sz); + + vq->iring_mz = NULL; + vq->cring_mz = NULL; + vq->extra_ins_sz = NULL; } static int @@ -53,6 +61,138 @@ send_mbox_to_pf(struct odm_dev *odm, union odm_mbox_msg *msg, union odm_mbox_msg return 0; } +static int +odm_queue_ring_config(struct odm_dev *odm, int vchan, int isize, int csize) +{ + union odm_vdma_ring_cfg_s ring_cfg = {0}; + struct odm_queue *vq = &odm->vq[vchan]; + + if (vq->iring_mz == NULL || vq->cring_mz == NULL) + return -EINVAL; + + ring_cfg.s.isize = (isize / 1024) - 1; + ring_cfg.s.csize = (csize / 1024) - 1; + + odm_write64(ring_cfg.u, odm->rbase + ODM_VDMA_RING_CFG(vchan)); + odm_write64(vq->iring_mz->iova, odm->rbase + ODM_VDMA_IRING_BADDR(vchan)); + odm_write64(vq->cring_mz->iova, odm->rbase + ODM_VDMA_CRING_BADDR(vchan)); + + return 0; +} + +int +odm_enable(struct odm_dev *odm) +{ + struct odm_queue *vq; + int qno, rc = 0; + + for (qno = 0; qno < odm->num_qs; qno++) { + vq = &odm->vq[qno]; + + vq->desc_idx = vq->stats.completed_offset; + vq->pending_submit_len = 0; + vq->pending_submit_cnt = 0; + vq->iring_head = 0; + vq->cring_head = 0; + vq->ins_ring_head = 0; + vq->iring_sz_available = vq->iring_max_words; + + rc = odm_queue_ring_config(odm, qno, vq->iring_max_words * 8, + vq->cring_max_entry * 4); + if (rc < 0) + break; + + odm_write64(0x1, odm->rbase + ODM_VDMA_EN(qno)); + } + + return rc; +} + +int +odm_disable(struct odm_dev *odm) +{ + int qno, wait_cnt = ODM_IRING_IDLE_WAIT_CNT; + uint64_t val; + + /* Disable the queue and wait for the queue to became idle */ + for (qno = 0; qno < odm->num_qs; qno++) { + odm_write64(0x0, odm->rbase + ODM_VDMA_EN(qno)); + do { + val = odm_read64(odm->rbase + ODM_VDMA_IRING_BADDR(qno)); + } while ((!(val & 1ULL << 63)) && (--wait_cnt > 0)); + } + + return 0; +} + +int +odm_vchan_setup(struct odm_dev *odm, int vchan, int nb_desc) +{ + struct odm_queue *vq = &odm->vq[vchan]; + int isize, csize, max_nb_desc, rc = 0; + union odm_mbox_msg mbox_msg; + const struct rte_memzone *mz; + char name[32]; + + if (vq->iring_mz != NULL) + odm_vchan_resc_free(odm, vchan); + + mbox_msg.u[0] = 0; + mbox_msg.u[1] = 0; + + /* ODM PF driver expects vfid starts from index 0 */ + mbox_msg.q.vfid = odm->vfid; + mbox_msg.q.cmd = ODM_QUEUE_OPEN; + mbox_msg.q.qidx = vchan; + rc = send_mbox_to_pf(odm, &mbox_msg, &mbox_msg); + if (rc < 0) + return rc; + + /* Determine instruction & completion ring sizes. */ + + /* Create iring that can support nb_desc. Round up to a multiple of 1024. */ + isize = RTE_ALIGN_CEIL(nb_desc * ODM_IRING_ENTRY_SIZE_MAX * 8, 1024); + isize = RTE_MIN(isize, ODM_IRING_MAX_SIZE); + snprintf(name, sizeof(name), "vq%d_iring%d", odm->vfid, vchan); + mz = rte_memzone_reserve_aligned(name, isize, SOCKET_ID_ANY, 0, 1024); + if (mz == NULL) + return -ENOMEM; + vq->iring_mz = mz; + vq->iring_max_words = isize / 8; + + /* Create cring that can support max instructions that can be inflight in hw. */ + max_nb_desc = (isize / (ODM_IRING_ENTRY_SIZE_MIN * 8)); + csize = RTE_ALIGN_CEIL(max_nb_desc * sizeof(union odm_cmpl_ent_s), 1024); + snprintf(name, sizeof(name), "vq%d_cring%d", odm->vfid, vchan); + mz = rte_memzone_reserve_aligned(name, csize, SOCKET_ID_ANY, 0, 1024); + if (mz == NULL) { + rc = -ENOMEM; + goto iring_free; + } + vq->cring_mz = mz; + vq->cring_max_entry = csize / 4; + + /* Allocate memory to trac
[PATCH v4 5/7] dma/odm: add stats
From: Gowrishankar Muthukrishnan Add DMA dev stats. Signed-off-by: Anoob Joseph Signed-off-by: Gowrishankar Muthukrishnan Signed-off-by: Vidya Sagar Velumuri --- drivers/dma/odm/odm_dmadev.c | 63 ++-- 1 file changed, 61 insertions(+), 2 deletions(-) diff --git a/drivers/dma/odm/odm_dmadev.c b/drivers/dma/odm/odm_dmadev.c index 8c705978fe..13b2588246 100644 --- a/drivers/dma/odm/odm_dmadev.c +++ b/drivers/dma/odm/odm_dmadev.c @@ -87,14 +87,73 @@ odm_dmadev_close(struct rte_dma_dev *dev) return 0; } +static int +odm_stats_get(const struct rte_dma_dev *dev, uint16_t vchan, struct rte_dma_stats *rte_stats, + uint32_t size) +{ + struct odm_dev *odm = dev->fp_obj->dev_private; + + if (size < sizeof(rte_stats)) + return -EINVAL; + if (rte_stats == NULL) + return -EINVAL; + + if (vchan != RTE_DMA_ALL_VCHAN) { + struct rte_dma_stats *stats = (struct rte_dma_stats *)&odm->vq[vchan].stats; + + *rte_stats = *stats; + } else { + int i; + + for (i = 0; i < odm->num_qs; i++) { + struct rte_dma_stats *stats = (struct rte_dma_stats *)&odm->vq[i].stats; + + rte_stats->submitted += stats->submitted; + rte_stats->completed += stats->completed; + rte_stats->errors += stats->errors; + } + } + + return 0; +} + +static void +odm_vq_stats_reset(struct vq_stats *vq_stats) +{ + vq_stats->completed_offset += vq_stats->completed; + vq_stats->completed = 0; + vq_stats->errors = 0; + vq_stats->submitted = 0; +} + +static int +odm_stats_reset(struct rte_dma_dev *dev, uint16_t vchan) +{ + struct odm_dev *odm = dev->fp_obj->dev_private; + struct vq_stats *vq_stats; + int i; + + if (vchan != RTE_DMA_ALL_VCHAN) { + vq_stats = &odm->vq[vchan].stats; + odm_vq_stats_reset(vq_stats); + } else { + for (i = 0; i < odm->num_qs; i++) { + vq_stats = &odm->vq[i].stats; + odm_vq_stats_reset(vq_stats); + } + } + + return 0; +} + static const struct rte_dma_dev_ops odm_dmadev_ops = { .dev_close = odm_dmadev_close, .dev_configure = odm_dmadev_configure, .dev_info_get = odm_dmadev_info_get, .dev_start = odm_dmadev_start, .dev_stop = odm_dmadev_stop, - .stats_get = NULL, - .stats_reset = NULL, + .stats_get = odm_stats_get, + .stats_reset = odm_stats_reset, .vchan_setup = odm_dmadev_vchan_setup, }; -- 2.45.1
[PATCH v4 6/7] dma/odm: add copy and copy sg ops
From: Vidya Sagar Velumuri Add ODM copy and copy SG ops. Signed-off-by: Anoob Joseph Signed-off-by: Gowrishankar Muthukrishnan Signed-off-by: Vidya Sagar Velumuri --- drivers/dma/odm/odm_dmadev.c | 236 +++ 1 file changed, 236 insertions(+) diff --git a/drivers/dma/odm/odm_dmadev.c b/drivers/dma/odm/odm_dmadev.c index 13b2588246..b21be83a89 100644 --- a/drivers/dma/odm/odm_dmadev.c +++ b/drivers/dma/odm/odm_dmadev.c @@ -9,6 +9,7 @@ #include #include #include +#include #include #include "odm.h" @@ -87,6 +88,238 @@ odm_dmadev_close(struct rte_dma_dev *dev) return 0; } +static int +odm_dmadev_copy(void *dev_private, uint16_t vchan, rte_iova_t src, rte_iova_t dst, uint32_t length, + uint64_t flags) +{ + uint16_t pending_submit_len, pending_submit_cnt, iring_sz_available, iring_head; + const int num_words = ODM_IRING_ENTRY_SIZE_MIN; + struct odm_dev *odm = dev_private; + uint64_t *iring_head_ptr; + struct odm_queue *vq; + uint64_t h; + + const union odm_instr_hdr_s hdr = { + .s.ct = ODM_HDR_CT_CW_NC, + .s.xtype = ODM_XTYPE_INTERNAL, + .s.nfst = 1, + .s.nlst = 1, + }; + + vq = &odm->vq[vchan]; + + h = length; + h |= ((uint64_t)length << 32); + + const uint16_t max_iring_words = vq->iring_max_words; + + iring_sz_available = vq->iring_sz_available; + pending_submit_len = vq->pending_submit_len; + pending_submit_cnt = vq->pending_submit_cnt; + iring_head_ptr = vq->iring_mz->addr; + iring_head = vq->iring_head; + + if (iring_sz_available < num_words) + return -ENOSPC; + + if ((iring_head + num_words) >= max_iring_words) { + + iring_head_ptr[iring_head] = hdr.u; + iring_head = (iring_head + 1) % max_iring_words; + + iring_head_ptr[iring_head] = h; + iring_head = (iring_head + 1) % max_iring_words; + + iring_head_ptr[iring_head] = src; + iring_head = (iring_head + 1) % max_iring_words; + + iring_head_ptr[iring_head] = dst; + iring_head = (iring_head + 1) % max_iring_words; + } else { + iring_head_ptr[iring_head++] = hdr.u; + iring_head_ptr[iring_head++] = h; + iring_head_ptr[iring_head++] = src; + iring_head_ptr[iring_head++] = dst; + } + + pending_submit_len += num_words; + + if (flags & RTE_DMA_OP_FLAG_SUBMIT) { + rte_wmb(); + odm_write64(pending_submit_len, odm->rbase + ODM_VDMA_DBELL(vchan)); + vq->stats.submitted += pending_submit_cnt + 1; + vq->pending_submit_len = 0; + vq->pending_submit_cnt = 0; + } else { + vq->pending_submit_len = pending_submit_len; + vq->pending_submit_cnt++; + } + + vq->iring_head = iring_head; + + vq->iring_sz_available = iring_sz_available - num_words; + + /* No extra space to save. Skip entry in extra space ring. */ + vq->ins_ring_head = (vq->ins_ring_head + 1) % vq->cring_max_entry; + + return vq->desc_idx++; +} + +static inline void +odm_dmadev_fill_sg(uint64_t *cmd, const struct rte_dma_sge *src, const struct rte_dma_sge *dst, + uint16_t nb_src, uint16_t nb_dst, union odm_instr_hdr_s *hdr) +{ + int i = 0, j = 0; + uint64_t h = 0; + + cmd[j++] = hdr->u; + /* When nb_src is even */ + if (!(nb_src & 0x1)) { + /* Fill the iring with src pointers */ + for (i = 1; i < nb_src; i += 2) { + h = ((uint64_t)src[i].length << 32) | src[i - 1].length; + cmd[j++] = h; + cmd[j++] = src[i - 1].addr; + cmd[j++] = src[i].addr; + } + + /* Fill the iring with dst pointers */ + for (i = 1; i < nb_dst; i += 2) { + h = ((uint64_t)dst[i].length << 32) | dst[i - 1].length; + cmd[j++] = h; + cmd[j++] = dst[i - 1].addr; + cmd[j++] = dst[i].addr; + } + + /* Handle the last dst pointer when nb_dst is odd */ + if (nb_dst & 0x1) { + h = dst[nb_dst - 1].length; + cmd[j++] = h; + cmd[j++] = dst[nb_dst - 1].addr; + cmd[j++] = 0; + } + } else { + /* When nb_src is odd */ + + /* Fill the iring with src pointers */ + for (i = 1; i < nb_src; i += 2) { + h = ((uint64_t)src[i].length << 32) | src[i - 1].length; + cmd[j++] = h; + cmd[j++] = src[i - 1].addr; + cmd[j++] = src[i]
[PATCH v4 7/7] dma/odm: add remaining ops
From: Vidya Sagar Velumuri Add all remaining ops such as fill, burst_capacity etc. Also update the documentation. Signed-off-by: Anoob Joseph Signed-off-by: Gowrishankar Muthukrishnan Signed-off-by: Vidya Sagar Velumuri --- MAINTAINERS| 1 + doc/guides/dmadevs/index.rst | 1 + doc/guides/dmadevs/odm.rst | 92 + doc/guides/rel_notes/release_24_07.rst | 4 + drivers/dma/odm/odm.h | 4 + drivers/dma/odm/odm_dmadev.c | 250 + 6 files changed, 352 insertions(+) create mode 100644 doc/guides/dmadevs/odm.rst diff --git a/MAINTAINERS b/MAINTAINERS index b581207a9a..195125ee1e 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -1274,6 +1274,7 @@ M: Gowrishankar Muthukrishnan M: Vidya Sagar Velumuri T: git://dpdk.org/next/dpdk-next-net-mrvl F: drivers/dma/odm/ +F: doc/guides/dmadevs/odm.rst NXP DPAA DMA M: Gagandeep Singh diff --git a/doc/guides/dmadevs/index.rst b/doc/guides/dmadevs/index.rst index 5bd25b32b9..ce9f6eb260 100644 --- a/doc/guides/dmadevs/index.rst +++ b/doc/guides/dmadevs/index.rst @@ -17,3 +17,4 @@ an application through DMA API. hisilicon idxd ioat + odm diff --git a/doc/guides/dmadevs/odm.rst b/doc/guides/dmadevs/odm.rst new file mode 100644 index 00..a2eaab59a0 --- /dev/null +++ b/doc/guides/dmadevs/odm.rst @@ -0,0 +1,92 @@ +.. SPDX-License-Identifier: BSD-3-Clause + Copyright(c) 2024 Marvell. + +Odyssey ODM DMA Device Driver += + +The ``odm`` DMA device driver provides a poll-mode driver (PMD) for Marvell Odyssey +DMA Hardware Accelerator block found in Odyssey SoC. The block supports only mem +to mem DMA transfers. + +ODM DMA device can support up to 32 queues and 16 VFs. + +Prerequisites and Compilation procedure +--- + +Device Setup +- + +ODM DMA device is initialized by kernel PF driver. The PF kernel driver is part +of Marvell software packages for Odyssey. + +Kernel module can be inserted as in below example:: + +$ sudo insmod odyssey_odm.ko + +ODM DMA device can support up to 16 VFs:: + +$ sudo echo 16 > /sys/bus/pci/devices/\:08\:00.0/sriov_numvfs + +Above command creates 16 VFs with 2 queues each. + +The ``dpdk-devbind.py`` script, included with DPDK, can be used to show the +presence of supported hardware. Running ``dpdk-devbind.py --status-dev dma`` +will show all the Odyssey ODM DMA devices. + +Devices using VFIO drivers +~~ + +The HW devices to be used will need to be bound to a user-space IO driver. +The ``dpdk-devbind.py`` script can be used to view the state of the devices +and to bind them to a suitable DPDK-supported driver, such as ``vfio-pci``. +For example:: + + $ dpdk-devbind.py -b vfio-pci :08:00.1 + +Device Probing and Initialization +~ + +To use the devices from an application, the dmadev API can be used. + +Once configured, the device can then be made ready for use +by calling the ``rte_dma_start()`` API. + +Performing Data Copies +~~ + +Refer to the :ref:`Enqueue / Dequeue APIs ` section +of the dmadev library documentation for details on operation enqueue and +submission API usage. + +Performance Tuning Parameters +~ + +To achieve higher performance, DMA device needs to be tuned using PF kernel +driver module parameters. + +Following options are exposed by kernel PF driver via devlink interface for +tuning performance. + +``eng_sel`` + + ODM DMA device has 2 engines internally. Engine to queue mapping is decided + by a hardware register which can be configured as below:: + +$ /sbin/devlink dev param set pci/:08:00.0 name eng_sel value 3435973836 cmode runtime + + Each bit in the register corresponds to one queue. Each queue would be + associated with one engine. If the value of the bit corresponding to the queue + is 0, then engine 0 would be picked. If it is 1, then engine 1 would be + picked. + + In the above command, the register value is set as + ``1100 1100 1100 1100 1100 1100 1100 1100`` which allows for alternate engines + to be used with alternate VFs (assuming the system has 16 VFs with 2 queues + each). + +``max_load_request`` + + Specifies maximum outstanding load requests on internal bus. Values can range + from 1 to 512. Set to 512 for maximum requests in flight.:: + +$ /sbin/devlink dev param set pci/:08:00.0 name max_load_request value 512 cmode runtime diff --git a/doc/guides/rel_notes/release_24_07.rst b/doc/guides/rel_notes/release_24_07.rst index a69f24cf99..3bc8451330 100644 --- a/doc/guides/rel_notes/release_24_07.rst +++ b/doc/guides/rel_notes/release_24_07.rst @@ -55,6 +55,10 @@ New Features Also, make sure to start the actual text at the margin. === +* **Added Marvell Odyssey ODM DMA device suppo
Re: [PATCH v4 1/3] event/dlb2: add support for HW delayed token
On Thu, May 2, 2024 at 1:16 AM Abdullah Sevincer wrote: > > In DLB 2.5, hardware assist is available, complementing the Delayed > token POP software implementation. When it is enabled, the feature > works as follows: > > It stops CQ scheduling when the inflight limit associated with the CQ > is reached. So the feature is activated only if the core is > congested. If the core can handle multiple atomic flows, DLB will not > try to switch them. This is an improvement over SW implementation > which always switches the flows. > > The feature will resume CQ scheduling when the number of pending > completions fall below a configured threshold. To emulate older 2.0 > behavior, this threshold is set to 1 by old APIs. SW sets CQ to > auto-pop mode for token return, as tokens withholding is not > necessary now. As HW counts completions and not tokens, events equal > to HL (History List) entries will be scheduled to DLB before the > feature activates and stops CQ scheduling. 1) Also tell about adding new PMD API and update the release notes for PMD section for new feature. 2) Fix CI http://mails.dpdk.org/archives/test-report/2024-May/657681.html > > Signed-off-by: Abdullah Sevincer +/** Set inflight threshold for flow migration */ > +#define DLB2_FLOW_MIGRATION_THRESHOLD RTE_BIT64(0) Fix the namespace for public API, RTE_PMD_DLB2_PORT_SET_F_FLOW_MIGRATION_... > + > +/** Set port history list */ > +#define DLB2_SET_PORT_HL RTE_BIT64(1) RTE_PMD_DLB2_PORT_SET_F_PORT_HL > + > +struct dlb2_port_param { fix name space, rte_pmd_dlb2_port_params > + uint16_t inflight_threshold : 12; > +}; > + > +/*! > + * @warning > + * @b EXPERIMENTAL: this API may change, or be removed, without prior notice > + * > + * Configure various port parameters. > + * AUTO_POP. This function must be called before calling > rte_event_port_setup() > + * for the port, but after calling rte_event_dev_configure(). > + * > + * @param dev_id > + *The identifier of the event device. > + * @param port_id > + *The identifier of the event port. > + * @param flags > + *Bitmask of the parameters being set. > + * @param val > + *Structure coantaining the values of parameters being set. Why not use struct rte_pmd_dlb2_port_params itself instead of void *. > + * > + * @return > + * - 0: Success > + * - EINVAL: Invalid dev_id, port_id, or mode > + * - EINVAL: The DLB2 is not configured, is already running, or the port is > + * already setup > + */ > +__rte_experimental > +int > +rte_pmd_dlb2_set_port_param(uint8_t dev_id, > + uint8_t port_id, > + uint64_t flags, > + void *val);
Re: [PATCH v4 2/3] event/dlb2: add support for dynamic HL entries
On Thu, May 2, 2024 at 1:16 AM Abdullah Sevincer wrote: > > In DLB 2.5, hardware assist is available, complementing the Delayed > token POP software implementation. When it is enabled, the feature > works as follows: > > It stops CQ scheduling when the inflight limit associated with the CQ > is reached. So the feature is activated only if the core is > congested. If the core can handle multiple atomic flows, DLB will not > try to switch them. This is an improvement over SW implementation > which always switches the flows. > > The feature will resume CQ scheduling when the number of pending > completions fall below a configured threshold. > > DLB has 64 LDB ports and 2048 HL entries. If all LDB ports are used, > possible HL entries per LDB port equals 2048 / 64 = 32. So, the > maximum CQ depth possible is 16, if all 64 LB ports are needed in a > high-performance setting. > > In case all CQs are configured to have HL = 2* CQ Depth as a > performance option, then the calculation of HL at the time of domain > creation will be based on maximum possible dequeue depth. This could > result in allocating too many HL entries to the domain as DLB only > has limited number of HL entries to be allocated. Hence, it is best > to allow application to specify HL entries as a command line argument > and override default allocation. A summary of usage is listed below: > > When 'use_default_hl = 1', Per port HL is set to > DLB2_FIXED_CQ_HL_SIZE (32) and command line parameter > alloc_hl_entries is ignored. > > When 'use_default_hl = 0', Per LDB port HL = 2 * CQ depth and per > port HL is set to 2 * DLB2_FIXED_CQ_HL_SIZE. > > User should calculate needed HL entries based on CQ depths the > application will use and specify it as command line parameter > 'alloc_hl_entries'. This will be used to allocate HL entries. > Hence, alloc_hl_entries = (Sum of all LDB ports CQ depths * 2). > > If alloc_hl_entries is not specified, then Total HL entries for the > vdev = num_ldb_ports * 64. > > Signed-off-by: Abdullah Sevincer > } > diff --git a/drivers/event/dlb2/dlb2_priv.h b/drivers/event/dlb2/dlb2_priv.h > index d6828aa482..dc9f98e142 100644 > --- a/drivers/event/dlb2/dlb2_priv.h > +++ b/drivers/event/dlb2/dlb2_priv.h > @@ -52,6 +52,8 @@ > #define DLB2_PRODUCER_COREMASK "producer_coremask" > #define DLB2_DEFAULT_LDB_PORT_ALLOCATION_ARG "default_port_allocation" > #define DLB2_ENABLE_CQ_WEIGHT_ARG "enable_cq_weight" > +#define DLB2_USE_DEFAULT_HL "use_default_hl" > +#define DLB2_ALLOC_HL_ENTRIES "alloc_hl_entries" 1)Update doc/guides/eventdevs/dlb2.rst for new devargs 2)Please release note PMD section for this feature.
Re: [PATCH v4 3/3] event/dlb2: enhance DLB credit handling
On Thu, May 2, 2024 at 1:27 AM Abdullah Sevincer wrote: > > This commit improves DLB credit handling scenarios when > ports hold on to credits but can't release them due to insufficient > accumulation (less than 2 * credit quanta). > > Worker ports now release all accumulated credits when back-to-back > zero poll count reaches preset threshold. > > Producer ports release all accumulated credits if enqueue fails for a > consecutive number of retries. > > In a multi-producer system, some producer(s) may exit early while > holding on to credits. Now these are released during port unlink > which needs to be performed by the application. > > test-eventdev is modified to call rte_event_port_unlink() to release > any accumulated credits by producer ports. > > Signed-off-by: Abdullah Sevincer > --- > app/test-eventdev/test_perf_common.c | 20 +-- 1) Spotted non-driver changes in driver patches, Please send test-eventdev changes as separate commit with complete rational. 2) Fix CI issues http://mails.dpdk.org/archives/test-report/2024-May/657683.html > drivers/event/dlb2/dlb2.c| 203 +-- > drivers/event/dlb2/dlb2_priv.h | 1 + > drivers/event/dlb2/meson.build | 12 ++ > drivers/event/dlb2/meson_options.txt | 6 + > 5 files changed, 194 insertions(+), 48 deletions(-) > create mode 100644 drivers/event/dlb2/meson_options.txt > > > static inline uint64_t > diff --git a/drivers/event/dlb2/dlb2.c b/drivers/event/dlb2/dlb2.c > index 11bbe30d7b..2c341a5845 100644 > --- a/drivers/event/dlb2/dlb2.c > +++ b/drivers/event/dlb2/dlb2.c > @@ -43,7 +43,47 @@ > * to DLB can go ahead of relevant application writes like updates to buffers > * being sent with event > */ > +#ifndef DLB2_BYPASS_FENCE_ON_PP > #define DLB2_BYPASS_FENCE_ON_PP 0 /* 1 == Bypass fence, 0 == do not bypass > */ > +#endif > + > +/* HW credit checks can only be turned off for DLB2 device if following > + * is true for each created eventdev > + * LDB credits <= DIR credits + minimum CQ Depth > + * (CQ Depth is minimum of all ports configured within eventdev) > + * This needs to be true for all eventdevs created on any DLB2 device > + * managed by this driver. > + * DLB2.5 does not any such restriction as it has single credit pool > + */ > +#ifndef DLB_HW_CREDITS_CHECKS > +#define DLB_HW_CREDITS_CHECKS 1 > +#endif > + > +/* > + * SW credit checks can only be turned off if application has a way to > + * limit input events to the eventdev below assigned credit limit > + */ > +#ifndef DLB_SW_CREDITS_CHECKS > +#define DLB_SW_CREDITS_CHECKS 1 > +#endif > + > + > +static void dlb2_check_and_return_credits(struct dlb2_eventdev_port *ev_port, > + bool cond, uint32_t threshold) > +{ > +#if DLB_SW_CREDITS_CHECKS || DLB_HW_CREDITS_CHECKS This new patch is full of compilation flags clutter, can you make it runtime? > > diff --git a/drivers/event/dlb2/dlb2_priv.h b/drivers/event/dlb2/dlb2_priv.h > index dc9f98e142..fd76b5b9fb 100644 > --- a/drivers/event/dlb2/dlb2_priv.h > +++ b/drivers/event/dlb2/dlb2_priv.h > @@ -527,6 +527,7 @@ struct __rte_cache_aligned dlb2_eventdev_port { > struct rte_event_port_conf conf; /* user-supplied configuration */ > uint16_t inflight_credits; /* num credits this port has right now */ > uint16_t credit_update_quanta; > + uint32_t credit_return_count; /* count till the credit return > condition is true */ > struct dlb2_eventdev *dlb2; /* backlink optimization */ > alignas(RTE_CACHE_LINE_SIZE) struct dlb2_port_stats stats; > struct dlb2_event_queue_link link[DLB2_MAX_NUM_QIDS_PER_LDB_CQ]; > diff --git a/drivers/event/dlb2/meson.build b/drivers/event/dlb2/meson.build > index 515d1795fe..77a197e32c 100644 > --- a/drivers/event/dlb2/meson.build > +++ b/drivers/event/dlb2/meson.build > @@ -68,3 +68,15 @@ endif > headers = files('rte_pmd_dlb2.h') > > deps += ['mbuf', 'mempool', 'ring', 'pci', 'bus_pci'] > + > +if meson.version().version_compare('> 0.58.0') > +fs = import('fs') > +dlb_options = fs.read('meson_options.txt').strip().split('\n') > + > +foreach opt: dlb_options > + if (opt.strip().startswith('#') or opt.strip() == '') > + continue > + endif > + cflags += '-D' + opt.strip().to_upper().replace(' ','') > +endforeach > +endif > diff --git a/drivers/event/dlb2/meson_options.txt > b/drivers/event/dlb2/meson_options.txt Adding @Richardson, Bruce @Thomas Monjalon to comment on this, I am not sure driver specific meson_options.txt is a good path? > new file mode 100644 > index 00..b57c999e54 > --- /dev/null > +++ b/drivers/event/dlb2/meson_options.txt > @@ -0,0 +1,6 @@ > +# SPDX-License-Identifier: BSD-3-Clause > +# Copyright(c) 2023-2024 Intel Corporation > + > +DLB2_BYPASS_FENCE_ON_PP = 0 > +DLB_HW_CREDITS_CHECKS = 1 > +DLB_SW_CREDITS_CHECKS = 1 > -- > 2.25.1 >
Re: [PATCH] event/dsw: support explicit release only mode
On Sat, May 25, 2024 at 1:13 AM Mattias Rönnblom wrote: > > Add the RTE_EVENT_DEV_CAP_IMPLICIT_RELEASE_DISABLE capability to the > DSW event device. > > This feature may be used by an EAL thread to pull more work from the > work scheduler, without giving up the option to forward events > originating from a previous dequeue batch. This in turn allows an EAL > thread to be productive while waiting for a hardware accelerator to > complete some operation. > > Prior to this change, DSW didn't make any distinction between > RTE_EVENT_OP_FORWARD and RTE_EVENT_OP_NEW type events, other than that > new events would be backpressured earlier. > > After this change, DSW tracks the number of released events (i.e., > events of type RTE_EVENT_OP_FORWARD and RTE_EVENT_OP_RELASE) that has > been enqueued. > > For efficency reasons, DSW does not track the *identity* of individual > events. This in turn implies that a certain stage in the flow > migration process, DSW must wait for all pending releases (on the > migration source port, only) to be received from the application, to > assure that no event pertaining to any of the to-be-migrated flows are > being processed. > > With this change, DSW starts making a distinction between forward and > new type events for credit allocation purposes. Only RTE_EVENT_OP_NEW > events needs credits. All events marked as RTE_EVENT_OP_FORWARD must > have a corresponding dequeued event from a previous dequeue batch. > > Flow migration for flows on RTE_SCHED_TYPE_PARALLEL queues remains > unaffected by this change. > > A side-effect of the tweaked DSW migration logic is that the migration > latency is reduced, regardless if implicit relase is enabled or not. > > Signed-off-by: Mattias Rönnblom 1) Update releases for PMD specific section for this new feature 2) Fix CI issue as applicable https://patches.dpdk.org/project/dpdk/patch/20240524192437.183960-1-mattias.ronnb...@ericsson.com/ http://mails.dpdk.org/archives/test-report/2024-May/672848.html https://github.com/ovsrobot/dpdk/actions/runs/9229147658
Re: [PATCH v5] cnxk: disable building template files
On Mon, May 27, 2024 at 09:04:29PM +0530, pbhagavat...@marvell.com wrote: > From: Pavan Nikhilesh > > Disable building template files when CNXK_DIS_TMPLT_FUNC > is defined as a part of c_args. > This option can be used when reworking datapath or debugging > issues to reduce Rx/Tx path compilation time. > > Example command: > meson build -Dc_args='-DCNXK_DIS_TMPLT_FUNC' > -Dexamples=all --cross-file config/arm/arm64_cn10k_linux_gcc > Should this option be set in CI by default, or in test-meson-builds by default? When do we need to avoid setting this flag, vs setting it? Thanks, /Bruce
Re: [PATCH] event/dsw: support explicit release only mode
On 2024-05-27 17:35, Jerin Jacob wrote: On Sat, May 25, 2024 at 1:13 AM Mattias Rönnblom wrote: Add the RTE_EVENT_DEV_CAP_IMPLICIT_RELEASE_DISABLE capability to the DSW event device. This feature may be used by an EAL thread to pull more work from the work scheduler, without giving up the option to forward events originating from a previous dequeue batch. This in turn allows an EAL thread to be productive while waiting for a hardware accelerator to complete some operation. Prior to this change, DSW didn't make any distinction between RTE_EVENT_OP_FORWARD and RTE_EVENT_OP_NEW type events, other than that new events would be backpressured earlier. After this change, DSW tracks the number of released events (i.e., events of type RTE_EVENT_OP_FORWARD and RTE_EVENT_OP_RELASE) that has been enqueued. For efficency reasons, DSW does not track the *identity* of individual events. This in turn implies that a certain stage in the flow migration process, DSW must wait for all pending releases (on the migration source port, only) to be received from the application, to assure that no event pertaining to any of the to-be-migrated flows are being processed. With this change, DSW starts making a distinction between forward and new type events for credit allocation purposes. Only RTE_EVENT_OP_NEW events needs credits. All events marked as RTE_EVENT_OP_FORWARD must have a corresponding dequeued event from a previous dequeue batch. Flow migration for flows on RTE_SCHED_TYPE_PARALLEL queues remains unaffected by this change. A side-effect of the tweaked DSW migration logic is that the migration latency is reduced, regardless if implicit relase is enabled or not. Signed-off-by: Mattias Rönnblom 1) Update releases for PMD specific section for this new feature Should the release note update be in the same patch, or a separate? 2) Fix CI issue as applicable https://patches.dpdk.org/project/dpdk/patch/20240524192437.183960-1-mattias.ronnb...@ericsson.com/ http://mails.dpdk.org/archives/test-report/2024-May/672848.html https://github.com/ovsrobot/dpdk/actions/runs/9229147658
[PATCH] net/i40e: increase descriptor queue length to 8160
According to the Intel X710/XXV710/XL710 Datasheet, the maximum receive queue descriptor length is 0x1FE0 (8160 in base 10). This is specified as QLEN in table 8-12, page 1083. I've tested this change with an XXV710 NIC and it has positive effect on performance under high load scenarios. Where previously I'd get ~2000 packets/sec miss rate, now I get only ~40 packets/sec miss rate. Signed-off-by: Igor Gutorov --- drivers/net/i40e/i40e_rxtx.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/i40e/i40e_rxtx.h b/drivers/net/i40e/i40e_rxtx.h index 2f2f890855..33fc9770d9 100644 --- a/drivers/net/i40e/i40e_rxtx.h +++ b/drivers/net/i40e/i40e_rxtx.h @@ -25,7 +25,7 @@ #define I40E_RX_MAX_DATA_BUF_SIZE (16 * 1024 - 128) #defineI40E_MIN_RING_DESC 64 -#defineI40E_MAX_RING_DESC 4096 +#defineI40E_MAX_RING_DESC 8160 #define I40E_FDIR_NUM_TX_DESC (I40E_FDIR_PRG_PKT_CNT << 1) #define I40E_FDIR_NUM_RX_DESC (I40E_FDIR_PRG_PKT_CNT << 1) -- 2.45.1
Re: DPDK patch for Amston Lake SGMII <> GPY215
On 5/27/2024 2:48 PM, Amy.Shih wrote: Copied down, please bottom post. > > Best Regards, > Amy Shih > Advantech ICVG x86 Software > 02-7732-3399 Ext. 1249 > > -Original Message- > From: Ferruh Yigit > Sent: Monday, May 27, 2024 4:58 PM > To: Jack.Chen ; dev@dpdk.org > Cc: Amy.Shih ; bill.lu ; > Jenny3.Lin ; Bruce Richardson > ; Mcnamara, John > Subject: Re: DPDK patch for Amston Lake SGMII <> GPY215 > > On 5/24/2024 6:40 AM, Jack.Chen wrote: >> Dear DPDK Dev . >> >> This is PM from Advantech ENPD. We are working on Intel Amston Lake >> CPU’s SGMII <> GPY215 PHY for DPDK test but fail. >> >> We consulted with Intel support team and they suggested we should >> consult DPDK community and it should have the patch or code change for >> Amston Lake <> GYP215 available for DPDK. >> >> Could you kindly suggest us the direction of it? I also keep my >> Engineering team in this mail loop for further discussion. >> >> >> >> Thank you so much >> >> >> >> The error message while we testing DPDK >> >> SoC 2.5G LAN (BIOS set to 1G) with dpdk 24.03.0. It can run testpmd >> test, and error message as follows : >> >> root@fwa-1214-efi:~/dpdk/dpdk-24.03/build/app# ./dpdk-testpmd -c 0xf -n >> 1 -a 00:1e.4 --socket-mem=2048,0 -- -i --mbcache=512 --numa >> --port-numa-config=0,0 --socket-num=0 --coremask=0x2 --nb-cores=1 >> --rxq=1 --txq=1 --portmask=0x1 --rxd=2048 --rxfreet=64 --rxpt=64 >> --rxht=8 --rxwt=0 --txd=2048 --txfreet=64 --txpt=64 --txht=0 --txwt=0 >> --burst=64 --txrst=64 --rss-ip -a >> >> EAL: Detected CPU lcores: 4 >> >> EAL: Detected NUMA nodes: 1 >> >> EAL: Detected static linkage of DPDK >> >> EAL: Multi-process socket /var/run/dpdk/rte/mp_socket >> >> EAL: Selected IOVA mode 'PA' >> >> TELEMETRY: No legacy callbacks, legacy socket not created >> >> testpmd: No probed ethernet devices >> >> Interactive-mode selected >> >> Fail: input rxq (1) can't be greater than max_rx_queues (0) of port 0 >> >> EAL: Error - exiting with code: 1 >> >> Cause: rxq 1 invalid - must be >= 0 && <= 0 >> >> > > > Hi Jack, > > According above log device is not detected. > What is the Ehternet controller connected to the "GPY215 PHY" and do you > know if it has required driver in DPDK for it? > If device sits on PCIe bus, you can check it via `lspci`. > > > Hi Ferruh: > > The Ethernet controller connected to the "GPY215 PHY" is the integrated Gigabit Ethernet (GbE) controller from the Intel Amston Lake CPU. > The output of `lspci` is as follows: > > 00:1e.4 Ethernet controller [0200]: Intel Corporation Device [8086:54ac] In Linux kernel, it is "dwmac-intel" driver, and as far as I can see it is not supported in DPDK.
Re: [PATCH] event/dsw: support explicit release only mode
On Mon, May 27, 2024 at 9:38 PM Mattias Rönnblom wrote: > > On 2024-05-27 17:35, Jerin Jacob wrote: > > On Sat, May 25, 2024 at 1:13 AM Mattias Rönnblom > > wrote: > >> > >> Add the RTE_EVENT_DEV_CAP_IMPLICIT_RELEASE_DISABLE capability to the > >> DSW event device. > >> > >> This feature may be used by an EAL thread to pull more work from the > >> work scheduler, without giving up the option to forward events > >> originating from a previous dequeue batch. This in turn allows an EAL > >> thread to be productive while waiting for a hardware accelerator to > >> complete some operation. > >> > >> Prior to this change, DSW didn't make any distinction between > >> RTE_EVENT_OP_FORWARD and RTE_EVENT_OP_NEW type events, other than that > >> new events would be backpressured earlier. > >> > >> After this change, DSW tracks the number of released events (i.e., > >> events of type RTE_EVENT_OP_FORWARD and RTE_EVENT_OP_RELASE) that has > >> been enqueued. > >> > >> For efficency reasons, DSW does not track the *identity* of individual > >> events. This in turn implies that a certain stage in the flow > >> migration process, DSW must wait for all pending releases (on the > >> migration source port, only) to be received from the application, to > >> assure that no event pertaining to any of the to-be-migrated flows are > >> being processed. > >> > >> With this change, DSW starts making a distinction between forward and > >> new type events for credit allocation purposes. Only RTE_EVENT_OP_NEW > >> events needs credits. All events marked as RTE_EVENT_OP_FORWARD must > >> have a corresponding dequeued event from a previous dequeue batch. > >> > >> Flow migration for flows on RTE_SCHED_TYPE_PARALLEL queues remains > >> unaffected by this change. > >> > >> A side-effect of the tweaked DSW migration logic is that the migration > >> latency is reduced, regardless if implicit relase is enabled or not. > >> > >> Signed-off-by: Mattias Rönnblom > > > > > > 1) Update releases for PMD specific section for this new feature > > Should the release note update be in the same patch, or a separate? Same patch. > > > 2) Fix CI issue as applicable > > > > https://patches.dpdk.org/project/dpdk/patch/20240524192437.183960-1-mattias.ronnb...@ericsson.com/ > > http://mails.dpdk.org/archives/test-report/2024-May/672848.html > > https://github.com/ovsrobot/dpdk/actions/runs/9229147658
Re: [PATCH] event: fix warning from useless snprintf
On Thu, Apr 25, 2024 at 12:41 AM Stephen Hemminger wrote: > > On Wed, 24 Apr 2024 17:12:39 + > "Van Haaren, Harry" wrote: > > > > > > > From: Stephen Hemminger > > > Sent: Wednesday, April 24, 2024 5:13 PM > > > To: Van Haaren, Harry > > > Cc: dev@dpdk.org; Richardson, Bruce; Jerin Jacob > > > Subject: Re: [PATCH] event: fix warning from useless snprintf > > > > > > On Wed, 24 Apr 2024 08:45:52 + > > > "Van Haaren, Harry" wrote: > > > > > > > > From: Stephen Hemminger > > > > > Sent: Wednesday, April 24, 2024 4:45 AM > > > > > To: dev@dpdk.org > > > > > Cc: Richardson, Bruce; Stephen Hemminger; Van Haaren, Harry; Jerin > > > > > Jacob > > > > > Subject: [PATCH] event: fix warning from useless snprintf > > > > > > > > > > With Gcc-14, this warning is generated: > > > > > ../drivers/event/sw/sw_evdev.c:263:3: warning: 'snprintf' will always > > > > > be truncated; > > > > > specified size is 12, but format string expands to at least 13 > > > > > [-Wformat-truncation] > > > > > 263 | snprintf(buf, sizeof(buf), "sw%d_iq_%d_rob", > > > > > dev_id, i); > > > > > | ^ > > > > > > > > > > Yet the whole printf to the buf is unnecessary. The type string > > > > > argument > > > > > has never been implemented, and should just be NULL. Removing the > > > > > unnecessary snprintf, then means IQ_ROB_NAMESIZE can be removed. > > > > > > > > I understand that today the "type" value isn't implemented, but across > > > > the DPDK codebase it > > > > seems like others are filling in "type" to be some debug-useful > > > > name/string. If it was added > > > > in future it'd be nice to have the ROB/IQ memory identified by name, > > > > like the rest of DPDK components. > > > > > > No, don't bother. This is a case of > > > https://en.wikipedia.org/wiki/You_aren%27t_gonna_need_it > > > > I agree that YAGNI perhaps applied when designing the APIs, but the "type" > > parameter is there now... > > Should we add a guidance of "when reworking code, always pass NULL as the > > type parameter to rte_malloc functions" somewhere in the programmers guide, > > to align community with this "pass NULL for type" initiative? > > > > > > > > Acked-by: Harry van Haaren Changed to event/sw: Applied to dpdk-next-eventdev/for-main. Thanks > >
Re: [PATCH 10/10] net/cnxk: define CPT HW result format for PMD API
On Fri, May 17, 2024 at 1:16 PM Nithin Dabilpuram wrote: > > From: Srujana Challa > > Defines CPT HW result format for PMD API, > rte_pmd_cnxk_inl_ipsec_res(). > > Signed-off-by: Srujana Challa > --- > drivers/net/cnxk/cn10k_ethdev_sec.c | 4 ++-- > drivers/net/cnxk/rte_pmd_cnxk.h | 28 ++-- > 2 files changed, 28 insertions(+), 4 deletions(-) > > > +/** CPT HW result format */ > +union rte_pmd_cnxk_cpt_res_s { > + struct rte_pmd_cpt_cn10k_res_s { > + uint64_t compcode : 7; Public API, Please add Doxygen for every symbol and check the generated HTML files. > + uint64_t doneint : 1; > + uint64_t uc_compcode : 8; > + uint64_t rlen : 16; > + uint64_t spi : 32; > + > + uint64_t esn; > + } cn10k; > + > + struct rte_pmd_cpt_cn9k_res_s { > + uint64_t compcode : 8; > + uint64_t uc_compcode : 8; > + uint64_t doneint : 1; > + uint64_t reserved_17_63 : 47; > + > + uint64_t reserved_64_127; > + } cn9k; > + > + uint64_t u64[2]; > +}; > + > 2.25.1 >
Re: [PATCH 09/10] net/cnxk: clear CGX stats during xstats reset
On Fri, May 17, 2024 at 1:16 PM Nithin Dabilpuram wrote: > > From: Sunil Kumar Kori > > Currently only NIX stats are cleared during xstats > reset and CGX stats are left as it is. > > Clearing CGX stats too during xstats reset. > > Signed-off-by: Sunil Kumar Kori Change to fix and add Fixes: tag > --- > drivers/net/cnxk/cnxk_stats.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/drivers/net/cnxk/cnxk_stats.c b/drivers/net/cnxk/cnxk_stats.c > index f2fc89..469faff405 100644 > --- a/drivers/net/cnxk/cnxk_stats.c > +++ b/drivers/net/cnxk/cnxk_stats.c > @@ -316,6 +316,8 @@ cnxk_nix_xstats_reset(struct rte_eth_dev *eth_dev) > goto exit; > } > > + /* Reset MAC stats */ > + rc = roc_nix_mac_stats_reset(nix); > exit: > return rc; > } > -- > 2.25.1 >
Re: [PATCH 06/10] net/cnxk: add option to disable custom meta aura
On Fri, May 17, 2024 at 1:23 PM Nithin Dabilpuram wrote: > > Add option to explicitly disable custom meta aura. Currently > custom meta aura is enabled automatically when inl_cpt_channel > is set i.e inline dev is masking CHAN field in IPsec rules. > > Also decouple the custom meta aura feature from custom sa action > so that the custom sa action can independently be used. > > Signed-off-by: Nithin Dabilpuram > --- > doc/guides/nics/cnxk.rst | 13 + > drivers/common/cnxk/roc_nix_inl.c | 19 +-- > drivers/common/cnxk/roc_nix_inl.h | 1 + > drivers/common/cnxk/version.map| 1 + > drivers/net/cnxk/cnxk_ethdev.c | 5 + > drivers/net/cnxk/cnxk_ethdev.h | 3 +++ > drivers/net/cnxk/cnxk_ethdev_devargs.c | 8 +++- > 7 files changed, 43 insertions(+), 7 deletions(-) > > diff --git a/doc/guides/nics/cnxk.rst b/doc/guides/nics/cnxk.rst > index f5f296ee36..99ad224efd 100644 > --- a/doc/guides/nics/cnxk.rst > +++ b/doc/guides/nics/cnxk.rst > @@ -444,6 +444,19 @@ Runtime Config Options > With the above configuration, driver would enable packet inject from ARM > cores > to crypto to process and send back in Rx path. > > +- ``Disable custom meta aura feature`` (default ``0``) > + > + Custom meta aura i.e 1:N meta aura is enabled for second pass traffic by > default when > + ``inl_cpt_channel`` devarg is provided. Provide an option to disable the > custom > + meta aura feature by setting devarg ``custom_meta_aura_dis`` to ``1``. Update release notes for PMD section for this new feature. > + > + For example:: > + > + -a 0002:02:00.0,custom_meta_aura_dis=1 > + > + With the above configuration, driver would disable custom meta aura > feature for > + ``0002:02:00.0`` ethdev. > + > .. note::
RE: [PATCH v3 0/7] Fix outer UDP checksum for Intel nics
> -Original Message- > From: David Marchand > Sent: Thursday, April 18, 2024 11:20 AM > To: dev@dpdk.org > Cc: NBU-Contact-Thomas Monjalon (EXTERNAL) ; > ferruh.yi...@amd.com > Subject: [PATCH v3 0/7] Fix outer UDP checksum for Intel nics > > This series aims at fixing outer UDP checksum for Intel nics (i40e and > ice). > The net/hns3 is really similar in its internals and has been aligned. > > As I touched testpmd csumonly engine, this series may break other > vendors outer offloads, so please vendors, review and test this ASAP. > > Thanks. > > > -- Hello, I have tested the patchset on ConnectX-6 Dx as suggested by Thomas and didn't see failures in our functional tests caused by the changes. Tested-by: Ali Alnubani Regards, Ali
Including contigmem in core dumps
I've been wondering why we exclude memory allocated by eal_get_virtual_area() from core dumps? (More specifically, it calls eal_mem_set_dump() to call madvise() to disable core dumps from the allocated region.) On many occasions, when debugging after a crash, it would have been very convenient to be able to see the contents of an mbuf or other object allocated in contigmem space. And we often avoid using the rte memory allocator just because of this. Is there any reason for this, or could it perhaps be a compile-time configuration option not to call madvise()?
[PATCH 0/2] support new firmware name scheme
This patch series refator the firmware load logic and add support of the fourth firmware name scheme, and this name scheme takes the third priority. Chaoyong He (2): net/nfp: refactor the firmware load logic net/nfp: support new firmware name scheme drivers/net/nfp/nfp_ethdev.c | 73 ++-- 1 file changed, 36 insertions(+), 37 deletions(-) -- 2.39.1
[PATCH 1/2] net/nfp: refactor the firmware load logic
Refactor the firmware load logic, make it more specific and clear. Signed-off-by: Chaoyong He Reviewed-by: Long Wu Reviewed-by: Peng Zhang --- drivers/net/nfp/nfp_ethdev.c | 66 1 file changed, 29 insertions(+), 37 deletions(-) diff --git a/drivers/net/nfp/nfp_ethdev.c b/drivers/net/nfp/nfp_ethdev.c index cdc946faff..771137db92 100644 --- a/drivers/net/nfp/nfp_ethdev.c +++ b/drivers/net/nfp/nfp_ethdev.c @@ -1081,16 +1081,18 @@ nfp_net_init(struct rte_eth_dev *eth_dev, static int nfp_fw_get_name(struct rte_pci_device *dev, - struct nfp_nsp *nsp, - char *card, + struct nfp_cpp *cpp, + struct nfp_eth_table *nfp_eth_table, + struct nfp_hwinfo *hwinfo, char *fw_name, size_t fw_size) { char serial[40]; uint16_t interface; + char card_desc[100]; uint32_t cpp_serial_len; + const char *nfp_fw_model; const uint8_t *cpp_serial; - struct nfp_cpp *cpp = nfp_nsp_cpp(nsp); cpp_serial_len = nfp_cpp_serial(cpp, &cpp_serial); if (cpp_serial_len != NFP_SERIAL_LEN) @@ -1119,8 +1121,20 @@ nfp_fw_get_name(struct rte_pci_device *dev, if (access(fw_name, F_OK) == 0) return 0; + nfp_fw_model = nfp_hwinfo_lookup(hwinfo, "nffw.partno"); + if (nfp_fw_model == NULL) { + nfp_fw_model = nfp_hwinfo_lookup(hwinfo, "assembly.partno"); + if (nfp_fw_model == NULL) { + PMD_DRV_LOG(ERR, "firmware model NOT found"); + return -EIO; + } + } + /* Finally try the card type and media */ - snprintf(fw_name, fw_size, "%s/%s", DEFAULT_FW_PATH, card); + snprintf(card_desc, sizeof(card_desc), "nic_%s_%dx%d.nffw", + nfp_fw_model, nfp_eth_table->count, + nfp_eth_table->ports[0].speed / 1000); + snprintf(fw_name, fw_size, "%s/%s", DEFAULT_FW_PATH, card_desc); PMD_DRV_LOG(DEBUG, "Trying with fw file: %s", fw_name); if (access(fw_name, F_OK) == 0) return 0; @@ -1364,49 +1378,20 @@ nfp_fw_setup(struct rte_pci_device *dev, { int err; char fw_name[125]; - char card_desc[100]; struct nfp_nsp *nsp; - const char *nfp_fw_model; - - nfp_fw_model = nfp_hwinfo_lookup(hwinfo, "nffw.partno"); - if (nfp_fw_model == NULL) - nfp_fw_model = nfp_hwinfo_lookup(hwinfo, "assembly.partno"); - - if (nfp_fw_model != NULL) { - PMD_DRV_LOG(INFO, "firmware model found: %s", nfp_fw_model); - } else { - PMD_DRV_LOG(ERR, "firmware model NOT found"); - return -EIO; - } - if (nfp_eth_table->count == 0 || nfp_eth_table->count > 8) { - PMD_DRV_LOG(ERR, "NFP ethernet table reports wrong ports: %u", - nfp_eth_table->count); - return -EIO; + err = nfp_fw_get_name(dev, cpp, nfp_eth_table, hwinfo, fw_name, sizeof(fw_name)); + if (err != 0) { + PMD_DRV_LOG(ERR, "Can't find suitable firmware."); + return err; } - PMD_DRV_LOG(INFO, "NFP ethernet port table reports %u ports", - nfp_eth_table->count); - - PMD_DRV_LOG(INFO, "Port speed: %u", nfp_eth_table->ports[0].speed); - - snprintf(card_desc, sizeof(card_desc), "nic_%s_%dx%d.nffw", - nfp_fw_model, nfp_eth_table->count, - nfp_eth_table->ports[0].speed / 1000); - nsp = nfp_nsp_open(cpp); if (nsp == NULL) { PMD_DRV_LOG(ERR, "NFP error when obtaining NSP handle"); return -EIO; } - err = nfp_fw_get_name(dev, nsp, card_desc, fw_name, sizeof(fw_name)); - if (err != 0) { - PMD_DRV_LOG(ERR, "Can't find suitable firmware."); - nfp_nsp_close(nsp); - return err; - } - if (multi_pf->enabled) err = nfp_fw_reload_for_multi_pf(nsp, fw_name, cpp, dev_info, multi_pf, force_reload_fw); @@ -1853,6 +1838,13 @@ nfp_pf_init(struct rte_pci_device *pci_dev) goto hwinfo_cleanup; } + if (nfp_eth_table->count == 0 || nfp_eth_table->count > 8) { + PMD_INIT_LOG(ERR, "NFP ethernet table reports wrong ports: %u", + nfp_eth_table->count); + ret = -EIO; + goto eth_table_cleanup; + } + pf_dev->multi_pf.enabled = nfp_check_multi_pf_from_nsp(pci_dev, cpp); pf_dev->multi_pf.function_id = function_id; -- 2.39.1
[PATCH 2/2] net/nfp: support new firmware name scheme
Now all application firmware is indifferent of port speed, so do not bother to compose the firmware name with media info. This will reduce a number of symlinks for firmware files. The logic of firmware name with media info still kept for compatibility. Signed-off-by: Chaoyong He Reviewed-by: Long Wu Reviewed-by: Peng Zhang --- drivers/net/nfp/nfp_ethdev.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/drivers/net/nfp/nfp_ethdev.c b/drivers/net/nfp/nfp_ethdev.c index 771137db92..74d4a726df 100644 --- a/drivers/net/nfp/nfp_ethdev.c +++ b/drivers/net/nfp/nfp_ethdev.c @@ -1130,6 +1130,13 @@ nfp_fw_get_name(struct rte_pci_device *dev, } } + /* And then try the model name */ + snprintf(card_desc, sizeof(card_desc), "%s.nffw", nfp_fw_model); + snprintf(fw_name, fw_size, "%s/%s", DEFAULT_FW_PATH, card_desc); + PMD_DRV_LOG(DEBUG, "Trying with fw file: %s", fw_name); + if (access(fw_name, F_OK) == 0) + return 0; + /* Finally try the card type and media */ snprintf(card_desc, sizeof(card_desc), "nic_%s_%dx%d.nffw", nfp_fw_model, nfp_eth_table->count, -- 2.39.1
[PATCH 0/4] support generic flow item
This patch series add the support of some generic flow items, namely flow items with a NULL 'item->spec' value, including: - ETH flow item - TCP flow item - UDP flow item - SCTP flow item Chaoyong He (4): net/nfp: support generic ETH flow item net/nfp: support generic TCP flow item net/nfp: support generic UDP flow item net/nfp: support generic SCTP flow item drivers/net/nfp/flower/nfp_flower_flow.c | 126 +++ 1 file changed, 83 insertions(+), 43 deletions(-) -- 2.39.1
[PATCH 1/4] net/nfp: support generic ETH flow item
Add support of ETH flow item with a NULL 'item->spec' value. Signed-off-by: Chaoyong He Reviewed-by: Long Wu Reviewed-by: Peng Zhang --- drivers/net/nfp/flower/nfp_flower_flow.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/net/nfp/flower/nfp_flower_flow.c b/drivers/net/nfp/flower/nfp_flower_flow.c index ae3f25e410..098a714ea5 100644 --- a/drivers/net/nfp/flower/nfp_flower_flow.c +++ b/drivers/net/nfp/flower/nfp_flower_flow.c @@ -1230,10 +1230,9 @@ nfp_flow_merge_eth(__rte_unused struct nfp_app_fw_flower *app_fw_flower, } eth->mpls_lse = 0; - -eth_end: *mbuf_off += sizeof(struct nfp_flower_mac_mpls); +eth_end: return 0; } -- 2.39.1
[PATCH 2/4] net/nfp: support generic TCP flow item
Add support of TCP flow item with a NULL 'item->spec' value. Signed-off-by: Chaoyong He Reviewed-by: Long Wu Reviewed-by: Peng Zhang --- drivers/net/nfp/flower/nfp_flower_flow.c | 45 +++- 1 file changed, 28 insertions(+), 17 deletions(-) diff --git a/drivers/net/nfp/flower/nfp_flower_flow.c b/drivers/net/nfp/flower/nfp_flower_flow.c index 098a714ea5..810f55f805 100644 --- a/drivers/net/nfp/flower/nfp_flower_flow.c +++ b/drivers/net/nfp/flower/nfp_flower_flow.c @@ -1316,11 +1316,6 @@ nfp_flow_merge_ipv4(__rte_unused struct nfp_app_fw_flower *app_fw_flower, ipv4_udp_tun->ipv4.dst = hdr->dst_addr; } } else { - if (spec == NULL) { - PMD_DRV_LOG(DEBUG, "nfp flow merge ipv4: no item->spec!"); - goto ipv4_end; - } - /* * Reserve space for L4 info. * rte_flow has ipv4 before L4 but NFP flower fw requires L4 before ipv4. @@ -1328,6 +1323,11 @@ nfp_flow_merge_ipv4(__rte_unused struct nfp_app_fw_flower *app_fw_flower, if ((meta_tci->nfp_flow_key_layer & NFP_FLOWER_LAYER_TP) != 0) *mbuf_off += sizeof(struct nfp_flower_tp_ports); + if (spec == NULL) { + PMD_DRV_LOG(DEBUG, "nfp flow merge ipv4: no item->spec!"); + goto ipv4_end; + } + hdr = is_mask ? &mask->hdr : &spec->hdr; ipv4 = (struct nfp_flower_ipv4 *)*mbuf_off; @@ -1399,11 +1399,6 @@ nfp_flow_merge_ipv6(__rte_unused struct nfp_app_fw_flower *app_fw_flower, sizeof(ipv6_udp_tun->ipv6.ipv6_dst)); } } else { - if (spec == NULL) { - PMD_DRV_LOG(DEBUG, "nfp flow merge ipv6: no item->spec!"); - goto ipv6_end; - } - /* * Reserve space for L4 info. * rte_flow has ipv6 before L4 but NFP flower fw requires L4 before ipv6. @@ -1411,6 +1406,11 @@ nfp_flow_merge_ipv6(__rte_unused struct nfp_app_fw_flower *app_fw_flower, if ((meta_tci->nfp_flow_key_layer & NFP_FLOWER_LAYER_TP) != 0) *mbuf_off += sizeof(struct nfp_flower_tp_ports); + if (spec == NULL) { + PMD_DRV_LOG(DEBUG, "nfp flow merge ipv6: no item->spec!"); + goto ipv6_end; + } + hdr = is_mask ? &mask->hdr : &spec->hdr; vtc_flow = rte_be_to_cpu_32(hdr->vtc_flow); ipv6 = (struct nfp_flower_ipv6 *)*mbuf_off; @@ -1445,23 +1445,34 @@ nfp_flow_merge_tcp(__rte_unused struct nfp_app_fw_flower *app_fw_flower, const struct rte_flow_item_tcp *mask; struct nfp_flower_meta_tci *meta_tci; - spec = item->spec; - if (spec == NULL) { - PMD_DRV_LOG(DEBUG, "nfp flow merge tcp: no item->spec!"); - return 0; - } - meta_tci = (struct nfp_flower_meta_tci *)nfp_flow->payload.unmasked_data; if ((meta_tci->nfp_flow_key_layer & NFP_FLOWER_LAYER_IPV4) != 0) { ipv4 = (struct nfp_flower_ipv4 *) (*mbuf_off - sizeof(struct nfp_flower_ipv4)); + if (is_mask) + ipv4->ip_ext.proto = 0xFF; + else + ipv4->ip_ext.proto = IPPROTO_TCP; ports = (struct nfp_flower_tp_ports *) ((char *)ipv4 - sizeof(struct nfp_flower_tp_ports)); - } else { /* IPv6 */ + } else if ((meta_tci->nfp_flow_key_layer & NFP_FLOWER_LAYER_IPV6) != 0) { ipv6 = (struct nfp_flower_ipv6 *) (*mbuf_off - sizeof(struct nfp_flower_ipv6)); + if (is_mask) + ipv6->ip_ext.proto = 0xFF; + else + ipv6->ip_ext.proto = IPPROTO_TCP; ports = (struct nfp_flower_tp_ports *) ((char *)ipv6 - sizeof(struct nfp_flower_tp_ports)); + } else { + PMD_DRV_LOG(ERR, "nfp flow merge tcp: no L3 layer!"); + return -EINVAL; + } + + spec = item->spec; + if (spec == NULL) { + PMD_DRV_LOG(DEBUG, "nfp flow merge tcp: no item->spec!"); + return 0; } mask = item->mask ? item->mask : proc->mask_default; -- 2.39.1
[PATCH 3/4] net/nfp: support generic UDP flow item
Add support of UDP flow item with a NULL 'item->spec' value. Signed-off-by: Chaoyong He Reviewed-by: Long Wu Reviewed-by: Peng Zhang --- drivers/net/nfp/flower/nfp_flower_flow.c | 41 1 file changed, 28 insertions(+), 13 deletions(-) diff --git a/drivers/net/nfp/flower/nfp_flower_flow.c b/drivers/net/nfp/flower/nfp_flower_flow.c index 810f55f805..4cbdfd02b8 100644 --- a/drivers/net/nfp/flower/nfp_flower_flow.c +++ b/drivers/net/nfp/flower/nfp_flower_flow.c @@ -1522,18 +1522,13 @@ nfp_flow_merge_udp(__rte_unused struct nfp_app_fw_flower *app_fw_flower, bool is_mask, bool is_outer_layer) { - char *ports_off; struct nfp_flower_tp_ports *ports; + struct nfp_flower_ipv4 *ipv4 = NULL; + struct nfp_flower_ipv6 *ipv6 = NULL; const struct rte_flow_item_udp *spec; const struct rte_flow_item_udp *mask; struct nfp_flower_meta_tci *meta_tci; - spec = item->spec; - if (spec == NULL) { - PMD_DRV_LOG(DEBUG, "nfp flow merge udp: no item->spec!"); - return 0; - } - /* Don't add L4 info if working on a inner layer pattern */ if (!is_outer_layer) { PMD_DRV_LOG(INFO, "Detected inner layer UDP, skipping."); @@ -1542,13 +1537,33 @@ nfp_flow_merge_udp(__rte_unused struct nfp_app_fw_flower *app_fw_flower, meta_tci = (struct nfp_flower_meta_tci *)nfp_flow->payload.unmasked_data; if ((meta_tci->nfp_flow_key_layer & NFP_FLOWER_LAYER_IPV4) != 0) { - ports_off = *mbuf_off - sizeof(struct nfp_flower_ipv4) - - sizeof(struct nfp_flower_tp_ports); - } else {/* IPv6 */ - ports_off = *mbuf_off - sizeof(struct nfp_flower_ipv6) - - sizeof(struct nfp_flower_tp_ports); + ipv4 = (struct nfp_flower_ipv4 *) + (*mbuf_off - sizeof(struct nfp_flower_ipv4)); + if (is_mask) + ipv4->ip_ext.proto = 0xFF; + else + ipv4->ip_ext.proto = IPPROTO_UDP; + ports = (struct nfp_flower_tp_ports *) + ((char *)ipv4 - sizeof(struct nfp_flower_tp_ports)); + } else if ((meta_tci->nfp_flow_key_layer & NFP_FLOWER_LAYER_IPV6) != 0) { + ipv6 = (struct nfp_flower_ipv6 *) + (*mbuf_off - sizeof(struct nfp_flower_ipv6)); + if (is_mask) + ipv6->ip_ext.proto = 0xFF; + else + ipv6->ip_ext.proto = IPPROTO_UDP; + ports = (struct nfp_flower_tp_ports *) + ((char *)ipv6 - sizeof(struct nfp_flower_tp_ports)); + } else { + PMD_DRV_LOG(ERR, "nfp flow merge udp: no L3 layer!"); + return -EINVAL; + } + + spec = item->spec; + if (spec == NULL) { + PMD_DRV_LOG(DEBUG, "nfp flow merge udp: no item->spec!"); + return 0; } - ports = (struct nfp_flower_tp_ports *)ports_off; mask = item->mask ? item->mask : proc->mask_default; if (is_mask) { -- 2.39.1
[PATCH 4/4] net/nfp: support generic SCTP flow item
Add support of SCTP flow item with a NULL 'item->spec' value. Signed-off-by: Chaoyong He Reviewed-by: Long Wu Reviewed-by: Peng Zhang --- drivers/net/nfp/flower/nfp_flower_flow.c | 37 +--- 1 file changed, 26 insertions(+), 11 deletions(-) diff --git a/drivers/net/nfp/flower/nfp_flower_flow.c b/drivers/net/nfp/flower/nfp_flower_flow.c index 4cbdfd02b8..bd77807db0 100644 --- a/drivers/net/nfp/flower/nfp_flower_flow.c +++ b/drivers/net/nfp/flower/nfp_flower_flow.c @@ -1586,28 +1586,43 @@ nfp_flow_merge_sctp(__rte_unused struct nfp_app_fw_flower *app_fw_flower, bool is_mask, __rte_unused bool is_outer_layer) { - char *ports_off; struct nfp_flower_tp_ports *ports; + struct nfp_flower_ipv4 *ipv4 = NULL; + struct nfp_flower_ipv6 *ipv6 = NULL; struct nfp_flower_meta_tci *meta_tci; const struct rte_flow_item_sctp *spec; const struct rte_flow_item_sctp *mask; + meta_tci = (struct nfp_flower_meta_tci *)nfp_flow->payload.unmasked_data; + if ((meta_tci->nfp_flow_key_layer & NFP_FLOWER_LAYER_IPV4) != 0) { + ipv4 = (struct nfp_flower_ipv4 *) + (*mbuf_off - sizeof(struct nfp_flower_ipv4)); + if (is_mask) + ipv4->ip_ext.proto = 0xFF; + else + ipv4->ip_ext.proto = IPPROTO_SCTP; + ports = (struct nfp_flower_tp_ports *) + ((char *)ipv4 - sizeof(struct nfp_flower_tp_ports)); + } else if ((meta_tci->nfp_flow_key_layer & NFP_FLOWER_LAYER_IPV6) != 0) { + ipv6 = (struct nfp_flower_ipv6 *) + (*mbuf_off - sizeof(struct nfp_flower_ipv6)); + if (is_mask) + ipv6->ip_ext.proto = 0xFF; + else + ipv6->ip_ext.proto = IPPROTO_SCTP; + ports = (struct nfp_flower_tp_ports *) + ((char *)ipv6 - sizeof(struct nfp_flower_tp_ports)); + } else { + PMD_DRV_LOG(ERR, "nfp flow merge sctp: no L3 layer!"); + return -EINVAL; + } + spec = item->spec; if (spec == NULL) { PMD_DRV_LOG(DEBUG, "nfp flow merge sctp: no item->spec!"); return 0; } - meta_tci = (struct nfp_flower_meta_tci *)nfp_flow->payload.unmasked_data; - if ((meta_tci->nfp_flow_key_layer & NFP_FLOWER_LAYER_IPV4) != 0) { - ports_off = *mbuf_off - sizeof(struct nfp_flower_ipv4) - - sizeof(struct nfp_flower_tp_ports); - } else { /* IPv6 */ - ports_off = *mbuf_off - sizeof(struct nfp_flower_ipv6) - - sizeof(struct nfp_flower_tp_ports); - } - ports = (struct nfp_flower_tp_ports *)ports_off; - mask = item->mask ? item->mask : proc->mask_default; if (is_mask) { ports->port_src = mask->hdr.src_port; -- 2.39.1
[PATCH] eal: speed up dpdk init time
If we have a lot of huge pages in system, the memory init will cost long time in legacy-mem mode. For example, we have 120G memory in unit of 2MB hugepage, the env init will cost 43s. Almost half of time spent on find_numasocket, since the address in /proc/self/numa_maps is orderd, we can sort hugepg_tbl by orig_va first and then just read numa_maps line by line is enough to find socket. In my test, spent time reduced to 19s. Signed-off-by: Fengnan Chang --- lib/eal/linux/eal_memory.c | 115 +++-- 1 file changed, 72 insertions(+), 43 deletions(-) diff --git a/lib/eal/linux/eal_memory.c b/lib/eal/linux/eal_memory.c index 45879ca743..28cc136ac0 100644 --- a/lib/eal/linux/eal_memory.c +++ b/lib/eal/linux/eal_memory.c @@ -414,7 +414,7 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi, static int find_numasocket(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi) { - int socket_id; + int socket_id = -1; char *end, *nodestr; unsigned i, hp_count = 0; uint64_t virt_addr; @@ -432,54 +432,61 @@ find_numasocket(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi) snprintf(hugedir_str, sizeof(hugedir_str), "%s/%s", hpi->hugedir, eal_get_hugefile_prefix()); - /* parse numa map */ - while (fgets(buf, sizeof(buf), f) != NULL) { - - /* ignore non huge page */ - if (strstr(buf, " huge ") == NULL && + /* if we find this page in our mappings, set socket_id */ + for (i = 0; i < hpi->num_pages[0]; i++) { + void *va = NULL; + /* parse numa map */ + while (fgets(buf, sizeof(buf), f) != NULL) { + if (strstr(buf, " huge ") == NULL && strstr(buf, hugedir_str) == NULL) - continue; - - /* get zone addr */ - virt_addr = strtoull(buf, &end, 16); - if (virt_addr == 0 || end == buf) { - EAL_LOG(ERR, "%s(): error in numa_maps parsing", __func__); - goto error; - } + continue; + /* get zone addr */ + virt_addr = strtoull(buf, &end, 16); + if (virt_addr == 0 || end == buf) { + EAL_LOG(ERR, "error in numa_maps parsing"); + goto error; + } - /* get node id (socket id) */ - nodestr = strstr(buf, " N"); - if (nodestr == NULL) { - EAL_LOG(ERR, "%s(): error in numa_maps parsing", __func__); - goto error; - } - nodestr += 2; - end = strstr(nodestr, "="); - if (end == NULL) { - EAL_LOG(ERR, "%s(): error in numa_maps parsing", __func__); - goto error; - } - end[0] = '\0'; - end = NULL; + /* get node id (socket id) */ + nodestr = strstr(buf, " N"); + if (nodestr == NULL) { + EAL_LOG(ERR, "error in numa_maps parsing"); + goto error; + } + nodestr += 2; + end = strstr(nodestr, "="); + if (end == NULL) { + EAL_LOG(ERR, "error in numa_maps parsing"); + goto error; + } + end[0] = '\0'; + end = NULL; - socket_id = strtoul(nodestr, &end, 0); - if ((nodestr[0] == '\0') || (end == NULL) || (*end != '\0')) { - EAL_LOG(ERR, "%s(): error in numa_maps parsing", __func__); - goto error; + socket_id = strtoul(nodestr, &end, 0); + if ((nodestr[0] == '\0') || (end == NULL) || (*end != '\0')) { + EAL_LOG(ERR, "error in numa_maps parsing"); + goto error; + } + va = (void *)(unsigned long)virt_addr; + if (hugepg_tbl[i].orig_va != va) { + EAL_LOG(DEBUG, "search %p not seq, let's start from begin", + hugepg_tbl[i].orig_va); + fseek(f, 0, SEEK_SET); + } else { + break; + } } - - /* if we find this page in our mappings, set socket_id */ - for (i = 0; i < hpi->num_pages[0]; i++) { - void *va = (void *)(unsigned long)virt_addr; -
[PATCH v5] eal/x86: improve rte_memcpy const size 16 performance
When the rte_memcpy() size is 16, the same 16 bytes are copied twice. In the case where the size is known to be 16 at build tine, omit the duplicate copy. Reduced the amount of effectively copy-pasted code by using #ifdef inside functions instead of outside functions. Depends-on: series-31578 ("provide toolchain abstracted __builtin_constant_p") Suggested-by: Stephen Hemminger Signed-off-by: Morten Brørup Acked-by: Bruce Richardson --- v6: * Don't wrap depends on line. It seems not to have been understood. v5: * Fix for building with MSVC: Use __rte_constant() instead of __builtin_constant_p(). Add dependency on patch providing __rte_constant(). v4: * There are no problems compiling AVX2, only AVX. (Bruce Richardson) v3: * AVX2 is a superset of AVX; for a block of AVX code, testing for AVX suffices. (Bruce Richardson) * Define RTE_MEMCPY_AVX if AVX is available, to avoid copy-pasting the check for older GCC version. (Bruce Richardson) v2: * For GCC, version 11 is required for proper AVX handling; if older GCC version, treat AVX as SSE. Clang does not have this issue. Note: Original code always treated AVX as SSE, regardless of compiler. * Do not add copyright. (Stephen Hemminger) --- lib/eal/x86/include/rte_memcpy.h | 239 +-- 1 file changed, 64 insertions(+), 175 deletions(-) diff --git a/lib/eal/x86/include/rte_memcpy.h b/lib/eal/x86/include/rte_memcpy.h index 72a92290e0..1619a8f296 100644 --- a/lib/eal/x86/include/rte_memcpy.h +++ b/lib/eal/x86/include/rte_memcpy.h @@ -27,6 +27,16 @@ extern "C" { #pragma GCC diagnostic ignored "-Wstringop-overflow" #endif +/* + * GCC older than version 11 doesn't compile AVX properly, so use SSE instead. + * There are no problems with AVX2. + */ +#if defined __AVX2__ +#define RTE_MEMCPY_AVX +#elif defined __AVX__ && !(defined(RTE_TOOLCHAIN_GCC) && (GCC_VERSION < 11)) +#define RTE_MEMCPY_AVX +#endif + /** * Copy bytes from one location to another. The locations must not overlap. * @@ -91,14 +101,6 @@ rte_mov15_or_less(void *dst, const void *src, size_t n) return ret; } -#if defined __AVX512F__ && defined RTE_MEMCPY_AVX512 - -#define ALIGNMENT_MASK 0x3F - -/** - * AVX512 implementation below - */ - /** * Copy 16 bytes from one location to another, * locations should not overlap. @@ -119,10 +121,15 @@ rte_mov16(uint8_t *dst, const uint8_t *src) static __rte_always_inline void rte_mov32(uint8_t *dst, const uint8_t *src) { +#if defined RTE_MEMCPY_AVX __m256i ymm0; ymm0 = _mm256_loadu_si256((const __m256i *)src); _mm256_storeu_si256((__m256i *)dst, ymm0); +#else /* SSE implementation */ + rte_mov16((uint8_t *)dst + 0 * 16, (const uint8_t *)src + 0 * 16); + rte_mov16((uint8_t *)dst + 1 * 16, (const uint8_t *)src + 1 * 16); +#endif } /** @@ -132,10 +139,15 @@ rte_mov32(uint8_t *dst, const uint8_t *src) static __rte_always_inline void rte_mov64(uint8_t *dst, const uint8_t *src) { +#if defined __AVX512F__ && defined RTE_MEMCPY_AVX512 __m512i zmm0; zmm0 = _mm512_loadu_si512((const void *)src); _mm512_storeu_si512((void *)dst, zmm0); +#else /* AVX2, AVX & SSE implementation */ + rte_mov32((uint8_t *)dst + 0 * 32, (const uint8_t *)src + 0 * 32); + rte_mov32((uint8_t *)dst + 1 * 32, (const uint8_t *)src + 1 * 32); +#endif } /** @@ -156,12 +168,18 @@ rte_mov128(uint8_t *dst, const uint8_t *src) static __rte_always_inline void rte_mov256(uint8_t *dst, const uint8_t *src) { - rte_mov64(dst + 0 * 64, src + 0 * 64); - rte_mov64(dst + 1 * 64, src + 1 * 64); - rte_mov64(dst + 2 * 64, src + 2 * 64); - rte_mov64(dst + 3 * 64, src + 3 * 64); + rte_mov128(dst + 0 * 128, src + 0 * 128); + rte_mov128(dst + 1 * 128, src + 1 * 128); } +#if defined __AVX512F__ && defined RTE_MEMCPY_AVX512 + +/** + * AVX512 implementation below + */ + +#define ALIGNMENT_MASK 0x3F + /** * Copy 128-byte blocks from one location to another, * locations should not overlap. @@ -231,12 +249,22 @@ rte_memcpy_generic(void *dst, const void *src, size_t n) /** * Fast way when copy size doesn't exceed 512 bytes */ + if (__rte_constant(n) && n == 32) { + rte_mov32((uint8_t *)dst, (const uint8_t *)src); + return ret; + } if (n <= 32) { rte_mov16((uint8_t *)dst, (const uint8_t *)src); + if (__rte_constant(n) && n == 16) + return ret; /* avoid (harmless) duplicate copy */ rte_mov16((uint8_t *)dst - 16 + n, (const uint8_t *)src - 16 + n); return ret; } + if (__rte_constant(n) && n == 64) { + rte_mov64((uint8_t *)dst, (const uint8_t *)src); + return ret; + } if (n <= 64) { rte_mov32((uint8_t *)dst, (const uint8_t *)src); rte_mov32((uint8_t *)dst
[PATCH v6] eal/x86: improve rte_memcpy const size 16 performance
When the rte_memcpy() size is 16, the same 16 bytes are copied twice. In the case where the size is known to be 16 at build tine, omit the duplicate copy. Reduced the amount of effectively copy-pasted code by using #ifdef inside functions instead of outside functions. Depends-on: series-31578 ("provide toolchain abstracted ...") Suggested-by: Stephen Hemminger Signed-off-by: Morten Brørup Acked-by: Bruce Richardson --- v6: * Don't wrap depends on line. It seems not to have been understood. v5: * Fix for building with MSVC: Use __rte_constant() instead of __builtin_constant_p(). Add dependency on patch providing __rte_constant(). v4: * There are no problems compiling AVX2, only AVX. (Bruce Richardson) v3: * AVX2 is a superset of AVX; for a block of AVX code, testing for AVX suffices. (Bruce Richardson) * Define RTE_MEMCPY_AVX if AVX is available, to avoid copy-pasting the check for older GCC version. (Bruce Richardson) v2: * For GCC, version 11 is required for proper AVX handling; if older GCC version, treat AVX as SSE. Clang does not have this issue. Note: Original code always treated AVX as SSE, regardless of compiler. * Do not add copyright. (Stephen Hemminger) --- lib/eal/x86/include/rte_memcpy.h | 239 +-- 1 file changed, 64 insertions(+), 175 deletions(-) diff --git a/lib/eal/x86/include/rte_memcpy.h b/lib/eal/x86/include/rte_memcpy.h index 72a92290e0..1619a8f296 100644 --- a/lib/eal/x86/include/rte_memcpy.h +++ b/lib/eal/x86/include/rte_memcpy.h @@ -27,6 +27,16 @@ extern "C" { #pragma GCC diagnostic ignored "-Wstringop-overflow" #endif +/* + * GCC older than version 11 doesn't compile AVX properly, so use SSE instead. + * There are no problems with AVX2. + */ +#if defined __AVX2__ +#define RTE_MEMCPY_AVX +#elif defined __AVX__ && !(defined(RTE_TOOLCHAIN_GCC) && (GCC_VERSION < 11)) +#define RTE_MEMCPY_AVX +#endif + /** * Copy bytes from one location to another. The locations must not overlap. * @@ -91,14 +101,6 @@ rte_mov15_or_less(void *dst, const void *src, size_t n) return ret; } -#if defined __AVX512F__ && defined RTE_MEMCPY_AVX512 - -#define ALIGNMENT_MASK 0x3F - -/** - * AVX512 implementation below - */ - /** * Copy 16 bytes from one location to another, * locations should not overlap. @@ -119,10 +121,15 @@ rte_mov16(uint8_t *dst, const uint8_t *src) static __rte_always_inline void rte_mov32(uint8_t *dst, const uint8_t *src) { +#if defined RTE_MEMCPY_AVX __m256i ymm0; ymm0 = _mm256_loadu_si256((const __m256i *)src); _mm256_storeu_si256((__m256i *)dst, ymm0); +#else /* SSE implementation */ + rte_mov16((uint8_t *)dst + 0 * 16, (const uint8_t *)src + 0 * 16); + rte_mov16((uint8_t *)dst + 1 * 16, (const uint8_t *)src + 1 * 16); +#endif } /** @@ -132,10 +139,15 @@ rte_mov32(uint8_t *dst, const uint8_t *src) static __rte_always_inline void rte_mov64(uint8_t *dst, const uint8_t *src) { +#if defined __AVX512F__ && defined RTE_MEMCPY_AVX512 __m512i zmm0; zmm0 = _mm512_loadu_si512((const void *)src); _mm512_storeu_si512((void *)dst, zmm0); +#else /* AVX2, AVX & SSE implementation */ + rte_mov32((uint8_t *)dst + 0 * 32, (const uint8_t *)src + 0 * 32); + rte_mov32((uint8_t *)dst + 1 * 32, (const uint8_t *)src + 1 * 32); +#endif } /** @@ -156,12 +168,18 @@ rte_mov128(uint8_t *dst, const uint8_t *src) static __rte_always_inline void rte_mov256(uint8_t *dst, const uint8_t *src) { - rte_mov64(dst + 0 * 64, src + 0 * 64); - rte_mov64(dst + 1 * 64, src + 1 * 64); - rte_mov64(dst + 2 * 64, src + 2 * 64); - rte_mov64(dst + 3 * 64, src + 3 * 64); + rte_mov128(dst + 0 * 128, src + 0 * 128); + rte_mov128(dst + 1 * 128, src + 1 * 128); } +#if defined __AVX512F__ && defined RTE_MEMCPY_AVX512 + +/** + * AVX512 implementation below + */ + +#define ALIGNMENT_MASK 0x3F + /** * Copy 128-byte blocks from one location to another, * locations should not overlap. @@ -231,12 +249,22 @@ rte_memcpy_generic(void *dst, const void *src, size_t n) /** * Fast way when copy size doesn't exceed 512 bytes */ + if (__rte_constant(n) && n == 32) { + rte_mov32((uint8_t *)dst, (const uint8_t *)src); + return ret; + } if (n <= 32) { rte_mov16((uint8_t *)dst, (const uint8_t *)src); + if (__rte_constant(n) && n == 16) + return ret; /* avoid (harmless) duplicate copy */ rte_mov16((uint8_t *)dst - 16 + n, (const uint8_t *)src - 16 + n); return ret; } + if (__rte_constant(n) && n == 64) { + rte_mov64((uint8_t *)dst, (const uint8_t *)src); + return ret; + } if (n <= 64) { rte_mov32((uint8_t *)dst, (const uint8_t *)src); rte_mov32((uint8_t *)dst - 32 + n, @@ -313
Re: Including contigmem in core dumps
Hi Lewis, 2024-05-27 19:34 (UTC-0500), Lewis Donzis: > I've been wondering why we exclude memory allocated by eal_get_virtual_area() > from core dumps? (More specifically, it calls eal_mem_set_dump() to call > madvise() to disable core dumps from the allocated region.) > > On many occasions, when debugging after a crash, it would have been very > convenient to be able to see the contents of an mbuf or other object > allocated in contigmem space. And we often avoid using the rte memory > allocator just because of this. > > Is there any reason for this, or could it perhaps be a compile-time > configuration option not to call madvise()? Memory reserved by eal_get_virtual_area() is not yet useful, but it is very large, so by excluding it from dumps, DPDK prevents dumps from including large zero-filled parts. It also makes sense to call eal_mem_set_dump(..., false) from eal_memalloc.c:free_seg(), because of --huge-unlink=never: in this mode (Linux-only), freed segments are not cleared, so if they were included into dump, it would be a lot of garbage data. Newly allocated hugepages are not included into dumps because this would make dumps very large by default. However, this could be an opt-in as a runtime option if need be.