[PATCH 0/7] add Nitrox compress device support

2024-03-02 Thread Nagadheeraj Rottela
Add the Nitrox PMD to support Nitrox compress device. --- v5: * Added missing entry for nitrox folder in compress meson.json v4: * Fixed checkpatch warnings. * Updated release notes. v3: * Fixed ABI compatibility issue. v2: * Reformatted patches to minimize number of changes. * Removed empty fil

[PATCH v5 1/7] crypto/nitrox: move common code

2024-03-02 Thread Nagadheeraj Rottela
A new compressdev Nitrox PMD will be added in next few patches. This patch moves some of the common code which is shared across Nitrox crypto and compress drivers to drivers/common/nitrox folder. Signed-off-by: Nagadheeraj Rottela --- MAINTAINERS| 1 + driver

[PATCH v5 2/7] drivers/compress: add Nitrox driver

2024-03-02 Thread Nagadheeraj Rottela
Introduce Nitrox compressdev driver. This patch implements below operations - dev_configure - dev_close - dev_infos_get - private_xform_create - private_xform_free Signed-off-by: Nagadheeraj Rottela --- MAINTAINERS | 7 + doc/guides/compressdevs/features/nitrox

[PATCH v5 3/7] common/nitrox: add compress hardware queue management

2024-03-02 Thread Nagadheeraj Rottela
Added compress device hardware ring initialization. Signed-off-by: Nagadheeraj Rottela --- drivers/common/nitrox/nitrox_csr.h | 12 +++ drivers/common/nitrox/nitrox_hal.c | 116 + drivers/common/nitrox/nitrox_hal.h | 115 drivers/common/n

[PATCH v5 4/7] crypto/nitrox: set queue type during queue pair setup

2024-03-02 Thread Nagadheeraj Rottela
Set queue type as SE to initialize symmetric hardware queue. Signed-off-by: Nagadheeraj Rottela --- drivers/crypto/nitrox/nitrox_sym.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/crypto/nitrox/nitrox_sym.c b/drivers/crypto/nitrox/nitrox_sym.c index 1244317438..03652d3ade 100644

[PATCH v5 6/7] compress/nitrox: support stateless request

2024-03-02 Thread Nagadheeraj Rottela
Implement enqueue and dequeue burst operations for stateless request support. Signed-off-by: Nagadheeraj Rottela --- drivers/compress/nitrox/meson.build | 1 + drivers/compress/nitrox/nitrox_comp.c| 91 ++- drivers/compress/nitrox/nitrox_comp_reqmgr.c | 792 ++

[PATCH v5 5/7] compress/nitrox: add software queue management

2024-03-02 Thread Nagadheeraj Rottela
Added software queue management code corresponding to queue pair setup and release functions. Signed-off-by: Nagadheeraj Rottela --- drivers/compress/nitrox/nitrox_comp.c | 115 +++--- drivers/compress/nitrox/nitrox_comp.h | 1 + 2 files changed, 105 insertions(+), 11 delet

[PATCH v5 7/7] compress/nitrox: support stateful request

2024-03-02 Thread Nagadheeraj Rottela
Implement enqueue and dequeue burst operations for stateful request support. Signed-off-by: Nagadheeraj Rottela --- drivers/compress/nitrox/nitrox_comp.c| 97 +++- drivers/compress/nitrox/nitrox_comp.h| 1 + drivers/compress/nitrox/nitrox_comp_reqmgr.c | 550 --

RE: [PATCH v5 0/4] add pointer compression API

2024-03-02 Thread Morten Brørup
> From: Honnappa Nagarahalli [mailto:honnappa.nagaraha...@arm.com] > Sent: Friday, 1 March 2024 20.57 > > > On Mar 1, 2024, at 5:16 AM, Morten Brørup > wrote: > > > >> From: Konstantin Ananyev [mailto:konstantin.anan...@huawei.com] > >> Sent: Thursday, 22 February 2024 17.16 > >> > >>> For some r

Re: [PATCH v2 00/71] replace use of fixed size rte_mempcy

2024-03-02 Thread Mattias Rönnblom
On 2024-03-01 18:14, Stephen Hemminger wrote: The DPDK has a lot of "cargo cult" usage of rte_memcpy. This patch set replaces cases where rte_memcpy is used with a fixed size constant size. Typical example is: rte_memcpy(mac_addrs, mac.addr_bytes, RTE_ETHER_ADDR_LEN); which can be replac

Re: [PATCH v2 01/71] cocci/rte_memcpy: add script to eliminate fixed size rte_memcpy

2024-03-02 Thread Mattias Rönnblom
On 2024-03-01 18:14, Stephen Hemminger wrote: Rte_memcpy should not be used for the simple case of copying a fix size structure because it is slower and will hide problems from code analysis tools. Coverity, fortify and other analyzers special case memcpy(). Gcc (and Clang) are smart enough to i
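The fixed-size case discussed in this thread can be sketched without DPDK headers. This is a minimal illustration, not code from the patch set: `ETHER_ADDR_LEN`, `struct ether_addr` and `ether_addr_copy_fixed()` are hypothetical stand-ins for the DPDK definitions. It shows a plain memcpy() with a compile-time-constant size, which is the form the message says analyzers special-case.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for RTE_ETHER_ADDR_LEN so the sketch builds without DPDK. */
#define ETHER_ADDR_LEN 6

/* Hypothetical stand-in for a MAC address structure. */
struct ether_addr {
	uint8_t addr_bytes[ETHER_ADDR_LEN];
};

/* Copy a MAC address using memcpy() with a size known at compile time. */
static inline void
ether_addr_copy_fixed(struct ether_addr *dst, const struct ether_addr *src)
{
	assert(dst != NULL && src != NULL);
	memcpy(dst->addr_bytes, src->addr_bytes, ETHER_ADDR_LEN);
}
```

Because the length is a constant, the compiler and static analyzers can both see the exact number of bytes copied.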

Re: [PATCH v2 00/71] replace use of fixed size rte_mempcy

2024-03-02 Thread Mattias Rönnblom
On 2024-03-02 12:14, Mattias Rönnblom wrote: On 2024-03-01 18:14, Stephen Hemminger wrote: The DPDK has a lot of "cargo cult" usage of rte_memcpy. This patch set replaces cases where rte_memcpy is used with a fixed size constant size. Typical example is: rte_memcpy(mac_addrs, mac.addr_bytes

RE: [PATCH v2 00/71] replace use of fixed size rte_mempcy

2024-03-02 Thread Morten Brørup
> From: Mattias Rönnblom [mailto:hof...@lysator.liu.se] > Sent: Saturday, 2 March 2024 13.02 > > On 2024-03-02 12:14, Mattias Rönnblom wrote: > > On 2024-03-01 18:14, Stephen Hemminger wrote: > >> The DPDK has a lot of "cargo cult" usage of rte_memcpy. > >> This patch set replaces cases where rte_

[RFC 1/7] eal: extend bit manipulation functions

2024-03-02 Thread Mattias Rönnblom
Add functionality to test, set, clear, and assign the value to individual bits in 32-bit or 64-bit words. These functions have no implications on memory ordering or atomicity, do not use volatile, and thus do not prevent any compiler optimizations. Signed-off-by: Mattias Rönnblom --- lib/ea

[RFC 4/7] eal: add generic once-type bit operations macros

2024-03-02 Thread Mattias Rönnblom
Add macros for once-type bit operations operating on both 32-bit and 64-bit words by means of C11 generic selection. Signed-off-by: Mattias Rönnblom --- lib/eal/include/rte_bitops.h | 101 +++ 1 file changed, 101 insertions(+) diff --git a/lib/eal/include/rte_bit

[RFC 2/7] eal: add generic bit manipulation macros

2024-03-02 Thread Mattias Rönnblom
Add bit-level test/set/clear/assign macros operating on both 32-bit and 64-bit words by means of C11 generic selection. Signed-off-by: Mattias Rönnblom --- lib/eal/include/rte_bitops.h | 81 1 file changed, 81 insertions(+) diff --git a/lib/eal/include/rte_b
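How C11 generic selection can dispatch one macro name to 32-bit and 64-bit implementations is sketched below. The names `bit_test32`, `bit_set64`, `bit_test`, `bit_set` and so on are hypothetical and are not the identifiers used in the RFC; the sketch only demonstrates the `_Generic` mechanism the message refers to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 32-bit and 64-bit helpers; names are illustrative only. */
static inline bool
bit_test32(const uint32_t *addr, unsigned int nr)
{
	assert(nr < 32);
	return (*addr & (UINT32_C(1) << nr)) != 0;
}

static inline bool
bit_test64(const uint64_t *addr, unsigned int nr)
{
	assert(nr < 64);
	return (*addr & (UINT64_C(1) << nr)) != 0;
}

static inline void
bit_set32(uint32_t *addr, unsigned int nr)
{
	assert(nr < 32);
	*addr |= UINT32_C(1) << nr;
}

static inline void
bit_set64(uint64_t *addr, unsigned int nr)
{
	assert(nr < 64);
	*addr |= UINT64_C(1) << nr;
}

/* _Generic selects the helper from the pointer type of the first argument. */
#define bit_test(addr, nr) \
	_Generic((addr), \
		uint32_t *: bit_test32, \
		const uint32_t *: bit_test32, \
		uint64_t *: bit_test64, \
		const uint64_t *: bit_test64)(addr, nr)

#define bit_set(addr, nr) \
	_Generic((addr), \
		uint32_t *: bit_set32, \
		uint64_t *: bit_set64)(addr, nr)
```

A call such as `bit_set(&word, 3)` then resolves at compile time to the helper matching the word size, and passing a pointer of any other type is a compile error.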

[RFC 3/7] eal: add bit manipulation functions which read or write once

2024-03-02 Thread Mattias Rönnblom
Add bit test/set/clear/assign functions which prevent certain compiler optimizations and guarantee that program-level memory loads and/or stores will actually occur. These functions are useful when interacting with memory-mapped hardware devices. The "once" family of functions does not promise
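One way to obtain the "the load or store actually occurs" behavior described above is to perform the access through a volatile-qualified pointer. The sketch below assumes that approach; the RFC may implement it differently, and the names `bit_once_test32`, `bit_once_set32` and `bit_once_clear32` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Read the word exactly once through a volatile pointer, then test the bit. */
static inline bool
bit_once_test32(const uint32_t *addr, unsigned int nr)
{
	assert(nr < 32);
	const volatile uint32_t *vaddr = addr;
	return (*vaddr & (UINT32_C(1) << nr)) != 0;
}

/* One volatile load followed by one volatile store; the pair is not atomic. */
static inline void
bit_once_set32(uint32_t *addr, unsigned int nr)
{
	assert(nr < 32);
	volatile uint32_t *vaddr = addr;
	uint32_t value = *vaddr;
	*vaddr = value | (UINT32_C(1) << nr);
}

/* Same pattern as the set operation, clearing the bit instead. */
static inline void
bit_once_clear32(uint32_t *addr, unsigned int nr)
{
	assert(nr < 32);
	volatile uint32_t *vaddr = addr;
	uint32_t value = *vaddr;
	*vaddr = value & ~(UINT32_C(1) << nr);
}
```

In this sketch the compiler may not merge, drop, or duplicate the volatile accesses, but the read-modify-write sequence can still be interleaved with accesses from other threads.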

[RFC 5/7] eal: add atomic bit operations

2024-03-02 Thread Mattias Rönnblom
Add atomic bit test/set/clear/assign and test-and-set/clear functions. All atomic bit functions allow (and indeed, require) the caller to specify a memory order. Signed-off-by: Mattias Rönnblom --- lib/eal/include/rte_bitops.h | 337 +++ 1 file changed, 337 inser
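An atomic test-and-set with a caller-supplied memory order can be sketched with standard C11 atomics. The function names below are hypothetical and the RFC's own implementation is not shown in the snippet; this only illustrates the idea of requiring the caller to pass a `memory_order`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Atomically load the word and report whether bit nr is set. */
static inline bool
bit_atomic_test32(_Atomic uint32_t *addr, unsigned int nr, memory_order order)
{
	assert(nr < 32);
	return (atomic_load_explicit(addr, order) & (UINT32_C(1) << nr)) != 0;
}

/* Atomically set bit nr and return its previous value. */
static inline bool
bit_atomic_test_and_set32(_Atomic uint32_t *addr, unsigned int nr,
			  memory_order order)
{
	assert(nr < 32);
	uint32_t mask = UINT32_C(1) << nr;
	return (atomic_fetch_or_explicit(addr, mask, order) & mask) != 0;
}

/* Atomically clear bit nr and return its previous value. */
static inline bool
bit_atomic_test_and_clear32(_Atomic uint32_t *addr, unsigned int nr,
			    memory_order order)
{
	assert(nr < 32);
	uint32_t mask = UINT32_C(1) << nr;
	return (atomic_fetch_and_explicit(addr, ~mask, order) & mask) != 0;
}
```

Making the memory order an explicit parameter leaves the choice between, for example, relaxed and acquire/release semantics to each call site.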

[RFC 0/7] Improve EAL bit operations API

2024-03-02 Thread Mattias Rönnblom
This patch set represents an attempt to improve and extend the RTE bitops API, in particular for functions that operate on individual bits. RFCv1 is submitted primarily to 1) receive general feedback on whether improvements in this area are worth working on, and 2) receive feedback on the API. No test c

[RFC 6/7] eal: add generic atomic bit operations

2024-03-02 Thread Mattias Rönnblom
Add atomic bit-level test/set/clear/assign macros operating on both 32-bit and 64-bit words by means of C11 generic selection. Signed-off-by: Mattias Rönnblom --- lib/eal/include/rte_bitops.h | 125 +++ 1 file changed, 125 insertions(+) diff --git a/lib/eal/inclu

[RFC 7/7] eal: deprecate relaxed family of bit operations

2024-03-02 Thread Mattias Rönnblom
Informally (by means of documentation) deprecate the rte_bit_relaxed_*() family of bit-level operations. rte_bit_relaxed_*() has been replaced by three new families of bit-level query and manipulation functions. rte_bit_relaxed_*() failed to deliver the atomicity guarantees their name suggested.

Re: [PATCH v2 00/71] replace use of fixed size rte_mempcy

2024-03-02 Thread Stephen Hemminger
On Sat, 2 Mar 2024 13:01:51 +0100 Mattias Rönnblom wrote: > I ran some DSW benchmarks, and if you add > > diff --git a/lib/eal/x86/include/rte_memcpy.h > b/lib/eal/x86/include/rte_memcpy.h > index 72a92290e0..64cd82d78d 100644 > --- a/lib/eal/x86/include/rte_memcpy.h > +++ b/lib/eal/x86/include

Re: [PATCH v2 00/71] replace use of fixed size rte_mempcy

2024-03-02 Thread Stephen Hemminger
On Sat, 2 Mar 2024 14:05:45 +0100 Morten Brørup wrote: > > > > > My experience with replacing rte_memcpy() with memcpy() (or vice > > versa) > > > is mixed. > > > > > > I've also tried just dropping the DPDK-custom memcpy() implementation > > > altogether, and that caused a performance dro

Re: [PATCH v2 01/71] cocci/rte_memcpy: add script to eliminate fixed size rte_memcpy

2024-03-02 Thread Stephen Hemminger
On Sat, 2 Mar 2024 12:19:13 +0100 Mattias Rönnblom wrote: > On 2024-03-01 18:14, Stephen Hemminger wrote: > > Rte_memcpy should not be used for the simple case of copying > > a fix size structure because it is slower and will hide problems > > from code analysis tools. Coverity, fortify and other

Re: [RFC 1/7] eal: extend bit manipulation functions

2024-03-02 Thread Stephen Hemminger
On Sat, 2 Mar 2024 14:53:22 +0100 Mattias Rönnblom wrote: > diff --git a/lib/eal/include/rte_bitops.h b/lib/eal/include/rte_bitops.h > index 449565eeae..9a368724d5 100644 > --- a/lib/eal/include/rte_bitops.h > +++ b/lib/eal/include/rte_bitops.h > @@ -2,6 +2,7 @@ > * Copyright(c) 2020 Arm Limite

Re: [RFC 7/7] eal: deprecate relaxed family of bit operations

2024-03-02 Thread Stephen Hemminger
On Sat, 2 Mar 2024 14:53:28 +0100 Mattias Rönnblom wrote: > diff --git a/lib/eal/include/rte_bitops.h b/lib/eal/include/rte_bitops.h > index b5a9df5930..783dd0e1ee 100644 > --- a/lib/eal/include/rte_bitops.h > +++ b/lib/eal/include/rte_bitops.h > @@ -1179,6 +1179,10 @@ __RTE_GEN_BIT_ATOMIC_OPS(64

RE: [PATCH v2 00/71] replace use of fixed size rte_mempcy

2024-03-02 Thread Morten Brørup
> From: Stephen Hemminger [mailto:step...@networkplumber.org] > Sent: Saturday, 2 March 2024 17.38 > > On Sat, 2 Mar 2024 14:05:45 +0100 > Morten Brørup wrote: > > > > > > > > My experience with replacing rte_memcpy() with memcpy() (or vice > > > versa) > > > > is mixed. > > > > > > > > I've als

RE: [PATCH v2 01/71] cocci/rte_memcpy: add script to eliminate fixed size rte_memcpy

2024-03-02 Thread Morten Brørup
+To: x86 maintainers, another bug in rte_memcpy() > From: Stephen Hemminger [mailto:step...@networkplumber.org] > Sent: Saturday, 2 March 2024 18.02 > > On Sat, 2 Mar 2024 12:19:13 +0100 > Mattias Rönnblom wrote: > > > On 2024-03-01 18:14, Stephen Hemminger wrote: > > > Rte_memcpy should not be

[PATCH v7] mempool: test performance with larger bursts

2024-03-02 Thread Morten Brørup
Bursts of up to 64, 128 and 256 packets are not uncommon, so increase the maximum tested get and put burst sizes from 32 to 256. For convenience, also test get and put burst sizes of RTE_MEMPOOL_CACHE_MAX_SIZE. Some applications keep more than 512 objects, so increase the maximum number of kept ob

Re: [PATCH] lib/hash,lib/rcu: feature hidden key count in hash

2024-03-02 Thread Abdullah Ömer Yamaç
Sorry for the late reply. I understood what you mean. I will create only the reclaim API for the hash library. Thanks for the explanation. On Wed, Feb 28, 2024 at 5:51 PM Honnappa Nagarahalli < honnappa.nagaraha...@arm.com> wrote: > > > > On Feb 28, 2024, at 5:44 AM, Abdullah Ömer Yamaç > wrote:

[PATCH] rte_memcpy: fix off by one for size 16 and 32

2024-03-02 Thread Stephen Hemminger
The rte_memcpy code would do extra instructions for size 16 and 32 which potentially could reference past end of data. For size of 16, only single mov16 is needed. same for size of 32, only single mov32. Fixes: f5472703c0bd ("eal: optimize aligned memcpy on x86") Fixes: d35cc1fe6a7a ("eal/x86: re

[PATCH v2] lib/hash: feature reclaim defer queue

2024-03-02 Thread Abdullah Ömer Yamaç
This patch adds a new feature to the hash library to allow the user to reclaim the defer queue. This is useful when the user wants to force reclaim resources that are not being used. This API is only available if the RCU is enabled. Signed-off-by: Abdullah Ömer Yamaç Acked-by: Honnappa Nagarahall

[PATCH v2] lib/hash: feature reclaim defer queue

2024-03-02 Thread Abdullah Ömer Yamaç
This patch adds a new feature to the hash library to allow the user to reclaim the defer queue. This is useful when the user wants to force reclaim resources that are not being used. This API is only available if the RCU is enabled. Signed-off-by: Abdullah Ömer Yamaç Acked-by: Honnappa Nagarahall

RE: [PATCH] rte_memcpy: fix off by one for size 16 and 32

2024-03-02 Thread Morten Brørup
I'm also working on a fix. Med venlig hilsen / Kind regards, -Morten Brørup > -Original Message- > From: Stephen Hemminger [mailto:step...@networkplumber.org] > Sent: Saturday, 2 March 2024 21.57 > To: dev@dpdk.org > Cc: Morten Brørup; Bruce Richardson; Konstantin Ananyev; Zhihong Wang; >

[PATCH] eal/x86: improve rte_memcpy const size 16 performance

2024-03-02 Thread Morten Brørup
When the rte_memcpy() size is 16, the same 16 bytes are copied twice. In the case where the size is known to be 16 at build time, omit the duplicate copy. Reduced the amount of effectively copy-pasted code by using #ifdef inside functions instead of outside functions. Suggested-by: Stephen Hemming
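The "copied twice" behavior can be illustrated with a portable sketch of the overlapping-move technique for sizes from 16 to 32 bytes. This is not the x86 rte_memcpy() code: `copy16()` and `copy_16_to_32()` are hypothetical helpers, and the early return here is a run-time check, whereas the patch description limits the shortcut to sizes known at build time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for a single 16-byte vector move. */
static inline void
copy16(uint8_t *dst, const uint8_t *src)
{
	memcpy(dst, src, 16);
}

/*
 * Copy n bytes, 16 <= n <= 32, as two 16-byte moves: one from the start
 * and one ending exactly at byte n. The two moves may overlap, and neither
 * touches memory outside [0, n).
 */
static inline void
copy_16_to_32(uint8_t *dst, const uint8_t *src, size_t n)
{
	assert(n >= 16 && n <= 32);
	copy16(dst, src);
	if (n == 16)
		return; /* the second move would repeat the first one */
	copy16(dst + n - 16, src + n - 16);
}
```

For n equal to 16 the second move would start at offset 0, so without the early return the same 16 bytes are written twice, but nothing past the end of the buffer is read or written.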

RE: [PATCH] rte_memcpy: fix off by one for size 16 and 32

2024-03-02 Thread Morten Brørup
> From: Stephen Hemminger [mailto:step...@networkplumber.org] > Sent: Saturday, 2 March 2024 21.49 > > The rte_memcpy code would do extra instructions for size 16 > and 32 which potentially could reference past end of data. It's a somewhat weird concept, but they don't reference past end of data.

RE: [PATCH] eal/x86: improve rte_memcpy const size 16 performance

2024-03-02 Thread Morten Brørup
Recheck-request: iol-broadcom-Performance Patch only modifies x86 code, but fails performance on aarch64.

Re: [PATCH] eal/x86: improve rte_memcpy const size 16 performance

2024-03-02 Thread Stephen Hemminger
On Sun, 3 Mar 2024 00:48:12 +0100 Morten Brørup wrote: > When the rte_memcpy() size is 16, the same 16 bytes are copied twice. > In the case where the size is knownto be 16 at build tine, omit the > duplicate copy. > > Reduced the amount of effectively copy-pasted code by using #ifdef > inside

Re: [PATCH] eal/x86: improve rte_memcpy const size 16 performance

2024-03-02 Thread Stephen Hemminger
On Sat, 2 Mar 2024 21:40:03 -0800 Stephen Hemminger wrote: > On Sun, 3 Mar 2024 00:48:12 +0100 > Morten Brørup wrote: > > > When the rte_memcpy() size is 16, the same 16 bytes are copied twice. > > In the case where the size is knownto be 16 at build tine, omit the > > duplicate copy. > > > >

Re: [RFC 1/7] eal: extend bit manipulation functions

2024-03-02 Thread Mattias Rönnblom
On 2024-03-02 18:05, Stephen Hemminger wrote: On Sat, 2 Mar 2024 14:53:22 +0100 Mattias Rönnblom wrote: diff --git a/lib/eal/include/rte_bitops.h b/lib/eal/include/rte_bitops.h index 449565eeae..9a368724d5 100644 --- a/lib/eal/include/rte_bitops.h +++ b/lib/eal/include/rte_bitops.h @@ -2,6 +2,

Re: [RFC 7/7] eal: deprecate relaxed family of bit operations

2024-03-02 Thread Mattias Rönnblom
On 2024-03-02 18:07, Stephen Hemminger wrote: On Sat, 2 Mar 2024 14:53:28 +0100 Mattias Rönnblom wrote: diff --git a/lib/eal/include/rte_bitops.h b/lib/eal/include/rte_bitops.h index b5a9df5930..783dd0e1ee 100644 --- a/lib/eal/include/rte_bitops.h +++ b/lib/eal/include/rte_bitops.h @@ -1179,6

Re: [PATCH] rte_memcpy: fix off by one for size 16 and 32

2024-03-02 Thread Mattias Rönnblom
On 2024-03-02 21:56, Stephen Hemminger wrote: On Sat, 2 Mar 2024 12:49:23 -0800 Stephen Hemminger wrote: The rte_memcpy code would do extra instructions for size 16 and 32 which potentially could reference past end of data. For size of 16, only single mov16 is needed. same for size of 32, on

Re: [PATCH] rte_memcpy: fix off by one for size 16 and 32

2024-03-02 Thread Mattias Rönnblom
On 2024-03-02 21:49, Stephen Hemminger wrote: The rte_memcpy code would do extra instructions for size 16 and 32 which potentially could reference past end of data. For size of 16, only single mov16 is needed. same for size of 32, only single mov32. Fixes: f5472703c0bd ("eal: optimize aligned m