Re: [PATCH v4 1/3] ethdev: introduce ethdev desc dump API

2022-10-04 Thread Andrew Rybchenko

On 10/4/22 01:40, Ferruh Yigit wrote:

On 9/23/2022 8:43 AM, Dongdong Liu wrote:



From: "Min Hu (Connor)" 

Added the ethdev Rx/Tx descriptor dump API, which provides functions to query
descriptors from the device. HW descriptor info differs between NICs.
The information demonstrates the I/O process, which is important for debugging.
As the information is different


Overall OK to have these new APIs, please find comments below.

Do you think does it worth to list this as one of the PMD future in 
future list, in 'doc/guides/nics/features.rst' ?


IMHO it does not deserve an entry in features.
It is deep debugging using vendor-specific information.



  between NICs, the new API is introduced.


Signed-off-by: Min Hu (Connor) 
Signed-off-by: Dongdong Liu 
Acked-by: Ray Kinsella 


<...>


  int rte_eth_dev_priv_dump(uint16_t port_id, FILE *file);

+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
+ *
+ * Dump ethdev Rx descriptor info to a file.
+ *
+ * This API is used for debugging, not a dataplane API.
+ *
+ * @param file
+ *   A pointer to a file for output.
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param queue_id
+ *   The selected queue.
+ * @param num
+ *   The number of the descriptors to dump.
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+__rte_experimental
+int rte_eth_rx_hw_desc_dump(FILE *file, uint16_t port_id, uint16_t queue_id,
+			    uint16_t num);


There are other HW desc related APIs, like
'rte_eth_rx_descriptor_status()'.

Should these APIs follow the same naming convention:
'rte_eth_rx_descriptor_dump()'
'rte_eth_tx_descriptor_dump()'


+1 on naming, it should not be bound to HW. SW parts could be
interesting as well.




+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
+ *
+ * Dump ethdev Tx descriptor info to a file.
+ *
+ * This API is used for debugging, not a dataplane API.
+ *
+ * @param file
+ *   A pointer to a file for output.
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param queue_id
+ *   The selected queue.
+ * @param num
+ *   The number of the descriptors to dump.
+ * @return
+ *   - On success, zero.
+ *   - On failure, a negative value.
+ */
+__rte_experimental
+int rte_eth_tx_hw_desc_dump(FILE *file, uint16_t port_id, uint16_t queue_id,
+			    uint16_t num);


'num' is provided, but does it assume the dump starts from offset 0? What do
you think about providing an 'offset' parameter as well?

It may be a use case to start from where the tail/head pointer is.


+
+
  #include 

  /**
diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map
index 03f52fee91..3c7c75b582 100644
--- a/lib/ethdev/version.map
+++ b/lib/ethdev/version.map
@@ -285,6 +285,8 @@ EXPERIMENTAL {
 rte_mtr_color_in_protocol_priority_get;
 rte_mtr_color_in_protocol_set;
 rte_mtr_meter_vlan_table_update;
+   rte_eth_rx_hw_desc_dump;
+   rte_eth_tx_hw_desc_dump;


These new APIs should go after the "# added in 22.11" comment; if you rebase
on top of the latest HEAD, the comment is already there.
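Taken together, the renaming and 'offset' suggestions above could look like the sketch below. This is a hypothetical, self-contained illustration: the function name follows the suggested 'rte_eth_rx_descriptor_dump()' convention, while RING_SIZE, rx_ring[] and the return convention are stand-ins, not the applied ethdev code.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for one Rx queue's HW descriptor ring. */
#define RING_SIZE 16

static uint64_t rx_ring[RING_SIZE];

/* Sketch of the suggested signature: an 'offset' lets the caller dump
 * from an arbitrary position (e.g. the current head/tail) rather than
 * always from descriptor 0. */
static int
rx_descriptor_dump(FILE *file, uint16_t queue_id, uint16_t offset,
		   uint16_t num)
{
	if (offset >= RING_SIZE)
		return -22; /* -EINVAL: offset beyond the ring */
	if (num > RING_SIZE - offset)
		num = RING_SIZE - offset; /* clamp at the ring end */
	for (uint16_t i = 0; i < num; i++)
		fprintf(file, "queue %u desc %u: %016" PRIx64 "\n",
			queue_id, (uint16_t)(offset + i),
			rx_ring[offset + i]);
	return num; /* descriptors actually dumped */
}
```

The clamp-versus-error split here (clamp a too-large 'num', reject a bad 'offset') is one possible contract; a driver could equally reject both.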






Re: [PATCH v5] ethdev: add send to kernel action

2022-10-04 Thread Andrew Rybchenko

On 10/3/22 19:34, Michael Savisko wrote:

In some cases the application may receive a packet that should have been
received by the kernel. In this case the application uses KNI or other means
to transfer the packet to the kernel.

With bifurcated driver we can have a rule to route packets matching
a pattern (example: IPv4 packets) to the DPDK application and the rest
of the traffic will be received by the kernel.
But if we want to receive most of the traffic in DPDK except for a specific
pattern (example: ICMP packets) that should be processed by the kernel,
then it's easier to re-route these packets with a single rule.

This commit introduces new rte_flow action which allows application to
re-route packets directly to the kernel without software involvement.

Add new testpmd rte_flow action 'send_to_kernel'. The application
may use this action to route the packet to the kernel while still
in the HW.

Example with testpmd command:

flow create 0 ingress priority 0 group 1 pattern eth type spec 0x0800
type mask 0x / end actions send_to_kernel / end

Signed-off-by: Michael Savisko 
Acked-by: Ori Kam 


Acked-by: Andrew Rybchenko 

Applied to dpdk-next-net/main, thanks.



Re: [PATCH v7 1/4] ethdev: introduce protocol header API

2022-10-04 Thread Andrew Rybchenko

On 10/4/22 05:21, Wang, YuanX wrote:

Hi Andrew,


-Original Message-
From: Andrew Rybchenko 
Sent: Monday, October 3, 2022 3:04 PM
To: Wang, YuanX ; dev@dpdk.org; Thomas
Monjalon ; Ferruh Yigit ;
Ray Kinsella 
Cc: ferruh.yi...@xilinx.com; Li, Xiaoyun ; Singh, Aman
Deep ; Zhang, Yuying
; Zhang, Qi Z ; Yang,
Qiming ; jerinjac...@gmail.com;
viachesl...@nvidia.com; step...@networkplumber.org; Ding, Xuan
; hpoth...@marvell.com; Tang, Yaqi
; Wenxuan Wu 
Subject: Re: [PATCH v7 1/4] ethdev: introduce protocol header API

On 10/2/22 00:05, Yuan Wang wrote:

Add a new ethdev API to retrieve supported protocol headers of a PMD,
which helps to configure protocol header based buffer split.

Signed-off-by: Yuan Wang 
Signed-off-by: Xuan Ding 
Signed-off-by: Wenxuan Wu 
Reviewed-by: Andrew Rybchenko 
---
   doc/guides/rel_notes/release_22_11.rst |  5 
   lib/ethdev/ethdev_driver.h | 15 
   lib/ethdev/rte_ethdev.c| 33 ++
   lib/ethdev/rte_ethdev.h| 30 +++
   lib/ethdev/version.map |  3 +++
   5 files changed, 86 insertions(+)

diff --git a/doc/guides/rel_notes/release_22_11.rst
b/doc/guides/rel_notes/release_22_11.rst
index 0231959874..6a7474a3d6 100644
--- a/doc/guides/rel_notes/release_22_11.rst
+++ b/doc/guides/rel_notes/release_22_11.rst
@@ -96,6 +96,11 @@ New Features
 * Added ``rte_event_eth_tx_adapter_queue_stop`` to stop the Tx

Adapter

   from enqueueing any packets to the Tx queue.

+* **Added new ethdev API for PMD to get buffer split supported
+  protocol types.**
+
+  * Added ``rte_eth_buffer_split_get_supported_hdr_ptypes()``, to get
+    supported header protocols of a PMD to split.
+

ethdev features should be grouped together in release notes.
I'll fix it on applying if a new version is not required.


We will send a new version. As for the doc changes, I don't understand your
point very well.
Since there will be no new changes to the code within this patch, could you
help adjust the doc?
Thanks very much.


Please, read a comment just after 'New Features' section start.
Hopefully it will make my note clearer.
Anyway, don't worry about it a lot. I can easily fix it on
applying.





[snip]


diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c index
0c2c1088c0..1f0a7f8f3f 100644
--- a/lib/ethdev/rte_ethdev.c
+++ b/lib/ethdev/rte_ethdev.c
@@ -6002,6 +6002,39 @@ rte_eth_dev_priv_dump(uint16_t port_id, FILE *file)
 	return eth_err(port_id, (*dev->dev_ops->eth_dev_priv_dump)(dev, file));
 }

+int
+rte_eth_buffer_split_get_supported_hdr_ptypes(uint16_t port_id,
+					      uint32_t *ptypes, int num)
+{
+	int i, j;
+	struct rte_eth_dev *dev;
+	const uint32_t *all_types;
+
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
+	dev = &rte_eth_devices[port_id];
+
+	if (ptypes == NULL && num > 0) {
+		RTE_ETHDEV_LOG(ERR,
+			"Cannot get ethdev port %u supported header protocol types to NULL when array size is non zero\n",
+			port_id);
+		return -EINVAL;
+	}
+
+	if (*dev->dev_ops->buffer_split_supported_hdr_ptypes_get == NULL)
+		return -ENOTSUP;
+	all_types = (*dev->dev_ops->buffer_split_supported_hdr_ptypes_get)(dev);
+
+	if (!all_types)


Should be compared with NULL explicitly as coding standard says. I can fix it
on applying as well.


Sure, I will fix in v8.
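The ptypes/num contract in the hunk above follows the usual ethdev two-call idiom: query with a NULL array to learn the count, then allocate a buffer and fetch. A self-contained mock of the caller-visible behaviour — the data and the exact return rules here are illustrative assumptions, not the PMD's real ptype list:

```c
#include <assert.h>
#include <stddef.h>

/* Made-up list standing in for a PMD's supported header ptypes. */
static const unsigned int supported[] = { 0x10, 0x110, 0x1110 };
#define N_SUPPORTED 3

/* Mock of the contract: ptypes == NULL with num > 0 is rejected;
 * ptypes == NULL with num == 0 reports how many entries exist;
 * otherwise up to num entries are copied and the total supported
 * count is returned so the caller can detect a too-small buffer. */
static int
get_supported_hdr_ptypes(unsigned int *ptypes, int num)
{
	if (ptypes == NULL && num > 0)
		return -22; /* -EINVAL, as in the hunk above */
	if (ptypes == NULL)
		return N_SUPPORTED; /* query mode */
	for (int i = 0; i < N_SUPPORTED && i < num; i++)
		ptypes[i] = supported[i];
	return N_SUPPORTED;
}
```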



[snip]




[PATCH] raw/skeleton: remove useless check

2022-10-04 Thread David Marchand
As reported by Coverity, this check is pointless since dev is already
dereferenced earlier. Besides, dev is passed by the rawdev layer and
can't be NULL.

Note: the issue was probably present before the incriminated commit.
It is unclear why Coverity would start complaining about this now.

Coverity issue: 380991
Fixes: 8f1d23ece06a ("eal: deprecate RTE_FUNC_PTR_* macros")

Signed-off-by: David Marchand 
---
 drivers/raw/skeleton/skeleton_rawdev.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/raw/skeleton/skeleton_rawdev.c b/drivers/raw/skeleton/skeleton_rawdev.c
index 1d043bec5d..b2ca1cc5cd 100644
--- a/drivers/raw/skeleton/skeleton_rawdev.c
+++ b/drivers/raw/skeleton/skeleton_rawdev.c
@@ -475,9 +475,6 @@ static int skeleton_rawdev_firmware_status_get(struct rte_rawdev *dev,
 
skeldev = skeleton_rawdev_get_priv(dev);
 
-   if (dev == NULL)
-   return -EINVAL;
-
if (status_info)
memcpy(status_info, &skeldev->fw.firmware_state,
sizeof(enum skeleton_firmware_state));
-- 
2.37.3



[PATCH] bus/pci: remove VFIO status log in scan

2022-10-04 Thread David Marchand
Linux EAL triggers a scan on all buses, PCI included.
Once done, it configures VFIO.
Checking for VFIO status in the PCI bus scan is pointless.

Signed-off-by: David Marchand 
---
 drivers/bus/pci/linux/pci.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/drivers/bus/pci/linux/pci.c b/drivers/bus/pci/linux/pci.c
index c8703d52f3..e69595ca2b 100644
--- a/drivers/bus/pci/linux/pci.c
+++ b/drivers/bus/pci/linux/pci.c
@@ -452,11 +452,6 @@ rte_pci_scan(void)
if (!rte_eal_has_pci())
return 0;
 
-#ifdef VFIO_PRESENT
-   if (!pci_vfio_is_enabled())
-   RTE_LOG(DEBUG, EAL, "VFIO PCI modules not loaded\n");
-#endif
-
dir = opendir(rte_pci_get_sysfs_path());
if (dir == NULL) {
RTE_LOG(ERR, EAL, "%s(): opendir failed: %s\n",
-- 
2.37.3



[PATCH] remove prefix to some local macros in apps and examples

2022-10-04 Thread David Marchand
RTE_TEST_[RT]X_DESC_DEFAULT and RTE_TEST_[RT]X_DESC_MAX macros have been
copied in a lot of app/ and examples/ code.
Those macros are local to each program.

They are not related to a DPDK public header/API, drop the RTE_TEST_
prefix.

Signed-off-by: David Marchand 
---
 app/test-pmd/testpmd.c| 12 ++--
 app/test-pmd/testpmd.h|  6 +++---
 app/test/test_link_bonding.c  |  8 
 app/test/test_pmd_perf.c  |  8 
 app/test/test_security_inline_proto.c |  8 
 doc/guides/sample_app_ug/link_status_intr.rst |  2 +-
 examples/bbdev_app/main.c |  8 
 examples/ip_fragmentation/main.c  |  8 
 examples/ip_reassembly/main.c |  8 
 examples/ipv4_multicast/main.c|  8 
 examples/l2fwd-crypto/main.c  |  8 
 examples/l2fwd-event/l2fwd_common.c   |  4 ++--
 examples/l2fwd-event/l2fwd_common.h   |  4 ++--
 examples/l2fwd-event/main.c   |  4 ++--
 examples/l2fwd-jobstats/main.c|  8 
 examples/l2fwd-keepalive/main.c   |  8 
 examples/l2fwd/main.c |  8 
 examples/l3fwd-graph/main.c   |  8 
 examples/l3fwd-power/main.c   |  8 
 examples/l3fwd/l3fwd.h|  4 ++--
 examples/l3fwd/main.c |  6 +++---
 examples/link_status_interrupt/main.c |  8 
 examples/vhost/main.c | 10 +-
 examples/vmdq/main.c  | 16 
 examples/vmdq_dcb/main.c  | 14 +++---
 25 files changed, 97 insertions(+), 97 deletions(-)

diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index de6ad00138..39ee3d331d 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -288,10 +288,10 @@ queueid_t nb_txq = 1; /**< Number of TX queues per port. */
  * Configurable number of RX/TX ring descriptors.
  * Defaults are supplied by drivers via ethdev.
  */
-#define RTE_TEST_RX_DESC_DEFAULT 0
-#define RTE_TEST_TX_DESC_DEFAULT 0
-uint16_t nb_rxd = RTE_TEST_RX_DESC_DEFAULT; /**< Number of RX descriptors. */
-uint16_t nb_txd = RTE_TEST_TX_DESC_DEFAULT; /**< Number of TX descriptors. */
+#define RX_DESC_DEFAULT 0
+#define TX_DESC_DEFAULT 0
+uint16_t nb_rxd = RX_DESC_DEFAULT; /**< Number of RX descriptors. */
+uint16_t nb_txd = TX_DESC_DEFAULT; /**< Number of TX descriptors. */
 
 #define RTE_PMD_PARAM_UNSET -1
 /*
@@ -1719,9 +1719,9 @@ init_config(void)
if (param_total_num_mbufs)
nb_mbuf_per_pool = param_total_num_mbufs;
else {
-   nb_mbuf_per_pool = RTE_TEST_RX_DESC_MAX +
+   nb_mbuf_per_pool = RX_DESC_MAX +
(nb_lcores * mb_mempool_cache) +
-   RTE_TEST_TX_DESC_MAX + MAX_PKT_BURST;
+   TX_DESC_MAX + MAX_PKT_BURST;
nb_mbuf_per_pool *= RTE_MAX_ETHPORTS;
}
 
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index a7b8565a6d..627a42ce3b 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -28,9 +28,6 @@
 
 #define RTE_PORT_ALL(~(portid_t)0x0)
 
-#define RTE_TEST_RX_DESC_MAX2048
-#define RTE_TEST_TX_DESC_MAX2048
-
 #define RTE_PORT_STOPPED(uint16_t)0
 #define RTE_PORT_STARTED(uint16_t)1
 #define RTE_PORT_CLOSED (uint16_t)2
@@ -67,6 +64,9 @@ extern uint8_t cl_quit;
 /* The prefix of the mbuf pool names created by the application. */
 #define MBUF_POOL_NAME_PFX "mb_pool"
 
+#define RX_DESC_MAX2048
+#define TX_DESC_MAX2048
+
 #define MAX_PKT_BURST 512
 #define DEF_PKT_BURST 32
 
diff --git a/app/test/test_link_bonding.c b/app/test/test_link_bonding.c
index 194ed5a7ec..c28c1ada46 100644
--- a/app/test/test_link_bonding.c
+++ b/app/test/test_link_bonding.c
@@ -47,8 +47,8 @@
 #define MBUF_CACHE_SIZE (250)
 #define BURST_SIZE (32)
 
-#define RTE_TEST_RX_DESC_MAX   (2048)
-#define RTE_TEST_TX_DESC_MAX   (2048)
+#define RX_DESC_MAX(2048)
+#define TX_DESC_MAX(2048)
 #define MAX_PKT_BURST  (512)
 #define DEF_PKT_BURST  (16)
 
@@ -225,8 +225,8 @@ test_setup(void)
"Ethernet header struct allocation failed!");
}
 
-   nb_mbuf_per_pool = RTE_TEST_RX_DESC_MAX + DEF_PKT_BURST +
-   RTE_TEST_TX_DESC_MAX + MAX_PKT_BURST;
+   nb_mbuf_per_pool = RX_DESC_MAX + DEF_PKT_BURST +
+   TX_DESC_MAX + MAX_PKT_BURST;
if (test_params->mbuf_pool == NULL) {
test_params->mbuf_pool = rte_pktmbuf_pool_create("MBUF_POOL",
nb_mbuf_per_pool, MBUF_CACHE_SIZE, 0,
diff --git a/app/test/test_pmd_perf.c b/app/test/test_pmd_perf.c
index ec3dc251d1..10dba0da88 100644
--- a/app/t

Re: [PATCH] drivers: suggestion on removing empty version.map files

2022-10-04 Thread Bruce Richardson
On Tue, Oct 04, 2022 at 09:30:39AM +0300, Omer Yamac wrote:
> 
> 
> On 03.10.2022 17:01, Bruce Richardson wrote:
> > On Mon, Oct 03, 2022 at 04:59:18PM +0300, Omer Yamac wrote:
> > > 
> > > 
> > > On 03.10.2022 12:19, Bruce Richardson wrote:
> > > > On Mon, Oct 03, 2022 at 09:52:03AM +0300, Abdullah Ömer Yamaç wrote:
> > > > > In this patch, we remove all version.map files which include
> > > > > only the below part:
> > > > > `DPDK_23 {
> > > > >   local: *;
> > > > > };`
> > > > >
> > > > > Then we modify the meson.build to be able to compile without
> > > > > version.map
> > > > >
> > > > > Signed-off-by: Abdullah Ömer Yamaç 
> > > > > Suggested-by: Ferruh Yigit 
> > > > > ---
> > > >
> > > > I think you need to flag this as depending on us bumping the meson
> > > > version
> > > > requirement up to 0.53 as has been proposed. This doesn't work with 0.4x
> > > > versions.
> > > >
> > > Thanks for your warnings.
> > > Instead of using fs module, I will use python script that checks
> > > file exist
> > > or not.
> > > If it is okay, I will resubmit the patch.
> > 
> > I'd rather not go down that road unless we really need to. Right now the
> > empty version.map files are pretty much harmless, so there is no
> > compelling
> > need to change. Therefore, I'd rather wait to have the meson version
> > bumped
> > to 0.53 and then have this patch applied, without having to worry about
> > using script fallbacks.
> I understood; but one thing I'm not sure what should I do? I don't know how
> can I flag the meson requirement. Is there any special method or just a
> comment in the commit?

You can just put a note in the commit log, under a cut-line indicating what
other patches your patch depends upon.
See https://doc.dpdk.org/guides/contributing/patches.html#patch-dependencies

Regards,
/Bruce
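For reference, the cut-line note Bruce describes usually takes the shape below; the series number and title here are made up for illustration:

```
build: remove empty version.map files

Signed-off-by: ...
---
Depends-on: series-12345 ("build: bump minimum meson version")
```

Patchwork and the CI pick up the "Depends-on:" line because it sits below the `---` cut line, so it never lands in the git history.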


Re: [PATCH] malloc: remove unused function to set limit

2022-10-04 Thread Thomas Monjalon
27/09/2022 13:46, David Marchand:
> This function was never implemented and has been deprecated for a long
> time. We can remove it.
> 
> Signed-off-by: David Marchand 

Applied, thanks.





Re: [PATCH v7 2/4] ethdev: introduce protocol hdr based buffer split

2022-10-04 Thread Andrew Rybchenko

On 10/4/22 05:48, Wang, YuanX wrote:

Hi Andrew,


-Original Message-
On 10/2/22 00:05, Yuan Wang wrote:

+
+   /* skip the payload */


Sorry, it is confusing. What do you mean here?


Because setting n proto_hdr will generate (n+1) segments. If we want to split
the packet into n segments, we only need to check the first (n-1) proto_hdr.
For example, for ETH-IPV4-UDP-PAYLOAD, if we want to split after the UDP
header, we only need to set and check the UDP header in the first segment.

Maybe a mask is not a good way, so we will use an index to filter out the
check of proto_hdr inside the last segment.


I see your point and understand the problem now.
Thinking a bit more about it I realize that consistency check
here should be more sophisticated.
It should not allow:
 - seg1 - length-based, seg2 - proto-based, seg3 - payload
 - seg1 - proto-based, seg2 - length-based, seg3 - proto-based, seg4 - payload

I.e. no protocol-based split after length-based.
But should allow:
 - seg1 - proto-based, seg2 - length-based, seg3 - payload
I.e. length-based split after protocol-based.

Taking the last point above into account, proto_hdr in the last
segment should be 0 like in length-based split (not
RTE_PTYPE_ALL_MASK).

It is an interesting question how to request:
 - seg1 - ETH, seg2 - IPv4, seg3 - UDP, seg4 - payload
Should we really repeat ETH in seg2->proto_hdr and
seg3->proto_hdr, and IPv4 in seg3->proto_hdr again?
I tend to say no, since when the packet comes to seg2 it
already has no ETH header.

If so, how to handle a configuration where ETH is repeated in seg2?
For example:
  - seg1 ETH+IPv4+UDP
  - seg2 ETH+IPv6+UDP
  - seg3 0
Should we deny it, or should we define behaviour like:
if a packet does not match segX proto_hdr, the segment is
skipped and segX+1 is considered.
Of course, not all drivers/HW support it. If so, such a
configuration should be just discarded by the driver itself.
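The consistency rule above — once a segment is length-based, no later non-payload segment may be protocol-based — reduces to a single scan over the segment array. A minimal illustration; the struct and function names only mimic the rte_eth_rxseg_split shape and are assumptions, not the actual ethdev validation code:

```c
#include <assert.h>
#include <stdint.h>

/* Sketch of a split-segment descriptor: proto_hdr == 0 means a
 * length-based segment (or the final payload segment). */
struct seg {
	uint32_t proto_hdr;
};

/* Reject any protocol-based segment appearing after a length-based
 * one. Segments 0..n-2 describe splits; segment n-1 is the payload. */
static int
check_split_config(const struct seg *segs, int n)
{
	int seen_length_based = 0;

	for (int i = 0; i < n - 1; i++) {
		if (segs[i].proto_hdr == 0)
			seen_length_based = 1;
		else if (seen_length_based)
			return -1; /* proto-based after length-based */
	}
	return 0;
}
```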



Re: [PATCH v3] ethdev: add hint when creating async transfer table

2022-10-04 Thread Andrew Rybchenko

On 9/28/22 12:24, Rongwei Liu wrote:

The transfer domain rule is able to match traffic of wire/vf origin,
which implies underlayer resources for both directions.

In customer deployments, they usually match only one direction
traffic in single flow table: either from wire or from vf.

Introduce one new member transfer_mode into rte_flow_template_table_attr
to indicate the flow table direction property: from wire, from vf
or bi-direction(default).

It helps to save underlayer memory and also improves the insertion rate,
and this new field doesn't expose any matching criteria.

By default, the transfer domain is to match bi-direction traffic, and
no behavior changed.

1. Match wire origin only
flow template_table 0 create group 0 priority 0 transfer wire_orig...
2. Match vf origin only
flow template_table 0 create group 0 priority 0 transfer vf_orig...


Since wire_orig and vf_orig are just optional hints and not
all PMDs are obliged to handle them, it does not impose any
matching criteria. So, the examples above are misleading and you
need to add pattern items to highlight that the corresponding rules
are really wire_orig or vf_orig.



Signed-off-by: Rongwei Liu 
Acked-by: Ori Kam 


[snip]



Re: [PATCH v7 0/7] Introduce support for LoongArch architecture

2022-10-04 Thread zhoumin



On Tue, Oct 4, 2022 at 14:59, David Marchand wrote:

On Fri, Sep 30, 2022 at 10:02 AM Min Zhou  wrote:

Dear team,

The following patch set is intended to support DPDK running on LoongArch
architecture.

LoongArch is the general processor architecture of Loongson Corporation
and is a new RISC ISA, which is a bit like MIPS or RISC-V.

The online documents of LoongArch architecture are here:
 https://loongson.github.io/LoongArch-Documentation/README-EN.html

The latest build tools for LoongArch (binary) can be downloaded from:
 https://github.com/loongson/build-tools

v7:
 - rebase the patchset on the main repository
 - add errno.h to rte_power_intrinsics.c according with
   commit 72b452c5f259

I did some comments on patch 1.



Yes, thanks. I have read them carefully and prepare to send the v8 patchset.



I am still considering the patch 7 (hooking into GHA) but the rest
looks good enough to me.



Yes, thanks. The changes in patch 7 are indeed not a good way to add CI
for LoongArch.

As we discussed last weekend, it is better to set up a CI system for
LoongArch and integrate the test results for new patches into Patchwork.
We are building the CI system, but it will take some time.



Could you respin the series?



OK, thanks. I will send the v8 patchset to fix them.




Thanks!


Thanks,

--

Min Zhou



Re: [PATCH v7 1/7] eal/loongarch: support LoongArch architecture

2022-10-04 Thread zhoumin

Hi, David,

Thanks a lot for your helpful reply.


On Tue, Oct 4, 2022 at 01:15, David Marchand wrote:

On Fri, Sep 30, 2022 at 10:02 AM Min Zhou  wrote:

Add all necessary elements for DPDK to compile and run EAL on
LoongArch64 Soc.

This includes:

- EAL library implementation for LoongArch ISA.
- meson build structure for 'loongarch' architecture.
   RTE_ARCH_LOONGARCH define is added for architecture identification.
- xmm_t structure operation stubs as there is no vector support in
   the current version for LoongArch.

Compilation was tested on Debian and CentOS using a loongarch64
cross-compile toolchain from x86 build hosts. Functions were tested
on Loongnix and Kylin, two Linux distributions supporting LoongArch
hosts, based on Linux 4.19 and maintained by Loongson
Corporation.

We also tested DPDK on LoongArch with some external applications,
including: Pktgen-DPDK, OVS, VPP.

The platform is currently marked as linux-only because no OS other
than Linux currently supports LoongArch hosts.

The i40e PMD driver is disabled on LoongArch because of the absence
of vector support in the current version.

Similar to RISC-V, the compilation of following modules has been
disabled by this commit and will be re-enabled in later commits as
fixes are introduced:
net/ixgbe, net/memif, net/tap, example/l3fwd.

Signed-off-by: Min Zhou 
---
  MAINTAINERS   |  6 ++
  app/test/test_xmmt_ops.h  | 12 +++
  .../loongarch/loongarch_loongarch64_linux_gcc | 16 
  config/loongarch/meson.build  | 43 +

Please update devtools/test-meson-builds.sh in this patch.

I tested the compilation of the series per patch (I caught one issue
in net/bnxt which I posted a fix for), with this diff:

@@ -260,6 +260,10 @@ build build-x86-mingw $f skipABI -Dexamples=helloworld
  f=$srcdir/config/arm/arm64_armv8_linux_gcc
  build build-arm64-generic-gcc $f ABI $use_shared

+# generic LoongArch
+f=$srcdir/config/loongarch/loongarch_loongarch64_linux_gcc
+build build-loongarch64-generic-gcc $f ABI $use_shared
+
  # IBM POWER
  f=$srcdir/config/ppc/ppc64le-power8-linux-gcc
  build build-ppc64-power8-gcc $f ABI $use_shared



OK, thanks. It's very helpful. I tried to add them before, but I ran into
some problems during the test. I will add them in the v8 patchset.





  doc/guides/contributing/design.rst|  2 +-
  .../cross_build_dpdk_for_loongarch.rst| 87 +
  doc/guides/linux_gsg/index.rst|  1 +
  doc/guides/nics/features.rst  |  8 ++
  doc/guides/nics/features/default.ini  |  1 +
  doc/guides/rel_notes/release_22_11.rst|  7 ++
  drivers/net/i40e/meson.build  |  6 ++
  drivers/net/ixgbe/meson.build |  6 ++
  drivers/net/memif/meson.build |  6 ++
  drivers/net/tap/meson.build   |  6 ++
  examples/l3fwd/meson.build|  6 ++
  lib/eal/linux/eal_memory.c|  4 +
  lib/eal/loongarch/include/meson.build | 18 
  lib/eal/loongarch/include/rte_atomic.h| 47 ++
  lib/eal/loongarch/include/rte_byteorder.h | 40 
  lib/eal/loongarch/include/rte_cpuflags.h  | 39 
  lib/eal/loongarch/include/rte_cycles.h| 47 ++
  lib/eal/loongarch/include/rte_io.h| 18 
  lib/eal/loongarch/include/rte_memcpy.h| 61 
  lib/eal/loongarch/include/rte_pause.h | 24 +
  .../loongarch/include/rte_power_intrinsics.h  | 20 
  lib/eal/loongarch/include/rte_prefetch.h  | 47 ++
  lib/eal/loongarch/include/rte_rwlock.h| 42 +
  lib/eal/loongarch/include/rte_spinlock.h  | 64 +
  lib/eal/loongarch/include/rte_vect.h  | 65 +
  lib/eal/loongarch/meson.build | 11 +++
  lib/eal/loongarch/rte_cpuflags.c  | 93 +++
  lib/eal/loongarch/rte_cycles.c| 45 +
  lib/eal/loongarch/rte_hypervisor.c| 11 +++
  lib/eal/loongarch/rte_power_intrinsics.c  | 53 +++
  meson.build   |  2 +
  35 files changed, 963 insertions(+), 1 deletion(-)
  create mode 100644 config/loongarch/loongarch_loongarch64_linux_gcc
  create mode 100644 config/loongarch/meson.build
  create mode 100644 doc/guides/linux_gsg/cross_build_dpdk_for_loongarch.rst
  create mode 100644 lib/eal/loongarch/include/meson.build
  create mode 100644 lib/eal/loongarch/include/rte_atomic.h
  create mode 100644 lib/eal/loongarch/include/rte_byteorder.h
  create mode 100644 lib/eal/loongarch/include/rte_cpuflags.h
  create mode 100644 lib/eal/loongarch/include/rte_cycles.h
  create mode 100644 lib/eal/loongarch/include/rte_io.h
  create mode 100644 lib/eal/loongarch/include/rte_memcpy.h
  create mode 100644 lib/eal/loongarch/include/rte_pause.h
  create mode 100644 lib/eal/loongarch/include/

Re: [PATCH v7 0/7] Introduce support for LoongArch architecture

2022-10-04 Thread zhoumin

Hi, David,

Thanks for your kind reply.


On Tue, Oct 4, 2022 at 00:30, David Marchand wrote:

Hello Min,

On Sat, Oct 1, 2022 at 4:26 PM zhoumin  wrote:

I'm sorry, I misunderstood the 'instructions' you mentioned. The process of
making the toolchain is a little complicated, so I made a script to
generate the toolchain from source code. The content of the script is
as follows:


I successfully generated a cross compilation toolchain with this script.

I ran this script in a UBI8 image (iow RHEL8), with the
codeready-builder-for-rhel-8-x86_64-rpms repo enabled and the
following packages installed:
# subscription-manager repos --enable codeready-builder-for-rhel-8-x86_64-rpms
# dnf install bison diffutils file flex gcc gcc-c++ git gmp-devel
libtool make python3 rsync texinfo wget xz zlib-devel



I'm sorry. I forgot to give the dependencies for building the cross
compilation toolchain. These dependencies can be added to the
documentation for LoongArch. Thanks a lot.




The script below works, but it is really better to run it with -e.

# bash -e $script



Yes, thanks. The script runs for a long time, so it is better to run it
with -e in order to exit quickly when an error occurs.






#!/bin/bash

# Prepare the working directories
export SYSDIR=/tmp/la_cross_tools
mkdir -pv ${SYSDIR}
mkdir -pv ${SYSDIR}/downloads
mkdir -pv ${SYSDIR}/build
install -dv ${SYSDIR}/cross-tools
install -dv ${SYSDIR}/sysroot

set +h
umask 022
# Set the environment variables to be used next
export BUILDDIR="${SYSDIR}/build"
export DOWNLOADDIR="${SYSDIR}/downloads"
export LC_ALL=POSIX
export CROSS_HOST="$(echo $MACHTYPE | sed "s/$(echo $MACHTYPE | cut -d- -f2)/cross/")"
export CROSS_TARGET="loongarch64-unknown-linux-gnu"
export MABI="lp64d"
export BUILD64="-mabi=lp64d"
export PATH=${SYSDIR}/cross-tools/bin:/bin:/usr/bin
export JOBS=-j8
unset CFLAGS
unset CXXFLAGS

# Download the source code archives
pushd $DOWNLOADDIR
wget https://cdn.kernel.org/pub/linux/kernel/v5.x/linux-5.19.tar.gz
wget https://ftp.gnu.org/gnu/gmp/gmp-6.2.1.tar.xz
wget https://www.mpfr.org/mpfr-4.1.0/mpfr-4.1.0.tar.xz
wget https://ftp.gnu.org/gnu/mpc/mpc-1.2.1.tar.gz
wget https://ftp.gnu.org/gnu/libc/glibc-2.36.tar.xz
popd

# Make and install the linux header files
tar xvf ${DOWNLOADDIR}/linux-5.19.tar.gz -C ${BUILDDIR}
pushd ${BUILDDIR}/linux-5.19
 make mrproper
 make ARCH=loongarch INSTALL_HDR_PATH=dest headers_install
 find dest/include -name '.*' -delete
 mkdir -pv ${SYSDIR}/sysroot/usr/include
 cp -rv dest/include/* ${SYSDIR}/sysroot/usr/include
popd

# Prepare the binutils source code
git clone git://sourceware.org/git/binutils-gdb.git --depth 1
pushd binutils-gdb
 git archive --format=tar.gz --prefix=binutils-2.38/ --output ../binutils-2.38.tar.gz "master"
popd
mv binutils-2.38.tar.gz ${DOWNLOADDIR}

# Make and install the binutils files
tar xvf ${DOWNLOADDIR}/binutils-2.38.tar.gz -C ${BUILDDIR}
pushd ${BUILDDIR}/binutils-2.38
 rm -rf gdb* libdecnumber readline sim
 mkdir tools-build
 pushd tools-build
 CC=gcc AR=ar AS=as \
 ../configure --prefix=${SYSDIR}/cross-tools --build=${CROSS_HOST} --host=${CROSS_HOST} \
  --target=${CROSS_TARGET} --with-sysroot=${SYSDIR}/sysroot --disable-nls \
  --disable-static --disable-werror --enable-64-bit-bfd
 make configure-host ${JOBS}
 make ${JOBS}
 make install-strip
 cp -v ../include/libiberty.h ${SYSDIR}/sysroot/usr/include
 popd
popd

# Make and install the gmp files used by GCC
tar xvf ${DOWNLOADDIR}/gmp-6.2.1.tar.xz -C ${BUILDDIR}
pushd ${BUILDDIR}/gmp-6.2.1
 ./configure --prefix=${SYSDIR}/cross-tools --enable-cxx --disable-static
 make ${JOBS}
 make install
popd

# Make and install the mpfr files used by GCC
tar xvf ${DOWNLOADDIR}/mpfr-4.1.0.tar.xz -C ${BUILDDIR}
pushd ${BUILDDIR}/mpfr-4.1.0
 ./configure --prefix=${SYSDIR}/cross-tools --disable-static --with-gmp=${SYSDIR}/cross-tools
 make ${JOBS}
 make install
popd

# Make and install the mpc files used by GCC
tar xvf ${DOWNLOADDIR}/mpc-1.2.1.tar.gz -C ${BUILDDIR}
pushd ${BUILDDIR}/mpc-1.2.1
 ./configure --prefix=${SYSDIR}/cross-tools --disable-static --with-gmp=${SYSDIR}/cross-tools
 make ${JOBS}
 make install
popd

# Prepare the gcc source code
git clone git://sourceware.org/git/gcc.git --depth 1
pushd gcc
 git archive --format=tar.gz --prefix=gcc-13.0.0/ --output ../gcc-13.0.0.tar.gz "master"
popd
mv gcc-13.0.0.tar.gz ${DOWNLOADDIR}

# Make and install the simplified GCC files
tar xvf ${DOWNLOADDIR}/gcc-13.0.0.tar.gz -C ${BUILDDIR}
pushd ${BUILDDIR}/gcc-13.0.0
 mkdir tools-build
 pushd tools-build
 AR=ar LDFLAGS="-Wl,-rpath,${SYSDIR}/cross-tools/lib" \
 ../configure --prefix=${SYSDIR}/cross-tools --build=${CROSS_HOST} --host=${CROSS_HOST} \
  --target=${CROSS_TARGET} --disable-nls \
  --with-mpfr=${SYSDIR}/cro

Re: [PATCH v3 1/1] ethdev: support congestion management

2022-10-04 Thread Andrew Rybchenko

On 9/29/22 12:35, sk...@marvell.com wrote:

From: Jerin Jacob 

NIC HW controllers often come with congestion management support on
various HW objects such as Rx queue depth or mempool queue depth.

Also, it can support various modes of operation such as RED
(Random early discard), WRED etc on those HW objects.

This patch adds a framework to express such modes(enum rte_cman_mode)
and introduce (enum rte_eth_cman_obj) to enumerate the different
objects where the modes can operate on.

This patch adds RTE_CMAN_RED mode of operation and


This patch adds -> Add


RTE_ETH_CMAN_OBJ_RX_QUEUE, RTE_ETH_CMAN_OBJ_RX_QUEUE_MEMPOOL object.

Introduced reserved fields in configuration structure


Introduce


backed by rte_eth_cman_config_init() to add new configuration
parameters without ABI breakage.

Added rte_eth_cman_info_get() API to get the information such as


Add


supported modes and objects.

Added rte_eth_cman_config_init(), rte_eth_cman_config_set() APIs


Add


to configure congestion management on those object with associated mode.

Finally, Added rte_eth_cman_config_get() API to retrieve the


add


applied configuration.

Signed-off-by: Jerin Jacob 
Signed-off-by: Sunil Kumar Kori 


I'll send v4 with a few minor corrections.
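The "reserved fields backed by rte_eth_cman_config_init()" idea in the commit message above is the standard trick for extending a public struct without ABI breakage: the application always calls the init helper first, so any field added later starts at a known zero default. A minimal model of the pattern; the struct layout and names below are illustrative, not the real rte_eth_cman_config:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Illustrative config struct with room reserved for future growth. */
struct cman_config {
	int mode;             /* e.g. RED */
	uint32_t min_th;
	uint32_t max_th;
	uint8_t reserved[16]; /* space for parameters added later */
};

/* Zero the whole struct so fields introduced in future releases are
 * guaranteed a well-defined default even in old applications. */
static void
cman_config_init(struct cman_config *cfg)
{
	memset(cfg, 0, sizeof(*cfg));
}
```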


---
v2..v3:
  - Rename rte_cman.c to rte_ethdev_cman.c
  - Move lib/eal/include/rte_cman.h to lib/ethdev/rte_cman.h
  - Fix review comments (Andrew Rybchenko)
  - Add release notes

v1..v2:
  - Fix review comments (Akhil Goyal)

rfc..v1:
  - Added RED specification (http://www.aciri.org/floyd/papers/red/red.html) link
  - Fixed doxygen comment issue (Min Hu)

  doc/guides/nics/features.rst   |  12 ++
  doc/guides/nics/features/default.ini   |   1 +
  doc/guides/rel_notes/release_22_11.rst |   6 +
  lib/ethdev/ethdev_driver.h |  25 
  lib/ethdev/ethdev_private.c|  12 ++
  lib/ethdev/ethdev_private.h|   2 +
  lib/ethdev/meson.build |   2 +
  lib/ethdev/rte_cman.h  |  55 +
  lib/ethdev/rte_ethdev.h| 164 +
  lib/ethdev/rte_ethdev_cman.c   | 101 +++
  lib/ethdev/version.map |   6 +
  11 files changed, 386 insertions(+)
  create mode 100644 lib/ethdev/rte_cman.h
  create mode 100644 lib/ethdev/rte_ethdev_cman.c

diff --git a/doc/guides/nics/features.rst b/doc/guides/nics/features.rst
index b4a8e9881c..70ca46e651 100644
--- a/doc/guides/nics/features.rst
+++ b/doc/guides/nics/features.rst
@@ -727,6 +727,18 @@ Supports configuring per-queue stat counter mapping.
``rte_eth_dev_set_tx_queue_stats_mapping()``.
  
  
+.. _nic_features_congestion_management:
+
+Congestion management
+---------------------
+
+Supports congestion management.
+
+* **[implements] eth_dev_ops**: ``cman_info_get``, ``cman_config_set``, 
``cman_config_get``.
+* **[related] API**: ``rte_eth_cman_info_get()``,
``rte_eth_cman_config_init()``,
+  ``rte_eth_cman_config_set()``, ``rte_eth_cman_config_get()``.
+
+
  .. _nic_features_fw_version:
  
  FW version

diff --git a/doc/guides/nics/features/default.ini 
b/doc/guides/nics/features/default.ini
index f7192cb0da..a9c0008ebd 100644
--- a/doc/guides/nics/features/default.ini
+++ b/doc/guides/nics/features/default.ini
@@ -60,6 +60,7 @@ Tx descriptor status =
  Basic stats  =
  Extended stats   =
  Stats per queue  =
+Congestion management =
  FW version   =
  EEPROM dump  =
  Module EEPROM dump   =
diff --git a/doc/guides/rel_notes/release_22_11.rst 
b/doc/guides/rel_notes/release_22_11.rst
index 0231959874..ea9908e578 100644
--- a/doc/guides/rel_notes/release_22_11.rst
+++ b/doc/guides/rel_notes/release_22_11.rst
@@ -81,6 +81,12 @@ New Features
* Added AES-CCM support in lookaside protocol (IPsec) for CN9K & CN10K.
* Added AES & DES DOCSIS algorithm support in lookaside crypto for CN9K.
  
+* **Added support for congestion management for ethdev.**

+
+  Added new APIs ``rte_eth_cman_config_init()``, ``rte_eth_cman_config_get()``,
+  ``rte_eth_cman_config_set()``, ``rte_eth_cman_info_get()``
+  to support congestion management.
+


The position is a bit incorrect. It should go after ethdev
items.


  * **Added eventdev adapter instance get API.**
  
* Added ``rte_event_eth_rx_adapter_instance_get`` to get Rx adapter

diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
index 8cd8eb8685..e1e2d10a35 100644
--- a/lib/ethdev/ethdev_driver.h
+++ b/lib/ethdev/ethdev_driver.h
@@ -1094,6 +1094,22 @@ typedef int (*eth_rx_queue_avail_thresh_query_t)(struct 
rte_eth_dev *dev,
uint16_t *rx_queue_id,
uint8_t *avail_thresh);
  
+/** @internal Get congestion management information. */

+typedef int (*eth_cman_info_get_t)(struct rte_eth_dev *dev,
+   struct rte_eth_cman_info *info);
+
+/** @internal Init congestion management structure with default values. */

[PATCH v4] ethdev: support congestion management

2022-10-04 Thread Andrew Rybchenko
From: Jerin Jacob 

NIC HW controllers often come with congestion management support on
various HW objects such as Rx queue depth or mempool queue depth.

Also, it can support various modes of operation such as RED
(Random early discard), WRED etc on those HW objects.

Add a framework to express such modes (enum rte_cman_mode) and
introduce enum rte_eth_cman_obj to enumerate the different
objects on which the modes can operate.

Add RTE_CMAN_RED mode of operation and RTE_ETH_CMAN_OBJ_RX_QUEUE,
RTE_ETH_CMAN_OBJ_RX_QUEUE_MEMPOOL objects.

Introduce reserved fields in configuration structure
backed by rte_eth_cman_config_init() to add new configuration
parameters without ABI breakage.

Add rte_eth_cman_info_get() API to get the information such as
supported modes and objects.

Add rte_eth_cman_config_init(), rte_eth_cman_config_set() APIs
to configure congestion management on those objects with the associated mode.

Finally, add rte_eth_cman_config_get() API to retrieve the
applied configuration.

Signed-off-by: Jerin Jacob 
Signed-off-by: Sunil Kumar Kori 
Signed-off-by: Andrew Rybchenko 
---
v3..v4: Andrew Rybchenko
 - rebase
 - remove eth_check_err() and use eth_err() instead
 - minor fixes in description to avoid "This patch" and "Added".
 - correct position in release notes
v2..v3:
 - Rename rte_cman.c to rte_ethdev_cman.c
 - Move lib/eal/include/rte_cman.h to lib/ethdev/rte_cman.h
 - Fix review comments (Andrew Rybchenko)
 - Add release notes

v1..v2:
 - Fix review comments (Akhil Goyal)

rfc..v1:
 - Added RED specification (http://www.aciri.org/floyd/papers/red/red.html) link
 - Fixed doxygen comment issue (Min Hu)

 doc/guides/nics/features.rst   |  12 ++
 doc/guides/nics/features/default.ini   |   1 +
 doc/guides/rel_notes/release_22_11.rst |   6 +
 lib/ethdev/ethdev_driver.h |  25 
 lib/ethdev/ethdev_private.h|   3 +
 lib/ethdev/meson.build |   2 +
 lib/ethdev/rte_cman.h  |  55 +
 lib/ethdev/rte_ethdev.c|   2 +-
 lib/ethdev/rte_ethdev.h| 164 +
 lib/ethdev/rte_ethdev_cman.c   | 101 +++
 lib/ethdev/version.map |   4 +
 11 files changed, 374 insertions(+), 1 deletion(-)
 create mode 100644 lib/ethdev/rte_cman.h
 create mode 100644 lib/ethdev/rte_ethdev_cman.c

diff --git a/doc/guides/nics/features.rst b/doc/guides/nics/features.rst
index b4a8e9881c..70ca46e651 100644
--- a/doc/guides/nics/features.rst
+++ b/doc/guides/nics/features.rst
@@ -727,6 +727,18 @@ Supports configuring per-queue stat counter mapping.
   ``rte_eth_dev_set_tx_queue_stats_mapping()``.
 
 
+.. _nic_features_congestion_management:
+
+Congestion management
+---------------------
+
+Supports congestion management.
+
+* **[implements] eth_dev_ops**: ``cman_info_get``, ``cman_config_set``, 
``cman_config_get``.
+* **[related] API**: ``rte_eth_cman_info_get()``,
``rte_eth_cman_config_init()``,
+  ``rte_eth_cman_config_set()``, ``rte_eth_cman_config_get()``.
+
+
 .. _nic_features_fw_version:
 
 FW version
diff --git a/doc/guides/nics/features/default.ini 
b/doc/guides/nics/features/default.ini
index f7192cb0da..a9c0008ebd 100644
--- a/doc/guides/nics/features/default.ini
+++ b/doc/guides/nics/features/default.ini
@@ -60,6 +60,7 @@ Tx descriptor status =
 Basic stats  =
 Extended stats   =
 Stats per queue  =
+Congestion management =
 FW version   =
 EEPROM dump  =
 Module EEPROM dump   =
diff --git a/doc/guides/rel_notes/release_22_11.rst 
b/doc/guides/rel_notes/release_22_11.rst
index 44f9a30c6a..0ffa004a9e 100644
--- a/doc/guides/rel_notes/release_22_11.rst
+++ b/doc/guides/rel_notes/release_22_11.rst
@@ -78,6 +78,12 @@ New Features
   Added new rte_flow action which allows application to re-route packets
   directly to the kernel without software involvement.
 
+* **Added support for congestion management for ethdev.**
+
+  Added new APIs ``rte_eth_cman_config_init()``, ``rte_eth_cman_config_get()``,
+  ``rte_eth_cman_config_set()``, ``rte_eth_cman_info_get()``
+  to support congestion management.
+
 * **Updated Intel iavf driver.**
 
   * Added flow subscription support.
diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
index 8cd8eb8685..e1e2d10a35 100644
--- a/lib/ethdev/ethdev_driver.h
+++ b/lib/ethdev/ethdev_driver.h
@@ -1094,6 +1094,22 @@ typedef int (*eth_rx_queue_avail_thresh_query_t)(struct 
rte_eth_dev *dev,
uint16_t *rx_queue_id,
uint8_t *avail_thresh);
 
+/** @internal Get congestion management information. */
+typedef int (*eth_cman_info_get_t)(struct rte_eth_dev *dev,
+   struct rte_eth_cman_info *info);
+
+/** @internal Init congestion management structure with default values. */
+typedef int (*eth_cman_config_init_t)(struct rte_eth_dev *dev,
+   struct rte_eth_cman_config *config);

Re: [PATCH v3 1/1] ethdev: support congestion management

2022-10-04 Thread Andrew Rybchenko

On 10/4/22 12:02, Andrew Rybchenko wrote:

On 9/29/22 12:35, sk...@marvell.com wrote:

From: Jerin Jacob 

NIC HW controllers often come with congestion management support on
various HW objects such as Rx queue depth or mempool queue depth.

Also, it can support various modes of operation such as RED
(Random early discard), WRED etc on those HW objects.

This patch adds a framework to express such modes(enum rte_cman_mode)
and introduce (enum rte_eth_cman_obj) to enumerate the different
objects where the modes can operate on.

This patch adds RTE_CMAN_RED mode of operation and


This patch adds -> Add


RTE_ETH_CMAN_OBJ_RX_QUEUE, RTE_ETH_CMAN_OBJ_RX_QUEUE_MEMPOOL object.

Introduced reserved fields in configuration structure


Introduce


backed by rte_eth_cman_config_init() to add new configuration
parameters without ABI breakage.

Added rte_eth_cman_info_get() API to get the information such as


Add


supported modes and objects.

Added rte_eth_cman_config_init(), rte_eth_cman_config_set() APIs


Add


to configure congestion management on those object with associated mode.

Finally, Added rte_eth_cman_config_get() API to retrieve the


add


applied configuration.

Signed-off-by: Jerin Jacob 
Signed-off-by: Sunil Kumar Kori 


I'll send v4 with a few minor corrections.


Done, but I'm sorry I forgot to specify --in-reply-to.




RE: [PATCH v7 1/4] eal: add lcore poll busyness telemetry

2022-10-04 Thread Morten Brørup
> From: Mattias Rönnblom [mailto:hof...@lysator.liu.se]
> Sent: Monday, 3 October 2022 22.02

[...]

> The functionality provided is very useful, and the implementation is
> clever in the way it doesn't require any application modifications.
> But,
> a clever, useful brittle hack is still a brittle hack.
> 
> What if there was instead a busyness module, where the application
> would
> explicitly report what it was up to. The new library would hook up to
> telemetry just like this patchset does, plus provide an explicit API to
> retrieve lcore thread load.
> 
> The service cores framework (fancy name for rte_service.c) could also
> call the lcore load tracking module, provided all services properly
> reported back on whether or not they were doing anything useful with
> the
> cycles they just spent.
> 
> The metrics of such a load tracking module could potentially be used by
> other modules in DPDK, or by the application. It could potentially be
> used for dynamic load balancing of service core services, or for power
> management (e.g, DVFS), or for a potential future deferred-work type
> mechanism more sophisticated than current rte_service, or some green
> threads/coroutines/fiber thingy. The DSW event device could also use it
> to replace its current internal load estimation scheme.

[...]

I agree 100 % with everything Mattias wrote above, and I would like to voice my 
opinion too.

This patch is full of preconditions and assumptions. Its only true advantage 
(vs. a generic load tracking library) is that it doesn't require any 
application modifications, and thus can be deployed with zero effort.

In my opinion, it would be much better with a well-designed generic load 
tracking library, to be called from the application, so it gets correct 
information about what the lcores spend their cycles doing. And as Mattias 
mentions: With the appropriate API for consumption of the collected data, it 
could also provide actionable statistics for use by the application itself, not 
just telemetry. ("Actionable statistics": statistics that are directly usable 
for decision making.)

There is also the aspect of time-to-benefit: This patch immediately provides 
benefits (to the users of the DPDK applications that meet the 
preconditions/assumptions of the patch), while a generic load tracking library 
will take years to get integrated into applications before it provides benefits 
(to the users of the DPDK applications that use the new library).

So, we should ask ourselves: Do we want an application-specific solution with a 
short time-to-benefit, or a generic solution with a long time-to-benefit? (I 
use the term "application specific" because not all applications can be tweaked 
to provide meaningful data with this patch. You might also label a generic 
library "application specific", because it requires that the application uses 
the library - however that is a common requirement of all DPDK libraries.)

Furthermore, if the proposed patch is primarily for the benefit of OVS, I 
suppose that calls to a generic load tracking library could be added to OVS 
within a relatively short time frame (although not as quickly as with this patch).

I guess that the developers of this patch initially thought that it was generic 
and usable for the majority of applications, and it came as somewhat a surprise 
that it wasn't as generic as expected. The DPDK community has a good review 
process with open discussions and sharing of thoughts and ideas. Sometimes, an 
idea doesn't fly, because the corner cases turn out to be more common than 
expected. I'm sorry to say it, but I think that is the case for this patch. :-(

-Morten



Re: [PATCH v2] ethdev: remove header split Rx offload

2022-10-04 Thread Andrew Rybchenko

On 9/13/22 10:38, Andrew Rybchenko wrote:

On 8/12/22 06:13, xuan.d...@intel.com wrote:

From: Xuan Ding 

As announced in the deprecation note, this patch removes the Rx offload
flag 'RTE_ETH_RX_OFFLOAD_HEADER_SPLIT' and 'split_hdr_size' field from
the structure 'rte_eth_rxmode'. Meanwhile, the place where the examples
and apps initialize the 'split_hdr_size' field, and where the drivers
check if the 'split_hdr_size' value is 0 are also removed.

User can still use `RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT` for per-queue packet
split offload, which is configured by 'rte_eth_rxseg_split'.

Signed-off-by: Xuan Ding 
---
v2:
* fix CI build error
---


Acked-by: Andrew Rybchenko 



Rebased and applied to dpdk-next-net/main, thanks.



Re: [PATCH v3] ethdev: queue rate parameter changed from 16b to 32b

2022-10-04 Thread Andrew Rybchenko

On 9/28/22 08:51, skotesh...@marvell.com wrote:

From: Satha Rao 

The rate parameter was modified to uint32_t so that it can work
for rates above 64 Gbps.

Signed-off-by: Satha Rao 


Reviewed-by: Andrew Rybchenko 

Applied to dpdk-next-net/main, thanks.




[PATCH v2 0/9] Trace subsystem fixes

2022-10-04 Thread David Marchand
Hello,

This series addresses a number of issues and limitations I have
identified over time in the trace subsystem.

The main issue was with dynamically enabling trace points, which did not
work if no trace point had been enabled at rte_eal_init() time.

This is 22.11 material.

We may start thinking about marking this API stable, but this is another
topic.


-- 
David Marchand

Changes since v1:
- split patch 3,
- addressed comments on (previously) patch 4,

David Marchand (9):
  trace: fix mode for new trace point
  trace: fix mode change
  trace: fix leak with regexp
  trace: rework loop on trace points
  trace: fix dynamically enabling trace points
  trace: fix race in debug dump
  trace: fix metadata dump
  trace: remove limitation on trace point name
  trace: remove limitation on directory

 app/test/test_trace.c   |  67 +++---
 app/test/test_trace.h   |   2 +
 doc/guides/prog_guide/trace_lib.rst |  14 ++-
 lib/eal/common/eal_common_trace.c   | 111 +++-
 lib/eal/common/eal_common_trace_ctf.c   |   3 -
 lib/eal/common/eal_common_trace_utils.c |  81 -
 lib/eal/common/eal_trace.h  |  11 +--
 7 files changed, 136 insertions(+), 153 deletions(-)

-- 
2.37.3



[PATCH v2 1/9] trace: fix mode for new trace point

2022-10-04 Thread David Marchand
If an application registers trace points later than rte_eal_init(),
changes in the trace point mode were not applied.

Fixes: 84c4fae4628f ("trace: implement operation APIs")
Cc: sta...@dpdk.org

Signed-off-by: David Marchand 
---
 lib/eal/common/eal_common_trace.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/eal/common/eal_common_trace.c 
b/lib/eal/common/eal_common_trace.c
index f9b187d15f..d5dbc7d667 100644
--- a/lib/eal/common/eal_common_trace.c
+++ b/lib/eal/common/eal_common_trace.c
@@ -512,6 +512,7 @@ __rte_trace_point_register(rte_trace_point_t *handle, const 
char *name,
/* Form the trace handle */
*handle = sz;
*handle |= trace.nb_trace_points << __RTE_TRACE_FIELD_ID_SHIFT;
+   trace_mode_set(handle, trace.mode);
 
trace.nb_trace_points++;
tp->handle = handle;
-- 
2.37.3



[PATCH v2 3/9] trace: fix leak with regexp

2022-10-04 Thread David Marchand
The precompiled buffer initialised in regcomp must be freed before
leaving rte_trace_regexp.

Fixes: 84c4fae4628f ("trace: implement operation APIs")
Cc: sta...@dpdk.org

Signed-off-by: David Marchand 
---
Changes since v1:
- split patch in two, keeping only the backportable fix as patch 3,

---
 lib/eal/common/eal_common_trace.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/lib/eal/common/eal_common_trace.c 
b/lib/eal/common/eal_common_trace.c
index 1b86f5d2d2..1db11e3e14 100644
--- a/lib/eal/common/eal_common_trace.c
+++ b/lib/eal/common/eal_common_trace.c
@@ -218,8 +218,10 @@ rte_trace_regexp(const char *regex, bool enable)
rc = rte_trace_point_disable(tp->handle);
found = 1;
}
-   if (rc < 0)
-   return rc;
+   if (rc < 0) {
+   found = 0;
+   break;
+   }
}
regfree(&r);
 
-- 
2.37.3



[PATCH v2 4/9] trace: rework loop on trace points

2022-10-04 Thread David Marchand
Directly skip the block when a trace point does not match the user
criteria.

Signed-off-by: David Marchand 
---
 lib/eal/common/eal_common_trace.c | 34 +--
 1 file changed, 19 insertions(+), 15 deletions(-)

diff --git a/lib/eal/common/eal_common_trace.c 
b/lib/eal/common/eal_common_trace.c
index 1db11e3e14..6b8660c318 100644
--- a/lib/eal/common/eal_common_trace.c
+++ b/lib/eal/common/eal_common_trace.c
@@ -186,15 +186,18 @@ rte_trace_pattern(const char *pattern, bool enable)
int rc = 0, found = 0;
 
STAILQ_FOREACH(tp, &tp_list, next) {
-   if (fnmatch(pattern, tp->name, 0) == 0) {
-   if (enable)
-   rc = rte_trace_point_enable(tp->handle);
-   else
-   rc = rte_trace_point_disable(tp->handle);
-   found = 1;
+   if (fnmatch(pattern, tp->name, 0) != 0)
+   continue;
+
+   if (enable)
+   rc = rte_trace_point_enable(tp->handle);
+   else
+   rc = rte_trace_point_disable(tp->handle);
+   if (rc < 0) {
+   found = 0;
+   break;
}
-   if (rc < 0)
-   return rc;
+   found = 1;
}
 
return rc | found;
@@ -211,17 +214,18 @@ rte_trace_regexp(const char *regex, bool enable)
return -EINVAL;
 
STAILQ_FOREACH(tp, &tp_list, next) {
-   if (regexec(&r, tp->name, 0, NULL, 0) == 0) {
-   if (enable)
-   rc = rte_trace_point_enable(tp->handle);
-   else
-   rc = rte_trace_point_disable(tp->handle);
-   found = 1;
-   }
+   if (regexec(&r, tp->name, 0, NULL, 0) != 0)
+   continue;
+
+   if (enable)
+   rc = rte_trace_point_enable(tp->handle);
+   else
+   rc = rte_trace_point_disable(tp->handle);
if (rc < 0) {
found = 0;
break;
}
+   found = 1;
}
regfree(&r);
 
-- 
2.37.3



[PATCH v2 2/9] trace: fix mode change

2022-10-04 Thread David Marchand
The API does not state that changing mode should be refused if no trace
point is enabled. Remove this limitation.

Fixes: 84c4fae4628f ("trace: implement operation APIs")
Cc: sta...@dpdk.org

Signed-off-by: David Marchand 
---
 app/test/test_trace.c | 3 ---
 lib/eal/common/eal_common_trace.c | 3 ---
 2 files changed, 6 deletions(-)

diff --git a/app/test/test_trace.c b/app/test/test_trace.c
index 76af79162b..44ac38a4fa 100644
--- a/app/test/test_trace.c
+++ b/app/test/test_trace.c
@@ -126,9 +126,6 @@ test_trace_mode(void)
 
current = rte_trace_mode_get();
 
-   if (!rte_trace_is_enabled())
-   return TEST_SKIPPED;
-
rte_trace_mode_set(RTE_TRACE_MODE_DISCARD);
if (rte_trace_mode_get() != RTE_TRACE_MODE_DISCARD)
goto failed;
diff --git a/lib/eal/common/eal_common_trace.c 
b/lib/eal/common/eal_common_trace.c
index d5dbc7d667..1b86f5d2d2 100644
--- a/lib/eal/common/eal_common_trace.c
+++ b/lib/eal/common/eal_common_trace.c
@@ -127,9 +127,6 @@ rte_trace_mode_set(enum rte_trace_mode mode)
 {
struct trace_point *tp;
 
-   if (!rte_trace_is_enabled())
-   return;
-
STAILQ_FOREACH(tp, &tp_list, next)
trace_mode_set(tp->handle, mode);
 
-- 
2.37.3



[PATCH v2 6/9] trace: fix race in debug dump

2022-10-04 Thread David Marchand
trace->nb_trace_mem_list access must be under trace->lock to avoid
races with threads allocating/freeing their trace buffers.

Fixes: f6b2d65dcd5d ("trace: implement debug dump")
Cc: sta...@dpdk.org

Signed-off-by: David Marchand 
---
 lib/eal/common/eal_common_trace.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/eal/common/eal_common_trace.c 
b/lib/eal/common/eal_common_trace.c
index 6aa11a3b50..ec168e37b3 100644
--- a/lib/eal/common/eal_common_trace.c
+++ b/lib/eal/common/eal_common_trace.c
@@ -259,10 +259,9 @@ trace_lcore_mem_dump(FILE *f)
struct __rte_trace_header *header;
uint32_t count;
 
-   if (trace->nb_trace_mem_list == 0)
-   return;
-
rte_spinlock_lock(&trace->lock);
+   if (trace->nb_trace_mem_list == 0)
+   goto out;
fprintf(f, "nb_trace_mem_list = %d\n", trace->nb_trace_mem_list);
fprintf(f, "\nTrace mem info\n--\n");
for (count = 0; count < trace->nb_trace_mem_list; count++) {
@@ -273,6 +272,7 @@ trace_lcore_mem_dump(FILE *f)
header->stream_header.lcore_id,
header->stream_header.thread_name);
}
+out:
rte_spinlock_unlock(&trace->lock);
 }
 
-- 
2.37.3



[PATCH v2 5/9] trace: fix dynamically enabling trace points

2022-10-04 Thread David Marchand
Enabling trace points at runtime was not working if no trace point had
been enabled first at rte_eal_init() time. The reason was that
trace.args reflected the arguments passed to the --trace= EAL option.

To fix this:
- the trace subsystem initialisation is updated: trace directory
  creation is deferred to when traces are dumped (to avoid creating
  directories that may not be used),
- per lcore memory allocation still relies on rte_trace_is_enabled() but
  this helper now tracks if any trace point is enabled. The
  documentation is updated accordingly,
- cleanup helpers must always be called in rte_eal_cleanup() since some
  trace points might have been enabled and disabled in the lifetime of
  the DPDK application,

With this fix, we can update the unit test and check that a trace point
callback is invoked when expected.

Note:
- the 'trace' global variable might be shadowed with the argument
  passed to the functions dealing with trace point handles.
  'tp' has been used for referring to trace_point object.
  Prefer 't' for referring to handles,

Fixes: 84c4fae4628f ("trace: implement operation APIs")
Cc: sta...@dpdk.org

Signed-off-by: David Marchand 
---
Changes since v1:
- restored level to INFO for trace directory log message,
- moved trace_mkdir() to rte_trace_save,

---
 app/test/test_trace.c   | 20 ++
 app/test/test_trace.h   |  2 +
 doc/guides/prog_guide/trace_lib.rst | 14 +--
 lib/eal/common/eal_common_trace.c   | 53 ++---
 lib/eal/common/eal_common_trace_utils.c | 11 -
 lib/eal/common/eal_trace.h  |  3 +-
 6 files changed, 65 insertions(+), 38 deletions(-)

diff --git a/app/test/test_trace.c b/app/test/test_trace.c
index 44ac38a4fa..2660f52f1d 100644
--- a/app/test/test_trace.c
+++ b/app/test/test_trace.c
@@ -9,6 +9,8 @@
 #include "test.h"
 #include "test_trace.h"
 
+int app_dpdk_test_tp_count;
+
 #ifdef RTE_EXEC_ENV_WINDOWS
 
 static int
@@ -95,8 +97,15 @@ test_trace_point_regex(void)
 static int32_t
 test_trace_point_disable_enable(void)
 {
+   int expected;
int rc;
 
+   /* At tp registration, the associated counter increases once. */
+   expected = 1;
+   TEST_ASSERT_EQUAL(app_dpdk_test_tp_count, expected,
+   "Expecting %d, but got %d for app_dpdk_test_tp_count",
+   expected, app_dpdk_test_tp_count);
+
rc = rte_trace_point_disable(&__app_dpdk_test_tp);
if (rc < 0)
goto failed;
@@ -104,6 +113,12 @@ test_trace_point_disable_enable(void)
if (rte_trace_point_is_enabled(&__app_dpdk_test_tp))
goto failed;
 
+   /* No emission expected */
+   app_dpdk_test_tp("app.dpdk.test.tp");
+   TEST_ASSERT_EQUAL(app_dpdk_test_tp_count, expected,
+   "Expecting %d, but got %d for app_dpdk_test_tp_count",
+   expected, app_dpdk_test_tp_count);
+
rc = rte_trace_point_enable(&__app_dpdk_test_tp);
if (rc < 0)
goto failed;
@@ -113,6 +128,11 @@ test_trace_point_disable_enable(void)
 
/* Emit the trace */
app_dpdk_test_tp("app.dpdk.test.tp");
+   expected++;
+   TEST_ASSERT_EQUAL(app_dpdk_test_tp_count, expected,
+   "Expecting %d, but got %d for app_dpdk_test_tp_count",
+   expected, app_dpdk_test_tp_count);
+
return TEST_SUCCESS;
 
 failed:
diff --git a/app/test/test_trace.h b/app/test/test_trace.h
index 413842f60d..4ad44e2bea 100644
--- a/app/test/test_trace.h
+++ b/app/test/test_trace.h
@@ -3,10 +3,12 @@
  */
 #include 
 
+extern int app_dpdk_test_tp_count;
 RTE_TRACE_POINT(
app_dpdk_test_tp,
RTE_TRACE_POINT_ARGS(const char *str),
rte_trace_point_emit_string(str);
+   app_dpdk_test_tp_count++;
 )
 
 RTE_TRACE_POINT_FP(
diff --git a/doc/guides/prog_guide/trace_lib.rst 
b/doc/guides/prog_guide/trace_lib.rst
index fbadf9fde9..9a8f38073d 100644
--- a/doc/guides/prog_guide/trace_lib.rst
+++ b/doc/guides/prog_guide/trace_lib.rst
@@ -271,10 +271,16 @@ Trace memory
 The trace memory will be allocated through an internal function
 ``__rte_trace_mem_per_thread_alloc()``. The trace memory will be allocated
 per thread to enable lock less trace-emit function.
-The memory for the trace memory for DPDK lcores will be allocated on
-``rte_eal_init()`` if the trace is enabled through a EAL option.
-For non DPDK threads, on the first trace emission, the memory will be
-allocated.
+
+For non lcore threads, the trace memory is allocated on the first trace
+emission.
+
+For lcore threads, if trace points are enabled through a EAL option, the trace
+memory is allocated when the threads are known of DPDK
+(``rte_eal_init`` for EAL lcores, ``rte_thread_register`` for non-EAL lcores).
+Otherwise, when trace points are enabled later in the life of the application,
+the behavior is the same as non lcore threads and the trace memory is allocated
+on the first trace emission.
 
 Trace memory layout
 ~~~~~~~~~~~~~~~~~~~

[PATCH v2 7/9] trace: fix metadata dump

2022-10-04 Thread David Marchand
The API does not describe that metadata dump is conditioned to enabling
any trace points.

While at it, merge dump unit tests into the generic trace_autotest to
enhance coverage.

Fixes: f6b2d65dcd5d ("trace: implement debug dump")
Cc: sta...@dpdk.org

Signed-off-by: David Marchand 
---
 app/test/test_trace.c | 44 +--
 lib/eal/common/eal_common_trace_ctf.c |  3 --
 2 files changed, 15 insertions(+), 32 deletions(-)

diff --git a/app/test/test_trace.c b/app/test/test_trace.c
index 2660f52f1d..6bedf14024 100644
--- a/app/test/test_trace.c
+++ b/app/test/test_trace.c
@@ -20,20 +20,6 @@ test_trace(void)
return TEST_SKIPPED;
 }
 
-static int
-test_trace_dump(void)
-{
-   printf("trace_dump not supported on Windows, skipping test\n");
-   return TEST_SKIPPED;
-}
-
-static int
-test_trace_metadata_dump(void)
-{
-   printf("trace_metadata_dump not supported on Windows, skipping test\n");
-   return TEST_SKIPPED;
-}
-
 #else
 
 static int32_t
@@ -214,6 +200,19 @@ test_generic_trace_points(void)
return TEST_SUCCESS;
 }
 
+static int
+test_trace_dump(void)
+{
+   rte_trace_dump(stdout);
+   return 0;
+}
+
+static int
+test_trace_metadata_dump(void)
+{
+   return rte_trace_metadata_dump(stdout);
+}
+
 static struct unit_test_suite trace_tests = {
.suite_name = "trace autotest",
.setup = NULL,
@@ -226,6 +225,8 @@ static struct unit_test_suite trace_tests = {
TEST_CASE(test_trace_point_globbing),
TEST_CASE(test_trace_point_regex),
TEST_CASE(test_trace_points_lookup),
+   TEST_CASE(test_trace_dump),
+   TEST_CASE(test_trace_metadata_dump),
TEST_CASES_END()
}
 };
@@ -236,21 +237,6 @@ test_trace(void)
return unit_test_suite_runner(&trace_tests);
 }
 
-static int
-test_trace_dump(void)
-{
-   rte_trace_dump(stdout);
-   return 0;
-}
-
-static int
-test_trace_metadata_dump(void)
-{
-   return rte_trace_metadata_dump(stdout);
-}
-
 #endif /* !RTE_EXEC_ENV_WINDOWS */
 
 REGISTER_TEST_COMMAND(trace_autotest, test_trace);
-REGISTER_TEST_COMMAND(trace_dump, test_trace_dump);
-REGISTER_TEST_COMMAND(trace_metadata_dump, test_trace_metadata_dump);
diff --git a/lib/eal/common/eal_common_trace_ctf.c 
b/lib/eal/common/eal_common_trace_ctf.c
index 335932a271..c6775c3b4d 100644
--- a/lib/eal/common/eal_common_trace_ctf.c
+++ b/lib/eal/common/eal_common_trace_ctf.c
@@ -358,9 +358,6 @@ rte_trace_metadata_dump(FILE *f)
char *ctf_meta = trace->ctf_meta;
int rc;
 
-   if (!rte_trace_is_enabled())
-   return 0;
-
if (ctf_meta == NULL)
return -EINVAL;
 
-- 
2.37.3



[PATCH v2 8/9] trace: remove limitation on trace point name

2022-10-04 Thread David Marchand
The name of a trace point is provided as a constant string via the
RTE_TRACE_POINT_REGISTER macro.
We can rely on the constant string in the binary and simply point at it.
There is then no need for a (fixed size) copy.

Signed-off-by: David Marchand 
---
 lib/eal/common/eal_common_trace.c   | 10 +++---
 lib/eal/common/eal_common_trace_utils.c |  2 +-
 lib/eal/common/eal_trace.h  |  3 +--
 3 files changed, 5 insertions(+), 10 deletions(-)

diff --git a/lib/eal/common/eal_common_trace.c 
b/lib/eal/common/eal_common_trace.c
index ec168e37b3..5caaac8e59 100644
--- a/lib/eal/common/eal_common_trace.c
+++ b/lib/eal/common/eal_common_trace.c
@@ -235,7 +235,7 @@ rte_trace_point_lookup(const char *name)
return NULL;
 
STAILQ_FOREACH(tp, &tp_list, next)
-   if (strncmp(tp->name, name, TRACE_POINT_NAME_SIZE) == 0)
+   if (strcmp(tp->name, name) == 0)
return tp->handle;
 
return NULL;
@@ -492,10 +492,7 @@ __rte_trace_point_register(rte_trace_point_t *handle, 
const char *name,
}
 
/* Initialize the trace point */
-   if (rte_strscpy(tp->name, name, TRACE_POINT_NAME_SIZE) < 0) {
-   trace_err("name is too long");
-   goto free;
-   }
+   tp->name = name;
 
/* Copy the accumulated fields description and clear it for the next
 * trace point.
@@ -517,8 +514,7 @@ __rte_trace_point_register(rte_trace_point_t *handle, const 
char *name,
 
/* All Good !!! */
return 0;
-free:
-   free(tp);
+
 fail:
if (trace.register_errno == 0)
trace.register_errno = rte_errno;
diff --git a/lib/eal/common/eal_common_trace_utils.c 
b/lib/eal/common/eal_common_trace_utils.c
index 7bf1c05e12..72108d36a6 100644
--- a/lib/eal/common/eal_common_trace_utils.c
+++ b/lib/eal/common/eal_common_trace_utils.c
@@ -42,7 +42,7 @@ trace_entry_compare(const char *name)
int count = 0;
 
STAILQ_FOREACH(tp, tp_list, next) {
-   if (strncmp(tp->name, name, TRACE_POINT_NAME_SIZE) == 0)
+   if (strcmp(tp->name, name) == 0)
count++;
if (count > 1) {
trace_err("found duplicate entry %s", name);
diff --git a/lib/eal/common/eal_trace.h b/lib/eal/common/eal_trace.h
index 72a5a461ae..26a18a2c48 100644
--- a/lib/eal/common/eal_trace.h
+++ b/lib/eal/common/eal_trace.h
@@ -24,14 +24,13 @@
 
 #define TRACE_PREFIX_LEN 12
 #define TRACE_DIR_STR_LEN (sizeof("-mm-dd-AM-HH-MM-SS") + TRACE_PREFIX_LEN)
-#define TRACE_POINT_NAME_SIZE 64
 #define TRACE_CTF_MAGIC 0xC1FC1FC1
 #define TRACE_MAX_ARGS 32
 
 struct trace_point {
STAILQ_ENTRY(trace_point) next;
rte_trace_point_t *handle;
-   char name[TRACE_POINT_NAME_SIZE];
+   const char *name;
char *ctf_field;
 };
 
-- 
2.37.3



[PATCH v2 9/9] trace: remove limitation on directory

2022-10-04 Thread David Marchand
Remove the arbitrary 12-character limit on the file prefix used for the
directory where the traces are stored.
Simplify the code by relying on dynamic allocations.

Signed-off-by: David Marchand 
---
 lib/eal/common/eal_common_trace_utils.c | 68 +
 lib/eal/common/eal_trace.h  |  5 +-
 2 files changed, 25 insertions(+), 48 deletions(-)

diff --git a/lib/eal/common/eal_common_trace_utils.c 
b/lib/eal/common/eal_common_trace_utils.c
index 72108d36a6..8561a0e198 100644
--- a/lib/eal/common/eal_common_trace_utils.c
+++ b/lib/eal/common/eal_common_trace_utils.c
@@ -87,11 +87,11 @@ trace_uuid_generate(void)
 }
 
 static int
-trace_session_name_generate(char *trace_dir)
+trace_session_name_generate(char **trace_dir)
 {
+   char date[sizeof("-mm-dd-AM-HH-MM-SS")];
struct tm *tm_result;
time_t tm;
-   int rc;
 
tm = time(NULL);
if ((int)tm == -1)
@@ -101,38 +101,32 @@ trace_session_name_generate(char *trace_dir)
if (tm_result == NULL)
goto fail;
 
-   rc = rte_strscpy(trace_dir, eal_get_hugefile_prefix(),
-   TRACE_PREFIX_LEN);
-   if (rc == -E2BIG)
-   rc = TRACE_PREFIX_LEN - 1;
-   trace_dir[rc++] = '-';
-
-   rc = strftime(trace_dir + rc, TRACE_DIR_STR_LEN - rc,
-   "%Y-%m-%d-%p-%I-%M-%S", tm_result);
-   if (rc == 0) {
+   if (strftime(date, sizeof(date), "%Y-%m-%d-%p-%I-%M-%S", tm_result) == 0) {
errno = ENOSPC;
goto fail;
}
 
-   return rc;
+   if (asprintf(trace_dir, "%s-%s", eal_get_hugefile_prefix(), date) == -1)
+   goto fail;
+
+   return 0;
 fail:
rte_errno = errno;
-   return -rte_errno;
+   return -1;
 }
 
 static int
 trace_dir_update(const char *str)
 {
struct trace *trace = trace_obj_get();
-   int rc, remaining;
-
-   remaining = sizeof(trace->dir) - trace->dir_offset;
-   rc = rte_strscpy(&trace->dir[0] + trace->dir_offset, str, remaining);
-   if (rc < 0)
-   goto fail;
+   char *dir;
+   int rc;
 
-   trace->dir_offset += rc;
-fail:
+   rc = asprintf(&dir, "%s%s", trace->dir != NULL ? trace->dir : "", str);
+   if (rc != -1) {
+   free(trace->dir);
+   trace->dir = dir;
+   }
return rc;
 }
 
@@ -246,22 +240,15 @@ eal_trace_mode_args_save(const char *val)
 int
 eal_trace_dir_args_save(char const *val)
 {
-   struct trace *trace = trace_obj_get();
char *dir_path;
int rc;
 
-   if (strlen(val) >= sizeof(trace->dir) - 1) {
-   trace_err("input string is too big");
-   return -ENAMETOOLONG;
-   }
-
if (asprintf(&dir_path, "%s/", val) == -1) {
trace_err("failed to copy directory: %s", strerror(errno));
return -ENOMEM;
}
 
rc = trace_dir_update(dir_path);
-
free(dir_path);
return rc;
 }
@@ -289,10 +276,8 @@ trace_epoch_time_save(void)
 }
 
 static int
-trace_dir_default_path_get(char *dir_path)
+trace_dir_default_path_get(char **dir_path)
 {
-   struct trace *trace = trace_obj_get();
-   uint32_t size = sizeof(trace->dir);
struct passwd *pwd;
char *home_dir;
 
@@ -308,8 +293,8 @@ trace_dir_default_path_get(char *dir_path)
}
 
/* Append dpdk-traces to directory */
-   if (snprintf(dir_path, size, "%s/dpdk-traces/", home_dir) < 0)
-   return -ENAMETOOLONG;
+   if (asprintf(dir_path, "%s/dpdk-traces/", home_dir) == -1)
+   return -ENOMEM;
 
return 0;
 }
@@ -318,25 +303,19 @@ static int
 trace_mkdir(void)
 {
struct trace *trace = trace_obj_get();
-   char session[TRACE_DIR_STR_LEN];
static bool already_done;
-   char *dir_path;
+   char *session;
int rc;
 
if (already_done)
return 0;
 
-   if (!trace->dir_offset) {
-   dir_path = calloc(1, sizeof(trace->dir));
-   if (dir_path == NULL) {
-   trace_err("fail to allocate memory");
-   return -ENOMEM;
-   }
+   if (trace->dir == NULL) {
+   char *dir_path;
 
-   rc = trace_dir_default_path_get(dir_path);
+   rc = trace_dir_default_path_get(&dir_path);
if (rc < 0) {
trace_err("fail to get default path");
-   free(dir_path);
return rc;
}
 
@@ -354,10 +333,11 @@ trace_mkdir(void)
return -rte_errno;
}
 
-   rc = trace_session_name_generate(session);
+   rc = trace_session_name_generate(&session);
if (rc < 0)
return rc;
rc = trace_dir_update(session);
+   free(session);
if (rc < 0)
return rc;
 
diff --git a/lib/eal/common/eal_trace.h b/lib/eal/common/eal_

[PATCH v2 0/4] crypto/ccp cleanup

2022-10-04 Thread David Marchand
This is an *untested* cleanup series after looking for usage of
rte_pci_device objects in DPDK drivers.
I can't test those patches for lack of hardware, so I hope the driver
maintainer can look into them.

Thanks.
-- 
David Marchand

Changes since v1:
- rebased,
- copied new maintainer,

David Marchand (4):
  crypto/ccp: remove some printf
  crypto/ccp: remove some dead code for UIO
  crypto/ccp: fix IOVA handling
  crypto/ccp: fix PCI probing

 drivers/crypto/ccp/ccp_crypto.c  | 106 +++---
 drivers/crypto/ccp/ccp_dev.c | 103 ++---
 drivers/crypto/ccp/ccp_dev.h |  31 ++--
 drivers/crypto/ccp/ccp_pci.c | 240 ---
 drivers/crypto/ccp/ccp_pci.h |  27 
 drivers/crypto/ccp/meson.build   |   1 -
 drivers/crypto/ccp/rte_ccp_pmd.c |  23 +--
 7 files changed, 47 insertions(+), 484 deletions(-)
 delete mode 100644 drivers/crypto/ccp/ccp_pci.c
 delete mode 100644 drivers/crypto/ccp/ccp_pci.h

-- 
2.37.3



[PATCH v2 1/4] crypto/ccp: remove some printf

2022-10-04 Thread David Marchand
A DPDK application must _not_ use printf.
Use the log framework instead.

Fixes: ef4b04f87fa6 ("crypto/ccp: support device init")
Cc: sta...@dpdk.org

Signed-off-by: David Marchand 
---
 drivers/crypto/ccp/ccp_dev.c | 4 ++--
 drivers/crypto/ccp/ccp_pci.c | 3 ++-
 drivers/crypto/ccp/rte_ccp_pmd.c | 2 +-
 3 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/crypto/ccp/ccp_dev.c b/drivers/crypto/ccp/ccp_dev.c
index 424ead82c3..9c9cb81236 100644
--- a/drivers/crypto/ccp/ccp_dev.c
+++ b/drivers/crypto/ccp/ccp_dev.c
@@ -362,7 +362,7 @@ ccp_find_lsb_regions(struct ccp_queue *cmd_q, uint64_t status)
if (ccp_get_bit(&cmd_q->lsbmask, j))
weight++;
 
-   printf("Queue %d can access %d LSB regions  of mask  %lu\n",
+   CCP_LOG_DBG("Queue %d can access %d LSB regions  of mask  %lu\n",
   (int)cmd_q->id, weight, cmd_q->lsbmask);
 
return weight ? 0 : -EINVAL;
@@ -709,7 +709,7 @@ ccp_probe_devices(struct rte_pci_device *pci_dev,
snprintf(dirname, sizeof(dirname), "%s/%s",
 SYSFS_PCI_DEVICES, d->d_name);
if (is_ccp_device(dirname, ccp_id, &ccp_type)) {
-   printf("CCP : Detected CCP device with ID = 0x%x\n",
+   CCP_LOG_DBG("CCP : Detected CCP device with ID = 0x%x\n",
   ccp_id[ccp_type].device_id);
ret = ccp_probe_device(ccp_type, pci_dev);
if (ret == 0)
diff --git a/drivers/crypto/ccp/ccp_pci.c b/drivers/crypto/ccp/ccp_pci.c
index 38029a9081..c941e222c7 100644
--- a/drivers/crypto/ccp/ccp_pci.c
+++ b/drivers/crypto/ccp/ccp_pci.c
@@ -11,6 +11,7 @@
 #include 
 
 #include "ccp_pci.h"
+#include "ccp_pmd_private.h"
 
 static const char * const uio_module_names[] = {
"igb_uio",
@@ -41,7 +42,7 @@ ccp_check_pci_uio_module(void)
rewind(fp);
}
fclose(fp);
-   printf("Insert igb_uio or uio_pci_generic kernel module(s)");
+   CCP_LOG_DBG("Insert igb_uio or uio_pci_generic kernel module(s)");
return -1;/* uio not inserted */
 }
 
diff --git a/drivers/crypto/ccp/rte_ccp_pmd.c b/drivers/crypto/ccp/rte_ccp_pmd.c
index 013f3be1e6..7338ef0ae8 100644
--- a/drivers/crypto/ccp/rte_ccp_pmd.c
+++ b/drivers/crypto/ccp/rte_ccp_pmd.c
@@ -250,7 +250,7 @@ cryptodev_ccp_create(const char *name,
goto init_error;
}
 
-   printf("CCP : Crypto device count = %d\n", cryptodev_cnt);
+   CCP_LOG_DBG("CCP : Crypto device count = %d\n", cryptodev_cnt);
dev->device = &pci_dev->device;
dev->device->driver = &pci_drv->driver;
dev->driver_id = ccp_cryptodev_driver_id;
-- 
2.37.3



[PATCH v2 2/4] crypto/ccp: remove some dead code for UIO

2022-10-04 Thread David Marchand
uio_fd is unused.

Fixes: 09a0fd736a08 ("crypto/ccp: enable IOMMU")
Cc: sta...@dpdk.org

Signed-off-by: David Marchand 
---
 drivers/crypto/ccp/ccp_dev.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/crypto/ccp/ccp_dev.c b/drivers/crypto/ccp/ccp_dev.c
index 9c9cb81236..410e62121e 100644
--- a/drivers/crypto/ccp/ccp_dev.c
+++ b/drivers/crypto/ccp/ccp_dev.c
@@ -653,7 +653,6 @@ static int
 ccp_probe_device(int ccp_type, struct rte_pci_device *pci_dev)
 {
struct ccp_device *ccp_dev = NULL;
-   int uio_fd = -1;
 
ccp_dev = rte_zmalloc("ccp_device", sizeof(*ccp_dev),
  RTE_CACHE_LINE_SIZE);
@@ -671,8 +670,6 @@ ccp_probe_device(int ccp_type, struct rte_pci_device *pci_dev)
return 0;
 fail:
CCP_LOG_ERR("CCP Device probe failed");
-   if (uio_fd >= 0)
-   close(uio_fd);
rte_free(ccp_dev);
return -1;
 }
-- 
2.37.3



[PATCH v2 3/4] crypto/ccp: fix IOVA handling

2022-10-04 Thread David Marchand
Using IOVA or physical addresses is something that the user (via
--iova-mode=) or the bus code decides.

The crypto/ccp PCI driver should only use rte_mem_virt2iova.
It should not try to decide what to use solely based on the kmod
the PCI device is bound to.

While at it, the global variable sha_ctx looks unsafe and unneeded.
Remove it.

Fixes: 09a0fd736a08 ("crypto/ccp: enable IOMMU")
Cc: sta...@dpdk.org

Signed-off-by: David Marchand 
---
 drivers/crypto/ccp/ccp_crypto.c  | 105 ++-
 drivers/crypto/ccp/ccp_dev.c |   9 +--
 drivers/crypto/ccp/ccp_pci.c |  34 --
 drivers/crypto/ccp/ccp_pci.h |   3 -
 drivers/crypto/ccp/rte_ccp_pmd.c |   3 -
 5 files changed, 19 insertions(+), 135 deletions(-)

diff --git a/drivers/crypto/ccp/ccp_crypto.c b/drivers/crypto/ccp/ccp_crypto.c
index 4bab18323b..351d8ac63e 100644
--- a/drivers/crypto/ccp/ccp_crypto.c
+++ b/drivers/crypto/ccp/ccp_crypto.c
@@ -33,8 +33,6 @@
 #include 
 #include 
 
-extern int iommu_mode;
-void *sha_ctx;
 /* SHA initial context values */
 uint32_t ccp_sha1_init[SHA_COMMON_DIGEST_SIZE / sizeof(uint32_t)] = {
SHA1_H4, SHA1_H3,
@@ -748,13 +746,8 @@ ccp_configure_session_cipher(struct ccp_session *sess,
CCP_LOG_ERR("Invalid CCP Engine");
return -ENOTSUP;
}
-   if (iommu_mode == 2) {
-   sess->cipher.nonce_phys = rte_mem_virt2iova(sess->cipher.nonce);
-   sess->cipher.key_phys = rte_mem_virt2iova(sess->cipher.key_ccp);
-   } else {
-   sess->cipher.nonce_phys = rte_mem_virt2phy(sess->cipher.nonce);
-   sess->cipher.key_phys = rte_mem_virt2phy(sess->cipher.key_ccp);
-   }
+   sess->cipher.nonce_phys = rte_mem_virt2iova(sess->cipher.nonce);
+   sess->cipher.key_phys = rte_mem_virt2iova(sess->cipher.key_ccp);
return 0;
 }
 
@@ -793,7 +786,6 @@ ccp_configure_session_auth(struct ccp_session *sess,
sess->auth.ctx = (void *)ccp_sha1_init;
sess->auth.ctx_len = CCP_SB_BYTES;
sess->auth.offset = CCP_SB_BYTES - SHA1_DIGEST_SIZE;
-   rte_memcpy(sha_ctx, sess->auth.ctx, SHA_COMMON_DIGEST_SIZE);
break;
case RTE_CRYPTO_AUTH_SHA1_HMAC:
if (sess->auth_opt) {
@@ -832,7 +824,6 @@ ccp_configure_session_auth(struct ccp_session *sess,
sess->auth.ctx = (void *)ccp_sha224_init;
sess->auth.ctx_len = CCP_SB_BYTES;
sess->auth.offset = CCP_SB_BYTES - SHA224_DIGEST_SIZE;
-   rte_memcpy(sha_ctx, sess->auth.ctx, SHA256_DIGEST_SIZE);
break;
case RTE_CRYPTO_AUTH_SHA224_HMAC:
if (sess->auth_opt) {
@@ -895,7 +886,6 @@ ccp_configure_session_auth(struct ccp_session *sess,
sess->auth.ctx = (void *)ccp_sha256_init;
sess->auth.ctx_len = CCP_SB_BYTES;
sess->auth.offset = CCP_SB_BYTES - SHA256_DIGEST_SIZE;
-   rte_memcpy(sha_ctx, sess->auth.ctx, SHA256_DIGEST_SIZE);
break;
case RTE_CRYPTO_AUTH_SHA256_HMAC:
if (sess->auth_opt) {
@@ -958,7 +948,6 @@ ccp_configure_session_auth(struct ccp_session *sess,
sess->auth.ctx = (void *)ccp_sha384_init;
sess->auth.ctx_len = CCP_SB_BYTES << 1;
sess->auth.offset = (CCP_SB_BYTES << 1) - SHA384_DIGEST_SIZE;
-   rte_memcpy(sha_ctx, sess->auth.ctx, SHA512_DIGEST_SIZE);
break;
case RTE_CRYPTO_AUTH_SHA384_HMAC:
if (sess->auth_opt) {
@@ -1023,7 +1012,6 @@ ccp_configure_session_auth(struct ccp_session *sess,
sess->auth.ctx = (void *)ccp_sha512_init;
sess->auth.ctx_len = CCP_SB_BYTES << 1;
sess->auth.offset = (CCP_SB_BYTES << 1) - SHA512_DIGEST_SIZE;
-   rte_memcpy(sha_ctx, sess->auth.ctx, SHA512_DIGEST_SIZE);
break;
case RTE_CRYPTO_AUTH_SHA512_HMAC:
if (sess->auth_opt) {
@@ -1173,13 +1161,8 @@ ccp_configure_session_aead(struct ccp_session *sess,
CCP_LOG_ERR("Unsupported aead algo");
return -ENOTSUP;
}
-   if (iommu_mode == 2) {
-   sess->cipher.nonce_phys = rte_mem_virt2iova(sess->cipher.nonce);
-   sess->cipher.key_phys = rte_mem_virt2iova(sess->cipher.key_ccp);
-   } else {
-   sess->cipher.nonce_phys = rte_mem_virt2phy(sess->cipher.nonce);
-   sess->cipher.key_phys = rte_mem_virt2phy(sess->cipher.key_ccp);
-   }
+   sess->cipher.nonce_phys = rte_mem_virt2iova(sess->cipher.nonce);
+   sess->cipher.key_phys = rte_mem_virt2iova(sess->cipher.key_ccp);
return 0;
 }
 
@@ -1594,14 +1577,8 @@ ccp_perform_hmac(struct rte_crypto_op *op,
  op->sym->auth.data.offset);
append_ptr = (void *)rte_pktmbuf_append(op->sym->m_src,
   

[PATCH v2 4/4] crypto/ccp: fix PCI probing

2022-10-04 Thread David Marchand
This driver was converted from a vdev driver to a PCI driver some
time ago. The conversion is buggy: it tries to probe every PCI device
present on the system for *each* probe request from the PCI bus.

Rely on the passed PCI device and probe only what is requested.

While at it:
- stop copying the pci device object content into a local private copy,
- rely on the PCI identifier and remove internal ccp_device_version
  identifier,
- ccp_list can be made static,

With this done, all the code parsing Linux sysfs can be dropped.

Fixes: 889317b7ecb3 ("crypto/ccp: convert driver from vdev to PCI")
Cc: sta...@dpdk.org

Signed-off-by: David Marchand 
---
 drivers/crypto/ccp/ccp_crypto.c  |   1 -
 drivers/crypto/ccp/ccp_dev.c |  89 ++---
 drivers/crypto/ccp/ccp_dev.h |  31 ++---
 drivers/crypto/ccp/ccp_pci.c | 207 ---
 drivers/crypto/ccp/ccp_pci.h |  24 
 drivers/crypto/ccp/meson.build   |   1 -
 drivers/crypto/ccp/rte_ccp_pmd.c |  18 +--
 7 files changed, 26 insertions(+), 345 deletions(-)
 delete mode 100644 drivers/crypto/ccp/ccp_pci.c
 delete mode 100644 drivers/crypto/ccp/ccp_pci.h

diff --git a/drivers/crypto/ccp/ccp_crypto.c b/drivers/crypto/ccp/ccp_crypto.c
index 351d8ac63e..461f18ca2e 100644
--- a/drivers/crypto/ccp/ccp_crypto.c
+++ b/drivers/crypto/ccp/ccp_crypto.c
@@ -26,7 +26,6 @@
 
 #include "ccp_dev.h"
 #include "ccp_crypto.h"
-#include "ccp_pci.h"
 #include "ccp_pmd_private.h"
 
 #include 
diff --git a/drivers/crypto/ccp/ccp_dev.c b/drivers/crypto/ccp/ccp_dev.c
index 14c54929c4..ee30f5ac30 100644
--- a/drivers/crypto/ccp/ccp_dev.c
+++ b/drivers/crypto/ccp/ccp_dev.c
@@ -20,10 +20,9 @@
 #include 
 
 #include "ccp_dev.h"
-#include "ccp_pci.h"
 #include "ccp_pmd_private.h"
 
-struct ccp_list ccp_list = TAILQ_HEAD_INITIALIZER(ccp_list);
+static TAILQ_HEAD(, ccp_device) ccp_list = TAILQ_HEAD_INITIALIZER(ccp_list);
 static int ccp_dev_id;
 
 int
@@ -68,7 +67,7 @@ ccp_read_hwrng(uint32_t *value)
struct ccp_device *dev;
 
TAILQ_FOREACH(dev, &ccp_list, next) {
-   void *vaddr = (void *)(dev->pci.mem_resource[2].addr);
+   void *vaddr = (void *)(dev->pci->mem_resource[2].addr);
 
while (dev->hwrng_retries++ < CCP_MAX_TRNG_RETRIES) {
*value = CCP_READ_REG(vaddr, TRNG_OUT_REG);
@@ -480,7 +479,7 @@ ccp_assign_lsbs(struct ccp_device *ccp)
 }
 
 static int
-ccp_add_device(struct ccp_device *dev, int type)
+ccp_add_device(struct ccp_device *dev)
 {
int i;
uint32_t qmr, status_lo, status_hi, dma_addr_lo, dma_addr_hi;
@@ -494,9 +493,9 @@ ccp_add_device(struct ccp_device *dev, int type)
 
dev->id = ccp_dev_id++;
dev->qidx = 0;
-   vaddr = (void *)(dev->pci.mem_resource[2].addr);
+   vaddr = (void *)(dev->pci->mem_resource[2].addr);
 
-   if (type == CCP_VERSION_5B) {
+   if (dev->pci->id.device_id == AMD_PCI_CCP_5B) {
CCP_WRITE_REG(vaddr, CMD_TRNG_CTL_OFFSET, 0x00012D57);
CCP_WRITE_REG(vaddr, CMD_CONFIG_0_OFFSET, 0x0003);
for (i = 0; i < 12; i++) {
@@ -615,41 +614,8 @@ ccp_remove_device(struct ccp_device *dev)
TAILQ_REMOVE(&ccp_list, dev, next);
 }
 
-static int
-is_ccp_device(const char *dirname,
- const struct rte_pci_id *ccp_id,
- int *type)
-{
-   char filename[PATH_MAX];
-   const struct rte_pci_id *id;
-   uint16_t vendor, device_id;
-   int i;
-   unsigned long tmp;
-
-   /* get vendor id */
-   snprintf(filename, sizeof(filename), "%s/vendor", dirname);
-   if (ccp_pci_parse_sysfs_value(filename, &tmp) < 0)
-   return 0;
-   vendor = (uint16_t)tmp;
-
-   /* get device id */
-   snprintf(filename, sizeof(filename), "%s/device", dirname);
-   if (ccp_pci_parse_sysfs_value(filename, &tmp) < 0)
-   return 0;
-   device_id = (uint16_t)tmp;
-
-   for (id = ccp_id, i = 0; id->vendor_id != 0; id++, i++) {
-   if (vendor == id->vendor_id &&
-   device_id == id->device_id) {
-   *type = i;
-   return 1; /* Matched device */
-   }
-   }
-   return 0;
-}
-
-static int
-ccp_probe_device(int ccp_type, struct rte_pci_device *pci_dev)
+int
+ccp_probe_device(struct rte_pci_device *pci_dev)
 {
struct ccp_device *ccp_dev;
 
@@ -658,10 +624,10 @@ ccp_probe_device(int ccp_type, struct rte_pci_device *pci_dev)
if (ccp_dev == NULL)
goto fail;
 
-   ccp_dev->pci = *pci_dev;
+   ccp_dev->pci = pci_dev;
 
/* device is valid, add in list */
-   if (ccp_add_device(ccp_dev, ccp_type)) {
+   if (ccp_add_device(ccp_dev)) {
ccp_remove_device(ccp_dev);
goto fail;
}
@@ -672,40 +638,3 @@ ccp_probe_device(int ccp_type, struct rte_pci_device *pci_dev)
rte_free(ccp_dev);
return -1;
 }
-
-int
-ccp_probe_

[PATCH v7 0/6] crypto/security session framework rework

2022-10-04 Thread Akhil Goyal
This patchset reworks the symmetric crypto and security session
data structures to use a single virtually/physically contiguous buffer
for the symmetric crypto/security session and the driver private data.
In addition, the session data structure is now private.
The session is represented as an opaque pointer in the application.

With this change, a session can no longer be accessed by
multiple device drivers. For the same reason, the
rte_cryptodev_sym_session_init/clear APIs are deprecated, as
rte_cryptodev_sym_session_create/free will initialize and
clear the driver-specific data field.

This change was also submitted last year, during the DPDK 21.11
timeframe [1], but was not applied due to lack of feedback from
the community. Please help in getting this cleanup merged in this cycle.

Similar work has already been done for asymmetric crypto.
This patchset is rebased over the current tree and fixes all
the issues reported so far.

Changes in v7:
- fixed build for ixgbe and txgbe

Changes in v6:
- rebased over TOT
Changes in v5:
- rebased over latest dpdk-next-crypto tree

Changes in v4:
- squashed armv8_crypto fixes.
http://patches.dpdk.org/project/dpdk/cover/20220926100120.3980185-1-ruifeng.w...@arm.com/

Changes in v3:
- Updated release notes
- fixed checkpatch issues
- renamed macro to get sess priv data to align with crypto macro
- added acked-by/tested-by

Changes in v2:
This patchset is a v2 for the patch that was sent by Fan Zhang(Intel)
with a few changes
- Added security session rework also.
- fixed issues in [2] reported on mailing list.
- few other fixes.

Please review and provide feedback as soon as possible
as this is intended to be merged in DPDK 22.11 RC1.

Currently this change is tested on the cnxk platform.
Everyone is requested to review and test it on their own platforms.

Special note to the ixgbe and txgbe maintainers:
the flow creation implementation is incorrect, please check.
A hack is added to bypass it; please fix it separately.

[1] https://patches.dpdk.org/project/dpdk/cover/20211018213452.2734720-1-gak...@marvell.com/
[2] https://patches.dpdk.org/project/dpdk/cover/20220829160645.378406-1-roy.fan.zh...@intel.com/


Akhil Goyal (5):
  cryptodev: rework session framework
  cryptodev: hide sym session structure
  security: remove priv mempool usage
  drivers/crypto: support security session get size op
  security: hide session structure

Fan Zhang (1):
  crypto/scheduler: use unified session

 app/test-crypto-perf/cperf.h  |   1 -
 app/test-crypto-perf/cperf_ops.c  |  64 ++--
 app/test-crypto-perf/cperf_ops.h  |   6 +-
 app/test-crypto-perf/cperf_test_latency.c |  11 +-
 app/test-crypto-perf/cperf_test_latency.h |   1 -
 .../cperf_test_pmd_cyclecount.c   |  12 +-
 .../cperf_test_pmd_cyclecount.h   |   1 -
 app/test-crypto-perf/cperf_test_throughput.c  |  13 +-
 app/test-crypto-perf/cperf_test_throughput.h  |   1 -
 app/test-crypto-perf/cperf_test_verify.c  |  11 +-
 app/test-crypto-perf/cperf_test_verify.h  |   1 -
 app/test-crypto-perf/main.c   |  30 +-
 app/test-eventdev/test_perf_common.c  |  43 +--
 app/test-eventdev/test_perf_common.h  |   1 -
 app/test/test_cryptodev.c | 354 +-
 app/test/test_cryptodev_blockcipher.c |  18 +-
 app/test/test_cryptodev_security_ipsec.c  |   2 +-
 app/test/test_cryptodev_security_ipsec.h  |   2 +-
 app/test/test_event_crypto_adapter.c  |  39 +-
 app/test/test_ipsec.c |  49 +--
 app/test/test_ipsec_perf.c|   4 +-
 app/test/test_security.c  | 178 ++---
 app/test/test_security_inline_proto.c |  26 +-
 doc/guides/prog_guide/cryptodev_lib.rst   |  16 +-
 doc/guides/rel_notes/deprecation.rst  |   9 -
 doc/guides/rel_notes/release_22_11.rst|  14 +
 drivers/crypto/armv8/armv8_pmd_private.h  |   2 -
 drivers/crypto/armv8/rte_armv8_pmd.c  |  21 +-
 drivers/crypto/armv8/rte_armv8_pmd_ops.c  |  35 +-
 drivers/crypto/bcmfs/bcmfs_sym_session.c  |  39 +-
 drivers/crypto/bcmfs/bcmfs_sym_session.h  |   3 +-
 drivers/crypto/caam_jr/caam_jr.c  |  69 +---
 drivers/crypto/ccp/ccp_crypto.c   |  56 +--
 drivers/crypto/ccp/ccp_pmd_ops.c  |  32 +-
 drivers/crypto/ccp/ccp_pmd_private.h  |   2 -
 drivers/crypto/ccp/rte_ccp_pmd.c  |  29 +-
 drivers/crypto/cnxk/cn10k_cryptodev_ops.c |  41 +-
 drivers/crypto/cnxk/cn10k_ipsec.c |  45 +--
 drivers/crypto/cnxk/cn9k_cryptodev_ops.c  |  38 +-
 drivers/crypto/cnxk/cn9k_ipsec.c  |  50 +--
 drivers/crypto/cnxk/cnxk_cryptodev_ops.c  |  55 +--
 drivers/crypto/cnxk/cnxk_cryptodev_ops.h  |  16 +-
 drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c   |  70 ++--
 drivers/crypto/dpaa2_sec/dpaa2_sec_raw_dp.c   |   6 +-
 drivers/crypto/dpaa_sec/dpaa_sec.c|  69 +---
 drivers/cr

[PATCH v7 2/6] crypto/scheduler: use unified session

2022-10-04 Thread Akhil Goyal
From: Fan Zhang 

This patch updates the scheduler PMD to use the unified session
data structure. Previously, thanks to the private session
array in the cryptodev sym session, no change was needed
in the scheduler PMD other than the way ops
are enqueued/dequeued. The patch carries the same design
of the original session data structure over to the scheduler PMD,
so the cryptodev sym session can act as a linear buffer for
both the session header and the driver private data.

With the change there is an inevitable extra cost in both memory
(64 bytes per session per driver type) and cycle count (setting
the correct session for each cop based on the worker before
enqueue, and retrieving the original session after dequeue).

Signed-off-by: Fan Zhang 
Signed-off-by: Akhil Goyal 
Acked-by: Kai Ji 
Tested-by: Gagandeep Singh 
Tested-by: David Coyle 
Tested-by: Kevin O'Sullivan 
---
 drivers/crypto/scheduler/scheduler_failover.c |  19 ++-
 .../crypto/scheduler/scheduler_multicore.c|  17 +++
 .../scheduler/scheduler_pkt_size_distr.c  |  84 +---
 drivers/crypto/scheduler/scheduler_pmd_ops.c  | 107 +++-
 .../crypto/scheduler/scheduler_pmd_private.h  | 120 +-
 .../crypto/scheduler/scheduler_roundrobin.c   |  11 +-
 6 files changed, 318 insertions(+), 40 deletions(-)

diff --git a/drivers/crypto/scheduler/scheduler_failover.c b/drivers/crypto/scheduler/scheduler_failover.c
index 2a0e29fa72..7fadcf66d0 100644
--- a/drivers/crypto/scheduler/scheduler_failover.c
+++ b/drivers/crypto/scheduler/scheduler_failover.c
@@ -16,18 +16,19 @@
 struct fo_scheduler_qp_ctx {
struct scheduler_worker primary_worker;
struct scheduler_worker secondary_worker;
+   uint8_t primary_worker_index;
+   uint8_t secondary_worker_index;
 
uint8_t deq_idx;
 };
 
 static __rte_always_inline uint16_t
 failover_worker_enqueue(struct scheduler_worker *worker,
-   struct rte_crypto_op **ops, uint16_t nb_ops)
+   struct rte_crypto_op **ops, uint16_t nb_ops, uint8_t index)
 {
-   uint16_t i, processed_ops;
+   uint16_t processed_ops;
 
-   for (i = 0; i < nb_ops && i < 4; i++)
-   rte_prefetch0(ops[i]->sym->session);
+   scheduler_set_worker_session(ops, nb_ops, index);
 
processed_ops = rte_cryptodev_enqueue_burst(worker->dev_id,
worker->qp_id, ops, nb_ops);
@@ -47,13 +48,14 @@ schedule_enqueue(void *qp, struct rte_crypto_op **ops, uint16_t nb_ops)
return 0;
 
enqueued_ops = failover_worker_enqueue(&qp_ctx->primary_worker,
-   ops, nb_ops);
+   ops, nb_ops, PRIMARY_WORKER_IDX);
 
if (enqueued_ops < nb_ops)
enqueued_ops += failover_worker_enqueue(
&qp_ctx->secondary_worker,
&ops[enqueued_ops],
-   nb_ops - enqueued_ops);
+   nb_ops - enqueued_ops,
+   SECONDARY_WORKER_IDX);
 
return enqueued_ops;
 }
@@ -94,7 +96,7 @@ schedule_dequeue(void *qp, struct rte_crypto_op **ops, uint16_t nb_ops)
qp_ctx->deq_idx = (~qp_ctx->deq_idx) & WORKER_SWITCH_MASK;
 
if (nb_deq_ops == nb_ops)
-   return nb_deq_ops;
+   goto retrieve_session;
 
worker = workers[qp_ctx->deq_idx];
 
@@ -104,6 +106,9 @@ schedule_dequeue(void *qp, struct rte_crypto_op **ops, uint16_t nb_ops)
worker->nb_inflight_cops -= nb_deq_ops2;
}
 
+retrieve_session:
+   scheduler_retrieve_session(ops, nb_deq_ops + nb_deq_ops2);
+
return nb_deq_ops + nb_deq_ops2;
 }
 
diff --git a/drivers/crypto/scheduler/scheduler_multicore.c b/drivers/crypto/scheduler/scheduler_multicore.c
index 900ab4049d..3dea850661 100644
--- a/drivers/crypto/scheduler/scheduler_multicore.c
+++ b/drivers/crypto/scheduler/scheduler_multicore.c
@@ -183,11 +183,19 @@ mc_scheduler_worker(struct rte_cryptodev *dev)
 
while (!mc_ctx->stop_signal) {
if (pending_enq_ops) {
+   scheduler_set_worker_session(
+   &enq_ops[pending_enq_ops_idx], pending_enq_ops,
+   worker_idx);
processed_ops =
rte_cryptodev_enqueue_burst(worker->dev_id,
worker->qp_id,
&enq_ops[pending_enq_ops_idx],
pending_enq_ops);
+   if (processed_ops < pending_deq_ops)
+   scheduler_retrieve_session(
+   &enq_ops[pending_enq_ops_idx +
+   processed_ops],
+   pending_deq_ops - processed_ops);
pending_enq_ops -= processed_ops;
pending_enq_

[PATCH v7 3/6] cryptodev: hide sym session structure

2022-10-04 Thread Akhil Goyal
Structure rte_cryptodev_sym_session is moved to internal
headers which are not visible to applications.
The only field which should be used by the application is opaque_data.
This field can now be accessed via the set/get APIs added in this
patch.
Subsequent changes in apps and libs are made to keep the code compiling.

Signed-off-by: Akhil Goyal 
Signed-off-by: Fan Zhang 
Acked-by: Kai Ji 
Tested-by: Gagandeep Singh 
Tested-by: David Coyle 
Tested-by: Kevin O'Sullivan 
---
 app/test-crypto-perf/cperf_ops.c  | 24 
 app/test-crypto-perf/cperf_ops.h  |  6 +-
 app/test-crypto-perf/cperf_test_latency.c |  2 +-
 .../cperf_test_pmd_cyclecount.c   |  2 +-
 app/test-crypto-perf/cperf_test_throughput.c  |  2 +-
 app/test-crypto-perf/cperf_test_verify.c  |  2 +-
 app/test-eventdev/test_perf_common.c  |  8 +--
 app/test/test_cryptodev.c | 17 +++---
 app/test/test_cryptodev_blockcipher.c |  2 +-
 app/test/test_event_crypto_adapter.c  |  4 +-
 app/test/test_ipsec.c |  2 +-
 app/test/test_ipsec_perf.c|  4 +-
 doc/guides/prog_guide/cryptodev_lib.rst   | 16 ++
 doc/guides/rel_notes/deprecation.rst  |  5 --
 doc/guides/rel_notes/release_22_11.rst|  9 +++
 drivers/crypto/bcmfs/bcmfs_sym_session.c  |  5 +-
 drivers/crypto/caam_jr/caam_jr.c  | 10 ++--
 drivers/crypto/ccp/ccp_crypto.c   | 30 +-
 drivers/crypto/ccp/ccp_pmd_ops.c  |  2 +-
 drivers/crypto/ccp/rte_ccp_pmd.c  |  4 +-
 drivers/crypto/cnxk/cn10k_cryptodev_ops.c |  8 +--
 drivers/crypto/cnxk/cn9k_cryptodev_ops.c  | 10 ++--
 drivers/crypto/cnxk/cnxk_cryptodev_ops.c  |  4 +-
 drivers/crypto/cnxk/cnxk_cryptodev_ops.h  |  2 +-
 drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c   | 11 ++--
 drivers/crypto/dpaa2_sec/dpaa2_sec_raw_dp.c   |  2 +-
 drivers/crypto/dpaa_sec/dpaa_sec.c|  9 ++-
 drivers/crypto/dpaa_sec/dpaa_sec_raw_dp.c |  2 +-
 drivers/crypto/ipsec_mb/ipsec_mb_ops.c|  2 +-
 drivers/crypto/ipsec_mb/ipsec_mb_private.h|  4 +-
 drivers/crypto/ipsec_mb/pmd_aesni_gcm.c   |  6 +-
 drivers/crypto/ipsec_mb/pmd_aesni_mb.c|  4 +-
 drivers/crypto/ipsec_mb/pmd_kasumi.c  |  2 +-
 drivers/crypto/ipsec_mb/pmd_snow3g.c  |  2 +-
 drivers/crypto/mlx5/mlx5_crypto.c |  7 +--
 drivers/crypto/nitrox/nitrox_sym.c|  6 +-
 drivers/crypto/null/null_crypto_pmd.c |  3 +-
 drivers/crypto/null/null_crypto_pmd_ops.c |  2 +-
 drivers/crypto/octeontx/otx_cryptodev_ops.c   | 10 ++--
 drivers/crypto/openssl/rte_openssl_pmd.c  |  4 +-
 drivers/crypto/openssl/rte_openssl_pmd_ops.c  |  4 +-
 drivers/crypto/qat/qat_sym.c  |  4 +-
 drivers/crypto/qat/qat_sym.h  |  3 +-
 drivers/crypto/qat/qat_sym_session.c  |  6 +-
 .../scheduler/scheduler_pkt_size_distr.c  | 13 ++---
 drivers/crypto/scheduler/scheduler_pmd_ops.c  |  4 +-
 .../crypto/scheduler/scheduler_pmd_private.h  | 30 +-
 drivers/crypto/virtio/virtio_cryptodev.c  |  4 +-
 drivers/crypto/virtio/virtio_rxtx.c   |  2 +-
 examples/fips_validation/fips_dev_self_test.c | 10 ++--
 examples/fips_validation/main.c   |  2 +-
 examples/l2fwd-crypto/main.c  |  6 +-
 lib/cryptodev/cryptodev_pmd.h | 32 +++
 lib/cryptodev/cryptodev_trace_points.c|  3 -
 lib/cryptodev/rte_crypto.h|  3 +-
 lib/cryptodev/rte_crypto_sym.h|  7 +--
 lib/cryptodev/rte_cryptodev.c | 15 +++--
 lib/cryptodev/rte_cryptodev.h | 56 +--
 lib/cryptodev/rte_cryptodev_trace.h   | 14 +
 lib/cryptodev/version.map |  1 -
 lib/ipsec/rte_ipsec_group.h   |  5 +-
 lib/ipsec/ses.c   |  3 +-
 62 files changed, 239 insertions(+), 244 deletions(-)

diff --git a/app/test-crypto-perf/cperf_ops.c b/app/test-crypto-perf/cperf_ops.c
index c6f5735bb0..5acd495794 100644
--- a/app/test-crypto-perf/cperf_ops.c
+++ b/app/test-crypto-perf/cperf_ops.c
@@ -13,7 +13,7 @@ static void
 cperf_set_ops_asym(struct rte_crypto_op **ops,
   uint32_t src_buf_offset __rte_unused,
   uint32_t dst_buf_offset __rte_unused, uint16_t nb_ops,
-  struct rte_cryptodev_sym_session *sess,
+  void *sess,
   const struct cperf_options *options,
   const struct cperf_test_vector *test_vector __rte_unused,
   uint16_t iv_offset __rte_unused,
@@ -55,7 +55,7 @@ static void
 cperf_set_ops_security(struct rte_crypto_op **ops,
uint32_t src_buf_offset __rte_unused,
uint32_t dst_buf_offset __rte_unused,
-   uint16_t nb_ops, struct rte_cryptodev_sym_session *sess,
+   uint16_t nb_ops, void *sess,
 

[PATCH v7 4/6] security: remove priv mempool usage

2022-10-04 Thread Akhil Goyal
As per the current design, rte_security_session_create()
unnecessarily uses 2 mempool objects for a single session.

To address this, the API will now take only 1 mempool
object instead of 2. With this change, the library layer
will get the object from the mempool, and the session private
data is stored contiguously in the same mempool object.

Users need to ensure that the mempool created in the application
is big enough for the session private data as well. This can be
ensured if the pool is created after getting the size of the session
private data using the API rte_security_session_get_size().

Since the set and get pkt metadata operations for security sessions
are now made inline for the inline crypto/proto modes, a new member
fast_mdata is added to rte_security_session.
The opaque data and fast_mdata will be accessed via inline
APIs which can do pointer manipulations inside the library on the
session_private_data pointer coming from the application.

Signed-off-by: Akhil Goyal 
Tested-by: Gagandeep Singh 
Tested-by: David Coyle 
Tested-by: Kevin O'Sullivan 
---
 app/test-crypto-perf/cperf.h  |   1 -
 app/test-crypto-perf/cperf_ops.c  |  13 +-
 app/test-crypto-perf/cperf_test_latency.c |   3 +-
 app/test-crypto-perf/cperf_test_latency.h |   1 -
 .../cperf_test_pmd_cyclecount.c   |   3 +-
 .../cperf_test_pmd_cyclecount.h   |   1 -
 app/test-crypto-perf/cperf_test_throughput.c  |   3 +-
 app/test-crypto-perf/cperf_test_throughput.h  |   1 -
 app/test-crypto-perf/cperf_test_verify.c  |   3 +-
 app/test-crypto-perf/cperf_test_verify.h  |   1 -
 app/test-crypto-perf/main.c   |   3 -
 app/test/test_cryptodev.c |  44 +-
 app/test/test_ipsec.c |   7 +-
 app/test/test_security.c  | 146 ++
 app/test/test_security_inline_proto.c |  16 +-
 drivers/crypto/caam_jr/caam_jr.c  |  31 +---
 drivers/crypto/cnxk/cn10k_cryptodev_ops.c |   7 +-
 drivers/crypto/cnxk/cn10k_ipsec.c |  45 ++
 drivers/crypto/cnxk/cn9k_cryptodev_ops.c  |   9 +-
 drivers/crypto/cnxk/cn9k_ipsec.c  |  50 ++
 drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c   |  29 +---
 drivers/crypto/dpaa2_sec/dpaa2_sec_raw_dp.c   |   3 +-
 drivers/crypto/dpaa_sec/dpaa_sec.c|  25 +--
 drivers/crypto/dpaa_sec/dpaa_sec_raw_dp.c |   3 +-
 drivers/crypto/ipsec_mb/pmd_aesni_mb.c|  26 +---
 drivers/crypto/mvsam/rte_mrvl_pmd.c   |   3 +-
 drivers/crypto/mvsam/rte_mrvl_pmd_ops.c   |  21 +--
 drivers/crypto/qat/qat_sym.c  |   3 +-
 drivers/crypto/qat/qat_sym.h  |  11 +-
 drivers/crypto/qat/qat_sym_session.c  |  27 +---
 drivers/crypto/qat/qat_sym_session.h  |   2 +-
 drivers/net/cnxk/cn10k_ethdev_sec.c   |  38 ++---
 drivers/net/cnxk/cn9k_ethdev_sec.c|  41 ++---
 drivers/net/iavf/iavf_ipsec_crypto.c  |  23 +--
 drivers/net/ixgbe/ixgbe_ipsec.c   |  31 ++--
 drivers/net/txgbe/txgbe_ipsec.c   |  32 ++--
 examples/ipsec-secgw/ipsec-secgw.c|  34 
 examples/ipsec-secgw/ipsec.c  |   9 +-
 examples/ipsec-secgw/ipsec.h  |   1 -
 lib/cryptodev/rte_cryptodev.h |   2 +-
 lib/security/rte_security.c   |  20 ++-
 lib/security/rte_security.h   |  30 ++--
 lib/security/rte_security_driver.h|  13 +-
 43 files changed, 193 insertions(+), 622 deletions(-)

diff --git a/app/test-crypto-perf/cperf.h b/app/test-crypto-perf/cperf.h
index 2b0aad095c..db58228dce 100644
--- a/app/test-crypto-perf/cperf.h
+++ b/app/test-crypto-perf/cperf.h
@@ -15,7 +15,6 @@ struct cperf_op_fns;
 
 typedef void  *(*cperf_constructor_t)(
struct rte_mempool *sess_mp,
-   struct rte_mempool *sess_priv_mp,
uint8_t dev_id,
uint16_t qp_id,
const struct cperf_options *options,
diff --git a/app/test-crypto-perf/cperf_ops.c b/app/test-crypto-perf/cperf_ops.c
index 5acd495794..727eee6599 100644
--- a/app/test-crypto-perf/cperf_ops.c
+++ b/app/test-crypto-perf/cperf_ops.c
@@ -642,7 +642,6 @@ cperf_set_ops_aead(struct rte_crypto_op **ops,
 
 static void *
 create_ipsec_session(struct rte_mempool *sess_mp,
-   struct rte_mempool *priv_mp,
uint8_t dev_id,
const struct cperf_options *options,
const struct cperf_test_vector *test_vector,
@@ -753,13 +752,11 @@ create_ipsec_session(struct rte_mempool *sess_mp,
rte_cryptodev_get_sec_ctx(dev_id);
 
/* Create security session */
-   return (void *)rte_security_session_create(ctx,
-   &sess_conf, sess_mp, priv_mp);
+   return (void *)rte_security_session_create(ctx, &sess_conf, sess_mp);
 }
 
 static void *
 cperf_create_session(struct rte_mempool *sess_mp,
-   struct rte_mempoo

[PATCH v7 5/6] drivers/crypto: support security session get size op

2022-10-04 Thread Akhil Goyal
Added support for rte_security_op.session_get_size()
in all the PMDs which support rte_security sessions but
did not yet implement the op.

Signed-off-by: Akhil Goyal 
Acked-by: Kai Ji 
Tested-by: Gagandeep Singh 
Tested-by: David Coyle 
Tested-by: Kevin O'Sullivan 
---
 drivers/crypto/caam_jr/caam_jr.c| 6 ++
 drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c | 7 +++
 drivers/crypto/dpaa_sec/dpaa_sec.c  | 8 
 drivers/crypto/ipsec_mb/pmd_aesni_mb.c  | 7 +++
 drivers/crypto/mvsam/rte_mrvl_pmd_ops.c | 7 +++
 drivers/crypto/qat/dev/qat_sym_pmd_gen1.c   | 1 +
 drivers/crypto/qat/qat_sym_session.c| 6 ++
 drivers/crypto/qat/qat_sym_session.h| 2 ++
 8 files changed, 44 insertions(+)

diff --git a/drivers/crypto/caam_jr/caam_jr.c b/drivers/crypto/caam_jr/caam_jr.c
index bbf2c0bdb1..67d9bb89e5 100644
--- a/drivers/crypto/caam_jr/caam_jr.c
+++ b/drivers/crypto/caam_jr/caam_jr.c
@@ -1937,6 +1937,11 @@ caam_jr_security_session_destroy(void *dev __rte_unused,
return 0;
 }
 
+static unsigned int
+caam_jr_security_session_get_size(void *device __rte_unused)
+{
+   return sizeof(struct caam_jr_session);
+}
 
 static int
 caam_jr_dev_configure(struct rte_cryptodev *dev,
@@ -2031,6 +2036,7 @@ static struct rte_cryptodev_ops caam_jr_ops = {
 static struct rte_security_ops caam_jr_security_ops = {
.session_create = caam_jr_security_session_create,
.session_update = NULL,
+   .session_get_size = caam_jr_security_session_get_size,
.session_stats_get = NULL,
.session_destroy = caam_jr_security_session_destroy,
.set_pkt_metadata = NULL,
diff --git a/drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c 
b/drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c
index 28a868da53..49f08f69f0 100644
--- a/drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c
+++ b/drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c
@@ -3733,6 +3733,12 @@ dpaa2_sec_security_session_destroy(void *dev 
__rte_unused,
}
return 0;
 }
+
+static unsigned int
+dpaa2_sec_security_session_get_size(void *device __rte_unused)
+{
+   return sizeof(dpaa2_sec_session);
+}
 #endif
 static int
 dpaa2_sec_sym_session_configure(struct rte_cryptodev *dev __rte_unused,
@@ -4184,6 +4190,7 @@ dpaa2_sec_capabilities_get(void *device __rte_unused)
 static const struct rte_security_ops dpaa2_sec_security_ops = {
.session_create = dpaa2_sec_security_session_create,
.session_update = NULL,
+   .session_get_size = dpaa2_sec_security_session_get_size,
.session_stats_get = NULL,
.session_destroy = dpaa2_sec_security_session_destroy,
.set_pkt_metadata = NULL,
diff --git a/drivers/crypto/dpaa_sec/dpaa_sec.c 
b/drivers/crypto/dpaa_sec/dpaa_sec.c
index b1529bd1f6..0df63aaf3f 100644
--- a/drivers/crypto/dpaa_sec/dpaa_sec.c
+++ b/drivers/crypto/dpaa_sec/dpaa_sec.c
@@ -3289,6 +3289,13 @@ dpaa_sec_security_session_destroy(void *dev __rte_unused,
}
return 0;
 }
+
+static unsigned int
+dpaa_sec_security_session_get_size(void *device __rte_unused)
+{
+   return sizeof(dpaa_sec_session);
+}
+
 #endif
 static int
 dpaa_sec_dev_configure(struct rte_cryptodev *dev __rte_unused,
@@ -3547,6 +3554,7 @@ dpaa_sec_capabilities_get(void *device __rte_unused)
 static const struct rte_security_ops dpaa_sec_security_ops = {
.session_create = dpaa_sec_security_session_create,
.session_update = NULL,
+   .session_get_size = dpaa_sec_security_session_get_size,
.session_stats_get = NULL,
.session_destroy = dpaa_sec_security_session_destroy,
.set_pkt_metadata = NULL,
diff --git a/drivers/crypto/ipsec_mb/pmd_aesni_mb.c 
b/drivers/crypto/ipsec_mb/pmd_aesni_mb.c
index 76cb1c543a..fc9ee01124 100644
--- a/drivers/crypto/ipsec_mb/pmd_aesni_mb.c
+++ b/drivers/crypto/ipsec_mb/pmd_aesni_mb.c
@@ -2130,6 +2130,12 @@ aesni_mb_pmd_sec_sess_destroy(void *dev __rte_unused,
return 0;
 }
 
+static unsigned int
+aesni_mb_pmd_sec_sess_get_size(void *device __rte_unused)
+{
+   return sizeof(struct aesni_mb_session);
+}
+
 /** Get security capabilities for aesni multi-buffer */
 static const struct rte_security_capability *
 aesni_mb_pmd_sec_capa_get(void *device __rte_unused)
@@ -2140,6 +2146,7 @@ aesni_mb_pmd_sec_capa_get(void *device __rte_unused)
 static struct rte_security_ops aesni_mb_pmd_sec_ops = {
.session_create = aesni_mb_pmd_sec_sess_create,
.session_update = NULL,
+   .session_get_size = aesni_mb_pmd_sec_sess_get_size,
.session_stats_get = NULL,
.session_destroy = aesni_mb_pmd_sec_sess_destroy,
.set_pkt_metadata = NULL,
diff --git a/drivers/crypto/mvsam/rte_mrvl_pmd_ops.c 
b/drivers/crypto/mvsam/rte_mrvl_pmd_ops.c
index 1aa8e935f1..6ac0407c36 100644
--- a/drivers/crypto/mvsam/rte_mrvl_pmd_ops.c
+++ b/drivers/crypto/mvsam/rte_mrvl_pmd_ops.c
@@ -907,6 +907,12 @@ mrvl_crypto_pmd_security_session_destroy(void *dev 

[PATCH v7 6/6] security: hide session structure

2022-10-04 Thread Akhil Goyal
Structure rte_security_session is moved to internal
headers which are not visible to applications.
The only field which should be used by app is opaque_data.
This field can now be accessed via set/get APIs added in this
patch.
Subsequent changes in app and lib are made to compile the code.

Signed-off-by: Akhil Goyal 
Tested-by: Gagandeep Singh 
Tested-by: David Coyle 
Tested-by: Kevin O'Sullivan 
---
 app/test-crypto-perf/cperf_ops.c  |  6 +-
 .../cperf_test_pmd_cyclecount.c   |  2 +-
 app/test-crypto-perf/cperf_test_throughput.c  |  2 +-
 app/test/test_cryptodev.c |  2 +-
 app/test/test_cryptodev_security_ipsec.c  |  2 +-
 app/test/test_cryptodev_security_ipsec.h  |  2 +-
 app/test/test_security.c  | 32 
 app/test/test_security_inline_proto.c | 10 +--
 doc/guides/rel_notes/deprecation.rst  |  4 -
 doc/guides/rel_notes/release_22_11.rst|  5 ++
 drivers/crypto/caam_jr/caam_jr.c  |  2 +-
 drivers/crypto/cnxk/cn10k_cryptodev_ops.c |  4 +-
 drivers/crypto/cnxk/cn9k_cryptodev_ops.c  |  6 +-
 drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c   |  6 +-
 drivers/crypto/dpaa_sec/dpaa_sec.c|  4 +-
 drivers/crypto/ipsec_mb/pmd_aesni_mb.c|  4 +-
 drivers/crypto/qat/qat_sym.c  |  4 +-
 drivers/crypto/qat/qat_sym.h  |  4 +-
 drivers/net/iavf/iavf_ipsec_crypto.h  |  2 +-
 examples/ipsec-secgw/ipsec_worker.c   |  2 +-
 lib/cryptodev/rte_crypto_sym.h|  4 +-
 lib/ipsec/rte_ipsec_group.h   | 12 +--
 lib/ipsec/ses.c   |  2 +-
 lib/security/rte_security.c   | 13 ++-
 lib/security/rte_security.h   | 80 ---
 lib/security/rte_security_driver.h| 18 +
 26 files changed, 137 insertions(+), 97 deletions(-)

diff --git a/app/test-crypto-perf/cperf_ops.c b/app/test-crypto-perf/cperf_ops.c
index 727eee6599..61a3967697 100644
--- a/app/test-crypto-perf/cperf_ops.c
+++ b/app/test-crypto-perf/cperf_ops.c
@@ -65,8 +65,7 @@ cperf_set_ops_security(struct rte_crypto_op **ops,
 
for (i = 0; i < nb_ops; i++) {
struct rte_crypto_sym_op *sym_op = ops[i]->sym;
-   struct rte_security_session *sec_sess =
-   (struct rte_security_session *)sess;
+   void *sec_sess = (void *)sess;
uint32_t buf_sz;
 
uint32_t *per_pkt_hfn = rte_crypto_op_ctod_offset(ops[i],
@@ -131,8 +130,7 @@ cperf_set_ops_security_ipsec(struct rte_crypto_op **ops,
uint16_t iv_offset __rte_unused, uint32_t *imix_idx,
uint64_t *tsc_start)
 {
-   struct rte_security_session *sec_sess =
-   (struct rte_security_session *)sess;
+   void *sec_sess = sess;
const uint32_t test_buffer_size = options->test_buffer_size;
const uint32_t headroom_sz = options->headroom_sz;
const uint32_t segment_sz = options->segment_sz;
diff --git a/app/test-crypto-perf/cperf_test_pmd_cyclecount.c 
b/app/test-crypto-perf/cperf_test_pmd_cyclecount.c
index aa2654250f..0307e82996 100644
--- a/app/test-crypto-perf/cperf_test_pmd_cyclecount.c
+++ b/app/test-crypto-perf/cperf_test_pmd_cyclecount.c
@@ -71,7 +71,7 @@ cperf_pmd_cyclecount_test_free(struct 
cperf_pmd_cyclecount_ctx *ctx)
(struct rte_security_ctx *)
rte_cryptodev_get_sec_ctx(ctx->dev_id);
rte_security_session_destroy(sec_ctx,
-   (struct rte_security_session *)ctx->sess);
+   (void *)ctx->sess);
} else
 #endif
rte_cryptodev_sym_session_free(ctx->dev_id, ctx->sess);
diff --git a/app/test-crypto-perf/cperf_test_throughput.c 
b/app/test-crypto-perf/cperf_test_throughput.c
index db89b7ddff..e892a70699 100644
--- a/app/test-crypto-perf/cperf_test_throughput.c
+++ b/app/test-crypto-perf/cperf_test_throughput.c
@@ -49,7 +49,7 @@ cperf_throughput_test_free(struct cperf_throughput_ctx *ctx)
rte_cryptodev_get_sec_ctx(ctx->dev_id);
rte_security_session_destroy(
sec_ctx,
-   (struct rte_security_session *)ctx->sess);
+   (void *)ctx->sess);
}
 #endif
else
diff --git a/app/test/test_cryptodev.c b/app/test/test_cryptodev.c
index 9708fc87d2..c6d47a035e 100644
--- a/app/test/test_cryptodev.c
+++ b/app/test/test_cryptodev.c
@@ -84,7 +84,7 @@ struct crypto_unittest_params {
union {
void *sess;
 #ifdef RTE_LIB_SECURITY
-   struct rte_security_session *sec_session;
+   void *sec_session;
 #endif
};
 #ifdef RTE_LIB_SECURITY
diff --git a/app/test/test_cryptodev_security_ipsec.c

[PATCH] net/i40e: fix build with MinGW GCC 12

2022-10-04 Thread Thomas Monjalon
When compiling with MinGW GCC 12,
the rte_flow_item array is seen as read out of bound:

net/i40e/i40e_hash.c:389:47: error:
array subscript 50 is above array bounds of ‘const uint64_t[50]’
{aka ‘const long long unsigned int[50]’} [-Werror=array-bounds]
389 | item_hdr = pattern_item_header[last_item_type];
|~~~^~~~

It seems the assert check done above this line has no impact.
A check is added to make the compiler happy.

Signed-off-by: Thomas Monjalon 
---
 drivers/net/i40e/i40e_hash.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/i40e/i40e_hash.c b/drivers/net/i40e/i40e_hash.c
index 8962e9d97a..ba616aea9f 100644
--- a/drivers/net/i40e/i40e_hash.c
+++ b/drivers/net/i40e/i40e_hash.c
@@ -386,6 +386,8 @@ i40e_hash_get_pattern_type(const struct rte_flow_item 
pattern[],
prev_item_type = last_item_type;
assert(last_item_type < (enum rte_flow_item_type)
RTE_DIM(pattern_item_header));
+   if (last_item_type >= RTE_DIM(pattern_item_header))
+   goto not_sup;
item_hdr = pattern_item_header[last_item_type];
assert(item_hdr);
 
-- 
2.36.1



[PATCH] net/qede/base: fix 32-bit build with GCC 12

2022-10-04 Thread Thomas Monjalon
A pointer is passed to a macro, and taking its address again inside
the macro seems to be a mistake.
This issue is seen only when compiling with GCC 12 for 32-bit:

drivers/net/qede/base/ecore_init_fw_funcs.c:1418:25:
error: array subscript 1 is outside array bounds of ‘u32[1]’
{aka ‘unsigned int[1]’} [-Werror=array-bounds]
 1418 | ecore_wr(dev, ptt, ((addr) + (4 * i)),  \
  | ^
 1419 |  ((u32 *)&(arr))[i]);   \
  |  ~~~
drivers/net/qede/base/ecore_init_fw_funcs.c:1465:17:
note: in expansion of macro ‘ARR_REG_WR’
 1465 | ARR_REG_WR(p_hwfn, p_ptt, addr, pData, len_in_dwords);
  | ^~
drivers/net/qede/base/ecore_init_fw_funcs.c:1439:35:
note: at offset 4 into object ‘pData’ of size 4
 1439 |  u32 *pData,
  |  ~^

Fixes: 3b307c55f2ac ("net/qede/base: update FW to 8.40.25.0")
Cc: sta...@dpdk.org

Signed-off-by: Thomas Monjalon 
---
 drivers/net/qede/base/ecore_init_fw_funcs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/qede/base/ecore_init_fw_funcs.c 
b/drivers/net/qede/base/ecore_init_fw_funcs.c
index 6a52f32cc9..0aa7f85567 100644
--- a/drivers/net/qede/base/ecore_init_fw_funcs.c
+++ b/drivers/net/qede/base/ecore_init_fw_funcs.c
@@ -1416,7 +1416,7 @@ void ecore_init_brb_ram(struct ecore_hwfn *p_hwfn,
u32 i;  \
for (i = 0; i < (arr_size); i++)\
ecore_wr(dev, ptt, ((addr) + (4 * i)),  \
-((u32 *)&(arr))[i]);   \
+((u32 *)arr)[i]);  \
} while (0)
 
 #ifndef DWORDS_TO_BYTES
-- 
2.36.1



Re: [PATCH] net/i40e: fix build with MinGW GCC 12

2022-10-04 Thread Thomas Monjalon
04/10/2022 13:17, Thomas Monjalon:
> When compiling with MinGW GCC 12,
> the rte_flow_item array is seen as read out of bound:
> 
> net/i40e/i40e_hash.c:389:47: error:
>   array subscript 50 is above array bounds of ‘const uint64_t[50]’
>   {aka ‘const long long unsigned int[50]’} [-Werror=array-bounds]
>   389 | item_hdr = pattern_item_header[last_item_type];
>   |~~~^~~~
> 
> It seems the assert check done above this line has no impact.
> A check is added to make the compiler happy.

We could add those lines as the real issue is the item array:

Fixes: ef4c16fd9148 ("net/i40e: refactor RSS flow")
Cc: sta...@dpdk.org

> Signed-off-by: Thomas Monjalon 
> ---
>  drivers/net/i40e/i40e_hash.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/net/i40e/i40e_hash.c b/drivers/net/i40e/i40e_hash.c
> index 8962e9d97a..ba616aea9f 100644
> --- a/drivers/net/i40e/i40e_hash.c
> +++ b/drivers/net/i40e/i40e_hash.c
> @@ -386,6 +386,8 @@ i40e_hash_get_pattern_type(const struct rte_flow_item 
> pattern[],
>   prev_item_type = last_item_type;
>   assert(last_item_type < (enum rte_flow_item_type)
>   RTE_DIM(pattern_item_header));
> + if (last_item_type >= RTE_DIM(pattern_item_header))
> + goto not_sup;
>   item_hdr = pattern_item_header[last_item_type];
>   assert(item_hdr);
>  
> 







Re: [PATCH] mem: fix API doc about allocation on secondary processes

2022-10-04 Thread David Marchand
On Fri, Sep 30, 2022 at 3:19 PM Honnappa Nagarahalli
 wrote:
> > Since 10 years, memzone allocation is allowed on secondary processes. Now
> > it's time to update the documentation accordingly.
> >
> > At the same time, fix mempool, mbuf and ring documentation which rely on
> > memzones internally.
> >
> > Bugzilla ID: 1074
> > Fixes: 916e4f4f4e45 ("memory: fix for multi process support")
Cc: sta...@dpdk.org

> >
> > Signed-off-by: Olivier Matz 
> Reviewed-by: Honnappa Nagarahalli 

Applied, thanks.


-- 
David Marchand



Re: [PATCH v7 1/4] eal: add lcore poll busyness telemetry

2022-10-04 Thread Bruce Richardson
On Tue, Oct 04, 2022 at 11:15:19AM +0200, Morten Brørup wrote:
> > From: Mattias Rönnblom [mailto:hof...@lysator.liu.se]
> > Sent: Monday, 3 October 2022 22.02
> 
> [...]
> 
> > The functionality provided is very useful, and the implementation is
> > clever in the way it doesn't require any application modifications.
> > But,
> > a clever, useful brittle hack is still a brittle hack.
> > 

I think that may be a little harsh here. After all, this is a feature which
is build-time disabled and runtime disabled by default, so like many other
components it's designed for use when it makes sense to do so.

Furthermore, I'd just like to point out that the authors, when doing the
patches, have left in the hooks so that even apps, for which the "for-free"
scheme doesn't work, can still leverage the infrastructure to have the app
itself report the busy/free metrics.

> > What if there was instead a busyness module, where the application
> > would
> > explicitly report what it was up to. The new library would hook up to
> > telemetry just like this patchset does, plus provide an explicit API to
> > retrieve lcore thread load.
> > 
> > The service cores framework (fancy name for rte_service.c) could also
> > call the lcore load tracking module, provided all services properly
> > reported back on whether or not they were doing anything useful with
> > the
> > cycles they just spent.
> > 
> > The metrics of such a load tracking module could potentially be used by
> > other modules in DPDK, or by the application. It could potentially be
> > used for dynamic load balancing of service core services, or for power
> > management (e.g, DVFS), or for a potential future deferred-work type
> > mechanism more sophisticated than current rte_service, or some green
> > threads/coroutines/fiber thingy. The DSW event device could also use it
> > to replace its current internal load estimation scheme.
> 
> [...]
> 
> I agree 100 % with everything Mattias wrote above, and I would like to voice 
> my opinion too.
> 
> This patch is full of preconditions and assumptions. Its only true advantage 
> (vs. a generic load tracking library) is that it doesn't require any 
> application modifications, and thus can be deployed with zero effort.
> 
> In my opinion, it would be much better with a well-designed generic load 
> tracking library, to be called from the application, so it gets correct 
> information about what the lcores spend their cycles doing. And as Mattias 
> mentions: With the appropriate API for consumption of the collected data, it 
> could also provide actionable statistics for use by the application itself, 
> not just telemetry. ("Actionable statistics": Statistics that is directly 
> usable for decision making.)
> 
> There is also the aspect of time-to-benefit: This patch immediately provides 
> benefits (to the users of the DPDK applications that meet the 
> preconditions/assumptions of the patch), while a generic load tracking 
> library will take years to get integrated into applications before it 
> provides benefits (to the users of the DPDK applications that use the new 
> library).
> 
> So, we should ask ourselves: Do we want an application-specific solution with 
> a short time-to-benefit, or a generic solution with a long time-to-benefit? 
> (I use the term "application specific" because not all applications can be 
> tweaked to provide meaningful data with this patch. You might also label a 
> generic library "application specific", because it requires that the 
> application uses the library - however that is a common requirement of all 
> DPDK libraries.)
> 
> Furthermore, if the proposed patch is primarily for the benefit of OVS, I 
> suppose that calls to a generic load tracking library could be added to OVS 
> within a relatively short time frame (although not as quick as this patch).
> 
> I guess that the developers of this patch initially thought that it was 
> generic and usable for the majority of applications, and it came as somewhat of 
> a surprise that it wasn't as generic as expected. The DPDK community has a 
> good review process with open discussions and sharing of thoughts and ideas. 
> Sometimes, an idea doesn't fly, because the corner cases turn out to be more 
> common than expected. I'm sorry to say it, but I think that is the case for 
> this patch. :-(
> 

I'd actually like to question this last statement a little.

I think we in the DPDK community are very good at coming up with
theoretical examples where things don't work, but are they really cases
that occur commonly in the real world? 

I accept, for example, that the "for free" approach would not be suitable
for something like VPP which does multiple polls to gather packets before
processing, but for some of the other cases I'd question their commonality.
For example, a number of objections have focused on the case where
allocation of buffers fails and so the busyness gets counted wrongly.  Are
there really (many) apps out there where runni

RE: [PATCH] raw/skeleton: remove useless check

2022-10-04 Thread Hemant Agrawal
Acked-by: Hemant Agrawal 


RE: [PATCH 7/7] app/flow-perf: add hairpin queue memory config

2022-10-04 Thread Wisam Monther
Hi Dariusz,

> -Original Message-
> From: Dariusz Sosnowski 
> Sent: Monday, September 19, 2022 7:38 PM
> To: Wisam Monther 
> Cc: dev@dpdk.org
> Subject: [PATCH 7/7] app/flow-perf: add hairpin queue memory config
> 
> This patch adds the hairpin-conf command line parameter to flow-perf
> application. hairpin-conf parameter takes a hexadecimal bitmask with bits
> having the following meaning:
> 
> - Bit 0 - Force memory settings of hairpin RX queue.
> - Bit 1 - Force memory settings of hairpin TX queue.
> - Bit 4 - Use locked device memory for hairpin RX queue.
> - Bit 5 - Use RTE memory for hairpin RX queue.
> - Bit 8 - Use locked device memory for hairpin TX queue.
> - Bit 9 - Use RTE memory for hairpin TX queue.
> 
> Signed-off-by: Dariusz Sosnowski 
> ---

You have some check issues; can you please kindly check them?

BRs,
Wisam Jaddo



DPDK build for Arm with GCC 12

2022-10-04 Thread Thomas Monjalon
Hello,

GCC 12 suspects an out-of-bound write in NEON port_groupx4():

In file included from examples/ipsec-secgw/ipsec_neon.h:9,
 from examples/ipsec-secgw/ipsec_lpm_neon.h:9,
 from examples/ipsec-secgw/ipsec_worker.c:16:
examples/common/neon/port_group.h: In function 'port_groupx4':
examples/common/neon/port_group.h:42:21: error:
array subscript 'union <anonymous>[0]' is partly outside
array bounds of 'uint16_t[5]'
   42 | pnum->u64 = gptbl[v].pnum;
  | ^~
examples/common/neon/port_group.h:21:23: note:
object 'pn' of size [0, 10]
   21 | port_groupx4(uint16_t pn[FWDSTEP + 1], uint16_t *lp, uint16x8_t dp1,
  |  ~^~~
examples/common/neon/port_group.h:43:21: error:
array subscript 'union [0]' is partly outside
array bounds of 'uint16_t[5]'
   43 | pnum->u16[FWDSTEP] = 1;
  | ^~
examples/common/neon/port_group.h:21:23: note:
object 'pn' of size [0, 10]
   21 | port_groupx4(uint16_t pn[FWDSTEP + 1], uint16_t *lp, uint16x8_t dp1,
  |  ~^~~

Please could you help fixing it?




[PATCH v3] mempool: fix get objects from mempool with cache

2022-10-04 Thread Andrew Rybchenko
From: Morten Brørup 

A flush threshold for the mempool cache was introduced in DPDK version
1.3, but rte_mempool_do_generic_get() was not completely updated back
then, and some inefficiencies were introduced.

Fix the following in rte_mempool_do_generic_get():

1. The code that initially screens the cache request was not updated
with the change in DPDK version 1.3.
The initial screening compared the request length to the cache size,
which was correct before, but became irrelevant with the introduction of
the flush threshold. E.g. the cache can hold up to flushthresh objects,
which is more than its size, so some requests were not served from the
cache, even though they could be.
The initial screening has now been corrected to match the initial
screening in rte_mempool_do_generic_put(), which verifies that a cache
is present, and that the length of the request does not overflow the
memory allocated for the cache.

This bug caused a major performance degradation in scenarios where the
application burst length is the same as the cache size. In such cases,
the objects were not ever fetched from the mempool cache, regardless if
they could have been.
This scenario occurs e.g. if an application has configured a mempool
with a size matching the application's burst size.

2. The function is a helper for rte_mempool_generic_get(), so it must
behave according to the description of that function.
Specifically, objects must first be returned from the cache,
subsequently from the ring.
After the change in DPDK version 1.3, this was not the behavior when
the request was partially satisfied from the cache; instead, the objects
from the ring were returned ahead of the objects from the cache.
This bug degraded application performance on CPUs with a small L1 cache,
which benefit from having the hot objects first in the returned array.
(This is probably also the reason why the function returns the objects
in reverse order, which it still does.)
Now, all code paths first return objects from the cache, subsequently
from the ring.

The function was not behaving as described (by the function using it)
and expected by applications using it. This in itself is also a bug.

3. If the cache could not be backfilled, the function would attempt
to get all the requested objects from the ring (instead of only the
number of requested objects minus the objects available in the ring),
and the function would fail if that failed.
Now, the first part of the request is always satisfied from the cache,
and if the subsequent backfilling of the cache from the ring fails, only
the remaining requested objects are retrieved from the ring.

The function would fail despite there being enough objects in the cache
plus the common pool.

4. The code flow for satisfying the request from the cache was slightly
inefficient:
The likely code path where the objects are simply served from the cache
was treated as unlikely. Now it is treated as likely.

Signed-off-by: Morten Brørup 
Signed-off-by: Andrew Rybchenko 
---
v3 changes (Andrew Rybchenko)
 - Always get first objects from the cache even if request is bigger
   than cache size. Remove one corresponding condition from the path
   when request is fully served from cache.
 - Simplify code to avoid duplication:
- Get objects directly from backend in single place only.
- Share code which gets from the cache first regardless if
  everything is obtained from the cache or just the first part.
 - Rollback cache length in unlikely failure branch to avoid cache
   vs NULL check in success branch.

v2 changes
- Do not modify description of return value. This belongs in a separate
doc fix.
- Elaborate even more on which bugs the modifications fix.

 lib/mempool/rte_mempool.h | 74 +--
 1 file changed, 48 insertions(+), 26 deletions(-)

diff --git a/lib/mempool/rte_mempool.h b/lib/mempool/rte_mempool.h
index a3c4ee351d..58e41ed401 100644
--- a/lib/mempool/rte_mempool.h
+++ b/lib/mempool/rte_mempool.h
@@ -1443,41 +1443,54 @@ rte_mempool_do_generic_get(struct rte_mempool *mp, void 
**obj_table,
   unsigned int n, struct rte_mempool_cache *cache)
 {
int ret;
+   unsigned int remaining = n;
uint32_t index, len;
void **cache_objs;
 
-   /* No cache provided or cannot be satisfied from cache */
-   if (unlikely(cache == NULL || n >= cache->size))
+   /* No cache provided */
+   if (unlikely(cache == NULL))
goto ring_dequeue;
 
-   cache_objs = cache->objs;
+   /* Use the cache as much as we have to return hot objects first */
+   len = RTE_MIN(remaining, cache->len);
+   cache_objs = &cache->objs[cache->len];
+   cache->len -= len;
+   remaining -= len;
+   for (index = 0; index < len; index++)
+   *obj_table++ = *--cache_objs;
 
-   /* Can this be satisfied from the cache? */
-   if (cache->len < n) {
-   /* No. Backfill the cache first, an

[PATCH v4 0/5] add remaining SGL support to AESNI_MB

2022-10-04 Thread Ciara Power
Currently, the intel-ipsec-mb library only supports SGL for
GCM and ChaCha20-Poly1305 algorithms through the JOB API.

To add SGL support for other algorithms, a workaround approach is
added in the AESNI_MB PMD. SGL feature flags can now be added to
the PMD.

This patchset also includes a fix for SGL wireless operations,
session cleanup and session creation for sessionless operations.

Some additional Snow3G SGL and AES tests are also added for
various SGL input/output combinations that were not
previously being tested.

v4: Added error check when appending space for digest to buffer.

v3:
  - Modified fix to reset sessions, and ensure values are then set for
sessionless testcases. V2 fix just ensured the same values in
session objects were reused, as they were not being reset,
which was incorrect.
  - Reduced code duplication by adding a reusable function.
  - Changed int to uint64_t for total_len.

v2:
  - Added documentation changes.
  - Added fix for sessionless cleanup.
  - Modified blockcipher tests to support various SGL types.
  - Added more SGL AES tests.
  - Small fixes.

Ciara Power (5):
  test/crypto: fix wireless auth digest segment
  crypto/ipsec_mb: fix session creation for sessionless
  crypto/ipsec_mb: add remaining SGL support
  test/crypto: add OOP snow3g SGL tests
  test/crypto: add remaining blockcipher SGL tests

 app/test/test_cryptodev.c   |  58 +++-
 app/test/test_cryptodev_aes_test_vectors.h  | 345 +---
 app/test/test_cryptodev_blockcipher.c   |  50 +--
 app/test/test_cryptodev_blockcipher.h   |   2 +
 app/test/test_cryptodev_hash_test_vectors.h |   8 +-
 doc/guides/cryptodevs/aesni_mb.rst  |   1 -
 doc/guides/cryptodevs/features/aesni_mb.ini |   4 +
 doc/guides/rel_notes/release_22_11.rst  |   5 +
 drivers/crypto/ipsec_mb/ipsec_mb_private.h  |  12 +-
 drivers/crypto/ipsec_mb/pmd_aesni_mb.c  | 180 --
 lib/cryptodev/rte_cryptodev.c   |   1 +
 11 files changed, 549 insertions(+), 117 deletions(-)

-- 
2.25.1



[PATCH v4 2/5] crypto/ipsec_mb: fix session creation for sessionless

2022-10-04 Thread Ciara Power
Currently, for a sessionless op, the session taken from the mempool
contains some values previously set by a testcase that does use a
session. This is due to the session object not being reset before going
back into the mempool.

This caused issues when multiple sessionless testcases ran, as the
previously set objects were being used for the first few testcases, but
subsequent testcases used empty objects, as they were being correctly
reset by the sessionless testcases.

To fix this, the session objects are now reset before being returned to
the mempool for session testcases. In addition, rather than pulling the
session object directly from the mempool for sessionless testcases, the
session_create() function is now used, which sets the required values,
such as nb_drivers.

Fixes: c75542ae4200 ("crypto/ipsec_mb: introduce IPsec_mb framework")
Fixes: b3bbd9e5f265 ("cryptodev: support device independent sessions")
Cc: roy.fan.zh...@intel.com
Cc: slawomirx.mrozow...@intel.com

Signed-off-by: Ciara Power 
Acked-by: Fan Zhang 
Acked-by: Pablo de Lara 

---
v3:
  - Modified fix to reset sessions, and ensure values are then set for
sessionless testcases. V2 fix just ensured the same values in
session objects were reused, as they were not being reset,
which was incorrect.
---
 drivers/crypto/ipsec_mb/ipsec_mb_private.h | 12 
 lib/cryptodev/rte_cryptodev.c  |  1 +
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/crypto/ipsec_mb/ipsec_mb_private.h 
b/drivers/crypto/ipsec_mb/ipsec_mb_private.h
index 472b672f08..420701a818 100644
--- a/drivers/crypto/ipsec_mb/ipsec_mb_private.h
+++ b/drivers/crypto/ipsec_mb/ipsec_mb_private.h
@@ -415,7 +415,7 @@ ipsec_mb_get_session_private(struct ipsec_mb_qp *qp, struct 
rte_crypto_op *op)
uint32_t driver_id = ipsec_mb_get_driver_id(qp->pmd_type);
struct rte_crypto_sym_op *sym_op = op->sym;
uint8_t sess_type = op->sess_type;
-   void *_sess;
+   struct rte_cryptodev_sym_session *_sess;
void *_sess_private_data = NULL;
struct ipsec_mb_internals *pmd_data = &ipsec_mb_pmds[qp->pmd_type];
 
@@ -426,8 +426,12 @@ ipsec_mb_get_session_private(struct ipsec_mb_qp *qp, 
struct rte_crypto_op *op)
driver_id);
break;
case RTE_CRYPTO_OP_SESSIONLESS:
-   if (!qp->sess_mp ||
-   rte_mempool_get(qp->sess_mp, (void **)&_sess))
+   if (!qp->sess_mp)
+   return NULL;
+
+   _sess = rte_cryptodev_sym_session_create(qp->sess_mp);
+
+   if (!_sess)
return NULL;
 
if (!qp->sess_mp_priv ||
@@ -443,7 +447,7 @@ ipsec_mb_get_session_private(struct ipsec_mb_qp *qp, struct 
rte_crypto_op *op)
sess = NULL;
}
 
-   sym_op->session = (struct rte_cryptodev_sym_session *)_sess;
+   sym_op->session = _sess;
set_sym_session_private_data(sym_op->session, driver_id,
 _sess_private_data);
break;
diff --git a/lib/cryptodev/rte_cryptodev.c b/lib/cryptodev/rte_cryptodev.c
index 9e76a1c72d..ac0c508e76 100644
--- a/lib/cryptodev/rte_cryptodev.c
+++ b/lib/cryptodev/rte_cryptodev.c
@@ -2187,6 +2187,7 @@ rte_cryptodev_sym_session_free(struct 
rte_cryptodev_sym_session *sess)
 
/* Return session to mempool */
sess_mp = rte_mempool_from_obj(sess);
+   memset(sess, 0, rte_cryptodev_sym_get_existing_header_session_size(sess));
rte_mempool_put(sess_mp, sess);
 
rte_cryptodev_trace_sym_session_free(sess);
-- 
2.25.1



[PATCH v4 1/5] test/crypto: fix wireless auth digest segment

2022-10-04 Thread Ciara Power
The segment size for some tests was too small to hold the auth digest.
This caused issues when using op->sym->auth.digest.data for comparisons
in AESNI_MB PMD after a subsequent patch enables SGL.

For example, if segment size is 2, and digest size is 4, then 4 bytes
are read from op->sym->auth.digest.data, which overflows into the memory
after the segment, rather than using the second segment that contains
the remaining half of the digest.
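The failure mode can be illustrated with a toy segment chain. The `seg` type and the `digest_fits_in_segment()` helper below are hypothetical stand-ins for the rte_mbuf chain walk, not the real API: the check walks to the segment that contains the digest offset and verifies that segment has room for the whole digest, which is the condition the patch enforces before comparing digest bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a segmented mbuf chain; not the real rte_mbuf API. */
struct seg {
	uint16_t data_len;
	struct seg *next;
};

/*
 * Walk to the segment that contains byte `offset` and check whether that
 * segment has room for a full `digest_len`-byte read starting there.
 * With 2-byte segments and a 4-byte digest, reading the digest through a
 * pointer into one segment would run past the end of that segment.
 */
int digest_fits_in_segment(const struct seg *s, uint32_t offset,
			   uint16_t digest_len)
{
	while (s != NULL && offset >= s->data_len) {
		offset -= s->data_len;
		s = s->next;
	}
	if (s == NULL)
		return 0;
	return (uint32_t)s->data_len - offset >= digest_len;
}
```

With 2-byte segments, a digest that starts at offset 2 lands in the second segment, which holds only half of a 4-byte digest, so the check fails and the test must grow the last segment.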

Fixes: 11c5485bb276 ("test/crypto: add scatter-gather tests for IP and OOP")

Signed-off-by: Ciara Power 
Acked-by: Fan Zhang 
Acked-by: Pablo de Lara 

---
v4: Added failure check when appending digest size to buffer.
---
 app/test/test_cryptodev.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/app/test/test_cryptodev.c b/app/test/test_cryptodev.c
index 0c39b16b71..799eff0649 100644
--- a/app/test/test_cryptodev.c
+++ b/app/test/test_cryptodev.c
@@ -3051,6 +3051,16 @@ create_wireless_algo_auth_cipher_operation(
remaining_off -= rte_pktmbuf_data_len(sgl_buf);
sgl_buf = sgl_buf->next;
}
+
+   /* The last segment should be large enough to hold full digest */
+   if (sgl_buf->data_len < auth_tag_len) {
+   rte_pktmbuf_free(sgl_buf->next);
+   sgl_buf->next = NULL;
+   TEST_ASSERT_NOT_NULL(rte_pktmbuf_append(sgl_buf,
+   auth_tag_len - sgl_buf->data_len),
+   "No room to append auth tag");
+   }
+
sym_op->auth.digest.data = rte_pktmbuf_mtod_offset(sgl_buf,
uint8_t *, remaining_off);
sym_op->auth.digest.phys_addr = rte_pktmbuf_iova_offset(sgl_buf,
-- 
2.25.1



[PATCH v4 4/5] test/crypto: add OOP snow3g SGL tests

2022-10-04 Thread Ciara Power
More tests are added to test variations of OOP SGL for snow3g.
This includes LB_IN_SGL_OUT and SGL_IN_LB_OUT.

Signed-off-by: Ciara Power 
Acked-by: Fan Zhang 
Acked-by: Pablo de Lara 
---
 app/test/test_cryptodev.c | 48 +++
 1 file changed, 39 insertions(+), 9 deletions(-)

diff --git a/app/test/test_cryptodev.c b/app/test/test_cryptodev.c
index 799eff0649..e732daae03 100644
--- a/app/test/test_cryptodev.c
+++ b/app/test/test_cryptodev.c
@@ -4360,7 +4360,8 @@ test_snow3g_encryption_oop(const struct snow3g_test_data 
*tdata)
 }
 
 static int
-test_snow3g_encryption_oop_sgl(const struct snow3g_test_data *tdata)
+test_snow3g_encryption_oop_sgl(const struct snow3g_test_data *tdata,
+   uint8_t sgl_in, uint8_t sgl_out)
 {
struct crypto_testsuite_params *ts_params = &testsuite_params;
struct crypto_unittest_params *ut_params = &unittest_params;
@@ -4391,9 +4392,12 @@ test_snow3g_encryption_oop_sgl(const struct 
snow3g_test_data *tdata)
 
uint64_t feat_flags = dev_info.feature_flags;
 
-   if (!(feat_flags & RTE_CRYPTODEV_FF_OOP_SGL_IN_SGL_OUT)) {
-   printf("Device doesn't support out-of-place scatter-gather "
-   "in both input and output mbufs. "
+   if (((sgl_in && sgl_out) && !(feat_flags & RTE_CRYPTODEV_FF_OOP_SGL_IN_SGL_OUT))
+   || ((!sgl_in && sgl_out) &&
+   !(feat_flags & RTE_CRYPTODEV_FF_OOP_LB_IN_SGL_OUT))
+   || ((sgl_in && !sgl_out) &&
+   !(feat_flags & RTE_CRYPTODEV_FF_OOP_SGL_IN_LB_OUT))) {
+   printf("Device doesn't support out-of-place scatter gather type. "
"Test Skipped.\n");
return TEST_SKIPPED;
}
@@ -4418,10 +4422,21 @@ test_snow3g_encryption_oop_sgl(const struct 
snow3g_test_data *tdata)
/* the algorithms block size */
plaintext_pad_len = RTE_ALIGN_CEIL(plaintext_len, 16);
 
-   ut_params->ibuf = create_segmented_mbuf(ts_params->mbuf_pool,
-   plaintext_pad_len, 10, 0);
-   ut_params->obuf = create_segmented_mbuf(ts_params->mbuf_pool,
-   plaintext_pad_len, 3, 0);
+   if (sgl_in)
+   ut_params->ibuf = create_segmented_mbuf(ts_params->mbuf_pool,
+   plaintext_pad_len, 10, 0);
+   else {
+   ut_params->ibuf = rte_pktmbuf_alloc(ts_params->mbuf_pool);
+   rte_pktmbuf_append(ut_params->ibuf, plaintext_pad_len);
+   }
+
+   if (sgl_out)
+   ut_params->obuf = create_segmented_mbuf(ts_params->mbuf_pool,
+   plaintext_pad_len, 3, 0);
+   else {
+   ut_params->obuf = rte_pktmbuf_alloc(ts_params->mbuf_pool);
+   rte_pktmbuf_append(ut_params->obuf, plaintext_pad_len);
+   }
 
TEST_ASSERT_NOT_NULL(ut_params->ibuf,
"Failed to allocate input buffer in mempool");
@@ -6775,9 +6790,20 @@ test_snow3g_encryption_test_case_1_oop(void)
 static int
 test_snow3g_encryption_test_case_1_oop_sgl(void)
 {
-   return test_snow3g_encryption_oop_sgl(&snow3g_test_case_1);
+   return test_snow3g_encryption_oop_sgl(&snow3g_test_case_1, 1, 1);
+}
+
+static int
+test_snow3g_encryption_test_case_1_oop_lb_in_sgl_out(void)
+{
+   return test_snow3g_encryption_oop_sgl(&snow3g_test_case_1, 0, 1);
 }
 
+static int
+test_snow3g_encryption_test_case_1_oop_sgl_in_lb_out(void)
+{
+   return test_snow3g_encryption_oop_sgl(&snow3g_test_case_1, 1, 0);
+}
 
 static int
 test_snow3g_encryption_test_case_1_offset_oop(void)
@@ -16006,6 +16032,10 @@ static struct unit_test_suite 
cryptodev_snow3g_testsuite  = {
test_snow3g_encryption_test_case_1_oop),
TEST_CASE_ST(ut_setup, ut_teardown,
test_snow3g_encryption_test_case_1_oop_sgl),
+   TEST_CASE_ST(ut_setup, ut_teardown,
+   test_snow3g_encryption_test_case_1_oop_lb_in_sgl_out),
+   TEST_CASE_ST(ut_setup, ut_teardown,
+   test_snow3g_encryption_test_case_1_oop_sgl_in_lb_out),
TEST_CASE_ST(ut_setup, ut_teardown,
test_snow3g_encryption_test_case_1_offset_oop),
TEST_CASE_ST(ut_setup, ut_teardown,
-- 
2.25.1



[PATCH v4 3/5] crypto/ipsec_mb: add remaining SGL support

2022-10-04 Thread Ciara Power
The intel-ipsec-mb library supports SGL for GCM and ChaChaPoly
algorithms using the JOB API.
This support was added to AESNI_MB PMD previously, but the SGL feature
flags could not be added due to no SGL support for other algorithms.

This patch adds a workaround SGL approach for other algorithms
using the JOB API. The segmented input buffers are copied into a
linear buffer, which is passed as a single job to intel-ipsec-mb.
The job is processed, and on return, the linear buffer is split into the
original destination segments.
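The workaround described above (gather the segments into a linear buffer, run one linear job, scatter the result back) can be sketched as follows. The `seg` type and both helpers are simplified assumptions for illustration, not the rte_mbuf or intel-ipsec-mb APIs:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Simplified segment type standing in for a chained mbuf. */
struct seg {
	uint8_t *data;
	uint32_t len;
	struct seg *next;
};

/* Gather a segment chain into one linear buffer; returns the total length. */
uint32_t sgl_to_linear(const struct seg *s, uint8_t *linear)
{
	uint32_t off = 0;

	for (; s != NULL; s = s->next) {
		memcpy(linear + off, s->data, s->len);
		off += s->len;
	}
	return off;
}

/* Scatter a processed linear buffer back into the destination segments. */
void linear_to_sgl(const uint8_t *linear, uint32_t len, struct seg *d)
{
	uint32_t off = 0;

	for (; d != NULL && off < len; d = d->next) {
		uint32_t n = d->len < len - off ? d->len : len - off;

		memcpy(d->data, linear + off, n);
		off += n;
	}
}
```

The actual patch does the same around a single intel-ipsec-mb job: linearize the source chain, submit one job over the linear buffer, then split the output across the destination segments.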

Existing AESNI_MB testcases are passing with these feature flags added.

Signed-off-by: Ciara Power 
Acked-by: Fan Zhang 
Acked-by: Pablo de Lara 

---
v3:
  - Reduced code duplication by adding a reusable function.
  - Changed int to uint64_t for total_len.
v2:
  - Small improvements when copying segments to linear buffer.
  - Added documentation changes.
---
 doc/guides/cryptodevs/aesni_mb.rst  |   1 -
 doc/guides/cryptodevs/features/aesni_mb.ini |   4 +
 doc/guides/rel_notes/release_22_11.rst  |   5 +
 drivers/crypto/ipsec_mb/pmd_aesni_mb.c  | 180 
 4 files changed, 156 insertions(+), 34 deletions(-)

diff --git a/doc/guides/cryptodevs/aesni_mb.rst 
b/doc/guides/cryptodevs/aesni_mb.rst
index 07222ee117..59c134556f 100644
--- a/doc/guides/cryptodevs/aesni_mb.rst
+++ b/doc/guides/cryptodevs/aesni_mb.rst
@@ -72,7 +72,6 @@ Protocol offloads:
 Limitations
 ---
 
-* Chained mbufs are not supported.
 * Out-of-place is not supported for combined Crypto-CRC DOCSIS security
   protocol.
 * RTE_CRYPTO_CIPHER_DES_DOCSISBPI is not supported for combined Crypto-CRC
diff --git a/doc/guides/cryptodevs/features/aesni_mb.ini 
b/doc/guides/cryptodevs/features/aesni_mb.ini
index 3c648a391e..e4e965c35a 100644
--- a/doc/guides/cryptodevs/features/aesni_mb.ini
+++ b/doc/guides/cryptodevs/features/aesni_mb.ini
@@ -12,6 +12,10 @@ CPU AVX= Y
 CPU AVX2   = Y
 CPU AVX512 = Y
 CPU AESNI  = Y
+In Place SGL   = Y
+OOP SGL In SGL Out = Y
+OOP SGL In LB  Out = Y
+OOP LB  In SGL Out = Y
 OOP LB  In LB  Out = Y
 CPU crypto = Y
 Symmetric sessionless  = Y
diff --git a/doc/guides/rel_notes/release_22_11.rst 
b/doc/guides/rel_notes/release_22_11.rst
index 53fe21453c..81f7f978a4 100644
--- a/doc/guides/rel_notes/release_22_11.rst
+++ b/doc/guides/rel_notes/release_22_11.rst
@@ -102,6 +102,11 @@ New Features
   * Added AES-CCM support in lookaside protocol (IPsec) for CN9K & CN10K.
   * Added AES & DES DOCSIS algorithm support in lookaside crypto for CN9K.
 
+* **Added SGL support to AESNI_MB PMD.**
+
+  Added support for SGL to AESNI_MB PMD. Support for inplace,
+  OOP SGL in SGL out, OOP LB in SGL out, and OOP SGL in LB out added.
+
 * **Added eventdev adapter instance get API.**
 
   * Added ``rte_event_eth_rx_adapter_instance_get`` to get Rx adapter
diff --git a/drivers/crypto/ipsec_mb/pmd_aesni_mb.c 
b/drivers/crypto/ipsec_mb/pmd_aesni_mb.c
index 6d5d3ce8eb..62f7d4ee5a 100644
--- a/drivers/crypto/ipsec_mb/pmd_aesni_mb.c
+++ b/drivers/crypto/ipsec_mb/pmd_aesni_mb.c
@@ -937,7 +937,7 @@ static inline uint64_t
 auth_start_offset(struct rte_crypto_op *op, struct aesni_mb_session *session,
uint32_t oop, const uint32_t auth_offset,
const uint32_t cipher_offset, const uint32_t auth_length,
-   const uint32_t cipher_length)
+   const uint32_t cipher_length, uint8_t lb_sgl)
 {
struct rte_mbuf *m_src, *m_dst;
uint8_t *p_src, *p_dst;
@@ -945,7 +945,7 @@ auth_start_offset(struct rte_crypto_op *op, struct 
aesni_mb_session *session,
uint32_t cipher_end, auth_end;
 
/* Only cipher then hash needs special calculation. */
-   if (!oop || session->chain_order != IMB_ORDER_CIPHER_HASH)
+   if (!oop || session->chain_order != IMB_ORDER_CIPHER_HASH || lb_sgl)
return auth_offset;
 
m_src = op->sym->m_src;
@@ -1159,6 +1159,81 @@ handle_aead_sgl_job(IMB_JOB *job, IMB_MGR *mb_mgr,
return 0;
 }
 
+static uint64_t
+sgl_linear_cipher_auth_len(IMB_JOB *job, uint64_t *auth_len)
+{
+   uint64_t cipher_len;
+
+   if (job->cipher_mode == IMB_CIPHER_SNOW3G_UEA2_BITLEN ||
+   job->cipher_mode == IMB_CIPHER_KASUMI_UEA1_BITLEN)
+   cipher_len = (job->msg_len_to_cipher_in_bits >> 3) +
+   (job->cipher_start_src_offset_in_bits >> 3);
+   else
+   cipher_len = job->msg_len_to_cipher_in_bytes +
+   job->cipher_start_src_offset_in_bytes;
+
+   if (job->hash_alg == IMB_AUTH_SNOW3G_UIA2_BITLEN ||
+   job->hash_alg == IMB_AUTH_ZUC_EIA3_BITLEN)
+   *auth_len = (job->msg_len_to_hash_in_bits >> 3) +
+   job->hash_start_src_offset_in_bytes;
+   else if (job->hash_alg == IMB_AUTH_AES_GMAC)
+   *auth_len = job->u.GCM.aad_

[PATCH v4 5/5] test/crypto: add remaining blockcipher SGL tests

2022-10-04 Thread Ciara Power
The current blockcipher test function only has support for two types of
SGL test, INPLACE or OOP_SGL_IN_LB_OUT. These types are hardcoded into
the function, with the number of segments always set to 3.

To ensure all SGL types are tested, blockcipher test vectors now have
fields to specify SGL type, and the number of segments.
If these fields are missing, the previous defaults are used,
either INPLACE or OOP_SGL_IN_LB_OUT, with 3 segments.

Some AES and Hash vectors are modified to use these new fields, and new
AES tests are added to test the SGL types that were not previously
being tested.

Signed-off-by: Ciara Power 
Acked-by: Fan Zhang 
Acked-by: Pablo de Lara 
---
 app/test/test_cryptodev_aes_test_vectors.h  | 345 +---
 app/test/test_cryptodev_blockcipher.c   |  50 +--
 app/test/test_cryptodev_blockcipher.h   |   2 +
 app/test/test_cryptodev_hash_test_vectors.h |   8 +-
 4 files changed, 335 insertions(+), 70 deletions(-)

diff --git a/app/test/test_cryptodev_aes_test_vectors.h 
b/app/test/test_cryptodev_aes_test_vectors.h
index a797af1b00..2c1875d3d9 100644
--- a/app/test/test_cryptodev_aes_test_vectors.h
+++ b/app/test/test_cryptodev_aes_test_vectors.h
@@ -4163,12 +4163,44 @@ static const struct blockcipher_test_case 
aes_chain_test_cases[] = {
},
{
.test_descr = "AES-192-CTR XCBC Decryption Digest Verify "
-   "Scatter Gather",
+   "Scatter Gather (Inplace)",
+   .test_data = &aes_test_data_2,
+   .op_mask = BLOCKCIPHER_TEST_OP_AUTH_VERIFY_DEC,
+   .feature_mask = BLOCKCIPHER_TEST_FEATURE_SG,
+   .sgl_flag = RTE_CRYPTODEV_FF_IN_PLACE_SGL,
+   .sgl_segs = 3
+   },
+   {
+   .test_descr = "AES-192-CTR XCBC Decryption Digest Verify "
+   "Scatter Gather OOP (SGL in SGL out)",
+   .test_data = &aes_test_data_2,
+   .op_mask = BLOCKCIPHER_TEST_OP_AUTH_VERIFY_DEC,
+   .feature_mask = BLOCKCIPHER_TEST_FEATURE_SG |
+   BLOCKCIPHER_TEST_FEATURE_OOP,
+   .sgl_flag = RTE_CRYPTODEV_FF_OOP_SGL_IN_SGL_OUT,
+   .sgl_segs = 3
+   },
+   {
+   .test_descr = "AES-192-CTR XCBC Decryption Digest Verify "
+   "Scatter Gather OOP (LB in SGL out)",
.test_data = &aes_test_data_2,
.op_mask = BLOCKCIPHER_TEST_OP_AUTH_VERIFY_DEC,
.feature_mask = BLOCKCIPHER_TEST_FEATURE_SG |
BLOCKCIPHER_TEST_FEATURE_OOP,
+   .sgl_flag = RTE_CRYPTODEV_FF_OOP_LB_IN_SGL_OUT,
+   .sgl_segs = 3
},
+   {
+   .test_descr = "AES-192-CTR XCBC Decryption Digest Verify "
+   "Scatter Gather OOP (SGL in LB out)",
+   .test_data = &aes_test_data_2,
+   .op_mask = BLOCKCIPHER_TEST_OP_AUTH_VERIFY_DEC,
+   .feature_mask = BLOCKCIPHER_TEST_FEATURE_SG |
+   BLOCKCIPHER_TEST_FEATURE_OOP,
+   .sgl_flag = RTE_CRYPTODEV_FF_OOP_SGL_IN_LB_OUT,
+   .sgl_segs = 3
+   },
+
{
.test_descr = "AES-256-CTR HMAC-SHA1 Encryption Digest",
.test_data = &aes_test_data_3,
@@ -4193,11 +4225,52 @@ static const struct blockcipher_test_case 
aes_chain_test_cases[] = {
},
{
.test_descr = "AES-128-CBC HMAC-SHA1 Encryption Digest "
-   "Scatter Gather",
+   "Scatter Gather (Inplace)",
+   .test_data = &aes_test_data_4,
+   .op_mask = BLOCKCIPHER_TEST_OP_ENC_AUTH_GEN,
+   .feature_mask = BLOCKCIPHER_TEST_FEATURE_SG,
+   .sgl_flag = RTE_CRYPTODEV_FF_IN_PLACE_SGL,
+   .sgl_segs = 3
+   },
+   {
+   .test_descr = "AES-128-CBC HMAC-SHA1 Encryption Digest "
+   "Scatter Gather OOP (SGL in SGL out)",
+   .test_data = &aes_test_data_4,
+   .op_mask = BLOCKCIPHER_TEST_OP_ENC_AUTH_GEN,
+   .feature_mask = BLOCKCIPHER_TEST_FEATURE_SG |
+   BLOCKCIPHER_TEST_FEATURE_OOP,
+   .sgl_flag = RTE_CRYPTODEV_FF_OOP_SGL_IN_SGL_OUT,
+   .sgl_segs = 3
+   },
+   {
+   .test_descr = "AES-128-CBC HMAC-SHA1 Encryption Digest "
+   "Scatter Gather OOP 16 segs (SGL in SGL out)",
+   .test_data = &aes_test_data_4,
+   .op_mask = BLOCKCIPHER_TEST_OP_ENC_AUTH_GEN,
+   .feature_mask = BLOCKCIPHER_TEST_FEATURE_SG |
+   BLOCKCIPHER_TEST_FEATURE_OOP,
+   .sgl_flag = RTE_CRYPTODEV_FF_OOP_SGL_IN_SGL_OUT,
+   .sgl_segs = 16
+   },
+   {
+   .test_descr = "AES-128-CBC HMAC-SHA1 Encryption Digest "
+   

Re: [PATCH v2] mempool: fix get objects from mempool with cache

2022-10-04 Thread Andrew Rybchenko

Hi Morten,

In general I agree that the fix is required.
In sent v3 I'm trying to make it a bit better from my point of
view. See few notes below.

On 2/2/22 11:14, Morten Brørup wrote:

A flush threshold for the mempool cache was introduced in DPDK version
1.3, but rte_mempool_do_generic_get() was not completely updated back
then, and some inefficiencies were introduced.

This patch fixes the following in rte_mempool_do_generic_get():

1. The code that initially screens the cache request was not updated
with the change in DPDK version 1.3.
The initial screening compared the request length to the cache size,
which was correct before, but became irrelevant with the introduction of
the flush threshold. E.g. the cache can hold up to flushthresh objects,
which is more than its size, so some requests were not served from the
cache, even though they could be.
The initial screening has now been corrected to match the initial
screening in rte_mempool_do_generic_put(), which verifies that a cache
is present, and that the length of the request does not overflow the
memory allocated for the cache.

This bug caused a major performance degradation in scenarios where the
application burst length is the same as the cache size. In such cases,
the objects were not ever fetched from the mempool cache, regardless if
they could have been.
This scenario occurs e.g. if an application has configured a mempool
with a size matching the application's burst size.

2. The function is a helper for rte_mempool_generic_get(), so it must
behave according to the description of that function.
Specifically, objects must first be returned from the cache,
subsequently from the ring.
After the change in DPDK version 1.3, this was not the behavior when
the request was partially satisfied from the cache; instead, the objects
from the ring were returned ahead of the objects from the cache.
This bug degraded application performance on CPUs with a small L1 cache,
which benefit from having the hot objects first in the returned array.
(This is probably also the reason why the function returns the objects
in reverse order, which it still does.)
Now, all code paths first return objects from the cache, subsequently
from the ring.

The function was not behaving as described (by the function using it)
and expected by applications using it. This in itself is also a bug.

3. If the cache could not be backfilled, the function would attempt
to get all the requested objects from the ring (instead of only the
number of requested objects minus the objects available in the cache),
and the function would fail if that failed.
Now, the first part of the request is always satisfied from the cache,
and if the subsequent backfilling of the cache from the ring fails, only
the remaining requested objects are retrieved from the ring.

The function would fail despite there being enough objects in the cache
plus the common pool.

4. The code flow for satisfying the request from the cache was slightly
inefficient:
The likely code path where the objects are simply served from the cache
was treated as unlikely. Now it is treated as likely.
And in the code path where the cache was backfilled first, numbers were
added and subtracted from the cache length; now this code path simply
sets the cache length to its final value.
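The corrected ordering in points 2 and 3 can be sketched with a toy model. The types and helpers below are assumptions for illustration only; the real rte_mempool code additionally handles the flush threshold and backfilling the cache from the ring, which are omitted here:

```c
#include <assert.h>
#include <stdint.h>

#define TOY_CACHE_MAX 32

/* Toy per-lcore cache and backing ring; not the real rte_mempool types. */
struct toy_cache { uint32_t len; void *objs[TOY_CACHE_MAX]; };
struct toy_ring  { uint32_t count; void *objs[64]; };

int toy_ring_dequeue(struct toy_ring *r, void **tbl, uint32_t n)
{
	uint32_t i;

	if (r->count < n)
		return -1;
	for (i = 0; i < n; i++)
		tbl[i] = r->objs[--r->count];
	return 0;
}

/*
 * Corrected flow: serve as many objects as possible from the cache first
 * (hot objects, returned in reverse order), then dequeue only the
 * remainder from the ring -- not the full request.
 */
int toy_get(struct toy_cache *c, struct toy_ring *r, void **tbl, uint32_t n)
{
	uint32_t i, from_cache = n < c->len ? n : c->len;

	for (i = 0; i < from_cache; i++)
		tbl[i] = c->objs[--c->len];
	if (from_cache == n)
		return 0;
	return toy_ring_dequeue(r, tbl + from_cache, n - from_cache);
}
```

With a 2-object cache and a 4-object request, the first two returned objects come from the cache and only two are dequeued from the ring, so the request succeeds whenever cache plus ring together hold enough objects.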


I've just sent v3 with suggested changes to the patch.



v2 changes
- Do not modify description of return value. This belongs in a separate
doc fix.
- Elaborate even more on which bugs the modifications fix.

Signed-off-by: Morten Brørup 
---
  lib/mempool/rte_mempool.h | 75 ---
  1 file changed, 54 insertions(+), 21 deletions(-)

diff --git a/lib/mempool/rte_mempool.h b/lib/mempool/rte_mempool.h
index 1e7a3c1527..2898c690b0 100644
--- a/lib/mempool/rte_mempool.h
+++ b/lib/mempool/rte_mempool.h
@@ -1463,38 +1463,71 @@ rte_mempool_do_generic_get(struct rte_mempool *mp, void 
**obj_table,
uint32_t index, len;
void **cache_objs;
  
-	/* No cache provided or cannot be satisfied from cache */

-   if (unlikely(cache == NULL || n >= cache->size))
+   /* No cache provided or if get would overflow mem allocated for cache */
+   if (unlikely(cache == NULL || n > RTE_MEMPOOL_CACHE_MAX_SIZE))


The second condition is unnecessary until we try to fill in
cache from backend.


goto ring_dequeue;
  
-	cache_objs = cache->objs;

+   cache_objs = &cache->objs[cache->len];
+
+   if (n <= cache->len) {
+   /* The entire request can be satisfied from the cache. */
+   cache->len -= n;
+   for (index = 0; index < n; index++)
+   *obj_table++ = *--cache_objs;
+
+   RTE_MEMPOOL_STAT_ADD(mp, get_success_bulk, 1);
+   RTE_MEMPOOL_STAT_ADD(mp, get_success_objs, n);
  
-	/* Can this be satisfied from the cache? */

-   if (cache->len < n) {
-   /* No. Backfill the cache first, and then fill from it */
-  

Re: [PATCH] remove prefix to some local macros in apps and examples

2022-10-04 Thread Ferruh Yigit

On 10/4/2022 9:01 AM, David Marchand wrote:

RTE_TEST_[RT]X_DESC_DEFAULT and RTE_TEST_[RT]X_DESC_MAX macros have been
copied in a lot of app/ and examples/ code.
Those macros are local to each program.

They are not related to a DPDK public header/API, drop the RTE_TEST_
prefix.

Signed-off-by: David Marchand


Acked-by: Ferruh Yigit 


[PATCH v8] eal: add bus cleanup to eal cleanup

2022-10-04 Thread Kevin Laatz
During EAL init, all buses are probed and the devices found are
initialized. On eal_cleanup(), the inverse does not happen, meaning any
allocated memory and other configuration will not be cleaned up
appropriately on exit.

Currently, in order for device cleanup to take place, applications must
call the driver-relevant functions to ensure proper cleanup is done before
the application exits. Since initialization occurs for all devices on the
bus, not just the devices used by an application, it requires a)
application awareness of all bus devices that could have been probed on the
system, and b) code duplication across applications to ensure cleanup is
performed. An example of this is rte_eth_dev_close() which is commonly used
across the example applications.

This patch proposes adding bus cleanup to the eal_cleanup() to make EAL's
init/exit more symmetrical, ensuring all bus devices are cleaned up
appropriately without the application needing to be aware of all bus types
that may have been probed during initialization.

Contained in this patch are the changes required to perform cleanup for
devices on the PCI bus and VDEV bus during eal_cleanup(). Bus maintainers
are asked to add the relevant cleanup for their buses, since they have the
domain expertise.

Signed-off-by: Kevin Laatz 
Acked-by: Morten Brørup 
Reviewed-by: Bruce Richardson 

---
v8:
* rebase

v7:
* free rte_pci_device structs during cleanup
* free rte_vdev_device structs during cleanup

v6:
* fix units in doc API descriptions

v5:
* add doc updates for new APIs

v4:
* fix return value when scaling_freq_max is not set
* fix mismatching comments

v3:
* move setters from arg parse function to init
* consider 0 as 'not set' for scaling_freq_max
* other minor fixes

v2:
* add doc update for l3fwd-power
* order version.map additions alphabetically
---
 drivers/bus/pci/pci_common.c| 28 
 drivers/bus/vdev/vdev.c | 27 +++
 lib/eal/common/eal_common_bus.c | 17 +
 lib/eal/common/eal_private.h| 10 ++
 lib/eal/freebsd/eal.c   |  1 +
 lib/eal/include/bus_driver.h| 13 +
 lib/eal/linux/eal.c |  1 +
 lib/eal/windows/eal.c   |  1 +
 8 files changed, 98 insertions(+)

diff --git a/drivers/bus/pci/pci_common.c b/drivers/bus/pci/pci_common.c
index 5ea72bcf23..fb754e0e0a 100644
--- a/drivers/bus/pci/pci_common.c
+++ b/drivers/bus/pci/pci_common.c
@@ -25,6 +25,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "private.h"
 
@@ -439,6 +440,32 @@ pci_probe(void)
return (probed && probed == failed) ? -1 : 0;
 }
 
+static int
+pci_cleanup(void)
+{
+   struct rte_pci_device *dev, *tmp_dev;
+   int error = 0;
+
+   RTE_TAILQ_FOREACH_SAFE(dev, &rte_pci_bus.device_list, next, tmp_dev) {
+   struct rte_pci_driver *drv = dev->driver;
+   int ret = 0;
+
+   if (drv == NULL || drv->remove == NULL)
+   continue;
+
+   ret = drv->remove(dev);
+   if (ret < 0) {
+   rte_errno = errno;
+   error = -1;
+   }
+   dev->driver = NULL;
+   dev->device.driver = NULL;
+   free(dev);
+   }
+
+   return error;
+}
+
 /* dump one device */
 static int
 pci_dump_one_device(FILE *f, struct rte_pci_device *dev)
@@ -856,6 +883,7 @@ struct rte_pci_bus rte_pci_bus = {
.bus = {
.scan = rte_pci_scan,
.probe = pci_probe,
+   .cleanup = pci_cleanup,
.find_device = pci_find_device,
.plug = pci_plug,
.unplug = pci_unplug,
diff --git a/drivers/bus/vdev/vdev.c b/drivers/bus/vdev/vdev.c
index b176b658fc..f5b43f1930 100644
--- a/drivers/bus/vdev/vdev.c
+++ b/drivers/bus/vdev/vdev.c
@@ -567,6 +567,32 @@ vdev_probe(void)
return ret;
 }
 
+static int
+vdev_cleanup(void)
+{
+   struct rte_vdev_device *dev, *tmp_dev;
+   int error = 0;
+
+   RTE_TAILQ_FOREACH_SAFE(dev, &vdev_device_list, next, tmp_dev) {
+   const struct rte_vdev_driver *drv;
+   int ret = 0;
+
+   drv = container_of(dev->device.driver, const struct rte_vdev_driver, driver);
+
+   if (drv == NULL || drv->remove == NULL)
+   continue;
+
+   ret = drv->remove(dev);
+   if (ret < 0)
+   error = -1;
+
+   dev->device.driver = NULL;
+   free(dev);
+   }
+
+   return error;
+}
+
 struct rte_device *
 rte_vdev_find_device(const struct rte_device *start, rte_dev_cmp_t cmp,
 const void *data)
@@ -625,6 +651,7 @@ vdev_get_iommu_class(void)
 static struct rte_bus rte_vdev_bus = {
.scan = vdev_scan,
.probe = vdev_probe,
+   .cleanup = vdev_cleanup,
.find_device = rte_vdev_find_device,
.plug =

Re: [PATCH v7 1/4] eal: add lcore poll busyness telemetry

2022-10-04 Thread Mattias Rönnblom
On 2022-10-04 13:57, Bruce Richardson wrote:
> On Tue, Oct 04, 2022 at 11:15:19AM +0200, Morten Brørup wrote:
>>> From: Mattias Rönnblom [mailto:hof...@lysator.liu.se]
>>> Sent: Monday, 3 October 2022 22.02
>>
>> [...]
>>
>>> The functionality provided is very useful, and the implementation is
>>> clever in the way it doesn't require any application modifications.
>>> But,
>>> a clever, useful brittle hack is still a brittle hack.
>>>
> 
> I think that may be a little harsh here. After all, this is a feature which
> is build-time disabled and runtime disabled by default, so like many other
> components it's designed for use when it makes sense to do so.
> 

So you don't think it's a hack? The driver level and the level of basic 
data structures (e.g., the ring) is the appropriate level to classify 
cycles into useful and not useful? And you don't think all the shaky 
assumptions make it brittle?

Runtime configurable or not doesn't make a difference in this regard, in 
my opinion. On the source code level, this code is there, and making it 
compile-time conditional just makes matters worse.

Had this feature been limited to a small library, it would made a 
difference, but it's smeared across a wide range of APIs, and this list 
is not yet complete. Anything that can produce items of work needs to be 
adapted.

That said, it's not obvious how this should be done. The higher-layer 
constructs where this should really be done aren't there in DPDK, at 
least not yet.

Have you considered the option of instrumenting rte_pause()? It's the 
closest DPDK has to the (now largely extinct) idle loop in an OS kernel. 
It too would be a hack, but maybe a less intrusive one.

> Furthermore, I'd just like to point out that the authors, when doing the
> patches, have left in the hooks so that even apps, for which the "for-free"
> scheme doesn't work, can still leverage the infrastructure to have the app
> itself report the busy/free metrics.
> 

If this is done properly, in a way that the data can reasonably be 
trusted and it can be enabled at runtime without much of a performance 
implication, tracking lcore load could be much more useful than just 
best-effort telemetry.

Why is it so important not to require changes to the application? The 
changes are likely trivial, not unlike those I've submitted for the 
equivalent bookkeeping for DPDK services.

>>> What if there was instead a busyness module, where the application
>>> would
>>> explicitly report what it was up to. The new library would hook up to
>>> telemetry just like this patchset does, plus provide an explicit API to
>>> retrieve lcore thread load.
>>>
>>> The service cores framework (fancy name for rte_service.c) could also
>>> call the lcore load tracking module, provided all services properly
>>> reported back on whether or not they were doing anything useful with
>>> the
>>> cycles they just spent.
>>>
>>> The metrics of such a load tracking module could potentially be used by
>>> other modules in DPDK, or by the application. It could potentially be
>>> used for dynamic load balancing of service core services, or for power
>>> management (e.g, DVFS), or for a potential future deferred-work type
>>> mechanism more sophisticated than current rte_service, or some green
>>> threads/coroutines/fiber thingy. The DSW event device could also use it
>>> to replace its current internal load estimation scheme.
>>
>> [...]
>>
>> I agree 100 % with everything Mattias wrote above, and I would like to voice 
>> my opinion too.
>>
>> This patch is full of preconditions and assumptions. Its only true advantage 
>> (vs. a generic load tracking library) is that it doesn't require any 
>> application modifications, and thus can be deployed with zero effort.
>>
>> In my opinion, it would be much better with a well-designed generic load 
>> tracking library, to be called from the application, so it gets correct 
>> information about what the lcores spend their cycles doing. And as Mattias 
>> mentions: With the appropriate API for consumption of the collected data, it 
>> could also provide actionable statistics for use by the application itself, 
>> not just telemetry. ("Actionable statistics": Statistics that is directly 
>> usable for decision making.)
>>
>> There is also the aspect of time-to-benefit: This patch immediately provides 
>> benefits (to the users of the DPDK applications that meet the 
>> preconditions/assumptions of the patch), while a generic load tracking 
>> library will take years to get integrated into applications before it 
>> provides benefits (to the users of the DPDK applications that use the new 
>> library).
>>
>> So, we should ask ourselves: Do we want an application-specific solution 
>> with a short time-to-benefit, or a generic solution with a long 
>> time-to-benefit? (I use the term "application specific" because not all 
>> applications can be tweaked to provide meaningful data with this patch. You 
>> might also label a generic library "appl

Re: [PATCH] app/testpmd: fix vlan offload of rxq

2022-10-04 Thread Singh, Aman Deep



On 9/30/2022 9:15 PM, Mingjin Ye wrote:

After setting "vlan offload" in testpmd, the Rx queue configuration is
not updated accordingly.

Therefore, this patch reconfigures the Rx queues after the
"vlan offload" command is executed.

Fixes: a47aa8b97afe ("app/testpmd: add vlan offload support")
Cc: sta...@dpdk.org

Signed-off-by: Mingjin Ye 
Acked-by: Aman Singh
---
  app/test-pmd/cmdline.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index b4fe9dfb17..066a482fb5 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -4076,6 +4076,7 @@ cmd_vlan_offload_parsed(void *parsed_result,
else
vlan_extend_set(port_id, on);
  
+	cmd_reconfig_device_queue(port_id, 1, 1);

return;
  }
  




Re: [PATCH] net/tap: fix the overflow of the network interface index.

2022-10-04 Thread Andrew Rybchenko

On 7/21/22 18:19, Stephen Hemminger wrote:

On Thu, 21 Jul 2022 11:13:01 +
Alex Kiselev  wrote:


On Linux and most other systems, the network interface index is a 32-bit
integer. Indexes that overflow a 16-bit integer are frequently seen when
running inside a Docker container.

Signed-off-by: Alex Kiselev 


Looks good. The Linux API is inconsistent in its use of signed vs unsigned
int for the ifindex, but negative values are never used/returned.
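A minimal sketch of the truncation being fixed; `store_ifindex_in_u16()` is a hypothetical helper for illustration, not code from the tap PMD:

```c
#include <assert.h>
#include <stdint.h>

/*
 * A 32-bit interface index (as returned by if_nametoindex()) silently
 * loses its high bits when stored in a 16-bit field. Indexes above
 * 65535 are common inside containers that create many veth devices.
 */
uint32_t store_ifindex_in_u16(uint32_t ifindex)
{
	uint16_t truncated = (uint16_t)ifindex; /* 16-bit storage drops high bits */

	return truncated;
}
```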

Acked-by: Stephen Hemminger 


Fixes: 7c25284e30c2 ("net/tap: add netlink back-end for flow API")
Fixes: 2bc06869cd94 ("net/tap: add remote netdevice traffic capture")
Cc: sta...@dpdk.org

Applied to dpdk-next-net/main, thanks.


RE: [PATCH v3] mempool: fix get objects from mempool with cache

2022-10-04 Thread Morten Brørup
> From: Andrew Rybchenko [mailto:andrew.rybche...@oktetlabs.ru]
> Sent: Tuesday, 4 October 2022 14.54
> To: Olivier Matz
> Cc: dev@dpdk.org; Morten Brørup; Beilei Xing; Bruce Richardson; Jerin
> Jacob Kollanukkaran
> Subject: [PATCH v3] mempool: fix get objects from mempool with cache
> 
> From: Morten Brørup 
> 
> A flush threshold for the mempool cache was introduced in DPDK version
> 1.3, but rte_mempool_do_generic_get() was not completely updated back
> then, and some inefficiencies were introduced.
> 
> Fix the following in rte_mempool_do_generic_get():
> 
> 1. The code that initially screens the cache request was not updated
> with the change in DPDK version 1.3.
> The initial screening compared the request length to the cache size,
> which was correct before, but became irrelevant with the introduction
> of
> the flush threshold. E.g. the cache can hold up to flushthresh objects,
> which is more than its size, so some requests were not served from the
> cache, even though they could be.
> The initial screening has now been corrected to match the initial
> screening in rte_mempool_do_generic_put(), which verifies that a cache
> is present, and that the length of the request does not overflow the
> memory allocated for the cache.
> 
> This bug caused a major performance degradation in scenarios where the
> application burst length is the same as the cache size. In such cases,
> the objects were not ever fetched from the mempool cache, regardless if
> they could have been.
> This scenario occurs e.g. if an application has configured a mempool
> with a size matching the application's burst size.
> 
> 2. The function is a helper for rte_mempool_generic_get(), so it must
> behave according to the description of that function.
> Specifically, objects must first be returned from the cache,
> subsequently from the ring.
> After the change in DPDK version 1.3, this was not the behavior when
> the request was partially satisfied from the cache; instead, the
> objects
> from the ring were returned ahead of the objects from the cache.
> This bug degraded application performance on CPUs with a small L1
> cache,
> which benefit from having the hot objects first in the returned array.
> (This is probably also the reason why the function returns the objects
> in reverse order, which it still does.)
> Now, all code paths first return objects from the cache, subsequently
> from the ring.
> 
> The function was not behaving as described (by the function using it)
> and expected by applications using it. This in itself is also a bug.
> 
> 3. If the cache could not be backfilled, the function would attempt
> to get all the requested objects from the ring (instead of only the
> number of requested objects minus the objects available in the cache),
> and the function would fail if that failed.
> Now, the first part of the request is always satisfied from the cache,
> and if the subsequent backfilling of the cache from the ring fails,
> only
> the remaining requested objects are retrieved from the ring.
> 
> The function would fail despite there being enough objects in the cache
> plus the common pool.
> 
> 4. The code flow for satisfying the request from the cache was slightly
> inefficient:
> The likely code path where the objects are simply served from the cache
> was treated as unlikely. Now it is treated as likely.
> 
> Signed-off-by: Morten Brørup 
> Signed-off-by: Andrew Rybchenko 
> ---
> v3 changes (Andrew Rybchenko)
>  - Always get first objects from the cache even if request is bigger
>than cache size. Remove one corresponding condition from the path
>when request is fully served from cache.
>  - Simplify code to avoid duplication:
> - Get objects directly from backend in single place only.
> - Share code which gets from the cache first regardless if
>   everything is obtained from the cache or just the first part.
>  - Rollback cache length in unlikely failure branch to avoid cache
>vs NULL check in success branch.
> 
> v2 changes
> - Do not modify description of return value. This belongs in a separate
> doc fix.
> - Elaborate even more on which bugs the modifications fix.
> 
>  lib/mempool/rte_mempool.h | 74 +--
>  1 file changed, 48 insertions(+), 26 deletions(-)

Thank you, Andrew.

I haven't compared the resulting assembler output (regarding performance), but 
I have carefully reviewed the resulting v3 source code for potential bugs in 
all code paths and for performance, and think it looks good.

The RTE_MIN() macro looks like it prefers the first parameter, so static branch 
prediction for len=RTE_MIN(remaining, cache->len) should be correct.

You could consider adding likely() around (cache != NULL) near the bottom of 
the function, so it matches the unlikely(cache == NULL) at the top of the 
function; mainly for symmetry in the source code, as I expect it to be the 
compiler default anyway.

Also, you could add "remaining" to the comment:
/* Get th

Re: [PATCH] net/tap: add persist option

2022-10-04 Thread Andrew Rybchenko

On 8/9/22 22:34, Stephen Hemminger wrote:

The TAP device only lasts as long as the DPDK application that opened
it is running. This behavior is bad if the DPDK application needs
to be updated transparently without disturbing other services
using the tap device.

This patch adds a persist feature to the TAP device. If this flag
is set, the kernel network device remains even after the application
has exited.

Signed-off-by: Stephen Hemminger 


Reviewed-by: Andrew Rybchenko 

Applied to dpdk-next-net/main, thanks.



Re: [PATCH v2 1/1] app/testpmd: add command line argument 'nic-to-pmd-rx-metadata'

2022-10-04 Thread Andrew Rybchenko

On 9/1/22 11:03, Singh, Aman Deep wrote:



On 8/2/2022 11:21 PM, Hanumanth Pothula wrote:

Presently, Rx metadata is sent to the PMD by default, leading
to a performance drop, as processing it in the Rx path
takes extra cycles.

Hence, introducing command line argument, 'nic-to-pmd-rx-metadata'
to control passing rx metadata to PMD. By default it’s disabled.

Signed-off-by: Hanumanth Pothula 


Acked-by: Aman Singh 


v2:
- taken care of alignment issues
- renamed command line argument from rx-metadata to 
nic-to-pmd-rx-metadata

- renamed variable name from rx-metadata to nic_to_pmd_rx_metadata
---






Please, update doc/guides/testpmd_app_ug/testpmd_funcs.rst to
document the new command-line argument.

Also avoid argument name in the summary. It must be human-
readable.



Re: [PATCH] memif: memif driver does not crashes when there's different N of TX and RX queues

2022-10-04 Thread Andrew Rybchenko

On 8/8/22 13:39, Joyce Kong wrote:

Hi Huzaifa,

This patch looks good to me.
And would you please help review my memif patches?
https://patches.dpdk.org/project/dpdk/cover/20220701102815.1444223-1-joyce.k...@arm.com/

Thanks,
Joyce


-Original Message-
From: huzaifa.rahman 
Sent: Tuesday, July 26, 2022 6:16 PM
To: jgraj...@cisco.com
Cc: dev@dpdk.org; huzaifa.rahman 
Subject: [PATCH] memif: memif driver does not crashes when there's
different N of TX and RX queues

net/memif: fix memif crash with different Tx Rx queues



Bugzilla ID: 734

There's a bug in the memif_stats_get() function due to confusion between C2S
(client->server) and S2C (server->client) rings, causing a crash if there's a
different number of Rx and Tx queues.

This is fixed by selecting the correct rings for Rx and Tx, i.e. for Rx, S2C rings
are selected and for Tx, C2S rings are selected.


Fixes: 09c7e63a71f9 ("net/memif: introduce memory interface PMD")
Cc: sta...@dpdk.org


Signed-off-by: huzaifa.rahman 

Reviewed-by: Joyce Kong 


Fixed above on applying.

Applied to dpdk-next-net/main, thanks.




[PATCH v2] drivers/bus: set device NUMA node to unknown by default

2022-10-04 Thread Olivier Matz
The dev->device.numa_node field is set by each bus driver for
every device it manages to indicate on which NUMA node this device lies.

When this information is unknown, the assigned value is not consistent
across the bus drivers.

Set the default value to SOCKET_ID_ANY (-1) by all bus drivers
when the NUMA information is unavailable. This change impacts
rte_eth_dev_socket_id() in the same manner.

Signed-off-by: Olivier Matz 
---

v2
* use SOCKET_ID_ANY instead of -1 in drivers/dma/idxd (David)
* document the behavior change of rte_eth_dev_socket_id()
* fix few examples where rte_eth_dev_socket_id() was expected to
  return 0 on unknown socket

 doc/guides/rel_notes/deprecation.rst |  7 ---
 doc/guides/rel_notes/release_22_11.rst   |  6 ++
 drivers/bus/auxiliary/auxiliary_common.c |  8 ++--
 drivers/bus/auxiliary/linux/auxiliary.c  | 13 +
 drivers/bus/dpaa/dpaa_bus.c  |  1 +
 drivers/bus/fslmc/fslmc_bus.c|  1 +
 drivers/bus/pci/bsd/pci.c|  2 +-
 drivers/bus/pci/linux/pci.c  | 16 ++--
 drivers/bus/pci/pci_common.c |  8 ++--
 drivers/bus/pci/windows/pci.c|  1 -
 drivers/bus/vmbus/linux/vmbus_bus.c  |  1 -
 drivers/bus/vmbus/vmbus_common.c |  8 ++--
 drivers/dma/idxd/idxd_bus.c  |  3 ++-
 examples/distributor/main.c  |  4 ++--
 examples/flow_classify/flow_classify.c   |  2 ++
 examples/rxtx_callbacks/main.c   |  2 +-
 lib/ethdev/rte_ethdev.h  |  4 ++--
 17 files changed, 35 insertions(+), 52 deletions(-)

diff --git a/doc/guides/rel_notes/deprecation.rst 
b/doc/guides/rel_notes/deprecation.rst
index a991fa14de..2a1a6ff899 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -33,13 +33,6 @@ Deprecation Notices
   ``__atomic_thread_fence`` must be used for patches that need to be merged in
   20.08 onwards. This change will not introduce any performance degradation.
 
-* bus: The ``dev->device.numa_node`` field is set by each bus driver for
-  every device it manages to indicate on which NUMA node this device lies.
-  When this information is unknown, the assigned value is not consistent
-  across the bus drivers.
-  In DPDK 22.11, the default value will be set to -1 by all bus drivers
-  when the NUMA information is unavailable.
-
 * kni: The KNI kernel module and library are not recommended for use by new
   applications - other technologies such as virtio-user are recommended 
instead.
   Following the DPDK technical board
diff --git a/doc/guides/rel_notes/release_22_11.rst 
b/doc/guides/rel_notes/release_22_11.rst
index 53fe21453c..d52f823694 100644
--- a/doc/guides/rel_notes/release_22_11.rst
+++ b/doc/guides/rel_notes/release_22_11.rst
@@ -317,6 +317,12 @@ ABI Changes
 * eventdev: Added ``weight`` and ``affinity`` fields
   to ``rte_event_queue_conf`` structure.
 
+* bus: Changed the device numa node to -1 when NUMA information is unavailable.
+  The ``dev->device.numa_node`` field is set by each bus driver for
+  every device it manages to indicate on which NUMA node this device lies.
+  When this information is unknown, the assigned value was not consistent
+  across the bus drivers. This similarly impacts ``rte_eth_dev_socket_id()``.
+
 
 Known Issues
 
diff --git a/drivers/bus/auxiliary/auxiliary_common.c 
b/drivers/bus/auxiliary/auxiliary_common.c
index 259ff152c4..6bb1fe7c96 100644
--- a/drivers/bus/auxiliary/auxiliary_common.c
+++ b/drivers/bus/auxiliary/auxiliary_common.c
@@ -105,12 +105,8 @@ rte_auxiliary_probe_one_driver(struct rte_auxiliary_driver 
*drv,
return -1;
}
 
-   if (dev->device.numa_node < 0) {
-   if (rte_socket_count() > 1)
-   AUXILIARY_LOG(INFO, "Device %s is not NUMA-aware, 
defaulting socket to 0",
-   dev->name);
-   dev->device.numa_node = 0;
-   }
+   if (dev->device.numa_node < 0 && rte_socket_count() > 1)
+   RTE_LOG(INFO, EAL, "Device %s is not NUMA-aware\n", dev->name);
 
iova_mode = rte_eal_iova_mode();
if ((drv->drv_flags & RTE_AUXILIARY_DRV_NEED_IOVA_AS_VA) > 0 &&
diff --git a/drivers/bus/auxiliary/linux/auxiliary.c 
b/drivers/bus/auxiliary/linux/auxiliary.c
index d4c564cd78..02fc9285dc 100644
--- a/drivers/bus/auxiliary/linux/auxiliary.c
+++ b/drivers/bus/auxiliary/linux/auxiliary.c
@@ -40,14 +40,11 @@ auxiliary_scan_one(const char *dirname, const char *name)
/* Get NUMA node, default to 0 if not present */
snprintf(filename, sizeof(filename), "%s/%s/numa_node",
 dirname, name);
-   if (access(filename, F_OK) != -1) {
-   if (eal_parse_sysfs_value(filename, &tmp) == 0)
-   dev->device.numa_node = tmp;
-   else
-   dev->device.numa_node = -1;
-   } else {
-   dev->device.numa_n

Re: [PATCH v2] net/axgbe: support segmented Tx

2022-10-04 Thread Andrew Rybchenko

On 9/9/22 12:31, Namburu, Chandu-babu wrote:

-Original Message-
From: Modali, Bhagyada 
Sent: Thursday, September 8, 2022 11:45 PM
To: Namburu, Chandu-babu ; Yigit, Ferruh 
Cc: dev@dpdk.org; sta...@dpdk.org; Modali, Bhagyada 
Subject: [PATCH v2] net/axgbe: support segmented Tx

Enable segmented tx support and add jumbo packet transmit capability

Signed-off-by: Bhagyada Modali 


Acked-by: Chandubabu Namburu 

Applied to dpdk-next-net/main, thanks.


RE: [PATCH v7 1/4] ethdev: introduce protocol header API

2022-10-04 Thread Wang, YuanX
Hi Andrew,

> -Original Message-
> From: Andrew Rybchenko 
> Sent: Tuesday, October 4, 2022 3:53 PM
> To: Wang, YuanX ; dev@dpdk.org; Thomas
> Monjalon ; Ferruh Yigit ;
> Ray Kinsella 
> Cc: ferruh.yi...@xilinx.com; Li, Xiaoyun ; Singh, Aman
> Deep ; Zhang, Yuying
> ; Zhang, Qi Z ; Yang,
> Qiming ; jerinjac...@gmail.com;
> viachesl...@nvidia.com; step...@networkplumber.org; Ding, Xuan
> ; hpoth...@marvell.com; Tang, Yaqi
> ; Wenxuan Wu 
> Subject: Re: [PATCH v7 1/4] ethdev: introduce protocol header API
> 
> On 10/4/22 05:21, Wang, YuanX wrote:
> > Hi Andrew,
> >
> >> -Original Message-
> >> From: Andrew Rybchenko 
> >> Sent: Monday, October 3, 2022 3:04 PM
> >> To: Wang, YuanX ; dev@dpdk.org; Thomas
> Monjalon
> >> ; Ferruh Yigit ; Ray
> >> Kinsella 
> >> Cc: ferruh.yi...@xilinx.com; Li, Xiaoyun ;
> >> Singh, Aman Deep ; Zhang, Yuying
> >> ; Zhang, Qi Z ; Yang,
> >> Qiming ; jerinjac...@gmail.com;
> >> viachesl...@nvidia.com; step...@networkplumber.org; Ding, Xuan
> >> ; hpoth...@marvell.com; Tang, Yaqi
> >> ; Wenxuan Wu 
> >> Subject: Re: [PATCH v7 1/4] ethdev: introduce protocol header API
> >>
> >> On 10/2/22 00:05, Yuan Wang wrote:
> >>> Add a new ethdev API to retrieve supported protocol headers of a
> >>> PMD, which helps to configure protocol header based buffer split.
> >>>
> >>> Signed-off-by: Yuan Wang 
> >>> Signed-off-by: Xuan Ding 
> >>> Signed-off-by: Wenxuan Wu 
> >>> Reviewed-by: Andrew Rybchenko 
> >>> ---
> >>>doc/guides/rel_notes/release_22_11.rst |  5 
> >>>lib/ethdev/ethdev_driver.h | 15 
> >>>lib/ethdev/rte_ethdev.c| 33 ++
> >>>lib/ethdev/rte_ethdev.h| 30 +++
> >>>lib/ethdev/version.map |  3 +++
> >>>5 files changed, 86 insertions(+)
> >>>
> >>> diff --git a/doc/guides/rel_notes/release_22_11.rst
> >>> b/doc/guides/rel_notes/release_22_11.rst
> >>> index 0231959874..6a7474a3d6 100644
> >>> --- a/doc/guides/rel_notes/release_22_11.rst
> >>> +++ b/doc/guides/rel_notes/release_22_11.rst
> >>> @@ -96,6 +96,11 @@ New Features
> >>>  * Added ``rte_event_eth_tx_adapter_queue_stop`` to stop the Tx
> >> Adapter
> >>>from enqueueing any packets to the Tx queue.
> >>>
> >>> +* **Added new ethdev API for PMD to get buffer split supported
> >>> +protocol types.**
> >>> +
> >>> +  * Added ``rte_eth_buffer_split_get_supported_hdr_ptypes()``, to
> >>> + get
> >> supported
> >>> +header protocols of a PMD to split.
> >>> +
> >>
> >> ethdev features should be grouped together in release notes.
> >> I'll fix it on applying if a new version is not required.
> >
> > We will send a new version. For the doc changes, I don't understand your
> point very well.
> > Since will be no new changes to the code within this patch, could you help
> to adjust the doc?
> > Thanks very much.
> 
> Please, read a comment just after 'New Features' section start.
> Hopefully it will make my note clearer.
> Anyway, don't worry about it a lot. I can easily fix it on applying.

Is it written like the following? If it is not correct, please help to fix it.

* **Added protocol header based buffer split.**

  * Added ``rte_eth_buffer_split_get_supported_hdr_ptypes()``, to get supported
header protocols of a PMD to split.
  * Ethdev: The ``reserved`` field in the ``rte_eth_rxseg_split`` structure is
replaced with ``proto_hdr`` to support protocol header based buffer split.
User can choose length or protocol header to configure buffer split
according to NIC's capability.

Thanks,
Yuan

 [snip]



RE: [PATCH v7 2/4] ethdev: introduce protocol hdr based buffer split

2022-10-04 Thread Wang, YuanX
Hi Andrew,

> -Original Message-
> From: Andrew Rybchenko 
> Sent: Tuesday, October 4, 2022 4:23 PM
> To: Wang, YuanX ; dev@dpdk.org; Thomas
> Monjalon ; Ferruh Yigit 
> Cc: ferruh.yi...@xilinx.com; m...@ashroe.eu; Li, Xiaoyun
> ; Singh, Aman Deep ;
> Zhang, Yuying ; Zhang, Qi Z
> ; Yang, Qiming ;
> jerinjac...@gmail.com; viachesl...@nvidia.com;
> step...@networkplumber.org; Ding, Xuan ;
> hpoth...@marvell.com; Tang, Yaqi 
> Subject: Re: [PATCH v7 2/4] ethdev: introduce protocol hdr based buffer split
> 
> On 10/4/22 05:48, Wang, YuanX wrote:
> > Hi Andrew,
> >
> >> -Original Message-
> >> On 10/2/22 00:05, Yuan Wang wrote:
> >>> +
> >>> + /* skip the payload */
> >>
> >> Sorry, it is confusing. What do you mean here?
> >
> > Because setting n proto_hdr will generate (n+1) segments. If we want to
> split the packet into n segments, we only need to check the first (n-1)
> proto_hdr.
> > For example, for ETH-IPV4-UDP-PAYLOAD, if we want to split after the UDP
> header, we only need to set and check the UDP header in the first segment.
> >
> > Maybe mask is not a good way, so we will use index to filter out the check
> of proto_hdr inside the last segment.
> 
> I see your point and understand the problem now.
> Thinking a bit more about it I realize that consistency check here should be
> more sophisticated.
> It should not allow:
>   - seg1 - length-based, seg2 - proto-based, seg3 - payload
>   - seg1 - proto-based, seg2 - length-based, seg3 - proto-based, seg4 - 
> payload
> I.e. no protocol-based split after length-based.
> But should allow:
>   - seg1 - proto-based, seg2 - length-based, seg3 - payload. I.e. length-based
> split after protocol-based.
> 
> Taking the last point above into account, proto_hdr in the last segment
> should be 0 like in length-based split (not RTE_PTYPE_ALL_MASK).

Just to confirm, do you mean that the payload as the last segment should be treated 
as a length-based split (proto_hdr == 0)?
If so, regarding the question 'check that dataroom in the last segment mempool is 
sufficient for up to MTU packet if Rx scatter is disabled': is it not necessary 
to compare the MTU size and mbuf_size? Because the check in the length-based 
split is sufficient. We will send v8 soon with the above thought; please help 
to check.

> 
> It is an interesting question how to request:
>   - seg1 - ETH, seg2 - IPv4, seg3 - UDP, seg4 - payload Should we really 
> repeat
> ETH in seg2->proto_hdr and
> seg3->proto_hdr header and IPv4 in seg3->proto_hdr again?
> I tend to say no since when packet comes to seg2 it already has no ETH
> header.
> 
> If so, how to handle configuration when ETH is repeat in seg2?
> For example,
>- seg1 ETH+IPv4+UDP
>- seg2 ETH+IPv6+UDP
>- seg2 0
> Should we deny it or should we define behaviour like.
> If a packet does not match segX proto_hdr, the segment is skipped and
> segX+1 considered.
> Of course, not all drivers/HW supports it. If so, such configuration should be
> just discarded by the driver itself.

Here is a question that needs to be clarified: whether the segments are sequential 
or independent. I prefer the former because it's more readable. Furthermore, it 
is consistent with length-based split, which also configures the lengths 
sequentially. In this case, the following situation does not exist:
- seg1 ETH+IPv4+UDP
- seg2 ETH+IPv6+UDP
- seg3 0

For the case of repeating ETH, such as seg1 - ETH, seg2 - IPv4, seg3 - UDP, 
seg4 - payload, as you suggested, we can omit ETH in the following segment. But 
IPV4-UDP and IPV6-UDP still need to be distinguished, following our previous 
discussion (the user wants to split at IPV4-UDP rather than IPV6-UDP although 
the driver supports both). In this case, seg1 - ETH, seg2 - IPv4, seg3 - UDP, seg4 
- payload,
we set proto_hdr with:
seg1 proto_hdr1=RTE_PTYPE_L2_ETHER
seg2 proto_hdr2=RTE_PTYPE_L3_IPV4
seg3 proto_hdr3=RTE_PTYPE_L3_IPV4 | RTE_PTYPE_L4_UDP

Thanks,
Yuan



Re: [PATCH] doc: relate bifurcated driver and flow isolated mode

2022-10-04 Thread Andrew Rybchenko

On 9/20/22 13:56, Ori Kam wrote:

Hi,


-Original Message-
From: Dariusz Sosnowski 
Sent: Tuesday, 20 September 2022 11:49

Hi Thomas,


-Original Message-
From: Thomas Monjalon 
Sent: Wednesday, September 14, 2022 23:30
To: dev@dpdk.org
Cc: Michael Savisko ; Slava Ovsiienko
; Matan Azrad ; Dariusz
Sosnowski ; Asaf Penso ;

Ori

Kam ; Ferruh Yigit ; Andrew
Rybchenko 
Subject: [PATCH] doc: relate bifurcated driver and flow isolated mode

External email: Use caution opening links or attachments


The relation between the isolated mode in ethdev flow API and bifurcated
driver behaviour was not clearly explained.

It is made clear in the how-to guide that isolated mode is required for flow
bifurcation to the kernel.
On the other side, the impact of the isolated mode on a bifurcated driver is
made more explicit.

Signed-off-by: Thomas Monjalon 
---
  doc/guides/howto/flow_bifurcation.rst | 3 ++-
  lib/ethdev/rte_flow.h | 4 
  2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/doc/guides/howto/flow_bifurcation.rst
b/doc/guides/howto/flow_bifurcation.rst
index 7ba66b9003..79cf4f1e64 100644
--- a/doc/guides/howto/flow_bifurcation.rst
+++ b/doc/guides/howto/flow_bifurcation.rst
@@ -55,7 +55,8 @@ The full device is already shared with the kernel driver.
  The DPDK application can setup some flow steering rules,  and let the rest

go

to the kernel stack.
  In order to define the filters strictly with flow rules, -the
:ref:`flow_isolated_mode` can be configured.
+the :ref:`flow_isolated_mode` must be configured, so there is no
+default rule routing traffic to userspace.

  There is no specific instructions to follow.
  The recommended reading is the :doc:`../prog_guide/rte_flow` guide.
diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h index
a79f1e7ef0..1bac3fd9ec 100644
--- a/lib/ethdev/rte_flow.h
+++ b/lib/ethdev/rte_flow.h
@@ -4254,6 +4254,10 @@ rte_flow_query(uint16_t port_id,
   *
   * Isolated mode guarantees that all ingress traffic comes from defined

flow

   * rules only (current and future).
+ * When enabled with a bifurcated driver,
+ * non-matched packets are routed to the kernel driver interface.
+ * When disabled (the default),
+ * there may be some default rules routing traffic to the DPDK port.
   *
   * Besides making ingress more deterministic, it allows PMDs to safely

reuse

   * resources otherwise assigned to handle the remaining traffic, such as
--
2.36.1


Looks good to me. Thank you.

Reviewed-by: Dariusz Sosnowski 

Best regards,
Dariusz Sosnowski


Acked-by: Ori Kam 
Best,
Ori


Applied to dpdk-next-net/main, thanks.



Re: [PATCH v2] net/ring: add monitor callback

2022-10-04 Thread Andrew Rybchenko

On 9/2/22 20:25, Herakliusz Lipiec wrote:

Currently the ring PMD does not support the ``rte_power_monitor`` API.
This patch adds support by adding a monitor callback that is called
whenever we enter a sleep state and need to check if it is time to wake
up.

Signed-off-by: Herakliusz Lipiec 
Acked-by: Bruce Richardson 


Applied to dpdk-next-net/main, thanks.




RE: [PATCH v2] mempool: fix get objects from mempool with cache

2022-10-04 Thread Morten Brørup
@Aaron, do you have any insights or comments to my curiosity below?

> From: Andrew Rybchenko [mailto:andrew.rybche...@oktetlabs.ru]
> Sent: Tuesday, 4 October 2022 14.58
> 
> Hi Morten,
> 
> In general I agree that the fix is required.
> In sent v3 I'm trying to make it a bit better from my point of
> view. See few notes below.

I stand by my review and acceptance of v3 - this message is not intended to change 
that! I'm just curious...

I wonder how accurate the automated performance tests ([v2], [v3]) are, and if 
they are comparable between February and October?

[v2]: http://mails.dpdk.org/archives/test-report/2022-February/256462.html
[v3]: http://mails.dpdk.org/archives/test-report/2022-October/311526.html


Ubuntu 20.04
Kernel: 4.15.0-generic
Compiler: gcc 7.4
NIC: Intel Corporation Ethernet Converged Network Adapter XL710-QDA2 4 Mbps
Target: x86_64-native-linuxapp-gcc
Fail/Total: 0/4

Detail performance results:
** V2 **:
+--+-+-++--+
| num_cpus | num_threads | txd/rxd | frame_size |  throughput difference from  |
|  | | ||   expected   |
+==+=+=++==+
| 1| 2   | 512 | 64 | 0.5% |
+--+-+-++--+
| 1| 2   | 2048| 64 | -1.5%|
+--+-+-++--+
| 1| 1   | 512 | 64 | 4.3% |
+--+-+-++--+
| 1| 1   | 2048| 64 | 10.9%|
+--+-+-++--+

** V3 **:
+--+-+-++--+
| num_cpus | num_threads | txd/rxd | frame_size |  throughput difference from  |
|  | | ||   expected   |
+==+=+=++==+
| 1| 2   | 512 | 64 | -0.7%|
+--+-+-++--+
| 1| 2   | 2048| 64 | -2.3%|
+--+-+-++--+
| 1| 1   | 512 | 64 | 0.5% |
+--+-+-++--+
| 1| 1   | 2048| 64 | 7.9% |
+--+-+-++--+



RE: [PATCH] dumpcap: fix list interfaces

2022-10-04 Thread Kaur, Arshdeep
Hi Stephen,

I tested the patch. "-D" option is now working properly.

But I am facing an issue in this.

Using "-D" provides me with the interfaces available. For me these are 
":18:01.0" and ":18:09.0":
./dpdk-dumpcap -D --file-prefix wls_1
FlexRAN SDK bblib_lte_ldpc_decoder version #DIRTY#
FlexRAN SDK bblib_lte_ldpc_encoder version #DIRTY#
FlexRAN SDK bblib_lte_LDPC_ratematch version #DIRTY#
FlexRAN SDK bblib_lte_rate_dematching_5gnr version #DIRTY#
FlexRAN SDK bblib_lte_turbo version #DIRTY#
FlexRAN SDK bblib_lte_crc version #DIRTY#
FlexRAN SDK bblib_lte_rate_matching version #DIRTY#
FlexRAN SDK bblib_common version #DIRTY#
FlexRAN SDK bblib_srs_fft_cestimate_5gnr version #DIRTY#
FlexRAN SDK bblib_mldts_process_5gnr version #DIRTY#
EAL: :18:01.1 cannot find TAILQ entry for PCI device!
EAL: Requested device :18:01.1 cannot be used
EAL: :18:09.1 cannot find TAILQ entry for PCI device!
EAL: Requested device :18:09.1 cannot be used
EAL: :18:11.0 cannot find TAILQ entry for PCI device!
EAL: Requested device :18:11.0 cannot be used
EAL: :18:11.1 cannot find TAILQ entry for PCI device!
EAL: Requested device :18:11.1 cannot be used
EAL: :18:19.0 cannot find TAILQ entry for PCI device!
EAL: Requested device :18:19.0 cannot be used
EAL: :18:19.1 cannot find TAILQ entry for PCI device!
EAL: Requested device :18:19.1 cannot be used
EAL: :af:01.0 cannot find TAILQ entry for PCI device!
EAL: Requested device :af:01.0 cannot be used
EAL: :af:01.1 cannot find TAILQ entry for PCI device!
EAL: Requested device :af:01.1 cannot be used
EAL: :af:09.0 cannot find TAILQ entry for PCI device!
EAL: Requested device :af:09.0 cannot be used
EAL: :af:09.1 cannot find TAILQ entry for PCI device!
EAL: Requested device :af:09.1 cannot be used
EAL: :af:11.0 cannot find TAILQ entry for PCI device!
EAL: Requested device :af:11.0 cannot be used
EAL: :af:11.1 cannot find TAILQ entry for PCI device!
EAL: Requested device :af:11.1 cannot be used
EAL: :af:19.0 cannot find TAILQ entry for PCI device!
EAL: Requested device :af:19.0 cannot be used
EAL: :af:19.1 cannot find TAILQ entry for PCI device!
EAL: Requested device :af:19.1 cannot be used
0. :18:01.0
1. :18:09.0

But when I use these same interfaces to capture, they are not available:
./dpdk-dumpcap -i :18:01.0 -c 500 -s 9600 -w capture1.pacp --file-prefix 
wls_1
FlexRAN SDK bblib_lte_ldpc_decoder version #DIRTY#
FlexRAN SDK bblib_lte_ldpc_encoder version #DIRTY#
FlexRAN SDK bblib_lte_LDPC_ratematch version #DIRTY#
FlexRAN SDK bblib_lte_rate_dematching_5gnr version #DIRTY#
FlexRAN SDK bblib_lte_turbo version #DIRTY#
FlexRAN SDK bblib_lte_crc version #DIRTY#
FlexRAN SDK bblib_lte_rate_matching version #DIRTY#
FlexRAN SDK bblib_common version #DIRTY#
FlexRAN SDK bblib_srs_fft_cestimate_5gnr version #DIRTY#
FlexRAN SDK bblib_mldts_process_5gnr version #DIRTY#
EAL: Error - exiting with code: 1
  Cause: Specified port_number ":18:01.0" is not a valid number

./dpdk-dumpcap -i :18:09.0 -c 500 -s 9600 -w capture2.pacp --file-prefix 
wls_1
FlexRAN SDK bblib_lte_ldpc_decoder version #DIRTY#
FlexRAN SDK bblib_lte_ldpc_encoder version #DIRTY#
FlexRAN SDK bblib_lte_LDPC_ratematch version #DIRTY#
FlexRAN SDK bblib_lte_rate_dematching_5gnr version #DIRTY#
FlexRAN SDK bblib_lte_turbo version #DIRTY#
FlexRAN SDK bblib_lte_crc version #DIRTY#
FlexRAN SDK bblib_lte_rate_matching version #DIRTY#
FlexRAN SDK bblib_common version #DIRTY#
FlexRAN SDK bblib_srs_fft_cestimate_5gnr version #DIRTY#
FlexRAN SDK bblib_mldts_process_5gnr version #DIRTY#
EAL: Error - exiting with code: 1
  Cause: Specified port_number ":18:09.0" is not a valid number

In my view, select_interface() has the same issue that dump_interfaces() had. 
So we need to add a flag for this in a similar way and handle select_interface() 
in main after parse_opts, dpdk_init and dump_interfaces.

I tested these changes and they work for me. But I am not sure how they will affect 
the rest of dumpcap. Please let me know your thoughts about it.

Thanks and regards,
Arshdeep Kaur

> -Original Message-
> From: Stephen Hemminger 
> Sent: Monday, September 26, 2022 5:04 AM
> To: dev@dpdk.org
> Cc: Stephen Hemminger ;
> konce...@gmail.com; Pattan, Reshma 
> Subject: [PATCH] dumpcap: fix list interfaces
> 
> The change to do argument processing before EAL init broke support for the
> list-interfaces option. Fix by setting a flag and doing list-interfaces later.
> 
> Fixes: a8dde09f97df ("app/dumpcap: allow help/version without primary
> process")
> Cc: konce...@gmail.com
> Signed-off-by: Stephen Hemminger 
> ---
>  app/dumpcap/main.c | 19 +--
>  1 file changed, 13 insertions(+), 6 deletions(-)
> 
> diff --git a/app/dumpcap/main.c b/app/dumpcap/main.c index
> a6041d4ff495..490a0f050bc8 100644
> --- a/app/dumpcap/main.c
> +++ b/app/dumpcap/main.c
> @@ -63,6 +63,8 @@ static unsigned int ring_size = 2048;  st

Re: [PATCH v8] eal: add bus cleanup to eal cleanup

2022-10-04 Thread David Marchand
On Tue, Oct 4, 2022 at 3:08 PM Kevin Laatz  wrote:
>
> During EAL init, all buses are probed and the devices found are
> initialized. On eal_cleanup(), the inverse does not happen, meaning any
> allocated memory and other configuration will not be cleaned up
> appropriately on exit.
>
> Currently, in order for device cleanup to take place, applications must
> call the driver-relevant functions to ensure proper cleanup is done before
> the application exits. Since initialization occurs for all devices on the
> bus, not just the devices used by an application, it requires a)
> application awareness of all bus devices that could have been probed on the
> system, and b) code duplication across applications to ensure cleanup is
> performed. An example of this is rte_eth_dev_close() which is commonly used
> across the example applications.
>
> This patch proposes adding bus cleanup to the eal_cleanup() to make EAL's
> init/exit more symmetrical, ensuring all bus devices are cleaned up
> appropriately without the application needing to be aware of all bus types
> that may have been probed during initialization.
>
> Contained in this patch are the changes required to perform cleanup for
> devices on the PCI bus and VDEV bus during eal_cleanup(). There would be an
> ask for bus maintainers to add the relevant cleanup for their buses since
> they have the domain expertise.
>
> Signed-off-by: Kevin Laatz 
> Acked-by: Morten Brørup 
> Reviewed-by: Bruce Richardson 
>

Thanks for the rebase.
Most of it lgtm, just one question/comment.

[snip]

> diff --git a/lib/eal/freebsd/eal.c b/lib/eal/freebsd/eal.c
> index a1bb5363b1..b9a7792c19 100644
> --- a/lib/eal/freebsd/eal.c
> +++ b/lib/eal/freebsd/eal.c
> @@ -896,6 +896,7 @@ rte_eal_cleanup(void)
> rte_mp_channel_cleanup();
> rte_trace_save();
> eal_trace_fini();
> +   eal_bus_cleanup();
> /* after this point, any DPDK pointers will become dangling */
> rte_eal_memory_detach();
> rte_eal_alarm_cleanup();

Do you have a reason to put the bus cleanup after the traces are
stored and the trace subsystem is uninitialised?

With the current location for eal_bus_cleanup(), it means that this
function (and any code it calls) is not traceable.
To be fair, I don't think we have any trace points in this code at the
moment, but we might have in the future.


-- 
David Marchand



Re: [PATCH v3 0/5] Add support for live migration and cleanup MCDI headers

2022-10-04 Thread Andrew Rybchenko

On 7/14/22 16:47, abhimanyu.sa...@xilinx.com wrote:

From: Abhimanyu Saini 

In SW assisted live migration, the vDPA driver will stop all virtqueues
and set up SW vrings to relay the communication between the virtio
driver and the vDPA device using an event-driven relay thread.
This allows the vDPA driver to help with guest dirty page logging for
live migration.

Abhimanyu Saini (5):
   common/sfc_efx/base: remove VQ index check during VQ start
   common/sfc_efx/base: update MCDI headers
   common/sfc_efx/base: use the updated definitions of cidx/pidx
   vdpa/sfc: enable support for multi-queue
   vdpa/sfc: Add support for SW assisted live migration

  drivers/common/sfc_efx/base/efx.h   |  12 +-
  drivers/common/sfc_efx/base/efx_regs_mcdi.h |  36 +-
  drivers/common/sfc_efx/base/rhead_virtio.c  |  28 +-
  drivers/vdpa/sfc/sfc_vdpa.h |   1 +
  drivers/vdpa/sfc/sfc_vdpa_hw.c  |   2 +
  drivers/vdpa/sfc/sfc_vdpa_ops.c | 345 ++--
  drivers/vdpa/sfc/sfc_vdpa_ops.h |  17 +-
  7 files changed, 378 insertions(+), 63 deletions(-)



Patch 4/5 still requires the review notes to be processed.

Applied without the 4/5 patch to dpdk-next-net/main, thanks.



Re: [PATCH v8] eal: add bus cleanup to eal cleanup

2022-10-04 Thread Kevin Laatz

On 04/10/2022 16:28, David Marchand wrote:

On Tue, Oct 4, 2022 at 3:08 PM Kevin Laatz  wrote:

During EAL init, all buses are probed and the devices found are
initialized. On eal_cleanup(), the inverse does not happen, meaning any
allocated memory and other configuration will not be cleaned up
appropriately on exit.

Currently, in order for device cleanup to take place, applications must
call the driver-relevant functions to ensure proper cleanup is done before
the application exits. Since initialization occurs for all devices on the
bus, not just the devices used by an application, it requires a)
application awareness of all bus devices that could have been probed on the
system, and b) code duplication across applications to ensure cleanup is
performed. An example of this is rte_eth_dev_close() which is commonly used
across the example applications.

This patch proposes adding bus cleanup to the eal_cleanup() to make EAL's
init/exit more symmetrical, ensuring all bus devices are cleaned up
appropriately without the application needing to be aware of all bus types
that may have been probed during initialization.

Contained in this patch are the changes required to perform cleanup for
devices on the PCI bus and VDEV bus during eal_cleanup(). There would be an
ask for bus maintainers to add the relevant cleanup for their buses since
they have the domain expertise.

Signed-off-by: Kevin Laatz 
Acked-by: Morten Brørup 
Reviewed-by: Bruce Richardson 


Thanks for the rebase.
Most of it lgtm, just one question/comment.

[snip]


diff --git a/lib/eal/freebsd/eal.c b/lib/eal/freebsd/eal.c
index a1bb5363b1..b9a7792c19 100644
--- a/lib/eal/freebsd/eal.c
+++ b/lib/eal/freebsd/eal.c
@@ -896,6 +896,7 @@ rte_eal_cleanup(void)
 rte_mp_channel_cleanup();
 rte_trace_save();
 eal_trace_fini();
+   eal_bus_cleanup();
 /* after this point, any DPDK pointers will become dangling */
 rte_eal_memory_detach();
 rte_eal_alarm_cleanup();

Do you have a reason to put the bus cleanup after the traces are
stored and the trace subsystem is uninitialised?

With the current location for eal_bus_cleanup(), it means that this
function (and any code it calls) is not traceable.
To be fair, I don't think we have any trace points in this code at the
moment, but we might have in the future.


No reason for doing it after trace un-init. I'll move and resend.

Thanks!




[PATCH v8 3/6] net/memif: set memfd syscall ID on LoongArch

2022-10-04 Thread Min Zhou
Define the missing __NR_memfd_create syscall id to enable the memif
PMD on LoongArch.

Signed-off-by: Min Zhou 
---
 drivers/net/memif/meson.build | 6 --
 drivers/net/memif/rte_eth_memif.h | 2 ++
 2 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/net/memif/meson.build b/drivers/net/memif/meson.build
index 30c0fbc798..680bc8631c 100644
--- a/drivers/net/memif/meson.build
+++ b/drivers/net/memif/meson.build
@@ -1,12 +1,6 @@
 # SPDX-License-Identifier: BSD-3-Clause
 # Copyright 2018-2019 Cisco Systems, Inc.  All rights reserved.
 
-if arch_subdir == 'loongarch'
-build = false
-reason = 'not supported on LoongArch'
-subdir_done()
-endif
-
 if not is_linux
 build = false
 reason = 'only supported on Linux'
diff --git a/drivers/net/memif/rte_eth_memif.h 
b/drivers/net/memif/rte_eth_memif.h
index 81e7dceae0..eb692aee68 100644
--- a/drivers/net/memif/rte_eth_memif.h
+++ b/drivers/net/memif/rte_eth_memif.h
@@ -182,6 +182,8 @@ const char *memif_version(void);
 #define __NR_memfd_create 356
 #elif defined __riscv
 #define __NR_memfd_create 279
+#elif defined __loongarch__
+#define __NR_memfd_create 279
 #else
 #error "__NR_memfd_create unknown for this architecture"
 #endif
-- 
2.32.1 (Apple Git-133)



[PATCH v8 2/6] net/ixgbe: add vector stubs for LoongArch

2022-10-04 Thread Min Zhou
Similar to RISC-V, the current version for LoongArch does not support
vector operations. Re-use the vector processing stubs defined for PPC
in the ixgbe PMD for LoongArch as well. This enables ixgbe PMD usage
in scalar mode on LoongArch.

The ixgbe PMD was validated with an Intel X520-DA2 NIC using the
test-pmd application and the l2fwd and l3fwd examples.

Signed-off-by: Min Zhou 
---
 doc/guides/nics/features/ixgbe.ini | 1 +
 drivers/net/ixgbe/ixgbe_rxtx.c | 7 +--
 drivers/net/ixgbe/meson.build  | 6 --
 3 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/doc/guides/nics/features/ixgbe.ini 
b/doc/guides/nics/features/ixgbe.ini
index ab759a6fb3..97c0a6af9e 100644
--- a/doc/guides/nics/features/ixgbe.ini
+++ b/doc/guides/nics/features/ixgbe.ini
@@ -52,6 +52,7 @@ FreeBSD  = Y
 Linux= Y
 Windows  = Y
 ARMv8= Y
+LoongArch64  = Y
 rv64 = Y
 x86-32   = Y
 x86-64   = Y
diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
index 009d9b624a..c9d6ca9efe 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx.c
@@ -5957,8 +5957,11 @@ ixgbe_config_rss_filter(struct rte_eth_dev *dev,
return 0;
 }
 
-/* Stubs needed for linkage when RTE_ARCH_PPC_64 or RTE_ARCH_RISCV is set */
-#if defined(RTE_ARCH_PPC_64) || defined(RTE_ARCH_RISCV)
+/* Stubs needed for linkage when RTE_ARCH_PPC_64, RTE_ARCH_RISCV or
+ * RTE_ARCH_LOONGARCH is set.
+ */
+#if defined(RTE_ARCH_PPC_64) || defined(RTE_ARCH_RISCV) || \
+   defined(RTE_ARCH_LOONGARCH)
 int
 ixgbe_rx_vec_dev_conf_condition_check(struct rte_eth_dev __rte_unused *dev)
 {
diff --git a/drivers/net/ixgbe/meson.build b/drivers/net/ixgbe/meson.build
index 80ab012448..a18908ef7c 100644
--- a/drivers/net/ixgbe/meson.build
+++ b/drivers/net/ixgbe/meson.build
@@ -1,12 +1,6 @@
 # SPDX-License-Identifier: BSD-3-Clause
 # Copyright(c) 2017 Intel Corporation
 
-if arch_subdir == 'loongarch'
-build = false
-reason = 'not supported on LoongArch'
-subdir_done()
-endif
-
 cflags += ['-DRTE_LIBRTE_IXGBE_BYPASS']
 
 subdir('base')
-- 
2.32.1 (Apple Git-133)



[PATCH v8 0/6] Introduce support for LoongArch architecture

2022-10-04 Thread Min Zhou
Dear team,

The following patch set is intended to support DPDK running on LoongArch
architecture.

LoongArch is the general processor architecture of Loongson Corporation
and is a new RISC ISA, which is a bit like MIPS or RISC-V.

The online documents of LoongArch architecture are here:
https://loongson.github.io/LoongArch-Documentation/README-EN.html

The latest build tools for LoongArch (binary) can be downloaded from:
https://github.com/loongson/build-tools

If you want to generate your own cross toolchain, you can refer to
this thread:

https://inbox.dpdk.org/dev/53b50799-cb29-7ee6-be89-4fe21566e...@loongson.cn/T/#m1da99578f85894a4ddcd8e39d8239869e6a501d1
From the link above, you can find a script to do that.

v8: 
- rebase the patchset on the main repository
- add meson build test for LoongArch in devtools/test-meson-builds.sh
- add ccache to build configuration file
- change the cpp meson variable to a c++ compiler
- complete the cross compilation documentation for LoongArch, adding
  reference to the build script and dependency list
- put the feature description for LoongArch in the EAL features list
  in release_22_11.rst
- simplify macro definition for new added headers
- put the items about LoongArch in the right place in meson.build

v7:
- rebase the patchset on the main repository
- add errno.h to rte_power_intrinsics.c according with
  commit 72b452c5f259

v6:
- place some blocks for LoongArch in a pseudo alphabetical order
- remove some macros not used
- update release notes in the correct format
- remove some headers for LoongArch, including msclock, pflock and
  ticketlock, which are now non-arch specific
- rename some helpers to make them more readable 
- remove some copied comments
- force-set RTE_FORCE_INTRINSICS in config and remove non-arch
  specific implementations
- fix format errors in meson file reported by check-meson.py
- rebase the patchset on the main repository

v5:
- merge all patches for supporting LoongArch EAL into one patch
- add LoongArch cross compilation document and update some documents
  related to architecture
- remove vector stubs added for LoongArch in net/i40e and net/ixgbe
- add LOONGARCH64 cross compilation job in github ci

v4:
- rebase the patchset on the main repository of version 22.07.0

v3:
- add URL for cross compile tool chain
- remove rte_lpm_lsx.h which was a dummy vector implementation
  because there is already a scalar implementation, thanks to
  Michal Mazurek
- modify the name of compiler for cross compiling
- remove useless variable in meson.build

v2:
- use standard atomics of toolchain to implement
  atomic operations
- implement spinlock based on standard atomics

Min Zhou (6):
  eal/loongarch: support LoongArch architecture
  net/ixgbe: add vector stubs for LoongArch
  net/memif: set memfd syscall ID on LoongArch
  net/tap: set BPF syscall ID for LoongArch
  examples/l3fwd: enable LoongArch operation
  test/cpuflags: add test for LoongArch cpu flag

 MAINTAINERS   |  6 ++
 app/test/test_cpuflags.c  | 41 
 app/test/test_xmmt_ops.h  | 12 +++
 .../loongarch/loongarch_loongarch64_linux_gcc | 16 +++
 config/loongarch/meson.build  | 43 
 devtools/test-meson-builds.sh |  4 +
 doc/guides/contributing/design.rst|  2 +-
 .../cross_build_dpdk_for_loongarch.rst| 97 +++
 doc/guides/linux_gsg/index.rst|  1 +
 doc/guides/nics/features.rst  |  8 ++
 doc/guides/nics/features/default.ini  |  1 +
 doc/guides/nics/features/ixgbe.ini|  1 +
 doc/guides/rel_notes/release_22_11.rst|  7 ++
 drivers/net/i40e/meson.build  |  6 ++
 drivers/net/ixgbe/ixgbe_rxtx.c|  7 +-
 drivers/net/memif/rte_eth_memif.h |  2 +
 drivers/net/tap/tap_bpf.h |  2 +
 examples/l3fwd/l3fwd_em.c |  8 ++
 lib/eal/linux/eal_memory.c|  4 +
 lib/eal/loongarch/include/meson.build | 18 
 lib/eal/loongarch/include/rte_atomic.h| 47 +
 lib/eal/loongarch/include/rte_byteorder.h | 40 
 lib/eal/loongarch/include/rte_cpuflags.h  | 39 
 lib/eal/loongarch/include/rte_cycles.h| 47 +
 lib/eal/loongarch/include/rte_io.h| 18 
 lib/eal/loongarch/include/rte_memcpy.h| 61 
 lib/eal/loongarch/include/rte_pause.h | 24 +
 .../loongarch/include/rte_power_intrinsics.h  | 20 
 lib/eal/loongarch/include/rte_prefetch.h  | 47 +
 lib/eal/loongarch/include/rte_rwlock.h| 42 
 lib/eal/loongarch/include/rte_spinlock.h  | 64 
 lib/eal/loongarch/include/rte_vect.h  | 65 ++

[PATCH v8 5/6] examples/l3fwd: enable LoongArch operation

2022-10-04 Thread Min Zhou
Add the missing em_mask_key() implementation to enable l3fwd to run
on LoongArch.

Signed-off-by: Min Zhou 
---
 examples/l3fwd/l3fwd_em.c  | 8 
 examples/l3fwd/meson.build | 6 --
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/examples/l3fwd/l3fwd_em.c b/examples/l3fwd/l3fwd_em.c
index 0531282a1f..a203dc9e46 100644
--- a/examples/l3fwd/l3fwd_em.c
+++ b/examples/l3fwd/l3fwd_em.c
@@ -247,6 +247,14 @@ em_mask_key(void *key, xmm_t mask)
 
return vect_and(data, mask);
 }
+#elif defined(RTE_ARCH_LOONGARCH)
+static inline xmm_t
+em_mask_key(void *key, xmm_t mask)
+{
+   xmm_t data = vect_load_128(key);
+
+   return vect_and(data, mask);
+}
 #else
 #error No vector engine (SSE, NEON, ALTIVEC) available, check your toolchain
 #endif
diff --git a/examples/l3fwd/meson.build b/examples/l3fwd/meson.build
index d2f2d96099..b40244a941 100644
--- a/examples/l3fwd/meson.build
+++ b/examples/l3fwd/meson.build
@@ -6,12 +6,6 @@
 # To build this example as a standalone application with an already-installed
 # DPDK instance, use 'make'
 
-if arch_subdir == 'loongarch'
-build = false
-reason = 'not supported on LoongArch'
-subdir_done()
-endif
-
 allow_experimental_apis = true
 deps += ['acl', 'hash', 'lpm', 'fib', 'eventdev']
 sources = files(
-- 
2.32.1 (Apple Git-133)



[PATCH v8 1/6] eal/loongarch: support LoongArch architecture

2022-10-04 Thread Min Zhou
Add all necessary elements for DPDK to compile and run EAL on
LoongArch64 SoCs.

This includes:

- EAL library implementation for LoongArch ISA.
- meson build structure for 'loongarch' architecture.
  RTE_ARCH_LOONGARCH define is added for architecture identification.
- xmm_t structure operation stubs as there is no vector support in
  the current version for LoongArch.

Compilation was tested on Debian and CentOS using the loongarch64
cross-compile toolchain from x86 build hosts. Functionality was tested
on Loongnix and Kylin, two Linux distributions supporting LoongArch
hosts, based on Linux 4.19 and maintained by Loongson Corporation.

We also tested DPDK on LoongArch with some external applications,
including: Pktgen-DPDK, OVS, VPP.

The platform is currently marked as Linux-only because no OS other
than Linux currently supports LoongArch hosts.

The i40e PMD driver is disabled on LoongArch because of the absence
of vector support in the current version.

Similar to RISC-V, compilation of the following modules has been
disabled by this commit and will be re-enabled in later commits as
fixes are introduced:
net/ixgbe, net/memif, net/tap, examples/l3fwd.

Signed-off-by: Min Zhou 
---
 MAINTAINERS   |  6 ++
 app/test/test_xmmt_ops.h  | 12 +++
 .../loongarch/loongarch_loongarch64_linux_gcc | 16 +++
 config/loongarch/meson.build  | 43 
 devtools/test-meson-builds.sh |  4 +
 doc/guides/contributing/design.rst|  2 +-
 .../cross_build_dpdk_for_loongarch.rst| 97 +++
 doc/guides/linux_gsg/index.rst|  1 +
 doc/guides/nics/features.rst  |  8 ++
 doc/guides/nics/features/default.ini  |  1 +
 doc/guides/rel_notes/release_22_11.rst|  7 ++
 drivers/net/i40e/meson.build  |  6 ++
 drivers/net/ixgbe/meson.build |  6 ++
 drivers/net/memif/meson.build |  6 ++
 drivers/net/tap/meson.build   |  6 ++
 examples/l3fwd/meson.build|  6 ++
 lib/eal/linux/eal_memory.c|  4 +
 lib/eal/loongarch/include/meson.build | 18 
 lib/eal/loongarch/include/rte_atomic.h| 47 +
 lib/eal/loongarch/include/rte_byteorder.h | 40 
 lib/eal/loongarch/include/rte_cpuflags.h  | 39 
 lib/eal/loongarch/include/rte_cycles.h| 47 +
 lib/eal/loongarch/include/rte_io.h| 18 
 lib/eal/loongarch/include/rte_memcpy.h| 61 
 lib/eal/loongarch/include/rte_pause.h | 24 +
 .../loongarch/include/rte_power_intrinsics.h  | 20 
 lib/eal/loongarch/include/rte_prefetch.h  | 47 +
 lib/eal/loongarch/include/rte_rwlock.h| 42 
 lib/eal/loongarch/include/rte_spinlock.h  | 64 
 lib/eal/loongarch/include/rte_vect.h  | 65 +
 lib/eal/loongarch/meson.build | 11 +++
 lib/eal/loongarch/rte_cpuflags.c  | 93 ++
 lib/eal/loongarch/rte_cycles.c| 45 +
 lib/eal/loongarch/rte_hypervisor.c| 11 +++
 lib/eal/loongarch/rte_power_intrinsics.c  | 53 ++
 meson.build   |  2 +
 36 files changed, 977 insertions(+), 1 deletion(-)
 create mode 100644 config/loongarch/loongarch_loongarch64_linux_gcc
 create mode 100644 config/loongarch/meson.build
 create mode 100644 doc/guides/linux_gsg/cross_build_dpdk_for_loongarch.rst
 create mode 100644 lib/eal/loongarch/include/meson.build
 create mode 100644 lib/eal/loongarch/include/rte_atomic.h
 create mode 100644 lib/eal/loongarch/include/rte_byteorder.h
 create mode 100644 lib/eal/loongarch/include/rte_cpuflags.h
 create mode 100644 lib/eal/loongarch/include/rte_cycles.h
 create mode 100644 lib/eal/loongarch/include/rte_io.h
 create mode 100644 lib/eal/loongarch/include/rte_memcpy.h
 create mode 100644 lib/eal/loongarch/include/rte_pause.h
 create mode 100644 lib/eal/loongarch/include/rte_power_intrinsics.h
 create mode 100644 lib/eal/loongarch/include/rte_prefetch.h
 create mode 100644 lib/eal/loongarch/include/rte_rwlock.h
 create mode 100644 lib/eal/loongarch/include/rte_spinlock.h
 create mode 100644 lib/eal/loongarch/include/rte_vect.h
 create mode 100644 lib/eal/loongarch/meson.build
 create mode 100644 lib/eal/loongarch/rte_cpuflags.c
 create mode 100644 lib/eal/loongarch/rte_cycles.c
 create mode 100644 lib/eal/loongarch/rte_hypervisor.c
 create mode 100644 lib/eal/loongarch/rte_power_intrinsics.c

diff --git a/MAINTAINERS b/MAINTAINERS
index a55b379d73..5472fccf61 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -294,6 +294,12 @@ F: app/*/*_neon.*
 F: examples/*/*_neon.*
 F: examples/common/neon/
 
+LoongArch
+M: Min Zhou 
+F: config/loongarch/
+F: doc/guides/linux_gsg/cross_build_dpdk_for_loongarch.rst
+F: lib/eal/loongarch/
+
 IBM POWER (alpha)
 M: David Christensen 
 F: config/

[PATCH v8 4/6] net/tap: set BPF syscall ID for LoongArch

2022-10-04 Thread Min Zhou
Define the missing __NR_bpf syscall id to enable the tap PMD on
LoongArch.

Signed-off-by: Min Zhou 
---
 drivers/net/tap/meson.build | 6 --
 drivers/net/tap/tap_bpf.h   | 2 ++
 2 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/net/tap/meson.build b/drivers/net/tap/meson.build
index f0d03069cd..c09713a67b 100644
--- a/drivers/net/tap/meson.build
+++ b/drivers/net/tap/meson.build
@@ -1,12 +1,6 @@
 # SPDX-License-Identifier: BSD-3-Clause
 # Copyright 2018 Luca Boccassi 
 
-if arch_subdir == 'loongarch'
-build = false
-reason = 'not supported on LoongArch'
-subdir_done()
-endif
-
 if not is_linux
 build = false
 reason = 'only supported on Linux'
diff --git a/drivers/net/tap/tap_bpf.h b/drivers/net/tap/tap_bpf.h
index 639bdf3a79..0d38bc111f 100644
--- a/drivers/net/tap/tap_bpf.h
+++ b/drivers/net/tap/tap_bpf.h
@@ -103,6 +103,8 @@ union bpf_attr {
 #  define __NR_bpf 361
 # elif defined(__riscv)
 #  define __NR_bpf 280
+# elif defined(__loongarch__)
+#  define __NR_bpf 280
 # else
 #  error __NR_bpf not defined
 # endif
-- 
2.32.1 (Apple Git-133)



[PATCH v8 6/6] test/cpuflags: add test for LoongArch cpu flag

2022-10-04 Thread Min Zhou
Add checks for all flag values defined in the LoongArch cpu
feature table.

Signed-off-by: Min Zhou 
---
 app/test/test_cpuflags.c | 41 
 1 file changed, 41 insertions(+)

diff --git a/app/test/test_cpuflags.c b/app/test/test_cpuflags.c
index 98a99c2c7d..a0e342ae48 100644
--- a/app/test/test_cpuflags.c
+++ b/app/test/test_cpuflags.c
@@ -281,6 +281,47 @@ test_cpuflags(void)
CHECK_FOR_FLAG(RTE_CPUFLAG_RISCV_ISA_Z);
 #endif
 
+#if defined(RTE_ARCH_LOONGARCH)
+   printf("Check for CPUCFG:\t");
+   CHECK_FOR_FLAG(RTE_CPUFLAG_CPUCFG);
+
+   printf("Check for LAM:\t\t");
+   CHECK_FOR_FLAG(RTE_CPUFLAG_LAM);
+
+   printf("Check for UAL:\t\t");
+   CHECK_FOR_FLAG(RTE_CPUFLAG_UAL);
+
+   printf("Check for FPU:\t\t");
+   CHECK_FOR_FLAG(RTE_CPUFLAG_FPU);
+
+   printf("Check for LSX:\t\t");
+   CHECK_FOR_FLAG(RTE_CPUFLAG_LSX);
+
+   printf("Check for LASX:\t\t");
+   CHECK_FOR_FLAG(RTE_CPUFLAG_LASX);
+
+   printf("Check for CRC32:\t");
+   CHECK_FOR_FLAG(RTE_CPUFLAG_CRC32);
+
+   printf("Check for COMPLEX:\t");
+   CHECK_FOR_FLAG(RTE_CPUFLAG_COMPLEX);
+
+   printf("Check for CRYPTO:\t");
+   CHECK_FOR_FLAG(RTE_CPUFLAG_CRYPTO);
+
+   printf("Check for LVZ:\t\t");
+   CHECK_FOR_FLAG(RTE_CPUFLAG_LVZ);
+
+   printf("Check for LBT_X86:\t");
+   CHECK_FOR_FLAG(RTE_CPUFLAG_LBT_X86);
+
+   printf("Check for LBT_ARM:\t");
+   CHECK_FOR_FLAG(RTE_CPUFLAG_LBT_ARM);
+
+   printf("Check for LBT_MIPS:\t");
+   CHECK_FOR_FLAG(RTE_CPUFLAG_LBT_MIPS);
+#endif
+
/*
 * Check if invalid data is handled properly
 */
-- 
2.32.1 (Apple Git-133)



Re: [PATCH v2] mempool: fix get objects from mempool with cache

2022-10-04 Thread Andrew Rybchenko

On 10/4/22 18:13, Morten Brørup wrote:

@Aaron, do you have any insights or comments to my curiosity below?


From: Andrew Rybchenko [mailto:andrew.rybche...@oktetlabs.ru]
Sent: Tuesday, 4 October 2022 14.58

Hi Morten,

In general I agree that the fix is required.
In sent v3 I'm trying to make it a bit better from my point of
view. See few notes below.


I stand by my review and accept of v3 - this message is not intended to change 
that! I'm just curious...

I wonder how accurate the automated performance tests ([v2], [v3]) are, and if 
they are comparable between February and October?

[v2]: http://mails.dpdk.org/archives/test-report/2022-February/256462.html
[v3]: http://mails.dpdk.org/archives/test-report/2022-October/311526.html


Ubuntu 20.04
Kernel: 4.15.0-generic
Compiler: gcc 7.4
NIC: Intel Corporation Ethernet Converged Network Adapter XL710-QDA2 4 Mbps
Target: x86_64-native-linuxapp-gcc
Fail/Total: 0/4

Detail performance results:
** V2 **:
+--+-+-++--+
| num_cpus | num_threads | txd/rxd | frame_size |  throughput difference from  |
|  | | ||   expected   |
+==+=+=++==+
| 1| 2   | 512 | 64 | 0.5% |
+--+-+-++--+
| 1| 2   | 2048| 64 | -1.5%|
+--+-+-++--+
| 1| 1   | 512 | 64 | 4.3% |
+--+-+-++--+
| 1| 1   | 2048| 64 | 10.9%|
+--+-+-++--+

** V3 **:
+--+-+-++--+
| num_cpus | num_threads | txd/rxd | frame_size |  throughput difference from  |
|  | | ||   expected   |
+==+=+=++==+
| 1| 2   | 512 | 64 | -0.7%|
+--+-+-++--+
| 1| 2   | 2048| 64 | -2.3%|
+--+-+-++--+
| 1| 1   | 512 | 64 | 0.5% |
+--+-+-++--+
| 1| 1   | 2048| 64 | 7.9% |
+--+-+-++--+



Very interesting; maybe it makes sense to send your patch and mine
once again to check the current figures and the stability of the results.



Re: [PATCH RESEND 00/13] some bugfixes and clean code for hns3

2022-10-04 Thread Andrew Rybchenko

On 9/5/22 11:59, Dongdong Liu wrote:

This patchset consists of two parts that have been sent out before.
1. [PATCH 0/5] some bugfixes and clean code for hns3
https://lore.kernel.org/all/20220713115002.8959-2-liudongdo...@huawei.com/T/
2. [PATCH 0/8] some bugfixes for hns3
https://lore.kernel.org/all/20220727103616.18596-1-liudongdo...@huawei.com/

Rebased on the latest dpdk-net-next (branch main) to avoid merge conflict.

Chengwen Feng (6):
   net/hns3: fix segment fault when using SVE xmit
   net/hns3: fix next-to-use overflow when using SVE xmit
   net/hns3: fix next-to-use overflow when using simple xmit
   net/hns3: optimize SVE xmit performance
   net/hns3: fix segment fault when secondary process access FW
   net/hns3: revert optimize Tx performance

Dongdong Liu (1):
   net/hns3: adjust code for dump file

Huisong Li (3):
   net/hns3: fix fail to receive PTP packet
   net/hns3: delete rte unused tag
   net/hns3: fix uncleared hardware MAC statistics

Jie Hai (1):
   net/hns3: add dump of VF vlan filter modify capability

Min Hu (Connor) (2):
   net/hns3: rename hns3 dump file
   net/hns3: fix code check warning

  drivers/net/hns3/hns3_common.c|   4 +-
  .../hns3/{hns3_ethdev_dump.c => hns3_dump.c}  | 292 ++
  drivers/net/hns3/hns3_dump.h  |  13 +
  drivers/net/hns3/hns3_ethdev.c|  11 +-
  drivers/net/hns3/hns3_ethdev.h|  15 +-
  drivers/net/hns3/hns3_ethdev_vf.c |  12 +-
  drivers/net/hns3/hns3_flow.c  |   4 +-
  drivers/net/hns3/hns3_intr.c  |  27 +-
  drivers/net/hns3/hns3_intr.h  |   4 +-
  drivers/net/hns3/hns3_ptp.c   |   1 -
  drivers/net/hns3/hns3_regs.c  |   4 +-
  drivers/net/hns3/hns3_rss.c   |   2 +-
  drivers/net/hns3/hns3_rss.h   |   2 +-
  drivers/net/hns3/hns3_rxtx.c  | 127 
  drivers/net/hns3/hns3_rxtx.h  |  14 +-
  drivers/net/hns3/hns3_rxtx_vec.c  |  20 +-
  drivers/net/hns3/hns3_rxtx_vec_sve.c  |  32 +-
  drivers/net/hns3/hns3_stats.c |  26 +-
  drivers/net/hns3/hns3_stats.h |   5 +-
  drivers/net/hns3/meson.build  |   2 +-
  20 files changed, 333 insertions(+), 284 deletions(-)
  rename drivers/net/hns3/{hns3_ethdev_dump.c => hns3_dump.c} (73%)
  create mode 100644 drivers/net/hns3/hns3_dump.h

--
2.22.0



Applied to dpdk-next-net/main, thanks.


[PATCH v2] mempool: fix get objects from mempool with cache

2022-10-04 Thread Morten Brørup
RESENT for test purposes.

A flush threshold for the mempool cache was introduced in DPDK version 1.3, but 
rte_mempool_do_generic_get() was not completely updated back then, and some 
inefficiencies were introduced.

This patch fixes the following in rte_mempool_do_generic_get():

1. The code that initially screens the cache request was not updated with the 
change in DPDK version 1.3.
The initial screening compared the request length to the cache size, which was 
correct before, but became irrelevant with the introduction of the flush 
threshold. E.g. the cache can hold up to flushthresh objects, which is more 
than its size, so some requests were not served from the cache, even though 
they could be.
The initial screening has now been corrected to match the initial screening in 
rte_mempool_do_generic_put(), which verifies that a cache is present, and that 
the length of the request does not overflow the memory allocated for the cache.

This bug caused a major performance degradation in scenarios where the 
application burst length is the same as the cache size. In such cases, the 
objects were never fetched from the mempool cache, even when they could 
have been.
This scenario occurs e.g. if an application has configured a mempool with a 
size matching the application's burst size.

2. The function is a helper for rte_mempool_generic_get(), so it must behave 
according to the description of that function.
Specifically, objects must first be returned from the cache, subsequently from 
the ring.
After the change in DPDK version 1.3, this was not the behavior when the 
request was partially satisfied from the cache; instead, the objects from the 
ring were returned ahead of the objects from the cache.
This bug degraded application performance on CPUs with a small L1 cache, which 
benefit from having the hot objects first in the returned array.
(This is probably also the reason why the function returns the objects in 
reverse order, which it still does.) Now, all code paths first return objects 
from the cache, subsequently from the ring.

The function was not behaving as described (by the function using it) and 
expected by applications using it. This in itself is also a bug.

3. If the cache could not be backfilled, the function would attempt to get all 
the requested objects from the ring (instead of only the number of requested 
objects minus the objects available in the cache), and the function would fail 
if that failed.
Now, the first part of the request is always satisfied from the cache, and if 
the subsequent backfilling of the cache from the ring fails, only the remaining 
requested objects are retrieved from the ring.

The function could thus fail even though there were enough objects in the 
cache plus the common pool.

4. The code flow for satisfying the request from the cache was slightly
inefficient:
The likely code path where the objects are simply served from the cache was 
treated as unlikely. Now it is treated as likely.
And in the code path where the cache was backfilled first, numbers were added 
and subtracted from the cache length; now this code path simply sets the cache 
length to its final value.

v2 changes
- Do not modify description of return value. This belongs in a separate doc fix.
- Elaborate even more on which bugs the modifications fix.

Signed-off-by: Morten Brørup 
---
 lib/mempool/rte_mempool.h | 75 ---
 1 file changed, 54 insertions(+), 21 deletions(-)

diff --git a/lib/mempool/rte_mempool.h b/lib/mempool/rte_mempool.h index 
1e7a3c1527..2898c690b0 100644
--- a/lib/mempool/rte_mempool.h
+++ b/lib/mempool/rte_mempool.h
@@ -1463,38 +1463,71 @@ rte_mempool_do_generic_get(struct rte_mempool *mp, void 
**obj_table,
uint32_t index, len;
void **cache_objs;
 
-   /* No cache provided or cannot be satisfied from cache */
-   if (unlikely(cache == NULL || n >= cache->size))
+   /* No cache provided or if get would overflow mem allocated for cache */
+   if (unlikely(cache == NULL || n > RTE_MEMPOOL_CACHE_MAX_SIZE))
goto ring_dequeue;
 
-   cache_objs = cache->objs;
+   cache_objs = &cache->objs[cache->len];
+
+   if (n <= cache->len) {
+   /* The entire request can be satisfied from the cache. */
+   cache->len -= n;
+   for (index = 0; index < n; index++)
+   *obj_table++ = *--cache_objs;
+
+   RTE_MEMPOOL_STAT_ADD(mp, get_success_bulk, 1);
+   RTE_MEMPOOL_STAT_ADD(mp, get_success_objs, n);
 
-   /* Can this be satisfied from the cache? */
-   if (cache->len < n) {
-   /* No. Backfill the cache first, and then fill from it */
-   uint32_t req = n + (cache->size - cache->len);
+   return 0;
+   }
 
-   /* How many do we require i.e. number to fill the cache + the 
request */
-   ret = rte_mempool_ops_dequeue_bulk(mp,
-   

[dpdk-dev v1] crypto/qat: fix of qat build request session in mp

2022-10-04 Thread Kai Ji
This patch fixes the session pointer passed to set_session()
when the ctx has a NULL build-request pointer in the multi-process
scenario.

Fixes: fb3b9f492205 ("crypto/qat: rework burst data path")
Cc: sta...@dpdk.org

Signed-off-by: Kai Ji 
---
 drivers/crypto/qat/qat_sym.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/crypto/qat/qat_sym.c b/drivers/crypto/qat/qat_sym.c
index 54c3d59a51..fd2d9eed3b 100644
--- a/drivers/crypto/qat/qat_sym.c
+++ b/drivers/crypto/qat/qat_sym.c
@@ -85,7 +85,7 @@ qat_sym_build_request(void *in_op, uint8_t *out_msg,
if (unlikely(ctx->build_request[proc_type] == NULL)) {
int ret =
qat_sym_gen_dev_ops[dev_gen].set_session(
-   (void *)cdev, (void *)sess);
+   (void *)cdev, (void *)ctx);
if (ret < 0) {
op->status =

RTE_CRYPTO_OP_STATUS_INVALID_SESSION;
-- 
2.17.1



RE: [PATCH v2] mempool: fix get objects from mempool with cache

2022-10-04 Thread Morten Brørup
RESENT for test purposes.

A flush threshold for the mempool cache was introduced in DPDK version
1.3, but rte_mempool_do_generic_get() was not completely updated back
then, and some inefficiencies were introduced.

This patch fixes the following in rte_mempool_do_generic_get():

1. The code that initially screens the cache request was not updated
with the change in DPDK version 1.3.
The initial screening compared the request length to the cache size,
which was correct before, but became irrelevant with the introduction of
the flush threshold. E.g. the cache can hold up to flushthresh objects,
which is more than its size, so some requests were not served from the
cache, even though they could be.
The initial screening has now been corrected to match the initial
screening in rte_mempool_do_generic_put(), which verifies that a cache
is present, and that the length of the request does not overflow the
memory allocated for the cache.

This bug caused a major performance degradation in scenarios where the
application burst length is the same as the cache size. In such cases,
the objects were never fetched from the mempool cache, even when
they could have been.
This scenario occurs e.g. if an application has configured a mempool
with a size matching the application's burst size.

2. The function is a helper for rte_mempool_generic_get(), so it must
behave according to the description of that function.
Specifically, objects must first be returned from the cache,
subsequently from the ring.
After the change in DPDK version 1.3, this was not the behavior when
the request was partially satisfied from the cache; instead, the objects
from the ring were returned ahead of the objects from the cache.
This bug degraded application performance on CPUs with a small L1 cache,
which benefit from having the hot objects first in the returned array.
(This is probably also the reason why the function returns the objects
in reverse order, which it still does.)
Now, all code paths first return objects from the cache, subsequently
from the ring.

The function was not behaving as described (by the function using it)
and expected by applications using it. This in itself is also a bug.

3. If the cache could not be backfilled, the function would attempt
to get all the requested objects from the ring (instead of only the
number of requested objects minus the objects available in the ring),
and the function would fail if that failed.
Now, the first part of the request is always satisfied from the cache,
and if the subsequent backfilling of the cache from the ring fails, only
the remaining requested objects are retrieved from the ring.

The function would fail despite there are enough objects in the cache
plus the common pool.

4. The code flow for satisfying the request from the cache was slightly
inefficient:
The likely code path where the objects are simply served from the cache
was treated as unlikely. Now it is treated as likely.
And in the code path where the cache was backfilled first, numbers were
added and subtracted from the cache length; now this code path simply
sets the cache length to its final value.

v2 changes
- Do not modify description of return value. This belongs in a separate
doc fix.
- Elaborate even more on which bugs the modifications fix.

Signed-off-by: Morten Brørup 
---
 lib/mempool/rte_mempool.h | 75 ---
 1 file changed, 54 insertions(+), 21 deletions(-)

diff --git a/lib/mempool/rte_mempool.h b/lib/mempool/rte_mempool.h
index 1e7a3c1527..2898c690b0 100644
--- a/lib/mempool/rte_mempool.h
+++ b/lib/mempool/rte_mempool.h
@@ -1463,38 +1463,71 @@ rte_mempool_do_generic_get(struct rte_mempool *mp, void 
**obj_table,
uint32_t index, len;
void **cache_objs;
 
-   /* No cache provided or cannot be satisfied from cache */
-   if (unlikely(cache == NULL || n >= cache->size))
+   /* No cache provided or if get would overflow mem allocated for cache */
+   if (unlikely(cache == NULL || n > RTE_MEMPOOL_CACHE_MAX_SIZE))
goto ring_dequeue;
 
-   cache_objs = cache->objs;
+   cache_objs = &cache->objs[cache->len];
+
+   if (n <= cache->len) {
+   /* The entire request can be satisfied from the cache. */
+   cache->len -= n;
+   for (index = 0; index < n; index++)
+   *obj_table++ = *--cache_objs;
+
+   RTE_MEMPOOL_STAT_ADD(mp, get_success_bulk, 1);
+   RTE_MEMPOOL_STAT_ADD(mp, get_success_objs, n);
 
-   /* Can this be satisfied from the cache? */
-   if (cache->len < n) {
-   /* No. Backfill the cache first, and then fill from it */
-   uint32_t req = n + (cache->size - cache->len);
+   return 0;
+   }
 
-   /* How many do we require i.e. number to fill the cache + the 
request */
-   ret = rte_mempool_ops_dequeue_bulk(mp,
-   &cache->objs[cache->len], r

[PATCH v2] mempool: fix get objects from mempool with cache

2022-10-04 Thread Morten Brørup
RESENT for test purposes.

A flush threshold for the mempool cache was introduced in DPDK version
1.3, but rte_mempool_do_generic_get() was not completely updated back
then, and some inefficiencies were introduced.

This patch fixes the following in rte_mempool_do_generic_get():

1. The code that initially screens the cache request was not updated
with the change in DPDK version 1.3.
The initial screening compared the request length to the cache size,
which was correct before, but became irrelevant with the introduction of
the flush threshold. E.g. the cache can hold up to flushthresh objects,
which is more than its size, so some requests were not served from the
cache, even though they could be.
The initial screening has now been corrected to match the initial
screening in rte_mempool_do_generic_put(), which verifies that a cache
is present, and that the length of the request does not overflow the
memory allocated for the cache.

This bug caused a major performance degradation in scenarios where the
application burst length is the same as the cache size. In such cases,
the objects were never fetched from the mempool cache, even though
they could have been.
This scenario occurs e.g. if an application has configured a mempool
with a size matching the application's burst size.

2. The function is a helper for rte_mempool_generic_get(), so it must
behave according to the description of that function.
Specifically, objects must first be returned from the cache,
subsequently from the ring.
After the change in DPDK version 1.3, this was not the behavior when
the request was partially satisfied from the cache; instead, the objects
from the ring were returned ahead of the objects from the cache.
This bug degraded application performance on CPUs with a small L1 cache,
which benefit from having the hot objects first in the returned array.
(This is probably also the reason why the function returns the objects
in reverse order, which it still does.)
Now, all code paths first return objects from the cache, subsequently
from the ring.

The function was not behaving as described (by the function using it)
and as expected by applications using it. This in itself is also a bug.

3. If the cache could not be backfilled, the function would attempt
to get all the requested objects from the ring (instead of only the
number of requested objects minus the objects available in the cache),
and the function would fail if that failed.
Now, the first part of the request is always satisfied from the cache,
and if the subsequent backfilling of the cache from the ring fails, only
the remaining requested objects are retrieved from the ring.

Thus, the function could fail even though there were enough objects
available in the cache plus the common pool.

4. The code flow for satisfying the request from the cache was slightly
inefficient:
The likely code path where the objects are simply served from the cache
was treated as unlikely. Now it is treated as likely.
And in the code path where the cache was backfilled first, numbers were
added and subtracted from the cache length; now this code path simply
sets the cache length to its final value.

v2 changes
- Do not modify description of return value. This belongs in a separate
doc fix.
- Elaborate even more on which bugs the modifications fix.

Signed-off-by: Morten Brørup 
---
 lib/mempool/rte_mempool.h | 75 ---
 1 file changed, 54 insertions(+), 21 deletions(-)

diff --git a/lib/mempool/rte_mempool.h b/lib/mempool/rte_mempool.h
index 1e7a3c1527..2898c690b0 100644
--- a/lib/mempool/rte_mempool.h
+++ b/lib/mempool/rte_mempool.h
@@ -1463,38 +1463,71 @@ rte_mempool_do_generic_get(struct rte_mempool *mp, void 
**obj_table,
uint32_t index, len;
void **cache_objs;
 
-   /* No cache provided or cannot be satisfied from cache */
-   if (unlikely(cache == NULL || n >= cache->size))
+   /* No cache provided or if get would overflow mem allocated for cache */
+   if (unlikely(cache == NULL || n > RTE_MEMPOOL_CACHE_MAX_SIZE))
goto ring_dequeue;
 
-   cache_objs = cache->objs;
+   cache_objs = &cache->objs[cache->len];
+
+   if (n <= cache->len) {
+   /* The entire request can be satisfied from the cache. */
+   cache->len -= n;
+   for (index = 0; index < n; index++)
+   *obj_table++ = *--cache_objs;
+
+   RTE_MEMPOOL_STAT_ADD(mp, get_success_bulk, 1);
+   RTE_MEMPOOL_STAT_ADD(mp, get_success_objs, n);
 
-   /* Can this be satisfied from the cache? */
-   if (cache->len < n) {
-   /* No. Backfill the cache first, and then fill from it */
-   uint32_t req = n + (cache->size - cache->len);
+   return 0;
+   }
 
-   /* How many do we require i.e. number to fill the cache + the 
request */
-   ret = rte_mempool_ops_dequeue_bulk(mp,
-   &cache->objs[cache->len], r

Re: [PATCH 0/7] ethdev: introduce hairpin memory capabilities

2022-10-04 Thread Thomas Monjalon
19/09/2022 18:37, Dariusz Sosnowski:
> This patch series introduces hairpin memory configuration options proposed in
> http://patches.dpdk.org/project/dpdk/patch/20220811120530.191683-1-dsosnow...@nvidia.com/
> for Rx and Tx hairpin queues. It also implements handling of these options in 
> mlx5 PMD
> and allows to use new hairpin options in testpmd (through `--hairpin-mode` 
> option) and
> flow-perf (through `--hairpin-conf` option).

2 things are missing in this series:

1/ motivation (why is this needed)
2/ compilation on Windows
looks like devx_umem_reg has 5 parameters in Windows glue!





[PATCH v9] eal: add bus cleanup to eal cleanup

2022-10-04 Thread Kevin Laatz
During EAL init, all buses are probed and the devices found are
initialized. On eal_cleanup(), the inverse does not happen, meaning any
allocated memory and other configuration will not be cleaned up
appropriately on exit.

Currently, in order for device cleanup to take place, applications must
call the driver-relevant functions to ensure proper cleanup is done before
the application exits. Since initialization occurs for all devices on the
bus, not just the devices used by an application, it requires a)
application awareness of all bus devices that could have been probed on the
system, and b) code duplication across applications to ensure cleanup is
performed. An example of this is rte_eth_dev_close() which is commonly used
across the example applications.

This patch proposes adding bus cleanup to the eal_cleanup() to make EAL's
init/exit more symmetrical, ensuring all bus devices are cleaned up
appropriately without the application needing to be aware of all bus types
that may have been probed during initialization.

Contained in this patch are the changes required to perform cleanup for
devices on the PCI bus and VDEV bus during eal_cleanup(). Bus maintainers
are asked to add the relevant cleanup for their buses, since they have the
domain expertise.

Signed-off-by: Kevin Laatz 
Acked-by: Morten Brørup 
Reviewed-by: Bruce Richardson 

---
v9:
* move bus cleanup before trace save and uninitialize

v8:
* rebase

v7:
* free rte_pci_device structs during cleanup
* free rte_vdev_device structs during cleanup

v6:
* add bus_cleanup to eal_cleanup for FreeBSD
* add bus_cleanup to eal_cleanup for Windows
* remove bus cleanup function to remove rte_ prefix
* other minor fixes

v5:
* remove unnecessary logs
* move rte_bus_cleanup() definition to eal_private.h
* fix return values for vdev_cleanup and pci_cleanup

v4:
* rebase

v3:
* add vdev bus cleanup

v2:
* change log level from INFO to DEBUG for PCI cleanup
* add abignore entries for rte_bus related false positives
---
 drivers/bus/pci/pci_common.c| 28 
 drivers/bus/vdev/vdev.c | 27 +++
 lib/eal/common/eal_common_bus.c | 17 +
 lib/eal/common/eal_private.h| 10 ++
 lib/eal/freebsd/eal.c   |  1 +
 lib/eal/include/bus_driver.h| 13 +
 lib/eal/linux/eal.c |  1 +
 lib/eal/windows/eal.c   |  1 +
 8 files changed, 98 insertions(+)

diff --git a/drivers/bus/pci/pci_common.c b/drivers/bus/pci/pci_common.c
index 5ea72bcf23..fb754e0e0a 100644
--- a/drivers/bus/pci/pci_common.c
+++ b/drivers/bus/pci/pci_common.c
@@ -25,6 +25,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "private.h"
 
@@ -439,6 +440,32 @@ pci_probe(void)
return (probed && probed == failed) ? -1 : 0;
 }
 
+static int
+pci_cleanup(void)
+{
+   struct rte_pci_device *dev, *tmp_dev;
+   int error = 0;
+
+   RTE_TAILQ_FOREACH_SAFE(dev, &rte_pci_bus.device_list, next, tmp_dev) {
+   struct rte_pci_driver *drv = dev->driver;
+   int ret = 0;
+
+   if (drv == NULL || drv->remove == NULL)
+   continue;
+
+   ret = drv->remove(dev);
+   if (ret < 0) {
+   rte_errno = errno;
+   error = -1;
+   }
+   dev->driver = NULL;
+   dev->device.driver = NULL;
+   free(dev);
+   }
+
+   return error;
+}
+
 /* dump one device */
 static int
 pci_dump_one_device(FILE *f, struct rte_pci_device *dev)
@@ -856,6 +883,7 @@ struct rte_pci_bus rte_pci_bus = {
.bus = {
.scan = rte_pci_scan,
.probe = pci_probe,
+   .cleanup = pci_cleanup,
.find_device = pci_find_device,
.plug = pci_plug,
.unplug = pci_unplug,
diff --git a/drivers/bus/vdev/vdev.c b/drivers/bus/vdev/vdev.c
index b176b658fc..f5b43f1930 100644
--- a/drivers/bus/vdev/vdev.c
+++ b/drivers/bus/vdev/vdev.c
@@ -567,6 +567,32 @@ vdev_probe(void)
return ret;
 }
 
+static int
+vdev_cleanup(void)
+{
+   struct rte_vdev_device *dev, *tmp_dev;
+   int error = 0;
+
+   RTE_TAILQ_FOREACH_SAFE(dev, &vdev_device_list, next, tmp_dev) {
+   const struct rte_vdev_driver *drv;
+   int ret = 0;
+
+   drv = container_of(dev->device.driver, const struct 
rte_vdev_driver, driver);
+
+   if (drv == NULL || drv->remove == NULL)
+   continue;
+
+   ret = drv->remove(dev);
+   if (ret < 0)
+   error = -1;
+
+   dev->device.driver = NULL;
+   free(dev);
+   }
+
+   return error;
+}
+
 struct rte_device *
 rte_vdev_find_device(const struct rte_device *start, rte_dev_cmp_t cmp,
 const void *data)
@@ -625,6 +651,7 @@ vdev_get_iommu_class(void)
 static struct rte_bus rte_vd

Re: [PATCH 1/7] ethdev: introduce hairpin memory capabilities

2022-10-04 Thread Thomas Monjalon
19/09/2022 18:37, Dariusz Sosnowski:
> This patch introduces new hairpin queue configuration options through
> rte_eth_hairpin_conf struct, allowing to tune Rx and Tx hairpin queues
> memory configuration. Hairpin configuration is extended with the
> following fields:

What is the benefit?
How does the user know what to use?
Isn't it too low-level for a user?
Why is it not automatic in the driver?

[...]
> + /**
> +  * Use locked device memory as a backing storage.
> +  *
> +  * - When set, PMD will attempt to use on-device memory as a backing 
> storage for descriptors
> +  *   and/or data in hairpin queue.
> +  * - When set, PMD will use default memory type as a backing storage. 
> Please refer to PMD

You probably mean "clear".
Please make lines shorter.
You should split lines logically, after a dot or at the end of a part.

> +  *   documentation for details.
> +  *
> +  * API user should check if PMD supports this configuration flag using
> +  * @see rte_eth_dev_hairpin_capability_get.
> +  */
> + uint32_t use_locked_device_memory:1;
> +
> + /**
> +  * Use DPDK memory as backing storage.
> +  *
> +  * - When set, PMD will attempt to use memory managed by DPDK as a 
> backing storage
> +  *   for descriptors and/or data in hairpin queue.
> +  * - When clear, PMD will use default memory type as a backing storage. 
> Please refer
> +  *   to PMD documentation for details.
> +  *
> +  * API user should check if PMD supports this configuration flag using
> +  * @see rte_eth_dev_hairpin_capability_get.
> +  */
> + uint32_t use_rte_memory:1;
> +
> + /**
> +  * Force usage of hairpin memory configuration.
> +  *
> +  * - When set, PMD will attempt to use specified memory settings and
> +  *   if resource allocation fails, then hairpin queue setup will result 
> in an
> +  *   error.
> +  * - When clear, PMD will attempt to use specified memory settings and
> +  *   if resource allocation fails, then PMD will retry allocation with 
> default
> +  *   configuration.
> +  */
> + uint32_t force_memory:1;
> +
> + uint32_t reserved:11; /**< Reserved bits. */

You can insert a blank line here.

>   struct rte_eth_hairpin_peer peers[RTE_ETH_MAX_HAIRPIN_PEERS];
>  };




Re: [PATCH v7 1/3] power: add uncore frequency control API to the power library

2022-10-04 Thread Thomas Monjalon
28/09/2022 15:30, Tadhg Kearney:
> Add API to allow uncore frequency adjustment. This is done through
> manipulating related uncore frequency control sysfs entries to
> adjust the minimum and maximum uncore frequency values.
> Nine APIs are being added that are all public and experimental.

You cannot introduce an API without explaining what it is about.
Maybe I'm an idiot, but I don't know what "uncore" is.
I see it is explained in the documentation,
but a few words in the commit message would not be too much.
At least you must say it is for Linux on Intel,
and which feature it is controlling.

> +Uncore API
> +--
> +
> +Abstract
> +
> +
> +Uncore is a term used by Intel to describe the functions of a microprocessor 
> that are
> +not in the core, but which must be closely connected to the core to achieve 
> high performance;
> +L3 cache, on-die memory controller, etc.
> +Significant power savings can be achieved by reducing the uncore frequency 
> to its lowest value.

So this is an Intel thing.

> +
> +The Linux kernel provides the driver “intel-uncore-frequency" to control the 
> uncore frequency limits
> +for x86 platform. The driver is available from kernel version 5.6 and above.
> +Also CONFIG_INTEL_UNCORE_FREQ_CONTROL will need to be enabled in the kernel, 
> which was added in 5.6.
> +This manipulates the contents of MSR 0x620, which sets min/max of the uncore 
> +for the SKU.

It is correctly named "intel-uncore" in the Linux kernel.
Why not having "Intel" in the DPDK feature name?

> +
> +
> +API Overview for Uncore
> +~~~

A blank line is missing here.

> +* **Uncore Power Init**: Initialise uncore power, populate frequency array 
> and record
> +  original min & max for pkg & die.
> +
> +* **Uncore Power Exit**: Exit uncore power, restoring original min & max for 
> pkg & die.
> +
> +* **Get Uncore Power Freq**: Get current uncore freq index for pkg & die.
> +
> +* **Set Uncore Power Freq**: Set min & max uncore freq index for pkg & die 
> (min and max will be the same).
> +
> +* **Uncore Power Max**: Set max uncore freq index for pkg & die.
> +
> +* **Uncore Power Min**: Set min uncore freq index for pkg & die.
> +
> +* **Get Num Freqs**: Get the number of frequencies in the index array.
> +
> +* **Get Num Pkgs**: Get the number of packages (CPUs) on the system.
> +
> +* **Get Num Dies**: Get the number of dies on a given package.

Not sure what you are listing here. Are they functions?
If you really want to keep a list, I suggest using a definition list
available in RST syntax.
If you want to provide an explanation easy to read,
full sentences connecting things together would be better.

> +
>  References
>  --
>  
> diff --git a/doc/guides/rel_notes/release_22_11.rst 
> b/doc/guides/rel_notes/release_22_11.rst
> index cb7677fd3c..5d3f815b54 100644
> --- a/doc/guides/rel_notes/release_22_11.rst
> +++ b/doc/guides/rel_notes/release_22_11.rst
> @@ -75,6 +75,11 @@ New Features
>* Added ``rte_event_eth_tx_adapter_instance_get`` to get Tx adapter
>  instance ID for specified ethernet device ID and Tx queue index.
>  
> +* **Added uncore frequency control API to the power library.**
> +
> +  Add api to allow uncore frequency adjustment. This is done through

s/api/API/

> +  manipulating related uncore frequency control sysfs entries to
> +  adjust the minimum and maximum uncore frequency values.

It is Linux-only for Intel hardware only.

> --- /dev/null
> +++ b/lib/power/rte_power_uncore.c

I would add "intel" in the filename.

[...]
> +#define UNCORE_FREQUENCY_DIR "/sys/devices/system/cpu/intel_uncore_frequency"
> +#define POWER_GOVERNOR_PERF "performance"
> +#define POWER_UNCORE_SYSFILE_MAX_FREQ \
> + 
> "/sys/devices/system/cpu/intel_uncore_frequency/package_%02u_die_%02u/max_freq_khz"
> +#define POWER_UNCORE_SYSFILE_MIN_FREQ  \
> + 
> "/sys/devices/system/cpu/intel_uncore_frequency/package_%02u_die_%02u/min_freq_khz"
> +#define POWER_UNCORE_SYSFILE_BASE_MAX_FREQ \
> + 
> "/sys/devices/system/cpu/intel_uncore_frequency/package_%02u_die_%02u/initial_max_freq_khz"
> +#define POWER_UNCORE_SYSFILE_BASE_MIN_FREQ  \
> + 
> "/sys/devices/system/cpu/intel_uncore_frequency/package_%02u_die_%02u/initial_min_freq_khz"

It is for Intel CPU only, right?

> + * This function should NOT be called in the fast path.
> + *
> + * @param pkg
> + *  Package number.
> + * @param die
> + *  Die number.

To me it is not clear what they are.
Is it possible to better explain "pkg" and "die" somewhere?
Is it related to NUMA nodes?




[PATCH v12 0/7] bbdev changes for 22.11

2022-10-04 Thread Nicolas Chautru
Hi Akhil, Thomas, 

v12: minor change to fix misaligned comment on patch 6 raised by Thomas. 
Thanks. 
v11: updated based on Thomas' review, notably on comments through the series and 
ordering. Thanks. I have also updated rel_notes and deprecation through the 
series this time.
v10: replacing the _PADDED_MAX enum to _SIZE_MAX macro based on suggestion from 
Ferruh/Maxime/Akhil. Thanks
v9: removing code snippet from documentation in 5/7 requested by Akhil. Thanks. 
v8: edit based on review by Akhil : typos, coding guidelines. No functional 
change. Thanks
v7: couple of typos in documentation spotted by Maxime. Thanks.
v6: added one comment in commit 2/7 suggested by Maxime.
v5: updated based on review from Tom Rix. A number of typos were reported and
resolved, removed the commit related to rw_lock for now, added a commit for
code clean-up from review, resolved one rebase issue between 2 commits, and used 
the size of the array for some bound-check implementations. Thanks. 
v4: update to the last 2 commits to include function to print the queue status 
and a fix to the rte_lock within the wrong structure
v3: update to device status info to also use padded size for the related array.
Also adding 2 additional commits to allow the API struct to expose more 
information related to queue corner cases/warnings as well as an optional rw 
lock.
Hemant, Maxime, this is planned for DPDK 21.11, but I would like review/ack 
early if possible to get this applied earlier, due to time off this summer.
Thanks
Nic

Nicolas Chautru (7):
  bbdev: allow operation type enum for growth
  bbdev: add device status info
  bbdev: add device info on queue topology
  drivers/baseband: update PMDs to expose queue per operation
  bbdev: add new operation for FFT processing
  bbdev: add queue related warning and status information
  bbdev: remove unnecessary if-check

 app/test-bbdev/test_bbdev.c   |   2 +-
 app/test-bbdev/test_bbdev_perf.c  |   6 +-
 doc/guides/prog_guide/bbdev.rst   | 103 +++
 doc/guides/rel_notes/deprecation.rst  |  13 --
 doc/guides/rel_notes/release_22_11.rst|  14 ++
 drivers/baseband/acc100/rte_acc100_pmd.c  |  30 ++--
 .../fpga_5gnr_fec/rte_fpga_5gnr_fec.c |   9 +
 drivers/baseband/fpga_lte_fec/fpga_lte_fec.c  |   9 +
 drivers/baseband/la12xx/bbdev_la12xx.c|  10 +-
 drivers/baseband/null/bbdev_null.c|   1 +
 .../baseband/turbo_sw/bbdev_turbo_software.c  |  13 ++
 examples/bbdev_app/main.c |   2 +-
 lib/bbdev/rte_bbdev.c |  57 +-
 lib/bbdev/rte_bbdev.h | 158 +++-
 lib/bbdev/rte_bbdev_op.h  | 169 --
 lib/bbdev/version.map |  12 ++
 16 files changed, 560 insertions(+), 48 deletions(-)

-- 
2.37.1



[PATCH v12 1/7] bbdev: allow operation type enum for growth

2022-10-04 Thread Nicolas Chautru
Update the rte_bbdev_op_type enum to keep the ABI compatible
when new enum values are inserted, by adding a padded maximum
value for array sizing needs.
Remove RTE_BBDEV_OP_TYPE_COUNT and instead expose
RTE_BBDEV_OP_TYPE_SIZE_MAX.

Signed-off-by: Nicolas Chautru 
Acked-by: Maxime Coquelin 
---
 app/test-bbdev/test_bbdev.c|  2 +-
 app/test-bbdev/test_bbdev_perf.c   |  4 ++--
 doc/guides/rel_notes/deprecation.rst   |  5 +
 doc/guides/rel_notes/release_22_11.rst |  3 +++
 examples/bbdev_app/main.c  |  2 +-
 lib/bbdev/rte_bbdev.c  |  8 +---
 lib/bbdev/rte_bbdev_op.h   | 14 --
 7 files changed, 25 insertions(+), 13 deletions(-)

diff --git a/app/test-bbdev/test_bbdev.c b/app/test-bbdev/test_bbdev.c
index ac06d7320a..65805977ae 100644
--- a/app/test-bbdev/test_bbdev.c
+++ b/app/test-bbdev/test_bbdev.c
@@ -521,7 +521,7 @@ test_bbdev_op_pool(void)
rte_mempool_free(mp);
 
TEST_ASSERT((mp = rte_bbdev_op_pool_create("Test_INV",
-   RTE_BBDEV_OP_TYPE_COUNT, size, cache_size, 0)) == NULL,
+   RTE_BBDEV_OP_TYPE_SIZE_MAX, size, cache_size, 0)) == 
NULL,
"Failed test for rte_bbdev_op_pool_create: "
"returned value is not NULL for invalid type");
 
diff --git a/app/test-bbdev/test_bbdev_perf.c b/app/test-bbdev/test_bbdev_perf.c
index 311e5d1a96..f5eeb735b2 100644
--- a/app/test-bbdev/test_bbdev_perf.c
+++ b/app/test-bbdev/test_bbdev_perf.c
@@ -2429,13 +2429,13 @@ run_test_case_on_device(test_case_function 
*test_case_func, uint8_t dev_id,
 
/* Find capabilities */
const struct rte_bbdev_op_cap *cap = info.drv.capabilities;
-   for (i = 0; i < RTE_BBDEV_OP_TYPE_COUNT; i++) {
+   do {
if (cap->type == test_vector.op_type) {
capabilities = cap;
break;
}
cap++;
-   }
+   } while (cap->type != RTE_BBDEV_OP_NONE);
TEST_ASSERT_NOT_NULL(capabilities,
"Couldn't find capabilities");
 
diff --git a/doc/guides/rel_notes/deprecation.rst 
b/doc/guides/rel_notes/deprecation.rst
index a991fa14de..e35c86a25c 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -120,10 +120,7 @@ Deprecation Notices
   ``RTE_ETH_EVENT_IPSEC_SA_BYTE_HARD_EXPIRY`` and
   ``RTE_ETH_EVENT_IPSEC_SA_PKT_HARD_EXPIRY`` in DPDK 22.11.
 
-* bbdev: ``RTE_BBDEV_OP_TYPE_COUNT`` terminating the ``rte_bbdev_op_type``
-  enum will be deprecated and instead use fixed array size when required
-  to allow for future enum extension.
-  Will extend API to support new operation type ``RTE_BBDEV_OP_FFT`` as per
+* bbdev: Will extend API to support new operation type ``RTE_BBDEV_OP_FFT`` as 
per
   this `RFC `__.
   New members will be added in ``rte_bbdev_driver_info`` to expose
   PMD queue topology inspired by
diff --git a/doc/guides/rel_notes/release_22_11.rst 
b/doc/guides/rel_notes/release_22_11.rst
index 53fe21453c..e9db53f372 100644
--- a/doc/guides/rel_notes/release_22_11.rst
+++ b/doc/guides/rel_notes/release_22_11.rst
@@ -317,6 +317,9 @@ ABI Changes
 * eventdev: Added ``weight`` and ``affinity`` fields
   to ``rte_event_queue_conf`` structure.
 
+* bbdev: enum ``rte_bbdev_op_type`` was affected to remove 
``RTE_BBDEV_OP_TYPE_COUNT``
+  and to allow for futureproof enum insertion a padded 
``RTE_BBDEV_OP_TYPE_SIZE_MAX``
+  macro is added.
 
 Known Issues
 
diff --git a/examples/bbdev_app/main.c b/examples/bbdev_app/main.c
index fc7e8b8174..7e16e16bf8 100644
--- a/examples/bbdev_app/main.c
+++ b/examples/bbdev_app/main.c
@@ -1041,7 +1041,7 @@ main(int argc, char **argv)
void *sigret;
struct app_config_params app_params = def_app_config;
struct rte_mempool *ethdev_mbuf_mempool, *bbdev_mbuf_mempool;
-   struct rte_mempool *bbdev_op_pools[RTE_BBDEV_OP_TYPE_COUNT];
+   struct rte_mempool *bbdev_op_pools[RTE_BBDEV_OP_TYPE_SIZE_MAX];
struct lcore_conf lcore_conf[RTE_MAX_LCORE] = { {0} };
struct lcore_statistics lcore_stats[RTE_MAX_LCORE] = { {0} };
struct stats_lcore_params stats_lcore;
diff --git a/lib/bbdev/rte_bbdev.c b/lib/bbdev/rte_bbdev.c
index aaee7b7872..4da80472a8 100644
--- a/lib/bbdev/rte_bbdev.c
+++ b/lib/bbdev/rte_bbdev.c
@@ -23,6 +23,8 @@
 
 #define DEV_NAME "BBDEV"
 
+/* Number of supported operation types */
+#define BBDEV_OP_TYPE_COUNT 5
 
 /* BBDev library logging ID */
 RTE_LOG_REGISTER_DEFAULT(bbdev_logtype, NOTICE);
@@ -890,10 +892,10 @@ rte_bbdev_op_pool_create(const char *name, enum 
rte_bbdev_op_type type,
return NULL;
}
 
-   if (type >= RTE_BBDEV_OP_TYPE_COUNT) {
+   if (type >= BBDEV_OP_TYPE_COUNT) {
rte_bbdev_log(ERR,
"Invalid op type (%u), should be less than %u",
-   

[PATCH v12 2/7] bbdev: add device status info

2022-10-04 Thread Nicolas Chautru
Added device status information, so that the PMD can expose
information related to the underlying accelerator device status.
Minor field-order change in the structure to fit into a padding hole.

Signed-off-by: Nicolas Chautru 
Acked-by: Mingshan Zhang 
Acked-by: Hemant Agrawal 
---
 doc/guides/rel_notes/deprecation.rst  |  3 --
 doc/guides/rel_notes/release_22_11.rst|  3 ++
 drivers/baseband/acc100/rte_acc100_pmd.c  |  1 +
 .../fpga_5gnr_fec/rte_fpga_5gnr_fec.c |  1 +
 drivers/baseband/fpga_lte_fec/fpga_lte_fec.c  |  1 +
 drivers/baseband/la12xx/bbdev_la12xx.c|  1 +
 drivers/baseband/null/bbdev_null.c|  1 +
 .../baseband/turbo_sw/bbdev_turbo_software.c  |  1 +
 lib/bbdev/rte_bbdev.c | 22 
 lib/bbdev/rte_bbdev.h | 35 +--
 lib/bbdev/rte_bbdev_op.h  |  2 +-
 lib/bbdev/version.map |  7 
 12 files changed, 72 insertions(+), 6 deletions(-)

diff --git a/doc/guides/rel_notes/deprecation.rst 
b/doc/guides/rel_notes/deprecation.rst
index e35c86a25c..3bf5a4a7bd 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -125,9 +125,6 @@ Deprecation Notices
   New members will be added in ``rte_bbdev_driver_info`` to expose
   PMD queue topology inspired by
   this `RFC `__.
-  New member will be added in ``rte_bbdev_driver_info`` to expose
-  the device status as per
-  this `RFC `__.
   This should be updated in DPDK 22.11.
 
 * cryptodev: Hide structures ``rte_cryptodev_sym_session`` and
diff --git a/doc/guides/rel_notes/release_22_11.rst 
b/doc/guides/rel_notes/release_22_11.rst
index e9db53f372..4a1a7bdc5e 100644
--- a/doc/guides/rel_notes/release_22_11.rst
+++ b/doc/guides/rel_notes/release_22_11.rst
@@ -321,6 +321,9 @@ ABI Changes
   and to allow for futureproof enum insertion a padded 
``RTE_BBDEV_OP_TYPE_SIZE_MAX``
   macro is added.
 
+* bbdev: Structure ``rte_bbdev_driver_info`` was updated to add new parameters
+  for device status using ``rte_bbdev_device_status``.
+
 Known Issues
 
 
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c 
b/drivers/baseband/acc100/rte_acc100_pmd.c
index e2d9409185..cdabc0f879 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -1061,6 +1061,7 @@ acc100_dev_info_get(struct rte_bbdev *dev,
 
/* Read and save the populated config from ACC100 registers */
fetch_acc100_config(dev);
+   dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
 
/* This isn't ideal because it reports the maximum number of queues but
 * does not provide info on how many can be uplink/downlink or different
diff --git a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c 
b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
index 51dd090c1b..3c36d09730 100644
--- a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
+++ b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
@@ -369,6 +369,7 @@ fpga_dev_info_get(struct rte_bbdev *dev,
dev_info->capabilities = bbdev_capabilities;
dev_info->cpu_flag_reqs = NULL;
dev_info->data_endianness = RTE_LITTLE_ENDIAN;
+   dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
 
/* Calculates number of queues assigned to device */
dev_info->max_num_queues = 0;
diff --git a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c 
b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
index 036579e3ec..67b44992b2 100644
--- a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
+++ b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
@@ -645,6 +645,7 @@ fpga_dev_info_get(struct rte_bbdev *dev,
dev_info->capabilities = bbdev_capabilities;
dev_info->cpu_flag_reqs = NULL;
dev_info->data_endianness = RTE_LITTLE_ENDIAN;
+   dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
 
/* Calculates number of queues assigned to device */
dev_info->max_num_queues = 0;
diff --git a/drivers/baseband/la12xx/bbdev_la12xx.c 
b/drivers/baseband/la12xx/bbdev_la12xx.c
index 5d090c62a0..11a385ef56 100644
--- a/drivers/baseband/la12xx/bbdev_la12xx.c
+++ b/drivers/baseband/la12xx/bbdev_la12xx.c
@@ -101,6 +101,7 @@ la12xx_info_get(struct rte_bbdev *dev __rte_unused,
dev_info->capabilities = bbdev_capabilities;
dev_info->cpu_flag_reqs = NULL;
dev_info->min_alignment = 64;
+   dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
 
rte_bbdev_log_debug("got device info from %u", dev->data->dev_id);
 }
diff --git a/drivers/baseband/null/bbdev_null.c 
b/drivers/baseband/null/bbdev_null.c
index 28a0cb5d4e..662663c0c8 100644
--- a/drivers/baseband/null/bbdev_null.c
+++ b/drivers/baseband/null/bbdev_null.c
@@ -83,6 +83,7 @@ info_get(struct rte_bbdev *dev, struct rte_bbdev_driver_info 
*dev_info)
 

[PATCH v12 3/7] bbdev: add device info on queue topology

2022-10-04 Thread Nicolas Chautru
Adding more fields in the API to expose the number of queues
supported per operation type and the related priority levels.

Signed-off-by: Nicolas Chautru 
Acked-by: Maxime Coquelin 
---
 doc/guides/rel_notes/deprecation.rst   | 3 ---
 doc/guides/rel_notes/release_22_11.rst | 2 +-
 lib/bbdev/rte_bbdev.h  | 6 +-
 3 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/doc/guides/rel_notes/deprecation.rst 
b/doc/guides/rel_notes/deprecation.rst
index 3bf5a4a7bd..b6485019d2 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -122,9 +122,6 @@ Deprecation Notices
 
 * bbdev: Will extend API to support new operation type ``RTE_BBDEV_OP_FFT`` as per
   this `RFC `__.
-  New members will be added in ``rte_bbdev_driver_info`` to expose
-  PMD queue topology inspired by
-  this `RFC `__.
   This should be updated in DPDK 22.11.
 
 * cryptodev: Hide structures ``rte_cryptodev_sym_session`` and
diff --git a/doc/guides/rel_notes/release_22_11.rst b/doc/guides/rel_notes/release_22_11.rst
index 4a1a7bdc5e..0b4e28f416 100644
--- a/doc/guides/rel_notes/release_22_11.rst
+++ b/doc/guides/rel_notes/release_22_11.rst
@@ -322,7 +322,7 @@ ABI Changes
   macro is added.
 
 * bbdev: Structure ``rte_bbdev_driver_info`` was updated to add new parameters
-  for device status using ``rte_bbdev_device_status``.
+  for queue topology, device status using ``rte_bbdev_device_status``.
 
 Known Issues
 
diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h
index 3c428c14e9..4228b4550f 100644
--- a/lib/bbdev/rte_bbdev.h
+++ b/lib/bbdev/rte_bbdev.h
@@ -289,6 +289,10 @@ struct rte_bbdev_driver_info {
 
/** Maximum number of queues supported by the device */
unsigned int max_num_queues;
+   /** Maximum number of queues supported per operation type */
+   unsigned int num_queues[RTE_BBDEV_OP_TYPE_SIZE_MAX];
+   /** Priority level supported per operation type */
+   unsigned int queue_priority[RTE_BBDEV_OP_TYPE_SIZE_MAX];
/** Queue size limit (queue size must also be power of 2) */
uint32_t queue_size_lim;
/** Set if device off-loads operation to hardware  */
@@ -851,7 +855,7 @@ rte_bbdev_queue_intr_ctl(uint16_t dev_id, uint16_t queue_id, int epfd, int op,
  *   Device status as enum.
  *
  * @returns
- *   Operation type as string or NULL if op_type is invalid.
+ *   Device status as string or NULL if invalid.
  *
  */
 __rte_experimental
-- 
2.37.1



[PATCH v12 4/7] drivers/baseband: update PMDs to expose queue per operation

2022-10-04 Thread Nicolas Chautru
Add support in existing bbdev PMDs for the explicit number of queues
and priority for each operation type configured on the device.

Signed-off-by: Nicolas Chautru 
Acked-by: Maxime Coquelin 
Acked-by: Hemant Agrawal 
---
 drivers/baseband/acc100/rte_acc100_pmd.c  | 29 +++
 .../fpga_5gnr_fec/rte_fpga_5gnr_fec.c |  8 +
 drivers/baseband/fpga_lte_fec/fpga_lte_fec.c  |  8 +
 drivers/baseband/la12xx/bbdev_la12xx.c|  7 +
 .../baseband/turbo_sw/bbdev_turbo_software.c  | 12 
 5 files changed, 52 insertions(+), 12 deletions(-)

diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c b/drivers/baseband/acc100/rte_acc100_pmd.c
index cdabc0f879..10272fd149 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -967,6 +967,7 @@ acc100_dev_info_get(struct rte_bbdev *dev,
struct rte_bbdev_driver_info *dev_info)
 {
struct acc100_device *d = dev->data->dev_private;
+   int i;
 
static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
{
@@ -1063,19 +1064,23 @@ acc100_dev_info_get(struct rte_bbdev *dev,
fetch_acc100_config(dev);
dev_info->device_status = RTE_BBDEV_DEV_NOT_SUPPORTED;
 
-   /* This isn't ideal because it reports the maximum number of queues but
-* does not provide info on how many can be uplink/downlink or different
-* priorities
-*/
-   dev_info->max_num_queues =
-   d->acc100_conf.q_dl_5g.num_aqs_per_groups *
-   d->acc100_conf.q_dl_5g.num_qgroups +
-   d->acc100_conf.q_ul_5g.num_aqs_per_groups *
-   d->acc100_conf.q_ul_5g.num_qgroups +
-   d->acc100_conf.q_dl_4g.num_aqs_per_groups *
-   d->acc100_conf.q_dl_4g.num_qgroups +
-   d->acc100_conf.q_ul_4g.num_aqs_per_groups *
+   /* Expose number of queues */
+   dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
+   dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] = d->acc100_conf.q_ul_4g.num_aqs_per_groups *
+           d->acc100_conf.q_ul_4g.num_qgroups;
+   dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = d->acc100_conf.q_dl_4g.num_aqs_per_groups *
+           d->acc100_conf.q_dl_4g.num_qgroups;
+   dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = d->acc100_conf.q_ul_5g.num_aqs_per_groups *
+           d->acc100_conf.q_ul_5g.num_qgroups;
+   dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = d->acc100_conf.q_dl_5g.num_aqs_per_groups *
+           d->acc100_conf.q_dl_5g.num_qgroups;
+   dev_info->queue_priority[RTE_BBDEV_OP_TURBO_DEC] = d->acc100_conf.q_ul_4g.num_qgroups;
+   dev_info->queue_priority[RTE_BBDEV_OP_TURBO_ENC] = d->acc100_conf.q_dl_4g.num_qgroups;
+   dev_info->queue_priority[RTE_BBDEV_OP_LDPC_DEC] = d->acc100_conf.q_ul_5g.num_qgroups;
+   dev_info->queue_priority[RTE_BBDEV_OP_LDPC_ENC] = d->acc100_conf.q_dl_5g.num_qgroups;
+   dev_info->max_num_queues = 0;
+   for (i = RTE_BBDEV_OP_TURBO_DEC; i <= RTE_BBDEV_OP_LDPC_ENC; i++)
+   dev_info->max_num_queues += dev_info->num_queues[i];
dev_info->queue_size_lim = ACC100_MAX_QUEUE_DEPTH;
dev_info->hardware_accelerated = true;
dev_info->max_dl_queue_priority =
diff --git a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
index 3c36d09730..d520d5238f 100644
--- a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
+++ b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
@@ -379,6 +379,14 @@ fpga_dev_info_get(struct rte_bbdev *dev,
if (hw_q_id != FPGA_INVALID_HW_QUEUE_ID)
dev_info->max_num_queues++;
}
+   /* Expose number of queues per operation type */
+   dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
+   dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] = 0;
+   dev_info->num_queues[RTE_BBDEV_OP_TURBO_ENC] = 0;
+   dev_info->num_queues[RTE_BBDEV_OP_LDPC_DEC] = dev_info->max_num_queues / 2;
+   dev_info->num_queues[RTE_BBDEV_OP_LDPC_ENC] = dev_info->max_num_queues / 2;
+   dev_info->queue_priority[RTE_BBDEV_OP_LDPC_DEC] = 1;
+   dev_info->queue_priority[RTE_BBDEV_OP_LDPC_ENC] = 1;
 }
 
 /**
diff --git a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
index 67b44992b2..fc86f13bee 100644
--- a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
+++ b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
@@ -655,6 +655,14 @@ fpga_dev_info_get(struct rte_bbdev *dev,
if (hw_q_id != FPGA_INVALID_HW_QUEUE_ID)
dev_info->max_num_queues++;
}
+   /* Expose number of queues per operation type */
+   dev_info->num_queues[RTE_BBDEV_OP_NONE] = 0;
+   dev_info->num_queues[RTE_BBDEV_OP_TURBO_DEC] = dev_info->max_num_queues / 2;
+   dev_info-

[PATCH v12 6/7] bbdev: add queue related warning and status information

2022-10-04 Thread Nicolas Chautru
This allows exposing more information on any
queue-related failure and warning which cannot be reported
through the existing API.

Signed-off-by: Nicolas Chautru 
Acked-by: Maxime Coquelin 
---
 app/test-bbdev/test_bbdev_perf.c   |  2 ++
 doc/guides/rel_notes/release_22_11.rst |  3 ++
 lib/bbdev/rte_bbdev.c  | 19 
 lib/bbdev/rte_bbdev.h  | 43 ++
 lib/bbdev/version.map  |  1 +
 5 files changed, 68 insertions(+)

diff --git a/app/test-bbdev/test_bbdev_perf.c b/app/test-bbdev/test_bbdev_perf.c
index f5eeb735b2..75f1ca4f14 100644
--- a/app/test-bbdev/test_bbdev_perf.c
+++ b/app/test-bbdev/test_bbdev_perf.c
@@ -4361,6 +4361,8 @@ get_bbdev_queue_stats(uint16_t dev_id, uint16_t queue_id,
stats->dequeued_count = q_stats->dequeued_count;
stats->enqueue_err_count = q_stats->enqueue_err_count;
stats->dequeue_err_count = q_stats->dequeue_err_count;
+   stats->enqueue_warn_count = q_stats->enqueue_warn_count;
+   stats->dequeue_warn_count = q_stats->dequeue_warn_count;
stats->acc_offload_cycles = q_stats->acc_offload_cycles;
 
return 0;
diff --git a/doc/guides/rel_notes/release_22_11.rst b/doc/guides/rel_notes/release_22_11.rst
index edc50e5647..c55fb2a861 100644
--- a/doc/guides/rel_notes/release_22_11.rst
+++ b/doc/guides/rel_notes/release_22_11.rst
@@ -329,6 +329,9 @@ ABI Changes
 * bbdev: Structure ``rte_bbdev_driver_info`` was updated to add new parameters
   for queue topology, device status using ``rte_bbdev_device_status``.
 
+* bbdev: Structure ``rte_bbdev_queue_data`` was updated to add new parameter
+  for enqueue status using ``rte_bbdev_enqueue_status``.
+
 Known Issues
 
 
diff --git a/lib/bbdev/rte_bbdev.c b/lib/bbdev/rte_bbdev.c
index 9d65ba8cd3..bdd7c2f00d 100644
--- a/lib/bbdev/rte_bbdev.c
+++ b/lib/bbdev/rte_bbdev.c
@@ -721,6 +721,8 @@ get_stats_from_queues(struct rte_bbdev *dev, struct rte_bbdev_stats *stats)
stats->dequeued_count += q_stats->dequeued_count;
stats->enqueue_err_count += q_stats->enqueue_err_count;
stats->dequeue_err_count += q_stats->dequeue_err_count;
+   stats->enqueue_warn_count += q_stats->enqueue_warn_count;
+   stats->dequeue_warn_count += q_stats->dequeue_warn_count;
}
rte_bbdev_log_debug("Got stats on %u", dev->data->dev_id);
 }
@@ -1163,3 +1165,20 @@ rte_bbdev_device_status_str(enum rte_bbdev_device_status status)
rte_bbdev_log(ERR, "Invalid device status");
return NULL;
 }
+
+const char *
+rte_bbdev_enqueue_status_str(enum rte_bbdev_enqueue_status status)
+{
+   static const char * const enq_sta_string[] = {
+   "RTE_BBDEV_ENQ_STATUS_NONE",
+   "RTE_BBDEV_ENQ_STATUS_QUEUE_FULL",
+   "RTE_BBDEV_ENQ_STATUS_RING_FULL",
+   "RTE_BBDEV_ENQ_STATUS_INVALID_OP",
+   };
+
+   if (status < sizeof(enq_sta_string) / sizeof(char *))
+   return enq_sta_string[status];
+
+   rte_bbdev_log(ERR, "Invalid enqueue status");
+   return NULL;
+}
diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h
index 68f18fbb43..c2b0106067 100644
--- a/lib/bbdev/rte_bbdev.h
+++ b/lib/bbdev/rte_bbdev.h
@@ -35,6 +35,13 @@ extern "C" {
 #define RTE_BBDEV_MAX_DEVS 128  /**< Max number of devices */
 #endif
 
+/*
+ * Maximum size to be used to manage the enum rte_bbdev_enqueue_status
+ * including padding for future enum insertion.
+ * The enum values must be explicitly kept smaller or equal to this padded maximum size.
+ */
+#define RTE_BBDEV_ENQ_STATUS_SIZE_MAX 6
+
 /** Flags indicate current state of BBDEV device */
 enum rte_bbdev_state {
RTE_BBDEV_UNUSED,
@@ -223,6 +230,21 @@ rte_bbdev_queue_start(uint16_t dev_id, uint16_t queue_id);
 int
 rte_bbdev_queue_stop(uint16_t dev_id, uint16_t queue_id);
 
+/**
+ * Flags indicate the reason why a previous enqueue may not have
+ * consumed all requested operations.
+ * In case of multiple reasons the latter supersedes a previous one.
+ * The related macro RTE_BBDEV_ENQ_STATUS_SIZE_MAX can be used as an absolute maximum,
+ * notably for sizing arrays, while allowing for future enumeration insertion.
+ */
+enum rte_bbdev_enqueue_status {
+   RTE_BBDEV_ENQ_STATUS_NONE, /**< Nothing to report */
+   RTE_BBDEV_ENQ_STATUS_QUEUE_FULL,   /**< Not enough room in queue */
+   RTE_BBDEV_ENQ_STATUS_RING_FULL,/**< Not enough room in ring */
+   RTE_BBDEV_ENQ_STATUS_INVALID_OP,   /**< Operation was rejected as invalid */
+   /* Note: RTE_BBDEV_ENQ_STATUS_SIZE_MAX must be larger or equal to maximum enum value */
+};
+
 /**
  * Flags indicate the status of the device
  */
@@ -246,6 +268,12 @@ struct rte_bbdev_stats {
uint64_t enqueue_err_count;
/** Total error count on operations dequeued */
uint64_t dequeue_err_count;
+   /** Total warning count 

[PATCH v12 5/7] bbdev: add new operation for FFT processing

2022-10-04 Thread Nicolas Chautru
Extension of bbdev operation to support FFT based operations.

Signed-off-by: Nicolas Chautru 
Acked-by: Hemant Agrawal 
Acked-by: Maxime Coquelin 
---
 doc/guides/prog_guide/bbdev.rst| 103 +
 doc/guides/rel_notes/deprecation.rst   |   4 -
 doc/guides/rel_notes/release_22_11.rst |   5 +
 lib/bbdev/rte_bbdev.c  |  10 +-
 lib/bbdev/rte_bbdev.h  |  76 +
 lib/bbdev/rte_bbdev_op.h   | 149 +
 lib/bbdev/version.map  |   4 +
 7 files changed, 346 insertions(+), 5 deletions(-)

diff --git a/doc/guides/prog_guide/bbdev.rst b/doc/guides/prog_guide/bbdev.rst
index 70fa01ada5..1c7eb24148 100644
--- a/doc/guides/prog_guide/bbdev.rst
+++ b/doc/guides/prog_guide/bbdev.rst
@@ -1118,6 +1118,109 @@ Figure :numref:`figure_turbo_tb_decode` above
 showing the Turbo decoding of CBs using BBDEV interface in TB-mode
 is also valid for LDPC decode.
 
+BBDEV FFT Operation
+~~~~~~~~~~~~~~~~~~~
+
+This operation allows running a combination of DFT and/or IDFT and/or time-domain windowing.
+These can be used in a modular fashion (using bypass modes) or as a processing pipeline
+for FFT-based baseband signal processing.
+In more detail, it allows:
+- to process the data first through an IDFT of adjustable size and padding;
+- to perform the windowing as a programmable cyclic shift offset of the data followed by a
+  pointwise multiplication by a time-domain window;
+- to process the related data through a DFT of adjustable size and de-padding for each such
+  cyclic shift output.
+
+A flexible number of Rx antennas are processed in parallel with the same configuration.
+More generally, the API allows for flexibility in what the PMD may support (capability flags)
+and for adjusting some of the parameters of the processing.
+
+The operation/capability flags that can be set for each FFT operation are given below.
+
+  **NOTE:** The actual operation flags that may be used with a specific
+  BBDEV PMD are dependent on the driver capabilities as reported via
+  ``rte_bbdev_info_get()``, and may be a subset of those below.
+
++----------------------------------------------------------------+
+|Description of FFT capability flags                             |
++================================================================+
+|RTE_BBDEV_FFT_WINDOWING                                         |
+| Set to enable/support windowing in time domain                 |
++----------------------------------------------------------------+
+|RTE_BBDEV_FFT_CS_ADJUSTMENT                                     |
+| Set to enable/support the cyclic shift time offset adjustment  |
++----------------------------------------------------------------+
+|RTE_BBDEV_FFT_DFT_BYPASS                                        |
+| Set to bypass the DFT and use directly the IDFT as an option   |
++----------------------------------------------------------------+
+|RTE_BBDEV_FFT_IDFT_BYPASS                                       |
+| Set to bypass the IDFT and use directly the DFT as an option   |
++----------------------------------------------------------------+
+|RTE_BBDEV_FFT_WINDOWING_BYPASS                                  |
+| Set to bypass the time domain windowing as an option           |
++----------------------------------------------------------------+
+|RTE_BBDEV_FFT_POWER_MEAS                                        |
+| Set to provide an optional power measurement of the DFT output |
++----------------------------------------------------------------+
+|RTE_BBDEV_FFT_FP16_INPUT                                        |
+| Set if the input data shall use FP16 format instead of INT16   |
++----------------------------------------------------------------+
+|RTE_BBDEV_FFT_FP16_OUTPUT                                       |
+| Set if the output data shall use FP16 format instead of INT16  |
++----------------------------------------------------------------+
+
+The FFT parameters are set out in the table below.
+
++-----------------+----------------------------------------------------------+
+|Parameter        |Description                                               |
++=================+==========================================================+
+|base_input       |input data                                                |
++-----------------+----------------------------------------------------------+
+|base_output      |output data                                               |
++-----------------+----------------------------------------------------------+
+|power_meas_output|optional output data with power measurement on DFT output |
++-----------------+----------------------------------------------------------+
