Re: [PATCH 1/1] dts: add binding to different drivers to TG node

2024-09-18 Thread Jeremy Spewock
On Mon, Sep 16, 2024 at 6:04 AM Juraj Linkeš  wrote:
>
>
>
> On 9. 9. 2024 17:55, Jeremy Spewock wrote:
> > On Mon, Sep 9, 2024 at 8:16 AM Juraj Linkeš  
> > wrote:
> >>
> >>
> >>
> >> On 12. 8. 2024 19:22, jspew...@iol.unh.edu wrote:
> >>> From: Jeremy Spewock 
> >>>
> >>> The DTS framework in its current state supports binding ports to
> >>> different drivers on the SUT node but not the TG node. The TG node
> >>> already has the information that it needs about the different drivers
> >>> that it has available in the configuration file, but it did not
> >>> previously have access to the devbind script, so it did not use that
> >>> information for anything.
> >>>
> >>> This patch moves the steps to copy the DPDK tarball into the node class
> >>> rather than the SUT node class, and calls this function on the TG node
> >>> as well as the SUT. It also moves the driver binding step into the Node
> >>> class and triggers the same pattern of binding to ports that existed on
> >>> the SUT on the TG.
> >>>
> >>
> >> This is a very inefficient way to do this. We'll have to build DPDK
> >> twice and that's very time consuming. I was thinking in terms of just
> >
> > This patch shouldn't be compiling DPDK twice, are you referring to the
> > process of copying the tarball over and extracting it taking too long?
> > If so, that makes sense that it takes longer than we need for this one
> > task. I figured it wouldn't hurt to have the whole DPDK directory
> > there, and that it could even be potentially useful to have it if the
> > TG ever needed it. That and it seemed like the most straightforward
> > way that kept these two set up in a similar way. Extracting the
> > tarball is obviously pretty quick, so I guess the real question here
> > is whether it is fine to add the time of one extra SCP of the DPDK
> > tarball around.
> >
>
> Ah, I didn't look carefully at the split. This is fine, but there are some
> things I noticed.
>
> As Patrick mentioned, the docstrings in Node.set_up_build_target() and
> SutNode.set_up_build_target() would need to be updated.
> Why are we binding ports on the TG node?

I figured that the assumption would be that whatever is in the config
file is what the TG needs to be bound to in order to run the testing,
similarly to how we always bind on the SUT assuming that we need to be
using the DPDK driver to test DPDK.

> This shouldn't really be part of set_up_build_target; set_up_test_run is
> a better place to put this, as we don't need to copy it for each build
> target. And, as I realized when thinking about the property (down
> below), we don't need to do that even per test_run; once per TG node's
> lifetime is enough.

That makes sense to me actually considering the traffic generator
being used cannot change throughout the node's lifetime. I will make
that change.

>
> >> copying the script to the TG node and storing its location on the TG
> >> node. We should have access to the script whether DTS is run from the
> >> repository or a tarball.
> >
> > We should have access to it regardless, but extracting only that one
> > script would be different based on if it was a tarball or a repository
> > since, I believe at least, I would have to use the tarfile library to
> > read and extract only this one file to copy over if it was a tarball.
> > It would be faster I assume, so if you think it is worth it I could
> > make the change. Unless you are saying that we wouldn't need to take
> > the devbind script from the tarball that is passed into the DTS run
> > and instead assume that we can just go one directory up from `dts/` on
> > the runner host. That could be an interesting idea which would be
> > faster, but I wasn't sure if that was something that was fine to do
> > since (I don't think at least) there is anything that technically ties
> > you to running from in a DPDK directory other than the docker
> > container.
>
> You can run DTS from any directory, but currently DTS is always going
> to be in a DPDK tree (there's no other way to get DTS), so I think it's
> safe to assume the script is there. We can put a variable pointing to
> dpdk_root into utils.py and use that.
>

Fair enough, I don't see why it would be run outside of the DPDK
directory in most cases. There is one situation where it could happen:
the runner target for the Docker container copies only the DTS
directory into the container when it "bakes DTS in." It does so
because it really only needs the .toml file for the poetry install,
and realistically you should normally be mounting your local DPDK
directory over the DTS directory, but maybe that isn't super clear.
Out of scope for this patch, but just something to note.
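As a rough illustration of the dpdk_root idea discussed above, a utils.py variable could be derived from the module's own location. This is only a sketch under the assumption that the module lives at dts/framework/ inside the repository; the names are hypothetical, not the actual DTS framework API.

```python
from pathlib import Path


def dpdk_root_from(module_path: str) -> Path:
    """Derive the DPDK repository root from a module under dts/framework/.

    dts/framework/utils.py -> parents: framework, dts, <dpdk root>.
    """
    return Path(module_path).resolve().parents[2]


def devbind_script_path(module_path: str) -> Path:
    """Locate the devbind script relative to the derived DPDK root."""
    return dpdk_root_from(module_path) / "usertools" / "dpdk-devbind.py"
```

In the real framework this would presumably be computed once from `__file__` and exposed as a module-level constant.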

> My idea was copying that one file, nothing else (no tarball or anything
> would be needed).
> I think we'd only need to move _remote_tmp_dir and
> _path_to_devbind_script to Node and then implement set_up_test_run() on
> the TG node to copy just the script (with self.main_session.copy
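The copy-once-per-node-lifetime idea suggested above might be sketched as follows. All names here (`copy_to`, `set_up_test_run`, the attributes) are assumptions for illustration, not the real DTS framework API.

```python
from pathlib import Path


class TGNodeSketch:
    """Hypothetical sketch: copy only dpdk-devbind.py to the TG node, once."""

    def __init__(self, session, devbind_script: Path, remote_tmp_dir: str):
        self.main_session = session
        self._local_devbind = devbind_script
        self._path_to_devbind_script = f"{remote_tmp_dir}/dpdk-devbind.py"
        self._devbind_copied = False

    def set_up_test_run(self) -> None:
        # Guard so that repeated test runs do not re-copy the script;
        # the copy happens once per node lifetime.
        if not self._devbind_copied:
            self.main_session.copy_to(self._local_devbind, self._path_to_devbind_script)
            self._devbind_copied = True
```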

[PATCH v3] dts: correct typos in user config docstrings

2024-09-18 Thread Dean Marx
Correct a docstring error in conf.yaml showing an incorrect
example PCI address for TG nodes.

Fixes: 55442c14297c ("dts: improve documentation")

Signed-off-by: Dean Marx 
Reviewed-by: Nicholas Pratte 
Reviewed-by: Luca Vizzarro 
Reviewed-by: Jeremy Spewock 
Reviewed-by: Juraj Linkeš 
---
 dts/conf.yaml | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/dts/conf.yaml b/dts/conf.yaml
index 7d95016e68..ca5e87636e 100644
--- a/dts/conf.yaml
+++ b/dts/conf.yaml
@@ -39,13 +39,13 @@ nodes:
         number_of: 256
         force_first_numa: false
     ports:
-      # sets up the physical link between "SUT 1"@000:00:08.0 and "TG 1"@0000:00:08.0
+      # sets up the physical link between "SUT 1"@0000:00:08.0 and "TG 1"@0000:00:08.0
       - pci: "0000:00:08.0"
         os_driver_for_dpdk: vfio-pci # OS driver that DPDK will use
         os_driver: i40e  # OS driver to bind when the tests are not running
         peer_node: "TG 1"
         peer_pci: "0000:00:08.0"
-      # sets up the physical link between "SUT 1"@000:00:08.1 and "TG 1"@0000:00:08.1
+      # sets up the physical link between "SUT 1"@0000:00:08.1 and "TG 1"@0000:00:08.1
       - pci: "0000:00:08.1"
         os_driver_for_dpdk: vfio-pci
         os_driver: i40e
@@ -59,13 +59,13 @@ nodes:
     arch: x86_64
     os: linux
     ports:
-      # sets up the physical link between "TG 1"@000:00:08.0 and "SUT 1"@0000:00:08.0
+      # sets up the physical link between "TG 1"@0000:00:08.0 and "SUT 1"@0000:00:08.0
       - pci: "0000:00:08.0"
         os_driver_for_dpdk: rdma
         os_driver: rdma
         peer_node: "SUT 1"
         peer_pci: "0000:00:08.0"
-      # sets up the physical link between "SUT 1"@000:00:08.0 and "TG 1"@0000:00:08.0
+      # sets up the physical link between "SUT 1"@0000:00:08.0 and "TG 1"@0000:00:08.0
       - pci: "0000:00:08.1"
         os_driver_for_dpdk: rdma
         os_driver: rdma
-- 
2.44.0



[PATCH] net/gve: add ptype parsing to DQ format

2024-09-18 Thread Joshua Washington
Currently, the packet type is parsed as part of adding the
checksum-related ol_flags for a received packet, but the parsed
information is not added to the mbuf.

This change adds the parsed ptypes to the mbuf and updates the RX
checksum validation to rely on the mbuf instead of re-capturing the
ptype from the descriptor. This helps with compatibility with programs
which rely on the packet type value stored in the mbuf.

Signed-off-by: Joshua Washington 
Reviewed-by: Rushil Gupta 
---
 drivers/net/gve/gve_rx_dqo.c | 62 +---
 1 file changed, 44 insertions(+), 18 deletions(-)

diff --git a/drivers/net/gve/gve_rx_dqo.c b/drivers/net/gve/gve_rx_dqo.c
index 5efcce3312..d8e9eee4a8 100644
--- a/drivers/net/gve/gve_rx_dqo.c
+++ b/drivers/net/gve/gve_rx_dqo.c
@@ -6,6 +6,7 @@
 
 #include "gve_ethdev.h"
 #include "base/gve_adminq.h"
+#include "rte_mbuf_ptype.h"
 
 static inline void
 gve_rx_refill_dqo(struct gve_rx_queue *rxq)
@@ -75,38 +76,63 @@ gve_rx_refill_dqo(struct gve_rx_queue *rxq)
rxq->bufq_tail = next_avail;
 }
 
-static inline uint16_t
-gve_parse_csum_ol_flags(volatile struct gve_rx_compl_desc_dqo *rx_desc,
-   struct gve_priv *priv) {
-   uint64_t ol_flags = 0;
-   struct gve_ptype ptype =
-   priv->ptype_lut_dqo->ptypes[rx_desc->packet_type];
-
+static inline void
+gve_parse_csum_ol_flags(struct rte_mbuf *rx_mbuf,
+   volatile struct gve_rx_compl_desc_dqo *rx_desc)
+{
if (!rx_desc->l3_l4_processed)
-   return ol_flags;
+   return;
 
-   if (ptype.l3_type == GVE_L3_TYPE_IPV4) {
+   if (rx_mbuf->packet_type & RTE_PTYPE_L3_IPV4) {
if (rx_desc->csum_ip_err)
-   ol_flags |= RTE_MBUF_F_RX_IP_CKSUM_BAD;
+   rx_mbuf->ol_flags |= RTE_MBUF_F_RX_IP_CKSUM_BAD;
else
-   ol_flags |= RTE_MBUF_F_RX_IP_CKSUM_GOOD;
+   rx_mbuf->ol_flags |= RTE_MBUF_F_RX_IP_CKSUM_GOOD;
}
 
if (rx_desc->csum_l4_err) {
-   ol_flags |= RTE_MBUF_F_RX_L4_CKSUM_BAD;
-   return ol_flags;
+   rx_mbuf->ol_flags |= RTE_MBUF_F_RX_L4_CKSUM_BAD;
+   return;
}
+   if (rx_mbuf->packet_type & RTE_PTYPE_L4_MASK)
+   rx_mbuf->ol_flags |= RTE_MBUF_F_RX_L4_CKSUM_GOOD;
+}
+
+static inline void
+gve_rx_set_mbuf_ptype(struct gve_priv *priv, struct rte_mbuf *rx_mbuf,
+ volatile struct gve_rx_compl_desc_dqo *rx_desc)
+{
+   struct gve_ptype ptype =
+   priv->ptype_lut_dqo->ptypes[rx_desc->packet_type];
+   rx_mbuf->packet_type = 0;
+
+   switch (ptype.l3_type) {
+   case GVE_L3_TYPE_IPV4:
+   rx_mbuf->packet_type |= RTE_PTYPE_L3_IPV4;
+   break;
+   case GVE_L3_TYPE_IPV6:
+   rx_mbuf->packet_type |= RTE_PTYPE_L3_IPV6;
+   break;
+   default:
+   break;
+   }
+
switch (ptype.l4_type) {
case GVE_L4_TYPE_TCP:
+   rx_mbuf->packet_type |= RTE_PTYPE_L4_TCP;
+   break;
case GVE_L4_TYPE_UDP:
+   rx_mbuf->packet_type |= RTE_PTYPE_L4_UDP;
+   break;
case GVE_L4_TYPE_ICMP:
+   rx_mbuf->packet_type |= RTE_PTYPE_L4_ICMP;
+   break;
case GVE_L4_TYPE_SCTP:
-   ol_flags |= RTE_MBUF_F_RX_L4_CKSUM_GOOD;
+   rx_mbuf->packet_type |= RTE_PTYPE_L4_SCTP;
break;
default:
break;
}
-   return ol_flags;
 }
 
 uint16_t
@@ -158,9 +184,9 @@ gve_rx_burst_dqo(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
rxm->pkt_len = pkt_len;
rxm->data_len = pkt_len;
rxm->port = rxq->port_id;
-   rxm->ol_flags = 0;
-   rxm->ol_flags |= RTE_MBUF_F_RX_RSS_HASH |
-   gve_parse_csum_ol_flags(rx_desc, rxq->hw);
+   gve_rx_set_mbuf_ptype(rxq->hw, rxm, rx_desc);
+   rxm->ol_flags = RTE_MBUF_F_RX_RSS_HASH;
+   gve_parse_csum_ol_flags(rxm, rx_desc);
rxm->hash.rss = rte_le_to_cpu_32(rx_desc->hash);
 
rx_pkts[nb_rx++] = rxm;
-- 
2.46.0.662.g92d0881bb0-goog



[PATCH v15 0/1] dts: port over VLAN test suite

2024-09-18 Thread Dean Marx
Port over VLAN capabilities test suite from old DTS. This test
suite verifies that VLAN filtering, stripping, and header
insertion all function as expected. When a VLAN ID is in the
filter list, all packets with that ID should be forwarded
and all others should be dropped. While stripping is enabled,
packets sent with a VLAN ID should have the ID removed
and then be forwarded. Additionally, when header insertion
is enabled packets without a VLAN ID should have a specified
ID inserted and then be forwarded.
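At the byte level, the stripping and insertion behavior this suite verifies comes down to adding or removing a 4-byte 802.1Q tag after the MAC addresses. The following standalone Python sketch (not part of the suite, which drives testpmd with scapy packets instead) illustrates the frame layout being tested:

```python
import struct

TPID_8021Q = 0x8100  # EtherType that marks an 802.1Q VLAN tag


def insert_vlan_tag(frame: bytes, vlan_id: int, prio: int = 0) -> bytes:
    """Insert an 802.1Q tag after the dst/src MAC addresses (first 12 bytes)."""
    tci = (prio << 13) | (vlan_id & 0x0FFF)
    tag = struct.pack("!HH", TPID_8021Q, tci)
    return frame[:12] + tag + frame[12:]


def strip_vlan_tag(frame: bytes):
    """Remove the 802.1Q tag if present; return (frame, vlan_id or None)."""
    if len(frame) >= 16 and struct.unpack("!H", frame[12:14])[0] == TPID_8021Q:
        vlan_id = struct.unpack("!H", frame[14:16])[0] & 0x0FFF
        return frame[:12] + frame[16:], vlan_id
    return frame, None
```

A stripped-then-forwarded packet is simply the original frame with those 4 bytes removed, which is what the stripping test case checks for on the receive side.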

---
v13:
* Combined conf schema and test suite patches

v14:
* Reworded docstrings in suite
* Added flag checking to shell methods
* Fixed tx_vlan_reset method bug

v15:
* Rebased off next-dts

Dean Marx (1):
  dts: VLAN test suite implementation

 dts/framework/config/conf_yaml_schema.json |   3 +-
 dts/tests/TestSuite_vlan.py| 167 +
 2 files changed, 169 insertions(+), 1 deletion(-)
 create mode 100644 dts/tests/TestSuite_vlan.py

-- 
2.44.0



[PATCH v15 1/1] dts: VLAN test suite implementation

2024-09-18 Thread Dean Marx
Test suite for verifying VLAN filtering, stripping, and insertion
functionality on the Poll Mode Driver.

Depends-on: Patch-143966 ("dts: add VLAN methods to testpmd shell")

Signed-off-by: Dean Marx 
Reviewed-by: Jeremy Spewock 
---
 dts/framework/config/conf_yaml_schema.json |   3 +-
 dts/tests/TestSuite_vlan.py| 167 +
 2 files changed, 169 insertions(+), 1 deletion(-)
 create mode 100644 dts/tests/TestSuite_vlan.py

diff --git a/dts/framework/config/conf_yaml_schema.json b/dts/framework/config/conf_yaml_schema.json
index df390e8ae2..d437f4db36 100644
--- a/dts/framework/config/conf_yaml_schema.json
+++ b/dts/framework/config/conf_yaml_schema.json
@@ -187,7 +187,8 @@
   "enum": [
 "hello_world",
 "os_udp",
-"pmd_buffer_scatter"
+"pmd_buffer_scatter",
+"vlan"
   ]
 },
 "test_target": {
diff --git a/dts/tests/TestSuite_vlan.py b/dts/tests/TestSuite_vlan.py
new file mode 100644
index 00..7009c2c72b
--- /dev/null
+++ b/dts/tests/TestSuite_vlan.py
@@ -0,0 +1,167 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 University of New Hampshire
+
+"""Test the support of VLAN Offload Features by Poll Mode Drivers.
+
+This test suite verifies that VLAN filtering, stripping, and header insertion all
+function as expected. When a VLAN ID is in the filter list, all packets with that
+ID should be forwarded and all others should be dropped. While stripping is enabled,
+packets sent with a VLAN ID should have the ID removed and then be forwarded.
+Additionally, when header insertion is enabled packets without a
+VLAN ID should have a specified ID inserted and then be forwarded.
+
+"""
+
+from scapy.layers.l2 import Dot1Q, Ether  # type: ignore[import-untyped]
+from scapy.packet import Raw  # type: ignore[import-untyped]
+
+from framework.remote_session.testpmd_shell import SimpleForwardingModes, TestPmdShell
+from framework.test_suite import TestSuite
+
+
+class TestVlan(TestSuite):
+    """DPDK VLAN test suite.
+
+    Ensures VLAN packet reception, stripping, and insertion on the Poll Mode Driver
+    when the appropriate conditions are met. The suite contains four test cases:
+
+    1. VLAN reception no stripping - verifies that a vlan packet with a tag
+    within the filter list is received.
+    2. VLAN reception stripping - verifies that a vlan packet with a tag
+    within the filter list is received without the vlan tag.
+    3. VLAN no reception - verifies that a vlan packet with a tag not within
+    the filter list is dropped.
+    4. VLAN insertion - verifies that a non vlan packet is received with a vlan
+    tag when insertion is enabled.
+    """
+
+    def set_up_suite(self) -> None:
+        """Set up the test suite.
+
+        Setup:
+            Verify that at least two ports are open for session.
+        """
+        self.verify(len(self._port_links) > 1, "Not enough ports")
+
+    def send_vlan_packet_and_verify(self, should_receive: bool, strip: bool, vlan_id: int) -> None:
+        """Generate a vlan packet, send and verify packet with same payload is received on the dut.
+
+        Args:
+            should_receive: Indicate whether the packet should be successfully received.
+            strip: If :data:`False`, will verify received packets match the given VLAN ID,
+                otherwise verifies that the received packet has no VLAN ID
+                (as it has been stripped off.)
+            vlan_id: Expected vlan ID.
+        """
+        packet = Ether() / Dot1Q(vlan=vlan_id) / Raw(load="x")
+        received_packets = self.send_packet_and_capture(packet)
+        test_packet = None
+        for packet in received_packets:
+            if hasattr(packet, "load") and b"x" in packet.load:
+                test_packet = packet
+                break
+        if should_receive:
+            self.verify(
+                test_packet is not None, "Packet was dropped when it should have been received"
+            )
+            if test_packet is not None:
+                if strip:
+                    self.verify(
+                        not test_packet.haslayer(Dot1Q), "Vlan tag was not stripped successfully"
+                    )
+                else:
+                    self.verify(
+                        test_packet.vlan == vlan_id,
+                        "The received tag did not match the expected tag",
+                    )
+        else:
+            self.verify(
+                test_packet is None,
+                "Packet was received when it should have been dropped",
+            )
+
+    def send_packet_and_verify_insertion(self, expected_id: int) -> None:
+        """Generate a packet with no vlan tag, send and verify on the dut.
+
+        Args:
+            expected_id: The vlan id that is being inserted through tx_offload configuration.
+        """
+        packet = Ether() / Raw(load="x")
+        received_packets = self.send_pa

RE: [PATCH v2 1/3] eventdev: introduce event pre-scheduling

2024-09-18 Thread Pathak, Pravin



> -Original Message-
> From: pbhagavat...@marvell.com 
> Sent: Tuesday, September 17, 2024 3:11 AM
> To: jer...@marvell.com; sthot...@marvell.com; Sevincer, Abdullah
> ; hemant.agra...@nxp.com;
> sachin.sax...@oss.nxp.com; Van Haaren, Harry ;
> mattias.ronnb...@ericsson.com; lian...@liangbit.com; Mccarthy, Peter
> 
> Cc: dev@dpdk.org; Pavan Nikhilesh 
> Subject: [PATCH v2 1/3] eventdev: introduce event pre-scheduling
> 
> From: Pavan Nikhilesh 
> 
> Event pre-scheduling improves scheduling performance by assigning events to
> event ports in advance when dequeues are issued.
> The dequeue operation initiates the pre-schedule operation, which completes in
> parallel without affecting the dequeued event flow contexts and dequeue
> latency.
> 
Is the prescheduling done to get the event more quickly in the next dequeue?
The first dequeue executes pre-schedule to make events available for the next dequeue.
Is this how it is supposed to work?
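The behavior being asked about can be pictured with a toy Python model (purely conceptual, not the eventdev API): each dequeue returns the previously pre-scheduled event and kicks off pre-scheduling of the next one, so the very first dequeue only primes the pipeline.

```python
from collections import deque


class PrescheduleModel:
    """Toy model: dequeue returns the pre-scheduled event, then pre-schedules
    the next one so a later dequeue finds it already staged."""

    def __init__(self, events):
        self._pending = deque(events)
        self._prescheduled = None

    def dequeue(self):
        ev = self._prescheduled
        # The pre-schedule step runs "in parallel" in hardware; here it is
        # modeled as staging the next pending event.
        self._prescheduled = self._pending.popleft() if self._pending else None
        return ev
```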

> Event devices can indicate pre-scheduling capabilities using
> `RTE_EVENT_DEV_CAP_EVENT_PRESCHEDULE` and
> `RTE_EVENT_DEV_CAP_EVENT_PRESCHEDULE_ADAPTIVE` via the event device
> info function `info.event_dev_cap`.
> 
> Applications can select the pre-schedule type and configure it through
> `rte_event_dev_config.preschedule_type` during `rte_event_dev_configure`.
> 
> The supported pre-schedule types are:
>  * `RTE_EVENT_DEV_PRESCHEDULE_NONE` - No pre-scheduling.
>  * `RTE_EVENT_DEV_PRESCHEDULE` - Always issue a pre-schedule on dequeue.
>  * `RTE_EVENT_DEV_PRESCHEDULE_ADAPTIVE` - Delay issuing pre-schedule
> until
>there are no forward progress constraints with the held flow contexts.
> 
> Signed-off-by: Pavan Nikhilesh 
> ---
>  app/test/test_eventdev.c| 63 +
>  doc/guides/prog_guide/eventdev/eventdev.rst | 22 +++
>  lib/eventdev/rte_eventdev.h | 48 
>  3 files changed, 133 insertions(+)
> 
> diff --git a/app/test/test_eventdev.c b/app/test/test_eventdev.c index
> e4e234dc98..cf496ee88d 100644
> --- a/app/test/test_eventdev.c
> +++ b/app/test/test_eventdev.c
> @@ -1250,6 +1250,67 @@ test_eventdev_profile_switch(void)
>   return TEST_SUCCESS;
>  }
> 
> +static int
> +preschedule_test(rte_event_dev_preschedule_type_t preschedule_type,
> +const char *preschedule_name) {
> +#define NB_EVENTS 1024
> + uint64_t start, total;
> + struct rte_event ev;
> + int rc, cnt;
> +
> + ev.event_type = RTE_EVENT_TYPE_CPU;
> + ev.queue_id = 0;
> + ev.op = RTE_EVENT_OP_NEW;
> + ev.u64 = 0xBADF00D0;
> +
> + for (cnt = 0; cnt < NB_EVENTS; cnt++) {
> + ev.flow_id = cnt;
> + rc = rte_event_enqueue_burst(TEST_DEV_ID, 0, &ev, 1);
> + TEST_ASSERT(rc == 1, "Failed to enqueue event");
> + }
> +
> + RTE_SET_USED(preschedule_type);
> + total = 0;
> + while (cnt) {
> + start = rte_rdtsc_precise();
> + rc = rte_event_dequeue_burst(TEST_DEV_ID, 0, &ev, 1, 0);
> + if (rc) {
> + total += rte_rdtsc_precise() - start;
> + cnt--;
> + }
> + }
> + printf("Preschedule type : %s, avg cycles %" PRIu64 "\n",
> preschedule_name,
> +total / NB_EVENTS);
> +
> + return TEST_SUCCESS;
> +}
> +
> +static int
> +test_eventdev_preschedule_configure(void)
> +{
> + struct rte_event_dev_config dev_conf;
> + struct rte_event_dev_info info;
> + int rc;
> +
> + rte_event_dev_info_get(TEST_DEV_ID, &info);
> +
> + if ((info.event_dev_cap & RTE_EVENT_DEV_CAP_EVENT_PRESCHEDULE)
> == 0)
> + return TEST_SKIPPED;
> +
> + devconf_set_default_sane_values(&dev_conf, &info);
> + dev_conf.preschedule_type = RTE_EVENT_DEV_PRESCHEDULE;
> + rc = rte_event_dev_configure(TEST_DEV_ID, &dev_conf);
> + TEST_ASSERT_SUCCESS(rc, "Failed to configure eventdev");
> +
> + rc = preschedule_test(RTE_EVENT_DEV_PRESCHEDULE_NONE,
> "RTE_EVENT_DEV_PRESCHEDULE_NONE");
> + rc |= preschedule_test(RTE_EVENT_DEV_PRESCHEDULE,
> "RTE_EVENT_DEV_PRESCHEDULE");
> + if (info.event_dev_cap &
> RTE_EVENT_DEV_CAP_EVENT_PRESCHEDULE_ADAPTIVE)
> + rc |=
> preschedule_test(RTE_EVENT_DEV_PRESCHEDULE_ADAPTIVE,
> +
> "RTE_EVENT_DEV_PRESCHEDULE_ADAPTIVE");
> +
> + return rc;
> +}
> +
>  static int
>  test_eventdev_close(void)
>  {
> @@ -1310,6 +1371,8 @@ static struct unit_test_suite
> eventdev_common_testsuite  = {
>   test_eventdev_start_stop),
>   TEST_CASE_ST(eventdev_configure_setup,
> eventdev_stop_device,
>   test_eventdev_profile_switch),
> + TEST_CASE_ST(eventdev_configure_setup, NULL,
> + test_eventdev_preschedule_configure),
>   TEST_CASE_ST(eventdev_setup_device, eventdev_stop_device,
>   test_eventdev_link),
>   TEST_CASE_ST(eventdev_setup_device, eventdev_stop_device,
> diff --git

[PATCH v3] ethdev: optimize the activation of fast-path tracepoints

2024-09-18 Thread Adel Belkhiri
From: Adel Belkhiri 

Split the tracepoints rte_ethdev_trace_rx_burst and
rte_eth_trace_call_rx_callbacks into two separate ones
for empty and non-empty calls to avoid quickly saturating
the trace buffer.
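The motivation for the split can be illustrated with a toy tracer in plain Python (not the DPDK tracing API): busy polling produces mostly empty bursts, so giving empty polls their own, independently maskable and smaller record keeps them from flooding the buffer with full-size entries.

```python
def trace_rx_burst(trace_buffer, port_id, queue_id, nb_rx):
    """Toy split tracer: emit a distinct, smaller record for empty polls."""
    if nb_rx:
        trace_buffer.append(("rx_burst_nonempty", port_id, queue_id, nb_rx))
    else:
        # The empty record omits the packet count entirely; a tracer could
        # also disable this record without losing the non-empty ones.
        trace_buffer.append(("rx_burst_empty", port_id, queue_id))
```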

Signed-off-by: Adel Belkhiri 
---
 .mailmap   |  1 +
 doc/guides/rel_notes/release_24_11.rst |  2 ++
 lib/ethdev/ethdev_private.c|  8 ++--
 lib/ethdev/ethdev_trace_points.c   | 14 ++
 lib/ethdev/rte_ethdev.h|  5 -
 lib/ethdev/rte_ethdev_trace_fp.h   | 23 +--
 lib/ethdev/version.map |  7 ++-
 7 files changed, 50 insertions(+), 10 deletions(-)

diff --git a/.mailmap b/.mailmap
index 4a508bafad..e86241dced 100644
--- a/.mailmap
+++ b/.mailmap
@@ -16,6 +16,7 @@ Abraham Tovar 
 Adam Bynes 
 Adam Dybkowski 
 Adam Ludkiewicz 
+Adel Belkhiri 
 Adham Masarwah  
 Adrian Moreno 
 Adrian Pielech 
diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst
index 0ff70d9057..b7c3ac4054 100644
--- a/doc/guides/rel_notes/release_24_11.rst
+++ b/doc/guides/rel_notes/release_24_11.rst
@@ -68,6 +68,8 @@ Removed Items
Also, make sure to start the actual text at the margin.
===
 
+* ethdev: Removed the __rte_ethdev_trace_rx_burst symbol, as the corresponding
+  tracepoint was split into two separate ones for empty and non-empty calls.
 
 API Changes
 ---
diff --git a/lib/ethdev/ethdev_private.c b/lib/ethdev/ethdev_private.c
index 626524558a..eed8c78747 100644
--- a/lib/ethdev/ethdev_private.c
+++ b/lib/ethdev/ethdev_private.c
@@ -298,8 +298,12 @@ rte_eth_call_rx_callbacks(uint16_t port_id, uint16_t queue_id,
 		cb = cb->next;
 	}
 
-	rte_eth_trace_call_rx_callbacks(port_id, queue_id, (void **)rx_pkts,
-			nb_rx, nb_pkts);
+	if (unlikely(nb_rx))
+		rte_eth_trace_call_rx_callbacks_nonempty(port_id, queue_id, (void **)rx_pkts,
+				nb_rx, nb_pkts);
+	else
+		rte_eth_trace_call_rx_callbacks_empty(port_id, queue_id, (void **)rx_pkts,
+				nb_pkts);
 
 	return nb_rx;
 }
diff --git a/lib/ethdev/ethdev_trace_points.c b/lib/ethdev/ethdev_trace_points.c
index 99e04f5893..6ecbee289b 100644
--- a/lib/ethdev/ethdev_trace_points.c
+++ b/lib/ethdev/ethdev_trace_points.c
@@ -25,14 +25,20 @@ RTE_TRACE_POINT_REGISTER(rte_ethdev_trace_stop,
 RTE_TRACE_POINT_REGISTER(rte_ethdev_trace_close,
lib.ethdev.close)
 
-RTE_TRACE_POINT_REGISTER(rte_ethdev_trace_rx_burst,
-   lib.ethdev.rx.burst)
+RTE_TRACE_POINT_REGISTER(rte_ethdev_trace_rx_burst_empty,
+   lib.ethdev.rx.burst.empty)
+
+RTE_TRACE_POINT_REGISTER(rte_ethdev_trace_rx_burst_nonempty,
+   lib.ethdev.rx.burst.nonempty)
 
 RTE_TRACE_POINT_REGISTER(rte_ethdev_trace_tx_burst,
lib.ethdev.tx.burst)
 
-RTE_TRACE_POINT_REGISTER(rte_eth_trace_call_rx_callbacks,
-   lib.ethdev.call_rx_callbacks)
+RTE_TRACE_POINT_REGISTER(rte_eth_trace_call_rx_callbacks_empty,
+   lib.ethdev.call_rx_callbacks.empty)
+
+RTE_TRACE_POINT_REGISTER(rte_eth_trace_call_rx_callbacks_nonempty,
+   lib.ethdev.call_rx_callbacks.nonempty)
 
 RTE_TRACE_POINT_REGISTER(rte_eth_trace_call_tx_callbacks,
lib.ethdev.call_tx_callbacks)
diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index 548fada1c7..eef254c463 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -6132,7 +6132,10 @@ rte_eth_rx_burst(uint16_t port_id, uint16_t queue_id,
}
 #endif
 
-   rte_ethdev_trace_rx_burst(port_id, queue_id, (void **)rx_pkts, nb_rx);
+	if (unlikely(nb_rx))
+		rte_ethdev_trace_rx_burst_nonempty(port_id, queue_id, (void **)rx_pkts, nb_rx);
+	else
+		rte_ethdev_trace_rx_burst_empty(port_id, queue_id, (void **)rx_pkts);
return nb_rx;
 }
 
diff --git a/lib/ethdev/rte_ethdev_trace_fp.h b/lib/ethdev/rte_ethdev_trace_fp.h
index 40b6e4756b..d23865996a 100644
--- a/lib/ethdev/rte_ethdev_trace_fp.h
+++ b/lib/ethdev/rte_ethdev_trace_fp.h
@@ -18,7 +18,16 @@ extern "C" {
 #include 
 
 RTE_TRACE_POINT_FP(
-   rte_ethdev_trace_rx_burst,
+   rte_ethdev_trace_rx_burst_empty,
+   RTE_TRACE_POINT_ARGS(uint16_t port_id, uint16_t queue_id,
+   void **pkt_tbl),
+   rte_trace_point_emit_u16(port_id);
+   rte_trace_point_emit_u16(queue_id);
+   rte_trace_point_emit_ptr(pkt_tbl);
+)
+
+RTE_TRACE_POINT_FP(
+   rte_ethdev_trace_rx_burst_nonempty,
RTE_TRACE_POINT_ARGS(uint16_t port_id, uint16_t queue_id,
void **pkt_tbl, uint16_t nb_rx),
rte_trace_point_emit_u16(port_id);
@@ -38,7 +47,17 @@ RTE_TRACE_POINT_FP(
 )
 
 RTE_TRACE_POINT_FP(
-   rte_eth_trace_call_rx_callbacks,
+   rte_eth_trace_call_rx_callbacks_empty,
+   RTE_TRACE_POINT_ARGS(uint16_t p

[PATCH v2] dts: add package mode config and updated docs

2024-09-18 Thread Dean Marx
In the current DTS setup description, the user installs poetry
with the --no-root option. However, adding 'package-mode = false'
to the pyproject.toml sets the same configuration, and running
poetry install --no-root will become an error in a future
poetry version.

Signed-off-by: Dean Marx 
Reviewed-by: Nicholas Pratte 
Reviewed-by: Luca Vizzarro 
---
 doc/guides/tools/dts.rst| 6 +++---
 dts/.devcontainer/devcontainer.json | 2 +-
 dts/README.md   | 4 ++--
 dts/pyproject.toml  | 1 +
 4 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/doc/guides/tools/dts.rst b/doc/guides/tools/dts.rst
index 8008c9f74d..65cce9e5ed 100644
--- a/doc/guides/tools/dts.rst
+++ b/doc/guides/tools/dts.rst
@@ -92,7 +92,7 @@ Setting up DTS environment
 
.. code-block:: console
 
-  poetry install --no-root
+  poetry install
   poetry shell
 
 #. **SSH Connection**
@@ -449,8 +449,8 @@ The :ref:`doc build dependencies ` may be installed with Poetry
 
 .. code-block:: console
 
-   poetry install --no-root --only docs
-   poetry install --no-root --with docs  # an alternative that will also install DTS dependencies
+   poetry install --only docs
+   poetry install --with docs  # an alternative that will also install DTS dependencies
    poetry shell
 
 After executing the meson command, build the documentation with:
diff --git a/dts/.devcontainer/devcontainer.json b/dts/.devcontainer/devcontainer.json
index 4d737f1b40..d96b4fdab2 100644
--- a/dts/.devcontainer/devcontainer.json
+++ b/dts/.devcontainer/devcontainer.json
@@ -13,7 +13,7 @@
 	// "forwardPorts": [],
 
 	// The next line runs commands after the container is created - in our case, installing dependencies.
-	"postCreateCommand": "poetry install --no-root",
+	"postCreateCommand": "poetry install",
 
"extensions": [
"ms-python.vscode-pylance",
diff --git a/dts/README.md b/dts/README.md
index ee3fa1c968..2b3a7f89c5 100644
--- a/dts/README.md
+++ b/dts/README.md
@@ -37,7 +37,7 @@ to allow you to connect to hosts without specifying a password.
 ```shell
 docker build --target dev -t dpdk-dts .
 docker run -v $(pwd)/..:/dpdk -v /home/dtsuser/.ssh:/root/.ssh:ro -it dpdk-dts bash
-$ poetry install --no-root
+$ poetry install
 $ poetry shell
 ```
 
@@ -46,7 +46,7 @@ $ poetry shell
 ```shell
 docker build --target dev -t dpdk-dts .
 docker run -v $(pwd)/..:/dpdk -it dpdk-dts bash
-$ poetry install --no-root
+$ poetry install
 $ poetry shell
 ```
 
diff --git a/dts/pyproject.toml b/dts/pyproject.toml
index 38281f0e39..91d459f573 100644
--- a/dts/pyproject.toml
+++ b/dts/pyproject.toml
@@ -3,6 +3,7 @@
 # Copyright(c) 2023 PANTHEON.tech s.r.o.
 
 [tool.poetry]
+package-mode = false
 name = "dts"
 version = "0.1.0"
 description = "DPDK Test Suite."
-- 
2.44.0



[PATCH v2] dts: add VLAN methods to testpmd shell

2024-09-18 Thread Dean Marx
Added the following methods to the testpmd shell class:
vlan set filter on/off, rx vlan add/rm,
vlan set strip on/off, tx vlan set/reset,
set promisc/verbose

Fixes: 61d5bc9bf974 ("dts: add port info command to testpmd shell")

Signed-off-by: Dean Marx 
---
 dts/framework/remote_session/testpmd_shell.py | 175 +-
 1 file changed, 174 insertions(+), 1 deletion(-)

diff --git a/dts/framework/remote_session/testpmd_shell.py b/dts/framework/remote_session/testpmd_shell.py
index 8c228af39f..5c5e681841 100644
--- a/dts/framework/remote_session/testpmd_shell.py
+++ b/dts/framework/remote_session/testpmd_shell.py
@@ -102,7 +102,7 @@ def make_parser(cls) -> ParserFn:
                 r"strip (?P<STRIP>on|off), "
                 r"filter (?P<FILTER>on|off), "
                 r"extend (?P<EXTEND>on|off), "
-                r"qinq strip (?P<QINQ_STRIP>on|off)$",
+                r"qinq strip (?P<QINQ_STRIP>on|off)",
                 re.MULTILINE,
                 named=True,
             ),
@@ -982,6 +982,179 @@ def set_port_mtu_all(self, mtu: int, verify: bool = True) -> None:
         for port in self.ports:
             self.set_port_mtu(port.id, mtu, verify)
 
+    def vlan_filter_set(self, port: int, on: bool, verify: bool = True) -> None:
+        """Set vlan filter on.
+
+        Args:
+            port: The port number to enable VLAN filter on, should be within 0-32.
+            on: Sets filter on if :data:`True`, otherwise turns off.
+            verify: If :data:`True`, the output of the command and show port info
+                is scanned to verify that vlan filtering was enabled successfully.
+                If not, it is considered an error.
+
+        Raises:
+            InteractiveCommandExecutionError: If `verify` is :data:`True` and the filter
+                fails to update.
+        """
+        filter_cmd_output = self.send_command(f"vlan set filter {'on' if on else 'off'} {port}")
+        if verify:
+            vlan_settings = self.show_port_info(port_id=port).vlan_offload
+            if on ^ (vlan_settings is not None and VLANOffloadFlag.FILTER in vlan_settings):
+                self._logger.debug(f"Failed to set filter on port {port}: \n{filter_cmd_output}")
+                raise InteractiveCommandExecutionError(
+                    f"Testpmd failed to set VLAN filter on port {port}."
+                )
+
+    def rx_vlan(self, vlan: int, port: int, add: bool, verify: bool = True) -> None:
+        """Add specified vlan tag to the filter list on a port.
+
+        Args:
+            vlan: The vlan tag to add, should be within 1-1005, 1-4094 extended.
+            port: The port number to add the tag on, should be within 0-32.
+            add: Adds the tag if :data:`True`, otherwise removes tag.
+            verify: If :data:`True`, the output of the command is scanned to verify that
+                the vlan tag was added to the filter list on the specified port. If not, it is
+                considered an error.
+
+        Raises:
+            InteractiveCommandExecutionError: If `verify` is :data:`True` and the tag
+                is not added.
+        """
+        rx_output = self.send_command(f"rx_vlan {'add' if add else 'rm'} {vlan} {port}")
+        if verify:
+            if (
+                "VLAN-filtering disabled" in rx_output
+                or "Invalid vlan_id" in rx_output
+                or "Bad arguments" in rx_output
+            ):
+                self._logger.debug(
+                    f"Failed to {'add' if add else 'remove'} tag {vlan} port {port}: \n{rx_output}"
+                )
+                raise InteractiveCommandExecutionError(
+                    f"Testpmd failed to {'add' if add else 'remove'} tag {vlan} on port {port}."
+                )
+
+    def vlan_strip_set(self, port: int, on: bool, verify: bool = True) -> None:
+        """Enable vlan stripping on the specified port.
+
+        Args:
+            port: The port number to use, should be within 0-32.
+            on: If :data:`True`, will turn strip on, otherwise will turn off.
+            verify: If :data:`True`, the output of the command and show port info
+                is scanned to verify that vlan stripping was enabled on the specified port.
+                If not, it is considered an error.
+
+        Raises:
+            InteractiveCommandExecutionError: If `verify` is :data:`True` and stripping
+                fails to update.
+        """
+        strip_output = self.send_command(f"vlan set strip {'on' if on else 'off'} {port}")
+        if verify:
+            vlan_settings = self.show_port_info(port_id=port).vlan_offload
+            if on ^ (vlan_settings is not None and VLANOffloadFlag.STRIP in vlan_settings):
+                self._logger.debug(
+                    f"Failed to set strip {'on' if on else 'off'} port {port}: \n{strip_output}"
+                )
+                raise InteractiveCommandExecutionError(
+                    f"Testpmd failed to set strip {'on' if

[PATCH 3/6] ethdev: add flow rule insertion by index with pattern

2024-09-18 Thread Alexander Kozyrev
Add a new API to enqueue flow rule creation by index with pattern.
The new template table rule insertion type,
index-based insertion with pattern, requires a new flow rule creation
function with both the rule index and the pattern provided.
Packets will match on the provided pattern at the provided index.

Signed-off-by: Alexander Kozyrev 
---
 doc/guides/prog_guide/rte_flow.rst | 20 ++
 doc/guides/rel_notes/release_24_11.rst |  5 +++
 lib/ethdev/ethdev_trace.h  | 44 +
 lib/ethdev/ethdev_trace_points.c   |  6 +++
 lib/ethdev/rte_flow.c  | 55 ++
 lib/ethdev/rte_flow.h  | 54 +
 lib/ethdev/rte_flow_driver.h   | 14 +++
 lib/ethdev/version.map |  3 ++
 8 files changed, 201 insertions(+)

diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index dad588763f..adbd9b1c20 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -4156,6 +4156,26 @@ Enqueueing a flow rule creation operation to insert a rule at a table index.
 A valid handle in case of success is returned. It must be destroyed later
 by calling ``rte_flow_async_destroy()`` even if the rule is rejected by HW.
 
+Enqueue creation by index with pattern
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Enqueueing a flow rule creation operation to insert a rule at a table index with pattern.
+
+.. code-block:: c
+
+   struct rte_flow *
+   rte_flow_async_create_by_index_with_pattern(uint16_t port_id,
+                                   uint32_t queue_id,
+                                   const struct rte_flow_op_attr *op_attr,
+                                   struct rte_flow_template_table *template_table,
+                                   uint32_t rule_index,
+                                   const struct rte_flow_item pattern[],
+                                   uint8_t pattern_template_index,
+                                   const struct rte_flow_action actions[],
+                                   uint8_t actions_template_index,
+                                   void *user_data,
+                                   struct rte_flow_error *error);
+
 Enqueue destruction operation
 ~
 
diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst
index 7056f17f3c..f71a9ab562 100644
--- a/doc/guides/rel_notes/release_24_11.rst
+++ b/doc/guides/rel_notes/release_24_11.rst
@@ -60,6 +60,11 @@ New Features
   Extended rte_flow_table_insertion_type enum with new
   RTE_FLOW_TABLE_INSERTION_TYPE_INDEX_WITH_PATTERN type.
 
+* **Added flow rule insertion by index with pattern to the Flow API.**
+
+  Added API for inserting the rule by index with pattern.
+  Introduced rte_flow_async_create_by_index_with_pattern() function.
+
 Removed Items
 -
 
diff --git a/lib/ethdev/ethdev_trace.h b/lib/ethdev/ethdev_trace.h
index 3bec87bfdb..910bedbebd 100644
--- a/lib/ethdev/ethdev_trace.h
+++ b/lib/ethdev/ethdev_trace.h
@@ -2343,6 +2343,50 @@ RTE_TRACE_POINT_FP(
rte_trace_point_emit_ptr(flow);
 )
 
+RTE_TRACE_POINT_FP(
+   rte_flow_trace_async_create_by_index,
+   RTE_TRACE_POINT_ARGS(uint16_t port_id, uint32_t queue_id,
+   const struct rte_flow_op_attr *op_attr,
+   const struct rte_flow_template_table *template_table,
+   uint32_t rule_index,
+   const struct rte_flow_action *actions,
+   uint8_t actions_template_index,
+   const void *user_data, const struct rte_flow *flow),
+   rte_trace_point_emit_u16(port_id);
+   rte_trace_point_emit_u32(queue_id);
+   rte_trace_point_emit_ptr(op_attr);
+   rte_trace_point_emit_ptr(template_table);
+   rte_trace_point_emit_u32(rule_index);
+   rte_trace_point_emit_ptr(actions);
+   rte_trace_point_emit_u8(actions_template_index);
+   rte_trace_point_emit_ptr(user_data);
+   rte_trace_point_emit_ptr(flow);
+)
+
+RTE_TRACE_POINT_FP(
+   rte_flow_trace_async_create_by_index_with_pattern,
+   RTE_TRACE_POINT_ARGS(uint16_t port_id, uint32_t queue_id,
+   const struct rte_flow_op_attr *op_attr,
+   const struct rte_flow_template_table *template_table,
+   uint32_t rule_index,
+   const struct rte_flow_item *pattern,
+   uint8_t pattern_template_index,
+   const struct rte_flow_action *actions,
+   uint8_t actions_template_index,
+   const void *user_data, const struct rte_flow *flow),
+   rte_trace_point_emit_u16(port_id);
+   rte_trace_point_emit_u32(queue_id);
+   rte_trace_point_emit_ptr(op_attr);
+   rte_trace_point_emit_ptr(template_table);
+   rte_trace_point_emit_u32(rule_index);
+   rte_trace_point_emit_ptr(pattern);
+   rte_trace_point_emit_u8(pattern_template_index);
+   rte_trace_point_emit_ptr(actions);
+   rte_trace_point_emit_u8(actions_template_index);
+   rte_trace_point_emit_ptr(user_data);
+   rte_trace_point_emit_ptr(flow);
+)

[PATCH 0/6] ethdev: jump to table support

2024-09-18 Thread Alexander Kozyrev
Introduce new Flow API JUMP_TO_TABLE_INDEX action.
It allows bypassing a hierarchy of groups and going directly
to a specified flow table. That gives a user the flexibility
to jump between different priorities in a group and eliminates
the need to do a table lookup in the group hierarchy.
The JUMP_TO_TABLE_INDEX action forwards a packet to the
specified rule index inside the index-based flow table.

The current index-based flow table doesn't do any matching
on the packet and executes the actions immediately.
Add a new index-based flow table with pattern matching.
The JUMP_TO_TABLE_INDEX can redirect a packet to another
matching criteria at the specified index in this case.

RFC: 
https://patchwork.dpdk.org/project/dpdk/patch/20240822202753.3856703-1-akozy...@nvidia.com/

Alexander Kozyrev (6):
  ethdev: add insertion by index with pattern
  app/testpmd: add index with pattern insertion type
  ethdev: add flow rule insertion by index with pattern
  app/testpmd: add insertion by index with pattern option
  ethdev: add jump to table index action
  app/testpmd: add jump to table index action

 app/test-pmd/cmdline_flow.c| 44 +-
 app/test-pmd/config.c  | 22 +--
 app/test-pmd/testpmd.h |  2 +-
 doc/guides/prog_guide/rte_flow.rst | 20 +++
 doc/guides/rel_notes/release_24_11.rst | 13 +
 lib/ethdev/ethdev_trace.h  | 44 ++
 lib/ethdev/ethdev_trace_points.c   |  6 ++
 lib/ethdev/rte_flow.c  | 56 ++
 lib/ethdev/rte_flow.h  | 81 ++
 lib/ethdev/rte_flow_driver.h   | 14 +
 lib/ethdev/version.map |  3 +
 11 files changed, 296 insertions(+), 9 deletions(-)

-- 
2.18.2



[PATCH 1/6] ethdev: add insertion by index with pattern

2024-09-18 Thread Alexander Kozyrev
There are two flow table rule insertion types today:
pattern-based insertion, where packets match on the pattern, and
index-based insertion, where packets always hit at the index.
We need another mode that allows matching on the pattern at
the index: insertion by index with pattern.

Signed-off-by: Alexander Kozyrev 
---
 doc/guides/rel_notes/release_24_11.rst | 4 
 lib/ethdev/rte_flow.h  | 4 
 2 files changed, 8 insertions(+)

diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst
index 0ff70d9057..7056f17f3c 100644
--- a/doc/guides/rel_notes/release_24_11.rst
+++ b/doc/guides/rel_notes/release_24_11.rst
@@ -55,6 +55,10 @@ New Features
  Also, make sure to start the actual text at the margin.
  ===
 
+* **Added a new insertion by index with pattern table insertion type.**
+
+  Extended rte_flow_table_insertion_type enum with new
+  RTE_FLOW_TABLE_INSERTION_TYPE_INDEX_WITH_PATTERN type.
 
 Removed Items
 -
diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
index f864578f80..6f30dd7ae9 100644
--- a/lib/ethdev/rte_flow.h
+++ b/lib/ethdev/rte_flow.h
@@ -5898,6 +5898,10 @@ enum rte_flow_table_insertion_type {
 * Index-based insertion.
 */
RTE_FLOW_TABLE_INSERTION_TYPE_INDEX,
+   /**
+* Index-based insertion with pattern.
+*/
+   RTE_FLOW_TABLE_INSERTION_TYPE_INDEX_WITH_PATTERN,
 };
 
 /**
-- 
2.18.2



Re: [PATCH v24 03/15] windows: add os shim for localtime_r

2024-09-18 Thread fengchengwen
Acked-by: Chengwen Feng 

On 2024/9/19 4:52, Stephen Hemminger wrote:
> Windows does not have localtime_r but it does have a similar
> function that can be used instead.
> 
> Signed-off-by: Stephen Hemminger 
> Acked-by: Tyler Retzlaff 
> Acked-by: Morten Brørup 
> Acked-by: Bruce Richardson 
> ---
>  lib/eal/windows/include/rte_os_shim.h | 10 ++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/lib/eal/windows/include/rte_os_shim.h b/lib/eal/windows/include/rte_os_shim.h
> index eda8113662..665c9ac93b 100644
> --- a/lib/eal/windows/include/rte_os_shim.h
> +++ b/lib/eal/windows/include/rte_os_shim.h
> @@ -110,4 +110,14 @@ rte_clock_gettime(clockid_t clock_id, struct timespec *tp)
>  }
>  #define clock_gettime(clock_id, tp) rte_clock_gettime(clock_id, tp)
>  
> +static inline struct tm *
> +rte_localtime_r(const time_t *timep, struct tm *result)
> +{
> + if (localtime_s(result, timep) == 0)
> + return result;
> + else
> + return NULL;
> +}
> +#define localtime_r(timep, result) rte_localtime_r(timep, result)
> +
>  #endif /* _RTE_OS_SHIM_ */



[PATCH 4/6] app/testpmd: add insertion by index with pattern option

2024-09-18 Thread Alexander Kozyrev
Allow specifying both the rule index and the pattern
in the flow rule creation command line parameters.
Both are needed for rte_flow_async_create_by_index_with_pattern().

flow queue 0 create 0 template_table 2 rule_index 5
  pattern_template 0 actions_template 0 postpone no pattern eth / end
  actions count / queue index 1 / end

Signed-off-by: Alexander Kozyrev 
---
 app/test-pmd/cmdline_flow.c |  8 +++-
 app/test-pmd/config.c   | 22 --
 app/test-pmd/testpmd.h  |  2 +-
 3 files changed, 24 insertions(+), 8 deletions(-)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index a794e5eba5..855273365e 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -1585,6 +1585,12 @@ static const enum index next_async_insert_subcmd[] = {
ZERO,
 };
 
+static const enum index next_async_pattern_subcmd[] = {
+   QUEUE_PATTERN_TEMPLATE,
+   QUEUE_ACTIONS_TEMPLATE,
+   ZERO,
+};
+
 static const enum index item_param[] = {
ITEM_PARAM_IS,
ITEM_PARAM_SPEC,
@@ -3788,7 +3794,7 @@ static const struct token token_list[] = {
[QUEUE_RULE_ID] = {
.name = "rule_index",
.help = "specify flow rule index",
-   .next = NEXT(NEXT_ENTRY(QUEUE_ACTIONS_TEMPLATE),
+   .next = NEXT(next_async_pattern_subcmd,
 NEXT_ENTRY(COMMON_UNSIGNED)),
.args = ARGS(ARGS_ENTRY(struct buffer,
args.vc.rule_id)),
diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 6f0beafa27..39924d8da9 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -2636,8 +2636,8 @@ port_flow_template_table_create(portid_t port_id, uint32_t id,
}
pt->nb_pattern_templates = nb_pattern_templates;
pt->nb_actions_templates = nb_actions_templates;
-   rte_memcpy(&pt->flow_attr, &table_attr->flow_attr,
-  sizeof(struct rte_flow_attr));
+   rte_memcpy(&pt->attr, table_attr,
+  sizeof(struct rte_flow_template_table_attr));
printf("Template table #%u created\n", pt->id);
return 0;
 }
@@ -2835,7 +2835,7 @@ port_queue_flow_create(portid_t port_id, queueid_t queue_id,
}
job->type = QUEUE_JOB_TYPE_FLOW_CREATE;
 
-   pf = port_flow_new(&pt->flow_attr, pattern, actions, &error);
+   pf = port_flow_new(&pt->attr.flow_attr, pattern, actions, &error);
if (!pf) {
free(job);
return port_flow_complain(&error);
@@ -2846,12 +2846,22 @@ port_queue_flow_create(portid_t port_id, queueid_t queue_id,
}
/* Poisoning to make sure PMDs update it in case of error. */
memset(&error, 0x11, sizeof(error));
-   if (rule_idx == UINT32_MAX)
+   if (pt->attr.insertion_type == RTE_FLOW_TABLE_INSERTION_TYPE_PATTERN)
flow = rte_flow_async_create(port_id, queue_id, &op_attr, pt->table,
pattern, pattern_idx, actions, actions_idx, job, &error);
-   else
+   else if (pt->attr.insertion_type == RTE_FLOW_TABLE_INSERTION_TYPE_INDEX)
flow = rte_flow_async_create_by_index(port_id, queue_id, &op_attr, pt->table,
rule_idx, actions, actions_idx, job, &error);
+   else if (pt->attr.insertion_type == RTE_FLOW_TABLE_INSERTION_TYPE_INDEX_WITH_PATTERN)
+   flow = rte_flow_async_create_by_index_with_pattern(port_id, queue_id, &op_attr,
+   pt->table, rule_idx, pattern, pattern_idx, actions, actions_idx, job,
+   &error);
+   else {
+   free(pf);
+   free(job);
+   printf("Insertion type %d is invalid\n", pt->attr.insertion_type);
+   return -EINVAL;
+   }
if (!flow) {
free(pf);
free(job);
@@ -3060,7 +3070,7 @@ port_queue_flow_update(portid_t port_id, queueid_t queue_id,
}
job->type = QUEUE_JOB_TYPE_FLOW_UPDATE;
 
-   uf = port_flow_new(&pt->flow_attr, pf->rule.pattern_ro, actions, &error);
+   uf = port_flow_new(&pt->attr.flow_attr, pf->rule.pattern_ro, actions, &error);
if (!uf) {
free(job);
return port_flow_complain(&error);
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index 9facd7f281..f9ab88d667 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -220,7 +220,7 @@ struct port_table {
uint32_t id; /**< Table ID. */
uint32_t nb_pattern_templates; /**< Number of pattern templates. */
uint32_t nb_actions_templates; /**< Number of actions templates. */
-   struct rte_flow_attr flow_attr; /**< Flow attributes. */
+   struct rte_flow_template_table_attr attr; /**< Table attributes. */
struct rte_flow_template_table *table; /**< PMD opaque template object */
 };
 
-- 
2.18.2



[PATCH 5/6] ethdev: add jump to table index action

2024-09-18 Thread Alexander Kozyrev
Introduce the RTE_FLOW_ACTION_TYPE_JUMP_TO_TABLE_INDEX action.
It redirects packets to a particular index in a flow table.

Signed-off-by: Alexander Kozyrev 
---
 doc/guides/rel_notes/release_24_11.rst |  4 
 lib/ethdev/rte_flow.c  |  1 +
 lib/ethdev/rte_flow.h  | 23 +++
 3 files changed, 28 insertions(+)

diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst
index f71a9ab562..ccdc44f3d8 100644
--- a/doc/guides/rel_notes/release_24_11.rst
+++ b/doc/guides/rel_notes/release_24_11.rst
@@ -65,6 +65,10 @@ New Features
   Added API for inserting the rule by index with pattern.
   Introduced rte_flow_async_create_by_index_with_pattern() function.
 
+* **Added the action to redirect packets to a particular index in a flow table.**
+
+  Introduced RTE_FLOW_ACTION_TYPE_JUMP_TO_TABLE_INDEX action type.
+
 Removed Items
 -
 
diff --git a/lib/ethdev/rte_flow.c b/lib/ethdev/rte_flow.c
index f6259343e8..a56391b156 100644
--- a/lib/ethdev/rte_flow.c
+++ b/lib/ethdev/rte_flow.c
@@ -275,6 +275,7 @@ static const struct rte_flow_desc_data rte_flow_desc_action[] = {
MK_FLOW_ACTION(PROG,
   sizeof(struct rte_flow_action_prog)),
MK_FLOW_ACTION(NAT64, sizeof(struct rte_flow_action_nat64)),
+   MK_FLOW_ACTION(JUMP_TO_TABLE_INDEX, sizeof(struct rte_flow_action_jump_to_table_index)),
 };
 
 int
diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
index 84473241fb..a2929438bf 100644
--- a/lib/ethdev/rte_flow.h
+++ b/lib/ethdev/rte_flow.h
@@ -3262,6 +3262,15 @@ enum rte_flow_action_type {
 * @see struct rte_flow_action_nat64
 */
RTE_FLOW_ACTION_TYPE_NAT64,
+
+   /**
+* RTE_FLOW_ACTION_TYPE_JUMP_TO_TABLE_INDEX,
+*
+* Redirects packets to a particular index in a flow table.
+*
+* @see struct rte_flow_action_jump_to_table_index.
+*/
+   RTE_FLOW_ACTION_TYPE_JUMP_TO_TABLE_INDEX,
 };
 
 /**
@@ -4266,6 +4275,20 @@ rte_flow_dynf_metadata_set(struct rte_mbuf *m, uint32_t v)
*RTE_FLOW_DYNF_METADATA(m) = v;
 }
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ACTION_TYPE_JUMP_TO_TABLE_INDEX
+ *
+ * Redirects packets to a particular index in a flow table.
+ *
+ */
+struct rte_flow_action_jump_to_table_index {
+   struct rte_flow_template_table *table;
+   uint32_t index;
+};
+
 /**
  * Definition of a single action.
  *
-- 
2.18.2



[PATCH 6/6] app/testpmd: add jump to table index action

2024-09-18 Thread Alexander Kozyrev
Add new command line options to create the
RTE_FLOW_ACTION_TYPE_JUMP_TO_TABLE_INDEX action
from the testpmd command line.

flow queue 0 create 0 template_table 0 pattern_template 0
  actions_template 0 postpone no pattern eth / end
  actions jump_to_table_index table 0x166f9ce00 index 5 / end

Signed-off-by: Alexander Kozyrev 
---
 app/test-pmd/cmdline_flow.c | 34 ++
 1 file changed, 34 insertions(+)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 855273365e..b7bcf18311 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -785,6 +785,9 @@ enum index {
ACTION_IPV6_EXT_PUSH_INDEX_VALUE,
ACTION_NAT64,
ACTION_NAT64_MODE,
+   ACTION_JUMP_TO_TABLE_INDEX,
+   ACTION_JUMP_TO_TABLE_INDEX_TABLE,
+   ACTION_JUMP_TO_TABLE_INDEX_INDEX,
 };
 
 /** Maximum size for pattern in struct rte_flow_item_raw. */
@@ -2328,6 +2331,7 @@ static const enum index next_action[] = {
ACTION_IPV6_EXT_REMOVE,
ACTION_IPV6_EXT_PUSH,
ACTION_NAT64,
+   ACTION_JUMP_TO_TABLE_INDEX,
ZERO,
 };
 
@@ -2688,6 +2692,13 @@ static const enum index next_hash_encap_dest_subcmd[] = {
ZERO,
 };
 
+static const enum index action_jump_to_table_index[] = {
+   ACTION_JUMP_TO_TABLE_INDEX_TABLE,
+   ACTION_JUMP_TO_TABLE_INDEX_INDEX,
+   ACTION_NEXT,
+   ZERO,
+};
+
 static int parse_set_raw_encap_decap(struct context *, const struct token *,
 const char *, unsigned int,
 void *, unsigned int);
@@ -7608,6 +7619,29 @@ static const struct token token_list[] = {
.args = ARGS(ARGS_ENTRY(struct rte_flow_action_nat64, type)),
.call = parse_vc_conf,
},
+   [ACTION_JUMP_TO_TABLE_INDEX] = {
+   .name = "jump_to_table_index",
+   .help = "Jump to table index",
+   .priv = PRIV_ACTION(JUMP_TO_TABLE_INDEX,
sizeof(struct rte_flow_action_jump_to_table_index)),
+   .next = NEXT(action_jump_to_table_index),
+   .call = parse_vc,
+   },
+   [ACTION_JUMP_TO_TABLE_INDEX_TABLE] = {
+   .name = "table",
+   .help = "table to redirect traffic to",
+   .next = NEXT(action_jump_to_table_index, NEXT_ENTRY(COMMON_UNSIGNED)),
+   .args = ARGS(ARGS_ENTRY(struct rte_flow_action_jump_to_table_index, table)),
+   .call = parse_vc_conf,
+   },
+   [ACTION_JUMP_TO_TABLE_INDEX_INDEX] = {
+   .name = "index",
+   .help = "rule index to redirect traffic to",
+   .next = NEXT(action_jump_to_table_index, NEXT_ENTRY(COMMON_UNSIGNED)),
+   .args = ARGS(ARGS_ENTRY(struct rte_flow_action_jump_to_table_index, index)),
+   .call = parse_vc_conf,
+   },
+
/* Top level command. */
[SET] = {
.name = "set",
-- 
2.18.2



[PATCH 2/6] app/testpmd: add index with pattern insertion type

2024-09-18 Thread Alexander Kozyrev
Provide an index_with_pattern command line option
for the template table insertion type.

flow template_table 0 create table_id 2 group 13 priority 0
  insertion_type index_with_pattern ingress rules_number 64
  pattern_template 2 actions_template 2

Signed-off-by: Alexander Kozyrev 
---
 app/test-pmd/cmdline_flow.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index d04280eb3e..a794e5eba5 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -1030,7 +1030,7 @@ static const char *const meter_colors[] = {
 };
 
 static const char *const table_insertion_types[] = {
-   "pattern", "index", NULL
+   "pattern", "index", "index_with_pattern", NULL
 };
 
 static const char *const table_hash_funcs[] = {
-- 
2.18.2



Re: [PATCH v24 06/15] eal: change rte_exit() output to match rte_log()

2024-09-18 Thread fengchengwen
Acked-by: Chengwen Feng 

On 2024/9/19 4:52, Stephen Hemminger wrote:
> The rte_exit() output format confuses the timestamp and coloring
> options. Change it to be a single line with a proper prefix.
> 
> Before:
> [ 0.006481] EAL: Error - exiting with code: 1
>   Cause: [ 0.006489] Cannot init EAL: Permission denied
> 
> After:
> [ 0.006238] EAL: Error - exiting with code: 1
> [ 0.006250] EAL: Cannot init EAL: Permission denied
> 
> Signed-off-by: Stephen Hemminger 
> Acked-by: Tyler Retzlaff 
> Acked-by: Morten Brørup 
> Acked-by: Bruce Richardson 
> ---
>  lib/eal/common/eal_common_debug.c | 6 ++
>  1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/lib/eal/common/eal_common_debug.c b/lib/eal/common/eal_common_debug.c
> index 3e77995896..bcfcd6df6f 100644
> --- a/lib/eal/common/eal_common_debug.c
> +++ b/lib/eal/common/eal_common_debug.c
> @@ -36,15 +36,13 @@ rte_exit(int exit_code, const char *format, ...)
>   va_list ap;
>  
>   if (exit_code != 0)
> - RTE_LOG(CRIT, EAL, "Error - exiting with code: %d\n"
> - "  Cause: ", exit_code);
> + EAL_LOG(CRIT, "Error - exiting with code: %d", exit_code);
>  
>   va_start(ap, format);
>   rte_vlog(RTE_LOG_CRIT, RTE_LOGTYPE_EAL, format, ap);
>   va_end(ap);
>  
>   if (rte_eal_cleanup() != 0 && rte_errno != EALREADY)
> - EAL_LOG(CRIT,
> - "EAL could not release all resources");
> + EAL_LOG(CRIT, "EAL could not release all resources");
>   exit(exit_code);
>  }



[PATCH v5] net/ice: support customized search path for DDP package

2024-09-18 Thread Zhichao Zeng
This patch adds support for customizing the firmware search path for
the DDP package, like the kernel behavior: it will read the search path
from "/sys/module/firmware_class/parameters/path" and try to load the
DDP package from there.

Also, update the documentation for loading the DDP package in ice.rst.

Signed-off-by: Zhichao Zeng 

---
v5: update documentation, fix code logic
v4: fix CI error
v3: update doc, fix code error
v2: separate the patch and rewrite the log
---
 doc/guides/nics/ice.rst  | 61 +++-
 drivers/net/ice/ice_ethdev.c | 44 --
 drivers/net/ice/ice_ethdev.h |  1 +
 3 files changed, 75 insertions(+), 31 deletions(-)

diff --git a/doc/guides/nics/ice.rst b/doc/guides/nics/ice.rst
index ae975d19ad..1494f70415 100644
--- a/doc/guides/nics/ice.rst
+++ b/doc/guides/nics/ice.rst
are listed in the Tested Platforms section of the Release Notes for each release
|24.03  | 1.13.7|  1.3.35 |  1.3.45   |1.3.13|4.4|

+---+---+-+---+--+---+
 
+Dynamic Device Personalization (DDP) package loading
+----------------------------------------------------
+
+The Intel E810 requires a programmable pipeline package be downloaded
+by the driver to support normal operations. The E810 has limited
+functionality built in to allow PXE boot and other use cases, but for DPDK use
+the driver must download a package file during the driver initialization
+stage.
+
+The default DDP package file name is ``ice.pkg``. For a specific NIC, the
+DDP package to be loaded can have a filename ``ice-xx.pkg``,
+where 'xx' is the 64-bit PCIe Device Serial Number of the NIC. For
+example, if the NIC's device serial number is 00-CC-BB-FF-FF-AA-05-68,
+the device-specific DDP package filename is ``ice-00ccbbffffaa0568.pkg``
+(in hex and all lower case); please review the README from
+`Intel® Ethernet 800 Series Dynamic Device Personalization (DDP) for Telecommunication (Comms) Package
+`_
+for more information. A symbolic link to the DDP package file is also ok.
+The same package file is used by both the kernel driver and the ICE PMD.
+
+The ICE PMD supports using a customized DDP search path. The driver will read
+the search path from '/sys/module/firmware_class/parameters/path'
+as 'CUSTOMIZED_PATH'.
+During initialization, the driver searches in the following paths in order:
+'CUSTOMIZED_PATH', '/lib/firmware/updates/intel/ice/ddp' and '/lib/firmware/intel/ice/ddp'.
+The device-specific DDP package has a higher loading priority than the default DDP package.
+The type of loaded package is stored in ``ice_adapter->active_pkg_type``.
+
+   .. Note::
+
+  Windows support: DDP packages are not supported on Windows.
+
 Configuration
 -
 
@@ -487,32 +519,3 @@ Usage::
 
 In "brief" mode, all scheduling nodes in the tree are displayed.
 In "detail" mode, each node's configuration parameters are also displayed.
-
-Limitations or Known issues

-
-The Intel E810 requires a programmable pipeline package be downloaded
-by the driver to support normal operations. The E810 has a limited
-functionality built in to allow PXE boot and other use cases, but the
-driver must download a package file during the driver initialization
-stage.
-
-The default DDP package file name is ice.pkg. For a specific NIC, the
-DDP package supposed to be loaded can have a filename: ice-xx.pkg,
-where 'xx' is the 64-bit PCIe Device Serial Number of the NIC. For
-example, if the NIC's device serial number is 00-CC-BB-FF-FF-AA-05-68,
-the device-specific DDP package filename is ice-00ccbbaa0568.pkg
-(in hex and all low case). During initialization, the driver searches
-in the following paths in order: /lib/firmware/updates/intel/ice/ddp
-and /lib/firmware/intel/ice/ddp. The corresponding device-specific DDP
-package will be downloaded first if the file exists. If not, then the
-driver tries to load the default package. The type of loaded package
-is stored in ``ice_adapter->active_pkg_type``.
-
-A symbolic link to the DDP package file is also ok. The same package
-file is used by both the kernel driver and the DPDK PMD.
-
-   .. Note::
-
-  Windows support: The DDP package is not supported on Windows so,
-  loading of the package is disabled on Windows.
diff --git a/drivers/net/ice/ice_ethdev.c b/drivers/net/ice/ice_ethdev.c
index 304f959b7e..f0389cc4ee 100644
--- a/drivers/net/ice/ice_ethdev.c
+++ b/drivers/net/ice/ice_ethdev.c
@@ -1873,21 +1873,61 @@ ice_load_pkg_type(struct ice_hw *hw)
return package_type;
 }
 
+static int ice_read_customized_path(char *pkg_file, uint16_t buff_len)
+{
+   FILE *fp = fopen(ICE_PKG_FILE_CUSTOMIZED_PATH, "r");
+   int ret = 0;
+
+   if (fp

Re: [PATCH v2 1/4] power: refactor core power management library

2024-09-18 Thread lihuisong (C)



在 2024/9/18 16:37, Tummala, Sivaprasad 写道:

[AMD Official Use Only - AMD Internal Distribution Only]


-Original Message-
From: lihuisong (C) 
Sent: Friday, September 13, 2024 1:05 PM
To: Tummala, Sivaprasad 
Cc: dev@dpdk.org; david.h...@intel.com; anatoly.bura...@intel.com;
radu.nico...@intel.com; cristian.dumitre...@intel.com; jer...@marvell.com;
konstantin.anan...@huawei.com; Yigit, Ferruh ;
gak...@marvell.com
Subject: Re: [PATCH v2 1/4] power: refactor core power management library

Caution: This message originated from an External Source. Use proper caution
when opening attachments, clicking links, or responding.


在 2024/9/12 19:17, Tummala, Sivaprasad 写道:


Hi Huisong,

Please find my response inline.


-Original Message-
From: lihuisong (C) 
Sent: Tuesday, August 27, 2024 1:51 PM
To: Tummala, Sivaprasad 
Cc: dev@dpdk.org; david.h...@intel.com; anatoly.bura...@intel.com;
radu.nico...@intel.com; cristian.dumitre...@intel.com;
jer...@marvell.com; konstantin.anan...@huawei.com; Yigit, Ferruh
; gak...@marvell.com
Subject: Re: [PATCH v2 1/4] power: refactor core power management
library



Hi Sivaprasad,

Some comments inline.

/Huisong

在 2024/8/26 21:06, Sivaprasad Tummala 写道:

This patch introduces a comprehensive refactor to the core power
management library. The primary focus is on improving modularity and
organization by relocating specific driver implementations from the
'lib/power' directory to dedicated directories within
'drivers/power/core/*'. The adjustment of meson.build files enables
the selective activation of individual drivers.
These changes contribute to a significant enhancement in code
organization, providing a clearer structure for driver implementations.
The refactor aims to improve overall code clarity and boost
maintainability. Additionally, it establishes a foundation for
future development, allowing for more focused work on individual
drivers and seamless integration of forthcoming enhancements.

v2:
- added NULL check for global_core_ops in rte_power_get_core_ops

Signed-off-by: Sivaprasad Tummala 
---
drivers/meson.build   |   1 +
.../power/acpi/acpi_cpufreq.c |  22 +-
.../power/acpi/acpi_cpufreq.h |   6 +-
drivers/power/acpi/meson.build|  10 +
.../power/amd_pstate/amd_pstate_cpufreq.c |  24 +-
.../power/amd_pstate/amd_pstate_cpufreq.h |   8 +-
drivers/power/amd_pstate/meson.build  |  10 +
.../power/cppc/cppc_cpufreq.c |  22 +-
.../power/cppc/cppc_cpufreq.h |   8 +-
drivers/power/cppc/meson.build|  10 +
.../power/kvm_vm}/guest_channel.c |   0
.../power/kvm_vm}/guest_channel.h |   0
.../power/kvm_vm/kvm_vm.c |  22 +-
.../power/kvm_vm/kvm_vm.h |   6 +-
drivers/power/kvm_vm/meson.build  |  16 +
drivers/power/meson.build |  12 +
drivers/power/pstate/meson.build  |  10 +
.../power/pstate/pstate_cpufreq.c |  22 +-
.../power/pstate/pstate_cpufreq.h |   6 +-
lib/power/meson.build |   7 +-
lib/power/power_common.c  |   2 +-
lib/power/power_common.h  |  16 +-
lib/power/rte_power.c | 291 ++
lib/power/rte_power.h | 139 ++---
lib/power/rte_power_core_ops.h| 208 +
lib/power/version.map |  14 +
26 files changed, 621 insertions(+), 271 deletions(-)
rename lib/power/power_acpi_cpufreq.c => drivers/power/acpi/acpi_cpufreq.c (95%)
rename lib/power/power_acpi_cpufreq.h => drivers/power/acpi/acpi_cpufreq.h (98%)
create mode 100644 drivers/power/acpi/meson.build
rename lib/power/power_amd_pstate_cpufreq.c => drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
rename lib/power/power_amd_pstate_cpufreq.h => drivers/power/amd_pstate/amd_pstate_cpufreq.h (97%)
create mode 100644 drivers/power/amd_pstate/meson.build
rename lib/power/power_cppc_cpufreq.c => drivers/power/cppc/cppc_cpufreq.c (95%)
rename lib/power/power_cppc_cpufreq.h => drivers/power/cppc/cppc_cpufreq.h (97%)
create mode 100644 drivers/power/cppc/meson.build
rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (100%)
rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c (82%)
rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h (98%)
create mode 100644 drivers/power/kvm_vm/meson.build
create mode 100644 drivers/

Re: [PATCH v4 1/7] eal: add static per-lcore memory allocation facility

2024-09-18 Thread Mattias Rönnblom

On 2024-09-17 18:11, Konstantin Ananyev wrote:

+
+/**
+ * Get pointer to lcore variable instance with the specified lcore id.
+ *
+ * @param lcore_id
+ *   The lcore id specifying which of the @c RTE_MAX_LCORE value
+ *   instances should be accessed. The lcore id need not be valid
+ *   (e.g., may be @ref LCORE_ID_ANY), but in such a case, the pointer
+ *   is also not valid (and thus should not be dereferenced).
+ * @param handle
+ *   The lcore variable handle.
+ */
+#define RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle)\
+   ((typeof(handle))rte_lcore_var_lcore_ptr(lcore_id, handle))
+
+/**
+ * Get pointer to lcore variable instance of the current thread.
+ *
+ * May only be used by EAL threads and registered non-EAL threads.
+ */
+#define RTE_LCORE_VAR_VALUE(handle) \
+   RTE_LCORE_VAR_LCORE_VALUE(rte_lcore_id(), handle)


Would it make sense to check that rte_lcore_id() !=  LCORE_ID_ANY?
After all if people do not want this extra check, they can probably use
RTE_LCORE_VAR_LCORE_VALUE(rte_lcore_id(), handle)
explicitly.



It would make sense, if it was an RTE_ASSERT(). Otherwise, I don't think
so. Attempting to gracefully handle API violations is bad practice, imo.


Ok, RTE_ASSERT() might be a good compromise.
As I said in another mail for that thread, I wouldn't insist here.



After having a closer look at this issue, I'm not so sure anymore.
Such an assertion would disallow using the macros to retrieve a
potentially-invalid pointer which is then never used, in case it is
invalid.





+
+/**
+ * Iterate over each lcore id's value for an lcore variable.
+ *
+ * @param value
+ *   A pointer successively set to point to lcore variable value
+ *   corresponding to every lcore id (up to @c RTE_MAX_LCORE).
+ * @param handle
+ *   The lcore variable handle.
+ */
+#define RTE_LCORE_VAR_FOREACH_VALUE(value, handle) \
+   for (unsigned int lcore_id =\
+(((value) = RTE_LCORE_VAR_LCORE_VALUE(0, handle)), 0); \
+lcore_id < RTE_MAX_LCORE;   \
+lcore_id++, (value) = RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle))


Might be a bit better (and safer) to make lcore_id a macro parameter?
I.E.:
define RTE_LCORE_VAR_FOREACH_VALUE(value, handle, lcore_id) \
for ((lcore_id) = ...



Why?


Variable with the same name (and type) can be defined by user before the loop,
With the intention to use it inside the loop.
Just like it happens here (in patch #2):
+   unsigned int lcore_id;
.
+   /* take the opportunity to test the foreach macro */
+   int *v;
+   lcore_id = 0;
+   RTE_LCORE_VAR_FOREACH_VALUE(v, test_int) {
+   TEST_ASSERT_EQUAL(states[lcore_id].new_value, *v,
+ "Unexpected value on lcore %d during "
+ "iteration", lcore_id);
+   lcore_id++;
+   }
+
  



Indeed. I'll change it. I suppose you could also have issues if you 
nested the macro, although those could be solved by using something like 
__COUNTER__ to create a unique name.


Supplying the variable name does defeat part of the purpose of the 
RTE_LCORE_VAR_FOREACH_VALUE.







Re: [PATCH v23 01/15] maintainers: add for log library

2024-09-18 Thread fengchengwen
Acked-by: Chengwen Feng 

On 2024/9/18 12:56, Stephen Hemminger wrote:
> "You touch it you own it"
> Add myself as maintainer for log library.
> 
> Signed-off-by: Stephen Hemminger 
> Acked-by: Tyler Retzlaff 
> Acked-by: Morten Brørup 
> ---
>  MAINTAINERS | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index c5a703b5c0..ecf6f955cc 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -185,6 +185,7 @@ F: app/test/test_threads.c
>  F: app/test/test_version.c
>  
>  Logging
> +M: Stephen Hemminger 
>  F: lib/log/
>  F: doc/guides/prog_guide/log_lib.rst
>  F: app/test/test_logs.c



Re: [PATCH v23 02/15] windows: make getopt functions have const properties

2024-09-18 Thread fengchengwen
Acked-by: Chengwen Feng 

On 2024/9/18 12:56, Stephen Hemminger wrote:
> Having different prototypes on different platforms can lead
> to lots of unnecessary workarounds.  Looks like the version of
> getopt used from windows was based on an older out of date
> version from FreeBSD.
> 
> This patch changes getopt, getopt_long, etc to have the same const
> attributes as Linux and FreeBSD. The changes are derived from
> the current FreeBSD version of getopt_long.
> 
> Signed-off-by: Stephen Hemminger 
> Acked-by: Tyler Retzlaff 
> Acked-by: Dmitry Kozlyuk 
> Acked-by: Morten Brørup 





Re: [PATCH v23 03/15] windows: add os shim for localtime_r

2024-09-18 Thread fengchengwen
On 2024/9/18 12:56, Stephen Hemminger wrote:
> Windows does not have localtime_r but it does have a similar
> function that can be used instead.
> 
> Signed-off-by: Stephen Hemminger 
> Acked-by: Tyler Retzlaff 
> Acked-by: Morten Brørup 
> ---
>  lib/eal/windows/include/rte_os_shim.h | 10 ++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/lib/eal/windows/include/rte_os_shim.h 
> b/lib/eal/windows/include/rte_os_shim.h
> index eda8113662..e9741a9df2 100644
> --- a/lib/eal/windows/include/rte_os_shim.h
> +++ b/lib/eal/windows/include/rte_os_shim.h
> @@ -110,4 +110,14 @@ rte_clock_gettime(clockid_t clock_id, struct timespec 
> *tp)
>  }
>  #define clock_gettime(clock_id, tp) rte_clock_gettime(clock_id, tp)
>  
> +static inline struct tm *
> +rte_localtime_r(const time_t *timer, struct tm *buf)

'buf' usually implies char *, but this function returns struct tm *;
suggest matching the standard prototype:
struct tm *localtime_r(const time_t *timep, struct tm *result)

> +{
> + if (localtime_s(buf, timer) == 0)
> + return buf;
> + else
> + return NULL;
> +}
> +#define localtime_r(timer, buf) rte_localtime_r(timer, buf)
> +
>  #endif /* _RTE_OS_SHIM_ */



Re: [PATCH v23 04/15] eal: make eal_log_level_parse common

2024-09-18 Thread fengchengwen
On 2024/9/18 12:56, Stephen Hemminger wrote:
> The code to parse for log-level option should be same on
> all OS variants.
> 
> Signed-off-by: Stephen Hemminger 
> Acked-by: Tyler Retzlaff 
> Acked-by: Morten Brørup 
> ---
>  lib/eal/common/eal_common_options.c | 45 +
>  lib/eal/common/eal_options.h|  1 +
>  lib/eal/freebsd/eal.c   | 42 ---
>  lib/eal/linux/eal.c | 39 -
>  lib/eal/windows/eal.c   | 35 --
>  5 files changed, 46 insertions(+), 116 deletions(-)
> 
> diff --git a/lib/eal/common/eal_common_options.c 
> b/lib/eal/common/eal_common_options.c
> index f1a5e329a5..b0ceeef632 100644
> --- a/lib/eal/common/eal_common_options.c
> +++ b/lib/eal/common/eal_common_options.c
> @@ -1640,6 +1640,51 @@ eal_parse_huge_unlink(const char *arg, struct 
> hugepage_file_discipline *out)
>   return -1;
>  }
>  
> +/* Parse the all arguments looking for log related ones */
> +int
> +eal_log_level_parse(int argc, char * const argv[])
> +{
> + struct internal_config *internal_conf = 
> eal_get_internal_configuration();
> + int option_index, opt;
> + const int old_optind = optind;
> + const int old_optopt = optopt;
> + const int old_opterr = opterr;
> + char *old_optarg = optarg;
> +#ifdef RTE_EXEC_ENV_FREEBSD
> + const int old_optreset = optreset;
> + optreset = 1;
> +#endif
> +
> + optind = 1;
> + opterr = 0;
> +
> + while ((opt = getopt_long(argc, argv, eal_short_options,
> +   eal_long_options, &option_index)) != EOF) {
> +
> + switch (opt) {
> + case OPT_LOG_LEVEL_NUM:
> + if (eal_parse_common_option(opt, optarg, internal_conf) 
> < 0)
> + return -1;
> + break;
> + case '?':
> + /* getopt is not happy, stop right now */
> + goto out;

No need for the goto here; a break would suffice.

> + default:
> + continue;
> + }
> + }
> +out:
> + /* restore getopt lib */
> + optind = old_optind;
> + optopt = old_optopt;
> + optarg = old_optarg;
> + opterr = old_opterr;
> +#ifdef RTE_EXEC_ENV_FREEBSD
> + optreset = old_optreset;
> +#endif
> + return 0;
> +}
> +

...


Re: [PATCH v23 05/15] eal: do not duplicate rte_init_alert() messages

2024-09-18 Thread fengchengwen
Acked-by: Chengwen Feng 

On 2024/9/18 12:56, Stephen Hemminger wrote:
> The message already goes through logging, and does not need
> to be printed on stderr. Message level should be ALERT
> to match function name.
> 
> Signed-off-by: Stephen Hemminger 
> Acked-by: Tyler Retzlaff 
> Acked-by: Morten Brørup 
> ---
>  lib/eal/freebsd/eal.c | 3 +--
>  lib/eal/linux/eal.c   | 3 +--
>  2 files changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/lib/eal/freebsd/eal.c b/lib/eal/freebsd/eal.c
> index d3b40e81d8..7b974608e4 100644
> --- a/lib/eal/freebsd/eal.c
> +++ b/lib/eal/freebsd/eal.c
> @@ -529,8 +529,7 @@ rte_eal_iopl_init(void)
>  
>  static void rte_eal_init_alert(const char *msg)
>  {
> - fprintf(stderr, "EAL: FATAL: %s\n", msg);
> - EAL_LOG(ERR, "%s", msg);
> + EAL_LOG(ALERT, "%s", msg);
>  }
>  
>  /* Launch threads, called at application init(). */
> diff --git a/lib/eal/linux/eal.c b/lib/eal/linux/eal.c
> index 0ded386472..7c26393d00 100644
> --- a/lib/eal/linux/eal.c
> +++ b/lib/eal/linux/eal.c
> @@ -840,8 +840,7 @@ static int rte_eal_vfio_setup(void)
>  
>  static void rte_eal_init_alert(const char *msg)
>  {
> - fprintf(stderr, "EAL: FATAL: %s\n", msg);
> - EAL_LOG(ERR, "%s", msg);
> + EAL_LOG(ALERT, "%s", msg);
>  }
>  
>  /*



Re: [PATCH v23 07/15] log: move handling of syslog facility out of eal

2024-09-18 Thread fengchengwen
Acked-by: Chengwen Feng 

On 2024/9/18 12:56, Stephen Hemminger wrote:
> The syslog facility property is better handled in lib/log
> rather than in eal. This also allows for changes to what
> syslog flag means in later steps.
> 
> Signed-off-by: Stephen Hemminger 
> Acked-by: Morten Brørup 



Re: [PATCH v23 08/15] eal: initialize log before everything else

2024-09-18 Thread fengchengwen
Acked-by: Chengwen Feng 

On 2024/9/18 12:56, Stephen Hemminger wrote:
> In order for all log messages (including CPU mismatch) to
> come out through the logging library, it must be initialized
> as early in rte_eal_init() as possible on all platforms.
> 
> Where it was done before was likely historical based on
> the support of non-OS isolated CPU's which required a shared
> memory buffer; that support was dropped before DPDK was
> publicly released.
> 
> Signed-off-by: Stephen Hemminger 
> Acked-by: Tyler Retzlaff 
> Acked-by: Morten Brørup 



Re: [PATCH v3 03/19] net/xsc: add PCI device probe and remove

2024-09-18 Thread David Marchand
On Wed, Sep 18, 2024 at 8:10 AM WanRenyong  wrote:
> +static const struct rte_pci_id xsc_ethdev_pci_id_map[] = {
> +   { RTE_PCI_DEVICE(XSC_PCI_VENDOR_ID, XSC_PCI_DEV_ID_MS) },

You need to null terminate this array with something like:
{ .vendor_id = 0, /* sentinel */ },

Otherwise the bus pci code may read a next symbol or data present in
the .data section.

ASan caught this issue when running the unit tests:

==70261==ERROR: AddressSanitizer: global-buffer-overflow on address
0x7f8e46bf45f0 at pc 0x7f8e56be523b bp 0x7ffe2ef88ca0 sp
0x7ffe2ef88c98
READ of size 2 at 0x7f8e46bf45f0 thread T0
#0 0x7f8e56be523a in rte_pci_match
/home/runner/work/dpdk/dpdk/build/../drivers/bus/pci/pci_common.c:178:47
#1 0x7f8e56be523a in rte_pci_probe_one_driver
/home/runner/work/dpdk/dpdk/build/../drivers/bus/pci/pci_common.c:223:7
#2 0x7f8e56be523a in pci_probe_all_drivers
/home/runner/work/dpdk/dpdk/build/../drivers/bus/pci/pci_common.c:391:8
#3 0x7f8e56be3297 in pci_probe
/home/runner/work/dpdk/dpdk/build/../drivers/bus/pci/pci_common.c:418:9
#4 0x7f8e56fe9ea8 in rte_bus_probe
/home/runner/work/dpdk/dpdk/build/../lib/eal/common/eal_common_bus.c:78:9
#5 0x7f8e570580d1 in rte_eal_init
/home/runner/work/dpdk/dpdk/build/../lib/eal/linux/eal.c:1288:6
#6 0x5573a597d65d in main
/home/runner/work/dpdk/dpdk/build/../app/test/test.c:145:9
#7 0x7f8e55829d8f in __libc_start_call_main
csu/../sysdeps/nptl/libc_start_call_main.h:58:16
#8 0x7f8e55829e3f in __libc_start_main csu/../csu/libc-start.c:392:3
#9 0x5573a58bf114 in _start
(/home/runner/work/dpdk/dpdk/build/app/dpdk-test+0x1e9114) (BuildId:
8d4741d712c15395a67005124e1f908d96acf7ff)


172 int
173 rte_pci_match(const struct rte_pci_driver *pci_drv,
174   const struct rte_pci_device *pci_dev)
175 {
176 const struct rte_pci_id *id_table;
177
178 for (id_table = pci_drv->id_table; id_table->vendor_id !=
0;
179  id_table++) {


-- 
David Marchand



Re: [PATCH v3 11/12] dts: add Rx offload capabilities

2024-09-18 Thread Juraj Linkeš




On 26. 8. 2024 19:24, Jeremy Spewock wrote:

On Wed, Aug 21, 2024 at 10:53 AM Juraj Linkeš
 wrote:


diff --git a/dts/framework/remote_session/testpmd_shell.py 
b/dts/framework/remote_session/testpmd_shell.py
index 48c31124d1..f83569669e 100644
--- a/dts/framework/remote_session/testpmd_shell.py
+++ b/dts/framework/remote_session/testpmd_shell.py
@@ -659,6 +659,103 @@ class TestPmdPortStats(TextParser):
  tx_bps: int = field(metadata=TextParser.find_int(r"Tx-bps:\s+(\d+)"))


+class RxOffloadCapability(Flag):
+"""Rx offload capabilities of a device."""
+
+#:
+RX_OFFLOAD_VLAN_STRIP = auto()
+#: Device supports L3 checksum offload.
+RX_OFFLOAD_IPV4_CKSUM = auto()
+#: Device supports L4 checksum offload.
+RX_OFFLOAD_UDP_CKSUM = auto()
+#: Device supports L4 checksum offload.
+RX_OFFLOAD_TCP_CKSUM = auto()
+#: Device supports Large Receive Offload.
+RX_OFFLOAD_TCP_LRO = auto()
+#: Device supports QinQ (queue in queue) offload.
+RX_OFFLOAD_QINQ_STRIP = auto()
+#: Device supports inner packet L3 checksum.
+RX_OFFLOAD_OUTER_IPV4_CKSUM = auto()
+#: Device supports MACsec.
+RX_OFFLOAD_MACSEC_STRIP = auto()
+#: Device supports filtering of a VLAN Tag identifier.
+RX_OFFLOAD_VLAN_FILTER = 1 << 9
+#: Device supports VLAN offload.
+RX_OFFLOAD_VLAN_EXTEND = auto()
+#: Device supports receiving segmented mbufs.
+RX_OFFLOAD_SCATTER = 1 << 13


I know you mentioned in the commit message that the auto() can cause
problems with mypy/sphinx, is that why this one is a specific value
instead? Regardless, I think we should probably make it consistent so
that either all of them are bit-shifts or none of them are unless
there is a specific reason that the scatter offload is different.



Since both you and Dean asked, I'll add something to the docstring about 
this.


There are actually two non-auto values (RX_OFFLOAD_VLAN_FILTER = 1 << 9 
is the first one). I used the actual values to mirror the flags in DPDK 
code.



+#: Device supports Timestamp.
+RX_OFFLOAD_TIMESTAMP = auto()
+#: Device supports crypto processing while packet is received in NIC.
+RX_OFFLOAD_SECURITY = auto()
+#: Device supports CRC stripping.
+RX_OFFLOAD_KEEP_CRC = auto()
+#: Device supports L4 checksum offload.
+RX_OFFLOAD_SCTP_CKSUM = auto()
+#: Device supports inner packet L4 checksum.
+RX_OFFLOAD_OUTER_UDP_CKSUM = auto()
+#: Device supports RSS hashing.
+RX_OFFLOAD_RSS_HASH = auto()
+#: Device supports
+RX_OFFLOAD_BUFFER_SPLIT = auto()
+#: Device supports all checksum capabilities.
+RX_OFFLOAD_CHECKSUM = RX_OFFLOAD_IPV4_CKSUM | RX_OFFLOAD_UDP_CKSUM | 
RX_OFFLOAD_TCP_CKSUM
+#: Device supports all VLAN capabilities.
+RX_OFFLOAD_VLAN = (
+RX_OFFLOAD_VLAN_STRIP
+| RX_OFFLOAD_VLAN_FILTER
+| RX_OFFLOAD_VLAN_EXTEND
+| RX_OFFLOAD_QINQ_STRIP
+)




@@ -1048,6 +1145,42 @@ def _close(self) -> None:
  == Capability retrieval methods ==
  """

+def get_capabilities_rx_offload(
+self,
+supported_capabilities: MutableSet["NicCapability"],
+unsupported_capabilities: MutableSet["NicCapability"],
+) -> None:
+"""Get all rx offload capabilities and divide them into supported and 
unsupported.
+
+Args:
+supported_capabilities: Supported capabilities will be added to 
this set.
+unsupported_capabilities: Unsupported capabilities will be added 
to this set.
+"""
+self._logger.debug("Getting rx offload capabilities.")
+command = f"show port {self.ports[0].id} rx_offload capabilities"


Is it desirable to only get the capabilities of the first port? In the
current framework I suppose it doesn't matter all that much since you
can only use the first few ports in the list of ports anyway, but will
there ever be a case where a test run has 2 different devices included
in the list of ports? Of course it's possible that it will happen, but
is it practical? Because, if so, then we would want this to aggregate
what all the devices are capable of and have capabilities basically
say "at least one of the ports in the list of ports is capable of
these things."

This consideration also applies to the rxq info capability gathering as well.



No parts of the framework are adjusted to use multiple NIC in a single 
test run (because we assume we're testing only one NIC at a time). If we 
add this support, it's going to be a broader change.


I approached this with the above assumption in mind and in that case, 
testing just one port of the NIC seemed just fine.



+rx_offload_capabilities_out = self.send_command(command)
+rx_offload_capabilities = 
RxOffloadCapabilities.parse(rx_offload_capabilities_out)
+self._update_capabilities_from_flag(
+supported_capabilities,
+unsupported_capabilities,
+RxOffloadCapability,
+  

[PATCH] crypto/scheduler: fix incorrect variable usage

2024-09-18 Thread Yong Liang
The variable `pending_deq_ops` was incorrectly used
instead of `pending_enq_ops`.
This causes the program to crash
when the worker PMD accesses the session.
Bugzilla ID: 1537
Fixes: 6812b9bf470e ("crypto/scheduler: use unified session")
Cc: roy.fan.zh...@intel.com

Signed-off-by: Yong Liang <1269690...@qq.com>
---
 drivers/crypto/scheduler/scheduler_multicore.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/crypto/scheduler/scheduler_multicore.c 
b/drivers/crypto/scheduler/scheduler_multicore.c
index a21b522f9f..70f8a25b70 100644
--- a/drivers/crypto/scheduler/scheduler_multicore.c
+++ b/drivers/crypto/scheduler/scheduler_multicore.c
@@ -191,11 +191,11 @@ mc_scheduler_worker(struct rte_cryptodev *dev)
worker->qp_id,
&enq_ops[pending_enq_ops_idx],
pending_enq_ops);
-   if (processed_ops < pending_deq_ops)
+   if (processed_ops < pending_enq_ops)
scheduler_retrieve_sessions(
&enq_ops[pending_enq_ops_idx +
processed_ops],
-   pending_deq_ops - processed_ops);
+   pending_enq_ops - processed_ops);
pending_enq_ops -= processed_ops;
pending_enq_ops_idx += processed_ops;
inflight_ops += processed_ops;
-- 
2.43.0



Re: [PATCH v2] dts: fix runner target in the Dockerfile

2024-09-18 Thread Jeremy Spewock
On Wed, Sep 18, 2024 at 3:57 AM Juraj Linkeš  wrote:
>
>
> > diff --git a/dts/Dockerfile b/dts/Dockerfile
>
> > @@ -24,9 +27,12 @@ FROM base AS runner
>
> > +# Adds ~/.local/bin to PATH so that packages installed with pipx are 
> > callable. `pipx ensurepath`
> > +# fixes this issue, but requires the shell to be re-opened which isn't an 
> > option for this target.
>
> Let's explain this a bit more, I don't really know why this isn't an option.

The main reason it isn't an option is that it all happens in the same
`docker build` process, and it seems that however Docker decides to
create the layers, it doesn't refresh the terminal. I don't think there
is a way we could make it do so, but I can swap the "isn't an option"
part of the comment for something more like "and the build process
does not refresh the terminal in the required way before creating the
next layer."

>
> > +ENV PATH="$PATH:/root/.local/bin"
> > +RUN poetry install --only main --no-root
> >
> > -CMD ["poetry", "run", "python", "main.py"]
> > +ENTRYPOINT ["poetry", "run", "python", "main.py"]
> >


Re: [PATCH] eal/alarm_cancel: Fix thread starvation

2024-09-18 Thread Stephen Hemminger
On Wed, 18 Sep 2024 13:39:06 +0200
Wojciech Panfil  wrote:

> Issue:
> Two threads:
> 
> - A, executing rte_eal_alarm_cancel,
> - B, executing eal_alarm_callback.
> 
> Such case can cause starvation of thread B. Please see that there is a
> small time window between lock and unlock in thread A, so thread B must
> be switched to within a very small time window, so that it can obtain
> the lock.
> 
> Solution to this problem is use sched_yield(), which puts current thread
> (A) at the end of thread execution priority queue and allows thread B to
> execute.
> 
> The issue can be observed e.g. on hot-pluggable device detach path.
> On such path, rte_alarm can used to check if DPDK has completed
> the detachment. Waiting for completion, rte_eal_alarm_cancel
> is called, while another thread periodically calls eal_alarm_callback
> causing the issue to occur.
> 
> Signed-off-by: Wojciech Panfil 

Makes sense. The alarm runs in a non-EAL thread, and so does hotplug.

Acked-by: Stephen Hemminger 

Does the timer_stop code have similar issues?
Probably only if users do unexpected things like
mapping multiple logical lcores to the same CPU.


RE: [PATCH] app/test-eventdev: improve DMA adapter test

2024-09-18 Thread Amit Prakash Shukla


> -Original Message-
> From: pbhagavat...@marvell.com 
> Sent: Friday, August 23, 2024 11:43 AM
> To: Jerin Jacob ; Amit Prakash Shukla
> 
> Cc: dev@dpdk.org; Pavan Nikhilesh Bhagavatula
> 
> Subject: [PATCH] app/test-eventdev: improve DMA adapter test
> 
> From: Pavan Nikhilesh 
> 
> Move DMA ops to use mempool to prevent using the same ops before
> completion.
> This also allows us to measure forward latency.
> 
> Signed-off-by: Pavan Nikhilesh 
> ---
>  app/test-eventdev/test_perf_atq.c|  14 ++--
>  app/test-eventdev/test_perf_common.c | 106 ++-
> app/test-eventdev/test_perf_common.h |  63 +---  app/test-
> eventdev/test_perf_queue.c  |  14 ++--
>  4 files changed, 106 insertions(+), 91 deletions(-)
> 

Acked-by: Amit Prakash Shukla 

Thanks


[PATCH v9 5/6] eal: add unit tests for atomic bit access functions

2024-09-18 Thread Mattias Rönnblom
Extend bitops tests to cover the rte_bit_atomic_*() family of
functions.

Signed-off-by: Mattias Rönnblom 
Acked-by: Morten Brørup 
Acked-by: Tyler Retzlaff 
Acked-by: Jack Bond-Preston 

--

RFC v4:
 * Add atomicity test for atomic bit flip.

RFC v3:
 * Rename variable 'main' to make ICC happy.
---
 app/test/test_bitops.c | 313 -
 1 file changed, 312 insertions(+), 1 deletion(-)

diff --git a/app/test/test_bitops.c b/app/test/test_bitops.c
index 322f58c066..b80216a0a1 100644
--- a/app/test/test_bitops.c
+++ b/app/test/test_bitops.c
@@ -3,10 +3,13 @@
  * Copyright(c) 2024 Ericsson AB
  */
 
+#include 
 #include 
 
-#include 
 #include 
+#include 
+#include 
+#include 
 #include 
 #include "test.h"
 
@@ -61,6 +64,304 @@ GEN_TEST_BIT_ACCESS(test_bit_access32, rte_bit_set, 
rte_bit_clear,
 GEN_TEST_BIT_ACCESS(test_bit_access64, rte_bit_set, rte_bit_clear,
rte_bit_assign, rte_bit_flip, rte_bit_test, 64)
 
+#define bit_atomic_set(addr, nr)   \
+   rte_bit_atomic_set(addr, nr, rte_memory_order_relaxed)
+
+#define bit_atomic_clear(addr, nr) \
+   rte_bit_atomic_clear(addr, nr, rte_memory_order_relaxed)
+
+#define bit_atomic_assign(addr, nr, value) \
+   rte_bit_atomic_assign(addr, nr, value, rte_memory_order_relaxed)
+
+#define bit_atomic_flip(addr, nr)  \
+rte_bit_atomic_flip(addr, nr, rte_memory_order_relaxed)
+
+#define bit_atomic_test(addr, nr)  \
+   rte_bit_atomic_test(addr, nr, rte_memory_order_relaxed)
+
+GEN_TEST_BIT_ACCESS(test_bit_atomic_access32, bit_atomic_set,
+   bit_atomic_clear, bit_atomic_assign,
+   bit_atomic_flip, bit_atomic_test, 32)
+
+GEN_TEST_BIT_ACCESS(test_bit_atomic_access64, bit_atomic_set,
+   bit_atomic_clear, bit_atomic_assign,
+   bit_atomic_flip, bit_atomic_test, 64)
+
+#define PARALLEL_TEST_RUNTIME 0.25
+
+#define GEN_TEST_BIT_PARALLEL_ASSIGN(size) \
+   \
+   struct parallel_access_lcore ## size\
+   {   \
+   unsigned int bit;   \
+   uint ## size ##_t *word;\
+   bool failed;\
+   };  \
+   \
+   static int  \
+   run_parallel_assign ## size(void *arg)  \
+   {   \
+   struct parallel_access_lcore ## size *lcore = arg;  \
+   uint64_t deadline = rte_get_timer_cycles() +\
+   PARALLEL_TEST_RUNTIME * rte_get_timer_hz(); \
+   bool value = false; \
+   \
+   do {\
+   bool new_value = rte_rand() & 1;\
+   bool use_test_and_modify = rte_rand() & 1;  \
+   bool use_assign = rte_rand() & 1;   \
+   \
+   if (rte_bit_atomic_test(lcore->word, lcore->bit, \
+   rte_memory_order_relaxed) != 
value) { \
+   lcore->failed = true;   \
+   break;  \
+   }   \
+   \
+   if (use_test_and_modify) {  \
+   bool old_value; \
+   if (use_assign) \
+   old_value = 
rte_bit_atomic_test_and_assign( \
+   lcore->word, lcore->bit, 
new_value, \
+   rte_memory_order_relaxed); \
+   else {  \
+   old_value = new_value ? \
+   rte_bit_atomic_test_and_set( \
+   lcore->word, 
lcore->bit, \
+   
rte_mem

[PATCH v9 2/6] eal: extend bit manipulation functionality

2024-09-18 Thread Mattias Rönnblom
Add functionality to test and modify the value of individual bits in
32-bit or 64-bit words.

These functions have no implications for memory ordering or atomicity,
do not use volatile, and thus do not prevent any compiler
optimizations.

Signed-off-by: Mattias Rönnblom 
Acked-by: Morten Brørup 
Acked-by: Tyler Retzlaff 
Acked-by: Jack Bond-Preston 

--

PATCH v3:
 * Remove unnecessary  include.
 * Remove redundant 'fun' parameter from the __RTE_GEN_BIT_*() macros
   (Jack Bond-Preston).
 * Introduce __RTE_BIT_BIT_OPS() macro, consistent with how things
   are done when generating the atomic bit operations.
 * Refer to volatile bit op functions as variants instead of families
   (macro parameter naming).

RFC v6:
 * Have rte_bit_test() accept const-marked bitsets.

RFC v4:
 * Add rte_bit_flip() which, believe it or not, flips the value of a bit.
 * Mark macro-generated private functions as experimental.
 * Use macros to generate *assign*() functions.

RFC v3:
 * Work around lack of C++ support for _Generic (Tyler Retzlaff).
 * Fix ','-related checkpatch warnings.
---
 lib/eal/include/rte_bitops.h | 260 ++-
 1 file changed, 258 insertions(+), 2 deletions(-)

diff --git a/lib/eal/include/rte_bitops.h b/lib/eal/include/rte_bitops.h
index 449565eeae..6915b945ba 100644
--- a/lib/eal/include/rte_bitops.h
+++ b/lib/eal/include/rte_bitops.h
@@ -2,6 +2,7 @@
  * Copyright(c) 2020 Arm Limited
  * Copyright(c) 2010-2019 Intel Corporation
  * Copyright(c) 2023 Microsoft Corporation
+ * Copyright(c) 2024 Ericsson AB
  */
 
 #ifndef _RTE_BITOPS_H_
@@ -11,12 +12,14 @@
  * @file
  * Bit Operations
  *
- * This file defines a family of APIs for bit operations
- * without enforcing memory ordering.
+ * This file provides functionality for low-level, single-word
+ * arithmetic and bit-level operations, such as counting or
+ * setting individual bits.
  */
 
 #include 
 
+#include 
 #include 
 
 #ifdef __cplusplus
@@ -105,6 +108,197 @@ extern "C" {
 #define RTE_FIELD_GET64(mask, reg) \
((typeof(mask))(((reg) & (mask)) >> rte_ctz64(mask)))
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Test bit in word.
+ *
+ * Generic selection macro to test the value of a bit in a 32-bit or
+ * 64-bit word. The type of operation depends on the type of the @c
+ * addr parameter.
+ *
+ * This macro does not give any guarantees in regards to memory
+ * ordering or atomicity.
+ *
+ * @param addr
+ *   A pointer to the word to modify.
+ * @param nr
+ *   The index of the bit.
+ */
+#define rte_bit_test(addr, nr) \
+   _Generic((addr),\
+   uint32_t *: __rte_bit_test32,   \
+   const uint32_t *: __rte_bit_test32, \
+   uint64_t *: __rte_bit_test64,   \
+   const uint64_t *: __rte_bit_test64)(addr, nr)
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Set bit in word.
+ *
+ * Generic selection macro to set a bit in a 32-bit or 64-bit
+ * word. The type of operation depends on the type of the @c addr
+ * parameter.
+ *
+ * This macro does not give any guarantees in regards to memory
+ * ordering or atomicity.
+ *
+ * @param addr
+ *   A pointer to the word to modify.
+ * @param nr
+ *   The index of the bit.
+ */
+#define rte_bit_set(addr, nr)  \
+   _Generic((addr),\
+uint32_t *: __rte_bit_set32,   \
+uint64_t *: __rte_bit_set64)(addr, nr)
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Clear bit in word.
+ *
+ * Generic selection macro to clear a bit in a 32-bit or 64-bit
+ * word. The type of operation depends on the type of the @c addr
+ * parameter.
+ *
+ * This macro does not give any guarantees in regards to memory
+ * ordering or atomicity.
+ *
+ * @param addr
+ *   A pointer to the word to modify.
+ * @param nr
+ *   The index of the bit.
+ */
+#define rte_bit_clear(addr, nr)\
+   _Generic((addr),\
+uint32_t *: __rte_bit_clear32, \
+uint64_t *: __rte_bit_clear64)(addr, nr)
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Assign a value to a bit in word.
+ *
+ * Generic selection macro to assign a value to a bit in a 32-bit or 64-bit
+ * word. The type of operation depends on the type of the @c addr parameter.
+ *
+ * This macro does not give any guarantees in regards to memory
+ * ordering or atomicity.
+ *
+ * @param addr
+ *   A pointer to the word to modify.
+ * @param nr
+ *   The index of the bit.
+ * @param value
+ *   The new value of the bit - true for '1', or false for '0'.
+ */
+#define rte_bit_assign(addr, nr, value) 

[PATCH v9 3/6] eal: add unit tests for bit operations

2024-09-18 Thread Mattias Rönnblom
Extend bitops tests to cover the
rte_bit_[test|set|clear|assign|flip]()
functions.

The tests are converted to use the test suite runner framework.

Signed-off-by: Mattias Rönnblom 
Acked-by: Morten Brørup 
Acked-by: Tyler Retzlaff 
Acked-by: Jack Bond-Preston 

--

RFC v6:
 * Test rte_bit_*test() usage through const pointers.

RFC v4:
 * Remove redundant line continuations.
---
 app/test/test_bitops.c | 85 ++
 1 file changed, 70 insertions(+), 15 deletions(-)

diff --git a/app/test/test_bitops.c b/app/test/test_bitops.c
index 0d4ccfb468..322f58c066 100644
--- a/app/test/test_bitops.c
+++ b/app/test/test_bitops.c
@@ -1,13 +1,68 @@
 /* SPDX-License-Identifier: BSD-3-Clause
  * Copyright(c) 2019 Arm Limited
+ * Copyright(c) 2024 Ericsson AB
  */
 
+#include 
+
 #include 
 #include 
+#include 
 #include "test.h"
 
-uint32_t val32;
-uint64_t val64;
+#define GEN_TEST_BIT_ACCESS(test_name, set_fun, clear_fun, assign_fun, \
+   flip_fun, test_fun, size)   \
+   static int  \
+   test_name(void) \
+   {   \
+   uint ## size ## _t reference = (uint ## size ## _t)rte_rand(); \
+   unsigned int bit_nr;\
+   uint ## size ## _t word = (uint ## size ## _t)rte_rand(); \
+   \
+   for (bit_nr = 0; bit_nr < size; bit_nr++) { \
+   bool reference_bit = (reference >> bit_nr) & 1; \
+   bool assign = rte_rand() & 1;   \
+   if (assign) \
+   assign_fun(&word, bit_nr, reference_bit); \
+   else {  \
+   if (reference_bit)  \
+   set_fun(&word, bit_nr); \
+   else\
+   clear_fun(&word, bit_nr);   \
+   \
+   }   \
+   TEST_ASSERT(test_fun(&word, bit_nr) == reference_bit, \
+   "Bit %d had unexpected value", bit_nr); \
+   flip_fun(&word, bit_nr);\
+   TEST_ASSERT(test_fun(&word, bit_nr) != reference_bit, \
+   "Bit %d had unflipped value", bit_nr); \
+   flip_fun(&word, bit_nr);\
+   \
+   const uint ## size ## _t *const_ptr = &word;\
+   TEST_ASSERT(test_fun(const_ptr, bit_nr) ==  \
+   reference_bit,  \
+   "Bit %d had unexpected value", bit_nr); \
+   }   \
+   \
+   for (bit_nr = 0; bit_nr < size; bit_nr++) { \
+   bool reference_bit = (reference >> bit_nr) & 1; \
+   TEST_ASSERT(test_fun(&word, bit_nr) == reference_bit, \
+   "Bit %d had unexpected value", bit_nr); \
+   }   \
+   \
+   TEST_ASSERT(reference == word, "Word had unexpected value"); \
+   \
+   return TEST_SUCCESS;\
+   }
+
+GEN_TEST_BIT_ACCESS(test_bit_access32, rte_bit_set, rte_bit_clear,
+   rte_bit_assign, rte_bit_flip, rte_bit_test, 32)
+
+GEN_TEST_BIT_ACCESS(test_bit_access64, rte_bit_set, rte_bit_clear,
+   rte_bit_assign, rte_bit_flip, rte_bit_test, 64)
+
+static uint32_t val32;
+static uint64_t val64;
 
 #define MAX_BITS_32 32
 #define MAX_BITS_64 64
@@ -117,22 +172,22 @@ test_bit_relaxed_test_set_clear(void)
return TEST_SUCCESS;
 }
 
+static struct unit_test_suite test_suite = {
+   .suite_name = "Bitops test suite",
+   .unit_test_cases = {
+   TEST_CASE(test_bit_access32),
+   TEST_CASE(test_bit_access64),
+   TEST_CASE(test_bit_relaxed_set),
+   TEST_CASE(test_bit_relaxed_clear),
+   TEST_CASE(test_bit_relaxed_test_set_clear),
+   TEST_

Re: [DPDK/core Bug 1547] Build fails on FreeBSD 14.0

2024-09-18 Thread Bruce Richardson
On Wed, Sep 18, 2024 at 01:51:18AM +, bugzi...@dpdk.org wrote:
>Stephen Hemminger changed bug 1547
> 
>   What  Removed Added
>Status UNCONFIRMED RESOLVED
>Resolution --- INVALID
> 
>Comment # 1 on bug 1547 from Stephen Hemminger
> Missing elftools on the build VM

Thanks, this seems like something that should be caught earlier at the configure
stage. I thought that meson checks for pyelftools - is it the python
package or the underlying C library that was missing?

/Bruce


[RFC 0/4] ethdev: rework config restore

2024-09-18 Thread Dariusz Sosnowski
Hi all,

We have been working on optimizing the latency of calls to rte_eth_dev_start(),
on ports spawned by mlx5 PMD. Most of the work requires changes in the 
implementation of
.dev_start() PMD callback, but I also wanted to start a discussion regarding
configuration restore.

rte_eth_dev_start() does a few things on top of calling .dev_start() callback:

- Before calling it:
- eth_dev_mac_restore() - if device supports RTE_ETH_DEV_NOLIVE_MAC_ADDR;
- After calling it:
- eth_dev_mac_restore() - if device does not support 
RTE_ETH_DEV_NOLIVE_MAC_ADDR;
- restore promiscuous config
- restore all multicast config

eth_dev_mac_restore() iterates over all known MAC addresses -
stored in rte_eth_dev_data.mac_addrs array - and calls
.mac_addr_set() and .mac_addr_add() callbacks to apply these MAC addresses.

Promiscuous config restore checks if promiscuous mode is enabled or not,
and calls .promiscuous_enable() or .promiscuous_disable() callback.

All multicast config restore checks if all multicast mode is enabled or not,
and calls .allmulticast_enable() or .allmulticast_disable() callback.

Callbacks are called directly in all of these cases, to bypass the checks
for applying the same configuration, which exist in relevant APIs.
Checks are bypassed to force drivers to reapply the configuration.

Let's consider what happens in the following sequence of API calls.

1. rte_eth_dev_configure()
2. rte_eth_tx_queue_setup()
3. rte_eth_rx_queue_setup()
4. rte_eth_promiscuous_enable()
- Call dev->dev_ops->promiscuous_enable()
- Stores promiscuous state in dev->data->promiscuous
5. rte_eth_allmulticast_enable()
- Call dev->dev_ops->allmulticast_enable()
- Stores allmulticast state in dev->data->allmulticast
6. rte_eth_dev_start()
- Call dev->dev_ops->dev_start()
- Call dev->dev_ops->mac_addr_set() - apply default MAC address
- Call dev->dev_ops->promiscuous_enable()
- Call dev->dev_ops->allmulticast_enable()

Even though all of this configuration is available in dev->data after step 5,
the library forces it to be reapplied in step 6.

In the mlx5 PMD's case, all of the relevant callbacks require communication
with the kernel driver to configure the device (mlx5 PMD must create/destroy
kernel flow rules and/or change the netdev config).

mlx5 PMD handles applying all configuration in .dev_start(), so these forced
callbacks trigger additional communication with the kernel, and the same
configuration is applied multiple times.

As an optimization, mlx5 PMD could check if a given configuration was applied,
but this would duplicate the functionality of the library
(for example rte_eth_promiscuous_enable() does not call the driver
if dev->data->promiscuous is set).

Question: since all of the configuration is available before the .dev_start()
callback is called, why does the ethdev library not expect .dev_start() to take
this configuration into account? In other words, why does the library have to
reapply the configuration?

I could not find any particular reason why configuration restore exists
as part of the process (it was in the initial DPDK commit).

The patches included in this RFC propose a mechanism to help manage which
drivers rely on forceful configuration restore.
Drivers can advertise that they require it through an
`RTE_ETH_DEV_*_FORCE_RESTORE` device flag. If such a flag is set, the driver
in question requires ethdev to forcefully restore the configuration.

This way, if we were to conclude that it makes sense for .dev_start() to
handle all starting configuration aspects, we could track which drivers still
rely on configuration restore.

Dariusz Sosnowski (4):
  ethdev: rework config restore
  ethdev: omit promiscuous config restore if not required
  ethdev: omit all multicast config restore if not required
  ethdev: omit MAC address restore if not required

 lib/ethdev/rte_ethdev.c | 39 ++-
 lib/ethdev/rte_ethdev.h | 18 ++
 2 files changed, 52 insertions(+), 5 deletions(-)

--
2.39.5



[RFC 3/4] ethdev: omit all multicast config restore if not required

2024-09-18 Thread Dariusz Sosnowski
This patch adds a new device flag - RTE_ETH_DEV_ALLMULTI_FORCE_RESTORE.
By setting this flag, a device driver requires that the ethdev library
forcefully reapplies the allmulticast configuration
after the port is started.
As a result, unnecessary work can be removed from rte_eth_dev_start()
for drivers which apply all available configuration in dev_start()
(such drivers do not set the flag).

If RFC is approved, then the next version of this patch
should set the new flag for all drivers to maintain the same behavior,
until drivers adjust and it can be safely cleared.

Signed-off-by: Dariusz Sosnowski 
---
 lib/ethdev/rte_ethdev.c | 8 +---
 lib/ethdev/rte_ethdev.h | 6 ++
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
index ff08abd566..a08922a78a 100644
--- a/lib/ethdev/rte_ethdev.c
+++ b/lib/ethdev/rte_ethdev.c
@@ -1732,9 +1732,11 @@ eth_dev_config_restore(struct rte_eth_dev *dev,
return ret;
}
 
-   ret = eth_dev_allmulticast_restore(dev, port_id);
-   if (ret != 0)
-   return ret;
+   if (*dev_info->dev_flags & RTE_ETH_DEV_ALLMULTI_FORCE_RESTORE) {
+   ret = eth_dev_allmulticast_restore(dev, port_id);
+   if (ret != 0)
+   return ret;
+   }
 
return 0;
 }
diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index 0fc23fb924..73405dd17d 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -2126,6 +2126,12 @@ struct rte_eth_dev_owner {
  * after driver's dev_start() callback is called.
  */
 #define RTE_ETH_DEV_PROMISC_FORCE_RESTORE RTE_BIT32(7)
+/**
+ * If this flag is set, then device driver requires that
+ * ethdev library forcefully reapplies allmulticast configuration,
+ * after driver's dev_start() callback is called.
+ */
+#define RTE_ETH_DEV_ALLMULTI_FORCE_RESTORE RTE_BIT32(8)
 /**@}*/
 
 /**
-- 
2.39.5



[RFC 4/4] ethdev: omit MAC address restore if not required

2024-09-18 Thread Dariusz Sosnowski
This patch adds a new device flag - RTE_ETH_DEV_MAC_ADDR_FORCE_RESTORE.
By setting this flag, a device driver requires that the ethdev library
forcefully reapplies the configured MAC addresses
after the port is started.
As a result, unnecessary work can be removed from rte_eth_dev_start()
for drivers which apply all available configuration in dev_start()
(such drivers do not set the flag).

If RFC is approved, then the next version of this patch
should set the new flag for all drivers to maintain the same behavior,
until drivers adjust and it can be safely cleared.

Signed-off-by: Dariusz Sosnowski 
---
 lib/ethdev/rte_ethdev.c | 3 ++-
 lib/ethdev/rte_ethdev.h | 6 ++
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
index a08922a78a..e4bb40cad8 100644
--- a/lib/ethdev/rte_ethdev.c
+++ b/lib/ethdev/rte_ethdev.c
@@ -1723,7 +1723,8 @@ eth_dev_config_restore(struct rte_eth_dev *dev,
 {
int ret;
 
-   if (!(*dev_info->dev_flags & RTE_ETH_DEV_NOLIVE_MAC_ADDR))
+   if ((*dev_info->dev_flags & RTE_ETH_DEV_MAC_ADDR_FORCE_RESTORE) &&
+   !(*dev_info->dev_flags & RTE_ETH_DEV_NOLIVE_MAC_ADDR))
eth_dev_mac_restore(dev, dev_info);
 
if (*dev_info->dev_flags & RTE_ETH_DEV_PROMISC_FORCE_RESTORE) {
diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index 73405dd17d..deab07fbc0 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -2132,6 +2132,12 @@ struct rte_eth_dev_owner {
  * after driver's dev_start() callback is called.
  */
 #define RTE_ETH_DEV_ALLMULTI_FORCE_RESTORE RTE_BIT32(8)
+/**
+ * If this flag is set, then device driver requires that
+ * ethdev library forcefully reapplies active MAC addresses,
+ * after driver's dev_start() callback is called.
+ */
+#define RTE_ETH_DEV_MAC_ADDR_FORCE_RESTORE RTE_BIT32(9)
 /**@}*/
 
 /**
-- 
2.39.5



[RFC 2/4] ethdev: omit promiscuous config restore if not required

2024-09-18 Thread Dariusz Sosnowski
This patch adds a new device flag - RTE_ETH_DEV_PROMISC_FORCE_RESTORE.
By setting this flag, a device driver requires that the ethdev library
forcefully reapplies the promiscuous mode configuration
after the port is started.
As a result, unnecessary work can be removed from rte_eth_dev_start()
for drivers which apply all available configuration in dev_start()
(such drivers do not set the flag).

If RFC is approved, then the next version of this patch
should set the new flag for all drivers to maintain the same behavior,
until drivers adjust and it can be safely cleared.

Signed-off-by: Dariusz Sosnowski 
---
 lib/ethdev/rte_ethdev.c | 8 +---
 lib/ethdev/rte_ethdev.h | 6 ++
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
index 362a1883f0..ff08abd566 100644
--- a/lib/ethdev/rte_ethdev.c
+++ b/lib/ethdev/rte_ethdev.c
@@ -1726,9 +1726,11 @@ eth_dev_config_restore(struct rte_eth_dev *dev,
if (!(*dev_info->dev_flags & RTE_ETH_DEV_NOLIVE_MAC_ADDR))
eth_dev_mac_restore(dev, dev_info);
 
-   ret = eth_dev_promiscuous_restore(dev, port_id);
-   if (ret != 0)
-   return ret;
+   if (*dev_info->dev_flags & RTE_ETH_DEV_PROMISC_FORCE_RESTORE) {
+   ret = eth_dev_promiscuous_restore(dev, port_id);
+   if (ret != 0)
+   return ret;
+   }
 
ret = eth_dev_allmulticast_restore(dev, port_id);
if (ret != 0)
diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index 548fada1c7..0fc23fb924 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -2120,6 +2120,12 @@ struct rte_eth_dev_owner {
  * PMDs filling the queue xstats themselves should not set this flag
  */
 #define RTE_ETH_DEV_AUTOFILL_QUEUE_XSTATS RTE_BIT32(6)
+/**
+ * If this flag is set, then device driver requires that
+ * ethdev library forcefully reapplies promiscuous mode configuration,
+ * after driver's dev_start() callback is called.
+ */
+#define RTE_ETH_DEV_PROMISC_FORCE_RESTORE RTE_BIT32(7)
 /**@}*/
 
 /**
-- 
2.39.5



[RFC 1/4] ethdev: rework config restore

2024-09-18 Thread Dariusz Sosnowski
Extract promiscuous and all multicast configuration restore
into separate functions.
This change will make it easier to disable
these procedures for supporting PMDs in follow-up commits.

Signed-off-by: Dariusz Sosnowski 
---
 lib/ethdev/rte_ethdev.c | 34 +-
 1 file changed, 29 insertions(+), 5 deletions(-)

diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
index f1c658f49e..362a1883f0 100644
--- a/lib/ethdev/rte_ethdev.c
+++ b/lib/ethdev/rte_ethdev.c
@@ -1648,14 +1648,10 @@ eth_dev_mac_restore(struct rte_eth_dev *dev,
 }
 
 static int
-eth_dev_config_restore(struct rte_eth_dev *dev,
-   struct rte_eth_dev_info *dev_info, uint16_t port_id)
+eth_dev_promiscuous_restore(struct rte_eth_dev *dev, uint16_t port_id)
 {
int ret;
 
-   if (!(*dev_info->dev_flags & RTE_ETH_DEV_NOLIVE_MAC_ADDR))
-   eth_dev_mac_restore(dev, dev_info);
-
/* replay promiscuous configuration */
/*
 * use callbacks directly since we don't need port_id check and
@@ -1683,6 +1679,14 @@ eth_dev_config_restore(struct rte_eth_dev *dev,
}
}
 
+   return 0;
+}
+
+static int
+eth_dev_allmulticast_restore(struct rte_eth_dev *dev, uint16_t port_id)
+{
+   int ret;
+
/* replay all multicast configuration */
/*
 * use callbacks directly since we don't need port_id check and
@@ -1713,6 +1717,26 @@ eth_dev_config_restore(struct rte_eth_dev *dev,
return 0;
 }
 
+static int
+eth_dev_config_restore(struct rte_eth_dev *dev,
+   struct rte_eth_dev_info *dev_info, uint16_t port_id)
+{
+   int ret;
+
+   if (!(*dev_info->dev_flags & RTE_ETH_DEV_NOLIVE_MAC_ADDR))
+   eth_dev_mac_restore(dev, dev_info);
+
+   ret = eth_dev_promiscuous_restore(dev, port_id);
+   if (ret != 0)
+   return ret;
+
+   ret = eth_dev_allmulticast_restore(dev, port_id);
+   if (ret != 0)
+   return ret;
+
+   return 0;
+}
+
 int
 rte_eth_dev_start(uint16_t port_id)
 {
-- 
2.39.5



RE: [PATCH v7 1/7] eal: add static per-lcore memory allocation facility

2024-09-18 Thread Konstantin Ananyev


> Introduce DPDK per-lcore id variables, or lcore variables for short.
> 
> An lcore variable has one value for every current and future lcore
> id-equipped thread.
> 
> The primary  use case is for statically allocating
> small, frequently-accessed data structures, for which one instance
> should exist for each lcore.
> 
> Lcore variables are similar to thread-local storage (TLS, e.g., C11
> _Thread_local), but decoupling the values' life time with that of the
> threads.
> 
> Lcore variables are also similar in terms of functionality provided by
> FreeBSD kernel's DPCPU_*() family of macros and the associated
> build-time machinery. DPCPU uses linker scripts, which effectively
> prevents the reuse of its, otherwise seemingly viable, approach.
> 
> The currently-prevailing way to solve the same problem as lcore
> variables is to keep a module's per-lcore data as RTE_MAX_LCORE-sized
> array of cache-aligned, RTE_CACHE_GUARDed structs. The benefit of
> lcore variables over this approach is that data related to the same
> lcore now is close (spatially, in memory), rather than data used by
> the same module, which in turn avoid excessive use of padding,
> polluting caches with unused data.
> 
> Signed-off-by: Mattias Rönnblom 
> Acked-by: Morten Brørup 
> 
> --

Acked-by: Konstantin Ananyev 

> 2.34.1
> 



Re: [PATCH v3 11/12] dts: add Rx offload capabilities

2024-09-18 Thread Juraj Linkeš




On 29. 8. 2024 17:40, Jeremy Spewock wrote:

On Wed, Aug 28, 2024 at 1:44 PM Jeremy Spewock  wrote:


On Wed, Aug 21, 2024 at 10:53 AM Juraj Linkeš
 wrote:


diff --git a/dts/framework/remote_session/testpmd_shell.py 
b/dts/framework/remote_session/testpmd_shell.py
index 48c31124d1..f83569669e 100644
--- a/dts/framework/remote_session/testpmd_shell.py
+++ b/dts/framework/remote_session/testpmd_shell.py
@@ -659,6 +659,103 @@ class TestPmdPortStats(TextParser):
  tx_bps: int = field(metadata=TextParser.find_int(r"Tx-bps:\s+(\d+)"))


+class RxOffloadCapability(Flag):
+"""Rx offload capabilities of a device."""
+
+#:
+RX_OFFLOAD_VLAN_STRIP = auto()


One other thought that I had about this; was there a specific reason
that you decided to prefix all of these with `RX_OFFLOAD_`? I am
working on a test suite right now that uses both RX and TX offloads
and thought that it would be a great use of capabilities, so I am
working on adding a TxOffloadCapability flag as well and, since the
output is essentially the same, it made a lot of sense to make it a
sibling class of this one with similar parsing functionality. In what
I was writing, I found it much easier to remove this prefix so that
the parsing method can be the same for both RX and TX, and I didn't
have to restate some options that are shared between both (like
IPv4_CKSUM, UDP_CKSUM, etc.). Is there a reason you can think of why
removing this prefix is a bad idea? Hopefully I will have a patch out
soon that shows this extension that I've made so that you can see
in-code what I was thinking.


I see now that you actually already answered this question, I was just
looking too much at that piece of code, and clearly not looking
further down at the helper-method mapping or the commit message that
you left :).

"The Flag members correspond to NIC
capability names so a convenience function that looks for the supported
Flags in a testpmd output is also added."

Having it prefixed with RX_OFFLOAD_ in NicCapability makes a lot of
sense since it is more explicit. Since there is a good reason to have
it like this, then the redundancy makes sense I think. There are some
ways to potentially avoid this like creating a StrFlag class that
overrides the __str__ method, or something like an additional type
that would contain a toString method, but it feels very situational
and specific to this one use-case so it probably isn't going to be
super valuable. Another thing I could think of to do would be allowing
the user to pass in a function or something to the helper-method that
mapped Flag names to their respective NicCapability name, or just
doing it in the method that gets the offloads instead of using a
helper at all, but this also just makes it more complicated and maybe
it isn't worth it.



I also had it without the prefix, but then I also realized it's needed 
in NicCapability, so this is where I ended up. I'm not sure complicating 
things to remove the prefix is worth it, especially when these names are 
basically only used internally. The prefix could actually confer some 
benefit if the name appears in a log somewhere (although overriding 
__str__ could be the way; maybe I'll think about that).



I apologize for asking you about something that you already explained,
but maybe something we can get out of this is that, since these names
have to be consistent, it might be worth putting that in the
doc-strings of the flag for when people try to make further expansions
or changes in the future. Or it could also be generally clear that
flags used for capabilities should follow this idea, let me know what
you think.



Adding things to docstring is usually a good thing. What should I 
document? I guess the correspondence between the flag and NicCapability, 
anything else?





+#: Device supports L3 checksum offload.
+RX_OFFLOAD_IPV4_CKSUM = auto()
+#: Device supports L4 checksum offload.
+RX_OFFLOAD_UDP_CKSUM = auto()
+#: Device supports L4 checksum offload.
+RX_OFFLOAD_TCP_CKSUM = auto()
+#: Device supports Large Receive Offload.
+RX_OFFLOAD_TCP_LRO = auto()
+#: Device supports QinQ (queue in queue) offload.
+RX_OFFLOAD_QINQ_STRIP = auto()
+#: Device supports inner packet L3 checksum.
+RX_OFFLOAD_OUTER_IPV4_CKSUM = auto()
+#: Device supports MACsec.
+RX_OFFLOAD_MACSEC_STRIP = auto()
+#: Device supports filtering of a VLAN Tag identifier.
+RX_OFFLOAD_VLAN_FILTER = 1 << 9
+#: Device supports VLAN offload.
+RX_OFFLOAD_VLAN_EXTEND = auto()
+#: Device supports receiving segmented mbufs.
+RX_OFFLOAD_SCATTER = 1 << 13
+#: Device supports Timestamp.
+RX_OFFLOAD_TIMESTAMP = auto()
+#: Device supports crypto processing while packet is received in NIC.
+RX_OFFLOAD_SECURITY = auto()
+#: Device supports CRC stripping.
+RX_OFFLOAD_KEEP_CRC = auto()
+#: Device supports L4 checksum offload.
+RX_OFFLOAD_SCTP_CKSUM = auto()
+#: Device supports inner packet L4 

Re: [PATCH v23 11/15] log: add timestamp option

2024-09-18 Thread Stephen Hemminger
On Wed, 18 Sep 2024 15:37:49 +0800
fengchengwen  wrote:

> ...
> 
> > +
> > +static enum {
> > +   LOG_TIMESTAMP_NONE = 0,
> > +   LOG_TIMESTAMP_TIME, /* time since start */
> > +   LOG_TIMESTAMP_DELTA,/* time since last message */
> > +   LOG_TIMESTAMP_RELTIME,  /* relative time since last message */
> > +   LOG_TIMESTAMP_CTIME,/* Unix standard time format */
> > +   LOG_TIMESTAMP_ISO,  /* ISO8601 time format */  
> 
> Some of the impl should consider multiple-thread safety.
> 
> And for multiple-process, how about having the secondary processes align 
> with the main process?


As much as possible, they are thread safe; that is why localtime_r is used.
Of course, if multiple threads are printing, it is possible that timestamps
could be out of order. I.e., CPU A gets a timestamp and is formatting a message,
and CPU B gets a timestamp and is formatting a message; the formatting might
take longer for A.


[PATCH v7 1/7] eal: add static per-lcore memory allocation facility

2024-09-18 Thread Mattias Rönnblom
Introduce DPDK per-lcore id variables, or lcore variables for short.

An lcore variable has one value for every current and future lcore
id-equipped thread.

The primary use case is for statically allocating
small, frequently-accessed data structures, for which one instance
should exist for each lcore.

Lcore variables are similar to thread-local storage (TLS, e.g., C11
_Thread_local), but with the values' lifetime decoupled from that of the
threads.

Lcore variables are also similar in functionality to the FreeBSD
kernel's DPCPU_*() family of macros and the associated
build-time machinery. DPCPU uses linker scripts, which effectively
prevents the reuse of its, otherwise seemingly viable, approach.

The currently-prevailing way to solve the same problem as lcore
variables is to keep a module's per-lcore data as an RTE_MAX_LCORE-sized
array of cache-aligned, RTE_CACHE_GUARDed structs. The benefit of
lcore variables over this approach is that data related to the same
lcore is now close (spatially, in memory), rather than data used by
the same module, which in turn avoids excessive use of padding and
polluting caches with unused data.

Signed-off-by: Mattias Rönnblom 
Acked-by: Morten Brørup 

--

PATCH v7:
 * Add () to the FOREACH lcore id macro parameter, to allow arbitrary
   expression, not just a simple variable name, being passed.
   (Konstantin Ananyev)

PATCH v6:
 * Have API user provide the loop variable in the FOREACH macro, to
   avoid subtle bugs where the loop variable name clashes with some
   other user-defined variable. (Konstantin Ananyev)

PATCH v5:
 * Update EAL programming guide.

PATCH v2:
 * Add Windows support. (Morten Brørup)
 * Fix lcore variables API index reference. (Morten Brørup)
 * Various improvements of the API documentation. (Morten Brørup)
 * Elimination of unused symbol in version.map. (Morten Brørup)

PATCH:
 * Update MAINTAINERS and release notes.
 * Stop covering included files in extern "C" {}.

RFC v6:
 * Include <stdlib.h> to get aligned_alloc().
 * Tweak documentation (grammar).
 * Provide API-level guarantees that lcore variable values take on an
   initial value of zero.
 * Fix misplaced __rte_cache_aligned in the API doc example.

RFC v5:
 * In Doxygen, consistently use @ (and not \).
 * The RTE_LCORE_VAR_GET() and SET() convenience access macros
   covered an uncommon use case, where the lcore value is of a
   primitive type rather than a struct, and are thus eliminated
   from the API. (Morten Brørup)
 * In the wake of the GET()/SET() removal, rename RTE_LCORE_VAR_PTR()
   to RTE_LCORE_VAR_VALUE().
 * The underscores are removed from __rte_lcore_var_lcore_ptr() to
   signal that this function is a part of the public API.
 * Macro arguments are documented.

RFC v4:
 * Replace large static array with libc heap-allocated memory. One
   implication of this change is there no longer exists a fixed upper
   bound for the total amount of memory used by lcore variables.
   RTE_MAX_LCORE_VAR has changed meaning, and now represent the
   maximum size of any individual lcore variable value.
 * Fix issues in example. (Morten Brørup)
 * Improve access macro type checking. (Morten Brørup)
 * Refer to the lcore variable handle as "handle" and not "name" in
   various macros.
 * Document lack of thread safety in rte_lcore_var_alloc().
 * Provide API-level assurance the lcore variable handle is
   always non-NULL, to allow applications to use NULL to mean
   "not yet allocated".
 * Note zero-sized allocations are not allowed.
 * Give API-level guarantee the lcore variable values are zeroed.

RFC v3:
 * Replace use of GCC-specific alignof() with alignof().
 * Update example to reflect FOREACH macro name change (in RFC v2).

RFC v2:
 * Use alignof to derive alignment requirements. (Morten Brørup)
 * Change name of FOREACH to make it distinct from <rte_lcore.h>'s
   *per-EAL-thread* RTE_LCORE_FOREACH(). (Morten Brørup)
 * Allow user-specified alignment, but limit max to cache line size.
---
 MAINTAINERS   |   6 +
 config/rte_config.h   |   1 +
 doc/api/doxy-api-index.md |   1 +
 .../prog_guide/env_abstraction_layer.rst  |  45 +-
 doc/guides/rel_notes/release_24_11.rst|  14 +
 lib/eal/common/eal_common_lcore_var.c |  79 
 lib/eal/common/meson.build|   1 +
 lib/eal/include/meson.build   |   1 +
 lib/eal/include/rte_lcore_var.h   | 390 ++
 lib/eal/version.map   |   2 +
 10 files changed, 534 insertions(+), 6 deletions(-)
 create mode 100644 lib/eal/common/eal_common_lcore_var.c
 create mode 100644 lib/eal/include/rte_lcore_var.h

diff --git a/MAINTAINERS b/MAINTAINERS
index c5a703b5c0..362d9a3f28 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -282,6 +282,12 @@ F: lib/eal/include/rte_random.h
 F: lib/eal/common/rte_random.c
 F: app/test/test_rand_perf.c
 
+Lcore Variables
+M: Mattias Rönnblom 
+F: lib/eal/include/rte_lcore_var.h
+F

RE: [PATCH v2 1/4] power: refactor core power management library

2024-09-18 Thread Tummala, Sivaprasad
[AMD Official Use Only - AMD Internal Distribution Only]

> -Original Message-
> From: lihuisong (C) 
> Sent: Friday, September 13, 2024 1:05 PM
> To: Tummala, Sivaprasad 
> Cc: dev@dpdk.org; david.h...@intel.com; anatoly.bura...@intel.com;
> radu.nico...@intel.com; cristian.dumitre...@intel.com; jer...@marvell.com;
> konstantin.anan...@huawei.com; Yigit, Ferruh ;
> gak...@marvell.com
> Subject: Re: [PATCH v2 1/4] power: refactor core power management library
>
> Caution: This message originated from an External Source. Use proper caution
> when opening attachments, clicking links, or responding.
>
>
> 在 2024/9/12 19:17, Tummala, Sivaprasad 写道:
> >
> > Hi Huisong,
> >
> > Please find my response inline.
> >
> >> -Original Message-
> >> From: lihuisong (C) 
> >> Sent: Tuesday, August 27, 2024 1:51 PM
> >> To: Tummala, Sivaprasad 
> >> Cc: dev@dpdk.org; david.h...@intel.com; anatoly.bura...@intel.com;
> >> radu.nico...@intel.com; cristian.dumitre...@intel.com;
> >> jer...@marvell.com; konstantin.anan...@huawei.com; Yigit, Ferruh
> >> ; gak...@marvell.com
> >> Subject: Re: [PATCH v2 1/4] power: refactor core power management
> >> library
> >>
> >>
> >>
> >> Hi Sivaprasa,
> >>
> >> Some comments inline.
> >>
> >> /Huisong
> >>
> >> 在 2024/8/26 21:06, Sivaprasad Tummala 写道:
> >>> This patch introduces a comprehensive refactor to the core power
> >>> management library. The primary focus is on improving modularity and
> >>> organization by relocating specific driver implementations from the
> >>> 'lib/power' directory to dedicated directories within
> >>> 'drivers/power/core/*'. The adjustment of meson.build files enables
> >>> the selective activation of individual drivers.
> >>> These changes contribute to a significant enhancement in code
> >>> organization, providing a clearer structure for driver implementations.
> >>> The refactor aims to improve overall code clarity and boost
> >>> maintainability. Additionally, it establishes a foundation for
> >>> future development, allowing for more focused work on individual
> >>> drivers and seamless integration of forthcoming enhancements.
> >>>
> >>> v2:
> >>>- added NULL check for global_core_ops in rte_power_get_core_ops
> >>>
> >>> Signed-off-by: Sivaprasad Tummala 
> >>> ---
> >>>drivers/meson.build   |   1 +
> >>>.../power/acpi/acpi_cpufreq.c |  22 +-
> >>>.../power/acpi/acpi_cpufreq.h |   6 +-
> >>>drivers/power/acpi/meson.build|  10 +
> >>>.../power/amd_pstate/amd_pstate_cpufreq.c |  24 +-
> >>>.../power/amd_pstate/amd_pstate_cpufreq.h |   8 +-
> >>>drivers/power/amd_pstate/meson.build  |  10 +
> >>>.../power/cppc/cppc_cpufreq.c |  22 +-
> >>>.../power/cppc/cppc_cpufreq.h |   8 +-
> >>>drivers/power/cppc/meson.build|  10 +
> >>>.../power/kvm_vm}/guest_channel.c |   0
> >>>.../power/kvm_vm}/guest_channel.h |   0
> >>>.../power/kvm_vm/kvm_vm.c |  22 +-
> >>>.../power/kvm_vm/kvm_vm.h |   6 +-
> >>>drivers/power/kvm_vm/meson.build  |  16 +
> >>>drivers/power/meson.build |  12 +
> >>>drivers/power/pstate/meson.build  |  10 +
> >>>.../power/pstate/pstate_cpufreq.c |  22 +-
> >>>.../power/pstate/pstate_cpufreq.h |   6 +-
> >>>lib/power/meson.build |   7 +-
> >>>lib/power/power_common.c  |   2 +-
> >>>lib/power/power_common.h  |  16 +-
> >>>lib/power/rte_power.c | 291 ++
> >>>lib/power/rte_power.h | 139 ++---
> >>>lib/power/rte_power_core_ops.h| 208 +
> >>>lib/power/version.map |  14 +
> >>>26 files changed, 621 insertions(+), 271 deletions(-)
> >>>rename lib/power/power_acpi_cpufreq.c =>
> >>> drivers/power/acpi/acpi_cpufreq.c
> >> (95%)
> >>>rename lib/power/power_acpi_cpufreq.h =>
> >>> drivers/power/acpi/acpi_cpufreq.h
> >> (98%)
> >>>create mode 100644 drivers/power/acpi/meson.build
> >>>rename lib/power/power_amd_pstate_cpufreq.c =>
> >> drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
> >>>rename lib/power/power_amd_pstate_cpufreq.h =>
> >> drivers/power/amd_pstate/amd_pstate_cpufreq.h (97%)
> >>>create mode 100644 drivers/power/amd_pstate/meson.build
> >>>rename lib/power/power_cppc_cpufreq.c =>
> >>> drivers/power/cppc/cppc_cpufreq.c
> >> (95%)
> >>>rename lib/power/power_cppc_cpufreq.h =>
> >>> drivers/power/cppc/cppc_cpufreq.h
> >> (97%)
> >>>cre

[PATCH v7 3/7] eal: add lcore variable performance test

2024-09-18 Thread Mattias Rönnblom
Add a basic micro benchmark for lcore variables, in an attempt to assure
that the overhead isn't significantly greater than that of alternative
approaches, in scenarios where the benefits aren't expected to show up
(i.e., when plenty of cache is available compared to the working set
size of the per-lcore data).

Signed-off-by: Mattias Rönnblom 

--

PATCH v6:
 * Use floating point math when calculating per-update latency.
   (Morten Brørup)

PATCH v5:
 * Add variant of thread-local storage with initialization performed
   at the time of thread creation to the benchmark scenarios. (Morten
   Brørup)

PATCH v4:
 * Rework the tests to be a little less unrealistic. Instead of a
   single dummy module using a single variable, use a number of
   variables/modules. In this way, differences in cache effects may
   show up.
 * Add RTE_CACHE_GUARD to better mimic that static array pattern.
   (Morten Brørup)
 * Show latencies as TSC cycles. (Morten Brørup)
---
 app/test/meson.build   |   1 +
 app/test/test_lcore_var_perf.c | 257 +
 2 files changed, 258 insertions(+)
 create mode 100644 app/test/test_lcore_var_perf.c

diff --git a/app/test/meson.build b/app/test/meson.build
index 48279522f0..d4e0c59900 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -104,6 +104,7 @@ source_file_deps = {
 'test_kvargs.c': ['kvargs'],
 'test_latencystats.c': ['ethdev', 'latencystats', 'metrics'] + 
sample_packet_forward_deps,
 'test_lcore_var.c': [],
+'test_lcore_var_perf.c': [],
 'test_lcores.c': [],
 'test_link_bonding.c': ['ethdev', 'net_bond',
 'net'] + packet_burst_generator_deps + virtual_pmd_deps,
diff --git a/app/test/test_lcore_var_perf.c b/app/test/test_lcore_var_perf.c
new file mode 100644
index 00..2680bfb6f7
--- /dev/null
+++ b/app/test/test_lcore_var_perf.c
@@ -0,0 +1,257 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Ericsson AB
+ */
+
+#define MAX_MODS 1024
+
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "test.h"
+
+struct mod_lcore_state {
+   uint64_t a;
+   uint64_t b;
+   uint64_t sum;
+};
+
+static void
+mod_init(struct mod_lcore_state *state)
+{
+   state->a = rte_rand();
+   state->b = rte_rand();
+   state->sum = 0;
+}
+
+static __rte_always_inline void
+mod_update(volatile struct mod_lcore_state *state)
+{
+   state->sum += state->a * state->b;
+}
+
+struct __rte_cache_aligned mod_lcore_state_aligned {
+   struct mod_lcore_state mod_state;
+
+   RTE_CACHE_GUARD;
+};
+
+static struct mod_lcore_state_aligned
+sarray_lcore_state[MAX_MODS][RTE_MAX_LCORE];
+
+static void
+sarray_init(void)
+{
+   unsigned int lcore_id = rte_lcore_id();
+   int mod;
+
+   for (mod = 0; mod < MAX_MODS; mod++) {
+   struct mod_lcore_state *mod_state =
+   &sarray_lcore_state[mod][lcore_id].mod_state;
+
+   mod_init(mod_state);
+   }
+}
+
+static __rte_noinline void
+sarray_update(unsigned int mod)
+{
+   unsigned int lcore_id = rte_lcore_id();
+   struct mod_lcore_state *mod_state =
+   &sarray_lcore_state[mod][lcore_id].mod_state;
+
+   mod_update(mod_state);
+}
+
+struct mod_lcore_state_lazy {
+   struct mod_lcore_state mod_state;
+   bool initialized;
+};
+
+/*
+ * Note: it's usually a bad idea to have this much thread-local storage
+ * allocated in a real application, since it will incur a cost on
+ * thread creation and non-lcore thread memory usage.
+ */
+static RTE_DEFINE_PER_LCORE(struct mod_lcore_state_lazy,
+   tls_lcore_state)[MAX_MODS];
+
+static inline void
+tls_init(struct mod_lcore_state_lazy *state)
+{
+   mod_init(&state->mod_state);
+
+   state->initialized = true;
+}
+
+static __rte_noinline void
+tls_lazy_update(unsigned int mod)
+{
+   struct mod_lcore_state_lazy *state =
+   &RTE_PER_LCORE(tls_lcore_state[mod]);
+
+   /* With thread-local storage, initialization must usually be lazy */
+   if (!state->initialized)
+   tls_init(state);
+
+   mod_update(&state->mod_state);
+}
+
+static __rte_noinline void
+tls_update(unsigned int mod)
+{
+   struct mod_lcore_state_lazy *state =
+   &RTE_PER_LCORE(tls_lcore_state[mod]);
+
+   mod_update(&state->mod_state);
+}
+
+RTE_LCORE_VAR_HANDLE(struct mod_lcore_state, lvar_lcore_state)[MAX_MODS];
+
+static void
+lvar_init(void)
+{
+   unsigned int mod;
+
+   for (mod = 0; mod < MAX_MODS; mod++) {
+   RTE_LCORE_VAR_ALLOC(lvar_lcore_state[mod]);
+
+   struct mod_lcore_state *state =
+   RTE_LCORE_VAR_VALUE(lvar_lcore_state[mod]);
+
+   mod_init(state);
+   }
+}
+
+static __rte_noinline void
+lvar_update(unsigned int mod)
+{
+   struct mod_lcore_state *state =
+   RTE_LCORE_VAR_VALUE(lvar_lcore_state[mod]);
+
+   mod_update(state);
+}
+
+static 

[PATCH v7 0/7] Lcore variables

2024-09-18 Thread Mattias Rönnblom
This patch set introduces a new API, <rte_lcore_var.h>, for static
per-lcore id data allocation.

Please refer to the <rte_lcore_var.h> API documentation for both a
rationale for this new API and a comparison to the alternatives
available.

The adoption of this API would affect many different DPDK modules, but
the author updated only a few, mostly to serve as examples in this
RFC, and to iron out some, but surely not all, wrinkles in the API.

The question on how to best allocate static per-lcore memory has been
up several times on the dev mailing list, for example in the thread on
"random: use per lcore state" RFC by Stephen Hemminger.

Lcore variables are surely not the answer to all your per-lcore-data
needs, since they only allow for more-or-less static allocation. In the
author's opinion, they do however provide a reasonably simple, clean,
and seemingly very performant solution to a real problem.
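For readers skimming the archive, the allocation scheme the series proposes — one handle per variable, with every lcore's instance carved out of a common per-lcore area — can be modeled in a few lines of plain C. This is a hypothetical miniature sketch of the idea only; the names below are invented (the real API lives in <rte_lcore_var.h> and additionally handles alignment and dynamic buffer growth):

```c
#include <assert.h>
#include <stddef.h>

/* Invented stand-ins for the real RTE_LCORE_VAR_* facility: all lcores'
 * instances of a variable live in one static buffer, and a "handle" is
 * simply the address of lcore 0's instance. */
#define MINI_MAX_LCORES 4

static char mini_lcore_buf[4096];
static size_t mini_lcore_used;

/* Allocate one value of 'size' bytes per lcore; the returned handle
 * doubles as the pointer to lcore 0's instance. (No alignment handling
 * here, unlike the real implementation.) */
static void *
mini_lcore_var_alloc(size_t size)
{
	void *handle = &mini_lcore_buf[mini_lcore_used];

	mini_lcore_used += size * MINI_MAX_LCORES;
	return handle;
}

/* Resolve the instance belonging to a particular lcore id. */
static void *
mini_lcore_var_value(void *handle, size_t size, unsigned int lcore_id)
{
	return (char *)handle + size * lcore_id;
}
```

Unlike the lazy thread-local-storage pattern shown earlier in the thread, nothing here needs an "initialized" flag: all instances can be set up front, before the worker lcores start.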

Mattias Rönnblom (7):
  eal: add static per-lcore memory allocation facility
  eal: add lcore variable functional tests
  eal: add lcore variable performance test
  random: keep PRNG state in lcore variable
  power: keep per-lcore state in lcore variable
  service: keep per-lcore state in lcore variable
  eal: keep per-lcore power intrinsics state in lcore variable

 MAINTAINERS   |   6 +
 app/test/meson.build  |   2 +
 app/test/test_lcore_var.c | 436 ++
 app/test/test_lcore_var_perf.c| 257 +++
 config/rte_config.h   |   1 +
 doc/api/doxy-api-index.md |   1 +
 .../prog_guide/env_abstraction_layer.rst  |  45 +-
 doc/guides/rel_notes/release_24_11.rst|  14 +
 lib/eal/common/eal_common_lcore_var.c |  79 
 lib/eal/common/meson.build|   1 +
 lib/eal/common/rte_random.c   |  28 +-
 lib/eal/common/rte_service.c  | 117 ++---
 lib/eal/include/meson.build   |   1 +
 lib/eal/include/rte_lcore_var.h   | 390 
 lib/eal/version.map   |   2 +
 lib/eal/x86/rte_power_intrinsics.c|  17 +-
 lib/power/rte_power_pmd_mgmt.c|  35 +-
 17 files changed, 1339 insertions(+), 93 deletions(-)
 create mode 100644 app/test/test_lcore_var.c
 create mode 100644 app/test/test_lcore_var_perf.c
 create mode 100644 lib/eal/common/eal_common_lcore_var.c
 create mode 100644 lib/eal/include/rte_lcore_var.h

-- 
2.34.1



RE: [EXTERNAL] [PATCH 1/2] net: add thread-safe crc api

2024-09-18 Thread Singh, Jasvinder


> -Original Message-
> From: Akhil Goyal 
> Sent: Wednesday, September 18, 2024 6:58 AM
> To: Kusztal, ArkadiuszX ; dev@dpdk.org;
> Singh, Jasvinder 
> Cc: Ji, Kai ; Dooley, Brian ; Ferruh
> Yigit 
> Subject: RE: [EXTERNAL] [PATCH 1/2] net: add thread-safe crc api
> 
> > The current net CRC API is not thread-safe, this patch solves this by
> > adding another, thread-safe API functions.
> > These functions are not safe when using between different processes,
> > though.
> >
> > Signed-off-by: Arkadiusz Kusztal 
> 
> Added Jasvinder for review.
> 
> This patch is mainly related to net library. Delegated this patchset to 
> Ferruh.
> 
> > ---
> >  lib/net/rte_net_crc.c | 40 +-
> --
> >  lib/net/rte_net_crc.h | 14 ++
> >  lib/net/version.map   |  2 ++
> >  3 files changed, 53 insertions(+), 3 deletions(-)
> >
> > diff --git a/lib/net/rte_net_crc.c b/lib/net/rte_net_crc.c index
> > 346c285c15..87808a31dc 100644
> > --- a/lib/net/rte_net_crc.c
> > +++ b/lib/net/rte_net_crc.c
> > @@ -35,9 +35,6 @@ rte_crc16_ccitt_handler(const uint8_t *data,
> > uint32_t data_len);  static uint32_t  rte_crc32_eth_handler(const
> > uint8_t *data, uint32_t data_len);
> >
> > -typedef uint32_t
> > -(*rte_net_crc_handler)(const uint8_t *data, uint32_t data_len);
> > -
> >  static rte_net_crc_handler handlers_default[] = {
> > [RTE_NET_CRC16_CCITT] = rte_crc16_ccitt_default_handler,
> > [RTE_NET_CRC32_ETH] = rte_crc32_eth_default_handler, @@ -331,6
> > +328,43 @@ rte_net_crc_calc(const void *data,
> > return ret;
> >  }
> >
> > +struct rte_net_crc rte_net_crc_set(enum rte_net_crc_type type,
> > +   enum rte_net_crc_alg alg)
> > +{
> > +   const rte_net_crc_handler *handlers = NULL;
> > +
> > +   if (max_simd_bitwidth == 0)
> > +   max_simd_bitwidth = rte_vect_get_max_simd_bitwidth();
> > +
> > +   switch (alg) {
> > +   case RTE_NET_CRC_AVX512:
> > +   handlers = avx512_vpclmulqdq_get_handlers();
> > +   if (handlers != NULL)
> > +   break;
> > +   /* fall-through */
> > +   case RTE_NET_CRC_SSE42:
> > +   handlers = sse42_pclmulqdq_get_handlers();
> > +   break;
> > +   case RTE_NET_CRC_NEON:
> > +   handlers = neon_pmull_get_handlers();
> > +   /* fall-through */
> > +   case RTE_NET_CRC_SCALAR:
> > +   /* fall-through */
> > +   default:
> > +   break;
> > +   }
> > +   if (handlers == NULL)
> > +   handlers = handlers_scalar;
> > +
> > +   return (struct rte_net_crc){ type, handlers[type] }; }
> > +
> > +uint32_t rte_net_crc(const struct rte_net_crc *ctx,
> > +   const void *data, const uint32_t data_len) {
> > +   return ctx->crc(data, data_len);
> > +}
> > +
> >  /* Call initialisation helpers for all crc algorithm handlers */
> >  RTE_INIT(rte_net_crc_init)
> >  {
> > diff --git a/lib/net/rte_net_crc.h b/lib/net/rte_net_crc.h index
> > 72d3e10ff6..f5c8f7173f 100644
> > --- a/lib/net/rte_net_crc.h
> > +++ b/lib/net/rte_net_crc.h
> > @@ -11,6 +11,9 @@
> >  extern "C" {
> >  #endif
> >
> > +typedef uint32_t
> > +(*rte_net_crc_handler)(const uint8_t *data, uint32_t data_len);
> > +
> >  /** CRC types */
> >  enum rte_net_crc_type {
> > RTE_NET_CRC16_CCITT = 0,
> > @@ -26,6 +29,11 @@ enum rte_net_crc_alg {
> > RTE_NET_CRC_AVX512,
> >  };
> >
> > +struct rte_net_crc {
> > +   enum rte_net_crc_type type;
> > +   rte_net_crc_handler crc;
> > +};
> > +
> >  /**
> >   * This API set the CRC computation algorithm (i.e. scalar version,
> >   * x86 64-bit sse4.2 intrinsic version, etc.) and internal data @@
> > -59,6 +67,12 @@ rte_net_crc_calc(const void *data,
> > uint32_t data_len,
> > enum rte_net_crc_type type);
> >
> > +struct rte_net_crc rte_net_crc_set(enum rte_net_crc_type type,
> > +   enum rte_net_crc_alg alg);
> > +
> > +uint32_t rte_net_crc(const struct rte_net_crc *ctx,
> > +   const void *data, const uint32_t data_len);
> > +
> >  #ifdef __cplusplus
> >  }
> >  #endif
> > diff --git a/lib/net/version.map b/lib/net/version.map index
> > bec4ce23ea..5c3dbffba7 100644
> > --- a/lib/net/version.map
> > +++ b/lib/net/version.map
> > @@ -4,6 +4,8 @@ DPDK_25 {
> > rte_eth_random_addr;
> > rte_ether_format_addr;
> > rte_ether_unformat_addr;
> > +   rte_net_crc;
> > +   rte_net_crc_set;
> > rte_net_crc_calc;
> > rte_net_crc_set_alg;
> > rte_net_get_ptype;
> > --
> > 2.13.6

Hi Arkadiusz, 

Thanks for the patches. 

The new API will make the existing ones obsolete, so I would suggest
removing them to avoid confusion, as they implement similar
functionality. Also, please update the documentation accordingly.
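The thread-safety gain of the proposed context-based API can be illustrated standalone: the selected handler lives in a caller-owned struct instead of process-global state, so concurrent users of different algorithms cannot interfere. The sketch below mirrors the `struct rte_net_crc { type, crc }` shape from the patch, but with invented stand-in "algorithms" — it is not the real rte_net_crc code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Per-context dispatch: the handler pointer is captured at "set" time
 * and never read from or written to a global afterwards. */
typedef uint32_t (*crc_handler)(const uint8_t *data, uint32_t len);

struct crc_ctx {
	crc_handler crc;
};

/* Two stand-in "algorithms" (placeholders for scalar/SSE/AVX handlers). */
static uint32_t
crc_sum(const uint8_t *d, uint32_t n)
{
	uint32_t s = 0;

	while (n--)
		s += *d++;
	return s;
}

static uint32_t
crc_xor(const uint8_t *d, uint32_t n)
{
	uint32_t x = 0;

	while (n--)
		x ^= *d++;
	return x;
}

/* Analogous to rte_net_crc_set(): select an algorithm into a context. */
static struct crc_ctx
crc_set(int use_xor)
{
	return (struct crc_ctx){ use_xor ? crc_xor : crc_sum };
}

/* Analogous to rte_net_crc(): compute via the context only. */
static uint32_t
crc_calc(const struct crc_ctx *ctx, const void *data, uint32_t len)
{
	return ctx->crc(data, len); /* no global state touched */
}
```

Two threads each holding their own `crc_ctx` can pick different algorithms without racing, which is exactly what the mutable-global design of the old API could not guarantee.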

  


Re: [PATCH v3 02/19] net/xsc: add log macro

2024-09-18 Thread WanRenyong
On 2024/9/18 16:56, David Marchand wrote:
> On Wed, Sep 18, 2024 at 8:10 AM WanRenyong  wrote:
>> Add log macro to print runtime messages and trace functions.
>>
>> Signed-off-by: WanRenyong 
>>
>> ---
>>
>> v3:
>> * use RTE_LOG_LINE_PREFIX instead of rte_log
>> ---
>>   drivers/net/xsc/xsc_ethdev.c | 11 +
>>   drivers/net/xsc/xsc_log.h| 46 
>>   2 files changed, 57 insertions(+)
>>   create mode 100644 drivers/net/xsc/xsc_log.h
>>
>> diff --git a/drivers/net/xsc/xsc_ethdev.c b/drivers/net/xsc/xsc_ethdev.c
>> index 0e48cb76fa..58ceaa3940 100644
>> --- a/drivers/net/xsc/xsc_ethdev.c
>> +++ b/drivers/net/xsc/xsc_ethdev.c
>> @@ -1,3 +1,14 @@
>>   /* SPDX-License-Identifier: BSD-3-Clause
>>* Copyright 2024 Yunsilicon Technology Co., Ltd.
>>*/
>> +
>> +#include "xsc_log.h"
>> +
>> +RTE_LOG_REGISTER_SUFFIX(xsc_logtype_init, init, NOTICE);
>> +RTE_LOG_REGISTER_SUFFIX(xsc_logtype_driver, driver, NOTICE);
>> +#ifdef RTE_ETHDEV_DEBUG_RX
>> +RTE_LOG_REGISTER_SUFFIX(xsc_logtype_rx, rx, DEBUG);
>> +#endif
>> +#ifdef RTE_ETHDEV_DEBUG_TX
>> +RTE_LOG_REGISTER_SUFFIX(xsc_logtype_tx, tx, DEBUG);
>> +#endif
>> diff --git a/drivers/net/xsc/xsc_log.h b/drivers/net/xsc/xsc_log.h
>> new file mode 100644
>> index 00..99a88fcd1b
>> --- /dev/null
>> +++ b/drivers/net/xsc/xsc_log.h
>> @@ -0,0 +1,46 @@
>> +/* SPDX-License-Identifier: BSD-3-Clause
>> + * Copyright 2024 Yunsilicon Technology Co., Ltd.
>> + */
>> +
>> +#ifndef _XSC_LOG_H_
>> +#define _XSC_LOG_H_
>> +
>> +#include 
>> +
>> +extern int xsc_logtype_init;
>> +extern int xsc_logtype_driver;
>> +#define RTE_LOGTYPE_XSC_INIT xsc_logtype_init
>> +#define RTE_LOGTYPE_XSC_DRV xsc_logtype_driver
>> +
>> +
>> +#define PMD_INIT_LOG(level, ...) \
>> +   RTE_LOG_LINE_PREFIX(level, XSC_INIT, "%s(): ", __func__, __VA_ARGS__)
> Thank you for converting to RTE_LOG_LINE!
>
>> +
>> +
>> +#define PMD_INIT_FUNC_TRACE() PMD_INIT_LOG(DEBUG, " >>")
>> +
>> +#ifdef RTE_ETHDEV_DEBUG_RX
>> +extern int xsc_logtype_rx;
>> +#define RTE_LOGTYPE_XSC_RX xsc_logtype_rx
>> +#define PMD_RX_LOG(level, ...) \
>> +   RTE_LOG_LINE_PREFIX(level, XSC_RX, "%s(): ", __func__, __VA_ARGS__)
>> +#else
>> +#define PMD_RX_LOG(level, ...) do { } while (0)
>> +#endif
>> +
>> +#ifdef RTE_ETHDEV_DEBUG_TX
>> +extern int xsc_logtype_tx;
>> +#define RTE_LOGTYPE_XSC_TX xsc_logtype_tx
>> +#define PMD_TX_LOG(level, ...) \
>> +   RTE_LOG_LINE_PREFIX(level, XSC_TX, "%s(): ", __func__, __VA_ARGS__)
>> +#else
>> +#define PMD_TX_LOG(level, ...) do { } while (0)
>> +#endif
> I don't see any code calling those macros in the series, so I would
> remove them for now.
> You can introduce them in the future when needed.
>
>
>> +
>> +#define PMD_DRV_LOG_RAW(level, ...) \
>> +   RTE_LOG_LINE_PREFIX(level, XSC_DRV, "%s(): ", __func__, __VA_ARGS__)
>> +
>> +#define PMD_DRV_LOG(level, ...) \
>> +   PMD_DRV_LOG_RAW(level, __VA_ARGS__)
> The PMD_DRV_LOG_RAW macro seems unused and can be removed =>
> PMD_DRV_LOG() directly calls RTE_LOG_LINE_PREFIX().
>
>
Hello, David,

Thanks for review, all you mentioned above will be fixed in the next 
version.

-- 
Thanks,
WanRenyong
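As an aside for readers, the two macro shapes discussed in this review — a variadic logging macro that prefixes the calling function's name, and a `do { } while (0)` no-op fallback when a debug option is off — can be sketched in a self-contained way. This is illustrative only; it is not the RTE_LOG_LINE_PREFIX implementation:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Capture formatted output in a buffer so the effect is observable. */
static char log_buf[256];

/* Prefix every message with the calling function's name, like the
 * PMD_*_LOG macros above do via RTE_LOG_LINE_PREFIX. */
#define LOG_PREFIX(fmt, ...) \
	snprintf(log_buf, sizeof(log_buf), "%s(): " fmt, __func__, ##__VA_ARGS__)

#ifdef DEBUG_RX
#define RX_LOG(fmt, ...) LOG_PREFIX(fmt, ##__VA_ARGS__)
#else
/* Swallows its arguments; compiles to nothing, but stays statement-safe
 * inside if/else without braces. */
#define RX_LOG(fmt, ...) do { } while (0)
#endif

static const char *
demo(void)
{
	LOG_PREFIX("probed %d ports", 2);
	RX_LOG("never emitted unless DEBUG_RX is defined");
	return log_buf;
}
```

The `do { } while (0)` form is what keeps the disabled variant usable as an ordinary statement, which is why the xsc driver uses it for the compiled-out RX/TX log macros.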


Re: [PATCH v7 0/7] Lcore variables

2024-09-18 Thread fengchengwen
Series-acked-by: Chengwen Feng 

On 2024/9/18 16:26, Mattias Rönnblom wrote:
> This patch set introduces a new API  for static
> per-lcore id data allocation.
> 
> Please refer to the  API documentation for both a
> rationale for this new API, and a comparison to the alternatives
> available.
> 
> The adoption of this API would affect many different DPDK modules, but
> the author updated only a few, mostly to serve as examples in this
> RFC, and to iron out some, but surely not all, wrinkles in the API.
> 
> The question on how to best allocate static per-lcore memory has been
> up several times on the dev mailing list, for example in the thread on
> "random: use per lcore state" RFC by Stephen Hemminger.
> 
> Lcore variables are surely not the answer to all your per-lcore-data
> needs, since it only allows for more-or-less static allocation. In the
> author's opinion, it does however provide a reasonably simple and
> clean and seemingly very much performant solution to a real problem.
> 
> Mattias Rönnblom (7):
>   eal: add static per-lcore memory allocation facility
>   eal: add lcore variable functional tests
>   eal: add lcore variable performance test
>   random: keep PRNG state in lcore variable
>   power: keep per-lcore state in lcore variable
>   service: keep per-lcore state in lcore variable
>   eal: keep per-lcore power intrinsics state in lcore variable
> 
>  MAINTAINERS   |   6 +
>  app/test/meson.build  |   2 +
>  app/test/test_lcore_var.c | 436 ++
>  app/test/test_lcore_var_perf.c| 257 +++
>  config/rte_config.h   |   1 +
>  doc/api/doxy-api-index.md |   1 +
>  .../prog_guide/env_abstraction_layer.rst  |  45 +-
>  doc/guides/rel_notes/release_24_11.rst|  14 +
>  lib/eal/common/eal_common_lcore_var.c |  79 
>  lib/eal/common/meson.build|   1 +
>  lib/eal/common/rte_random.c   |  28 +-
>  lib/eal/common/rte_service.c  | 117 ++---
>  lib/eal/include/meson.build   |   1 +
>  lib/eal/include/rte_lcore_var.h   | 390 
>  lib/eal/version.map   |   2 +
>  lib/eal/x86/rte_power_intrinsics.c|  17 +-
>  lib/power/rte_power_pmd_mgmt.c|  35 +-
>  17 files changed, 1339 insertions(+), 93 deletions(-)
>  create mode 100644 app/test/test_lcore_var.c
>  create mode 100644 app/test/test_lcore_var_perf.c
>  create mode 100644 lib/eal/common/eal_common_lcore_var.c
>  create mode 100644 lib/eal/include/rte_lcore_var.h
> 



[PATCH v2 1/9] net/mlx5: update flex parser arc types support

2024-09-18 Thread Viacheslav Ovsiienko
Add support for input IPv4 and for ESP output flex parser arcs.

Signed-off-by: Viacheslav Ovsiienko 
---
 drivers/net/mlx5/mlx5_flow_flex.c | 21 +
 1 file changed, 21 insertions(+)

diff --git a/drivers/net/mlx5/mlx5_flow_flex.c 
b/drivers/net/mlx5/mlx5_flow_flex.c
index 8a02247406..5b104d583c 100644
--- a/drivers/net/mlx5/mlx5_flow_flex.c
+++ b/drivers/net/mlx5/mlx5_flow_flex.c
@@ -1111,6 +1111,8 @@ mlx5_flex_arc_type(enum rte_flow_item_type type, int in)
return MLX5_GRAPH_ARC_NODE_GENEVE;
case RTE_FLOW_ITEM_TYPE_VXLAN_GPE:
return MLX5_GRAPH_ARC_NODE_VXLAN_GPE;
+   case RTE_FLOW_ITEM_TYPE_ESP:
+   return MLX5_GRAPH_ARC_NODE_IPSEC_ESP;
default:
return -EINVAL;
}
@@ -1148,6 +1150,22 @@ mlx5_flex_arc_in_udp(const struct rte_flow_item *item,
return rte_be_to_cpu_16(spec->hdr.dst_port);
 }
 
+static int
+mlx5_flex_arc_in_ipv4(const struct rte_flow_item *item,
+ struct rte_flow_error *error)
+{
+   const struct rte_flow_item_ipv4 *spec = item->spec;
+   const struct rte_flow_item_ipv4 *mask = item->mask;
+   struct rte_flow_item_ipv4 ip = { .hdr.next_proto_id = 0xff };
+
+   if (memcmp(mask, &ip, sizeof(struct rte_flow_item_ipv4))) {
+   return rte_flow_error_set
+   (error, EINVAL, RTE_FLOW_ERROR_TYPE_ITEM, item,
+"invalid ipv4 item mask, full mask is desired");
+   }
+   return spec->hdr.next_proto_id;
+}
+
 static int
 mlx5_flex_arc_in_ipv6(const struct rte_flow_item *item,
  struct rte_flow_error *error)
@@ -1210,6 +1228,9 @@ mlx5_flex_translate_arc_in(struct mlx5_hca_flex_attr 
*attr,
case RTE_FLOW_ITEM_TYPE_UDP:
ret = mlx5_flex_arc_in_udp(rte_item, error);
break;
+   case RTE_FLOW_ITEM_TYPE_IPV4:
+   ret = mlx5_flex_arc_in_ipv4(rte_item, error);
+   break;
case RTE_FLOW_ITEM_TYPE_IPV6:
ret = mlx5_flex_arc_in_ipv6(rte_item, error);
break;
-- 
2.34.1



[PATCH v2 4/9] net/mlx5: fix flex item tunnel mode handling

2024-09-18 Thread Viacheslav Ovsiienko
The RTE flex item can represent the tunnel header itself and split
inner and outer items; this should be reflected in the item flags
while the PMD is processing the item array.

Fixes: 8c0ca7527bc8 ("net/mlx5/hws: support flex item matching")
Cc: sta...@dpdk.org

Signed-off-by: Viacheslav Ovsiienko 
---
 drivers/net/mlx5/mlx5_flow_hw.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/drivers/net/mlx5/mlx5_flow_hw.c b/drivers/net/mlx5/mlx5_flow_hw.c
index 50888944a5..a275154d4b 100644
--- a/drivers/net/mlx5/mlx5_flow_hw.c
+++ b/drivers/net/mlx5/mlx5_flow_hw.c
@@ -558,6 +558,7 @@ flow_hw_matching_item_flags_get(const struct rte_flow_item 
items[])
uint64_t last_item = 0;
 
for (; items->type != RTE_FLOW_ITEM_TYPE_END; items++) {
+   enum rte_flow_item_flex_tunnel_mode tunnel_mode = FLEX_TUNNEL_MODE_SINGLE;
int tunnel = !!(item_flags & MLX5_FLOW_LAYER_TUNNEL);
int item_type = items->type;
 
@@ -606,6 +607,13 @@ flow_hw_matching_item_flags_get(const struct rte_flow_item 
items[])
case RTE_FLOW_ITEM_TYPE_COMPARE:
last_item = MLX5_FLOW_ITEM_COMPARE;
break;
+   case RTE_FLOW_ITEM_TYPE_FLEX:
+   mlx5_flex_get_tunnel_mode(items, &tunnel_mode);
+   last_item = tunnel_mode == FLEX_TUNNEL_MODE_TUNNEL ?
+   MLX5_FLOW_ITEM_FLEX_TUNNEL :
+   tunnel ? MLX5_FLOW_ITEM_INNER_FLEX :
+   MLX5_FLOW_ITEM_OUTER_FLEX;
+   break;
default:
break;
}
-- 
2.34.1



[PATCH v2 2/9] net/mlx5: add flex item query tunnel mode routine

2024-09-18 Thread Viacheslav Ovsiienko
While parsing the RTE item array, the PMD needs to know whether the
flex item represents a tunnel header. The appropriate tunnel mode
query API is added.
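The contract of the new query routine, visible in the diff below — the mode returned through an out-parameter, a negative errno when the item carries no spec or handle — can be sketched standalone with simplified stand-in types (these are not the real mlx5 structures):

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Simplified stand-ins for the mlx5 flex item types. */
enum tunnel_mode { MODE_SINGLE, MODE_OUTER, MODE_INNER, MODE_TUNNEL, MODE_MULTI };

struct flex_item { enum tunnel_mode tunnel_mode; };
struct flex_spec { struct flex_item *handle; };
struct flow_item { const struct flex_spec *spec; };

/* Mirror of the query contract: fill *mode on success, return 0;
 * return -EINVAL when item, spec, handle, or mode is missing. */
static int
flex_get_tunnel_mode(const struct flow_item *item, enum tunnel_mode *mode)
{
	if (item && item->spec && mode) {
		struct flex_item *flex = item->spec->handle;

		if (flex) {
			*mode = flex->tunnel_mode;
			return 0;
		}
	}
	return -EINVAL;
}
```

The out-parameter style lets callers (like the HWS translation loop in patch 3/9) keep a safe default mode when the query fails.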

Signed-off-by: Viacheslav Ovsiienko 
---
 drivers/net/mlx5/mlx5.h   |  2 ++
 drivers/net/mlx5/mlx5_flow_flex.c | 27 +++
 2 files changed, 29 insertions(+)

diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 869aac032b..6d163996e4 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -2605,6 +2605,8 @@ int mlx5_flex_get_sample_id(const struct mlx5_flex_item 
*tp,
 int mlx5_flex_get_parser_value_per_byte_off(const struct rte_flow_item_flex *item,
void *flex, uint32_t byte_off,
bool is_mask, bool tunnel, uint32_t *value);
+int mlx5_flex_get_tunnel_mode(const struct rte_flow_item *item,
+ enum rte_flow_item_flex_tunnel_mode *tunnel_mode);
 int mlx5_flex_acquire_index(struct rte_eth_dev *dev,
struct rte_flow_item_flex_handle *handle,
bool acquire);
diff --git a/drivers/net/mlx5/mlx5_flow_flex.c 
b/drivers/net/mlx5/mlx5_flow_flex.c
index 5b104d583c..0c41b956b0 100644
--- a/drivers/net/mlx5/mlx5_flow_flex.c
+++ b/drivers/net/mlx5/mlx5_flow_flex.c
@@ -291,6 +291,33 @@ mlx5_flex_get_parser_value_per_byte_off(const struct 
rte_flow_item_flex *item,
return 0;
 }
 
+/**
+ * Get the flex parser tunnel mode.
+ *
+ * @param[in] item
+ *   RTE Flex item.
+ * @param[in, out] tunnel_mode
+ *   Pointer to return tunnel mode.
+ *
+ * @return
+ *   0 on success, otherwise negative error code.
+ */
+int
+mlx5_flex_get_tunnel_mode(const struct rte_flow_item *item,
+ enum rte_flow_item_flex_tunnel_mode *tunnel_mode)
+{
+   if (item && item->spec && tunnel_mode) {
+   const struct rte_flow_item_flex *spec = item->spec;
+   struct mlx5_flex_item *flex = (struct mlx5_flex_item *)spec->handle;
+
+   if (flex) {
+   *tunnel_mode = flex->tunnel_mode;
+   return 0;
+   }
+   }
+   return -EINVAL;
+}
+
 /**
  * Translate item pattern into matcher fields according to translation
  * array.
-- 
2.34.1



[PATCH v2 3/9] net/mlx5/hws: fix flex item support as tunnel header

2024-09-18 Thread Viacheslav Ovsiienko
The RTE flex item can represent the tunnel header and
split the inner and outer layer items. HWS did not
support these flex item specifics.

Fixes: 8c0ca7527bc8 ("net/mlx5/hws: support flex item matching")
Cc: sta...@dpdk.org

Signed-off-by: Viacheslav Ovsiienko 
---
 drivers/net/mlx5/hws/mlx5dr_definer.c | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/net/mlx5/hws/mlx5dr_definer.c 
b/drivers/net/mlx5/hws/mlx5dr_definer.c
index 51a3f7be4b..2dfcc5eba6 100644
--- a/drivers/net/mlx5/hws/mlx5dr_definer.c
+++ b/drivers/net/mlx5/hws/mlx5dr_definer.c
@@ -3267,8 +3267,17 @@ mlx5dr_definer_conv_items_to_hl(struct mlx5dr_context 
*ctx,
break;
case RTE_FLOW_ITEM_TYPE_FLEX:
ret = mlx5dr_definer_conv_item_flex_parser(&cd, items, i);
-   item_flags |= cd.tunnel ? MLX5_FLOW_ITEM_INNER_FLEX :
- MLX5_FLOW_ITEM_OUTER_FLEX;
+   if (ret == 0) {
+   enum rte_flow_item_flex_tunnel_mode tunnel_mode =
+   FLEX_TUNNEL_MODE_SINGLE;
+
+   ret = mlx5_flex_get_tunnel_mode(items, &tunnel_mode);
+   if (tunnel_mode == FLEX_TUNNEL_MODE_TUNNEL)
+   item_flags |= MLX5_FLOW_ITEM_FLEX_TUNNEL;
+   else
+   item_flags |= cd.tunnel ? MLX5_FLOW_ITEM_INNER_FLEX :
+ MLX5_FLOW_ITEM_OUTER_FLEX;
+   }
break;
case RTE_FLOW_ITEM_TYPE_MPLS:
ret = mlx5dr_definer_conv_item_mpls(&cd, items, i);
-- 
2.34.1



[PATCH v2 0/9] net/mlx5: cumulative fix series for flex item

2024-09-18 Thread Viacheslav Ovsiienko
There is a series of independent patches related to the flex item.
There is no direct dependency between patches besides the merging
dependency inferred by git; the latter is the reason the patches are
sent as a series. For more details, please see the individual patch
commit messages.

Signed-off-by: Viacheslav Ovsiienko 

Viacheslav Ovsiienko (9):
  net/mlx5: update flex parser arc types support
  net/mlx5: add flex item query tunnel mode routine
  net/mlx5/hws: fix flex item support as tunnel header
  net/mlx5: fix flex item tunnel mode handling
  net/mlx5: fix number of supported flex parsers
  app/testpmd: remove flex item init command leftover
  net/mlx5: fix next protocol validation after flex item
  net/mlx5: fix non full word sample fields in flex item
  net/mlx5: fix flex item header length field translation

 app/test-pmd/cmdline_flow.c   |  12 --
 drivers/net/mlx5/hws/mlx5dr_definer.c |  17 +-
 drivers/net/mlx5/mlx5.h   |   9 +-
 drivers/net/mlx5/mlx5_flow_dv.c   |   7 +-
 drivers/net/mlx5/mlx5_flow_flex.c | 215 --
 drivers/net/mlx5/mlx5_flow_hw.c   |   8 +
 6 files changed, 167 insertions(+), 101 deletions(-)

-- 
2.34.1



[PATCH v2 9/9] net/mlx5: fix flex item header length field translation

2024-09-18 Thread Viacheslav Ovsiienko
There are hardware imposed limitations on the header length
field description for the mask and shift combinations in the
FIELD_MODE_OFFSET mode.

The patch updates:
  - parameter check for FIELD_MODE_OFFSET for the header length
field
  - check whether length field crosses dword boundaries in header
  - correct mask extension to the hardware required width 6-bits
  - correct adjusting the mask left margin offset, preventing
dword offset

Fixes: b293e8e49d78 ("net/mlx5: translate flex item configuration")
Cc: sta...@dpdk.org

Signed-off-by: Viacheslav Ovsiienko 
---
 drivers/net/mlx5/mlx5_flow_flex.c | 120 --
 1 file changed, 66 insertions(+), 54 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow_flex.c 
b/drivers/net/mlx5/mlx5_flow_flex.c
index bf38643a23..afed16985a 100644
--- a/drivers/net/mlx5/mlx5_flow_flex.c
+++ b/drivers/net/mlx5/mlx5_flow_flex.c
@@ -449,12 +449,14 @@ mlx5_flex_release_index(struct rte_eth_dev *dev,
  *
  *   shift  mask
  * --- ---
- *0 b111100  0x3C
- *1 b111110  0x3E
- *2 b111111  0x3F
- *3 b011111  0x1F
- *4 b001111  0x0F
- *5 b000111  0x07
+ *0 b11111100  0x3C
+ *1 b01111110  0x3E
+ *2 b00111111  0x3F
+ *3 b00011111  0x1F
+ *4 b00001111  0x0F
+ *5 b00000111  0x07
+ *6 b00000011  0x03
+ *7 b00000001  0x01
  */
 static uint8_t
 mlx5_flex_hdr_len_mask(uint8_t shift,
@@ -464,8 +466,7 @@ mlx5_flex_hdr_len_mask(uint8_t shift,
int diff = shift - MLX5_PARSE_GRAPH_NODE_HDR_LEN_SHIFT_DWORD;
 
base_mask = mlx5_hca_parse_graph_node_base_hdr_len_mask(attr);
-   return diff == 0 ? base_mask :
-  diff < 0 ? (base_mask << -diff) & base_mask : base_mask >> diff;
+   return diff < 0 ? base_mask << -diff : base_mask >> diff;
 }
 
 static int
@@ -476,7 +477,6 @@ mlx5_flex_translate_length(struct mlx5_hca_flex_attr *attr,
 {
const struct rte_flow_item_flex_field *field = &conf->next_header;
struct mlx5_devx_graph_node_attr *node = &devx->devx_conf;
-   uint32_t len_width, mask;
 
if (field->field_base % CHAR_BIT)
return rte_flow_error_set
@@ -504,7 +504,14 @@ mlx5_flex_translate_length(struct mlx5_hca_flex_attr *attr,
 "negative header length field base (FIXED)");
node->header_length_mode = MLX5_GRAPH_NODE_LEN_FIXED;
break;
-   case FIELD_MODE_OFFSET:
+   case FIELD_MODE_OFFSET: {
+   uint32_t msb, lsb;
+   int32_t shift = field->offset_shift;
+   uint32_t offset = field->offset_base;
+   uint32_t mask = field->offset_mask;
+   uint32_t wmax = attr->header_length_mask_width +
+   MLX5_PARSE_GRAPH_NODE_HDR_LEN_SHIFT_DWORD;
+
if (!(attr->header_length_mode &
RTE_BIT32(MLX5_GRAPH_NODE_LEN_FIELD)))
return rte_flow_error_set
@@ -514,47 +521,73 @@ mlx5_flex_translate_length(struct mlx5_hca_flex_attr 
*attr,
return rte_flow_error_set
(error, EINVAL, RTE_FLOW_ERROR_TYPE_ITEM, NULL,
 "field size is a must for offset mode");
-   if (field->field_size + field->offset_base < attr->header_length_mask_width)
+   if ((offset ^ (field->field_size + offset)) >> 5)
return rte_flow_error_set
(error, EINVAL, RTE_FLOW_ERROR_TYPE_ITEM, NULL,
-"field size plus offset_base is too small");
-   node->header_length_mode = MLX5_GRAPH_NODE_LEN_FIELD;
-   if (field->offset_mask == 0 ||
-   !rte_is_power_of_2(field->offset_mask + 1))
+"field crosses the 32-bit word boundary");
+   /* Hardware counts in dwords, all shifts done by offset within mask */
+   if (shift < 0 || (uint32_t)shift >= wmax)
+   return rte_flow_error_set
+   (error, EINVAL, RTE_FLOW_ERROR_TYPE_ITEM, NULL,
+"header length field shift exceeds limits (OFFSET)");
+   if (!mask)
+   return rte_flow_error_set
+   (error, EINVAL, RTE_FLOW_ERROR_TYPE_ITEM, NULL,
+"zero length field offset mask (OFFSET)");
+   msb = rte_fls_u32(mask) - 1;
+   lsb = rte_bsf32(mask);
+   if (!rte_is_power_of_2((mask >> lsb) + 1))
return rte_flow_error_set
(error, EINVAL, RTE_FLOW_ERROR_TYPE_ITEM, NULL,
-"invalid length field offset mask (OFFSET)");
-   len_width = rte_fls_u32(field->offset_mask);
-   if (len_width > a

[PATCH v2 6/9] app/testpmd: remove flex item init command leftover

2024-09-18 Thread Viacheslav Ovsiienko
There was a leftover "flow flex init" command, used for debug
purposes, that had no useful functionality in the production code.

Signed-off-by: Viacheslav Ovsiienko 
---
 app/test-pmd/cmdline_flow.c | 12 
 1 file changed, 12 deletions(-)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index d04280eb3e..858f4077bd 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -106,7 +106,6 @@ enum index {
HASH,
 
/* Flex arguments */
-   FLEX_ITEM_INIT,
FLEX_ITEM_CREATE,
FLEX_ITEM_DESTROY,
 
@@ -1317,7 +1316,6 @@ struct parse_action_priv {
})
 
 static const enum index next_flex_item[] = {
-   FLEX_ITEM_INIT,
FLEX_ITEM_CREATE,
FLEX_ITEM_DESTROY,
ZERO,
@@ -4171,15 +4169,6 @@ static const struct token token_list[] = {
.next = NEXT(next_flex_item),
.call = parse_flex,
},
-   [FLEX_ITEM_INIT] = {
-   .name = "init",
-   .help = "flex item init",
-   .args = ARGS(ARGS_ENTRY(struct buffer, args.flex.token),
-ARGS_ENTRY(struct buffer, port)),
-   .next = NEXT(NEXT_ENTRY(COMMON_FLEX_TOKEN),
-NEXT_ENTRY(COMMON_PORT_ID)),
-   .call = parse_flex
-   },
[FLEX_ITEM_CREATE] = {
.name = "create",
.help = "flex item create",
@@ -11431,7 +11420,6 @@ parse_flex(struct context *ctx, const struct token 
*token,
switch (ctx->curr) {
default:
break;
-   case FLEX_ITEM_INIT:
case FLEX_ITEM_CREATE:
case FLEX_ITEM_DESTROY:
out->command = ctx->curr;
-- 
2.34.1



[PATCH v2 8/9] net/mlx5: fix non full word sample fields in flex item

2024-09-18 Thread Viacheslav Ovsiienko
If the sample field in a flex item did not cover the entire 32-bit
word (the width was not a full 32 bits), or was not aligned on a byte
boundary, a match on this sample in flows happened to be ignored or
wrongly missed. The field mask "def" was built in the wrong
endianness, and non-byte-aligned shifts were wrongly performed for
the pattern masks and values.

Fixes: 6dac7d7ff2bf ("net/mlx5: translate flex item pattern into matcher")
Cc: sta...@dpdk.org

Signed-off-by: Viacheslav Ovsiienko 
---
 drivers/net/mlx5/hws/mlx5dr_definer.c |  4 +--
 drivers/net/mlx5/mlx5.h   |  5 ++-
 drivers/net/mlx5/mlx5_flow_dv.c   |  5 ++-
 drivers/net/mlx5/mlx5_flow_flex.c | 47 +--
 4 files changed, 29 insertions(+), 32 deletions(-)

diff --git a/drivers/net/mlx5/hws/mlx5dr_definer.c 
b/drivers/net/mlx5/hws/mlx5dr_definer.c
index 2dfcc5eba6..10b986d66b 100644
--- a/drivers/net/mlx5/hws/mlx5dr_definer.c
+++ b/drivers/net/mlx5/hws/mlx5dr_definer.c
@@ -574,7 +574,7 @@ mlx5dr_definer_flex_parser_set(struct mlx5dr_definer_fc *fc,
idx = fc->fname - MLX5DR_DEFINER_FNAME_FLEX_PARSER_0;
byte_off -= idx * sizeof(uint32_t);
ret = mlx5_flex_get_parser_value_per_byte_off(flex, flex->handle, byte_off,
- false, is_inner, &val);
+ is_inner, &val);
if (ret == -1 || !val)
return;
 
@@ -2825,7 +2825,7 @@ mlx5dr_definer_conv_item_flex_parser(struct 
mlx5dr_definer_conv_data *cd,
for (i = 0; i < MLX5_GRAPH_NODE_SAMPLE_NUM; i++) {
byte_off = base_off - i * sizeof(uint32_t);
ret = mlx5_flex_get_parser_value_per_byte_off(m, v->handle, byte_off,
- true, is_inner, &mask);
+ is_inner, &mask);
if (ret == -1) {
rte_errno = EINVAL;
return rte_errno;
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index b1423b6868..0fb18f7fb1 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -2600,11 +2600,10 @@ void mlx5_flex_flow_translate_item(struct rte_eth_dev 
*dev, void *matcher,
   void *key, const struct rte_flow_item *item,
   bool is_inner);
 int mlx5_flex_get_sample_id(const struct mlx5_flex_item *tp,
-   uint32_t idx, uint32_t *pos,
-   bool is_inner, uint32_t *def);
+   uint32_t idx, uint32_t *pos, bool is_inner);
 int mlx5_flex_get_parser_value_per_byte_off(const struct rte_flow_item_flex *item,
void *flex, uint32_t byte_off,
-   bool is_mask, bool tunnel, uint32_t *value);
+   bool tunnel, uint32_t *value);
 int mlx5_flex_get_tunnel_mode(const struct rte_flow_item *item,
  enum rte_flow_item_flex_tunnel_mode *tunnel_mode);
 int mlx5_flex_acquire_index(struct rte_eth_dev *dev,
diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
index b18bb430d7..d2a3f829d5 100644
--- a/drivers/net/mlx5/mlx5_flow_dv.c
+++ b/drivers/net/mlx5/mlx5_flow_dv.c
@@ -1526,7 +1526,6 @@ mlx5_modify_flex_item(const struct rte_eth_dev *dev,
const struct mlx5_flex_pattern_field *map;
uint32_t offset = data->offset;
uint32_t width_left = width;
-   uint32_t def;
uint32_t cur_width = 0;
uint32_t tmp_ofs;
uint32_t idx = 0;
@@ -1551,7 +1550,7 @@ mlx5_modify_flex_item(const struct rte_eth_dev *dev,
tmp_ofs = pos < data->offset ? data->offset - pos : 0;
for (j = i; i < flex->mapnum && width_left > 0; ) {
map = flex->map + i;
-   id = mlx5_flex_get_sample_id(flex, i, &pos, false, &def);
+   id = mlx5_flex_get_sample_id(flex, i, &pos, false);
if (id == -1) {
i++;
/* All left length is dummy */
@@ -1570,7 +1569,7 @@ mlx5_modify_flex_item(const struct rte_eth_dev *dev,
 * 2. Width has been covered.
 */
for (j = i + 1; j < flex->mapnum; j++) {
-   tmp_id = mlx5_flex_get_sample_id(flex, j, &pos, false, &def);
+   tmp_id = mlx5_flex_get_sample_id(flex, j, &pos, false);
if (tmp_id == -1) {
i = j;
pos -= flex->map[j].width;
diff --git a/drivers/net/mlx5/mlx5_flow_flex.c 
b/drivers/net/mlx5/mlx5_flow_flex.c
index 0c41b956b0..bf38643a23 100644
--- a/drivers/net/mlx5/mlx5_flow_flex.c
+++ b/drivers/net/mlx5/mlx5_flow_flex.c
@@ -118,28 +118,32 @@ 

[PATCH v2 7/9] net/mlx5: fix next protocol validation after flex item

2024-09-18 Thread Viacheslav Ovsiienko
During flow validation, some items may check the preceding protocols.
In the case of a flex item the next protocol is opaque (or there can
be multiple ones), so we should set a neutral value and allow
successful validation, for example for the combination of flex and
following ESP items.
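The effect of resetting the tracked protocol to the neutral 0xff can be sketched with a toy validation loop. This is purely illustrative with invented helpers, not the mlx5 validation code:

```c
#include <assert.h>
#include <stdint.h>

/* Wildcard meaning "next protocol unknown, accept anything" — the
 * neutral value the fix assigns after a flex item. */
#define PROTO_ANY 0xff

/* Does an item of protocol 'proto' validate against the tracked
 * next-protocol value? */
static int
next_proto_ok(uint8_t tracked, uint8_t proto)
{
	return tracked == PROTO_ANY || tracked == proto;
}

/* Walk (item protocol, is_opaque) pairs, mimicking the validation loop:
 * an opaque item (like flex) resets the tracked protocol to the
 * wildcard instead of constraining the next item. */
static int
validate_chain(const uint8_t *protos, const uint8_t *opaque, int n)
{
	uint8_t tracked = PROTO_ANY;

	for (int i = 0; i < n; i++) {
		if (!next_proto_ok(tracked, protos[i]))
			return -1;
		tracked = opaque[i] ? PROTO_ANY : protos[i];
	}
	return 0;
}
```

With the reset, an ESP item following an opaque flex item validates; without it, the stale tracked protocol would wrongly reject the chain.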

Fixes: a23e9b6e3ee9 ("net/mlx5: handle flex item in flows")
Cc: sta...@dpdk.org

Signed-off-by: Viacheslav Ovsiienko 
---
 drivers/net/mlx5/mlx5_flow_dv.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
index a51d4dd1a4..b18bb430d7 100644
--- a/drivers/net/mlx5/mlx5_flow_dv.c
+++ b/drivers/net/mlx5/mlx5_flow_dv.c
@@ -8196,6 +8196,8 @@ flow_dv_validate(struct rte_eth_dev *dev, const struct 
rte_flow_attr *attr,
 tunnel != 0, error);
if (ret < 0)
return ret;
+   /* Reset for next proto, it is unknown. */
+   next_protocol = 0xff;
break;
case RTE_FLOW_ITEM_TYPE_METER_COLOR:
ret = flow_dv_validate_item_meter_color(dev, items,
-- 
2.34.1



[PATCH v2 5/9] net/mlx5: fix number of supported flex parsers

2024-09-18 Thread Viacheslav Ovsiienko
The hardware supports up to 8 flex parser configurations.
Some of them can be utilized internally by firmware, depending on
the configured profile ("FLEX_PARSER_PROFILE_ENABLE" in NV-setting).
The firmware does not report in its capabilities how many flex parser
configurations remain available (this is a device-wide resource and
can be allocated at runtime by other agents - kernel, DPDK
applications, etc.), and once no more parsers are available at parse
object creation time, the firmware just returns an error.

Fixes: db25cadc0887 ("net/mlx5: add flex item operations")
Cc: sta...@dpdk.org

Signed-off-by: Viacheslav Ovsiienko 
---
 drivers/net/mlx5/mlx5.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 6d163996e4..b1423b6868 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -69,7 +69,7 @@
 #define MLX5_ROOT_TBL_MODIFY_NUM   16
 
 /* Maximal number of flex items created on the port.*/
-#define MLX5_PORT_FLEX_ITEM_NUM 4
+#define MLX5_PORT_FLEX_ITEM_NUM 8
 
 /* Maximal number of field/field parts to map into sample registers .*/
 #define MLX5_FLEX_ITEM_MAPPING_NUM 32
-- 
2.34.1



RE: [PATCH v2 0/9] net/mlx5: cumulative fix series for flex item

2024-09-18 Thread Dariusz Sosnowski
> -Original Message-
> From: Slava Ovsiienko 
> Sent: Wednesday, September 18, 2024 15:46
> To: dev@dpdk.org
> Cc: Matan Azrad ; Raslan Darawsheh
> ; Ori Kam ; Dariusz Sosnowski
> 
> Subject: [PATCH v2 0/9] net/mlx5: cumulative fix series for flex item
> 
> There is a series of independent patches related to the flex item.
> There is no direct dependency between patches besides the merging dependency
> inferred by git, the latter is reason the patches are sent in series. For 
> more details,
> please see the individual patch commit messages.
> 
> Signed-off-by: Viacheslav Ovsiienko 
> 
> Viacheslav Ovsiienko (9):
>   net/mlx5: update flex parser arc types support
>   net/mlx5: add flex item query tunnel mode routine
>   net/mlx5/hws: fix flex item support as tunnel header
>   net/mlx5: fix flex item tunnel mode handling
>   net/mlx5: fix number of supported flex parsers
>   app/testpmd: remove flex item init command leftover
>   net/mlx5: fix next protocol validation after flex item
>   net/mlx5: fix non full word sample fields in flex item
>   net/mlx5: fix flex item header length field translation
> 
>  app/test-pmd/cmdline_flow.c   |  12 --
>  drivers/net/mlx5/hws/mlx5dr_definer.c |  17 +-
>  drivers/net/mlx5/mlx5.h   |   9 +-
>  drivers/net/mlx5/mlx5_flow_dv.c   |   7 +-
>  drivers/net/mlx5/mlx5_flow_flex.c | 215 --
>  drivers/net/mlx5/mlx5_flow_hw.c   |   8 +
>  6 files changed, 167 insertions(+), 101 deletions(-)
> 
> --
> 2.34.1

Series-acked-by: Dariusz Sosnowski 

Best regards,
Dariusz Sosnowski



Minutes of Technical Board meeting 21-August-2024

2024-09-18 Thread Morten Brørup
Members Attending
=
Aaron Conole
Hemant Agrawal
Honnappa Nagarahalli
Kevin Traynor
Konstantin Ananyev
Maxime Coquelin
Morten Brørup (chair)
Stephen Hemminger

NOTE

The technical board meetings are on every second Wednesday at 3 pm UTC.
Meetings are public. DPDK community members are welcome to attend on Zoom:
https://zoom-lfx.platform.linuxfoundation.org/meeting/96459488340?password=d808f1f6-0a28-4165-929e-5a5bcae7efeb
Agenda: https://annuel.framapad.org/p/r.0c3cc4d1e011214183872a98f6b5c7db
Minutes of previous meetings: http://core.dpdk.org/techboard/minutes

Next meeting will be on Wednesday 04-September-2024 at 3pm UTC,
and will be chaired by Stephen.

Agenda Items


1. Tech Writer status update (Nathan)
-
Nathan gave an update on the tech writer status.
A meeting has been held to establish which tasks tech writer Nandini can work 
on in the next 6 months.
Two tasks were singled out: Redundancies in the documentation, and content gaps.
Both require a bit of pre-work from the tech board. Each tech board member will 
be assigned a piece of documentation to identify redundancies/gaps, and pass on 
their notes to Nandini.

The tech board set an Olympic record with a meeting duration of only 10 
minutes. :-)


Med venlig hilsen / Kind regards,
-Morten Brørup



RE: [PATCH v2 2/9] net/mlx5: add flex item query tunnel mode routine

2024-09-18 Thread Dariusz Sosnowski



> -Original Message-
> From: Slava Ovsiienko 
> Sent: Wednesday, September 18, 2024 15:46
> To: dev@dpdk.org
> Cc: Matan Azrad ; Raslan Darawsheh
> ; Ori Kam ; Dariusz Sosnowski
> 
> Subject: [PATCH v2 2/9] net/mlx5: add flex item query tunnel mode routine
> 
> Once parsing the RTE item array the PMD needs to know whether the flex item
> represents the tunnel header.
> The appropriate tunnel mode query API is added.
> 
> Signed-off-by: Viacheslav Ovsiienko 
> ---
>  drivers/net/mlx5/mlx5.h   |  2 ++
>  drivers/net/mlx5/mlx5_flow_flex.c | 27 +++
>  2 files changed, 29 insertions(+)
> 
> diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h index
> 869aac032b..6d163996e4 100644
> --- a/drivers/net/mlx5/mlx5.h
> +++ b/drivers/net/mlx5/mlx5.h
> @@ -2605,6 +2605,8 @@ int mlx5_flex_get_sample_id(const struct
> mlx5_flex_item *tp,  int mlx5_flex_get_parser_value_per_byte_off(const struct
> rte_flow_item_flex *item,
>   void *flex, uint32_t byte_off,
>   bool is_mask, bool tunnel,
> uint32_t *value);
> +int mlx5_flex_get_tunnel_mode(const struct rte_flow_item *item,
> +   enum rte_flow_item_flex_tunnel_mode
> *tunnel_mode);
>  int mlx5_flex_acquire_index(struct rte_eth_dev *dev,
>   struct rte_flow_item_flex_handle *handle,
>   bool acquire);
> diff --git a/drivers/net/mlx5/mlx5_flow_flex.c
> b/drivers/net/mlx5/mlx5_flow_flex.c
> index 5b104d583c..0c41b956b0 100644
> --- a/drivers/net/mlx5/mlx5_flow_flex.c
> +++ b/drivers/net/mlx5/mlx5_flow_flex.c
> @@ -291,6 +291,33 @@ mlx5_flex_get_parser_value_per_byte_off(const struct
> rte_flow_item_flex *item,
>   return 0;
>  }
> 
> +/**
> + * Get the flex parser tunnel mode.
> + *
> + * @param[in] item
> + *   RTE Flex item.
> + * @param[in, out] tunnel_mode
> + *   Pointer to return tunnel mode.
> + *
> + * @return
> + *   0 on success, otherwise negative error code.
> + */
> +int
> +mlx5_flex_get_tunnel_mode(const struct rte_flow_item *item,
> +   enum rte_flow_item_flex_tunnel_mode
> *tunnel_mode) {
> + if (item && item->spec && tunnel_mode) {
> + const struct rte_flow_item_flex *spec = item->spec;
> + struct mlx5_flex_item *flex = (struct mlx5_flex_item *)spec-
> >handle;
> +
> + if (flex) {
> + *tunnel_mode = flex->tunnel_mode;
> + return 0;
> + }
> + }
> + return -EINVAL;
> +}
> +
>  /**
>   * Translate item pattern into matcher fields according to translation
>   * array.
> --
> 2.34.1

Acked-by: Dariusz Sosnowski 

Resending the Ack for each patch separately, because patchwork assigned my Ack 
for the series to v1, not v2.

Best regards,
Dariusz Sosnowski



RE: [PATCH v2 1/9] net/mlx5: update flex parser arc types support

2024-09-18 Thread Dariusz Sosnowski
> -Original Message-
> From: Slava Ovsiienko 
> Sent: Wednesday, September 18, 2024 15:46
> To: dev@dpdk.org
> Cc: Matan Azrad ; Raslan Darawsheh
> ; Ori Kam ; Dariusz Sosnowski
> 
> Subject: [PATCH v2 1/9] net/mlx5: update flex parser arc types support
> 
> Add support for input IPv4 and for ESP output flex parser arcs.
> 
> Signed-off-by: Viacheslav Ovsiienko 
> ---
>  drivers/net/mlx5/mlx5_flow_flex.c | 21 +
>  1 file changed, 21 insertions(+)
> 
> diff --git a/drivers/net/mlx5/mlx5_flow_flex.c
> b/drivers/net/mlx5/mlx5_flow_flex.c
> index 8a02247406..5b104d583c 100644
> --- a/drivers/net/mlx5/mlx5_flow_flex.c
> +++ b/drivers/net/mlx5/mlx5_flow_flex.c
> @@ -,6 +,8 @@ mlx5_flex_arc_type(enum rte_flow_item_type type,
> int in)
>   return MLX5_GRAPH_ARC_NODE_GENEVE;
>   case RTE_FLOW_ITEM_TYPE_VXLAN_GPE:
>   return MLX5_GRAPH_ARC_NODE_VXLAN_GPE;
> + case RTE_FLOW_ITEM_TYPE_ESP:
> + return MLX5_GRAPH_ARC_NODE_IPSEC_ESP;
>   default:
>   return -EINVAL;
>   }
> @@ -1148,6 +1150,22 @@ mlx5_flex_arc_in_udp(const struct rte_flow_item
> *item,
>   return rte_be_to_cpu_16(spec->hdr.dst_port);
>  }
> 
> +static int
> +mlx5_flex_arc_in_ipv4(const struct rte_flow_item *item,
> +   struct rte_flow_error *error)
> +{
> + const struct rte_flow_item_ipv4 *spec = item->spec;
> + const struct rte_flow_item_ipv4 *mask = item->mask;
> + struct rte_flow_item_ipv4 ip = { .hdr.next_proto_id = 0xff };
> +
> + if (memcmp(mask, &ip, sizeof(struct rte_flow_item_ipv4))) {
> + return rte_flow_error_set
> + (error, EINVAL, RTE_FLOW_ERROR_TYPE_ITEM, item,
> +  "invalid ipv4 item mask, full mask is desired");
> + }
> + return spec->hdr.next_proto_id;
> +}
> +
>  static int
>  mlx5_flex_arc_in_ipv6(const struct rte_flow_item *item,
> struct rte_flow_error *error)
> @@ -1210,6 +1228,9 @@ mlx5_flex_translate_arc_in(struct mlx5_hca_flex_attr
> *attr,
>   case RTE_FLOW_ITEM_TYPE_UDP:
>   ret = mlx5_flex_arc_in_udp(rte_item, error);
>   break;
> + case RTE_FLOW_ITEM_TYPE_IPV4:
> + ret = mlx5_flex_arc_in_ipv4(rte_item, error);
> + break;
>   case RTE_FLOW_ITEM_TYPE_IPV6:
>   ret = mlx5_flex_arc_in_ipv6(rte_item, error);
>   break;
> --
> 2.34.1

Acked-by: Dariusz Sosnowski 

Resending the Ack for each patch separately, because patchwork assigned my Ack 
for the series to v1, not v2.

Best regards,
Dariusz Sosnowski



RE: [PATCH v2 5/9] net/mlx5: fix number of supported flex parsers

2024-09-18 Thread Dariusz Sosnowski



> -Original Message-
> From: Slava Ovsiienko 
> Sent: Wednesday, September 18, 2024 15:46
> To: dev@dpdk.org
> Cc: Matan Azrad ; Raslan Darawsheh
> ; Ori Kam ; Dariusz Sosnowski
> ; sta...@dpdk.org
> Subject: [PATCH v2 5/9] net/mlx5: fix number of supported flex parsers
> 
> The hardware supports up to 8 flex parser configurations.
> Some of them can be utilized internally by firmware, depending on the
> configured profile ("FLEX_PARSER_PROFILE_ENABLE" in NV-setting).
> The firmware does not report in capabilities how many flex parser 
> configuration
> is remaining available (this is device-wide resource and can be allocated 
> runtime
> by other agents - kernel, DPDK applications, etc.), and once there is no more
> available parsers on the parse object creation moment firmware just returns an
> error.
> 
> Fixes: db25cadc0887 ("net/mlx5: add flex item operations")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Viacheslav Ovsiienko 
> ---
>  drivers/net/mlx5/mlx5.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h index
> 6d163996e4..b1423b6868 100644
> --- a/drivers/net/mlx5/mlx5.h
> +++ b/drivers/net/mlx5/mlx5.h
> @@ -69,7 +69,7 @@
>  #define MLX5_ROOT_TBL_MODIFY_NUM 16
> 
>  /* Maximal number of flex items created on the port.*/
> -#define MLX5_PORT_FLEX_ITEM_NUM  4
> +#define MLX5_PORT_FLEX_ITEM_NUM  8
> 
>  /* Maximal number of field/field parts to map into sample registers .*/
>  #define MLX5_FLEX_ITEM_MAPPING_NUM   32
> --
> 2.34.1

Acked-by: Dariusz Sosnowski 

Resending the Ack for each patch separately, because patchwork assigned my Ack 
for the series to v1, not v2.

Best regards,
Dariusz Sosnowski



RE: [PATCH v2 4/9] net/mlx5: fix flex item tunnel mode handling

2024-09-18 Thread Dariusz Sosnowski



> -Original Message-
> From: Slava Ovsiienko 
> Sent: Wednesday, September 18, 2024 15:46
> To: dev@dpdk.org
> Cc: Matan Azrad ; Raslan Darawsheh
> ; Ori Kam ; Dariusz Sosnowski
> ; sta...@dpdk.org
> Subject: [PATCH v2 4/9] net/mlx5: fix flex item tunnel mode handling
> 
> The RTE flex item can represent tunnel header itself, and split inner and 
> outer
> items, it should be reflected in the item flags while PMD is processing the 
> item
> array.
> 
> Fixes: 8c0ca7527bc8 ("net/mlx5/hws: support flex item matching")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Viacheslav Ovsiienko 
> ---
>  drivers/net/mlx5/mlx5_flow_hw.c | 8 
>  1 file changed, 8 insertions(+)
> 
> diff --git a/drivers/net/mlx5/mlx5_flow_hw.c b/drivers/net/mlx5/mlx5_flow_hw.c
> index 50888944a5..a275154d4b 100644
> --- a/drivers/net/mlx5/mlx5_flow_hw.c
> +++ b/drivers/net/mlx5/mlx5_flow_hw.c
> @@ -558,6 +558,7 @@ flow_hw_matching_item_flags_get(const struct
> rte_flow_item items[])
>   uint64_t last_item = 0;
> 
>   for (; items->type != RTE_FLOW_ITEM_TYPE_END; items++) {
> + enum rte_flow_item_flex_tunnel_mode tunnel_mode =
> +FLEX_TUNNEL_MODE_SINGLE;
>   int tunnel = !!(item_flags & MLX5_FLOW_LAYER_TUNNEL);
>   int item_type = items->type;
> 
> @@ -606,6 +607,13 @@ flow_hw_matching_item_flags_get(const struct
> rte_flow_item items[])
>   case RTE_FLOW_ITEM_TYPE_COMPARE:
>   last_item = MLX5_FLOW_ITEM_COMPARE;
>   break;
> + case RTE_FLOW_ITEM_TYPE_FLEX:
> + mlx5_flex_get_tunnel_mode(items, &tunnel_mode);
> + last_item = tunnel_mode ==
> FLEX_TUNNEL_MODE_TUNNEL ?
> + MLX5_FLOW_ITEM_FLEX_TUNNEL :
> + tunnel ?
> MLX5_FLOW_ITEM_INNER_FLEX :
> +
>   MLX5_FLOW_ITEM_OUTER_FLEX;
> + break;
>   default:
>   break;
>   }
> --
> 2.34.1

Acked-by: Dariusz Sosnowski 

Resending the Ack for each patch separately, because patchwork assigned my Ack 
for the series to v1, not v2.

Best regards,
Dariusz Sosnowski



RE: [PATCH v2 6/9] app/testpmd: remove flex item init command leftover

2024-09-18 Thread Dariusz Sosnowski



> -Original Message-
> From: Slava Ovsiienko 
> Sent: Wednesday, September 18, 2024 15:46
> To: dev@dpdk.org
> Cc: Matan Azrad ; Raslan Darawsheh
> ; Ori Kam ; Dariusz Sosnowski
> 
> Subject: [PATCH v2 6/9] app/testpmd: remove flex item init command leftover
> 
> There was a leftover of "flow flex init" command used for debug purposes and
> had no useful functionality in the production code.
> 
> Signed-off-by: Viacheslav Ovsiienko 
> ---
>  app/test-pmd/cmdline_flow.c | 12 
>  1 file changed, 12 deletions(-)
> 
> diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c index
> d04280eb3e..858f4077bd 100644
> --- a/app/test-pmd/cmdline_flow.c
> +++ b/app/test-pmd/cmdline_flow.c
> @@ -106,7 +106,6 @@ enum index {
>   HASH,
> 
>   /* Flex arguments */
> - FLEX_ITEM_INIT,
>   FLEX_ITEM_CREATE,
>   FLEX_ITEM_DESTROY,
> 
> @@ -1317,7 +1316,6 @@ struct parse_action_priv {
>   })
> 
>  static const enum index next_flex_item[] = {
> - FLEX_ITEM_INIT,
>   FLEX_ITEM_CREATE,
>   FLEX_ITEM_DESTROY,
>   ZERO,
> @@ -4171,15 +4169,6 @@ static const struct token token_list[] = {
>   .next = NEXT(next_flex_item),
>   .call = parse_flex,
>   },
> - [FLEX_ITEM_INIT] = {
> - .name = "init",
> - .help = "flex item init",
> - .args = ARGS(ARGS_ENTRY(struct buffer, args.flex.token),
> -  ARGS_ENTRY(struct buffer, port)),
> - .next = NEXT(NEXT_ENTRY(COMMON_FLEX_TOKEN),
> -  NEXT_ENTRY(COMMON_PORT_ID)),
> - .call = parse_flex
> - },
>   [FLEX_ITEM_CREATE] = {
>   .name = "create",
>   .help = "flex item create",
> @@ -11431,7 +11420,6 @@ parse_flex(struct context *ctx, const struct token
> *token,
>   switch (ctx->curr) {
>   default:
>   break;
> - case FLEX_ITEM_INIT:
>   case FLEX_ITEM_CREATE:
>   case FLEX_ITEM_DESTROY:
>   out->command = ctx->curr;
> --
> 2.34.1

Acked-by: Dariusz Sosnowski 

Resending the Ack for each patch separately, because patchwork assigned my Ack 
for the series to v1, not v2.

Best regards,
Dariusz Sosnowski



RE: [PATCH v2 3/9] net/mlx5/hws: fix flex item support as tunnel header

2024-09-18 Thread Dariusz Sosnowski



> -Original Message-
> From: Slava Ovsiienko 
> Sent: Wednesday, September 18, 2024 15:46
> To: dev@dpdk.org
> Cc: Matan Azrad ; Raslan Darawsheh
> ; Ori Kam ; Dariusz Sosnowski
> ; sta...@dpdk.org
> Subject: [PATCH v2 3/9] net/mlx5/hws: fix flex item support as tunnel header
> 
> The RTE flex item can represent the tunnel header and split the inner and 
> outer
> layer items. HWS did not support this flex item specifics.
> 
> Fixes: 8c0ca7527bc8 ("net/mlx5/hws: support flex item matching")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Viacheslav Ovsiienko 
> ---
>  drivers/net/mlx5/hws/mlx5dr_definer.c | 13 +++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/mlx5/hws/mlx5dr_definer.c
> b/drivers/net/mlx5/hws/mlx5dr_definer.c
> index 51a3f7be4b..2dfcc5eba6 100644
> --- a/drivers/net/mlx5/hws/mlx5dr_definer.c
> +++ b/drivers/net/mlx5/hws/mlx5dr_definer.c
> @@ -3267,8 +3267,17 @@ mlx5dr_definer_conv_items_to_hl(struct
> mlx5dr_context *ctx,
>   break;
>   case RTE_FLOW_ITEM_TYPE_FLEX:
>   ret = mlx5dr_definer_conv_item_flex_parser(&cd,
> items, i);
> - item_flags |= cd.tunnel ?
> MLX5_FLOW_ITEM_INNER_FLEX :
> -
> MLX5_FLOW_ITEM_OUTER_FLEX;
> + if (ret == 0) {
> + enum rte_flow_item_flex_tunnel_mode
> tunnel_mode =
> +
>   FLEX_TUNNEL_MODE_SINGLE;
> +
> + ret = mlx5_flex_get_tunnel_mode(items,
> &tunnel_mode);
> + if (tunnel_mode ==
> FLEX_TUNNEL_MODE_TUNNEL)
> + item_flags |=
> MLX5_FLOW_ITEM_FLEX_TUNNEL;
> + else
> + item_flags |= cd.tunnel ?
> MLX5_FLOW_ITEM_INNER_FLEX :
> +
> MLX5_FLOW_ITEM_OUTER_FLEX;
> + }
>   break;
>   case RTE_FLOW_ITEM_TYPE_MPLS:
>   ret = mlx5dr_definer_conv_item_mpls(&cd, items, i);
> --
> 2.34.1

Acked-by: Dariusz Sosnowski 

Resending the Ack for each patch separately, because patchwork assigned my Ack 
for the series to v1, not v2.

Best regards,
Dariusz Sosnowski



RE: [PATCH v2 8/9] net/mlx5: fix non full word sample fields in flex item

2024-09-18 Thread Dariusz Sosnowski



> -Original Message-
> From: Slava Ovsiienko 
> Sent: Wednesday, September 18, 2024 15:46
> To: dev@dpdk.org
> Cc: Matan Azrad ; Raslan Darawsheh
> ; Ori Kam ; Dariusz Sosnowski
> ; sta...@dpdk.org
> Subject: [PATCH v2 8/9] net/mlx5: fix non full word sample fields in flex item
> 
> If the sample field in flex item did not cover the entire 32-bit word (width 
> was not
> verified 32 bits) or was not aligned on the byte boundary the match on this
> sample in flows happened to be ignored or wrongly missed. The field mask "def"
> was build in wrong endianness, and non-byte aligned shifts were wrongly
> performed for the pattern masks and values.
> 
> Fixes: 6dac7d7ff2bf ("net/mlx5: translate flex item pattern into matcher")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Viacheslav Ovsiienko 
> ---
>  drivers/net/mlx5/hws/mlx5dr_definer.c |  4 +--
>  drivers/net/mlx5/mlx5.h   |  5 ++-
>  drivers/net/mlx5/mlx5_flow_dv.c   |  5 ++-
>  drivers/net/mlx5/mlx5_flow_flex.c | 47 +--
>  4 files changed, 29 insertions(+), 32 deletions(-)
> 
> diff --git a/drivers/net/mlx5/hws/mlx5dr_definer.c
> b/drivers/net/mlx5/hws/mlx5dr_definer.c
> index 2dfcc5eba6..10b986d66b 100644
> --- a/drivers/net/mlx5/hws/mlx5dr_definer.c
> +++ b/drivers/net/mlx5/hws/mlx5dr_definer.c
> @@ -574,7 +574,7 @@ mlx5dr_definer_flex_parser_set(struct
> mlx5dr_definer_fc *fc,
>   idx = fc->fname - MLX5DR_DEFINER_FNAME_FLEX_PARSER_0;
>   byte_off -= idx * sizeof(uint32_t);
>   ret = mlx5_flex_get_parser_value_per_byte_off(flex, flex->handle,
> byte_off,
> -   false, is_inner, &val);
> +   is_inner, &val);
>   if (ret == -1 || !val)
>   return;
> 
> @@ -2825,7 +2825,7 @@ mlx5dr_definer_conv_item_flex_parser(struct
> mlx5dr_definer_conv_data *cd,
>   for (i = 0; i < MLX5_GRAPH_NODE_SAMPLE_NUM; i++) {
>   byte_off = base_off - i * sizeof(uint32_t);
>   ret = mlx5_flex_get_parser_value_per_byte_off(m, v->handle,
> byte_off,
> -   true, is_inner,
> &mask);
> +   is_inner,
> &mask);
>   if (ret == -1) {
>   rte_errno = EINVAL;
>   return rte_errno;
> diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h index
> b1423b6868..0fb18f7fb1 100644
> --- a/drivers/net/mlx5/mlx5.h
> +++ b/drivers/net/mlx5/mlx5.h
> @@ -2600,11 +2600,10 @@ void mlx5_flex_flow_translate_item(struct
> rte_eth_dev *dev, void *matcher,
>  void *key, const struct rte_flow_item *item,
>  bool is_inner);
>  int mlx5_flex_get_sample_id(const struct mlx5_flex_item *tp,
> - uint32_t idx, uint32_t *pos,
> - bool is_inner, uint32_t *def);
> + uint32_t idx, uint32_t *pos, bool is_inner);
>  int mlx5_flex_get_parser_value_per_byte_off(const struct rte_flow_item_flex
> *item,
>   void *flex, uint32_t byte_off,
> - bool is_mask, bool tunnel,
> uint32_t *value);
> + bool tunnel, uint32_t *value);
>  int mlx5_flex_get_tunnel_mode(const struct rte_flow_item *item,
> enum rte_flow_item_flex_tunnel_mode
> *tunnel_mode);  int mlx5_flex_acquire_index(struct rte_eth_dev *dev, diff 
> --git
> a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c index
> b18bb430d7..d2a3f829d5 100644
> --- a/drivers/net/mlx5/mlx5_flow_dv.c
> +++ b/drivers/net/mlx5/mlx5_flow_dv.c
> @@ -1526,7 +1526,6 @@ mlx5_modify_flex_item(const struct rte_eth_dev
> *dev,
>   const struct mlx5_flex_pattern_field *map;
>   uint32_t offset = data->offset;
>   uint32_t width_left = width;
> - uint32_t def;
>   uint32_t cur_width = 0;
>   uint32_t tmp_ofs;
>   uint32_t idx = 0;
> @@ -1551,7 +1550,7 @@ mlx5_modify_flex_item(const struct rte_eth_dev
> *dev,
>   tmp_ofs = pos < data->offset ? data->offset - pos : 0;
>   for (j = i; i < flex->mapnum && width_left > 0; ) {
>   map = flex->map + i;
> - id = mlx5_flex_get_sample_id(flex, i, &pos, false, &def);
> + id = mlx5_flex_get_sample_id(flex, i, &pos, false);
>   if (id == -1) {
>   i++;
>   /* All left length is dummy */
> @@ -1570,7 +1569,7 @@ mlx5_modify_flex_item(const struct rte_eth_dev
> *dev,
>* 2. Width has been covered.
>*/
>   for (j = i + 1; j < flex->mapnum; j++) {
> - tmp_id = mlx5_flex_get_sample_id(flex, j,
> &pos, false, &def);
> + tmp_id = mlx5_flex_get_sample_id(flex, j,
> &pos

RE: [PATCH v2 9/9] net/mlx5: fix flex item header length field translation

2024-09-18 Thread Dariusz Sosnowski



> -Original Message-
> From: Slava Ovsiienko 
> Sent: Wednesday, September 18, 2024 15:46
> To: dev@dpdk.org
> Cc: Matan Azrad ; Raslan Darawsheh
> ; Ori Kam ; Dariusz Sosnowski
> ; sta...@dpdk.org
> Subject: [PATCH v2 9/9] net/mlx5: fix flex item header length field 
> translation
> 
> There are hardware imposed limitations on the header length field description 
> for
> the mask and shift combinations in the FIELD_MODE_OFFSET mode.
> 
> The patch updates:
>   - parameter check for FIELD_MODE_OFFSET for the header length
> field
>   - check whether length field crosses dword boundaries in header
>   - correct mask extension to the hardware required width 6-bits
>   - correct adjusting the mask left margin offset, preventing
> dword offset
> 
> Fixes: b293e8e49d78 ("net/mlx5: translate flex item configuration")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Viacheslav Ovsiienko 
> ---
>  drivers/net/mlx5/mlx5_flow_flex.c | 120 --
>  1 file changed, 66 insertions(+), 54 deletions(-)
> 
> diff --git a/drivers/net/mlx5/mlx5_flow_flex.c
> b/drivers/net/mlx5/mlx5_flow_flex.c
> index bf38643a23..afed16985a 100644
> --- a/drivers/net/mlx5/mlx5_flow_flex.c
> +++ b/drivers/net/mlx5/mlx5_flow_flex.c
> @@ -449,12 +449,14 @@ mlx5_flex_release_index(struct rte_eth_dev *dev,
>   *
>   *   shift  mask
>   * --- ---
> - *0 b111100  0x3C
> - *1 b111110  0x3E
> - *2 b111111  0x3F
> - *3 b011111  0x1F
> - *4 b001111  0x0F
> - *5 b000111  0x07
> + *0 b111100  0x3C
> + *1 b111110  0x3E
> + *2 b111111  0x3F
> + *3 b011111  0x1F
> + *4 b001111  0x0F
> + *5 b000111  0x07
> + *6 b000011  0x03
> + *7 b000001  0x01
>   */
>  static uint8_t
>  mlx5_flex_hdr_len_mask(uint8_t shift,
> @@ -464,8 +466,7 @@ mlx5_flex_hdr_len_mask(uint8_t shift,
>   int diff = shift - MLX5_PARSE_GRAPH_NODE_HDR_LEN_SHIFT_DWORD;
> 
>   base_mask = mlx5_hca_parse_graph_node_base_hdr_len_mask(attr);
> - return diff == 0 ? base_mask :
> -diff < 0 ? (base_mask << -diff) & base_mask : base_mask >> diff;
> + return diff < 0 ? base_mask << -diff : base_mask >> diff;
>  }
> 
>  static int
> @@ -476,7 +477,6 @@ mlx5_flex_translate_length(struct mlx5_hca_flex_attr
> *attr,  {
>   const struct rte_flow_item_flex_field *field = &conf->next_header;
>   struct mlx5_devx_graph_node_attr *node = &devx->devx_conf;
> - uint32_t len_width, mask;
> 
>   if (field->field_base % CHAR_BIT)
>   return rte_flow_error_set
> @@ -504,7 +504,14 @@ mlx5_flex_translate_length(struct mlx5_hca_flex_attr
> *attr,
>"negative header length field base (FIXED)");
>   node->header_length_mode =
> MLX5_GRAPH_NODE_LEN_FIXED;
>   break;
> - case FIELD_MODE_OFFSET:
> + case FIELD_MODE_OFFSET: {
> + uint32_t msb, lsb;
> + int32_t shift = field->offset_shift;
> + uint32_t offset = field->offset_base;
> + uint32_t mask = field->offset_mask;
> + uint32_t wmax = attr->header_length_mask_width +
> +
>   MLX5_PARSE_GRAPH_NODE_HDR_LEN_SHIFT_DWORD;
> +
>   if (!(attr->header_length_mode &
>   RTE_BIT32(MLX5_GRAPH_NODE_LEN_FIELD)))
>   return rte_flow_error_set
> @@ -514,47 +521,73 @@ mlx5_flex_translate_length(struct mlx5_hca_flex_attr
> *attr,
>   return rte_flow_error_set
>   (error, EINVAL,
> RTE_FLOW_ERROR_TYPE_ITEM, NULL,
>"field size is a must for offset mode");
> - if (field->field_size + field->offset_base < attr-
> >header_length_mask_width)
> + if ((offset ^ (field->field_size + offset)) >> 5)
>   return rte_flow_error_set
>   (error, EINVAL,
> RTE_FLOW_ERROR_TYPE_ITEM, NULL,
> -  "field size plus offset_base is too small");
> - node->header_length_mode =
> MLX5_GRAPH_NODE_LEN_FIELD;
> - if (field->offset_mask == 0 ||
> - !rte_is_power_of_2(field->offset_mask + 1))
> +  "field crosses the 32-bit word boundary");
> + /* Hardware counts in dwords, all shifts done by offset within
> mask */
> + if (shift < 0 || (uint32_t)shift >= wmax)
> + return rte_flow_error_set
> + (error, EINVAL,
> RTE_FLOW_ERROR_TYPE_ITEM, NULL,
> +  "header length field shift exceeds limits
> (OFFSET)");
> + if (!mask)
> + return rte_flow_error_set
> + (error, EINVAL,
> RTE_FLOW_ERROR_TYPE_ITEM, NULL,
> +  "zero length field offset mask (OFFSET)");
> + msb = rte_fls_u32(mask)
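
The shift/mask table in the hunk above can be sanity-checked with a small Python model of `mlx5_flex_hdr_len_mask()`. The 6-bit base mask (0x3F), the dword shift constant (2), and the final truncation to the 6-bit hardware field width are assumptions inferred from the quoted code and comment table, not verified against the full mlx5 sources:

```python
BASE_HDR_LEN_MASK = 0x3F  # assumed 6-bit base mask from the HCA attributes
SHIFT_DWORD = 2           # assumed MLX5_PARSE_GRAPH_NODE_HDR_LEN_SHIFT_DWORD


def hdr_len_mask(shift):
    """Model of the fixed C expression, truncated to the 6-bit field width."""
    diff = shift - SHIFT_DWORD
    if diff < 0:
        value = BASE_HDR_LEN_MASK << -diff
    else:
        value = BASE_HDR_LEN_MASK >> diff
    return value & 0x3F  # hardware consumes only the low 6 bits (assumption)


# Expected mask column from the updated comment table (shift -> mask).
EXPECTED = {0: 0x3C, 1: 0x3E, 2: 0x3F, 3: 0x1F, 4: 0x0F, 5: 0x07, 6: 0x03, 7: 0x01}
```

Under those assumptions the model reproduces every row of the table, including the two rows (shift 6 and 7) added by the fix.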

RE: [PATCH v2 7/9] net/mlx5: fix next protocol validation after flex item

2024-09-18 Thread Dariusz Sosnowski



> -Original Message-
> From: Slava Ovsiienko 
> Sent: Wednesday, September 18, 2024 15:46
> To: dev@dpdk.org
> Cc: Matan Azrad ; Raslan Darawsheh
> ; Ori Kam ; Dariusz Sosnowski
> ; sta...@dpdk.org
> Subject: [PATCH v2 7/9] net/mlx5: fix next protocol validation after flex item
> 
> On the flow validation some items may check the preceding protocols.
> In case of flex item the next protocol is opaque (or can be multiple
> ones) we should set neutral value and allow successful validation, for 
> example,
> for the combination of flex and following ESP items.
> 
> Fixes: a23e9b6e3ee9 ("net/mlx5: handle flex item in flows")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Viacheslav Ovsiienko 
> ---
>  drivers/net/mlx5/mlx5_flow_dv.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
> index a51d4dd1a4..b18bb430d7 100644
> --- a/drivers/net/mlx5/mlx5_flow_dv.c
> +++ b/drivers/net/mlx5/mlx5_flow_dv.c
> @@ -8196,6 +8196,8 @@ flow_dv_validate(struct rte_eth_dev *dev, const
> struct rte_flow_attr *attr,
>tunnel != 0,
> error);
>   if (ret < 0)
>   return ret;
> + /* Reset for next proto, it is unknown. */
> + next_protocol = 0xff;
>   break;
>   case RTE_FLOW_ITEM_TYPE_METER_COLOR:
>   ret = flow_dv_validate_item_meter_color(dev, items,
> --
> 2.34.1

Acked-by: Dariusz Sosnowski 

Resending the Ack for each patch separately, because patchwork assigned my Ack 
for the series to v1, not v2.

Best regards,
Dariusz Sosnowski
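The effect of the fix - treating the protocol that follows a flex item as unknown - can be illustrated with a simplified Python model of the validation walk. The item kinds, the IPPROTO_ESP value, and the 0xff "any protocol" sentinel are simplifications of the real flow_dv_validate() logic, shown only to convey the idea:

```python
ANY_PROTO = 0xFF   # neutral value: next protocol is unknown
IPPROTO_ESP = 50


def validate_items(items):
    """Walk (kind, next_proto) pairs the way the validator tracks next_protocol.

    ESP only validates when the preceding header promised ESP - or when
    the preceding header's next protocol is unknown, as after a flex item.
    """
    next_protocol = ANY_PROTO
    for kind, proto in items:
        if kind == "esp" and next_protocol not in (IPPROTO_ESP, ANY_PROTO):
            return False  # preceding protocol contradicts ESP
        if kind == "flex":
            next_protocol = ANY_PROTO  # the fix: flex payload is opaque, reset
        elif proto is not None:
            next_protocol = proto
    return True
```

With the reset, a flex + ESP combination validates even though the flex header cannot name its successor; without it, a stale next-protocol left over from an earlier header (e.g. UDP's 17) would wrongly reject the ESP item.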


Re: [PATCH v3 11/12] dts: add Rx offload capabilities

2024-09-18 Thread Jeremy Spewock
On Wed, Sep 18, 2024 at 10:18 AM Juraj Linkeš
 wrote:
>
>
>
> On 26. 8. 2024 19:24, Jeremy Spewock wrote:
> > On Wed, Aug 21, 2024 at 10:53 AM Juraj Linkeš
> >  wrote:
> > 
> >> diff --git a/dts/framework/remote_session/testpmd_shell.py 
> >> b/dts/framework/remote_session/testpmd_shell.py
> >> index 48c31124d1..f83569669e 100644
> >> --- a/dts/framework/remote_session/testpmd_shell.py
> >> +++ b/dts/framework/remote_session/testpmd_shell.py
> >> @@ -659,6 +659,103 @@ class TestPmdPortStats(TextParser):
> >>   tx_bps: int = field(metadata=TextParser.find_int(r"Tx-bps:\s+(\d+)"))
> >>
> >>
> >> +class RxOffloadCapability(Flag):
> >> +"""Rx offload capabilities of a device."""
> >> +
> >> +#:
> >> +RX_OFFLOAD_VLAN_STRIP = auto()
> >> +#: Device supports L3 checksum offload.
> >> +RX_OFFLOAD_IPV4_CKSUM = auto()
> >> +#: Device supports L4 checksum offload.
> >> +RX_OFFLOAD_UDP_CKSUM = auto()
> >> +#: Device supports L4 checksum offload.
> >> +RX_OFFLOAD_TCP_CKSUM = auto()
> >> +#: Device supports Large Receive Offload.
> >> +RX_OFFLOAD_TCP_LRO = auto()
> >> +#: Device supports QinQ (queue in queue) offload.
> >> +RX_OFFLOAD_QINQ_STRIP = auto()
> >> +#: Device supports inner packet L3 checksum.
> >> +RX_OFFLOAD_OUTER_IPV4_CKSUM = auto()
> >> +#: Device supports MACsec.
> >> +RX_OFFLOAD_MACSEC_STRIP = auto()
> >> +#: Device supports filtering of a VLAN Tag identifier.
> >> +RX_OFFLOAD_VLAN_FILTER = 1 << 9
> >> +#: Device supports VLAN offload.
> >> +RX_OFFLOAD_VLAN_EXTEND = auto()
> >> +#: Device supports receiving segmented mbufs.
> >> +RX_OFFLOAD_SCATTER = 1 << 13
> >
> > I know you mentioned in the commit message that the auto() can cause
> > problems with mypy/sphinx, is that why this one is a specific value
> > instead? Regardless, I think we should probably make it consistent so
> > that either all of them are bit-shifts or none of them are unless
> > there is a specific reason that the scatter offload is different.
> >
>
> Since both you and Dean asked, I'll add something to the docstring about
> this.
>
> There are actually two non-auto values (RX_OFFLOAD_VLAN_FILTER = 1 << 9
> is the first one). I used the actual values to mirror the flags in DPDK
> code.

Gotcha, that makes sense.
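The convention being discussed - `auto()` where the bit positions line up with DPDK and explicit shifts where DPDK leaves gaps - can be sketched with a minimal Flag. Only a few members are shown; the names loosely mirror the patch and the values mirror the DPDK bit layout it describes:

```python
from enum import Flag, auto


class RxOffload(Flag):
    """Minimal sketch: auto() continues from the previous member's bit,
    so an explicit shift is needed wherever the DPDK layout skips bits."""

    VLAN_STRIP = auto()        # 1 << 0
    IPV4_CKSUM = auto()        # 1 << 1
    UDP_CKSUM = auto()         # 1 << 2
    TCP_CKSUM = auto()         # 1 << 3
    TCP_LRO = auto()           # 1 << 4
    QINQ_STRIP = auto()        # 1 << 5
    OUTER_IPV4_CKSUM = auto()  # 1 << 6
    MACSEC_STRIP = auto()      # 1 << 7
    VLAN_FILTER = 1 << 9       # DPDK skips bit 8, so auto() would be off by one
    VLAN_EXTEND = auto()       # auto() resumes at 1 << 10
    SCATTER = 1 << 13          # another gap in the DPDK bit layout
    # Composite alias, mirroring RX_OFFLOAD_VLAN in the patch.
    VLAN = VLAN_STRIP | VLAN_FILTER | VLAN_EXTEND | QINQ_STRIP
```

Because `Flag.auto()` always picks the next free power of two above the highest value so far, the explicit members act as anchors and the surrounding `auto()` members stay in sync with the DPDK definitions.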

>
> >> +#: Device supports Timestamp.
> >> +RX_OFFLOAD_TIMESTAMP = auto()
> >> +#: Device supports crypto processing while packet is received in NIC.
> >> +RX_OFFLOAD_SECURITY = auto()
> >> +#: Device supports CRC stripping.
> >> +RX_OFFLOAD_KEEP_CRC = auto()
> >> +#: Device supports L4 checksum offload.
> >> +RX_OFFLOAD_SCTP_CKSUM = auto()
> >> +#: Device supports inner packet L4 checksum.
> >> +RX_OFFLOAD_OUTER_UDP_CKSUM = auto()
> >> +#: Device supports RSS hashing.
> >> +RX_OFFLOAD_RSS_HASH = auto()
> >> +#: Device supports
> >> +RX_OFFLOAD_BUFFER_SPLIT = auto()
> >> +#: Device supports all checksum capabilities.
> >> +RX_OFFLOAD_CHECKSUM = RX_OFFLOAD_IPV4_CKSUM | RX_OFFLOAD_UDP_CKSUM | 
> >> RX_OFFLOAD_TCP_CKSUM
> >> +#: Device supports all VLAN capabilities.
> >> +RX_OFFLOAD_VLAN = (
> >> +RX_OFFLOAD_VLAN_STRIP
> >> +| RX_OFFLOAD_VLAN_FILTER
> >> +| RX_OFFLOAD_VLAN_EXTEND
> >> +| RX_OFFLOAD_QINQ_STRIP
> >> +)
> > 
> >>
> >> @@ -1048,6 +1145,42 @@ def _close(self) -> None:
> >>   == Capability retrieval methods ==
> >>   """
> >>
> >> +def get_capabilities_rx_offload(
> >> +self,
> >> +supported_capabilities: MutableSet["NicCapability"],
> >> +unsupported_capabilities: MutableSet["NicCapability"],
> >> +) -> None:
> >> +"""Get all rx offload capabilities and divide them into supported 
> >> and unsupported.
> >> +
> >> +Args:
> >> +supported_capabilities: Supported capabilities will be added 
> >> to this set.
> >> +unsupported_capabilities: Unsupported capabilities will be 
> >> added to this set.
> >> +"""
> >> +self._logger.debug("Getting rx offload capabilities.")
> >> +command = f"show port {self.ports[0].id} rx_offload capabilities"
> >
> > Is it desirable to only get the capabilities of the first port? In the
> > current framework I suppose it doesn't matter all that much since you
> > can only use the first few ports in the list of ports anyway, but will
> > there ever be a case where a test run has 2 different devices included
> > in the list of ports? Of course it's possible that it will happen, but
> > is it practical? Because, if so, then we would want this to aggregate
> > what all the devices are capable of and have capabilities basically
> > say "at least one of the ports in the list of ports is capable of
> > these things."
> >
> > This consideration also applies to the rxq info capability gathering as 
> > well.
> >
>
> No parts of the framework are adjusted to use multiple NIC in a single
> 

[PATCH v5 1/1] dts: add text parser for testpmd verbose output

2024-09-18 Thread jspewock
From: Jeremy Spewock 

Multiple test suites from the old DTS framework rely on being able to
consume and interpret the verbose output of testpmd. The new framework
doesn't have an elegant way of handling verbose output, but test
suites that rely on it are starting to be written. This patch creates a
TextParser class that can be used to extract the verbose information
from any testpmd output and also adjusts the `stop` method of the shell
to return all output that it collected.

Signed-off-by: Jeremy Spewock 
---
 dts/framework/remote_session/testpmd_shell.py | 525 +-
 dts/framework/utils.py|   6 +
 2 files changed, 529 insertions(+), 2 deletions(-)

diff --git a/dts/framework/remote_session/testpmd_shell.py 
b/dts/framework/remote_session/testpmd_shell.py
index 43e9f56517..2d741802c7 100644
--- a/dts/framework/remote_session/testpmd_shell.py
+++ b/dts/framework/remote_session/testpmd_shell.py
@@ -31,7 +31,7 @@
 from framework.settings import SETTINGS
 from framework.testbed_model.cpu import LogicalCoreCount, LogicalCoreList
 from framework.testbed_model.sut_node import SutNode
-from framework.utils import StrEnum
+from framework.utils import REGEX_FOR_MAC_ADDRESS, StrEnum
 
 
 class TestPmdDevice:
@@ -577,6 +577,497 @@ class TestPmdPortStats(TextParser):
 tx_bps: int = field(metadata=TextParser.find_int(r"Tx-bps:\s+(\d+)"))
 
 
+class PacketOffloadFlag(Flag):
+"""Flag representing the Packet Offload Features Flags in DPDK.
+
+Values in this class are taken from the definitions in the RTE MBUF core 
library in DPDK
+located in lib/mbuf/rte_mbuf_core.h. It is expected that flag values in 
this class will match
the values they are set to in said DPDK library with one exception: all
values must be unique.
For example, the definitions for unknown checksum flags in rte_mbuf_core.h
are all set to
+:data:`0`, but it is valuable to distinguish between them in this
framework. For this reason,
+flags that are not unique in the DPDK library are set either to values
within the
+RTE_MBUF_F_FIRST_FREE-RTE_MBUF_F_LAST_FREE range for Rx or shifted 61+ 
bits for Tx.
+"""
+
+# RX flags
+
+#: The RX packet is a 802.1q VLAN packet, and the tci has been saved in 
mbuf->vlan_tci. If the
+#: flag RTE_MBUF_F_RX_VLAN_STRIPPED is also present, the VLAN header has 
been stripped from
+#: mbuf data, else it is still present.
+RTE_MBUF_F_RX_VLAN = auto()
+
+#: RX packet with RSS hash result.
+RTE_MBUF_F_RX_RSS_HASH = auto()
+
+#: RX packet with FDIR match indicate.
+RTE_MBUF_F_RX_FDIR = auto()
+
+#: This flag is set when the outermost IP header checksum is detected as 
wrong by the hardware.
+RTE_MBUF_F_RX_OUTER_IP_CKSUM_BAD = 1 << 5
+
+#: A vlan has been stripped by the hardware and its tci is saved in 
mbuf->vlan_tci. This can
+#: only happen if vlan stripping is enabled in the RX configuration of the 
PMD. When
+#: RTE_MBUF_F_RX_VLAN_STRIPPED is set, RTE_MBUF_F_RX_VLAN must also be set.
+RTE_MBUF_F_RX_VLAN_STRIPPED = auto()
+
+#: No information about the RX IP checksum.
+RTE_MBUF_F_RX_IP_CKSUM_UNKNOWN = 1 << 23
+#: The IP checksum in the packet is wrong.
+RTE_MBUF_F_RX_IP_CKSUM_BAD = 1 << 4
+#: The IP checksum in the packet is valid.
+RTE_MBUF_F_RX_IP_CKSUM_GOOD = 1 << 7
+#: The IP checksum is not correct in the packet data, but the integrity of 
the IP header is
+#: verified.
+RTE_MBUF_F_RX_IP_CKSUM_NONE = RTE_MBUF_F_RX_IP_CKSUM_BAD | 
RTE_MBUF_F_RX_IP_CKSUM_GOOD
+
+#: No information about the RX L4 checksum.
+RTE_MBUF_F_RX_L4_CKSUM_UNKNOWN = 1 << 24
+#: The L4 checksum in the packet is wrong.
+RTE_MBUF_F_RX_L4_CKSUM_BAD = 1 << 3
+#: The L4 checksum in the packet is valid.
+RTE_MBUF_F_RX_L4_CKSUM_GOOD = 1 << 8
+#: The L4 checksum is not correct in the packet data, but the integrity of 
the L4 data is
+#: verified.
+RTE_MBUF_F_RX_L4_CKSUM_NONE = RTE_MBUF_F_RX_L4_CKSUM_BAD | 
RTE_MBUF_F_RX_L4_CKSUM_GOOD
+
+#: RX IEEE1588 L2 Ethernet PT Packet.
+RTE_MBUF_F_RX_IEEE1588_PTP = 1 << 9
+#: RX IEEE1588 L2/L4 timestamped packet.
+RTE_MBUF_F_RX_IEEE1588_TMST = 1 << 10
+
+#: FD id reported if FDIR match.
+RTE_MBUF_F_RX_FDIR_ID = 1 << 13
+#: Flexible bytes reported if FDIR match.
+RTE_MBUF_F_RX_FDIR_FLX = 1 << 14
+
+#: If both RTE_MBUF_F_RX_QINQ_STRIPPED and RTE_MBUF_F_RX_VLAN_STRIPPED are 
set, the 2 VLANs
+#: have been stripped by the hardware. If RTE_MBUF_F_RX_QINQ_STRIPPED is 
set and
+#: RTE_MBUF_F_RX_VLAN_STRIPPED is unset, only the outer VLAN is removed 
from packet data.
+RTE_MBUF_F_RX_QINQ_STRIPPED = auto()
+
+#: When packets are coalesced by a hardware or virtual driver, this flag 
can be set in the RX
+#: mbuf, meaning that the m->tso_segsz field is valid and is set to the 
segment size of
+#: original packets.
+RTE_MBUF_F_RX_LRO = auto()
+
+#: Indi
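The uniqueness scheme the docstring above describes can be sketched with a minimal, hypothetical subset of the flags (the `ChecksumFlag` name is an illustration, not the class from the patch; the bit values are taken from the excerpt):

```python
from enum import Flag


class ChecksumFlag(Flag):
    """Hypothetical subset of the Rx checksum flags, mirroring rte_mbuf_core.h."""

    RTE_MBUF_F_RX_IP_CKSUM_BAD = 1 << 4
    RTE_MBUF_F_RX_IP_CKSUM_GOOD = 1 << 7
    # Composite alias: BAD | GOOD encodes "checksum not in packet data, but
    # header integrity verified", exactly as rte_mbuf_core.h defines it.
    RTE_MBUF_F_RX_IP_CKSUM_NONE = (1 << 4) | (1 << 7)
    # DPDK defines this "unknown" flag as 0, but a 0-valued member carries no
    # information in a Flag, so the framework moves it to a free bit instead.
    RTE_MBUF_F_RX_IP_CKSUM_UNKNOWN = 1 << 23


none = ChecksumFlag.RTE_MBUF_F_RX_IP_CKSUM_NONE
print(ChecksumFlag.RTE_MBUF_F_RX_IP_CKSUM_BAD in none)  # True
print(none.value)  # 144
```

The composite member works as an alias, so membership tests against the component flags still behave as in the C definitions.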

[PATCH v5 0/1] dts: testpmd verbose parser

2024-09-18 Thread jspewock
From: Jeremy Spewock 

v5:
 * fix typo

Jeremy Spewock (1):
  dts: add text parser for testpmd verbose output

 dts/framework/remote_session/testpmd_shell.py | 525 +-
 dts/framework/utils.py|   6 +
 2 files changed, 529 insertions(+), 2 deletions(-)

-- 
2.46.0



Re: [PATCH v3 08/12] dts: add NIC capability support

2024-09-18 Thread Jeremy Spewock
On Wed, Sep 18, 2024 at 8:58 AM Juraj Linkeš  wrote:
>
>
>
> On 27. 8. 2024 18:36, Jeremy Spewock wrote:
> > On Wed, Aug 21, 2024 at 10:53 AM Juraj Linkeš
> >  wrote:
> > 
> >> diff --git a/dts/framework/testbed_model/capability.py 
> >> b/dts/framework/testbed_model/capability.py
> >> index 8899f07f76..9a79e6ebb3 100644
> >> --- a/dts/framework/testbed_model/capability.py
> >> +++ b/dts/framework/testbed_model/capability.py
> >> @@ -5,14 +5,40 @@
> > 
> >> +@classmethod
> >> +def get_supported_capabilities(
> >> +cls, sut_node: SutNode, topology: "Topology"
> >> +) -> set["DecoratedNicCapability"]:
> >> +"""Overrides :meth:`~Capability.get_supported_capabilities`.
> >> +
> >> +The capabilities are first sorted by decorators, then reduced 
> >> into a single function which
> >> +is then passed to the decorator. This way we execute each
> >> decorator only once.
> >> +"""
> >> +supported_conditional_capabilities: set["DecoratedNicCapability"] 
> >> = set()
> >> +logger = get_dts_logger(f"{sut_node.name}.{cls.__name__}")
> >> +if topology.type is Topology.type.no_link:
> >
> > As a follow-up, I didn't notice this during my initial review, but in
> > testing this line was throwing attribute errors for me due to Topology
> > not having an attribute named `type`. I think this was because of
> > `Topology.type.no_link` since this attribute isn't initialized on the
> > class itself. I fixed this by just replacing it with
> > `TopologyType.no_link` locally.
> >
>
> I also ran into this, the type attribute is not a class variable. Your
> solution works (and I also originally fixed it with exactly that), but
> then I realized topology.type.no_link also works (and was probably my
> intention), which doesn't require the extra import of TopologyType.

Right, that's smart. I forget that you can do that with enums.
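The trick being discussed — reaching an enum member through an instance attribute that itself holds a member — can be sketched like this (the class shapes are simplified stand-ins for the framework's `Topology`/`TopologyType`):

```python
from enum import Enum


class TopologyType(Enum):
    no_link = 0
    one_link = 1
    two_links = 2


class Topology:
    def __init__(self, topology_type: TopologyType):
        self.type = topology_type


topology = Topology(TopologyType.one_link)

# Enum members are class attributes, so they are also reachable through
# another member: topology.type.no_link works without the call site
# importing TopologyType at all.
print(topology.type.no_link is TopologyType.no_link)  # True
```

Note this access pattern was briefly deprecated in Python 3.11 but restored without a warning in 3.12.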


Re: [PATCH v3 11/12] dts: add Rx offload capabilities

2024-09-18 Thread Jeremy Spewock
On Wed, Sep 18, 2024 at 10:27 AM Juraj Linkeš
 wrote:
>
>
>
> On 29. 8. 2024 17:40, Jeremy Spewock wrote:
> > On Wed, Aug 28, 2024 at 1:44 PM Jeremy Spewock  wrote:
> >>
> >> On Wed, Aug 21, 2024 at 10:53 AM Juraj Linkeš
> >>  wrote:
> >> 
> >>> diff --git a/dts/framework/remote_session/testpmd_shell.py 
> >>> b/dts/framework/remote_session/testpmd_shell.py
> >>> index 48c31124d1..f83569669e 100644
> >>> --- a/dts/framework/remote_session/testpmd_shell.py
> >>> +++ b/dts/framework/remote_session/testpmd_shell.py
> >>> @@ -659,6 +659,103 @@ class TestPmdPortStats(TextParser):
> >>>   tx_bps: int = 
> >>> field(metadata=TextParser.find_int(r"Tx-bps:\s+(\d+)"))
> >>>
> >>>
> >>> +class RxOffloadCapability(Flag):
> >>> +"""Rx offload capabilities of a device."""
> >>> +
> >>> +#:
> >>> +RX_OFFLOAD_VLAN_STRIP = auto()
> >>
> >> One other thought that I had about this; was there a specific reason
> >> that you decided to prefix all of these with `RX_OFFLOAD_`? I am
> >> working on a test suite right now that uses both RX and TX offloads
> >> and thought that it would be a great use of capabilities, so I am
> >> working on adding a TxOffloadCapability flag as well and, since the
> >> output is essentially the same, it made a lot of sense to make it a
> >> sibling class of this one with similar parsing functionality. In what
> >> I was writing, I found it much easier to remove this prefix so that
> >> the parsing method can be the same for both RX and TX, and I didn't
> >> have to restate some options that are shared between both (like
> >> IPv4_CKSUM, UDP_CKSUM, etc.). Is there a reason you can think of why
> >> removing this prefix is a bad idea? Hopefully I will have a patch out
> >> soon that shows this extension that I've made so that you can see
> >> in-code what I was thinking.
> >
> > I see now that you actually already answered this question, I was just
> > looking too much at that piece of code, and clearly not looking
> > further down at the helper-method mapping or the commit message that
> > you left :).
> >
> > "The Flag members correspond to NIC
> > capability names so a convenience function that looks for the supported
> > Flags in a testpmd output is also added."
> >
> > Having it prefixed with RX_OFFLOAD_ in NicCapability makes a lot of
> > sense since it is more explicit. Since there is a good reason to have
> > it like this, then the redundancy makes sense I think. There are some
> > ways to potentially avoid this like creating a StrFlag class that
> > overrides the __str__ method, or something like an additional type
> > that would contain a toString method, but it feels very situational
> > and specific to this one use-case so it probably isn't going to be
> > super valuable. Another thing I could think of to do would be allowing
> > the user to pass in a function or something to the helper-method that
> > mapped Flag names to their respective NicCapability name, or just
> > doing it in the method that gets the offloads instead of using a
> > helper at all, but this also just makes it more complicated and maybe
> > it isn't worth it.
> >
>
> I also had it without the prefix, but then I also realized it's needed
> in NicCapability so this is where I ended. I'm not sure complicating
> things to remove the prefix is worth it, especially when these names are
> basically only used internally. The prefix could actually confer some
> benefit if the name appears in a log somewhere (although overriding
> __str__ could be the way; maybe I'll think about that).

It could be done by modifying __str__, but I found an easier approach
was just adding an optional prefix to the
_update_capabilities_from_flag() method, since you will know whether
the capability is Rx or Tx at the point of calling this method. I feel
like either one could work; I'm not sure which is better. The
change that adds the prefix is in the Rx/Tx offload suite in the first
commit [1] if you wanted to look at it. This commit and the one after
it are isolated to be only changes to the capabilities series.

[1] 
https://patchwork.dpdk.org/project/dpdk/patch/20240903194642.24458-2-jspew...@iol.unh.edu/
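As a rough sketch, the optional-prefix approach could look like the following. The helper name and the exact member names are illustrative (the real `_update_capabilities_from_flag()` lives in the linked patch); only the `RX_OFFLOAD_`-prefixed flag style comes from the series under review:

```python
from enum import Flag, auto


class RxOffloadCapability(Flag):
    """Hypothetical subset of the Rx offload flags from the patch."""

    RX_OFFLOAD_VLAN_STRIP = auto()
    RX_OFFLOAD_IPV4_CKSUM = auto()
    RX_OFFLOAD_UDP_CKSUM = auto()


def supported_capabilities(output: str, flag_cls, prefix: str = ""):
    """Collect every member whose prefix-stripped name appears in the output.

    Stripping the prefix lets Rx and Tx flag classes share one parser even
    though their member names carry RX_OFFLOAD_/TX_OFFLOAD_ prefixes.
    """
    found = flag_cls(0)
    for member in flag_cls:
        if member.name.removeprefix(prefix) in output:
            found |= member
    return found


caps = supported_capabilities(
    "VLAN_STRIP IPV4_CKSUM", RxOffloadCapability, prefix="RX_OFFLOAD_"
)
print(RxOffloadCapability.RX_OFFLOAD_VLAN_STRIP in caps)  # True
```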

>
> > I apologize for asking you about something that you already explained,
> > but maybe something we can get out of this is that, since these names
> > have to be consistent, it might be worth putting that in the
> > doc-strings of the flag for when people try to make further expansions
> > or changes in the future. Or it could also be generally clear that
> > flags used for capabilities should follow this idea, let me know what
> > you think.
> >
>
> Adding things to docstring is usually a good thing. What should I
> document? I guess the correspondence between the flag and NicCapability,
> anything else?

The only thing I was thinking was that the flag values have to match
the values in NicCapability. I think explaining it this way is enough
just to make it clear that it is done that way for a purpose a

Re: [PATCH v23 00/15] Logging improvements

2024-09-18 Thread Bruce Richardson
On Tue, Sep 17, 2024 at 09:56:05PM -0700, Stephen Hemminger wrote:
> Improvements and unification of logging library.
> This version works on all platforms: Linux, Windows and FreeBSD.
> 
> This is update to rework patch set. It adds several new features
> to the console log output.
> 
>   * Putting a timestamp on console output which is useful for
> analyzing performance of startup codes. Timestamp is optional
> and must be enabled on command line.
> 
>   * Displaying console output with colors.
> It uses the standard conventions used by many other Linux commands
> for colorized display.  The default is to enable color if the
> console output is going to a terminal. But it can be always
> on or disabled by command line flag. This default was chosen
> based on what dmesg(1) command does.
> 
> Color is used by many tools (vi, iproute2, git) because it is helpful;
> DPDK drivers and libraries print lots of not very useful messages.
> And having error messages highlighted in bold face helps.
> This might also get users to pay more attention to error messages.
> Many bug reports have earlier messages that are lost because
> there are so many info messages.
> 
>   * Add support for automatic detection of systemd journal
> protocol. If running as systemd service will get enhanced
> logging.
> 
>   * Use of syslog is optional and the meaning of the
> --syslog flag has changed. The default is *not* to use
> syslog if output is going to a terminal.
> 
> Add myself as maintainer for log because by now I have added
> more than the previous authors.
> 
> v23 - simplify and fix Windows and FreeBSD builds; fix #ifdefs.
>   Change from defining stubs to using inline functions in log_private.h.
> 
> Stephen Hemminger (15):
>   maintainers: add for log library
>   windows: make getopt functions have const properties
>   windows: add os shim for localtime_r
>   eal: make eal_log_level_parse common
>   eal: do not duplicate rte_init_alert() messages
>   eal: change rte_exit() output to match rte_log()
>   log: move handling of syslog facility out of eal
>   eal: initialize log before everything else
>   log: drop syslog support, and make code common
>   log: add hook for printing log messages
>   log: add timestamp option
>   log: add optional support of syslog
>   log: add support for systemd journal
>   log: colorize log output
>   doc: add release note about log library
> 
Thanks for the cleanup.

Series-acked-by: Bruce Richardson 


[PATCH v7 6/7] service: keep per-lcore state in lcore variable

2024-09-18 Thread Mattias Rönnblom
Replace static array of cache-aligned structs with an lcore variable,
to slightly benefit code simplicity and performance.

Signed-off-by: Mattias Rönnblom 
Acked-by: Morten Brørup 
Acked-by: Konstantin Ananyev 

--

PATCH v7:
 * Update to match new FOREACH API.

RFC v6:
 * Remove a now-redundant lcore variable value memset().

RFC v5:
 * Fix lcore value pointer bug introduced by RFC v4.

RFC v4:
 * Remove strange-looking lcore value lookup potentially containing
   invalid lcore id. (Morten Brørup)
 * Replace misplaced tab with space. (Morten Brørup)
---
 lib/eal/common/rte_service.c | 117 +++
 1 file changed, 65 insertions(+), 52 deletions(-)

diff --git a/lib/eal/common/rte_service.c b/lib/eal/common/rte_service.c
index 56379930b6..59c4f77966 100644
--- a/lib/eal/common/rte_service.c
+++ b/lib/eal/common/rte_service.c
@@ -11,6 +11,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -75,7 +76,7 @@ struct __rte_cache_aligned core_state {
 
 static uint32_t rte_service_count;
 static struct rte_service_spec_impl *rte_services;
-static struct core_state *lcore_states;
+static RTE_LCORE_VAR_HANDLE(struct core_state, lcore_states);
 static uint32_t rte_service_library_initialized;
 
 int32_t
@@ -101,12 +102,8 @@ rte_service_init(void)
goto fail_mem;
}
 
-   lcore_states = rte_calloc("rte_service_core_states", RTE_MAX_LCORE,
-   sizeof(struct core_state), RTE_CACHE_LINE_SIZE);
-   if (!lcore_states) {
-   EAL_LOG(ERR, "error allocating core states array");
-   goto fail_mem;
-   }
+   if (lcore_states == NULL)
+   RTE_LCORE_VAR_ALLOC(lcore_states);
 
int i;
struct rte_config *cfg = rte_eal_get_configuration();
@@ -122,7 +119,6 @@ rte_service_init(void)
return 0;
 fail_mem:
rte_free(rte_services);
-   rte_free(lcore_states);
return -ENOMEM;
 }
 
@@ -136,7 +132,6 @@ rte_service_finalize(void)
rte_eal_mp_wait_lcore();
 
rte_free(rte_services);
-   rte_free(lcore_states);
 
rte_service_library_initialized = 0;
 }
@@ -286,7 +281,6 @@ rte_service_component_register(const struct 
rte_service_spec *spec,
 int32_t
 rte_service_component_unregister(uint32_t id)
 {
-   uint32_t i;
struct rte_service_spec_impl *s;
SERVICE_VALID_GET_OR_ERR_RET(id, s, -EINVAL);
 
@@ -294,9 +288,11 @@ rte_service_component_unregister(uint32_t id)
 
s->internal_flags &= ~(SERVICE_F_REGISTERED);
 
+   unsigned int lcore_id;
+   struct core_state *cs;
/* clear the run-bit in all cores */
-   for (i = 0; i < RTE_MAX_LCORE; i++)
-   lcore_states[i].service_mask &= ~(UINT64_C(1) << id);
+   RTE_LCORE_VAR_FOREACH_VALUE(lcore_id, cs, lcore_states)
+   cs->service_mask &= ~(UINT64_C(1) << id);
 
memset(&rte_services[id], 0, sizeof(struct rte_service_spec_impl));
 
@@ -454,7 +450,10 @@ rte_service_may_be_active(uint32_t id)
return -EINVAL;
 
for (i = 0; i < lcore_count; i++) {
-   if (lcore_states[ids[i]].service_active_on_lcore[id])
+   struct core_state *cs =
+   RTE_LCORE_VAR_LCORE_VALUE(ids[i], lcore_states);
+
+   if (cs->service_active_on_lcore[id])
return 1;
}
 
@@ -464,7 +463,7 @@ rte_service_may_be_active(uint32_t id)
 int32_t
 rte_service_run_iter_on_app_lcore(uint32_t id, uint32_t serialize_mt_unsafe)
 {
-   struct core_state *cs = &lcore_states[rte_lcore_id()];
+   struct core_state *cs = RTE_LCORE_VAR_VALUE(lcore_states);
struct rte_service_spec_impl *s;
 
SERVICE_VALID_GET_OR_ERR_RET(id, s, -EINVAL);
@@ -486,8 +485,7 @@ service_runner_func(void *arg)
 {
RTE_SET_USED(arg);
uint8_t i;
-   const int lcore = rte_lcore_id();
-   struct core_state *cs = &lcore_states[lcore];
+   struct core_state *cs = RTE_LCORE_VAR_VALUE(lcore_states);
 
rte_atomic_store_explicit(&cs->thread_active, 1, 
rte_memory_order_seq_cst);
 
@@ -533,13 +531,15 @@ service_runner_func(void *arg)
 int32_t
 rte_service_lcore_may_be_active(uint32_t lcore)
 {
-   if (lcore >= RTE_MAX_LCORE || !lcore_states[lcore].is_service_core)
+   struct core_state *cs = RTE_LCORE_VAR_LCORE_VALUE(lcore, lcore_states);
+
+   if (lcore >= RTE_MAX_LCORE || !cs->is_service_core)
return -EINVAL;
 
/* Load thread_active using ACQUIRE to avoid instructions dependent on
 * the result being re-ordered before this load completes.
 */
-   return rte_atomic_load_explicit(&lcore_states[lcore].thread_active,
+   return rte_atomic_load_explicit(&cs->thread_active,
   rte_memory_order_acquire);
 }
 
@@ -547,9 +547,12 @@ int32_t
 rte_service_lcore_count(void)
 {
int32_t count = 0;
-   uint32_t i;
-   for (i = 0; i < RTE_MAX_LCORE; i

[PATCH v7 2/7] eal: add lcore variable functional tests

2024-09-18 Thread Mattias Rönnblom
Add functional test suite to exercise the  API.

Signed-off-by: Mattias Rönnblom 
Acked-by: Morten Brørup 

--

PATCH v6:
 * Update FOREACH invocations to match new API.

RFC v5:
 * Adapt tests to reflect the removal of the GET() and SET() macros.

RFC v4:
 * Check all lcore id's values for all variables in the many variables
   test case.
 * Introduce test case for max-sized lcore variables.

RFC v2:
 * Improve alignment-related test coverage.
---
 app/test/meson.build  |   1 +
 app/test/test_lcore_var.c | 436 ++
 2 files changed, 437 insertions(+)
 create mode 100644 app/test/test_lcore_var.c

diff --git a/app/test/meson.build b/app/test/meson.build
index e29258e6ec..48279522f0 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -103,6 +103,7 @@ source_file_deps = {
 'test_ipsec_sad.c': ['ipsec'],
 'test_kvargs.c': ['kvargs'],
 'test_latencystats.c': ['ethdev', 'latencystats', 'metrics'] + 
sample_packet_forward_deps,
+'test_lcore_var.c': [],
 'test_lcores.c': [],
 'test_link_bonding.c': ['ethdev', 'net_bond',
 'net'] + packet_burst_generator_deps + virtual_pmd_deps,
diff --git a/app/test/test_lcore_var.c b/app/test/test_lcore_var.c
new file mode 100644
index 00..2a1f258548
--- /dev/null
+++ b/app/test/test_lcore_var.c
@@ -0,0 +1,436 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Ericsson AB
+ */
+
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+
+#include "test.h"
+
+#define MIN_LCORES 2
+
+RTE_LCORE_VAR_HANDLE(int, test_int);
+RTE_LCORE_VAR_HANDLE(char, test_char);
+RTE_LCORE_VAR_HANDLE(long, test_long_sized);
+RTE_LCORE_VAR_HANDLE(short, test_short);
+RTE_LCORE_VAR_HANDLE(long, test_long_sized_aligned);
+
+struct int_checker_state {
+   int old_value;
+   int new_value;
+   bool success;
+};
+
+static void
+rand_blk(void *blk, size_t size)
+{
+   size_t i;
+
+   for (i = 0; i < size; i++)
+   ((unsigned char *)blk)[i] = (unsigned char)rte_rand();
+}
+
+static bool
+is_ptr_aligned(const void *ptr, size_t align)
+{
+   return ptr != NULL ? (uintptr_t)ptr % align == 0 : false;
+}
+
+static int
+check_int(void *arg)
+{
+   struct int_checker_state *state = arg;
+
+   int *ptr = RTE_LCORE_VAR_VALUE(test_int);
+
+   bool naturally_aligned = is_ptr_aligned(ptr, sizeof(int));
+
+   bool equal = *(RTE_LCORE_VAR_VALUE(test_int)) == state->old_value;
+
+   state->success = equal && naturally_aligned;
+
+   *ptr = state->new_value;
+
+   return 0;
+}
+
+RTE_LCORE_VAR_INIT(test_int);
+RTE_LCORE_VAR_INIT(test_char);
+RTE_LCORE_VAR_INIT_SIZE(test_long_sized, 32);
+RTE_LCORE_VAR_INIT(test_short);
+RTE_LCORE_VAR_INIT_SIZE_ALIGN(test_long_sized_aligned, sizeof(long),
+ RTE_CACHE_LINE_SIZE);
+
+static int
+test_int_lvar(void)
+{
+   unsigned int lcore_id;
+
+   struct int_checker_state states[RTE_MAX_LCORE] = {};
+
+   RTE_LCORE_FOREACH_WORKER(lcore_id) {
+   struct int_checker_state *state = &states[lcore_id];
+
+   state->old_value = (int)rte_rand();
+   state->new_value = (int)rte_rand();
+
+   *RTE_LCORE_VAR_LCORE_VALUE(lcore_id, test_int) =
+   state->old_value;
+   }
+
+   RTE_LCORE_FOREACH_WORKER(lcore_id)
+   rte_eal_remote_launch(check_int, &states[lcore_id], lcore_id);
+
+   rte_eal_mp_wait_lcore();
+
+   RTE_LCORE_FOREACH_WORKER(lcore_id) {
+   struct int_checker_state *state = &states[lcore_id];
+   int value;
+
+   TEST_ASSERT(state->success, "Unexpected value "
+   "encountered on lcore %d", lcore_id);
+
+   value = *RTE_LCORE_VAR_LCORE_VALUE(lcore_id, test_int);
+   TEST_ASSERT_EQUAL(state->new_value, value,
+ "Lcore %d failed to update int", lcore_id);
+   }
+
+   /* take the opportunity to test the foreach macro */
+   int *v;
+   unsigned int i = 0;
+   RTE_LCORE_VAR_FOREACH_VALUE(lcore_id, v, test_int) {
+   TEST_ASSERT_EQUAL(i, lcore_id, "Encountered lcore id %d "
+ "while expecting %d during iteration",
+ lcore_id, i);
+   TEST_ASSERT_EQUAL(states[lcore_id].new_value, *v,
+ "Unexpected value on lcore %d during "
+ "iteration", lcore_id);
+   i++;
+   }
+
+   return TEST_SUCCESS;
+}
+
+static int
+test_sized_alignment(void)
+{
+   unsigned int lcore_id;
+   long *v;
+
+   RTE_LCORE_VAR_FOREACH_VALUE(lcore_id, v, test_long_sized) {
+   TEST_ASSERT(is_ptr_aligned(v, alignof(long)),
+   "Type-derived alignment failed");
+   }
+
+   RTE_LCORE_VAR_FOREACH_VALUE(lcore_id, v, test_long_sized_aligned) {
+   TEST_

[PATCH v7 5/7] power: keep per-lcore state in lcore variable

2024-09-18 Thread Mattias Rönnblom
Replace static array of cache-aligned structs with an lcore variable,
to slightly benefit code simplicity and performance.

Signed-off-by: Mattias Rönnblom 
Acked-by: Morten Brørup 
Acked-by: Konstantin Ananyev 

--

PATCH v6:
 * Update FOREACH invocation to match new API.

RFC v3:
 * Replace for loop with FOREACH macro.
---
 lib/power/rte_power_pmd_mgmt.c | 35 +-
 1 file changed, 17 insertions(+), 18 deletions(-)

diff --git a/lib/power/rte_power_pmd_mgmt.c b/lib/power/rte_power_pmd_mgmt.c
index b1c18a5f56..a981db4b39 100644
--- a/lib/power/rte_power_pmd_mgmt.c
+++ b/lib/power/rte_power_pmd_mgmt.c
@@ -5,6 +5,7 @@
 #include 
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -69,7 +70,7 @@ struct __rte_cache_aligned pmd_core_cfg {
uint64_t sleep_target;
/**< Prevent a queue from triggering sleep multiple times */
 };
-static struct pmd_core_cfg lcore_cfgs[RTE_MAX_LCORE];
+static RTE_LCORE_VAR_HANDLE(struct pmd_core_cfg, lcore_cfgs);
 
 static inline bool
 queue_equal(const union queue *l, const union queue *r)
@@ -252,12 +253,11 @@ clb_multiwait(uint16_t port_id __rte_unused, uint16_t 
qidx __rte_unused,
struct rte_mbuf **pkts __rte_unused, uint16_t nb_rx,
uint16_t max_pkts __rte_unused, void *arg)
 {
-   const unsigned int lcore = rte_lcore_id();
struct queue_list_entry *queue_conf = arg;
struct pmd_core_cfg *lcore_conf;
const bool empty = nb_rx == 0;
 
-   lcore_conf = &lcore_cfgs[lcore];
+   lcore_conf = RTE_LCORE_VAR_VALUE(lcore_cfgs);
 
/* early exit */
if (likely(!empty))
@@ -317,13 +317,12 @@ clb_pause(uint16_t port_id __rte_unused, uint16_t qidx 
__rte_unused,
struct rte_mbuf **pkts __rte_unused, uint16_t nb_rx,
uint16_t max_pkts __rte_unused, void *arg)
 {
-   const unsigned int lcore = rte_lcore_id();
struct queue_list_entry *queue_conf = arg;
struct pmd_core_cfg *lcore_conf;
const bool empty = nb_rx == 0;
uint32_t pause_duration = rte_power_pmd_mgmt_get_pause_duration();
 
-   lcore_conf = &lcore_cfgs[lcore];
+   lcore_conf = RTE_LCORE_VAR_VALUE(lcore_cfgs);
 
if (likely(!empty))
/* early exit */
@@ -358,9 +357,8 @@ clb_scale_freq(uint16_t port_id __rte_unused, uint16_t qidx 
__rte_unused,
struct rte_mbuf **pkts __rte_unused, uint16_t nb_rx,
uint16_t max_pkts __rte_unused, void *arg)
 {
-   const unsigned int lcore = rte_lcore_id();
const bool empty = nb_rx == 0;
-   struct pmd_core_cfg *lcore_conf = &lcore_cfgs[lcore];
+   struct pmd_core_cfg *lcore_conf = RTE_LCORE_VAR_VALUE(lcore_cfgs);
struct queue_list_entry *queue_conf = arg;
 
if (likely(!empty)) {
@@ -518,7 +516,7 @@ rte_power_ethdev_pmgmt_queue_enable(unsigned int lcore_id, 
uint16_t port_id,
goto end;
}
 
-   lcore_cfg = &lcore_cfgs[lcore_id];
+   lcore_cfg = RTE_LCORE_VAR_LCORE_VALUE(lcore_id, lcore_cfgs);
 
/* check if other queues are stopped as well */
ret = cfg_queues_stopped(lcore_cfg);
@@ -619,7 +617,7 @@ rte_power_ethdev_pmgmt_queue_disable(unsigned int lcore_id,
}
 
/* no need to check queue id as wrong queue id would not be enabled */
-   lcore_cfg = &lcore_cfgs[lcore_id];
+   lcore_cfg = RTE_LCORE_VAR_LCORE_VALUE(lcore_id, lcore_cfgs);
 
/* check if other queues are stopped as well */
ret = cfg_queues_stopped(lcore_cfg);
@@ -769,21 +767,22 @@ rte_power_pmd_mgmt_get_scaling_freq_max(unsigned int 
lcore)
 }
 
 RTE_INIT(rte_power_ethdev_pmgmt_init) {
-   size_t i;
-   int j;
+   unsigned int lcore_id;
+   struct pmd_core_cfg *lcore_cfg;
+   int i;
+
+   RTE_LCORE_VAR_ALLOC(lcore_cfgs);
 
/* initialize all tailqs */
-   for (i = 0; i < RTE_DIM(lcore_cfgs); i++) {
-   struct pmd_core_cfg *cfg = &lcore_cfgs[i];
-   TAILQ_INIT(&cfg->head);
-   }
+   RTE_LCORE_VAR_FOREACH_VALUE(lcore_id, lcore_cfg, lcore_cfgs)
+   TAILQ_INIT(&lcore_cfg->head);
 
/* initialize config defaults */
emptypoll_max = 512;
pause_duration = 1;
/* scaling defaults out of range to ensure not used unless set by user 
or app */
-   for (j = 0; j < RTE_MAX_LCORE; j++) {
-   scale_freq_min[j] = 0;
-   scale_freq_max[j] = UINT32_MAX;
+   for (i = 0; i < RTE_MAX_LCORE; i++) {
+   scale_freq_min[i] = 0;
+   scale_freq_max[i] = UINT32_MAX;
}
 }
-- 
2.34.1



[PATCH v7 7/7] eal: keep per-lcore power intrinsics state in lcore variable

2024-09-18 Thread Mattias Rönnblom
Keep per-lcore power intrinsics state in a lcore variable to reduce
cache working set size and avoid any CPU next-line-prefetching causing
false sharing.

Signed-off-by: Mattias Rönnblom 
Acked-by: Morten Brørup 
Acked-by: Konstantin Ananyev 
---
 lib/eal/x86/rte_power_intrinsics.c | 17 +++--
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/lib/eal/x86/rte_power_intrinsics.c 
b/lib/eal/x86/rte_power_intrinsics.c
index 6d9b64240c..f4ba2c8ecb 100644
--- a/lib/eal/x86/rte_power_intrinsics.c
+++ b/lib/eal/x86/rte_power_intrinsics.c
@@ -6,6 +6,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -14,10 +15,14 @@
 /*
  * Per-lcore structure holding current status of C0.2 sleeps.
  */
-static alignas(RTE_CACHE_LINE_SIZE) struct power_wait_status {
+struct power_wait_status {
rte_spinlock_t lock;
volatile void *monitor_addr; /**< NULL if not currently sleeping */
-} wait_status[RTE_MAX_LCORE];
+};
+
+RTE_LCORE_VAR_HANDLE(struct power_wait_status, wait_status);
+
+RTE_LCORE_VAR_INIT(wait_status);
 
 /*
  * This function uses UMONITOR/UMWAIT instructions and will enter C0.2 state.
@@ -172,7 +177,7 @@ rte_power_monitor(const struct rte_power_monitor_cond *pmc,
if (pmc->fn == NULL)
return -EINVAL;
 
-   s = &wait_status[lcore_id];
+   s = RTE_LCORE_VAR_LCORE_VALUE(lcore_id, wait_status);
 
/* update sleep address */
rte_spinlock_lock(&s->lock);
@@ -264,7 +269,7 @@ rte_power_monitor_wakeup(const unsigned int lcore_id)
if (lcore_id >= RTE_MAX_LCORE)
return -EINVAL;
 
-   s = &wait_status[lcore_id];
+   s = RTE_LCORE_VAR_LCORE_VALUE(lcore_id, wait_status);
 
/*
 * There is a race condition between sleep, wakeup and locking, but we
@@ -303,8 +308,8 @@ int
 rte_power_monitor_multi(const struct rte_power_monitor_cond pmc[],
const uint32_t num, const uint64_t tsc_timestamp)
 {
-   const unsigned int lcore_id = rte_lcore_id();
-   struct power_wait_status *s = &wait_status[lcore_id];
+   struct power_wait_status *s = RTE_LCORE_VAR_VALUE(wait_status);
+
uint32_t i, rc;
 
/* check if supported */
-- 
2.34.1



[PATCH v7 4/7] random: keep PRNG state in lcore variable

2024-09-18 Thread Mattias Rönnblom
Replace keeping PRNG state in a RTE_MAX_LCORE-sized static array of
cache-aligned and RTE_CACHE_GUARDed struct instances with keeping the
same state in a more cache-friendly lcore variable.

Signed-off-by: Mattias Rönnblom 
Acked-by: Morten Brørup 
Acked-by: Konstantin Ananyev 

--

RFC v3:
 * Remove cache alignment on unregistered threads' rte_rand_state.
   (Morten Brørup)
---
 lib/eal/common/rte_random.c | 28 +---
 1 file changed, 17 insertions(+), 11 deletions(-)

diff --git a/lib/eal/common/rte_random.c b/lib/eal/common/rte_random.c
index 90e91b3c4f..a8d00308dd 100644
--- a/lib/eal/common/rte_random.c
+++ b/lib/eal/common/rte_random.c
@@ -11,6 +11,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 struct __rte_cache_aligned rte_rand_state {
@@ -19,14 +20,12 @@ struct __rte_cache_aligned rte_rand_state {
uint64_t z3;
uint64_t z4;
uint64_t z5;
-   RTE_CACHE_GUARD;
 };
 
-/* One instance each for every lcore id-equipped thread, and one
- * additional instance to be shared by all others threads (i.e., all
- * unregistered non-EAL threads).
- */
-static struct rte_rand_state rand_states[RTE_MAX_LCORE + 1];
+RTE_LCORE_VAR_HANDLE(struct rte_rand_state, rand_state);
+
+/* instance to be shared by all unregistered non-EAL threads */
+static struct rte_rand_state unregistered_rand_state;
 
 static uint32_t
 __rte_rand_lcg32(uint32_t *seed)
@@ -85,8 +84,14 @@ rte_srand(uint64_t seed)
unsigned int lcore_id;
 
/* add lcore_id to seed to avoid having the same sequence */
-   for (lcore_id = 0; lcore_id < RTE_DIM(rand_states); lcore_id++)
-   __rte_srand_lfsr258(seed + lcore_id, &rand_states[lcore_id]);
+   for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
+   struct rte_rand_state *lcore_state =
+   RTE_LCORE_VAR_LCORE_VALUE(lcore_id, rand_state);
+
+   __rte_srand_lfsr258(seed + lcore_id, lcore_state);
+   }
+
+   __rte_srand_lfsr258(seed + lcore_id, &unregistered_rand_state);
 }
 
 static __rte_always_inline uint64_t
@@ -124,11 +129,10 @@ struct rte_rand_state *__rte_rand_get_state(void)
 
idx = rte_lcore_id();
 
-   /* last instance reserved for unregistered non-EAL threads */
if (unlikely(idx == LCORE_ID_ANY))
-   idx = RTE_MAX_LCORE;
+   return &unregistered_rand_state;
 
-   return &rand_states[idx];
+   return RTE_LCORE_VAR_VALUE(rand_state);
 }
 
 uint64_t
@@ -228,6 +232,8 @@ RTE_INIT(rte_rand_init)
 {
uint64_t seed;
 
+   RTE_LCORE_VAR_ALLOC(rand_state);
+
seed = __rte_random_initial_seed();
 
rte_srand(seed);
-- 
2.34.1



Re: [PATCH v6 1/6] dpdk: do not force C linkage on include file dependencies

2024-09-18 Thread David Marchand
On Tue, Sep 17, 2024 at 11:30 AM Mattias Rönnblom  wrote:
>
> On 2024-09-16 14:05, David Marchand wrote:
> > Hello,
> >
> > On Tue, Sep 10, 2024 at 10:41 AM Mattias Rönnblom
> >  wrote:
> >> diff --git a/lib/acl/rte_acl_osdep.h b/lib/acl/rte_acl_osdep.h
> >> index 3c1dc402ca..e4c7d07c69 100644
> >> --- a/lib/acl/rte_acl_osdep.h
> >> +++ b/lib/acl/rte_acl_osdep.h
> >> @@ -5,10 +5,6 @@
> >>   #ifndef _RTE_ACL_OSDEP_H_
> >>   #define _RTE_ACL_OSDEP_H_
> >>
> >> -#ifdef __cplusplus
> >> -extern "C" {
> >> -#endif
> >> -
> >>   /**
> >>* @file
> >>*
> >> @@ -49,6 +45,10 @@ extern "C" {
> >>   #include 
> >>   #include 
> >>
> >> +#ifdef __cplusplus
> >> +extern "C" {
> >> +#endif
> >> +
> >>   #ifdef __cplusplus
> >>   }
> >>   #endif
> >
> > This part is a NOOP, so we can just drop it.
> >
>
> I did try to drop such NOOPs, but then something called
> sanitycheckcpp.exe failed the build because it required 'extern "C"' in
> those header files.
>
> Isn't that check superfluous? A missing 'extern "C"' would be detected
> at a later stage, when the dummy C++ programs are compiled against the
> public header files.
>
> If we agree sanitycheckcpp.exe should be fixed, should that be a separate
> patch, or does it need to be part of this patch set?

This check was added with 1ee492bdc4ff ("buildtools/chkincs: check
missing C++ guards").
The check is too naive, and I am not sure we can actually make a better one...

I would remove this check, if no better option.


> >> diff --git a/lib/eal/include/generic/rte_atomic.h 
> >> b/lib/eal/include/generic/rte_atomic.h
> >> index f859707744..0a4f3f8528 100644
> >> --- a/lib/eal/include/generic/rte_atomic.h
> >> +++ b/lib/eal/include/generic/rte_atomic.h
> >> @@ -17,6 +17,10 @@
> >>   #include 
> >>   #include 
> >>
> >> +#ifdef __cplusplus
> >> +extern "C" {
> >> +#endif
> >> +
> >>   #ifdef __DOXYGEN__
> >>
> >>   /** @name Memory Barrier
> >> @@ -1156,4 +1160,8 @@ rte_atomic128_cmp_exchange(rte_int128_t *dst,
> >>
> >>   #endif /* __DOXYGEN__ */
> >>
> >> +#ifdef __cplusplus
> >> +}
> >> +#endif
> >> +
> >
> > I would move under #ifdef DOXYGEN.
> >
>
> Why? The pattern now is "almost always directly after the #includes".
> That is better than before, but not ideal. C linkage should only cover
> the functions and global variables being declared, I think.

I hear you about how the marking was done, but it already includes some
manual edits (seeing as some fixes were needed).


-- 
David Marchand



Re: [PATCH v6 1/6] dpdk: do not force C linkage on include file dependencies

2024-09-18 Thread Bruce Richardson
On Wed, Sep 18, 2024 at 02:09:26PM +0200, Mattias Rönnblom wrote:
> On 2024-09-18 13:15, David Marchand wrote:
> > On Tue, Sep 17, 2024 at 11:30 AM Mattias Rönnblom  
> > wrote:
> > > 
> > > On 2024-09-16 14:05, David Marchand wrote:
> > > > Hello,
> > > > 
> > > > On Tue, Sep 10, 2024 at 10:41 AM Mattias Rönnblom
> > > >  wrote:
> > > > > diff --git a/lib/acl/rte_acl_osdep.h b/lib/acl/rte_acl_osdep.h
> > > > > index 3c1dc402ca..e4c7d07c69 100644
> > > > > --- a/lib/acl/rte_acl_osdep.h
> > > > > +++ b/lib/acl/rte_acl_osdep.h
> > > > > @@ -5,10 +5,6 @@
> > > > >#ifndef _RTE_ACL_OSDEP_H_
> > > > >#define _RTE_ACL_OSDEP_H_
> > > > > 
> > > > > -#ifdef __cplusplus
> > > > > -extern "C" {
> > > > > -#endif
> > > > > -
> > > > >/**
> > > > > * @file
> > > > > *
> > > > > @@ -49,6 +45,10 @@ extern "C" {
> > > > >#include 
> > > > >#include 
> > > > > 
> > > > > +#ifdef __cplusplus
> > > > > +extern "C" {
> > > > > +#endif
> > > > > +
> > > > >#ifdef __cplusplus
> > > > >}
> > > > >#endif
> > > > 
> > > > This part is a NOOP, so we can just drop it.
> > > > 
> > > 
> > > I did try to drop such NOOPs, but then something called
> > > sanitycheckcpp.exe failed the build because it required 'extern "C"' in
> > > those header files.
> > > 
> > > Isn't that check superfluous? A missing 'extern "C"' would be detected
> > > at a later stage, when the dummy C++ programs are compiled against the
> > > public header files.
> > > 
> > > If we agree sanitycheckcpp.exe should be fixed, should that be a separate
> > > patch, or does it need to be part of this patch set?
> > 
> > This check was added with 1ee492bdc4ff ("buildtools/chkincs: check
> > missing C++ guards").
> > The check is too naive, and I am not sure we can actually make a better 
> > one...
> > 
> > I would remove this check, if no better option.
> > 
> 
> Just to be clear: what you are suggesting is removing the check as a part of
> this patch set?
> 
> I think I was wrong saying the dummy C++ programs already detect omissions
> of C linkage.
> 
> I'll leave for Bruce to comment on this before I do anything.
> 

I agree that the existing check is very naive. Maybe we can go with a
simple fix like adding an allowlist of files which we ignore for 'extern C'
checking?

Unfortunately, I don't remember the details of the original patch, but from the
commit log I gather that just compiling C++ code against the C headers
didn't throw any errors for the missing extern. I think the functions need
to actually be called and then linked for us to see the errors,
and that is not something that is easily implemented.

/Bruce
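
Bruce's point -- that the omission only surfaces at link time -- can be reproduced with a small standalone experiment (file and symbol names below are made up for illustration, this is not DPDK's chkincs):

```shell
workdir=$(mktemp -d)
cd "$workdir"

cat > impl.c <<'EOF'
int foo(int x) { return x + 1; }
EOF

cat > use.cpp <<'EOF'
int foo(int x);              /* no extern "C": C++ will look for _Z3fooi */
int main() { return foo(1) == 2 ? 0 : 1; }
EOF

cc -c impl.c -o impl.o       # compiles cleanly
c++ -c use.cpp -o use.o      # also compiles cleanly -- no error at this stage

# The missing extern "C" only surfaces when linking the two objects:
if c++ use.o impl.o -o prog 2>/dev/null; then
    result="linked unexpectedly"
else
    result="undefined reference"
fi
echo "$result"
```

Both compile steps succeed, which is why dummy C++ compilation alone cannot catch a missing guard; only the final link against the C-compiled object fails.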


Re: [PATCH v3 08/12] dts: add NIC capability support

2024-09-18 Thread Juraj Linkeš




On 27. 8. 2024 18:36, Jeremy Spewock wrote:

On Wed, Aug 21, 2024 at 10:53 AM Juraj Linkeš
 wrote:


diff --git a/dts/framework/testbed_model/capability.py 
b/dts/framework/testbed_model/capability.py
index 8899f07f76..9a79e6ebb3 100644
--- a/dts/framework/testbed_model/capability.py
+++ b/dts/framework/testbed_model/capability.py
@@ -5,14 +5,40 @@



+@classmethod
+def get_supported_capabilities(
+cls, sut_node: SutNode, topology: "Topology"
+) -> set["DecoratedNicCapability"]:
+"""Overrides :meth:`~Capability.get_supported_capabilities`.
+
+The capabilities are first sorted by decorators, then reduced into a 
single function which
+is then passed to the decorator. This way we only execute each 
decorator only once.
+"""
+supported_conditional_capabilities: set["DecoratedNicCapability"] = 
set()
+logger = get_dts_logger(f"{sut_node.name}.{cls.__name__}")
+if topology.type is Topology.type.no_link:


As a follow-up, I didn't notice this during my initial review, but in
testing this line was throwing attribute errors for me due to Topology
not having an attribute named `type`. I think this was because of
`Topology.type.no_link` since this attribute isn't initialized on the
class itself. I fixed this by just replacing it with
`TopologyType.no_link` locally.



I also ran into this; the type attribute is not a class variable. Your 
solution works (and I also originally fixed it exactly that way), but 
then I realized topology.type.no_link also works (and was probably my 
intention), which doesn't require the extra import of TopologyType.
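
The class-versus-instance attribute behaviour behind the error can be sketched in a few lines (the names mirror the DTS classes, but this is a standalone illustration, not the framework code):

```python
from dataclasses import dataclass
from enum import Enum, auto


class TopologyType(Enum):
    """Stand-in for the DTS topology type enum; member names are illustrative."""
    no_link = auto()
    one_link = auto()


@dataclass
class Topology:
    """Stand-in for the DTS Topology class; `type` is an instance attribute."""
    type: TopologyType


topology = Topology(type=TopologyType.no_link)

# Accessing through the instance works:
assert topology.type is TopologyType.no_link

# Accessing through the class raises, because the dataclass field only
# exists on instances -- this is the AttributeError seen in testing:
try:
    Topology.type
    raised = False
except AttributeError:
    raised = True
assert raised
```

Worth noting: the `topology.type.no_link` spelling relies on reaching one enum member through another, which recent Python releases have deprecated, so `TopologyType.no_link` may be the more future-proof form despite the extra import.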


RE: [PATCH] eal: add build-time option to omit trace

2024-09-18 Thread Morten Brørup
> From: Jerin Jacob [mailto:jerinjac...@gmail.com]
> Sent: Wednesday, 18 September 2024 11.50
to omit trace
> 
> On Wed, Sep 18, 2024 at 2:39 PM Morten Brørup 
> wrote:
> >
> > Some applications want to omit the trace feature.
> > Either to reduce the memory footprint, to reduce the exposed attack
> > surface, or for other reasons.
> >
> > This patch adds an option in rte_config.h to include or omit trace in the
> > build. Trace is included by default.
> >
> > Omitting trace works by omitting all trace points.
> > For API and ABI compatibility, the trace feature itself remains.
> >
> > Signed-off-by: Morten Brørup 
> > ---
> >  app/test/test_trace.c  | 11 ++-
> >  config/rte_config.h|  1 +
> >  lib/eal/include/rte_trace_point.h  |  6 +-
> >  lib/eal/include/rte_trace_point_register.h |  8 
> >  4 files changed, 24 insertions(+), 2 deletions(-)
> >
> > diff --git a/app/test/test_trace.c b/app/test/test_trace.c
> > index 00809f433b..7918cc865d 100644
> > --- a/app/test/test_trace.c
> > +++ b/app/test/test_trace.c
> > @@ -12,7 +12,16 @@
> >
> >  int app_dpdk_test_tp_count;
> >
> > -#ifdef RTE_EXEC_ENV_WINDOWS
> > +#if !defined(RTE_TRACE)
> > +
> > +static int
> > +test_trace(void)
> > +{
> > +   printf("trace omitted at build-time, skipping test\n");
> > +   return TEST_SKIPPED;
> > +}
> > +
> > +#elif defined(RTE_EXEC_ENV_WINDOWS)
> >
> >  static int
> >  test_trace(void)
> > diff --git a/config/rte_config.h b/config/rte_config.h
> > index dd7bb0d35b..fd6f8a2f1a 100644
> > --- a/config/rte_config.h
> > +++ b/config/rte_config.h
> > @@ -49,6 +49,7 @@
> >  #define RTE_MAX_TAILQ 32
> >  #define RTE_LOG_DP_LEVEL RTE_LOG_INFO
> >  #define RTE_MAX_VFIO_CONTAINERS 64
> > +#define RTE_TRACE 1
> >
> >  /* bsd module defines */
> >  #define RTE_CONTIGMEM_MAX_NUM_BUFS 64
> > diff --git a/lib/eal/include/rte_trace_point.h
> b/lib/eal/include/rte_trace_point.h
> > index 41e2a7f99e..1b60bba043 100644
> > --- a/lib/eal/include/rte_trace_point.h
> > +++ b/lib/eal/include/rte_trace_point.h
> > @@ -212,6 +212,7 @@ bool rte_trace_point_is_enabled(rte_trace_point_t *tp);
> >  __rte_experimental
> >  rte_trace_point_t *rte_trace_point_lookup(const char *name);
> >
> > +#ifdef RTE_TRACE
> >  /**
> >   * @internal
> >   *
> > @@ -230,6 +231,7 @@ __rte_trace_point_fp_is_enabled(void)
> > return false;
> >  #endif
> >  }
> > +#endif /* RTE_TRACE */
> >
> >  /**
> >   * @internal
> > @@ -356,6 +358,8 @@ __rte_trace_point_emit_ev_header(void *mem, uint64_t in)
> > return RTE_PTR_ADD(mem, __RTE_TRACE_EVENT_HEADER_SZ);
> >  }
> >
> > +#ifdef RTE_TRACE
> 
> 
> Please change to 1.4.5 style _if possible_ in
> https://doc.dpdk.org/guides/contributing/coding_style.html.

The Coding Style's chapter 1.4.5 only applies to O/S selection.
Boolean configuration definitions either have the value 1 or are not defined.

> Assuming linker will remove the memory from the image if it is not
> using by stubbing out.

The assumption could be false when building with optimizations disabled; I don't know.

> 
> Untested.
> 
> #define __rte_trace_point_emit_header_generic(t) \
> void *mem; \
> do { \
>   +  if (RTE_TRACE_ENABLED == 0) \
>   + return \
> const uint64_t val = rte_atomic_load_explicit(t,
> rte_memory_order_acquire); \
> if (likely(!(val & __RTE_TRACE_FIELD_ENABLE_MASK))) \
> return; \
> mem = __rte_trace_mem_get(val); \
> if (unlikely(mem == NULL)) \
> return; \
> mem = __rte_trace_point_emit_ev_header(mem, val); \
> } while (0)

I was initially down that path, inspired by how FP is enabled/disabled;
using an added __rte_trace_point_is_enabled(void) inline function, like the 
__rte_trace_point_fp_is_enabled(void).

But I kept getting lots of warnings, either about nonexistent references or 
about stuff not being used.

So I gave up and did it this way instead.
After having tried the other way, this way also looks cleaner to me.

> 
> 
> 
> > +
> >  #define __rte_trace_point_emit_header_generic(t) \
> >  void *mem; \
> >  do { \
> > @@ -411,7 +415,7 @@ do { \
> > RTE_SET_USED(len); \
> >  } while (0)
> >
> > -
> > +#endif /* RTE_TRACE */
> >  #endif /* ALLOW_EXPERIMENTAL_API */
> >  #endif /* _RTE_TRACE_POINT_REGISTER_H_ */
> >
> > diff --git a/lib/eal/include/rte_trace_point_register.h
> b/lib/eal/include/rte_trace_point_register.h
> > index 41260e5964..78c0ede5f1 100644
> > --- a/lib/eal/include/rte_trace_point_register.h
> > +++ b/lib/eal/include/rte_trace_point_register.h
> > @@ -18,6 +18,8 @@ extern "C" {
> >
> >  RTE_DECLARE_PER_LCORE(volatile int, trace_point_sz);
> >
> > +#ifdef RTE_TRACE
> > +
> >  #define RTE_TRACE_POINT_REGISTER(trace, name) \
> >  rte_trace_point_t __rte_section("__rte_trace_point") __##trace; \
> >  static const char __##trace##_name[] = RTE_STR(name); \
> > @@ -27,6 +29,12 @@ RTE_INIT(trace##_init) \
> > (void (*)(void)) tr

Re: 21.11.8 patches review and test

2024-09-18 Thread Kevin Traynor
On 18/09/2024 08:50, Ali Alnubani wrote:
>> -Original Message-
>> From: Kevin Traynor 
>> Sent: Thursday, September 5, 2024 3:38 PM
>> To: sta...@dpdk.org
>> Cc: dev@dpdk.org; Abhishek Marathe ; Ali
>> Alnubani ; David Christensen ;
>> Hemant Agrawal ; Ian Stokes
>> ; Jerin Jacob ; John McNamara
>> ; Ju-Hyoung Lee ; Kevin
>> Traynor ; Luca Boccassi ; Pei Zhang
>> ; Raslan Darawsheh ; NBU-
>> Contact-Thomas Monjalon (EXTERNAL) ;
>> yangh...@redhat.com
>> Subject: 21.11.8 patches review and test
>>
>> Hi all,
>>
>> Here is a list of patches targeted for stable release 21.11.8.
>>
>> The planned date for the final release is 18th September.
>>
>> Please help with testing and validation of your use cases and report
>> any issues/results with reply-all to this mail. For the final release
>> the fixes and reported validations will be added to the release notes.
>>
>> A release candidate tarball can be found at:
>>
>> https://dpdk.org/browse/dpdk-stable/tag/?id=v21.11.8-rc1
>>
>> These patches are located at branch 21.11 of dpdk-stable repo:
>> https://dpdk.org/browse/dpdk-stable/
>>
>> Thanks.
>>
>> Kevin
>>
>> ---
> 
> Hello,
> 
> We ran the following functional tests with Nvidia hardware on 21.11.8-rc1:
> - Basic functionality:
>   Send and receive multiple types of traffic.
> - testpmd xstats counter test.
> - testpmd timestamp test.
> - Changing/checking link status through testpmd.
> - rte_flow tests 
> (https://doc.dpdk.org/guides/nics/mlx5.html#supported-hardware-offloads)
> - RSS tests.
> - VLAN filtering, stripping, and insertion tests.
> - Checksum and TSO tests.
> - ptype tests.
> - link_status_interrupt example application tests.
> - l3fwd-power example application tests.
> - Multi-process example applications tests.
> - Hardware LRO tests.
> - Buffer Split tests.
> - Tx scheduling tests.
> 
> Functional tests ran on:
> - NIC: ConnectX-6 Dx / OS: Ubuntu 20.04 / Driver: 
> MLNX_OFED_LINUX-24.07-0.6.1.0 / Firmware: 22.42.1000
> - NIC: ConnectX-7 / OS: Ubuntu 20.04 / Driver: MLNX_OFED_LINUX-24.07-0.6.1.0 
> / Firmware: 28.42.1000
> - DPU: BlueField-2 / DOCA SW version: 2.8 / Firmware: 24.42.1000
> 
> Additionally, we ran build tests with multiple configurations on the 
> following OS/driver combinations (all passed):
> - Debian 12 with MLNX_OFED_LINUX-24.07-0.6.1.0.
> - Ubuntu 20.04.6 with MLNX_OFED_LINUX-24.07-0.6.1.0.
> - Ubuntu 20.04.6 with rdma-core master (dd9c687).
> - Ubuntu 20.04.6 with rdma-core v28.0.
> - Fedora 40 with rdma-core v48.0.
> - Fedora 42 (Rawhide) with rdma-core v51.0.
> - OpenSUSE Leap 15.6 with rdma-core v49.1.
> 
> We don't see new issues caused by the changes in this release.
> 

Thanks Ali for this and your help. I will add to the release notes,
Kevin.

> Thanks,
> Ali





Re: [PATCH v6 1/7] eal: add static per-lcore memory allocation facility

2024-09-18 Thread Mattias Rönnblom

On 2024-09-18 10:24, Konstantin Ananyev wrote:

+/**
+ * Iterate over each lcore id's value for an lcore variable.
+ *
+ * @param lcore_id
+ *   An unsigned int variable successively set to the
+ *   lcore id of every valid lcore id (up to @c RTE_MAX_LCORE).
+ * @param value
+ *   A pointer variable successively set to point to lcore variable
+ *   value instance of the current lcore id being processed.
+ * @param handle
+ *   The lcore variable handle.
+ */
+#define RTE_LCORE_VAR_FOREACH_VALUE(lcore_id, value, handle)   \
+   for (lcore_id = (((value) = RTE_LCORE_VAR_LCORE_VALUE(0, handle)), 0); \
+lcore_id < RTE_MAX_LCORE;   \
+lcore_id++, (value) = RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle))
+


I think we need a '()' around references to lcore_id:
  for ((lcore_id) = (((value) = RTE_LCORE_VAR_LCORE_VALUE(0, handle)), 0); \
 (lcore_id) < RTE_MAX_LCORE; \
 (lcore_id)++, (value) = RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle))


Yes, of course. Thanks.


RE: [PATCH v6 1/7] eal: add static per-lcore memory allocation facility

2024-09-18 Thread Konstantin Ananyev
> +/**
> + * Iterate over each lcore id's value for an lcore variable.
> + *
> + * @param lcore_id
> + *   An unsigned int variable successively set to the
> + *   lcore id of every valid lcore id (up to @c RTE_MAX_LCORE).
> + * @param value
> + *   A pointer variable successively set to point to lcore variable
> + *   value instance of the current lcore id being processed.
> + * @param handle
> + *   The lcore variable handle.
> + */
> +#define RTE_LCORE_VAR_FOREACH_VALUE(lcore_id, value, handle) \
> + for (lcore_id = (((value) = RTE_LCORE_VAR_LCORE_VALUE(0, handle)), 0); \
> +  lcore_id < RTE_MAX_LCORE;  \
> +  lcore_id++, (value) = RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle))
> +

I think we need a '()' around references to lcore_id:
 for ((lcore_id) = (((value) = RTE_LCORE_VAR_LCORE_VALUE(0, handle)), 0); \
 (lcore_id) < RTE_MAX_LCORE; \
 (lcore_id)++, (value) = RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle))


RE: [PATCH v6 2/7] eal: add lcore variable functional tests

2024-09-18 Thread Konstantin Ananyev


> -Original Message-
> From: Mattias Rönnblom 
> Sent: Wednesday, September 18, 2024 9:01 AM
> To: dev@dpdk.org
> Cc: hof...@lysator.liu.se; Morten Brørup ; 
> Stephen Hemminger ;
> Konstantin Ananyev ; David Marchand 
> ; Jerin Jacob
> ; Mattias Rönnblom 
> Subject: [PATCH v6 2/7] eal: add lcore variable functional tests
> 
> Add functional test suite to exercise the  API.
> 
> Signed-off-by: Mattias Rönnblom 
> Acked-by: Morten Brørup 
> 
> --

Acked-by: Konstantin Ananyev 

> 2.34.1
> 



Re: [PATCH v2 1/6] eal: add static per-lcore memory allocation facility

2024-09-18 Thread Jerin Jacob
On Thu, Sep 12, 2024 at 8:52 PM Jerin Jacob  wrote:
>
> On Thu, Sep 12, 2024 at 7:11 PM Morten Brørup  
> wrote:
> >
> > > From: Jerin Jacob [mailto:jerinjac...@gmail.com]
> > > Sent: Thursday, 12 September 2024 15.17
> > >
> > > On Thu, Sep 12, 2024 at 2:40 PM Morten Brørup 
> > > wrote:
> > > >
> > > > > +#define LCORE_BUFFER_SIZE (RTE_MAX_LCORE_VAR * RTE_MAX_LCORE)
> > > >
> > > > Considering hugepages...
> > > >
> > > > Lcore variables may be allocated before DPDK's memory allocator
> > > (rte_malloc()) is ready, so rte_malloc() cannot be used for lcore 
> > > variables.
> > > >
> > > > And lcore variables are not usable (shared) for DPDK multi-process, so 
> > > > the
> > > lcore_buffer could be allocated through the O/S APIs as anonymous 
> > > hugepages,
> > > instead of using rte_malloc().
> > > >
> > > > The alternative, using rte_malloc(), would disallow allocating lcore
> > > variables before DPDK's memory allocator has been initialized, which I 
> > > think
> > > is too late.
> > >
> > > I thought it is not. A lot of the subsystems are initialized after the
> > > memory subsystem is initialized.
> > > [1] example given in documentation. I thought, RTE_INIT needs to
> > > replaced if the subsystem called after memory initialized (which is
> > > the case for most of the libraries)
> >
> > The list of RTE_INIT functions are called before main(). It is not very 
> > useful.
> >
> > Yes, it would be good to replace (or supplement) RTE_INIT_PRIO by something 
> > similar, which calls the list of "INIT" functions at the appropriate time 
> > during EAL initialization.
> >
> > DPDK should then use this "INIT" list for all its initialization, so the 
> > init function of new features (such as this, and trace) can be inserted at 
> > the correct location in the list.
> >
> > > Trace library had a similar situation. It is managed like [2]
> >
> > Yes, if we insist on using rte_malloc() for lcore variables, the 
> > alternative is to prohibit establishing lcore variables in functions called 
> > through RTE_INIT.
>
> I was not insisting on using ONLY rte_malloc(). rte_malloc() can be
> called before rte_eal_init() (it will return NULL). The alloc routine can
> first check whether rte_malloc() is available and, if not, switch over to glibc.


@Mattias Rönnblom This comment is not addressed in v7. Could you check?


[RFC v2 1/1] dmadev: support priority configuration

2024-09-18 Thread Vamsi Krishna
From: Vamsi Attunuru 

Some DMA controllers offer the ability to configure priority level
for the hardware command queues, allowing for the prioritization of
DMA command execution based on queue importance.

This patch introduces the necessary fields in the dmadev structures to
retrieve information about the hardware-supported priority levels and to
enable priority configuration from the application.

Signed-off-by: Vamsi Attunuru 
Signed-off-by: Amit Prakash Shukla 
---
V2 changes:
* Reverted removed text from release_24_11.rst

V1 changes:
* Added trace support
* Added new capability flag

Deprecation notice:
https://patches.dpdk.org/project/dpdk/patch/20240730144612.2132848-1-amitpraka...@marvell.com/

* Assuming we do not anticipate any advanced scheduling schemes for dmadev 
queues,
  this RFC is intended to support a strict priority scheme.

 doc/guides/rel_notes/release_24_11.rst |  8 
 lib/dmadev/rte_dmadev.c| 15 +++
 lib/dmadev/rte_dmadev.h| 21 +
 lib/dmadev/rte_dmadev_trace.h  |  2 ++
 4 files changed, 46 insertions(+)

diff --git a/doc/guides/rel_notes/release_24_11.rst 
b/doc/guides/rel_notes/release_24_11.rst
index 0ff70d9057..fc3610deff 100644
--- a/doc/guides/rel_notes/release_24_11.rst
+++ b/doc/guides/rel_notes/release_24_11.rst
@@ -55,6 +55,11 @@ New Features
  Also, make sure to start the actual text at the margin.
  ===
 
+* **Added strict priority capability flag in dmadev.**
+
+  Added new capability flag ``RTE_DMA_CAPA_PRI_POLICY_SP`` to check if the
+  DMA device supports assigning fixed priority to its channels, allowing
+  for better control over resource allocation and scheduling.
 
 Removed Items
 -
@@ -100,6 +105,9 @@ ABI Changes
Also, make sure to start the actual text at the margin.
===
 
+* dmadev: Added ``nb_priorities`` field to ``rte_dma_info`` structure and
+  ``priority`` field to ``rte_dma_conf`` structure to get device supported
+  priority levels and configure required priority from the application.
 
 Known Issues
 
diff --git a/lib/dmadev/rte_dmadev.c b/lib/dmadev/rte_dmadev.c
index 845727210f..3d9063dee3 100644
--- a/lib/dmadev/rte_dmadev.c
+++ b/lib/dmadev/rte_dmadev.c
@@ -497,6 +497,21 @@ rte_dma_configure(int16_t dev_id, const struct 
rte_dma_conf *dev_conf)
return -EINVAL;
}
 
+   if (dev_conf->priority && !(dev_info.dev_capa & 
RTE_DMA_CAPA_PRI_POLICY_SP)) {
+   RTE_DMA_LOG(ERR, "Device %d don't support prioritization", 
dev_id);
+   return -EINVAL;
+   }
+
+   if (dev_info.nb_priorities == 1) {
+   RTE_DMA_LOG(ERR, "Device %d must support more than 1 priority, 
or else 0", dev_id);
+   return -EINVAL;
+   }
+
+   if (dev_info.nb_priorities && (dev_conf->priority >= 
dev_info.nb_priorities)) {
+   RTE_DMA_LOG(ERR, "Device %d configure invalid priority", 
dev_id);
+   return -EINVAL;
+   }
+
if (*dev->dev_ops->dev_configure == NULL)
return -ENOTSUP;
ret = (*dev->dev_ops->dev_configure)(dev, dev_conf,
diff --git a/lib/dmadev/rte_dmadev.h b/lib/dmadev/rte_dmadev.h
index 5474a5281d..e5f730c327 100644
--- a/lib/dmadev/rte_dmadev.h
+++ b/lib/dmadev/rte_dmadev.h
@@ -268,6 +268,16 @@ int16_t rte_dma_next_dev(int16_t start_dev_id);
 #define RTE_DMA_CAPA_OPS_COPY_SG   RTE_BIT64(33)
 /** Support fill operation. */
 #define RTE_DMA_CAPA_OPS_FILL  RTE_BIT64(34)
+/** Support strict prioritization at DMA HW channel level
+ *
+ * If device supports HW channel prioritization then application could
+ * assign fixed priority to the DMA HW channel using 'priority' field in
+ * struct rte_dma_conf. Number of supported priority levels will be known
+ * from 'nb_priorities' field in struct rte_dma_info.
+ *
+ * DMA devices which support prioritization can advertise this capability.
+ */
+#define RTE_DMA_CAPA_PRI_POLICY_SP RTE_BIT64(35)
 /**@}*/
 
 /**
@@ -297,6 +307,10 @@ struct rte_dma_info {
int16_t numa_node;
/** Number of virtual DMA channel configured. */
uint16_t nb_vchans;
+   /** Number of priority levels (must be > 1), if supported by DMA HW 
channel.
+* 0 otherwise.
+*/
+   uint16_t nb_priorities;
 };
 
 /**
@@ -332,6 +346,13 @@ struct rte_dma_conf {
 * @see RTE_DMA_CAPA_SILENT
 */
bool enable_silent;
+   /* The priority of the DMA HW channel.
+* This value cannot be greater than or equal to the field 
'nb_priorities'
+* of struct rte_dma_info which get from rte_dma_info_get().
+* Among the values between '0' and 'nb_priorities - 1', lowest value
+* indicates higher priority and vice-versa.
+*/
+   uint16_t priority;
 };
 
 /**
diff --git a/lib/dmadev/rte_dmadev_trace.h

[PATCH] eal/alarm_cancel: Fix thread starvation

2024-09-18 Thread Wojciech Panfil
Issue:
Two threads:

- A, executing rte_eal_alarm_cancel,
- B, executing eal_alarm_callback.

Such a case can cause starvation of thread B. Note that there is only a
small time window between the lock and unlock in thread A, so thread B must
be switched to within that very small window in order to obtain
the lock.

The solution to this problem is to use sched_yield(), which puts the current
thread (A) at the end of the thread execution priority queue and allows
thread B to execute.

The issue can be observed e.g. on the hot-pluggable device detach path.
On such a path, rte_alarm can be used to check whether DPDK has completed
the detachment. While waiting for completion, rte_eal_alarm_cancel
is called, while another thread periodically calls eal_alarm_callback,
causing the issue to occur.

Signed-off-by: Wojciech Panfil 
---
 lib/eal/freebsd/eal_alarm.c | 6 ++
 lib/eal/linux/eal_alarm.c   | 6 ++
 lib/eal/windows/eal_alarm.c | 5 +
 3 files changed, 17 insertions(+)

diff --git a/lib/eal/freebsd/eal_alarm.c b/lib/eal/freebsd/eal_alarm.c
index 94cae5f4b6..3680f5caba 100644
--- a/lib/eal/freebsd/eal_alarm.c
+++ b/lib/eal/freebsd/eal_alarm.c
@@ -318,7 +318,13 @@ rte_eal_alarm_cancel(rte_eal_alarm_callback cb_fn, void 
*cb_arg)
}
ap_prev = ap;
}
+
rte_spinlock_unlock(&alarm_list_lk);
+
+   /* Yield control to a second thread executing 
eal_alarm_callback to avoid
+* its starvation, as it is waiting for the lock we have just 
released.
+*/
+   sched_yield();
} while (executing != 0);
 
if (count == 0 && err == 0)
diff --git a/lib/eal/linux/eal_alarm.c b/lib/eal/linux/eal_alarm.c
index eeb096213b..9fe14ade63 100644
--- a/lib/eal/linux/eal_alarm.c
+++ b/lib/eal/linux/eal_alarm.c
@@ -248,7 +248,13 @@ rte_eal_alarm_cancel(rte_eal_alarm_callback cb_fn, void 
*cb_arg)
}
ap_prev = ap;
}
+
rte_spinlock_unlock(&alarm_list_lk);
+
+   /* Yield control to a second thread executing 
eal_alarm_callback to avoid
+* its starvation, as it is waiting for the lock we have just 
released.
+*/
+   sched_yield();
} while (executing != 0);
 
if (count == 0 && err == 0)
diff --git a/lib/eal/windows/eal_alarm.c b/lib/eal/windows/eal_alarm.c
index 052af4b21b..9ad530dd31 100644
--- a/lib/eal/windows/eal_alarm.c
+++ b/lib/eal/windows/eal_alarm.c
@@ -211,6 +211,11 @@ rte_eal_alarm_cancel(rte_eal_alarm_callback cb_fn, void 
*cb_arg)
}
 
rte_spinlock_unlock(&alarm_lock);
+
+   /* Yield control to a second thread executing 
eal_alarm_callback to avoid
+* its starvation, as it is waiting for the lock we have just 
released.
+*/
+   SwitchToThread();
} while (executing);
 
rte_eal_trace_alarm_cancel(cb_fn, cb_arg, removed);
-- 
2.46.0



Re: [PATCH] eal: add build-time option to omit trace

2024-09-18 Thread Jerin Jacob
On Wed, Sep 18, 2024 at 2:39 PM Morten Brørup  
wrote:
>
> Some applications want to omit the trace feature.
> Either to reduce the memory footprint, to reduce the exposed attack
> surface, or for other reasons.
>
> This patch adds an option in rte_config.h to include or omit trace in the
> build. Trace is included by default.
>
> Omitting trace works by omitting all trace points.
> For API and ABI compatibility, the trace feature itself remains.
>
> Signed-off-by: Morten Brørup 
> ---
>  app/test/test_trace.c  | 11 ++-
>  config/rte_config.h|  1 +
>  lib/eal/include/rte_trace_point.h  |  6 +-
>  lib/eal/include/rte_trace_point_register.h |  8 
>  4 files changed, 24 insertions(+), 2 deletions(-)
>
> diff --git a/app/test/test_trace.c b/app/test/test_trace.c
> index 00809f433b..7918cc865d 100644
> --- a/app/test/test_trace.c
> +++ b/app/test/test_trace.c
> @@ -12,7 +12,16 @@
>
>  int app_dpdk_test_tp_count;
>
> -#ifdef RTE_EXEC_ENV_WINDOWS
> +#if !defined(RTE_TRACE)
> +
> +static int
> +test_trace(void)
> +{
> +   printf("trace omitted at build-time, skipping test\n");
> +   return TEST_SKIPPED;
> +}
> +
> +#elif defined(RTE_EXEC_ENV_WINDOWS)
>
>  static int
>  test_trace(void)
> diff --git a/config/rte_config.h b/config/rte_config.h
> index dd7bb0d35b..fd6f8a2f1a 100644
> --- a/config/rte_config.h
> +++ b/config/rte_config.h
> @@ -49,6 +49,7 @@
>  #define RTE_MAX_TAILQ 32
>  #define RTE_LOG_DP_LEVEL RTE_LOG_INFO
>  #define RTE_MAX_VFIO_CONTAINERS 64
> +#define RTE_TRACE 1
>
>  /* bsd module defines */
>  #define RTE_CONTIGMEM_MAX_NUM_BUFS 64
> diff --git a/lib/eal/include/rte_trace_point.h 
> b/lib/eal/include/rte_trace_point.h
> index 41e2a7f99e..1b60bba043 100644
> --- a/lib/eal/include/rte_trace_point.h
> +++ b/lib/eal/include/rte_trace_point.h
> @@ -212,6 +212,7 @@ bool rte_trace_point_is_enabled(rte_trace_point_t *tp);
>  __rte_experimental
>  rte_trace_point_t *rte_trace_point_lookup(const char *name);
>
> +#ifdef RTE_TRACE
>  /**
>   * @internal
>   *
> @@ -230,6 +231,7 @@ __rte_trace_point_fp_is_enabled(void)
> return false;
>  #endif
>  }
> +#endif /* RTE_TRACE */
>
>  /**
>   * @internal
> @@ -356,6 +358,8 @@ __rte_trace_point_emit_ev_header(void *mem, uint64_t in)
> return RTE_PTR_ADD(mem, __RTE_TRACE_EVENT_HEADER_SZ);
>  }
>
> +#ifdef RTE_TRACE


Please change to 1.4.5 style _if possible_ in
https://doc.dpdk.org/guides/contributing/coding_style.html.
Assuming linker will remove the memory from the image if it is not
using by stubbing out.

Untested.

#define __rte_trace_point_emit_header_generic(t) \
void *mem; \
do { \
  +  if (RTE_TRACE_ENABLED == 0) \
  + return \
const uint64_t val = rte_atomic_load_explicit(t,
rte_memory_order_acquire); \
if (likely(!(val & __RTE_TRACE_FIELD_ENABLE_MASK))) \
return; \
mem = __rte_trace_mem_get(val); \
if (unlikely(mem == NULL)) \
return; \
mem = __rte_trace_point_emit_ev_header(mem, val); \
} while (0)



> +
>  #define __rte_trace_point_emit_header_generic(t) \
>  void *mem; \
>  do { \
> @@ -411,7 +415,7 @@ do { \
> RTE_SET_USED(len); \
>  } while (0)
>
> -
> +#endif /* RTE_TRACE */
>  #endif /* ALLOW_EXPERIMENTAL_API */
>  #endif /* _RTE_TRACE_POINT_REGISTER_H_ */
>
> diff --git a/lib/eal/include/rte_trace_point_register.h 
> b/lib/eal/include/rte_trace_point_register.h
> index 41260e5964..78c0ede5f1 100644
> --- a/lib/eal/include/rte_trace_point_register.h
> +++ b/lib/eal/include/rte_trace_point_register.h
> @@ -18,6 +18,8 @@ extern "C" {
>
>  RTE_DECLARE_PER_LCORE(volatile int, trace_point_sz);
>
> +#ifdef RTE_TRACE
> +
>  #define RTE_TRACE_POINT_REGISTER(trace, name) \
>  rte_trace_point_t __rte_section("__rte_trace_point") __##trace; \
>  static const char __##trace##_name[] = RTE_STR(name); \
> @@ -27,6 +29,12 @@ RTE_INIT(trace##_init) \
> (void (*)(void)) trace); \
>  }
>
> +#else
> +
> +#define RTE_TRACE_POINT_REGISTER(trace, name)
> +
> +#endif /* RTE_TRACE */
> +
>  #define __rte_trace_point_emit_header_generic(t) \
> RTE_PER_LCORE(trace_point_sz) = __RTE_TRACE_EVENT_HEADER_SZ
>
> --
> 2.43.0
>


Re: [PATCH v2] dts: fix runner target in the Dockerfile

2024-09-18 Thread Juraj Linkeš




diff --git a/dts/Dockerfile b/dts/Dockerfile



@@ -24,9 +27,12 @@ FROM base AS runner



+# Adds ~/.local/bin to PATH so that packages installed with pipx are callable. 
`pipx ensurepath`
+# fixes this issue, but requires the shell to be re-opened which isn't an 
option for this target.


Let's explain this a bit more, I don't really know why this isn't an option.


+ENV PATH="$PATH:/root/.local/bin"
+RUN poetry install --only main --no-root
  
-CMD ["poetry", "run", "python", "main.py"]

+ENTRYPOINT ["poetry", "run", "python", "main.py"]
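
One possible expansion of that comment (my reading, to be confirmed by the author): `pipx ensurepath` only appends to shell startup files such as ~/.bashrc, and every later RUN step -- as well as the ENTRYPOINT itself -- runs in a fresh non-login, non-interactive shell that never sources those files, so the PATH edit would never take effect in this image. Setting PATH via ENV applies to all subsequent steps:

```dockerfile
# Sketch of the runner tail with the reasoning spelled out:
# `pipx ensurepath` edits ~/.bashrc, but RUN steps and the ENTRYPOINT use
# fresh shells that never source it, so set PATH directly instead.
ENV PATH="$PATH:/root/.local/bin"
RUN poetry install --only main --no-root

ENTRYPOINT ["poetry", "run", "python", "main.py"]
```

If that is the intended rationale, a sentence along these lines in the Dockerfile comment would answer the question.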
  


Re: [PATCH v3 1/1] dts: add text parser for testpmd verbose output

2024-09-18 Thread Juraj Linkeš




On 17. 9. 2024 15:40, Jeremy Spewock wrote:

On Mon, Sep 9, 2024 at 7:44 AM Juraj Linkeš  wrote:




diff --git a/dts/framework/parser.py b/dts/framework/parser.py
index 741dfff821..0b39025a48 100644
--- a/dts/framework/parser.py
+++ b/dts/framework/parser.py
@@ -160,6 +160,36 @@ def _find(text: str) -> Any:

   return ParserFn(TextParser_fn=_find)

+@staticmethod
+def find_all(
+pattern: str | re.Pattern[str],
+flags: re.RegexFlag = re.RegexFlag(0),
+) -> ParserFn:


I'd remove this if it's not used, the rule being let's not introduce
unused code because it's not going to be maintained. We can always add
it when needed.


Since submitting this I did actually find one use for it in the Rx/Tx
suite, but that one has yet to undergo review, so it could be the case
that people don't like that implementation.



Ok, we can (and probably should) add it in that test suite patchset if 
needed. The required infrastructure is in this patchset and additions to 
it can be added in individual test suites (if only that one uses that 
addition).



diff --git a/dts/framework/utils.py b/dts/framework/utils.py
index 6b5d5a805f..9c64cf497f 100644
--- a/dts/framework/utils.py
+++ b/dts/framework/utils.py
@@ -27,6 +27,7 @@
   from .exception import ConfigurationError

REGEX_FOR_PCI_ADDRESS: str = "/[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}.[0-9]{1}/"
+REGEX_FOR_MAC_ADDRESS: str = r"(?:[\da-fA-F]{2}:){5}[\da-fA-F]{2}"


Is this the only format that testpmd returns?


I believe so, but because I'm not completely sure I can change this
regex to support other variants as well. The hyphen separated one is
easy enough that it might as well be included, the group of 4
separated by a dot might be a little more involved but I can probably
get it to work.



Ok, might as well be safe, doesn't sound like a big change.

A small point is to make the regex as readable as possible - we could 
split it into multiple regexes (and try to match multiple times) or put 
the one big regex string on multiple lines if possible (with each line 
being a separate variant or multiple closely related variants).


[PATCH v6 6/7] service: keep per-lcore state in lcore variable

2024-09-18 Thread Mattias Rönnblom
Replace static array of cache-aligned structs with an lcore variable,
to slightly benefit code simplicity and performance.

Signed-off-by: Mattias Rönnblom 
Acked-by: Morten Brørup 
Acked-by: Konstantin Ananyev 
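
As a loose analogy (purely illustrative, not part of the patch): lcore variables resemble thread-local storage, so the change is similar to moving from a global list indexed by worker id to per-thread state where each thread only touches its own slot:

```python
import threading

# Before: one global array indexed by worker ("lcore") id.
core_states = [{"calls": 0} for _ in range(8)]

# After: per-thread state; each thread lazily gets its own private copy,
# analogous to RTE_LCORE_VAR_VALUE() returning this lcore's slot.
_local = threading.local()


def get_core_state() -> dict:
    if not hasattr(_local, "state"):
        _local.state = {"calls": 0}
    return _local.state


def service_run() -> int:
    cs = get_core_state()
    cs["calls"] += 1
    return cs["calls"]


print(service_run())  # 1 (first call in this thread)
```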

--

RFC v6:
 * Remove a now-redundant lcore variable value memset().

RFC v5:
 * Fix lcore value pointer bug introduced by RFC v4.

RFC v4:
 * Remove strange-looking lcore value lookup potentially containing
   invalid lcore id. (Morten Brørup)
 * Replace misplaced tab with space. (Morten Brørup)
---
 lib/eal/common/rte_service.c | 115 +++
 1 file changed, 63 insertions(+), 52 deletions(-)

diff --git a/lib/eal/common/rte_service.c b/lib/eal/common/rte_service.c
index 56379930b6..03379f1588 100644
--- a/lib/eal/common/rte_service.c
+++ b/lib/eal/common/rte_service.c
@@ -11,6 +11,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -75,7 +76,7 @@ struct __rte_cache_aligned core_state {
 
 static uint32_t rte_service_count;
 static struct rte_service_spec_impl *rte_services;
-static struct core_state *lcore_states;
+static RTE_LCORE_VAR_HANDLE(struct core_state, lcore_states);
 static uint32_t rte_service_library_initialized;
 
 int32_t
@@ -101,12 +102,8 @@ rte_service_init(void)
goto fail_mem;
}
 
-   lcore_states = rte_calloc("rte_service_core_states", RTE_MAX_LCORE,
-   sizeof(struct core_state), RTE_CACHE_LINE_SIZE);
-   if (!lcore_states) {
-   EAL_LOG(ERR, "error allocating core states array");
-   goto fail_mem;
-   }
+   if (lcore_states == NULL)
+   RTE_LCORE_VAR_ALLOC(lcore_states);
 
int i;
struct rte_config *cfg = rte_eal_get_configuration();
@@ -122,7 +119,6 @@ rte_service_init(void)
return 0;
 fail_mem:
rte_free(rte_services);
-   rte_free(lcore_states);
return -ENOMEM;
 }
 
@@ -136,7 +132,6 @@ rte_service_finalize(void)
rte_eal_mp_wait_lcore();
 
rte_free(rte_services);
-   rte_free(lcore_states);
 
rte_service_library_initialized = 0;
 }
@@ -286,7 +281,6 @@ rte_service_component_register(const struct 
rte_service_spec *spec,
 int32_t
 rte_service_component_unregister(uint32_t id)
 {
-   uint32_t i;
struct rte_service_spec_impl *s;
SERVICE_VALID_GET_OR_ERR_RET(id, s, -EINVAL);
 
@@ -294,9 +288,10 @@ rte_service_component_unregister(uint32_t id)
 
s->internal_flags &= ~(SERVICE_F_REGISTERED);
 
+   struct core_state *cs;
/* clear the run-bit in all cores */
-   for (i = 0; i < RTE_MAX_LCORE; i++)
-   lcore_states[i].service_mask &= ~(UINT64_C(1) << id);
+   RTE_LCORE_VAR_FOREACH_VALUE(cs, lcore_states)
+   cs->service_mask &= ~(UINT64_C(1) << id);
 
memset(&rte_services[id], 0, sizeof(struct rte_service_spec_impl));
 
@@ -454,7 +449,10 @@ rte_service_may_be_active(uint32_t id)
return -EINVAL;
 
for (i = 0; i < lcore_count; i++) {
-   if (lcore_states[ids[i]].service_active_on_lcore[id])
+   struct core_state *cs =
+   RTE_LCORE_VAR_LCORE_VALUE(ids[i], lcore_states);
+
+   if (cs->service_active_on_lcore[id])
return 1;
}
 
@@ -464,7 +462,7 @@ rte_service_may_be_active(uint32_t id)
 int32_t
 rte_service_run_iter_on_app_lcore(uint32_t id, uint32_t serialize_mt_unsafe)
 {
-   struct core_state *cs = &lcore_states[rte_lcore_id()];
+   struct core_state *cs = RTE_LCORE_VAR_VALUE(lcore_states);
struct rte_service_spec_impl *s;
 
SERVICE_VALID_GET_OR_ERR_RET(id, s, -EINVAL);
@@ -486,8 +484,7 @@ service_runner_func(void *arg)
 {
RTE_SET_USED(arg);
uint8_t i;
-   const int lcore = rte_lcore_id();
-   struct core_state *cs = &lcore_states[lcore];
+   struct core_state *cs = RTE_LCORE_VAR_VALUE(lcore_states);
 
rte_atomic_store_explicit(&cs->thread_active, 1, 
rte_memory_order_seq_cst);
 
@@ -533,13 +530,15 @@ service_runner_func(void *arg)
 int32_t
 rte_service_lcore_may_be_active(uint32_t lcore)
 {
-   if (lcore >= RTE_MAX_LCORE || !lcore_states[lcore].is_service_core)
+   struct core_state *cs = RTE_LCORE_VAR_LCORE_VALUE(lcore, lcore_states);
+
+   if (lcore >= RTE_MAX_LCORE || !cs->is_service_core)
return -EINVAL;
 
/* Load thread_active using ACQUIRE to avoid instructions dependent on
 * the result being re-ordered before this load completes.
 */
-   return rte_atomic_load_explicit(&lcore_states[lcore].thread_active,
+   return rte_atomic_load_explicit(&cs->thread_active,
   rte_memory_order_acquire);
 }
 
@@ -547,9 +546,11 @@ int32_t
 rte_service_lcore_count(void)
 {
int32_t count = 0;
-   uint32_t i;
-   for (i = 0; i < RTE_MAX_LCORE; i++)
-   count += lcore_states[i].is_service_core;
+
+   struct core_state *cs;

  1   2   >