RE: [PATCH v6 2/2] Add l2reflect measurement application

2022-09-25 Thread Moessbauer, Felix
> -Original Message-
> From: Maxime Coquelin 
> Sent: Tuesday, September 20, 2022 10:02 PM
> To: Moessbauer, Felix (T CED SES-DE) ;
> dev@dpdk.org
> Cc: Schild, Henning (T CED SES-DE) ; Kiszka, Jan
> (T CED) ; tho...@monjalon.net; Marcelo Tosatti
> 
> Subject: Re: [PATCH v6 2/2] Add l2reflect measurement application
> 
> Hi Felix,
> 
> First, I support the idea of having the l2reflect application part of
> the DPDK repository.

Hi Maxime,

Many thanks for the review.
I'm currently travelling but plan to address all responses regarding l2reflect 
in the upcoming week.

Best regards,
Felix

> 
> Please note CI failed to build it on different platforms:
> http://mails.dpdk.org/archives/test-report/2022-September/304617.html
> 
> It also fails to build on my Fc35 machine:
> [3237/3537] Compiling C object app/dpdk-l2reflect.p/l2reflect_main.c.o
> ../app/l2reflect/main.c: In function 'l2reflect_main_loop':
> ../app/l2reflect/main.c:560:19: warning: array subscript 'uint64_t {aka
> long unsigned int}[0]' is partly outside array bounds of 'struct
> rte_ether_addr[1]' [-Warray-bounds]
>560 | i_win = ((*((uint64_t *)&l2reflect_port_eth_addr)   &
> MAC_ADDR_CMP) >
>|   ^~~
> ../app/l2reflect/main.c:110:23: note: while referencing
> 'l2reflect_port_eth_addr'
>110 | struct rte_ether_addr l2reflect_port_eth_addr;
>|   ^~~
> ../app/l2reflect/main.c:561:27: warning: array subscript 'uint64_t {aka
> long unsigned int}[0]' is partly outside array bounds of 'struct
> rte_ether_addr[1]' [-Warray-bounds]
>561 |  (*((uint64_t
> *)&l2reflect_remote_eth_addr) & MAC_ADDR_CMP));
>|   ^
> ../app/l2reflect/main.c:111:23: note: while referencing
> 'l2reflect_remote_eth_addr'
>111 | struct rte_ether_addr l2reflect_remote_eth_addr;
>|   ^
> 
> Some more comments inline:
> 
> On 9/2/22 10:45, Felix Moessbauer wrote:
> > The l2reflect application implements a ping-pong benchmark to
> > measure the latency between two instances. For communication,
> > we use raw ethernet and send one packet at a time. The timing data
> > is collected locally and min/max/avg values are displayed in a TUI.
> > Finally, a histogram of the latencies is printed which can be
> > further processed with the jitterdebugger visualization scripts.
> > To debug latency spikes, a max threshold can be defined.
> > If it is hit, a trace point is created on both instances.
> >
> > Signed-off-by: Felix Moessbauer 
> > Signed-off-by: Henning Schild 
> > ---
> >   app/l2reflect/colors.c|   34 ++
> >   app/l2reflect/colors.h|   19 +
> >   app/l2reflect/l2reflect.h |   53 ++
> >   app/l2reflect/main.c  | 1007 +
> >   app/l2reflect/meson.build |   21 +
> >   app/l2reflect/payload.h   |   26 +
> >   app/l2reflect/stats.c |  225 +
> >   app/l2reflect/stats.h |   67 +++
> >   app/l2reflect/utils.c |   67 +++
> >   app/l2reflect/utils.h |   20 +
> >   app/meson.build   |1 +
> >   11 files changed, 1540 insertions(+)
> >   create mode 100644 app/l2reflect/colors.c
> >   create mode 100644 app/l2reflect/colors.h
> >   create mode 100644 app/l2reflect/l2reflect.h
> >   create mode 100644 app/l2reflect/main.c
> >   create mode 100644 app/l2reflect/meson.build
> >   create mode 100644 app/l2reflect/payload.h
> >   create mode 100644 app/l2reflect/stats.c
> >   create mode 100644 app/l2reflect/stats.h
> >   create mode 100644 app/l2reflect/utils.c
> >   create mode 100644 app/l2reflect/utils.h
> 
> If we agree to have this application in app/ directory,
> I think you'll have to add documentation for this new tool in
> doc/guides/tools/.
> 
> > diff --git a/app/l2reflect/colors.c b/app/l2reflect/colors.c
> > new file mode 100644
> > index 00..af881d8788
> > --- /dev/null
> > +++ b/app/l2reflect/colors.c
> > @@ -0,0 +1,34 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright(c) 2020 Siemens AG
> > + */
> > +
> > +#include "colors.h"
> > +
> > +const struct color_palette *colors;
> > +
> > +static const struct color_palette color_palette_default = {
> > +   .red = "\x1b[01;31m",
> > +   .green = "\x1b[01;32m",
> > +   .yellow = "\x1b[01;33m",
> > +   .blue = "\x1b[01;34m",
> > +   .magenta = "\x1b[01;35m",
> > +   .cyan = "\x1b[01;36m",
> > +   .reset = "\x1b[0m"
> > +};
> > +
> > +static const struct color_palette 

mbuf pool pointer optimization is obsolete

2022-09-25 Thread Morten Brørup
Dear Olivier, Thomas and DPDK PMD developers:

In November 2020, the pool pointer in the mbuf structure was moved to the first 
cache line [1].

[1]: 
http://git.dpdk.org/dpdk/commit/lib/librte_mbuf/rte_mbuf_core.h?id=4630290af46ed44a58515067b6d2add9c044252a

That patch also made rte_mbuf_to_baddr() obsolete, but we didn't notice back 
then. I think this function should be deprecated and marked for removal. And 
its note about the pool pointer being in the 2nd cache line should be removed.


Furthermore, some PMDs still seem to use an alternative pool pointer (e.g. in 
the port queue structure) instead of the pool pointer in the mbuf; possibly 
inspired by the optimization advice given in the documentation for 
rte_mbuf_to_baddr(): "@note: Accessing mempool pointer of a mbuf is expensive 
because the pointer is stored in the 2nd cache line of mbuf. If mempool is 
known, it is better not to reference the mempool pointer in mbuf but calling 
rte_mbuf_buf_addr() would be more efficient."

Since this precondition (i.e. the pool pointer being located in the mbuf's 2nd 
cache line) is no longer true, PMD developers should consider whether their 
design (using a mempool pointer from elsewhere) is still optimal, or whether 
using the pool pointer in the mbuf structure would be better.


Med venlig hilsen / Kind regards,
-Morten Brørup



[PATCH] examples/pipeline: fix build with some compilers

2022-09-25 Thread Ali Alnubani
Fixes the following build failure with gcc 5.4.0 because
of uninitialized variables:

[..]
examples/pipeline/cli.c:1801:10: error: 'idx' may be used
  uninitialized in this function [-Werror=maybe-uninitialized]
[..]
examples/pipeline/cli.c:1916:10: error: 'idx' may be used
  uninitialized in this function [-Werror=maybe-uninitialized]
[..]

Fixes: 83f58a7b7b0a ("examples/pipeline: add commands for direct registers")
Cc: cristian.dumitre...@intel.com

Signed-off-by: Ali Alnubani 
---
 examples/pipeline/cli.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/examples/pipeline/cli.c b/examples/pipeline/cli.c
index 35beb1139f..8013541c4b 100644
--- a/examples/pipeline/cli.c
+++ b/examples/pipeline/cli.c
@@ -1786,7 +1786,7 @@ cmd_pipeline_regrd(char **tokens,
 
/* index. */
if (!strcmp(tokens[4], "index")) {
-   uint32_t idx;
+   uint32_t idx = 0;
 
if (n_tokens != 6) {
snprintf(out, out_size, MSG_ARG_MISMATCH, tokens[0]);
@@ -1901,7 +1901,7 @@ cmd_pipeline_regwr(char **tokens,
 
/* index. */
if (!strcmp(tokens[6], "index")) {
-   uint32_t idx;
+   uint32_t idx = 0;
 
if (n_tokens != 8) {
snprintf(out, out_size, MSG_ARG_MISMATCH, tokens[0]);
-- 
2.25.1



Re: [PATCH v2 1/3] net/bonding: support Tx prepare

2022-09-25 Thread Chas Williams




On 9/21/22 22:12, fengchengwen wrote:



On 2022/9/20 7:02, Chas Williams wrote:



On 9/19/22 10:07, Konstantin Ananyev wrote:




On 9/16/22 22:35, fengchengwen wrote:

Hi Chas,

On 2022/9/15 0:59, Chas Williams wrote:

On 9/13/22 20:46, fengchengwen wrote:


The main problem is that it is hard to design a tx_prepare for a bonding device:
1. As Chas Williams said, there may be two hash calculations to get the target
   slave devices.
2. More importantly, if the slave devices change (e.g. a slave device link goes
   down or is removed) between bond-tx-prepare and bond-tx-burst, the output
   slave will change, and this may lead to checksum failures. (Note: a bond
   device's slave devices may come from different vendors, and the slave
   devices may have different requirements, e.g. slave-A supports calculating
   the IPv4 pseudo-header automatically (no need for the driver to
   pre-calculate), but slave-B needs the driver to pre-calculate.)

The current design covers the above two scenarios by using in-place tx-prepare.
In addition, bond devices are not transparent to applications, so I think
providing tx-prepare support in this way is a practical method.




I don't think you need to export an enable/disable routine for the use of
rte_eth_tx_prepare. It's safe to just call that routine, even if it isn't
implemented. You are just trading one branch in DPDK librte_eth_dev for a
branch in drivers/net/bonding.


Our first patch was just like yours (just adding tx-prepare by default), but the
community was concerned about the performance impact.

As a trade-off, I think we can add the enable/disable API.


IMHO, that's a bad idea. If the rte_eth_dev_tx_prepare API affects
performance adversly, that is not a bonding problem. All applications
should be calling rte_eth_dev_tx_prepare. There's no defined API
to determine if rte_eth_dev_tx_prepare should be called. Therefore,
applications should always call rte_eth_dev_tx_prepare. Regardless,
as I previously mentioned, you are just trading the location of
the branch, especially in the bonding case.

If rte_eth_dev_tx_prepare is causing a performance drop, then that API
should be improved or rewritten. There are PMD that require you to use
that API. Locally, we had maintained a patch to eliminate the use of
rte_eth_dev_tx_prepare. However, that has been getting harder and harder
to maintain. The performance lost by just calling rte_eth_dev_tx_prepare
was marginal.





I think you missed fixing tx_machine in 802.3ad support. We have been using
the following patch locally which I never got around to submitting.


You are right, I will send a v3 to fix it.




   From a458654d68ff5144266807ef136ac3dd2adfcd98 Mon Sep 17 00:00:00 2001
From: "Charles (Chas) Williams" 
Date: Tue, 3 May 2022 16:52:37 -0400
Subject: [PATCH] net/bonding: call rte_eth_tx_prepare before rte_eth_tx_burst

Some PMDs might require a call to rte_eth_tx_prepare before sending the
packets for transmission. Typically, the prepare step handles the VLAN
headers, but it may need to do other things.

Signed-off-by: Chas Williams 


...


     * ring if transmission fails so the packet isn't lost.
@@ -1322,8 +1350,12 @@ bond_ethdev_tx_burst_broadcast(void *queue, struct 
rte_mbuf **bufs,

    /* Transmit burst on each active slave */
    for (i = 0; i < num_of_slaves; i++) {
-    slave_tx_total[i] = rte_eth_tx_burst(slaves[i], bd_tx_q->queue_id,
+    uint16_t nb_prep;
+
+    nb_prep = rte_eth_tx_prepare(slaves[i], bd_tx_q->queue_id,
    bufs, nb_pkts);
+    slave_tx_total[i] = rte_eth_tx_burst(slaves[i], bd_tx_q->queue_id,
+    bufs, nb_prep);


The tx-prepare may edit packet data, and broadcast mode sends the same packet to
all slaves, so the packet data would be sent and edited at the same time. Is
this likely to cause problems?


This routine is already broken. You can't just increment the refcount
and send the packet into a PMD's transmit routine. Nothing guarantees
that a transmit routine will not modify the packet. Many PMDs perform an
rte_vlan_insert.


Hmm, interesting.
My understanding was quite the opposite - tx_burst() can't modify packet data
and metadata (except when refcnt==1 and tx_burst() is going to free the mbuf
and put it back into the mempool), while tx_prepare() can - actually, as I
remember, that was one of the reasons why a separate routine was introduced.


Is that documented anywhere? It's been my experience that the device PMD can do
practically anything and you need to protect yourself. Currently, the
af_packet, dpaa2, and vhost drivers call rte_vlan_insert. Before 2019, the
virtio driver also used to call rte_vlan_insert during its transmit path. Of
course, rte_vlan_insert modifies the packet data and the mbuf header.
Regardless, it looks like rte_eth_dev_tx_prepare should always be called.
Handling that correctly in broadcast mode probably means always making a deep
copy of the packet, or checking to see if all the members are the same PMD
type. If so, 

Re: [PATCH V3] net/bonding: add link speeds configuration

2022-09-25 Thread Chas Williams

Thanks for making the changes!

Signed-off-by: 3ch...@gmail.com

On 9/21/22 21:33, Huisong Li wrote:

This patch adds link speeds configuration.

---
  -v3: add an intersection of the supported speeds to check 'link_speeds'.
  -v2: resend due to CI compiling failure.

Signed-off-by: Huisong Li 
---
  drivers/net/bonding/eth_bond_private.h |  3 +++
  drivers/net/bonding/rte_eth_bond_api.c |  3 +++
  drivers/net/bonding/rte_eth_bond_pmd.c | 27 ++
  3 files changed, 33 insertions(+)

diff --git a/drivers/net/bonding/eth_bond_private.h 
b/drivers/net/bonding/eth_bond_private.h
index 8222e3cd38..d067ea8c9a 100644
--- a/drivers/net/bonding/eth_bond_private.h
+++ b/drivers/net/bonding/eth_bond_private.h
@@ -131,6 +131,9 @@ struct bond_dev_private {
uint32_t link_down_delay_ms;
uint32_t link_up_delay_ms;
  
+	uint32_t speed_capa;

+   /**< Supported speeds bitmap (RTE_ETH_LINK_SPEED_). */
+
uint16_t nb_rx_queues;  /**< Total number of rx queues 
*/
uint16_t nb_tx_queues;  /**< Total number of tx queues*/
  
diff --git a/drivers/net/bonding/rte_eth_bond_api.c b/drivers/net/bonding/rte_eth_bond_api.c

index 4ac191c468..e64ec0ed20 100644
--- a/drivers/net/bonding/rte_eth_bond_api.c
+++ b/drivers/net/bonding/rte_eth_bond_api.c
@@ -513,6 +513,8 @@ __eth_bond_slave_add_lock_free(uint16_t bonded_port_id, 
uint16_t slave_port_id)
internals->primary_port = slave_port_id;
internals->current_primary_port = slave_port_id;
  
+		internals->speed_capa = dev_info.speed_capa;

+
/* Inherit queues settings from first slave */
internals->nb_rx_queues = slave_eth_dev->data->nb_rx_queues;
internals->nb_tx_queues = slave_eth_dev->data->nb_tx_queues;
@@ -527,6 +529,7 @@ __eth_bond_slave_add_lock_free(uint16_t bonded_port_id, 
uint16_t slave_port_id)
} else {
int ret;
  
+		internals->speed_capa &= dev_info.speed_capa;

eth_bond_slave_inherit_dev_info_rx_next(internals, &dev_info);
eth_bond_slave_inherit_dev_info_tx_next(internals, &dev_info);
  
diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c b/drivers/net/bonding/rte_eth_bond_pmd.c

index 3191158ca7..0adbf0e1b2 100644
--- a/drivers/net/bonding/rte_eth_bond_pmd.c
+++ b/drivers/net/bonding/rte_eth_bond_pmd.c
@@ -1717,6 +1717,8 @@ slave_configure(struct rte_eth_dev *bonded_eth_dev,
  
  	slave_eth_dev->data->dev_conf.rxmode.mtu =

bonded_eth_dev->data->dev_conf.rxmode.mtu;
+   slave_eth_dev->data->dev_conf.link_speeds =
+   bonded_eth_dev->data->dev_conf.link_speeds;
  
  	slave_eth_dev->data->dev_conf.txmode.offloads |=

bonded_eth_dev->data->dev_conf.txmode.offloads;
@@ -2275,6 +2277,7 @@ bond_ethdev_info(struct rte_eth_dev *dev, struct 
rte_eth_dev_info *dev_info)
  
  	dev_info->reta_size = internals->reta_size;

dev_info->hash_key_size = internals->rss_key_len;
+   dev_info->speed_capa = internals->speed_capa;
  
  	return 0;

  }
@@ -3591,6 +3594,7 @@ bond_ethdev_configure(struct rte_eth_dev *dev)
uint64_t offloads;
int arg_count;
uint16_t port_id = dev - rte_eth_devices;
+   uint32_t link_speeds;
uint8_t agg_mode;
  
  	static const uint8_t default_rss_key[40] = {

@@ -3659,6 +3663,29 @@ bond_ethdev_configure(struct rte_eth_dev *dev)
dev->data->dev_conf.txmode.offloads = offloads;
}
  
+	link_speeds = dev->data->dev_conf.link_speeds;

+   /*
+* The default value of 'link_speeds' is zero. From its definition,
+* this value actually means auto-negotiation. But not all PMDs support
+* auto-negotiation. So ignore the check for the auto-negotiation and
+* only consider fixed speed to reduce the impact on PMDs.
+*/
+   if (link_speeds & RTE_ETH_LINK_SPEED_FIXED) {
+   if ((link_speeds &
+   (internals->speed_capa & ~RTE_ETH_LINK_SPEED_FIXED)) == 0) {
+   RTE_BOND_LOG(ERR, "the fixed speed is not supported by all 
slave devices.");
+   return -EINVAL;
+   }
+   /*
+* Two '1' in binary of 'link_speeds': bit0 and a unique
+* speed bit.
+*/
+   if (__builtin_popcountl(link_speeds) != 2) {
+   RTE_BOND_LOG(ERR, "please set a unique speed.");
+   return -EINVAL;
+   }
+   }
+
/* set the max_rx_pktlen */
internals->max_rx_pktlen = internals->candidate_max_rx_pktlen;
  


Re: [PATCH] net/bonding: fix error in bonding mode 4 with dedicated queues enabled

2022-09-25 Thread Chas Williams

It's probably cleaner to just move the bond_ethdev_8023ad_flow_set
until after the device start. For the reader, they don't need to
understand why you might not have started the device earlier.

On 9/24/22 10:19, Usman Tanveer wrote:

When dedicated queues are enabled with bonding mode 4 (mlx5), the application
sets the flow, which cannot be set if the device is not started. This fixes the
issue by starting the device just before setting the flow, because the device
must be started to set the flow. It also does not affect other drivers (I have
tried it on ixgbe).

Bugzilla ID: 759

Signed-off-by: Usman Tanveer 
---
  drivers/net/bonding/rte_eth_bond_pmd.c | 19 ++-
  1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c 
b/drivers/net/bonding/rte_eth_bond_pmd.c
index 73e6972035..2dfb613ea6 100644
--- a/drivers/net/bonding/rte_eth_bond_pmd.c
+++ b/drivers/net/bonding/rte_eth_bond_pmd.c
@@ -1829,6 +1829,13 @@ slave_start(struct rte_eth_dev *bonded_eth_dev,
slave_eth_dev->data->port_id, errval);
}
  
+		errval = rte_eth_dev_start(slave_eth_dev->data->port_id);

+   if (errval != 0) {
+   RTE_BOND_LOG(ERR, "rte_eth_dev_start: port=%u, err 
(%d)",
+   slave_eth_dev->data->port_id, errval);
+   return -1;
+   }
+
errval = bond_ethdev_8023ad_flow_set(bonded_eth_dev,
slave_eth_dev->data->port_id);
if (errval != 0) {
@@ -1840,11 +1847,13 @@ slave_start(struct rte_eth_dev *bonded_eth_dev,
}
  
  	/* Start device */

-   errval = rte_eth_dev_start(slave_eth_dev->data->port_id);
-   if (errval != 0) {
-   RTE_BOND_LOG(ERR, "rte_eth_dev_start: port=%u, err (%d)",
-   slave_eth_dev->data->port_id, errval);
-   return -1;
+   if (!slave_eth_dev->data->dev_started) {
+   errval = rte_eth_dev_start(slave_eth_dev->data->port_id);
+   if (errval != 0) {
+   RTE_BOND_LOG(ERR, "rte_eth_dev_start: port=%u, err 
(%d)",
+   slave_eth_dev->data->port_id, errval);
+   return -1;
+   }
}
  
  	/* If RSS is enabled for bonding, synchronize RETA */


Re: [PATCH] examples/pipeline: fix build with some compilers

2022-09-25 Thread Thomas Monjalon
25/09/2022 11:20, Ali Alnubani:
> Fixes the following build failure with gcc 5.4.0 because
> of uninitialized variables:
> 
> [..]
> examples/pipeline/cli.c:1801:10: error: 'idx' may be used
>   uninitialized in this function [-Werror=maybe-uninitialized]
> [..]
> examples/pipeline/cli.c:1916:10: error: 'idx' may be used
>   uninitialized in this function [-Werror=maybe-uninitialized]
> [..]
> 
> Fixes: 83f58a7b7b0a ("examples/pipeline: add commands for direct registers")
> Cc: cristian.dumitre...@intel.com
> 
> Signed-off-by: Ali Alnubani 

Applied, thanks.





RE: DPDK with i225/226 on elkhartlake

2022-09-25 Thread SamChen 陳嘉良
Hi Qiming,
Thanks for your information. 
In the i226V log, you can see that the string "Requested device 
:03:00.0 cannot be used" appears while initializing 03:00.0. 
Execution then stops at "EAL: Probe PCI driver: net_igc (8086:125c) 
device: :06:00.0 (socket 0)" with no subsequent action.

If the device could be used normally, the complete output should look like the 
attached i211_log, right?

Sam Chen

-Original Message-
From: Yang, Qiming  
Sent: Friday, September 23, 2022 3:51 PM
To: SamChen 陳嘉良 ; Richardson, Bruce 

Cc: dev@dpdk.org
Subject: RE: DPDK with i225/226 on elkhartlake

Hi, Sam
We don't support the I225-IT in the 22.07 release; you can update your code to 
the latest next-net-intel branch, which contains the fix.
But for the i226V's log, I don't see any issue; it seems 06:00.0 initialized 
successfully. Could you clarify?

Qiming 
> -Original Message-
> From: SamChen 陳嘉良 
> Sent: Thursday, September 22, 2022 10:57 AM
> To: Richardson, Bruce 
> Cc: dev@dpdk.org
> Subject: RE: DPDK with i225/226 on elkhartlake
> 
> Hi all,
> I have switched to using vfio-pci, but still can't use DPDK. Attached 
> are the logs I can't use in I225-IT and I226V for your reference.
> If I have obtained other platforms with intel 225/226 systems, I will 
> check again, thanks!
> 
> Sam Chen
> 
> -Original Message-
> From: Bruce Richardson 
> Sent: Wednesday, September 21, 2022 4:57 PM
> To: SamChen 陳嘉良 
> Cc: dev@dpdk.org
> Subject: Re: DPDK with i225/226 on elkhartlake
> 
> On Wed, Sep 21, 2022 at 03:23:09AM +, SamChen 陳嘉良 wrote:
> >Hi,
> >We bind the uio_pci_generic driver and use the command of the attached
> >image to execute DPDK. However, we encountered the problem of device
> >cannot be used on i225V/i226V in elkhartlake platform. Currently, the
> >same method works fine for i211 and intel® Killer™ Ethernet E3100 2.5
> >Gbps. Do you have any suggestions? Thanks!
> >
> Hi,
> 
> while I can't comment on the NIC-specific issues, I would just comment 
> that it's really not recommended to use uio_pci_generic - or any other 
> uio driver - for DPDK any more. You should consider switching to using 
> vfio-pci. Even if there is no iommu enabled, vfio-pci still tends to 
> give a better user experience. If iommu is enabled, you also get a lot 
> more security from the memory protection it provides.
> 
> Regards,
> /Bruce


i211_log
Description: i211_log


i226V_log
Description: i226V_log


[PATCH] dumpcap: fix list interfaces

2022-09-25 Thread Stephen Hemminger
The change to do argument processing before EAL init broke support
for the list-interfaces option. Fix by setting a flag and doing
list-interfaces later.

Fixes: a8dde09f97df ("app/dumpcap: allow help/version without primary process")
Cc: konce...@gmail.com
Signed-off-by: Stephen Hemminger 
---
 app/dumpcap/main.c | 19 +--
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/app/dumpcap/main.c b/app/dumpcap/main.c
index a6041d4ff495..490a0f050bc8 100644
--- a/app/dumpcap/main.c
+++ b/app/dumpcap/main.c
@@ -63,6 +63,8 @@ static unsigned int ring_size = 2048;
 static const char *capture_comment;
 static uint32_t snaplen = RTE_MBUF_DEFAULT_BUF_SIZE;
 static bool dump_bpf;
+static bool show_interfaces;
+
 static struct {
uint64_t  duration; /* nanoseconds */
unsigned long packets;  /* number of packets in file */
@@ -256,7 +258,7 @@ static void select_interface(const char *arg)
 }
 
 /* Display list of possible interfaces that can be used. */
-static void show_interfaces(void)
+static void dump_interfaces(void)
 {
char name[RTE_ETH_NAME_MAX_LEN];
uint16_t p;
@@ -266,6 +268,8 @@ static void show_interfaces(void)
continue;
printf("%u. %s\n", p, name);
}
+
+   exit(0);
 }
 
 static void compile_filter(void)
@@ -353,8 +357,8 @@ static void parse_opts(int argc, char **argv)
dump_bpf = true;
break;
case 'D':
-   show_interfaces();
-   exit(0);
+   show_interfaces = true;
+   break;
case 'f':
filter_str = optarg;
break;
@@ -529,9 +533,6 @@ static void dpdk_init(void)
 
if (rte_eal_init(eal_argc, eal_argv) < 0)
rte_exit(EXIT_FAILURE, "EAL init failed: is primary process 
running?\n");
-
-   if (rte_eth_dev_count_avail() == 0)
-   rte_exit(EXIT_FAILURE, "No Ethernet ports found\n");
 }
 
 /* Create packet ring shared between callbacks and process */
@@ -789,6 +790,12 @@ int main(int argc, char **argv)
parse_opts(argc, argv);
dpdk_init();
 
+   if (show_interfaces)
+   dump_interfaces();
+
+   if (rte_eth_dev_count_avail() == 0)
+   rte_exit(EXIT_FAILURE, "No Ethernet ports found\n");
+
if (filter_str)
compile_filter();
 
-- 
2.35.1



[PATCH v5 0/4] support protocol based buffer split

2022-09-25 Thread Yuan Wang
Protocol type based buffer split consists of splitting a received packet
into several separate segments based on the packet content. It is useful
in some scenarios, such as GPU acceleration. The splitting will help to
enable true zero copy and hence improve the performance significantly.

This patchset aims to support protocol header split based on current buffer
split. When Rx queue is configured with RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT
offload and corresponding protocol, packets received will be directly split
into different mempools.

Change log:
v5:
Define proto_hdr to use mask instead of single protocol type.
Define PMD to return protocol header mask.
Refine the doc and commit log.
Remove deprecated RTE_FUNC_PTR_OR_ERR_RET.

v4:
Change proto_hdr to a bit mask of RTE_PTYPE_*.
Add the description on how to put the unsplit packages.
Use proto_hdr to determine whether to use protocol based split.

v3:
Fix mail thread.

v2:
Add mbuf dump to the driver's buffer split path.
Add buffer split to the driver feature list.
Remove unsupported header protocols from the driver.

Yuan Wang (4):
  ethdev: introduce protocol header API
  ethdev: introduce protocol hdr based buffer split
  app/testpmd: add rxhdrs commands and parameters
  net/ice: support buffer split in Rx path

 app/test-pmd/cmdline.c | 146 -
 app/test-pmd/config.c  |  88 ++
 app/test-pmd/parameters.c  |  16 +-
 app/test-pmd/testpmd.c |   2 +
 app/test-pmd/testpmd.h |   6 +
 doc/guides/rel_notes/release_22_11.rst |  16 ++
 drivers/net/ice/ice_ethdev.c   |  55 ++-
 drivers/net/ice/ice_rxtx.c | 218 +
 drivers/net/ice/ice_rxtx.h |  16 ++
 drivers/net/ice/ice_rxtx_vec_common.h  |   3 +
 lib/ethdev/ethdev_driver.h |  15 ++
 lib/ethdev/rte_ethdev.c| 107 ++--
 lib/ethdev/rte_ethdev.h|  59 ++-
 lib/ethdev/version.map |   3 +
 14 files changed, 702 insertions(+), 48 deletions(-)

-- 
2.25.1



[PATCH v5 1/4] ethdev: introduce protocol header API

2022-09-25 Thread Yuan Wang
Add a new ethdev API to retrieve supported protocol headers
of a PMD, which helps to configure protocol header based buffer split.

Signed-off-by: Yuan Wang 
Signed-off-by: Xuan Ding 
Signed-off-by: Wenxuan Wu 
Reviewed-by: Andrew Rybchenko 
---
 doc/guides/rel_notes/release_22_11.rst |  5 
 lib/ethdev/ethdev_driver.h | 15 
 lib/ethdev/rte_ethdev.c| 33 ++
 lib/ethdev/rte_ethdev.h| 30 +++
 lib/ethdev/version.map |  3 +++
 5 files changed, 86 insertions(+)

diff --git a/doc/guides/rel_notes/release_22_11.rst 
b/doc/guides/rel_notes/release_22_11.rst
index 235ac9bf94..8e5bdde46a 100644
--- a/doc/guides/rel_notes/release_22_11.rst
+++ b/doc/guides/rel_notes/release_22_11.rst
@@ -59,6 +59,11 @@ New Features
 
   * Added support to set device link down/up.
 
+* **Added new ethdev API for PMD to get buffer split supported protocol 
types.**
+
+  Added ``rte_eth_buffer_split_get_supported_hdr_ptypes()``, to get supported
+  header protocols of a PMD to split.
+
 
 Removed Items
 -
diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
index 8cd8eb8685..791b264610 100644
--- a/lib/ethdev/ethdev_driver.h
+++ b/lib/ethdev/ethdev_driver.h
@@ -1055,6 +1055,18 @@ typedef int (*eth_ip_reassembly_conf_get_t)(struct 
rte_eth_dev *dev,
 typedef int (*eth_ip_reassembly_conf_set_t)(struct rte_eth_dev *dev,
const struct rte_eth_ip_reassembly_params *conf);
 
+/**
+ * @internal
+ * Get supported header protocols of a PMD to split.
+ *
+ * @param dev
+ *   Ethdev handle of port.
+ *
+ * @return
+ *   An array pointer to store supported protocol headers.
+ */
+typedef const uint32_t *(*eth_buffer_split_supported_hdr_ptypes_get_t)(struct 
rte_eth_dev *dev);
+
 /**
  * @internal
  * Dump private info from device to a file.
@@ -1302,6 +1314,9 @@ struct eth_dev_ops {
/** Set IP reassembly configuration */
eth_ip_reassembly_conf_set_t ip_reassembly_conf_set;
 
+   /** Get supported header ptypes to split */
+   eth_buffer_split_supported_hdr_ptypes_get_t 
buffer_split_supported_hdr_ptypes_get;
+
/** Dump private info from device */
eth_dev_priv_dump_t eth_dev_priv_dump;
 
diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
index 0c2c1088c0..1f0a7f8f3f 100644
--- a/lib/ethdev/rte_ethdev.c
+++ b/lib/ethdev/rte_ethdev.c
@@ -6002,6 +6002,39 @@ rte_eth_dev_priv_dump(uint16_t port_id, FILE *file)
return eth_err(port_id, (*dev->dev_ops->eth_dev_priv_dump)(dev, file));
 }
 
+int
+rte_eth_buffer_split_get_supported_hdr_ptypes(uint16_t port_id, uint32_t 
*ptypes, int num)
+{
+   int i, j;
+   struct rte_eth_dev *dev;
+   const uint32_t *all_types;
+
+   RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
+   dev = &rte_eth_devices[port_id];
+
+   if (ptypes == NULL && num > 0) {
+   RTE_ETHDEV_LOG(ERR,
+   "Cannot get ethdev port %u supported header protocol 
types to NULL when array size is non zero\n",
+   port_id);
+   return -EINVAL;
+   }
+
+   if (*dev->dev_ops->buffer_split_supported_hdr_ptypes_get == NULL)
+   return -ENOTSUP;
+   all_types = (*dev->dev_ops->buffer_split_supported_hdr_ptypes_get)(dev);
+
+   if (!all_types)
+   return 0;
+
+   for (i = 0, j = 0; all_types[i] != RTE_PTYPE_UNKNOWN; ++i) {
+   if (j < num)
+   ptypes[j] = all_types[i];
+   j++;
+   }
+
+   return j;
+}
+
 RTE_LOG_REGISTER_DEFAULT(rte_eth_dev_logtype, INFO);
 
 RTE_INIT(ethdev_init_telemetry)
diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index 45d17ddd13..c440e3863a 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -5924,6 +5924,36 @@ rte_eth_tx_buffer(uint16_t port_id, uint16_t queue_id,
return rte_eth_tx_buffer_flush(port_id, queue_id, buffer);
 }
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Get supported header protocols to split on Rx.
+ *
+ * When a packet type is announced to be split, it *must* be supported by
+ * the PMD. For instance, if eth-ipv4, eth-ipv4-udp is announced, the PMD must
+ * return the following packet types for these packets:
+ * - Ether/IPv4 -> RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4
+ * - Ether/IPv4/UDP -> RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4 | 
RTE_PTYPE_L4_UDP
+ *
+ * @param port_id
+ *   The port identifier of the device.
+ * @param[out] ptypes
+ *   An array pointer to store supported protocol headers, allocated by caller.
+ *   These ptypes are composed with RTE_PTYPE_*.
+ * @param num
+ *   Size of the array pointed by param ptypes.
+ * @return
+ *   - (>=0) Number of supported ptypes. If the number of types exceeds num,
+ *   only num entries will be filled into the ptypes array, but the 
full
+ *   co

[PATCH v5 2/4] ethdev: introduce protocol hdr based buffer split

2022-09-25 Thread Yuan Wang
Currently, Rx buffer split supports length based split. With Rx queue
offload RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT enabled and Rx packet segment
configured, PMD will be able to split the received packets into
multiple segments.

However, length based buffer split is not suitable for NICs that split based on
protocol headers. Given an arbitrarily variable length in an Rx packet segment,
it is almost impossible to pass a fixed protocol header to the driver. Besides,
with tunneling, the composition of a packet varies, which makes the situation
even worse.

This patch extends current buffer split to support protocol header based
buffer split. A new proto_hdr field is introduced in the reserved field
of rte_eth_rxseg_split structure to specify protocol header. The proto_hdr
field defines the split position of packet, splitting will always happen
after the protocol header defined in the Rx packet segment. When Rx queue
offload RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT is enabled and corresponding
protocol header is configured, driver will split the ingress packets into
multiple segments.

Examples for proto_hdr field defines:
To split after ETH-IPV4-UDP, it should be defined as
RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN | RTE_PTYPE_L4_UDP

For inner ETH-IPV4-UDP, it should be defined as
RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER |
RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN | RTE_PTYPE_INNER_L4_UDP

struct rte_eth_rxseg_split {
struct rte_mempool *mp; /* memory pools to allocate segment from */
uint16_t length; /* segment maximal data length,
configures split point */
uint16_t offset; /* data offset from beginning
of mbuf data buffer */
/**
 * Proto_hdr defines a bit mask of the protocol sequence as
 * RTE_PTYPE_*, configures split point. The last RTE_PTYPE*
 * in the mask indicates the split position.
 * For non-tunneling packets, the complete protocol sequence
 * should be defined.
 * For tunneling packets, for simplicity, only the tunnel and
 * inner protocol sequence should be defined.
 */
uint32_t proto_hdr;
};
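The comment above says the last RTE_PTYPE* in the mask indicates the split position. That rule can be sketched in self-contained C; the PTYPE_* constants below are illustrative single-bit placeholders, not DPDK's real RTE_PTYPE_* encodings:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-ins for RTE_PTYPE_* flags -- NOT the real DPDK values. */
#define PTYPE_L2_ETHER 0x0001u
#define PTYPE_L3_IPV4  0x0010u
#define PTYPE_L4_UDP   0x0100u
#define PTYPE_L4_TCP   0x0200u

enum split_layer { SPLIT_NONE, SPLIT_AFTER_L2, SPLIT_AFTER_L3, SPLIT_AFTER_L4 };

/* The last (innermost) protocol present in proto_hdr decides where to split. */
static enum split_layer split_position(uint32_t proto_hdr)
{
	if (proto_hdr & (PTYPE_L4_UDP | PTYPE_L4_TCP))
		return SPLIT_AFTER_L4;
	if (proto_hdr & PTYPE_L3_IPV4)
		return SPLIT_AFTER_L3;
	if (proto_hdr & PTYPE_L2_ETHER)
		return SPLIT_AFTER_L2;
	return SPLIT_NONE;
}
```

So ETH | IPV4 | UDP splits after the UDP header, while ETH | IPV4 splits after the IPv4 header, matching the seg0/seg1 example below.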

If protocol header split can be supported by a PMD, the
rte_eth_buffer_split_get_supported_hdr_ptypes function can
be used to obtain a list of these protocol headers.

For example, let's suppose we configured the Rx queue with the
following segments:
seg0 - pool0, proto_hdr0=RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4,
   off0=2B
seg1 - pool1, proto_hdr1=RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4
   | RTE_PTYPE_L4_UDP, off1=128B
seg2 - pool2, off2=0B

A packet consisting of ETH_IPV4_UDP_PAYLOAD will be split as
follows:
seg0 - ipv4 header @ RTE_PKTMBUF_HEADROOM + 2 in mbuf from pool0
seg1 - udp header @ 128 in mbuf from pool1
seg2 - payload @ 0 in mbuf from pool2

Note: The NIC will only do the split when the packets exactly match all the
protocol headers in the segments. For example, if ARP packets are received
with the above config, the NIC won't split them, since an ARP packet
contains neither an IPv4 header nor a UDP header. These packets will be put
into the last valid mempool, with zero offset.
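The exact-match rule above can be modeled as a mask check: the NIC splits only when the packet's ptype contains every configured header bit; otherwise the whole packet goes to the last valid mempool at zero offset. A hedged sketch (illustrative bit values, not DPDK's real encodings):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative ptype bits -- not the real RTE_PTYPE_* encodings. */
#define HDR_ETH  0x0001u
#define HDR_IPV4 0x0010u
#define HDR_UDP  0x0100u
#define HDR_ARP  0x1000u

/* Split only if the received packet carries all configured headers;
 * otherwise the packet falls back to the last valid mempool. */
static bool should_split(uint32_t pkt_ptype, uint32_t cfg_proto_hdr)
{
	return (pkt_ptype & cfg_proto_hdr) == cfg_proto_hdr;
}
```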

Buffer split can now be configured in two modes. For length based
buffer split, the mp, length and offset fields in the Rx packet segment
should be configured, while the proto_hdr field will be ignored.
For protocol header based buffer split, the mp, offset and proto_hdr fields
in the Rx packet segment should be configured, while the length field will
be ignored.

The split limitations imposed by the underlying driver are reported in the
rte_eth_dev_info->rx_seg_capa field. The memory attributes of the split
parts may also differ, e.g. DPDK memory versus external memory.

Signed-off-by: Yuan Wang 
Signed-off-by: Xuan Ding 
Signed-off-by: Wenxuan Wu 
---
 doc/guides/rel_notes/release_22_11.rst |  7 +++
 lib/ethdev/rte_ethdev.c| 74 ++
 lib/ethdev/rte_ethdev.h| 29 +-
 3 files changed, 98 insertions(+), 12 deletions(-)

diff --git a/doc/guides/rel_notes/release_22_11.rst b/doc/guides/rel_notes/release_22_11.rst
index 8e5bdde46a..cce1f6e50c 100644
--- a/doc/guides/rel_notes/release_22_11.rst
+++ b/doc/guides/rel_notes/release_22_11.rst
@@ -64,6 +64,13 @@ New Features
   Added ``rte_eth_buffer_split_get_supported_hdr_ptypes()``, to get supported
   header protocols of a PMD to split.
 
+* **Added protocol header based buffer split.**
+
+  Ethdev: The ``reserved`` field in the ``rte_eth_rxseg_split`` structure is
+  replaced with ``proto_hdr`` to support protocol header based buffer split.
+  User can choose length or protocol header to configure buffer split
+  according to NIC's capability.
+
 
 Removed Items
 -
diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
index 1f0a7f8f3f..27ec19faed 100644
--- a/lib/ethde

[PATCH v5 3/4] app/testpmd: add rxhdrs commands and parameters

2022-09-25 Thread Yuan Wang
Add command line parameter:
--rxhdrs=eth,[eth-ipv4,eth-ipv4-udp]

Set the protocol_hdr of segments to scatter packets on receiving if the
split feature is engaged, for the queues configured with the BUFFER_SPLIT flag.

Add interactive mode command:
testpmd>set rxhdrs eth,eth-ipv4,eth-ipv4-udp
(protocol sequence should be valid)

The protocol split feature is off by default. To enable protocol split,
you need:
1. Start testpmd with multiple mempools. E.g. --mbuf-size=2048,2048
2. Configure Rx queue with rx_offload buffer split on.
3. Set the protocol type of buffer split. E.g. set rxhdrs eth,eth-ipv4
(default protocols of testpmd : eth|eth-ipv4|eth-ipv6|
 eth-ipv4-tcp|eth-ipv6-tcp|eth-ipv4-udp|eth-ipv6-udp|
 eth-ipv4-sctp|eth-ipv6-sctp|grenat-eth|grenat-eth-ipv4|
 grenat-eth-ipv6|grenat-eth-ipv4-tcp|grenat-eth-ipv6-tcp|
 grenat-eth-ipv4-udp|grenat-eth-ipv6-udp|grenat-eth-ipv4-sctp|
 grenat-eth-ipv6-sctp)
The above protocols can be configured in testpmd. But the configuration can
only be applied when it is supported by a specific PMD.
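The `set rxhdrs eth,eth-ipv4,...` command boils down to splitting a comma-separated list and mapping each token to a ptype mask, much like the `get_ptype()` helper in the diff below. A minimal, self-contained sketch of that parsing (the PT_* values and the tiny token table are illustrative placeholders, not testpmd's actual implementation):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Illustrative masks -- not the real RTE_PTYPE_* values. */
#define PT_ETH  0x0001u
#define PT_IPV4 0x0010u
#define PT_UDP  0x0100u

/* Map one token (as accepted by "set rxhdrs") to a mask; 0 = unknown. */
static uint32_t ptype_of(const char *name)
{
	if (strcmp(name, "eth") == 0)
		return PT_ETH;
	if (strcmp(name, "eth-ipv4") == 0)
		return PT_ETH | PT_IPV4;
	if (strcmp(name, "eth-ipv4-udp") == 0)
		return PT_ETH | PT_IPV4 | PT_UDP;
	return 0;
}

/* Split "eth,eth-ipv4,..." into per-segment masks; returns segment count. */
static int parse_rxhdrs(const char *list, uint32_t *out, int max)
{
	char buf[128];
	int n = 0;

	strncpy(buf, list, sizeof(buf) - 1);
	buf[sizeof(buf) - 1] = '\0';
	for (char *tok = strtok(buf, ","); tok != NULL && n < max;
			tok = strtok(NULL, ","))
		out[n++] = ptype_of(tok);
	return n;
}
```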

Signed-off-by: Yuan Wang 
Signed-off-by: Xuan Ding 
Signed-off-by: Wenxuan Wu 
---
 app/test-pmd/cmdline.c| 146 +-
 app/test-pmd/config.c |  88 +++
 app/test-pmd/parameters.c |  16 -
 app/test-pmd/testpmd.c|   2 +
 app/test-pmd/testpmd.h|   6 ++
 5 files changed, 254 insertions(+), 4 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index ba749f588a..00c7d167ce 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -181,7 +181,7 @@ static void cmd_help_long_parsed(void *parsed_result,
"show (rxq|txq) info (port_id) (queue_id)\n"
"Display information for configured RX/TX 
queue.\n\n"
 
-   "show config (rxtx|cores|fwd|rxoffs|rxpkts|txpkts)\n"
+   "show config 
(rxtx|cores|fwd|rxoffs|rxpkts|rxhdrs|txpkts)\n"
"Display the given configuration.\n\n"
 
"read rxd (port_id) (queue_id) (rxd_id)\n"
@@ -305,6 +305,17 @@ static void cmd_help_long_parsed(void *parsed_result,
" Affects only the queues configured with split"
" offloads.\n\n"
 
+   "set rxhdrs (eth[,eth-ipv4])*\n"
+   "Set the protocol hdr of each segment to scatter"
+   " packets on receiving if split feature is engaged."
+   " Affects only the queues configured with split"
+   " offloads.\n"
+   "Supported values: 
eth|eth-ipv4|eth-ipv6|eth-ipv4-tcp|eth-ipv6-tcp|"
+   "eth-ipv4-udp|eth-ipv6-udp|eth-ipv4-sctp|eth-ipv6-sctp|"
+   
"grenat-eth|grenat-eth-ipv4|grenat-eth-ipv6|grenat-eth-ipv4-tcp|"
+   
"grenat-eth-ipv6-tcp|grenat-eth-ipv4-udp|grenat-eth-ipv6-udp|"
+   "grenat-eth-ipv4-sctp|grenat-eth-ipv6-sctp\n\n"
+
"set txpkts (x[,y]*)\n"
"Set the length of each segment of TXONLY"
" and optionally CSUM packets.\n\n"
@@ -3366,6 +3377,88 @@ static cmdline_parse_inst_t cmd_stop = {
},
 };
 
+static unsigned int
+get_ptype(char *value)
+{
+   uint32_t protocol;
+
+   if (!strcmp(value, "eth"))
+   protocol = RTE_PTYPE_L2_ETHER;
+   else if (!strcmp(value, "eth-ipv4"))
+   protocol = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN;
+   else if (!strcmp(value, "eth-ipv6"))
+   protocol = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN;
+   else if (!strcmp(value, "eth-ipv4-tcp"))
+   protocol = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN | 
RTE_PTYPE_L4_TCP;
+   else if (!strcmp(value, "eth-ipv6-tcp"))
+   protocol = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN | 
RTE_PTYPE_L4_TCP;
+   else if (!strcmp(value, "eth-ipv4-udp"))
+   protocol = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN | 
RTE_PTYPE_L4_UDP;
+   else if (!strcmp(value, "eth-ipv6-udp"))
+   protocol = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN | 
RTE_PTYPE_L4_UDP;
+   else if (!strcmp(value, "eth-ipv4-sctp"))
+   protocol = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN | 
RTE_PTYPE_L4_SCTP;
+   else if (!strcmp(value, "eth-ipv6-sctp"))
+   protocol = RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN | 
RTE_PTYPE_L4_SCTP;
+   else if (!strcmp(value, "grenat-eth"))
+   protocol = RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER;
+   else if (!strcmp(value, "grenat-eth-ipv4"))
+   protocol = RTE_PTYPE_TUNNEL_GRENAT | RTE_PTYPE_INNER_L2_ETHER |
+   RTE_PTYPE_INNER_L3_IPV4_EXT_UNKNOWN;
+   else if (!strcmp(value, "gr

[PATCH v5 4/4] net/ice: support buffer split in Rx path

2022-09-25 Thread Yuan Wang
This patch adds support for protocol based buffer split in the normal Rx
data paths. When an Rx queue is configured with a specific protocol type,
packets received will be directly split into protocol header and
payload parts, within the limitations of the PMD, and the two parts will be
put into different mempools.

Currently, protocol based buffer split is not supported in vectorized
paths.

A new API, ice_buffer_split_supported_hdr_ptypes_get(), has been
introduced; it returns the supported header protocols of the ice PMD
to the application for splitting.

Signed-off-by: Yuan Wang 
Signed-off-by: Xuan Ding 
Signed-off-by: Wenxuan Wu 
---
 doc/guides/rel_notes/release_22_11.rst |   4 +
 drivers/net/ice/ice_ethdev.c   |  55 ++-
 drivers/net/ice/ice_rxtx.c | 218 +
 drivers/net/ice/ice_rxtx.h |  16 ++
 drivers/net/ice/ice_rxtx_vec_common.h  |   3 +
 5 files changed, 264 insertions(+), 32 deletions(-)

diff --git a/doc/guides/rel_notes/release_22_11.rst b/doc/guides/rel_notes/release_22_11.rst
index cce1f6e50c..f11bbbdc1f 100644
--- a/doc/guides/rel_notes/release_22_11.rst
+++ b/doc/guides/rel_notes/release_22_11.rst
@@ -71,6 +71,10 @@ New Features
   User can choose length or protocol header to configure buffer split
   according to NIC's capability.
 
+* **Updated Intel ice driver.**
+
+  Added protocol based buffer split support in scalar path.
+
 
 Removed Items
 -
diff --git a/drivers/net/ice/ice_ethdev.c b/drivers/net/ice/ice_ethdev.c
index e8304a1f2b..23f0f2140c 100644
--- a/drivers/net/ice/ice_ethdev.c
+++ b/drivers/net/ice/ice_ethdev.c
@@ -170,6 +170,7 @@ static int ice_timesync_read_time(struct rte_eth_dev *dev,
 static int ice_timesync_write_time(struct rte_eth_dev *dev,
   const struct timespec *timestamp);
 static int ice_timesync_disable(struct rte_eth_dev *dev);
+static const uint32_t *ice_buffer_split_supported_hdr_ptypes_get(struct 
rte_eth_dev *dev);
 
 static const struct rte_pci_id pci_id_ice_map[] = {
{ RTE_PCI_DEVICE(ICE_INTEL_VENDOR_ID, ICE_DEV_ID_E823L_BACKPLANE) },
@@ -281,6 +282,7 @@ static const struct eth_dev_ops ice_eth_dev_ops = {
.timesync_write_time  = ice_timesync_write_time,
.timesync_disable = ice_timesync_disable,
.tm_ops_get   = ice_tm_ops_get,
+   .buffer_split_supported_hdr_ptypes_get = 
ice_buffer_split_supported_hdr_ptypes_get,
 };
 
 /* store statistics names and its offset in stats structure */
@@ -3750,7 +3752,8 @@ ice_dev_info_get(struct rte_eth_dev *dev, struct 
rte_eth_dev_info *dev_info)
RTE_ETH_RX_OFFLOAD_OUTER_IPV4_CKSUM |
RTE_ETH_RX_OFFLOAD_VLAN_EXTEND |
RTE_ETH_RX_OFFLOAD_RSS_HASH |
-   RTE_ETH_RX_OFFLOAD_TIMESTAMP;
+   RTE_ETH_RX_OFFLOAD_TIMESTAMP |
+   RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT;
dev_info->tx_offload_capa |=
RTE_ETH_TX_OFFLOAD_QINQ_INSERT |
RTE_ETH_TX_OFFLOAD_IPV4_CKSUM |
@@ -3762,7 +3765,7 @@ ice_dev_info_get(struct rte_eth_dev *dev, struct 
rte_eth_dev_info *dev_info)
dev_info->flow_type_rss_offloads |= ICE_RSS_OFFLOAD_ALL;
}
 
-   dev_info->rx_queue_offload_capa = 0;
+   dev_info->rx_queue_offload_capa = RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT;
dev_info->tx_queue_offload_capa = RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE;
 
dev_info->reta_size = pf->hash_lut_size;
@@ -3831,6 +3834,11 @@ ice_dev_info_get(struct rte_eth_dev *dev, struct 
rte_eth_dev_info *dev_info)
dev_info->default_rxportconf.ring_size = ICE_BUF_SIZE_MIN;
dev_info->default_txportconf.ring_size = ICE_BUF_SIZE_MIN;
 
+   dev_info->rx_seg_capa.max_nseg = ICE_RX_MAX_NSEG;
+   dev_info->rx_seg_capa.multi_pools = 1;
+   dev_info->rx_seg_capa.offset_allowed = 0;
+   dev_info->rx_seg_capa.offset_align_log2 = 0;
+
return 0;
 }
 
@@ -5887,6 +5895,49 @@ ice_timesync_disable(struct rte_eth_dev *dev)
return 0;
 }
 
+static const uint32_t *
+ice_buffer_split_supported_hdr_ptypes_get(struct rte_eth_dev *dev __rte_unused)
+{
+   /* Buffer split protocol header capability. */
+   static const uint32_t ptypes[] = {
+   /* Non tunneled */
+   RTE_PTYPE_L2_ETHER,
+   RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN,
+   RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN | 
RTE_PTYPE_L4_UDP,
+   RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN | 
RTE_PTYPE_L4_TCP,
+   RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV4_EXT_UNKNOWN | 
RTE_PTYPE_L4_SCTP,
+   RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN,
+   RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN | 
RTE_PTYPE_L4_UDP,
+   RTE_PTYPE_L2_ETHER | RTE_PTYPE_L3_IPV6_EXT_UNKNOWN | 
RTE_PTYPE_L4_TCP,
+   RTE_PTYPE_L2_ETHER 

[PATCH v6 1/2] eventdev/eth_tx: add queue start stop API

2022-09-25 Thread Naga Harish K S V
Add support to start or stop a particular queue
that is associated with the adapter.

Start function enables the Tx adapter to start enqueueing
packets to the Tx queue.

Stop function stops the Tx adapter from enqueueing any
packets to the Tx queue. The stop API also frees any packets
that may have been buffered for this queue. All inflight packets
destined for the queue are freed by the adapter runtime until the
queue is started again.
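The stop semantics described above (stop frees buffered packets, and anything destined for the queue keeps being freed until start is called again) can be modeled with a tiny state machine. This is a conceptual sketch, not the adapter's real data structures:

```c
#include <assert.h>
#include <stdbool.h>

struct txa_queue {
	bool started;  /* queues begin in the started state */
	int buffered;  /* packets buffered for this queue */
	int freed;     /* packets freed instead of transmitted */
};

static void queue_stop(struct txa_queue *q)
{
	q->started = false;
	q->freed += q->buffered; /* stop frees anything already buffered */
	q->buffered = 0;
}

static void queue_start(struct txa_queue *q)
{
	q->started = true;
}

/* Enqueue attempt: buffered while started, freed while stopped. */
static void queue_enqueue(struct txa_queue *q, int pkts)
{
	if (q->started)
		q->buffered += pkts;
	else
		q->freed += pkts;
}
```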

Signed-off-by: Naga Harish K S V 
---
v6:
* fix nitpicks
v5:
* fix build failure
v4:
* update programmer guide and doxygen comments
v3:
* fix documentation and address review comments
---
---
 .../prog_guide/event_ethernet_tx_adapter.rst  |  25 
 doc/guides/rel_notes/release_22_11.rst|   8 ++
 lib/eventdev/eventdev_pmd.h   |  41 +++
 lib/eventdev/rte_event_eth_tx_adapter.c   | 112 +-
 lib/eventdev/rte_event_eth_tx_adapter.h   |  54 +
 lib/eventdev/version.map  |   2 +
 6 files changed, 238 insertions(+), 4 deletions(-)

diff --git a/doc/guides/prog_guide/event_ethernet_tx_adapter.rst b/doc/guides/prog_guide/event_ethernet_tx_adapter.rst
index 1a5b069d60..73022e307a 100644
--- a/doc/guides/prog_guide/event_ethernet_tx_adapter.rst
+++ b/doc/guides/prog_guide/event_ethernet_tx_adapter.rst
@@ -182,3 +182,28 @@ mbufs are destined to the same ethernet port and queue by 
setting the bit
 ``rte_event_vector::queue``.
 If ``rte_event_vector::attr_valid`` is not set then the Tx adapter should peek
 into each mbuf and transmit them to the requested ethernet port and queue pair.
+
+Queue start/stop
+
+
+The adapter can be configured to start/stop enqueueing of packets to an
+associated NIC queue using ``rte_event_eth_tx_adapter_queue_start()`` or
+``rte_event_eth_tx_adapter_queue_stop()`` respectively. By default the queue
+is in the started state.
+
+These APIs help avoid unexpected behavior when the application stops ethdev
+Tx queues while the adapter is unaware of it. With these APIs, the application
+can call the stop API to notify the adapter that the corresponding ethdev Tx
+queue is stopped, and any in-flight packets are freed by the adapter data plane
+code. The adapter queue stop API should be called before stopping the ethdev Tx
+queue. When the ethdev Tx queue is enabled again, the application can notify the
+adapter to resume processing of the packets for that queue by calling the start
+API. The ethdev Tx queue should be started before calling the adapter start API.
+
+Start function enables the adapter runtime to start enqueueing of packets
+to the Tx queue.
+
+Stop function stops the adapter runtime function from enqueueing any
+packets to the associated Tx queue. This API also frees any packets that
+may have been buffered for this queue. All inflight packets destined for the
+queue are freed by the adapter runtime until the queue is started again.
diff --git a/doc/guides/rel_notes/release_22_11.rst b/doc/guides/rel_notes/release_22_11.rst
index 97720a78ee..a4a51598a1 100644
--- a/doc/guides/rel_notes/release_22_11.rst
+++ b/doc/guides/rel_notes/release_22_11.rst
@@ -31,6 +31,14 @@ New Features
   Added ``rte_event_eth_tx_adapter_instance_get`` to get the Tx adapter 
instance id for specified
   ethernet device id and Tx queue index.
 
+
+* **Added Tx adapter queue start/stop API**
+
+  Added ``rte_event_eth_tx_adapter_queue_start`` to start enqueueing packets 
to the Tx queue by
+  Tx adapter.
+  Added ``rte_event_eth_tx_adapter_queue_stop`` to stop the Tx Adapter from 
enqueueing any
+  packets to the Tx queue.
+
 .. This section should contain new features added in this release.
Sample format:
 
diff --git a/lib/eventdev/eventdev_pmd.h b/lib/eventdev/eventdev_pmd.h
index 1e65d096f1..27b936a60e 100644
--- a/lib/eventdev/eventdev_pmd.h
+++ b/lib/eventdev/eventdev_pmd.h
@@ -1277,6 +1277,43 @@ typedef int 
(*eventdev_eth_tx_adapter_stats_reset_t)(uint8_t id,
 typedef int (*eventdev_eth_tx_adapter_instance_get_t)
(uint16_t eth_dev_id, uint16_t tx_queue_id, uint8_t *txa_inst_id);
 
+/**
+ * Start a Tx queue that is assigned to Tx adapter instance
+ *
+ * @param id
+ *  Adapter identifier
+ *
+ * @param eth_dev_id
+ *  Port identifier of Ethernet device
+ *
+ * @param tx_queue_id
+ *  Ethernet device Tx queue index
+ *
+ * @return
+ *  -  0: Success
+ *  - <0: Error code on failure
+ */
+typedef int (*eventdev_eth_tx_adapter_queue_start)
+   (uint8_t id, uint16_t eth_dev_id, uint16_t tx_queue_id);
+
+/**
+ * Stop a Tx queue that is assigned to Tx adapter instance
+ *
+ * @param id
+ *  Adapter identifier
+ *
+ * @param eth_dev_id
+ *  Port identifier of Ethernet device
+ *
+ * @param tx_queue_id
+ *  Ethernet device Tx queue index
+ *
+ * @return
+ *  -  0: Success
+ *  - <0: Error code on failure
+ */
+typedef int (*eventdev_eth_tx_adapter_queue_stop)
+   (uint8_t id, uint16_t eth_dev_id, uint16_t tx_queue_id);
 
 /** Event device operations function pointer table */
 struct eventdev_ops {
@@ -1390,6 +1427,10 @@ str

[PATCH v6 2/2] test/eth_tx: add testcase for queue start stop APIs

2022-09-25 Thread Naga Harish K S V
Added testcase for rte_event_eth_tx_adapter_queue_start()
and rte_event_eth_tx_adapter_queue_stop() APIs.

Signed-off-by: Naga Harish K S V 
---
 app/test/test_event_eth_tx_adapter.c | 86 
 1 file changed, 86 insertions(+)

diff --git a/app/test/test_event_eth_tx_adapter.c b/app/test/test_event_eth_tx_adapter.c
index 98debfdd2c..c19a87a86a 100644
--- a/app/test/test_event_eth_tx_adapter.c
+++ b/app/test/test_event_eth_tx_adapter.c
@@ -711,6 +711,90 @@ tx_adapter_instance_get(void)
return TEST_SUCCESS;
 }
 
+static int
+tx_adapter_queue_start_stop(void)
+{
+   int err;
+   uint16_t eth_dev_id;
+   struct rte_eth_dev_info dev_info;
+
+   /* Case 1: Test without adding eth Tx queue */
+   err = rte_event_eth_tx_adapter_queue_start(TEST_ETHDEV_ID,
+   TEST_ETH_QUEUE_ID);
+   TEST_ASSERT(err == -EINVAL, "Expected -EINVAL got %d", err);
+
+   err = rte_event_eth_tx_adapter_queue_stop(TEST_ETHDEV_ID,
+   TEST_ETH_QUEUE_ID);
+   TEST_ASSERT(err == -EINVAL, "Expected -EINVAL got %d", err);
+
+   /* Case 2: Test with wrong eth port */
+   eth_dev_id = rte_eth_dev_count_total() + 1;
+   err = rte_event_eth_tx_adapter_queue_start(eth_dev_id,
+   TEST_ETH_QUEUE_ID);
+   TEST_ASSERT(err == -EINVAL, "Expected -EINVAL got %d", err);
+
+   err = rte_event_eth_tx_adapter_queue_stop(eth_dev_id,
+   TEST_ETH_QUEUE_ID);
+   TEST_ASSERT(err == -EINVAL, "Expected -EINVAL got %d", err);
+
+   /* Case 3: Test with wrong tx queue */
+   err = rte_eth_dev_info_get(TEST_ETHDEV_ID, &dev_info);
+   TEST_ASSERT(err == 0, "Expected 0 got %d", err);
+
+   err = rte_event_eth_tx_adapter_queue_start(TEST_ETHDEV_ID,
+   dev_info.max_tx_queues + 1);
+   TEST_ASSERT(err == -EINVAL, "Expected -EINVAL got %d", err);
+
+   err = rte_event_eth_tx_adapter_queue_stop(TEST_ETHDEV_ID,
+   dev_info.max_tx_queues + 1);
+   TEST_ASSERT(err == -EINVAL, "Expected -EINVAL got %d", err);
+
+   /* Case 4: Test with right instance, port & rxq */
+   /* Add queue to tx adapter */
+   err = rte_event_eth_tx_adapter_queue_add(TEST_INST_ID,
+TEST_ETHDEV_ID,
+TEST_ETH_QUEUE_ID);
+   TEST_ASSERT(err == 0, "Expected 0 got %d", err);
+
+   err = rte_event_eth_tx_adapter_queue_stop(TEST_ETHDEV_ID,
+   TEST_ETH_QUEUE_ID);
+   TEST_ASSERT(err == 0, "Expected 0 got %d", err);
+
+   err = rte_event_eth_tx_adapter_queue_start(TEST_ETHDEV_ID,
+   TEST_ETH_QUEUE_ID);
+   TEST_ASSERT(err == 0, "Expected 0 got %d", err);
+
+   /* Add another queue to tx adapter */
+   err = rte_event_eth_tx_adapter_queue_add(TEST_INST_ID,
+TEST_ETHDEV_ID,
+TEST_ETH_QUEUE_ID + 1);
+   TEST_ASSERT(err == 0, "Expected 0 got %d", err);
+
+   err = rte_event_eth_tx_adapter_queue_stop(TEST_ETHDEV_ID,
+   TEST_ETH_QUEUE_ID + 1);
+   TEST_ASSERT(err == 0, "Expected 0 got %d", err);
+   err = rte_event_eth_tx_adapter_queue_start(TEST_ETHDEV_ID,
+   TEST_ETH_QUEUE_ID + 1);
+   TEST_ASSERT(err == 0, "Expected 0 got %d", err);
+
+   /* Case 5: Test with right instance, port & wrong rxq */
+   err = rte_event_eth_tx_adapter_queue_stop(TEST_ETHDEV_ID,
+   TEST_ETH_QUEUE_ID + 2);
+   TEST_ASSERT(err == -EINVAL, "Expected -EINVAL got %d", err);
+
+   err = rte_event_eth_tx_adapter_queue_start(TEST_ETHDEV_ID,
+   TEST_ETH_QUEUE_ID + 2);
+   TEST_ASSERT(err == -EINVAL, "Expected -EINVAL got %d", err);
+
+   /* Delete all queues from the Tx adapter */
+   err = rte_event_eth_tx_adapter_queue_del(TEST_INST_ID,
+TEST_ETHDEV_ID,
+-1);
+   TEST_ASSERT(err == 0, "Expected 0 got %d", err);
+
+   return TEST_SUCCESS;
+}
+
 static int
 tx_adapter_dynamic_device(void)
 {
@@ -770,6 +854,8 @@ static struct unit_test_suite event_eth_tx_tests = {
tx_adapter_service),
TEST_CASE_ST(tx_adapter_create, tx_adapter_free,
tx_adapter_instance_get),
+   TEST_CASE_ST(tx_adapter_create, tx_adapter_free,
+   tx_adapter_queue_start_stop)

RE: [PATCH v2] vhost: fix build

2022-09-25 Thread Xia, Chenbo
Hi Min,

> -Original Message-
> From: Min Zhou 
> Sent: Monday, August 29, 2022 4:29 PM
> To: david.march...@redhat.com; maxime.coque...@redhat.com; Xia, Chenbo
> ; zhou...@loongson.cn
> Cc: dev@dpdk.org; maob...@loongson.cn
> Subject: [PATCH v2] vhost: fix build
> 
> On CentOS 8 or Debian 10.4 systems using gcc 12.1 to cross
> compile DPDK, gcc shows the following warning, which will cause
> the build to fail when run with -werror:
> 
> In function 'mbuf_to_desc',
> inlined from 'vhost_enqueue_async_packed'
> at ../lib/vhost/virtio_net.c:1826:6,
> inlined from 'virtio_dev_rx_async_packed'
> at ../lib/vhost/virtio_net.c:1840:6,
> inlined from 'virtio_dev_rx_async_submit_packed.constprop'
> at ../lib/vhost/virtio_net.c:1900:7:
> ../lib/vhost/virtio_net.c:1161:35: error: 'buf_vec[0].buf_len' may be used
> uninitialized [-Werror=maybe-uninitialized]
>  1161 | buf_len = buf_vec[vec_idx].buf_len;
>   |   ^~~~
> ../lib/vhost/virtio_net.c: In function
> 'virtio_dev_rx_async_submit_packed.constprop':
> ../lib/vhost/virtio_net.c:1838:27: note: 'buf_vec' declared here
>  1838 | struct buf_vector buf_vec[BUF_VECTOR_MAX];
>   |   ^~~
> cc1: all warnings being treated as errors
> 
> Actually, there are eight places with the same code in the file
> lib/vhost/virtio_net.c, and all these `buf_vec` arrays are
> initialized by sub-function calls under various conditions.
> 
> Although it's hard to understand why gcc only emits the warning at one
> of the eight places, adding validity checks for the array length is
> reasonable and also fixes the warning.
> 
> Signed-off-by: David Marchand 
> Signed-off-by: Min Zhou 
> ---
>  lib/vhost/virtio_net.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)

Just want you to know that your patch is still pending because by accident
your fix is almost the same as a previous patch that fixes a real issue but
that patch is still in progress:

http://patchwork.dpdk.org/project/dpdk/patch/20220802004938.23670-2-cfont...@suse.de/

Thanks,
Chenbo

> 
> diff --git a/lib/vhost/virtio_net.c b/lib/vhost/virtio_net.c
> index 35fa4670fd..99233f1759 100644
> --- a/lib/vhost/virtio_net.c
> +++ b/lib/vhost/virtio_net.c
> @@ -1153,7 +1153,7 @@ mbuf_to_desc(struct virtio_net *dev, struct
> vhost_virtqueue *vq,
>   struct virtio_net_hdr_mrg_rxbuf tmp_hdr, *hdr = NULL;
>   struct vhost_async *async = vq->async;
> 
> - if (unlikely(m == NULL))
> + if (unlikely(m == NULL || nr_vec == 0))
>   return -1;
> 
>   buf_addr = buf_vec[vec_idx].buf_addr;
> @@ -2673,6 +2673,9 @@ desc_to_mbuf(struct virtio_net *dev, struct
> vhost_virtqueue *vq,
>   struct vhost_async *async = vq->async;
>   struct async_inflight_info *pkts_info;
> 
> + if (unlikely(nr_vec == 0))
> + return -1;
> +
>   buf_addr = buf_vec[vec_idx].buf_addr;
>   buf_iova = buf_vec[vec_idx].buf_iova;
>   buf_len = buf_vec[vec_idx].buf_len;
> --
> 2.31.1



Re: [PATCH v2] vhost: fix build

2022-09-25 Thread zhoumin

Hi Chenbo,


On Mon, 26 Sep 2022, 10:57, Xia, Chenbo wrote:

Hi Min,


-Original Message-
From: Min Zhou 
Sent: Monday, August 29, 2022 4:29 PM
To: david.march...@redhat.com; maxime.coque...@redhat.com; Xia, Chenbo
; zhou...@loongson.cn
Cc: dev@dpdk.org; maob...@loongson.cn
Subject: [PATCH v2] vhost: fix build

On CentOS 8 or Debian 10.4 systems using gcc 12.1 to cross
compile DPDK, gcc shows the following warning, which will cause
the build to fail when run with -werror:

In function 'mbuf_to_desc',
 inlined from 'vhost_enqueue_async_packed'
at ../lib/vhost/virtio_net.c:1826:6,
 inlined from 'virtio_dev_rx_async_packed'
at ../lib/vhost/virtio_net.c:1840:6,
 inlined from 'virtio_dev_rx_async_submit_packed.constprop'
at ../lib/vhost/virtio_net.c:1900:7:
../lib/vhost/virtio_net.c:1161:35: error: 'buf_vec[0].buf_len' may be used
uninitialized [-Werror=maybe-uninitialized]
  1161 | buf_len = buf_vec[vec_idx].buf_len;
   |   ^~~~
../lib/vhost/virtio_net.c: In function
'virtio_dev_rx_async_submit_packed.constprop':
../lib/vhost/virtio_net.c:1838:27: note: 'buf_vec' declared here
  1838 | struct buf_vector buf_vec[BUF_VECTOR_MAX];
   |   ^~~
cc1: all warnings being treated as errors

Actually, there are eight places with the same code in the file
lib/vhost/virtio_net.c, and all these `buf_vec` arrays are
initialized by sub-function calls under various conditions.

Although it's hard to understand why gcc only emits the warning at one
of the eight places, adding validity checks for the array length is
reasonable and also fixes the warning.

Signed-off-by: David Marchand 
Signed-off-by: Min Zhou 
---
  lib/vhost/virtio_net.c | 5 -
  1 file changed, 4 insertions(+), 1 deletion(-)

Just want you to know that your patch is still pending because by accident
your fix is almost the same as a previous patch that fixes a real issue but
that patch is still in progress:

http://patchwork.dpdk.org/project/dpdk/patch/20220802004938.23670-2-cfont...@suse.de/

Thanks,
Chenbo


Thanks for your helpful reply.
I think I can drop this patch if the patch you mentioned above is
accepted.


Thanks,
Min Zhou

diff --git a/lib/vhost/virtio_net.c b/lib/vhost/virtio_net.c
index 35fa4670fd..99233f1759 100644
--- a/lib/vhost/virtio_net.c
+++ b/lib/vhost/virtio_net.c
@@ -1153,7 +1153,7 @@ mbuf_to_desc(struct virtio_net *dev, struct
vhost_virtqueue *vq,
struct virtio_net_hdr_mrg_rxbuf tmp_hdr, *hdr = NULL;
struct vhost_async *async = vq->async;

-   if (unlikely(m == NULL))
+   if (unlikely(m == NULL || nr_vec == 0))
return -1;

buf_addr = buf_vec[vec_idx].buf_addr;
@@ -2673,6 +2673,9 @@ desc_to_mbuf(struct virtio_net *dev, struct
vhost_virtqueue *vq,
struct vhost_async *async = vq->async;
struct async_inflight_info *pkts_info;

+   if (unlikely(nr_vec == 0))
+   return -1;
+
buf_addr = buf_vec[vec_idx].buf_addr;
buf_iova = buf_vec[vec_idx].buf_iova;
buf_len = buf_vec[vec_idx].buf_len;
--
2.31.1




[PATCH v2] net/iavf: fix TSO offload for tunnel case

2022-09-25 Thread Zhichao Zeng
This patch fixes the issue of tunnel TSO not being enabled, simplifies
the logic of calculating the 'Tx Buffer Size' of the data descriptor with
IPSec, and fixes the handling of mbufs whose size exceeds the Tx descriptor
hardware limit (1B-16KB), which caused malicious behavior on the NIC.

Fixes: 1e728b01120c ("net/iavf: rework Tx path")

---
v2: rework patch

Signed-off-by: Zhichao Zeng 
---
 drivers/common/iavf/iavf_osdep.h |  2 +
 drivers/net/iavf/iavf_rxtx.c | 95 +++-
 2 files changed, 59 insertions(+), 38 deletions(-)
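The fix below caps each data descriptor at (16K-1) bytes and uses DIV_ROUND_UP to count how many descriptors an mbuf chain needs, as in the new iavf_calc_pkt_desc(). A self-contained sketch of that arithmetic, with a minimal stand-in for the mbuf segment chain instead of DPDK's types:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))
#define MAX_DATA_PER_TXD 16383u /* HW limit: (16K-1)B per Tx descriptor */

/* Minimal stand-in for an mbuf segment chain. */
struct seg {
	uint32_t data_len;
	struct seg *next;
};

/* Number of Tx descriptors needed for a whole segment chain. */
static uint16_t calc_pkt_desc(const struct seg *s)
{
	uint16_t count = 0;

	for (; s != NULL; s = s->next)
		count += DIV_ROUND_UP(s->data_len, MAX_DATA_PER_TXD);
	return count;
}
```

A 40000-byte segment, for instance, needs three descriptors rather than one, which is exactly the case the old code mishandled by emitting a single oversized buffer.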

diff --git a/drivers/common/iavf/iavf_osdep.h b/drivers/common/iavf/iavf_osdep.h
index 31d3d809f9..bf1436dfc6 100644
--- a/drivers/common/iavf/iavf_osdep.h
+++ b/drivers/common/iavf/iavf_osdep.h
@@ -126,6 +126,8 @@ writeq(uint64_t value, volatile void *addr)
 #define iavf_memset(a, b, c, d) memset((a), (b), (c))
 #define iavf_memcpy(a, b, c, d) rte_memcpy((a), (b), (c))
 
+#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))
+
 #define iavf_usec_delay(x) rte_delay_us_sleep(x)
 #define iavf_msec_delay(x) iavf_usec_delay(1000 * (x))
 
diff --git a/drivers/net/iavf/iavf_rxtx.c b/drivers/net/iavf/iavf_rxtx.c
index 109ba756f8..a06d9d3da6 100644
--- a/drivers/net/iavf/iavf_rxtx.c
+++ b/drivers/net/iavf/iavf_rxtx.c
@@ -2417,7 +2417,7 @@ iavf_fill_ctx_desc_segmentation_field(volatile uint64_t 
*field,
total_length = m->pkt_len - (m->l2_len + m->l3_len + m->l4_len);
 
if (m->ol_flags & RTE_MBUF_F_TX_TUNNEL_MASK)
-   total_length -= m->outer_l3_len;
+   total_length -= m->outer_l3_len + m->outer_l2_len;
}
 
 #ifdef RTE_LIBRTE_IAVF_DEBUG_TX
@@ -2581,50 +2581,39 @@ iavf_build_data_desc_cmd_offset_fields(volatile 
uint64_t *qw1,
((uint64_t)l2tag1 << IAVF_TXD_DATA_QW1_L2TAG1_SHIFT));
 }
 
+/* HW requires that TX buffer size ranges from 1B up to (16K-1)B. */
+#define IAVF_MAX_DATA_PER_TXD \
+   (IAVF_TXD_QW1_TX_BUF_SZ_MASK >> IAVF_TXD_QW1_TX_BUF_SZ_SHIFT)
+
+/* Calculate the number of TX descriptors needed for each pkt */
+static inline uint16_t
+iavf_calc_pkt_desc(struct rte_mbuf *tx_pkt)
+{
+   struct rte_mbuf *txd = tx_pkt;
+   uint16_t count = 0;
+
+   while (txd != NULL) {
+   count += DIV_ROUND_UP(txd->data_len, IAVF_MAX_DATA_PER_TXD);
+   txd = txd->next;
+   }
+
+   return count;
+}
+
 static inline void
 iavf_fill_data_desc(volatile struct iavf_tx_desc *desc,
-   struct rte_mbuf *m, uint64_t desc_template,
-   uint16_t tlen, uint16_t ipseclen)
+   uint64_t desc_template, uint16_t buffsz,
+   uint64_t buffer_addr)
 {
-   uint32_t hdrlen = m->l2_len;
-   uint32_t bufsz = 0;
-
/* fill data descriptor qw1 from template */
desc->cmd_type_offset_bsz = desc_template;
 
-   /* set data buffer address */
-   desc->buffer_addr = rte_mbuf_data_iova(m);
-
-   /* calculate data buffer size less set header lengths */
-   if ((m->ol_flags & RTE_MBUF_F_TX_TUNNEL_MASK) &&
-   (m->ol_flags & (RTE_MBUF_F_TX_TCP_SEG |
-   RTE_MBUF_F_TX_UDP_SEG))) {
-   hdrlen += m->outer_l3_len;
-   if (m->ol_flags & RTE_MBUF_F_TX_L4_MASK)
-   hdrlen += m->l3_len + m->l4_len;
-   else
-   hdrlen += m->l3_len;
-   if (m->ol_flags & RTE_MBUF_F_TX_SEC_OFFLOAD)
-   hdrlen += ipseclen;
-   bufsz = hdrlen + tlen;
-   } else if ((m->ol_flags & RTE_MBUF_F_TX_SEC_OFFLOAD) &&
-   (m->ol_flags & (RTE_MBUF_F_TX_TCP_SEG |
-   RTE_MBUF_F_TX_UDP_SEG))) {
-   hdrlen += m->outer_l3_len + m->l3_len + ipseclen;
-   if (m->ol_flags & RTE_MBUF_F_TX_L4_MASK)
-   hdrlen += m->l4_len;
-   bufsz = hdrlen + tlen;
-
-   } else {
-   bufsz = m->data_len;
-   }
-
/* set data buffer size */
desc->cmd_type_offset_bsz |=
-   (((uint64_t)bufsz << IAVF_TXD_DATA_QW1_TX_BUF_SZ_SHIFT) &
+   (((uint64_t)buffsz << IAVF_TXD_DATA_QW1_TX_BUF_SZ_SHIFT) &
IAVF_TXD_DATA_QW1_TX_BUF_SZ_MASK);
 
-   desc->buffer_addr = rte_cpu_to_le_64(desc->buffer_addr);
+   desc->buffer_addr = rte_cpu_to_le_64(buffer_addr);
desc->cmd_type_offset_bsz = rte_cpu_to_le_64(desc->cmd_type_offset_bsz);
 }
 
@@ -2649,8 +2638,10 @@ iavf_xmit_pkts(void *tx_queue, struct rte_mbuf 
**tx_pkts, uint16_t nb_pkts)
struct iavf_tx_entry *txe_ring = txq->sw_ring;
struct iavf_tx_entry *txe, *txn;
struct rte_mbuf *mb, *mb_seg;
+   uint64_t buf_dma_addr;
uint16_t desc_idx, desc_idx_last;
uint16_t idx;
+   uint16_t slen;
 
 
/* Check if the descriptor ring needs to be cleaned. */
@@ -2689,8 +2680,14 @@ iavf_xmit_pkts(void *tx_queue, struct rte_mbuf 
**tx_pkts, uint16_t nb_pkts)

RE: [PATCH v7 2/3] timer: fix function to stop all timers

2022-09-25 Thread Naga Harish K, S V
Hi Thomas,
Did you get a chance to review this patch?
Without this patch, the periodic event timer tests for the SW timer adapter hang.

-Harish

> -Original Message-
> From: Jerin Jacob 
> Sent: Thursday, September 15, 2022 12:12 PM
> To: Naga Harish K, S V ; Thomas Monjalon
> 
> Cc: jer...@marvell.com; dev@dpdk.org; Carrillo, Erik G
> ; pbhagavat...@marvell.com;
> sthot...@marvell.com; sta...@dpdk.org
> Subject: Re: [PATCH v7 2/3] timer: fix function to stop all timers
> 
> On Wed, Sep 14, 2022 at 9:03 PM Naga Harish K S V
>  wrote:
> >
> > There is a possibility of deadlock in this API, as same spinlock is
> > tried to be acquired in nested manner.
> >
> > If the lcore that is stopping the timer is different from the lcore
> > that owns the timer, the timer list lock is acquired in timer_del(),
> > even if local_is_locked is true. Because the same lock was already
> > acquired in rte_timer_stop_all(), the thread will hang.
> >
> > This patch removes the acquisition of nested lock.
> >
> > Fixes: 821c51267bcd63a ("timer: add function to stop all timers in a
> > list")
> > Cc: sta...@dpdk.org
> >
> > Signed-off-by: Naga Harish K S V 
> > ---
> >  lib/timer/rte_timer.c | 13 -
> 
> Since this change is in lib/timer, delegating this patch to @Thomas Monjalon
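
The nested-lock deadlock described in the commit message above can be sketched as follows (all names here are invented for illustration; a plain, non-recursive lock hangs forever if re-acquired on the same thread, which is what happened when `timer_del()` took the list lock already held by `rte_timer_stop_all()`):

```c
#include <pthread.h>

/* The timer list lock: a plain, non-recursive lock. */
static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;

static void timer_del(int local_is_locked)
{
	/* The bug: the buggy path acquired list_lock here even when the
	 * caller already held it, deadlocking on the second acquisition.
	 * The fix honors local_is_locked and skips the nested lock. */
	if (!local_is_locked)
		pthread_mutex_lock(&list_lock);
	/* ... unlink the timer from the list ... */
	if (!local_is_locked)
		pthread_mutex_unlock(&list_lock);
}

int rte_timer_stop_all_sketch(void)
{
	pthread_mutex_lock(&list_lock);   /* outer acquisition            */
	timer_del(1);                     /* tell the callee lock is held */
	pthread_mutex_unlock(&list_lock);
	return 0;
}
```

With the flag honored, the outer caller holds the lock exactly once and the call returns instead of hanging.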


RE: [PATCH v2 1/2] vhost: introduce DMA vchannel unconfiguration

2022-09-25 Thread Xia, Chenbo
> -Original Message-
> From: Ding, Xuan 
> Sent: Tuesday, September 6, 2022 1:22 PM
> To: maxime.coque...@redhat.com; Xia, Chenbo 
> Cc: dev@dpdk.org; Hu, Jiayu ; He, Xingguang
> ; Yang, YvonneX ; Jiang,
> Cheng1 ; Wang, YuanX ; Ma,
> WenwuX ; Ding, Xuan 
> Subject: [PATCH v2 1/2] vhost: introduce DMA vchannel unconfiguration
> 
> From: Xuan Ding 
> 
> This patch adds a new API rte_vhost_async_dma_unconfigure() to unconfigure
> DMA vchannels in vhost async data path.
> 
> Lock protection is also added to protect DMA vchannel configuration and
> unconfiguration from concurrent calls.
> 
> Signed-off-by: Xuan Ding 
> ---
>  doc/guides/prog_guide/vhost_lib.rst|  5 ++
>  doc/guides/rel_notes/release_22_11.rst |  2 +
>  lib/vhost/rte_vhost_async.h| 17 +++
>  lib/vhost/version.map  |  3 ++
>  lib/vhost/vhost.c  | 69 --
>  5 files changed, 91 insertions(+), 5 deletions(-)
> 
> diff --git a/doc/guides/prog_guide/vhost_lib.rst b/doc/guides/prog_guide/vhost_lib.rst
> index bad4d819e1..22764cbeaa 100644
> --- a/doc/guides/prog_guide/vhost_lib.rst
> +++ b/doc/guides/prog_guide/vhost_lib.rst
> @@ -323,6 +323,11 @@ The following is an overview of some key Vhost API
> functions:
>Get device type of vDPA device, such as VDPA_DEVICE_TYPE_NET,
>VDPA_DEVICE_TYPE_BLK.
> 
> +* ``rte_vhost_async_dma_unconfigure(dma_id, vchan_id)``
> +
> +  Clear DMA vChannels finished to use. This function needs to
> +  be called after the deregisterration of async path has been finished.

Deregistration

> +
>  Vhost-user Implementations
>  --
> 
> diff --git a/doc/guides/rel_notes/release_22_11.rst b/doc/guides/rel_notes/release_22_11.rst
> index 8c021cf050..e94c006e39 100644
> --- a/doc/guides/rel_notes/release_22_11.rst
> +++ b/doc/guides/rel_notes/release_22_11.rst
> @@ -55,6 +55,8 @@ New Features
>   Also, make sure to start the actual text at the margin.
>   ===
> 
> +* **Added vhost API to unconfigure DMA vchannels.**
> +  Added an API which helps to unconfigure DMA vchannels.

Added XXX for async vhost

Overall LGTM. It seems it needs some rebasing too.

Thanks,
Chenbo

> 
>  Removed Items
>  -
> diff --git a/lib/vhost/rte_vhost_async.h b/lib/vhost/rte_vhost_async.h
> index 1db2a10124..0442e027fd 100644
> --- a/lib/vhost/rte_vhost_async.h
> +++ b/lib/vhost/rte_vhost_async.h
> @@ -266,6 +266,23 @@ rte_vhost_async_try_dequeue_burst(int vid, uint16_t queue_id,
>   struct rte_mempool *mbuf_pool, struct rte_mbuf **pkts, uint16_t
> count,
>   int *nr_inflight, int16_t dma_id, uint16_t vchan_id);
> 
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change, or be removed, without prior
> notice.
> + *
> + * Unconfigure DMA vChannels in asynchronous data path.
> + *
> + * @param dma_id
> + *  the identifier of DMA device
> + * @param vchan_id
> + *  the identifier of virtual DMA channel
> + * @return
> + *  0 on success, and -1 on failure
> + */
> +__rte_experimental
> +int
> +rte_vhost_async_dma_unconfigure(int16_t dma_id, uint16_t vchan_id);
> +
>  #ifdef __cplusplus
>  }
>  #endif
> diff --git a/lib/vhost/version.map b/lib/vhost/version.map
> index 18574346d5..013a6bcc42 100644
> --- a/lib/vhost/version.map
> +++ b/lib/vhost/version.map
> @@ -96,6 +96,9 @@ EXPERIMENTAL {
>   rte_vhost_async_try_dequeue_burst;
>   rte_vhost_driver_get_vdpa_dev_type;
>   rte_vhost_clear_queue;
> +
> + # added in 22.11
> + rte_vhost_async_dma_unconfigure;
>  };
> 
>  INTERNAL {
> diff --git a/lib/vhost/vhost.c b/lib/vhost/vhost.c
> index 60cb05a0ff..273616da11 100644
> --- a/lib/vhost/vhost.c
> +++ b/lib/vhost/vhost.c
> @@ -23,6 +23,7 @@
> 
>  struct virtio_net *vhost_devices[RTE_MAX_VHOST_DEVICE];
>  pthread_mutex_t vhost_dev_lock = PTHREAD_MUTEX_INITIALIZER;
> +static rte_spinlock_t vhost_dma_lock = RTE_SPINLOCK_INITIALIZER;
> 
>  struct vhost_vq_stats_name_off {
>   char name[RTE_VHOST_STATS_NAME_SIZE];
> @@ -1870,19 +1871,20 @@ rte_vhost_async_dma_configure(int16_t dma_id, uint16_t vchan_id)
>   void *pkts_cmpl_flag_addr;
>   uint16_t max_desc;
> 
> + rte_spinlock_lock(&vhost_dma_lock);
>   if (!rte_dma_is_valid(dma_id)) {
>   VHOST_LOG_CONFIG("dma", ERR, "DMA %d is not found.\n", dma_id);
> - return -1;
> + goto error;
>   }
> 
>   if (rte_dma_info_get(dma_id, &info) != 0) {
>   VHOST_LOG_CONFIG("dma", ERR, "Fail to get DMA %d
> information.\n", dma_id);
> - return -1;
> + goto error;
>   }
> 
>   if (vchan_id >= info.max_vchans) {
>   VHOST_LOG_CONFIG("dma", ERR, "Invalid DMA %d vChannel %u.\n",
> dma_id, vchan_id);
> - return -1;
> + goto error;
>   }
> 
>   if (!dma_copy_track[dma_id].vchans) {
> @@ -1894,7 +1896,7 @@ rte_vhost_async_dma_configure(int16_t dma_id, uint16_t vchan_id)
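
The configure/unconfigure pairing the patch introduces can be sketched as below (names, array sizes, and checks are invented; this only illustrates the single lock serializing both paths and the `goto`-style error path the diff above uses, not the real rte_vhost implementation):

```c
#include <pthread.h>

#define MAX_DMA   4
#define MAX_VCHAN 2

/* One lock guards both configure and unconfigure, so concurrent
 * calls cannot race on the shared vchannel state. */
static pthread_mutex_t dma_lock = PTHREAD_MUTEX_INITIALIZER;
static int vchan_configured[MAX_DMA][MAX_VCHAN];

int dma_configure(int dma_id, int vchan_id)
{
	int ret = -1;
	pthread_mutex_lock(&dma_lock);
	if (dma_id < 0 || dma_id >= MAX_DMA ||
	    vchan_id < 0 || vchan_id >= MAX_VCHAN)
		goto out;                 /* single error path: unlock once */
	vchan_configured[dma_id][vchan_id] = 1;
	ret = 0;
out:
	pthread_mutex_unlock(&dma_lock);
	return ret;
}

int dma_unconfigure(int dma_id, int vchan_id)
{
	int ret = -1;
	pthread_mutex_lock(&dma_lock);
	if (dma_id < 0 || dma_id >= MAX_DMA ||
	    vchan_id < 0 || vchan_id >= MAX_VCHAN ||
	    !vchan_configured[dma_id][vchan_id])
		goto out;                 /* can't unconfigure what isn't set */
	vchan_configured[dma_id][vchan_id] = 0;
	ret = 0;
out:
	pthread_mutex_unlock(&dma_lock);
	return ret;
}
```

The `goto error` rewrite in the diff has the same shape: every early `return -1` becomes a jump to one label that releases the lock, so no error path can leak it.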

RE: [PATCH v2 1/2] vhost: introduce DMA vchannel unconfiguration

2022-09-25 Thread Ding, Xuan
Hi Chenbo,

Thanks for your comments, please see replies inline.

> -Original Message-
> From: Xia, Chenbo 
> Sent: Monday, September 26, 2022 2:07 PM
> To: Ding, Xuan ; maxime.coque...@redhat.com
> Cc: dev@dpdk.org; Hu, Jiayu ; He, Xingguang
> ; Yang, YvonneX ;
> Jiang, Cheng1 ; Wang, YuanX
> ; Ma, WenwuX 
> Subject: RE: [PATCH v2 1/2] vhost: introduce DMA vchannel unconfiguration
> 
> > -Original Message-
> > From: Ding, Xuan 
> > Sent: Tuesday, September 6, 2022 1:22 PM
> > To: maxime.coque...@redhat.com; Xia, Chenbo 
> > Cc: dev@dpdk.org; Hu, Jiayu ; He, Xingguang
> > ; Yang, YvonneX ;
> > Jiang,
> > Cheng1 ; Wang, YuanX ;
> > Ma, WenwuX ; Ding, Xuan 
> > Subject: [PATCH v2 1/2] vhost: introduce DMA vchannel unconfiguration
> >
> > From: Xuan Ding 
> >
> > This patch adds a new API rte_vhost_async_dma_unconfigure() to
> > unconfigure DMA vchannels in vhost async data path.
> >
> > Lock protection is also added to protect DMA vchannel configuration
> > and unconfiguration from concurrent calls.
> >
> > Signed-off-by: Xuan Ding 
> > ---
> >  doc/guides/prog_guide/vhost_lib.rst|  5 ++
> >  doc/guides/rel_notes/release_22_11.rst |  2 +
> >  lib/vhost/rte_vhost_async.h| 17 +++
> >  lib/vhost/version.map  |  3 ++
> >  lib/vhost/vhost.c  | 69 --
> >  5 files changed, 91 insertions(+), 5 deletions(-)
> >
> > diff --git a/doc/guides/prog_guide/vhost_lib.rst b/doc/guides/prog_guide/vhost_lib.rst
> > index bad4d819e1..22764cbeaa 100644
> > --- a/doc/guides/prog_guide/vhost_lib.rst
> > +++ b/doc/guides/prog_guide/vhost_lib.rst
> > @@ -323,6 +323,11 @@ The following is an overview of some key Vhost
> > API
> > functions:
> >Get device type of vDPA device, such as VDPA_DEVICE_TYPE_NET,
> >VDPA_DEVICE_TYPE_BLK.
> >
> > +* ``rte_vhost_async_dma_unconfigure(dma_id, vchan_id)``
> > +
> > +  Clear DMA vChannels finished to use. This function needs to  be
> > + called after the deregisterration of async path has been finished.
> 
> Deregistration

Thanks for your catch.

> 
> > +
> >  Vhost-user Implementations
> >  --
> >
> > diff --git a/doc/guides/rel_notes/release_22_11.rst b/doc/guides/rel_notes/release_22_11.rst
> > index 8c021cf050..e94c006e39 100644
> > --- a/doc/guides/rel_notes/release_22_11.rst
> > +++ b/doc/guides/rel_notes/release_22_11.rst
> > @@ -55,6 +55,8 @@ New Features
> >   Also, make sure to start the actual text at the margin.
> >   ===
> >
> > +* **Added vhost API to unconfigure DMA vchannels.**
> > +  Added an API which helps to unconfigure DMA vchannels.
> 
> Added XXX for async vhost

Good idea.

> 
> Overall LGTM. It seems it needs some rebasing too.

I'm preparing v3 patch series, please see next version.

Regards,
Xuan

> 
> Thanks,
> Chenbo
> 
> >
> >  Removed Items
> >  -
> > diff --git a/lib/vhost/rte_vhost_async.h b/lib/vhost/rte_vhost_async.h
> > index 1db2a10124..0442e027fd 100644
> > --- a/lib/vhost/rte_vhost_async.h
> > +++ b/lib/vhost/rte_vhost_async.h
> > @@ -266,6 +266,23 @@ rte_vhost_async_try_dequeue_burst(int vid, uint16_t queue_id,
> > struct rte_mempool *mbuf_pool, struct rte_mbuf **pkts, uint16_t
> > count,
> > int *nr_inflight, int16_t dma_id, uint16_t vchan_id);
> >
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change, or be removed, without prior
> > notice.
> > + *
> > + * Unconfigure DMA vChannels in asynchronous data path.
> > + *
> > + * @param dma_id
> > + *  the identifier of DMA device
> > + * @param vchan_id
> > + *  the identifier of virtual DMA channel
> > + * @return
> > + *  0 on success, and -1 on failure
> > + */
> > +__rte_experimental
> > +int
> > +rte_vhost_async_dma_unconfigure(int16_t dma_id, uint16_t vchan_id);
> > +
> >  #ifdef __cplusplus
> >  }
> >  #endif
> > diff --git a/lib/vhost/version.map b/lib/vhost/version.map
> > index 18574346d5..013a6bcc42 100644
> > --- a/lib/vhost/version.map
> > +++ b/lib/vhost/version.map
> > @@ -96,6 +96,9 @@ EXPERIMENTAL {
> > rte_vhost_async_try_dequeue_burst;
> > rte_vhost_driver_get_vdpa_dev_type;
> > rte_vhost_clear_queue;
> > +
> > +   # added in 22.11
> > +   rte_vhost_async_dma_unconfigure;
> >  };
> >
> >  INTERNAL {
> > diff --git a/lib/vhost/vhost.c b/lib/vhost/vhost.c
> > index 60cb05a0ff..273616da11 100644
> > --- a/lib/vhost/vhost.c
> > +++ b/lib/vhost/vhost.c
> > @@ -23,6 +23,7 @@
> >
> >  struct virtio_net *vhost_devices[RTE_MAX_VHOST_DEVICE];
> >  pthread_mutex_t vhost_dev_lock = PTHREAD_MUTEX_INITIALIZER;
> > +static rte_spinlock_t vhost_dma_lock = RTE_SPINLOCK_INITIALIZER;
> >
> >  struct vhost_vq_stats_name_off {
> > 	char name[RTE_VHOST_STATS_NAME_SIZE];
> > @@ -1870,19 +1871,20 @@ rte_vhost_async_dma_configure(int16_t dma_id, uint16_t vchan_id)
> > void *pkts_cmpl_flag_addr;
> > uint16_t max_desc;
> >
> > +   rte_spinlock_lock(&vhost_dma

RE: [PATCH v3] net/vhost: support asynchronous data path

2022-09-25 Thread Xia, Chenbo
> -Original Message-
> From: Wang, YuanX 
> Sent: Wednesday, August 24, 2022 12:36 AM
> To: maxime.coque...@redhat.com; Xia, Chenbo ;
> dev@dpdk.org
> Cc: Hu, Jiayu ; He, Xingguang ;
> Jiang, Cheng1 ; Wang, YuanX ;
> Ma, WenwuX 
> Subject: [PATCH v3] net/vhost: support asynchronous data path

By how much will this patch impact the sync data path performance?
It should be minor, right?

Maxime, should we plan to remove the vhost example now? Maintaining a vhost
example and a PMD that provide the same functionality does not make sense to me.
We should send a deprecation notice in 22.11 and also get the tech board's
opinion.

> 
> Vhost asynchronous data-path offloads packet copy from the CPU
> to the DMA engine. As a result, large packet copy can be accelerated
> by the DMA engine, and vhost can free CPU cycles for higher level
> functions.
> 
> In this patch, we enable asynchronous data-path for vhostpmd.
> Asynchronous data path is enabled per tx/rx queue, and users need
> to specify the DMA device used by the tx/rx queue. Each tx/rx queue
> supports using only one DMA device, but one DMA device can be shared
> among multiple tx/rx queues of different vhostpmd ports.

Vhostpmd -> vhost PMD

> 
> Two PMD parameters are added:
> - dmas:   specify the used DMA device for a tx/rx queue.
>   (Default: no queues enable asynchronous data path)
> - dma-ring-size: DMA ring size.
>   (Default: 4096).
> 
> Here is an example:
> --vdev 'eth_vhost0,iface=./s0,dmas=[txq0@:00.01.0;rxq0@:00.01.1],dma-ring-size=4096'
> 
> Signed-off-by: Jiayu Hu 
> Signed-off-by: Yuan Wang 
> Signed-off-by: Wenwu Ma 
> ---
> v3:
> - add the API to version.map
> 
> v2:
> - add missing file
> - hide async_tx_poll_completed
> - change default DMA ring size to 4096
> ---
>  drivers/net/vhost/meson.build |   1 +
>  drivers/net/vhost/rte_eth_vhost.c | 494 --
>  drivers/net/vhost/rte_eth_vhost.h |  15 +
>  drivers/net/vhost/version.map |   7 +
>  drivers/net/vhost/vhost_testpmd.c |  65 
>  5 files changed, 549 insertions(+), 33 deletions(-)
>  create mode 100644 drivers/net/vhost/vhost_testpmd.c
> 
> diff --git a/drivers/net/vhost/meson.build b/drivers/net/vhost/meson.build
> index f481a3a4b8..22a0ab3a58 100644
> --- a/drivers/net/vhost/meson.build
> +++ b/drivers/net/vhost/meson.build
> @@ -9,4 +9,5 @@ endif
> 
>  deps += 'vhost'
>  sources = files('rte_eth_vhost.c')
> +testpmd_sources = files('vhost_testpmd.c')
>  headers = files('rte_eth_vhost.h')
> diff --git a/drivers/net/vhost/rte_eth_vhost.c b/drivers/net/vhost/rte_eth_vhost.c
> index 7e512d94bf..aa069c6b68 100644
> --- a/drivers/net/vhost/rte_eth_vhost.c
> +++ b/drivers/net/vhost/rte_eth_vhost.c
> @@ -17,6 +17,8 @@
>  #include 
>  #include 
>  #include 
> +#include 
> +#include 
> 
>  #include "rte_eth_vhost.h"
> 
> @@ -36,8 +38,13 @@ enum {VIRTIO_RXQ, VIRTIO_TXQ, VIRTIO_QNUM};
>  #define ETH_VHOST_LINEAR_BUF "linear-buffer"
>  #define ETH_VHOST_EXT_BUF"ext-buffer"
>  #define ETH_VHOST_LEGACY_OL_FLAGS"legacy-ol-flags"
> +#define ETH_VHOST_DMA_ARG"dmas"
> +#define ETH_VHOST_DMA_RING_SIZE  "dma-ring-size"
>  #define VHOST_MAX_PKT_BURST 32
> 
> +#define INVALID_DMA_ID   -1
> +#define DEFAULT_DMA_RING_SIZE4096
> +
>  static const char *valid_arguments[] = {
>   ETH_VHOST_IFACE_ARG,
>   ETH_VHOST_QUEUES_ARG,
> @@ -48,6 +55,8 @@ static const char *valid_arguments[] = {
>   ETH_VHOST_LINEAR_BUF,
>   ETH_VHOST_EXT_BUF,
>   ETH_VHOST_LEGACY_OL_FLAGS,
> + ETH_VHOST_DMA_ARG,
> + ETH_VHOST_DMA_RING_SIZE,
>   NULL
>  };
> 
> @@ -79,8 +88,39 @@ struct vhost_queue {
>   struct vhost_stats stats;
>   int intr_enable;
>   rte_spinlock_t intr_lock;
> +
> + /* Flag of enabling async data path */
> + bool async_register;
> + /* DMA device ID */
> + int16_t dma_id;
> + /**
> +  * For a Rx queue, "txq" points to its peer Tx queue.
> +  * For a Tx queue, "txq" is never used.
> +  */
> + struct vhost_queue *txq;
> + /* Array to keep DMA completed packets */
> + struct rte_mbuf *cmpl_pkts[VHOST_MAX_PKT_BURST];
>  };
> 
> +struct dma_input_info {
> + int16_t dmas[RTE_MAX_QUEUES_PER_PORT * 2];
> + uint16_t dma_ring_size;
> +};
> +
> +static int16_t configured_dmas[RTE_DMADEV_DEFAULT_MAX];
> +static int dma_count;
> +
> +/**
> + * By default, it is the Rx path that calls rte_vhost_poll_enqueue_completed()
> + * for enqueue operations. However, the Rx function is never called in testpmd
> + * "txonly" mode, so virtio cannot receive DMA-completed packets. To make
> + * txonly mode work correctly, we provide a command in testpmd to call
> + * rte_vhost_poll_enqueue_completed() in the Tx path.
> + *
> + * When async_tx_poll_completed is set to true, the Tx path calls
> + * rte_vhost_poll_enqueue_completed(); otherwise, the Rx path calls it.
> + */
> +bool async_tx_poll_completed;
> +
>  struct pmd_internal {
>  

RE: [PATCH v2] vhost: use dedicated variable for vhost message result code

2022-09-25 Thread Xia, Chenbo
> -Original Message-
> From: Pei, Andy 
> Sent: Friday, September 23, 2022 10:33 AM
> To: dev@dpdk.org
> Cc: Xia, Chenbo ; maxime.coque...@redhat.com
> Subject: [PATCH v2] vhost: use dedicated variable for vhost message result
> code
> 
> Currently, in the function vhost_user_msg_handler, the variable ret is used
> to store both the vhost message result code and function call return values.
> After this patch, ret is used only to store function call return values, and
> a new dedicated variable, msg_result, is used to store the vhost message
> result. This improves readability.
> 
> Signed-off-by: Andy Pei 
> ---
>  lib/vhost/vhost_user.c | 24 
>  1 file changed, 12 insertions(+), 12 deletions(-)
> 
> diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c
> index 0182090..6d93495 100644
> --- a/lib/vhost/vhost_user.c
> +++ b/lib/vhost/vhost_user.c
> @@ -2954,6 +2954,7 @@ static int is_vring_iotlb(struct virtio_net *dev,
>   struct vhu_msg_context ctx;
>   vhost_message_handler_t *msg_handler;
>   struct rte_vdpa_device *vdpa_dev;
> + int msg_result = RTE_VHOST_MSG_RESULT_OK;
>   int ret;
>   int unlock_required = 0;
>   bool handled;
> @@ -3046,8 +3047,8 @@ static int is_vring_iotlb(struct virtio_net *dev,
>   handled = false;
>   if (dev->extern_ops.pre_msg_handle) {
>   RTE_BUILD_BUG_ON(offsetof(struct vhu_msg_context, msg) != 0);
> - ret = (*dev->extern_ops.pre_msg_handle)(dev->vid, &ctx);
> - switch (ret) {
> + msg_result = (*dev->extern_ops.pre_msg_handle)(dev->vid, &ctx);
> + switch (msg_result) {
>   case RTE_VHOST_MSG_RESULT_REPLY:
>   send_vhost_reply(dev, fd, &ctx);
>   /* Fall-through */
> @@ -3065,12 +3066,12 @@ static int is_vring_iotlb(struct virtio_net *dev,
>   goto skip_to_post_handle;
> 
>   if (!msg_handler->accepts_fd && validate_msg_fds(dev, &ctx, 0) != 0)
> {
> - ret = RTE_VHOST_MSG_RESULT_ERR;
> + msg_result = RTE_VHOST_MSG_RESULT_ERR;
>   } else {
> - ret = msg_handler->callback(&dev, &ctx, fd);
> + msg_result = msg_handler->callback(&dev, &ctx, fd);
>   }
> 
> - switch (ret) {
> + switch (msg_result) {
>   case RTE_VHOST_MSG_RESULT_ERR:
>   VHOST_LOG_CONFIG(dev->ifname, ERR,
>   "processing %s failed.\n",
> @@ -3095,11 +3096,11 @@ static int is_vring_iotlb(struct virtio_net *dev,
>   }
> 
>  skip_to_post_handle:
> - if (ret != RTE_VHOST_MSG_RESULT_ERR &&
> + if (msg_result != RTE_VHOST_MSG_RESULT_ERR &&
>   dev->extern_ops.post_msg_handle) {
>   RTE_BUILD_BUG_ON(offsetof(struct vhu_msg_context, msg) != 0);
> - ret = (*dev->extern_ops.post_msg_handle)(dev->vid, &ctx);
> - switch (ret) {
> + msg_result = (*dev->extern_ops.post_msg_handle)(dev->vid,
> &ctx);
> + switch (msg_result) {
>   case RTE_VHOST_MSG_RESULT_REPLY:
>   send_vhost_reply(dev, fd, &ctx);
>   /* Fall-through */
> @@ -3118,7 +3119,7 @@ static int is_vring_iotlb(struct virtio_net *dev,
>   "vhost message (req: %d) was not handled.\n",
>   request);
>   close_msg_fds(&ctx);
> - ret = RTE_VHOST_MSG_RESULT_ERR;
> + msg_result = RTE_VHOST_MSG_RESULT_ERR;
>   }
> 
>   /*
> @@ -3127,17 +3128,16 @@ static int is_vring_iotlb(struct virtio_net *dev,
>* VHOST_USER_NEED_REPLY was cleared in send_vhost_reply().
>*/
>   if (ctx.msg.flags & VHOST_USER_NEED_REPLY) {
> - ctx.msg.payload.u64 = ret == RTE_VHOST_MSG_RESULT_ERR;
> + ctx.msg.payload.u64 = msg_result == RTE_VHOST_MSG_RESULT_ERR;
>   ctx.msg.size = sizeof(ctx.msg.payload.u64);
>   ctx.fd_num = 0;
>   send_vhost_reply(dev, fd, &ctx);
> - } else if (ret == RTE_VHOST_MSG_RESULT_ERR) {
> + } else if (msg_result == RTE_VHOST_MSG_RESULT_ERR) {
>   VHOST_LOG_CONFIG(dev->ifname, ERR, "vhost message handling
> failed.\n");
>   ret = -1;
>   goto unlock;
>   }
> 
> - ret = 0;
>   for (i = 0; i < dev->nr_vring; i++) {
>   struct vhost_virtqueue *vq = dev->virtqueue[i];
>   bool cur_ready = vq_is_ready(dev, vq);
> --
> 1.8.3.1

Reviewed-by: Chenbo Xia