[PATCH v3] app/testpmd: fix incorrect queues state of secondary process

2022-09-06 Thread Peng Zhang
The primary process sets up the queue states correctly when starting a port,
but the secondary process does not. In a multi-process scenario, the
"stream_init" function therefore gets the wrong queue states in the secondary
process.

This commit gets the queue states from ethdev, which is located in
shared memory.

Fixes: 3c4426db54fc ("app/testpmd: do not poll stopped queues")
Cc: sta...@dpdk.org

Signed-off-by: Peng Zhang 

---
 v3:
 - Modify the parameter of rx or tx queue state array 
 v2:
 - Change the way of getting secondary process queues states
---
 app/test-pmd/testpmd.c | 22 +++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index addcbcac85..977ec4fa28 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -75,6 +75,8 @@
 
 #include "testpmd.h"
 
+#include 
+
 #ifndef MAP_HUGETLB
 /* FreeBSD may not have MAP_HUGETLB (in fact, it probably doesn't) */
 #define HUGE_FLAG (0x4)
@@ -2402,10 +2404,24 @@ start_packet_forwarding(int with_tx_first)
if (!pkt_fwd_shared_rxq_check())
return;
 
-   if (stream_init != NULL)
-   for (i = 0; i < cur_fwd_config.nb_fwd_streams; i++)
-   stream_init(fwd_streams[i]);
+   if (stream_init != NULL) {
+   for (i = 0; i < cur_fwd_config.nb_fwd_streams; i++) {
+   if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
+   struct fwd_stream *fs = fwd_streams[i];
+   struct rte_eth_dev_data *dev_rx_data, *dev_tx_data;
+
+   dev_rx_data = (&rte_eth_devices[fs->rx_port])->data;
+   dev_tx_data = (&rte_eth_devices[fs->tx_port])->data;
+
+   uint8_t rx_state = dev_rx_data->rx_queue_state[fs->rx_queue];
+   ports[fs->rx_port].rxq[fs->rx_queue].state = rx_state;
+   uint8_t tx_state = dev_tx_data->tx_queue_state[fs->tx_queue];
+   ports[fs->tx_port].txq[fs->tx_queue].state = tx_state;
+   }
 
+   stream_init(fwd_streams[i]);
+   }
+   }
port_fwd_begin = cur_fwd_config.fwd_eng->port_fwd_begin;
if (port_fwd_begin != NULL) {
for (i = 0; i < cur_fwd_config.nb_fwd_ports; i++) {
-- 
2.25.1



RE: [PATCH v3 3/5] net/iavf: support flow subscrption pattern

2022-09-06 Thread Zhang, Qi Z



> -Original Message-
> From: Wang, Jie1X 
> Sent: Wednesday, August 31, 2022 2:05 AM
> To: dev@dpdk.org
> Cc: Yang, Qiming ; Zhang, Qi Z
> ; Wu, Jingjing ; Xing, Beilei
> ; Yang, SteveX ; Wang, Jie1X
> 
> Subject: [PATCH v3 3/5] net/iavf: support flow subscrption pattern
> 
...
> +static int
> +iavf_fsub_parse_action(struct iavf_adapter *ad,
> +const struct rte_flow_action *actions,
> +uint32_t priority,
> +struct rte_flow_error *error,
> +struct iavf_fsub_conf *filter)
>  {
> + const struct rte_flow_action *action;
> + const struct rte_flow_action_ethdev *act_ethdev;
> + const struct rte_flow_action_queue *act_q;
> + const struct rte_flow_action_rss *act_qgrop;
> + struct virtchnl_filter_action *filter_action;
> + uint16_t valid_qgrop_number[MAX_QGRP_NUM_TYPE] = {
> + 2, 4, 8, 16, 32, 64, 128};
> + uint16_t i, num = 0, dest_num = 0, vf_num = 0;
> + uint16_t rule_port_id;
> +
> + for (action = actions; action->type !=
> + RTE_FLOW_ACTION_TYPE_END; action++) {
> + switch (action->type) {
> + case RTE_FLOW_ACTION_TYPE_VOID:
> + break;
> +
> + case RTE_FLOW_ACTION_TYPE_REPRESENTED_PORT:

Should be RTE_FLOW_ACTION_TYPE_PORT_REPRESENTOR, as the traffic is expected to be
sent to the given ethdev.




[PATCH v2] net/pcap: fix timeout of stopping device

2022-09-06 Thread Yiding Zhou
The pcap file is synchronized to disk when the device is stopped.
This takes a long time if the file is large, which can cause the
'detach sync request' to time out when the device is closed in a
multi-process scenario.

This commit fixes the issue by using an alarm handler to release the dumper.

Fixes: 0ecfb6c04d54 ("net/pcap: move handler to process private")
Cc: sta...@dpdk.org

Signed-off-by: Yiding Zhou 

---
v2: use alarm handler to release dumper
---
 drivers/net/pcap/pcap_ethdev.c | 22 +-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/drivers/net/pcap/pcap_ethdev.c b/drivers/net/pcap/pcap_ethdev.c
index ec29fd6bc5..5c643a0277 100644
--- a/drivers/net/pcap/pcap_ethdev.c
+++ b/drivers/net/pcap/pcap_ethdev.c
@@ -17,6 +17,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "pcap_osdep.h"
 
@@ -664,6 +665,25 @@ eth_dev_start(struct rte_eth_dev *dev)
return 0;
 }
 
+static void eth_pcap_dumper_release(void *arg)
+{
+   pcap_dump_close((pcap_dumper_t *)arg);
+}
+
+static void
+eth_pcap_dumper_close(pcap_dumper_t *dumper)
+{
+   if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+   /*
+* Delay 30 seconds before releasing dumper to wait for file sync
+* to complete to avoid blocking alarm thread in PRIMARY process
+*/
+   rte_eal_alarm_set(3000, eth_pcap_dumper_release, dumper);
+   } else {
+   rte_eal_alarm_set(1, eth_pcap_dumper_release, dumper);
+   }
+}
+
 /*
  * This function gets called when the current port gets stopped.
  * Is the only place for us to close all the tx streams dumpers.
@@ -689,7 +709,7 @@ eth_dev_stop(struct rte_eth_dev *dev)
 
for (i = 0; i < dev->data->nb_tx_queues; i++) {
if (pp->tx_dumper[i] != NULL) {
-   pcap_dump_close(pp->tx_dumper[i]);
+   eth_pcap_dumper_close(pp->tx_dumper[i]);
pp->tx_dumper[i] = NULL;
}
 
-- 
2.34.1
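
A note on units for the patch above: rte_eal_alarm_set() takes its delay in
microseconds, so the value 3000 corresponds to 3 ms, not the 30 seconds the
comment describes. If a 30 second delay is what is intended (an assumption, as
the patch does not say which of the two is correct), the conversion could be
sketched as follows. The helper name dumper_delay_us and the US_PER_SECOND
constant are hypothetical and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Microseconds per second; the alarm API expects its delay in microseconds. */
#define US_PER_SECOND UINT64_C(1000000)

/* Hypothetical helper (not in the patch): whole seconds to microseconds. */
static inline uint64_t
dumper_delay_us(uint64_t seconds)
{
	/* Reject inputs that would overflow the 64-bit result. */
	assert(seconds <= UINT64_MAX / US_PER_SECOND);
	return seconds * US_PER_SECOND;
}
```

With such a helper, the primary-process call would pass dumper_delay_us(30)
as the first argument instead of the literal 3000.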



RE: [PATCH v4 2/4] event/sw: report periodic event timer capability

2022-09-06 Thread Van Haaren, Harry
> -Original Message-
> From: Naga Harish K, S V 
> Sent: Friday, August 12, 2022 5:08 PM
> To: Carrillo, Erik G ; jer...@marvell.com; Van 
> Haaren,
> Harry 
> Cc: dev@dpdk.org
> Subject: [PATCH v4 2/4] event/sw: report periodic event timer capability
> 
> update the software eventdev pmd timer_adapter_caps_get
> callback function to report the support of periodic
> event timer capability
> 
> Signed-off-by: Naga Harish K S V 

Thanks for explaining how things work on the v2 & follow-up reworks;
Acked-by: Harry van Haaren 


[PATCH v3] net/i40e: fix single VLAN cannot work normal

2022-09-06 Thread Kevin Liu
After QinQ is disabled, single VLAN does not work normally.
The reason is that QinQ is not disabled correctly.

Before configuring QinQ, the MAC/VLAN filters of all ports need to be
backed up and cleaned. After configuring QinQ, the MAC/VLAN filters of
all ports are restored. When disabling QinQ, valid_flags needs to be
set to 0x0008 and first_tag to 0x88a8.

Fixes: 38e9762be16a ("net/i40e: add outer VLAN processing")
Signed-off-by: Kevin Liu 
Tested-by: Jiale Song 
---
v2: refine code
---
v3: refine code
---
 doc/guides/nics/i40e.rst   |   1 -
 drivers/net/i40e/i40e_ethdev.c | 147 ++---
 2 files changed, 100 insertions(+), 48 deletions(-)

diff --git a/doc/guides/nics/i40e.rst b/doc/guides/nics/i40e.rst
index abb99406b3..15b796e67a 100644
--- a/doc/guides/nics/i40e.rst
+++ b/doc/guides/nics/i40e.rst
@@ -983,7 +983,6 @@ If FW version >= 8.4, there'll be some Vlan related issues:
 
 #. TCI input set for QinQ  is invalid.
 #. Fail to configure TPID for QinQ.
-#. Need to enable QinQ before enabling Vlan filter.
 #. Fail to strip outer Vlan.
 
 Example of getting best performance with l3fwd example
diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 67d79de08d..cf327ed576 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -1650,7 +1650,8 @@ eth_i40e_dev_init(struct rte_eth_dev *dev, void *init_params __rte_unused)
vsi = pf->main_vsi;
 
/* Disable double vlan by default */
-   i40e_vsi_config_double_vlan(vsi, FALSE);
+   if (!pf->fw8_3gt)
+   i40e_vsi_config_double_vlan(vsi, FALSE);
 
/* Disable S-TAG identification when floating_veb is disabled */
if (!pf->floating_veb) {
@@ -3909,7 +3910,6 @@ i40e_vlan_tpid_set(struct rte_eth_dev *dev,
struct i40e_pf *pf = I40E_DEV_PRIVATE_TO_PF(dev->data->dev_private);
int qinq = dev->data->dev_conf.rxmode.offloads &
   RTE_ETH_RX_OFFLOAD_VLAN_EXTEND;
-   u16 sw_flags = 0, valid_flags = 0;
int ret = 0;
 
if ((vlan_type != RTE_ETH_VLAN_TYPE_INNER &&
@@ -3928,10 +3928,6 @@ i40e_vlan_tpid_set(struct rte_eth_dev *dev,
/* 802.1ad frames ability is added in NVM API 1.7*/
if (hw->flags & I40E_HW_FLAG_802_1AD_CAPABLE) {
if (qinq) {
-   if (pf->fw8_3gt) {
-   sw_flags = I40E_AQ_SET_SWITCH_CFG_OUTER_VLAN;
-   valid_flags = I40E_AQ_SET_SWITCH_CFG_OUTER_VLAN;
-   }
if (vlan_type == RTE_ETH_VLAN_TYPE_OUTER)
hw->first_tag = rte_cpu_to_le_16(tpid);
else if (vlan_type == RTE_ETH_VLAN_TYPE_INNER)
@@ -3940,8 +3936,8 @@ i40e_vlan_tpid_set(struct rte_eth_dev *dev,
if (vlan_type == RTE_ETH_VLAN_TYPE_OUTER)
hw->second_tag = rte_cpu_to_le_16(tpid);
}
-   ret = i40e_aq_set_switch_config(hw, sw_flags,
-   valid_flags, 0, NULL);
+   ret = i40e_aq_set_switch_config(hw, 0,
+   0, 0, NULL);
if (ret != I40E_SUCCESS) {
PMD_DRV_LOG(ERR,
"Set switch config failed aq_err: %d",
@@ -3993,11 +3989,15 @@ static int
 i40e_vlan_offload_set(struct rte_eth_dev *dev, int mask)
 {
struct i40e_pf *pf = I40E_DEV_PRIVATE_TO_PF(dev->data->dev_private);
+   struct i40e_mac_filter_info *vmac_filter[RTE_MAX_ETHPORTS];
+   struct i40e_vsi *vvsi[RTE_MAX_ETHPORTS];
struct i40e_mac_filter_info *mac_filter;
struct i40e_vsi *vsi = pf->main_vsi;
struct rte_eth_rxmode *rxmode;
+   int vnum[RTE_MAX_ETHPORTS];
struct i40e_mac_filter *f;
-   int i, num;
+   int port_num = 0;
+   int i, num, j;
void *temp;
int ret;
 
@@ -4018,50 +4018,75 @@ i40e_vlan_offload_set(struct rte_eth_dev *dev, int mask)
}
 
if (mask & RTE_ETH_VLAN_EXTEND_MASK) {
-   i = 0;
-   num = vsi->mac_num;
-   mac_filter = rte_zmalloc("mac_filter_info_data",
-num * sizeof(*mac_filter), 0);
-   if (mac_filter == NULL) {
-   PMD_DRV_LOG(ERR, "failed to allocate memory");
-   return I40E_ERR_NO_MEMORY;
-   }
-
-   /*
-* Outer VLAN processing is supported after firmware v8.4, kernel driver
-* also change the default behavior to support this feature. To align with
-* kernel driver, set switch config in 'i40e_vlan_tpie_set' to support for
-* outer VLAN processing. But it is forbidden for firmware to change the
-* Inner/Outer VLAN configuration while there are MAC/VLAN filters in the
-* switch table. Therefore, we 

RE: [PATCH v5 08/27] eal: deprecate RTE_FUNC_PTR_* macros

2022-09-06 Thread Jayatheerthan, Jay
> -Original Message-
> From: David Marchand 
> Sent: Monday, September 5, 2022 2:05 PM
> To: dev@dpdk.org
> Cc: tho...@monjalon.net; Richardson, Bruce ; Ray 
> Kinsella ; Zhang, Roy Fan
> ; Ashish Gupta ; Yang, 
> Qiming ; Wu, Wenjun1
> ; Shijith Thotton ; 
> Srisivasubramanian Srinivasan ; Xu,
> Rosen ; Zhang, Tianfei ; Sachin 
> Saxena ; Hemant Agrawal
> ; Akhil Goyal ; Chengwen Feng 
> ; Laatz, Kevin
> ; Ferruh Yigit ; Andrew 
> Rybchenko ; Gujjar,
> Abhinandan S ; Jerin Jacob ; 
> Jayatheerthan, Jay ;
> Matz, Olivier ; Ori Kam ; Maxime 
> Coquelin ; Xia, Chenbo
> 
> Subject: [PATCH v5 08/27] eal: deprecate RTE_FUNC_PTR_* macros
> 
> Those macros have no real value and are easily replaced with a simple
> if() block.
> 
> Existing users have been converted using a new cocci script.
> Deprecate them.
> 
> Signed-off-by: David Marchand 
> ---
>  devtools/cocci/func_or_ret.cocci  |  12 +
>  doc/guides/rel_notes/deprecation.rst  |   4 +
>  doc/guides/rel_notes/release_22_11.rst|   4 +
>  drivers/common/qat/qat_device.c   |   8 +-
>  drivers/common/qat/qat_qp.c   |  31 +-
>  drivers/compress/qat/qat_comp_pmd.c   |   4 +-
>  .../scheduler/rte_cryptodev_scheduler.c   |   6 +-
>  drivers/crypto/scheduler/scheduler_pmd_ops.c  |   6 +-
>  drivers/net/ixgbe/rte_pmd_ixgbe.c |   3 +-
>  drivers/net/liquidio/lio_ethdev.c |   3 +-
>  drivers/raw/ifpga/ifpga_rawdev.c  |   6 +-
>  drivers/raw/skeleton/skeleton_rawdev.c|  21 +-
>  lib/compressdev/rte_compressdev.c |  47 +--
>  lib/cryptodev/rte_cryptodev.c |  43 ++-
>  lib/dmadev/rte_dmadev.c   |  21 +-
>  lib/dmadev/rte_dmadev.h   |  21 +-
>  lib/eal/include/rte_dev.h |   7 +-
>  lib/ethdev/ethdev_driver.c|  18 +-
>  lib/ethdev/ethdev_pci.h   |   3 +-
>  lib/ethdev/rte_ethdev.c   | 276 --
>  lib/ethdev/rte_ethdev.h   |   9 +-
>  lib/eventdev/rte_event_crypto_adapter.c   |  10 +-
>  lib/eventdev/rte_event_eth_rx_adapter.c   |  13 +-

Looks good to me.

Acked-by: Jay Jayatheerthan 

>  lib/eventdev/rte_eventdev.c   |  62 ++--
>  lib/mempool/rte_mempool_ops.c |   3 +-
>  lib/rawdev/rte_rawdev.c   |  75 +++--
>  lib/regexdev/rte_regexdev.c   |  59 ++--
>  lib/regexdev/rte_regexdev.h   |   6 +-
>  lib/security/rte_security.c   |   6 +-
>  lib/vhost/vdpa.c  |   9 +-
>  lib/vhost/vhost_user.c|   6 +-
>  31 files changed, 517 insertions(+), 285 deletions(-)
>  create mode 100644 devtools/cocci/func_or_ret.cocci
> 
> diff --git a/devtools/cocci/func_or_ret.cocci b/devtools/cocci/func_or_ret.cocci
> new file mode 100644
> index 00..f23d60cc4e
> --- /dev/null
> +++ b/devtools/cocci/func_or_ret.cocci
> @@ -0,0 +1,12 @@
> +@@
> +expression cond, ret;
> +@@
> +-RTE_FUNC_PTR_OR_ERR_RET(cond, ret);
> ++if (cond == NULL)
> ++return ret;
> +@@
> +expression cond;
> +@@
> +-RTE_FUNC_PTR_OR_RET(cond);
> ++if (cond == NULL)
> ++return;
> diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
> index dba252067c..5b4ffc992d 100644
> --- a/doc/guides/rel_notes/deprecation.rst
> +++ b/doc/guides/rel_notes/deprecation.rst
> @@ -14,6 +14,10 @@ Deprecation Notices
>  * kvargs: The function ``rte_kvargs_process`` will get a new parameter
>for returning key match count. It will ease handling of no-match case.
> 
> +* eal: RTE_FUNC_PTR_OR_* macros have been marked deprecated and will be removed
> +  in the future. Applications can use ``devtools/cocci/func_or_ret.cocci``
> +  to update their code.
> +
>  * eal: The function ``rte_eal_remote_launch`` will return new error codes
>after read or write error on the pipe, instead of calling ``rte_panic``.
> 
> diff --git a/doc/guides/rel_notes/release_22_11.rst b/doc/guides/rel_notes/release_22_11.rst
> index 3cea3aa8eb..225a380de0 100644
> --- a/doc/guides/rel_notes/release_22_11.rst
> +++ b/doc/guides/rel_notes/release_22_11.rst
> @@ -84,6 +84,10 @@ API Changes
> Also, make sure to start the actual text at the margin.
> ===
> 
> +* eal: RTE_FUNC_PTR_OR_* macros have been marked deprecated and will be removed
> +  in the future. Applications can use ``devtools/cocci/func_or_ret.cocci``
> +  to update their code.
> +
>  * raw/ifgpa: The function ``rte_pmd_ifpga_get_pci_bus`` has been removed.
> 
> 
> diff --git a/drivers/common/qat/qat_device.c b/drivers/common/qat/qat_device.c
> index db4b087d2b..30e5cdb573 100644
> --- a/drivers/common/qat/qat_device.c
> +++ b/drivers/common/qat/qat_device.c
> @@ -58,8 +58,8 @@ qat_pci_get_extra_size(enum qat_device_gen qat_dev_gen)
>  
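
For readers skimming the quoted cocci script, the conversion it performs can be
sketched in plain C. The macro FUNC_PTR_OR_ERR_RET below is a local stand-in
with an assumed shape, defined only for this illustration; dev_start_t,
start_with_macro, start_with_if and dummy_start are likewise hypothetical
names.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

typedef int (*dev_start_t)(void);

/* Local stand-in for the deprecated macro (assumed shape, illustration only). */
#define FUNC_PTR_OR_ERR_RET(func, retval) do { \
	if ((func) == NULL) \
		return retval; \
} while (0)

/* Before: the early return is hidden inside the macro. */
static int
start_with_macro(dev_start_t start)
{
	FUNC_PTR_OR_ERR_RET(start, -ENOTSUP);
	return start();
}

/* After: the same check written as a plain if() block. */
static int
start_with_if(dev_start_t start)
{
	if (start == NULL)
		return -ENOTSUP;
	return start();
}

/* Trivial callback used to exercise the non-NULL path. */
static int
dummy_start(void)
{
	return 0;
}
```

Both variants behave the same; the second one simply makes the control flow
visible at the call site.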

RE: [PATCH v7 03/12] net/nfp: move app specific init logic to own function

2022-09-06 Thread Chaoyong He


> -Original Message-
> From: Ferruh Yigit 
> Sent: Monday, September 5, 2022 11:39 PM
> To: Chaoyong He ; dev@dpdk.org
> Cc: oss-drivers ; Niklas Soderlund
> 
> Subject: Re: [PATCH v7 03/12] net/nfp: move app specific init logic to own
> function
> 
> On 8/12/2022 11:22 AM, Chaoyong He wrote:
> > The NFP card can load different firmware applications.
> > This commit move the init logic of corenic app of the secondary
> > process into its own function.
> >
> > Signed-off-by: Chaoyong He 
> > Reviewed-by: Niklas Söderlund 
> 
> <...>
> 
> > +   switch (app_id) {
> > +   case NFP_APP_CORE_NIC:
> > +   PMD_INIT_LOG(INFO, "Initializing coreNIC");
> > +   ret = nfp_secondary_init_app_nic(pci_dev, sym_tbl, cpp);
> > +   if (ret != 0) {
> > +   PMD_INIT_LOG(ERR, "Could not initialize coreNIC!");
> > +   goto sym_tbl_cleanup;
> > }
> 
> If you are planning to add more FW app support, what do you think to add
> another abstraction for it? Something like
> 
> struct fw_ops {
>   *init()
>   *secondary_init()
>   ...
> }
> 
>   ...
>   ret = fw_ops[app_id].secondary_init(...);
>   ...
> 

It does make sense to translate this switch statement into an array of
function pointers, but there are some problems:
1. The `app_id` is returned by the firmware, so we can't simply treat it as an
   index into the array. We would need some mapping logic. We also can't find
   a suitable upper limit for the array, so we would probably have to keep
   updating it as more firmware apps are added in the future. We would also
   have to check the value of `fw_ops[app_id].secondary_init` before invoking
   it.
2. Different firmware apps may need different variables to initialize, which
   makes it difficult to find a suitable function prototype when declaring
   the function pointer.

So the final logic seems like it would be more complicated. Maybe we can keep
the current logic?
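
To make the trade-off concrete, here is a minimal sketch of what the mapping
logic could look like, assuming a table keyed by ID rather than indexed by it.
All names and ID values here (app_fw_id, fw_ops_table, app_fw_secondary_init,
the opaque ctx argument) are hypothetical and are not taken from the nfp
driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical IDs: firmware-reported values need not be small or contiguous. */
enum app_fw_id {
	APP_FW_CORE_NIC = 0x1,
	APP_FW_FLOWER_NIC = 0x3,
};

/* One shared prototype; per-app arguments travel through an opaque context. */
struct fw_ops {
	enum app_fw_id id;
	int (*secondary_init)(void *ctx);
};

/* Example handler: counts how often it was invoked. */
static int
core_nic_secondary_init(void *ctx)
{
	int *calls = ctx;

	(*calls)++;
	return 0;
}

/* Lookup table searched by ID, so no upper bound on the ID is needed. */
static const struct fw_ops fw_ops_table[] = {
	{ APP_FW_CORE_NIC, core_nic_secondary_init },
	{ APP_FW_FLOWER_NIC, NULL }, /* no handler in this sketch */
};

static int
app_fw_secondary_init(uint32_t app_id, void *ctx)
{
	size_t i;

	for (i = 0; i < sizeof(fw_ops_table) / sizeof(fw_ops_table[0]); i++) {
		if ((uint32_t)fw_ops_table[i].id != app_id)
			continue;
		if (fw_ops_table[i].secondary_init == NULL)
			return -1; /* known app, but no handler registered */
		return fw_ops_table[i].secondary_init(ctx);
	}
	return -1; /* unknown app ID */
}
```

The sketch shows both costs mentioned above: a search plus a NULL check on
every call, and an opaque context pointer to paper over the differing
initialization arguments.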


RE: [PATCH v7 05/12] net/nfp: add flower PF setup and mempool init logic

2022-09-06 Thread Chaoyong He


> -Original Message-
> From: Ferruh Yigit 
> Sent: Monday, September 5, 2022 11:42 PM
> To: Chaoyong He ; dev@dpdk.org
> Cc: oss-drivers ; Niklas Soderlund
> 
> Subject: Re: [PATCH v7 05/12] net/nfp: add flower PF setup and mempool
> init logic
> 
> On 8/12/2022 11:22 AM, Chaoyong He wrote:
> > Adds the vNIC initialization logic for the flower PF vNIC.  The flower
> > firmware exposes this vNIC for the purposes of fallback traffic in the
> > switchdev use-case. The logic of setting up this vNIC is similar to
> > the logic seen in nfp_net_init() and nfp_net_start().
> >
> > Adds minimal dev_ops for this PF device. Because the device is being
> > exposed externally to DPDK it should also be configured using DPDK
> > helpers like rte_eth_configure(). For these helpers to work the flower
> > logic needs to implements a minimal set of dev_ops. The Rx and Tx
> > logic for this vNIC will be added in a subsequent commit.
> >
> > OVS expects incoming packets coming into the OVS datapath to be
> > allocated from a mempool that contains objects of type "struct
> > dp_packet". For the PF handling the slowpath into OVS it should use a
> > mempool that is compatible with OVS. This commit adds the logic to
> > create the OVS compatible mempool. It adds certain OVS specific
> > structs to be able to instantiate the mempool.
> >
> 
> Can you please elaborate what is OVS compatible mempool?
> 
> <...>
> 
> > +static inline struct nfp_app_flower * nfp_app_flower_priv_get(struct
> > +nfp_pf_dev *pf_dev) {
> > +   if (pf_dev == NULL)
> > +   return NULL;
> > +   else if (pf_dev->app_id != NFP_APP_FLOWER_NIC)
> > +   return NULL;
> > +   else
> > +   return (struct nfp_app_flower *)pf_dev->app_priv; }
> > +
> 
> What do you think to unify functions to get private data, instead of having a
> function for each FW, it can be possible to have single one?
> 

At first, we used two macros for this, and Andrew advised changing them to
functions.
```
#define NFP_APP_PRIV_TO_APP_NIC(app_priv)\
((struct nfp_app_nic *)app_priv)

#define NFP_APP_PRIV_TO_APP_FLOWER(app_priv)\
((struct nfp_app_flower *)app_priv)
```
So your advice is that we unify the functions into:
```
static inline struct nfp_app_nic *
nfp_app_priv_get(struct nfp_pf_dev *pf_dev)
{
    if (pf_dev == NULL)
        return NULL;
    else if (pf_dev->app_id == NFP_APP_CORE_NIC ||
            pf_dev->app_id == NFP_APP_FLOWER_NIC)
        return pf_dev->app_priv;
    else
        return NULL;
}
```
and convert the pointer type where this function is called?


RE: [PATCH v7 01/12] net/nfp: move app specific attributes to own struct

2022-09-06 Thread Chaoyong He

> From: Ferruh Yigit 
> Sent: Monday, September 5, 2022 11:38 PM
> To: Chaoyong He ; dev@dpdk.org
> Cc: oss-drivers ; Niklas Soderlund
> ; Heinrich Kuhn
> 
> Subject: Re: [PATCH v7 01/12] net/nfp: move app specific attributes to own
> struct
> 
> On 8/12/2022 11:22 AM, Chaoyong He wrote:
> > The NFP Card can load different firmware applications. Currently only
> > the CoreNIC application is supported. This commit makes needed
> > infrastructure changes in order to support other firmware applications
> > too.
> >
> 
> App (or firmware application) is a little confusing, why not just call it FW?
> 
> Same for code variable/struct names, and other patches.

We decided not to just use "FW", as this is an overloaded term for the NFP.
There is also the lower-level management FW, which is flashed to the card and
is mostly separate from the application firmware, although they do interact
with each other.
The application firmware is the software (application) that runs on the card's
flow processors, is loaded at run time, and determines the behavior of the
card.
To avoid confusion between the different types of firmware, we decided to call
this the app_firmware. This is also similar to what the nfp kernel driver does.

To avoid further confusion, we will change the places that just use "app" to
"app_fw" instead.

> > Clearer separation is made between the PF device and any application
> > specific concepts. The PF struct is now generic regardless of the
> > application loaded. A new struct is also made for the CoreNIC
> > application. Future additions to support other applications should
> > also add an applications specific struct.
> >
> > Signed-off-by: Chaoyong He
> > Signed-off-by: Heinrich Kuhn
> > Reviewed-by: Niklas Söderlund



RE: [PATCH v2 01/10] net/gve: introduce GVE PMD base code

2022-09-06 Thread Guo, Junfeng



> -Original Message-
> From: Thomas Monjalon 
> Sent: Friday, September 2, 2022 04:50
> To: Ferruh Yigit ; techbo...@dpdk.org
> Cc: Guo, Junfeng ; Zhang, Qi Z
> ; Wu, Jingjing ; Hemant
> Agrawal ; dev@dpdk.org; Li, Xiaoyun
> ; awogbem...@google.com; Richardson, Bruce
> ; Wang, Haiyue ;
> techbo...@dpdk.org; Stephen Hemminger
> 
> Subject: Re: [PATCH v2 01/10] net/gve: introduce GVE PMD base code
> 
> 01/09/2022 20:23, Stephen Hemminger:
> > On Thu, 1 Sep 2022 18:19:22 +0100
> > Ferruh Yigit  wrote:
> >
> > > >
> > > > diff --git a/drivers/net/gve/gve_adminq.c
> b/drivers/net/gve/gve_adminq.c
> > > > new file mode 100644
> > > > index 00..8a724f12c6
> > > > --- /dev/null
> > > > +++ b/drivers/net/gve/gve_adminq.c
> > > > @@ -0,0 +1,925 @@
> > > > +/* SPDX-License-Identifier: MIT
> > > > + * Google Virtual Ethernet (gve) driver
> > > > + * Version: 1.3.0
> > > > + * Copyright (C) 2015-2022 Google, Inc.
> > > > + * Copyright(C) 2022 Intel Corporation
> > > > + */
> > > > +
> > >
> > > Can you please get approval for the MIT license from techboard, as
> > > Stephen highlighted in previous version?
> >
> >
> > I would prefer that it be BSD or dual licensed.
> > Although MIT and BSD-3 licenses are compatible, this is not something
> techboard can decide
> > it requires a statement from a knowledgeable open source lawyer (Intel
> or LF).
> >
> > Please fix the license to BSD and save lots of trouble.
> 
> +1 to change to BSD to avoid trouble.

Thanks for your concern and comments!
Yes, we would also be willing to have this base code under the BSD license.

Note that this code does not consist of Intel files; it comes from the kernel
community.
Everyone can reach the code at:
https://github.com/GoogleCloudPlatform/compute-virtual-ethernet-linux/tree/v1.3.0.
The base code there carries the statement SPDX-License-Identifier: (GPL-2.0 OR
MIT).

Thus, we may not be in a good position to re-license this code,
and we didn't find a BSD-licensed version in any open community,
so we just followed the required MIT license as an exception for DPDK.

Regards,
Junfeng

> 
> 



Re: [PATCH v7 01/12] net/nfp: move app specific attributes to own struct

2022-09-06 Thread Ferruh Yigit

On 9/6/2022 10:20 AM, Chaoyong He wrote:



From: Ferruh Yigit 
Sent: Monday, September 5, 2022 11:38 PM
To: Chaoyong He ; dev@dpdk.org
Cc: oss-drivers ; Niklas Soderlund
; Heinrich Kuhn

Subject: Re: [PATCH v7 01/12] net/nfp: move app specific attributes to own
struct

On 8/12/2022 11:22 AM, Chaoyong He wrote:

The NFP Card can load different firmware applications. Currently only
the CoreNIC application is supported. This commit makes needed
infrastructure changes in order to support other firmware applications
too.



App (or firmware application) is a little confusing, why not just call it FW?

Same for code variable/struct names, and other patches.


We decided not to just use "FW", as this is an overloaded term for the NFP.
There is also the lower-level management FW, which is flashed to the card and
is mostly separate from the application firmware, although they do interact
with each other.
The application firmware is the software (application) that runs on the card's
flow processors, is loaded at run time, and determines the behavior of the
card.
To avoid confusion between the different types of firmware, we decided to call
this the app_firmware. This is also similar to what the nfp kernel driver does.

To avoid further confusion, we will change the places that just use "app" to
"app_fw" instead.



OK to 'app_fw'


Clearer separation is made between the PF device and any application
specific concepts. The PF struct is now generic regardless of the
application loaded. A new struct is also made for the CoreNIC
application. Future additions to support other applications should
also add an applications specific struct.

Signed-off-by: Chaoyong He
Signed-off-by: Heinrich Kuhn
Reviewed-by: Niklas Söderlund






Re: [PATCH v5 1/3] eal: add lcore poll busyness telemetry

2022-09-06 Thread Kevin Laatz

On 03/09/2022 14:33, Jerin Jacob wrote:

On Fri, Sep 2, 2022 at 9:26 PM Kevin Laatz  wrote:

From: Anatoly Burakov 

Currently, there is no way to measure lcore poll busyness in a passive way,
without any modifications to the application. This patch adds a new EAL API
that will be able to passively track core polling busyness.

The poll busyness is calculated by relying on the fact that most DPDK APIs
will poll for work (packets, completions, eventdev events, etc). Empty
polls can be counted as "idle", while non-empty polls can be counted as
busy. To measure lcore poll busyness, we simply call the telemetry
timestamping function with the number of polls a particular code section
has processed, and count the number of cycles we've spent processing empty
bursts. The more empty bursts we encounter, the fewer cycles we spend in
"busy" state, and the less core poll busyness will be reported.

In order for all of the above to work without modifications to the
application, the library code needs to be instrumented with calls to the
lcore telemetry busyness timestamping function. The following parts of DPDK
are instrumented with lcore poll busyness timestamping calls:

- All major driver APIs:
   - ethdev
   - cryptodev
   - compressdev
   - regexdev
   - bbdev
   - rawdev
   - eventdev
   - dmadev
- Some additional libraries:
   - ring
   - distributor

To avoid performance impact from having lcore telemetry support, a global
variable is exported by EAL, and a call to timestamping function is wrapped
into a macro, so that whenever telemetry is disabled, it only takes one
additional branch and no function calls are performed. It is disabled at
compile time by default.
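
The accounting and the one-branch guard described in the commit message can be
illustrated with a simplified, standalone model. This is not the code from the
patch: struct poll_stats, poll_timestamp, POLL_BUSYNESS_TIMESTAMP and the
poll_telemetry_enabled flag are hypothetical names, timestamps are passed in
explicitly instead of being read from a cycle counter, and the compile-time
switch is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-lcore counters; the real data layout may differ. */
struct poll_stats {
	uint64_t last_ts;      /* timestamp of the previous poll */
	uint64_t busy_cycles;  /* cycles attributed to non-empty polls */
	uint64_t total_cycles; /* all cycles observed so far */
	bool last_empty;       /* did the previous poll return no work? */
};

/* Hypothetical global switch, standing in for the flag exported by EAL. */
static bool poll_telemetry_enabled;

static void
poll_stats_init(struct poll_stats *s, uint64_t now)
{
	s->last_ts = now;
	s->busy_cycles = 0;
	s->total_cycles = 0;
	s->last_empty = true;
}

/* Attribute the time since the previous poll to "busy" or "idle". */
static void
poll_timestamp(struct poll_stats *s, uint64_t now, unsigned int nb_items)
{
	uint64_t delta;

	assert(now >= s->last_ts);
	delta = now - s->last_ts;
	s->total_cycles += delta;
	if (!s->last_empty)
		s->busy_cycles += delta; /* previous poll returned work */
	s->last_ts = now;
	s->last_empty = (nb_items == 0);
}

/* When telemetry is off, the only cost is this one branch. */
#define POLL_BUSYNESS_TIMESTAMP(s, now, nb_items) do { \
	if (poll_telemetry_enabled) \
		poll_timestamp((s), (now), (nb_items)); \
} while (0)

/* Busyness as an integer percentage of the observed cycles. */
static unsigned int
poll_busyness_percent(const struct poll_stats *s)
{
	if (s->total_cycles == 0)
		return 0;
	return (unsigned int)(s->busy_cycles * 100 / s->total_cycles);
}
```

In this model, a poll that returns work at cycle 100, followed by empty polls
at cycles 400 and 500, attributes 300 of 500 cycles to the busy state.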

This patch also adds a telemetry endpoint to report lcore poll busyness, as
well as telemetry endpoints to enable/disable lcore telemetry. A
documentation entry has been added to the howto guides to explain the usage
of the new telemetry endpoints and API.

Signed-off-by: Kevin Laatz 
Signed-off-by: Conor Walsh 
Signed-off-by: David Hunt 
Signed-off-by: Anatoly Burakov 

This version looks good to me. Thanks for this new feature.

I think we need to add a UT for these new rte_lcore_poll_* APIs, and also
add a performance test case to measure the cycles for
RTE_LCORE_POLL_BUSYNESS_TIMESTAMP [1]

[1]
Reference performance test application for trace: app/test/test_trace_perf.c


Thanks for reviewing, Jerin.

I'll look into adding a UT, thanks!



Re: [PATCH v7 03/12] net/nfp: move app specific init logic to own function

2022-09-06 Thread Ferruh Yigit

On 9/6/2022 9:29 AM, Chaoyong He wrote:




-Original Message-
From: Ferruh Yigit 
Sent: Monday, September 5, 2022 11:39 PM
To: Chaoyong He ; dev@dpdk.org
Cc: oss-drivers ; Niklas Soderlund

Subject: Re: [PATCH v7 03/12] net/nfp: move app specific init logic to own
function

On 8/12/2022 11:22 AM, Chaoyong He wrote:

The NFP card can load different firmware applications.
This commit move the init logic of corenic app of the secondary
process into its own function.

Signed-off-by: Chaoyong He 
Reviewed-by: Niklas Söderlund 


<...>


+   switch (app_id) {
+   case NFP_APP_CORE_NIC:
+   PMD_INIT_LOG(INFO, "Initializing coreNIC");
+   ret = nfp_secondary_init_app_nic(pci_dev, sym_tbl, cpp);
+   if (ret != 0) {
+   PMD_INIT_LOG(ERR, "Could not initialize coreNIC!");
+   goto sym_tbl_cleanup;
 }


If you are planning to add more FW app support, what do you think to add
another abstraction for it? Something like

struct fw_ops {
   *init()
   *secondary_init()
   ...
}

   ...
   ret = fw_ops[app_id].secondary_init(...);
   ...



It does make sense to translate this switch statement into an array of
function pointers, but there are some problems:
1. The `app_id` is returned by the firmware, so we can't simply treat it as an
   index into the array. We would need some mapping logic. We also can't find
   a suitable upper limit for the array, so we would probably have to keep
   updating it as more firmware apps are added in the future. We would also
   have to check the value of `fw_ops[app_id].secondary_init` before invoking
   it.
2. Different firmware apps may need different variables to initialize, which
   makes it difficult to find a suitable function prototype when declaring
   the function pointer.

So the final logic seems like it would be more complicated. Maybe we can keep
the current logic?


Got it, the issues mentioned above look valid, so it is OK to keep it as it is.


Re: [PATCH v7 05/12] net/nfp: add flower PF setup and mempool init logic

2022-09-06 Thread Ferruh Yigit

On 9/6/2022 9:45 AM, Chaoyong He wrote:




-Original Message-
From: Ferruh Yigit 
Sent: Monday, September 5, 2022 11:42 PM
To: Chaoyong He ; dev@dpdk.org
Cc: oss-drivers ; Niklas Soderlund

Subject: Re: [PATCH v7 05/12] net/nfp: add flower PF setup and mempool
init logic

On 8/12/2022 11:22 AM, Chaoyong He wrote:

Adds the vNIC initialization logic for the flower PF vNIC.  The flower
firmware exposes this vNIC for the purposes of fallback traffic in the
switchdev use-case. The logic of setting up this vNIC is similar to
the logic seen in nfp_net_init() and nfp_net_start().

Adds minimal dev_ops for this PF device. Because the device is being
exposed externally to DPDK it should also be configured using DPDK
helpers like rte_eth_configure(). For these helpers to work the flower
logic needs to implements a minimal set of dev_ops. The Rx and Tx
logic for this vNIC will be added in a subsequent commit.

OVS expects incoming packets coming into the OVS datapath to be
allocated from a mempool that contains objects of type "struct
dp_packet". For the PF handling the slowpath into OVS it should use a
mempool that is compatible with OVS. This commit adds the logic to
create the OVS compatible mempool. It adds certain OVS specific
structs to be able to instantiate the mempool.



Can you please elaborate what is OVS compatible mempool?

<...>


+static inline struct nfp_app_flower * nfp_app_flower_priv_get(struct
+nfp_pf_dev *pf_dev) {
+   if (pf_dev == NULL)
+   return NULL;
+   else if (pf_dev->app_id != NFP_APP_FLOWER_NIC)
+   return NULL;
+   else
+   return (struct nfp_app_flower *)pf_dev->app_priv; }
+


What do you think to unify functions to get private data, instead of having a
function for each FW, it can be possible to have single one?



At first, we used two macros for this, and Andrew advised changing them to
functions.
```
#define NFP_APP_PRIV_TO_APP_NIC(app_priv)\
 ((struct nfp_app_nic *)app_priv)

#define NFP_APP_PRIV_TO_APP_FLOWER(app_priv)\
 ((struct nfp_app_flower *)app_priv)
```
So your advice is we unify the functions into:
```
static inline struct nfp_app_nic *
nfp_app_priv_get(struct nfp_pf_dev *pf_dev)
{
	if (pf_dev == NULL)
		return NULL;
	else if (pf_dev->app_id == NFP_APP_CORE_NIC ||
			pf_dev->app_id == NFP_APP_FLOWER_NIC)
		return pf_dev->app_priv;
	else
		return NULL;
}
```
and convert the pointer type where this function is called?



Since return pointer types are different, it should return "void *",

```
static inline void *
nfp_app_priv_get(struct nfp_pf_dev *pf_dev)
{
	if (pf_dev == NULL)
		return NULL;
	else if (pf_dev->app_id == NFP_APP_CORE_NIC ||
			pf_dev->app_id == NFP_APP_FLOWER_NIC)
		return pf_dev->app_priv;
	else
		return NULL;
}
```

And when assigning a pointer from "void *", no explicit cast is required.

```
struct nfp_app_flower *app_flower;

app_flower = nfp_app_priv_get(pf_dev);
```

I think it is better to have a single function instead of a different
helper function for each FW, but I would like to get @Andrew's comment too.



Btw, since 'nfp_app_nic*_priv_get' returns 'NULL' now, should callers
check for NULL? This may introduce too many checks, and if the checks are
not necessary, what is the benefit of the function over the macro?
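
To make the trade-off in the question above concrete, here is a minimal, self-contained sketch of the single "void *" getter together with a caller that pays for exactly one NULL check. The type definitions are hypothetical stand-ins that only carry the fields used here (they are not the real nfp driver structures), and `NFP_APP_UNKNOWN` and `flower_priv_is_usable()` are illustrative names that do not come from the patch.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the driver types; only the fields used here. */
enum nfp_app_id {
	NFP_APP_CORE_NIC = 1,
	NFP_APP_FLOWER_NIC,
	NFP_APP_UNKNOWN		/* illustrative value, not from the patch */
};

struct nfp_app_flower {
	int placeholder;
};

struct nfp_pf_dev {
	enum nfp_app_id app_id;
	void *app_priv;
};

/* Single getter shared by all firmware application types. */
static inline void *
nfp_app_priv_get(struct nfp_pf_dev *pf_dev)
{
	if (pf_dev == NULL)
		return NULL;
	if (pf_dev->app_id == NFP_APP_CORE_NIC ||
			pf_dev->app_id == NFP_APP_FLOWER_NIC)
		return pf_dev->app_priv;
	return NULL;
}

/* Caller side: no cast is needed when assigning from "void *" in C. */
static inline int
flower_priv_is_usable(struct nfp_pf_dev *pf_dev)
{
	struct nfp_app_flower *app_flower = nfp_app_priv_get(pf_dev);

	return app_flower != NULL;
}
```

Note that a getter like this cannot tell whether the caller asked for the type that matches `app_id`; that check, if wanted, stays with the caller.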




Re: [PATCH v2 3/3] l3fwd-power: add option to call uncore API

2022-09-06 Thread Hunt, David

Hi Tadhg,

On 13/07/2022 15:07, Tadhg Kearney wrote:

Add option for setting uncore frequency min/max/index, through uncore api.
This will be set for each package and die on the SKU. On exit, uncore
frequency will be reverted back to previous frequency.

Signed-off-by: Tadhg Kearney 
---
  .../sample_app_ug/l3_forward_power_man.rst|  28 +++
  examples/l3fwd-power/main.c   | 190 --
  2 files changed, 202 insertions(+), 16 deletions(-)

diff --git a/doc/guides/sample_app_ug/l3_forward_power_man.rst 
b/doc/guides/sample_app_ug/l3_forward_power_man.rst
index 8f6d906200..1e452140a1 100644
--- a/doc/guides/sample_app_ug/l3_forward_power_man.rst
+++ b/doc/guides/sample_app_ug/l3_forward_power_man.rst
@@ -97,6 +97,12 @@ where,
  *   -P: Sets all ports to promiscuous mode so that packets are accepted 
regardless of the packet's Ethernet MAC destination address.
  Without this option, only packets with the Ethernet MAC destination 
address set to the Ethernet address of the port are accepted.
  
+*   -u: optional, sets uncore frequency to minimum value.

+
+*   -U: optional, sets uncore frequency to maximum value.
+
+*   -i (frequency index): optional, sets uncore frequency to frequency index 
value, by setting min and max values to be the same.
+
  *   --config (port,queue,lcore)[,(port,queue,lcore)]: determines which queues 
from which ports are mapped to which cores.
  
  *   --max-pkt-len: optional, maximum packet length in decimal (64-9600)

@@ -364,3 +370,25 @@ in the DPDK Programmer's Guide for more details on PMD 
power management.
  .. code-block:: console
  
  .//examples/dpdk-l3fwd-power -l 1-3 -- -p 0x0f --config="(0,0,2),(0,1,3)" --pmd-mgmt=scale

+
+Setting Uncore Values
+-
+
+Uncore frequency can be adjusted through manipulating related sysfs entries to 
adjust the minimum and maximum uncore values.
+This will be set for each package and die on the SKU. Three options are 
available for setting uncore frequency:
+
+``-u``
+  This will set uncore to minimum frequency possible.
+
+``-U``
+  This will set uncore to maximum frequency possible.
+
+``-i``
+  This will allow you to set the specific uncore frequency index that you 
want, by setting
+  minimum and maximum values to be the same. Frequency index's are set 
10Hz apart from
+  maximum to minimum.
+  Frequency index values are in descending order, ie, index 0 is maximum 
frequency index.
+
+.. code-block:: console
+
+.//examples/dpdk-l3fwd-power -l 1-3 -- -p 0x0f 
--config="(0,0,2),(0,1,3)" -i 1
diff --git a/examples/l3fwd-power/main.c b/examples/l3fwd-power/main.c
index 887c6eae3f..5f74e29e3a 100644
--- a/examples/l3fwd-power/main.c
+++ b/examples/l3fwd-power/main.c
@@ -15,6 +15,8 @@
  #include 
  #include 
  #include 
+#include 
+#include 
  
  #include 

  #include 
@@ -47,6 +49,7 @@
  #include 
  #include 
  #include 
+#include 
  
  #include "perf_core.h"

  #include "main.h"
@@ -71,6 +74,7 @@
  
  #ifndef APP_LOOKUP_METHOD

  #define APP_LOOKUP_METHOD APP_LOOKUP_LPM
+
  #endif
  
  #if (APP_LOOKUP_METHOD == APP_LOOKUP_EXACT_MATCH)

@@ -83,7 +87,7 @@
  
  #ifndef IPv6_BYTES

  #define IPv6_BYTES_FMT "%02x%02x:%02x%02x:%02x%02x:%02x%02x:"\
-   "%02x%02x:%02x%02x:%02x%02x:%02x%02x"
+  "%02x%02x:%02x%02x:%02x%02x:%02x%02x"
  #define IPv6_BYTES(addr) \
addr[0],  addr[1], addr[2],  addr[3], \
addr[4],  addr[5], addr[6],  addr[7], \
@@ -134,9 +138,14 @@
  
  #define NUM_TELSTATS RTE_DIM(telstats_strings)
  
+#define UNCORE_FREQUENCY_DIR "/sys/devices/system/cpu/intel_uncore_frequency"

+
  static uint16_t nb_rxd = RTE_TEST_RX_DESC_DEFAULT;
  static uint16_t nb_txd = RTE_TEST_TX_DESC_DEFAULT;
  
+/* Max number of nodes times dies available on uncore */

+#define MAX_DIE_NODES (RTE_MAX_NUMA_DIE * RTE_MAX_NUMA_NODES)
+
  /* ethernet addresses of ports */
  static struct rte_ether_addr ports_eth_addr[RTE_MAX_ETHPORTS];
  
@@ -145,6 +154,8 @@ static rte_spinlock_t locks[RTE_MAX_ETHPORTS];
  
  /* mask of enabled ports */

  static uint32_t enabled_port_mask = 0;
+/* if uncore frequency was enabled without errors */
+static int enabled_uncore;
  /* Ports set in promiscuous mode off by default. */
  static int promiscuous_on = 0;
  /* NUMA is enabled by default. */
@@ -165,6 +176,13 @@ struct telstats_name {
char name[RTE_ETH_XSTATS_NAME_SIZE];
  };
  
+struct uncore_info {

+   unsigned int pkg;
+   unsigned int die;
+};
+
+struct uncore_info ui[MAX_DIE_NODES];
+
  /* telemetry stats to be reported */
  const struct telstats_name telstats_strings[] = {
{"empty_poll"},
@@ -557,9 +575,9 @@ static void
  print_ipv6_key(struct ipv6_5tuple key)
  {
printf( "IP dst = " IPv6_BYTES_FMT ", IP src = " IPv6_BYTES_FMT ", "
-   "port dst = %d, port src = %d, proto = %d\n",
-   IPv6_BYTES(key.ip_dst), IPv6_BYTES(key.ip_src),
-   key.port

[PATCH v2] net/axgbe: optimise scattered rx

2022-09-06 Thread Bhagyada Modali
Updated the logic to remove the extra increments of the variables.

Fixes: 965b3127d425 ("net/axgbe: support scattered Rx")
Cc: sta...@dpdk.org

Signed-off-by: Bhagyada Modali 

---
v2:
* rebased to the latest changes and submitting the patch again

---
---
 drivers/net/axgbe/axgbe_rxtx.c | 15 +++
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/drivers/net/axgbe/axgbe_rxtx.c b/drivers/net/axgbe/axgbe_rxtx.c
index 2bad638f79..8b43e8160b 100644
--- a/drivers/net/axgbe/axgbe_rxtx.c
+++ b/drivers/net/axgbe/axgbe_rxtx.c
@@ -340,7 +340,6 @@ uint16_t eth_axgbe_recv_scattered_pkts(void *rx_queue,
struct axgbe_rx_queue *rxq = rx_queue;
volatile union axgbe_rx_desc *desc;
 
-   uint64_t old_dirty = rxq->dirty;
struct rte_mbuf *first_seg = NULL;
struct rte_mbuf *mbuf, *tmbuf;
unsigned int err = 0, etlt;
@@ -352,8 +351,7 @@ uint16_t eth_axgbe_recv_scattered_pkts(void *rx_queue,
while (nb_rx < nb_pkts) {
bool eop = 0;
 next_desc:
-   if (unlikely(idx == rxq->nb_desc))
-   idx = 0;
+   idx = AXGBE_GET_DESC_IDX(rxq, rxq->cur);
 
desc = &rxq->desc[idx];
 
@@ -446,19 +444,19 @@ uint16_t eth_axgbe_recv_scattered_pkts(void *rx_queue,
~RTE_MBUF_F_RX_VLAN_STRIPPED;
} else {
first_seg->ol_flags &=
-   ~(RTE_MBUF_F_RX_VLAN | 
RTE_MBUF_F_RX_VLAN_STRIPPED);
+   ~(RTE_MBUF_F_RX_VLAN |
+   
RTE_MBUF_F_RX_VLAN_STRIPPED);
first_seg->vlan_tci = 0;
}
}
 
 err_set:
rxq->cur++;
-   rxq->sw_ring[idx++] = tmbuf;
+   rxq->sw_ring[idx] = tmbuf;
desc->read.baddr =
rte_cpu_to_le_64(rte_mbuf_data_iova_default(tmbuf));
memset((void *)(&desc->read.desc2), 0, 8);
AXGMAC_SET_BITS_LE(desc->read.desc3, RX_NORMAL_DESC3, OWN, 1);
-   rxq->dirty++;
 
if (!eop) {
rte_pktmbuf_free(mbuf);
@@ -501,12 +499,13 @@ uint16_t eth_axgbe_recv_scattered_pkts(void *rx_queue,
/* Save receive context.*/
rxq->pkts += nb_rx;
 
-   if (rxq->dirty != old_dirty) {
+   if (rxq->dirty != rxq->cur) {
rte_wmb();
-   idx = AXGBE_GET_DESC_IDX(rxq, rxq->dirty - 1);
+   idx = AXGBE_GET_DESC_IDX(rxq, rxq->cur - 1);
AXGMAC_DMA_IOWRITE(rxq, DMA_CH_RDTR_LO,
   low32_value(rxq->ring_phys_addr +
   (idx * sizeof(union axgbe_rx_desc))));
+   rxq->dirty = rxq->cur;
}
return nb_rx;
 }
-- 
2.25.1



RE: [PATCH v2] net/af_xdp: improve documentation

2022-09-06 Thread Zhang, Qi Z



> -Original Message-
> From: Koikkara Reeny, Shibin 
> Sent: Friday, July 22, 2022 4:51 PM
> To: dev@dpdk.org
> Cc: Loftus, Ciara ; Zhang, Qi Z 
> Subject: [PATCH v2] net/af_xdp: improve documentation
> 
> From: Ciara Loftus 
> 
> Instead of a one-liner describing each vdev argument, add a description and
> example for each. Move the information describing preferred busy polling
> from the "Limitations" section to the "Options" section where it is better
> placed. Also make general grammar improvements.
> 
> Signed-off-by: Ciara Loftus 

Reviewed-by: Qi Zhang 



Re: [EXT] Re: [PATCH v2 1/3] ethdev: introduce pool sort capability

2022-09-06 Thread Ferruh Yigit

On 8/30/2022 1:08 PM, Hanumanth Reddy Pothula wrote:




-Original Message-
From: Ferruh Yigit 
Sent: Wednesday, August 24, 2022 9:04 PM
To: Ding, Xuan ; Hanumanth Reddy Pothula
; Thomas Monjalon ; Andrew
Rybchenko 
Cc: dev@dpdk.org; Wu, WenxuanX ; Li, Xiaoyun
; step...@networkplumber.org; Wang, YuanX
; m...@ashroe.eu; Zhang, Yuying
; Zhang, Qi Z ;
viachesl...@nvidia.com; Jerin Jacob Kollanukkaran ;
Nithin Kumar Dabilpuram 
Subject: [EXT] Re: [PATCH v2 1/3] ethdev: introduce pool sort capability

External Email

--



Thanks Ding Xuan and Ferruh Yigit for reviewing the changes and for providing 
your valuable feedback.
Please find responses inline.


On 8/23/2022 4:26 AM, Ding, Xuan wrote:

Hi Hanumanth,


-Original Message-
From: Hanumanth Pothula 
Sent: Saturday, August 13, 2022 1:25 AM
To: Thomas Monjalon ; Ferruh Yigit
; Andrew Rybchenko

Cc: dev@dpdk.org; Ding, Xuan ; Wu, WenxuanX
; Li, Xiaoyun ;
step...@networkplumber.org; Wang, YuanX ;
m...@ashroe.eu; Zhang, Yuying ; Zhang, Qi Z
; viachesl...@nvidia.com; jer...@marvell.com;
ndabilpu...@marvell.com; Hanumanth Pothula 
Subject: [PATCH v2 1/3] ethdev: introduce pool sort capability

Presently, the 'Buffer Split' feature supports sending multiple
segments of the received packet to PMD, which programs the HW to
receive the packet in segments from different pools.

This patch extends the feature to support the pool sort capability.
Some of the HW has support for choosing memory pools based on the
packet's size. The pool sort capability allows PMD to choose a memory
pool based on the packet's length.

This is often useful for saving the memory where the application can
create a different pool to steer the specific size of the packet,
thus enabling effective use of memory.

For example, let's say HW has a capability of three pools,
   - pool-1 size is 2K
   - pool-2 size is > 2K and < 4K
   - pool-3 size is > 4K
Here,
  pool-1 can accommodate packets with sizes < 2K
  pool-2 can accommodate packets with sizes > 2K and < 4K
  pool-3 can accommodate packets with sizes > 4K

With pool sort capability enabled in SW, an application may create
three pools of different sizes and send them to PMD, allowing PMD to
program HW based on packet lengths, so that packets with less than 2K
are received on pool-1, packets with lengths between 2K and 4K are
received on pool-2 and finally packets greater than 4K are received on
pool-3.
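
As a rough illustration of the selection rule described above, the following sketch models in software what the hardware would do when it picks a pool by packet length. Everything in it is an assumption made for illustration: `struct pool_desc` and `pick_pool()` are hypothetical names, not part of the proposed ethdev API, and a packet whose length equals a pool's buffer size is treated as fitting that pool, which the description above leaves open.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pool descriptor: only the buffer size matters here. */
struct pool_desc {
	const char *name;
	uint32_t buf_size;	/* largest packet one buffer can hold */
};

/*
 * Return the index of the first pool whose buffers can hold pkt_len,
 * or -1 if no pool is large enough.
 * The pools array must be ordered by ascending buf_size.
 */
static int
pick_pool(const struct pool_desc *pools, size_t n_pools, uint32_t pkt_len)
{
	size_t i;

	for (i = 0; i < n_pools; i++)
		if (pkt_len <= pools[i].buf_size)
			return (int)i;
	return -1;
}
```

With pools of 2K, 4K and a larger size, a 1000-byte packet lands in the first pool and a 3000-byte packet in the second, so small packets never occupy large buffers.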


The following two capabilities are added to the rte_eth_rxseg_capa
structure:
1. pool_sort --> tells pool sort capability is supported by HW.
2. max_npool --> max number of pools supported by HW.

Defined new structure rte_eth_rxseg_sort, to be used only when pool
sort capability is present. If required this may be extended further
to support more configurations.

Signed-off-by: Hanumanth Pothula 

v2:
   - Along with spec changes, uploading testpmd and driver changes.


Thanks for CCing. It's an interesting feature.

But I have one question here:
Buffer split is for splitting received packets into multiple segments,
while pool sort lets the PMD put received packets into different pools
according to packet size.
Every packet is still intact.

So, at this level, pool sort does not belong to buffer split.
And you already use a different function to check pool sort rather than
checking buffer split.

Should a new RX offload be introduced, like
"RTE_ETH_RX_OFFLOAD_POOL_SORT"?



Please find my response below.


Hi Hanumanth,

I had a similar concern with the feature. I assume you want to benefit from
the existing config structure that gets multiple mempools as argument, since
this feature also needs multiple mempools, but the feature is different.

It looks wrong to me to check the 'OFFLOAD_BUFFER_SPLIT' offload to decide
whether to receive into multiple mempools or not, which has nothing to do
with split.
Also not sure about using the 'sort' keyword.
What do you think about introducing a new feature, instead of extending the
existing split one?


Actually, we thought BUFFER_SPLIT and POOL_SORT are similar features, where RX
pools are configured in a certain way, and we thought not to use up one more
RX offload capability, as the existing software architecture can be extended
to support the pool_sort capability.
Yes, as part of pool sort there is no buffer split; pools are picked based
on the buffer length.

Since you think it's better to use a new RX offload for POOL_SORT, we will go
ahead and implement the same.


This is optimisation, right? To enable us to use less memory for the packet
buffer, does it qualify to a device offload?


Yes, it qualifies as a device offload and saves memory.
The Marvell NIC has the capability to receive packets on two different pools
based on their length.
This is explained in more detail below.


Also, what is the relation with segmented Rx, how a PMD decide to use
segmented Rx or bigger mempool? How can application can configure thi

Re: [PATCH v1 00/10] baseband/acc200

2022-09-06 Thread Tom Rix



On 9/1/22 1:34 PM, Chautru, Nicolas wrote:

Hi Tom,


-Original Message-
From: Tom Rix 
Sent: Thursday, September 1, 2022 6:49 AM
To: Chautru, Nicolas ; Maxime Coquelin
; dev@dpdk.org; tho...@monjalon.net;
gak...@marvell.com; hemant.agra...@nxp.com; Vargas, Hernan

Cc: m...@ashroe.eu; Richardson, Bruce ;
david.march...@redhat.com; step...@networkplumber.org
Subject: Re: [PATCH v1 00/10] baseband/acc200


On 8/31/22 6:26 PM, Chautru, Nicolas wrote:

Hi Tom,


-Original Message-
From: Tom Rix 
Sent: Wednesday, August 31, 2022 5:28 PM
To: Chautru, Nicolas ; Maxime Coquelin
; dev@dpdk.org;

tho...@monjalon.net;

gak...@marvell.com; hemant.agra...@nxp.com; Vargas, Hernan

Cc: m...@ashroe.eu; Richardson, Bruce ;
david.march...@redhat.com; step...@networkplumber.org
Subject: Re: [PATCH v1 00/10] baseband/acc200


On 8/31/22 3:37 PM, Chautru, Nicolas wrote:

Hi Thomas, Tom,


-Original Message-
From: Tom Rix 
Sent: Wednesday, August 31, 2022 12:26 PM
To: Chautru, Nicolas ; Maxime Coquelin
; dev@dpdk.org;

tho...@monjalon.net;

gak...@marvell.com; hemant.agra...@nxp.com; Vargas, Hernan

Cc: m...@ashroe.eu; Richardson, Bruce ;
david.march...@redhat.com; step...@networkplumber.org
Subject: Re: [PATCH v1 00/10] baseband/acc200


On 8/30/22 12:45 PM, Chautru, Nicolas wrote:

Hi Maxime,


-Original Message-
From: Maxime Coquelin 
Sent: Tuesday, August 30, 2022 12:45 AM
To: Chautru, Nicolas ; dev@dpdk.org;
tho...@monjalon.net; gak...@marvell.com;

hemant.agra...@nxp.com;

t...@redhat.com; Vargas, Hernan 
Cc: m...@ashroe.eu; Richardson, Bruce
; david.march...@redhat.com;
step...@networkplumber.org
Subject: Re: [PATCH v1 00/10] baseband/acc200

Hi Nicolas,

On 7/12/22 15:48, Maxime Coquelin wrote:

Hi Nicolas, Hernan,

(Adding Hernan in the recipients list)

On 7/8/22 02:01, Nicolas Chautru wrote:

This is targeting 22.11 and includes the PMD for the integrated
accelerator on Intel Xeon SPR-EEC.
There is a dependency on that parallel series still in-flight
which extends the bbdev api
https://patches.dpdk.org/project/dpdk/list/?series=23894

I will be offline for a few weeks for the summer break but
Hernan will cover for me during that time if required.

Thanks
Nic

Nicolas Chautru (10):
   baseband/acc200: introduce PMD for ACC200
   baseband/acc200: add HW register definitions
   baseband/acc200: add info get function
   baseband/acc200: add queue configuration
   baseband/acc200: add LDPC processing functions
   baseband/acc200: add LTE processing functions
   baseband/acc200: add support for FFT operations
   baseband/acc200: support interrupt
   baseband/acc200: add device status and vf2pf comms
   baseband/acc200: add PF configure companion function

  MAINTAINERS  |    3 +
  app/test-bbdev/meson.build   |    3 +
  app/test-bbdev/test_bbdev_perf.c |   76 +
  doc/guides/bbdevs/acc200.rst |  244 ++
  doc/guides/bbdevs/index.rst  |    1 +
  drivers/baseband/acc200/acc200_pf_enum.h |  468 +++
  drivers/baseband/acc200/acc200_pmd.h |  690 
  drivers/baseband/acc200/acc200_vf_enum.h |   89 +
  drivers/baseband/acc200/meson.build  |    8 +
  drivers/baseband/acc200/rte_acc200_cfg.h |  115 +
  drivers/baseband/acc200/rte_acc200_pmd.c | 5403 ++
  drivers/baseband/acc200/version.map  |   10 +
  drivers/baseband/meson.build |    1 +
  13 files changed, 7111 insertions(+)
  create mode 100644 doc/guides/bbdevs/acc200.rst
  create mode 100644 drivers/baseband/acc200/acc200_pf_enum.h
  create mode 100644 drivers/baseband/acc200/acc200_pmd.h
  create mode 100644 drivers/baseband/acc200/acc200_vf_enum.h
  create mode 100644 drivers/baseband/acc200/meson.build
  create mode 100644 drivers/baseband/acc200/rte_acc200_cfg.h
  create mode 100644 drivers/baseband/acc200/rte_acc200_pmd.c
  create mode 100644 drivers/baseband/acc200/version.map


Comparing ACC200 & ACC100 header files, I understand ACC200 is
an evolution of the ACC10x family. The FEC bits are really
close; ACC200's main addition seems to be FFT acceleration, which could be
handled in the ACC10x driver based on device ID.

I think both drivers have to be merged in order to avoid code
duplication. That's how other families of devices (e.g. i40e)
are handled.

I haven't seen your reply on this point.
Do you confirm you are working on a single driver for ACC family
in order to avoid code duplication?


The implementation is based on distinct ACC100 and ACC200 drivers.
The 2 devices are fundamentally different generations, processes and IP.
MountBryce is an eASIC device over PCIe while ACC200 is an integrated
accelerator on Xeon CPU.
The actual implementations are not the same; the underlying IPs are all
distinct even if many of the descriptor formats have similarities.

The actual capabilities of

Re: [Patch v7 01/18] net/mana: add basic driver, build environment and doc

2022-09-06 Thread Ferruh Yigit

On 9/3/2022 2:40 AM, lon...@linuxonhyperv.com wrote:



From: Long Li 

MANA is a PCI device. It uses IB verbs to access hardware through the
kernel RDMA layer. This patch introduces build environment and basic
device probe functions.

Signed-off-by: Long Li 
---
Change log:
v2:
Fix typos.
Make the driver build only on x86-64 and Linux.
Remove unused header files.
Change port definition to uint16_t or uint8_t (for IB).
Use getline() in place of fgets() to read and truncate a line.
v3:
Add meson build check for required functions from RDMA direct verb header file
v4:
Remove extra "\n" in logging code.
Use "r" in place of "rb" in fopen() to read text files.
v7:
Remove RTE_ETH_TX_OFFLOAD_TCP_TSO from offload cap.



Can you please check the review comments on v4 [1]? They still seem valid in
this version.
I didn't go through the other patches, but can you please double check the
comments on all v4 patches?



[1]
https://inbox.dpdk.org/dev/859e95d9-2483-b017-6daa-0852317b4...@xilinx.com/



Re: [Patch v7 00/18] Introduce Microsoft Azure Network Adatper (MANA) PMD

2022-09-06 Thread Ferruh Yigit

On 9/3/2022 2:40 AM, lon...@linuxonhyperv.com wrote:



From: Long Li 

MANA is a network interface card to be used in the Azure cloud environment.
MANA provides safe access to user memory through memory registration. It has
IOMMU built into the hardware.

MANA uses IB verbs and RDMA layer to configure hardware resources. It
requires the corresponding RDMA kernel-mode and user-mode drivers.

The MANA RDMA kernel-mode driver is being reviewed at:
https://patchwork.kernel.org/project/netdevbpf/cover/1655345240-26411-1-git-send-email-lon...@linuxonhyperv.com/

The MANA RDMA user-mode driver is being reviewed at:
https://github.com/linux-rdma/rdma-core/pull/1177


Long Li (18):
   net/mana: add basic driver, build environment and doc
   net/mana: add device configuration and stop
   net/mana: add function to report support ptypes
   net/mana: add link update
   net/mana: add function for device removal interrupts
   net/mana: add device info
   net/mana: add function to configure RSS
   net/mana: add function to configure RX queues
   net/mana: add function to configure TX queues
   net/mana: implement memory registration
   net/mana: implement the hardware layer operations
   net/mana: add function to start/stop TX queues
   net/mana: add function to start/stop RX queues
   net/mana: add function to receive packets
   net/mana: add function to send packets
   net/mana: add function to start/stop device
   net/mana: add function to report queue stats
   net/mana: add function to support RX interrupts



Can you please send new versions of the patches as reply to previous 
versions, so all versions can be in same thread, using git send-email 
'--in-reply-to' argument?


More details in the contribution guide:
https://doc.dpdk.org/guides/contributing/patches.html#sending-patches



Minutes of Technical Board Meeting, 2022-08-24

2022-09-06 Thread Honnappa Nagarahalli
Members Attending: 6
- Bruce Richardson
- Honnappa Nagarahalli (Chair)
- Kevin Traynor
- Konstantin Ananyev
- Maxime Coquelin
- Thomas Monjalon

NOTE: The Technical Board meetings take place every second Wednesday on 
https://meet.jit.si/DPDK at 3 pm UTC.
Meetings are public, and DPDK community members are welcome to attend.
Agenda and minutes can be found at http://core.dpdk.org/techboard/minutes

NOTE: Next meeting will be on Wednesday 2022-09-07 @3pm UTC, and will be 
chaired by Hemant.

1) Jim St Leger has resigned from the DPDK Governing Board.

2) User space event updates
a) All registration fees are waived. The money for existing 
registration has been/will be refunded.
b) Hackathon details finalized. Virtual attendance is not supported.
c) Techboard meeting will be held after Hackathon. Virtual attendance 
is supported. Nathan will ensure there is a meeting notice.
d) Storage request is clarified - agreed for S3 storage, starting with 
100G. This may be moved to glacier later (which is slower but cheaper).

3) Tech-writer hiring status
a) Thomas is reviewing some of the profiles Nathan has sent

4) DTS WG next steps
a) DTS WG patches have been with the community for some time; they need to be
reviewed.
b) Agreed that reviewing is not one person's responsibility, should be 
reviewed by the community.
c) DTS WG will send out the roadmap for 22.11 release
d) Agreed that merging the DTS WG with DPDK CI would provide more 
visibility for DTS WG. Honnappa to initiate a conversation with Lincoln and 
Aaron on this proposal.

5) Need a process to choose the minimum supported version for Meson
a) The process to identify the minimum supported version of Meson will 
be discussed in the DPDK dev mailing list.
b) Rules on deprecating supported distro versions
i) Running CentOS Stream, Ubuntu, Arch Linux, RedHat, Debian, 
Fedora, Alpine Linux, FreeBSD. There are enough compute cycles for running 
tests on all these distros.
ii) In general end-of-lifed versions should be dropped. However,
iii) For distros with long support life, we can drop the 
support after 5 years. However,
iv) For RHEL, dropping support should be discussed with 
TechBoard. For now, support for RHEL 6 and earlier can be dropped.
c) Owen Hilyard will send out a proposal on the supported distros and 
supported versions to the mailing list.

6) Agreed to stop using our own Elixir server and will use Bootlin's Elixir for 
DPDK. Thomas will change the required links.



Hackathon presentation

2022-09-06 Thread David Marchand
Hello all,

Here is the link to the quick presentation we gave to start the
Hackathon session in Arcachon.

Links to tools reports are in it:
https://docs.google.com/presentation/d/10Zgsl8GawP7TL1JufXkpPrYXkTwA8eRhzBiGWejYXxo/edit?usp=sharing

The branch that was used to enable some tools is in my github repo:
https://github.com/david-marchand/dpdk/commits/ci



-- 
David Marchand



Re: Hackathon presentation

2022-09-06 Thread David Marchand
On Tue, Sep 6, 2022 at 3:11 PM David Marchand  wrote:
>
> Hello all,
>
> Here is the link to the quick presentation we gave to start the
> Hackathon session in Arcachon.
>
> Links to tools reports are in it:

Fixed link:
https://docs.google.com/presentation/d/1f90_ZL4hRdLV8ynfstQ8LDBaQa1og9BM_Bb5P_QRDmU/edit?usp=sharing


>
> The branch that was used to enable some tools is in my github repo:
> https://github.com/david-marchand/dpdk/commits/ci


--
David Marchand



Re: [Patch v7 00/18] Introduce Microsoft Azure Network Adatper (MANA) PMD

2022-09-06 Thread Ferruh Yigit

On 9/6/2022 2:03 PM, Ferruh Yigit wrote:

On 9/3/2022 2:40 AM, lon...@linuxonhyperv.com wrote:



From: Long Li 

MANA is a network interface card to be used in the Azure cloud environment.
MANA provides safe access to user memory through memory registration. It has
IOMMU built into the hardware.

MANA uses IB verbs and RDMA layer to configure hardware resources. It
requires the corresponding RDMA kernel-mode and user-mode drivers.

The MANA RDMA kernel-mode driver is being reviewed at:
https://patchwork.kernel.org/project/netdevbpf/cover/1655345240-26411-1-git-send-email-lon...@linuxonhyperv.com/

The MANA RDMA user-mode driver is being reviewed at:
https://github.com/linux-rdma/rdma-core/pull/1177


Long Li (18):
   net/mana: add basic driver, build environment and doc
   net/mana: add device configuration and stop
   net/mana: add function to report support ptypes
   net/mana: add link update
   net/mana: add function for device removal interrupts
   net/mana: add device info
   net/mana: add function to configure RSS
   net/mana: add function to configure RX queues
   net/mana: add function to configure TX queues
   net/mana: implement memory registration
   net/mana: implement the hardware layer operations
   net/mana: add function to start/stop TX queues
   net/mana: add function to start/stop RX queues
   net/mana: add function to receive packets
   net/mana: add function to send packets
   net/mana: add function to start/stop device
   net/mana: add function to report queue stats
   net/mana: add function to support RX interrupts



Can you please send new versions of the patches as reply to previous 
versions, so all versions can be in same thread, using git send-email 
'--in-reply-to' argument?


More details in the contribution guide:
https://doc.dpdk.org/guides/contributing/patches.html#sending-patches



Also for next version, can you please fix warnings reported by 
'./devtools/check-git-log.sh'.


Re: [PATCH v2] net/pcap: fix timeout of stopping device

2022-09-06 Thread Stephen Hemminger
On Tue,  6 Sep 2022 16:05:11 +0800
Yiding Zhou  wrote:

> The pcap file will be synchronized to the disk when stopping the device.
> It takes a long time if the file is large that would cause the
> 'detach sync request' timeout when the device is closed under multi-process
> scenario.
> 
> This commit fixes the issue by using alarm handler to release dumper.
> 
> Fixes: 0ecfb6c04d54 ("net/pcap: move handler to process private")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Yiding Zhou 


I think you need to redesign the handshake if this is the case.
Forcing a 30 second delay at the end of all uses of pcap is not acceptable.


Re: [Patch v7 01/18] net/mana: add basic driver, build environment and doc

2022-09-06 Thread Stephen Hemminger
On Fri,  2 Sep 2022 18:40:43 -0700
lon...@linuxonhyperv.com wrote:

> From: Long Li 
> 
> MANA is a PCI device. It uses IB verbs to access hardware through the
> kernel RDMA layer. This patch introduces build environment and basic
> device probe functions.
> 
> Signed-off-by: Long Li 
> ---

You should add a reference to minimal required version of rdma-core.
Older versions won't work right.


Re: [PATCH v5] lib/eal: fix segfaults in exiting

2022-09-06 Thread Stephen Hemminger
On Tue,  6 Sep 2022 10:51:31 +0800
Zhichao Zeng  wrote:

>  
> +static void mark_forked(void)
> +{
> + is_forked++;
> +}
> +

This will end up counting application threads as well.

Also, it would need to be atomic.
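
For reference, one way to make the flag atomic is sketched below. It keeps the `is_forked` and `mark_forked` names from the quoted patch but uses C11 `<stdatomic.h>` and a boolean instead of a counter; both choices are assumptions made for illustration, and the two helper functions are hypothetical names. It only addresses the atomicity remark, not the other concerns raised in this review.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Static storage, so it starts out as false. */
static atomic_bool is_forked;

/* Child-side fork handler: runs in the child process after fork(). */
static void
mark_forked(void)
{
	atomic_store(&is_forked, true);
}

/* Hypothetical helper: register the handler once, e.g. at init time. */
static int
install_fork_marker(void)
{
	return pthread_atfork(NULL, NULL, mark_forked);
}

/* Hypothetical helper: query the flag, e.g. from a cleanup path. */
static bool
process_is_forked_child(void)
{
	return atomic_load(&is_forked);
}
```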

>  /* Launch threads, called at application init(). */
>  int
>  rte_eal_init(int argc, char **argv)
> @@ -1324,6 +1331,8 @@ rte_eal_init(int argc, char **argv)
>  
>   eal_mcfg_complete();
>  
> + pthread_atfork(NULL, NULL, mark_forked);
> +
>   return fctret;
>  }

>  int
>  rte_eal_cleanup(void)
>  {
> + if (is_forked)
> + return 0;
> +

rte_eal_cleanup is supposed to be called only once by application.


DPDK technical board meeting

2022-09-06 Thread Thomas Monjalon
The DPDK summit is starting!

After a hackathon this afternoon, we are going to have a technical board 
meeting.
As shown in the schedule, everybody is welcome to join remotely
at 18:00 french time (in 10 minutes) at this URL:
https://meet.jit.si/DPDK

See you there




[PATCH 3/6] service: reduce average case service core overhead

2022-09-06 Thread Mattias Rönnblom
Optimize service loop so that the starting point is the lowest-indexed
service mapped to the lcore in question, and terminate the loop at the
highest-indexed service.

While the worst case latency remains the same, this patch
significantly reduces the service framework overhead for the average
case. In particular, scenarios where an lcore only runs a single
service, or multiple services whose id values are close (e.g., three
services with ids 17, 18 and 22), show significant improvements.

The worst case is where the lcore has two services mapped to it; one
with service id 0 and the other with id 63.

On a service lcore serving a single service, the service loop overhead
is reduced from ~190 core clock cycles to ~46. (On an Intel Cascade
Lake generation Xeon.) On weakly ordered CPUs, the gain is larger,
since the loop included load-acquire atomic operations.

Signed-off-by: Mattias Rönnblom 
---
 lib/eal/common/rte_service.c | 14 ++
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/lib/eal/common/rte_service.c b/lib/eal/common/rte_service.c
index 87df04e3ac..4cac866792 100644
--- a/lib/eal/common/rte_service.c
+++ b/lib/eal/common/rte_service.c
@@ -464,7 +464,6 @@ static int32_t
 service_runner_func(void *arg)
 {
RTE_SET_USED(arg);
-   uint32_t i;
const int lcore = rte_lcore_id();
struct core_state *cs = &lcore_states[lcore];
 
@@ -478,10 +477,17 @@ service_runner_func(void *arg)
RUNSTATE_RUNNING) {
 
const uint64_t service_mask = cs->service_mask;
+   uint8_t start_id;
+   uint8_t end_id;
+   uint8_t i;
 
-   for (i = 0; i < RTE_SERVICE_NUM_MAX; i++) {
-   if (!service_registered(i))
-   continue;
+   if (service_mask == 0)
+   continue;
+
+   start_id = __builtin_ctzl(service_mask);
+   end_id = 64 - __builtin_clzl(service_mask);
+
+   for (i = start_id; i < end_id; i++) {
/* return value ignored as no change to code flow */
service_run(i, cs, service_mask, service_get(i), 1);
}
-- 
2.34.1
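
The range computation in the patch above can be tried in isolation with the sketch below. It is not the patch's code: `struct id_range`, `mask_to_range()` and `count_visited()` are hypothetical names, and it uses the `ll` variants of the GCC/Clang builtins so that the argument is 64 bits wide regardless of the size of `long`, whereas the patch uses `__builtin_ctzl`/`__builtin_clzl`.

```c
#include <stdint.h>

struct id_range {
	uint8_t start;	/* index of the lowest set bit */
	uint8_t end;	/* one past the index of the highest set bit */
};

/* mask must be non-zero: the builtins are undefined for a zero argument. */
static struct id_range
mask_to_range(uint64_t mask)
{
	struct id_range r;

	r.start = (uint8_t)__builtin_ctzll(mask);
	r.end = (uint8_t)(64 - __builtin_clzll(mask));
	return r;
}

/* Number of service ids a loop over [start, end) would visit. */
static unsigned int
count_visited(uint64_t mask)
{
	struct id_range r;

	if (mask == 0)
		return 0;
	r = mask_to_range(mask);
	return (unsigned int)(r.end - r.start);
}
```

For the example from the commit message (ids 17, 18 and 22) the loop visits 6 ids instead of 64; with ids 0 and 63 it still visits all 64, which matches the stated worst case.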



[PATCH 5/6] event/sw: report idle when no work is performed

2022-09-06 Thread Mattias Rönnblom
Have the SW event device conform to the service core convention, where
-EAGAIN is returned in case no work was performed.

Prior to this patch, for an idle SW event device, a service lcore load
estimate based on RTE_SERVICE_ATTR_CYCLES would suggest 48% core
load.

At 7% of its maximum capacity, the SW event device needs about 15% of
the available CPU cycles* to perform its duties, but
RTE_SERVICE_ATTR_CYCLES would suggest the SW service used 48% of the
service core.

After this change, load deduced from RTE_SERVICE_ATTR_CYCLES will only
be a minor overestimation of the actual cycles used.

* The SW scheduler becomes more efficient at higher loads.

Signed-off-by: Mattias Rönnblom 
---
 drivers/event/sw/sw_evdev.c   | 3 +--
 drivers/event/sw/sw_evdev.h   | 2 +-
 drivers/event/sw/sw_evdev_scheduler.c | 6 --
 3 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/event/sw/sw_evdev.c b/drivers/event/sw/sw_evdev.c
index f93313b31b..f14c6427dd 100644
--- a/drivers/event/sw/sw_evdev.c
+++ b/drivers/event/sw/sw_evdev.c
@@ -933,8 +933,7 @@ set_refill_once(const char *key __rte_unused, const char *value, void *opaque)
 static int32_t sw_sched_service_func(void *args)
 {
struct rte_eventdev *dev = args;
-   sw_event_schedule(dev);
-   return 0;
+   return sw_event_schedule(dev);
 }
 
 static int
diff --git a/drivers/event/sw/sw_evdev.h b/drivers/event/sw/sw_evdev.h
index 4fd1054470..8542b7d34d 100644
--- a/drivers/event/sw/sw_evdev.h
+++ b/drivers/event/sw/sw_evdev.h
@@ -295,7 +295,7 @@ uint16_t sw_event_enqueue_burst(void *port, const struct rte_event ev[],
 uint16_t sw_event_dequeue(void *port, struct rte_event *ev, uint64_t wait);
 uint16_t sw_event_dequeue_burst(void *port, struct rte_event *ev, uint16_t num,
uint64_t wait);
-void sw_event_schedule(struct rte_eventdev *dev);
+int32_t sw_event_schedule(struct rte_eventdev *dev);
 int sw_xstats_init(struct sw_evdev *dev);
 int sw_xstats_uninit(struct sw_evdev *dev);
 int sw_xstats_get_names(const struct rte_eventdev *dev,
diff --git a/drivers/event/sw/sw_evdev_scheduler.c b/drivers/event/sw/sw_evdev_scheduler.c
index 809a54d731..8bc21944f5 100644
--- a/drivers/event/sw/sw_evdev_scheduler.c
+++ b/drivers/event/sw/sw_evdev_scheduler.c
@@ -506,7 +506,7 @@ sw_schedule_pull_port_dir(struct sw_evdev *sw, uint32_t port_id)
return pkts_iter;
 }
 
-void
+int32_t
 sw_event_schedule(struct rte_eventdev *dev)
 {
struct sw_evdev *sw = sw_pmd_priv(dev);
@@ -517,7 +517,7 @@ sw_event_schedule(struct rte_eventdev *dev)
 
sw->sched_called++;
if (unlikely(!sw->started))
-   return;
+   return -EAGAIN;
 
do {
uint32_t in_pkts_this_iteration = 0;
@@ -610,4 +610,6 @@ sw_event_schedule(struct rte_eventdev *dev)
sw->sched_last_iter_bitmask = cqs_scheds_last_iter;
if (unlikely(sw->port_count >= 64))
sw->sched_last_iter_bitmask = UINT64_MAX;
+
+   return work_done ? 0 : -EAGAIN;
 }
-- 
2.34.1
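
As an illustration of the convention this patch adopts, the following is a minimal sketch of a service callback that returns 0 when it processed something and -EAGAIN when it found nothing to do. The work_queue type, the burst size of 32 and the function name are hypothetical; only the return-value convention comes from the patch series.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical work source: a counter of pending items. */
struct work_queue {
	uint32_t pending;
};

/* Hypothetical service callback following the convention: 0 when at
 * least one item was processed, -EAGAIN when the call was idle. */
static int32_t
example_service_func(void *args)
{
	struct work_queue *q = args;
	uint32_t burst = q->pending < 32 ? q->pending : 32;

	if (burst == 0)
		return -EAGAIN;

	q->pending -= burst;
	return 0;
}
```

A callback written this way keeps working with a framework that ignores the return value, which fits the backward-compatibility note in patch 4/6.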



[PATCH 4/6] service: tweak cycle statistics semantics

2022-09-06 Thread Mattias Rönnblom
As a part of its service function, a service usually polls some kind
of source (e.g., an RX queue, a ring, an eventdev port, or a timer
wheel) to retrieve one or more items of work.

In low-load situations, the service framework reports a significant
amount of cycles spent for all running services, despite the fact they
have performed little or no actual work.

The per-call cycle expenditure for an idle service (i.e., a service
currently without pending jobs) is typically very low. Polling an
empty ring or RX queue is inexpensive. However, since the service
function call frequency on an idle or lightly loaded lcore is going to
be very high indeed, the service function calls' cycles add up to a
significant amount. The only thing preventing the idle services'
cycle counters from making up 100% of the available CPU cycles is the
overhead of the service framework itself.

If RTE_SERVICE_ATTR_CYCLES or RTE_SERVICE_LCORE_ATTR_CYCLES is
used to estimate service core load, the cores may look very busy when
the system is mostly doing nothing useful at all.

This patch allows for an idle service to indicate that no actual work
was performed during a particular service function call (by returning
-EAGAIN). In such cases the RTE_SERVICE_ATTR_CYCLES and
RTE_SERVICE_LCORE_ATTR_CYCLES values are not incremented.

The convention of returning -EAGAIN for idle services may in the
future also be used to have the lcore enter a short sleep, or reduce
its operating frequency, in case all services are currently idle.

This change is backward-compatible.

Signed-off-by: Mattias Rönnblom 
---
 lib/eal/common/rte_service.c| 22 ++
 lib/eal/include/rte_service_component.h |  5 +
 2 files changed, 19 insertions(+), 8 deletions(-)

diff --git a/lib/eal/common/rte_service.c b/lib/eal/common/rte_service.c
index 4cac866792..123610688c 100644
--- a/lib/eal/common/rte_service.c
+++ b/lib/eal/common/rte_service.c
@@ -10,6 +10,7 @@
 #include 
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -364,24 +365,29 @@ service_runner_do_callback(struct rte_service_spec_impl *s,
 
if (service_stats_enabled(s)) {
uint64_t start = rte_rdtsc();
-   s->spec.callback(userdata);
-   uint64_t end = rte_rdtsc();
-   uint64_t cycles = end - start;
+   int rc = s->spec.callback(userdata);
 
/* The lcore service worker thread is the only writer,
 * and thus only a non-atomic load and an atomic store
 * is needed, and not the more expensive atomic
 * add.
 */
-   __atomic_store_n(&cs->cycles, cs->cycles + cycles,
-__ATOMIC_RELAXED);
+
+   if (likely(rc != -EAGAIN)) {
+   uint64_t end = rte_rdtsc();
+   uint64_t cycles = end - start;
+
+   __atomic_store_n(&cs->cycles, cs->cycles + cycles,
+__ATOMIC_RELAXED);
+   __atomic_store_n(&cs->cycles_per_service[service_idx],
+cs->cycles_per_service[service_idx] +
+cycles, __ATOMIC_RELAXED);
+   }
+
__atomic_store_n(&cs->calls_per_service[service_idx],
 cs->calls_per_service[service_idx] + 1,
 __ATOMIC_RELAXED);
 
-   __atomic_store_n(&cs->cycles_per_service[service_idx],
-cs->cycles_per_service[service_idx] + cycles,
-__ATOMIC_RELAXED);
} else
s->spec.callback(userdata);
 }
diff --git a/lib/eal/include/rte_service_component.h b/lib/eal/include/rte_service_component.h
index 9e66ee7e29..9be49d698a 100644
--- a/lib/eal/include/rte_service_component.h
+++ b/lib/eal/include/rte_service_component.h
@@ -19,6 +19,11 @@ extern "C" {
 
 /**
  * Signature of callback function to run a service.
+ *
+ * A service function call resulting in no actual work being
+ * performed, should return -EAGAIN. In that case, the (presumably few)
+ * cycles spent will not be counted toward the lcore or service-level
+ * cycles attributes.
  */
 typedef int32_t (*rte_service_func)(void *args);
 
-- 
2.34.1
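
The accounting rule introduced above can be sketched in isolation as follows. This is a simplified model, not the rte_service.c code: the lcore_stats struct and account_call() are hypothetical, and the two timestamps are passed in as parameters (standing in for the rte_rdtsc() reads around the callback) so that the behaviour is deterministic.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical per-lcore statistics, loosely modelled on the patch. */
struct lcore_stats {
	uint64_t calls;
	uint64_t cycles;
};

typedef int32_t (*service_func)(void *args);

/* Runs one service call and accounts for it. start and end stand in
 * for two timestamp-counter reads taken around the call. */
static void
account_call(struct lcore_stats *st, service_func fn, void *args,
	     uint64_t start, uint64_t end)
{
	int32_t rc = fn(args);

	/* Idle calls still count as calls, but not as cycles. */
	if (rc != -EAGAIN)
		st->cycles += end - start;
	st->calls++;
}

/* Two trivial callbacks used to exercise both paths. */
static int32_t busy_service(void *args) { (void)args; return 0; }
static int32_t idle_service(void *args) { (void)args; return -EAGAIN; }
```

Note that the call counter advances in both cases, so calls-per-second figures remain meaningful while the cycle counters reflect only calls that did work.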



[PATCH 2/6] service: introduce per-lcore cycles counter

2022-09-06 Thread Mattias Rönnblom
Introduce a per-lcore counter for the total time spent on processing
services on that core.

This counter is useful when measuring individual lcore load.

Signed-off-by: Mattias Rönnblom 
---
 app/test/test_service_cores.c |  2 +-
 lib/eal/common/rte_service.c  | 14 ++
 lib/eal/include/rte_service.h |  6 ++
 3 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/app/test/test_service_cores.c b/app/test/test_service_cores.c
index 7415b6b686..096405133b 100644
--- a/app/test/test_service_cores.c
+++ b/app/test/test_service_cores.c
@@ -403,7 +403,7 @@ service_lcore_attr_get(void)
"lcore_attr_get() failed to get loops "
"(expected > zero)");
 
-   lcore_attr_id++;  // invalid lcore attr id
+   lcore_attr_id = 42; /* invalid lcore attr id */
TEST_ASSERT_EQUAL(-EINVAL, rte_service_lcore_attr_get(slcore_id,
lcore_attr_id, &lcore_attr_value),
"Invalid lcore attr didn't return -EINVAL");
diff --git a/lib/eal/common/rte_service.c b/lib/eal/common/rte_service.c
index b5103f2a20..87df04e3ac 100644
--- a/lib/eal/common/rte_service.c
+++ b/lib/eal/common/rte_service.c
@@ -61,6 +61,7 @@ struct core_state {
uint8_t is_service_core; /* set if core is currently a service core */
uint8_t service_active_on_lcore[RTE_SERVICE_NUM_MAX];
uint64_t loops;
+   uint64_t cycles;
uint64_t calls_per_service[RTE_SERVICE_NUM_MAX];
uint64_t cycles_per_service[RTE_SERVICE_NUM_MAX];
 } __rte_cache_aligned;
@@ -372,6 +373,8 @@ service_runner_do_callback(struct rte_service_spec_impl *s,
 * is needed, and not the more expensive atomic
 * add.
 */
+   __atomic_store_n(&cs->cycles, cs->cycles + cycles,
+__ATOMIC_RELAXED);
__atomic_store_n(&cs->calls_per_service[service_idx],
 cs->calls_per_service[service_idx] + 1,
 __ATOMIC_RELAXED);
@@ -812,6 +815,14 @@ lcore_attr_get_loops(unsigned int lcore)
return __atomic_load_n(&cs->loops, __ATOMIC_RELAXED);
 }
 
+static uint64_t
+lcore_attr_get_cycles(unsigned int lcore)
+{
+   struct core_state *cs = &lcore_states[lcore];
+
+   return __atomic_load_n(&cs->cycles, __ATOMIC_RELAXED);
+}
+
 static uint64_t
 lcore_attr_get_service_calls(uint32_t service_id, unsigned int lcore)
 {
@@ -896,6 +907,9 @@ rte_service_lcore_attr_get(uint32_t lcore, uint32_t attr_id,
case RTE_SERVICE_LCORE_ATTR_LOOPS:
*attr_value = lcore_attr_get_loops(lcore);
return 0;
+   case RTE_SERVICE_LCORE_ATTR_CYCLES:
+   *attr_value = lcore_attr_get_cycles(lcore);
+   return 0;
default:
return -EINVAL;
}
diff --git a/lib/eal/include/rte_service.h b/lib/eal/include/rte_service.h
index 35d8018684..70deb6e53a 100644
--- a/lib/eal/include/rte_service.h
+++ b/lib/eal/include/rte_service.h
@@ -407,6 +407,12 @@ int32_t rte_service_attr_reset_all(uint32_t id);
  */
 #define RTE_SERVICE_LCORE_ATTR_LOOPS 0
 
+/**
+ * Returns the total number of cycles that the lcore has spent on
+ * running services.
+ */
+#define RTE_SERVICE_LCORE_ATTR_CYCLES 1
+
 /**
  * Get an attribute from a service core.
  *
-- 
2.34.1
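
One way such a counter might be used to measure lcore load is to sample it twice and divide the busy-cycle delta by the elapsed timestamp-counter delta. The sketch below assumes that approach; the load_sample struct and lcore_load_percent() are hypothetical helpers, and how the two values are obtained (for example via rte_service_lcore_attr_get() and a TSC read) is left to the application.

```c
#include <stdint.h>

/* One sample of a hypothetical (busy cycles, timestamp) pair. In a
 * real application the first value might come from
 * RTE_SERVICE_LCORE_ATTR_CYCLES and the second from a TSC read. */
struct load_sample {
	uint64_t busy_cycles;
	uint64_t tsc;
};

/* Returns the load over the interval between two samples, in percent. */
static unsigned int
lcore_load_percent(const struct load_sample *prev,
		   const struct load_sample *cur)
{
	uint64_t busy = cur->busy_cycles - prev->busy_cycles;
	uint64_t total = cur->tsc - prev->tsc;

	if (total == 0)
		return 0;
	if (busy > total)
		busy = total; /* clamp: the two values are not read atomically */
	return (unsigned int)(busy * 100 / total);
}
```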



[PATCH 1/6] service: reduce statistics overhead for parallel services

2022-09-06 Thread Mattias Rönnblom
Move the statistics from the service data structure to the per-lcore
struct. This eliminates contention for the counter cache lines, which
decreases the producer-side statistics overhead for services deployed
across many lcores.

Prior to this patch, enabling statistics for a service with a
per-service function call latency of 1000 clock cycles deployed across
16 cores on an Intel Xeon 6230N @ 2.3 GHz would incur a cost of ~1
core clock cycles per service call. After this patch, the statistics
overhead is reduced to 22 clock cycles per call.

Signed-off-by: Mattias Rönnblom 
---
 lib/eal/common/rte_service.c | 182 +++
 1 file changed, 121 insertions(+), 61 deletions(-)

diff --git a/lib/eal/common/rte_service.c b/lib/eal/common/rte_service.c
index 94cb056196..b5103f2a20 100644
--- a/lib/eal/common/rte_service.c
+++ b/lib/eal/common/rte_service.c
@@ -50,17 +50,8 @@ struct rte_service_spec_impl {
 * on currently.
 */
uint32_t num_mapped_cores;
-
-   /* 32-bit builds won't naturally align a uint64_t, so force alignment,
-* allowing regular reads to be atomic.
-*/
-   uint64_t calls __rte_aligned(8);
-   uint64_t cycles_spent __rte_aligned(8);
 } __rte_cache_aligned;
 
-/* Mask used to ensure uint64_t 8 byte vars are naturally aligned. */
-#define RTE_SERVICE_STAT_ALIGN_MASK (8 - 1)
-
 /* the internal values of a service core */
 struct core_state {
/* map of services IDs are run on this core */
@@ -71,6 +62,7 @@ struct core_state {
uint8_t service_active_on_lcore[RTE_SERVICE_NUM_MAX];
uint64_t loops;
uint64_t calls_per_service[RTE_SERVICE_NUM_MAX];
+   uint64_t cycles_per_service[RTE_SERVICE_NUM_MAX];
 } __rte_cache_aligned;
 
 static uint32_t rte_service_count;
@@ -138,13 +130,16 @@ rte_service_finalize(void)
rte_service_library_initialized = 0;
 }
 
-/* returns 1 if service is registered and has not been unregistered
- * Returns 0 if service never registered, or has been unregistered
- */
-static inline int
+static inline bool
+service_registered(uint32_t id)
+{
+   return rte_services[id].internal_flags & SERVICE_F_REGISTERED;
+}
+
+static inline bool
 service_valid(uint32_t id)
 {
-   return !!(rte_services[id].internal_flags & SERVICE_F_REGISTERED);
+   return id < RTE_SERVICE_NUM_MAX && service_registered(id);
 }
 
 static struct rte_service_spec_impl *
@@ -155,7 +150,7 @@ service_get(uint32_t id)
 
 /* validate ID and retrieve service pointer, or return error value */
 #define SERVICE_VALID_GET_OR_ERR_RET(id, service, retval) do {  \
-   if (id >= RTE_SERVICE_NUM_MAX || !service_valid(id))\
+   if (!service_valid(id)) \
return retval;  \
service = &rte_services[id];\
 } while (0)
@@ -217,7 +212,7 @@ rte_service_get_by_name(const char *name, uint32_t *service_id)
 
int i;
for (i = 0; i < RTE_SERVICE_NUM_MAX; i++) {
-   if (service_valid(i) &&
+   if (service_registered(i) &&
strcmp(name, rte_services[i].spec.name) == 0) {
*service_id = i;
return 0;
@@ -254,7 +249,7 @@ rte_service_component_register(const struct rte_service_spec *spec,
return -EINVAL;
 
for (i = 0; i < RTE_SERVICE_NUM_MAX; i++) {
-   if (!service_valid(i)) {
+   if (!service_registered(i)) {
free_slot = i;
break;
}
@@ -366,29 +361,24 @@ service_runner_do_callback(struct rte_service_spec_impl *s,
 {
void *userdata = s->spec.callback_userdata;
 
-   /* Ensure the atomically stored variables are naturally aligned,
-* as required for regular loads to be atomic.
-*/
-   RTE_BUILD_BUG_ON((offsetof(struct rte_service_spec_impl, calls)
-   & RTE_SERVICE_STAT_ALIGN_MASK) != 0);
-   RTE_BUILD_BUG_ON((offsetof(struct rte_service_spec_impl, cycles_spent)
-   & RTE_SERVICE_STAT_ALIGN_MASK) != 0);
-
if (service_stats_enabled(s)) {
uint64_t start = rte_rdtsc();
s->spec.callback(userdata);
uint64_t end = rte_rdtsc();
uint64_t cycles = end - start;
-   cs->calls_per_service[service_idx]++;
-   if (service_mt_safe(s)) {
-   __atomic_fetch_add(&s->cycles_spent, cycles, __ATOMIC_RELAXED);
-   __atomic_fetch_add(&s->calls, 1, __ATOMIC_RELAXED);
-   } else {
-   uint64_t cycles_new = s->cycles_spent + cycles;
-   uint64_t calls_new = s->calls++;
-   __atomic_store_n(&s->cycles_spent, cycles_new, __ATOMIC_RELAXED);
-   __atomic_store_n(&s->c
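
The archived copy of this patch is cut off above. As a rough illustration of the idea in its commit message (per-lcore counters with a single writer each), here is a minimal standalone sketch. All names and array sizes are hypothetical, cache-line alignment is omitted, and the reader-side summation is an assumption about how per-service totals could be produced, not a quote from the patch.

```c
#include <stdint.h>

#define EXAMPLE_MAX_LCORES 4   /* hypothetical, kept small for the sketch */
#define EXAMPLE_MAX_SERVICES 8 /* hypothetical */

/* Each lcore owns one of these; cache-line padding is omitted. */
struct example_core_state {
	uint64_t calls_per_service[EXAMPLE_MAX_SERVICES];
	uint64_t cycles_per_service[EXAMPLE_MAX_SERVICES];
};

static struct example_core_state example_states[EXAMPLE_MAX_LCORES];

/* Writer side: only the owning lcore calls this, so a plain load
 * followed by a relaxed atomic store suffices; no atomic add is needed. */
static void
example_record(unsigned int lcore, unsigned int service, uint64_t cycles)
{
	struct example_core_state *cs = &example_states[lcore];

	__atomic_store_n(&cs->calls_per_service[service],
			 cs->calls_per_service[service] + 1,
			 __ATOMIC_RELAXED);
	__atomic_store_n(&cs->cycles_per_service[service],
			 cs->cycles_per_service[service] + cycles,
			 __ATOMIC_RELAXED);
}

/* Reader side: sum the per-lcore values for one service. */
static uint64_t
example_service_cycles(unsigned int service)
{
	uint64_t sum = 0;
	unsigned int i;

	for (i = 0; i < EXAMPLE_MAX_LCORES; i++)
		sum += __atomic_load_n(
			&example_states[i].cycles_per_service[service],
			__ATOMIC_RELAXED);
	return sum;
}
```

Because no two lcores write to the same counter, the writers never contend for a cache line; the cost moves to the reader, which has to walk all lcores.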

[PATCH 6/6] service: provide links to functions in documentation

2022-09-06 Thread Mattias Rönnblom
Refer to API functions with parentheses, making doxygen create
hyperlinks.

Signed-off-by: Mattias Rönnblom 
---
 lib/eal/include/rte_service.h | 26 +-
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/lib/eal/include/rte_service.h b/lib/eal/include/rte_service.h
index 70deb6e53a..90116a773a 100644
--- a/lib/eal/include/rte_service.h
+++ b/lib/eal/include/rte_service.h
@@ -37,7 +37,7 @@ extern "C" {
 
 /* Capabilities of a service.
  *
- * Use the *rte_service_probe_capability* function to check if a service is
+ * Use the rte_service_probe_capability() function to check if a service is
  * capable of a specific capability.
  */
 /** When set, the service is capable of having multiple threads run it at the
@@ -147,13 +147,13 @@ int32_t rte_service_map_lcore_get(uint32_t service_id, uint32_t lcore);
 int32_t rte_service_runstate_set(uint32_t id, uint32_t runstate);
 
 /**
- * Get the runstate for the service with *id*. See *rte_service_runstate_set*
+ * Get the runstate for the service with *id*. See rte_service_runstate_set()
  * for details of runstates. A service can call this function to ensure that
  * the application has indicated that it will receive CPU cycles. Either a
  * service-core is mapped (default case), or the application has explicitly
  * disabled the check that a service-cores is mapped to the service and takes
  * responsibility to run the service manually using the available function
- * *rte_service_run_iter_on_app_lcore* to do so.
+ * rte_service_run_iter_on_app_lcore() to do so.
  *
  * @retval 1 Service is running
  * @retval 0 Service is stopped
@@ -181,7 +181,7 @@ rte_service_may_be_active(uint32_t id);
 /**
  * Enable or disable the check for a service-core being mapped to the service.
  * An application can disable the check when takes the responsibility to run a
- * service itself using *rte_service_run_iter_on_app_lcore*.
+ * service itself using rte_service_run_iter_on_app_lcore().
  *
  * @param id The id of the service to set the check on
  * @param enable When zero, the check is disabled. Non-zero enables the check.
@@ -216,7 +216,7 @@ int32_t rte_service_set_runstate_mapped_check(uint32_t id, int32_t enable);
  *   atomics, applications can choose to enable or disable this feature
  *
  * Note that any thread calling this function MUST be a DPDK EAL thread, as
- * the *rte_lcore_id* function is used to access internal data structures.
+ * the rte_lcore_id() function is used to access internal data structures.
  *
  * @retval 0 Service was run on the calling thread successfully
  * @retval -EBUSY Another lcore is executing the service, and it is not a
@@ -232,7 +232,7 @@ int32_t rte_service_run_iter_on_app_lcore(uint32_t id,
  *
  * Starting a core makes the core begin polling. Any services assigned to it
  * will be run as fast as possible. The application must ensure that the lcore
- * is in a launchable state: e.g. call *rte_eal_lcore_wait* on the lcore_id
+ * is in a launchable state: e.g. call rte_eal_lcore_wait() on the lcore_id
  * before calling this function.
  *
  * @retval 0 Success
@@ -248,7 +248,7 @@ int32_t rte_service_lcore_start(uint32_t lcore_id);
  * service core. Note that the service lcore thread may not have returned from
  * the service it is running when this API returns.
  *
- * The *rte_service_lcore_may_be_active* API can be used to check if the
+ * The rte_service_lcore_may_be_active() API can be used to check if the
  * service lcore is * still active.
  *
  * @retval 0 Success
@@ -265,7 +265,7 @@ int32_t rte_service_lcore_stop(uint32_t lcore_id);
  * Reports if a service lcore is currently running.
  *
  * This function returns if the core has finished service cores code, and has
- * returned to EAL control. If *rte_service_lcore_stop* has been called but
+ * returned to EAL control. If rte_service_lcore_stop() has been called but
  * the lcore has not returned to EAL yet, it might be required to wait and call
  * this function again. The amount of time to wait before the core returns
  * depends on the duration of the services being run.
@@ -293,7 +293,7 @@ int32_t rte_service_lcore_add(uint32_t lcore);
 /**
  * Removes lcore from the list of service cores.
  *
- * This can fail if the core is not stopped, see *rte_service_core_stop*.
+ * This can fail if the core is not stopped, see rte_service_core_stop().
  *
  * @retval 0 Success
  * @retval -EBUSY Lcore is not stopped, stop service core before removing.
@@ -308,7 +308,7 @@ int32_t rte_service_lcore_del(uint32_t lcore);
  * service core count can be used in mapping logic when creating mappings
  * from service cores to services.
  *
- * See *rte_service_lcore_list* for details on retrieving the lcore_id of each
+ * See rte_service_lcore_list() for details on retrieving the lcore_id of each
  * service core.
  *
  * @return The number of service cores currently configured.
@@ -344,14 +344,14 @@ int32_t rte_service_set_stats_enable(uint32_t i

RE: [PATCH] SCSY-192443 Bug fix DPDK-dumpcap for interface parameter

2022-09-06 Thread Pattan, Reshma



> -Original Message-
> From: Kaur, Arshdeep 
> Subject: [PATCH] SCSY-192443 Bug fix DPDK-dumpcap for interface
> parameter
> 
> Bug: IF condition to handle -i parameter was incorrect.
> 

Hi,


Remove SCSY* from the heading.
No need to add the word "Bug" in the heading.
The heading should have the name of the component you are fixing, that is, "dumpcap".
So, the heading can be as simple as "dumpcap: fix interface parameter check".
You need to add a Fixes line to the commit message for any fix. Refer to section 
7.7 in the link below on how to add the Fixes line. 
https://doc.dpdk.org/guides/contributing/patches.html
With these changes you can send the V2.

Thanks,
Reshma





RE: [PATCH v2] net/pcap: fix timeout of stopping device

2022-09-06 Thread Zhou, YidingX
On Tue,  6 Sep 2022 16:05:11 +0800
Yiding Zhou  wrote:

> The pcap file will be synchronized to the disk when stopping the device.
> It takes a long time if the file is large that would cause the 'detach 
> sync request' timeout when the device is closed under multi-process 
> scenario.
> 
> This commit fixes the issue by using alarm handler to release dumper.
> 
> Fixes: 0ecfb6c04d54 ("net/pcap: move handler to process private")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Yiding Zhou 


I think you need to redesign the handshake if this is the case.
Forcing 30 second delay at the end of all uses of pcap is not acceptable.

Thanks for your comments.
According to my test, the time required to sync a 100G pcap file is about 20s, 
so I set a delay of 30s. I also tried to use AIO, but ran into some issues in the 
multi-process scenario.
I also considered io_uring, but it is only supported in Linux 5.x, so we need to 
consider compatibility.
Maybe a better way is to do more work to redesign the handshake.


RE: [PATCH] SCSY-192443 Fix DPDK-dumpcap for XRAN Library

2022-09-06 Thread Pattan, Reshma



> -Original Message-
> From: Kaur, Arshdeep 
> Subject: [PATCH] SCSY-192443 Fix DPDK-dumpcap for XRAN Library
>  

> Issue: By default, dpdk-dumpcap tries to listen to
> /var/run/dpdk/rte/mp_socket, whereas in flexran a socket is created at
> /var/run/dpdk//mp_socket. File prefix is provided to flexran via
> config files which is used in EAL options. There is no way in dpdk-dumpcap
> today to provide file-prefix.
> 
> Fix: Added a new parameter "-m" to handle this requirement. User needs to
> provide "-m  as first argument to dpdk-dumpcap.



Remove SCSY* from heading

The heading should begin with the name of the component that you are modifying, 
that is, "dumpcap:".

State exactly what you are adding in this patch. No need to mention for whom you 
are doing this patch.

So, the heading can be as simple as "dumpcap: add multiprocess file-prefix 
support".

With these changes you can send the V2.

This is not an issue and you are not fixing anything. This is new support you 
are adding, so you need to reframe your commit message.

Thanks,
Reshma


[RFC PATCH 0/1] Add support for code-coverage analysis

2022-09-06 Thread Felix Moessbauer
This patch has been developed as part of the DPDK Userspace Summit Hackathon.
It provides a PoC for code-coverage analysis for the DPDK project.

To generate the report, a developer simply follows the official
meson coverage workflow, described in [1].
In doing so, both an HTML report and an XML version are generated
for further processing.

In short, the following steps are required:

- install gcovr
- meson -Db_coverage=true build-cov
- meson compile -C build-cov
- meson test -C build-cov --suite fast-tests
- ninja coverage -C build-cov

[1] https://mesonbuild.com/howtox.html#producing-a-coverage-report

Best regards,
Felix Moessbauer
Siemens AG

Felix Moessbauer (1):
  Add basic support for code coverage analysis

 gcovr.cfg | 8 
 1 file changed, 8 insertions(+)
 create mode 100644 gcovr.cfg

-- 
2.30.2



[RFC PATCH 1/1] Add basic support for code coverage analysis

2022-09-06 Thread Felix Moessbauer
This patch adds basic support to get meaningful
code coverage statistics for some central components.
To keep things simple, we only focus on the parts that are
tested as part of the "fast-tests" suite.
This includes the lib as well as drivers that do not require
special hardware to be tested.

By providing the gcovr.cfg file in the project root,
modern versions of meson (>=0.63) can pass that information
to gcovr, making it possible to configure the coverage target
of meson.
This enables us to use the default meson coverage infrastructure
and customize it for the needs of the DPDK project.

Signed-off-by: Felix Moessbauer 
Acked-by: William Lam 
Acked-by: Chriss Windle 
---
 gcovr.cfg | 8 
 1 file changed, 8 insertions(+)
 create mode 100644 gcovr.cfg

diff --git a/gcovr.cfg b/gcovr.cfg
new file mode 100644
index 00..6e247499e8
--- /dev/null
+++ b/gcovr.cfg
@@ -0,0 +1,8 @@
+filter = lib
+filter = drivers/bus/pci
+filter = drivers/bus/vdev
+filter = drivers/mempool/ring
+filter = drivers/event/skeleton
+filter = drivers/net/ring
+filter = drivers/net/null
+filter = drivers/raw/skeleton
-- 
2.30.2



[PATCH v2] eal: fix data race in multi-process support

2022-09-06 Thread Stephen Hemminger
If DPDK is built with thread sanitizer it reports a race
in setting of multiprocess file descriptor. The fix is to
use atomic operations when updating mp_fd.

Build:
$ meson -Db_sanitize=thread build
$ ninja -C build

Simple example:
$ ./build/app/dpdk-testpmd -l 1-3 --no-huge
EAL: Detected CPU lcores: 16
EAL: Detected NUMA nodes: 1
EAL: Static memory layout is selected, amount of reserved memory can be 
adjusted with -m or --socket-mem
EAL: Detected static linkage of DPDK
EAL: Multi-process socket /run/user/1000/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'VA'
testpmd: No probed ethernet devices
testpmd: create a new mbuf pool : n=163456, size=2176, socket=0
testpmd: preferred mempool ops selected: ring_mp_mc
EAL: Error - exiting with code: 1
  Cause: Creation of mbuf pool for socket 0 failed: Cannot allocate memory
==
WARNING: ThreadSanitizer: data race (pid=87245)
  Write of size 4 at 0x558e04d8ff70 by main thread:
#0 rte_mp_channel_cleanup  (dpdk-testpmd+0x1e7d30c)
#1 rte_eal_cleanup  (dpdk-testpmd+0x1e85929)
#2 rte_exit  (dpdk-testpmd+0x1e5bc0a)
#3 mbuf_pool_create.cold  (dpdk-testpmd+0x274011)
#4 main  (dpdk-testpmd+0x5cc15d)

  Previous read of size 4 at 0x558e04d8ff70 by thread T2:
#0 mp_handle  (dpdk-testpmd+0x1e7c439)
#1 ctrl_thread_init  (dpdk-testpmd+0x1e6ee1e)

  As if synchronized via sleep:
#0 nanosleep 
../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:366 
(libtsan.so.0+0x6075e)
#1 get_tsc_freq  (dpdk-testpmd+0x1e92ff9)
#2 set_tsc_freq  (dpdk-testpmd+0x1e6f2fc)
#3 rte_eal_timer_init  (dpdk-testpmd+0x1e931a4)
#4 rte_eal_init.cold  (dpdk-testpmd+0x29e578)
#5 main  (dpdk-testpmd+0x5cbc45)

  Location is global 'mp_fd' of size 4 at 0x558e04d8ff70 
(dpdk-testpmd+0x03122f70)

  Thread T2 'rte_mp_handle' (tid=87248, running) created by main thread at:
#0 pthread_create 
../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:969 
(libtsan.so.0+0x5ad75)
#1 rte_ctrl_thread_create  (dpdk-testpmd+0x1e6efd0)
#2 rte_mp_channel_init.cold  (dpdk-testpmd+0x29cb7c)
#3 rte_eal_init  (dpdk-testpmd+0x1e8662e)
#4 main  (dpdk-testpmd+0x5cbc45)

SUMMARY: ThreadSanitizer: data race 
(/home/shemminger/DPDK/main/build/app/dpdk-testpmd+0x1e7d30c) in 
rte_mp_channel_cleanup
==
ThreadSanitizer: reported 1 warnings

Fixes: bacaa2754017 ("eal: add channel for multi-process communication")
Signed-off-by: Stephen Hemminger 
Acked-by: Anatoly Burakov 
---
 lib/eal/common/eal_common_proc.c | 17 -
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/lib/eal/common/eal_common_proc.c b/lib/eal/common/eal_common_proc.c
index 313060528fec..1fc1d6c53bd2 100644
--- a/lib/eal/common/eal_common_proc.c
+++ b/lib/eal/common/eal_common_proc.c
@@ -260,7 +260,7 @@ rte_mp_action_unregister(const char *name)
 }
 
 static int
-read_msg(struct mp_msg_internal *m, struct sockaddr_un *s)
+read_msg(int fd, struct mp_msg_internal *m, struct sockaddr_un *s)
 {
int msglen;
struct iovec iov;
@@ -281,7 +281,7 @@ read_msg(struct mp_msg_internal *m, struct sockaddr_un *s)
msgh.msg_controllen = sizeof(control);
 
 retry:
-   msglen = recvmsg(mp_fd, &msgh, 0);
+   msglen = recvmsg(fd, &msgh, 0);
 
/* zero length message means socket was closed */
if (msglen == 0)
@@ -390,11 +390,12 @@ mp_handle(void *arg __rte_unused)
 {
struct mp_msg_internal msg;
struct sockaddr_un sa;
+   int fd;
 
-   while (mp_fd >= 0) {
+   while ((fd = __atomic_load_n(&mp_fd, __ATOMIC_RELAXED)) >= 0) {
int ret;
 
-   ret = read_msg(&msg, &sa);
+   ret = read_msg(fd, &msg, &sa);
if (ret <= 0)
break;
 
@@ -638,9 +639,8 @@ rte_mp_channel_init(void)
NULL, mp_handle, NULL) < 0) {
RTE_LOG(ERR, EAL, "failed to create mp thread: %s\n",
strerror(errno));
-   close(mp_fd);
close(dir_fd);
-   mp_fd = -1;
+   close(__atomic_exchange_n(&mp_fd, -1, __ATOMIC_RELAXED));
return -1;
}
 
@@ -656,11 +656,10 @@ rte_mp_channel_cleanup(void)
 {
int fd;
 
-   if (mp_fd < 0)
+   fd = __atomic_exchange_n(&mp_fd, -1, __ATOMIC_RELAXED);
+   if (fd < 0)
return;
 
-   fd = mp_fd;
-   mp_fd = -1;
pthread_cancel(mp_handle_tid);
pthread_join(mp_handle_tid, NULL);
close_socket_fd(fd);
-- 
2.35.1
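
The pattern used in this fix can be shown in isolation: the cleanup path claims the descriptor with a single atomic exchange, and the reader takes one atomic snapshot and works only on that local copy. The sketch below is a minimal model with hypothetical names (example_fd, example_claim_fd, example_snapshot_fd); it uses the same GCC __atomic builtins as the patch but is not the eal_common_proc.c code.

```c
static int example_fd = -1; /* hypothetical shared descriptor */

/* Atomically takes ownership of the descriptor, leaving -1 behind.
 * Returns the previous value, or -1 if it was already taken. */
static int
example_claim_fd(void)
{
	return __atomic_exchange_n(&example_fd, -1, __ATOMIC_RELAXED);
}

/* Reader side: take one snapshot and use only that local copy. */
static int
example_snapshot_fd(void)
{
	return __atomic_load_n(&example_fd, __ATOMIC_RELAXED);
}
```

Since the exchange returns the old value and stores -1 in one step, only one caller can ever obtain a valid descriptor to close, even if cleanup were entered twice.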



RE: [PATCH v7 0/7] bbdev changes for 22.11

2022-09-06 Thread Chautru, Nicolas
Hi Akhil, 

Can this series be applied now?

I would notably like to rebase the other PMDs series based on main branch once 
this one is merged. 

Thanks!
Nic

> -Original Message-
> From: Chautru, Nicolas 
> Sent: Monday, August 29, 2022 11:07 AM
> To: dev@dpdk.org; tho...@monjalon.net; gak...@marvell.com;
> hemant.agra...@nxp.com
> Cc: maxime.coque...@redhat.com; t...@redhat.com; m...@ashroe.eu;
> Richardson, Bruce ;
> david.march...@redhat.com; step...@networkplumber.org; Zhang,
> Mingshan ; Chautru, Nicolas
> 
> Subject: [PATCH v7 0/7] bbdev changes for 22.11
> 
> v7: couple of typos in documentation spotted by Maxime. Thanks.
> v6: added one comment in commit 2/7 suggested by Maxime.
> v5: update base on review from Tom Rix. Number of typos reported and
> resolved, removed the commit related to rw_lock for now, added a commit
> for code clean up from review, resolved one rebase issue between 2
> commits, used size of array for some bound check implementation. Thanks.
> v4: update to the last 2 commits to include function to print the queue status
> and a fix to the rte_lock within the wrong structure
> v3: update to device status info to also use padded size for the related 
> array.
> Also adding 2 additional commits to allow the API struct to expose more
> information related to queue corner cases/warnings as well as an optional
> rw lock.
> Hemant, Maxime, this is planned for DPDK 22.11 but I would like review/ack
> early if possible to get this applied earlier, due to time off this summer.
> Thanks
> Nic
> 
> Nicolas Chautru (7):
>   bbdev: allow operation type enum for growth
>   bbdev: add device status info
>   bbdev: add device info on queue topology
>   drivers/baseband: update PMDs to expose queue per operation
>   bbdev: add new operation for FFT processing
>   bbdev: add queue related warning and status information
>   bbdev: remove unnecessary if-check
> 
>  app/test-bbdev/test_bbdev.c|   2 +-
>  app/test-bbdev/test_bbdev_perf.c   |   6 +-
>  doc/guides/prog_guide/bbdev.rst| 130 +
>  drivers/baseband/acc100/rte_acc100_pmd.c   |  30 ++--
>  drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c |   9 ++
>  drivers/baseband/fpga_lte_fec/fpga_lte_fec.c   |   9 ++
>  drivers/baseband/la12xx/bbdev_la12xx.c |  10 +-
>  drivers/baseband/null/bbdev_null.c |   1 +
>  drivers/baseband/turbo_sw/bbdev_turbo_software.c   |  12 ++
>  examples/bbdev_app/main.c  |   2 +-
>  lib/bbdev/rte_bbdev.c  |  57 +++-
>  lib/bbdev/rte_bbdev.h  | 149 +++-
>  lib/bbdev/rte_bbdev_op.h   | 156 -
>  lib/bbdev/version.map  |  12 ++
>  14 files changed, 556 insertions(+), 29 deletions(-)
> 
> --
> 1.8.3.1



RE: [PATCH v2 00/37] baseband/acc100: changes for 22.11

2022-09-06 Thread Chautru, Nicolas
Hi Tom, Maxime, Hemant, 
Can we get a few reviews and/or acks for the patches in this series, please?
Much appreciated, 
Nic

> -Original Message-
> From: Vargas, Hernan 
> Sent: Friday, August 19, 2022 7:31 PM
> To: dev@dpdk.org; gak...@marvell.com; t...@redhat.com
> Cc: Chautru, Nicolas ; Zhang, Qi Z
> ; Vargas, Hernan 
> Subject: [PATCH v2 00/37] baseband/acc100: changes for 22.11
> 
> Upstreaming ACC100 changes for 22.11.
> This patch series is dependent on series:
> https://patches.dpdk.org/project/dpdk/patch/1657150110-69957
> 
> Hernan Vargas (37):
>   baseband/acc100: add enqueue status
>   baseband/acc100: update ring availability calculation
>   baseband/acc100: add function to check AQ availability
>   baseband/acc100: free SW ring mem for reconfiguration
>   baseband/acc100: memory leak fix
>   baseband/acc100: add default e value for FCW
>   baseband/acc100: add LDPC encoder padding function
>   baseband/acc100: add scatter-gather support
>   baseband/acc100: add HARQ index helper function
>   baseband/acc100: avoid mux for small inbound frames
>   baseband/acc100: separate validation functions from debug
>   baseband/acc100: add LDPC transport block support
>   baseband/acc10x: limit cases for HARQ pruning
>   baseband/acc100: update validate LDPC enc/dec
>   baseband/acc100: add workaround for deRM corner cases
>   baseband/acc100: add ring companion address
>   baseband/acc100: configure PMON control registers
>   baseband/acc100: implement configurable queue depth
>   baseband/acc100: add queue stop operation
>   baseband/acc100: check turbo dec/enc input
>   baseband/acc100: check for unlikely operation vals
>   baseband/acc100: enforce additional check on FCW
>   baseband/acc100: update uplink CB input length
>   baseband/acc100: rename ldpc encode function arg
>   baseband/acc100: update log messages
>   baseband/acc100: allocate ring/queue mem when NULL
>   baseband/acc100: store FCW from first CB descriptor
>   baseband/acc100: make desc optimization optional
>   baseband/acc100: update device info
>   baseband/acc100: reduce input length for CRC24B
>   baseband/acc100: fix clearing PF IR outside handler
>   baseband/acc100: fix debug print for LDPC FCW
>   baseband/acc100: set device min alignment to 1
>   baseband/acc100: update meson file sdk dependency
>   baseband/acc100: add protection for NULL HARQ input
>   baseband/acc100: make HARQ layout memory 4GB
>   baseband/acc100: reset pointer after rte_free
> 
>  drivers/baseband/acc100/acc100_pf_enum.h |   52 +-
>  drivers/baseband/acc100/acc100_pmd.h |   41 +-
>  drivers/baseband/acc100/acc100_vf_enum.h |6 +
>  drivers/baseband/acc100/meson.build  |   21 +
>  drivers/baseband/acc100/rte_acc100_pmd.c | 1388 ++
>  5 files changed, 1254 insertions(+), 254 deletions(-)
> 
> --
> 2.37.1



Re: [PATCH] vhost: use try_lock in rte_vhost_vring_call

2022-09-06 Thread Stephen Hemminger
On Tue,  6 Sep 2022 10:22:25 +0800
Changpeng Liu  wrote:

> Note that this function is in data path, so the thread context
> may not same as socket messages processing context, by using
> try_lock here, users can have another try in case of VQ's access
> lock is held by `vhost-events` thread.
> 
> Signed-off-by: Changpeng Liu 
> ---
>  lib/vhost/vhost.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/vhost/vhost.c b/lib/vhost/vhost.c
> index 60cb05a0ff..072d2acb7b 100644
> --- a/lib/vhost/vhost.c
> +++ b/lib/vhost/vhost.c
> @@ -1329,7 +1329,11 @@ rte_vhost_vring_call(int vid, uint16_t vring_idx)
>   if (!vq)
>   return -1;
>  
> - rte_spinlock_lock(&vq->access_lock);
> + if (!rte_spinlock_trylock(&vq->access_lock)) {
> + VHOST_LOG_CONFIG(dev->ifname, DEBUG,
> + "failed to kick guest, virtqueue busy.\n");
> + return -1;
> + }
>  

If this is a race, logging a message is not a good idea; the log will fill
up with this noise.

Instead, make it a statistic that can be seen via xstats.
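
The suggestion above (count the failed kick rather than log it) can be sketched in a self-contained way. This is only an illustration under assumptions: `toy_vq`, `kick_busy`, `toy_trylock()` and `toy_vring_call()` are hypothetical names, the lock is a plain C11 atomic rather than `rte_spinlock_t`, and nothing here is wired into the real vhost xstats code.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a virtqueue; not the real struct vhost_virtqueue. */
struct toy_vq {
	atomic_bool locked;             /* stands in for vq->access_lock */
	atomic_uint_fast64_t kick_busy; /* assumed counter: kicks skipped because the lock was held */
	uint64_t kicks;                 /* kicks delivered, only touched while holding the lock */
};

static void toy_vq_init(struct toy_vq *vq)
{
	atomic_init(&vq->locked, false);
	atomic_init(&vq->kick_busy, 0);
	vq->kicks = 0;
}

/* Non-blocking acquire: returns true only if the caller now owns the lock. */
static bool toy_trylock(struct toy_vq *vq)
{
	return !atomic_exchange_explicit(&vq->locked, true, memory_order_acquire);
}

static void toy_unlock(struct toy_vq *vq)
{
	atomic_store_explicit(&vq->locked, false, memory_order_release);
}

/*
 * Data-path kick: never blocks. On contention it bumps a counter instead of
 * writing a log line, so a busy queue costs one relaxed atomic add.
 * Returns 0 on success, -1 if the queue was busy.
 */
static int toy_vring_call(struct toy_vq *vq)
{
	if (!toy_trylock(vq)) {
		atomic_fetch_add_explicit(&vq->kick_busy, 1, memory_order_relaxed);
		return -1;
	}
	vq->kicks++; /* placeholder for the real guest notification */
	toy_unlock(vq);
	return 0;
}
```

A reader of the counter (for example a stats callback) would only need an atomic load of `kick_busy`; how such a value would be exported through xstats is left open here.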


RE: [PATCH] vhost: use try_lock in rte_vhost_vring_call

2022-09-06 Thread Liu, Changpeng



> -Original Message-
> From: Stephen Hemminger 
> Sent: Wednesday, September 7, 2022 5:16 AM
> To: Liu, Changpeng 
> Cc: dev@dpdk.org; Maxime Coquelin ; Xia,
> Chenbo 
> Subject: Re: [PATCH] vhost: use try_lock in rte_vhost_vring_call
> 
> On Tue,  6 Sep 2022 10:22:25 +0800
> Changpeng Liu  wrote:
> 
> > Note that this function is in data path, so the thread context
> > may not same as socket messages processing context, by using
> > try_lock here, users can have another try in case of VQ's access
> > lock is held by `vhost-events` thread.
> >
> > Signed-off-by: Changpeng Liu 
> > ---
> >  lib/vhost/vhost.c | 6 +-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/lib/vhost/vhost.c b/lib/vhost/vhost.c
> > index 60cb05a0ff..072d2acb7b 100644
> > --- a/lib/vhost/vhost.c
> > +++ b/lib/vhost/vhost.c
> > @@ -1329,7 +1329,11 @@ rte_vhost_vring_call(int vid, uint16_t vring_idx)
> > if (!vq)
> > return -1;
> >
> > -   rte_spinlock_lock(&vq->access_lock);
> > +   if (!rte_spinlock_trylock(&vq->access_lock)) {
> > +   VHOST_LOG_CONFIG(dev->ifname, DEBUG,
> > +   "failed to kick guest, virtqueue busy.\n");
> > +   return -1;
> > +   }
> >
> 
> If it is a race, logging a message is not a good idea; the log will fill
> with this noise.
> 
> Instead make it statistic that can be seen by xstats.
It's a DEBUG log; users won't see it in practice.


RE: [Patch v6 01/18] net/mana: add basic driver, build environment and doc

2022-09-06 Thread Long Li
> Subject: Re: [Patch v6 01/18] net/mana: add basic driver, build environment
> and doc
> 
> 
> On 2022/9/1 2:05, Long Li wrote:
> >> Subject: Re: [Patch v6 01/18] net/mana: add basic driver, build
> >> environment and doc
> >>
> >>
> >> On 2022/8/31 6:51, lon...@linuxonhyperv.com wrote:
> >>> From: Long Li 
> >>>
> >>> MANA is a PCI device. It uses IB verbs to access hardware through
> >>> the kernel RDMA layer. This patch introduces build environment and
> >>> basic device probe functions.
> >>>
> >>> Signed-off-by: Long Li 
> >>> ---
> >>> Change log:
> >>> v2:
> >>> Fix typos.
> >>> Make the driver build only on x86-64 and Linux.
> >>> Remove unused header files.
> >>> Change port definition to uint16_t or uint8_t (for IB).
> >>> Use getline() in place of fgets() to read and truncate a line.
> >>> v3:
> >>> Add meson build check for required functions from RDMA direct verb
> >>> header file
> >>> v4:
> >>> Remove extra "\n" in logging code.
> >>> Use "r" in place of "rb" in fopen() to read text files.
> >>>
> >>> [snip]
> >>> +
> >>> +static int mana_pci_probe_mac(struct rte_pci_driver *pci_drv __rte_unused,
> >>> +   struct rte_pci_device *pci_dev,
> >>> +   struct rte_ether_addr *mac_addr) {
> >>> + struct ibv_device **ibv_list;
> >>> + int ibv_idx;
> >>> + struct ibv_context *ctx;
> >>> + struct ibv_device_attr_ex dev_attr;
> >>> + int num_devices;
> >>> + int ret = 0;
> >>> + uint8_t port;
> >>> + struct mana_priv *priv = NULL;
> >>> + struct rte_eth_dev *eth_dev = NULL;
> >>> + bool found_port;
> >>> +
> >>> + ibv_list = ibv_get_device_list(&num_devices);
> >>> + for (ibv_idx = 0; ibv_idx < num_devices; ibv_idx++) {
> >>> + struct ibv_device *ibdev = ibv_list[ibv_idx];
> >>> + struct rte_pci_addr pci_addr;
> >>> +
> >>> + DRV_LOG(INFO, "Probe device name %s dev_name %s ibdev_path %s",
> >>> + ibdev->name, ibdev->dev_name, ibdev->ibdev_path);
> >>> +
> >>> + if (mana_ibv_device_to_pci_addr(ibdev, &pci_addr))
> >>> + continue;
> >>> +
> >>> + /* Ignore if this IB device is not this PCI device */
> >>> + if (pci_dev->addr.domain != pci_addr.domain ||
> >>> + pci_dev->addr.bus != pci_addr.bus ||
> >>> + pci_dev->addr.devid != pci_addr.devid ||
> >>> + pci_dev->addr.function != pci_addr.function)
> >>> + continue;
> >>> +
> >>> + ctx = ibv_open_device(ibdev);
> >>> + if (!ctx) {
> >>> + DRV_LOG(ERR, "Failed to open IB device %s",
> >>> + ibdev->name);
> >>> + continue;
> >>> + }
> >>> +
> >>> + ret = ibv_query_device_ex(ctx, NULL, &dev_attr);
> >>> + DRV_LOG(INFO, "dev_attr.orig_attr.phys_port_cnt %u",
> >>> + dev_attr.orig_attr.phys_port_cnt);
> >>> + found_port = false;
> >>> +
> >>> + for (port = 1; port <= dev_attr.orig_attr.phys_port_cnt;
> >>> +  port++) {
> >>> + struct ibv_parent_domain_init_attr attr = {};
> >>> + struct rte_ether_addr addr;
> >>> + char address[64];
> >>> + char name[RTE_ETH_NAME_MAX_LEN];
> >>> +
> >>> + ret = get_port_mac(ibdev, port, &addr);
> >>> + if (ret)
> >>> + continue;
> >>> +
> >>> + if (mac_addr && !rte_is_same_ether_addr(&addr, mac_addr))
> >>> + continue;
> >>> +
> >>> + rte_ether_format_addr(address, sizeof(address), &addr);
> >>> + DRV_LOG(INFO, "device located port %u address %s",
> >>> + port, address);
> >>> + found_port = true;
> >>> +
> >>> + priv = rte_zmalloc_socket(NULL, sizeof(*priv),
> >>> +   RTE_CACHE_LINE_SIZE,
> >>> +   SOCKET_ID_ANY);
> >>> + if (!priv) {
> >>> + ret = -ENOMEM;
> >>> + goto failed;
> >>> + }
> >>> +
> >>> + snprintf(name, sizeof(name), "%s_port%d",
> >>> +  pci_dev->device.name, port);
> >>> +
> >>> + if (rte_eal_process_type() == RTE_PROC_SECONDARY) {
> >>> + int fd;
> >>> +
> >>> + eth_dev = rte_eth_dev_attach_secondary(name);
> >>> + if (!eth_dev) {
> >>> + DRV_LOG(ERR, "Can't attach to dev %s",
> >>> + name);
> >>> + ret = -ENOMEM;
> >>> + goto failed;
> >>> + }
> >>> +
> >>> + eth_dev->device = &pci_dev->device;
> >>> + eth_dev->dev_ops = &mana_dev_se

RE: [Patch v7 00/18] Introduce Microsoft Azure Network Adapter (MANA) PMD

2022-09-06 Thread Long Li
> Subject: Re: [Patch v7 00/18] Introduce Microsoft Azure Network Adapter
> (MANA) PMD
> 
> On 9/3/2022 2:40 AM, lon...@linuxonhyperv.com wrote:
> 
> >
> > From: Long Li 
> >
> > MANA is a network interface card to be used in the Azure cloud
> > environment.
> > MANA provides safe access to user memory through memory registration.
> > It has IOMMU built into the hardware.
> >
> > MANA uses IB verbs and RDMA layer to configure hardware resources. It
> > requires the corresponding RDMA kernel-mode and user-mode drivers.
> >
> > The MANA RDMA kernel-mode driver is being reviewed at:
> >
> > https://patchwork.kernel.org/project/netdevbpf/cover/1655345240-26411-1-git-send-email-longli@linuxonhyperv.com/
> >
> > The MANA RDMA user-mode driver is being reviewed at:
> >
> > https://github.com/linux-rdma/rdma-core/pull/1177
> >
> >
> > Long Li (18):
> >net/mana: add basic driver, build environment and doc
> >net/mana: add device configuration and stop
> >net/mana: add function to report support ptypes
> >net/mana: add link update
> >net/mana: add function for device removal interrupts
> >net/mana: add device info
> >net/mana: add function to configure RSS
> >net/mana: add function to configure RX queues
> >net/mana: add function to configure TX queues
> >net/mana: implement memory registration
> >net/mana: implement the hardware layer operations
> >net/mana: add function to start/stop TX queues
> >net/mana: add function to start/stop RX queues
> >net/mana: add function to receive packets
> >net/mana: add function to send packets
> >net/mana: add function to start/stop device
> >net/mana: add function to report queue stats
> >net/mana: add function to support RX interrupts
> >
> 
> Can you please send new versions of the patches as reply to previous
> versions, so all versions can be in same thread, using git send-email
> '--in-reply-to' argument?

Sure, I will send soon.

> 
> More details in the contribution guide:
> https://doc.dpdk.org/guides/contributing/patches.html#sending-patches



RE: [Patch v7 00/18] Introduce Microsoft Azure Network Adapter (MANA) PMD

2022-09-06 Thread Long Li
> Subject: Re: [Patch v7 00/18] Introduce Microsoft Azure Network Adapter
> (MANA) PMD
> 
> On 9/6/2022 2:03 PM, Ferruh Yigit wrote:
> > On 9/3/2022 2:40 AM, lon...@linuxonhyperv.com wrote:
> >
> >>
> >> From: Long Li 
> >>
> >> MANA is a network interface card to be used in the Azure cloud
> >> environment.
> >> MANA provides safe access to user memory through memory registration.
> >> It has
> >> IOMMU built into the hardware.
> >>
> >> MANA uses IB verbs and RDMA layer to configure hardware resources. It
> >> requires the corresponding RDMA kernel-mode and user-mode drivers.
> >>
> >> The MANA RDMA kernel-mode driver is being reviewed at:
> >>
> >> https://patchwork.kernel.org/project/netdevbpf/cover/1655345240-26411-1-git-send-email-longli@linuxonhyperv.com/
> >>
> >> The MANA RDMA user-mode driver is being reviewed at:
> >>
> >> https://github.com/linux-rdma/rdma-core/pull/1177
> >>
> >>
> >> Long Li (18):
> >>    net/mana: add basic driver, build environment and doc
> >>    net/mana: add device configuration and stop
> >>    net/mana: add function to report support ptypes
> >>    net/mana: add link update
> >>    net/mana: add function for device removal interrupts
> >>    net/mana: add device info
> >>    net/mana: add function to configure RSS
> >>    net/mana: add function to configure RX queues
> >>    net/mana: add function to configure TX queues
> >>    net/mana: implement memory registration
> >>    net/mana: implement the hardware layer operations
> >>    net/mana: add function to start/stop TX queues
> >>    net/mana: add function to start/stop RX queues
> >>    net/mana: add function to receive packets
> >>    net/mana: add function to send packets
> >>    net/mana: add function to start/stop device
> >>    net/mana: add function to report queue stats
> >>    net/mana: add function to support RX interrupts
> >>
> >
> > Can you please send new versions of the patches as reply to previous
> > versions, so all versions can be in same thread, using git send-email
> > '--in-reply-to' argument?
> >
> > More details in the contribution guide:
> >
> > https://doc.dpdk.org/guides/contributing/patches.html#sending-patches
> >
> 
> Also for next version, can you please fix warnings reported by
> './devtools/check-git-log.sh'.

Will fix those.


RE: [Patch v7 01/18] net/mana: add basic driver, build environment and doc

2022-09-06 Thread Long Li
> Subject: Re: [Patch v7 01/18] net/mana: add basic driver, build environment
> and doc
> 
> On 9/3/2022 2:40 AM, lon...@linuxonhyperv.com wrote:
> 
> >
> > From: Long Li 
> >
> > MANA is a PCI device. It uses IB verbs to access hardware through the
> > kernel RDMA layer. This patch introduces build environment and basic
> > device probe functions.
> >
> > Signed-off-by: Long Li 
> > ---
> > Change log:
> > v2:
> > Fix typos.
> > Make the driver build only on x86-64 and Linux.
> > Remove unused header files.
> > Change port definition to uint16_t or uint8_t (for IB).
> > Use getline() in place of fgets() to read and truncate a line.
> > v3:
> > Add meson build check for required functions from RDMA direct verb
> > header file
> > v4:
> > Remove extra "\n" in logging code.
> > Use "r" in place of "rb" in fopen() to read text files.
> > v7:
> > Remove RTE_ETH_TX_OFFLOAD_TCP_TSO from offload cap.
> >
> 
> Can you please check review comments on v4 [1], they seem still valid in this
> version.
> I didn't go through other patches, but can you please double check
> comments on all v4 patches?

Sorry, it was an oversight. I will remove all the "\n" and double-check.

> 
> 
> [1]
> https://inbox.dpdk.org/dev/859e95d9-2483-b017-6daa-0852317b4a72@xilinx.com/



RE: [Patch v7 01/18] net/mana: add basic driver, build environment and doc

2022-09-06 Thread Long Li
> Subject: Re: [Patch v7 01/18] net/mana: add basic driver, build environment
> and doc
> 
> On Fri,  2 Sep 2022 18:40:43 -0700
> lon...@linuxonhyperv.com wrote:
> 
> > From: Long Li 
> >
> > MANA is a PCI device. It uses IB verbs to access hardware through the
> > kernel RDMA layer. This patch introduces build environment and basic
> > device probe functions.
> >
> > Signed-off-by: Long Li 
> > ---
> 
> You should add a reference to minimal required version of rdma-core.
> Older versions won't work right.

I'm adding a reference to the build requirement in doc/guides/nics/mana.rst.

"drivers/net/mana/meson.build" has a build dependency on libmana from
rdma-core, so the driver won't build against older versions of rdma-core.


[PATCH] config/arm: add PHYTIUM fts2500

2022-09-06 Thread luzhipeng
From: root 

Add configs for the PHYTIUM server.

Signed-off-by: luzhipeng 
---
 dpdk/config/arm/arm64_fts2500_linux_gcc | 16 
 dpdk/config/arm/meson.build | 22 --
 2 files changed, 36 insertions(+), 2 deletions(-)
 create mode 100644 dpdk/config/arm/arm64_fts2500_linux_gcc

diff --git a/dpdk/config/arm/arm64_fts2500_linux_gcc b/dpdk/config/arm/arm64_fts2500_linux_gcc
new file mode 100644
index 0..d43c7aad3
--- /dev/null
+++ b/dpdk/config/arm/arm64_fts2500_linux_gcc
@@ -0,0 +1,16 @@
+[binaries]
+c = 'aarch64-linux-gnu-gcc'
+cpp = 'aarch64-linux-gnu-g++'
+ar = 'aarch64-linux-gnu-gcc-ar'
+strip = 'aarch64-linux-gnu-strip'
+pkgconfig = 'aarch64-linux-gnu-pkg-config'
+pcap-config = ''
+
+[host_machine]
+system = 'linux'
+cpu_family = 'aarch64'
+cpu = 'armv8-a'
+endian = 'little'
+
+[properties]
+platform = 'fts2500'
diff --git a/dpdk/config/arm/meson.build b/dpdk/config/arm/meson.build
index c32f02bc2..93c7f4695 100644
--- a/dpdk/config/arm/meson.build
+++ b/dpdk/config/arm/meson.build
@@ -224,12 +224,21 @@ implementer_phytium = {
 ['RTE_MACHINE', '"armv8a"'],
 ['RTE_USE_C11_MEM_MODEL', true],
 ['RTE_CACHE_LINE_SIZE', 64],
-['RTE_MAX_LCORE', 64],
-['RTE_MAX_NUMA_NODES', 8]
 ],
 'part_number_config': {
 '0x662': {
 'machine_args': ['-march=armv8-a+crc'],
+'flags': [
+['RTE_MAX_LCORE', 64],
+['RTE_MAX_NUMA_NODES', 8]
+ ],
+},
+   '0x663': {
+'machine_args': ['-march=armv8-a+crc'],
+'flags': [
+['RTE_MAX_LCORE', 128],
+['RTE_MAX_NUMA_NODES', 16]
+],
 },
 }
 }
@@ -395,6 +404,13 @@ soc_ft2000plus = {
 'numa': true
 }
 
+soc_fts2500 = {
+'description': 'Phytium FT-S2500',
+'implementer': '0x70',
+'part_number': '0x663',
+'numa': true
+}
+
 '''
 Start of SoCs list
 generic: Generic un-optimized build for armv8 aarch64 execution mode.
@@ -407,6 +423,7 @@ cn10k:   Marvell OCTEON 10
 dpaa:NXP DPAA
 emag:Ampere eMAG
 ft2000plus:  Phytium FT-2000+
+fts2500: Phytium FT-S2500
 graviton2:   AWS Graviton2
 kunpeng920:  HiSilicon Kunpeng 920
 kunpeng930:  HiSilicon Kunpeng 930
@@ -438,6 +455,7 @@ socs = {
 'thunderx2': soc_thunderx2,
 'thunderxt88': soc_thunderxt88,
 'ft2000plus': soc_ft2000plus,
+'fts2500': soc_fts2500,
 }
 
 dpdk_conf.set('RTE_ARCH_ARM', 1)
-- 
2.27.0





[PATCH RESEND] config/arm: add PHYTIUM fts2500

2022-09-06 Thread luzhipeng
From: luzhipeng 

Add configs for the PHYTIUM server.

Signed-off-by: luzhipeng 
---
 dpdk/config/arm/arm64_fts2500_linux_gcc | 16 
 dpdk/config/arm/meson.build | 22 --
 2 files changed, 36 insertions(+), 2 deletions(-)
 create mode 100644 dpdk/config/arm/arm64_fts2500_linux_gcc

diff --git a/dpdk/config/arm/arm64_fts2500_linux_gcc b/dpdk/config/arm/arm64_fts2500_linux_gcc
new file mode 100644
index 0..d43c7aad3
--- /dev/null
+++ b/dpdk/config/arm/arm64_fts2500_linux_gcc
@@ -0,0 +1,16 @@
+[binaries]
+c = 'aarch64-linux-gnu-gcc'
+cpp = 'aarch64-linux-gnu-g++'
+ar = 'aarch64-linux-gnu-gcc-ar'
+strip = 'aarch64-linux-gnu-strip'
+pkgconfig = 'aarch64-linux-gnu-pkg-config'
+pcap-config = ''
+
+[host_machine]
+system = 'linux'
+cpu_family = 'aarch64'
+cpu = 'armv8-a'
+endian = 'little'
+
+[properties]
+platform = 'fts2500'
diff --git a/dpdk/config/arm/meson.build b/dpdk/config/arm/meson.build
index c32f02bc2..93c7f4695 100644
--- a/dpdk/config/arm/meson.build
+++ b/dpdk/config/arm/meson.build
@@ -224,12 +224,21 @@ implementer_phytium = {
 ['RTE_MACHINE', '"armv8a"'],
 ['RTE_USE_C11_MEM_MODEL', true],
 ['RTE_CACHE_LINE_SIZE', 64],
-['RTE_MAX_LCORE', 64],
-['RTE_MAX_NUMA_NODES', 8]
 ],
 'part_number_config': {
 '0x662': {
 'machine_args': ['-march=armv8-a+crc'],
+'flags': [
+['RTE_MAX_LCORE', 64],
+['RTE_MAX_NUMA_NODES', 8]
+ ],
+},
+   '0x663': {
+'machine_args': ['-march=armv8-a+crc'],
+'flags': [
+['RTE_MAX_LCORE', 128],
+['RTE_MAX_NUMA_NODES', 16]
+],
 },
 }
 }
@@ -395,6 +404,13 @@ soc_ft2000plus = {
 'numa': true
 }
 
+soc_fts2500 = {
+'description': 'Phytium FT-S2500',
+'implementer': '0x70',
+'part_number': '0x663',
+'numa': true
+}
+
 '''
 Start of SoCs list
 generic: Generic un-optimized build for armv8 aarch64 execution mode.
@@ -407,6 +423,7 @@ cn10k:   Marvell OCTEON 10
 dpaa:NXP DPAA
 emag:Ampere eMAG
 ft2000plus:  Phytium FT-2000+
+fts2500: Phytium FT-S2500
 graviton2:   AWS Graviton2
 kunpeng920:  HiSilicon Kunpeng 920
 kunpeng930:  HiSilicon Kunpeng 930
@@ -438,6 +455,7 @@ socs = {
 'thunderx2': soc_thunderx2,
 'thunderxt88': soc_thunderxt88,
 'ft2000plus': soc_ft2000plus,
+'fts2500': soc_fts2500,
 }
 
 dpdk_conf.set('RTE_ARCH_ARM', 1)
-- 
2.27.0





RE: [PATCH v2 00/10] introduce GVE PMD

2022-09-06 Thread Guo, Junfeng


> -Original Message-
> From: Ferruh Yigit 
> Sent: Friday, September 2, 2022 01:19
> To: Guo, Junfeng ; Zhang, Qi Z
> ; Wu, Jingjing 
> Cc: dev@dpdk.org; Li, Xiaoyun ;
> awogbem...@google.com; Richardson, Bruce
> 
> Subject: Re: [PATCH v2 00/10] introduce GVE PMD
> 
> On 8/29/2022 9:41 AM, Junfeng Guo wrote:
> 
> >
> > Introduce a new PMD for Google Virtual Ethernet (GVE).
> >
> > This patch set requires an exception for MIT license for GVE base code.
> > And the base code includes the following files:
> >  - gve_adminq.c
> >  - gve_adminq.h
> >  - gve_desc.h
> >  - gve_desc_dqo.h
> >  - gve_register.h
> >
> > It's based on GVE kernel driver v1.3.0 and the original code is in
> > https://github.com/GoogleCloudPlatform/compute-virtual-ethernet-linux/tree/v1.3.0
> >
> > v2:
> > fix some CI check error.
> >
> > Junfeng Guo (10):
> >net/gve: introduce GVE PMD base code
> >net/gve: add logs and OS specific implementation
> >net/gve: support device initialization
> >net/gve: add link update support
> >net/gve: add MTU set support
> >net/gve: add queue operations
> >net/gve: add Rx/Tx support
> >net/gve: add support to get dev info and configure dev
> >net/gve: add stats support
> >doc: update documentation
> >
> 
> Please check build error reported by CI:
> https://patches.dpdk.org/project/dpdk/patch/20220829084127.934183-
> 11-junfeng@intel.com/
> 
> I am also getting various build errors, even not able to reach patch by
> patch build stage where I expect some issues, can you please verify
> patch by patch build in next version?

Sure, thanks for the reminder!
The compile/build issues are being worked on now.
Thanks!



RE: [Patch v6 01/18] net/mana: add basic driver, build environment and doc

2022-09-06 Thread Long Li
> Subject: Re: [Patch v6 01/18] net/mana: add basic driver, build environment
> and doc
> 
> 
> On 2022/9/7 9:36, Long Li wrote:
> >> Subject: Re: [Patch v6 01/18] net/mana: add basic driver, build
> >> environment and doc
> >>
> >>
> >> On 2022/9/1 2:05, Long Li wrote:
>  Subject: Re: [Patch v6 01/18] net/mana: add basic driver, build
>  environment and doc
> 
> 
>  On 2022/8/31 6:51, lon...@linuxonhyperv.com wrote:
> > From: Long Li 
> >
> > MANA is a PCI device. It uses IB verbs to access hardware through
> > the kernel RDMA layer. This patch introduces build environment and
> > basic device probe functions.
> >
> > Signed-off-by: Long Li 
> > ---
> > Change log:
> > v2:
> > Fix typos.
> > Make the driver build only on x86-64 and Linux.
> > Remove unused header files.
> > Change port definition to uint16_t or uint8_t (for IB).
> > Use getline() in place of fgets() to read and truncate a line.
> > v3:
> > Add meson build check for required functions from RDMA direct verb
> > header file
> > v4:
> > Remove extra "\n" in logging code.
> > Use "r" in place of "rb" in fopen() to read text files.
> >
> > [snip]
> > +
> > +static int mana_pci_probe_mac(struct rte_pci_driver *pci_drv __rte_unused,
> > + struct rte_pci_device *pci_dev,
> > + struct rte_ether_addr *mac_addr) {
> > +   struct ibv_device **ibv_list;
> > +   int ibv_idx;
> > +   struct ibv_context *ctx;
> > +   struct ibv_device_attr_ex dev_attr;
> > +   int num_devices;
> > +   int ret = 0;
> > +   uint8_t port;
> > +   struct mana_priv *priv = NULL;
> > +   struct rte_eth_dev *eth_dev = NULL;
> > +   bool found_port;
> > +
> > +   ibv_list = ibv_get_device_list(&num_devices);
> > +   for (ibv_idx = 0; ibv_idx < num_devices; ibv_idx++) {
> > +   struct ibv_device *ibdev = ibv_list[ibv_idx];
> > +   struct rte_pci_addr pci_addr;
> > +
> > +   DRV_LOG(INFO, "Probe device name %s dev_name %s ibdev_path %s",
> > +   ibdev->name, ibdev->dev_name, ibdev->ibdev_path);
> > +
> > +   if (mana_ibv_device_to_pci_addr(ibdev, &pci_addr))
> > +   continue;
> > +
> > +   /* Ignore if this IB device is not this PCI device */
> > +   if (pci_dev->addr.domain != pci_addr.domain ||
> > +   pci_dev->addr.bus != pci_addr.bus ||
> > +   pci_dev->addr.devid != pci_addr.devid ||
> > +   pci_dev->addr.function != pci_addr.function)
> > +   continue;
> > +
> > +   ctx = ibv_open_device(ibdev);
> > +   if (!ctx) {
> > +   DRV_LOG(ERR, "Failed to open IB device %s",
> > +   ibdev->name);
> > +   continue;
> > +   }
> > +
> > +   ret = ibv_query_device_ex(ctx, NULL, &dev_attr);
> > +   DRV_LOG(INFO, "dev_attr.orig_attr.phys_port_cnt %u",
> > +   dev_attr.orig_attr.phys_port_cnt);
> > +   found_port = false;
> > +
> > +   for (port = 1; port <= dev_attr.orig_attr.phys_port_cnt;
> > +port++) {
> > +   struct ibv_parent_domain_init_attr attr = {};
> > +   struct rte_ether_addr addr;
> > +   char address[64];
> > +   char name[RTE_ETH_NAME_MAX_LEN];
> > +
> > +   ret = get_port_mac(ibdev, port, &addr);
> > +   if (ret)
> > +   continue;
> > +
> > +   if (mac_addr && !rte_is_same_ether_addr(&addr, mac_addr))
> > +   continue;
> > +
> > +   rte_ether_format_addr(address, sizeof(address), &addr);
> > +   DRV_LOG(INFO, "device located port %u address %s",
> > +   port, address);
> > +   found_port = true;
> > +
> > +   priv = rte_zmalloc_socket(NULL, sizeof(*priv),
> > + RTE_CACHE_LINE_SIZE,
> > + SOCKET_ID_ANY);
> > +   if (!priv) {
> > +   ret = -ENOMEM;
> > +   goto failed;
> > +   }
> > +
> > +   snprintf(name, sizeof(name), "%s_port%d",
> > +pci_dev->device.name, port);
> > 

[PATCH v1] ethdev: add direction info when creating the transfer table

2022-09-06 Thread Rongwei Liu
A transfer domain rule can match traffic originating from either the
wire or a VF, which means it consumes underlying resources for both
directions.

In customer deployments, a single flow table usually matches traffic
from only one direction: either from the wire or from a VF.

Introduce a new member, transfer_mode, in rte_flow_attr to indicate
the direction property of the flow table: from wire, from VF, or
bidirectional (default).

This helps save underlying memory and also improves the insertion rate.

By default, the transfer domain is bidirectional, so there is no
behavior change.

1. Match wire origin only
   flow template_table 0 create group 0 priority 0 transfer wire_orig...
2. Match vf origin only
   flow template_table 0 create group 0 priority 0 transfer vf_orig...

Signed-off-by: Rongwei Liu 
---
 app/test-pmd/cmdline_flow.c | 26 +
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  3 ++-
 lib/ethdev/rte_flow.h   |  9 ++-
 3 files changed, 36 insertions(+), 2 deletions(-)

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index 7f50028eb7..b25b595e82 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -177,6 +177,8 @@ enum index {
TABLE_INGRESS,
TABLE_EGRESS,
TABLE_TRANSFER,
+   TABLE_TRANSFER_WIRE_ORIG,
+   TABLE_TRANSFER_VF_ORIG,
TABLE_RULES_NUMBER,
TABLE_PATTERN_TEMPLATE,
TABLE_ACTIONS_TEMPLATE,
@@ -1141,6 +1143,8 @@ static const enum index next_table_attr[] = {
TABLE_INGRESS,
TABLE_EGRESS,
TABLE_TRANSFER,
+   TABLE_TRANSFER_WIRE_ORIG,
+   TABLE_TRANSFER_VF_ORIG,
TABLE_RULES_NUMBER,
TABLE_PATTERN_TEMPLATE,
TABLE_ACTIONS_TEMPLATE,
@@ -2881,6 +2885,18 @@ static const struct token token_list[] = {
.next = NEXT(next_table_attr),
.call = parse_table,
},
+   [TABLE_TRANSFER_WIRE_ORIG] = {
+   .name = "wire_orig",
+   .help = "affect rule direction to transfer",
+   .next = NEXT(next_table_attr),
+   .call = parse_table,
+   },
+   [TABLE_TRANSFER_VF_ORIG] = {
+   .name = "vf_orig",
+   .help = "affect rule direction to transfer",
+   .next = NEXT(next_table_attr),
+   .call = parse_table,
+   },
[TABLE_RULES_NUMBER] = {
.name = "rules_number",
.help = "number of rules in table",
@@ -8894,6 +8910,16 @@ parse_table(struct context *ctx, const struct token *token,
case TABLE_TRANSFER:
out->args.table.attr.flow_attr.transfer = 1;
return len;
+   case TABLE_TRANSFER_WIRE_ORIG:
+   if (!out->args.table.attr.flow_attr.transfer)
+   return -1;
+   out->args.table.attr.flow_attr.transfer_mode = 1;
+   return len;
+   case TABLE_TRANSFER_VF_ORIG:
+   if (!out->args.table.attr.flow_attr.transfer)
+   return -1;
+   out->args.table.attr.flow_attr.transfer_mode = 2;
+   return len;
default:
return -1;
}
diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
index 330e34427d..603b7988dd 100644
--- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst
+++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst
@@ -3332,7 +3332,8 @@ It is bound to ``rte_flow_template_table_create()``::
 
flow template_table {port_id} create
[table_id {id}] [group {group_id}]
-   [priority {level}] [ingress] [egress] [transfer]
+   [priority {level}] [ingress] [egress]
+   [transfer [vf_orig] [wire_orig]]
rules_number {number}
pattern_template {pattern_template_id}
actions_template {actions_template_id}
diff --git a/lib/ethdev/rte_flow.h b/lib/ethdev/rte_flow.h
index a79f1e7ef0..512b08d817 100644
--- a/lib/ethdev/rte_flow.h
+++ b/lib/ethdev/rte_flow.h
@@ -130,7 +130,14 @@ struct rte_flow_attr {
 * through a suitable port. @see rte_flow_pick_transfer_proxy().
 */
uint32_t transfer:1;
-   uint32_t reserved:29; /**< Reserved, must be zero. */
+   /**
+* 0 means bidirection,
+* 0x1 origin uplink,
+* 0x2 origin vport,
+* N/A both set.
+*/
+   uint32_t transfer_mode:2;
+   uint32_t reserved:27; /**< Reserved, must be zero. */
 };
 
 /**
-- 
2.27.0
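
The parsing rule in the patch above (the wire_orig and vf_orig keywords are rejected unless transfer is already set, and the mode is stored in a 2-bit field carved out of the reserved bits) can be sketched in isolation. This is a reduced illustration, not the real `struct rte_flow_attr`: `toy_flow_attr`, the `TOY_TRANSFER_*` names and `toy_set_transfer_mode()` are hypothetical, and only the values 0, 1 and 2 are taken from the comment in the patch.

```c
#include <stdint.h>

/* Hypothetical names for the values documented in the patch comment. */
enum toy_transfer_mode {
	TOY_TRANSFER_BIDIR = 0,     /* default: both directions */
	TOY_TRANSFER_WIRE_ORIG = 1, /* traffic originating from the wire (uplink) */
	TOY_TRANSFER_VF_ORIG = 2,   /* traffic originating from a VF (vport) */
};

/* Reduced copy of the attribute bit layout; group and priority are omitted. */
struct toy_flow_attr {
	uint32_t ingress:1;
	uint32_t egress:1;
	uint32_t transfer:1;
	uint32_t transfer_mode:2; /* taken from the former reserved bits */
	uint32_t reserved:27;     /* 29 - 2 = 27 bits remain reserved */
};

/*
 * Mirrors the check in parse_table(): a direction may only be chosen once
 * the transfer attribute is set. The value 3 (both bits set) is treated as
 * invalid, matching the "N/A both set" note in the patch.
 * Returns 0 on success, -1 on rejection.
 */
static int toy_set_transfer_mode(struct toy_flow_attr *attr, unsigned int mode)
{
	if (!attr->transfer)
		return -1;
	if (mode > TOY_TRANSFER_VF_ORIG)
		return -1;
	attr->transfer_mode = mode;
	return 0;
}
```

Because the two new bits come out of the reserved field, the total width of the flag word stays at 32 bits in this sketch.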



RE: [Patch v7 01/18] net/mana: add basic driver, build environment and doc

2022-09-06 Thread Long Li
> Subject: RE: [Patch v7 01/18] net/mana: add basic driver, build environment
> and doc
> 
> > Subject: Re: [Patch v7 01/18] net/mana: add basic driver, build
> > environment and doc
> >
> > On 9/3/2022 2:40 AM, lon...@linuxonhyperv.com wrote:
> >
> > >
> > > From: Long Li 
> > >
> > > MANA is a PCI device. It uses IB verbs to access hardware through
> > > the kernel RDMA layer. This patch introduces build environment and
> > > basic device probe functions.
> > >
> > > Signed-off-by: Long Li 
> > > ---
> > > Change log:
> > > v2:
> > > Fix typos.
> > > Make the driver build only on x86-64 and Linux.
> > > Remove unused header files.
> > > Change port definition to uint16_t or uint8_t (for IB).
> > > Use getline() in place of fgets() to read and truncate a line.
> > > v3:
> > > Add meson build check for required functions from RDMA direct verb
> > > header file
> > > v4:
> > > Remove extra "\n" in logging code.
> > > Use "r" in place of "rb" in fopen() to read text files.
> > > v7:
> > > Remove RTE_ETH_TX_OFFLOAD_TCP_TSO from offload cap.
> > >
> >
> > Can you please check review comments on v4 [1], they seem still valid
> > in this version.
> > I didn't go through other patches, but can you please double check
> > comments on all v4 patches?
> 
> Sorry it was an oversight. Will remove all the "\n" and double check.

Are you referring to "Remove extra "\n" in logging code." in the comment?

There are two places where "\n" is used: DRV_LOG() and PMD_INIT_LOG() in
mana.h. I think they are okay, as there is a single "\n" on each output line.

Please let me know if I missed anything.
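
For reference, the convention under discussion (the macro appends the single "\n", so callers pass format strings without one) can be sketched as below. This is only an illustration: DEMO_LOG is a hypothetical macro, not the DRV_LOG() or PMD_INIT_LOG() definition from mana.h, and it relies on the GNU "##__VA_ARGS__" extension.

```c
#include <stdio.h>

/*
 * Hypothetical logging macro: formats into a caller-provided char array
 * and appends exactly one newline. "buf" must be an array, not a pointer,
 * because sizeof(buf) is used for the bound.
 */
#define DEMO_LOG(buf, fmt, ...) \
	snprintf((buf), sizeof(buf), fmt "\n", ##__VA_ARGS__)
```

With this shape, a call such as DEMO_LOG(line, "port %d up", 3) yields one newline at the end of the line, while a call site that also ends its format with "\n" would produce a blank line in the output.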

> 
> >
> >
> > [1]
> > https://inbox.dpdk.org/dev/859e95d9-2483-b017-6daa-0852317b4a72@xilinx.com/



[PATCH v4 0/5] support flow subscription

2022-09-06 Thread Jie Wang
Add support for AVF to subscribe to a flow from the PF.

--
v4: update commit log and rebase.
v3:
 * fix eth layer inputset.
 * rebase.
v2:
 * split v1 patch 2/2 to 4 small patches.
 * remove rule action RTE_FLOW_ACTION_TYPE_VF and add
   RTE_FLOW_ACTION_TYPE_REPRESENTED_PORT.

Jie Wang (5):
  common/iavf: support flow subscription
  net/iavf: add flow subscription to AVF
  net/iavf: support flow subscription pattern
  net/iavf: support flow subscription rule
  net/iavf: support priority of flow rule

 doc/guides/rel_notes/release_22_11.rst |   4 +
 drivers/common/iavf/virtchnl.h | 104 +++-
 drivers/net/iavf/iavf.h|  13 +
 drivers/net/iavf/iavf_fdir.c   |   4 +
 drivers/net/iavf/iavf_fsub.c   | 745 +
 drivers/net/iavf/iavf_generic_flow.c   |  40 +-
 drivers/net/iavf/iavf_generic_flow.h   |   2 +
 drivers/net/iavf/iavf_hash.c   |   5 +
 drivers/net/iavf/iavf_ipsec_crypto.c   |  16 +-
 drivers/net/iavf/iavf_vchnl.c  | 133 +
 drivers/net/iavf/meson.build   |   1 +
 11 files changed, 1046 insertions(+), 21 deletions(-)
 create mode 100644 drivers/net/iavf/iavf_fsub.c

-- 
2.25.1



[PATCH v4 1/5] common/iavf: support flow subscription

2022-09-06 Thread Jie Wang
A VF can subscribe to a flow from the PF with VIRTCHNL_OP_FLOW_SUBSCRIBE.

The PF is expected to offload a rule to hardware that redirects
packets matching the requested pattern to this VF.

Only a flow whose destination MAC address is the PF's MAC address can
be subscribed.

VIRTCHNL_VF_OFFLOAD_FSUB_PF is used for flow subscription capability
negotiation, and only a trusted VF can be granted this capability.

A flow can be unsubscribed with VIRTCHNL_OP_FLOW_UNSUBSCRIBE.

Signed-off-by: Jie Wang 
Signed-off-by: Qi Zhang 
---
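
The way "count" selects a member of the union in struct virtchnl_proto_hdrs can be sketched as follows. This is a minimal sketch under stated assumptions: the enum and function names are hypothetical, the two limits mirror the macros in the diff below, and count == 1 is assumed to select proto_hdr (the comment in the diff writes "1 < count").

```c
#define DEMO_MAX_NUM_PROTO_HDRS       32 /* VIRTCHNL_MAX_NUM_PROTO_HDRS */
#define DEMO_MAX_NUM_PROTO_HDRS_W_MSK 16 /* VIRTCHNL_MAX_NUM_PROTO_HDRS_W_MSK */

/* Hypothetical names, for illustration only. */
enum demo_hdrs_kind {
	DEMO_HDRS_INVALID = -1,
	DEMO_HDRS_RAW,         /* count == 0 */
	DEMO_HDRS_PROTO,       /* 1 .. 32: proto_hdr[] */
	DEMO_HDRS_PROTO_W_MSK, /* 33 .. 48: proto_hdr_w_msk[] */
};

enum demo_hdrs_kind
demo_hdrs_kind_from_count(int count)
{
	if (count < 0 ||
	    count > DEMO_MAX_NUM_PROTO_HDRS + DEMO_MAX_NUM_PROTO_HDRS_W_MSK)
		return DEMO_HDRS_INVALID;
	if (count == 0)
		return DEMO_HDRS_RAW;
	if (count <= DEMO_MAX_NUM_PROTO_HDRS)
		return DEMO_HDRS_PROTO;
	return DEMO_HDRS_PROTO_W_MSK;
}

/* The value the diff calls "last valid index" for the masked array. */
int
demo_hdrs_w_msk_offset(int count)
{
	return count > DEMO_MAX_NUM_PROTO_HDRS ?
	       count - DEMO_MAX_NUM_PROTO_HDRS : 0;
}
```

Encoding the selector in "count" avoids adding a separate discriminator field to the message, at the cost of overloading one integer with two meanings.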
 drivers/common/iavf/virtchnl.h | 104 +++--
 1 file changed, 100 insertions(+), 4 deletions(-)

diff --git a/drivers/common/iavf/virtchnl.h b/drivers/common/iavf/virtchnl.h
index f123daec8e..e02eec4935 100644
--- a/drivers/common/iavf/virtchnl.h
+++ b/drivers/common/iavf/virtchnl.h
@@ -168,6 +168,8 @@ enum virtchnl_ops {
VIRTCHNL_OP_MAP_QUEUE_VECTOR = 111,
VIRTCHNL_OP_CONFIG_QUEUE_BW = 112,
VIRTCHNL_OP_CONFIG_QUANTA = 113,
+   VIRTCHNL_OP_FLOW_SUBSCRIBE = 114,
+   VIRTCHNL_OP_FLOW_UNSUBSCRIBE = 115,
VIRTCHNL_OP_MAX,
 };
 
@@ -282,6 +284,10 @@ static inline const char *virtchnl_op_str(enum virtchnl_ops v_opcode)
return "VIRTCHNL_OP_1588_PTP_GET_CAPS";
case VIRTCHNL_OP_1588_PTP_GET_TIME:
return "VIRTCHNL_OP_1588_PTP_GET_TIME";
+   case VIRTCHNL_OP_FLOW_SUBSCRIBE:
+   return "VIRTCHNL_OP_FLOW_SUBSCRIBE";
+   case VIRTCHNL_OP_FLOW_UNSUBSCRIBE:
+   return "VIRTCHNL_OP_FLOW_UNSUBSCRIBE";
case VIRTCHNL_OP_MAX:
return "VIRTCHNL_OP_MAX";
default:
@@ -401,6 +407,7 @@ VIRTCHNL_CHECK_STRUCT_LEN(16, virtchnl_vsi_resource);
 #define VIRTCHNL_VF_OFFLOAD_INLINE_IPSEC_CRYPTOBIT(8)
 #define VIRTCHNL_VF_LARGE_NUM_QPAIRS   BIT(9)
 #define VIRTCHNL_VF_OFFLOAD_CRCBIT(10)
+#define VIRTCHNL_VF_OFFLOAD_FSUB_PFBIT(14)
 #define VIRTCHNL_VF_OFFLOAD_VLAN_V2BIT(15)
 #define VIRTCHNL_VF_OFFLOAD_VLAN   BIT(16)
 #define VIRTCHNL_VF_OFFLOAD_RX_POLLING BIT(17)
@@ -1503,6 +1510,7 @@ enum virtchnl_vfr_states {
 };
 
 #define VIRTCHNL_MAX_NUM_PROTO_HDRS32
+#define VIRTCHNL_MAX_NUM_PROTO_HDRS_W_MSK  16
 #define VIRTCHNL_MAX_SIZE_RAW_PACKET   1024
 #define PROTO_HDR_SHIFT5
 #define PROTO_HDR_FIELD_START(proto_hdr_type) \
@@ -1695,6 +1703,22 @@ struct virtchnl_proto_hdr {
 
 VIRTCHNL_CHECK_STRUCT_LEN(72, virtchnl_proto_hdr);
 
+struct virtchnl_proto_hdr_w_msk {
+   /* see enum virtchnl_proto_hdr_type */
+   s32 type;
+   u32 pad;
+   /**
+* binary buffer in network order for specific header type.
+* For example, if type = VIRTCHNL_PROTO_HDR_IPV4, a IPv4
+* header is expected to be copied into the buffer.
+*/
+   u8 buffer_spec[64];
+   /* binary buffer for bit-mask applied to specific header type */
+   u8 buffer_mask[64];
+};
+
+VIRTCHNL_CHECK_STRUCT_LEN(136, virtchnl_proto_hdr_w_msk);
+
 struct virtchnl_proto_hdrs {
u8 tunnel_level;
/**
@@ -1706,11 +1730,18 @@ struct virtchnl_proto_hdrs {
 */
int count;
/**
-* number of proto layers, must < VIRTCHNL_MAX_NUM_PROTO_HDRS
-* must be 0 for a raw packet request.
+* count must <=
+* VIRTCHNL_MAX_NUM_PROTO_HDRS + VIRTCHNL_MAX_NUM_PROTO_HDRS_W_MSK
+* count = 0 :  select raw
+* 1 < count <= VIRTCHNL_MAX_NUM_PROTO_HDRS :   select proto_hdr
+* count > VIRTCHNL_MAX_NUM_PROTO_HDRS :select proto_hdr_w_msk
+* last valid index = count - VIRTCHNL_MAX_NUM_PROTO_HDRS
 */
union {
-   struct virtchnl_proto_hdr proto_hdr[VIRTCHNL_MAX_NUM_PROTO_HDRS];
+   struct virtchnl_proto_hdr
+   proto_hdr[VIRTCHNL_MAX_NUM_PROTO_HDRS];
+   struct virtchnl_proto_hdr_w_msk
+   proto_hdr_w_msk[VIRTCHNL_MAX_NUM_PROTO_HDRS_W_MSK];
struct {
u16 pkt_len;
u8 spec[VIRTCHNL_MAX_SIZE_RAW_PACKET];
@@ -1731,7 +1762,7 @@ struct virtchnl_rss_cfg {
 
 VIRTCHNL_CHECK_STRUCT_LEN(2444, virtchnl_rss_cfg);
 
-/* action configuration for FDIR */
+/* action configuration for FDIR and FSUB */
 struct virtchnl_filter_action {
/* see enum virtchnl_action type */
s32 type;
@@ -1849,6 +1880,65 @@ struct virtchnl_fdir_del {
 
 VIRTCHNL_CHECK_STRUCT_LEN(12, virtchnl_fdir_del);
 
+/* Status returned to VF after VF requests FSUB commands
+ * VIRTCHNL_FSUB_SUCCESS
+ * VF FLOW related request is successfully done by PF
+ * The request can be OP_FLOW_SUBSCRIBE/UNSUBSCRIBE.
+ *
+ * VIRTCHNL_FSUB_FAILURE_RULE_NORESOURCE
+ * OP_FLOW_SUBSCRIBE request is failed due to no Hardware resource.
+ *
+ * VIRTCHNL_FSUB_FAILURE_RULE_EXIST
+ * OP_FLOW_SUBSCRIBE request is failed due to the rule is already existed.
+ 

[PATCH v4 3/5] net/iavf: support flow subscription pattern

2022-09-06 Thread Jie Wang
Add flow subscription pattern support for AVF.

The supported patterns are listed below:
eth/vlan/ipv4
eth/ipv4(6)
eth/ipv4(6)/udp
eth/ipv4(6)/tcp

Signed-off-by: Jie Wang 
---
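
Each entry in the pattern list below pairs a pattern with the set of header fields (the "input set") that may be matched for it. A minimal sketch of such a bitmask check follows; the DEMO_* bit values and demo_inset_allowed() are hypothetical, and how the real parser consumes the table is not visible in this excerpt.

```c
#include <stdint.h>

/* Hypothetical bit assignments; the real IAVF_INSET_* values differ. */
#define DEMO_INSET_DMAC         (1ULL << 0)
#define DEMO_INSET_IPV4_SRC     (1ULL << 1)
#define DEMO_INSET_IPV4_DST     (1ULL << 2)
#define DEMO_INSET_UDP_SRC_PORT (1ULL << 3)
#define DEMO_INSET_UDP_DST_PORT (1ULL << 4)
#define DEMO_INSET_TCP_SRC_PORT (1ULL << 5)

/* Fields allowed for an eth / ipv4 / udp pattern in this sketch. */
#define DEMO_SW_INSET_MAC_IPV4_UDP ( \
	DEMO_INSET_DMAC | DEMO_INSET_IPV4_SRC | DEMO_INSET_IPV4_DST | \
	DEMO_INSET_UDP_SRC_PORT | DEMO_INSET_UDP_DST_PORT)

/* Returns 1 if every requested field is inside the allowed mask. */
int
demo_inset_allowed(uint64_t requested, uint64_t allowed)
{
	return (requested & ~allowed) == 0;
}
```

A rule that matches on a TCP port would fail this check against the UDP input set, which is the kind of mismatch a per-pattern mask is meant to catch.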
 drivers/net/iavf/iavf.h  |   7 +
 drivers/net/iavf/iavf_fsub.c | 598 ++-
 2 files changed, 597 insertions(+), 8 deletions(-)

diff --git a/drivers/net/iavf/iavf.h b/drivers/net/iavf/iavf.h
index 025ab3ff60..f79c7f9f6e 100644
--- a/drivers/net/iavf/iavf.h
+++ b/drivers/net/iavf/iavf.h
@@ -148,6 +148,13 @@ struct iavf_fdir_info {
struct iavf_fdir_conf conf;
 };
 
+struct iavf_fsub_conf {
+   struct virtchnl_flow_sub sub_fltr;
+   struct virtchnl_flow_unsub unsub_fltr;
+   uint64_t input_set;
+   uint32_t flow_id;
+};
+
 struct iavf_qv_map {
uint16_t queue_id;
uint16_t vector_id;
diff --git a/drivers/net/iavf/iavf_fsub.c b/drivers/net/iavf/iavf_fsub.c
index 17f9bb2976..4600d52b91 100644
--- a/drivers/net/iavf/iavf_fsub.c
+++ b/drivers/net/iavf/iavf_fsub.c
@@ -22,9 +22,51 @@
 #include "iavf_generic_flow.h"
 
 
+#define MAX_QGRP_NUM_TYPE  7
+#define IAVF_IPV6_ADDR_LENGTH  16
+#define MAX_INPUT_SET_BYTE 32
+
+#define IAVF_SW_INSET_ETHER ( \
+   IAVF_INSET_DMAC | IAVF_INSET_SMAC | IAVF_INSET_ETHERTYPE)
+#define IAVF_SW_INSET_MAC_IPV4 ( \
+   IAVF_INSET_DMAC | IAVF_INSET_IPV4_DST | IAVF_INSET_IPV4_SRC | \
+   IAVF_INSET_IPV4_PROTO | IAVF_INSET_IPV4_TTL | IAVF_INSET_IPV4_TOS)
+#define IAVF_SW_INSET_MAC_VLAN_IPV4 ( \
+   IAVF_SW_INSET_MAC_IPV4 | IAVF_INSET_VLAN_OUTER)
+#define IAVF_SW_INSET_MAC_IPV4_TCP ( \
+   IAVF_INSET_DMAC | IAVF_INSET_IPV4_DST | IAVF_INSET_IPV4_SRC | \
+   IAVF_INSET_IPV4_TTL | IAVF_INSET_IPV4_TOS | \
+   IAVF_INSET_TCP_DST_PORT | IAVF_INSET_TCP_SRC_PORT)
+#define IAVF_SW_INSET_MAC_IPV4_UDP ( \
+   IAVF_INSET_DMAC | IAVF_INSET_IPV4_DST | IAVF_INSET_IPV4_SRC | \
+   IAVF_INSET_IPV4_TTL | IAVF_INSET_IPV4_TOS | \
+   IAVF_INSET_UDP_DST_PORT | IAVF_INSET_UDP_SRC_PORT)
+#define IAVF_SW_INSET_MAC_IPV6 ( \
+   IAVF_INSET_DMAC | IAVF_INSET_IPV6_DST | IAVF_INSET_IPV6_SRC | \
+   IAVF_INSET_IPV6_TC | IAVF_INSET_IPV6_HOP_LIMIT | \
+   IAVF_INSET_IPV6_NEXT_HDR)
+#define IAVF_SW_INSET_MAC_IPV6_TCP ( \
+   IAVF_INSET_DMAC | IAVF_INSET_IPV6_DST | IAVF_INSET_IPV6_SRC | \
+   IAVF_INSET_IPV6_HOP_LIMIT | IAVF_INSET_IPV6_TC | \
+   IAVF_INSET_TCP_DST_PORT | IAVF_INSET_TCP_SRC_PORT)
+#define IAVF_SW_INSET_MAC_IPV6_UDP ( \
+   IAVF_INSET_DMAC | IAVF_INSET_IPV6_DST | IAVF_INSET_IPV6_SRC | \
+   IAVF_INSET_IPV6_HOP_LIMIT | IAVF_INSET_IPV6_TC | \
+   IAVF_INSET_UDP_DST_PORT | IAVF_INSET_UDP_SRC_PORT)
+
 static struct iavf_flow_parser iavf_fsub_parser;
 
-static struct iavf_pattern_match_item iavf_fsub_pattern_list[] = {};
+static struct
+iavf_pattern_match_item iavf_fsub_pattern_list[] = {
+   {iavf_pattern_ethertype, IAVF_SW_INSET_ETHER, IAVF_INSET_NONE},
+   {iavf_pattern_eth_ipv4, IAVF_SW_INSET_MAC_IPV4, IAVF_INSET_NONE},
+   {iavf_pattern_eth_vlan_ipv4, IAVF_SW_INSET_MAC_VLAN_IPV4, IAVF_INSET_NONE},
+   {iavf_pattern_eth_ipv4_udp, IAVF_SW_INSET_MAC_IPV4_UDP, IAVF_INSET_NONE},
+   {iavf_pattern_eth_ipv4_tcp, IAVF_SW_INSET_MAC_IPV4_TCP, IAVF_INSET_NONE},
+   {iavf_pattern_eth_ipv6, IAVF_SW_INSET_MAC_IPV6, IAVF_INSET_NONE},
+   {iavf_pattern_eth_ipv6_udp, IAVF_SW_INSET_MAC_IPV6_UDP, IAVF_INSET_NONE},
+   {iavf_pattern_eth_ipv6_tcp, IAVF_SW_INSET_MAC_IPV6_TCP, IAVF_INSET_NONE},
+};
 
 static int
 iavf_fsub_create(__rte_unused struct iavf_adapter *ad,
@@ -53,17 +95,557 @@ iavf_fsub_validation(__rte_unused struct iavf_adapter *ad,
 };
 
 static int
-iavf_fsub_parse(__rte_unused struct iavf_adapter *ad,
-   __rte_unused struct iavf_pattern_match_item *array,
-   __rte_unused uint32_t array_len,
-   __rte_unused const struct rte_flow_item pattern[],
-   __rte_unused const struct rte_flow_action actions[],
-   __rte_unused void **meta,
-   __rte_unused struct rte_flow_error *error)
+iavf_fsub_parse_pattern(const struct rte_flow_item pattern[],
+   const uint64_t input_set_mask,
+   struct rte_flow_error *error,
+   struct iavf_fsub_conf *filter)
+{
+   struct virtchnl_proto_hdrs *hdrs = &filter->sub_fltr.proto_hdrs;
+   enum rte_flow_item_type item_type;
+   const struct rte_flow_item_eth *eth_spec, *eth_mask;
+   const struct rte_flow_item_ipv4 *ipv4_spec, *ipv4_mask;
+   const struct rte_flow_item_ipv6 *ipv6_spec, *ipv6_mask;
+   const struct rte_flow_item_tcp *tcp_spec, *tcp_mask;
+   const struct rte_flow_item_udp *u

[PATCH v4 2/5] net/iavf: add flow subscription to AVF

2022-09-06 Thread Jie Wang
Add the skeletal code of flow subscription to AVF driver.

Signed-off-by: Jie Wang 
---
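
The init path in the skeleton registers the parser only when the PF has granted the flow subscription capability. A minimal sketch of that gate, assuming hypothetical DEMO_* names and a plain flags pointer in place of vf->vf_res (the bit position 14 comes from patch 1/5):

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_BIT(n)             (1U << (n))
#define DEMO_VF_OFFLOAD_FSUB_PF DEMO_BIT(14)

/*
 * Returns 0 when the engine may be registered, -EINVAL when no
 * capability data is available, -ENOTSUP when the capability bit
 * was not negotiated.
 */
int
demo_fsub_init(const uint32_t *vf_cap_flags)
{
	if (vf_cap_flags == NULL)
		return -EINVAL;
	if (!(*vf_cap_flags & DEMO_VF_OFFLOAD_FSUB_PF))
		return -ENOTSUP;
	return 0; /* the driver would call iavf_register_parser() here */
}
```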
 doc/guides/rel_notes/release_22_11.rst |   4 +
 drivers/net/iavf/iavf_fsub.c   | 112 +
 drivers/net/iavf/iavf_generic_flow.c   |  17 +++-
 drivers/net/iavf/iavf_generic_flow.h   |   1 +
 drivers/net/iavf/iavf_vchnl.c  |   1 +
 drivers/net/iavf/meson.build   |   1 +
 6 files changed, 135 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/iavf/iavf_fsub.c

diff --git a/doc/guides/rel_notes/release_22_11.rst b/doc/guides/rel_notes/release_22_11.rst
index 8c021cf050..bb77a03e24 100644
--- a/doc/guides/rel_notes/release_22_11.rst
+++ b/doc/guides/rel_notes/release_22_11.rst
@@ -55,6 +55,10 @@ New Features
  Also, make sure to start the actual text at the margin.
  ===
 
+* **Updated Intel iavf driver.**
+
+  * Added flow subscription support.
+
 
 Removed Items
 -
diff --git a/drivers/net/iavf/iavf_fsub.c b/drivers/net/iavf/iavf_fsub.c
new file mode 100644
index 00..17f9bb2976
--- /dev/null
+++ b/drivers/net/iavf/iavf_fsub.c
@@ -0,0 +1,112 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2022 Intel Corporation
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "iavf_generic_flow.h"
+
+
+static struct iavf_flow_parser iavf_fsub_parser;
+
+static struct iavf_pattern_match_item iavf_fsub_pattern_list[] = {};
+
+static int
+iavf_fsub_create(__rte_unused struct iavf_adapter *ad,
+__rte_unused struct rte_flow *flow,
+__rte_unused void *meta,
+__rte_unused struct rte_flow_error *error)
+{
+   return -rte_errno;
+}
+
+static int
+iavf_fsub_destroy(__rte_unused struct iavf_adapter *ad,
+ __rte_unused struct rte_flow *flow,
+ __rte_unused struct rte_flow_error *error)
+{
+   return -rte_errno;
+}
+
+static int
+iavf_fsub_validation(__rte_unused struct iavf_adapter *ad,
+__rte_unused struct rte_flow *flow,
+__rte_unused void *meta,
+__rte_unused struct rte_flow_error *error)
+{
+   return -rte_errno;
+};
+
+static int
+iavf_fsub_parse(__rte_unused struct iavf_adapter *ad,
+   __rte_unused struct iavf_pattern_match_item *array,
+   __rte_unused uint32_t array_len,
+   __rte_unused const struct rte_flow_item pattern[],
+   __rte_unused const struct rte_flow_action actions[],
+   __rte_unused void **meta,
+   __rte_unused struct rte_flow_error *error)
+{
+   return -rte_errno;
+}
+
+static int
+iavf_fsub_init(struct iavf_adapter *ad)
+{
+   struct iavf_info *vf = IAVF_DEV_PRIVATE_TO_VF(ad);
+   struct iavf_flow_parser *parser;
+
+   if (!vf->vf_res)
+   return -EINVAL;
+
+   if (vf->vf_res->vf_cap_flags & VIRTCHNL_VF_OFFLOAD_FSUB_PF)
+   parser = &iavf_fsub_parser;
+   else
+   return -ENOTSUP;
+
+   return iavf_register_parser(parser, ad);
+}
+
+static void
+iavf_fsub_uninit(struct iavf_adapter *ad)
+{
+   iavf_unregister_parser(&iavf_fsub_parser, ad);
+}
+
+static struct
+iavf_flow_engine iavf_fsub_engine = {
+   .init = iavf_fsub_init,
+   .uninit = iavf_fsub_uninit,
+   .create = iavf_fsub_create,
+   .destroy = iavf_fsub_destroy,
+   .validation = iavf_fsub_validation,
+   .type = IAVF_FLOW_ENGINE_FSUB,
+};
+
+static struct
+iavf_flow_parser iavf_fsub_parser = {
+   .engine = &iavf_fsub_engine,
+   .array = iavf_fsub_pattern_list,
+   .array_len = RTE_DIM(iavf_fsub_pattern_list),
+   .parse_pattern_action = iavf_fsub_parse,
+   .stage = IAVF_FLOW_STAGE_DISTRIBUTOR,
+};
+
+RTE_INIT(iavf_fsub_engine_init)
+{
+   iavf_register_flow_engine(&iavf_fsub_engine);
+}
diff --git a/drivers/net/iavf/iavf_generic_flow.c b/drivers/net/iavf/iavf_generic_flow.c
index e1a611e319..b04614ba6e 100644
--- a/drivers/net/iavf/iavf_generic_flow.c
+++ b/drivers/net/iavf/iavf_generic_flow.c
@@ -1866,6 +1866,8 @@ iavf_register_parser(struct iavf_flow_parser *parser,
 {
struct iavf_parser_list *list = NULL;
struct iavf_flow_parser_node *parser_node;
+   struct iavf_flow_parser_node *existing_node;
+   void *temp;
struct iavf_info *vf = IAVF_DEV_PRIVATE_TO_VF(ad);
 
parser_node = rte_zmalloc("iavf_parser", sizeof(*parser_node), 0);
@@ -1880,14 +1882,26 @@ iavf_register_parser(struct iavf_flow_parser *parser,
TAILQ_INSERT_TAIL(list, parser_node, node);
} else if (parser->engine->type == IAVF_FLOW_ENGINE_FDIR) {
list = &vf->dist_parser_list;
+   RTE_TAILQ_FOREACH_SAFE(existing_node, list, node, temp) {
+   if (existing_

[PATCH v4 4/5] net/iavf: support flow subscription rule

2022-09-06 Thread Jie Wang
Support creating, destroying and validating flow subscription
rules for AVF.

For example:
testpmd> flow create 0 ingress pattern eth / ipv4 / udp src is 11
  / end actions represented_port port_id 1 / end
testpmd> flow validate 1 ingress pattern eth / ipv4 / tcp src is 22
  / end actions represented_port port_id 1 / end
testpmd> flow destroy 1 rule 0

When a VF subscribes to a rule, matching packets are sent to the VF
instead of the PF, and only the VF receives them.

Multiple VFs may subscribe to the same rule; the packets are then
replicated and received by each VF.

PF will destroy all subscriptions during VF reset.

Signed-off-by: Jie Wang 
---
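
The create path in this patch allocates the rule entry first, asks the PF to subscribe, and frees the entry if the request fails. A minimal sketch of that ordering, with hypothetical demo_* types and a callback standing in for iavf_flow_sub(); libc allocation is used here in place of rte_zmalloc()/rte_free():

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for iavf_fsub_conf and rte_flow. */
struct demo_conf { unsigned int flow_id; };
struct demo_flow { void *rule; };

typedef int (*demo_sub_fn)(const struct demo_conf *filter);

int
demo_fsub_create(struct demo_flow *flow, const struct demo_conf *filter,
		 demo_sub_fn subscribe)
{
	struct demo_conf *rule;
	int ret;

	rule = calloc(1, sizeof(*rule));
	if (rule == NULL)
		return -ENOMEM;

	ret = subscribe(filter);
	if (ret != 0) {
		free(rule); /* nothing was attached to the flow yet */
		return ret;
	}

	memcpy(rule, filter, sizeof(*rule));
	flow->rule = rule; /* ownership moves to the flow */
	return 0;
}

/* Two trivial callbacks to exercise both paths. */
int demo_sub_ok(const struct demo_conf *f) { (void)f; return 0; }
int demo_sub_fail(const struct demo_conf *f) { (void)f; return -EIO; }
```

Attaching the copy to the flow only after the subscribe call succeeds means the destroy path never sees a rule that the PF does not know about.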
 drivers/net/iavf/iavf.h   |   6 ++
 drivers/net/iavf/iavf_fsub.c  |  75 +++
 drivers/net/iavf/iavf_vchnl.c | 132 ++
 3 files changed, 201 insertions(+), 12 deletions(-)

diff --git a/drivers/net/iavf/iavf.h b/drivers/net/iavf/iavf.h
index f79c7f9f6e..26b858f6f0 100644
--- a/drivers/net/iavf/iavf.h
+++ b/drivers/net/iavf/iavf.h
@@ -489,4 +489,10 @@ int iavf_ipsec_crypto_request(struct iavf_adapter *adapter,
 extern const struct rte_tm_ops iavf_tm_ops;
 int iavf_get_ptp_cap(struct iavf_adapter *adapter);
 int iavf_get_phc_time(struct iavf_rx_queue *rxq);
+int iavf_flow_sub(struct iavf_adapter *adapter,
+ struct iavf_fsub_conf *filter);
+int iavf_flow_unsub(struct iavf_adapter *adapter,
+   struct iavf_fsub_conf *filter);
+int iavf_flow_sub_check(struct iavf_adapter *adapter,
+   struct iavf_fsub_conf *filter);
 #endif /* _IAVF_ETHDEV_H_ */
diff --git a/drivers/net/iavf/iavf_fsub.c b/drivers/net/iavf/iavf_fsub.c
index 4600d52b91..b9ad3531ff 100644
--- a/drivers/net/iavf/iavf_fsub.c
+++ b/drivers/net/iavf/iavf_fsub.c
@@ -69,29 +69,80 @@ iavf_pattern_match_item iavf_fsub_pattern_list[] = {
 };
 
 static int
-iavf_fsub_create(__rte_unused struct iavf_adapter *ad,
-__rte_unused struct rte_flow *flow,
-__rte_unused void *meta,
-__rte_unused struct rte_flow_error *error)
+iavf_fsub_create(struct iavf_adapter *ad, struct rte_flow *flow,
+void *meta, struct rte_flow_error *error)
 {
+   struct iavf_fsub_conf *filter = meta;
+   struct iavf_fsub_conf *rule;
+   int ret;
+
+   rule = rte_zmalloc("fsub_entry", sizeof(*rule), 0);
+   if (!rule) {
+   rte_flow_error_set(error, ENOMEM,
+   RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+   "Failed to allocate memory for fsub rule");
+   return -rte_errno;
+   }
+
+   ret = iavf_flow_sub(ad, filter);
+   if (ret) {
+   rte_flow_error_set(error, -ret,
+  RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+  "Failed to subscribe flow rule.");
+   goto free_entry;
+   }
+
+   rte_memcpy(rule, filter, sizeof(*rule));
+   flow->rule = rule;
+
+   return ret;
+
+free_entry:
+   rte_free(rule);
return -rte_errno;
 }
 
 static int
-iavf_fsub_destroy(__rte_unused struct iavf_adapter *ad,
- __rte_unused struct rte_flow *flow,
- __rte_unused struct rte_flow_error *error)
+iavf_fsub_destroy(struct iavf_adapter *ad, struct rte_flow *flow,
+ struct rte_flow_error *error)
 {
-   return -rte_errno;
+   struct iavf_fsub_conf *filter;
+   int ret;
+
+   filter = (struct iavf_fsub_conf *)flow->rule;
+
+   ret = iavf_flow_unsub(ad, filter);
+   if (ret) {
+   rte_flow_error_set(error, -ret,
+  RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+  "Failed to unsubscribe flow rule.");
+   return -rte_errno;
+   }
+
+   flow->rule = NULL;
+   rte_free(filter);
+
+   return ret;
 }
 
 static int
-iavf_fsub_validation(__rte_unused struct iavf_adapter *ad,
+iavf_fsub_validation(struct iavf_adapter *ad,
 __rte_unused struct rte_flow *flow,
-__rte_unused void *meta,
-__rte_unused struct rte_flow_error *error)
+void *meta,
+struct rte_flow_error *error)
 {
-   return -rte_errno;
+   struct iavf_fsub_conf *filter = meta;
+   int ret;
+
+   ret = iavf_flow_sub_check(ad, filter);
+   if (ret) {
+   rte_flow_error_set(error, -ret,
+  RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+  "Failed to validate filter rule.");
+   return -rte_errno;
+   }
+
+   return ret;
 };
 
 static int
diff --git a/drivers/net/iavf/iavf_vchnl.c b/drivers/net/iavf/iavf_vchnl.c
index 6d84add423..cc0db8d093 100644
--- a/drivers/net/iavf/iavf_vchnl.c
+++ b/drivers/net/iavf/iavf_vchnl.c
@@ -1534,6 +1534,138 @@ iavf_fdir_check(struct iavf_adapter *adapte

[PATCH v4 5/5] net/iavf: support priority of flow rule

2022-09-06 Thread Jie Wang
Add support for the flow rule attribute "priority" for AVF.

Lower values denote higher priority; the highest priority for
a flow rule is 0.

All subscription rules have a lower priority than the rules
created by the host.

Signed-off-by: Jie Wang 
---
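
The priority checks visible in the hunks below can be summarised in one function: the generic attribute check accepts only 0 and 1, and the FDIR parser additionally rejects any non-zero priority. This is a sketch with hypothetical demo_* names; how the FSUB parser itself treats the value is not shown in this excerpt, so the sketch simply lets it through.

```c
#include <errno.h>
#include <stdint.h>

enum demo_engine { DEMO_ENGINE_FDIR, DEMO_ENGINE_FSUB };

/* Returns 0 if the priority is acceptable for the engine, else -EINVAL. */
int
demo_check_priority(enum demo_engine engine, uint32_t priority)
{
	if (priority > 1)
		return -EINVAL; /* "Only support priority 0 and 1." */
	if (engine == DEMO_ENGINE_FDIR && priority >= 1)
		return -EINVAL; /* FDIR accepts priority 0 only */
	return 0;
}
```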
 drivers/net/iavf/iavf_fdir.c |  4 
 drivers/net/iavf/iavf_fsub.c |  2 +-
 drivers/net/iavf/iavf_generic_flow.c | 23 +--
 drivers/net/iavf/iavf_generic_flow.h |  1 +
 drivers/net/iavf/iavf_hash.c |  5 +
 drivers/net/iavf/iavf_ipsec_crypto.c | 16 ++--
 6 files changed, 34 insertions(+), 17 deletions(-)

diff --git a/drivers/net/iavf/iavf_fdir.c b/drivers/net/iavf/iavf_fdir.c
index a397047fdb..8f80873925 100644
--- a/drivers/net/iavf/iavf_fdir.c
+++ b/drivers/net/iavf/iavf_fdir.c
@@ -1583,6 +1583,7 @@ iavf_fdir_parse(struct iavf_adapter *ad,
uint32_t array_len,
const struct rte_flow_item pattern[],
const struct rte_flow_action actions[],
+   uint32_t priority,
void **meta,
struct rte_flow_error *error)
 {
@@ -1593,6 +1594,9 @@ iavf_fdir_parse(struct iavf_adapter *ad,
 
memset(filter, 0, sizeof(*filter));
 
+   if (priority >= 1)
+   return -rte_errno;
+
item = iavf_search_pattern_match_item(pattern, array, array_len, error);
if (!item)
return -rte_errno;
diff --git a/drivers/net/iavf/iavf_fsub.c b/drivers/net/iavf/iavf_fsub.c
index b9ad3531ff..46effda9a0 100644
--- a/drivers/net/iavf/iavf_fsub.c
+++ b/drivers/net/iavf/iavf_fsub.c
@@ -649,13 +649,13 @@ iavf_fsub_parse(struct iavf_adapter *ad,
uint32_t array_len,
const struct rte_flow_item pattern[],
const struct rte_flow_action actions[],
+   uint32_t priority,
void **meta,
struct rte_flow_error *error)
 {
struct iavf_fsub_conf *filter;
struct iavf_pattern_match_item *pattern_match_item = NULL;
int ret = 0;
-   uint32_t priority = 0;
 
filter = rte_zmalloc(NULL, sizeof(*filter), 0);
if (!filter) {
diff --git a/drivers/net/iavf/iavf_generic_flow.c b/drivers/net/iavf/iavf_generic_flow.c
index b04614ba6e..f33c764764 100644
--- a/drivers/net/iavf/iavf_generic_flow.c
+++ b/drivers/net/iavf/iavf_generic_flow.c
@@ -1785,6 +1785,7 @@ enum rte_flow_item_type iavf_pattern_eth_ipv6_udp_l2tpv2_ppp_ipv6_tcp[] = {
 typedef struct iavf_flow_engine * (*parse_engine_t)(struct iavf_adapter *ad,
struct rte_flow *flow,
struct iavf_parser_list *parser_list,
+   uint32_t priority,
const struct rte_flow_item pattern[],
const struct rte_flow_action actions[],
struct rte_flow_error *error);
@@ -1951,11 +1952,11 @@ iavf_flow_valid_attr(const struct rte_flow_attr *attr,
return -rte_errno;
}
 
-   /* Not supported */
-   if (attr->priority) {
+   /* support priority for flow subscribe */
+   if (attr->priority > 1) {
rte_flow_error_set(error, EINVAL,
RTE_FLOW_ERROR_TYPE_ATTR_PRIORITY,
-   attr, "Not support priority.");
+   attr, "Only support priority 0 and 1.");
return -rte_errno;
}
 
@@ -2098,6 +2099,7 @@ static struct iavf_flow_engine *
 iavf_parse_engine_create(struct iavf_adapter *ad,
struct rte_flow *flow,
struct iavf_parser_list *parser_list,
+   uint32_t priority,
const struct rte_flow_item pattern[],
const struct rte_flow_action actions[],
struct rte_flow_error *error)
@@ -2111,7 +2113,7 @@ iavf_parse_engine_create(struct iavf_adapter *ad,
if (parser_node->parser->parse_pattern_action(ad,
parser_node->parser->array,
parser_node->parser->array_len,
-   pattern, actions, &meta, error) < 0)
+   pattern, actions, priority, &meta, error) < 0)
continue;
 
engine = parser_node->parser->engine;
@@ -2127,6 +2129,7 @@ static struct iavf_flow_engine *
 iavf_parse_engine_validate(struct iavf_adapter *ad,
struct rte_flow *flow,
struct iavf_parser_list *parser_list,
+   uint32_t priority,
const struct rte_flow_item pattern[],
const struct rte_flow_action actions[],
struct rte_flow_error *error)
@@ -2140,7 +2143,7 @@ iavf_parse_engine_validate(struct iavf_adapter *ad,
if (parser_node->parser->parse_pattern_action(ad,
parser_node->parser->array,
parser_node->parser->array_len,
-   patter

[PATCH v4 0/5] support flow subscription

2022-09-06 Thread Jie Wang
Add support for AVF to subscribe to a flow from the PF.

--
v4:
 * replace flow action represented_port with port_representor.
 * update commit log and rebase.
v3:
 * fix eth layer inputset.
 * rebase.
v2:
 * split v1 patch 2/2 to 4 small patches.
 * remove rule action RTE_FLOW_ACTION_TYPE_VF and add
   RTE_FLOW_ACTION_TYPE_REPRESENTED_PORT.

Jie Wang (5):
  common/iavf: support flow subscription
  net/iavf: add flow subscription to AVF
  net/iavf: support flow subscription pattern
  net/iavf: support flow subscription rule
  net/iavf: support priority of flow rule

 doc/guides/rel_notes/release_22_11.rst |   4 +
 drivers/common/iavf/virtchnl.h | 104 +++-
 drivers/net/iavf/iavf.h|  13 +
 drivers/net/iavf/iavf_fdir.c   |   4 +
 drivers/net/iavf/iavf_fsub.c   | 745 +
 drivers/net/iavf/iavf_generic_flow.c   |  40 +-
 drivers/net/iavf/iavf_generic_flow.h   |   2 +
 drivers/net/iavf/iavf_hash.c   |   5 +
 drivers/net/iavf/iavf_ipsec_crypto.c   |  16 +-
 drivers/net/iavf/iavf_vchnl.c  | 133 +
 drivers/net/iavf/meson.build   |   1 +
 11 files changed, 1046 insertions(+), 21 deletions(-)
 create mode 100644 drivers/net/iavf/iavf_fsub.c

-- 
2.25.1



[PATCH v4 1/5] common/iavf: support flow subscription

2022-09-06 Thread Jie Wang
A VF can subscribe to a flow from the PF with VIRTCHNL_OP_FLOW_SUBSCRIBE.

The PF is expected to offload a rule to hardware that redirects
packets matching the requested pattern to this VF.

Only a flow whose destination MAC address is the PF's MAC address can
be subscribed.

VIRTCHNL_VF_OFFLOAD_FSUB_PF is used for flow subscription capability
negotiation, and only a trusted VF can be granted this capability.

A flow can be unsubscribed with VIRTCHNL_OP_FLOW_UNSUBSCRIBE.

Signed-off-by: Jie Wang 
Signed-off-by: Qi Zhang 
---
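
The 136-byte length asserted for virtchnl_proto_hdr_w_msk follows from its fields (4 + 4 + 64 + 64), and the spec/mask pair suggests a masked comparison. The sketch below reproduces the layout with fixed-width types and shows one plausible way to apply the mask; the demo_* names are hypothetical, and the comparison rule is an assumption rather than something stated in the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Same field layout as virtchnl_proto_hdr_w_msk in the diff below. */
struct demo_proto_hdr_w_msk {
	int32_t type;
	uint32_t pad;
	uint8_t buffer_spec[64];
	uint8_t buffer_mask[64];
};

_Static_assert(sizeof(struct demo_proto_hdr_w_msk) == 136,
	       "4 + 4 + 64 + 64 bytes, no padding expected");

/*
 * Assumed semantics: a packet header matches when it equals the spec
 * on every bit that is set in the mask. Returns 1 on match, else 0.
 */
int
demo_hdr_matches(const struct demo_proto_hdr_w_msk *hdr,
		 const uint8_t *pkt, size_t len)
{
	size_t i;

	if (len > sizeof(hdr->buffer_spec))
		return 0;
	for (i = 0; i < len; i++) {
		if ((pkt[i] & hdr->buffer_mask[i]) !=
		    (hdr->buffer_spec[i] & hdr->buffer_mask[i]))
			return 0;
	}
	return 1;
}
```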
 drivers/common/iavf/virtchnl.h | 104 +++--
 1 file changed, 100 insertions(+), 4 deletions(-)

diff --git a/drivers/common/iavf/virtchnl.h b/drivers/common/iavf/virtchnl.h
index f123daec8e..e02eec4935 100644
--- a/drivers/common/iavf/virtchnl.h
+++ b/drivers/common/iavf/virtchnl.h
@@ -168,6 +168,8 @@ enum virtchnl_ops {
VIRTCHNL_OP_MAP_QUEUE_VECTOR = 111,
VIRTCHNL_OP_CONFIG_QUEUE_BW = 112,
VIRTCHNL_OP_CONFIG_QUANTA = 113,
+   VIRTCHNL_OP_FLOW_SUBSCRIBE = 114,
+   VIRTCHNL_OP_FLOW_UNSUBSCRIBE = 115,
VIRTCHNL_OP_MAX,
 };
 
@@ -282,6 +284,10 @@ static inline const char *virtchnl_op_str(enum virtchnl_ops v_opcode)
return "VIRTCHNL_OP_1588_PTP_GET_CAPS";
case VIRTCHNL_OP_1588_PTP_GET_TIME:
return "VIRTCHNL_OP_1588_PTP_GET_TIME";
+   case VIRTCHNL_OP_FLOW_SUBSCRIBE:
+   return "VIRTCHNL_OP_FLOW_SUBSCRIBE";
+   case VIRTCHNL_OP_FLOW_UNSUBSCRIBE:
+   return "VIRTCHNL_OP_FLOW_UNSUBSCRIBE";
case VIRTCHNL_OP_MAX:
return "VIRTCHNL_OP_MAX";
default:
@@ -401,6 +407,7 @@ VIRTCHNL_CHECK_STRUCT_LEN(16, virtchnl_vsi_resource);
 #define VIRTCHNL_VF_OFFLOAD_INLINE_IPSEC_CRYPTOBIT(8)
 #define VIRTCHNL_VF_LARGE_NUM_QPAIRS   BIT(9)
 #define VIRTCHNL_VF_OFFLOAD_CRCBIT(10)
+#define VIRTCHNL_VF_OFFLOAD_FSUB_PFBIT(14)
 #define VIRTCHNL_VF_OFFLOAD_VLAN_V2BIT(15)
 #define VIRTCHNL_VF_OFFLOAD_VLAN   BIT(16)
 #define VIRTCHNL_VF_OFFLOAD_RX_POLLING BIT(17)
@@ -1503,6 +1510,7 @@ enum virtchnl_vfr_states {
 };
 
 #define VIRTCHNL_MAX_NUM_PROTO_HDRS32
+#define VIRTCHNL_MAX_NUM_PROTO_HDRS_W_MSK  16
 #define VIRTCHNL_MAX_SIZE_RAW_PACKET   1024
 #define PROTO_HDR_SHIFT5
 #define PROTO_HDR_FIELD_START(proto_hdr_type) \
@@ -1695,6 +1703,22 @@ struct virtchnl_proto_hdr {
 
 VIRTCHNL_CHECK_STRUCT_LEN(72, virtchnl_proto_hdr);
 
+struct virtchnl_proto_hdr_w_msk {
+   /* see enum virtchnl_proto_hdr_type */
+   s32 type;
+   u32 pad;
+   /**
+* binary buffer in network order for specific header type.
+* For example, if type = VIRTCHNL_PROTO_HDR_IPV4, a IPv4
+* header is expected to be copied into the buffer.
+*/
+   u8 buffer_spec[64];
+   /* binary buffer for bit-mask applied to specific header type */
+   u8 buffer_mask[64];
+};
+
+VIRTCHNL_CHECK_STRUCT_LEN(136, virtchnl_proto_hdr_w_msk);
+
 struct virtchnl_proto_hdrs {
u8 tunnel_level;
/**
@@ -1706,11 +1730,18 @@ struct virtchnl_proto_hdrs {
 */
int count;
/**
-* number of proto layers, must < VIRTCHNL_MAX_NUM_PROTO_HDRS
-* must be 0 for a raw packet request.
+* count must <=
+* VIRTCHNL_MAX_NUM_PROTO_HDRS + VIRTCHNL_MAX_NUM_PROTO_HDRS_W_MSK
+* count = 0 :  select raw
+* 1 < count <= VIRTCHNL_MAX_NUM_PROTO_HDRS :   select proto_hdr
+* count > VIRTCHNL_MAX_NUM_PROTO_HDRS :select proto_hdr_w_msk
+* last valid index = count - VIRTCHNL_MAX_NUM_PROTO_HDRS
 */
union {
-   struct virtchnl_proto_hdr proto_hdr[VIRTCHNL_MAX_NUM_PROTO_HDRS];
+   struct virtchnl_proto_hdr
+   proto_hdr[VIRTCHNL_MAX_NUM_PROTO_HDRS];
+   struct virtchnl_proto_hdr_w_msk
+   proto_hdr_w_msk[VIRTCHNL_MAX_NUM_PROTO_HDRS_W_MSK];
struct {
u16 pkt_len;
u8 spec[VIRTCHNL_MAX_SIZE_RAW_PACKET];
@@ -1731,7 +1762,7 @@ struct virtchnl_rss_cfg {
 
 VIRTCHNL_CHECK_STRUCT_LEN(2444, virtchnl_rss_cfg);
 
-/* action configuration for FDIR */
+/* action configuration for FDIR and FSUB */
 struct virtchnl_filter_action {
/* see enum virtchnl_action type */
s32 type;
@@ -1849,6 +1880,65 @@ struct virtchnl_fdir_del {
 
 VIRTCHNL_CHECK_STRUCT_LEN(12, virtchnl_fdir_del);
 
+/* Status returned to VF after VF requests FSUB commands
+ * VIRTCHNL_FSUB_SUCCESS
+ * VF FLOW related request is successfully done by PF
+ * The request can be OP_FLOW_SUBSCRIBE/UNSUBSCRIBE.
+ *
+ * VIRTCHNL_FSUB_FAILURE_RULE_NORESOURCE
+ * OP_FLOW_SUBSCRIBE request is failed due to no Hardware resource.
+ *
+ * VIRTCHNL_FSUB_FAILURE_RULE_EXIST
+ * OP_FLOW_SUBSCRIBE request is failed due to the rule is already existed.
+ 

[PATCH v4 2/5] net/iavf: add flow subscription to AVF

2022-09-06 Thread Jie Wang
Add the skeletal code of flow subscription to AVF driver.

Signed-off-by: Jie Wang 
---
 doc/guides/rel_notes/release_22_11.rst |   4 +
 drivers/net/iavf/iavf_fsub.c   | 112 +
 drivers/net/iavf/iavf_generic_flow.c   |  17 +++-
 drivers/net/iavf/iavf_generic_flow.h   |   1 +
 drivers/net/iavf/iavf_vchnl.c  |   1 +
 drivers/net/iavf/meson.build   |   1 +
 6 files changed, 135 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/iavf/iavf_fsub.c

diff --git a/doc/guides/rel_notes/release_22_11.rst b/doc/guides/rel_notes/release_22_11.rst
index 8c021cf050..bb77a03e24 100644
--- a/doc/guides/rel_notes/release_22_11.rst
+++ b/doc/guides/rel_notes/release_22_11.rst
@@ -55,6 +55,10 @@ New Features
  Also, make sure to start the actual text at the margin.
  ===
 
+* **Updated Intel iavf driver.**
+
+  * Added flow subscription support.
+
 
 Removed Items
 -
diff --git a/drivers/net/iavf/iavf_fsub.c b/drivers/net/iavf/iavf_fsub.c
new file mode 100644
index 00..17f9bb2976
--- /dev/null
+++ b/drivers/net/iavf/iavf_fsub.c
@@ -0,0 +1,112 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2022 Intel Corporation
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "iavf_generic_flow.h"
+
+
+static struct iavf_flow_parser iavf_fsub_parser;
+
+static struct iavf_pattern_match_item iavf_fsub_pattern_list[] = {};
+
+static int
+iavf_fsub_create(__rte_unused struct iavf_adapter *ad,
+__rte_unused struct rte_flow *flow,
+__rte_unused void *meta,
+__rte_unused struct rte_flow_error *error)
+{
+   return -rte_errno;
+}
+
+static int
+iavf_fsub_destroy(__rte_unused struct iavf_adapter *ad,
+ __rte_unused struct rte_flow *flow,
+ __rte_unused struct rte_flow_error *error)
+{
+   return -rte_errno;
+}
+
+static int
+iavf_fsub_validation(__rte_unused struct iavf_adapter *ad,
+__rte_unused struct rte_flow *flow,
+__rte_unused void *meta,
+__rte_unused struct rte_flow_error *error)
+{
+   return -rte_errno;
+};
+
+static int
+iavf_fsub_parse(__rte_unused struct iavf_adapter *ad,
+   __rte_unused struct iavf_pattern_match_item *array,
+   __rte_unused uint32_t array_len,
+   __rte_unused const struct rte_flow_item pattern[],
+   __rte_unused const struct rte_flow_action actions[],
+   __rte_unused void **meta,
+   __rte_unused struct rte_flow_error *error)
+{
+   return -rte_errno;
+}
+
+static int
+iavf_fsub_init(struct iavf_adapter *ad)
+{
+   struct iavf_info *vf = IAVF_DEV_PRIVATE_TO_VF(ad);
+   struct iavf_flow_parser *parser;
+
+   if (!vf->vf_res)
+   return -EINVAL;
+
+   if (vf->vf_res->vf_cap_flags & VIRTCHNL_VF_OFFLOAD_FSUB_PF)
+   parser = &iavf_fsub_parser;
+   else
+   return -ENOTSUP;
+
+   return iavf_register_parser(parser, ad);
+}
+
+static void
+iavf_fsub_uninit(struct iavf_adapter *ad)
+{
+   iavf_unregister_parser(&iavf_fsub_parser, ad);
+}
+
+static struct
+iavf_flow_engine iavf_fsub_engine = {
+   .init = iavf_fsub_init,
+   .uninit = iavf_fsub_uninit,
+   .create = iavf_fsub_create,
+   .destroy = iavf_fsub_destroy,
+   .validation = iavf_fsub_validation,
+   .type = IAVF_FLOW_ENGINE_FSUB,
+};
+
+static struct
+iavf_flow_parser iavf_fsub_parser = {
+   .engine = &iavf_fsub_engine,
+   .array = iavf_fsub_pattern_list,
+   .array_len = RTE_DIM(iavf_fsub_pattern_list),
+   .parse_pattern_action = iavf_fsub_parse,
+   .stage = IAVF_FLOW_STAGE_DISTRIBUTOR,
+};
+
+RTE_INIT(iavf_fsub_engine_init)
+{
+   iavf_register_flow_engine(&iavf_fsub_engine);
+}
diff --git a/drivers/net/iavf/iavf_generic_flow.c b/drivers/net/iavf/iavf_generic_flow.c
index e1a611e319..b04614ba6e 100644
--- a/drivers/net/iavf/iavf_generic_flow.c
+++ b/drivers/net/iavf/iavf_generic_flow.c
@@ -1866,6 +1866,8 @@ iavf_register_parser(struct iavf_flow_parser *parser,
 {
struct iavf_parser_list *list = NULL;
struct iavf_flow_parser_node *parser_node;
+   struct iavf_flow_parser_node *existing_node;
+   void *temp;
struct iavf_info *vf = IAVF_DEV_PRIVATE_TO_VF(ad);
 
parser_node = rte_zmalloc("iavf_parser", sizeof(*parser_node), 0);
@@ -1880,14 +1882,26 @@ iavf_register_parser(struct iavf_flow_parser *parser,
TAILQ_INSERT_TAIL(list, parser_node, node);
} else if (parser->engine->type == IAVF_FLOW_ENGINE_FDIR) {
list = &vf->dist_parser_list;
+   RTE_TAILQ_FOREACH_SAFE(existing_node, list, node, temp) {
+   if (existing_

[PATCH v4 3/5] net/iavf: support flow subscription pattern

2022-09-06 Thread Jie Wang
Add flow subscription pattern support for AVF.

The supported patterns are listed below:
eth/vlan/ipv4
eth/ipv4(6)
eth/ipv4(6)/udp
eth/ipv4(6)/tcp

Signed-off-by: Jie Wang 
---
 drivers/net/iavf/iavf.h  |   7 +
 drivers/net/iavf/iavf_fsub.c | 598 ++-
 2 files changed, 597 insertions(+), 8 deletions(-)

diff --git a/drivers/net/iavf/iavf.h b/drivers/net/iavf/iavf.h
index 025ab3ff60..f79c7f9f6e 100644
--- a/drivers/net/iavf/iavf.h
+++ b/drivers/net/iavf/iavf.h
@@ -148,6 +148,13 @@ struct iavf_fdir_info {
struct iavf_fdir_conf conf;
 };
 
+struct iavf_fsub_conf {
+   struct virtchnl_flow_sub sub_fltr;
+   struct virtchnl_flow_unsub unsub_fltr;
+   uint64_t input_set;
+   uint32_t flow_id;
+};
+
 struct iavf_qv_map {
uint16_t queue_id;
uint16_t vector_id;
diff --git a/drivers/net/iavf/iavf_fsub.c b/drivers/net/iavf/iavf_fsub.c
index 17f9bb2976..66e403d585 100644
--- a/drivers/net/iavf/iavf_fsub.c
+++ b/drivers/net/iavf/iavf_fsub.c
@@ -22,9 +22,51 @@
 #include "iavf_generic_flow.h"
 
 
+#define MAX_QGRP_NUM_TYPE  7
+#define IAVF_IPV6_ADDR_LENGTH  16
+#define MAX_INPUT_SET_BYTE 32
+
+#define IAVF_SW_INSET_ETHER ( \
+   IAVF_INSET_DMAC | IAVF_INSET_SMAC | IAVF_INSET_ETHERTYPE)
+#define IAVF_SW_INSET_MAC_IPV4 ( \
+   IAVF_INSET_DMAC | IAVF_INSET_IPV4_DST | IAVF_INSET_IPV4_SRC | \
+   IAVF_INSET_IPV4_PROTO | IAVF_INSET_IPV4_TTL | IAVF_INSET_IPV4_TOS)
+#define IAVF_SW_INSET_MAC_VLAN_IPV4 ( \
+   IAVF_SW_INSET_MAC_IPV4 | IAVF_INSET_VLAN_OUTER)
+#define IAVF_SW_INSET_MAC_IPV4_TCP ( \
+   IAVF_INSET_DMAC | IAVF_INSET_IPV4_DST | IAVF_INSET_IPV4_SRC | \
+   IAVF_INSET_IPV4_TTL | IAVF_INSET_IPV4_TOS | \
+   IAVF_INSET_TCP_DST_PORT | IAVF_INSET_TCP_SRC_PORT)
+#define IAVF_SW_INSET_MAC_IPV4_UDP ( \
+   IAVF_INSET_DMAC | IAVF_INSET_IPV4_DST | IAVF_INSET_IPV4_SRC | \
+   IAVF_INSET_IPV4_TTL | IAVF_INSET_IPV4_TOS | \
+   IAVF_INSET_UDP_DST_PORT | IAVF_INSET_UDP_SRC_PORT)
+#define IAVF_SW_INSET_MAC_IPV6 ( \
+   IAVF_INSET_DMAC | IAVF_INSET_IPV6_DST | IAVF_INSET_IPV6_SRC | \
+   IAVF_INSET_IPV6_TC | IAVF_INSET_IPV6_HOP_LIMIT | \
+   IAVF_INSET_IPV6_NEXT_HDR)
+#define IAVF_SW_INSET_MAC_IPV6_TCP ( \
+   IAVF_INSET_DMAC | IAVF_INSET_IPV6_DST | IAVF_INSET_IPV6_SRC | \
+   IAVF_INSET_IPV6_HOP_LIMIT | IAVF_INSET_IPV6_TC | \
+   IAVF_INSET_TCP_DST_PORT | IAVF_INSET_TCP_SRC_PORT)
+#define IAVF_SW_INSET_MAC_IPV6_UDP ( \
+   IAVF_INSET_DMAC | IAVF_INSET_IPV6_DST | IAVF_INSET_IPV6_SRC | \
+   IAVF_INSET_IPV6_HOP_LIMIT | IAVF_INSET_IPV6_TC | \
+   IAVF_INSET_UDP_DST_PORT | IAVF_INSET_UDP_SRC_PORT)
+
 static struct iavf_flow_parser iavf_fsub_parser;
 
-static struct iavf_pattern_match_item iavf_fsub_pattern_list[] = {};
+static struct
+iavf_pattern_match_item iavf_fsub_pattern_list[] = {
+   {iavf_pattern_ethertype,        IAVF_SW_INSET_ETHER,            IAVF_INSET_NONE},
+   {iavf_pattern_eth_ipv4,         IAVF_SW_INSET_MAC_IPV4,         IAVF_INSET_NONE},
+   {iavf_pattern_eth_vlan_ipv4,    IAVF_SW_INSET_MAC_VLAN_IPV4,    IAVF_INSET_NONE},
+   {iavf_pattern_eth_ipv4_udp,     IAVF_SW_INSET_MAC_IPV4_UDP,     IAVF_INSET_NONE},
+   {iavf_pattern_eth_ipv4_tcp,     IAVF_SW_INSET_MAC_IPV4_TCP,     IAVF_INSET_NONE},
+   {iavf_pattern_eth_ipv6,         IAVF_SW_INSET_MAC_IPV6,         IAVF_INSET_NONE},
+   {iavf_pattern_eth_ipv6_udp,     IAVF_SW_INSET_MAC_IPV6_UDP,     IAVF_INSET_NONE},
+   {iavf_pattern_eth_ipv6_tcp,     IAVF_SW_INSET_MAC_IPV6_TCP,     IAVF_INSET_NONE},
+};
 
 static int
 iavf_fsub_create(__rte_unused struct iavf_adapter *ad,
@@ -53,17 +95,557 @@ iavf_fsub_validation(__rte_unused struct iavf_adapter *ad,
 };
 
 static int
-iavf_fsub_parse(__rte_unused struct iavf_adapter *ad,
-   __rte_unused struct iavf_pattern_match_item *array,
-   __rte_unused uint32_t array_len,
-   __rte_unused const struct rte_flow_item pattern[],
-   __rte_unused const struct rte_flow_action actions[],
-   __rte_unused void **meta,
-   __rte_unused struct rte_flow_error *error)
+iavf_fsub_parse_pattern(const struct rte_flow_item pattern[],
+   const uint64_t input_set_mask,
+   struct rte_flow_error *error,
+   struct iavf_fsub_conf *filter)
+{
+   struct virtchnl_proto_hdrs *hdrs = &filter->sub_fltr.proto_hdrs;
+   enum rte_flow_item_type item_type;
+   const struct rte_flow_item_eth *eth_spec, *eth_mask;
+   const struct rte_flow_item_ipv4 *ipv4_spec, *ipv4_mask;
+   const struct rte_flow_item_ipv6 *ipv6_spec, *ipv6_mask;
+   const struct rte_flow_item_tcp *tcp_spec, *tcp_mask;
+   const struct rte_flow_item_udp *u

[PATCH v4 4/5] net/iavf: support flow subscription rule

2022-09-06 Thread Jie Wang
Support creating, destroying, and validating flow subscription
rules for AVF.

For example:
testpmd> flow create 0 ingress pattern eth / ipv4 / udp src is 11
  / end actions represented_port port_id 1 / end
testpmd> flow validate 1 ingress pattern eth / ipv4 / tcp src is 22
  / end actions represented_port port_id 1 / end
testpmd> flow destroy 1 rule 0

When a VF subscribes to a rule, the matching packets are sent to the VF
instead of the PF, and only the VF receives them.

Multiple VFs may subscribe to the same rule; the packets are then
replicated and received by each subscribed VF.

The PF destroys all subscriptions during VF reset.

Signed-off-by: Jie Wang 
---
 drivers/net/iavf/iavf.h   |   6 ++
 drivers/net/iavf/iavf_fsub.c  |  75 +++
 drivers/net/iavf/iavf_vchnl.c | 132 ++
 3 files changed, 201 insertions(+), 12 deletions(-)

diff --git a/drivers/net/iavf/iavf.h b/drivers/net/iavf/iavf.h
index f79c7f9f6e..26b858f6f0 100644
--- a/drivers/net/iavf/iavf.h
+++ b/drivers/net/iavf/iavf.h
@@ -489,4 +489,10 @@ int iavf_ipsec_crypto_request(struct iavf_adapter *adapter,
 extern const struct rte_tm_ops iavf_tm_ops;
 int iavf_get_ptp_cap(struct iavf_adapter *adapter);
 int iavf_get_phc_time(struct iavf_rx_queue *rxq);
+int iavf_flow_sub(struct iavf_adapter *adapter,
+ struct iavf_fsub_conf *filter);
+int iavf_flow_unsub(struct iavf_adapter *adapter,
+   struct iavf_fsub_conf *filter);
+int iavf_flow_sub_check(struct iavf_adapter *adapter,
+   struct iavf_fsub_conf *filter);
 #endif /* _IAVF_ETHDEV_H_ */
diff --git a/drivers/net/iavf/iavf_fsub.c b/drivers/net/iavf/iavf_fsub.c
index 66e403d585..28857d7577 100644
--- a/drivers/net/iavf/iavf_fsub.c
+++ b/drivers/net/iavf/iavf_fsub.c
@@ -69,29 +69,80 @@ iavf_pattern_match_item iavf_fsub_pattern_list[] = {
 };
 
 static int
-iavf_fsub_create(__rte_unused struct iavf_adapter *ad,
-__rte_unused struct rte_flow *flow,
-__rte_unused void *meta,
-__rte_unused struct rte_flow_error *error)
+iavf_fsub_create(struct iavf_adapter *ad, struct rte_flow *flow,
+void *meta, struct rte_flow_error *error)
 {
+   struct iavf_fsub_conf *filter = meta;
+   struct iavf_fsub_conf *rule;
+   int ret;
+
+   rule = rte_zmalloc("fsub_entry", sizeof(*rule), 0);
+   if (!rule) {
+   rte_flow_error_set(error, ENOMEM,
+   RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+   "Failed to allocate memory for fsub rule");
+   return -rte_errno;
+   }
+
+   ret = iavf_flow_sub(ad, filter);
+   if (ret) {
+   rte_flow_error_set(error, -ret,
+  RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+  "Failed to subscribe flow rule.");
+   goto free_entry;
+   }
+
+   rte_memcpy(rule, filter, sizeof(*rule));
+   flow->rule = rule;
+
+   return ret;
+
+free_entry:
+   rte_free(rule);
return -rte_errno;
 }
 
 static int
-iavf_fsub_destroy(__rte_unused struct iavf_adapter *ad,
- __rte_unused struct rte_flow *flow,
- __rte_unused struct rte_flow_error *error)
+iavf_fsub_destroy(struct iavf_adapter *ad, struct rte_flow *flow,
+ struct rte_flow_error *error)
 {
-   return -rte_errno;
+   struct iavf_fsub_conf *filter;
+   int ret;
+
+   filter = (struct iavf_fsub_conf *)flow->rule;
+
+   ret = iavf_flow_unsub(ad, filter);
+   if (ret) {
+   rte_flow_error_set(error, -ret,
+  RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+  "Failed to unsubscribe flow rule.");
+   return -rte_errno;
+   }
+
+   flow->rule = NULL;
+   rte_free(filter);
+
+   return ret;
 }
 
 static int
-iavf_fsub_validation(__rte_unused struct iavf_adapter *ad,
+iavf_fsub_validation(struct iavf_adapter *ad,
 __rte_unused struct rte_flow *flow,
-__rte_unused void *meta,
-__rte_unused struct rte_flow_error *error)
+void *meta,
+struct rte_flow_error *error)
 {
-   return -rte_errno;
+   struct iavf_fsub_conf *filter = meta;
+   int ret;
+
+   ret = iavf_flow_sub_check(ad, filter);
+   if (ret) {
+   rte_flow_error_set(error, -ret,
+  RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+  "Failed to validate filter rule.");
+   return -rte_errno;
+   }
+
+   return ret;
 };
 
 static int
diff --git a/drivers/net/iavf/iavf_vchnl.c b/drivers/net/iavf/iavf_vchnl.c
index 6d84add423..cc0db8d093 100644
--- a/drivers/net/iavf/iavf_vchnl.c
+++ b/drivers/net/iavf/iavf_vchnl.c
@@ -1534,6 +1534,138 @@ iavf_fdir_check(struct iavf_adapter *adapte

[PATCH v4 5/5] net/iavf: support priority of flow rule

2022-09-06 Thread Jie Wang
Add support for the flow rule attribute "priority" for AVF.

Lower values denote higher priority; the highest priority for
a flow rule is 0.

All subscription rules have a lower priority than the rules
created by the host.
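
The priority rule above can be sketched in a few lines of C. This is
only an illustration, not the driver code: the helper names and the
enum are hypothetical, and the only facts taken from the patch are
that priorities 0 and 1 are accepted and that a lower value means a
higher priority.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical names; only the accepted range 0..1 comes from the patch. */
enum { FSUB_PRIO_HIGHEST = 0, FSUB_PRIO_LOWEST = 1 };

/* Return 0 if the priority is accepted, -EINVAL otherwise. */
static int
check_flow_priority(uint32_t priority)
{
	return priority > FSUB_PRIO_LOWEST ? -EINVAL : 0;
}

/* Lower value wins: returns nonzero when priority a outranks priority b. */
static int
priority_outranks(uint32_t a, uint32_t b)
{
	assert(check_flow_priority(a) == 0 && check_flow_priority(b) == 0);
	return a < b;
}
```

With these helpers, check_flow_priority(2) is rejected, and
priority_outranks(0, 1) is true because 0 is the highest priority.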

Signed-off-by: Jie Wang 
---
 drivers/net/iavf/iavf_fdir.c |  4 
 drivers/net/iavf/iavf_fsub.c |  2 +-
 drivers/net/iavf/iavf_generic_flow.c | 23 +--
 drivers/net/iavf/iavf_generic_flow.h |  1 +
 drivers/net/iavf/iavf_hash.c |  5 +
 drivers/net/iavf/iavf_ipsec_crypto.c | 16 ++--
 6 files changed, 34 insertions(+), 17 deletions(-)

diff --git a/drivers/net/iavf/iavf_fdir.c b/drivers/net/iavf/iavf_fdir.c
index a397047fdb..8f80873925 100644
--- a/drivers/net/iavf/iavf_fdir.c
+++ b/drivers/net/iavf/iavf_fdir.c
@@ -1583,6 +1583,7 @@ iavf_fdir_parse(struct iavf_adapter *ad,
uint32_t array_len,
const struct rte_flow_item pattern[],
const struct rte_flow_action actions[],
+   uint32_t priority,
void **meta,
struct rte_flow_error *error)
 {
@@ -1593,6 +1594,9 @@ iavf_fdir_parse(struct iavf_adapter *ad,
 
memset(filter, 0, sizeof(*filter));
 
+   if (priority >= 1)
+   return -rte_errno;
+
item = iavf_search_pattern_match_item(pattern, array, array_len, error);
if (!item)
return -rte_errno;
diff --git a/drivers/net/iavf/iavf_fsub.c b/drivers/net/iavf/iavf_fsub.c
index 28857d7577..3bb6c30d3c 100644
--- a/drivers/net/iavf/iavf_fsub.c
+++ b/drivers/net/iavf/iavf_fsub.c
@@ -649,13 +649,13 @@ iavf_fsub_parse(struct iavf_adapter *ad,
uint32_t array_len,
const struct rte_flow_item pattern[],
const struct rte_flow_action actions[],
+   uint32_t priority,
void **meta,
struct rte_flow_error *error)
 {
struct iavf_fsub_conf *filter;
struct iavf_pattern_match_item *pattern_match_item = NULL;
int ret = 0;
-   uint32_t priority = 0;
 
filter = rte_zmalloc(NULL, sizeof(*filter), 0);
if (!filter) {
diff --git a/drivers/net/iavf/iavf_generic_flow.c b/drivers/net/iavf/iavf_generic_flow.c
index b04614ba6e..f33c764764 100644
--- a/drivers/net/iavf/iavf_generic_flow.c
+++ b/drivers/net/iavf/iavf_generic_flow.c
@@ -1785,6 +1785,7 @@ enum rte_flow_item_type iavf_pattern_eth_ipv6_udp_l2tpv2_ppp_ipv6_tcp[] = {
 typedef struct iavf_flow_engine * (*parse_engine_t)(struct iavf_adapter *ad,
struct rte_flow *flow,
struct iavf_parser_list *parser_list,
+   uint32_t priority,
const struct rte_flow_item pattern[],
const struct rte_flow_action actions[],
struct rte_flow_error *error);
@@ -1951,11 +1952,11 @@ iavf_flow_valid_attr(const struct rte_flow_attr *attr,
return -rte_errno;
}
 
-   /* Not supported */
-   if (attr->priority) {
+   /* support priority for flow subscribe */
+   if (attr->priority > 1) {
rte_flow_error_set(error, EINVAL,
RTE_FLOW_ERROR_TYPE_ATTR_PRIORITY,
-   attr, "Not support priority.");
+   attr, "Only support priority 0 and 1.");
return -rte_errno;
}
 
@@ -2098,6 +2099,7 @@ static struct iavf_flow_engine *
 iavf_parse_engine_create(struct iavf_adapter *ad,
struct rte_flow *flow,
struct iavf_parser_list *parser_list,
+   uint32_t priority,
const struct rte_flow_item pattern[],
const struct rte_flow_action actions[],
struct rte_flow_error *error)
@@ -2111,7 +2113,7 @@ iavf_parse_engine_create(struct iavf_adapter *ad,
if (parser_node->parser->parse_pattern_action(ad,
parser_node->parser->array,
parser_node->parser->array_len,
-   pattern, actions, &meta, error) < 0)
+   pattern, actions, priority, &meta, error) < 0)
continue;
 
engine = parser_node->parser->engine;
@@ -2127,6 +2129,7 @@ static struct iavf_flow_engine *
 iavf_parse_engine_validate(struct iavf_adapter *ad,
struct rte_flow *flow,
struct iavf_parser_list *parser_list,
+   uint32_t priority,
const struct rte_flow_item pattern[],
const struct rte_flow_action actions[],
struct rte_flow_error *error)
@@ -2140,7 +2143,7 @@ iavf_parse_engine_validate(struct iavf_adapter *ad,
if (parser_node->parser->parse_pattern_action(ad,
parser_node->parser->array,
parser_node->parser->array_len,
-   patter

[PATCH v5 1/5] common/iavf: support flow subscription

2022-09-06 Thread Jie Wang
A VF can subscribe to a flow from the PF with VIRTCHNL_OP_FLOW_SUBSCRIBE.

The PF is expected to offload a rule to hardware that redirects
packets matching the requested pattern to this VF.

Only a flow whose destination MAC address is the PF's MAC address
can be subscribed.

VIRTCHNL_VF_OFFLOAD_FSUB_PF is used for flow subscription capability
negotiation, and only a trusted VF can be granted this capability.

A flow can be unsubscribed with VIRTCHNL_OP_FLOW_UNSUBSCRIBE.
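
The capability negotiation described above might look roughly like the
sketch below. The bit position (14) and the VF-side flag test come from
this series; pf_grant_caps() and its "trusted" argument are
hypothetical stand-ins for whatever the PF driver actually does, and
BIT() is redefined locally so the fragment is self-contained.

```c
#include <stdbool.h>
#include <stdint.h>

#define BIT(n) ((uint32_t)1 << (n))
/* Bit position taken from the patch. */
#define VIRTCHNL_VF_OFFLOAD_FSUB_PF BIT(14)

/*
 * Hypothetical PF-side helper: an untrusted VF never gets the
 * flow subscription capability, whatever it requested.
 */
static uint32_t
pf_grant_caps(uint32_t requested, bool trusted)
{
	if (!trusted)
		requested &= ~VIRTCHNL_VF_OFFLOAD_FSUB_PF;
	return requested;
}

/* VF-side check, mirroring the flag test in iavf_fsub_init(). */
static bool
vf_has_fsub(uint32_t vf_cap_flags)
{
	return (vf_cap_flags & VIRTCHNL_VF_OFFLOAD_FSUB_PF) != 0;
}
```

A VF would call vf_has_fsub() on the granted flags before registering
the flow subscription parser, and return -ENOTSUP otherwise.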

Signed-off-by: Jie Wang 
Signed-off-by: Qi Zhang 
---
 drivers/common/iavf/virtchnl.h | 104 +++--
 1 file changed, 100 insertions(+), 4 deletions(-)

diff --git a/drivers/common/iavf/virtchnl.h b/drivers/common/iavf/virtchnl.h
index f123daec8e..e02eec4935 100644
--- a/drivers/common/iavf/virtchnl.h
+++ b/drivers/common/iavf/virtchnl.h
@@ -168,6 +168,8 @@ enum virtchnl_ops {
VIRTCHNL_OP_MAP_QUEUE_VECTOR = 111,
VIRTCHNL_OP_CONFIG_QUEUE_BW = 112,
VIRTCHNL_OP_CONFIG_QUANTA = 113,
+   VIRTCHNL_OP_FLOW_SUBSCRIBE = 114,
+   VIRTCHNL_OP_FLOW_UNSUBSCRIBE = 115,
VIRTCHNL_OP_MAX,
 };
 
@@ -282,6 +284,10 @@ static inline const char *virtchnl_op_str(enum virtchnl_ops v_opcode)
return "VIRTCHNL_OP_1588_PTP_GET_CAPS";
case VIRTCHNL_OP_1588_PTP_GET_TIME:
return "VIRTCHNL_OP_1588_PTP_GET_TIME";
+   case VIRTCHNL_OP_FLOW_SUBSCRIBE:
+   return "VIRTCHNL_OP_FLOW_SUBSCRIBE";
+   case VIRTCHNL_OP_FLOW_UNSUBSCRIBE:
+   return "VIRTCHNL_OP_FLOW_UNSUBSCRIBE";
case VIRTCHNL_OP_MAX:
return "VIRTCHNL_OP_MAX";
default:
@@ -401,6 +407,7 @@ VIRTCHNL_CHECK_STRUCT_LEN(16, virtchnl_vsi_resource);
 #define VIRTCHNL_VF_OFFLOAD_INLINE_IPSEC_CRYPTO BIT(8)
 #define VIRTCHNL_VF_LARGE_NUM_QPAIRS            BIT(9)
 #define VIRTCHNL_VF_OFFLOAD_CRC                 BIT(10)
+#define VIRTCHNL_VF_OFFLOAD_FSUB_PF             BIT(14)
 #define VIRTCHNL_VF_OFFLOAD_VLAN_V2             BIT(15)
 #define VIRTCHNL_VF_OFFLOAD_VLAN                BIT(16)
 #define VIRTCHNL_VF_OFFLOAD_RX_POLLING          BIT(17)
@@ -1503,6 +1510,7 @@ enum virtchnl_vfr_states {
 };
 
 #define VIRTCHNL_MAX_NUM_PROTO_HDRS        32
+#define VIRTCHNL_MAX_NUM_PROTO_HDRS_W_MSK  16
 #define VIRTCHNL_MAX_SIZE_RAW_PACKET       1024
 #define PROTO_HDR_SHIFT                    5
 #define PROTO_HDR_FIELD_START(proto_hdr_type) \
@@ -1695,6 +1703,22 @@ struct virtchnl_proto_hdr {
 
 VIRTCHNL_CHECK_STRUCT_LEN(72, virtchnl_proto_hdr);
 
+struct virtchnl_proto_hdr_w_msk {
+   /* see enum virtchnl_proto_hdr_type */
+   s32 type;
+   u32 pad;
+   /**
+* binary buffer in network order for specific header type.
+* For example, if type = VIRTCHNL_PROTO_HDR_IPV4, a IPv4
+* header is expected to be copied into the buffer.
+*/
+   u8 buffer_spec[64];
+   /* binary buffer for bit-mask applied to specific header type */
+   u8 buffer_mask[64];
+};
+
+VIRTCHNL_CHECK_STRUCT_LEN(136, virtchnl_proto_hdr_w_msk);
+
 struct virtchnl_proto_hdrs {
u8 tunnel_level;
/**
@@ -1706,11 +1730,18 @@ struct virtchnl_proto_hdrs {
 */
int count;
/**
-* number of proto layers, must < VIRTCHNL_MAX_NUM_PROTO_HDRS
-* must be 0 for a raw packet request.
+* count must <=
+* VIRTCHNL_MAX_NUM_PROTO_HDRS + VIRTCHNL_MAX_NUM_PROTO_HDRS_W_MSK
+* count = 0 : select raw
+* 1 < count <= VIRTCHNL_MAX_NUM_PROTO_HDRS : select proto_hdr
+* count > VIRTCHNL_MAX_NUM_PROTO_HDRS : select proto_hdr_w_msk
+* last valid index = count - VIRTCHNL_MAX_NUM_PROTO_HDRS
 */
union {
-   struct virtchnl_proto_hdr proto_hdr[VIRTCHNL_MAX_NUM_PROTO_HDRS];
+   struct virtchnl_proto_hdr
+   proto_hdr[VIRTCHNL_MAX_NUM_PROTO_HDRS];
+   struct virtchnl_proto_hdr_w_msk
+   proto_hdr_w_msk[VIRTCHNL_MAX_NUM_PROTO_HDRS_W_MSK];
struct {
u16 pkt_len;
u8 spec[VIRTCHNL_MAX_SIZE_RAW_PACKET];
@@ -1731,7 +1762,7 @@ struct virtchnl_rss_cfg {
 
 VIRTCHNL_CHECK_STRUCT_LEN(2444, virtchnl_rss_cfg);
 
-/* action configuration for FDIR */
+/* action configuration for FDIR and FSUB */
 struct virtchnl_filter_action {
/* see enum virtchnl_action type */
s32 type;
@@ -1849,6 +1880,65 @@ struct virtchnl_fdir_del {
 
 VIRTCHNL_CHECK_STRUCT_LEN(12, virtchnl_fdir_del);
 
+/* Status returned to VF after VF requests FSUB commands
+ * VIRTCHNL_FSUB_SUCCESS
+ * VF FLOW related request is successfully done by PF
+ * The request can be OP_FLOW_SUBSCRIBE/UNSUBSCRIBE.
+ *
+ * VIRTCHNL_FSUB_FAILURE_RULE_NORESOURCE
+ * OP_FLOW_SUBSCRIBE request failed due to no hardware resource.
+ *
+ * VIRTCHNL_FSUB_FAILURE_RULE_EXIST
+ * OP_FLOW_SUBSCRIBE request failed because the rule already exists.
+ 

[PATCH v5 0/5] support flow subscription

2022-09-06 Thread Jie Wang
Add support for AVF to subscribe to a flow from the PF.

--
v4:
 * replace flow action represented_port with port_representor.
 * update commit log and rebase.
v3:
 * fix eth layer inputset.
 * rebase.
v2:
 * split v1 patch 2/2 to 4 small patches.
 * remove rule action RTE_FLOW_ACTION_TYPE_VF and add
   RTE_FLOW_ACTION_TYPE_REPRESENTED_PORT.

Jie Wang (5):
  common/iavf: support flow subscription
  net/iavf: add flow subscription to AVF
  net/iavf: support flow subscription pattern
  net/iavf: support flow subscription rule
  net/iavf: support priority of flow rule

 doc/guides/rel_notes/release_22_11.rst |   4 +
 drivers/common/iavf/virtchnl.h | 104 +++-
 drivers/net/iavf/iavf.h|  13 +
 drivers/net/iavf/iavf_fdir.c   |   4 +
 drivers/net/iavf/iavf_fsub.c   | 745 +
 drivers/net/iavf/iavf_generic_flow.c   |  40 +-
 drivers/net/iavf/iavf_generic_flow.h   |   2 +
 drivers/net/iavf/iavf_hash.c   |   5 +
 drivers/net/iavf/iavf_ipsec_crypto.c   |  16 +-
 drivers/net/iavf/iavf_vchnl.c  | 133 +
 drivers/net/iavf/meson.build   |   1 +
 11 files changed, 1046 insertions(+), 21 deletions(-)
 create mode 100644 drivers/net/iavf/iavf_fsub.c

-- 
2.25.1



[PATCH v5 2/5] net/iavf: add flow subscription to AVF

2022-09-06 Thread Jie Wang
Add skeleton code for flow subscription to the AVF driver.

Signed-off-by: Jie Wang 
---
 doc/guides/rel_notes/release_22_11.rst |   4 +
 drivers/net/iavf/iavf_fsub.c   | 112 +
 drivers/net/iavf/iavf_generic_flow.c   |  17 +++-
 drivers/net/iavf/iavf_generic_flow.h   |   1 +
 drivers/net/iavf/iavf_vchnl.c  |   1 +
 drivers/net/iavf/meson.build   |   1 +
 6 files changed, 135 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/iavf/iavf_fsub.c

diff --git a/doc/guides/rel_notes/release_22_11.rst b/doc/guides/rel_notes/release_22_11.rst
index 8c021cf050..bb77a03e24 100644
--- a/doc/guides/rel_notes/release_22_11.rst
+++ b/doc/guides/rel_notes/release_22_11.rst
@@ -55,6 +55,10 @@ New Features
  Also, make sure to start the actual text at the margin.
  ===
 
+* **Updated Intel iavf driver.**
+
+  * Added flow subscription support.
+
 
 Removed Items
 -
diff --git a/drivers/net/iavf/iavf_fsub.c b/drivers/net/iavf/iavf_fsub.c
new file mode 100644
index 00..17f9bb2976
--- /dev/null
+++ b/drivers/net/iavf/iavf_fsub.c
@@ -0,0 +1,112 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2022 Intel Corporation
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "iavf_generic_flow.h"
+
+
+static struct iavf_flow_parser iavf_fsub_parser;
+
+static struct iavf_pattern_match_item iavf_fsub_pattern_list[] = {};
+
+static int
+iavf_fsub_create(__rte_unused struct iavf_adapter *ad,
+__rte_unused struct rte_flow *flow,
+__rte_unused void *meta,
+__rte_unused struct rte_flow_error *error)
+{
+   return -rte_errno;
+}
+
+static int
+iavf_fsub_destroy(__rte_unused struct iavf_adapter *ad,
+ __rte_unused struct rte_flow *flow,
+ __rte_unused struct rte_flow_error *error)
+{
+   return -rte_errno;
+}
+
+static int
+iavf_fsub_validation(__rte_unused struct iavf_adapter *ad,
+__rte_unused struct rte_flow *flow,
+__rte_unused void *meta,
+__rte_unused struct rte_flow_error *error)
+{
+   return -rte_errno;
+};
+
+static int
+iavf_fsub_parse(__rte_unused struct iavf_adapter *ad,
+   __rte_unused struct iavf_pattern_match_item *array,
+   __rte_unused uint32_t array_len,
+   __rte_unused const struct rte_flow_item pattern[],
+   __rte_unused const struct rte_flow_action actions[],
+   __rte_unused void **meta,
+   __rte_unused struct rte_flow_error *error)
+{
+   return -rte_errno;
+}
+
+static int
+iavf_fsub_init(struct iavf_adapter *ad)
+{
+   struct iavf_info *vf = IAVF_DEV_PRIVATE_TO_VF(ad);
+   struct iavf_flow_parser *parser;
+
+   if (!vf->vf_res)
+   return -EINVAL;
+
+   if (vf->vf_res->vf_cap_flags & VIRTCHNL_VF_OFFLOAD_FSUB_PF)
+   parser = &iavf_fsub_parser;
+   else
+   return -ENOTSUP;
+
+   return iavf_register_parser(parser, ad);
+}
+
+static void
+iavf_fsub_uninit(struct iavf_adapter *ad)
+{
+   iavf_unregister_parser(&iavf_fsub_parser, ad);
+}
+
+static struct
+iavf_flow_engine iavf_fsub_engine = {
+   .init = iavf_fsub_init,
+   .uninit = iavf_fsub_uninit,
+   .create = iavf_fsub_create,
+   .destroy = iavf_fsub_destroy,
+   .validation = iavf_fsub_validation,
+   .type = IAVF_FLOW_ENGINE_FSUB,
+};
+
+static struct
+iavf_flow_parser iavf_fsub_parser = {
+   .engine = &iavf_fsub_engine,
+   .array = iavf_fsub_pattern_list,
+   .array_len = RTE_DIM(iavf_fsub_pattern_list),
+   .parse_pattern_action = iavf_fsub_parse,
+   .stage = IAVF_FLOW_STAGE_DISTRIBUTOR,
+};
+
+RTE_INIT(iavf_fsub_engine_init)
+{
+   iavf_register_flow_engine(&iavf_fsub_engine);
+}
diff --git a/drivers/net/iavf/iavf_generic_flow.c b/drivers/net/iavf/iavf_generic_flow.c
index e1a611e319..b04614ba6e 100644
--- a/drivers/net/iavf/iavf_generic_flow.c
+++ b/drivers/net/iavf/iavf_generic_flow.c
@@ -1866,6 +1866,8 @@ iavf_register_parser(struct iavf_flow_parser *parser,
 {
struct iavf_parser_list *list = NULL;
struct iavf_flow_parser_node *parser_node;
+   struct iavf_flow_parser_node *existing_node;
+   void *temp;
struct iavf_info *vf = IAVF_DEV_PRIVATE_TO_VF(ad);
 
parser_node = rte_zmalloc("iavf_parser", sizeof(*parser_node), 0);
@@ -1880,14 +1882,26 @@ iavf_register_parser(struct iavf_flow_parser *parser,
TAILQ_INSERT_TAIL(list, parser_node, node);
} else if (parser->engine->type == IAVF_FLOW_ENGINE_FDIR) {
list = &vf->dist_parser_list;
+   RTE_TAILQ_FOREACH_SAFE(existing_node, list, node, temp) {
+   if (existing_

[PATCH v5 3/5] net/iavf: support flow subscription pattern

2022-09-06 Thread Jie Wang
Add flow subscription pattern support for AVF.

The supported patterns are listed below:
eth/vlan/ipv4
eth/ipv4(6)
eth/ipv4(6)/udp
eth/ipv4(6)/tcp
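
As a rough illustration of how the pattern list above could be
matched, the sketch below compares an END-terminated sequence of item
types against a table of the seven supported patterns. It is a
simplified stand-in, not the driver's iavf_search_pattern_match_item():
the enum, the table, and both helper functions are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical item types standing in for enum rte_flow_item_type. */
enum item { ITEM_END, ITEM_ETH, ITEM_VLAN, ITEM_IPV4, ITEM_IPV6,
	    ITEM_UDP, ITEM_TCP };

static const enum item pat_eth_vlan_ipv4[] = {ITEM_ETH, ITEM_VLAN, ITEM_IPV4, ITEM_END};
static const enum item pat_eth_ipv4[] = {ITEM_ETH, ITEM_IPV4, ITEM_END};
static const enum item pat_eth_ipv6[] = {ITEM_ETH, ITEM_IPV6, ITEM_END};
static const enum item pat_eth_ipv4_udp[] = {ITEM_ETH, ITEM_IPV4, ITEM_UDP, ITEM_END};
static const enum item pat_eth_ipv6_udp[] = {ITEM_ETH, ITEM_IPV6, ITEM_UDP, ITEM_END};
static const enum item pat_eth_ipv4_tcp[] = {ITEM_ETH, ITEM_IPV4, ITEM_TCP, ITEM_END};
static const enum item pat_eth_ipv6_tcp[] = {ITEM_ETH, ITEM_IPV6, ITEM_TCP, ITEM_END};

static const enum item *const supported[] = {
	pat_eth_vlan_ipv4, pat_eth_ipv4, pat_eth_ipv6, pat_eth_ipv4_udp,
	pat_eth_ipv6_udp, pat_eth_ipv4_tcp, pat_eth_ipv6_tcp,
};

/* Return 1 when both END-terminated sequences are identical. */
static int
same_pattern(const enum item *a, const enum item *b)
{
	while (*a == *b) {
		if (*a == ITEM_END)
			return 1;
		a++;
		b++;
	}
	return 0;
}

/* Return the table index of the matching pattern, or -1 if unsupported. */
static int
find_pattern(const enum item *pattern)
{
	size_t i;

	assert(pattern != NULL);
	for (i = 0; i < sizeof(supported) / sizeof(supported[0]); i++)
		if (same_pattern(pattern, supported[i]))
			return (int)i;
	return -1;
}
```

A pattern such as eth/vlan/ipv6 is not in the list, so find_pattern()
returns -1 for it and the rule would be rejected.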

Signed-off-by: Jie Wang 
---
 drivers/net/iavf/iavf.h  |   7 +
 drivers/net/iavf/iavf_fsub.c | 598 ++-
 2 files changed, 597 insertions(+), 8 deletions(-)

diff --git a/drivers/net/iavf/iavf.h b/drivers/net/iavf/iavf.h
index 025ab3ff60..f79c7f9f6e 100644
--- a/drivers/net/iavf/iavf.h
+++ b/drivers/net/iavf/iavf.h
@@ -148,6 +148,13 @@ struct iavf_fdir_info {
struct iavf_fdir_conf conf;
 };
 
+struct iavf_fsub_conf {
+   struct virtchnl_flow_sub sub_fltr;
+   struct virtchnl_flow_unsub unsub_fltr;
+   uint64_t input_set;
+   uint32_t flow_id;
+};
+
 struct iavf_qv_map {
uint16_t queue_id;
uint16_t vector_id;
diff --git a/drivers/net/iavf/iavf_fsub.c b/drivers/net/iavf/iavf_fsub.c
index 17f9bb2976..66e403d585 100644
--- a/drivers/net/iavf/iavf_fsub.c
+++ b/drivers/net/iavf/iavf_fsub.c
@@ -22,9 +22,51 @@
 #include "iavf_generic_flow.h"
 
 
+#define MAX_QGRP_NUM_TYPE  7
+#define IAVF_IPV6_ADDR_LENGTH  16
+#define MAX_INPUT_SET_BYTE 32
+
+#define IAVF_SW_INSET_ETHER ( \
+   IAVF_INSET_DMAC | IAVF_INSET_SMAC | IAVF_INSET_ETHERTYPE)
+#define IAVF_SW_INSET_MAC_IPV4 ( \
+   IAVF_INSET_DMAC | IAVF_INSET_IPV4_DST | IAVF_INSET_IPV4_SRC | \
+   IAVF_INSET_IPV4_PROTO | IAVF_INSET_IPV4_TTL | IAVF_INSET_IPV4_TOS)
+#define IAVF_SW_INSET_MAC_VLAN_IPV4 ( \
+   IAVF_SW_INSET_MAC_IPV4 | IAVF_INSET_VLAN_OUTER)
+#define IAVF_SW_INSET_MAC_IPV4_TCP ( \
+   IAVF_INSET_DMAC | IAVF_INSET_IPV4_DST | IAVF_INSET_IPV4_SRC | \
+   IAVF_INSET_IPV4_TTL | IAVF_INSET_IPV4_TOS | \
+   IAVF_INSET_TCP_DST_PORT | IAVF_INSET_TCP_SRC_PORT)
+#define IAVF_SW_INSET_MAC_IPV4_UDP ( \
+   IAVF_INSET_DMAC | IAVF_INSET_IPV4_DST | IAVF_INSET_IPV4_SRC | \
+   IAVF_INSET_IPV4_TTL | IAVF_INSET_IPV4_TOS | \
+   IAVF_INSET_UDP_DST_PORT | IAVF_INSET_UDP_SRC_PORT)
+#define IAVF_SW_INSET_MAC_IPV6 ( \
+   IAVF_INSET_DMAC | IAVF_INSET_IPV6_DST | IAVF_INSET_IPV6_SRC | \
+   IAVF_INSET_IPV6_TC | IAVF_INSET_IPV6_HOP_LIMIT | \
+   IAVF_INSET_IPV6_NEXT_HDR)
+#define IAVF_SW_INSET_MAC_IPV6_TCP ( \
+   IAVF_INSET_DMAC | IAVF_INSET_IPV6_DST | IAVF_INSET_IPV6_SRC | \
+   IAVF_INSET_IPV6_HOP_LIMIT | IAVF_INSET_IPV6_TC | \
+   IAVF_INSET_TCP_DST_PORT | IAVF_INSET_TCP_SRC_PORT)
+#define IAVF_SW_INSET_MAC_IPV6_UDP ( \
+   IAVF_INSET_DMAC | IAVF_INSET_IPV6_DST | IAVF_INSET_IPV6_SRC | \
+   IAVF_INSET_IPV6_HOP_LIMIT | IAVF_INSET_IPV6_TC | \
+   IAVF_INSET_UDP_DST_PORT | IAVF_INSET_UDP_SRC_PORT)
+
 static struct iavf_flow_parser iavf_fsub_parser;
 
-static struct iavf_pattern_match_item iavf_fsub_pattern_list[] = {};
+static struct
+iavf_pattern_match_item iavf_fsub_pattern_list[] = {
+   {iavf_pattern_ethertype,        IAVF_SW_INSET_ETHER,            IAVF_INSET_NONE},
+   {iavf_pattern_eth_ipv4,         IAVF_SW_INSET_MAC_IPV4,         IAVF_INSET_NONE},
+   {iavf_pattern_eth_vlan_ipv4,    IAVF_SW_INSET_MAC_VLAN_IPV4,    IAVF_INSET_NONE},
+   {iavf_pattern_eth_ipv4_udp,     IAVF_SW_INSET_MAC_IPV4_UDP,     IAVF_INSET_NONE},
+   {iavf_pattern_eth_ipv4_tcp,     IAVF_SW_INSET_MAC_IPV4_TCP,     IAVF_INSET_NONE},
+   {iavf_pattern_eth_ipv6,         IAVF_SW_INSET_MAC_IPV6,         IAVF_INSET_NONE},
+   {iavf_pattern_eth_ipv6_udp,     IAVF_SW_INSET_MAC_IPV6_UDP,     IAVF_INSET_NONE},
+   {iavf_pattern_eth_ipv6_tcp,     IAVF_SW_INSET_MAC_IPV6_TCP,     IAVF_INSET_NONE},
+};
 
 static int
 iavf_fsub_create(__rte_unused struct iavf_adapter *ad,
@@ -53,17 +95,557 @@ iavf_fsub_validation(__rte_unused struct iavf_adapter *ad,
 };
 
 static int
-iavf_fsub_parse(__rte_unused struct iavf_adapter *ad,
-   __rte_unused struct iavf_pattern_match_item *array,
-   __rte_unused uint32_t array_len,
-   __rte_unused const struct rte_flow_item pattern[],
-   __rte_unused const struct rte_flow_action actions[],
-   __rte_unused void **meta,
-   __rte_unused struct rte_flow_error *error)
+iavf_fsub_parse_pattern(const struct rte_flow_item pattern[],
+   const uint64_t input_set_mask,
+   struct rte_flow_error *error,
+   struct iavf_fsub_conf *filter)
+{
+   struct virtchnl_proto_hdrs *hdrs = &filter->sub_fltr.proto_hdrs;
+   enum rte_flow_item_type item_type;
+   const struct rte_flow_item_eth *eth_spec, *eth_mask;
+   const struct rte_flow_item_ipv4 *ipv4_spec, *ipv4_mask;
+   const struct rte_flow_item_ipv6 *ipv6_spec, *ipv6_mask;
+   const struct rte_flow_item_tcp *tcp_spec, *tcp_mask;
+   const struct rte_flow_item_udp *u

[PATCH v5 4/5] net/iavf: support flow subscription rule

2022-09-06 Thread Jie Wang
Support creating, destroying, and validating flow subscription
rules for AVF.

For example:
testpmd> flow create 0 ingress pattern eth / ipv4 / udp src is 11
  / end actions represented_port port_id 1 / end
testpmd> flow validate 1 ingress pattern eth / ipv4 / tcp src is 22
  / end actions represented_port port_id 1 / end
testpmd> flow destroy 1 rule 0

When a VF subscribes to a rule, the matching packets are sent to the VF
instead of the PF, and only the VF receives them.

Multiple VFs may subscribe to the same rule; the packets are then
replicated and received by each subscribed VF.

The PF destroys all subscriptions during VF reset.
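
The subscription semantics above can be modelled with a small toy
structure. This is a sketch under stated assumptions, not the PF
implementation: struct sub_rule, the bitmap representation, and all
function names are hypothetical, and it assumes that a VF reset drops
only the subscriptions held by the VF being reset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_VFS 32

/* Hypothetical rule state: one bit per subscribed VF. */
struct sub_rule {
	uint32_t subscribers;
};

/* Record that the given VF subscribes to the rule. */
static void
rule_subscribe(struct sub_rule *rule, unsigned int vf)
{
	assert(vf < MAX_VFS);
	rule->subscribers |= (uint32_t)1 << vf;
}

/*
 * Bitmap of VFs that receive a copy of a matching packet.
 * Zero means no VF subscribed, so the PF keeps the packet.
 */
static uint32_t
rule_receivers(const struct sub_rule *rule)
{
	return rule->subscribers;
}

/* Assumed reset behaviour: drop the VF from every rule it subscribed to. */
static void
vf_reset(struct sub_rule *rules, size_t nb_rules, unsigned int vf)
{
	size_t i;

	assert(vf < MAX_VFS);
	for (i = 0; i < nb_rules; i++)
		rules[i].subscribers &= ~((uint32_t)1 << vf);
}
```

If VF 1 and VF 3 both subscribe to one rule, rule_receivers() reports
both bits set, which corresponds to the packet being replicated to each
of them.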

Signed-off-by: Jie Wang 
---
 drivers/net/iavf/iavf.h   |   6 ++
 drivers/net/iavf/iavf_fsub.c  |  75 +++
 drivers/net/iavf/iavf_vchnl.c | 132 ++
 3 files changed, 201 insertions(+), 12 deletions(-)

diff --git a/drivers/net/iavf/iavf.h b/drivers/net/iavf/iavf.h
index f79c7f9f6e..26b858f6f0 100644
--- a/drivers/net/iavf/iavf.h
+++ b/drivers/net/iavf/iavf.h
@@ -489,4 +489,10 @@ int iavf_ipsec_crypto_request(struct iavf_adapter *adapter,
 extern const struct rte_tm_ops iavf_tm_ops;
 int iavf_get_ptp_cap(struct iavf_adapter *adapter);
 int iavf_get_phc_time(struct iavf_rx_queue *rxq);
+int iavf_flow_sub(struct iavf_adapter *adapter,
+ struct iavf_fsub_conf *filter);
+int iavf_flow_unsub(struct iavf_adapter *adapter,
+   struct iavf_fsub_conf *filter);
+int iavf_flow_sub_check(struct iavf_adapter *adapter,
+   struct iavf_fsub_conf *filter);
 #endif /* _IAVF_ETHDEV_H_ */
diff --git a/drivers/net/iavf/iavf_fsub.c b/drivers/net/iavf/iavf_fsub.c
index 66e403d585..28857d7577 100644
--- a/drivers/net/iavf/iavf_fsub.c
+++ b/drivers/net/iavf/iavf_fsub.c
@@ -69,29 +69,80 @@ iavf_pattern_match_item iavf_fsub_pattern_list[] = {
 };
 
 static int
-iavf_fsub_create(__rte_unused struct iavf_adapter *ad,
-__rte_unused struct rte_flow *flow,
-__rte_unused void *meta,
-__rte_unused struct rte_flow_error *error)
+iavf_fsub_create(struct iavf_adapter *ad, struct rte_flow *flow,
+void *meta, struct rte_flow_error *error)
 {
+   struct iavf_fsub_conf *filter = meta;
+   struct iavf_fsub_conf *rule;
+   int ret;
+
+   rule = rte_zmalloc("fsub_entry", sizeof(*rule), 0);
+   if (!rule) {
+   rte_flow_error_set(error, ENOMEM,
+   RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+   "Failed to allocate memory for fsub rule");
+   return -rte_errno;
+   }
+
+   ret = iavf_flow_sub(ad, filter);
+   if (ret) {
+   rte_flow_error_set(error, -ret,
+  RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+  "Failed to subscribe flow rule.");
+   goto free_entry;
+   }
+
+   rte_memcpy(rule, filter, sizeof(*rule));
+   flow->rule = rule;
+
+   return ret;
+
+free_entry:
+   rte_free(rule);
return -rte_errno;
 }
 
 static int
-iavf_fsub_destroy(__rte_unused struct iavf_adapter *ad,
- __rte_unused struct rte_flow *flow,
- __rte_unused struct rte_flow_error *error)
+iavf_fsub_destroy(struct iavf_adapter *ad, struct rte_flow *flow,
+ struct rte_flow_error *error)
 {
-   return -rte_errno;
+   struct iavf_fsub_conf *filter;
+   int ret;
+
+   filter = (struct iavf_fsub_conf *)flow->rule;
+
+   ret = iavf_flow_unsub(ad, filter);
+   if (ret) {
+   rte_flow_error_set(error, -ret,
+  RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+  "Failed to unsubscribe flow rule.");
+   return -rte_errno;
+   }
+
+   flow->rule = NULL;
+   rte_free(filter);
+
+   return ret;
 }
 
 static int
-iavf_fsub_validation(__rte_unused struct iavf_adapter *ad,
+iavf_fsub_validation(struct iavf_adapter *ad,
 __rte_unused struct rte_flow *flow,
-__rte_unused void *meta,
-__rte_unused struct rte_flow_error *error)
+void *meta,
+struct rte_flow_error *error)
 {
-   return -rte_errno;
+   struct iavf_fsub_conf *filter = meta;
+   int ret;
+
+   ret = iavf_flow_sub_check(ad, filter);
+   if (ret) {
+   rte_flow_error_set(error, -ret,
+  RTE_FLOW_ERROR_TYPE_HANDLE, NULL,
+  "Failed to validate filter rule.");
+   return -rte_errno;
+   }
+
+   return ret;
 };
 
 static int
diff --git a/drivers/net/iavf/iavf_vchnl.c b/drivers/net/iavf/iavf_vchnl.c
index 6d84add423..cc0db8d093 100644
--- a/drivers/net/iavf/iavf_vchnl.c
+++ b/drivers/net/iavf/iavf_vchnl.c
@@ -1534,6 +1534,138 @@ iavf_fdir_check(struct iavf_adapter *adapte

[PATCH v5 5/5] net/iavf: support priority of flow rule

2022-09-06 Thread Jie Wang
Add flow rule attribute "priority" support for AVF.

Lower values denote higher priority; the highest priority for
a flow rule is 0.

All subscription rules have a lower priority than the rules
created by the host.
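
The priority semantics can be illustrated with a small stand-alone check.
This is a sketch under assumptions: the helper names below are hypothetical
and do not exist in the driver, and the accepted range is taken from the
"Only support priority 0 and 1." message added by this patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Largest accepted priority value (assumed from the error message
 * "Only support priority 0 and 1." in this patch). */
#define SKETCH_MAX_PRIORITY 1u

/* Hypothetical helper: true if the attribute priority is accepted. */
static bool
sketch_priority_valid(uint32_t priority)
{
	return priority <= SKETCH_MAX_PRIORITY;
}

/* Hypothetical helper: true if a rule with priority a takes precedence
 * over a rule with priority b. Lower values denote higher priority. */
static bool
sketch_priority_wins(uint32_t a, uint32_t b)
{
	return a < b;
}
```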

Signed-off-by: Jie Wang 
---
 drivers/net/iavf/iavf_fdir.c |  4 
 drivers/net/iavf/iavf_fsub.c |  2 +-
 drivers/net/iavf/iavf_generic_flow.c | 23 +--
 drivers/net/iavf/iavf_generic_flow.h |  1 +
 drivers/net/iavf/iavf_hash.c |  5 +
 drivers/net/iavf/iavf_ipsec_crypto.c | 16 ++--
 6 files changed, 34 insertions(+), 17 deletions(-)

diff --git a/drivers/net/iavf/iavf_fdir.c b/drivers/net/iavf/iavf_fdir.c
index a397047fdb..8f80873925 100644
--- a/drivers/net/iavf/iavf_fdir.c
+++ b/drivers/net/iavf/iavf_fdir.c
@@ -1583,6 +1583,7 @@ iavf_fdir_parse(struct iavf_adapter *ad,
uint32_t array_len,
const struct rte_flow_item pattern[],
const struct rte_flow_action actions[],
+   uint32_t priority,
void **meta,
struct rte_flow_error *error)
 {
@@ -1593,6 +1594,9 @@ iavf_fdir_parse(struct iavf_adapter *ad,
 
memset(filter, 0, sizeof(*filter));
 
+   if (priority >= 1)
+   return -rte_errno;
+
item = iavf_search_pattern_match_item(pattern, array, array_len, error);
if (!item)
return -rte_errno;
diff --git a/drivers/net/iavf/iavf_fsub.c b/drivers/net/iavf/iavf_fsub.c
index 28857d7577..3bb6c30d3c 100644
--- a/drivers/net/iavf/iavf_fsub.c
+++ b/drivers/net/iavf/iavf_fsub.c
@@ -649,13 +649,13 @@ iavf_fsub_parse(struct iavf_adapter *ad,
uint32_t array_len,
const struct rte_flow_item pattern[],
const struct rte_flow_action actions[],
+   uint32_t priority,
void **meta,
struct rte_flow_error *error)
 {
struct iavf_fsub_conf *filter;
struct iavf_pattern_match_item *pattern_match_item = NULL;
int ret = 0;
-   uint32_t priority = 0;
 
filter = rte_zmalloc(NULL, sizeof(*filter), 0);
if (!filter) {
diff --git a/drivers/net/iavf/iavf_generic_flow.c b/drivers/net/iavf/iavf_generic_flow.c
index b04614ba6e..f33c764764 100644
--- a/drivers/net/iavf/iavf_generic_flow.c
+++ b/drivers/net/iavf/iavf_generic_flow.c
@@ -1785,6 +1785,7 @@ enum rte_flow_item_type iavf_pattern_eth_ipv6_udp_l2tpv2_ppp_ipv6_tcp[] = {
 typedef struct iavf_flow_engine * (*parse_engine_t)(struct iavf_adapter *ad,
struct rte_flow *flow,
struct iavf_parser_list *parser_list,
+   uint32_t priority,
const struct rte_flow_item pattern[],
const struct rte_flow_action actions[],
struct rte_flow_error *error);
@@ -1951,11 +1952,11 @@ iavf_flow_valid_attr(const struct rte_flow_attr *attr,
return -rte_errno;
}
 
-   /* Not supported */
-   if (attr->priority) {
+   /* support priority for flow subscribe */
+   if (attr->priority > 1) {
rte_flow_error_set(error, EINVAL,
RTE_FLOW_ERROR_TYPE_ATTR_PRIORITY,
-   attr, "Not support priority.");
+   attr, "Only support priority 0 and 1.");
return -rte_errno;
}
 
@@ -2098,6 +2099,7 @@ static struct iavf_flow_engine *
 iavf_parse_engine_create(struct iavf_adapter *ad,
struct rte_flow *flow,
struct iavf_parser_list *parser_list,
+   uint32_t priority,
const struct rte_flow_item pattern[],
const struct rte_flow_action actions[],
struct rte_flow_error *error)
@@ -2111,7 +2113,7 @@ iavf_parse_engine_create(struct iavf_adapter *ad,
if (parser_node->parser->parse_pattern_action(ad,
parser_node->parser->array,
parser_node->parser->array_len,
-   pattern, actions, &meta, error) < 0)
+   pattern, actions, priority, &meta, error) < 0)
continue;
 
engine = parser_node->parser->engine;
@@ -2127,6 +2129,7 @@ static struct iavf_flow_engine *
 iavf_parse_engine_validate(struct iavf_adapter *ad,
struct rte_flow *flow,
struct iavf_parser_list *parser_list,
+   uint32_t priority,
const struct rte_flow_item pattern[],
const struct rte_flow_action actions[],
struct rte_flow_error *error)
@@ -2140,7 +2143,7 @@ iavf_parse_engine_validate(struct iavf_adapter *ad,
if (parser_node->parser->parse_pattern_action(ad,
parser_node->parser->array,
parser_node->parser->array_len,
-   patter

RE: [PATCH v5 3/5] net/iavf: support flow subscrption pattern

2022-09-06 Thread Zhang, Qi Z



> -----Original Message-----
> From: Wang, Jie1X 
> Sent: Wednesday, September 7, 2022 1:11 PM
> To: dev@dpdk.org
> Cc: Yang, Qiming ; Zhang, Qi Z
> ; Wu, Jingjing ; Xing, Beilei
> ; Yang, SteveX ; Wang, Jie1X
> 
> Subject: [PATCH v5 3/5] net/iavf: support flow subscrption pattern
> 
> Add flow subscription pattern support for AVF.
> 
> The supported patterns are listed below:
> eth/vlan/ipv4
> eth/ipv4(6)
> eth/ipv4(6)/udp
> eth/ipv4(6)/tcp
> 
> Signed-off-by: Jie Wang 
> ---
> 
> +static int
> +iavf_fsub_check_action(const struct rte_flow_action *actions,
> +struct rte_flow_error *error)
> +{
> + const struct rte_flow_action *action;
> + enum rte_flow_action_type action_type;
> + uint16_t actions_num = 0;
> + bool vf_valid = false;
> + bool queue_valid = false;
> +
> + for (action = actions; action->type !=
> + RTE_FLOW_ACTION_TYPE_END; action++) {
> + action_type = action->type;
> + switch (action_type) {
> + case RTE_FLOW_ACTION_TYPE_PORT_REPRESENTOR:

Need to sync the document in iavf.ini

[rte_flow actions]

port_representor = Y

will be fixed during code merge.



RE: [PATCH v5 0/5] support flow subscription

2022-09-06 Thread Zhang, Qi Z



> -----Original Message-----
> From: Wang, Jie1X 
> Sent: Wednesday, September 7, 2022 1:11 PM
> To: dev@dpdk.org
> Cc: Yang, Qiming ; Zhang, Qi Z
> ; Wu, Jingjing ; Xing, Beilei
> ; Yang, SteveX ; Wang, Jie1X
> 
> Subject: [PATCH v5 0/5] support flow subscription
> 
> Add support for AVF to subscribe to a flow from the PF.
> 
> --
> v4:
>  * replace flow action represented_port with port_representor.
>  * update commit log and rebase.
> v3:
>  * fix eth layer inputset.
>  * rebase.
> v2:
>  * split v1 patch 2/2 to 4 small patches.
>  * remove rule action RTE_FLOW_ACTION_TYPE_VF and add
>RTE_FLOW_ACTION_TYPE_REPRESENTED_PORT.
> 
> Jie Wang (5):
>   common/iavf: support flow subscription
>   net/iavf: add flow subscription to AVF
>   net/iavf: support flow subscrption pattern
>   net/iavf: support flow subscription rule
>   net/iavf: support priority of flow rule
> 
>  doc/guides/rel_notes/release_22_11.rst |   4 +
>  drivers/common/iavf/virtchnl.h | 104 +++-
>  drivers/net/iavf/iavf.h|  13 +
>  drivers/net/iavf/iavf_fdir.c   |   4 +
>  drivers/net/iavf/iavf_fsub.c   | 745 +
>  drivers/net/iavf/iavf_generic_flow.c   |  40 +-
>  drivers/net/iavf/iavf_generic_flow.h   |   2 +
>  drivers/net/iavf/iavf_hash.c   |   5 +
>  drivers/net/iavf/iavf_ipsec_crypto.c   |  16 +-
>  drivers/net/iavf/iavf_vchnl.c  | 133 +
>  drivers/net/iavf/meson.build   |   1 +
>  11 files changed, 1046 insertions(+), 21 deletions(-)
>  create mode 100644 drivers/net/iavf/iavf_fsub.c
> 
> --
> 2.25.1

Acked-by: Qi Zhang 

Applied to dpdk-next-net-intel.

Thanks
Qi



RE: [PATCH v3 1/2] net/memif: add a Rx fast path

2022-09-06 Thread Joyce Kong
Hi Stephen,

> -----Original Message-----
> From: Stephen Hemminger 
> Sent: Thursday, September 1, 2022 12:26 AM
> To: Joyce Kong 
> Cc: jgraj...@cisco.com; huzaifa.rah...@emumba.com; dev@dpdk.org; nd
> ; m...@smartsharesystems.com; Ruifeng Wang
> 
> Subject: Re: [PATCH v3 1/2] net/memif: add a Rx fast path
> 
> On Mon, 22 Aug 2022 03:47:30 +
> Joyce Kong  wrote:
> 
> > +   if (likely(mbuf_size >= pmd->cfg.pkt_buffer_size)) {
> > +   struct rte_mbuf *mbufs[nb_pkts];
> > +   ret = rte_pktmbuf_alloc_bulk(mq->mempool, mbufs,
> nb_pkts);
> > +   if (unlikely(ret < 0))
> > +   goto no_free_bufs;
> > +
> 
> The indentation looks off here, is this because of diff?
> Also, my preference is to use blank line after declaration.
Will fix the formatting in the next version.

> 
> One more thing, the use of variable length array on stack will cause the
> function to get additional overhead if stack-protector strong is enabled.
Will use a fixed array length in the next version.
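
One way to avoid the variable-length array is a fixed-size stack array that
is filled in bounded chunks. The sketch below only illustrates that idea
and is not the change made in the next version of the patch. SKETCH_MAX_BURST,
sketch_alloc_bulk and sketch_rx_burst are hypothetical names, and the
allocator is a stub standing in for rte_pktmbuf_alloc_bulk().

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical upper bound on buffers handled per iteration. */
#define SKETCH_MAX_BURST 32

/* Stub bulk allocator; always succeeds in this sketch. */
static int
sketch_alloc_bulk(void **bufs, uint16_t n)
{
	static char pool[SKETCH_MAX_BURST];
	uint16_t i;

	for (i = 0; i < n; i++)
		bufs[i] = &pool[i];
	return 0;
}

/* Handle up to nb_pkts buffers, at most SKETCH_MAX_BURST at a time. */
static uint16_t
sketch_rx_burst(uint16_t nb_pkts)
{
	void *bufs[SKETCH_MAX_BURST]; /* fixed size, no VLA on the stack */
	uint16_t done = 0;

	while (done < nb_pkts) {
		uint16_t n = (uint16_t)(nb_pkts - done);

		if (n > SKETCH_MAX_BURST)
			n = SKETCH_MAX_BURST;
		if (sketch_alloc_bulk(bufs, n) < 0)
			break; /* out of buffers: report what was handled */
		done = (uint16_t)(done + n);
	}
	return done;
}
```

With a compile-time array size the stack frame has a known bound, which is
the property the review comment asks for.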


Re: [PATCH] net/mlx5: fix Rx queue recovery mechanism

2022-09-06 Thread Amiya Mohakud
Hi All,

I would need some confirmation on this patch.
For some earlier issues encountered on mlx5, we have disable cqe_comp in
the mlx5 driver. In that case, do we still need this fix or disabling
cqe_comp will take care of it as well?

Regards
Amiya

On Mon, Aug 29, 2022 at 8:45 PM Thomas Monjalon  wrote:

> From: Matan Azrad 
>
> The local variables are getting inconsistent in data receiving routines
> after queue error recovery.
> Receive queue consumer index is getting wrong, need to reset one to the
> size of the queue (as RQ was fully replenished in recovery procedure).
>
> In MPRQ case, also the local consumed strd variable should be reset.
>
> CVE-2022-28199
> Fixes: 88c0733535d6 ("net/mlx5: extend Rx completion with error handling")
> Cc: sta...@dpdk.org
>
> Signed-off-by: Alexander Kozyrev 
> Signed-off-by: Matan Azrad 
> ---
>
> Already applied in main branch as part of the public disclosure process.
>
> ---
>  drivers/net/mlx5/mlx5_rx.c | 34 --
>  1 file changed, 24 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/net/mlx5/mlx5_rx.c b/drivers/net/mlx5/mlx5_rx.c
> index bb3ccc36e5..917c517b83 100644
> --- a/drivers/net/mlx5/mlx5_rx.c
> +++ b/drivers/net/mlx5/mlx5_rx.c
> @@ -408,6 +408,11 @@ mlx5_rxq_initialize(struct mlx5_rxq_data *rxq)
> *rxq->rq_db = rte_cpu_to_be_32(rxq->rq_ci);
>  }
>
> +/* Must be negative. */
> +#define MLX5_ERROR_CQE_RET (-1)
> +/* Must not be negative. */
> +#define MLX5_RECOVERY_ERROR_RET 0
> +
>  /**
>   * Handle a Rx error.
>   * The function inserts the RQ state to reset when the first error CQE is
> @@ -422,7 +427,7 @@ mlx5_rxq_initialize(struct mlx5_rxq_data *rxq)
>   *   0 when called from non-vectorized Rx burst.
>   *
>   * @return
> - *   -1 in case of recovery error, otherwise the CQE status.
> + *   MLX5_RECOVERY_ERROR_RET in case of recovery error, otherwise the CQE
> status.
>   */
>  int
>  mlx5_rx_err_handle(struct mlx5_rxq_data *rxq, uint8_t vec)
> @@ -451,7 +456,7 @@ mlx5_rx_err_handle(struct mlx5_rxq_data *rxq, uint8_t
> vec)
> sm.queue_id = rxq->idx;
> sm.state = IBV_WQS_RESET;
> if (mlx5_queue_state_modify(RXQ_DEV(rxq_ctrl), &sm))
> -   return -1;
> +   return MLX5_RECOVERY_ERROR_RET;
> if (rxq_ctrl->dump_file_n <
> RXQ_PORT(rxq_ctrl)->config.max_dump_files_num) {
> MKSTR(err_str, "Unexpected CQE error syndrome "
> @@ -491,7 +496,7 @@ mlx5_rx_err_handle(struct mlx5_rxq_data *rxq, uint8_t
> vec)
> sm.queue_id = rxq->idx;
> sm.state = IBV_WQS_RDY;
> if (mlx5_queue_state_modify(RXQ_DEV(rxq_ctrl),
> &sm))
> -   return -1;
> +   return MLX5_RECOVERY_ERROR_RET;
> if (vec) {
> const uint32_t elts_n =
> mlx5_rxq_mprq_enabled(rxq) ?
> @@ -519,7 +524,7 @@ mlx5_rx_err_handle(struct mlx5_rxq_data *rxq, uint8_t
> vec)
>
> rte_pktmbuf_free_seg
> (*elt);
> }
> -   return -1;
> +   return
> MLX5_RECOVERY_ERROR_RET;
> }
> }
> for (i = 0; i < (int)elts_n; ++i) {
> @@ -538,7 +543,7 @@ mlx5_rx_err_handle(struct mlx5_rxq_data *rxq, uint8_t
> vec)
> }
> return ret;
> default:
> -   return -1;
> +   return MLX5_RECOVERY_ERROR_RET;
> }
>  }
>
> @@ -556,7 +561,9 @@ mlx5_rx_err_handle(struct mlx5_rxq_data *rxq, uint8_t
> vec)
>   *   written.
>   *
>   * @return
> - *   0 in case of empty CQE, otherwise the packet size in bytes.
> + *   0 in case of empty CQE, MLX5_ERROR_CQE_RET in case of error CQE,
> + *   otherwise the packet size in regular RxQ, and striding byte
> + *   count format in mprq case.
>   */
>  static inline int
>  mlx5_rx_poll_len(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cqe,
> @@ -623,8 +630,8 @@ mlx5_rx_poll_len(struct mlx5_rxq_data *rxq, volatile
> struct mlx5_cqe *cqe,
>  rxq->err_state)) {
> ret = mlx5_rx_err_handle(rxq, 0);
> if (ret == MLX5_CQE_STATUS_HW_OWN
> ||
> -   ret == -1)
> -   return 0;
> +   ret == MLX5_RECOVERY_ERROR_RET)
> +   return MLX5_ERROR_CQE_RET;
> } else {
> return 0;
> 

RE: [PATCH v3] net/i40e: fix single VLAN cannot work normal

2022-09-06 Thread Zhang, Yuying
Hi,

> -----Original Message-----
> From: Liu, KevinX 
> Sent: Wednesday, September 7, 2022 12:15 AM
> To: dev@dpdk.org
> Cc: Zhang, Yuying ; Xing, Beilei
> ; Yang, SteveX ; Liu, KevinX
> ; Jiale, SongX 
> Subject: [PATCH v3] net/i40e: fix single VLAN cannot work normal
> 
> After disable QinQ, single VLAN can not work normal.
> The reason is that QinQ is not disabled correctly.
> 
> Before configuring QinQ, need to back up and clean MAC/VLAN filters of all
> ports. After configuring QinQ, restore MAC/VLAN filters of all ports. When
> disable QinQ, need to set valid_flags to 0x0008 and set first_tag to 0x88a8.

Please correct the grammar errors in the commit log and replace the magic
numbers with macro definitions for readability.

> 
> Fixes: 38e9762be16a ("net/i40e: add outer VLAN processing")
> Signed-off-by: Kevin Liu 
> Tested-by: Jiale Song 

Make sure the new version has been validated before adding "Tested-by".

> ---
> v2: refine code
> ---
> v3: refine code
> ---
>  doc/guides/nics/i40e.rst   |   1 -
>  drivers/net/i40e/i40e_ethdev.c | 147 ++---
>  2 files changed, 100 insertions(+), 48 deletions(-)
> 
> diff --git a/doc/guides/nics/i40e.rst b/doc/guides/nics/i40e.rst
> index abb99406b3..15b796e67a 100644
> --- a/doc/guides/nics/i40e.rst
> +++ b/doc/guides/nics/i40e.rst
> @@ -983,7 +983,6 @@ If FW version >= 8.4, there'll be some Vlan related 
> issues:
> 
>  #. TCI input set for QinQ  is invalid.
>  #. Fail to configure TPID for QinQ.
> -#. Need to enable QinQ before enabling Vlan filter.
>  #. Fail to strip outer Vlan.
> 
>  Example of getting best performance with l3fwd example
> diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
> index 67d79de08d..cf327ed576 100644
> --- a/drivers/net/i40e/i40e_ethdev.c
> +++ b/drivers/net/i40e/i40e_ethdev.c
> @@ -1650,7 +1650,8 @@ eth_i40e_dev_init(struct rte_eth_dev *dev, void
> *init_params __rte_unused)
>   vsi = pf->main_vsi;
> 
>   /* Disable double vlan by default */
> - i40e_vsi_config_double_vlan(vsi, FALSE);
> + if (!pf->fw8_3gt)
> + i40e_vsi_config_double_vlan(vsi, FALSE);

Disable double VLAN by default regardless of the firmware version.

> 
>   /* Disable S-TAG identification when floating_veb is disabled */
>   if (!pf->floating_veb) {
> @@ -3909,7 +3910,6 @@ i40e_vlan_tpid_set(struct rte_eth_dev *dev,
>   struct i40e_pf *pf = I40E_DEV_PRIVATE_TO_PF(dev->data-
> >dev_private);
>   int qinq = dev->data->dev_conf.rxmode.offloads &
>  RTE_ETH_RX_OFFLOAD_VLAN_EXTEND;
> - u16 sw_flags = 0, valid_flags = 0;
>   int ret = 0;
> 
>   if ((vlan_type != RTE_ETH_VLAN_TYPE_INNER && @@ -3928,10
> +3928,6 @@ i40e_vlan_tpid_set(struct rte_eth_dev *dev,
>   /* 802.1ad frames ability is added in NVM API 1.7*/
>   if (hw->flags & I40E_HW_FLAG_802_1AD_CAPABLE) {
>   if (qinq) {
> - if (pf->fw8_3gt) {
> - sw_flags =
> I40E_AQ_SET_SWITCH_CFG_OUTER_VLAN;
> - valid_flags =
> I40E_AQ_SET_SWITCH_CFG_OUTER_VLAN;
> - }
>   if (vlan_type == RTE_ETH_VLAN_TYPE_OUTER)
>   hw->first_tag = rte_cpu_to_le_16(tpid);
>   else if (vlan_type == RTE_ETH_VLAN_TYPE_INNER) @@
> -3940,8 +3936,8 @@ i40e_vlan_tpid_set(struct rte_eth_dev *dev,
>   if (vlan_type == RTE_ETH_VLAN_TYPE_OUTER)
>   hw->second_tag = rte_cpu_to_le_16(tpid);
>   }
> - ret = i40e_aq_set_switch_config(hw, sw_flags,
> - valid_flags, 0, NULL);
> + ret = i40e_aq_set_switch_config(hw, 0,
> + 0, 0, NULL);
>   if (ret != I40E_SUCCESS) {
>   PMD_DRV_LOG(ERR,
>   "Set switch config failed aq_err: %d", @@ -
> 3993,11 +3989,15 @@ static int  i40e_vlan_offload_set(struct rte_eth_dev *dev,
> int mask)  {
>   struct i40e_pf *pf = I40E_DEV_PRIVATE_TO_PF(dev->data-
> >dev_private);
> + struct i40e_mac_filter_info *vmac_filter[RTE_MAX_ETHPORTS];
> + struct i40e_vsi *vvsi[RTE_MAX_ETHPORTS];
>   struct i40e_mac_filter_info *mac_filter;
>   struct i40e_vsi *vsi = pf->main_vsi;
>   struct rte_eth_rxmode *rxmode;
> + int vnum[RTE_MAX_ETHPORTS];
>   struct i40e_mac_filter *f;
> - int i, num;
> + int port_num = 0;
> + int i, num, j;
>   void *temp;
>   int ret;
> 
> @@ -4018,50 +4018,75 @@ i40e_vlan_offload_set(struct rte_eth_dev *dev, int
> mask)
>   }
> 
>   if (mask & RTE_ETH_VLAN_EXTEND_MASK) {
> - i = 0;
> - num = vsi->mac_num;
> - mac_filter = rte_zmalloc("mac_filter_info_data",
> -  num * sizeof(*mac_filter), 0);
> - if (mac_filter == NULL) {
> - PMD_DRV_LOG(ERR, "failed to a

[Bug 1076] [dpdk 22.11] kernel/linux/kni meson build failed with gcc 11.3.1 on rhel9.0

2022-09-06 Thread bugzilla
https://bugs.dpdk.org/show_bug.cgi?id=1076

Bug ID: 1076
   Summary: [dpdk 22.11] kernel/linux/kni meson build failed with
gcc 11.3.1  on rhel9.0
   Product: DPDK
   Version: unspecified
  Hardware: All
OS: All
Status: UNCONFIRMED
  Severity: normal
  Priority: Normal
 Component: core
  Assignee: dev@dpdk.org
  Reporter: daxuex@intel.com
  Target Milestone: ---

[DPDK version]:
commit 4aee6110bb10b0225fa9562f2e48af233a9058a1 (HEAD -> main, origin/main,
origin/HEAD)
Author: Huichao Cai 
Date:   Mon Aug 8 09:48:12 2022 +0800

ip_frag: add IPv4 fragment copy

Some NIC drivers support MBUF_FAST_FREE (device supports optimization
for fast release of mbufs. When set, application must guarantee that
per-queue all mbufs comes from the same mempool, has refcnt = 1, direct
and non-segmented.) offload.
In order to adapt to this offload function, add this API.
Add some test data for this API.

Signed-off-by: Huichao Cai 
Acked-by: Konstantin Ananyev 


[OS version]:
RHEL9.0/5.14.0-160.el9.x86_64
gcc version 11.3.1 20220421

[Test Setup]:
CC=gcc meson --werror -Denable_kmods=True -Dlibdir=lib -Dexamples=all
--default-library=static x86_64-native-linuxapp-gcc 
ninja -j 10 -C x86_64-native-linuxapp-gcc/

[Log]
FAILED: kernel/linux/kni/rte_kni.ko
/usr/bin/make -j4 -C /lib/modules/5.14.0-160.el9.x86_64/build
M=/root/dpdk/x86_64-native-linuxapp-gcc/kernel/linux/kni
src=/root/dpdk/kernel/linux/kni 'MODULE_CFLAGS= -DHAVE_ARG_TX_QUEUE -include
/root/dpdk/config/rte_config.h -I/root/dpdk/lib/eal/include
-I/root/dpdk/lib/kni -I/root/dpdk/x86_64-native-linuxapp-gcc
-I/root/dpdk/kernel/linux/kni' modules
make: Entering directory '/usr/src/kernels/5.14.0-160.el9.x86_64'
  CC [M]  /root/dpdk/x86_64-native-linuxapp-gcc/kernel/linux/kni/kni_misc.o
  CC [M]  /root/dpdk/x86_64-native-linuxapp-gcc/kernel/linux/kni/kni_net.o
/root/dpdk/kernel/linux/kni/kni_net.c: In function ‘kni_net_rx_normal’:
/root/dpdk/kernel/linux/kni/kni_net.c:445:17: error: implicit declaration of
function ‘netif_rx_ni’; did you mean ‘netif_rx’?
[-Werror=implicit-function-declaration]
  445 | netif_rx_ni(skb);
  | ^~~
  | netif_rx
cc1: some warnings being treated as errors
make[1]: *** [scripts/Makefile.build:295:
/root/dpdk/x86_64-native-linuxapp-gcc/kernel/linux/kni/kni_net.o] Error 1
make: *** [Makefile:1915:
/root/dpdk/x86_64-native-linuxapp-gcc/kernel/linux/kni] Error 2
make: Leaving directory '/usr/src/kernels/5.14.0-160.el9.x86_64'
ninja: build stopped: subcommand failed.


[Bad commit]
This problem is found only on the new OS; it is not seen on older OS versions.

-- 
You are receiving this mail because:
You are the assignee for the bug.

RE: [PATCH v2 02/10] net/gve: add logs and OS specific implementation

2022-09-06 Thread Guo, Junfeng


> -----Original Message-----
> From: Ferruh Yigit 
> Sent: Friday, September 2, 2022 01:21
> To: Guo, Junfeng ; Zhang, Qi Z
> ; Wu, Jingjing 
> Cc: dev@dpdk.org; Li, Xiaoyun ;
> awogbem...@google.com; Richardson, Bruce
> ; Wang, Haiyue 
> Subject: Re: [PATCH v2 02/10] net/gve: add logs and OS specific
> implementation
> 
> On 8/29/2022 9:41 AM, Junfeng Guo wrote:
> 
> >
> > Add GVE PMD logs.
> > Add some MACRO definitions and memory operations which are specific
> > for DPDK.
> >
> > Signed-off-by: Haiyue Wang 
> > Signed-off-by: Xiaoyun Li 
> > Signed-off-by: Junfeng Guo 
> 
> <...>
> 
> > diff --git a/drivers/net/gve/gve_logs.h b/drivers/net/gve/gve_logs.h
> > new file mode 100644
> > index 00..a050253f59
> > --- /dev/null
> > +++ b/drivers/net/gve/gve_logs.h
> > @@ -0,0 +1,22 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright(C) 2022 Intel Corporation
> > + */
> > +
> > +#ifndef _GVE_LOGS_H_
> > +#define _GVE_LOGS_H_
> > +
> > +extern int gve_logtype_init;
> > +extern int gve_logtype_driver;
> > +
> > +#define PMD_INIT_LOG(level, fmt, args...) \
> > +   rte_log(RTE_LOG_ ## level, gve_logtype_init, "%s(): " fmt "\n", \
> > +   __func__, ##args)
> > +
> > +#define PMD_DRV_LOG_RAW(level, fmt, args...) \
> > +   rte_log(RTE_LOG_ ## level, gve_logtype_driver, "%s(): " fmt, \
> > +   __func__, ## args)
> > +
> > +#define PMD_DRV_LOG(level, fmt, args...) \
> > +   PMD_DRV_LOG_RAW(level, fmt "\n", ## args)
> > +
> 
> Why 'PMD_DRV_LOG_RAW' is needed, why not directly use
> 'PMD_DRV_LOG'?

It seems the _RAW macro was first introduced in the i40e driver's logs file.
Some log messages in the base code already end with '\n', so
PMD_DRV_LOG_RAW, which does not append one, is used there to avoid a
duplicated newline.

That said, PMD_DRV_LOG_RAW looks somewhat redundant here. I think it's OK to
remove it and have all log messages omit the trailing '\n'. Thanks!
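
A single macro that always appends the newline could look like the sketch
below. It is only an illustration under assumptions: sketch_log() is a stub
standing in for rte_log() that writes into a buffer so the result can be
inspected, SKETCH_DRV_LOG is a hypothetical name, and the log level and log
type arguments are left out for brevity. Like the macros in the patch, it
relies on the GNU `##` variadic-argument extension.

```c
#include <stdarg.h>
#include <stdio.h>

/* Buffer that receives the last formatted message in this sketch. */
static char sketch_log_buf[256];

/* Stub standing in for rte_log(): formats the message into the buffer. */
static int
sketch_log(const char *fmt, ...)
{
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(sketch_log_buf, sizeof(sketch_log_buf), fmt, ap);
	va_end(ap);
	return n;
}

/* Single macro: prefixes the function name and appends exactly one '\n',
 * so callers pass their messages without a trailing newline. */
#define SKETCH_DRV_LOG(fmt, ...) \
	sketch_log("%s(): " fmt "\n", __func__, ##__VA_ARGS__)
```

A caller would then write SKETCH_DRV_LOG("queue %d ready", 3); and the
macro adds the newline itself.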

> 
> 
> Do you really need two different log types? How do you differentiate
> 'init' & 'driver' types? As far as I can see there is mixed usage of them.

PMD_INIT_LOG is used at the init stage, while PMD_DRV_LOG is used
while the driver is running normally. I agree that the two macros may
currently be mixed up. I'll check all the call sites and make each one
use the appropriate macro in the coming versions.
If you insist that a single log type is enough to keep the code clean,
I can update the code accordingly. Thanks!

Regards,
Junfeng