RE: [EXT] Re: [PATCH 1/2] net/cnxk: update meter bpf ID in rq

2022-01-13 Thread Rakesh Kudurumalla


> -Original Message-
> From: Ferruh Yigit 
> Sent: Tuesday, January 11, 2022 6:16 PM
> To: Rakesh Kudurumalla ; Nithin Kumar
> Dabilpuram ; Kiran Kumar Kokkilagadda
> ; Sunil Kumar Kori ; Satha
> Koteswara Rao Kottidi 
> Cc: dev@dpdk.org
> Subject: [EXT] Re: [PATCH 1/2] net/cnxk: update meter bpf ID in rq
> 
> External Email
> 
> --
> On 11/30/2021 6:41 AM, Rakesh Kudurumalla wrote:
> > Patch updates configured meter bpf is in rq context during meter
> > creation
> 
> RQ is receive queue, right? Can you please use the long version for
> clarification?
Yes, RQ is receive queue. Sure, will use the long version.
> 
> >
> > Signed-off-by: Rakesh Kudurumalla 
> > ---
> >   drivers/net/cnxk/cn10k_rte_flow.c  |  9 -
> >   drivers/net/cnxk/cnxk_ethdev_mtr.c | 25 ++---
> >   2 files changed, 22 insertions(+), 12 deletions(-)
> >
> > diff --git a/drivers/net/cnxk/cn10k_rte_flow.c
> > b/drivers/net/cnxk/cn10k_rte_flow.c
> > index b830abe63e..402bb1c72f 100644
> > --- a/drivers/net/cnxk/cn10k_rte_flow.c
> > +++ b/drivers/net/cnxk/cn10k_rte_flow.c
> > @@ -36,20 +36,20 @@ cn10k_mtr_configure(struct rte_eth_dev *eth_dev,
> > for (i = 0; actions[i].type != RTE_FLOW_ACTION_TYPE_END; i++) {
> > if (actions[i].type == RTE_FLOW_ACTION_TYPE_METER) {
> > mtr_conf = (const struct rte_flow_action_meter
> > -   *)(actions->conf);
> > +   *)(actions[i].conf);
> > mtr_id = mtr_conf->mtr_id;
> > is_mtr_act = true;
> > }
> > if (actions[i].type == RTE_FLOW_ACTION_TYPE_QUEUE) {
> > q_conf = (const struct rte_flow_action_queue
> > - *)(actions->conf);
> > + *)(actions[i].conf);
> > if (is_mtr_act)
> > nix_mtr_rq_update(eth_dev, mtr_id, 1,
> >   &q_conf->index);
> > }
> > if (actions[i].type == RTE_FLOW_ACTION_TYPE_RSS) {
> > rss_conf = (const struct rte_flow_action_rss
> > -   *)(actions->conf);
> > +   *)(actions[i].conf);
> > if (is_mtr_act)
> > nix_mtr_rq_update(eth_dev, mtr_id,
> >   rss_conf->queue_num,
> > @@ -98,7 +98,7 @@ cn10k_rss_action_validate(struct rte_eth_dev
> *eth_dev,
> > return -EINVAL;
> > }
> >
> > -   if (eth_dev->data->dev_conf.rxmode.mq_mode !=
> RTE_ETH_MQ_RX_RSS) {
> > +   if (eth_dev->data->dev_conf.rxmode.mq_mode != ETH_MQ_RX_RSS)
> {
> 
> This change seems unintended. Please keep the original value.
Will keep original value
> 
> 
> 
> <...>
> 
> > if (!capa)
> > return -rte_mtr_error_set(error, EINVAL,
> > -   RTE_MTR_ERROR_TYPE_MTR_PARAMS,
> NULL,
> > -   "NULL input parameter");
> > +
> RTE_MTR_ERROR_TYPE_MTR_PARAMS, NULL,
> > + "NULL input parameter");
> >
> 
> Previous indentation looks more consistent with DPDK coding guide.
Will update to previous indentation


RE: [PATCH] net/mlx5: relax headroom assertion

2022-01-13 Thread Raslan Darawsheh
Hi,

> -Original Message-
> From: Dmitry Kozlyuk 
> Sent: Tuesday, December 28, 2021 11:21 AM
> To: dev@dpdk.org
> Cc: Slava Ovsiienko ; sta...@dpdk.org; Matan
> Azrad 
> Subject: [PATCH] net/mlx5: relax headroom assertion
> 
> A debug assertion in Single-Packet Receive Queue (SPRQ) mode required all
> Rx mbufs to have a 128 byte headroom, based on the assumption that
> rte_pktmbuf_init() sets it.
> However, rte_pktmbuf_init() may set a smaller headroom if the dataroom is
> insufficient, e.g. this is a natural case for split buffer segments. The
> headroom can also be larger.
> Only check the headroom size when vectored Rx routines are used because
> they rely on it. Relax the assertion to require sufficient headroom size,
> not an exact one.
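
The relaxed check described above can be sketched as follows. This is a simplified model, not the mlx5 code: `mbuf_stub`, `headroom_ok_strict`, and `headroom_ok_relaxed` are illustrative names, and only the headroom field is modeled.

```c
/* Sketch of the strict vs. relaxed headroom assertion described above.
 * mbuf_stub stands in for rte_mbuf; 128 is the usual default headroom
 * that rte_pktmbuf_init() reserves when the dataroom allows it. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PKTMBUF_HEADROOM 128

struct mbuf_stub {
	uint16_t data_off; /* headroom actually reserved in this mbuf */
};

/* Old, too-strict check: headroom must be exactly 128 bytes. */
static bool headroom_ok_strict(const struct mbuf_stub *m)
{
	return m->data_off == PKTMBUF_HEADROOM;
}

/* Relaxed check: any headroom >= 128 is acceptable; per the patch it is
 * only applied when vectored Rx routines (which rely on it) are used. */
static bool headroom_ok_relaxed(const struct mbuf_stub *m)
{
	return m->data_off >= PKTMBUF_HEADROOM;
}
```

A split-buffer segment with a small dataroom may end up with less headroom; the relaxed check rejects it only when vectored Rx is in use, while a larger-than-default headroom now passes.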
> 
> Fixes: a0a45e8af723 ("net/mlx5: configure Rx queue for buffer split")
> Cc: viachesl...@nvidia.com
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Dmitry Kozlyuk 
> Acked-by: Matan Azrad 

Patch rebased and applied to next-net-mlx,

Kindest regards,
Raslan Darawsheh


[PATCH] app/testpmd: skip stopped queues when forwarding

2022-01-13 Thread Dmitry Kozlyuk
After "port <port_id> rxq|txq <queue_id> stop"
the stopped queue was used in forwarding nonetheless,
which may cause undefined behavior in the PMD.

Record the configured queue state
and account for it when launching forwarding as follows:
+---------+---------+-----------------+----------------+
| RxQ     | TxQ     | Configured mode | Launch routine |
+---------+---------+-----------------+----------------+
| stopped | stopped | *               | -              |
| stopped | started | txonly          | (configured)   |
| stopped | started | *               | -              |
| started | stopped | *               | rxonly         |
| started | started | *               | (configured)   |
+---------+---------+-----------------+----------------+
Display stopped queues on "show port config rxtx".
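
The table can be encoded as a small selection helper (a sketch only; the enum and function names are illustrative, not actual testpmd symbols):

```c
/* Sketch of the launch-routine selection table above. */
#include <assert.h>
#include <stdbool.h>

enum launch_mode {
	LAUNCH_NONE,       /* "-": skip this stream entirely          */
	LAUNCH_CONFIGURED, /* "(configured)": use the chosen engine   */
	LAUNCH_RXONLY      /* force rxonly forwarding                 */
};

static enum launch_mode
select_launch(bool rxq_stopped, bool txq_stopped, bool engine_is_txonly)
{
	if (rxq_stopped && txq_stopped)
		return LAUNCH_NONE;
	if (rxq_stopped) /* TxQ started: only txonly can make progress */
		return engine_is_txonly ? LAUNCH_CONFIGURED : LAUNCH_NONE;
	if (txq_stopped) /* RxQ started: can still receive, so rxonly */
		return LAUNCH_RXONLY;
	return LAUNCH_CONFIGURED;
}
```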

Fixes: 5f4ec54f1d16 ("testpmd: queue start and stop")
Cc: jing.d.c...@intel.com
Cc: sta...@dpdk.org

Signed-off-by: Dmitry Kozlyuk 
Reviewed-by: Raslan Darawsheh 
---
 app/test-pmd/cmdline.c |  8 
 app/test-pmd/config.c  |  6 ++
 app/test-pmd/testpmd.c | 18 --
 app/test-pmd/testpmd.h | 10 ++
 4 files changed, 40 insertions(+), 2 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index e626b1c7d9..8b0920e23d 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -2702,6 +2702,14 @@ cmd_config_rxtx_queue_parsed(void *parsed_result,
 
if (ret == -ENOTSUP)
fprintf(stderr, "Function not supported in PMD\n");
+   if (ret == 0) {
+   struct rte_port *port;
+   struct queue_state *states;
+
+   port = &ports[res->portid];
+   states = isrx ? port->rxq_state : port->txq_state;
+   states[res->qid].stopped = !isstart;
+   }
 }
 
 cmdline_parse_token_string_t cmd_config_rxtx_queue_port =
diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 1722d6c8f8..7ce9cb483a 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -2817,6 +2817,9 @@ rxtx_config_display(void)
   rx_conf->share_qid);
printf("\n");
}
+   for (qid = 0; qid < nb_rxq; qid++)
+   if (ports[pid].rxq_state[qid].stopped)
+   printf("RX queue %d is stopped\n", qid);
 
/* per tx queue config only for first queue to be less verbose 
*/
for (qid = 0; qid < 1; qid++) {
@@ -2850,6 +2853,9 @@ rxtx_config_display(void)
printf("  TX offloads=0x%"PRIx64" - TX RS bit 
threshold=%d\n",
offloads_tmp, tx_rs_thresh_tmp);
}
+   for (qid = 0; qid < nb_txq; qid++)
+   if (ports[pid].txq_state[qid].stopped)
+   printf("TX queue %d is stopped\n", qid);
}
 }
 
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 6c387bde84..36ff845181 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -2152,6 +2152,8 @@ flush_fwd_rx_queues(void)
for (rxp = 0; rxp < cur_fwd_config.nb_fwd_ports; rxp++) {
for (rxq = 0; rxq < nb_rxq; rxq++) {
port_id = fwd_ports_ids[rxp];
+   if (ports[port_id].rxq_state[rxq].stopped)
+   continue;
/**
* testpmd can stuck in the below do while loop
* if rte_eth_rx_burst() always returns nonzero
@@ -2223,8 +2225,20 @@ run_pkt_fwd_on_lcore(struct fwd_lcore *fc, packet_fwd_t 
pkt_fwd)
 static int
 start_pkt_forward_on_core(void *fwd_arg)
 {
-   run_pkt_fwd_on_lcore((struct fwd_lcore *) fwd_arg,
-cur_fwd_config.fwd_eng->packet_fwd);
+   struct fwd_lcore *fc = fwd_arg;
+   struct fwd_stream *fsm = fwd_streams[fc->stream_idx];
+   struct queue_state *rxq = &ports[fsm->rx_port].rxq_state[fsm->rx_queue];
+   struct queue_state *txq = &ports[fsm->tx_port].txq_state[fsm->tx_queue];
+   struct fwd_engine *fwd_engine = cur_fwd_config.fwd_eng;
+   packet_fwd_t packet_fwd;
+
+   /* Check if there will ever be any packets to send. */
+   if (rxq->stopped && (txq->stopped || fwd_engine != &tx_only_engine))
+   return 0;
+   /* Force rxonly mode if RxQ is started, but TxQ is stopped. */
+   packet_fwd = !rxq->stopped && txq->stopped ? rx_only_engine.packet_fwd
+  : fwd_engine->packet_fwd;
+   run_pkt_fwd_on_lcore(fc, packet_fwd);
return 0;
 }
 
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index 2149ecd93a..2744fa4d76 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -216,6 +216,12 @@ struct xstat_display_info {
bool allocated;
 };
 
+/** Application state of a queue. */
+struct queue_sta

Re: [PATCH v2 1/1] mempool: implement index-based per core cache

2022-01-13 Thread Jerin Jacob
On Thu, Jan 13, 2022 at 11:06 AM Dharmik Thakkar
 wrote:
>
> Current mempool per core cache implementation stores pointers to mbufs
> On 64b architectures, each pointer consumes 8B
> This patch replaces it with index-based implementation,
> where in each buffer is addressed by (pool base address + index)
> It reduces the amount of memory/cache required for per core cache
>
> L3Fwd performance testing reveals minor improvements in the cache
> performance (L1 and L2 misses reduced by 0.60%)
> with no change in throughput
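
The (pool base address + index) addressing can be sketched as below. This is a simplified model assuming plain byte offsets within a 4 GB range; the helper names are illustrative, not the patch's actual symbols.

```c
/* Sketch: store a 32-bit offset from the pool base in the per-core
 * cache instead of an 8-byte object pointer. Valid only while all
 * pool memory lies within 4 GB of pool_base. */
#include <assert.h>
#include <stdint.h>

static uint32_t obj_to_index(uintptr_t pool_base, void *obj)
{
	return (uint32_t)((uintptr_t)obj - pool_base);
}

static void *index_to_obj(uintptr_t pool_base, uint32_t idx)
{
	return (void *)(pool_base + idx);
}
```

Halving each cache entry from 8 B to 4 B is where the reduced L1/L2 footprint comes from; the cost is the conversion on every get/put and the 4 GB addressing constraint discussed below.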
>
> Suggested-by: Honnappa Nagarahalli 
> Signed-off-by: Dharmik Thakkar 
> Reviewed-by: Ruifeng Wang 
> ---

>
> /* Now fill in the response ... */
> +#ifdef RTE_MEMPOOL_INDEX_BASED_LCORE_CACHE

Instead of having this #ifdef clutter everywhere for the pair,
I think, we can define RTE_MEMPOOL_INDEX_BASED_LCORE_CACHE once,
and have a different implementation.
i.e
#ifdef RTE_MEMPOOL_INDEX_BASED_LCORE_CACHE
void x()
{

}
void y()
{

}
#else

void x()
{

}
void y()
{

}

#endif

call
x();
y();

in the main code.

> diff --git a/lib/mempool/rte_mempool_ops_default.c 
> b/lib/mempool/rte_mempool_ops_default.c
> index 22fccf9d7619..3543cad9d4ce 100644
> --- a/lib/mempool/rte_mempool_ops_default.c
> +++ b/lib/mempool/rte_mempool_ops_default.c
> @@ -127,6 +127,13 @@ rte_mempool_op_populate_helper(struct rte_mempool *mp, 
> unsigned int flags,
> obj = va + off;
> obj_cb(mp, obj_cb_arg, obj,
>(iova == RTE_BAD_IOVA) ? RTE_BAD_IOVA : (iova + off));
> +#ifdef RTE_MEMPOOL_INDEX_BASED_LCORE_CACHE

This is the only place used in C code.
Since we are going compile time approach. Can make this unconditional?
That will enable the use of this model in the application, without
recompiling DPDK.
All application needs to

#define RTE_MEMPOOL_INDEX_BASED_LCORE_CACHE 1
#include 

I believe enabling such structuring helps to avoid DPDK recompilation of code.


> +   /* Store pool base value to calculate indices for index-based
> +* lcore cache implementation
> +*/
> +   if (i == 0)
> +   mp->pool_base_value = obj;
> +#endif
> rte_mempool_ops_enqueue_bulk(mp, &obj, 1);
> off += mp->elt_size + mp->trailer_size;
> }
> --
> 2.17.1
>


[dpdk-dev] [PATCH v2 1/2] ethdev: support queue-based priority flow control

2022-01-13 Thread jerinj
From: Jerin Jacob 

Based on device support and use-case needs, there are two different ways
to enable PFC. The first case is the port-level PFC configuration; in
this case, the rte_eth_dev_priority_flow_ctrl_set() API shall be used to
configure PFC, and PFC frames will be generated based on the VLAN
TC value.

The second case is the queue-level PFC configuration; in this
case, any packet field content can be used to steer the packet to a
specific queue using rte_flow or RSS, and then
rte_eth_dev_priority_flow_ctrl_queue_set() is used to set the TC mapping
on each queue. Based on congestion detected on the specific queue, the
configured TC shall be used to generate PFC frames.

Operation of these modes is mutually exclusive: when the driver sets a
non-zero value for rte_eth_dev_info::pfc_queue_tc_max, the
application must use the queue-level PFC configuration via the
rte_eth_dev_priority_flow_ctrl_queue_set() API instead of the port-level
PFC configuration via the rte_eth_dev_priority_flow_ctrl_set() API to
realize PFC configuration.
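
A minimal sketch of that mode-selection rule, as an application might apply it (the struct and helper names are simplified stand-ins, not the rte_ethdev API; only the deciding field is modeled):

```c
/* Sketch: decide between port-level and queue-level PFC per the rule
 * above. dev_info_stub stands in for rte_eth_dev_info. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct dev_info_stub {
	uint8_t pfc_queue_tc_max; /* 0 => queue-level PFC unsupported */
};

/* True  => must use rte_eth_dev_priority_flow_ctrl_queue_set()
 * False => must use rte_eth_dev_priority_flow_ctrl_set() */
static bool must_use_queue_level_pfc(const struct dev_info_stub *info)
{
	return info->pfc_queue_tc_max != 0;
}
```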

This patch enables the configuration for the second case, a.k.a.
queue-based PFC, and also updates the rte_eth_dev_priority_flow_ctrl_set()
implementation to adhere to rte_eth_dev_info::pfc_queue_tc_max
handling.

Also updated libabigail.abignore to ignore the update
to reserved fields in rte_eth_dev_info.

Signed-off-by: Jerin Jacob 
Signed-off-by: Sunil Kumar Kori 
---

A driver implementation based on this API is at
https://patches.dpdk.org/project/dpdk/patch/20220111081831.881374-1-sk...@marvell.com/

RFC..v1
 - Added queue based PFC config API instead of port based

v1..v2
- Updated libabigail.abignore as
  has_data_member_inserted_between = {offset_of(reserved_64s), end}
- Updated doxygen comments of rte_eth_pfc_queue_conf


 devtools/libabigail.abignore   |   5 ++
 doc/guides/nics/features.rst   |   5 +-
 doc/guides/rel_notes/release_22_03.rst |   3 +
 lib/ethdev/ethdev_driver.h |   6 +-
 lib/ethdev/rte_ethdev.c| 109 +
 lib/ethdev/rte_ethdev.h|  83 ++-
 lib/ethdev/version.map |   3 +
 7 files changed, 208 insertions(+), 6 deletions(-)

diff --git a/devtools/libabigail.abignore b/devtools/libabigail.abignore
index 4b676f317d..3bdecaaef0 100644
--- a/devtools/libabigail.abignore
+++ b/devtools/libabigail.abignore
@@ -11,3 +11,8 @@
 ; Ignore generated PMD information strings
 [suppress_variable]
 name_regexp = _pmd_info$
+
+;Ignore fields inserted in place of reserved fields of rte_eth_dev_info
+[suppress_type]
+   name = rte_eth_dev_info
+   has_data_member_inserted_between = {offset_of(reserved_64s), end}
diff --git a/doc/guides/nics/features.rst b/doc/guides/nics/features.rst
index 27be2d2576..277a784f4e 100644
--- a/doc/guides/nics/features.rst
+++ b/doc/guides/nics/features.rst
@@ -379,9 +379,10 @@ Flow control
 Supports configuring link flow control.
 
 * **[implements] eth_dev_ops**: ``flow_ctrl_get``, ``flow_ctrl_set``,
-  ``priority_flow_ctrl_set``.
+  ``priority_flow_ctrl_set``, ``priority_flow_ctrl_queue_set``.
 * **[related]API**: ``rte_eth_dev_flow_ctrl_get()``, 
``rte_eth_dev_flow_ctrl_set()``,
-  ``rte_eth_dev_priority_flow_ctrl_set()``.
+  ``rte_eth_dev_priority_flow_ctrl_set()``, 
``rte_eth_dev_priority_flow_ctrl_queue_set()``.
+* **[provides]   rte_eth_dev_info**: ``pfc_queue_tc_max``.
 
 
 .. _nic_features_rate_limitation:
diff --git a/doc/guides/rel_notes/release_22_03.rst 
b/doc/guides/rel_notes/release_22_03.rst
index 6d99d1eaa9..b75c0356e6 100644
--- a/doc/guides/rel_notes/release_22_03.rst
+++ b/doc/guides/rel_notes/release_22_03.rst
@@ -55,6 +55,9 @@ New Features
  Also, make sure to start the actual text at the margin.
  ===
 
+* **Added an API to enable queue based priority flow ctrl(PFC).**
+
+  A new API, ``rte_eth_dev_priority_flow_ctrl_queue_set()``, was added.
 
 Removed Items
 -
diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h
index d95605a355..e0bbfe89d7 100644
--- a/lib/ethdev/ethdev_driver.h
+++ b/lib/ethdev/ethdev_driver.h
@@ -532,6 +532,9 @@ typedef int (*flow_ctrl_set_t)(struct rte_eth_dev *dev,
 /** @internal Setup priority flow control parameter on an Ethernet device. */
 typedef int (*priority_flow_ctrl_set_t)(struct rte_eth_dev *dev,
struct rte_eth_pfc_conf *pfc_conf);
+/** @internal Queue setup for priority flow control parameter on an Ethernet 
device. */
+typedef int (*priority_flow_ctrl_queue_set_t)(struct rte_eth_dev *dev,
+   struct rte_eth_pfc_queue_conf *pfc_queue_conf);
 
 /** @internal Update RSS redirection table on an Ethernet device. */
 typedef int (*reta_update_t)(struct rte_eth_dev *dev,
@@ -1080,7 +1083,8 @@ struct eth_dev_ops {
flow_ctrl_set_tflow_ctrl_set; /**< Setup flow control */
/** Setup priority flow control */
priority_flow_ctrl_set_t   prio

[dpdk-dev] [PATCH v2 2/2] app/testpmd: add queue based pfc CLI options

2022-01-13 Thread jerinj
From: Sunil Kumar Kori 

Patch adds command line options to configure queue based
priority flow control.

- Syntax command is given as below:

set pfc_queue_ctrl <port_id> rx <on|off> <tx_qid> <tx_tc> \
tx <on|off> <rx_qid> <rx_tc> <pause_time>

- Example command to configure queue based priority flow control
  on rx and tx side for port 0, Rx queue 0, Tx queue 0 with pause
  time 2047

testpmd> set pfc_queue_ctrl 0 rx on 0 0 tx on 0 0 2047

Signed-off-by: Sunil Kumar Kori 
---
 app/test-pmd/cmdline.c  | 122 
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  22 
 2 files changed, 144 insertions(+)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index e626b1c7d9..19dbcea4f1 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -544,6 +544,11 @@ static void cmd_help_long_parsed(void *parsed_result,
"Set the priority flow control parameter on a"
" port.\n\n"
 
+   "set pfc_queue_ctrl (port_id) rx (on|off) (tx_qid)"
+   " (tx_tc) tx (on|off) (rx_qid) (rx_tc) (pause_time)\n"
+   "Set the queue priority flow control parameter on a"
+   " given Rx and Tx queues of a port.\n\n"
+
"set stat_qmap (tx|rx) (port_id) (queue_id) 
(qmapping)\n"
"Set statistics mapping (qmapping 0..15) for RX/TX"
" queue on port.\n"
@@ -7690,6 +7695,122 @@ cmdline_parse_inst_t cmd_priority_flow_control_set = {
},
 };
 
+struct cmd_queue_priority_flow_ctrl_set_result {
+   cmdline_fixed_string_t set;
+   cmdline_fixed_string_t pfc_queue_ctrl;
+   portid_t port_id;
+   cmdline_fixed_string_t rx;
+   cmdline_fixed_string_t rx_pfc_mode;
+   uint16_t tx_qid;
+   uint8_t  tx_tc;
+   cmdline_fixed_string_t tx;
+   cmdline_fixed_string_t tx_pfc_mode;
+   uint16_t rx_qid;
+   uint8_t  rx_tc;
+   uint16_t pause_time;
+};
+
+static void
+cmd_queue_priority_flow_ctrl_set_parsed(void *parsed_result,
+   __rte_unused struct cmdline *cl,
+   __rte_unused void *data)
+{
+   struct cmd_queue_priority_flow_ctrl_set_result *res = parsed_result;
+   struct rte_eth_pfc_queue_conf pfc_queue_conf;
+   int rx_fc_enable, tx_fc_enable;
+   int ret;
+
+   /*
+* Rx on/off, flow control is enabled/disabled on RX side. This can
+* indicate the RTE_ETH_FC_TX_PAUSE, Transmit pause frame at the Rx
+* side. Tx on/off, flow control is enabled/disabled on TX side. This
+* can indicate the RTE_ETH_FC_RX_PAUSE, Respond to the pause frame at
+* the Tx side.
+*/
+   static enum rte_eth_fc_mode rx_tx_onoff_2_mode[2][2] = {
+   {RTE_ETH_FC_NONE, RTE_ETH_FC_TX_PAUSE},
+   {RTE_ETH_FC_RX_PAUSE, RTE_ETH_FC_FULL}
+   };
+
+   memset(&pfc_queue_conf, 0, sizeof(struct rte_eth_pfc_queue_conf));
+   rx_fc_enable = (!strncmp(res->rx_pfc_mode, "on", 2)) ? 1 : 0;
+   tx_fc_enable = (!strncmp(res->tx_pfc_mode, "on", 2)) ? 1 : 0;
+   pfc_queue_conf.mode = rx_tx_onoff_2_mode[rx_fc_enable][tx_fc_enable];
+   pfc_queue_conf.rx_pause.tc  = res->tx_tc;
+   pfc_queue_conf.rx_pause.tx_qid = res->tx_qid;
+   pfc_queue_conf.tx_pause.tc  = res->rx_tc;
+   pfc_queue_conf.tx_pause.rx_qid  = res->rx_qid;
+   pfc_queue_conf.tx_pause.pause_time = res->pause_time;
+
+   ret = rte_eth_dev_priority_flow_ctrl_queue_set(res->port_id,
+  &pfc_queue_conf);
+   if (ret != 0) {
+   fprintf(stderr,
+   "bad queue priority flow control parameter, rc = %d\n",
+   ret);
+   }
+}
+
+cmdline_parse_token_string_t cmd_q_pfc_set_set =
+   TOKEN_STRING_INITIALIZER(struct cmd_queue_priority_flow_ctrl_set_result,
+   set, "set");
+cmdline_parse_token_string_t cmd_q_pfc_set_flow_ctrl =
+   TOKEN_STRING_INITIALIZER(struct cmd_queue_priority_flow_ctrl_set_result,
+   pfc_queue_ctrl, "pfc_queue_ctrl");
+cmdline_parse_token_num_t cmd_q_pfc_set_portid =
+   TOKEN_NUM_INITIALIZER(struct cmd_queue_priority_flow_ctrl_set_result,
+   port_id, RTE_UINT16);
+cmdline_parse_token_string_t cmd_q_pfc_set_rx =
+   TOKEN_STRING_INITIALIZER(struct cmd_queue_priority_flow_ctrl_set_result,
+   rx, "rx");
+cmdline_parse_token_string_t cmd_q_pfc_set_rx_mode =
+   TOKEN_STRING_INITIALIZER(struct cmd_queue_priority_flow_ctrl_set_result,
+   rx_pfc_mode, "on#off");
+cmdline_parse_token_num_t cmd_q_pfc_set_tx_qid =
+   TOKEN_NUM_INITIALIZER(struct cmd_queue_priority_flow_ctrl_set_result,
+   tx_qid, RTE_UINT16);
+cmdline_parse_token_num_t cmd_q_pfc_set_tx_tc =
+   TOKEN_

RE: [PATCH 2/5] crypto/openssl: fix output of RSA verify op

2022-01-13 Thread Ramkumar Balu
Thank you for the comments. I agree that OpenSSL PMD needs a major refactoring 
in asym crypto. 
I have asked Akhil to reject this patch series.

-Original Message-
From: Kusztal, ArkadiuszX  
Sent: Tuesday, December 28, 2021 2:41 PM
To: Ramkumar Balu ; Akhil Goyal ; Anoob 
Joseph ; Doherty, Declan ; Zhang, 
Roy Fan ; Ankur Dwivedi ; 
Tejasree Kondoj 
Cc: sta...@dpdk.org; dev@dpdk.org
Subject: [EXT] RE: [PATCH 2/5] crypto/openssl: fix output of RSA verify op

--


> -Original Message-
> From: Ramkumar Balu 
> Sent: Monday, November 29, 2021 10:52 AM
> To: Akhil Goyal ; Anoob Joseph 
> ; Doherty, Declan ; 
> Zhang, Roy Fan ; Ankur Dwivedi 
> ; Tejasree Kondoj 
> Cc: sta...@dpdk.org; dev@dpdk.org; Ramkumar 
> Subject: [PATCH 2/5] crypto/openssl: fix output of RSA verify op
> 
> From: Ramkumar 
> 
> During RSA verify, the OpenSSL PMD fails to return the plaintext after 
> public key decryption.
> This patch fixes the OpenSSL PMD to return the decrypted plaintext in 
> cipher.data / cipher.length fields
> 
> Fixes: 3e9d6bd447fb ("crypto/openssl: add RSA and mod asym 
> operations")
> Fixes: fe1606e0138c ("crypto/openssl: fix RSA verify operation")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Ramkumar 
> ---
>  drivers/crypto/openssl/rte_openssl_pmd.c | 16 +++-
>  1 file changed, 11 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/crypto/openssl/rte_openssl_pmd.c
> b/drivers/crypto/openssl/rte_openssl_pmd.c
> index 5794ed8159..3ab2c3b5c1 100644
> --- a/drivers/crypto/openssl/rte_openssl_pmd.c
> +++ b/drivers/crypto/openssl/rte_openssl_pmd.c
> @@ -1953,12 +1953,16 @@ process_openssl_rsa_op(struct rte_crypto_op 
> *cop,
>   break;
> 
>   case RTE_CRYPTO_ASYM_OP_VERIFY:
> - tmp = rte_malloc(NULL, op->rsa.sign.length, 0);
> + tmp = op->rsa.cipher.data;
>   if (tmp == NULL) {
> - OPENSSL_LOG(ERR, "Memory allocation failed");
> - cop->status = RTE_CRYPTO_OP_STATUS_ERROR;
> - break;
> + tmp = rte_malloc(NULL, op->rsa.sign.length, 0);
> + if (tmp == NULL) {
> + OPENSSL_LOG(ERR, "Memory allocation
> failed");
> + cop->status =
> RTE_CRYPTO_OP_STATUS_ERROR;
> + break;
> + }
>   }
> +
>   ret = RSA_public_decrypt(op->rsa.sign.length,
>   op->rsa.sign.data,
>   tmp,
[Arek] - this function is deprecated and, more importantly, it properly handles 
only the NO_PADDING situation (no DER encoding, as in pre-TLS 1.2). The OpenSSL 
code needs a major refactor in this area soon (mostly in asymmetric crypto).
> @@ -1974,7 +1978,9 @@ process_openssl_rsa_op(struct rte_crypto_op *cop,
>   OPENSSL_LOG(ERR, "RSA sign Verification failed");
>   cop->status = RTE_CRYPTO_OP_STATUS_ERROR;
>   }
> - rte_free(tmp);
> + op->rsa.cipher.length = ret;
> + if (tmp != op->rsa.cipher.data)
> + rte_free(tmp);
>   break;
> 
>   default:
> --
> 2.17.1



RE: [PATCH 1/1] mempool: implement index-based per core cache

2022-01-13 Thread Ananyev, Konstantin


Hi Dharmik,

> >
> >> Current mempool per core cache implementation stores pointers to mbufs
> >> On 64b architectures, each pointer consumes 8B
> >> This patch replaces it with index-based implementation,
> >> where in each buffer is addressed by (pool base address + index)
> >> It reduces the amount of memory/cache required for per core cache
> >>
> >> L3Fwd performance testing reveals minor improvements in the cache
> >> performance (L1 and L2 misses reduced by 0.60%)
> >> with no change in throughput
> >
> > I feel really sceptical about that patch and the whole idea in general:
> > - From what I read above there is no real performance improvement observed.
> >  (In fact on my IA boxes mempool_perf_autotest reports ~20% slowdown,
> >  see below for more details).
> 
> Currently, the optimizations (loop unroll and vectorization) are only 
> implemented for ARM64.
> Similar optimizations can be implemented for x86 platforms which should close 
> the performance gap
> and in my understanding should give better performance for a bulk size of 32.

Might be, but I still don't see the reason for such effort.
As you mentioned there is no performance improvement in 'real' apps: l3fwd, etc.
on ARM64 even with vectorized version of the code.

> > - Space utilization difference looks neglectable too.
> 
> Sorry, I did not understand this point.

As I understand one of the expectations from that patch was:
reduce memory/cache required, which should improve cache utilization
(less misses, etc.).
Though I think such improvements would be negligible and wouldn't
cause any real performance gain. 

> > - The change introduces a new build time config option with a major 
> > limitation:
> >   All memzones in a pool have to be within the same 4GB boundary.
> >   To address it properly, extra changes will be required in init(/populate) 
> > part of the code.
> 
> I agree to the above mentioned challenges and I am currently working on 
> resolving these issues.

I still think that to justify such changes some really noticeable performance
improvement needs to be demonstrated: double-digit speedup for 
l3fwd/ipsec-secgw/...  
Otherwise it is just not worth the hassle. 
 
> >   All that will complicate mempool code, will make it more error prone
> >   and harder to maintain.
> > But, as there is no real gain in return - no point to add such extra 
> > complexity at all.
> >
> > Konstantin
> >


RE: [PATCH v3 1/2] eventdev/crypto_adapter: move crypto ops to circular buffer

2022-01-13 Thread Gujjar, Abhinandan S
Hi Ganapati,


> -Original Message-
> From: Kundapura, Ganapati 
> Sent: Tuesday, January 11, 2022 4:07 PM
> To: jerinjac...@gmail.com; Jayatheerthan, Jay ;
> dev@dpdk.org
> Cc: Gujjar, Abhinandan S 
> Subject: [PATCH v3 1/2] eventdev/crypto_adapter: move crypto ops to circular
> buffer
> 
> Move crypto ops to circular buffer to retain crypto ops when
> cryptodev/eventdev are temporarily full
> 
> ---
> v3:
> * update eca_ops_buffer_flush() to flush out all the crypto
>   ops out of circular buffer.
> * remove freeing of failed crypto ops from eca_ops_enqueue_burst()
>   and add to cirular buffer for later processing.
> 
> v2:
> * reset cryptp adapter next cdev id before dequeueing from the
Cryptp -> crypto
>   next cdev
> ---
> 
> Signed-off-by: Ganapati Kundapura 
I don't see the sign-off after applying the patch. Could you take a look?
commit 18d9c3b1728f325dba5fe5dbb51df2366b70527a
Author: Ganapati Kundapura 
Date:   Tue Jan 11 04:36:30 2022 -0600

eventdev/crypto_adapter: move crypto ops to circular buffer

Move crypto ops to circular buffer to retain crypto
ops when cryptodev/eventdev are temporarily full


> 
> diff --git a/lib/eventdev/rte_event_crypto_adapter.c
> b/lib/eventdev/rte_event_crypto_adapter.c
> index d840803..9086368 100644
> --- a/lib/eventdev/rte_event_crypto_adapter.c
> +++ b/lib/eventdev/rte_event_crypto_adapter.c
> @@ -25,11 +25,27 @@
>  #define CRYPTO_ADAPTER_MEM_NAME_LEN 32
>  #define CRYPTO_ADAPTER_MAX_EV_ENQ_RETRIES 100
> 
> +#define CRYPTO_ADAPTER_OPS_BUFFER_SZ (BATCH_SIZE + BATCH_SIZE)
> #define
> +CRYPTO_ADAPTER_BUFFER_SZ 1024
> +
>  /* Flush an instance's enqueue buffers every CRYPTO_ENQ_FLUSH_THRESHOLD
>   * iterations of eca_crypto_adapter_enq_run()
>   */
>  #define CRYPTO_ENQ_FLUSH_THRESHOLD 1024
> 
> +struct crypto_ops_circular_buffer {
> + /* index of head element in circular buffer */
> + uint16_t head;
> + /* index of tail element in circular buffer */
> + uint16_t tail;
> + /* number elements in buffer */
number elements -> number of elements 
> + uint16_t count;
> + /* size of circular buffer */
> + uint16_t size;
> + /* Pointer to hold rte_crypto_ops for batching */
> + struct rte_crypto_op **op_buffer;
> +} __rte_cache_aligned;
> +
>  struct event_crypto_adapter {
>   /* Event device identifier */
>   uint8_t eventdev_id;
> @@ -47,6 +63,8 @@ struct event_crypto_adapter {
>   struct crypto_device_info *cdevs;
>   /* Loop counter to flush crypto ops */
>   uint16_t transmit_loop_count;
> + /* Circular buffer for batching crypto ops to eventdev */
> + struct crypto_ops_circular_buffer ebuf;
>   /* Per instance stats structure */
>   struct rte_event_crypto_adapter_stats crypto_stats;
>   /* Configuration callback for rte_service configuration */ @@ -93,8
> +111,8 @@ struct crypto_device_info {  struct crypto_queue_pair_info {
>   /* Set to indicate queue pair is enabled */
>   bool qp_enabled;
> - /* Pointer to hold rte_crypto_ops for batching */
> - struct rte_crypto_op **op_buffer;
> + /* Circular buffer for batching crypto ops to cdev */
> + struct crypto_ops_circular_buffer cbuf;
>   /* No of crypto ops accumulated */
>   uint8_t len;
>  } __rte_cache_aligned;
> @@ -141,6 +159,77 @@ eca_init(void)
>   return 0;
>  }
> 
> +static inline bool
> +eca_circular_buffer_batch_ready(struct crypto_ops_circular_buffer
> +*bufp) {
> + return bufp->count >= BATCH_SIZE;
> +}
> +
> +static inline void
> +eca_circular_buffer_free(struct crypto_ops_circular_buffer *bufp) {
> + rte_free(bufp->op_buffer);
> +}
> +
> +static inline int
> +eca_circular_buffer_init(const char *name,
> +  struct crypto_ops_circular_buffer *bufp,
> +  uint16_t sz)
> +{
> + bufp->op_buffer = rte_zmalloc(name,
> +   sizeof(struct rte_crypto_op *) * sz,
> +   0);
> + if (bufp->op_buffer == NULL)
> + return -ENOMEM;
> +
> + bufp->size = sz;
> + return 0;
> +}
> +
> +static inline int
> +eca_circular_buffer_add(struct crypto_ops_circular_buffer *bufp,
> + struct rte_crypto_op *op)
> +{
> + uint16_t *tailp = &bufp->tail;
> +
> + bufp->op_buffer[*tailp] = op;
> + *tailp = (*tailp + 1) % bufp->size;
Provide a comment saying you are taking care of buffer rollover condition
> + bufp->count++;
> +
> + return 0;
> +}
> +
> +static inline int
> +eca_circular_buffer_flush_to_cdev(struct crypto_ops_circular_buffer *bufp,
> +   uint8_t cdev_id, uint16_t qp_id,
> +   uint16_t *nb_ops_flushed)
This function is returning a value but caller does not check!
> +{
> + uint16_t n = 0;
> + uint16_t *headp = &bufp->head;
> + uint16_t *tailp = &bufp->tail;
> + struct rte_crypto_op **ops = bufp->op_buffer;
> +
> + if (*tailp > *headp)
> +

Re: [dpdk-dev] [PATCH 1/2] devtools: remove ugly workaround from get maintainers

2022-01-13 Thread Thomas Monjalon
01/11/2021 14:35, Ferruh Yigit:
> Linux kernel 'get_maintainer.pl' script supports running out of Linux
> tree since commit
> 31bb82c9caa9 ("get_maintainer: allow usage outside of kernel tree")
> 
> As commit is a few years old now, integrating it to DPDK and removing
> ugly workaround for it.
> 
> Signed-off-by: Ferruh Yigit 

Applied without patch 2, thanks.





Re: [PATCH 1/1] ci: restrict concurrency

2022-01-13 Thread Thomas Monjalon
Hi,

The explanation should be in the patch, not the cover letter.
Actually, you don't need a cover letter for a single patch.
Copying it here:
"
dpdk is fairly expensive to build in GitHub.

It's helpful to abandon old builds as soon as there's a new
build waiting instead of wasting resources on the previous
round.
"

12/01/2022 07:50, Josh Soref:
> Signed-off-by: Josh Soref 
> ---
> +concurrency:
> +  group: build-${{ matrix.config.os }}-${{ matrix.config.compiler }}-${{ 
> matrix.config.library }}-${{ matrix.config.cross }}-${{ matrix.config.mini 
> }}-${{ github.event.pull_request.number || github.ref }}
> +  cancel-in-progress: true

The goal of the CI is to catch any issue in a submitted patch.
Is your change cancelling a test of a patch when another one is submitted?




[PATCH] common/cnxk: enable NIX Tx interrupts errata

2022-01-13 Thread Harman Kalra
An errata exists whereby NIX may incorrectly overwrite the value in
NIX_SQ_CTX_S[SQ_INT]. This may cause set interrupts to get cleared or
cause a QINT when no error is outstanding.
As a workaround, software should always read all SQ debug registers
and not just rely on NIX_SQINT_E bits set in NIX_SQ_CTX_S[SQ_INT].
Also, for detecting SQB faults, software must read the SQ context and
check if next_sqb is NULL.

Signed-off-by: Harman Kalra 
---
 drivers/common/cnxk/roc_nix_irq.c | 69 ++-
 1 file changed, 50 insertions(+), 19 deletions(-)

diff --git a/drivers/common/cnxk/roc_nix_irq.c 
b/drivers/common/cnxk/roc_nix_irq.c
index a5cd9d4b02..71971ef261 100644
--- a/drivers/common/cnxk/roc_nix_irq.c
+++ b/drivers/common/cnxk/roc_nix_irq.c
@@ -196,15 +196,42 @@ nix_lf_sq_irq_get_and_clear(struct nix *nix, uint16_t sq)
return nix_lf_q_irq_get_and_clear(nix, sq, NIX_LF_SQ_OP_INT, ~0x1ff00);
 }
 
-static inline void
+static inline bool
+nix_lf_is_sqb_null(struct dev *dev, int q)
+{
+   bool is_sqb_null = false;
+   volatile void *ctx;
+   int rc;
+
+   rc = nix_q_ctx_get(dev, NIX_AQ_CTYPE_SQ, q, &ctx);
+   if (rc) {
+   plt_err("Failed to get sq context");
+   } else {
+   is_sqb_null =
+   roc_model_is_cn9k() ?
+   (((__io struct nix_sq_ctx_s *)ctx)->next_sqb ==
+0) :
+   (((__io struct nix_cn10k_sq_ctx_s *)ctx)
+->next_sqb == 0);
+   }
+
+   return is_sqb_null;
+}
+
+static inline uint8_t
 nix_lf_sq_debug_reg(struct nix *nix, uint32_t off)
 {
+   uint8_t err = 0;
uint64_t reg;
 
reg = plt_read64(nix->base + off);
-   if (reg & BIT_ULL(44))
-   plt_err("SQ=%d err_code=0x%x", (int)((reg >> 8) & 0xf),
-   (uint8_t)(reg & 0xff));
+   if (reg & BIT_ULL(44)) {
+   err = reg & 0xff;
+   /* Clear valid bit */
+   plt_write64(BIT_ULL(44), nix->base + off);
+   }
+
+   return err;
 }
 
 static void
@@ -226,6 +253,7 @@ nix_lf_q_irq(void *param)
struct dev *dev = &nix->dev;
int q, cq, rq, sq;
uint64_t intr;
+   uint8_t rc;
 
intr = plt_read64(nix->base + NIX_LF_QINTX_INT(qintx));
if (intr == 0)
@@ -266,22 +294,25 @@ nix_lf_q_irq(void *param)
sq = q % nix->qints;
irq = nix_lf_sq_irq_get_and_clear(nix, sq);
 
-   if (irq & BIT_ULL(NIX_SQINT_LMT_ERR)) {
-   plt_err("SQ=%d NIX_SQINT_LMT_ERR", sq);
-   nix_lf_sq_debug_reg(nix, NIX_LF_SQ_OP_ERR_DBG);
-   }
-   if (irq & BIT_ULL(NIX_SQINT_MNQ_ERR)) {
-   plt_err("SQ=%d NIX_SQINT_MNQ_ERR", sq);
-   nix_lf_sq_debug_reg(nix, NIX_LF_MNQ_ERR_DBG);
-   }
-   if (irq & BIT_ULL(NIX_SQINT_SEND_ERR)) {
-   plt_err("SQ=%d NIX_SQINT_SEND_ERR", sq);
-   nix_lf_sq_debug_reg(nix, NIX_LF_SEND_ERR_DBG);
-   }
-   if (irq & BIT_ULL(NIX_SQINT_SQB_ALLOC_FAIL)) {
+   /* Detect LMT store error */
+   rc = nix_lf_sq_debug_reg(nix, NIX_LF_SQ_OP_ERR_DBG);
+   if (rc)
+   plt_err("SQ=%d NIX_SQINT_LMT_ERR, errcode %x", sq, rc);
+
+   /* Detect Meta-descriptor enqueue error */
+   rc = nix_lf_sq_debug_reg(nix, NIX_LF_MNQ_ERR_DBG);
+   if (rc)
+   plt_err("SQ=%d NIX_SQINT_MNQ_ERR, errcode %x", sq, rc);
+
+   /* Detect Send error */
+   rc = nix_lf_sq_debug_reg(nix, NIX_LF_SEND_ERR_DBG);
+   if (rc)
+   plt_err("SQ=%d NIX_SQINT_SEND_ERR, errcode %x", sq, rc);
+
+   /* Detect SQB fault, read SQ context to check SQB NULL case */
+   if (irq & BIT_ULL(NIX_SQINT_SQB_ALLOC_FAIL) ||
+   nix_lf_is_sqb_null(dev, q))
plt_err("SQ=%d NIX_SQINT_SQB_ALLOC_FAIL", sq);
-   nix_lf_sq_debug_reg(nix, NIX_LF_SEND_ERR_DBG);
-   }
}
 
/* Clear interrupt */
-- 
2.18.0



[PATCH v2 1/2] net/cnxk: update meter bpf ID in Recevie Queue

2022-01-13 Thread Rakesh Kudurumalla
Patch updates configured meter bpf is in recevie queue
context during meter creation

Signed-off-by: Rakesh Kudurumalla 
Acked-by: Sunil Kumar Kori 
---
v2: Fix commit message
 drivers/net/cnxk/cn10k_rte_flow.c  |  7 +++
 drivers/net/cnxk/cnxk_ethdev_mtr.c | 21 -
 2 files changed, 19 insertions(+), 9 deletions(-)

diff --git a/drivers/net/cnxk/cn10k_rte_flow.c 
b/drivers/net/cnxk/cn10k_rte_flow.c
index b830abe63e..529fb0e4b7 100644
--- a/drivers/net/cnxk/cn10k_rte_flow.c
+++ b/drivers/net/cnxk/cn10k_rte_flow.c
@@ -36,20 +36,20 @@ cn10k_mtr_configure(struct rte_eth_dev *eth_dev,
for (i = 0; actions[i].type != RTE_FLOW_ACTION_TYPE_END; i++) {
if (actions[i].type == RTE_FLOW_ACTION_TYPE_METER) {
mtr_conf = (const struct rte_flow_action_meter
-   *)(actions->conf);
+   *)(actions[i].conf);
mtr_id = mtr_conf->mtr_id;
is_mtr_act = true;
}
if (actions[i].type == RTE_FLOW_ACTION_TYPE_QUEUE) {
q_conf = (const struct rte_flow_action_queue
- *)(actions->conf);
+ *)(actions[i].conf);
if (is_mtr_act)
nix_mtr_rq_update(eth_dev, mtr_id, 1,
  &q_conf->index);
}
if (actions[i].type == RTE_FLOW_ACTION_TYPE_RSS) {
rss_conf = (const struct rte_flow_action_rss
-   *)(actions->conf);
+   *)(actions[i].conf);
if (is_mtr_act)
nix_mtr_rq_update(eth_dev, mtr_id,
  rss_conf->queue_num,
@@ -171,7 +171,6 @@ cn10k_flow_create(struct rte_eth_dev *eth_dev, const struct 
rte_flow_attr *attr,
return NULL;
}
}
-
for (i = 0; actions[i].type != RTE_FLOW_ACTION_TYPE_END; i++) {
if (actions[i].type == RTE_FLOW_ACTION_TYPE_METER) {
mtr = (const struct rte_flow_action_meter *)actions[i]
diff --git a/drivers/net/cnxk/cnxk_ethdev_mtr.c 
b/drivers/net/cnxk/cnxk_ethdev_mtr.c
index 39d8563826..cc783e5f86 100644
--- a/drivers/net/cnxk/cnxk_ethdev_mtr.c
+++ b/drivers/net/cnxk/cnxk_ethdev_mtr.c
@@ -35,7 +35,6 @@ static struct rte_mtr_capabilities mtr_capa = {
.chaining_n_mtrs_per_flow_max = NIX_MTR_COUNT_PER_FLOW,
.chaining_use_prev_mtr_color_supported = true,
.chaining_use_prev_mtr_color_enforced = true,
-   .meter_rate_max = NIX_BPF_RATE_MAX / 8, /* Bytes per second */
.color_aware_srtcm_rfc2697_supported = true,
.color_aware_trtcm_rfc2698_supported = true,
.color_aware_trtcm_rfc4115_supported = true,
@@ -180,13 +179,13 @@ cnxk_nix_mtr_capabilities_get(struct rte_eth_dev *dev,
  struct rte_mtr_capabilities *capa,
  struct rte_mtr_error *error)
 {
-   struct cnxk_eth_dev *eth_dev = cnxk_eth_pmd_priv(dev);
-   uint16_t count[ROC_NIX_BPF_LEVEL_MAX] = {0};
uint8_t lvl_mask = ROC_NIX_BPF_LEVEL_F_LEAF | ROC_NIX_BPF_LEVEL_F_MID |
   ROC_NIX_BPF_LEVEL_F_TOP;
+   struct cnxk_eth_dev *eth_dev = cnxk_eth_pmd_priv(dev);
+   uint16_t count[ROC_NIX_BPF_LEVEL_MAX] = {0};
struct roc_nix *nix = &eth_dev->nix;
-   int rc;
-   int i;
+   uint32_t time_unit;
+   int rc, i;
 
RTE_SET_USED(dev);
 
@@ -207,6 +206,15 @@ cnxk_nix_mtr_capabilities_get(struct rte_eth_dev *dev,
mtr_capa.meter_trtcm_rfc4115_n_max = mtr_capa.n_max;
mtr_capa.meter_policy_n_max = mtr_capa.n_max;
 
+   rc = roc_nix_bpf_timeunit_get(nix, &time_unit);
+   if (rc)
+   return rc;
+
+   mtr_capa.meter_rate_max =
+   NIX_BPF_RATE(time_unit, NIX_BPF_MAX_RATE_EXPONENT,
+NIX_BPF_MAX_RATE_MANTISSA, 0) /
+   8;
+
*capa = mtr_capa;
return 0;
 }
@@ -304,6 +312,9 @@ cnxk_nix_mtr_policy_validate(struct rte_eth_dev *dev,
if (action->type == RTE_FLOW_ACTION_TYPE_DROP)
supported[i] = true;
 
+   if (action->type == RTE_FLOW_ACTION_TYPE_VOID)
+   supported[i] = true;
+
if (!supported[i]) {
sprintf(message,
"%s action is not valid",
-- 
2.25.1



[PATCH v2 2/2] common/cnxk: update meter algorithm in band profile

2022-01-13 Thread Rakesh Kudurumalla
Patch updates meter algorithm in nix band profile
structure

Signed-off-by: Rakesh Kudurumalla 
Acked-by: Sunil Kumar Kori 
---
v2: series patch of 1/2
 drivers/common/cnxk/hw/nix.h  |  5 ---
 drivers/common/cnxk/roc_nix_bpf.c | 61 +--
 2 files changed, 17 insertions(+), 49 deletions(-)

diff --git a/drivers/common/cnxk/hw/nix.h b/drivers/common/cnxk/hw/nix.h
index dd2ebecc6a..6931f1d1d2 100644
--- a/drivers/common/cnxk/hw/nix.h
+++ b/drivers/common/cnxk/hw/nix.h
@@ -2133,11 +2133,6 @@ struct nix_lso_format {
((NIX_BPF_RATE_CONST * ((256 + (mantissa)) << (exponent))) /   \
 (((1ull << (div_exp)) * 256 * policer_timeunit)))
 
-/* Meter rate limits in Bits/Sec */
-#define NIX_BPF_RATE_MIN NIX_BPF_RATE(10, 0, 0, 0)
-#define NIX_BPF_RATE_MAX   
\
-   NIX_BPF_RATE(1, NIX_BPF_MAX_RATE_EXPONENT, NIX_BPF_MAX_RATE_MANTISSA, 0)
-
 #define NIX_BPF_DEFAULT_ADJUST_MANTISSA 511
 #define NIX_BPF_DEFAULT_ADJUST_EXPONENT 0
 
diff --git a/drivers/common/cnxk/roc_nix_bpf.c 
b/drivers/common/cnxk/roc_nix_bpf.c
index 6996a54be0..46ed91e87b 100644
--- a/drivers/common/cnxk/roc_nix_bpf.c
+++ b/drivers/common/cnxk/roc_nix_bpf.c
@@ -38,48 +38,23 @@ meter_rate_to_nix(uint64_t value, uint64_t *exponent_p, 
uint64_t *mantissa_p,
  uint64_t *div_exp_p, uint32_t timeunit_p)
 {
uint64_t div_exp, exponent, mantissa;
-   uint32_t time_us = timeunit_p;
+   uint32_t time_ns = timeunit_p;
 
/* Boundary checks */
-   if (value < NIX_BPF_RATE_MIN || value > NIX_BPF_RATE_MAX)
+   if (value < NIX_BPF_RATE(time_ns, 0, 0, 0) ||
+   value > NIX_BPF_RATE(time_ns, NIX_BPF_MAX_RATE_EXPONENT,
+NIX_BPF_MAX_RATE_MANTISSA, 0))
return 0;
 
-   if (value <= NIX_BPF_RATE(time_us, 0, 0, 0)) {
-   /* Calculate rate div_exp and mantissa using
-* the following formula:
-*
-* value = (2E6 * (256 + mantissa)
-*  / ((1 << div_exp) * 256))
-*/
-   div_exp = 0;
-   exponent = 0;
-   mantissa = NIX_BPF_MAX_RATE_MANTISSA;
-
-   while (value < (NIX_BPF_RATE_CONST / (1 << div_exp)))
-   div_exp += 1;
-
-   while (value < ((NIX_BPF_RATE_CONST * (256 + mantissa)) /
-   ((1 << div_exp) * 256)))
-   mantissa -= 1;
-   } else {
-   /* Calculate rate exponent and mantissa using
-* the following formula:
-*
-* value = (2E6 * ((256 + mantissa) << exponent)) / 256
-*
-*/
-   div_exp = 0;
-   exponent = NIX_BPF_MAX_RATE_EXPONENT;
-   mantissa = NIX_BPF_MAX_RATE_MANTISSA;
-
-   while (value < (NIX_BPF_RATE_CONST * (1 << exponent)))
-   exponent -= 1;
-
-   while (value <
-  ((NIX_BPF_RATE_CONST * ((256 + mantissa) << exponent)) /
-   256))
-   mantissa -= 1;
-   }
+   div_exp = 0;
+   exponent = NIX_BPF_MAX_RATE_EXPONENT;
+   mantissa = NIX_BPF_MAX_RATE_MANTISSA;
+
+   while (value < (NIX_BPF_RATE(time_ns, exponent, 0, 0)))
+   exponent -= 1;
+
+   while (value < (NIX_BPF_RATE(time_ns, exponent, mantissa, 0)))
+   mantissa -= 1;
 
if (div_exp > NIX_BPF_MAX_RATE_DIV_EXP ||
exponent > NIX_BPF_MAX_RATE_EXPONENT ||
@@ -94,7 +69,7 @@ meter_rate_to_nix(uint64_t value, uint64_t *exponent_p, 
uint64_t *mantissa_p,
*mantissa_p = mantissa;
 
/* Calculate real rate value */
-   return NIX_BPF_RATE(time_us, exponent, mantissa, div_exp);
+   return NIX_BPF_RATE(time_ns, exponent, mantissa, div_exp);
 }
 
 static inline uint64_t
@@ -195,11 +170,7 @@ nix_precolor_conv_table_write(struct roc_nix *roc_nix, 
uint64_t val,
int64_t *addr;
 
addr = PLT_PTR_ADD(nix->base, off);
-   /* FIXME: Currently writing to this register throwing kernel dump.
-* plt_write64(val, addr);
-*/
-   PLT_SET_USED(val);
-   PLT_SET_USED(addr);
+   plt_write64(val, addr);
 }
 
 static uint8_t
@@ -665,6 +636,7 @@ roc_nix_bpf_config(struct roc_nix *roc_nix, uint16_t id,
 
aq->prof.lmode = cfg->lmode;
aq->prof.icolor = cfg->icolor;
+   aq->prof.meter_algo = cfg->alg;
aq->prof.pc_mode = cfg->pc_mode;
aq->prof.tnl_ena = cfg->tnl_ena;
aq->prof.gc_action = cfg->action[ROC_NIX_BPF_COLOR_GREEN];
@@ -673,6 +645,7 @@ roc_nix_bpf_config(struct roc_nix *roc_nix, uint16_t id,
 
aq->prof_mask.lmode = ~(aq->prof_mask.lmode);
aq->prof_mask.icolor = ~(aq->prof_mask.icolor);
+   aq->prof_mask.meter_algo = ~(aq->prof_mask.meter_algo);
aq->prof_m

[PATCH v3 1/2] net/cnxk: update meter bpf ID in Receive Queue

2022-01-13 Thread Rakesh Kudurumalla
Patch updates the configured meter bpf ID in the receive queue
context during meter creation

Signed-off-by: Rakesh Kudurumalla 
Acked-by: Sunil Kumar Kori 
---
v3: Fix commit message spelling
 drivers/net/cnxk/cn10k_rte_flow.c  |  7 +++
 drivers/net/cnxk/cnxk_ethdev_mtr.c | 21 -
 2 files changed, 19 insertions(+), 9 deletions(-)

diff --git a/drivers/net/cnxk/cn10k_rte_flow.c 
b/drivers/net/cnxk/cn10k_rte_flow.c
index b830abe63e..529fb0e4b7 100644
--- a/drivers/net/cnxk/cn10k_rte_flow.c
+++ b/drivers/net/cnxk/cn10k_rte_flow.c
@@ -36,20 +36,20 @@ cn10k_mtr_configure(struct rte_eth_dev *eth_dev,
for (i = 0; actions[i].type != RTE_FLOW_ACTION_TYPE_END; i++) {
if (actions[i].type == RTE_FLOW_ACTION_TYPE_METER) {
mtr_conf = (const struct rte_flow_action_meter
-   *)(actions->conf);
+   *)(actions[i].conf);
mtr_id = mtr_conf->mtr_id;
is_mtr_act = true;
}
if (actions[i].type == RTE_FLOW_ACTION_TYPE_QUEUE) {
q_conf = (const struct rte_flow_action_queue
- *)(actions->conf);
+ *)(actions[i].conf);
if (is_mtr_act)
nix_mtr_rq_update(eth_dev, mtr_id, 1,
  &q_conf->index);
}
if (actions[i].type == RTE_FLOW_ACTION_TYPE_RSS) {
rss_conf = (const struct rte_flow_action_rss
-   *)(actions->conf);
+   *)(actions[i].conf);
if (is_mtr_act)
nix_mtr_rq_update(eth_dev, mtr_id,
  rss_conf->queue_num,
@@ -171,7 +171,6 @@ cn10k_flow_create(struct rte_eth_dev *eth_dev, const struct 
rte_flow_attr *attr,
return NULL;
}
}
-
for (i = 0; actions[i].type != RTE_FLOW_ACTION_TYPE_END; i++) {
if (actions[i].type == RTE_FLOW_ACTION_TYPE_METER) {
mtr = (const struct rte_flow_action_meter *)actions[i]
diff --git a/drivers/net/cnxk/cnxk_ethdev_mtr.c 
b/drivers/net/cnxk/cnxk_ethdev_mtr.c
index 39d8563826..cc783e5f86 100644
--- a/drivers/net/cnxk/cnxk_ethdev_mtr.c
+++ b/drivers/net/cnxk/cnxk_ethdev_mtr.c
@@ -35,7 +35,6 @@ static struct rte_mtr_capabilities mtr_capa = {
.chaining_n_mtrs_per_flow_max = NIX_MTR_COUNT_PER_FLOW,
.chaining_use_prev_mtr_color_supported = true,
.chaining_use_prev_mtr_color_enforced = true,
-   .meter_rate_max = NIX_BPF_RATE_MAX / 8, /* Bytes per second */
.color_aware_srtcm_rfc2697_supported = true,
.color_aware_trtcm_rfc2698_supported = true,
.color_aware_trtcm_rfc4115_supported = true,
@@ -180,13 +179,13 @@ cnxk_nix_mtr_capabilities_get(struct rte_eth_dev *dev,
  struct rte_mtr_capabilities *capa,
  struct rte_mtr_error *error)
 {
-   struct cnxk_eth_dev *eth_dev = cnxk_eth_pmd_priv(dev);
-   uint16_t count[ROC_NIX_BPF_LEVEL_MAX] = {0};
uint8_t lvl_mask = ROC_NIX_BPF_LEVEL_F_LEAF | ROC_NIX_BPF_LEVEL_F_MID |
   ROC_NIX_BPF_LEVEL_F_TOP;
+   struct cnxk_eth_dev *eth_dev = cnxk_eth_pmd_priv(dev);
+   uint16_t count[ROC_NIX_BPF_LEVEL_MAX] = {0};
struct roc_nix *nix = &eth_dev->nix;
-   int rc;
-   int i;
+   uint32_t time_unit;
+   int rc, i;
 
RTE_SET_USED(dev);
 
@@ -207,6 +206,15 @@ cnxk_nix_mtr_capabilities_get(struct rte_eth_dev *dev,
mtr_capa.meter_trtcm_rfc4115_n_max = mtr_capa.n_max;
mtr_capa.meter_policy_n_max = mtr_capa.n_max;
 
+   rc = roc_nix_bpf_timeunit_get(nix, &time_unit);
+   if (rc)
+   return rc;
+
+   mtr_capa.meter_rate_max =
+   NIX_BPF_RATE(time_unit, NIX_BPF_MAX_RATE_EXPONENT,
+NIX_BPF_MAX_RATE_MANTISSA, 0) /
+   8;
+
*capa = mtr_capa;
return 0;
 }
@@ -304,6 +312,9 @@ cnxk_nix_mtr_policy_validate(struct rte_eth_dev *dev,
if (action->type == RTE_FLOW_ACTION_TYPE_DROP)
supported[i] = true;
 
+   if (action->type == RTE_FLOW_ACTION_TYPE_VOID)
+   supported[i] = true;
+
if (!supported[i]) {
sprintf(message,
"%s action is not valid",
-- 
2.25.1



[PATCH v3 2/2] common/cnxk: update meter algorithm in band profile

2022-01-13 Thread Rakesh Kudurumalla
Patch updates meter algorithm in nix band profile
structure

Signed-off-by: Rakesh Kudurumalla 
Acked-by: Sunil Kumar Kori 
---
v3: series patch of 1/2
 drivers/common/cnxk/hw/nix.h  |  5 ---
 drivers/common/cnxk/roc_nix_bpf.c | 61 +--
 2 files changed, 17 insertions(+), 49 deletions(-)

diff --git a/drivers/common/cnxk/hw/nix.h b/drivers/common/cnxk/hw/nix.h
index dd2ebecc6a..6931f1d1d2 100644
--- a/drivers/common/cnxk/hw/nix.h
+++ b/drivers/common/cnxk/hw/nix.h
@@ -2133,11 +2133,6 @@ struct nix_lso_format {
((NIX_BPF_RATE_CONST * ((256 + (mantissa)) << (exponent))) /   \
 (((1ull << (div_exp)) * 256 * policer_timeunit)))
 
-/* Meter rate limits in Bits/Sec */
-#define NIX_BPF_RATE_MIN NIX_BPF_RATE(10, 0, 0, 0)
-#define NIX_BPF_RATE_MAX   
\
-   NIX_BPF_RATE(1, NIX_BPF_MAX_RATE_EXPONENT, NIX_BPF_MAX_RATE_MANTISSA, 0)
-
 #define NIX_BPF_DEFAULT_ADJUST_MANTISSA 511
 #define NIX_BPF_DEFAULT_ADJUST_EXPONENT 0
 
diff --git a/drivers/common/cnxk/roc_nix_bpf.c 
b/drivers/common/cnxk/roc_nix_bpf.c
index 6996a54be0..46ed91e87b 100644
--- a/drivers/common/cnxk/roc_nix_bpf.c
+++ b/drivers/common/cnxk/roc_nix_bpf.c
@@ -38,48 +38,23 @@ meter_rate_to_nix(uint64_t value, uint64_t *exponent_p, 
uint64_t *mantissa_p,
  uint64_t *div_exp_p, uint32_t timeunit_p)
 {
uint64_t div_exp, exponent, mantissa;
-   uint32_t time_us = timeunit_p;
+   uint32_t time_ns = timeunit_p;
 
/* Boundary checks */
-   if (value < NIX_BPF_RATE_MIN || value > NIX_BPF_RATE_MAX)
+   if (value < NIX_BPF_RATE(time_ns, 0, 0, 0) ||
+   value > NIX_BPF_RATE(time_ns, NIX_BPF_MAX_RATE_EXPONENT,
+NIX_BPF_MAX_RATE_MANTISSA, 0))
return 0;
 
-   if (value <= NIX_BPF_RATE(time_us, 0, 0, 0)) {
-   /* Calculate rate div_exp and mantissa using
-* the following formula:
-*
-* value = (2E6 * (256 + mantissa)
-*  / ((1 << div_exp) * 256))
-*/
-   div_exp = 0;
-   exponent = 0;
-   mantissa = NIX_BPF_MAX_RATE_MANTISSA;
-
-   while (value < (NIX_BPF_RATE_CONST / (1 << div_exp)))
-   div_exp += 1;
-
-   while (value < ((NIX_BPF_RATE_CONST * (256 + mantissa)) /
-   ((1 << div_exp) * 256)))
-   mantissa -= 1;
-   } else {
-   /* Calculate rate exponent and mantissa using
-* the following formula:
-*
-* value = (2E6 * ((256 + mantissa) << exponent)) / 256
-*
-*/
-   div_exp = 0;
-   exponent = NIX_BPF_MAX_RATE_EXPONENT;
-   mantissa = NIX_BPF_MAX_RATE_MANTISSA;
-
-   while (value < (NIX_BPF_RATE_CONST * (1 << exponent)))
-   exponent -= 1;
-
-   while (value <
-  ((NIX_BPF_RATE_CONST * ((256 + mantissa) << exponent)) /
-   256))
-   mantissa -= 1;
-   }
+   div_exp = 0;
+   exponent = NIX_BPF_MAX_RATE_EXPONENT;
+   mantissa = NIX_BPF_MAX_RATE_MANTISSA;
+
+   while (value < (NIX_BPF_RATE(time_ns, exponent, 0, 0)))
+   exponent -= 1;
+
+   while (value < (NIX_BPF_RATE(time_ns, exponent, mantissa, 0)))
+   mantissa -= 1;
 
if (div_exp > NIX_BPF_MAX_RATE_DIV_EXP ||
exponent > NIX_BPF_MAX_RATE_EXPONENT ||
@@ -94,7 +69,7 @@ meter_rate_to_nix(uint64_t value, uint64_t *exponent_p, 
uint64_t *mantissa_p,
*mantissa_p = mantissa;
 
/* Calculate real rate value */
-   return NIX_BPF_RATE(time_us, exponent, mantissa, div_exp);
+   return NIX_BPF_RATE(time_ns, exponent, mantissa, div_exp);
 }
 
 static inline uint64_t
@@ -195,11 +170,7 @@ nix_precolor_conv_table_write(struct roc_nix *roc_nix, 
uint64_t val,
int64_t *addr;
 
addr = PLT_PTR_ADD(nix->base, off);
-   /* FIXME: Currently writing to this register throwing kernel dump.
-* plt_write64(val, addr);
-*/
-   PLT_SET_USED(val);
-   PLT_SET_USED(addr);
+   plt_write64(val, addr);
 }
 
 static uint8_t
@@ -665,6 +636,7 @@ roc_nix_bpf_config(struct roc_nix *roc_nix, uint16_t id,
 
aq->prof.lmode = cfg->lmode;
aq->prof.icolor = cfg->icolor;
+   aq->prof.meter_algo = cfg->alg;
aq->prof.pc_mode = cfg->pc_mode;
aq->prof.tnl_ena = cfg->tnl_ena;
aq->prof.gc_action = cfg->action[ROC_NIX_BPF_COLOR_GREEN];
@@ -673,6 +645,7 @@ roc_nix_bpf_config(struct roc_nix *roc_nix, uint16_t id,
 
aq->prof_mask.lmode = ~(aq->prof_mask.lmode);
aq->prof_mask.icolor = ~(aq->prof_mask.icolor);
+   aq->prof_mask.meter_algo = ~(aq->prof_mask.meter_algo);
aq->prof_m

Re: [PATCH 1/1] ci: restrict concurrency

2022-01-13 Thread Josh Soref
On Thu, Jan 13, 2022, 6:42 AM Thomas Monjalon  wrote:

> Hi,
>
> The explanation should be in the patch, not the cover letter.
> Actually, you don't need a cover letter for a single patch.
> Copying it here:
> "
> dpdk is fairly expensive to build in GitHub.
>
> It's helpful to abandon old builds as soon as there's a new
> build waiting instead of wasting resources on the previous
> round.
> "
>
> 12/01/2022 07:50, Josh Soref:
> > Signed-off-by: Josh Soref 
> > ---
> > +concurrency:
> > +  group: build-${{ matrix.config.os }}-${{ matrix.config.compiler
> }}-${{ matrix.config.library }}-${{ matrix.config.cross }}-${{
> matrix.config.mini }}-${{ github.event.pull_request.number || github.ref }}
> > +  cancel-in-progress: true
>
> The goal of the CI is to catch any issue in a submitted patch.
> Is your change cancelling a test of a patch when another one is submitted?
>

If it's on the same branch or if it's in the same pull request yes,
otherwise, no.

>


RE: [PATCH 3/8] ethdev: add mbuf dynfield for incomplete IP reassembly

2022-01-13 Thread Akhil Goyal
Hi Konstantin,
> > > Hardware IP reassembly may be incomplete for multiple reasons like
> > > reassembly timeout reached, duplicate fragments, etc.
> > > To save application cycles to process these packets again, a new
> > > mbuf ol_flag (RTE_MBUF_F_RX_IPREASSEMBLY_INCOMPLETE) is added to
> > > show that the mbuf received is not reassembled properly.
> >
> > If we use dynfiled for data, why not use dynflag for
> > RTE_MBUF_F_RX_IPREASSEMBLY_INCOMPLETE?
> > That way we can avoid introduced hardcoded (always defined) flags for that
> > case.
> 
> I have not looked into using dynflag. Will explore if it can be used.
The intent of adding this feature is to reduce application cycles for IP 
reassembly.
But if we use dynflag, it will take a lot of cycles to check if dyn flag is set 
or not.
As I understand, it first needs to be looked up in a linked list and then 
checked.
And this will be checked for each packet even if there is no reassembly 
involved.


RE: [PATCH 2/8] ethdev: add dev op for IP reassembly configuration

2022-01-13 Thread Akhil Goyal
Hi Konstantin,

> > > > > > Another question - if we have reassembly_conf_set() would it make
> sense
> > to
> > > > > > have also reassembly_conf_get?
> > > > > > So user can retrieve current ip_reassembly config values?
> > > > > >
> > > > > The set/supported values can be retrieved using rte_eth_dev_info ::
> > > > reass_capa
> > > >
> > > > Hmm, I thought rte_eth_dev_info :: reass_capa reports
> > > > max supported values, not currently set values.
> > > > Did I misunderstand something?
> > > >
> > > Reassembly configuration is expected to be a one-time setting and is not
> > expected
> > > to change multiple times in the application.
> > > You are correct that rte_eth_dev_info :: reass_capa reports max supported
> > values
> > > by the PMD.
> > > But if somebody uses the _set API, dev_info values will be overwritten.
> > > However, a get API can be added, if we have some use case.
> > > IMO, we can add it later if it will be required.
> >
> > Basically you forbid user to reconfigure this feature
> > during application life-time?
> > That sounds like a really strange approach to me and
> > Probably will affect its usability in a negative way.
> > Wonder why it has to be that restrictive?
> > Also with the model you suggest, what would happen after user will do:
> > dev_stop(); dev_configure();?
> > Would rte_eth_dev_info :: reass_capa be reset to initial values,
> > or user values will be preserved, or ...?
> >
> I am not restricting the user to not reconfigure the feature.
> When dev_configure() is called again after dev_stop(), it will reset the 
> previously
> set values to max ones.
> However, if you insist the get API can be added. No strong opinion on that.

On another thought, setting dev_info :: reass_capa to a max value and not 
changing it
in reassembly_conf_set() will make more sense.
The most common case would be to get the max values and, if they are not good
enough for the application, set lesser values using the new API.
I do not see a use case to get the current values set. However, it may be used 
for debugging
some driver issue related to these values. But, I believe that can be managed 
internally
in the PMD. Do you suspect any other use case for get API?



RE: [PATCH] bus/ifpga: remove useless check while browsing devices

2022-01-13 Thread Xu, Rosen
Hi,

Thanks.

> -Original Message-
> From: Maxime Gouin 
> Sent: Wednesday, January 05, 2022 18:27
> To: dev@dpdk.org
> Cc: Maxime Gouin ; Xu, Rosen
> ; Zhang, Qi Z ; Zhang, Tianfei
> ; Olivier Matz 
> Subject: [PATCH] bus/ifpga: remove useless check while browsing devices
> 
> reported by code analysis tool C++test (version 10.4):
> 
> > /build/dpdk-20.11/drivers/bus/ifpga/ifpga_bus.c
> > 67Condition "afu_dev" is always evaluated to true
> > 81Condition "afu_dev" is always evaluated to true
> 
> The "for" loop already checks that afu_dev is not NULL.
> 
> Fixes: 05fa3d4a6539 ("bus/ifpga: add Intel FPGA bus library")
> 
> Signed-off-by: Maxime Gouin 
> Reviewed-by: Olivier Matz 
> ---
>  drivers/bus/ifpga/ifpga_bus.c | 6 ++
>  1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/bus/ifpga/ifpga_bus.c b/drivers/bus/ifpga/ifpga_bus.c
> index cbc680928486..c5c8bbd57219 100644
> --- a/drivers/bus/ifpga/ifpga_bus.c
> +++ b/drivers/bus/ifpga/ifpga_bus.c
> @@ -64,8 +64,7 @@ ifpga_find_afu_dev(const struct rte_rawdev *rdev,
>   struct rte_afu_device *afu_dev = NULL;
> 
>   TAILQ_FOREACH(afu_dev, &ifpga_afu_dev_list, next) {
> - if (afu_dev &&
> - afu_dev->rawdev == rdev &&
> + if (afu_dev->rawdev == rdev &&
>   !ifpga_afu_id_cmp(&afu_dev->id, afu_id))
>   return afu_dev;
>   }
> @@ -78,8 +77,7 @@ rte_ifpga_find_afu_by_name(const char *name)
>   struct rte_afu_device *afu_dev = NULL;
> 
>   TAILQ_FOREACH(afu_dev, &ifpga_afu_dev_list, next) {
> - if (afu_dev &&
> - !strcmp(afu_dev->device.name, name))
> + if (!strcmp(afu_dev->device.name, name))
>   return afu_dev;
>   }
>   return NULL;
> --
> 2.30.2

Acked-by: Rosen Xu 


[PATCH] net/mlx5: fix maximum packet headers size for TSO

2022-01-13 Thread Alexander Kozyrev
The maximum packet headers size for TSO is calculated as a sum of
Ethernet, VLAN, IPv6 and TCP headers (plus inner headers).
The rationale behind choosing IPv6 and TCP is that their headers
are bigger than IPv4 and UDP headers respectively, giving us the maximum
possible headers size. But this is not true for L3 headers.
IPv4 header size (20 bytes) is smaller than IPv6 header size
(40 bytes) only in the default case. There are up to 10
optional header fields called Options in case IHL > 5.
This means that the maximum size of the IPv4 header is 60 bytes.

Choosing the wrong maximum packet headers size causes an inability
to transmit multi-segment TSO packets with IPv4 Options present.
The PMD checks that it is possible to inline all the packet headers,
and this check fails when the headers size exceeds the expected maximum.
The maximum packet headers size was set to 192 bytes before,
but its value has been reduced during Tx path refactor activity.
Restore the proper maximum packet headers size for TSO.

Fixes: 50724e1 ("net/mlx5: update Tx definitions")
Cc: sta...@dpdk.org

Signed-off-by: Alexander Kozyrev 
Acked-by: Viacheslav Ovsiienko 
---
 drivers/net/mlx5/mlx5_defs.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/mlx5/mlx5_defs.h b/drivers/net/mlx5/mlx5_defs.h
index 36b384fa08..2d48fde010 100644
--- a/drivers/net/mlx5/mlx5_defs.h
+++ b/drivers/net/mlx5/mlx5_defs.h
@@ -50,7 +50,7 @@
 #define MLX5_MAX_XSTATS 32
 
 /* Maximum Packet headers size (L2+L3+L4) for TSO. */
-#define MLX5_MAX_TSO_HEADER (128u + 34u)
+#define MLX5_MAX_TSO_HEADER 192U
 
 /* Inline data size required by NICs. */
 #define MLX5_INLINE_HSIZE_NONE 0
-- 
2.18.2



RE: [PATCH 3/8] ethdev: add mbuf dynfield for incomplete IP reassembly

2022-01-13 Thread Ananyev, Konstantin
Hi Akhil,

> Hi Konstantin,
> > > > Hardware IP reassembly may be incomplete for multiple reasons like
> > > > reassembly timeout reached, duplicate fragments, etc.
> > > > To save application cycles to process these packets again, a new
> > > > mbuf ol_flag (RTE_MBUF_F_RX_IPREASSEMBLY_INCOMPLETE) is added to
> > > > show that the mbuf received is not reassembled properly.
> > >
> > > If we use dynfiled for data, why not use dynflag for
> > > RTE_MBUF_F_RX_IPREASSEMBLY_INCOMPLETE?
> > > That way we can avoid introduced hardcoded (always defined) flags for that
> > > case.
> >
> > I have not looked into using dynflag. Will explore if it can be used.
> The intent of adding this feature is to reduce application cycles for IP 
> reassembly.
> But if we use dynflag, it will take a lot of cycles to check if dyn flag is 
> set or not.
> As I understand, it first need to be looked up in a linked list and then 
> checked.
> And this will be checked for each packet even if there is no reassembly 
> involved.

No, I don't think that is the correct understanding.
For a dyn-flag it is the same approach as for a dyn-field.
At init time it selects the bit which will be used and returns its value to the
user.
Then the user will set/check that bit at runtime.
So no linked-list walks at runtime.
All you are missing compared to hard-coded values: compiler optimizations.




[PATCH] net/mlx5: fix wrong MPRQ WQE size assertion

2022-01-13 Thread Alexander Kozyrev
Preparation of the stride size and the number of strides for
Multi-Packet RQ was updated recently to accommodate the hardware
limitation about minimum WQE size. The wrong assertion was
introduced to ensure this limitation is met. Assert that the
configured WQE size is not less than the minimum supported size.

Fixes: 219f08a ("net/mlx5: fix missing adjustment MPRQ stride devargs")

Signed-off-by: Alexander Kozyrev 
Acked-by: Viacheslav Ovsiienko 
---
 drivers/net/mlx5/mlx5_rxq.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index ed21fbba1e..105e094fdf 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -1619,7 +1619,7 @@ mlx5_mprq_prepare(struct rte_eth_dev *dev, uint16_t idx, 
uint16_t desc,
RTE_BIT32(log_def_stride_size));
log_stride_wqe_size = log_def_stride_num + log_def_stride_size;
}
-   MLX5_ASSERT(log_stride_wqe_size < config->mprq.log_min_stride_wqe_size);
+   MLX5_ASSERT(log_stride_wqe_size >= 
config->mprq.log_min_stride_wqe_size);
if (desc <= RTE_BIT32(*actual_log_stride_num))
goto unsupport;
if (min_mbuf_size > RTE_BIT32(log_stride_wqe_size)) {
-- 
2.18.2



RE: [PATCH 2/8] ethdev: add dev op for IP reassembly configuration

2022-01-13 Thread Ananyev, Konstantin



> > > > > > > Another question - if we have reassembly_conf_set() would it make
> > sense
> > > to
> > > > > > > have also reassembly_conf_get?
> > > > > > > So user can retrieve current ip_reassembly config values?
> > > > > > >
> > > > > > The set/supported values can be retrieved using rte_eth_dev_info ::
> > > > > reass_capa
> > > > >
> > > > > Hmm, I thought rte_eth_dev_info :: reass_capa reports
> > > > > max supported values, not currently set values.
> > > > > Did I misunderstand something?
> > > > >
> > > > Reassembly configuration is expected to be a one-time setting and is not
> > > expected
> > > > to change multiple times in the application.
> > > > You are correct that rte_eth_dev_info :: reass_capa reports max 
> > > > supported
> > > values
> > > > by the PMD.
> > > > But if somebody uses the _set API, dev_info values will be overwritten.
> > > > However, a get API can be added, if we have some use case.
> > > > IMO, we can add it later if it will be required.
> > >
> > > Basically you forbid user to reconfigure this feature
> > > during application life-time?
> > > That sounds like a really strange approach to me and
> > > Probably will affect its usability in a negative way.
> > > Wonder why it has to be that restrictive?
> > > Also with the model you suggest, what would happen after user will do:
> > > dev_stop(); dev_configure();?
> > > Would rte_eth_dev_info :: reass_capa be reset to initial values,
> > > or user values will be preserved, or ...?
> > >
> > I am not restricting the user to not reconfigure the feature.
> > When dev_configure() is called again after dev_stop(), it will reset the 
> > previously
> > set values to max ones.
> > However, if you insist the get API can be added. No strong opinion on that.
> 
> On another thought, setting dev_info :: reass_capa to a max value and not 
> changing it
> in reassembly_conf_set() will make more sense.

Yes, agree.

> The most common case, would be to get the max values and if they are not good
> Enough for the application, set lesser values using the new API.
> I do not see a use case to get the current values set. However, it may be 
> used for debugging
> some driver issue related to these values. But, I believe that can be managed 
> internally
> in the PMD. Do you suspect any other use case for get API?

I think it would be really valuable for both the user and the ethdev layer to
have an ability to get the values that are currently in place.



RE: [PATCH 3/8] ethdev: add mbuf dynfield for incomplete IP reassembly

2022-01-13 Thread Akhil Goyal
> Hi Akhil,
> 
> > Hi Konstantin,
> > > > > Hardware IP reassembly may be incomplete for multiple reasons like
> > > > > reassembly timeout reached, duplicate fragments, etc.
> > > > > To save application cycles to process these packets again, a new
> > > > > mbuf ol_flag (RTE_MBUF_F_RX_IPREASSEMBLY_INCOMPLETE) is added
> to
> > > > > show that the mbuf received is not reassembled properly.
> > > >
> > > > If we use dynfiled for data, why not use dynflag for
> > > > RTE_MBUF_F_RX_IPREASSEMBLY_INCOMPLETE?
> > > > That way we can avoid introduced hardcoded (always defined) flags for
> that
> > > > case.
> > >
> > > I have not looked into using dynflag. Will explore if it can be used.
> > The intent of adding this feature is to reduce application cycles for IP
> reassembly.
> > But if we use dynflag, it will take a lot of cycles to check if dyn flag is 
> > set or
> not.
> > As I understand, it first need to be looked up in a linked list and then 
> > checked.
> > And this will be checked for each packet even if there is no reassembly
> involved.
> 
> No, I don't think that is the correct understanding.
> For a dyn-flag it is the same approach as for a dyn-field.
> At init time it selects the bit which will be used and returns its value to
> the user.
> Then the user will set/check that bit at runtime.
> So there are no linked-list walks at runtime.
> All you are missing compared to hard-coded values is compiler optimizations.
> 
Ok, got it. rte_mbuf_dynflag_lookup() needs to happen only for the first mbuf.
I was checking is_timestamp_enabled() in test-pmd and didn't see that the
dynflag was a static variable, so I thought the lookup was happening for each
packet.



Re: ETH_RSS_IP only does not seem to balance traffic

2022-01-13 Thread Bruce Richardson
On Thu, Jan 13, 2022 at 12:52:04AM +0900, Yasuhiro Ohara wrote:
> 
> Hi,
> 
> My system developer friend recently ran into a problem where
> l3fwd does not appear to receive balanced traffic on Intel XL710,
> but it is resolved when the attached patch is applied.
> 
> -.rss_hf = ETH_RSS_IP,
> +.rss_hf = ETH_RSS_IP | ETH_RSS_TCP | ETH_RSS_UDP,
> 
> IIRC I ran into a similar problem 3 or 4 years back,
> but didn't report then because I believed I was doing something silly.
> But since my friend is an experienced engineer, I feel like
> it may be better (for the community) to ask this in the list.
> 
> We are using dpdk-stable-18.11.6 and igb_uio.
> 
> What are we doing wrong?
> 
> If it is not a FAQ, I can test it again with more recent stable,
> and will report the details.
> 
For XL710 NICs, I believe that ETH_RSS_IP load balances only IP frames that
do not also have TCP or UDP headers. Adding i40e driver maintainer on CC
to comment further.

/Bruce


[PATCH] net/bnxt: add back dependency to virt kmods

2022-01-13 Thread Geoffrey Le Gourriérec
During a large refactoring sweep for 21.11, a previous commit
removed the dependency the bnxt driver had on Linux virtual
bus drivers, such as vfio-pci. This breaks port detection.

This patch adds the kmod dependency back as it was.

Fixes: 295968d17407 ("ethdev: add namespace")
Signed-off-by: Geoffrey Le Gourriérec 
---
 drivers/net/bnxt/bnxt_ethdev.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index f79f33ab4e17..3ca6b7be7b6a 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -6302,4 +6302,5 @@ bool is_bnxt_supported(struct rte_eth_dev *dev)
 RTE_LOG_REGISTER_SUFFIX(bnxt_logtype_driver, driver, NOTICE);
 RTE_PMD_REGISTER_PCI(net_bnxt, bnxt_rte_pmd);
 RTE_PMD_REGISTER_PCI_TABLE(net_bnxt, bnxt_pci_id_map);
+RTE_PMD_REGISTER_KMOD_DEP(net_bnxt, "* igb_uio | uio_pci_generic | vfio-pci");
 
-- 
2.30.2



Re: [PATCH] build: add missing arch define for Arm

2022-01-13 Thread Thomas Monjalon
17/12/2021 09:54, Ruifeng Wang:
> As per design document, RTE_ARCH is the name of the architecture.
> However, the definition was missing on Arm with meson build.
> It impacts applications that refer to this string.
> 
> Added for Arm builds.
> 
> Fixes: b1d48c41189a ("build: support ARM with meson")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Ruifeng Wang 
> ---
>  ['RTE_ARCH_ARMv8_AARCH32', true],
> +['RTE_ARCH', 'arm64_aarch32'],

Why not armv8_aarch32?

[...]
>  dpdk_conf.set('RTE_ARCH_ARMv7', true)
> +dpdk_conf.set('RTE_ARCH', 'armv7')
[...]
>  # armv8 build
> +dpdk_conf.set('RTE_ARCH', 'arm64')

Why not armv8?

What I prefer the most in silicon industry is the naming craziness :)




[PATCH 0/6] allow more DPDK libraries to be disabled on build

2022-01-13 Thread Bruce Richardson
A common request on-list has been to allow more of the DPDK build to be 
disabled by those who are
doing their own builds and only use a subset of the libraries. To this end, 
this patchset makes some
infrastructure changes [first two patches] to make it easier to have libraries 
disabled, and then
adds six libraries to the "optional" list.

Bruce Richardson (6):
  lib: allow recursive disabling of libs in build
  app/test: link unit test binary against all available libs
  build: add node library to optional list
  build: add flow classification library to optional list
  build: add "packet framework" libs to optional list
  build: add cfgfile library to optional list

 app/test/meson.build | 74 
 lib/meson.build  | 30 --
 2 files changed, 40 insertions(+), 64 deletions(-)

--
2.32.0



[PATCH 1/6] lib: allow recursive disabling of libs in build

2022-01-13 Thread Bruce Richardson
Align the code in lib/meson.build with that in drivers/meson.build to
enable recursive disabling of libraries, i.e. if library b depends on
library a, disable library b if a is disabled (either explicitly or
implicitly). This allows libraries to be optional even if other DPDK
libs depend on them, something that was not previously possible.

Signed-off-by: Bruce Richardson 
---
 lib/meson.build | 24 +++-
 1 file changed, 15 insertions(+), 9 deletions(-)

diff --git a/lib/meson.build b/lib/meson.build
index fbaa6ef7c2..af4662e942 100644
--- a/lib/meson.build
+++ b/lib/meson.build
@@ -134,23 +134,29 @@ foreach l:libraries
                warning('Library name, "@0@", and directory name, "@1@", do not match'.format(name, l))
 endif
 
-if not build
-dpdk_libs_disabled += name
-set_variable(name.underscorify() + '_disable_reason', reason)
-continue
-endif
-
 shared_deps = ext_deps
 static_deps = ext_deps
 foreach d:deps
+if not build
+break
+endif
 if not is_variable('shared_rte_' + d)
-error('Missing internal dependency "@0@" for @1@ [@2@]'
+build = false
+reason = 'missing internal dependency, "@0@"'.format(d)
+message('Disabling @1@ [@2@]: missing internal dependency "@0@"'
 .format(d, name, 'lib/' + l))
+else
+shared_deps += [get_variable('shared_rte_' + d)]
+static_deps += [get_variable('static_rte_' + d)]
 endif
-shared_deps += [get_variable('shared_rte_' + d)]
-static_deps += [get_variable('static_rte_' + d)]
 endforeach
 
+if not build
+dpdk_libs_disabled += name
+set_variable(name.underscorify() + '_disable_reason', reason)
+continue
+endif
+
 enabled_libs += name
 dpdk_conf.set('RTE_LIB_' + name.to_upper(), 1)
 install_headers(headers)
-- 
2.32.0



[PATCH 2/6] app/test: link unit test binary against all available libs

2022-01-13 Thread Bruce Richardson
Rather than maintaining a list of the libraries the unit tests need, and
having to conditionally include/omit optional libs from the list, we can
just link against all available libraries, simplifying the code
considerably.

Signed-off-by: Bruce Richardson 
---
 app/test/meson.build | 47 +---
 1 file changed, 1 insertion(+), 46 deletions(-)

diff --git a/app/test/meson.build b/app/test/meson.build
index 344a609a4d..9919de4307 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -157,39 +157,7 @@ test_sources = files(
 'virtual_pmd.c',
 )
 
-test_deps = [
-'acl',
-'bus_pci',
-'bus_vdev',
-'bpf',
-'cfgfile',
-'cmdline',
-'cryptodev',
-'distributor',
-'dmadev',
-'efd',
-'ethdev',
-'eventdev',
-'fib',
-'flow_classify',
-'graph',
-'hash',
-'ipsec',
-'lpm',
-'member',
-'node',
-'pipeline',
-'port',
-'rawdev',
-'rcu',
-'reorder',
-'rib',
-'ring',
-'security',
-'stack',
-'telemetry',
-'timer',
-]
+test_deps = enabled_libs
 
 # Each test is marked with flag true/false
 # to indicate whether it can run in no-huge mode.
@@ -380,7 +348,6 @@ if dpdk_conf.has('RTE_EVENT_SKELETON')
 test_deps += 'event_skeleton'
 endif
 if dpdk_conf.has('RTE_LIB_METRICS')
-test_deps += 'metrics'
 test_sources += ['test_metrics.c']
 fast_tests += [['metrics_autotest', true]]
 endif
@@ -410,17 +377,14 @@ if dpdk_conf.has('RTE_NET_RING')
 perf_test_names += 'ring_pmd_perf_autotest'
 fast_tests += [['event_eth_tx_adapter_autotest', false]]
 if dpdk_conf.has('RTE_LIB_BITRATESTATS')
-test_deps += 'bitratestats'
 test_sources += 'test_bitratestats.c'
 fast_tests += [['bitratestats_autotest', true]]
 endif
 if dpdk_conf.has('RTE_LIB_LATENCYSTATS')
-test_deps += 'latencystats'
 test_sources += 'test_latencystats.c'
 fast_tests += [['latencystats_autotest', true]]
 endif
 if dpdk_conf.has('RTE_LIB_PDUMP')
-test_deps += 'pdump'
 test_sources += 'test_pdump.c'
 fast_tests += [['pdump_autotest', true]]
 endif
@@ -434,18 +398,10 @@ endif
 if dpdk_conf.has('RTE_HAS_LIBPCAP')
 ext_deps += pcap_dep
 if dpdk_conf.has('RTE_LIB_PCAPNG')
-test_deps += 'pcapng'
 test_sources += 'test_pcapng.c'
 endif
 endif
 
-if dpdk_conf.has('RTE_LIB_POWER')
-test_deps += 'power'
-endif
-if dpdk_conf.has('RTE_LIB_KNI')
-test_deps += 'kni'
-endif
-
 if cc.has_argument('-Wno-format-truncation')
 cflags += '-Wno-format-truncation'
 endif
@@ -462,7 +418,6 @@ if dpdk_conf.has('RTE_LIB_COMPRESSDEV')
 if compress_test_dep.found()
 test_dep_objs += compress_test_dep
 test_sources += 'test_compressdev.c'
-test_deps += 'compressdev'
 fast_tests += [['compressdev_autotest', false]]
 endif
 endif
-- 
2.32.0



[PATCH 3/6] build: add node library to optional list

2022-01-13 Thread Bruce Richardson
Allow the 'node' library to be disabled in builds

Signed-off-by: Bruce Richardson 
---
 lib/meson.build | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/meson.build b/lib/meson.build
index af4662e942..dd20fe70a6 100644
--- a/lib/meson.build
+++ b/lib/meson.build
@@ -74,6 +74,7 @@ optional_libs = [
 'jobstats',
 'latencystats',
 'metrics',
+'node',
 'pdump',
 'power',
 'vhost',
-- 
2.32.0



[PATCH 4/6] build: add flow classification library to optional list

2022-01-13 Thread Bruce Richardson
Add the flow_classify library to the list of optional libraries, and
ensure tests can build with it disabled.

Signed-off-by: Bruce Richardson 
---
 app/test/meson.build | 7 +--
 lib/meson.build  | 1 +
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/app/test/meson.build b/app/test/meson.build
index 9919de4307..a92dd0c1f0 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -65,7 +65,6 @@ test_sources = files(
 'test_fib6.c',
 'test_fib6_perf.c',
 'test_func_reentrancy.c',
-'test_flow_classify.c',
 'test_graph.c',
 'test_graph_perf.c',
 'test_hash.c',
@@ -194,7 +193,6 @@ fast_tests = [
 ['fib_autotest', true],
 ['fib6_autotest', true],
 ['func_reentrancy_autotest', false],
-['flow_classify_autotest', false],
 ['hash_autotest', true],
 ['interrupt_autotest', true],
 ['ipfrag_autotest', false],
@@ -347,6 +345,11 @@ endif
 if dpdk_conf.has('RTE_EVENT_SKELETON')
 test_deps += 'event_skeleton'
 endif
+
+if dpdk_conf.has('RTE_LIB_FLOW_CLASSIFY')
+test_sources += 'test_flow_classify.c'
+fast_tests += [['flow_classify_autotest', false]]
+endif
 if dpdk_conf.has('RTE_LIB_METRICS')
 test_sources += ['test_metrics.c']
 fast_tests += [['metrics_autotest', true]]
diff --git a/lib/meson.build b/lib/meson.build
index dd20fe70a6..ede5199374 100644
--- a/lib/meson.build
+++ b/lib/meson.build
@@ -67,6 +67,7 @@ libraries = [
 
 optional_libs = [
 'bitratestats',
+'flow_classify',
 'gpudev',
 'gro',
 'gso',
-- 
2.32.0



[PATCH 5/6] build: add "packet framework" libs to optional list

2022-01-13 Thread Bruce Richardson
Add the port, table and pipeline libraries - collectively often known as
the "packet framework" - to the list of optional libraries, and
ensure tests can build with them disabled.

Signed-off-by: Bruce Richardson 
---
 app/test/meson.build | 20 +---
 lib/meson.build  |  3 +++
 2 files changed, 16 insertions(+), 7 deletions(-)

diff --git a/app/test/meson.build b/app/test/meson.build
index a92dd0c1f0..ba62e5e98c 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -135,12 +135,6 @@ test_sources = files(
 'test_stack.c',
 'test_stack_perf.c',
 'test_string_fns.c',
-'test_table.c',
-'test_table_acl.c',
-'test_table_combined.c',
-'test_table_pipeline.c',
-'test_table_ports.c',
-'test_table_tables.c',
 'test_tailq.c',
 'test_thash.c',
 'test_thash_perf.c',
@@ -227,7 +221,6 @@ fast_tests = [
 ['stack_autotest', false],
 ['stack_lf_autotest', false],
 ['string_autotest', true],
-['table_autotest', true],
 ['tailq_autotest', true],
 ['ticketlock_autotest', true],
 ['timer_autotest', false],
@@ -358,6 +351,19 @@ if dpdk_conf.has('RTE_LIB_TELEMETRY')
 test_sources += ['test_telemetry_json.c', 'test_telemetry_data.c']
    fast_tests += [['telemetry_json_autotest', true], ['telemetry_data_autotest', true]]
 endif
+if dpdk_conf.has('RTE_LIB_PIPELINE')
+# pipeline lib depends on port and table libs, so those must be present
+# if pipeline library is.
+test_sources += [
+'test_table.c',
+'test_table_acl.c',
+'test_table_combined.c',
+'test_table_pipeline.c',
+'test_table_ports.c',
+'test_table_tables.c',
+]
+fast_tests += [['table_autotest', true]]
+endif
 
 # The following linkages of drivers are required because
 # they are used via a driver-specific API.
diff --git a/lib/meson.build b/lib/meson.build
index ede5199374..dcc1b4d835 100644
--- a/lib/meson.build
+++ b/lib/meson.build
@@ -77,7 +77,10 @@ optional_libs = [
 'metrics',
 'node',
 'pdump',
+'pipeline',
+'port',
 'power',
+'table',
 'vhost',
 ]
 
-- 
2.32.0



[PATCH 6/6] build: add cfgfile library to optional list

2022-01-13 Thread Bruce Richardson
Allow disabling of the cfgfile library in builds.

Signed-off-by: Bruce Richardson 
---
 lib/meson.build | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/meson.build b/lib/meson.build
index dcc1b4d835..8e5acd7819 100644
--- a/lib/meson.build
+++ b/lib/meson.build
@@ -67,6 +67,7 @@ libraries = [
 
 optional_libs = [
 'bitratestats',
+'cfgfile',
 'flow_classify',
 'gpudev',
 'gro',
-- 
2.32.0



Re: [PATCH 0/6] allow more DPDK libraries to be disabled on build

2022-01-13 Thread Stephen Hemminger
On Thu, 13 Jan 2022 17:39:12 +
Bruce Richardson  wrote:

> A common request on-list has been to allow more of the DPDK build to be 
> disabled by those who are
> doing their own builds and only use a subset of the libraries. To this end, 
> this patchset makes some
> infrastructure changes [first two patches] to make it easier to have 
> libraries disabled, and then
> adds six libraries to the "optional" list.
> 
> Bruce Richardson (6):
>   lib: allow recursive disabling of libs in build
>   app/test: link unit test binary against all available libs
>   build: add node library to optional list
>   build: add flow classification library to optional list
>   build: add "packet framework" libs to optional list
>   build: add cfgfile library to optional list
> 
>  app/test/meson.build | 74 
>  lib/meson.build  | 30 --
>  2 files changed, 40 insertions(+), 64 deletions(-)
> 
> --
> 2.32.0
> 

Acked-by: Stephen Hemminger 


RE: [PATCH 3/8] ethdev: add mbuf dynfield for incomplete IP reassembly

2022-01-13 Thread Ananyev, Konstantin



> > > > >
> > > > > +/**
> > > > > + * In case of IP reassembly offload failure, ol_flags in mbuf will 
> > > > > be set
> > > > > + * with RTE_MBUF_F_RX_IPREASSEMBLY_INCOMPLETE and packets will
> > be
> > > > returned
> > > > > + * without alteration. The application can retrieve the attached 
> > > > > fragments
> > > > > + * using mbuf dynamic field.
> > > > > + */
> > > > > +typedef struct {
> > > > > + /**
> > > > > +  * Next fragment packet. Application should fetch dynamic field 
> > > > > of
> > > > > +  * each fragment until a NULL is received and nb_frags is 0.
> > > > > +  */
> > > > > + struct rte_mbuf *next_frag;
> > > > > + /** Time spent(in ms) by HW in waiting for further fragments. */
> > > > > + uint16_t time_spent;
> > > > > + /** Number of more fragments attached in mbuf dynamic fields. */
> > > > > + uint16_t nb_frags;
> > > > > +} rte_eth_ip_reass_dynfield_t;
> > > >
> > > >
> > > > Looks like a bit of overkill to me:
> > > > We already have 'next' and 'nb_frags' fields inside the mbuf,
> > > > why can't they be used here? Why are separate ones necessary?
> > > >
> > > The next and nb_frags in the mbuf are for segmented buffers and not IP
> > > fragments.
> > > But here we will have separate mbufs in each dynfield, denoting each of the
> > > fragments, which may themselves have further segmented buffers.
> >
> > Makes sense, thanks for explanation.
> > Though in that case just 'struct rte_mbuf *next_frag' might be enough
> > (the user will walk through the list till mbuf->next_frag != NULL)?
> > The reason I am asking: current sizeof(rte_eth_ip_reass_dynfield_t) is 16B,
> > which is quite a lot for an mbuf, especially considering that it has to be
> > a contiguous 16B.
> > Making it smaller (8B) or even splitting it into two fields (8+4) will give
> > it more chances to coexist with other dynfields.
> 
> Even if we drop nb_frags, we will be left with the uint16_t time_spent.
> Are you suggesting using a separate dynfield altogether for the 2 bytes of
> time_spent?

Yes, that was my thought - split it into two separate fields, if possible.
 
 




Re: ETH_RSS_IP only does not seem to balance traffic

2022-01-13 Thread Yasuhiro Ohara


That makes sense. Thank you.
It would be great to have further comments from the maintainer.

For the integrity of the RTE framework, would it be better for us
to have a consistent meaning for the ETH_RSS_XXX flags?
Do the other drivers act differently?

Best regards,
Yasu

From: Bruce Richardson 
Subject: Re: ETH_RSS_IP only does not seem to balance traffic
Date: Thu, 13 Jan 2022 15:05:53 +
Message-ID: 

> On Thu, Jan 13, 2022 at 12:52:04AM +0900, Yasuhiro Ohara wrote:
>> 
>> Hi,
>> 
>> My system developer friend recently ran into a problem where
>> l3fwd does not appear to receive balanced traffic on Intel XL710,
>> but it is resolved when the attached patch is applied.
>> 
>> -.rss_hf = ETH_RSS_IP,
>> +.rss_hf = ETH_RSS_IP | ETH_RSS_TCP | ETH_RSS_UDP,
>> 
>> IIRC I ran into a similar problem 3 or 4 years back,
>> but didn't report then because I believed I was doing something silly.
>> But since my friend is an experienced engineer, I feel like
>> it may be better (for the community) to ask this in the list.
>> 
>> We are using dpdk-stable-18.11.6 and igb_uio.
>> 
>> What are we doing wrong?
>> 
>> If it is not a FAQ, I can test it again with more recent stable,
>> and will report the details.
>> 
> For XL710 NICs, I believe that ETH_RSS_IP load balances only IP frames that
> do not also have TCP or UDP headers. Adding i40e driver maintainer on CC
> to comment further.
> 
> /Bruce
> 


RE: ETH_RSS_IP only does not seem to balance traffic

2022-01-13 Thread Xing, Beilei



> -Original Message-
> From: Richardson, Bruce 
> Sent: Thursday, January 13, 2022 11:06 PM
> To: Yasuhiro Ohara 
> Cc: dev@dpdk.org; Xing, Beilei 
> Subject: Re: ETH_RSS_IP only does not seem to balance traffic
> 
> On Thu, Jan 13, 2022 at 12:52:04AM +0900, Yasuhiro Ohara wrote:
> >
> > Hi,
> >
> > My system developer friend recently ran into a problem where l3fwd
> > does not appear to receive balanced traffic on Intel XL710, but it is
> > resolved when the attached patch is applied.
> >
> > -.rss_hf = ETH_RSS_IP,
> > +.rss_hf = ETH_RSS_IP | ETH_RSS_TCP | ETH_RSS_UDP,
> >
> > IIRC I ran into a similar problem 3 or 4 years back, but didn't report
> > then because I believed I was doing something silly.
> > But since my friend is an experienced engineer, I feel like it may be
> > better (for the community) to ask this in the list.
> >
> > We are using dpdk-stable-18.11.6 and igb_uio.
> >
> > What are we doing wrong?
> >
> > If it is not a FAQ, I can test it again with more recent stable, and
> > will report the details.
> >
> For XL710 NICs, I believe that ETH_RSS_IP load balances only IP frames that do
> not also have TCP or UDP headers. Adding i40e driver maintainer on CC to
> comment further.

Yes, Bruce is right. For XL710, ETH_RSS_IP doesn't cover TCP and UDP packets.

> 
> /Bruce


Re: ETH_RSS_IP only does not seem to balance traffic

2022-01-13 Thread Yasuhiro Ohara


Thank you for the confirmation!

Best regards,
Yasu

From: "Xing, Beilei" 
Subject: RE: ETH_RSS_IP only does not seem to balance traffic
Date: Fri, 14 Jan 2022 01:44:45 +
Message-ID: 


> 
> 
>> -Original Message-
>> From: Richardson, Bruce 
>> Sent: Thursday, January 13, 2022 11:06 PM
>> To: Yasuhiro Ohara 
>> Cc: dev@dpdk.org; Xing, Beilei 
>> Subject: Re: ETH_RSS_IP only does not seem to balance traffic
>> 
>> On Thu, Jan 13, 2022 at 12:52:04AM +0900, Yasuhiro Ohara wrote:
>> >
>> > Hi,
>> >
>> > My system developer friend recently ran into a problem where l3fwd
>> > does not appear to receive balanced traffic on Intel XL710, but it is
>> > resolved when the attached patch is applied.
>> >
>> > -.rss_hf = ETH_RSS_IP,
>> > +.rss_hf = ETH_RSS_IP | ETH_RSS_TCP | ETH_RSS_UDP,
>> >
>> > IIRC I ran into a similar problem 3 or 4 years back, but didn't report
>> > then because I believed I was doing something silly.
>> > But since my friend is an experienced engineer, I feel like it may be
>> > better (for the community) to ask this in the list.
>> >
>> > We are using dpdk-stable-18.11.6 and igb_uio.
>> >
>> > What are we doing wrong?
>> >
>> > If it is not a FAQ, I can test it again with more recent stable, and
>> > will report the details.
>> >
>> For XL710 NICs, I believe that ETH_RSS_IP load balances only IP frames
>> that do not also have TCP or UDP headers. Adding i40e driver maintainer on CC to
>> comment further.
> 
> Yes, Bruce is right. For XL710, ETH_RSS_IP doesn't cover TCP and UDP packets.
> 
>> 
>> /Bruce
> 


dumpcap w/ pcapng produces out of order/negative times

2022-01-13 Thread Ben Magistro
While utilizing dumpcap with our app, we have observed the captured file
producing out-of-order timestamps, including negative times.  We are still
investigating the root cause but believe it is in lib/pcapng.  While doing
some testing of this issue, this behavior was not observed with pcap.  In
the attached pcap, there are 5 streams (the same curl run multiple times a few
seconds apart), with streams 1 and 3 showing oddities.  Specifically,
stream 1 is in the future relative to the packet order and stream 3 has a
negative time.

Not sure if the pcap file will actually post/attach; if it doesn't, I will try
something else.

System: CentOS7 + devtoolset 9
DPDK: v21.11


1060.pcapng
Description: Binary data


[PATCH 0/3] enic PMD patches

2022-01-13 Thread John Daley
Here are a couple patches to the enic PMD that should apply on top of
the patch:
'net/enic: support GENEVE flow item' by Hyong Youb Kim.

John Daley (3):
  net/enic: add support for eCPRI matching
  net/enic: update VIC firmware API
  net/enic: support max descriptors allowed by adapter

 doc/guides/rel_notes/release_22_03.rst |  1 +
 drivers/net/enic/base/cq_enet_desc.h   |  6 ++-
 drivers/net/enic/base/vnic_enet.h  | 22 +
 drivers/net/enic/enic_fm_flow.c| 65 ++
 drivers/net/enic/enic_res.c| 20 ++--
 drivers/net/enic/enic_res.h|  6 ++-
 drivers/net/enic/enic_rxtx.c   | 33 +
 7 files changed, 136 insertions(+), 17 deletions(-)

-- 
2.33.1



[PATCH 1/3] net/enic: add support for eCPRI matching

2022-01-13 Thread John Daley
An eCPRI message can be over the Ethernet layer (802.1Q also supported) or
over the UDP layer. The message header formats are the same in both variants.

Only up through the first packet header in the PDU can be matched.
RSS on the eCPRI header fields is not supported.

Signed-off-by: John Daley 
Reviewed-by: Hyong Youb Kim 
---
 doc/guides/rel_notes/release_22_03.rst |  1 +
 drivers/net/enic/enic_fm_flow.c| 65 ++
 2 files changed, 66 insertions(+)

diff --git a/doc/guides/rel_notes/release_22_03.rst b/doc/guides/rel_notes/release_22_03.rst
index b38dc54e62..52d1e32cf6 100644
--- a/doc/guides/rel_notes/release_22_03.rst
+++ b/doc/guides/rel_notes/release_22_03.rst
@@ -58,6 +58,7 @@ New Features
 * **Updated Cisco enic driver.**
 
   * Added rte_flow support for matching GENEVE packets.
+  * Added rte_flow support for matching eCPRI packets.
 
 Removed Items
 -
diff --git a/drivers/net/enic/enic_fm_flow.c b/drivers/net/enic/enic_fm_flow.c
index 752ffeb5c5..589c9253e1 100644
--- a/drivers/net/enic/enic_fm_flow.c
+++ b/drivers/net/enic/enic_fm_flow.c
@@ -237,6 +237,7 @@ static enic_copy_item_fn enic_fm_copy_item_vxlan;
 static enic_copy_item_fn enic_fm_copy_item_gtp;
 static enic_copy_item_fn enic_fm_copy_item_geneve;
 static enic_copy_item_fn enic_fm_copy_item_geneve_opt;
+static enic_copy_item_fn enic_fm_copy_item_ecpri;
 
 /* Ingress actions */
 static const enum rte_flow_action_type enic_fm_supported_ig_actions[] = {
@@ -392,6 +393,15 @@ static const struct enic_fm_items enic_fm_items[] = {
   RTE_FLOW_ITEM_TYPE_END,
},
},
+   [RTE_FLOW_ITEM_TYPE_ECPRI] = {
+   .copy_item = enic_fm_copy_item_ecpri,
+   .valid_start_item = 1,
+   .prev_items = (const enum rte_flow_item_type[]) {
+  RTE_FLOW_ITEM_TYPE_ETH,
+  RTE_FLOW_ITEM_TYPE_UDP,
+  RTE_FLOW_ITEM_TYPE_END,
+   },
+   },
 };
 
 static int
@@ -877,6 +887,61 @@ enic_fm_copy_item_geneve_opt(struct copy_item_args *arg)
return 0;
 }
 
+/* Match eCPRI combined message header */
+static int
+enic_fm_copy_item_ecpri(struct copy_item_args *arg)
+{
+   const struct rte_flow_item *item = arg->item;
+   const struct rte_flow_item_ecpri *spec = item->spec;
+   const struct rte_flow_item_ecpri *mask = item->mask;
+   struct fm_tcam_match_entry *entry = arg->fm_tcam_entry;
+   struct fm_header_set *fm_data, *fm_mask;
+   uint8_t *fm_data_to, *fm_mask_to;
+
+   ENICPMD_FUNC_TRACE();
+
+   /* Tunneling not supported- only matching on inner eCPRI fields. */
+   if (arg->header_level > 0)
+   return -EINVAL;
+
+   /* Need both spec and mask */
+   if (!spec || !mask)
+   return -EINVAL;
+
+   fm_data = &entry->ftm_data.fk_hdrset[0];
+   fm_mask = &entry->ftm_mask.fk_hdrset[0];
+
+   /* eCPRI can only follow L2/VLAN layer if ethernet type is 0xAEFE. */
+   if (!(fm_data->fk_metadata & FKM_UDP) &&
+   (fm_mask->l2.eth.fk_ethtype != UINT16_MAX ||
+   rte_cpu_to_be_16(fm_data->l2.eth.fk_ethtype) !=
+   RTE_ETHER_TYPE_ECPRI))
+   return -EINVAL;
+
+   if (fm_data->fk_metadata & FKM_UDP) {
+   /* eCPRI on UDP */
+   fm_data->fk_header_select |= FKH_L4RAW;
+   fm_mask->fk_header_select |= FKH_L4RAW;
+   fm_data_to = &fm_data->l4.rawdata[sizeof(fm_data->l4.udp)];
+   fm_mask_to = &fm_mask->l4.rawdata[sizeof(fm_data->l4.udp)];
+   } else {
+   /* eCPRI directly after Ethernet header */
+   fm_data->fk_header_select |= FKH_L3RAW;
+   fm_mask->fk_header_select |= FKH_L3RAW;
+   fm_data_to = &fm_data->l3.rawdata[0];
+   fm_mask_to = &fm_mask->l3.rawdata[0];
+   }
+
+   /*
+* Use the raw L3 or L4 buffer to match eCPRI since fm_header_set does
+* not have eCPRI header. Only 1st message header of PDU can be matched.
+* "C" bit is ignored.
+*/
+   memcpy(fm_data_to, spec, sizeof(*spec));
+   memcpy(fm_mask_to, mask, sizeof(*mask));
+   return 0;
+}
+
 /*
  * Currently, raw pattern match is very limited. It is intended for matching
  * UDP tunnel header (e.g. vxlan or geneve).
-- 
2.33.1



[PATCH 2/3] net/enic: update VIC firmware API

2022-01-13 Thread John Daley
Update the configuration structure used between the adapter and
driver. The structure is compatible with all Cisco VIC adapters.

Signed-off-by: John Daley 
Reviewed-by: Hyong Youb Kim 
---
 drivers/net/enic/base/vnic_enet.h | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/drivers/net/enic/base/vnic_enet.h b/drivers/net/enic/base/vnic_enet.h
index 2a97a33044..66261d9127 100644
--- a/drivers/net/enic/base/vnic_enet.h
+++ b/drivers/net/enic/base/vnic_enet.h
@@ -31,6 +31,28 @@ struct vnic_enet_config {
uint32_t rdma_mr_id;
uint32_t rdma_mr_count;
uint32_t max_pkt_size;
+   uint16_t vf_subvnic_count;
+   uint16_t mq_subvnic_count;
+   uint32_t mq_flags;
+
+   /* the following 3 fields are per-MQ-vnic counts */
+   uint32_t mq_rdma_mr_count;
+   uint16_t mq_rdma_qp_count;
+   uint16_t mq_rdma_resgrp;
+
+   uint16_t rdma_max_sq_ring_sz;
+   uint16_t rdma_max_rq_ring_sz;
+   uint32_t rdma_max_cq_ring_sz;
+   uint16_t rdma_max_wr_sge;
+   uint16_t rdma_max_mr_sge;
+   uint8_t rdma_max_rd_per_qp;
+   uint8_t unused; /* available */
+   uint16_t mq_rdma_engine_count;
+   uint32_t intr_coal_tick_ns; /* coalescing timer tick in nsec */
+   uint32_t max_rq_ring;   /* MAX RQ ring size */
+   uint32_t max_wq_ring;   /* MAX WQ ring size */
+   uint32_t max_cq_ring;   /* MAX CQ ring size */
+   uint32_t rdma_rsvd_lkey;/* Reserved (privileged) LKey */
 };
 
 #define VENETF_TSO 0x1 /* TSO enabled */
-- 
2.33.1



[PATCH 3/3] net/enic: support max descriptors allowed by adapter

2022-01-13 Thread John Daley
Newer VIC adapters have the max number of supported RX and TX
descriptors in their configuration. Use these values as the
maximums.

Signed-off-by: John Daley 
Reviewed-by: Hyong Youb Kim 
---
 drivers/net/enic/base/cq_enet_desc.h |  6 -
 drivers/net/enic/enic_res.c  | 20 +
 drivers/net/enic/enic_res.h  |  6 +++--
 drivers/net/enic/enic_rxtx.c | 33 +++-
 4 files changed, 48 insertions(+), 17 deletions(-)

diff --git a/drivers/net/enic/base/cq_enet_desc.h b/drivers/net/enic/base/cq_enet_desc.h
index a34a4f5400..02db85b9a0 100644
--- a/drivers/net/enic/base/cq_enet_desc.h
+++ b/drivers/net/enic/base/cq_enet_desc.h
@@ -67,7 +67,8 @@ struct cq_enet_rq_desc_64 {
uint16_t vlan;
uint16_t checksum_fcoe;
uint8_t flags;
-   uint8_t unused[48];
+   uint8_t fetch_idx_flags;
+   uint8_t unused[47];
uint8_t type_color;
 };
 
@@ -92,6 +93,9 @@ struct cq_enet_rq_desc_64 {
 #define CQ_ENET_RQ_DESC_BYTES_WRITTEN_BITS  14
 #define CQ_ENET_RQ_DESC_BYTES_WRITTEN_MASK \
((1 << CQ_ENET_RQ_DESC_BYTES_WRITTEN_BITS) - 1)
+#define CQ_ENET_RQ_DESC_FETCH_IDX_BITS  2
+#define CQ_ENET_RQ_DESC_FETCH_IDX_MASK \
+   ((1 << CQ_ENET_RQ_DESC_FETCH_IDX_BITS) - 1)
 #define CQ_ENET_RQ_DESC_FLAGS_TRUNCATED (0x1 << 14)
 #define CQ_ENET_RQ_DESC_FLAGS_VLAN_STRIPPED (0x1 << 15)
 
diff --git a/drivers/net/enic/enic_res.c b/drivers/net/enic/enic_res.c
index 9cfb857939..caf773bab2 100644
--- a/drivers/net/enic/enic_res.c
+++ b/drivers/net/enic/enic_res.c
@@ -26,6 +26,7 @@ int enic_get_vnic_config(struct enic *enic)
struct vnic_enet_config *c = &enic->config;
int err;
uint64_t sizes;
+   uint32_t max_rq_descs, max_wq_descs;
 
err = vnic_dev_get_mac_addr(enic->vdev, enic->mac_addr);
if (err) {
@@ -57,6 +58,8 @@ int enic_get_vnic_config(struct enic *enic)
GET_CONFIG(loop_tag);
GET_CONFIG(num_arfs);
GET_CONFIG(max_pkt_size);
+   GET_CONFIG(max_rq_ring);
+   GET_CONFIG(max_wq_ring);
 
/* max packet size is only defined in newer VIC firmware
 * and will be 0 for legacy firmware and VICs
@@ -101,20 +104,29 @@ int enic_get_vnic_config(struct enic *enic)
((enic->filter_actions & FILTER_ACTION_COUNTER_FLAG) ?
 "count " : ""));
 
-   c->wq_desc_count = RTE_MIN((uint32_t)ENIC_MAX_WQ_DESCS,
+   /* The max size of RQ and WQ rings are specified in 1500 series VICs and
+* beyond. If they are not specified by the VIC or if 64B CQ descriptors
+* are not being used, the max number of descriptors is 4096.
+*/
+   max_wq_descs = (enic->cq64_request && c->max_wq_ring) ? c->max_wq_ring :
+  ENIC_LEGACY_MAX_WQ_DESCS;
+   c->wq_desc_count = RTE_MIN(max_wq_descs,
RTE_MAX((uint32_t)ENIC_MIN_WQ_DESCS, c->wq_desc_count));
c->wq_desc_count &= 0xffe0; /* must be aligned to groups of 32 */
-
-   c->rq_desc_count = RTE_MIN((uint32_t)ENIC_MAX_RQ_DESCS,
+   max_rq_descs = (enic->cq64_request && c->max_rq_ring) ? c->max_rq_ring
+  : ENIC_LEGACY_MAX_RQ_DESCS;
+   c->rq_desc_count = RTE_MIN(max_rq_descs,
RTE_MAX((uint32_t)ENIC_MIN_RQ_DESCS, c->rq_desc_count));
c->rq_desc_count &= 0xffe0; /* must be aligned to groups of 32 */
+   dev_debug(NULL, "Max supported VIC descriptors: WQ:%u, RQ:%u\n",
+ max_wq_descs, max_rq_descs);
 
c->intr_timer_usec = RTE_MIN(c->intr_timer_usec,
  vnic_dev_get_intr_coal_timer_max(enic->vdev));
 
dev_info(enic_get_dev(enic),
"vNIC MAC addr " RTE_ETHER_ADDR_PRT_FMT
-   "wq/rq %d/%d mtu %d, max mtu:%d\n",
+   " wq/rq %d/%d mtu %d, max mtu:%d\n",
enic->mac_addr[0], enic->mac_addr[1], enic->mac_addr[2],
enic->mac_addr[3], enic->mac_addr[4], enic->mac_addr[5],
c->wq_desc_count, c->rq_desc_count,
diff --git a/drivers/net/enic/enic_res.h b/drivers/net/enic/enic_res.h
index 34f15d5a42..ae979d52be 100644
--- a/drivers/net/enic/enic_res.h
+++ b/drivers/net/enic/enic_res.h
@@ -12,9 +12,11 @@
 #include "vnic_rq.h"
 
 #define ENIC_MIN_WQ_DESCS  64
-#define ENIC_MAX_WQ_DESCS  4096
 #define ENIC_MIN_RQ_DESCS  64
-#define ENIC_MAX_RQ_DESCS  4096
+
+/* 1400 series VICs and prior all have 4K max, after that it's in the config */
+#define ENIC_LEGACY_MAX_WQ_DESCS4096
+#define ENIC_LEGACY_MAX_RQ_DESCS4096
 
 /* A descriptor ring has a multiple of 32 descriptors */
 #define ENIC_ALIGN_DESCS   32
diff --git a/drivers/net/enic/enic_rxtx.c b/drivers/net/enic/enic_rxtx.c
index c44715bfd0..4681ef6eca 100644
--- a/drivers/net/enic/enic_rxtx.c
+++ b/drivers/net/enic/enic_rxtx.c
@@ -84,6 +84,7 @@ enic_recv_pkts_common(vo

RE: [PATCH v1 1/1] vhost: integrate dmadev in asynchronous datapath

2022-01-13 Thread Xia, Chenbo
Hi Jiayu,

This is the first round of review; I'll spend time on the OVS patches later and
come back to this.

> -Original Message-
> From: Hu, Jiayu 
> Sent: Friday, December 31, 2021 5:55 AM
> To: dev@dpdk.org
> Cc: maxime.coque...@redhat.com; i.maxim...@ovn.org; Xia, Chenbo
> ; Richardson, Bruce ; Van
> Haaren, Harry ; Pai G, Sunil
> ; Mcnamara, John ; Ding, Xuan
> ; Jiang, Cheng1 ;
> lian...@liangbit.com; Hu, Jiayu 
> Subject: [PATCH v1 1/1] vhost: integrate dmadev in asynchronous datapath
> 
> Since dmadev is introduced in 21.11, to avoid the overhead of vhost DMA
> abstraction layer and simplify application logics, this patch integrates
> dmadev in asynchronous data path.
> 
> Signed-off-by: Jiayu Hu 
> Signed-off-by: Sunil Pai G 
> ---
>  doc/guides/prog_guide/vhost_lib.rst |  70 -
>  examples/vhost/Makefile |   2 +-
>  examples/vhost/ioat.c   | 218 --
>  examples/vhost/ioat.h   |  63 
>  examples/vhost/main.c   | 230 +++-
>  examples/vhost/main.h   |  11 ++
>  examples/vhost/meson.build  |   6 +-
>  lib/vhost/meson.build   |   3 +-
>  lib/vhost/rte_vhost_async.h | 121 +--
>  lib/vhost/version.map   |   3 +
>  lib/vhost/vhost.c   | 130 +++-
>  lib/vhost/vhost.h   |  53 ++-
>  lib/vhost/virtio_net.c  | 206 +++--
>  13 files changed, 587 insertions(+), 529 deletions(-)
>  delete mode 100644 examples/vhost/ioat.c
>  delete mode 100644 examples/vhost/ioat.h
> 
> diff --git a/doc/guides/prog_guide/vhost_lib.rst
> b/doc/guides/prog_guide/vhost_lib.rst
> index 76f5d303c9..bdce7cbf02 100644
> --- a/doc/guides/prog_guide/vhost_lib.rst
> +++ b/doc/guides/prog_guide/vhost_lib.rst
> @@ -218,38 +218,12 @@ The following is an overview of some key Vhost API
> functions:
> 
>Enable or disable zero copy feature of the vhost crypto backend.
> 
> -* ``rte_vhost_async_channel_register(vid, queue_id, config, ops)``
> +* ``rte_vhost_async_channel_register(vid, queue_id)``
> 
>Register an async copy device channel for a vhost queue after vring

Since dmadev is here, let's just use 'DMA device' instead of 'copy device'

> -  is enabled. Following device ``config`` must be specified together
> -  with the registration:
> +  is enabled.
> 
> -  * ``features``
> -
> -This field is used to specify async copy device features.
> -
> -``RTE_VHOST_ASYNC_INORDER`` represents the async copy device can
> -guarantee the order of copy completion is the same as the order
> -of copy submission.
> -
> -Currently, only ``RTE_VHOST_ASYNC_INORDER`` capable device is
> -supported by vhost.
> -
> -  Applications must provide following ``ops`` callbacks for vhost lib to
> -  work with the async copy devices:
> -
> -  * ``transfer_data(vid, queue_id, descs, opaque_data, count)``
> -
> -vhost invokes this function to submit copy data to the async devices.
> -For non-async_inorder capable devices, ``opaque_data`` could be used
> -for identifying the completed packets.
> -
> -  * ``check_completed_copies(vid, queue_id, opaque_data, max_packets)``
> -
> -vhost invokes this function to get the copy data completed by async
> -devices.
> -
> -* ``rte_vhost_async_channel_register_thread_unsafe(vid, queue_id, config,
> ops)``
> +* ``rte_vhost_async_channel_register_thread_unsafe(vid, queue_id)``
> 
>Register an async copy device channel for a vhost queue without
>performing any locking.
> @@ -277,18 +251,13 @@ The following is an overview of some key Vhost API
> functions:
>This function is only safe to call in vhost callback functions
>(i.e., struct rte_vhost_device_ops).
> 
> -* ``rte_vhost_submit_enqueue_burst(vid, queue_id, pkts, count, comp_pkts,
> comp_count)``
> +* ``rte_vhost_submit_enqueue_burst(vid, queue_id, pkts, count, dma_id,
> dma_vchan)``
> 
>Submit an enqueue request to transmit ``count`` packets from host to guest
> -  by async data path. Successfully enqueued packets can be transfer completed
> -  or being occupied by DMA engines; transfer completed packets are returned
> in
> -  ``comp_pkts``, but others are not guaranteed to finish, when this API
> -  call returns.
> +  by async data path. Applications must not free the packets submitted for
> +  enqueue until the packets are completed.
> 
> -  Applications must not free the packets submitted for enqueue until the
> -  packets are completed.
> -
> -* ``rte_vhost_poll_enqueue_completed(vid, queue_id, pkts, count)``
> +* ``rte_vhost_poll_enqueue_completed(vid, queue_id, pkts, count, dma_id,
> dma_vchan)``
> 
>Poll enqueue completion status from async data path. Completed packets
>are returned to applications through ``pkts``.
> @@ -298,7 +267,7 @@ The following is an overview of some key Vhost API
> functions:
>This function returns the amount of in-flight pa

RE: [PATCH v3] ethdev: mark old macros as deprecated

2022-01-13 Thread Xia, Chenbo
> -Original Message-
> From: Yigit, Ferruh 
> Sent: Wednesday, January 12, 2022 10:36 PM
> To: Thomas Monjalon ; Andrew Rybchenko
> ; Hemant Agrawal ;
> Tyler Retzlaff ; Xia, Chenbo
> ; Jerin Jacob 
> Cc: dev@dpdk.org; Yigit, Ferruh ; Stephen Hemminger
> 
> Subject: [PATCH v3] ethdev: mark old macros as deprecated
> 
> Old macros kept for backward compatibility, but this cause old macro
> usage to sneak in silently.
> 
> Marking old macros as deprecated. Downside is this will cause some noise
> for applications that are using old macros.
> 
> Fixes: 295968d17407 ("ethdev: add namespace")
> 
> Signed-off-by: Ferruh Yigit 
> Acked-by: Stephen Hemminger 
> ---
> v2:
> * Release notes updated
> 
> v3:
> * Update 22.03 release note
> ---
>  doc/guides/rel_notes/release_22_03.rst |   3 +
>  lib/ethdev/rte_ethdev.h| 474 +
>  2 files changed, 247 insertions(+), 230 deletions(-)

Acked-by: Chenbo Xia 


RE: [PATCH] vdpa/sfc: make MCDI memzone name unique

2022-01-13 Thread Xia, Chenbo
Hi Abhimanyu,

> -Original Message-
> From: abhimanyu.sa...@xilinx.com 
> Sent: Tuesday, January 11, 2022 1:33 PM
> To: dev@dpdk.org
> Cc: Xia, Chenbo ; maxime.coque...@redhat.com;
> andrew.rybche...@oktetlabs.ru; Abhimanyu Saini 
> Subject: [PATCH] vdpa/sfc: make MCDI memzone name unique
> 
> From: Abhimanyu Saini 
> 
> Buffer for MCDI channel is allocated using rte_memzone_reserve_aligned
> with zone name 'mcdi'. Since multiple MCDI channels are needed to
> support multiple VF(s) and rte_memzone_reserve_aligned expects unique
> zone names, append PCI address to zone name to make it unique.
> 
> Signed-off-by: Abhimanyu Saini 
> ---

Could you help with the apply issue and the checkpatch issue, then send a new version?

Thanks,
Chenbo


RE: [dpdk-dev] [PATCH] net/virtio: fix uninitialized old_rss_key variable

2022-01-13 Thread Xia, Chenbo
Hi Yunjian,

> -Original Message-
> From: Yunjian Wang 
> Sent: Saturday, January 8, 2022 4:14 PM
> To: dev@dpdk.org
> Cc: maxime.coque...@redhat.com; Xia, Chenbo ;
> dingxiaoxi...@huawei.com; xudin...@huawei.com; Yunjian Wang
> ; sta...@dpdk.org
> Subject: [dpdk-dev] [PATCH] net/virtio: fix uninitialized old_rss_key variable
> 
> This patch fixes an issue where the uninitialized old_rss_key
> is used to restore the rss_key.
> 
> Coverity issue: 373866
> Fixes: 0c9d66207054 ("net/virtio: support RSS")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Yunjian Wang 
> ---
>  drivers/net/virtio/virtio_ethdev.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/virtio/virtio_ethdev.c
> b/drivers/net/virtio/virtio_ethdev.c
> index c2588369b2..7c445bfc48 100644
> --- a/drivers/net/virtio/virtio_ethdev.c
> +++ b/drivers/net/virtio/virtio_ethdev.c
> @@ -2028,7 +2028,8 @@ virtio_dev_rss_hash_update(struct rte_eth_dev *dev,
> 
>   return 0;
>  restore_key:
> - memcpy(hw->rss_key, old_rss_key, VIRTIO_NET_RSS_KEY_SIZE);
> + if (rss_conf->rss_key && rss_conf->rss_key_len)
> + memcpy(hw->rss_key, old_rss_key, VIRTIO_NET_RSS_KEY_SIZE);
>  restore_types:
>   hw->rss_hash_types = old_hash_types;
> 
> --
> 2.27.0

Appreciate your effort with Coverity issue, thanks!

Reviewed-by: Chenbo Xia 



RE: [PATCH v3] vdpa/ifc: fix log info mismatch

2022-01-13 Thread Xia, Chenbo
> -Original Message-
> From: Pei, Andy 
> Sent: Monday, December 13, 2021 3:01 PM
> To: dev@dpdk.org
> Cc: Xia, Chenbo ; sta...@dpdk.org
> Subject: [PATCH v3] vdpa/ifc: fix log info mismatch
> 
> Fix log info mismatch.
> 
> Fixes: a3f8150eac6d ("net/ifcvf: add ifcvf vDPA driver")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Andy Pei 
> ---
>  drivers/vdpa/ifc/base/ifcvf.c | 14 --
>  1 file changed, 8 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/vdpa/ifc/base/ifcvf.c b/drivers/vdpa/ifc/base/ifcvf.c
> index 721cb1d..d10c1fd 100644
> --- a/drivers/vdpa/ifc/base/ifcvf.c
> +++ b/drivers/vdpa/ifc/base/ifcvf.c
> @@ -94,12 +94,14 @@
>   return -1;
>   }
> 
> - DEBUGOUT("capability mapping:\ncommon cfg: %p\n"
> - "notify base: %p\nisr cfg: %p\ndevice cfg: %p\n"
> - "multiplier: %u\n",
> - hw->common_cfg, hw->dev_cfg,
> - hw->isr, hw->notify_base,
> - hw->notify_off_multiplier);
> + DEBUGOUT("capability mapping:\n"
> +  "common cfg: %p\n"
> +  "notify base: %p\n"
> +  "isr cfg: %p\n"
> +  "device cfg: %p\n"
> +  "multiplier: %u\n",
> +  hw->common_cfg, hw->notify_base, hw->isr, hw->dev_cfg,
> +  hw->notify_off_multiplier);
> 
>   return 0;
>  }
> --
> 1.8.3.1

Reviewed-by: Chenbo Xia 


[PATCH] vdpa/ifc: fix log info mismatch

2022-01-13 Thread Andy Pei
Fix log info mismatch.

Fixes: a3f8150eac6d ("net/ifcvf: add ifcvf vDPA driver")
Cc: sta...@dpdk.org

Signed-off-by: Andy Pei 
---
 drivers/vdpa/ifc/base/ifcvf.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/vdpa/ifc/base/ifcvf.c b/drivers/vdpa/ifc/base/ifcvf.c
index 721cb1d..d10c1fd 100644
--- a/drivers/vdpa/ifc/base/ifcvf.c
+++ b/drivers/vdpa/ifc/base/ifcvf.c
@@ -94,12 +94,14 @@
return -1;
}
 
-   DEBUGOUT("capability mapping:\ncommon cfg: %p\n"
-   "notify base: %p\nisr cfg: %p\ndevice cfg: %p\n"
-   "multiplier: %u\n",
-   hw->common_cfg, hw->dev_cfg,
-   hw->isr, hw->notify_base,
-   hw->notify_off_multiplier);
+   DEBUGOUT("capability mapping:\n"
+"common cfg: %p\n"
+"notify base: %p\n"
+"isr cfg: %p\n"
+"device cfg: %p\n"
+"multiplier: %u\n",
+hw->common_cfg, hw->notify_base, hw->isr, hw->dev_cfg,
+hw->notify_off_multiplier);
 
return 0;
 }
-- 
1.8.3.1



[PATCH] vhost: add some log for vhost message VHOST_USER_SET_VRING_BASE

2022-01-13 Thread Andy Pei
Usually the last avail index and last used index are 0, but for the target
device of a live migration they are not. So some log here is helpful.

Signed-off-by: Andy Pei 
---
 lib/vhost/vhost_user.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c
index a781346..3cb13fb 100644
--- a/lib/vhost/vhost_user.c
+++ b/lib/vhost/vhost_user.c
@@ -973,6 +973,11 @@
vq->last_avail_idx = msg->payload.state.num;
}
 
+   VHOST_LOG_CONFIG(INFO,
+   "vring base idx:%d last_used_idx:%u last_avail_idx:%u.\n",
+   msg->payload.state.index, vq->last_used_idx,
+   vq->last_avail_idx);
+
return RTE_VHOST_MSG_RESULT_OK;
 }
 
-- 
1.8.3.1



[PATCH 1/2] vhost: fix physical address mapping

2022-01-13 Thread xuan . ding
From: Xuan Ding 

When choosing IOVA as PA mode, IOVA is likely to be discontinuous,
which requires page by page mapping for DMA devices. To be consistent,
this patch implements page by page mapping instead of mapping at the
region granularity for both IOVA as VA and PA mode.

Fixes: 7c61fa08b716 ("vhost: enable IOMMU for async vhost")

Signed-off-by: Xuan Ding 
Signed-off-by: Yuan Wang 
---
 lib/vhost/vhost.h  |   1 +
 lib/vhost/vhost_user.c | 116 -
 2 files changed, 57 insertions(+), 60 deletions(-)

diff --git a/lib/vhost/vhost.h b/lib/vhost/vhost.h
index 7085e0885c..d246538ca5 100644
--- a/lib/vhost/vhost.h
+++ b/lib/vhost/vhost.h
@@ -355,6 +355,7 @@ struct vring_packed_desc_event {
 struct guest_page {
uint64_t guest_phys_addr;
uint64_t host_phys_addr;
+   uint64_t host_user_addr;
uint64_t size;
 };
 
diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c
index a781346c4d..6d888766b0 100644
--- a/lib/vhost/vhost_user.c
+++ b/lib/vhost/vhost_user.c
@@ -143,57 +143,56 @@ get_blk_size(int fd)
return ret == -1 ? (uint64_t)-1 : (uint64_t)stat.st_blksize;
 }
 
-static int
-async_dma_map(struct rte_vhost_mem_region *region, bool do_map)
+static void
+async_dma_map(struct virtio_net *dev, bool do_map)
 {
-   uint64_t host_iova;
    int ret = 0;
-
-   host_iova = rte_mem_virt2iova((void *)(uintptr_t)region->host_user_addr);
+   uint32_t i;
+   struct guest_page *page;
    if (do_map) {
-   /* Add mapped region into the default container of DPDK. */
-   ret = rte_vfio_container_dma_map(RTE_VFIO_DEFAULT_CONTAINER_FD,
-region->host_user_addr,
-host_iova,
-region->size);
-   if (ret) {
-   /*
-* DMA device may bind with kernel driver, in this case,
-* we don't need to program IOMMU manually. However, if no
-* device is bound with vfio/uio in DPDK, and vfio kernel
-* module is loaded, the API will still be called and return
-* with ENODEV/ENOSUP.
-*
-* DPDK vfio only returns ENODEV/ENOSUP in very similar
-* situations(vfio either unsupported, or supported
-* but no devices found). Either way, no mappings could be
-* performed. We treat it as normal case in async path.
-*/
-   if (rte_errno == ENODEV || rte_errno == ENOTSUP)
-   return 0;
-
-   VHOST_LOG_CONFIG(ERR, "DMA engine map failed\n");
-   /* DMA mapping errors won't stop VHOST_USER_SET_MEM_TABLE. */
-   return 0;
+   for (i = 0; i < dev->nr_guest_pages; i++) {
+   page = &dev->guest_pages[i];
+   ret = rte_vfio_container_dma_map(RTE_VFIO_DEFAULT_CONTAINER_FD,
+page->host_user_addr,
+page->host_phys_addr,
+page->size);
+   if (ret) {
+   /*
+* DMA device may bind with kernel driver, in this case,
+* we don't need to program IOMMU manually. However, if no
+* device is bound with vfio/uio in DPDK, and vfio kernel
+* module is loaded, the API will still be called and return
+* with ENODEV.
+*
+* DPDK vfio only returns ENODEV in very similar situations
+* (vfio either unsupported, or supported but no devices found).
+* Either way, no mappings could be performed. We treat it as
+* normal case in async path. This is a workaround.
+*/
+   if (rte_errno == ENODEV)
+   return;
+
+   /* DMA mapping errors won't stop VHOST_USER_SET_MEM_TABLE. */
+   VHOST_LOG_CONFIG(ERR, "DMA engine map failed\n");
+   }
    }
 
    } else {
-   /* Remove mapped region from the default container of DPDK. */
-   ret = rte_vfio_container_dma_unmap(RTE_VFIO_DEFAULT_CONTAINER_FD,
-  region->host_user_addr,
-  host_iova,
-

[PATCH 0/2] vhost: fix physical address mapping

2022-01-13 Thread xuan . ding
From: Xuan Ding 

This patchset fixes the issue of incorrect DMA mapping in PA mode. 
Due to the ambiguity of host_phys_addr naming in the guest page
struct, rename it to host_iova.

Xuan Ding (2):
  vhost: fix physical address mapping
  vhost: rename field in guest page struct

 lib/vhost/vhost.h  |  11 ++--
 lib/vhost/vhost_user.c | 130 -
 lib/vhost/virtio_net.c |  11 ++--
 3 files changed, 75 insertions(+), 77 deletions(-)

-- 
2.17.1



[PATCH 2/2] vhost: rename field in guest page struct

2022-01-13 Thread xuan . ding
From: Xuan Ding 

This patch renames host_phys_addr to host_iova in the guest_page
struct, because host_phys_addr is in fact an IOVA, whose meaning
depends on the DPDK IOVA mode.

Signed-off-by: Xuan Ding 
---
 lib/vhost/vhost.h  | 10 +-
 lib/vhost/vhost_user.c | 24 
 lib/vhost/virtio_net.c | 11 ++-
 3 files changed, 23 insertions(+), 22 deletions(-)

diff --git a/lib/vhost/vhost.h b/lib/vhost/vhost.h
index d246538ca5..9521ae56da 100644
--- a/lib/vhost/vhost.h
+++ b/lib/vhost/vhost.h
@@ -354,7 +354,7 @@ struct vring_packed_desc_event {
 
 struct guest_page {
uint64_t guest_phys_addr;
-   uint64_t host_phys_addr;
+   uint64_t host_iova;
uint64_t host_user_addr;
uint64_t size;
 };
@@ -605,13 +605,13 @@ gpa_to_first_hpa(struct virtio_net *dev, uint64_t gpa,
if (gpa + gpa_size <=
page->guest_phys_addr + page->size) {
return gpa - page->guest_phys_addr +
-   page->host_phys_addr;
+   page->host_iova;
} else if (gpa < page->guest_phys_addr +
page->size) {
*hpa_size = page->guest_phys_addr +
page->size - gpa;
return gpa - page->guest_phys_addr +
-   page->host_phys_addr;
+   page->host_iova;
}
}
} else {
@@ -622,13 +622,13 @@ gpa_to_first_hpa(struct virtio_net *dev, uint64_t gpa,
if (gpa + gpa_size <=
page->guest_phys_addr + page->size) {
return gpa - page->guest_phys_addr +
-   page->host_phys_addr;
+   page->host_iova;
} else if (gpa < page->guest_phys_addr +
page->size) {
*hpa_size = page->guest_phys_addr +
page->size - gpa;
return gpa - page->guest_phys_addr +
-   page->host_phys_addr;
+   page->host_iova;
}
}
}
diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c
index 6d888766b0..e2e56308b9 100644
--- a/lib/vhost/vhost_user.c
+++ b/lib/vhost/vhost_user.c
@@ -154,7 +154,7 @@ async_dma_map(struct virtio_net *dev, bool do_map)
page = &dev->guest_pages[i];
ret = rte_vfio_container_dma_map(RTE_VFIO_DEFAULT_CONTAINER_FD,
 page->host_user_addr,
-page->host_phys_addr,
+page->host_iova,
 page->size);
if (ret) {
/*
@@ -182,7 +182,7 @@ async_dma_map(struct virtio_net *dev, bool do_map)
page = &dev->guest_pages[i];
ret = rte_vfio_container_dma_unmap(RTE_VFIO_DEFAULT_CONTAINER_FD,
   page->host_user_addr,
-  page->host_phys_addr,
+  page->host_iova,
   page->size);
if (ret) {
/* like DMA map, ignore the kernel driver case when unmap. */
@@ -977,7 +977,7 @@ vhost_user_set_vring_base(struct virtio_net **pdev,
 
 static int
 add_one_guest_page(struct virtio_net *dev, uint64_t guest_phys_addr,
-  uint64_t host_phys_addr, uint64_t host_user_addr, uint64_t size)
+  uint64_t host_iova, uint64_t host_user_addr, uint64_t size)
 {
	struct guest_page *page, *last_page;
	struct guest_page *old_pages;
@@ -998,7 +998,7 @@ add_one_guest_page(struct virtio_net *dev, uint64_t guest_phys_addr,
	if (dev->nr_guest_pages > 0) {
		last_page = &dev->guest_pages[dev->nr_guest_pages - 1];
		/* merge if the two pages are continuous */
-		if (host_phys_addr == last_page->host_phys_addr + last_page->size
+		if (host_iova == last_page->host_iova + last_page->size
			&& guest_phys_addr == last_page->guest_phys_addr + last_page->size
			&& host_user_addr == last_page->host_user_addr + 
RE: [PATCH] vhost: add some log for vhost message VHOST_USER_SET_VRING_BASE

2022-01-13 Thread Xia, Chenbo
> -Original Message-
> From: Pei, Andy 
> Sent: Friday, January 14, 2022 3:19 PM
> To: dev@dpdk.org
> Cc: Xia, Chenbo 
> Subject: [PATCH] vhost: add some log for vhost message
> VHOST_USER_SET_VRING_BASE

I suggest the title be:

vhost: add log for VHOST_USER_SET_VRING_BASE

> 
> Usually the last avail index and last used index is 0, but for target
> device of live migration, the last avail index and last used index is
> not 0. So I think some log is helpful.

Can simplify to:

This patch adds log for vring related info in handling of vhost message
VHOST_USER_SET_VRING_BASE, which will be useful in live migration case. 

> 
> Signed-off-by: Andy Pei 
> ---
>  lib/vhost/vhost_user.c | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c
> index a781346..3cb13fb 100644
> --- a/lib/vhost/vhost_user.c
> +++ b/lib/vhost/vhost_user.c
> @@ -973,6 +973,11 @@
>   vq->last_avail_idx = msg->payload.state.num;
>   }
> 
> + VHOST_LOG_CONFIG(INFO,
> + "vring base idx:%d last_used_idx:%u last_avail_idx:%u.\n",

vring idx:%u

Thanks,
Chenbo

> + msg->payload.state.index, vq->last_used_idx,
> + vq->last_avail_idx);
> +
>   return RTE_VHOST_MSG_RESULT_OK;
>  }
> 
> --
> 1.8.3.1



RE: [PATCH] net/virtio: fix unreleased resource when creating virtio user dev is failed

2022-01-13 Thread Xia, Chenbo
> -Original Message-
> From: Harold Huang 
> Sent: Thursday, December 23, 2021 12:43 PM
> To: dev@dpdk.org
> Cc: Maxime Coquelin ; Xia, Chenbo
> 
> Subject: [PATCH] net/virtio: fix unreleased resource when creating virtio user
> dev is failed
> 
> When eth_virtio_dev_init fails, the registered virtio user memory event
> cb is not released and the tap device created by the backend is not
> destroyed. This leaves residual tap devices on the host, and creating a
> new vdev can fail because the new virtio_user_dev may get the same
> pointer address, and registering a memory event cb at the same address
> is not allowed.
> 
> Signed-off-by: Harold Huang 
> ---
> Compared to patch v3, commit log is changed because this bug could
> cause residual tap device in the host.
>  drivers/net/virtio/virtio_user_ethdev.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/net/virtio/virtio_user_ethdev.c
> b/drivers/net/virtio/virtio_user_ethdev.c
> index 0271098f0d..16eca2f940 100644
> --- a/drivers/net/virtio/virtio_user_ethdev.c
> +++ b/drivers/net/virtio/virtio_user_ethdev.c
> @@ -666,6 +666,7 @@ virtio_user_pmd_probe(struct rte_vdev_device *vdev)
>   /* previously called by pci probing for physical dev */
>   if (eth_virtio_dev_init(eth_dev) < 0) {
>   PMD_INIT_LOG(ERR, "eth_virtio_dev_init fails");
> + virtio_user_dev_uninit(dev);
>   virtio_user_eth_dev_free(eth_dev);
>   goto end;
>   }
> --
> 2.27.0

Reviewed-by: Chenbo Xia 


RE: [PATCH] vhost: add some log for vhost message VHOST_USER_SET_VRING_BASE

2022-01-13 Thread Pei, Andy
Hi Chenbo,

Thanks for your reply.
I will send a v2 patch to address it.

-Original Message-
From: Xia, Chenbo  
Sent: Friday, January 14, 2022 3:40 PM
To: Pei, Andy ; dev@dpdk.org
Subject: RE: [PATCH] vhost: add some log for vhost message 
VHOST_USER_SET_VRING_BASE

> -Original Message-
> From: Pei, Andy 
> Sent: Friday, January 14, 2022 3:19 PM
> To: dev@dpdk.org
> Cc: Xia, Chenbo 
> Subject: [PATCH] vhost: add some log for vhost message 
> VHOST_USER_SET_VRING_BASE

I suggest the title be:

vhost: add log for VHOST_USER_SET_VRING_BASE

> 
> Usually the last avail index and last used index is 0, but for target 
> device of live migration, the last avail index and last used index is 
> not 0. So I think some log is helpful.

Can simplify to:

This patch adds log for vring related info in handling of vhost message 
VHOST_USER_SET_VRING_BASE, which will be useful in live migration case. 

> 
> Signed-off-by: Andy Pei 
> ---
>  lib/vhost/vhost_user.c | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c index 
> a781346..3cb13fb 100644
> --- a/lib/vhost/vhost_user.c
> +++ b/lib/vhost/vhost_user.c
> @@ -973,6 +973,11 @@
>   vq->last_avail_idx = msg->payload.state.num;
>   }
> 
> + VHOST_LOG_CONFIG(INFO,
> + "vring base idx:%d last_used_idx:%u last_avail_idx:%u.\n",

vring idx:%u

Thanks,
Chenbo

> + msg->payload.state.index, vq->last_used_idx,
> + vq->last_avail_idx);
> +
>   return RTE_VHOST_MSG_RESULT_OK;
>  }
> 
> --
> 1.8.3.1