Re: [dpdk-dev] [PATCH v2 1/4] mbuf: remove duplicate definition of cksum offload flags

2021-10-16 Thread Andrew Rybchenko
On 10/15/21 10:24 PM, Olivier Matz wrote:
> The flags PKT_RX_L4_CKSUM_BAD and PKT_RX_IP_CKSUM_BAD are defined
> twice with the same value. Remove one of the occurrences, which was
> marked as "deprecated".
> 
> Signed-off-by: Olivier Matz 

Acked-by: Andrew Rybchenko 


Re: [dpdk-dev] [PATCH v2 2/4] mbuf: mark old VLAN offload flags as deprecated

2021-10-16 Thread Andrew Rybchenko
On 10/15/21 10:24 PM, Olivier Matz wrote:
> The flags PKT_TX_VLAN_PKT and PKT_TX_QINQ_PKT have been
> marked as deprecated since commit 380a7aab1ae2 ("mbuf: rename deprecated
> VLAN flags") (2017). But they were not using the RTE_DEPRECATED
> macro, because it did not exist at that time. Add it, and replace
> usage of these flags.
> 
> Signed-off-by: Olivier Matz 

Acked-by: Andrew Rybchenko 

I'd remove these flags completely. 4 years is definitely
enough. Yes, I realize that because of the missing
RTE_DEPRECATED markup users were not warned at build time.


Re: [dpdk-dev] [PATCH v2 4/4] mbuf: add rte prefix to offload flags

2021-10-16 Thread Andrew Rybchenko
On 10/15/21 10:24 PM, Olivier Matz wrote:
> Fix the mbuf offload flags namespace by adding an RTE_ prefix to the
> name. The old flags remain usable, but a deprecation warning is issued
> at compilation.
> 
> Signed-off-by: Olivier Matz 

Acked-by: Andrew Rybchenko 



Re: [dpdk-dev] [PATCH 1/1] net: fix aliasing issue in checksum computation

2021-10-16 Thread Morten Brørup
Geoff,

I have given this some more thoughts.

Most bytes transferred in real life are transferred in large packets, so faster 
processing of large packets is a great improvement!

Furthermore, a quick analysis of a recent packet sample from an ISP customer of 
ours shows that less than 8 % of the packets are odd-sized. Would you consider 
adding an unlikely() to the branch handling the odd byte at the end?

-Morten

> -Original Message-
> From: Morten Brørup
> Sent: Thursday, 14 October 2021 22.22
> 
> > -Original Message-
> > From: Ferruh Yigit [mailto:ferruh.yi...@intel.com]
> > Sent: Thursday, 14 October 2021 19.20
> >
> > On 9/18/2021 12:49 PM, Georg Sauthoff wrote:
> > > That means a superfluous cast is removed and aliasing through a
> > > uint8_t pointer is eliminated. Note that uint8_t doesn't have the same
> > > strict-aliasing properties as unsigned char.
> > >
> > > Also simplified the loop since a modern C compiler can speed up (i.e.
> > > auto-vectorize) it in a similar way. For example, GCC auto-vectorizes it
> > > for Haswell using AVX registers while halving the number of instructions
> > > in the generated code.
> > >
> > > Signed-off-by: Georg Sauthoff 
> >
> > + Morten. (Because of past reviews on cksum code)
> 
> Thanks, Ferruh.
> 
> I have not verified the claimed benefits of the patch, but I have
> reviewed the code thoroughly, and it looks perfectly good to me.
> 
> Reviewed-by: Morten Brørup 
> 
> BTW: It makes me wonder if other parts of DPDK could benefit from the
> same treatment. Especially some of the older DPDK code, where we were
> trying to optimize by hand what a modern compiler can optimize for us
> today.
> 
> >
> > > ---
> > >   lib/net/rte_ip.h | 27 ---
> > >   1 file changed, 8 insertions(+), 19 deletions(-)
> > >
> > > diff --git a/lib/net/rte_ip.h b/lib/net/rte_ip.h
> > > index 05948b69b7..386db94c85 100644
> > > --- a/lib/net/rte_ip.h
> > > +++ b/lib/net/rte_ip.h
> > > @@ -141,29 +141,18 @@ rte_ipv4_hdr_len(const struct rte_ipv4_hdr
> > *ipv4_hdr)
> > >   static inline uint32_t
> > >   __rte_raw_cksum(const void *buf, size_t len, uint32_t sum)
> > >   {
> > > - /* workaround gcc strict-aliasing warning */
> > > - uintptr_t ptr = (uintptr_t)buf;
> > > + /* extend strict-aliasing rules */
> > >   typedef uint16_t __attribute__((__may_alias__)) u16_p;
> > > - const u16_p *u16_buf = (const u16_p *)ptr;
> > > -
> > > - while (len >= (sizeof(*u16_buf) * 4)) {
> > > - sum += u16_buf[0];
> > > - sum += u16_buf[1];
> > > - sum += u16_buf[2];
> > > - sum += u16_buf[3];
> > > - len -= sizeof(*u16_buf) * 4;
> > > - u16_buf += 4;
> > > - }
> > > - while (len >= sizeof(*u16_buf)) {
> > > + const u16_p *u16_buf = (const u16_p *)buf;
> > > + const u16_p *end = u16_buf + len / sizeof(*u16_buf);
> > > +
> > > + for (; u16_buf != end; ++u16_buf)
> 
> Personally I would prefer post-incrementing here. It makes no
> difference, so I don't see any need to revise the patch.
> 
> > >   sum += *u16_buf;
> > > - len -= sizeof(*u16_buf);
> > > - u16_buf += 1;
> > > - }
> > >
> > > - /* if length is in odd bytes */
> > > - if (len == 1) {
> > > + /* if length is odd, keeping it byte order independent */
> > > + if (len % 2) {

I assume that the compiler already optimizes "% 2" to "& 1".

> > >   uint16_t left = 0;
> > > - *(uint8_t *)&left = *(const uint8_t *)u16_buf;
> > > + *(unsigned char*)&left = *(const unsigned char *)end;
> > >   sum += left;
> > >   }
> > >
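
A sketch of Morten's suggestion applied to the new loop (illustration only,
not part of the patch; unlikely() is DPDK's branch hint from
rte_branch_prediction.h):

#include <stdint.h>
#include <stddef.h>
#include <rte_branch_prediction.h>

static inline uint32_t
__rte_raw_cksum_sketch(const void *buf, size_t len, uint32_t sum)
{
	typedef uint16_t __attribute__((__may_alias__)) u16_p;
	const u16_p *u16_buf = (const u16_p *)buf;
	const u16_p *end = u16_buf + len / sizeof(*u16_buf);

	for (; u16_buf != end; ++u16_buf)
		sum += *u16_buf;

	/* Odd-length tail: rare (< 8 % of sampled packets), hint it cold. */
	if (unlikely(len % 2)) {
		uint16_t left = 0;

		*(unsigned char *)&left = *(const unsigned char *)end;
		sum += left;
	}
	return sum;
}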



Re: [dpdk-dev] [PATCH 1/1] net: fix aliasing issue in checksum computation

2021-10-16 Thread Morten Brørup
Georg, I apologize for calling you Geoff in my previous message! Just realized my mistake.

Med venlig hilsen / Kind regards,
-Morten Brørup





Re: [dpdk-dev] [PATCH v2] net/vhost: merge vhost stats loop in vhost Tx/Rx

2021-10-16 Thread Gaoxiang Liu
Hi, Maxime
I agree with you. The inline should be added to the
vhost_update_single_packet_xstats function.
I will fix it in [PATCH v3].

Thanks,
Gaoxiang



Sent from NetEase Mail Master




 Original message 
| From | Maxime Coquelin |
| Date | 2021-10-15 20:16 |
| To | Gaoxiang Liu, chenbo@intel.com |
| Cc | dev@dpdk.org, liugaoxi...@huawei.com |
| Subject | Re: [PATCH v2] net/vhost: merge vhost stats loop in vhost Tx/Rx |
Hi,

On 9/28/21 03:43, Gaoxiang Liu wrote:
> To improve performance in vhost Tx/Rx, merge the vhost stats loops.
> eth_vhost_tx has 2 loops iterating over the sent packets.
> They can be merged into one.
> eth_vhost_rx has the same issue as Tx.
>
> Fixes: 4d6cf2ac93dc ("net/vhost: add extended statistics")

Please remove the Fixes tag, this is an optimization, not a fix.

>
> Signed-off-by: Gaoxiang Liu 
> ---
>
> v2:
>   * Fix coding style issues.
> ---
>   drivers/net/vhost/rte_eth_vhost.c | 62 ++-
>   1 file changed, 28 insertions(+), 34 deletions(-)
>
> diff --git a/drivers/net/vhost/rte_eth_vhost.c 
> b/drivers/net/vhost/rte_eth_vhost.c
> index a202931e9a..a4129980f2 100644
> --- a/drivers/net/vhost/rte_eth_vhost.c
> +++ b/drivers/net/vhost/rte_eth_vhost.c
> @@ -336,38 +336,29 @@ vhost_count_xcast_packets(struct vhost_queue *vq,
>   }
>  
>   static void
> -vhost_update_packet_xstats(struct vhost_queue *vq, struct rte_mbuf **bufs,
> -  uint16_t count, uint64_t nb_bytes,
> -  uint64_t nb_missed)
> +vhost_update_single_packet_xstats(struct vhost_queue *vq, struct rte_mbuf 
> *buf)

I tried to build without and with your patch, and I think that what can
explain most of the performance difference is that without your patch
the function is not inlined, whereas it is implicitly inlined with your
patch applied.

I agree with your patch, but I think we might add __rte_always_inline to
this function to make it explicit. What do you think?

Other than that:

Reviewed-by: Maxime Coquelin 

Thanks,
Maxime


>   {
>uint32_t pkt_len = 0;
> - uint64_t i = 0;
>uint64_t index;
>struct vhost_stats *pstats = &vq->stats;
>  
> - pstats->xstats[VHOST_BYTE] += nb_bytes;
> - pstats->xstats[VHOST_MISSED_PKT] += nb_missed;
> - pstats->xstats[VHOST_UNICAST_PKT] += nb_missed;
> -
> - for (i = 0; i < count ; i++) {
> -  pstats->xstats[VHOST_PKT]++;
> -  pkt_len = bufs[i]->pkt_len;
> -  if (pkt_len == 64) {
> -   pstats->xstats[VHOST_64_PKT]++;
> -  } else if (pkt_len > 64 && pkt_len < 1024) {
> -   index = (sizeof(pkt_len) * 8)
> -- __builtin_clz(pkt_len) - 5;
> -   pstats->xstats[index]++;
> -  } else {
> -   if (pkt_len < 64)
> -pstats->xstats[VHOST_UNDERSIZE_PKT]++;
> -   else if (pkt_len <= 1522)
> -pstats->xstats[VHOST_1024_TO_1522_PKT]++;
> -   else if (pkt_len > 1522)
> -pstats->xstats[VHOST_1523_TO_MAX_PKT]++;
> -  }
> -  vhost_count_xcast_packets(vq, bufs[i]);
> + pstats->xstats[VHOST_PKT]++;
> + pkt_len = buf->pkt_len;
> + if (pkt_len == 64) {
> +  pstats->xstats[VHOST_64_PKT]++;
> + } else if (pkt_len > 64 && pkt_len < 1024) {
> +  index = (sizeof(pkt_len) * 8)
> +   - __builtin_clz(pkt_len) - 5;
> +  pstats->xstats[index]++;
> + } else {
> +  if (pkt_len < 64)
> +   pstats->xstats[VHOST_UNDERSIZE_PKT]++;
> +  else if (pkt_len <= 1522)
> +   pstats->xstats[VHOST_1024_TO_1522_PKT]++;
> +  else if (pkt_len > 1522)
> +   pstats->xstats[VHOST_1523_TO_MAX_PKT]++;
>}
> + vhost_count_xcast_packets(vq, buf);
>   }
>  
>   static uint16_t
> @@ -376,7 +367,6 @@ eth_vhost_rx(void *q, struct rte_mbuf **bufs, uint16_t 
> nb_bufs)
>struct vhost_queue *r = q;
>uint16_t i, nb_rx = 0;
>uint16_t nb_receive = nb_bufs;
> - uint64_t nb_bytes = 0;
>  
>if (unlikely(rte_atomic32_read(&r->allow_queuing) == 0))
> return 0;
> @@ -411,11 +401,11 @@ eth_vhost_rx(void *q, struct rte_mbuf **bufs, uint16_t 
> nb_bufs)
> if (r->internal->vlan_strip)
>  rte_vlan_strip(bufs[i]);
>  
> -  nb_bytes += bufs[i]->pkt_len;
> - }
> +  r->stats.bytes += bufs[i]->pkt_len;
> +  r->stats.xstats[VHOST_BYTE] += bufs[i]->pkt_len;
>  
> - r->stats.bytes += nb_bytes;
> - vhost_update_packet_xstats(r, bufs, nb_rx, nb_bytes, 0);
> +  vhost_update_single_packet_xstats(r, bufs[i]);
> + }
>  
>   out:
>rte_atomic32_set(&r->while_queuing, 0);
> @@ -471,16 +461,20 @@ eth_vhost_tx(void *q, struct rte_mbuf **bufs, uint16_t 
> nb_bufs)
>  break;
>}
>  
> - for (i = 0; likely(i < nb_tx); i++)
> + for (i = 0; likely(i < nb_tx); i++) {
> nb_bytes += bufs[i]->pkt_len;
> +  vhost_update_single_
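
For reference, a sketch of the function with Maxime's suggestion applied
(__rte_always_inline is DPDK's forced-inline attribute from rte_common.h;
the body is unchanged from the patch):

static __rte_always_inline void
vhost_update_single_packet_xstats(struct vhost_queue *vq, struct rte_mbuf *buf)
{
	uint32_t pkt_len = 0;
	uint64_t index;
	struct vhost_stats *pstats = &vq->stats;

	pstats->xstats[VHOST_PKT]++;
	pkt_len = buf->pkt_len;
	if (pkt_len == 64) {
		pstats->xstats[VHOST_64_PKT]++;
	} else if (pkt_len > 64 && pkt_len < 1024) {
		/* Bucket index derived from the highest set bit of pkt_len. */
		index = (sizeof(pkt_len) * 8) - __builtin_clz(pkt_len) - 5;
		pstats->xstats[index]++;
	} else {
		if (pkt_len < 64)
			pstats->xstats[VHOST_UNDERSIZE_PKT]++;
		else if (pkt_len <= 1522)
			pstats->xstats[VHOST_1024_TO_1522_PKT]++;
		else if (pkt_len > 1522)
			pstats->xstats[VHOST_1523_TO_MAX_PKT]++;
	}
	vhost_count_xcast_packets(vq, buf);
}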

Re: [dpdk-dev] [EXT] [dpdk-dev v3 08/10] crypto/qat: add gen specific data and function

2021-10-16 Thread Akhil Goyal
> +/* Macro to add a capability */
> +#define QAT_SYM_PLAIN_AUTH_CAP(n, b, d)				\

Can you add a comment for each of the defines, specifying what these
variables (n, b, d, k, a, i, etc.) depict?

> +	{							\
> +		.op = RTE_CRYPTO_OP_TYPE_SYMMETRIC,		\
> +		{.sym = {					\
> +			.xform_type = RTE_CRYPTO_SYM_XFORM_AUTH,\
> +			{.auth = {				\
> +				.algo = RTE_CRYPTO_AUTH_##n,	\
> +				b, d				\
> +			}, }					\
> +		}, }						\
> +	}
> +
> +#define QAT_SYM_AUTH_CAP(n, b, k, d, a, i)			\
> +	{							\
> +		.op = RTE_CRYPTO_OP_TYPE_SYMMETRIC,		\
> +		{.sym = {					\
> +			.xform_type = RTE_CRYPTO_SYM_XFORM_AUTH,\
> +			{.auth = {				\
> +				.algo = RTE_CRYPTO_AUTH_##n,	\
> +				b, k, d, a, i			\
> +			}, }					\
> +		}, }						\
> +	}
> +
> +#define QAT_SYM_AEAD_CAP(n, b, k, d, a, i)			\
> +	{							\
> +		.op = RTE_CRYPTO_OP_TYPE_SYMMETRIC,		\
> +		{.sym = {					\
> +			.xform_type = RTE_CRYPTO_SYM_XFORM_AEAD,\
> +			{.aead = {				\
> +				.algo = RTE_CRYPTO_AEAD_##n,	\
> +				b, k, d, a, i			\
> +			}, }					\
> +		}, }						\
> +	}
> +
> +#define QAT_SYM_CIPHER_CAP(n, b, k, i)				\
> +	{							\
> +		.op = RTE_CRYPTO_OP_TYPE_SYMMETRIC,		\
> +		{.sym = {					\
> +			.xform_type = RTE_CRYPTO_SYM_XFORM_CIPHER,\
> +			{.cipher = {				\
> +				.algo = RTE_CRYPTO_CIPHER_##n,	\
> +				b, k, i				\
> +			}, }					\
> +		}, }						\
> +	}
> +
>  extern uint8_t qat_sym_driver_id;
> 
> +extern struct qat_crypto_gen_dev_ops qat_sym_gen_dev_ops[];
> +
>  int
>  qat_sym_dev_create(struct qat_pci_device *qat_pci_dev,
>   struct qat_dev_cmd_param *qat_dev_cmd_param);
> --
> 2.25.1
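
For illustration, the requested per-parameter documentation might look like
this (a sketch only; the meanings are inferred from the capability range
fields, not confirmed by the patch author):

/*
 * Capability macro parameters (inferred, for illustration):
 *   n - algorithm name suffix, token-pasted into RTE_CRYPTO_AUTH_##n,
 *       RTE_CRYPTO_AEAD_##n or RTE_CRYPTO_CIPHER_##n
 *   b - block size range
 *   k - key size range
 *   d - digest size range
 *   a - AAD size range
 *   i - IV size range
 */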



[dpdk-dev] [PATCH v2 3/8] net/mlx5: improve Verbs flow priority discover for scalable

2021-10-16 Thread Xueming Li
To detect the number of supported Verbs flow priorities, the PMD tries to
create Verbs flows at different priorities. However, Verbs is not designed
to support ports larger than 255.

When DevX is supported by the kernel driver, 16 Verbs priorities must be
supported, so there is no need to create the Verbs flows.

Signed-off-by: Xueming Li 
---
 drivers/net/mlx5/mlx5_flow_verbs.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/drivers/net/mlx5/mlx5_flow_verbs.c 
b/drivers/net/mlx5/mlx5_flow_verbs.c
index b93fd4d2c96..178eabed163 100644
--- a/drivers/net/mlx5/mlx5_flow_verbs.c
+++ b/drivers/net/mlx5/mlx5_flow_verbs.c
@@ -83,6 +83,11 @@ mlx5_flow_discover_priorities(struct rte_eth_dev *dev)
int i;
int priority = 0;
 
+#if defined(HAVE_MLX5DV_DR_DEVX_PORT) || defined(HAVE_MLX5DV_DR_DEVX_PORT_V35)
+   /* If DevX supported, driver must support 16 verbs flow priorities. */
+   priority = RTE_DIM(priority_map_5);
+   goto out;
+#endif
if (!drop->qp) {
rte_errno = ENOTSUP;
return -rte_errno;
@@ -109,6 +114,9 @@ mlx5_flow_discover_priorities(struct rte_eth_dev *dev)
dev->data->port_id, priority);
return -rte_errno;
}
+#if defined(HAVE_MLX5DV_DR_DEVX_PORT) || defined(HAVE_MLX5DV_DR_DEVX_PORT_V35)
+out:
+#endif
DRV_LOG(INFO, "port %u supported flow priorities:"
" 0-%d for ingress or egress root table,"
" 0-%d for non-root table or transfer root table.",
-- 
2.33.0



[dpdk-dev] [PATCH v2 1/8] common/mlx5: add netlink API to get RDMA port state

2021-10-16 Thread Xueming Li
Introduce a netlink API to get the RDMA port state.

Port state is retrieved based on the RDMA device name and port index.
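
Usage, as patch 2/8 of this series shows (a sketch; the IB device name is a
placeholder and port indexing starts from 1):

int nl = mlx5_nl_init(NETLINK_RDMA);
int state = mlx5_nl_port_state(nl, "mlx5_0", 1);

if (state < 0)
	DRV_LOG(INFO, "Failed to get netlink port state: %s",
		strerror(rte_errno));
else if ((enum ibv_port_state)state != IBV_PORT_ACTIVE)
	DRV_LOG(INFO, "port is not active: %d", state);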

Signed-off-by: Xueming Li 
---
 drivers/common/mlx5/linux/meson.build |   2 +
 drivers/common/mlx5/linux/mlx5_nl.c   | 136 +++---
 drivers/common/mlx5/linux/mlx5_nl.h   |   2 +
 drivers/common/mlx5/version.map   |   1 +
 4 files changed, 106 insertions(+), 35 deletions(-)

diff --git a/drivers/common/mlx5/linux/meson.build 
b/drivers/common/mlx5/linux/meson.build
index cbea58f557d..2dcd27b7786 100644
--- a/drivers/common/mlx5/linux/meson.build
+++ b/drivers/common/mlx5/linux/meson.build
@@ -175,6 +175,8 @@ has_sym_args = [
 'RDMA_NLDEV_ATTR_DEV_NAME' ],
 [ 'HAVE_RDMA_NLDEV_ATTR_PORT_INDEX', 'rdma/rdma_netlink.h',
 'RDMA_NLDEV_ATTR_PORT_INDEX' ],
+[ 'HAVE_RDMA_NLDEV_ATTR_PORT_STATE', 'rdma/rdma_netlink.h',
+'RDMA_NLDEV_ATTR_PORT_STATE' ],
 [ 'HAVE_RDMA_NLDEV_ATTR_NDEV_INDEX', 'rdma/rdma_netlink.h',
 'RDMA_NLDEV_ATTR_NDEV_INDEX' ],
 [ 'HAVE_MLX5_DR_FLOW_DUMP', 'infiniband/mlx5dv.h',
diff --git a/drivers/common/mlx5/linux/mlx5_nl.c 
b/drivers/common/mlx5/linux/mlx5_nl.c
index 9120a697fd5..4b762850941 100644
--- a/drivers/common/mlx5/linux/mlx5_nl.c
+++ b/drivers/common/mlx5/linux/mlx5_nl.c
@@ -78,6 +78,9 @@
 #ifndef HAVE_RDMA_NLDEV_ATTR_PORT_INDEX
 #define RDMA_NLDEV_ATTR_PORT_INDEX 3
 #endif
+#ifndef HAVE_RDMA_NLDEV_ATTR_PORT_STATE
+#define RDMA_NLDEV_ATTR_PORT_STATE 12
+#endif
 #ifndef HAVE_RDMA_NLDEV_ATTR_NDEV_INDEX
 #define RDMA_NLDEV_ATTR_NDEV_INDEX 50
 #endif
@@ -160,14 +163,16 @@ struct mlx5_nl_mac_addr {
 #define MLX5_NL_CMD_GET_IB_INDEX (1 << 1)
 #define MLX5_NL_CMD_GET_NET_INDEX (1 << 2)
 #define MLX5_NL_CMD_GET_PORT_INDEX (1 << 3)
+#define MLX5_NL_CMD_GET_PORT_STATE (1 << 4)
 
 /** Data structure used by mlx5_nl_cmdget_cb(). */
-struct mlx5_nl_ifindex_data {
+struct mlx5_nl_port_info {
const char *name; /**< IB device name (in). */
uint32_t flags; /**< found attribute flags (out). */
uint32_t ibindex; /**< IB device index (out). */
uint32_t ifindex; /**< Network interface index (out). */
uint32_t portnum; /**< IB device max port number (out). */
+   uint16_t state; /**< IB device port state (out). */
 };
 
 uint32_t atomic_sn;
@@ -966,8 +971,8 @@ mlx5_nl_allmulti(int nlsk_fd, unsigned int iface_idx, int 
enable)
 static int
 mlx5_nl_cmdget_cb(struct nlmsghdr *nh, void *arg)
 {
-   struct mlx5_nl_ifindex_data *data = arg;
-   struct mlx5_nl_ifindex_data local = {
+   struct mlx5_nl_port_info *data = arg;
+   struct mlx5_nl_port_info local = {
.flags = 0,
};
size_t off = NLMSG_HDRLEN;
@@ -1000,6 +1005,10 @@ mlx5_nl_cmdget_cb(struct nlmsghdr *nh, void *arg)
local.portnum = *(uint32_t *)payload;
local.flags |= MLX5_NL_CMD_GET_PORT_INDEX;
break;
+   case RDMA_NLDEV_ATTR_PORT_STATE:
+   local.state = *(uint8_t *)payload;
+   local.flags |= MLX5_NL_CMD_GET_PORT_STATE;
+   break;
default:
break;
}
@@ -1016,6 +1025,7 @@ mlx5_nl_cmdget_cb(struct nlmsghdr *nh, void *arg)
data->ibindex = local.ibindex;
data->ifindex = local.ifindex;
data->portnum = local.portnum;
+   data->state = local.state;
}
return 0;
 error:
@@ -1024,7 +1034,7 @@ mlx5_nl_cmdget_cb(struct nlmsghdr *nh, void *arg)
 }
 
 /**
- * Get index of network interface associated with some IB device.
+ * Get port info of network interface associated with some IB device.
  *
  * This is the only somewhat safe method to avoid resorting to heuristics
  * when faced with port representors. Unfortunately it requires at least
@@ -1032,27 +1042,20 @@ mlx5_nl_cmdget_cb(struct nlmsghdr *nh, void *arg)
  *
  * @param nl
  *   Netlink socket of the RDMA kind (NETLINK_RDMA).
- * @param[in] name
- *   IB device name.
  * @param[in] pindex
  *   IB device port index, starting from 1
+ * @param[out] data
+ *   Pointer to port info.
  * @return
- *   A valid (nonzero) interface index on success, 0 otherwise and rte_errno
- *   is set.
+ *   0 on success, negative on error and rte_errno is set.
  */
-unsigned int
-mlx5_nl_ifindex(int nl, const char *name, uint32_t pindex)
+static int
+mlx5_nl_port_info(int nl, uint32_t pindex, struct mlx5_nl_port_info *data)
 {
-   struct mlx5_nl_ifindex_data data = {
-   .name = name,
-   .flags = 0,
-   .ibindex = 0, /* Determined during first pass. */
-   .ifindex = 0, /* Determined during second pass. */
-   };
union {
struct nlmsghdr nh;
uint8_t buf[NLMSG_HDRLEN +
-   NLA_HDRLEN + NLA_ALIGN(sizeof(data.ibindex)) +
+ 

[dpdk-dev] [PATCH v2 2/8] net/mlx5: use netlink when IB port greater than 255

2021-10-16 Thread Xueming Li
The IB spec doesn't allow more than 255 ports on a single HCA; a port number
of 256 was cast to the u8 value 0, which is invalid for ibv_query_port().

This patch invokes the Netlink API to query the port state when the port
number is greater than 255, as illustrated below.
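
A one-line illustration of the truncation (sketch):

	uint8_t port = (uint8_t)256;	/* wraps to 0; there is no IB port 0 */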

Signed-off-by: Xueming Li 
---
 drivers/net/mlx5/linux/mlx5_os.c | 46 ++--
 1 file changed, 32 insertions(+), 14 deletions(-)

diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index 3746057673d..f283a3779cc 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -956,7 +956,7 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
 {
const struct mlx5_switch_info *switch_info = &spawn->info;
struct mlx5_dev_ctx_shared *sh = NULL;
-   struct ibv_port_attr port_attr;
+   struct ibv_port_attr port_attr = { .state = IBV_PORT_NOP };
struct mlx5dv_context dv_attr = { .comp_mask = 0 };
struct rte_eth_dev *eth_dev = NULL;
struct mlx5_priv *priv = NULL;
@@ -976,6 +976,7 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
int own_domain_id = 0;
uint16_t port_id;
struct mlx5_port_info vport_info = { .query_flags = 0 };
+   int nl_rdma = -1;
int i;
 
/* Determine if this port representor is supposed to be spawned. */
@@ -1170,19 +1171,36 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
" old OFED/rdma-core version or firmware configuration");
 #endif
config->mpls_en = mpls_en;
+   nl_rdma = mlx5_nl_init(NETLINK_RDMA);
/* Check port status. */
-   err = mlx5_glue->query_port(sh->ctx, spawn->phys_port, &port_attr);
-   if (err) {
-   DRV_LOG(ERR, "port query failed: %s", strerror(err));
-   goto error;
-   }
-   if (port_attr.link_layer != IBV_LINK_LAYER_ETHERNET) {
-   DRV_LOG(ERR, "port is not configured in Ethernet mode");
-   err = EINVAL;
-   goto error;
+   if (spawn->phys_port <= UINT8_MAX) {
+   /* Legacy Verbs api only support u8 port number. */
+   err = mlx5_glue->query_port(sh->ctx, spawn->phys_port,
+   &port_attr);
+   if (err) {
+   DRV_LOG(ERR, "port query failed: %s", strerror(err));
+   goto error;
+   }
+   if (port_attr.link_layer != IBV_LINK_LAYER_ETHERNET) {
+   DRV_LOG(ERR, "port is not configured in Ethernet mode");
+   err = EINVAL;
+   goto error;
+   }
+   } else if (nl_rdma >= 0) {
+   /* IB doesn't allow more than 255 ports, must be Ethernet. */
+   err = mlx5_nl_port_state(nl_rdma,
+   ((struct ibv_device *)spawn->phys_dev)->name,
+   spawn->phys_port);
+   if (err < 0) {
+   DRV_LOG(INFO, "Failed to get netlink port state: %s",
+   strerror(rte_errno));
+   err = -rte_errno;
+   goto error;
+   }
+   port_attr.state = (enum ibv_port_state)err;
}
if (port_attr.state != IBV_PORT_ACTIVE)
-   DRV_LOG(DEBUG, "port is not active: \"%s\" (%d)",
+   DRV_LOG(INFO, "port is not active: \"%s\" (%d)",
mlx5_glue->port_state_str(port_attr.state),
port_attr.state);
/* Allocate private eth device data. */
@@ -1199,7 +1217,7 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
priv->pci_dev = spawn->pci_dev;
priv->mtu = RTE_ETHER_MTU;
/* Some internal functions rely on Netlink sockets, open them now. */
-   priv->nl_socket_rdma = mlx5_nl_init(NETLINK_RDMA);
+   priv->nl_socket_rdma = nl_rdma;
priv->nl_socket_route = mlx5_nl_init(NETLINK_ROUTE);
priv->representor = !!switch_info->representor;
priv->master = !!switch_info->master;
@@ -1910,8 +1928,6 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
mlx5_os_free_shared_dr(priv);
if (priv->nl_socket_route >= 0)
close(priv->nl_socket_route);
-   if (priv->nl_socket_rdma >= 0)
-   close(priv->nl_socket_rdma);
if (priv->vmwa_context)
mlx5_vlan_vmwa_exit(priv->vmwa_context);
if (eth_dev && priv->drop_queue.hrxq)
@@ -1935,6 +1951,8 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
}
if (sh)
mlx5_free_shared_dev_ctx(sh);
+   if (nl_rdma >= 0)
+   close(nl_rdma);
MLX5_ASSERT(err > 0);
rte_errno = err;
return NULL;
-- 
2.33.0



[dpdk-dev] [PATCH v2 4/8] net/mlx5: support E-Switch manager egress traffic match

2021-10-16 Thread Xueming Li
For egress packets on a representor, the vport ID in the transport domain
is the E-Switch manager vport ID, since the representor shares the resources
of the E-Switch manager. The E-Switch manager vport ID and the Tx queue
internal device index are used to match representor egress packets.

This patch adds a flow item port ID match on the E-Switch manager.

The E-Switch manager vport ID is 0xfffe on BlueField, 0 otherwise.
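
As an illustration, an application could request this match with a plain
port ID item (a sketch; a transfer attribute and an action are still needed
to form a complete rule):

/* MLX5_PORT_ESW_MGR (UINT32_MAX) is defined by this patch. */
struct rte_flow_item_port_id esw_mgr_spec = { .id = MLX5_PORT_ESW_MGR };
struct rte_flow_item pattern[] = {
	{ .type = RTE_FLOW_ITEM_TYPE_PORT_ID, .spec = &esw_mgr_spec },
	{ .type = RTE_FLOW_ITEM_TYPE_END },
};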

Signed-off-by: Xueming Li 
---
 drivers/net/mlx5/mlx5_flow.h|  3 +++
 drivers/net/mlx5/mlx5_flow_dv.c | 25 +
 2 files changed, 28 insertions(+)

diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
index 5c68d4f7d74..c25af8d9864 100644
--- a/drivers/net/mlx5/mlx5_flow.h
+++ b/drivers/net/mlx5/mlx5_flow.h
@@ -18,6 +18,9 @@
 
 #include "mlx5.h"
 
+/* E-Switch Manager port, used for rte_flow_item_port_id. */
+#define MLX5_PORT_ESW_MGR UINT32_MAX
+
 /* Private rte flow items. */
 enum mlx5_rte_flow_item_type {
MLX5_RTE_FLOW_ITEM_TYPE_END = INT_MIN,
diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
index c6370cd1d68..f06ce54f7e7 100644
--- a/drivers/net/mlx5/mlx5_flow_dv.c
+++ b/drivers/net/mlx5/mlx5_flow_dv.c
@@ -15,6 +15,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -92,6 +93,23 @@ static int
 flow_dv_jump_tbl_resource_release(struct rte_eth_dev *dev,
  uint32_t rix_jump);
 
+static int16_t
+flow_dv_get_esw_manager_vport_id(struct rte_eth_dev *dev)
+{
+   struct mlx5_priv *priv = dev->data->dev_private;
+
+   if (priv->pci_dev == NULL)
+   return 0;
+   switch (priv->pci_dev->id.device_id) {
+   case PCI_DEVICE_ID_MELLANOX_CONNECTX5BF:
+   case PCI_DEVICE_ID_MELLANOX_CONNECTX6DXBF:
+   case PCI_DEVICE_ID_MELLANOX_CONNECTX7BF:
+   return (int16_t)0xfffe;
+   default:
+   return 0;
+   }
+}
+
 /**
  * Initialize flow attributes structure according to flow items' types.
  *
@@ -2224,6 +2242,8 @@ flow_dv_validate_item_port_id(struct rte_eth_dev *dev,
return ret;
if (!spec)
return 0;
+   if (spec->id == MLX5_PORT_ESW_MGR)
+   return 0;
esw_priv = mlx5_port_to_eswitch_info(spec->id, false);
if (!esw_priv)
return rte_flow_error_set(error, rte_errno,
@@ -9685,6 +9705,11 @@ flow_dv_translate_item_port_id(struct rte_eth_dev *dev, 
void *matcher,
struct mlx5_priv *priv;
uint16_t mask, id;
 
+   if (pid_v && pid_v->id == MLX5_PORT_ESW_MGR) {
+   flow_dv_translate_item_source_vport(matcher, key,
> +   flow_dv_get_esw_manager_vport_id(dev), 0xffff);
> +   return 0;
> +   }
> mask = pid_m ? pid_m->id : 0xffff;
id = pid_v ? pid_v->id : dev->data->port_id;
priv = mlx5_port_to_eswitch_info(id, item == NULL);
-- 
2.33.0



[dpdk-dev] [PATCH v2 5/8] net/mlx5: supports flow item of normal Tx queue

2021-10-16 Thread Xueming Li
Extends txq flow pattern to support both hairpin and regular txq.

Signed-off-by: Xueming Li 
---
 drivers/net/mlx5/mlx5_flow_dv.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
index f06ce54f7e7..4a17ca64a2e 100644
--- a/drivers/net/mlx5/mlx5_flow_dv.c
+++ b/drivers/net/mlx5/mlx5_flow_dv.c
@@ -10910,22 +10910,22 @@ flow_dv_translate_item_tx_queue(struct rte_eth_dev 
*dev,
void *misc_v =
MLX5_ADDR_OF(fte_match_param, key, misc_parameters);
struct mlx5_txq_ctrl *txq;
-   uint32_t queue;
-
+   uint32_t queue, mask;
 
queue_m = (const void *)item->mask;
-   if (!queue_m)
-   return;
queue_v = (const void *)item->spec;
if (!queue_v)
return;
txq = mlx5_txq_get(dev, queue_v->queue);
if (!txq)
return;
-   queue = txq->obj->sq->id;
-   MLX5_SET(fte_match_set_misc, misc_m, source_sqn, queue_m->queue);
-   MLX5_SET(fte_match_set_misc, misc_v, source_sqn,
-queue & queue_m->queue);
+   if (txq->type == MLX5_TXQ_TYPE_HAIRPIN)
+   queue = txq->obj->sq->id;
+   else
+   queue = txq->obj->sq_obj.sq->id;
+   mask = queue_m == NULL ? UINT32_MAX : queue_m->queue;
+   MLX5_SET(fte_match_set_misc, misc_m, source_sqn, mask);
+   MLX5_SET(fte_match_set_misc, misc_v, source_sqn, queue & mask);
mlx5_txq_release(dev, queue_v->queue);
 }
 
-- 
2.33.0



[dpdk-dev] [PATCH v2 0/8] net/mlx5: support more than 255 representors

2021-10-16 Thread Xueming Li
This patch set allows the number of representors of a PF to be more than 255.
CX6 and the current OFED driver support a maximum of 512 SFs. CX5 supports at
most 255 SFs.

v2:
 - fixed FDB root table flow priority
 - add error check to Netlink port state API
 - commit log update and other minor fixes

Xueming Li (8):
  common/mlx5: add netlink API to get RDMA port state
  net/mlx5: use netlink when IB port greater than 255
  net/mlx5: improve Verbs flow priority discover for scalable
  net/mlx5: support E-Switch manager egress traffic match
  net/mlx5: supports flow item of normal Tx queue
  net/mlx5: fix internal root table flow priority
  net/mlx5: enable DevX Tx queue creation
  net/mlx5: check DevX to support more Verbs ports

 drivers/common/mlx5/linux/meson.build |   2 +
 drivers/common/mlx5/linux/mlx5_nl.c   | 136 +++---
 drivers/common/mlx5/linux/mlx5_nl.h   |   2 +
 drivers/common/mlx5/version.map   |   1 +
 drivers/net/mlx5/linux/mlx5_os.c  | 119 +++---
 drivers/net/mlx5/mlx5.h   |   2 +
 drivers/net/mlx5/mlx5_devx.c  |  10 +-
 drivers/net/mlx5/mlx5_devx.h  |   2 +
 drivers/net/mlx5/mlx5_flow.c  |  81 ++-
 drivers/net/mlx5/mlx5_flow.h  |   7 +-
 drivers/net/mlx5/mlx5_flow_dv.c   |  44 +++--
 drivers/net/mlx5/mlx5_flow_verbs.c|   8 ++
 drivers/net/mlx5/mlx5_trigger.c   |  11 ++-
 13 files changed, 291 insertions(+), 134 deletions(-)

-- 
2.33.0



[dpdk-dev] [PATCH v2 8/8] net/mlx5: check DevX to support more Verbs ports

2021-10-16 Thread Xueming Li
The Verbs API doesn't support device port numbers larger than 255 by design.

To support more VF or SubFunction port representors, force a DevX API
check when the maximum number of Verbs device link ports is larger than 255.

Signed-off-by: Xueming Li 
---
 drivers/net/mlx5/linux/mlx5_os.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index 93ee9318ebc..39a9722d869 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -1299,12 +1299,6 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
config->dv_flow_en = 0;
}
 #endif
-   if (spawn->max_port > UINT8_MAX) {
-   /* Verbs can't support ports larger than 255 by design. */
-   DRV_LOG(ERR, "can't support IB ports > UINT8_MAX");
-   err = EINVAL;
-   goto error;
-   }
config->ind_table_max_size =
sh->device_attr.max_rwq_indirection_table_size;
/*
@@ -1767,6 +1761,11 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
mlx5_rxq_ibv_obj_dummy_lb_create;
priv->obj_ops.lb_dummy_queue_release =
mlx5_rxq_ibv_obj_dummy_lb_release;
+   } else if (spawn->max_port > UINT8_MAX) {
+   /* Verbs can't support ports larger than 255 by design. */
> +   DRV_LOG(ERR, "must enable DV and ESW when RDMA link ports > 255");
+   err = ENOTSUP;
+   goto error;
} else {
priv->obj_ops = ibv_obj_ops;
}
-- 
2.33.0



[dpdk-dev] [PATCH v2 6/8] net/mlx5: fix internal root table flow priority

2021-10-16 Thread Xueming Li
When creating an internal transfer flow on the root table with the lowest
priority, the flow was created with the max priority UINT32_MAX. This is
wrong since the flow is created in the kernel, where the max supported
priority is 16.

This patch fixes this by adding an internal flow check.

Fixes: 5f8ae44dd454 ("net/mlx5: enlarge maximal flow priority")

Signed-off-by: Xueming Li 
---
 drivers/net/mlx5/mlx5_flow.c| 7 ++-
 drivers/net/mlx5/mlx5_flow.h| 4 ++--
 drivers/net/mlx5/mlx5_flow_dv.c | 3 ++-
 3 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index c914a7120cc..b5232cd46ae 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -980,13 +980,15 @@ mlx5_get_lowest_priority(struct rte_eth_dev *dev,
  *   Pointer to device flow rule attributes.
  * @param[in] subpriority
  *   The priority based on the items.
+ * @param[in] external
+ *   Flow is user flow.
  * @return
  *   The matcher priority of the flow.
  */
 uint16_t
 mlx5_get_matcher_priority(struct rte_eth_dev *dev,
  const struct rte_flow_attr *attr,
- uint32_t subpriority)
+ uint32_t subpriority, bool external)
 {
uint16_t priority = (uint16_t)attr->priority;
struct mlx5_priv *priv = dev->data->dev_private;
@@ -995,6 +997,9 @@ mlx5_get_matcher_priority(struct rte_eth_dev *dev,
if (attr->priority == MLX5_FLOW_LOWEST_PRIO_INDICATOR)
priority = priv->config.flow_prio - 1;
return mlx5_os_flow_adjust_priority(dev, priority, subpriority);
+   } else if (!external && attr->transfer && attr->group == 0 &&
+  attr->priority == MLX5_FLOW_LOWEST_PRIO_INDICATOR) {
+   return (priv->config.flow_prio - 1) * 3;
}
if (attr->priority == MLX5_FLOW_LOWEST_PRIO_INDICATOR)
priority = MLX5_NON_ROOT_FLOW_MAX_PRIO;
diff --git a/drivers/net/mlx5/mlx5_flow.h b/drivers/net/mlx5/mlx5_flow.h
index c25af8d9864..f1a83d537d0 100644
--- a/drivers/net/mlx5/mlx5_flow.h
+++ b/drivers/net/mlx5/mlx5_flow.h
@@ -1431,8 +1431,8 @@ uint32_t mlx5_flow_adjust_priority(struct rte_eth_dev 
*dev, int32_t priority,
 uint32_t mlx5_get_lowest_priority(struct rte_eth_dev *dev,
const struct rte_flow_attr *attr);
 uint16_t mlx5_get_matcher_priority(struct rte_eth_dev *dev,
-const struct rte_flow_attr *attr,
-uint32_t subpriority);
+  const struct rte_flow_attr *attr,
+  uint32_t subpriority, bool external);
 int mlx5_flow_get_reg_id(struct rte_eth_dev *dev,
 enum mlx5_feature_name feature,
 uint32_t id,
diff --git a/drivers/net/mlx5/mlx5_flow_dv.c b/drivers/net/mlx5/mlx5_flow_dv.c
index 4a17ca64a2e..ffc1fc8a05c 100644
--- a/drivers/net/mlx5/mlx5_flow_dv.c
+++ b/drivers/net/mlx5/mlx5_flow_dv.c
@@ -13646,7 +13646,8 @@ flow_dv_translate(struct rte_eth_dev *dev,
matcher.crc = rte_raw_cksum((const void *)matcher.mask.buf,
matcher.mask.size);
matcher.priority = mlx5_get_matcher_priority(dev, attr,
-   matcher.priority);
+matcher.priority,
+dev_flow->external);
/**
 * When creating meter drop flow in drop table, using original
 * 5-tuple match, the matcher priority should be lower than
-- 
2.33.0



[dpdk-dev] [PATCH v2 7/8] net/mlx5: enable DevX Tx queue creation

2021-10-16 Thread Xueming Li
The Verbs API does not support Infiniband device port numbers larger than 255
by design. To support more representors on a single Infiniband device, the
DevX API should be engaged.

While creating Send Queue (SQ) object with Verbs API, the PMD assigned
IB device port attribute and kernel created the default miss flows in
FDB domain, to redirect egress traffic from the queue being created to
representor appropriate peer (wire, HPF, VF or SF).

With the DevX API there is no IB-device port attribute (it is merely a kernel
one; DevX operates in PRM terms) and the PMD must create the default miss
flows in FDB explicitly. The PMD did not provide this, so using the DevX API
for E-Switch configurations was disabled.

The default miss FDB flow matches the E-Switch manager vport (to make sure
the source is some representor) and the SQn (Send Queue number - device
internal queue index). The root flow table is managed by kernel/firmware
and does not support the vport redirect action, so we have to split the
default miss flow into two (sketched after this list):

- a flow with the lowest priority in the root table that matches the
E-Switch manager vport ID and jumps to group 1.
- a flow in group 1 that matches the E-Switch manager vport ID and SQn and
forwards the packet to the peer vport.
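
Schematically, in rte_flow-like pseudo terms (a sketch of the rule split,
not the PMD's internal representation):

/* Root table (group 0), lowest priority:                          */
/*   match: source_vport == esw_manager        -> jump to group 1  */
/* Group 1, one rule per send queue:                               */
/*   match: source_vport == esw_manager && sqn -> port_id(peer)    */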

Signed-off-by: Xueming Li 
---
 drivers/net/mlx5/linux/mlx5_os.c | 62 +-
 drivers/net/mlx5/mlx5.h  |  2 +
 drivers/net/mlx5/mlx5_devx.c | 10 ++---
 drivers/net/mlx5/mlx5_devx.h |  2 +
 drivers/net/mlx5/mlx5_flow.c | 74 
 drivers/net/mlx5/mlx5_trigger.c  | 11 -
 6 files changed, 94 insertions(+), 67 deletions(-)

diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index f283a3779cc..93ee9318ebc 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -697,56 +697,6 @@ mlx5_init_once(void)
return ret;
 }
 
-/**
- * Create the Tx queue DevX/Verbs object.
- *
- * @param dev
- *   Pointer to Ethernet device.
- * @param idx
- *   Queue index in DPDK Tx queue array.
- *
- * @return
- *   0 on success, a negative errno value otherwise and rte_errno is set.
- */
-static int
-mlx5_os_txq_obj_new(struct rte_eth_dev *dev, uint16_t idx)
-{
-   struct mlx5_priv *priv = dev->data->dev_private;
-   struct mlx5_txq_data *txq_data = (*priv->txqs)[idx];
-   struct mlx5_txq_ctrl *txq_ctrl =
-   container_of(txq_data, struct mlx5_txq_ctrl, txq);
-
-   if (txq_ctrl->type == MLX5_TXQ_TYPE_HAIRPIN)
-   return mlx5_txq_devx_obj_new(dev, idx);
-#ifdef HAVE_MLX5DV_DEVX_UAR_OFFSET
-   if (!priv->config.dv_esw_en)
-   return mlx5_txq_devx_obj_new(dev, idx);
-#endif
-   return mlx5_txq_ibv_obj_new(dev, idx);
-}
-
-/**
- * Release an Tx DevX/verbs queue object.
- *
- * @param txq_obj
- *   DevX/Verbs Tx queue object.
- */
-static void
-mlx5_os_txq_obj_release(struct mlx5_txq_obj *txq_obj)
-{
-   if (txq_obj->txq_ctrl->type == MLX5_TXQ_TYPE_HAIRPIN) {
-   mlx5_txq_devx_obj_release(txq_obj);
-   return;
-   }
-#ifdef HAVE_MLX5DV_DEVX_UAR_OFFSET
-   if (!txq_obj->txq_ctrl->priv->config.dv_esw_en) {
-   mlx5_txq_devx_obj_release(txq_obj);
-   return;
-   }
-#endif
-   mlx5_txq_ibv_obj_release(txq_obj);
-}
-
 /**
  * DV flow counter mode detect and config.
  *
@@ -1812,16 +1762,6 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
ibv_obj_ops.drop_action_create;
priv->obj_ops.drop_action_destroy =
ibv_obj_ops.drop_action_destroy;
-#ifndef HAVE_MLX5DV_DEVX_UAR_OFFSET
-   priv->obj_ops.txq_obj_modify = ibv_obj_ops.txq_obj_modify;
-#else
-   if (config->dv_esw_en)
-   priv->obj_ops.txq_obj_modify =
-   ibv_obj_ops.txq_obj_modify;
-#endif
-   /* Use specific wrappers for Tx object. */
-   priv->obj_ops.txq_obj_new = mlx5_os_txq_obj_new;
-   priv->obj_ops.txq_obj_release = mlx5_os_txq_obj_release;
mlx5_queue_counter_id_prepare(eth_dev);
priv->obj_ops.lb_dummy_queue_create =
mlx5_rxq_ibv_obj_dummy_lb_create;
@@ -1832,7 +1772,7 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
}
if (config->tx_pp &&
(priv->config.dv_esw_en ||
-priv->obj_ops.txq_obj_new != mlx5_os_txq_obj_new)) {
+priv->obj_ops.txq_obj_new != mlx5_txq_devx_obj_new)) {
/*
 * HAVE_MLX5DV_DEVX_UAR_OFFSET is required to support
 * packet pacing and already checked above.
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 3581414b789..570f827375a 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -1699,6 +1699,8 @@ int mlx5_ctrl_flow(struct rte_eth_dev *dev,
   struct rte_flow_item_eth *et

Re: [dpdk-dev] [EXT] [PATCH] cryptodev: extend data-unit length field

2021-10-16 Thread Akhil Goyal
> As described in [1] and as announced in [2], The field ``dataunit_len``
> of the ``struct rte_crypto_cipher_xform`` moved to the end of the
> structure and extended to ``uint32_t``.
> 
> In this way, sizes bigger than 64K bytes can be supported for data-unit
> lengths.
> 
> [1] commit d014dddb2d69 ("cryptodev: support multiple cipher
> data-units")
> [2] commit 9a5c09211b3a ("doc: announce extension of crypto data-unit
> length")
> 
> Signed-off-by: Matan Azrad 
Acked-by: Akhil Goyal 
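
For reference, the resulting layout looks roughly like this (a sketch
reconstructed from the commit description, not copied verbatim from the
header):

struct rte_crypto_cipher_xform {
	enum rte_crypto_cipher_operation op;
	enum rte_crypto_cipher_algorithm algo;
	struct {
		const uint8_t *data;	/* pointer to key data */
		uint16_t length;	/* key length in bytes */
	} key;
	struct {
		uint16_t offset;
		uint16_t length;
	} iv;
	uint32_t dataunit_len;	/* moved to the end, widened to 32 bits */
};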


[dpdk-dev] [PATCH v7 0/5] ethdev: introduce shared Rx queue

2021-10-16 Thread Xueming Li
In the current DPDK framework, all Rx queues are pre-loaded with mbufs for
incoming packets. When the number of representors scales out in a switch
domain, the memory consumption becomes significant. Furthermore,
polling all ports leads to high cache miss, high latency and low
throughput.

This patch introduces shared Rx queues. A PF and representors in the same
Rx domain and switch domain can share an Rx queue set by specifying a
non-zero share group value in the Rx queue configuration.

All ports that share an Rx queue actually share the hardware descriptor
queue and feed all Rx queues with one descriptor supply; memory is saved.

Polling any queue of a shared Rx queue receives packets from all
member ports. The source port is identified by mbuf->port.

Multiple groups are supported by group ID. The port queue number in a shared
group should be identical. Queue indexes are 1:1 mapped within a shared group.
An example of two share groups:
 Group1, 4 shared Rx queues per member port: PF, repr0, repr1
 Group2, 2 shared Rx queues per member port: repr2, repr3, ... repr127
 Poll first port for each group:
  core  portqueue
  0 0   0
  1 0   1
  2 0   2
  3 0   3
  4 2   0
  5 2   1

A shared Rx queue must be polled on a single thread or core. If both PF0 and
representor0 joined the same share group, pf0rxq0 can't be polled on core1
and rep0rxq0 on core2. Actually, polling one port within a share group is
sufficient, since polling any port in the group will return packets for any
port in the group, as the sketch below shows.
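
A minimal polling sketch (handle_pkt() is a hypothetical consumer;
rte_eth_rx_burst() and mbuf->port are standard ethdev/mbuf API):

struct rte_mbuf *pkts[32];
/* Poll one member port; the burst may contain packets of any member. */
uint16_t nb = rte_eth_rx_burst(pf_port_id, 0, pkts, 32);
uint16_t i;

for (i = 0; i < nb; i++)
	handle_pkt(pkts[i]->port /* actual source port */, pkts[i]);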

There was some discussion about aggregating member ports in the same group
into a dummy port, and several ways to achieve it. Since it is optional, we
need to collect more feedback and requirements from users, and make a better
decision later.

v1:
  - initial version
v2:
  - add testpmd patches
v3:
  - change common forwarding api to macro for performance, thanks Jerin.
  - save global variable accessed in forwarding to flowstream to minimize
cache miss
  - combined patches for each forwarding engine
  - support multiple groups in testpmd "--share-rxq" parameter
  - new api to aggregate shared rxq group
v4:
  - spelling fixes
  - remove shared-rxq support for all forwarding engines
  - add dedicate shared-rxq forwarding engine
v5:
 - fix grammars
 - remove aggregate api and leave it for later discussion
 - add release notes
 - add deployment example
v6:
 - replace RxQ offload flag with device offload capability flag
 - add Rx domain
 - RxQ is shared when share group > 0
 - update testpmd accordingly
v7:
 - fix testpmd share group id allocation
 - change rx_domain to 16bits

Xueming Li (5):
  ethdev: introduce shared Rx queue
  app/testpmd: new parameter to enable shared Rx queue
  app/testpmd: dump port info for shared Rx queue
  app/testpmd: force shared Rx queue polled on same core
  app/testpmd: add forwarding engine for shared Rx queue

 app/test-pmd/config.c | 106 -
 app/test-pmd/meson.build  |   1 +
 app/test-pmd/parameters.c |  13 ++
 app/test-pmd/shared_rxq_fwd.c | 148 ++
 app/test-pmd/testpmd.c|  23 ++-
 app/test-pmd/testpmd.h|   5 +
 app/test-pmd/util.c   |   3 +
 doc/guides/nics/features.rst  |  13 ++
 doc/guides/nics/features/default.ini  |   1 +
 .../prog_guide/switch_representation.rst  |  10 ++
 doc/guides/rel_notes/release_21_11.rst|   6 +
 doc/guides/testpmd_app_ug/run_app.rst |   8 +
 doc/guides/testpmd_app_ug/testpmd_funcs.rst   |   5 +-
 lib/ethdev/rte_ethdev.c   |   8 +
 lib/ethdev/rte_ethdev.h   |  21 +++
 15 files changed, 365 insertions(+), 6 deletions(-)
 create mode 100644 app/test-pmd/shared_rxq_fwd.c

-- 
2.33.0



[dpdk-dev] [PATCH v7 2/5] app/testpmd: new parameter to enable shared Rx queue

2021-10-16 Thread Xueming Li
Adds a "--rxq-share=X" parameter to enable shared RxQs: queues are shared if
the device supports it, otherwise they fall back to standard RxQs.

The share group number grows every X ports. X defaults to MAX, which implies
that all ports join share group 1.

The "shared-rxq" forwarding engine should be used; it is Rx-only and updates
stream statistics correctly. An example invocation follows.
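
For example (a sketch; the PCI address and representor range are
placeholders):

	dpdk-testpmd -a 0000:08:00.0,representor=0-127 -- -i \
		--rxq=4 --txq=4 --rxq-share=2 --forward-mode=shared-rxq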

Signed-off-by: Xueming Li 
---
 app/test-pmd/config.c |  6 +-
 app/test-pmd/parameters.c | 13 +
 app/test-pmd/testpmd.c| 18 +++---
 app/test-pmd/testpmd.h|  2 ++
 doc/guides/testpmd_app_ug/run_app.rst |  7 +++
 5 files changed, 42 insertions(+), 4 deletions(-)

diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 9c66329e96e..96fc2ab888b 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -2709,7 +2709,11 @@ rxtx_config_display(void)
printf("  RX threshold registers: pthresh=%d 
hthresh=%d "
" wthresh=%d\n",
pthresh_tmp, hthresh_tmp, wthresh_tmp);
-   printf("  RX Offloads=0x%"PRIx64"\n", offloads_tmp);
+   printf("  RX Offloads=0x%"PRIx64, offloads_tmp);
+   if (rx_conf->share_group > 0)
+   printf(" share group=%u",
+  rx_conf->share_group);
+   printf("\n");
}
 
/* per tx queue config only for first queue to be less verbose 
*/
diff --git a/app/test-pmd/parameters.c b/app/test-pmd/parameters.c
index 3f94a82e321..30dae326310 100644
--- a/app/test-pmd/parameters.c
+++ b/app/test-pmd/parameters.c
@@ -167,6 +167,7 @@ usage(char* progname)
printf("  --tx-ip=src,dst: IP addresses in Tx-only mode\n");
printf("  --tx-udp=src[,dst]: UDP ports in Tx-only mode\n");
printf("  --eth-link-speed: force link speed.\n");
+   printf("  --rxq-share: number of ports per shared rxq groups, defaults 
to MAX(1 group)\n");
printf("  --disable-link-check: disable check on link status when "
   "starting/stopping ports.\n");
printf("  --disable-device-start: do not automatically start port\n");
@@ -607,6 +608,7 @@ launch_args_parse(int argc, char** argv)
{ "rxpkts", 1, 0, 0 },
{ "txpkts", 1, 0, 0 },
{ "txonly-multi-flow",  0, 0, 0 },
+   { "rxq-share",  2, 0, 0 },
{ "eth-link-speed", 1, 0, 0 },
{ "disable-link-check", 0, 0, 0 },
{ "disable-device-start",   0, 0, 0 },
@@ -1271,6 +1273,17 @@ launch_args_parse(int argc, char** argv)
}
if (!strcmp(lgopts[opt_idx].name, "txonly-multi-flow"))
txonly_multi_flow = 1;
+   if (!strcmp(lgopts[opt_idx].name, "rxq-share")) {
+   if (optarg == NULL) {
+   rxq_share = UINT32_MAX;
+   } else {
+   n = atoi(optarg);
+   if (n >= 0)
+   rxq_share = (uint32_t)n;
+   else
+   rte_exit(EXIT_FAILURE, 
"rxq-share must be >= 0\n");
+   }
+   }
if (!strcmp(lgopts[opt_idx].name, "no-flush-rx"))
no_flush_rx = 1;
if (!strcmp(lgopts[opt_idx].name, "eth-link-speed")) {
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 97ae52e17ec..4c501bf43f3 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -498,6 +498,11 @@ uint8_t record_core_cycles;
  */
 uint8_t record_burst_stats;
 
+/*
+ * Number of ports per shared Rx queue group, 0 disable.
+ */
+uint32_t rxq_share;
+
 unsigned int num_sockets = 0;
 unsigned int socket_ids[RTE_MAX_NUMA_NODES];
 
@@ -3393,14 +3398,21 @@ dev_event_callback(const char *device_name, enum 
rte_dev_event_type type,
 }
 
 static void
-rxtx_port_config(struct rte_port *port)
+rxtx_port_config(portid_t pid)
 {
uint16_t qid;
uint64_t offloads;
+   struct rte_port *port = &ports[pid];
 
for (qid = 0; qid < nb_rxq; qid++) {
offloads = port->rx_conf[qid].offloads;
port->rx_conf[qid] = port->dev_info.default_rxconf;
+
+   if (rxq_share > 0 &&
+   (port->dev_info.dev_capa & RTE_ETH_DEV_CAPA_RXQ_SHARE))
+   /* Non-zero share group to enable RxQ share. */
+   port->rx_conf[qid].share_group = pid / rxq_share + 1;
+
if (offloads != 0)
port->rx_conf[qid].of

[dpdk-dev] [PATCH v7 1/5] ethdev: introduce shared Rx queue

2021-10-16 Thread Xueming Li
In the current DPDK framework, each Rx queue is pre-loaded with mbufs to
store incoming packets. For some PMDs, when the number of representors
scales out in a switch domain, the memory consumption becomes significant.
Polling all ports also leads to high cache miss, high latency and low
throughput.

This patch introduces shared Rx queues. Ports in the same Rx domain and
switch domain can share an Rx queue set by specifying a non-zero sharing
group in the Rx queue configuration.

Port A RxQ X can share an RxQ with Port B RxQ X, but can't share with RxQ
Y. All member ports in a share group share a list of shared Rx queues
indexed by Rx queue ID.

No special API is defined to receive packets from a shared Rx queue.
Polling any member port of a shared Rx queue receives packets of that
queue for all member ports; the source port is identified by mbuf->port.

A shared Rx queue must be polled in the same thread or core; polling a
queue ID of any member port is essentially the same.

Multiple share groups are supported. A device should support mixed
configurations by allowing multiple share groups and non-shared Rx queues.

Example grouping and polling model to reflect service priority:
 Group1, 2 shared Rx queues per port: PF, rep0, rep1
 Group2, 1 shared Rx queue per port: rep2, rep3, ... rep127
 Core0: poll PF queue0
 Core1: poll PF queue1
 Core2: poll rep2 queue0

A PMD advertises the shared Rx queue capability via RTE_ETH_DEV_CAPA_RXQ_SHARE.

The PMD is responsible for shared Rx queue consistency checks, to avoid
member ports' configurations contradicting each other. A configuration
sketch follows.
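
A configuration sketch (error handling omitted; nb_desc and the mempool mp
are assumed to exist; share_group is the field added by this patch):

struct rte_eth_dev_info info;
struct rte_eth_rxconf rxconf = { 0 };

rte_eth_dev_info_get(port_id, &info);
if (info.dev_capa & RTE_ETH_DEV_CAPA_RXQ_SHARE) {
	rxconf.share_group = 1;	/* non-zero: join share group 1 */
	rte_eth_rx_queue_setup(port_id, 0, nb_desc,
			       rte_socket_id(), &rxconf, mp);
}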

Signed-off-by: Xueming Li 
---
 doc/guides/nics/features.rst  | 13 
 doc/guides/nics/features/default.ini  |  1 +
 .../prog_guide/switch_representation.rst  | 10 +
 doc/guides/rel_notes/release_21_11.rst|  6 ++
 lib/ethdev/rte_ethdev.c   |  8 +++
 lib/ethdev/rte_ethdev.h   | 21 +++
 6 files changed, 59 insertions(+)

diff --git a/doc/guides/nics/features.rst b/doc/guides/nics/features.rst
index e346018e4b8..b64433b8ea5 100644
--- a/doc/guides/nics/features.rst
+++ b/doc/guides/nics/features.rst
@@ -615,6 +615,19 @@ Supports inner packet L4 checksum.
   ``tx_offload_capa,tx_queue_offload_capa:DEV_TX_OFFLOAD_OUTER_UDP_CKSUM``.
 
 
+.. _nic_features_shared_rx_queue:
+
+Shared Rx queue
+---
+
+Supports shared Rx queue for ports in same Rx domain of a switch domain.
+
+* **[uses] rte_eth_dev_info**: ``dev_capa:RTE_ETH_DEV_CAPA_RXQ_SHARE``.
+* **[uses] rte_eth_dev_info,rte_eth_switch_info**: ``rx_domain``, 
``domain_id``.
+* **[uses] rte_eth_rxconf**: ``share_group``.
+* **[provides] mbuf**: ``mbuf.port``.
+
+
 .. _nic_features_packet_type_parsing:
 
 Packet type parsing
diff --git a/doc/guides/nics/features/default.ini 
b/doc/guides/nics/features/default.ini
index d473b94091a..93f5d1b46f4 100644
--- a/doc/guides/nics/features/default.ini
+++ b/doc/guides/nics/features/default.ini
@@ -19,6 +19,7 @@ Free Tx mbuf on demand =
 Queue start/stop =
 Runtime Rx queue setup =
 Runtime Tx queue setup =
+Shared Rx queue  =
 Burst mode info  =
 Power mgmt address monitor =
 MTU update   =
diff --git a/doc/guides/prog_guide/switch_representation.rst 
b/doc/guides/prog_guide/switch_representation.rst
index ff6aa91c806..de41db8385d 100644
--- a/doc/guides/prog_guide/switch_representation.rst
+++ b/doc/guides/prog_guide/switch_representation.rst
@@ -123,6 +123,16 @@ thought as a software "patch panel" front-end for 
applications.
 .. [1] `Ethernet switch device driver model (switchdev)
`_
 
+- For some PMDs, memory usage of representors is huge when number of
+  representor grows, mbufs are allocated for each descriptor of Rx queue.
+  Polling large number of ports brings more CPU load, cache miss and
+  latency. Shared Rx queue can be used to share Rx queue between PF and
+  representors among same Rx domain. ``RTE_ETH_DEV_CAPA_RXQ_SHARE`` is
+  present in device capability of device info. Setting non-zero share group
+  in Rx queue configuration to enable share. Polling any member port can
+  receive packets of all member ports in the group, port ID is saved in
+  ``mbuf.port``.
+
 Basic SR-IOV
 
 
diff --git a/doc/guides/rel_notes/release_21_11.rst 
b/doc/guides/rel_notes/release_21_11.rst
index 4c56cdfeaaa..1c84e896554 100644
--- a/doc/guides/rel_notes/release_21_11.rst
+++ b/doc/guides/rel_notes/release_21_11.rst
@@ -67,6 +67,12 @@ New Features
   * Modified to allow ``--huge-dir`` option to specify a sub-directory
 within a hugetlbfs mountpoint.
 
+* **Added ethdev shared Rx queue support.**
+
+  * Added new device capability flag and rx domain field to switch info.
+  * Added share group to Rx queue configuration.
+  * Added testpmd support and dedicate forwarding engine.
+
 * **Added new RSS offload types for IPv4/L4 checksum in RSS flow.**
 
   Added macros ETH_RSS_IP

[dpdk-dev] [PATCH v7 3/5] app/testpmd: dump port info for shared Rx queue

2021-10-16 Thread Xueming Li
In case of shared Rx queue, polling any member port returns mbufs for
all members. This patch dumps mbuf->port for each packet.

Signed-off-by: Xueming Li 
---
 app/test-pmd/util.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/app/test-pmd/util.c b/app/test-pmd/util.c
index 51506e49404..e98f136d5ed 100644
--- a/app/test-pmd/util.c
+++ b/app/test-pmd/util.c
@@ -100,6 +100,9 @@ dump_pkt_burst(uint16_t port_id, uint16_t queue, struct 
rte_mbuf *pkts[],
struct rte_flow_restore_info info = { 0, };
 
mb = pkts[i];
+   if (rxq_share > 0)
+   MKDUMPSTR(print_buf, buf_size, cur_len, "port %u, ",
+ mb->port);
eth_hdr = rte_pktmbuf_read(mb, 0, sizeof(_eth_hdr), &_eth_hdr);
eth_type = RTE_BE_TO_CPU_16(eth_hdr->ether_type);
packet_type = mb->packet_type;
-- 
2.33.0



[dpdk-dev] [PATCH v7 4/5] app/testpmd: force shared Rx queue polled on same core

2021-10-16 Thread Xueming Li
A shared Rx queue must be polled on the same core. This patch checks and
stops forwarding if a shared RxQ is being scheduled on multiple cores.

It's suggested to use the same number of Rx queues and polling cores.

Signed-off-by: Xueming Li 
---
 app/test-pmd/config.c  | 100 +
 app/test-pmd/testpmd.c |   4 +-
 app/test-pmd/testpmd.h |   2 +
 3 files changed, 105 insertions(+), 1 deletion(-)

diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 96fc2ab888b..9acd2705f18 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -2885,6 +2885,106 @@ port_rss_hash_key_update(portid_t port_id, char 
rss_type[], uint8_t *hash_key,
}
 }
 
+/*
+ * Check whether a shared rxq scheduled on other lcores.
+ */
+static bool
+fwd_stream_on_other_lcores(uint16_t domain_id, portid_t src_port,
+  queueid_t src_rxq, lcoreid_t src_lc,
+  uint32_t share_group)
+{
+   streamid_t sm_id;
+   streamid_t nb_fs_per_lcore;
+   lcoreid_t  nb_fc;
+   lcoreid_t  lc_id;
+   struct fwd_stream *fs;
+   struct rte_port *port;
+   struct rte_eth_dev_info *dev_info;
+   struct rte_eth_rxconf *rxq_conf;
+
+   nb_fc = cur_fwd_config.nb_fwd_lcores;
+   for (lc_id = src_lc + 1; lc_id < nb_fc; lc_id++) {
+   sm_id = fwd_lcores[lc_id]->stream_idx;
+   nb_fs_per_lcore = fwd_lcores[lc_id]->stream_nb;
+   for (; sm_id < fwd_lcores[lc_id]->stream_idx + nb_fs_per_lcore;
+sm_id++) {
+   fs = fwd_streams[sm_id];
+   port = &ports[fs->rx_port];
+   dev_info = &port->dev_info;
+   rxq_conf = &port->rx_conf[fs->rx_queue];
+   if ((dev_info->dev_capa & RTE_ETH_DEV_CAPA_RXQ_SHARE)
+   == 0)
+   /* Not shared rxq. */
+   continue;
+   if (domain_id != port->dev_info.switch_info.domain_id)
+   continue;
+   if (fs->rx_queue != src_rxq)
+   continue;
+   if (rxq_conf->share_group != share_group)
+   continue;
+   printf("Shared Rx queue group %u can't be scheduled on 
different cores:\n",
+  share_group);
+   printf("  lcore %hhu Port %hu queue %hu\n",
+  src_lc, src_port, src_rxq);
+   printf("  lcore %hhu Port %hu queue %hu\n",
+  lc_id, fs->rx_port, fs->rx_queue);
+   printf("  please use --nb-cores=%hu to limit forwarding 
cores\n",
+  nb_rxq);
+   return true;
+   }
+   }
+   return false;
+}
+
+/*
+ * Check shared rxq configuration.
+ *
+ * Shared group must not being scheduled on different core.
+ */
+bool
+pkt_fwd_shared_rxq_check(void)
+{
+   streamid_t sm_id;
+   streamid_t nb_fs_per_lcore;
+   lcoreid_t  nb_fc;
+   lcoreid_t  lc_id;
+   struct fwd_stream *fs;
+   uint16_t domain_id;
+   struct rte_port *port;
+   struct rte_eth_dev_info *dev_info;
+   struct rte_eth_rxconf *rxq_conf;
+
+   nb_fc = cur_fwd_config.nb_fwd_lcores;
+   /*
+* Check streams on each core, make sure the same switch domain +
+* group + queue doesn't get scheduled on other cores.
+*/
+   for (lc_id = 0; lc_id < nb_fc; lc_id++) {
+   sm_id = fwd_lcores[lc_id]->stream_idx;
+   nb_fs_per_lcore = fwd_lcores[lc_id]->stream_nb;
+   for (; sm_id < fwd_lcores[lc_id]->stream_idx + nb_fs_per_lcore;
+sm_id++) {
+   fs = fwd_streams[sm_id];
+   /* Update lcore info stream being scheduled. */
+   fs->lcore = fwd_lcores[lc_id];
+   port = &ports[fs->rx_port];
+   dev_info = &port->dev_info;
+   rxq_conf = &port->rx_conf[fs->rx_queue];
+   if ((dev_info->dev_capa & RTE_ETH_DEV_CAPA_RXQ_SHARE)
+   == 0)
+   /* Not shared rxq. */
+   continue;
+   /* Check shared rxq not scheduled on remaining cores. */
+   domain_id = port->dev_info.switch_info.domain_id;
+   if (fwd_stream_on_other_lcores(domain_id, fs->rx_port,
+  fs->rx_queue, lc_id,
+  rxq_conf->share_group))
+   return false;
+   }
+   }
+   return true;
+}
+
 /*
  * Setup forwarding configuration for each logical core.
  */
diff --git a/app/test-pmd/te

[dpdk-dev] [PATCH v7 5/5] app/testpmd: add forwarding engine for shared Rx queue

2021-10-16 Thread Xueming Li
To support shared Rx queue, this patch introduces a dedicated forwarding
engine. The engine groups received packets into sub-bursts by mbuf->port,
updates stream statistics and simply frees the packets.

Signed-off-by: Xueming Li 
---
 app/test-pmd/meson.build|   1 +
 app/test-pmd/shared_rxq_fwd.c   | 148 
 app/test-pmd/testpmd.c  |   1 +
 app/test-pmd/testpmd.h  |   1 +
 doc/guides/testpmd_app_ug/run_app.rst   |   1 +
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |   5 +-
 6 files changed, 156 insertions(+), 1 deletion(-)
 create mode 100644 app/test-pmd/shared_rxq_fwd.c

diff --git a/app/test-pmd/meson.build b/app/test-pmd/meson.build
index 98f3289bdfa..07042e45b12 100644
--- a/app/test-pmd/meson.build
+++ b/app/test-pmd/meson.build
@@ -21,6 +21,7 @@ sources = files(
 'noisy_vnf.c',
 'parameters.c',
 'rxonly.c',
+'shared_rxq_fwd.c',
 'testpmd.c',
 'txonly.c',
 'util.c',
diff --git a/app/test-pmd/shared_rxq_fwd.c b/app/test-pmd/shared_rxq_fwd.c
new file mode 100644
index 000..4e262b99bc7
--- /dev/null
+++ b/app/test-pmd/shared_rxq_fwd.c
@@ -0,0 +1,148 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright (c) 2021 NVIDIA Corporation & Affiliates
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "testpmd.h"
+
+/*
+ * Rx only sub-burst forwarding.
+ */
+static void
+forward_rx_only(uint16_t nb_rx, struct rte_mbuf **pkts_burst)
+{
+   rte_pktmbuf_free_bulk(pkts_burst, nb_rx);
+}
+
+/**
+ * Get packet source stream by source port and queue.
+ * All streams of the same shared Rx queue are located on the same core.
+ */
+static struct fwd_stream *
+forward_stream_get(struct fwd_stream *fs, uint16_t port)
+{
+   streamid_t sm_id;
+   struct fwd_lcore *fc;
+   struct fwd_stream **fsm;
+   streamid_t nb_fs;
+
+   fc = fs->lcore;
+   fsm = &fwd_streams[fc->stream_idx];
+   nb_fs = fc->stream_nb;
+   for (sm_id = 0; sm_id < nb_fs; sm_id++) {
+   if (fsm[sm_id]->rx_port == port &&
+   fsm[sm_id]->rx_queue == fs->rx_queue)
+   return fsm[sm_id];
+   }
+   return NULL;
+}
+
+/**
+ * Forward packet by source port and queue.
+ */
+static void
+forward_sub_burst(struct fwd_stream *src_fs, uint16_t port, uint16_t nb_rx,
+ struct rte_mbuf **pkts)
+{
+   struct fwd_stream *fs = forward_stream_get(src_fs, port);
+
+   if (fs != NULL) {
+   fs->rx_packets += nb_rx;
+   forward_rx_only(nb_rx, pkts);
+   } else {
+   /* Source stream not found, drop all packets. */
+   src_fs->fwd_dropped += nb_rx;
+   while (nb_rx > 0)
+   rte_pktmbuf_free(pkts[--nb_rx]);
+   }
+}
+
+/**
+ * Forward packets from shared Rx queue.
+ *
+ * The source port of each packet is identified by mbuf->port.
+ */
+static void
+forward_shared_rxq(struct fwd_stream *fs, uint16_t nb_rx,
+  struct rte_mbuf **pkts_burst)
+{
+   uint16_t i, nb_sub_burst, port, last_port;
+
+   nb_sub_burst = 0;
+   last_port = pkts_burst[0]->port;
+   /* Locate sub-burst according to mbuf->port. */
+   for (i = 0; i < nb_rx - 1; ++i) {
+   rte_prefetch0(pkts_burst[i + 1]);
+   port = pkts_burst[i]->port;
+   if (i > 0 && last_port != port) {
+   /* Forward packets with same source port. */
+   forward_sub_burst(fs, last_port, nb_sub_burst,
+ &pkts_burst[i - nb_sub_burst]);
+   nb_sub_burst = 0;
+   last_port = port;
+   }
+   nb_sub_burst++;
+   }
+   /* Last sub-burst. */
+   nb_sub_burst++;
+   forward_sub_burst(fs, last_port, nb_sub_burst,
+ &pkts_burst[nb_rx - nb_sub_burst]);
+}
+
+static void
+shared_rxq_fwd(struct fwd_stream *fs)
+{
+   struct rte_mbuf *pkts_burst[nb_pkt_per_burst];
+   uint16_t nb_rx;
+   uint64_t start_tsc = 0;
+
+   get_start_cycles(&start_tsc);
+   nb_rx = rte_eth_rx_burst(fs->rx_port, fs->rx_queue, pkts_burst,
+nb_pkt_per_burst);
+   inc_rx_burst_stats(fs, nb_rx);
+   if (unlikely(nb_rx == 0))
+   return;
+   forward_shared_rxq(fs, nb_rx, pkts_burst);
+   get_end_cycles(fs, start_tsc);
+}
+
+struct fwd_engine shared_rxq_engine = {
+   .fwd_mode_name  = "shared_rxq",
+   .port_fwd_begin = NULL,
+   .port_fwd_end   = NULL,
+   .packet_fwd = shared_rxq_fwd,
+};
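
Once registered, the engine is selected like any other forwarding mode
(usage sketch; the mode name comes from fwd_mode_name above):

  testpmd> set fwd shared_rxq
  testpmd> start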

Re: [dpdk-dev] [EXT] Re: [Bug 828] [dpdk-21.11] zuc unit test is failing

2021-10-16 Thread Zhang, Roy Fan
Hi Akhil,

Ciara managed to include Pablo's fix into the ipsec-mb patches in V4.
https://patchwork.dpdk.org/project/dpdk/patch/20211015143957.842499-6-ciara.po...@intel.com/
https://patchwork.dpdk.org/project/dpdk/patch/20211015143957.842499-7-ciara.po...@intel.com/

Regards,
Fan

> -Original Message-
> From: Akhil Goyal 
> Sent: Friday, October 15, 2021 7:05 PM
> To: David Marchand ; Vidya Sagar Velumuri
> 
> Cc: dev ; Zhang, Roy Fan ; De
> Lara Guarch, Pablo 
> Subject: RE: [EXT] Re: [dpdk-dev] [Bug 828] [dpdk-21.11] zuc unit test is
> failing
> 
> > Hello,
> >
> > On Fri, Oct 15, 2021 at 10:02 AM  wrote:
> > >
> > > https://bugs.dpdk.org/show_bug.cgi?id=828
> > >
> > > Bug ID: 828
> > >Summary: [dpdk-21.11] zuc unit test is failing
> > >Product: DPDK
> > >Version: 21.11
> > >   Hardware: All
> > > OS: Linux
> > > Status: UNCONFIRMED
> > >   Severity: normal
> > >   Priority: Normal
> > >  Component: cryptodev
> > >   Assignee: dev@dpdk.org
> > >   Reporter: varalakshm...@intel.com
> > >   Target Milestone: ---
> >
> > I could not assign this bug to you in bz, can you have a look?
> > Thanks.
> >
> Can somebody from Intel look into this?
> We don’t have intel-ipsec-mb library access, so cannot reproduce the issue.
> The test case is passing on cnxk hardware. Please let us know if the vectors
> added in the patch are not correct.
> 
> >
> > >
> > > Steps to reproduce
> > >
> > > from dpdk path, the following steps should be followed:
> > > x86_64-native-linuxapp-gcc/app/test/dpdk-test -l 1,2,3 --vdev
> crypto_zuc0
> > > --socket-mem 2048,0 -n 4 --log-level=6 -a :1a:01.0
> > >
> > > RTE>> cryptodev_sw_zuc_autotest
> > >
> >
> > [snip]
> >
> > > + --- +
> > > + Sub Testsuites Total : 27
> > > + Sub Testsuites Skipped : 25
> > > + Sub Testsuites Passed : 1
> > > + Sub Testsuites Failed : 1
> > > + --- +
> > > + Tests Total : 511
> > > + Tests Skipped : 488
> > > + Tests Executed : 65
> > > + Tests Unsupported: 0
> > > + Tests Passed : 20
> > > + Tests Failed : 3
> > > + --- +
> > > Test Failed
> > > RTE>>
> > >
> >
> > > 
> > > fa5bf9345d4e0141ac40f154b1c1a4b99e8fe9a3 is the first bad commit
> > >
> > > commit fa5bf9345d4e0141ac40f154b1c1a4b99e8fe9a3
> > > Author: Vidya Sagar Velumuri 
> > > Date: Wed Sep 15 06:11:03 2021 +
> > >
> > > test/crypto: add ZUC cases with 256-bit keys
> > >
> > > Add test cases for zuc 256 bit key.
> > > Add test case for zuc 8 and 16 byte digest with
> > > 256 bit key mode
> > >
> > > Signed-off-by: Vidya Sagar Velumuri 
> > > Acked-by: Akhil Goyal 
> >
> >
> >
> > --
> > David Marchand



[dpdk-dev] [PATCH v2 01/13] common/mlx5: support receive queue user index

2021-10-16 Thread Xueming Li
The RQ user index is saved in the CQE when a packet is received by the RQ.
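
A minimal sketch of how the 24-bit user index could be reassembled from
the new fields, assuming the PRM big-endian CQE layout (the helper name
is hypothetical, not part of this patch):

  /* Hypothetical helper: combine the high byte and the big-endian low
   * 16 bits of the CQE user index into one host-order value. */
  static inline uint32_t
  mlx5_cqe_user_index(const struct mlx5_cqe *cqe)
  {
          return ((uint32_t)cqe->user_index_hi << 16) |
                 rte_be_to_cpu_16(cqe->user_index_low);
  }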

Signed-off-by: Xueming Li 
---
 drivers/common/mlx5/mlx5_prm.h   | 8 +++-
 drivers/net/mlx5/mlx5_rxtx_vec_altivec.h | 8 
 drivers/regex/mlx5/mlx5_regex_fastpath.c | 2 +-
 3 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/drivers/common/mlx5/mlx5_prm.h b/drivers/common/mlx5/mlx5_prm.h
index 54e62aa1531..5fd93958ac3 100644
--- a/drivers/common/mlx5/mlx5_prm.h
+++ b/drivers/common/mlx5/mlx5_prm.h
@@ -393,7 +393,13 @@ struct mlx5_cqe {
uint16_t hdr_type_etc;
uint16_t vlan_info;
uint8_t lro_num_seg;
-   uint8_t rsvd3[3];
+   union {
+   uint8_t user_index_bytes[3];
+   struct {
+   uint8_t user_index_hi;
+   uint16_t user_index_low;
+   } __rte_packed;
+   };
uint32_t flow_table_metadata;
uint8_t rsvd4[4];
uint32_t byte_cnt;
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h b/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
index 68cef1a83ed..82586f012cb 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
@@ -974,10 +974,10 @@ rxq_cq_process_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq,
(vector unsigned short)cqe_tmp1, cqe_sel_mask1);
cqe_tmp2 = (vector unsigned char)(vector unsigned long){
*(__rte_aligned(8) unsigned long *)
-   &cq[pos + p3].rsvd3[9], 0LL};
+   &cq[pos + p3].user_index_bytes[9], 0LL};
cqe_tmp1 = (vector unsigned char)(vector unsigned long){
*(__rte_aligned(8) unsigned long *)
-   &cq[pos + p2].rsvd3[9], 0LL};
+   &cq[pos + p2].user_index_bytes[9], 0LL};
cqes[3] = (vector unsigned char)
vec_sel((vector unsigned short)cqes[3],
(vector unsigned short)cqe_tmp2,
@@ -1037,10 +1037,10 @@ rxq_cq_process_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq,
(vector unsigned short)cqe_tmp1, cqe_sel_mask1);
cqe_tmp2 = (vector unsigned char)(vector unsigned long){
*(__rte_aligned(8) unsigned long *)
-   &cq[pos + p1].rsvd3[9], 0LL};
+   &cq[pos + p1].user_index_bytes[9], 0LL};
cqe_tmp1 = (vector unsigned char)(vector unsigned long){
*(__rte_aligned(8) unsigned long *)
-   &cq[pos].rsvd3[9], 0LL};
+   &cq[pos].user_index_bytes[9], 0LL};
cqes[1] = (vector unsigned char)
vec_sel((vector unsigned short)cqes[1],
(vector unsigned short)cqe_tmp2, cqe_sel_mask2);
diff --git a/drivers/regex/mlx5/mlx5_regex_fastpath.c b/drivers/regex/mlx5/mlx5_regex_fastpath.c
index 0833b2817e2..e51e632c1f8 100644
--- a/drivers/regex/mlx5/mlx5_regex_fastpath.c
+++ b/drivers/regex/mlx5/mlx5_regex_fastpath.c
@@ -571,7 +571,7 @@ mlx5_regexdev_dequeue(struct rte_regexdev *dev, uint16_t qp_id,
uint16_t wq_counter
= (rte_be_to_cpu_16(cqe->wqe_counter) + 1) &
  MLX5_REGEX_MAX_WQE_INDEX;
-   size_t hw_qpid = cqe->rsvd3[2];
+   size_t hw_qpid = cqe->user_index_bytes[2];
struct mlx5_regex_hw_qp *qp_obj = &queue->qps[hw_qpid];
 
/* UMR mode WQE counter move as WQE set(4 WQEBBS).*/
-- 
2.33.0



[dpdk-dev] [PATCH v2 00/13] net/mlx5: support shared Rx queue

2021-10-16 Thread Xueming Li
Implementation of shared Rx queue.

Depends-on: series-19708 ("ethdev: introduce shared Rx queue")
Depends-on: series-19698 ("Flow entites behavior on port restart")

v1:
- initial version
v2:
- rebased on latest dependent series
- fully tested

Viacheslav Ovsiienko (1):
  net/mlx5: add shared Rx queue port datapath support

Xueming Li (12):
  common/mlx5: support receive queue user index
  common/mlx5: support receive memory pool
  net/mlx5: fix Rx queue memory allocation return value
  net/mlx5: clean Rx queue code
  net/mlx5: split multiple packet Rq memory pool
  net/mlx5: split Rx queue
  net/mlx5: move Rx queue reference count
  net/mlx5: move Rx queue hairpin info to private data
  net/mlx5: remove port info from shareable Rx queue
  net/mlx5: move Rx queue DevX resource
  net/mlx5: remove Rx queue data list from device
  net/mlx5: support shared Rx queue

 doc/guides/nics/features/mlx5.ini|   1 +
 doc/guides/nics/mlx5.rst |   6 +
 drivers/common/mlx5/mlx5_common_devx.c   | 296 +--
 drivers/common/mlx5/mlx5_common_devx.h   |  19 +-
 drivers/common/mlx5/mlx5_devx_cmds.c |  52 ++
 drivers/common/mlx5/mlx5_devx_cmds.h |  16 +
 drivers/common/mlx5/mlx5_prm.h   |  93 +++-
 drivers/common/mlx5/version.map  |   1 +
 drivers/net/mlx5/linux/mlx5_os.c |   2 +
 drivers/net/mlx5/linux/mlx5_verbs.c  | 173 ---
 drivers/net/mlx5/mlx5.c  |  11 +-
 drivers/net/mlx5/mlx5.h  |  17 +-
 drivers/net/mlx5/mlx5_devx.c | 275 +-
 drivers/net/mlx5/mlx5_ethdev.c   |  21 +-
 drivers/net/mlx5/mlx5_flow.c |  45 +-
 drivers/net/mlx5/mlx5_mr.c   |   7 +-
 drivers/net/mlx5/mlx5_rss.c  |   6 +-
 drivers/net/mlx5/mlx5_rx.c   |  35 +-
 drivers/net/mlx5/mlx5_rx.h   |  46 +-
 drivers/net/mlx5/mlx5_rxq.c  | 633 +++
 drivers/net/mlx5/mlx5_rxtx.c |   6 +-
 drivers/net/mlx5/mlx5_rxtx_vec.c |   8 +-
 drivers/net/mlx5/mlx5_rxtx_vec_altivec.h |  14 +-
 drivers/net/mlx5/mlx5_rxtx_vec_neon.h|  12 +-
 drivers/net/mlx5/mlx5_rxtx_vec_sse.h |   8 +-
 drivers/net/mlx5/mlx5_stats.c|   9 +-
 drivers/net/mlx5/mlx5_trigger.c  | 161 +++---
 drivers/net/mlx5/mlx5_vlan.c |  16 +-
 drivers/regex/mlx5/mlx5_regex_fastpath.c |   2 +-
 29 files changed, 1350 insertions(+), 641 deletions(-)

-- 
2.33.0



[dpdk-dev] [PATCH v2 07/13] net/mlx5: move Rx queue reference count

2021-10-16 Thread Xueming Li
The Rx queue reference count is a per-RQ counter, used by the RQ table.
To prepare for shared Rx queue, move it from rxq_ctrl to the Rx queue
private data.
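
A sketch of the resulting per-queue reference API (the declarations
appear in the mlx5_rx.h hunk below):

  struct mlx5_rxq_priv *rxq = mlx5_rxq_ref(dev, idx); /* take a reference */

  if (rxq != NULL) {
          /* ... use the queue and its shared ctrl ... */
          mlx5_rxq_deref(dev, idx);                   /* drop the reference */
  }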

Signed-off-by: Xueming Li 
---
 drivers/net/mlx5/mlx5_rx.h  |   8 +-
 drivers/net/mlx5/mlx5_rxq.c | 173 +---
 drivers/net/mlx5/mlx5_trigger.c |  57 +--
 3 files changed, 144 insertions(+), 94 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_rx.h b/drivers/net/mlx5/mlx5_rx.h
index db6252e8e86..fe19414c130 100644
--- a/drivers/net/mlx5/mlx5_rx.h
+++ b/drivers/net/mlx5/mlx5_rx.h
@@ -160,7 +160,6 @@ enum mlx5_rxq_type {
 struct mlx5_rxq_ctrl {
struct mlx5_rxq_data rxq; /* Data path structure. */
LIST_ENTRY(mlx5_rxq_ctrl) next; /* Pointer to the next element. */
-   uint32_t refcnt; /* Reference counter. */
LIST_HEAD(priv, mlx5_rxq_priv) owners; /* Owner rxq list. */
struct mlx5_rxq_obj *obj; /* Verbs/DevX elements. */
struct mlx5_dev_ctx_shared *sh; /* Shared context. */
@@ -179,6 +178,7 @@ struct mlx5_rxq_ctrl {
 /* RX queue private data. */
 struct mlx5_rxq_priv {
uint16_t idx; /* Queue index. */
+   uint32_t refcnt; /* Reference counter. */
struct mlx5_rxq_ctrl *ctrl; /* Shared Rx Queue. */
LIST_ENTRY(mlx5_rxq_priv) owner_entry; /* Entry in shared rxq_ctrl. */
struct mlx5_priv *priv; /* Back pointer to private data. */
@@ -216,7 +216,11 @@ struct mlx5_rxq_ctrl *mlx5_rxq_new(struct rte_eth_dev *dev,
 struct mlx5_rxq_ctrl *mlx5_rxq_hairpin_new
(struct rte_eth_dev *dev, struct mlx5_rxq_priv *rxq, uint16_t desc,
 const struct rte_eth_hairpin_conf *hairpin_conf);
-struct mlx5_rxq_ctrl *mlx5_rxq_get(struct rte_eth_dev *dev, uint16_t idx);
+struct mlx5_rxq_priv *mlx5_rxq_ref(struct rte_eth_dev *dev, uint16_t idx);
+uint32_t mlx5_rxq_deref(struct rte_eth_dev *dev, uint16_t idx);
+struct mlx5_rxq_priv *mlx5_rxq_get(struct rte_eth_dev *dev, uint16_t idx);
+struct mlx5_rxq_ctrl *mlx5_rxq_ctrl_get(struct rte_eth_dev *dev, uint16_t idx);
+struct mlx5_rxq_data *mlx5_rxq_data_get(struct rte_eth_dev *dev, uint16_t idx);
 int mlx5_rxq_release(struct rte_eth_dev *dev, uint16_t idx);
 int mlx5_rxq_verify(struct rte_eth_dev *dev);
 int rxq_alloc_elts(struct mlx5_rxq_ctrl *rxq_ctrl);
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index acd77a7ecc1..17077762ecd 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -386,15 +386,13 @@ mlx5_get_rx_port_offloads(void)
 static int
 mlx5_rxq_releasable(struct rte_eth_dev *dev, uint16_t idx)
 {
-   struct mlx5_priv *priv = dev->data->dev_private;
-   struct mlx5_rxq_ctrl *rxq_ctrl;
+   struct mlx5_rxq_priv *rxq = mlx5_rxq_get(dev, idx);
 
-   if (!(*priv->rxqs)[idx]) {
+   if (rxq == NULL) {
rte_errno = EINVAL;
return -rte_errno;
}
-   rxq_ctrl = container_of((*priv->rxqs)[idx], struct mlx5_rxq_ctrl, rxq);
-   return (__atomic_load_n(&rxq_ctrl->refcnt, __ATOMIC_RELAXED) == 1);
+   return (__atomic_load_n(&rxq->refcnt, __ATOMIC_RELAXED) == 1);
 }
 
 /* Fetches and drops all SW-owned and error CQEs to synchronize CQ. */
@@ -874,8 +872,8 @@ mlx5_rx_intr_vec_enable(struct rte_eth_dev *dev)
intr_handle->type = RTE_INTR_HANDLE_EXT;
for (i = 0; i != n; ++i) {
/* This rxq obj must not be released in this function. */
-   struct mlx5_rxq_ctrl *rxq_ctrl = mlx5_rxq_get(dev, i);
-   struct mlx5_rxq_obj *rxq_obj = rxq_ctrl ? rxq_ctrl->obj : NULL;
+   struct mlx5_rxq_priv *rxq = mlx5_rxq_get(dev, i);
+   struct mlx5_rxq_obj *rxq_obj = rxq ? rxq->ctrl->obj : NULL;
int rc;
 
/* Skip queues that cannot request interrupts. */
@@ -885,11 +883,9 @@ mlx5_rx_intr_vec_enable(struct rte_eth_dev *dev)
intr_handle->intr_vec[i] =
RTE_INTR_VEC_RXTX_OFFSET +
RTE_MAX_RXTX_INTR_VEC_ID;
-   /* Decrease the rxq_ctrl's refcnt */
-   if (rxq_ctrl)
-   mlx5_rxq_release(dev, i);
continue;
}
+   mlx5_rxq_ref(dev, i);
if (count >= RTE_MAX_RXTX_INTR_VEC_ID) {
DRV_LOG(ERR,
"port %u too many Rx queues for interrupt"
@@ -949,7 +945,7 @@ mlx5_rx_intr_vec_disable(struct rte_eth_dev *dev)
 * Need to access directly the queue to release the reference
 * kept in mlx5_rx_intr_vec_enable().
 */
-   mlx5_rxq_release(dev, i);
+   mlx5_rxq_deref(dev, i);
}
 free:
rte_intr_free_epoll_fd(intr_handle);
@@ -998,19 +994,14 @@ mlx5_arm_cq(struct mlx5_rxq_data *rxq, int sq_n_rxq)
 int
 mlx5_rx_intr_enable(struct rte_eth_dev *dev, uint16_t rx_queue_id)
 {
-   struct mlx5_rxq_c

[dpdk-dev] [PATCH v2 05/13] net/mlx5: split multiple packet Rq memory pool

2021-10-16 Thread Xueming Li
Port info is invisible to a shared Rx queue, so split the MPRQ mempool
from per-device to per-Rx-queue scope, and change the pool flag to
single-consumer (mp_sc).
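
A minimal sketch of the per-queue pool creation implied here (names and
sizes are placeholders; the point is the single-consumer flag, which is
safe once only one Rx queue dequeues from the pool):

  mp = rte_mempool_create(name, obj_num, obj_size, 0, 0,
                          NULL, NULL, mlx5_mprq_buf_init, obj_init_arg,
                          rxq_ctrl->socket, MEMPOOL_F_SC_GET);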

Signed-off-by: Xueming Li 
---
 drivers/net/mlx5/mlx5.c |   1 -
 drivers/net/mlx5/mlx5_rx.h  |   4 +-
 drivers/net/mlx5/mlx5_rxq.c | 109 
 drivers/net/mlx5/mlx5_trigger.c |  10 ++-
 4 files changed, 47 insertions(+), 77 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 45ccfe27845..1033c29cb82 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -1608,7 +1608,6 @@ mlx5_dev_close(struct rte_eth_dev *dev)
mlx5_drop_action_destroy(dev);
if (priv->mreg_cp_tbl)
mlx5_hlist_destroy(priv->mreg_cp_tbl);
-   mlx5_mprq_free_mp(dev);
if (priv->sh->ct_mng)
mlx5_flow_aso_ct_mng_close(priv->sh);
mlx5_os_free_shared_dr(priv);
diff --git a/drivers/net/mlx5/mlx5_rx.h b/drivers/net/mlx5/mlx5_rx.h
index d44c8078dea..a8e0c3162b0 100644
--- a/drivers/net/mlx5/mlx5_rx.h
+++ b/drivers/net/mlx5/mlx5_rx.h
@@ -179,8 +179,8 @@ struct mlx5_rxq_ctrl {
 extern uint8_t rss_hash_default_key[];
 
 unsigned int mlx5_rxq_cqe_num(struct mlx5_rxq_data *rxq_data);
-int mlx5_mprq_free_mp(struct rte_eth_dev *dev);
-int mlx5_mprq_alloc_mp(struct rte_eth_dev *dev);
+int mlx5_mprq_free_mp(struct rte_eth_dev *dev, struct mlx5_rxq_ctrl *rxq_ctrl);
+int mlx5_mprq_alloc_mp(struct rte_eth_dev *dev, struct mlx5_rxq_ctrl *rxq_ctrl);
 int mlx5_rx_queue_start(struct rte_eth_dev *dev, uint16_t queue_id);
 int mlx5_rx_queue_stop(struct rte_eth_dev *dev, uint16_t queue_id);
 int mlx5_rx_queue_start_primary(struct rte_eth_dev *dev, uint16_t queue_id);
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 1cb99de1ae7..f29a8143967 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -1087,7 +1087,7 @@ mlx5_mprq_buf_init(struct rte_mempool *mp, void *opaque_arg,
 }
 
 /**
- * Free mempool of Multi-Packet RQ.
+ * Free the RXQ mempool of Multi-Packet RQ.
  *
  * @param dev
  *   Pointer to Ethernet device.
@@ -1096,16 +1096,15 @@ mlx5_mprq_buf_init(struct rte_mempool *mp, void *opaque_arg,
  *   0 on success, negative errno value on failure.
  */
 int
-mlx5_mprq_free_mp(struct rte_eth_dev *dev)
+mlx5_mprq_free_mp(struct rte_eth_dev *dev, struct mlx5_rxq_ctrl *rxq_ctrl)
 {
-   struct mlx5_priv *priv = dev->data->dev_private;
-   struct rte_mempool *mp = priv->mprq_mp;
-   unsigned int i;
+   struct mlx5_rxq_data *rxq = &rxq_ctrl->rxq;
+   struct rte_mempool *mp = rxq->mprq_mp;
 
if (mp == NULL)
return 0;
-   DRV_LOG(DEBUG, "port %u freeing mempool (%s) for Multi-Packet RQ",
-   dev->data->port_id, mp->name);
+   DRV_LOG(DEBUG, "port %u queue %hu freeing mempool (%s) for Multi-Packet RQ",
+   dev->data->port_id, rxq->idx, mp->name);
/*
 * If a buffer in the pool has been externally attached to a mbuf and it
 * is still in use by application, destroying the Rx queue can spoil
@@ -1123,34 +1122,28 @@ mlx5_mprq_free_mp(struct rte_eth_dev *dev)
return -rte_errno;
}
rte_mempool_free(mp);
-   /* Unset mempool for each Rx queue. */
-   for (i = 0; i != priv->rxqs_n; ++i) {
-   struct mlx5_rxq_data *rxq = (*priv->rxqs)[i];
-
-   if (rxq == NULL)
-   continue;
-   rxq->mprq_mp = NULL;
-   }
-   priv->mprq_mp = NULL;
+   rxq->mprq_mp = NULL;
return 0;
 }
 
 /**
- * Allocate a mempool for Multi-Packet RQ. All configured Rx queues share the
- * mempool. If already allocated, reuse it if there're enough elements.
+ * Allocate a mempool for the RXQ's Multi-Packet RQ.
+ * If already allocated, reuse it if there are enough elements.
  * Otherwise, resize it.
  *
  * @param dev
  *   Pointer to Ethernet device.
+ * @param rxq_ctrl
+ *   Pointer to RXQ.
  *
  * @return
  *   0 on success, negative errno value on failure.
  */
 int
-mlx5_mprq_alloc_mp(struct rte_eth_dev *dev)
+mlx5_mprq_alloc_mp(struct rte_eth_dev *dev, struct mlx5_rxq_ctrl *rxq_ctrl)
 {
-   struct mlx5_priv *priv = dev->data->dev_private;
-   struct rte_mempool *mp = priv->mprq_mp;
+   struct mlx5_rxq_data *rxq = &rxq_ctrl->rxq;
+   struct rte_mempool *mp = rxq->mprq_mp;
char name[RTE_MEMPOOL_NAMESIZE];
unsigned int desc = 0;
unsigned int buf_len;
@@ -1158,28 +1151,15 @@ mlx5_mprq_alloc_mp(struct rte_eth_dev *dev)
unsigned int obj_size;
unsigned int strd_num_n = 0;
unsigned int strd_sz_n = 0;
-   unsigned int i;
-   unsigned int n_ibv = 0;
 
-   if (!mlx5_mprq_enabled(dev))
+   if (rxq_ctrl == NULL || rxq_ctrl->type != MLX5_RXQ_TYPE_STANDARD)
return 0;
-   /* Count the total number of descriptors configured. */
-   for (i = 0; i != priv->rxqs_n; ++i) {
-   struct 

[dpdk-dev] [PATCH v2 04/13] net/mlx5: clean Rx queue code

2021-10-16 Thread Xueming Li
Remove unused rxq code.

Signed-off-by: Xueming Li 
---
 drivers/net/mlx5/mlx5_rxq.c | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 4036bcbe544..1cb99de1ae7 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -674,9 +674,7 @@ mlx5_rx_queue_setup(struct rte_eth_dev *dev, uint16_t idx, uint16_t desc,
struct rte_mempool *mp)
 {
struct mlx5_priv *priv = dev->data->dev_private;
-   struct mlx5_rxq_data *rxq = (*priv->rxqs)[idx];
-   struct mlx5_rxq_ctrl *rxq_ctrl =
-   container_of(rxq, struct mlx5_rxq_ctrl, rxq);
+   struct mlx5_rxq_ctrl *rxq_ctrl;
struct rte_eth_rxseg_split *rx_seg =
(struct rte_eth_rxseg_split *)conf->rx_seg;
struct rte_eth_rxseg_split rx_single = {.mp = mp};
@@ -743,9 +741,7 @@ mlx5_rx_hairpin_queue_setup(struct rte_eth_dev *dev, uint16_t idx,
const struct rte_eth_hairpin_conf *hairpin_conf)
 {
struct mlx5_priv *priv = dev->data->dev_private;
-   struct mlx5_rxq_data *rxq = (*priv->rxqs)[idx];
-   struct mlx5_rxq_ctrl *rxq_ctrl =
-   container_of(rxq, struct mlx5_rxq_ctrl, rxq);
+   struct mlx5_rxq_ctrl *rxq_ctrl;
int res;
 
res = mlx5_rx_queue_pre_setup(dev, idx, &desc);
-- 
2.33.0



[dpdk-dev] [PATCH v2 08/13] net/mlx5: move Rx queue hairpin info to private data

2021-10-16 Thread Xueming Li
Hairpin info of an Rx queue can't be shared, so move it to the private
queue data.

Signed-off-by: Xueming Li 
---
 drivers/net/mlx5/mlx5_rx.h  |  4 ++--
 drivers/net/mlx5/mlx5_rxq.c | 13 +
 drivers/net/mlx5/mlx5_trigger.c | 24 
 3 files changed, 19 insertions(+), 22 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_rx.h b/drivers/net/mlx5/mlx5_rx.h
index fe19414c130..2ed544556f5 100644
--- a/drivers/net/mlx5/mlx5_rx.h
+++ b/drivers/net/mlx5/mlx5_rx.h
@@ -171,8 +171,6 @@ struct mlx5_rxq_ctrl {
uint32_t flow_tunnels_n[MLX5_FLOW_TUNNEL]; /* Tunnels counters. */
uint32_t wqn; /* WQ number. */
uint16_t dump_file_n; /* Number of dump files. */
-   struct rte_eth_hairpin_conf hairpin_conf; /* Hairpin configuration. */
-   uint32_t hairpin_status; /* Hairpin binding status. */
 };
 
 /* RX queue private data. */
@@ -182,6 +180,8 @@ struct mlx5_rxq_priv {
struct mlx5_rxq_ctrl *ctrl; /* Shared Rx Queue. */
LIST_ENTRY(mlx5_rxq_priv) owner_entry; /* Entry in shared rxq_ctrl. */
struct mlx5_priv *priv; /* Back pointer to private data. */
+   struct rte_eth_hairpin_conf hairpin_conf; /* Hairpin configuration. */
+   uint32_t hairpin_status; /* Hairpin binding status. */
 };
 
 /* mlx5_rxq.c */
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 17077762ecd..2b9ab7b3fc4 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -1649,8 +1649,8 @@ mlx5_rxq_hairpin_new(struct rte_eth_dev *dev, struct mlx5_rxq_priv *rxq,
tmpl->rxq.elts_n = log2above(desc);
tmpl->rxq.elts = NULL;
tmpl->rxq.mr_ctrl.cache_bh = (struct mlx5_mr_btree) { 0 };
-   tmpl->hairpin_conf = *hairpin_conf;
tmpl->rxq.idx = idx;
+   rxq->hairpin_conf = *hairpin_conf;
mlx5_rxq_ref(dev, idx);
LIST_INSERT_HEAD(&priv->rxqsctrl, tmpl, next);
return tmpl;
@@ -1869,14 +1869,11 @@ const struct rte_eth_hairpin_conf *
 mlx5_rxq_get_hairpin_conf(struct rte_eth_dev *dev, uint16_t idx)
 {
struct mlx5_priv *priv = dev->data->dev_private;
-   struct mlx5_rxq_ctrl *rxq_ctrl = NULL;
+   struct mlx5_rxq_priv *rxq = mlx5_rxq_get(dev, idx);
 
-   if (idx < priv->rxqs_n && (*priv->rxqs)[idx]) {
-   rxq_ctrl = container_of((*priv->rxqs)[idx],
-   struct mlx5_rxq_ctrl,
-   rxq);
-   if (rxq_ctrl->type == MLX5_RXQ_TYPE_HAIRPIN)
-   return &rxq_ctrl->hairpin_conf;
+   if (idx < priv->rxqs_n && rxq != NULL) {
+   if (rxq->ctrl->type == MLX5_RXQ_TYPE_HAIRPIN)
+   return &rxq->hairpin_conf;
}
return NULL;
 }
diff --git a/drivers/net/mlx5/mlx5_trigger.c b/drivers/net/mlx5/mlx5_trigger.c
index a49254c96f6..f376f4d6fc4 100644
--- a/drivers/net/mlx5/mlx5_trigger.c
+++ b/drivers/net/mlx5/mlx5_trigger.c
@@ -273,7 +273,7 @@ mlx5_hairpin_auto_bind(struct rte_eth_dev *dev)
}
rxq_ctrl = rxq->ctrl;
if (rxq_ctrl->type != MLX5_RXQ_TYPE_HAIRPIN ||
-   rxq_ctrl->hairpin_conf.peers[0].queue != i) {
+   rxq->hairpin_conf.peers[0].queue != i) {
rte_errno = ENOMEM;
DRV_LOG(ERR, "port %u Tx queue %d can't be binded to "
"Rx queue %d", dev->data->port_id,
@@ -303,7 +303,7 @@ mlx5_hairpin_auto_bind(struct rte_eth_dev *dev)
if (ret)
goto error;
/* Qs with auto-bind will be destroyed directly. */
-   rxq_ctrl->hairpin_status = 1;
+   rxq->hairpin_status = 1;
txq_ctrl->hairpin_status = 1;
mlx5_txq_release(dev, i);
}
@@ -406,9 +406,9 @@ mlx5_hairpin_queue_peer_update(struct rte_eth_dev *dev, uint16_t peer_queue,
}
peer_info->qp_id = rxq_ctrl->obj->rq->id;
peer_info->vhca_id = priv->config.hca_attr.vhca_id;
-   peer_info->peer_q = rxq_ctrl->hairpin_conf.peers[0].queue;
-   peer_info->tx_explicit = rxq_ctrl->hairpin_conf.tx_explicit;
-   peer_info->manual_bind = rxq_ctrl->hairpin_conf.manual_bind;
+   peer_info->peer_q = rxq->hairpin_conf.peers[0].queue;
+   peer_info->tx_explicit = rxq->hairpin_conf.tx_explicit;
+   peer_info->manual_bind = rxq->hairpin_conf.manual_bind;
}
return 0;
 }
@@ -530,20 +530,20 @@ mlx5_hairpin_queue_peer_bind(struct rte_eth_dev *dev, uint16_t cur_queue,
dev->data->port_id, cur_queue);
return -rte_errno;
}
-   if (rxq_ctrl->hairpin_status != 0) {
+   if (rxq->hairpin_status != 0) {
DRV_LOG(DEBUG, "port %u Rx queue %d is already bound",
dev-

[dpdk-dev] [PATCH v2 03/13] net/mlx5: fix Rx queue memory allocation return value

2021-10-16 Thread Xueming Li
If an error happened during Rx queue mbuf allocation, a boolean value
was returned. According to the function description, the return value
should be an error number.

This patch returns a negative errno value instead.

Fixes: 0f20acbf5eda ("net/mlx5: implement vectorized MPRQ burst")
Cc: akozy...@nvidia.com

Signed-off-by: Xueming Li 
---
 drivers/net/mlx5/mlx5_rxq.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index fd2b5779fff..4036bcbe544 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -128,7 +128,7 @@ rxq_alloc_elts_mprq(struct mlx5_rxq_ctrl *rxq_ctrl)
  *   Pointer to RX queue structure.
  *
  * @return
- *   0 on success, errno value on failure.
+ *   0 on success, negative errno value on failure.
  */
 static int
 rxq_alloc_elts_sprq(struct mlx5_rxq_ctrl *rxq_ctrl)
@@ -219,7 +219,7 @@ rxq_alloc_elts_sprq(struct mlx5_rxq_ctrl *rxq_ctrl)
  *   Pointer to RX queue structure.
  *
  * @return
- *   0 on success, errno value on failure.
+ *   0 on success, negative errno value on failure.
  */
 int
 rxq_alloc_elts(struct mlx5_rxq_ctrl *rxq_ctrl)
@@ -232,7 +232,9 @@ rxq_alloc_elts(struct mlx5_rxq_ctrl *rxq_ctrl)
 */
if (mlx5_rxq_mprq_enabled(&rxq_ctrl->rxq))
ret = rxq_alloc_elts_mprq(rxq_ctrl);
-   return (ret || rxq_alloc_elts_sprq(rxq_ctrl));
+   if (ret == 0)
+   ret = rxq_alloc_elts_sprq(rxq_ctrl);
+   return ret;
 }
 
 /**
-- 
2.33.0



[dpdk-dev] [PATCH v2 06/13] net/mlx5: split Rx queue

2021-10-16 Thread Xueming Li
To prepare for shared RX queue, split the rxq data into shareable and
private parts. Struct mlx5_rxq_priv holds the per-queue private data;
struct mlx5_rxq_ctrl holds the shareable queue resources and data.
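
A sketch of the resulting layering (field names as introduced in the
hunks below):

  struct mlx5_rxq_priv *rxq = (*priv->rxq_privs)[idx]; /* per-port queue */
  struct mlx5_rxq_ctrl *ctrl = rxq->ctrl;              /* shareable part */
  struct mlx5_rxq_data *data = &ctrl->rxq;             /* datapath part */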

Signed-off-by: Xueming Li 
---
 drivers/net/mlx5/mlx5.c|  4 +++
 drivers/net/mlx5/mlx5.h|  5 ++-
 drivers/net/mlx5/mlx5_ethdev.c | 10 ++
 drivers/net/mlx5/mlx5_rx.h | 15 ++--
 drivers/net/mlx5/mlx5_rxq.c| 66 --
 5 files changed, 86 insertions(+), 14 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 1033c29cb82..477ad8c1bc9 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -1591,6 +1591,10 @@ mlx5_dev_close(struct rte_eth_dev *dev)
mlx5_free(dev->intr_handle);
dev->intr_handle = NULL;
}
+   if (priv->rxq_privs != NULL) {
+   mlx5_free(priv->rxq_privs);
+   priv->rxq_privs = NULL;
+   }
if (priv->txqs != NULL) {
/* XXX race condition if mlx5_tx_burst() is still running. */
rte_delay_us_sleep(1000);
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 3581414b789..b18ddb0b0fa 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -1335,6 +1335,8 @@ enum mlx5_txq_modify_type {
MLX5_TXQ_MOD_ERR2RDY, /* modify state from error to ready. */
 };
 
+struct mlx5_rxq_priv;
+
 /* HW objects operations structure. */
 struct mlx5_obj_ops {
int (*rxq_obj_modify_vlan_strip)(struct mlx5_rxq_obj *rxq_obj, int on);
@@ -1404,7 +1406,8 @@ struct mlx5_priv {
/* RX/TX queues. */
unsigned int rxqs_n; /* RX queues array size. */
unsigned int txqs_n; /* TX queues array size. */
-   struct mlx5_rxq_data *(*rxqs)[]; /* RX queues. */
+   struct mlx5_rxq_priv *(*rxq_privs)[]; /* RX queue non-shared data. */
+   struct mlx5_rxq_data *(*rxqs)[]; /* (Shared) RX queues. */
struct mlx5_txq_data *(*txqs)[]; /* TX queues. */
struct rte_mempool *mprq_mp; /* Mempool for Multi-Packet RQ. */
struct rte_eth_rss_conf rss_conf; /* RSS configuration. */
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 8ebfd0bccb3..ee1189b929d 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -104,6 +104,16 @@ mlx5_dev_configure(struct rte_eth_dev *dev)
   MLX5_RSS_HASH_KEY_LEN);
priv->rss_conf.rss_key_len = MLX5_RSS_HASH_KEY_LEN;
priv->rss_conf.rss_hf = dev->data->dev_conf.rx_adv_conf.rss_conf.rss_hf;
+   priv->rxq_privs = mlx5_realloc(priv->rxq_privs,
+  MLX5_MEM_ANY | MLX5_MEM_ZERO,
+  sizeof(void *) * rxqs_n, 0,
+  SOCKET_ID_ANY);
+   if (priv->rxq_privs == NULL) {
+   DRV_LOG(ERR, "port %u cannot allocate rxq private data",
+   dev->data->port_id);
+   rte_errno = ENOMEM;
+   return -rte_errno;
+   }
priv->rxqs = (void *)dev->data->rx_queues;
priv->txqs = (void *)dev->data->tx_queues;
if (txqs_n != priv->txqs_n) {
diff --git a/drivers/net/mlx5/mlx5_rx.h b/drivers/net/mlx5/mlx5_rx.h
index a8e0c3162b0..db6252e8e86 100644
--- a/drivers/net/mlx5/mlx5_rx.h
+++ b/drivers/net/mlx5/mlx5_rx.h
@@ -161,7 +161,9 @@ struct mlx5_rxq_ctrl {
struct mlx5_rxq_data rxq; /* Data path structure. */
LIST_ENTRY(mlx5_rxq_ctrl) next; /* Pointer to the next element. */
uint32_t refcnt; /* Reference counter. */
+   LIST_HEAD(priv, mlx5_rxq_priv) owners; /* Owner rxq list. */
struct mlx5_rxq_obj *obj; /* Verbs/DevX elements. */
+   struct mlx5_dev_ctx_shared *sh; /* Shared context. */
struct mlx5_priv *priv; /* Back pointer to private data. */
enum mlx5_rxq_type type; /* Rxq type. */
unsigned int socket; /* CPU socket ID for allocations. */
@@ -174,6 +176,14 @@ struct mlx5_rxq_ctrl {
uint32_t hairpin_status; /* Hairpin binding status. */
 };
 
+/* RX queue private data. */
+struct mlx5_rxq_priv {
+   uint16_t idx; /* Queue index. */
+   struct mlx5_rxq_ctrl *ctrl; /* Shared Rx Queue. */
+   LIST_ENTRY(mlx5_rxq_priv) owner_entry; /* Entry in shared rxq_ctrl. */
+   struct mlx5_priv *priv; /* Back pointer to private data. */
+};
+
 /* mlx5_rxq.c */
 
 extern uint8_t rss_hash_default_key[];
@@ -197,13 +207,14 @@ void mlx5_rx_intr_vec_disable(struct rte_eth_dev *dev);
 int mlx5_rx_intr_enable(struct rte_eth_dev *dev, uint16_t rx_queue_id);
 int mlx5_rx_intr_disable(struct rte_eth_dev *dev, uint16_t rx_queue_id);
 int mlx5_rxq_obj_verify(struct rte_eth_dev *dev);
-struct mlx5_rxq_ctrl *mlx5_rxq_new(struct rte_eth_dev *dev, uint16_t idx,
+struct mlx5_rxq_ctrl *mlx5_rxq_new(struct rte_eth_dev *dev,
+  struct mlx5_rxq_priv *rxq,
   uint16_t desc, unsigned int soc

[dpdk-dev] [PATCH v2 02/13] common/mlx5: support receive memory pool

2021-10-16 Thread Xueming Li
Add DevX support for the PRM shared receive memory pool (RMP) object.
The RMP is used to support shared Rx queues: multiple RQs can share the
same RMP, and memory buffers are supplied to the RMP.

This patch makes the RMP-based RQ optional; it is created only if
mlx5_devx_rq.rmp is set.
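
A minimal usage sketch (the real call sites follow in later patches of
this series): setting mlx5_devx_rq.rmp before creation selects the
RMP-based RQ, while leaving it NULL keeps the legacy standalone RQ.

  struct mlx5_devx_rq rq_obj = { 0 };

  if (shared)                        /* 'shared' is a placeholder condition */
          rq_obj.rmp = &rxq_ctrl_obj->devx_rmp; /* WQ memory owned by RMP */
  ret = mlx5_devx_rq_create(ctx, &rq_obj, wqe_size, log_desc_n,
                            &rq_attr, socket);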

Signed-off-by: Xueming Li 
---
 drivers/common/mlx5/mlx5_common_devx.c | 296 +
 drivers/common/mlx5/mlx5_common_devx.h |  19 +-
 drivers/common/mlx5/mlx5_devx_cmds.c   |  52 +
 drivers/common/mlx5/mlx5_devx_cmds.h   |  16 ++
 drivers/common/mlx5/mlx5_prm.h |  85 ++-
 drivers/common/mlx5/version.map|   1 +
 drivers/net/mlx5/mlx5_devx.c   |   4 +-
 7 files changed, 425 insertions(+), 48 deletions(-)

diff --git a/drivers/common/mlx5/mlx5_common_devx.c b/drivers/common/mlx5/mlx5_common_devx.c
index 825f84b1833..db019418b39 100644
--- a/drivers/common/mlx5/mlx5_common_devx.c
+++ b/drivers/common/mlx5/mlx5_common_devx.c
@@ -271,6 +271,39 @@ mlx5_devx_sq_create(void *ctx, struct mlx5_devx_sq *sq_obj, uint16_t log_wqbb_n,
return -rte_errno;
 }
 
+/**
+ * Destroy DevX Receive Queue resources.
+ *
+ * @param[in] rq_res
+ *   DevX RQ resource to destroy.
+ */
+static void
+mlx5_devx_wq_res_destroy(struct mlx5_devx_wq_res *rq_res)
+{
+   if (rq_res->umem_obj)
+   claim_zero(mlx5_os_umem_dereg(rq_res->umem_obj));
+   if (rq_res->umem_buf)
+   mlx5_free((void *)(uintptr_t)rq_res->umem_buf);
+   memset(rq_res, 0, sizeof(*rq_res));
+}
+
+/**
+ * Destroy DevX Receive Memory Pool.
+ *
+ * @param[in] rmp
+ *   DevX RMP to destroy.
+ */
+static void
+mlx5_devx_rmp_destroy(struct mlx5_devx_rmp *rmp)
+{
+   MLX5_ASSERT(rmp->ref_cnt == 0);
+   if (rmp->rmp) {
+   claim_zero(mlx5_devx_cmd_destroy(rmp->rmp));
+   rmp->rmp = NULL;
+   }
+   mlx5_devx_wq_res_destroy(&rmp->wq);
+}
+
 /**
  * Destroy DevX Queue Pair.
  *
@@ -389,55 +422,48 @@ mlx5_devx_qp_create(void *ctx, struct mlx5_devx_qp *qp_obj, uint16_t log_wqbb_n,
 void
 mlx5_devx_rq_destroy(struct mlx5_devx_rq *rq)
 {
-   if (rq->rq)
+   if (rq->rq) {
claim_zero(mlx5_devx_cmd_destroy(rq->rq));
-   if (rq->umem_obj)
-   claim_zero(mlx5_os_umem_dereg(rq->umem_obj));
-   if (rq->umem_buf)
-   mlx5_free((void *)(uintptr_t)rq->umem_buf);
+   rq->rq = NULL;
+   if (rq->rmp)
+   rq->rmp->ref_cnt--;
+   }
+   if (rq->rmp == NULL) {
+   mlx5_devx_wq_res_destroy(&rq->wq);
+   } else {
+   if (rq->rmp->ref_cnt == 0)
+   mlx5_devx_rmp_destroy(rq->rmp);
+   }
 }
 
 /**
- * Create Receive Queue using DevX API.
- *
- * Get a pointer to partially initialized attributes structure, and updates the
- * following fields:
- *   wq_umem_valid
- *   wq_umem_id
- *   wq_umem_offset
- *   dbr_umem_valid
- *   dbr_umem_id
- *   dbr_addr
- *   log_wq_pg_sz
- * All other fields are updated by caller.
+ * Create WQ resources using DevX API.
  *
  * @param[in] ctx
  *   Context returned from mlx5 open_device() glue function.
- * @param[in/out] rq_obj
- *   Pointer to RQ to create.
  * @param[in] wqe_size
  *   Size of WQE structure.
  * @param[in] log_wqbb_n
  *   Log of number of WQBBs in queue.
- * @param[in] attr
- *   Pointer to RQ attributes structure.
  * @param[in] socket
  *   Socket to use for allocation.
+ * @param[out] wq_attr
+ *   Pointer to WQ attributes structure.
+ * @param[out] wq_res
+ *   Pointer to WQ resource to create.
  *
  * @return
  *   0 on success, a negative errno value otherwise and rte_errno is set.
  */
-int
-mlx5_devx_rq_create(void *ctx, struct mlx5_devx_rq *rq_obj, uint32_t wqe_size,
-   uint16_t log_wqbb_n,
-   struct mlx5_devx_create_rq_attr *attr, int socket)
+static int
+mlx5_devx_wq_init(void *ctx, uint32_t wqe_size, uint16_t log_wqbb_n, int socket,
+ struct mlx5_devx_wq_attr *wq_attr,
+ struct mlx5_devx_wq_res *wq_res)
 {
-   struct mlx5_devx_obj *rq = NULL;
struct mlx5dv_devx_umem *umem_obj = NULL;
void *umem_buf = NULL;
size_t alignment = MLX5_WQE_BUF_ALIGNMENT;
uint32_t umem_size, umem_dbrec;
-   uint16_t rq_size = 1 << log_wqbb_n;
int ret;
 
if (alignment == (size_t)-1) {
@@ -446,7 +472,7 @@ mlx5_devx_rq_create(void *ctx, struct mlx5_devx_rq *rq_obj, uint32_t wqe_size,
return -rte_errno;
}
/* Allocate memory buffer for WQEs and doorbell record. */
-   umem_size = wqe_size * rq_size;
+   umem_size = wqe_size * (1 << log_wqbb_n);
umem_dbrec = RTE_ALIGN(umem_size, MLX5_DBR_SIZE);
umem_size += MLX5_DBR_SIZE;
umem_buf = mlx5_malloc(MLX5_MEM_RTE | MLX5_MEM_ZERO, umem_size,
@@ -464,14 +490,60 @@ mlx5_devx_rq_create(void *ctx, struct mlx5_devx_rq *rq_obj, uint32_t wqe_size,
rte_errno = errno;
goto error;
   

[dpdk-dev] [PATCH v2 12/13] net/mlx5: support shared Rx queue

2021-10-16 Thread Xueming Li
This patch introduces shared RXQ. All shared Rx queues with the same
group and queue ID share the same rxq_ctrl. rxq_ctrl and rxq_data are
shared, so all queues from different member ports share the same WQ and
CQ (essentially one Rx WQ), and mbufs are filled into this singleton WQ.

The shared rxq_data is set into the device Rx queues of all member ports
as the rxq object used for receiving packets. Polling the queue of any
member port may return packets of any member; mbuf->port is used to
identify the source port.
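
From the application's point of view (sketch; any member port of the
share group may be polled, names are placeholders):

  /* Packets of all member ports arrive on the shared queue; use
   * mbuf->port to attribute each mbuf to its source port. */
  nb_rx = rte_eth_rx_burst(member_port_id, queue_id, mbufs, BURST_SIZE);
  for (i = 0; i < nb_rx; i++)
          per_port_rx[mbufs[i]->port]++;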

Signed-off-by: Xueming Li 
---
 doc/guides/nics/features/mlx5.ini   |   1 +
 doc/guides/nics/mlx5.rst|   6 +
 drivers/net/mlx5/linux/mlx5_os.c|   2 +
 drivers/net/mlx5/linux/mlx5_verbs.c |  12 +-
 drivers/net/mlx5/mlx5.h |   4 +-
 drivers/net/mlx5/mlx5_devx.c|  50 +--
 drivers/net/mlx5/mlx5_ethdev.c  |   5 +
 drivers/net/mlx5/mlx5_rx.h  |   4 +
 drivers/net/mlx5/mlx5_rxq.c | 218 
 drivers/net/mlx5/mlx5_trigger.c |  76 ++
 10 files changed, 298 insertions(+), 80 deletions(-)

diff --git a/doc/guides/nics/features/mlx5.ini b/doc/guides/nics/features/mlx5.ini
index f01abd4231f..ff5e669acc1 100644
--- a/doc/guides/nics/features/mlx5.ini
+++ b/doc/guides/nics/features/mlx5.ini
@@ -11,6 +11,7 @@ Removal event= Y
 Rx interrupt = Y
 Fast mbuf free   = Y
 Queue start/stop = Y
+Shared Rx queue  = Y
 Burst mode info  = Y
 Power mgmt address monitor = Y
 MTU update   = Y
diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index bae73f42d88..d26f274dec4 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -113,6 +113,7 @@ Features
 - Connection tracking.
 - Sub-Function representors.
 - Sub-Function.
+- Shared Rx queue.
 
 
 Limitations
@@ -464,6 +465,11 @@ Limitations
  - In order to achieve best insertion rate, application should manage the flows per lcore.
  - Better to disable memory reclaim by setting ``reclaim_mem_mode`` to 0 to accelerate the flow object allocation and release with cache.
 
+ Shared Rx queue:
+
+  - Counters of received packets and bytes of all devices in the same share group are the same.
+  - Counters of received packets and bytes of all queues in the same group with the same queue ID are the same.
+
 Statistics
 --
 
diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index 985f0bd4892..49acbe34817 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -457,6 +457,7 @@ mlx5_alloc_shared_dr(struct mlx5_priv *priv)
mlx5_glue->dr_create_flow_action_default_miss();
if (!sh->default_miss_action)
DRV_LOG(WARNING, "Default miss action is not supported.");
+   LIST_INIT(&sh->shared_rxqs);
return 0;
 error:
/* Rollback the created objects. */
@@ -531,6 +532,7 @@ mlx5_os_free_shared_dr(struct mlx5_priv *priv)
MLX5_ASSERT(sh && sh->refcnt);
if (sh->refcnt > 1)
return;
+   MLX5_ASSERT(LIST_EMPTY(&sh->shared_rxqs));
 #ifdef HAVE_MLX5DV_DR
if (sh->rx_domain) {
mlx5_glue->dr_destroy_domain(sh->rx_domain);
diff --git a/drivers/net/mlx5/linux/mlx5_verbs.c b/drivers/net/mlx5/linux/mlx5_verbs.c
index 0e68a13208b..17183adf732 100644
--- a/drivers/net/mlx5/linux/mlx5_verbs.c
+++ b/drivers/net/mlx5/linux/mlx5_verbs.c
@@ -459,20 +459,24 @@ mlx5_rxq_ibv_obj_new(struct mlx5_rxq_priv *rxq)
  *
  * @param rxq
  *   Pointer to Rx queue.
+ * @return
+ *   True when it is safe to release the RxQ object.
  */
-static void
+static bool
 mlx5_rxq_ibv_obj_release(struct mlx5_rxq_priv *rxq)
 {
struct mlx5_rxq_obj *rxq_obj = rxq->ctrl->obj;
 
-   MLX5_ASSERT(rxq_obj);
-   MLX5_ASSERT(rxq_obj->wq);
-   MLX5_ASSERT(rxq_obj->ibv_cq);
+   if (rxq_obj == NULL || rxq_obj->wq == NULL)
+   return true;
claim_zero(mlx5_glue->destroy_wq(rxq_obj->wq));
+   rxq_obj->wq = NULL;
+   MLX5_ASSERT(rxq_obj->ibv_cq);
claim_zero(mlx5_glue->destroy_cq(rxq_obj->ibv_cq));
if (rxq_obj->ibv_channel)
claim_zero(mlx5_glue->destroy_comp_channel
(rxq_obj->ibv_channel));
+   return true;
 }
 
 /**
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 55612f777ea..647a18d3916 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -1193,6 +1193,7 @@ struct mlx5_dev_ctx_shared {
struct mlx5_flex_parser_profiles fp[MLX5_FLEX_PARSER_MAX];
/* Flex parser profiles information. */
void *devx_rx_uar; /* DevX UAR for Rx. */
+   LIST_HEAD(shared_rxqs, mlx5_rxq_ctrl) shared_rxqs; /* Shared RXQs. */
struct mlx5_aso_age_mng *aso_age_mng;
/* Management data for aging mechanism using ASO Flow Hit. */
struct mlx5_geneve_tlv_option_resource *geneve_tlv_option_resource;
@@ -1257,6 +1258,7 @@ struct mlx5_rxq_obj {
};
stru

[dpdk-dev] [PATCH v2 10/13] net/mlx5: move Rx queue DevX resource

2021-10-16 Thread Xueming Li
To support shared RX queue, move the DevX RQ, which is a per-queue
resource, to the Rx queue private data.

Signed-off-by: Xueming Li 
---
 drivers/net/mlx5/linux/mlx5_verbs.c | 154 +++
 drivers/net/mlx5/mlx5.h |  11 +-
 drivers/net/mlx5/mlx5_devx.c| 227 ++--
 drivers/net/mlx5/mlx5_rx.h  |   1 +
 drivers/net/mlx5/mlx5_rxq.c |  44 +++---
 drivers/net/mlx5/mlx5_rxtx.c|   6 +-
 drivers/net/mlx5/mlx5_trigger.c |   2 +-
 drivers/net/mlx5/mlx5_vlan.c|  16 +-
 8 files changed, 241 insertions(+), 220 deletions(-)

diff --git a/drivers/net/mlx5/linux/mlx5_verbs.c b/drivers/net/mlx5/linux/mlx5_verbs.c
index d4fa202ac4b..a2a9b9c1f98 100644
--- a/drivers/net/mlx5/linux/mlx5_verbs.c
+++ b/drivers/net/mlx5/linux/mlx5_verbs.c
@@ -71,13 +71,13 @@ const struct mlx5_mr_ops mlx5_mr_verbs_ops = {
 /**
  * Modify Rx WQ vlan stripping offload
  *
- * @param rxq_obj
- *   Rx queue object.
+ * @param rxq
+ *   Rx queue.
  *
  * @return 0 on success, non-0 otherwise
  */
 static int
-mlx5_rxq_obj_modify_wq_vlan_strip(struct mlx5_rxq_obj *rxq_obj, int on)
+mlx5_rxq_obj_modify_wq_vlan_strip(struct mlx5_rxq_priv *rxq, int on)
 {
uint16_t vlan_offloads =
(on ? IBV_WQ_FLAGS_CVLAN_STRIPPING : 0) |
@@ -89,14 +89,14 @@ mlx5_rxq_obj_modify_wq_vlan_strip(struct mlx5_rxq_obj *rxq_obj, int on)
.flags = vlan_offloads,
};
 
-   return mlx5_glue->modify_wq(rxq_obj->wq, &mod);
+   return mlx5_glue->modify_wq(rxq->ctrl->obj->wq, &mod);
 }
 
 /**
  * Modifies the attributes for the specified WQ.
  *
- * @param rxq_obj
- *   Verbs Rx queue object.
+ * @param rxq
+ *   Verbs Rx queue.
  * @param type
  *   Type of change queue state.
  *
@@ -104,14 +104,14 @@ mlx5_rxq_obj_modify_wq_vlan_strip(struct mlx5_rxq_obj *rxq_obj, int on)
  *   0 on success, a negative errno value otherwise and rte_errno is set.
  */
 static int
-mlx5_ibv_modify_wq(struct mlx5_rxq_obj *rxq_obj, uint8_t type)
+mlx5_ibv_modify_wq(struct mlx5_rxq_priv *rxq, uint8_t type)
 {
struct ibv_wq_attr mod = {
.attr_mask = IBV_WQ_ATTR_STATE,
.wq_state = (enum ibv_wq_state)type,
};
 
-   return mlx5_glue->modify_wq(rxq_obj->wq, &mod);
+   return mlx5_glue->modify_wq(rxq->ctrl->obj->wq, &mod);
 }
 
 /**
@@ -181,21 +181,18 @@ mlx5_ibv_modify_qp(struct mlx5_txq_obj *obj, enum mlx5_txq_modify_type type,
 /**
  * Create a CQ Verbs object.
  *
- * @param dev
- *   Pointer to Ethernet device.
- * @param idx
- *   Queue index in DPDK Rx queue array.
+ * @param rxq
+ *   Pointer to Rx queue.
  *
  * @return
  *   The Verbs CQ object initialized, NULL otherwise and rte_errno is set.
  */
 static struct ibv_cq *
-mlx5_rxq_ibv_cq_create(struct rte_eth_dev *dev, uint16_t idx)
+mlx5_rxq_ibv_cq_create(struct mlx5_rxq_priv *rxq)
 {
-   struct mlx5_priv *priv = dev->data->dev_private;
-   struct mlx5_rxq_data *rxq_data = (*priv->rxqs)[idx];
-   struct mlx5_rxq_ctrl *rxq_ctrl =
-   container_of(rxq_data, struct mlx5_rxq_ctrl, rxq);
+   struct mlx5_priv *priv = rxq->priv;
+   struct mlx5_rxq_ctrl *rxq_ctrl = rxq->ctrl;
+   struct mlx5_rxq_data *rxq_data = &rxq_ctrl->rxq;
struct mlx5_rxq_obj *rxq_obj = rxq_ctrl->obj;
unsigned int cqe_n = mlx5_rxq_cqe_num(rxq_data);
struct {
@@ -241,7 +238,7 @@ mlx5_rxq_ibv_cq_create(struct rte_eth_dev *dev, uint16_t idx)
DRV_LOG(DEBUG,
"Port %u Rx CQE compression is disabled for HW"
" timestamp.",
-   dev->data->port_id);
+   priv->dev_data->port_id);
}
 #ifdef HAVE_IBV_MLX5_MOD_CQE_128B_PAD
if (RTE_CACHE_LINE_SIZE == 128) {
@@ -257,21 +254,18 @@ mlx5_rxq_ibv_cq_create(struct rte_eth_dev *dev, uint16_t idx)
 /**
  * Create a WQ Verbs object.
  *
- * @param dev
- *   Pointer to Ethernet device.
- * @param idx
- *   Queue index in DPDK Rx queue array.
+ * @param rxq
+ *   Pointer to Rx queue.
  *
  * @return
  *   The Verbs WQ object initialized, NULL otherwise and rte_errno is set.
  */
 static struct ibv_wq *
-mlx5_rxq_ibv_wq_create(struct rte_eth_dev *dev, uint16_t idx)
+mlx5_rxq_ibv_wq_create(struct mlx5_rxq_priv *rxq)
 {
-   struct mlx5_priv *priv = dev->data->dev_private;
-   struct mlx5_rxq_data *rxq_data = (*priv->rxqs)[idx];
-   struct mlx5_rxq_ctrl *rxq_ctrl =
-   container_of(rxq_data, struct mlx5_rxq_ctrl, rxq);
+   struct mlx5_priv *priv = rxq->priv;
+   struct mlx5_rxq_ctrl *rxq_ctrl = rxq->ctrl;
+   struct mlx5_rxq_data *rxq_data = &rxq_ctrl->rxq;
struct mlx5_rxq_obj *rxq_obj = rxq_ctrl->obj;
unsigned int wqe_n = 1 << rxq_data->elts_n;
struct {
@@ -338,7 +332,7 @@ mlx5_rxq_ibv_wq_create(struct rte_eth_dev *dev, uint16_t idx)
DRV_LOG(ERR,
"Port %u Rx queue %u requested %u*%u

[dpdk-dev] [PATCH v2 09/13] net/mlx5: remove port info from shareable Rx queue

2021-10-16 Thread Xueming Li
To prepare for shared Rx queue, remove port info from the shareable Rx
queue control.

Signed-off-by: Xueming Li 
---
 drivers/net/mlx5/mlx5_devx.c |  2 +-
 drivers/net/mlx5/mlx5_mr.c   |  7 ---
 drivers/net/mlx5/mlx5_rx.c   | 15 +++
 drivers/net/mlx5/mlx5_rx.h   |  5 -
 drivers/net/mlx5/mlx5_rxq.c  | 10 --
 drivers/net/mlx5/mlx5_rxtx_vec.c |  2 +-
 6 files changed, 17 insertions(+), 24 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_devx.c b/drivers/net/mlx5/mlx5_devx.c
index c65a6e5d4e7..8411e6e6418 100644
--- a/drivers/net/mlx5/mlx5_devx.c
+++ b/drivers/net/mlx5/mlx5_devx.c
@@ -916,7 +916,7 @@ mlx5_rxq_devx_obj_drop_create(struct rte_eth_dev *dev)
}
rxq->rxq_ctrl = rxq_ctrl;
rxq_ctrl->type = MLX5_RXQ_TYPE_STANDARD;
-   rxq_ctrl->priv = priv;
+   rxq_ctrl->sh = priv->sh;
rxq_ctrl->obj = rxq;
rxq_data = &rxq_ctrl->rxq;
/* Create CQ using DevX API. */
diff --git a/drivers/net/mlx5/mlx5_mr.c b/drivers/net/mlx5/mlx5_mr.c
index 44afda731fc..8d48b4614ee 100644
--- a/drivers/net/mlx5/mlx5_mr.c
+++ b/drivers/net/mlx5/mlx5_mr.c
@@ -82,10 +82,11 @@ mlx5_rx_addr2mr_bh(struct mlx5_rxq_data *rxq, uintptr_t addr)
struct mlx5_rxq_ctrl *rxq_ctrl =
container_of(rxq, struct mlx5_rxq_ctrl, rxq);
struct mlx5_mr_ctrl *mr_ctrl = &rxq->mr_ctrl;
-   struct mlx5_priv *priv = rxq_ctrl->priv;
+   struct mlx5_priv *priv = RXQ_PORT(rxq_ctrl);
+   struct mlx5_dev_ctx_shared *sh = rxq_ctrl->sh;
 
-   return mlx5_mr_addr2mr_bh(priv->sh->pd, &priv->mp_id,
- &priv->sh->share_cache, mr_ctrl, addr,
+   return mlx5_mr_addr2mr_bh(sh->pd, &priv->mp_id,
+ &sh->share_cache, mr_ctrl, addr,
  priv->config.mr_ext_memseg_en);
 }
 
diff --git a/drivers/net/mlx5/mlx5_rx.c b/drivers/net/mlx5/mlx5_rx.c
index e3b1051ba46..09de26c0d39 100644
--- a/drivers/net/mlx5/mlx5_rx.c
+++ b/drivers/net/mlx5/mlx5_rx.c
@@ -118,15 +118,7 @@ int
 mlx5_rx_descriptor_status(void *rx_queue, uint16_t offset)
 {
struct mlx5_rxq_data *rxq = rx_queue;
-   struct mlx5_rxq_ctrl *rxq_ctrl =
-   container_of(rxq, struct mlx5_rxq_ctrl, rxq);
-   struct rte_eth_dev *dev = ETH_DEV(rxq_ctrl->priv);
 
-   if (dev->rx_pkt_burst == NULL ||
-   dev->rx_pkt_burst == removed_rx_burst) {
-   rte_errno = ENOTSUP;
-   return -rte_errno;
-   }
if (offset >= (1 << rxq->cqe_n)) {
rte_errno = EINVAL;
return -rte_errno;
@@ -438,10 +430,10 @@ mlx5_rx_err_handle(struct mlx5_rxq_data *rxq, uint8_t vec)
sm.is_wq = 1;
sm.queue_id = rxq->idx;
sm.state = IBV_WQS_RESET;
-   if (mlx5_queue_state_modify(ETH_DEV(rxq_ctrl->priv), &sm))
+   if (mlx5_queue_state_modify(RXQ_DEV(rxq_ctrl), &sm))
return -1;
if (rxq_ctrl->dump_file_n <
-   rxq_ctrl->priv->config.max_dump_files_num) {
+   RXQ_PORT(rxq_ctrl)->config.max_dump_files_num) {
MKSTR(err_str, "Unexpected CQE error syndrome "
  "0x%02x CQN = %u RQN = %u wqe_counter = %u"
  " rq_ci = %u cq_ci = %u", u.err_cqe->syndrome,
@@ -478,8 +470,7 @@ mlx5_rx_err_handle(struct mlx5_rxq_data *rxq, uint8_t vec)
sm.is_wq = 1;
sm.queue_id = rxq->idx;
sm.state = IBV_WQS_RDY;
-   if (mlx5_queue_state_modify(ETH_DEV(rxq_ctrl->priv),
-   &sm))
+   if (mlx5_queue_state_modify(RXQ_DEV(rxq_ctrl), &sm))
return -1;
if (vec) {
const uint32_t elts_n =
diff --git a/drivers/net/mlx5/mlx5_rx.h b/drivers/net/mlx5/mlx5_rx.h
index 2ed544556f5..4eed4176324 100644
--- a/drivers/net/mlx5/mlx5_rx.h
+++ b/drivers/net/mlx5/mlx5_rx.h
@@ -23,6 +23,10 @@
 /* Support tunnel matching. */
 #define MLX5_FLOW_TUNNEL 10
 
+#define RXQ_PORT(rxq_ctrl) LIST_FIRST(&(rxq_ctrl)->owners)->priv
+#define RXQ_DEV(rxq_ctrl) ETH_DEV(RXQ_PORT(rxq_ctrl))
+#define RXQ_PORT_ID(rxq_ctrl) PORT_ID(RXQ_PORT(rxq_ctrl))
+
 struct mlx5_rxq_stats {
 #ifdef MLX5_PMD_SOFT_COUNTERS
uint64_t ipackets; /**< Total of successfully received packets. */
@@ -163,7 +167,6 @@ struct mlx5_rxq_ctrl {
LIST_HEAD(priv, mlx5_rxq_priv) owners; /* Owner rxq list. */
struct mlx5_rxq_obj *obj; /* Verbs/DevX elements. */
struct mlx5_dev_ctx_shared *sh; /* Shared context. */
-   struct mlx5_priv *priv; /* Back pointer to private data. */
enum mlx5_rxq_type type; /* Rxq type. */
unsigned int socket; /* CPU socket ID for allocations. */
unsigned int irq:1; /* Whether IRQ is

[dpdk-dev] [PATCH v2 13/13] net/mlx5: add shared Rx queue port datapath support

2021-10-16 Thread Xueming Li
From: Viacheslav Ovsiienko 

When receiving a packet, the mlx5 PMD takes the mbuf port number from
the rxq data.

To support shared rxq, save the port number into the RQ context as the
user index. A received packet then resolves its port number from the CQE
user index, which is derived from the RQ context.

The legacy Verbs API doesn't support setting the RQ user index, so the
port number is still read from the rxq there.
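
The two sides of the mechanism, condensed from the hunks below (sketch
only; both lines appear verbatim in this patch):

  /* Control path: stamp the owning port into the RQ context. */
  rq_attr.user_index = rte_cpu_to_be_16(priv->dev_data->port_id);
  /* Datapath: shared queues recover the port from the CQE. */
  pkt->port = unlikely(rxq->shared) ? cqe->user_index_low : rxq->port_id;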

Signed-off-by: Xueming Li 
Signed-off-by: Viacheslav Ovsiienko 
---
 drivers/net/mlx5/mlx5_devx.c |  1 +
 drivers/net/mlx5/mlx5_rx.c   |  1 +
 drivers/net/mlx5/mlx5_rxq.c  |  3 ++-
 drivers/net/mlx5/mlx5_rxtx_vec_altivec.h |  6 ++
 drivers/net/mlx5/mlx5_rxtx_vec_neon.h| 12 +++-
 drivers/net/mlx5/mlx5_rxtx_vec_sse.h |  8 +++-
 6 files changed, 28 insertions(+), 3 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_devx.c b/drivers/net/mlx5/mlx5_devx.c
index 94253047141..11c426eee14 100644
--- a/drivers/net/mlx5/mlx5_devx.c
+++ b/drivers/net/mlx5/mlx5_devx.c
@@ -277,6 +277,7 @@ mlx5_rxq_create_devx_rq_resources(struct mlx5_rxq_priv *rxq)
MLX5_WQ_END_PAD_MODE_NONE;
rq_attr.wq_attr.pd = priv->sh->pdn;
rq_attr.counter_set_id = priv->counter_set_id;
+   rq_attr.user_index = rte_cpu_to_be_16(priv->dev_data->port_id);
if (rxq_data->shared) /* Create RMP based RQ. */
rxq->devx_rq.rmp = &rxq_ctrl->obj->devx_rmp;
/* Create RQ using DevX API. */
diff --git a/drivers/net/mlx5/mlx5_rx.c b/drivers/net/mlx5/mlx5_rx.c
index 3017a8da20c..6ee54b820f1 100644
--- a/drivers/net/mlx5/mlx5_rx.c
+++ b/drivers/net/mlx5/mlx5_rx.c
@@ -707,6 +707,7 @@ rxq_cq_to_mbuf(struct mlx5_rxq_data *rxq, struct rte_mbuf *pkt,
 {
/* Update packet information. */
pkt->packet_type = rxq_cq_to_pkt_type(rxq, cqe, mcqe);
+   pkt->port = unlikely(rxq->shared) ? cqe->user_index_low : rxq->port_id;
 
if (rxq->rss_hash) {
uint32_t rss_hash_res = 0;
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index 494c9e3517f..250922b0d7a 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -179,7 +179,8 @@ rxq_alloc_elts_sprq(struct mlx5_rxq_ctrl *rxq_ctrl)
mbuf_init->data_off = RTE_PKTMBUF_HEADROOM;
rte_mbuf_refcnt_set(mbuf_init, 1);
mbuf_init->nb_segs = 1;
-   mbuf_init->port = rxq->port_id;
+   /* For shared queues port is provided in CQE */
+   mbuf_init->port = rxq->shared ? 0 : rxq->port_id;
if (priv->flags & RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF)
mbuf_init->ol_flags = EXT_ATTACHED_MBUF;
/*
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h b/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
index 82586f012cb..115320a26f0 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
@@ -1189,6 +1189,12 @@ rxq_cq_process_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq,
 
/* D.5 fill in mbuf - rearm_data and packet_type. */
rxq_cq_to_ptype_oflags_v(rxq, cqes, opcode, &pkts[pos]);
+   if (unlikely(rxq->shared)) {
+   pkts[pos]->port = cq[pos].user_index_low;
+   pkts[pos + p1]->port = cq[pos + p1].user_index_low;
+   pkts[pos + p2]->port = cq[pos + p2].user_index_low;
+   pkts[pos + p3]->port = cq[pos + p3].user_index_low;
+   }
if (rxq->hw_timestamp) {
int offset = rxq->timestamp_offset;
if (rxq->rt_timestamp) {
diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
index 5ff792f4cb5..9e78318129a 100644
--- a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
+++ b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
@@ -787,7 +787,17 @@ rxq_cq_process_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq,
/* C.4 fill in mbuf - rearm_data and packet_type. */
rxq_cq_to_ptype_oflags_v(rxq, ptype_info, flow_tag,
 opcode, &elts[pos]);
-   if (rxq->hw_timestamp) {
+   if (unlikely(rxq->shared)) {
+   elts[pos]->port = container_of(p0, struct mlx5_cqe,
+ pkt_info)->user_index_low;
+   elts[pos + 1]->port = container_of(p1, struct mlx5_cqe,
+ pkt_info)->user_index_low;
+   elts[pos + 2]->port = container_of(p2, struct mlx5_cqe,
+ pkt_info)->user_index_low;
+   elts[pos + 3]->port = container_of(p3, struct mlx5_cqe,
+ pkt_info)->user_index_low;
+   }
+   if (unlikely(rxq->hw_timestamp)) {
int offset = rxq->

[dpdk-dev] [PATCH v2 11/13] net/mlx5: remove Rx queue data list from device

2021-10-16 Thread Xueming Li
The Rx queue data list (priv->rxqs) can be replaced by the Rx queue
list (priv->rxq_privs), so remove it and replace accesses with the
universal wrapper API.
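
A sketch of the wrapper accessors that replace direct priv->rxqs indexing
(all three are declared earlier in this series):

  struct mlx5_rxq_priv *rxq  = mlx5_rxq_get(dev, idx);      /* private part */
  struct mlx5_rxq_ctrl *ctrl = mlx5_rxq_ctrl_get(dev, idx); /* shared ctrl */
  struct mlx5_rxq_data *data = mlx5_rxq_data_get(dev, idx); /* datapath part */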

Signed-off-by: Xueming Li 
---
 drivers/net/mlx5/linux/mlx5_verbs.c |  7 ++---
 drivers/net/mlx5/mlx5.c | 10 +--
 drivers/net/mlx5/mlx5.h |  1 -
 drivers/net/mlx5/mlx5_devx.c| 13 +
 drivers/net/mlx5/mlx5_ethdev.c  |  6 +---
 drivers/net/mlx5/mlx5_flow.c| 45 +++--
 drivers/net/mlx5/mlx5_rss.c |  6 ++--
 drivers/net/mlx5/mlx5_rx.c  | 19 +---
 drivers/net/mlx5/mlx5_rx.h  |  9 +++---
 drivers/net/mlx5/mlx5_rxq.c | 23 ++-
 drivers/net/mlx5/mlx5_rxtx_vec.c|  6 ++--
 drivers/net/mlx5/mlx5_stats.c   |  9 +++---
 drivers/net/mlx5/mlx5_trigger.c |  2 +-
 13 files changed, 69 insertions(+), 87 deletions(-)

diff --git a/drivers/net/mlx5/linux/mlx5_verbs.c b/drivers/net/mlx5/linux/mlx5_verbs.c
index a2a9b9c1f98..0e68a13208b 100644
--- a/drivers/net/mlx5/linux/mlx5_verbs.c
+++ b/drivers/net/mlx5/linux/mlx5_verbs.c
@@ -527,11 +527,10 @@ mlx5_ibv_ind_table_new(struct rte_eth_dev *dev, const unsigned int log_n,
 
MLX5_ASSERT(ind_tbl);
for (i = 0; i != ind_tbl->queues_n; ++i) {
-   struct mlx5_rxq_data *rxq = (*priv->rxqs)[ind_tbl->queues[i]];
-   struct mlx5_rxq_ctrl *rxq_ctrl =
-   container_of(rxq, struct mlx5_rxq_ctrl, rxq);
+   struct mlx5_rxq_priv *rxq = mlx5_rxq_get(dev,
+ind_tbl->queues[i]);
 
-   wq[i] = rxq_ctrl->obj->wq;
+   wq[i] = rxq->ctrl->obj->wq;
}
MLX5_ASSERT(i > 0);
/* Finalise indirection table. */
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index 477ad8c1bc9..6240f6f5dc6 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -1578,20 +1578,12 @@ mlx5_dev_close(struct rte_eth_dev *dev)
mlx5_mp_os_req_stop_rxtx(dev);
/* Free the eCPRI flex parser resource. */
mlx5_flex_parser_ecpri_release(dev);
-   if (priv->rxqs != NULL) {
+   if (priv->rxq_privs != NULL) {
/* XXX race condition if mlx5_rx_burst() is still running. */
rte_delay_us_sleep(1000);
for (i = 0; (i != priv->rxqs_n); ++i)
mlx5_rxq_release(dev, i);
priv->rxqs_n = 0;
-   priv->rxqs = NULL;
-   }
-   if (priv->representor) {
-   /* Each representor has a dedicated interrupts handler */
-   mlx5_free(dev->intr_handle);
-   dev->intr_handle = NULL;
-   }
-   if (priv->rxq_privs != NULL) {
mlx5_free(priv->rxq_privs);
priv->rxq_privs = NULL;
}
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index a2735cbb350..55612f777ea 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -1406,7 +1406,6 @@ struct mlx5_priv {
unsigned int rxqs_n; /* RX queues array size. */
unsigned int txqs_n; /* TX queues array size. */
struct mlx5_rxq_priv *(*rxq_privs)[]; /* RX queue non-shared data. */
-   struct mlx5_rxq_data *(*rxqs)[]; /* (Shared) RX queues. */
struct mlx5_txq_data *(*txqs)[]; /* TX queues. */
struct rte_mempool *mprq_mp; /* Mempool for Multi-Packet RQ. */
struct rte_eth_rss_conf rss_conf; /* RSS configuration. */
diff --git a/drivers/net/mlx5/mlx5_devx.c b/drivers/net/mlx5/mlx5_devx.c
index f7f7526dbf6..b767470dea0 100644
--- a/drivers/net/mlx5/mlx5_devx.c
+++ b/drivers/net/mlx5/mlx5_devx.c
@@ -682,15 +682,16 @@ mlx5_devx_tir_attr_set(struct rte_eth_dev *dev, const 
uint8_t *rss_key,
 
/* NULL queues designate drop queue. */
if (ind_tbl->queues != NULL) {
-   struct mlx5_rxq_data *rxq_data =
-   (*priv->rxqs)[ind_tbl->queues[0]];
-   struct mlx5_rxq_ctrl *rxq_ctrl =
-   container_of(rxq_data, struct mlx5_rxq_ctrl, rxq);
-   rxq_obj_type = rxq_ctrl->type;
+   struct mlx5_rxq_priv *rxq = mlx5_rxq_get(dev,
+ind_tbl->queues[0]);
 
+   rxq_obj_type = rxq->ctrl->type;
/* Enable TIR LRO only if all the queues were configured for. */
for (i = 0; i < ind_tbl->queues_n; ++i) {
-   if (!(*priv->rxqs)[ind_tbl->queues[i]]->lro) {
+   struct mlx5_rxq_data *rxq_i =
+   mlx5_rxq_data_get(dev, ind_tbl->queues[i]);
+
+   if (rxq_i != NULL && !rxq_i->lro) {
lro = false;
break;
}
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index ee1189b929d..070ff149488 100644
--- a

Re: [dpdk-dev] [PATCH v6 1/5] ethdev: introduce shared Rx queue

2021-10-16 Thread Xueming(Steven) Li
On Fri, 2021-10-15 at 18:20 +0100, Ferruh Yigit wrote:
> On 10/12/2021 3:39 PM, Xueming Li wrote:
> > index 6d80514ba7a..041da6ee52f 100644
> > --- a/lib/ethdev/rte_ethdev.h
> > +++ b/lib/ethdev/rte_ethdev.h
> > @@ -1044,6 +1044,13 @@ struct rte_eth_rxconf {
> > uint8_t rx_drop_en; /**< Drop packets if no descriptors are available. 
> > */
> > uint8_t rx_deferred_start; /**< Do not start queue with 
> > rte_eth_dev_start(). */
> > uint16_t rx_nseg; /**< Number of descriptions in rx_seg array. */
> > +   /**
> > +* Share group index in Rx domain and switch domain.
> > +* Non-zero value to enable Rx queue share, zero value disable share.
> > +* PMD driver is responsible for Rx queue consistency checks to avoid
> 
> When you update the set, can you please update 'PMD driver' usage too?
> 
> PMD = Poll Mode Driver, so the second 'driver' is a duplicate; there are a
> few more instances of this usage in this set.

Got it, thanks!

BTW, PMD patches updated:
https://patches.dpdk.org/project/dpdk/list/?series=19709


Re: [dpdk-dev] [PATCH v2 0/7] crypto/security session framework rework

2021-10-16 Thread Zhang, Roy Fan
Hi Akhil,

I didn't work on the asym problem. As stated in the email, the solution I can
think of is to add a new API to create an asym session pool - or you may have
a better solution.

BTW, in the current test_cryptodev_asym.c the function testsuite_setup()
creates the queue pair before creating the session pool, which will always
make the queue pair creation fail at the library layer - as the session pool
cannot be empty. I don't think the session pool is mandatory when creating the
queue pair, as it is only needed for session-less operation even for sym
crypto - this requirement also doesn't make sense for the crypto PMDs that
don't support session-less operation.

My sym fix is the same as your proposal. Here is my diff as a reference for
the sym crypto seg-fault fix.

diff --git a/drivers/crypto/aesni_gcm/aesni_gcm_pmd.c 
b/drivers/crypto/aesni_gcm/aesni_gcm_pmd.c
index 330aad8157..990fc99763 100644
--- a/drivers/crypto/aesni_gcm/aesni_gcm_pmd.c
+++ b/drivers/crypto/aesni_gcm/aesni_gcm_pmd.c
@@ -174,27 +174,25 @@ aesni_gcm_get_session(struct aesni_gcm_qp *qp, struct 
rte_crypto_op *op)
sym_op->session,
cryptodev_driver_id);
} else  {
-   void *_sess;
-   void *_sess_private_data = NULL;
+   struct rte_cryptodev_sym_session *_sess =
+   rte_cryptodev_sym_session_create(qp->sess_mp);
 
-   if (rte_mempool_get(qp->sess_mp, (void **)&_sess))
+   if (_sess == NULL)
return NULL;
 
-   if (rte_mempool_get(qp->sess_mp_priv,
-   (void **)&_sess_private_data))
-   return NULL;
+   _sess->sess_data[cryptodev_driver_id].data =
+   (void *)((uint8_t *)_sess +
+   rte_cryptodev_sym_get_header_session_size() +
+   (cryptodev_driver_id * _sess->priv_sz));
 
-   sess = (struct aesni_gcm_session *)_sess_private_data;
+   sess = _sess->sess_data[cryptodev_driver_id].data;
 
if (unlikely(aesni_gcm_set_session_parameters(qp->ops,
sess, sym_op->xform) != 0)) {
rte_mempool_put(qp->sess_mp, _sess);
-   rte_mempool_put(qp->sess_mp_priv, _sess_private_data);
sess = NULL;
}
sym_op->session = (struct rte_cryptodev_sym_session *)_sess;
-   set_sym_session_private_data(sym_op->session,
-   cryptodev_driver_id, _sess_private_data);
}
 
if (unlikely(sess == NULL))
@@ -716,7 +714,6 @@ handle_completed_gcm_crypto_op(struct aesni_gcm_qp *qp,
memset(op->sym->session, 0,
rte_cryptodev_sym_get_existing_header_session_size(
op->sym->session));
-   rte_mempool_put(qp->sess_mp_priv, sess);
rte_mempool_put(qp->sess_mp, op->sym->session);
op->sym->session = NULL;
}
diff --git a/drivers/crypto/aesni_gcm/aesni_gcm_pmd_private.h 
b/drivers/crypto/aesni_gcm/aesni_gcm_pmd_private.h
index 2763d1c492..cb37fd6b29 100644
--- a/drivers/crypto/aesni_gcm/aesni_gcm_pmd_private.h
+++ b/drivers/crypto/aesni_gcm/aesni_gcm_pmd_private.h
@@ -52,8 +52,6 @@ struct aesni_gcm_qp {
/**< Queue pair statistics */
struct rte_mempool *sess_mp;
/**< Session Mempool */
-   struct rte_mempool *sess_mp_priv;
-   /**< Session Private Data Mempool */
uint16_t id;
/**< Queue Pair Identifier */
char name[RTE_CRYPTODEV_NAME_MAX_LEN];
diff --git a/drivers/crypto/aesni_mb/aesni_mb_pmd_private.h 
b/drivers/crypto/aesni_mb/aesni_mb_pmd_private.h
index 11e7bf5d18..2398fdf1b8 100644
--- a/drivers/crypto/aesni_mb/aesni_mb_pmd_private.h
+++ b/drivers/crypto/aesni_mb/aesni_mb_pmd_private.h
@@ -182,8 +182,6 @@ struct aesni_mb_qp {
/**< Ring for placing operations ready for processing */
struct rte_mempool *sess_mp;
/**< Session Mempool */
-   struct rte_mempool *sess_mp_priv;
-   /**< Session Private Data Mempool */
struct rte_cryptodev_stats stats;
/**< Queue pair statistics */
uint8_t digest_idx;
diff --git a/drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c 
b/drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c
index e8da9ea9e1..d9e525c86f 100644
--- a/drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c
+++ b/drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c
@@ -1024,27 +1024,25 @@ get_session(struct aesni_mb_qp *qp, struct 
rte_crypto_op *op)
(op->sym->sec_session);
 #endif
} else {
-   void *_sess = rte_cryptodev_sym_session_create(qp->sess_mp);
-   void *_sess_private_data = NULL;
+   struct rte_cryptodev_sym_session *_sess =
+   rte_cryptodev_sy

[dpdk-dev] DPDK 20.11 - i40evf: No response for 14

2021-10-16 Thread liaobiting
Hi:
I am using an Intel SR-IOV XL710 VF with DPDK v20.11 to create a DPDK bond,
but it fails.
When the DPDK bond starts and a slave link comes up, the LSC callback
registered in the EAL thread is triggered; then, in the activate_slave
process, i40evf sends message 14 to the PF to configure promiscuous mode,
but no response is received. The function "i40evf_handle_aq_msg", in which
the VF receives responses from the PF, runs in the same EAL thread as
"_i40evf_execute_vf_cmd", in which the VF sends messages to the PF. Thus,
while the EAL thread is inside _i40evf_execute_vf_cmd, it cannot handle
messages from the PF, and it reports "No response for 14" in the DPDK log.
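
For illustration, a minimal self-contained sketch of that blocking pattern
(all names here - reply_ready, handle_aq_msg, execute_vf_cmd, MAX_TRY_TIMES -
are simplified stand-ins, not the actual i40evf driver symbols):

#include <unistd.h>

#define MAX_TRY_TIMES 200

static volatile int reply_ready; /* set once the PF response is processed */

/* Would be dispatched in the EAL interrupt thread on an admin-queue event. */
static void
handle_aq_msg(void)
{
    reply_ready = 1;
}

/* Also runs in the EAL interrupt thread, e.g. from the LSC callback. */
static int
execute_vf_cmd(void)
{
    int retries = 0;

    /* ... send message 14 (CONFIG_PROMISCUOUS_MODE) to the PF ... */
    while (!reply_ready && retries++ < MAX_TRY_TIMES)
        usleep(1000);
    /*
     * handle_aq_msg() would have to run on the very thread that is
     * spinning here, so it never gets the chance: the wait always
     * times out and the driver logs "No response for 14".
     */
    return reply_ready ? 0 : -1;
}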

And there are the error logs in dpdk:
2021-10-11T11:03:32.812516+08:00|info|ovs-vswitchd[1721043]|EAL: [eth_dev_ops] 
Slave 1: set mtu: 9058 .
2021-10-11T11:03:32.822434+08:00|info|ovs-vswitchd[1721043]|EAL: [eth_dev_ops] 
Slave 1: dev configure succeed.
2021-10-11T11:03:33.022426+08:00|info|ovs-vswitchd[1721043]|EAL: LSC CALLBACK: 
slave :87:02.0 links up
2021-10-11T11:03:33.022455+08:00|info|ovs-vswitchd[1721043]|PMD: Slave 1: 
lacp_rate has been set to slow.
2021-10-11T11:03:35.034445+08:00|warning|ovs-vswitchd[1721043]|_i40evf_execute_vf_cmd():
 No response for 14
2021-10-11T11:03:35.034628+08:00|err|ovs-vswitchd[1721043]|i40evf_config_promisc():
 fail to execute command CONFIG_PROMISCUOUS_MODE
2021-10-11T11:03:35.034657+08:00|info|ovs-vswitchd[1721043]|EAL: [eth_dev_ops] 
Slave 1: enable allmulticast.
2021-10-11T11:03:35.034680+08:00|err|ovs-vswitchd[1721043]|bond_mode_8023ad_register_lacp_mac(1250)
 - failed to enable allmulti mode for port 1: Resource temporarily unavailable
2021-10-11T11:03:35.034702+08:00|debug|ovs-vswitchd[1721043]|bond_mode_8023ad_register_lacp_mac(1266)
 - forced promiscuous for port 1
2021-10-11T11:03:35.034725+08:00|info|ovs-vswitchd[1721043]|EAL: [eth_dev_ops] 
Slave 1 (net_bonding_trunk1): Configuring and applying resources successed.
2021-10-11T11:03:35.034746+08:00|info|ovs-vswitchd[1721043]|EAL: [eth_dev_ops] 
slave 1 (net_bonding_trunk1) activate.
2021-10-11T11:03:35.034771+08:00|debug|ovs-vswitchd[1721043]| 0 [Port 1: 
rx_machine] -> INITIALIZE
2021-10-11T11:03:35.034794+08:00|debug|ovs-vswitchd[1721043]| 0 [Port 1: 
mux_machine] -> DETACHED
2021-10-11T11:03:35.084534+08:00|err|ovs-vswitchd[1721043]|i40evf_handle_aq_msg():
 command mismatch,expect 8, get 14
2021-10-11T11:03:35.134545+08:00|debug|ovs-vswitchd[1721043]|   100 [Port 1: 
mux_machine] DETACHED -> WAITING
2021-10-11T11:03:35.544648+08:00|info|ovs-vswitchd[1721043]|EAL: [eth_dev_ops] 
Slave 1: start dev.

How can I solve this problem?
Thanks for the help.
Regards,
Liao


Re: [dpdk-dev] [PATCH v2 0/7] crypto/security session framework rework

2021-10-16 Thread Zhang, Roy Fan
Hi,

> -Original Message-
> From: Akhil Goyal 
> Sent: Friday, October 15, 2021 7:47 PM
> To: Zhang, Roy Fan ; dev@dpdk.org
> Cc: tho...@monjalon.net; david.march...@redhat.com;
> hemant.agra...@nxp.com; Anoob Joseph ; De Lara
> Guarch, Pablo ; Trahe, Fiona
> ; Doherty, Declan ;
> ma...@nvidia.com; g.si...@nxp.com; jianjay.z...@huawei.com;
> asoma...@amd.com; ruifeng.w...@arm.com; Ananyev, Konstantin
> ; Nicolau, Radu ;
> ajit.khapa...@broadcom.com; Nagadheeraj Rottela
> ; Ankur Dwivedi ;
> Power, Ciara ; Wang, Haiyue
> ; jiawe...@trustnetic.com;
> jianw...@trustnetic.com
> Subject: RE: [PATCH v2 0/7] crypto/security session framework rework
> 
> > > Hi Akhil,
> > >
> > > I tried to fix the problems of seg faults.
> > > The seg-faults are gone now but all asym tests are failing too.
> > > The reason is the rte_cryptodev_queue_pair_setup() checks the session
> > > mempool same for sym and asym.
> > > Since we don't have a rte_cryptodev_asym_session_pool_create() the
> > > session mempool created by
> > > test_cryptodev_asym.c  with rte_mempool_create() will fail the
> mempool
> > > check when setting up the queue pair.
> > >
> > > If you think my fix may be useful (although not resolving asym issue) I 
> > > can
> > > send it.
> > >
> > Is it a different fix than what I proposed below? If yes, you can send the
> diff.
> > I already made the below changes for all the PMDs.
> > I will try to fix the asym issue, but I suppose it can be dealt in the app
> > Which can be fixed separately in RC2.
> >
> > Also, found the root cause of multi process issue, working on making the
> > patches.
> > Will send v3 soon with all 3 issues(docsis/mp/sessless) fixed atleast.
> > For Asym, may send a separate patch.
> >
> For the Asym issue, it looks like the APIs are not written properly and have
> many issues compared to sym.
> Looking at the API rte_cryptodev_queue_pair_setup(), it only support
> mp_session(or priv_sess_mp) for symmetric sessions even without my
> changes.
> 
> Hence, a qp does not have mempool for sessionless Asym processing and
> looking at current
> Drivers, only QAT support asym session less and it does not use mempool
> stored in qp.
> 
> Hence IMO, it is safe to remove the check from
> rte_cryptodev_queue_pair_setup()
> if (!qp_conf->mp_session) {
> CDEV_LOG_ERR("Invalid mempools\n");
> return -EINVAL;
> }
> Or we can have give a CDEV_LOG_INFO (to indicate session mempool not
> present, session less won't work) instead of CDEV_LOG_ERR and fall through.
> 

Yes this is a valid fix. It will make queue pair setup work as before.
The old code was like this:

if ((qp_conf->mp_session && !qp_conf->mp_session_private) ||
(!qp_conf->mp_session && qp_conf->mp_session_private)) {
CDEV_LOG_ERR("Invalid mempools\n");
return -EINVAL;
}

The requirement was that either you provide two mempools (one for sessions and
one for session private data) or you don't provide a session mempool at all
when creating the queue pair. Only otherwise is the error returned.

> For sym case, it is checking again in next line if session_mp is there or not.
> 
> I hope, the asym cases will work once we remove the above check and pass
> Null in the asym app while setting up queue pairs. What say?

It shall be working.  Thanks.
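
For reference, a minimal sketch of the relaxed check along the lines proposed
above (log and fall through instead of failing):

	/* A session mempool is only needed for session-less operation,
	 * so do not fail the queue pair setup without one, just note it. */
	if (qp_conf->mp_session == NULL)
		CDEV_LOG_INFO(
			"No session mempool: session-less ops unavailable");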
> 
> 
> 



Re: [dpdk-dev] [PATCH v2] examples/ipsec-secgw: accept inline proto pkts in single sa

2021-10-16 Thread Akhil Goyal
> 
> > > Subject: [PATCH v2] examples/ipsec-secgw: accept inline proto pkts in
> single
> > > sa
> > >
> > > In inline protocol inbound SA's, plain ipv4 and ipv6 packets are
> > > delivered to application unlike inline crypto or lookaside.
> > > Hence fix the application to not drop them when working in
> > > single SA mode.
> > >
> > > Signed-off-by: Nithin Dabilpuram 
> > > ---
> > >
> > Acked-by: Akhil Goyal 
> >
> > @Konstantin/Bernard: Any objections?
> 
> None from me.

Applied to dpdk-next-crypto


Re: [dpdk-dev] [PATCH] crypto/cnxk: add max queue pairs limit devargs

2021-10-16 Thread Akhil Goyal
> Adds a max queue pairs limit devarg for the crypto cnxk driver. This
> can be used to set a limit on the maximum number of queue pairs
> supported by the device. The default value is 63.
> 
> Signed-off-by: Ankur Dwivedi 
> Reviewed-by: Anoob Joseph 
> Reviewed-by: Jerin Jacob Kollanukkaran 
> ---
Applied to dpdk-next-crypto

Thanks.


Re: [dpdk-dev] [EXT] [dpdk-dev v4] app: fix buffer overrun

2021-10-16 Thread Akhil Goyal
> This patch fixes a possible buffer overrun problem in the crypto perf test.
> Previously, when the user-configured AAD size was over 12 bytes, the copy of
> the template AAD would cause a buffer overrun.
> The problem is fixed by copying only up to 12 bytes of the AAD template.
> 
> Fixes: 8a5b494a7f99 ("app/test-crypto-perf: add AEAD parameters")
> Cc: pablo.de.lara.gua...@intel.com
> 
> Signed-off-by: Przemyslaw Zegan 
> Acked-by: Fan Zhang 
Applied to dpdk-next-crypto


Re: [dpdk-dev] [EXT] [dpdk-dev v4] app: fix buffer overrun

2021-10-16 Thread Akhil Goyal
> > This patch fixes a possible buffer overrun problem in the crypto perf test.
> > Previously, when the user-configured AAD size was over 12 bytes, the copy
> > of the template AAD would cause a buffer overrun.
> > The problem is fixed by copying only up to 12 bytes of the AAD template.
> >
> > Fixes: 8a5b494a7f99 ("app/test-crypto-perf: add AEAD parameters")
> > Cc: pablo.de.lara.gua...@intel.com
> >
> > Signed-off-by: Przemyslaw Zegan 
> > Acked-by: Fan Zhang 
> Applied to dpdk-next-crypto
Cc: sta...@dpdk.org


Re: [dpdk-dev] [EXT] [PATCH v10 2/8] baseband: introduce NXP LA12xx driver

2021-10-16 Thread Akhil Goyal
Hi Nipun, 
Few nits below.

Nicolas, Any more comments on this patchset, Can you ack?

> +++ b/drivers/baseband/la12xx/version.map
> @@ -0,0 +1,3 @@
> +DPDK_21 {
> + local: *;
> +};
This should be DPDK_22
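i.e. for this release the block would be:

DPDK_22 {
	local: *;
};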

> diff --git a/drivers/baseband/meson.build b/drivers/baseband/meson.build
> index 5ee61d5323..ccd1eebc3b 100644
> --- a/drivers/baseband/meson.build
> +++ b/drivers/baseband/meson.build
> @@ -11,6 +11,7 @@ drivers = [
>  'fpga_lte_fec',
>  'null',
>  'turbo_sw',
> +'la12xx',

Alphabetical order?

>  ]
> 
>  log_prefix = 'pmd.bb'
> --
> 2.17.1



Re: [dpdk-dev] [EXT] [PATCH v10 3/8] baseband/la12xx: add devargs for max queues

2021-10-16 Thread Akhil Goyal
> From: Hemant Agrawal 
> 
> This patch adds dev args to take  max queues as input
> 
> Signed-off-by: Nipun Gupta 
> Signed-off-by: Hemant Agrawal 
> ---
Documentation for dev args is missing in this patch.


Re: [dpdk-dev] [EXT] [PATCH v10 5/8] baseband/la12xx: add queue and modem config support

2021-10-16 Thread Akhil Goyal
> +Prerequisites
> +-
> +
> +Currently supported by DPDK:
> +
> +- NXP LA1224 BSP **1.0+**.
> +- NXP LA1224 PCIe Modem card connected to ARM host.
> +
> +- Follow the DPDK :ref:`Getting Started Guide for Linux ` to setup
> the basic DPDK environment.
> +
> +* Use dev arg option ``modem=0`` to identify the modem instance for a
> given
> +  device. This is required only if more than 1 modem cards are attached to
> host.
> +  this is optional and the default value is 0.
> +  e.g. ``--vdev=baseband_la12xx,modem=0``
> +
The documentation needs to be split into different patches:
- base doc to be added in first patch,
- devargs max_nb_queues for 2/8
- devargs modem for 5/8


> +* Use dev arg option ``max_nb_queues=x`` to specify the maximum number
> of queues
> +  to be used for communication with offload device i.e. modem. default is
> 16.
> +  e.g. ``--vdev=baseband_la12xx,max_nb_queues=4``
> +
> +Enabling logs
> +-
> +
> +For enabling logs, use the following EAL parameter:
> +
> +.. code-block:: console
> +
> +   ./your_bbdev_application  --log-level=la12xx:
> +
> +Using ``bb.la12xx`` as log matching criteria, all Baseband PMD logs can be
> +enabled which are lower than logging ``level``.
> diff --git a/doc/guides/rel_notes/release_21_11.rst
> b/doc/guides/rel_notes/release_21_11.rst
> index 135aa467f2..f4cae1b760 100644
> --- a/doc/guides/rel_notes/release_21_11.rst
> +++ b/doc/guides/rel_notes/release_21_11.rst
> @@ -134,6 +134,11 @@ New Features
>* Added tests to validate packets hard expiry.
>* Added tests to verify tunnel header verification in IPsec inbound.
> 
> +* **Added NXP LA12xx baseband PMD.**
> +
> +  * Added a new baseband PMD driver for NXP LA12xx Software defined
> radio.
> +  * See the :doc:`../bbdevs/la12xx` for more details.
> +
> 

Release notes may be added in your 6/8 patch where PMD is completely supported.

> +#define HUGEPG_OFFSET(A) \
> + ((uint64_t) ((unsigned long) (A) \
> + - ((uint64_t)ipc_priv->hugepg_start.host_vaddr)))
> +
> +static int ipc_queue_configure(uint32_t channel_id,
> + ipc_t instance, struct bbdev_la12xx_q_priv *q_priv)

Follow DPDK coding convention here and check for other functions also.



Re: [dpdk-dev] [EXT] [PATCH] cryptodev: extend data-unit length field

2021-10-16 Thread Akhil Goyal
> As described in [1] and as announced in [2], The field ``dataunit_len``
> of the ``struct rte_crypto_cipher_xform`` moved to the end of the
> structure and extended to ``uint32_t``.
> 
> In this way, sizes bigger than 64K bytes can be supported for data-unit
> lengths.
> 
> [1] commit d014dddb2d69 ("cryptodev: support multiple cipher
> data-units")
> [2] commit 9a5c09211b3a ("doc: announce extension of crypto data-unit
> length")
> 
> Signed-off-by: Matan Azrad 
> ---
Applied to dpdk-next-crypto


Re: [dpdk-dev] [EXT] [PATCH 5/5] crypto/mlx5: support on Windows

2021-10-16 Thread Akhil Goyal
> Add support for mlx5 crypto pmd on Windows OS.
> Add changes to release note and pmd guide.
> 
> Signed-off-by: Tal Shnaiderman 
> ---
>  doc/guides/cryptodevs/mlx5.rst   | 15 ---
>  doc/guides/rel_notes/release_21_11.rst   |  1 +
>  drivers/common/mlx5/version.map  |  2 +-
>  drivers/common/mlx5/windows/mlx5_common_os.c |  2 +-
>  drivers/crypto/aesni_gcm/meson.build |  6 ++
>  drivers/crypto/aesni_mb/meson.build  |  6 ++
>  drivers/crypto/armv8/meson.build |  6 ++
>  drivers/crypto/bcmfs/meson.build |  6 ++
>  drivers/crypto/ccp/meson.build   |  1 +
>  drivers/crypto/kasumi/meson.build|  6 ++
>  drivers/crypto/meson.build   |  3 ---
>  drivers/crypto/mlx5/meson.build  |  4 ++--
>  drivers/crypto/mvsam/meson.build |  6 ++
>  drivers/crypto/null/meson.build  |  6 ++
>  drivers/crypto/octeontx/meson.build  |  6 ++
>  drivers/crypto/openssl/meson.build   |  6 ++
>  drivers/crypto/qat/meson.build   |  6 ++
>  drivers/crypto/scheduler/meson.build |  6 ++
>  drivers/crypto/snow3g/meson.build|  6 ++
>  drivers/crypto/virtio/meson.build|  6 ++
>  drivers/crypto/zuc/meson.build   |  6 ++
>  21 files changed, 102 insertions(+), 10 deletions(-)
> 

Please split this patch into two
- one for all drivers meson.build changes and
- one for enabling Windows compilation for mlx5.


Re: [dpdk-dev] [Bug 811] BPF tests fail with clang

2021-10-16 Thread Stephen Hemminger
Any progress on this issue?
Perhaps we should just disable BPF with clang build?


On Thu, 16 Sep 2021 03:07:41 +
bugzi...@dpdk.org wrote:

> https://bugs.dpdk.org/show_bug.cgi?id=811
> 
> Bug ID: 811
>Summary: BPF tests fail with clang
>Product: DPDK
>Version: 21.11
>   Hardware: All
> OS: All
> Status: UNCONFIRMED
>   Severity: normal
>   Priority: Normal
>  Component: other
>   Assignee: dev@dpdk.org
>   Reporter: step...@networkplumber.org
>   Target Milestone: ---
> 
> The bpf_autotest fails when DPDK 21.11 is built with clang.
> Same test and code work when built with Gcc.
> 
>  $ clang --version
> Debian clang version 11.0.1-2
> Target: x86_64-pc-linux-gnu
> Thread model: posix
> InstalledDir: /usr/bin
> 
> $ gcc --version
> gcc (Debian 10.2.1-6) 10.2.1 20210110
> Copyright (C) 2020 Free Software Foundation, Inc.
> This is free software; see the source for copying conditions.  There is NO
> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
> 
> 
>  $ DPDK_TEST='bpf_autotest'
> /home/shemminger/DPDK/pcapng2/build/app/test/dpdk-test -l 0-15 --no-huge -m
> 2048
> EAL: Detected 16 lcore(s)
> EAL: Detected 1 NUMA nodes
> EAL: Detected static linkage of DPDK
> EAL: Multi-process socket /run/user/1000/dpdk/rte/mp_socket
> EAL: Selected IOVA mode 'VA'
> APP: HPET is not enabled, using TSC as default timer
> RTE>>bpf_autotest  
> run_test(test_store1) start
> run_test(test_store2) start
> run_test(test_load1) start
> run_test(test_ldimm1) start
> run_test(test_mul1) start
> run_test(test_shift1) start
> test_shift1_check: invalid value
> expected:
> 00:80:21:81:00:00:00:00:00:00:00:00:00:00:00:00:ff:db:dd:ed:ff:ff:ff:ff:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:ff:00:00:00:00:00:00:00:00:c0:ed:27:21:00:00:00:00:00:00:00:00:00:00:00:00:81:ef:ad:f6:ff:76:77:fb:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00
> result:
> 00:80:21:81:00:00:00:00:00:00:00:00:00:00:00:00:ff:db:dd:ed:ff:ff:ff:ff:00:00:00:00:00:00:00:00:81:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:ff:00:00:00:00:00:00:00:00:c0:ed:27:21:00:00:00:00:00:00:00:00:00:00:00:00:81:ef:ad:f6:ff:76:77:fb:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00
> run_test@3196: check_result(test_shift1) failed, error: -1(Unknown error -1);
> test_shift1_check: invalid value
> expected:
> 00:00:cf:4e:00:00:00:00:00:00:00:00:00:00:00:00:cf:bb:75:ef:ff:ff:ff:ff:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:80:e7:dd:ba:00:00:00:00:00:00:00:00:00:00:00:c0:00:00:00:00:00:00:00:00:00:00:00:00:7b:de:ad:7b:ff:ff:ff:ff:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00
> result:
> 00:00:cf:4e:00:00:00:00:00:00:00:00:00:00:00:00:cf:bb:75:ef:ff:ff:ff:ff:00:00:00:00:00:00:00:00:00:9e:9d:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:80:e7:dd:ba:00:00:00:00:00:00:00:00:00:00:00:c0:00:00:00:00:00:00:00:00:00:00:00:00:7b:de:ad:7b:ff:ff:ff:ff:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00
> run_test@3210: check_result(test_shift1) failed, error: -1(Unknown error -1);
> run_test(test_jump1) start
> run_test(test_jump2) start
> run_test(test_alu1) start
> run_test(test_bele1) start
> run_test(test_xadd1) start
> run_test(test_div1) start
> bpf_exec(0x7f3a1d03d000): division by 0 at pc: 0x68;
> run_test(test_call1) start
> run_test(test_call2) start
> run_test(test_call3) start
> run_test(test_call4) start
> run_test(test_call5) start
> run_test(test_ld_mbuf1) start
> run_test(test_ld_mbuf2) start
> run_test(test_ld_mbuf3) start
> Test Failed
> RTE>>  
> 



Re: [dpdk-dev] [PATCH 1/1] net: fix aliasing issue in checksum computation

2021-10-16 Thread Georg Sauthoff
Hello,

On Fri, Oct 15, 2021 at 04:39:02PM +0200, Olivier Matz wrote:
> On Sat, Sep 18, 2021 at 01:49:30PM +0200, Georg Sauthoff wrote:
> > That means a superfluous cast is removed and aliasing through a uint8_t
> > pointer is eliminated. Note that uint8_t doesn't have the same
> > strict-aliasing properties as unsigned char.
> 
> Interesting. Out of curiosity, do you have links that explains
> this?

yes, I do. https://stefansf.de/post/type-based-alias-analysis/ has some
nice examples and explains some things. Especially, it makes the point
that it's the access that matters for yielding undefined behaviour (i.e.
when violating strict-aliasing rules) and not the cast itself:

"N.B. the standard only speaks about the type of an object and the type
of an lvalue in order to access an object. Thus a pointer to an object
x may be converted arbitrarily often to arbitrary object pointer
types, and therefore even to incompatible types, as long as every
access to x is done through an lvalue which type conforms to C11
section 6.5 paragraph 7."

Section 'Character Type' in that article also addresses how uint8_t
isn't special as unsigned char while quoting the standard and
referencing below Bugzilla bug.

Another good article on strict aliasing:

https://gustedt.wordpress.com/2016/08/17/effective-types-and-aliasing/

 
> I found these, but these are just discussions:
>   
> https://stackoverflow.com/questions/16138237/when-is-uint8-t-%E2%89%A0-unsigned-char
>   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66110

I like the Bugzilla link as it shows how some code benefits from
uint8_t not having the same aliasing requirements as e.g. unsigned char.
Thus, it's an example of why compiler developers might be motivated to
decide against making uint8_t a typedef of unsigned char, since the
standard doesn't require it.
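
To make the distinction concrete, here is a minimal example, assuming an
implementation where uint8_t is a distinct non-character type, which the
standard permits (with today's GCC, uint8_t is a typedef of unsigned char,
so both functions behave alike; the point is that the standard allows the
difference):

#include <stdint.h>

/* unsigned char is a character type: the compiler must assume that
 * writing *p may modify *x, so *x is reloaded before returning. */
int
via_uchar(int *x, unsigned char *p)
{
    *x = 1;
    *p = 2;    /* may alias *x */
    return *x; /* must be re-read from memory */
}

/* uint8_t is only required to be an 8-bit unsigned integer type; an
 * implementation that makes it a distinct type would let the compiler
 * assume *p cannot alias *x and fold this to "return 1". */
int
via_u8(int *x, uint8_t *p)
{
    *x = 1;
    *p = 2;
    return *x;
}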

> What about rewording the sentence "uint8_t doesn't have the same
> strict-aliasing properties as unsigned char" to clarify that unsigned
> char may alias, but uint8_t may not?

I can change

"That means a superfluous cast is removed and aliasing through a uint8_t
pointer is eliminated. Note that uint8_t doesn't have the same
strict-aliasing properties as unsigned char."

to

"That means a superfluous cast is removed and aliasing through a uint8_t
pointer is eliminated. NB: The C standard specifies that a unsigned char
pointer may alias while the C standard doesn't include such requirement
for uint8_t pointers."

Better?


Best regards
Georg

-- 
'This function is not fully implemented in some standard libraries. For
 example, gcc and clang always return zero even though the device is
 non-deterministic. In comparison, Visual Studio always returns 32, and
 boost.random returns 10.'
  (http://en.cppreference.com/w/cpp/numeric/random/random_device/entropy, 2014)


Re: [dpdk-dev] [PATCH 1/1] net: fix aliasing issue in checksum computation

2021-10-16 Thread Georg Sauthoff
Hello,

On Sat, Oct 16, 2021 at 10:21:03AM +0200, Morten Brørup wrote:
> I have given this some more thoughts.
> 
> Most bytes transferred in real life are transferred in large packets,
> so faster processing of large packets is a great improvement!
> 
> Furthermore, a quick analysis of a recent packet sample from an ISP
> customer of ours shows that less than 8 % of the packets are odd size.
> Would you consider adding an unlikely() to the branch handling the odd
> byte at the end?

sure, I don't see a problem with adding unlikely() there.

I'll post a version 2 of that patch then, tomorrow.
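
For reference, a sketch of how the annotated odd-byte branch could look,
with the loop shape assumed from the patch under discussion (the actual v2
may differ in details):

/* Assumes <string.h>, <stdint.h> plus rte_common.h and
 * rte_branch_prediction.h for RTE_PTR_ADD, RTE_ALIGN_FLOOR and unlikely(). */
static inline uint32_t
__rte_raw_cksum(const void *buf, size_t len, uint32_t sum)
{
    const void *end;

    for (end = RTE_PTR_ADD(buf, RTE_ALIGN_FLOOR(len, sizeof(uint16_t)));
         buf != end; buf = RTE_PTR_ADD(buf, sizeof(uint16_t))) {
        uint16_t v;

        memcpy(&v, buf, sizeof(v));
        sum += v;
    }

    /* if length is odd, keeping it byte order independent */
    if (unlikely(len % 2)) {
        uint16_t left = 0;

        memcpy(&left, end, 1);
        sum += left;
    }

    return sum;
}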

Best regards
Georg

-- 
"No one can write decently who is distrustful of the reader's
intelligence, or whose attitude is patronizing." (William Strunk,
Jr. and E.B. White, The Elements of Style, p. 70, 1959)


[dpdk-dev] [PATCH v6 0/4] net/mlx5: implicit mempool registration

2021-10-16 Thread Dmitry Kozlyuk
MLX5 hardware has its internal IOMMU where PMD registers the memory.
On the data path, PMD translates VA into a key consumed by the device
IOMMU.  It is impractical for the PMD to register all allocated memory
because of increased lookup cost both in HW and SW.  Most often mbuf
memory comes from mempools, so if PMD tracks them, it can almost always
have mbuf memory registered before an mbuf hits the PMD. This patchset
adds such tracking in the PMD and internal API to support it.

Please see [1] for the discussion of the patch 2/4
and how it can be useful outside of the MLX5 PMD.

[1]: 
http://inbox.dpdk.org/dev/ch0pr12mb509112fadb778ab28af3771db9...@ch0pr12mb5091.namprd12.prod.outlook.com/

v6:
Fix compilation issue in proc-info (CI).
v5:
1. Change non-IO flag inference + various fixes (Andrew).
1. Fix callback unregistration from secondary processes (Olivier).
2. Support non-IO flag in proc-dump (David).
3. Fix the usage of locks (Olivier).
4. Avoid resource leaks in unit test (Olivier).
v4: (Andrew)
1. Improve mempool event callbacks unit tests and documentation.
2. Make MEMPOOL_F_NON_IO internal and automatically inferred.
   Add unit tests for the inference logic.
v3: Improve wording and naming; fix typos (Thomas).
v2 (internal review and testing):
1. Change tracked mempool event from being created (CREATE) to being
   fully populated (READY), which is the state PMD is interested in.
2. Unit test the new mempool callback API.
3. Remove bogus "error" messages in normal conditions.
4. Fixes in PMD.

Dmitry Kozlyuk (4):
  mempool: add event callbacks
  mempool: add non-IO flag
  common/mlx5: add mempool registration facilities
  net/mlx5: support mempool registration

 app/proc-info/main.c   |   6 +-
 app/test/test_mempool.c| 360 +++
 doc/guides/nics/mlx5.rst   |  13 +
 doc/guides/rel_notes/release_21_11.rst |   9 +
 drivers/common/mlx5/mlx5_common_mp.c   |  50 +++
 drivers/common/mlx5/mlx5_common_mp.h   |  14 +
 drivers/common/mlx5/mlx5_common_mr.c   | 580 +
 drivers/common/mlx5/mlx5_common_mr.h   |  17 +
 drivers/common/mlx5/version.map|   5 +
 drivers/net/mlx5/linux/mlx5_mp_os.c|  44 ++
 drivers/net/mlx5/linux/mlx5_os.c   |   4 +-
 drivers/net/mlx5/mlx5.c| 152 +++
 drivers/net/mlx5/mlx5.h|  10 +
 drivers/net/mlx5/mlx5_mr.c | 120 ++---
 drivers/net/mlx5/mlx5_mr.h |   2 -
 drivers/net/mlx5/mlx5_rx.h |  21 +-
 drivers/net/mlx5/mlx5_rxq.c|  13 +
 drivers/net/mlx5/mlx5_trigger.c|  77 +++-
 drivers/net/mlx5/windows/mlx5_os.c |   1 +
 lib/mempool/rte_mempool.c  | 134 ++
 lib/mempool/rte_mempool.h  |  64 +++
 lib/mempool/version.map|   8 +
 22 files changed, 1586 insertions(+), 118 deletions(-)

-- 
2.25.1



[dpdk-dev] [PATCH v6 1/4] mempool: add event callbacks

2021-10-16 Thread Dmitry Kozlyuk
Data path performance can benefit if the PMD knows which memory it will
need to handle in advance, before the first mbuf is sent to the PMD.
It is impractical, however, to consider all allocated memory for this
purpose. Most often mbuf memory comes from mempools that can come and
go. PMD can enumerate existing mempools on device start, but it also
needs to track creation and destruction of mempools after the forwarding
starts but before an mbuf from the new mempool is sent to the device.

Add an API to register callback for mempool life cycle events:
* rte_mempool_event_callback_register()
* rte_mempool_event_callback_unregister()
Currently tracked events are:
* RTE_MEMPOOL_EVENT_READY (after populating a mempool)
* RTE_MEMPOOL_EVENT_DESTROY (before freeing a mempool)
Provide a unit test for the new API.
The new API is internal, because it is primarily demanded by PMDs that
may need to deal with any mempools and do not control their creation,
while an application, on the other hand, knows which mempools it creates
and doesn't care about internal mempools PMDs might create.
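
For PMD authors, the wiring could look like the following sketch (the
callback signature matches test_mempool_events_cb() below; struct pmd_private
and the pmd_*_mempool_memory helpers are hypothetical):

#include <rte_mempool.h>

struct pmd_private; /* hypothetical per-device state */
void pmd_register_mempool_memory(struct pmd_private *p,
				 struct rte_mempool *mp);
void pmd_unregister_mempool_memory(struct pmd_private *p,
				   struct rte_mempool *mp);

static void
pmd_mempool_event_cb(enum rte_mempool_event event,
		     struct rte_mempool *mp, void *user_data)
{
	struct pmd_private *priv = user_data;

	if (event == RTE_MEMPOOL_EVENT_READY)
		pmd_register_mempool_memory(priv, mp);   /* create MRs etc. */
	else if (event == RTE_MEMPOOL_EVENT_DESTROY)
		pmd_unregister_mempool_memory(priv, mp); /* release MRs etc. */
}

/* On device start:
 *     rte_mempool_event_callback_register(pmd_mempool_event_cb, priv);
 * On device close:
 *     rte_mempool_event_callback_unregister(pmd_mempool_event_cb, priv);
 */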

Signed-off-by: Dmitry Kozlyuk 
Acked-by: Matan Azrad 
Reviewed-by: Andrew Rybchenko 
---
 app/test/test_mempool.c   | 248 ++
 lib/mempool/rte_mempool.c | 124 +++
 lib/mempool/rte_mempool.h |  62 ++
 lib/mempool/version.map   |   8 ++
 4 files changed, 442 insertions(+)

diff --git a/app/test/test_mempool.c b/app/test/test_mempool.c
index 66bc8d86b7..c39c83256e 100644
--- a/app/test/test_mempool.c
+++ b/app/test/test_mempool.c
@@ -14,6 +14,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -489,6 +490,245 @@ test_mp_mem_init(struct rte_mempool *mp,
data->ret = 0;
 }
 
+struct test_mempool_events_data {
+   struct rte_mempool *mp;
+   enum rte_mempool_event event;
+   bool invoked;
+};
+
+static void
+test_mempool_events_cb(enum rte_mempool_event event,
+  struct rte_mempool *mp, void *user_data)
+{
+   struct test_mempool_events_data *data = user_data;
+
+   data->mp = mp;
+   data->event = event;
+   data->invoked = true;
+}
+
+static int
+test_mempool_events(int (*populate)(struct rte_mempool *mp))
+{
+#pragma push_macro("RTE_TEST_TRACE_FAILURE")
+#undef RTE_TEST_TRACE_FAILURE
+#define RTE_TEST_TRACE_FAILURE(...) do { goto fail; } while (0)
+
+   static const size_t CB_NUM = 3;
+   static const size_t MP_NUM = 2;
+
+   struct test_mempool_events_data data[CB_NUM];
+   struct rte_mempool *mp[MP_NUM], *freed;
+   char name[RTE_MEMPOOL_NAMESIZE];
+   size_t i, j;
+   int ret;
+
+   memset(mp, 0, sizeof(mp));
+   for (i = 0; i < CB_NUM; i++) {
+   ret = rte_mempool_event_callback_register
+   (test_mempool_events_cb, &data[i]);
+   RTE_TEST_ASSERT_EQUAL(ret, 0, "Failed to register the callback 
%zu: %s",
+ i, rte_strerror(rte_errno));
+   }
+   ret = rte_mempool_event_callback_unregister(test_mempool_events_cb, mp);
+   RTE_TEST_ASSERT_NOT_EQUAL(ret, 0, "Unregistered a non-registered 
callback");
+   /* NULL argument has no special meaning in this API. */
+   ret = rte_mempool_event_callback_unregister(test_mempool_events_cb,
+   NULL);
+   RTE_TEST_ASSERT_NOT_EQUAL(ret, 0, "Unregistered a non-registered 
callback with NULL argument");
+
+   /* Create mempool 0 that will be observed by all callbacks. */
+   memset(&data, 0, sizeof(data));
+   strcpy(name, "empty0");
+   mp[0] = rte_mempool_create_empty(name, MEMPOOL_SIZE,
+MEMPOOL_ELT_SIZE, 0, 0,
+SOCKET_ID_ANY, 0);
+   RTE_TEST_ASSERT_NOT_NULL(mp[0], "Cannot create mempool %s: %s",
+name, rte_strerror(rte_errno));
+   for (j = 0; j < CB_NUM; j++)
+   RTE_TEST_ASSERT_EQUAL(data[j].invoked, false,
+ "Callback %zu invoked on %s mempool 
creation",
+ j, name);
+
+   rte_mempool_set_ops_byname(mp[0], rte_mbuf_best_mempool_ops(), NULL);
+   ret = populate(mp[0]);
+   RTE_TEST_ASSERT_EQUAL(ret, (int)mp[0]->size, "Failed to populate 
mempool %s: %s",
+ name, rte_strerror(rte_errno));
+   for (j = 0; j < CB_NUM; j++) {
+   RTE_TEST_ASSERT_EQUAL(data[j].invoked, true,
+   "Callback %zu not invoked on mempool %s 
population",
+   j, name);
+   RTE_TEST_ASSERT_EQUAL(data[j].event,
+   RTE_MEMPOOL_EVENT_READY,
+   "Wrong callback invoked, expected 
READY");
+   RTE_TEST_ASSERT_EQUAL(data[j].mp, mp[0],
+

[dpdk-dev] [PATCH v6 2/4] mempool: add non-IO flag

2021-10-16 Thread Dmitry Kozlyuk
A mempool is a generic allocator that is not necessarily used
for device IO operations, nor is its memory necessarily used for DMA.
Add MEMPOOL_F_NON_IO flag to mark such mempools automatically
a) if their objects are not contiguous;
b) if IOVA is not available for any object.
Other components can inspect this flag
in order to optimize their memory management.

Discussion: https://mails.dpdk.org/archives/dev/2021-August/216654.html
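
A consumer can then skip DMA setup for such pools, e.g. (sketch;
do_dma_registration() is a hypothetical driver hook):

static int
register_mempool_for_dma(struct rte_mempool *mp)
{
	/* Memory of a non-IO mempool will never be handed to a device. */
	if (mp->flags & MEMPOOL_F_NON_IO)
		return 0;
	return do_dma_registration(mp);
}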

Signed-off-by: Dmitry Kozlyuk 
Acked-by: Matan Azrad 
Reviewed-by: Andrew Rybchenko 
---
 app/proc-info/main.c   |   6 +-
 app/test/test_mempool.c| 112 +
 doc/guides/rel_notes/release_21_11.rst |   3 +
 lib/mempool/rte_mempool.c  |  10 +++
 lib/mempool/rte_mempool.h  |   2 +
 5 files changed, 131 insertions(+), 2 deletions(-)

diff --git a/app/proc-info/main.c b/app/proc-info/main.c
index a8e928fa9f..8ec9cadd79 100644
--- a/app/proc-info/main.c
+++ b/app/proc-info/main.c
@@ -1295,7 +1295,8 @@ show_mempool(char *name)
"\t  -- No cache align (%c)\n"
"\t  -- SP put (%c), SC get (%c)\n"
"\t  -- Pool created (%c)\n"
-   "\t  -- No IOVA config (%c)\n",
+   "\t  -- No IOVA config (%c)\n"
+   "\t  -- Not used for IO (%c)\n",
ptr->name,
ptr->socket_id,
(flags & MEMPOOL_F_NO_SPREAD) ? 'y' : 'n',
@@ -1303,7 +1304,8 @@ show_mempool(char *name)
(flags & MEMPOOL_F_SP_PUT) ? 'y' : 'n',
(flags & MEMPOOL_F_SC_GET) ? 'y' : 'n',
(flags & MEMPOOL_F_POOL_CREATED) ? 'y' : 'n',
-   (flags & MEMPOOL_F_NO_IOVA_CONTIG) ? 'y' : 'n');
+   (flags & MEMPOOL_F_NO_IOVA_CONTIG) ? 'y' : 'n',
+   (flags & MEMPOOL_F_NON_IO) ? 'y' : 'n');
printf("  - Size %u Cache %u element %u\n"
"  - header %u trailer %u\n"
"  - private data size %u\n",
diff --git a/app/test/test_mempool.c b/app/test/test_mempool.c
index c39c83256e..caf9c46a29 100644
--- a/app/test/test_mempool.c
+++ b/app/test/test_mempool.c
@@ -12,6 +12,7 @@
 #include 
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -729,6 +730,109 @@ test_mempool_events_safety(void)
 #pragma pop_macro("RTE_TEST_TRACE_FAILURE")
 }
 
+#pragma push_macro("RTE_TEST_TRACE_FAILURE")
+#undef RTE_TEST_TRACE_FAILURE
+#define RTE_TEST_TRACE_FAILURE(...) do { \
+   ret = TEST_FAILED; \
+   goto exit; \
+   } while (0)
+
+static int
+test_mempool_flag_non_io_set_when_no_iova_contig_set(void)
+{
+   struct rte_mempool *mp = NULL;
+   int ret;
+
+   mp = rte_mempool_create_empty("empty", MEMPOOL_SIZE,
+ MEMPOOL_ELT_SIZE, 0, 0,
+ SOCKET_ID_ANY, MEMPOOL_F_NO_IOVA_CONTIG);
+   RTE_TEST_ASSERT_NOT_NULL(mp, "Cannot create mempool: %s",
+rte_strerror(rte_errno));
+   rte_mempool_set_ops_byname(mp, rte_mbuf_best_mempool_ops(), NULL);
+   ret = rte_mempool_populate_default(mp);
+   RTE_TEST_ASSERT(ret > 0, "Failed to populate mempool: %s",
+   rte_strerror(rte_errno));
+   RTE_TEST_ASSERT(mp->flags & MEMPOOL_F_NON_IO,
+   "NON_IO flag is not set when NO_IOVA_CONTIG is set");
+   ret = TEST_SUCCESS;
+exit:
+   rte_mempool_free(mp);
+   return ret;
+}
+
+static int
+test_mempool_flag_non_io_unset_when_populated_with_valid_iova(void)
+{
+   const struct rte_memzone *mz;
+   void *virt;
+   rte_iova_t iova;
+   size_t page_size = RTE_PGSIZE_2M;
+   struct rte_mempool *mp;
+   int ret;
+
+   mz = rte_memzone_reserve("test_mempool", 3 * page_size, SOCKET_ID_ANY,
+RTE_MEMZONE_IOVA_CONTIG);
+   RTE_TEST_ASSERT_NOT_NULL(mz, "Cannot allocate memory");
+   virt = mz->addr;
+   iova = rte_mem_virt2iova(virt);
+   RTE_TEST_ASSERT_NOT_EQUAL(iova,  RTE_BAD_IOVA, "Cannot get IOVA");
+   mp = rte_mempool_create_empty("empty", MEMPOOL_SIZE,
+ MEMPOOL_ELT_SIZE, 0, 0,
+ SOCKET_ID_ANY, 0);
+   RTE_TEST_ASSERT_NOT_NULL(mp, "Cannot create mempool: %s",
+rte_strerror(rte_errno));
+
+   ret = rte_mempool_populate_iova(mp, RTE_PTR_ADD(virt, 1 * page_size),
+   RTE_BAD_IOVA, page_size, NULL, NULL);
+   RTE_TEST_ASSERT(ret > 0, "Failed to populate mempool: %s",
+   rte_strerror(rte_errno));
+   RTE_TEST_ASSERT(mp->flags & MEMPOOL_F_NON_IO,
+

[dpdk-dev] [PATCH v6 3/4] common/mlx5: add mempool registration facilities

2021-10-16 Thread Dmitry Kozlyuk
Add internal API to register mempools, that is, to create memory
regions (MR) for their memory and store them in a separate database.
Implementation deals with multi-process, so that class drivers don't
need to. Each protection domain has its own database. Memory regions
can be shared within a database if they represent a single hugepage
covering one or more mempools entirely.

Add internal API to lookup an MR key for an address that belongs
to a known mempool. It is a responsibility of a class driver
to extract the mempool from an mbuf.
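
For a direct mbuf that is simply mb->pool; for a cloned one the owning mbuf
has to be consulted, roughly as in this sketch (externally-attached MPRQ
buffers need extra handling not shown here):

static struct rte_mempool *
mbuf_to_mempool(struct rte_mbuf *mb)
{
	/* A cloned mbuf points into data owned by another (direct) mbuf. */
	if (RTE_MBUF_CLONED(mb))
		return rte_mbuf_from_indirect(mb)->pool;
	return mb->pool;
}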

Signed-off-by: Dmitry Kozlyuk 
Acked-by: Matan Azrad 
---
 drivers/common/mlx5/mlx5_common_mp.c |  50 +++
 drivers/common/mlx5/mlx5_common_mp.h |  14 +
 drivers/common/mlx5/mlx5_common_mr.c | 580 +++
 drivers/common/mlx5/mlx5_common_mr.h |  17 +
 drivers/common/mlx5/version.map  |   5 +
 5 files changed, 666 insertions(+)

diff --git a/drivers/common/mlx5/mlx5_common_mp.c 
b/drivers/common/mlx5/mlx5_common_mp.c
index 673a7c31de..6dfc5535e0 100644
--- a/drivers/common/mlx5/mlx5_common_mp.c
+++ b/drivers/common/mlx5/mlx5_common_mp.c
@@ -54,6 +54,56 @@ mlx5_mp_req_mr_create(struct mlx5_mp_id *mp_id, uintptr_t 
addr)
return ret;
 }
 
+/**
+ * @param mp_id
+ *   ID of the MP process.
+ * @param share_cache
+ *   Shared MR cache.
+ * @param pd
+ *   Protection domain.
+ * @param mempool
+ *   Mempool to register or unregister.
+ * @param reg
+ *   True to register the mempool, False to unregister.
+ */
+int
+mlx5_mp_req_mempool_reg(struct mlx5_mp_id *mp_id,
+   struct mlx5_mr_share_cache *share_cache, void *pd,
+   struct rte_mempool *mempool, bool reg)
+{
+   struct rte_mp_msg mp_req;
+   struct rte_mp_msg *mp_res;
+   struct rte_mp_reply mp_rep;
+   struct mlx5_mp_param *req = (struct mlx5_mp_param *)mp_req.param;
+   struct mlx5_mp_arg_mempool_reg *arg = &req->args.mempool_reg;
+   struct mlx5_mp_param *res;
+   struct timespec ts = {.tv_sec = MLX5_MP_REQ_TIMEOUT_SEC, .tv_nsec = 0};
+   enum mlx5_mp_req_type type;
+   int ret;
+
+   MLX5_ASSERT(rte_eal_process_type() == RTE_PROC_SECONDARY);
+   type = reg ? MLX5_MP_REQ_MEMPOOL_REGISTER :
+MLX5_MP_REQ_MEMPOOL_UNREGISTER;
+   mp_init_msg(mp_id, &mp_req, type);
+   arg->share_cache = share_cache;
+   arg->pd = pd;
+   arg->mempool = mempool;
+   ret = rte_mp_request_sync(&mp_req, &mp_rep, &ts);
+   if (ret) {
+   DRV_LOG(ERR, "port %u request to primary process failed",
+   mp_id->port_id);
+   return -rte_errno;
+   }
+   MLX5_ASSERT(mp_rep.nb_received == 1);
+   mp_res = &mp_rep.msgs[0];
+   res = (struct mlx5_mp_param *)mp_res->param;
+   ret = res->result;
+   if (ret)
+   rte_errno = -ret;
+   mlx5_free(mp_rep.msgs);
+   return ret;
+}
+
 /**
  * Request Verbs queue state modification to the primary process.
  *
diff --git a/drivers/common/mlx5/mlx5_common_mp.h 
b/drivers/common/mlx5/mlx5_common_mp.h
index 6829141fc7..527bf3cad8 100644
--- a/drivers/common/mlx5/mlx5_common_mp.h
+++ b/drivers/common/mlx5/mlx5_common_mp.h
@@ -14,6 +14,8 @@
 enum mlx5_mp_req_type {
MLX5_MP_REQ_VERBS_CMD_FD = 1,
MLX5_MP_REQ_CREATE_MR,
+   MLX5_MP_REQ_MEMPOOL_REGISTER,
+   MLX5_MP_REQ_MEMPOOL_UNREGISTER,
MLX5_MP_REQ_START_RXTX,
MLX5_MP_REQ_STOP_RXTX,
MLX5_MP_REQ_QUEUE_STATE_MODIFY,
@@ -33,6 +35,12 @@ struct mlx5_mp_arg_queue_id {
uint16_t queue_id; /* DPDK queue ID. */
 };
 
+struct mlx5_mp_arg_mempool_reg {
+   struct mlx5_mr_share_cache *share_cache;
+   void *pd; /* NULL for MLX5_MP_REQ_MEMPOOL_UNREGISTER */
+   struct rte_mempool *mempool;
+};
+
 /* Pameters for IPC. */
 struct mlx5_mp_param {
enum mlx5_mp_req_type type;
@@ -41,6 +49,8 @@ struct mlx5_mp_param {
RTE_STD_C11
union {
uintptr_t addr; /* MLX5_MP_REQ_CREATE_MR */
+   struct mlx5_mp_arg_mempool_reg mempool_reg;
+   /* MLX5_MP_REQ_MEMPOOL_(UN)REGISTER */
struct mlx5_mp_arg_queue_state_modify state_modify;
/* MLX5_MP_REQ_QUEUE_STATE_MODIFY */
struct mlx5_mp_arg_queue_id queue_id;
@@ -91,6 +101,10 @@ void mlx5_mp_uninit_secondary(const char *name);
 __rte_internal
 int mlx5_mp_req_mr_create(struct mlx5_mp_id *mp_id, uintptr_t addr);
 __rte_internal
+int mlx5_mp_req_mempool_reg(struct mlx5_mp_id *mp_id,
+   struct mlx5_mr_share_cache *share_cache, void *pd,
+   struct rte_mempool *mempool, bool reg);
+__rte_internal
 int mlx5_mp_req_queue_state_modify(struct mlx5_mp_id *mp_id,
   struct mlx5_mp_arg_queue_state_modify *sm);
 __rte_internal
diff --git a/drivers/common/mlx5/mlx5_common_mr.c 
b/drivers/common/mlx5/mlx5_common_mr.c
index 98fe8698e2..2e039a4e70 100644
--- a/drivers/common/mlx5/ml

[dpdk-dev] [PATCH v6 4/4] net/mlx5: support mempool registration

2021-10-16 Thread Dmitry Kozlyuk
When the first port in a given protection domain (PD) starts,
install a mempool event callback for this PD and register all existing
memory regions (MR) for it. When the last port in a PD closes,
remove the callback and unregister all mempools for this PD.
This behavior can be switched off with a new devarg: mr_mempool_reg_en.

On TX slow path, i.e. when an MR key for the address of the buffer
to send is not in the local cache, first try to retrieve it from
the database of registered mempools. Supported are direct and indirect
mbufs, as well as externally-attached ones from MLX5 MPRQ feature.
Lookup in the database of non-mempool memory is used as the last resort.

RX mempools are registered regardless of the devarg value.
On RX data path only the local cache and the mempool database is used.
If implicit mempool registration is disabled, these mempools
are unregistered at port stop, releasing the MRs.
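
The resulting Tx slow-path lookup order can be summarized as follows
(sketch with simplified, hypothetical helper names that do not match the
driver internals; UINT32_MAX stands for "not found"):

uint32_t local_cache_lookup(void *txq, uintptr_t addr);
uint32_t mempool_db_lookup(void *txq, struct rte_mempool *mp, uintptr_t addr);
uint32_t non_mempool_db_lookup(void *txq, uintptr_t addr);

static uint32_t
tx_addr_to_lkey(void *txq, struct rte_mbuf *mb, uintptr_t addr)
{
	uint32_t lkey;

	lkey = local_cache_lookup(txq, addr);          /* per-queue MR cache */
	if (lkey != UINT32_MAX)
		return lkey;
	lkey = mempool_db_lookup(txq, mb->pool, addr); /* registered pools */
	if (lkey != UINT32_MAX)
		return lkey;
	return non_mempool_db_lookup(txq, addr);       /* last resort */
}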

Signed-off-by: Dmitry Kozlyuk 
Acked-by: Matan Azrad 
---
 doc/guides/nics/mlx5.rst   |  13 +++
 doc/guides/rel_notes/release_21_11.rst |   6 +
 drivers/net/mlx5/linux/mlx5_mp_os.c|  44 +++
 drivers/net/mlx5/linux/mlx5_os.c   |   4 +-
 drivers/net/mlx5/mlx5.c| 152 +
 drivers/net/mlx5/mlx5.h|  10 ++
 drivers/net/mlx5/mlx5_mr.c | 120 +--
 drivers/net/mlx5/mlx5_mr.h |   2 -
 drivers/net/mlx5/mlx5_rx.h |  21 ++--
 drivers/net/mlx5/mlx5_rxq.c|  13 +++
 drivers/net/mlx5/mlx5_trigger.c|  77 +++--
 drivers/net/mlx5/windows/mlx5_os.c |   1 +
 12 files changed, 347 insertions(+), 116 deletions(-)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index bae73f42d8..106e32e1c4 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -1001,6 +1001,19 @@ Driver options
 
   Enabled by default.
 
+- ``mr_mempool_reg_en`` parameter [int]
+
+  A nonzero value enables implicit registration of DMA memory of all mempools
+  except those having ``MEMPOOL_F_NON_IO``. This flag is set automatically
+  for mempools populated with non-contiguous objects or those without IOVA.
+  The effect is that when a packet from a mempool is transmitted,
+  its memory is already registered for DMA in the PMD and no registration
+  will happen on the data path. The tradeoff is extra work on the creation
+  of each mempool and increased HW resource use if some mempools
+  are not used with MLX5 devices.
+
+  Enabled by default.
+
 - ``representor`` parameter [list]
 
   This parameter can be used to instantiate DPDK Ethernet devices from
diff --git a/doc/guides/rel_notes/release_21_11.rst 
b/doc/guides/rel_notes/release_21_11.rst
index 39a8a3d950..f141999a0d 100644
--- a/doc/guides/rel_notes/release_21_11.rst
+++ b/doc/guides/rel_notes/release_21_11.rst
@@ -159,6 +159,12 @@ New Features
   * Added tests to verify tunnel header verification in IPsec inbound.
   * Added tests to verify inner checksum.
 
+* **Updated Mellanox mlx5 driver.**
+
+  Updated the Mellanox mlx5 driver with new features and improvements, 
including:
+
+  * Added implicit mempool registration to avoid data path hiccups (opt-out).
+
 
 Removed Items
 -
diff --git a/drivers/net/mlx5/linux/mlx5_mp_os.c 
b/drivers/net/mlx5/linux/mlx5_mp_os.c
index 3a4aa766f8..d2ac375a47 100644
--- a/drivers/net/mlx5/linux/mlx5_mp_os.c
+++ b/drivers/net/mlx5/linux/mlx5_mp_os.c
@@ -20,6 +20,45 @@
 #include "mlx5_tx.h"
 #include "mlx5_utils.h"
 
+/**
+ * Handle a port-agnostic message.
+ *
+ * @return
+ *   0 on success, 1 when message is not port-agnostic, (-1) on error.
+ */
+static int
+mlx5_mp_os_handle_port_agnostic(const struct rte_mp_msg *mp_msg,
+   const void *peer)
+{
+   struct rte_mp_msg mp_res;
+   struct mlx5_mp_param *res = (struct mlx5_mp_param *)mp_res.param;
+   const struct mlx5_mp_param *param =
+   (const struct mlx5_mp_param *)mp_msg->param;
+   const struct mlx5_mp_arg_mempool_reg *mpr;
+   struct mlx5_mp_id mp_id;
+
+   switch (param->type) {
+   case MLX5_MP_REQ_MEMPOOL_REGISTER:
+   mlx5_mp_id_init(&mp_id, param->port_id);
+   mp_init_msg(&mp_id, &mp_res, param->type);
+   mpr = &param->args.mempool_reg;
+   res->result = mlx5_mr_mempool_register(mpr->share_cache,
+  mpr->pd, mpr->mempool,
+  NULL);
+   return rte_mp_reply(&mp_res, peer);
+   case MLX5_MP_REQ_MEMPOOL_UNREGISTER:
+   mlx5_mp_id_init(&mp_id, param->port_id);
+   mp_init_msg(&mp_id, &mp_res, param->type);
+   mpr = &param->args.mempool_reg;
+   res->result = mlx5_mr_mempool_unregister(mpr->share_cache,
+mpr->mempool, NULL);
+   return rte_mp_reply(&mp_res, peer);
+  

Re: [dpdk-dev] [PATCH 2/5] ethdev: add capability to keep shared objects on restart

2021-10-16 Thread Dmitry Kozlyuk
> -Original Message-
> From: Ferruh Yigit 
> Sent: 15 октября 2021 г. 19:27
> To: Dmitry Kozlyuk ; dev@dpdk.org; Andrew Rybchenko
> ; Ori Kam ; Raslan
> Darawsheh 
> Cc: NBU-Contact-Thomas Monjalon ; Qi Zhang
> ; jer...@marvell.com; Maxime Coquelin
> 
> Subject: Re: [PATCH 2/5] ethdev: add capability to keep shared objects on
> restart
> 
> External email: Use caution opening links or attachments
> 
> 
> On 10/15/2021 1:35 PM, Dmitry Kozlyuk wrote:
> >> -Original Message-
> >> From: Ferruh Yigit 
> >> [...]
> >>> Introducing UNKNOWN state seems wrong to me.
> >>> What should an application do when it is reported?
> >>> Now there's just no way to learn how the PMD behaves,
> >>> but if it provides a response, it can't be "I don't know what I do".
> >>>
> >>
> >> I agree 'unknown' state is not ideal, but my intention is to prevent
> >> drivers that have not implemented this new feature from reporting a wrong
> >> capability.
> >>
> >> Without capability, application already doesn't know how underlying
> >> PMD behaves, so this is by default 'unknown' state.
> >> I suggest keeping that state until driver explicitly updates its state
> >> to the correct value.
> >
> > My concern is that when all the drivers are changed to report a proper
> > capability, UNKNOWN remains in the API meaning "there's a bug in DPDK".
> >
> 
> When all drivers are changed, of course we can remove the 'unknown' flag.
> 
> > Instead of UNKNOWN response we can declare that rte_flow_flush()
> > must be called unless the application wants to keep the rules
> > and has made sure it's possible, or the behavior is undefined.
> > (Can be viewed as "UNKNOWN by default", but is simpler.)
> > This way neither UNKNOWN state is needed,
> > nor the bit saying the flow rules are flushed.
> > Here is why, let's consider KEEP and FLUSH combinations:
> >
> > (1) FLUSH=0, KEEP=0 is equivalent to UNKNOWN, i.e. the application
> >  must explicitly flush the rules itself
> >  in order to get deterministic behavior.
> > (2) FLUSH=1, KEEP=0 means PMD flushes all rules on the device stop.
> > (3) FLUSH=0, KEEP=1 means PMD can keep at least some rules,
> >  exact support must be checked with
> rte_flow_create()
> >  when the device is stopped.
> > (4) FLUSH=1, KEEP=1 is forbidden.
> >
> 
> What is 'FLUSH' here? Are you proposing a new capability?
> 
> > If the application doesn't need the PMD to keep flow rules,
> > it can as well flush them always before the device stop
> > regardless of whether the driver does it automatically or not.
> > It's even simpler and probably as efficient. Testpmd does this.
> > If the application wants to take advantage of rule-keeping ability,
> > it just tests the KEEP bit. If it is unset that's the previous case,
> > application should call rte_flow_flush() before the device stop to be
> sure.
> > Otherwise, the application can test capability to keep flow rule kinds
> > it is interested in (see my reply to Andrew).
> >
> 
> Overall this is an optimization, application can workaround without this
> capability.
> 
> If a driver doesn't set the KEEP capability, it is not clear what it
> means: the driver doesn't keep rules, or the driver is not updated yet.
> I suggest updating the comment to clarify the meaning of the missing KEEP
> flag.
> 
> And unless we have two explicit status flags, the application can never be
> sure that the driver doesn't keep rules after stop. I don't know if
> the application wants to know this.
> 
> The other concern is how PMD maintainers will know that there is something
> to update here. I am sure many driver maintainers won't even be aware of
> this; your patch doesn't even cc them. Your approach feels like you are
> thinking only of a single PMD and ignoring the rest.
> 
> My intention was to have a way to follow drivers that is not updated,
> by marking them with UNKNOWN flag. But this also doesn't work with new
> drivers, they may forget setting capability.
> 
> 
> What about following:
> 1) Clarify KEEP flag meaning:
> having KEEP: flow rules are kept after stop
> missing KEEP: unknown behavior
> 
> 2) Mark all PMDs with useless flag:
> dev_capa &= ~KEEP
> Maintainer can remove or update this later, and we can easily track it.

Item 1) is almost what I did in v2. The difference (or clarification) is that
if the bit is set, it doesn't mean that all rules are kept.
It allows the PMD to not support keeping some kinds of rules.
Please see the doc update about how the kind is defined
and how the application can test what is unsupported.

This complication is needed so that if a PMD cannot keep some exotic kind of 
rules,
it is not forced to remove the capability completely,
blocking optimizations even if the application doesn't use problematic rule 
kinds.
It makes the capability future-proof.
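
In application terms, the probing scheme could look like this sketch
(RTE_ETH_DEV_CAPA_FLOW_RULE_KEEP stands for the capability bit proposed by
this series; the final name is per the v2 patch):

#include <stdbool.h>
#include <rte_ethdev.h>
#include <rte_flow.h>

static bool
can_keep_rule(uint16_t port_id, const struct rte_flow_attr *attr,
	      const struct rte_flow_item pattern[],
	      const struct rte_flow_action actions[])
{
	struct rte_eth_dev_info info;
	struct rte_flow_error err;
	struct rte_flow *flow;

	if (rte_eth_dev_info_get(port_id, &info) != 0 ||
	    (info.dev_capa & RTE_ETH_DEV_CAPA_FLOW_RULE_KEEP) == 0)
		return false;
	/* The port is assumed to be stopped at this point. */
	flow = rte_flow_create(port_id, attr, pattern, actions, &err);
	if (flow == NULL)
		return false;
	rte_flow_destroy(port_id, flow, &err);
	return true;
}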

The second flag (FLUSH) would not be of much help.
Consider it is not set, but the PMD can keep some kinds of rules.
The application still needs to test all the kinds it 

Re: [dpdk-dev] [PATCH v7 1/5] ethdev: introduce shared Rx queue

2021-10-16 Thread Ajit Khaparde
On Sat, Oct 16, 2021 at 1:43 AM Xueming Li  wrote:
>
> In the current DPDK framework, each Rx queue is pre-loaded with mbufs to
> save incoming packets. For some PMDs, when the number of representors
> scales out in a switch domain, the memory consumption becomes significant.
> Polling all ports also leads to high cache miss, high latency and low
> throughput.
>
> This patch introduces shared Rx queue. Ports in the same Rx domain and
> switch domain can share an Rx queue set by specifying a non-zero sharing
> group in the Rx queue configuration.
>
> Port A RxQ X can share RxQ with Port B RxQ X, but can't share with RxQ
> Y. All member ports in share group share a list of shared Rx queue
> indexed by Rx queue ID.
>
> No special API is defined to receive packets from shared Rx queue.
> Polling any member port of a shared Rx queue receives packets of that
> queue for all member ports, source port is identified by mbuf->port.
Is this port the physical port which received the packet?
Or does this port number correlate with the port_id seen by the application?



>
> Shared Rx queue must be polled in same thread or core, polling a queue
> ID of any member port is essentially same.
So it is up to the application to poll the queue of any member port, or
all ports, or a designated port to handle Rx?

>
> Multiple share groups are supported. Device should support mixed
> configuration by allowing multiple share groups and non-shared Rx queue.
>
> Example grouping and polling model to reflect service priority:
>  Group1, 2 shared Rx queues per port: PF, rep0, rep1
>  Group2, 1 shared Rx queue per port: rep2, rep3, ... rep127
>  Core0: poll PF queue0
>  Core1: poll PF queue1
>  Core2: poll rep2 queue0
>
> PMD advertise shared Rx queue capability via RTE_ETH_DEV_CAPA_RXQ_SHARE.
>
> PMD is responsible for shared Rx queue consistency checks to avoid
> member port's configuration contradict to each other.
>
> Signed-off-by: Xueming Li 
> ---
>  doc/guides/nics/features.rst  | 13 
>  doc/guides/nics/features/default.ini  |  1 +
>  .../prog_guide/switch_representation.rst  | 10 +
>  doc/guides/rel_notes/release_21_11.rst|  6 ++
>  lib/ethdev/rte_ethdev.c   |  8 +++
>  lib/ethdev/rte_ethdev.h   | 21 +++
>  6 files changed, 59 insertions(+)
>
> diff --git a/doc/guides/nics/features.rst b/doc/guides/nics/features.rst
> index e346018e4b8..b64433b8ea5 100644
> --- a/doc/guides/nics/features.rst
> +++ b/doc/guides/nics/features.rst
> @@ -615,6 +615,19 @@ Supports inner packet L4 checksum.
>``tx_offload_capa,tx_queue_offload_capa:DEV_TX_OFFLOAD_OUTER_UDP_CKSUM``.
>
>
> +.. _nic_features_shared_rx_queue:
> +
> +Shared Rx queue
> +---------------
> +
> +Supports shared Rx queue for ports in same Rx domain of a switch domain.
> +
> +* **[uses] rte_eth_dev_info**: ``dev_capa:RTE_ETH_DEV_CAPA_RXQ_SHARE``.
> +* **[uses] rte_eth_dev_info,rte_eth_switch_info**: ``rx_domain``, ``domain_id``.
> +* **[uses] rte_eth_rxconf**: ``share_group``.
> +* **[provides] mbuf**: ``mbuf.port``.
> +
> +
>  .. _nic_features_packet_type_parsing:
>
>  Packet type parsing
> diff --git a/doc/guides/nics/features/default.ini 
> b/doc/guides/nics/features/default.ini
> index d473b94091a..93f5d1b46f4 100644
> --- a/doc/guides/nics/features/default.ini
> +++ b/doc/guides/nics/features/default.ini
> @@ -19,6 +19,7 @@ Free Tx mbuf on demand =
>  Queue start/stop =
>  Runtime Rx queue setup =
>  Runtime Tx queue setup =
> +Shared Rx queue  =
>  Burst mode info  =
>  Power mgmt address monitor =
>  MTU update   =
> diff --git a/doc/guides/prog_guide/switch_representation.rst 
> b/doc/guides/prog_guide/switch_representation.rst
> index ff6aa91c806..de41db8385d 100644
> --- a/doc/guides/prog_guide/switch_representation.rst
> +++ b/doc/guides/prog_guide/switch_representation.rst
> @@ -123,6 +123,16 @@ thought as a software "patch panel" front-end for 
> applications.
>  .. [1] `Ethernet switch device driver model (switchdev)
> `_
>
> +- For some PMDs, memory usage of representors is huge when the number of
> +  representors grows, since mbufs are allocated for each descriptor of every
> +  Rx queue. Polling a large number of ports brings more CPU load, cache
> +  misses and latency. A shared Rx queue can be used to share an Rx queue
> +  between the PF and representors in the same Rx domain.
> +  ``RTE_ETH_DEV_CAPA_RXQ_SHARE`` is present in the device capabilities of
> +  the device info. Set a non-zero share group in the Rx queue configuration
> +  to enable sharing. Polling any member port can receive packets of all
> +  member ports in the group; the port ID is saved in ``mbuf.port``.
> +
>  Basic SR-IOV
>  
>
> diff --git a/doc/guides/rel_notes/release_21_11.rst 
> b/doc/guides/rel_notes/release_21_11.rst
> index 4c56cdfeaaa..1c84e896554 100644
> --- a/doc/guides/rel_notes/release_21_11.rst
> +++ b
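
As a reading aid for the polling model described above, a minimal sketch of
one lcore polling a single member port of a share group and demultiplexing
by mbuf->port; handle_pkt() is a hypothetical per-source-port handler:

    #include <rte_ethdev.h>
    #include <rte_mbuf.h>

    #define BURST_SIZE 32

    static void handle_pkt(uint16_t src_port, struct rte_mbuf *m); /* hypothetical */

    /* Polling any member port of a shared Rx queue returns packets of all
     * member ports in the group; mbuf->port identifies the source port. */
    static void
    poll_shared_rxq(uint16_t member_port, uint16_t queue_id)
    {
        struct rte_mbuf *pkts[BURST_SIZE];
        uint16_t nb, i;

        nb = rte_eth_rx_burst(member_port, queue_id, pkts, BURST_SIZE);
        for (i = 0; i < nb; i++)
            handle_pkt(pkts[i]->port, pkts[i]);
    }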

Re: [dpdk-dev] [PATCH v4 14/14] eventdev: mark trace variables as internal

2021-10-16 Thread Jerin Jacob
On Sat, Oct 16, 2021 at 12:34 AM  wrote:
>
> From: Pavan Nikhilesh 
>
> Mark rte_trace global variables as internal, i.e. remove them
> from the experimental section of the version map.
> Some of them are used in inline APIs; mark those as global.
>
> Signed-off-by: Pavan Nikhilesh 
> Acked-by: Ray Kinsella 
> ---
>  doc/guides/rel_notes/release_21_11.rst | 12 +
>  lib/eventdev/version.map   | 71 --
>  2 files changed, 44 insertions(+), 39 deletions(-)
>
> diff --git a/doc/guides/rel_notes/release_21_11.rst 
> b/doc/guides/rel_notes/release_21_11.rst
> index 38e601c236..5b4a05c3ae 100644
> --- a/doc/guides/rel_notes/release_21_11.rst
> +++ b/doc/guides/rel_notes/release_21_11.rst
> @@ -226,6 +226,9 @@ API Changes
>the crypto/security operation. This field will be used to communicate
>events such as soft expiry with IPsec in lookaside mode.
>
> +* eventdev: Event vector configuration APIs have been made stable.
> +  Move memory used by timer adapters to hugepages. This will prevent TLB
> +  misses, if any, and aligns with the memory structure of other subsystems.
>
>  ABI Changes
>  -----------
> @@ -277,6 +280,15 @@ ABI Changes
>were added in structure ``rte_event_eth_rx_adapter_stats`` to get 
> additional
>status.
>
> +* eventdev: A new structure ``rte_event_fp_ops`` has been added which is
> +  now used by the fastpath inline functions. The structures ``rte_eventdev``,
> +  ``rte_eventdev_data`` have been made internal. ``rte_eventdevs[]`` can't be
> +  accessed directly by the user any more. This change is transparent to both
> +  applications and PMDs.
> +
> +* eventdev: Re-arrange fields in ``rte_event_timer`` to remove holes.
> +  ``rte_event_timer_adapter_pmd.h`` has been made internal.

Looks good. Please fix the following. If there are no objections, I
will merge the next version.

1) Please move the doc update to respective patches
2) The following checkpatch issue
[for-main]dell[dpdk-next-eventdev] $ ./devtools/checkpatches.sh -n 14

### eventdev: move inline APIs into separate structure

INFO: symbol event_dev_fp_ops_reset has been added to the INTERNAL
section of the version map
INFO: symbol event_dev_fp_ops_set has been added to the INTERNAL
section of the version map
INFO: symbol event_dev_probing_finish has been added to the INTERNAL
section of the version map
ERROR: symbol rte_event_fp_ops is added in the DPDK_22 section, but is
expected to be added in the EXPERIMENTAL section of the version map
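
For reference, the checkpatch expectation above corresponds to a version.map
layout along these lines (a sketch only; unrelated symbols are omitted):

    EXPERIMENTAL {
            global:

            # added in 21.11
            rte_event_fp_ops;
    };

    INTERNAL {
            global:

            event_dev_fp_ops_reset;
            event_dev_fp_ops_set;
            event_dev_probing_finish;
    };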


[dpdk-dev] [PATCH v11 0/8] baseband: add NXP LA12xx driver

2021-10-16 Thread nipun . gupta
From: Nipun Gupta 

This series introduces the BBDEV LA12xx poll mode driver (PMD), which
supports offloading high-PHY processing functions such as the LDPC
encode/decode 5GNR wireless acceleration function, using the PCI-based
LA12xx software defined radio.

Please check the documentation patch for more info.

The driver currently implements the basic feature set, offloading only 5G LDPC
encode/decode.

A new capability has been added to check if the driver can support the
input data in little/big endian byte order.

v2: add test case changes
v3: fix 32 bit compilation
v4: capability for network byte order, doc patch merged inline.
v5: add llr_size and llr_decimals, removed LLR compression flag,
update testbbdev to handle endianness, rebased on top of 20.08
v6: added BE as device info instead of capability, updated test
to have 2 codeblocks
v7: fixed checkpatch errors
v8: remove additional test vectors, update reverse_op function name,
make be_support param as bool, other minor changes in la12xx driver
v9: add little endianness capability as well (patch by Nicolas Chautru),
fix 32 bit (i386) compilation, fix get of nb_segs, add endianness
info in testbbdev doc.
v10: use default RTE_BIG_ENDIAN/RTE_LITTLE_ENDIAN defined, add
 data_endianness info for BBDEV null device
v11: split la12xx doc in separate patches and fixed some nits

Hemant Agrawal (5):
  baseband/la12xx: add devargs for max queues
  baseband/la12xx: add support for multiple modems
  baseband/la12xx: add queue and modem config support
  baseband/la12xx: add enqueue and dequeue support
  app/bbdev: enable la12xx for bbdev

Nicolas Chautru (1):
  bbdev: add device info related to data endianness

Nipun Gupta (2):
  baseband: introduce NXP LA12xx driver
  app/bbdev: handle endianness of test data

 MAINTAINERS   |   10 +
 app/test-bbdev/meson.build|3 +
 app/test-bbdev/test_bbdev_perf.c  |   43 +
 doc/guides/bbdevs/features/la12xx.ini |   13 +
 doc/guides/bbdevs/index.rst   |1 +
 doc/guides/bbdevs/la12xx.rst  |  124 ++
 doc/guides/rel_notes/release_21_11.rst|6 +
 doc/guides/tools/testbbdev.rst|3 +
 drivers/baseband/acc100/rte_acc100_pmd.c  |1 +
 .../fpga_5gnr_fec/rte_fpga_5gnr_fec.c |1 +
 drivers/baseband/fpga_lte_fec/fpga_lte_fec.c  |1 +
 drivers/baseband/la12xx/bbdev_la12xx.c| 1104 +
 drivers/baseband/la12xx/bbdev_la12xx.h|   51 +
 drivers/baseband/la12xx/bbdev_la12xx_ipc.h|  244 
 .../baseband/la12xx/bbdev_la12xx_pmd_logs.h   |   28 +
 drivers/baseband/la12xx/meson.build   |6 +
 drivers/baseband/la12xx/version.map   |3 +
 drivers/baseband/meson.build  |1 +
 drivers/baseband/null/bbdev_null.c|6 +
 .../baseband/turbo_sw/bbdev_turbo_software.c  |1 +
 lib/bbdev/rte_bbdev.h |4 +
 21 files changed, 1654 insertions(+)
 create mode 100644 doc/guides/bbdevs/features/la12xx.ini
 create mode 100644 doc/guides/bbdevs/la12xx.rst
 create mode 100644 drivers/baseband/la12xx/bbdev_la12xx.c
 create mode 100644 drivers/baseband/la12xx/bbdev_la12xx.h
 create mode 100644 drivers/baseband/la12xx/bbdev_la12xx_ipc.h
 create mode 100644 drivers/baseband/la12xx/bbdev_la12xx_pmd_logs.h
 create mode 100644 drivers/baseband/la12xx/meson.build
 create mode 100644 drivers/baseband/la12xx/version.map

-- 
2.17.1
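
As background for the capability-driven design mentioned in the cover letter,
a sketch of the standard rte_bbdev capability walk an application can use to
discover LDPC encode support (the driver capability list is terminated by
RTE_BBDEV_OP_NONE, similar to what testbbdev's check_dev_cap() does):

    #include <stdbool.h>
    #include <rte_bbdev.h>
    #include <rte_bbdev_op.h>

    static bool
    dev_supports_ldpc_enc(uint16_t dev_id)
    {
        struct rte_bbdev_info info;
        const struct rte_bbdev_op_cap *cap;

        if (rte_bbdev_info_get(dev_id, &info) != 0)
            return false;
        /* Walk the driver's capability array until the terminator. */
        for (cap = info.drv.capabilities; cap->type != RTE_BBDEV_OP_NONE; cap++)
            if (cap->type == RTE_BBDEV_OP_LDPC_ENC)
                return true;
        return false;
    }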



[dpdk-dev] [PATCH v11 1/8] bbdev: add device info related to data endianness

2021-10-16 Thread nipun . gupta
From: Nicolas Chautru 

Add device information to capture explicitly the assumed byte
endianness of the input/output data being processed.

Signed-off-by: Nicolas Chautru 
Signed-off-by: Nipun Gupta 
---
 doc/guides/rel_notes/release_21_11.rst | 1 +
 drivers/baseband/acc100/rte_acc100_pmd.c   | 1 +
 drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c | 1 +
 drivers/baseband/fpga_lte_fec/fpga_lte_fec.c   | 1 +
 drivers/baseband/null/bbdev_null.c | 6 ++
 drivers/baseband/turbo_sw/bbdev_turbo_software.c   | 1 +
 lib/bbdev/rte_bbdev.h  | 4 
 7 files changed, 15 insertions(+)

diff --git a/doc/guides/rel_notes/release_21_11.rst 
b/doc/guides/rel_notes/release_21_11.rst
index 4c56cdfeaa..957bd78d61 100644
--- a/doc/guides/rel_notes/release_21_11.rst
+++ b/doc/guides/rel_notes/release_21_11.rst
@@ -229,6 +229,7 @@ API Changes
   the crypto/security operation. This field will be used to communicate
   events such as soft expiry with IPsec in lookaside mode.
 
+* bbdev: Added device info related to data byte endianness processing.
 
 ABI Changes
 -----------
diff --git a/drivers/baseband/acc100/rte_acc100_pmd.c 
b/drivers/baseband/acc100/rte_acc100_pmd.c
index 4e2feefc3c..05fe6f8b6f 100644
--- a/drivers/baseband/acc100/rte_acc100_pmd.c
+++ b/drivers/baseband/acc100/rte_acc100_pmd.c
@@ -1089,6 +1089,7 @@ acc100_dev_info_get(struct rte_bbdev *dev,
 #else
dev_info->harq_buffer_size = 0;
 #endif
+   dev_info->data_endianness = RTE_LITTLE_ENDIAN;
acc100_check_ir(d);
 }
 
diff --git a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c 
b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
index 6485cc824a..ee457f3071 100644
--- a/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
+++ b/drivers/baseband/fpga_5gnr_fec/rte_fpga_5gnr_fec.c
@@ -372,6 +372,7 @@ fpga_dev_info_get(struct rte_bbdev *dev,
dev_info->default_queue_conf = default_queue_conf;
dev_info->capabilities = bbdev_capabilities;
dev_info->cpu_flag_reqs = NULL;
+   dev_info->data_endianness = RTE_LITTLE_ENDIAN;
 
/* Calculates number of queues assigned to device */
dev_info->max_num_queues = 0;
diff --git a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c 
b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
index 350c4248eb..703bb611a0 100644
--- a/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
+++ b/drivers/baseband/fpga_lte_fec/fpga_lte_fec.c
@@ -644,6 +644,7 @@ fpga_dev_info_get(struct rte_bbdev *dev,
dev_info->default_queue_conf = default_queue_conf;
dev_info->capabilities = bbdev_capabilities;
dev_info->cpu_flag_reqs = NULL;
+   dev_info->data_endianness = RTE_LITTLE_ENDIAN;
 
/* Calculates number of queues assigned to device */
dev_info->max_num_queues = 0;
diff --git a/drivers/baseband/null/bbdev_null.c 
b/drivers/baseband/null/bbdev_null.c
index 53c538ba44..753d920e18 100644
--- a/drivers/baseband/null/bbdev_null.c
+++ b/drivers/baseband/null/bbdev_null.c
@@ -77,6 +77,12 @@ info_get(struct rte_bbdev *dev, struct rte_bbdev_driver_info 
*dev_info)
dev_info->cpu_flag_reqs = NULL;
dev_info->min_alignment = 0;
 
+   /* BBDEV null device does not process the data, so
+* endianness setting is not relevant, but setting it
+* here for code completeness.
+*/
+   dev_info->data_endianness = RTE_LITTLE_ENDIAN;
+
rte_bbdev_log_debug("got device info from %u", dev->data->dev_id);
 }
 
diff --git a/drivers/baseband/turbo_sw/bbdev_turbo_software.c 
b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
index e1db2bf205..b234bb751a 100644
--- a/drivers/baseband/turbo_sw/bbdev_turbo_software.c
+++ b/drivers/baseband/turbo_sw/bbdev_turbo_software.c
@@ -253,6 +253,7 @@ info_get(struct rte_bbdev *dev, struct 
rte_bbdev_driver_info *dev_info)
dev_info->capabilities = bbdev_capabilities;
dev_info->min_alignment = 64;
dev_info->harq_buffer_size = 0;
+   dev_info->data_endianness = RTE_LITTLE_ENDIAN;
 
rte_bbdev_log_debug("got device info from %u\n", dev->data->dev_id);
 }
diff --git a/lib/bbdev/rte_bbdev.h b/lib/bbdev/rte_bbdev.h
index 3ebf62e697..e863bd913f 100644
--- a/lib/bbdev/rte_bbdev.h
+++ b/lib/bbdev/rte_bbdev.h
@@ -309,6 +309,10 @@ struct rte_bbdev_driver_info {
uint16_t min_alignment;
/** HARQ memory available in kB */
uint32_t harq_buffer_size;
+   /** Byte endianness (RTE_BIG_ENDIAN/RTE_LITTLE_ENDIAN) supported
+*  for input/output data
+*/
+   uint8_t data_endianness;
/** Default queue configuration used if none is supplied  */
struct rte_bbdev_queue_conf default_queue_conf;
/** Device operation capabilities */
-- 
2.17.1
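
A minimal sketch of how an application could consume the new field through
the existing rte_bbdev_info_get() to decide whether byte swapping is needed
(field name as added by this patch):

    #include <stdbool.h>
    #include <rte_bbdev.h>
    #include <rte_byteorder.h>

    /* True when the device's expected data byte order matches the CPU's,
     * i.e. input/output buffers need no conversion. */
    static bool
    bbdev_data_is_native_endian(uint16_t dev_id)
    {
        struct rte_bbdev_info info;

        if (rte_bbdev_info_get(dev_id, &info) != 0)
            return true; /* assume native if the query fails */
    #if RTE_BYTE_ORDER == RTE_BIG_ENDIAN
        return info.drv.data_endianness == RTE_BIG_ENDIAN;
    #else
        return info.drv.data_endianness == RTE_LITTLE_ENDIAN;
    #endif
    }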



[dpdk-dev] [PATCH v11 2/8] baseband: introduce NXP LA12xx driver

2021-10-16 Thread nipun . gupta
From: Nipun Gupta 

This patch introduces the baseband device driver for NXP's
LA1200 series software defined baseband modem.

Signed-off-by: Nipun Gupta 
Signed-off-by: Hemant Agrawal 
---
 MAINTAINERS   |  10 ++
 doc/guides/bbdevs/index.rst   |   1 +
 doc/guides/bbdevs/la12xx.rst  |  71 
 drivers/baseband/la12xx/bbdev_la12xx.c| 108 ++
 .../baseband/la12xx/bbdev_la12xx_pmd_logs.h   |  28 +
 drivers/baseband/la12xx/meson.build   |   6 +
 drivers/baseband/la12xx/version.map   |   3 +
 drivers/baseband/meson.build  |   1 +
 8 files changed, 228 insertions(+)
 create mode 100644 doc/guides/bbdevs/la12xx.rst
 create mode 100644 drivers/baseband/la12xx/bbdev_la12xx.c
 create mode 100644 drivers/baseband/la12xx/bbdev_la12xx_pmd_logs.h
 create mode 100644 drivers/baseband/la12xx/meson.build
 create mode 100644 drivers/baseband/la12xx/version.map

diff --git a/MAINTAINERS b/MAINTAINERS
index ed8becce85..ff632479c5 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1289,6 +1289,16 @@ F: drivers/event/opdl/
 F: doc/guides/eventdevs/opdl.rst
 
 
+Baseband Drivers
+----------------
+
+NXP LA12xx driver
+M: Nipun Gupta 
+M: Hemant Agrawal 
+F: drivers/baseband/la12xx/
+F: doc/guides/bbdevs/la12xx.rst
+
+
 Rawdev Drivers
 --------------
 
diff --git a/doc/guides/bbdevs/index.rst b/doc/guides/bbdevs/index.rst
index 4445cbd1b0..cedd706fa6 100644
--- a/doc/guides/bbdevs/index.rst
+++ b/doc/guides/bbdevs/index.rst
@@ -14,3 +14,4 @@ Baseband Device Drivers
 fpga_lte_fec
 fpga_5gnr_fec
 acc100
+la12xx
diff --git a/doc/guides/bbdevs/la12xx.rst b/doc/guides/bbdevs/la12xx.rst
new file mode 100644
index 00..9ac6f0a0cd
--- /dev/null
+++ b/doc/guides/bbdevs/la12xx.rst
@@ -0,0 +1,71 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+Copyright 2021 NXP
+
+NXP LA12xx Poll Mode Driver
+===========================
+
+The BBDEV LA12xx poll mode driver (PMD) supports an implementation for
+offloading High Phy processing functions like LDPC Encode / Decode 5GNR 
wireless
+acceleration function, using PCI based LA12xx Software defined radio.
+
+More information can be found at `NXP Official Website
+`_.
+
+Features
+--------
+
+LA12xx PMD supports the following features:
+
+- Maximum of 8 LDPC decode (UL) queues
+- Maximum of 8 LDPC encode (DL) queues
+- PCIe Gen-3 x8 Interface
+
+Installation
+------------
+
+Section 3 of the DPDK manual provides instructions on installing and compiling DPDK.
+
+DPDK requires hugepages to be configured as detailed in section 2 of the DPDK manual.
+
+Initialization
+--------------
+
+The device can be listed on the host console with:
+
+
+Use the following lspci command to get the multiple LA12xx processor ids. The
+device ID of the LA12xx baseband processor is "1c30".
+
+.. code-block:: console
+
+  sudo lspci -nn
+
+...
+0001:01:00.0 Power PC [0b20]: Freescale Semiconductor Inc Device [1957:1c30] (
+rev 10)
+...
+0002:01:00.0 Power PC [0b20]: Freescale Semiconductor Inc Device [1957:1c30] (
+rev 10)
+
+
+Prerequisites
+-------------
+
+Currently supported by DPDK:
+
+- NXP LA1224 BSP **1.0+**.
+- NXP LA1224 PCIe Modem card connected to ARM host.
+
+- Follow the DPDK :ref:`Getting Started Guide for Linux ` to setup the basic DPDK environment.
+
+Enabling logs
+-------------
+
+For enabling logs, use the following EAL parameter:
+
+.. code-block:: console
+
+   ./your_bbdev_application  --log-level=la12xx:
+
+Using ``bb.la12xx`` as log matching criteria, all Baseband PMD logs can be
+enabled which are lower than logging ``level``.
diff --git a/drivers/baseband/la12xx/bbdev_la12xx.c 
b/drivers/baseband/la12xx/bbdev_la12xx.c
new file mode 100644
index 00..d3d7a4df37
--- /dev/null
+++ b/drivers/baseband/la12xx/bbdev_la12xx.c
@@ -0,0 +1,108 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2020-2021 NXP
+ */
+
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+
+#include 
+
+#define DRIVER_NAME baseband_la12xx
+
+/* private data structure */
+struct bbdev_la12xx_private {
+   unsigned int max_nb_queues;  /**< Max number of queues */
+};
+/* Create device */
+static int
+la12xx_bbdev_create(struct rte_vdev_device *vdev)
+{
+   struct rte_bbdev *bbdev;
+   const char *name = rte_vdev_device_name(vdev);
+
+   PMD_INIT_FUNC_TRACE();
+
+   bbdev = rte_bbdev_allocate(name);
+   if (bbdev == NULL)
+   return -ENODEV;
+
+   bbdev->data->dev_private = rte_zmalloc(name,
+   sizeof(struct bbdev_la12xx_private),
+   RTE_CACHE_LINE_SIZE);
+   if (bbdev->data->dev_private == NULL) {
+   rte_bbdev_release(bbdev);
+   return -ENOMEM;
+

[dpdk-dev] [PATCH v11 3/8] baseband/la12xx: add devargs for max queues

2021-10-16 Thread nipun . gupta
From: Hemant Agrawal 

This patch adds dev args to take max queues as input.

Signed-off-by: Nipun Gupta 
Signed-off-by: Hemant Agrawal 
---
 doc/guides/bbdevs/la12xx.rst   |  4 ++
 drivers/baseband/la12xx/bbdev_la12xx.c | 73 +-
 2 files changed, 75 insertions(+), 2 deletions(-)

diff --git a/doc/guides/bbdevs/la12xx.rst b/doc/guides/bbdevs/la12xx.rst
index 9ac6f0a0cd..3725078567 100644
--- a/doc/guides/bbdevs/la12xx.rst
+++ b/doc/guides/bbdevs/la12xx.rst
@@ -58,6 +58,10 @@ Currently supported by DPDK:
 
 - Follow the DPDK :ref:`Getting Started Guide for Linux ` to setup the basic DPDK environment.
 
+* Use dev arg option ``max_nb_queues=x`` to specify the maximum number of queues
+  to be used for communication with the offload device, i.e. the modem. The
+  default is 16, e.g. ``--vdev=baseband_la12xx,max_nb_queues=4``.
+
 Enabling logs
 -------------
 
diff --git a/drivers/baseband/la12xx/bbdev_la12xx.c 
b/drivers/baseband/la12xx/bbdev_la12xx.c
index d3d7a4df37..f5c835eeb8 100644
--- a/drivers/baseband/la12xx/bbdev_la12xx.c
+++ b/drivers/baseband/la12xx/bbdev_la12xx.c
@@ -17,13 +17,73 @@
 
 #define DRIVER_NAME baseband_la12xx
 
+/*  Initialisation params structure that can be used by LA12xx BBDEV driver */
+struct bbdev_la12xx_params {
+   uint8_t queues_num; /*< LA12xx BBDEV queues number */
+};
+
+#define LA12XX_MAX_NB_QUEUES_ARG   "max_nb_queues"
+
+static const char * const bbdev_la12xx_valid_params[] = {
+   LA12XX_MAX_NB_QUEUES_ARG,
+};
+
 /* private data structure */
 struct bbdev_la12xx_private {
unsigned int max_nb_queues;  /**< Max number of queues */
 };
+static inline int
+parse_u16_arg(const char *key, const char *value, void *extra_args)
+{
+   uint16_t *u16 = extra_args;
+
+   uint64_t result;
+   if ((value == NULL) || (extra_args == NULL))
+   return -EINVAL;
+   errno = 0;
+   result = strtoul(value, NULL, 0);
+   if ((result >= (1 << 16)) || (errno != 0)) {
+   rte_bbdev_log(ERR, "Invalid value %" PRIu64 " for %s",
+ result, key);
+   return -ERANGE;
+   }
+   *u16 = (uint16_t)result;
+   return 0;
+}
+
+/* Parse parameters used to create device */
+static int
+parse_bbdev_la12xx_params(struct bbdev_la12xx_params *params,
+   const char *input_args)
+{
+   struct rte_kvargs *kvlist = NULL;
+   int ret = 0;
+
+   if (params == NULL)
+   return -EINVAL;
+   if (input_args) {
+   kvlist = rte_kvargs_parse(input_args,
+   bbdev_la12xx_valid_params);
+   if (kvlist == NULL)
+   return -EFAULT;
+
+   ret = rte_kvargs_process(kvlist, bbdev_la12xx_valid_params[0],
+   &parse_u16_arg, ¶ms->queues_num);
+   if (ret < 0)
+   goto exit;
+
+   }
+
+exit:
+   if (kvlist)
+   rte_kvargs_free(kvlist);
+   return ret;
+}
+
 /* Create device */
 static int
-la12xx_bbdev_create(struct rte_vdev_device *vdev)
+la12xx_bbdev_create(struct rte_vdev_device *vdev,
+   struct bbdev_la12xx_params *init_params __rte_unused)
 {
struct rte_bbdev *bbdev;
const char *name = rte_vdev_device_name(vdev);
@@ -60,7 +120,11 @@ la12xx_bbdev_create(struct rte_vdev_device *vdev)
 static int
 la12xx_bbdev_probe(struct rte_vdev_device *vdev)
 {
+   struct bbdev_la12xx_params init_params = {
+   8
+   };
const char *name;
+   const char *input_args;
 
PMD_INIT_FUNC_TRACE();
 
@@ -71,7 +135,10 @@ la12xx_bbdev_probe(struct rte_vdev_device *vdev)
if (name == NULL)
return -EINVAL;
 
-   return la12xx_bbdev_create(vdev);
+   input_args = rte_vdev_device_args(vdev);
+   parse_bbdev_la12xx_params(&init_params, input_args);
+
+   return la12xx_bbdev_create(vdev, &init_params);
 }
 
 /* Uninitialise device */
@@ -105,4 +172,6 @@ static struct rte_vdev_driver bbdev_la12xx_pmd_drv = {
 };
 
 RTE_PMD_REGISTER_VDEV(DRIVER_NAME, bbdev_la12xx_pmd_drv);
+RTE_PMD_REGISTER_PARAM_STRING(DRIVER_NAME,
+   LA12XX_MAX_NB_QUEUES_ARG"=");
 RTE_LOG_REGISTER_DEFAULT(bbdev_la12xx_logtype, NOTICE);
-- 
2.17.1
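
The devarg shown above can also be supplied programmatically rather than via
--vdev, using the generic vdev bus API (the queue count is illustrative):

    #include <rte_bus_vdev.h>

    /* Equivalent to --vdev=baseband_la12xx,max_nb_queues=4 on the EAL
     * command line; returns 0 when the vdev is created successfully. */
    static int
    create_la12xx_vdev(void)
    {
        return rte_vdev_init("baseband_la12xx", "max_nb_queues=4");
    }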



[dpdk-dev] [PATCH v11 4/8] baseband/la12xx: add support for multiple modems

2021-10-16 Thread nipun . gupta
From: Hemant Agrawal 

This patch adds support for multiple modems by assigning
a modem ID as a devarg in vdev creation.

Signed-off-by: Hemant Agrawal 
---
 doc/guides/bbdevs/la12xx.rst   |  5 ++
 drivers/baseband/la12xx/bbdev_la12xx.c | 64 +++---
 drivers/baseband/la12xx/bbdev_la12xx.h | 56 +++
 drivers/baseband/la12xx/bbdev_la12xx_ipc.h | 20 +++
 4 files changed, 138 insertions(+), 7 deletions(-)
 create mode 100644 drivers/baseband/la12xx/bbdev_la12xx.h
 create mode 100644 drivers/baseband/la12xx/bbdev_la12xx_ipc.h

diff --git a/doc/guides/bbdevs/la12xx.rst b/doc/guides/bbdevs/la12xx.rst
index 3725078567..1a711ef5e3 100644
--- a/doc/guides/bbdevs/la12xx.rst
+++ b/doc/guides/bbdevs/la12xx.rst
@@ -58,6 +58,11 @@ Currently supported by DPDK:
 
 - Follow the DPDK :ref:`Getting Started Guide for Linux ` to setup 
the basic DPDK environment.
 
+* Use dev arg option ``modem=0`` to identify the modem instance for a given
+  device. This is required only if more than one modem card is attached to the
+  host. It is optional; the default value is 0,
+  e.g. ``--vdev=baseband_la12xx,modem=0``.
+
 * Use dev arg option ``max_nb_queues=x`` to specify the maximum number of 
queues
   to be used for communication with offload device i.e. modem. default is 16.
   e.g. ``--vdev=baseband_la12xx,max_nb_queues=4``
diff --git a/drivers/baseband/la12xx/bbdev_la12xx.c 
b/drivers/baseband/la12xx/bbdev_la12xx.c
index f5c835eeb8..58defa54f0 100644
--- a/drivers/baseband/la12xx/bbdev_la12xx.c
+++ b/drivers/baseband/la12xx/bbdev_la12xx.c
@@ -14,24 +14,26 @@
 #include 
 
 #include 
+#include 
+#include 
 
 #define DRIVER_NAME baseband_la12xx
 
 /*  Initialisation params structure that can be used by LA12xx BBDEV driver */
 struct bbdev_la12xx_params {
uint8_t queues_num; /*< LA12xx BBDEV queues number */
+   int8_t modem_id; /*< LA12xx modem instance id */
 };
 
 #define LA12XX_MAX_NB_QUEUES_ARG   "max_nb_queues"
+#define LA12XX_VDEV_MODEM_ID_ARG   "modem"
+#define LA12XX_MAX_MODEM 4
 
 static const char * const bbdev_la12xx_valid_params[] = {
LA12XX_MAX_NB_QUEUES_ARG,
+   LA12XX_VDEV_MODEM_ID_ARG,
 };
 
-/* private data structure */
-struct bbdev_la12xx_private {
-   unsigned int max_nb_queues;  /**< Max number of queues */
-};
 static inline int
 parse_u16_arg(const char *key, const char *value, void *extra_args)
 {
@@ -51,6 +53,28 @@ parse_u16_arg(const char *key, const char *value, void 
*extra_args)
return 0;
 }
 
+/* Parse integer from integer argument */
+static int
+parse_integer_arg(const char *key __rte_unused,
+   const char *value, void *extra_args)
+{
+   int i;
+   char *end;
+
+   errno = 0;
+
+   i = strtol(value, &end, 10);
+   if (*end != 0 || errno != 0 || i < 0 || i > LA12XX_MAX_MODEM) {
+   rte_bbdev_log(ERR, "Supported Port IDS are 0 to %d",
+   LA12XX_MAX_MODEM - 1);
+   return -EINVAL;
+   }
+
+   *((uint32_t *)extra_args) = i;
+
+   return 0;
+}
+
 /* Parse parameters used to create device */
 static int
 parse_bbdev_la12xx_params(struct bbdev_la12xx_params *params,
@@ -72,6 +96,16 @@ parse_bbdev_la12xx_params(struct bbdev_la12xx_params *params,
if (ret < 0)
goto exit;
 
+   ret = rte_kvargs_process(kvlist,
+   bbdev_la12xx_valid_params[1],
+   &parse_integer_arg,
+   ¶ms->modem_id);
+
+   if (params->modem_id >= LA12XX_MAX_MODEM) {
+   rte_bbdev_log(ERR, "Invalid modem id, must be < %u",
+   LA12XX_MAX_MODEM);
+   goto exit;
+   }
}
 
 exit:
@@ -83,10 +117,11 @@ parse_bbdev_la12xx_params(struct bbdev_la12xx_params 
*params,
 /* Create device */
 static int
 la12xx_bbdev_create(struct rte_vdev_device *vdev,
-   struct bbdev_la12xx_params *init_params __rte_unused)
+   struct bbdev_la12xx_params *init_params)
 {
struct rte_bbdev *bbdev;
const char *name = rte_vdev_device_name(vdev);
+   struct bbdev_la12xx_private *priv;
 
PMD_INIT_FUNC_TRACE();
 
@@ -102,6 +137,20 @@ la12xx_bbdev_create(struct rte_vdev_device *vdev,
return -ENOMEM;
}
 
+   priv = bbdev->data->dev_private;
+   priv->modem_id = init_params->modem_id;
+   /* if modem id is not configured */
+   if (priv->modem_id == -1)
+   priv->modem_id = bbdev->data->dev_id;
+
+   /* Reset Global variables */
+   priv->num_ldpc_enc_queues = 0;
+   priv->num_ldpc_dec_queues = 0;
+   priv->num_valid_queues = 0;
+   priv->max_nb_queues = init_params->queues_num;
+
+   rte_bbdev_log(INFO, "Setting Up %s: DevId=%d, ModemId=%d",
+   name, bbdev->data->dev_id, pri

[dpdk-dev] [PATCH v11 5/8] baseband/la12xx: add queue and modem config support

2021-10-16 Thread nipun . gupta
From: Hemant Agrawal 

This patch adds support for connecting with the modem
and creating the IPC channels as queues with the modem
for the exchange of data.

Signed-off-by: Nipun Gupta 
Signed-off-by: Hemant Agrawal 
---
 drivers/baseband/la12xx/bbdev_la12xx.c | 559 -
 drivers/baseband/la12xx/bbdev_la12xx.h |  17 +-
 drivers/baseband/la12xx/bbdev_la12xx_ipc.h | 189 ++-
 3 files changed, 752 insertions(+), 13 deletions(-)

diff --git a/drivers/baseband/la12xx/bbdev_la12xx.c 
b/drivers/baseband/la12xx/bbdev_la12xx.c
index 58defa54f0..7e312cf2e7 100644
--- a/drivers/baseband/la12xx/bbdev_la12xx.c
+++ b/drivers/baseband/la12xx/bbdev_la12xx.c
@@ -3,6 +3,11 @@
  */
 
 #include 
+#include 
+#include 
+#include 
+#include 
+#include 
 
 #include 
 #include 
@@ -29,11 +34,556 @@ struct bbdev_la12xx_params {
 #define LA12XX_VDEV_MODEM_ID_ARG   "modem"
 #define LA12XX_MAX_MODEM 4
 
+#define LA12XX_MAX_CORES   4
+#define LA12XX_LDPC_ENC_CORE   0
+#define LA12XX_LDPC_DEC_CORE   1
+
+#define LA12XX_MAX_LDPC_ENC_QUEUES 4
+#define LA12XX_MAX_LDPC_DEC_QUEUES 4
+
 static const char * const bbdev_la12xx_valid_params[] = {
LA12XX_MAX_NB_QUEUES_ARG,
LA12XX_VDEV_MODEM_ID_ARG,
 };
 
+static const struct rte_bbdev_op_cap bbdev_capabilities[] = {
+   {
+   .type   = RTE_BBDEV_OP_LDPC_ENC,
+   .cap.ldpc_enc = {
+   .capability_flags =
+   RTE_BBDEV_LDPC_RATE_MATCH |
+   RTE_BBDEV_LDPC_CRC_24A_ATTACH |
+   RTE_BBDEV_LDPC_CRC_24B_ATTACH,
+   .num_buffers_src =
+   RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+   .num_buffers_dst =
+   RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+   }
+   },
+   {
+   .type   = RTE_BBDEV_OP_LDPC_DEC,
+   .cap.ldpc_dec = {
+   .capability_flags =
+   RTE_BBDEV_LDPC_CRC_TYPE_24A_CHECK |
+   RTE_BBDEV_LDPC_CRC_TYPE_24B_CHECK |
+   RTE_BBDEV_LDPC_CRC_TYPE_24B_DROP,
+   .num_buffers_src =
+   RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+   .num_buffers_hard_out =
+   RTE_BBDEV_LDPC_MAX_CODE_BLOCKS,
+   .llr_size = 8,
+   .llr_decimals = 1,
+   }
+   },
+   RTE_BBDEV_END_OF_CAPABILITIES_LIST()
+};
+
+static struct rte_bbdev_queue_conf default_queue_conf = {
+   .queue_size = MAX_CHANNEL_DEPTH,
+};
+
+/* Get device info */
+static void
+la12xx_info_get(struct rte_bbdev *dev __rte_unused,
+   struct rte_bbdev_driver_info *dev_info)
+{
+   PMD_INIT_FUNC_TRACE();
+
+   dev_info->driver_name = RTE_STR(DRIVER_NAME);
+   dev_info->max_num_queues = LA12XX_MAX_QUEUES;
+   dev_info->queue_size_lim = MAX_CHANNEL_DEPTH;
+   dev_info->hardware_accelerated = true;
+   dev_info->max_dl_queue_priority = 0;
+   dev_info->max_ul_queue_priority = 0;
+   dev_info->data_endianness = RTE_BIG_ENDIAN;
+   dev_info->default_queue_conf = default_queue_conf;
+   dev_info->capabilities = bbdev_capabilities;
+   dev_info->cpu_flag_reqs = NULL;
+   dev_info->min_alignment = 64;
+
+   rte_bbdev_log_debug("got device info from %u", dev->data->dev_id);
+}
+
+/* Release queue */
+static int
+la12xx_queue_release(struct rte_bbdev *dev, uint16_t q_id)
+{
+   RTE_SET_USED(dev);
+   RTE_SET_USED(q_id);
+
+   PMD_INIT_FUNC_TRACE();
+
+   return 0;
+}
+
+#define HUGEPG_OFFSET(A) \
+   ((uint64_t) ((unsigned long) (A) \
+   - ((uint64_t)ipc_priv->hugepg_start.host_vaddr)))
+
+static int
+ipc_queue_configure(uint32_t channel_id,
+   ipc_t instance,
+   struct bbdev_la12xx_q_priv *q_priv)
+{
+   ipc_userspace_t *ipc_priv = (ipc_userspace_t *)instance;
+   ipc_instance_t *ipc_instance = ipc_priv->instance;
+   ipc_ch_t *ch;
+   void *vaddr;
+   uint32_t i = 0;
+   uint32_t msg_size = sizeof(struct bbdev_ipc_enqueue_op);
+
+   PMD_INIT_FUNC_TRACE();
+
+   rte_bbdev_log_debug("%x %p", ipc_instance->initialized,
+   ipc_priv->instance);
+   ch = &(ipc_instance->ch_list[channel_id]);
+
+   rte_bbdev_log_debug("channel: %u, depth: %u, msg size: %u",
+   channel_id, q_priv->queue_size, msg_size);
+
+   /* Start init of channel */
+   ch->md.ring_size = rte_cpu_to_be_32(q_priv->queue_size);
+   ch->md.pi = 0;
+   ch->md.ci = 0;
+   ch->md.msg_size = msg_size;
+   for (i = 0; i < q_priv->queue_size; i++) {
+   vaddr = rte_malloc(NULL, msg_size, RTE_CACHE_LINE_SIZE);
+   if (!vaddr)
+  

[dpdk-dev] [PATCH v11 6/8] baseband/la12xx: add enqueue and dequeue support

2021-10-16 Thread nipun . gupta
From: Hemant Agrawal 

Add support to enqueue and dequeue the LDPC enc/dec
operations to and from the modem device.

Signed-off-by: Nipun Gupta 
Signed-off-by: Hemant Agrawal 
---
 doc/guides/bbdevs/features/la12xx.ini  |  13 +
 doc/guides/bbdevs/la12xx.rst   |  44 +++
 doc/guides/rel_notes/release_21_11.rst |   5 +
 drivers/baseband/la12xx/bbdev_la12xx.c | 320 +
 drivers/baseband/la12xx/bbdev_la12xx_ipc.h |  37 +++
 5 files changed, 419 insertions(+)
 create mode 100644 doc/guides/bbdevs/features/la12xx.ini

diff --git a/doc/guides/bbdevs/features/la12xx.ini 
b/doc/guides/bbdevs/features/la12xx.ini
new file mode 100644
index 00..0aec5eecb6
--- /dev/null
+++ b/doc/guides/bbdevs/features/la12xx.ini
@@ -0,0 +1,13 @@
+;
+; Supported features of the 'la12xx' bbdev driver.
+;
+; Refer to default.ini for the full list of available PMD features.
+;
+[Features]
+Turbo Decoder (4G) = N
+Turbo Encoder (4G) = N
+LDPC Decoder (5G)  = Y
+LDPC Encoder (5G)  = Y
+LLR/HARQ Compression   = N
+HW Accelerated = Y
+BBDEV API  = Y
diff --git a/doc/guides/bbdevs/la12xx.rst b/doc/guides/bbdevs/la12xx.rst
index 1a711ef5e3..fe1bca4c5c 100644
--- a/doc/guides/bbdevs/la12xx.rst
+++ b/doc/guides/bbdevs/la12xx.rst
@@ -78,3 +78,47 @@ For enabling logs, use the following EAL parameter:
 
 Using ``bb.la12xx`` as log matching criteria, all Baseband PMD logs can be
 enabled which are lower than logging ``level``.
+
+Test Application
+
+
+BBDEV provides a test application, ``test-bbdev.py``, and a range of test data
+for testing the functionality of LA12xx for FEC encode and decode, depending
+on the device capabilities. The test application is located under the
+app/test-bbdev folder and has the following options:
+
+.. code-block:: console
+
+  "-p", "--testapp-path": specifies path to the bbdev test app.
+  "-e", "--eal-params" : EAL arguments which are passed to the test app.
+  "-t", "--timeout": Timeout in seconds (default=300).
+  "-c", "--test-cases" : Defines test cases to run. Run all if not specified.
+  "-v", "--test-vector": Test vector path 
(default=dpdk_path+/app/test-bbdev/test_vectors/bbdev_null.data).
+  "-n", "--num-ops": Number of operations to process on device 
(default=32).
+  "-b", "--burst-size" : Operations enqueue/dequeue burst size (default=32).
+  "-s", "--snr": SNR in dB used when generating LLRs for bler 
tests.
+  "-s", "--iter_max"   : Number of iterations for LDPC decoder.
+  "-l", "--num-lcores" : Number of lcores to run (default=16).
+  "-i", "--init-device" : Initialise PF device with default values.
+
+
+To execute the test application tool using simple decode or encode data,
+type one of the following:
+
+.. code-block:: console
+
+  ./test-bbdev.py -e="--vdev=baseband_la12xx,socket_id=0,max_nb_queues=8" -c validation -n 64 -b 1 -v ./ldpc_dec_default.data
+  ./test-bbdev.py -e="--vdev=baseband_la12xx,socket_id=0,max_nb_queues=8" -c validation -n 64 -b 1 -v ./ldpc_enc_default.data
+
+The test application ``test-bbdev.py`` supports the ability to configure the
+PF device with a default set of values, if the "-i" or "--init-device" option
+is included. The default values are defined in test_bbdev_perf.c.
+
+
+Test Vectors
+
+
+In addition to the simple LDPC decoder and LDPC encoder tests, bbdev also
+provides a range of additional tests under the test_vectors folder, which may
+be useful. The results of these tests will depend on the LA12xx FEC
+capabilities, which may cause some test cases to be skipped, but no failure should be reported.
diff --git a/doc/guides/rel_notes/release_21_11.rst 
b/doc/guides/rel_notes/release_21_11.rst
index 957bd78d61..d7e0bdc09b 100644
--- a/doc/guides/rel_notes/release_21_11.rst
+++ b/doc/guides/rel_notes/release_21_11.rst
@@ -159,6 +159,11 @@ New Features
   * Added tests to verify tunnel header verification in IPsec inbound.
   * Added tests to verify inner checksum.
 
+* **Added NXP LA12xx baseband PMD.**
+
+  * Added a new baseband PMD driver for NXP LA12xx Software defined radio.
+  * See the :doc:`../bbdevs/la12xx` for more details.
+
 
 Removed Items
 -
diff --git a/drivers/baseband/la12xx/bbdev_la12xx.c 
b/drivers/baseband/la12xx/bbdev_la12xx.c
index 7e312cf2e7..4b05b5d3f2 100644
--- a/drivers/baseband/la12xx/bbdev_la12xx.c
+++ b/drivers/baseband/la12xx/bbdev_la12xx.c
@@ -120,6 +120,10 @@ la12xx_queue_release(struct rte_bbdev *dev, uint16_t q_id)
((uint64_t) ((unsigned long) (A) \
- ((uint64_t)ipc_priv->hugepg_start.host_vaddr)))
 
+#define MODEM_P2V(A) \
+   ((uint64_t) ((unsigned long) (A) \
+   + (unsigned long)(ipc_priv->peb_start.host_vaddr)))
+
 static int
 ipc_queue_configure(uint32_t channel_id,
ipc_t instance,
@@ -336,6 +340,318 @@ static const struct rte_bbdev_ops pmd_ops = {
.queue_release = la12xx_queue_release,
.start = la12xx

[dpdk-dev] [PATCH v11 7/8] app/bbdev: enable la12xx for bbdev

2021-10-16 Thread nipun . gupta
From: Hemant Agrawal 

This patch adds the la12xx driver to test-bbdev.

Signed-off-by: Hemant Agrawal 
---
 app/test-bbdev/meson.build | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/app/test-bbdev/meson.build b/app/test-bbdev/meson.build
index edb9deef84..a726a5b3fa 100644
--- a/app/test-bbdev/meson.build
+++ b/app/test-bbdev/meson.build
@@ -23,3 +23,6 @@ endif
 if dpdk_conf.has('RTE_BASEBAND_ACC100')
 deps += ['baseband_acc100']
 endif
+if dpdk_conf.has('RTE_LIBRTE_PMD_BBDEV_LA12XX')
+   deps += ['baseband_la12xx']
+endif
-- 
2.17.1



[dpdk-dev] [PATCH v11 8/8] app/bbdev: handle endianness of test data

2021-10-16 Thread nipun . gupta
From: Nipun Gupta 

With data input, output and HARQ also supported in big
endian format, this patch updates the testbbdev application
to handle the endianness conversion as directed by
the driver being used.

The test vectors assume the data is in little endian order; thus,
if the driver supports big endian data processing, the conversion
from little endian to big endian is handled by the testbbdev application.

Signed-off-by: Nipun Gupta 
---
 app/test-bbdev/test_bbdev_perf.c | 43 
 doc/guides/tools/testbbdev.rst   |  3 +++
 2 files changed, 46 insertions(+)

diff --git a/app/test-bbdev/test_bbdev_perf.c b/app/test-bbdev/test_bbdev_perf.c
index 469597b8b3..7b4529789b 100644
--- a/app/test-bbdev/test_bbdev_perf.c
+++ b/app/test-bbdev/test_bbdev_perf.c
@@ -227,6 +227,45 @@ clear_soft_out_cap(uint32_t *op_flags)
*op_flags &= ~RTE_BBDEV_TURBO_NEG_LLR_1_BIT_SOFT_OUT;
 }
 
+/* This API is to convert all the test vector op data entries
+ * to big endian format. It is used when the device supports
+ * the input in the big endian format.
+ */
+static inline void
+convert_op_data_to_be(void)
+{
+   struct op_data_entries *op;
+   enum op_data_type type;
+   uint8_t nb_segs, *rem_data, temp;
+   uint32_t *data, len;
+   int complete, rem, i, j;
+
+   for (type = DATA_INPUT; type < DATA_NUM_TYPES; ++type) {
+   nb_segs = test_vector.entries[type].nb_segments;
+   op = &test_vector.entries[type];
+
+   /* Invert byte endianness for all the segments */
+   for (i = 0; i < nb_segs; ++i) {
+   len = op->segments[i].length;
+   data = op->segments[i].addr;
+
+   /* Swap complete u32 bytes */
+   complete = len / 4;
+   for (j = 0; j < complete; j++)
+   data[j] = rte_bswap32(data[j]);
+
+   /* Swap any remaining bytes */
+   rem = len % 4;
+   rem_data = (uint8_t *)&data[j];
+   for (j = 0; j < rem/2; j++) {
+   temp = rem_data[j];
+   rem_data[j] = rem_data[rem - j - 1];
+   rem_data[rem - j - 1] = temp;
+   }
+   }
+   }
+}
+
 static int
 check_dev_cap(const struct rte_bbdev_info *dev_info)
 {
@@ -234,6 +273,7 @@ check_dev_cap(const struct rte_bbdev_info *dev_info)
unsigned int nb_inputs, nb_soft_outputs, nb_hard_outputs,
nb_harq_inputs, nb_harq_outputs;
const struct rte_bbdev_op_cap *op_cap = dev_info->drv.capabilities;
+   uint8_t dev_data_endianness = dev_info->drv.data_endianness;
 
nb_inputs = test_vector.entries[DATA_INPUT].nb_segments;
nb_soft_outputs = test_vector.entries[DATA_SOFT_OUTPUT].nb_segments;
@@ -245,6 +285,9 @@ check_dev_cap(const struct rte_bbdev_info *dev_info)
if (op_cap->type != test_vector.op_type)
continue;
 
+   if (dev_data_endianness == RTE_BIG_ENDIAN)
+   convert_op_data_to_be();
+
if (op_cap->type == RTE_BBDEV_OP_TURBO_DEC) {
const struct rte_bbdev_op_cap_turbo_dec *cap =
&op_cap->cap.turbo_dec;
diff --git a/doc/guides/tools/testbbdev.rst b/doc/guides/tools/testbbdev.rst
index d397d991ff..83a0312062 100644
--- a/doc/guides/tools/testbbdev.rst
+++ b/doc/guides/tools/testbbdev.rst
@@ -332,6 +332,9 @@ Variable op_type has to be defined as a first variable in 
file. It specifies
 what type of operations will be executed. For 4G decoder op_type has to be set 
to
 ``RTE_BBDEV_OP_TURBO_DEC`` and for 4G encoder to ``RTE_BBDEV_OP_TURBO_ENC``.
 
+Bbdev-test adjusts the byte endianness based on the PMD capability (``data_endianness``),
+and all the test vectors' input/output data are assumed to be LE by default.
+
 Full details of the meaning and valid values for the below fields are
 documented in *rte_bbdev_op.h*
 
-- 
2.17.1