[dpdk-dev] [PATCH v1 0/4] refactor SMP barriers for net/mlx

2021-03-18 Thread Feifei Wang
For net/mlx4 and net/mlx5, fix cache rebuild bug and replace SMP
barriers with atomic fence.

Feifei Wang (4):
  net/mlx4: fix rebuild bug for Memory Region cache
  net/mlx4: replace SMP barrier with C11 barriers
  net/mlx5: fix rebuild bug for Memory Region cache
  net/mlx5: replace SMP barriers with C11 barriers

 drivers/net/mlx4/mlx4_mr.c | 21 +--
 drivers/net/mlx5/mlx5_mr.c | 41 ++
 2 files changed, 28 insertions(+), 34 deletions(-)

-- 
2.25.1



[dpdk-dev] [PATCH v1 1/4] net/mlx4: fix rebuild bug for Memory Region cache

2021-03-18 Thread Feifei Wang
'dev_gen' is a variable to inform other cores to flush their local cache
when global cache is rebuilt.

However, if 'dev_gen' is updated after global cache is rebuilt, other
cores may load a wrong memory region lkey value from old local cache.

Timeslot    main core               worker core
  1         rebuild global cache
  2                                 load unchanged dev_gen
  3         update dev_gen
  4                                 look up old local cache

From the example above, we can see that although the global cache has
been rebuilt, dev_gen has not yet been updated, so the worker core may
look up its old cache table and receive a wrong memory region lkey value.

To fix this, the update of 'dev_gen' should be moved before the rebuild
of the global cache, so that worker cores are told to flush their local
caches as soon as the global cache starts rebuilding. A wmb ensures the
ordering of these two steps.
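
The timeline above can be replayed in a small single-threaded model. This
is only a sketch under stated assumptions: the struct and function names
are hypothetical, one lkey stands in for the whole cache, and no real
barrier or lock is involved. It only shows which decision a worker takes
when its check lands between the two steps of the main core.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified model: one lkey stands in for the MR cache. */
struct global_cache {
	uint32_t dev_gen; /* generation, bumped on every invalidation */
	uint32_t lkey;    /* stand-in for the global cache contents */
};

struct local_cache {
	uint32_t cur_gen; /* generation at which the local copy was filled */
	uint32_t lkey;    /* locally cached copy */
};

/* A worker trusts its local copy only while the generation is unchanged. */
static int
worker_trusts_local(const struct local_cache *lc,
		    const struct global_cache *gc)
{
	return lc->cur_gen == gc->dev_gen;
}

/* Old order: rebuild first, bump dev_gen afterwards. */
static int
old_order_uses_stale_lkey(void)
{
	struct global_cache gc = { .dev_gen = 1, .lkey = 0xAAAA };
	struct local_cache lc = { .cur_gen = 1, .lkey = 0xAAAA };
	int stale_hit;

	gc.lkey = 0xBBBB; /* slot 1: rebuild global cache */
	/* slots 2 and 4: dev_gen unchanged, so the old lkey is trusted */
	stale_hit = worker_trusts_local(&lc, &gc) && lc.lkey != gc.lkey;
	gc.dev_gen++;     /* slot 3: update dev_gen, too late */
	return stale_hit;
}

/* New order: bump dev_gen first, rebuild afterwards. */
static int
new_order_uses_stale_lkey(void)
{
	struct global_cache gc = { .dev_gen = 1, .lkey = 0xAAAA };
	struct local_cache lc = { .cur_gen = 1, .lkey = 0xAAAA };
	int stale_hit;

	gc.dev_gen++;     /* invalidate first */
	/* the worker sees a new generation and flushes instead of trusting */
	stale_hit = worker_trusts_local(&lc, &gc) && lc.lkey != gc.lkey;
	gc.lkey = 0xBBBB; /* rebuild global cache */
	return stale_hit;
}
```

In this model old_order_uses_stale_lkey() reports a stale hit and
new_order_uses_stale_lkey() does not. In the driver, a worker that has
flushed its local cache still has to go through the global cache lookup,
which is outside the scope of this sketch.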

Fixes: 9797bfcce1c9 ("net/mlx4: add new memory region support")
Cc: sta...@dpdk.org

Suggested-by: Ruifeng Wang 
Signed-off-by: Feifei Wang 
Reviewed-by: Ruifeng Wang 
---
 drivers/net/mlx4/mlx4_mr.c | 19 ---
 1 file changed, 8 insertions(+), 11 deletions(-)

diff --git a/drivers/net/mlx4/mlx4_mr.c b/drivers/net/mlx4/mlx4_mr.c
index 6b2f0cf18..cfd7d4a9c 100644
--- a/drivers/net/mlx4/mlx4_mr.c
+++ b/drivers/net/mlx4/mlx4_mr.c
@@ -946,20 +946,17 @@ mlx4_mr_mem_event_free_cb(struct rte_eth_dev *dev, const void *addr, size_t len)
rebuild = 1;
}
if (rebuild) {
-   mr_rebuild_dev_cache(dev);
-   /*
-* Flush local caches by propagating invalidation across cores.
-* rte_smp_wmb() is enough to synchronize this event. If one of
-* freed memsegs is seen by other core, that means the memseg
-* has been allocated by allocator, which will come after this
-* free call. Therefore, this store instruction (incrementing
-* generation below) will be guaranteed to be seen by other core
-* before the core sees the newly allocated memory.
-*/
++priv->mr.dev_gen;
DEBUG("broadcasting local cache flush, gen=%d",
- priv->mr.dev_gen);
+   priv->mr.dev_gen);
+
+   /* Flush local caches by propagating invalidation across cores.
+* rte_smp_wmb is to keep the order that dev_gen updated before
+* rebuilding global cache. Therefore, other core can flush their
+* local cache on time.
+*/
rte_smp_wmb();
+   mr_rebuild_dev_cache(dev);
}
rte_rwlock_write_unlock(&priv->mr.rwlock);
 #ifdef RTE_LIBRTE_MLX4_DEBUG
-- 
2.25.1



[dpdk-dev] [PATCH v1 2/4] net/mlx4: replace SMP barrier with C11 barriers

2021-03-18 Thread Feifei Wang
Replace SMP barrier with atomic thread fence.
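
For readers less familiar with C11 fences, here is a minimal sketch of the
same ordering written with standard <stdatomic.h> calls instead of the DPDK
wrapper. The struct, the function names and the reader side are
assumptions made for illustration; the patch itself only changes the
writer side.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the shared MR cache. */
struct shared_cache {
	atomic_uint dev_gen; /* generation counter */
	atomic_uint lkey;    /* stand-in for the rebuilt cache contents */
};

struct snapshot {
	unsigned int dev_gen;
	unsigned int lkey;
};

static void
shared_cache_init(struct shared_cache *sc, unsigned int gen,
		  unsigned int lkey)
{
	atomic_init(&sc->dev_gen, gen);
	atomic_init(&sc->lkey, lkey);
}

/* Writer: bump the generation, fence, then rebuild. */
static void
invalidate_then_rebuild(struct shared_cache *sc, unsigned int new_lkey)
{
	atomic_fetch_add_explicit(&sc->dev_gen, 1, memory_order_relaxed);
	/* Release fence: the store above is ordered before the stores below. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&sc->lkey, new_lkey, memory_order_relaxed);
}

/*
 * Reader (assumed pairing): load the contents, acquire fence, then load
 * the generation. If the rebuilt lkey is observed, the bumped dev_gen is
 * observed as well.
 */
static struct snapshot
take_snapshot(struct shared_cache *sc)
{
	struct snapshot s;

	s.lkey = atomic_load_explicit(&sc->lkey, memory_order_relaxed);
	atomic_thread_fence(memory_order_acquire);
	s.dev_gen = atomic_load_explicit(&sc->dev_gen, memory_order_relaxed);
	return s;
}
```

Called from a single thread, invalidate_then_rebuild() followed by
take_snapshot() simply returns the new generation and the new lkey; the
fences only matter once writer and reader run on different cores.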

Signed-off-by: Feifei Wang 
Reviewed-by: Ruifeng Wang 
---
 drivers/net/mlx4/mlx4_mr.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/mlx4/mlx4_mr.c b/drivers/net/mlx4/mlx4_mr.c
index cfd7d4a9c..503e8a7bb 100644
--- a/drivers/net/mlx4/mlx4_mr.c
+++ b/drivers/net/mlx4/mlx4_mr.c
@@ -951,11 +951,11 @@ mlx4_mr_mem_event_free_cb(struct rte_eth_dev *dev, const void *addr, size_t len)
priv->mr.dev_gen);
 
/* Flush local caches by propagating invalidation across cores.
-* rte_smp_wmb is to keep the order that dev_gen updated before
+* release-fence is to keep the order that dev_gen updated before
 * rebuilding global cache. Therefore, other core can flush their
 * local cache on time.
 */
-   rte_smp_wmb();
+   rte_atomic_thread_fence(__ATOMIC_RELEASE);
mr_rebuild_dev_cache(dev);
}
rte_rwlock_write_unlock(&priv->mr.rwlock);
-- 
2.25.1



[dpdk-dev] [PATCH v1 3/4] net/mlx5: fix rebuild bug for Memory Region cache

2021-03-18 Thread Feifei Wang
'dev_gen' is a variable to inform other cores to flush their local cache
when global cache is rebuilt.

However, if 'dev_gen' is updated after global cache is rebuilt, other
cores may load a wrong memory region lkey value from old local cache.

Timeslot    main core               worker core
  1         rebuild global cache
  2                                 load unchanged dev_gen
  3         update dev_gen
  4                                 look up old local cache

From the example above, we can see that although the global cache has
been rebuilt, dev_gen has not yet been updated, so the worker core may
look up its old cache table and receive a wrong memory region lkey value.

To fix this, the update of 'dev_gen' should be moved before the rebuild
of the global cache, so that worker cores are told to flush their local
caches as soon as the global cache starts rebuilding. A wmb ensures the
ordering of these two steps.

Fixes: 974f1e7ef146 ("net/mlx5: add new memory region support")
Cc: sta...@dpdk.org

Suggested-by: Ruifeng Wang 
Signed-off-by: Feifei Wang 
Reviewed-by: Ruifeng Wang 
---
 drivers/net/mlx5/mlx5_mr.c | 37 +
 1 file changed, 17 insertions(+), 20 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_mr.c b/drivers/net/mlx5/mlx5_mr.c
index da4e91fc2..7ce1d3e64 100644
--- a/drivers/net/mlx5/mlx5_mr.c
+++ b/drivers/net/mlx5/mlx5_mr.c
@@ -103,20 +103,18 @@ mlx5_mr_mem_event_free_cb(struct mlx5_dev_ctx_shared *sh,
rebuild = 1;
}
if (rebuild) {
-   mlx5_mr_rebuild_cache(&sh->share_cache);
+   ++sh->share_cache.dev_gen;
+   DEBUG("broadcasting local cache flush, gen=%d",
+   sh->share_cache.dev_gen);
+
/*
 * Flush local caches by propagating invalidation across cores.
-* rte_smp_wmb() is enough to synchronize this event. If one of
-* freed memsegs is seen by other core, that means the memseg
-* has been allocated by allocator, which will come after this
-* free call. Therefore, this store instruction (incrementing
-* generation below) will be guaranteed to be seen by other core
-* before the core sees the newly allocated memory.
+* rte_smp_wmb() is to keep the order that dev_gen updated before
+* rebuilding global cache. Therefore, other core can flush their
+* local cache on time.
 */
-   ++sh->share_cache.dev_gen;
-   DEBUG("broadcasting local cache flush, gen=%d",
- sh->share_cache.dev_gen);
rte_smp_wmb();
+   mlx5_mr_rebuild_cache(&sh->share_cache);
}
rte_rwlock_write_unlock(&sh->share_cache.rwlock);
 }
@@ -407,20 +405,19 @@ mlx5_dma_unmap(struct rte_pci_device *pdev, void *addr,
mlx5_mr_free(mr, sh->share_cache.dereg_mr_cb);
DEBUG("port %u remove MR(%p) from list", dev->data->port_id,
  (void *)mr);
-   mlx5_mr_rebuild_cache(&sh->share_cache);
+
+   ++sh->share_cache.dev_gen;
+   DEBUG("broadcasting local cache flush, gen=%d",
+   sh->share_cache.dev_gen);
+
/*
 * Flush local caches by propagating invalidation across cores.
-* rte_smp_wmb() is enough to synchronize this event. If one of
-* freed memsegs is seen by other core, that means the memseg
-* has been allocated by allocator, which will come after this
-* free call. Therefore, this store instruction (incrementing
-* generation below) will be guaranteed to be seen by other core
-* before the core sees the newly allocated memory.
+* rte_smp_wmb() is to keep the order that dev_gen updated before
+* rebuilding global cache. Therefore, other core can flush their
+* local cache on time.
 */
-   ++sh->share_cache.dev_gen;
-   DEBUG("broadcasting local cache flush, gen=%d",
- sh->share_cache.dev_gen);
rte_smp_wmb();
+   mlx5_mr_rebuild_cache(&sh->share_cache);
rte_rwlock_read_unlock(&sh->share_cache.rwlock);
return 0;
 }
-- 
2.25.1



[dpdk-dev] [PATCH v1 4/4] net/mlx5: replace SMP barriers with C11 barriers

2021-03-18 Thread Feifei Wang
Replace SMP barrier with atomic thread fence.

Signed-off-by: Feifei Wang 
Reviewed-by: Ruifeng Wang 
Reviewed-by: Honnappa Nagarahalli 
---
 drivers/net/mlx5/mlx5_mr.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_mr.c b/drivers/net/mlx5/mlx5_mr.c
index 7ce1d3e64..650fe9093 100644
--- a/drivers/net/mlx5/mlx5_mr.c
+++ b/drivers/net/mlx5/mlx5_mr.c
@@ -109,11 +109,11 @@ mlx5_mr_mem_event_free_cb(struct mlx5_dev_ctx_shared *sh,
 
/*
 * Flush local caches by propagating invalidation across cores.
-* rte_smp_wmb() is to keep the order that dev_gen updated before
+* release-fence is to keep the order that dev_gen updated before
 * rebuilding global cache. Therefore, other core can flush their
 * local cache on time.
 */
-   rte_smp_wmb();
+   rte_atomic_thread_fence(__ATOMIC_RELEASE);
mlx5_mr_rebuild_cache(&sh->share_cache);
}
rte_rwlock_write_unlock(&sh->share_cache.rwlock);
@@ -412,11 +412,11 @@ mlx5_dma_unmap(struct rte_pci_device *pdev, void *addr,
 
/*
 * Flush local caches by propagating invalidation across cores.
-* rte_smp_wmb() is to keep the order that dev_gen updated before
+* release-fence is to keep the order that dev_gen updated before
 * rebuilding global cache. Therefore, other core can flush their
 * local cache on time.
 */
-   rte_smp_wmb();
+   rte_atomic_thread_fence(__ATOMIC_RELEASE);
mlx5_mr_rebuild_cache(&sh->share_cache);
rte_rwlock_read_unlock(&sh->share_cache.rwlock);
return 0;
-- 
2.25.1



[dpdk-dev] [RFC] ethdev: introduce conntrack flow action and item

2021-03-18 Thread Bing Zhao
This commit introduces the conntrack action and item.

Usually HW offloading is stateless. For some stateful offloading,
like a TCP connection, the HW module can provide full offloading
without SW participation after the connection is established.

The basic usage is that in the first flow the application should add
the conntrack action and in the following flow(s) the application
should use the conntrack item to match on the result.

A TCP connection carries traffic in two directions. To set a conntrack
action context correctly, information from packets of both directions
is required.

The conntrack action should be created on one port, with the peer port
supplied as a parameter to the action. After the context is created, it
can only be used between these ports (dual-port mode) or on a single
port. In dual-port mode, the application should modify the action via
the action_ctx_update interface before each use, in order to set the
correct direction for the following rte flow.

Querying the current packet information and connection status will be
supported via the action_ctx_query interface.

For packets received during the conntrack setup, it is suggested to
re-inject them in order to take full advantage of the conntrack. Only
valid packets should pass the conntrack; packets with invalid TCP
information (for example, out of window) or with an invalid header
(for example, malformed) should not pass.

Testpmd command line example:

set conntrack [index] enable is 1 last_seq is xxx last ack is xxx /
... / orig_dir win_scale is xxx sent_end is xxx max_win is xxx ... /
rply_dir ... / end
flow action_ctx [CTX] create ingress ... / conntrack is [index] / end
flow create 0 group X ingress patterns ... / tcp / end actions action_ctx [CTX]
/ jump group Y / end
flow create 0 group Y ingress patterns ... / ct is [Valid] / end actions
queue index [hairpin queue] / end
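
As an illustration of how the conntrack item could be evaluated, the sketch
below applies the usual rte_flow spec/mask comparison to the flag bits
proposed in this RFC. The flag values and the item struct are local copies
of the definitions in the patch; conntrack_item_matches() is a hypothetical
helper, not part of the proposed API, and the exact matching rule is an
assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Local copies of the flag values proposed in the RFC. */
#define RTE_FLOW_CONNTRACK_FLAG_STATE_VALID   (1 << 0)
#define RTE_FLOW_CONNTRACK_FLAG_STATE_CHANGED (1 << 1)
#define RTE_FLOW_CONNTRACK_FLAG_ERROR         (1 << 2)
#define RTE_FLOW_CONNTRACK_FLAG_DISABLED      (1 << 3)
#define RTE_FLOW_CONNTRACK_FLAG_BAD_PKT       (1 << 4)

/* Local copy of the item structure proposed in the RFC. */
struct rte_flow_item_conntrack {
	uint32_t flags;
};

/*
 * Hypothetical helper: a packet matches when every bit selected by the
 * mask has the same value in the packet state and in the spec.
 */
static int
conntrack_item_matches(uint32_t pkt_flags,
		       const struct rte_flow_item_conntrack *spec,
		       const struct rte_flow_item_conntrack *mask)
{
	return (pkt_flags & mask->flags) == (spec->flags & mask->flags);
}
```

With a spec of STATE_VALID and a mask covering STATE_VALID, ERROR and
BAD_PKT, a valid packet whose state changed still matches, while a packet
that is also flagged BAD_PKT does not.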

Signed-off-by: Bing Zhao 
---
 lib/librte_ethdev/rte_flow.h | 191 +++
 1 file changed, 191 insertions(+)

diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index 669e677e91..b2e4f0751a 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -550,6 +550,15 @@ enum rte_flow_item_type {
 * See struct rte_flow_item_geneve_opt
 */
RTE_FLOW_ITEM_TYPE_GENEVE_OPT,
+
+   /**
+* [META]
+*
+* Matches conntrack state.
+*
+* See struct rte_flow_item_conntrack.
+*/
+   RTE_FLOW_ITEM_TYPE_CONNTRACK,
 };
 
 /**
@@ -1654,6 +1663,49 @@ rte_flow_item_geneve_opt_mask = {
 };
 #endif
 
+/**
+ * The packet is valid.
+ */
+#define RTE_FLOW_CONNTRACK_FLAG_STATE_VALID (1 << 0)
+/**
+ * The state of the connection was changed.
+ */
+#define RTE_FLOW_CONNTRACK_FLAG_STATE_CHANGED (1 << 1)
+/**
+ * Error state was detected on this packet for this connection.
+ */
+#define RTE_FLOW_CONNTRACK_FLAG_ERROR (1 << 2)
+/**
+ * The HW connection tracking module is disabled.
+ * It can be due to application command or an invalid state.
+ */
+#define RTE_FLOW_CONNTRACK_FLAG_DISABLED (1 << 3)
+/**
+ * The packet contains some bad field(s).
+ */
+#define RTE_FLOW_CONNTRACK_FLAG_BAD_PKT (1 << 4)
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ITEM_TYPE_CONNTRACK
+ *
+ * Matches the state of a packet after it passed the connection tracking
+ * examination. The state is a bit mask of one RTE_FLOW_CONNTRACK_FLAG*
+ * or a reasonable combination of these bits.
+ */
+struct rte_flow_item_conntrack {
+   uint32_t flags;
+};
+
+/** Default mask for RTE_FLOW_ITEM_TYPE_CONNTRACK. */
+#ifndef __cplusplus
+static const struct rte_flow_item_conntrack rte_flow_item_conntrack_mask = {
+   .flags = 0x,
+};
+#endif
+
 /**
  * Matching pattern item definition.
  *
@@ -2236,6 +2288,17 @@ enum rte_flow_action_type {
 * See struct rte_flow_action_modify_field.
 */
RTE_FLOW_ACTION_TYPE_MODIFY_FIELD,
+
+   /**
+* [META]
+*
+* Enable tracking a TCP connection state.
+*
+* Send packet to HW connection tracking module for examination.
+*
+* See struct rte_flow_action_conntrack.
+*/
+   RTE_FLOW_ACTION_TYPE_CONNTRACK,
 };
 
 /**
@@ -2828,6 +2891,134 @@ struct rte_flow_action_set_dscp {
  */
 struct rte_flow_shared_action;
 
+/**
+ * The state of a TCP connection.
+ */
+enum rte_flow_conntrack_state {
+   RTE_FLOW_CONNTRACK_STATE_SYN_RECV,
+   /**< SYN-ACK packet was seen. */
+   RTE_FLOW_CONNTRACK_STATE_ESTABLISHED,
+   /**< 3-way handshake was done. */
+   RTE_FLOW_CONNTRACK_STATE_FIN_WAIT,
+   /**< First FIN packet was received to close the connection. */
+   RTE_FLOW_CONNTRACK_STATE_CLOSE_WAIT,
+   /**< First FIN was ACKed. */
+   RTE_FLOW_CONNTRACK_STATE_LAST_ACK,
+   /**< After second FIN, waiting for the last ACK. */
+   RTE_FLOW_CONNTRACK_STATE_TIME_WAI

Re: [dpdk-dev] [PATCH] bus/pci: fix Windows kernel driver categories

2021-03-18 Thread Thomas Monjalon
17/03/2021 23:43, Dmitry Kozlyuk:
> 2021-03-17 00:11 (UTC+0100), Thomas Monjalon:
> [...]
> > diff --git a/drivers/bus/pci/rte_bus_pci.h b/drivers/bus/pci/rte_bus_pci.h
> > index fdda046515..3d009cc74b 100644
> > --- a/drivers/bus/pci/rte_bus_pci.h
> > +++ b/drivers/bus/pci/rte_bus_pci.h
> > @@ -52,12 +52,13 @@ TAILQ_HEAD(rte_pci_driver_list, rte_pci_driver);
> >  struct rte_devargs;
> >  
> >  enum rte_pci_kernel_driver {
> > -   RTE_PCI_KDRV_UNKNOWN = 0,
> > -   RTE_PCI_KDRV_IGB_UIO,
> > -   RTE_PCI_KDRV_VFIO,
> > -   RTE_PCI_KDRV_UIO_GENERIC,
> > -   RTE_PCI_KDRV_NIC_UIO,
> > -   RTE_PCI_KDRV_NONE,
> > +   RTE_PCI_KDRV_UNKNOWN = 0,  /* not listed - may be a bifurcated driver */
> > +   RTE_PCI_KDRV_IGB_UIO,  /* igb_uio for Linux */
> > +   RTE_PCI_KDRV_VFIO, /* VFIO for Linux */
> > +   RTE_PCI_KDRV_UIO_GENERIC,  /* uio_generic for Linux */
> 
> Module name is "uio_pci_generic", otherwise

Oops, yes I will fix.

> Acked-by: Dmitry Kozlyuk 





Re: [dpdk-dev] [PATCH] bus/pci: fix Windows kernel driver categories

2021-03-18 Thread Thomas Monjalon
18/03/2021 00:17, Ranjit Menon:
> Hi Thomas,
> 
> On 3/16/2021 4:11 PM, Thomas Monjalon wrote:
> > In Windows probing, the value RTE_PCI_KDRV_NONE was used
> > instead of RTE_PCI_KDRV_UNKNOWN (mlx case),
> > and RTE_PCI_KDRV_NIC_UIO (FreeBSD) was re-used
> > instead of having a new RTE_PCI_KDRV_NET_UIO for Windows NetUIO.
> Shouldn't the mlx case actually remain RTE_PCI_KDRV_NONE?
> 
> mlx does not require a UIO-like kernel driver...No? And NONE implies that no 
> kernel driver is used/required.
> Not sure what is correct here.

No this is a bifurcated model, meaning kernel and userland
work together. The PCI device is bound to the kernel driver,
but the driver is not listed because no special treatment is required.

> > While adding the new value RTE_PCI_KDRV_NET_UIO,
> > the enum of kernel driver categories is annotated.
> >
> > Fixes: b762221ac24f ("bus/pci: support Windows with bifurcated drivers")
> > Fixes: c76ec01b4591 ("bus/pci: support netuio on Windows")
> > Cc: sta...@dpdk.org
> >
> > Signed-off-by: Thomas Monjalon 
> > ---
> >   drivers/bus/pci/rte_bus_pci.h | 13 +++--
> >   drivers/bus/pci/windows/pci.c | 14 +++---
> >   2 files changed, 14 insertions(+), 13 deletions(-)
> >
> > diff --git a/drivers/bus/pci/rte_bus_pci.h b/drivers/bus/pci/rte_bus_pci.h
> > index fdda046515..3d009cc74b 100644
> > --- a/drivers/bus/pci/rte_bus_pci.h
> > +++ b/drivers/bus/pci/rte_bus_pci.h
> > @@ -52,12 +52,13 @@ TAILQ_HEAD(rte_pci_driver_list, rte_pci_driver);
> >   struct rte_devargs;
> >   
> >   enum rte_pci_kernel_driver {
> > -   RTE_PCI_KDRV_UNKNOWN = 0,
> > -   RTE_PCI_KDRV_IGB_UIO,
> > -   RTE_PCI_KDRV_VFIO,
> > -   RTE_PCI_KDRV_UIO_GENERIC,
> > -   RTE_PCI_KDRV_NIC_UIO,
> > -   RTE_PCI_KDRV_NONE,
> > +   RTE_PCI_KDRV_UNKNOWN = 0,  /* not listed - may be a bifurcated driver */
> > +   RTE_PCI_KDRV_IGB_UIO,  /* igb_uio for Linux */
> > +   RTE_PCI_KDRV_VFIO, /* VFIO for Linux */
> > +   RTE_PCI_KDRV_UIO_GENERIC,  /* uio_generic for Linux */
> > +   RTE_PCI_KDRV_NIC_UIO,  /* nic_uio for FreeBSD */
> > +   RTE_PCI_KDRV_NONE, /* error */
> > +   RTE_PCI_KDRV_NET_UIO,  /* NetUIO for Windows */
> >   };
> >   
> 
> Any chance we can re-order the enums, so that _NONE and _UNKNOWN are at 
> the top?

No, it would break the ABI.

> This will change the value, and break code where this value was 
> hard-coded. But how likely is that...?

The problem is when loading the new PCI bus driver with an old device driver.
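
A small sketch of why appending is safe while reordering is not. The enum
names below are hypothetical copies made for the comparison; only the
ordering of the values follows the patch and the reordering suggested
above.

```c
#include <assert.h>

/* Values an old device driver was compiled against (before the patch). */
enum old_kdrv {
	OLD_KDRV_UNKNOWN = 0,
	OLD_KDRV_IGB_UIO,
	OLD_KDRV_VFIO,
	OLD_KDRV_UIO_GENERIC,
	OLD_KDRV_NIC_UIO,
	OLD_KDRV_NONE,
};

/* The patch: the new value is appended, existing values keep their number. */
enum appended_kdrv {
	APP_KDRV_UNKNOWN = 0,
	APP_KDRV_IGB_UIO,
	APP_KDRV_VFIO,
	APP_KDRV_UIO_GENERIC,
	APP_KDRV_NIC_UIO,
	APP_KDRV_NONE,
	APP_KDRV_NET_UIO,
};

/* Hypothetical reordering with NONE and UNKNOWN at the top. */
enum reordered_kdrv {
	REO_KDRV_NONE = 0,
	REO_KDRV_UNKNOWN,
	REO_KDRV_IGB_UIO,
	REO_KDRV_VFIO,
	REO_KDRV_UIO_GENERIC,
	REO_KDRV_NIC_UIO,
	REO_KDRV_NET_UIO,
};

/* Does an integer stored by old code keep its meaning for new code? */
static int
vfio_survives_append(void)
{
	return (int)OLD_KDRV_VFIO == (int)APP_KDRV_VFIO;
}

static int
vfio_survives_reorder(void)
{
	return (int)OLD_KDRV_VFIO == (int)REO_KDRV_VFIO;
}
```

After the reordering, the value an old driver compiled as VFIO would be
read by the new bus driver as IGB_UIO, which is the mismatch described
above.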





Re: [dpdk-dev] [Linuxarm] Re: [PATCH V2] app/testpmd: support Tx mbuf free on demand cmd

2021-03-18 Thread Thomas Monjalon
18/03/2021 04:56, oulijun:
> On 2021/3/17 20:07, Thomas Monjalon wrote:
> > 17/03/2021 12:30, oulijun:
> >> 2021/3/12 19:21, Thomas Monjalon:
> >>> 12/03/2021 11:29, oulijun:
>  2021/3/10 15:59, Thomas Monjalon:
> > 10/03/2021 02:48, oulijun:
> >> Can we add an API such as rte_eth_get_device(port_id)
> >>
> >> for example:
> >> struct rte_eth_dev *
> >> rte_eth_get_device(uint16_t port_id)
> >> {
> >>return &rte_eth_devices[port_id];
> >> }
> > An application is not supposed to access the struct rte_eth_dev.
> > Which info do you need from this struct?
> 
>  Applications cannot directly access the global variable
>  rte_eth_devices[]. To obtain information about rte_eth_dev, they need to
>  access the global variable through APIs instead of directly.
> >>>
> >>> That's not the question.
> >>> Which device info do you need, which is not already provided by
> >>> one of the function rte_eth_*info* ?
> >>>   rte_eth_dev_get_dcb_info
> >>>   rte_eth_dev_get_reg_info
> >>>   rte_eth_dev_info_get
> >>>   rte_eth_rx_queue_info_get
> >>>   rte_eth_tx_queue_info_get
> >>>   rte_eth_dev_get_module_info
> >>>
> >> Hi, Thomas
> >> I think dev->data->nb_tx_queues can be obtained through
> >> rte_eth_info_get, but dev->data->tx_queue_state[queue_id] has nowhere to
> >> be obtained. I think a patch needs to be added to obtain
> >> tx_queue_state[queue_id] through rte_eth_tx_queue_info_get. What do you
> >> think?
> > 
> > Yes it looks OK to add more queue info in rte_eth_*x_queue_info_get.
> Good, can I just catch up with this version?

You can try.




Re: [dpdk-dev] [PATCH v6 00/17] Alpine/musl build support

2021-03-18 Thread Andrew Rybchenko
On 3/18/21 1:42 AM, Thomas Monjalon wrote:
> 28/02/2021 13:53, Thomas Monjalon:
>> These patches fix some build errors/warning for Alpine Linux,
>> using musl and busybox.
>> Few improvements are added on the way.
> 
> No more comment on this series? Ready to merge?

Series-acked-by: Andrew Rybchenko 




Re: [dpdk-dev] [PATCH] bus/pci: fix Windows kernel driver categories

2021-03-18 Thread Slava Ovsiienko
Hi, Thomas

> -Original Message-
> From: dev  On Behalf Of Thomas Monjalon
> Sent: Wednesday, March 17, 2021 1:12
> To: dev@dpdk.org
> Cc: dmitry.kozl...@gmail.com; sta...@dpdk.org; Tal Shnaiderman
> ; Narcisa Vasile ; Ranjit
> Menon ; John Alexander
> ; Pallavi Kadam
> 
> Subject: [dpdk-dev] [PATCH] bus/pci: fix Windows kernel driver categories
> 
> In Windows probing, the value RTE_PCI_KDRV_NONE was used instead of
> RTE_PCI_KDRV_UNKNOWN (mlx case), and RTE_PCI_KDRV_NIC_UIO
> (FreeBSD) was re-used instead of having a new RTE_PCI_KDRV_NET_UIO for
> Windows NetUIO.

As far as I understand - under Windows there is always some kernel driver
backing the device, hence, RTE_PCI_KDRV_NONE is not an option and
RTE_PCI_KDRV_UNKNOWN is more appropriate. I would add this extra
explanation in commit message.

With best regards,
Slava
> 
> While adding the new value RTE_PCI_KDRV_NET_UIO, the enum of kernel
> driver categories is annotated.
> 
> Fixes: b762221ac24f ("bus/pci: support Windows with bifurcated drivers")
> Fixes: c76ec01b4591 ("bus/pci: support netuio on Windows")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Thomas Monjalon 
> ---
>  drivers/bus/pci/rte_bus_pci.h | 13 +++--
> drivers/bus/pci/windows/pci.c | 14 +++---
>  2 files changed, 14 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/bus/pci/rte_bus_pci.h b/drivers/bus/pci/rte_bus_pci.h
> index fdda046515..3d009cc74b 100644
> --- a/drivers/bus/pci/rte_bus_pci.h
> +++ b/drivers/bus/pci/rte_bus_pci.h
> @@ -52,12 +52,13 @@ TAILQ_HEAD(rte_pci_driver_list, rte_pci_driver);
> struct rte_devargs;
> 
>  enum rte_pci_kernel_driver {
> - RTE_PCI_KDRV_UNKNOWN = 0,
> - RTE_PCI_KDRV_IGB_UIO,
> - RTE_PCI_KDRV_VFIO,
> - RTE_PCI_KDRV_UIO_GENERIC,
> - RTE_PCI_KDRV_NIC_UIO,
> - RTE_PCI_KDRV_NONE,
> + RTE_PCI_KDRV_UNKNOWN = 0,  /* not listed - may be a bifurcated
> driver */
> + RTE_PCI_KDRV_IGB_UIO,  /* igb_uio for Linux */
> + RTE_PCI_KDRV_VFIO, /* VFIO for Linux */
> + RTE_PCI_KDRV_UIO_GENERIC,  /* uio_generic for Linux */
> + RTE_PCI_KDRV_NIC_UIO,  /* nic_uio for FreeBSD */
> + RTE_PCI_KDRV_NONE, /* error */
> + RTE_PCI_KDRV_NET_UIO,  /* NetUIO for Windows */
>  };
> 
>  /**
> diff --git a/drivers/bus/pci/windows/pci.c b/drivers/bus/pci/windows/pci.c
> index 8f906097f4..3f0ce1fb83 100644
> --- a/drivers/bus/pci/windows/pci.c
> +++ b/drivers/bus/pci/windows/pci.c
> @@ -38,7 +38,7 @@ rte_pci_map_device(struct rte_pci_device *dev)
>* Devices that are bound to netuio are mapped at
>* the bus probing stage.
>*/
> - if (dev->kdrv == RTE_PCI_KDRV_NIC_UIO)
> + if (dev->kdrv == RTE_PCI_KDRV_NET_UIO)
>   return 0;
>   else
>   return -1;
> @@ -207,14 +207,14 @@ get_device_resource_info(HDEVINFO dev_info,
>   int ret;
> 
>   switch (dev->kdrv) {
> - case RTE_PCI_KDRV_NONE:
> - /* mem_resource - Unneeded for RTE_PCI_KDRV_NONE */
> + case RTE_PCI_KDRV_UNKNOWN:
> + /* mem_resource is unneeded */
>   dev->mem_resource[0].phys_addr = 0;
>   dev->mem_resource[0].len = 0;
>   dev->mem_resource[0].addr = NULL;
>   break;
> - case RTE_PCI_KDRV_NIC_UIO:
> - /* get device info from netuio kernel driver */
> + case RTE_PCI_KDRV_NET_UIO:
> + /* get device info from NetUIO kernel driver */
>   ret = get_netuio_device_info(dev_info, dev_info_data,
> dev);
>   if (ret != 0) {
>   RTE_LOG(DEBUG, EAL,
> @@ -323,9 +323,9 @@ set_kernel_driver_type(PSP_DEVINFO_DATA
> device_info_data,  {
>   /* set kernel driver type based on device class */
>   if (IsEqualGUID(&(device_info_data->ClassGuid),
> &GUID_DEVCLASS_NETUIO))
> - dev->kdrv = RTE_PCI_KDRV_NIC_UIO;
> + dev->kdrv = RTE_PCI_KDRV_NET_UIO;
>   else
> - dev->kdrv = RTE_PCI_KDRV_NONE;
> + dev->kdrv = RTE_PCI_KDRV_UNKNOWN;
>  }
> 
>  static int
> --
> 2.30.1



Re: [dpdk-dev] [PATCH v4 3/4] net/virtio: allocate fake mbuf in Rx queue

2021-03-18 Thread Xia, Chenbo
Hi Maxime,

> -Original Message-
> From: Maxime Coquelin 
> Sent: Tuesday, March 16, 2021 5:38 PM
> To: dev@dpdk.org; Xia, Chenbo ; amore...@redhat.com;
> david.march...@redhat.com; olivier.m...@6wind.com; bnem...@redhat.com
> Cc: Maxime Coquelin 
> Subject: [PATCH v4 3/4] net/virtio: allocate fake mbuf in Rx queue
> 
> While it is worth clarifying whether the fake mbuf
> in virtnet_rx struct is really necessary, it is sure
> that it heavily impacts cache usage by being part of
> the struct. Indeed, it uses two cachelines, and
> requires alignment on a cacheline.
> 
> Before this series, it means it took 120 bytes in
> virtnet_rx struct:
> 
> struct virtnet_rx {
>   struct virtqueue * vq;   /* 0 8 */
> 
>   /* XXX 56 bytes hole, try to pack */
> 
>   /* --- cacheline 1 boundary (64 bytes) --- */
>   struct rte_mbuf fake_mbuf __attribute__((__aligned__(64))); /*    64   128 */
>   /* --- cacheline 3 boundary (192 bytes) --- */
> 
> This patch allocates it using malloc in order to optimize
> virtnet_rx cache usage and so virtqueue cache usage.
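
The layout effect described in the commit message can be reproduced with a
small model. This is a sketch under assumptions: fake_mbuf_model is a
hypothetical 128-byte stand-in for struct rte_mbuf, and the two rx structs
keep only the fields needed to show the hole.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* Hypothetical 128-byte stand-in for struct rte_mbuf (two cache lines). */
struct fake_mbuf_model {
	unsigned char bytes[128];
};

/* Embedded, cache-line aligned member: forces a hole after vq. */
struct rx_embedded {
	void *vq;
	alignas(64) struct fake_mbuf_model fake_mbuf;
};

/* Pointer member: the fake mbuf lives in separately allocated memory. */
struct rx_pointer {
	void *vq;
	struct fake_mbuf_model *fake_mbuf;
};
```

Here offsetof(struct rx_embedded, fake_mbuf) is 64 and
sizeof(struct rx_embedded) is 192, matching the cacheline boundaries in the
pahole output, while struct rx_pointer only takes two pointers.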
> 
> Signed-off-by: Maxime Coquelin 
> Reviewed-by: David Marchand 
> ---
>  drivers/net/virtio/virtio_ethdev.c | 13 +
>  drivers/net/virtio/virtio_rxtx.c   |  9 +++--
>  drivers/net/virtio/virtio_rxtx.h   |  2 +-
>  3 files changed, 17 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/net/virtio/virtio_ethdev.c
> b/drivers/net/virtio/virtio_ethdev.c
> index d5643733f7..be9faa3b6c 100644
> --- a/drivers/net/virtio/virtio_ethdev.c
> +++ b/drivers/net/virtio/virtio_ethdev.c
> @@ -435,6 +435,7 @@ virtio_init_queue(struct rte_eth_dev *dev, uint16_t
> queue_idx)
>   int queue_type = virtio_get_queue_type(hw, queue_idx);
>   int ret;
>   int numa_node = dev->device->numa_node;
> + struct rte_mbuf *fake_mbuf = NULL;
> 
>   PMD_INIT_LOG(INFO, "setting up queue: %u on NUMA node %d",
>   queue_idx, numa_node);
> @@ -550,10 +551,19 @@ virtio_init_queue(struct rte_eth_dev *dev, uint16_t
> queue_idx)
>   goto free_hdr_mz;
>   }
> 
> + fake_mbuf = rte_zmalloc_socket("sw_ring", sizeof(*fake_mbuf),
> + RTE_CACHE_LINE_SIZE, numa_node);
> + if (!fake_mbuf) {
> + PMD_INIT_LOG(ERR, "can not allocate fake mbuf");
> + ret = -ENOMEM;
> + goto free_sw_ring;
> + }
> +
>   vq->sw_ring = sw_ring;
>   rxvq = &vq->rxq;
>   rxvq->port_id = dev->data->port_id;
>   rxvq->mz = mz;
> + rxvq->fake_mbuf = fake_mbuf;
>   } else if (queue_type == VTNET_TQ) {
>   txvq = &vq->txq;
>   txvq->port_id = dev->data->port_id;
> @@ -612,6 +622,8 @@ virtio_init_queue(struct rte_eth_dev *dev, uint16_t
> queue_idx)
> 
>  clean_vq:
>   hw->cvq = NULL;
> + rte_free(fake_mbuf);
> +free_sw_ring:
>   rte_free(sw_ring);
>  free_hdr_mz:
>   rte_memzone_free(hdr_mz);
> @@ -641,6 +653,7 @@ virtio_free_queues(struct virtio_hw *hw)
> 
>   queue_type = virtio_get_queue_type(hw, i);
>   if (queue_type == VTNET_RQ) {
> + rte_free(vq->rxq.fake_mbuf);
>   rte_free(vq->sw_ring);
>   rte_memzone_free(vq->rxq.mz);
>   } else if (queue_type == VTNET_TQ) {
> diff --git a/drivers/net/virtio/virtio_rxtx.c
> b/drivers/net/virtio/virtio_rxtx.c
> index 32af8d3d11..8df913b0ba 100644
> --- a/drivers/net/virtio/virtio_rxtx.c
> +++ b/drivers/net/virtio/virtio_rxtx.c
> @@ -703,12 +703,9 @@ virtio_dev_rx_queue_setup_finish(struct rte_eth_dev *dev,
> uint16_t queue_idx)
>   virtio_rxq_vec_setup(rxvq);
>   }
> 
> - memset(&rxvq->fake_mbuf, 0, sizeof(rxvq->fake_mbuf));
> - for (desc_idx = 0; desc_idx < RTE_PMD_VIRTIO_RX_MAX_BURST;
> -  desc_idx++) {
> - vq->sw_ring[vq->vq_nentries + desc_idx] =
> - &rxvq->fake_mbuf;
> - }
> + memset(rxvq->fake_mbuf, 0, sizeof(*rxvq->fake_mbuf));
> + for (desc_idx = 0; desc_idx < RTE_PMD_VIRTIO_RX_MAX_BURST; desc_idx++)

I just noticed that the macros 'RTE_PMD_VIRTIO_RX_MAX_BURST' and
'RTE_VIRTIO_VPMD_RX_BURST' should always have the same value, so it may
be better to merge them into one macro later.

For this patch:

Reviewed-by: Chenbo Xia 

> + vq->sw_ring[vq->vq_nentries + desc_idx] = rxvq->fake_mbuf;
> 
>   if (hw->use_vec_rx && !virtio_with_packed_queue(hw)) {
>   while (vq->vq_free_cnt >= RTE_VIRTIO_VPMD_RX_REARM_THRESH) {
> diff --git a/drivers/net/virtio/virtio_rxtx.h
> b/drivers/net/virtio/virtio_rxtx.h
> index 7f1036be6f..6ce5d67d15 100644
> --- a/drivers/net/virtio/virtio_rxtx.h
> +++ b/drivers/net/virtio/virtio_rxtx.h
> @@ -19,7 +19,7 @@ struct virtnet_stats {
> 
>  struct virtnet_rx {
>   /* dummy mbuf, for wraparound when 

Re: [dpdk-dev] [PATCH v6 00/17] Alpine/musl build support

2021-03-18 Thread David Marchand
On Wed, Mar 17, 2021 at 11:42 PM Thomas Monjalon  wrote:
>
> 28/02/2021 13:53, Thomas Monjalon:
> > These patches fix some build errors/warning for Alpine Linux,
> > using musl and busybox.
> > Few improvements are added on the way.
>
> No more comment on this series? Ready to merge?

This will probably need some rebase on the ioport changes.
I did not find the time to look at busybox, but I still think there is
something funny about awk :-).

For the rest, it looks good to me.
Acked-by: David Marchand 


-- 
David Marchand



[dpdk-dev] [PATCH v1] net/ice: support GTPU TEID pattern for switch filter

2021-03-18 Thread Yuying
Enable GTPU pattern for CVL switch filter. This patch only
supports outer l3/l4 filtering.

Signed-off-by: Yuying 
---
 doc/guides/rel_notes/release_21_05.rst |  3 +
 drivers/net/ice/ice_switch_filter.c| 91 ++
 2 files changed, 94 insertions(+)

diff --git a/doc/guides/rel_notes/release_21_05.rst b/doc/guides/rel_notes/release_21_05.rst
index 88e7607a08..8507dc948f 100644
--- a/doc/guides/rel_notes/release_21_05.rst
+++ b/doc/guides/rel_notes/release_21_05.rst
@@ -91,6 +91,9 @@ New Features
   * Added a command line option to configure forced speed for Ethernet port.
 ``dpdk-testpmd -c 0xff  -- -i  --eth-link-speed N``
 
+* **Updated Intel ice driver.**
+
+  * Added GTP TEID support for DCF switch filter.
 
 Removed Items
 -
diff --git a/drivers/net/ice/ice_switch_filter.c b/drivers/net/ice/ice_switch_filter.c
index ada3ecf60b..9147a5fdbe 100644
--- a/drivers/net/ice/ice_switch_filter.c
+++ b/drivers/net/ice/ice_switch_filter.c
@@ -137,6 +137,17 @@
 #define ICE_SW_INSET_MAC_IPV6_PFCP ( \
ICE_SW_INSET_MAC_IPV6 | \
ICE_INSET_PFCP_S_FIELD | ICE_INSET_PFCP_SEID)
+#define ICE_SW_INSET_MAC_IPV4_GTPU ( \
+   ICE_SW_INSET_MAC_IPV4 | ICE_INSET_GTPU_TEID)
+#define ICE_SW_INSET_MAC_IPV4_GTPU_EH ( \
+   ICE_SW_INSET_MAC_IPV4 | ICE_INSET_GTPU_TEID | \
+   ICE_INSET_GTPU_QFI)
+#define ICE_SW_INSET_MAC_IPV6_GTPU ( \
+   ICE_SW_INSET_MAC_IPV6 | ICE_INSET_GTPU_TEID)
+#define ICE_SW_INSET_MAC_IPV6_GTPU_EH ( \
+   ICE_SW_INSET_MAC_IPV6 | ICE_INSET_GTPU_TEID | \
+   ICE_INSET_GTPU_QFI)
+
 
 struct sw_meta {
struct ice_adv_lkup_elem *list;
@@ -198,6 +209,10 @@ ice_pattern_match_item ice_switch_pattern_dist_list[] = {
{pattern_eth_qinq_pppoes_proto, ICE_SW_INSET_MAC_PPPOE_PROTO,   ICE_INSET_NONE, ICE_INSET_NONE},
{pattern_eth_qinq_pppoes_ipv4,  ICE_SW_INSET_MAC_PPPOE_IPV4,ICE_INSET_NONE, ICE_INSET_NONE},
{pattern_eth_qinq_pppoes_ipv6,  ICE_SW_INSET_MAC_PPPOE_IPV6,ICE_INSET_NONE, ICE_INSET_NONE},
+   {pattern_eth_ipv4_gtpu, ICE_SW_INSET_MAC_IPV4_GTPU, ICE_INSET_NONE, ICE_INSET_NONE},
+   {pattern_eth_ipv4_gtpu_eh,  ICE_SW_INSET_MAC_IPV4_GTPU_EH,  ICE_INSET_NONE, ICE_INSET_NONE},
+   {pattern_eth_ipv6_gtpu, ICE_SW_INSET_MAC_IPV6_GTPU, ICE_INSET_NONE, ICE_INSET_NONE},
+   {pattern_eth_ipv6_gtpu_eh,  ICE_SW_INSET_MAC_IPV6_GTPU_EH,  ICE_INSET_NONE, ICE_INSET_NONE},
 };
 
 static struct
@@ -251,6 +266,10 @@ ice_pattern_match_item ice_switch_pattern_perm_list[] = {
{pattern_eth_qinq_pppoes_proto, ICE_SW_INSET_MAC_PPPOE_PROTO,   ICE_INSET_NONE, ICE_INSET_NONE},
{pattern_eth_qinq_pppoes_ipv4,  ICE_SW_INSET_MAC_PPPOE_IPV4,ICE_INSET_NONE, ICE_INSET_NONE},
{pattern_eth_qinq_pppoes_ipv6,  ICE_SW_INSET_MAC_PPPOE_IPV6,ICE_INSET_NONE, ICE_INSET_NONE},
+   {pattern_eth_ipv4_gtpu, ICE_SW_INSET_MAC_IPV4_GTPU, ICE_INSET_NONE, ICE_INSET_NONE},
+   {pattern_eth_ipv4_gtpu_eh,  ICE_SW_INSET_MAC_IPV4_GTPU_EH,  ICE_INSET_NONE, ICE_INSET_NONE},
+   {pattern_eth_ipv6_gtpu, ICE_SW_INSET_MAC_IPV6_GTPU, ICE_INSET_NONE, ICE_INSET_NONE},
+   {pattern_eth_ipv6_gtpu_eh,  ICE_SW_INSET_MAC_IPV6_GTPU_EH,  ICE_INSET_NONE, ICE_INSET_NONE},
 };
 
 static int
@@ -378,6 +397,8 @@ ice_switch_inset_get(const struct rte_flow_item pattern[],
const struct rte_flow_item_ah *ah_spec, *ah_mask;
const struct rte_flow_item_l2tpv3oip *l2tp_spec, *l2tp_mask;
const struct rte_flow_item_pfcp *pfcp_spec, *pfcp_mask;
+   const struct rte_flow_item_gtp *gtp_spec, *gtp_mask;
+   const struct rte_flow_item_gtp_psc *gtp_psc_spec, *gtp_psc_mask;
uint64_t input_set = ICE_INSET_NONE;
uint16_t input_set_byte = 0;
bool pppoe_elem_valid = 0;
@@ -1255,6 +1276,76 @@ ice_switch_inset_get(const struct rte_flow_item pattern[],
}
break;
 
+   case RTE_FLOW_ITEM_TYPE_GTPU:
+   gtp_spec = item->spec;
+   gtp_mask = item->mask;
+   if (gtp_spec && !gtp_mask) {
+   rte_flow_error_set(error, EINVAL,
+   RTE_FLOW_ERROR_TYPE_ITEM,
+   item,
+   "Invalid GTPU item");
+   return 0;
+   }
+   if (gtp_spec && gtp_mask) {
+   if (gtp_mask->v_pt_rsv_flags ||
+   gtp_mask->msg_type ||
+

[dpdk-dev] [PATCH 1/2] [RFC]: ethdev: add pre-defined meter policy API

2021-03-18 Thread Li Zhang
Currently, the flow meter policy does not support multiple actions
per color; also the allowed action types per color are very limited.
In addition, the policy cannot be pre-defined.

Because flow action offload capabilities keep growing, users may want
to apply a different set of actions to each color.
The new meter policy API enables this in the most common ethdev way,
using rte_flow action definitions.
The user provides a list of rte_flow actions per color
in order to create a meter policy.
In addition, the API requires the policy to be pre-defined before
the meters are created, so that a single policy can be shared
efficiently by multiple meters.

meter_policy_id is added to struct rte_mtr_params,
so that a meter gets its policy at creation time.

Policy ID 0 is the default policy. Its actions per color are:
green - no action, yellow - no action, red - drop

Packets can be colored using the new rte_flow_action_color,
as could be done with the old policy API.

The following API functions were added:
- rte_mtr_meter_policy_add
- rte_mtr_meter_policy_delete
- rte_mtr_meter_policy_update
- rte_mtr_meter_policy_validate
The following structs were changed:
- rte_mtr_params
- rte_mtr_capabilities
The following API was deleted:
- rte_mtr_policer_actions_update
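
As an illustration only (not part of this patch), the per-color action
lists and the default policy described above could be modeled as below.
All types and names here are hypothetical stand-ins so that the sketch
compiles without DPDK headers; the only property borrowed from rte_flow
is that an action list is terminated by an END entry.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the rte_flow/rte_mtr types; not the DPDK API. */
enum color { COLOR_GREEN, COLOR_YELLOW, COLOR_RED, COLORS };
enum action_type { ACTION_END, ACTION_DROP, ACTION_COLOR, ACTION_QUEUE };

struct action {
	enum action_type type;
	int arg; /* e.g. queue index or target color; unused for END/DROP */
};

/* Count the actions in one END-terminated list. */
static size_t
action_list_len(const struct action *list)
{
	size_t n = 0;

	while (list != NULL && list[n].type != ACTION_END)
		n++;
	return n;
}

/* Default policy from the text: green and yellow do nothing, red drops. */
static const struct action no_action[] = { { ACTION_END, 0 } };
static const struct action drop_action[] = {
	{ ACTION_DROP, 0 },
	{ ACTION_END, 0 },
};

static const struct action *const default_policy[COLORS] = {
	[COLOR_GREEN] = no_action,
	[COLOR_YELLOW] = no_action,
	[COLOR_RED] = drop_action,
};

/* Return 1 if a packet of color c would be dropped by the policy. */
static int
policy_drops(const struct action *const policy[COLORS], enum color c)
{
	const struct action *a;

	for (a = policy[c]; a != NULL && a->type != ACTION_END; a++)
		if (a->type == ACTION_DROP)
			return 1;
	return 0;
}
```

A single policy array like this can be referenced by any number of
meters, which is the sharing the pre-defined policy is meant to allow.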

Signed-off-by: Li Zhang 
---
 lib/librte_ethdev/rte_flow.h   |  18 
 lib/librte_ethdev/rte_mtr.c|  55 --
 lib/librte_ethdev/rte_mtr.h| 166 -
 lib/librte_ethdev/rte_mtr_driver.h |  45 ++--
 4 files changed, 210 insertions(+), 74 deletions(-)

diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index 669e677e91..5f38aa7fa4 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -31,6 +31,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #ifdef __cplusplus
 extern "C" {
@@ -2236,6 +2237,13 @@ enum rte_flow_action_type {
 * See struct rte_flow_action_modify_field.
 */
RTE_FLOW_ACTION_TYPE_MODIFY_FIELD,
+
+   /**
+* Color the packet to reflect the meter color result.
+*
+* See struct rte_flow_action_color.
+*/
+   RTE_FLOW_ACTION_TYPE_COLOR,
 };
 
 /**
@@ -2828,6 +2836,16 @@ struct rte_flow_action_set_dscp {
  */
 struct rte_flow_shared_action;
 
+/**
+ * RTE_FLOW_ACTION_TYPE_COLOR
+ *
+ * The meter color should be set in the packet meta-data
+ * (i.e. struct rte_mbuf::sched::color).
+ */
+struct rte_flow_action_color {
+   enum rte_color color; /**< Green/Yellow/Red. */
+};
+
 /**
  * Field IDs for MODIFY_FIELD action.
  */
diff --git a/lib/librte_ethdev/rte_mtr.c b/lib/librte_ethdev/rte_mtr.c
index 3073ac03f2..fccec3760b 100644
--- a/lib/librte_ethdev/rte_mtr.c
+++ b/lib/librte_ethdev/rte_mtr.c
@@ -91,6 +91,40 @@ rte_mtr_meter_profile_delete(uint16_t port_id,
meter_profile_id, error);
 }
 
+/* MTR meter policy validate */
+int
+rte_mtr_meter_policy_validate(uint16_t port_id,
+   const struct rte_flow_action *actions[RTE_COLORS],
+   struct rte_mtr_error *error)
+{
+   struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+   return RTE_MTR_FUNC(port_id, meter_policy_validate)(dev,
+   actions, error);
+}
+
+/* MTR meter policy add */
+int
+rte_mtr_meter_policy_add(uint16_t port_id,
+   uint32_t policy_id,
+   const struct rte_flow_action *actions[RTE_COLORS],
+   struct rte_mtr_error *error)
+{
+   struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+   return RTE_MTR_FUNC(port_id, meter_policy_add)(dev,
+   policy_id, actions, error);
+}
+
+/** MTR meter policy delete */
+int
+rte_mtr_meter_policy_delete(uint16_t port_id,
+   uint32_t policy_id,
+   struct rte_mtr_error *error)
+{
+   struct rte_eth_dev *dev = &rte_eth_devices[port_id];
+   return RTE_MTR_FUNC(port_id, meter_policy_delete)(dev,
+   policy_id, error);
+}
+
 /** MTR object create */
 int
 rte_mtr_create(uint16_t port_id,
@@ -149,29 +183,28 @@ rte_mtr_meter_profile_update(uint16_t port_id,
mtr_id, meter_profile_id, error);
 }
 
-/** MTR object meter DSCP table update */
+/** MTR object meter policy update */
 int
-rte_mtr_meter_dscp_table_update(uint16_t port_id,
+rte_mtr_meter_policy_update(uint16_t port_id,
uint32_t mtr_id,
-   enum rte_color *dscp_table,
+   uint32_t meter_policy_id,
struct rte_mtr_error *error)
 {
struct rte_eth_dev *dev = &rte_eth_devices[port_id];
-   return RTE_MTR_FUNC(port_id, meter_dscp_table_update)(dev,
-   mtr_id, dscp_table, error);
+   return RTE_MTR_FUNC(port_id, meter_policy_update)(dev,
+   mtr_id, meter_policy_id, error);
 }
 
-/** MTR object policer action update */
+/** MTR object meter DSCP table update */
 int
-rte_mtr_policer_actions_update(uint16_t port_id,
+rte_mtr_meter_dscp_table_update(uint16_t port_id,
uint32_t mtr_id,
-   uint32_t action_mask,
-   

[dpdk-dev] [PATCH 2/2] [RFC]: ethdev: manage meter API object handles by the drivers

2021-03-18 Thread Li Zhang
Currently, all the meter objects (meter, profile and policy)
are managed by user-supplied IDs.
Hence, each PMD must maintain a data structure in order to
map each API ID to its private PMD management structure.

The application has the full picture of how meters
are going to be assigned to flows, and can easily use direct
mapping even when the meter handle is provided by the PMD.

This is also the approach of the rte_flow API handles:
the flow handle and the shared action handle
are provided by the PMDs.

Use driver handles in order to manage all the meter API objects.

The following APIs will be changed:
- rte_mtr_meter_profile_add
- rte_mtr_meter_profile_delete
- rte_mtr_meter_policy_validate
- rte_mtr_meter_policy_add
- rte_mtr_meter_policy_delete
- rte_mtr_create
- rte_mtr_destroy
- rte_mtr_meter_disable
- rte_mtr_meter_enable
- rte_mtr_meter_profile_update
- rte_mtr_meter_policy_update
- rte_mtr_meter_dscp_table_update
- rte_mtr_stats_update
- rte_mtr_stats_read
The following structs will be changed:
- rte_flow_action_meter
- rte_mtr_params
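
A minimal sketch of the handle-based style, for illustration only: the
struct layout and the function names below are hypothetical and do not
match the proposed rte_mtr prototypes. The point is that create returns
a pointer to the driver-private object, so later calls need no ID
lookup.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical driver-private meter object; applications only hold a pointer. */
struct mtr {
	uint64_t cir; /* committed information rate, illustrative field */
	int enabled;
};

/* "Driver" side: create returns the handle directly, NULL on failure. */
static struct mtr *
mtr_create(uint64_t cir)
{
	struct mtr *m = calloc(1, sizeof(*m));

	if (m == NULL)
		return NULL;
	m->cir = cir;
	m->enabled = 1;
	return m;
}

/* The handle is dereferenced directly; no ID-to-object map is needed. */
static int
mtr_disable(struct mtr *m)
{
	if (m == NULL)
		return -1;
	m->enabled = 0;
	return 0;
}

static void
mtr_destroy(struct mtr *m)
{
	free(m);
}
```

With IDs, every one of these calls would first have to translate the ID
into the private object inside the PMD.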

Signed-off-by: Li Zhang 
---
 lib/librte_ethdev/rte_flow.h   |   9 ++-
 lib/librte_ethdev/rte_mtr.c|  77 --
 lib/librte_ethdev/rte_mtr.h| 102 +++--
 lib/librte_ethdev/rte_mtr_driver.h |  36 +-
 4 files changed, 122 insertions(+), 102 deletions(-)

diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index 5f38aa7fa4..6d2b86592d 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -2480,6 +2480,13 @@ struct rte_flow_action_port_id {
uint32_t id; /**< DPDK port ID. */
 };
 
+/**
+ * Opaque type returned after successfully creating a meter.
+ *
+ * This handle can be used to manage the related meter (e.g. to destroy it).
+ */
+struct rte_mtr;
+
 /**
  * RTE_FLOW_ACTION_TYPE_METER
  *
@@ -2489,7 +2496,7 @@ struct rte_flow_action_port_id {
  * next item with their color set by the MTR object.
  */
 struct rte_flow_action_meter {
-   uint32_t mtr_id; /**< MTR object ID created with rte_mtr_create(). */
+   struct rte_mtr *mtr; /**< MTR object created with rte_mtr_create(). */
 };
 
 /**
diff --git a/lib/librte_ethdev/rte_mtr.c b/lib/librte_ethdev/rte_mtr.c
index fccec3760b..e407c6f956 100644
--- a/lib/librte_ethdev/rte_mtr.c
+++ b/lib/librte_ethdev/rte_mtr.c
@@ -57,6 +57,19 @@ rte_mtr_ops_get(uint16_t port_id, struct rte_mtr_error *error)
ops->func;  \
 })
 
+#define RTE_MTR_FUNC_PTR(port_id, func)\
+({ \
+   const struct rte_mtr_ops *ops = \
+   rte_mtr_ops_get(port_id, error);\
+   if (ops == NULL)\
+   return NULL;\
+   \
+   if (ops->func == NULL)  \
+   return NULL;\
+   \
+   ops->func;  \
+})
+
 /* MTR capabilities get */
 int
 rte_mtr_capabilities_get(uint16_t port_id,
@@ -69,26 +82,25 @@ rte_mtr_capabilities_get(uint16_t port_id,
 }
 
 /* MTR meter profile add */
-int
+struct rte_mtr_profile *
 rte_mtr_meter_profile_add(uint16_t port_id,
-   uint32_t meter_profile_id,
struct rte_mtr_meter_profile *profile,
struct rte_mtr_error *error)
 {
struct rte_eth_dev *dev = &rte_eth_devices[port_id];
-   return RTE_MTR_FUNC(port_id, meter_profile_add)(dev,
-   meter_profile_id, profile, error);
+   return RTE_MTR_FUNC_PTR(port_id, meter_profile_add)(dev,
+   profile, error);
 }
 
 /** MTR meter profile delete */
 int
 rte_mtr_meter_profile_delete(uint16_t port_id,
-   uint32_t meter_profile_id,
+   struct rte_mtr_profile *profile,
struct rte_mtr_error *error)
 {
struct rte_eth_dev *dev = &rte_eth_devices[port_id];
return RTE_MTR_FUNC(port_id, meter_profile_delete)(dev,
-   meter_profile_id, error);
+   profile, error);
 }
 
 /* MTR meter policy validate */
@@ -103,126 +115,123 @@ rte_mtr_meter_policy_validate(uint16_t port_id,
 }
 
 /* MTR meter policy add */
-int
+struct rte_mtr_policy *
 rte_mtr_meter_policy_add(uint16_t port_id,
-   uint32_t policy_id,
const struct rte_flow_action *actions[RTE_COLORS],
struct rte_mtr_error *error)
 {
struct rte_eth_dev *dev = &rte_eth_devices[port_id];
-   return RTE_MTR_FUNC(port_id, meter_policy_add)(dev,
-   policy_id, actions, error);
+   return RTE_MTR_FUNC_PTR(port_id, meter_policy_add)(dev,
+   actions, error);
 }
 
 /** MTR meter policy delete */
 int
 rte_mtr_meter_policy_delete(uint16_t port_id,
-   uint32_t policy_id,
+   struct rte_mtr_policy *policy,
struct rte_mtr

Re: [dpdk-dev] [PATCH v2 1/5] devargs: fix memory leak on parsing error

2021-03-18 Thread Thomas Monjalon
18/01/2021 16:16, Xueming Li:
> --- a/lib/librte_eal/common/eal_common_devargs.c
> +++ b/lib/librte_eal/common/eal_common_devargs.c
> + if (ret != 0) {
> + if (devargs->data && devargs->data != devstr) {

Better to make comparison explicit:
if (devargs->data != NULL

> + /* Free duplicated data. */
> + free(devargs->data);

Before patch 2, devargs->data is const,
so we cannot free (compilation error).
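
To make both remarks concrete, here is a small sketch. It assumes a
trimmed-down, hypothetical struct rather than the real rte_devargs, and
the parsing failure is simulated by a flag. The data field is a plain
char * so that free() accepts it, and the NULL test is written
explicitly.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-in for struct rte_devargs. */
struct devargs_sketch {
	char *data; /* non-const, so it may be passed to free() */
};

/*
 * Duplicate devstr into da->data, then pretend parsing failed when
 * 'fail' is nonzero. Returns 0 on success, -1 on error.
 */
static int
devargs_parse_sketch(struct devargs_sketch *da, const char *devstr, int fail)
{
	size_t len = strlen(devstr) + 1;
	int ret = 0;

	da->data = malloc(len);
	if (da->data == NULL)
		return -1;
	memcpy(da->data, devstr, len);

	if (fail)
		ret = -1; /* simulated parsing error */

	if (ret != 0) {
		/* Explicit NULL comparison, as suggested above. */
		if (da->data != NULL && da->data != devstr) {
			/* Free duplicated data. */
			free(da->data);
			da->data = NULL;
		}
	}
	return ret;
}
```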





[dpdk-dev] [PATCH 1/2] net/bnxt: fix link state operations

2021-03-18 Thread Kalesh A P
From: Kalesh AP 

VFs do not have the privilege to change link configuration.
But the driver silently returns success to these ethdev callbacks
without actually issuing the HWRM command to bring the link up/down.
Return -ENOTSUP from these callbacks when the port is not a single PF.
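
The behaviour change can be sketched in isolation as follows. The
struct and function names are hypothetical; the single_pf flag stands
in for the BNXT_SINGLE_PF() check used in the diff below.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical port state; single_pf stands in for BNXT_SINGLE_PF(bp). */
struct port_sketch {
	bool single_pf;
	bool link_up;
};

/* Report unsupported instead of silently returning success. */
static int
set_link_up_sketch(struct port_sketch *p)
{
	if (!p->single_pf)
		return -ENOTSUP;

	p->link_up = true; /* the real driver issues an HWRM command here */
	return 0;
}
```

A caller can then tell "link brought up" apart from "operation not
available on this function" by checking the return value.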

Fixes: 5c206086feaa ("net/bnxt: add link state operations")
Cc: sta...@dpdk.org

Signed-off-by: Kalesh AP 
Reviewed-by: Somnath Kotur 
Reviewed-by: Ajit Kumar Khaparde 
---
 drivers/net/bnxt/bnxt_ethdev.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index 1997783..3665f31 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -1268,6 +1268,9 @@ static int bnxt_dev_set_link_up_op(struct rte_eth_dev *eth_dev)
struct bnxt *bp = eth_dev->data->dev_private;
int rc = 0;
 
+   if (!BNXT_SINGLE_PF(bp))
+   return -ENOTSUP;
+
if (!bp->link_info->link_up)
rc = bnxt_set_hwrm_link_config(bp, true);
if (!rc)
@@ -1281,6 +1284,9 @@ static int bnxt_dev_set_link_down_op(struct rte_eth_dev *eth_dev)
 {
struct bnxt *bp = eth_dev->data->dev_private;
 
+   if (!BNXT_SINGLE_PF(bp))
+   return -ENOTSUP;
+
eth_dev->data->dev_link.link_status = 0;
bnxt_set_hwrm_link_config(bp, false);
bp->link_info->link_up = 0;
-- 
2.10.1



[dpdk-dev] [PATCH 2/2] net/bnxt: fix unsupported handling in PTP

2021-03-18 Thread Kalesh A P
From: Kalesh AP 

Return an error when PTP is not supported on the port.
Also, remove an unnecessary check inside bnxt_get_rx_ts().

Fixes: b11cceb83a34 ("net/bnxt: support timesync")
Cc: sta...@dpdk.org

Signed-off-by: Kalesh AP 
Reviewed-by: Ajit Kumar Khaparde 
Reviewed-by: Somnath Kotur 
---
 drivers/net/bnxt/bnxt_ethdev.c | 17 +++--
 1 file changed, 7 insertions(+), 10 deletions(-)

diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index 3665f31..4fd9653 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -3395,9 +3395,6 @@ static int bnxt_get_rx_ts(struct bnxt *bp, uint64_t *ts)
uint16_t port_id;
uint32_t fifo;
 
-   if (!ptp)
-   return -ENODEV;
-
fifo = rte_le_to_cpu_32(rte_read32((uint8_t *)bp->bar0 +
ptp->rx_mapped_regs[BNXT_PTP_RX_FIFO]));
if (!(fifo & BNXT_PTP_RX_FIFO_PENDING))
@@ -3430,7 +3427,7 @@ bnxt_timesync_write_time(struct rte_eth_dev *dev, const struct timespec *ts)
struct bnxt_ptp_cfg *ptp = bp->ptp_cfg;
 
if (!ptp)
-   return 0;
+   return -ENOTSUP;
 
ns = rte_timespec_to_ns(ts);
/* Set the timecounters to a new value. */
@@ -3450,7 +3447,7 @@ bnxt_timesync_read_time(struct rte_eth_dev *dev, struct timespec *ts)
int rc = 0;
 
if (!ptp)
-   return 0;
+   return -ENOTSUP;
 
if (BNXT_CHIP_P5(bp))
rc = bnxt_hwrm_port_ts_query(bp, BNXT_PTP_FLAGS_CURRENT_TIME,
@@ -3472,7 +3469,7 @@ bnxt_timesync_enable(struct rte_eth_dev *dev)
int rc;
 
if (!ptp)
-   return 0;
+   return -ENOTSUP;
 
ptp->rx_filter = 1;
ptp->tx_tstamp_en = 1;
@@ -3513,7 +3510,7 @@ bnxt_timesync_disable(struct rte_eth_dev *dev)
struct bnxt_ptp_cfg *ptp = bp->ptp_cfg;
 
if (!ptp)
-   return 0;
+   return -ENOTSUP;
 
ptp->rx_filter = 0;
ptp->tx_tstamp_en = 0;
@@ -3540,7 +3537,7 @@ bnxt_timesync_read_rx_timestamp(struct rte_eth_dev *dev,
uint64_t ns;
 
if (!ptp)
-   return 0;
+   return -ENOTSUP;
 
if (BNXT_CHIP_P5(bp))
rx_tstamp_cycles = ptp->rx_timestamp;
@@ -3563,7 +3560,7 @@ bnxt_timesync_read_tx_timestamp(struct rte_eth_dev *dev,
int rc = 0;
 
if (!ptp)
-   return 0;
+   return -ENOTSUP;
 
if (BNXT_CHIP_P5(bp))
rc = bnxt_hwrm_port_ts_query(bp, BNXT_PTP_FLAGS_PATH_TX,
@@ -3584,7 +3581,7 @@ bnxt_timesync_adjust_time(struct rte_eth_dev *dev, int64_t delta)
struct bnxt_ptp_cfg *ptp = bp->ptp_cfg;
 
if (!ptp)
-   return 0;
+   return -ENOTSUP;
 
ptp->tc.nsec += delta;
ptp->tx_tstamp_tc.nsec += delta;
-- 
2.10.1



[dpdk-dev] [PATCH] vdpa/mlx5: improve interrupt management

2021-03-18 Thread Matan Azrad
The driver should notify the guest for each traffic burst detected by CQ
polling.

The CQ polling trigger is defined by the `event_mode` device argument:
either busy polling on all the CQs or a blocking call on the HW
completion event using the DevX channel.

Also, the polling event modes can switch to the blocking call when the
traffic rate is low.

The current blocking call uses the EAL interrupt API, which suffers
from a lot of overhead in its management and serves all the drivers
and libraries from a single thread.

Use a blocking FD of the DevX channel in order to do the blocking call
directly through the DevX channel FD mechanism.
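
A rough sketch of the switch from polling to the blocking call, under
stated assumptions: the field names no_traffic_counter and
no_traffic_max come from this patch, but the state machine, the
function name and the exact threshold comparison are guesses for
illustration and are not the driver code.

```c
#include <stdint.h>

enum poll_state { POLLING, BLOCKED };

/* Only the two counters are named after fields in the patch. */
struct poller {
	uint64_t no_traffic_counter; /* consecutive empty polling cycles */
	uint32_t no_traffic_max;     /* threshold; 0 keeps polling forever */
	enum poll_state state;
};

/*
 * Account for one polling cycle that found 'completions' CQ entries.
 * Any traffic resets the counter; enough empty cycles in a row move
 * the poller to the blocking mode (assumed comparison: >=).
 */
static enum poll_state
poll_cycle(struct poller *p, uint32_t completions)
{
	if (completions > 0) {
		p->no_traffic_counter = 0;
		p->state = POLLING;
	} else if (p->no_traffic_max != 0 &&
		   ++p->no_traffic_counter >= p->no_traffic_max) {
		p->state = BLOCKED; /* wait on the channel FD from here on */
	}
	return p->state;
}
```

Counting cycles instead of seconds is what lets the no_traffic_time
option be expressed in polling cycle units.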

Signed-off-by: Matan Azrad 
Acked-by: Xueming Li 
---
 doc/guides/vdpadevs/mlx5.rst|   8 +-
 drivers/vdpa/mlx5/mlx5_vdpa.c   |   8 +-
 drivers/vdpa/mlx5/mlx5_vdpa.h   |   8 +-
 drivers/vdpa/mlx5/mlx5_vdpa_event.c | 308 +++-
 4 files changed, 134 insertions(+), 198 deletions(-)

diff --git a/doc/guides/vdpadevs/mlx5.rst b/doc/guides/vdpadevs/mlx5.rst
index 1f2ae6f..9b2f9f1 100644
--- a/doc/guides/vdpadevs/mlx5.rst
+++ b/doc/guides/vdpadevs/mlx5.rst
@@ -129,10 +129,10 @@ Driver options
 
 - ``no_traffic_time`` parameter [int]
 
-  A nonzero value defines the traffic off time, in seconds, that moves the
-  driver to no-traffic mode. In this mode the timer events are stopped and
-  interrupts are configured to the device in order to notify traffic for the
-  driver. Default value is 2s.
+  A nonzero value defines the traffic off time, in polling cycle time units,
+  that moves the driver to no-traffic mode. In this mode the polling is stopped
+  and interrupts are configured to the device in order to notify traffic for the
+  driver. Default value is 16.
 
 - ``event_core`` parameter [int]
 
diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.c b/drivers/vdpa/mlx5/mlx5_vdpa.c
index 4c2d886..5d70880 100644
--- a/drivers/vdpa/mlx5/mlx5_vdpa.c
+++ b/drivers/vdpa/mlx5/mlx5_vdpa.c
@@ -44,7 +44,7 @@
 
 #define MLX5_VDPA_MAX_RETRIES 20
 #define MLX5_VDPA_USEC 1000
-#define MLX5_VDPA_DEFAULT_NO_TRAFFIC_TIME_S 2LLU
+#define MLX5_VDPA_DEFAULT_NO_TRAFFIC_MAX 16LLU
 
 TAILQ_HEAD(mlx5_vdpa_privs, mlx5_vdpa_priv) priv_list =
  TAILQ_HEAD_INITIALIZER(priv_list);
@@ -632,7 +632,7 @@
} else if (strcmp(key, "event_us") == 0) {
priv->event_us = (uint32_t)tmp;
} else if (strcmp(key, "no_traffic_time") == 0) {
-   priv->no_traffic_time_s = (uint32_t)tmp;
+   priv->no_traffic_max = (uint32_t)tmp;
} else if (strcmp(key, "event_core") == 0) {
if (tmp >= (unsigned long)n_cores)
DRV_LOG(WARNING, "Invalid event_core %s.", val);
@@ -658,7 +658,7 @@
priv->event_mode = MLX5_VDPA_EVENT_MODE_FIXED_TIMER;
priv->event_us = 0;
priv->event_core = -1;
-   priv->no_traffic_time_s = MLX5_VDPA_DEFAULT_NO_TRAFFIC_TIME_S;
+   priv->no_traffic_max = MLX5_VDPA_DEFAULT_NO_TRAFFIC_MAX;
if (devargs == NULL)
return;
kvlist = rte_kvargs_parse(devargs->args, NULL);
@@ -671,7 +671,7 @@
priv->event_us = MLX5_VDPA_DEFAULT_TIMER_STEP_US;
DRV_LOG(DEBUG, "event mode is %d.", priv->event_mode);
DRV_LOG(DEBUG, "event_us is %u us.", priv->event_us);
-   DRV_LOG(DEBUG, "no traffic time is %u s.", priv->no_traffic_time_s);
+   DRV_LOG(DEBUG, "no traffic max is %u.", priv->no_traffic_max);
 }
 
 /**
diff --git a/drivers/vdpa/mlx5/mlx5_vdpa.h b/drivers/vdpa/mlx5/mlx5_vdpa.h
index 98c71aa..e4c8575 100644
--- a/drivers/vdpa/mlx5/mlx5_vdpa.h
+++ b/drivers/vdpa/mlx5/mlx5_vdpa.h
@@ -120,16 +120,13 @@ struct mlx5_vdpa_priv {
TAILQ_ENTRY(mlx5_vdpa_priv) next;
uint8_t configured;
pthread_mutex_t vq_config_lock;
-   uint64_t last_traffic_tic;
+   uint64_t no_traffic_counter;
pthread_t timer_tid;
-   pthread_mutex_t timer_lock;
-   pthread_cond_t timer_cond;
-   volatile uint8_t timer_on;
int event_mode;
int event_core; /* Event thread cpu affinity core. */
uint32_t event_us;
uint32_t timer_delay_us;
-   uint32_t no_traffic_time_s;
+   uint32_t no_traffic_max;
uint8_t hw_latency_mode; /* Hardware CQ moderation mode. */
uint16_t hw_max_latency_us; /* Hardware CQ moderation period in usec. */
uint16_t hw_max_pending_comp; /* Hardware CQ moderation counter. */
@@ -146,7 +143,6 @@ struct mlx5_vdpa_priv {
struct mlx5dv_devx_event_channel *eventc;
struct mlx5dv_devx_event_channel *err_chnl;
struct mlx5dv_devx_uar *uar;
-   struct rte_intr_handle intr_handle;
struct rte_intr_handle err_intr_handle;
struct mlx5_devx_obj *td;
struct mlx5_devx_obj *tiss[16]; /* TIS list for each LAG port. */
diff --git a/drivers/vdpa/mlx5/mlx5_vdpa_event.c b/drivers/vdpa/mlx5/mlx5_vdpa_event.c
index 86adc86..64a1753 100644
--- a/drivers/vdpa/mlx5/mlx5_vdpa

Re: [dpdk-dev] [PATCH v2 2/5] devargs: refactor scratch buffer storage

2021-03-18 Thread Thomas Monjalon
18/01/2021 16:16, Xueming Li:
> In current design, legacy parser rte_devargs_parse() saved scratch
> buffer to devargs.args while new parser rte_devargs_layers_parse() saved
> to devargs.data. Code using devargs had to know the difference and
> cleaned up memory accordingly - error prone.
> 
> This patch unifies the storage into a dedicated scratch buffer and
> introduces the rte_devargs_free() function to wrap the memory cleanup.
> 
> Signed-off-by: Xueming Li 
> ---
> --- a/lib/librte_eal/include/rte_devargs.h
> +++ b/lib/librte_eal/include/rte_devargs.h
> @@ -60,16 +60,16 @@ struct rte_devargs {
>   /** Name of the device. */
>   char name[RTE_DEV_NAME_MAX_LEN];
>   RTE_STD_C11
> - union {
> - /** Arguments string as given by user or "" for no argument. */
> - char *args;
> + union { /**< driver-related part of device string. */
> + const char *args; /**< legacy name. */
>   const char *drv_str;
>   };
>   struct rte_bus *bus; /**< bus handle. */
>   struct rte_class *cls; /**< class handle. */
>   const char *bus_str; /**< bus-related part of device string. */
>   const char *cls_str; /**< class-related part of device string. */
> - const char *data; /**< Device string storage. */
> + char *data; /**< Scratch buffer. */
> + const char *src; /**< Arguments given by user. */

Adding a field changes the size of the struct, which is an ABI break.
We need to plan this change for DPDK 21.11.
Let's think what can be done in the meantime.
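
To illustrate the size change, compare two simplified layouts. These
are hypothetical structs, not the real rte_devargs definitions, and the
name length of 64 is arbitrary.

```c
#include <stddef.h>

/* Simplified "before" layout (hypothetical). */
struct devargs_before {
	char name[64];
	const char *args;
	const char *data;
};

/* Simplified "after" layout: data loses const and src is appended. */
struct devargs_after {
	char name[64];
	const char *args;
	char *data;
	const char *src;
};

/* Return 1 if the new layout is larger than the old one. */
static int
devargs_size_grew(void)
{
	return sizeof(struct devargs_after) > sizeof(struct devargs_before);
}
```

Existing members keep their offsets here, but code compiled against the
old header still allocates and copies the smaller struct, which is why
the extra field counts as an ABI break.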




[dpdk-dev] [PATCH 1/6] net/ngbe: add build and doc infrastructure

2021-03-18 Thread Jiawen Wu
Add bare minimum PMD library and doc build infrastructure
and claim maintainership of the ngbe PMD.

Signed-off-by: Jiawen Wu 
---
 MAINTAINERS|  6 ++
 doc/guides/nics/features/ngbe.ini  | 10 +
 doc/guides/nics/index.rst  |  1 +
 doc/guides/nics/ngbe.rst   | 28 ++
 doc/guides/rel_notes/release_21_05.rst |  6 ++
 drivers/net/meson.build|  1 +
 drivers/net/ngbe/meson.build   | 12 +++
 drivers/net/ngbe/ngbe_ethdev.c |  4 
 drivers/net/ngbe/ngbe_ethdev.h |  4 
 drivers/net/ngbe/version.map   |  3 +++
 10 files changed, 75 insertions(+)
 create mode 100644 doc/guides/nics/features/ngbe.ini
 create mode 100644 doc/guides/nics/ngbe.rst
 create mode 100644 drivers/net/ngbe/meson.build
 create mode 100644 drivers/net/ngbe/ngbe_ethdev.c
 create mode 100644 drivers/net/ngbe/ngbe_ethdev.h
 create mode 100644 drivers/net/ngbe/version.map

diff --git a/MAINTAINERS b/MAINTAINERS
index e341bc81d..a611b6595 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -886,6 +886,12 @@ F: drivers/net/txgbe/
 F: doc/guides/nics/txgbe.rst
 F: doc/guides/nics/features/txgbe.ini
 
+Wangxun ngbe
+M: Jiawen Wu 
+F: drivers/net/ngbe/
+F: doc/guides/nics/ngbe.rst
+F: doc/guides/nics/features/ngbe.ini
+
 VMware vmxnet3
 M: Yong Wang 
 F: drivers/net/vmxnet3/
diff --git a/doc/guides/nics/features/ngbe.ini b/doc/guides/nics/features/ngbe.ini
new file mode 100644
index 0..a7a524def
--- /dev/null
+++ b/doc/guides/nics/features/ngbe.ini
@@ -0,0 +1,10 @@
+;
+; Supported features of the 'ngbe' network poll mode driver.
+;
+; Refer to default.ini for the full list of available PMD features.
+;
+[Features]
+Linux= Y
+ARMv8= Y
+x86-32   = Y
+x86-64   = Y
diff --git a/doc/guides/nics/index.rst b/doc/guides/nics/index.rst
index 799697caf..31a3e6bcd 100644
--- a/doc/guides/nics/index.rst
+++ b/doc/guides/nics/index.rst
@@ -47,6 +47,7 @@ Network Interface Controller Drivers
 netvsc
 nfb
 nfp
+ngbe
 null
 octeontx
 octeontx2
diff --git a/doc/guides/nics/ngbe.rst b/doc/guides/nics/ngbe.rst
new file mode 100644
index 0..007d8e80e
--- /dev/null
+++ b/doc/guides/nics/ngbe.rst
@@ -0,0 +1,28 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+Copyright(c) 2018-2020.
+
+NGBE Poll Mode Driver
+=====================
+
+The NGBE PMD (librte_pmd_ngbe) provides poll mode driver support
+for Wangxun 1 Gigabit Ethernet NICs.
+
+Prerequisites
+-------------
+
+- Learning about Wangxun 1 Gigabit Ethernet NICs using
+  ``_.
+
+- Follow the DPDK :ref:`Getting Started Guide for Linux ` to setup the basic DPDK environment.
+
+Driver compilation and testing
+------------------------------
+
+Refer to the document :ref:`compiling and testing a PMD for a NIC `
+for details.
+
+Limitations or Known issues
+---------------------------
+
+Build with ICC is not supported yet.
+Power8, ARMv7 and BSD are not supported yet.
diff --git a/doc/guides/rel_notes/release_21_05.rst b/doc/guides/rel_notes/release_21_05.rst
index 21dc6d234..c23b14970 100644
--- a/doc/guides/rel_notes/release_21_05.rst
+++ b/doc/guides/rel_notes/release_21_05.rst
@@ -76,6 +76,12 @@ New Features
 
   * Added support for txgbevf PMD.
 
+* **Added Wangxun ngbe PMD.**
+
+  Added a new PMD driver for Wangxun 1 Gigabit Ethernet NICs.
+
+  See the :doc:`../nics/ngbe` for more details.
+
 * **Updated testpmd.**
 
   * Added a command line option to configure forced speed for Ethernet port.
diff --git a/drivers/net/meson.build b/drivers/net/meson.build
index fb9ff05a1..d1baa2842 100644
--- a/drivers/net/meson.build
+++ b/drivers/net/meson.build
@@ -36,6 +36,7 @@ drivers = ['af_packet',
'netvsc',
'nfb',
'nfp',
+   'ngbe',
'null',
'octeontx',
'octeontx2',
diff --git a/drivers/net/ngbe/meson.build b/drivers/net/ngbe/meson.build
new file mode 100644
index 0..d6388d061
--- /dev/null
+++ b/drivers/net/ngbe/meson.build
@@ -0,0 +1,12 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2018-2020
+
+if is_windows
+   build = false
+   reason = 'not supported on Windows'
+   subdir_done()
+endif
+
+sources = files(
+   'ngbe_ethdev.c',
+)
diff --git a/drivers/net/ngbe/ngbe_ethdev.c b/drivers/net/ngbe/ngbe_ethdev.c
new file mode 100644
index 0..e2756315a
--- /dev/null
+++ b/drivers/net/ngbe/ngbe_ethdev.c
@@ -0,0 +1,4 @@
+ /* SPDX-License-Identifier: BSD-3-Clause
+  * Copyright(c) 2018-2020
+  */
+
diff --git a/drivers/net/ngbe/ngbe_ethdev.h b/drivers/net/ngbe/ngbe_ethdev.h
new file mode 100644
index 0..20f37e9d4
--- /dev/null
+++ b/drivers/net/ngbe/ngbe_ethdev.h
@@ -0,0 +1,4 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018-2020
+ */
+
diff --git a/drivers/net/ngbe/version.map b/drivers/net/ngbe/version.map
new file mode 

[dpdk-dev] [PATCH 2/6] net/ngbe: add device IDs

2021-03-18 Thread Jiawen Wu
Add device IDs for Wangxun 1Gb NICs, and register rte_ngbe_pmd.

Signed-off-by: Jiawen Wu 
---
 drivers/net/ngbe/base/meson.build   | 18 +++
 drivers/net/ngbe/base/ngbe_devids.h | 83 +
 drivers/net/ngbe/meson.build|  6 +++
 drivers/net/ngbe/ngbe_ethdev.c  | 33 
 4 files changed, 140 insertions(+)
 create mode 100644 drivers/net/ngbe/base/meson.build
 create mode 100644 drivers/net/ngbe/base/ngbe_devids.h

diff --git a/drivers/net/ngbe/base/meson.build b/drivers/net/ngbe/base/meson.build
new file mode 100644
index 0..b4fc6a53b
--- /dev/null
+++ b/drivers/net/ngbe/base/meson.build
@@ -0,0 +1,18 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2018-2020
+
+sources = []
+
+error_cflags = []
+
+c_args = cflags
+foreach flag: error_cflags
+   if cc.has_argument(flag)
+   c_args += flag
+   endif
+endforeach
+
+base_lib = static_library('ngbe_base', sources,
+   dependencies: [static_rte_eal, static_rte_ethdev, static_rte_bus_pci],
+   c_args: c_args)
+base_objs = base_lib.extract_all_objects()
diff --git a/drivers/net/ngbe/base/ngbe_devids.h b/drivers/net/ngbe/base/ngbe_devids.h
new file mode 100644
index 0..79967b9fe
--- /dev/null
+++ b/drivers/net/ngbe/base/ngbe_devids.h
@@ -0,0 +1,83 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018-2020
+ */
+
+#ifndef _NGBE_DEVIDS_H_
+#define _NGBE_DEVIDS_H_
+
+/*
+ * Vendor ID
+ */
+#ifndef PCI_VENDOR_ID_WANGXUN
+#define PCI_VENDOR_ID_WANGXUN   0x8088
+#endif
+
+/*
+ * Device IDs
+ */
+#define NGBE_DEV_ID_EM_VF  0x0110
+#define   NGBE_SUB_DEV_ID_EM_VF0x0110
+#define NGBE_DEV_ID_EM 0x0100
+#define   NGBE_SUB_DEV_ID_EM_MVL_RGMII 0x0200
+#define   NGBE_SUB_DEV_ID_EM_MVL_SFP   0x0403
+#define   NGBE_SUB_DEV_ID_EM_RTL_SGMII 0x0410
+#define   NGBE_SUB_DEV_ID_EM_YT8521S_SFP   0x0460
+
+#define NGBE_DEV_ID_EM_WX1860AL_W  0x0100
+#define NGBE_DEV_ID_EM_WX1860AL_W_VF   0x0110
+#define NGBE_DEV_ID_EM_WX1860A20x0101
+#define NGBE_DEV_ID_EM_WX1860A2_VF 0x0111
+#define NGBE_DEV_ID_EM_WX1860A2S   0x0102
+#define NGBE_DEV_ID_EM_WX1860A2S_VF0x0112
+#define NGBE_DEV_ID_EM_WX1860A40x0103
+#define NGBE_DEV_ID_EM_WX1860A4_VF 0x0113
+#define NGBE_DEV_ID_EM_WX1860A4S   0x0104
+#define NGBE_DEV_ID_EM_WX1860A4S_VF0x0114
+#define NGBE_DEV_ID_EM_WX1860AL2   0x0105
+#define NGBE_DEV_ID_EM_WX1860AL2_VF0x0115
+#define NGBE_DEV_ID_EM_WX1860AL2S  0x0106
+#define NGBE_DEV_ID_EM_WX1860AL2S_VF   0x0116
+#define NGBE_DEV_ID_EM_WX1860AL4   0x0107
+#define NGBE_DEV_ID_EM_WX1860AL4_VF0x0117
+#define NGBE_DEV_ID_EM_WX1860AL4S  0x0108
+#define NGBE_DEV_ID_EM_WX1860AL4S_VF   0x0118
+#define NGBE_DEV_ID_EM_WX1860NCSI  0x0109
+#define NGBE_DEV_ID_EM_WX1860NCSI_VF   0x0119
+#define NGBE_DEV_ID_EM_WX1860A10x010A
+#define NGBE_DEV_ID_EM_WX1860A1_VF 0x011A
+#define NGBE_DEV_ID_EM_WX1860A1L   0x010B
+#define NGBE_DEV_ID_EM_WX1860A1L_VF0x011B
+#define   NGBE_SUB_DEV_ID_EM_ZTE5201_RJ45  0x0100
+#define   NGBE_SUB_DEV_ID_EM_SF100F_LP 0x0103
+#define   NGBE_SUB_DEV_ID_EM_M88E1512_RJ45 0x0200
+#define   NGBE_SUB_DEV_ID_EM_SF100HT   0x0102
+#define   NGBE_SUB_DEV_ID_EM_SF200T0x0201
+#define   NGBE_SUB_DEV_ID_EM_SF200HT   0x0202
+#define   NGBE_SUB_DEV_ID_EM_SF200T_S  0x0210
+#define   NGBE_SUB_DEV_ID_EM_SF200HT_S 0x0220
+#define   NGBE_SUB_DEV_ID_EM_SF200HXT  0x0230
+#define   NGBE_SUB_DEV_ID_EM_SF400T0x0401
+#define   NGBE_SUB_DEV_ID_EM_SF400HT   0x0402
+#define   NGBE_SUB_DEV_ID_EM_M88E1512_SFP  0x0403
+#define   NGBE_SUB_DEV_ID_EM_SF400T_S  0x0410
+#define   NGBE_SUB_DEV_ID_EM_SF400HT_S 0x0420
+#define   NGBE_SUB_DEV_ID_EM_SF400HXT  0x0430
+#define   NGBE_SUB_DEV_ID_EM_SF400_OCP 0x0440
+#define   NGBE_SUB_DEV_ID_EM_SF400_LY  0x0450
+#define   NGBE_SUB_DEV_ID_EM_SF400_LY_YT   0x0470
+
+/* Assign excessive id with masks */
+#define NGBE_INTERNAL_MASK 0x000F
+#define NGBE_OEM_MASK  0x00F0
+#define NGBE_WOL_SUP_MASK  0x4000
+#define NGBE_NCSI_SUP_MASK 0x8000
+
+#define NGBE_INTERNAL_SFP  0x0003
+#define NGBE_OCP_CARD  0x0040
+#define NGBE_LY_M88E1512_SFP   0x0050
+#define NGBE_YT8521S_SFP   0x0060
+#define NGBE_LY_YT8521S_SFP0x0070
+#define NGBE_WOL_SUP   0x4000
+#define NGBE_NCSI_SUP  0x8000
+
+#endif /* _NGBE_DEVIDS_H_ */
dif

[dpdk-dev] [PATCH 0/6] net: ngbe PMD

2021-03-18 Thread Jiawen Wu
This patch set provides a skeleton of the ngbe PMD,
which is adapted to Wangxun WX1860 series NICs.

Jiawen Wu (6):
  net/ngbe: add build and doc infrastructure
  net/ngbe: add device IDs
  net/ngbe: support probe and remove
  net/ngbe: add device init and uninit
  net/ngbe: add log type and error type
  net/ngbe: define registers

 MAINTAINERS|6 +
 doc/guides/nics/features/ngbe.ini  |   11 +
 doc/guides/nics/index.rst  |1 +
 doc/guides/nics/ngbe.rst   |   69 ++
 doc/guides/rel_notes/release_21_05.rst |6 +
 drivers/net/meson.build|1 +
 drivers/net/ngbe/base/meson.build  |   20 +
 drivers/net/ngbe/base/ngbe.h   |   11 +
 drivers/net/ngbe/base/ngbe_devids.h|   83 ++
 drivers/net/ngbe/base/ngbe_hw.c|   59 +
 drivers/net/ngbe/base/ngbe_hw.h|   12 +
 drivers/net/ngbe/base/ngbe_osdep.h |  172 +++
 drivers/net/ngbe/base/ngbe_regs.h  | 1489 
 drivers/net/ngbe/base/ngbe_status.h|  124 ++
 drivers/net/ngbe/base/ngbe_type.h  |   30 +
 drivers/net/ngbe/meson.build   |   18 +
 drivers/net/ngbe/ngbe_ethdev.c |  160 +++
 drivers/net/ngbe/ngbe_ethdev.h |   21 +
 drivers/net/ngbe/ngbe_logs.h   |   54 +
 drivers/net/ngbe/version.map   |3 +
 20 files changed, 2350 insertions(+)
 create mode 100644 doc/guides/nics/features/ngbe.ini
 create mode 100644 doc/guides/nics/ngbe.rst
 create mode 100644 drivers/net/ngbe/base/meson.build
 create mode 100644 drivers/net/ngbe/base/ngbe.h
 create mode 100644 drivers/net/ngbe/base/ngbe_devids.h
 create mode 100644 drivers/net/ngbe/base/ngbe_hw.c
 create mode 100644 drivers/net/ngbe/base/ngbe_hw.h
 create mode 100644 drivers/net/ngbe/base/ngbe_osdep.h
 create mode 100644 drivers/net/ngbe/base/ngbe_regs.h
 create mode 100644 drivers/net/ngbe/base/ngbe_status.h
 create mode 100644 drivers/net/ngbe/base/ngbe_type.h
 create mode 100644 drivers/net/ngbe/meson.build
 create mode 100644 drivers/net/ngbe/ngbe_ethdev.c
 create mode 100644 drivers/net/ngbe/ngbe_ethdev.h
 create mode 100644 drivers/net/ngbe/ngbe_logs.h
 create mode 100644 drivers/net/ngbe/version.map

-- 
2.21.0.windows.1





[dpdk-dev] [PATCH 4/6] net/ngbe: add device init and uninit

2021-03-18 Thread Jiawen Wu
Add basic init and uninit functions.
Map device IDs and subsystem IDs to a single ID for easy operation.
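
For illustration, the mask-based split of a subsystem ID can be
sketched as below. The mask and ID values are copied from
ngbe_devids.h in patch 2/6; the struct and the helper name are
hypothetical and are not part of the driver.

```c
#include <stdint.h>

/* Values taken from ngbe_devids.h (patch 2/6). */
#define NGBE_INTERNAL_MASK 0x000F
#define NGBE_OEM_MASK      0x00F0
#define NGBE_INTERNAL_SFP  0x0003
#define NGBE_OCP_CARD      0x0040

/* Hypothetical container for the two fields of a subsystem ID. */
struct sub_id_fields {
	uint16_t oem;      /* bits 7:4, kept in place (not shifted) */
	uint16_t internal; /* bits 3:0 */
};

/* Split a subsystem ID the same way ngbe_map_device_id() masks it. */
static struct sub_id_fields
split_sub_id(uint16_t sub_system_id)
{
	struct sub_id_fields f;

	f.oem = sub_system_id & NGBE_OEM_MASK;
	f.internal = sub_system_id & NGBE_INTERNAL_MASK;
	return f;
}
```

For example, 0x0440 (NGBE_SUB_DEV_ID_EM_SF400_OCP) yields an OEM field
equal to NGBE_OCP_CARD, and 0x0403 (NGBE_SUB_DEV_ID_EM_M88E1512_SFP)
yields an internal field equal to NGBE_INTERNAL_SFP.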

Signed-off-by: Jiawen Wu 
---
 drivers/net/ngbe/base/meson.build  |   4 +-
 drivers/net/ngbe/base/ngbe.h   |  11 ++
 drivers/net/ngbe/base/ngbe_hw.c|  59 ++
 drivers/net/ngbe/base/ngbe_hw.h|  12 ++
 drivers/net/ngbe/base/ngbe_osdep.h | 172 +
 drivers/net/ngbe/base/ngbe_type.h  |  27 +
 drivers/net/ngbe/ngbe_ethdev.c |  36 +-
 drivers/net/ngbe/ngbe_ethdev.h |  17 +++
 8 files changed, 335 insertions(+), 3 deletions(-)
 create mode 100644 drivers/net/ngbe/base/ngbe.h
 create mode 100644 drivers/net/ngbe/base/ngbe_hw.c
 create mode 100644 drivers/net/ngbe/base/ngbe_hw.h
 create mode 100644 drivers/net/ngbe/base/ngbe_osdep.h
 create mode 100644 drivers/net/ngbe/base/ngbe_type.h

diff --git a/drivers/net/ngbe/base/meson.build 
b/drivers/net/ngbe/base/meson.build
index b4fc6a53b..d3616148f 100644
--- a/drivers/net/ngbe/base/meson.build
+++ b/drivers/net/ngbe/base/meson.build
@@ -1,7 +1,9 @@
 # SPDX-License-Identifier: BSD-3-Clause
 # Copyright(c) 2018-2020
 
-sources = []
+sources = [
+   'ngbe_hw.c',
+]
 
 error_cflags = []
 
diff --git a/drivers/net/ngbe/base/ngbe.h b/drivers/net/ngbe/base/ngbe.h
new file mode 100644
index 0..cdd435a0a
--- /dev/null
+++ b/drivers/net/ngbe/base/ngbe.h
@@ -0,0 +1,11 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018-2020
+ */
+
+#ifndef _NGBE_H_
+#define _NGBE_H_
+
+#include "ngbe_type.h"
+#include "ngbe_hw.h"
+
+#endif /* _NGBE_H_ */
diff --git a/drivers/net/ngbe/base/ngbe_hw.c b/drivers/net/ngbe/base/ngbe_hw.c
new file mode 100644
index 0..2a74405e3
--- /dev/null
+++ b/drivers/net/ngbe/base/ngbe_hw.c
@@ -0,0 +1,59 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018-2020
+ */
+
+#include "ngbe_hw.h"
+
+void ngbe_map_device_id(struct ngbe_hw *hw)
+{
+   u16 oem = hw->sub_system_id & NGBE_OEM_MASK;
+   u16 internal = hw->sub_system_id & NGBE_INTERNAL_MASK;
+   hw->is_pf = true;
+
+   /* move subsystem_device_id to device_id */
+   switch (hw->device_id) {
+   case NGBE_DEV_ID_EM_WX1860AL_W_VF:
+   case NGBE_DEV_ID_EM_WX1860A2_VF:
+   case NGBE_DEV_ID_EM_WX1860A2S_VF:
+   case NGBE_DEV_ID_EM_WX1860A4_VF:
+   case NGBE_DEV_ID_EM_WX1860A4S_VF:
+   case NGBE_DEV_ID_EM_WX1860AL2_VF:
+   case NGBE_DEV_ID_EM_WX1860AL2S_VF:
+   case NGBE_DEV_ID_EM_WX1860AL4_VF:
+   case NGBE_DEV_ID_EM_WX1860AL4S_VF:
+   case NGBE_DEV_ID_EM_WX1860NCSI_VF:
+   case NGBE_DEV_ID_EM_WX1860A1_VF:
+   case NGBE_DEV_ID_EM_WX1860A1L_VF:
+   hw->device_id = NGBE_DEV_ID_EM_VF;
+   hw->sub_device_id = NGBE_SUB_DEV_ID_EM_VF;
+   hw->is_pf = false;
+   break;
+   case NGBE_DEV_ID_EM_WX1860AL_W:
+   case NGBE_DEV_ID_EM_WX1860A2:
+   case NGBE_DEV_ID_EM_WX1860A2S:
+   case NGBE_DEV_ID_EM_WX1860A4:
+   case NGBE_DEV_ID_EM_WX1860A4S:
+   case NGBE_DEV_ID_EM_WX1860AL2:
+   case NGBE_DEV_ID_EM_WX1860AL2S:
+   case NGBE_DEV_ID_EM_WX1860AL4:
+   case NGBE_DEV_ID_EM_WX1860AL4S:
+   case NGBE_DEV_ID_EM_WX1860NCSI:
+   case NGBE_DEV_ID_EM_WX1860A1:
+   case NGBE_DEV_ID_EM_WX1860A1L:
+   hw->device_id = NGBE_DEV_ID_EM;
+   if (oem == NGBE_LY_M88E1512_SFP ||
+   internal == NGBE_INTERNAL_SFP)
+   hw->sub_device_id = NGBE_SUB_DEV_ID_EM_MVL_SFP;
+   else if (hw->sub_system_id == NGBE_SUB_DEV_ID_EM_M88E1512_RJ45)
+   hw->sub_device_id = NGBE_SUB_DEV_ID_EM_MVL_RGMII;
+   else if (oem == NGBE_YT8521S_SFP ||
+   oem == NGBE_LY_YT8521S_SFP)
+   hw->sub_device_id = NGBE_SUB_DEV_ID_EM_YT8521S_SFP;
+   else
+   hw->sub_device_id = NGBE_SUB_DEV_ID_EM_RTL_SGMII;
+   break;
+   default:
+   break;
+   }
+}
+
diff --git a/drivers/net/ngbe/base/ngbe_hw.h b/drivers/net/ngbe/base/ngbe_hw.h
new file mode 100644
index 0..0dba04a54
--- /dev/null
+++ b/drivers/net/ngbe/base/ngbe_hw.h
@@ -0,0 +1,12 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018-2020
+ */
+
+#ifndef _NGBE_HW_H_
+#define _NGBE_HW_H_
+
+#include "ngbe_type.h"
+
+void ngbe_map_device_id(struct ngbe_hw *hw);
+
+#endif /* _NGBE_HW_H_ */
diff --git a/drivers/net/ngbe/base/ngbe_osdep.h 
b/drivers/net/ngbe/base/ngbe_osdep.h
new file mode 100644
index 0..64afed2cc
--- /dev/null
+++ b/drivers/net/ngbe/base/ngbe_osdep.h
@@ -0,0 +1,172 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018-2020
+ */
+
+#ifndef _NGBE_OS_H_
+#define _NGBE_OS_H_
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define RTE_LIBRTE_NGBE_TMDCPV(1, 0)
+#define TMZ_PADDR(mz)  ((mz)->

[dpdk-dev] [PATCH 5/6] net/ngbe: add log type and error type

2021-03-18 Thread Jiawen Wu
Add log type and error type to trace functions.

Signed-off-by: Jiawen Wu 
---
 doc/guides/nics/ngbe.rst|  41 +
 drivers/net/ngbe/base/ngbe_status.h | 124 
 drivers/net/ngbe/base/ngbe_type.h   |   1 +
 drivers/net/ngbe/ngbe_ethdev.c  |  20 +
 drivers/net/ngbe/ngbe_logs.h|  54 
 5 files changed, 240 insertions(+)
 create mode 100644 drivers/net/ngbe/base/ngbe_status.h
 create mode 100644 drivers/net/ngbe/ngbe_logs.h

diff --git a/doc/guides/nics/ngbe.rst b/doc/guides/nics/ngbe.rst
index 007d8e80e..1cbd72041 100644
--- a/doc/guides/nics/ngbe.rst
+++ b/doc/guides/nics/ngbe.rst
@@ -15,6 +15,47 @@ Prerequisites
 
 - Follow the DPDK :ref:`Getting Started Guide for Linux ` to setup 
the basic DPDK environment.
 
+Pre-Installation Configuration
+------------------------------
+
+Build Options
+~
+
+The following build-time options may be enabled on build time using.
+
+``-Dc_args=`` meson argument (e.g. ``-Dc_args=-DRTE_LIBRTE_NGBE_DEBUG_RX``).
+
+Please note that enabling debugging options may affect system performance.
+
+- ``RTE_LIBRTE_NGBE_DEBUG_RX`` (undefined by default)
+
+  Toggle display of receive fast path run-time messages.
+
+- ``RTE_LIBRTE_NGBE_DEBUG_TX`` (undefined by default)
+
+  Toggle display of transmit fast path run-time messages.
+
+- ``RTE_LIBRTE_NGBE_DEBUG_TX_FREE`` (undefined by default)
+
+  Toggle display of transmit descriptor clean messages.
+
+Dynamic Logging Parameters
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+One may leverage EAL option "--log-level" to change default levels
+for the log types supported by the driver. The option is used with
+an argument typically consisting of two parts separated by a colon.
+
+NGBE PMD provides the following log types available for control:
+
+- ``pmd.net.ngbe.driver`` (default level is **notice**)
+
+  Affects driver-wide messages unrelated to any particular devices.
+
+- ``pmd.net.ngbe.init`` (default level is **notice**)
+
+  Extra logging of the messages during PMD initialization.
+
 Driver compilation and testing
 ------------------------------
 
diff --git a/drivers/net/ngbe/base/ngbe_status.h 
b/drivers/net/ngbe/base/ngbe_status.h
new file mode 100644
index 0..b2e7cfb29
--- /dev/null
+++ b/drivers/net/ngbe/base/ngbe_status.h
@@ -0,0 +1,124 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018-2020
+ */
+
+#ifndef _NGBE_STATUS_H_
+#define _NGBE_STATUS_H_
+
+/* Error Codes:
+ * common error
+ * module error(simple)
+ * module error(detailed)
+ *
+ * (-256, 256): reserved for non-ngbe defined error code
+ */
+#define TERR_BASE (0x100)
+enum ngbe_error {
+   TERR_NULL = TERR_BASE,
+   TERR_ANY,
+   TERR_NOSUPP,
+   TERR_NOIMPL,
+   TERR_NOMEM,
+   TERR_NOSPACE,
+   TERR_NOENTRY,
+   TERR_CONFIG,
+   TERR_ARGS,
+   TERR_PARAM,
+   TERR_INVALID,
+   TERR_TIMEOUT,
+   TERR_VERSION,
+   TERR_REGISTER,
+   TERR_FEATURE,
+   TERR_RESET,
+   TERR_AUTONEG,
+   TERR_MBX,
+   TERR_I2C,
+   TERR_FC,
+   TERR_FLASH,
+   TERR_DEVICE,
+   TERR_HOSTIF,
+   TERR_SRAM,
+   TERR_EEPROM,
+   TERR_EEPROM_CHECKSUM,
+   TERR_EEPROM_PROTECT,
+   TERR_EEPROM_VERSION,
+   TERR_MAC,
+   TERR_MAC_ADDR,
+   TERR_SFP,
+   TERR_SFP_INITSEQ,
+   TERR_SFP_PRESENT,
+   TERR_SFP_SUPPORT,
+   TERR_SFP_SETUP,
+   TERR_PHY,
+   TERR_PHY_ADDR,
+   TERR_PHY_INIT,
+   TERR_FDIR_CMD,
+   TERR_FDIR_REINIT,
+   TERR_SWFW_SYNC,
+   TERR_SWFW_COMMAND,
+   TERR_FC_CFG,
+   TERR_FC_NEGO,
+   TERR_LINK_SETUP,
+   TERR_PCIE_PENDING,
+   TERR_PBA_SECTION,
+   TERR_OVERTEMP,
+   TERR_UNDERTEMP,
+   TERR_XPCS_POWERUP,
+};
+
+/* WARNING: just for legacy compatibility */
+#define NGBE_NOT_IMPLEMENTED 0x7FFF
+#define NGBE_ERR_OPS_DUMMY   0x3FFF
+
+/* Error Codes */
+#define NGBE_ERR_EEPROM-(TERR_BASE + 1)
+#define NGBE_ERR_EEPROM_CHECKSUM   -(TERR_BASE + 2)
+#define NGBE_ERR_PHY   -(TERR_BASE + 3)
+#define NGBE_ERR_CONFIG-(TERR_BASE + 4)
+#define NGBE_ERR_PARAM -(TERR_BASE + 5)
+#define NGBE_ERR_MAC_TYPE  -(TERR_BASE + 6)
+#define NGBE_ERR_UNKNOWN_PHY   -(TERR_BASE + 7)
+#define NGBE_ERR_LINK_SETUP-(TERR_BASE + 8)
+#define NGBE_ERR_ADAPTER_STOPPED   -(TERR_BASE + 9)
+#define NGBE_ERR_INVALID_MAC_ADDR  -(TERR_BASE + 10)
+#define NGBE_ERR_DEVICE_NOT_SUPPORTED  -(TERR_BASE + 11)
+#define NGBE_ERR_MASTER_REQUESTS_PENDING   -(TERR_BASE + 12)
+#define NGBE_ERR_INVALID_LINK_SETTINGS -(TERR_BASE + 13)
+#define NGBE_ERR_AUTONEG_NOT_COMPLETE  -(TERR_BASE + 14)
+#define NGBE_ERR_RESET_FAILED  -(TERR_BASE + 15)
+#define NGBE_ERR_SWFW_SYNC -(TERR_B
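
The error-code layout in the header above can be illustrated with a small standalone sketch. `TERR_BASE` and the two `NGBE_ERR_*` values are copied from the patch; `demo_is_ngbe_code()` is a hypothetical helper that is not part of the driver, and it only encodes the comment that the open interval (-256, 256) is reserved for non-ngbe codes such as errno values.

```c
#include <stdbool.h>

/* Copied from the patch: driver codes start at 0x100 (256). */
#define TERR_BASE (0x100)
#define NGBE_ERR_EEPROM   -(TERR_BASE + 1)
#define NGBE_ERR_PHY      -(TERR_BASE + 3)

/*
 * Hypothetical helper (not in the driver): returns true when err lies
 * outside the open interval (-256, 256), i.e. outside the range the
 * header reserves for codes that are not defined by ngbe.
 */
static inline bool demo_is_ngbe_code(int err)
{
	return err <= -TERR_BASE || err >= TERR_BASE;
}
```

With this split, a caller could tell a plain negative errno such as -22 apart from a driver-specific code without a lookup table.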

[dpdk-dev] [PATCH 3/6] net/ngbe: support probe and remove

2021-03-18 Thread Jiawen Wu
Add basic PCIe ethdev probe and remove.

Signed-off-by: Jiawen Wu 
---
 doc/guides/nics/features/ngbe.ini |  1 +
 drivers/net/ngbe/ngbe_ethdev.c| 77 +--
 2 files changed, 75 insertions(+), 3 deletions(-)

diff --git a/doc/guides/nics/features/ngbe.ini 
b/doc/guides/nics/features/ngbe.ini
index a7a524def..977286ac0 100644
--- a/doc/guides/nics/features/ngbe.ini
+++ b/doc/guides/nics/features/ngbe.ini
@@ -4,6 +4,7 @@
 ; Refer to default.ini for the full list of available PMD features.
 ;
 [Features]
+Multiprocess aware   = Y
 Linux= Y
 ARMv8= Y
 x86-32   = Y
diff --git a/drivers/net/ngbe/ngbe_ethdev.c b/drivers/net/ngbe/ngbe_ethdev.c
index da951b6ef..d938fd68a 100644
--- a/drivers/net/ngbe/ngbe_ethdev.c
+++ b/drivers/net/ngbe/ngbe_ethdev.c
@@ -1,10 +1,12 @@
- /* SPDX-License-Identifier: BSD-3-Clause
-  * Copyright(c) 2018-2020
-  */
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018-2020
+ */
 
+#include 
 #include 
 
 #include 
+#include "ngbe_ethdev.h"
 
 /*
  * The set of PCI devices this driver supports
@@ -25,10 +27,79 @@ static const struct rte_pci_id pci_id_ngbe_map[] = {
{ .vendor_id = 0, /* sentinel */ },
 };
 
+static int
+eth_ngbe_dev_init(struct rte_eth_dev *eth_dev, void *init_params __rte_unused)
+{
+   struct rte_pci_device *pci_dev = RTE_ETH_DEV_TO_PCI(eth_dev);
+
+   if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+   return 0;
+
+   rte_eth_copy_pci_info(eth_dev, pci_dev);
+
+   return 0;
+}
+
+static int
+eth_ngbe_dev_uninit(struct rte_eth_dev *eth_dev)
+{
+   if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+   return 0;
+
+   RTE_SET_USED(eth_dev);
+
+   return 0;
+}
+
+static int
+eth_ngbe_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
+   struct rte_pci_device *pci_dev)
+{
+   struct rte_eth_dev *pf_ethdev;
+   struct rte_eth_devargs eth_da;
+   int retval;
+
+   if (pci_dev->device.devargs) {
+   retval = rte_eth_devargs_parse(pci_dev->device.devargs->args,
+   &eth_da);
+   if (retval)
+   return retval;
+   } else {
+   memset(&eth_da, 0, sizeof(eth_da));
+   }
+
+   retval = rte_eth_dev_create(&pci_dev->device, pci_dev->device.name,
+   sizeof(struct ngbe_adapter),
+   eth_dev_pci_specific_init, pci_dev,
+   eth_ngbe_dev_init, NULL);
+
+   if (retval || eth_da.nb_representor_ports < 1)
+   return retval;
+
+   pf_ethdev = rte_eth_dev_allocated(pci_dev->device.name);
+   if (pf_ethdev == NULL)
+   return -ENODEV;
+
+   return 0;
+}
+
+static int eth_ngbe_pci_remove(struct rte_pci_device *pci_dev)
+{
+   struct rte_eth_dev *ethdev;
+
+   ethdev = rte_eth_dev_allocated(pci_dev->device.name);
+   if (!ethdev)
+   return -ENODEV;
+
+   return rte_eth_dev_destroy(ethdev, eth_ngbe_dev_uninit);
+}
+
 static struct rte_pci_driver rte_ngbe_pmd = {
.id_table = pci_id_ngbe_map,
.drv_flags = RTE_PCI_DRV_NEED_MAPPING |
 RTE_PCI_DRV_INTR_LSC,
+   .probe = eth_ngbe_pci_probe,
+   .remove = eth_ngbe_pci_remove,
 };
 
 RTE_PMD_REGISTER_PCI(net_ngbe, rte_ngbe_pmd);
-- 
2.21.0.windows.1





[dpdk-dev] [PATCH 6/6] net/ngbe: define registers

2021-03-18 Thread Jiawen Wu
Define all registers that will be used.

Signed-off-by: Jiawen Wu 
---
 drivers/net/ngbe/base/ngbe_regs.h | 1489 +
 drivers/net/ngbe/base/ngbe_type.h |2 +
 2 files changed, 1491 insertions(+)
 create mode 100644 drivers/net/ngbe/base/ngbe_regs.h

diff --git a/drivers/net/ngbe/base/ngbe_regs.h 
b/drivers/net/ngbe/base/ngbe_regs.h
new file mode 100644
index 0..86b4c203d
--- /dev/null
+++ b/drivers/net/ngbe/base/ngbe_regs.h
@@ -0,0 +1,1489 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018-2020
+ */
+
+#ifndef _NGBE_REGS_H_
+#define _NGBE_REGS_H_
+
+#define NGBE_PVMBX_QSIZE  (16) /* 16*4B */
+#define NGBE_PVMBX_BSIZE  (NGBE_PVMBX_QSIZE * 4)
+
+#define NGBE_REMOVED(a) (0)
+
+#define NGBE_REG_DUMMY 0xFF
+
+#define MS8(shift, mask)  (((u8)(mask)) << (shift))
+#define LS8(val, shift, mask) (((u8)(val) & (u8)(mask)) << (shift))
+#define RS8(reg, shift, mask) (((u8)(reg) >> (shift)) & (u8)(mask))
+
+#define MS16(shift, mask) (((u16)(mask)) << (shift))
+#define LS16(val, shift, mask)(((u16)(val) & (u16)(mask)) << (shift))
+#define RS16(reg, shift, mask)(((u16)(reg) >> (shift)) & (u16)(mask))
+
+#define MS32(shift, mask) (((u32)(mask)) << (shift))
+#define LS32(val, shift, mask)(((u32)(val) & (u32)(mask)) << (shift))
+#define RS32(reg, shift, mask)(((u32)(reg) >> (shift)) & (u32)(mask))
+
+#define MS64(shift, mask) (((u64)(mask)) << (shift))
+#define LS64(val, shift, mask)(((u64)(val) & (u64)(mask)) << (shift))
+#define RS64(reg, shift, mask)(((u64)(reg) >> (shift)) & (u64)(mask))
+
+#define MS(shift, mask)   MS32(shift, mask)
+#define LS(val, shift, mask)  LS32(val, shift, mask)
+#define RS(reg, shift, mask)  RS32(reg, shift, mask)
+
+#define ROUND_UP(x, y)  (((x) + (y) - 1) / (y) * (y))
+#define ROUND_DOWN(x, y)((x) / (y) * (y))
+#define ROUND_OVER(x, maxbits, unitbits) \
+   ((x) >= 1 << (maxbits) ? 0 : (x) >> (unitbits))
+
+/* autoc bits definition */
+#define NGBE_AUTOC   NGBE_REG_DUMMY
+#define   NGBE_AUTOC_FLU MS64(0, 0x1)
+#define   NGBE_AUTOC_10G_PMA_PMD_MASKMS64(7, 0x3) /* parallel */
+#define   NGBE_AUTOC_10G_XAUILS64(0, 7, 0x3)
+#define   NGBE_AUTOC_10G_KX4 LS64(1, 7, 0x3)
+#define   NGBE_AUTOC_10G_CX4 LS64(2, 7, 0x3)
+#define   NGBE_AUTOC_10G_KR  LS64(3, 7, 0x3) /* fixme */
+#define   NGBE_AUTOC_1G_PMA_PMD_MASK MS64(9, 0x7)
+#define   NGBE_AUTOC_1G_BX   LS64(0, 9, 0x7)
+#define   NGBE_AUTOC_1G_KX   LS64(1, 9, 0x7)
+#define   NGBE_AUTOC_1G_SFI  LS64(0, 9, 0x7)
+#define   NGBE_AUTOC_1G_KX_BXLS64(1, 9, 0x7)
+#define   NGBE_AUTOC_AN_RESTART  MS64(12, 0x1)
+#define   NGBE_AUTOC_LMS_MASKMS64(13, 0x7)
+#define   NGBE_AUTOC_LMS_10G LS64(3, 13, 0x7)
+#define   NGBE_AUTOC_LMS_KX4_KX_KR   LS64(4, 13, 0x7)
+#define   NGBE_AUTOC_LMS_SGMII_1G_100M   LS64(5, 13, 0x7)
+#define   NGBE_AUTOC_LMS_KX4_KX_KR_1G_AN LS64(6, 13, 0x7)
+#define   NGBE_AUTOC_LMS_KX4_KX_KR_SGMII LS64(7, 13, 0x7)
+#define   NGBE_AUTOC_LMS_1G_LINK_NO_AN   LS64(0, 13, 0x7)
+#define   NGBE_AUTOC_LMS_10G_LINK_NO_AN  LS64(1, 13, 0x7)
+#define   NGBE_AUTOC_LMS_1G_AN   LS64(2, 13, 0x7)
+#define   NGBE_AUTOC_LMS_KX4_AN  LS64(4, 13, 0x7)
+#define   NGBE_AUTOC_LMS_KX4_AN_1G_ANLS64(6, 13, 0x7)
+#define   NGBE_AUTOC_LMS_ATTACH_TYPE LS64(7, 13, 0x7)
+#define   NGBE_AUTOC_LMS_AN  MS64(15, 0x7)
+
+#define   NGBE_AUTOC_KR_SUPP MS64(16, 0x1)
+#define   NGBE_AUTOC_FECRMS64(17, 0x1)
+#define   NGBE_AUTOC_FECAMS64(18, 0x1)
+#define   NGBE_AUTOC_AN_RX_ALIGN MS64(18, 0x1F) /* fixme */
+#define   NGBE_AUTOC_AN_RX_DRIFT MS64(23, 0x3)
+#define   NGBE_AUTOC_AN_RX_LOOSE MS64(24, 0x3)
+#define   NGBE_AUTOC_PD_TMR  MS64(25, 0x3)
+#define   NGBE_AUTOC_RF  MS64(27, 0x1)
+#define   NGBE_AUTOC_ASM_PAUSE   MS64(29, 0x1)
+#define   NGBE_AUTOC_SYM_PAUSE   MS64(28, 0x1)
+#define   NGBE_AUTOC_PAUSE   MS64(28, 0x3)
+#define   NGBE_AUTOC_KX_SUPP MS64(30, 0x1)
+#define   NGBE_AUTOC_KX4_SUPPMS64(31, 0x1)
+
+#define   NGBE_AUTOC_10GS_PMA_PMD_MASK   MS64(48, 0x3)  /* serial */
+#define   NGBE_AUTOC_10GS_KR LS64(0, 48, 0x3)
+#define   NGBE_AUTOC_10GS_XFILS64(1, 48, 0x3)
+#define   NGBE_AUTOC_10GS_SFILS64(2, 48, 0x3)
+#define   NGBE_AUTOC_LINK_DIA_MASK   MS64(60, 0x7)
+#define   NGBE_AUTOC_LINK_DIA_D3_MASKLS64(5, 60, 0x7)
+
+#define   NGBE_AUTOC_SPEED_MASK  MS64(32, 0x)
+#define   NGBD_AUTOC_SPEED(r)RS64(r, 32, 0x)
+#define   NGBE_AUTOC_SPEED(v)LS64(v, 32, 0x)
+#define NGBE_LINK_SPEED_UNKNOWN  0
+#define NGBE_LINK_SPEED_10M_FULL 0x0002
+#define NGBE_L
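
The `MS`/`LS`/`RS` helpers defined in the register header above build a mask, place a value into a field, and read a field back. The following is a minimal standalone sketch of how they compose, assuming `uint32_t` stands in for the driver's `u32` typedef; the `DEMO_*` names and the two accessor functions are hypothetical and only modelled on the 3-bit field at bits 13..15 used by `NGBE_AUTOC_LMS_MASK`.

```c
#include <stdint.h>

/* Same shape as the 32-bit helpers in the patch; uint32_t replaces u32. */
#define MS32(shift, mask)      (((uint32_t)(mask)) << (shift))
#define LS32(val, shift, mask) (((uint32_t)(val) & (uint32_t)(mask)) << (shift))
#define RS32(reg, shift, mask) (((uint32_t)(reg) >> (shift)) & (uint32_t)(mask))

/* Hypothetical 3-bit field occupying bits 13..15. */
#define DEMO_LMS_MASK MS32(13, 0x7)
#define DEMO_LMS(v)   LS32(v, 13, 0x7)

/* Replace the field in reg with val and leave all other bits untouched. */
static inline uint32_t demo_set_lms(uint32_t reg, uint32_t val)
{
	return (reg & ~DEMO_LMS_MASK) | DEMO_LMS(val);
}

/* Extract the field from reg as a small integer. */
static inline uint32_t demo_get_lms(uint32_t reg)
{
	return RS32(reg, 13, 0x7);
}
```

Because `LS32` masks the value before shifting, an out-of-range value is truncated to the field width instead of spilling into neighbouring bits.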

[dpdk-dev] [PATCH 2/2] net/mlx5: workaround counter memory region creation

2021-03-18 Thread Michael Baum
Due to a kernel issue in direct MKEY creation using the DevX API for
physical memory, this patch switches the counter MR creation to the
Verbs API.

Fixes: 3aa279157fa0 ("net/mlx5: synchronize flow counter pool creation")
Cc: sta...@dpdk.org

Signed-off-by: Michael Baum 
Acked-by: Matan Azrad 
---
 drivers/net/mlx5/linux/mlx5_os.c   | 10 --
 drivers/net/mlx5/mlx5.c| 11 +++
 drivers/net/mlx5/mlx5.h|  5 +
 drivers/net/mlx5/mlx5_flow.c   | 27 +--
 drivers/net/mlx5/windows/mlx5_os.c |  9 -
 5 files changed, 13 insertions(+), 49 deletions(-)

diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index 5e3ae9f..5740214 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -1163,16 +1163,6 @@
err = -err;
goto error;
}
-   /* Check relax ordering support. */
-   if (!haswell_broadwell_cpu) {
-   sh->cmng.relaxed_ordering_write =
-   config->hca_attr.relaxed_ordering_write;
-   sh->cmng.relaxed_ordering_read =
-   config->hca_attr.relaxed_ordering_read;
-   } else {
-   sh->cmng.relaxed_ordering_read = 0;
-   sh->cmng.relaxed_ordering_write = 0;
-   }
sh->rq_ts_format = config->hca_attr.rq_ts_format;
sh->sq_ts_format = config->hca_attr.sq_ts_format;
sh->qp_ts_format = config->hca_attr.qp_ts_format;
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index abd7ff7..fb58631 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -469,17 +469,20 @@ static LIST_HEAD(, mlx5_dev_ctx_shared) mlx5_dev_ctx_list 
=
 /**
  * Destroy all the resources allocated for a counter memory management.
  *
+ * @param[in] sh
+ *   Pointer to mlx5_dev_ctx_shared object to free.
  * @param[in] mng
  *   Pointer to the memory management structure.
  */
 static void
-mlx5_flow_destroy_counter_stat_mem_mng(struct mlx5_counter_stats_mem_mng *mng)
+mlx5_flow_destroy_counter_stat_mem_mng(struct mlx5_dev_ctx_shared *sh,
+  struct mlx5_counter_stats_mem_mng *mng)
 {
uint8_t *mem = (uint8_t *)(uintptr_t)mng->raws[0].data;
 
LIST_REMOVE(mng, next);
-   claim_zero(mlx5_devx_cmd_destroy(mng->dm));
-   claim_zero(mlx5_os_umem_dereg(mng->umem));
+   sh->share_cache.dereg_mr_cb(&mng->dm);
+   memset(&mng->dm, 0, sizeof(mng->dm));
mlx5_free(mem);
 }
 
@@ -533,7 +536,7 @@ static LIST_HEAD(, mlx5_dev_ctx_shared) mlx5_dev_ctx_list =
}
mng = LIST_FIRST(&sh->cmng.mem_mngs);
while (mng) {
-   mlx5_flow_destroy_counter_stat_mem_mng(mng);
+   mlx5_flow_destroy_counter_stat_mem_mng(sh, mng);
mng = LIST_FIRST(&sh->cmng.mem_mngs);
}
memset(&sh->cmng, 0, sizeof(sh->cmng));
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index e2eb4db..8e8727a 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -422,8 +422,7 @@ struct mlx5_flow_counter_pool {
 struct mlx5_counter_stats_mem_mng {
LIST_ENTRY(mlx5_counter_stats_mem_mng) next;
struct mlx5_counter_stats_raw *raws;
-   struct mlx5_devx_obj *dm;
-   void *umem;
+   struct mlx5_pmd_mr dm;
 };
 
 /* Raw memory structure for the counter statistics values of a pool. */
@@ -454,8 +453,6 @@ struct mlx5_flow_counter_mng {
uint8_t pending_queries;
uint16_t pool_index;
uint8_t query_thread_on;
-   bool relaxed_ordering_read;
-   bool relaxed_ordering_write;
bool counter_fallback; /* Use counter fallback management. */
LIST_HEAD(mem_mngs, mlx5_counter_stats_mem_mng) mem_mngs;
LIST_HEAD(stat_raws, mlx5_counter_stats_raw) free_stat_raws;
diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index d46fc33..afa8ab4 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -6717,7 +6717,6 @@ struct mlx5_meter_domains_infos *
 static int
 mlx5_flow_create_counter_stat_mem_mng(struct mlx5_dev_ctx_shared *sh)
 {
-   struct mlx5_devx_mkey_attr mkey_attr;
struct mlx5_counter_stats_mem_mng *mem_mng;
volatile struct flow_counter_stats *raw_data;
int raws_n = MLX5_CNT_CONTAINER_RESIZE + MLX5_MAX_PENDING_QUERIES;
@@ -6727,6 +6726,7 @@ struct mlx5_meter_domains_infos *
sizeof(struct mlx5_counter_stats_mem_mng);
size_t pgsize = rte_mem_page_size();
uint8_t *mem;
+   int ret;
int i;
 
if (pgsize == (size_t)-1) {
@@ -6741,26 +6741,9 @@ struct mlx5_meter_domains_infos *
}
mem_mng = (struct mlx5_counter_stats_mem_mng *)(mem + size) - 1;
size = sizeof(*raw_data) * MLX5_COUNTERS_PER_POO

[dpdk-dev] [PATCH 0/2] adjusting mkey creations

2021-03-18 Thread Michael Baum
Adjusting mkey creations to use Verbs instead of DevX API.

Michael Baum (2):
  net/mlx5: workaround ASO memory region creation
  net/mlx5: workaround counter memory region creation

 drivers/common/mlx5/linux/mlx5_common_verbs.c |   1 -
 drivers/common/mlx5/windows/mlx5_common_os.c  |  23 +++---
 drivers/net/mlx5/linux/mlx5_os.c  |  10 ---
 drivers/net/mlx5/mlx5.c   |  11 ++-
 drivers/net/mlx5/mlx5.h   |  15 +---
 drivers/net/mlx5/mlx5_flow.c  |  27 ++-
 drivers/net/mlx5/mlx5_flow_age.c  | 106 +++---
 drivers/net/mlx5/windows/mlx5_os.c|   9 ---
 8 files changed, 71 insertions(+), 131 deletions(-)

-- 
1.8.3.1



[dpdk-dev] [PATCH 1/2] net/mlx5: workaround ASO memory region creation

2021-03-18 Thread Michael Baum
Due to a kernel issue in direct MKEY creation using the DevX API for
physical memory, this patch switches the ASO MR creation to the Verbs
API.

Fixes: f935ed4b645a ("net/mlx5: support flow hit action for aging")
Cc: sta...@dpdk.org

Signed-off-by: Michael Baum 
Acked-by: Matan Azrad 
---
 drivers/common/mlx5/linux/mlx5_common_verbs.c |   1 -
 drivers/common/mlx5/windows/mlx5_common_os.c  |  23 +++---
 drivers/net/mlx5/mlx5.h   |  10 +--
 drivers/net/mlx5/mlx5_flow_age.c  | 106 +++---
 4 files changed, 58 insertions(+), 82 deletions(-)

diff --git a/drivers/common/mlx5/linux/mlx5_common_verbs.c 
b/drivers/common/mlx5/linux/mlx5_common_verbs.c
index 339535d..aa560f0 100644
--- a/drivers/common/mlx5/linux/mlx5_common_verbs.c
+++ b/drivers/common/mlx5/linux/mlx5_common_verbs.c
@@ -37,7 +37,6 @@
 {
struct ibv_mr *ibv_mr;
 
-   memset(pmd_mr, 0, sizeof(*pmd_mr));
ibv_mr = mlx5_glue->reg_mr(pd, addr, length,
   IBV_ACCESS_LOCAL_WRITE |
   (haswell_broadwell_cpu ? 0 :
diff --git a/drivers/common/mlx5/windows/mlx5_common_os.c 
b/drivers/common/mlx5/windows/mlx5_common_os.c
index f2d781a..cebf42d 100644
--- a/drivers/common/mlx5/windows/mlx5_common_os.c
+++ b/drivers/common/mlx5/windows/mlx5_common_os.c
@@ -155,23 +155,22 @@
struct mlx5_devx_mkey_attr mkey_attr;
struct mlx5_pd *mlx5_pd = (struct mlx5_pd *)pd;
struct mlx5_hca_attr attr;
+   struct mlx5_devx_obj *mkey;
+   void *obj;
 
if (!pd || !addr) {
rte_errno = EINVAL;
return -1;
}
-   memset(pmd_mr, 0, sizeof(*pmd_mr));
if (mlx5_devx_cmd_query_hca_attr(mlx5_pd->devx_ctx, &attr))
return -1;
-   pmd_mr->addr = addr;
-   pmd_mr->len = length;
-   pmd_mr->obj = mlx5_os_umem_reg(mlx5_pd->devx_ctx, pmd_mr->addr,
-  pmd_mr->len, IBV_ACCESS_LOCAL_WRITE);
-   if (!pmd_mr->obj)
+   obj = mlx5_os_umem_reg(mlx5_pd->devx_ctx, addr, length,
+  IBV_ACCESS_LOCAL_WRITE);
+   if (!obj)
return -1;
mkey_attr.addr = (uintptr_t)addr;
mkey_attr.size = length;
-   mkey_attr.umem_id = ((struct mlx5_devx_umem *)(pmd_mr->obj))->umem_id;
+   mkey_attr.umem_id = ((struct mlx5_devx_umem *)(obj))->umem_id;
mkey_attr.pd = mlx5_pd->pdn;
mkey_attr.log_entity_size = 0;
mkey_attr.pg_access = 0;
@@ -183,11 +182,15 @@
mkey_attr.relaxed_ordering_write = attr.relaxed_ordering_write;
mkey_attr.relaxed_ordering_read = attr.relaxed_ordering_read;
}
-   pmd_mr->mkey = mlx5_devx_cmd_mkey_create(mlx5_pd->devx_ctx, &mkey_attr);
-   if (!pmd_mr->mkey) {
-   claim_zero(mlx5_os_umem_dereg(pmd_mr->obj));
+   mkey = mlx5_devx_cmd_mkey_create(mlx5_pd->devx_ctx, &mkey_attr);
+   if (!mkey) {
+   claim_zero(mlx5_os_umem_dereg(obj));
return -1;
}
+   pmd_mr->addr = addr;
+   pmd_mr->len = length;
+   pmd_mr->obj = obj;
+   pmd_mr->mkey = mkey;
pmd_mr->lkey = pmd_mr->mkey->id;
return 0;
 }
diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index 14043b6..e2eb4db 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -471,14 +471,6 @@ struct mlx5_aso_cq {
uint64_t errors;
 };
 
-struct mlx5_aso_devx_mr {
-   void *buf;
-   uint64_t length;
-   struct mlx5dv_devx_umem *umem;
-   struct mlx5_devx_obj *mkey;
-   bool is_indirect;
-};
-
 struct mlx5_aso_sq_elem {
struct mlx5_aso_age_pool *pool;
uint16_t burst_size;
@@ -489,7 +481,7 @@ struct mlx5_aso_sq {
struct mlx5_aso_cq cq;
struct mlx5_devx_sq sq_obj;
volatile uint64_t *uar_addr;
-   struct mlx5_aso_devx_mr mr;
+   struct mlx5_pmd_mr mr;
uint16_t pi;
uint32_t head;
uint32_t tail;
diff --git a/drivers/net/mlx5/mlx5_flow_age.c b/drivers/net/mlx5/mlx5_flow_age.c
index 00cb20d..c0be7c3 100644
--- a/drivers/net/mlx5/mlx5_flow_age.c
+++ b/drivers/net/mlx5/mlx5_flow_age.c
@@ -61,90 +61,72 @@
 /**
  * Free MR resources.
  *
+ * @param[in] sh
+ *   Pointer to shared device context.
  * @param[in] mr
  *   MR to free.
  */
 static void
-mlx5_aso_devx_dereg_mr(struct mlx5_aso_devx_mr *mr)
+mlx5_aso_dereg_mr(struct mlx5_dev_ctx_shared *sh, struct mlx5_pmd_mr *mr)
 {
-   claim_zero(mlx5_devx_cmd_destroy(mr->mkey));
-   if (!mr->is_indirect && mr->umem)
-   claim_zero(mlx5_glue->devx_umem_dereg(mr->umem));
-   mlx5_free(mr->buf);
+   void *addr = mr->addr;
+
+   sh->share_cache.dereg_mr_cb(mr);
+   mlx5_free(addr);
memset(mr, 0, sizeof(*mr));
 }
 
 /**
  * Register Memory Region.
  *
- * @param[in] ctx
- *   Context returned from mlx5 open_device() glue function.
+ * @param[in] sh
+ *   Pointe

Re: [dpdk-dev] [PATCH v2 2/4] common/mlx5: enable debug logs dynamically

2021-03-18 Thread Matan Azrad



From: Thomas Monjalon
> 17/03/2021 18:39, Ferruh Yigit:
> > On 3/9/2021 9:48 AM, Thomas Monjalon wrote:
> > > Most debug logs are using DRV_LOG(DEBUG,) but some were using
> > > DEBUG().
> > > The macro DEBUG is doing nothing if not compiled with
> > > RTE_LIBRTE_MLX5_DEBUG.
> > >
> > > As it is not used in the data path, the macro DEBUG can be replaced
> > > with DRV_LOG.
> > > Then all debug logs can be enabled at runtime with:
> > > --log-level pmd.net.mlx5:debug
> > >
> > > Signed-off-by: Thomas Monjalon 
> >
> > Similar comment for the mlx4 one, copying here:
> >
> > Why does 'RTE_LIBRTE_MLX5_DEBUG' exist in the first place?
> >
> > It seems it is used for both data and control path, can you extend the
> > patch for:
> > 1- Remove #ifdef from control path
> > 2- Replace with 'RTE_ETHDEV_DEBUG_RX' & 'RTE_ETHDEV_DEBUG_TX' for
> > data path, please see:
> > https://patches.dpdk.org/project/dpdk/list/?series=15738
> > 3- Remove 'RTE_LIBRTE_MLX5_DEBUG' completely; if not removed,
> > document it in the driver documentation as supported config file
> >
> > Both for 'mlx4' and 'mlx5', I will continue with existing patch, but
> > can it be possible to make additional patches to address above issues?
> 
> Same answer as for mlx4 :)
> To me using ETHDEV config macro in PMDs is new, and I think it is out of scope
> for this patch.
> But yes I agree it would be a nice improvement.
> Matan, Slave, please could you do this change during next month?

Yes, good suggestion, will add to our tasks.
Thanks Thomas\Ferruh.

Matan


Re: [dpdk-dev] [PATCH] examples/packet_ordering: use local port config

2021-03-18 Thread Pattan, Reshma



> -----Original Message-----
> From: dapengx...@intel.com 
> 
> Fixes: 6833f919f56b ("examples/packet_ordering: convert to new ethdev
> offloads API")
> Cc: sta...@dpdk.org
Also, need to add CC:  i.e CC:  Shahaf Shuler 


Other than that, the patch looks OK to me. Please include my ack in the next
version of the patch.
Acked-by: Reshma Pattan 







[dpdk-dev] [PATCH 0/4] l3fwd improvements

2021-03-18 Thread Ruifeng Wang
This series of patches includes changes to the l3fwd example application.
Some improvements are made for better usage of CPU cycles and memory.

Ruifeng Wang (4):
  examples/l3fwd: tune prefetch for better performance
  examples/l3fwd: eliminate unnecessary calculations
  examples/l3fwd: eliminate unnecessary reloads in loop
  examples/l3fwd: make data struct to be memory efficient

 examples/l3fwd/l3fwd.h  | 12 ++--
 examples/l3fwd/l3fwd_common.h   |  4 ++--
 examples/l3fwd/l3fwd_em.c   |  6 +++---
 examples/l3fwd/l3fwd_lpm.c  | 16 +---
 examples/l3fwd/l3fwd_lpm_neon.h | 20 ++--
 5 files changed, 30 insertions(+), 28 deletions(-)

-- 
2.25.1



[dpdk-dev] [PATCH 1/4] examples/l3fwd: tune prefetch for better performance

2021-03-18 Thread Ruifeng Wang
Packet header is prefetched before packet processing for better
memory access performance. As the L2 header will be updated by l3fwd,
prefetching with a store hint puts the cache line into the proper
state up front and reduces cache maintenance overhead.

With this change, 12.9% performance uplift was measured on N1SDP
platform with MLX5 NIC.

Suggested-by: Honnappa Nagarahalli 

Signed-off-by: Ruifeng Wang 
Reviewed-by: Honnappa Nagarahalli 
---
 examples/l3fwd/l3fwd_lpm_neon.h | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/examples/l3fwd/l3fwd_lpm_neon.h b/examples/l3fwd/l3fwd_lpm_neon.h
index d6c0ba64a..ae8840694 100644
--- a/examples/l3fwd/l3fwd_lpm_neon.h
+++ b/examples/l3fwd/l3fwd_lpm_neon.h
@@ -97,13 +97,13 @@ l3fwd_lpm_send_packets(int nb_rx, struct rte_mbuf 
**pkts_burst,
 
if (k) {
for (i = 0; i < FWDSTEP; i++) {
-   rte_prefetch0(rte_pktmbuf_mtod(pkts_burst[i],
+   rte_prefetch0_write(rte_pktmbuf_mtod(pkts_burst[i],
struct rte_ether_hdr *) + 1);
}
 
for (j = 0; j != k - FWDSTEP; j += FWDSTEP) {
for (i = 0; i < FWDSTEP; i++) {
-   rte_prefetch0(rte_pktmbuf_mtod(
+   rte_prefetch0_write(rte_pktmbuf_mtod(
pkts_burst[j + i + FWDSTEP],
struct rte_ether_hdr *) + 1);
}
@@ -124,17 +124,17 @@ l3fwd_lpm_send_packets(int nb_rx, struct rte_mbuf 
**pkts_burst,
/* Prefetch last up to 3 packets one by one */
switch (m) {
case 3:
-   rte_prefetch0(rte_pktmbuf_mtod(pkts_burst[j],
+   rte_prefetch0_write(rte_pktmbuf_mtod(pkts_burst[j],
struct rte_ether_hdr *) + 1);
j++;
/* fallthrough */
case 2:
-   rte_prefetch0(rte_pktmbuf_mtod(pkts_burst[j],
+   rte_prefetch0_write(rte_pktmbuf_mtod(pkts_burst[j],
struct rte_ether_hdr *) + 1);
j++;
/* fallthrough */
case 1:
-   rte_prefetch0(rte_pktmbuf_mtod(pkts_burst[j],
+   rte_prefetch0_write(rte_pktmbuf_mtod(pkts_burst[j],
struct rte_ether_hdr *) + 1);
j++;
}
-- 
2.25.1
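
The store-hint prefetch described in this patch can be sketched outside DPDK with the GCC/Clang `__builtin_prefetch` builtin, whose second argument selects read (0) or write (1). The `demo_*` names are hypothetical; the sketch assumes, without relying on it, that `rte_prefetch0_write()` is a thin wrapper of this kind.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hint that the cache line holding p will be written soon.
 * Second argument 1 = prefetch for write, third argument 3 = keep the
 * line in all cache levels (high temporal locality).
 */
static inline void demo_prefetch_write(const void *p)
{
	__builtin_prefetch(p, 1, 3);
}

/*
 * Hypothetical stand-in for a header rewrite loop: stamp the first byte
 * of every buffer, prefetching the next buffer for write one step ahead.
 */
static void demo_stamp_headers(uint8_t *bufs[], size_t n, uint8_t tag)
{
	size_t i;

	if (n > 0)
		demo_prefetch_write(bufs[0]);
	for (i = 0; i < n; i++) {
		if (i + 1 < n)
			demo_prefetch_write(bufs[i + 1]);
		bufs[i][0] = tag;	/* the store the prefetch prepared for */
	}
}
```

The prefetch is only a hint, so the result of the loop is identical with or without it; any gain depends on the platform, as the 12.9% figure quoted for N1SDP suggests.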



[dpdk-dev] [PATCH 2/4] examples/l3fwd: eliminate unnecessary calculations

2021-03-18 Thread Ruifeng Wang
Both L2 and L3 headers are used in forward processing, and the two
headers are in the same cache line. Prefetching with the L2 header
address therefore has the same effect as prefetching with the L3
header address.

Changed to use L2 header address for prefetching. The change showed
no measurable performance improvement, but it removes the
unnecessary instructions for address calculation.

Signed-off-by: Ruifeng Wang 
---
 examples/l3fwd/l3fwd_lpm_neon.h | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/examples/l3fwd/l3fwd_lpm_neon.h b/examples/l3fwd/l3fwd_lpm_neon.h
index ae8840694..1650ae444 100644
--- a/examples/l3fwd/l3fwd_lpm_neon.h
+++ b/examples/l3fwd/l3fwd_lpm_neon.h
@@ -98,14 +98,14 @@ l3fwd_lpm_send_packets(int nb_rx, struct rte_mbuf 
**pkts_burst,
if (k) {
for (i = 0; i < FWDSTEP; i++) {
rte_prefetch0_write(rte_pktmbuf_mtod(pkts_burst[i],
-   struct rte_ether_hdr *) + 1);
+   void *));
}
 
for (j = 0; j != k - FWDSTEP; j += FWDSTEP) {
for (i = 0; i < FWDSTEP; i++) {
rte_prefetch0_write(rte_pktmbuf_mtod(
pkts_burst[j + i + FWDSTEP],
-   struct rte_ether_hdr *) + 1);
+   void *));
}
 
processx4_step1(&pkts_burst[j], &dip, &ipv4_flag);
@@ -125,17 +125,17 @@ l3fwd_lpm_send_packets(int nb_rx, struct rte_mbuf 
**pkts_burst,
switch (m) {
case 3:
rte_prefetch0_write(rte_pktmbuf_mtod(pkts_burst[j],
-   struct rte_ether_hdr *) + 1);
+   void *));
j++;
/* fallthrough */
case 2:
rte_prefetch0_write(rte_pktmbuf_mtod(pkts_burst[j],
-   struct rte_ether_hdr *) + 1);
+   void *));
j++;
/* fallthrough */
case 1:
rte_prefetch0_write(rte_pktmbuf_mtod(pkts_burst[j],
-   struct rte_ether_hdr *) + 1);
+   void *));
j++;
}
 
-- 
2.25.1
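
The claim that the L2 and L3 header addresses fall in the same cache line can be checked with simple address arithmetic. This is a minimal sketch under two stated assumptions: a 64-byte cache line and an untagged 14-byte Ethernet header. All `DEMO_*` and `demo_*` names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_CACHE_LINE    64u	/* assumed cache line size in bytes */
#define DEMO_ETHER_HDR_LEN 14u	/* Ethernet header without VLAN tag */

/* True when both addresses map to the same cache line. */
static inline bool demo_same_cache_line(const void *a, const void *b)
{
	return ((uintptr_t)a / DEMO_CACHE_LINE) ==
	       ((uintptr_t)b / DEMO_CACHE_LINE);
}
```

For packet data that starts on a line boundary the L3 header at offset 14 shares the line with the L2 header, so a single prefetch covers both. If the data started 56 bytes into a line, the L3 header would land in the next line and the equivalence would no longer hold.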



[dpdk-dev] [PATCH 3/4] examples/l3fwd: eliminate unnecessary reloads in loop

2021-03-18 Thread Ruifeng Wang
The number of Rx queues and the number of Tx ports in the lcore config are
constant while the l3 forward application is running, but the compiler has
no way to know this.

Copied the values from the lcore config to local variables and used the local
variables for iteration. The compiler can see that the local variables are
not changed, so qconf reloads at each iteration can be eliminated.
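
A minimal sketch of the idea, using hypothetical types and names rather than
the l3fwd ones: in the first loop the bound is read through a pointer, so
the compiler may have to reload it on each iteration whenever it cannot
prove the callee leaves it untouched. In the second loop the bound is copied
into a local once.

```c
#include <stdint.h>

#define MAX_QUEUES 16 /* hypothetical limit for this sketch */

struct conf {
	uint16_t n_rx_queue;
	uint16_t queue_id[MAX_QUEUES];
};

static uint32_t polled; /* stands in for real per-queue work */

static void poll_queue(uint16_t queue_id)
{
	polled += queue_id;
}

/* Loop bound is read through the pointer: if the compiler cannot prove that
 * the callee leaves c->n_rx_queue untouched, it reloads it every iteration. */
static void poll_all_reload(const struct conf *c)
{
	uint16_t i;

	for (i = 0; i < c->n_rx_queue; i++)
		poll_queue(c->queue_id[i]);
}

/* Loop bound is copied once into a local that never has its address taken,
 * so it can stay in a register for the whole loop. */
static void poll_all_local(const struct conf *c)
{
	const uint16_t n_rx_q = c->n_rx_queue;
	uint16_t i;

	for (i = 0; i < n_rx_q; i++)
		poll_queue(c->queue_id[i]);
}
```

Both functions do the same work. Whether the reload actually appears in the
first one depends on what the compiler can see of the callee, so the gain
is best confirmed by inspecting the generated code.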

The change showed a 1.8% performance uplift in a single-core, single-port,
single-queue test on the N1SDP platform with an MLX5 NIC.

Signed-off-by: Ruifeng Wang 
---
 examples/l3fwd/l3fwd_lpm.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/examples/l3fwd/l3fwd_lpm.c b/examples/l3fwd/l3fwd_lpm.c
index 3dcf1fef1..d338590b9 100644
--- a/examples/l3fwd/l3fwd_lpm.c
+++ b/examples/l3fwd/l3fwd_lpm.c
@@ -190,14 +190,16 @@ lpm_main_loop(__rte_unused void *dummy)
lcore_id = rte_lcore_id();
qconf = &lcore_conf[lcore_id];
 
-   if (qconf->n_rx_queue == 0) {
+   uint16_t n_rx_q = qconf->n_rx_queue;
+   uint16_t n_tx_p = qconf->n_tx_port;
+   if (n_rx_q == 0) {
RTE_LOG(INFO, L3FWD, "lcore %u has nothing to do\n", lcore_id);
return 0;
}
 
RTE_LOG(INFO, L3FWD, "entering main loop on lcore %u\n", lcore_id);
 
-   for (i = 0; i < qconf->n_rx_queue; i++) {
+   for (i = 0; i < n_rx_q; i++) {
 
portid = qconf->rx_queue_list[i].port_id;
queueid = qconf->rx_queue_list[i].queue_id;
@@ -216,7 +218,7 @@ lpm_main_loop(__rte_unused void *dummy)
diff_tsc = cur_tsc - prev_tsc;
if (unlikely(diff_tsc > drain_tsc)) {
 
-   for (i = 0; i < qconf->n_tx_port; ++i) {
+   for (i = 0; i < n_tx_p; ++i) {
portid = qconf->tx_port_id[i];
if (qconf->tx_mbufs[portid].len == 0)
continue;
@@ -232,7 +234,7 @@ lpm_main_loop(__rte_unused void *dummy)
/*
 * Read packet from RX queues
 */
-   for (i = 0; i < qconf->n_rx_queue; ++i) {
+   for (i = 0; i < n_rx_q; ++i) {
portid = qconf->rx_queue_list[i].port_id;
queueid = qconf->rx_queue_list[i].queue_id;
nb_rx = rte_eth_rx_burst(portid, queueid, pkts_burst,
-- 
2.25.1



[dpdk-dev] [PATCH 4/4] examples/l3fwd: make data struct to be memory efficient

2021-03-18 Thread Ruifeng Wang
There are some holes in struct lcore_conf. The holes are
due to alignment requirements.

For struct lcore_rx_queue, there is no need to make every element
of this type cache line aligned, because the data is not
shared between cores.

Member len of struct mbuf_table can be moved out, so the data can be
packed and there is no need to load an extra cache line when
the mbuf table is empty.
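
The layout effect can be sketched with stand-in structs. The names and array
sizes below are hypothetical and much smaller than in l3fwd, and a 64-byte
cache line is assumed. The first pair shows what per-element cache line
alignment costs; the second pair shows the length field moving out of the
table into its own dense array.

```c
#include <stdint.h>
#include <stddef.h>

#define CACHE_LINE 64   /* assumption: 64-byte cache lines */
#define N_PORTS    4    /* hypothetical, small for illustration */
#define BURST      8    /* hypothetical */

/* Every element padded to a full cache line. */
struct rx_queue_aligned {
	_Alignas(CACHE_LINE) uint16_t port_id;
	uint8_t queue_id;
};

/* Natural alignment only: 2 + 1 bytes plus 1 byte of tail padding. */
struct rx_queue_packed {
	uint16_t port_id;
	uint8_t queue_id;
};

/* Length stored next to the table it describes. */
struct table_with_len {
	uint16_t len;
	void *m_table[BURST];
};

/* Lengths gathered into one dense array, tables kept separately. */
struct conf_after {
	uint16_t tx_mbuf_len[N_PORTS];
	void *m_table[N_PORTS][BURST];
};
```

With the dense array, checking whether several ports have pending packets
touches consecutive 2-byte entries instead of one entry per table.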

The change showed a slight performance improvement on the N1SDP platform.

Suggested-by: Honnappa Nagarahalli 

Signed-off-by: Ruifeng Wang 
---
 examples/l3fwd/l3fwd.h| 12 ++--
 examples/l3fwd/l3fwd_common.h |  4 ++--
 examples/l3fwd/l3fwd_em.c |  6 +++---
 examples/l3fwd/l3fwd_lpm.c|  6 +++---
 4 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/examples/l3fwd/l3fwd.h b/examples/l3fwd/l3fwd.h
index 2cf06099e..f3a301e12 100644
--- a/examples/l3fwd/l3fwd.h
+++ b/examples/l3fwd/l3fwd.h
@@ -57,22 +57,22 @@
 #define HASH_ENTRY_NUMBER_DEFAULT  4
 
 struct mbuf_table {
-   uint16_t len;
struct rte_mbuf *m_table[MAX_PKT_BURST];
 };
 
 struct lcore_rx_queue {
uint16_t port_id;
uint8_t queue_id;
-} __rte_cache_aligned;
+};
 
 struct lcore_conf {
-   uint16_t n_rx_queue;
struct lcore_rx_queue rx_queue_list[MAX_RX_QUEUE_PER_LCORE];
-   uint16_t n_tx_port;
uint16_t tx_port_id[RTE_MAX_ETHPORTS];
uint16_t tx_queue_id[RTE_MAX_ETHPORTS];
+   uint16_t tx_mbuf_len[RTE_MAX_ETHPORTS];
struct mbuf_table tx_mbufs[RTE_MAX_ETHPORTS];
+   uint16_t n_rx_queue;
+   uint16_t n_tx_port;
void *ipv4_lookup_struct;
void *ipv6_lookup_struct;
 } __rte_cache_aligned;
@@ -122,7 +122,7 @@ send_single_packet(struct lcore_conf *qconf,
 {
uint16_t len;
 
-   len = qconf->tx_mbufs[port].len;
+   len = qconf->tx_mbuf_len[port];
qconf->tx_mbufs[port].m_table[len] = m;
len++;
 
@@ -132,7 +132,7 @@ send_single_packet(struct lcore_conf *qconf,
len = 0;
}
 
-   qconf->tx_mbufs[port].len = len;
+   qconf->tx_mbuf_len[port] = len;
return 0;
 }
 
diff --git a/examples/l3fwd/l3fwd_common.h b/examples/l3fwd/l3fwd_common.h
index 7d83ff641..05e03dbfc 100644
--- a/examples/l3fwd/l3fwd_common.h
+++ b/examples/l3fwd/l3fwd_common.h
@@ -183,7 +183,7 @@ send_packetsx4(struct lcore_conf *qconf, uint16_t port, struct rte_mbuf *m[],
 {
uint32_t len, j, n;
 
-   len = qconf->tx_mbufs[port].len;
+   len = qconf->tx_mbuf_len[port];
 
/*
 * If TX buffer for that queue is empty, and we have enough packets,
@@ -258,7 +258,7 @@ send_packetsx4(struct lcore_conf *qconf, uint16_t port, struct rte_mbuf *m[],
}
}
 
-   qconf->tx_mbufs[port].len = len;
+   qconf->tx_mbuf_len[port] = len;
 }
 
 #endif /* _L3FWD_COMMON_H_ */
diff --git a/examples/l3fwd/l3fwd_em.c b/examples/l3fwd/l3fwd_em.c
index 9996bfba3..1970e0376 100644
--- a/examples/l3fwd/l3fwd_em.c
+++ b/examples/l3fwd/l3fwd_em.c
@@ -662,12 +662,12 @@ em_main_loop(__rte_unused void *dummy)
 
for (i = 0; i < qconf->n_tx_port; ++i) {
portid = qconf->tx_port_id[i];
-   if (qconf->tx_mbufs[portid].len == 0)
+   if (qconf->tx_mbuf_len[portid] == 0)
continue;
send_burst(qconf,
-   qconf->tx_mbufs[portid].len,
+   qconf->tx_mbuf_len[portid],
portid);
-   qconf->tx_mbufs[portid].len = 0;
+   qconf->tx_mbuf_len[portid] = 0;
}
 
prev_tsc = cur_tsc;
diff --git a/examples/l3fwd/l3fwd_lpm.c b/examples/l3fwd/l3fwd_lpm.c
index d338590b9..e62139a0e 100644
--- a/examples/l3fwd/l3fwd_lpm.c
+++ b/examples/l3fwd/l3fwd_lpm.c
@@ -220,12 +220,12 @@ lpm_main_loop(__rte_unused void *dummy)
 
for (i = 0; i < n_tx_p; ++i) {
portid = qconf->tx_port_id[i];
-   if (qconf->tx_mbufs[portid].len == 0)
+   if (qconf->tx_mbuf_len[portid] == 0)
continue;
send_burst(qconf,
-   qconf->tx_mbufs[portid].len,
+   qconf->tx_mbuf_len[portid],
portid);
-   qconf->tx_mbufs[portid].len = 0;
+   qconf->tx_mbuf_len[portid] = 0;
}
 
prev_tsc = cur_tsc;
-- 
2.25.1



Re: [dpdk-dev] [PATCH 2/2] net/ice: add Rx AVX512 offload path

2021-03-18 Thread Van Haaren, Harry
> -Original Message-
> From: dev  On Behalf Of Leyi Rong
> Sent: Wednesday, March 17, 2021 9:14 AM
> To: Zhang, Qi Z ; Lu, Wenzhuo 
> Cc: dev@dpdk.org; Rong, Leyi 
> Subject: [dpdk-dev] [PATCH 2/2] net/ice: add Rx AVX512 offload path
> 
> Split AVX512 Rx data path into two, one is for basic,
> the other one can support additional Rx offload features,
> including Rx checksum offload, Rx vlan offload, RSS offload.
> 
> Signed-off-by: Leyi Rong 
> Signed-off-by: Wenzhuo Lu 


Hi Leyi and Wenzhuo,

I'm a bit concerned over code-duplication of the RX datapath in this patch,
as it duplicates the core desc-to-mbuf RX loop.

I loaded the following functions, and compared in "meld" to view the diff
side-by-side, and it should be possible to "specialize" away the differences:
_ice_recv_raw_pkts_vec_avx512() /* original */
_ice_recv_raw_pkts_vec_avx512_offload() /* with offload */

This can be done by specializing the implementation: add a "do_offload"
parameter to _ice_recv_raw_pkts_vec_avx512(), and branch on it with an
if (do_offload) wherever the offload and non-offload paths behave
differently.

When that function is inlined with a constant argument, the compiler will
remove the branches, and you'll only have one version of the code to
maintain, without any performance penalty.

If my suggestion around parameterizing, specializing and inlining isn't
clear, please ask and I can try to explain better.
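
A minimal sketch of the parameterize/specialize/inline pattern described
above. All function names here are hypothetical stand-ins, not the ice
driver code, and the always_inline attribute assumes GCC or Clang.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the shared and offload-only parts of an Rx loop. */
static inline uint32_t common_work(uint32_t desc)
{
	return desc + 1;
}

static inline uint32_t offload_work(uint32_t v)
{
	return v | 0x80000000u;
}

/* Single implementation; do_offload is a compile-time constant at each call
 * site below, so after inlining the compiler can drop the untaken branch. */
static inline __attribute__((always_inline)) uint32_t
recv_one(uint32_t desc, const bool do_offload)
{
	uint32_t v = common_work(desc);

	if (do_offload)
		v = offload_work(v);
	return v;
}

/* Basic path: the offload branch is dead code here. */
uint32_t recv_basic(uint32_t desc)
{
	return recv_one(desc, false);
}

/* Offload path: the branch is always taken here. */
uint32_t recv_offload(uint32_t desc)
{
	return recv_one(desc, true);
}
```

The two thin wrappers are what would be registered as the Rx burst
functions, while the loop body exists only once in the source.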

Regards, -Harry




Re: [dpdk-dev] [PATCH] bus/pci: fix Windows kernel driver categories

2021-03-18 Thread Thomas Monjalon
18/03/2021 09:36, Slava Ovsiienko:
> From: Thomas Monjalon
> > In Windows probing, the value RTE_PCI_KDRV_NONE was used instead of
> > RTE_PCI_KDRV_UNKNOWN (mlx case), and RTE_PCI_KDRV_NIC_UIO
> > (FreeBSD) was re-used instead of having a new RTE_PCI_KDRV_NET_UIO for
> > Windows NetUIO.
> 
> As far as I understand - under Windows there is always some kernel driver
> backing the device, hence, RTE_PCI_KDRV_NONE is not an option and
> RTE_PCI_KDRV_UNKNOWN is more appropriate. I would add this extra
> explanation in commit message.

The reason is that NONE is not appropriate because there *is* a kernel
driver backing the device in the mlx case.
And it aligns with Linux.

I will improve the message and comments.




[dpdk-dev] [PATCH v2] bus/pci: fix Windows kernel driver categories

2021-03-18 Thread Thomas Monjalon
In Windows probing, the value RTE_PCI_KDRV_NONE was used
instead of RTE_PCI_KDRV_UNKNOWN.
This value covers the mlx case where the kernel driver is in place,
offering a bifurcated mode to the userspace driver.
When the kernel driver is listed as unknown,
there is no special treatment in DPDK probing, contrary to UIO modes.

The value RTE_PCI_KDRV_NIC_UIO (FreeBSD) was re-used
instead of having a new RTE_PCI_KDRV_NET_UIO for Windows NetUIO.
While adding the new value RTE_PCI_KDRV_NET_UIO
(at the end for ABI compatibility),
the enum of kernel driver categories is annotated.
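
To illustrate the ABI remark: appending a new enumerator after the last
existing one leaves all existing numeric values unchanged, whereas inserting
it in the middle would shift the ones after it. The enums below are
hypothetical stand-ins that only mirror the ordering, not the DPDK
definitions.

```c
/* Layout before the change (hypothetical names). */
enum kdrv_before {
	B_UNKNOWN = 0,
	B_IGB_UIO,
	B_VFIO,
	B_UIO_GENERIC,
	B_NIC_UIO,
	B_NONE,
};

/* New value appended at the end: existing values keep their numbers. */
enum kdrv_appended {
	A_UNKNOWN = 0,
	A_IGB_UIO,
	A_VFIO,
	A_UIO_GENERIC,
	A_NIC_UIO,
	A_NONE,
	A_NET_UIO,
};

/* New value inserted in the middle: NONE is renumbered, which would break
 * binaries compiled against the old numbering. */
enum kdrv_inserted {
	I_UNKNOWN = 0,
	I_IGB_UIO,
	I_VFIO,
	I_UIO_GENERIC,
	I_NIC_UIO,
	I_NET_UIO,
	I_NONE,
};

_Static_assert((int)A_NONE == (int)B_NONE, "append keeps old values");
_Static_assert((int)I_NONE != (int)B_NONE, "mid-insert shifts old values");
```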

Fixes: b762221ac24f ("bus/pci: support Windows with bifurcated drivers")
Fixes: c76ec01b4591 ("bus/pci: support netuio on Windows")
Cc: sta...@dpdk.org

Signed-off-by: Thomas Monjalon 
Acked-by: Dmitry Kozlyuk 
---
v2: improve comments and commit message
---
 drivers/bus/pci/rte_bus_pci.h | 13 +++--
 drivers/bus/pci/windows/pci.c | 14 +++---
 2 files changed, 14 insertions(+), 13 deletions(-)

diff --git a/drivers/bus/pci/rte_bus_pci.h b/drivers/bus/pci/rte_bus_pci.h
index fdda046515..876abddefb 100644
--- a/drivers/bus/pci/rte_bus_pci.h
+++ b/drivers/bus/pci/rte_bus_pci.h
@@ -52,12 +52,13 @@ TAILQ_HEAD(rte_pci_driver_list, rte_pci_driver);
 struct rte_devargs;
 
 enum rte_pci_kernel_driver {
-   RTE_PCI_KDRV_UNKNOWN = 0,
-   RTE_PCI_KDRV_IGB_UIO,
-   RTE_PCI_KDRV_VFIO,
-   RTE_PCI_KDRV_UIO_GENERIC,
-   RTE_PCI_KDRV_NIC_UIO,
-   RTE_PCI_KDRV_NONE,
+   RTE_PCI_KDRV_UNKNOWN = 0,  /* may be misc UIO or bifurcated driver */
+   RTE_PCI_KDRV_IGB_UIO,  /* igb_uio for Linux */
+   RTE_PCI_KDRV_VFIO, /* VFIO for Linux */
+   RTE_PCI_KDRV_UIO_GENERIC,  /* uio_pci_generic for Linux */
+   RTE_PCI_KDRV_NIC_UIO,  /* nic_uio for FreeBSD */
+   RTE_PCI_KDRV_NONE, /* no attached driver */
+   RTE_PCI_KDRV_NET_UIO,  /* NetUIO for Windows */
 };
 
 /**
diff --git a/drivers/bus/pci/windows/pci.c b/drivers/bus/pci/windows/pci.c
index 8f906097f4..d39a7748b8 100644
--- a/drivers/bus/pci/windows/pci.c
+++ b/drivers/bus/pci/windows/pci.c
@@ -38,7 +38,7 @@ rte_pci_map_device(struct rte_pci_device *dev)
 * Devices that are bound to netuio are mapped at
 * the bus probing stage.
 */
-   if (dev->kdrv == RTE_PCI_KDRV_NIC_UIO)
+   if (dev->kdrv == RTE_PCI_KDRV_NET_UIO)
return 0;
else
return -1;
@@ -207,14 +207,14 @@ get_device_resource_info(HDEVINFO dev_info,
int ret;
 
switch (dev->kdrv) {
-   case RTE_PCI_KDRV_NONE:
-   /* mem_resource - Unneeded for RTE_PCI_KDRV_NONE */
+   case RTE_PCI_KDRV_UNKNOWN:
+   /* bifurcated driver case - mem_resource is unneeded */
dev->mem_resource[0].phys_addr = 0;
dev->mem_resource[0].len = 0;
dev->mem_resource[0].addr = NULL;
break;
-   case RTE_PCI_KDRV_NIC_UIO:
-   /* get device info from netuio kernel driver */
+   case RTE_PCI_KDRV_NET_UIO:
+   /* get device info from NetUIO kernel driver */
ret = get_netuio_device_info(dev_info, dev_info_data, dev);
if (ret != 0) {
RTE_LOG(DEBUG, EAL,
@@ -323,9 +323,9 @@ set_kernel_driver_type(PSP_DEVINFO_DATA device_info_data,
 {
/* set kernel driver type based on device class */
if (IsEqualGUID(&(device_info_data->ClassGuid), &GUID_DEVCLASS_NETUIO))
-   dev->kdrv = RTE_PCI_KDRV_NIC_UIO;
+   dev->kdrv = RTE_PCI_KDRV_NET_UIO;
else
-   dev->kdrv = RTE_PCI_KDRV_NONE;
+   dev->kdrv = RTE_PCI_KDRV_UNKNOWN;
 }
 
 static int
-- 
2.30.1



[dpdk-dev] [PATCH] doc: fix names of UIO drivers

2021-03-18 Thread Thomas Monjalon
Fix typos in the names of kernel drivers based on UIO,
and make sure the generic term for the interface is UIO in capitals.

Fixes: 3a78b2f73206 ("doc: add virtio crypto PMD guide")
Fixes: 3cc4d996fa75 ("doc: update VFIO usage in qat crypto guide")
Fixes: 39922c470e3c ("doc: add known uio_pci_generic issue for i40e")
Fixes: 86fa6c57a175 ("doc: add known igb_uio issue for i40e")
Fixes: beff6d8e8e2e ("net/netvsc: add documentation")
Cc: sta...@dpdk.org

Signed-off-by: Thomas Monjalon 
---
 doc/guides/cryptodevs/caam_jr.rst |  2 +-
 doc/guides/cryptodevs/qat.rst |  2 +-
 doc/guides/cryptodevs/virtio.rst  |  2 +-
 doc/guides/nics/netvsc.rst|  2 +-
 doc/guides/nics/virtio.rst|  5 +++--
 doc/guides/nics/vmxnet3.rst   |  3 ++-
 doc/guides/rel_notes/known_issues.rst | 10 +-
 doc/guides/sample_app_ug/vhost.rst|  2 +-
 8 files changed, 15 insertions(+), 13 deletions(-)

diff --git a/doc/guides/cryptodevs/caam_jr.rst b/doc/guides/cryptodevs/caam_jr.rst
index 5ef33ae78e..d7b0f14234 100644
--- a/doc/guides/cryptodevs/caam_jr.rst
+++ b/doc/guides/cryptodevs/caam_jr.rst
@@ -24,7 +24,7 @@ accelerators. This provides significant improvement to system level performance.
 
 SEC HW accelerator above 4.x+ version are also known as CAAM.
 
-caam_jr PMD is one of DPAA drivers which uses uio interface to interact with
+caam_jr PMD is one of DPAA drivers which uses UIO interface to interact with
 Linux kernel for configure and destroy the device instance (ring).
 
 
diff --git a/doc/guides/cryptodevs/qat.rst b/doc/guides/cryptodevs/qat.rst
index cf16f03503..ea5c03b8fa 100644
--- a/doc/guides/cryptodevs/qat.rst
+++ b/doc/guides/cryptodevs/qat.rst
@@ -562,7 +562,7 @@ Binding the available VFs to the vfio-pci driver
 
 Note:
 
-* Please note that due to security issues, the usage of older DPDK igb-uio
+* Please note that due to security issues, the usage of older DPDK igb_uio
   driver is not recommended. This document shows how to use the more secure
   vfio-pci driver.
 * If QAT fails to bind to vfio-pci on Linux kernel 5.9+, please see the
diff --git a/doc/guides/cryptodevs/virtio.rst b/doc/guides/cryptodevs/virtio.rst
index 83d8e32397..8b96446ff2 100644
--- a/doc/guides/cryptodevs/virtio.rst
+++ b/doc/guides/cryptodevs/virtio.rst
@@ -63,7 +63,7 @@ QEMU can then be started using the following parameters:
 -device virtio-crypto-pci,id=crypto0,cryptodev=cryptodev0
 [...]
 
-Secondly bind the uio_generic driver for the virtio-crypto device.
+Secondly bind the uio_pci_generic driver for the virtio-crypto device.
 For example, :00:04.0 is the domain, bus, device and function
 number of the virtio-crypto device:
 
diff --git a/doc/guides/nics/netvsc.rst b/doc/guides/nics/netvsc.rst
index 19f9940fe6..c0e218c743 100644
--- a/doc/guides/nics/netvsc.rst
+++ b/doc/guides/nics/netvsc.rst
@@ -62,7 +62,7 @@ store it in a shell variable:
 
 .. _`UUID`: https://en.wikipedia.org/wiki/Universally_unique_identifier
 
-There are several possible ways to assign the uio device driver for a device.
+There are several possible ways to assign the UIO device driver for a device.
 The easiest way (but only on 4.18 or later)
 is to use the `driverctl Device Driver control utility`_ to override
 the normal kernel device.
diff --git a/doc/guides/nics/virtio.rst b/doc/guides/nics/virtio.rst
index 02e74a6e77..ac07d4d1e5 100644
--- a/doc/guides/nics/virtio.rst
+++ b/doc/guides/nics/virtio.rst
@@ -71,7 +71,7 @@ In this release, the virtio PMD driver provides the basic functionality of packe
 
 *   Virtio supports software vlan stripping and inserting.
 
-*   Virtio supports using port IO to get PCI resource when uio/igb_uio module is not available.
+*   Virtio supports using port IO to get PCI resource when UIO module is not available.
 
 Prerequisites
 -
@@ -103,7 +103,8 @@ Host2VM communication example
 
 insmod rte_kni.ko
 
-Other basic DPDK preparations like hugepage enabling, uio port binding are not listed here.
+Other basic DPDK preparations like hugepage enabling,
+UIO port binding are not listed here.
 Please refer to the *DPDK Getting Started Guide* for detailed instructions.
 
 #.  Launch the kni user application:
diff --git a/doc/guides/nics/vmxnet3.rst b/doc/guides/nics/vmxnet3.rst
index ae146f0d55..190cf91a47 100644
--- a/doc/guides/nics/vmxnet3.rst
+++ b/doc/guides/nics/vmxnet3.rst
@@ -119,7 +119,8 @@ This section describes an example setup for Phy-vSwitch-VM-Phy communication.
 
 .. note::
 
-Other instructions on preparing to use DPDK such as, hugepage enabling, uio port binding are not listed here.
+Other instructions on preparing to use DPDK such as,
+hugepage enabling, UIO port binding are not listed here.
 Please refer to *DPDK Getting Started Guide and DPDK Sample Application's User Guide* for detailed instructions.
 
 The packet reception and transmission flow path is::
diff --git a/doc/guides/rel_notes/kno

[dpdk-dev] [PATCH] net/mlx5: support RSS expansion for IPv6 GRE

2021-03-18 Thread Jack Min
Currently RSS expansion only supports IPv4 as GRE payload or
delivery protocol (RFC2784). IPv6 as GRE payload or delivery protocol
(RFC7676) is not supported.

This patch adds RSS expansion for RFC7676 so the PMD can expand flow items
correctly.

Fixes: f4b901a46aec ("net/mlx5: add flow GRE item")
Cc: sta...@dpdk.org

Signed-off-by: Xiaoyu Min 
---
 drivers/net/mlx5/mlx5_flow.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_flow.c b/drivers/net/mlx5/mlx5_flow.c
index d46fc333d1..de4e4a374a 100644
--- a/drivers/net/mlx5/mlx5_flow.c
+++ b/drivers/net/mlx5/mlx5_flow.c
@@ -499,7 +499,8 @@ static const struct mlx5_flow_expand_node mlx5_support_expansion[] = {
(MLX5_EXPANSION_OUTER_IPV6_UDP,
 MLX5_EXPANSION_OUTER_IPV6_TCP,
 MLX5_EXPANSION_IPV4,
-MLX5_EXPANSION_IPV6),
+MLX5_EXPANSION_IPV6,
+MLX5_EXPANSION_GRE),
.type = RTE_FLOW_ITEM_TYPE_IPV6,
.rss_types = ETH_RSS_IPV6 | ETH_RSS_FRAG_IPV6 |
ETH_RSS_NONFRAG_IPV6_OTHER,
@@ -527,7 +528,8 @@ static const struct mlx5_flow_expand_node mlx5_support_expansion[] = {
.type = RTE_FLOW_ITEM_TYPE_VXLAN_GPE,
},
[MLX5_EXPANSION_GRE] = {
-   .next = MLX5_FLOW_EXPAND_RSS_NEXT(MLX5_EXPANSION_IPV4),
+   .next = MLX5_FLOW_EXPAND_RSS_NEXT(MLX5_EXPANSION_IPV4,
+ MLX5_EXPANSION_IPV6),
.type = RTE_FLOW_ITEM_TYPE_GRE,
},
[MLX5_EXPANSION_MPLS] = {
-- 
2.30.1


[dpdk-dev] [PATCH v1] lib/mempool: distinguish debug counters from cache and pool

2021-03-18 Thread Joyce Kong
If cache is enabled, objects will be retrieved/put from/to cache,
subsequently from/to the common pool. Currently the debug stats count
the objects retrieved/put from/to cache and pool together; it is
better to distinguish the counts for the local cache from those for
the common pool.

Signed-off-by: Joyce Kong 
---
 lib/librte_mempool/rte_mempool.c | 12 +++
 lib/librte_mempool/rte_mempool.h | 60 +++-
 2 files changed, 55 insertions(+), 17 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index afb1239c8..9cb69367a 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -1244,8 +1244,14 @@ rte_mempool_dump(FILE *f, struct rte_mempool *mp)
for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
sum.put_bulk += mp->stats[lcore_id].put_bulk;
sum.put_objs += mp->stats[lcore_id].put_objs;
+   sum.put_objs_cache += mp->stats[lcore_id].put_objs_cache;
+   sum.put_objs_pool += mp->stats[lcore_id].put_objs_pool;
+   sum.put_objs_flush += mp->stats[lcore_id].put_objs_flush;
sum.get_success_bulk += mp->stats[lcore_id].get_success_bulk;
sum.get_success_objs += mp->stats[lcore_id].get_success_objs;
+   sum.get_success_objs_cache += mp->stats[lcore_id].get_success_objs_cache;
+   sum.get_success_objs_pool += mp->stats[lcore_id].get_success_objs_pool;
+   sum.get_success_objs_refill += mp->stats[lcore_id].get_success_objs_refill;
sum.get_fail_bulk += mp->stats[lcore_id].get_fail_bulk;
sum.get_fail_objs += mp->stats[lcore_id].get_fail_objs;
sum.get_success_blks += mp->stats[lcore_id].get_success_blks;
@@ -1254,8 +1260,14 @@ rte_mempool_dump(FILE *f, struct rte_mempool *mp)
fprintf(f, "  stats:\n");
fprintf(f, "put_bulk=%"PRIu64"\n", sum.put_bulk);
fprintf(f, "put_objs=%"PRIu64"\n", sum.put_objs);
+   fprintf(f, "put_objs_cache=%"PRIu64"\n", sum.put_objs_cache);
+   fprintf(f, "put_objs_pool=%"PRIu64"\n", sum.put_objs_pool);
+   fprintf(f, "put_objs_flush=%"PRIu64"\n", sum.put_objs_flush);
fprintf(f, "get_success_bulk=%"PRIu64"\n", sum.get_success_bulk);
fprintf(f, "get_success_objs=%"PRIu64"\n", sum.get_success_objs);
+   fprintf(f, "get_success_objs_cache=%"PRIu64"\n", sum.get_success_objs_cache);
+   fprintf(f, "get_success_objs_pool=%"PRIu64"\n", sum.get_success_objs_pool);
+   fprintf(f, "get_success_objs_refill=%"PRIu64"\n", sum.get_success_objs_refill);
fprintf(f, "get_fail_bulk=%"PRIu64"\n", sum.get_fail_bulk);
fprintf(f, "get_fail_objs=%"PRIu64"\n", sum.get_fail_objs);
if (info.contig_block_size > 0) {
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index c551cf733..26f2e2bc0 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -66,12 +66,18 @@ extern "C" {
  * A structure that stores the mempool statistics (per-lcore).
  */
 struct rte_mempool_debug_stats {
-   uint64_t put_bulk; /**< Number of puts. */
-   uint64_t put_objs; /**< Number of objects successfully put. */
-   uint64_t get_success_bulk; /**< Successful allocation number. */
-   uint64_t get_success_objs; /**< Objects successfully allocated. */
-   uint64_t get_fail_bulk;/**< Failed allocation number. */
-   uint64_t get_fail_objs;/**< Objects that failed to be allocated. */
+   uint64_t put_bulk;   /**< Number of puts. */
+   uint64_t put_objs;   /**< Number of objects successfully put. */
+   uint64_t put_objs_cache; /**< Number of objects successfully put to cache. */
+   uint64_t put_objs_pool;  /**< Number of objects successfully put to pool. */
+   uint64_t put_objs_flush; /**< Number of flushing objects from cache to pool. */
+   uint64_t get_success_bulk;   /**< Successful allocation number. */
+   uint64_t get_success_objs;   /**< Objects successfully allocated. */
+   uint64_t get_success_objs_cache; /**< Objects successfully allocated from cache. */
+   uint64_t get_success_objs_pool;  /**< Objects successfully allocated from pool. */
+   uint64_t get_success_objs_refill;/**< Number of refilling objects from pool to cache. */
+   uint64_t get_fail_bulk;  /**< Failed allocation number. */
+   uint64_t get_fail_objs;  /**< Objects that failed to be allocated. */
/** Successful allocation number of contiguous blocks. */
uint64_t get_success_blks;
/** Failed allocation number of contiguous blocks. */
@@ -270,22 +276,34 @@ struct rte_mempool {
  *   Number to add to the object-oriented statistics.
  */
 #ifdef RTE_LIBRTE_MEMPOOL_DEBUG
-#define __MEMPOOL_STAT_ADD

[dpdk-dev] [PATCH v2] lib/mempool: distinguish debug counters from cache and pool

2021-03-18 Thread Joyce Kong
If cache is enabled, objects will be retrieved/put from/to cache,
subsequently from/to the common pool. Currently the debug stats count
the objects retrieved/put from/to cache and pool together; it is
better to distinguish the counts for the local cache from those for
the common pool.

Signed-off-by: Joyce Kong 
---
 lib/librte_mempool/rte_mempool.c | 12 ++
 lib/librte_mempool/rte_mempool.h | 64 ++--
 2 files changed, 57 insertions(+), 19 deletions(-)

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index afb1239c8..9cb69367a 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -1244,8 +1244,14 @@ rte_mempool_dump(FILE *f, struct rte_mempool *mp)
for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
sum.put_bulk += mp->stats[lcore_id].put_bulk;
sum.put_objs += mp->stats[lcore_id].put_objs;
+   sum.put_objs_cache += mp->stats[lcore_id].put_objs_cache;
+   sum.put_objs_pool += mp->stats[lcore_id].put_objs_pool;
+   sum.put_objs_flush += mp->stats[lcore_id].put_objs_flush;
sum.get_success_bulk += mp->stats[lcore_id].get_success_bulk;
sum.get_success_objs += mp->stats[lcore_id].get_success_objs;
+   sum.get_success_objs_cache += mp->stats[lcore_id].get_success_objs_cache;
+   sum.get_success_objs_pool += mp->stats[lcore_id].get_success_objs_pool;
+   sum.get_success_objs_refill += mp->stats[lcore_id].get_success_objs_refill;
sum.get_fail_bulk += mp->stats[lcore_id].get_fail_bulk;
sum.get_fail_objs += mp->stats[lcore_id].get_fail_objs;
sum.get_success_blks += mp->stats[lcore_id].get_success_blks;
@@ -1254,8 +1260,14 @@ rte_mempool_dump(FILE *f, struct rte_mempool *mp)
fprintf(f, "  stats:\n");
fprintf(f, "put_bulk=%"PRIu64"\n", sum.put_bulk);
fprintf(f, "put_objs=%"PRIu64"\n", sum.put_objs);
+   fprintf(f, "put_objs_cache=%"PRIu64"\n", sum.put_objs_cache);
+   fprintf(f, "put_objs_pool=%"PRIu64"\n", sum.put_objs_pool);
+   fprintf(f, "put_objs_flush=%"PRIu64"\n", sum.put_objs_flush);
fprintf(f, "get_success_bulk=%"PRIu64"\n", sum.get_success_bulk);
fprintf(f, "get_success_objs=%"PRIu64"\n", sum.get_success_objs);
+   fprintf(f, "get_success_objs_cache=%"PRIu64"\n", sum.get_success_objs_cache);
+   fprintf(f, "get_success_objs_pool=%"PRIu64"\n", sum.get_success_objs_pool);
+   fprintf(f, "get_success_objs_refill=%"PRIu64"\n", sum.get_success_objs_refill);
fprintf(f, "get_fail_bulk=%"PRIu64"\n", sum.get_fail_bulk);
fprintf(f, "get_fail_objs=%"PRIu64"\n", sum.get_fail_objs);
if (info.contig_block_size > 0) {
diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
index c551cf733..29d80d97e 100644
--- a/lib/librte_mempool/rte_mempool.h
+++ b/lib/librte_mempool/rte_mempool.h
@@ -66,12 +66,18 @@ extern "C" {
  * A structure that stores the mempool statistics (per-lcore).
  */
 struct rte_mempool_debug_stats {
-   uint64_t put_bulk; /**< Number of puts. */
-   uint64_t put_objs; /**< Number of objects successfully put. */
-   uint64_t get_success_bulk; /**< Successful allocation number. */
-   uint64_t get_success_objs; /**< Objects successfully allocated. */
-   uint64_t get_fail_bulk;/**< Failed allocation number. */
-   uint64_t get_fail_objs;/**< Objects that failed to be allocated. */
+   uint64_t put_bulk;/**< Number of puts. */
+   uint64_t put_objs;/**< Number of objects successfully put. */
+   uint64_t put_objs_cache;  /**< Number of objects successfully put to cache. */
+   uint64_t put_objs_pool;   /**< Number of objects successfully put to pool. */
+   uint64_t put_objs_flush;  /**< Number of flushing objects from cache to pool. */
+   uint64_t get_success_bulk;/**< Successful allocation number. */
+   uint64_t get_success_objs;/**< Objects successfully allocated. */
+   uint64_t get_success_objs_cache;  /**< Objects successfully allocated from cache. */
+   uint64_t get_success_objs_pool;   /**< Objects successfully allocated from pool. */
+   uint64_t get_success_objs_refill; /**< Number of refilling objects from pool to cache. */
+   uint64_t get_fail_bulk;   /**< Failed allocation number. */
+   uint64_t get_fail_objs;   /**< Objects that failed to be allocated. */
/** Successful allocation number of contiguous blocks. */
uint64_t get_success_blks;
/** Failed allocation number of contiguous blocks. */
@@ -270,22 +276,34 @@ struct rte_mempool {
  *   Number to add to the object-oriented statistics.
  */
 #ifdef RTE_LIBRTE_MEMPOOL_DEBUG
-#define __ME

Re: [dpdk-dev] [PATCH] net/e1000: fix filter control return value

2021-03-18 Thread Xiaozhen Ban
OK, but I think this bug affects all stable releases from about the last 6 years.


Re: [dpdk-dev] 19.11.4 patches review and test

2021-03-18 Thread Christian Ehrhardt
On Tue, Sep 1, 2020 at 3:23 PM Pai G, Sunil  wrote:
>
> Hi,
>
> Yes , OVS was using pkg-config even before these patches were rolled out.
> But it always used to pick the DPDK shared libs by default for OVS even on 
> using the -Bstatic/-Bshared flags.
> These patches from Bruce simplify the process from DPDK side without having 
> the user to specify them.
> Moreover, with these patches , the problem of shared DPDK libs always being 
> picked instead of static was not seen any more with a bit of changes from the 
> OVS side as well.
> http://patchwork.ozlabs.org/project/openvswitch/patch/20200707141126.71414-1-sunil.pa...@intel.com/
>  .
> The patches for ovs-master are ready as well and will send them out soon.

Hi Sunil, Ian, everyone ..

Back in 19.11.4 these DPDK changes were not picked up as they had
broken builds, as discussed here.
Later on the communication was that all this works fine now, and
therefore Luca "reverted the reverts" in 19.11.6 [1].

But today we were made aware that still no OVS 2.13 builds against a
DPDK that has those changes.
Neither 2.13.1 as we have it in Ubuntu nor (if it needs some OVS changes
backported) the recent 2.13.3 builds.
They still fail with the very same issue I reported [2] back then.

Unfortunately I have just released 19.11.7 so I can't revert them
there - but OTOH reverting and counter reverting every other release
seems wrong anyway.

I wanted to ask if there is a set of patches that OVS would need to
backport to 2.13.x to make this work?
If they could be identified and prepared Distros could use them on
2.13.3 asap and 2.13.4 could officially release them for OVS later on.

But for that we'd need a hint which OVS changes that would need to be.
All I know atm is that, from the testing reports on DPDK, it seems that OVS
2.14.3 and 2.15 are happy with the new DPDK code.
Do you have pointers on what 2.13.3 would need to get backported to
work again in regard to this build issue?


[1]: http://git.dpdk.org/dpdk-stable/log/?h=19.11&ofs=550
[2]: http://mails.dpdk.org/archives/stable/2020-September/024796.html

> Thanks and Regards,
> Pai G, Sunil
> Sunil
>
> > -Original Message-
> > From: Bruce Richardson 
> > Sent: Tuesday, September 1, 2020 6:18 PM
> > To: Christian Ehrhardt 
> > Cc: Luca Boccassi ; sta...@dpdk.org; dev ;
> > Pai G, Sunil ; Stokes, Ian 
> > Subject: Re: [dpdk-dev] 19.11.4 patches review and test
> >
> > On Tue, Sep 01, 2020 at 02:32:26PM +0200, Christian Ehrhardt wrote:
> > > On Tue, Sep 1, 2020 at 10:30 AM Luca Boccassi  wrote:
> > > >
> > > > On Tue, 2020-08-18 at 19:12 +0100, Luca Boccassi wrote:
> > > > > Hi all,
> > > > >
> > > > > Here is a list of patches targeted for stable release 19.11.4.
> > > > >
> > > > > The planned date for the final release is August 31st.
> > > > >
> > > > > Please help with testing and validation of your use cases and
> > > > > report any issues/results with reply-all to this mail. For the
> > > > > final release the fixes and reported validations will be added to the 
> > > > > release
> > notes.
> > > > >
> > > > > A release candidate tarball can be found at:
> > > > >
> > > > > https://dpdk.org/browse/dpdk-stable/tag/?id=v19.11.4-rc1
> > > > >
> > > > > These patches are located at branch 19.11 of dpdk-stable repo:
> > > > > https://dpdk.org/browse/dpdk-stable/
> > > > >
> > > > > Thanks.
> > > > >
> > > > > Luca Boccassi
> > > >
> > > > Microsoft's regression tests are still running, delaying until
> > > > Thursday the 3rd. Apologies for any inconvenience.
> > >
> > > Due to report on OVS failing to build I happened to find that 19.11.4
> > > has massively changed linking.
> > > => https://paste.ubuntu.com/p/znCRR4gpjP/
> > >
> > > This was meant to be helpful for sure and I assume is around:
> > > 48f7fd27f6 build/pkg-config: prevent overlinking
> > > 2d1535d592 build/pkg-config: improve static linking flags
> > > 9fb13a12c1 build/pkg-config: output drivers first for static build
> > > 59b108d824 build/pkg-config: move pkg-config file creation
> > > aea915e944 devtools: test static linkage with pkg-config
> > >
> > > But overlinking has effectively become underlinking now
> > > https://launchpadlibrarian.net/495845224/buildlog_ubuntu-groovy-amd64.
> > > openvswitch_2.13.1-0ubuntu2~ppa1_BUILDING.txt.gz
> > >
> > > /usr/bin/ld: /usr/lib/gcc/x86_64-linux-gnu/10/../../../x86_64-linux-
> > gnu/librte_pmd_ring.a(net_ring_rte_eth_ring.c.o):
> > > in function `rte_eth_from_rings':
> > > (.text+0x91c): undefined reference to `rte_vdev_init'
> > > /usr/bin/ld: /usr/lib/gcc/x86_64-linux-gnu/10/../../../x86_64-linux-
> > gnu/librte_pmd_ring.a(net_ring_rte_eth_ring.c.o):
> > > in function `vdrvinitfn_pmd_ring_drv':
> > > (.text.startup+0x28): undefined reference to `rte_vdev_register'
> > > collect2: error: ld returned 1 exit status
> > >
> > > Also as you can see in the pastebin above, CFlags and Libs massively
> > > shrunk and likely too much so.
> > >
> > > Given that this should be a stable release I'd ask to 

[dpdk-dev] [PATCH v2 1/2] examples/qos_sched: fixup colors value overrun

2021-03-18 Thread Konstantin Ananyev
'3' is not valid RTE_COLOR_ enum value.

Fixes: de3cfa2c9823 ("sched: initial import")
Cc: sta...@dpdk.org

Signed-off-by: Konstantin Ananyev 
---
 examples/qos_sched/app_thread.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/examples/qos_sched/app_thread.c b/examples/qos_sched/app_thread.c
index dbc878b55..a5f402190 100644
--- a/examples/qos_sched/app_thread.c
+++ b/examples/qos_sched/app_thread.c
@@ -53,7 +53,7 @@ get_pkt_sched(struct rte_mbuf *m, uint32_t *subport, uint32_t 
*pipe,
*queue = pipe_queue - *traffic_class;
 
/* Color (Destination IP) */
-   *color = pdata[COLOR_OFFSET] & 0x03;
+   *color = pdata[COLOR_OFFSET] % RTE_COLORS;
 
return 0;
 }
-- 
2.25.1



[dpdk-dev] [PATCH v2 2/2] qos: rearrange enqueue procedure

2021-03-18 Thread Konstantin Ananyev
In many usage scenarios input mbufs for rte_sched_port_enqueue()
are not yet in the CPU cache(s). That causes quite significant stalls
due to memory latency. Current implementation tries to mitigate it
using SW pipeline and SW prefetch techniques, but stalls are still present.
Rework rte_sched_port_enqueue() to do the actual fetch of all mbufs
metadata as the first stage of that function.
That helps to minimise load stalls at further stages of enqueue()
and improves overall enqueue performance.
With examples/qos_sched I observed:
on ICX box: up to 30% cycles reduction
on CSX and BDX: 15-20% cycles reduction
I also ran tests with mbufs already in the cache
(one core doing RX, QOS and TX).
In that scenario, no performance drop was observed
on any of the IA boxes mentioned above.

Signed-off-by: Konstantin Ananyev 
---
v2: fix clang and checkpatch complaints
---
 lib/librte_sched/rte_sched.c | 219 +--
 1 file changed, 31 insertions(+), 188 deletions(-)

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index 7c5688068..41ef147e0 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -1861,24 +1861,23 @@ debug_check_queue_slab(struct rte_sched_subport 
*subport, uint32_t bmp_pos,
 #endif /* RTE_SCHED_DEBUG */
 
 static inline struct rte_sched_subport *
-rte_sched_port_subport(struct rte_sched_port *port,
-   struct rte_mbuf *pkt)
+sched_port_subport(const struct rte_sched_port *port, struct rte_mbuf_sched 
sch)
 {
-   uint32_t queue_id = rte_mbuf_sched_queue_get(pkt);
+   uint32_t queue_id = sch.queue_id;
uint32_t subport_id = queue_id >> (port->n_pipes_per_subport_log2 + 4);
 
return port->subports[subport_id];
 }
 
 static inline uint32_t
-rte_sched_port_enqueue_qptrs_prefetch0(struct rte_sched_subport *subport,
-   struct rte_mbuf *pkt, uint32_t subport_qmask)
+sched_port_enqueue_qptrs_prefetch0(const struct rte_sched_subport *subport,
+   struct rte_mbuf_sched sch, uint32_t subport_qmask)
 {
struct rte_sched_queue *q;
 #ifdef RTE_SCHED_COLLECT_STATS
struct rte_sched_queue_extra *qe;
 #endif
-   uint32_t qindex = rte_mbuf_sched_queue_get(pkt);
+   uint32_t qindex = sch.queue_id;
uint32_t subport_queue_id = subport_qmask & qindex;
 
q = subport->queue + subport_queue_id;
@@ -1971,197 +1970,41 @@ int
 rte_sched_port_enqueue(struct rte_sched_port *port, struct rte_mbuf **pkts,
   uint32_t n_pkts)
 {
-   struct rte_mbuf *pkt00, *pkt01, *pkt10, *pkt11, *pkt20, *pkt21,
-   *pkt30, *pkt31, *pkt_last;
-   struct rte_mbuf **q00_base, **q01_base, **q10_base, **q11_base,
-   **q20_base, **q21_base, **q30_base, **q31_base, **q_last_base;
-   struct rte_sched_subport *subport00, *subport01, *subport10, *subport11,
-   *subport20, *subport21, *subport30, *subport31, *subport_last;
-   uint32_t q00, q01, q10, q11, q20, q21, q30, q31, q_last;
-   uint32_t r00, r01, r10, r11, r20, r21, r30, r31, r_last;
-   uint32_t subport_qmask;
uint32_t result, i;
+   struct rte_mbuf_sched sch[n_pkts];
+   struct rte_sched_subport *subports[n_pkts];
+   struct rte_mbuf **q_base[n_pkts];
+   uint32_t q[n_pkts];
+
+   const uint32_t subport_qmask =
+   (1 << (port->n_pipes_per_subport_log2 + 4)) - 1;
 
result = 0;
-   subport_qmask = (1 << (port->n_pipes_per_subport_log2 + 4)) - 1;
 
-   /*
-* Less then 6 input packets available, which is not enough to
-* feed the pipeline
-*/
-   if (unlikely(n_pkts < 6)) {
-   struct rte_sched_subport *subports[5];
-   struct rte_mbuf **q_base[5];
-   uint32_t q[5];
-
-   /* Prefetch the mbuf structure of each packet */
-   for (i = 0; i < n_pkts; i++)
-   rte_prefetch0(pkts[i]);
-
-   /* Prefetch the subport structure for each packet */
-   for (i = 0; i < n_pkts; i++)
-   subports[i] = rte_sched_port_subport(port, pkts[i]);
-
-   /* Prefetch the queue structure for each queue */
-   for (i = 0; i < n_pkts; i++)
-   q[i] = 
rte_sched_port_enqueue_qptrs_prefetch0(subports[i],
-   pkts[i], subport_qmask);
-
-   /* Prefetch the write pointer location of each queue */
-   for (i = 0; i < n_pkts; i++) {
-   q_base[i] = rte_sched_subport_pipe_qbase(subports[i], 
q[i]);
-   rte_sched_port_enqueue_qwa_prefetch0(port, subports[i],
-   q[i], q_base[i]);
-   }
+   /* Prefetch the mbuf structure of each packet */
+   for (i = 0; i < n_pkts; i++)
+   sch[i] = pkts[i]->hash.sched;
 
-   /* Write each packet to its queue */
-   for (i = 0; i < n_pkts; i++)
- 

Re: [dpdk-dev] [PATCH v2] bus/pci: fix Windows kernel driver categories

2021-03-18 Thread Tal Shnaiderman
> Subject: [PATCH v2] bus/pci: fix Windows kernel driver categories
> 
> In Windows probing, the value RTE_PCI_KDRV_NONE was used instead of
> RTE_PCI_KDRV_UNKNOWN.
> This value covers the mlx case where the kernel driver is in place, offering a
> bifurcated mode to the userspace driver.
> When the kernel driver is listed as unknown, there is no special treatment in
> DPDK probing, contrary to UIO modes.
> 
> The value RTE_PCI_KDRV_NIC_UIO (FreeBSD) was re-used instead of having
> a new RTE_PCI_KDRV_NET_UIO for Windows NetUIO.
> While adding the new value RTE_PCI_KDRV_NET_UIO (at the end for ABI
> compatibility), the enum of kernel driver categories is annotated.
> 
> Fixes: b762221ac24f ("bus/pci: support Windows with bifurcated drivers")
> Fixes: c76ec01b4591 ("bus/pci: support netuio on Windows")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Thomas Monjalon 
> Acked-by: Dmitry Kozlyuk 
> ---
> v2: improve comments and commit message
> ---
>  drivers/bus/pci/rte_bus_pci.h | 13 +++--
> drivers/bus/pci/windows/pci.c | 14 +++---
>  2 files changed, 14 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/bus/pci/rte_bus_pci.h b/drivers/bus/pci/rte_bus_pci.h
> index fdda046515..876abddefb 100644
> --- a/drivers/bus/pci/rte_bus_pci.h
> +++ b/drivers/bus/pci/rte_bus_pci.h
> @@ -52,12 +52,13 @@ TAILQ_HEAD(rte_pci_driver_list, rte_pci_driver);
> struct rte_devargs;
> 
>  enum rte_pci_kernel_driver {
> - RTE_PCI_KDRV_UNKNOWN = 0,
> - RTE_PCI_KDRV_IGB_UIO,
> - RTE_PCI_KDRV_VFIO,
> - RTE_PCI_KDRV_UIO_GENERIC,
> - RTE_PCI_KDRV_NIC_UIO,
> - RTE_PCI_KDRV_NONE,
> + RTE_PCI_KDRV_UNKNOWN = 0,  /* may be misc UIO or bifurcated
> driver */
> + RTE_PCI_KDRV_IGB_UIO,  /* igb_uio for Linux */
> + RTE_PCI_KDRV_VFIO, /* VFIO for Linux */
> + RTE_PCI_KDRV_UIO_GENERIC,  /* uio_pci_generic for Linux */
> + RTE_PCI_KDRV_NIC_UIO,  /* nic_uio for FreeBSD */
> + RTE_PCI_KDRV_NONE, /* no attached driver */
> + RTE_PCI_KDRV_NET_UIO,  /* NetUIO for Windows */
>  };
> 
>  /**
> diff --git a/drivers/bus/pci/windows/pci.c b/drivers/bus/pci/windows/pci.c
> index 8f906097f4..d39a7748b8 100644
> --- a/drivers/bus/pci/windows/pci.c
> +++ b/drivers/bus/pci/windows/pci.c
> @@ -38,7 +38,7 @@ rte_pci_map_device(struct rte_pci_device *dev)
>* Devices that are bound to netuio are mapped at
>* the bus probing stage.
>*/
> - if (dev->kdrv == RTE_PCI_KDRV_NIC_UIO)
> + if (dev->kdrv == RTE_PCI_KDRV_NET_UIO)
>   return 0;
>   else
>   return -1;
> @@ -207,14 +207,14 @@ get_device_resource_info(HDEVINFO dev_info,
>   int ret;
> 
>   switch (dev->kdrv) {
> - case RTE_PCI_KDRV_NONE:
> - /* mem_resource - Unneeded for RTE_PCI_KDRV_NONE */
> + case RTE_PCI_KDRV_UNKNOWN:
> + /* bifurcated driver case - mem_resource is unneeded */
>   dev->mem_resource[0].phys_addr = 0;
>   dev->mem_resource[0].len = 0;
>   dev->mem_resource[0].addr = NULL;
>   break;
> - case RTE_PCI_KDRV_NIC_UIO:
> - /* get device info from netuio kernel driver */
> + case RTE_PCI_KDRV_NET_UIO:
> + /* get device info from NetUIO kernel driver */
>   ret = get_netuio_device_info(dev_info, dev_info_data,
> dev);
>   if (ret != 0) {
>   RTE_LOG(DEBUG, EAL,
> @@ -323,9 +323,9 @@ set_kernel_driver_type(PSP_DEVINFO_DATA
> device_info_data,  {
>   /* set kernel driver type based on device class */
>   if (IsEqualGUID(&(device_info_data->ClassGuid),
> &GUID_DEVCLASS_NETUIO))
> - dev->kdrv = RTE_PCI_KDRV_NIC_UIO;
> + dev->kdrv = RTE_PCI_KDRV_NET_UIO;
>   else
> - dev->kdrv = RTE_PCI_KDRV_NONE;
> + dev->kdrv = RTE_PCI_KDRV_UNKNOWN;
>  }
> 
>  static int
> --
> 2.30.1

Acked-by: Tal Shnaiderman 


Re: [dpdk-dev] [PATCH 5/6] net/ngbe: add log type and error type

2021-03-18 Thread Thomas Monjalon
18/03/2021 10:32, Jiawen Wu:
> +#ifdef RTE_LIBRTE_NGBE_DEBUG_RX
> +extern int ngbe_logtype_rx;
> +#define PMD_RX_LOG(level, fmt, args...) \
> + rte_log(RTE_LOG_ ## level, ngbe_logtype_rx, \
> + "%s(): " fmt "\n", __func__, ##args)
> +#else
> +#define PMD_RX_LOG(level, fmt, args...) do { } while (0)
> +#endif
> +
> +#ifdef RTE_LIBRTE_NGBE_DEBUG_TX
> +extern int ngbe_logtype_tx;
> +#define PMD_TX_LOG(level, fmt, args...) \
> + rte_log(RTE_LOG_ ## level, ngbe_logtype_tx, \
> + "%s(): " fmt "\n", __func__, ##args)
> +#else
> +#define PMD_TX_LOG(level, fmt, args...) do { } while (0)
> +#endif

There is a discussion about using ethdev debug flags in PMDs.
Please check the mailing list:
https://inbox.dpdk.org/dev/20210318014234.2255366-2-qi.z.zh...@intel.com/




Re: [dpdk-dev] [PATCH 1/6] net/ngbe: add build and doc infrastructure

2021-03-18 Thread Thomas Monjalon
18/03/2021 10:32, Jiawen Wu:
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -886,6 +886,12 @@ F: drivers/net/txgbe/
>  F: doc/guides/nics/txgbe.rst
>  F: doc/guides/nics/features/txgbe.ini
>  
> +Wangxun txgbe
> +M: Jiawen Wu 
> +F: drivers/net/ngbe/
> +F: doc/guides/nics/ngbe.rst
> +F: doc/guides/nics/features/ngbe.ini

No it isn't txgbe :)




[dpdk-dev] [PATCH] ethdev: add queue state when retrieve queue information

2021-03-18 Thread Lijun Ou
Currently, an upper-layer application can get the queue state only
through pointers such as dev->data->tx_queue_state[queue_id],
which is not the recommended way to access it. So this patch
adds the queue state to the information returned by the
rte_eth_rx_queue_info_get and rte_eth_tx_queue_info_get APIs.

Note: Hairpin queues are not supported by the above
rte_eth_*x_queue_info_get functions, so the queue state can only be
RTE_ETH_QUEUE_STATE_STARTED or RTE_ETH_QUEUE_STATE_STOPPED.
Note: After adding the queue_state field, the 'struct rte_eth_rxq_info' size
remains 128B, and the 'struct rte_eth_txq_info' size remains 64B, so
the change is ABI compatible.

Signed-off-by: Chengwen Feng 
Signed-off-by: Lijun Ou 
---
 doc/guides/rel_notes/release_21_05.rst | 6 ++
 lib/librte_ethdev/rte_ethdev.c | 3 +++
 lib/librte_ethdev/rte_ethdev.h | 4 
 3 files changed, 13 insertions(+)

diff --git a/doc/guides/rel_notes/release_21_05.rst 
b/doc/guides/rel_notes/release_21_05.rst
index 43063e3..165b5f7 100644
--- a/doc/guides/rel_notes/release_21_05.rst
+++ b/doc/guides/rel_notes/release_21_05.rst
@@ -156,6 +156,12 @@ ABI Changes
 
 * No ABI change that would break compatibility with 20.11.
 
+* Added new field ``queue_state`` to ``rte_eth_rxq_info`` structure
+  to provide indicated rxq queue state.
+
+* Added new field ``queue_state`` to ``rte_eth_txq_info`` structure
+  to provide indicated txq queue state.
+
 
 Known Issues
 
diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
index 3059aa5..fbd10b2 100644
--- a/lib/librte_ethdev/rte_ethdev.c
+++ b/lib/librte_ethdev/rte_ethdev.c
@@ -5042,6 +5042,8 @@ rte_eth_rx_queue_info_get(uint16_t port_id, uint16_t 
queue_id,
 
memset(qinfo, 0, sizeof(*qinfo));
dev->dev_ops->rxq_info_get(dev, queue_id, qinfo);
+   qinfo->queue_state = dev->data->rx_queue_state[queue_id];
+
return 0;
 }
 
@@ -5082,6 +5084,7 @@ rte_eth_tx_queue_info_get(uint16_t port_id, uint16_t 
queue_id,
 
memset(qinfo, 0, sizeof(*qinfo));
dev->dev_ops->txq_info_get(dev, queue_id, qinfo);
+   qinfo->queue_state = dev->data->tx_queue_state[queue_id];
 
return 0;
 }
diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
index efda313..3b83c5a 100644
--- a/lib/librte_ethdev/rte_ethdev.h
+++ b/lib/librte_ethdev/rte_ethdev.h
@@ -1591,6 +1591,8 @@ struct rte_eth_rxq_info {
uint8_t scattered_rx;   /**< scattered packets RX supported. */
uint16_t nb_desc;   /**< configured number of RXDs. */
uint16_t rx_buf_size;   /**< hardware receive buffer size. */
+   /**< Queues state: STARTED(1) / STOPPED(0). */
+   uint8_t queue_state;
 } __rte_cache_min_aligned;
 
 /**
@@ -1600,6 +1602,8 @@ struct rte_eth_rxq_info {
 struct rte_eth_txq_info {
struct rte_eth_txconf conf; /**< queue config parameters. */
uint16_t nb_desc;   /**< configured number of TXDs. */
+   /**< Queues state: STARTED(1) / STOPPED(0). */
+   uint8_t queue_state;
 } __rte_cache_min_aligned;
 
 /* Generic Burst mode flag definition, values can be ORed. */
-- 
2.7.4



Re: [dpdk-dev] [PATCH] eal: fix version macro

2021-03-18 Thread Bruce Richardson
On Wed, Mar 17, 2021 at 11:01:25AM +0100, Thomas Monjalon wrote:
> 17/03/2021 10:48, David Marchand:
> > On Wed, Mar 17, 2021 at 10:31 AM Thomas Monjalon  
> > wrote:
> > >
> > > The macro RTE_VERSION is broken since updated with function calls.
> > > It is a build-time version number, and must be built with macros.
> > > For a run-time version number, there is the function rte_version().
> > >
> > > Fixes: 5b637a848195 ("eal: fix querying DPDK version at runtime")
> > > Cc: sta...@dpdk.org
> > >
> > > Reported-by: David Marchand 
> > > Signed-off-by: Thomas Monjalon 
> > > ---
> > >  lib/librte_eal/include/rte_version.h | 8 
> > >  1 file changed, 4 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/lib/librte_eal/include/rte_version.h 
> > > b/lib/librte_eal/include/rte_version.h
> > > index 2f3f727b46..736c5703be 100644
> > > --- a/lib/librte_eal/include/rte_version.h
> > > +++ b/lib/librte_eal/include/rte_version.h
> > > @@ -28,10 +28,10 @@ extern "C" {
> > >   * All version numbers in one to compare with RTE_VERSION_NUM()
> > >   */
> > >  #define RTE_VERSION RTE_VERSION_NUM( \
> > > -   rte_version_year(), \
> > > -   rte_version_month(), \
> > > -   rte_version_minor(), \
> > > -   rte_version_release())
> > > +   RTE_VER_YEAR, \
> > > +   RTE_VER_MONTH, \
> > > +   RTE_VER_MINOR, \
> > > +   RTE_VER_RELEASE)
> > >
> > >  /**
> > >   * Function to return DPDK version prefix string
> > 
> > The original patch wanted to fix rte_version() at runtime.
> > I don't see the need to keep the rte_version_XXX exports now that
> > RTE_VERSION is reverted.
> 
> I think it may help to query the version numbers at runtime,
> in "if" condition. Is there another way I'm missing?
> We may argue that the runtime version number should not be used
> to decide how to behave in an application.
> 
I would also tend toward keeping them, for the same reason that runtime is
definitely to be preferred over build time, and they are not likely to be
much of a maintenance burden.

Also, next time we have an ABI break, I wonder if the existing macros
should be renamed to have an RTE_BUILD_VER_ prefix, to make it clear that
it's the build version only that is being reported rather than the version
actually being used. Similarly the functions could be renamed to have
rte_runtime_ prefix, ensuring that in all cases the user is clear whether
they are getting the build version or the runtime version.

/Bruce


Re: [dpdk-dev] [PATCH] eal: mark version parts API as experimental

2021-03-18 Thread Bruce Richardson
On Wed, Mar 17, 2021 at 04:15:35PM +0100, Thomas Monjalon wrote:
> Some functions were introduced in DPDK 21.05 to query the version parts
> (prefix, year, month, minor, suffix, release) at runtime.
> Per guidelines, these new public functions must be marked with
> __rte_experimental and ABI versioned as EXPERIMENTAL.
> 
> Fixes: 5b637a848195 ("eal: fix querying DPDK version at runtime")
> Cc: sta...@dpdk.org
> 
> Suggested-by: David Marchand 
> Signed-off-by: Thomas Monjalon 
> ---
Acked-by: Bruce Richardson 


Re: [dpdk-dev] [PATCH 21.05] net/virtio: remove duplicate port id from virtio_user

2021-03-18 Thread David Marchand
On Wed, Mar 17, 2021 at 9:04 PM Maxime Coquelin
 wrote:
> > diff --git a/drivers/net/virtio/virtio_user/vhost_user.c 
> > b/drivers/net/virtio/virtio_user/vhost_user.c
> > index ec2c53c8fb..18ae29eed2 100644
> > --- a/drivers/net/virtio/virtio_user/vhost_user.c
> > +++ b/drivers/net/virtio/virtio_user/vhost_user.c
> > @@ -950,7 +950,8 @@ vhost_user_update_link_state(struct virtio_user_dev 
> > *dev)
> >   r = recv(data->vhostfd, buf, 128, MSG_PEEK);
> >   if (r == 0 || (r < 0 && errno != EAGAIN)) {
> >   dev->net_status &= (~VIRTIO_NET_S_LINK_UP);
> > - PMD_DRV_LOG(ERR, "virtio-user port %u is down", 
> > dev->port_id);
> > + PMD_DRV_LOG(ERR, "virtio-user port %u is down",
> > + dev->hw.port_id);
>
> Trivial, but it can fit in a single line, as IIRC, we can go up to 100
> chars now. If you agree, we can fix it while applying, no need to
> resubmit.

Yep, ok for me, thanks.


-- 
David Marchand



[dpdk-dev] [PATCH] crypto/qat: fix to small sgl oop min offset

2021-03-18 Thread Arek Kusztal
This commit fixes a problem with a too small offset when both offsets
(auth, cipher) are non-zero in the digest-encrypted case,
when using out-of-place and SGL.

Fixes: 40002f6c2a24 ("crypto/qat: extend support for digest-encrypted 
auth-cipher")
Cc: sta...@dpdk.org

Signed-off-by: Arek Kusztal 
---
 drivers/crypto/qat/qat_sym.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/crypto/qat/qat_sym.c b/drivers/crypto/qat/qat_sym.c
index 4b7676deb..a6cd33be3 100644
--- a/drivers/crypto/qat/qat_sym.c
+++ b/drivers/crypto/qat/qat_sym.c
@@ -162,6 +162,7 @@ qat_sym_build_request(void *in_op, uint8_t *out_msg,
uint8_t do_sgl = 0;
uint8_t in_place = 1;
int alignment_adjustment = 0;
+   int oop_shift = 0;
struct rte_crypto_op *op = (struct rte_crypto_op *)in_op;
struct qat_sym_op_cookie *cookie =
(struct qat_sym_op_cookie *)op_cookie;
@@ -472,6 +473,7 @@ qat_sym_build_request(void *in_op, uint8_t *out_msg,
rte_pktmbuf_iova_offset(op->sym->m_src, min_ofs);
dst_buf_start =
rte_pktmbuf_iova_offset(op->sym->m_dst, min_ofs);
+   oop_shift = min_ofs;
 
} else {
/* In-place operation
@@ -532,7 +534,7 @@ qat_sym_build_request(void *in_op, uint8_t *out_msg,
 /* First find the end of the data */
if (do_sgl) {
uint32_t remaining_off = auth_param->auth_off +
-   auth_param->auth_len + alignment_adjustment;
+   auth_param->auth_len + alignment_adjustment + 
oop_shift;
struct rte_mbuf *sgl_buf =
(in_place ?
op->sym->m_src : op->sym->m_dst);
-- 
2.17.1



Re: [dpdk-dev] 19.11.4 patches review and test

2021-03-18 Thread Pai G, Sunil
Hey Christian,



> back  in 19.11.4 these DPDK changes were not picked up as they have broken
> builds as discussed here.
> Later on the communication was that all this works fine now and thereby
> Luca has "reverted the reverts" in 19.11.6 [1].
> 
> But today we were made aware that still no OVS 2.13 builds against a DPDK
> that has those changes.
> Not 2.13.1 as we have it in Ubuntu nor (if it needs some OVS changes
> backported) the recent 2.13.3 does build.
> They still fail with the very same issue I reported [2] back then.
> 
> Unfortunately I have just released 19.11.7 so I can't revert them there - but
> OTOH reverting and counter reverting every other release seems wrong
> anyway.
> 
> I wanted to ask if there is a set of patches that OVS would need to backport
> to 2.13.x to make this work?
> If they could be identified and prepared Distros could use them on
> 2.13.3 asap and 2.13.4 could officially release them for OVS later on.
> 
> But for that we'd need a hint which OVS changes that would need to be.
> All I know atm is from the testing reports on DPDK it seems that OVS
> 2.14.3 and 2.15 are happy with the new DPDK code.

> Do you have pointers on what 2.13.3 would need to get backported to work
> again in regard to this build issue.

You would need to use partial contents from patch :
http://patchwork.ozlabs.org/project/openvswitch/patch/1608142365-26215-1-git-send-email-ian.sto...@intel.com/

If you'd like me to send patches that work with 2.13 and 2.14, I'm OK with
that too (keeping only the parts of the patch that fix the issue you see).
But we must ensure it doesn't cause problems for OVS too.
Your thoughts, Ilya?


> 
> [1]: http://git.dpdk.org/dpdk-stable/log/?h=19.11&ofs=550
> [2]: http://mails.dpdk.org/archives/stable/2020-September/024796.html


Thanks and regards,
Sunil



Re: [dpdk-dev] [PATCH] net/mlx5: support RSS expansion for IPv6 GRE

2021-03-18 Thread Matan Azrad



From: Jack Min
 
> Currently RSS expansion only support IPv4 as GRE payload or delivery protocol
> (RFC2784). IPv6 as GRE payload or delivery protocol
> (RFC7676) is not supported.
> 
> This patch add RSS expansion for RFC7676 so PMD can expand flow item
> correctly.
> 
> Fixes: f4b901a46aec ("net/mlx5: add flow GRE item")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Xiaoyu Min 
Acked-by: Matan Azrad 


Re: [dpdk-dev] [PATCH] eal: fix version macro

2021-03-18 Thread Thomas Monjalon
18/03/2021 13:28, Bruce Richardson:
> On Wed, Mar 17, 2021 at 11:01:25AM +0100, Thomas Monjalon wrote:
> > 17/03/2021 10:48, David Marchand:
> > > On Wed, Mar 17, 2021 at 10:31 AM Thomas Monjalon  
> > > wrote:
> > > >
> > > > The macro RTE_VERSION is broken since updated with function calls.
> > > > It is a build-time version number, and must be built with macros.
> > > > For a run-time version number, there is the function rte_version().
> > > >
> > > > Fixes: 5b637a848195 ("eal: fix querying DPDK version at runtime")
> > > > Cc: sta...@dpdk.org
> > > >
> > > > Reported-by: David Marchand 
> > > > Signed-off-by: Thomas Monjalon 
> > > > ---
> > > >  lib/librte_eal/include/rte_version.h | 8 
> > > >  1 file changed, 4 insertions(+), 4 deletions(-)
> > > >
> > > > diff --git a/lib/librte_eal/include/rte_version.h 
> > > > b/lib/librte_eal/include/rte_version.h
> > > > index 2f3f727b46..736c5703be 100644
> > > > --- a/lib/librte_eal/include/rte_version.h
> > > > +++ b/lib/librte_eal/include/rte_version.h
> > > > @@ -28,10 +28,10 @@ extern "C" {
> > > >   * All version numbers in one to compare with RTE_VERSION_NUM()
> > > >   */
> > > >  #define RTE_VERSION RTE_VERSION_NUM( \
> > > > -   rte_version_year(), \
> > > > -   rte_version_month(), \
> > > > -   rte_version_minor(), \
> > > > -   rte_version_release())
> > > > +   RTE_VER_YEAR, \
> > > > +   RTE_VER_MONTH, \
> > > > +   RTE_VER_MINOR, \
> > > > +   RTE_VER_RELEASE)
> > > >
> > > >  /**
> > > >   * Function to return DPDK version prefix string
> > > 
> > > The original patch wanted to fix rte_version() at runtime.
> > > I don't see the need to keep the rte_version_XXX exports now that
> > > RTE_VERSION is reverted.
> > 
> > I think it may help to query the version numbers at runtime,
> > in "if" condition. Is there another way I'm missing?
> > We may argue that the runtime version number should not be used
> > to decide how to behave in an application.
> > 
> > I would also tend toward keeping them, for the same reason that runtime is
> > definitely to be preferred over build time, and they are not likely to be
> > much of a maintenance burden.
> 
> Also, next time we have an ABI break, I wonder if the existing macros
> should be renamed to have an RTE_BUILD_VER_ prefix, to make it clear that
> it's the build version only that is being reported rather than the version
> actually being used. Similarly the functions could be renamed to have
> rte_runtime_ prefix, ensuring that in all cases the user is clear whether
> they are getting the build version or the runtime version.

I am fine with such a rename,
but it is already quite clear that a macro is evaluated at build time,
and a function is usually evaluated at runtime.




Re: [dpdk-dev] [PATCH 1/3] Add EAL threads API

2021-03-18 Thread Tal Shnaiderman
> Subject: [dpdk-dev] [PATCH 1/3] Add EAL threads API
> 
> From: Narcisa Vasile 
> 
> EAL must hide the environment specifics from apps and libraries.
> Add an EAL API for managing threads.
> 
> Signed-off-by: Narcisa Vasile 
> Signed-off-by: Dmitry Malloy 
> ---

Hi Naty, Dmitry, 

Thank you for adding those functions to the thread API.
This is a huge commit; I'd split it into separate patches, something like:

1) Move existing code to rte_thread.
2) Add empty stubs for new functions.
3) Implement OS functions for thread creation/join.
4) Implement OS functions for thread affinity.
5) Implement OS functions for thread priority.

>  lib/librte_eal/common/eal_common_thread.c |   6 +-
>  lib/librte_eal/common/rte_thread.c| 346 
>  lib/librte_eal/include/rte_thread.h   | 323 ++-
>  lib/librte_eal/include/rte_thread_types.h |  20 +
>  lib/librte_eal/windows/eal_lcore.c| 167 --
>  lib/librte_eal/windows/eal_windows.h  |  10 +
>  .../include/rte_windows_thread_types.h|  19 +
>  lib/librte_eal/windows/rte_thread.c   | 507 +-
>  8 files changed, 1338 insertions(+), 60 deletions(-)  create mode 100644



> --- a/lib/librte_eal/windows/rte_thread.c
> +++ b/lib/librte_eal/windows/rte_thread.c
> @@ -1,16 +1,503 @@
>  /* SPDX-License-Identifier: BSD-3-Clause
>   * Copyright 2021 Mellanox Technologies, Ltd
> + * Copyright(c) 2021 Microsoft Corporation
>   */
> 
>  #include 
> -#include 
>  #include 
> -#include 
> +
> +#include "eal_windows.h"
> 
>  struct eal_tls_key {
>   DWORD thread_index;
>  };
> 

I don't know if this table is needed. The approach should be to have the return
value/rte_errno identical between the OSs, and to print the OS-specific
error code.

E.g. pthread_setschedparam on UNIX returns ESRCH when no thread ID is found.
The table below doesn't translate to it, so Windows will never return such an
error code. Maybe use only the errnos below for all OSs?
What do you think?

> +/* Translates the most common error codes related to threads */
> +static int
> +rte_thread_translate_win32_error(DWORD error)
> +{
> + switch (error) {
> + case ERROR_SUCCESS:
> + return 0;
> +
> + case ERROR_INVALID_PARAMETER:
> + return -EINVAL;
> +
> + case ERROR_INVALID_HANDLE:
> + return -EFAULT;
> +
> + case ERROR_NOT_ENOUGH_MEMORY:
> + /* FALLTHROUGH */
> + case ERROR_NO_SYSTEM_RESOURCES:
> + return -ENOMEM;
> +
> + case ERROR_PRIVILEGE_NOT_HELD:
> + /* FALLTHROUGH */
> + case ERROR_ACCESS_DENIED:
> + return -EACCES;
> +
> + case ERROR_ALREADY_EXISTS:
> + return -EEXIST;
> +
> + case ERROR_POSSIBLE_DEADLOCK:
> + return -EDEADLK;
> +
> + case ERROR_INVALID_FUNCTION:
> + /* FALLTHROUGH */
> + case ERROR_CALL_NOT_IMPLEMENTED:
> + return -ENOSYS;
> +
> + default:
> + return -EINVAL;
> + }
> +
> + return -EINVAL;
> +}



Re: [dpdk-dev] 19.11.4 patches review and test

2021-03-18 Thread Ilya Maximets
On 3/18/21 2:36 PM, Pai G, Sunil wrote:
> Hey Christian,
> 
> 
> 
>> back  in 19.11.4 these DPDK changes were not picked up as they have broken
>> builds as discussed here.
>> Later on the communication was that all this works fine now and thereby
>> Luca has "reverted the reverts" in 19.11.6 [1].
>>
>> But today we were made aware that still no OVS 2.13 builds against a DPDK
>> that has those changes.
>> Not 2.13.1 as we have it in Ubuntu nor (if it needs some OVS changes
>> backported) the recent 2.13.3 does build.
>> They still fail with the very same issue I reported [2] back then.
>>
>> Unfortunately I have just released 19.11.7 so I can't revert them there - but
>> OTOH reverting and counter reverting every other release seems wrong
>> anyway.

It is wrong indeed, but the main question here is why these patches were
backported to a stable release in the first place?

Looking at these patches, they are not actual bug fixes but more like
"nice to have" features that additionally break the way applications
link with DPDK.  Stuff like that should not be accepted into a stable
release without a strong justification or, at least, testing with actual
applications.

Since we already have a revert of revert, revert of revert of revert
doesn't seem so bad.

>>
>> I wanted to ask if there is a set of patches that OVS would need to backport
>> to 2.13.x to make this work?
>> If they could be identified and prepared Distros could use them on
>> 2.13.3 asap and 2.13.4 could officially release them for OVS later on.
>>
>> But for that we'd need a hint which OVS changes that would need to be.
>> All I know atm is from the testing reports on DPDK it seems that OVS
>> 2.14.3 and 2.15 are happy with the new DPDK code.
> 
>> Do you have pointers on what 2.13.3 would need to get backported to work
>> again in regard to this build issue.
> 
> You would need to use partial contents from patch :
> http://patchwork.ozlabs.org/project/openvswitch/patch/1608142365-26215-1-git-send-email-ian.sto...@intel.com/
> 
> If you'd like me to send patches which would work with 2.13, 2.14, I'm ok 
> with that too.[keeping only those parts from patch which fixes the issue you 
> see.]
> But we must ensure it doesn’t cause problems for OVS too.
> Your thoughts Ilya ?

We had more fixes on top of this particular patch and I'd like to not
cherry-pick and re-check all of this again.  For users, stable releases
should be transparent, i.e. should not have disruptive changes that will
break their ability to build with the version of a library that they would
like to use.

What are the exact changes we're talking about?  Will it still be possible
to build OVS with older versions of stable 19.11 if these changes are applied?

> 
> 
>>
>> [1]: http://git.dpdk.org/dpdk-stable/log/?h=19.11&ofs=550
>> [2]: http://mails.dpdk.org/archives/stable/2020-September/024796.html
> 
> 
> Thanks and regards,
> Sunil
>


Re: [dpdk-dev] [PATCH 1/6] baseband: introduce NXP LA12xx driver

2021-03-18 Thread David Marchand
On Thu, Mar 18, 2021 at 7:38 AM Hemant Agrawal  wrote:
>
> This patch introduce the baseband device drivers for NXP's
> LA1200 series software defined baseband modem.

Such a series deserves a cover letter.
You should copy bbdev maintainer and cryptodev subtree maintainer.

I quickly looked at the series and see no change to the bbdev unit test code.
Are those tests running fine with no modification? (I sure hope so, but
I want a confirmation.)


>
> Signed-off-by: Nipun Gupta 
> Signed-off-by: Hemant Agrawal 
> ---
>  drivers/baseband/la12xx/bbdev_la12xx.c| 110 ++
>  .../baseband/la12xx/bbdev_la12xx_pmd_logs.h   |  38 ++
>  drivers/baseband/la12xx/meson.build   |   6 +
>  drivers/baseband/la12xx/version.map   |   3 +
>  drivers/baseband/meson.build  |   2 +-
>  5 files changed, 158 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/baseband/la12xx/bbdev_la12xx.c
>  create mode 100644 drivers/baseband/la12xx/bbdev_la12xx_pmd_logs.h
>  create mode 100644 drivers/baseband/la12xx/meson.build
>  create mode 100644 drivers/baseband/la12xx/version.map
>

[snip]

> +};
> +
> +RTE_PMD_REGISTER_VDEV(DRIVER_NAME, bbdev_la12xx_pmd_drv);
> +RTE_PMD_REGISTER_ALIAS(DRIVER_NAME, bbdev_la12xx);

From a quick glance at this patch, there is no need for an alias.
Aliases are for maintaining compatibility when drivers are renamed, but
this is a new driver.


-- 
David Marchand



Re: [dpdk-dev] [PATCH] net/e1000: fix filter control return value

2021-03-18 Thread Wang, Haiyue
> -Original Message-
> From: Xiaozhen Ban 
> Sent: Thursday, March 18, 2021 19:44
> To: Guo, Jia ; Wang, Haiyue 
> Cc: dev@dpdk.org; sta...@dpdk.org
> Subject: RE: RE: [PATCH] net/e1000: fix filter control return value
> 
> OK, but I think this bug affects all stable releases going back about 6 years.

I don't think so: since it is a PMD-internal op, the real API 'rte_flow_ops_get'
always uses RTE_ETH_FILTER_GENERIC. ;-)



Re: [dpdk-dev] [PATCH] eal: fix version macro

2021-03-18 Thread Bruce Richardson
On Thu, Mar 18, 2021 at 03:41:35PM +0100, Thomas Monjalon wrote:
> 18/03/2021 13:28, Bruce Richardson:
> > On Wed, Mar 17, 2021 at 11:01:25AM +0100, Thomas Monjalon wrote:
> > > 17/03/2021 10:48, David Marchand:
> > > > On Wed, Mar 17, 2021 at 10:31 AM Thomas Monjalon  
> > > > wrote:
> > > > >
> > > > > The macro RTE_VERSION is broken since updated with function calls.
> > > > > It is a build-time version number, and must be built with macros.
> > > > > For a run-time version number, there is the function rte_version().
> > > > >
> > > > > Fixes: 5b637a848195 ("eal: fix querying DPDK version at runtime")
> > > > > Cc: sta...@dpdk.org
> > > > >
> > > > > Reported-by: David Marchand 
> > > > > Signed-off-by: Thomas Monjalon 
> > > > > ---
> > > > >  lib/librte_eal/include/rte_version.h | 8 
> > > > >  1 file changed, 4 insertions(+), 4 deletions(-)
> > > > >
> > > > > diff --git a/lib/librte_eal/include/rte_version.h 
> > > > > b/lib/librte_eal/include/rte_version.h
> > > > > index 2f3f727b46..736c5703be 100644
> > > > > --- a/lib/librte_eal/include/rte_version.h
> > > > > +++ b/lib/librte_eal/include/rte_version.h
> > > > > @@ -28,10 +28,10 @@ extern "C" {
> > > > >   * All version numbers in one to compare with RTE_VERSION_NUM()
> > > > >   */
> > > > >  #define RTE_VERSION RTE_VERSION_NUM( \
> > > > > -   rte_version_year(), \
> > > > > -   rte_version_month(), \
> > > > > -   rte_version_minor(), \
> > > > > -   rte_version_release())
> > > > > +   RTE_VER_YEAR, \
> > > > > +   RTE_VER_MONTH, \
> > > > > +   RTE_VER_MINOR, \
> > > > > +   RTE_VER_RELEASE)
> > > > >
> > > > >  /**
> > > > >   * Function to return DPDK version prefix string
> > > > 
> > > > The original patch wanted to fix rte_version() at runtime.
> > > > I don't see the need to keep the rte_version_XXX exports now that
> > > > RTE_VERSION is reverted.
> > > 
> > > I think it may help to query the version numbers at runtime,
> > > in "if" condition. Is there another way I'm missing?
> > > We may argue that the runtime version number should not be used
> > > to decide how to behave in an application.
> > > 
> > I would also tend toward keeping them, for the same reason that runtime is
> > definitely to be preferred over build time, and they are not likely to be
> > much of a maintenance burden.
> > 
> > Also, next time we have an ABI break, I wonder if the existing macros
> > should be renamed to have an RTE_BUILD_VER_ prefix, to make it clear that
> > it's the build version only that is being reported rather than the version
> > actually being used. Similarly the functions could be renamed to have
> > rte_runtime_ prefix, ensuring that in all cases the user is clear whether
> > they are getting the build version or the runtime version.
> 
> I am fine with such rename,
> but that's already quite clear that a macro is at build time,
> and a function is usually evaluated at runtime.
> 

If we take the existing rte_version function, without checking the source
code, one has no way of checking if that is resolved at runtime (as it is
now) or at compile-time (as it was). However, if we assume that that is a
bug and that all such functions should be run-time operations, then
there is no difficulty.
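
To make the build-time versus run-time distinction concrete, here is a
minimal standalone sketch. All DEMO_/demo_ names are hypothetical stand-ins
(not the DPDK macros or functions), and the "runtime" values are hard-coded
only to simulate an application built against one release and loaded against
a newer shared library.

```c
/* Hypothetical stand-in for a version-packing macro. */
#define DEMO_VERSION_NUM(a, b, c, d) ((a) << 24 | (b) << 16 | (c) << 8 | (d))

/* Build-time constants: frozen into the application when it is compiled. */
#define DEMO_VER_YEAR    21
#define DEMO_VER_MONTH   5
#define DEMO_VER_MINOR   0
#define DEMO_VER_RELEASE 0
#define DEMO_VERSION DEMO_VERSION_NUM(DEMO_VER_YEAR, DEMO_VER_MONTH, \
				      DEMO_VER_MINOR, DEMO_VER_RELEASE)

/*
 * A build-time version must expand to constants so that it can be used in
 * a preprocessor conditional; a macro that expands to function calls
 * cannot be evaluated here.
 */
#if DEMO_VERSION >= DEMO_VERSION_NUM(21, 2, 0, 0)
#define DEMO_BUILT_WITH_21_02_OR_LATER 1
#else
#define DEMO_BUILT_WITH_21_02_OR_LATER 0
#endif

/*
 * Run-time queries: in a shared library these would report the version of
 * the library actually loaded. Hard-coded here (assumption) to 21.08.
 */
unsigned int demo_version_year(void)    { return 21; }
unsigned int demo_version_month(void)   { return 8; }
unsigned int demo_version_minor(void)   { return 0; }
unsigned int demo_version_release(void) { return 0; }

/* Same packing as the macro, but evaluated when the program runs. */
unsigned int demo_runtime_version(void)
{
	return DEMO_VERSION_NUM(demo_version_year(), demo_version_month(),
				demo_version_minor(), demo_version_release());
}
```

With this split, "#if DEMO_VERSION ..." answers "what was I compiled
against?", while "if (demo_runtime_version() ...)" answers "what am I
running with?", and the two can legitimately differ.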

/Bruce


Re: [dpdk-dev] [PATCH 1/3] Add EAL threads API

2021-03-18 Thread David Marchand
On Thu, Mar 18, 2021 at 2:01 AM Narcisa Ana Maria Vasile
 wrote:
> diff --git a/lib/librte_eal/common/eal_common_thread.c 
> b/lib/librte_eal/common/eal_common_thread.c
> index 73a055902..5219e783e 100644
> --- a/lib/librte_eal/common/eal_common_thread.c
> +++ b/lib/librte_eal/common/eal_common_thread.c
> @@ -84,7 +84,7 @@ thread_update_affinity(rte_cpuset_t *cpusetp)
>  }
>
>  int
> -rte_thread_set_affinity(rte_cpuset_t *cpusetp)
> +rte_thread_self_set_affinity(rte_cpuset_t *cpusetp)
>  {
> if (pthread_setaffinity_np(pthread_self(), sizeof(rte_cpuset_t),
> cpusetp) != 0) {

[snip]

> diff --git a/lib/librte_eal/include/rte_thread.h 
> b/lib/librte_eal/include/rte_thread.h
> index e640ea185..66b112bc4 100644
> --- a/lib/librte_eal/include/rte_thread.h
> +++ b/lib/librte_eal/include/rte_thread.h

[snip]

> +/**
> + * Set the affinity of thread 'thread_id' to the cpu set
> + * specified by 'cpuset'.
> + *
> + * @param thread_id
> + *Id of the thread for which to set the affinity.
> + *
> + * @param cpuset_size
> + *
> + * @param cpuset
> + *   Pointer to CPU affinity to set.
> + *
> + * @return
> + *   On success, return 0.
> + *   On failure, return nonzero.
> + */
> +__rte_experimental
> +int rte_thread_set_affinity(rte_thread_t thread_id, size_t cpuset_size,
> +   const rte_cpuset_t *cpuset);
> +

[snip]

> @@ -34,7 +353,7 @@ typedef struct eal_tls_key *rte_tls_key;
>   * @return
>   *   On success, return 0; otherwise return -1;
>   */
> -int rte_thread_set_affinity(rte_cpuset_t *cpusetp);
> +int rte_thread_self_set_affinity(rte_cpuset_t *cpusetp);
>
>  /**
>   * Get core affinity of the current thread.

rte_thread_*et_affinity() are stable.
This breaks the ABI (which is bad) and this API change was not
announced previously.
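
One possible way to avoid the break, sketched below under assumptions, is
to leave the stable symbol's name and signature untouched and give the new
per-thread function a different name. Every demo_ identifier here is a
hypothetical stand-in (the types are not rte_cpuset_t/rte_thread_t, and the
affinity table only simulates the effect); it is an illustration, not a
proposal for the actual naming.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the cpuset and thread-id types. */
typedef struct { unsigned long mask; } demo_cpuset_t;
typedef struct { int id; } demo_thread_t;

#define DEMO_MAX_THREADS 4

/* Simulated per-thread affinity storage, for the sketch only. */
demo_cpuset_t demo_affinity[DEMO_MAX_THREADS];

/* Simulated "current thread" lookup; always thread 0 in this sketch. */
demo_thread_t demo_thread_self(void)
{
	demo_thread_t t = { 0 };
	return t;
}

/*
 * New functionality under a NEW name: no existing exported symbol
 * changes its signature, so nothing is removed from the ABI.
 */
int demo_thread_set_affinity_by_id(demo_thread_t thread,
				   const demo_cpuset_t *set)
{
	if (set == NULL || thread.id < 0 || thread.id >= DEMO_MAX_THREADS)
		return -1;
	demo_affinity[thread.id] = *set;
	return 0;
}

/*
 * Existing stable function: same name, same signature as before.
 * It simply forwards to the new function for the calling thread.
 */
int demo_thread_set_affinity(demo_cpuset_t *set)
{
	return demo_thread_set_affinity_by_id(demo_thread_self(), set);
}
```

Applications linked against the old symbol keep working, and new code can
opt in to the per-thread variant.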

The ABI check will catch it for you if you stop at this patch (the
patch 3 actually makes the check go silent because of a wrong
version.map update with duplicate symbols).

$ DPDK_ABI_REF_VERSION=v21.02 ./devtools/test-meson-builds.sh
...
[2502/2502] Linking target drivers/librte_event_octeontx2.so.21.2
Error: ABI issue reported for 'abidiff --suppr
/home/dmarchan/dpdk/devtools/../devtools/libabigail.abignore
--no-added-syms --headers-dir1
/home/dmarchan/abi/v21.02/build-gcc-shared/usr/local/include
--headers-dir2 /home/dmarchan/builds/build-gcc-shared/install/usr/local/include
/home/dmarchan/abi/v21.02/build-gcc-shared/dump/librte_eal.dump
/home/dmarchan/builds/build-gcc-shared/install/dump/librte_eal.dump'
ABIDIFF_ABI_CHANGE, this change requires a review (abidiff flagged
this as a potential issue).
ABIDIFF_ABI_INCOMPATIBLE_CHANGE, this change breaks the ABI.

$ abidiff --suppr
/home/dmarchan/dpdk/devtools/../devtools/libabigail.abignore
--no-added-syms --headers-dir1
/home/dmarchan/abi/v21.02/build-gcc-shared/usr/local/include
--headers-dir2 /home/dmarchan/builds/build-gcc-shared/install/usr/local/include
/home/dmarchan/abi/v21.02/build-gcc-shared/dump/librte_eal.dump
/home/dmarchan/builds/build-gcc-shared/install/dump/librte_eal.dump
Functions changes summary: 2 Removed, 0 Changed, 0 Added (6 filtered
out) functions
Variables changes summary: 0 Removed, 0 Changed, 0 Added variable
Variable symbols changes summary: 0 Removed, 0 Added variable symbol
not referenced by debug info

2 Removed functions:

  [D] 'function void rte_thread_get_affinity(rte_cpuset_t*)'
{rte_thread_get_affinity@@DPDK_21}
  [D] 'function int rte_thread_set_affinity(rte_cpuset_t*)'
{rte_thread_set_affinity@@DPDK_21}


-- 
David Marchand



[dpdk-dev] [PATCH v5 1/6] net/ark: update pkt director initial state

2021-03-18 Thread Ed Czeck
Fixes: b33ccdb17f55 ("net/ark: provide API for hardware modules MPU RQP and pktdir")
Cc: sta...@dpdk.org

Signed-off-by: Ed Czeck 
---
 drivers/net/ark/ark_ethdev.c | 1 +
 drivers/net/ark/ark_pktdir.c | 2 +-
 drivers/net/ark/ark_pktdir.h | 2 +-
 3 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ark/ark_ethdev.c b/drivers/net/ark/ark_ethdev.c
index ef650a465..477e1de02 100644
--- a/drivers/net/ark/ark_ethdev.c
+++ b/drivers/net/ark/ark_ethdev.c
@@ -321,6 +321,7 @@ eth_ark_dev_init(struct rte_eth_dev *dev)
ark->rqpacing =
(struct ark_rqpace_t *)(ark->bar0 + ARK_RCPACING_BASE);
ark->started = 0;
+   ark->pkt_dir_v = ARK_PKT_DIR_INIT_VAL;
 
ARK_PMD_LOG(INFO, "Sys Ctrl Const = 0x%x  HW Commit_ID: %08x\n",
  ark->sysctrl.t32[4],
diff --git a/drivers/net/ark/ark_pktdir.c b/drivers/net/ark/ark_pktdir.c
index 25e121831..dbfd2924b 100644
--- a/drivers/net/ark/ark_pktdir.c
+++ b/drivers/net/ark/ark_pktdir.c
@@ -22,7 +22,7 @@ ark_pktdir_init(void *base)
return inst;
}
inst->regs = (struct ark_pkt_dir_regs *)base;
-   inst->regs->ctrl = 0x00110110;  /* POR state */
+   inst->regs->ctrl = ARK_PKT_DIR_INIT_VAL; /* POR state */
return inst;
 }
 
diff --git a/drivers/net/ark/ark_pktdir.h b/drivers/net/ark/ark_pktdir.h
index 4afd128f9..b5577cebb 100644
--- a/drivers/net/ark/ark_pktdir.h
+++ b/drivers/net/ark/ark_pktdir.h
@@ -7,7 +7,7 @@
 
 #include 
 
-#define ARK_PKTDIR_BASE_ADR  0xa
+#define ARK_PKT_DIR_INIT_VAL 0x0110
 
 typedef void *ark_pkt_dir_t;
 
-- 
2.17.1



[dpdk-dev] [PATCH v5 2/6] net/ark: refactor Rx buffer recovery

2021-03-18 Thread Ed Czeck
Allocate mbufs for Rx path in bulk of at least 64 buffers
to improve performance. Allow recovery even without
a Rx operation to support lack of buffers in pool.
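
The refill condition used in the diff below can be sketched in isolation.
This is an illustration only: it assumes cons_index and seed_index are
free-running 32-bit counters (so unsigned wrap-around is harmless), and the
demo_ names are hypothetical, not driver symbols.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_REFILL_THRESHOLD 64U

/*
 * seed_index - cons_index is the number of seeded buffers the hardware
 * can still use; the remaining slots of the ring are empty.
 * Unsigned arithmetic keeps this correct across 32-bit wrap-around.
 */
uint32_t demo_empty_slots(uint32_t cons_index, uint32_t seed_index,
			  uint32_t queue_size)
{
	return cons_index + queue_size - seed_index;
}

/* Refill only once at least a full batch of slots is empty. */
bool demo_should_refill(uint32_t cons_index, uint32_t seed_index,
			uint32_t queue_size)
{
	return demo_empty_slots(cons_index, seed_index, queue_size) >=
		DEMO_REFILL_THRESHOLD;
}
```

For example, with a 256-entry ring, cons_index 10 and seed_index 200 leave
66 empty slots, so a bulk allocation is attempted; with cons_index 100 and
seed_index 300 only 56 slots are empty and the refill is deferred.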

Fixes: be410a861598 ("net/ark: add recovery for lack of mbufs")
Cc: sta...@dpdk.org

Signed-off-by: Ed Czeck 
---
 drivers/net/ark/ark_ethdev_rx.c | 49 -
 1 file changed, 11 insertions(+), 38 deletions(-)

diff --git a/drivers/net/ark/ark_ethdev_rx.c b/drivers/net/ark/ark_ethdev_rx.c
index d29d3db78..8e55b851a 100644
--- a/drivers/net/ark/ark_ethdev_rx.c
+++ b/drivers/net/ark/ark_ethdev_rx.c
@@ -26,9 +26,6 @@ static uint32_t eth_ark_rx_jumbo(struct ark_rx_queue *queue,
 struct rte_mbuf *mbuf0,
 uint32_t cons_index);
 static inline int eth_ark_rx_seed_mbufs(struct ark_rx_queue *queue);
-static int eth_ark_rx_seed_recovery(struct ark_rx_queue *queue,
-   uint32_t *pnb,
-   struct rte_mbuf **mbufs);
 
 /* * */
 struct ark_rx_queue {
@@ -54,7 +51,7 @@ struct ark_rx_queue {
/* The queue Index is used within the dpdk device structures */
uint16_t queue_index;
 
-   uint32_t last_cons;
+   uint32_t unused;
 
/* separate cache line */
/* second cache line - fields only used in slow path */
@@ -105,9 +102,8 @@ static inline void
 eth_ark_rx_update_cons_index(struct ark_rx_queue *queue, uint32_t cons_index)
 {
queue->cons_index = cons_index;
-   eth_ark_rx_seed_mbufs(queue);
-   if (((cons_index - queue->last_cons) >= 64U)) {
-   queue->last_cons = cons_index;
+   if ((cons_index + queue->queue_size - queue->seed_index) >= 64U) {
+   eth_ark_rx_seed_mbufs(queue);
ark_mpu_set_producer(queue->mpu, queue->seed_index);
}
 }
@@ -321,9 +317,7 @@ eth_ark_recv_pkts(void *rx_queue,
break;
}
 
-   if (unlikely(nb != 0))
-   /* report next free to FPGA */
-   eth_ark_rx_update_cons_index(queue, cons_index);
+   eth_ark_rx_update_cons_index(queue, cons_index);
 
return nb;
 }
@@ -458,11 +452,13 @@ eth_ark_rx_seed_mbufs(struct ark_rx_queue *queue)
int status = rte_pktmbuf_alloc_bulk(queue->mb_pool, mbufs, nb);
 
if (unlikely(status != 0)) {
-   /* Try to recover from lack of mbufs in pool */
-   status = eth_ark_rx_seed_recovery(queue, &nb, mbufs);
-   if (unlikely(status != 0)) {
-   return -1;
-   }
+   ARK_PMD_LOG(NOTICE,
+   "Could not allocate %u mbufs from pool"
+   " for RX queue %u;"
+   " %u free buffers remaining in queue\n",
+   nb, queue->queue_index,
+   queue->seed_index - queue->cons_index);
+   return -1;
}
 
if (ARK_DEBUG_CORE) {   /* DEBUG */
@@ -511,29 +507,6 @@ eth_ark_rx_seed_mbufs(struct ark_rx_queue *queue)
return 0;
 }
 
-int
-eth_ark_rx_seed_recovery(struct ark_rx_queue *queue,
-uint32_t *pnb,
-struct rte_mbuf **mbufs)
-{
-   int status = -1;
-
-   /* Ignore small allocation failures */
-   if (*pnb <= 64)
-   return -1;
-
-   *pnb = 64U;
-   status = rte_pktmbuf_alloc_bulk(queue->mb_pool, mbufs, *pnb);
-   if (status != 0) {
-   ARK_PMD_LOG(NOTICE,
-   "ARK: Could not allocate %u mbufs from pool for RX queue %u;"
-   " %u free buffers remaining in queue\n",
-   *pnb, queue->queue_index,
-   queue->seed_index - queue->cons_index);
-   }
-   return status;
-}
-
 void
 eth_ark_rx_dump_queue(struct rte_eth_dev *dev, uint16_t queue_id,
  const char *msg)
-- 
2.17.1



[dpdk-dev] [PATCH v5 3/6] net/ark: update internal structs to reflect FPGA updates

2021-03-18 Thread Ed Czeck
- New PCIe IDs using net/ark driver
- Update Version IDs and structures specified by hardware
- New internal descriptor status for TX
- Adjust data placement in RX operations; headroom is retained
  for segmented mbufs
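
For readers unfamiliar with the pattern, the 8-byte Tx meta layout checked
by static_assert in the diff below can be tried standalone. This sketch
mirrors the field names from the patch but uses a hypothetical demo_ union
name and omits the packed attribute, relying on natural alignment (an
assumption that holds for these field sizes on common ABIs).

```c
#include <assert.h>
#include <stdint.h>

/* Three overlapping views of the same 8 bytes (C11 anonymous structs). */
union demo_tx_meta {
	uint64_t physaddr;
	struct {
		uint32_t usermeta0;
		uint32_t usermeta1;
	};
	struct {
		uint16_t data_len;	/* length of this buffer */
		uint8_t  flags;
		uint8_t  meta_cnt;
		uint32_t user1;
	};
};

/* Fail the build if the layout no longer matches the hardware size. */
static_assert(sizeof(union demo_tx_meta) == 8,
	      "Unexpected size for demo_tx_meta");

/* Fill the descriptor view; user1 shares storage with usermeta1. */
void demo_fill_desc(union demo_tx_meta *m, uint16_t len, uint8_t flags,
		    uint32_t user)
{
	m->data_len = len;
	m->flags = flags;
	m->meta_cnt = 0;
	m->user1 = user;
}
```

The compile-time check is the useful part: a later field change that grows
the union is caught at build time rather than as a descriptor mismatch on
the hardware.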

Signed-off-by: Ed Czeck 
---
 doc/guides/nics/ark.rst |   5 ++
 drivers/net/ark/ark_ddm.c   |  18 +++--
 drivers/net/ark/ark_ddm.h   |  22 +++---
 drivers/net/ark/ark_ethdev.c|   5 ++
 drivers/net/ark/ark_ethdev_rx.c |   6 +-
 drivers/net/ark/ark_ethdev_tx.c | 122 +++-
 drivers/net/ark/ark_udm.c   |   2 +
 drivers/net/ark/ark_udm.h   |  13 ++--
 8 files changed, 123 insertions(+), 70 deletions(-)

diff --git a/doc/guides/nics/ark.rst b/doc/guides/nics/ark.rst
index 18434c7a4..dcccfee26 100644
--- a/doc/guides/nics/ark.rst
+++ b/doc/guides/nics/ark.rst
@@ -155,6 +155,11 @@ ARK PMD supports the following Arkville RTL PCIe instances including:
 
 * ``1d6c:100d`` - AR-ARKA-FX0 [Arkville 32B DPDK Data Mover]
 * ``1d6c:100e`` - AR-ARKA-FX1 [Arkville 64B DPDK Data Mover]
+* ``1d6c:100f`` - AR-ARKA-FX1 [Arkville 64B DPDK Data Mover for Versal]
+* ``1d6c:1010`` - AR-ARKA-FX1 [Arkville 64B DPDK Data Mover for Agilex]
+* ``1d6c:1017`` - AR-ARK-FX1 [Arkville 64B Multi-Homed Primary Endpoint]
+* ``1d6c:1018`` - AR-ARK-FX1 [Arkville 64B Multi-Homed Secondary Endpoint]
+* ``1d6c:1019`` - AR-ARK-FX1 [Arkville 64B Multi-Homed Tertiary Endpoint]
 
 Supported Operating Systems
 ---
diff --git a/drivers/net/ark/ark_ddm.c b/drivers/net/ark/ark_ddm.c
index 91d1179d8..232137157 100644
--- a/drivers/net/ark/ark_ddm.c
+++ b/drivers/net/ark/ark_ddm.c
@@ -7,6 +7,8 @@
 #include "ark_logs.h"
 #include "ark_ddm.h"
 
+static_assert(sizeof(union ark_tx_meta) == 8, "Unexpected struct size ark_tx_meta");
+
 /* * */
 int
 ark_ddm_verify(struct ark_ddm_t *ddm)
@@ -19,18 +21,26 @@ ark_ddm_verify(struct ark_ddm_t *ddm)
}
 
hw_const = ddm->cfg.const0;
+   if (hw_const == ARK_DDM_CONST3)
+   return 0;
+
if (hw_const == ARK_DDM_CONST1) {
ARK_PMD_LOG(ERR,
"ARK: DDM module is version 1, "
"PMD expects version 2\n");
return -1;
-   } else if (hw_const != ARK_DDM_CONST2) {
+   }
+
+   if (hw_const == ARK_DDM_CONST2) {
ARK_PMD_LOG(ERR,
-   "ARK: DDM module not found as expected 0x%08x\n",
-   ddm->cfg.const0);
+   "ARK: DDM module is version 2, "
+   "PMD expects version 3\n");
return -1;
}
-   return 0;
+   ARK_PMD_LOG(ERR,
+   "ARK: DDM module not found as expected 0x%08x\n",
+   ddm->cfg.const0);
+   return -1;
 }
 
 void
diff --git a/drivers/net/ark/ark_ddm.h b/drivers/net/ark/ark_ddm.h
index 5456b4b5c..687ff2519 100644
--- a/drivers/net/ark/ark_ddm.h
+++ b/drivers/net/ark/ark_ddm.h
@@ -16,17 +16,22 @@
  * there is minimal documentation.
  */
 
-/* struct defining Tx meta data --  fixed in FPGA -- 16 bytes */
-struct ark_tx_meta {
+/* struct defining Tx meta data --  fixed in FPGA -- 8 bytes */
+union ark_tx_meta {
uint64_t physaddr;
-   uint32_t user1;
-   uint16_t data_len;  /* of this MBUF */
+   struct {
+   uint32_t usermeta0;
+   uint32_t usermeta1;
+   };
+   struct {
+   uint16_t data_len;  /* of this MBUF */
 #define   ARK_DDM_EOP   0x01
 #define   ARK_DDM_SOP   0x02
-   uint8_t flags;  /* bit 0 indicates last mbuf in chain. */
-   uint8_t reserved[1];
-};
-
+   uint8_t  flags;
+   uint8_t  meta_cnt;
+   uint32_t user1;
+   };
+} __rte_packed;
 
 /*
  * DDM core hardware structures
@@ -35,6 +40,7 @@ struct ark_tx_meta {
  */
 #define ARK_DDM_CFG 0x
 /* Set unique HW ID for hardware version */
+#define ARK_DDM_CONST3 (0x334d)
 #define ARK_DDM_CONST2 (0x324d)
 #define ARK_DDM_CONST1 (0xfacecafe)
 
diff --git a/drivers/net/ark/ark_ethdev.c b/drivers/net/ark/ark_ethdev.c
index 477e1de02..95546a891 100644
--- a/drivers/net/ark/ark_ethdev.c
+++ b/drivers/net/ark/ark_ethdev.c
@@ -95,6 +95,11 @@ static const char * const valid_arguments[] = {
 static const struct rte_pci_id pci_id_ark_map[] = {
{RTE_PCI_DEVICE(0x1d6c, 0x100d)},
{RTE_PCI_DEVICE(0x1d6c, 0x100e)},
+   {RTE_PCI_DEVICE(0x1d6c, 0x100f)},
+   {RTE_PCI_DEVICE(0x1d6c, 0x1010)},
+   {RTE_PCI_DEVICE(0x1d6c, 0x1017)},
+   {RTE_PCI_DEVICE(0x1d6c, 0x1018)},
+   {RTE_PCI_DEVICE(0x1d6c, 0x1019)},
{.vendor_id = 0, /* sentinel */ },
 };
 
diff --git a/drivers/net/ark/ark_ethdev_rx.c b/drivers/net/ark/ark_ethdev_rx.c
index 8e55b851a..21a9af41a 100644
--- a/drivers/net/ark/ark_ethdev_rx.c
+++ b/drivers/net/ark/ark_ethdev_rx.c
@@ -60

[dpdk-dev] [PATCH v5 4/6] net/ark: cleanup ark dynamic extension interface

2021-03-18 Thread Ed Czeck
- Rename extension functions with rte_pmd_ark prefix
- Update local function documentation

Signed-off-by: Ed Czeck 
---
v3:
- split function rename from previous commit
v4:
- reorder patches renaming before adding
v5:
- Keep the extension function changes in ark_ext.h
---
 drivers/net/ark/ark_ethdev.c |  32 ++--
 drivers/net/ark/ark_ext.h| 304 ++-
 2 files changed, 247 insertions(+), 89 deletions(-)

diff --git a/drivers/net/ark/ark_ethdev.c b/drivers/net/ark/ark_ethdev.c
index 95546a891..5282534d3 100644
--- a/drivers/net/ark/ark_ethdev.c
+++ b/drivers/net/ark/ark_ethdev.c
@@ -193,58 +193,58 @@ check_for_ext(struct ark_adapter *ark)
/* Get the entry points */
ark->user_ext.dev_init =
(void *(*)(struct rte_eth_dev *, void *, int))
-   dlsym(ark->d_handle, "dev_init");
+   dlsym(ark->d_handle, "rte_pmd_ark_dev_init");
ARK_PMD_LOG(DEBUG, "device ext init pointer = %p\n",
  ark->user_ext.dev_init);
ark->user_ext.dev_get_port_count =
(int (*)(struct rte_eth_dev *, void *))
-   dlsym(ark->d_handle, "dev_get_port_count");
+   dlsym(ark->d_handle, "rte_pmd_ark_dev_get_port_count");
ark->user_ext.dev_uninit =
(void (*)(struct rte_eth_dev *, void *))
-   dlsym(ark->d_handle, "dev_uninit");
+   dlsym(ark->d_handle, "rte_pmd_ark_dev_uninit");
ark->user_ext.dev_configure =
(int (*)(struct rte_eth_dev *, void *))
-   dlsym(ark->d_handle, "dev_configure");
+   dlsym(ark->d_handle, "rte_pmd_ark_dev_configure");
ark->user_ext.dev_start =
(int (*)(struct rte_eth_dev *, void *))
-   dlsym(ark->d_handle, "dev_start");
+   dlsym(ark->d_handle, "rte_pmd_ark_dev_start");
ark->user_ext.dev_stop =
(void (*)(struct rte_eth_dev *, void *))
-   dlsym(ark->d_handle, "dev_stop");
+   dlsym(ark->d_handle, "rte_pmd_ark_dev_stop");
ark->user_ext.dev_close =
(void (*)(struct rte_eth_dev *, void *))
-   dlsym(ark->d_handle, "dev_close");
+   dlsym(ark->d_handle, "rte_pmd_ark_dev_close");
ark->user_ext.link_update =
(int (*)(struct rte_eth_dev *, int, void *))
-   dlsym(ark->d_handle, "link_update");
+   dlsym(ark->d_handle, "rte_pmd_ark_link_update");
ark->user_ext.dev_set_link_up =
(int (*)(struct rte_eth_dev *, void *))
-   dlsym(ark->d_handle, "dev_set_link_up");
+   dlsym(ark->d_handle, "rte_pmd_ark_dev_set_link_up");
ark->user_ext.dev_set_link_down =
(int (*)(struct rte_eth_dev *, void *))
-   dlsym(ark->d_handle, "dev_set_link_down");
+   dlsym(ark->d_handle, "rte_pmd_ark_dev_set_link_down");
ark->user_ext.stats_get =
(int (*)(struct rte_eth_dev *, struct rte_eth_stats *,
  void *))
-   dlsym(ark->d_handle, "stats_get");
+   dlsym(ark->d_handle, "rte_pmd_ark_stats_get");
ark->user_ext.stats_reset =
(void (*)(struct rte_eth_dev *, void *))
-   dlsym(ark->d_handle, "stats_reset");
+   dlsym(ark->d_handle, "rte_pmd_ark_stats_reset");
ark->user_ext.mac_addr_add =
(void (*)(struct rte_eth_dev *, struct rte_ether_addr *,
uint32_t, uint32_t, void *))
-   dlsym(ark->d_handle, "mac_addr_add");
+   dlsym(ark->d_handle, "rte_pmd_ark_mac_addr_add");
ark->user_ext.mac_addr_remove =
(void (*)(struct rte_eth_dev *, uint32_t, void *))
-   dlsym(ark->d_handle, "mac_addr_remove");
+   dlsym(ark->d_handle, "rte_pmd_ark_mac_addr_remove");
ark->user_ext.mac_addr_set =
(void (*)(struct rte_eth_dev *, struct rte_ether_addr *,
  void *))
-   dlsym(ark->d_handle, "mac_addr_set");
+   dlsym(ark->d_handle, "rte_pmd_ark_mac_addr_set");
ark->user_ext.set_mtu =
(int (*)(struct rte_eth_dev *, uint16_t,
  void *))
-   dlsym(ark->d_handle, "set_mtu");
+   dlsym(ark->d_handle, "rte_pmd_ark_set_mtu");
 
return found;
 }
diff --git a/drivers/net/ark/ark_ext.h b/drivers/net/ark/ark_ext.h
index 821fb55bb..9c7d55a14 100644
--- a/drivers/net/ark/ark_ext.h
+++ b/drivers/net/ark/ark_ext.h
@@ -7,84 +7,242 @@
 
 #include 
 
-/*
- * This is the template file for users who which to define a dynamic
- * extension to the Arkville PMD.   User's who create an extension
- * should include this file and define the necessary and desired
- * functions.
- * Only 1 function is required for an extension, dev_init(); all other
- * functions prototyped in this file

[dpdk-dev] [PATCH v5 5/6] net/ark: generalize meta data between FPGA and PMD

2021-03-18 Thread Ed Czeck
In this commit we generalize the movement of user-specified
meta data between mbufs and FPGA AXIS tuser fields using
user-defined hook functions.

- Previous use of PMD dynfields are removed
- Remove emptied rte_pmd_ark.h
- Hook function added to ark_user_ext
- Add hook function calls in Rx and Tx paths
- Update guide with example of hook function use
- Add release notes

Signed-off-by: Ed Czeck 
---
v3:
- split function rename to separate commit
v4:
- reorder patches renaming before adding
v5:
- remove rte_pmd_ark.h
---
 doc/api/doxy-api-index.md  |   1 -
 doc/guides/nics/ark.rst| 139 -
 doc/guides/rel_notes/release_21_05.rst |  11 ++
 drivers/net/ark/ark_ethdev.c   |  59 ++-
 drivers/net/ark/ark_ethdev_rx.c|  28 +++--
 drivers/net/ark/ark_ethdev_rx.h|   3 -
 drivers/net/ark/ark_ethdev_tx.c|  24 +++--
 drivers/net/ark/ark_ext.h  |  38 +++
 drivers/net/ark/ark_global.h   |  20 
 drivers/net/ark/ark_udm.h  |   5 +-
 drivers/net/ark/meson.build|   2 -
 drivers/net/ark/rte_pmd_ark.h  | 125 --
 drivers/net/ark/version.map|   7 --
 13 files changed, 248 insertions(+), 214 deletions(-)
 delete mode 100644 drivers/net/ark/rte_pmd_ark.h

diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index 748514e24..66572a75f 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -41,7 +41,6 @@ The public API headers are grouped by topics:
   [vhost]  (@ref rte_vhost.h),
   [vdpa]   (@ref rte_vdpa.h),
   [KNI](@ref rte_kni.h),
-  [ark](@ref rte_pmd_ark.h),
   [ixgbe]  (@ref rte_pmd_ixgbe.h),
   [i40e]   (@ref rte_pmd_i40e.h),
   [ice](@ref rte_pmd_ice.h),
diff --git a/doc/guides/nics/ark.rst b/doc/guides/nics/ark.rst
index dcccfee26..da61814b5 100644
--- a/doc/guides/nics/ark.rst
+++ b/doc/guides/nics/ark.rst
@@ -1,5 +1,5 @@
 .. SPDX-License-Identifier: BSD-3-Clause
-Copyright (c) 2015-2017 Atomic Rules LLC
+Copyright (c) 2015-2021 Atomic Rules LLC
 All rights reserved.
 
 ARK Poll Mode Driver
@@ -130,6 +130,141 @@ Configuration Information
  be offloaded or remain in host software.
 
 
+Dynamic PMD Extension
+-
+
+Dynamic PMD extensions allow users to customize net/ark functionality
+using their own code. Arkville RTL and this PMD support high-throughput data
+movement, and these extensions allow PMD support for users' FPGA
+features.
+Dynamic PMD extensions operate by having users supply a shared object
+file which is loaded by Arkville PMD during initialization.  The
+object file contains extension (or hook) functions that are registered
+and then called during PMD operations.
+
+The allowable set of extension functions are defined and documented in
+``ark_ext.h``, only the initialization function,
+``rte_pmd_ark_dev_init()``, is required; all others are optional. The
+following sections give a small extension example along with
+instructions for compiling and using the extension.
+
+
+Extension Example
+^
+
+The following example shows an extension which populates mbuf fields
+during RX from user meta data coming from FPGA hardware.
+
+.. code-block:: c
+
+   #include 
+   #include 
+   #include 
+   #include 
+
+   /* Global structure passed to extension/hook functions */
+   struct ark_user_extension {
+   int timestamp_dynfield_offset;
+   };
+
+   /* RX tuser field based on user's hardware */
+   struct user_rx_meta {
+  uint64_t timestamp;
+  uint32_t rss;
+   } __rte_packed;
+
+   /* Create ark_user_extension object for use in other hook functions */
+   void *rte_pmd_ark_dev_init(struct rte_eth_dev * dev,
+  void * abar, int port_id )
+   {
+  RTE_SET_USED(dev);
+  RTE_SET_USED(abar);
+  fprintf(stderr, "Called Arkville user extension for port %u\n",
+  port_id);
+
+  struct ark_user_extension *xdata = rte_zmalloc("macExtS",
+ sizeof(struct ark_user_extension), 64);
+  if (!xdata)
+ return NULL;
+
+  /* register dynfield for rx timestamp */
+  rte_mbuf_dyn_rx_timestamp_register(&xdata->timestamp_dynfield_offset,
+ NULL);
+
+  fprintf(stderr, "timestamp fields offset in extension is %d\n",
+  xdata->timestamp_dynfield_offset);
+  return xdata;
+   }
+
+   /* uninitialization */
+   void rte_pmd_ark_dev_uninit(struct rte_eth_dev * dev, void *user_data)
+   {
+  rte_free(user_data);
+   }
+
+   /* Hook function -- called for each RX packet
+* Extract RX timestamp and RSS from meta and place in mbuf
+*/
+   void rte_pmd_ark_rx_user_meta_hook(struct rte_mbuf *mbuf,
+  const uint32_t *meta,
+  void *user_data)
+   {
+

[dpdk-dev] [PATCH v5 6/6] net/ark: localize internal packet generator code

2021-03-18 Thread Ed Czeck
remove unnecessary includes
no functional changes

Signed-off-by: Ed Czeck 
---
 drivers/net/ark/ark_ethdev.c  | 17 ++---
 drivers/net/ark/ark_pktchkr.c |  4 
 drivers/net/ark/ark_pktgen.c  | 20 ++--
 drivers/net/ark/ark_pktgen.h  |  1 +
 4 files changed, 17 insertions(+), 25 deletions(-)

diff --git a/drivers/net/ark/ark_ethdev.c b/drivers/net/ark/ark_ethdev.c
index dea779a98..9dea5facb 100644
--- a/drivers/net/ark/ark_ethdev.c
+++ b/drivers/net/ark/ark_ethdev.c
@@ -534,20 +534,6 @@ eth_ark_dev_configure(struct rte_eth_dev *dev)
return 0;
 }
 
-static void *
-delay_pg_start(void *arg)
-{
-   struct ark_adapter *ark = (struct ark_adapter *)arg;
-
-   /* This function is used exclusively for regression testing, We
-* perform a blind sleep here to ensure that the external test
-* application has time to setup the test before we generate packets
-*/
-   usleep(10);
-   ark_pktgen_run(ark->pg);
-   return NULL;
-}
-
 static int
 eth_ark_dev_start(struct rte_eth_dev *dev)
 {
@@ -582,7 +568,8 @@ eth_ark_dev_start(struct rte_eth_dev *dev)
/* Delay packet generatpr start allow the hardware to be ready
 * This is only used for sanity checking with internal generator
 */
-   if (pthread_create(&thread, NULL, delay_pg_start, ark)) {
+   if (pthread_create(&thread, NULL,
+  ark_pktgen_delay_start, ark->pg)) {
ARK_PMD_LOG(ERR, "Could not create pktgen "
"starter thread\n");
return -1;
diff --git a/drivers/net/ark/ark_pktchkr.c b/drivers/net/ark/ark_pktchkr.c
index 0f2d31e5b..84bb567a4 100644
--- a/drivers/net/ark/ark_pktchkr.c
+++ b/drivers/net/ark/ark_pktchkr.c
@@ -2,13 +2,9 @@
  * Copyright (c) 2015-2018 Atomic Rules LLC
  */
 
-#include 
-#include 
-#include 
 #include 
 
 #include 
-#include 
 #include 
 
 #include "ark_pktchkr.h"
diff --git a/drivers/net/ark/ark_pktgen.c b/drivers/net/ark/ark_pktgen.c
index ac4322a35..28a44f754 100644
--- a/drivers/net/ark/ark_pktgen.c
+++ b/drivers/net/ark/ark_pktgen.c
@@ -2,15 +2,9 @@
  * Copyright (c) 2015-2018 Atomic Rules LLC
  */
 
-#include 
-#include 
-#include 
 #include 
 
 #include 
-#include 
-
-#include 
 #include 
 
 #include "ark_pktgen.h"
@@ -470,3 +464,17 @@ ark_pktgen_setup(ark_pkt_gen_t handle)
ark_pktgen_run(handle);
}
 }
+
+void *
+ark_pktgen_delay_start(void *arg)
+{
+   struct ark_pkt_gen_inst *inst = (struct ark_pkt_gen_inst *)arg;
+
+   /* This function is used exclusively for regression testing, We
+* perform a blind sleep here to ensure that the external test
+* application has time to setup the test before we generate packets
+*/
+   usleep(10);
+   ark_pktgen_run(inst);
+   return NULL;
+}
diff --git a/drivers/net/ark/ark_pktgen.h b/drivers/net/ark/ark_pktgen.h
index c61dfee6d..7147fe1bd 100644
--- a/drivers/net/ark/ark_pktgen.h
+++ b/drivers/net/ark/ark_pktgen.h
@@ -75,5 +75,6 @@ void ark_pktgen_set_hdr_dW(ark_pkt_gen_t handle, uint32_t *hdr);
 void ark_pktgen_set_start_offset(ark_pkt_gen_t handle, uint32_t x);
 void ark_pktgen_parse(char *argv);
 void ark_pktgen_setup(ark_pkt_gen_t handle);
+void *ark_pktgen_delay_start(void *arg);
 
 #endif
-- 
2.17.1



Re: [dpdk-dev] [RFC 0/4] SocketPair Broker support for vhost and virtio-user.

2021-03-18 Thread Stefan Hajnoczi
On Wed, Mar 17, 2021 at 09:25:26PM +0100, Ilya Maximets wrote:
Hi,
Some questions to understand the problems that SocketPair Broker solves:

> Even more configuration tricks required in order to share some sockets
> between different containers and not only with the host, e.g. to
> create service chains.

How does SocketPair Broker solve this? I guess the idea is that
SocketPair Broker must be started before other containers. That way
applications don't need to sleep and reconnect when a socket isn't
available yet.

On the other hand, the SocketPair Broker might be unavailable (OOM
killer, crash, etc), so applications still need to sleep and reconnect
to the broker itself. I'm not sure the problem has actually been solved
unless there is a reason why the broker is always guaranteed to be
available?

> And some housekeeping usually required for applications in case the
> socket server terminated abnormally and socket files left on a file
> system:
>  "failed to bind to vhu: Address already in use; remove it and try again"

QEMU avoids this by unlinking before binding. The drawback is that users
might accidentally hijack an existing listen socket, but that can be
solved with a pidfile.

> Additionally, all applications (system and user's!) should follow
> naming conventions and place socket files in particular location on a
> file system to make things work.

Does SocketPair Broker solve this? Applications now need to use a naming
convention for keys, so it seems like this issue has not been
eliminated.

> This patch-set aims to eliminate most of the inconveniences by
> leveraging an infrastructure service provided by a SocketPair Broker.

I don't understand yet why this is useful for vhost-user, where the
creation of the vhost-user device backend and its use by a VMM are
closely managed by one piece of software:

1. Unlink the socket path.
2. Create, bind, and listen on the socket path.
3. Instantiate the vhost-user device backend (e.g. talk to DPDK/SPDK
   RPC, spawn a process, etc) and pass in the listen fd.
4. In the meantime the VMM can open the socket path and call connect(2).
   As soon as the vhost-user device backend calls accept(2) the
   connection will proceed (there is no need for sleeping).

This approach works across containers without a broker.
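
Steps 1, 2 and 4 above can be sketched as follows. This is a minimal
illustration with hypothetical demo_ helper names; error handling is
reduced to returning -1, and passing the listen fd to a backend (step 3)
is left out.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Fill a sockaddr_un for 'path'; returns -1 if the path is too long. */
static int demo_fill_addr(struct sockaddr_un *addr, const char *path)
{
	if (strlen(path) >= sizeof(addr->sun_path))
		return -1;
	memset(addr, 0, sizeof(*addr));
	addr->sun_family = AF_UNIX;
	strcpy(addr->sun_path, path);
	return 0;
}

/* Steps 1 and 2: unlink any stale socket file, then bind and listen. */
int demo_listen_unix(const char *path)
{
	struct sockaddr_un addr;
	int fd;

	if (demo_fill_addr(&addr, path) < 0)
		return -1;
	unlink(path);	/* a missing file is not an error here */
	fd = socket(AF_UNIX, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;
	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
	    listen(fd, 1) < 0) {
		close(fd);
		return -1;
	}
	return fd;	/* this fd would be handed to the device backend */
}

/* Step 4: the client side connects as soon as the path exists. */
int demo_connect_unix(const char *path)
{
	struct sockaddr_un addr;
	int fd;

	if (demo_fill_addr(&addr, path) < 0)
		return -1;
	fd = socket(AF_UNIX, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;
	if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

Because the listening socket exists before the backend starts, the client
never sees "connection refused" and no retry loop is needed. The unlink
also covers the stale-file case after an abnormal exit.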

BTW what is the security model of the broker? Unlike pathname UNIX
domain sockets there is no ownership permission check.

Stefan


[dpdk-dev] [PATCH] [RFC] eventdev: introduce crypto adapter enqueue API

2021-03-18 Thread Akhil Goyal
If an event from a previous stage must be forwarded to a crypto
adapter, and the PMD supports an internal event port in the crypto
adapter (exposed via the capability
RTE_EVENT_CRYPTO_ADAPTER_CAP_INTERNAL_PORT_OP_FWD), the API
rte_event_enqueue_burst() has no way to tell whether the event is
meant for the crypto adapter or for the eth Tx adapter.

Hence we need a new API similar to
rte_event_eth_tx_adapter_enqueue_burst(), which can send to a
crypto adapter.

Note that RTE_EVENT_TYPE_* cannot be used to make that decision,
as it describes the event source, not the event destination.
Also, the event port designated for the crypto adapter is designed
to be used in OP_NEW mode.

Hence, in order to support an event PMD which has an internal
event port in crypto adapter (RTE_EVENT_CRYPTO_ADAPTER_OP_FORWARD mode),
exposed via capability RTE_EVENT_CRYPTO_ADAPTER_CAP_INTERNAL_PORT_OP_FWD,
application should use rte_event_crypto_adapter_enqueue_burst() API
to enqueue events.

When internal port is not available(RTE_EVENT_CRYPTO_ADAPTER_OP_NEW mode),
application can use API rte_event_enqueue_burst() as it was doing earlier,
i.e. retrieve event port used by crypto adapter and bind its event queues
to that port and enqueue events using the API rte_event_enqueue_burst().
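
The selection described above could look roughly like this on the
application side. This is only a sketch: CAP_INTERNAL_PORT_OP_FWD is a
stand-in for RTE_EVENT_CRYPTO_ADAPTER_CAP_INTERNAL_PORT_OP_FWD with an
illustrative bit value, and choose_enqueue_path() is a hypothetical helper,
not part of eventdev.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for RTE_EVENT_CRYPTO_ADAPTER_CAP_INTERNAL_PORT_OP_FWD.
 * The bit value here is illustrative only.
 */
#define CAP_INTERNAL_PORT_OP_FWD (1u << 1)

enum enqueue_path {
	/* use rte_event_crypto_adapter_enqueue_burst() */
	PATH_CRYPTO_ADAPTER_ENQUEUE,
	/* link queues to the adapter event port, then use
	 * rte_event_enqueue_burst() as before
	 */
	PATH_EVENT_ENQUEUE,
};

/* Pick the enqueue API from the adapter capability flags. */
static enum enqueue_path
choose_enqueue_path(uint32_t adapter_caps)
{
	if (adapter_caps & CAP_INTERNAL_PORT_OP_FWD)
		return PATH_CRYPTO_ADAPTER_ENQUEUE;
	return PATH_EVENT_ENQUEUE;
}
```

The decision is made once from the capability flags, so the fast path does
not need to inspect each event to find out where it is going.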

TODO:
- test application changes for the usage of new API
- support in octeontx2 event PMD

Signed-off-by: Shijith Thotton 
Signed-off-by: Akhil Goyal 
---
 .../prog_guide/event_crypto_adapter.rst   | 69 ---
 lib/librte_eventdev/eventdev_trace_points.c   |  3 +
 .../rte_event_crypto_adapter.h| 66 ++
 lib/librte_eventdev/rte_eventdev.c| 10 +++
 lib/librte_eventdev/rte_eventdev.h|  8 ++-
 lib/librte_eventdev/rte_eventdev_trace_fp.h   | 10 +++
 lib/librte_eventdev/version.map   |  3 +
 7 files changed, 142 insertions(+), 27 deletions(-)

diff --git a/doc/guides/prog_guide/event_crypto_adapter.rst 
b/doc/guides/prog_guide/event_crypto_adapter.rst
index 1e3eb7139..4650ed945 100644
--- a/doc/guides/prog_guide/event_crypto_adapter.rst
+++ b/doc/guides/prog_guide/event_crypto_adapter.rst
@@ -55,21 +55,22 @@ which is needed to enqueue an event after the crypto 
operation is completed.
 RTE_EVENT_CRYPTO_ADAPTER_OP_FORWARD mode
 
 
-In the RTE_EVENT_CRYPTO_ADAPTER_OP_FORWARD mode, if HW supports
-RTE_EVENT_CRYPTO_ADAPTER_CAP_INTERNAL_PORT_OP_FWD capability the application
-can directly submit the crypto operations to the cryptodev.
-If not, application retrieves crypto adapter's event port using
-rte_event_crypto_adapter_event_port_get() API. Then, links its event
-queue to this port and starts enqueuing crypto operations as events
-to the eventdev. The adapter then dequeues the events and submits the
-crypto operations to the cryptodev. After the crypto completions, the
-adapter enqueues events to the event device.
-Application can use this mode, when ingress packet ordering is needed.
-In this mode, events dequeued from the adapter will be treated as
-forwarded events. The application needs to specify the cryptodev ID
-and queue pair ID (request information) needed to enqueue a crypto
-operation in addition to the event information (response information)
-needed to enqueue an event after the crypto operation has completed.
+In the ``RTE_EVENT_CRYPTO_ADAPTER_OP_FORWARD`` mode, if the event PMD and 
crypto
+PMD supports internal event port
+(``RTE_EVENT_CRYPTO_ADAPTER_CAP_INTERNAL_PORT_OP_FWD``), the application should
+use ``rte_event_crypto_adapter_enqueue_burst()`` API to enqueue crypto
+operations as events to crypto adapter. If not, application retrieves crypto
+adapter's event port using ``rte_event_crypto_adapter_event_port_get()`` API,
+links its event queue to this port and starts enqueuing crypto operations as
+events to eventdev using ``rte_event_enqueue_burst()``. The adapter then
+dequeues the events and submits the crypto operations to the cryptodev. After
+the crypto operation is complete, the adapter enqueues events to the event
+device. The application can use this mode when ingress packet ordering is
+needed. In this mode, events dequeued from the adapter will be treated as
+forwarded events. The application needs to specify the cryptodev ID and queue
+pair ID (request information) needed to enqueue a crypto operation in addition
+to the event information (response information) needed to enqueue an event 
after
+the crypto operation has completed.
 
 .. _figure_event_crypto_adapter_op_forward:
 
@@ -120,28 +121,44 @@ service function and needs to create an event port for 
it. The callback is
 expected to fill the ``struct rte_event_crypto_adapter_conf`` structure
 passed to it.
 
-For RTE_EVENT_CRYPTO_ADAPTER_OP_FORWARD mode, the event port created by adapter
-can be retrieved using ``rte_event_crypto_adapter_event_port_get()`` API.
-Application can use this event port to link with event queue on which it
-enqueues events towards the cry

[dpdk-dev] [PATCH v1 0/6] ioat driver updates

2021-03-18 Thread Bruce Richardson
This set contains a series of updates to the ioat driver, described in each of
the individual patches.

Comments would be especially appreciated for the last patch in this set, which
converts the existing idxd vdev driver to a bus driver so that probing and
scanning can be done automatically. This approach is based on suggestions made
in a previous discussion thread[1]

NOTE: Documentation updates are currently missing from this set, but will be
included in future revisions.

[1] 
http://inbox.dpdk.org/dev/20210311171913.gd1...@bricha3-mobl.ger.corp.intel.com/t/#mf3170e5aab50b43343b8cc34054f7bbbefd94379

Bruce Richardson (5):
  raw/ioat: support limiting queues for idxd PCI device
  raw/ioat: add component prefix to log messages
  raw/ioat: add explicit padding to descriptor struct
  raw/ioat: rework SW ring layout
  raw/ioat: add bus driver for device scanning automatically

Kevin Laatz (1):
  raw/ioat: add api to query remaining ring space

 drivers/raw/ioat/idxd_bus.c| 320 +
 drivers/raw/ioat/idxd_pci.c|  33 ++-
 drivers/raw/ioat/idxd_vdev.c   | 231 --
 drivers/raw/ioat/ioat_common.c |  99 
 drivers/raw/ioat/ioat_private.h|   2 +-
 drivers/raw/ioat/ioat_rawdev_test.c| 137 +++
 drivers/raw/ioat/meson.build   |   3 +-
 drivers/raw/ioat/rte_ioat_rawdev_fns.h | 280 +-
 8 files changed, 711 insertions(+), 394 deletions(-)
 create mode 100644 drivers/raw/ioat/idxd_bus.c
 delete mode 100644 drivers/raw/ioat/idxd_vdev.c

--
2.27.0



[dpdk-dev] [PATCH v1 1/6] raw/ioat: support limiting queues for idxd PCI device

2021-03-18 Thread Bruce Richardson
When using a full device instance via vfio, allow the user to specify a
maximum number of queues to configure rather than always using the max
number of supported queues.
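
A standalone sketch of the parsing and clamping logic, outside the driver.
The helper names parse_max_queues() and limit_queues() are made up for
illustration; as in the patch, 0 means "no limit".

```c
#include <assert.h>
#include <stdio.h>

/* Parse a devargs string of the form "max_queues=N".
 * An empty or NULL string leaves the limit at 0 (no limit).
 * Returns 0 on success, -1 if the string is not understood.
 */
static int
parse_max_queues(const char *args, unsigned int *max_queues)
{
	*max_queues = 0;
	if (args == NULL || args[0] == '\0')
		return 0;
	if (sscanf(args, "max_queues=%u", max_queues) != 1)
		return -1;
	return 0;
}

/* Clamp a hardware count (queues, engines or groups) to the limit. */
static unsigned int
limit_queues(unsigned int hw_count, unsigned int max_queues)
{
	if (max_queues != 0 && hw_count > max_queues)
		return max_queues;
	return hw_count;
}
```

For example, with "max_queues=4" a device exposing 8 work queues would be
configured with 4, while a device exposing 2 is left unchanged.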

Signed-off-by: Bruce Richardson 
---
 drivers/raw/ioat/idxd_pci.c | 28 ++--
 1 file changed, 26 insertions(+), 2 deletions(-)

diff --git a/drivers/raw/ioat/idxd_pci.c b/drivers/raw/ioat/idxd_pci.c
index 01623f33f6..b48e565b4c 100644
--- a/drivers/raw/ioat/idxd_pci.c
+++ b/drivers/raw/ioat/idxd_pci.c
@@ -4,6 +4,7 @@
 
 #include 
 #include 
+#include 
 
 #include "ioat_private.h"
 #include "ioat_spec.h"
@@ -123,7 +124,8 @@ static const struct rte_rawdev_ops idxd_pci_ops = {
 #define IDXD_PORTAL_SIZE (4096 * 4)
 
 static int
-init_pci_device(struct rte_pci_device *dev, struct idxd_rawdev *idxd)
+init_pci_device(struct rte_pci_device *dev, struct idxd_rawdev *idxd,
+   unsigned int max_queues)
 {
struct idxd_pci_common *pci;
uint8_t nb_groups, nb_engines, nb_wqs;
@@ -179,6 +181,16 @@ init_pci_device(struct rte_pci_device *dev, struct 
idxd_rawdev *idxd)
for (i = 0; i < nb_wqs; i++)
idxd_get_wq_cfg(pci, i)[0] = 0;
 
+   /* limit queues if necessary */
+   if (max_queues != 0 && nb_wqs > max_queues) {
+   nb_wqs = max_queues;
+   if (nb_engines > max_queues)
+   nb_engines = max_queues;
+   if (nb_groups > max_queues)
+   nb_groups = max_queues;
+   IOAT_PMD_DEBUG("Limiting queues to %u", nb_wqs);
+   }
+
/* put each engine into a separate group to avoid reordering */
if (nb_groups > nb_engines)
nb_groups = nb_engines;
@@ -242,12 +254,23 @@ idxd_rawdev_probe_pci(struct rte_pci_driver *drv, struct 
rte_pci_device *dev)
uint8_t nb_wqs;
int qid, ret = 0;
char name[PCI_PRI_STR_SIZE];
+   unsigned int max_queues = 0;
 
rte_pci_device_name(&dev->addr, name, sizeof(name));
IOAT_PMD_INFO("Init %s on NUMA node %d", name, dev->device.numa_node);
dev->device.driver = &drv->driver;
 
-   ret = init_pci_device(dev, &idxd);
+   if (dev->device.devargs && dev->device.devargs->args[0] != '\0') {
+   /* if the number of devargs grows beyond just 1, use rte_kvargs 
*/
+   if (sscanf(dev->device.devargs->args,
+   "max_queues=%u", &max_queues) != 1) {
+   IOAT_PMD_ERR("Invalid device parameter: '%s'",
+   dev->device.devargs->args);
+   return -1;
+   }
+   }
+
+   ret = init_pci_device(dev, &idxd, max_queues);
if (ret < 0) {
IOAT_PMD_ERR("Error initializing PCI hardware");
return ret;
@@ -353,3 +376,4 @@ RTE_PMD_REGISTER_PCI(IDXD_PMD_RAWDEV_NAME_PCI, 
idxd_pmd_drv_pci);
 RTE_PMD_REGISTER_PCI_TABLE(IDXD_PMD_RAWDEV_NAME_PCI, pci_id_idxd_map);
 RTE_PMD_REGISTER_KMOD_DEP(IDXD_PMD_RAWDEV_NAME_PCI,
  "* igb_uio | uio_pci_generic | vfio-pci");
+RTE_PMD_REGISTER_PARAM_STRING(rawdev_idxd_pci, "max_queues=0");
-- 
2.27.0



[dpdk-dev] [PATCH v1 2/6] raw/ioat: add component prefix to log messages

2021-03-18 Thread Bruce Richardson
Add the driver prefix "IOAT" to log messages for the driver.
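
To show the effect of the prefix without pulling in rte_log, here is a small
sketch that formats into a buffer instead. DEMO_PMD_LOG and demo_probe() are
hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Same shape as IOAT_PMD_LOG, but writes into a buffer so the result
 * can be inspected. "##__VA_ARGS__" is a GNU extension (gcc, clang).
 */
#define DEMO_PMD_LOG(buf, len, fmt, ...) \
	snprintf(buf, len, "IOAT: %s(): " fmt "\n", __func__, ##__VA_ARGS__)

/* Log one message from a named function so __func__ is predictable. */
static int
demo_probe(char *buf, size_t len, int numa_node)
{
	return DEMO_PMD_LOG(buf, len, "Init on NUMA node %d", numa_node);
}
```

Calling demo_probe() with node 1 yields a line starting with
"IOAT: demo_probe(): ", which makes driver messages easy to grep for.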

Signed-off-by: Bruce Richardson 
---
 drivers/raw/ioat/ioat_private.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/raw/ioat/ioat_private.h b/drivers/raw/ioat/ioat_private.h
index 6c423811ec..f032d5fe3d 100644
--- a/drivers/raw/ioat/ioat_private.h
+++ b/drivers/raw/ioat/ioat_private.h
@@ -21,7 +21,7 @@
 extern int ioat_pmd_logtype;
 
 #define IOAT_PMD_LOG(level, fmt, args...) rte_log(RTE_LOG_ ## level, \
-   ioat_pmd_logtype, "%s(): " fmt "\n", __func__, ##args)
+   ioat_pmd_logtype, "IOAT: %s(): " fmt "\n", __func__, ##args)
 
 #define IOAT_PMD_DEBUG(fmt, args...)  IOAT_PMD_LOG(DEBUG, fmt, ## args)
 #define IOAT_PMD_INFO(fmt, args...)   IOAT_PMD_LOG(INFO, fmt, ## args)
-- 
2.27.0



[dpdk-dev] [PATCH v1 3/6] raw/ioat: add explicit padding to descriptor struct

2021-03-18 Thread Bruce Richardson
Add an explicit padding field to the end of the descriptor structure so
that when the batch descriptor is defined on the stack for perform-ops, the
unused space is all zeroed appropriately.
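
The reasoning can be illustrated with a simplified descriptor. The struct
below is a stand-in with assumed field names, not the real
rte_idxd_hw_desc. The point is that an initializer zeroes every named
member it does not mention, whereas compiler-inserted padding bytes carry
no such guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative 64-byte descriptor; field names are assumptions. */
struct demo_hw_desc {
	uint32_t pasid;
	uint32_t op_flags;
	uint64_t completion;
	uint64_t src;
	uint64_t dst;
	uint32_t size;        /* length of data for op, or batch size */
	uint16_t intr_handle; /* completion interrupt handle */
	uint16_t reserved[13]; /* remaining 26 bytes, explicitly named */
};

_Static_assert(sizeof(struct demo_hw_desc) == 64,
		"descriptor must be exactly 64 bytes");

/* Build a descriptor on the stack; unnamed fields become zero. */
static struct demo_hw_desc
make_batch_desc(uint64_t src, uint32_t count)
{
	struct demo_hw_desc d = { .src = src, .size = count };
	return d;
}

/* Return 1 if every reserved word is zero, 0 otherwise. */
static int
reserved_is_zero(const struct demo_hw_desc *d)
{
	size_t i;

	for (i = 0; i < sizeof(d->reserved) / sizeof(d->reserved[0]); i++)
		if (d->reserved[i] != 0)
			return 0;
	return 1;
}
```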

Signed-off-by: Bruce Richardson 
---
 drivers/raw/ioat/rte_ioat_rawdev_fns.h | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/raw/ioat/rte_ioat_rawdev_fns.h 
b/drivers/raw/ioat/rte_ioat_rawdev_fns.h
index c2c4601ca7..e96edc9053 100644
--- a/drivers/raw/ioat/rte_ioat_rawdev_fns.h
+++ b/drivers/raw/ioat/rte_ioat_rawdev_fns.h
@@ -140,7 +140,10 @@ struct rte_idxd_hw_desc {
 
uint32_t size;/* length of data for op, or batch size */
 
-   /* 28 bytes of padding here */
+   uint16_t intr_handle; /* completion interrupt handle */
+
+   /* remaining 26 bytes are reserved */
+   uint16_t __reserved[13];
 } __rte_aligned(64);
 
 /**
-- 
2.27.0



[dpdk-dev] [PATCH v1 4/6] raw/ioat: rework SW ring layout

2021-03-18 Thread Bruce Richardson
The ring management in the idxd part of the driver is more complex than
it needs to be, tracking individual batches in a ring and having null
descriptors as padding to avoid having single-operation batches. This can
be simplified by using a regular ring-based layout, with additional
overflow at the end to ensure that one does not need to wrap within a
batch.
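
A simplified sketch of the idea, not the driver's actual data structures:
the descriptor array gets extra slots past the nominal ring size, so a batch
that starts near the end stays contiguous and the write position only wraps
between batches. All names and sizes below are assumptions, and free-space
and completion tracking are omitted.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_RING_SIZE 8 /* nominal ring size */
#define DEMO_MAX_BATCH 4 /* largest batch, also the overflow area size */

struct demo_ring {
	uint32_t desc[DEMO_RING_SIZE + DEMO_MAX_BATCH];
	uint16_t batch_start; /* first slot of the current batch */
	uint16_t batch_size;  /* descriptors added to it so far */
};

/* Add one op to the current batch.
 * Returns the slot used, or -1 if the batch is already full.
 */
static int
demo_add(struct demo_ring *r, uint32_t op)
{
	int slot;

	if (r->batch_size == DEMO_MAX_BATCH)
		return -1;
	slot = r->batch_start + r->batch_size; /* never wraps mid-batch */
	r->desc[slot] = op;
	r->batch_size++;
	return slot;
}

/* Close the current batch; the wrap happens here, between batches. */
static void
demo_submit(struct demo_ring *r)
{
	r->batch_start += r->batch_size;
	if (r->batch_start >= DEMO_RING_SIZE)
		r->batch_start -= DEMO_RING_SIZE;
	r->batch_size = 0;
}
```

With a ring of 8 and a batch starting at slot 6, four ops land in slots 6 to
9, and the next batch starts at slot 2.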

Signed-off-by: Bruce Richardson 
---
 drivers/raw/ioat/idxd_pci.c|   5 +-
 drivers/raw/ioat/idxd_vdev.c   |   3 +-
 drivers/raw/ioat/ioat_common.c |  99 +--
 drivers/raw/ioat/ioat_rawdev_test.c|   1 +
 drivers/raw/ioat/rte_ioat_rawdev_fns.h | 229 +
 5 files changed, 179 insertions(+), 158 deletions(-)

diff --git a/drivers/raw/ioat/idxd_pci.c b/drivers/raw/ioat/idxd_pci.c
index b48e565b4c..13515dbc6c 100644
--- a/drivers/raw/ioat/idxd_pci.c
+++ b/drivers/raw/ioat/idxd_pci.c
@@ -90,7 +90,7 @@ idxd_pci_dev_start(struct rte_rawdev *dev)
return 0;
}
 
-   if (idxd->public.batch_ring == NULL) {
+   if (idxd->public.desc_ring == NULL) {
IOAT_PMD_ERR("WQ %d has not been fully configured", idxd->qid);
return -EINVAL;
}
@@ -337,7 +337,8 @@ idxd_rawdev_destroy(const char *name)
/* free device memory */
IOAT_PMD_DEBUG("Freeing device driver memory");
rdev->dev_private = NULL;
-   rte_free(idxd->public.batch_ring);
+   rte_free(idxd->public.batch_idx_ring);
+   rte_free(idxd->public.desc_ring);
rte_free(idxd->public.hdl_ring);
rte_memzone_free(idxd->mz);
 
diff --git a/drivers/raw/ioat/idxd_vdev.c b/drivers/raw/ioat/idxd_vdev.c
index 30a53b3b82..af585053b4 100644
--- a/drivers/raw/ioat/idxd_vdev.c
+++ b/drivers/raw/ioat/idxd_vdev.c
@@ -209,7 +209,8 @@ idxd_rawdev_remove_vdev(struct rte_vdev_device *vdev)
ret = -errno;
}
 
-   rte_free(idxd->public.batch_ring);
+   rte_free(idxd->public.batch_idx_ring);
+   rte_free(idxd->public.desc_ring);
rte_free(idxd->public.hdl_ring);
 
rte_memzone_free(idxd->mz);
diff --git a/drivers/raw/ioat/ioat_common.c b/drivers/raw/ioat/ioat_common.c
index d055c36a2a..fcb30572e6 100644
--- a/drivers/raw/ioat/ioat_common.c
+++ b/drivers/raw/ioat/ioat_common.c
@@ -84,21 +84,21 @@ idxd_dev_dump(struct rte_rawdev *dev, FILE *f)
fprintf(f, "Driver: %s\n\n", dev->driver_name);
 
fprintf(f, "Portal: %p\n", rte_idxd->portal);
-   fprintf(f, "Batch Ring size: %u\n", rte_idxd->batch_ring_sz);
-   fprintf(f, "Comp Handle Ring size: %u\n\n", rte_idxd->hdl_ring_sz);
-
-   fprintf(f, "Next batch: %u\n", rte_idxd->next_batch);
-   fprintf(f, "Next batch to be completed: %u\n", 
rte_idxd->next_completed);
-   for (i = 0; i < rte_idxd->batch_ring_sz; i++) {
-   struct rte_idxd_desc_batch *b = &rte_idxd->batch_ring[i];
-   fprintf(f, "Batch %u @%p: submitted=%u, op_count=%u, 
hdl_end=%u\n",
-   i, b, b->submitted, b->op_count, b->hdl_end);
-   }
-
-   fprintf(f, "\n");
-   fprintf(f, "Next free hdl: %u\n", rte_idxd->next_free_hdl);
-   fprintf(f, "Last completed hdl: %u\n", rte_idxd->last_completed_hdl);
-   fprintf(f, "Next returned hdl: %u\n", rte_idxd->next_ret_hdl);
+   fprintf(f, "Config: {ring_size: %u, hdls_disable: %u}\n\n",
+   rte_idxd->cfg.ring_size, rte_idxd->cfg.hdls_disable);
+
+   fprintf(f, "max batches: %u\n", rte_idxd->max_batches);
+   fprintf(f, "batch idx read: %u\n", rte_idxd->batch_idx_read);
+   fprintf(f, "batch idx write: %u\n", rte_idxd->batch_idx_write);
+   fprintf(f, "batch idxes:");
+   for (i = 0; i < rte_idxd->max_batches + 1; i++)
+   fprintf(f, "%u ", rte_idxd->batch_idx_ring[i]);
+   fprintf(f, "\n\n");
+
+   fprintf(f, "hdls read: %u\n", rte_idxd->hdls_read);
+   fprintf(f, "hdls avail: %u\n", rte_idxd->hdls_avail);
+   fprintf(f, "batch start: %u\n", rte_idxd->batch_start);
+   fprintf(f, "batch size: %u\n", rte_idxd->batch_size);
 
return 0;
 }
@@ -114,10 +114,8 @@ idxd_dev_info_get(struct rte_rawdev *dev, rte_rawdev_obj_t 
dev_info,
if (info_size != sizeof(*cfg))
return -EINVAL;
 
-   if (cfg != NULL) {
-   cfg->ring_size = rte_idxd->hdl_ring_sz;
-   cfg->hdls_disable = rte_idxd->hdls_disable;
-   }
+   if (cfg != NULL)
+   *cfg = rte_idxd->cfg;
return 0;
 }
 
@@ -129,8 +127,6 @@ idxd_dev_configure(const struct rte_rawdev *dev,
struct rte_idxd_rawdev *rte_idxd = &idxd->public;
struct rte_ioat_rawdev_config *cfg = config;
uint16_t max_desc = cfg->ring_size;
-   uint16_t max_batches = max_desc / BATCH_SIZE;
-   uint16_t i;
 
if (config_size != sizeof(*cfg))
return -EINVAL;
@@ -140,47 +136,34 @@ idxd

[dpdk-dev] [PATCH v1 5/6] raw/ioat: add api to query remaining ring space

2021-03-18 Thread Bruce Richardson
From: Kevin Laatz 

Add a new API to query remaining descriptor ring capacity. This API is
useful, for example, when an application needs to enqueue a fragmented
packet and wants to ensure that all segments of the packet will be enqueued
together.
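
The intended usage pattern can be sketched with a toy ring. This is an
assumed model rather than the driver's implementation: the ring size is a
power of two and one slot is kept free, which matches the "ring_size - 1"
empty capacity checked in the test code. The demo_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

struct demo_queue {
	uint16_t ring_size; /* must be a power of two */
	uint16_t read;      /* next slot to be completed */
	uint16_t write;     /* next slot to be filled */
};

/* Number of descriptors that can still be enqueued. */
static uint16_t
demo_burst_capacity(const struct demo_queue *q)
{
	uint16_t mask = q->ring_size - 1;
	uint16_t used = (uint16_t)(q->write - q->read) & mask;

	return mask - used; /* one slot always stays free */
}

/* Enqueue all n segments or none of them.
 * Returns 0 on success, -1 if they would not all fit.
 */
static int
demo_enqueue_all(struct demo_queue *q, uint16_t n)
{
	if (demo_burst_capacity(q) < n)
		return -1;
	q->write = (q->write + n) & (q->ring_size - 1);
	return 0;
}

/* Mark n descriptors as completed, freeing their slots. */
static void
demo_complete(struct demo_queue *q, uint16_t n)
{
	q->read = (q->read + n) & (q->ring_size - 1);
}
```

An application holding a packet with several segments would check the
capacity once and then enqueue every segment, instead of discovering halfway
through that the ring is full.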

Signed-off-by: Kevin Laatz 
---
 drivers/raw/ioat/ioat_rawdev_test.c| 136 +
 drivers/raw/ioat/rte_ioat_rawdev_fns.h |  46 +
 2 files changed, 182 insertions(+)

diff --git a/drivers/raw/ioat/ioat_rawdev_test.c 
b/drivers/raw/ioat/ioat_rawdev_test.c
index 3de8273704..3a4c7a5161 100644
--- a/drivers/raw/ioat/ioat_rawdev_test.c
+++ b/drivers/raw/ioat/ioat_rawdev_test.c
@@ -202,6 +202,138 @@ test_enqueue_fill(int dev_id)
return 0;
 }
 
+static int
+test_burst_capacity(int dev_id, unsigned int ring_size)
+{
+   unsigned int i, j;
+   unsigned int length = 1024;
+
+   /* Test to make sure it does not enqueue if we cannot fit the entire 
burst */
+   do {
+#define BURST_SIZE 19
+#define EXPECTED_REJECTS   5
+   struct rte_mbuf *srcs[BURST_SIZE], *dsts[BURST_SIZE];
+   struct rte_mbuf *completed_src[BURST_SIZE];
+   struct rte_mbuf *completed_dst[BURST_SIZE];
+   unsigned int cnt_success = 0;
+   unsigned int cnt_rejected = 0;
+   unsigned int valid_iters = (ring_size - 1)/BURST_SIZE;
+
+   /* Enqueue burst until they won't fit + some extra iterations 
which should
+   * be rejected
+   */
+   for (i = 0; i < valid_iters + EXPECTED_REJECTS; i++) {
+   if (rte_ioat_burst_capacity(dev_id) >= BURST_SIZE) {
+   for (j = 0; j < BURST_SIZE; j++) {
+
+   srcs[j] = rte_pktmbuf_alloc(pool);
+   dsts[j] = rte_pktmbuf_alloc(pool);
+   srcs[j]->data_len = srcs[j]->pkt_len = 
length;
+   dsts[j]->data_len = dsts[j]->pkt_len = 
length;
+
+   if (rte_ioat_enqueue_copy(dev_id,
+   srcs[j]->buf_iova + 
srcs[j]->data_off,
+   dsts[j]->buf_iova + 
dsts[j]->data_off,
+   length,
+   (uintptr_t)srcs[j],
+   (uintptr_t)dsts[j]) != 
1) {
+   PRINT_ERR("Error with 
rte_ioat_enqueue_copy\n");
+   return -1;
+   }
+
+   rte_pktmbuf_free(srcs[j]);
+   rte_pktmbuf_free(dsts[j]);
+   cnt_success++;
+   }
+   } else {
+   cnt_rejected++;
+   }
+   }
+
+   /* do cleanup before next tests */
+   rte_ioat_perform_ops(dev_id);
+   usleep(100);
+   for (i = 0; i < valid_iters; i++) {
+   if (rte_ioat_completed_ops(dev_id, BURST_SIZE, (void 
*)completed_src,
+   (void *)completed_dst) != BURST_SIZE) {
+   PRINT_ERR("error with completions\n");
+   return -1;
+   }
+   }
+
+   printf("successful_enqueues: %u  expected_successful: %u  
rejected_iters: %u  expected_rejects: %u\n",
+   cnt_success, valid_iters * BURST_SIZE, 
cnt_rejected,
+   EXPECTED_REJECTS);
+
+   if (!(cnt_success == (valid_iters * BURST_SIZE)) ||
+   !(cnt_rejected == EXPECTED_REJECTS)) {
+   PRINT_ERR("Burst Capacity test failed\n");
+   return -1;
+   }
+   } while (0);
+
+   /* Verify that space is taken and free'd as expected.
+* Repeat the test to verify wrap-around handling is correct in
+* rte_ioat_burst_capacity().
+*/
+   for (i = 0; i < ring_size / 32; i++) {
+   struct rte_mbuf *srcs[64], *dsts[64];
+   struct rte_mbuf *completed_src[64];
+   struct rte_mbuf *completed_dst[64];
+
+   /* Make sure the ring is clean before we start */
+   if (rte_ioat_burst_capacity(dev_id) != ring_size - 1) {
+   PRINT_ERR("Error, ring should be empty\n");
+   return -1;
+   }
+
+   /* Enqueue 64 mbufs & verify that space is taken */
+   for (j = 0; j < 64; j++) {
+   srcs[j] = rte_pktmbuf_alloc(poo

[dpdk-dev] [PATCH v1 6/6] raw/ioat: add bus driver for device scanning automatically

2021-03-18 Thread Bruce Richardson
Rather than using a vdev with args, DPDK can scan and initialize the
devices automatically using a bus-type driver. This bus does not need to
worry about registering device drivers, rather it can initialize the
devices directly on probe.

The device instances (queues) to use are detected from /dev, with
additional information about them obtained from /sys.
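
As a rough illustration of the scan step, the sketch below turns a work
queue name such as "wq0.1" into a device number and queue number, and
resolves the /dev directory with an environment override. The helper names
are hypothetical; only the "wqX.Y" naming, the default path and the
DSA_DEV_PATH variable come from the patch.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Default location, overridable through the environment for testing. */
static const char *
demo_dev_path(void)
{
	const char *path = getenv("DSA_DEV_PATH");

	return path ? path : "/dev/dsa";
}

/* Parse "wq<device>.<queue>", rejecting trailing characters.
 * Returns 0 on success, -1 if the name does not match.
 */
static int
demo_parse_wq_name(const char *name, unsigned int *device_id,
		unsigned int *wq_id)
{
	int consumed = 0;

	if (sscanf(name, "wq%u.%u%n", device_id, wq_id, &consumed) != 2)
		return -1;
	if (name[consumed] != '\0')
		return -1;
	return 0;
}
```

A scan loop would call demo_parse_wq_name() on each directory entry under
demo_dev_path() and skip entries that do not parse.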

Signed-off-by: Bruce Richardson 
---
 drivers/raw/ioat/idxd_bus.c  | 320 +++
 drivers/raw/ioat/idxd_vdev.c | 232 -
 drivers/raw/ioat/meson.build |   3 +-
 3 files changed, 321 insertions(+), 234 deletions(-)
 create mode 100644 drivers/raw/ioat/idxd_bus.c
 delete mode 100644 drivers/raw/ioat/idxd_vdev.c

diff --git a/drivers/raw/ioat/idxd_bus.c b/drivers/raw/ioat/idxd_bus.c
new file mode 100644
index 00..ec15d9736a
--- /dev/null
+++ b/drivers/raw/ioat/idxd_bus.c
@@ -0,0 +1,320 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2021 Intel Corporation
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include "ioat_private.h"
+
+/* default value for DSA paths, but allow override in environment for testing 
*/
+#define DSA_DEV_PATH "/dev/dsa"
+#define DSA_SYSFS_PATH "/sys/bus/dsa/devices"
+
+/** a DSA device instance */
+struct rte_dsa_device {
+   TAILQ_ENTRY(rte_dsa_device) next;   /**< next dev in list */
+   struct rte_device device;   /**< Inherit core device */
+   char wq_name[32];   /**< the workqueue name/number e.g. 
wq0.1 */
+   uint16_t device_id; /**< the DSA instance number */
+   uint16_t wq_id; /**< the queue on the DSA instance 
*/
+};
+
+/* forward prototypes */
+struct dsa_bus;
+static int dsa_scan(void);
+static int dsa_probe(void);
+static struct rte_device *dsa_find_device(const struct rte_device *start,
+   rte_dev_cmp_t cmp,  const void *data);
+
+/** List of devices */
+TAILQ_HEAD(dsa_device_list, rte_dsa_device);
+
+/**
+ * Structure describing the DSA bus
+ */
+struct dsa_bus {
+   struct rte_bus bus;   /**< Inherit the generic class */
+   struct rte_driver driver; /**< Driver struct for devices to 
point to */
+   struct dsa_device_list device_list;  /**< List of DSA devices */
+};
+
+struct dsa_bus dsa_bus = {
+   .bus = {
+   .scan = dsa_scan,
+   .probe = dsa_probe,
+   .find_device = dsa_find_device,
+   },
+   .driver = {
+   .name = "rawdev_idxd"
+   },
+   .device_list = TAILQ_HEAD_INITIALIZER(dsa_bus.device_list),
+};
+
+static inline const char *
+dsa_get_dev_path(void)
+{
+   const char *path = getenv("DSA_DEV_PATH");
+   return path ? path : DSA_DEV_PATH;
+}
+
+static inline const char *
+dsa_get_sysfs_path(void)
+{
+   const char *path = getenv("DSA_SYSFS_PATH");
+   return path ? path : DSA_SYSFS_PATH;
+}
+
+static const struct rte_rawdev_ops idxd_vdev_ops = {
+   .dev_close = idxd_rawdev_close,
+   .dev_selftest = ioat_rawdev_test,
+   .dump = idxd_dev_dump,
+   .dev_configure = idxd_dev_configure,
+   .dev_info_get = idxd_dev_info_get,
+   .xstats_get = ioat_xstats_get,
+   .xstats_get_names = ioat_xstats_get_names,
+   .xstats_reset = ioat_xstats_reset,
+};
+
+static void *
+idxd_vdev_mmap_wq(struct rte_dsa_device *dev)
+{
+   void *addr;
+   char path[PATH_MAX];
+   int fd;
+
+   snprintf(path, sizeof(path), "%s/%s", dsa_get_dev_path(), dev->wq_name);
+   fd = open(path, O_RDWR);
+   if (fd < 0) {
+   IOAT_PMD_ERR("Failed to open device path: %s", path);
+   return NULL;
+   }
+
+   addr = mmap(NULL, 0x1000, PROT_WRITE, MAP_SHARED, fd, 0);
+   close(fd);
+   if (addr == MAP_FAILED) {
+   IOAT_PMD_ERR("Failed to mmap device %s", path);
+   return NULL;
+   }
+
+   return addr;
+}
+
+static int
+read_wq_string(struct rte_dsa_device *dev, const char *filename,
+   char *value, size_t valuelen)
+{
+   char sysfs_node[PATH_MAX];
+   int len;
+   int fd;
+
+   snprintf(sysfs_node, sizeof(sysfs_node), "%s/%s/%s",
+   dsa_get_sysfs_path(), dev->wq_name, filename);
+   if ((fd = open(sysfs_node, O_RDONLY)) < 0) {
+   IOAT_PMD_ERR("%s(): opening file '%s' failed: %s",
+   __func__, sysfs_node, strerror(errno));
+   return -1;
+   }
+
+   len = read(fd, value, valuelen - 1);
+   close(fd);
+   if (len < 0) {
+   IOAT_PMD_ERR("%s(): error reading file '%s': %s",
+   __func__, sysfs_node, strerror(errno));
+   return -1;
+   }
+   value[len] = '\0';
+   return 0;
+}
+
+static int
+read_wq_int(struct rte_dsa_device *de

Re: [dpdk-dev] 19.11.4 patches review and test

2021-03-18 Thread Pai G, Sunil
Hi Christian, Ilya

> -Original Message-
> From: Ilya Maximets 
> Sent: Thursday, March 18, 2021 8:18 PM
> To: Pai G, Sunil ; Christian Ehrhardt
> ; Stokes, Ian ;
> Ilya Maximets ; Govindharajan, Hariprasad
> 
> Cc: Richardson, Bruce ; Luca Boccassi
> ; sta...@dpdk.org; dev ; James Page
> 
> Subject: Re: [dpdk-dev] 19.11.4 patches review and test
> 
> On 3/18/21 2:36 PM, Pai G, Sunil wrote:
> > Hey Christian,
> >
> > 
> >
> >> back  in 19.11.4 these DPDK changes were not picked up as they have
> >> broken builds as discussed here.
> >> Later on the communication was that all this works fine now and
> >> thereby Luca has "reverted the reverts" in 19.11.6 [1].
> >>
> >> But today we were made aware that still no OVS 2.13 builds against a
> >> DPDK that has those changes.
> >> Not 2.13.1 as we have it in Ubuntu nor (if it needs some OVS changes
> >> backported) the recent 2.13.3 does build.
> >> They still fail with the very same issue I reported [2] back then.
> >>
> >> Unfortunately I have just released 19.11.7 so I can't revert them
> >> there - but OTOH reverting and counter reverting every other release
> >> seems wrong anyway.
> 
> It is wrong indeed, but the main question here is why these patches were
> backported to a stable release in the first place?
> 
> Looking at these patches, they are not actual bug fixes but more like "nice to
> have" features that additionally break the way an application links with DPDK.
> Stuff like that should not be accepted into a stable release without a strong
> justification or, at least, testing with actual applications.
> 
> Since we already have a revert of revert, revert of revert of revert doesn't
> seem so bad.
> 
> >>
> >> I wanted to ask if there is a set of patches that OVS would need to
> >> backport to 2.13.x to make this work?
> >> If they could be identified and prepared Distros could use them on
> >> 2.13.3 asap and 2.13.4 could officially release them for OVS later on.
> >>
> >> But for that we'd need a hint which OVS changes that would need to be.
> >> All I know atm is from the testing reports on DPDK it seems that OVS
> >> 2.14.3 and 2.15 are happy with the new DPDK code.
> >
> >> Do you have pointers on what 2.13.3 would need to get backported to
> >> work again in regard to this build issue.
> >
> > You would need to use partial contents from patch :
> > http://patchwork.ozlabs.org/project/openvswitch/patch/1608142365-26215-1-git-send-email-ian.sto...@intel.com/
> >
> > If you'd like me to send patches which would work with 2.13, 2.14, I'm
> > ok with that too [keeping only those parts of the patch which fix the issue
> > you see]. But we must ensure it doesn’t cause problems for OVS too.
> > Your thoughts, Ilya?
> 
> We had more fixes on top of this particular patch and I'd like to not
> cherry-pick and re-check all of this again.

I agree, we had more fixes on top of this. It would be risky to cherry-pick.
So it might be a better option to revert.


> For users stable releases should be
> transparent, i.e. should not have disruptive changes that will break their
> ability to build with version of a library that they would like to use.
> 
> What are the exact changes we're talking about?  Will it still be possible to
> build OVS with older versions of a stable 19.11 if these changes are applied?
> 
> >
> >
> >>
> >> [1]: http://git.dpdk.org/dpdk-stable/log/?h=19.11&ofs=550
> >> [2]: http://mails.dpdk.org/archives/stable/2020-September/024796.html
> > 
> >
> > Thanks and regards,
> > Sunil
> >


[dpdk-dev] DPDK Release Status Meeting 18/03/2021

2021-03-18 Thread Mcnamara, John
Release status meeting minutes {Date}
=
:Date: 18 March 2021
:toc:

.Agenda:
* Release Dates
* Subtrees
* LTS
* Conferences
* Opens

.Participants:
* Arm
* Debian/Microsoft
* Intel
* Marvell
* Nvidia
* Red Hat


Release Dates
-

* `v21.05` dates
  - Proposal/V1:Thursday, 18 March
  - -rc1:   Thursday, 15 April
  - Release:Friday, 14 May

* Please send roadmaps, preferably before beginning of the release
  - Thanks to `Marvell`, `Huawei hns3`, `Nvidia`, `Wangxun` and `Intel` for
sending roadmap


Subtrees


* main

  - Thomas focused on Windows related patches
  - Patches for ARM compilation
  - Pulled MMIO for virtio legacy devices patches from Alibaba
  - Tests broken due to version macro (fix pending)
  - Should merge musl support in this release
** Needs support in CI Community lab
** Maybe run in container
  - Migrating the Git repositories, documentation, and the DPDK
core website on Sunday 28th March

* next-net
  - Progressing, nothing critical

* next-crypto
  - Not much update
  - Will review/merge some apps and drivers this week

* next-eventdev
  - Several ongoing patches: periodic timer, vector event,
removal of DLB1 and updates to DLB2, CNXK eventdev

* next-virtio
  - Some series are ready but no PR yet


LTS
---

* `v19.11.7-rc2` is released, 2021-03-17
  - http://inbox.dpdk.org/dev/20210317170343.3267049-1-christian.ehrha...@canonical.com/T/#u

* `v20.11.1` is released, 2021-03-08
  - http://inbox.dpdk.org/dev/20210308181351.409609-1-luca.bocca...@gmail.com/


Conferences
---

* DPDK APAC 2021 event is approaching, it is on 22-23 March
  - https://www.dpdk.org/event/dpdk-summit-apac-2021/
  - https://events.linuxfoundation.org/dpdk-summit-apac/


Opens
-

* John is evaluating lgtm.com
  - https://lgtm.com/projects/g/DPDK/dpdk/?mode=list

* If anyone has done code review on GitHub, please reach out to share BKMs or
  experience



.DPDK Release Status Meetings
*
The DPDK Release Status Meeting is intended for DPDK Committers to discuss the 
status of the master tree and sub-trees, and for project managers to track 
progress or milestone dates.

The meeting occurs every Thursday at 8:30 UTC on https://meet.jit.si/DPDK

If you wish to attend just send an email to "John McNamara 
" for the invite.
*


Re: [dpdk-dev] [PATCH 1/3] Add EAL threads API

2021-03-18 Thread Narcisa Ana Maria Vasile
On Thu, Mar 18, 2021 at 04:48:49PM +0100, David Marchand wrote:
> On Thu, Mar 18, 2021 at 2:01 AM Narcisa Ana Maria Vasile
>  wrote:
> > diff --git a/lib/librte_eal/common/eal_common_thread.c 
> > b/lib/librte_eal/common/eal_common_thread.c
> > index 73a055902..5219e783e 100644
> > --- a/lib/librte_eal/common/eal_common_thread.c
> 
> rte_thread_*et_affinity() are stable.
> This breaks the ABI (which is bad) and this API change was not
> announced previously.
> 
  Thank you David, I will revert the renaming of the stable
  functions to fix the ABI break.

  Given that the original functions only operate on the current thread
  (using the _thread_self()), changing their names to 
  rte_thread_self_*et_affinity() brings more clarity to the purpose of
  the functions. We will propose an ABI change to rename
  them in the next release (following the proper ABI changes procedures).

> -- 
> David Marchand


Re: [dpdk-dev] [PATCH 1/3] Add EAL threads API

2021-03-18 Thread Narcisa Ana Maria Vasile
On Thu, Mar 18, 2021 at 02:48:01PM +, Tal Shnaiderman wrote:
> > Subject: [dpdk-dev] [PATCH 1/3] Add EAL threads API
> > 
> > From: Narcisa Vasile 
> > 
> > EAL must hide the environment specifics from apps and libraries.
> > Add an EAL API for managing threads.
> > 
> > Signed-off-by: Narcisa Vasile 
> > Signed-off-by: Dmitry Malloy 
> > ---
> 
> Hi Naty, Dmitry, 
> 
> Thank you for adding those functions to the thread API.
> This is a huge commit, I'd split it to separate patches, something like:
> 
  Thank you Tal, I will split this patch into smaller ones.
   


Re: [dpdk-dev] [PATCH v5 5/5] doc/guides/l3_forward: update documentation for FIB

2021-03-18 Thread Mcnamara, John



> -Original Message-
> From: dev  On Behalf Of Conor Walsh
> Sent: Monday, March 15, 2021 11:35 AM
> To: jer...@marvell.com; step...@networkplumber.org; Iremonger, Bernard
> ; Ananyev, Konstantin
> ; Medvedkin, Vladimir
> ; Burakov, Anatoly
> 
> Cc: dev@dpdk.org; Walsh, Conor 
> Subject: [dpdk-dev] [PATCH v5 5/5] doc/guides/l3_forward: update
> documentation for FIB

For the doc section of the patch:

Acked-by: John McNamara 



Re: [dpdk-dev] [RFC 0/4] SocketPair Broker support for vhost and virtio-user.

2021-03-18 Thread Ilya Maximets
On 3/18/21 6:52 PM, Stefan Hajnoczi wrote:
> On Wed, Mar 17, 2021 at 09:25:26PM +0100, Ilya Maximets wrote:
> Hi,
> Some questions to understand the problems that SocketPair Broker solves:
> 
>> Even more configuration tricks required in order to share some sockets
>> between different containers and not only with the host, e.g. to
>> create service chains.
> 
> How does SocketPair Broker solve this? I guess the idea is that
> SocketPair Broker must be started before other containers. That way
> applications don't need to sleep and reconnect when a socket isn't
> available yet.
> 
> On the other hand, the SocketPair Broker might be unavailable (OOM
> killer, crash, etc), so applications still need to sleep and reconnect
> to the broker itself. I'm not sure the problem has actually been solved
> unless there is a reason why the broker is always guaranteed to be
> available?

Hi, Stefan.  Thanks for your feedback!

The idea is to have the SocketPair Broker running right from the
boot of the host.  If it uses systemd socket-based service
activation, the socket should persist while systemd is alive, IIUC.
OOM, crash and restart of the broker should not affect the existence
of the socket, and systemd will spawn the service if it's not running
for any reason, without losing incoming connections.
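
To make the socket-activation point concrete, here is a minimal sketch of how
an activated service could detect the listening socket that systemd hands to
it. It is not code from the broker: the function name is hypothetical, and it
only relies on the LISTEN_PID / LISTEN_FDS environment variables and the
"first passed descriptor is 3" convention that systemd documents for
sd_listen_fds(). The PID is taken as a parameter purely to keep the sketch
easy to exercise.

```c
#include <stdlib.h>

/* First descriptor systemd passes to an activated service. */
#define LISTEN_FDS_START 3

/*
 * Hypothetical helper: return the inherited listening descriptor if the
 * environment says systemd passed one to the process with PID self_pid,
 * or -1 if the process was started some other way.
 */
int
broker_inherited_listen_fd(long self_pid)
{
	const char *pid_str = getenv("LISTEN_PID");
	const char *fds_str = getenv("LISTEN_FDS");

	if (pid_str == NULL || fds_str == NULL)
		return -1;
	/* The variables may have been meant for a different process. */
	if (strtol(pid_str, NULL, 10) != self_pid)
		return -1;
	if (strtol(fds_str, NULL, 10) < 1)
		return -1;
	return LISTEN_FDS_START;
}
```

A service would call this with (long)getpid() and accept(2) on the returned
descriptor. Since the listening socket belongs to systemd rather than to the
service, it can outlive a crashed or OOM-killed service process.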

> 
>> And some housekeeping usually required for applications in case the
>> socket server terminated abnormally and socket files left on a file
>> system:
>>  "failed to bind to vhu: Address already in use; remove it and try again"
> 
> QEMU avoids this by unlinking before binding. The drawback is that users
> might accidentally hijack an existing listen socket, but that can be
> solved with a pidfile.

How exactly could this be solved with a pidfile?  And what if this is
a different application that tries to create a socket on the same path?
e.g. QEMU creates a socket (started in a server mode) and the user
accidentally creates a dpdkvhostuser port in Open vSwitch instead of
dpdkvhostuserclient.  This way the rte_vhost library will try to bind
to an existing socket file and will fail.  Subsequently, port creation
in OVS will fail.  We can't allow OVS to unlink files because this
way OVS users will have the ability to unlink random sockets that OVS has
access to, and we also have no idea if it was QEMU that created the file
or a virtio-user application or someone else.
There are probably ways to detect whether any live process has this
socket open, but that sounds like too much for this purpose; also,
I'm not sure it's possible if the actual user is in a different
container.
So I don't see a good reliable way to detect these conditions.  It
falls on the shoulders of higher-level management software or the user to
clean these socket files up before adding ports.
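
For reference, the unlink-before-bind pattern under discussion looks roughly
like the sketch below. This is not QEMU or OVS code; the function name is
hypothetical and only standard POSIX socket calls are used. Note that the
unlink() is unconditional: nothing in it can tell whether the path belongs to
a stale file or to a socket that another live process is still listening on.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: returns a listening fd, or -1 with errno set. */
int
listen_unix_unlink_first(const char *path)
{
	struct sockaddr_un addr;
	int fd, saved;

	if (strlen(path) >= sizeof(addr.sun_path)) {
		errno = ENAMETOOLONG;
		return -1;
	}
	memset(&addr, 0, sizeof(addr));
	addr.sun_family = AF_UNIX;
	memcpy(addr.sun_path, path, strlen(path) + 1);

	/* Remove whatever is at the path; a missing file is not an error. */
	if (unlink(path) < 0 && errno != ENOENT)
		return -1;

	fd = socket(AF_UNIX, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;
	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
	    listen(fd, 1) < 0) {
		saved = errno;
		close(fd);
		errno = saved;
		return -1;
	}
	return fd;
}
```

Calling it twice on the same path succeeds both times. The second caller
silently takes over the path while the first one keeps listening on a socket
that no new client can reach, which is the hijack scenario mentioned above.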

> 
>> Additionally, all applications (system and user's!) should follow
>> naming conventions and place socket files in particular location on a
>> file system to make things work.
> 
> Does SocketPair Broker solve this? Applications now need to use a naming
> convention for keys, so it seems like this issue has not been
> eliminated.

Key is an arbitrary sequence of bytes, so it's hard to call it a naming
convention.  But they need to know keys, you're right.  And to be
careful I said "eliminates most of the inconveniences". :)

> 
>> This patch-set aims to eliminate most of the inconveniences by
>> leveraging an infrastructure service provided by a SocketPair Broker.
> 
> I don't understand yet why this is useful for vhost-user, where the
> creation of the vhost-user device backend and its use by a VMM are
> closely managed by one piece of software:
> 
> 1. Unlink the socket path.
> 2. Create, bind, and listen on the socket path.
> 3. Instantiate the vhost-user device backend (e.g. talk to DPDK/SPDK
>RPC, spawn a process, etc) and pass in the listen fd.
> 4. In the meantime the VMM can open the socket path and call connect(2).
>As soon as the vhost-user device backend calls accept(2) the
>connection will proceed (there is no need for sleeping).
> 
> This approach works across containers without a broker.

Not sure if I fully understood the question here, but anyway.

This approach works fine if you know what application to run.
In case of a k8s cluster, it might be a random DPDK application
with virtio-user ports that runs inside a container and wants to
have a network connection.  Also, this application needs to run
virtio-user in server mode, otherwise a restart of OVS will
require a restart of the application.  So, you basically need to
rely on a third-party application to create a socket with the right
name and in the correct location that is shared with the host, so
OVS can find it and connect.

In the VM world everything is much simpler, since you have
libvirt and QEMU that will take care of all of this
and which are also under full control of management software
and a system administrator.
In case of a container with a "random" DPDK application i

Re: [dpdk-dev] [PATCH 1/3] Add EAL threads API

2021-03-18 Thread Tyler Retzlaff
On Thu, Mar 18, 2021 at 02:48:01PM +, Tal Shnaiderman wrote:
> 
> I don't know if this table is needed, the approach should be to have the 
> return value/rte_errno identical between the OSs.
> And having the specific OS errno printed.

the underlying problem here is that dpdk is adopting linux errno spaces
as the de-facto standard.  even between bsd and linux some apis will
return different errno for the same input parameters.

between linux/bsd/unix this ends up being more subtle since usually the
alternate errno returned is still handled by the consumer of the api
in a similar manner but arguably could result in different behavior on
different platforms for the same application i.e. compatibility delta.

with windows the problem is more pronounced. calls to the underlying
native apis may fail with errors that semantically can't be represented
by the linux errno space or may not return an error at all. in such
circumstances a more generic errno is returned or the call succeeds
where it may have failed on another platform.  i.e. compatibility delta.

the patch as presented aims to be as semantically compatible as possible
with the current adopted linux errno space. we try to remap the underlying
platform error to something that makes sense within the set of linux errno
that we are allowed to return and when we can't map it we return a more
generic errno. if we've made mistakes here, please let us know.

i think there is probably a more general discussion to be had that is
off-topic for this patch about how to report errors portably in dpdk and
at the same time preserve and provide access to the underlying platform
details of the errors when needed.

> e.g. pthread_setschedparam On UNIX returns ESRCH when no thread id is found, 
> the table below doesn't translate to it so Windows
> will never return such error code, maybe use only the errnos below for all 
> OSs? what do you think?
> 
> > +/* Translates the most common error codes related to threads */
> > +static int
> > +rte_thread_translate_win32_error(DWORD error)
> > +{
> > +   switch (error) {
> > +   case ERROR_SUCCESS:
> > +   return 0;
> > +
> > +   case ERROR_INVALID_PARAMETER:
> > +   return -EINVAL;
> > +
> > +   case ERROR_INVALID_HANDLE:
> > +   return -EFAULT;
> > +
> > +   case ERROR_NOT_ENOUGH_MEMORY:
> > +   /* FALLTHROUGH */
> > +   case ERROR_NO_SYSTEM_RESOURCES:
> > +   return -ENOMEM;
> > +
> > +   case ERROR_PRIVILEGE_NOT_HELD:
> > +   /* FALLTHROUGH */
> > +   case ERROR_ACCESS_DENIED:
> > +   return -EACCES;
> > +
> > +   case ERROR_ALREADY_EXISTS:
> > +   return -EEXIST;
> > +
> > +   case ERROR_POSSIBLE_DEADLOCK:
> > +   return -EDEADLK;
> > +
> > +   case ERROR_INVALID_FUNCTION:
> > +   /* FALLTHROUGH */
> > +   case ERROR_CALL_NOT_IMPLEMENTED:
> > +   return -ENOSYS;
> > +
> > +   default:
> > +   return -EINVAL;
> > +   }
> > +
> > +   return -EINVAL;
> > +}
> 


Re: [dpdk-dev] [RFC 0/4] SocketPair Broker support for vhost and virtio-user.

2021-03-18 Thread Ilya Maximets
On 3/18/21 8:47 PM, Ilya Maximets wrote:
> On 3/18/21 6:52 PM, Stefan Hajnoczi wrote:
>> On Wed, Mar 17, 2021 at 09:25:26PM +0100, Ilya Maximets wrote:
>> Hi,
>> Some questions to understand the problems that SocketPair Broker solves:
>>
>>> Even more configuration tricks required in order to share some sockets
>>> between different containers and not only with the host, e.g. to
>>> create service chains.
>>
>> How does SocketPair Broker solve this? I guess the idea is that
>> SocketPair Broker must be started before other containers. That way
>> applications don't need to sleep and reconnect when a socket isn't
>> available yet.
>>
>> On the other hand, the SocketPair Broker might be unavailable (OOM
>> killer, crash, etc), so applications still need to sleep and reconnect
>> to the broker itself. I'm not sure the problem has actually been solved
>> unless there is a reason why the broker is always guaranteed to be
>> available?
> 
> Hi, Stefan.  Thanks for your feedback!
> 
> The idea is to have the SocketPair Broker running right from the
> boot of the host.  If it uses systemd socket-based service
> activation, the socket should persist while systemd is alive, IIUC.
> OOM, crash and restart of the broker should not affect the existence
> of the socket, and systemd will spawn the service if it's not running
> for any reason, without losing incoming connections.
> 
>>
>>> And some housekeeping usually required for applications in case the
>>> socket server terminated abnormally and socket files left on a file
>>> system:
>>>  "failed to bind to vhu: Address already in use; remove it and try again"
>>
>> QEMU avoids this by unlinking before binding. The drawback is that users
>> might accidentally hijack an existing listen socket, but that can be
>> solved with a pidfile.
> 
> How exactly could this be solved with a pidfile?  And what if this is
> a different application that tries to create a socket on the same path?
> e.g. QEMU creates a socket (started in a server mode) and the user
> accidentally creates a dpdkvhostuser port in Open vSwitch instead of
> dpdkvhostuserclient.  This way the rte_vhost library will try to bind
> to an existing socket file and will fail.  Subsequently, port creation
> in OVS will fail.  We can't allow OVS to unlink files because this
> way OVS users will have the ability to unlink random sockets that OVS has
> access to, and we also have no idea if it was QEMU that created the file
> or a virtio-user application or someone else.
> There are probably ways to detect whether any live process has this
> socket open, but that sounds like too much for this purpose; also,
> I'm not sure it's possible if the actual user is in a different
> container.
> So I don't see a good reliable way to detect these conditions.  It
> falls on the shoulders of higher-level management software or the user to
> clean these socket files up before adding ports.
> 
>>
>>> Additionally, all applications (system and user's!) should follow
>>> naming conventions and place socket files in particular location on a
>>> file system to make things work.
>>
>> Does SocketPair Broker solve this? Applications now need to use a naming
>> convention for keys, so it seems like this issue has not been
>> eliminated.
> 
> Key is an arbitrary sequence of bytes, so it's hard to call it a naming
> convention.  But they need to know keys, you're right.  And to be
> careful I said "eliminates most of the inconveniences". :)
> 
>>
>>> This patch-set aims to eliminate most of the inconveniences by
>>> leveraging an infrastructure service provided by a SocketPair Broker.
>>
>> I don't understand yet why this is useful for vhost-user, where the
>> creation of the vhost-user device backend and its use by a VMM are
>> closely managed by one piece of software:
>>
>> 1. Unlink the socket path.
>> 2. Create, bind, and listen on the socket path.
>> 3. Instantiate the vhost-user device backend (e.g. talk to DPDK/SPDK
>>RPC, spawn a process, etc) and pass in the listen fd.
>> 4. In the meantime the VMM can open the socket path and call connect(2).
>>As soon as the vhost-user device backend calls accept(2) the
>>connection will proceed (there is no need for sleeping).
>>
>> This approach works across containers without a broker.
> 
> Not sure if I fully understood the question here, but anyway.
> 
> This approach works fine if you know what application to run.
> In case of a k8s cluster, it might be a random DPDK application
> with virtio-user ports that runs inside a container and wants to
> have a network connection.  Also, this application needs to run
> virtio-user in server mode, otherwise a restart of OVS will
> require a restart of the application.  So, you basically need to
> rely on a third-party application to create a socket with the right
> name and in the correct location that is shared with the host, so
> OVS can find it and connect.
> 
> In a VM world everything is much more simple, since you have
> a libvirt and QEMU that w

[dpdk-dev] Arm roadmap for 21.05

2021-03-18 Thread Honnappa Nagarahalli
(Bcc: Arm internal stakeholders)

Hello,
Following are the work items planned for 21.05:

1) Performance improvements in L3fwd example.
2) Use C11 atomic built-ins in EAL, Service Core library and MLX4/MLX5 PMDs.
3) Enhance mempool library with additional debug counters.
4) Meson build improvements (Arm build options rework, Aarch32 cross 
compilation, KNI cross compilation)

Thank you,
Honnappa


Re: [dpdk-dev] [PATCH] bus/pci: fix Windows kernel driver categories

2021-03-18 Thread Ranjit Menon



On 3/18/2021 12:49 AM, Thomas Monjalon wrote:
> 18/03/2021 00:17, Ranjit Menon:
>> Hi Thomas,
>>
>> On 3/16/2021 4:11 PM, Thomas Monjalon wrote:
>>> In Windows probing, the value RTE_PCI_KDRV_NONE was used
>>> instead of RTE_PCI_KDRV_UNKNOWN (mlx case),
>>> and RTE_PCI_KDRV_NIC_UIO (FreeBSD) was re-used
>>> instead of having a new RTE_PCI_KDRV_NET_UIO for Windows NetUIO.
>>
>> Shouldn't the mlx case actually remain RTE_PCI_KDRV_NONE?
>>
>> mlx does not require a UIO-like kernel driver...No? And NONE implies that no
>> kernel driver is used/required.
>> Not sure what is correct here.
>
> No this is a bifurcated model, meaning kernel and userland
> work together. The PCI device is bound to the kernel driver,
> but the driver is not listed because no special treatment is required.
>
>>> While adding the new value RTE_PCI_KDRV_NET_UIO,
>>> the enum of kernel driver categories is annotated.
>>>
>>> Fixes: b762221ac24f ("bus/pci: support Windows with bifurcated drivers")
>>> Fixes: c76ec01b4591 ("bus/pci: support netuio on Windows")
>>> Cc: sta...@dpdk.org
>>>
>>> Signed-off-by: Thomas Monjalon 
>>> ---
>>>  drivers/bus/pci/rte_bus_pci.h | 13 +++--
>>>  drivers/bus/pci/windows/pci.c | 14 +++---
>>>  2 files changed, 14 insertions(+), 13 deletions(-)
>>>
>>> diff --git a/drivers/bus/pci/rte_bus_pci.h b/drivers/bus/pci/rte_bus_pci.h
>>> index fdda046515..3d009cc74b 100644
>>> --- a/drivers/bus/pci/rte_bus_pci.h
>>> +++ b/drivers/bus/pci/rte_bus_pci.h
>>> @@ -52,12 +52,13 @@ TAILQ_HEAD(rte_pci_driver_list, rte_pci_driver);
>>>  struct rte_devargs;
>>>
>>>  enum rte_pci_kernel_driver {
>>> -   RTE_PCI_KDRV_UNKNOWN = 0,
>>> -   RTE_PCI_KDRV_IGB_UIO,
>>> -   RTE_PCI_KDRV_VFIO,
>>> -   RTE_PCI_KDRV_UIO_GENERIC,
>>> -   RTE_PCI_KDRV_NIC_UIO,
>>> -   RTE_PCI_KDRV_NONE,
>>> +   RTE_PCI_KDRV_UNKNOWN = 0,  /* not listed - may be a bifurcated driver */
>>> +   RTE_PCI_KDRV_IGB_UIO,  /* igb_uio for Linux */
>>> +   RTE_PCI_KDRV_VFIO, /* VFIO for Linux */
>>> +   RTE_PCI_KDRV_UIO_GENERIC,  /* uio_generic for Linux */
>>> +   RTE_PCI_KDRV_NIC_UIO,  /* nic_uio for FreeBSD */
>>> +   RTE_PCI_KDRV_NONE, /* error */
>>> +   RTE_PCI_KDRV_NET_UIO,  /* NetUIO for Windows */
>>>  };
>>
>> Any chance we can re-order the enums, so that _NONE and _UNKNOWN are at
>> the top?
>
> No, it would break the ABI.
>
>> This will change the value, and break code where this value was
>> hard-coded. But how likely is that...?
>
> The problem is when loading the new PCI bus driver with an old device driver.
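
The ABI concern in this exchange can be illustrated with a small standalone
sketch. The enum and function names below are hypothetical stand-ins for the
rte_pci_kernel_driver values; the point is only that an already-compiled
device driver keeps the old numeric values baked in, so appending a value is
harmless while re-ordering silently changes what the old binary sees.

```c
/* Values an already-built device driver was compiled against. */
enum kdrv_old {
	OLD_KDRV_UNKNOWN = 0,
	OLD_KDRV_IGB_UIO,
	OLD_KDRV_VFIO,
	OLD_KDRV_UIO_GENERIC,
	OLD_KDRV_NIC_UIO,
	OLD_KDRV_NONE,
};

/* New value appended at the end: existing values keep their numbers. */
enum kdrv_appended {
	APP_KDRV_UNKNOWN = 0,
	APP_KDRV_IGB_UIO,
	APP_KDRV_VFIO,
	APP_KDRV_UIO_GENERIC,
	APP_KDRV_NIC_UIO,
	APP_KDRV_NONE,
	APP_KDRV_NET_UIO,
};

/* Hypothetical re-ordering with UNKNOWN and NONE first: values shift. */
enum kdrv_reordered {
	REORD_KDRV_UNKNOWN = 0,
	REORD_KDRV_NONE,
	REORD_KDRV_IGB_UIO,
	REORD_KDRV_VFIO,
	REORD_KDRV_UIO_GENERIC,
	REORD_KDRV_NIC_UIO,
	REORD_KDRV_NET_UIO,
};

/* Check done inside the old device driver on a value set by the bus driver. */
int
old_driver_sees_vfio(int value_from_bus)
{
	return value_from_bus == OLD_KDRV_VFIO;
}
```

With the appended layout, the old driver still recognizes VFIO. With the
re-ordered layout, it misses VFIO and would treat an igb_uio device as VFIO
instead.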




OK. Thanks for the explanation, Thomas.

ranjit m.



Re: [dpdk-dev] [PATCH v2] bus/pci: fix Windows kernel driver categories

2021-03-18 Thread Ranjit Menon

On 3/18/2021 3:48 AM, Thomas Monjalon wrote:
> In Windows probing, the value RTE_PCI_KDRV_NONE was used
> instead of RTE_PCI_KDRV_UNKNOWN.
> This value covers the mlx case where the kernel driver is in place,
> offering a bifurcated mode to the userspace driver.
> When the kernel driver is listed as unknown,
> there is no special treatment in DPDK probing, contrary to UIO modes.
>
> The value RTE_PCI_KDRV_NIC_UIO (FreeBSD) was re-used
> instead of having a new RTE_PCI_KDRV_NET_UIO for Windows NetUIO.
> While adding the new value RTE_PCI_KDRV_NET_UIO
> (at the end for ABI compatibility),
> the enum of kernel driver categories is annotated.
>
> Fixes: b762221ac24f ("bus/pci: support Windows with bifurcated drivers")
> Fixes: c76ec01b4591 ("bus/pci: support netuio on Windows")
> Cc: sta...@dpdk.org
>
> Signed-off-by: Thomas Monjalon 
> Acked-by: Dmitry Kozlyuk 
> ---
> v2: improve comments and commit message
> ---
>  drivers/bus/pci/rte_bus_pci.h | 13 +++--
>  drivers/bus/pci/windows/pci.c | 14 +++---
>  2 files changed, 14 insertions(+), 13 deletions(-)


Acked-by: Ranjit Menon 


[dpdk-dev] [RFC 0/3] net/virtio: add vdpa device config support

2021-03-18 Thread Maxime Coquelin
This patch adds vDPA device config space requests support.
For now, it only adds MAC address get and set. It may be
extended in next revision to support other configs like
link state.

Regarding the MAC selection strategy, if devargs MAC address
is set by the user and valid, the driver tries to store it
in the device config space, then it reads the MAC address
back from the device config, which will be used. If not set
in devargs or invalid, it tries to read it from the device.
If it fails, a random MAC will be used.
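
The selection strategy described above can be sketched as a small standalone
function. Everything in it is a hypothetical stand-in (the toy_device struct,
the can_get/can_set flags and the enum names are not part of the series); a
NULL devargs pointer represents "not set or invalid", and the random MAC
generation is left to the caller.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6

enum mac_source { MAC_FROM_DEVARGS, MAC_FROM_DEVICE, MAC_RANDOM };

/* Hypothetical stand-in for a vDPA device's config space. */
struct toy_device {
	bool can_get;
	bool can_set;
	uint8_t cfg_mac[MAC_LEN];
};

enum mac_source
select_mac(struct toy_device *dev, const uint8_t *devargs_mac,
	   uint8_t out[MAC_LEN])
{
	if (devargs_mac != NULL) {
		/* Valid devargs MAC: try to store it, then read back. */
		memcpy(out, devargs_mac, MAC_LEN);
		if (dev->can_set)
			memcpy(dev->cfg_mac, devargs_mac, MAC_LEN);
		if (!dev->can_get)
			return MAC_FROM_DEVARGS;
		memcpy(out, dev->cfg_mac, MAC_LEN);
		return memcmp(out, devargs_mac, MAC_LEN) == 0 ?
			MAC_FROM_DEVARGS : MAC_FROM_DEVICE;
	}
	if (dev->can_get) {
		memcpy(out, dev->cfg_mac, MAC_LEN);
		return MAC_FROM_DEVICE;
	}
	/* Nothing usable: the caller is expected to pick a random MAC. */
	return MAC_RANDOM;
}
```

A get-only device (like the simulator case) ends up with the device MAC even
when devargs are given, while a device with neither get nor set support (like
the CX6 case) keeps the devargs MAC or falls back to a random one.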

I'm interested to know your feedback on this strategy.

It has been tested with the vDPA simulator, which only supports
getting the MAC address, and with CX6, which supports neither
getting nor setting the MAC address (and so the devargs or a random
MAC is used). The IFCVF driver seems to support both getting and
setting the MAC; I will try it before the next revision.

Maxime Coquelin (3):
  net/virtio: keep device and frontend features separated
  net/virtio: add device config support to vDPA
  net/virtio: add MAC device config getter and setter

 drivers/net/virtio/virtio_user/vhost.h|  3 +
 drivers/net/virtio/virtio_user/vhost_vdpa.c   | 69 +++
 .../net/virtio/virtio_user/virtio_user_dev.c  | 88 +++
 .../net/virtio/virtio_user/virtio_user_dev.h  |  2 +
 drivers/net/virtio/virtio_user_ethdev.c   | 12 ++-
 5 files changed, 151 insertions(+), 23 deletions(-)

-- 
2.30.2



[dpdk-dev] [RFC 1/3] net/virtio: keep device and frontend features separated

2021-03-18 Thread Maxime Coquelin
This patch is preliminary rework to add support for getting
and setting device's config space.

In order to get or set a device config such as its MAC address,
we need to know whether the device itself supports the feature,
or if it is emulated by the frontend.
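
As a rough illustration of the separation (a sketch, not the driver code): the
struct and function names below are hypothetical, while the two feature bit
numbers are the ones defined by the virtio specification. The negotiable set
is still the union of both masks, but a config space access can now check the
device mask alone.

```c
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_NET_F_MAC	5
#define VIRTIO_NET_F_STATUS	16

/* Hypothetical container for the masks kept separately. */
struct feature_sets {
	uint64_t device_features;	/* offered by the backend device */
	uint64_t frontend_features;	/* emulated by the frontend */
	uint64_t unsupported_features;	/* disabled by configuration */
};

/* Features that may be negotiated with the driver: union of both sides. */
uint64_t
negotiable_features(const struct feature_sets *f)
{
	return (f->device_features | f->frontend_features) &
		~f->unsupported_features;
}

/* True only if the device itself, not the frontend, offers the feature. */
bool
device_implements(const struct feature_sets *f, unsigned int bit)
{
	return ((f->device_features >> bit) & 1) != 0;
}
```

If the two masks were merged into one, device_implements() could not tell a
frontend-emulated STATUS bit from a MAC bit that the device really offers.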

Signed-off-by: Maxime Coquelin 
---
 drivers/net/virtio/virtio_user/virtio_user_dev.c | 10 ++
 drivers/net/virtio/virtio_user_ethdev.c  |  5 +++--
 2 files changed, 5 insertions(+), 10 deletions(-)

diff --git a/drivers/net/virtio/virtio_user/virtio_user_dev.c 
b/drivers/net/virtio/virtio_user/virtio_user_dev.c
index 1b54d55bd8..8757a23f6e 100644
--- a/drivers/net/virtio/virtio_user/virtio_user_dev.c
+++ b/drivers/net/virtio/virtio_user/virtio_user_dev.c
@@ -572,11 +572,7 @@ virtio_user_dev_init(struct virtio_user_dev *dev, char 
*path, int queues,
if (dev->backend_type == VIRTIO_USER_BACKEND_VHOST_USER)
dev->frontend_features |= (1ull << VIRTIO_NET_F_STATUS);
 
-   /*
-* Device features =
-* (frontend_features | backend_features) & ~unsupported_features;
-*/
-   dev->device_features |= dev->frontend_features;
+   dev->frontend_features &= ~dev->unsupported_features;
dev->device_features &= ~dev->unsupported_features;
 
if (rte_mem_event_callback_register(VIRTIO_USER_MEM_EVENT_CLB_NAME,
@@ -940,12 +936,10 @@ virtio_user_dev_server_reconnect(struct virtio_user_dev 
*dev)
return -1;
}
 
-   dev->device_features |= dev->frontend_features;
-
/* unmask vhost-user unsupported features */
dev->device_features &= ~(dev->unsupported_features);
 
-   dev->features &= dev->device_features;
+   dev->features &= (dev->device_features | dev->frontend_features);
 
/* For packed ring, resetting queues is required in reconnection. */
if (virtio_with_packed_queue(hw) &&
diff --git a/drivers/net/virtio/virtio_user_ethdev.c 
b/drivers/net/virtio/virtio_user_ethdev.c
index 1810a54694..bb36316186 100644
--- a/drivers/net/virtio/virtio_user_ethdev.c
+++ b/drivers/net/virtio/virtio_user_ethdev.c
@@ -110,7 +110,8 @@ virtio_user_get_features(struct virtio_hw *hw)
struct virtio_user_dev *dev = virtio_user_get_dev(hw);
 
/* unmask feature bits defined in vhost user protocol */
-   return dev->device_features & VIRTIO_PMD_SUPPORTED_GUEST_FEATURES;
+   return (dev->device_features | dev->frontend_features) &
+   VIRTIO_PMD_SUPPORTED_GUEST_FEATURES;
 }
 
 static void
@@ -118,7 +119,7 @@ virtio_user_set_features(struct virtio_hw *hw, uint64_t 
features)
 {
struct virtio_user_dev *dev = virtio_user_get_dev(hw);
 
-   dev->features = features & dev->device_features;
+   dev->features = features & (dev->device_features | 
dev->frontend_features);
 }
 
 static int
-- 
2.30.2



[dpdk-dev] [RFC 2/3] net/virtio: add device config support to vDPA

2021-03-18 Thread Maxime Coquelin
This patch introduces two virtio-user callbacks to get
and set device's config, and implements them for vDPA
backends.

Signed-off-by: Maxime Coquelin 
---
 drivers/net/virtio/virtio_user/vhost.h  |  3 +
 drivers/net/virtio/virtio_user/vhost_vdpa.c | 69 +
 2 files changed, 72 insertions(+)

diff --git a/drivers/net/virtio/virtio_user/vhost.h 
b/drivers/net/virtio/virtio_user/vhost.h
index c49e88036d..dfbf6be033 100644
--- a/drivers/net/virtio/virtio_user/vhost.h
+++ b/drivers/net/virtio/virtio_user/vhost.h
@@ -79,6 +79,9 @@ struct virtio_user_backend_ops {
int (*set_vring_addr)(struct virtio_user_dev *dev, struct 
vhost_vring_addr *addr);
int (*get_status)(struct virtio_user_dev *dev, uint8_t *status);
int (*set_status)(struct virtio_user_dev *dev, uint8_t status);
+   int (*get_config)(struct virtio_user_dev *dev, uint8_t *data, uint32_t 
off, uint32_t len);
+   int (*set_config)(struct virtio_user_dev *dev, const uint8_t *data, 
uint32_t off,
+   uint32_t len);
int (*enable_qp)(struct virtio_user_dev *dev, uint16_t pair_idx, int 
enable);
int (*dma_map)(struct virtio_user_dev *dev, void *addr, uint64_t iova, 
size_t len);
int (*dma_unmap)(struct virtio_user_dev *dev, void *addr, uint64_t 
iova, size_t len);
diff --git a/drivers/net/virtio/virtio_user/vhost_vdpa.c 
b/drivers/net/virtio/virtio_user/vhost_vdpa.c
index e2d6d3504d..59bc712d48 100644
--- a/drivers/net/virtio/virtio_user/vhost_vdpa.c
+++ b/drivers/net/virtio/virtio_user/vhost_vdpa.c
@@ -41,6 +41,8 @@ struct vhost_vdpa_data {
 #define VHOST_VDPA_GET_DEVICE_ID _IOR(VHOST_VIRTIO, 0x70, __u32)
 #define VHOST_VDPA_GET_STATUS _IOR(VHOST_VIRTIO, 0x71, __u8)
 #define VHOST_VDPA_SET_STATUS _IOW(VHOST_VIRTIO, 0x72, __u8)
+#define VHOST_VDPA_GET_CONFIG _IOR(VHOST_VIRTIO, 0x73, struct 
vhost_vdpa_config)
+#define VHOST_VDPA_SET_CONFIG _IOW(VHOST_VIRTIO, 0x74, struct 
vhost_vdpa_config)
 #define VHOST_VDPA_SET_VRING_ENABLE _IOW(VHOST_VIRTIO, 0x75, struct 
vhost_vring_state)
 #define VHOST_SET_BACKEND_FEATURES _IOW(VHOST_VIRTIO, 0x25, __u64)
 #define VHOST_GET_BACKEND_FEATURES _IOR(VHOST_VIRTIO, 0x26, __u64)
@@ -65,6 +67,12 @@ struct vhost_iotlb_msg {
 
 #define VHOST_IOTLB_MSG_V2 0x2
 
+struct vhost_vdpa_config {
+   uint32_t off;
+   uint32_t len;
+   uint8_t buf[0];
+};
+
 struct vhost_msg {
uint32_t type;
uint32_t reserved;
@@ -440,6 +448,65 @@ vhost_vdpa_set_status(struct virtio_user_dev *dev, uint8_t 
status)
return vhost_vdpa_ioctl(data->vhostfd, VHOST_VDPA_SET_STATUS, &status);
 }
 
+static int
+vhost_vdpa_get_config(struct virtio_user_dev *dev, uint8_t *data, uint32_t 
off, uint32_t len)
+{
+   struct vhost_vdpa_data *vdpa_data = dev->backend_data;
+   struct vhost_vdpa_config *config;
+   int ret = 0;
+
+   config = malloc(sizeof(*config) + len);
+   if (!config) {
+   PMD_DRV_LOG(ERR, "Failed to allocate vDPA config data\n");
+   return -1;
+   }
+
+   config->off = off;
+   config->len = len;
+
+   ret = vhost_vdpa_ioctl(vdpa_data->vhostfd, VHOST_VDPA_GET_CONFIG, 
config);
+   if (ret) {
+   PMD_DRV_LOG(ERR, "Failed to get vDPA config (offset %x, len 
%x)\n", off, len);
+   ret = -1;
+   goto out;
+   }
+
+   memcpy(data, config->buf, len);
+out:
+   free(config);
+
+   return ret;
+}
+
+static int
+vhost_vdpa_set_config(struct virtio_user_dev *dev, const uint8_t *data, 
uint32_t off, uint32_t len)
+{
+   struct vhost_vdpa_data *vdpa_data = dev->backend_data;
+   struct vhost_vdpa_config *config;
+   int ret = 0;
+
+   config = malloc(sizeof(*config) + len);
+   if (!config) {
+   PMD_DRV_LOG(ERR, "Failed to allocate vDPA config data\n");
+   return -1;
+   }
+
+   config->off = off;
+   config->len = len;
+
+   memcpy(config->buf, data, len);
+
+   ret = vhost_vdpa_ioctl(vdpa_data->vhostfd, VHOST_VDPA_SET_CONFIG, 
config);
+   if (ret) {
+   PMD_DRV_LOG(ERR, "Failed to set vDPA config (offset %x, len 
%x)\n", off, len);
+   ret = -1;
+   }
+
+   free(config);
+
+   return ret;
+}
+
 /**
  * Set up environment to talk with a vhost vdpa backend.
  *
@@ -559,6 +626,8 @@ struct virtio_user_backend_ops virtio_ops_vdpa = {
.set_vring_addr = vhost_vdpa_set_vring_addr,
.get_status = vhost_vdpa_get_status,
.set_status = vhost_vdpa_set_status,
+   .get_config = vhost_vdpa_get_config,
+   .set_config = vhost_vdpa_set_config,
.enable_qp = vhost_vdpa_enable_queue_pair,
.dma_map = vhost_vdpa_dma_map_batch,
.dma_unmap = vhost_vdpa_dma_unmap_batch,
-- 
2.30.2
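
For readers unfamiliar with the allocation pattern used by the get/set helpers
in this patch, the sketch below shows the same idea in isolation: a fixed
header and a variable-length payload allocated as one block. The names are
hypothetical, and the sketch uses a C99 flexible array member (buf[]) where
the patch uses the zero-length array form (buf[0]).

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Header followed by a variable-length payload (hypothetical name). */
struct cfg_request {
	uint32_t off;
	uint32_t len;
	uint8_t buf[];	/* C99 flexible array member */
};

/*
 * Allocate one block for header + payload. data may be NULL for a read
 * request, in which case the payload is zeroed. Caller frees with free().
 */
struct cfg_request *
cfg_request_new(uint32_t off, const uint8_t *data, uint32_t len)
{
	struct cfg_request *req = malloc(sizeof(*req) + len);

	if (req == NULL)
		return NULL;
	req->off = off;
	req->len = len;
	if (data != NULL)
		memcpy(req->buf, data, len);
	else
		memset(req->buf, 0, len);
	return req;
}
```

Keeping header and payload contiguous is what allows a single pointer to be
handed to the ioctl.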



[dpdk-dev] [RFC 3/3] net/virtio: add MAC device config getter and setter

2021-03-18 Thread Maxime Coquelin
This patch uses the new device config ops to get and set
the MAC address if supported.

If a valid MAC address is passed as devarg of the
Virtio-user PMD, the driver will try to store it in the
device config space. Otherwise the one provided in
the device config space will be used, if available.

Signed-off-by: Maxime Coquelin 
---
 .../net/virtio/virtio_user/virtio_user_dev.c  | 78 ---
 .../net/virtio/virtio_user/virtio_user_dev.h  |  2 +
 drivers/net/virtio/virtio_user_ethdev.c   |  7 +-
 3 files changed, 74 insertions(+), 13 deletions(-)

diff --git a/drivers/net/virtio/virtio_user/virtio_user_dev.c 
b/drivers/net/virtio/virtio_user/virtio_user_dev.c
index 8757a23f6e..61517692b3 100644
--- a/drivers/net/virtio/virtio_user/virtio_user_dev.c
+++ b/drivers/net/virtio/virtio_user/virtio_user_dev.c
@@ -259,20 +259,76 @@ int virtio_user_stop_device(struct virtio_user_dev *dev)
return -1;
 }
 
-static inline void
-parse_mac(struct virtio_user_dev *dev, const char *mac)
+int
+virtio_user_dev_set_mac(struct virtio_user_dev *dev)
 {
-   struct rte_ether_addr tmp;
+   int ret = 0;
 
-   if (!mac)
-   return;
+   if (!(dev->device_features & (1ULL << VIRTIO_NET_F_MAC)))
+   return -ENOTSUP;
+
+   if (!dev->ops->set_config)
+   return -ENOTSUP;
+
+   ret = dev->ops->set_config(dev, dev->mac_addr,
+   offsetof(struct virtio_net_config, mac),
+   RTE_ETHER_ADDR_LEN);
+   if (ret)
+   PMD_DRV_LOG(ERR, "(%s) Failed to set MAC address in device\n", 
dev->path);
+
+   return ret;
+}
+
+int
+virtio_user_dev_get_mac(struct virtio_user_dev *dev)
+{
+   int ret = 0;
+
+   if (!(dev->device_features & (1ULL << VIRTIO_NET_F_MAC)))
+   return -ENOTSUP;
+
+   if (!dev->ops->get_config)
+   return -ENOTSUP;
+
+   ret = dev->ops->get_config(dev, dev->mac_addr,
+   offsetof(struct virtio_net_config, mac),
+   RTE_ETHER_ADDR_LEN);
+   if (ret)
+   PMD_DRV_LOG(ERR, "(%s) Failed to get MAC address from 
device\n", dev->path);
 
-   if (rte_ether_unformat_addr(mac, &tmp) == 0) {
-   memcpy(dev->mac_addr, &tmp, RTE_ETHER_ADDR_LEN);
+   return ret;
+}
+
+static void
+virtio_user_dev_init_mac(struct virtio_user_dev *dev, const char *mac)
+{
+   struct rte_ether_addr cmdline_mac;
+   int ret;
+
+   if (mac && rte_ether_unformat_addr(mac, &cmdline_mac) == 0) {
+   /*
+* MAC address was passed from command-line, try to store
+* it in the device if it supports it. Otherwise try to use
+* the device one.
+*/
+   memcpy(dev->mac_addr, &cmdline_mac, RTE_ETHER_ADDR_LEN);
dev->mac_specified = 1;
+
+   /* Setting MAC may fail, continue to get the device one in this 
case */
+   virtio_user_dev_set_mac(dev);
+   ret = virtio_user_dev_get_mac(dev);
+   if (ret == -ENOTSUP)
+   return;
+
+   if (memcmp(&cmdline_mac, dev->mac_addr, RTE_ETHER_ADDR_LEN))
+   PMD_DRV_LOG(INFO, "(%s) Device MAC update failed\n", 
dev->path);
} else {
-   /* ignore the wrong mac, use random mac */
-   PMD_DRV_LOG(ERR, "wrong format of mac: %s", mac);
+   ret = virtio_user_dev_get_mac(dev);
+   if (ret)
+   PMD_DRV_LOG(ERR, "(%s) No valid MAC in devargs or 
device, use random\n",
+   dev->path);
+   else
+   dev->mac_specified = 1;
}
 }
 
@@ -508,8 +564,6 @@ virtio_user_dev_init(struct virtio_user_dev *dev, char 
*path, int queues,
dev->unsupported_features = 0;
dev->backend_type = backend_type;
 
-   parse_mac(dev, mac);
-
if (*ifname) {
dev->ifname = *ifname;
*ifname = NULL;
@@ -537,6 +591,8 @@ virtio_user_dev_init(struct virtio_user_dev *dev, char 
*path, int queues,
return -1;
}
 
+   virtio_user_dev_init_mac(dev, mac);
+
if (!mrg_rxbuf)
dev->unsupported_features |= (1ull << VIRTIO_NET_F_MRG_RXBUF);
 
diff --git a/drivers/net/virtio/virtio_user/virtio_user_dev.h 
b/drivers/net/virtio/virtio_user/virtio_user_dev.h
index 8a62f7ea79..03bcf95970 100644
--- a/drivers/net/virtio/virtio_user/virtio_user_dev.h
+++ b/drivers/net/virtio/virtio_user/virtio_user_dev.h
@@ -78,6 +78,8 @@ uint8_t virtio_user_handle_mq(struct virtio_user_dev *dev, 
uint16_t q_pairs);
 int virtio_user_dev_set_status(struct virtio_user_dev *dev, uint8_t status);
 int virtio_user_dev_update_status(struct virtio_user_dev *dev);
 int virtio_user_dev_update_link_state(struct virtio_user_dev *dev);
+int virtio_user_dev_set_mac(struct virtio_user_dev *dev);
+int virtio_user_dev_get_

[dpdk-dev] [PATCH 1/1] net/bnxt: fix transmit length hint threshold

2021-03-18 Thread Lance Richardson
Use correct threshold when selecting "greater than or equal to
2K" length hint.

Fixes: 6eb3cc2294fd ("net/bnxt: add initial Tx code")
Cc: sta...@dpdk.org

Signed-off-by: Lance Richardson 
Reviewed-by: Ajit Kumar Khaparde 
Reviewed-by: Somnath Kotur 
---
 drivers/net/bnxt/bnxt_txr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/bnxt/bnxt_txr.c b/drivers/net/bnxt/bnxt_txr.c
index 65355fb040..27459960de 100644
--- a/drivers/net/bnxt/bnxt_txr.c
+++ b/drivers/net/bnxt/bnxt_txr.c
@@ -187,7 +187,7 @@ static uint16_t bnxt_start_xmit(struct rte_mbuf *tx_pkt,
txbd->flags_type |= TX_BD_SHORT_FLAGS_COAL_NOW;
txbd->flags_type |= TX_BD_LONG_FLAGS_NO_CMPL;
txbd->len = tx_pkt->data_len;
-   if (tx_pkt->pkt_len >= 2014)
+   if (tx_pkt->pkt_len >= 2048)
txbd->flags_type |= TX_BD_LONG_FLAGS_LHINT_GTE2K;
else
txbd->flags_type |= lhint_arr[tx_pkt->pkt_len >> 9];
-- 
2.25.1
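
The effect of the threshold change can be checked with a small standalone
sketch. The enum names and the table contents are assumptions inferred from
the `pkt_len >> 9` index in the diff (512-byte buckets), not copied from the
driver, and the threshold is passed in as a parameter so the old and new
values can be compared.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the TX_BD_LONG_FLAGS_LHINT_* values. */
enum lhint { LHINT_LT512, LHINT_LT1K, LHINT_LT2K, LHINT_GTE2K };

/* One entry per 512-byte bucket below 2048 (assumed layout). */
static const enum lhint lhint_arr[4] = {
	LHINT_LT512, LHINT_LT1K, LHINT_LT2K, LHINT_LT2K,
};

enum lhint
pick_lhint(uint32_t pkt_len, uint32_t gte2k_threshold)
{
	/* Above 2048 the table index below would run out of bounds. */
	assert(gte2k_threshold <= 2048);

	if (pkt_len >= gte2k_threshold)
		return LHINT_GTE2K;
	return lhint_arr[pkt_len >> 9];
}
```

With the old threshold of 2014, packets of 2014 to 2047 bytes get the
"greater than or equal to 2K" hint although they are shorter than 2K. With
2048 they fall into the "less than 2K" bucket.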



[dpdk-dev] [PATCH 1/1] net/bnxt: fix Rx buffer posting

2021-03-18 Thread Lance Richardson
Remove early buffer posting logic from burst receive loop to address
several issues:
  - Posting receive descriptors without first posting completion
    entries risks overflowing the completion queue.
  - Posting receive descriptors without updating rx_raw_prod
    creates the possibility that the receive descriptor doorbell
    can be written twice with the same value.
  - Having this logic in the inner descriptor processing loop
    can impact performance.

Fixes: 637e34befd9c ("net/bnxt: optimize Rx processing")
Fixes: 04067844a3e9 ("net/bnxt: reduce CQ queue size without aggregation ring")
Cc: sta...@dpdk.org

Signed-off-by: Lance Richardson 
Reviewed-by: Ajit Kumar Khaparde 
---
 drivers/net/bnxt/bnxt_rxr.c | 3 ---
 drivers/net/bnxt/bnxt_rxr.h | 2 --
 2 files changed, 5 deletions(-)

diff --git a/drivers/net/bnxt/bnxt_rxr.c b/drivers/net/bnxt/bnxt_rxr.c
index c72545ada7..7179c6cb30 100644
--- a/drivers/net/bnxt/bnxt_rxr.c
+++ b/drivers/net/bnxt/bnxt_rxr.c
@@ -1018,9 +1018,6 @@ uint16_t bnxt_recv_pkts(void *rx_queue, struct rte_mbuf 
**rx_pkts,
raw_cons = NEXT_RAW_CMP(raw_cons);
if (nb_rx_pkts == nb_pkts || nb_rep_rx_pkts == nb_pkts || evt)
break;
-   /* Post some Rx buf early in case of larger burst processing */
-   if (nb_rx_pkts == BNXT_RX_POST_THRESH)
-   bnxt_db_write(&rxr->rx_db, rxr->rx_raw_prod);
}
 
cpr->cp_raw_cons = raw_cons;
diff --git a/drivers/net/bnxt/bnxt_rxr.h b/drivers/net/bnxt/bnxt_rxr.h
index a6fdd7767a..b43256e03e 100644
--- a/drivers/net/bnxt/bnxt_rxr.h
+++ b/drivers/net/bnxt/bnxt_rxr.h
@@ -41,8 +41,6 @@ static inline uint16_t bnxt_tpa_start_agg_id(struct bnxt *bp,
(((cmp)->agg_bufs_v1 & RX_PKT_CMPL_AGG_BUFS_MASK) >> \
RX_PKT_CMPL_AGG_BUFS_SFT)
 
-#define BNXT_RX_POST_THRESH 32
-
 /* Number of descriptors to process per inner loop in vector mode. */
 #define RTE_BNXT_DESCS_PER_LOOP 4U
 
-- 
2.25.1



[dpdk-dev] [PATCH 1/1] net/bnxt: fix handling of null flow mask

2021-03-18 Thread Lance Richardson
When the mask field of an rte_flow pattern item is NULL,
the default mask for that item type should be used.
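
The fallback described above can be sketched in isolation. This is a
minimal illustration, assuming simplified stand-in types: demo_item_udp,
demo_flow_item, demo_item_udp_mask and demo_udp_mask are hypothetical
names, not the DPDK definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a UDP pattern item; not the DPDK struct. */
struct demo_item_udp {
	uint16_t src_port;
	uint16_t dst_port;
};

/* Hypothetical stand-in for a flow pattern item. */
struct demo_flow_item {
	const void *spec; /* values to match, may be NULL */
	const void *mask; /* bits of spec that matter, may be NULL */
};

/* Assumed default mask for this item type: match both ports exactly. */
static const struct demo_item_udp demo_item_udp_mask = {
	.src_port = 0xffff,
	.dst_port = 0xffff,
};

/* Return the caller's mask when present, else the type's default. */
static const struct demo_item_udp *
demo_udp_mask(const struct demo_flow_item *item)
{
	if (item->mask != NULL)
		return item->mask;
	return &demo_item_udp_mask;
}

/* Small self-check covering both branches. */
void
demo_udp_mask_check(void)
{
	struct demo_item_udp spec = { .src_port = 53, .dst_port = 53 };
	struct demo_item_udp only_dst = { .src_port = 0, .dst_port = 0xffff };
	struct demo_flow_item no_mask = { .spec = &spec, .mask = NULL };
	struct demo_flow_item with_mask = { .spec = &spec, .mask = &only_dst };

	assert(demo_udp_mask(&no_mask) == &demo_item_udp_mask);
	assert(demo_udp_mask(&with_mask) == &only_dst);
}
```

Checking spec separately from mask keeps an item that has a spec but no
mask from being skipped, which is the case this patch addresses.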

Fixes: 5ef3b79fdfe6 ("net/bnxt: support flow filter ops")
Cc: sta...@dpdk.org

Signed-off-by: Lance Richardson 
---
 drivers/net/bnxt/bnxt_flow.c | 47 +++-
 1 file changed, 36 insertions(+), 11 deletions(-)

diff --git a/drivers/net/bnxt/bnxt_flow.c b/drivers/net/bnxt/bnxt_flow.c
index a8f5d91fc4..e3906b4779 100644
--- a/drivers/net/bnxt/bnxt_flow.c
+++ b/drivers/net/bnxt/bnxt_flow.c
@@ -188,11 +188,15 @@ bnxt_validate_and_parse_flow_type(struct bnxt *bp,
PMD_DRV_LOG(DEBUG, "Parse inner header\n");
break;
case RTE_FLOW_ITEM_TYPE_ETH:
-   if (!item->spec || !item->mask)
+   if (!item->spec)
break;
 
eth_spec = item->spec;
-   eth_mask = item->mask;
+
+   if (item->mask)
+   eth_mask = item->mask;
+   else
+   eth_mask = &rte_flow_item_eth_mask;
 
/* Source MAC address mask cannot be partially set.
 * Should be All 0's or all 1's.
@@ -281,7 +285,12 @@ bnxt_validate_and_parse_flow_type(struct bnxt *bp,
break;
case RTE_FLOW_ITEM_TYPE_VLAN:
vlan_spec = item->spec;
-   vlan_mask = item->mask;
+
+   if (item->mask)
+   vlan_mask = item->mask;
+   else
+   vlan_mask = &rte_flow_item_vlan_mask;
+
if (en & en_ethertype) {
rte_flow_error_set(error, EINVAL,
   RTE_FLOW_ERROR_TYPE_ITEM,
@@ -324,11 +333,15 @@ bnxt_validate_and_parse_flow_type(struct bnxt *bp,
case RTE_FLOW_ITEM_TYPE_IPV4:
/* If mask is not involved, we could use EM filters. */
ipv4_spec = item->spec;
-   ipv4_mask = item->mask;
 
-   if (!item->spec || !item->mask)
+   if (!item->spec)
break;
 
+   if (item->mask)
+   ipv4_mask = item->mask;
+   else
+   ipv4_mask = &rte_flow_item_ipv4_mask;
+
/* Only IP DST and SRC fields are maskable. */
if (ipv4_mask->hdr.version_ihl ||
ipv4_mask->hdr.type_of_service ||
@@ -385,11 +398,15 @@ bnxt_validate_and_parse_flow_type(struct bnxt *bp,
break;
case RTE_FLOW_ITEM_TYPE_IPV6:
ipv6_spec = item->spec;
-   ipv6_mask = item->mask;
 
-   if (!item->spec || !item->mask)
+   if (!item->spec)
break;
 
+   if (item->mask)
+   ipv6_mask = item->mask;
+   else
+   ipv6_mask = &rte_flow_item_ipv6_mask;
+
/* Only IP DST and SRC fields are maskable. */
if (ipv6_mask->hdr.vtc_flow ||
ipv6_mask->hdr.payload_len ||
@@ -437,11 +454,15 @@ bnxt_validate_and_parse_flow_type(struct bnxt *bp,
break;
case RTE_FLOW_ITEM_TYPE_TCP:
tcp_spec = item->spec;
-   tcp_mask = item->mask;
 
-   if (!item->spec || !item->mask)
+   if (!item->spec)
break;
 
+   if (item->mask)
+   tcp_mask = item->mask;
+   else
+   tcp_mask = &rte_flow_item_tcp_mask;
+
/* Check TCP mask. Only DST & SRC ports are maskable */
if (tcp_mask->hdr.sent_seq ||
tcp_mask->hdr.recv_ack ||
@@ -482,11 +503,15 @@ bnxt_validate_and_parse_flow_type(struct bnxt *bp,
break;
case RTE_FLOW_ITEM_TYPE_UDP:
udp_spec = item->spec;
-   udp_mask = item->mask;
 
-   if (!item->spec || !item->mask)
+   if (!item->spec)
break;
 
+   if (item->mask)
+   udp_mask = item->mask;
+   else
+   udp_mask = &rte_flow_item_udp_mask;
+
if (udp_mask->hdr.dgram_len ||
udp_mask->hdr.dgram_cksum) {

[dpdk-dev] [PATCH v5 0/8] features and bugfixes for hns3

2021-03-18 Thread Min Hu (Connor)
This series adds four features according to the 21.05 roadmap, along
with two cleanups.

Chengchang Tang (1):
  net/hns3: support for outer UDP cksum

Chengwen Feng (1):
  net/hns3: support runtime config to select IO burst func

Hongbo Zheng (4):
  net/hns3: adjust the format of RAS related structures
  net/hns3: delete redundant xstats RAS statistics
  net/hns3: support query Tx descriptor status
  net/hns3: support query Rx descriptor status

Min Hu (Connor) (2):
  net/hns3: support imissed stats for PF/VF
  net/hns3: support oerrors stats in PF

 doc/guides/nics/features/hns3.ini  |2 +
 doc/guides/nics/features/hns3_vf.ini   |2 +
 doc/guides/nics/hns3.rst   |   19 +
 doc/guides/rel_notes/release_21_05.rst |4 +
 drivers/net/hns3/hns3_cmd.c|3 +
 drivers/net/hns3/hns3_cmd.h|   13 +
 drivers/net/hns3/hns3_ethdev.c |   86 +-
 drivers/net/hns3/hns3_ethdev.h |   76 +-
 drivers/net/hns3/hns3_ethdev_vf.c  |   18 +
 drivers/net/hns3/hns3_intr.c   | 2127 ++--
 drivers/net/hns3/hns3_regs.h   |2 +
 drivers/net/hns3/hns3_rxtx.c   |  203 ++-
 drivers/net/hns3/hns3_rxtx.h   |6 +-
 drivers/net/hns3/hns3_rxtx_vec_sve.c   |5 +-
 drivers/net/hns3/hns3_stats.c  |  394 +++---
 drivers/net/hns3/hns3_stats.h  |2 +-
 16 files changed, 1838 insertions(+), 1124 deletions(-)

-- 
2.7.4


