Re: [dpdk-dev] [PATCH] net/mlx5: fix link state on device start

2018-01-24 Thread Nélio Laranjeiro
Hi Shahaf,

On Tue, Jan 23, 2018 at 07:01:06PM +0200, Shahaf Shuler wrote:
> Following commit c7bf62255edf ("net/mlx5: fix handling link status event")
> the link state must be up in order for the burst function to be set on
> the device ops.
> 
> As the link may take time to move between down and up state it is
> possible the rte_eth_dev_start call will return with wrong burst
> function (either null or the empty burst function).
> 
> Fixing it by forcing the link to be up before returning from device
> start. In case the link is still not up after 5 seconds fail the function.
> 
> Fixes: c7bf62255edf ("net/mlx5: fix handling link status event")
> Cc: ys...@mellanox.com
> 
> Signed-off-by: Shahaf Shuler 
> ---
>  drivers/net/mlx5/mlx5.h |  1 +
>  drivers/net/mlx5/mlx5_defs.h|  3 +++
>  drivers/net/mlx5/mlx5_ethdev.c  | 27 +++
>  drivers/net/mlx5/mlx5_trigger.c |  8 +++-
>  4 files changed, 38 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
> index a7ec607c3..30b737f76 100644
> --- a/drivers/net/mlx5/mlx5.h
> +++ b/drivers/net/mlx5/mlx5.h
> @@ -246,6 +246,7 @@ int mlx5_dev_configure(struct rte_eth_dev *);
>  void mlx5_dev_infos_get(struct rte_eth_dev *, struct rte_eth_dev_info *);
>  const uint32_t *mlx5_dev_supported_ptypes_get(struct rte_eth_dev *dev);
>  int priv_link_update(struct priv *, int);
> +int priv_force_link_status_change(struct priv *, int);
>  int mlx5_link_update(struct rte_eth_dev *, int);
>  int mlx5_dev_set_mtu(struct rte_eth_dev *, uint16_t);
>  int mlx5_dev_get_flow_ctrl(struct rte_eth_dev *, struct rte_eth_fc_conf *);
> diff --git a/drivers/net/mlx5/mlx5_defs.h b/drivers/net/mlx5/mlx5_defs.h
> index a71db281d..57f295c58 100644
> --- a/drivers/net/mlx5/mlx5_defs.h
> +++ b/drivers/net/mlx5/mlx5_defs.h
> @@ -110,4 +110,7 @@
>  /* Supported RSS */
>  #define MLX5_RSS_HF_MASK (~(ETH_RSS_IP | ETH_RSS_UDP | ETH_RSS_TCP))
>  
> +/* Maximum number of attempts to query link status before giving up. */
> +#define MLX5_MAX_LINK_QUERY_ATTEMPTS 5
> +
>  #endif /* RTE_PMD_MLX5_DEFS_H_ */
> diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
> index 6624888c9..523865d15 100644
> --- a/drivers/net/mlx5/mlx5_ethdev.c
> +++ b/drivers/net/mlx5/mlx5_ethdev.c
> @@ -966,6 +966,33 @@ priv_link_update(struct priv *priv, int wait_to_complete)
>  }
>  
>  /**
> + * Querying the link status till it changes to the desired state.
> + * Number of query attempts is bounded by MLX5_MAX_LINK_QUERY_ATTEMPTS.
> + *
> + * @param priv
> + *   Pointer to private structure.
> + * @param status
> + *   Link desired status.
> + *
> + * @return
> + *   0 on success, -1 on error.
> + */
> +int
> +priv_force_link_status_change(struct priv *priv, int status)
> +{
> + int try = 0;
> +
> + while (try < MLX5_MAX_LINK_QUERY_ATTEMPTS) {
> + priv_link_update(priv, 0);
> + if (priv->dev->data->dev_link.link_status == status)
> + return 0;
> + try++;
> + sleep(1);
> + }
> + return -1;
> +}
> +
> +/**
>   * DPDK callback to retrieve physical link information.
>   *
>   * @param dev
> diff --git a/drivers/net/mlx5/mlx5_trigger.c b/drivers/net/mlx5/mlx5_trigger.c
> index 827db2e7e..c5429e182 100644
> --- a/drivers/net/mlx5/mlx5_trigger.c
> +++ b/drivers/net/mlx5/mlx5_trigger.c
> @@ -166,7 +166,13 @@ mlx5_dev_start(struct rte_eth_dev *dev)
>   priv_xstats_init(priv);
>   /* Update link status and Tx/Rx callbacks for the first time. */
>   memset(&dev->data->dev_link, 0, sizeof(struct rte_eth_link));
> - priv_link_update(priv, 1);
> + INFO("Forcing port %u link to be up", dev->data->port_id);
> + err = priv_force_link_status_change(priv, ETH_LINK_UP);
> + if (err) {
> + DEBUG("Failed to set port %u link to be up",
> +   dev->data->port_id);
> + goto error;
> + }
>   priv_dev_interrupt_handler_install(priv, dev);
>   priv_unlock(priv);
>   return 0;
> -- 
> 2.12.0

According to mlx5_dev_start() documentation function: 
 * @return
 *   0 on success, negative errno value on failure.

This code is returning -1 in case of error, which means: 
 EPERM   1  /* Operation not permitted */

which is a wrong value.

Why not returning an errno in your priv function with an EBUSY or EAGAIN
which is more accurate?

Regards,

-- 
Nélio Laranjeiro
6WIND


Re: [dpdk-dev] [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership

2018-01-24 Thread Thomas Monjalon
23/01/2018 22:18, Ananyev, Konstantin:
> > 
> > 23/01/2018 16:18, Ananyev, Konstantin:
> > > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Ananyev, Konstantin
> > > > From: Thomas Monjalon [mailto:tho...@monjalon.net]
> > > > > 23/01/2018 14:34, Ananyev, Konstantin:
> > > > > > If that' s the use case, then I think you need to set device 
> > > > > > ownership at creation time -
> > > > > > inside dev_allocate().
> > > > > > Again that would avoid such racing conditions inside testpmd.
> > > > >
> > > > > The devices must be allocated at a low level layer.
> > > >
> > > > No one arguing about that.
> > > > But we can provide owner id information to the low level.
> > 
> > Sorry, you did not get it.
> 
> Might be.
> 
> > We cannot provide owner id at the low level
> > because it is not yet decided who will be the owner
> > before the port is allocated.
> 
> Why is that?
> What prevents us decide who will manage that device *before* allocating port 
> of it?
> IMO we do have all needed information at that stage.

We don't have the information.
It is a new device, it is matched by a driver which allocates a port.
I don't see where the higher level can interact here.
And even if you manage a trick, the higher level needs to read the port
informations to decide the ownership.

> > > > > When a new device appears (hotplug), an ethdev port should be 
> > > > > allocated
> > > > > automatically if it passes the whitelist/blacklist policy test.
> > > > > Then we must decide who will manage this device.
> > > > > I suggest notifying the DPDK libs first.
> > > > > So a DPDK lib or PMD like failsafe can have the priority to take the
> > > > > ownership in its notification callback.
> > > >
> > > > Possible, but seems a bit overcomplicated.
> > > > Why not just:
> > > >
> > > > Have a global variable process_default_owner_id, that would be init 
> > > > once at startup.
> > > > Have an LTS variable default_owner_id.
> > > > It will be used by rte_eth_dev_allocate() caller can set dev->owner_id 
> > > > at creation time,
> > > > so port allocation and setting ownership - will be an atomic operation.
> > > > At the exit rte_eth_dev_allocate() will always reset default_owner_id=0:
> > > >
> > > > rte_eth_dev_allocate(...)
> > > > {
> > > >lock(owner_lock);
> > > >
> > > >owner = RTE_PER_LCORE(default_owner_id);
> > > >if (owner == 0)
> > > >owner = process_default_owner_id;
> > > >   set_owner(port, ..., owner);
> > > >  unlock(owner_lock);
> > > >  RTE_PER_LCORE(default_owner_id) = 0;
> > >
> > > Or probably better to leave default_owner_id reset to the caller.
> > > Another thing - we can use same LTS variable in all control ops to
> > > allow/disallow changing of port configuration based on ownership.
> > > Konstantin
> > >
> > > > }
> > > >
> > > > So callers who don't need any special ownership - don't need to do 
> > > > anything.
> > > > Special callers (like failsafe) can set default_owenr_id just before 
> > > > calling hotplug
> > > > handling routine.
> > 
> > No, hotplug will not be a routine.
> > I am talking about real hotplug, like a device which appears in the VM.
> > This new device must be handled by EAL and probed automatically if
> > comply with whitelist/blacklist policy given by the application or user.
> > Real hotplug is asynchronous.
> 
> By 'asynchronous' here you mean it would be handled in the EAL interrupt 
> thread
> or something different?

Yes, we receive an hotplug event which is processed in the event thread.

> Anyway, I suppose  you do need a function inside DPDK that will do the actual 
> work in response
> on hotplug event, right?

Yes

> That's what I refer to as 'hotplug routine' above.
> 
> > We will just receive notifications that it appeared.
> > 
> > Note: there is some temporary code in failsafe to manage some hotplug.
> > This code must be removed when it will be properly handled in EAL.
> 
> Ok, if it is just a temporary code, that would be removed soon -
> then it definitely seems wrong to modify tespmd (or any other user app)
> to comply with that temporary solution.

It will be modified when EAL hotplug will be implemented.

However, the ownership issue will be the same:
we don't know the owner when allocating a port.



[dpdk-dev] [PATCH v2 0/4] fix VF RX queue interrupt enabling

2018-01-24 Thread Wenzhuo Lu
If multiple interrupt not supported, the driver should not try to enable RX 
queue interrupt. Then APP can know the RX queue interrupt is not enabled and 
only choose the polling mode.

v2:
- AVF acts as not configured RX queue interrupt if not allowed.

Wenzhuo Lu (4):
  net/i40e: fix VF RX queue interrupt enabling
  net/ixgbe: fix VF RX queue interrupt enabling
  net/e1000: fix VF RX queue interrupt enabling
  net/avf: fix VF RX queue interrupt enabling

 drivers/net/avf/avf_ethdev.c  | 6 --
 drivers/net/e1000/igb_ethdev.c| 3 ++-
 drivers/net/i40e/i40e_ethdev_vf.c | 3 ++-
 drivers/net/ixgbe/ixgbe_ethdev.c  | 3 ++-
 4 files changed, 10 insertions(+), 5 deletions(-)

-- 
1.9.3



Re: [dpdk-dev] [PATCH 2/5] net/mlx5: fix secondary process mempool registration

2018-01-24 Thread Nélio Laranjeiro
Hi Shahaf,

On Tue, Jan 23, 2018 at 07:08:20PM +0200, Shahaf Shuler wrote:
> Secondary process is not allowed to register mempools on the flight.
> 
> The code will return invalid memory key for such case.
> 
> Fixes: 87ec44ce1651 ("net/mlx5: add operations for secondary process")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Shahaf Shuler 
> Signed-off-by: Xueming Li 
> ---
>  doc/guides/nics/mlx5.rst |  7 ++-
>  drivers/net/mlx5/mlx5_rxtx.h | 17 ++---
>  2 files changed, 20 insertions(+), 4 deletions(-)
> 
> diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
> index bdc2216c0..2f860402f 100644
> --- a/doc/guides/nics/mlx5.rst
> +++ b/doc/guides/nics/mlx5.rst
> @@ -106,7 +106,12 @@ Limitations
>  
>  - Inner RSS for VXLAN frames is not supported yet.
>  - Hardware checksum RX offloads for VXLAN inner header are not supported yet.
> -- Forked secondary process not supported.
> +- For secondary process:
> +
> +  - Forked secondary process not supported.
> +  - All mempools must be initialized before rte_eth_dev_start().
> +  - Number of mempools must less than CONFIG_RTE_LIBRTE_MLX5_TX_MP_CACHE

Why such limitation?  Registering a new memory is independent of
searching a new one when the cache is too small.

>  - Flow pattern without any specific vlan will match for vlan packets as well:
>  
>When VLAN spec is not specified in the pattern, the matching rule will be 
> created with VLAN as a wild card.
> diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
> index a63364d79..79cdfc793 100644
> --- a/drivers/net/mlx5/mlx5_rxtx.h
> +++ b/drivers/net/mlx5/mlx5_rxtx.h
> @@ -550,6 +550,7 @@ mlx5_tx_mb2mr(struct mlx5_txq_data *txq, struct rte_mbuf 
> *mb)
>   uint16_t i = txq->mr_cache_idx;
>   uintptr_t addr = rte_pktmbuf_mtod(mb, uintptr_t);
>   struct mlx5_mr *mr;
> + struct rte_mempool *mp;
>  
>   assert(i < RTE_DIM(txq->mp2mr));
>   if (likely(txq->mp2mr[i]->start <= addr && txq->mp2mr[i]->end >= addr))
> @@ -563,14 +564,24 @@ mlx5_tx_mb2mr(struct mlx5_txq_data *txq, struct 
> rte_mbuf *mb)
>   if (txq->mp2mr[i]->start <= addr &&
>   txq->mp2mr[i]->end >= addr) {
>   assert(txq->mp2mr[i]->lkey != (uint32_t)-1);
> - assert(rte_cpu_to_be_32(txq->mp2mr[i]->mr->lkey) ==
> -txq->mp2mr[i]->lkey);
> + if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
> + assert(rte_cpu_to_be_32(txq->mp2mr[i]->mr->lkey)
> +== txq->mp2mr[i]->lkey);
> + }

This code should be inside priv_txq_mp2mr_reg() to let the secondary
search inside the MR list when the cache is too small.

If it does not find any MR it should fail before calling
priv_mr_new().

>   txq->mr_cache_idx = i;
>   return txq->mp2mr[i]->lkey;
>   }
>   }
>   txq->mr_cache_idx = 0;
> - mr = mlx5_txq_mp2mr_reg(txq, mlx5_tx_mb2mp(mb), i);
> + mp = mlx5_tx_mb2mp(mb);
> + if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> + WARN("Using unregistered mempool 0x%p(%s) in secondary process,"
> +  " please create mempool before rte_eth_dev_start()",
> +  (void *)mp, mp->name);
> + assert(rte_eal_process_type() == RTE_PROC_PRIMARY);
> + return (uint32_t)-1;
> + }
> + mr = mlx5_txq_mp2mr_reg(txq, mp, i);
>   /*
>* Request the reference to use in this queue, the original one is
>* kept by the control plane.
> -- 
> 2.12.0

-- 
Nélio Laranjeiro
6WIND


[dpdk-dev] [PATCH v2 3/4] net/e1000: fix VF RX queue interrupt enabling

2018-01-24 Thread Wenzhuo Lu
When using UIO, after enabling the interrupt to get the PF
message, VF RX queue interrupt is not working.
It's expected, as UIO doesn't support multiple interrupt.
So, PMD should not try to enable RX queue interrupt. Then
APP can know the RX queue interrupt is not enabled and only
choose the polling mode.

Fixes: 316f4f1adc2e ("net/igb: support VF mailbox interrupt for link up/down")
CC: sta...@dpdk.org

Signed-off-by: Wenzhuo Lu 
---
 drivers/net/e1000/igb_ethdev.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/e1000/igb_ethdev.c b/drivers/net/e1000/igb_ethdev.c
index 077e094..15347df 100644
--- a/drivers/net/e1000/igb_ethdev.c
+++ b/drivers/net/e1000/igb_ethdev.c
@@ -3295,7 +3295,8 @@ static int eth_igbvf_link_update(struct e1000_hw *hw)
}
 
/* check and configure queue intr-vector mapping */
-   if (dev->data->dev_conf.intr_conf.rxq != 0) {
+   if (rte_intr_cap_multiple(intr_handle) &&
+   dev->data->dev_conf.intr_conf.rxq) {
intr_vector = dev->data->nb_rx_queues;
ret = rte_intr_efd_enable(intr_handle, intr_vector);
if (ret)
-- 
1.9.3



[dpdk-dev] [PATCH v2 2/4] net/ixgbe: fix VF RX queue interrupt enabling

2018-01-24 Thread Wenzhuo Lu
When using UIO, after enabling the interrupt to get the PF
message, VF RX queue interrupt is not working.
It's expected, as UIO doesn't support multiple interrupt.
So, PMD should not try to enable RX queue interrupt. Then
APP can know the RX queue interrupt is not enabled and only
choose the polling mode.

Fixes: 77234603fba0 ("net/ixgbe: support VF mailbox interrupt for link up/down")
CC: sta...@dpdk.org

Signed-off-by: Wenzhuo Lu 
---
 drivers/net/ixgbe/ixgbe_ethdev.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index 4f4334d..91ff54c 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -5033,7 +5033,8 @@ static int ixgbevf_dev_xstats_get_names(__rte_unused 
struct rte_eth_dev *dev,
ixgbevf_dev_rxtx_start(dev);
 
/* check and configure queue intr-vector mapping */
-   if (dev->data->dev_conf.intr_conf.rxq != 0) {
+   if (rte_intr_cap_multiple(intr_handle) &&
+   dev->data->dev_conf.intr_conf.rxq) {
/* According to datasheet, only vector 0/1/2 can be used,
 * now only one vector is used for Rx queue
 */
-- 
1.9.3



[dpdk-dev] [PATCH v2 4/4] net/avf: fix VF RX queue interrupt enabling

2018-01-24 Thread Wenzhuo Lu
As UIO doesn't support multiple interrupt, and the interrupt
is occupied by the control plane. PMD should not try to enable
RX queue interrupt. Then APP can know the RX queue interrupt
is not enabled and only choose the polling mode.

Fixes: d6bde6b5eae9 ("net/avf: enable Rx interrupt")
Signed-off-by: Wenzhuo Lu 
---
 drivers/net/avf/avf_ethdev.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/avf/avf_ethdev.c b/drivers/net/avf/avf_ethdev.c
index b36d317..2bd5170 100644
--- a/drivers/net/avf/avf_ethdev.c
+++ b/drivers/net/avf/avf_ethdev.c
@@ -292,7 +292,8 @@ static int avf_config_rx_queues_irqs(struct rte_eth_dev 
*dev,
uint16_t interval, i;
int vec;
 
-   if (dev->data->dev_conf.intr_conf.rxq != 0) {
+   if (rte_intr_cap_multiple(intr_handle) &&
+   dev->data->dev_conf.intr_conf.rxq) {
if (rte_intr_efd_enable(intr_handle, dev->data->nb_rx_queues))
return -1;
}
@@ -308,7 +309,8 @@ static int avf_config_rx_queues_irqs(struct rte_eth_dev 
*dev,
}
}
 
-   if (!dev->data->dev_conf.intr_conf.rxq) {
+   if (!dev->data->dev_conf.intr_conf.rxq ||
+   !rte_intr_dp_is_en(intr_handle)) {
/* Rx interrupt disabled, Map interrupt only for writeback */
vf->nb_msix = 1;
if (vf->vf_res->vf_cap_flags &
-- 
1.9.3



[dpdk-dev] [PATCH v2 1/4] net/i40e: fix VF RX queue interrupt enabling

2018-01-24 Thread Wenzhuo Lu
When using UIO, after enabling the interrupt to get the PF
message, VF RX queue interrupt is not working.
It's expected, as UIO doesn't support multiple interrupt.
So, PMD should not try to enable RX queue interrupt. Then
APP can know the RX queue interrupt is not enabled and only
choose the polling mode.

Fixes: ae19955e7c86 ("i40evf: support reporting PF reset")
CC: sta...@dpdk.org

Signed-off-by: Wenzhuo Lu 
---
 drivers/net/i40e/i40e_ethdev_vf.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/i40e/i40e_ethdev_vf.c 
b/drivers/net/i40e/i40e_ethdev_vf.c
index 0cca0d3..60783f9 100644
--- a/drivers/net/i40e/i40e_ethdev_vf.c
+++ b/drivers/net/i40e/i40e_ethdev_vf.c
@@ -1976,7 +1976,8 @@ static int eth_i40evf_pci_remove(struct rte_pci_device 
*pci_dev)
dev->data->nb_tx_queues);
 
/* check and configure queue intr-vector mapping */
-   if (dev->data->dev_conf.intr_conf.rxq != 0) {
+   if (rte_intr_cap_multiple(intr_handle) &&
+   dev->data->dev_conf.intr_conf.rxq) {
intr_vector = dev->data->nb_rx_queues;
if (rte_intr_efd_enable(intr_handle, intr_vector))
return -1;
-- 
1.9.3



Re: [dpdk-dev] [PATCH v3 3/3] doc: convert license headers to SPDX tags

2018-01-24 Thread Thomas Monjalon
24/01/2018 06:48, Hemant Agrawal:
> 
> > -Original Message-
> > From: Thomas Monjalon [mailto:tho...@monjalon.net]
> > 23/01/2018 16:19, Hemant Agrawal:
> > >
> > > > -Original Message-
> > > > From: Thomas Monjalon [mailto:tho...@monjalon.net]
> > >
> > > > 23/01/2018 14:08, Hemant Agrawal:
> > > > > This patch will be good if you only add SPDX to it and NOT remove
> > > > > the
> > > > original license text.
> > > > > i.e. only do following:
> > > > >
> > > > > > -..  BSD LICENSE
> > > > > > +..  SPDX-License-Identifier: BSD-3-Clause
> > > > >
> > > > > Around RC2 timeframe, I intend to do that. All the remaining but
> > > > > valid
> > > > license files, we will add SPDX and NOT remove the license text.
> > > >
> > > > Hemant, I don't understand why?
> > > [Hemant]
> > >
> > > Changing license for someone else copyright needs their ACK. However
> > > we can add SPDX without modifying existing license text There are large
> > number of other copyrights and not everyone is converting their license  to
> > SPDX only.
> > >
> > > In case of linux kernel and uboot, as a first step they just added SPDX to
> > all files without removing the license text.
> > >
> > > I was thinking of doing the same so that all the files in DPDK should have
> > SPDX. However, we will not SPDX to files, which are not complaint to DPDK
> > policy.
> > > We will deal with them separately.
> > 
> > If we don't remove the license now, it will never happen.
> 
> I have the same fear. But we can not remove other's license text without 
> their explicit approval. 
> 
> W.r.t DPDK project priorities: License compliance is important than cleanup. 
> 
> Let's target following:
> 1.  100% compliance by 18.05
> 2.  100% Cleanup by 18.08 or 18.11

Why do you assume we cannot get the author's approval quickly?
We did not try.
Let's try to do compliance and cleanup at the same time.



Re: [dpdk-dev] [PATCH 3/5] net/mlx5: assert for un-successful memory registration

2018-01-24 Thread Nélio Laranjeiro
Hi Shahaf,

On Tue, Jan 23, 2018 at 07:08:21PM +0200, Shahaf Shuler wrote:
> Memory registration can fail, add the proper assert for such scenario
> for it at least to be visible in debug mode.
> 
> Signed-off-by: Shahaf Shuler 
> Signed-off-by: Xueming Li 
> ---
>  drivers/net/mlx5/mlx5_rxtx.h | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
> index 79cdfc793..2934f9fb3 100644
> --- a/drivers/net/mlx5/mlx5_rxtx.h
> +++ b/drivers/net/mlx5/mlx5_rxtx.h
> @@ -589,6 +589,10 @@ mlx5_tx_mb2mr(struct mlx5_txq_data *txq, struct rte_mbuf 
> *mb)
>   if (mr) {
>   rte_atomic32_inc(&mr->refcnt);
>   return mr->lkey;
> + } else {
> + WARN("Failed to register mempool 0x%p(%s)",
> +   (void *)mp, mp->name);
> + assert(mr != NULL);

This assert seems wrong.

Why this assert, you don't trust the CPU to verify the pointer is NULL?

>   }
>   return (uint32_t)-1;
>  }
> -- 
> 2.12.0

Regards,

-- 
Nélio Laranjeiro
6WIND


Re: [dpdk-dev] [PATCH] examples/flow_filtering: add delay during updating link status

2018-01-24 Thread Ori Kam
Hi Beilei,

PSB

> -Original Message-
> From: Beilei Xing [mailto:beilei.x...@intel.com]
> Sent: Wednesday, January 24, 2018 9:53 AM
> To: Ori Kam 
> Cc: dev@dpdk.org
> Subject: [PATCH] examples/flow_filtering: add delay during updating link
> status
> 
> Add up to 9s delay for getting link status to make sure NIC updates link 
> status
> successfully, just like other applications such as testpmd and l2fwd.
> 
> Signed-off-by: Beilei Xing 
> ---
>  examples/flow_filtering/main.c | 12 +++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/examples/flow_filtering/main.c b/examples/flow_filtering/main.c
> index 4a07b63..1788f24 100644
> --- a/examples/flow_filtering/main.c
> +++ b/examples/flow_filtering/main.c
> @@ -55,6 +55,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
> 
>  static volatile bool force_quit;
> 
> @@ -119,13 +120,22 @@ main_loop(void)
>   rte_eth_dev_close(port_id);
>  }
> 
> +#define CHECK_INTERVAL 1000  /* 100ms */

This define is not used.

> +#define MAX_REPEAT_TIME 90   /* 9s (90 * 100ms) in total */
> +
>  static void
>  assert_link_status(void)
>  {
>   struct rte_eth_link link;
> + uint8_t rep_cnt = MAX_REPEAT_TIME;
> 
>   memset(&link, 0, sizeof(link));
> - rte_eth_link_get(port_id, &link);
> + do {
> + rte_eth_link_get(port_id, &link);
> + if (link.link_status == ETH_LINK_UP)
> + break;
> + } while (--rep_cnt);

I think you are missing the rte_delay_ms(CHECK_INTERVAL);
Currently the code will work for 90 iterations  but we can't grantee the 
duration.

> +
>   if (link.link_status == ETH_LINK_DOWN)
>   rte_exit(EXIT_FAILURE, ":: error: link is still down\n");  }
> --
> 2.5.5

Regards,
Ori


Re: [dpdk-dev] [PATCH] examples/flow_filtering: add delay during updating link status

2018-01-24 Thread Xing, Beilei


> -Original Message-
> From: Ori Kam [mailto:or...@mellanox.com]
> Sent: Wednesday, January 24, 2018 4:17 PM
> To: Xing, Beilei 
> Cc: dev@dpdk.org
> Subject: RE: [PATCH] examples/flow_filtering: add delay during updating link
> status
> 
> Hi Beilei,
> 
> PSB
> 
> > -Original Message-
> > From: Beilei Xing [mailto:beilei.x...@intel.com]
> > Sent: Wednesday, January 24, 2018 9:53 AM
> > To: Ori Kam 
> > Cc: dev@dpdk.org
> > Subject: [PATCH] examples/flow_filtering: add delay during updating
> > link status
> >
> > Add up to 9s delay for getting link status to make sure NIC updates
> > link status successfully, just like other applications such as testpmd and
> l2fwd.
> >
> > Signed-off-by: Beilei Xing 
> > ---
> >  examples/flow_filtering/main.c | 12 +++-
> >  1 file changed, 11 insertions(+), 1 deletion(-)
> >
> > diff --git a/examples/flow_filtering/main.c
> > b/examples/flow_filtering/main.c index 4a07b63..1788f24 100644
> > --- a/examples/flow_filtering/main.c
> > +++ b/examples/flow_filtering/main.c
> > @@ -55,6 +55,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >
> >  static volatile bool force_quit;
> >
> > @@ -119,13 +120,22 @@ main_loop(void)
> > rte_eth_dev_close(port_id);
> >  }
> >
> > +#define CHECK_INTERVAL 1000  /* 100ms */
> 
> This define is not used.
> 
> > +#define MAX_REPEAT_TIME 90   /* 9s (90 * 100ms) in total */
> > +
> >  static void
> >  assert_link_status(void)
> >  {
> > struct rte_eth_link link;
> > +   uint8_t rep_cnt = MAX_REPEAT_TIME;
> >
> > memset(&link, 0, sizeof(link));
> > -   rte_eth_link_get(port_id, &link);
> > +   do {
> > +   rte_eth_link_get(port_id, &link);
> > +   if (link.link_status == ETH_LINK_UP)
> > +   break;
> > +   } while (--rep_cnt);
> 
> I think you are missing the rte_delay_ms(CHECK_INTERVAL); Currently the
> code will work for 90 iterations  but we can't grantee the duration.

Sorry for missing rte_delay_ms in this patch, will send v2 later, thanks.

> 
> > +
> > if (link.link_status == ETH_LINK_DOWN)
> > rte_exit(EXIT_FAILURE, ":: error: link is still down\n");  }
> > --
> > 2.5.5
> 
> Regards,
> Ori


Re: [dpdk-dev] [PATCH] doc: document the new devargs syntax

2018-01-24 Thread Thomas Monjalon
24/01/2018 07:43, Yuanhan Liu:
> On Tue, Jan 23, 2018 at 03:29:34PM +0100, Thomas Monjalon wrote:
> > 23/01/2018 13:46, Yuanhan Liu:
> > > On Thu, Jan 18, 2018 at 10:46:23AM +0100, Gaëtan Rivet wrote:
> > > > On Thu, Jan 18, 2018 at 09:46:29AM +0100, Thomas Monjalon wrote:
> > > > > 18/01/2018 08:35, Yuanhan Liu:
> > > > > > On Wed, Jan 17, 2018 at 12:34:08PM +, Ferruh Yigit wrote:
> > > > > > > So does it make sense to separate them logically? Perhaps as 
> > > > > > > "device identifier"
> > > > > > > and "device args".
> > > > > > 
> > > > > > Then I think it returns back to the old issue: how could we 
> > > > > > identify a
> > > > > > port when the bus id (say BDF for PCI bus) is not enough for 
> > > > > > identifying
> > > > > > a port? Such case could happen when a single NIC has 2 ports sharing
> > > > > > the same BDF. It could also happen with the VF representors that 
> > > > > > will
> > > > > > be introduced shortly.
> > > > > 
> > > > > Yes, the device matching syntax must include bus category, class 
> > > > > category
> > > > > and driver category. So any device can be identified in future.
> > > > > 
> > > > > But I think Ferruh is talking about separating device matching
> > > > > (which is described in this proposal) and device settings
> > > > > (which are usually mixed in -w and --vdev options).
> > > > > I agree there are different things and may be separate.
> > > > > They could share the same syntax (bus/class/driver) but be separate
> > > > > with a semicolon:
> > > > >   matching;settings
> > > >
> > > > Can you give an example?
> > > 
> > > Let's take port addition in OVS-DPDK as an example. It happens in 2
> > > steps:
> > > - port lookup (if port is already probed)
> > > - dev attachment (if lookup fails)
> > > 
> > > And also let's assume we need probe a ConnectX-3 port. Note that for
> > > ConnectX-3, there are 2 ports sharing the same PCI addr. Thus, PCI
> > > BDF is not enough. And let's assume we use another extra property
> > > "port".
> > > 
> > > If the proposal described in this patch is being used, the devarg
> > > would look like following:
> > > 
> > > bus=pci,id=04:00.0/class=eth,port=0/driver=mlx4,mlx4_arg_A=val,...
> > > 
> > > Then "bus=pci,id=04:00.0/class=eth,port=0" will be used for lookup,
> > > It means we are looking for a port with PCI BDF == 04:00.0 AND
> > > port == 0 (the first port of the 2 ports).
> > > 
> > > Note that in my proposal the driver category is not intended for lookup.
> > > If any properties needed be looked in the driver category, they would
> > > probably need be elevated to the class category.
> > 
> > It is not my thought.
> > I think we should be able to use bus, class and driver properties for 
> > lookup.
> > We can imagine doing a lookup on a driver specific id, which is not
> > candidate to elevation to the class category.
> > 
> > > If port not found, then the whole string will be used for dev attachment.
> > > It means we are attaching a port with PCI BDF == 04.00.0 AND
> > > port == 0 (the 2nd port will not be attached).
> > > 
> > > 
> > > And here is how the devargs would look like if "matching;settings" is
> > > being used:
> > > 
> > > 
> > > bus=pci,id=04:00.0/class=eth,port=0;bus=pci,id=04:00.0/class=eth,port=0/driver=mlx4,mlx4_arg_A=val,...
> > > 
> > > The part before ";" will be used for lookup and the later part will be
> > > used for attachment. It should work. It just looks redundant.
> > 
> > It does not have to be redundant.
> > It can be:
> > bus=pci,id=04:00.0/class=eth,port=0;driver=mlx4,mlx4_arg1=settings1,...
> 
> I knew you would make such reply :)
> Then there is a contradiction. According your suggestion, the "port=0" belongs
> to the matching section, but it also has to be used in the settings section.

If port=0 is matched, it is already set, right?
Why it needs to be in settings?

> > Another example, setting the MAC address:
> > bus=pci,id=04:00.0/class=eth,port=0;class=eth,mac=00:11:22:33:44:55
> 
> What's the scenario it will be used? And who is going to parse it?

It can be used on command line for whitelisting a device and let the user
change its MAC address.
The matching is parsed by the PCI driver + ethdev, here.
As mac is a property of ethdev (class=eth), this part must be parsed by ethdev.


[dpdk-dev] [PATCH v2] examples/flow_filtering: add delay during updating link status

2018-01-24 Thread Beilei Xing
Add up to 9s delay for getting link status to make sure NIC updates
link status successfully, just like other applications such as
testpmd and l2fwd.

Signed-off-by: Beilei Xing 
---

v2 changes:
 - Add rte_delay_ms(CHECK_INTERVAL) which is missed in v1.

 examples/flow_filtering/main.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/examples/flow_filtering/main.c b/examples/flow_filtering/main.c
index 4a07b63..85d5727 100644
--- a/examples/flow_filtering/main.c
+++ b/examples/flow_filtering/main.c
@@ -55,6 +55,7 @@
 #include 
 #include 
 #include 
+#include 
 
 static volatile bool force_quit;
 
@@ -119,13 +120,23 @@ main_loop(void)
rte_eth_dev_close(port_id);
 }
 
+#define CHECK_INTERVAL 1000  /* 100ms */
+#define MAX_REPEAT_TIME 90   /* 9s (90 * 100ms) in total */
+
 static void
 assert_link_status(void)
 {
struct rte_eth_link link;
+   uint8_t rep_cnt = MAX_REPEAT_TIME;
 
memset(&link, 0, sizeof(link));
-   rte_eth_link_get(port_id, &link);
+   do {
+   rte_eth_link_get(port_id, &link);
+   if (link.link_status == ETH_LINK_UP)
+   break;
+   rte_delay_ms(CHECK_INTERVAL);
+   } while (--rep_cnt);
+
if (link.link_status == ETH_LINK_DOWN)
rte_exit(EXIT_FAILURE, ":: error: link is still down\n");
 }
-- 
2.5.5



Re: [dpdk-dev] [PATCH v3 1/3] net/i40e: add null point check and fix mem leak

2018-01-24 Thread Zhang, Helin
Hi Yong

Thank you so much for your contribution! I have comments in general.
1. for a patch set, you need a cover letter for the series.
2. If a patch is to fix a bug/issue, the title should start with 'fix'.
3. A 'Fixes:' line is needed for any bug fixes.
4. A 'Cc:' line is needed, if the patch should be back ported into any stable 
release version.
5. you can find some good examples from the 'git log'.

Thanks!

Regards,
Helin


Re: [dpdk-dev] [PATCH v2] examples/flow_filtering: add delay during updating link status

2018-01-24 Thread Ori Kam
Hi


> -Original Message-
> From: Beilei Xing [mailto:beilei.x...@intel.com]
> Sent: Wednesday, January 24, 2018 10:35 AM
> To: Ori Kam 
> Cc: dev@dpdk.org
> Subject: [PATCH v2] examples/flow_filtering: add delay during updating link
> status
> 
> Add up to 9s delay for getting link status to make sure NIC updates link 
> status
> successfully, just like other applications such as testpmd and l2fwd.
> 
> Signed-off-by: Beilei Xing 
> ---
> 
> v2 changes:
>  - Add rte_delay_ms(CHECK_INTERVAL) which is missed in v1.
> 
>  examples/flow_filtering/main.c | 13 -
>  1 file changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/examples/flow_filtering/main.c b/examples/flow_filtering/main.c
> index 4a07b63..85d5727 100644
> --- a/examples/flow_filtering/main.c
> +++ b/examples/flow_filtering/main.c
> @@ -55,6 +55,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
> 
>  static volatile bool force_quit;
> 
> @@ -119,13 +120,23 @@ main_loop(void)
>   rte_eth_dev_close(port_id);
>  }
> 
> +#define CHECK_INTERVAL 1000  /* 100ms */
> +#define MAX_REPEAT_TIME 90   /* 9s (90 * 100ms) in total */

I know that in other examples there is use of 
MAX_REPEAT_TIME but don't you think the name is incorrect,
It should be called:
MAX_REPEAT_TIMES or MAX_REPEAT_COUNT?
Since it doesn't represent time but iterations.
What do you think?

> +
>  static void
>  assert_link_status(void)
>  {
>   struct rte_eth_link link;
> + uint8_t rep_cnt = MAX_REPEAT_TIME;
> 
>   memset(&link, 0, sizeof(link));
> - rte_eth_link_get(port_id, &link);
> + do {
> + rte_eth_link_get(port_id, &link);
> + if (link.link_status == ETH_LINK_UP)
> + break;
> + rte_delay_ms(CHECK_INTERVAL);
> + } while (--rep_cnt);
> +
>   if (link.link_status == ETH_LINK_DOWN)
>   rte_exit(EXIT_FAILURE, ":: error: link is still down\n");  }
> --
> 2.5.5



Re: [dpdk-dev] [PATCH v3 01/19] crypto/ccp: add AMD ccp skeleton PMD

2018-01-24 Thread De Lara Guarch, Pablo
Hi Ravi,

> -Original Message-
> From: Kumar, Ravi1 [mailto:ravi1.ku...@amd.com]
> Sent: Wednesday, January 17, 2018 9:09 AM
> To: De Lara Guarch, Pablo ;
> dev@dpdk.org
> Cc: Shippen, Greg 
> Subject: RE: [PATCH v3 01/19] crypto/ccp: add AMD ccp skeleton PMD
> 
> >Hi Ravi,
> >
> >> -Original Message-
> >> From: Kumar, Ravi1 [mailto:ravi1.ku...@amd.com]
> >> Sent: Thursday, January 11, 2018 6:37 AM
> >> To: De Lara Guarch, Pablo ;
> >> dev@dpdk.org
> >> Cc: Shippen, Greg 
> >> Subject: RE: [PATCH v3 01/19] crypto/ccp: add AMD ccp skeleton PMD
> >>
> >> > -Original Message-
> >> >> From: Kumar, Ravi1 [mailto:ravi1.ku...@amd.com]
> >> >> Sent: Wednesday, January 10, 2018 10:30 AM
> >> >> To: De Lara Guarch, Pablo ;
> >> >> dev@dpdk.org
> >> >> Cc: Shippen, Greg 
> >> >> Subject: RE: [PATCH v3 01/19] crypto/ccp: add AMD ccp skeleton
> PMD
> >> >>
> >> >> >Hi Ravi,
> >> >> >
> >> >> >> -Original Message-
> >> >> >> From: Ravi Kumar [mailto:ravi1.ku...@amd.com]
> >> >> >> Sent: Wednesday, January 10, 2018 9:43 AM
> >> >> >> To: dev@dpdk.org
> >> >> >> Cc: De Lara Guarch, Pablo 
> >> >> >> Subject: [PATCH v3 01/19] crypto/ccp: add AMD ccp skeleton
> PMD
> >> >> >>
> >> >> >> Signed-off-by: Ravi Kumar 
> >> >> >> ---
> >> >> >
> >> >> >...
> >> >> >
> >> >> >> --- /dev/null
> >> >> >> +++ b/drivers/crypto/ccp/Makefile
> >> >> >> @@ -0,0 +1,55 @@
> >> >> >> +#
> >> >> >> +#   Copyright(c) 2018 Advanced Micro Devices, Inc.
> >> >> >> +#   All rights reserved.
> >> >> >> +#
> >> >> >
> >> >> >As Hemant commented, you need to change this full license with
> >> >> >SPDX
> >> >> tags:
> >> >> >
> >> >> >http://dpdk.org/ml/archives/dev/2018-January/085510.html
> >> >> >
> >> >> >Could you submit a v4 with these changes today?
> >> >> >
> >> >> >Thanks,
> >> >> >Pablo
> >> >>
> >> >> Hi Pablo,
> >> >>
> >> >> Our legal team is still working on the license. We want to get the
> >> >> code reviewed in parallel.
> >> >> I will give you and update later if we can upload the v4 patch
> >> >> with SPDX tags today.
> >> >>
> >> >
> >> >Ok, I have looked at the overall patchset and it looks ok to me.
> >> >The only thing that I would change is the title of patch 7/19:
> >> >I would change it to "crypto/ccp: support sessionless operations".
> >> >
> >> >Regards,
> >> >Pablo
> >>
> >> Hi Pablo,
> >>
> >> Thanks for going through our code. Good to know the code is fine.
> >>
> >> The new changes in licensing policy has introduced some delays as we
> >> work through the implications and get agreement from the third party.
> >>
> >> We are working on it and will upload the v4 with SPDX license tags as
> >> soon as possible.
> >>
> >
> >Any news? RC1 will be out on Friday, and the subtree should be close
> tomorrow evening, and there will not be more features added after this
> point.
> >
> >Thanks,
> >Pablo
> 
> Hi Pablo,
> 
> We are still in the process of getting the sign off on the new SPDX license
> tags from the third party. We expect an answer from them by the end of
> the week at the earliest.
> 
> I will get back to you as soon as I have some update.

Any update? This could still be merged in RC2, but no later than that.

Thanks,
Pablo

> 
> Regards,
> Ravi
> 
> >
> >> Regards,
> >> Ravi
> >> >
> >> >> Regards,
> >> >> Ravi


Re: [dpdk-dev] Dynamic Logging Name Formatting

2018-01-24 Thread Olivier Matz
On Wed, Jan 24, 2018 at 09:32:33AM +0300, Andrew Rybchenko wrote:
> On 01/23/2018 11:54 PM, Bruce Richardson wrote:
> > On Tue, Jan 23, 2018 at 05:45:09PM +, Ferruh Yigit wrote:
> > > On 1/23/2018 5:31 PM, Van Haaren, Harry wrote:
> > > > === Suggested Consistent Schema ===
> > > > ===
> > > > 
> > > > Libraries:
> > > >lib..specialization
> > > > 
> > > > Net PMDs:
> > > >pmd..specialization
> > > >// one could argue consistency would be pmd.ethdev.
> > > >// but the change is larger than the value IMO - thoughts?
> > pmd.net ??
> > 
> > > > Eventdev PMDs:
> > > >pmd.eventdev..specialization
> > > > 
> > > > Crypto PMDs:
> > > >pmd.cryptodev..specialization
> > > > 
> > "event" and "crypto" without the "dev" on the end, which would also
> > match "net" above.
> 
> I like the scheme suggested by Bruce including pmd.net. Shorter and
> consistent.

Agree.


Re: [dpdk-dev] [PATCH] doc: document the new devargs syntax

2018-01-24 Thread Yuanhan Liu
On Wed, Jan 24, 2018 at 09:19:10AM +0100, Thomas Monjalon wrote:
> 24/01/2018 07:43, Yuanhan Liu:
> > On Tue, Jan 23, 2018 at 03:29:34PM +0100, Thomas Monjalon wrote:
> > > 23/01/2018 13:46, Yuanhan Liu:
> > > > On Thu, Jan 18, 2018 at 10:46:23AM +0100, Gaëtan Rivet wrote:
> > > > > On Thu, Jan 18, 2018 at 09:46:29AM +0100, Thomas Monjalon wrote:
> > > > > > 18/01/2018 08:35, Yuanhan Liu:
> > > > > > > On Wed, Jan 17, 2018 at 12:34:08PM +, Ferruh Yigit wrote:
> > > > > > > > So does it make sense to separate them logically? Perhaps as 
> > > > > > > > "device identifier"
> > > > > > > > and "device args".
> > > > > > > 
> > > > > > > Then I think it returns back to the old issue: how could we 
> > > > > > > identify a
> > > > > > > port when the bus id (say BDF for PCI bus) is not enough for 
> > > > > > > identifying
> > > > > > > a port? Such case could happen when a single NIC has 2 ports 
> > > > > > > sharing
> > > > > > > the same BDF. It could also happen with the VF representors that 
> > > > > > > will
> > > > > > > be introduced shortly.
> > > > > > 
> > > > > > Yes, the device matching syntax must include bus category, class 
> > > > > > category
> > > > > > and driver category. So any device can be identified in future.
> > > > > > 
> > > > > > But I think Ferruh is talking about separating device matching
> > > > > > (which is described in this proposal) and device settings
> > > > > > (which are usually mixed in -w and --vdev options).
> > > > > > I agree there are different things and may be separate.
> > > > > > They could share the same syntax (bus/class/driver) but be separate
> > > > > > with a semicolon:
> > > > > > matching;settings
> > > > >
> > > > > Can you give an example?
> > > > 
> > > > Let's take port addition in OVS-DPDK as an example. It happens in 2
> > > > steps:
> > > > - port lookup (if port is already probed)
> > > > - dev attachment (if lookup fails)
> > > > 
> > > > And also let's assume we need probe a ConnectX-3 port. Note that for
> > > > ConnectX-3, there are 2 ports sharing the same PCI addr. Thus, PCI
> > > > BDF is not enough. And let's assume we use another extra property
> > > > "port".
> > > > 
> > > > If the proposal described in this patch is being used, the devarg
> > > > would look like following:
> > > > 
> > > > bus=pci,id=04:00.0/class=eth,port=0/driver=mlx4,mlx4_arg_A=val,...
> > > > 
> > > > Then "bus=pci,id=04:00.0/class=eth,port=0" will be used for lookup,
> > > > It means we are looking for a port with PCI BDF == 04:00.0 AND
> > > > port == 0 (the first port of the 2 ports).
> > > > 
> > > > Note that in my proposal the driver category is not intended for lookup.
> > > > If any properties needed be looked in the driver category, they would
> > > > probably need be elevated to the class category.
> > > 
> > > It is not my thought.
> > > I think we should be able to use bus, class and driver properties for 
> > > lookup.
> > > We can imagine doing a lookup on a driver specific id, which is not
> > > candidate to elevation to the class category.
> > > 
> > > > If port not found, then the whole string will be used for dev 
> > > > attachment.
> > > > It means we are attaching a port with PCI BDF == 04.00.0 AND
> > > > port == 0 (the 2nd port will not be attached).
> > > > 
> > > > 
> > > > And here is how the devargs would look like if "matching;settings" is
> > > > being used:
> > > > 
> > > > 
> > > > bus=pci,id=04:00.0/class=eth,port=0;bus=pci,id=04:00.0/class=eth,port=0/driver=mlx4,mlx4_arg_A=val,...
> > > > 
> > > > The part before ";" will be used for lookup and the later part will be
> > > > used for attachment. It should work. It just looks redundant.
> > > 
> > > It does not have to be redundant.
> > > It can be:
> > >   bus=pci,id=04:00.0/class=eth,port=0;driver=mlx4,mlx4_arg1=settings1,...
> > 
> > I knew you would make such reply :)
> > Then there is a contradiction. According your suggestion, the "port=0" 
> > belongs
> > to the matching section, but it also has to be used in the settings section.
> 
> If port=0 is matched, it is already set, right?

Yes.

> Why it needs to be in settings?

But I was talking the case it's not matched, say it's not probed and here
we do hotplug.

--yliu

> > > Another example, setting the MAC address:
> > >   bus=pci,id=04:00.0/class=eth,port=0;class=eth,mac=00:11:22:33:44:55
> > 
> > What's the scenario it will be used? And who is going to parse it?
> 
> It can be used on command line for whitelisting a device and let the user
> change its MAC address.
> The matching is parsed by the PCI driver + ethdev, here.
> As mac is a property of ethdev (class=eth), this part must be parsed by 
> ethdev.



[dpdk-dev] 答复: RE: [PATCH v3 1/3] net/i40e: add null point check andfix mem leak

2018-01-24 Thread wang.yong19
Hi Helin
Thanks for your advice.
However, it's hard to find the original patches which introduced the bug in 
general, especailly when the file name has changed.
This may prevent us to contribute to dpdk.
< 4. A 'Cc:' line is needed, if the patch should be back ported into any stable 
release version.
As a common contributor, I don't know whether the patch should be back ported 
into any stable release version.
And I have no rights to make this decision.

Thank you!

Yours,
Yong

--origin--
From: ;
To:; ; ; 
;
Cc: ;
Date :2018-01-24 16:41
Subject :RE: [dpdk-dev] [PATCH v3 1/3] net/i40e: add null point check andfix
mem leak
Hi Yong

Thank you so much for your contribution! I have comments in general.
1. for a patch set, you need a cover letter for the series.
2. If a patch is to fix a bug/issue, the title should start with 'fix'.
3. A 'Fixes:' line is needed for any bug fixes.
4. A 'Cc:' line is needed, if the patch should be back ported into any stable 
release version.
5. you can find some good examples from the 'git log'.

Thanks!

Regards,
Helin

[dpdk-dev] [PATCH] app/eventdev: fix port dequeue depth configuration

2018-01-24 Thread Pavan Nikhilesh
The port dequeue depth value has to be compared against the maximum
allowed dequeue depth reported by the event drivers.

Fixes: 3617aae53f92 ("app/eventdev: add event Rx adapter setup")

Signed-off-by: Pavan Nikhilesh 
---
 app/test-eventdev/test_perf_atq.c   | 13 -
 app/test-eventdev/test_perf_common.c| 25 +
 app/test-eventdev/test_perf_common.h|  3 ++-
 app/test-eventdev/test_perf_queue.c | 12 +++-
 app/test-eventdev/test_pipeline_atq.c   |  3 +++
 app/test-eventdev/test_pipeline_queue.c |  3 +++
 6 files changed, 36 insertions(+), 23 deletions(-)

diff --git a/app/test-eventdev/test_perf_atq.c 
b/app/test-eventdev/test_perf_atq.c
index d07a05425..b36b22a77 100644
--- a/app/test-eventdev/test_perf_atq.c
+++ b/app/test-eventdev/test_perf_atq.c
@@ -207,7 +207,18 @@ perf_atq_eventdev_setup(struct evt_test *test, struct 
evt_options *opt)
}
}
 
-   ret = perf_event_dev_port_setup(test, opt, 1 /* stride */, nb_queues);
+   if (opt->wkr_deq_dep > dev_info.max_event_port_dequeue_depth)
+   opt->wkr_deq_dep = dev_info.max_event_port_dequeue_depth;
+
+   /* port configuration */
+   const struct rte_event_port_conf p_conf = {
+   .dequeue_depth = opt->wkr_deq_dep,
+   .enqueue_depth = dev_info.max_event_port_dequeue_depth,
+   .new_event_threshold = dev_info.max_num_events,
+   };
+
+   ret = perf_event_dev_port_setup(test, opt, 1 /* stride */, nb_queues,
+   &p_conf);
if (ret)
return ret;
 
diff --git a/app/test-eventdev/test_perf_common.c 
b/app/test-eventdev/test_perf_common.c
index e279d81a5..59fa0a49e 100644
--- a/app/test-eventdev/test_perf_common.c
+++ b/app/test-eventdev/test_perf_common.c
@@ -285,22 +285,12 @@ perf_event_rx_adapter_setup(struct evt_options *opt, 
uint8_t stride,
 
 int
 perf_event_dev_port_setup(struct evt_test *test, struct evt_options *opt,
-   uint8_t stride, uint8_t nb_queues)
+   uint8_t stride, uint8_t nb_queues,
+   const struct rte_event_port_conf *port_conf)
 {
struct test_perf *t = evt_test_priv(test);
uint16_t port, prod;
int ret = -1;
-   struct rte_event_port_conf port_conf;
-
-   memset(&port_conf, 0, sizeof(struct rte_event_port_conf));
-   rte_event_port_default_conf_get(opt->dev_id, 0, &port_conf);
-
-   /* port configuration */
-   const struct rte_event_port_conf wkr_p_conf = {
-   .dequeue_depth = opt->wkr_deq_dep,
-   .enqueue_depth = port_conf.enqueue_depth,
-   .new_event_threshold = port_conf.new_event_threshold,
-   };
 
/* setup one port per worker, linking to all queues */
for (port = 0; port < evt_nr_active_lcores(opt->wlcores);
@@ -313,7 +303,7 @@ perf_event_dev_port_setup(struct evt_test *test, struct 
evt_options *opt,
w->processed_pkts = 0;
w->latency = 0;
 
-   ret = rte_event_port_setup(opt->dev_id, port, &wkr_p_conf);
+   ret = rte_event_port_setup(opt->dev_id, port, port_conf);
if (ret) {
evt_err("failed to setup port %d", port);
return ret;
@@ -327,18 +317,13 @@ perf_event_dev_port_setup(struct evt_test *test, struct 
evt_options *opt,
}
 
/* port for producers, no links */
-   struct rte_event_port_conf prod_conf = {
-   .dequeue_depth = port_conf.dequeue_depth,
-   .enqueue_depth = port_conf.enqueue_depth,
-   .new_event_threshold = port_conf.new_event_threshold,
-   };
if (opt->prod_type == EVT_PROD_TYPE_ETH_RX_ADPTR) {
for ( ; port < perf_nb_event_ports(opt); port++) {
struct prod_data *p = &t->prod[port];
p->t = t;
}
 
-   ret = perf_event_rx_adapter_setup(opt, stride, prod_conf);
+   ret = perf_event_rx_adapter_setup(opt, stride, *port_conf);
if (ret)
return ret;
} else {
@@ -352,7 +337,7 @@ perf_event_dev_port_setup(struct evt_test *test, struct 
evt_options *opt,
p->t = t;
 
ret = rte_event_port_setup(opt->dev_id, port,
-   &prod_conf);
+   port_conf);
if (ret) {
evt_err("failed to setup port %d", port);
return ret;
diff --git a/app/test-eventdev/test_perf_common.h 
b/app/test-eventdev/test_perf_common.h
index f8d516ce4..9ad99733b 100644
--- a/app/test-eventdev/test_perf_common.h
+++ b/app/test-eventdev/test_perf_common.h
@@ -133,7 +133,8 @@ int perf_test_se

Re: [dpdk-dev] Dynamic Logging Name Formatting

2018-01-24 Thread Van Haaren, Harry
> From: Olivier Matz [mailto:olivier.m...@6wind.com]
> Sent: Wednesday, January 24, 2018 9:28 AM
> To: Andrew Rybchenko 
> Cc: Richardson, Bruce ; Yigit, Ferruh
> ; Van Haaren, Harry ;
> dev@dpdk.org; Ma, Liang J ; Mccarthy, Peter
> ; santosh.shu...@caviumnetworks.com;
> jerin.ja...@caviumnetworks.com
> Subject: Re: [dpdk-dev] Dynamic Logging Name Formatting
> 
> On Wed, Jan 24, 2018 at 09:32:33AM +0300, Andrew Rybchenko wrote:
> > On 01/23/2018 11:54 PM, Bruce Richardson wrote:
> > > On Tue, Jan 23, 2018 at 05:45:09PM +, Ferruh Yigit wrote:
> > > > On 1/23/2018 5:31 PM, Van Haaren, Harry wrote:
> > > > > === Suggested Consistent Schema ===
> > > > > ===
> > > > >
> > > > > Libraries:
> > > > >lib..specialization
> > > > >
> > > > > Net PMDs:
> > > > >pmd..specialization
> > > > >// one could argue consistency would be pmd.ethdev.
> > > > >// but the change is larger than the value IMO - thoughts?
> > > pmd.net ??
> > >
> > > > > Eventdev PMDs:
> > > > >pmd.eventdev..specialization
> > > > >
> > > > > Crypto PMDs:
> > > > >pmd.cryptodev..specialization
> > > > >
> > > "event" and "crypto" without the "dev" on the end, which would also
> > > match "net" above.
> >
> > I like the scheme suggested by Bruce including pmd.net. Shorter and
> > consistent.
> 
> Agree.

Thanks for feedback all, I'll send a patch to rework the existing names into 
the scheme as converged on here, and document the naming scheme in the docs.


Re: [dpdk-dev] [PATCH v2] examples/flow_filtering: add delay during updating link status

2018-01-24 Thread Xing, Beilei


> -Original Message-
> From: Ori Kam [mailto:or...@mellanox.com]
> Sent: Wednesday, January 24, 2018 4:56 PM
> To: Xing, Beilei 
> Cc: dev@dpdk.org
> Subject: RE: [PATCH v2] examples/flow_filtering: add delay during updating
> link status
> 
> Hi
> 
> 
> > -Original Message-
> > From: Beilei Xing [mailto:beilei.x...@intel.com]
> > Sent: Wednesday, January 24, 2018 10:35 AM
> > To: Ori Kam 
> > Cc: dev@dpdk.org
> > Subject: [PATCH v2] examples/flow_filtering: add delay during updating
> > link status
> >
> > Add up to 9s delay for getting link status to make sure NIC updates
> > link status successfully, just like other applications such as testpmd and
> l2fwd.
> >
> > Signed-off-by: Beilei Xing 
> > ---
> >
> > v2 changes:
> >  - Add rte_delay_ms(CHECK_INTERVAL) which is missed in v1.
> >
> >  examples/flow_filtering/main.c | 13 -
> >  1 file changed, 12 insertions(+), 1 deletion(-)
> >
> > diff --git a/examples/flow_filtering/main.c
> > b/examples/flow_filtering/main.c index 4a07b63..85d5727 100644
> > --- a/examples/flow_filtering/main.c
> > +++ b/examples/flow_filtering/main.c
> > @@ -55,6 +55,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >
> >  static volatile bool force_quit;
> >
> > @@ -119,13 +120,23 @@ main_loop(void)
> > rte_eth_dev_close(port_id);
> >  }
> >
> > +#define CHECK_INTERVAL 1000  /* 100ms */
> > +#define MAX_REPEAT_TIME 90   /* 9s (90 * 100ms) in total */
> 
> I know that in other examples there is use of MAX_REPEAT_TIME but don't
> you think the name is incorrect, It should be called:
> MAX_REPEAT_TIMES or MAX_REPEAT_COUNT?
> Since it doesn't represent time but iterations.
> What do you think?
> 

Make sense, looks like MAX_REPEAT_TIMES should be more accurate than 
MAX_REPEAT_TIME.
Will update in next version.

> > +
> >  static void
> >  assert_link_status(void)
> >  {
> > struct rte_eth_link link;
> > +   uint8_t rep_cnt = MAX_REPEAT_TIME;
> >
> > memset(&link, 0, sizeof(link));
> > -   rte_eth_link_get(port_id, &link);
> > +   do {
> > +   rte_eth_link_get(port_id, &link);
> > +   if (link.link_status == ETH_LINK_UP)
> > +   break;
> > +   rte_delay_ms(CHECK_INTERVAL);
> > +   } while (--rep_cnt);
> > +
> > if (link.link_status == ETH_LINK_DOWN)
> > rte_exit(EXIT_FAILURE, ":: error: link is still down\n");  }
> > --
> > 2.5.5



[dpdk-dev] [PATCH v3] examples/flow_filtering: add delay during updating link status

2018-01-24 Thread Beilei Xing
Add up to 9s delay for getting link status to make sure NIC updates
link status successfully, just like other applications such as
testpmd and l2fwd.

Signed-off-by: Beilei Xing 
---
v3 changes:
 - Modify MAX_REPEAT_TIME with MAX_REPEAT_TIMES
v2 changes:
 - Add rte_delay_ms(CHECK_INTERVAL) which is missed in v1.

 examples/flow_filtering/main.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/examples/flow_filtering/main.c b/examples/flow_filtering/main.c
index 4a07b63..d16a0a5 100644
--- a/examples/flow_filtering/main.c
+++ b/examples/flow_filtering/main.c
@@ -55,6 +55,7 @@
 #include 
 #include 
 #include 
+#include 
 
 static volatile bool force_quit;
 
@@ -119,13 +120,23 @@ main_loop(void)
rte_eth_dev_close(port_id);
 }
 
+#define CHECK_INTERVAL 1000  /* 100ms */
+#define MAX_REPEAT_TIMES 90  /* 9s (90 * 100ms) in total */
+
 static void
 assert_link_status(void)
 {
struct rte_eth_link link;
+   uint8_t rep_cnt = MAX_REPEAT_TIMES;
 
memset(&link, 0, sizeof(link));
-   rte_eth_link_get(port_id, &link);
+   do {
+   rte_eth_link_get(port_id, &link);
+   if (link.link_status == ETH_LINK_UP)
+   break;
+   rte_delay_ms(CHECK_INTERVAL);
+   } while (--rep_cnt);
+
if (link.link_status == ETH_LINK_DOWN)
rte_exit(EXIT_FAILURE, ":: error: link is still down\n");
 }
-- 
2.5.5



[dpdk-dev] [PATCH] net/failsafe: fix Rx burst infinite loop

2018-01-24 Thread Matan Azrad
In case of plugged out device, the fail-safe PMD uses failsafe_rx_burst
function for packet receiving.

This function iterates over the present sub-devices until it
receives a traffic from one of them or they are all cannot receive
packets.

The corrupted code didn't advance the sub-device pointer when the
sub-device was not present and caused to infinite loop.

Advance the sub-device pointer also in plugged-out sub-device case.

Fixes: 8052bbd9d548 ("net/failsafe: improve Rx sub-devices iteration")

Signed-off-by: Matan Azrad 
---
 drivers/net/failsafe/failsafe_rxtx.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/failsafe/failsafe_rxtx.c 
b/drivers/net/failsafe/failsafe_rxtx.c
index 1654494..aeee076 100644
--- a/drivers/net/failsafe/failsafe_rxtx.c
+++ b/drivers/net/failsafe/failsafe_rxtx.c
@@ -104,6 +104,7 @@
do {
if (fs_rx_unsafe(sdev)) {
nb_rx = 0;
+   sdev = sdev->next;
continue;
}
sub_rxq = ETH(sdev)->data->rx_queues[rxq->qid];
-- 
1.8.3.1



Re: [dpdk-dev] [PATCH] doc: document the new devargs syntax

2018-01-24 Thread Thomas Monjalon
24/01/2018 10:28, Yuanhan Liu:
> On Wed, Jan 24, 2018 at 09:19:10AM +0100, Thomas Monjalon wrote:
> > 24/01/2018 07:43, Yuanhan Liu:
> > > On Tue, Jan 23, 2018 at 03:29:34PM +0100, Thomas Monjalon wrote:
> > > > 23/01/2018 13:46, Yuanhan Liu:
> > > > > If port not found, then the whole string will be used for dev 
> > > > > attachment.
> > > > > It means we are attaching a port with PCI BDF == 04.00.0 AND
> > > > > port == 0 (the 2nd port will not be attached).
> > > > > 
> > > > > 
> > > > > And here is how the devargs would look like if "matching;settings" is
> > > > > being used:
> > > > > 
> > > > > 
> > > > > bus=pci,id=04:00.0/class=eth,port=0;bus=pci,id=04:00.0/class=eth,port=0/driver=mlx4,mlx4_arg_A=val,...
> > > > > 
> > > > > The part before ";" will be used for lookup and the later part will be
> > > > > used for attachment. It should work. It just looks redundant.
> > > > 
> > > > It does not have to be redundant.
> > > > It can be:
> > > > 
> > > > bus=pci,id=04:00.0/class=eth,port=0;driver=mlx4,mlx4_arg1=settings1,...
> > > 
> > > I knew you would make such reply :)
> > > Then there is a contradiction. According your suggestion, the "port=0" 
> > > belongs
> > > to the matching section, but it also has to be used in the settings 
> > > section.
> > 
> > If port=0 is matched, it is already set, right?
> 
> Yes.
> 
> > Why it needs to be in settings?
> 
> But I was talking the case it's not matched, say it's not probed and here
> we do hotplug.

I don't understand.
Anyway, the port property should be read-only.
Are we talking about the dev_port from the Linux kernel?


[dpdk-dev] [PATCH] usertools/devbind: fix kernel module reporting

2018-01-24 Thread Anatoly Burakov
lspci reports kernel modules in "Module" string, but devbind
expects it to be "Module_str". Fix it up similar to how we fix
up "Driver" to be "Driver_str".

Cc: sta...@dpdk.org

Signed-off-by: Anatoly Burakov 
---

Notes:
devbind status before changes:

Other Network devices
=
:08:00.0 '82599ES 10-Gigabit SFI/SFP+ Network Connection 10fb' 
unused=igb_uio

devbind status after changes:

Other Network devices
=
:08:00.0 '82599ES 10-Gigabit SFI/SFP+ Network Connection 10fb' 
unused=ixgbe,igb_uio

Note that "ixgbe" driver is now shown as "unused" for unbound device.

Also, no idea if this was ever working to begin with, so no fixline tag.

 usertools/dpdk-devbind.py | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/usertools/dpdk-devbind.py b/usertools/dpdk-devbind.py
index 894b519..6bdb291 100755
--- a/usertools/dpdk-devbind.py
+++ b/usertools/dpdk-devbind.py
@@ -249,6 +249,8 @@ def get_device_details(devices_type):
 # of dictionary key names
 if "Driver" in dev.keys():
 dev["Driver_str"] = dev.pop("Driver")
+if "Module" in dev.keys():
+dev["Module_str"] = dev.pop("Module")
 # use dict to make copy of dev
 devices[dev["Slot"]] = dict(dev)
 # Clear previous device's data
-- 
2.7.4


Re: [dpdk-dev] IXGBE, IOMMU DMAR DRHD handling fault issue

2018-01-24 Thread Burakov, Anatoly

On 23-Jan-18 5:25 PM, Ravi Kerur wrote:

Hi,

I am running into an issue when DPDK is started with iommu on via GRUB
command. Problem is not seen with regular kernel driver, error messages
show when DPDK is started and happens for both PF and VF interfaces.

I am using DPDK 17.05 so the patch proposed in the following link is
available
http://dpdk.org/ml/archives/dev/2017-February/057048.html

Workaround is to use "iommu=pt" but I want iommu enabled in my setup. I
checked BIOS for reserved memory(DMA RMRR for IXGBE) didn't get any details
on it.

Kindly let me know how to resolve this issue.

Following are the details

(1) Linux kernel 4.9
(2) DPDK 17.05

(3) IXGBE details
ethtool -i enp4s0f0  (PF driver)
driver: ixgbe
version: 5.3.3
firmware-version: 0x87b8, 1.1018.0
bus-info: :04:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes

ethtool -i enp4s16f2 (VF driver)
driver: ixgbevf
version: 4.3.2
firmware-version:
bus-info: :04:10.2
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: yes
supports-priv-flags: no

Bus info  Device   Class  Description
=
pci@:01:00.0  ens11f0  network82599ES 10-Gigabit SFI/SFP+
Network Connection
pci@:01:00.1  ens11f1  network82599ES 10-Gigabit SFI/SFP+
Network Connection
pci@:04:00.0  enp4s0f0 network82599ES 10-Gigabit SFI/SFP+
Network Connection
pci@:04:00.1  enp4s0f1 network82599ES 10-Gigabit SFI/SFP+
Network Connection
pci@:04:10.0  enp4s16  networkIllegal Vendor ID
pci@:04:10.2  enp4s16f2networkIllegal Vendor ID

(4) DPDK bind interfaces

# dpdk-devbind -s

Network devices using DPDK-compatible driver

:01:00.0 '82599ES 10-Gigabit SFI/SFP+ Network Connection 10fb'
drv=igb_uio unused=vfio-pci
:04:10.2 '82599 Ethernet Controller Virtual Function 10ed' drv=igb_uio
unused=vfio-pci

Network devices using kernel driver
===
:01:00.1 '82599ES 10-Gigabit SFI/SFP+ Network Connection 10fb'
if=ens11f1 drv=ixgbe unused=igb_uio,vfio-pci
:04:00.0 '82599ES 10-Gigabit SFI/SFP+ Network Connection 10fb'
if=enp4s0f0 drv=ixgbe unused=igb_uio,vfio-pci
:04:00.1 '82599ES 10-Gigabit SFI/SFP+ Network Connection 10fb'
if=enp4s0f1 drv=ixgbe unused=igb_uio,vfio-pci
:04:10.0 '82599 Ethernet Controller Virtual Function 10ed' if=enp4s16
drv=ixgbevf unused=igb_uio,vfio-pci
:06:00.0 'I210 Gigabit Network Connection 1533' if=eno1 drv=igb
unused=igb_uio,vfio-pci *Active*

Other Network devices
=


...

(5) Kernel dmesg

# dmesg | grep -e DMAR
[0.00] ACPI: DMAR 0x7999BAD0 E0 (v01 ALASKA A M I
0001 INTL 20091013)
[0.00] DMAR: IOMMU enabled
[0.518747] DMAR: Host address width 46
[0.526616] DMAR: DRHD base: 0x00fbffc000 flags: 0x0
[0.537447] DMAR: dmar0: reg_base_addr fbffc000 ver 1:0 cap
d2078c106f0466 ecap f020df
[0.553620] DMAR: DRHD base: 0x00c7ffc000 flags: 0x1
[0.564445] DMAR: dmar1: reg_base_addr c7ffc000 ver 1:0 cap
d2078c106f0466 ecap f020df
[0.580611] DMAR: RMRR base: 0x007bbc6000 end: 0x007bbd4fff
[0.593344] DMAR: ATSR flags: 0x0
[0.600178] DMAR: RHSA base: 0x00c7ffc000 proximity domain: 0x0
[0.612905] DMAR: RHSA base: 0x00fbffc000 proximity domain: 0x1
[0.625632] DMAR-IR: IOAPIC id 3 under DRHD base  0xfbffc000 IOMMU 0
[0.638522] DMAR-IR: IOAPIC id 1 under DRHD base  0xc7ffc000 IOMMU 1
[0.651426] DMAR-IR: IOAPIC id 2 under DRHD base  0xc7ffc000 IOMMU 1
[0.664324] DMAR-IR: HPET id 0 under DRHD base 0xc7ffc000
[0.675326] DMAR-IR: Queued invalidation will be enabled to support
x2apic and Intr-remapping.
[0.693805] DMAR-IR: Enabled IRQ remapping in x2apic mode
[9.395170] DMAR: dmar1: Using Queued invalidation
[9.405011] DMAR: Setting RMRR:
[9.412006] DMAR: Setting identity map for device :00:1d.0
[0x7bbc6000 - 0x7bbd4fff]
[9.428569] DMAR: Prepare 0-16MiB unity mapping for LPC
[9.439712] DMAR: Setting identity map for device :00:1f.0 [0x0 -
0xff]
[9.454684] DMAR: Intel(R) Virtualization Technology for Directed I/O
[  287.023068] DMAR: DRHD: handling fault status reg 2
[  287.023073] DMAR: [DMA Read] Request device [01:00.0] fault addr
18a260a000 [fault reason 06] PTE Read access is not set
[  287.023180] DMAR: DRHD: handling fault status reg 102
[  287.023183] DMAR: [DMA Read] Request device [01:00.0] fault addr
18a301 [fault reason 06] PTE Read access is not set
[  287.038250] DMAR: DRHD: handling fault status reg 202
[  287.038252] DMAR: [DMA Read] Request device [01:00.0] fault addr
18a301 [fault reason 06] PTE Read access is not set
[  288.170165] DMAR: DRHD: handling fault status reg 302
[  288.170170] DMAR: [DMA Read] 

Re: [dpdk-dev] [PATCH] net/failsafe: fix Rx burst infinite loop

2018-01-24 Thread Gaëtan Rivet
On Wed, Jan 24, 2018 at 10:19:17AM +, Matan Azrad wrote:
> In case of plugged out device, the fail-safe PMD uses failsafe_rx_burst
> function for packet receiving.
> 
> This function iterates over the present sub-devices until it
> receives a traffic from one of them or they are all cannot receive
> packets.
> 
> The corrupted code didn't advance the sub-device pointer when the
> sub-device was not present and caused to infinite loop.
> 
> Advance the sub-device pointer also in plugged-out sub-device case.
> 
> Fixes: 8052bbd9d548 ("net/failsafe: improve Rx sub-devices iteration")
> 
> Signed-off-by: Matan Azrad 
Acked-by: Gaetan Rivet 
> ---
>  drivers/net/failsafe/failsafe_rxtx.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/net/failsafe/failsafe_rxtx.c 
> b/drivers/net/failsafe/failsafe_rxtx.c
> index 1654494..aeee076 100644
> --- a/drivers/net/failsafe/failsafe_rxtx.c
> +++ b/drivers/net/failsafe/failsafe_rxtx.c
> @@ -104,6 +104,7 @@
>   do {
>   if (fs_rx_unsafe(sdev)) {
>   nb_rx = 0;
> + sdev = sdev->next;
>   continue;
>   }
>   sub_rxq = ETH(sdev)->data->rx_queues[rxq->qid];
> -- 
> 1.8.3.1
> 

-- 
Gaëtan Rivet
6WIND


Re: [dpdk-dev] [PATCH] doc: document the new devargs syntax

2018-01-24 Thread Yuanhan Liu
On Wed, Jan 24, 2018 at 11:21:44AM +0100, Thomas Monjalon wrote:
> 24/01/2018 10:28, Yuanhan Liu:
> > On Wed, Jan 24, 2018 at 09:19:10AM +0100, Thomas Monjalon wrote:
> > > 24/01/2018 07:43, Yuanhan Liu:
> > > > On Tue, Jan 23, 2018 at 03:29:34PM +0100, Thomas Monjalon wrote:
> > > > > 23/01/2018 13:46, Yuanhan Liu:
> > > > > > If port not found, then the whole string will be used for dev 
> > > > > > attachment.
> > > > > > It means we are attaching a port with PCI BDF == 04.00.0 AND
> > > > > > port == 0 (the 2nd port will not be attached).
> > > > > > 
> > > > > > 
> > > > > > And here is how the devargs would look like if "matching;settings" 
> > > > > > is
> > > > > > being used:
> > > > > > 
> > > > > > 
> > > > > > bus=pci,id=04:00.0/class=eth,port=0;bus=pci,id=04:00.0/class=eth,port=0/driver=mlx4,mlx4_arg_A=val,...
> > > > > > 
> > > > > > The part before ";" will be used for lookup and the later part will 
> > > > > > be
> > > > > > used for attachment. It should work. It just looks redundant.
> > > > > 
> > > > > It does not have to be redundant.
> > > > > It can be:
> > > > >   
> > > > > bus=pci,id=04:00.0/class=eth,port=0;driver=mlx4,mlx4_arg1=settings1,...
> > > > 
> > > > I knew you would make such reply :)
> > > > Then there is a contradiction. According your suggestion, the "port=0" 
> > > > belongs
> > > > to the matching section, but it also has to be used in the settings 
> > > > section.
> > > 
> > > If port=0 is matched, it is already set, right?
> > 
> > Yes.
> > 
> > > Why it needs to be in settings?
> > 
> > But I was talking the case it's not matched, say it's not probed and here
> > we do hotplug.
> 
> I don't understand.
> Anyway, the port property should be read-only.

All proberties should be read-only.

> Are we talking about the dev_port from the Linux kernel?

Yes. And it can be used for probing one port only (out of 2 ports in a NIC)
at probe stage. So, at this stage, it's a setting but not a match.

--yliu


Re: [dpdk-dev] [PATCH] doc: document the new devargs syntax

2018-01-24 Thread Thomas Monjalon
24/01/2018 11:36, Yuanhan Liu:
> On Wed, Jan 24, 2018 at 11:21:44AM +0100, Thomas Monjalon wrote:
> > 24/01/2018 10:28, Yuanhan Liu:
> > > On Wed, Jan 24, 2018 at 09:19:10AM +0100, Thomas Monjalon wrote:
> > > > 24/01/2018 07:43, Yuanhan Liu:
> > > > > On Tue, Jan 23, 2018 at 03:29:34PM +0100, Thomas Monjalon wrote:
> > > > > > 23/01/2018 13:46, Yuanhan Liu:
> > > > > > > If port not found, then the whole string will be used for dev 
> > > > > > > attachment.
> > > > > > > It means we are attaching a port with PCI BDF == 04.00.0 AND
> > > > > > > port == 0 (the 2nd port will not be attached).
> > > > > > > 
> > > > > > > 
> > > > > > > And here is how the devargs would look like if 
> > > > > > > "matching;settings" is
> > > > > > > being used:
> > > > > > > 
> > > > > > > 
> > > > > > > bus=pci,id=04:00.0/class=eth,port=0;bus=pci,id=04:00.0/class=eth,port=0/driver=mlx4,mlx4_arg_A=val,...
> > > > > > > 
> > > > > > > The part before ";" will be used for lookup and the later part 
> > > > > > > will be
> > > > > > > used for attachment. It should work. It just looks redundant.
> > > > > > 
> > > > > > It does not have to be redundant.
> > > > > > It can be:
> > > > > > 
> > > > > > bus=pci,id=04:00.0/class=eth,port=0;driver=mlx4,mlx4_arg1=settings1,...
> > > > > 
> > > > > I knew you would make such reply :)
> > > > > Then there is a contradiction. According your suggestion, the 
> > > > > "port=0" belongs
> > > > > to the matching section, but it also has to be used in the settings 
> > > > > section.
> > > > 
> > > > If port=0 is matched, it is already set, right?
> > > 
> > > Yes.
> > > 
> > > > Why it needs to be in settings?
> > > 
> > > But I was talking the case it's not matched, say it's not probed and here
> > > we do hotplug.
> > 
> > I don't understand.
> > Anyway, the port property should be read-only.
> 
> All proberties should be read-only.
> 
> > Are we talking about the dev_port from the Linux kernel?
> 
> Yes. And it can be used for probing one port only (out of 2 ports in a NIC)
> at probe stage. So, at this stage, it's a setting but not a match.

No it's a match!

A settings is changing data in the port.



Re: [dpdk-dev] [PATCH v1] net/tap: use local eBPF definitions

2018-01-24 Thread Van Haaren, Harry
> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Ophir Munk
> Sent: Tuesday, January 23, 2018 9:54 PM
> To: dev@dpdk.org; Pascal Mazon 
> Cc: Thomas Monjalon ; Olga Shern ;
> Ophir Munk 
> Subject: [dpdk-dev] [PATCH v1] net/tap: use local eBPF definitions
> 
> eBPF has a graceful approach: it must successfully compile on all Linux
> distributions. If a specific kernel cannot support eBPF it will gracefully
> refuse the eBPF netlink message sent to it.
> The kernel header file linux/bpf.h (if present) on different Linux
> distributions may not include all definitions required for TAP
> compilation.
> In order to guarantee a successful eBPF compilation everywhere all the
> required definitions for TAP have been locally added instead of including
> file 
> 
> Signed-off-by: Ophir Munk 

Tested on a Fedora 20 vm, uname -r = 3.15.6-200.fc20.x86_64

Confirmed before patch was failing, with patch build is fixed.

Tested-by: Harry van Haaren 


[dpdk-dev] [PATCH v1] net/vdev_netvsc: remove CFLAGS -std=c11 and -pedantic

2018-01-24 Thread Ophir Munk
In order to guarantee a successful vdev_netvsc compilation on
old Linux distributions remove CFLAGS -std=c11 and -pedantic
Otherwise old GCC compilers may complain as follows:
cc1: error: unrecognized command line option -std=c11

Signed-off-by: Ophir Munk 
---
 drivers/net/vdev_netvsc/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/vdev_netvsc/Makefile b/drivers/net/vdev_netvsc/Makefile
index f2b2ac5..45351b8 100644
--- a/drivers/net/vdev_netvsc/Makefile
+++ b/drivers/net/vdev_netvsc/Makefile
@@ -12,7 +12,7 @@ EXPORT_MAP := rte_pmd_vdev_netvsc_version.map
 # Additional compilation flags.
 CFLAGS += -O3
 CFLAGS += -g
-CFLAGS += -std=c11 -pedantic -Wall -Wextra
+CFLAGS += -Wall -Wextra
 CFLAGS += -D_XOPEN_SOURCE=600
 CFLAGS += -D_BSD_SOURCE
 CFLAGS += -D_DEFAULT_SOURCE
-- 
2.7.4



[dpdk-dev] [pull-request] dpdk-next-build 18.02-rc2

2018-01-24 Thread Bruce Richardson
Hi Thomas,

the new build system for DPDK using meson and ninja is now ready for merge
as EXPERIMENTAL into the 18.02 release.

Regards,
/Bruce

The following changes since commit bf375b4d51170cd58ca50c646781cb6de17766ea:

  version: 18.02-rc1 (2018-01-22 01:59:14 +0100)

are available in the Git repository at:

  http://dpdk.org/git/draft/dpdk-next-build 

for you to fetch changes up to 3b018b348de769a7414929b598552a12555a0e91:

  doc: add release note entry for meson build (2018-01-23 17:18:51 +)


Bruce Richardson (44):
  build: add infrastructure for meson and ninja builds
  eal: add eal library to meson build
  igb_uio: add uio kmod to meson build
  build: add DPDK libraries to build
  build: add buildtools to meson build
  build: add infrastructure for building PMDs
  drivers/bus: add PCI bus driver to meson build
  drivers/bus: add vdev bus driver to meson build
  drivers/mempool: add SW mempool drivers to meson build
  drivers/crypto: add crypto drv class and null PMD to meson
  crypto/openssl: add driver to meson build
  crypto/qat: add driver to meson build
  drivers/net: add net driver support to meson build
  drivers/net: add set of vdev PMDs to build
  drivers/net: add drivers for Intel NICs to meson build
  app/testpmd: add testpmd to meson build
  usertools: add usertools installation to meson build
  build: add option to version libs using DPDK version
  doc: add documentation on how to add new components to DPDK
  build: sort meson options alphabetically
  build: fix driver install path
  build/x86: add SSE cpuflags
  eal/bsdapp: add FreeBSD module compilation to meson build
  build: add maths library to libs in pkg-config file
  build: add detection and use of libnuma
  lpm: install vector header files
  drivers/event: add skeleton and SW eventdevs to meson build
  net/bonding: add to meson build
  examples: allow building examples as part of a meson build
  examples: put app name and sources at top of makefiles
  examples: use pkg-config info when building examples
  build: remove library special cases
  eal: fix list of source files to meson build
  build: build all libs and drivers as both static and shared
  build: change default library type to static
  build: symlink drivers to library directory
  examples: enable linking examples both static and shared
  build: replace license text with SPDX ids in meson files
  build: remove architecture flag as default C flag
  net/pcap: fix cross compilation
  doc: add instructions on build using meson
  build: add support for ARM builds
  build: make compat a universal dependency
  doc: add release note entry for meson build

Harry van Haaren (1):
  test/test: add test app to build as dpdk-test

Laatz, Kevin (1):
  test/test: add additional test cases to meson build

Luca Boccassi (3):
  build: rename pkgconfig to libdpdk.pc
  build: add optional arch-specific headers install path
  build: don't hard-code generic/exec-env install path

Pavan Nikhilesh (7):
  drivers/mempool: add octeontx mempool driver to meson build
  drivers/net: add drivers for Cavium NICs to meson build
  event/octeontx: add to meson build
  app/test-eventdev: add test-eventdev to meson build
  build: add support for detecting micro-arch on ARM
  build: add support for vendor specific ARM cross builds
  doc: add instructions to cross compile using meson

Timothy M. Redaelli (1):
  app/testpmd: fix relative rpath for drivers

 app/meson.build|   5 +
 app/test-eventdev/meson.build  |  28 +++
 app/test-pmd/meson.build   |  52 +
 buildtools/gen-pmdinfo-cfile.sh|  12 +
 buildtools/meson.build |   6 +
 buildtools/pmdinfogen/meson.build  |   9 +
 buildtools/symlink-drivers-solibs.sh   |  12 +
 config/arm/arm64_armv8_linuxapp_gcc|  10 +
 config/arm/arm64_thunderx_linuxapp_gcc |  13 ++
 config/arm/armv8_machine.py|  18 ++
 config/arm/meson.build | 118 ++
 config/meson.build |  77 +++
 config/rte_config.h|  97 
 config/x86/meson.build |  52 +
 doc/build-sdk-meson.txt| 203 
 doc/guides/contributing/coding_style.rst   | 223 ++
 doc/guides/rel_notes/release_18_02.rst |  14 ++
 drivers/bus/meson.build|   7 +
 drivers/bus/pci/meson.build|  16 ++
 drivers/bus/vdev/meson.build   |   5 +
 drivers/crypto

Re: [dpdk-dev] [PATCH v3] examples/flow_filtering: add delay during updating link status

2018-01-24 Thread Ori Kam


> -Original Message-
> From: Beilei Xing [mailto:beilei.x...@intel.com]
> Sent: Wednesday, January 24, 2018 12:16 PM
> To: Ori Kam 
> Cc: dev@dpdk.org
> Subject: [PATCH v3] examples/flow_filtering: add delay during updating link
> status
> 
> Add up to 9s delay for getting link status to make sure NIC updates
> link status successfully, just like other applications such as
> testpmd and l2fwd.
> 
> Signed-off-by: Beilei Xing 
> ---
> v3 changes:
>  - Modify MAX_REPEAT_TIME with MAX_REPEAT_TIMES
> v2 changes:
>  - Add rte_delay_ms(CHECK_INTERVAL) which is missed in v1.
> 
>  examples/flow_filtering/main.c | 13 -
>  1 file changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/examples/flow_filtering/main.c b/examples/flow_filtering/main.c
> index 4a07b63..d16a0a5 100644
> --- a/examples/flow_filtering/main.c
> +++ b/examples/flow_filtering/main.c
> @@ -55,6 +55,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
> 
>  static volatile bool force_quit;
> 
> @@ -119,13 +120,23 @@ main_loop(void)
>   rte_eth_dev_close(port_id);
>  }
> 
> +#define CHECK_INTERVAL 1000  /* 100ms */
> +#define MAX_REPEAT_TIMES 90  /* 9s (90 * 100ms) in total */
> +
>  static void
>  assert_link_status(void)
>  {
>   struct rte_eth_link link;
> + uint8_t rep_cnt = MAX_REPEAT_TIMES;
> 
>   memset(&link, 0, sizeof(link));
> - rte_eth_link_get(port_id, &link);
> + do {
> + rte_eth_link_get(port_id, &link);
> + if (link.link_status == ETH_LINK_UP)
> + break;
> + rte_delay_ms(CHECK_INTERVAL);
> + } while (--rep_cnt);
> +
>   if (link.link_status == ETH_LINK_DOWN)
>   rte_exit(EXIT_FAILURE, ":: error: link is still down\n");
>  }
> --
> 2.5.5

Acked-by: Ori Kam 

Thanks,
Ori


[dpdk-dev] [PATCH] app/testpmd: move variables definition in source

2018-01-24 Thread Georgios P. Katsikas
From: Georgios Katsikas 

This patch moves the definition of 3 variables in testpmd.h
into the respective .c file. The idea behind this move is
to allow external applications to compile against testpmd
without throwing compilation errors related to multiple
definition of variables.

Also, an extern dcb_q_mapping in testpmd.h is removed
since it appears that this variable is not defined
elsewhere in the tree.

Signed-off-by: Georgios Katsikas 
---
 app/test-pmd/testpmd.c | 18 ++
 app/test-pmd/testpmd.h |  7 +++
 2 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 5dc8cca..02e8787 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -91,6 +91,24 @@ uint8_t socket_num = UMA_NO_CONFIG;
 uint8_t mp_anon = 0;
 
 /*
+ * Store specified sockets on which memory pool to be used by ports
+ * is allocated.
+ */
+uint8_t port_numa[RTE_MAX_ETHPORTS];
+
+/*
+ * Store specified sockets on which RX ring to be used by ports
+ * is allocated.
+ */
+uint8_t rxring_numa[RTE_MAX_ETHPORTS];
+
+/*
+ * Store specified sockets on which TX ring to be used by ports
+ * is allocated.
+ */
+uint8_t txring_numa[RTE_MAX_ETHPORTS];
+
+/*
  * Record the Ethernet address of peer target ports to which packets are
  * forwarded.
  * Must be instantiated with the ethernet addresses of peer traffic generator
diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
index 47f8fa8..153abea 100644
--- a/app/test-pmd/testpmd.h
+++ b/app/test-pmd/testpmd.h
@@ -329,19 +329,19 @@ extern uint32_t bypass_timeout; /**< Store the NIC bypass 
watchdog timeout */
  * Store specified sockets on which memory pool to be used by ports
  * is allocated.
  */
-uint8_t port_numa[RTE_MAX_ETHPORTS];
+extern uint8_t port_numa[RTE_MAX_ETHPORTS];
 
 /*
  * Store specified sockets on which RX ring to be used by ports
  * is allocated.
  */
-uint8_t rxring_numa[RTE_MAX_ETHPORTS];
+extern uint8_t rxring_numa[RTE_MAX_ETHPORTS];
 
 /*
  * Store specified sockets on which TX ring to be used by ports
  * is allocated.
  */
-uint8_t txring_numa[RTE_MAX_ETHPORTS];
+extern uint8_t txring_numa[RTE_MAX_ETHPORTS];
 
 extern uint8_t socket_num;
 
@@ -384,7 +384,6 @@ extern int16_t tx_rs_thresh;
 
 extern uint8_t dcb_config;
 extern uint8_t dcb_test;
-extern enum dcb_queue_mapping_mode dcb_q_mapping;
 
 extern uint16_t mbuf_data_size; /**< Mbuf data space size. */
 extern uint32_t param_total_num_mbufs;
-- 
2.7.4



Re: [dpdk-dev] [PATCH v2] keepalive: fix keepalive state alignment

2018-01-24 Thread Remy Horton

Done quick smoke test and it all seems fine :)

On 23/01/2018 15:43, Andriy Berestovskyy wrote:
[..]

Fixes: e70a61ad50ab ("keepalive: export states")
Cc: remy.hor...@intel.com
Signed-off-by: Andriy Berestovskyy 


Acked-by: Remy Horton 


Re: [dpdk-dev] [PATCH 1/3] app/testpmd: Moved cmdline_flow to librte_cmdline

2018-01-24 Thread george . dit
Hi again,

I decided to simplify things for now, hence sent a new minimal patch
only for testpmd (not librte_cmdline), which allows external
applications to compile against it without errors.

Thanks again for you time,
Georgios

On Tue, Jan 16, 2018 at 6:54 PM, Adrien Mazarguil
 wrote:
> On Tue, Jan 16, 2018 at 03:54:57PM +0100, george@gmail.com wrote:
>> Hi Adrien,
>>
>> Thanks for your insights and sorry for omitting the cover letter (this is
>> my first patch in DPDK).
>
> No problem, don't worry about that. Remember to put as much context as close
> as possible to the related changes. The commit log of a patch is in fact a
> more appropriate place since a cover letter is simply a summary of a
> subsequent series and is discarded once applied.
>
>> I understand your concerns. The reason I proposed this patch is to
>> facilitate a more high level vendor agnostic API for configuring and
>> monitoring DPDK-based NICs.
>>
>> To avoid just copying thousands of lines of code into my application, do
>> you think it is feasible to move at least the components (struct context,
>> struct arg, struct token and the parse_* helpers) you mentioned and
>> restructure testpmd in a way that allows applications to re-use all of its
>> parsing features?
>
> Yes I think it's feasible, although at the cost of great effort because
> you'd need to untangle engine code from parser code to expose the former,
> the flow command being a mix of both. Proper APIs must be devised for that,
> hence my question: is it really worth the trouble?
>
> Other contributors already asked me how they could reuse the flow command
> parser to implement similar testpmd commands without copy/pasting the entire
> file, so I'm already convinced separating at least the engine part makes
> sense at the testpmd level. Moving it to librte_cmdline for external
> applications seems more complex though.
>
>> I could give it a try and come back with a new patch, otherwise I am
>> perfectly ok if you want to do it instead.
>
> While I'd certainly like to do it (at least at the testpmd level), it's
> unlikely to happen anytime soon due to other priorities.
>
> Feel free to take care of it if you're motivated enough, just keep in mind
> right now I don't think this should be exposed as a public API. I can change
> my mind if you manage to make it generic enough.
>
>> On Tue, Jan 16, 2018 at 3:31 PM, Adrien Mazarguil <
>> adrien.mazarg...@6wind.com> wrote:
>>
>> > George,
>> >
>> > I missed the original RFC thread [1][2] (you should have used it as a cover
>> > letter for this series BTW) please see below.
>> >
>> > On Tue, Jan 16, 2018 at 10:24:25AM +0100, Olivier Matz wrote:
>> > > Hi,
>> > >
>> > > > On Tue, Jan 16, 2018 at 9:40 AM Olivier Matz 
>> > wrote:
>> > > >
>> > > > On Tue, Jan 16, 2018 at 08:45:32AM +, george@gmail.com wrote:
>> > > > > Hi Georgios,
>> > > > >
>> > > > > On Mon, Jan 15, 2018 at 01:30:35AM +, Lu, Wenzhuo wrote:
>> > > > > > Hi,
>> > > > > >
>> > > > > > > -Original Message-
>> > > > > > > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Georgios
>> > Katsikas
>> > > > > > > Sent: Saturday, January 13, 2018 5:01 AM
>> > > > > > > To: olivier.m...@6wind.com
>> > > > > > > Cc: dev@dpdk.org; Georgios Katsikas 
>> > > > > > > Subject: [dpdk-dev] [PATCH 1/3] app/testpmd: Moved cmdline_flow
>> > to
>> > > > > > > librte_cmdline
>> > > > > > >
>> > > > > > > Signed-off-by: Georgios Katsikas 
>> > > > > > Looks like a good idea to move this code to the lib.
>> > > > > > cc Adrien the author of this file, app/test-pmd/cmdline_flow.c.
>> > > > >
>> > > > > If the command line parsing of rte_flow is something that has some
>> > > > > chances to be shared among multiple applications, I agree it makes
>> > sense
>> > > > > to move it in a library.
>> > > > >
>> > > > > However, my opinion is that it would be better to have a specific
>> > > > > library for it, like librte_flow_cmdline, because I'm not sure that
>> > > > > people linking with librte_cmdline always want to pull the rte_flow
>> > > > > parsing code.
>> >
>> > In my opinion the entire flow command parser has nothing to do outside
>> > testpmd, it's way too specific, even if another application finds it
>> > useful.
>> >
>> > Code duplication being a bad thing, your application could as well compile
>> > or even #include app/test-pmd/cmdline_flow.c directly (not pretty, I know)
>> > since it would have the same internal layout as testpmd. Testpmd's Makefile
>> > could be modified to spit it out as a separate library if needed.
>> >
>> > What could make more sense would be to extract the parser engine for
>> > librte_cmdline's dynamic tokens (e.g. struct context, struct arg, struct
>> > token) and possibly various generic helpers (e.g. parse_string,
>> > parse_mac_addr), but not enum index nor token_list[] and friends which
>> > define the layout of testpmd's flow command.
>> >
>> > This would enable other flow-like commands without duplicating the engine
>

[dpdk-dev] [PATCH v1] doc: fix bbdev test guide build

2018-01-24 Thread Marko Kovacevic
Some indentations in the bbdev test application
were causing build failures. Which are solved by this
commit.

Fixes: f714a18885a6 ("app/testbbdev: add test application for bbdev")
Cc: amr.mokh...@intel.com

Signed-off-by: Marko Kovacevic 
---
 doc/guides/tools/testbbdev.rst | 18 ++
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/doc/guides/tools/testbbdev.rst b/doc/guides/tools/testbbdev.rst
index c7aac49..5c7112d 100644
--- a/doc/guides/tools/testbbdev.rst
+++ b/doc/guides/tools/testbbdev.rst
@@ -71,24 +71,26 @@ The following are the command-line options:
  Defines test cases to run. If not specified all available tests are run.
 
  The following tests can be run:
-  * unittest
+
+ * unittest
  Small unit tests witch check basic functionality of bbdev library.
-  * latency
+ * latency
  Test calculates three latency metrics:
-  * offload_latency_tc
+
+ * offload_latency_tc
  measures the cost of offloading enqueue and dequeue operations.
-  * offload_latency_empty_q_tc
+ * offload_latency_empty_q_tc
  measures the cost of offloading a dequeue operation from an empty 
queue.
  checks how long last dequeueing if there is no operations to dequeue
-  * operation_latency_tc
+ * operation_latency_tc
  measures the time difference from the first attempt to enqueue till 
the
  first successful dequeue.
-  * validation
+ * validation
  Test do enqueue on given vector and compare output after dequeueing.
-  * throughput
+ * throughput
  Test measures the achieved throughput on the available lcores.
  Results are printed in million operations per second and million bits per 
second.
-  * interrupt
+ * interrupt
  The same test as 'throughput' but uses interrupts instead of PMD to 
perform
  the dequeue.
 
-- 
2.9.5



[dpdk-dev] [PATCH] eal/service: add routine to release memory.

2018-01-24 Thread Vipin Varghese
The routine rte_service_finalize cehcks if service is initialized,
if yes releases the internal meory for services and lcore states.

This routine is to be invoked at end of application termiantion.

Signed-off-by: Vipin Varghese 
---
 lib/librte_eal/common/include/rte_service.h | 12 
 lib/librte_eal/common/rte_service.c | 13 +
 lib/librte_eal/rte_eal_version.map  |  1 +
 3 files changed, 26 insertions(+)

diff --git a/lib/librte_eal/common/include/rte_service.h 
b/lib/librte_eal/common/include/rte_service.h
index 02b1512..1b2b8e7 100644
--- a/lib/librte_eal/common/include/rte_service.h
+++ b/lib/librte_eal/common/include/rte_service.h
@@ -429,6 +429,18 @@ int32_t rte_service_attr_get(uint32_t id, uint32_t attr_id,
  */
 int32_t rte_service_attr_reset_all(uint32_t id);
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice
+ *
+ * Free up the memory that has been initialized.
+ *
+ * This routine is to be invoked prior to process termination.
+ *
+ * @retval None
+ */
+void rte_service_finalize(void);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_eal/common/rte_service.c 
b/lib/librte_eal/common/rte_service.c
index 5f97d85..5133c98 100644
--- a/lib/librte_eal/common/rte_service.c
+++ b/lib/librte_eal/common/rte_service.c
@@ -108,6 +108,19 @@ int32_t rte_service_init(void)
return 0;
 }
 
+void rte_service_finalize(void)
+{
+   if (rte_service_library_initialized) {
+   if (rte_services)
+   rte_free(rte_services);
+   if (lcore_states)
+   rte_free(lcore_states);
+
+   rte_service_library_initialized = 0;
+   }
+}
+
+
 /* returns 1 if service is registered and has not been unregistered
  * Returns 0 if service never registered, or has been unregistered
  */
diff --git a/lib/librte_eal/rte_eal_version.map 
b/lib/librte_eal/rte_eal_version.map
index 7088b72..24d1ca7 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -245,5 +245,6 @@ EXPERIMENTAL {
rte_service_set_runstate_mapped_check;
rte_service_set_stats_enable;
rte_service_start_with_defaults;
+   rte_service_finalize;
 
 } DPDK_18.02;
-- 
1.9.1



Re: [dpdk-dev] [PATCH v1] net/vdev_netvsc: remove CFLAGS -std=c11 and -pedantic

2018-01-24 Thread Matan Azrad
Hi Ophir

From: Ophir Munk, Wednesday, January 24, 2018 1:20 PM
> In order to guarantee a successful vdev_netvsc compilation on old Linux
> distributions remove CFLAGS -std=c11 and -pedantic Otherwise old GCC
> compilers may complain as follows:
> cc1: error: unrecognized command line option -std=c11
> 
> Signed-off-by: Ophir Munk 
> ---
>  drivers/net/vdev_netvsc/Makefile | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/vdev_netvsc/Makefile
> b/drivers/net/vdev_netvsc/Makefile
> index f2b2ac5..45351b8 100644
> --- a/drivers/net/vdev_netvsc/Makefile
> +++ b/drivers/net/vdev_netvsc/Makefile
> @@ -12,7 +12,7 @@ EXPORT_MAP := rte_pmd_vdev_netvsc_version.map  #
> Additional compilation flags.
>  CFLAGS += -O3
>  CFLAGS += -g
> -CFLAGS += -std=c11 -pedantic -Wall -Wextra
> +CFLAGS += -Wall -Wextra
>  CFLAGS += -D_XOPEN_SOURCE=600
>  CFLAGS += -D_BSD_SOURCE
>  CFLAGS += -D_DEFAULT_SOURCE
> --
> 2.7.4

This patch is a fix for compilation issue so it should be with fix title.
Suggestion:
net/vdev_netvsc: fix build using C11 mode and pedantic

Also "Fixes" line should be added.

Thanks,
Matan.


[dpdk-dev] [PATCH] usertools/devbind: remove unused code

2018-01-24 Thread Anatoly Burakov
Cc: sta...@dpdk.org

Signed-off-by: Anatoly Burakov 
---
 usertools/dpdk-devbind.py | 33 -
 1 file changed, 33 deletions(-)

diff --git a/usertools/dpdk-devbind.py b/usertools/dpdk-devbind.py
index 894b519..955e984 100755
--- a/usertools/dpdk-devbind.py
+++ b/usertools/dpdk-devbind.py
@@ -120,39 +120,6 @@ def check_output(args, stderr=None):
 stderr=stderr).communicate()[0]
 
 
-def find_module(mod):
-'''find the .ko file for kernel module named mod.
-Searches the $RTE_SDK/$RTE_TARGET directory, the kernel
-modules directory and finally under the parent directory of
-the script '''
-# check $RTE_SDK/$RTE_TARGET directory
-if 'RTE_SDK' in os.environ and 'RTE_TARGET' in os.environ:
-path = "%s/%s/kmod/%s.ko" % (os.environ['RTE_SDK'],
- os.environ['RTE_TARGET'], mod)
-if exists(path):
-return path
-
-# check using depmod
-try:
-with open(os.devnull, "w") as fnull:
-path = check_output(["modinfo", "-n", mod], stderr=fnull).strip()
-
-if path and exists(path):
-return path
-except:  # if modinfo can't find module, it fails, so continue
-pass
-
-# check for a copy based off current path
-tools_dir = dirname(abspath(sys.argv[0]))
-if tools_dir.endswith("tools"):
-base_dir = dirname(tools_dir)
-find_out = check_output(["find", base_dir, "-name", mod + ".ko"])
-if len(find_out) > 0:  # something matched
-path = find_out.splitlines()[0]
-if exists(path):
-return path
-
-
 def check_modules():
 '''Checks that igb_uio is loaded'''
 global dpdk_drivers
-- 
2.7.4


[dpdk-dev] [PATCH v2] net/bonding: check dequeue result before proceeding

2018-01-24 Thread Radu Nicolau
Fixes: 09150784a776 ("net/bonding: burst mode hash calculation")
Coverity issue: 257015
Cc: sta...@dpdk.org

Signed-off-by: Radu Nicolau 
---
v2: updated commit message

 drivers/net/bonding/rte_eth_bond_pmd.c | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c 
b/drivers/net/bonding/rte_eth_bond_pmd.c
index 158f3aa..cc06ff4 100644
--- a/drivers/net/bonding/rte_eth_bond_pmd.c
+++ b/drivers/net/bonding/rte_eth_bond_pmd.c
@@ -1435,17 +1435,17 @@ bond_ethdev_tx_burst_8023ad(void *queue, struct 
rte_mbuf **bufs,
if (likely(rte_ring_empty(port->tx_ring)))
continue;
 
-   rte_ring_dequeue(port->tx_ring, (void **)&ctrl_pkt);
-
-   slave_tx_count = rte_eth_tx_burst(slave_port_ids[i],
+   if (rte_ring_dequeue(port->tx_ring,
+(void **)&ctrl_pkt) != -ENOENT) {
+   slave_tx_count = rte_eth_tx_burst(slave_port_ids[i],
bd_tx_q->queue_id, &ctrl_pkt, 1);
-
-   /*
-* re-enqueue LAG control plane packets to buffering
-* ring if transmission fails so the packet isn't lost.
-*/
-   if (slave_tx_count != 1)
-   rte_ring_enqueue(port->tx_ring, ctrl_pkt);
+   /*
+* re-enqueue LAG control plane packets to buffering
+* ring if transmission fails so the packet isn't lost.
+*/
+   if (slave_tx_count != 1)
+   rte_ring_enqueue(port->tx_ring, ctrl_pkt);
+   }
}
 
return total_tx_count;
-- 
2.7.5



Re: [dpdk-dev] [PATCH] eal/service: add routine to release memory.

2018-01-24 Thread Van Haaren, Harry
> From: Varghese, Vipin
> Sent: Wednesday, January 24, 2018 7:03 AM
> To: Van Haaren, Harry ; dev@dpdk.org
> Cc: Jain, Deepak K ; Mcnamara, John
> ; sta...@dpdk.org; Patel, Amol
> ; Varghese, Vipin 
> Subject: [PATCH] eal/service: add routine to release memory.
> 
> The routine rte_service_finalize cehcks if service is initialized,
> if yes releases the internal meory for services and lcore states.
> 
> This routine is to be invoked at end of application termiantion.
> 
> Signed-off-by: Vipin Varghese 

Idea looks good to me - two notes below. Check for typos in commit message too 
:)

For v2, 

Acked-by: Harry van Haaren 




> diff --git a/lib/librte_eal/common/rte_service.c
> b/lib/librte_eal/common/rte_service.c
> index 5f97d85..5133c98 100644
> --- a/lib/librte_eal/common/rte_service.c
> +++ b/lib/librte_eal/common/rte_service.c
> @@ -108,6 +108,19 @@ int32_t rte_service_init(void)
>   return 0;
>  }
> 
> +void rte_service_finalize(void)
> +{
> + if (rte_service_library_initialized) {
> + if (rte_services)
> + rte_free(rte_services);
> + if (lcore_states)
> + rte_free(lcore_states);
> +
> + rte_service_library_initialized = 0;
> + }
> +}


I find the following implementation cleaner, less if()-ed code?

void rte_service_finalize(void)
{
if (!rte_service_library_initialized)
return;

if (rte_services)
rte_free(rte_services);
if (lcore_states)
rte_free(lcore_states);

rte_service_library_initialized = 0;
}


> +
> +
>  /* returns 1 if service is registered and has not been unregistered
>   * Returns 0 if service never registered, or has been unregistered
>   */
> diff --git a/lib/librte_eal/rte_eal_version.map
> b/lib/librte_eal/rte_eal_version.map
> index 7088b72..24d1ca7 100644
> --- a/lib/librte_eal/rte_eal_version.map
> +++ b/lib/librte_eal/rte_eal_version.map
> @@ -245,5 +245,6 @@ EXPERIMENTAL {
>   rte_service_set_runstate_mapped_check;
>   rte_service_set_stats_enable;
>   rte_service_start_with_defaults;
> + rte_service_finalize;
> 
>  } DPDK_18.02;

The version.map file functions should be in alphabetical order;
In this case, add between dump() and get_by_id()

rte_service_dump;
+   rte_service_finalize;
rte_service_get_by_id;


[dpdk-dev] [PATCH] app/procinfo: fix memory leak by rte_service_init

2018-01-24 Thread Vipin Varghese
When procinfo is run multiple times against primary application,
it consumes huge page memory by rte_service_init. Which is not
released at exit of application.

Invoking rte_service_finalize to free memory and prevent memory leak.

Signed-off-by: Vipin Varghese 
---
 app/proc_info/main.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/app/proc_info/main.c b/app/proc_info/main.c
index 94d53f5..1884e06 100644
--- a/app/proc_info/main.c
+++ b/app/proc_info/main.c
@@ -29,6 +29,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /* Maximum long option length for option parsing. */
 #define MAX_LONG_OPT_SZ 64
@@ -660,5 +661,7 @@ static void collectd_resolve_cnt_type(char *cnt_type, 
size_t cnt_type_len,
if (enable_metrics)
metrics_display(RTE_METRICS_GLOBAL);
 
+   rte_service_finalize();
+
return 0;
 }
-- 
1.9.1



[dpdk-dev] [PATCH] examples/ip_pipeline: update copyright and license

2018-01-24 Thread Lee Daly
This updates the Intel and Oliver Matz licenses on files in examples
to be the standard BSD-3-Clause license used for the rest of DPDK,
bringing the files in compliance with the DPDK licensing policy.

Signed-off-by: Lee Daly 
---
 examples/ip_pipeline/parser.c | 56 +++
 1 file changed, 3 insertions(+), 53 deletions(-)

diff --git a/examples/ip_pipeline/parser.c b/examples/ip_pipeline/parser.c
index 689e206..39dd1ee 100644
--- a/examples/ip_pipeline/parser.c
+++ b/examples/ip_pipeline/parser.c
@@ -1,34 +1,6 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2016 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- * * Redistributions of source code must retain the above copyright
- *   notice, this list of conditions and the following disclaimer.
- * * Redistributions in binary form must reproduce the above copyright
- *   notice, this list of conditions and the following disclaimer in
- *   the documentation and/or other materials provided with the
- *   distribution.
- * * Neither the name of Intel Corporation nor the names of its
- *   contributors may be used to endorse or promote products derived
- *   from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2016 Intel Corporation.
+ * All rights reserved.
  */
 
 /*
@@ -36,28 +8,6 @@
  *
  * Copyright (c) 2009, Olivier MATZ 
  * All rights reserved.
- * Redistribution and use in source and binary forms, with or without
- * modification, are permitted provided that the following conditions are met:
- *
- * * Redistributions of source code must retain the above copyright
- *   notice, this list of conditions and the following disclaimer.
- * * Redistributions in binary form must reproduce the above copyright
- *   notice, this list of conditions and the following disclaimer in the
- *   documentation and/or other materials provided with the distribution.
- * * Neither the name of the University of California, Berkeley nor the
- *   names of its contributors may be used to endorse or promote products
- *   derived from this software without specific prior written permission.
- *
- * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND ANY
- * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
- * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
- * DISCLAIMED. IN NO EVENT SHALL THE REGENTS AND CONTRIBUTORS BE LIABLE FOR ANY
- * DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
- * (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
- * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
- * ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF 
THIS
- * SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  */
 
 /*
-- 
2.7.4



[dpdk-dev] [PATCH v2] examples/bond: add mbuf alloc check

2018-01-24 Thread Radu Nicolau
Fixes: cc7e8ae84faa ("examples/bond: add example application for link bonding 
mode 6")
Coverity issue: 257008
Cc: sta...@dpdk.org

Signed-off-by: Radu Nicolau 
---
v2: updated commit message

 examples/bond/main.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/examples/bond/main.c b/examples/bond/main.c
index 01e5eda..6c8a44a 100644
--- a/examples/bond/main.c
+++ b/examples/bond/main.c
@@ -441,6 +441,11 @@ static void cmd_obj_send_parsed(void *parsed_result,
(BOND_IP_3 << 16) | (BOND_IP_4 << 24);
 
created_pkt = rte_pktmbuf_alloc(mbuf_pool);
+   if (created_pkt == NULL) {
+   cmdline_printf(cl, "Failed to allocate mbuf\n");
+   return;
+   }
+
pkt_size = sizeof(struct ether_hdr) + sizeof(struct arp_hdr);
created_pkt->data_len = pkt_size;
created_pkt->pkt_len = pkt_size;
-- 
2.7.5



[dpdk-dev] [PATCH] app/pdump: fix the memory leak by rte_service_init

2018-01-24 Thread Vipin Varghese
When pdump is run multiple times against any primary application,
it consumes huge page memory by rte_service_init. This is not freed
at exit of application.

Invoking rte_service_finalize to free memory and prevent memory leak.

Signed-off-by: Vipin Varghese 
---
 app/pdump/main.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/app/pdump/main.c b/app/pdump/main.c
index 0f70c75..9d03366 100644
--- a/app/pdump/main.c
+++ b/app/pdump/main.c
@@ -25,6 +25,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #define CMD_LINE_OPT_PDUMP "pdump"
 #define PDUMP_PORT_ARG "port"
@@ -882,5 +883,7 @@ struct parse_val {
/* dump debug stats */
print_pdump_stats();
 
+   rte_service_finalize();
+
return 0;
 }
-- 
1.9.1



[dpdk-dev] [PATCH] examples: update copyrights and license

2018-01-24 Thread Lee Daly
This updates the Intel and Cavium license on files in examples to be
the standard BSD-3-Clause license used for the rest of DPDK,
bringing the files in compliance with the DPDK licensing policy.

Signed-off-by: Lee Daly 
---
 .../performance-thread/common/arch/x86/stack.h | 30 +-
 1 file changed, 1 insertion(+), 29 deletions(-)

diff --git a/examples/performance-thread/common/arch/x86/stack.h 
b/examples/performance-thread/common/arch/x86/stack.h
index 98723ba..e1e35da 100644
--- a/examples/performance-thread/common/arch/x86/stack.h
+++ b/examples/performance-thread/common/arch/x86/stack.h
@@ -1,35 +1,7 @@
-/*-
- *   BSD LICENSE
- *
+/* SPDX-License-Identifier: BSD-3-Clause
  *   Copyright(c) 2015 Intel Corporation. All rights reserved.
  *   Copyright(c) Cavium, Inc. 2017.
  *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- * * Redistributions of source code must retain the above copyright
- *   notice, this list of conditions and the following disclaimer.
- * * Redistributions in binary form must reproduce the above copyright
- *   notice, this list of conditions and the following disclaimer in
- *   the documentation and/or other materials provided with the
- *   distribution.
- * * Neither the name of Intel Corporation nor the names of its
- *   contributors may be used to endorse or promote products derived
- *   from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  */
 
 /*
-- 
2.7.4



Re: [dpdk-dev] [PATCH 2/5] net/mlx5: fix secondary process mempool registration

2018-01-24 Thread Shahaf Shuler
Wednesday, January 24, 2018 10:14 AM, Nélio Laranjeiro:
> Hi Shahaf,
> 
> On Tue, Jan 23, 2018 at 07:08:20PM +0200, Shahaf Shuler wrote:
> > Secondary process is not allowed to register mempools on the flight.
> >
> > The code will return invalid memory key for such case.
> >
> > Fixes: 87ec44ce1651 ("net/mlx5: add operations for secondary process")
> > Cc: sta...@dpdk.org
> >
> > Signed-off-by: Shahaf Shuler 
> > Signed-off-by: Xueming Li 
> > ---
> >  doc/guides/nics/mlx5.rst |  7 ++-
> >  drivers/net/mlx5/mlx5_rxtx.h | 17 ++---
> >  2 files changed, 20 insertions(+), 4 deletions(-)
> >
> > diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst index
> > bdc2216c0..2f860402f 100644
> > --- a/doc/guides/nics/mlx5.rst
> > +++ b/doc/guides/nics/mlx5.rst
> > @@ -106,7 +106,12 @@ Limitations
> >
> >  - Inner RSS for VXLAN frames is not supported yet.
> >  - Hardware checksum RX offloads for VXLAN inner header are not
> supported yet.
> > -- Forked secondary process not supported.
> > +- For secondary process:
> > +
> > +  - Forked secondary process not supported.
> > +  - All mempools must be initialized before rte_eth_dev_start().
> > +  - Number of mempools must less than
> > + CONFIG_RTE_LIBRTE_MLX5_TX_MP_CACHE
> 
> Why such limitation?  Registering a new memory is independent of searching
> a new one when the cache is too small.
> 
> >  - Flow pattern without any specific vlan will match for vlan packets as 
> > well:
> >
> >When VLAN spec is not specified in the pattern, the matching rule will be
> created with VLAN as a wild card.
> > diff --git a/drivers/net/mlx5/mlx5_rxtx.h
> > b/drivers/net/mlx5/mlx5_rxtx.h index a63364d79..79cdfc793 100644
> > --- a/drivers/net/mlx5/mlx5_rxtx.h
> > +++ b/drivers/net/mlx5/mlx5_rxtx.h
> > @@ -550,6 +550,7 @@ mlx5_tx_mb2mr(struct mlx5_txq_data *txq, struct
> rte_mbuf *mb)
> > uint16_t i = txq->mr_cache_idx;
> > uintptr_t addr = rte_pktmbuf_mtod(mb, uintptr_t);
> > struct mlx5_mr *mr;
> > +   struct rte_mempool *mp;
> >
> > assert(i < RTE_DIM(txq->mp2mr));
> > if (likely(txq->mp2mr[i]->start <= addr && txq->mp2mr[i]->end >=
> > addr)) @@ -563,14 +564,24 @@ mlx5_tx_mb2mr(struct mlx5_txq_data
> *txq, struct rte_mbuf *mb)
> > if (txq->mp2mr[i]->start <= addr &&
> > txq->mp2mr[i]->end >= addr) {
> > assert(txq->mp2mr[i]->lkey != (uint32_t)-1);
> > -   assert(rte_cpu_to_be_32(txq->mp2mr[i]->mr->lkey)
> ==
> > -  txq->mp2mr[i]->lkey);
> > +   if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
> > +   assert(rte_cpu_to_be_32(txq->mp2mr[i]-
> >mr->lkey)
> > +  == txq->mp2mr[i]->lkey);
> > +   }
> 
> This code should be inside priv_txq_mp2mr_reg() to let the secondary
> search inside the MR list when the cache is too small.

You probably mean below code and not above. Yes good suggestion.
Regarding the above code - the assert in un-accessible from the secondary 
process as it tries to access ibv_mr.
I will add another patch to remove this assert completely (in order to save the 
comparison logic). 

> 
> If it does not find any MR it should fail before calling priv_mr_new().
> 
> > txq->mr_cache_idx = i;
> > return txq->mp2mr[i]->lkey;
> > }
> > }
> > txq->mr_cache_idx = 0;
> > -   mr = mlx5_txq_mp2mr_reg(txq, mlx5_tx_mb2mp(mb), i);
> > +   mp = mlx5_tx_mb2mp(mb);
> > +   if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> > +   WARN("Using unregistered mempool 0x%p(%s) in secondary
> process,"
> > +" please create mempool before rte_eth_dev_start()",
> > +(void *)mp, mp->name);
> > +   assert(rte_eal_process_type() == RTE_PROC_PRIMARY);
> > +   return (uint32_t)-1;
> > +   }
> > +   mr = mlx5_txq_mp2mr_reg(txq, mp, i);
> > /*
> >  * Request the reference to use in this queue, the original one is
> >  * kept by the control plane.
> > --
> > 2.12.0
> 
> --
> Nélio Laranjeiro
> 6WIND


[dpdk-dev] [PATCH v2] net/vdev_netvsc: fix build using C11 mode and pedantic

2018-01-24 Thread Ophir Munk
Remove CFLAGS -std=c11 and -pedantic in order to guarantee
a successful vdev_netvsc compilation on old Linux distributions.
Otherwise old GCC compilers may complain as follows:
cc1: error: unrecognized command line option -std=c11

Fixes: 6086ab3bb3d2 ("net/vdev_netvsc: introduce Hyper-V platform driver")
Cc: sta...@dpdk.org

Signed-off-by: Ophir Munk 
---
 drivers/net/vdev_netvsc/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/vdev_netvsc/Makefile b/drivers/net/vdev_netvsc/Makefile
index f2b2ac5..45351b8 100644
--- a/drivers/net/vdev_netvsc/Makefile
+++ b/drivers/net/vdev_netvsc/Makefile
@@ -12,7 +12,7 @@ EXPORT_MAP := rte_pmd_vdev_netvsc_version.map
 # Additional compilation flags.
 CFLAGS += -O3
 CFLAGS += -g
-CFLAGS += -std=c11 -pedantic -Wall -Wextra
+CFLAGS += -Wall -Wextra
 CFLAGS += -D_XOPEN_SOURCE=600
 CFLAGS += -D_BSD_SOURCE
 CFLAGS += -D_DEFAULT_SOURCE
-- 
2.7.4



[dpdk-dev] [PATCH v2] eal/service: add routine to release memory

2018-01-24 Thread Vipin Varghese
The routine rte_service_finalize checks if service is initialized,
if yes; releases internal memory for services and lcore states.

This routine is to be invoked at end of application termination.

Signed-off-by: Vipin Varghese 
---

V2 Changes:
- redo the logic for rte_service_finalize
- added in alphabetical order
- cleaned up commit message
---
 lib/librte_eal/common/rte_service.c | 16 +---
 lib/librte_eal/rte_eal_version.map  |  2 +-
 2 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/lib/librte_eal/common/rte_service.c 
b/lib/librte_eal/common/rte_service.c
index 5133c98..dba74a3 100644
--- a/lib/librte_eal/common/rte_service.c
+++ b/lib/librte_eal/common/rte_service.c
@@ -110,14 +110,16 @@ int32_t rte_service_init(void)
 
 void rte_service_finalize(void)
 {
-   if (rte_service_library_initialized) {
-   if (rte_services)
-   rte_free(rte_services);
-   if (lcore_states)
-   rte_free(lcore_states);
+   if (!rte_service_library_initialized)
+   return;
 
-   rte_service_library_initialized = 0;
-   }
+   if (rte_services)
+   rte_free(rte_services);
+
+   if (lcore_states)
+   rte_free(lcore_states);
+
+   rte_service_library_initialized = 0;
 }
 
 
diff --git a/lib/librte_eal/rte_eal_version.map 
b/lib/librte_eal/rte_eal_version.map
index 24d1ca7..1a8b1b5 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -223,6 +223,7 @@ EXPERIMENTAL {
rte_service_component_unregister;
rte_service_component_runstate_set;
rte_service_dump;
+   rte_service_finalize;
rte_service_get_by_id;
rte_service_get_by_name;
rte_service_get_count;
@@ -245,6 +246,5 @@ EXPERIMENTAL {
rte_service_set_runstate_mapped_check;
rte_service_set_stats_enable;
rte_service_start_with_defaults;
-   rte_service_finalize;
 
 } DPDK_18.02;
-- 
1.9.1



Re: [dpdk-dev] [PATCH v2] net/vdev_netvsc: fix build using C11 mode and pedantic

2018-01-24 Thread Matan Azrad
Hi Ophir

From: Ophir Munk, Wednesday, January 24, 2018 4:12 PM
> Remove CFLAGS -std=c11 and -pedantic in order to guarantee a successful
> vdev_netvsc compilation on old Linux distributions.
> Otherwise old GCC compilers may complain as follows:
> cc1: error: unrecognized command line option -std=c11
> 
> Fixes: 6086ab3bb3d2 ("net/vdev_netvsc: introduce Hyper-V platform
> driver")
> Cc: sta...@dpdk.org

No need to backport this fix.

> Signed-off-by: Ophir Munk 

Besides that,
Acked-by: Matan Azrad 



Re: [dpdk-dev] [PATCH V12 1/3] eal: add uevent monitor api and callback func

2018-01-24 Thread Wu, Jingjing


> -Original Message-
> From: Guo, Jia
> Sent: Thursday, January 18, 2018 12:12 PM
> To: step...@networkplumber.org; Richardson, Bruce 
> ;
> Yigit, Ferruh ; gaetan.ri...@6wind.com
> Cc: Ananyev, Konstantin ; jblu...@infradead.org;
> shreyansh.j...@nxp.com; Wu, Jingjing ; dev@dpdk.org; 
> Guo, Jia
> ; tho...@monjalon.net; Zhang, Helin 
> ;
> mo...@mellanox.com
> Subject: [PATCH V12 1/3] eal: add uevent monitor api and callback func
> 
> This patch aim to add a general uevent mechanism in eal device layer,
> to enable all linux kernel object uevent monitoring, user could use these
> APIs to monitor and read out the device status info that sent from the
> kernel side, then corresponding to handle it, such as when detect hotplug
> uevent type, user could detach or attach the device, and more it benefit
> to use to do smoothly fail safe work.
> 
> About uevent monitoring:
> a: add one epolling to poll the netlink socket, to monitor the uevent of
>the device.
> b: add enum of rte_eal_dev_event_type and struct of rte_eal_uevent.
> c: add below APIs in rte eal device layer.
>rte_dev_callback_register
>rte_dev_callback_unregister
>_rte_dev_callback_process
>rte_dev_event_monitor_start
>rte_dev_event_monitor_stop
> 
> Signed-off-by: Jeff Guo 
> ---
> v12->v11:
> identify null param in callback for monitor all devices uevent
> ---
>  lib/librte_eal/bsdapp/eal/eal_dev.c |  38 ++
>  lib/librte_eal/common/eal_common_dev.c  | 128 ++
>  lib/librte_eal/common/include/rte_dev.h | 119 +
>  lib/librte_eal/linuxapp/eal/Makefile|   1 +
>  lib/librte_eal/linuxapp/eal/eal_dev.c   | 223 
> 
>  5 files changed, 509 insertions(+)
>  create mode 100644 lib/librte_eal/bsdapp/eal/eal_dev.c
>  create mode 100644 lib/librte_eal/linuxapp/eal/eal_dev.c
> 

[..]

> +int
> +rte_dev_callback_register(char *device_name, rte_dev_event_cb_fn cb_fn,
> + void *cb_arg)
> +{
> + struct rte_dev_event_callback *event_cb = NULL;
> +
> + rte_spinlock_lock(&rte_dev_event_lock);
> +
> + if (TAILQ_EMPTY(&(dev_event_cbs)))
> + TAILQ_INIT(&(dev_event_cbs));
> +
> + TAILQ_FOREACH(event_cb, &(dev_event_cbs), next) {
> + if (event_cb->cb_fn == cb_fn &&
> + event_cb->cb_arg == cb_arg &&
> + !strcmp(event_cb->dev_name, device_name))
device_name = NULL means means for all devices, right? Can strcmp accept NULL 
arguments?

> + break;
> + }
> +
> + /* create a new callback. */
> + if (event_cb == NULL) {
> + /* allocate a new user callback entity */
> + event_cb = malloc(sizeof(struct rte_dev_event_callback));
> + if (event_cb != NULL) {
> + event_cb->cb_fn = cb_fn;
> + event_cb->cb_arg = cb_arg;
> + event_cb->dev_name = device_name;
> + }
Is that OK to call TAILQ_INSERT_TAIL below if event_cb == NULL?

> + TAILQ_INSERT_TAIL(&(dev_event_cbs), event_cb, next);
> + }
> +
> + rte_spinlock_unlock(&rte_dev_event_lock);
> + return (event_cb == NULL) ? -1 : 0;
> +}
> +
> +int
> +rte_dev_callback_unregister(char *device_name, rte_dev_event_cb_fn cb_fn,
> + void *cb_arg)
> +{
> + int ret;
> + struct rte_dev_event_callback *event_cb, *next;
> +
> + if (!cb_fn || device_name == NULL)
> + return -EINVAL;
> +
> + rte_spinlock_lock(&rte_dev_event_lock);
> +
> + ret = 0;
> +
> + for (event_cb = TAILQ_FIRST(&(dev_event_cbs)); event_cb != NULL;
> +   event_cb = next) {
> +
> + next = TAILQ_NEXT(event_cb, next);
> +
> + if (event_cb->cb_fn != cb_fn ||
> + (event_cb->cb_arg != (void *)-1 &&
> + event_cb->cb_arg != cb_arg) ||
> + strcmp(event_cb->dev_name, device_name))

The same comments as above.

> + continue;
> +
> + /*
> +  * if this callback is not executing right now,
> +  * then remove it.
> +  */
> + if (event_cb->active == 0) {
> + TAILQ_REMOVE(&(dev_event_cbs), event_cb, next);
> + rte_free(event_cb);
> + } else {
> + ret = -EAGAIN;
> + }
> + }
> +
> + rte_spinlock_unlock(&rte_dev_event_lock);
> + return ret;
> +}
> +

[..]

> +int
> +rte_dev_event_monitor_start(void)
> +{
> + int ret;
> + struct rte_service_spec service;
> + uint32_t id;
> + const uint32_t sid = 0;
> +
> + if (!service_no_init)
> + return 0;
> +
> + uint32_t slcore_1 = rte_get_next_lcore(/* start core */ -1,
> +/* skip master */ 1,
> +/* wrap */ 0);
> +

Re: [dpdk-dev] [PATCH V12 2/3] eal: add uevent pass and process function

2018-01-24 Thread Wu, Jingjing


> -Original Message-
> From: Guo, Jia
> Sent: Thursday, January 18, 2018 12:12 PM
> To: step...@networkplumber.org; Richardson, Bruce 
> ;
> Yigit, Ferruh ; gaetan.ri...@6wind.com
> Cc: Ananyev, Konstantin ; jblu...@infradead.org;
> shreyansh.j...@nxp.com; Wu, Jingjing ; dev@dpdk.org; 
> Guo, Jia
> ; tho...@monjalon.net; Zhang, Helin 
> ;
> mo...@mellanox.com
> Subject: [PATCH V12 2/3] eal: add uevent pass and process function
> 
> In order to handle the uevent which have been detected from the kernel
> side, add uevent process function, let hot plug event to be example to
> show uevent mechanism how to pass the uevent and process the uevent.
> 
> About uevent passing and processing, add below functions in linux eal
> dev layer. FreeBSD not support uevent ,so let it to be void and do not
> implement in function.
> a.dev_uev_parse
> b.dev_uev_receive
> c.dev_uev_process
> 
> Signed-off-by: Jeff Guo 


Reviewed-by: Jingjing Wu 


Re: [dpdk-dev] [PATCH] doc: document the new devargs syntax

2018-01-24 Thread Yuanhan Liu
On Wed, Jan 24, 2018 at 11:37:31AM +0100, Thomas Monjalon wrote:
> 24/01/2018 11:36, Yuanhan Liu:
> > On Wed, Jan 24, 2018 at 11:21:44AM +0100, Thomas Monjalon wrote:
> > > 24/01/2018 10:28, Yuanhan Liu:
> > > > On Wed, Jan 24, 2018 at 09:19:10AM +0100, Thomas Monjalon wrote:
> > > > > 24/01/2018 07:43, Yuanhan Liu:
> > > > > > On Tue, Jan 23, 2018 at 03:29:34PM +0100, Thomas Monjalon wrote:
> > > > > > > 23/01/2018 13:46, Yuanhan Liu:
> > > > > > > > If port not found, then the whole string will be used for dev 
> > > > > > > > attachment.
> > > > > > > > It means we are attaching a port with PCI BDF == 04.00.0 AND
> > > > > > > > port == 0 (the 2nd port will not be attached).
> > > > > > > > 
> > > > > > > > 
> > > > > > > > And here is how the devargs would look like if 
> > > > > > > > "matching;settings" is
> > > > > > > > being used:
> > > > > > > > 
> > > > > > > > 
> > > > > > > > bus=pci,id=04:00.0/class=eth,port=0;bus=pci,id=04:00.0/class=eth,port=0/driver=mlx4,mlx4_arg_A=val,...
> > > > > > > > 
> > > > > > > > The part before ";" will be used for lookup and the later part 
> > > > > > > > will be
> > > > > > > > used for attachment. It should work. It just looks redundant.
> > > > > > > 
> > > > > > > It does not have to be redundant.
> > > > > > > It can be:
> > > > > > >   
> > > > > > > bus=pci,id=04:00.0/class=eth,port=0;driver=mlx4,mlx4_arg1=settings1,...
> > > > > > 
> > > > > > I knew you would make such reply :)
> > > > > > Then there is a contradiction. According your suggestion, the 
> > > > > > "port=0" belongs
> > > > > > to the matching section, but it also has to be used in the settings 
> > > > > > section.
> > > > > 
> > > > > If port=0 is matched, it is already set, right?
> > > > 
> > > > Yes.
> > > > 
> > > > > Why it needs to be in settings?
> > > > 
> > > > But I was talking the case it's not matched, say it's not probed and 
> > > > here
> > > > we do hotplug.
> > > 
> > > I don't understand.
> > > Anyway, the port property should be read-only.
> > 
> > All proberties should be read-only.
> > 
> > > Are we talking about the dev_port from the Linux kernel?
> > 
> > Yes. And it can be used for probing one port only (out of 2 ports in a NIC)
> > at probe stage. So, at this stage, it's a setting but not a match.
> 
> No it's a match!
> 
> A settings is changing data in the port.

So I see that's your definition about the "settings". What I think is
everything needed for driver initiation are settings.

For example, one proposed interface for VF rep is the "vf_id" property,
Similar to "port" property we have just discussed above,  it's used for
probing one specific VR rep for the given VF id.

You can say it's a match here, just like the "port" property.

But note that "vf_id" could be a range, to enable multiple VF reps.
The semantics looks like "setting" more than "match".

Another example is from the failsafe PMD that Gaetan had mentioned:

driver=failsafe,dev(bus=pci,id=00:02.0),fd(/some/file/)

They (dev and fd) should belong the "setting" section, for 2 reasons:

- they should not be used for matching
- they are used for failsafe PMD initiation

But it belongs "match", according to your definition about "settings",
because it doesn't change data in the port.

That also means, the word "settings" might not be well named. It's
probably better to name it "drvargs".

--yliu


[dpdk-dev] [PATCH 2/2] event/opdl: fix dereference before null check

2018-01-24 Thread Liang Ma
Fixes: 4236ce9bf5bf ("event/opdl: add OPDL ring infrastructure library")
Coverity issue: 257022

Signed-off-by: Liang Ma 
---
 drivers/event/opdl/opdl_ring.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/event/opdl/opdl_ring.c b/drivers/event/opdl/opdl_ring.c
index 7e16d4d..39dc41d 100644
--- a/drivers/event/opdl/opdl_ring.c
+++ b/drivers/event/opdl/opdl_ring.c
@@ -550,6 +550,10 @@ opdl_stage_claim_multithread(struct opdl_stage *s, void 
*entries,
uint32_t i = 0, offset;
uint8_t *entries_offset = (uint8_t *)entries;
 
+   if (seq == NULL) {
+   PMD_DRV_LOG(ERR, "Invalid seq PTR");
+   return 0;
+   }
offset = opdl_first_entry_id(*seq, s->nb_instance, s->instance_id);
num_entries = offset + (s->nb_instance * num_entries);
 
@@ -561,8 +565,8 @@ opdl_stage_claim_multithread(struct opdl_stage *s, void 
*entries,
entries_offset += t->slot_size;
i++;
}
-   if (seq != NULL)
-   *seq = old_head;
+
+   *seq = old_head;
 
return i;
 }
-- 
2.7.5



[dpdk-dev] [PATCH 1/2] event/opdl: fix the resource leak issue

2018-01-24 Thread Liang Ma
Fixes: d548ef513cd7 ("event/opdl: add unit tests")
Coverity issue: 257004

Signed-off-by: Liang Ma 
---
 drivers/event/opdl/opdl_test.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/event/opdl/opdl_test.c b/drivers/event/opdl/opdl_test.c
index 44a5cc5..4894c08 100644
--- a/drivers/event/opdl/opdl_test.c
+++ b/drivers/event/opdl/opdl_test.c
@@ -1002,11 +1002,13 @@ opdl_selftest(void)
/* turn on stats by default */
if (rte_vdev_init(eventdev_name, "do_validation=1") < 0) {
PMD_DRV_LOG(ERR, "Error creating eventdev\n");
+   free(t);
return -1;
}
evdev = rte_event_dev_get_dev_id(eventdev_name);
if (evdev < 0) {
PMD_DRV_LOG(ERR, "Error finding newly created 
eventdev\n");
+   free(t);
return -1;
}
}
@@ -1022,6 +1024,7 @@ opdl_selftest(void)
rte_socket_id());
if (!eventdev_func_mempool) {
PMD_DRV_LOG(ERR, "ERROR creating mempool\n");
+   free(t);
return -1;
}
}
@@ -1044,9 +1047,9 @@ opdl_selftest(void)
ret = single_link_w_stats(t);
 
/*
-* Free test instance, leaving mempool initialized, and a pointer to it
-* in static eventdev_func_mempool, as it is re-used on re-runs
+* Free test instance, free  mempool
 */
+   rte_mempool_free(t->mbuf_pool);
free(t);
 
if (ret != 0)
-- 
2.7.5



[dpdk-dev] [PATCH v2] doc: fix bbdev test guide build

2018-01-24 Thread Marko Kovacevic
Fix build issue with pdf guides. Some indentations in the bbdev test
application doc were causing build failures. Latex Log message:
 
    doc.log:! LaTeX Error: Too deeply nested.
   
Fixes: f714a18885a6 ("app/testbbdev: add test application for bbdev")
Cc: amr.mokh...@intel.com
 

Signed-off-by: Marko Kovacevic 
---

V2: Added more information into commit message
about the issue

 doc/guides/tools/testbbdev.rst | 18 ++
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/doc/guides/tools/testbbdev.rst b/doc/guides/tools/testbbdev.rst
index c7aac49..5c7112d 100644
--- a/doc/guides/tools/testbbdev.rst
+++ b/doc/guides/tools/testbbdev.rst
@@ -71,24 +71,26 @@ The following are the command-line options:
  Defines test cases to run. If not specified all available tests are run.
 
  The following tests can be run:
-  * unittest
+
+ * unittest
  Small unit tests witch check basic functionality of bbdev library.
-  * latency
+ * latency
  Test calculates three latency metrics:
-  * offload_latency_tc
+
+ * offload_latency_tc
  measures the cost of offloading enqueue and dequeue operations.
-  * offload_latency_empty_q_tc
+ * offload_latency_empty_q_tc
  measures the cost of offloading a dequeue operation from an empty 
queue.
  checks how long last dequeueing if there is no operations to dequeue
-  * operation_latency_tc
+ * operation_latency_tc
  measures the time difference from the first attempt to enqueue till 
the
  first successful dequeue.
-  * validation
+ * validation
  Test do enqueue on given vector and compare output after dequeueing.
-  * throughput
+ * throughput
  Test measures the achieved throughput on the available lcores.
  Results are printed in million operations per second and million bits per 
second.
-  * interrupt
+ * interrupt
  The same test as 'throughput' but uses interrupts instead of PMD to 
perform
  the dequeue.
 
-- 
2.9.5



Re: [dpdk-dev] [PATCH V12 3/3] app/testpmd: use uevent to monitor hotplug

2018-01-24 Thread Wu, Jingjing
> +
> +static void
> +add_uevent_callback(void *arg)
> +{
> + char *dev_name = (char *)arg;
> +
> + rte_eal_alarm_cancel(add_uevent_callback, arg);
> +
> + if (!in_hotplug_list(dev_name))
> + return;
> +
> + RTE_LOG(ERR, EAL, "add device: %s\n", dev_name);

It's not an error, replace by printf?

> + attach_port(dev_name);
> +}
> +
>  /* This function is used by the interrupt thread */
>  static int
>  eth_event_callback(portid_t port_id, enum rte_eth_event_type type, void 
> *param,
> @@ -1931,6 +2014,82 @@ eth_event_callback(portid_t port_id, enum
> rte_eth_event_type type, void *param,
>  }
> 
>  static int
> +in_hotplug_list(const char *dev_name)
> +{
> + struct hotplug_request *hp_request = NULL;
> +
> + TAILQ_FOREACH(hp_request, &hp_list, next) {
> + if (!strcmp(hp_request->dev_name, dev_name))
> + break;
> + }
> +
> + if (hp_request)
> + return 1;
> +
Is it better to use TRUE and FALSE?



Re: [dpdk-dev] [PATCH] doc: document the new devargs syntax

2018-01-24 Thread Yuanhan Liu
On Tue, Jan 23, 2018 at 05:08:16PM +0100, Gaëtan Rivet wrote:
> Drivers answers to a specific API (ethdev, cryptodev, ...), to create
> standardized objects in response to parameters that are given to them
> for init. I think matching properties should be restricted to higher
> classes (bus, eth/crypto),

That's also what I thought. But I'm okay to have "driver" category
included for matching. I just don't really see a good example for that.

> while the driver class should be left
> free-form and to the responsibility of the PMD itself (while having the
> proper libraries for helping parsing safely, thus driving developpers
> toward similar syntaxes, while not forcing them in those).

I agree. The drv args are parsed by the drivers after all. It's hard to
have a good parser for all. I also don't know why we have to force them
to use "key=value" pairs.

I even see some drawbacks from the forcement:

- some PMDs already use none key/value format. Forcing them breaks more.
  If the "-w" "--vdev" compatibility is kept", nothing will be broken
  from the user point of view. However, if "key=value" pair is going to
  be used, user have to do some changes.

- Some "value" might have to use the nested "=". Handling the nested pairs
  introduces more complexity.

- sometimes, it's simple without an assignment. For example, it could be
  "driver=vhost-pmd,...,client" to let the vhost PMD acts as the client
  mode.

Both Linux kernel and QEMU don't force the "key=value" pair usage, I don't
see any good reason why we have to do that.

--yliu


Re: [dpdk-dev] [PATCH v2] net/vdev_netvsc: fix build using C11 mode and pedantic

2018-01-24 Thread Stephen Hemminger
On Wed, 24 Jan 2018 14:12:13 +
Ophir Munk  wrote:

> Remove CFLAGS -std=c11 and -pedantic in order to guarantee
> a successful vdev_netvsc compilation on old Linux distributions.
> Otherwise old GCC compilers may complain as follows:
> cc1: error: unrecognized command line option -std=c11
> 
> Fixes: 6086ab3bb3d2 ("net/vdev_netvsc: introduce Hyper-V platform driver")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Ophir Munk 
> ---
>  drivers/net/vdev_netvsc/Makefile | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/vdev_netvsc/Makefile 
> b/drivers/net/vdev_netvsc/Makefile
> index f2b2ac5..45351b8 100644
> --- a/drivers/net/vdev_netvsc/Makefile
> +++ b/drivers/net/vdev_netvsc/Makefile
> @@ -12,7 +12,7 @@ EXPORT_MAP := rte_pmd_vdev_netvsc_version.map
>  # Additional compilation flags.
>  CFLAGS += -O3
>  CFLAGS += -g
> -CFLAGS += -std=c11 -pedantic -Wall -Wextra
> +CFLAGS += -Wall -Wextra
>  CFLAGS += -D_XOPEN_SOURCE=600
>  CFLAGS += -D_BSD_SOURCE
>  CFLAGS += -D_DEFAULT_SOURCE

Why did this driver not use $(WERROR) like rest of DPDK drivers.


Re: [dpdk-dev] [PATCH] ethdev: move internal callback list definition

2018-01-24 Thread Ferruh Yigit
On 1/22/2018 12:25 PM, David Marchand wrote:
> This structure is not exposed through public apis, we should just move it
> to the core header.
> 
> Signed-off-by: David Marchand 
> Reviewed-by: Ferruh Yigit 

Applied to dpdk-next-net/master, thanks.


Re: [dpdk-dev] [PATCH v2 0/2] net/mrvl: switch to the new Rx/Tx offloads API

2018-01-24 Thread Ferruh Yigit
On 1/23/2018 8:46 AM, Tomasz Duszynski wrote:
> This patch series replaces the old Rx/Tx offload API with the
> new API.
> 
> v2:
>  * Follow the same logic for calculating both unsupported and missing flags.
> 
> Tomasz Duszynski (2):
>   net/mrvl: switch to the new Rx offload API
>   net/mrvl: switch to the new Tx offload API

Series applied to dpdk-next-net/master, thanks.


[dpdk-dev] [PATCH v6 1/3] net/failsafe: register as an Rx interrupt mode PMD

2018-01-24 Thread Moti Haimovsky
This patch adds registering the Rx queues of the failsafe PMD with EAL
Rx interrupts subsystem.
Each failsafe RX queue is assigned with a unique eventfd and an enable
interrupts flag.
The PMD creates an interrupt vector containing the above eventfds and
Registers it with  EAL. The PMD also implements the Rx interrupts enable
and disable interface routines.
This patch does not implement the generation of Rx interrupts, so an
application can now wait for failsafe Rx interrupts but it will not
receive one.

Signed-off-by: Moti Haimovsky 
---
V6:
Fixed typo in commit subject.

V5:
Initial version of this patch in accordance to inputs from Gaetan Rivet
in reply to
1516354344-13495-2-git-send-email-mo...@mellanox.com
---
 doc/guides/nics/features/failsafe.ini   |   1 +
 drivers/net/failsafe/Makefile   |   1 +
 drivers/net/failsafe/failsafe.c |   4 +
 drivers/net/failsafe/failsafe_intr.c| 138 
 drivers/net/failsafe/failsafe_ops.c |  64 +++
 drivers/net/failsafe/failsafe_private.h |   9 +++
 6 files changed, 217 insertions(+)
 create mode 100644 drivers/net/failsafe/failsafe_intr.c

diff --git a/doc/guides/nics/features/failsafe.ini 
b/doc/guides/nics/features/failsafe.ini
index a42e344..39ee579 100644
--- a/doc/guides/nics/features/failsafe.ini
+++ b/doc/guides/nics/features/failsafe.ini
@@ -6,6 +6,7 @@
 [Features]
 Link status  = Y
 Link status event= Y
+Rx interrupt = Y
 MTU update   = Y
 Jumbo frame  = Y
 Promiscuous mode = Y
diff --git a/drivers/net/failsafe/Makefile b/drivers/net/failsafe/Makefile
index ea2a8fe..91a734b 100644
--- a/drivers/net/failsafe/Makefile
+++ b/drivers/net/failsafe/Makefile
@@ -46,6 +46,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_ops.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_rxtx.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_ether.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_flow.c
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_intr.c
 
 # No exported include files
 
diff --git a/drivers/net/failsafe/failsafe.c b/drivers/net/failsafe/failsafe.c
index cb274eb..921e656 100644
--- a/drivers/net/failsafe/failsafe.c
+++ b/drivers/net/failsafe/failsafe.c
@@ -244,6 +244,10 @@
mac->addr_bytes[2], mac->addr_bytes[3],
mac->addr_bytes[4], mac->addr_bytes[5]);
dev->data->dev_flags |= RTE_ETH_DEV_INTR_LSC;
+   PRIV(dev)->intr_handle = (struct rte_intr_handle){
+   .fd = -1,
+   .type = RTE_INTR_HANDLE_EXT,
+   };
return 0;
 free_args:
failsafe_args_free(dev);
diff --git a/drivers/net/failsafe/failsafe_intr.c 
b/drivers/net/failsafe/failsafe_intr.c
new file mode 100644
index 000..54ef2f4
--- /dev/null
+++ b/drivers/net/failsafe/failsafe_intr.c
@@ -0,0 +1,138 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018 Mellanox Technologies, Ltd.
+ */
+
+/**
+ * @file
+ * Interrupts handling for failsafe driver.
+ */
+
+#include 
+
+#include "failsafe_private.h"
+
+/**
+ * Uninstall failsafe interrupt vector.
+ *
+ * @param priv
+ *   Pointer to failsafe private structure.
+ */
+static void
+fs_rx_intr_vec_uninstall(struct fs_priv *priv)
+{
+   struct rte_intr_handle *intr_handle;
+
+   intr_handle = &priv->intr_handle;
+   if (intr_handle->intr_vec != NULL) {
+   free(intr_handle->intr_vec);
+   intr_handle->intr_vec = NULL;
+   }
+   intr_handle->nb_efd = 0;
+}
+
+/**
+ * Installs failsafe interrupt vector to be registered with EAL later on.
+ *
+ * @param priv
+ *   Pointer to failsafe private structure.
+ *
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+static int
+fs_rx_intr_vec_install(struct fs_priv *priv)
+{
+   unsigned int i;
+   unsigned int rxqs_n;
+   unsigned int n;
+   unsigned int count;
+   struct rte_intr_handle *intr_handle;
+
+   rxqs_n = priv->dev->data->nb_rx_queues;
+   n = RTE_MIN(rxqs_n, (uint32_t)RTE_MAX_RXTX_INTR_VEC_ID);
+   count = 0;
+   intr_handle = &priv->intr_handle;
+   /* Allocate the interrupt vector of the failsafe Rx proxy interrupts */
+   intr_handle->intr_vec = malloc(n * sizeof(intr_handle->intr_vec[0]));
+   if (intr_handle->intr_vec == NULL) {
+   fs_rx_intr_vec_uninstall(priv);
+   rte_errno = ENOMEM;
+   ERROR("Failed to allocate memory for interrupt vector,"
+ " Rx interrupts will not be supported");
+   return -rte_errno;
+   }
+   for (i = 0; i < n; i++) {
+   struct rxq *rxq = priv->dev->data->rx_queues[i];
+
+   /* Skip queues that cannot request interrupts. */
+   if (rxq == NULL || rxq->event_fd < 0) {
+   /* Use invalid intr_vec[] index to disable entry. */
+   intr_handle->intr_vec[i] =
+  

[dpdk-dev] [PATCH v6 0/3] net/failsafe: add Rx interrupts support

2018-01-24 Thread Moti Haimovsky
These three patches add support for registering and waiting for
Rx interrupts in failsafe PMD. This allows applications to wait
for Rx events from the PMD using the DPDK rte_epoll subsystem.
The failsafe PMD presents to the application a facade of a single
device to be handled by the application while internally it manages
several devices on behalf of the application including packets
transmission and reception.
The Proposed failsafe Rx interrupt scheme follows this approach.
The failsafe PMD will present the application with a single set of
Rx interrupt vectors representing the failsafe Rx queues, while
internally it will serve as an interrupt proxy for its subdevices.
will allow applications to wait for Rx traffic from the failsafe
PMD by registering and waiting for Rx events from its Rx queues.
In order to support this the following is suggested:
  * Every Rx queue in the failsafe (virtual) device will be assigned
  * a Linux event file descriptor (efd) and an enable_interrupts flag.
  * The failsafe PMD will fill in its rte_intr_handle structure with
the Rx efds assigned previously and register them with the EAL.
  * The failsafe driver will create a private epoll fd (epfd) and
  * will allocate enough space to handle all the Rx events from all its
subdevices.
  * Acting as an application,
for each Rx queue in each active subdevice the failsafe will:
  o Register the Rx queue with the EAL.
  o Pass the EAL the failsafe private epoll fd as the epfd to
register the Rx queue event on.
  o Pass the EAL, as a parameter, the pointer to the failsafe Rx
queue that handles this Rx queue.
  o Using the DPDK service callbacks, the failsafe PMD will launch
an Rx proxy service that will Wait on the epoll fd for Rx
events from the sub-devices.
  o For each Rx event received the proxy service will
  - Retrieve the pointer to failsafe Rx queue that handles
this subdevice Rx queue from the user info returned by the
EAL.
  - Trigger a failsafe Rx event on that queue by writing to
the event fd unless interrupts are disabled for that queue.
  * The failsafe pmd will also implement the rx_queue_intr_enable
  * and rx_queue_intr_disable routines that will enable and disable Rx
interrupts respectively on both on the failsafe and its subdevices.

Moti Haimovsky (3):
  net/failsafe: register as an Rx interrupt mode PMD
  net/failsafe: slaves Rx interrupts registration
  net/failsafe: add Rx interrupts
---
V6:
* Added a wrapper around epoll_create1 since it is not supported in FreeBSD.
  See: 1516193643-130838-1-git-send-email-mo...@mellanox.com
* Separated between routines' variables definition and initialization
  according to guidelines from Gaetan Rivet.

V5:
Modified code and split the patch into three patches in accordance to
inputs from Gaetan Rivet in reply to
1516354344-13495-2-git-send-email-mo...@mellanox.com

V4:
Fixed merge conflicts found during integration with other failsafe patches
(See cover letter).

V3:
Fixed build failures in FreeBSD10.3_64

V2:
Modifications according to inputs from Stephen Hemminger:
* Removed unneeded (void *) casting.
Fixed coding style warning.
---
 doc/guides/nics/features/failsafe.ini  |   1 +
 drivers/net/failsafe/Makefile  |   6 +
 drivers/net/failsafe/failsafe.c|   4 +
 drivers/net/failsafe/failsafe_epoll.h  |  10 +
 drivers/net/failsafe/failsafe_epoll_bsdapp.c   |  19 +
 drivers/net/failsafe/failsafe_epoll_linuxapp.c |  18 +
 drivers/net/failsafe/failsafe_ether.c  |   1 +
 drivers/net/failsafe/failsafe_intr.c   | 505 +
 drivers/net/failsafe/failsafe_ops.c| 102 +
 drivers/net/failsafe/failsafe_private.h|  40 ++
 10 files changed, 706 insertions(+)
 create mode 100644 drivers/net/failsafe/failsafe_epoll.h
 create mode 100644 drivers/net/failsafe/failsafe_epoll_bsdapp.c
 create mode 100644 drivers/net/failsafe/failsafe_epoll_linuxapp.c
 create mode 100644 drivers/net/failsafe/failsafe_intr.c

-- 
1.8.3.1



[dpdk-dev] [PATCH v6 2/3] net/failsafe: slaves Rx interrupts registration

2018-01-24 Thread Moti Haimovsky
This commit adds the following functionality to failsafe PMD:
* Register and unregister slaves Rx interrupts.
* Enable and Disable slaves Rx interrupts.
The interrupts events generated by the slaves are not handled in this
commit.

Signed-off-by: Moti Haimovsky 
---
V6:
Added a wrapper around epoll_create1 since it is not supported in FreeBSD.
See: 1516193643-130838-1-git-send-email-mo...@mellanox.com

V5:
Initial version of this patch in accordance to inputs from Gaetan Rivet
in reply to
1516354344-13495-2-git-send-email-mo...@mellanox.com
---
 drivers/net/failsafe/Makefile  |   5 +
 drivers/net/failsafe/failsafe_epoll.h  |  10 ++
 drivers/net/failsafe/failsafe_epoll_bsdapp.c   |  19 +++
 drivers/net/failsafe/failsafe_epoll_linuxapp.c |  18 +++
 drivers/net/failsafe/failsafe_ether.c  |   1 +
 drivers/net/failsafe/failsafe_intr.c   | 198 +
 drivers/net/failsafe/failsafe_ops.c|  36 -
 drivers/net/failsafe/failsafe_private.h|  16 ++
 8 files changed, 301 insertions(+), 2 deletions(-)
 create mode 100644 drivers/net/failsafe/failsafe_epoll.h
 create mode 100644 drivers/net/failsafe/failsafe_epoll_bsdapp.c
 create mode 100644 drivers/net/failsafe/failsafe_epoll_linuxapp.c

diff --git a/drivers/net/failsafe/Makefile b/drivers/net/failsafe/Makefile
index 91a734b..4e6a983 100644
--- a/drivers/net/failsafe/Makefile
+++ b/drivers/net/failsafe/Makefile
@@ -47,6 +47,11 @@ SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_rxtx.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_ether.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_flow.c
 SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_intr.c
+ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y)
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_epoll_linuxapp.c
+else
+SRCS-$(CONFIG_RTE_LIBRTE_PMD_FAILSAFE) += failsafe_epoll_bsdapp.c
+endif
 
 # No exported include files
 
diff --git a/drivers/net/failsafe/failsafe_epoll.h 
b/drivers/net/failsafe/failsafe_epoll.h
new file mode 100644
index 000..8e6a1ec
--- /dev/null
+++ b/drivers/net/failsafe/failsafe_epoll.h
@@ -0,0 +1,10 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018 Mellanox Technologies, Ltd.
+ */
+
+#ifndef _RTE_ETH_FAILSAFE_EPOLL_H_
+#define _RTE_ETH_FAILSAFE_EPOLL_H_
+
+int failsafe_epoll_create1(int flags);
+
+#endif /* _RTE_ETH_FAILSAFE_EPOLL_H_*/
diff --git a/drivers/net/failsafe/failsafe_epoll_bsdapp.c 
b/drivers/net/failsafe/failsafe_epoll_bsdapp.c
new file mode 100644
index 000..46c839b
--- /dev/null
+++ b/drivers/net/failsafe/failsafe_epoll_bsdapp.c
@@ -0,0 +1,19 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018 Mellanox Technologies, Ltd.
+ */
+
+/**
+ * @file
+ * epoll wrapper for failsafe driver.
+ */
+
+#include 
+
+#include "failsafe_epoll.h"
+
+int
+failsafe_epoll_create1(int flags)
+{
+   RTE_SET_USED(flags);
+   return -ENOTSUP;
+}
diff --git a/drivers/net/failsafe/failsafe_epoll_linuxapp.c 
b/drivers/net/failsafe/failsafe_epoll_linuxapp.c
new file mode 100644
index 000..d82ee0a
--- /dev/null
+++ b/drivers/net/failsafe/failsafe_epoll_linuxapp.c
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2018 Mellanox Technologies, Ltd.
+ */
+
+/**
+ * @file
+ * epoll wrapper for failsafe driver.
+ */
+
+#include 
+
+#include "failsafe_epoll.h"
+
+int
+failsafe_epoll_create1(int flags)
+{
+   return epoll_create1(flags);
+}
diff --git a/drivers/net/failsafe/failsafe_ether.c 
b/drivers/net/failsafe/failsafe_ether.c
index 8a4cacf..0f1630e 100644
--- a/drivers/net/failsafe/failsafe_ether.c
+++ b/drivers/net/failsafe/failsafe_ether.c
@@ -283,6 +283,7 @@
return;
switch (sdev->state) {
case DEV_STARTED:
+   failsafe_rx_intr_uninstall_subdevice(sdev);
rte_eth_dev_stop(PORT_ID(sdev));
sdev->state = DEV_ACTIVE;
/* fallthrough */
diff --git a/drivers/net/failsafe/failsafe_intr.c 
b/drivers/net/failsafe/failsafe_intr.c
index 54ef2f4..512efc7 100644
--- a/drivers/net/failsafe/failsafe_intr.c
+++ b/drivers/net/failsafe/failsafe_intr.c
@@ -9,8 +9,198 @@
 
 #include 
 
+#include "failsafe_epoll.h"
 #include "failsafe_private.h"
 
+#define NUM_RX_PROXIES (FAILSAFE_MAX_ETHPORTS * RTE_MAX_RXTX_INTR_VEC_ID)
+
+/**
+ * Install failsafe Rx event proxy subsystem.
+ * This is the way the failsafe PMD generates Rx events on behalf of its
+ * subdevices.
+ *
+ * @param priv
+ *   Pointer to failsafe private structure.
+ * @return
+ *   0 on success, negative errno value otherwise and rte_errno is set.
+ */
+static int
+fs_rx_event_proxy_install(struct fs_priv *priv)
+{
+   int rc = 0;
+
+   /*
+* Create the epoll fd and event vector for the proxy service to
+* wait on for Rx events generated by the subdevices.
+*/
+   priv->rxp.efd = failsafe_epoll_create1(0);
+   if (priv->rxp.efd < 0) {
+   rte_errno = errno;
+  

[dpdk-dev] [PATCH v6 3/3] net/failsafe: add Rx interrupts

2018-01-24 Thread Moti Haimovsky
This patch is the last patch in the series of patches aimed
to add support for registering and waiting for Rx interrupts
in failsafe PMD. This allows applications to wait for Rx events
from the PMD using the DPDK rte_epoll subsystem.
The failsafe PMD presents to the application a facade of a single
device to be handled by the application while internally it manages
several devices on behalf of the application including packets
transmission and reception.
The Proposed failsafe Rx interrupt scheme follows this approach.
The failsafe PMD will present the application with a single set of
Rx interrupt vectors representing the failsafe Rx queues, while
internally it will serve as an interrupt proxy for its subdevices.
will allow applications to wait for Rx traffic from the failsafe
PMD by registering and waiting for Rx events from its Rx queues.
In order to support this the following is suggested:
  * Every Rx queue in the failsafe (virtual) device will be assigned
  * a Linux event file descriptor (efd) and an enable_interrupts flag.
  * The failsafe PMD will fill in its rte_intr_handle structure with
the Rx efds assigned previously and register them with the EAL.
  * The failsafe driver will create a private epoll fd (epfd) and
  * will allocate enough space to handle all the Rx events from all its
subdevices.
  * Acting as an application,
for each Rx queue in each active subdevice the failsafe will:
  o Register the Rx queue with the EAL.
  o Pass the EAL the failsafe private epoll fd as the epfd to
register the Rx queue event on.
  o Pass the EAL, as a parameter, the pointer to the failsafe Rx
queue that handles this Rx queue.
  o Using the DPDK service callbacks, the failsafe PMD will launch
an Rx proxy service that will Wait on the epoll fd for Rx
events from the sub-devices.
  o For each Rx event received the proxy service will
  - Retrieve the pointer to failsafe Rx queue that handles
this subdevice Rx queue from the user info returned by the
EAL.
  - Trigger a failsafe Rx event on that queue by writing to
the event fd unless interrupts are disabled for that queue.
  * The failsafe pmd will also implement the rx_queue_intr_enable
  * and rx_queue_intr_disable routines that will enable and disable Rx
interrupts respectively on both on the failsafe and its subdevices.

Signed-off-by: Moti Haimovsky 
---
V6:
Separated between routines' variables definition and initialization
according to guidelines from Gaetan Rivet.

V5:
Modified code and split the patch into three patches in accordance to
inputs from Gaetan Rivet in reply to
1516354344-13495-2-git-send-email-mo...@mellanox.com

V4:
Fixed merge conflicts found during integration with other failsafe patches
(See cover letter).

V3:
Fixed build failures in FreeBSD10.3_64

V2:
Modifications according to inputs from Stephen Hemminger:
* Removed unneeded (void *) casting.
Fixed coding style warning.
---
 drivers/net/failsafe/failsafe_intr.c| 169 
 drivers/net/failsafe/failsafe_ops.c |   6 ++
 drivers/net/failsafe/failsafe_private.h |  17 +++-
 3 files changed, 191 insertions(+), 1 deletion(-)

diff --git a/drivers/net/failsafe/failsafe_intr.c 
b/drivers/net/failsafe/failsafe_intr.c
index 512efc7..215bfe4 100644
--- a/drivers/net/failsafe/failsafe_intr.c
+++ b/drivers/net/failsafe/failsafe_intr.c
@@ -9,12 +9,176 @@
 
 #include 
 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
 #include "failsafe_epoll.h"
 #include "failsafe_private.h"
 
 #define NUM_RX_PROXIES (FAILSAFE_MAX_ETHPORTS * RTE_MAX_RXTX_INTR_VEC_ID)
 
 /**
+ * Install failsafe Rx event proxy service.
+ * The Rx event proxy is the service that listens to Rx events from the
+ * subdevices and triggers failsafe Rx events accordingly.
+ *
+ * @param priv
+ *   Pointer to failsafe private structure.
+ * @return
+ *   0 on success, negative errno value otherwise.
+ */
+static int
+fs_rx_event_proxy_routine(void *data)
+{
+   struct fs_priv *priv;
+   struct rxq *rxq;
+   struct rte_epoll_event *events;
+   uint64_t u64;
+   int i, n;
+   int rc = 0;
+
+   u64 = 1;
+   priv = data;
+   events = priv->rxp.evec;
+   n = rte_epoll_wait(priv->rxp.efd, events, NUM_RX_PROXIES, -1);
+   for (i = 0; i < n; i++) {
+   rxq = events[i].epdata.data;
+   if (rxq->enable_events && rxq->event_fd != -1) {
+   if (write(rxq->event_fd, &u64, sizeof(u64)) !=
+   sizeof(u64)) {
+   ERROR("Failed to proxy Rx event to socket %d",
+  rxq->event_fd);
+   rc = -EIO;
+   }
+   }
+   }
+   return rc;
+}
+
+/**
+ * Uninstall failsafe Rx event proxy service.
+ *
+ * @param priv
+ *   Pointer to failsafe priva

[dpdk-dev] [PATCH v1] doc: add note in proc info for stats retrieval

2018-01-24 Thread Marko Kovacevic
Note added to outline that using
proc_info for virtual devices is not supported

Signed-off-by: Marko Kovacevic 
---
 doc/guides/tools/proc_info.rst | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/doc/guides/tools/proc_info.rst b/doc/guides/tools/proc_info.rst
index fd17e27..191ab20 100644
--- a/doc/guides/tools/proc_info.rst
+++ b/doc/guides/tools/proc_info.rst
@@ -68,3 +68,6 @@ The xstats-reset parameter controls the resetting of extended 
port statistics.
 If no port mask is specified xstats are reset for all DPDK ports.
 
 **-m**: Print DPDK memory information.
+
+.. NOTE::
+   NOTE: Stats retrieval using ``proc_info`` is not supported for virtual 
devices like PCAP and TAP
-- 
2.9.5



Re: [dpdk-dev] [PATCH] doc: document the new devargs syntax

2018-01-24 Thread Thomas Monjalon
24/01/2018 16:24, Yuanhan Liu:
> On Tue, Jan 23, 2018 at 05:08:16PM +0100, Gaëtan Rivet wrote:
> > Drivers answers to a specific API (ethdev, cryptodev, ...), to create
> > standardized objects in response to parameters that are given to them
> > for init. I think matching properties should be restricted to higher
> > classes (bus, eth/crypto),
> 
> That's also what I thought. But I'm okay to have "driver" category
> included for matching. I just don't really see a good example for that.
> 
> > while the driver class should be left
> > free-form and to the responsibility of the PMD itself (while having the
> > proper libraries for helping parsing safely, thus driving developpers
> > toward similar syntaxes, while not forcing them in those).
> 
> I agree. The drv args are parsed by the drivers after all. It's hard to
> have a good parser for all. I also don't know why we have to force them
> to use "key=value" pairs.
> 
> I even see some drawbacks from the forcement:
> 
> - some PMDs already use none key/value format. Forcing them breaks more.
>   If the "-w" "--vdev" compatibility is kept", nothing will be broken
>   from the user point of view. However, if "key=value" pair is going to
>   be used, user have to do some changes.
> 
> - Some "value" might have to use the nested "=". Handling the nested pairs
>   introduces more complexity.
> 
> - sometimes, it's simple without an assignment. For example, it could be
>   "driver=vhost-pmd,...,client" to let the vhost PMD acts as the client
>   mode.
> 
> Both Linux kernel and QEMU don't force the "key=value" pair usage, I don't
> see any good reason why we have to do that.

OK



Re: [dpdk-dev] [PATCH] doc: document the new devargs syntax

2018-01-24 Thread Thomas Monjalon
24/01/2018 16:04, Yuanhan Liu:
> On Wed, Jan 24, 2018 at 11:37:31AM +0100, Thomas Monjalon wrote:
> > 24/01/2018 11:36, Yuanhan Liu:
> > > On Wed, Jan 24, 2018 at 11:21:44AM +0100, Thomas Monjalon wrote:
> > > > 24/01/2018 10:28, Yuanhan Liu:
> > > > > On Wed, Jan 24, 2018 at 09:19:10AM +0100, Thomas Monjalon wrote:
> > > > > > 24/01/2018 07:43, Yuanhan Liu:
> > > > > > > On Tue, Jan 23, 2018 at 03:29:34PM +0100, Thomas Monjalon wrote:
> > > > > > > > 23/01/2018 13:46, Yuanhan Liu:
> > > > > > > > > If port not found, then the whole string will be used for dev 
> > > > > > > > > attachment.
> > > > > > > > > It means we are attaching a port with PCI BDF == 04.00.0 AND
> > > > > > > > > port == 0 (the 2nd port will not be attached).
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > And here is how the devargs would look like if 
> > > > > > > > > "matching;settings" is
> > > > > > > > > being used:
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > bus=pci,id=04:00.0/class=eth,port=0;bus=pci,id=04:00.0/class=eth,port=0/driver=mlx4,mlx4_arg_A=val,...
> > > > > > > > > 
> > > > > > > > > The part before ";" will be used for lookup and the later 
> > > > > > > > > part will be
> > > > > > > > > used for attachment. It should work. It just looks redundant.
> > > > > > > > 
> > > > > > > > It does not have to be redundant.
> > > > > > > > It can be:
> > > > > > > > 
> > > > > > > > bus=pci,id=04:00.0/class=eth,port=0;driver=mlx4,mlx4_arg1=settings1,...
> > > > > > > 
> > > > > > > I knew you would make such reply :)
> > > > > > > Then there is a contradiction. According your suggestion, the 
> > > > > > > "port=0" belongs
> > > > > > > to the matching section, but it also has to be used in the 
> > > > > > > settings section.
> > > > > > 
> > > > > > If port=0 is matched, it is already set, right?
> > > > > 
> > > > > Yes.
> > > > > 
> > > > > > Why it needs to be in settings?
> > > > > 
> > > > > But I was talking the case it's not matched, say it's not probed and 
> > > > > here
> > > > > we do hotplug.
> > > > 
> > > > I don't understand.
> > > > Anyway, the port property should be read-only.
> > > 
> > > All proberties should be read-only.
> > > 
> > > > Are we talking about the dev_port from the Linux kernel?
> > > 
> > > Yes. And it can be used for probing one port only (out of 2 ports in a 
> > > NIC)
> > > at probe stage. So, at this stage, it's a setting but not a match.
> > 
> > No it's a match!
> > 
> > A settings is changing data in the port.
> 
> So I see that's your definition about the "settings". What I think is
> everything needed for driver initiation are settings.
> 
> For example, one proposed interface for VF rep is the "vf_id" property,
> Similar to "port" property we have just discussed above,  it's used for
> probing one specific VR rep for the given VF id.
> 
> You can say it's a match here, just like the "port" property.
> 
> But note that "vf_id" could be a range, to enable multiple VF reps.
> The semantics looks like "setting" more than "match".

Not sure why it would look like settings.

> Another example is from the failsafe PMD that Gaetan had mentioned:
> 
> driver=failsafe,dev(bus=pci,id=00:02.0),fd(/some/file/)
> 
> They (dev and fd) should belong the "setting" section, for 2 reasons:
> 
> - they should not be used for matching
> - they are used for failsafe PMD initiation

Yes these ones are settings.

> But it belongs "match", according to your definition about "settings",
> because it doesn't change data in the port.

No, it changes data, dev() is adding a slave, and fd() is adding
a file descriptor.
So we agree these are settings.

> That also means, the word "settings" might not be well named. It's
> probably better to name it "drvargs".

I disagree. But it's only naming.
Settings can be class settings, not only driver settings.
And driver properties can be matching or settings.
So "drvargs" does not make sense.


[dpdk-dev] [PATCH] service: fix possible mem leak on initialize

2018-01-24 Thread Harry van Haaren
This commit ensures that if that if we run out of memory
during the initialization of the service library, that the
first allocated memory is correctly freed instead of leaked.

Fixes: 21698354c832 ("service: introduce service cores concept")
Cc: sta...@dpdk.org

Reported-by: Vipin Varghese 
Signed-off-by: Harry van Haaren 
---
 lib/librte_eal/common/rte_service.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/rte_service.c 
b/lib/librte_eal/common/rte_service.c
index 5f97d85..b40c3d9 100644
--- a/lib/librte_eal/common/rte_service.c
+++ b/lib/librte_eal/common/rte_service.c
@@ -82,14 +82,14 @@ int32_t rte_service_init(void)
RTE_CACHE_LINE_SIZE);
if (!rte_services) {
printf("error allocating rte services array\n");
-   return -ENOMEM;
+   goto fail_mem;
}
 
lcore_states = rte_calloc("rte_service_core_states", RTE_MAX_LCORE,
sizeof(struct core_state), RTE_CACHE_LINE_SIZE);
if (!lcore_states) {
printf("error allocating core states array\n");
-   return -ENOMEM;
+   goto fail_mem;
}
 
int i;
@@ -106,6 +106,12 @@ int32_t rte_service_init(void)
 
rte_service_library_initialized = 1;
return 0;
+fail_mem:
+   if (rte_services)
+   rte_free(rte_services);
+   if (lcore_states)
+   rte_free(lcore_states);
+   return -ENOMEM;
 }
 
 /* returns 1 if service is registered and has not been unregistered
-- 
2.7.4



Re: [dpdk-dev] [PATCH] net/enic: set L4 checksum flags for IPv6 packets

2018-01-24 Thread Ferruh Yigit
On 1/23/2018 1:05 AM, John Daley wrote:
> From: Hyong Youb Kim 
> 
> enic_cq_rx_to_pkt_flags() currently sets checksum good/bad flags only
> for IPv4.  The hardware actually validates the TCP/UDP checksum of
> IPv6 packets too. Set PKT_RX_L4_CKSUM_{GOOD,BAD} accordingly.
> 
> Signed-off-by: Hyong Youb Kim 
> Reviewed-by: John Daley 
> ---
>  drivers/net/enic/enic_rxtx.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/enic/enic_rxtx.c b/drivers/net/enic/enic_rxtx.c
> index 98902caa0..8157697a0 100644
> --- a/drivers/net/enic/enic_rxtx.c
> +++ b/drivers/net/enic/enic_rxtx.c
> @@ -185,14 +185,14 @@ enic_cq_rx_to_pkt_flags(struct cq_desc *cqd, struct 
> rte_mbuf *mbuf)
>   }
>  
>   /* checksum flags */
> - if (mbuf->packet_type & RTE_PTYPE_L3_IPV4) {
> + if (mbuf->packet_type & (RTE_PTYPE_L3_IPV4 | RTE_PTYPE_L3_IPV6)) {
>   if (!enic_cq_rx_desc_csum_not_calc(cqrd)) {
>   uint32_t l4_flags;
>   l4_flags = mbuf->packet_type & RTE_PTYPE_L4_MASK;
>  
>   if (enic_cq_rx_desc_ipv4_csum_ok(cqrd))
>   pkt_flags |= PKT_RX_IP_CKSUM_GOOD;
> - else
> + else if (mbuf->packet_type & RTE_PTYPE_L3_IPV4)

This looks like conflicting with commit log.
Is pkt_flags intentionally not set for this case?
If so can you update commit log to say only PKT_RX_IP_CKSUM_GOOD set for ipv6?

>   pkt_flags |= PKT_RX_IP_CKSUM_BAD;
>  
>   if (l4_flags == RTE_PTYPE_L4_UDP ||
> 



Re: [dpdk-dev] [PATCH] net/enic: add a Tx prepare handler

2018-01-24 Thread Ferruh Yigit
On 1/23/2018 1:05 AM, John Daley wrote:
> From: Hyong Youb Kim 
> 
> Like most NICs, this hardware (Cisco VIC) also requires partial
> checksum in the packet for checksum offload and TSO. So, add
> the tx_pkt_prepare handler like other PMDs do.
> 
> Technically, VIC has an offload mode that does not require partial
> checksum for non-TSO packets. But, it has no such mode for TSO
> packets, making tx_pkt_prepare unavoidable.
> 
> Signed-off-by: Hyong Youb Kim 
> Reviewed-by: John Daley 

Applied to dpdk-next-net/master, thanks.


Re: [dpdk-dev] [PATCH] net/enic: fix segfault due to static max number of queues

2018-01-24 Thread Ferruh Yigit
On 1/23/2018 1:05 AM, John Daley wrote:
> From: Hyong Youb Kim 
> 
> ENIC_CQ_MAX, ENIC_WQ_MAX and others are arbitrary values that
> prevent the app from using more queues when they are available on
> hardware. Remove them and dynamically allocate vnic_cq and such
> arrays to accommodate all available hardware queues.
> 
> As a side effect of removing ENIC_CQ_MAX, this commit fixes a segfault
> that would happen when the app requests more than 16 CQs, because
> enic_set_vnic_res() does not consider ENIC_CQ_MAX. For example, the
> following command causes a crash.
> 
> testpmd -- --rxq=16 --txq=16
> 
> Fixes: ce93d3c36db0 ("net/enic: fix resource check failures when bonding 
> devices")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Hyong Youb Kim 
> Reviewed-by: John Daley 

Applied to dpdk-next-net/master, thanks.


[dpdk-dev] [PATCH] maintainers: update for cryptodev

2018-01-24 Thread Pablo de Lara
Signed-off-by: Pablo de Lara 
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 5788ea004..924426343 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -276,6 +276,7 @@ F: examples/bbdev_app/
 F: doc/guides/sample_app_ug/bbdev_app.rst
 
 Crypto API
+M: Pablo de Lara 
 M: Declan Doherty 
 T: git://dpdk.org/next/dpdk-next-crypto
 F: lib/librte_cryptodev/
-- 
2.14.3



Re: [dpdk-dev] [PATCH] net/failsafe: fix Rx burst infinite loop

2018-01-24 Thread Ferruh Yigit
On 1/24/2018 10:31 AM, Gaëtan Rivet wrote:
> On Wed, Jan 24, 2018 at 10:19:17AM +, Matan Azrad wrote:
>> In case of plugged out device, the fail-safe PMD uses failsafe_rx_burst
>> function for packet receiving.
>>
>> This function iterates over the present sub-devices until it
>> receives a traffic from one of them or they are all cannot receive
>> packets.
>>
>> The corrupted code didn't advance the sub-device pointer when the
>> sub-device was not present and caused to infinite loop.
>>
>> Advance the sub-device pointer also in plugged-out sub-device case.
>>
>> Fixes: 8052bbd9d548 ("net/failsafe: improve Rx sub-devices iteration")
>>
>> Signed-off-by: Matan Azrad 
> Acked-by: Gaetan Rivet 

Applied to dpdk-next-net/master, thanks.


Re: [dpdk-dev] [PATCH] net/enic: set L4 checksum flags for IPv6 packets

2018-01-24 Thread Hyong Youb Kim
On Wed, Jan 24, 2018 at 05:18:37PM +, Ferruh Yigit wrote:
> On 1/23/2018 1:05 AM, John Daley wrote:
> > From: Hyong Youb Kim 
> > 
> > enic_cq_rx_to_pkt_flags() currently sets checksum good/bad flags only
> > for IPv4.  The hardware actually validates the TCP/UDP checksum of
> > IPv6 packets too. Set PKT_RX_L4_CKSUM_{GOOD,BAD} accordingly.
[...]
> > /* checksum flags */
> > -   if (mbuf->packet_type & RTE_PTYPE_L3_IPV4) {
> > +   if (mbuf->packet_type & (RTE_PTYPE_L3_IPV4 | RTE_PTYPE_L3_IPV6)) {
> > if (!enic_cq_rx_desc_csum_not_calc(cqrd)) {
> > uint32_t l4_flags;
> > l4_flags = mbuf->packet_type & RTE_PTYPE_L4_MASK;
> >  
> > if (enic_cq_rx_desc_ipv4_csum_ok(cqrd))
> > pkt_flags |= PKT_RX_IP_CKSUM_GOOD;
> > -   else
> > +   else if (mbuf->packet_type & RTE_PTYPE_L3_IPV4)
> 
> This looks like conflicting with commit log.
> Is pkt_flags intentionally not set for this case?
> If so can you update commit log to say only PKT_RX_IP_CKSUM_GOOD set for ipv6?
> 
> > pkt_flags |= PKT_RX_IP_CKSUM_BAD;

Yes, it is intentional. IPv6 has no IP header checksum, and
PKT_RX_IP_CKSUM_{GOOD,BAD} does not apply. So, the code does not set
PKT_RX_IP_CKSUM for IPv6. As the commit log says, for IPv6, we are
only setting L4 checksum flags.

-Hyong


Re: [dpdk-dev] [PATCH] net/enic: set L4 checksum flags for IPv6 packets

2018-01-24 Thread Ferruh Yigit
On 1/24/2018 5:30 PM, Hyong Youb Kim wrote:
> On Wed, Jan 24, 2018 at 05:18:37PM +, Ferruh Yigit wrote:
>> On 1/23/2018 1:05 AM, John Daley wrote:
>>> From: Hyong Youb Kim 
>>>
>>> enic_cq_rx_to_pkt_flags() currently sets checksum good/bad flags only
>>> for IPv4.  The hardware actually validates the TCP/UDP checksum of
>>> IPv6 packets too. Set PKT_RX_L4_CKSUM_{GOOD,BAD} accordingly.
> [...]
>>> /* checksum flags */
>>> -   if (mbuf->packet_type & RTE_PTYPE_L3_IPV4) {
>>> +   if (mbuf->packet_type & (RTE_PTYPE_L3_IPV4 | RTE_PTYPE_L3_IPV6)) {
>>> if (!enic_cq_rx_desc_csum_not_calc(cqrd)) {
>>> uint32_t l4_flags;
>>> l4_flags = mbuf->packet_type & RTE_PTYPE_L4_MASK;
>>>  
>>> if (enic_cq_rx_desc_ipv4_csum_ok(cqrd))
>>> pkt_flags |= PKT_RX_IP_CKSUM_GOOD;
>>> -   else
>>> +   else if (mbuf->packet_type & RTE_PTYPE_L3_IPV4)
>>
>> This looks like conflicting with commit log.
>> Is pkt_flags intentionally not set for this case?
>> If so can you update commit log to say only PKT_RX_IP_CKSUM_GOOD set for 
>> ipv6?
>>
>>> pkt_flags |= PKT_RX_IP_CKSUM_BAD;
> 
> Yes, it is intentional. IPv6 has no IP header checksum, and
> PKT_RX_IP_CKSUM_{GOOD,BAD} does not apply. So, the code does not set
> PKT_RX_IP_CKSUM for IPv6. As the commit log says, for IPv6, we are
> only setting L4 checksum flags.

Ahh, I see.
And I assume enic_cq_rx_desc_ipv4_csum_ok() always false for IPv6.

> 
> -Hyong
> 



Re: [dpdk-dev] [PATCH] net/enic: set L4 checksum flags for IPv6 packets

2018-01-24 Thread Hyong Youb Kim
On Wed, Jan 24, 2018 at 05:45:48PM +, Ferruh Yigit wrote:
> On 1/24/2018 5:30 PM, Hyong Youb Kim wrote:
> > On Wed, Jan 24, 2018 at 05:18:37PM +, Ferruh Yigit wrote:
> >> On 1/23/2018 1:05 AM, John Daley wrote:
> >>> From: Hyong Youb Kim 
> >>>
> >>> enic_cq_rx_to_pkt_flags() currently sets checksum good/bad flags only
> >>> for IPv4.  The hardware actually validates the TCP/UDP checksum of
> >>> IPv6 packets too. Set PKT_RX_L4_CKSUM_{GOOD,BAD} accordingly.
> > [...]
> >>>   /* checksum flags */
> >>> - if (mbuf->packet_type & RTE_PTYPE_L3_IPV4) {
> >>> + if (mbuf->packet_type & (RTE_PTYPE_L3_IPV4 | RTE_PTYPE_L3_IPV6)) {
> >>>   if (!enic_cq_rx_desc_csum_not_calc(cqrd)) {
> >>>   uint32_t l4_flags;
> >>>   l4_flags = mbuf->packet_type & RTE_PTYPE_L4_MASK;
> >>>  
> >>>   if (enic_cq_rx_desc_ipv4_csum_ok(cqrd))
> >>>   pkt_flags |= PKT_RX_IP_CKSUM_GOOD;
> >>> - else
> >>> + else if (mbuf->packet_type & RTE_PTYPE_L3_IPV4)
> >>
> >> This looks like conflicting with commit log.
> >> Is pkt_flags intentionally not set for this case?
> >> If so can you update commit log to say only PKT_RX_IP_CKSUM_GOOD set for 
> >> ipv6?
> >>
> >>>   pkt_flags |= PKT_RX_IP_CKSUM_BAD;
> > 
> > Yes, it is intentional. IPv6 has no IP header checksum, and
> > PKT_RX_IP_CKSUM_{GOOD,BAD} does not apply. So, the code does not set
> > PKT_RX_IP_CKSUM for IPv6. As the commit log says, for IPv6, we are
> > only setting L4 checksum flags.
> 
> Ahh, I see.
> And I assume enic_cq_rx_desc_ipv4_csum_ok() always false for IPv6.
> 

Yes, that is correct. enic_cq_rx_desc_ipv4_csum_ok() returns true only
for IPv4. It is a bit confusing..

-Hyong


Re: [dpdk-dev] [PATCH] net/enic: set L4 checksum flags for IPv6 packets

2018-01-24 Thread Ferruh Yigit
On 1/23/2018 1:05 AM, John Daley wrote:
> From: Hyong Youb Kim 
> 
> enic_cq_rx_to_pkt_flags() currently sets checksum good/bad flags only
> for IPv4.  The hardware actually validates the TCP/UDP checksum of
> IPv6 packets too. Set PKT_RX_L4_CKSUM_{GOOD,BAD} accordingly.
> 
> Signed-off-by: Hyong Youb Kim 
> Reviewed-by: John Daley 

Applied to dpdk-next-net/master, thanks.


Re: [dpdk-dev] [PATCH] event/opdl: rework loops to comply with dpdk style

2018-01-24 Thread Thomas Monjalon
22/01/2018 11:30, Liang, Ma:
> On 22 Jan 10:04, Harry van Haaren wrote:
> > This commit reworks the loop counter variable declarations
> > to be in line with the DPDK source code.
> > 
> > Fixes: 3c7f3dcfb099 ("event/opdl: add PMD main body and helper function")
> > Fixes: 8ca8e3b48eff ("event/opdl: add event queue config get/set")
> > Fixes: d548ef513cd7 ("event/opdl: add unit tests")
> > 
> > Cc: liang.j...@intel.com
> > Cc: peter.mccar...@intel.com
> > 
> > Signed-off-by: Harry van Haaren 
> > 
> > ---
> Many thanks Harry. 
> 
> Acked-by: Liang Ma  

Applied, thanks


Re: [dpdk-dev] [PATCH] drivers/event: fix resource leak in selftest

2018-01-24 Thread Thomas Monjalon
23/01/2018 15:17, Van Haaren, Harry:
> > From: Pavan Nikhilesh [mailto:pbhagavat...@caviumnetworks.com]
> > Sent: Monday, January 22, 2018 5:46 PM
> > To: jerin.ja...@caviumnetworks.com; Van Haaren, Harry
> > 
> > Cc: dev@dpdk.org; Pavan Nikhilesh 
> > Subject: [dpdk-dev] [PATCH] drivers/event: fix resource leak in selftest
> > 
> > Free resources leak in eventdev selftests.
> > 
> > Coverity issue: 257044
> > Coverity issue: 257047
> > Coverity issue: 257009
> > Fixes: 9ef576176db0 ("test/eventdev: add octeontx multi queue and multi
> > port")
> > Fixes: 3a17ff401f1e ("test/eventdev: add basic SW tests")
> > Fixes: 5e6eb5ccd788 ("event/sw: make test standalone")
> > 
> > Signed-off-by: Pavan Nikhilesh 
> 
> Acked-by: Harry van Haaren 

Applied, thanks



Re: [dpdk-dev] [PATCH v1] net/tap: use local eBPF definitions

2018-01-24 Thread Thomas Monjalon
24/01/2018 12:05, Van Haaren, Harry:
> > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Ophir Munk
> > Sent: Tuesday, January 23, 2018 9:54 PM
> > To: dev@dpdk.org; Pascal Mazon 
> > Cc: Thomas Monjalon ; Olga Shern ;
> > Ophir Munk 
> > Subject: [dpdk-dev] [PATCH v1] net/tap: use local eBPF definitions
> > 
> > eBPF has a graceful approach: it must successfully compile on all Linux
> > distributions. If a specific kernel cannot support eBPF it will gracefully
> > refuse the eBPF netlink message sent to it.
> > The kernel header file linux/bpf.h (if present) on different Linux
> > distributions may not include all definitions required for TAP
> > compilation.
> > In order to guarantee a successful eBPF compilation everywhere all the
> > required definitions for TAP have been locally added instead of including
> > file 
> > 
> > Signed-off-by: Ophir Munk 
> 
> Tested on a Fedora 20 vm, uname -r = 3.15.6-200.fc20.x86_64
> 
> Confirmed before patch was failing, with patch build is fixed.
> 
> Tested-by: Harry van Haaren 

Applied, thanks



Re: [dpdk-dev] [PATCH v2] net/vdev_netvsc: fix build using C11 mode and pedantic

2018-01-24 Thread Thomas Monjalon
24/01/2018 15:45, Matan Azrad:
> Hi Ophir
> 
> From: Ophir Munk, Wednesday, January 24, 2018 4:12 PM
> > Remove CFLAGS -std=c11 and -pedantic in order to guarantee a successful
> > vdev_netvsc compilation on old Linux distributions.
> > Otherwise old GCC compilers may complain as follows:
> > cc1: error: unrecognized command line option -std=c11
> > 
> > Fixes: 6086ab3bb3d2 ("net/vdev_netvsc: introduce Hyper-V platform
> > driver")
> > Cc: sta...@dpdk.org
> 
> No need to backport this fix.
> 
> > Signed-off-by: Ophir Munk 
> 
> Besides that,
> Acked-by: Matan Azrad 

Applied, thanks



Re: [dpdk-dev] [dpdk-stable] [PATCH v2] net/vdev_netvsc: fix build using C11 mode and pedantic

2018-01-24 Thread Thomas Monjalon
24/01/2018 16:39, Stephen Hemminger:
> On Wed, 24 Jan 2018 14:12:13 +
> Ophir Munk  wrote:
> > --- a/drivers/net/vdev_netvsc/Makefile
> > +++ b/drivers/net/vdev_netvsc/Makefile
> > @@ -12,7 +12,7 @@ EXPORT_MAP := rte_pmd_vdev_netvsc_version.map
> >  # Additional compilation flags.
> >  CFLAGS += -O3
> >  CFLAGS += -g
> > -CFLAGS += -std=c11 -pedantic -Wall -Wextra
> > +CFLAGS += -Wall -Wextra
> >  CFLAGS += -D_XOPEN_SOURCE=600
> >  CFLAGS += -D_BSD_SOURCE
> >  CFLAGS += -D_DEFAULT_SOURCE
> 
> Why did this driver not use $(WERROR) like rest of DPDK drivers.

It can be a separate patch.
Matan?


Re: [dpdk-dev] [dpdk-stable] [PATCH v2] net/vdev_netvsc: fix build using C11 mode and pedantic

2018-01-24 Thread Stephen Hemminger
On Wed, 24 Jan 2018 19:08:02 +0100
Thomas Monjalon  wrote:

> 24/01/2018 16:39, Stephen Hemminger:
> > On Wed, 24 Jan 2018 14:12:13 +
> > Ophir Munk  wrote:  
> > > --- a/drivers/net/vdev_netvsc/Makefile
> > > +++ b/drivers/net/vdev_netvsc/Makefile
> > > @@ -12,7 +12,7 @@ EXPORT_MAP := rte_pmd_vdev_netvsc_version.map
> > >  # Additional compilation flags.
> > >  CFLAGS += -O3
> > >  CFLAGS += -g
> > > -CFLAGS += -std=c11 -pedantic -Wall -Wextra
> > > +CFLAGS += -Wall -Wextra
> > >  CFLAGS += -D_XOPEN_SOURCE=600
> > >  CFLAGS += -D_BSD_SOURCE
> > >  CFLAGS += -D_DEFAULT_SOURCE  
> > 
> > Why did this driver not use $(WERROR) like rest of DPDK drivers.  
> 
> It can be a separate patch.
> Matan?

I meant that you should use:

CFLAGS += $(WERROR_FLAGS)

instead of

CFLAGS += -Wall -Wextra

in this patch.

Also, do you really  need all the other CFLAGS? Why?

This driver has no reason to be a special case different from what is done
in virtio, vmxnet3, ixgbe, e1000, ...


Re: [dpdk-dev] [PATCH v3 7/7] app/testpmd: adjust ethdev port ownership

2018-01-24 Thread Ananyev, Konstantin


> -Original Message-
> From: Thomas Monjalon [mailto:tho...@monjalon.net]
> Sent: Wednesday, January 24, 2018 8:10 AM
> To: Ananyev, Konstantin 
> Cc: Matan Azrad ; Gaëtan Rivet ; 
> Wu, Jingjing ;
> dev@dpdk.org; Neil Horman ; Richardson, Bruce 
> 
> Subject: Re: [dpdk-dev] [PATCH v3 7/7] app/testpmd: adjust ethdev port 
> ownership
> 
> 23/01/2018 22:18, Ananyev, Konstantin:
> > >
> > > 23/01/2018 16:18, Ananyev, Konstantin:
> > > > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Ananyev, Konstantin
> > > > > From: Thomas Monjalon [mailto:tho...@monjalon.net]
> > > > > > 23/01/2018 14:34, Ananyev, Konstantin:
> > > > > > > If that' s the use case, then I think you need to set device 
> > > > > > > ownership at creation time -
> > > > > > > inside dev_allocate().
> > > > > > > Again that would avoid such racing conditions inside testpmd.
> > > > > >
> > > > > > The devices must be allocated at a low level layer.
> > > > >
> > > > > No one arguing about that.
> > > > > But we can provide owner id information to the low level.
> > >
> > > Sorry, you did not get it.
> >
> > Might be.
> >
> > > We cannot provide owner id at the low level
> > > because it is not yet decided who will be the owner
> > > before the port is allocated.
> >
> > Why is that?
> > What prevents us decide who will manage that device *before* allocating 
> > port of it?
> > IMO we do have all needed information at that stage.
> 
> We don't have the information.

At that point we do have dev name and all parameters, right?
Plus we do have blacklist/whitelist, etc.
So what else are we missing to make the decision at that point? 

> It is a new device, it is matched by a driver which allocates a port.
> I don't see where the higher level can interact here.
> And even if you manage a trick, the higher level needs to read the port
> informations to decide the ownership.

Could you specify what particular port information it needs?

> 
> > > > > > When a new device appears (hotplug), an ethdev port should be 
> > > > > > allocated
> > > > > > automatically if it passes the whitelist/blacklist policy test.
> > > > > > Then we must decide who will manage this device.
> > > > > > I suggest notifying the DPDK libs first.
> > > > > > So a DPDK lib or PMD like failsafe can have the priority to take the
> > > > > > ownership in its notification callback.
> > > > >
> > > > > Possible, but seems a bit overcomplicated.
> > > > > Why not just:
> > > > >
> > > > > Have a global variable process_default_owner_id, that would be init 
> > > > > once at startup.
> > > > > Have an LTS variable default_owner_id.
> > > > > It will be used by rte_eth_dev_allocate() caller can set 
> > > > > dev->owner_id at creation time,
> > > > > so port allocation and setting ownership - will be an atomic 
> > > > > operation.
> > > > > At the exit rte_eth_dev_allocate() will always reset 
> > > > > default_owner_id=0:
> > > > >
> > > > > rte_eth_dev_allocate(...)
> > > > > {
> > > > >lock(owner_lock);
> > > > >
> > > > >owner = RTE_PER_LCORE(default_owner_id);
> > > > >if (owner == 0)
> > > > >owner = process_default_owner_id;
> > > > >   set_owner(port, ..., owner);
> > > > >  unlock(owner_lock);
> > > > >  RTE_PER_LCORE(default_owner_id) = 0;
> > > >
> > > > Or probably better to leave default_owner_id reset to the caller.
> > > > Another thing - we can use same LTS variable in all control ops to
> > > > allow/disallow changing of port configuration based on ownership.
> > > > Konstantin
> > > >
> > > > > }
> > > > >
> > > > > So callers who don't need any special ownership - don't need to do 
> > > > > anything.
> > > > > Special callers (like failsafe) can set default_owenr_id just before 
> > > > > calling hotplug
> > > > > handling routine.
> > >
> > > No, hotplug will not be a routine.
> > > I am talking about real hotplug, like a device which appears in the VM.
> > > This new device must be handled by EAL and probed automatically if
> > > comply with whitelist/blacklist policy given by the application or user.
> > > Real hotplug is asynchronous.
> >
> > By 'asynchronous' here you mean it would be handled in the EAL interrupt 
> > thread
> > or something different?
> 
> Yes, we receive an hotplug event which is processed in the event thread.
> 
> > Anyway, I suppose  you do need a function inside DPDK that will do the 
> > actual work in response
> > on hotplug event, right?
> 
> Yes

Ok, btw why that function has to be always called from interrupt thread?
Why not to allow user to decide?
Some apps have their own thread dedicated for control ops (like testpmd)
For them it might be plausible to call that function from their own control 
thread context.
Konstantin

> 
> > That's what I refer to as 'hotplug routine' above.
> >
> > > We will just receive notifications that it appeared.
> > >
> > > Note: there is some temporary code in failsafe to manage some hotplug.
> > > This code must be removed when it will be properly handled in 

Re: [dpdk-dev] IXGBE, IOMMU DMAR DRHD handling fault issue

2018-01-24 Thread Ravi Kerur
Hi Burakov, Thank you. I will try with vfio-pci driver. I am assuming it
will work for both PF and VF interfaces since I am using both in my setup?

Thanks.

On Wed, Jan 24, 2018 at 2:31 AM, Burakov, Anatoly  wrote:

> On 23-Jan-18 5:25 PM, Ravi Kerur wrote:
>
>> Hi,
>>
>> I am running into an issue when DPDK is started with iommu on via GRUB
>> command. Problem is not seen with regular kernel driver, error messages
>> show when DPDK is started and happens for both PF and VF interfaces.
>>
>> I am using DPDK 17.05 so the patch proposed in the following link is
>> available
>> http://dpdk.org/ml/archives/dev/2017-February/057048.html
>>
>> Workaround is to use "iommu=pt" but I want iommu enabled in my setup. I
>> checked BIOS for reserved memory(DMA RMRR for IXGBE) didn't get any
>> details
>> on it.
>>
>> Kindly let me know how to resolve this issue.
>>
>> Following are the details
>>
>> (1) Linux kernel 4.9
>> (2) DPDK 17.05
>>
>> (3) IXGBE details
>> ethtool -i enp4s0f0  (PF driver)
>> driver: ixgbe
>> version: 5.3.3
>> firmware-version: 0x87b8, 1.1018.0
>> bus-info: :04:00.0
>> supports-statistics: yes
>> supports-test: yes
>> supports-eeprom-access: yes
>> supports-register-dump: yes
>> supports-priv-flags: yes
>>
>> ethtool -i enp4s16f2 (VF driver)
>> driver: ixgbevf
>> version: 4.3.2
>> firmware-version:
>> bus-info: :04:10.2
>> supports-statistics: yes
>> supports-test: yes
>> supports-eeprom-access: no
>> supports-register-dump: yes
>> supports-priv-flags: no
>>
>> Bus info  Device   Class  Description
>> =
>> pci@:01:00.0  ens11f0  network82599ES 10-Gigabit SFI/SFP+
>> Network Connection
>> pci@:01:00.1  ens11f1  network82599ES 10-Gigabit SFI/SFP+
>> Network Connection
>> pci@:04:00.0  enp4s0f0 network82599ES 10-Gigabit SFI/SFP+
>> Network Connection
>> pci@:04:00.1  enp4s0f1 network82599ES 10-Gigabit SFI/SFP+
>> Network Connection
>> pci@:04:10.0  enp4s16  networkIllegal Vendor ID
>> pci@:04:10.2  enp4s16f2networkIllegal Vendor ID
>>
>> (4) DPDK bind interfaces
>>
>> # dpdk-devbind -s
>>
>> Network devices using DPDK-compatible driver
>> 
>> :01:00.0 '82599ES 10-Gigabit SFI/SFP+ Network Connection 10fb'
>> drv=igb_uio unused=vfio-pci
>> :04:10.2 '82599 Ethernet Controller Virtual Function 10ed' drv=igb_uio
>> unused=vfio-pci
>>
>> Network devices using kernel driver
>> ===
>> :01:00.1 '82599ES 10-Gigabit SFI/SFP+ Network Connection 10fb'
>> if=ens11f1 drv=ixgbe unused=igb_uio,vfio-pci
>> :04:00.0 '82599ES 10-Gigabit SFI/SFP+ Network Connection 10fb'
>> if=enp4s0f0 drv=ixgbe unused=igb_uio,vfio-pci
>> :04:00.1 '82599ES 10-Gigabit SFI/SFP+ Network Connection 10fb'
>> if=enp4s0f1 drv=ixgbe unused=igb_uio,vfio-pci
>> :04:10.0 '82599 Ethernet Controller Virtual Function 10ed' if=enp4s16
>> drv=ixgbevf unused=igb_uio,vfio-pci
>> :06:00.0 'I210 Gigabit Network Connection 1533' if=eno1 drv=igb
>> unused=igb_uio,vfio-pci *Active*
>>
>> Other Network devices
>> =
>> 
>>
>> ...
>>
>> (5) Kernel dmesg
>>
>> # dmesg | grep -e DMAR
>> [0.00] ACPI: DMAR 0x7999BAD0 E0 (v01 ALASKA A M I
>> 0001 INTL 20091013)
>> [0.00] DMAR: IOMMU enabled
>> [0.518747] DMAR: Host address width 46
>> [0.526616] DMAR: DRHD base: 0x00fbffc000 flags: 0x0
>> [0.537447] DMAR: dmar0: reg_base_addr fbffc000 ver 1:0 cap
>> d2078c106f0466 ecap f020df
>> [0.553620] DMAR: DRHD base: 0x00c7ffc000 flags: 0x1
>> [0.564445] DMAR: dmar1: reg_base_addr c7ffc000 ver 1:0 cap
>> d2078c106f0466 ecap f020df
>> [0.580611] DMAR: RMRR base: 0x007bbc6000 end: 0x007bbd4fff
>> [0.593344] DMAR: ATSR flags: 0x0
>> [0.600178] DMAR: RHSA base: 0x00c7ffc000 proximity domain: 0x0
>> [0.612905] DMAR: RHSA base: 0x00fbffc000 proximity domain: 0x1
>> [0.625632] DMAR-IR: IOAPIC id 3 under DRHD base  0xfbffc000 IOMMU 0
>> [0.638522] DMAR-IR: IOAPIC id 1 under DRHD base  0xc7ffc000 IOMMU 1
>> [0.651426] DMAR-IR: IOAPIC id 2 under DRHD base  0xc7ffc000 IOMMU 1
>> [0.664324] DMAR-IR: HPET id 0 under DRHD base 0xc7ffc000
>> [0.675326] DMAR-IR: Queued invalidation will be enabled to support
>> x2apic and Intr-remapping.
>> [0.693805] DMAR-IR: Enabled IRQ remapping in x2apic mode
>> [9.395170] DMAR: dmar1: Using Queued invalidation
>> [9.405011] DMAR: Setting RMRR:
>> [9.412006] DMAR: Setting identity map for device :00:1d.0
>> [0x7bbc6000 - 0x7bbd4fff]
>> [9.428569] DMAR: Prepare 0-16MiB unity mapping for LPC
>> [9.439712] DMAR: Setting identity map for device :00:1f.0 [0x0 -
>> 0xff]
>> [9.454684] DMAR: Intel(R) Virtualization Technology for Directed I/O
>> [  287.023068] DMAR: DRHD: handling fault status reg 2
>> [  287.

Re: [dpdk-dev] [RFC v3 1/1] lib: add compressdev API

2018-01-24 Thread Ahmed Mansour
Hi All,

Please see responses in line.

Thanks,

Ahmed

On 1/23/2018 6:58 AM, Verma, Shally wrote:
> Hi Fiona
>
>> -Original Message-
>> From: Trahe, Fiona [mailto:fiona.tr...@intel.com]
>> Sent: 19 January 2018 17:30
>> To: Verma, Shally ; dev@dpdk.org;
>> akhil.go...@nxp.com
>> Cc: Challa, Mahipal ; Athreya, Narayana
>> Prasad ; De Lara Guarch, Pablo
>> ; Gupta, Ashish
>> ; Sahu, Sunila ;
>> Jain, Deepak K ; Hemant Agrawal
>> ; Roy Pledge ; Youri
>> Querry ; Ahmed Mansour
>> ; Trahe, Fiona 
>> Subject: RE: [RFC v3 1/1] lib: add compressdev API
>>
>> Hi Shally,
>>
>>> -Original Message-
>>> From: Verma, Shally [mailto:shally.ve...@cavium.com]
>>> Sent: Thursday, January 18, 2018 12:54 PM
>>> To: Trahe, Fiona ; dev@dpdk.org
>>> Cc: Challa, Mahipal ; Athreya, Narayana
>> Prasad
>>> ; De Lara Guarch, Pablo
>> ;
>>> Gupta, Ashish ; Sahu, Sunila
>> ; Jain, Deepak K
>>> ; Hemant Agrawal
>> ; Roy Pledge
>>> ; Youri Querry ;
>> Ahmed Mansour
>>> 
>>> Subject: RE: [RFC v3 1/1] lib: add compressdev API
>>>
>>> Hi Fiona
>>>
>>> While revisiting this, we identified few questions and additions. Please see
>> them inline.
>>>
 -Original Message-
 From: Trahe, Fiona [mailto:fiona.tr...@intel.com]
 Sent: 15 December 2017 23:19
 To: dev@dpdk.org; Verma, Shally 
 Cc: Challa, Mahipal ; Athreya, Narayana
 Prasad ;
 pablo.de.lara.gua...@intel.com; fiona.tr...@intel.com
 Subject: [RFC v3 1/1] lib: add compressdev API

 Signed-off-by: Trahe, Fiona 
 ---
>>> //snip
>>>
 +
 +int
 +rte_compressdev_queue_pair_setup(uint8_t dev_id, uint16_t
 queue_pair_id,
 +  uint32_t max_inflight_ops, int socket_id)
>>> [Shally] Is max_inflights_ops different from nb_streams_per_qp in struct
>> rte_compressdev_info?
>>> I assume they both carry same purpose. If yes, then it will be better to use
>> single naming convention to
>>> avoid confusion.
>> [Fiona] No, I think they have different purposes.
>> max_inflight_ops should be used to configure the qp with the number of ops
>> the application expects to be able to submit to the qp before it needs to 
>> poll
>> for a response. It can be configured differently for each qp. In the QAT 
>> case it
>> dictates the depth of the qp created, it may have different implications on
>> other PMDs.
>> nb_sessions_per_qp and nb_streams_per_qp are limitations the devices
>> reports and are same for all qps on the device. QAT doesn't have those
>> limitations and so would report 0, however I assumed they may be necessary
>> for other devices.
>> This assumption is based on the patch submitted by NXP to cryptodev in Feb
>> 2017
>>  
>> https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fdpdk.org%2Fml%2Farchives%2Fdev%2F2017-March%2F060740.html&data=02%7C01%7Cahmed.mansour%40nxp.com%7Cb012d74d7530493b155108d56258955f%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0%7C0%7C636523054981379413&sdata=2SazlEazMxcBGS7R58CpNrX0G5OeWx8PLMwf%2FYzqv34%3D&reserved=0
>> I also assume these are not necessarily the max number of sessions in ops on
>> the qp at a given time, but the total number attached, i.e. if the device has
>> this limitation then sessions must be attached to qps, and presumably
>> reserve some resources. Being attached doesn't imply there is an op on the
>> qp at that time using that session. So it's not to relating to the inflight 
>> op
>> count, but to the number of sessions attached/detached to the qp.
>> Including Akhil on the To list, maybe NXP can confirm if these params are
>> needed.
> [Shally] Ok. Then let's wait for NXP to confirm on this requirement as 
> currently spec doesn't have any API to attach 
> queue_pair_to_specific_session_or_stream as cryptodev.
>
> But then how application could know limit on max_inflight_ops supported on a 
> qp? As it can pass any random number during qp_setup().
> Do you believe we need to add a capability field in dev_info to indicate 
> limit on max_inflight_ops?
>
> Thanks
> Shally
[Ahmed] @Fiona This looks ok. max_inflight_ops makes sense. I understand
it as a push back mechanism per qp. We do not have physical limit for
number of streams or sessions on a qp in our hardware, so we would
return 0 here as well.
@Shally in our PMD implementation we do not attach streams or sessions
to a particular qp. Regarding max_inflight_ops. I think that limit
should be independent of hardware. Not all enqueues must succeed. The
hardware can push back against the enqueuer dynamically if the resources
needed to accommodate additional ops are not available yet. This push
back happens in the software if the user sets a max_inflight_ops that is
less that the hardware max_inflight_ops. The same return pathway can be
exercised if the user actually attempts to enqueue more than the
supported max_inflight_ops because of the hardware.
>>
>>> Also, is it optional API? Like Is this a valid use case?:
>>> dev_configure() --> dev_start() --> qp_start() --> enqueue/dequeue() -->
>> qp_stop(

Re: [dpdk-dev] [RFC v2, 1/2] cryptodev: add support to set session private data

2018-01-24 Thread De Lara Guarch, Pablo


> -Original Message-
> From: Gujjar, Abhinandan S
> Sent: Tuesday, January 23, 2018 8:54 AM
> To: Doherty, Declan ; akhil.go...@nxp.com; De
> Lara Guarch, Pablo ;
> jerin.jacobkollanukka...@cavium.com
> Cc: dev@dpdk.org; Vangati, Narender ;
> Gujjar, Abhinandan S ; Rao, Nikhil
> 
> Subject: [RFC v2, 1/2] cryptodev: add support to set session private data
> 
> Update rte_crypto_op to indicate private data offset.
> 
> The application may want to store private data along with the
> rte_cryptodev that is transparent to the rte_cryptodev layer.
> For e.g., If an eventdev based application is submitting a
> rte_cryptodev_sym_session operation and wants to indicate event
> information required to construct a new event that will be enqueued to
> eventdev after completion of the rte_cryptodev_sym_session operation.
> This patch provides a mechanism for the application to associate this
> information with the rte_cryptodev_sym_session session.
> The application can set the private data using
> rte_cryptodev_sym_session_set_private_data() and retrieve it using
> rte_cryptodev_sym_session_get_private_data().

Hi Abhinandan,

> 
> Signed-off-by: Abhinandan Gujjar 
> Signed-off-by: Nikhil Rao 
> ---
> Notes:
> V2:
>   1. Removed enum rte_crypto_op_private_data_type
>   2. Corrected formatting
> 
>  lib/librte_cryptodev/rte_crypto.h|  8 ++--
>  lib/librte_cryptodev/rte_cryptodev.h | 32
> 
>  2 files changed, 38 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/librte_cryptodev/rte_crypto.h
> b/lib/librte_cryptodev/rte_crypto.h
> index 95cf861..14c87c8 100644
> --- a/lib/librte_cryptodev/rte_crypto.h
> +++ b/lib/librte_cryptodev/rte_crypto.h
> @@ -84,8 +84,12 @@ struct rte_crypto_op {
>*/
>   uint8_t sess_type;
>   /**< operation session type */
> -
> - uint8_t reserved[5];
> + uint16_t private_data_offset;
> + /**< Offset to indicate start of private data (if any). The private
> +  * data may be used by the application to store information which
> +  * should remain untouched in the library/driver

Is this the offset for the private data after the crypto operation?
>From your title, it looks like it is for the session private data, but then, 
>this shouldn't be here.
If it is for the crypto operation, I suggest you to separate it in another 
patch.
Also, you should indicate where the offset starts from. For the IV, the offset 
is counted
from the start of the rte_crypto_op, so I think it should be the same, to keep 
consistency.

For the session private data, we see two options:

1 - Add a  "valid" private data field in the rte_cryptodev_sym_session 
structure,
so when it is set, it indicates that the session contains private data
(a single bit would be enough, 1 to indicate there is, and 0 to indicate there 
is not).
This would go into the beginning of the structure, so this would require an ABI 
deprecation notice.
This also assumes that the private data starts just after the session header

2 -  Do not add an extra "valid" private data field in 
rte_cryptodev_sym_session structure,
and add a small header in the private data, which contains the "valid" bit.
Then, when calling sym_session_get_private_data, this bit should be checked.
Note that the object that holds the session structure needs to be big enough to 
hold this value.
If the object has only space for the sess_private_data array, then the session 
has no private data.
Therefore, this approach might be less performant, but with no ABI deprecation 
required.

I would recommend you to send a deprecation notice for option 1, then check the 
performance of both option,
and if needed, make the change in the structure, in 18.05.

Regards,
Pablo


Re: [dpdk-dev] [PATCH] maintainers: update for cryptodev

2018-01-24 Thread Doherty, Declan

On 24/01/2018 5:24 PM, Pablo de Lara wrote:

Signed-off-by: Pablo de Lara 
---
  MAINTAINERS | 1 +
  1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 5788ea004..924426343 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -276,6 +276,7 @@ F: examples/bbdev_app/
  F: doc/guides/sample_app_ug/bbdev_app.rst
  
  Crypto API

+M: Pablo de Lara 
  M: Declan Doherty 
  T: git://dpdk.org/next/dpdk-next-crypto
  F: lib/librte_cryptodev/


Acked-by: Declan Doherty 


[dpdk-dev] [PATCH v1 1/4] net/mlx4: move rdma-core calls to separate file

2018-01-24 Thread Adrien Mazarguil
This lays the groundwork for externalizing rdma-core as an optional
run-time dependency instead of a mandatory one.

No functional change.

Signed-off-by: Adrien Mazarguil 
---
 drivers/net/mlx4/Makefile  |   1 +
 drivers/net/mlx4/mlx4.c|  35 ++---
 drivers/net/mlx4/mlx4_ethdev.c |   3 +-
 drivers/net/mlx4/mlx4_flow.c   |  32 +++--
 drivers/net/mlx4/mlx4_glue.c   | 275 
 drivers/net/mlx4/mlx4_glue.h   |  80 +++
 drivers/net/mlx4/mlx4_intr.c   |  10 +-
 drivers/net/mlx4/mlx4_mr.c |   7 +-
 drivers/net/mlx4/mlx4_rxq.c|  53 +++
 drivers/net/mlx4/mlx4_txq.c|  17 +--
 10 files changed, 440 insertions(+), 73 deletions(-)

diff --git a/drivers/net/mlx4/Makefile b/drivers/net/mlx4/Makefile
index 1f95e0df9..1d33c38ed 100644
--- a/drivers/net/mlx4/Makefile
+++ b/drivers/net/mlx4/Makefile
@@ -38,6 +38,7 @@ LIB = librte_pmd_mlx4.a
 SRCS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += mlx4.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += mlx4_ethdev.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += mlx4_flow.c
+SRCS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += mlx4_glue.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += mlx4_intr.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += mlx4_mr.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += mlx4_rxq.c
diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
index 2a721e7e2..0fd9a999c 100644
--- a/drivers/net/mlx4/mlx4.c
+++ b/drivers/net/mlx4/mlx4.c
@@ -67,6 +67,7 @@
 #include 
 
 #include "mlx4.h"
+#include "mlx4_glue.h"
 #include "mlx4_flow.h"
 #include "mlx4_rxtx.h"
 #include "mlx4_utils.h"
@@ -218,8 +219,8 @@ mlx4_dev_close(struct rte_eth_dev *dev)
mlx4_tx_queue_release(dev->data->tx_queues[i]);
if (priv->pd != NULL) {
assert(priv->ctx != NULL);
-   claim_zero(ibv_dealloc_pd(priv->pd));
-   claim_zero(ibv_close_device(priv->ctx));
+   claim_zero(mlx4_glue->dealloc_pd(priv->pd));
+   claim_zero(mlx4_glue->close_device(priv->ctx));
} else
assert(priv->ctx == NULL);
mlx4_intr_uninstall(priv);
@@ -436,7 +437,7 @@ mlx4_pci_probe(struct rte_pci_driver *pci_drv, struct 
rte_pci_device *pci_dev)
 
(void)pci_drv;
assert(pci_drv == &mlx4_driver);
-   list = ibv_get_device_list(&i);
+   list = mlx4_glue->get_device_list(&i);
if (list == NULL) {
rte_errno = errno;
assert(rte_errno);
@@ -465,12 +466,12 @@ mlx4_pci_probe(struct rte_pci_driver *pci_drv, struct 
rte_pci_device *pci_dev)
  PCI_DEVICE_ID_MELLANOX_CONNECTX3VF);
INFO("PCI information matches, using device \"%s\" (VF: %s)",
 list[i]->name, (vf ? "true" : "false"));
-   attr_ctx = ibv_open_device(list[i]);
+   attr_ctx = mlx4_glue->open_device(list[i]);
err = errno;
break;
}
if (attr_ctx == NULL) {
-   ibv_free_device_list(list);
+   mlx4_glue->free_device_list(list);
switch (err) {
case 0:
rte_errno = ENODEV;
@@ -487,7 +488,7 @@ mlx4_pci_probe(struct rte_pci_driver *pci_drv, struct 
rte_pci_device *pci_dev)
}
ibv_dev = list[i];
DEBUG("device opened");
-   if (ibv_query_device(attr_ctx, &device_attr)) {
+   if (mlx4_glue->query_device(attr_ctx, &device_attr)) {
rte_errno = ENODEV;
goto error;
}
@@ -502,7 +503,7 @@ mlx4_pci_probe(struct rte_pci_driver *pci_drv, struct 
rte_pci_device *pci_dev)
if (!conf.ports.enabled)
conf.ports.enabled = conf.ports.present;
/* Retrieve extended device attributes. */
-   if (ibv_query_device_ex(attr_ctx, NULL, &device_attr_ex)) {
+   if (mlx4_glue->query_device_ex(attr_ctx, NULL, &device_attr_ex)) {
rte_errno = ENODEV;
goto error;
}
@@ -520,13 +521,13 @@ mlx4_pci_probe(struct rte_pci_driver *pci_drv, struct 
rte_pci_device *pci_dev)
if (!(conf.ports.enabled & (1 << i)))
continue;
DEBUG("using port %u", port);
-   ctx = ibv_open_device(ibv_dev);
+   ctx = mlx4_glue->open_device(ibv_dev);
if (ctx == NULL) {
rte_errno = ENODEV;
goto port_error;
}
/* Check port status. */
-   err = ibv_query_port(ctx, port, &port_attr);
+   err = mlx4_glue->query_port(ctx, port, &port_attr);
if (err) {
rte_errno = err;
ERROR("port query failed: %s", strerror(rte_errno));
@@ -540,7 +541,7 @@ mlx4_pci_probe(struct rte_pci_driver *pci_drv, struct 
rte_pci_device *pci_dev)
}
if (port_attr.state != IBV_PORT_ACTIVE)
DEBUG("port %d is not active: \"%s\" (%d)",
- 

[dpdk-dev] [PATCH v1 0/4] net/mlx: make rdma-core optional at run-time

2018-01-24 Thread Adrien Mazarguil
A problem encountered with Mellanox PMDs and frequently reported by DPDK
application developers and Linux distribution package maintainers is their
dependency on rdma-core components, namely libibverbs, libmlx4, and libmlx5.

For best performance in applications, DPDK is normally built as a collection
of library archives (.a files), whose external dependencies are inherited
through rte.app.mk during link.

When these PMDs are built-in, any binary DPDK package, whether DPDK itself
or derived applications, always have to pull rdma-core. This dependency is
not obvious and may be missed during the packaging of any intermediate layer
between DPDK itself and the end application.

While still required during compilation, this series trades this hard
dependency for an optional one, dynamically loaded at run-time through
dlopen().

It supersedes Shachar's previous work on the same topic [1] using a
different approach in order to preserve symbol versioning and address
the remaining issues.

[1] http://dpdk.org/ml/archives/dev/2017-December/085090.html

Adrien Mazarguil (3):
  net/mlx4: move rdma-core calls to separate file
  net/mlx4: spawn rdma-core dependency plug-in
  net/mlx5: spawn rdma-core dependency plug-in

Nelio Laranjeiro (1):
  net/mlx5: move rdma-core calls to separate file

 drivers/net/mlx4/Makefile|  41 +
 drivers/net/mlx4/mlx4.c  | 115 ++--
 drivers/net/mlx4/mlx4_ethdev.c   |   3 +-
 drivers/net/mlx4/mlx4_flow.c |  32 ++--
 drivers/net/mlx4/mlx4_glue.c | 275 
 drivers/net/mlx4/mlx4_glue.h |  80 +
 drivers/net/mlx4/mlx4_intr.c |  10 +-
 drivers/net/mlx4/mlx4_mr.c   |   7 +-
 drivers/net/mlx4/mlx4_rxq.c  |  53 +++---
 drivers/net/mlx4/mlx4_txq.c  |  17 +-
 drivers/net/mlx5/Makefile|  41 +
 drivers/net/mlx5/mlx5.c  | 127 ++---
 drivers/net/mlx5/mlx5_ethdev.c   |   7 +-
 drivers/net/mlx5/mlx5_flow.c |  71 
 drivers/net/mlx5/mlx5_glue.c | 328 ++
 drivers/net/mlx5/mlx5_glue.h | 100 +++
 drivers/net/mlx5/mlx5_mac.c  |   1 +
 drivers/net/mlx5/mlx5_mr.c   |   7 +-
 drivers/net/mlx5/mlx5_rss.c  |   1 +
 drivers/net/mlx5/mlx5_rxmode.c   |   1 +
 drivers/net/mlx5/mlx5_rxq.c  |  54 +++---
 drivers/net/mlx5/mlx5_rxtx.c |   1 +
 drivers/net/mlx5/mlx5_rxtx.h |   1 +
 drivers/net/mlx5/mlx5_rxtx_vec.c |   1 +
 drivers/net/mlx5/mlx5_txq.c  |  22 +--
 drivers/net/mlx5/mlx5_vlan.c |   2 +-
 mk/rte.app.mk|   2 +-
 27 files changed, 1221 insertions(+), 179 deletions(-)
 create mode 100644 drivers/net/mlx4/mlx4_glue.c
 create mode 100644 drivers/net/mlx4/mlx4_glue.h
 create mode 100644 drivers/net/mlx5/mlx5_glue.c
 create mode 100644 drivers/net/mlx5/mlx5_glue.h

-- 
2.11.0


[dpdk-dev] [PATCH v1 2/4] net/mlx4: spawn rdma-core dependency plug-in

2018-01-24 Thread Adrien Mazarguil
When mlx4 is not compiled directly as an independent shared object (e.g.
CONFIG_RTE_BUILD_SHARED_LIB not enabled for performance reasons), DPDK
applications inherit its dependencies on libibverbs and libmlx4 through
rte.app.mk.

This is an issue both when DPDK is delivered as a binary package (Linux
distributions) and for end users because rdma-core then propagates as a
mandatory dependency for everything.

Application writers relying on binary DPDK packages are not necessarily
aware of this fact and may end up delivering packages with broken
dependencies.

This patch therefore introduces an intermediate internal plug-in
hard-linked with rdma-core (to preserve symbol versioning) loaded by the
PMD through dlopen(), so that a missing rdma-core does not cause unresolved
symbols, allowing applications to start normally.

Signed-off-by: Adrien Mazarguil 
---
 drivers/net/mlx4/Makefile | 40 +
 drivers/net/mlx4/mlx4.c   | 80 ++
 mk/rte.app.mk |  2 +-
 3 files changed, 121 insertions(+), 1 deletion(-)

diff --git a/drivers/net/mlx4/Makefile b/drivers/net/mlx4/Makefile
index 1d33c38ed..2ebe61e90 100644
--- a/drivers/net/mlx4/Makefile
+++ b/drivers/net/mlx4/Makefile
@@ -38,7 +38,11 @@ LIB = librte_pmd_mlx4.a
 SRCS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += mlx4.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += mlx4_ethdev.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += mlx4_flow.c
+ifneq ($(CONFIG_RTE_BUILD_SHARED_LIB),y)
+SRCS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += mlx4_glue_lib.c
+else
 SRCS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += mlx4_glue.c
+endif
 SRCS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += mlx4_intr.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += mlx4_mr.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += mlx4_rxq.c
@@ -55,7 +59,12 @@ CFLAGS += -D_BSD_SOURCE
 CFLAGS += -D_DEFAULT_SOURCE
 CFLAGS += -D_XOPEN_SOURCE=600
 CFLAGS += $(WERROR_FLAGS)
+ifneq ($(CONFIG_RTE_BUILD_SHARED_LIB),y)
+CFLAGS_mlx4_glue.o += -fPIC
+LDLIBS += -ldl
+else
 LDLIBS += -libverbs -lmlx4
+endif
 LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring
 LDLIBS += -lrte_ethdev -lrte_net -lrte_kvargs
 LDLIBS += -lrte_bus_pci
@@ -109,7 +118,38 @@ mlx4_autoconf.h: mlx4_autoconf.h.new
 
 $(SRCS-$(CONFIG_RTE_LIBRTE_MLX4_PMD):.c=.o): mlx4_autoconf.h
 
+# Generate dependency plug-in for rdma-core when the PMD cannot be linked
+# directly, so that applications do not inherit this dependency.
+
+ifneq ($(CONFIG_RTE_BUILD_SHARED_LIB),y)
+
+mlx4_glue_lib.c: mlx4_glue_lib.so
+   $Q printf '#include \n' > $@
+   $Q printf '#include \n\n' >> $@
+   $Q printf 'const uint8_t mlx4_glue_lib[][16] = {\n' >> $@
+   $Q od -vt x1 $< | \
+   sed -ne '/^[[:xdigit:]]\{1,\}/{' \
+   -e 's///;' \
+   -e '/^$$/d; ' \
+   -e 's/[[:space:]]*$$//;' \
+   -e 's/[[:space:]]\{1,\}/\\x/g;' \
+   -e 's/^/"/;' \
+   -e 's/$$/",/;' \
+   -e 'p;' \
+   -e '}' >> $@
+   $Q printf '};\n\n' >> $@
+   $Q printf 'const size_t mlx4_glue_lib_size = %u;' $$(wc -c < $<) >> $@
+
+mlx4_glue_lib.so: mlx4_glue.o
+   $Q $(LD) $(LDFLAGS) $(EXTRA_LDFLAGS) \
+   -s -shared -o $@ $< -libverbs -lmlx4
+
+mlx4_glue.o: mlx4_autoconf.h
+
+endif
+
 clean_mlx4: FORCE
$Q rm -f -- mlx4_autoconf.h mlx4_autoconf.h.new
+   $Q rm -f -- mlx4_glue.o mlx4_glue_lib.*
 
 clean: clean_mlx4
diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
index 0fd9a999c..c173fbf56 100644
--- a/drivers/net/mlx4/mlx4.c
+++ b/drivers/net/mlx4/mlx4.c
@@ -37,6 +37,7 @@
  */
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -44,6 +45,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /* Verbs headers do not support -pedantic. */
 #ifdef PEDANTIC
@@ -55,6 +57,7 @@
 #endif
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -724,6 +727,78 @@ static struct rte_pci_driver mlx4_driver = {
 RTE_PCI_DRV_INTR_RMV,
 };
 
+#ifndef RTE_BUILD_SHARED_LIB
+
+extern const uint8_t mlx4_glue_lib[][16];
+extern const size_t mlx4_glue_lib_size;
+
+/**
+ * Initialization routine for run-time dependency on rdma-core.
+ */
+static int
+mlx4_glue_init(void)
+{
+   char file[] = "/tmp/" MLX4_DRIVER_NAME "_XX";
+   int fd = mkstemp(file);
+   size_t off = 0;
+   void *handle = NULL;
+   void **sym;
+   const char *dlmsg;
+
+   if (fd == -1) {
+   rte_errno = errno;
+   goto glue_error;
+   }
+   while (off != mlx4_glue_lib_size) {
+   ssize_t ret;
+
+   ret = write(fd, (const uint8_t *)mlx4_glue_lib + off,
+   mlx4_glue_lib_size - off);
+   if (ret == -1) {
+   if (errno != EINTR) {
+   rte_errno = errno;
+   goto glue_error;
+   }
+   ret = 0;
+   }
+   off += ret;
+   }
+  

[dpdk-dev] [PATCH v1 3/4] net/mlx5: move rdma-core calls to separate file

2018-01-24 Thread Adrien Mazarguil
From: Nelio Laranjeiro 

This lays the groundwork for externalizing rdma-core as an optional
run-time dependency instead of a mandatory one.

No functional change.

Signed-off-by: Nelio Laranjeiro 
---
 drivers/net/mlx5/Makefile|   1 +
 drivers/net/mlx5/mlx5.c  |  48 ++---
 drivers/net/mlx5/mlx5_ethdev.c   |   7 +-
 drivers/net/mlx5/mlx5_flow.c |  71 
 drivers/net/mlx5/mlx5_glue.c | 328 ++
 drivers/net/mlx5/mlx5_glue.h | 100 +++
 drivers/net/mlx5/mlx5_mac.c  |   1 +
 drivers/net/mlx5/mlx5_mr.c   |   7 +-
 drivers/net/mlx5/mlx5_rss.c  |   1 +
 drivers/net/mlx5/mlx5_rxmode.c   |   1 +
 drivers/net/mlx5/mlx5_rxq.c  |  54 +++---
 drivers/net/mlx5/mlx5_rxtx.c |   1 +
 drivers/net/mlx5/mlx5_rxtx.h |   1 +
 drivers/net/mlx5/mlx5_rxtx_vec.c |   1 +
 drivers/net/mlx5/mlx5_txq.c  |  22 +--
 drivers/net/mlx5/mlx5_vlan.c |   2 +-
 16 files changed, 541 insertions(+), 105 deletions(-)

diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
index a3984eb9f..bdec30692 100644
--- a/drivers/net/mlx5/Makefile
+++ b/drivers/net/mlx5/Makefile
@@ -53,6 +53,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_rss.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_mr.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_flow.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_socket.c
+SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_glue.c
 
 # Basic CFLAGS.
 CFLAGS += -O3
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index bc4b6bad0..b36cc611f 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -63,6 +63,7 @@
 #include "mlx5_rxtx.h"
 #include "mlx5_autoconf.h"
 #include "mlx5_defs.h"
+#include "mlx5_glue.h"
 
 /* Device parameter to enable RX completion queue compression. */
 #define MLX5_RXQ_CQE_COMP_EN "rxq_cqe_comp_en"
@@ -216,8 +217,8 @@ mlx5_dev_close(struct rte_eth_dev *dev)
}
if (priv->pd != NULL) {
assert(priv->ctx != NULL);
-   claim_zero(ibv_dealloc_pd(priv->pd));
-   claim_zero(ibv_close_device(priv->ctx));
+   claim_zero(mlx5_glue->dealloc_pd(priv->pd));
+   claim_zero(mlx5_glue->close_device(priv->ctx));
} else
assert(priv->ctx == NULL);
if (priv->rss_conf.rss_key != NULL)
@@ -523,7 +524,7 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv, struct 
rte_pci_device *pci_dev)
 
/* Save PCI address. */
mlx5_dev[idx].pci_addr = pci_dev->addr;
-   list = ibv_get_device_list(&i);
+   list = mlx5_glue->get_device_list(&i);
if (list == NULL) {
assert(errno);
if (errno == ENOSYS)
@@ -573,12 +574,12 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv, struct 
rte_pci_device *pci_dev)
 " (SR-IOV: %s)",
 list[i]->name,
 sriov ? "true" : "false");
-   attr_ctx = ibv_open_device(list[i]);
+   attr_ctx = mlx5_glue->open_device(list[i]);
err = errno;
break;
}
if (attr_ctx == NULL) {
-   ibv_free_device_list(list);
+   mlx5_glue->free_device_list(list);
switch (err) {
case 0:
ERROR("cannot access device, is mlx5_ib loaded?");
@@ -597,7 +598,7 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv, struct 
rte_pci_device *pci_dev)
 * Multi-packet send is supported by ConnectX-4 Lx PF as well
 * as all ConnectX-5 devices.
 */
-   mlx5dv_query_device(attr_ctx, &attrs_out);
+   mlx5_glue->dv_query_device(attr_ctx, &attrs_out);
if (attrs_out.flags & MLX5DV_CONTEXT_FLAGS_MPW_ALLOWED) {
if (attrs_out.flags & MLX5DV_CONTEXT_FLAGS_ENHANCED_MPW) {
DEBUG("Enhanced MPW is supported");
@@ -615,7 +616,7 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv, struct 
rte_pci_device *pci_dev)
cqe_comp = 0;
else
cqe_comp = 1;
-   if (ibv_query_device_ex(attr_ctx, NULL, &device_attr))
+   if (mlx5_glue->query_device_ex(attr_ctx, NULL, &device_attr))
goto error;
INFO("%u port(s) detected", device_attr.orig_attr.phys_port_cnt);
 
@@ -650,7 +651,7 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv, struct 
rte_pci_device *pci_dev)
char name[RTE_ETH_NAME_MAX_LEN];
 
snprintf(name, sizeof(name), "%s port %u",
-ibv_get_device_name(ibv_dev), port);
+mlx5_glue->get_device_name(ibv_dev), port);
eth_dev = rte_eth_dev_attach_secondary(name);
if (eth_dev == NULL) {
ERROR("can not attach rte ethdev");
@@ -686,15 +687,15 @@ mlx5_pci_probe(struct rte_pci_driver *pci_drv, struct 
rte_pci_device *pci_dev)
 
DEBUG("using port %u 

[dpdk-dev] [PATCH v1 4/4] net/mlx5: spawn rdma-core dependency plug-in

2018-01-24 Thread Adrien Mazarguil
When mlx5 is not compiled directly as an independent shared object (e.g.
CONFIG_RTE_BUILD_SHARED_LIB not enabled for performance reasons), DPDK
applications inherit its dependencies on libibverbs and libmlx5 through
rte.app.mk.

This is an issue both when DPDK is delivered as a binary package (Linux
distributions) and for end users because rdma-core then propagates as a
mandatory dependency for everything.

Application writers relying on binary DPDK packages are not necessarily
aware of this fact and may end up delivering packages with broken
dependencies.

This patch therefore introduces an intermediate internal plug-in
hard-linked with rdma-core (to preserve symbol versioning) loaded by the
PMD through dlopen(), so that a missing rdma-core does not cause unresolved
symbols, allowing applications to start normally.

Signed-off-by: Adrien Mazarguil 
---
 drivers/net/mlx5/Makefile | 40 +
 drivers/net/mlx5/mlx5.c   | 79 ++
 2 files changed, 119 insertions(+)

diff --git a/drivers/net/mlx5/Makefile b/drivers/net/mlx5/Makefile
index bdec30692..35525f6d0 100644
--- a/drivers/net/mlx5/Makefile
+++ b/drivers/net/mlx5/Makefile
@@ -53,7 +53,11 @@ SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_rss.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_mr.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_flow.c
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_socket.c
+ifneq ($(CONFIG_RTE_BUILD_SHARED_LIB),y)
+SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_glue_lib.c
+else
 SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += mlx5_glue.c
+endif
 
 # Basic CFLAGS.
 CFLAGS += -O3
@@ -65,7 +69,12 @@ CFLAGS += -D_DEFAULT_SOURCE
 CFLAGS += -D_XOPEN_SOURCE=600
 CFLAGS += $(WERROR_FLAGS)
 CFLAGS += -Wno-strict-prototypes
+ifneq ($(CONFIG_RTE_BUILD_SHARED_LIB),y)
+CFLAGS_mlx5_glue.o += -fPIC
+LDLIBS += -ldl
+else
 LDLIBS += -libverbs -lmlx5
+endif
 LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring
 LDLIBS += -lrte_ethdev -lrte_net -lrte_kvargs
 LDLIBS += -lrte_bus_pci
@@ -158,7 +167,38 @@ mlx5_autoconf.h: mlx5_autoconf.h.new
 
 $(SRCS-$(CONFIG_RTE_LIBRTE_MLX5_PMD):.c=.o): mlx5_autoconf.h
 
+# Generate dependency plug-in for rdma-core when the PMD cannot be linked
+# directly, so that applications do not inherit this dependency.
+
+ifneq ($(CONFIG_RTE_BUILD_SHARED_LIB),y)
+
+mlx5_glue_lib.c: mlx5_glue_lib.so
+   $Q printf '#include \n' > $@
+   $Q printf '#include \n\n' >> $@
+   $Q printf 'const uint8_t mlx5_glue_lib[][16] = {\n' >> $@
+   $Q od -vt x1 $< | \
+   sed -ne '/^[[:xdigit:]]\{1,\}/{' \
+   -e 's///;' \
+   -e '/^$$/d; ' \
+   -e 's/[[:space:]]*$$//;' \
+   -e 's/[[:space:]]\{1,\}/\\x/g;' \
+   -e 's/^/"/;' \
+   -e 's/$$/",/;' \
+   -e 'p;' \
+   -e '}' >> $@
+   $Q printf '};\n\n' >> $@
+   $Q printf 'const size_t mlx5_glue_lib_size = %u;' $$(wc -c < $<) >> $@
+
+mlx5_glue_lib.so: mlx5_glue.o
+   $Q $(LD) $(LDFLAGS) $(EXTRA_LDFLAGS) \
+   -s -shared -o $@ $< -libverbs -lmlx5
+
+mlx5_glue.o: mlx5_autoconf.h
+
+endif
+
 clean_mlx5: FORCE
$Q rm -f -- mlx5_autoconf.h mlx5_autoconf.h.new
+   $Q rm -f -- mlx5_glue.o mlx5_glue_lib.*
 
 clean: clean_mlx5
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index b36cc611f..7ef724585 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -35,6 +35,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -56,6 +57,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include "mlx5.h"
@@ -966,6 +968,78 @@ static struct rte_pci_driver mlx5_driver = {
.drv_flags = RTE_PCI_DRV_INTR_LSC | RTE_PCI_DRV_INTR_RMV,
 };
 
+#ifndef RTE_BUILD_SHARED_LIB
+
+extern const uint8_t mlx5_glue_lib[][16];
+extern const size_t mlx5_glue_lib_size;
+
+/**
+ * Initialization routine for run-time dependency on rdma-core.
+ */
+static int
+mlx5_glue_init(void)
+{
+   char file[] = "/tmp/" MLX5_DRIVER_NAME "_XX";
+   int fd = mkstemp(file);
+   size_t off = 0;
+   void *handle = NULL;
+   void **sym;
+   const char *dlmsg;
+
+   if (fd == -1) {
+   rte_errno = errno;
+   goto glue_error;
+   }
+   while (off != mlx5_glue_lib_size) {
+   ssize_t ret;
+
+   ret = write(fd, (const uint8_t *)mlx5_glue_lib + off,
+   mlx5_glue_lib_size - off);
+   if (ret == -1) {
+   if (errno != EINTR) {
+   rte_errno = errno;
+   goto glue_error;
+   }
+   ret = 0;
+   }
+   off += ret;
+   }
+   close(fd);
+   fd = -1;
+   handle = dlopen(file, RTLD_LAZY);
+   unlink(file);
+   if (!handle) {
+   rte_errno = EINVAL;
+   dlmsg = dlerror();
+   if (d

Re: [dpdk-dev] [PATCH v1 1/4] net/mlx4: move rdma-core calls to separate file

2018-01-24 Thread Stephen Hemminger
On Thu, 25 Jan 2018 00:25:00 +0100
Adrien Mazarguil  wrote:

> +const struct mlx4_glue *mlx4_glue = &(const struct mlx4_glue){
> + .fork_init = mlx4_glue_fork_init,
> + .get_async_event = mlx4_glue_get_async_event,

The cast should not be necessary here.


Re: [dpdk-dev] [PATCH] app/testpmd: do not enable Rx offloads by default

2018-01-24 Thread Lu, Wenzhuo
Hi Moti,


> -Original Message-
> From: Moti Haimovsky [mailto:mo...@mellanox.com]
> Sent: Tuesday, January 23, 2018 4:11 PM
> To: Lu, Wenzhuo ; tho...@monjalon.net
> Cc: dev@dpdk.org; Moti Haimovsky 
> Subject: [PATCH] app/testpmd: do not enable Rx offloads by default
> 
> Removed the hardcoded preconfigured Rx offload configuration from
> testpmd.
> Testers who wish to use these offloads will now have to explicitly write them
> in the command-line when running testpmd.
> 
> Motivation:
> Some PMDs such at the mlx4 may not implement all the offloads.
> After the offload API rework assuming no offload is enabled by default,
>   commit ce17eddefc20 ("ethdev: introduce Rx queue offloads API")
>   commit cba7f53b717d ("ethdev: introduce Tx queue offloads API") trying to
> enable a not supported offload is clearly an error which will cause
> configuration failing.
> 
> Considering that testpmd is an application to test the PMD, it should not fail
> on a configuration which was not explicitly requested.
> The behavior of this test application is then turned to an opt-in model.
> 
> Signed-off-by: Moti Haimovsky 
> ---
>  app/test-pmd/testpmd.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
> 
> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c index
> 5dc8cca..a082352 100644
> --- a/app/test-pmd/testpmd.c
> +++ b/app/test-pmd/testpmd.c
> @@ -305,9 +305,7 @@ struct fwd_engine * fwd_engines[] = {
>   */
>  struct rte_eth_rxmode rx_mode = {
>   .max_rx_pkt_len = ETHER_MAX_LEN, /**< Default maximum frame
> length. */
> - .offloads = (DEV_RX_OFFLOAD_VLAN_FILTER |
> -  DEV_RX_OFFLOAD_VLAN_STRIP |
> -  DEV_RX_OFFLOAD_CRC_STRIP),
> + .offloads = 0,
Change the default behavior may trigger other problems. I think TX offload 
could be a good reference. Get the capability and check what's supported first, 
then ignore the not supported functions with printing a warning but not block 
anything...

>   .ignore_offload_bitfield = 1,
>  };
> 
> --
> 1.8.3.1



Re: [dpdk-dev] [PATCH v5 7/7] app/testpmd: adjust ethdev port ownership

2018-01-24 Thread Lu, Wenzhuo
Hi Matan,


> -Original Message-
> From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Matan Azrad
> Sent: Tuesday, January 23, 2018 12:38 AM
> To: Thomas Monjalon ; Gaetan Rivet
> ; Wu, Jingjing 
> Cc: dev@dpdk.org; Neil Horman ; Richardson,
> Bruce ; Ananyev, Konstantin
> 
> Subject: [dpdk-dev] [PATCH v5 7/7] app/testpmd: adjust ethdev port
> ownership
> 
> Testpmd should not use ethdev ports which are managed by other DPDK
> entities.
> 
> Set Testpmd ownership to each port which is not used by other entity and
> prevent any usage of ethdev ports which are not owned by Testpmd.
> 
> Signed-off-by: Matan Azrad 
> ---
>  app/test-pmd/cmdline.c  | 89 
> +++--
>  app/test-pmd/cmdline_flow.c |  2 +-
>  app/test-pmd/config.c   | 37 ++-
>  app/test-pmd/parameters.c   |  4 +-
>  app/test-pmd/testpmd.c  | 63 
>  app/test-pmd/testpmd.h  |  3 ++
>  6 files changed, 103 insertions(+), 95 deletions(-)
> 
> diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c index
> 9f12c0f..36dba6c 100644
> --- a/app/test-pmd/cmdline.c
> +++ b/app/test-pmd/cmdline.c
> @@ -1394,7 +1394,7 @@ struct cmd_config_speed_all {
>   &link_speed) < 0)
>   return;
> 
> - RTE_ETH_FOREACH_DEV(pid) {
> + RTE_ETH_FOREACH_DEV_OWNED_BY(pid, my_owner.id) {
I see my_owner is a global variable, so, don't know why we need the parameter 
'my_owner.id' here. I think we can still use RTE_ETH_FOREACH_DEV and check 
'my_owner' in it. If there's some reason and you don't want change 
RTE_ETH_FOREACH_DEV, I think ' RTE_ETH_FOREACH_DEV_OWNED(pid) {' is better.


  1   2   >