[dpdk-dev] [PATCH 7/8] drivers/net/ixgbe: Signed left shift operator

2016-03-18 Thread Zhang, Helin


> -Original Message-
> From: Aaron Conole [mailto:aconole at redhat.com]
> Sent: Friday, February 26, 2016 2:49 AM
> To: dev at dpdk.org
> Cc: Lu, Wenzhuo ; Zhang, Helin
> ; Ananyev, Konstantin
> ; Richardson, Bruce
> 
> Subject: [PATCH 7/8] drivers/net/ixgbe: Signed left shift operator
> 
> Tell the compiler to use an unsigned constant for the config shifts.
> 
> Signed-off-by: Aaron Conole 
Acked-by: Helin Zhang 

> ---
>  drivers/net/ixgbe/ixgbe_pf.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ixgbe/ixgbe_pf.c b/drivers/net/ixgbe/ixgbe_pf.c index
> 2ffbd1f..8b5119f 100644
> --- a/drivers/net/ixgbe/ixgbe_pf.c
> +++ b/drivers/net/ixgbe/ixgbe_pf.c
> @@ -236,9 +236,9 @@ int ixgbe_pf_host_configure(struct rte_eth_dev
> *eth_dev)
>   vfre_slot = (vf_num >> VFRE_SHIFT) > 0 ? 1 : 0;
> 
>   /* Enable pools reserved to PF only */
> - IXGBE_WRITE_REG(hw, IXGBE_VFRE(vfre_slot), (~0) << vfre_offset);
> + IXGBE_WRITE_REG(hw, IXGBE_VFRE(vfre_slot), (~0U) << vfre_offset);
>   IXGBE_WRITE_REG(hw, IXGBE_VFRE(vfre_slot ^ 1), vfre_slot - 1);
> - IXGBE_WRITE_REG(hw, IXGBE_VFTE(vfre_slot), (~0) << vfre_offset);
> + IXGBE_WRITE_REG(hw, IXGBE_VFTE(vfre_slot), (~0U) << vfre_offset);
>   IXGBE_WRITE_REG(hw, IXGBE_VFTE(vfre_slot ^ 1), vfre_slot - 1);
> 
>   /* PFDMA Tx General Switch Control Enables VMDQ loopback */
> --
> 2.5.0



[dpdk-dev] [PATCH 5/8] drivers/net/ixgbe: Fix vlan filter missing brackets

2016-03-18 Thread Zhang, Helin


> -Original Message-
> From: Aaron Conole [mailto:aconole at redhat.com]
> Sent: Friday, February 26, 2016 2:49 AM
> To: dev at dpdk.org
> Cc: Lu, Wenzhuo ; Zhang, Helin
> ; Ananyev, Konstantin
> ; Richardson, Bruce
> 
> Subject: [PATCH 5/8] drivers/net/ixgbe: Fix vlan filter missing brackets
> 
> The ixgbe vlan filter code has an if check with an incorrect whitespace.
> 
> Signed-off-by: Aaron Conole 
Acked-by: Helin Zhang 

> ---
>  drivers/net/ixgbe/ixgbe_ethdev.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c
> b/drivers/net/ixgbe/ixgbe_ethdev.c
> index 3e6fe86..2e1c3ad 100644
> --- a/drivers/net/ixgbe/ixgbe_ethdev.c
> +++ b/drivers/net/ixgbe/ixgbe_ethdev.c
> @@ -4258,10 +4258,11 @@ ixgbe_set_pool_vlan_filter(struct rte_eth_dev *dev,
> uint16_t vlan,
>   if (ixgbe_vmdq_mode_check(hw) < 0)
>   return -ENOTSUP;
>   for (pool_idx = 0; pool_idx < ETH_64_POOLS; pool_idx++) {
> - if (pool_mask & ((uint64_t)(1ULL << pool_idx)))
> + if (pool_mask & ((uint64_t)(1ULL << pool_idx))) {
>   ret = hw->mac.ops.set_vfta(hw,vlan,pool_idx,vlan_on);
>   if (ret < 0)
>   return ret;
> + }
>   }
> 
>   return ret;
> --
> 2.5.0



[dpdk-dev] ixgbe TX function selection

2016-03-18 Thread Lu, Wenzhuo
Hi Zoltan,


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Zoltan Kiss
> Sent: Friday, March 18, 2016 1:11 AM
> To: Wu, Jingjing; dev at dpdk.org
> Subject: Re: [dpdk-dev] ixgbe TX function selection
> 
> 
> 
> On 10/03/16 07:51, Wu, Jingjing wrote:
> > Hi, Zoltan
> >
> >> -Original Message-
> >> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Zoltan Kiss
> >> Sent: Wednesday, March 2, 2016 3:19 AM
> >> To: dev at dpdk.org
> >> Subject: [dpdk-dev] ixgbe TX function selection
> >>
> >> Hi,
> >>
> >> I've noticed that ixgbe_set_tx_function() selects the non-SG function
> >> even if (dev->data->scattered_rx == 1). That seems a bit dangerous,
> >> as you can turn that on inadvertently when you don't set
> >> max_rx_pkt_len and buffer size in certain ways. I've learnt it in the
> >> hard way, as my segmented packets were leaking memory on the TX path,
> >> which doesn't cries if you send out segmented packets.
> >> How should this case be treated? Assert on the non-SG TX side for the
> >> 'next' pointer? Or turning on SG if RX has it? It doesn't seem to be
> >> a solid way as other interfaces still can have SG turned on.
> >>
> >
> > If you look into the ixgbe_set_tx_function, you will find tx function
> > selection is decided by the tx_flags on queue configure, which is
> > passed by rte_eth_txconf. So even you set dev->data->scattered_rx to
> > 1, if the tx_flags is ETH_TXQ_FLAGS_NOMULTSEGS, ixgbe_xmit_pkts_simple
> > is still selected as tx function. So, you'd better to set tx_flags=0, and 
> > have a try.
> 
> You mean getting default_txconf from rte_eth_dev_info_get() and explicitly 
> turn
> ETH_TXQ_FLAGS_NOMULTSEGS to 0? (filling tx_flags with zeros doesn't work
> very well) That's a way to solve it for me, but I'm rather talking about using
> defaults which doesn't cause memory leak quite easily.
Yes, ETH_TXQ_FLAGS_NOMULTSEGS only can be set to 1 when you know all your 
packets will not be segmented.
I think that means normally we should use full function path for TX, for we 
have no knowledge about if the packets will be segmented or not.
You don't need to set tx_flags to 0, only the ETH_TXQ_FLAGS_NOMULTSEGS bit 
should be 0, the other bits can be 1 if needed.

> 
> >
> >> Regards,
> >>
> >> Zoltan


[dpdk-dev] [PATCH 8/8] drivers/net/ixgbe: Fix uninitialized warning

2016-03-18 Thread Zhang, Helin


> -Original Message-
> From: Aaron Conole [mailto:aconole at redhat.com]
> Sent: Friday, February 26, 2016 2:49 AM
> To: dev at dpdk.org
> Cc: Lu, Wenzhuo ; Zhang, Helin
> ; Ananyev, Konstantin
> ; Richardson, Bruce
> 
> Subject: [PATCH 8/8] drivers/net/ixgbe: Fix uninitialized warning
> 
> Silence a compiler warning that this variable may be used uninitialized.
> 
> Signed-off-by: Aaron Conole 
Acked-by: Helin Zhang 

> ---
>  drivers/net/ixgbe/ixgbe_rxtx.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
> index e95e6b7..775edc7 100644
> --- a/drivers/net/ixgbe/ixgbe_rxtx.c
> +++ b/drivers/net/ixgbe/ixgbe_rxtx.c
> @@ -1563,7 +1563,7 @@ ixgbe_recv_pkts_lro(void *rx_queue, struct rte_mbuf
> **rx_pkts, uint16_t nb_pkts,
>   struct ixgbe_rx_entry *rxe;
>   struct ixgbe_scattered_rx_entry *sc_entry;
>   struct ixgbe_scattered_rx_entry *next_sc_entry;
> - struct ixgbe_rx_entry *next_rxe;
> + struct ixgbe_rx_entry *next_rxe = NULL;
>   struct rte_mbuf *first_seg;
>   struct rte_mbuf *rxm;
>   struct rte_mbuf *nmb;
> @@ -1740,7 +1740,7 @@ next_desc:
>* the pointer to the first mbuf at the NEXTP entry in the
>* sw_sc_ring and continue to parse the RX ring.
>*/
> - if (!eop) {
> + if (!eop && next_rxe) {
>   rxm->next = next_rxe->mbuf;
>   next_sc_entry->fbuf = first_seg;
>   goto next_desc;
> --
> 2.5.0



[dpdk-dev] [PATCH 6/8] drivers/net/e1000/igb: Signed left shift operator

2016-03-18 Thread Zhang, Helin


> -Original Message-
> From: Aaron Conole [mailto:aconole at redhat.com]
> Sent: Friday, February 26, 2016 2:49 AM
> To: dev at dpdk.org
> Cc: Lu, Wenzhuo ; Zhang, Helin
> ; Ananyev, Konstantin
> ; Richardson, Bruce
> 
> Subject: [PATCH 6/8] drivers/net/e1000/igb: Signed left shift operator
> 
> Tell the compiler to use an unsigned constant for the config shifts.
> 
> Signed-off-by: Aaron Conole 
Acked-by: Helin Zhang 
> ---
>  drivers/net/e1000/igb_pf.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/e1000/igb_pf.c b/drivers/net/e1000/igb_pf.c index
> 1d00dda..afe80f5 100644
> --- a/drivers/net/e1000/igb_pf.c
> +++ b/drivers/net/e1000/igb_pf.c
> @@ -172,8 +172,8 @@ int igb_pf_host_configure(struct rte_eth_dev *eth_dev)
>   E1000_WRITE_REG(hw, E1000_VT_CTL, vtctl);
> 
>   /* Enable pools reserved to PF only */
> - E1000_WRITE_REG(hw, E1000_VFRE, (~0) << vf_num);
> - E1000_WRITE_REG(hw, E1000_VFTE, (~0) << vf_num);
> + E1000_WRITE_REG(hw, E1000_VFRE, (~0U) << vf_num);
> + E1000_WRITE_REG(hw, E1000_VFTE, (~0U) << vf_num);
> 
>   /* PFDMA Tx General Switch Control Enables VMDQ loopback */
>   if (hw->mac.type == e1000_i350)
> --
> 2.5.0



[dpdk-dev] [PATCH 4/8] drivers/net/e1000: Fix missing lsc interrupt check brackets

2016-03-18 Thread Zhang, Helin


> -Original Message-
> From: Aaron Conole [mailto:aconole at redhat.com]
> Sent: Friday, February 26, 2016 2:49 AM
> To: dev at dpdk.org
> Cc: Lu, Wenzhuo ; Zhang, Helin
> ; Ananyev, Konstantin
> ; Richardson, Bruce
> 
> Subject: [PATCH 4/8] drivers/net/e1000: Fix missing lsc interrupt check 
> brackets
> 
> The device lsc interupt check has a misleading whitespace around it which can 
> be
> improved by adding appropriate braces to the check. Since the ret variable was
> checked after previous assignment, this introduces no functional change.

It seems a "Fixes: " line is required.
> 
> Signed-off-by: Aaron Conole 
> ---
>  drivers/net/e1000/em_ethdev.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/e1000/em_ethdev.c b/drivers/net/e1000/em_ethdev.c
> index 4a843fe..1d86091 100644
> --- a/drivers/net/e1000/em_ethdev.c
> +++ b/drivers/net/e1000/em_ethdev.c
> @@ -637,13 +637,14 @@ eth_em_start(struct rte_eth_dev *dev)
> 
>   if (rte_intr_allow_others(intr_handle)) {
>   /* check if lsc interrupt is enabled */
> - if (dev->data->dev_conf.intr_conf.lsc != 0)
> + if (dev->data->dev_conf.intr_conf.lsc != 0) {
>   ret = eth_em_interrupt_setup(dev);
>   if (ret) {
>   PMD_INIT_LOG(ERR, "Unable to setup interrupts");
>   em_dev_clear_queues(dev);
>   return ret;
>   }
> + }
>   } else {
>   rte_intr_callback_unregister(intr_handle,
>   eth_em_interrupt_handler,
> --
> 2.5.0



[dpdk-dev] [PATCH 3/8] drivers/net/e1000: Fix missing brackets

2016-03-18 Thread Zhang, Helin


> -Original Message-
> From: Aaron Conole [mailto:aconole at redhat.com]
> Sent: Friday, February 26, 2016 2:49 AM
> To: dev at dpdk.org
> Cc: Lu, Wenzhuo ; Zhang, Helin
> ; Ananyev, Konstantin
> ; Richardson, Bruce
> 
> Subject: [PATCH 3/8] drivers/net/e1000: Fix missing brackets
> 
> The register read/write mphy functions have misleading whitespace around the
> locked check. This cleanup merely preserves the existing functionality while
> improving the ready check.
> 
> Signed-off-by: Aaron Conole 
> ---
>  drivers/net/e1000/base/e1000_phy.c | 12 ++--
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/net/e1000/base/e1000_phy.c
> b/drivers/net/e1000/base/e1000_phy.c
> index d43b7ce..8642d38 100644
> --- a/drivers/net/e1000/base/e1000_phy.c
> +++ b/drivers/net/e1000/base/e1000_phy.c
> @@ -4153,13 +4153,13 @@ s32 e1000_read_phy_reg_mphy(struct e1000_hw
> *hw, u32 address, u32 *data)
>   *data = E1000_READ_REG(hw, E1000_MPHY_DATA);
> 
>   /* Disable access to mPHY if it was originally disabled */
> - if (locked)
> + if (locked) {
>   ready = e1000_is_mphy_ready(hw);
>   if (!ready)
>   return -E1000_ERR_PHY;
> - E1000_WRITE_REG(hw, E1000_MPHY_ADDR_CTRL,
> - E1000_MPHY_DIS_ACCESS);
> + }
> 
> + E1000_WRITE_REG(hw, E1000_MPHY_ADDR_CTRL,
> E1000_MPHY_DIS_ACCESS);
I think the above line should be included in brackets above.
Though we should avoid modifying any code in base driver, I'd
suggest you to fix the issue here. Thank you!

>   return E1000_SUCCESS;
>  }
> 
> @@ -4218,13 +4218,13 @@ s32 e1000_write_phy_reg_mphy(struct e1000_hw
> *hw, u32 address, u32 data,
>   E1000_WRITE_REG(hw, E1000_MPHY_DATA, data);
> 
>   /* Disable access to mPHY if it was originally disabled */
> - if (locked)
> + if (locked) {
>   ready = e1000_is_mphy_ready(hw);
>   if (!ready)
>   return -E1000_ERR_PHY;
> - E1000_WRITE_REG(hw, E1000_MPHY_ADDR_CTRL,
> - E1000_MPHY_DIS_ACCESS);
> + }
> 
> + E1000_WRITE_REG(hw, E1000_MPHY_ADDR_CTRL,
> E1000_MPHY_DIS_ACCESS);
>   return E1000_SUCCESS;
The same comments as above. Thanks!

>  }
> 
> --
> 2.5.0



[dpdk-dev] [PATCH 2/8] app/test/test: Fix missing brackets

2016-03-18 Thread Zhang, Helin


> -Original Message-
> From: Aaron Conole [mailto:aconole at redhat.com]
> Sent: Friday, February 26, 2016 2:49 AM
> To: dev at dpdk.org
> Cc: Lu, Wenzhuo ; Zhang, Helin
> ; Ananyev, Konstantin
> ; Richardson, Bruce
> 
> Subject: [PATCH 2/8] app/test/test: Fix missing brackets
> 
> The test application calls printf(...) with the suite->suite_name argument.
> The intent (based on whitespace) in the printf is to check suite->suite_name 
> first
> and then apply the printf. This doesn't happen due to missing brackets.
> 
> Signed-off-by: Aaron Conole 
Acked-by: Helin Zhang 
> ---
>  app/test/test.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/app/test/test.c b/app/test/test.c index f35b304..ccad0e3 100644
> --- a/app/test/test.c
> +++ b/app/test/test.c
> @@ -162,9 +162,10 @@ unit_test_suite_runner(struct unit_test_suite *suite)
>   int test_success;
>   unsigned total = 0, executed = 0, skipped = 0, succeeded = 0, failed = 
> 0;
> 
> - if (suite->suite_name)
> + if (suite->suite_name) {
>   printf(" + 
> --- +\n");
>   printf(" + Test Suite : %s\n", suite->suite_name);
> + }
> 
>   if (suite->setup)
>   if (suite->setup() != 0)
> --
> 2.5.0



[dpdk-dev] QEDE pmd v2 patches

2016-03-18 Thread Rasesh Mody
Hi Thomas, Bruce,

We didn't see automated build email for all the patches that we submitted to 
dpdk.org. Is this expected?

One of our patch, "[PATCH v2 04/10] qede: Add base driver", doesn't show up on 
dpdk patchworks site http://dpdk.org/dev/patchwork/project/dpdk/list. Is there 
any action needed from our side?

We received following messages from dpdk build nonfictions. We did apply the 
patch-set to the latest dpdk tree then and it was applied cleanly. Please let 
us know if we need to re-submit the patch-set to take care of it. 

--
Test-Label: Intel Niantic on Fedora
Test-Status: apply patch error

Patchwork ID: 11395-11402
http://www.dpdk.org/dev/patchwork/patch/11402/
Submitter: Rasesh Mody 
Date: Thu, 10 Mar 2016 05:45:39 -0800
DPDK git baseline: 94b0ad8e0aa556230183f4c4d06b68bfd145dce3

error: patch failed: MAINTAINERS:352
error: MAINTAINERS: patch does not apply

DPDK STV team
--

Thanks,
-Rasesh



[dpdk-dev] [PATCH] kni: set kni mac on ioctl_create

2016-03-18 Thread Zhang, Helin
Hi Sergey

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Sergey Balabanov
> Sent: Friday, August 28, 2015 9:06 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH] kni: set kni mac on ioctl_create
> 
> There is a situation when ioctl returns zero mac address (00:00:00:00:00:00)
> for just created kni. The situation happens because kni mac is set on 
> 'ipconfig
> up' event (kni_net_open callback) not on kni creation (kni_ioctl_create).
Could you help to clarify a bit of the real issue? What's wrong there?

> 
> Signed-off-by: Sergey Balabanov 
> ---
>  lib/librte_eal/linuxapp/kni/kni_misc.c | 10 ++
> lib/librte_eal/linuxapp/kni/kni_net.c  |  9 -
>  2 files changed, 10 insertions(+), 9 deletions(-)
> 
> diff --git a/lib/librte_eal/linuxapp/kni/kni_misc.c
> b/lib/librte_eal/linuxapp/kni/kni_misc.c
> index 2e9fa89..61f83a0 100644
> --- a/lib/librte_eal/linuxapp/kni/kni_misc.c
> +++ b/lib/librte_eal/linuxapp/kni/kni_misc.c
> @@ -28,6 +28,7 @@
>  #include 
>  #include 
>  #include 
> +#include  /* eth_type_trans */
> 
>  #include 
>  #include "kni_dev.h"
> @@ -465,6 +466,15 @@ kni_ioctl_create(unsigned int ioctl_num, unsigned
> long ioctl_param)
>   if (pci)
>   pci_dev_put(pci);
> 
> + if (kni->lad_dev)
> + memcpy(net_dev->dev_addr, kni->lad_dev->dev_addr,
> ETH_ALEN);
> + else
> + /*
> +  * Generate random mac address. eth_random_addr() is the
> newer
> +  * version of generating mac address in linux kernel.
> +  */
> + random_ether_addr(net_dev->dev_addr);
> +
A rebase is needed, as a lot of changes after that. Thanks!

Helin
>   ret = register_netdev(net_dev);
>   if (ret) {
>   KNI_ERR("error %i registering device \"%s\"\n", diff --git
> a/lib/librte_eal/linuxapp/kni/kni_net.c 
> b/lib/librte_eal/linuxapp/kni/kni_net.c
> index ab5add4..b50b4cf 100644
> --- a/lib/librte_eal/linuxapp/kni/kni_net.c
> +++ b/lib/librte_eal/linuxapp/kni/kni_net.c
> @@ -70,15 +70,6 @@ kni_net_open(struct net_device *dev)
>   struct rte_kni_request req;
>   struct kni_dev *kni = netdev_priv(dev);
> 
> - if (kni->lad_dev)
> - memcpy(dev->dev_addr, kni->lad_dev->dev_addr,
> ETH_ALEN);
> - else
> - /*
> -  * Generate random mac address. eth_random_addr() is the
> newer
> -  * version of generating mac address in linux kernel.
> -  */
> - random_ether_addr(dev->dev_addr);
> -
>   netif_start_queue(dev);
> 
>   memset(&req, 0, sizeof(req));
> --
> 2.1.4



[dpdk-dev] [PATCH] ixgbe: fix reta query and update on x550

2016-03-18 Thread Wang Xiao W
For x550 device, the reta table has 512 entries, but in function
ixgbe_dev_rss_reta_query and ixgbe_dev_rss_reta_update we use an
"uint8_t i" to traverse the entries, this will lead the function
to an endless loop.

This patch changes the data type from uint8_t to uint16_t to fix
the issue.

Fixes: 4bee94a6c22f ("ixgbe: support 512 RSS entries on x550")

Signed-off-by: Wang Xiao W 
---
 drivers/net/ixgbe/ixgbe_ethdev.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index afe6582..46c5d4d 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -3640,11 +3640,11 @@ ixgbe_dev_rss_reta_update(struct rte_eth_dev *dev,
  struct rte_eth_rss_reta_entry64 *reta_conf,
  uint16_t reta_size)
 {
-   uint8_t i, j, mask;
+   uint16_t i, sp_reta_size;
+   uint8_t j, mask;
uint32_t reta, r;
uint16_t idx, shift;
struct ixgbe_hw *hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
-   uint16_t sp_reta_size;
uint32_t reta_reg;

PMD_INIT_FUNC_TRACE();
@@ -3694,11 +3694,11 @@ ixgbe_dev_rss_reta_query(struct rte_eth_dev *dev,
 struct rte_eth_rss_reta_entry64 *reta_conf,
 uint16_t reta_size)
 {
-   uint8_t i, j, mask;
+   uint16_t i, sp_reta_size;
+   uint8_t j, mask;
uint32_t reta;
uint16_t idx, shift;
struct ixgbe_hw *hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
-   uint16_t sp_reta_size;
uint32_t reta_reg;

PMD_INIT_FUNC_TRACE();
-- 
1.9.3



[dpdk-dev] Fwd: EAL: map_all_hugepages(): mmap failed: Cannot allocate memory

2016-03-18 Thread Tan, Jianfeng


On 3/18/2016 6:41 AM, John Wei wrote:
> I am setting up OVS inside a Linux container. This OVS is built using DPDK
> library.
> During the startup of ovs-vswitchd, it core dumped due to fail to mmap.
>in eal_memory.c
> virtaddr = mmap(vma_addr, hugepage_sz, PROT_READ | PROT_WRITE,
>  MAP_SHARED, fd, 0);
>
> This call is made inside a for loop that loops through all the pages and
> mmap them.
> My server has two cores, and I allocated 8192 2MB pages.
> The mmap for the first 4096 pages were successful. It failed when trying to
> map 4096th page.
>
> Can someone help me understand when the mmap for the first 4096 pages were
> successful and it failed on 4096th page?

In my limited experience, there are some scenario that may lead to such 
failure: a. specified an option size when do mount hugetlbfs; b. cgroup 
limitation, /sys/fs/cgroup/hugetlb//hugetlb.2MB.limit_in_bytes; c. open files by ulimit...

Workaround: as only "--socket-mem 128,128" is needed, you can reduce the 
total number of 2M hugepages from 8192 to 512 (or something else).
In addition: this is a case why I sent a patchset: 
http://dpdk.org/dev/patchwork/patch/11194/

Thanks,
Jianfeng

>
>
> John
>
>
>
> ovs-vswitchd --dpdk -c 0x1 -n 4 -l 1 --file-prefix ct- --socket-mem
> 128,128 -- unix:$DB_SOCK --pidfile --detach --log-file=ct.log
>
>
> EAL: Detected lcore 23 as core 5 on socket 1
> EAL: Support maximum 128 logical core(s) by configuration.
> EAL: Detected 24 lcore(s)
> EAL: No free hugepages reported in hugepages-1048576kB
> EAL: VFIO modules not all loaded, skip VFIO support...
> EAL: Setting up physically contiguous memory...
> EAL: map_all_hugepages(): mmap failed: Cannot allocate memory
> EAL: Failed to mmap 2 MB hugepages
> PANIC in rte_eal_init():
> Cannot init memory
> 7: [ovs-vswitchd() [0x411f15]]
> 6: [/lib64/libc.so.6(__libc_start_main+0xf5) [0x7ff5f6133b15]]
> 5: [ovs-vswitchd() [0x4106f9]]
> 4: [ovs-vswitchd() [0x66917d]]
> 3: [ovs-vswitchd() [0x42b6f5]]
> 2: [ovs-vswitchd() [0x40dd8c]]
> 1: [ovs-vswitchd() [0x56b3ba]]
> Aborted (core dumped)



[dpdk-dev] [PATCH] ixgbe: fix reta query and update on x550

2016-03-18 Thread Lu, Wenzhuo
Hi,


> -Original Message-
> From: Wang, Xiao W
> Sent: Friday, March 18, 2016 10:28 AM
> To: Zhang, Helin
> Cc: dev at dpdk.org; Lu, Wenzhuo; Wang, Xiao W
> Subject: [PATCH] ixgbe: fix reta query and update on x550
> 
> For x550 device, the reta table has 512 entries, but in function
> ixgbe_dev_rss_reta_query and ixgbe_dev_rss_reta_update we use an "uint8_t i"
> to traverse the entries, this will lead the function to an endless loop.
> 
> This patch changes the data type from uint8_t to uint16_t to fix the issue.
> 
> Fixes: 4bee94a6c22f ("ixgbe: support 512 RSS entries on x550")
> 
> Signed-off-by: Wang Xiao W 
Acked-by: Wenzhuo Lu 



[dpdk-dev] L2 Forwarding Sample Applications

2016-03-18 Thread Vivek Gupta
Please let me know the purpose of writing these sample applications in DPDK-

* L2fwd-keepalive/main.c

* L2fwd-jobstats/main.c

* L2fwd-ivshmem/guest/guest.c

* L2fwd-ivshmem/host/host.c

Thanks & Regards
Vivek Gupta


::DISCLAIMER::


The contents of this e-mail and any attachment(s) are confidential and intended 
for the named recipient(s) only.
E-mail transmission is not guaranteed to be secure or error-free as information 
could be intercepted, corrupted,
lost, destroyed, arrive late or incomplete, or may contain viruses in 
transmission. The e mail and its contents
(with or without referred errors) shall therefore not attach any liability on 
the originator or HCL or its affiliates.
Views or opinions, if any, presented in this email are solely those of the 
author and may not necessarily reflect the
views or opinions of HCL or its affiliates. Any form of reproduction, 
dissemination, copying, disclosure, modification,
distribution and / or publication of this message without the prior written 
consent of authorized representative of
HCL is strictly prohibited. If you have received this email in error please 
delete it and notify the sender immediately.
Before opening any email and/or attachments, please check them for viruses and 
other defects.




[dpdk-dev] [PATCH v11 5/8] ethdev: add speed capabilities

2016-03-18 Thread Chen, Jing D
Hi,

Best Regards,
Mark


> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Friday, March 18, 2016 2:09 AM
> To: marcdevel at gmail.com; Richardson, Bruce; Doherty, Declan; Ananyev,
> Konstantin; Lu, Wenzhuo; Zhang, Helin; Chen, Jing D; harish.patil at 
> qlogic.com;
> rahul.lakkireddy at chelsio.com; johndale at cisco.com; vido at cesnet.cz;
> adrien.mazarguil at 6wind.com; alejandro.lucero at netronome.com
> Cc: dev at dpdk.org
> Subject: [PATCH v11 5/8] ethdev: add speed capabilities
> 
> From: Marc Sune 
> 
> The speed capabilities of a device can be retrieved with
> rte_eth_dev_info_get().
> 
> The new field speed_capa is initialized in the drivers without
> taking care of device characteristics in this patch.
> When the capabilities of a driver are accurate, the table in
> overview.rst must be filled.
> 
> Signed-off-by: Marc Sune 
> ---
>  doc/guides/nics/overview.rst   |  1 +
>  doc/guides/rel_notes/release_16_04.rst |  8 
>  drivers/net/bnx2x/bnx2x_ethdev.c   |  1 +
>  drivers/net/cxgbe/cxgbe_ethdev.c   |  1 +
>  drivers/net/e1000/em_ethdev.c  |  4 
>  drivers/net/e1000/igb_ethdev.c |  4 
>  drivers/net/fm10k/fm10k_ethdev.c   |  4 
>  drivers/net/i40e/i40e_ethdev.c |  8 
>  drivers/net/ixgbe/ixgbe_ethdev.c   |  8 
>  drivers/net/mlx4/mlx4.c|  2 ++
>  drivers/net/mlx5/mlx5_ethdev.c |  3 +++
>  drivers/net/nfp/nfp_net.c  |  2 ++
>  lib/librte_ether/rte_ethdev.h  | 21 +
>  13 files changed, 67 insertions(+)
> 
> 
>  static void
> diff --git a/drivers/net/fm10k/fm10k_ethdev.c
> b/drivers/net/fm10k/fm10k_ethdev.c
> index edc8c11..2a1c222 100644
> --- a/drivers/net/fm10k/fm10k_ethdev.c
> +++ b/drivers/net/fm10k/fm10k_ethdev.c
> @@ -1410,6 +1410,10 @@ fm10k_dev_infos_get(struct rte_eth_dev *dev,
>   .nb_min = FM10K_MIN_TX_DESC,
>   .nb_align = FM10K_MULT_TX_DESC,
>   };
> +
> + dev_info->speed_capa = ETH_LINK_SPEED_1G |
> ETH_LINK_SPEED_2_5G |
> + ETH_LINK_SPEED_10G | ETH_LINK_SPEED_25G |
> + ETH_LINK_SPEED_40G;
>  }
> 

Fm10k has 100G capability, we'd better to add ETH_LINK_SPEED_100G here.



[dpdk-dev] [PATCH] enic: change maintainers

2016-03-18 Thread Thomas Monjalon
2016-03-17 15:43, John Daley:
> Change maintainers for ENIC PMD.
> 
> Signed-off-by: John Daley 
> ---
>  doc/guides/nics/enic.rst | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Please update the MAINTAINERS file.


[dpdk-dev] [PATCH RFC v3 0/3] Thread safe rte_vhost_enqueue_burst().

2016-03-18 Thread Yuanhan Liu
On Thu, Mar 17, 2016 at 04:29:32PM +0100, Thomas Monjalon wrote:
> 2016-02-24 14:47, Ilya Maximets:
> > Implementation of rte_vhost_enqueue_burst() based on lockless ring-buffer
> > algorithm and contains almost all to be thread-safe, but it's not.
> > 
> > This set adds required changes.
> > 
> > First patch in set is a standalone patch that fixes many times discussed
> > issue with barriers on different architectures.
> > 
> > Second and third adds fixes to make rte_vhost_enqueue_burst thread safe.
> 
> My understanding is that we do not want to pollute Rx/Tx with locks.
> 
> Huawei, Yuanhan, Bruce, do you confirm?

Huawei would like to do that, and I'm behind that. Let's do it. The
question is can we do that in this release? As I replied in another
thread, I'm wondering we might need do an announce first and do it
in next release?

Both are Okay to me; I just want to know which one is more proper.

Thoughts?

--yliu


[dpdk-dev] QEDE pmd v2 patches

2016-03-18 Thread Thomas Monjalon
2016-03-18 01:43, Rasesh Mody:
> Hi Thomas, Bruce,
> 
> We didn't see automated build email for all the patches that we submitted to 
> dpdk.org. Is this expected?

The email below is for the whole patchset:
Patchwork ID: 11395-11402

> One of our patch, "[PATCH v2 04/10] qede: Add base driver", doesn't show up 
> on dpdk patchworks site http://dpdk.org/dev/patchwork/project/dpdk/list. Is 
> there any action needed from our side?

This patch is missing in the mailing list. Maybe it was too big?
You should have received an email about the reception issue. Check your spams.

> We received following messages from dpdk build nonfictions. We did apply the 
> patch-set to the latest dpdk tree then and it was applied cleanly. Please let 
> us know if we need to re-submit the patch-set to take care of it. 
> 
> --
> Test-Label: Intel Niantic on Fedora
> Test-Status: apply patch error
> 
> Patchwork ID: 11395-11402
> http://www.dpdk.org/dev/patchwork/patch/11402/
> Submitter: Rasesh Mody 
> Date: Thu, 10 Mar 2016 05:45:39 -0800
> DPDK git baseline: 94b0ad8e0aa556230183f4c4d06b68bfd145dce3
> 
> error: patch failed: MAINTAINERS:352
> error: MAINTAINERS: patch does not apply
> 
> DPDK STV team
> --

Which branch are based on?
You should be working on next-net.



[dpdk-dev] L2 Forwarding Sample Applications

2016-03-18 Thread Thomas Monjalon
2016-03-18 05:11, Vivek Gupta:
> The contents of this e-mail and any attachment(s) are confidential

Don't be afraid, I won't read your confidential message ;)


[dpdk-dev] [PATCH v11 5/8] ethdev: add speed capabilities

2016-03-18 Thread Thomas Monjalon
2016-03-18 05:18, Chen, Jing D:
> > --- a/drivers/net/fm10k/fm10k_ethdev.c
> > +++ b/drivers/net/fm10k/fm10k_ethdev.c
> > @@ -1410,6 +1410,10 @@ fm10k_dev_infos_get(struct rte_eth_dev *dev,
> > .nb_min = FM10K_MIN_TX_DESC,
> > .nb_align = FM10K_MULT_TX_DESC,
> > };
> > +
> > +   dev_info->speed_capa = ETH_LINK_SPEED_1G |
> > ETH_LINK_SPEED_2_5G |
> > +   ETH_LINK_SPEED_10G | ETH_LINK_SPEED_25G |
> > +   ETH_LINK_SPEED_40G;
> >  }
> > 
> 
> Fm10k has 100G capability, we'd better to add ETH_LINK_SPEED_100G here.

The 100G capability is added in patch #8.


[dpdk-dev] [PATCH RFC v3 0/3] Thread safe rte_vhost_enqueue_burst().

2016-03-18 Thread Thomas Monjalon
2016-03-18 16:00, Yuanhan Liu:
> On Thu, Mar 17, 2016 at 04:29:32PM +0100, Thomas Monjalon wrote:
> > 2016-02-24 14:47, Ilya Maximets:
> > > Implementation of rte_vhost_enqueue_burst() based on lockless ring-buffer
> > > algorithm and contains almost all to be thread-safe, but it's not.
> > > 
> > > This set adds required changes.
> > > 
> > > First patch in set is a standalone patch that fixes many times discussed
> > > issue with barriers on different architectures.
> > > 
> > > Second and third adds fixes to make rte_vhost_enqueue_burst thread safe.
> > 
> > My understanding is that we do not want to pollute Rx/Tx with locks.
> > 
> > Huawei, Yuanhan, Bruce, do you confirm?
> 
> Huawei would like to do that, and I'm behind that. Let's do it.

I'm not sure to understand. What do you want to do exactly?

> The question is can we do that in this release? As I replied in another
> thread, I'm wondering we might need do an announce first and do it
> in next release?
> 
> Both are Okay to me; I just want to know which one is more proper.
> 
> Thoughts?



[dpdk-dev] virtio-pci and vdev=eth_pcap combined lead to a hang on virtnet_set_rx_mode (lcore-slave)

2016-03-18 Thread Christian Ehrhardt
Hi,
TL;DR
When using --vdev=eth_pcap0 on testpmd "quit" the guest runs into a blocked
cpu on virtnet_set_rx_mode (lcore-slave).
Using these devices actually appers to be fine, but on the release path
(testpmd quit) the issue triggers.
Maybe virtio-pci and vdev=eth_pcap are just incompatible, but then there
could be more graceful exits :-)


Background:
I was "reusing" a testpmd commandline I had used on a physical system on a
KVM guest.
Doing so caused weird hangs.
Unfortunately I didn't have the time to debug it down to a fix, but along
the way of checking what actually was going on I collected quite a few logs
that I wanted to share.
If anybody recognizes (or even debugs and fixes) it right away great, if
not it will at least be "google'able" by the next with a similar issue.


Steps to reproduce:
Failing Commandline:
sudo /usr/bin/testpmd --pci-blacklist :00:03.0 --socket-mem 2048
--vdev=eth_pcap0,iface=eth1 --vdev=eth_pcap1,iface=eth2 -- --interactive
--total-num-mbufs=2048

Note: Working Commandline is the same just without vdev's
sudo /usr/bin/testpmd --pci-blacklist :00:03.0 --socket-mem 2048 --
--interactive --total-num-mbufs=2048

The blacklist on 03.0 is the device that I use to run ssh into the guest.


Setup:
(The setup of the physical system is equivalent, just ixgbe cards instead
of virtio)
- DPDK 2.2 / Qemu 2.5
- guest has 3 virtio network devices
- one to actually connect to it via ssh
- two to be dedicated for dpdk experiments
- I kept them at the virtio-pci driver, dpdk_nic_bind status:
  Network devices using kernel driver
  ===
  :00:03.0 'Virtio network device' if= drv=virtio-pci unused=
  :00:04.0 'Virtio network device' if= drv=virtio-pci unused=
  :00:05.0 'Virtio network device' if= drv=virtio-pci unused=
- ~3G hugepages are allocated (in 1536 2M hugepages)
- 4 virtual CPUs in the guest
- virtio-net devices have 4 queues each, combined with ethtool
  ethtool -L eth1
  current values: tx 0 rx 0 other 0 combined 4


After start testpmd initialization is ok
- got into interactive shell of testpmd, all fine
- ping Host to guest main dev still ok after testpmd start, protected by
blacklist (expected)
- ping Host to guest dpdk used dev stops (expected)

Nothing else interesting on journal / console of host or guest
Checking config from testpmds interactive console also didn't show anything
obvious to me.


Run "start tx_first"
- still all fine; ping to main console and guest ssh connectivity ok.
- host saw something happening on one of the dpdk ports, but nothing
stopped any active conenction
  kernel: localbridge0: port 2(vnet1) neighbor 8000.52:54:00:0f:a3:5f lost
  kernel: localbridge0: topology change detected, propagating


But then "quit" in testpmd seems to upset something in the guest
>From the quit itselt I get a hanging:
testpmd> quit
Stopping port 0...done
Stopping port 1...done
Stopping port 2...done
Stopping port 3...done
bye...
(hangs here)

The guest console at the same time triggers a hang check
[ 1712.124021] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s!
[lcore-slave-1:5653]


The guests journal list the soft lockup deatils as:
dpdk-guest1 kernel: device eth2 left promiscuous mode
dpdk-guest1 kernel: NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s!
[lcore-slave-1:5653]
dpdk-guest1 kernel: Modules linked in: xt_CHECKSUM iptable_mangle
ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4
nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack xt_tcpudp bridge stp llc
iptable_filter ip_tables x_tables isofs ppdev kvm_intel kvm irqbypass
crct10dif_pclmul crc32_pclmul joydev serio_raw parport_pc parport iscsi_tcp
libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 btrfs raid10 raid456
async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq
libcrc32c raid1 raid0 multipath linear aesni_intel psmouse aes_x86_64
glue_helper lrw gf128mul ablk_helper cryptd floppy
dpdk-guest1 kernel: CPU: 1 PID: 5653 Comm: lcore-slave-1 Not tainted
4.4.0-13-generic #29-Ubuntu
dpdk-guest1 kernel: Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
dpdk-guest1 kernel: task: 8801b7626040 ti: 8800ba944000 task.ti:
8800ba944000
dpdk-guest1 kernel: RIP: 0010:[]  []
virtnet_send_command+0xf3/0x150
dpdk-guest1 kernel: RSP: 0018:8800ba947a20  EFLAGS: 0246
dpdk-guest1 kernel: RAX:  RBX: 8800bbb0e840 RCX:
8801b658d000
dpdk-guest1 kernel: RDX: c050 RSI: 8800ba947a24 RDI:
8800bbaf0c00
dpdk-guest1 kernel: RBP: 8800ba947ab8 R08: 0004 R09:
8801b9001b00
dpdk-guest1 kernel: R10: 8800ba676340 R11: 8801b76d1000 R12:
0002
dpdk-guest1 kernel: R13: 8800ba947a48 R14:  R15:

dpdk-guest1 kernel: FS:  7f2a7a9fe700() GS:8801bfc8()
knlGS:
dpdk-guest1 kernel: CS:  0010 DS:  ES:  CR0: 80050033
dpdk-guest1 kernel: CR

[dpdk-dev] [RFC] vhost user: add error handling for fd > 1023

2016-03-18 Thread Patrik Andersson
Protect against DPDK crash when allocation of listen fd >= 1023.
For events on fd:s >1023, the current implementation will trigger
an abort due to access outside of allocated bit mask.

Corrections would include:

  * Match fdset_add() signature in fd_man.c to fd_man.h
  * Handling of return codes from fdset_add()
  * Addition of check of fd number in fdset_add_fd()

---

The rationale behind the suggested code change is that,
fdset_event_dispatch() could attempt access outside of the FD_SET
bitmask if there is an event on a file descriptor that in turn
looks up a virtio file descriptor with a value > 1023.
Such an attempt will lead to an abort() and a restart of any
vswitch using DPDK.

A discussion topic exist in the ovs-discuss mailing list that can
provide a little more background:

http://openvswitch.org/pipermail/discuss/2016-February/020243.html

Signed-off-by: Patrik Andersson 
---
 lib/librte_vhost/vhost_user/fd_man.c | 11 ++-
 lib/librte_vhost/vhost_user/vhost-net-user.c | 22 --
 2 files changed, 26 insertions(+), 7 deletions(-)

diff --git a/lib/librte_vhost/vhost_user/fd_man.c 
b/lib/librte_vhost/vhost_user/fd_man.c
index 087aaed..c691339 100644
--- a/lib/librte_vhost/vhost_user/fd_man.c
+++ b/lib/librte_vhost/vhost_user/fd_man.c
@@ -71,20 +71,22 @@ fdset_find_free_slot(struct fdset *pfdset)
return fdset_find_fd(pfdset, -1);
 }

-static void
+static int
 fdset_add_fd(struct fdset  *pfdset, int idx, int fd,
fd_cb rcb, fd_cb wcb, void *dat)
 {
struct fdentry *pfdentry;

-   if (pfdset == NULL || idx >= MAX_FDS)
-   return;
+   if (pfdset == NULL || idx >= MAX_FDS || fd >= FD_SETSIZE)
+   return -1;

pfdentry = &pfdset->fd[idx];
pfdentry->fd = fd;
pfdentry->rcb = rcb;
pfdentry->wcb = wcb;
pfdentry->dat = dat;
+
+   return 0;
 }

 /**
@@ -150,12 +152,11 @@ fdset_add(struct fdset *pfdset, int fd, fd_cb rcb, fd_cb 
wcb, void *dat)

/* Find a free slot in the list. */
i = fdset_find_free_slot(pfdset);
-   if (i == -1) {
+   if (i == -1 || fdset_add_fd(pfdset, i, fd, rcb, wcb, dat) < 0) {
pthread_mutex_unlock(&pfdset->fd_mutex);
return -2;
}

-   fdset_add_fd(pfdset, i, fd, rcb, wcb, dat);
pfdset->num++;

pthread_mutex_unlock(&pfdset->fd_mutex);
diff --git a/lib/librte_vhost/vhost_user/vhost-net-user.c 
b/lib/librte_vhost/vhost_user/vhost-net-user.c
index df2bd64..778851d 100644
--- a/lib/librte_vhost/vhost_user/vhost-net-user.c
+++ b/lib/librte_vhost/vhost_user/vhost-net-user.c
@@ -288,6 +288,7 @@ vserver_new_vq_conn(int fd, void *dat, __rte_unused int 
*remove)
int fh;
struct vhost_device_ctx vdev_ctx = { (pid_t)0, 0 };
unsigned int size;
+   int ret;

conn_fd = accept(fd, NULL, NULL);
RTE_LOG(INFO, VHOST_CONFIG,
@@ -317,8 +318,15 @@ vserver_new_vq_conn(int fd, void *dat, __rte_unused int 
*remove)

ctx->vserver = vserver;
ctx->fh = fh;
-   fdset_add(&g_vhost_server.fdset,
+   ret = fdset_add(&g_vhost_server.fdset,
conn_fd, vserver_message_handler, NULL, ctx);
+   if (ret < 0) {
+   free(ctx);
+   close(conn_fd);
+   RTE_LOG(ERR, VHOST_CONFIG,
+   "failed to add fd %d into vhost server fdset\n",
+   conn_fd);
+   }
 }

 /* callback when there is message on the connfd */
@@ -453,6 +461,7 @@ int
 rte_vhost_driver_register(const char *path)
 {
struct vhost_server *vserver;
+   int ret;

pthread_mutex_lock(&g_vhost_server.server_mutex);

@@ -478,8 +487,17 @@ rte_vhost_driver_register(const char *path)

vserver->path = strdup(path);

-   fdset_add(&g_vhost_server.fdset, vserver->listenfd,
+   ret = fdset_add(&g_vhost_server.fdset, vserver->listenfd,
vserver_new_vq_conn, NULL, vserver);
+   if (ret < 0) {
+   pthread_mutex_unlock(&g_vhost_server.server_mutex);
+   RTE_LOG(ERR, VHOST_CONFIG,
+   "failed to add listen fd %d to vhost server 
fdset\n",
+   vserver->listenfd);
+   free(vserver->path);
+   free(vserver);
+   return -1;
+   }

g_vhost_server.server[g_vhost_server.vserver_cnt++] = vserver;
pthread_mutex_unlock(&g_vhost_server.server_mutex);
-- 
1.9.1



[dpdk-dev] [PATCH RFC v3 0/3] Thread safe rte_vhost_enqueue_burst().

2016-03-18 Thread Yuanhan Liu
On Fri, Mar 18, 2016 at 09:09:04AM +0100, Thomas Monjalon wrote:
> 2016-03-18 16:00, Yuanhan Liu:
> > On Thu, Mar 17, 2016 at 04:29:32PM +0100, Thomas Monjalon wrote:
> > > 2016-02-24 14:47, Ilya Maximets:
> > > > Implementation of rte_vhost_enqueue_burst() based on lockless 
> > > > ring-buffer
> > > > algorithm and contains almost all to be thread-safe, but it's not.
> > > > 
> > > > This set adds required changes.
> > > > 
> > > > First patch in set is a standalone patch that fixes many times discussed
> > > > issue with barriers on different architectures.
> > > > 
> > > > Second and third adds fixes to make rte_vhost_enqueue_burst thread safe.
> > > 
> > > My understanding is that we do not want to pollute Rx/Tx with locks.
> > > 
> > > Huawei, Yuanhan, Bruce, do you confirm?
> > 
> > Huawei would like to do that, and I'm behind that. Let's do it.
> 
> I'm not sure to understand. What do you want to do exactly?

I was thinking we are on the same page :(

"do not want to pollute Rx/Tx with locks" == "remove lockless Rx/Tx, the
proposal from Huawei", right?

In another way, I'm behind the following patch from Huawei:

http://dpdk.org/dev/patchwork/patch/9740/

--yliu


[dpdk-dev] [PATCH v9 01/11] ethdev: add API to query supported packet types

2016-03-18 Thread Tan, Jianfeng
Hi Thomas,

On 3/15/2016 4:50 AM, Jianfeng Tan wrote:
> Add a new API rte_eth_dev_get_supported_ptypes to query what packet types
> can be filled by given already started device (or its pmd rx burst function
> has already been decided).
>
> Signed-off-by: Jianfeng Tan 
> Acked-by: Konstantin Ananyev 
> Acked-by: Adrien Mazarguil 
> ---
> v9:
>   - Fix rte_ether_version.map as Ferruh Yigit sugguests.
>   lib/librte_ether/rte_ethdev.c  | 27 +++
>   lib/librte_ether/rte_ethdev.h  | 27 +++
>   lib/librte_ether/rte_ether_version.map |  7 +++
>   3 files changed, 61 insertions(+)
...
> diff --git a/lib/librte_ether/rte_ether_version.map 
> b/lib/librte_ether/rte_ether_version.map
> index d8db24d..c242e71 100644
> --- a/lib/librte_ether/rte_ether_version.map
> +++ b/lib/librte_ether/rte_ether_version.map
> @@ -117,3 +117,10 @@ DPDK_2.2 {
>   
>   local: *;
>   };
> +
> +DPDK_16.04 {
> + global:
> +
> + rte_eth_dev_get_supported_ptypes;
> +
> +} DPDK_2.2;

Although this series is rebased on dpdk-next-net/rel_16_04, but applying 
this into v16.04-rc1 only needs to remove change in 
rte_ether_version.map. Shall I send a new version? Or this series will 
simply be deferred?

Thanks,
Jianfeng


[dpdk-dev] [PATCH v11 5/8] ethdev: add speed capabilities

2016-03-18 Thread Adrien Mazarguil
On Thu, Mar 17, 2016 at 07:09:02PM +0100, Thomas Monjalon wrote:
> From: Marc Sune 
> 
> The speed capabilities of a device can be retrieved with
> rte_eth_dev_info_get().
> 
> The new field speed_capa is initialized in the drivers without
> taking care of device characteristics in this patch.
> When the capabilities of a driver are accurate, the table in
> overview.rst must be filled.
> 
> Signed-off-by: Marc Sune 
> ---
>  doc/guides/nics/overview.rst   |  1 +
>  doc/guides/rel_notes/release_16_04.rst |  8 
>  drivers/net/bnx2x/bnx2x_ethdev.c   |  1 +
>  drivers/net/cxgbe/cxgbe_ethdev.c   |  1 +
>  drivers/net/e1000/em_ethdev.c  |  4 
>  drivers/net/e1000/igb_ethdev.c |  4 
>  drivers/net/fm10k/fm10k_ethdev.c   |  4 
>  drivers/net/i40e/i40e_ethdev.c |  8 
>  drivers/net/ixgbe/ixgbe_ethdev.c   |  8 
>  drivers/net/mlx4/mlx4.c|  2 ++
>  drivers/net/mlx5/mlx5_ethdev.c |  3 +++
>  drivers/net/nfp/nfp_net.c  |  2 ++
>  lib/librte_ether/rte_ethdev.h  | 21 +
>  13 files changed, 67 insertions(+)
[...]
> diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
> index cc4e9aa..3a4a989 100644
> --- a/drivers/net/mlx4/mlx4.c
> +++ b/drivers/net/mlx4/mlx4.c
> @@ -4301,6 +4301,8 @@ mlx4_dev_infos_get(struct rte_eth_dev *dev, struct 
> rte_eth_dev_info *info)
>0);
>   if (priv_get_ifname(priv, &ifname) == 0)
>   info->if_index = if_nametoindex(ifname);
> + info->speed_capa = ETH_LINK_SPEED_10G | ETH_LINK_SPEED_40G |
> + ETH_LINK_SPEED_56G;
>   priv_unlock(priv);
>  }

Missing: ETH_LINK_SPEED_100M (not sure if we care), ETH_LINK_SPEED_1G and
the nonstandard ETH_LINK_SPEED_20G with some adapters.

In the future we should provide a more accurate speed_capa depending on
actual port capabilities, several mlx4 adapters cannot handle them all.

> diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
> index 6704382..3487538 100644
> --- a/drivers/net/mlx5/mlx5_ethdev.c
> +++ b/drivers/net/mlx5/mlx5_ethdev.c
> @@ -522,6 +522,9 @@ mlx5_dev_infos_get(struct rte_eth_dev *dev, struct 
> rte_eth_dev_info *info)
>* size if it is not fixed.
>* The API should be updated to solve this problem. */
>   info->reta_size = priv->ind_table_max_size;
> + info->speed_capa = ETH_LINK_SPEED_1G | ETH_LINK_SPEED_10G |
> + ETH_LINK_SPEED_25G | ETH_LINK_SPEED_40G |
> + ETH_LINK_SPEED_50G;
>   priv_unlock(priv);
>  }

Missing: ETH_LINK_SPEED_100G, ETH_LINK_SPEED_20G and ETH_LINK_SPEED_56G.

Same as above, these capabilities actually depend on the adapter type and
should be probed.

I think ETH_LINK_SPEED_100M should work as well but I can't find it
mentioned anywhere, let's leave it out for now.

-- 
Adrien Mazarguil
6WIND


[dpdk-dev] [PATCH RFC v3 0/3] Thread safe rte_vhost_enqueue_burst().

2016-03-18 Thread Thomas Monjalon
2016-03-18 17:16, Yuanhan Liu:
> On Fri, Mar 18, 2016 at 09:09:04AM +0100, Thomas Monjalon wrote:
> > 2016-03-18 16:00, Yuanhan Liu:
> > > On Thu, Mar 17, 2016 at 04:29:32PM +0100, Thomas Monjalon wrote:
> > > > 2016-02-24 14:47, Ilya Maximets:
> > > > > Implementation of rte_vhost_enqueue_burst() based on lockless 
> > > > > ring-buffer
> > > > > algorithm and contains almost all to be thread-safe, but it's not.
> > > > > 
> > > > > This set adds required changes.
> > > > > 
> > > > > First patch in set is a standalone patch that fixes many times 
> > > > > discussed
> > > > > issue with barriers on different architectures.
> > > > > 
> > > > > Second and third adds fixes to make rte_vhost_enqueue_burst thread 
> > > > > safe.
> > > > 
> > > > My understanding is that we do not want to pollute Rx/Tx with locks.
> > > > 
> > > > Huawei, Yuanhan, Bruce, do you confirm?
> > > 
> > > Huawei would like to do that, and I'm behind that. Let's do it.
> > 
> > I'm not sure to understand. What do you want to do exactly?
> 
> I was thinking we are on the same page :(

Yes we are on the same page.
But it's better to make things explicit.

There should be no lock in Rx/Tx drivers (including vhost).
The locking must be done by the caller (application level).
That's why this series "Thread safe rte_vhost_enqueue_burst" is rejected.

> "do not want to pollute Rx/Tx with locks" == "remove lockless Rx/Tx, the
> proposal from Huawei", right?
> 
> In another way, I'm behind the following patch from Huawei:
> 
> http://dpdk.org/dev/patchwork/patch/9740/

The patch "vhost: remove lockless enqueue to the virtio ring" must me
reworked in 2 patches:
1/ announce API change
2/ remove locks

Please avoid talking about removing "lockless" when actually removing locks.

Thanks


[dpdk-dev] [PATCH v4] examples/l3fwd: em path performance fix

2016-03-18 Thread Tomasz Kulasek
It seems that for the most use cases, previous hash_multi_lookup provides
better performance, and more, sequential lookup can cause significant
performance drop.

This patch sets previously optional hash_multi_lookup method as default.
It also provides some minor optimizations such as queue drain only on used
tx ports.


This patch should be applied after Maciej Czekaj's patch "l3fwd: Fix
compilation with HASH_MULTI_LOOKUP"


v4 changes:
 - rebased to be applicable after patch "l3fwd: Fix compilation with
   HASH_MULTI_LOOKUP" of Maciej Czekaj

v3 changes:
 - "lpm: extend IPv4 next hop field" patch extends dst_port table from
   uint16_t to uint32_t omiting previously disabled l3fwd_em_hlm_sse.h,
   what causes incompatible pointer type error after turning on this header

v2 changes:
 - fixed copy-paste error causing that not all packets are classified right
   in hash_multi_lookup implementation when burst size is not divisible
   by 8

Fixes: 94c54b4158d5 ("examples/l3fwd: rework exact-match")

Reported-by: Qian Xu 
Signed-off-by: Tomasz Kulasek 
---
 examples/l3fwd/l3fwd.h|8 
 examples/l3fwd/l3fwd_em.c |6 +++---
 examples/l3fwd/l3fwd_em_hlm_sse.h |   28 ++--
 examples/l3fwd/l3fwd_em_sse.h |9 +
 examples/l3fwd/l3fwd_lpm.c|4 ++--
 examples/l3fwd/main.c |7 +++
 6 files changed, 39 insertions(+), 23 deletions(-)

diff --git a/examples/l3fwd/l3fwd.h b/examples/l3fwd/l3fwd.h
index 7dcc7e5..69dcc17 100644
--- a/examples/l3fwd/l3fwd.h
+++ b/examples/l3fwd/l3fwd.h
@@ -40,6 +40,12 @@

 #define RTE_LOGTYPE_L3FWD RTE_LOGTYPE_USER1

+#define __ARM_NEON 1
+
+#if !defined(NO_HASH_MULTI_LOOKUP) && defined(__ARM_NEON)
+#define NO_HASH_MULTI_LOOKUP 1
+#endif
+
 #define MAX_PKT_BURST 32
 #define BURST_TX_DRAIN_US 100 /* TX drain every ~100us */

@@ -86,6 +92,8 @@ struct lcore_rx_queue {
 struct lcore_conf {
uint16_t n_rx_queue;
struct lcore_rx_queue rx_queue_list[MAX_RX_QUEUE_PER_LCORE];
+   uint16_t n_tx_port;
+   uint16_t tx_port_id[RTE_MAX_ETHPORTS];
uint16_t tx_queue_id[RTE_MAX_ETHPORTS];
struct mbuf_table tx_mbufs[RTE_MAX_ETHPORTS];
void *ipv4_lookup_struct;
diff --git a/examples/l3fwd/l3fwd_em.c b/examples/l3fwd/l3fwd_em.c
index 0adf8f4..5a2e7ff 100644
--- a/examples/l3fwd/l3fwd_em.c
+++ b/examples/l3fwd/l3fwd_em.c
@@ -320,7 +320,7 @@ em_get_ipv6_dst_port(void *ipv6_hdr,  uint8_t portid, void 
*lookup_struct)
  * buffer optimization i.e. ENABLE_MULTI_BUFFER_OPTIMIZE=1.
  */
 #if defined(__SSE4_1__)
-#ifndef HASH_MULTI_LOOKUP
+#if defined(NO_HASH_MULTI_LOOKUP)
 #include "l3fwd_em_sse.h"
 #else
 #include "l3fwd_em_hlm_sse.h"
@@ -568,8 +568,8 @@ em_main_loop(__attribute__((unused)) void *dummy)
diff_tsc = cur_tsc - prev_tsc;
if (unlikely(diff_tsc > drain_tsc)) {

-   for (i = 0; i < qconf->n_rx_queue; i++) {
-   portid = qconf->rx_queue_list[i].port_id;
+   for (i = 0; i < qconf->n_tx_port; ++i) {
+   portid = qconf->tx_port_id[i];
if (qconf->tx_mbufs[portid].len == 0)
continue;
send_burst(qconf,
diff --git a/examples/l3fwd/l3fwd_em_hlm_sse.h 
b/examples/l3fwd/l3fwd_em_hlm_sse.h
index 891ae2e..7faf04a 100644
--- a/examples/l3fwd/l3fwd_em_hlm_sse.h
+++ b/examples/l3fwd/l3fwd_em_hlm_sse.h
@@ -34,17 +34,9 @@
 #ifndef __L3FWD_EM_HLM_SSE_H__
 #define __L3FWD_EM_HLM_SSE_H__

-/**
- * @file
- * This is an optional implementation of packet classification in Exact-Match
- * path using rte_hash_lookup_multi method from previous implementation.
- * While sequential classification seems to be faster, it's disabled by default
- * and can be enabled with HASH_LOOKUP_MULTI global define in compilation time.
- */
-
 #include "l3fwd_sse.h"

-static inline void
+static inline __attribute__((always_inline)) void
 em_get_dst_port_ipv4x8(struct lcore_conf *qconf, struct rte_mbuf *m[8],
uint8_t portid, uint32_t dst_port[8])
 {
@@ -168,7 +160,7 @@ get_ipv6_5tuple(struct rte_mbuf *m0, __m128i mask0,
key->xmm[2] = _mm_and_si128(tmpdata2, mask1);
 }

-static inline void
+static inline __attribute__((always_inline)) void
 em_get_dst_port_ipv6x8(struct lcore_conf *qconf, struct rte_mbuf *m[8],
uint8_t portid, uint32_t dst_port[8])
 {
@@ -322,17 +314,17 @@ l3fwd_em_send_packets(int nb_rx, struct rte_mbuf 
**pkts_burst,

} else {
dst_port[j]   = em_get_dst_port(qconf, pkts_burst[j], 
portid);
-   dst_port[j+1] = em_get_dst_port(qconf, pkts_burst[j], 
portid);
-   dst_port[j+2] = em_get_dst_port(qconf, pkts_burst[j], 
portid);
-   dst_port[j+3] = em_get_dst_port(qconf, pkts_burst[j], 
portid);
-   dst_port[j

[dpdk-dev] [PATCH v4] examples/l3fwd: em path performance fix

2016-03-18 Thread Kulasek, TomaszX


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Tomasz Kulasek
> Sent: Friday, March 18, 2016 10:37
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v4] examples/l3fwd: em path performance fix
> 
> It seems that for the most use cases, previous hash_multi_lookup provides
> better performance, and more, sequential lookup can cause significant
> performance drop.
> 
> This patch sets previously optional hash_multi_lookup method as default.
> It also provides some minor optimizations such as queue drain only on used
> tx ports.
> 
> 
> This patch should be applied after Maciej Czekaj's patch "l3fwd: Fix
> compilation with HASH_MULTI_LOOKUP"
> 
> 
> v4 changes:
>  - rebased to be applicable after patch "l3fwd: Fix compilation with
>HASH_MULTI_LOOKUP" of Maciej Czekaj
> 
> v3 changes:
>  - "lpm: extend IPv4 next hop field" patch extends dst_port table from
>uint16_t to uint32_t omiting previously disabled l3fwd_em_hlm_sse.h,
>what causes incompatible pointer type error after turning on this
> header
> 
> v2 changes:
>  - fixed copy-paste error causing that not all packets are classified
> right
>in hash_multi_lookup implementation when burst size is not divisible
>by 8
> 
> Fixes: 94c54b4158d5 ("examples/l3fwd: rework exact-match")
> 
> Reported-by: Qian Xu 
> Signed-off-by: Tomasz Kulasek 
> ---
>  examples/l3fwd/l3fwd.h|8 
>  examples/l3fwd/l3fwd_em.c |6 +++---
>  examples/l3fwd/l3fwd_em_hlm_sse.h |   28 ++--
>  examples/l3fwd/l3fwd_em_sse.h |9 +
>  examples/l3fwd/l3fwd_lpm.c|4 ++--
>  examples/l3fwd/main.c |7 +++
>  6 files changed, 39 insertions(+), 23 deletions(-)
> 

Self NACK - I forgot to remove debug options


[dpdk-dev] [PATCH RFC v3 0/3] Thread safe rte_vhost_enqueue_burst().

2016-03-18 Thread Yuanhan Liu
On Fri, Mar 18, 2016 at 10:34:56AM +0100, Thomas Monjalon wrote:
> 2016-03-18 17:16, Yuanhan Liu:
> > On Fri, Mar 18, 2016 at 09:09:04AM +0100, Thomas Monjalon wrote:
> > > 2016-03-18 16:00, Yuanhan Liu:
> > > > On Thu, Mar 17, 2016 at 04:29:32PM +0100, Thomas Monjalon wrote:
> > > > > 2016-02-24 14:47, Ilya Maximets:
> > > > > > Implementation of rte_vhost_enqueue_burst() based on lockless 
> > > > > > ring-buffer
> > > > > > algorithm and contains almost all to be thread-safe, but it's not.
> > > > > > 
> > > > > > This set adds required changes.
> > > > > > 
> > > > > > First patch in set is a standalone patch that fixes many times 
> > > > > > discussed
> > > > > > issue with barriers on different architectures.
> > > > > > 
> > > > > > Second and third adds fixes to make rte_vhost_enqueue_burst thread 
> > > > > > safe.
> > > > > 
> > > > > My understanding is that we do not want to pollute Rx/Tx with locks.
> > > > > 
> > > > > Huawei, Yuanhan, Bruce, do you confirm?
> > > > 
> > > > Huawei would like to do that, and I'm behind that. Let's do it.
> > > 
> > > I'm not sure to understand. What do you want to do exactly?
> > 
> > I was thinking we are on the same page :(
> 
> Yes we are on the same page.
> But it's better to make things explicit.
> 
> There should be no lock in Rx/Tx drivers (including vhost).
> The locking must be done by the caller (application level).

Yeah, that's why Huawei made the proposal and the patch.

> That's why this series "Thread safe rte_vhost_enqueue_burst" is rejected.
> 
> > "do not want to pollute Rx/Tx with locks" == "remove lockless Rx/Tx, the
> > proposal from Huawei", right?
> > 
> > In another way, I'm behind the following patch from Huawei:
> > 
> > http://dpdk.org/dev/patchwork/patch/9740/
> 
> The patch "vhost: remove lockless enqueue to the virtio ring" must me
> reworked in 2 patches:
> 1/ announce API change

That was my concern, and agreed we need that.

> 2/ remove locks
> 
> Please avoid talking about removing "lockless" when actually removing locks.

Sorry, my bad.

--yliu


[dpdk-dev] [PATCH v5] examples/l3fwd: em path performance fix

2016-03-18 Thread Tomasz Kulasek
It seems that for the most use cases, previous hash_multi_lookup provides
better performance, and more, sequential lookup can cause significant
performance drop.

This patch sets previously optional hash_multi_lookup method as default.
It also provides some minor optimizations such as queue drain only on used
tx ports.


This patch should be applied after Maciej Czekaj's patch "l3fwd: Fix
compilation with HASH_MULTI_LOOKUP"


v5 changes:
 - removed debug informations, patch cleanup

v4 changes:
 - rebased to be applicable after patch "l3fwd: Fix compilation with
   HASH_MULTI_LOOKUP" of Maciej Czekaj

v3 changes:
 - "lpm: extend IPv4 next hop field" patch extends dst_port table from
   uint16_t to uint32_t omiting previously disabled l3fwd_em_hlm_sse.h,
   what causes incompatible pointer type error after turning on this header

v2 changes:
 - fixed copy-paste error causing that not all packets are classified right
   in hash_multi_lookup implementation when burst size is not divisible
   by 8

Fixes: 94c54b4158d5 ("examples/l3fwd: rework exact-match")
Fixes: dc81ebbacaeb ("lpm: extend IPv4 next hop field")
Fixes: 64d3955de1de ("examples/l3fwd: fix ARM build")

Reported-by: Qian Xu 
Signed-off-by: Tomasz Kulasek 
---
 examples/l3fwd/l3fwd.h|6 ++
 examples/l3fwd/l3fwd_em.c |6 +++---
 examples/l3fwd/l3fwd_em_hlm_sse.h |   28 ++--
 examples/l3fwd/l3fwd_em_sse.h |9 +
 examples/l3fwd/l3fwd_lpm.c|4 ++--
 examples/l3fwd/main.c |7 +++
 6 files changed, 37 insertions(+), 23 deletions(-)

diff --git a/examples/l3fwd/l3fwd.h b/examples/l3fwd/l3fwd.h
index 7dcc7e5..1b148e7 100644
--- a/examples/l3fwd/l3fwd.h
+++ b/examples/l3fwd/l3fwd.h
@@ -40,6 +40,10 @@

 #define RTE_LOGTYPE_L3FWD RTE_LOGTYPE_USER1

+#if !defined(NO_HASH_MULTI_LOOKUP) && defined(__ARM_NEON)
+#define NO_HASH_MULTI_LOOKUP 1
+#endif
+
 #define MAX_PKT_BURST 32
 #define BURST_TX_DRAIN_US 100 /* TX drain every ~100us */

@@ -86,6 +90,8 @@ struct lcore_rx_queue {
 struct lcore_conf {
uint16_t n_rx_queue;
struct lcore_rx_queue rx_queue_list[MAX_RX_QUEUE_PER_LCORE];
+   uint16_t n_tx_port;
+   uint16_t tx_port_id[RTE_MAX_ETHPORTS];
uint16_t tx_queue_id[RTE_MAX_ETHPORTS];
struct mbuf_table tx_mbufs[RTE_MAX_ETHPORTS];
void *ipv4_lookup_struct;
diff --git a/examples/l3fwd/l3fwd_em.c b/examples/l3fwd/l3fwd_em.c
index 0adf8f4..5a2e7ff 100644
--- a/examples/l3fwd/l3fwd_em.c
+++ b/examples/l3fwd/l3fwd_em.c
@@ -320,7 +320,7 @@ em_get_ipv6_dst_port(void *ipv6_hdr,  uint8_t portid, void 
*lookup_struct)
  * buffer optimization i.e. ENABLE_MULTI_BUFFER_OPTIMIZE=1.
  */
 #if defined(__SSE4_1__)
-#ifndef HASH_MULTI_LOOKUP
+#if defined(NO_HASH_MULTI_LOOKUP)
 #include "l3fwd_em_sse.h"
 #else
 #include "l3fwd_em_hlm_sse.h"
@@ -568,8 +568,8 @@ em_main_loop(__attribute__((unused)) void *dummy)
diff_tsc = cur_tsc - prev_tsc;
if (unlikely(diff_tsc > drain_tsc)) {

-   for (i = 0; i < qconf->n_rx_queue; i++) {
-   portid = qconf->rx_queue_list[i].port_id;
+   for (i = 0; i < qconf->n_tx_port; ++i) {
+   portid = qconf->tx_port_id[i];
if (qconf->tx_mbufs[portid].len == 0)
continue;
send_burst(qconf,
diff --git a/examples/l3fwd/l3fwd_em_hlm_sse.h 
b/examples/l3fwd/l3fwd_em_hlm_sse.h
index 891ae2e..7faf04a 100644
--- a/examples/l3fwd/l3fwd_em_hlm_sse.h
+++ b/examples/l3fwd/l3fwd_em_hlm_sse.h
@@ -34,17 +34,9 @@
 #ifndef __L3FWD_EM_HLM_SSE_H__
 #define __L3FWD_EM_HLM_SSE_H__

-/**
- * @file
- * This is an optional implementation of packet classification in Exact-Match
- * path using rte_hash_lookup_multi method from previous implementation.
- * While sequential classification seems to be faster, it's disabled by default
- * and can be enabled with HASH_LOOKUP_MULTI global define in compilation time.
- */
-
 #include "l3fwd_sse.h"

-static inline void
+static inline __attribute__((always_inline)) void
 em_get_dst_port_ipv4x8(struct lcore_conf *qconf, struct rte_mbuf *m[8],
uint8_t portid, uint32_t dst_port[8])
 {
@@ -168,7 +160,7 @@ get_ipv6_5tuple(struct rte_mbuf *m0, __m128i mask0,
key->xmm[2] = _mm_and_si128(tmpdata2, mask1);
 }

-static inline void
+static inline __attribute__((always_inline)) void
 em_get_dst_port_ipv6x8(struct lcore_conf *qconf, struct rte_mbuf *m[8],
uint8_t portid, uint32_t dst_port[8])
 {
@@ -322,17 +314,17 @@ l3fwd_em_send_packets(int nb_rx, struct rte_mbuf 
**pkts_burst,

} else {
dst_port[j]   = em_get_dst_port(qconf, pkts_burst[j], 
portid);
-   dst_port[j+1] = em_get_dst_port(qconf, pkts_burst[j], 
portid);
-   dst_port[j+2] = em_get_dst_port(qconf, pkts_

[dpdk-dev] [PATCH RFC v3 0/3] Thread safe rte_vhost_enqueue_burst().

2016-03-18 Thread Ilya Maximets


On 18.03.2016 12:46, Yuanhan Liu wrote:
> On Fri, Mar 18, 2016 at 10:34:56AM +0100, Thomas Monjalon wrote:
>> 2016-03-18 17:16, Yuanhan Liu:
>>> On Fri, Mar 18, 2016 at 09:09:04AM +0100, Thomas Monjalon wrote:
 2016-03-18 16:00, Yuanhan Liu:
> On Thu, Mar 17, 2016 at 04:29:32PM +0100, Thomas Monjalon wrote:
>> 2016-02-24 14:47, Ilya Maximets:
>>> Implementation of rte_vhost_enqueue_burst() based on lockless 
>>> ring-buffer
>>> algorithm and contains almost all to be thread-safe, but it's not.
>>>
>>> This set adds required changes.
>>>
>>> First patch in set is a standalone patch that fixes many times discussed
>>> issue with barriers on different architectures.
>>>
>>> Second and third adds fixes to make rte_vhost_enqueue_burst thread safe.
>>
>> My understanding is that we do not want to pollute Rx/Tx with locks.
>>
>> Huawei, Yuanhan, Bruce, do you confirm?
>
> Huawei would like to do that, and I'm behind that. Let's do it.

 I'm not sure to understand. What do you want to do exactly?
>>>
>>> I was thinking we are on the same page :(
>>
>> Yes we are on the same page.
>> But it's better to make things explicit.
>>
>> There should be no lock in Rx/Tx drivers (including vhost).
>> The locking must be done by the caller (application level).
> 
> Yeah, that's why Huawei made the proposal and the patch.
> 
>> That's why this series "Thread safe rte_vhost_enqueue_burst" is rejected.

Hi all.
And what about first patch of this series:
"vhost: use SMP barriers instead of compiler ones." ?

It's not about thread safety in terms of 'lockless'. It is the standalone
patch that fixes many times discussed issue with barriers on arm.

Best regards, Ilya Maximets.

>>
>>> "do not want to pollute Rx/Tx with locks" == "remove lockless Rx/Tx, the
>>> proposal from Huawei", right?
>>>
>>> In another way, I'm behind the following patch from Huawei:
>>>
>>> http://dpdk.org/dev/patchwork/patch/9740/
>>
>> The patch "vhost: remove lockless enqueue to the virtio ring" must me
>> reworked in 2 patches:
>> 1/ announce API change
> 
> That was my concern, and agreed we need that.
> 
>> 2/ remove locks
>>
>> Please avoid talking about removing "lockless" when actually removing locks.
> 
> Sorry, my bad.
> 
>   --yliu
> 
> 


[dpdk-dev] [PATCH] ixgbe: avoid unnessary break when checking at the tail of rx hwring

2016-03-18 Thread Bruce Richardson
On Thu, Mar 17, 2016 at 10:20:01AM +0800, Jianbo Liu wrote:
> On 16 March 2016 at 19:14, Bruce Richardson  
> wrote:
> > On Wed, Mar 16, 2016 at 03:51:53PM +0800, Jianbo Liu wrote:
> >> Hi Wenzhuo,
> >>
> >> On 16 March 2016 at 14:06, Lu, Wenzhuo  wrote:
> >> > HI Jianbo,
> >> >
> >> >
> >> >> -Original Message-
> >> >> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jianbo Liu
> >> >> Sent: Monday, March 14, 2016 10:26 PM
> >> >> To: Zhang, Helin; Ananyev, Konstantin; dev at dpdk.org
> >> >> Cc: Jianbo Liu
> >> >> Subject: [dpdk-dev] [PATCH] ixgbe: avoid unnessary break when checking 
> >> >> at the
> >> >> tail of rx hwring
> >> >>
> >> >> When checking rx ring queue, it's possible that loop will break at the 
> >> >> tail while
> >> >> there are packets still in the queue header.
> >> > Would you like to give more details about in what scenario this issue 
> >> > will be hit? Thanks.
> >> >
> >>
> >> vPMD will place extra RTE_IXGBE_DESCS_PER_LOOP - 1 number of empty
> >> descriptiors at the end of hwring to avoid overflow when do checking
> >> on rx side.
> >>
> >> For the loop in _recv_raw_pkts_vec(), we check 4 descriptors each
> >> time. If all 4 DD are set, and all 4 packets are received.That's OK in
> >> the middle.
> >> But if come to the end of hwring, and less than 4 descriptors left, we
> >> still need to check 4 descriptors at the same time, so the extra empty
> >> descriptors are checked with them.
> >> This time, the number of received packets is apparently less than 4,
> >> and we break out of the loop because of the condition "var !=
> >> RTE_IXGBE_DESCS_PER_LOOP".
> >> So the problem arises. It is possible that there could be more packets
> >> at the hwring beginning that still waiting for being received.
> >> I think this fix can avoid this situation, and at least reduce the
> >> latency for the packets in the header.
> >>
> > Packets are always received in order from the NIC, so no packets ever get 
> > left
> > behind or skipped on an RX burst call.
> >
> > /Bruce
> >
> 
> I knew packets are received in order, and no packets will be skipped,
> but some will be left behind as I explained above.
> vPMD will not received nb_pkts required by one RX burst call, and
> those at the beginning of hwring are still waiting to be received till
> the next call.
> 
> Thanks!
> Jianbo
HI Jianbo,

ok, I understand now. I'm not sure that this is a significant problem though,
since we are working in polling mode. Is there a performance impact to your
change, because I don't think that we can reduce performance just to fix this?

Regards,
/Bruce


[dpdk-dev] [PATCH v5] examples/l3fwd: em path performance fix

2016-03-18 Thread Thomas Monjalon
2016-03-18 10:52, Tomasz Kulasek:
> +#if !defined(NO_HASH_MULTI_LOOKUP) && defined(__ARM_NEON)

I think we should use CONFIG_RTE_ARCH_ARM_NEON here.
Any ARM maintainer to confirm?

Note that there is already another occurence of this compiler flag:
examples/l3fwd/l3fwd_em.c:#elif defined(__ARM_NEON)


[dpdk-dev] [PATCH RFC v3 1/3] vhost: use SMP barriers instead of compiler ones.

2016-03-18 Thread Xie, Huawei
On 2/24/2016 7:47 PM, Ilya Maximets wrote:
>* Wait until it's our turn to add our buffer
> @@ -979,7 +979,7 @@ rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t 
> queue_id,
>   entry_success++;
>   }
>  
> - rte_compiler_barrier();
> + rte_smp_rmb();

smp_rmb()?

>   vq->used->idx += entry_success;
>   vhost_log_used_vring(dev, vq, offsetof(struct vring_used, idx),
>   sizeof(vq->used->idx));
> -- 2.5.0



[dpdk-dev] Performance issue with uio_pci_generic driver

2016-03-18 Thread Bruce Richardson
On Wed, Mar 16, 2016 at 11:59:34PM +, Simon Jouet wrote:
> Hi everyone,
> 
> First off I would like to thanks tmonjalo, Harry Van Harren and Bruce 
> Richardson for the input they gave while I was trying to figure out the issue 
> and pushing me to report the problem here ?
> 
> Okay, so I was trying out some basic sanity benchmarks with DPDK before doing 
> anything more complicated and surprisingly I was getting lower than gigabit 
> speed for minimum packet size running l2fwd (or l3fwd for that matter).
> 
> The setup is very simple I?ve got two machine with Intel x710 quad port NICs 
> one is running DPDK l2fwd and the other is running MoonGen for the 
> performance benchmark.
> 
> After much debugging and trying to modify parameters one by one, giving up 
> after nothing worked and setting up ovs-dpdk I noticed from the ovs 
> documentation that the kernel module to load were uio and igb_uio while I was 
> previous loading uio_pci_generic as mentioned in the DPDK getting started 
> guide. I simply changed the kernel module and l2fwd went from 700Mbps to 10G 
> line-rate.  Bruce said that shouldn?t be the case and the performance should 
> be similar regardless of the driver loaded ...
> 
> Here is the full log of the experiment, if you?re interested:
> https://gist.github.com/simon-jouet/178e1d302afef5c6a642
> 
> Best regards,
> Simon
> 
Thanks Simon for reporting this here!
It seems very strange to me that the particular module being used for uio has
such an impact on performance. Has anyone else seen anything similar?

/Bruce


[dpdk-dev] [PATCH RFC v3 0/3] Thread safe rte_vhost_enqueue_burst().

2016-03-18 Thread Xie, Huawei
On 3/18/2016 5:55 PM, Ilya Maximets wrote:
> Hi all.
> And what about first patch of this series:
> "vhost: use SMP barriers instead of compiler ones." ?
>
> It's not about thread safety in terms of 'lockless'. It is the standalone
> patch that fixes many times discussed issue with barriers on arm.
>
> Best regards, Ilya Maximets.
Right, for ARM. I put a comment there. Btw, could you add -in-reply-to
when you send you next patch?


[dpdk-dev] [PATCH v11 5/8] ethdev: add speed capabilities

2016-03-18 Thread Thomas Monjalon
2016-03-18 10:28, Adrien Mazarguil:
> On Thu, Mar 17, 2016 at 07:09:02PM +0100, Thomas Monjalon wrote:
> > --- a/drivers/net/mlx4/mlx4.c
> > +++ b/drivers/net/mlx4/mlx4.c
> > @@ -4301,6 +4301,8 @@ mlx4_dev_infos_get(struct rte_eth_dev *dev, struct 
> > rte_eth_dev_info *info)
> >  0);
> > if (priv_get_ifname(priv, &ifname) == 0)
> > info->if_index = if_nametoindex(ifname);
> > +   info->speed_capa = ETH_LINK_SPEED_10G | ETH_LINK_SPEED_40G |
> > +   ETH_LINK_SPEED_56G;
> > priv_unlock(priv);
> >  }
> 
> Missing: ETH_LINK_SPEED_100M (not sure if we care), ETH_LINK_SPEED_1G and
> the nonstandard ETH_LINK_SPEED_20G with some adapters.

For v12:
-   info->speed_capa = ETH_LINK_SPEED_10G | ETH_LINK_SPEED_40G |
+   info->speed_capa =
+   ETH_LINK_SPEED_1G |
+   ETH_LINK_SPEED_10G |
+   ETH_LINK_SPEED_20G |
+   ETH_LINK_SPEED_40G |
ETH_LINK_SPEED_56G;

> In the future we should provide a more accurate speed_capa depending on
> actual port capabilities, several mlx4 adapters cannot handle them all.

When doing so, you'll be able to fill the row "speed capability" in 
overview.rst.

> > --- a/drivers/net/mlx5/mlx5_ethdev.c
> > +++ b/drivers/net/mlx5/mlx5_ethdev.c
> > @@ -522,6 +522,9 @@ mlx5_dev_infos_get(struct rte_eth_dev *dev, struct 
> > rte_eth_dev_info *info)
> >  * size if it is not fixed.
> >  * The API should be updated to solve this problem. */
> > info->reta_size = priv->ind_table_max_size;
> > +   info->speed_capa = ETH_LINK_SPEED_1G | ETH_LINK_SPEED_10G |
> > +   ETH_LINK_SPEED_25G | ETH_LINK_SPEED_40G |
> > +   ETH_LINK_SPEED_50G;
> > priv_unlock(priv);
> >  }
> 
> Missing: ETH_LINK_SPEED_100G, ETH_LINK_SPEED_20G and ETH_LINK_SPEED_56G.

For v12:
-   info->speed_capa = ETH_LINK_SPEED_1G | ETH_LINK_SPEED_10G |
-   ETH_LINK_SPEED_25G | ETH_LINK_SPEED_40G |
-   ETH_LINK_SPEED_50G | ETH_LINK_SPEED_100G;
+   info->speed_capa =
+   ETH_LINK_SPEED_1G |
+   ETH_LINK_SPEED_10G |
+   ETH_LINK_SPEED_20G |
+   ETH_LINK_SPEED_25G |
+   ETH_LINK_SPEED_40G |
+   ETH_LINK_SPEED_50G |
+   ETH_LINK_SPEED_56G |
+   ETH_LINK_SPEED_100G;

ETH_LINK_SPEED_100G is added in patch #8.

> Same as above, these capabilities actually depend on the adapter type and
> should be probed.
> 
> I think ETH_LINK_SPEED_100M should work as well but I can't find it
> mentioned anywhere, let's leave it out for now.



[dpdk-dev] [PATCH v2] ring: check for zero objects mc dequeue / mp enqueue

2016-03-18 Thread Bruce Richardson
On Thu, Mar 17, 2016 at 05:09:08PM +0100, Mauricio V?squez wrote:
> Hi Lazaros,
> 
> On Thu, Mar 17, 2016 at 4:49 PM, Lazaros Koromilas 
> wrote:
> 
> > Issuing a zero objects dequeue with a single consumer has no effect.
> > Doing so with multiple consumers, can get more than one thread to succeed
> > the compare-and-set operation and observe starvation or even deadlock in
> > the while loop that checks for preceding dequeues.  The problematic piece
> > of code when n = 0:
> >
> > cons_next = cons_head + n;
> > success = rte_atomic32_cmpset(&r->cons.head, cons_head, cons_next);
> >
> > The same is possible on the enqueue path.
> >
> > Signed-off-by: Lazaros Koromilas 
> > ---
> >  lib/librte_ring/rte_ring.h | 10 ++
> >  1 file changed, 10 insertions(+)
> >
> > diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
> > index 943c97c..eb45e41 100644
> > --- a/lib/librte_ring/rte_ring.h
> > +++ b/lib/librte_ring/rte_ring.h
> > @@ -431,6 +431,11 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void *
> > const *obj_table,
> > uint32_t mask = r->prod.mask;
> > int ret;
> >
> > +   /* Avoid the unnecessary cmpset operation below, which is also
> > +* potentially harmful when n equals 0. */
> > +   if (n == 0)
> >
> 
> What about using unlikely here?
> 

Unless there is a measurable performance increase by adding in likely/unlikely
I'd suggest avoiding it's use. In general, likely/unlikely should only be used
for things like catestrophic errors because the penalty for taking the unlikely
leg of the code can be quite severe. For normal stuff, where the code nearly
always goes one way in the branch but occasionally goes the other, the hardware
branch predictors generally do a good enough job.

Just my 2c.

/Bruce



[dpdk-dev] [PATCH RFC v3 1/3] vhost: use SMP barriers instead of compiler ones.

2016-03-18 Thread Ilya Maximets
On 18.03.2016 13:08, Xie, Huawei wrote:
> On 2/24/2016 7:47 PM, Ilya Maximets wrote:
>>   * Wait until it's our turn to add our buffer
>> @@ -979,7 +979,7 @@ rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t 
>> queue_id,
>>  entry_success++;
>>  }
>>  
>> -rte_compiler_barrier();
>> +rte_smp_rmb();
> 
> smp_rmb()?

There is no such function 'smp_rmb' in DPDK.
But:
.../arch/arm/rte_atomic.h:#define rte_smp_rmb() rte_rmb()
.../arch/ppc_64/rte_atomic.h:#define rte_smp_rmb() rte_compiler_barrier()
.../arch/tile/rte_atomic.h:#define rte_smp_rmb() rte_compiler_barrier()
.../arch/x86/rte_atomic.h:#define rte_smp_rmb() rte_compiler_barrier()

> 
>>  vq->used->idx += entry_success;
>>  vhost_log_used_vring(dev, vq, offsetof(struct vring_used, idx),
>>  sizeof(vq->used->idx));
>> -- 2.5.0
> 
> 
> 


[dpdk-dev] [PATCH] enic: don't set enic->config.rq_desc_count in enic_alloc_rq()

2016-03-18 Thread Bruce Richardson
On Thu, Mar 17, 2016 at 03:46:50PM -0700, John Daley wrote:
> From: Nelson Escobar 
> 
> When the requested number of rx descriptors was less than the amount
> configured on the vic, enic_alloc_rq() was incorrectly setting
> enic->config.rq_desc_count to the lower value.  This screwed up later
> calls to enic_alloc_rq().

Can you perhaps clarify what exactly happened when the calls got "screwed up"?
Did the calls just fail, making the queue unusable, or did it crash the 
application,
or something else. 

Similarly, can you perhaps reword the patch title to start with the word "fix"
as this patch is for a bug-fix, and describe in a couple of words there what
the bug was that was being fixed e.g. "fix rx queue lockup", rather than 
describing the technicalities of the fix. In general, try to avoid using
function or variable names in the titles of messages.

Lastly, can you please include a "Fixes" line in the message, as described here:
http://dpdk.org/doc/guides/contributing/patches.html#commit-messages-body

Thanks,
/Bruce

> 
> Signed-off-by: Nelson Escobar 
> Reviewed-by: John Daley 
> ---
>  drivers/net/enic/enic_main.c | 14 ++
>  1 file changed, 6 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/net/enic/enic_main.c b/drivers/net/enic/enic_main.c
> index 9fff020..cd7857f 100644
> --- a/drivers/net/enic/enic_main.c
> +++ b/drivers/net/enic/enic_main.c
> @@ -524,24 +524,22 @@ int enic_alloc_rq(struct enic *enic, uint16_t queue_idx,
>   "policy.  Applying the value in the adapter "\
>   "policy (%d).\n",
>   queue_idx, nb_desc, enic->config.rq_desc_count);
> - } else if (nb_desc != enic->config.rq_desc_count) {
> - enic->config.rq_desc_count = nb_desc;
> - dev_info(enic,
> - "RX Queues - effective number of descs:%d\n",
> - nb_desc);
> + nb_desc = enic->config.rq_desc_count;
>   }
> + dev_info(enic, "RX Queues - effective number of descs:%d\n",
> +  nb_desc);
>   }
>  
>   /* Allocate queue resources */
>   rc = vnic_rq_alloc(enic->vdev, rq, queue_idx,
> - enic->config.rq_desc_count, sizeof(struct rq_enet_desc));
> + nb_desc, sizeof(struct rq_enet_desc));
>   if (rc) {
>   dev_err(enic, "error in allocation of rq\n");
>   goto err_exit;
>   }
>  
>   rc = vnic_cq_alloc(enic->vdev, &enic->cq[queue_idx], queue_idx,
> - socket_id, enic->config.rq_desc_count,
> + socket_id, nb_desc,
>   sizeof(struct cq_enet_rq_desc));
>   if (rc) {
>   dev_err(enic, "error in allocation of cq for rq\n");
> @@ -550,7 +548,7 @@ int enic_alloc_rq(struct enic *enic, uint16_t queue_idx,
>  
>   /* Allocate the mbuf ring */
>   rq->mbuf_ring = (struct rte_mbuf **)rte_zmalloc_socket("rq->mbuf_ring",
> - sizeof(struct rte_mbuf *) * enic->config.rq_desc_count,
> + sizeof(struct rte_mbuf *) * nb_desc,
>   RTE_CACHE_LINE_SIZE, rq->socket_id);
>  
>   if (rq->mbuf_ring != NULL)
> -- 
> 2.7.0
> 


[dpdk-dev] [PATCH RFC v3 0/3] Thread safe rte_vhost_enqueue_burst().

2016-03-18 Thread Thomas Monjalon
2016-03-18 10:10, Xie, Huawei:
> On 3/18/2016 5:55 PM, Ilya Maximets wrote:
> > Hi all.
> > And what about first patch of this series:
> > "vhost: use SMP barriers instead of compiler ones." ?
> >
> > It's not about thread safety in terms of 'lockless'. It is the standalone
> > patch that fixes many times discussed issue with barriers on arm.
> >
> > Best regards, Ilya Maximets.
> Right, for ARM. I put a comment there. Btw, could you add -in-reply-to
> when you send you next patch?

Please Ilya, send it alone when you'll have an ack from a vhost maintainer.


[dpdk-dev] [PATCH RFC v3 1/3] vhost: use SMP barriers instead of compiler ones.

2016-03-18 Thread Xie, Huawei
On 3/18/2016 6:23 PM, Ilya Maximets wrote:
> On 18.03.2016 13:08, Xie, Huawei wrote:
>> On 2/24/2016 7:47 PM, Ilya Maximets wrote:
>>>  * Wait until it's our turn to add our buffer
>>> @@ -979,7 +979,7 @@ rte_vhost_dequeue_burst(struct virtio_net *dev, 
>>> uint16_t queue_id,
>>> entry_success++;
>>> }
>>>  
>>> -   rte_compiler_barrier();
>>> +   rte_smp_rmb();
>> smp_rmb()?
> There is no such function 'smp_rmb' in DPDK.
> But:
> .../arch/arm/rte_atomic.h:#define rte_smp_rmb() rte_rmb()
> .../arch/ppc_64/rte_atomic.h:#define rte_smp_rmb() rte_compiler_barrier()
> .../arch/tile/rte_atomic.h:#define rte_smp_rmb() rte_compiler_barrier()
> .../arch/x86/rte_atomic.h:#define rte_smp_rmb() rte_compiler_barrier()

I mean shoudn't be rte_smp_wmb()?

>
>>> vq->used->idx += entry_success;
>>> vhost_log_used_vring(dev, vq, offsetof(struct vring_used, idx),
>>> sizeof(vq->used->idx));
>>> -- 2.5.0
>>
>>



[dpdk-dev] [PATCH v2] ring: check for zero objects mc dequeue / mp enqueue

2016-03-18 Thread Olivier Matz
Hi,

On 03/18/2016 11:18 AM, Bruce Richardson wrote:
>>> diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
>>> index 943c97c..eb45e41 100644
>>> --- a/lib/librte_ring/rte_ring.h
>>> +++ b/lib/librte_ring/rte_ring.h
>>> @@ -431,6 +431,11 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void *
>>> const *obj_table,
>>> uint32_t mask = r->prod.mask;
>>> int ret;
>>>
>>> +   /* Avoid the unnecessary cmpset operation below, which is also
>>> +* potentially harmful when n equals 0. */
>>> +   if (n == 0)
>>>
>>
>> What about using unlikely here?
>>
> 
> Unless there is a measurable performance increase by adding in likely/unlikely
> I'd suggest avoiding it's use. In general, likely/unlikely should only be used
> for things like catestrophic errors because the penalty for taking the 
> unlikely
> leg of the code can be quite severe. For normal stuff, where the code nearly
> always goes one way in the branch but occasionally goes the other, the 
> hardware
> branch predictors generally do a good enough job.

Do you mean using likely/unlikely could be worst than not using it
in this case?

To me, using unlikely here is not a bad idea: it shows to the compiler
and to the reader of the code that is case is not the usual case.


Olivier


[dpdk-dev] [PATCH v2] ring: check for zero objects mc dequeue / mp enqueue

2016-03-18 Thread Bruce Richardson
On Fri, Mar 18, 2016 at 11:27:18AM +0100, Olivier Matz wrote:
> Hi,
> 
> On 03/18/2016 11:18 AM, Bruce Richardson wrote:
> >>> diff --git a/lib/librte_ring/rte_ring.h b/lib/librte_ring/rte_ring.h
> >>> index 943c97c..eb45e41 100644
> >>> --- a/lib/librte_ring/rte_ring.h
> >>> +++ b/lib/librte_ring/rte_ring.h
> >>> @@ -431,6 +431,11 @@ __rte_ring_mp_do_enqueue(struct rte_ring *r, void *
> >>> const *obj_table,
> >>> uint32_t mask = r->prod.mask;
> >>> int ret;
> >>>
> >>> +   /* Avoid the unnecessary cmpset operation below, which is also
> >>> +* potentially harmful when n equals 0. */
> >>> +   if (n == 0)
> >>>
> >>
> >> What about using unlikely here?
> >>
> > 
> > Unless there is a measurable performance increase by adding in 
> > likely/unlikely
> > I'd suggest avoiding it's use. In general, likely/unlikely should only be 
> > used
> > for things like catestrophic errors because the penalty for taking the 
> > unlikely
> > leg of the code can be quite severe. For normal stuff, where the code nearly
> > always goes one way in the branch but occasionally goes the other, the 
> > hardware
> > branch predictors generally do a good enough job.
> 
> Do you mean using likely/unlikely could be worst than not using it
> in this case?
> 
> To me, using unlikely here is not a bad idea: it shows to the compiler
> and to the reader of the code that is case is not the usual case.
>
Hi Olivier,

it might be worse if the user makes a lot of calls with n == 0. It almost
certainly would depend upon the compiler. Overall, I'd rather see us err on the
side of not putting in the calls unless there is a proven case to do so.

I don't think the documentation benefit is huge here either, it's just standard
parameter checking at the start of the function.

/Bruce


[dpdk-dev] [PATCH] examples/l3fwd: fix validation for queue id of config tuple

2016-03-18 Thread Reshma Pattan
Added validation for queue id of config parameter tuple.

This validation enforces user to enter queue ids of a port
from 0 and in sequence.

This additional validation on queue ids avoids ixgbe crash caused by null
rxq pointer access inside ixgbe_dev_rx_init.

Reason for null rxq is, L3fwd application allocates memory only for queues 
passed by user.
But rte_eth_dev_start  tries to initialize rx queues in sequence from 0 to 
nb_rx_queues,
which is not true and coredump while accessing the unallocated queue .

Fixes: Commit af75078f

Signed-off-by: Reshma Pattan 
---
 examples/l3fwd/main.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/examples/l3fwd/main.c b/examples/l3fwd/main.c
index 0e33039..f148d4e 100644
--- a/examples/l3fwd/main.c
+++ b/examples/l3fwd/main.c
@@ -264,8 +264,14 @@ get_port_n_rx_queues(const uint8_t port)

for (i = 0; i < nb_lcore_params; ++i) {
if (lcore_params[i].port_id == port &&
-   lcore_params[i].queue_id > queue)
+   lcore_params[i].queue_id > queue &&
+   lcore_params[i].queue_id == queue+1)
queue = lcore_params[i].queue_id;
+   else if (lcore_params[i].port_id == port &&
+   lcore_params[i].queue_id != queue+1)
+   rte_exit(EXIT_FAILURE, "queue ids of the port %d must 
be"
+   " in sequence and must start with 0",
+   lcore_params[i].port_id);
}
return (uint8_t)(++queue);
 }
-- 
2.5.0



[dpdk-dev] [PATCH v2] ring: check for zero objects mc dequeue / mp enqueue

2016-03-18 Thread Thomas Monjalon
2016-03-18 11:27, Olivier Matz:
> On 03/18/2016 11:18 AM, Bruce Richardson wrote:
> >>> +   /* Avoid the unnecessary cmpset operation below, which is also
> >>> +* potentially harmful when n equals 0. */
> >>> +   if (n == 0)
> >>>
> >>
> >> What about using unlikely here?
> >>
> > 
> > Unless there is a measurable performance increase by adding in 
> > likely/unlikely
> > I'd suggest avoiding it's use. In general, likely/unlikely should only be 
> > used
> > for things like catestrophic errors because the penalty for taking the 
> > unlikely
> > leg of the code can be quite severe. For normal stuff, where the code nearly
> > always goes one way in the branch but occasionally goes the other, the 
> > hardware
> > branch predictors generally do a good enough job.
> 
> Do you mean using likely/unlikely could be worst than not using it
> in this case?
> 
> To me, using unlikely here is not a bad idea: it shows to the compiler
> and to the reader of the code that is case is not the usual case.

It would be nice to have a guideline section about likely/unlikely in
doc/guides/contributing/design.rst

Bruce gave a talk at Dublin about this kind of things.
I'm sure he could contribute more design guidelines ;)


[dpdk-dev] [PATCH RFC v3 1/3] vhost: use SMP barriers instead of compiler ones.

2016-03-18 Thread Ilya Maximets


On 18.03.2016 13:27, Xie, Huawei wrote:
> On 3/18/2016 6:23 PM, Ilya Maximets wrote:
>> On 18.03.2016 13:08, Xie, Huawei wrote:
>>> On 2/24/2016 7:47 PM, Ilya Maximets wrote:
 * Wait until it's our turn to add our buffer
 @@ -979,7 +979,7 @@ rte_vhost_dequeue_burst(struct virtio_net *dev, 
 uint16_t queue_id,
entry_success++;
}
  
 -  rte_compiler_barrier();
 +  rte_smp_rmb();
>>> smp_rmb()?
>> There is no such function 'smp_rmb' in DPDK.
>> But:
>> .../arch/arm/rte_atomic.h:#define rte_smp_rmb() rte_rmb()
>> .../arch/ppc_64/rte_atomic.h:#define rte_smp_rmb() rte_compiler_barrier()
>> .../arch/tile/rte_atomic.h:#define rte_smp_rmb() rte_compiler_barrier()
>> .../arch/x86/rte_atomic.h:#define rte_smp_rmb() rte_compiler_barrier()
> 
> I mean shoudn't be rte_smp_wmb()?

No. Here we need to be sure that copying of data from descriptor to
our local mbuf completed before 'vq->used->idx += entry_success;'.

Read memory barrier will help us with it.

In other places write barriers used because copying performed in
opposite direction.

> 
>>
vq->used->idx += entry_success;
vhost_log_used_vring(dev, vq, offsetof(struct vring_used, idx),
sizeof(vq->used->idx));
 -- 2.5.0
>>>
>>>
> 
> 
> 


[dpdk-dev] [PATCH RFC v3 1/3] vhost: use SMP barriers instead of compiler ones.

2016-03-18 Thread Xie, Huawei
On 3/18/2016 6:39 PM, Ilya Maximets wrote:
>
> On 18.03.2016 13:27, Xie, Huawei wrote:
>> On 3/18/2016 6:23 PM, Ilya Maximets wrote:
>>> On 18.03.2016 13:08, Xie, Huawei wrote:
 On 2/24/2016 7:47 PM, Ilya Maximets wrote:
>* Wait until it's our turn to add our buffer
> @@ -979,7 +979,7 @@ rte_vhost_dequeue_burst(struct virtio_net *dev, 
> uint16_t queue_id,
>   entry_success++;
>   }
>  
> - rte_compiler_barrier();
> + rte_smp_rmb();
 smp_rmb()?
>>> There is no such function 'smp_rmb' in DPDK.
>>> But:
>>> .../arch/arm/rte_atomic.h:#define rte_smp_rmb() rte_rmb()
>>> .../arch/ppc_64/rte_atomic.h:#define rte_smp_rmb() rte_compiler_barrier()
>>> .../arch/tile/rte_atomic.h:#define rte_smp_rmb() rte_compiler_barrier()
>>> .../arch/x86/rte_atomic.h:#define rte_smp_rmb() rte_compiler_barrier()
>> I mean shoudn't be rte_smp_wmb()?
> No. Here we need to be sure that copying of data from descriptor to
> our local mbuf completed before 'vq->used->idx += entry_success;'.
>
> Read memory barrier will help us with it.
>
> In other places write barriers used because copying performed in
> opposite direction.

What about the udpate to the used ring?


>
>   vq->used->idx += entry_success;
>   vhost_log_used_vring(dev, vq, offsetof(struct vring_used, idx),
>   sizeof(vq->used->idx));
> -- 2.5.0

>>
>>



[dpdk-dev] [PATCH] eal: add missing long-options for short option arguments

2016-03-18 Thread David Marchand
On Thu, Mar 3, 2016 at 4:02 PM, Wiles, Keith  wrote:
>>On Thu, Feb 25, 2016 at 11:12 PM, Wiles, Keith  
>>wrote:
On Thu, Feb 25, 2016 at 01:09:16PM -0600, Keith Wiles wrote:
> A number of short options for EAL are missing long options
> and this patch adds those missing options.
>
> The missing long options are for:
> -c add --coremask
> -d add --driver
> -l add --corelist
> -m add --memsize
> -n add --mem-channels
> -r add --mem-ranks
> -v add --version
> Add an alias for --lcores using --lcore-map
[snip]
>>No strong opinion on this.
>>
>>Just, why "memsize" with no -  but "mem-channels" ?
>>And why cut down to mem rather than memory ?
>
> I debated on mem-size, but I noticed in a couple places some used memsize. I 
> can change them to any thing someone wants. If you want memory-channels and 
> memory-ranks I am good with that too.

Ok, since I can see no real argument against, let's go with explicit options :
--memory-size
--memory-channels
--memory-ranks

The best would be then to align all other existing long options (but
keeping compat with existing ones).

Deal ?


-- 
David Marchand


[dpdk-dev] [PATCH v5] examples/l3fwd: em path performance fix

2016-03-18 Thread Jerin Jacob
On Fri, Mar 18, 2016 at 11:04:49AM +0100, Thomas Monjalon wrote:
> 2016-03-18 10:52, Tomasz Kulasek:
> > +#if !defined(NO_HASH_MULTI_LOOKUP) && defined(__ARM_NEON)
> 
> I think we should use CONFIG_RTE_ARCH_ARM_NEON here.
> Any ARM maintainer to confirm?

__ARM_NEON should work existing GCC, but it is better to use
RTE_MACHINE_CPUFLAG_NEON as
-it has been generated by probing the compiler capabilities.
-it's future-proof solution to support clang or other gcc versions in
future

? [master]laptop [dpdk-master] $ aarch64-thunderx-linux-gnu-gcc -dM -E - < 
/dev/null  | grep NEON
#define __ARM_NEON_FP 12
#define __ARM_NEON 1

make output
aarch64-thunderx-linux-gnu-gcc -Wp,-MD,./.eal.o.d.tmp  -pthread
-march=armv8-a+crc -mcpu=thunderx -DRTE_MACHINE_CPUFLAG_NEON
-DRTE_MACHINE_CPUFLAG_CRC32
-DRTE_COMPILE_TIME_CPUFLAGS=RTE_CPUFLAG_NEON,RTE_CPUFLAG_CRC32
-I/export/dpdk-master/build/include -include
/export/dpdk-master/build/include/rte_config.h
-I/export/dpdk-master/lib/librte_eal/linuxapp/eal/include
-I/export/dpdk-master/lib/librte_eal/common
-I/export/dpdk-master/lib/librte_eal/common/include
-I/export/dpdk-master/lib/librte_ring
-I/export/dpdk-master/lib/librte_mempool
-I/export/dpdk-master/lib/librte_ivshmem -W -Wall -Wstrict-prototypes
-Wmissing-prototypes -Wmissing-declarations -Wold-style-definition
-Wpointer-arith -Wcast-align -Wnested-externs -Wcast-qual
-Wformat-nonliteral -Wformat-security -Wundef -Wwrite-strings -Werror
-O3 -D_GNU_SOURCE  -o eal.o -c
/export/dpdk-master/lib/librte_eal/linuxapp/eal/eal.c 

Jerin

> 
> Note that there is already another occurence of this compiler flag:
> examples/l3fwd/l3fwd_em.c:#elif defined(__ARM_NEON)


[dpdk-dev] [PATCH] app: fix for lpm in ip_pipeline

2016-03-18 Thread Thomas Monjalon
2016-03-17 09:47, Michal Jastrzebski:
> From: Michal Kobylinski 
> 
> Updated ip_pipeline app is using new changes from LPM library 
> (Increased number of next hops and added new config structure 
> for LPM IPv4).
> 
> Fixes: 7164439d017d ("lpm: add a new config structure for IPv4")

Fixes: f1f7261838b3 ("lpm: add a new config structure for IPv4")

> Signed-off-by: Michal Kobylinski 
> Acked-by: Cristian Dumitrescu 

Applied, thanks


[dpdk-dev] [PATCH RFC v3 1/3] vhost: use SMP barriers instead of compiler ones.

2016-03-18 Thread Ilya Maximets
On 18.03.2016 13:47, Xie, Huawei wrote:
> On 3/18/2016 6:39 PM, Ilya Maximets wrote:
>>
>> On 18.03.2016 13:27, Xie, Huawei wrote:
>>> On 3/18/2016 6:23 PM, Ilya Maximets wrote:
 On 18.03.2016 13:08, Xie, Huawei wrote:
> On 2/24/2016 7:47 PM, Ilya Maximets wrote:
>>   * Wait until it's our turn to add our buffer
>> @@ -979,7 +979,7 @@ rte_vhost_dequeue_burst(struct virtio_net *dev, 
>> uint16_t queue_id,
>>  entry_success++;
>>  }
>>  
>> -rte_compiler_barrier();
>> +rte_smp_rmb();
> smp_rmb()?
 There is no such function 'smp_rmb' in DPDK.
 But:
 .../arch/arm/rte_atomic.h:#define rte_smp_rmb() rte_rmb()
 .../arch/ppc_64/rte_atomic.h:#define rte_smp_rmb() rte_compiler_barrier()
 .../arch/tile/rte_atomic.h:#define rte_smp_rmb() rte_compiler_barrier()
 .../arch/x86/rte_atomic.h:#define rte_smp_rmb() rte_compiler_barrier()
>>> I mean shoudn't be rte_smp_wmb()?
>> No. Here we need to be sure that copying of data from descriptor to
>> our local mbuf completed before 'vq->used->idx += entry_success;'.
>>
>> Read memory barrier will help us with it.
>>
>> In other places write barriers used because copying performed in
>> opposite direction.
> 
> What about the udpate to the used ring?

Next line is the only place where this vq->used->idx accessed.
So, there must be no issues.

>>  vq->used->idx += entry_success;
>>  vhost_log_used_vring(dev, vq, offsetof(struct vring_used, idx),
>>  sizeof(vq->used->idx));
>> -- 2.5.0
>
>>>
>>>
> 
> 
> 


[dpdk-dev] [PATCH v5] examples/l3fwd: em path performance fix

2016-03-18 Thread Thomas Monjalon
2016-03-18 16:22, Jerin Jacob:
> On Fri, Mar 18, 2016 at 11:04:49AM +0100, Thomas Monjalon wrote:
> > 2016-03-18 10:52, Tomasz Kulasek:
> > > +#if !defined(NO_HASH_MULTI_LOOKUP) && defined(__ARM_NEON)
> > 
> > I think we should use CONFIG_RTE_ARCH_ARM_NEON here.
> > Any ARM maintainer to confirm?
> 
> __ARM_NEON should work existing GCC, but it is better to use
> RTE_MACHINE_CPUFLAG_NEON as
> -it has been generated by probing the compiler capabilities.
> -it's future-proof solution to support clang or other gcc versions in
> future

I agree to use RTE_MACHINE_CPUFLAG_NEON.

I just don't understand why CONFIG_RTE_ARCH_ARM_NEON has been introduced.
It seems to be used to disable NEON on ARMv7:
ifeq ($(CONFIG_RTE_ARCH_ARM_NEON),y)
 
MACHINE_CFLAGS += -mfpu=neon
endif


[dpdk-dev] [PATCH] examples/l3fwd: prefer probed NEON flag to ARM gcc flag

2016-03-18 Thread Thomas Monjalon
Fixes: 64d3955de1de ("examples/l3fwd: fix ARM build")

Suggested-by: Jerin Jacob 
Signed-off-by: Thomas Monjalon 
---
 examples/l3fwd/l3fwd_em.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/examples/l3fwd/l3fwd_em.c b/examples/l3fwd/l3fwd_em.c
index 0adf8f4..e8e7ca1 100644
--- a/examples/l3fwd/l3fwd_em.c
+++ b/examples/l3fwd/l3fwd_em.c
@@ -250,7 +250,7 @@ em_mask_key(void *key, xmm_t mask)

return _mm_and_si128(data, mask);
 }
-#elif defined(__ARM_NEON)
+#elif defined(RTE_CPUFLAG_NEON)
 static inline xmm_t
 em_mask_key(void *key, xmm_t mask)
 {
-- 
2.7.0



[dpdk-dev] [PATCH v2] examples/l3fwd: prefer probed NEON flag to ARM gcc flag

2016-03-18 Thread Thomas Monjalon
Fixes: 64d3955de1de ("examples/l3fwd: fix ARM build")

Suggested-by: Jerin Jacob 
Signed-off-by: Thomas Monjalon 
---
 examples/l3fwd/l3fwd_em.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

v2: use RTE_MACHINE_CPUFLAG_NEON instead of (always defined) RTE_CPUFLAG_NEON

diff --git a/examples/l3fwd/l3fwd_em.c b/examples/l3fwd/l3fwd_em.c
index 0adf8f4..4983eed 100644
--- a/examples/l3fwd/l3fwd_em.c
+++ b/examples/l3fwd/l3fwd_em.c
@@ -250,7 +250,7 @@ em_mask_key(void *key, xmm_t mask)

return _mm_and_si128(data, mask);
 }
-#elif defined(__ARM_NEON)
+#elif defined(RTE_MACHINE_CPUFLAG_NEON)
 static inline xmm_t
 em_mask_key(void *key, xmm_t mask)
 {
-- 
2.7.0



[dpdk-dev] [PATCH v4 07/12] librte_ether: extend flow director struct

2016-03-18 Thread Thomas Monjalon
Hi Jingjing,

2016-03-10 11:25, Jingjing Wu:
> This patch changed rte_eth_fdir_flow from union to struct to
> support more packets formats, for example, Vxlan and GRE tunnel
> packets with IP inner frame.

I think we need a lot more explanations about how it should work.
>From this point we should collect some acknowledgements from the
maintainers of other drivers having this kind of flow steering need.
Maybe that a better API, more generic, is possible.

> This patch also add new RTE_FDIR_TUNNEL_TYPE_GRE enum.

OK to add GRE to the existing API.

> Signed-off-by: Jingjing Wu 
> Acked-by: Helin Zhang 
[...]
>  /**
> - * An union contains the inputs for all types of flow
> + * A struct contains the inputs for all types of flow
>   */
> -union rte_eth_fdir_flow {
> - struct rte_eth_l2_flow l2_flow;
> - struct rte_eth_udpv4_flow  udp4_flow;
> - struct rte_eth_tcpv4_flow  tcp4_flow;
> - struct rte_eth_sctpv4_flow sctp4_flow;
> - struct rte_eth_ipv4_flow   ip4_flow;
> - struct rte_eth_udpv6_flow  udp6_flow;
> - struct rte_eth_tcpv6_flow  tcp6_flow;
> - struct rte_eth_sctpv6_flow sctp6_flow;
> - struct rte_eth_ipv6_flow   ipv6_flow;
> +struct rte_eth_fdir_flow {
> + union {
> + struct rte_eth_l2_flow l2_flow;
> + struct rte_eth_udpv4_flow  udp4_flow;
> + struct rte_eth_tcpv4_flow  tcp4_flow;
> + struct rte_eth_sctpv4_flow sctp4_flow;
> + struct rte_eth_ipv4_flow   ip4_flow;
> + struct rte_eth_udpv6_flow  udp6_flow;
> + struct rte_eth_tcpv6_flow  tcp6_flow;
> + struct rte_eth_sctpv6_flow sctp6_flow;
> + struct rte_eth_ipv6_flow   ipv6_flow;
> + };
>   struct rte_eth_mac_vlan_flow mac_vlan_flow;
>   struct rte_eth_tunnel_flow   tunnel_flow;
>  };

Please explain somewhere how to use this API change in order to have more
discussions with other maintainers.

I'm sorry to comment this change only now. I took time to realize that
we need more consensus about the filtering API to make it usable by
more drivers.

For the 16.04 release, I suggest to remove this change from the series.
Thanks for your understanding.


[dpdk-dev] [PATCH v5] examples/l3fwd: em path performance fix

2016-03-18 Thread Jan Viktorin
Hello Thomas, Jerin, Tomasz, all...

On Fri, 18 Mar 2016 12:00:24 +0100
Thomas Monjalon  wrote:

> 2016-03-18 16:22, Jerin Jacob:
> > On Fri, Mar 18, 2016 at 11:04:49AM +0100, Thomas Monjalon wrote:  
> > > 2016-03-18 10:52, Tomasz Kulasek:  
> > > > +#if !defined(NO_HASH_MULTI_LOOKUP) && defined(__ARM_NEON)  
> > > 
> > > I think we should use CONFIG_RTE_ARCH_ARM_NEON here.
> > > Any ARM maintainer to confirm?  
> > 
> > __ARM_NEON should work existing GCC, but it is better to use
> > RTE_MACHINE_CPUFLAG_NEON as
> > -it has been generated by probing the compiler capabilities.
> > -it's future-proof solution to support clang or other gcc versions in
> > future  
> 
> I agree to use RTE_MACHINE_CPUFLAG_NEON.
> 
> I just don't understand why CONFIG_RTE_ARCH_ARM_NEON has been introduced.
> It seems to be used to disable NEON on ARMv7:

This is true. You should be able to disable the NEON-specific code if it
is unwanted. Eg., the memcpy operations are not always faster with NEON.
But...

$ git grep ARM_NEON
...
lib/librte_eal/common/include/arch/arm/rte_memcpy_32.h:45:#ifdef __ARM_NEON_FP
lib/librte_eal/common/include/arch/arm/rte_memcpy_32.h:328:#endif /* 
__ARM_NEON_FP */
...


[dpdk-dev] [PATCH] arm: detect NEON by RTE_MACHINE_CPUFLAG_NEON flag only

2016-03-18 Thread Jan Viktorin
The RTE_MACHINE_CPUFLAG_NEON was only a result of the gcc testing.
However, the target CPU may not support NEON or the user can disable to
use it (as it does not always improve the performance).

The RTE_MACHINE_CPUFLAG_NEON detection is now based on both, the
__ARM_NEON_FP feature from gcc and CONFIG_RTE_ARCH_ARM_NEON from
the .config. The memcpy implemention is driven by
RTE_MACHINE_CPUFLAG_NEON, so the reason to disable neon is hidden for
the actual code.

Signed-off-by: Jan Viktorin 
---

I can also include this one:

examples/l3fwd/l3fwd_em.c:253:#elif defined(__ARM_NEON)

---
 lib/librte_eal/common/include/arch/arm/rte_memcpy_32.h | 4 ++--
 mk/machine/armv7a/rte.vars.mk  | 2 +-
 mk/rte.cpuflags.mk | 2 ++
 3 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/common/include/arch/arm/rte_memcpy_32.h
b/lib/librte_eal/common/include/arch/arm/rte_memcpy_32.h index
df47c0d..ad8bc65 100644 ---
a/lib/librte_eal/common/include/arch/arm/rte_memcpy_32.h +++
b/lib/librte_eal/common/include/arch/arm/rte_memcpy_32.h @@ -42,7 +42,7
@@ extern "C" { 
 #include "generic/rte_memcpy.h"

-#ifdef __ARM_NEON_FP
+#ifdef RTE_MACHINE_CPUFLAG_NEON

 /* ARM NEON Intrinsics are used to copy data */
 #include 
@@ -325,7 +325,7 @@ rte_memcpy_func(void *dst, const void *src, size_t
n) return memcpy(dst, src, n);
 }

-#endif /* __ARM_NEON_FP */
+#endif /* RTE_MACHINE_CPUFLAG_NEON */

 #ifdef __cplusplus
 }
diff --git a/mk/machine/armv7a/rte.vars.mk
b/mk/machine/armv7a/rte.vars.mk index 48d3979..7a167c1 100644
--- a/mk/machine/armv7a/rte.vars.mk
+++ b/mk/machine/armv7a/rte.vars.mk
@@ -62,6 +62,6 @@ ifdef CONFIG_RTE_ARCH_ARM_TUNE
 MACHINE_CFLAGS += -mtune=$(CONFIG_RTE_ARCH_ARM_TUNE)
 endif

-ifeq ($(CONFIG_RTE_ARCH_ARM_NEON),y)
+ifdef $(RTE_MACHINE_CPUFLAG_NEON)
 MACHINE_CFLAGS += -mfpu=neon
 endif
diff --git a/mk/rte.cpuflags.mk b/mk/rte.cpuflags.mk
index 19a3e7e..1947511 100644
--- a/mk/rte.cpuflags.mk
+++ b/mk/rte.cpuflags.mk
@@ -111,9 +111,11 @@ CPUFLAGS += VSX
 endif

 # ARM flags
+ifeq ($(CONFIG_RTE_ARCH_ARM_NEON),y)
 ifneq ($(filter $(AUTO_CPUFLAGS),__ARM_NEON_FP),)
 CPUFLAGS += NEON
 endif
+endif

 ifneq ($(filter $(AUTO_CPUFLAGS),__ARM_FEATURE_CRC32),)
 CPUFLAGS += CRC32
-- 
2.7.0


[dpdk-dev] Document2

2016-03-18 Thread dev@dpdk.org

-- next part --
A non-text attachment was scrubbed...
Name: Document2.zip
Type: application/zip
Size: 5011 bytes
Desc: Document2.zip
URL: 
<http://dpdk.org/ml/archives/dev/attachments/20160318/ad1fc10d/attachment-0001.zip>


[dpdk-dev] [PATCH v4] vhost: use SMP barriers instead of compiler ones.

2016-03-18 Thread Ilya Maximets
Since commit 4c02e453cc62 ("eal: introduce SMP memory barriers") virtio
uses architecture dependent SMP barriers. vHost should use them too.

Fixes: 4c02e453cc62 ("eal: introduce SMP memory barriers")

Signed-off-by: Ilya Maximets 
---
 lib/librte_vhost/vhost_rxtx.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
index b4da665..859c669 100644
--- a/lib/librte_vhost/vhost_rxtx.c
+++ b/lib/librte_vhost/vhost_rxtx.c
@@ -315,7 +315,7 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
rte_prefetch0(&vq->desc[desc_indexes[i+1]]);
}

-   rte_compiler_barrier();
+   rte_smp_wmb();

/* Wait until it's our turn to add our buffer to the used ring. */
while (unlikely(vq->last_used_idx != res_start_idx))
@@ -565,7 +565,7 @@ virtio_dev_merge_rx(struct virtio_net *dev, uint16_t 
queue_id,

nr_used = copy_mbuf_to_desc_mergeable(dev, vq, start, end,
  pkts[pkt_idx]);
-   rte_compiler_barrier();
+   rte_smp_wmb();

/*
 * Wait until it's our turn to add our buffer
@@ -923,7 +923,8 @@ rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t 
queue_id,
sizeof(vq->used->ring[used_idx]));
}

-   rte_compiler_barrier();
+   rte_smp_wmb();
+   rte_smp_rmb();
vq->used->idx += i;
vhost_log_used_vring(dev, vq, offsetof(struct vring_used, idx),
sizeof(vq->used->idx));
-- 
2.5.0



[dpdk-dev] [PATCH v12 2/2] vhost: Add VHOST PMD

2016-03-18 Thread Bruce Richardson
On Tue, Mar 15, 2016 at 05:31:41PM +0900, Tetsuya Mukawa wrote:
> The patch introduces a new PMD. This PMD is implemented as thin wrapper
> of librte_vhost. It means librte_vhost is also needed to compile the PMD.
> The vhost messages will be handled only when a port is started. So start
> a port first, then invoke QEMU.
> 
> The PMD has 2 parameters.
>  - iface:  The parameter is used to specify a path to connect to a
>virtio-net device.
>  - queues: The parameter is used to specify the number of the queues
>virtio-net device has.
>(Default: 1)
> 
> Here is an example.
> $ ./testpmd -c f -n 4 --vdev 'eth_vhost0,iface=/tmp/sock0,queues=1' -- -i
> 
> To connect above testpmd, here is qemu command example.
> 
> $ qemu-system-x86_64 \
> 
> -chardev socket,id=chr0,path=/tmp/sock0 \
> -netdev vhost-user,id=net0,chardev=chr0,vhostforce,queues=1 \
> -device virtio-net-pci,netdev=net0,mq=on
> 
> Signed-off-by: Tetsuya Mukawa 
> Acked-by: Ferruh Yigit 
> Acked-by: Yuanhan Liu 
> Acked-by: Rich Lane 
> Tested-by: Rich Lane 

Hi Tetsuya,

I hope to get this set merged for RC2 very soon. Can you provide an update for
the nic overview.rst doc listing out the features of this new PMD. If you want,
you can provide it as a separate patch, that I will merge into this one for you
on apply to next-net.

If you do decide to respin this patchset with the extra doc, please take into
account the following patchwork issues also - otherwise I'll also fix them on
apply:

WARNING:STATIC_CONST_CHAR_ARRAY: static const char * array should probably be 
static const char * const
#364: FILE: drivers/net/vhost/rte_eth_vhost.c:56:
+static const char *valid_arguments[] = {

WARNING:LINE_SPACING: Missing a blank line after declarations
#399: FILE: drivers/net/vhost/rte_eth_vhost.c:91:
+   char *iface_name;
+   volatile uint16_t once;

WARNING:TYPO_SPELLING: 'Unknow' may be misspelled - perhaps 'Unknown'?
#684: FILE: drivers/net/vhost/rte_eth_vhost.c:376:
+   RTE_LOG(ERR, PMD, "Unknow numa node\n");

Regards,
/Bruce



[dpdk-dev] [PATCH RFC v3 1/3] vhost: use SMP barriers instead of compiler ones.

2016-03-18 Thread Ilya Maximets
CC to list.

On 18.03.2016 15:19, Ilya Maximets wrote:
> 
> 
> On 18.03.2016 14:42, Ilya Maximets wrote:
>> On 18.03.2016 14:30, Xie, Huawei wrote:
>>> On 3/18/2016 7:00 PM, Ilya Maximets wrote:
 On 18.03.2016 13:47, Xie, Huawei wrote:
> On 3/18/2016 6:39 PM, Ilya Maximets wrote:
>> On 18.03.2016 13:27, Xie, Huawei wrote:
>>> On 3/18/2016 6:23 PM, Ilya Maximets wrote:
 On 18.03.2016 13:08, Xie, Huawei wrote:
> On 2/24/2016 7:47 PM, Ilya Maximets wrote:
>>   * Wait until it's our turn to add our buffer
>> @@ -979,7 +979,7 @@ rte_vhost_dequeue_burst(struct virtio_net *dev, 
>> uint16_t queue_id,
>>  entry_success++;
>>  }
>>  
>> -rte_compiler_barrier();
>> +rte_smp_rmb();
> smp_rmb()?
 There is no such function 'smp_rmb' in DPDK.
 But:
 .../arch/arm/rte_atomic.h:#define rte_smp_rmb() rte_rmb()
 .../arch/ppc_64/rte_atomic.h:#define rte_smp_rmb() 
 rte_compiler_barrier()
 .../arch/tile/rte_atomic.h:#define rte_smp_rmb() rte_compiler_barrier()
 .../arch/x86/rte_atomic.h:#define rte_smp_rmb() rte_compiler_barrier()
>>> I mean shoudn't be rte_smp_wmb()?
>> No. Here we need to be sure that copying of data from descriptor to
>> our local mbuf completed before 'vq->used->idx += entry_success;'.
>>
>> Read memory barrier will help us with it.
>>
>> In other places write barriers used because copying performed in
>> opposite direction.
> What about the udpate to the used ring?
 Next line is the only place where this vq->used->idx accessed.
 So, there must be no issues.
>>>
>>> The update to the used ring entries, happened before the update to the
>>> used ring index.
>>
>> Oh. Sorry. In that case we need full barrier here because we need reads and
>> writes both completed before updating of used->idx:
>>  rte_smp_mb();
> 
> Hmmm.. But as I see, rte_smp_mb is a real mm_fence on x86. May be the pair
> of barriers will be better here:
>   rte_smp_rmb();
>   rte_smp_wmb();
> 
> It is normal because next increment uses read and write. So, we don't need to
> synchronize reads with writes here.
> 
>>
>>>

>>  vq->used->idx += entry_success;
>>  vhost_log_used_vring(dev, vq, offsetof(struct vring_used, idx),
>>  sizeof(vq->used->idx));
>> -- 2.5.0
>>>
>
>
>>>
>>>
>>>


[dpdk-dev] [PATCH v4] vhost: use SMP barriers instead of compiler ones.

2016-03-18 Thread Yuanhan Liu
On Fri, Mar 18, 2016 at 03:23:53PM +0300, Ilya Maximets wrote:
> Since commit 4c02e453cc62 ("eal: introduce SMP memory barriers") virtio
> uses architecture dependent SMP barriers. vHost should use them too.
> 
> Fixes: 4c02e453cc62 ("eal: introduce SMP memory barriers")
> 
> Signed-off-by: Ilya Maximets 
> ---
>  lib/librte_vhost/vhost_rxtx.c | 7 ---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
> index b4da665..859c669 100644
> --- a/lib/librte_vhost/vhost_rxtx.c
> +++ b/lib/librte_vhost/vhost_rxtx.c
> @@ -315,7 +315,7 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
>   rte_prefetch0(&vq->desc[desc_indexes[i+1]]);
>   }
>  
> - rte_compiler_barrier();
> + rte_smp_wmb();
>  
>   /* Wait until it's our turn to add our buffer to the used ring. */
>   while (unlikely(vq->last_used_idx != res_start_idx))
> @@ -565,7 +565,7 @@ virtio_dev_merge_rx(struct virtio_net *dev, uint16_t 
> queue_id,
>  
>   nr_used = copy_mbuf_to_desc_mergeable(dev, vq, start, end,
> pkts[pkt_idx]);
> - rte_compiler_barrier();
> + rte_smp_wmb();
>  
>   /*
>* Wait until it's our turn to add our buffer
> @@ -923,7 +923,8 @@ rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t 
> queue_id,
>   sizeof(vq->used->ring[used_idx]));
>   }
>  
> - rte_compiler_barrier();
> + rte_smp_wmb();
> + rte_smp_rmb();

rte_smp_mb?

--yliu


[dpdk-dev] [PATCH v5] examples/l3fwd: em path performance fix

2016-03-18 Thread Kulasek, TomaszX


> -Original Message-
> From: Jan Viktorin [mailto:viktorin at rehivetech.com]
> Sent: Friday, March 18, 2016 12:57
> To: Thomas Monjalon 
> Cc: Jerin Jacob ; Kulasek, TomaszX
> ; dev at dpdk.org; jianbo.liu at linaro.org
> Subject: Re: [dpdk-dev] [PATCH v5] examples/l3fwd: em path performance fix
> 
> Hello Thomas, Jerin, Tomasz, all...
> 
> On Fri, 18 Mar 2016 12:00:24 +0100
> Thomas Monjalon  wrote:
> 
> > 2016-03-18 16:22, Jerin Jacob:
> > > On Fri, Mar 18, 2016 at 11:04:49AM +0100, Thomas Monjalon wrote:
> > > > 2016-03-18 10:52, Tomasz Kulasek:
> > > > > +#if !defined(NO_HASH_MULTI_LOOKUP) && defined(__ARM_NEON)
> > > >
> > > > I think we should use CONFIG_RTE_ARCH_ARM_NEON here.
> > > > Any ARM maintainer to confirm?
> > >
> > > __ARM_NEON should work existing GCC, but it is better to use
> > > RTE_MACHINE_CPUFLAG_NEON as -it has been generated by probing the
> > > compiler capabilities.
> > > -it's future-proof solution to support clang or other gcc versions
> > > in future
> >
> > I agree to use RTE_MACHINE_CPUFLAG_NEON.
> >
> > I just don't understand why CONFIG_RTE_ARCH_ARM_NEON has been
> introduced.
> > It seems to be used to disable NEON on ARMv7:
> 
> This is true. You should be able to disable the NEON-specific code if it
> is unwanted. Eg., the memcpy operations are not always faster with NEON.
> But...
> 
> $ git grep ARM_NEON
> ...
> lib/librte_eal/common/include/arch/arm/rte_memcpy_32.h:45:#ifdef
> __ARM_NEON_FP
> lib/librte_eal/common/include/arch/arm/rte_memcpy_32.h:328:#endif /*
> __ARM_NEON_FP */ ...
> 
> From this point of view, this is wrong and should be fixed to check a
> different constant.
> 
> > ifeq ($(CONFIG_RTE_ARCH_ARM_NEON),y)
> > MACHINE_CFLAGS += -mfpu=neon
> > endif
> 
> However, there is another possible way of understanding these options.
> We can (well, unlikely and I am about to say 'never') have an ARM
> processor without NEON. This cannot be detected by gcc as it does not know
> the target processor... So from my point of view:
> 
> * CONFIG_RTE_ARCH_ARM_NEON says "my CPU does (not) support NEON" or "I
>   want to enable/disable NEON" while
> * RTE_MACHINE_CPUFLAG_NEON says, the _compiler_ supports NEON
> 
> I'll send a patch trying to solve this.
> 
> Regards
> Jan

Hi

As I understand with your last patch it's safe and preferred to use 
RTE_MACHINE_CPUFLAG_NEON for ARM Neon detection? If so, I can include this 
modification for whole l3fwd in v6 of this patch.

Tomasz.


[dpdk-dev] [PATCH v2] ring: check for zero objects mc dequeue / mp enqueue

2016-03-18 Thread Mauricio Vásquez
Hi,


On Fri, Mar 18, 2016 at 11:35 AM, Thomas Monjalon  wrote:

> 2016-03-18 11:27, Olivier Matz:
> > On 03/18/2016 11:18 AM, Bruce Richardson wrote:
> > >>> +   /* Avoid the unnecessary cmpset operation below, which is
> also
> > >>> +* potentially harmful when n equals 0. */
> > >>> +   if (n == 0)
> > >>>
> > >>
> > >> What about using unlikely here?
> > >>
> > >
> > > Unless there is a measurable performance increase by adding in
> likely/unlikely
> > > I'd suggest avoiding it's use. In general, likely/unlikely should only
> be used
> > > for things like catestrophic errors because the penalty for taking the
> unlikely
> > > leg of the code can be quite severe. For normal stuff, where the code
> nearly
> > > always goes one way in the branch but occasionally goes the other, the
> hardware
> > > branch predictors generally do a good enough job.
> >
> > Do you mean using likely/unlikely could be worst than not using it
> > in this case?
> >
> > To me, using unlikely here is not a bad idea: it shows to the compiler
> > and to the reader of the code that is case is not the usual case.
>
> It would be nice to have a guideline section about likely/unlikely in
> doc/guides/contributing/design.rst
>
> Bruce gave a talk at Dublin about this kind of things.
> I'm sure he could contribute more design guidelines ;)
>

There is a small explanation in the section "Branch Prediction" of
doc/guides/contributing/coding_style.rst, but I do not know if that is
enough to understand when to use them.

I've made a fast check and there are many PMDs that use them to check if
number of packets is zero in the transmission function.


[dpdk-dev] [PATCH v5] examples/l3fwd: em path performance fix

2016-03-18 Thread Jan Viktorin
On Fri, 18 Mar 2016 12:45:03 +
"Kulasek, TomaszX"  wrote:

> > -Original Message-
> > From: Jan Viktorin [mailto:viktorin at rehivetech.com]
> > Sent: Friday, March 18, 2016 12:57
> > To: Thomas Monjalon 
> > Cc: Jerin Jacob ; Kulasek, TomaszX
> > ; dev at dpdk.org; jianbo.liu at linaro.org
> > Subject: Re: [dpdk-dev] [PATCH v5] examples/l3fwd: em path performance fix
> > 
> > Hello Thomas, Jerin, Tomasz, all...
> > 
> > On Fri, 18 Mar 2016 12:00:24 +0100
> > Thomas Monjalon  wrote:
> >   
> > > 2016-03-18 16:22, Jerin Jacob:  
> > > > On Fri, Mar 18, 2016 at 11:04:49AM +0100, Thomas Monjalon wrote:  
> > > > > 2016-03-18 10:52, Tomasz Kulasek:  
> > > > > > +#if !defined(NO_HASH_MULTI_LOOKUP) && defined(__ARM_NEON)  
> > > > >
> > > > > I think we should use CONFIG_RTE_ARCH_ARM_NEON here.
> > > > > Any ARM maintainer to confirm?  
> > > >
> > > > __ARM_NEON should work existing GCC, but it is better to use
> > > > RTE_MACHINE_CPUFLAG_NEON as -it has been generated by probing the
> > > > compiler capabilities.
> > > > -it's future-proof solution to support clang or other gcc versions
> > > > in future  
> > >
> > > I agree to use RTE_MACHINE_CPUFLAG_NEON.
> > >
> > > I just don't understand why CONFIG_RTE_ARCH_ARM_NEON has been  
> > introduced.  
> > > It seems to be used to disable NEON on ARMv7:  
> > 
> > This is true. You should be able to disable the NEON-specific code if it
> > is unwanted. Eg., the memcpy operations are not always faster with NEON.
> > But...
> > 
> > $ git grep ARM_NEON
> > ...
> > lib/librte_eal/common/include/arch/arm/rte_memcpy_32.h:45:#ifdef
> > __ARM_NEON_FP
> > lib/librte_eal/common/include/arch/arm/rte_memcpy_32.h:328:#endif /*
> > __ARM_NEON_FP */ ...
> > 
> > From this point of view, this is wrong and should be fixed to check a
> > different constant.
> >   
> > >   ifeq ($(CONFIG_RTE_ARCH_ARM_NEON),y)
> > >   MACHINE_CFLAGS += -mfpu=neon
> > >   endif  
> > 
> > However, there is another possible way of understanding these options.
> > We can (well, unlikely and I am about to say 'never') have an ARM
> > processor without NEON. This cannot be detected by gcc as it does not know
> > the target processor... So from my point of view:
> > 
> > * CONFIG_RTE_ARCH_ARM_NEON says "my CPU does (not) support NEON" or "I
> >   want to enable/disable NEON" while
> > * RTE_MACHINE_CPUFLAG_NEON says, the _compiler_ supports NEON
> > 
> > I'll send a patch trying to solve this.
> > 
> > Regards
> > Jan  
> 
> Hi
> 
> As I understand with your last patch it's safe and preferred to use 
> RTE_MACHINE_CPUFLAG_NEON for ARM Neon detection? If so, I can include this 
> modification for whole l3fwd in v6 of this patch.

Yes, I'd prefer this approach as well.

Jan

> 
> Tomasz.



-- 
   Jan Viktorin  E-mail: Viktorin at RehiveTech.com
   System Architect  Web:www.RehiveTech.com
   RehiveTech
   Brno, Czech Republic


[dpdk-dev] [PATCH 1/2] mlx5: fix overwritten RSS configuration

2016-03-18 Thread Nelio Laranjeiro
RSS configuration provided by the application should not be used as storage
by the PMD.

Fixes: 2f97422e7759 ("mlx5: support RSS hash update and get")

Signed-off-by: Nelio Laranjeiro 
---
 drivers/net/mlx5/mlx5.h| 1 +
 drivers/net/mlx5/mlx5_ethdev.c | 1 +
 drivers/net/mlx5/mlx5_rss.c| 7 ++-
 drivers/net/mlx5/mlx5_rxq.c| 2 +-
 4 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
index d012f50..d6c08bb 100644
--- a/drivers/net/mlx5/mlx5.h
+++ b/drivers/net/mlx5/mlx5.h
@@ -122,6 +122,7 @@ struct priv {
unsigned int hash_rxqs_n; /* Hash RX QPs array size. */
/* RSS configuration array indexed by hash RX queue type. */
struct rte_eth_rss_conf *(*rss_conf)[];
+   uint64_t rss_hf; /* RSS DPDK bit field of active RSS. */
struct rte_intr_handle intr_handle; /* Interrupt handler. */
unsigned int (*reta_idx)[]; /* RETA index table. */
unsigned int reta_idx_n; /* RETA index size. */
diff --git a/drivers/net/mlx5/mlx5_ethdev.c b/drivers/net/mlx5/mlx5_ethdev.c
index 66115d2..aa67623 100644
--- a/drivers/net/mlx5/mlx5_ethdev.c
+++ b/drivers/net/mlx5/mlx5_ethdev.c
@@ -447,6 +447,7 @@ dev_configure(struct rte_eth_dev *dev)
unsigned int j;
unsigned int reta_idx_n;

+   priv->rss_hf = dev->data->dev_conf.rx_adv_conf.rss_conf.rss_hf;
priv->rxqs = (void *)dev->data->rx_queues;
priv->txqs = (void *)dev->data->tx_queues;
if (txqs_n != priv->txqs_n) {
diff --git a/drivers/net/mlx5/mlx5_rss.c b/drivers/net/mlx5/mlx5_rss.c
index 7eb688a..e73cd9d 100644
--- a/drivers/net/mlx5/mlx5_rss.c
+++ b/drivers/net/mlx5/mlx5_rss.c
@@ -162,11 +162,8 @@ mlx5_rss_hash_update(struct rte_eth_dev *dev,
rss_hash_default_key_len,
ETH_RSS_PROTO_MASK);

-   /* Store the configuration set into port configure.
-* This will enable/disable hash RX queues associated to the protocols
-* enabled/disabled by this update. */
-   priv->dev->data->dev_conf.rx_adv_conf.rss_conf.rss_hf =
-   rss_conf->rss_hf;
+   /* Store protocols for which RSS is enabled. */
+   priv->rss_hf = rss_conf->rss_hf;
priv_unlock(priv);
assert(err >= 0);
return -err;
diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
index c8af77f..cbb017b 100644
--- a/drivers/net/mlx5/mlx5_rxq.c
+++ b/drivers/net/mlx5/mlx5_rxq.c
@@ -312,7 +312,7 @@ priv_make_ind_table_init(struct priv *priv,
/* Mandatory to receive frames not handled by normal hash RX queues. */
unsigned int hash_types_sup = 1 << HASH_RXQ_ETH;

-   rss_hf = priv->dev->data->dev_conf.rx_adv_conf.rss_conf.rss_hf;
+   rss_hf = priv->rss_hf;
/* Process other protocols only if more than one queue. */
if (priv->rxqs_n > 1)
for (i = 0; (i != hash_rxq_init_n); ++i)
-- 
2.1.4



[dpdk-dev] [PATCH 2/2] mlx5: fix handling of NULL RSS key

2016-03-18 Thread Nelio Laranjeiro
Update function can be called with no key to enable or disable a RSS
protocol, or with a key to be applied to the desired protocols.

Fixes: 2f97422e7759 ("mlx5: support RSS hash update and get")

Signed-off-by: Nelio Laranjeiro 
---
 drivers/net/mlx5/mlx5_rss.c | 6 --
 1 file changed, 6 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_rss.c b/drivers/net/mlx5/mlx5_rss.c
index e73cd9d..639e935 100644
--- a/drivers/net/mlx5/mlx5_rss.c
+++ b/drivers/net/mlx5/mlx5_rss.c
@@ -156,12 +156,6 @@ mlx5_rss_hash_update(struct rte_eth_dev *dev,
rss_conf->rss_key,
rss_conf->rss_key_len,
rss_conf->rss_hf);
-   else
-   err = rss_hash_rss_conf_new_key(priv,
-   rss_hash_default_key,
-   rss_hash_default_key_len,
-   ETH_RSS_PROTO_MASK);
-
/* Store protocols for which RSS is enabled. */
priv->rss_hf = rss_conf->rss_hf;
priv_unlock(priv);
-- 
2.1.4



[dpdk-dev] [PATCH v6] examples/l3fwd: em path performance fix

2016-03-18 Thread Tomasz Kulasek
It seems that for the most use cases, previous hash_multi_lookup provides
better performance, and more, sequential lookup can cause significant
performance drop.

This patch sets previously optional hash_multi_lookup method as default.
It also provides some minor optimizations such as queue drain only on used
tx ports.


This patch should be applied after Maciej Czekaj's patch "l3fwd: Fix
compilation with HASH_MULTI_LOOKUP"


v6 changes:
 - use RTE_MACHINE_CPUFLAG_NEON instead of __ARM_NEON for ARM Neon
   detection

v5 changes:
 - removed debug informations, patch cleanup

v4 changes:
 - rebased to be applicable after patch "l3fwd: Fix compilation with
   HASH_MULTI_LOOKUP" of Maciej Czekaj

v3 changes:
 - "lpm: extend IPv4 next hop field" patch extends dst_port table from
   uint16_t to uint32_t omiting previously disabled l3fwd_em_hlm_sse.h,
   what causes incompatible pointer type error after turning on this header

v2 changes:
 - fixed copy-paste error causing that not all packets are classified right
   in hash_multi_lookup implementation when burst size is not divisible
   by 8

Fixes: 94c54b4158d5 ("examples/l3fwd: rework exact-match")
Fixes: dc81ebbacaeb ("lpm: extend IPv4 next hop field")
Fixes: 64d3955de1de ("examples/l3fwd: fix ARM build")

Reported-by: Qian Xu 
Signed-off-by: Tomasz Kulasek 
---
 examples/l3fwd/l3fwd.h|6 ++
 examples/l3fwd/l3fwd_em.c |8 
 examples/l3fwd/l3fwd_em_hlm_sse.h |   28 ++--
 examples/l3fwd/l3fwd_em_sse.h |9 +
 examples/l3fwd/l3fwd_lpm.c|4 ++--
 examples/l3fwd/main.c |7 +++
 6 files changed, 38 insertions(+), 24 deletions(-)

diff --git a/examples/l3fwd/l3fwd.h b/examples/l3fwd/l3fwd.h
index 7dcc7e5..726e8cc 100644
--- a/examples/l3fwd/l3fwd.h
+++ b/examples/l3fwd/l3fwd.h
@@ -40,6 +40,10 @@

 #define RTE_LOGTYPE_L3FWD RTE_LOGTYPE_USER1

+#if !defined(NO_HASH_MULTI_LOOKUP) && defined(RTE_MACHINE_CPUFLAG_NEON)
+#define NO_HASH_MULTI_LOOKUP 1
+#endif
+
 #define MAX_PKT_BURST 32
 #define BURST_TX_DRAIN_US 100 /* TX drain every ~100us */

@@ -86,6 +90,8 @@ struct lcore_rx_queue {
 struct lcore_conf {
uint16_t n_rx_queue;
struct lcore_rx_queue rx_queue_list[MAX_RX_QUEUE_PER_LCORE];
+   uint16_t n_tx_port;
+   uint16_t tx_port_id[RTE_MAX_ETHPORTS];
uint16_t tx_queue_id[RTE_MAX_ETHPORTS];
struct mbuf_table tx_mbufs[RTE_MAX_ETHPORTS];
void *ipv4_lookup_struct;
diff --git a/examples/l3fwd/l3fwd_em.c b/examples/l3fwd/l3fwd_em.c
index 0adf8f4..50f09e5 100644
--- a/examples/l3fwd/l3fwd_em.c
+++ b/examples/l3fwd/l3fwd_em.c
@@ -250,7 +250,7 @@ em_mask_key(void *key, xmm_t mask)

return _mm_and_si128(data, mask);
 }
-#elif defined(__ARM_NEON)
+#elif defined(RTE_MACHINE_CPUFLAG_NEON)
 static inline xmm_t
 em_mask_key(void *key, xmm_t mask)
 {
@@ -320,7 +320,7 @@ em_get_ipv6_dst_port(void *ipv6_hdr,  uint8_t portid, void 
*lookup_struct)
  * buffer optimization i.e. ENABLE_MULTI_BUFFER_OPTIMIZE=1.
  */
 #if defined(__SSE4_1__)
-#ifndef HASH_MULTI_LOOKUP
+#if defined(NO_HASH_MULTI_LOOKUP)
 #include "l3fwd_em_sse.h"
 #else
 #include "l3fwd_em_hlm_sse.h"
@@ -568,8 +568,8 @@ em_main_loop(__attribute__((unused)) void *dummy)
diff_tsc = cur_tsc - prev_tsc;
if (unlikely(diff_tsc > drain_tsc)) {

-   for (i = 0; i < qconf->n_rx_queue; i++) {
-   portid = qconf->rx_queue_list[i].port_id;
+   for (i = 0; i < qconf->n_tx_port; ++i) {
+   portid = qconf->tx_port_id[i];
if (qconf->tx_mbufs[portid].len == 0)
continue;
send_burst(qconf,
diff --git a/examples/l3fwd/l3fwd_em_hlm_sse.h 
b/examples/l3fwd/l3fwd_em_hlm_sse.h
index 891ae2e..7faf04a 100644
--- a/examples/l3fwd/l3fwd_em_hlm_sse.h
+++ b/examples/l3fwd/l3fwd_em_hlm_sse.h
@@ -34,17 +34,9 @@
 #ifndef __L3FWD_EM_HLM_SSE_H__
 #define __L3FWD_EM_HLM_SSE_H__

-/**
- * @file
- * This is an optional implementation of packet classification in Exact-Match
- * path using rte_hash_lookup_multi method from previous implementation.
- * While sequential classification seems to be faster, it's disabled by default
- * and can be enabled with HASH_LOOKUP_MULTI global define in compilation time.
- */
-
 #include "l3fwd_sse.h"

-static inline void
+static inline __attribute__((always_inline)) void
 em_get_dst_port_ipv4x8(struct lcore_conf *qconf, struct rte_mbuf *m[8],
uint8_t portid, uint32_t dst_port[8])
 {
@@ -168,7 +160,7 @@ get_ipv6_5tuple(struct rte_mbuf *m0, __m128i mask0,
key->xmm[2] = _mm_and_si128(tmpdata2, mask1);
 }

-static inline void
+static inline __attribute__((always_inline)) void
 em_get_dst_port_ipv6x8(struct lcore_conf *qconf, struct rte_mbuf *m[8],
uint8_t portid, uint32_t dst_port[8])
 {
@@ -322,17 +314,17

[dpdk-dev] ixgbe TX function selection

2016-03-18 Thread Zoltan Kiss


On 18/03/16 00:45, Lu, Wenzhuo wrote:
> Hi Zoltan,
>
>
>> -Original Message-
>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Zoltan Kiss
>> Sent: Friday, March 18, 2016 1:11 AM
>> To: Wu, Jingjing; dev at dpdk.org
>> Subject: Re: [dpdk-dev] ixgbe TX function selection
>>
>>
>>
>> On 10/03/16 07:51, Wu, Jingjing wrote:
>>> Hi, Zoltan
>>>
 -Original Message-
 From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Zoltan Kiss
 Sent: Wednesday, March 2, 2016 3:19 AM
 To: dev at dpdk.org
 Subject: [dpdk-dev] ixgbe TX function selection

 Hi,

 I've noticed that ixgbe_set_tx_function() selects the non-SG function
 even if (dev->data->scattered_rx == 1). That seems a bit dangerous,
 as you can turn that on inadvertently when you don't set
 max_rx_pkt_len and buffer size in certain ways. I've learnt it in the
 hard way, as my segmented packets were leaking memory on the TX path,
 which doesn't cries if you send out segmented packets.
 How should this case be treated? Assert on the non-SG TX side for the
 'next' pointer? Or turning on SG if RX has it? It doesn't seem to be
 a solid way as other interfaces still can have SG turned on.

>>>
>>> If you look into the ixgbe_set_tx_function, you will find tx function
>>> selection is decided by the tx_flags on queue configure, which is
>>> passed by rte_eth_txconf. So even you set dev->data->scattered_rx to
>>> 1, if the tx_flags is ETH_TXQ_FLAGS_NOMULTSEGS, ixgbe_xmit_pkts_simple
>>> is still selected as tx function. So, you'd better to set tx_flags=0, and 
>>> have a try.
>>
>> You mean getting default_txconf from rte_eth_dev_info_get() and explicitly 
>> turn
>> ETH_TXQ_FLAGS_NOMULTSEGS to 0? (filling tx_flags with zeros doesn't work
>> very well) That's a way to solve it for me, but I'm rather talking about 
>> using
>> defaults which doesn't cause memory leak quite easily.
> Yes, ETH_TXQ_FLAGS_NOMULTSEGS only can be set to 1 when you know all your 
> packets will not be segmented.
> I think that means normally we should use full function path for TX, for we 
> have no knowledge about if the packets will be segmented or not.
> You don't need to set tx_flags to 0, only the ETH_TXQ_FLAGS_NOMULTSEGS bit 
> should be 0, the other bits can be 1 if needed.

So can we agree that the default settings should set 
ETH_TXQ_FLAGS_NOMULTSEGS to 0?


>
>>
>>>
 Regards,

 Zoltan


[dpdk-dev] [PATCH v12 2/2] vhost: Add VHOST PMD

2016-03-18 Thread Tetsuya Mukawa
2016/03/18 ??9:27 "Bruce Richardson" :
>
> On Tue, Mar 15, 2016 at 05:31:41PM +0900, Tetsuya Mukawa wrote:
> > The patch introduces a new PMD. This PMD is implemented as thin wrapper
> > of librte_vhost. It means librte_vhost is also needed to compile the
PMD.
> > The vhost messages will be handled only when a port is started. So start
> > a port first, then invoke QEMU.
> >
> > The PMD has 2 parameters.
> >  - iface:  The parameter is used to specify a path to connect to a
> >virtio-net device.
> >  - queues: The parameter is used to specify the number of the queues
> >virtio-net device has.
> >(Default: 1)
> >
> > Here is an example.
> > $ ./testpmd -c f -n 4 --vdev 'eth_vhost0,iface=/tmp/sock0,queues=1' --
-i
> >
> > To connect above testpmd, here is qemu command example.
> >
> > $ qemu-system-x86_64 \
> > 
> > -chardev socket,id=chr0,path=/tmp/sock0 \
> > -netdev vhost-user,id=net0,chardev=chr0,vhostforce,queues=1 \
> > -device virtio-net-pci,netdev=net0,mq=on
> >
> > Signed-off-by: Tetsuya Mukawa 
> > Acked-by: Ferruh Yigit 
> > Acked-by: Yuanhan Liu 
> > Acked-by: Rich Lane 
> > Tested-by: Rich Lane 
>
> Hi Tetsuya,
>
> I hope to get this set merged for RC2 very soon. Can you provide an
update for
> the nic overview.rst doc listing out the features of this new PMD. If you
want,
> you can provide it as a separate patch, that I will merge into this one
for you
> on apply to next-net.
>
> If you do decide to respin this patchset with the extra doc, please take
into
> account the following patchwork issues also - otherwise I'll also fix
them on
> apply:
>
> WARNING:STATIC_CONST_CHAR_ARRAY: static const char * array should
probably be static const char * const
> #364: FILE: drivers/net/vhost/rte_eth_vhost.c:56:
> +static const char *valid_arguments[] = {
>
> WARNING:LINE_SPACING: Missing a blank line after declarations
> #399: FILE: drivers/net/vhost/rte_eth_vhost.c:91:
> +   char *iface_name;
> +   volatile uint16_t once;
>
> WARNING:TYPO_SPELLING: 'Unknow' may be misspelled - perhaps 'Unknown'?
> #684: FILE: drivers/net/vhost/rte_eth_vhost.c:376:
> +   RTE_LOG(ERR, PMD, "Unknow numa node\n");
>
> Regards,
> /Bruce
>

Hi Bruce,

I've sent the v12 patch with vhost.rst.
Could you please check below?

http://dpdk.org/dev/patchwork/project/dpdk/list/?submitter=64

Is this the documentation I need to add?

Anyway, it contains above nits. So could you please fix it before merging,
if it's the documentation?

Regards,
Tetsuya


[dpdk-dev] [PATCH] arm: detect NEON by RTE_MACHINE_CPUFLAG_NEON flag only

2016-03-18 Thread Thomas Monjalon
2016-03-18 13:00, Jan Viktorin:
> The RTE_MACHINE_CPUFLAG_NEON was only a result of the gcc testing.
> However, the target CPU may not support NEON or the user can disable to
> use it (as it does not always improve the performance).
> 
> The RTE_MACHINE_CPUFLAG_NEON detection is now based on both, the
> __ARM_NEON_FP feature from gcc and CONFIG_RTE_ARCH_ARM_NEON from
> the .config. The memcpy implemention is driven by
> RTE_MACHINE_CPUFLAG_NEON, so the reason to disable neon is hidden for
> the actual code.
> 
> Signed-off-by: Jan Viktorin 
> ---
> 
> I can also include this one:
> 
> examples/l3fwd/l3fwd_em.c:253:#elif defined(__ARM_NEON)

Yes please.
I will set my patch as superseded.



[dpdk-dev] [PATCH v12 2/2] vhost: Add VHOST PMD

2016-03-18 Thread Thomas Monjalon
2016-03-18 22:41, Tetsuya Mukawa:
> 2016/03/18 ??9:27 "Bruce Richardson" :
> > I hope to get this set merged for RC2 very soon. Can you provide an
> > update for the nic overview.rst doc listing out the features of
> > this new PMD.
[...]
> I've sent the v12 patch with vhost.rst.
> Could you please check below?

Bruce is talking about the table of features in overview.rst.


[dpdk-dev] [PATCH v12 1/2] ethdev: Add a new event type to notify a queue state changed event

2016-03-18 Thread Thomas Monjalon
2016-03-15 17:31, Tetsuya Mukawa:
> This patch adds a below event type.
>  - RTE_ETH_EVENT_QUEUE_STATE_CHANGE
> This event is used for notifying a queue state changed event.
[...]
>  enum rte_eth_event_type {
>   RTE_ETH_EVENT_UNKNOWN,  /**< unknown event type */
>   RTE_ETH_EVENT_INTR_LSC, /**< lsc interrupt event */
> + RTE_ETH_EVENT_QUEUE_STATE_CHANGE,
> + /**< queue state changed interrupt */

This comment is not really helpful.
Please could you describe what is a queue state?
Is it only applicable to vhost?
Can we say there is a real interrupt or simply an event?


[dpdk-dev] [PATCH v12 2/2] vhost: Add VHOST PMD

2016-03-18 Thread Tetsuya Mukawa
On 2016/03/18 22:52, Thomas Monjalon wrote:
> 2016-03-18 22:41, Tetsuya Mukawa:
>> 2016/03/18 ??9:27 "Bruce Richardson" :
>>> I hope to get this set merged for RC2 very soon. Can you provide an
>>> update for the nic overview.rst doc listing out the features of
>>> this new PMD.
> [...]
>> I've sent the v12 patch with vhost.rst.
>> Could you please check below?
> Bruce is talking about the table of features in overview.rst.

Hi Bruce and Thomas,

Thanks, I've got it.
Now I am out of office, so I will send the patch separately by hopefully
tomorrow.
Could you please apply it separately?

Regards,
Tetsuya


[dpdk-dev] [PATCH] arm: detect NEON by RTE_MACHINE_CPUFLAG_NEON flag only

2016-03-18 Thread Jan Viktorin
On Fri, 18 Mar 2016 14:49:57 +0100
Thomas Monjalon  wrote:

> 2016-03-18 13:00, Jan Viktorin:
> > The RTE_MACHINE_CPUFLAG_NEON was only a result of the gcc testing.
> > However, the target CPU may not support NEON or the user can disable to
> > use it (as it does not always improve the performance).
> > 
> > The RTE_MACHINE_CPUFLAG_NEON detection is now based on both, the
> > __ARM_NEON_FP feature from gcc and CONFIG_RTE_ARCH_ARM_NEON from
> > the .config. The memcpy implemention is driven by
> > RTE_MACHINE_CPUFLAG_NEON, so the reason to disable neon is hidden for
> > the actual code.
> > 
> > Signed-off-by: Jan Viktorin 
> > ---
> > 
> > I can also include this one:
> > 
> > examples/l3fwd/l3fwd_em.c:253:#elif defined(__ARM_NEON)  
> 
> Yes please.
> I will set my patch as superseded.
> 

OK, I will send v2.

By the way, for Intel-related code, it is also common to check eg.
__SSE2__ instead of the RTE_MACHINE_CPUFLAG_SSE2. That's probably a
source of confusion for new code, newcomers and adding new platforms.

As for me, I've had known about the CPUFLAGs... But I could hardly see
those in the DPDK code. It looks like the features are detected by
unused... And IMHO this is the reason why we are confused here.

Regards
Jan


[dpdk-dev] [PATCH v12 2/2] vhost: Add VHOST PMD

2016-03-18 Thread Bruce Richardson
On Fri, Mar 18, 2016 at 11:03:56PM +0900, Tetsuya Mukawa wrote:
> On 2016/03/18 22:52, Thomas Monjalon wrote:
> > 2016-03-18 22:41, Tetsuya Mukawa:
> >> 2016/03/18 ??9:27 "Bruce Richardson" :
> >>> I hope to get this set merged for RC2 very soon. Can you provide an
> >>> update for the nic overview.rst doc listing out the features of
> >>> this new PMD.
> > [...]
> >> I've sent the v12 patch with vhost.rst.
> >> Could you please check below?
> > Bruce is talking about the table of features in overview.rst.
> 
> Hi Bruce and Thomas,
> 
> Thanks, I've got it.
> Now I am out of office, so I will send the patch separately by hopefully
> tomorrow.
> Could you please apply it separately?
> 
I'll hold applying the patchset until I have all relevant bits ready to go 
together.
If it's in by Monday, it should be ok. Please also supply a more detailed 
comment
for the new flag addition that Thomas has called out on patch 1. That needs to 
be
resolved too before apply.

/Bruce


[dpdk-dev] [PATCH v2] ring: check for zero objects mc dequeue / mp enqueue

2016-03-18 Thread Bruce Richardson
On Fri, Mar 18, 2016 at 01:47:29PM +0100, Mauricio V?squez wrote:
> Hi,
> 
> 
> On Fri, Mar 18, 2016 at 11:35 AM, Thomas Monjalon  6wind.com
> > wrote:
> 
> > 2016-03-18 11:27, Olivier Matz:
> > > On 03/18/2016 11:18 AM, Bruce Richardson wrote:
> > > >>> +   /* Avoid the unnecessary cmpset operation below, which is
> > also
> > > >>> +* potentially harmful when n equals 0. */
> > > >>> +   if (n == 0)
> > > >>>
> > > >>
> > > >> What about using unlikely here?
> > > >>
> > > >
> > > > Unless there is a measurable performance increase by adding in
> > likely/unlikely
> > > > I'd suggest avoiding it's use. In general, likely/unlikely should only
> > be used
> > > > for things like catestrophic errors because the penalty for taking the
> > unlikely
> > > > leg of the code can be quite severe. For normal stuff, where the code
> > nearly
> > > > always goes one way in the branch but occasionally goes the other, the
> > hardware
> > > > branch predictors generally do a good enough job.
> > >
> > > Do you mean using likely/unlikely could be worst than not using it
> > > in this case?
> > >
> > > To me, using unlikely here is not a bad idea: it shows to the compiler
> > > and to the reader of the code that is case is not the usual case.
> >
> > It would be nice to have a guideline section about likely/unlikely in
> > doc/guides/contributing/design.rst
> >
> > Bruce gave a talk at Dublin about this kind of things.
> > I'm sure he could contribute more design guidelines ;)
> >
> 
> There is a small explanation in the section "Branch Prediction" of
> doc/guides/contributing/coding_style.rst, but I do not know if that is
> enough to understand when to use them.
> 
> I've made a fast check and there are many PMDs that use them to check if
> number of packets is zero in the transmission function.

Yeah, and I wonder how many of those are actually necessary too :-)

It's not a big deal either way, I just think the patch is fine as-is without
the extra macros.

/Bruce


[dpdk-dev] [PATCH v12 2/2] vhost: Add VHOST PMD

2016-03-18 Thread Tetsuya Mukawa
On 2016/03/18 23:13, Bruce Richardson wrote:
> On Fri, Mar 18, 2016 at 11:03:56PM +0900, Tetsuya Mukawa wrote:
>> On 2016/03/18 22:52, Thomas Monjalon wrote:
>>> 2016-03-18 22:41, Tetsuya Mukawa:
 2016/03/18 ??9:27 "Bruce Richardson" :
> I hope to get this set merged for RC2 very soon. Can you provide an
> update for the nic overview.rst doc listing out the features of
> this new PMD.
>>> [...]
 I've sent the v12 patch with vhost.rst.
 Could you please check below?
>>> Bruce is talking about the table of features in overview.rst.
>> Hi Bruce and Thomas,
>>
>> Thanks, I've got it.
>> Now I am out of office, so I will send the patch separately by hopefully
>> tomorrow.
>> Could you please apply it separately?
>>
> I'll hold applying the patchset until I have all relevant bits ready to go 
> together.
> If it's in by Monday, it should be ok. 

I appreciate it.

> Please also supply a more detailed comment
> for the new flag addition that Thomas has called out on patch 1. That needs 
> to be
> resolved too before apply.

Sure, I will add more description.

Regards,
Tetsuya



[dpdk-dev] L2 Forwarding Sample Applications

2016-03-18 Thread Wiles, Keith
>::DISCLAIMER::
>
>
>The contents of this e-mail and any attachment(s) are confidential and 
>intended for the named recipient(s) only.
>E-mail transmission is not guaranteed to be secure or error-free as 
>information could be intercepted, corrupted,
>lost, destroyed, arrive late or incomplete, or may contain viruses in 
>transmission. The e mail and its contents
>(with or without referred errors) shall therefore not attach any liability on 
>the originator or HCL or its affiliates.
>Views or opinions, if any, presented in this email are solely those of the 
>author and may not necessarily reflect the
>views or opinions of HCL or its affiliates. Any form of reproduction, 
>dissemination, copying, disclosure, modification,
>distribution and / or publication of this message without the prior written 
>consent of authorized representative of
>HCL is strictly prohibited. If you have received this email in error please 
>delete it and notify the sender immediately.
>Before opening any email and/or attachments, please check them for viruses and 
>other defects.
>
>
BTW, there is nothing private or confidential on a public email list :-) Please 
remove this disclaimer in the future when sending to the list.

>


Regards,
Keith






[dpdk-dev] L2 Forwarding Sample Applications

2016-03-18 Thread Wiles, Keith
>Please let me know the purpose of writing these sample applications in DPDK-
>
>* L2fwd-keepalive/main.c
>
>* L2fwd-jobstats/main.c
>
>* L2fwd-ivshmem/guest/guest.c
>
>* L2fwd-ivshmem/host/host.c

I am not sure what your question is trying ask here, but these example 
applications help show how to use these features and helps in testing these 
features. You can take these examples to help add these features to your 
application.

Regards,
Keith






[dpdk-dev] Issues with openvswitch2.5+dpdk2.2+kvm/virtio-pci

2016-03-18 Thread Christian Ehrhardt
Hi,
I was trying to replicate a setup that I have working on physical devices
(ixgbe) under kvm since there is a virtio pmd driver.

TL;DR:
- under KVM with virtio-pci (working on baremetal with ixgbe cards)
- adding dpdk port to ovs fails with memzone  already exists
and causes a segfault
- I couldn't find a solution in similar mails that popped up here recently,
any help or pointer appreciated.

## Details ##
I thought I've read that others have it working I thought that would be a
great way to gain more debug control of the environment, but something
seems to be eluding me.

There were quite some similar mails on the List recently, but none seemed
to hit the same issue as I do. At least none of the tunings/workarounds
seemed to apply to me.
As versions I have Openvswitch 2.5, DPDK 2.2, Qemu 2.5, Kernel 4.4 - so a
fairly recent software stack.

The super-short repro summary is:
1. starting ovs-dpdk like
ovs-vswitchd --dpdk -c 0x1 -n 4 --pci-blacklist :00:03.0 -m 2048 --
unix:/var/run/openvswitch/db.sock -vconsole:emer -vsyslog:err -vfile:info
--mlockall --no-chdir --log-file=/var/log/openvswitch/ovs-vswitchd.log
--pidfile=/var/run/openvswitch/ovs-vswitchd.pid --detach --monitor
2. add a bridge and a ovs dpdk port
ovs-vsctl add-port ovsdpdkbr0 dpdk0 -- set Interface dpdk0 type=dpdk
ovs-vsctl add-port ovsdpdkbr0 dpdk0 -- set Interface dpdk0 type=dpdk

The log of the initialization after #1 looks good to me - I can see two of
my three virtio devices recognized and one blacklisted.
Memory allocation looks good, ... I'll attach the log at the end of the mail


## ISSUE ##
But when I add a port and refer to one of the dpdk ports it fails with the
following:
ovs-vsctl[14023]: ovs|1|vsctl|INFO|Called as ovs-vsctl add-port
ovsdpdkbr0 dpdk0 -- set Interface dpdk0 type=dpdk
ovs-vswitchd[13903]: EAL: memzone_reserve_aligned_thread_unsafe(): memzone
 already exists
ovs-vswitchd[13903]: EAL: memzone_reserve_aligned_thread_unsafe(): memzone
 already exists
ovs-vswitchd[13903]: EAL: memzone_reserve_aligned_thread_unsafe(): memzone
 already exists
kernel: show_signal_msg: 18 callbacks suppressed
kernel: pmd12[14025]: segfault at 2 ip 7f3eb205eab2 sp 7f3e3dffa590
error 4 in libdpdk.so.0[7f3eb1fdf000+1e9000]
ovs-vswitchd[13902]: ovs|3|daemon_unix(monitor)|ERR|1 crashes: pid
13903 died, killed (Segmentation fault), core dumped, restarting
systemd-udevd[14040]: Could not generate persistent MAC address for
ovs-netdev: No such file or directory
kernel: device ovs-netdev entered promiscuous mode
ovs-vswitchd[14036]: EAL: memzone_reserve_aligned_thread_unsafe(): memzone
 already exists
ovs-vswitchd[14036]: RING: Cannot reserve memory
kernel: device ovsdpdkbr0 entered promiscuous mode
ovs-vswitchd[14036]: EAL: memzone_reserve_aligned_thread_unsafe(): memzone
 already exists
ovs-vswitchd[14036]: RING: Cannot reserve memory


## Experiments (failed) ##
I thought it could be related to all the multiqueue chances that recently
going in.
My usual setup has 4 vCPUs and 4 queues per virtio-net device.
I tried them with only 1 of 4 queues, also with only 1 queue defined and
only 1 CPU - but all fail the same way.
I have testpmd and l2fwd on the same devices working, so I hope they are
not totally set up badly.

I also tried hilarious things like reassigning to uio_pci_generic before,
but well its virtio_pmd eventually anyways - so it made no difference.

>From how it appears I felt that it could be related to the old discussions
around
[1] http://dpdk.org/ml/archives/dev/2015-May/017589.html
[2] http://openvswitch.org/pipermail/dev/2015-March/052344.html
But they are (partially) applied upstream already and the issue doesn't
100% match the old discussions.


## Logs ##
[3] log of openvswitch start:
systemd[1]: Starting Open vSwitch Internal Unit...
ovs-ctl[13868]:  * Starting ovsdb-server
ovs-vsctl[13893]: ovs|1|vsctl|INFO|Called as ovs-vsctl --no-wait --
init -- set Open_vSwitch . db-version=7.12.1
ovs-vsctl[13898]: ovs|1|vsctl|INFO|Called as ovs-vsctl --no-wait set
Open_vSwitch . ovs-version=2.5.0
"external-ids:system-id=\"8ddb892c-53a5-410d-a765-0031ad6eb1be\""
"system-type=\"Ubuntu\"" "system-version=\"16.04-xenial\""
ovs-ctl[13868]:  * Configuring Open vSwitch system IDs
ovs-ctl[13868]: 2016-03-18T14:28:28Z|1|dpdk|INFO|No -vhost_sock_dir
provided - defaulting to /var/run/openvswitch
ovs-vswitchd[13900]: ovs|1|dpdk|INFO|No -vhost_sock_dir provided -
defaulting to /var/run/openvswitch
ovs-ctl[13868]: EAL: Detected lcore 0 as core 0 on socket 0
ovs-ctl[13868]: EAL: Detected lcore 1 as core 0 on socket 0
ovs-ctl[13868]: EAL: Detected lcore 2 as core 0 on socket 0
ovs-ctl[13868]: EAL: Detected lcore 3 as core 0 on socket 0
ovs-ctl[13868]: EAL: Support maximum 128 logical core(s) by configuration.
ovs-ctl[13868]: EAL: Detected 4 lcore(s)
ovs-ctl[13868]: EAL: No free hugepages reported in hugepages-1048576kB
ovs-ctl[13868]: EAL: VFIO modules not all loaded, skip VFIO support...
ovs-ctl[13868]: EAL: Sett

[dpdk-dev] [PATCH v8 00/11] Add API to get supported packet types

2016-03-18 Thread Bruce Richardson
On Mon, Mar 14, 2016 at 03:42:45PM +0800, Jianfeng Tan wrote:
> To achieve this, a new function pointer, dev_supported_ptypes_get, is added
> into struct eth_dev_ops. For those devices who do not implement it, it
> means it does not support any ptypes.
> 
> v8:
>   - Rebased on dpdk-next-net/rel_16_04 branch.
>   - Rename ptypes_info -> supported_ptypes.
>   - Abandon info about newly added API in release_16_04.rst.
>   - concise -> correct.
> 
> v7:
>   - 2.2 -> 16.04
>   - Add note: this API better invoked after device is already started.
>   - Update release_16_04.rst for newly added API.
> 
> v6:
>   - Remove extern in function declaration.
>   - Update rte_ether_version.map.
> 
> v5:
>   - Exclude l3fwd change from this series, as a separated one.
>   - Fix malposition of mlx4 code in mlx5 commit introduced in v4.
> 
> v4:
>   - Change how to use this API: to previously agreement reached in mail.
> 
> v3:
>   - Change how to use this API: api to allocate mem for storing ptype
> array; and caller to free the mem.
>   - Change how to return back ptypes from PMDs: return a pointer to
> corresponding static const array of supported ptypes, terminated
> by RTE_PTYPE_UNKNOWN.
>   - Fix l3fwd parse_packet_type() when EXACT_MATCH is enabled.
>   - Fix l3fwd memory leak when calling the API.
> 
> v2:
>   - Move ptype_mask filter function from each PMDs into ether layer.
>   - Add ixgbe vPMD's ptype info.
>   - Fix code style issues.
> 
> Signed-off-by: Jianfeng Tan 
> Acked-by: Konstantin Ananyev 
> Acked-by: Adrien Mazarguil 
>
Applied to dpdk-next-net/rel_16_04, using v9 of patch 1 and v8 of all others.

/Bruce




[dpdk-dev] [PATCH 1/2] mlx5: fix overwritten RSS configuration

2016-03-18 Thread Adrien Mazarguil
On Fri, Mar 18, 2016 at 01:54:42PM +0100, Nelio Laranjeiro wrote:
> RSS configuration provided by the application should not be used as storage
> by the PMD.
> 
> Fixes: 2f97422e7759 ("mlx5: support RSS hash update and get")
> 
> Signed-off-by: Nelio Laranjeiro 
> ---
>  drivers/net/mlx5/mlx5.h| 1 +
>  drivers/net/mlx5/mlx5_ethdev.c | 1 +
>  drivers/net/mlx5/mlx5_rss.c| 7 ++-
>  drivers/net/mlx5/mlx5_rxq.c| 2 +-
>  4 files changed, 5 insertions(+), 6 deletions(-)

Acked-by: Adrien Mazarguil 

-- 
Adrien Mazarguil
6WIND


[dpdk-dev] [PATCH 2/2] mlx5: fix handling of NULL RSS key

2016-03-18 Thread Adrien Mazarguil
On Fri, Mar 18, 2016 at 01:54:43PM +0100, Nelio Laranjeiro wrote:
> Update function can be called with no key to enable or disable a RSS
> protocol, or with a key to be applied to the desired protocols.
> 
> Fixes: 2f97422e7759 ("mlx5: support RSS hash update and get")
> 
> Signed-off-by: Nelio Laranjeiro 
> ---
>  drivers/net/mlx5/mlx5_rss.c | 6 --
>  1 file changed, 6 deletions(-)

Acked-by: Adrien Mazarguil 

-- 
Adrien Mazarguil
6WIND


[dpdk-dev] Performance issue with uio_pci_generic driver

2016-03-18 Thread Simon Jouet
If it?s of any use for debugging I would be willing to setup teamviewer or 
something like that for a person to remote desktop in a VM and have an SSH 
access to both servers (all of this is being firewalls that are out of my 
control so I can?t give direct SSH access). The two servers are for 
experimentation purposes only so there is nothing critical running.



From: Bruce Richardson
Sent: 18 March 2016 03:11
To: Simon Jouet
Cc: dev at dpdk.org
Subject: Re: [dpdk-dev] Performance issue with uio_pci_generic driver



On Wed, Mar 16, 2016 at 11:59:34PM +, Simon Jouet wrote:
> Hi everyone,
>
> First off I would like to thanks tmonjalo, Harry Van Harren and Bruce 
> Richardson for the input they gave while I was trying to figure out the issue 
> and pushing me to report the problem here ?
>
> Okay, so I was trying out some basic sanity benchmarks with DPDK before doing 
> anything more complicated and surprisingly I was getting lower than gigabit 
> speed for minimum packet size running l2fwd (or l3fwd for that matter).
>
> The setup is very simple I?ve got two machine with Intel x710 quad port NICs 
> one is running DPDK l2fwd and the other is running MoonGen for the 
> performance benchmark.
>
> After much debugging and trying to modify parameters one by one, giving up 
> after nothing worked and setting up ovs-dpdk I noticed from the ovs 
> documentation that the kernel module to load were uio and igb_uio while I was 
> previous loading uio_pci_generic as mentioned in the DPDK getting started 
> guide. I simply changed the kernel module and l2fwd went from 700Mbps to 10G 
> line-rate.  Bruce said that shouldn?t be the case and the performance should 
> be similar regardless of the driver loaded ...
>
> Here is the full log of the experiment, if you?re interested:
> https://gist.github.com/simon-jouet/178e1d302afef5c6a642
>
> Best regards,
> Simon
>
Thanks Simon for reporting this here!
It seems very strange to me that the particular module being used for uio has
such an impact on performance. Has anyone else seen anything similar?

/Bruce


[dpdk-dev] [PATCH v2] enic: update enic PMD maintainer and pointer to the guide.

2016-03-18 Thread John Daley
Change maintainers for ENIC PMD and fix pointer to enic
documentation in MAINTAINERS.

Signed-off-by: John Daley 
---
 MAINTAINERS  | 4 ++--
 doc/guides/nics/enic.rst | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 8b21979..65669c3 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -268,8 +268,8 @@ F: doc/guides/nics/cxgbe.rst

 Cisco enic
 M: John Daley 
-M: Sujith Sankar 
-F: drivers/net/enic/
+M: Nelson Escobar 
+F: doc/guides/nics/enic.rst

 Combo szedata2
 M: Matej Vido 
diff --git a/doc/guides/nics/enic.rst b/doc/guides/nics/enic.rst
index 2a228fd..e67c3db 100644
--- a/doc/guides/nics/enic.rst
+++ b/doc/guides/nics/enic.rst
@@ -218,4 +218,4 @@ Any questions or bugs should be reported to DPDK community 
and to the ENIC PMD
 maintainers:

 - John Daley 
-- Sujith Sankar 
+- Nelson Escobar 
-- 
2.7.0



[dpdk-dev] [PATCH v8 0/4] DPDK polling-mode driver for Amazon Elastic Network Adapters (ENA)

2016-03-18 Thread Bruce Richardson
On Thu, Mar 17, 2016 at 03:31:14PM +0100, Jan Medala wrote:
> v3:
> Additional features for Amazon ENA:
> * Low Latenycy Queue (LLQ) for Tx
> * RSS
> v4:
> * Improved doc
> * Improved style according to checkpatch script
> * Fixed build problems on: i686, clang, +shared, +debug
> v5:
> * Removed 'cvos' environment code from ena Makefile
> * Driver symbol version fixed to DPDK_16.04
> * Max MTU is read from device attributes
> v6:
> * Updated ENA communication layer
> * Added check if DPDK queue size is supported by device
> * Checkpatch results: 6 warns >80, 0 warns >90, no whitespace issues
> * defined likely/unlikely (can compile with ARM toolchain)
> * Updated doc/guides/nics/overview.rst w/ ENA
> * Removed metioned #pragma for "-Wcast-qual"
> v7:
> * Resolved Thomas's comments:
>   - included  instead of own definition of
> likely/unlikely
>   - used RTE_MIN/RTE_MAX macros
> v8:
> * Fixed init (error) logging to be always available
> 
> Jan Medala (4):
>   ena: Amazon ENA documentation
>   ena: Amazon ENA communication layer
>   ena: Amazon ENA communication layer for DPDK platform
>   ena: DPDK polling-mode driver for Amazon Elastic Network Adapters
> (ENA)
> 
Applied to dpdk-next-net/rel_16_04.

Thanks for the contribution of a new PMD.

/Bruce


[dpdk-dev] [PATCH 1/2] mbuf: Add rte_pktmbuf_copy

2016-03-18 Thread Pattan, Reshma
Hi,

> > >-Original Message-
> > >From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Stephen
> > >Hemminger
> > >Sent: Friday, July 10, 2015 1:38 AM
> > >To: dev at dpdk.org
> > >Cc: Mike Davison ; Stephen Hemminger
> > >
> > >Subject: [dpdk-dev] [PATCH 1/2] mbuf: Add rte_pktmbuf_copy
> > >
> > >From: Stephen Hemminger 
> > >
> > >Added rte_pktmbuf_copy() function since copying multi-part segments
> > >is common issue and can be problematic.
> > >
> > >Signed-off-by: Mike Davison 
> > >Reviewed-by: Stephen Hemminger 
> > >---
> >
> > Hi Stephen :>
> > This patch look useful in case of backup buffs.
> > There will be second approach ?
> 
> I did bother resubmitting it since unless there are users in current code 
> base it
> would likely rot.

I was writing similar mbuf copy functionality  which is required for tcpdump 
feature.
But, It was brought to my notice that this patch with similar functionality 
already  exists, so I would like to take this patch and work on further.
Also, if you have any further code changes on this patch, would you please send 
 the latest one. I will work further.

Thanks,
Reshma


[dpdk-dev] [PATCH v8 0/4] DPDK polling-mode driver for Amazon Elastic Network Adapters (ENA)

2016-03-18 Thread Jan Mędala
Great news!
We are going to extend functionality of the ENA PMD, so from now I will
send patches based on v8 version.

Thanks,

  Jan

2016-03-18 18:04 GMT+01:00 Bruce Richardson :

> On Thu, Mar 17, 2016 at 03:31:14PM +0100, Jan Medala wrote:
> > v3:
> > Additional features for Amazon ENA:
> > * Low Latenycy Queue (LLQ) for Tx
> > * RSS
> > v4:
> > * Improved doc
> > * Improved style according to checkpatch script
> > * Fixed build problems on: i686, clang, +shared, +debug
> > v5:
> > * Removed 'cvos' environment code from ena Makefile
> > * Driver symbol version fixed to DPDK_16.04
> > * Max MTU is read from device attributes
> > v6:
> > * Updated ENA communication layer
> > * Added check if DPDK queue size is supported by device
> > * Checkpatch results: 6 warns >80, 0 warns >90, no whitespace issues
> > * defined likely/unlikely (can compile with ARM toolchain)
> > * Updated doc/guides/nics/overview.rst w/ ENA
> > * Removed metioned #pragma for "-Wcast-qual"
> > v7:
> > * Resolved Thomas's comments:
> >   - included  instead of own definition of
> > likely/unlikely
> >   - used RTE_MIN/RTE_MAX macros
> > v8:
> > * Fixed init (error) logging to be always available
> >
> > Jan Medala (4):
> >   ena: Amazon ENA documentation
> >   ena: Amazon ENA communication layer
> >   ena: Amazon ENA communication layer for DPDK platform
> >   ena: DPDK polling-mode driver for Amazon Elastic Network Adapters
> > (ENA)
> >
> Applied to dpdk-next-net/rel_16_04.
>
> Thanks for the contribution of a new PMD.
>
> /Bruce
>


[dpdk-dev] DPDK and HW offloads

2016-03-18 Thread Stephen Hemminger
As I look at how the ethernet device interface in DPDK has exploded in 
complexity;
it makes life very hard for end users.  The goal has been to enable all the 
cool hardware
features, but it has put blinders on the driver devlopers; they are ignoring 
the fact
that real applications can't just work on one kind of hardware.

The DPDK is doing a terrible job at providing abstractions.  There needs to be a
real generic set of operations, and every hardware offload feature must:
  * have a clear well defined API
  * if feature is not available in software, then the DPDK must provide
a software equivalent feature.
  * any difference in API must be hidden from application.
  * no compile config options about offload.
  * tests and documentation must work for both hw and sw version

Right now, all those offload features are pretty much unusable in a real product
without lots and lots of extra codes and huge bug surface. It bothers me enough
that I would recommend removing much of the filter/offload/ptype stuff from 
DPDK!




[dpdk-dev] [PATCH v8 4/4] ena: DPDK polling-mode driver for Amazon Elastic Network Adapters (ENA)

2016-03-18 Thread Bruce Richardson
On Thu, Mar 17, 2016 at 03:31:18PM +0100, Jan Medala wrote:
> This is a PMD for the Amazon ethernet ENA family.
> The driver operates variety of ENA adapters through feature negotiation
> with the adapter and upgradable commands set.
> ENA driver handles PCI Physical and Virtual ENA functions.
> 
> Signed-off-by: Evgeny Schemeilin 
> Signed-off-by: Jan Medala 
> Signed-off-by: Jakub Palider 

FYI: I've added a couple of lines to release note, based off this patch's commit
message, when applying the patch. I think a new PMD needs to be well-publicized
there. :-)
Please review, and send on any modification to the release notes entry you think
are necessary.

Regards,
/Bruce


[dpdk-dev] [PATCH v8 0/4] DPDK polling-mode driver for Amazon Elastic Network Adapters (ENA)

2016-03-18 Thread Bruce Richardson
On Fri, Mar 18, 2016 at 06:11:29PM +0100, Jan M?dala wrote:
> Great news!
> We are going to extend functionality of the ENA PMD, so from now I will
> send patches based on v8 version.
> 
More specifically, please base any further patches on the code on
dpdk-next-net/rel_16_04 branch. At this stage in the release cycle for 16.04,
no new feature patches can be accepted, only bug-fixes. Feel free to send
RFC patches for 16.07 release though, to add any new features.

/Bruce

> Thanks,
> 
>   Jan
> 
> 2016-03-18 18:04 GMT+01:00 Bruce Richardson :
> 
> > On Thu, Mar 17, 2016 at 03:31:14PM +0100, Jan Medala wrote:
> > > v3:
> > > Additional features for Amazon ENA:
> > > * Low Latenycy Queue (LLQ) for Tx
> > > * RSS
> > > v4:
> > > * Improved doc
> > > * Improved style according to checkpatch script
> > > * Fixed build problems on: i686, clang, +shared, +debug
> > > v5:
> > > * Removed 'cvos' environment code from ena Makefile
> > > * Driver symbol version fixed to DPDK_16.04
> > > * Max MTU is read from device attributes
> > > v6:
> > > * Updated ENA communication layer
> > > * Added check if DPDK queue size is supported by device
> > > * Checkpatch results: 6 warns >80, 0 warns >90, no whitespace issues
> > > * defined likely/unlikely (can compile with ARM toolchain)
> > > * Updated doc/guides/nics/overview.rst w/ ENA
> > > * Removed metioned #pragma for "-Wcast-qual"
> > > v7:
> > > * Resolved Thomas's comments:
> > >   - included  instead of own definition of
> > > likely/unlikely
> > >   - used RTE_MIN/RTE_MAX macros
> > > v8:
> > > * Fixed init (error) logging to be always available
> > >
> > > Jan Medala (4):
> > >   ena: Amazon ENA documentation
> > >   ena: Amazon ENA communication layer
> > >   ena: Amazon ENA communication layer for DPDK platform
> > >   ena: DPDK polling-mode driver for Amazon Elastic Network Adapters
> > > (ENA)
> > >
> > Applied to dpdk-next-net/rel_16_04.
> >
> > Thanks for the contribution of a new PMD.
> >
> > /Bruce
> >


[dpdk-dev] Fwd: EAL: map_all_hugepages(): mmap failed: Cannot allocate memory

2016-03-18 Thread John Wei
Thanks for the reply. Upon further debugging, I was able to root caused the
issue. In the cgroup, in addition to limiting the CPU, I also limited the
node where my OVS can allocate the memory (cpuset.mems). I understand that
DPDK first grab all the memory, then pick the best memory pages, then
release the rest. But this is taking a long time for my case that I started
many OVSs on the same host.
Each DPDK app will need to wait for previous app to release the memory
before next app can proceed.
In addition, since I have specified that (through cgroup cpuset.mems) dont
get memory from other node, DPDK library may be can skipp grabbing memory
from these excluded nodes?

Just some thoughts.

John


On Thu, Mar 17, 2016 at 7:51 PM, Tan, Jianfeng 
wrote:

>
>
> On 3/18/2016 6:41 AM, John Wei wrote:
>
> I am setting up OVS inside a Linux container. This OVS is built using DPDK
> library.
> During the startup of ovs-vswitchd, it core dumped due to fail to mmap.
>   in eal_memory.c
>virtaddr = mmap(vma_addr, hugepage_sz, PROT_READ | PROT_WRITE,
> MAP_SHARED, fd, 0);
>
> This call is made inside a for loop that loops through all the pages and
> mmap them.
> My server has two cores, and I allocated 8192 2MB pages.
> The mmap for the first 4096 pages were successful. It failed when trying to
> map 4096th page.
>
> Can someone help me understand when the mmap for the first 4096 pages were
> successful and it failed on 4096th page?
>
>
> In my limited experience, there are some scenario that may lead to such
> failure: a. specified an option size when do mount hugetlbfs; b. cgroup
> limitation, /sys/fs/cgroup/hugetlb/ name>/hugetlb.2MB.limit_in_bytes; c. open files by ulimit...
>
> Workaround: as only "--socket-mem 128,128" is needed, you can reduce the
> total number of 2M hugepages from 8192 to 512 (or something else).
> In addition: this is a case why I sent a patchset:
> http://dpdk.org/dev/patchwork/patch/11194/
>
> Thanks,
> Jianfeng
>
>
>
> John
>
>
>
> ovs-vswitchd --dpdk -c 0x1 -n 4 -l 1 --file-prefix ct- --socket-mem
> 128,128 -- unix:$DB_SOCK --pidfile --detach --log-file=ct.log
>
>
> EAL: Detected lcore 23 as core 5 on socket 1
> EAL: Support maximum 128 logical core(s) by configuration.
> EAL: Detected 24 lcore(s)
> EAL: No free hugepages reported in hugepages-1048576kB
> EAL: VFIO modules not all loaded, skip VFIO support...
> EAL: Setting up physically contiguous memory...
> EAL: map_all_hugepages(): mmap failed: Cannot allocate memory
> EAL: Failed to mmap 2 MB hugepages
> PANIC in rte_eal_init():
> Cannot init memory
> 7: [ovs-vswitchd() [0x411f15]]
> 6: [/lib64/libc.so.6(__libc_start_main+0xf5) [0x7ff5f6133b15]]
> 5: [ovs-vswitchd() [0x4106f9]]
> 4: [ovs-vswitchd() [0x66917d]]
> 3: [ovs-vswitchd() [0x42b6f5]]
> 2: [ovs-vswitchd() [0x40dd8c]]
> 1: [ovs-vswitchd() [0x56b3ba]]
> Aborted (core dumped)
>
>
>


[dpdk-dev] [PATCH 1/2] mbuf: Add rte_pktmbuf_copy

2016-03-18 Thread Stephen Hemminger
On Fri, 18 Mar 2016 17:03:51 +
"Pattan, Reshma"  wrote:

> Hi,
> 
> > > >-Original Message-
> > > >From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Stephen
> > > >Hemminger
> > > >Sent: Friday, July 10, 2015 1:38 AM
> > > >To: dev at dpdk.org
> > > >Cc: Mike Davison ; Stephen Hemminger
> > > >
> > > >Subject: [dpdk-dev] [PATCH 1/2] mbuf: Add rte_pktmbuf_copy
> > > >
> > > >From: Stephen Hemminger 
> > > >
> > > >Added rte_pktmbuf_copy() function since copying multi-part segments
> > > >is common issue and can be problematic.
> > > >
> > > >Signed-off-by: Mike Davison 
> > > >Reviewed-by: Stephen Hemminger 
> > > >---
> > >
> > > Hi Stephen :>
> > > This patch look useful in case of backup buffs.
> > > There will be second approach ?
> > 
> > I did bother resubmitting it since unless there are users in current code 
> > base it
> > would likely rot.
> 
> I was writing similar mbuf copy functionality  which is required for tcpdump 
> feature.
> But, It was brought to my notice that this patch with similar functionality 
> already  exists, so I would like to take this patch and work on further.
> Also, if you have any further code changes on this patch, would you please 
> send  the latest one. I will work further.
> 
> Thanks,
> Reshma

We have a newer version that handles different size pools.


[dpdk-dev] [PATCH v2] enic: update enic PMD maintainer and pointer to the guide.

2016-03-18 Thread Thomas Monjalon
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -268,8 +268,8 @@ F: doc/guides/nics/cxgbe.rst
>  
>  Cisco enic
>  M: John Daley 
> -M: Sujith Sankar 
> -F: drivers/net/enic/

mistake here

> +M: Nelson Escobar 
> +F: doc/guides/nics/enic.rst



[dpdk-dev] DPDK and HW offloads

2016-03-18 Thread Thomas Monjalon
2016-03-18 10:16, Stephen Hemminger:
> As I look at how the ethernet device interface in DPDK has exploded in 
> complexity;

Yes I would like to start addressing this complexity in 16.07.

> it makes life very hard for end users.  The goal has been to enable all the 
> cool hardware
> features, but it has put blinders on the driver devlopers; they are ignoring 
> the fact
> that real applications can't just work on one kind of hardware.

+1

> The DPDK is doing a terrible job at providing abstractions.  There needs to 
> be a
> real generic set of operations, and every hardware offload feature must:
>   * have a clear well defined API

+1

>   * if feature is not available in software, then the DPDK must provide
> a software equivalent feature.

I'm not against this idea. Looking for more opinions.

>   * any difference in API must be hidden from application.
>   * no compile config options about offload.
>   * tests and documentation must work for both hw and sw version
> 
> Right now, all those offload features are pretty much unusable in a real 
> product
> without lots and lots of extra codes and huge bug surface. It bothers me 
> enough
> that I would recommend removing much of the filter/offload/ptype stuff from 
> DPDK!

One of the biggest challenge is to think about a good filtering API.
The offloading has some interaction with the mbuf struct.

I would like to suggest rewriting ethdev API by keeping it as is for some time 
for
compatibility while creating a new one. What about the prefix dpdk_netdev_ to
progressively replace rte_eth_dev?



[dpdk-dev] [PATCH v3] enic: update enic PMD maintainer and pointer to the guide.

2016-03-18 Thread John Daley
Change maintainers for ENIC PMD and fix pointer to enic
documentation in MAINTAINERS.

Signed-off-by: John Daley 
---
replace removed line

 MAINTAINERS  | 3 ++-
 doc/guides/nics/enic.rst | 2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 8b21979..c90782e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -268,8 +268,9 @@ F: doc/guides/nics/cxgbe.rst

 Cisco enic
 M: John Daley 
-M: Sujith Sankar 
+M: Nelson Escobar 
 F: drivers/net/enic/
+F: doc/guides/nics/enic.rst

 Combo szedata2
 M: Matej Vido 
diff --git a/doc/guides/nics/enic.rst b/doc/guides/nics/enic.rst
index 2a228fd..e67c3db 100644
--- a/doc/guides/nics/enic.rst
+++ b/doc/guides/nics/enic.rst
@@ -218,4 +218,4 @@ Any questions or bugs should be reported to DPDK community 
and to the ENIC PMD
 maintainers:

 - John Daley 
-- Sujith Sankar 
+- Nelson Escobar 
-- 
2.7.0



[dpdk-dev] [PATCH v2] enic: fix incorrect setting of rx descriptor limit

2016-03-18 Thread John Daley
From: Nelson Escobar 

On initialization, the rq descriptor count was set to the limit
of the vic.  When the requested number of rx descriptors was
less than this count, enic_alloc_rq() was incorrectly setting
the count to the lower value.  This results in later calls to
enic_alloc_rq() incorrectly using the lower value as the adapter
limit.

Signed-off-by: Nelson Escobar 
Fixes: fefed3d1e62c ("enic: new driver")
Reviewed-by: John Daley 
---
fix checkin comment
 drivers/net/enic/enic_main.c | 14 ++
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/drivers/net/enic/enic_main.c b/drivers/net/enic/enic_main.c
index 9fff020..cd7857f 100644
--- a/drivers/net/enic/enic_main.c
+++ b/drivers/net/enic/enic_main.c
@@ -524,24 +524,22 @@ int enic_alloc_rq(struct enic *enic, uint16_t queue_idx,
"policy.  Applying the value in the adapter "\
"policy (%d).\n",
queue_idx, nb_desc, enic->config.rq_desc_count);
-   } else if (nb_desc != enic->config.rq_desc_count) {
-   enic->config.rq_desc_count = nb_desc;
-   dev_info(enic,
-   "RX Queues - effective number of descs:%d\n",
-   nb_desc);
+   nb_desc = enic->config.rq_desc_count;
}
+   dev_info(enic, "RX Queues - effective number of descs:%d\n",
+nb_desc);
}

/* Allocate queue resources */
rc = vnic_rq_alloc(enic->vdev, rq, queue_idx,
-   enic->config.rq_desc_count, sizeof(struct rq_enet_desc));
+   nb_desc, sizeof(struct rq_enet_desc));
if (rc) {
dev_err(enic, "error in allocation of rq\n");
goto err_exit;
}

rc = vnic_cq_alloc(enic->vdev, &enic->cq[queue_idx], queue_idx,
-   socket_id, enic->config.rq_desc_count,
+   socket_id, nb_desc,
sizeof(struct cq_enet_rq_desc));
if (rc) {
dev_err(enic, "error in allocation of cq for rq\n");
@@ -550,7 +548,7 @@ int enic_alloc_rq(struct enic *enic, uint16_t queue_idx,

/* Allocate the mbuf ring */
rq->mbuf_ring = (struct rte_mbuf **)rte_zmalloc_socket("rq->mbuf_ring",
-   sizeof(struct rte_mbuf *) * enic->config.rq_desc_count,
+   sizeof(struct rte_mbuf *) * nb_desc,
RTE_CACHE_LINE_SIZE, rq->socket_id);

if (rq->mbuf_ring != NULL)
-- 
2.7.0



  1   2   >