[dpdk-dev] Building dpdk 1.7.0 and 2.0.0 with shared library and combine library options fail in test module and building example application

2015-08-24 Thread William Yeung
Hello,

I built dpdk 1.7.0 with the shared library and combined library options
enabled, but I get errors when building an example application and
when running the test module in tools/setup.sh.

I modified the common_linuxapp file (dpdk-1.7.0/config/common_linuxapp) to
build a shared library and combine everything into one library, as follows:

CONFIG_RTE_BUILD_SHARED_LIB=y # Compile to share library
CONFIG_RTE_BUILD_COMBINE_LIBS=y # Combine to one single library

Then I built using the following commands:

CONFIG_RTE_LIBRTE_MLX4_PMD=y
export EXTRA_CFLAGS=-I$TMP/install/usr/local/include
export EXTRA_LDFLAGS=-L$TMP/install/usr/local/lib
make config T=x86_64-native-linuxapp-gcc
make

I executed the tools/setup.sh script and ran the test modules, but I
get this error:

Option: 17

Enter hex bitmask of cores to execute test app on
Example: to execute app on cores 0 to 7, enter 0xff
bitmask: 0x01
Launching app
sudo: /app/test: command not found
Press enter to continue ...

Option 17 is "Run test application ($RTE_TARGET/app/test)".

Building the helloworld sample application yields the following error:

ubuntu-pc@ubuntu-pc:~/devel-ostinato-dpdk/ostinato-dpdk-master/dpdk-1.7.0/examples/helloworld$ export RTE_SDK=/home/ubuntu-pc/devel-ostinato-dpdk/ostinato-dpdk-master/dpdk-1.7.0
ubuntu-pc@ubuntu-pc:~/devel-ostinato-dpdk/ostinato-dpdk-master/dpdk-1.7.0/examples/helloworld$ export RTE_TARGET=x86_64-native-linuxapp-gcc
ubuntu-pc@ubuntu-pc:~/devel-ostinato-dpdk/ostinato-dpdk-master/dpdk-1.7.0/examples/helloworld$ make
LD helloworld
/home/ubuntu-pc/devel-ostinato-dpdk/ostinato-dpdk-master/dpdk-1.7.0/x86_64-native-linuxapp-gcc/lib/librte_eal.so: undefined reference to `rte_malloc'
/home/ubuntu-pc/devel-ostinato-dpdk/ostinato-dpdk-master/dpdk-1.7.0/x86_64-native-linuxapp-gcc/lib/librte_eal.so: undefined reference to `rte_mempool_lookup'
/home/ubuntu-pc/devel-ostinato-dpdk/ostinato-dpdk-master/dpdk-1.7.0/x86_64-native-linuxapp-gcc/lib/librte_eal.so: undefined reference to `rte_zmalloc'
/home/ubuntu-pc/devel-ostinato-dpdk/ostinato-dpdk-master/dpdk-1.7.0/x86_64-native-linuxapp-gcc/lib/librte_eal.so: undefined reference to `rte_free'
/home/ubuntu-pc/devel-ostinato-dpdk/ostinato-dpdk-master/dpdk-1.7.0/x86_64-native-linuxapp-gcc/lib/librte_eal.so: undefined reference to `rte_mempool_create'
collect2: ld returned 1 exit status
make[1]: *** [helloworld] Error 1
make: *** [all] Error 2
ubuntu-pc@ubuntu-pc:~/devel-ostinato-dpdk/ostinato-dpdk-master/dpdk-1.7.0/examples/helloworld$

I also built dpdk 2.0.0 on the same PC, but I received the same
errors. My steps for building dpdk 2.0.0 are as follows:

Modify the following lines in common_linuxapp file
(dpdk-1.7.0/config/common_linuxapp):

CONFIG_RTE_BUILD_SHARED_LIB=y # Compile to share library
CONFIG_RTE_BUILD_COMBINE_LIBS=y # Combine to one single library

Execute the commands:

CONFIG_RTE_LIBRTE_MLX4_PMD=y
export EXTRA_CFLAGS=-I$TMP/install/usr/local/include
export EXTRA_LDFLAGS=-L$TMP/install/usr/local/lib
make config T=x86_64-native-linuxapp-gcc
sed -ri 's,(PMD_PCAP=).*,\1y,' build/.config
make

If I don't enable the shared library and combined library options, I am
able to build the helloworld example and run the test modules in
tools/setup.sh without any issues. However, I need to enable these two
options to build an application (in my case, the open source Ostinato
dpdk traffic generator; see https://github.com/pstavirs/ostinato and
https://github.com/PLVision/ostinato-dpdk).

My environment:

Linux ubuntu-pc 3.13.0-32-generic #57~precise1-Ubuntu
OS: Ubuntu 12.04.5 desktop amd64 (http://releases.ubuntu.com/12.04/)
on an Intel CPU.
Linux Kernel: 3.13.0-32-generic

For the 3.13.0 kernel I had to modify the kcompat.h file in
dpdk-1.7.0/lib/librte_eal/linuxapp/kni/ethtool/igb to comply with this
kernel. I modified line 3848 to:

#if ( LINUX_VERSION_CODE < KERNEL_VERSION(3,13,0) )

I also built these two dpdk versions on the same PC using the 3.14.4
kernel, but I still ended up with the same results. What steps do I need
to take to make dpdk work properly (i.e. build example apps and run
test modules) while keeping the shared library and combined library
options set? I would like to note that in my attempts, dpdk itself built
without any errors and binding drivers to NICs worked (despite not
having dpdk supported NICs). I need this to build the Ostinato
application.

Also, what will my NIC do if it has the igb_uio driver on it despite
not being a dpdk supported device?

My NICs:

Realtek RTL-8110SC/8169SC Gigabit Ethernet

Intel 82579V Gigabit Network Connection

I apologize for the long post.

Thank you and kind regards,

Will


[dpdk-dev] [PATCH 1/4] vhost: remove redundant ;

2015-08-24 Thread Yuanhan Liu
Signed-off-by: Yuanhan Liu 
---
 lib/librte_vhost/vhost_rxtx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
index 0d07338..d412293 100644
--- a/lib/librte_vhost/vhost_rxtx.c
+++ b/lib/librte_vhost/vhost_rxtx.c
@@ -185,7 +185,7 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
}
}
len_to_cpy = RTE_MIN(data_len - offset, desc->len - vb_offset);
-   };
+   }

/* Update used ring with desc information */
vq->used->ring[res_cur_idx & (vq->size - 1)].id =
-- 
1.9.0



[dpdk-dev] [PATCH 2/4] vhost: fix typo

2015-08-24 Thread Yuanhan Liu
_det => _dev

Signed-off-by: Yuanhan Liu 
---
 lib/librte_vhost/virtio-net.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_vhost/virtio-net.c b/lib/librte_vhost/virtio-net.c
index b520ec5..b670992 100644
--- a/lib/librte_vhost/virtio-net.c
+++ b/lib/librte_vhost/virtio-net.c
@@ -485,7 +485,7 @@ set_vring_num(struct vhost_device_ctx ctx, struct vhost_vring_state *state)
 }

 /*
- * Reallocate virtio_det and vhost_virtqueue data structure to make them on the
+ * Reallocate virtio_dev and vhost_virtqueue data structure to make them on the
  * same numa node as the memory of vring descriptor.
  */
 #ifdef RTE_LIBRTE_VHOST_NUMA
-- 
1.9.0



[dpdk-dev] [PATCH 3/4] vhost: get rid of duplicate code

2015-08-24 Thread Yuanhan Liu
Signed-off-by: Yuanhan Liu 
---
 lib/librte_vhost/vhost_user/vhost-net-user.c | 36 
 1 file changed, 10 insertions(+), 26 deletions(-)

diff --git a/lib/librte_vhost/vhost_user/vhost-net-user.c b/lib/librte_vhost/vhost_user/vhost-net-user.c
index f406a94..d1f8877 100644
--- a/lib/librte_vhost/vhost_user/vhost-net-user.c
+++ b/lib/librte_vhost/vhost_user/vhost-net-user.c
@@ -329,32 +329,16 @@ vserver_message_handler(int connfd, void *dat, int *remove)

ctx.fh = cfd_ctx->fh;
ret = read_vhost_message(connfd, &msg);
-   if (ret < 0) {
-   RTE_LOG(ERR, VHOST_CONFIG,
-   "vhost read message failed\n");
-
-   close(connfd);
-   *remove = 1;
-   free(cfd_ctx);
-   user_destroy_device(ctx);
-   ops->destroy_device(ctx);
-
-   return;
-   } else if (ret == 0) {
-   RTE_LOG(INFO, VHOST_CONFIG,
-   "vhost peer closed\n");
-
-   close(connfd);
-   *remove = 1;
-   free(cfd_ctx);
-   user_destroy_device(ctx);
-   ops->destroy_device(ctx);
-
-   return;
-   }
-   if (msg.request > VHOST_USER_MAX) {
-   RTE_LOG(ERR, VHOST_CONFIG,
-   "vhost read incorrect message\n");
+   if (ret <= 0 || msg.request > VHOST_USER_MAX) {
+   if (ret < 0)
+   RTE_LOG(ERR, VHOST_CONFIG,
+   "vhost read message failed\n");
+   else if (ret == 0)
+   RTE_LOG(INFO, VHOST_CONFIG,
+   "vhost peer closed\n");
+   else
+   RTE_LOG(ERR, VHOST_CONFIG,
+   "vhost read incorrect message\n");

close(connfd);
*remove = 1;
-- 
1.9.0
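
The consolidation in this patch can be illustrated outside the vhost code. The following is only a minimal sketch, not the code from the patch: check_read_result(), enum read_status and REQUEST_MAX are hypothetical names standing in for the logic around read_vhost_message() and VHOST_USER_MAX. It shows how the three failure cases can pick their own status while sharing a single cleanup path.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical upper bound on request ids, standing in for VHOST_USER_MAX. */
#define REQUEST_MAX 16u

enum read_status { READ_OK, READ_FAILED, PEER_CLOSED, BAD_REQUEST };

/*
 * ret follows read() conventions: <0 error, 0 peer closed, >0 bytes read.
 * On any failure *remove_conn is set to 1, which stands in for the
 * close/free/destroy sequence that the patch now writes only once.
 */
static enum read_status
check_read_result(int ret, unsigned int request, int *remove_conn)
{
	enum read_status status = READ_OK;

	assert(remove_conn != NULL);

	if (ret <= 0 || request > REQUEST_MAX) {
		if (ret < 0)
			status = READ_FAILED;	/* read error */
		else if (ret == 0)
			status = PEER_CLOSED;	/* orderly shutdown by the peer */
		else
			status = BAD_REQUEST;	/* request id out of range */

		/* Shared cleanup for all three cases. */
		*remove_conn = 1;
	}
	return status;
}
```

Only the log message differs between the branches, so keeping the cleanup after the inner if/else chain avoids repeating it three times.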



[dpdk-dev] [PATCH 4/4] vhost: define callfd and kickfd as int type

2015-08-24 Thread Yuanhan Liu
So that we can remove the redundant (int) cast.

Signed-off-by: Yuanhan Liu 
---
 examples/vhost/main.c |  6 ++---
 lib/librte_vhost/rte_virtio_net.h |  4 ++--
 lib/librte_vhost/vhost_rxtx.c |  6 ++---
 lib/librte_vhost/vhost_user/virtio-net-user.c | 16 +++---
 lib/librte_vhost/virtio-net.c | 32 +--
 5 files changed, 32 insertions(+), 32 deletions(-)

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index 1b137b9..b090b25 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -1433,7 +1433,7 @@ put_desc_to_used_list_zcp(struct vhost_virtqueue *vq, uint16_t desc_idx)

/* Kick the guest if necessary. */
if (!(vq->avail->flags & VRING_AVAIL_F_NO_INTERRUPT))
-   eventfd_write((int)vq->callfd, 1);
+   eventfd_write(vq->callfd, 1);
 }

 /*
@@ -1626,7 +1626,7 @@ txmbuf_clean_zcp(struct virtio_net *dev, struct vpool *vpool)

/* Kick guest if required. */
if (!(vq->avail->flags & VRING_AVAIL_F_NO_INTERRUPT))
-   eventfd_write((int)vq->callfd, 1);
+   eventfd_write(vq->callfd, 1);

return 0;
 }
@@ -1774,7 +1774,7 @@ virtio_dev_rx_zcp(struct virtio_net *dev, struct rte_mbuf **pkts,

/* Kick the guest if necessary. */
if (!(vq->avail->flags & VRING_AVAIL_F_NO_INTERRUPT))
-   eventfd_write((int)vq->callfd, 1);
+   eventfd_write(vq->callfd, 1);

return count;
 }
diff --git a/lib/librte_vhost/rte_virtio_net.h b/lib/librte_vhost/rte_virtio_net.h
index b9bf320..a037c15 100644
--- a/lib/librte_vhost/rte_virtio_net.h
+++ b/lib/librte_vhost/rte_virtio_net.h
@@ -87,8 +87,8 @@ struct vhost_virtqueue {
    uint16_t            vhost_hlen;         /**< Vhost header length (varies depending on RX merge buffers. */
    volatile uint16_t   last_used_idx;      /**< Last index used on the available ring */
    volatile uint16_t   last_used_idx_res;  /**< Used for multiple devices reserving buffers. */
-   eventfd_t           callfd;             /**< Used to notify the guest (trigger interrupt). */
-   eventfd_t           kickfd;             /**< Currently unused as polling mode is enabled. */
+   int                 callfd;             /**< Used to notify the guest (trigger interrupt). */
+   int                 kickfd;             /**< Currently unused as polling mode is enabled. */
    struct buf_vector   buf_vec[BUF_VECTOR_MAX];    /**< for scatter RX. */
 } __rte_cache_aligned;

diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
index d412293..887cdb6 100644
--- a/lib/librte_vhost/vhost_rxtx.c
+++ b/lib/librte_vhost/vhost_rxtx.c
@@ -230,7 +230,7 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,

/* Kick the guest if necessary. */
if (!(vq->avail->flags & VRING_AVAIL_F_NO_INTERRUPT))
-   eventfd_write((int)vq->callfd, 1);
+   eventfd_write(vq->callfd, 1);
return count;
 }

@@ -529,7 +529,7 @@ virtio_dev_merge_rx(struct virtio_net *dev, uint16_t queue_id,

/* Kick the guest if necessary. */
if (!(vq->avail->flags & VRING_AVAIL_F_NO_INTERRUPT))
-   eventfd_write((int)vq->callfd, 1);
+   eventfd_write(vq->callfd, 1);
}

return count;
@@ -752,6 +752,6 @@ rte_vhost_dequeue_burst(struct virtio_net *dev, uint16_t queue_id,
vq->used->idx += entry_success;
/* Kick guest if required. */
if (!(vq->avail->flags & VRING_AVAIL_F_NO_INTERRUPT))
-   eventfd_write((int)vq->callfd, 1);
+   eventfd_write(vq->callfd, 1);
return entry_success;
 }
diff --git a/lib/librte_vhost/vhost_user/virtio-net-user.c b/lib/librte_vhost/vhost_user/virtio-net-user.c
index c1ffc38..4689927 100644
--- a/lib/librte_vhost/vhost_user/virtio-net-user.c
+++ b/lib/librte_vhost/vhost_user/virtio-net-user.c
@@ -214,10 +214,10 @@ virtio_is_ready(struct virtio_net *dev)
rvq = dev->virtqueue[VIRTIO_RXQ];
tvq = dev->virtqueue[VIRTIO_TXQ];
if (rvq && tvq && rvq->desc && tvq->desc &&
-   (rvq->kickfd != (eventfd_t)-1) &&
-   (rvq->callfd != (eventfd_t)-1) &&
-   (tvq->kickfd != (eventfd_t)-1) &&
-   (tvq->callfd != (eventfd_t)-1)) {
+   (rvq->kickfd != -1) &&
+   (rvq->callfd != -1) &&
+   (tvq->kickfd != -1) &&
+   (tvq->callfd != -1)) {
RTE_LOG(INFO, VHOST_CONFIG,
"virtio is now ready for processing.\n");
return 1;
@@ -290,13 +290,13 @@ user_get_vring_base(struct vhost_device_ctx ctx,
 * sent and only sent in vhost_vring_stop.
 * TODO: cleanup the vring, it isn't usable since here.
 */
-   if (((int
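
As background for the type change in this patch: on Linux with glibc, eventfd(2) returns the descriptor as a plain int, while eventfd_t is the 64-bit counter type passed to eventfd_read() and eventfd_write(). Storing descriptors in eventfd_t fields is what made the (int) and (eventfd_t)-1 casts necessary. The sketch below is a standalone illustration, assuming Linux and glibc; kick_and_read() is a hypothetical helper and not part of librte_vhost.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/*
 * Signal an eventfd once and read the counter back.
 * Returns the counter value that was read, or 0 on any failure.
 */
static uint64_t
kick_and_read(void)
{
	int fd = eventfd(0, 0);		/* the descriptor is a plain int */
	eventfd_t value = 0;		/* eventfd_t holds the 64-bit counter */

	/* The counter type is 64 bits wide, unlike the descriptor. */
	assert(sizeof(eventfd_t) == sizeof(uint64_t));

	if (fd == -1)			/* compared against -1 without a cast */
		return 0;
	if (eventfd_write(fd, 1) != 0 || eventfd_read(fd, &value) != 0)
		value = 0;
	close(fd);
	return value;
}
```

With the fields declared as int, calls such as eventfd_write(vq->callfd, 1) and checks such as rvq->kickfd != -1 need no casts, which is what the patch does.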

[dpdk-dev] [PATCH v4] ixgbe_pmd: enforce RS bit on every EOP descriptor for devices newer than 82598

2015-08-24 Thread Vlad Zolotarov


On 08/20/15 18:37, Vlad Zolotarov wrote:
> According to 82599 and x540 HW specifications RS bit *must* be
> set in the last descriptor of *every* packet.
>
> Before this patch there were 3 types of Tx callbacks that were setting
> RS bit every tx_rs_thresh descriptors. This patch introduces a set of
> new callbacks, one for each type mentioned above, that will set the RS
> bit in every EOP descriptor.
>
> ixgbe_set_tx_function() will set the appropriate Tx callback according
> to the device family.

[+Jesse and Jeff]

I've started to look at the i40e PMD and it has the same RS bit
deferring logic as the ixgbe PMD has (surprise, surprise!.. ;)). To
recall, the i40e PMD uses a descriptor write-back completion mode.

From the HW spec it's unclear whether the RS bit should be set on *every*
descriptor with the EOP bit. However, I noticed that the Linux driver,
before it moved to HEAD write-back mode, was setting the RS bit on
every EOP descriptor.

So, here is a question to the Intel guys: could you please clarify this point?

Thanks in advance,
vlad

>
> This patch fixes the Tx hang we were constantly hitting with a
> seastar-based application on x540 NIC.
>
> Signed-off-by: Vlad Zolotarov 
> ---
> New in v4:
> - Styling (white spaces) fixes.
>
> New in v3:
> - Enforce the RS bit setting instead of enforcing tx_rs_thresh to be 1.
> ---
>   drivers/net/ixgbe/ixgbe_ethdev.c   |  14 +++-
>   drivers/net/ixgbe/ixgbe_ethdev.h   |   4 ++
>   drivers/net/ixgbe/ixgbe_rxtx.c | 139 
> -
>   drivers/net/ixgbe/ixgbe_rxtx.h |   2 +
>   drivers/net/ixgbe/ixgbe_rxtx_vec.c |  29 ++--
>   5 files changed, 149 insertions(+), 39 deletions(-)
>
> diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
> index b8ee1e9..355882c 100644
> --- a/drivers/net/ixgbe/ixgbe_ethdev.c
> +++ b/drivers/net/ixgbe/ixgbe_ethdev.c
> @@ -866,12 +866,17 @@ eth_ixgbe_dev_init(struct rte_eth_dev *eth_dev)
>   uint32_t ctrl_ext;
>   uint16_t csum;
>   int diag, i;
> + bool rs_deferring_allowed = (hw->mac.type <= ixgbe_mac_82598EB);
>   
>   PMD_INIT_FUNC_TRACE();
>   
>   eth_dev->dev_ops = &ixgbe_eth_dev_ops;
>   eth_dev->rx_pkt_burst = &ixgbe_recv_pkts;
> - eth_dev->tx_pkt_burst = &ixgbe_xmit_pkts;
> +
> + if (rs_deferring_allowed)
> + eth_dev->tx_pkt_burst = &ixgbe_xmit_pkts;
> + else
> + eth_dev->tx_pkt_burst = &ixgbe_xmit_pkts_always_rs;
>   
>   /*
>* For secondary processes, we don't initialise any further as primary
> @@ -1147,12 +1152,17 @@ eth_ixgbevf_dev_init(struct rte_eth_dev *eth_dev)
>   struct ixgbe_hwstrip *hwstrip =
>   IXGBE_DEV_PRIVATE_TO_HWSTRIP_BITMAP(eth_dev->data->dev_private);
>   struct ether_addr *perm_addr = (struct ether_addr *) hw->mac.perm_addr;
> + bool rs_deferring_allowed = (hw->mac.type <= ixgbe_mac_82598EB);
>   
>   PMD_INIT_FUNC_TRACE();
>   
>   eth_dev->dev_ops = &ixgbevf_eth_dev_ops;
>   eth_dev->rx_pkt_burst = &ixgbe_recv_pkts;
> - eth_dev->tx_pkt_burst = &ixgbe_xmit_pkts;
> +
> + if (rs_deferring_allowed)
> + eth_dev->tx_pkt_burst = &ixgbe_xmit_pkts;
> + else
> + eth_dev->tx_pkt_burst = &ixgbe_xmit_pkts_always_rs;
>   
>   /* for secondary processes, we don't initialise any further as primary
>* has already done this work. Only check we don't need a different
> diff --git a/drivers/net/ixgbe/ixgbe_ethdev.h b/drivers/net/ixgbe/ixgbe_ethdev.h
> index c3d4f4f..390356d 100644
> --- a/drivers/net/ixgbe/ixgbe_ethdev.h
> +++ b/drivers/net/ixgbe/ixgbe_ethdev.h
> @@ -367,9 +367,13 @@ uint16_t ixgbe_recv_pkts_lro_bulk_alloc(void *rx_queue,
>   
>   uint16_t ixgbe_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
>   uint16_t nb_pkts);
> +uint16_t ixgbe_xmit_pkts_always_rs(
> + void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts);
>   
>   uint16_t ixgbe_xmit_pkts_simple(void *tx_queue, struct rte_mbuf **tx_pkts,
>   uint16_t nb_pkts);
> +uint16_t ixgbe_xmit_pkts_simple_always_rs(
> + void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts);
>   
>   int ixgbe_dev_rss_hash_update(struct rte_eth_dev *dev,
> struct rte_eth_rss_conf *rss_conf);
> diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
> index 91023b9..044f72c 100644
> --- a/drivers/net/ixgbe/ixgbe_rxtx.c
> +++ b/drivers/net/ixgbe/ixgbe_rxtx.c
> @@ -164,11 +164,16 @@ ixgbe_tx_free_bufs(struct ixgbe_tx_queue *txq)
>   
>   /* Populate 4 descriptors with data from 4 mbufs */
>   static inline void
> -tx4(volatile union ixgbe_adv_tx_desc *txdp, struct rte_mbuf **pkts)
> +tx4(volatile union ixgbe_adv_tx_desc *txdp, struct rte_mbuf **pkts,
> +bool always_rs)
>   {
>   uint64_t buf_dma_addr;
>   uint32_t pkt_len;
>   int i;
> + uint32_t flags = DCMD_DTYP_FLAGS;
> +
> + if (always_rs)
> + flags |= IXGBE_ADVTXD_DCMD_RS;
> 
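
The difference between the deferred and the "always RS" behaviour discussed in this thread can be sketched as follows. This is a simplified illustration only, not the ixgbe code: CMD_EOP, CMD_RS and tx_desc_cmd() are hypothetical names, and the bit values are placeholders rather than the real descriptor layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical command bits; the real values live in the driver headers. */
#define CMD_EOP 0x01u	/* last descriptor of a packet */
#define CMD_RS  0x08u	/* ask the NIC to report completion status */

/*
 * Pick the command bits for one Tx descriptor.
 * nb_used is the number of descriptors queued since RS was last requested.
 */
static uint32_t
tx_desc_cmd(bool is_eop, bool always_rs, uint16_t nb_used, uint16_t rs_thresh)
{
	uint32_t cmd;

	assert(rs_thresh > 0);

	if (!is_eop)
		return 0;	/* middle of a packet: no EOP, no RS */

	cmd = CMD_EOP;
	/*
	 * Deferred mode requests status only once rs_thresh descriptors
	 * have accumulated; always_rs requests it for every packet.
	 */
	if (always_rs || nb_used >= rs_thresh)
		cmd |= CMD_RS;
	return cmd;
}
```

In this sketch, always_rs corresponds to the inverse of rs_deferring_allowed in the patch, which is true only for 82598 and older MAC types.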

[dpdk-dev] [PATCH] librte_eal: Fix wrong header file for old gcc version

2015-08-24 Thread Michael Qiu
For __SSE3__, the corresponding header file should be pmmintrin.h;
tmmintrin.h is the header for __SSSE3__.

Signed-off-by: Michael Qiu 
---
 lib/librte_eal/common/include/arch/x86/rte_vect.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/lib/librte_eal/common/include/arch/x86/rte_vect.h b/lib/librte_eal/common/include/arch/x86/rte_vect.h
index b698797..8a4dace 100644
--- a/lib/librte_eal/common/include/arch/x86/rte_vect.h
+++ b/lib/librte_eal/common/include/arch/x86/rte_vect.h
@@ -51,6 +51,10 @@
 #endif

 #ifdef __SSE3__
+#include <pmmintrin.h>
+#endif
+
+#ifdef __SSSE3__
 #include <tmmintrin.h>
 #endif

-- 
1.9.3
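
To show why the header choice matters, here is a small standalone sketch (not part of rte_vect.h). It uses the same guards as the patch and calls _mm_hadd_ps, an SSE3 intrinsic declared in pmmintrin.h, when __SSE3__ is defined. sum4() is a hypothetical helper; without SSE3 it falls back to scalar code, so the block builds on any target.

```c
#include <assert.h>
#include <stddef.h>

#ifdef __SSE3__
#include <pmmintrin.h>	/* SSE3 intrinsics such as _mm_hadd_ps */
#endif

#ifdef __SSSE3__
#include <tmmintrin.h>	/* SSSE3 intrinsics; unused here, mirrors the patch */
#endif

/* Sum four floats, using the SSE3 horizontal add when it is available. */
static float
sum4(const float v[4])
{
	assert(v != NULL);
#ifdef __SSE3__
	{
		__m128 x = _mm_loadu_ps(v);

		x = _mm_hadd_ps(x, x);	/* {v0+v1, v2+v3, v0+v1, v2+v3} */
		x = _mm_hadd_ps(x, x);	/* every lane now holds the full sum */
		return _mm_cvtss_f32(x);
	}
#else
	return (v[0] + v[1]) + (v[2] + v[3]);
#endif
}
```

Compilers that provide every intrinsic through a single umbrella header hide this distinction, which is presumably why the problem only shows up with older gcc versions, as the subject line says.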



[dpdk-dev] [PATCH 1/6] ixgbe: Support VMDq RSS in non-SRIOV environment

2015-08-24 Thread Qiu, Michael
On 5/21/2015 3:50 PM, Ouyang Changchun wrote:
> In non-SRIOV environment, VMDq RSS could be enabled by MRQC register.
> In theory, the queue number per pool could be 2 or 4, but only 2 queues are
> available due to HW limitation, the same limit also exist in Linux ixgbe 
> driver.
>
> Signed-off-by: Changchun Ouyang 
> ---
>  lib/librte_ether/rte_ethdev.c | 40 +++
>  lib/librte_pmd_ixgbe/ixgbe_rxtx.c | 82 
> +--
>  2 files changed, 111 insertions(+), 11 deletions(-)
>
> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> index 024fe8b..6535715 100644
> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> @@ -933,6 +933,16 @@ rte_eth_dev_check_vf_rss_rxq_num(uint8_t port_id, 
> uint16_t nb_rx_q)
>   return 0;
>  }
>  
> +#define VMDQ_RSS_RX_QUEUE_NUM_MAX 4
> +
> +static int
> +rte_eth_dev_check_vmdq_rss_rxq_num(__rte_unused uint8_t port_id, uint16_t 
> nb_rx_q)
> +{
> + if (nb_rx_q > VMDQ_RSS_RX_QUEUE_NUM_MAX)
> + return -EINVAL;
> + return 0;
> +}
> +
>  static int
>  rte_eth_dev_check_mq_mode(uint8_t port_id, uint16_t nb_rx_q, uint16_t 
> nb_tx_q,
> const struct rte_eth_conf *dev_conf)
> @@ -1093,6 +1103,36 @@ rte_eth_dev_check_mq_mode(uint8_t port_id, uint16_t 
> nb_rx_q, uint16_t nb_tx_q,
>   return -EINVAL;
>   }
>   }
> +
> + if (dev_conf->rxmode.mq_mode == ETH_MQ_RX_VMDQ_RSS) {
> + uint32_t nb_queue_pools =
> + 
> dev_conf->rx_adv_conf.vmdq_rx_conf.nb_queue_pools;
> + struct rte_eth_dev_info dev_info;
> +
> + rte_eth_dev_info_get(port_id, &dev_info);
> + dev->data->dev_conf.rxmode.mq_mode = ETH_MQ_RX_VMDQ_RSS;
> + if (nb_queue_pools == ETH_32_POOLS || nb_queue_pools == 
> ETH_64_POOLS)
> + RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool =
> + dev_info.max_rx_queues/nb_queue_pools;
> + else {
> + PMD_DEBUG_TRACE("ethdev port_id=%d VMDQ "
> + "nb_queue_pools=%d invalid "
> + "in VMDQ RSS\n"

Is a "," missing here, after the format string?

Thanks,
Michael

> + port_id,
> + nb_queue_pools);
> + return -EINVAL;
> + }
> +
> + if (rte_eth_dev_check_vmdq_rss_rxq_num(port_id,
> + RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool) != 0) {
> + PMD_DEBUG_TRACE("ethdev port_id=%d"
> + " SRIOV active, invalid queue"
> + " number for VMDQ RSS, allowed"
> + " value are 1, 2 or 4\n",
> + port_id);
> + return -EINVAL;
> + }
> + }
>   }
>   return 0;
>  }
>
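
The review comment above concerns how C treats adjacent string literals: they are concatenated into one format string, and a comma must separate that string from the first argument. A minimal standalone sketch follows; format_pool_error() is a hypothetical helper that uses snprintf() in place of PMD_DEBUG_TRACE, and %u is used for the unsigned pool count.

```c
#include <assert.h>
#include <stdio.h>

/* Format the "invalid pool count" message into buf; returns its length. */
static int
format_pool_error(char *buf, size_t len, int port_id,
		  unsigned int nb_queue_pools)
{
	assert(buf != NULL && len > 0);

	/*
	 * The three literals merge into a single format string. The comma
	 * after the last literal is what separates it from the arguments;
	 * leaving it out is a syntax error.
	 */
	return snprintf(buf, len,
			"ethdev port_id=%d VMDQ "
			"nb_queue_pools=%u invalid "
			"in VMDQ RSS\n",
			port_id,
			nb_queue_pools);
}
```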



[dpdk-dev] i40e and RSS woes

2015-08-24 Thread Vlad Zolotarov


On 03/05/15 07:56, Zhang, Helin wrote:
> Hi Gleb
>
> Sorry for late! I am struggling on my tasks for the following DPDK release 
> these days.
>
>> -Original Message-
>> From: Gleb Natapov [mailto:gleb at cloudius-systems.com]
>> Sent: Monday, March 2, 2015 8:56 PM
>> To: dev at dpdk.org
>> Cc: Zhang, Helin
>> Subject: Re: i40e and RSS woes
>>
>> Ping.
>>
>> On Thu, Feb 19, 2015 at 04:50:10PM +0200, Gleb Natapov wrote:
>>> CCing i40e driver author in a hope to get an answer.
>>>
>>> On Mon, Feb 16, 2015 at 03:36:54PM +0200, Gleb Natapov wrote:
>>>> I have an application that works reasonably well with ixgbe driver,
>>>> but when I try to use it with i40e I encounter various RSS related issues.
>>>>
>>>> First one is that for some reason i40e, when it builds default reta
>>>> table, round down number of queues to power of two. Why is this? If
> It seems because of i40e queue configuration. We will check it more and see
> if it can be changed or improved later.

Helin, hi!
Sorry for bringing this back, but it seems that the RSS queue count issue
(rounding it down to the nearest power of 2) still hasn't been addressed
in the master branch.

Could you please clarify what the "i40e queue configuration" is that
requires the alignment you are referring to above?

From what I could see, the "num" parameter is not propagated outside
i40e_pf_config_rss() in any form except for the RSS table contents.
This means that any code that needs to know the number of Rx queues
would use dev_data->nb_rx_queues (e.g. i40e_dev_rx_init()) and would have
no way of knowing that i40e_pf_config_rss() used something different,
except by scanning the RSS table in HW, which is of course not an option.

Therefore, at first glance it seems that this rounding may be safely
removed, unless I've missed something.

Please comment.

thanks,
vlad

>
>>>> I configure reta by my own using all of the queues everything seams
>>>> to be working. To add insult to injury I do not get any errors
>>>> during configuration some queues just do not receive any traffic.
>>>>
>>>> The second problem is that for some reason i40e does not use 40 byte
>>>> toeplitz hash key like any other driver, but it expects the key to
>>>> be 52 bytes. And it would have being fine (if we ignore the fact
>>>> that it contradicts MS spec), but how my high level code suppose to know
>> that?
> Actually a rss_key_len was introduced in struct rte_eth_rss_conf recently. So 
> the
> length should be indicated clearly. But I found the annotations of that 
> structure
> should have been reworked. I will try to rework it with clear descriptions.
>
>>>> And again, device configuration does not fail when wrong key length
>>>> is provided, it just uses some other key. Guys this kind of error
>>>> handling is completely unacceptable.
> If less length of key is provided, it will not be used at all, the default 
> key will be used.
> So there is no issue as you said. But we need to add more clear description 
> for the
> structure of rte_eth_rss_conf.
>
> Thank you very much for the good comments!
>
> Regards,
> Helin
>
>>>> The last one is more of a question. Why interface to change RSS hash
>>>> function (XOR or toeplitz) is part of a filter configuration and not
>>>> rss config?
>>>>
>>>> --
>>>> Gleb.
>>> --
>>> Gleb.
>> --
>>  Gleb.
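
The effect described in this thread (some queues receive no traffic when the default redirection table is built from a rounded-down queue count) can be reproduced with a small model. This is only a sketch under stated assumptions: the table is filled round-robin, its size of 128 entries in the usage note is arbitrary, and round_down_pow2(), fill_reta() and count_idle_queues() are hypothetical helpers, not i40e driver functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Largest power of two that is <= n (n must be non-zero). */
static uint16_t
round_down_pow2(uint16_t n)
{
	uint16_t p = 1;

	assert(n > 0);
	while ((uint32_t)p * 2 <= n)
		p = (uint16_t)(p * 2);
	return p;
}

/* Fill a redirection table round-robin over nb_queues queues. */
static void
fill_reta(uint16_t *reta, size_t reta_size, uint16_t nb_queues)
{
	assert(reta != NULL && nb_queues > 0);
	for (size_t i = 0; i < reta_size; i++)
		reta[i] = (uint16_t)(i % nb_queues);
}

/* Count queues in [0, nb_queues) that never appear in the table. */
static unsigned int
count_idle_queues(const uint16_t *reta, size_t reta_size, uint16_t nb_queues)
{
	unsigned int idle = 0;

	assert(reta != NULL);
	for (uint16_t q = 0; q < nb_queues; q++) {
		int used = 0;

		for (size_t i = 0; i < reta_size && !used; i++)
			used = (reta[i] == q);
		if (!used)
			idle++;
	}
	return idle;
}
```

For example, with 6 configured Rx queues the rounded count is 4, so a table filled with fill_reta(reta, 128, round_down_pow2(6)) never references queues 4 and 5, and count_idle_queues(reta, 128, 6) reports 2 idle queues. Filling the table with all 6 queues leaves none idle.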



[dpdk-dev] i40e and RSS woes

2015-08-24 Thread Vlad Zolotarov


On 08/24/15 14:13, Vlad Zolotarov wrote:
>
>
> On 03/05/15 07:56, Zhang, Helin wrote:
>> Hi Gleb
>>
>> Sorry for late! I am struggling on my tasks for the following DPDK 
>> release these days.
>>
>>> -Original Message-
>>> From: Gleb Natapov [mailto:gleb at cloudius-systems.com]
>>> Sent: Monday, March 2, 2015 8:56 PM
>>> To: dev at dpdk.org
>>> Cc: Zhang, Helin
>>> Subject: Re: i40e and RSS woes
>>>
>>> Ping.
>>>
>>> On Thu, Feb 19, 2015 at 04:50:10PM +0200, Gleb Natapov wrote:
>>>> CCing i40e driver author in a hope to get an answer.
>>>>
>>>> On Mon, Feb 16, 2015 at 03:36:54PM +0200, Gleb Natapov wrote:
>>>>> I have an application that works reasonably well with ixgbe driver,
>>>>> but when I try to use it with i40e I encounter various RSS related
>>>>> issues.
>>>>>
>>>>> First one is that for some reason i40e, when it builds default reta
>>>>> table, round down number of queues to power of two. Why is this? If
>> It seems because of i40e queue configuration. We will check it more 
>> and see
>> if it can be changed or improved later.
>
> Helin, hi!
> Sorry for bringing it back but it seems that the RSS queues number 
> issue (rounding it down to the nearest power of 2)
> still hasn't been addressed in the master branch.
>
> Could u, pls., clarify what is that "i40e queue configuration" that 
> requires this alignment u are referring above?
>
> From what i could see "num" parameter is not propagated outside the 
> i40e_pf_config_rss() in any form except for RSS table contents.
> This means that any code that would need to know the number of Rx 
> queues would use the dev_data->nb_rx_queues (e.g. i40e_dev_rx_init())
> and wouldn't be able to know that i40e_pf_config_rss() something 
> different except for scanning the RSS table in HW which is of course 
> not an option.
>
> Therefore, from the first look it seems that this rounding may be 
> safely removed unless I've missed something.
>
> Pls., comment.

Have just noticed this:

/* Each of below queue pairs should be power of 2 since it's the
   precondition after TC configuration applied */
uint16_t lan_nb_qps; /* The number of queue pairs of LAN */

I still couldn't find any justification in the HW spec for either
requiring the above or requiring any correlation between the number of
Tx queues (whatever it is) and the number of Rx queues. It seems that the
spec implies that Rx and Tx configuration are completely orthogonal (as
they should be).

Could you please clarify how TC configuration imposes a requirement that
the number of Rx queues be a power of 2?

thanks in advance,
vlad

>
> thanks,
> vlad
>
>>
>>>>> I configure reta by my own using all of the queues everything seams
>>>>> to be working. To add insult to injury I do not get any errors
>>>>> during configuration some queues just do not receive any traffic.
>>>>>
>>>>> The second problem is that for some reason i40e does not use 40 byte
>>>>> toeplitz hash key like any other driver, but it expects the key to
>>>>> be 52 bytes. And it would have being fine (if we ignore the fact
>>>>> that it contradicts MS spec), but how my high level code suppose
>>>>> to know
>>> that?
>> Actually a rss_key_len was introduced in struct rte_eth_rss_conf 
>> recently. So the
>> length should be indicated clearly. But I found the annotations of 
>> that structure
>> should have been reworked. I will try to rework it with clear 
>> descriptions.
>>
>>>>> And again, device configuration does not fail when wrong key length
>>>>> is provided, it just uses some other key. Guys this kind of error
>>>>> handling is completely unacceptable.
>> If less length of key is provided, it will not be used at all, the 
>> default key will be used.
>> So there is no issue as you said. But we need to add more clear 
>> description for the
>> structure of rte_eth_rss_conf.
>>
>> Thank you very much for the good comments!
>>
>> Regards,
>> Helin
>>
>>>>> The last one is more of a question. Why interface to change RSS hash
>>>>> function (XOR or toeplitz) is part of a filter configuration and not
>>>>> rss config?
>>>>>
>>>>> --
>>>>> Gleb.
>>>> --
>>>> Gleb.
>>> -- 
>>> Gleb.
>
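
One of the complaints quoted above is that a key of the wrong length is silently replaced by a default key. The kind of explicit check being asked for could look like the sketch below. It is purely illustrative: check_rss_key() is a hypothetical helper and not a DPDK API, and the 40 and 52 byte lengths in the usage note are the ones mentioned in the thread.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Validate a caller-supplied RSS hash key against the length the device
 * expects. Returns 0 if the key is usable (or absent), -EINVAL otherwise.
 */
static int
check_rss_key(const uint8_t *key, size_t key_len, size_t hw_key_len)
{
	assert(hw_key_len > 0);

	if (key == NULL)
		return 0;	/* no key given: caller falls back to a default */
	if (key_len != hw_key_len)
		return -EINVAL;	/* report the mismatch instead of ignoring the key */
	return 0;
}
```

With a device that expects 52 bytes, check_rss_key(key, 40, 52) returns -EINVAL, so the application learns about the mismatch at configuration time.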



[dpdk-dev] [ovs-dev] OVS-DPDK performance problem on ixgbe vector PMD

2015-08-24 Thread Traynor, Kevin

> -Original Message-
> From: dev [mailto:dev-bounces at openvswitch.org] On Behalf Of Zoltan Kiss
> Sent: Friday, August 21, 2015 7:05 PM
> To: dev at dpdk.org; dev at openvswitch.org
> Cc: Richardson, Bruce; Ananyev, Konstantin
> Subject: [ovs-dev] OVS-DPDK performance problem on ixgbe vector PMD
> 
> Hi,
> 
> I've set up a simple packet forwarding perf test on a dual-port 10G
> 82599ES: one port receives 64 byte UDP packets, the other sends it out,
> one core used. I've used latest OVS with DPDK 2.1, and the first result
> was only 13.2 Mpps, which was a bit far from the 13.9 I've seen last
> year with the same test. The first thing I've changed was to revert back
> to the old behaviour about this issue:
> 
> http://permalink.gmane.org/gmane.comp.networking.dpdk.devel/22731
> 
> So instead of the new default I've passed 2048 + RTE_PKTMBUF_HEADROOM.
> That increased the performance to 13.5, but to figure out what's wrong
> started to play with the receive functions. First I've disabled vector
> PMD, but ixgbe_recv_pkts_bulk_alloc() was even worse, only 12.5 Mpps. So
> then I've enabled scattered RX, and with
> ixgbe_recv_pkts_lro_bulk_alloc() I could manage to get 13.98 Mpps, which
> is I guess as close as possible to the 14.2 line rate (on my HW at
> least, with one core)
> Does anyone has a good explanation about why the vector PMD performs so
> significantly worse? I would expect that on a 3.2 GHz i5-4570 one core
> should be able to reach ~14 Mpps, SG and vector PMD shouldn't make a
> difference.

I've previously turned vectorisation on and off and found that for Tx it
makes a significant difference. For Rx it didn't make much of a difference,
but the Rx bulk allocation that gets enabled with it did improve performance.

Is there something else also running on the current pmd core? Did you
try moving it to another? Also, did you compile OVS with -O3/-Ofast? They
tend to give a performance boost.

Are you hitting 3.2 GHz for the core with the pmd? I think that is only
with turbo boost, so it may not be achievable all the time.

> I've tried to look into it with oprofile, but the results were quite
> strange: 35% of the samples were from miniflow_extract, the part where
> parse_vlan calls data_pull to jump after the MAC addresses. The oprofile
> snippet (1M samples):
> 
>511454     19     0.0037  flow.c:511
>511458    149     0.0292  dp-packet.h:266
>51145f   4264     0.8357  dp-packet.h:267
>511466     18     0.0035  dp-packet.h:268
>51146d     43     0.0084  dp-packet.h:269
>511474    172     0.0337  flow.c:511
>51147a   4320     0.8467  string3.h:51
>51147e 358763    70.3176  flow.c:99
>511482      2    3.9e-04  string3.h:51
>511485   3060     0.5998  string3.h:51
>511488   1693     0.3318  string3.h:51
>51148c   2933     0.5749  flow.c:326
>511491     47     0.0092  flow.c:326
> 
> And the corresponding disassembled code:
> 
>511454:   49 83 f9 0d             cmp    r9,0xd
>511458:   c6 83 81 00 00 00 00    mov    BYTE PTR [rbx+0x81],0x0
>51145f:   66 89 83 82 00 00 00    mov    WORD PTR [rbx+0x82],ax
>511466:   66 89 93 84 00 00 00    mov    WORD PTR [rbx+0x84],dx
>51146d:   66 89 8b 86 00 00 00    mov    WORD PTR [rbx+0x86],cx
>511474:   0f 86 af 01 00 00       jbe    511629
> 
>51147a:   48 8b 45 00             mov    rax,QWORD PTR [rbp+0x0]
>51147e:   4c 8d 5d 0c             lea    r11,[rbp+0xc]
>511482:   49 89 00                mov    QWORD PTR [r8],rax
>511485:   8b 45 08                mov    eax,DWORD PTR [rbp+0x8]
>511488:   41 89 40 08             mov    DWORD PTR [r8+0x8],eax
>51148c:   44 0f b7 55 0c          movzx  r10d,WORD PTR [rbp+0xc]
>511491:   66 41 81 fa 81 00       cmp    r10w,0x81
> 
> My only explanation to this so far is that I misunderstand something
> about the oprofile results.
> 
> Regards,
> 
> Zoltan
> ___
> dev mailing list
> dev at openvswitch.org
> http://openvswitch.org/mailman/listinfo/dev
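
For reference when reading the Mpps figures in this thread: the theoretical packet rate of an Ethernet link follows from the frame size plus 20 bytes of per-frame wire overhead (preamble, start-of-frame delimiter and inter-frame gap). The sketch below works that out and also gives the average cycle budget per packet. It is plain arithmetic, not a measurement; line_rate_pps() and cycles_per_packet() are hypothetical helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Per-frame wire overhead: 7-byte preamble + 1-byte SFD + 12-byte gap. */
#define ETH_WIRE_OVERHEAD 20u

/* Theoretical frames per second for a link speed and frame size. */
static uint64_t
line_rate_pps(uint64_t link_bps, uint32_t frame_bytes)
{
	assert(frame_bytes > 0);
	return link_bps / ((uint64_t)(frame_bytes + ETH_WIRE_OVERHEAD) * 8u);
}

/* Average CPU cycles available per packet at a given packet rate. */
static uint64_t
cycles_per_packet(uint64_t cpu_hz, uint64_t pps)
{
	assert(pps > 0);
	return cpu_hz / pps;
}
```

For 64-byte frames on a 10 Gbit/s link this gives 14,880,952 packets per second, i.e. about 14.88 Mpps. At that rate, a core running at the 3.2 GHz mentioned above has roughly 215 cycles per packet. The 14.2 Mpps quoted in the mail is the poster's own figure for his hardware.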


[dpdk-dev] Why the offloads of the guest's virtio-net network adapter are disabled when vhost-user is used?

2015-08-24 Thread leo zhu
Hi all,

I am running the vhost sample application on my server.

Following dpdk-sample-applications-user-guide.pdf, I run the virtual
machine with vhost-user enabled, using the following command:

qemu-system-x86_64 /root/leo/ubuntu-1.img -enable-kvm -m 1024 -vnc :5 \
  -chardev socket,id=char1,path=/root/leo/dpdk-2.0.0/examples/vhost/usvhost \
  -netdev type=vhost-user,id=mynet1,chardev=char1,vhostforce \
  -device virtio-net-pci,mac=00:00:00:00:00:01,netdev=mynet1 \
  -object memory-backend-file,id=mem,size=1024M,mem-path=/dev/hugepages,share=on \
  -numa node,memdev=mem -mem-prealloc

After the virtual machine starts, I find that the offloads of its
virtio-net network adapter are all disabled. I checked the offload status
with the command ethtool -k eth0. I tried to enable the offloads with
ethtool, but it does not work.

My questions are:

1. Can the offloads of the guest's virtio-net network adapter be
enabled when vhost-user is used?

2. If the offloads can't be enabled when vhost-user is used, what is the reason?

It would be great if someone on the list could provide answers or pointers.

Thanks.
Leo


[dpdk-dev] vSwitch Performance Comparison for NFV Use Case

2015-08-24 Thread Traynor, Kevin

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jun Xiao
> Sent: Friday, August 21, 2015 8:18 PM
> To: Gray, Mark D
> Cc: dev
> Subject: [dpdk-dev] vSwitch Performance Comparison for NFV Use Case
> 
> Hi Mark,
> Last time we discussed methodologies for vSwitch performance comparison, and
> the performance data we published was more for typical TCP-based applications
> in virtualized data centers. Today we shared more data for small-packet
> traffic at
> http://cloudnetengine.com/en/blog/2015/08/21/vswitch-performance-comparison-nfv-use-case
> and the performance gets much closer (around 10-20%) between OVS-DPDK and
> CNE vSwitch, as the tests do plain forwarding without any other features.
> 
> On the other hand, it's really hard to find any public performance data for
> OVS-DPDK under pNIC -> vSwitch -> VM -> vSwitch -> pNIC case. What I observed
> is that OVS-DPDK can have generally less than 3 MPPS on my setup (vhost user
> is used instead of IVSHMEM), don't know if the data are aligned with what you
> have?

That seems reasonable enough (maybe a little low) considering you are on a
2.4 GHz CPU and are using one logical core, so the pmd will be sharing a
physical core. As you said, you will get greater performance if you add
another pmd. You will also get better performance if you set mrg_rxbuf=off.

Did you affinitize the pmd to an empty core, and also the packet-forwarding
qemu thread, to make sure they get the cycles they need?

> Thanks,
> Jun
> www.cloudnetengine.com


[dpdk-dev] i40e and RSS woes

2015-08-24 Thread Zhang, Helin


> -Original Message-
> From: Vlad Zolotarov [mailto:vladz at cloudius-systems.com]
> Sent: Monday, August 24, 2015 4:14 AM
> To: Zhang, Helin; Gleb Natapov; dev at dpdk.org
> Subject: Re: [dpdk-dev] i40e and RSS woes
> 
> 
> 
> On 03/05/15 07:56, Zhang, Helin wrote:
> > Hi Gleb
> >
> > Sorry for late! I am struggling on my tasks for the following DPDK release
> > these days.
> >
> >> -Original Message-
> >> From: Gleb Natapov [mailto:gleb at cloudius-systems.com]
> >> Sent: Monday, March 2, 2015 8:56 PM
> >> To: dev at dpdk.org
> >> Cc: Zhang, Helin
> >> Subject: Re: i40e and RSS woes
> >>
> >> Ping.
> >>
> >> On Thu, Feb 19, 2015 at 04:50:10PM +0200, Gleb Natapov wrote:
> >>> CCing i40e driver author in a hope to get an answer.
> >>>
> >>> On Mon, Feb 16, 2015 at 03:36:54PM +0200, Gleb Natapov wrote:
>  I have an application that works reasonably well with ixgbe driver,
>  but when I try to use it with i40e I encounter various RSS related 
>  issues.
> 
>  First one is that for some reason i40e, when it builds default reta
>  table, round down number of queues to power of two. Why is this? If
> > It seems because of i40e queue configuration. We will check it more
> > and see if it can be changed or improved later.
> 
> Helin, hi!
> Sorry for bringing it back but it seems that the RSS queues number issue
> (rounding it down to the nearest power of 2) still hasn't been addressed in
> the master branch.
> 
> Could u, pls., clarify what is that "i40e queue configuration" that requires
> this alignment u are referring above?
> 
>  From what i could see "num" parameter is not propagated outside the
> i40e_pf_config_rss() in any form except for RSS table contents.
> This means that any code that would need to know the number of Rx queues
> would use the dev_data->nb_rx_queues (e.g. i40e_dev_rx_init()) and wouldn't
> be able to know that i40e_pf_config_rss() something different except for
> scanning the RSS table in HW which is of course not an option.
> 
> Therefore, from the first look it seems that this rounding may be safely
> removed unless I've missed something.
Could you refer to the 'Hash Filter' and 'Receive Queue Regions' sections of
the data sheet? It says that '1, 2, 4, 8, 16, 32, 64' are the supported queue
region sizes.
Yes, we should support more than 64 queues per port, but for RSS it should be
one of '1, 2, 4, 8, 16, 32, 64'.
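
To make the effect of the rounding under discussion concrete, here is a
minimal standalone sketch. It is not the i40e driver code; the function names
round_down_pow2() and idle_queues() are made up for this illustration. It
only shows that if a default table is built over the rounded-down queue
count, the remaining configured queues never appear in it.

```c
#include <assert.h>
#include <stdint.h>

/* Largest power of two that is <= n. n must be non-zero. */
uint16_t round_down_pow2(uint16_t n)
{
    uint16_t p = 1;

    assert(n != 0);
    /* Widen before doubling so the comparison cannot overflow. */
    while ((uint32_t)p * 2 <= n)
        p *= 2;
    return p;
}

/*
 * Number of configured Rx queues that would never be referenced by a
 * default table built over round_down_pow2(nb_rx_queues) queues.
 */
uint16_t idle_queues(uint16_t nb_rx_queues)
{
    return (uint16_t)(nb_rx_queues - round_down_pow2(nb_rx_queues));
}
```

For example, with 6 configured Rx queues the rounded count is 4, so under
this assumption queues 4 and 5 would receive no RSS traffic.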

Thanks,
Helin

> 
> Pls., comment.
> 
> thanks,
> vlad
> 
> >
>  I configure reta by my own using all of the queues everything seams
>  to be working. To add insult to injury I do not get any errors
>  during configuration some queues just do not receive any traffic.
> 
>  The second problem is that for some reason i40e does not use 40
>  byte toeplitz hash key like any other driver, but it expects the
>  key to be 52 bytes. And it would have being fine (if we ignore the
>  fact that it contradicts MS spec), but how my high level code
>  suppose to know
> >> that?
> > Actually a rss_key_len was introduced in struct rte_eth_rss_conf
> > recently. So the length should be indicated clearly. But I found the
> > annotations of that structure should have been reworked. I will try to
> > rework it with clear descriptions.
> >
>  And again, device configuration does not fail when wrong key length
>  is provided, it just uses some other key. Guys this kind of error
>  handling is completely unacceptable.
> > If less length of key is provided, it will not be used at all, the default
> > key will be used.
> > So there is no issue as you said. But we need to add more clear
> > description for the structure of rte_eth_rss_conf.
> >
> > Thank you very much for the good comments!
> >
> > Regards,
> > Helin
> >
>  The last one is more of a question. Why interface to change RSS
>  hash function (XOR or toeplitz) is part of a filter configuration
>  and not rss config?
> 
>  --
>   Gleb.
> >>> --
> >>>   Gleb.
> >> --
> >>Gleb.



[dpdk-dev] i40e and RSS woes

2015-08-24 Thread Vlad Zolotarov


On 08/24/15 20:51, Zhang, Helin wrote:
>
>> -Original Message-
>> From: Vlad Zolotarov [mailto:vladz at cloudius-systems.com]
>> Sent: Monday, August 24, 2015 4:14 AM
>> To: Zhang, Helin; Gleb Natapov; dev at dpdk.org
>> Subject: Re: [dpdk-dev] i40e and RSS woes
>>
>>
>>
>> On 03/05/15 07:56, Zhang, Helin wrote:
>>> Hi Gleb
>>>
>>> Sorry for late! I am struggling on my tasks for the following DPDK release
>>> these days.
 -Original Message-
 From: Gleb Natapov [mailto:gleb at cloudius-systems.com]
 Sent: Monday, March 2, 2015 8:56 PM
 To: dev at dpdk.org
 Cc: Zhang, Helin
 Subject: Re: i40e and RSS woes

 Ping.

 On Thu, Feb 19, 2015 at 04:50:10PM +0200, Gleb Natapov wrote:
> CCing i40e driver author in a hope to get an answer.
>
> On Mon, Feb 16, 2015 at 03:36:54PM +0200, Gleb Natapov wrote:
>> I have an application that works reasonably well with ixgbe driver,
>> but when I try to use it with i40e I encounter various RSS related 
>> issues.
>>
>> First one is that for some reason i40e, when it builds default reta
>> table, round down number of queues to power of two. Why is this? If
>>> It seems because of i40e queue configuration. We will check it more
>>> and see if it can be changed or improved later.
>> Helin, hi!
>> Sorry for bringing it back but it seems that the RSS queues number issue
>> (rounding it down to the nearest power of 2) still hasn't been addressed in
>> the master branch.
>>
>> Could u, pls., clarify what is that "i40e queue configuration" that requires
>> this alignment u are referring above?
>>
>>   From what i could see "num" parameter is not propagated outside the
>> i40e_pf_config_rss() in any form except for RSS table contents.
>> This means that any code that would need to know the number of Rx queues
>> would use the dev_data->nb_rx_queues (e.g. i40e_dev_rx_init()) and wouldn't
>> be able to know that i40e_pf_config_rss() something different except for
>> scanning the RSS table in HW which is of course not an option.
>>
>> Therefore, from the first look it seems that this rounding may be safely
>> removed unless I've missed something.
> Could you help to refer to the data sheet of 'Hash Filter', 'Receive Queue
> Regions', it is said that '1, 2, 4, 8, 16, 32, 64' are the supported queue
> regions.
> Yes, we should support more than 64 queues per port, but for rss, it should
> be one of '1, 2, 4, 8, 16, 32, 64'.

"The VSIs support 8 regions of receive queues that are aimed mainly for
the TCs. The TC regions are defined per VSI by the VSIQF_TCREGION
register. The region sizes (defined by the TC_SIZE fields) can be any of
the following value: 1, 2, 4, 8, 16, 32, 64 as long as the total number of
queues do not exceed the VSI allocation. These regions starts at the
offset defined by the TC_OFFSET parameter. According to the region
size, the 'n' LS bits of the Queue Index from the LUT are enabled."

I think the above says that the region sizes may only be one of the
mentioned values.

AFAIU this doesn't mean that the number of RSS queues has to be the same;
it just may not exceed the region size.

Just as stated in the "Outcome Queue Index" definition, the final mapping
to the PF index space is done using the VSILAN_QTABLE or VSILAN_QBASE
registers (a.k.a. RSS indirection table).

For instance, if u have a region of size 8 u may configure 3 RSS queues
by setting the following RSS table:
0,1,2,0,1,2,0,1
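
The table above can be generated mechanically. The following is a minimal
standalone sketch in plain C, not code from the i40e driver or DPDK; the
names fill_reta_round_robin() and region_size_for() are hypothetical. The
first function spreads any number of queues round-robin over a table, and the
second picks the smallest region size from the data sheet's list
(1, 2, 4, ..., 64) that can hold a given number of queues.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fill a table of reta_size entries round-robin over nb_queues queues. */
void fill_reta_round_robin(uint16_t *reta, size_t reta_size,
                           uint16_t nb_queues)
{
    assert(nb_queues != 0);
    for (size_t i = 0; i < reta_size; i++)
        reta[i] = (uint16_t)(i % nb_queues);
}

/*
 * Smallest region size from {1, 2, 4, 8, 16, 32, 64} that is >= nb_queues.
 * Returns 0 if nb_queues is larger than 64.
 */
uint16_t region_size_for(uint16_t nb_queues)
{
    for (uint16_t size = 1; size <= 64; size *= 2)
        if (size >= nb_queues)
            return size;
    return 0;
}
```

Calling fill_reta_round_robin(reta, 8, 3) on an 8-entry table produces
0,1,2,0,1,2,0,1. Note that region_size_for(3) returns 4, so a region of
size 8 is larger than strictly needed for 3 queues; both satisfy the
"may not exceed" reading above.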

>
> Thanks,
> Helin
>
>> Pls., comment.
>>
>> thanks,
>> vlad
>>
>> I configure reta by my own using all of the queues everything seams
>> to be working. To add insult to injury I do not get any errors
>> during configuration some queues just do not receive any traffic.
>>
>> The second problem is that for some reason i40e does not use 40
>> byte toeplitz hash key like any other driver, but it expects the
>> key to be 52 bytes. And it would have being fine (if we ignore the
>> fact that it contradicts MS spec), but how my high level code
>> suppose to know
 that?
>>> Actually a rss_key_len was introduced in struct rte_eth_rss_conf
>>> recently. So the length should be indicated clearly. But I found the
>>> annotations of that structure should have been reworked. I will try to
>>> rework it with clear descriptions.
>> And again, device configuration does not fail when wrong key length
>> is provided, it just uses some other key. Guys this kind of error
>> handling is completely unacceptable.
>>> If less length of key is provided, it will not be used at all, the default
>>> key will be used.
>>> So there is no issue as you said. But we need to add more clear
>>> description for the structure of rte_eth_rss_conf.
>>>
>>> Thank you very much for the good comments!
>>>
>>> Regards,
>>> Helin
>>>
>> The last one is more of a question. Why interface to change RSS
>> hash function (XOR or toeplitz) is part of a filter con

[dpdk-dev] i40e and RSS woes

2015-08-24 Thread Zhang, Helin


> -Original Message-
> From: Vlad Zolotarov [mailto:vladz at cloudius-systems.com]
> Sent: Monday, August 24, 2015 11:26 AM
> To: Zhang, Helin; Gleb Natapov
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] i40e and RSS woes
> 
> 
> 
> On 08/24/15 20:51, Zhang, Helin wrote:
> >
> >> -Original Message-
> >> From: Vlad Zolotarov [mailto:vladz at cloudius-systems.com]
> >> Sent: Monday, August 24, 2015 4:14 AM
> >> To: Zhang, Helin; Gleb Natapov; dev at dpdk.org
> >> Subject: Re: [dpdk-dev] i40e and RSS woes
> >>
> >>
> >>
> >> On 03/05/15 07:56, Zhang, Helin wrote:
> >>> Hi Gleb
> >>>
> >>> Sorry for late! I am struggling on my tasks for the following DPDK
> >>> release these days.
>  -Original Message-
>  From: Gleb Natapov [mailto:gleb at cloudius-systems.com]
>  Sent: Monday, March 2, 2015 8:56 PM
>  To: dev at dpdk.org
>  Cc: Zhang, Helin
>  Subject: Re: i40e and RSS woes
> 
>  Ping.
> 
>  On Thu, Feb 19, 2015 at 04:50:10PM +0200, Gleb Natapov wrote:
> > CCing i40e driver author in a hope to get an answer.
> >
> > On Mon, Feb 16, 2015 at 03:36:54PM +0200, Gleb Natapov wrote:
> >> I have an application that works reasonably well with ixgbe
> >> driver, but when I try to use it with i40e I encounter various RSS
> >> related issues.
> >>
> >> First one is that for some reason i40e, when it builds default
> >> reta table, round down number of queues to power of two. Why is
> >> this? If
> >>> It seems because of i40e queue configuration. We will check it more
> >>> and see if it can be changed or improved later.
> >> Helin, hi!
> >> Sorry for bringing it back but it seems that the RSS queues number
> >> issue (rounding it down to the nearest power of 2) still hasn't been
> >> addressed in the master branch.
> >>
> >> Could u, pls., clarify what is that "i40e queue configuration" that
> >> requires this alignment u are referring above?
> >>
> >>   From what i could see "num" parameter is not propagated outside the
> >> i40e_pf_config_rss() in any form except for RSS table contents.
> >> This means that any code that would need to know the number of Rx
> >> queues would use the dev_data->nb_rx_queues (e.g. i40e_dev_rx_init())
> >> and wouldn't be able to know that i40e_pf_config_rss() something
> >> different except for scanning the RSS table in HW which is of course not
> >> an option.
> >>
> >> Therefore, from the first look it seems that this rounding may be
> >> safely removed unless I've missed something.
> > Could you help to refer to the data sheet of 'Hash Filter', 'Receive
> > Queue Regions', it is said that '1, 2, 4, 8, 16, 32, 64' are the supported
> > queue regions.
> > Yes, we should support more than 64 queues per port, but for rss, it
> > should be one of '1, 2, 4, 8, 16, 32, 64'.
> 
> "The VSIs support 8 regions of receive queues that are aimed mainly for the
> TCs. The TC regions are defined per VSI by the VSIQF_TCREGION register. The
> region sizes (defined by the TC_SIZE fields) can be any of the following
> value: 1, 2, 4, 8, 16, 32, 64 as long as the total number of queues do not
> exceed the VSI allocation. These regions starts at the offset defined by the
> TC_OFFSET parameter. According to the region size, the 'n' LS bits of the
> Queue Index from the LUT are enabled."
> 
> I think the above says that the region sizes may only be one of the mentioned
> values.
> 
> AFAIU this doesn't mean that the number of RSS queues has to be the same
> - it may not exceed it.
> 
> Just like it's stated in the "Outcome Queue Index" definition the final
> mapping to the PF index space is done using the VSILAN_QTABLE or
> VSILAN_QBASE registers (a.k.a. RSS indirection table).
> 
> For instance if u have a region of size 8 u may configure 3 RSS queues by
> setting the following RSS table:
> 0,1,2,0,1,2,0,1
I tend to agree with you. Anyway, I am working on supporting more than 64
queues per port, and I will take this into account. If there are no other
strong reasons against it, I will change it. Thank you very much!

Regards,
Helin

> 
> >
> > Thanks,
> > Helin
> >
> >> Pls., comment.
> >>
> >> thanks,
> >> vlad
> >>
> >> I configure reta by my own using all of the queues everything
> >> seams to be working. To add insult to injury I do not get any
> >> errors during configuration some queues just do not receive any
> >> traffic.
> >>
> >> The second problem is that for some reason i40e does not use 40
> >> byte toeplitz hash key like any other driver, but it expects the
> >> key to be 52 bytes. And it would have being fine (if we ignore
> >> the fact that it contradicts MS spec), but how my high level code
> >> suppose to know
>  that?
> >>> Actually a rss_key_len was introduced in struct rte_eth_rss_conf
> >>> recently. So the length should be indicated clearly. But I found the
> >>> annotations of that structure should have been reworked. I will t

[dpdk-dev] working example commands for ethertype/flow_director_filter ?

2015-08-24 Thread Navneet Rao
testpmd>  mac_addr add 0 00:10:E0:3B:3B:50

testpmd> set promisc all on

testpmd>  ethertype_filter 0 add mac_addr 00:10:E0:3B:3B:50  ethertype 0x0806 fwd queue 1

ethertype filter programming error: (Invalid argument)

testpmd>  ethertype_filter 0 add mac_addr 00:10:E0:3B:3B:50 ethertype 0x0806 fwd queue 1

ethertype filter programming error: (Invalid argument)

testpmd>  ethertype_filter 0 add mac_ignr ethertype 0x0806 fwd queue 1

Bad arguments

Has anyone been able to get this to work?

All I want is to steer the traffic on port 0 to some other queue (instead
of the default queue 0), and I want to filter on the MAC address, so I am
using the ethertype_filter.

Thanks

-Navneet

-Original Message-
From: Navneet Rao 
Sent: Friday, August 21, 2015 2:55 PM
To: dev at dpdk.org
Subject: [dpdk-dev] working example commands for ethertype/flow_director_filter ?

Hello:

If anybody has any working example commands for ethertype or
flow_director_filter, can you please send them across?

I am using the testpmd app, and it is constantly reporting "bad-arguments"
even for the legal commands in the doc!

Thanks

-Navneet


[dpdk-dev] i40e and RSS woes

2015-08-24 Thread Vladislav Zolotarov
On Aug 24, 2015 21:54, "Zhang, Helin"  wrote:
>
>
>
> > -Original Message-
> > From: Vlad Zolotarov [mailto:vladz at cloudius-systems.com]
> > Sent: Monday, August 24, 2015 11:26 AM
> > To: Zhang, Helin; Gleb Natapov
> > Cc: dev at dpdk.org
> > Subject: Re: [dpdk-dev] i40e and RSS woes
> >
> >
> >
> > On 08/24/15 20:51, Zhang, Helin wrote:
> > >
> > >> -Original Message-
> > >> From: Vlad Zolotarov [mailto:vladz at cloudius-systems.com]
> > >> Sent: Monday, August 24, 2015 4:14 AM
> > >> To: Zhang, Helin; Gleb Natapov; dev at dpdk.org
> > >> Subject: Re: [dpdk-dev] i40e and RSS woes
> > >>
> > >>
> > >>
> > >> On 03/05/15 07:56, Zhang, Helin wrote:
> > >>> Hi Gleb
> > >>>
> > >>> Sorry for late! I am struggling on my tasks for the following DPDK
> > >>> release these days.
> >  -Original Message-
> >  From: Gleb Natapov [mailto:gleb at cloudius-systems.com]
> >  Sent: Monday, March 2, 2015 8:56 PM
> >  To: dev at dpdk.org
> >  Cc: Zhang, Helin
> >  Subject: Re: i40e and RSS woes
> > 
> >  Ping.
> > 
> >  On Thu, Feb 19, 2015 at 04:50:10PM +0200, Gleb Natapov wrote:
> > > CCing i40e driver author in a hope to get an answer.
> > >
> > > On Mon, Feb 16, 2015 at 03:36:54PM +0200, Gleb Natapov wrote:
> > >> I have an application that works reasonably well with ixgbe
> > >> driver, but when I try to use it with i40e I encounter various
> > >> RSS related issues.
> > >>
> > >> First one is that for some reason i40e, when it builds default
> > >> reta table, round down number of queues to power of two. Why is
> > >> this? If
> > >>> It seems because of i40e queue configuration. We will check it more
> > >>> and see if it can be changed or improved later.
> > >> Helin, hi!
> > >> Sorry for bringing it back but it seems that the RSS queues number
> > >> issue (rounding it down to the nearest power of 2) still hasn't been
> > >> addressed in the master branch.
> > >>
> > >> Could u, pls., clarify what is that "i40e queue configuration" that
> > >> requires this alignment u are referring above?
> > >>
> > >>   From what i could see "num" parameter is not propagated outside the
> > >> i40e_pf_config_rss() in any form except for RSS table contents.
> > >> This means that any code that would need to know the number of Rx
> > >> queues would use the dev_data->nb_rx_queues (e.g. i40e_dev_rx_init())
> > >> and wouldn't be able to know that i40e_pf_config_rss() something
> > >> different except for scanning the RSS table in HW which is of course
> > >> not an option.
> > >>
> > >> Therefore, from the first look it seems that this rounding may be
> > >> safely removed unless I've missed something.
> > > Could you help to refer to the data sheet of 'Hash Filter', 'Receive
> > > Queue Regions', it is said that '1, 2, 4, 8, 16, 32, 64' are the
> > > supported queue regions.
> > > Yes, we should support more than 64 queues per port, but for rss, it
> > > should be one of '1, 2, 4, 8, 16, 32, 64'.
> >
> > "The VSIs support 8 regions of receive queues that are aimed mainly for
> > the TCs. The TC regions are defined per VSI by the VSIQF_TCREGION
> > register. The region sizes (defined by the TC_SIZE fields) can be any of
> > the following value: 1, 2, 4, 8, 16, 32, 64 as long as the total number
> > of queues do not exceed the VSI allocation. These regions starts at the
> > offset defined by the TC_OFFSET parameter. According to the region size,
> > the 'n' LS bits of the Queue Index from the LUT are enabled."
> >
> > I think the above says that the region sizes may only be one of the
> > mentioned values.
> >
> > AFAIU this doesn't mean that the number of RSS queues has to be the same
> > - it may not exceed it.
> >
> > Just like it's stated in the "Outcome Queue Index" definition the final
> > mapping to the PF index space is done using the VSILAN_QTABLE or
> > VSILAN_QBASE registers (a.k.a. RSS indirection table).
> >
> > For instance if u have a region of size 8 u may configure 3 RSS queues
> > by setting the following RSS table:
> > 0,1,2,0,1,2,0,1
> I tend to agree with you. Anyway, I am working on supporting more queues
> per port than 64, and I will take this into account. If not other strong
> reasons, I will change it. Thank you very much!

Great! Thanks a lot, Helin.

>
> Regards,
> Helin
>
> >
> > >
> > > Thanks,
> > > Helin
> > >
> > >> Pls., comment.
> > >>
> > >> thanks,
> > >> vlad
> > >>
> > >> I configure reta by my own using all of the queues everything
> > >> seams to be working. To add insult to injury I do not get any
> > >> errors during configuration some queues just do not receive any
> > >> traffic.
> > >>
> > >> The second problem is that for some reason i40e does not use 40
> > >> byte toeplitz hash key like any other driver, but it expects the
> > >> key to be 52 bytes. And it would have being fine (if we ignore
> > >> the fact that it contradicts MS spec), but how my high level code