[dpdk-dev] [PATCH v11 0/8] ethdev: 100G and link speed API refactoring

2016-03-25 Thread Xu, Qian Q
Marc
Test#1 is just a simple test: launch testpmd with these NIC ports.
./testpmd -c 0x3 -n 4 -- -i

Thanks
Qian

From: marc.sune at gmail.com [mailto:marc.s...@gmail.com] On Behalf Of Marc
Sent: Thursday, March 24, 2016 3:48 PM
To: Xu, Qian Q
Cc: Thomas Monjalon; Ananyev, Konstantin; Lu, Wenzhuo; Zhang, Helin; 
Richardson, Bruce; dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH v11 0/8] ethdev: 100G and link speed API 
refactoring



On 24 March 2016 at 07:21, Xu, Qian Q <qian.q.xu at intel.com> wrote:
Marc
I didn't quite get your point. I observed that after applying this patchset, 
no Intel NIC can be started; maybe something goes wrong when you check 
the duplex/autoneg value for different NICs. If we want to merge the patchset 
in RC2, we need to fix this. It may not be an easy job to finish in several days.

Is this test#1 one of the tests contained in the DPDK repository or is it an 
internal test?

Marc



Thanks
Qian

From: marc.sune at gmail.com [mailto:marc.sune at gmail.com] On Behalf Of Marc
Sent: Thursday, March 24, 2016 4:54 AM
To: Xu, Qian Q
Cc: Thomas Monjalon; Ananyev, Konstantin; Lu, Wenzhuo; Zhang, Helin; 
Richardson, Bruce; dev at dpdk.org

Subject: Re: [dpdk-dev] [PATCH v11 0/8] ethdev: 100G and link speed API 
refactoring

Qian,

On 23 March 2016 at 02:18, Xu, Qian Q <qian.q.xu at intel.com> wrote:
We have tested with Intel NICs and found that ports can't be started on any 
of them (ixgbe/i40e/igb/bonding); see the attached mail for more details. Please 
check and fix it.


Thanks
Qian

-----Original Message-----
From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Thomas Monjalon
Sent: Wednesday, March 23, 2016 3:59 AM
To: Ananyev, Konstantin; Lu, Wenzhuo; Zhang, Helin
Cc: marcdevel at gmail.com; Richardson, Bruce; 
dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH v11 0/8] ethdev: 100G and link speed API 
refactoring

2016-03-17 19:08, Thomas Monjalon:
> There are still too few tests and reviews, especially for
> autonegotiation with Intel devices (patch #6).
> I would not be surprised to see some bugs in this rework.

Any feedback about autoneg in e1000/ixgbe/i40e?
Has it been tested before its integration in RC2?

> The capabilities must be adapted per device. It can be improved in a
> separate patch.
>
> It will be integrated in 16.04-rc2.
> Please test and review shortly, thanks!


---------- Forwarded message ----------
From: "Xu, Qian Q" <qian.q...@intel.com>
To: "Cao, Waterman" <waterman.cao at intel.com>, "Glynn, Michael J" <michael.j.glynn at intel.com>
Cc: "Richardson, Bruce" <bruce.richardson at intel.com>, "Zhu, Heqing" <heqing.zhu at intel.com>, "O'Driscoll, Tim" <tim.odriscoll at intel.com>, "Mcnamara, John" <john.mcnamara at intel.com>, "Xu, HuilongX" <huilongx.xu at intel.com>, "Fu, JingguoX" <jingguox.fu at intel.com>, "Xu, Qian Q" <qian.q.xu at intel.com>, "Zhang, Helin" <helin.zhang at intel.com>
Date: Tue, 22 Mar 2016 06:41:37 +
Subject: RE: DPDK link speed with Intel devices
Hi, all
We have worked out the basic test cases for the patchset.
1. Test the link speed on major Intel NICs to see if the speed is right.
2. Test the auto-negotiation on major Intel NICs to ensure it's working.
NICs covered: ixgbe, igb, i40e, fm10k, bonding(SW), virtio(SW)

When we ran Test#1 on all major NICs, we found that none of the NIC 
ports (igb, ixgbe, i40e, fm10k) could be started. Please check: once the patch is 
applied, no Intel port can be started, which is a serious problem!

Interactive-mode selected
Configuring Port 0 (socket 0)
PMD: ixgbe_dev_tx_queue_setup(): sw_ring=0x7f13e99e3440 hw_ring=0x7f13e99e5480 
dma_addr=0x8299e5480
PMD: ixgbe_set_tx_function(): Using simple tx code path
PMD: ixgbe_set_tx_function(): Vector tx enabled.
PMD: ixgbe_dev_rx_queue_setup(): sw_ring=0x7f13ffcb8080 
sw_sc_ring=0x7f13ffcbaac0 hw_ring=0x7f13e99d3380 dma_addr=0x8299d3380
PMD: ixgbe_dev_start(): Invalid link_speeds for port 0; autonegotiation disabled
Fail to start port 0
Configuring Port 1 (socket 0)
PMD: i40e_set_tx_function_flag(): Vector tx can be enabled on this txq.
PMD: i40e_dev_rx_queue_setup(): Rx Burst Bulk Alloc Preconditions are 
satisfied. Rx Burst Bulk Alloc function will be used on port=1, queue=0.
PMD: i40e_dev_start(): Invalid link_speeds for port 1; autonegotiation disabled


Just to double-check: is test#1 adapted to the _new_ ethdev API for 
setting link speeds? From the output it seems autoneg is disabled => fixed speed, 
hence the new bitmaps have to be used.

(I am not claiming patchset is bug free; there might be issues still)

Regards
marc
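
The point about bitmaps can be illustrated with a minimal sketch. It assumes a layout in which bit 0 marks a fixed (non-autonegotiated) speed and every other bit selects one speed; the EX_ constants and the function name are hypothetical stand-ins, not the ethdev definitions from the patchset.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the link speed bitmap flags. */
#define EX_LINK_SPEED_AUTONEG 0u          /* no bits set: autonegotiate */
#define EX_LINK_SPEED_FIXED   (1u << 0)   /* autonegotiation disabled */
#define EX_LINK_SPEED_1G      (1u << 1)
#define EX_LINK_SPEED_10G     (1u << 2)
#define EX_LINK_SPEED_40G     (1u << 3)

/*
 * Return true if the bitmap is a usable request: either autonegotiation
 * (FIXED bit clear), or FIXED together with exactly one speed bit.
 */
static bool
ex_link_speeds_valid(uint32_t link_speeds)
{
	uint32_t speeds = link_speeds & ~EX_LINK_SPEED_FIXED;

	if (!(link_speeds & EX_LINK_SPEED_FIXED))
		return true;
	/* Exactly one bit set: non-zero and a power of two. */
	return speeds != 0 && (speeds & (speeds - 1)) == 0;
}
```

Under these assumptions, a test that disables autoneg but sets no speed bit would be rejected, while FIXED combined with a single speed such as EX_LINK_SPEED_10G would pass.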

Fail to start port 1
Please stop the ports first
Done

Thanks
Qian


-----Original Message-----
From: Cao, Waterman
Sent: Tuesday, March 22, 2016 11:06 AM
To: Glynn, Michael J
Cc: Richardson, Bru

[dpdk-dev] [PATCH v2 0/5] virtio support for container

2016-03-25 Thread Tan, Jianfeng


On 3/24/2016 9:45 PM, Neil Horman wrote:
> On Thu, Mar 24, 2016 at 11:10:50AM +0800, Tan, Jianfeng wrote:
>> Hi Neil,
>>
>> On 3/24/2016 3:17 AM, Neil Horman wrote:
>>> On Fri, Feb 05, 2016 at 07:20:23PM +0800, Jianfeng Tan wrote:
 v1->v2:
   - Rebase on the patchset of virtio 1.0 support.
   - Fix cannot create non-hugepage memory.
   - Fix wrong size of memory region when "single-file" is used.
   - Fix setting of offset in virtqueue to use virtual address.
   - Fix setting TUNSETVNETHDRSZ in vhost-user's branch.
   - Add mac option to specify the mac address of this virtual device.
   - Update doc.

 This patchset provides a high performance networking interface (virtio)
 for container-based DPDK applications. How to start DPDK apps in
 containers with exclusive ownership of NIC devices is beyond its scope.
 The basic idea here is to present a new virtual device (named eth_cvio),
 which can be discovered and initialized in container-based DPDK apps using
 rte_eal_init(). To minimize the change, we reuse the already-existing virtio
 frontend driver code (driver/net/virtio/).
 Compared to the QEMU/VM case, the virtio device framework (which translates
 I/O port r/w operations into the unix socket/cuse protocol and is originally
 provided by QEMU) is integrated into the virtio frontend driver. So this
 converged driver actually plays the role of both the original frontend driver
 and the QEMU device framework.
 The major difference lies in how to calculate the relative address for vhost.
 The principle of virtio is that, based on one or multiple shared memory
 segments, vhost maintains a reference system with the base address and
 length of each segment, so that an address coming from the VM (usually a GPA,
 Guest Physical Address) can be translated into a vhost-recognizable address
 (named VVA, Vhost Virtual Address). To decrease the overhead of address
 translation, we should maintain as few segments as possible. In the VM's case,
 GPA is always locally contiguous. In the container's case, CVA (Container
 Virtual Address) can be used. Specifically:
 a. when set_base_addr, CVA address is used;
 b. when preparing RX's descriptors, CVA address is used;
 c. when transmitting packets, CVA is filled in TX's descriptors;
 d. in TX and CQ's header, CVA is used.
 How to share memory? In VM's case, qemu always shares all physical layout
 to backend. But it's not feasible for a container, as a process, to share
 all virtual memory regions to backend. So only specified virtual memory
 regions (with type of shared) are sent to backend. It's a limitation that
 only addresses in these areas can be used to transmit or receive packets.
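
 The translation described above can be sketched as a lookup over the shared
 regions. This is a minimal illustration only: the struct, its field names and
 the function are hypothetical, not the vhost library's data structures.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one shared memory segment. */
struct ex_mem_region {
	uint64_t front_addr;  /* base as seen by the frontend (GPA or CVA) */
	uint64_t size;        /* length of the segment in bytes */
	uint64_t vhost_addr;  /* base of the same segment inside vhost (VVA) */
};

/*
 * Translate a frontend address into a vhost virtual address.
 * Returns 0 when the address is not covered by any shared region.
 */
static uint64_t
ex_to_vva(const struct ex_mem_region *regions, size_t nregions, uint64_t addr)
{
	size_t i;

	for (i = 0; i < nregions; i++) {
		const struct ex_mem_region *r = &regions[i];

		if (addr >= r->front_addr && addr - r->front_addr < r->size)
			return r->vhost_addr + (addr - r->front_addr);
	}
	return 0;
}
```

 The linear scan is why fewer segments mean cheaper translation, and the
 zero return mirrors the stated limitation: an address outside the shared
 regions cannot be used to transmit or receive packets.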

 Known issues

 a. When used with vhost-net, root privilege is required to create tap
 device inside.
 b. Control queue and multi-queue are not supported yet.
 c. When --single-file option is used, socket_id of the memory may be
 wrong. (Use "numactl -N x -m x" to work around this for now)
 How to use?

 a. Apply this patchset.

 b. To compile container apps:
 $: make config RTE_SDK=`pwd` T=x86_64-native-linuxapp-gcc
 $: make install RTE_SDK=`pwd` T=x86_64-native-linuxapp-gcc
 $: make -C examples/l2fwd RTE_SDK=`pwd` T=x86_64-native-linuxapp-gcc
 $: make -C examples/vhost RTE_SDK=`pwd` T=x86_64-native-linuxapp-gcc

 c. To build a docker image using Dockerfile below.
 $: cat ./Dockerfile
 FROM ubuntu:latest
 WORKDIR /usr/src/dpdk
 COPY . /usr/src/dpdk
 ENV PATH "$PATH:/usr/src/dpdk/examples/l2fwd/build/"
 $: docker build -t dpdk-app-l2fwd .

 d. Used with vhost-user
 $: ./examples/vhost/build/vhost-switch -c 3 -n 4 \
--socket-mem 1024,1024 -- -p 0x1 --stats 1
 $: docker run -i -t -v :/var/run/usvhost \
-v /dev/hugepages:/dev/hugepages \
dpdk-app-l2fwd l2fwd -c 0x4 -n 4 -m 1024 --no-pci \
--vdev=eth_cvio0,path=/var/run/usvhost -- -p 0x1

 f. Used with vhost-net
 $: modprobe vhost
 $: modprobe vhost-net
 $: docker run -i -t --privileged \
-v /dev/vhost-net:/dev/vhost-net \
-v /dev/net/tun:/dev/net/tun \
-v /dev/hugepages:/dev/hugepages \
dpdk-app-l2fwd l2fwd -c 0x4 -n 4 -m 1024 --no-pci \
--vdev=eth_cvio0,path=/dev/vhost-net -- -p 0x1

 By the way, it's not necessary to run in a container.

 Signed-off-by: Huawei Xie 
 Signed-off-by: Jianfeng Tan 

 Jianfeng Tan (5):
mem: add --single-file to create single mem-backed file
mem: add API to obtain memory-backed file info
virtio/vdev: add embeded device emulation
virtio/vdev: add a new vdev named eth_cvio
docs: add release note for virtio for container

   config/common_linuxapp |   5 +
   doc/guides/rel_notes/release_2_

[dpdk-dev] DPDK's vhost-user logging capability

2016-03-25 Thread Yuanhan Liu
On Wed, Mar 23, 2016 at 03:34:09PM +, shesha Sreenivasamurthy (shesha) 
wrote:
> Hi All,
> 
> I was going over vhost-user migration capability in DPDK in lieu of a Cisco's
> multi-q DPDK vhost-user application. I see that log_base address is
> implemented as per virtio_net device. However, desc, addr and used is per
> vhost_virtqueue. Additionally, QEMU sends one VHOST_USER_SET_LOG_BASE per
> queue-pair (QEMU - hw/virtio/vhost.c::vhost_dev_set_log).
> 
> Does it mean we need to log dirty pages of all rings to same location ?

Hi,

Yes, and QEMU allocates only one block of memory (see vhost_log_alloc())
after all.

> If that is the case then why does QEMU sends separate
> VHOST_USER_SET_LOG_BASE per queue pair ?

That's kind of by design. One queue pair is associated with one
vhost_dev struct in QEMU; hence, all those requests go through all the
vhost_dev structs (aka, all queue pairs), including those that only
need to be sent once, such as VHOST_USER_SET_MEM_TABLE. Thus,
we introduced vhost_user_one_time_request() to avoid such cases.

So, good question, and we may need to add it to the "one time request"
group, Marc?

And FYI, for a queue-pair (or vring) request, there should be an index
in the payload that points to the right vring. If there is none, it normally
means a global request that _may_ need to be sent only once.
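
Since all rings log to the same location, a write from any vring ends up as
bits in one shared bitmap. A minimal sketch, assuming one bit per 4 KiB page;
the names and the page size constant are illustrative, not DPDK or QEMU code:

```c
#include <stdint.h>

#define EX_LOG_PAGE_SIZE 4096u  /* assumed logging granularity */

/* Mark one page dirty in the shared log: one bit per page. */
static void
ex_log_page(uint8_t *log_base, uint64_t page)
{
	log_base[page / 8] |= (uint8_t)(1u << (page % 8));
}

/* Mark every page touched by [addr, addr + len) dirty; len must be > 0. */
static void
ex_log_write(uint8_t *log_base, uint64_t addr, uint64_t len)
{
	uint64_t page = addr / EX_LOG_PAGE_SIZE;
	uint64_t last = (addr + len - 1) / EX_LOG_PAGE_SIZE;

	for (; page <= last; page++)
		ex_log_page(log_base, page);
}
```

Every queue would call ex_log_write() with the same log_base, so the bitmap
does not need to know which ring dirtied a page.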

--yliu


[dpdk-dev] [PATCH] ixgbe: extend the timer support to x550em

2016-03-25 Thread Wenzhuo Lu
An issue was found on x550em NICs: ieee1588 is not working and the time
is always 0.
The root cause is that the timer is only supported on x550; the support was
not extended to x550em_x and x550em_a.

Fixes: a7740dc1303a("ixgbe: support new devices and MAC types")
Signed-off-by: Wenzhuo Lu 
---
 drivers/net/ixgbe/ixgbe_ethdev.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index d4d883a..137183f 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -5761,6 +5761,8 @@ ixgbe_read_systime_cyclecounter(struct rte_eth_dev *dev)

switch (hw->mac.type) {
case ixgbe_mac_X550:
+   case ixgbe_mac_X550EM_x:
+   case ixgbe_mac_X550EM_a:
/* SYSTIMEL stores ns and SYSTIMEH stores seconds. */
systime_cycles = (uint64_t)IXGBE_READ_REG(hw, IXGBE_SYSTIML);
systime_cycles += (uint64_t)IXGBE_READ_REG(hw, IXGBE_SYSTIMH)
@@ -5783,6 +5785,8 @@ ixgbe_read_rx_tstamp_cyclecounter(struct rte_eth_dev *dev)

switch (hw->mac.type) {
case ixgbe_mac_X550:
+   case ixgbe_mac_X550EM_x:
+   case ixgbe_mac_X550EM_a:
/* RXSTMPL stores ns and RXSTMPH stores seconds. */
rx_tstamp_cycles = (uint64_t)IXGBE_READ_REG(hw, IXGBE_RXSTMPL);
rx_tstamp_cycles += (uint64_t)IXGBE_READ_REG(hw, IXGBE_RXSTMPH)
@@ -5806,6 +5810,8 @@ ixgbe_read_tx_tstamp_cyclecounter(struct rte_eth_dev *dev)

switch (hw->mac.type) {
case ixgbe_mac_X550:
+   case ixgbe_mac_X550EM_x:
+   case ixgbe_mac_X550EM_a:
/* TXSTMPL stores ns and TXSTMPH stores seconds. */
tx_tstamp_cycles = (uint64_t)IXGBE_READ_REG(hw, IXGBE_TXSTMPL);
tx_tstamp_cycles += (uint64_t)IXGBE_READ_REG(hw, IXGBE_TXSTMPH)
@@ -5854,6 +5860,8 @@ ixgbe_start_timecounters(struct rte_eth_dev *dev)

switch (hw->mac.type) {
case ixgbe_mac_X550:
+   case ixgbe_mac_X550EM_x:
+   case ixgbe_mac_X550EM_a:
/* Independent of link speed. */
incval = 1;
/* Cycles read will be interpreted as ns. */
-- 
1.9.3
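
The comments in the hunks above note that on these MAC types the low register
stores nanoseconds and the high register stores seconds. How the two halves
combine into one nanosecond count can be sketched as follows; the function and
constant names are hypothetical and no hardware register is read here:

```c
#include <stdint.h>

#define EX_NSEC_PER_SEC 1000000000ULL

/*
 * Combine a low register holding nanoseconds and a high register holding
 * seconds into a single nanosecond count.
 */
static uint64_t
ex_systime_to_ns(uint32_t systime_low_ns, uint32_t systime_high_sec)
{
	return (uint64_t)systime_low_ns +
	       (uint64_t)systime_high_sec * EX_NSEC_PER_SEC;
}
```

For example, a low value of 500 and a high value of 2 would give 2000000500 ns.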



[dpdk-dev] [PATCH 0/4] vhost vlan tag and TSO fixes/cleanups

2016-03-25 Thread Yuanhan Liu

Ksiadz reported that TSO won't work for OVS with a NIC, even with
changes similar to those from commit 9fd72e3cbd29 ("examples/vhost: add
virtio offload").

This gave me another chance to look at the TSO implementation a bit
deeper, and I came up with this small patch set, which moves some of the
remaining settings for enabling TSO into the vhost lib.

With this patch set, an application needs minimal (or even no)
changes to get the TSO capability. Taking OVS as an example, it just needs to
set the MTU correctly and set the NIC port txq_flags properly to enable
the NIC offloading ability, which is disabled by default for some drivers.

Patch 4 is a vlan tag fix reported by Qian.

---
Yuanhan Liu (4):
  vhost: remove unnecessary return
  vhost: complete TSO settings
  examples/vhost: remove unnessary settings for TX offload
  examples/vhost: fix wrong vlan_tag

 examples/vhost/main.c | 64 +++
 lib/librte_vhost/vhost_rxtx.c | 49 +++--
 2 files changed, 39 insertions(+), 74 deletions(-)

-- 
1.9.0



[dpdk-dev] [PATCH 1/4] vhost: remove unnecessary return

2016-03-25 Thread Yuanhan Liu
Signed-off-by: Yuanhan Liu 
---
 lib/librte_vhost/vhost_rxtx.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
index b4da665..7d1224c 100644
--- a/lib/librte_vhost/vhost_rxtx.c
+++ b/lib/librte_vhost/vhost_rxtx.c
@@ -123,8 +123,6 @@ virtio_enqueue_offload(struct rte_mbuf *m_buf, struct 
virtio_net_hdr *net_hdr)
net_hdr->hdr_len = m_buf->l2_len + m_buf->l3_len
+ m_buf->l4_len;
}
-
-   return;
 }

 static inline void
-- 
1.9.0



[dpdk-dev] [PATCH 2/4] vhost: complete TSO settings

2016-03-25 Thread Yuanhan Liu
Commit d0cf91303d73 ("vhost: add Tx offload capabilities") has only
done partial settings for enabling TSO, and left the following part
to the application, say vhost-switch example, by commit 9fd72e3cbd29
("examples/vhost: add virtio offload").

- Setting PKT_TX_IP_CKSUM and ipv4_hdr->hdr_checksum = 0 for IPv4.

- calculate the pseudo header checksum without taking ip_len in
  account, and set it in the TCP header

Here we complete the left part in vhost side, so that an user (such
as OVS) can do minimal (or even no) changes to get TSO enabled.

Signed-off-by: Yuanhan Liu 
---
 lib/librte_vhost/vhost_rxtx.c | 47 ---
 1 file changed, 35 insertions(+), 12 deletions(-)

diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
index 7d1224c..a204703 100644
--- a/lib/librte_vhost/vhost_rxtx.c
+++ b/lib/librte_vhost/vhost_rxtx.c
@@ -602,11 +602,11 @@ rte_vhost_enqueue_burst(struct virtio_net *dev, uint16_t 
queue_id,
 }

 static void
-parse_ethernet(struct rte_mbuf *m, uint16_t *l4_proto, void **l4_hdr)
+parse_ethernet(struct rte_mbuf *m, void **l3_hdr,
+  void **l4_hdr, uint16_t *l4_proto)
 {
struct ipv4_hdr *ipv4_hdr;
struct ipv6_hdr *ipv6_hdr;
-   void *l3_hdr = NULL;
struct ether_hdr *eth_hdr;
uint16_t ethertype;

@@ -622,21 +622,19 @@ parse_ethernet(struct rte_mbuf *m, uint16_t *l4_proto, 
void **l4_hdr)
ethertype = rte_be_to_cpu_16(vlan_hdr->eth_proto);
}

-   l3_hdr = (char *)eth_hdr + m->l2_len;
+   *l3_hdr = (char *)eth_hdr + m->l2_len;

switch (ethertype) {
case ETHER_TYPE_IPv4:
-   ipv4_hdr = (struct ipv4_hdr *)l3_hdr;
+   ipv4_hdr  = *l3_hdr;
*l4_proto = ipv4_hdr->next_proto_id;
m->l3_len = (ipv4_hdr->version_ihl & 0x0f) * 4;
-   *l4_hdr = (char *)l3_hdr + m->l3_len;
m->ol_flags |= PKT_TX_IPV4;
break;
case ETHER_TYPE_IPv6:
-   ipv6_hdr = (struct ipv6_hdr *)l3_hdr;
+   ipv6_hdr = *l3_hdr;
*l4_proto = ipv6_hdr->proto;
m->l3_len = sizeof(struct ipv6_hdr);
-   *l4_hdr = (char *)l3_hdr + m->l3_len;
m->ol_flags |= PKT_TX_IPV6;
break;
default:
@@ -644,16 +642,28 @@ parse_ethernet(struct rte_mbuf *m, uint16_t *l4_proto, 
void **l4_hdr)
*l4_proto = 0;
break;
}
+
+   *l4_hdr = (char *)*l3_hdr + m->l3_len;
+}
+
+static uint16_t
+get_psd_sum(void *l3_hdr, uint64_t ol_flags)
+{
+   if (ol_flags & PKT_TX_IPV4)
+   return rte_ipv4_phdr_cksum(l3_hdr, ol_flags);
+   else
+   return rte_ipv6_phdr_cksum(l3_hdr, ol_flags);
 }

 static inline void __attribute__((always_inline))
 vhost_dequeue_offload(struct virtio_net_hdr *hdr, struct rte_mbuf *m)
 {
-   uint16_t l4_proto = 0;
-   void *l4_hdr = NULL;
-   struct tcp_hdr *tcp_hdr = NULL;
+   void *l3_hdr;
+   void *l4_hdr;
+   uint16_t l4_proto;
+
+   parse_ethernet(m, &l3_hdr, &l4_hdr, &l4_proto);

-   parse_ethernet(m, &l4_proto, &l4_hdr);
if (hdr->flags == VIRTIO_NET_HDR_F_NEEDS_CSUM) {
if (hdr->csum_start == (m->l2_len + m->l3_len)) {
switch (hdr->csum_offset) {
@@ -676,13 +686,26 @@ vhost_dequeue_offload(struct virtio_net_hdr *hdr, struct 
rte_mbuf *m)
}

if (hdr->gso_type != VIRTIO_NET_HDR_GSO_NONE) {
+   struct ipv4_hdr *ipv4_hdr = l3_hdr;
+   struct tcp_hdr *tcp_hdr   = l4_hdr;
+
switch (hdr->gso_type & ~VIRTIO_NET_HDR_GSO_ECN) {
case VIRTIO_NET_HDR_GSO_TCPV4:
+   /*
+* According to comments for PKT_TX_TCP_SEG
+* at rte_mbuf.h, we need following settings
+* for IPv4.
+*/
+   m->ol_flags |= PKT_TX_IP_CKSUM;
+   ipv4_hdr->hdr_checksum = 0;
+
+   /* Fall through */
case VIRTIO_NET_HDR_GSO_TCPV6:
-   tcp_hdr = (struct tcp_hdr *)l4_hdr;
m->ol_flags |= PKT_TX_TCP_SEG;
m->tso_segsz = hdr->gso_size;
m->l4_len = (tcp_hdr->data_off & 0xf0) >> 2;
+
+   tcp_hdr->cksum = get_psd_sum(l3_hdr, m->ol_flags);
break;
default:
RTE_LOG(WARNING, VHOST_DATA,
-- 
1.9.0
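
The pseudo header checksum that the commit message describes (computed without
the length, for TSO) can be sketched in isolation. This is a simplified
stand-in working on host-order values, not the rte_ipv4_phdr_cksum()
implementation; the function name is hypothetical.

```c
#include <stdint.h>

/*
 * Sum the IPv4 pseudo header (source, destination, zero byte, protocol,
 * length) as 16-bit words with end-around carry. For TSO the length word
 * is left at zero. Addresses are host-order values here; the result is
 * the folded, non-complemented sum, also in host order.
 */
static uint16_t
ex_ipv4_tso_phdr_sum(uint32_t src, uint32_t dst, uint8_t proto)
{
	uint32_t sum = 0;

	sum += src >> 16;
	sum += src & 0xffff;
	sum += dst >> 16;
	sum += dst & 0xffff;
	sum += proto;  /* zero byte followed by the protocol number */
	/* The length word is intentionally omitted (zero) for TSO. */

	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}
```

The length is left out because each segment produced by the NIC has its own
length, which the hardware adds when it finishes the checksum per segment.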



[dpdk-dev] [PATCH 3/4] examples/vhost: remove unnessary settings for TX offload

2016-03-25 Thread Yuanhan Liu
We now have all the settings required to make TSO work in the vhost lib.
We also don't need to calculate the pseudo header checksum just
for the checksum offloading case, as the TCP/IP stack would have
done that.

So, those settings are not necessary; remove them.

Signed-off-by: Yuanhan Liu 
---
 examples/vhost/main.c | 58 ---
 1 file changed, 58 deletions(-)

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index a45cddb..ae1e110 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -51,9 +51,6 @@
 #include 
 #include 
 #include 
-#include 
-#include 
-#include 

 #include "main.h"

@@ -1147,58 +1144,6 @@ find_local_dest(struct virtio_net *dev, struct rte_mbuf 
*m,
return 0;
 }

-static uint16_t
-get_psd_sum(void *l3_hdr, uint64_t ol_flags)
-{
-   if (ol_flags & PKT_TX_IPV4)
-   return rte_ipv4_phdr_cksum(l3_hdr, ol_flags);
-   else /* assume ethertype == ETHER_TYPE_IPv6 */
-   return rte_ipv6_phdr_cksum(l3_hdr, ol_flags);
-}
-
-static void virtio_tx_offload(struct rte_mbuf *m)
-{
-   void *l3_hdr;
-   struct ipv4_hdr *ipv4_hdr = NULL;
-   struct tcp_hdr *tcp_hdr = NULL;
-   struct udp_hdr *udp_hdr = NULL;
-   struct sctp_hdr *sctp_hdr = NULL;
-   struct ether_hdr *eth_hdr = rte_pktmbuf_mtod(m, struct ether_hdr *);
-
-   l3_hdr = (char *)eth_hdr + m->l2_len;
-
-   if (m->tso_segsz != 0) {
-   ipv4_hdr = (struct ipv4_hdr *)l3_hdr;
-   tcp_hdr = (struct tcp_hdr *)((char *)l3_hdr + m->l3_len);
-   m->ol_flags |= PKT_TX_IP_CKSUM;
-   ipv4_hdr->hdr_checksum = 0;
-   tcp_hdr->cksum = get_psd_sum(l3_hdr, m->ol_flags);
-   return;
-   }
-
-   if (m->ol_flags & PKT_TX_L4_MASK) {
-   switch (m->ol_flags & PKT_TX_L4_MASK) {
-   case PKT_TX_TCP_CKSUM:
-   tcp_hdr = (struct tcp_hdr *)
-   ((char *)l3_hdr + m->l3_len);
-   tcp_hdr->cksum = get_psd_sum(l3_hdr, m->ol_flags);
-   break;
-   case PKT_TX_UDP_CKSUM:
-   udp_hdr = (struct udp_hdr *)
-   ((char *)l3_hdr + m->l3_len);
-   udp_hdr->dgram_cksum = get_psd_sum(l3_hdr, m->ol_flags);
-   break;
-   case PKT_TX_SCTP_CKSUM:
-   sctp_hdr = (struct sctp_hdr *)
-   ((char *)l3_hdr + m->l3_len);
-   sctp_hdr->cksum = 0;
-   break;
-   default:
-   break;
-   }
-   }
-}
-
 /*
  * This function routes the TX packet to the correct interface. This may be a 
local device
  * or the physical port.
@@ -1265,9 +1210,6 @@ virtio_tx_route(struct vhost_dev *vdev, struct rte_mbuf 
*m, uint16_t vlan_tag)
m->vlan_tci = vlan_tag;
}

-   if ((m->ol_flags & PKT_TX_L4_MASK) || (m->ol_flags & PKT_TX_TCP_SEG))
-   virtio_tx_offload(m);
-
tx_q->m_table[len] = m;
len++;
if (enable_stats) {
-- 
1.9.0



[dpdk-dev] [PATCH 4/4] examples/vhost: fix wrong vlan_tag

2016-03-25 Thread Yuanhan Liu
While the last arg of virtio_tx_route() asks for a vlan tag, we currently
feed it with device_fh, which is wrong. Fix it.

Fixes: 4796ad63ba1f ("examples/vhost: import userspace vhost application")

Reported-by: Qian Xu 
Signed-off-by: Yuanhan Liu 
---
 examples/vhost/main.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index ae1e110..00ae0de 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -1364,8 +1364,10 @@ switch_worker(__attribute__((unused)) void *arg)

rte_pktmbuf_free(pkts_burst[--tx_count]);
}
}
-   for (i = 0; i < tx_count; ++i)
-   virtio_tx_route(vdev, pkts_burst[i], (uint16_t)dev->device_fh);
+   for (i = 0; i < tx_count; ++i) {
+   virtio_tx_route(vdev, pkts_burst[i],
+   vlan_tags[(uint16_t)dev->device_fh]);
+   }
}

/*move to the next device in the list*/
-- 
1.9.0



[dpdk-dev] [PATCH] ixgbe: support mac type x550em_a

2016-03-25 Thread Wenzhuo Lu
On my side the development of the l2 tunnel and e-tag features was
done in parallel with the ixgbe base code update. So, l2 tunnel and
e-tag are not supported on the new x550em_a NICs.
Now that all the code is ready, the support should be extended to x550em_a
NICs.

Fixes: 22e77d4501b8("ixgbe: support L2 tunnel operations")
Signed-off-by: Wenzhuo Lu 
---
 drivers/net/ixgbe/ixgbe_ethdev.c | 42 ++--
 1 file changed, 28 insertions(+), 14 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index d4d883a..5521b0f 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -1788,7 +1788,8 @@ ixgbe_vlan_hw_extend_enable(struct rte_eth_dev *dev)

/* Clear pooling mode of PFVTCTL. It's required by X550. */
if (hw->mac.type == ixgbe_mac_X550 ||
-   hw->mac.type == ixgbe_mac_X550EM_x) {
+   hw->mac.type == ixgbe_mac_X550EM_x ||
+   hw->mac.type == ixgbe_mac_X550EM_a) {
ctrl = IXGBE_READ_REG(hw, IXGBE_VT_CTL);
ctrl &= ~IXGBE_VT_CTL_POOLING_MODE_MASK;
IXGBE_WRITE_REG(hw, IXGBE_VT_CTL, ctrl);
@@ -2885,7 +2886,8 @@ ixgbe_dev_info_get(struct rte_eth_dev *dev, struct 
rte_eth_dev_info *dev_info)
dev_info->rx_offload_capa |= DEV_RX_OFFLOAD_TCP_LRO;

if (hw->mac.type == ixgbe_mac_X550 ||
-   hw->mac.type == ixgbe_mac_X550EM_x)
+   hw->mac.type == ixgbe_mac_X550EM_x ||
+   hw->mac.type == ixgbe_mac_X550EM_a)
dev_info->rx_offload_capa |= DEV_RX_OFFLOAD_OUTER_IPV4_CKSUM;

dev_info->tx_offload_capa =
@@ -2897,7 +2899,8 @@ ixgbe_dev_info_get(struct rte_eth_dev *dev, struct 
rte_eth_dev_info *dev_info)
DEV_TX_OFFLOAD_TCP_TSO;

if (hw->mac.type == ixgbe_mac_X550 ||
-   hw->mac.type == ixgbe_mac_X550EM_x)
+   hw->mac.type == ixgbe_mac_X550EM_x ||
+   hw->mac.type == ixgbe_mac_X550EM_a)
dev_info->tx_offload_capa |= DEV_TX_OFFLOAD_OUTER_IPV4_CKSUM;

dev_info->default_rxconf = (struct rte_eth_rxconf) {
@@ -5000,7 +5003,8 @@ ixgbevf_set_default_mac_addr(struct rte_eth_dev *dev, 
struct ether_addr *addr)

 #define MAC_TYPE_FILTER_SUP(type)do {\
if ((type) != ixgbe_mac_82599EB && (type) != ixgbe_mac_X540 &&\
-   (type) != ixgbe_mac_X550)\
+   (type) != ixgbe_mac_X550 && (type) != ixgbe_mac_X550EM_x &&\
+   (type) != ixgbe_mac_X550EM_a)\
return -ENOTSUP;\
 } while (0)

@@ -6327,7 +6331,8 @@ ixgbe_update_e_tag_eth_type(struct ixgbe_hw *hw,
uint32_t etag_etype;

if (hw->mac.type != ixgbe_mac_X550 &&
-   hw->mac.type != ixgbe_mac_X550EM_x) {
+   hw->mac.type != ixgbe_mac_X550EM_x &&
+   hw->mac.type != ixgbe_mac_X550EM_a) {
return -ENOTSUP;
}

@@ -6371,7 +6376,8 @@ ixgbe_e_tag_enable(struct ixgbe_hw *hw)
uint32_t etag_etype;

if (hw->mac.type != ixgbe_mac_X550 &&
-   hw->mac.type != ixgbe_mac_X550EM_x) {
+   hw->mac.type != ixgbe_mac_X550EM_x &&
+   hw->mac.type != ixgbe_mac_X550EM_a) {
return -ENOTSUP;
}

@@ -6411,7 +6417,8 @@ ixgbe_e_tag_disable(struct ixgbe_hw *hw)
uint32_t etag_etype;

if (hw->mac.type != ixgbe_mac_X550 &&
-   hw->mac.type != ixgbe_mac_X550EM_x) {
+   hw->mac.type != ixgbe_mac_X550EM_x &&
+   hw->mac.type != ixgbe_mac_X550EM_a) {
return -ENOTSUP;
}

@@ -6454,7 +6461,8 @@ ixgbe_e_tag_filter_del(struct rte_eth_dev *dev,
uint32_t rar_low, rar_high;

if (hw->mac.type != ixgbe_mac_X550 &&
-   hw->mac.type != ixgbe_mac_X550EM_x) {
+   hw->mac.type != ixgbe_mac_X550EM_x &&
+   hw->mac.type != ixgbe_mac_X550EM_a) {
return -ENOTSUP;
}

@@ -6489,7 +6497,8 @@ ixgbe_e_tag_filter_add(struct rte_eth_dev *dev,
uint32_t rar_low, rar_high;

if (hw->mac.type != ixgbe_mac_X550 &&
-   hw->mac.type != ixgbe_mac_X550EM_x) {
+   hw->mac.type != ixgbe_mac_X550EM_x &&
+   hw->mac.type != ixgbe_mac_X550EM_a) {
return -ENOTSUP;
}

@@ -6608,7 +6617,8 @@ ixgbe_e_tag_forwarding_en_dis(struct rte_eth_dev *dev, 
bool en)
struct ixgbe_hw *hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);

if (hw->mac.type != ixgbe_mac_X550 &&
-   hw->mac.type != ixgbe_mac_X550EM_x) {
+   hw->mac.type != ixgbe_mac_X550EM_x &&
+   hw->mac.type != ixgbe_mac_X550EM_a) {
return -ENOTSUP;
}

@@ -6681,7 +6691,8 @@ ixgbe_e_tag_insertion_en_dis(struct rte_eth_dev *dev,
}

if (hw->mac.type != ixgbe_mac_X550 &&
-   hw->mac.type != ixgbe_mac_X550EM_x) {
+   hw->mac.type != ixgbe_mac_X550EM_x &&
+   hw->mac.type != ixgbe_mac_X550EM_a) {
return -ENOTSUP;

[dpdk-dev] [PATCH 2/4] vhost: complete TSO settings

2016-03-25 Thread Yuanhan Liu
On Fri, Mar 25, 2016 at 02:01:32PM +0800, Yuanhan Liu wrote:
> Commit d0cf91303d73 ("vhost: add Tx offload capabilities") has only
> done partial settings for enabling TSO, and left the following part
> to the application, say vhost-switch example, by commit 9fd72e3cbd29
> ("examples/vhost: add virtio offload").
> 
> - Setting PKT_TX_IP_CKSUM and ipv4_hdr->hdr_checksum = 0 for IPv4.
> 
> - calculate the pseudo header checksum without taking ip_len in
>   account, and set it in the TCP header
> 
> Here we complete the left part in vhost side, so that an user (such
> as OVS) can do minimal (or even no) changes to get TSO enabled.
> 
> Signed-off-by: Yuanhan Liu 
...
> + ipv4_hdr->hdr_checksum = 0;

Nah.. we can't do that here. This hurts VM2VM case badly.

Thanks Qian for making me aware of it.

--yliu


[dpdk-dev] [PATCH 0/2] Compile fixes in SUSE11 SP3 i686 platform

2016-03-25 Thread Michael Qiu
On the SUSE11 SP3 i686 platform with gcc version 4.5.1, there are
some compile issues. This patch set tries to fix them.

Michael Qiu (2):
  lib/librte_lpm: Fix anonymous union initialization issue
  drivers/crypto: Fix anonymous union initialization in crypto

 drivers/crypto/null/null_crypto_pmd_ops.c | 16 
 lib/librte_lpm/rte_lpm.c  | 14 +++---
 2 files changed, 15 insertions(+), 15 deletions(-)

-- 
1.9.3



[dpdk-dev] [PATCH 2/2] drivers/crypto: Fix anonymous union initialization in crypto

2016-03-25 Thread Michael Qiu
On the SUSE11-SP3 i686 platform, with gcc 4.5.1, there is a
compile issue:
null_crypto_pmd_ops.c:44:3: error:
unknown field 'sym' specified in initializer
cc1: warnings being treated as errors

A member of an anonymous union must be initialized inside '{}',
otherwise gcc reports an error.

Fixes: 26c2e4ad5ad4 ("cryptodev: add capabilities discovery")

Signed-off-by: Michael Qiu 
---
 drivers/crypto/null/null_crypto_pmd_ops.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/crypto/null/null_crypto_pmd_ops.c 
b/drivers/crypto/null/null_crypto_pmd_ops.c
index 39f8088..b7470c0 100644
--- a/drivers/crypto/null/null_crypto_pmd_ops.c
+++ b/drivers/crypto/null/null_crypto_pmd_ops.c
@@ -41,9 +41,9 @@
 static const struct rte_cryptodev_capabilities null_crypto_pmd_capabilities[] 
= {
{   /* NULL (AUTH) */
.op = RTE_CRYPTO_OP_TYPE_SYMMETRIC,
-   .sym = {
+   {.sym = {
.xform_type = RTE_CRYPTO_SYM_XFORM_AUTH,
-   .auth = {
+   {.auth = {
.algo = RTE_CRYPTO_AUTH_NULL,
.block_size = 1,
.key_size = {
@@ -57,14 +57,14 @@ static const struct rte_cryptodev_capabilities 
null_crypto_pmd_capabilities[] =
.increment = 0
},
.aad_size = { 0 }
-   }
-   }
+   }, },
+   }, },
},
{   /* NULL (CIPHER) */
.op = RTE_CRYPTO_OP_TYPE_SYMMETRIC,
-   .sym = {
+   {.sym = {
.xform_type = RTE_CRYPTO_SYM_XFORM_CIPHER,
-   .cipher = {
+   {.cipher = {
.algo = RTE_CRYPTO_CIPHER_NULL,
.block_size = 1,
.key_size = {
@@ -77,8 +77,8 @@ static const struct rte_cryptodev_capabilities 
null_crypto_pmd_capabilities[] =
.max = 0,
.increment = 0
}
-   }
-   }
+   }, },
+   }, },
},
RTE_CRYPTODEV_END_OF_CAPABILITIES_LIST()
 };
-- 
1.9.3
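
The bracing pattern used in the patch can be tried on a reduced example. The
struct below is a hypothetical miniature with nested anonymous unions, not the
real rte_cryptodev_capabilities layout:

```c
/* Hypothetical capability descriptor with nested anonymous unions. */
struct ex_capability {
	int op;
	union {
		struct {
			int xform_type;
			union {
				struct { int algo; int block_size; } auth;
				struct { int algo; int block_size; } cipher;
			};
		} sym;
	};
};

/* Each anonymous union gets its own brace level around the designator. */
static const struct ex_capability ex_null_auth = {
	.op = 1,
	{.sym = {
		.xform_type = 2,
		{.auth = {
			.algo = 3,
			.block_size = 1,
		}, },
	}, },
};
```

The extra braces make the union itself the positional initializer that follows
the previous member, so the designator inside no longer has to be resolved
through the anonymous union by the compiler.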



[dpdk-dev] [PATCH 1/2] lib/librte_lpm: Fix anonymous union initialization issue

2016-03-25 Thread Michael Qiu
On the SUSE11-SP3 i686 platform, with gcc 4.5.1, there is a
compile issue:
rte_lpm.c: In function 'add_depth_small_v20':
rte_lpm.c:778:7: error: unknown field 'next_hop'
specified in initializer
cc1: warnings being treated as errors
The root cause is that gcc only allows an anonymous union to be initialized
according to the position at which it is defined, but next_hop is defined
at a different position on different platforms (endianness).

One solution is to add #ifdefs in the code to avoid this issue,
but there is a simpler way: initialize it separately later.

Fixes: afc5c914a083 ("lpm: fix big endian support")

Signed-off-by: Michael Qiu 
---
 lib/librte_lpm/rte_lpm.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/lib/librte_lpm/rte_lpm.c b/lib/librte_lpm/rte_lpm.c
index af5811c..efd507e 100644
--- a/lib/librte_lpm/rte_lpm.c
+++ b/lib/librte_lpm/rte_lpm.c
@@ -744,11 +744,11 @@ add_depth_small_v20(struct rte_lpm_v20 *lpm, uint32_t ip, 
uint8_t depth,
lpm->tbl24[i].depth <= depth)) {

struct rte_lpm_tbl_entry_v20 new_tbl24_entry = {
-   { .next_hop = next_hop, },
.valid = VALID,
.valid_group = 0,
.depth = depth,
};
+   new_tbl24_entry.next_hop = next_hop;

/* Setting tbl24 entry in one go to avoid race
 * conditions
@@ -775,8 +775,8 @@ add_depth_small_v20(struct rte_lpm_v20 *lpm, uint32_t ip, 
uint8_t depth,
.valid = VALID,
.valid_group = VALID,
.depth = depth,
-   .next_hop = next_hop,
};
+   new_tbl8_entry.next_hop=next_hop;

/*
 * Setting tbl8 entry in one go to avoid
@@ -975,10 +975,9 @@ add_depth_big_v20(struct rte_lpm_v20 *lpm, uint32_t 
ip_masked, uint8_t depth,
struct rte_lpm_tbl_entry_v20 new_tbl8_entry = {
.valid = VALID,
.depth = depth,
-   .next_hop = next_hop,
.valid_group = lpm->tbl8[i].valid_group,
};
-
+   new_tbl8_entry.next_hop = next_hop;
/*
 * Setting tbl8 entry in one go to avoid race
 * condition
@@ -1375,9 +1374,9 @@ delete_depth_small_v20(struct rte_lpm_v20 *lpm, uint32_t 
ip_masked,
.valid = VALID,
.valid_group = VALID,
.depth = sub_rule_depth,
-   .next_hop = lpm->rules_tbl
-   [sub_rule_index].next_hop,
};
+   new_tbl8_entry.next_hop =
+   lpm->rules_tbl[sub_rule_index].next_hop;

for (i = tbl24_index; i < (tbl24_index + tbl24_range); i++) {

@@ -1639,9 +1638,10 @@ delete_depth_big_v20(struct rte_lpm_v20 *lpm, uint32_t 
ip_masked,
.valid = VALID,
.depth = sub_rule_depth,
.valid_group = lpm->tbl8[tbl8_group_start].valid_group,
-   .next_hop = lpm->rules_tbl[sub_rule_index].next_hop,
};

+   new_tbl8_entry.next_hop =
+   lpm->rules_tbl[sub_rule_index].next_hop;
/*
 * Loop through the range of entries on tbl8 for which the
 * rule_to_delete must be modified.
-- 
1.9.3
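
The approach taken in the LPM patch above can be sketched outside DPDK.
The struct below is a hypothetical stand-in (not the real
rte_lpm_tbl_entry_v20 definition) whose member order depends on byte
order; make_entry() is an illustrative name. Initializing the bitfields
by name and assigning next_hop afterwards does not depend on where the
union sits in the struct.

```c
#include <stdint.h>

/* Hypothetical table entry: member order differs by byte order.
 * This is an assumption for illustration, not the DPDK layout. */
struct tbl_entry {
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
	uint8_t depth       :6;
	uint8_t valid_group :1;
	uint8_t valid       :1;
	union {
		uint8_t next_hop;
		uint8_t group_idx;
	};
#else
	union {
		uint8_t next_hop;
		uint8_t group_idx;
	};
	uint8_t valid       :1;
	uint8_t valid_group :1;
	uint8_t depth       :6;
#endif
};

/* Build an entry without relying on the position of the union. */
static struct tbl_entry
make_entry(uint8_t next_hop, uint8_t depth)
{
	struct tbl_entry e = {
		.valid = 1,
		.valid_group = 0,
		.depth = depth,
	};

	/* Assigned separately, as the patch does for each entry. */
	e.next_hop = next_hop;
	return e;
}
```

The entry is still built as a local value and can then be stored in one
go, so the "set the entry in one go" comments in the diff stay valid.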



[dpdk-dev] [PATCH v4 0/3] packet type

2016-03-25 Thread Jianfeng Tan
This patch will work on below patch series.
 - [PATCH v9 01/11] Add API to get packet type info

v4:
 - refine the API to return 0 instead of ENOTSUP, and doc and note updated.
 - rte_eth_dev_get_ptype_info -> rte_eth_dev_get_supported_ptypes

v3:
 - em ptype check: (l4_tcp || l4_udp) -> (l4_tcp && l4_udp).
 - avoid rte_be_to_cpu_16 for each packet by adding proper macros.
 - with --parse-ptype specified, use sw parser mandatorily.
 - enable i40e vector driver by default.

v2:
 - Add patchset dependence in commit log.
 - Change hardcoded 0 to RTE_PTYPE_UNKNOWN.
 - More accurate em_parse_type.
 - Add restrictions in EM forwarding functions.
 - Define cb directly to avoid too many function calls when do analyze.
 - Some typo fixed.
 - Change the position to call rte_eth_dev_get_ptype_info
   after rte_eth_dev_start().

Patch 1: refine rte_eth_dev_get_supported_ptypes.
Patch 2: add an option in l3fwd.
Patch 3: enable vector pmd in i40e by default.

Signed-off-by: Jianfeng Tan 
Acked-by: Konstantin Ananyev 


Jianfeng Tan (3):
  ethdev: refine API to query supported packet types
  examples/l3fwd: fix using packet type blindly
  config: enable vector driver by default

 config/common_base  |   2 +-
 doc/guides/nics/overview.rst|   2 +-
 doc/guides/rel_notes/release_16_04.rst  |  15 +
 doc/guides/sample_app_ug/l3_forward.rst |   6 +-
 examples/l3fwd/l3fwd.h  |  14 
 examples/l3fwd/l3fwd_em.c   | 109 
 examples/l3fwd/l3fwd_em.h   |  10 ++-
 examples/l3fwd/l3fwd_em_hlm_sse.h   |  17 +++--
 examples/l3fwd/l3fwd_em_sse.h   |   9 ++-
 examples/l3fwd/l3fwd_lpm.c  |  65 +++
 examples/l3fwd/main.c   |  55 
 lib/librte_ether/rte_ethdev.c   |   3 +-
 lib/librte_ether/rte_ethdev.h   |   9 ++-
 13 files changed, 299 insertions(+), 17 deletions(-)

-- 
2.1.4



[dpdk-dev] [PATCH v4 1/3] ethdev: refine API to query supported packet types

2016-03-25 Thread Jianfeng Tan
Return 0 instead of -ENOTSUP for those which do not fill any packet types,
with some note and doc updated.

Signed-off-by: Jianfeng Tan 
Acked-by: Konstantin Ananyev 
---
 doc/guides/nics/overview.rst  | 2 +-
 lib/librte_ether/rte_ethdev.c | 3 +--
 lib/librte_ether/rte_ethdev.h | 9 ++---
 3 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/doc/guides/nics/overview.rst b/doc/guides/nics/overview.rst
index 542479a..e7504da 100644
--- a/doc/guides/nics/overview.rst
+++ b/doc/guides/nics/overview.rst
@@ -124,7 +124,7 @@ Most of these differences are summarized below.
L4 checksum offload  X   X   X   X
inner L3 checksumX   X   X
inner L4 checksumX   X   X
-   packet type parsing  X   X   X
+   packet type parsing  X X X X X X   X X   X X X 
X X X   X
timesync X X
basic stats  X   X   X X X X   
X X
extended stats   X   X X X X
diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index a328027..1ee79d2 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -1636,8 +1636,7 @@ rte_eth_dev_get_supported_ptypes(uint8_t port_id, 
uint32_t ptype_mask,

RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
dev = &rte_eth_devices[port_id];
-   RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_supported_ptypes_get,
-   -ENOTSUP);
+   RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_supported_ptypes_get, 0);
all_ptypes = (*dev->dev_ops->dev_supported_ptypes_get)(dev);

if (!all_ptypes)
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index e7de34a..5167750 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -2326,6 +2326,9 @@ void rte_eth_dev_info_get(uint8_t port_id, struct 
rte_eth_dev_info *dev_info);
  * @note
  *   Better to invoke this API after the device is already started or rx burst
  *   function is decided, to obtain correct supported ptypes.
+ * @note
+ *   if a given PMD does not report what ptypes it supports, then the supported
+ *   ptype count is reported as 0.
  * @param port_id
  *   The port identifier of the Ethernet device.
  * @param ptype_mask
@@ -2335,9 +2338,9 @@ void rte_eth_dev_info_get(uint8_t port_id, struct 
rte_eth_dev_info *dev_info);
  * @param num
  *  Size of the array pointed by param ptypes.
  * @return
- *   - (>0) Number of supported ptypes. If it exceeds param num, exceeding
- *  packet types will not be filled in the given array.
- *   - (0 or -ENOTSUP) if PMD does not fill the specified ptype.
+ *   - (>=0) Number of supported ptypes. If the number of types exceeds num,
+ only num entries will be filled into the ptypes array, but the 
full
+ count of supported ptypes will be returned.
  *   - (-ENODEV) if *port_id* invalid.
  */
 int rte_eth_dev_get_supported_ptypes(uint8_t port_id, uint32_t ptype_mask,
-- 
2.1.4
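
The return contract documented in the header comment above (fill at
most num entries, but return the full count) can be modelled as below.
This is only a sketch under assumptions: get_supported_ptypes() and its
parameters are hypothetical names, and it is not the DPDK
implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the documented contract:
 *   - a driver that reports nothing yields a count of 0;
 *   - at most num matching entries are written to out;
 *   - the full number of matching entries is always returned. */
static int
get_supported_ptypes(const uint32_t *all, size_t n_all, uint32_t mask,
		     uint32_t *out, int num)
{
	int count = 0;
	size_t i;

	if (all == NULL)
		return 0;

	for (i = 0; i < n_all; i++) {
		if ((all[i] & mask) == 0)
			continue;
		if (count < num)
			out[count] = all[i];
		count++;
	}
	return count;
}
```

A caller can compare the return value with num to detect that its
array was too small and retry with a larger one.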



[dpdk-dev] [PATCH v4 2/3] examples/l3fwd: fix using packet type blindly

2016-03-25 Thread Jianfeng Tan
As an example of using ptype info, l3fwd first needs to call the
rte_eth_dev_get_supported_ptypes() API to check whether the device and/or
its PMD driver will parse and fill in the needed packet types; if not,
use the newly added option, --parse-ptype, to parse them in software in
the callback.

As the EXACT_MATCH mode uses the 5-tuple to calculate the hash,
we narrow down its scope to:
  a. ip packets with no extensions, and
  b. L4 payload should be either tcp or udp.

Note: this patch does not completely solve the issue, "cannot run
l3fwd on virtio or other devices", because hw_ip_checksum may not be
supported by the devices. Currently we can:
  a. remove this requirement, or
  b. wait for virtio front end (pmd) to support it.
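
The narrowed EXACT_MATCH scope described above (IP packets without
extensions, TCP or UDP payload) could be checked in software roughly as
follows. This is a minimal sketch for IPv4 only, working on a raw
Ethernet frame; the SK_PTYPE_* values and sk_parse_ptype() are
hypothetical and are not the RTE_PTYPE_* constants or the callback
added by this patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values for illustration only. */
#define SK_PTYPE_UNKNOWN 0x0u
#define SK_PTYPE_L3_IPV4 0x1u
#define SK_PTYPE_L4_TCP  0x10u
#define SK_PTYPE_L4_UDP  0x20u

/* Classify an untagged Ethernet frame. Anything other than IPv4 with
 * no options carrying TCP or UDP is reported as unknown. */
static uint32_t
sk_parse_ptype(const uint8_t *frame, size_t len)
{
	const uint8_t *ip;

	/* 14-byte Ethernet header plus 20-byte minimal IPv4 header. */
	if (len < 14 + 20)
		return SK_PTYPE_UNKNOWN;

	/* EtherType 0x0800 (IPv4) is stored big endian at offset 12. */
	if (frame[12] != 0x08 || frame[13] != 0x00)
		return SK_PTYPE_UNKNOWN;

	ip = frame + 14;

	/* Version 4 with a header length of 5 words, i.e. no options. */
	if (ip[0] != 0x45)
		return SK_PTYPE_UNKNOWN;

	switch (ip[9]) {	/* IPv4 protocol field */
	case 6:
		return SK_PTYPE_L3_IPV4 | SK_PTYPE_L4_TCP;
	case 17:
		return SK_PTYPE_L3_IPV4 | SK_PTYPE_L4_UDP;
	default:
		return SK_PTYPE_UNKNOWN;
	}
}
```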

Signed-off-by: Jianfeng Tan 
Acked-by: Konstantin Ananyev 
---
 doc/guides/rel_notes/release_16_04.rst  |   9 +++
 doc/guides/sample_app_ug/l3_forward.rst |   6 +-
 examples/l3fwd/l3fwd.h  |  14 
 examples/l3fwd/l3fwd_em.c   | 109 
 examples/l3fwd/l3fwd_em.h   |  10 ++-
 examples/l3fwd/l3fwd_em_hlm_sse.h   |  17 +++--
 examples/l3fwd/l3fwd_em_sse.h   |   9 ++-
 examples/l3fwd/l3fwd_lpm.c  |  65 +++
 examples/l3fwd/main.c   |  55 
 9 files changed, 284 insertions(+), 10 deletions(-)

diff --git a/doc/guides/rel_notes/release_16_04.rst 
b/doc/guides/rel_notes/release_16_04.rst
index 76e4f3d..26e985b 100644
--- a/doc/guides/rel_notes/release_16_04.rst
+++ b/doc/guides/rel_notes/release_16_04.rst
@@ -299,6 +299,15 @@ This section should contain bug fixes added to the 
relevant sections. Sample for
   The title should contain the code/lib section like a commit message.
   Add the entries in alphabetic order in the relevant sections below.

+* **examples/vhost: Fixed frequent mbuf allocation failure.**
+
+  vhost-switch often fails to allocate mbuf when dequeue from vring because it
+  wrongly calculates the number of mbufs needed.
+
+* **examples/l3fwd: Fixed using packet type blindly.**
+
+  l3fwd makes use of packet type information without even query if devices or 
PMDs
+  really set it. For those don't set ptypes, add an option to parse it softly.

 EAL
 ~~~
diff --git a/doc/guides/sample_app_ug/l3_forward.rst 
b/doc/guides/sample_app_ug/l3_forward.rst
index 1522650..3e070d0 100644
--- a/doc/guides/sample_app_ug/l3_forward.rst
+++ b/doc/guides/sample_app_ug/l3_forward.rst
@@ -92,7 +92,7 @@ The application has a number of command line options:

 .. code-block:: console

-./build/l3fwd [EAL options] -- -p PORTMASK [-P]  
--config(port,queue,lcore)[,(port,queue,lcore)] [--enable-jumbo [--max-pkt-len 
PKTLEN]]  [--no-numa][--hash-entry-num][--ipv6]
+./build/l3fwd [EAL options] -- -p PORTMASK [-P]  
--config(port,queue,lcore)[,(port,queue,lcore)] [--enable-jumbo [--max-pkt-len 
PKTLEN]]  [--no-numa][--hash-entry-num][--ipv6] [--parse-ptype]

 where,

@@ -113,6 +113,8 @@ where,

 *   --ipv6: optional, set it if running ipv6 packets

+*   --parse-ptype: optional, set it if use software way to analyze packet type
+
 For example, consider a dual processor socket platform where cores 0-7 and 
16-23 appear on socket 0, while cores 8-15 and 24-31 appear on socket 1.
 Let's say that the programmer wants to use memory from both NUMA nodes, the 
platform has only two ports, one connected to each NUMA node,
 and the programmer wants to use two cores from each processor socket to do the 
packet processing.
@@ -334,6 +336,8 @@ The key code snippet of simple_ipv4_fwd_4pkts() is shown 
below:

 The simple_ipv6_fwd_4pkts() function is similar to the simple_ipv4_fwd_4pkts() 
function.

+Known issue: IP packets with extensions or IP packets which are not TCP/UDP 
cannot work well at this mode.
+
 Packet Forwarding for LPM-based Lookups
 ~~~

diff --git a/examples/l3fwd/l3fwd.h b/examples/l3fwd/l3fwd.h
index 726e8cc..d8798b7 100644
--- a/examples/l3fwd/l3fwd.h
+++ b/examples/l3fwd/l3fwd.h
@@ -206,6 +206,20 @@ void
 setup_hash(const int socketid);

 int
+em_check_ptype(int portid);
+
+int
+lpm_check_ptype(int portid);
+
+uint16_t
+em_cb_parse_ptype(uint8_t port, uint16_t queue, struct rte_mbuf *pkts[],
+ uint16_t nb_pkts, uint16_t max_pkts, void *user_param);
+
+uint16_t
+lpm_cb_parse_ptype(uint8_t port, uint16_t queue, struct rte_mbuf *pkts[],
+  uint16_t nb_pkts, uint16_t max_pkts, void *user_param);
+
+int
 em_main_loop(__attribute__((unused)) void *dummy);

 int
diff --git a/examples/l3fwd/l3fwd_em.c b/examples/l3fwd/l3fwd_em.c
index 526b485..fc59243 100644
--- a/examples/l3fwd/l3fwd_em.c
+++ b/examples/l3fwd/l3fwd_em.c
@@ -42,6 +42,7 @@
 #include 
 #include 
 #include 
+#include 

 #include 
 #include 
@@ -519,6 +520,114 @@ populate_ipv6_many_flow_into_table(const struct rte_hash 
*h,
printf("Hash: Adding 0x%x keys\n", nr_flow);
 }

+/* Requirements:
+ * 1. IP packets without extension;
+ * 2.

[dpdk-dev] [PATCH v4 3/3] config: enable vector driver by default

2016-03-25 Thread Jianfeng Tan
Previously, the vector driver was not the first (default) choice for i40e,
as it cannot fill in packet type info for l3fwd to work well. Now there
is an option for l3fwd to parse packet types in software, so enable it
by default.

Signed-off-by: Jianfeng Tan 
Acked-by: Konstantin Ananyev 
---
 config/common_base | 2 +-
 doc/guides/rel_notes/release_16_04.rst | 6 ++
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/config/common_base b/config/common_base
index d98a82c..abd6a64 100644
--- a/config/common_base
+++ b/config/common_base
@@ -179,7 +179,7 @@ CONFIG_RTE_LIBRTE_I40E_DEBUG_TX=n
 CONFIG_RTE_LIBRTE_I40E_DEBUG_TX_FREE=n
 CONFIG_RTE_LIBRTE_I40E_DEBUG_DRIVER=n
 CONFIG_RTE_LIBRTE_I40E_RX_ALLOW_BULK_ALLOC=y
-CONFIG_RTE_LIBRTE_I40E_INC_VECTOR=n
+CONFIG_RTE_LIBRTE_I40E_INC_VECTOR=y
 CONFIG_RTE_LIBRTE_I40E_RX_OLFLAGS_ENABLE=y
 CONFIG_RTE_LIBRTE_I40E_16BYTE_RX_DESC=n
 CONFIG_RTE_LIBRTE_I40E_QUEUE_NUM_PER_PF=64
diff --git a/doc/guides/rel_notes/release_16_04.rst 
b/doc/guides/rel_notes/release_16_04.rst
index 26e985b..7359604 100644
--- a/doc/guides/rel_notes/release_16_04.rst
+++ b/doc/guides/rel_notes/release_16_04.rst
@@ -390,6 +390,12 @@ Drivers
   Allowed AES GCM on the cryptodev API, but in some cases gave invalid results
   due to incorrect IV setting.

+* **i40e: enable vector driver by default.**
+
+  Previously, vector driver is disabled by default as it cannot fill packet 
type
+  info for l3fwd to work well. Now there is an option for l3fwd to analysis
+  packet type softly, so enable vector driver by default.
+

 Libraries
 ~
-- 
2.1.4



[dpdk-dev] [PATCH v2 0/4] vhost vlan tag and TSO fixes/cleanups

2016-03-25 Thread Yuanhan Liu
v2: - we can't remove the left part of TSO settings to lib vhost, which
  hurts VM2VM performance badly.

Ksiadz reported that TSO won't work for OVS with NIC, even with those
similar changes from the commit 9fd72e3cbd29 ("examples/vhost: add
virtio offload").

This gave me another chance to look at the TSO implementation a bit
deeper, and I came up with this small patch set, which includes a
TSO cleanup and fix.

Patch 4 is a vlan tag fix reported from Qian.

---
Yuanhan Liu (4):
  vhost: remove unnecessary return
  examples/vhost: remove unnecessary pseudo checksum calc
  examples/vhost: fix offload settings
  examples/vhost: fix wrong vlan_tag

 examples/vhost/main.c | 44 ++-
 lib/librte_vhost/vhost_rxtx.c |  2 --
 2 files changed, 10 insertions(+), 36 deletions(-)

-- 
1.9.0



[dpdk-dev] [PATCH v2 1/4] vhost: remove unnecessary return

2016-03-25 Thread Yuanhan Liu
Signed-off-by: Yuanhan Liu 
---
 lib/librte_vhost/vhost_rxtx.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c
index b4da665..7d1224c 100644
--- a/lib/librte_vhost/vhost_rxtx.c
+++ b/lib/librte_vhost/vhost_rxtx.c
@@ -123,8 +123,6 @@ virtio_enqueue_offload(struct rte_mbuf *m_buf, struct 
virtio_net_hdr *net_hdr)
net_hdr->hdr_len = m_buf->l2_len + m_buf->l3_len
+ m_buf->l4_len;
}
-
-   return;
 }

 static inline void
-- 
1.9.0



[dpdk-dev] [PATCH v2 2/4] examples/vhost: remove unnecessary pseudo checksum calc

2016-03-25 Thread Yuanhan Liu
For the checksum-offload-only case, the TCP/IP stack would
have already calculated the pseudo checksum. Therefore, we don't
need to calculate it again here; remove it.

Signed-off-by: Yuanhan Liu 
---
 examples/vhost/main.c | 41 ++---
 1 file changed, 6 insertions(+), 35 deletions(-)

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index a45cddb..ceabbce 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -52,8 +52,6 @@
 #include 
 #include 
 #include 
-#include 
-#include 

 #include "main.h"

@@ -1161,42 +1159,15 @@ static void virtio_tx_offload(struct rte_mbuf *m)
void *l3_hdr;
struct ipv4_hdr *ipv4_hdr = NULL;
struct tcp_hdr *tcp_hdr = NULL;
-   struct udp_hdr *udp_hdr = NULL;
-   struct sctp_hdr *sctp_hdr = NULL;
struct ether_hdr *eth_hdr = rte_pktmbuf_mtod(m, struct ether_hdr *);

l3_hdr = (char *)eth_hdr + m->l2_len;

-   if (m->tso_segsz != 0) {
-   ipv4_hdr = (struct ipv4_hdr *)l3_hdr;
-   tcp_hdr = (struct tcp_hdr *)((char *)l3_hdr + m->l3_len);
-   m->ol_flags |= PKT_TX_IP_CKSUM;
-   ipv4_hdr->hdr_checksum = 0;
-   tcp_hdr->cksum = get_psd_sum(l3_hdr, m->ol_flags);
-   return;
-   }
-
-   if (m->ol_flags & PKT_TX_L4_MASK) {
-   switch (m->ol_flags & PKT_TX_L4_MASK) {
-   case PKT_TX_TCP_CKSUM:
-   tcp_hdr = (struct tcp_hdr *)
-   ((char *)l3_hdr + m->l3_len);
-   tcp_hdr->cksum = get_psd_sum(l3_hdr, m->ol_flags);
-   break;
-   case PKT_TX_UDP_CKSUM:
-   udp_hdr = (struct udp_hdr *)
-   ((char *)l3_hdr + m->l3_len);
-   udp_hdr->dgram_cksum = get_psd_sum(l3_hdr, m->ol_flags);
-   break;
-   case PKT_TX_SCTP_CKSUM:
-   sctp_hdr = (struct sctp_hdr *)
-   ((char *)l3_hdr + m->l3_len);
-   sctp_hdr->cksum = 0;
-   break;
-   default:
-   break;
-   }
-   }
+   ipv4_hdr = (struct ipv4_hdr *)l3_hdr;
+   tcp_hdr = (struct tcp_hdr *)((char *)l3_hdr + m->l3_len);
+   m->ol_flags |= PKT_TX_IP_CKSUM;
+   ipv4_hdr->hdr_checksum = 0;
+   tcp_hdr->cksum = get_psd_sum(l3_hdr, m->ol_flags);
 }

 /*
@@ -1265,7 +1236,7 @@ virtio_tx_route(struct vhost_dev *vdev, struct rte_mbuf 
*m, uint16_t vlan_tag)
m->vlan_tci = vlan_tag;
}

-   if ((m->ol_flags & PKT_TX_L4_MASK) || (m->ol_flags & PKT_TX_TCP_SEG))
+   if (m->ol_flags & PKT_TX_TCP_SEG)
virtio_tx_offload(m);

tx_q->m_table[len] = m;
-- 
1.9.0



[dpdk-dev] [PATCH v2 3/4] examples/vhost: fix offload settings

2016-03-25 Thread Yuanhan Liu
The comments for PKT_TX_TCP_SEG in rte_mbuf say that we should only set
PKT_TX_IP_CKSUM and reset the IP header checksum for IPv4:

  - if it's IPv4, set the PKT_TX_IP_CKSUM flag and write the IP checksum
to 0 in the packet

Fixes: 9fd72e3cbd29 ("examples/vhost: add virtio offload")

Signed-off-by: Yuanhan Liu 
---
 examples/vhost/main.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index ceabbce..86e5c24 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -1163,10 +1163,13 @@ static void virtio_tx_offload(struct rte_mbuf *m)

l3_hdr = (char *)eth_hdr + m->l2_len;

-   ipv4_hdr = (struct ipv4_hdr *)l3_hdr;
+   if (m->ol_flags & PKT_TX_IPV4) {
+   ipv4_hdr = l3_hdr;
+   ipv4_hdr->hdr_checksum = 0;
+   m->ol_flags |= PKT_TX_IP_CKSUM;
+   }
+
tcp_hdr = (struct tcp_hdr *)((char *)l3_hdr + m->l3_len);
-   m->ol_flags |= PKT_TX_IP_CKSUM;
-   ipv4_hdr->hdr_checksum = 0;
tcp_hdr->cksum = get_psd_sum(l3_hdr, m->ol_flags);
 }

-- 
1.9.0
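
The rule applied by the offload fix above (request IP checksum offload
and zero the IP header checksum only when the packet is IPv4) can be
sketched in isolation. Everything here is a simplified assumption: the
SK_TX_* bits and struct sk_pkt are hypothetical stand-ins for the mbuf
flags and headers, and the pseudo-header checksum step from the patch
is left out.

```c
#include <stdint.h>

/* Hypothetical flag bits; not the PKT_TX_* definitions. */
#define SK_TX_IPV4     (1ull << 0)
#define SK_TX_IP_CKSUM (1ull << 1)
#define SK_TX_TCP_SEG  (1ull << 2)

/* Hypothetical packet descriptor holding only what the sketch needs. */
struct sk_pkt {
	uint64_t ol_flags;
	uint16_t ip_hdr_checksum;
};

/* Prepare a packet for TCP segmentation offload. */
static void
sk_prepare_tso(struct sk_pkt *p)
{
	if (!(p->ol_flags & SK_TX_TCP_SEG))
		return;

	/* Only IPv4 has a header checksum to reset and offload. */
	if (p->ol_flags & SK_TX_IPV4) {
		p->ip_hdr_checksum = 0;
		p->ol_flags |= SK_TX_IP_CKSUM;
	}
}
```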



[dpdk-dev] [PATCH v2 4/4] examples/vhost: fix wrong vlan_tag

2016-03-25 Thread Yuanhan Liu
While the last arg of virtio_tx_route() expects a vlan tag, we currently
feed it device_fh, which is wrong. Fix it.

Fixes: 4796ad63ba1f ("examples/vhost: import userspace vhost application")

Reported-by: Qian Xu 
Signed-off-by: Yuanhan Liu 
---
 examples/vhost/main.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index 86e5c24..28c17af 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -1396,8 +1396,10 @@ switch_worker(__attribute__((unused)) void *arg)

rte_pktmbuf_free(pkts_burst[--tx_count]);
}
}
-   for (i = 0; i < tx_count; ++i)
-   virtio_tx_route(vdev, pkts_burst[i], 
(uint16_t)dev->device_fh);
+   for (i = 0; i < tx_count; ++i) {
+   virtio_tx_route(vdev, pkts_burst[i],
+   
vlan_tags[(uint16_t)dev->device_fh]);
+   }
}

/*move to the next device in the list*/
-- 
1.9.0



[dpdk-dev] [PATCH 1/2] Fix CPU and memory parameters on IBM POWER8

2016-03-25 Thread Chao Zhu
This patch fixes the max logical core number and memory channel number
settings on the IBM POWER8 platform.
1. The max number of logical cores of a POWER8 processor is 96. Normally,
   there are two sockets on a server, so the max number of logical cores
   is 192. So this patch sets CONFIG_RTE_MAX_LCORE to 256.
2. Currently, the max number of memory channels is hardcoded to 4. However,
   on a POWER8 machine, the max number of memory channels is 8. To fix this,
   CONFIG_RTE_MAX_NCHANNELS is added to do the configuration.

Signed-off-by: Chao Zhu 
---
 config/common_base |3 ++-
 lib/librte_eal/common/eal_common_options.c |2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/config/common_base b/config/common_base
index dbd405b..1beea32 100644
--- a/config/common_base
+++ b/config/common_base
@@ -83,10 +83,11 @@ CONFIG_RTE_CACHE_LINE_SIZE=64
 # Compile Environment Abstraction Layer
 #
 CONFIG_RTE_LIBRTE_EAL=y
-CONFIG_RTE_MAX_LCORE=128
+CONFIG_RTE_MAX_LCORE=256
 CONFIG_RTE_MAX_NUMA_NODES=8
 CONFIG_RTE_MAX_MEMSEG=256
 CONFIG_RTE_MAX_MEMZONE=2560
+CONFIG_RTE_MAX_NCHANNELS=8
 CONFIG_RTE_MAX_TAILQ=32
 CONFIG_RTE_LOG_LEVEL=8
 CONFIG_RTE_LOG_HISTORY=256
diff --git a/lib/librte_eal/common/eal_common_options.c 
b/lib/librte_eal/common/eal_common_options.c
index 29942ea..6c268c1 100644
--- a/lib/librte_eal/common/eal_common_options.c
+++ b/lib/librte_eal/common/eal_common_options.c
@@ -798,7 +798,7 @@ eal_parse_common_option(int opt, const char *optarg,
case 'n':
conf->force_nchannel = atoi(optarg);
if (conf->force_nchannel == 0 ||
-   conf->force_nchannel > 4) {
+   conf->force_nchannel > RTE_MAX_NCHANNELS) {
RTE_LOG(ERR, EAL, "invalid channel number\n");
return -1;
}
-- 
1.7.1
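
The channel-count check changed by the patch above can be sketched as a
small stand-alone function. SK_MAX_NCHANNELS stands in for the new
RTE_MAX_NCHANNELS build option and sk_parse_nchannel() is a
hypothetical name; unlike the EAL code, this sketch uses a signed int
and therefore also rejects negative input.

```c
#include <stdlib.h>

/* Stand-in for RTE_MAX_NCHANNELS (8 in the patched common_base). */
#define SK_MAX_NCHANNELS 8

/* Return the parsed channel count, or -1 if it is out of range. */
static int
sk_parse_nchannel(const char *optarg)
{
	int n = atoi(optarg);

	/* atoi() yields 0 for non-numeric input, which is rejected too. */
	if (n <= 0 || n > SK_MAX_NCHANNELS)
		return -1;
	return n;
}
```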



[dpdk-dev] [PATCH 2/2] Fix prefetch instruction on IBM POWER8

2016-03-25 Thread Chao Zhu
The current prefetch instruction (dcbt) implementation for IBM POWER8 has
the wrong Touch Hint (TH) parameter. The current setting of TH=1 indicates
loading data from the current cache line and an unlimited number of
sequentially following cache lines. TH=0 means loading data from the
current cache line. The rte_prefetch0 function is defined to load one
cache line, which means TH=0 is suited here.

Signed-off-by: Chao Zhu 
---
 .../common/include/arch/ppc_64/rte_prefetch.h  |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/common/include/arch/ppc_64/rte_prefetch.h 
b/lib/librte_eal/common/include/arch/ppc_64/rte_prefetch.h
index bcc7185..9a1995e 100644
--- a/lib/librte_eal/common/include/arch/ppc_64/rte_prefetch.h
+++ b/lib/librte_eal/common/include/arch/ppc_64/rte_prefetch.h
@@ -41,17 +41,17 @@ extern "C" {

 static inline void rte_prefetch0(const volatile void *p)
 {
-   asm volatile ("dcbt 0,%[p],1" : : [p] "r" (p));
+   asm volatile ("dcbt 0,%[p],0" : : [p] "r" (p));
 }

 static inline void rte_prefetch1(const volatile void *p)
 {
-   asm volatile ("dcbt 0,%[p],1" : : [p] "r" (p));
+   asm volatile ("dcbt 0,%[p],0" : : [p] "r" (p));
 }

 static inline void rte_prefetch2(const volatile void *p)
 {
-   asm volatile ("dcbt 0,%[p],1" : : [p] "r" (p));
+   asm volatile ("dcbt 0,%[p],0" : : [p] "r" (p));
 }

 static inline void rte_prefetch_non_temporal(const volatile void *p)
-- 
1.7.1



[dpdk-dev] [PATCH 0/2] Fix parameters and prefetch function on IBM POWER8

2016-03-25 Thread Chao Zhu
This patch set fixes CPU/memory parameters and corrects wrong prefetch
settings for IBM POWER8.

Chao Zhu (2):
  Fix CPU and memory parameters on IBM POWER8
  Fix prefetch instruction on IBM POWER8

 config/common_base |3 ++-
 lib/librte_eal/common/eal_common_options.c |2 +-
 .../common/include/arch/ppc_64/rte_prefetch.h  |6 +++---
 3 files changed, 6 insertions(+), 5 deletions(-)



[dpdk-dev] [PATCH 0/3 v7] i40e: Add floating VEB support for i40e

2016-03-25 Thread Zhe Tao
This patch set adds support for floating VEB in i40e.
All the VF VSIs can decide whether to connect to the legacy VEB/VEPA or
the floating VEB. When connecting to the floating VEB, a new floating VEB
is created. For now, all the VFs need to connect to either the floating
VEB or the legacy VEB; they cannot connect to both. The PF and VMDQ, FD
VSIs connect to the old legacy VEB/VEPA.

All the VEB/VEPA concepts are not specific to FVL; they are defined in the
802.1Qbg spec.

The floating VEB only takes effect with a specific F/W version.

Zhe Tao (3):
  Support floating VEB config
  Add floating VEB support in i40e
  Add global reset support for i40e

 doc/guides/nics/i40e.rst   |   7 ++
 doc/guides/rel_notes/release_16_04.rst |   2 +
 drivers/net/i40e/i40e_ethdev.c | 189 +
 drivers/net/i40e/i40e_ethdev.h |  38 +++
 drivers/net/i40e/i40e_pf.c |  11 +-
 5 files changed, 223 insertions(+), 24 deletions(-)

V2: Added the release notes and changed commit log. 
V3: Changed the VSI release operation. 
V4: Added the FW version check; otherwise it will cause a segmentation fault.
V5: Edited the code for new share code APIs
V6: Changed the floating VEB configuration method 
V7: Added global reset for i40e 
-- 
2.1.4



[dpdk-dev] [PATCH 1/3 v7] i40e: support floating VEB config

2016-03-25 Thread Zhe Tao
Add a new floating VEB argument option to the devargs.
Using this parameter, all the sample applications can decide whether to use
the legacy VEB/VEPA or the floating VEB.
To enable this feature, the user should pass a devargs parameter to the EAL
like "-w 84:00.0,enable_floating=1", and the application will make sure the PMD
will use the floating VEB feature for all the VFs created by this PF device.

Signed-off-by: Zhe Tao 
---
 drivers/net/i40e/i40e_ethdev.c | 44 ++
 drivers/net/i40e/i40e_ethdev.h |  6 ++
 2 files changed, 50 insertions(+)

diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 6fdae57..01f1d3d 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -739,6 +739,44 @@ i40e_add_tx_flow_control_drop_filter(struct i40e_pf *pf)
  " frames from VSIs.");
 }

+static int i40e_check_floating_handler(__rte_unused const char *key,
+  const char *value,
+  __rte_unused void *opaque)
+{
+   if (strcmp(value, "1"))
+   return -1;
+
+   return 0;
+}
+
+static int
+i40e_check_floating(struct rte_devargs *devargs)
+{
+   struct rte_kvargs *kvlist;
+   const char *floating_key = "enable_floating";
+
+   if (devargs == NULL)
+   return 0;
+
+   kvlist = rte_kvargs_parse(devargs->args, NULL);
+   if (kvlist == NULL)
+   return 0;
+
+   if (!rte_kvargs_count(kvlist, floating_key)) {
+   rte_kvargs_free(kvlist);
+   return 0;
+   }
+   /* Floating is enabled when there's key-value pair: enable_floating=1 */
+   if (rte_kvargs_process(kvlist, floating_key,
+  i40e_check_floating_handler, NULL) < 0) {
+   rte_kvargs_free(kvlist);
+   return 0;
+   }
+   rte_kvargs_free(kvlist);
+
+   return 1;
+}
+
 static int
 eth_i40e_dev_init(struct rte_eth_dev *dev)
 {
@@ -829,6 +867,12 @@ eth_i40e_dev_init(struct rte_eth_dev *dev)
 ((hw->nvm.version >> 4) & 0xff),
 (hw->nvm.version & 0xf), hw->nvm.eetrack);

+   /* Need the special FW version support floating VEB */
+   if (hw->aq.fw_maj_ver >= FLOATING_FW_MAJ) {
+   pf->floating = i40e_check_floating(pci_dev->devargs);
+   } else {
+   pf->floating = false;
+   }
/* Clear PXE mode */
i40e_clear_pxe_mode(hw);

diff --git a/drivers/net/i40e/i40e_ethdev.h b/drivers/net/i40e/i40e_ethdev.h
index 1c75672..7dc6936 100644
--- a/drivers/net/i40e/i40e_ethdev.h
+++ b/drivers/net/i40e/i40e_ethdev.h
@@ -36,6 +36,7 @@

 #include 
 #include 
+#include 

 #define I40E_VLAN_TAG_SIZE4

@@ -171,6 +172,10 @@ enum i40e_flxpld_layer_idx {
 #define I40E_QUEUE_ITR_INTERVAL_DEFAULT 32 /* 32 us */
 #define I40E_QUEUE_ITR_INTERVAL_MAX 8160 /* 8160 us */

+/* Special FW support this floating VEB feature */
+#define FLOATING_FW_MAJ 5
+#define FLOATING_FW_MIN 0
+
 struct i40e_adapter;

 /**
@@ -446,6 +451,7 @@ struct i40e_pf {
struct i40e_fc_conf fc_conf; /* Flow control conf */
struct i40e_mirror_rule_list mirror_list;
uint16_t nb_mirror_rule;   /* The number of mirror rules */
+   uint16_t floating; /* The flag to use the floating VEB */
 };

 enum pending_msg {
-- 
2.1.4



[dpdk-dev] [PATCH 2/3 v7] i40e: Add floating VEB support in i40e

2016-03-25 Thread Zhe Tao
This patch adds support for floating VEB in i40e.
All the VF VSIs can decide whether to connect to the legacy VEB/VEPA or
the floating VEB. When connecting to the floating VEB, a new floating VEB
is created. For now, all the VFs need to connect to either the floating
VEB or the legacy VEB; they cannot connect to both. The PF and VMDQ, FD
VSIs still connect to the old legacy VEB/VEPA.

All the VEB/VEPA concepts are not specific to FVL; they are defined in the
802.1Qbg spec.

For now, the floating VEB feature is only available in a specific version
of the FW.

Signed-off-by: Zhe Tao 
---
 doc/guides/nics/i40e.rst   |   7 +++
 doc/guides/rel_notes/release_16_04.rst |   2 +
 drivers/net/i40e/i40e_ethdev.c | 110 ++---
 drivers/net/i40e/i40e_ethdev.h |   2 +
 drivers/net/i40e/i40e_pf.c |  11 +++-
 5 files changed, 109 insertions(+), 23 deletions(-)

diff --git a/doc/guides/nics/i40e.rst b/doc/guides/nics/i40e.rst
index 4019b41..520ea09 100644
--- a/doc/guides/nics/i40e.rst
+++ b/doc/guides/nics/i40e.rst
@@ -366,3 +366,10 @@ Delete all flow director rules on a port:

testpmd> flush_flow_director 0

+Floating VEB
+~
+FVL can support floating VEB feature.
+To enable this feature, the user should pass a devargs parameter to the EAL
+like "-w 84:00.0,enable_floating=1", and the application will make sure the PMD
+will use the floating VEB feature for all the VFs created by this PF device.
+
diff --git a/doc/guides/rel_notes/release_16_04.rst 
b/doc/guides/rel_notes/release_16_04.rst
index 9922bcb..1545872 100644
--- a/doc/guides/rel_notes/release_16_04.rst
+++ b/doc/guides/rel_notes/release_16_04.rst
@@ -248,6 +248,8 @@ This section should contain new features added in this 
release. Sample format:

   New application implementing an IPsec Security Gateway.

+* **Added floating VEB support for FVL.**
+

 Resolved Issues
 ---
diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 01f1d3d..87801d3 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -3734,21 +3734,27 @@ i40e_veb_release(struct i40e_veb *veb)
struct i40e_vsi *vsi;
struct i40e_hw *hw;

-   if (veb == NULL || veb->associate_vsi == NULL)
+   if (veb == NULL)
return -EINVAL;

if (!TAILQ_EMPTY(&veb->head)) {
PMD_DRV_LOG(ERR, "VEB still has VSI attached, can't remove");
return -EACCES;
}
+   /* associate_vsi field is NULL for floating VEB */
+   if (veb->associate_vsi != NULL) {
+   vsi = veb->associate_vsi;
+   hw = I40E_VSI_TO_HW(vsi);

-   vsi = veb->associate_vsi;
-   hw = I40E_VSI_TO_HW(vsi);
+   vsi->uplink_seid = veb->uplink_seid;
+   vsi->veb = NULL;
+   } else {
+   veb->associate_pf->main_vsi->floating_veb = NULL;
+   hw = I40E_VSI_TO_HW(veb->associate_pf->main_vsi);
+   }

-   vsi->uplink_seid = veb->uplink_seid;
i40e_aq_delete_element(hw, veb->seid, NULL);
rte_free(veb);
-   vsi->veb = NULL;
return I40E_SUCCESS;
 }

@@ -3760,9 +3766,9 @@ i40e_veb_setup(struct i40e_pf *pf, struct i40e_vsi *vsi)
int ret;
struct i40e_hw *hw;

-   if (NULL == pf || vsi == NULL) {
+   if (NULL == pf) {
PMD_DRV_LOG(ERR, "veb setup failed, "
-   "associated VSI shouldn't null");
+   "associated PF shouldn't null");
return NULL;
}
hw = I40E_PF_TO_HW(pf);
@@ -3774,11 +3780,19 @@ i40e_veb_setup(struct i40e_pf *pf, struct i40e_vsi *vsi)
}

veb->associate_vsi = vsi;
+   veb->associate_pf = pf;
TAILQ_INIT(&veb->head);
-   veb->uplink_seid = vsi->uplink_seid;
+   veb->uplink_seid = vsi ? vsi->uplink_seid : 0;

-   ret = i40e_aq_add_veb(hw, veb->uplink_seid, vsi->seid,
-   I40E_DEFAULT_TCMAP, false, &veb->seid, false, NULL);
+   /* create floating veb if vsi is NULL */
+   if (vsi != NULL) {
+   ret = i40e_aq_add_veb(hw, veb->uplink_seid, vsi->seid,
+ I40E_DEFAULT_TCMAP, false,
+ &veb->seid, false, NULL);
+   } else {
+   ret = i40e_aq_add_veb(hw, 0, 0, I40E_DEFAULT_TCMAP,
+ true, &veb->seid, false, NULL);
+   }

if (ret != I40E_SUCCESS) {
PMD_DRV_LOG(ERR, "Add veb failed, aq_err: %d",
@@ -3794,10 +3808,10 @@ i40e_veb_setup(struct i40e_pf *pf, struct i40e_vsi *vsi)
hw->aq.asq_last_status);
goto fail;
}
-
/* Get VEB bandwidth, to be implemented */
/* Now associated vsi binding to the VEB, set uplink to this VEB */
-   vsi->uplink_seid = veb->seid;
+   if (vsi)
+   vsi->uplink_seid = veb->seid;

return veb;
 f

[dpdk-dev] [PATCH 3/3 v7] i40e: Add global reset support for i40e

2016-03-25 Thread Zhe Tao
Add global reset support in i40e.
Sometimes the PF reset fails, and the PF software reset cannot ensure
that all the state and components are reset. So add the global reset to
fix this issue.
The essential difference between the new global reset and the PF reset is
that the PF reset doesn't clear the packet buffers, doesn't reset the PE
firmware, and doesn't bother the other PFs on the chip.

Signed-off-by: Zhe Tao 
---
 drivers/net/i40e/i40e_ethdev.c | 35 ++-
 drivers/net/i40e/i40e_ethdev.h | 30 ++
 2 files changed, 64 insertions(+), 1 deletion(-)

diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 87801d3..8336321 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -437,6 +437,8 @@ static int i40e_get_eeprom(struct rte_eth_dev *dev,
 static void i40e_set_default_mac_addr(struct rte_eth_dev *dev,
  struct ether_addr *mac_addr);

+static void i40e_do_reset(struct i40e_hw *hw, u32 reset_flags);
+
 static const struct rte_pci_id pci_id_i40e_map[] = {
 #define RTE_PCI_DEV_ID_DECL_I40E(vend, dev) {RTE_PCI_DEVICE(vend, dev)},
 #include "rte_pci_dev_ids.h"
@@ -836,7 +838,7 @@ eth_i40e_dev_init(struct rte_eth_dev *dev)
ret = i40e_pf_reset(hw);
if (ret) {
PMD_INIT_LOG(ERR, "Failed to reset pf: %d", ret);
-   return ret;
+   i40e_do_reset(hw, BIT(__I40E_GLOBAL_RESET_REQUESTED));
}

/* Initialize the shared code (base driver) */
@@ -9117,3 +9119,34 @@ static void i40e_set_default_mac_addr(struct rte_eth_dev 
*dev,
/* Flags: 0x3 updates port address */
i40e_aq_mac_address_write(hw, 0x3, mac_addr->addr_bytes, NULL);
 }
+
+/**
+ * i40e_do_reset - Start a PF or Core Reset sequence
+ * @pf: board private structure
+ * @reset_flags: which reset is requested
+ *
+ * The essential difference in resets is that the PF Reset
+ * doesn't clear the packet buffers, doesn't reset the PE
+ * firmware, and doesn't bother the other PFs on the chip.
+ **/
+static void i40e_do_reset(struct i40e_hw *hw, u32 reset_flags)
+{
+   u32 val;
+
+   /* do the biggest reset indicated */
+   if (reset_flags & BIT_ULL(__I40E_GLOBAL_RESET_REQUESTED)) {
+   /* Request a Global Reset
+*
+* This will start the chip's countdown to the actual full
+* chip reset event, and a warning interrupt to be sent
+* to all PFs, including the requestor.  Our handler
+* for the warning interrupt will deal with the shutdown
+* and recovery of the switch setup.
+*/
+   PMD_INIT_LOG(NOTICE, "GlobalR requested\n");
+   val = rd32(hw, I40E_GLGEN_RTRIG);
+   val |= I40E_GLGEN_RTRIG_GLOBR_MASK;
+   wr32(hw, I40E_GLGEN_RTRIG, val);
+   }
+   /* other reset operations are not supported now */
+}
diff --git a/drivers/net/i40e/i40e_ethdev.h b/drivers/net/i40e/i40e_ethdev.h
index 09fb6e2..f2a2fcc 100644
--- a/drivers/net/i40e/i40e_ethdev.h
+++ b/drivers/net/i40e/i40e_ethdev.h
@@ -108,6 +108,36 @@ enum i40e_flxpld_layer_idx {
I40E_FLXPLD_L4_IDX= 2,
I40E_MAX_FLXPLD_LAYER = 3,
 };
+
+/* driver state flags */
+enum i40e_state_t {
+   __I40E_TESTING,
+   __I40E_CONFIG_BUSY,
+   __I40E_CONFIG_DONE,
+   __I40E_DOWN,
+   __I40E_NEEDS_RESTART,
+   __I40E_SERVICE_SCHED,
+   __I40E_ADMINQ_EVENT_PENDING,
+   __I40E_MDD_EVENT_PENDING,
+   __I40E_VFLR_EVENT_PENDING,
+   __I40E_RESET_RECOVERY_PENDING,
+   __I40E_RESET_INTR_RECEIVED,
+   __I40E_REINIT_REQUESTED,
+   __I40E_PF_RESET_REQUESTED,
+   __I40E_CORE_RESET_REQUESTED,
+   __I40E_GLOBAL_RESET_REQUESTED,
+   __I40E_EMP_RESET_REQUESTED,
+   __I40E_EMP_RESET_INTR_RECEIVED,
+   __I40E_FILTER_OVERFLOW_PROMISC,
+   __I40E_SUSPENDED,
+   __I40E_BAD_EEPROM,
+   __I40E_DEBUG_MODE,
+   __I40E_DOWN_REQUESTED,
+   __I40E_FD_FLUSH_REQUESTED,
+   __I40E_RESET_FAILED,
+   __I40E_PORT_TX_SUSPENDED,
+   __I40E_VF_DISABLE,
+};
 #define I40E_MAX_FLXPLD_FIED3  /* max number of flex payload fields */
 #define I40E_FDIR_BITMASK_NUM_WORD  2  /* max number of bitmask words */
 #define I40E_FDIR_MAX_FLEXWORD_NUM  8  /* max number of flexpayload words */
-- 
2.1.4
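
A minimal sketch of the read-modify-write pattern that i40e_do_reset() uses in the patch above. Every name below (SKETCH_*, sketch_*) is hypothetical, the register is simulated by a plain variable, and the mask value is an assumption rather than the real I40E_GLGEN_RTRIG layout. Note that the patch builds the flag with BIT() at the call site but tests it with BIT_ULL(); the sketch uses one 32-bit helper on both sides.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical state flags; only the ordering idea is taken from the patch. */
enum sketch_state {
	SKETCH_PF_RESET_REQUESTED,
	SKETCH_CORE_RESET_REQUESTED,
	SKETCH_GLOBAL_RESET_REQUESTED,
};

/* Assumed bit position of the "global reset" trigger in the register. */
#define SKETCH_RTRIG_GLOBR_MASK 0x2u

/* Simulated trigger register (stands in for a real MMIO register). */
static uint32_t sketch_rtrig_reg;

/* Build a 32-bit flag; the same helper is used to set and to test it. */
static uint32_t sketch_flag(unsigned int bit)
{
	assert(bit < 32);
	return UINT32_C(1) << bit;
}

static uint32_t sketch_rd32(void) { return sketch_rtrig_reg; }
static void sketch_wr32(uint32_t val) { sketch_rtrig_reg = val; }

/* Returns 1 if a global reset was triggered, 0 if nothing was done. */
static int sketch_do_reset(uint32_t reset_flags)
{
	if (reset_flags & sketch_flag(SKETCH_GLOBAL_RESET_REQUESTED)) {
		/* Read-modify-write so the other register bits are preserved. */
		uint32_t val = sketch_rd32();

		val |= SKETCH_RTRIG_GLOBR_MASK;
		sketch_wr32(val);
		return 1;
	}
	/* Other reset types are not handled, as in the patch. */
	return 0;
}
```

Returning a status here is a choice of the sketch only; the function in the patch returns void.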



[dpdk-dev] [PATCH 1/2] Fix CPU and memory parameters on IBM POWER8

2016-03-25 Thread David Marchand
On Fri, Mar 25, 2016 at 9:11 AM, Chao Zhu  wrote:
> This patch fixes the max logic number and memory channel number settings
> on IBM POWER8 platform.
> 1. The max number of logic cores of a POWER8 processor is 96. Normally,
>there are two sockets on a server. So the max number of logic cores
>is 192. So this patch sets CONFIG_RTE_MAX_LCORE to 256.

This is a POWER8 configuration item; it should go in the POWER8 config
file, not common_base.

> 2. Currently, the max number of memory channels is hardcoded to 4. However,
>on a POWER8 machine, the max number of memory channels is 8. To fix this,
>CONFIG_RTE_MAX_NCHANNELS is added to do the configuration.

I don't see any reason why we would need a max value for force_nchannel.
We should just get rid of this check. This is an obscure parameter for
most people, so people playing with it know what they are doing
(hopefully?).

On the other hand, if POWER8 has some specifics about it, maybe we
should introduce a default value in an arch EAL header for other
DPDK components to use (like in mempool).
Thoughts?


-- 
David Marchand


[dpdk-dev] [PATCH] ixgbe: avoid unnessary break when checking at the tail of rx hwring

2016-03-25 Thread Jianbo Liu
On 22 March 2016 at 22:27, Ananyev, Konstantin
 wrote:
>
>
>> -Original Message-
>> From: Jianbo Liu [mailto:jianbo.liu at linaro.org]
>> Sent: Monday, March 21, 2016 2:27 AM
>> To: Richardson, Bruce
>> Cc: Lu, Wenzhuo; Zhang, Helin; Ananyev, Konstantin; dev at dpdk.org
>> Subject: Re: [dpdk-dev] [PATCH] ixgbe: avoid unnessary break when checking 
>> at the tail of rx hwring
>>
>> On 18 March 2016 at 18:03, Bruce Richardson  
>> wrote:
>> > On Thu, Mar 17, 2016 at 10:20:01AM +0800, Jianbo Liu wrote:
>> >> On 16 March 2016 at 19:14, Bruce Richardson wrote:
>> >> > On Wed, Mar 16, 2016 at 03:51:53PM +0800, Jianbo Liu wrote:
>> >> >> Hi Wenzhuo,
>> >> >>
>> >> >> On 16 March 2016 at 14:06, Lu, Wenzhuo  wrote:
>> >> >> > HI Jianbo,
>> >> >> >
>> >> >> >
>> >> >> >> -Original Message-
>> >> >> >> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jianbo Liu
>> >> >> >> Sent: Monday, March 14, 2016 10:26 PM
>> >> >> >> To: Zhang, Helin; Ananyev, Konstantin; dev at dpdk.org
>> >> >> >> Cc: Jianbo Liu
>> >> >> >> Subject: [dpdk-dev] [PATCH] ixgbe: avoid unnessary break when 
>> >> >> >> checking at the
>> >> >> >> tail of rx hwring
>> >> >> >>
>> >> >> >> When checking rx ring queue, it's possible that loop will break at 
>> >> >> >> the tail while
>> >> >> >> there are packets still in the queue header.
>> >> >> > Would you like to give more details about in what scenario this 
>> >> >> > issue will be hit? Thanks.
>> >> >> >
>> >> >>
>> >> >> vPMD will place an extra RTE_IXGBE_DESCS_PER_LOOP - 1 empty
>> >> >> descriptors at the end of the hwring to avoid overflow when checking
>> >> >> on the rx side.
>> >> >>
>> >> >> For the loop in _recv_raw_pkts_vec(), we check 4 descriptors each
>> >> >> time. If all 4 DD bits are set, all 4 packets are received. That's OK
>> >> >> in the middle.
>> >> >> But at the end of the hwring, with fewer than 4 descriptors left, we
>> >> >> still need to check 4 descriptors at the same time, so the extra empty
>> >> >> descriptors are checked with them.
>> >> >> This time, the number of received packets is apparently less than 4,
>> >> >> and we break out of the loop because of the condition "var !=
>> >> >> RTE_IXGBE_DESCS_PER_LOOP".
>> >> >> So the problem arises: there could be more packets at the beginning
>> >> >> of the hwring that are still waiting to be received.
>> >> >> I think this fix can avoid this situation, and at least reduce the
>> >> >> latency for the packets at the head of the ring.
>> >> >>
>> >> > Packets are always received in order from the NIC, so no packets ever 
>> >> > get left
>> >> > behind or skipped on an RX burst call.
>> >> >
>> >> > /Bruce
>> >> >
>> >>
>> >> I know packets are received in order, and no packets will be skipped,
>> >> but some will be left behind as I explained above.
>> >> vPMD will not receive the nb_pkts requested by one RX burst call, and
>> >> those at the beginning of the hwring keep waiting to be received until
>> >> the next call.
>> >>
>> >> Thanks!
>> >> Jianbo
>> > HI Jianbo,
>> >
>> > ok, I understand now. I'm not sure that this is a significant problem 
>> > though,
>> > since we are working in polling mode. Is there a performance impact to your
>> > change, because I don't think that we can reduce performance just to fix 
>> > this?
>> >
>> > Regards,
>> > /Bruce
>> It will be a problem because the probability could be high.
>> Considering an rx hwring size of 128 and an rx burst of 32, the
>> probability can be 32/128.
>> I know this change is critical, so I want you (and maintainers) to do
>> a full evaluation of throughput/latency before drawing a conclusion.
>
> I am still not sure what problem you are trying to solve here.
> Yes, a recv_raw_pkts_vec() call wouldn't wrap around the HW ring boundary,
> and yes, it can return fewer packets than are actually available in the HW.
> Though as Bruce pointed out, they'll be returned to the user by the next call.
Have you thought about the interval between these two calls, and how long
it could be?
If the application is a simple one like l2fwd/testpmd, that's fine.
But if the interval is long because the application has more work to do,
the situation is different.

> Actually recv_pkts_bulk_alloc() works in a similar way.
> Why do you consider that as a problem?
The driver should pull packets out of the hardware and give them to the
app as fast as possible.
If not, more incoming packets may overflow the hardware queue.

I did some testing with pktgen-dpdk, and it behaves a little better
with this patch (at least not worse).
Sorry I can't provide more concrete evidence because I don't have
Ixia/Spirent equipment at hand.
That's why I asked you to do a full evaluation before rejecting this patch. :-)

Thanks!

> Konstantin
>
>>
>> Jianbo
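
The behaviour discussed in this thread can be reproduced with a small simulation. This is a simplified model under stated assumptions, not the ixgbe vector code: the ring has only 8 entries, DD bits are plain bytes, the names (sketch_rxq, sketch_recv_burst) are hypothetical, and the scan counts contiguous done descriptors instead of using a SIMD mask. It only shows why a burst that starts near the ring tail stops at the padding descriptors even when packets are already waiting at index 0.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_RING_SIZE      8 /* illustrative; real rings are larger */
#define SKETCH_DESCS_PER_LOOP 4

/*
 * dd[i] != 0 means descriptor i holds a received packet. The last
 * SKETCH_DESCS_PER_LOOP - 1 entries are padding and are never marked done.
 */
struct sketch_rxq {
	uint8_t dd[SKETCH_RING_SIZE + SKETCH_DESCS_PER_LOOP - 1];
	unsigned int tail;
};

static unsigned int
sketch_recv_burst(struct sketch_rxq *q, unsigned int nb_pkts)
{
	unsigned int received = 0;
	unsigned int pos, i, n;

	assert(nb_pkts % SKETCH_DESCS_PER_LOOP == 0);
	for (pos = 0; pos < nb_pkts; pos += SKETCH_DESCS_PER_LOOP) {
		unsigned int base = q->tail + pos;

		if (base >= SKETCH_RING_SIZE)
			break; /* the burst never wraps around the ring */
		/* Count done descriptors in this group of four. */
		n = 0;
		for (i = 0; i < SKETCH_DESCS_PER_LOOP; i++) {
			if (!q->dd[base + i])
				break;
			n++;
		}
		for (i = 0; i < n; i++)
			q->dd[base + i] = 0; /* hand the packets to the caller */
		received += n;
		if (n != SKETCH_DESCS_PER_LOOP)
			break; /* same exit condition as discussed above */
	}
	q->tail = (q->tail + received) % SKETCH_RING_SIZE;
	return received;
}
```

With the tail at 6 and packets ready in descriptors 6, 7, 0 and 1, a burst of 4 returns only 2 packets in this model; the other 2 are returned by the next call.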


[dpdk-dev] [PATCH v2] i40e: fix using memory after free issue

2016-03-25 Thread Jiangu Zhao
The old code still uses the entry in the next iteration of LIST_FOREACH
after freeing it in i40e_res_pool_destroy().
Change to a safe way of freeing entries, similar to LIST_FOREACH_SAFE in
FreeBSD.

Fixes: 4861cde46116 ("i40e: new poll mode driver")

Signed-off-by: Jiangu Zhao 
---
 drivers/net/i40e/i40e_ethdev.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 6fdae57..42f5c82 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -3335,17 +3335,21 @@ i40e_res_pool_init (struct i40e_res_pool_info *pool, 
uint32_t base,
 static void
 i40e_res_pool_destroy(struct i40e_res_pool_info *pool)
 {
-   struct pool_entry *entry;
+   struct pool_entry *entry, *next_entry;

if (pool == NULL)
return;

-   LIST_FOREACH(entry, &pool->alloc_list, next) {
+   for (entry = LIST_FIRST(&pool->alloc_list); 
+   entry && (next_entry = LIST_NEXT(entry, next), 1);
+   entry = next_entry) {
LIST_REMOVE(entry, next);
rte_free(entry);
}

-   LIST_FOREACH(entry, &pool->free_list, next) {
+   for (entry = LIST_FIRST(&pool->free_list); 
+   entry && (next_entry = LIST_NEXT(entry, next), 1); 
+   entry = next_entry) {
LIST_REMOVE(entry, next);
rte_free(entry);
}
-- 
1.8.3.1
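
A minimal, self-contained sketch of the same idea using <sys/queue.h>: the successor is read before the current entry is freed. The struct and function names are hypothetical, and plain malloc()/free() stand in for rte_malloc()/rte_free().

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/queue.h>

struct sketch_entry {
	LIST_ENTRY(sketch_entry) next;
	int value;
};

LIST_HEAD(sketch_list, sketch_entry);

/* Allocate one entry and link it at the head; returns 0 on success. */
static int sketch_push(struct sketch_list *list, int value)
{
	struct sketch_entry *e = malloc(sizeof(*e));

	if (e == NULL)
		return -1;
	e->value = value;
	LIST_INSERT_HEAD(list, e, next);
	return 0;
}

/* Free every entry; returns the number of entries freed. */
static unsigned int sketch_destroy(struct sketch_list *list)
{
	struct sketch_entry *e, *next_e;
	unsigned int freed = 0;

	assert(list != NULL);
	for (e = LIST_FIRST(list); e != NULL; e = next_e) {
		/* Save the successor first: e is invalid after free(). */
		next_e = LIST_NEXT(e, next);
		LIST_REMOVE(e, next);
		free(e);
		freed++;
	}
	return freed;
}
```

Splitting the assignment out of the loop condition is only a readability choice; it does the same thing as the comma expression in the patch.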



[dpdk-dev] [PATCH v11 0/8] ethdev: 100G and link speed API refactoring

2016-03-25 Thread Thomas Monjalon
Is there someone investigating the issue?
I think it should be simple to fix for someone mastering these Intel drivers.

2016-03-25 01:02, Xu, Qian Q:
> Marc
> #Test1 is just a simple test. Just launch testpmd with these nic port.
> ./testpmd -c 0x3 -n 4 -- -i
> 
> Thanks
> Qian
> 
> From: marc.sune at gmail.com [mailto:marc.sune at gmail.com] On Behalf Of Marc
> Sent: Thursday, March 24, 2016 3:48 PM
> To: Xu, Qian Q
> Cc: Thomas Monjalon; Ananyev, Konstantin; Lu, Wenzhuo; Zhang, Helin; 
> Richardson, Bruce; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v11 0/8] ethdev: 100G and link speed API 
> refactoring
> 
> 
> 
> On 24 March 2016 at 07:21, Xu, Qian Q wrote:
> Marc
> I didn't quite get your point. I observed that after applying this patchset,
> no Intel NIC can be started; maybe something goes wrong when you check
> the duplex/autoneg value for different NICs. If we want to merge the patchset
> in RC2, we need to fix this. It may not be an easy job in several days.
> 
> Is this test#1 one of the tests contained in the DPDK repository or is it an 
> internal test?
> 
> Marc
> 
> 
> 
> Thanks
> Qian
> 
> From: marc.sune at gmail.com [mailto:marc.sune at gmail.com] On Behalf Of Marc
> Sent: Thursday, March 24, 2016 4:54 AM
> To: Xu, Qian Q
> Cc: Thomas Monjalon; Ananyev, Konstantin; Lu, Wenzhuo; Zhang, Helin; 
> Richardson, Bruce; dev at dpdk.org
> 
> Subject: Re: [dpdk-dev] [PATCH v11 0/8] ethdev: 100G and link speed API 
> refactoring
> 
> Qian,
> 
> On 23 March 2016 at 02:18, Xu, Qian Q wrote:
> We have tested with Intel NICs and found that ports can't be started on any
> of them: ixgbe/i40e/igb/bonding. See the attached mail for more details.
> Please check and fix it.
> 
> 
> Thanks
> Qian
> 
> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Thomas Monjalon
> Sent: Wednesday, March 23, 2016 3:59 AM
> To: Ananyev, Konstantin; Lu, Wenzhuo; Zhang, Helin
> Cc: marcdevel at gmail.com; Richardson, Bruce; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v11 0/8] ethdev: 100G and link speed API 
> refactoring
> 
> 2016-03-17 19:08, Thomas Monjalon:
> > There are still too few tests and reviews, especially for
> > autonegotiation with Intel devices (patch #6).
> > I would not be surprised to see some bugs in this rework.
> 
> Any feedback about autoneg in e1000/ixgbe/i40e?
> Has it been tested before its integration in RC2?
> 
> > The capabilities must be adapted per device. It can be improved in a
> > separate patch.
> >
> > It will be integrated in 16.04-rc2.
> > Please test and review shortly, thanks!
> 
> 
> -- Forwarded message --
> From: "Xu, Qian Q" <qian.q.xu at intel.com>
> To: "Cao, Waterman" <waterman.cao at intel.com>, "Glynn, Michael J"
> Cc: "Richardson, Bruce", "Zhu, Heqing",
> "O'Driscoll, Tim" <tim.odriscoll at intel.com>,
> "Mcnamara, John" <john.mcnamara at intel.com>,
> "Xu, HuilongX" <huilongx.xu at intel.com>,
> "Fu, JingguoX" <jingguox.fu at intel.com>,
> "Xu, Qian Q" <qian.q.xu at intel.com>,
> "Zhang, Helin" <helin.zhang at intel.com>
> Date: Tue, 22 Mar 2016 06:41:37 +
> Subject: RE: DPDK link speed with Intel devices
> Hi, all
> We have worked out the basic test cases for the patchset.
> 1. Test the link speed on major Intel NICs to see if the speed is right.
> 2. Test the auto-negoation on major Intel NICs to ensure it's working.
> Nic covered: ixgbe, igb, i40e, fm10k, bonding(SW), virtio(SW)
> 
> When we run the Test#1 for all major NICs. We found that all these NIC 
> port(igb, ixgbe, i40e, fm10k) can't be started. Pls check, if the patch is 
> applied, all INTEL port can't be start, terrible things!
> 
> Interactive-mode selected
> Configuring Port 0 (socket 0)
> PMD: ixgbe_dev_tx_queue_setup(): sw_ring=0x7f13e99e3440 
> hw_ring=0x7f13e99e5480 dma_addr=0x8299e5480
> PMD: ixgbe_set_tx_function(): Using simple tx code path
> PMD: ixgbe_set_tx_function(): Vector tx enabled.
> PMD: ixgbe_dev_rx_queue_setup(): sw_ring=0x7f13ffcb8080 
> sw_sc_ring=0x7f13ffcbaac0 hw_ring=0x7f13e99d3380 dma_addr=0x8299d3380
> PMD: ixgbe_dev_start(): Invalid link_speeds for port 0; autonegotiation 
> disabled
> Fail to start port 0
> Configuring Port 1 (socket 0)
> PMD: i40e_set_tx_function_flag(): Vector tx can be enabled on this txq.
> PMD: i40e_dev_rx_queue_setup(): Rx Burst Bulk Alloc Preconditions are 
> satisfied. Rx Burst Bulk Alloc function will be used on port=1, queue=0.
> PMD: i40e_dev_start(): Invalid link_speeds for port 1; autonegotiation 
> disabled
> 
> 
> Just to d

[dpdk-dev] [PATCH v4 1/3] ethdev: refine API to query supported packet types

2016-03-25 Thread Tan, Jianfeng
NACK.

I'll send an independent patchset for this.

Thanks,
Jianfeng

> -Original Message-
> From: Tan, Jianfeng
> Sent: Friday, March 25, 2016 8:48 AM
> To: dev at dpdk.org
> Cc: Tan, Jianfeng; Ananyev, Konstantin; Zhang, Helin; Richardson, Bruce
> Subject: [PATCH v4 1/3] ethdev: refine API to query supported packet types
> 
> Return 0 instead of -ENOTSUP for those which do not fill any packet types,
> with some note and doc updated.
> 
> Signed-off-by: Jianfeng Tan 
> Acked-by: Konstantin Ananyev 
> ---
>  doc/guides/nics/overview.rst  | 2 +-
>  lib/librte_ether/rte_ethdev.c | 3 +--
>  lib/librte_ether/rte_ethdev.h | 9 ++---
>  3 files changed, 8 insertions(+), 6 deletions(-)
> 
> diff --git a/doc/guides/nics/overview.rst b/doc/guides/nics/overview.rst
> index 542479a..e7504da 100644
> --- a/doc/guides/nics/overview.rst
> +++ b/doc/guides/nics/overview.rst
> @@ -124,7 +124,7 @@ Most of these differences are summarized below.
> L4 checksum offload  X   X   X   X
> inner L3 checksumX   X   X
> inner L4 checksumX   X   X
> -   packet type parsing  X   X   X
> +   packet type parsing  X X X X X X   X X   X X 
> X X X X   X
> timesync X X
> basic stats  X   X   X X X X  
>  X X
> extended stats   X   X X X X
> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> index a328027..1ee79d2 100644
> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> @@ -1636,8 +1636,7 @@ rte_eth_dev_get_supported_ptypes(uint8_t
> port_id, uint32_t ptype_mask,
> 
>   RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
>   dev = &rte_eth_devices[port_id];
> - RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops-
> >dev_supported_ptypes_get,
> - -ENOTSUP);
> + RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops-
> >dev_supported_ptypes_get, 0);
>   all_ptypes = (*dev->dev_ops->dev_supported_ptypes_get)(dev);
> 
>   if (!all_ptypes)
> diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> index e7de34a..5167750 100644
> --- a/lib/librte_ether/rte_ethdev.h
> +++ b/lib/librte_ether/rte_ethdev.h
> @@ -2326,6 +2326,9 @@ void rte_eth_dev_info_get(uint8_t port_id, struct
> rte_eth_dev_info *dev_info);
>   * @note
>   *   Better to invoke this API after the device is already started or rx 
> burst
>   *   function is decided, to obtain correct supported ptypes.
> + * @note
> + *   if a given PMD does not report what ptypes it supports, then the
> supported
> + *   ptype count is reported as 0.
>   * @param port_id
>   *   The port identifier of the Ethernet device.
>   * @param ptype_mask
> @@ -2335,9 +2338,9 @@ void rte_eth_dev_info_get(uint8_t port_id, struct
> rte_eth_dev_info *dev_info);
>   * @param num
>   *  Size of the array pointed by param ptypes.
>   * @return
> - *   - (>0) Number of supported ptypes. If it exceeds param num, exceeding
> - *  packet types will not be filled in the given array.
> - *   - (0 or -ENOTSUP) if PMD does not fill the specified ptype.
> + *   - (>=0) Number of supported ptypes. If the number of types exceeds
> num,
> + only num entries will be filled into the ptypes array, but the 
> full
> + count of supported ptypes will be returned.
>   *   - (-ENODEV) if *port_id* invalid.
>   */
>  int rte_eth_dev_get_supported_ptypes(uint8_t port_id, uint32_t
> ptype_mask,
> --
> 2.1.4



[dpdk-dev] [PATCH v4 1/3] ethdev: refine API to query supported packet types

2016-03-25 Thread Bruce Richardson
On Fri, Mar 25, 2016 at 08:47:45AM +0800, Jianfeng Tan wrote:
> Return 0 instead of -ENOTSUP for those which do not fill any packet types,
> with some note and doc updated.
> 
> Signed-off-by: Jianfeng Tan 
> Acked-by: Konstantin Ananyev 

Hi Jianfeng,

I think this is a good change to the API, as it should simplify app code - as
any driver which doesn't tell us what ptypes it supports should be counted as
not supporting any. It also eliminates the need for the vdevs to see about
exporting this function to say they don't support any types.

However, two comments:
1. I think the commit message for this change should include information as to
why we want to tweak the API of this new function i.e. put in the above reasons
plus any others.
2. Please separate out the doc change from the API change as they are unrelated.

Regards,
/Bruce



[dpdk-dev] [PATCH 0/2] ethdev: refine new API to query supported ptypes

2016-03-25 Thread Jianfeng Tan
patch 0: return 0 instead of -ENOTSUP.
patch 1: update doc/guides/nics/overview.rst.

Suggested-by: Bruce Richardson 
Signed-off-by: Jianfeng Tan 

Jianfeng Tan (2):
  ethdev: refine new API to query supported ptypes
  doc: update which PMDs can parse packet type

 doc/guides/nics/overview.rst  | 2 +-
 lib/librte_ether/rte_ethdev.c | 3 +--
 lib/librte_ether/rte_ethdev.h | 9 ++---
 3 files changed, 8 insertions(+), 6 deletions(-)

-- 
2.1.4



[dpdk-dev] [PATCH 1/2] ethdev: refine new API to query supported ptypes

2016-03-25 Thread Jianfeng Tan
This change makes user code simpler. For PMDs which do not fill any
packet types, return 0 instead of -ENOTSUP, as suggested by Bruce.

Usually, users only care whether the required (by ptype_mask) ptypes can be
filled by the specified PMD. Whether the PMD implements the
dev_supported_ptypes_get function is not important, and introducing another
return value (-ENOTSUP) would force user programs to check for it.

Besides, there are ways to know if a PMD implements the function:
  a. see doc/guides/nics/overview.rst.
  b. use (~1) as parameter ptype_mask, then check whether 0 is returned.

Suggested-by: Bruce Richardson 
Signed-off-by: Jianfeng Tan 
---
 lib/librte_ether/rte_ethdev.c | 3 +--
 lib/librte_ether/rte_ethdev.h | 9 ++---
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index a328027..1ee79d2 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -1636,8 +1636,7 @@ rte_eth_dev_get_supported_ptypes(uint8_t port_id, 
uint32_t ptype_mask,

RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
dev = &rte_eth_devices[port_id];
-   RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_supported_ptypes_get,
-   -ENOTSUP);
+   RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_supported_ptypes_get, 0);
all_ptypes = (*dev->dev_ops->dev_supported_ptypes_get)(dev);

if (!all_ptypes)
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index e7de34a..2d76849 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -2326,6 +2326,9 @@ void rte_eth_dev_info_get(uint8_t port_id, struct 
rte_eth_dev_info *dev_info);
  * @note
  *   Better to invoke this API after the device is already started or rx burst
  *   function is decided, to obtain correct supported ptypes.
+ * @note
+ *   if a given PMD does not report what ptypes it supports, then the supported
+ *   ptype count is reported as 0.
  * @param port_id
  *   The port identifier of the Ethernet device.
  * @param ptype_mask
@@ -2335,9 +2338,9 @@ void rte_eth_dev_info_get(uint8_t port_id, struct 
rte_eth_dev_info *dev_info);
  * @param num
  *  Size of the array pointed by param ptypes.
  * @return
- *   - (>0) Number of supported ptypes. If it exceeds param num, exceeding
- *  packet types will not be filled in the given array.
- *   - (0 or -ENOTSUP) if PMD does not fill the specified ptype.
+ *   - (>=0) Number of supported ptypes. If the number of types exceeds num,
+ *   only num entries will be filled into the ptypes array, but the 
full
+ *   count of supported ptypes will be returned.
  *   - (-ENODEV) if *port_id* invalid.
  */
 int rte_eth_dev_get_supported_ptypes(uint8_t port_id, uint32_t ptype_mask,
-- 
2.1.4
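
To illustrate the return-value contract documented by this patch, here is a small stand-alone model. It is not rte_eth_dev_get_supported_ptypes() itself: sketch_get_supported_ptypes() and the ptype table are hypothetical, and a NULL table is used as an assumption to represent a PMD that does not implement the callback.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ptype table of an imaginary PMD; the values are made up. */
static const uint32_t sketch_pmd_ptypes[] = { 0x01, 0x10, 0x20, 0x100 };

/*
 * Count every ptype that matches the mask, but store at most num of them.
 * all == NULL stands for a PMD that does not report its ptypes.
 */
static int
sketch_get_supported_ptypes(const uint32_t *all, size_t n_all,
			    uint32_t ptype_mask, uint32_t *ptypes, int num)
{
	size_t i;
	int count = 0;

	assert(num >= 0 && (ptypes != NULL || num == 0));
	if (all == NULL)
		return 0; /* the case that returned -ENOTSUP before the patch */
	for (i = 0; i < n_all; i++) {
		if ((all[i] & ptype_mask) == 0)
			continue;
		if (count < num)
			ptypes[count] = all[i]; /* fill only while there is room */
		count++; /* but always report the full count */
	}
	return count;
}
```

Under this contract a caller can pass num = 0 first to learn how large the array must be, and a return value of 0 covers both "no matching ptype" and "PMD reports nothing".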



[dpdk-dev] [PATCH 2/2] doc: update which PMDs can parse packet type

2016-03-25 Thread Jianfeng Tan
Signed-off-by: Jianfeng Tan 
---
 doc/guides/nics/overview.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/guides/nics/overview.rst b/doc/guides/nics/overview.rst
index 542479a..e7504da 100644
--- a/doc/guides/nics/overview.rst
+++ b/doc/guides/nics/overview.rst
@@ -124,7 +124,7 @@ Most of these differences are summarized below.
L4 checksum offload  X   X   X   X
inner L3 checksumX   X   X
inner L4 checksumX   X   X
-   packet type parsing  X   X   X
+   packet type parsing  X X X X X X   X X   X X X 
X X X   X
timesync X X
basic stats  X   X   X X X X   
X X
extended stats   X   X X X X
-- 
2.1.4



[dpdk-dev] [PATCH v2] mlx4: use dummy rxqs when a non-pow2 number is requested

2016-03-25 Thread Olivier Matz
When using RSS, the number of rxqs has to be a power of two.
This is a problem because there is no API in DPDK that makes
the application aware of that.

A good compromise is to allow the application to request a
number of rxqs that is not a power of 2, while keeping the extra
queues inactive so that they never receive packets. In this
configuration, a warning is issued to let users know that
this is not an optimal configuration.

Signed-off-by: Olivier Matz 
Acked-by: Adrien Mazarguil 
---
 drivers/net/mlx4/mlx4.c | 32 +---
 1 file changed, 21 insertions(+), 11 deletions(-)

diff --git a/drivers/net/mlx4/mlx4.c b/drivers/net/mlx4/mlx4.c
index cc4e9aa..a183927 100644
--- a/drivers/net/mlx4/mlx4.c
+++ b/drivers/net/mlx4/mlx4.c
@@ -698,7 +698,7 @@ txq_cleanup(struct txq *txq);

 static int
 rxq_setup(struct rte_eth_dev *dev, struct rxq *rxq, uint16_t desc,
- unsigned int socket, const struct rte_eth_rxconf *conf,
+ unsigned int socket, int inactive, const struct rte_eth_rxconf *conf,
  struct rte_mempool *mp);

 static void
@@ -734,12 +734,15 @@ dev_configure(struct rte_eth_dev *dev)
}
if (rxqs_n == priv->rxqs_n)
return 0;
-   if ((rxqs_n & (rxqs_n - 1)) != 0) {
-   ERROR("%p: invalid number of RX queues (%u),"
- " must be a power of 2",
- (void *)dev, rxqs_n);
-   return EINVAL;
+   if (!rte_is_power_of_2(rxqs_n)) {
+   unsigned n_active;
+
+   n_active = rte_align32pow2(rxqs_n + 1) >> 1;
+   WARN("%p: number of RX queues must be a power"
+   " of 2: %u queues among %u will be active",
+   (void *)dev, n_active, rxqs_n);
}
+
INFO("%p: RX queues number update: %u -> %u",
 (void *)dev, priv->rxqs_n, rxqs_n);
/* If RSS is enabled, disable it first. */
@@ -775,7 +778,7 @@ dev_configure(struct rte_eth_dev *dev)
priv->rss = 1;
tmp = priv->rxqs_n;
priv->rxqs_n = rxqs_n;
-   ret = rxq_setup(dev, &priv->rxq_parent, 0, 0, NULL, NULL);
+   ret = rxq_setup(dev, &priv->rxq_parent, 0, 0, 0, NULL, NULL);
if (!ret)
return 0;
/* Failure, rollback. */
@@ -3466,7 +3469,8 @@ rxq_setup_qp_rss(struct priv *priv, struct ibv_cq *cq, 
uint16_t desc,
attr.qpg.qpg_type = IBV_EXP_QPG_PARENT;
/* TSS isn't necessary. */
attr.qpg.parent_attrib.tss_child_count = 0;
-   attr.qpg.parent_attrib.rss_child_count = priv->rxqs_n;
+   attr.qpg.parent_attrib.rss_child_count =
+   rte_align32pow2(priv->rxqs_n + 1) >> 1;
DEBUG("initializing parent RSS queue");
} else {
attr.qpg.qpg_type = IBV_EXP_QPG_CHILD_RX;
@@ -3689,6 +3693,9 @@ skip_rtr:
  *   Number of descriptors to configure in queue.
  * @param socket
  *   NUMA socket on which memory must be allocated.
+ * @param inactive
+ *   If true, the queue is disabled because its index is higher or
+ *   equal to the real number of queues, which must be a power of 2.
  * @param[in] conf
  *   Thresholds parameters.
  * @param mp
@@ -3699,7 +3706,7 @@ skip_rtr:
  */
 static int
 rxq_setup(struct rte_eth_dev *dev, struct rxq *rxq, uint16_t desc,
- unsigned int socket, const struct rte_eth_rxconf *conf,
+ unsigned int socket, int inactive, const struct rte_eth_rxconf *conf,
  struct rte_mempool *mp)
 {
struct priv *priv = dev->data->dev_private;
@@ -3800,7 +3807,7 @@ skip_mr:
DEBUG("priv->device_attr.max_sge is %d",
  priv->device_attr.max_sge);
 #ifdef RSS_SUPPORT
-   if (priv->rss)
+   if (priv->rss && !inactive)
tmpl.qp = rxq_setup_qp_rss(priv, tmpl.cq, desc, parent,
   tmpl.rd);
else
@@ -3936,6 +3943,7 @@ mlx4_rx_queue_setup(struct rte_eth_dev *dev, uint16_t 
idx, uint16_t desc,
 {
struct priv *priv = dev->data->dev_private;
struct rxq *rxq = (*priv->rxqs)[idx];
+   int inactive = 0;
int ret;

if (mlx4_is_secondary())
@@ -3967,7 +3975,9 @@ mlx4_rx_queue_setup(struct rte_eth_dev *dev, uint16_t 
idx, uint16_t desc,
return -ENOMEM;
}
}
-   ret = rxq_setup(dev, rxq, desc, socket, conf, mp);
+   if (idx >= rte_align32pow2(priv->rxqs_n + 1) >> 1)
+   inactive = 1;
+   ret = rxq_setup(dev, rxq, desc, socket, inactive, conf, mp);
if (ret)
rte_free(rxq);
else {
-- 
2.1.4
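
The expression rte_align32pow2(n + 1) >> 1 used in this patch yields the largest power of two that is not greater than n. A minimal stand-alone sketch, assuming a local sketch_align32pow2() that rounds up to the next power of two the same way the DPDK helper is expected to; all sketch_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Round x up to the next power of two (x itself if already one). */
static uint32_t sketch_align32pow2(uint32_t x)
{
	x--;
	x |= x >> 1;
	x |= x >> 2;
	x |= x >> 4;
	x |= x >> 8;
	x |= x >> 16;
	return x + 1;
}

/* Largest power of two <= rxqs_n: the number of queues RSS can feed. */
static uint32_t sketch_active_rxqs(uint32_t rxqs_n)
{
	/* The upper bound keeps rxqs_n + 1 from overflowing the round-up. */
	assert(rxqs_n > 0 && rxqs_n <= UINT32_C(0x7fffffff));
	return sketch_align32pow2(rxqs_n + 1) >> 1;
}

/* A queue is inactive when its index is beyond the active range. */
static int sketch_rxq_is_inactive(uint32_t idx, uint32_t rxqs_n)
{
	return idx >= sketch_active_rxqs(rxqs_n);
}
```

With 6 requested queues this gives 4 active queues, so indexes 4 and 5 would be set up as inactive.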



[dpdk-dev] [PATCH] igb: fix crash with offload on 82575 chipset

2016-03-25 Thread Olivier Matz
On the 82575 chipset, there is a pool of global TX contexts instead of 2
per queue as on the 82576. See Table A-1 "Changes in Programming Interface
Relative to 82575" of the Intel® 82576EB GbE Controller datasheet (*).

In the driver, the contexts are attributed to a TX queue: 0-1 for txq0,
2-3 for txq1, and so on.

In igbe_set_xmit_ctx(), the variable ctx_curr contains the index of the
per-queue context (0 or 1), and ctx_idx contains the index to be given
to the hardware (0 to 7). The size of txq->ctx_cache[] is 2, and must
be indexed with ctx_curr to avoid an out-of-bound access.

Also, the index returned by what_advctx_update() is the per-queue
index (0 or 1), so we need to add txq->ctx_start before sending it
to the hardware.

(*) The datasheet says 16 global contexts, however the IDX fields in TX
descriptors are 3 bits, which gives a total of 8 contexts. The
driver assumes there are 8 contexts on 82575: 2 per queue, 4 txqs.

Fixes: 4c8db5f09a ("igb: enable TSO support")
Fixes: af75078fec ("first public release")
Signed-off-by: Olivier Matz 
---
 drivers/net/e1000/igb_rxtx.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/e1000/igb_rxtx.c b/drivers/net/e1000/igb_rxtx.c
index e527895..529dba4 100644
--- a/drivers/net/e1000/igb_rxtx.c
+++ b/drivers/net/e1000/igb_rxtx.c
@@ -325,9 +325,9 @@ igbe_set_xmit_ctx(struct igb_tx_queue* txq,
}

txq->ctx_cache[ctx_curr].flags = ol_flags;
-   txq->ctx_cache[ctx_idx].tx_offload.data =
+   txq->ctx_cache[ctx_curr].tx_offload.data =
tx_offload_mask.data & tx_offload.data;
-   txq->ctx_cache[ctx_idx].tx_offload_mask = tx_offload_mask;
+   txq->ctx_cache[ctx_curr].tx_offload_mask = tx_offload_mask;

ctx_txd->type_tucmd_mlhl = rte_cpu_to_le_32(type_tucmd_mlhl);
vlan_macip_lens = (uint32_t)tx_offload.data;
@@ -450,7 +450,7 @@ eth_igb_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
ctx = what_advctx_update(txq, tx_ol_req, tx_offload);
/* Only allocate context descriptor if required*/
new_ctx = (ctx == IGB_CTX_NUM);
-   ctx = txq->ctx_curr;
+   ctx = txq->ctx_curr + txq->ctx_start;
tx_last = (uint16_t) (tx_last + new_ctx);
}
if (tx_last >= txq->nb_tx_desc)
-- 
2.1.4
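
A small sketch of the two index spaces described in the commit message, limited to the 82575 layout (contexts 0-1 for txq0, 2-3 for txq1, and so on). The struct and helpers are hypothetical simplifications of the driver's queue structure; only the indexing rule is the point.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_CTX_NUM 2 /* contexts owned by each TX queue */

struct sketch_txq {
	uint32_t ctx_curr;  /* per-queue index: 0 or 1 */
	uint32_t ctx_start; /* first global context owned by this queue */
	uint64_t ctx_cache[SKETCH_CTX_NUM];
};

/* 82575-style layout: queue q owns global contexts 2q and 2q + 1. */
static void sketch_txq_init(struct sketch_txq *txq, uint32_t queue_idx)
{
	txq->ctx_curr = 0;
	txq->ctx_start = queue_idx * SKETCH_CTX_NUM;
	txq->ctx_cache[0] = 0;
	txq->ctx_cache[1] = 0;
}

/*
 * Record the offload flags in the software cache and return the index
 * that would be written into the descriptor for the hardware.
 */
static uint32_t sketch_set_ctx(struct sketch_txq *txq, uint64_t ol_flags)
{
	assert(txq->ctx_curr < SKETCH_CTX_NUM);
	txq->ctx_cache[txq->ctx_curr] = ol_flags; /* cache: per-queue index */
	return txq->ctx_start + txq->ctx_curr;    /* hardware: global index */
}
```

Indexing the 2-entry cache with the global index (up to 7) is the out-of-bounds access the patch removes.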



[dpdk-dev] [RFC] hash/lpm: return NULL if the object exists

2016-03-25 Thread Olivier Matz
Hi Bruce,

On 03/15/2016 01:25 PM, Olivier Matz wrote:
> Seen by trying to fix the func_reentrancy autotest. The test
> was doing the following on several cores in parallel:
> 
>   name = "common_name";
>   do several times {
>   obj = allocate_an_object(name)   // obj = ring, mempool, hash, lpm, ...
>   if (obj == NULL && lookup(name) == NULL)
>   return TEST_FAIL;
>   }
> 
> Issues:
> 
> 1/ ring, mempool and hash APIs are not coherent
>rings and mempool return NULL if the object does not exist
>hash and lpm return the object that was allocated if
>it was already allocated
> 
> 2/ The hash/lpm API looks dangerous: when an object is returned,
>the user does not know if it should be freed or not (no refcnt)
> 
> 3/ There are some possible race conditions in cuckoo_hash as the
>lock is not held in rte_hash_create(). We could find some cases
>where NULL is returned when the object already exists (ex: when
>rte_ring_create() fails).
> 
> This patch tries to rationalize the APIs of lpm and hash.
> 
> Signed-off-by: Olivier Matz 

Sorry, I forgot to CC you in the first mail. Do you have any opinion
about this rfc patch?

Thanks,
Olivier


[dpdk-dev] [RFC] hash/lpm: return NULL if the object exists

2016-03-25 Thread Bruce Richardson
On Fri, Mar 25, 2016 at 11:32:47AM +0100, Olivier Matz wrote:
> Hi Bruce,
> 
> On 03/15/2016 01:25 PM, Olivier Matz wrote:
> > Seen by trying to fix the func_reentrancy autotest. The test
> > was doing the following on several cores in parallel:
> > 
> >   name = "common_name";
> >   do several times {
> >   obj = allocate_an_object(name)   // obj = ring, mempool, hash, lpm, 
> > ...
> >   if (obj == NULL && lookup(name) == NULL)
> >   return TEST_FAIL;
> >   }
> > 
> > Issues:
> > 
> > 1/ ring, mempool and hash APIs are not coherent
> >rings and mempool return NULL if the object does not exist
> >hash and lpm return the object that was allocated if
> >it was already allocated
> > 
> > 2/ The hash/lpm API looks dangerous: when an object is returned,
> >the user does not know if it should be freed or not (no refcnt)
> > 
> > 3/ There are some possible race conditions in cuckoo_hash as the
> >lock is not held in rte_hash_create(). We could find some cases
> >where NULL is returned when the object already exists (ex: when
> >rte_ring_create() fails).
> > 
> > This patch tries to rationalize the APIs of lpm and hash.
> > 
> > Signed-off-by: Olivier Matz 
> 
> Sorry, I forgot to CC you in the first mail. Do you have any opinion
> about this rfc patch?
> 
> Thanks,
> Olivier
Hi Olivier,

the idea looks good, since an object already existing is an error condition on
create. One small change to the libs I'd suggest is to set rte_errno to 
EEXIST before exit, so that the error reason is known to the app.

Regards,
/Bruce
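
A minimal sketch of the behaviour suggested above: create fails with NULL when the name is already taken, and the reason is reported through an error variable. Everything here is hypothetical: a fixed-size table replaces the real object lists, the standard errno stands in for rte_errno, and there is no locking, so this single-threaded model does not address the race mentioned in the RFC.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_OBJS 4
#define SKETCH_NAME_LEN 32

struct sketch_obj {
	char name[SKETCH_NAME_LEN];
	int in_use;
};

static struct sketch_obj sketch_table[SKETCH_MAX_OBJS];

/* Return the object registered under name, or NULL if there is none. */
static struct sketch_obj *sketch_lookup(const char *name)
{
	size_t i;

	for (i = 0; i < SKETCH_MAX_OBJS; i++) {
		if (sketch_table[i].in_use &&
		    strcmp(sketch_table[i].name, name) == 0)
			return &sketch_table[i];
	}
	return NULL;
}

/* Create a new object; on failure return NULL and set errno. */
static struct sketch_obj *sketch_create(const char *name)
{
	size_t i;

	assert(name != NULL);
	if (strlen(name) >= SKETCH_NAME_LEN) {
		errno = ENAMETOOLONG;
		return NULL;
	}
	if (sketch_lookup(name) != NULL) {
		errno = EEXIST; /* already exists: an error, not a handle */
		return NULL;
	}
	for (i = 0; i < SKETCH_MAX_OBJS; i++) {
		if (!sketch_table[i].in_use) {
			strcpy(sketch_table[i].name, name);
			sketch_table[i].in_use = 1;
			return &sketch_table[i];
		}
	}
	errno = ENOSPC;
	return NULL;
}
```

A caller that gets NULL with EEXIST can then fall back to the lookup function explicitly, which makes the ownership of the returned object unambiguous.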


[dpdk-dev] [PATCH 2/2] drivers/crypto: Fix anonymous union initialization in crypto

2016-03-25 Thread Trahe, Fiona


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Michael Qiu
> Sent: Friday, March 25, 2016 7:20 AM
> To: dev at dpdk.org
> Cc: Qiu, Michael
> Subject: [dpdk-dev] [PATCH 2/2] drivers/crypto: Fix anonymous union
> initialization in crypto
> 
> In SUSE11-SP3 i686 platform, with gcc 4.5.1, there is a compile issue:
>   null_crypto_pmd_ops.c:44:3: error:
>   unknown field 'sym' specified in initializer
>   cc1: warnings being treated as errors
> 
> The member in anonymous union initialization should be inside '{}', otherwise
> it will report an error.
> 
> Fixes: 26c2e4ad5ad4 ("cryptodev: add capabilities discovery")
> 
> Signed-off-by: Michael Qiu 


Initialisation in QAT PMD has same issue, possibly also other PMDs.
I'll extend this fix where needed.
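
For reference, the bracing pattern can be sketched with simplified types. The
demo_* structs below are hypothetical stand-ins for the cryptodev capability
structs, not the real definitions; the point is only that each anonymous
union gets its own pair of braces in the initializer.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the capability structs. */
struct demo_auth_cap {
	int algo;
	uint16_t block_size;
};

struct demo_cipher_cap {
	int algo;
	uint16_t block_size;
};

struct demo_sym_cap {
	int xform_type;
	union {			/* anonymous union */
		struct demo_auth_cap auth;
		struct demo_cipher_cap cipher;
	};
};

struct demo_cap {
	int op;
	union {			/* anonymous union */
		struct demo_sym_cap sym;
	};
};

static const struct demo_cap demo_caps[] = {
	{
		.op = 1,
		{		/* braces for the anonymous union */
			.sym = {
				.xform_type = 2,
				{	/* braces for the nested anonymous union */
					.auth = {
						.algo = 3,
						.block_size = 1
					},
				},
			},
		},
	},
};

/* Read back one field through the anonymous unions. */
static uint16_t demo_auth_block_size(unsigned int i)
{
	return demo_caps[i].sym.auth.block_size;
}
```

The unbraced form (".sym = {" directly after ".op") is what the old gcc in
the report rejects; the braced form is positional for the union itself and
designated inside it.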

> ---
>  drivers/crypto/null/null_crypto_pmd_ops.c | 16 
>  1 file changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/crypto/null/null_crypto_pmd_ops.c
> b/drivers/crypto/null/null_crypto_pmd_ops.c
> index 39f8088..b7470c0 100644
> --- a/drivers/crypto/null/null_crypto_pmd_ops.c
> +++ b/drivers/crypto/null/null_crypto_pmd_ops.c
> @@ -41,9 +41,9 @@
>  static const struct rte_cryptodev_capabilities 
> null_crypto_pmd_capabilities[] = {
>   {   /* NULL (AUTH) */
>   .op = RTE_CRYPTO_OP_TYPE_SYMMETRIC,
> - .sym = {
> + {.sym = {
>   .xform_type = RTE_CRYPTO_SYM_XFORM_AUTH,
> - .auth = {
> + {.auth = {
>   .algo = RTE_CRYPTO_AUTH_NULL,
>   .block_size = 1,
>   .key_size = {
> @@ -57,14 +57,14 @@ static const struct rte_cryptodev_capabilities
> null_crypto_pmd_capabilities[] =
>   .increment = 0
>   },
>   .aad_size = { 0 }
> - }
> - }
> + }, },
> + }, },
>   },
>   {   /* NULL (CIPHER) */
>   .op = RTE_CRYPTO_OP_TYPE_SYMMETRIC,
> - .sym = {
> + {.sym = {
>   .xform_type = RTE_CRYPTO_SYM_XFORM_CIPHER,
> - .cipher = {
> + {.cipher = {
>   .algo = RTE_CRYPTO_CIPHER_NULL,
>   .block_size = 1,
>   .key_size = {
> @@ -77,8 +77,8 @@ static const struct rte_cryptodev_capabilities
> null_crypto_pmd_capabilities[] =
>   .max = 0,
>   .increment = 0
>   }
> - }
> - }
> + }, },
> + }, },
>   },
>   RTE_CRYPTODEV_END_OF_CAPABILITIES_LIST()
>  };
> --
> 1.9.3



[dpdk-dev] [PATCH] mempool: allow for user-owned mempool caches

2016-03-25 Thread Olivier Matz
Hi Venky,

>> The main benefit of having an external cache is to allow mempool users
>> (threads) to maintain a local cache even though they don't have a valid
>> lcore_id (non-EAL threads). The fact that cache access is done by indexing
>> with the lcore_id is what makes it difficult...
> 
> Hi Lazaros, 
> 
> Alternative suggestion: This could actually be very simply done via creating 
> an EAL API to register and return an lcore_id for a thread wanting to use 
> DPDK services. That way, you could simply create your pthread, call the 
> eal_register_thread() function that assigns an lcore_id to the caller (and 
> internally sets up the per_lcore variable).
> 
> The advantage of doing it this way is that you could extend it to other 
> things other than the mempool that may need an lcore_id setup.

In my opinion, externalizing the cache structure as Lazaros suggests
would make things simpler, especially in case of dynamic thread
allocation/destruction.

If an lcore_id registration API is added in EAL, we still need a
max lcore value when the mempool is created so the cache can be
allocated. Moreover, the API would not be as simple, especially
if it needs to support secondary processes.
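
To make the difference concrete, here is a small sketch of the two
approaches. All demo_* names and sizes are hypothetical and this is not the
mempool API: the first helper shows a per-lcore cache array sized at creation
time, the other two show a cache that the calling thread owns and passes in.

```c
#include <stddef.h>

#define DEMO_MAX_LCORE 4	/* must be known when the pool is created */
#define DEMO_CACHE_SIZE 8

struct demo_cache {
	unsigned int len;
	void *objs[DEMO_CACHE_SIZE];
};

struct demo_pool {
	/* lcore-indexed caches, allocated together with the pool */
	struct demo_cache local_cache[DEMO_MAX_LCORE];
};

/* lcore-indexed access: a thread without a valid lcore_id gets no cache. */
static struct demo_cache *demo_default_cache(struct demo_pool *mp,
					     unsigned int lcore_id)
{
	if (lcore_id >= DEMO_MAX_LCORE)
		return NULL;
	return &mp->local_cache[lcore_id];
}

/* User-owned access: the caller passes its own cache, no lcore_id needed. */
static int demo_cache_put(struct demo_cache *c, void *obj)
{
	if (c->len >= DEMO_CACHE_SIZE)
		return -1;	/* cache full, caller would flush to the pool */
	c->objs[c->len++] = obj;
	return 0;
}

static void *demo_cache_get(struct demo_cache *c)
{
	if (c->len == 0)
		return NULL;	/* cache empty, caller would refill from the pool */
	return c->objs[--c->len];
}
```

A dynamically created thread can simply keep a struct demo_cache on its own
stack or heap and free it on exit, which is the case where the lcore-indexed
array gets awkward.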


Regards,
Olivier


[dpdk-dev] [PATCH 0/2] ethdev: refine new API to query supported ptypes

2016-03-25 Thread Ananyev, Konstantin


> -Original Message-
> From: Tan, Jianfeng
> Sent: Friday, March 25, 2016 3:16 AM
> To: dev at dpdk.org
> Cc: Tan, Jianfeng; Ananyev, Konstantin; Zhang, Helin; Richardson, Bruce
> Subject: [PATCH 0/2] ethdev: refine new API to query supported ptypes
> 
> patch 0: return 0 instead of -ENOTSUP.
> patch 1: update doc/guides/nics/overview.rst.
> 
> Suggested-by: Bruce Richardson 
> Signed-off-by: Jianfeng Tan 
> 
> Jianfeng Tan (2):
>   ethdev: refine new API to query supported ptypes
>   doc: update which PMDs can parse packet type
> 
>  doc/guides/nics/overview.rst  | 2 +-
>  lib/librte_ether/rte_ethdev.c | 3 +--
>  lib/librte_ether/rte_ethdev.h | 9 ++---
>  3 files changed, 8 insertions(+), 6 deletions(-)
> 

Acked-by: Konstantin Ananyev 

> --
> 2.1.4



[dpdk-dev] [PATCH v2 0/5] virtio support for container

2016-03-25 Thread Neil Horman
On Fri, Mar 25, 2016 at 09:25:49AM +0800, Tan, Jianfeng wrote:
> 
> 
> On 3/24/2016 9:45 PM, Neil Horman wrote:
> >On Thu, Mar 24, 2016 at 11:10:50AM +0800, Tan, Jianfeng wrote:
> >>Hi Neil,
> >>
> >>On 3/24/2016 3:17 AM, Neil Horman wrote:
> >>>On Fri, Feb 05, 2016 at 07:20:23PM +0800, Jianfeng Tan wrote:
> v1->v2:
>   - Rebase on the patchset of virtio 1.0 support.
>   - Fix cannot create non-hugepage memory.
>   - Fix wrong size of memory region when "single-file" is used.
>   - Fix setting of offset in virtqueue to use virtual address.
>   - Fix setting TUNSETVNETHDRSZ in vhost-user's branch.
>   - Add mac option to specify the mac address of this virtual device.
>   - Update doc.
> 
> This patchset is to provide high performance networking interface (virtio)
> for container-based DPDK applications. The way of starting DPDK apps in
> containers with ownership of NIC devices exclusively is beyond the scope.
> The basic idea here is to present a new virtual device (named eth_cvio),
> which can be discovered and initialized in container-based DPDK apps using
> rte_eal_init(). To minimize the change, we reuse already-existing virtio
> frontend driver code (driver/net/virtio/).
> Compared to QEMU/VM case, virtio device framework (translates I/O port r/w
> operations into unix socket/cuse protocol, which is originally provided in
> QEMU), is integrated in virtio frontend driver. So this converged driver
> actually plays the role of original frontend driver and the role of QEMU
> device framework.
> The major difference lies in how to calculate relative address for vhost.
> The principle of virtio is that: based on one or multiple shared memory
> segments, vhost maintains a reference system with the base addresses and
> length for each segment so that an address from VM comes (usually GPA,
> Guest Physical Address) can be translated into vhost-recognizable address
> (named VVA, Vhost Virtual Address). To decrease the overhead of address
> translation, we should maintain as few segments as possible. In VM's case,
> GPA is always locally continuous. In container's case, CVA (Container
> Virtual Address) can be used. Specifically:
> a. when set_base_addr, CVA address is used;
> b. when preparing RX's descriptors, CVA address is used;
> c. when transmitting packets, CVA is filled in TX's descriptors;
> d. in TX and CQ's header, CVA is used.
> How to share memory? In VM's case, qemu always shares all physical layout
> to backend. But it's not feasible for a container, as a process, to share
> all virtual memory regions to backend. So only specified virtual memory
> regions (with type of shared) are sent to backend. It's a limitation that
> only addresses in these areas can be used to transmit or receive packets.
> 
> Known issues
> 
> a. When used with vhost-net, root privilege is required to create tap
> device inside.
> b. Control queue and multi-queue are not supported yet.
> c. When --single-file option is used, socket_id of the memory may be
> wrong. (Use "numactl -N x -m x" to work around this for now)
> How to use?
> 
> a. Apply this patchset.
> 
> b. To compile container apps:
> $: make config RTE_SDK=`pwd` T=x86_64-native-linuxapp-gcc
> $: make install RTE_SDK=`pwd` T=x86_64-native-linuxapp-gcc
> $: make -C examples/l2fwd RTE_SDK=`pwd` T=x86_64-native-linuxapp-gcc
> $: make -C examples/vhost RTE_SDK=`pwd` T=x86_64-native-linuxapp-gcc
> 
> c. To build a docker image using Dockerfile below.
> $: cat ./Dockerfile
> FROM ubuntu:latest
> WORKDIR /usr/src/dpdk
> COPY . /usr/src/dpdk
> ENV PATH "$PATH:/usr/src/dpdk/examples/l2fwd/build/"
> $: docker build -t dpdk-app-l2fwd .
> 
> d. Used with vhost-user
> $: ./examples/vhost/build/vhost-switch -c 3 -n 4 \
>   --socket-mem 1024,1024 -- -p 0x1 --stats 1
> $: docker run -i -t -v :/var/run/usvhost \
>   -v /dev/hugepages:/dev/hugepages \
>   dpdk-app-l2fwd l2fwd -c 0x4 -n 4 -m 1024 --no-pci \
>   --vdev=eth_cvio0,path=/var/run/usvhost -- -p 0x1
> 
> f. Used with vhost-net
> $: modprobe vhost
> $: modprobe vhost-net
> $: docker run -i -t --privileged \
>   -v /dev/vhost-net:/dev/vhost-net \
>   -v /dev/net/tun:/dev/net/tun \
>   -v /dev/hugepages:/dev/hugepages \
>   dpdk-app-l2fwd l2fwd -c 0x4 -n 4 -m 1024 --no-pci \
>   --vdev=eth_cvio0,path=/dev/vhost-net -- -p 0x1
> 
> By the way, it's not necessary to run in a container.
> 
> Signed-off-by: Huawei Xie 
> Signed-off-by: Jianfeng Tan 
> 
> Jianfeng Tan (5):
>    mem: add --single-file to create single mem-backed file
>    mem: add API to obtain memory-backed file info
>    virtio/vdev: add embeded device emulation
>    virtio/vdev: add a new vdev na

[dpdk-dev] DPDK's vhost-user logging capability

2016-03-25 Thread Marc-André Lureau
Hi

- Original Message -
> On Wed, Mar 23, 2016 at 03:34:09PM +, shesha Sreenivasamurthy (shesha)
> wrote:
> > Hi All,
> > 
> > I was going over vhost-user migration capability in DPDK in lieu of a Cisco's
> > multi-q DPDK vhost-user application. I see that log_base address is implemented
> > as per virtio_net device. However, desc, addr and used is per vhost_virtqueue.
> > Additionally, QEMU sends one VHOST_USER_SET_LOG_BASE per queue-pair (QEMU -
> > hw/virtio/vhost.c::vhost_dev_set_log).
> > 
> > Does it mean we need to log dirty pages of all rings to same location ?
> 
> Hi,
> 
> Yes, and QEMU allocates only one block of memory (see vhost_log_alloc())
> after all.

> > If that is the case then why does QEMU sends separate VHOST_USER_SET_LOG_BASE
> > per queue pair ?
> 
> That's kind of like a design. One queue pair is associated with one
> vhost_dev struct in QEMU, hence, all those requests will go through
> vhost_dev structs (aka, all queue pairs), including those that one
> time request is needed only, such as VHOST_USER_SET_MEM_TABLE. Thus,
> we introduced vhost_user_one_time_request() to avoid such case.

I am not familiar with multi-queue, but I can see that in vhost_net_start.

> So, good question, and we may need add it to the "one time request"
> group, Marc?

That would probably prevent vhost_dev_log_resize(), so it would need some kind 
of "force" flag. I am not sure it's worth it at this point.
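
To show what I mean by a "force" flag, a very rough sketch follows. The enum
and the demo_* helpers are hypothetical and simplified, they are not the QEMU
code; the sketch only assumes that a one-time request is sent for the first
queue pair and skipped for the others unless the caller forces it.

```c
#include <stdbool.h>

/* Hypothetical request identifiers, not the real protocol values. */
enum demo_req {
	DEMO_REQ_SET_MEM_TABLE,
	DEMO_REQ_SET_LOG_BASE,
	DEMO_REQ_SET_VRING_ADDR
};

/* Global requests that only need to reach the backend once. */
static bool demo_one_time_request(enum demo_req req)
{
	switch (req) {
	case DEMO_REQ_SET_MEM_TABLE:
	case DEMO_REQ_SET_LOG_BASE:	/* the addition discussed above */
		return true;
	default:
		return false;
	}
}

/*
 * Decide whether a request issued for a given queue pair is really sent.
 * One-time requests are sent for queue pair 0 only, unless force is set
 * (for example when the log has been resized and must be announced again).
 */
static bool demo_should_send(enum demo_req req, unsigned int queue_pair,
			     bool force)
{
	if (force)
		return true;
	if (demo_one_time_request(req) && queue_pair != 0)
		return false;
	return true;
}
```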

> 
> And FYI, for queue-pair (or vring) request, there should be an index
> in the payload, to point to the right vring. If not, it normally
> means a global request, that _may_ need be sent once only.
> 
>   --yliu
> 


[dpdk-dev] [PATCH v2] ring: check for zero objects mc dequeue / mp enqueue

2016-03-25 Thread Olivier Matz
Hi Lazaros,

On 03/17/2016 04:49 PM, Lazaros Koromilas wrote:
> Issuing a zero objects dequeue with a single consumer has no effect.
> Doing so with multiple consumers, can get more than one thread to succeed
> the compare-and-set operation and observe starvation or even deadlock in
> the while loop that checks for preceding dequeues.  The problematic piece
> of code when n = 0:
> 
> cons_next = cons_head + n;
> success = rte_atomic32_cmpset(&r->cons.head, cons_head, cons_next);
> 
> The same is possible on the enqueue path.

Just a question about this patch (that has been applied). Thomas
retitled the commit from your log message:

  ring: fix deadlock in zero object multi enqueue or dequeue
  http://dpdk.org/browse/dpdk/commit/?id=d0979646166e

I think this patch does not fix a deadlock, or did I miss something?

As explained in the following links, the ring may not perform well
if several threads running on the same cpu use it:

  http://dpdk.org/ml/archives/dev/2013-November/000714.html
  http://www.dpdk.org/ml/archives/dev/2014-January/001070.html
  http://www.dpdk.org/ml/archives/dev/2014-January/001162.html
  http://dpdk.org/ml/archives/dev/2015-July/020659.html

A deadlock could occur if the threads running on the same core
have different priority.
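
For reference, the n = 0 case quoted above can be reproduced with a small
sketch. It uses C11 atomics in place of rte_atomic32_cmpset, and the demo_*
names are hypothetical. The early return in the second helper is an
assumption about the kind of check the patch title refers to, not a copy of
the applied code.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced ring: only the consumer head is modelled. */
struct demo_ring {
	_Atomic uint32_t cons_head;
};

/* Move the consumer head by n, as in the quoted snippet. */
static bool demo_cas_head(struct demo_ring *r, uint32_t n)
{
	uint32_t cons_head = atomic_load(&r->cons_head);
	uint32_t cons_next = cons_head + n;

	/* With n == 0, cons_next == cons_head: the compare-and-set
	 * succeeds but changes nothing, so any number of threads can
	 * "win" it at the same time. */
	return atomic_compare_exchange_strong(&r->cons_head, &cons_head,
					      cons_next);
}

/* Same operation with a zero-object check in front of it. */
static bool demo_cas_head_checked(struct demo_ring *r, uint32_t n)
{
	if (n == 0)
		return false;	/* nothing to dequeue, do not touch the head */
	return demo_cas_head(r, n);
}
```

Two back-to-back calls to demo_cas_head() with n = 0 both report success and
leave the head unchanged, which is the situation where several consumers
believe they own the same position.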

Regards,
Olivier


[dpdk-dev] [PATCH] app/test/test_table_acl: fix incorrect IP header

2016-03-25 Thread Zhang, Roy Fan
Hi Thomas,

Sorry for the lack of a detailed description in this patch.
The patch was not actually a fix; it just adds the missing field
in the ipv4 5tuple area.

I will send a different patch with a more detailed description.

Regards,
Fan

On 16/03/2016 20:30, Thomas Monjalon wrote:
> 2016-03-14 12:22, Fan Zhang:
>> This patch fixes the incorrect IP header in ACL table test.
> It is not really a header but a 5-tuple.
>
> Please could you elaborate on the issue?
>
> A "Fixes:" reference is missing.
>
> Thanks



[dpdk-dev] [PATCH v3 2/4] example/ip_pipeline: add PCAP file support

2016-03-25 Thread Zhang, Roy Fan
Hi Thomas,

Sorry about that.

I will
-Original Message-
From: Thomas Monjalon [mailto:thomas.monja...@6wind.com] Sent: Friday, 
March 11, 2016 11:11 AM
To: Zhang, Roy Fan 
Cc: dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH v3 2/4] example/ip_pipeline: add PCAP 
file support

This patch cannot compile without NEXT_ABI:
examples/ip_pipeline/init.c:1227:24: error:
no member named 'file_name' in 'struct rte_port_source_params'


[dpdk-dev] [PATCH v2] testpmd: fix build on FreeBSD

2016-03-25 Thread Bruce Richardson
On Wed, Mar 23, 2016 at 04:17:12PM +0100, Thomas Monjalon wrote:
> 2016-03-23 02:17, Wu, Jingjing:
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Marvin Liu
> > > Build log:
> > > /root/dpdk/app/test-pmd/cmdline.c:6687:45: error: no member named
> > > 's6_addr32' in 'struct in6_addr'
> > > rte_be_to_cpu_32(res->ip_value.addr.ipv6.s6_addr32[i]);
> > > 
> > > This is caused by macro "s6_addr32" not defined on FreeBSD and testpmd
> > > swap big endian parameter to host endian. Move the swap action to i40e
> > > ethdev will fix this issue.
> > > 
> > > Fixes: 7b1312891b69 ("ethdev: add IP in GRE tunnel")
> > > 
> > > Signed-off-by: Marvin Liu 
> > Acked-by: Jingjing Wu 
> 
> It looks good but something is missing to decide that it is the right fix:
> the API does not state whether these fields (and others) are big endian or
> something else.
> 
> Please Jingjing, fix the ethdev comments for these fields and others
> rte_eth_ipv*_flow in a separate patch.

+1 to the more info because the endianness is confusing here. However, this looks
like a better fix than the previous one (v1 patch).

Thomas, can this be merged for RC2 to fix the BSD build, which should be a 
priority? Even if it's not the full solution, I think we need to at least get
the code building on BSD.
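
As an aside on the portability problem itself: s6_addr32 is not available
everywhere, but the s6_addr byte array is. The sketch below is not what this
patch does (the patch moves the swap into the i40e ethdev code); it only shows
one portable way to read a 32-bit word of an IPv6 address in host order. The
helper name is hypothetical and ntohl() is used where DPDK code would use
rte_be_to_cpu_32().

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/*
 * Return 32-bit word i (0..3) of an IPv6 address in host byte order,
 * using only the portable s6_addr byte array.
 */
static uint32_t demo_ipv6_word(const struct in6_addr *addr, unsigned int i)
{
	uint32_t be;

	/* memcpy avoids alignment and aliasing issues */
	memcpy(&be, &addr->s6_addr[i * 4], sizeof(be));
	return ntohl(be);
}
```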

Thanks,
/Bruce


[dpdk-dev] [PATCH v2] testpmd: fix build on FreeBSD

2016-03-25 Thread Bruce Richardson
On Fri, Mar 25, 2016 at 12:10:40PM +, Bruce Richardson wrote:
> On Wed, Mar 23, 2016 at 04:17:12PM +0100, Thomas Monjalon wrote:
> > 2016-03-23 02:17, Wu, Jingjing:
> > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Marvin Liu
> > > > Build log:
> > > > /root/dpdk/app/test-pmd/cmdline.c:6687:45: error: no member named
> > > > 's6_addr32' in 'struct in6_addr'
> > > > rte_be_to_cpu_32(res->ip_value.addr.ipv6.s6_addr32[i]);
> > > > 
> > > > This is caused by macro "s6_addr32" not defined on FreeBSD and testpmd
> > > > swap big endian parameter to host endian. Move the swap action to i40e
> > > > ethdev will fix this issue.
> > > > 
> > > > Fixes: 7b1312891b69 ("ethdev: add IP in GRE tunnel")
> > > > 
> > > > Signed-off-by: Marvin Liu 
> > > Acked-by: Jingjing Wu 
> > 
> > It looks good but something is missing to decide that it is the right fix:
> > the API does not state whether these fields (and others) are big endian or
> > something else.
> > 
> > Please Jingjing, fix the ethdev comments for these fields and others
> > rte_eth_ipv*_flow in a separate patch.
> 
> +1 to the more info because the endianness is confusing here. However, this looks
> like a better fix than the previous one (v1 patch).
> 
> Thomas, can this be merged for RC2 to fix the BSD build, which should be a 
> priority? Even if it's not the full solution, I think we need to at least get
> the code building on BSD.
> 
> Thanks,
> /Bruce

And I confirm this patch fixes the FreeBSD compile for both gcc and clang.

Tested-by: Bruce Richardson 



[dpdk-dev] [PATCH] nfp: copy pci info from pci to ethdev

2016-03-25 Thread Bruce Richardson
On Wed, Mar 23, 2016 at 08:51:36AM -0700, Stephen Hemminger wrote:
> The NFP driver (unlike other PCI devices) was not copying the pci info
> from the pci_dev to the eth_dev.  This would make the driver_name be
> null (and other unset fields) when application uses dev_info_get.
> 
> This was found by code review; do not have the hardware.
> 
> Signed-off-by: Stephen Hemminger 
> ---
Alejandro,

any review or ack on this patch for nfp driver?

Regards,
/Bruce


[dpdk-dev] [PATCH] bond: use existing enslaved device queues

2016-03-25 Thread Iremonger, Bernard
> -Original Message-
> From: Eric Kinzie [mailto:ehkinzie at gmail.com]
> Sent: Thursday, March 24, 2016 10:00 PM
> To: dev at dpdk.org
> Cc: tkiely at brocade.com; Doherty, Declan ;
> Iremonger, Bernard ; ekinzie at brocade.com
> Subject: [PATCH] bond: use existing enslaved device queues
> 
> This solves issues when an active device is added to a bond.
> 
> If a device to be enslaved already has transmit and/or receive queues
> allocated, use those and then create any additional queues that are
> necessary.
> 
> Fixes: 2efb58cbab6e ("bond: new link bonding library")
> 
> Signed-off-by: Eric Kinzie 

Acked-by: Bernard Iremonger


[dpdk-dev] [PATCH v2] ixgbe: disable icc compile warning

2016-03-25 Thread Bruce Richardson
On Thu, Mar 24, 2016 at 05:47:21PM +, Ananyev, Konstantin wrote:
> > -Original Message-
> > From: Yigit, Ferruh
> > Sent: Thursday, March 24, 2016 5:35 PM
> > To: dev at dpdk.org
> > Cc: Stephen Hemminger; Ananyev, Konstantin; Yigit, Ferruh
> > Subject: [PATCH v2] ixgbe: disable icc compile warning
> > 
> > icc (icc (ICC) 16.0.1 20151021) is generating following compile error:
> > "
> >   CC ixgbe_rxtx.o
> >   .../drivers/net/ixgbe/ixgbe_rxtx.c(153): error #3656: variable
> >   "free" may be used before its value is set
> >   (nb_free > 0 && m->pool != free[0]->pool)) {
> >  ^
> > "
> > 
> > Indeed this is a false positive and code is correct.
> > "nb_free" check prevents the free[] access before its value set.
> > 
> > Disabling this icc warning (#3656) for file ixgbe_rxtx.c.
> > 
> > Signed-off-by: Ferruh Yigit 
> > ---
> >  drivers/net/ixgbe/Makefile | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/drivers/net/ixgbe/Makefile b/drivers/net/ixgbe/Makefile
> > index c032775..50bf51c 100644
> > --- a/drivers/net/ixgbe/Makefile
> > +++ b/drivers/net/ixgbe/Makefile
> > @@ -49,6 +49,8 @@ ifeq ($(CC), icc)
> >  #
> >  CFLAGS_BASE_DRIVER = -wd174 -wd593 -wd869 -wd981 -wd2259
> > 
> > +CFLAGS_ixgbe_rxtx.o += -wd3656
> > +
> >  else ifeq ($(CC), clang)
> >  #
> >  # CFLAGS for clang
> > --
> 
> Acked-by: Konstantin Ananyev 
>
Applied to dpdk-next-net/rel_16_04

/Bruce


[dpdk-dev] [PATCH] igb: fix crash with offload on 82575 chipset

2016-03-25 Thread Ananyev, Konstantin


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Olivier Matz
> Sent: Friday, March 25, 2016 10:32 AM
> To: dev at dpdk.org
> Cc: Lu, Wenzhuo
> Subject: [dpdk-dev] [PATCH] igb: fix crash with offload on 82575 chipset
> 
> On the 82575 chipset, there is a pool of global TX contexts instead of 2
> per queue on 82576. See Table A-1 "Changes in Programming Interface
> Relative to 82575" of Intel® 82576EB GbE Controller datasheet (*).
> 
> In the driver, the contexts are attributed to a TX queue: 0-1 for txq0,
> 2-3 for txq1, and so on.
> 
> In igbe_set_xmit_ctx(), the variable ctx_curr contains the index of the
> per-queue context (0 or 1), and ctx_idx contains the index to be given
> to the hardware (0 to 7). The size of txq->ctx_cache[] is 2, and must
> be indexed with ctx_curr to avoid an out-of-bound access.
> 
> Also, the index returned by what_advctx_update() is the per-queue
> index (0 or 1), so we need to add txq->ctx_start before sending it
> to the hardware.
> 
> (*) The datasheet says 16 global contexts, however the IDX fields in TX
> descriptors are 3 bits, which gives a total of 8 contexts. The
> driver assumes there are 8 contexts on 82575: 2 per queue, 4 txqs.
> 
> Fixes: 4c8db5f09a ("igb: enable TSO support")
> Fixes: af75078fec ("first public release")
> Signed-off-by: Olivier Matz 
> ---
>  drivers/net/e1000/igb_rxtx.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/e1000/igb_rxtx.c b/drivers/net/e1000/igb_rxtx.c
> index e527895..529dba4 100644
> --- a/drivers/net/e1000/igb_rxtx.c
> +++ b/drivers/net/e1000/igb_rxtx.c
> @@ -325,9 +325,9 @@ igbe_set_xmit_ctx(struct igb_tx_queue* txq,
>   }
> 
>   txq->ctx_cache[ctx_curr].flags = ol_flags;
> - txq->ctx_cache[ctx_idx].tx_offload.data =
> + txq->ctx_cache[ctx_curr].tx_offload.data =
>   tx_offload_mask.data & tx_offload.data;
> - txq->ctx_cache[ctx_idx].tx_offload_mask = tx_offload_mask;
> + txq->ctx_cache[ctx_curr].tx_offload_mask = tx_offload_mask;
> 
>   ctx_txd->type_tucmd_mlhl = rte_cpu_to_le_32(type_tucmd_mlhl);
>   vlan_macip_lens = (uint32_t)tx_offload.data;
> @@ -450,7 +450,7 @@ eth_igb_xmit_pkts(void *tx_queue, struct rte_mbuf 
> **tx_pkts,
>   ctx = what_advctx_update(txq, tx_ol_req, tx_offload);
>   /* Only allocate context descriptor if required*/
>   new_ctx = (ctx == IGB_CTX_NUM);
> - ctx = txq->ctx_curr;
> + ctx = txq->ctx_curr + txq->ctx_start;
>   tx_last = (uint16_t) (tx_last + new_ctx);
>   }
>   if (tx_last >= txq->nb_tx_desc)
> --

Acked-by: Konstantin Ananyev 

> 2.1.4
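
The two indexes described in the commit message can be summarised with a
small sketch. The demo_* struct and helpers are hypothetical, reduced
versions for illustration only; the assumed layout is the one stated above,
i.e. ctx_start = 2 * queue index, with a 2-entry per-queue cache.

```c
#include <stdint.h>

#define DEMO_CTX_NUM 2		/* contexts cached per TX queue */

struct demo_ctx_info {
	uint64_t flags;
};

/* Hypothetical, reduced TX queue: only the context bookkeeping. */
struct demo_txq {
	uint32_t ctx_curr;	/* per-queue index: 0 or 1 */
	uint32_t ctx_start;	/* first hardware context of this queue */
	struct demo_ctx_info ctx_cache[DEMO_CTX_NUM];
};

/* Set up the queue so that txq0 owns contexts 0-1, txq1 owns 2-3, ... */
static void demo_txq_init(struct demo_txq *txq, uint32_t queue_idx)
{
	txq->ctx_curr = 0;
	txq->ctx_start = queue_idx * DEMO_CTX_NUM;
}

/* The software cache has 2 entries and is indexed with ctx_curr only. */
static void demo_cache_ctx(struct demo_txq *txq, uint64_t flags)
{
	txq->ctx_cache[txq->ctx_curr].flags = flags;
}

/* The index given to the hardware is global: per-queue index + start. */
static uint32_t demo_hw_ctx_idx(const struct demo_txq *txq)
{
	return txq->ctx_curr + txq->ctx_start;
}
```

Indexing ctx_cache[] with the hardware index would go out of bounds for any
queue other than txq0, which matches the crash described in the title.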



[dpdk-dev] [PATCH] bonding: fix initialisation of current_primary_port

2016-03-25 Thread Bruce Richardson
On Wed, Mar 23, 2016 at 05:30:05PM +, Bernard Iremonger wrote:
> The current_primary_port is initialised to an invalid value
> during bonded device creation.
> It should be set to a valid value later.
> This fix sets it to a valid value when the first slave port
> is added to the bonding device.
> 
> Fixes: 2efb58cbab6e ("bond: new link bonding library")
> 
> Signed-off-by: Bernard Iremonger 
> ---
Acked-by: Bruce Richardson 

Applied to dpdk-next-net/rel_16_04

/Bruce


[dpdk-dev] [PATCH v2] examples/l3fwd: fix validation for queue id of config tuple

2016-03-25 Thread De Lara Guarch, Pablo
Hi Reshma,

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Reshma Pattan
> Sent: Thursday, March 24, 2016 11:11 PM
> To: dev at dpdk.org; Ananyev, Konstantin
> Cc: Pattan, Reshma
> Subject: [dpdk-dev] [PATCH v2] examples/l3fwd: fix validation for queue id of
> config tuple
> 
> Added validation for queue id of config parameter tuple.
> 
> This validation enforces user to enter queue ids of a port
> from 0 and in sequence.
> 
> This additional validation on queue ids avoids ixgbe crash caused by null
> rxq pointer access inside ixgbe_dev_rx_init.
> 
> Reason for null rxq is, L3fwd application allocates memory only for queues
> passed by user.
> But rte_eth_dev_start  tries to initialize rx queues in sequence from 0 to
> nb_rx_queues,
> which is not true and coredump while accessing the unallocated queue .
> 

You forgot to include the Fixes line.

> Signed-off-by: Reshma Pattan 
> ---
>  v2:

[...]

> + if (lcore_params[i].port_id == port) {
> + if (lcore_params[i].queue_id == queue+1)
> + queue = lcore_params[i].queue_id;
> + else
> + rte_exit(EXIT_FAILURE, "queue ids of the port
> %d must be"
> + " in sequence and must start
> with 0",

You should include a return at the end of the sentence.

> + lcore_params[i].port_id);
> + }
>   }
>   return (uint8_t)(++queue);
>  }
> --
> 2.5.0



[dpdk-dev] Question on examples/multi_process app

2016-03-25 Thread Ananyev, Konstantin

Hi Harish,

> >> >
> >> >> -Original Message-
> >> >> From: Richardson, Bruce
> >> >> Sent: Wednesday, March 23, 2016 11:45 AM
> >> >> To: Ananyev, Konstantin
> >> >> Cc: Harish Patil; dev at dpdk.org
> >> >> Subject: Re: [dpdk-dev] Question on examples/multi_process app
> >> >>
> >> >> On Wed, Mar 23, 2016 at 11:09:17AM +, Ananyev, Konstantin wrote:
> >> >> > Hi everyone,
> >> >> >
> >> >> > > -Original Message-
> >> >> > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Bruce
> >> >>Richardson
> >> >> > > Sent: Tuesday, March 22, 2016 9:38 PM
> >> >> > > To: Harish Patil
> >> >> > > Cc: dev at dpdk.org
> >> >> > > Subject: Re: [dpdk-dev] Question on examples/multi_process app
> >> >> > >
> >> >> > > On Tue, Mar 22, 2016 at 08:03:42PM +, Harish Patil wrote:
> >> >> > > > Hi,
> >> >> > > > I have a question regarding symmetric_mp and mp_server
> >> >>applications under
> >> >> > > > examples/multi_process. In those apps,
> >> >>rte_eth_promiscuous_enable() is
> >> >> > > > called before rte_eth_dev_start(). Is this the correct way to
> >> >>initialize
> >> >> > > > the port/device? As per the description in
> >> >> > > > http://dpdk.org/doc/api/rte__ethdev_8h.html:
> >> >> > > >
> >> >> > > > "The functions exported by the application Ethernet API to
> >>setup
> >> >>a device
> >> >> > > > designated by its port identifier must be invoked in the
> >> >>following order:
> >> >> > > >
> >> >> > > > * rte_eth_dev_configure()
> >> >> > > > * rte_eth_tx_queue_setup()
> >> >> > > > * rte_eth_rx_queue_setup()
> >> >> > > > * rte_eth_dev_start()
> >> >> > > >
> >> >> > > > Then, the network application can invoke, in any order, the
> >> >>functions
> >> >> > > > exported by the Ethernet API to get the MAC address of a given
> >> >>device, to
> >> >> > > > get the speed and the status of a device physical link, to
> >> >> > > > receive/transmit [burst of] packets, and so on."
> >> >> > > >
> >> >> > > > So should I consider this as an application issue or whether
> >>the
> >> >>PMD is
> >> >> > > > expected to handle it? If PMD is to handle it, then should the
> >> >>PMD be:
> >> >> > > >
> >> >> > > > 1) Rejecting the Promisc config? OR
> >> >> > > > 2) Cache the config and apply when dev_start() is called at
> >>later
> >> >>point?
> >> >> >
> >> >> > Yes as I remember 2) is done.
> >> >> > dev_start() invokes rte_eth_dev_config_restore(), which restores
> >> >> > promisc mode, mac addresses, etc.
> >> >> >
> >> >> > > >
> >> >> > > > Thanks,
> >> >> > > > Harish
> >> >> > > >
> >> >> > > Good question. I think most/all of the Intel adapters, - for
> >>which
> >> >>the app was
> >> >> > > originally written, way back in the day when there were only 2
> >>PMDs
> >> >>in DPDK :)
> >> >> > > - will handle the promiscuous mode call either before or after
> >>the
> >> >>dev start.
> >> >> > > Assuming that's the case, and if it makes life easier for other
> >> >>driver writers,
> >> >> > > we should indeed standardize on one supported way of doing
> >>things -
> >> >>the way
> >> >> > > specified in the documentation being that one way, I would guess.
> >> >> > >
> >> >> > > So, e1000, ixgbe maintainers - do you see any issues with forcing
> >> >>the promiscuous
> >> >> > > mode set API to be called after the call to dev_start()?
> >> >> >
> >> >> > Not sure, why do we need to enforce that restriction?
> >> >> > Is there any problem with current way?
> >>
> >> Yes, at least with the our driver/firmware interface. The port/device
> >> bring-up is carried out in a certain order which requires port config
> >>like
> >> promisc mode is called after dev_start().
> >>
> >> >>
> >> >> It complicates things for driver writers is all,
> >> >
> >> >Not sure how?
> >> >All this replay is done at rte_ethdev layer.
> >> >Honestly, so far I don't remember any complaint about promisc on/off.
> >> >
> >> >> and conflicts slightly with
> >> >> what is stated in the docs.
> >> >
> >> >Update the docs? :)
> >>
> >> Anyway, please let me know what you guys decide? If the app is changed
> >> then nothing needs to done on driver side. Otherwise I have to think of
> >> how to handle this.
> >>
> >
> >So you are saying that for your device if dev_ops->promiscuous_enable()
> >is called before dev_ops->dev_start(), it would cause  a problem right?
> >Konstantin
> >
> >
> 
> Yes, with the way it is implemented currently it would pose a problem.
> Please note that it can be addressed in the driver, not an issue. However,
> I wanted to be sure if the app behavior is correct. Either ways, please
> let me know - I can take care of both.

If that's a real HW limitation, then in my opinion, yes, we had probably better
address it.
Though I am not sure what the best way is:
1) just update the docs and rely on users to read them carefully and write the
   proper code   
2) Inside rte_eth_promiscuous_enable/disable check for dev_started flag,
and if it is not set either
a) return an error or  
b) update data->promiscu

[dpdk-dev] [PATCH 2/2] doc: update which PMDs can parse packet type

2016-03-25 Thread Bruce Richardson
On Fri, Mar 25, 2016 at 11:15:36AM +0800, Jianfeng Tan wrote:
> Signed-off-by: Jianfeng Tan 
> ---
>  doc/guides/nics/overview.rst | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/doc/guides/nics/overview.rst b/doc/guides/nics/overview.rst
> index 542479a..e7504da 100644
> --- a/doc/guides/nics/overview.rst
> +++ b/doc/guides/nics/overview.rst
> @@ -124,7 +124,7 @@ Most of these differences are summarized below.
> L4 checksum offload  X   X   X   X
> inner L3 checksumX   X   X
> inner L4 checksumX   X   X
> -   packet type parsing  X   X   X
> +   packet type parsing  X X X X X X   X X   X X 
> X X X X   X

This diff does not look right. How come some entries are being removed from 
some drivers?
Are you sure the line has correct whitespace on it?

/Bruce


[dpdk-dev] [PATCH] ixgbe: extend the timer support to x550em

2016-03-25 Thread Bruce Richardson
On Fri, Mar 25, 2016 at 01:16:07PM +0800, Wenzhuo Lu wrote:
> An issue is found on x550em NICs, that ieee1588 is not working, the time
> always be 0.
> The root cause is the timer is only supported by x550, it's not extended
> to x550em_x and x550em_a.
> 
> Fixes: a7740dc1303a("ixgbe: support new devices and MAC types")
> Signed-off-by: Wenzhuo Lu 

Applied to dpdk-next-net/rel_16_04

/Bruce


[dpdk-dev] [PATCH] hash: fix to support multi process

2016-03-25 Thread Bruce Richardson
On Thu, Mar 24, 2016 at 01:56:44PM +, Pablo de Lara wrote:
> Hash library used a function pointer to choose a different
> key compare function, depending on the key size.
> As a result, multiple processes could not use the same hash table,
> as the function addresses vary from one process to another.
> 
> Instead, a jump table is used, so each process has its own
> function addresses, accessing this table with an index stored
> in the hash table (note that using a custom key compare function
> is not supported in multi-process mode).
> 
> Fixes: 48a399119619 ("hash: replace with cuckoo hash implementation")
> 
> Signed-off-by: Pablo de Lara 

This is a hard problem to solve, but this probably looks the least-worst
solution to it. Pablo, I assume the performance hit for lookups is pretty small here?

One small nit in the code below, and a comment on the doc change. Otherwise,

Acked-by: Bruce Richardson 

> ---
>  doc/guides/rel_notes/release_16_04.rst |  5 ++
>  lib/librte_hash/rte_cuckoo_hash.c  | 85 
> ++
>  2 files changed, 71 insertions(+), 19 deletions(-)
> 
> diff --git a/doc/guides/rel_notes/release_16_04.rst 
> b/doc/guides/rel_notes/release_16_04.rst
> index 2785b29..c4359dd 100644
> --- a/doc/guides/rel_notes/release_16_04.rst
> +++ b/doc/guides/rel_notes/release_16_04.rst
> @@ -351,6 +351,11 @@ Libraries
>Fix crc32c hash functions to return a valid crc32c value for data lengths
>not multiple of 4 bytes.
>  
> +* **hash: Fixed hash library to support multi-process mode.**
> +
> +  Fix hash library to support multi-process mode, using a jump table,
> +  instead of storing a function pointer to the key compare function.
> +
You probably need to clarify here and in the other docs (e.g. API doc) that:
* multi-process mode only works with the built-in functions and key sizes.
* a custom function not in the jump table can be used, but only in
  single-process mode.

>  * **librte_port: Fixed segmentation fault for ring and ethdev writer nodrop.**
>  
>Fixed core dump issue on txq and swq when dropless is set to yes.
> diff --git a/lib/librte_hash/rte_cuckoo_hash.c b/lib/librte_hash/rte_cuckoo_hash.c
> index 71b5b76..38c19ab 100644
> --- a/lib/librte_hash/rte_cuckoo_hash.c
> +++ b/lib/librte_hash/rte_cuckoo_hash.c
> @@ -102,6 +102,41 @@ EAL_REGISTER_TAILQ(rte_hash_tailq)
>  
>  #define LCORE_CACHE_SIZE 8
>  
> +/*
> + * All different options to select a key compare function,
> + * based on the key size and custom function.
> + */
> +enum cmp_jump_table_case {
> + KEY_CUSTOM = 0,
> + KEY_16_BYTES,
> + KEY_32_BYTES,
> + KEY_48_BYTES,
> + KEY_64_BYTES,
> + KEY_80_BYTES,
> + KEY_96_BYTES,
> + KEY_112_BYTES,
> + KEY_128_BYTES,
> + KEY_OTHER_BYTES,
> + NUM_KEY_CMP_CASES,
> +};
> +
> +/*
> + * Table storing all different key compare functions
> + * (multi-process supported)
> + */
> +rte_hash_cmp_eq_t cmp_jump_table[NUM_KEY_CMP_CASES] = {
> + NULL,
> + rte_hash_k16_cmp_eq,
> + rte_hash_k32_cmp_eq,
> + rte_hash_k48_cmp_eq,
> + rte_hash_k64_cmp_eq,
> + rte_hash_k80_cmp_eq,
> + rte_hash_k96_cmp_eq,
> + rte_hash_k112_cmp_eq,
> + rte_hash_k128_cmp_eq,
> + memcmp
> +};

Array should be const.

/Bruce
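
The jump-table idea from the commit message can be sketched as follows. This is a minimal illustration and not the librte_hash code: the names key_cmp_fn, cmp_k16, cmp_generic, shared_table, table_init and key_compare are hypothetical. The point it shows is that the shared structure stores only an index, and each process resolves that index through its own const array of function addresses.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical compare signature: returns 0 when the keys are equal. */
typedef int (*key_cmp_fn)(const void *a, const void *b, size_t len);

enum cmp_case { CMP_CUSTOM = 0, CMP_16_BYTES, CMP_OTHER_BYTES, NUM_CMP_CASES };

/* Stand-in for a specialised 16-byte compare function. */
static int cmp_k16(const void *a, const void *b, size_t len)
{
	(void)len;
	return memcmp(a, b, 16);
}

/* Generic fallback with the same signature as the specialised function. */
static int cmp_generic(const void *a, const void *b, size_t len)
{
	return memcmp(a, b, len);
}

/* Per-process, read-only table: these addresses are valid only locally. */
static const key_cmp_fn cmp_jump_table[NUM_CMP_CASES] = {
	NULL,        /* CMP_CUSTOM: no built-in function available */
	cmp_k16,
	cmp_generic,
};

/* What would live in shared memory: an index, never a function pointer. */
struct shared_table {
	size_t key_len;
	enum cmp_case cmp_idx;
};

/* Pick the table index once, at creation time, from the key size. */
static void table_init(struct shared_table *t, size_t key_len)
{
	t->key_len = key_len;
	t->cmp_idx = (key_len == 16) ? CMP_16_BYTES : CMP_OTHER_BYTES;
}

/* Each process resolves the index through its own copy of the jump table. */
static int key_compare(const struct shared_table *t, const void *a, const void *b)
{
	return cmp_jump_table[t->cmp_idx](a, b, t->key_len);
}
```

Declaring the array const, as requested in the review, lets it sit in read-only data. The CMP_CUSTOM slot stays NULL, which mirrors the restriction that a custom compare function cannot be shared between processes.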



[dpdk-dev] [PATCH v11 0/8] ethdev: 100G and link speed API refactoring

2016-03-25 Thread Zhang, Helin
Hi Thomas

Beilei is investigating that. She will share her findings soon, and
possibly a fix after validating it.
Thanks!

Regards,
Helin

> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Friday, March 25, 2016 5:36 PM
> To: Xu, Qian Q 
> Cc: dev at dpdk.org; Marc ; Ananyev, Konstantin
> ; Lu, Wenzhuo ;
> Zhang, Helin ; Richardson, Bruce
> ; Glynn, Michael J  intel.com>
> Subject: Re: [dpdk-dev] [PATCH v11 0/8] ethdev: 100G and link speed API
> refactoring
> 
> Is there someone investigating the issue?
> I think it should be simple to fix for someone mastering these Intel drivers.
> 
> 2016-03-25 01:02, Xu, Qian Q:
> > Marc
> > #Test1 is just a simple test. Just launch testpmd with these nic port.
> > ./testpmd -c 0x3 -n 4 -- -i
> >
> > Thanks
> > Qian
> >
> > From: marc.sune at gmail.com [mailto:marc.sune at gmail.com] On Behalf Of
> > Marc
> > Sent: Thursday, March 24, 2016 3:48 PM
> > To: Xu, Qian Q
> > Cc: Thomas Monjalon; Ananyev, Konstantin; Lu, Wenzhuo; Zhang, Helin;
> > Richardson, Bruce; dev at dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v11 0/8] ethdev: 100G and link speed
> > API refactoring
> >
> >
> >
> > On 24 March 2016 at 07:21, Xu, Qian Q <qian.q.xu at intel.com> wrote:
> > Marc
> > I didn't quite get your points. I observed that after applying this
> > patchset, all Intel NICs can't be started; maybe something went wrong when
> > you check the duplex/autoneg value for different NICs. If we want to merge
> > the patchset in RC2, we need to fix them. Maybe not an easy job in several
> > days.
> >
> > Is this test#1 one of the tests contained in the DPDK repository or is it an
> > internal test?
> >
> > Marc
> >
> >
> >
> > Thanks
> > Qian
> >
> > From: marc.sune at gmail.com
> > [mailto:marc.sune at gmail.com] On Behalf Of
> > Marc
> > Sent: Thursday, March 24, 2016 4:54 AM
> > To: Xu, Qian Q
> > Cc: Thomas Monjalon; Ananyev, Konstantin; Lu, Wenzhuo; Zhang, Helin;
> > Richardson, Bruce; dev at dpdk.org
> >
> > Subject: Re: [dpdk-dev] [PATCH v11 0/8] ethdev: 100G and link speed
> > API refactoring
> >
> > Qian,
> >
> > On 23 March 2016 at 02:18, Xu, Qian Q <qian.q.xu at intel.com> wrote:
> > We have tested with Intel NICs and found that ports can't be started for all
> > NICs: ixgbe/i40e/igb/bonding; see attached mail for more details. Please
> > check and fix it.
> >
> >
> > Thanks
> > Qian
> >
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org]
> > On Behalf Of Thomas Monjalon
> > Sent: Wednesday, March 23, 2016 3:59 AM
> > To: Ananyev, Konstantin; Lu, Wenzhuo; Zhang, Helin
> > Cc: marcdevel at gmail.com; Richardson,
> > Bruce; dev at dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v11 0/8] ethdev: 100G and link speed
> > API refactoring
> >
> > 2016-03-17 19:08, Thomas Monjalon:
> > > There are still too few tests and reviews, especially for
> > > autonegotiation with Intel devices (patch #6).
> > > I would not be surprised to see some bugs in this rework.
> >
> > Any feedback about autoneg in e1000/ixgbe/i40e?
> > Has it been tested before its integration in RC2?
> >
> > > The capabilities must be adapted per device. It can be improved in a
> > > separate patch.
> > >
> > > It will be integrated in 16.04-rc2.
> > > Please test and review shortly, thanks!
> >
> >
> > -- Forwarded message --
> > From: "Xu, Qian Q" <qian.q.xu at intel.com>
> > To: "Cao, Waterman" <waterman.cao at intel.com>,
> > "Glynn, Michael J" <michael.j.glynn at intel.com>
> > Cc: "Richardson, Bruce" <bruce.richardson at intel.com>,
> > "Zhu, Heqing" <heqing.zhu at intel.com>,
> > "O'Driscoll, Tim" <tim.odriscoll at intel.com>,
> > "Mcnamara, John" <john.mcnamara at intel.com>,
> > "Xu, HuilongX" <huilongx.xu at intel.com>,
> > "Fu, JingguoX" <jingguox.fu at intel.com>,
> > "Xu, Qian Q" <qian.q.xu at intel.com>,
> > "Zhang, Helin" <helin.zhang at intel.com>
> > Date: Tue, 22 Mar 2016 06:41:37 +
> > Subject: RE: DPDK link speed with Intel devices
> >
> > Hi, all
> > We have worked out the basic test cases for the patchset.
> > 1. Test the link speed on major Intel NICs to see if the speed is right.
> > 2. Test the auto-negotiation on major Intel NICs to ensure it's working.
> > NICs covered: ixgbe, igb, i40e, fm10k, bonding(SW), virtio(SW)
> >
> > When we ran Test#1 for all major NICs, we found that none of these NIC ports
> > (igb, ixgbe, i40e, fm10k) can be started. Please check: if the patch is
> > applied, no Intel port can be started, terrible things!
> >
> > Interactive-mode selected
> > Configuring Port 0 (socket 0)
> > PMD: ixgbe_dev_tx_queue_setup(): sw_ring=0x7f13e99e3440
> > hw_ring=0x7f13e99e5480 dma_addr=0x8299e5480
> > PMD: ixgbe_set_tx_function(): Usin

[dpdk-dev] [PATCH] hash: fix to support multi process

2016-03-25 Thread De Lara Guarch, Pablo
Hi,

> -Original Message-
> From: Richardson, Bruce
> Sent: Friday, March 25, 2016 2:41 PM
> To: De Lara Guarch, Pablo
> Cc: dev at dpdk.org; edreddy at gmail.com
> Subject: Re: [PATCH] hash: fix to support multi process
> 
> On Thu, Mar 24, 2016 at 01:56:44PM +, Pablo de Lara wrote:
> > Hash library used a function pointer to choose a different
> > key compare function, depending on the key size.
> > As a result, multiple processes could not use the same hash table,
> > as the function addresses vary from one process to another.
> >
> > Instead, a jump table is used, so each process has its own
> > function addresses, accessing this table with an index stored
> > in the hash table (note that using a custom key compare function
> > is not supported in multi-process mode).
> >
> > Fixes: 48a399119619 ("hash: replace with cuckoo hash implementation")
> >
> > Signed-off-by: Pablo de Lara 
> 
> This is a hard problem to solve, but this probably looks the least-worst
> solution to it. Pablo, I assume the performance hit for lookups is pretty small here?

Yes, performance degradation is negligible.
> 
> One small nit in the code below, and a comment on the doc change.

Will send a v2 with the changes. Anyway, I was waiting for Djama and Zhang
to give some feedback here and see if it works for them.

Thanks,
Pablo
> Otherwise,
> 
> Acked-by: Bruce Richardson 
> 
> > ---
> >  doc/guides/rel_notes/release_16_04.rst |  5 ++
> >  lib/librte_hash/rte_cuckoo_hash.c  | 85 ++-
> ---
> >  2 files changed, 71 insertions(+), 19 deletions(-)
> >
> > diff --git a/doc/guides/rel_notes/release_16_04.rst b/doc/guides/rel_notes/release_16_04.rst
> > index 2785b29..c4359dd 100644
> > --- a/doc/guides/rel_notes/release_16_04.rst
> > +++ b/doc/guides/rel_notes/release_16_04.rst
> > @@ -351,6 +351,11 @@ Libraries
> >Fix crc32c hash functions to return a valid crc32c value for data lengths
> >not multiple of 4 bytes.
> >
> > +* **hash: Fixed hash library to support multi-process mode.**
> > +
> > +  Fix hash library to support multi-process mode, using a jump table,
> > +  instead of storing a function pointer to the key compare function.
> > +
> You probably need to clarify here and in the other docs (e.g. API doc) that:
> * multi-process mode only works with the built-in functions and key sizes.
> * a custom function not in the jump table can be used, but only in
>   single-process mode.
> 
> >  * **librte_port: Fixed segmentation fault for ring and ethdev writer nodrop.**
> >
> >Fixed core dump issue on txq and swq when dropless is set to yes.
> > diff --git a/lib/librte_hash/rte_cuckoo_hash.c b/lib/librte_hash/rte_cuckoo_hash.c
> > index 71b5b76..38c19ab 100644
> > --- a/lib/librte_hash/rte_cuckoo_hash.c
> > +++ b/lib/librte_hash/rte_cuckoo_hash.c
> > @@ -102,6 +102,41 @@ EAL_REGISTER_TAILQ(rte_hash_tailq)
> >
> >  #define LCORE_CACHE_SIZE   8
> >
> > +/*
> > + * All different options to select a key compare function,
> > + * based on the key size and custom function.
> > + */
> > +enum cmp_jump_table_case {
> > +   KEY_CUSTOM = 0,
> > +   KEY_16_BYTES,
> > +   KEY_32_BYTES,
> > +   KEY_48_BYTES,
> > +   KEY_64_BYTES,
> > +   KEY_80_BYTES,
> > +   KEY_96_BYTES,
> > +   KEY_112_BYTES,
> > +   KEY_128_BYTES,
> > +   KEY_OTHER_BYTES,
> > +   NUM_KEY_CMP_CASES,
> > +};
> > +
> > +/*
> > + * Table storing all different key compare functions
> > + * (multi-process supported)
> > + */
> > +rte_hash_cmp_eq_t cmp_jump_table[NUM_KEY_CMP_CASES] = {
> > +   NULL,
> > +   rte_hash_k16_cmp_eq,
> > +   rte_hash_k32_cmp_eq,
> > +   rte_hash_k48_cmp_eq,
> > +   rte_hash_k64_cmp_eq,
> > +   rte_hash_k80_cmp_eq,
> > +   rte_hash_k96_cmp_eq,
> > +   rte_hash_k112_cmp_eq,
> > +   rte_hash_k128_cmp_eq,
> > +   memcmp
> > +};
> 
> Array should be const.
> 
> /Bruce



[dpdk-dev] [PATCH] ixgbe: support mac type x550em_a

2016-03-25 Thread Bruce Richardson
On Fri, Mar 25, 2016 at 02:11:02PM +0800, Wenzhuo Lu wrote:
> On my side, the development of the l2 tunnel and e-tag features was
> done in parallel with the ixgbe base code update. So, l2 tunnel and
> e-tag are not supported on the new x550em_a NICs.
> Now that all the code is ready, the support should be extended to
> x550em_a NICs.
> 
> Fixes: 22e77d4501b8("ixgbe: support L2 tunnel operations")
> Signed-off-by: Wenzhuo Lu 

Applied to dpdk-next-net/rel_16_04 with a heavily reworked commit message.
Please keep the commit message to only describing the problem being fixed
and how the patch fixes it.

Thanks,
/Bruce



[dpdk-dev] [PATCH v2] mlx4: use dummy rxqs when a non-pow2 number is requested

2016-03-25 Thread Bruce Richardson
On Fri, Mar 25, 2016 at 11:24:41AM +0100, Olivier Matz wrote:
> When using RSS, the number of rxqs has to be a power of two.
> This is a problem because there is no API in DPDK that makes
> the application aware of that.
> 
> A good compromise is to allow the application to request a
> number of rxqs that is not a power of 2, but having inactive
> queues that will never receive packets. In this configuration,
> a warning will be issued to users to let them know that
> this is not an optimal configuration.
> 
> Signed-off-by: Olivier Matz 
> Acked-by: Adrien Mazarguil 

Applied to dpdk-next-net/rel_16_04

/Bruce
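
A rough sketch of the compromise described above, under the assumption that the number of active queues is the largest power of two not exceeding the requested count and that the remaining queues are the inactive ones. The helper names prev_pow2 and plan_rxqs and the struct rxq_plan are hypothetical and are not taken from the mlx4 driver.

```c
#include <stdint.h>

/* Largest power of two that is <= n (returns 0 for n == 0). */
static uint32_t prev_pow2(uint32_t n)
{
	uint32_t p = 1;

	if (n == 0)
		return 0;
	/* Comparing against n / 2 avoids overflowing p on large inputs. */
	while (p <= n / 2)
		p *= 2;
	return p;
}

/* Hypothetical split of the requested rx queues. */
struct rxq_plan {
	uint32_t requested; /* what the application asked for */
	uint32_t active;    /* queues that take part in RSS */
	uint32_t inactive;  /* dummy queues that never receive packets */
};

static struct rxq_plan plan_rxqs(uint32_t requested)
{
	struct rxq_plan plan;

	plan.requested = requested;
	plan.active = prev_pow2(requested);
	plan.inactive = requested - plan.active;
	return plan;
}
```

With this split, a request for 6 queues would leave 4 active and 2 inactive, which is the situation the warning mentioned in the commit message is meant to flag.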


[dpdk-dev] [PATCH] igb: fix crash with offload on 82575 chipset

2016-03-25 Thread Bruce Richardson
On Fri, Mar 25, 2016 at 02:06:51PM +, Ananyev, Konstantin wrote:
> 
> 
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Olivier Matz
> > Sent: Friday, March 25, 2016 10:32 AM
> > To: dev at dpdk.org
> > Cc: Lu, Wenzhuo
> > Subject: [dpdk-dev] [PATCH] igb: fix crash with offload on 82575 chipset
> > 
> > On the 82575 chipset, there is a pool of global TX contexts instead of 2
> > per queue as on the 82576. See Table A-1 "Changes in Programming Interface
> > Relative to 82575" of the Intel® 82576EB GbE Controller datasheet (*).
> > 
> > In the driver, the contexts are attributed to a TX queue: 0-1 for txq0,
> > 2-3 for txq1, and so on.
> > 
> > In igbe_set_xmit_ctx(), the variable ctx_curr contains the index of the
> > per-queue context (0 or 1), and ctx_idx contains the index to be given
> > to the hardware (0 to 7). The size of txq->ctx_cache[] is 2, and must
> > be indexed with ctx_curr to avoid an out-of-bound access.
> > 
> > Also, the index returned by what_advctx_update() is the per-queue
> > index (0 or 1), so we need to add txq->ctx_start before sending it
> > to the hardware.
> > 
> > (*) The datasheet says 16 global contexts; however, the IDX fields in TX
> > descriptors are 3 bits, which gives a total of 8 contexts. The
> > driver assumes there are 8 contexts on 82575: 2 per queue, 4 txqs.
> > 
> > Fixes: 4c8db5f09a ("igb: enable TSO support")
> > Fixes: af75078fec ("first public release")
> > Signed-off-by: Olivier Matz 
> 
> Acked-by: Konstantin Ananyev 

Applied to dpdk-next-net/rel_16_04

/Bruce
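
The two indices discussed in the commit message can be illustrated with a small sketch. It is not the igb driver code: struct txq_sketch, txq_init, hw_ctx_idx and set_xmit_ctx are hypothetical names, and the rule ctx_start = queue_id * 2 is an assumption based on the "0-1 for txq0, 2-3 for txq1" description above.

```c
#include <stdint.h>

#define CTX_PER_QUEUE 2 /* size of the per-queue context cache */

/* Hypothetical, reduced view of a TX queue. */
struct txq_sketch {
	uint32_t ctx_start;                /* first hardware context of this queue */
	uint32_t ctx_curr;                 /* per-queue index: 0 or 1 */
	uint64_t ctx_cache[CTX_PER_QUEUE]; /* indexed by the per-queue index only */
};

static void txq_init(struct txq_sketch *txq, uint16_t queue_id)
{
	txq->ctx_start = (uint32_t)queue_id * CTX_PER_QUEUE;
	txq->ctx_curr = 0;
	txq->ctx_cache[0] = 0;
	txq->ctx_cache[1] = 0;
}

/* Translate a per-queue index (0 or 1) into the global hardware index. */
static uint32_t hw_ctx_idx(const struct txq_sketch *txq, uint32_t per_queue_idx)
{
	return txq->ctx_start + per_queue_idx;
}

/*
 * Record new offload flags in the cache slot selected by the per-queue
 * index and return the index the hardware descriptor should carry.
 */
static uint32_t set_xmit_ctx(struct txq_sketch *txq, uint64_t flags)
{
	uint32_t ctx_curr = txq->ctx_curr;

	txq->ctx_cache[ctx_curr] = flags; /* never indexed with the hw index */
	txq->ctx_curr ^= 1;               /* alternate between the two slots */
	return hw_ctx_idx(txq, ctx_curr);
}
```

Indexing ctx_cache with the hardware index instead would read past the two-entry array for every queue other than txq0, which is the out-of-bounds access the patch describes.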


[dpdk-dev] [PATCH v11 0/8] ethdev: 100G and link speed API refactoring

2016-03-25 Thread Thomas Monjalon
2016-03-25 15:07, Zhang, Helin:
> From: Thomas Monjalon
> > Is there someone investigating the issue?
> 
> Beilei is investigating that. She will share her findings soon, and
> possibly a fix after validating it.
> Thanks!

Great thanks!
After talking with Bruce, we have decided to close RC2 without this patch
and make it a prerequisite for RC3.



[dpdk-dev] [PATCH 0/3 v7] i40e: Add floating VEB support for i40e

2016-03-25 Thread Bruce Richardson
On Fri, Mar 25, 2016 at 04:41:57PM +0800, Zhe Tao wrote:
> This patch set adds support for floating VEB in i40e.
> All the VF VSIs can decide whether to connect to the legacy VEB/VEPA or
> the floating VEB. When connecting to the floating VEB, a new floating VEB is
> created. For now, all the VFs must connect to either the floating VEB or the
> legacy VEB; they cannot connect to both. The PF and VMDQ, FD VSIs connect to
> the old legacy VEB/VEPA.
> 
> None of the VEB/VEPA concepts are specific to FVL; they are defined in the
> 802.1Qbg spec.
> 
> The floating VEB only takes effect with specific firmware versions.
> 
> Zhe Tao (3):
>   Support floating VEB config
>   Add floating VEB support in i40e
>   Add global reset support for i40e
> 
Thanks for these patches, but at this stage in the development process we are
past the feature-freeze deadline, so these patches need to be deferred until the
16.07 release.

Regards,
/Bruce


[dpdk-dev] [PATCH v3] examples/l3fwd: fix validation for queue id of config tuple

2016-03-25 Thread Reshma Pattan
Added validation for the queue id of the config parameter tuple.

This validation requires the user to enter the queue ids of a port
starting from 0 and in sequence.

This additional validation on queue ids avoids an ixgbe crash caused by a null
rxq pointer access inside ixgbe_dev_rx_init.

The reason for the null rxq is that the l3fwd application allocates memory only
for the queues passed by the user, whereas rte_eth_dev_start tries to initialize
rx queues in sequence from 0 to nb_rx_queues. When the passed queue ids are not
in sequence, this causes a core dump while accessing the unallocated queue.

Fixes: af75078fece3 ("first public release")

Signed-off-by: Reshma Pattan 
---
 v3:
 added Fixes line
 added "\n" at the end of err message.

 v2:
 used nested if instead of if and elseif.

 examples/l3fwd/main.c |   11 ++++++++---
 1 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/examples/l3fwd/main.c b/examples/l3fwd/main.c
index 792894f..97a1423 100644
--- a/examples/l3fwd/main.c
+++ b/examples/l3fwd/main.c
@@ -263,9 +263,14 @@ get_port_n_rx_queues(const uint8_t port)
     uint16_t i;

     for (i = 0; i < nb_lcore_params; ++i) {
-        if (lcore_params[i].port_id == port &&
-                lcore_params[i].queue_id > queue)
-            queue = lcore_params[i].queue_id;
+        if (lcore_params[i].port_id == port) {
+            if (lcore_params[i].queue_id == queue+1)
+                queue = lcore_params[i].queue_id;
+            else
+                rte_exit(EXIT_FAILURE, "queue ids of the port %d must be"
+                        " in sequence and must start with 0\n",
+                        lcore_params[i].port_id);
+        }
     }
     return (uint8_t)(++queue);
 }
-- 
1.7.4.1
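
The check added by this patch can be tried outside of l3fwd with a small standalone sketch. It is an illustration only: struct lcore_param and port_n_rx_queues are hypothetical names, and the function returns -1 where the patch calls rte_exit(), so that the behaviour can be tested.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced form of one (port, queue, lcore) config tuple. */
struct lcore_param {
	uint8_t port_id;
	uint8_t queue_id;
	uint8_t lcore_id;
};

/*
 * Return the number of rx queues configured for a port, or -1 if the
 * queue ids of that port do not appear as 0, 1, 2, ... in that order.
 */
static int port_n_rx_queues(const struct lcore_param *params, size_t n,
		uint8_t port)
{
	int queue = -1; /* last accepted queue id; -1 means none yet */
	size_t i;

	for (i = 0; i < n; ++i) {
		if (params[i].port_id != port)
			continue;
		if (params[i].queue_id != queue + 1)
			return -1; /* gap, repeat, or not starting at 0 */
		queue = params[i].queue_id;
	}
	return queue + 1;
}
```

For example, the tuples (0,0,1),(0,1,2) are accepted with two queues on port 0, while (0,0,1),(0,2,2) are rejected because queue 1 would be left unallocated.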



[dpdk-dev] [PATCH] bond: use existing enslaved device queues

2016-03-25 Thread Bruce Richardson
On Fri, Mar 25, 2016 at 01:28:05PM +, Iremonger, Bernard wrote:
> > -Original Message-
> > From: Eric Kinzie [mailto:ehkinzie at gmail.com]
> > Sent: Thursday, March 24, 2016 10:00 PM
> > To: dev at dpdk.org
> > Cc: tkiely at brocade.com; Doherty, Declan ;
> > Iremonger, Bernard ; ekinzie at brocade.com
> > Subject: [PATCH] bond: use existing enslaved device queues
> > 
> > This solves issues when an active device is added to a bond.
> > 
> > If a device to be enslaved already has transmit and/or receive queues
> > allocated, use those and then create any additional queues that are
> > necessary.
> > 
> > Fixes: 2efb58cbab6e ("bond: new link bonding library")
> > 
> > Signed-off-by: Eric Kinzie 
> 
> Acked-by: Bernard Iremonger

Applied to dpdk-next-net/rel_16_04

/Bruce


[dpdk-dev] [PATCH v2] i40e: fix using memory after free issue

2016-03-25 Thread Bruce Richardson
On Fri, Mar 25, 2016 at 09:17:01AM +, Jiangu Zhao wrote:
> The old code still uses entry in the next loop of LIST_FOREACH after free()
> in i40e_res_pool_destroy().
> Change to a safe way to free entry, which is similar to LIST_FOREACH_SAFE
> in FreeBSD.
> 
> Fixes: 4861cde46116 ("i40e: new poll mode driver")
> 
> Signed-off-by: Jiangu Zhao 
Acked-by: Bruce Richardson 

Applied to dpdk-next-net/rel_16_04

/Bruce
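
The safe-free pattern mentioned in the commit message can be sketched with plain sys/queue.h macros. This is an illustration and not the i40e code: struct pool_entry, pool_add and pool_destroy are hypothetical names. The next pointer is saved before the current entry is freed, which is what LIST_FOREACH_SAFE does on FreeBSD.

```c
#include <stdlib.h>
#include <sys/queue.h>

/* Hypothetical list element standing in for a resource pool entry. */
struct pool_entry {
	LIST_ENTRY(pool_entry) next;
	int base;
};

LIST_HEAD(pool_list, pool_entry);

/* Allocate one entry and link it at the head; returns 0 on success. */
static int pool_add(struct pool_list *head, int base)
{
	struct pool_entry *entry = malloc(sizeof(*entry));

	if (entry == NULL)
		return -1;
	entry->base = base;
	LIST_INSERT_HEAD(head, entry, next);
	return 0;
}

/* Free every entry without touching freed memory; returns the count. */
static unsigned int pool_destroy(struct pool_list *head)
{
	struct pool_entry *entry, *next_entry;
	unsigned int freed = 0;

	for (entry = LIST_FIRST(head); entry != NULL; entry = next_entry) {
		/* Read the successor first: entry is invalid after free(). */
		next_entry = LIST_NEXT(entry, next);
		LIST_REMOVE(entry, next);
		free(entry);
		freed++;
	}
	return freed;
}
```

A plain LIST_FOREACH would evaluate LIST_NEXT(entry, next) after the loop body has already freed entry, which is the use-after-free being fixed.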


[dpdk-dev] [PATCH v3] examples/l3fwd: fix validation for queue id of config tuple

2016-03-25 Thread De Lara Guarch, Pablo


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Reshma Pattan
> Sent: Friday, March 25, 2016 3:14 PM
> To: dev at dpdk.org; Ananyev, Konstantin
> Cc: Pattan, Reshma
> Subject: [dpdk-dev] [PATCH v3] examples/l3fwd: fix validation for queue id of
> config tuple
> 
> Added validation for the queue id of the config parameter tuple.
> 
> This validation requires the user to enter the queue ids of a port
> starting from 0 and in sequence.
> 
> This additional validation on queue ids avoids an ixgbe crash caused by a null
> rxq pointer access inside ixgbe_dev_rx_init.
> 
> The reason for the null rxq is that the l3fwd application allocates memory only
> for the queues passed by the user, whereas rte_eth_dev_start tries to initialize
> rx queues in sequence from 0 to nb_rx_queues. When the passed queue ids are not
> in sequence, this causes a core dump while accessing the unallocated queue.
> 
> Fixes: af75078fece3 ("first public release")
> 
> Signed-off-by: Reshma Pattan 

Acked-by: Pablo de Lara 


[dpdk-dev] [PATCH] doc: postpone flow director changes planned for cxgbe

2016-03-25 Thread Thomas Monjalon
We will try to find a better solution.

Signed-off-by: Thomas Monjalon 
---
 doc/guides/rel_notes/deprecation.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index bdbac15..179e30f 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -24,4 +24,4 @@ Deprecation Notices

 * ABI changes are planned for adding four new flow types. This impacts
   RTE_ETH_FLOW_MAX. The release 2.2 does not contain these ABI changes,
-  but release 2.3 will.
+  but release 2.3 will. [postponed]
-- 
2.7.0



[dpdk-dev] [PATCH 1/2] lib/librte_lpm: Fix anonymous union initialization issue

2016-03-25 Thread Stephen Hemminger
Making code run with picky compilers is good, as long as it
doesn't break other (more modern) compilers.

> + new_tbl8_entry.next_hop=next_hop;

Please run your patches through checkpatch; it will warn about missing
whitespace like this.


[dpdk-dev] [PATCH] doc: postpone flow director changes planned for cxgbe

2016-03-25 Thread Thomas Monjalon
2016-03-25 17:29, Thomas Monjalon:
> We will try to find a better solution.
> 
> Signed-off-by: Thomas Monjalon 

Applied


[dpdk-dev] [PATCH v2] testpmd: fix build on FreeBSD

2016-03-25 Thread Thomas Monjalon
2016-03-25 12:15, Bruce Richardson:
> On Fri, Mar 25, 2016 at 12:10:40PM +, Bruce Richardson wrote:
> > On Wed, Mar 23, 2016 at 04:17:12PM +0100, Thomas Monjalon wrote:
> > > 2016-03-23 02:17, Wu, Jingjing:
> > > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Marvin Liu
> > > > > Build log:
> > > > > /root/dpdk/app/test-pmd/cmdline.c:6687:45: error: no member named
> > > > > 's6_addr32' in 'struct in6_addr'
> > > > > rte_be_to_cpu_32(res->ip_value.addr.ipv6.s6_addr32[i]);
> > > > > 
> > > > > This is caused by macro "s6_addr32" not defined on FreeBSD and testpmd
> > > > > swap big endian parameter to host endian. Move the swap action to i40e
> > > > > ethdev will fix this issue.
> > > > > 
> > > > > Fixes: 7b1312891b69 ("ethdev: add IP in GRE tunnel")
> > > > > 
> > > > > Signed-off-by: Marvin Liu 
> > > > Acked-by: Jingjing Wu 
> > > 
> > > It looks good but something is missing to decide that it is the right fix:
> > > the API does not state whether these fields (and others) are big endian or
> > > something else.
> > > 
> > > Please Jingjing, fix the ethdev comments for these fields and others
> > > rte_eth_ipv*_flow in a separate patch.
> > 
> > +1 to the more info because the endianness is confusing here. However, this
> > looks a better fix than the previous one (v1 patch).
> > 
> > Thomas, can this be merged for RC2 to fix the BSD build, which should be a
> > priority? Even if it's not the full solution, I think we need to at least
> > get the code building on BSD.
> > 
> > Thanks,
> > /Bruce
> 
> And I confirm this patch fixes the FreeBSD compile for both gcc and clang.
> 
> Tested-by: Bruce Richardson 

Applied, thanks
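
For context on the build error, here is a minimal sketch of reading an IPv6 address as four host-order 32-bit words without the non-portable s6_addr32 member. It is an alternative illustration only: the applied patch instead moves the byte swap into the i40e ethdev code. The helper name ipv6_to_host_words is hypothetical.

```c
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/*
 * Copy the 16 address bytes out of the standard s6_addr member, four at a
 * time, and convert each big-endian word to host byte order.
 */
static void ipv6_to_host_words(const struct in6_addr *addr, uint32_t out[4])
{
	int i;

	for (i = 0; i < 4; i++) {
		uint32_t be;

		/* memcpy avoids alignment and aliasing assumptions. */
		memcpy(&be, &addr->s6_addr[i * 4], sizeof(be));
		out[i] = ntohl(be);
	}
}
```

Only s6_addr is required by POSIX, so code written against it builds on both Linux and FreeBSD.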


[dpdk-dev] [PATCH v4 2/3] examples/l3fwd: fix using packet type blindly

2016-03-25 Thread Thomas Monjalon
2016-03-25 08:47, Jianfeng Tan:
> +* **examples/vhost: Fixed frequent mbuf allocation failure.**
> +
> +  vhost-switch often fails to allocate mbuf when dequeue from vring because it
> +  wrongly calculates the number of mbufs needed.

Wrong rebase here ;)

> +* **examples/l3fwd: Fixed using packet type blindly.**
> +
> +  l3fwd makes use of packet type information without even query if devices or PMDs
> +  really set it. For those don't set ptypes, add an option to parse it softly.




[dpdk-dev] [PATCH v4 0/3] packet type

2016-03-25 Thread Thomas Monjalon
2016-03-25 08:47, Jianfeng Tan:
> Patch 1: refine rte_eth_dev_get_supported_ptypes.

This patch has been split in another series.

> Patch 2: add an option in l3fwd.
> Patch 3: enable vector pmd in i40e by default.
> 
> Signed-off-by: Jianfeng Tan 
> Acked-by: Konstantin Ananyev 
> 
> 
> Jianfeng Tan (3):
>   ethdev: refine API to query supported packet types
>   examples/l3fwd: fix using packet type blindly
>   config: enable vector driver by default

Applied patches 2 & 3 with release notes fixes.


[dpdk-dev] [PATCH v2 0/4] vhost vlan tag and TSO fixes/cleanups

2016-03-25 Thread Thomas Monjalon
2016-03-25 15:58, Yuanhan Liu:
> v2: - we can't remove the left part of TSO settings to lib vhost, which
>   hurts VM2VM performance badly.
> 
> Ksiadz reported that TSO won't work for OVS with NIC, even with those
> similar changes from the commit 9fd72e3cbd29 ("examples/vhost: add
> virtio offload").
> 
> This gives me another chance to look at the TSO implementation a bit
> deeper, and then came up with this small patch set, which includes a
> TSO cleanup and fix.
> 
> Patch 4 is a vlan tag fix reported from Qian.

Applied, thanks


[dpdk-dev] [PATCH v3] examples/l3fwd: fix validation for queue id of config tuple

2016-03-25 Thread Thomas Monjalon
> > Added validation for the queue id of the config parameter tuple.
> > 
> > This validation requires the user to enter the queue ids of a port
> > starting from 0 and in sequence.
> > 
> > This additional validation on queue ids avoids an ixgbe crash caused by a null
> > rxq pointer access inside ixgbe_dev_rx_init.
> > 
> > The reason for the null rxq is that the l3fwd application allocates memory only
> > for the queues passed by the user, whereas rte_eth_dev_start tries to initialize
> > rx queues in sequence from 0 to nb_rx_queues. When the passed queue ids are not
> > in sequence, this causes a core dump while accessing the unallocated queue.
> > 
> > Fixes: af75078fece3 ("first public release")
> > 
> > Signed-off-by: Reshma Pattan 
> 
> Acked-by: Pablo de Lara 

Applied, thanks


[dpdk-dev] [dpdk-announce] release candidate 16.04-rc2

2016-03-25 Thread Thomas Monjalon
A new DPDK release candidate is ready for testing:
http://dpdk.org/browse/dpdk/tag/?id=v16.04-rc2

This release candidate is not too big. It includes a lot of fixes
for drivers, examples and tools. The Amazon ENA driver is now integrated.

The patchset about "link speed API rework" is probably breaking the
Intel drivers. As we cannot postpone this feature again, the RC3 will
wait for this patchset to be fixed.
Thanks for helping the release to be on time.



[dpdk-dev] [PATCH v12 0/8] ethdev: 100G and link speed API refactoring

2016-03-25 Thread Thomas Monjalon
There are still too few tests and reviews, especially for
autonegotiation with Intel devices (patch #6).
I would not be surprised to see some bugs in this rework.

The capabilities must be adapted per device. It can be
improved in a separate patch.

It will be integrated in 16.04-rc3.
Please test and review shortly, thanks!



This series of patches adds the following capabilities:

* speed_capa bitmap in rte_eth_dev_info, which is filled by the PMDs
  according to the physical device capabilities.
* refactors link API in ethdev to allow the definition of the advertised
  link speeds, a fixed speed (no auto-negotiation), or advertising all
  supported speeds (default).
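
The two capabilities above can be pictured with a toy bitmap. This sketch does not use the ethdev constants or functions; the SK_SPEED_* flags, sk_speed_bitflag and sk_advertised are hypothetical names chosen for illustration, and the bit positions are arbitrary.

```c
#include <stdint.h>

/* Hypothetical speed flags: one bit per supported link speed. */
#define SK_SPEED_1G   (1u << 0)
#define SK_SPEED_10G  (1u << 1)
#define SK_SPEED_40G  (1u << 2)
#define SK_SPEED_100G (1u << 3)

/* Convert a numeric speed in Mbps to its flag, or 0 if it is unknown. */
static uint32_t sk_speed_bitflag(uint32_t mbps)
{
	switch (mbps) {
	case 1000:   return SK_SPEED_1G;
	case 10000:  return SK_SPEED_10G;
	case 40000:  return SK_SPEED_40G;
	case 100000: return SK_SPEED_100G;
	default:     return 0;
	}
}

/*
 * Speeds to advertise: everything the device supports when nothing was
 * requested (the default), otherwise only the requested speeds that the
 * device capability bitmap actually contains.
 */
static uint32_t sk_advertised(uint32_t capa, uint32_t requested)
{
	return requested == 0 ? capa : (capa & requested);
}
```

A device reporting SK_SPEED_1G | SK_SPEED_10G and a request for 10G and 40G would end up advertising 10G only, since 40G is outside its capability bitmap.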



v12:
- rebase on 16.04-rc2
- fix mlx capabilities
- update ENA driver

v11:
- rebase on 16.04-rc1
- replace one more link status value in e1000 driver
- merge szedata2 patches
- remove szedata2 temporary comments in code and doc

v10:
- rebase
- rework release notes
- rearrange patch splitting
- fix doxygen comments
- fix typos
- removed log format of link.link_speed as %d (keep %u)
- complete ETH_LINK_[DOWN/UP] replacement from 0/1
- change ETH_LINK_SPEED_AUTONEG to 1
- replace ETH_LINK_SPEED_NEG by ETH_LINK_SPEED_AUTONEG (1)
- replace ETH_LINK_SPEED_NO_AUTONEG by ETH_LINK_SPEED_FIXED (0)
- rework rte_eth_speed_to_bm_flag to rte_eth_speed_bitflag
- complete 100G support in testpmd

v9: rebased to current HEAD. Reverted numeric speed to 32 bit in struct
rte_eth_link (no atomic link get > 64bit). Fixed mlx5 driver compilation
and link speeds. Moved documentation to release_16_04.rst and fixed several
issues. Upgrade NIC notes with speed capabilities.

v8: Rebased to current HEAD. Modified em driver impl. to not touch base files.
Merged patch 5 into 3 (map file). Changed numeric speed to a 64 bit value.
Filled-in speed capabilities for drivers bnx2x, cxgbe, mlx5 and nfp in
addition to the ones of previous patch sets.

v7: Rebased to current HEAD. Moved documentation to v2.3. Still needs testing
from PMD maintainers.

v6: Move link_duplex to be part of bitfield. Fixed i40 autoneg flag link
update code. Added rte_eth_speed_to_bm_flag() to .map file. Fixed other
spelling issues. Rebased to current HEAD.

v5: revert to v2 speed capabilities patch. Fixed MLX4 speed capabilities
(thanks N. Laranjeiro). Refactored link speed API to allow setting
advertised speeds (3/4). Added NO_AUTONEG option to explicitly disable
auto-negotiation. Updated 2.2 rel. notes (4/4). Rebased to current HEAD.

v4: fixed errata in the documentation of field speeds of rte_eth_conf, and
commit 1/2 message. rebased to v2.1.0. v3 was incorrectly based on
~2.1.0-rc1.

v3: rebase to v2.1. unified ETH_LINK_SPEED and ETH_SPEED_CAP into ETH_SPEED.
Converted field speed in struct rte_eth_conf to speed, to allow a bitmap
for defining the announced speeds, as suggested by M. Brorup. Fixed spelling
issues.

v2: rebase, converted speed_capa into 32 bits bitmap, fixed alignment
(checkpatch).



Marc Sune (6):
  ethdev: use constants for link duplex
  app/testpmd: move speed and duplex parsing in a function
  ethdev: rename link speed constants
  ethdev: add speed capabilities
  ethdev: redesign link speed config
  ethdev: convert speed number to bitmap flag

Thomas Monjalon (2):
  ethdev: use constants for link state
  ethdev: add 100G link speed

 app/test-pipeline/init.c   |   2 +-
 app/test-pmd/cmdline.c | 125 ++---
 app/test-pmd/testpmd.c |   2 +-
 app/test/test_pmd_perf.c   |   2 +-
 app/test/virtual_pmd.c |   8 +-
 doc/guides/nics/overview.rst   |   1 +
 doc/guides/nics/szedata2.rst   |   6 -
 doc/guides/rel_notes/deprecation.rst   |   3 -
 doc/guides/rel_notes/release_16_04.rst |  22 
 doc/guides/testpmd_app_ug/testpmd_funcs.rst|   2 +-
 drivers/net/af_packet/rte_eth_af_packet.c  |   9 +-
 drivers/net/bnx2x/bnx2x_ethdev.c   |   7 +-
 drivers/net/bnx2x/elink.c  |   2 +-
 drivers/net/bonding/rte_eth_bond_8023ad.c  |  14 +--
 drivers/net/bonding/rte_eth_bond_api.c |   4 +-
 drivers/net/bonding/rte_eth_bond_pmd.c |  12 +-
 drivers/net/cxgbe/base/t4_hw.c |   8 +-
 drivers/net/cxgbe/cxgbe_ethdev.c   |   1 +
 drivers/net/e1000/em_ethdev.c  | 113 +--
 drivers/net/e1000/igb_ethdev.c | 104 +
 drivers/net/ena/ena_ethdev.c   |  12 +-
 drivers/net/fm10k/fm10k_ethdev.c   |   6 +-
 drivers/net/i40e/i40e_ethdev.c |  76 +++--
 drivers/net/i40e/i40e_ethdev_vf.c  

[dpdk-dev] [PATCH v12 1/8] ethdev: use constants for link state

2016-03-25 Thread Thomas Monjalon
Define and use ETH_LINK_UP and ETH_LINK_DOWN where appropriate.

Signed-off-by: Marc Sune 
Signed-off-by: Thomas Monjalon 
---
 app/test-pipeline/init.c |  2 +-
 app/test-pmd/testpmd.c   |  2 +-
 app/test/test_pmd_perf.c |  2 +-
 app/test/virtual_pmd.c   |  6 +++---
 drivers/net/af_packet/rte_eth_af_packet.c|  6 +++---
 drivers/net/bnx2x/bnx2x_ethdev.c |  2 +-
 drivers/net/bnx2x/elink.c|  2 +-
 drivers/net/bonding/rte_eth_bond_api.c   |  4 ++--
 drivers/net/bonding/rte_eth_bond_pmd.c   | 12 ++--
 drivers/net/e1000/em_ethdev.c|  8 
 drivers/net/e1000/igb_ethdev.c   |  4 ++--
 drivers/net/fm10k/fm10k_ethdev.c |  2 +-
 drivers/net/i40e/i40e_ethdev_vf.c|  2 +-
 drivers/net/ixgbe/ixgbe_ethdev.c |  4 ++--
 drivers/net/mpipe/mpipe_tilegx.c | 12 ++--
 drivers/net/nfp/nfp_net.c|  2 +-
 drivers/net/null/rte_eth_null.c  |  6 +++---
 drivers/net/pcap/rte_eth_pcap.c  |  6 +++---
 drivers/net/ring/rte_eth_ring.c  | 10 +-
 drivers/net/szedata2/rte_eth_szedata2.c  |  2 +-
 drivers/net/vhost/rte_eth_vhost.c|  6 +++---
 drivers/net/virtio/virtio_ethdev.c   |  6 +++---
 drivers/net/vmxnet3/vmxnet3_ethdev.c |  2 +-
 drivers/net/xenvirt/rte_eth_xenvirt.c|  6 +++---
 examples/exception_path/main.c   |  2 +-
 examples/ip_fragmentation/main.c |  2 +-
 examples/ip_pipeline/init.c  |  2 +-
 examples/ip_reassembly/main.c|  2 +-
 examples/ipsec-secgw/ipsec-secgw.c   |  2 +-
 examples/ipv4_multicast/main.c   |  2 +-
 examples/kni/main.c  |  2 +-
 examples/l2fwd-crypto/main.c |  2 +-
 examples/l2fwd-ivshmem/host/host.c   |  2 +-
 examples/l2fwd-jobstats/main.c   |  2 +-
 examples/l2fwd-keepalive/main.c  |  2 +-
 examples/l2fwd/main.c|  2 +-
 examples/l3fwd-acl/main.c|  2 +-
 examples/l3fwd-power/main.c  |  2 +-
 examples/l3fwd/main.c|  2 +-
 examples/link_status_interrupt/main.c|  2 +-
 examples/load_balancer/init.c|  2 +-
 examples/multi_process/client_server_mp/mp_server/init.c |  2 +-
 examples/multi_process/l2fwd_fork/main.c |  2 +-
 examples/multi_process/symmetric_mp/main.c   |  2 +-
 examples/performance-thread/l3fwd-thread/main.c  |  2 +-
 lib/librte_ether/rte_ethdev.h|  5 -
 46 files changed, 83 insertions(+), 80 deletions(-)

diff --git a/app/test-pipeline/init.c b/app/test-pipeline/init.c
index db2196b..aef082f 100644
--- a/app/test-pipeline/init.c
+++ b/app/test-pipeline/init.c
@@ -205,7 +205,7 @@ app_ports_check_link(void)
link.link_speed / 1000,
link.link_status ? "UP" : "DOWN");

-   if (link.link_status == 0)
+   if (link.link_status == ETH_LINK_DOWN)
all_ports_up = 0;
}

diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 8605e62..1398c6c 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -1641,7 +1641,7 @@ check_all_ports_link_status(uint32_t port_mask)
continue;
}
/* clear all_ports_up flag if any link down */
-   if (link.link_status == 0) {
+   if (link.link_status == ETH_LINK_DOWN) {
all_ports_up = 0;
break;
}
diff --git a/app/test/test_pmd_perf.c b/app/test/test_pmd_perf.c
index 48e16c9..59803f7 100644
--- a/app/test/test_pmd_perf.c
+++ b/app/test/test_pmd_perf.c
@@ -192,7 +192,7 @@ check_all_ports_link_status(uint8_t port_num, uint32_t 
port_mask)
continue;
}
/* clear all_ports_up flag if any link down */
-   if (link.link_status == 0) {
+   if (link.link_status == ETH_LINK_DOWN) {
all_ports_up = 0;
break;
}
diff --git a/app/test/virtual_pmd.c b/app

[dpdk-dev] [PATCH v12 2/8] ethdev: use constants for link duplex

2016-03-25 Thread Thomas Monjalon
From: Marc Sune 

Some duplex values are changed from a literal 0 to ETH_LINK_HALF_DUPLEX when
the link is down.

Some drivers are still using their own constants for duplex modes.

Signed-off-by: Marc Sune 
---
 drivers/net/e1000/em_ethdev.c  | 2 +-
 drivers/net/e1000/igb_ethdev.c | 2 +-
 drivers/net/ixgbe/ixgbe_ethdev.c   | 2 +-
 drivers/net/virtio/virtio_ethdev.c | 2 +-
 drivers/net/virtio/virtio_ethdev.h | 2 --
 lib/librte_ether/rte_ethdev.h  | 2 +-
 6 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/drivers/net/e1000/em_ethdev.c b/drivers/net/e1000/em_ethdev.c
index dc9ed38..fad8f2f 100644
--- a/drivers/net/e1000/em_ethdev.c
+++ b/drivers/net/e1000/em_ethdev.c
@@ -1107,7 +1107,7 @@ eth_em_link_update(struct rte_eth_dev *dev, int 
wait_to_complete)
link.link_status = ETH_LINK_UP;
} else if (!link_check && (link.link_status == ETH_LINK_UP)) {
link.link_speed = 0;
-   link.link_duplex = 0;
+   link.link_duplex = ETH_LINK_HALF_DUPLEX;
link.link_status = ETH_LINK_DOWN;
}
rte_em_dev_atomic_write_link_status(dev, &link);
diff --git a/drivers/net/e1000/igb_ethdev.c b/drivers/net/e1000/igb_ethdev.c
index 045fc63..4dfa7e3 100644
--- a/drivers/net/e1000/igb_ethdev.c
+++ b/drivers/net/e1000/igb_ethdev.c
@@ -2062,7 +2062,7 @@ eth_igb_link_update(struct rte_eth_dev *dev, int 
wait_to_complete)
link.link_status = ETH_LINK_UP;
} else if (!link_check) {
link.link_speed = 0;
-   link.link_duplex = 0;
+   link.link_duplex = ETH_LINK_HALF_DUPLEX;
link.link_status = ETH_LINK_DOWN;
}
rte_igb_dev_atomic_write_link_status(dev, &link);
diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index 129f36a..21a3b8c 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -3061,7 +3061,7 @@ ixgbe_dev_link_update(struct rte_eth_dev *dev, int 
wait_to_complete)

link.link_status = ETH_LINK_DOWN;
link.link_speed = 0;
-   link.link_duplex = 0;
+   link.link_duplex = ETH_LINK_HALF_DUPLEX;
memset(&old, 0, sizeof(old));
rte_ixgbe_dev_atomic_read_link_status(dev, &old);

diff --git a/drivers/net/virtio/virtio_ethdev.c 
b/drivers/net/virtio/virtio_ethdev.c
index 3ebc221..63a368a 100644
--- a/drivers/net/virtio/virtio_ethdev.c
+++ b/drivers/net/virtio/virtio_ethdev.c
@@ -1401,7 +1401,7 @@ virtio_dev_link_update(struct rte_eth_dev *dev, 
__rte_unused int wait_to_complet
memset(&link, 0, sizeof(link));
virtio_dev_atomic_read_link_status(dev, &link);
old = link;
-   link.link_duplex = FULL_DUPLEX;
+   link.link_duplex = ETH_LINK_FULL_DUPLEX;
link.link_speed  = SPEED_10G;

if (vtpci_with_feature(hw, VIRTIO_NET_F_STATUS)) {
diff --git a/drivers/net/virtio/virtio_ethdev.h 
b/drivers/net/virtio/virtio_ethdev.h
index fed9571..66423a0 100644
--- a/drivers/net/virtio/virtio_ethdev.h
+++ b/drivers/net/virtio/virtio_ethdev.h
@@ -42,8 +42,6 @@
 #define SPEED_100  100
 #define SPEED_1000 1000
 #define SPEED_10G  10000
-#define HALF_DUPLEX1
-#define FULL_DUPLEX2

 #ifndef PAGE_SIZE
 #define PAGE_SIZE 4096
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index c5a215a..2d13f92 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -246,7 +246,7 @@ struct rte_eth_stats {
  */
 struct rte_eth_link {
uint16_t link_speed;  /**< ETH_LINK_SPEED_[10, 100, 1000, 10000] */
-   uint16_t link_duplex; /**< ETH_LINK_[HALF_DUPLEX, FULL_DUPLEX] */
+   uint16_t link_duplex; /**< ETH_LINK_[HALF/FULL]_DUPLEX */
uint8_t  link_status : 1; /**< ETH_LINK_[DOWN/UP] */
 }__attribute__((aligned(8))); /**< aligned for atomic64 read/write */

-- 
2.7.0



[dpdk-dev] [PATCH v12 3/8] app/testpmd: move speed and duplex parsing in a function

2016-03-25 Thread Thomas Monjalon
From: Marc Sune 

The code for checking and parsing speed/duplex was duplicated.
The new function is also checking the speed/duplex combination.

Signed-off-by: Marc Sune 
---
 app/test-pmd/cmdline.c | 99 --
 1 file changed, 47 insertions(+), 52 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 93203f4..eb7bbb4 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -988,6 +988,49 @@ struct cmd_config_speed_all {
cmdline_fixed_string_t value2;
 };

+static int
+parse_and_check_speed_duplex(char *speedstr, char *duplexstr, uint16_t *speed)
+{
+
+   int duplex;
+
+   if (!strcmp(duplexstr, "half")) {
+   duplex = ETH_LINK_HALF_DUPLEX;
+   } else if (!strcmp(duplexstr, "full")) {
+   duplex = ETH_LINK_FULL_DUPLEX;
+   } else if (!strcmp(duplexstr, "auto")) {
+   duplex = ETH_LINK_FULL_DUPLEX;
+   } else {
+   printf("Unknown duplex parameter\n");
+   return -1;
+   }
+
+   if (!strcmp(speedstr, "10")) {
+   *speed = ETH_LINK_SPEED_10;
+   } else if (!strcmp(speedstr, "100")) {
+   *speed = ETH_LINK_SPEED_100;
+   } else {
+   if (duplex != ETH_LINK_FULL_DUPLEX) {
+   printf("Invalid speed/duplex parameters\n");
+   return -1;
+   }
+   if (!strcmp(speedstr, "1000")) {
+   *speed = ETH_LINK_SPEED_1000;
+   } else if (!strcmp(speedstr, "10000")) {
+   *speed = ETH_LINK_SPEED_10G;
+   } else if (!strcmp(speedstr, "40000")) {
+   *speed = ETH_LINK_SPEED_40G;
+   } else if (!strcmp(speedstr, "auto")) {
+   *speed = ETH_LINK_SPEED_AUTONEG;
+   } else {
+   printf("Unknown speed parameter\n");
+   return -1;
+   }
+   }
+
+   return 0;
+}
+
 static void
 cmd_config_speed_all_parsed(void *parsed_result,
__attribute__((unused)) struct cmdline *cl,
@@ -1003,33 +1046,9 @@ cmd_config_speed_all_parsed(void *parsed_result,
return;
}

-   if (!strcmp(res->value1, "10"))
-   link_speed = ETH_LINK_SPEED_10;
-   else if (!strcmp(res->value1, "100"))
-   link_speed = ETH_LINK_SPEED_100;
-   else if (!strcmp(res->value1, "1000"))
-   link_speed = ETH_LINK_SPEED_1000;
-   else if (!strcmp(res->value1, "10000"))
-   link_speed = ETH_LINK_SPEED_10G;
-   else if (!strcmp(res->value1, "40000"))
-   link_speed = ETH_LINK_SPEED_40G;
-   else if (!strcmp(res->value1, "auto"))
-   link_speed = ETH_LINK_SPEED_AUTONEG;
-   else {
-   printf("Unknown parameter\n");
+   if (parse_and_check_speed_duplex(res->value1, res->value2,
+   &link_speed) < 0)
return;
-   }
-
-   if (!strcmp(res->value2, "half"))
-   link_duplex = ETH_LINK_HALF_DUPLEX;
-   else if (!strcmp(res->value2, "full"))
-   link_duplex = ETH_LINK_FULL_DUPLEX;
-   else if (!strcmp(res->value2, "auto"))
-   link_duplex = ETH_LINK_AUTONEG_DUPLEX;
-   else {
-   printf("Unknown parameter\n");
-   return;
-   }

FOREACH_PORT(pid, ports) {
ports[pid].dev_conf.link_speed = link_speed;
@@ -1102,33 +1121,9 @@ cmd_config_speed_specific_parsed(void *parsed_result,
if (port_id_is_invalid(res->id, ENABLED_WARN))
return;

-   if (!strcmp(res->value1, "10"))
-   link_speed = ETH_LINK_SPEED_10;
-   else if (!strcmp(res->value1, "100"))
-   link_speed = ETH_LINK_SPEED_100;
-   else if (!strcmp(res->value1, "1000"))
-   link_speed = ETH_LINK_SPEED_1000;
-   else if (!strcmp(res->value1, "10000"))
-   link_speed = ETH_LINK_SPEED_10000;
-   else if (!strcmp(res->value1, "40000"))
-   link_speed = ETH_LINK_SPEED_40G;
-   else if (!strcmp(res->value1, "auto"))
-   link_speed = ETH_LINK_SPEED_AUTONEG;
-   else {
-   printf("Unknown parameter\n");
-   return;
-   }
-
-   if (!strcmp(res->value2, "half"))
-   link_duplex = ETH_LINK_HALF_DUPLEX;
-   else if (!strcmp(res->value2, "full"))
-   link_duplex = ETH_LINK_FULL_DUPLEX;
-   else if (!strcmp(res->value2, "auto"))
-   link_duplex = ETH_LINK_AUTONEG_DUPLEX;
-   else {
-   printf("Unknown parameter\n");
+   if (parse_and_check_speed_duplex(res->value1, res->value2,
+   &link_speed) < 0)
return;
-   }

ports[res->id].dev_conf.link_speed = link_speed;
ports[res->id].dev_conf.link_duplex = link_duplex;
-- 
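
For readers who want to experiment with the combined check outside testpmd,
the following is a minimal standalone sketch of the logic this patch factors
out. It is not the testpmd code itself: the name demo_parse_speed_duplex and
the DEMO_* constants are hypothetical stand-ins, and speeds are stored as
plain Mbps numbers rather than the real ETH_LINK_* definitions from
rte_ethdev.h.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Illustrative stand-ins only; the real constants live in rte_ethdev.h. */
enum demo_duplex { DEMO_HALF_DUPLEX, DEMO_FULL_DUPLEX };
#define DEMO_SPEED_AUTONEG 0

/*
 * Hypothetical helper mirroring the patch's structure.
 * Returns 0 and stores the speed in Mbps on success, -1 on any error.
 */
static int
demo_parse_speed_duplex(const char *speedstr, const char *duplexstr,
			uint16_t *speed)
{
	enum demo_duplex duplex;

	if (!strcmp(duplexstr, "half")) {
		duplex = DEMO_HALF_DUPLEX;
	} else if (!strcmp(duplexstr, "full") || !strcmp(duplexstr, "auto")) {
		/* As in the patch, "auto" is handled like "full". */
		duplex = DEMO_FULL_DUPLEX;
	} else {
		printf("Unknown duplex parameter\n");
		return -1;
	}

	/* 10 and 100 Mbps are accepted with either duplex mode. */
	if (!strcmp(speedstr, "10")) {
		*speed = 10;
		return 0;
	}
	if (!strcmp(speedstr, "100")) {
		*speed = 100;
		return 0;
	}

	/* Anything faster, and "auto", is only accepted with full duplex. */
	if (duplex != DEMO_FULL_DUPLEX) {
		printf("Invalid speed/duplex parameters\n");
		return -1;
	}

	if (!strcmp(speedstr, "1000")) {
		*speed = 1000;
	} else if (!strcmp(speedstr, "10000")) {
		*speed = 10000;
	} else if (!strcmp(speedstr, "40000")) {
		*speed = 40000;
	} else if (!strcmp(speedstr, "auto")) {
		*speed = DEMO_SPEED_AUTONEG;
	} else {
		printf("Unknown speed parameter\n");
		return -1;
	}

	return 0;
}
```

Calling demo_parse_speed_duplex("1000", "half", &speed) is rejected, while
"10" with "half" is accepted, which is the speed/duplex combination check
the commit message refers to.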

[dpdk-dev] [PATCH v12 4/8] ethdev: rename link speed constants

2016-03-25 Thread Thomas Monjalon
From: Marc Sune 

The speed numbers ETH_LINK_SPEED_ are renamed ETH_SPEED_NUM_.
The prefix ETH_LINK_SPEED_ is kept for AUTONEG and will be used
for bit flags in next patch.

Signed-off-by: Marc Sune 
---
 app/test-pmd/cmdline.c| 10 +++++-----
 app/test/virtual_pmd.c|  2 +-
 drivers/net/af_packet/rte_eth_af_packet.c |  2 +-
 drivers/net/bonding/rte_eth_bond_8023ad.c | 12 ++++++------
 drivers/net/cxgbe/base/t4_hw.c|  8 ++++----
 drivers/net/e1000/em_ethdev.c |  8 ++++----
 drivers/net/e1000/igb_ethdev.c|  8 ++++----
 drivers/net/ena/ena_ethdev.c  |  2 +-
 drivers/net/i40e/i40e_ethdev.c| 30 +++++++++++++++---------------
 drivers/net/i40e/i40e_ethdev_vf.c |  2 +-
 drivers/net/ixgbe/ixgbe_ethdev.c  | 22 +++++++++++-----------
 drivers/net/mpipe/mpipe_tilegx.c  |  4 ++--
 drivers/net/nfp/nfp_net.c |  2 +-
 drivers/net/null/rte_eth_null.c   |  2 +-
 drivers/net/pcap/rte_eth_pcap.c   |  2 +-
 drivers/net/ring/rte_eth_ring.c   |  2 +-
 drivers/net/szedata2/rte_eth_szedata2.c   |  8 ++++----
 drivers/net/vmxnet3/vmxnet3_ethdev.c  |  2 +-
 drivers/net/xenvirt/rte_eth_xenvirt.c |  2 +-
 lib/librte_ether/rte_ethdev.h | 29 ++++++++++++++++++-----------
 20 files changed, 83 insertions(+), 76 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index eb7bbb4..815b53b 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -1006,20 +1006,20 @@ parse_and_check_speed_duplex(char *speedstr, char 
*duplexstr, uint16_t *speed)
}

if (!strcmp(speedstr, "10")) {
-   *speed = ETH_LINK_SPEED_10;
+   *speed = ETH_SPEED_NUM_10M;
} else if (!strcmp(speedstr, "100")) {
-   *speed = ETH_LINK_SPEED_100;
+   *speed = ETH_SPEED_NUM_100M;
} else {
if (duplex != ETH_LINK_FULL_DUPLEX) {
printf("Invalid speed/duplex parameters\n");
return -1;
}
if (!strcmp(speedstr, "1000")) {
-   *speed = ETH_LINK_SPEED_1000;
+   *speed = ETH_SPEED_NUM_1G;
} else if (!strcmp(speedstr, "10000")) {
-   *speed = ETH_LINK_SPEED_10G;
+   *speed = ETH_SPEED_NUM_10G;
} else if (!strcmp(speedstr, "40000")) {
-   *speed = ETH_LINK_SPEED_40G;
+   *speed = ETH_SPEED_NUM_40G;
} else if (!strcmp(speedstr, "auto")) {
*speed = ETH_LINK_SPEED_AUTONEG;
} else {
diff --git a/app/test/virtual_pmd.c b/app/test/virtual_pmd.c
index b1d40d7..b4bd2f2 100644
--- a/app/test/virtual_pmd.c
+++ b/app/test/virtual_pmd.c
@@ -604,7 +604,7 @@ virtual_ethdev_create(const char *name, struct ether_addr 
*mac_addr,
TAILQ_INIT(&(eth_dev->link_intr_cbs));

eth_dev->data->dev_link.link_status = ETH_LINK_DOWN;
-   eth_dev->data->dev_link.link_speed = ETH_LINK_SPEED_10000;
+   eth_dev->data->dev_link.link_speed = ETH_SPEED_NUM_10G;
eth_dev->data->dev_link.link_duplex = ETH_LINK_FULL_DUPLEX;

eth_dev->data->mac_addrs = rte_zmalloc(name, ETHER_ADDR_LEN, 0);
diff --git a/drivers/net/af_packet/rte_eth_af_packet.c 
b/drivers/net/af_packet/rte_eth_af_packet.c
index dee7b59..641f849 100644
--- a/drivers/net/af_packet/rte_eth_af_packet.c
+++ b/drivers/net/af_packet/rte_eth_af_packet.c
@@ -116,7 +116,7 @@ static const char *valid_arguments[] = {
 static const char *drivername = "AF_PACKET PMD";

 static struct rte_eth_link pmd_link = {
-   .link_speed = 10000,
+   .link_speed = ETH_SPEED_NUM_10G,
.link_duplex = ETH_LINK_FULL_DUPLEX,
.link_status = ETH_LINK_DOWN,
 };
diff --git a/drivers/net/bonding/rte_eth_bond_8023ad.c 
b/drivers/net/bonding/rte_eth_bond_8023ad.c
index 1b7e93a..ac8306f 100644
--- a/drivers/net/bonding/rte_eth_bond_8023ad.c
+++ b/drivers/net/bonding/rte_eth_bond_8023ad.c
@@ -711,22 +711,22 @@ link_speed_key(uint16_t speed) {
case ETH_LINK_SPEED_AUTONEG:
key_speed = 0x00;
break;
-   case ETH_LINK_SPEED_10:
+   case ETH_SPEED_NUM_10M:
key_speed = BOND_LINK_SPEED_KEY_10M;
break;
-   case ETH_LINK_SPEED_100:
+   case ETH_SPEED_NUM_100M:
key_speed = BOND_LINK_SPEED_KEY_100M;
break;
-   case ETH_LINK_SPEED_1000:
+   case ETH_SPEED_NUM_1G:
key_speed = BOND_LINK_SPEED_KEY_1000M;
break;
-   case ETH_LINK_SPEED_10G:
+   case ETH_SPEED_NUM_10G:
key_speed = BOND_LINK_SPEED_KEY_10G;
break;
-   case ETH_LINK_SPEED_20G:
+   case ETH_SPEED_NUM_20G:
key_speed = BOND_LINK_SPEED_KEY_20G;
break;
-   case ETH_LINK_SPE

[dpdk-dev] [PATCH v12 5/8] ethdev: add speed capabilities

2016-03-25 Thread Thomas Monjalon
From: Marc Sune 

The speed capabilities of a device can be retrieved with
rte_eth_dev_info_get().

The new field speed_capa is initialized in the drivers without
taking care of device characteristics in this patch.
When the capabilities of a driver are accurate, the table in
overview.rst must be filled.

Signed-off-by: Marc Sune 
---
 doc/guides/nics/overview.rst   |  1 +
 doc/guides/rel_notes/release_16_04.rst |  8 ++++++++
 drivers/net/bnx2x/bnx2x_ethdev.c   |  1 +
 drivers/net/cxgbe/cxgbe_ethdev.c   |  1 +
 drivers/net/e1000/em_ethdev.c  |  4 ++++
 drivers/net/e1000/igb_ethdev.c |  4 ++++
 drivers/net/ena/ena_ethdev.c   |  9 +++++++++
 drivers/net/fm10k/fm10k_ethdev.c   |  4 ++++
 drivers/net/i40e/i40e_ethdev.c |  8 ++++++++
 drivers/net/ixgbe/ixgbe_ethdev.c   |  8 ++++++++
 drivers/net/mlx4/mlx4.c|  6 ++
 drivers/net/mlx5/mlx5_ethdev.c |  8 ++++++++
 drivers/net/nfp/nfp_net.c  |  2 ++
 lib/librte_ether/rte_ethdev.h  | 21 +++++++++++++++++++++
 14 files changed, 85 insertions(+)

diff --git a/doc/guides/nics/overview.rst b/doc/guides/nics/overview.rst
index 542479a..62f1868 100644
--- a/doc/guides/nics/overview.rst
+++ b/doc/guides/nics/overview.rst
@@ -86,6 +86,7 @@ Most of these differences are summarized below.
   e   e   e   e   e
 e
   c   c   c   c   c
 c
 = = = = = = = = = = = = = = = = = = = = = = = = = = = 
= = = = = =
+   speed capabilities
link status  X   X X   
X X
link status eventX X
 X
queue status event  
 X
diff --git a/doc/guides/rel_notes/release_16_04.rst 
b/doc/guides/rel_notes/release_16_04.rst
index 79d76e1..9e7b0b7 100644
--- a/doc/guides/rel_notes/release_16_04.rst
+++ b/doc/guides/rel_notes/release_16_04.rst
@@ -47,6 +47,11 @@ This section should contain new features added in this 
release. Sample format:
   A new function ``rte_pktmbuf_alloc_bulk()`` has been added to allow the user
   to allocate a bulk of mbufs.

+* **Added device link speed capabilities.**
+
+  The structure ``rte_eth_dev_info`` has now a ``speed_capa`` bitmap, which
+  allows the application to know the supported speeds of each device.
+
 * **Added new poll-mode driver for Amazon Elastic Network Adapters (ENA).**

   The driver operates variety of ENA adapters through feature negotiation
@@ -456,6 +461,9 @@ This section should contain API changes. Sample format:
   All drivers are now counting the missed packets only once, i.e. drivers will
   not increment ierrors anymore for missed packets.

+* The ethdev structure ``rte_eth_dev_info`` was changed to support device
+  speed capabilities.
+
 * The functions ``rte_eth_dev_udp_tunnel_add`` and 
``rte_eth_dev_udp_tunnel_delete``
   have been renamed into ``rte_eth_dev_udp_tunnel_port_add`` and
   ``rte_eth_dev_udp_tunnel_port_delete``.
diff --git a/drivers/net/bnx2x/bnx2x_ethdev.c b/drivers/net/bnx2x/bnx2x_ethdev.c
index a3c6c01..897081f 100644
--- a/drivers/net/bnx2x/bnx2x_ethdev.c
+++ b/drivers/net/bnx2x/bnx2x_ethdev.c
@@ -327,6 +327,7 @@ bnx2x_dev_infos_get(struct rte_eth_dev *dev, __rte_unused 
struct rte_eth_dev_inf
dev_info->min_rx_bufsize = BNX2X_MIN_RX_BUF_SIZE;
dev_info->max_rx_pktlen  = BNX2X_MAX_RX_PKT_LEN;
dev_info->max_mac_addrs  = BNX2X_MAX_MAC_ADDRS;
+   dev_info->speed_capa = ETH_LINK_SPEED_10G | ETH_LINK_SPEED_20G;
 }

 static void
diff --git a/drivers/net/cxgbe/cxgbe_ethdev.c b/drivers/net/cxgbe/cxgbe_ethdev.c
index 8845c76..bb134e5 100644
--- a/drivers/net/cxgbe/cxgbe_ethdev.c
+++ b/drivers/net/cxgbe/cxgbe_ethdev.c
@@ -171,6 +171,7 @@ static void cxgbe_dev_info_get(struct rte_eth_dev *eth_dev,

device_info->rx_desc_lim = cxgbe_desc_lim;
device_info->tx_desc_lim = cxgbe_desc_lim;
+   device_info->speed_capa = ETH_LINK_SPEED_10G | ETH_LINK_SPEED_40G;
 }

 static void cxgbe_dev_promiscuous_enable(struct rte_eth_dev *eth_dev)
diff --git a/drivers/net/e1000/em_ethdev.c b/drivers/net/e1000/em_ethdev.c
index 473d77f..d5f8c7f 100644
--- a/drivers/net/e1000/em_ethdev.c
+++ b/drivers/net/e1000/em_ethdev.c
@@ -1054,6 +1054,10 @@ eth_em_infos_get(struct rte_eth_dev *dev, struct 
rte_eth_dev_info *dev_info)
.nb_min = E1000_MIN_RING_DESC,
.nb_align = EM_TXD_ALIGN,
};
+
+   dev_info->speed_capa = ETH_LINK_SPEED_10M_HD | ETH_LINK_SPEED_10M |
+   ETH_LINK_SPEED_100M_HD | ETH_LINK_SPEED_100M |
+   ETH_LINK_SPEED_1G;
 }

 /* return 0 means link status changed, -1 means not changed */
diff --git a/drivers/net/e1000/igb_ethdev.c b/drivers/net/e1000/igb_ethdev.c
index 86f25f6..95d1711 100644
--- a/drivers/

[dpdk-dev] [PATCH v12 6/8] ethdev: redesign link speed config

2016-03-25 Thread Thomas Monjalon
From: Marc Sune 

This patch redesigns the API to set the link speed/s configuration
of an ethernet port. Specifically:

- it allows defining a set of advertised speeds for
  auto-negotiation.
- it allows disabling link auto-negotiation (single fixed speed).
- default: auto-negotiate all supported speeds.

A flag autoneg in struct rte_eth_link indicates if link speed was a
result of auto-negotiation or was fixed by configuration.

Signed-off-by: Marc Sune 
Tested-by: Nelio Laranjeiro 
Signed-off-by: Thomas Monjalon 
---

PLEASE REVIEW CAREFULLY THIS PATCH

 app/test-pmd/cmdline.c| 26 
 doc/guides/rel_notes/deprecation.rst  |  3 -
 doc/guides/rel_notes/release_16_04.rst|  9 +++
 drivers/net/af_packet/rte_eth_af_packet.c |  1 +
 drivers/net/bnx2x/bnx2x_ethdev.c  |  4 +-
 drivers/net/bonding/rte_eth_bond_8023ad.c |  2 +-
 drivers/net/e1000/em_ethdev.c | 99 +++
 drivers/net/e1000/igb_ethdev.c| 94 +++--
 drivers/net/i40e/i40e_ethdev.c| 48 +++
 drivers/net/i40e/i40e_ethdev_vf.c |  7 ++-
 drivers/net/ixgbe/ixgbe_ethdev.c  | 46 ++
 drivers/net/mlx4/mlx4.c   |  2 +
 drivers/net/mpipe/mpipe_tilegx.c  |  2 +
 drivers/net/null/rte_eth_null.c   |  1 +
 drivers/net/pcap/rte_eth_pcap.c   |  1 +
 drivers/net/ring/rte_eth_ring.c   |  1 +
 drivers/net/szedata2/rte_eth_szedata2.c   |  2 +
 drivers/net/vmxnet3/vmxnet3_ethdev.c  |  1 +
 drivers/net/xenvirt/rte_eth_xenvirt.c |  1 +
 examples/ip_pipeline/config_parse.c   |  3 +-
 lib/librte_ether/rte_ethdev.h | 29 +
 21 files changed, 196 insertions(+), 186 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 815b53b..741cac3 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -989,7 +989,7 @@ struct cmd_config_speed_all {
 };

 static int
-parse_and_check_speed_duplex(char *speedstr, char *duplexstr, uint16_t *speed)
+parse_and_check_speed_duplex(char *speedstr, char *duplexstr, uint32_t *speed)
 {

int duplex;
@@ -1006,20 +1006,22 @@ parse_and_check_speed_duplex(char *speedstr, char 
*duplexstr, uint16_t *speed)
}

if (!strcmp(speedstr, "10")) {
-   *speed = ETH_SPEED_NUM_10M;
+   *speed = (duplex == ETH_LINK_HALF_DUPLEX) ?
+   ETH_LINK_SPEED_10M_HD : ETH_LINK_SPEED_10M;
} else if (!strcmp(speedstr, "100")) {
-   *speed = ETH_SPEED_NUM_100M;
+   *speed = (duplex == ETH_LINK_HALF_DUPLEX) ?
+   ETH_LINK_SPEED_100M_HD : ETH_LINK_SPEED_100M;
} else {
if (duplex != ETH_LINK_FULL_DUPLEX) {
printf("Invalid speed/duplex parameters\n");
return -1;
}
if (!strcmp(speedstr, "1000")) {
-   *speed = ETH_SPEED_NUM_1G;
+   *speed = ETH_LINK_SPEED_1G;
} else if (!strcmp(speedstr, "10000")) {
-   *speed = ETH_SPEED_NUM_10G;
+   *speed = ETH_LINK_SPEED_10G;
} else if (!strcmp(speedstr, "40000")) {
-   *speed = ETH_SPEED_NUM_40G;
+   *speed = ETH_LINK_SPEED_40G;
} else if (!strcmp(speedstr, "auto")) {
*speed = ETH_LINK_SPEED_AUTONEG;
} else {
@@ -1037,8 +1039,7 @@ cmd_config_speed_all_parsed(void *parsed_result,
__attribute__((unused)) void *data)
 {
struct cmd_config_speed_all *res = parsed_result;
-   uint16_t link_speed = ETH_LINK_SPEED_AUTONEG;
-   uint16_t link_duplex = 0;
+   uint32_t link_speed;
portid_t pid;

if (!all_ports_stopped()) {
@@ -1051,8 +1052,7 @@ cmd_config_speed_all_parsed(void *parsed_result,
return;

FOREACH_PORT(pid, ports) {
-   ports[pid].dev_conf.link_speed = link_speed;
-   ports[pid].dev_conf.link_duplex = link_duplex;
+   ports[pid].dev_conf.link_speeds = link_speed;
}

cmd_reconfig_device_queue(RTE_PORT_ALL, 1, 1);
@@ -1110,8 +1110,7 @@ cmd_config_speed_specific_parsed(void *parsed_result,
__attribute__((unused)) void *data)
 {
struct cmd_config_speed_specific *res = parsed_result;
-   uint16_t link_speed = ETH_LINK_SPEED_AUTONEG;
-   uint16_t link_duplex = 0;
+   uint32_t link_speed;

if (!all_ports_stopped()) {
printf("Please stop all ports first\n");
@@ -1125,8 +1124,7 @@ cmd_config_speed_specific_parsed(void *parsed_result,
&link_speed) < 0)
return;

-   ports[res->id].dev_conf.link_speed = link_speed;
-   ports[res->id].dev_conf.link_duplex = link_duplex;
+   ports[res->

[dpdk-dev] [PATCH v12 7/8] ethdev: convert speed number to bitmap flag

2016-03-25 Thread Thomas Monjalon
From: Marc Sune 

It is a helper for the bitmap configuration.

Signed-off-by: Marc Sune 
Signed-off-by: Thomas Monjalon 
---
 lib/librte_ether/rte_ethdev.c  | 31 +++++++++++++++++++++++++++++++
 lib/librte_ether/rte_ethdev.h  | 13 +++++++++++++
 lib/librte_ether/rte_ether_version.map |  1 +
 3 files changed, 45 insertions(+)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 76a30fd..695b475 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -866,6 +866,37 @@ rte_eth_dev_tx_queue_config(struct rte_eth_dev *dev, 
uint16_t nb_queues)
return 0;
 }

+uint32_t
+rte_eth_speed_bitflag(uint32_t speed, int duplex)
+{
+   switch (speed) {
+   case ETH_SPEED_NUM_10M:
+   return duplex ? ETH_LINK_SPEED_10M : ETH_LINK_SPEED_10M_HD;
+   case ETH_SPEED_NUM_100M:
+   return duplex ? ETH_LINK_SPEED_100M : ETH_LINK_SPEED_100M_HD;
+   case ETH_SPEED_NUM_1G:
+   return ETH_LINK_SPEED_1G;
+   case ETH_SPEED_NUM_2_5G:
+   return ETH_LINK_SPEED_2_5G;
+   case ETH_SPEED_NUM_5G:
+   return ETH_LINK_SPEED_5G;
+   case ETH_SPEED_NUM_10G:
+   return ETH_LINK_SPEED_10G;
+   case ETH_SPEED_NUM_20G:
+   return ETH_LINK_SPEED_20G;
+   case ETH_SPEED_NUM_25G:
+   return ETH_LINK_SPEED_25G;
+   case ETH_SPEED_NUM_40G:
+   return ETH_LINK_SPEED_40G;
+   case ETH_SPEED_NUM_50G:
+   return ETH_LINK_SPEED_50G;
+   case ETH_SPEED_NUM_56G:
+   return ETH_LINK_SPEED_56G;
+   default:
+   return 0;
+   }
+}
+
 int
 rte_eth_dev_configure(uint8_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q,
  const struct rte_eth_conf *dev_conf)
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 03d3278..bb08ead 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1876,6 +1876,19 @@ struct eth_driver {
 void rte_eth_driver_register(struct eth_driver *eth_drv);

 /**
+ * Convert a numerical speed in Mbps to a bitmap flag that can be used in
+ * the bitmap link_speeds of the struct rte_eth_conf
+ *
+ * @param speed
+ *   Numerical speed value in Mbps
+ * @param duplex
+ *   ETH_LINK_[HALF/FULL]_DUPLEX (only for 10/100M speeds)
+ * @return
+ *   0 if the speed cannot be mapped
+ */
+uint32_t rte_eth_speed_bitflag(uint32_t speed, int duplex);
+
+/**
  * Configure an Ethernet device.
  * This function must be invoked first before any other function in the
  * Ethernet API. This function can also be re-invoked when a device is in the
diff --git a/lib/librte_ether/rte_ether_version.map 
b/lib/librte_ether/rte_ether_version.map
index b1f4475..214ecc7 100644
--- a/lib/librte_ether/rte_ether_version.map
+++ b/lib/librte_ether/rte_ether_version.map
@@ -125,6 +125,7 @@ DPDK_16.04 {
rte_eth_dev_set_vlan_ether_type;
rte_eth_dev_udp_tunnel_port_add;
rte_eth_dev_udp_tunnel_port_delete;
+   rte_eth_speed_bitflag;
rte_eth_tx_buffer_count_callback;
rte_eth_tx_buffer_drop_callback;
rte_eth_tx_buffer_init;
-- 
2.7.0
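
As a rough illustration of how such a helper fits with a bitmap of advertised
speeds, here is a self-contained sketch. Everything named demo_* or DEMO_* is
hypothetical: the bit positions are arbitrary stand-ins rather than the real
ETH_LINK_SPEED_* values, only a few speeds are covered, and
demo_advertised_speeds is an assumed convenience wrapper that does not exist
in the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values for illustration; not copied from rte_ethdev.h. */
#define DEMO_FLAG_10M_HD  (1u << 0)
#define DEMO_FLAG_10M     (1u << 1)
#define DEMO_FLAG_100M_HD (1u << 2)
#define DEMO_FLAG_100M    (1u << 3)
#define DEMO_FLAG_1G      (1u << 4)
#define DEMO_FLAG_10G     (1u << 5)
#define DEMO_FLAG_40G     (1u << 6)

/*
 * Map a numerical speed in Mbps to a single bit flag.
 * The duplex argument only matters for 10 and 100 Mbps.
 * Returns 0 when the speed cannot be mapped.
 */
static uint32_t
demo_speed_bitflag(uint32_t speed_mbps, int full_duplex)
{
	switch (speed_mbps) {
	case 10:
		return full_duplex ? DEMO_FLAG_10M : DEMO_FLAG_10M_HD;
	case 100:
		return full_duplex ? DEMO_FLAG_100M : DEMO_FLAG_100M_HD;
	case 1000:
		return DEMO_FLAG_1G;
	case 10000:
		return DEMO_FLAG_10G;
	case 40000:
		return DEMO_FLAG_40G;
	default:
		return 0;
	}
}

/*
 * OR together the flags for a list of speeds, as an application might do
 * before filling a bitmap of advertised speeds.
 * Returns 0 if any entry cannot be mapped.
 */
static uint32_t
demo_advertised_speeds(const uint32_t *speeds_mbps, size_t n, int full_duplex)
{
	uint32_t mask = 0;

	for (size_t i = 0; i < n; i++) {
		uint32_t flag = demo_speed_bitflag(speeds_mbps[i], full_duplex);

		if (flag == 0)
			return 0;
		mask |= flag;
	}
	return mask;
}
```

Returning 0 for an unmappable speed follows the behaviour documented for the
helper in this patch; rejecting the whole list on one bad entry is a design
choice of the sketch, not something the patch specifies.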



[dpdk-dev] [PATCH v12 8/8] ethdev: add 100G link speed

2016-03-25 Thread Thomas Monjalon
The link speed configuration is now done with bitmaps so 100G speed
requires only a new bit flag.
The actual link speed is a number so its size must be increased from
16-bit to 32-bit.

Signed-off-by: Marc Sune 
Signed-off-by: Thomas Monjalon 
Tested-by: Nelio Laranjeiro 
Tested-by: Matej Vido 
---
 app/test-pmd/cmdline.c  | 12 +++++++-----
 doc/guides/nics/szedata2.rst|  6 ------
 doc/guides/rel_notes/release_16_04.rst  |  5 +++++
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  2 +-
 drivers/net/ena/ena_ethdev.c|  3 ++-
 drivers/net/fm10k/fm10k_ethdev.c|  2 +-
 drivers/net/mlx5/mlx5_ethdev.c  |  3 ++-
 drivers/net/nfp/nfp_net.c   |  2 +-
 drivers/net/szedata2/rte_eth_szedata2.c |  9 ++-------
 lib/librte_ether/rte_ethdev.c   |  2 ++
 lib/librte_ether/rte_ethdev.h   |  4 +++-
 11 files changed, 26 insertions(+), 24 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 741cac3..c5b9479 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -549,7 +549,7 @@ static void cmd_help_long_parsed(void *parsed_result,
"Detach physical or virtual dev by port_id\n\n"

"port config (port_id|all)"
-   " speed (10|100|1000|10000|40000|auto)"
+   " speed (10|100|1000|10000|40000|100000|auto)"
" duplex (half|full|auto)\n"
"Set speed and duplex for all ports or port_id\n\n"

@@ -1022,6 +1022,8 @@ parse_and_check_speed_duplex(char *speedstr, char 
*duplexstr, uint32_t *speed)
*speed = ETH_LINK_SPEED_10G;
} else if (!strcmp(speedstr, "40000")) {
*speed = ETH_LINK_SPEED_40G;
+   } else if (!strcmp(speedstr, "100000")) {
+   *speed = ETH_LINK_SPEED_100G;
} else if (!strcmp(speedstr, "auto")) {
*speed = ETH_LINK_SPEED_AUTONEG;
} else {
@@ -1069,7 +1071,7 @@ cmdline_parse_token_string_t cmd_config_speed_all_item1 =
TOKEN_STRING_INITIALIZER(struct cmd_config_speed_all, item1, "speed");
 cmdline_parse_token_string_t cmd_config_speed_all_value1 =
TOKEN_STRING_INITIALIZER(struct cmd_config_speed_all, value1,
-   "10#100#1000#10000#40000#auto");
+   "10#100#1000#10000#40000#100000#auto");
 cmdline_parse_token_string_t cmd_config_speed_all_item2 =
TOKEN_STRING_INITIALIZER(struct cmd_config_speed_all, item2, "duplex");
 cmdline_parse_token_string_t cmd_config_speed_all_value2 =
@@ -1079,7 +1081,7 @@ cmdline_parse_token_string_t cmd_config_speed_all_value2 =
 cmdline_parse_inst_t cmd_config_speed_all = {
.f = cmd_config_speed_all_parsed,
.data = NULL,
-   .help_str = "port config all speed 10|100|1000|10000|40000|auto duplex "
+   .help_str = "port config all speed 10|100|1000|10000|40000|100000|auto duplex "
"half|full|auto",
.tokens = {
(void *)&cmd_config_speed_all_port,
@@ -1143,7 +1145,7 @@ cmdline_parse_token_string_t 
cmd_config_speed_specific_item1 =
"speed");
 cmdline_parse_token_string_t cmd_config_speed_specific_value1 =
TOKEN_STRING_INITIALIZER(struct cmd_config_speed_specific, value1,
-   "10#100#1000#10000#40000#auto");
+   "10#100#1000#10000#40000#100000#auto");
 cmdline_parse_token_string_t cmd_config_speed_specific_item2 =
TOKEN_STRING_INITIALIZER(struct cmd_config_speed_specific, item2,
"duplex");
@@ -1154,7 +1156,7 @@ cmdline_parse_token_string_t 
cmd_config_speed_specific_value2 =
 cmdline_parse_inst_t cmd_config_speed_specific = {
.f = cmd_config_speed_specific_parsed,
.data = NULL,
-   .help_str = "port config X speed 10|100|1000|10000|40000|auto duplex "
+   .help_str = "port config X speed 10|100|1000|10000|40000|100000|auto duplex "
"half|full|auto",
.tokens = {
(void *)&cmd_config_speed_specific_port,
diff --git a/doc/guides/nics/szedata2.rst b/doc/guides/nics/szedata2.rst
index 77c15b3..741b400 100644
--- a/doc/guides/nics/szedata2.rst
+++ b/doc/guides/nics/szedata2.rst
@@ -148,9 +148,3 @@ Example output:
  TX threshold registers: pthresh=0 hthresh=0 wthresh=0
  TX RS bit threshold=0 - TXQ flags=0x0
testpmd>
-
-.. note::
-
-   Link speed API currently supports speeds up to 40 Gbps.
-   Therefore there is used 10G constant for 100 Gbps cards until the link speed
-   API is not changed.
diff --git a/doc/g

[dpdk-dev] [PATCH v11 0/8] ethdev: 100G and link speed API refactoring

2016-03-25 Thread Marc
On 25 March 2016 at 16:07, Zhang, Helin  wrote:

> Hi Thomas
>
> Beilei is investigating that, she will give her findings soon later, and
> possibly a fix after validating that.
> Thanks!
>
>
I will try to reproduce this on my side too with the latest v12. I could
not try latest patchsets, but i40e (XL710) and igb (82540EM) were working on
my side for previous versions. Which exact NICs were used to test the
patchset for igb?

Marc


> Regards,
> Helin
>
> > -----Original Message-----
> > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> > Sent: Friday, March 25, 2016 5:36 PM
> > To: Xu, Qian Q 
> > Cc: dev at dpdk.org; Marc ; Ananyev, Konstantin
> > ; Lu, Wenzhuo ;
> > Zhang, Helin ; Richardson, Bruce
> > ; Glynn, Michael J <
> michael.j.glynn at intel.com>
> > Subject: Re: [dpdk-dev] [PATCH v11 0/8] ethdev: 100G and link speed API
> > refactoring
> >
> > Is there someone investigating the issue?
> > I think it should be simple to fix for someone mastering these Intel
> drivers.
> >
> > 2016-03-25 01:02, Xu, Qian Q:
> > > Marc
> > > #Test1 is just a simple test. Just launch testpmd with these nic port.
> > > > ./testpmd -c 0x3 -n 4 -- -i
> > >
> > > Thanks
> > > Qian
> > >
> > > From: marc.sune at gmail.com [mailto:marc.sune at gmail.com] On Behalf Of
> > > Marc
> > > Sent: Thursday, March 24, 2016 3:48 PM
> > > To: Xu, Qian Q
> > > Cc: Thomas Monjalon; Ananyev, Konstantin; Lu, Wenzhuo; Zhang, Helin;
> > > Richardson, Bruce; dev at dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH v11 0/8] ethdev: 100G and link speed
> > > API refactoring
> > >
> > >
> > >
> > > On 24 March 2016 at 07:21, Xu, Qian Q
> > mailto:qian.q.xu at intel.com>> wrote:
> > > Marc
> > > > I didn't quite get your points, I observed that after applying this
> patchset, all
> > intel nic can't be started, maybe something wrong happened when you check
> > the duplex/autoneg value for different NICs. If we want to merge the
> patchset in
> > RC2, we need fix them. Maybe not an easy job in several days.
> > >
> > > Is this test#1 one of the tests contained in the DPDK repository or is
> it an
> > internal test?
> > >
> > > Marc
> > >
> > >
> > >
> > > Thanks
> > > Qian
> > >
> > > From: marc.sune at gmail.com
> > > [mailto:marc.sune at gmail.com] On Behalf 
> > > Of
> > > Marc
> > > Sent: Thursday, March 24, 2016 4:54 AM
> > > To: Xu, Qian Q
> > > Cc: Thomas Monjalon; Ananyev, Konstantin; Lu, Wenzhuo; Zhang, Helin;
> > > Richardson, Bruce; dev at dpdk.org
> > >
> > > Subject: Re: [dpdk-dev] [PATCH v11 0/8] ethdev: 100G and link speed
> > > API refactoring
> > >
> > > Qian,
> > >
> > > On 23 March 2016 at 02:18, Xu, Qian Q
> > mailto:qian.q.xu at intel.com>> wrote:
> > > We have tested with intel nic and found port can't be started for all
> > nics:ixgbe/i40e/igb/bonding, see attached mail for more details. Please
> check
> > and fix it.
> > >
> > >
> > > Thanks
> > > Qian
> > >
> > > > -----Original Message-----
> > > From: dev [mailto:dev-bounces at dpdk.org]
> > > On Behalf Of Thomas Monjalon
> > > Sent: Wednesday, March 23, 2016 3:59 AM
> > > To: Ananyev, Konstantin; Lu, Wenzhuo; Zhang, Helin
> > > Cc: marcdevel at gmail.com; Richardson,
> > > Bruce; dev at dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH v11 0/8] ethdev: 100G and link speed
> > > API refactoring
> > >
> > > 2016-03-17 19:08, Thomas Monjalon:
> > > > There are still too few tests and reviews, especially for
> > > > autonegotiation with Intel devices (patch #6).
> > > > I would not be surprised to see some bugs in this rework.
> > >
> > > Any feedback about autoneg in e1000/ixgbe/i40e?
> > > Has it been tested before its integration in RC2?
> > >
> > > > The capabilities must be adapted per device. It can be improved in a
> > > > separate patch.
> > > >
> > > > It will be integrated in 16.04-rc2.
> > > > Please test and review shortly, thanks!
> > >
> > >
> > > -- Forwarded message --
> > > From: "Xu, Qian Q" mailto:qian.q.xu at intel.com>>
> > > To: "Cao, Waterman"
> > > mailto:waterman.cao at intel.com>>, "Glynn,
> > > Michael J"
> > > mailto:michael.j.glynn at intel.com>>
> > > Cc: "Richardson, Bruce"
> > > mailto:bruce.richardson at intel.com>>, 
> > > "Zhu,
> > > Heqing" mailto:heqing.zhu at intel.com>>,
> > > "O'Driscoll, Tim"
> > > mailto:tim.odriscoll at intel.com>>, 
> > > "Mcnamara,
> > > John" mailto:john.mcnamara at intel.com>>, 
> > > "Xu,
> > > HuilongX" mailto:huilongx.xu at intel.com>>, 
> > > "Fu,
> > > JingguoX" mailto:jingguox.fu at intel.com>>, 
> > > "Xu,
> > > Qian Q" mailto:qian.q.xu at intel.com>>, "Zhang,
> > > Helin" mailto:helin.zhang at intel.com>>
> > > Date: Tue, 22 Mar 2016 06:41:37 +
> > > Subject: RE: DPDK link speed with Intel devices Hi, all We have worked
> > > out the basic test cases for the patchset.
> > > 1. Test the link speed on major Intel N

[dpdk-dev] [PATCH] mempool: allow for user-owned mempool caches

2016-03-25 Thread Mauricio Vásquez
Hello to everybody,

I find this proposal very interesting, as it could be used to solve an issue
that has not been mentioned yet: using memory pools shared with ivshmem.
Currently, due to the per-lcore cache, it is not possible to allocate and
deallocate packets from the guest, as it will cause cache corruption because
DPDK processes in the guest and in the host share some lcore_ids.

If there is an API to register caches, the EAL could create and register
its own caches during the ivshmem initialization in the guest, avoiding
possible cache corruption problems.

I look forward to V2 in order to get a clear idea of how it can be used to
solve this issue.

Mauricio V,

On Fri, Mar 25, 2016 at 11:55 AM, Olivier Matz 
wrote:

> Hi Venky,
>
> >> The main benefit of having an external cache is to allow mempool users
> >> (threads) to maintain a local cache even though they don't have a valid
> >> lcore_id (non-EAL threads). The fact that cache access is done by
> indexing
> >> with the lcore_id is what makes it difficult...
> >
> > Hi Lazaros,
> >
> > Alternative suggestion: This could actually be very simply done via
> creating an EAL API to register and return an lcore_id for a thread wanting
> to use DPDK services. That way, you could simply create your pthread, call
> the eal_register_thread() function that assigns an lcore_id to the caller
> (and internally sets up the per_lcore variable).
> >
> > The advantage of doing it this way is that you could extend it to other
> things other than the mempool that may need an lcore_id setup.
>
> In my opinion, externalizing the cache structure as Lazaros suggests
> would make things simpler, especially in case of dynamic thread
> allocation/destruction.
>
> If an lcore_id registration API is added in EAL, we still need a
> max lcore value when the mempool is created so the cache can be
> allocated. Moreover, the API would not be as simple, especially
> if it needs to support secondary processes.
>
>
> Regards,
> Olivier
>
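
A minimal sketch of the user-owned cache idea discussed in this thread,
assuming a simplified pool with no locking: the caller passes its own cache
on every get/put instead of the pool indexing an internal array by lcore_id.
All names here (struct pool, struct user_cache, pool_get, pool_put) are
hypothetical and are not the DPDK mempool API.

```c
#include <stddef.h>

#define POOL_SIZE  64  /* hypothetical pool capacity */
#define CACHE_SIZE 8   /* hypothetical per-user cache capacity */

/* Shared pool: a simple stack of free object pointers (locking omitted). */
struct pool {
    void *objs[POOL_SIZE];
    size_t count;
};

/* Cache owned by the caller; it is not indexed by lcore_id. */
struct user_cache {
    void *objs[CACHE_SIZE];
    size_t len;
};

/* Get one object: serve from the caller's cache, refilling it when empty. */
static void *pool_get(struct pool *p, struct user_cache *c)
{
    if (c->len == 0) {
        /* Refill half of the cache from the shared pool. */
        while (c->len < CACHE_SIZE / 2 && p->count > 0)
            c->objs[c->len++] = p->objs[--p->count];
        if (c->len == 0)
            return NULL; /* shared pool is exhausted */
    }
    return c->objs[--c->len];
}

/* Put one object back: keep it in the caller's cache, flushing when full.
 * Assumes only objects that came from this pool are returned to it. */
static void pool_put(struct pool *p, struct user_cache *c, void *obj)
{
    if (c->len == CACHE_SIZE) {
        /* Flush half of the cache back to the shared pool. */
        while (c->len > CACHE_SIZE / 2)
            p->objs[p->count++] = c->objs[--c->len];
    }
    c->objs[c->len++] = obj;
}
```

Because the cache is just an argument, a thread without a valid lcore_id
could in principle own one; a real implementation would still have to
serialize access to the shared pool.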


[dpdk-dev] [PATCH] mempool: allow for user-owned mempool caches

2016-03-25 Thread Venkatesan, Venky

> -----Original Message-----
> From: Olivier Matz [mailto:olivier.matz at 6wind.com]
> Sent: Friday, March 25, 2016 3:56 AM
> To: Venkatesan, Venky ; Lazaros Koromilas
> ; Wiles, Keith 
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] mempool: allow for user-owned mempool
> caches
> 
> Hi Venky,
> 
> >> The main benefit of having an external cache is to allow mempool
> >> users
> >> (threads) to maintain a local cache even though they don't have a
> >> valid lcore_id (non-EAL threads). The fact that cache access is done
> >> by indexing with the lcore_id is what makes it difficult...
> >
> > Hi Lazaros,
> >
> > Alternative suggestion: This could actually be very simply done via creating
> an EAL API to register and return an lcore_id for a thread wanting to use
> DPDK services. That way, you could simply create your pthread, call the
> eal_register_thread() function that assigns an lcore_id to the caller (and
> internally sets up the per_lcore variable).
> >
> > The advantage of doing it this way is that you could extend it to other
> things other than the mempool that may need an lcore_id setup.
> 
> In my opinion, externalizing the cache structure as Lazaros suggests would
> make things simpler, especially in case of dynamic thread
> allocation/destruction.
> 
> If an lcore_id registration API is added in EAL, we still need a max lcore
> value
> when the mempool is created so the cache can be allocated. Moreover, the
> API would not be as simple, especially if it needs to support secondary
> processes.
>
Not really - the secondary process is simply another series of threads. They 
have their own caches. Yes, we will need a max lcore value, but we can make the 
allocations dynamic as opposed to static. That way, we will have MAX_LCORE 
pointers to store per mempool. 

The approach that's currently suggested is workable (and if I were solving
mempool alone, this is very likely what I would do too), but it is limited to
the mempool. Adding the API to the EAL has a rather huge secondary advantage
- you no longer need to create DPDK threads explicitly - you can create
pthreads and manage them how you wish. Architecturally speaking, longer
term for DPDK that would be a bigger win.

> 
> Regards,
> Olivier
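
The alternative suggested in this message can be sketched as follows, under
stated assumptions: register_thread() is a hypothetical stand-in for the
proposed eal_register_thread(), MAX_LCORE is an illustrative bound, and the
per-mempool table holds only pointers so that each cache is allocated on
first use. None of these names exist in DPDK as written, and a real
implementation would need to protect the id table with a lock or atomics.

```c
#include <stdlib.h>

#define MAX_LCORE     8     /* hypothetical upper bound on registered threads */
#define LCORE_ID_NONE (-1)

struct cache {
    unsigned int len;
    void *objs[32];
};

/* Per mempool: MAX_LCORE pointers, each cache allocated on demand. */
struct mempool_sketch {
    struct cache *caches[MAX_LCORE];
};

static int lcore_in_use[MAX_LCORE];                   /* not thread-safe here */
static _Thread_local int my_lcore_id = LCORE_ID_NONE; /* per-thread id */

/* Assign a free lcore_id to the calling thread (idempotent per thread). */
static int register_thread(void)
{
    int i;

    if (my_lcore_id != LCORE_ID_NONE)
        return my_lcore_id;
    for (i = 0; i < MAX_LCORE; i++) {
        if (!lcore_in_use[i]) {
            lcore_in_use[i] = 1;
            my_lcore_id = i;
            return i;
        }
    }
    return LCORE_ID_NONE; /* no free slot */
}

/* Give the calling thread's id back so another thread can reuse it. */
static void unregister_thread(void)
{
    if (my_lcore_id != LCORE_ID_NONE) {
        lcore_in_use[my_lcore_id] = 0;
        my_lcore_id = LCORE_ID_NONE;
    }
}

/* Return the caller's cache for this mempool, allocating it on first use. */
static struct cache *mempool_cache_get(struct mempool_sketch *mp)
{
    if (my_lcore_id == LCORE_ID_NONE)
        return NULL; /* unregistered thread: no cache */
    if (mp->caches[my_lcore_id] == NULL)
        mp->caches[my_lcore_id] = calloc(1, sizeof(struct cache));
    return mp->caches[my_lcore_id];
}
```

In this model a plain pthread would call register_thread() once after it
starts and unregister_thread() before it exits; how cached objects are
flushed on unregistration is left open.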


[dpdk-dev] [PATCH v11 0/8] ethdev: 100G and link speed API refactoring

2016-03-25 Thread Marc
On 25 March 2016 at 21:41, Marc  wrote:

>
> On 25 March 2016 at 16:07, Zhang, Helin  wrote:
>
>> Hi Thomas
>>
>> Beilei is investigating that, she will give her findings soon later, and
>> possibly a fix after validating that.
>> Thanks!
>>
>>
> I will try to reproduce this on my side too with the latest v12. I could
> not try the latest patchsets, but i40e (XL710) and igb (82540EM) were working
> on my side for previous versions. Which exact NICs were used to test the
> patchset for igb?
>

I am able to reproduce it straight away by applying v12. The problem is
that testpmd, and existing applications in general, leave link_speeds at its
default value of 0 and expect that to mean autoneg.

From v9 to v10 of the patchset, the values ETH_LINK_SPEED_AUTONEG and
ETH_LINK_SPEED_FIXED were flipped. Reverting this makes it work:

marc at Beluga:~/personal/dpdk/tools$ git diff
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index ef2502a..fb247a7 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -244,8 +244,8 @@ struct rte_eth_stats {
 /**
  * Device supported speeds bitmap flags
  */
-#define ETH_LINK_SPEED_FIXED    (0 <<  0)  /**< Disable autoneg (fixed speed) */
-#define ETH_LINK_SPEED_AUTONEG  (1 <<  0)  /**< Autonegotiate (all speeds) */
+#define ETH_LINK_SPEED_AUTONEG  (0 <<  0)  /**< Autonegotiate (all speeds) */
+#define ETH_LINK_SPEED_FIXED    (1 <<  0)  /**< Disable autoneg (fixed speed) */
 #define ETH_LINK_SPEED_10M_HD   (1 <<  1)  /**<  10 Mbps half-duplex */
 #define ETH_LINK_SPEED_10M  (1 <<  2)  /**<  10 Mbps full-duplex */
 #define ETH_LINK_SPEED_100M_HD  (1 <<  3)  /**< 100 Mbps half-duplex */

I think having autoneg == 0 is better. Do you agree Thomas?

With this change my current NIC (I218-LM) is able to initialize:

Option: 27

  Enter hex bitmask of cores to execute testpmd app on
  Example: to execute app on cores 0 to 7, enter 0xff
bitmask: 0x3
Launching app
EAL: Detected lcore 0 as core 0 on socket 0
EAL: Detected lcore 1 as core 0 on socket 0
EAL: Detected lcore 2 as core 1 on socket 0
EAL: Detected lcore 3 as core 1 on socket 0
EAL: Support maximum 128 logical core(s) by configuration.
EAL: Detected 4 lcore(s)
EAL: Probing VFIO support...
EAL: Module /sys/module/vfio_pci not found! error 2 (No such file or directory)
EAL: VFIO modules not loaded, skipping VFIO support...
EAL: Setting up physically contiguous memory...
EAL: Ask a virtual area of 0x2680 bytes
EAL: Virtual area found at 0x7f33ef80 (size = 0x2680)
EAL: Ask a virtual area of 0x6e0 bytes
EAL: Virtual area found at 0x7f33e880 (size = 0x6e0)
EAL: Ask a virtual area of 0x80 bytes
EAL: Virtual area found at 0x7f33e7e0 (size = 0x80)
EAL: Ask a virtual area of 0x440 bytes
EAL: Virtual area found at 0x7f33e380 (size = 0x440)
EAL: Ask a virtual area of 0xe0 bytes
EAL: Virtual area found at 0x7f33e280 (size = 0xe0)
EAL: Ask a virtual area of 0x60 bytes
EAL: Virtual area found at 0x7f33e200 (size = 0x60)
EAL: Ask a virtual area of 0x20 bytes
EAL: Virtual area found at 0x7f33e1c0 (size = 0x20)
EAL: Ask a virtual area of 0x4360 bytes
EAL: Virtual area found at 0x7f339e40 (size = 0x4360)
EAL: Ask a virtual area of 0x8e0 bytes
EAL: Virtual area found at 0x7f339540 (size = 0x8e0)
EAL: Ask a virtual area of 0x20 bytes
EAL: Virtual area found at 0x7f339500 (size = 0x20)
EAL: Ask a virtual area of 0x20 bytes
EAL: Virtual area found at 0x7f3394c0 (size = 0x20)
EAL: Requesting 1024 pages of size 2MB from socket 0
EAL: TSC frequency is ~2593996 KHz
EAL: Master lcore 0 is ready (tid=180078c0;cpuset=[0])
EAL: lcore 1 is ready (tid=94bff700;cpuset=[1])
EAL: PCI device :00:19.0 on NUMA socket -1
EAL:   probe driver: 8086:15a2 rte_em_pmd
EAL:   PCI memory mapped at 0x7f341600
EAL:   PCI memory mapped at 0x7f341602
PMD: eth_em_dev_init(): port_id 0 vendorID=0x8086 deviceID=0x15a2
Interactive-mode selected
Configuring Port 0 (socket 0)
PMD: eth_em_tx_queue_setup(): sw_ring=0x7f33e210efc0 hw_ring=0x7f33e21110c0 dma_addr=0x745110c0
PMD: eth_em_rx_queue_setup(): sw_ring=0x7f33e20fea80 hw_ring=0x7f33e20fef80 dma_addr=0x744fef80
PMD: eth_em_start(): <<

I am troubleshooting link status reporting, which does not seem correct with
l2fwd. I will also double check that fixed speed and autoneg with a subset of
speeds work.

@Thomas: once I've fixed this shall I submit v13 or should we wait for more
feedback from the rest of untested NICs? This patchset needs to be tested
by all drivers, at least.

marc
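
To illustrate the default-value problem described in this message, here is a
small sketch that uses the flag layout from the revert (autoneg == 0).
struct port_conf_sketch and wants_autoneg() are hypothetical helpers for
illustration only, not ethdev API; the point is just that a zero-initialized
configuration, as left by existing applications, should read as
"autonegotiate".

```c
#include <stdint.h>

/* Flag values as in the reverted layout shown in the diff above. */
#define ETH_LINK_SPEED_AUTONEG  (0 <<  0)  /* Autonegotiate (all speeds) */
#define ETH_LINK_SPEED_FIXED    (1 <<  0)  /* Disable autoneg (fixed speed) */
#define ETH_LINK_SPEED_10M_HD   (1 <<  1)  /* 10 Mbps half-duplex */
#define ETH_LINK_SPEED_10M      (1 <<  2)  /* 10 Mbps full-duplex */
#define ETH_LINK_SPEED_100M_HD  (1 <<  3)  /* 100 Mbps half-duplex */

/* Hypothetical, minimal stand-in for a port configuration. */
struct port_conf_sketch {
    uint32_t link_speeds;
};

/* Returns 1 when the bitmap asks for autonegotiation, 0 for fixed speed. */
static int wants_autoneg(const struct port_conf_sketch *conf)
{
    return (conf->link_speeds & ETH_LINK_SPEED_FIXED) == 0;
}
```

With this layout a configuration that never touches link_speeds keeps
autonegotiating, and an application that wants a fixed speed has to opt in by
setting ETH_LINK_SPEED_FIXED together with one speed bit.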


> Marc
>
>
>> Regards,
>> Helin
>>
>> > -----Original Message-----
>> > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
>> > Sent: Friday, March 25, 2016 5:36 PM
>> > To: Xu, Qian Q 
>> > Cc: dev at dpdk.org; Marc ; Ananyev, Konstantin
>> > ; Lu, Wenzhuo ;
>> > Zhang, Helin ; Richardson, Bruce
>> > ; Glynn, Michael J <
>> michael.j.glynn at intel.com>
>> > Subject: Re: [dpdk-dev] [PATCH v11 0/

[dpdk-dev] Unable to get multi-segment mbuf working for ixgbe

2016-03-25 Thread Clarylin L
Hello,

I am trying to use multi-segment mbufs to receive large packets. I enabled
jumbo_frame and enable_scatter for the port and was expecting mbuf chaining
to be used to receive packets larger than the mbuf size (which was set
to 2048).

When sending a 3000-byte packet (without fragmentation) from another non-DPDK
host, I didn't see the packet being received by the ixgbe PMD.

After a quick debugging session I found that the following condition
in ixgbe_recv_scattered_pkts (ixgbe_rxtx.c) is always true and breaks the
loop in the case of a large packet, while that is not the case for a small
packet (smaller than the mbuf size):

if (!(staterr & rte_cpu_to_le32(IXGBE_RXDADV_STAT_DD)))
        break;

Is enabling jumbo_frame and enable_scatter enough to get mbuf chaining
started?

Appreciate any input! Thanks.
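
A side note on the descriptor-done check quoted in this message: in C, unary
`!` binds more tightly than binary `&`, so the parentheses around the AND
matter. The sketch below only demonstrates that precedence rule; STAT_DD is a
hypothetical one-bit mask standing in for IXGBE_RXDADV_STAT_DD, and this is
not a diagnosis of the reported receive problem.

```c
#include <stdint.h>

#define STAT_DD 0x01u  /* hypothetical "descriptor done" bit */

/* Intended check: non-zero when the DD bit is NOT set. */
static int not_done(uint32_t staterr)
{
    return !(staterr & STAT_DD);
}

/* How the compiler parses "!staterr & STAT_DD" when the inner
 * parentheses are missing: the negation applies to staterr alone. */
static int not_done_unparenthesized(uint32_t staterr)
{
    return (!staterr) & STAT_DD;
}
```

The two forms agree when staterr is 0 or has the DD bit set, but differ as
soon as any other status bit is set while DD is clear.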


[dpdk-dev] [PATCH] enic: fix TX hang when number of packets > queue size

2016-03-25 Thread John Daley
If the nb_pkts parameter to rte_eth_tx_burst() was greater than
the TX descriptor count, a completion was not being requested
from the NIC, so descriptors would not be released back to the
host causing a lock-up.

Introduce a limit of how many TX descriptors can be used in a single
call to the enic PMD burst TX function before requesting a completion.

Fixes: d739ba4c6abf ("enic: improve Tx packet rate")

Signed-off-by: John Daley 
---
 drivers/net/enic/enic_ethdev.c | 20 ++++++++++++++++----
 drivers/net/enic/enic_res.h    |  1 +
 2 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/drivers/net/enic/enic_ethdev.c b/drivers/net/enic/enic_ethdev.c
index 4969476..6bea940 100644
--- a/drivers/net/enic/enic_ethdev.c
+++ b/drivers/net/enic/enic_ethdev.c
@@ -523,7 +523,7 @@ static void enicpmd_remove_mac_addr(struct rte_eth_dev *eth_dev, __rte_unused ui
 static uint16_t enicpmd_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
 	uint16_t nb_pkts)
 {
-	unsigned int index;
+	uint16_t index;
 	unsigned int frags;
 	unsigned int pkt_len;
 	unsigned int seg_len;
@@ -535,6 +535,7 @@ static uint16_t enicpmd_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
 	unsigned short vlan_id;
 	unsigned short ol_flags;
 	uint8_t last_seg, eop;
+	unsigned int host_tx_descs = 0;

 	for (index = 0; index < nb_pkts; index++) {
 		tx_pkt = *tx_pkts++;
@@ -550,6 +551,7 @@ static uint16_t enicpmd_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
 				return index;
 			}
 		}
+
 		pkt_len = tx_pkt->pkt_len;
 		vlan_id = tx_pkt->vlan_tci;
 		ol_flags = tx_pkt->ol_flags;
@@ -559,9 +561,19 @@ static uint16_t enicpmd_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
 			next_tx_pkt = tx_pkt->next;
 			seg_len = tx_pkt->data_len;
 			inc_len += seg_len;
-			eop = (pkt_len == inc_len) || (!next_tx_pkt);
-			last_seg = eop &&
-				(index == ((unsigned int)nb_pkts - 1));
+
+			host_tx_descs++;
+			last_seg = 0;
+			eop = 0;
+			if ((pkt_len == inc_len) || !next_tx_pkt) {
+				eop = 1;
+				/* post if last packet in batch or > thresh */
+				if ((index == (nb_pkts - 1)) ||
+				   (host_tx_descs > ENIC_TX_POST_THRESH)) {
+					last_seg = 1;
+					host_tx_descs = 0;
+				}
+			}
 			enic_send_pkt(enic, wq, tx_pkt, (unsigned short)seg_len,
 				      !frags, eop, last_seg, ol_flags, vlan_id);
 			tx_pkt = next_tx_pkt;
diff --git a/drivers/net/enic/enic_res.h b/drivers/net/enic/enic_res.h
index 33f2e84..00fa71d 100644
--- a/drivers/net/enic/enic_res.h
+++ b/drivers/net/enic/enic_res.h
@@ -53,6 +53,7 @@

 #define ENIC_NON_TSO_MAX_DESC          16
 #define ENIC_DEFAULT_RX_FREE_THRESH    32
+#define ENIC_TX_POST_THRESH            (ENIC_MIN_WQ_DESCS / 2)

 #define ENIC_SETTING(enic, f) ((enic->config.flags & VENETF_##f) ? 1 : 0)

-- 
2.7.0
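
The completion-request logic introduced by this patch can be looked at in
isolation. The sketch below is a simplified model, not driver code:
count_completion_requests() and TX_POST_THRESH are hypothetical names (the
patch itself uses ENIC_TX_POST_THRESH, defined as ENIC_MIN_WQ_DESCS / 2), and
each packet is reduced to its number of segments.

```c
#include <stdint.h>

#define TX_POST_THRESH 4  /* hypothetical value, for illustration only */

/*
 * Walk a burst where segs[i] is the number of segments (descriptors) of
 * packet i, and count how many times a completion would be requested.
 * A completion is requested on an end-of-packet descriptor when it is the
 * last packet of the burst or more than TX_POST_THRESH descriptors have
 * been used since the previous request.
 */
static unsigned int count_completion_requests(const uint8_t *segs,
                                              uint16_t nb_pkts)
{
    unsigned int host_tx_descs = 0;
    unsigned int requests = 0;
    uint16_t index;
    uint8_t s;

    for (index = 0; index < nb_pkts; index++) {
        for (s = 0; s < segs[index]; s++) {
            int eop = (s == segs[index] - 1);

            host_tx_descs++;
            if (eop && (index == nb_pkts - 1 ||
                        host_tx_descs > TX_POST_THRESH)) {
                requests++;
                host_tx_descs = 0;
            }
        }
    }
    return requests;
}
```

In this model a long burst of single-segment packets triggers a request
every TX_POST_THRESH + 1 descriptors instead of only once at the end, which
is the behaviour the commit message describes as preventing the lock-up.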



[dpdk-dev] [PATCH] bonding: fix bond link detect in non-interrupt mode

2016-03-25 Thread John Daley
From: Nelson Escobar 

Stopping then re-starting a bond interface containing slaves that
used polling for link detection caused the bond to think all slave
links were down and inactive.

Move the start of the polling for link from slave_add() to
bond_ethdev_start() and in bond_ethdev_stop() make sure we clear
the last_link_status of the slaves.

Signed-off-by: Nelson Escobar 
Signed-off-by: John Daley 
---
 drivers/net/bonding/rte_eth_bond_pmd.c | 27 +++++++++++++++++----------
 1 file changed, 17 insertions(+), 10 deletions(-)

diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c b/drivers/net/bonding/rte_eth_bond_pmd.c
index fb26d35..f0960c6 100644
--- a/drivers/net/bonding/rte_eth_bond_pmd.c
+++ b/drivers/net/bonding/rte_eth_bond_pmd.c
@@ -1454,18 +1454,11 @@ slave_add(struct bond_dev_private *internals,
 	slave_details->port_id = slave_eth_dev->data->port_id;
 	slave_details->last_link_status = 0;

-	/* If slave device doesn't support interrupts then we need to enabled
-	 * polling to monitor link status */
+	/* Mark slave devices that don't support interrupts so we can
+	 * compensate when we start the bond
+	 */
 	if (!(slave_eth_dev->data->dev_flags & RTE_ETH_DEV_INTR_LSC)) {
 		slave_details->link_status_poll_enabled = 1;
-
-		if (!internals->link_status_polling_enabled) {
-			internals->link_status_polling_enabled = 1;
-
-			rte_eal_alarm_set(internals->link_status_polling_interval_ms * 1000,
-					bond_ethdev_slave_link_status_change_monitor,
-					(void *)&rte_eth_devices[internals->port_id]);
-		}
 	}

 	slave_details->link_status_wait_to_complete = 0;
@@ -1550,6 +1543,18 @@ bond_ethdev_start(struct rte_eth_dev *eth_dev)
 					eth_dev->data->port_id, internals->slaves[i].port_id);
 			return -1;
 		}
+		/* We will need to poll for link status if any slave doesn't
+		 * support interrupts
+		 */
+		if (internals->slaves[i].link_status_poll_enabled)
+			internals->link_status_polling_enabled = 1;
+	}
+	/* start polling if needed */
+	if (internals->link_status_polling_enabled) {
+		rte_eal_alarm_set(
+			internals->link_status_polling_interval_ms * 1000,
+			bond_ethdev_slave_link_status_change_monitor,
+			(void *)&rte_eth_devices[internals->port_id]);
 	}

 	if (internals->user_defined_primary_port)
@@ -1622,6 +1627,8 @@ bond_ethdev_stop(struct rte_eth_dev *eth_dev)

 	internals->active_slave_count = 0;
 	internals->link_status_polling_enabled = 0;
+	for (i = 0; i < internals->slave_count; i++)
+		internals->slaves[i].last_link_status = 0;

 	eth_dev->data->dev_link.link_status = 0;
 	eth_dev->data->dev_started = 0;
-- 
2.7.0
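
The start/stop behaviour this patch aims for can be modelled without any
DPDK dependency. In the sketch below, struct slave_sketch, struct bond_sketch,
bond_start() and bond_stop() are hypothetical simplifications, and the
alarms_armed counter merely stands in for the rte_eal_alarm_set() call in the
real driver.

```c
#define MAX_SLAVES 4  /* hypothetical limit for the model */

struct slave_sketch {
    int link_status_poll_enabled; /* set at add time if no LSC interrupt */
    int last_link_status;
};

struct bond_sketch {
    struct slave_sketch slaves[MAX_SLAVES];
    int slave_count;
    int link_status_polling_enabled;
    int alarms_armed; /* how many times the polling alarm was (re)armed */
};

/* On start: enable polling if any slave needs it, then arm the alarm. */
static void bond_start(struct bond_sketch *b)
{
    int i;

    for (i = 0; i < b->slave_count; i++)
        if (b->slaves[i].link_status_poll_enabled)
            b->link_status_polling_enabled = 1;
    if (b->link_status_polling_enabled)
        b->alarms_armed++; /* stands in for rte_eal_alarm_set(...) */
}

/* On stop: disable polling and forget the last seen link state. */
static void bond_stop(struct bond_sketch *b)
{
    int i;

    b->link_status_polling_enabled = 0;
    for (i = 0; i < b->slave_count; i++)
        b->slaves[i].last_link_status = 0;
}
```

Since the alarm is armed in the start path rather than when a slave is
added, a stop followed by a start arms it again, and clearing
last_link_status lets the next poll see the link as a fresh transition.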



[dpdk-dev] [PATCH] enic: state change from link-down to link-up not recognized

2016-03-25 Thread John Daley
When the enic was disabled, link notification was correctly disabled
in the NIC but the software indicator that it was disabled was not
updated (vdev->notify_pa not set to 0). When the link came back up,
enic did not re-enable notification in the NIC.

This affected bonding when an enic slave device link bounced.

The fix is to unconditionally enable notification when the enic is
enabled.

Fixes: 9913fbb91df0 ("enic/base: common code")

Signed-off-by: John Daley 
Reviewed-by: Nelson Escobar 
---
 drivers/net/enic/base/vnic_dev.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/net/enic/base/vnic_dev.c b/drivers/net/enic/base/vnic_dev.c
index 6153864..e8a5028 100644
--- a/drivers/net/enic/base/vnic_dev.c
+++ b/drivers/net/enic/base/vnic_dev.c
@@ -768,11 +768,9 @@ int vnic_dev_notify_set(struct vnic_dev *vdev, u16 intr)
 	static u32 instance;

 	if (vdev->notify || vdev->notify_pa) {
-		pr_warn("notify block %p still allocated.\n" \
-			"Ignore if restarting port\n", vdev->notify);
-		return -EINVAL;
+		return vnic_dev_notify_setcmd(vdev, vdev->notify,
+					      vdev->notify_pa, intr);
 	}
-
 	if (!vnic_dev_in_reset(vdev)) {
 		snprintf((char *)name, sizeof(name),
 			"vnic_notify-%d", instance++);
-- 
2.7.0
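
A rough model of the behaviour change in this patch, under simplifying
assumptions: struct vdev_sketch, notify_setcmd(), notify_set() and
hw_disable() are hypothetical stand-ins, and the hw_notify_enabled flag
represents what the NIC has configured. The idea shown is only that enabling
notification re-issues the command when the block is already allocated,
instead of failing.

```c
#include <stddef.h>

struct vdev_sketch {
    void *notify;          /* notification block, NULL until allocated */
    int hw_notify_enabled; /* what the NIC currently has configured */
};

static char notify_block[64]; /* stands in for the allocated block */

/* Program the NIC with the given block and record it in software. */
static int notify_setcmd(struct vdev_sketch *v, void *block)
{
    v->notify = block;
    v->hw_notify_enabled = 1;
    return 0;
}

/* Enable notification unconditionally, reusing the block if present. */
static int notify_set(struct vdev_sketch *v)
{
    if (v->notify != NULL)
        return notify_setcmd(v, v->notify); /* re-enable in the NIC */
    return notify_setcmd(v, notify_block);  /* first-time setup */
}

/* Disable in the NIC only; the software pointer is left in place,
 * mirroring the stale state described in the commit message. */
static void hw_disable(struct vdev_sketch *v)
{
    v->hw_notify_enabled = 0;
}
```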


