[dpdk-dev] [PATCH 00/29] ixgbe/base: update base driver

2016-06-06 Thread Zhang, Helin
Acked-by: Helin Zhang 

> -Original Message-
> From: Xing, Beilei
> Sent: Friday, May 6, 2016 2:07 PM
> To: Zhang, Helin
> Cc: dev at dpdk.org
> Subject: [PATCH 00/29] ixgbe/base: update base driver
> 
> Update the base driver for ixgbe, mainly with new features and bug fixes.
> 
> Beilei Xing (29):
>   ixgbe/base: add new VF requests for mailbox API
>   ixgbe/base: add sgmii link for X550
>   ixgbe/base: fix problematic return value
>   ixgbe/base: add mac link setup for x550a SFP
>   ixgbe/base: fix checksum error of checking PHY token
>   ixgbe/base: refactor eee setup for x550
>   ixgbe/base: change access method
>   ixgbe/base: add KR support for X550EM_A devices
>   ixgbe/base: add link mac setup for x550a SFP+
>   ixgbe/base: clear stale pool mappings
>   ixgbe/base: rename macro of TDL
>   ixgbe/base: fix error path to release lock
>   ixgbe/base: refactor NW management interface ops
>   ixgbe/base: fix for code style
>   ixgbe/base: fix firmware commands on x550em_a
>   ixgbe/base: add new phy definitions
>   ixgbe/base: change device IDs
>   ixgbe/base: update swfw semaphore function
>   ixgbe/base: fix register access error
>   ixgbe/base: limit PHY token accessing to MDIO only
>   ixgbe/base: smplify add/remove VLANs
>   ixgbe/base: add bypassing VLVF
>   ixgbe/base: unify coding style
>   ixgbe/base: use u8 to replace u16 for a variable
>   ixgbe/base: fix endianness issues
>   ixgbe/base: allow setting mac anti spoofing per vf
>   ixgbe/base: add flow control autoneg for x550a
>   ixgbe/base: define if enable crosstalk work around
>   ixgbe/base: update README
> 
>  doc/guides/rel_notes/release_16_07.rst  |   11 +
>  drivers/net/ixgbe/base/README   |2 +-
>  drivers/net/ixgbe/base/ixgbe_82598.c|5 +-
>  drivers/net/ixgbe/base/ixgbe_82598.h|3 +-
>  drivers/net/ixgbe/base/ixgbe_82599.c|9 +-
>  drivers/net/ixgbe/base/ixgbe_api.c  |   41 +-
>  drivers/net/ixgbe/base/ixgbe_api.h  |8 +-
>  drivers/net/ixgbe/base/ixgbe_common.c   |  361 ---
>  drivers/net/ixgbe/base/ixgbe_common.h   |9 +-
>  drivers/net/ixgbe/base/ixgbe_mbx.h  |4 +-
>  drivers/net/ixgbe/base/ixgbe_osdep.h|1 +
>  drivers/net/ixgbe/base/ixgbe_phy.c  |   16 +-
>  drivers/net/ixgbe/base/ixgbe_phy.h  |3 +
>  drivers/net/ixgbe/base/ixgbe_type.h |  118 ++-
>  drivers/net/ixgbe/base/ixgbe_vf.c   |   10 +-
>  drivers/net/ixgbe/base/ixgbe_vf.h   |7 +-
>  drivers/net/ixgbe/base/ixgbe_x540.c |   29 +-
>  drivers/net/ixgbe/base/ixgbe_x540.h |1 +
>  drivers/net/ixgbe/base/ixgbe_x550.c | 1156 +++--
> --
>  drivers/net/ixgbe/base/ixgbe_x550.h |   52 +
>  drivers/net/ixgbe/ixgbe_ethdev.c|   11 +-
>  drivers/net/ixgbe/ixgbe_pf.c|2 +-
>  lib/librte_eal/common/include/rte_pci_dev_ids.h |   12 +-
>  23 files changed, 1456 insertions(+), 415 deletions(-)
> 
> --
> 2.5.0



[dpdk-dev] [PATCH v5] eal: fix allocating all free hugepages

2016-06-06 Thread Pei, Yulong
Tested-by: Yulong Pei 

1. Ran a DPDK app with multiple mount points; it works as expected.
2. Created a new cgroup with limited hugepages as shown below and ran a DPDK
app within the newly created cgroup; it works as expected.

# cgcreate -g hugetlb:/test-subgroup
# cgset -r hugetlb.1GB.limit_in_bytes=2147483648 test-subgroup
# cgexec -g hugetlb:test-subgroup ./x86_64-native-linuxapp-gcc/app/testpmd -c 0x3 -n 4 -- -i

Best Regards
Yulong Pei

-Original Message-
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Jianfeng Tan
Sent: Tuesday, May 31, 2016 11:37 AM
To: dev at dpdk.org
Cc: Gonzalez Monroy, Sergio ; nhorman at 
tuxdriver.com; david.marchand at 6wind.com; thomas.monjalon at 6wind.com; Tan, 
Jianfeng 
Subject: [dpdk-dev] [PATCH v5] eal: fix allocating all free hugepages

EAL memory init allocates all free hugepages of the whole system, as seen
from sysfs, even when the application does not ask for that many.
When there is a limit on how many hugepages an application can use (such
as cgroup.hugetlb), or hugetlbfs is mounted with a size option (so that the
request exceeds the quota of the fs), it just fails to start even if there are
enough hugepages allocated.

To fix the above issue, this patch:
 - Changes the logic to continue memory init to see if the hugetlb
   requirement of the application can be addressed by already allocated
   hugepages.
 - Adds a recovery mechanism to make sure each hugepage is allocated
   successfully. It relies on a memory access to fault in the
   hugepage, and if that fails with SIGBUS, recovers to the previously
   saved stack environment with siglongjmp().
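
The recovery mechanism described above can be sketched outside of DPDK as
follows. This is a minimal illustration under stated assumptions, not the
patch's code: the names probe_jmpenv, probe_sigbus_handler, probe_page and
probe_demo are hypothetical, and a temporary file that is shorter than its
mapping stands in for a hugetlbfs mapping whose page cannot be allocated
(on Linux, touching such a mapping raises SIGBUS).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Saved environment that the SIGBUS handler jumps back to (hypothetical name). */
static sigjmp_buf probe_jmpenv;

static void probe_sigbus_handler(int signo)
{
    (void)signo;
    siglongjmp(probe_jmpenv, 1);
}

/* Touch one byte at addr. Returns 0 if the page faulted in, -1 on SIGBUS. */
int probe_page(volatile char *addr)
{
    struct sigaction sa, old;
    int ret = 0;

    assert(addr != NULL);
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = probe_sigbus_handler;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGBUS, &sa, &old) != 0)
        return -1;

    /* savemask = 1, so the signal mask is restored after siglongjmp(). */
    if (sigsetjmp(probe_jmpenv, 1) == 0)
        *addr = 0;          /* the write faults the page in */
    else
        ret = -1;           /* we got here through the SIGBUS handler */

    sigaction(SIGBUS, &old, NULL);
    return ret;
}

/*
 * Map one page of a temporary file of the given size and probe it.
 * Returns the probe_page() result, or -2 if the setup itself failed.
 */
int probe_demo(off_t file_size)
{
    char path[] = "/tmp/probe-XXXXXX";
    long page = sysconf(_SC_PAGESIZE);
    char *addr;
    int fd, ret;

    fd = mkstemp(path);
    if (fd < 0)
        return -2;
    unlink(path);
    if (ftruncate(fd, file_size) != 0) {
        close(fd);
        return -2;
    }
    addr = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                MAP_SHARED, fd, 0);
    if (addr == MAP_FAILED) {
        close(fd);
        return -2;
    }
    ret = probe_page(addr);
    munmap(addr, (size_t)page);
    close(fd);
    return ret;
}
```

Calling probe_demo(0) exercises the failure path (the mapping lies beyond
the end of the file), while probe_demo(4096) exercises the normal path. The
point of the design is that mmap() succeeding does not prove the page is
usable; only the first access does.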

For the case of CONFIG_RTE_EAL_SINGLE_FILE_SEGMENTS (enabled by default when
compiling the IVSHMEM target), it's indispensable to map all free hugepages in
the system. In this case, it fails to start when allocation fails.

Test example:
  a. cgcreate -g hugetlb:/test-subgroup
  b. cgset -r hugetlb.1GB.limit_in_bytes=2147483648 test-subgroup
  c. cgexec -g hugetlb:test-subgroup \
  ./examples/helloworld/build/helloworld -c 0x2 -n 4


Fixes: af75078fece ("first public release")

Signed-off-by: Jianfeng Tan 
Acked-by: Neil Horman 
---
v5:
 - Make this method the default instead of using an option.
 - When SIGBUS is triggered in the case of RTE_EAL_SINGLE_FILE_SEGMENTS,
   just return error.
 - Add prefix "huge_" to newly added function and static variables.
 - Move the internal_config.memory assignment after the page allocations.
v4:
 - Change map_all_hugepages to return unsigned instead of int.
v3:
 - Reword commit message to include it fixes the hugetlbfs quota issue.
 - setjmp -> sigsetjmp.
 - Fix RTE_LOG complaint from ERR to DEBUG as it does not mean init error
   so far.
 - Fix the second map_all_hugepages's return value check.
v2:
 - Address the compiling error by moving setjmp into a wrapper method.

 lib/librte_eal/linuxapp/eal/eal.c|  20 -
 lib/librte_eal/linuxapp/eal/eal_memory.c | 138 ---
 2 files changed, 125 insertions(+), 33 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c 
b/lib/librte_eal/linuxapp/eal/eal.c
index 8aafd51..4a8dfbd 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -465,24 +465,6 @@ eal_parse_vfio_intr(const char *mode)
return -1;
 }

-static inline size_t
-eal_get_hugepage_mem_size(void)
-{
-   uint64_t size = 0;
-   unsigned i, j;
-
-   for (i = 0; i < internal_config.num_hugepage_sizes; i++) {
-   struct hugepage_info *hpi = &internal_config.hugepage_info[i];
-   if (hpi->hugedir != NULL) {
-   for (j = 0; j < RTE_MAX_NUMA_NODES; j++) {
-   size += hpi->hugepage_sz * hpi->num_pages[j];
-   }
-   }
-   }
-
-   return (size < SIZE_MAX) ? (size_t)(size) : SIZE_MAX;
-}
-
 /* Parse the arguments for --log-level only */
 static void
 eal_log_level_parse(int argc, char **argv)
@@ -766,8 +748,6 @@ rte_eal_init(int argc, char **argv)
if (internal_config.memory == 0 && internal_config.force_sockets == 0) {
if (internal_config.no_hugetlbfs)
internal_config.memory = MEMSIZE_IF_NO_HUGE_PAGE;
-   else
-   internal_config.memory = eal_get_hugepage_mem_size();
}

if (internal_config.vmware_tsc_map == 1) {
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c 
b/lib/librte_eal/linuxapp/eal/eal_memory.c
index 5b9132c..dc6f49b 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -80,6 +80,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 

 #include 
 #include 
@@ -309,6 +311,21 @@ get_virtual_area(size_t *size, size_t hugepage_sz)
return addr;
 }

+static sigjmp_buf huge_jmpenv;
+
+static void huge_sigbus_handler(int signo __rte_unused) {
+   siglongjmp(huge_jmpenv, 1);
+}
+
+/* Put setjmp into a wrap method to avoid compiling error.

[dpdk-dev] [PATCH v5 0/6] Virtio-net PMD: QEMU QTest extension for container

2016-06-06 Thread Tetsuya Mukawa
Hi Yuanhan,

Sorry for the late reply.

On 2016/06/03 13:17, Yuanhan Liu wrote:
> On Thu, Jun 02, 2016 at 06:30:18PM +0900, Tetsuya Mukawa wrote:
>> Hi Yuanhan,
>>
>> On 2016/06/02 16:31, Yuanhan Liu wrote:
>>> But still, I'd ask do we really need 2 virtio for container solutions?
>>
>> I appreciate your comments.
> 
> No, I appreciate your effort for contributing to DPDK! vhost-pmd stuff
> is just brilliant!
> 
>> Let me have time to discuss it with our team.
> 
> I'm wondering whether we could have only one solution. IMO, the drawback of
> having two (quite different) solutions might outweigh the benefit
> it brings. Say, it might just confuse users.

I agree with this.
If we have 2 solutions, it would confuse the DPDK users.

> 
> OTOH, I'm wondering whether you could adapt to Jianfeng's solution? If not,
> what are the missing parts, and could we fix them? I'm thinking having
> one unified solution will keep our energy/focus on one thing, making
> it better and better! Having two just splits the energy; it also
> introduces an extra maintenance burden.

Of course, I can basically adopt Jianfeng's solution.
Actually, his solution is almost the same as what I tried to implement at first.

Here are the pros and cons of the two solutions, as I see them.

[Jianfeng's solution]
- Pros
Don't need to invoke QEMU process.
- Cons
If the virtio-net specification changes, we need to implement the changes
ourselves. Also, LSC interrupt and control queue functions are not
supported yet.
I agree both functions may not be so important, and we can implement them
if we need them, but doing so takes effort.

[My solution]
- Pros
The basic principle of my implementation is not to reinvent the wheel.
We can use QEMU's virtio-net device implementation, which means we don't
need to maintain a virtio-net device ourselves, and we can use all of the
functions supported by the QEMU virtio-net device.
- Cons
Need to invoke QEMU process.


Anyway, we can choose one of the following.
1. Take advantage of invoking fewer processes.
2. Take advantage of the maintainability of the virtio-net device.

Honestly, I'm OK if my solution is not merged.
So the decision should be based on what makes DPDK better.

What do you think?
Which is better for DPDK?

Thanks,
Tetsuya

> 
>   --yliu
> 



[dpdk-dev] [PATCH 0/8] support reset of VF link

2016-06-06 Thread Wenzhuo Lu
If the PF link goes down and then up, the VF link will not work
accordingly.
This patch set adds support for VF link reset. So, when the VF
receives the messages of physical link down/up, the APP can reset the
VF link and let it recover.

PS: This patch set is split from a previous patch set, *automatic
link recovery on ixgbe/igb VF*, and it's based on the patch set
*support mailbox interruption on ixgbe/igb VF*.

Wenzhuo Lu (8):
  lib/librte_ether: support device reset
  lib/librte_ether: defind RX/TX lock mode
  ixgbe: RX/TX with lock on VF
  ixgbe: implement device reset on VF
  igb: RX/TX with lock on VF
  igb: implement device reset on VF
  i40e:RX/TX with lock on VF
  i40e: implement device reset on VF

 doc/guides/rel_notes/release_16_07.rst |  14 
 drivers/net/e1000/e1000_ethdev.h   | 126 
 drivers/net/e1000/igb_ethdev.c | 118 +-
 drivers/net/e1000/igb_rxtx.c   | 148 +
 drivers/net/i40e/i40e_ethdev.c |   4 +-
 drivers/net/i40e/i40e_ethdev.h |   5 ++
 drivers/net/i40e/i40e_ethdev_vf.c  | 145 +++-
 drivers/net/i40e/i40e_rxtx.c   |  45 ++
 drivers/net/i40e/i40e_rxtx.h   |  34 
 drivers/net/ixgbe/ixgbe_ethdev.c   | 120 +-
 drivers/net/ixgbe/ixgbe_ethdev.h   |  32 ++-
 drivers/net/ixgbe/ixgbe_rxtx.c | 116 +++---
 drivers/net/ixgbe/ixgbe_rxtx.h |  13 +++
 drivers/net/ixgbe/ixgbe_rxtx_vec.c |   6 ++
 lib/librte_ether/rte_ethdev.c  |  17 
 lib/librte_ether/rte_ethdev.h  |  76 +
 lib/librte_ether/rte_ether_version.map |   7 ++
 17 files changed, 879 insertions(+), 147 deletions(-)

-- 
1.9.3



[dpdk-dev] [PATCH 1/8] lib/librte_ether: support device reset

2016-06-06 Thread Wenzhuo Lu
Add an API to reset the device.
It's for a VF device in this scenario: kernel PF + DPDK VF.
When the PF port goes down/up, the APP should call this API to
reset the VF port. Most likely, the APP should call it in its
management thread and guarantee thread safety.
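
As a rough illustration of how such an API dispatches to a driver, the
sketch below mirrors the shape of rte_eth_dev_reset() from this patch using
stand-in types. Everything prefixed demo_ (and DEMO_MAX_PORTS) is
hypothetical and not part of DPDK; only the error convention follows the
diff in this mail: -EINVAL for an invalid port and -ENOTSUP when the driver
provides no reset callback.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_PORTS 4    /* hypothetical table size */

struct demo_dev;

/* Per-driver callback, analogous to eth_dev_reset_t in the patch. */
typedef int (*demo_dev_reset_t)(struct demo_dev *dev);

struct demo_dev_ops {
    demo_dev_reset_t dev_reset;   /* NULL if the driver cannot reset */
};

struct demo_dev {
    const struct demo_dev_ops *ops;
    int attached;                 /* non-zero once the port is valid */
    unsigned int reset_count;     /* bookkeeping for this sketch only */
};

struct demo_dev demo_devices[DEMO_MAX_PORTS];

/* Generic entry point: validate the port, then call the driver's op. */
int demo_dev_reset(uint8_t port_id)
{
    struct demo_dev *dev;

    if (port_id >= DEMO_MAX_PORTS || !demo_devices[port_id].attached)
        return -EINVAL;

    dev = &demo_devices[port_id];
    if (dev->ops == NULL || dev->ops->dev_reset == NULL)
        return -ENOTSUP;

    return dev->ops->dev_reset(dev);
}

/* A driver that supports reset... */
static int demo_vf_reset(struct demo_dev *dev)
{
    assert(dev != NULL);
    dev->reset_count++;
    return 0;
}

const struct demo_dev_ops demo_vf_ops = { .dev_reset = demo_vf_reset };

/* ...and one that does not. */
const struct demo_dev_ops demo_noreset_ops = { .dev_reset = NULL };
```

An application would typically call the entry point from its management
thread after being notified of a PF link change, and treat -ENOTSUP as
"this port has no reset support" rather than as a fatal error.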

Signed-off-by: Wenzhuo Lu 
---
 lib/librte_ether/rte_ethdev.c  | 17 +
 lib/librte_ether/rte_ethdev.h  | 14 ++
 lib/librte_ether/rte_ether_version.map |  7 +++
 3 files changed, 38 insertions(+)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index e148028..e43dca9 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -3346,3 +3346,20 @@ rte_eth_dev_l2_tunnel_offload_set(uint8_t port_id,
-ENOTSUP);
return (*dev->dev_ops->l2_tunnel_offload_set)(dev, l2_tunnel, mask, en);
 }
+
+int
+rte_eth_dev_reset(uint8_t port_id)
+{
+   struct rte_eth_dev *dev;
+   int diag;
+
+   RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
+
+   dev = &rte_eth_devices[port_id];
+
+   RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_reset, -ENOTSUP);
+
+   diag = (*dev->dev_ops->dev_reset)(dev);
+
+   return diag;
+}
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 2757510..74e895f 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1318,6 +1318,9 @@ typedef int (*eth_l2_tunnel_offload_set_t)
 uint8_t en);
 /**< @internal enable/disable the l2 tunnel offload functions */

+typedef int  (*eth_dev_reset_t)(struct rte_eth_dev *dev);
+/**< @internal Function used to reset a configured Ethernet device. */
+
 #ifdef RTE_NIC_BYPASS

 enum {
@@ -1508,6 +1511,8 @@ struct eth_dev_ops {
eth_l2_tunnel_eth_type_conf_t l2_tunnel_eth_type_conf;
/** Enable/disable l2 tunnel offload functions */
eth_l2_tunnel_offload_set_t l2_tunnel_offload_set;
+   /** Reset device. */
+   eth_dev_reset_t dev_reset;
 };

 /**
@@ -4253,6 +4258,15 @@ rte_eth_dev_l2_tunnel_offload_set(uint8_t port_id,
  uint32_t mask,
  uint8_t en);

+/**
+ * Reset an Ethernet device.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ */
+int
+rte_eth_dev_reset(uint8_t port_id);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_ether/rte_ether_version.map 
b/lib/librte_ether/rte_ether_version.map
index 214ecc7..c34207e 100644
--- a/lib/librte_ether/rte_ether_version.map
+++ b/lib/librte_ether/rte_ether_version.map
@@ -132,3 +132,10 @@ DPDK_16.04 {
rte_eth_tx_buffer_set_err_callback;

 } DPDK_2.2;
+
+DPDK_16.07 {
+   global:
+
+   rte_eth_dev_reset;
+
+} DPDK_16.04;
-- 
1.9.3



[dpdk-dev] [PATCH 2/8] lib/librte_ether: defind RX/TX lock mode

2016-06-06 Thread Wenzhuo Lu
Define lock mode for RX/TX queues. When resetting
the device, we want the resetting thread to take the lock
of each RX/TX queue to make sure RX/TX is stopped.

Use the next ABI macro for this ABI change as it has too
much impact: 7 APIs and 1 global variable are affected.
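
The wrapper pattern this patch introduces can be tried in isolation with
the sketch below. It is a simplified stand-in, not the patch's macro: a C11
atomic_flag replaces rte_spinlock_t, the burst signature is shortened, and
all demo_/DEMO_ names are hypothetical. What it keeps is the key behavior:
the locked variant uses trylock, so a burst call returns 0 packets instead
of blocking while another thread (for example the resetting thread) holds
the queue lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

struct demo_queue {
    atomic_flag lock;     /* stand-in for rte_spinlock_t */
    uint16_t    pending;  /* packets waiting in this fake queue */
};

/* Returns 1 if the lock was taken, 0 if it is already held. */
int demo_trylock(atomic_flag *l)
{
    return !atomic_flag_test_and_set_explicit(l, memory_order_acquire);
}

void demo_unlock(atomic_flag *l)
{
    atomic_flag_clear_explicit(l, memory_order_release);
}

/* The regular (lock-free) burst function: take up to nb_pkts packets. */
uint16_t demo_recv(void *queue, uint16_t nb_pkts)
{
    struct demo_queue *q = queue;
    uint16_t n;

    assert(q != NULL);
    n = q->pending < nb_pkts ? q->pending : nb_pkts;
    q->pending = (uint16_t)(q->pending - n);
    return n;
}

/* Generate func_lock(), which runs func() only if the lock is free. */
#define DEMO_GENERATE_LOCKED(func) \
uint16_t func ## _lock(void *queue, uint16_t nb_pkts) \
{ \
    struct demo_queue *q = queue; \
    uint16_t n = 0; \
    \
    if (demo_trylock(&q->lock)) { \
        n = func(queue, nb_pkts); \
        demo_unlock(&q->lock); \
    } \
    \
    return n; \
}

DEMO_GENERATE_LOCKED(demo_recv)   /* defines demo_recv_lock() */
```

Using trylock rather than a blocking lock keeps the data path from stalling
during a reset; the caller simply sees an empty burst and polls again.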

Signed-off-by: Wenzhuo Lu 
Signed-off-by: Zhe Tao 
---
 lib/librte_ether/rte_ethdev.h | 62 +++
 1 file changed, 62 insertions(+)

diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 74e895f..4efb5e9 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -354,7 +354,12 @@ struct rte_eth_rxmode {
jumbo_frame  : 1, /**< Jumbo Frame Receipt enable. */
hw_strip_crc : 1, /**< Enable CRC stripping by hardware. */
enable_scatter   : 1, /**< Enable scatter packets rx handler */
+#ifndef RTE_NEXT_ABI
enable_lro   : 1; /**< Enable LRO */
+#else
+   enable_lro   : 1, /**< Enable LRO */
+   lock_mode: 1; /**< Using lock path */
+#endif
 };

 /**
@@ -634,11 +639,68 @@ struct rte_eth_txmode {
/**< If set, reject sending out tagged pkts */
hw_vlan_reject_untagged : 1,
/**< If set, reject sending out untagged pkts */
+#ifndef RTE_NEXT_ABI
hw_vlan_insert_pvid : 1;
/**< If set, enable port based VLAN insertion */
+#else
+   hw_vlan_insert_pvid : 1,
+   /**< If set, enable port based VLAN insertion */
+   lock_mode : 1;
+   /**< If set, using lock path */
+#endif
 };

 /**
+ * The macros for the RX/TX lock mode functions
+ */
+#ifdef RTE_NEXT_ABI
+#define RX_LOCK_FUNCTION(dev, func) \
+   (dev->data->dev_conf.rxmode.lock_mode ? \
+   func ## _lock : func)
+
+#define TX_LOCK_FUNCTION(dev, func) \
+   (dev->data->dev_conf.txmode.lock_mode ? \
+   func ## _lock : func)
+#else
+#define RX_LOCK_FUNCTION(dev, func) func
+
+#define TX_LOCK_FUNCTION(dev, func) func
+#endif
+
+/* Add the lock RX/TX function for VF reset */
+#define GENERATE_RX_LOCK(func, nic) \
+uint16_t func ## _lock(void *rx_queue, \
+ struct rte_mbuf **rx_pkts, \
+ uint16_t nb_pkts) \
+{  \
+   struct nic ## _rx_queue *rxq = rx_queue; \
+   uint16_t nb_rx = 0; \
+   \
+   if (rte_spinlock_trylock(&rxq->rx_lock)) { \
+   nb_rx = func(rx_queue, rx_pkts, nb_pkts); \
+   rte_spinlock_unlock(&rxq->rx_lock); \
+   } \
+   \
+   return nb_rx; \
+}
+
+#define GENERATE_TX_LOCK(func, nic) \
+uint16_t func ## _lock(void *tx_queue, \
+ struct rte_mbuf **tx_pkts, \
+ uint16_t nb_pkts) \
+{  \
+   struct nic ## _tx_queue *txq = tx_queue; \
+   uint16_t nb_tx = 0; \
+   \
+   if (rte_spinlock_trylock(&txq->tx_lock)) { \
+   nb_tx = func(tx_queue, tx_pkts, nb_pkts); \
+   rte_spinlock_unlock(&txq->tx_lock); \
+   } \
+   \
+   return nb_tx; \
+}
+
+/**
  * A structure used to configure an RX ring of an Ethernet port.
  */
 struct rte_eth_rxconf {
-- 
1.9.3



[dpdk-dev] [PATCH 3/8] ixgbe: RX/TX with lock on VF

2016-06-06 Thread Wenzhuo Lu
Add RX/TX paths with lock for VF. They are used when
the VF link reset function is needed.
With the lock added to RX/TX, RX/TX can be
stopped. Then we have a chance to reset the VF link.

Please be aware there's a performance drop if the lock
path is chosen.
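
How the locked or the regular path gets chosen can be illustrated with a
small token-pasting sketch. It is modeled on the RX_LOCK_FUNCTION() macro
from patch 2/8, but the types and all demo_/DEMO_ names are hypothetical.
The flag is read once, when the burst pointer is assigned, so the choice
itself adds no per-packet cost; the cost of the locked path comes from the
lock operations inside the selected function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint16_t (*demo_burst_t)(void *queue, uint16_t nb_pkts);

struct demo_port {
    unsigned int lock_mode : 1;   /* mirrors the new rxmode.lock_mode bit */
    demo_burst_t rx_burst;        /* burst function used by the data path */
};

/* Counts calls that went through the locked variant (sketch only). */
unsigned int demo_locked_calls;

uint16_t demo_rx(void *queue, uint16_t nb_pkts)
{
    (void)queue;
    return nb_pkts;               /* pretend every request is satisfied */
}

uint16_t demo_rx_lock(void *queue, uint16_t nb_pkts)
{
    demo_locked_calls++;          /* a real variant would try the queue lock */
    return demo_rx(queue, nb_pkts);
}

/* Pick func_lock when lock mode is on, otherwise func itself. */
#define DEMO_LOCK_FUNCTION(port, func) \
    ((port)->lock_mode ? func ## _lock : func)

void demo_port_init(struct demo_port *port, int lock_mode)
{
    assert(port != NULL);
    port->lock_mode = lock_mode ? 1 : 0;
    port->rx_burst = DEMO_LOCK_FUNCTION(port, demo_rx);
}
```

Because the macro pastes _lock onto the function name, every burst function
passed to it needs a matching _lock variant to exist at compile time.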

Signed-off-by: Wenzhuo Lu 
---
 drivers/net/ixgbe/ixgbe_ethdev.c   | 12 +--
 drivers/net/ixgbe/ixgbe_ethdev.h   | 20 +++
 drivers/net/ixgbe/ixgbe_rxtx.c | 74 --
 drivers/net/ixgbe/ixgbe_rxtx.h | 13 +++
 drivers/net/ixgbe/ixgbe_rxtx_vec.c |  6 
 5 files changed, 112 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index 05f4f29..fd2682f 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -1325,8 +1325,8 @@ eth_ixgbevf_dev_init(struct rte_eth_dev *eth_dev)
PMD_INIT_FUNC_TRACE();

eth_dev->dev_ops = &ixgbevf_eth_dev_ops;
-   eth_dev->rx_pkt_burst = &ixgbe_recv_pkts;
-   eth_dev->tx_pkt_burst = &ixgbe_xmit_pkts;
+   eth_dev->rx_pkt_burst = RX_LOCK_FUNCTION(eth_dev, ixgbe_recv_pkts);
+   eth_dev->tx_pkt_burst = TX_LOCK_FUNCTION(eth_dev, ixgbe_xmit_pkts);

/* for secondary processes, we don't initialise any further as primary
 * has already done this work. Only check we don't need a different
@@ -3012,7 +3012,15 @@ ixgbe_dev_supported_ptypes_get(struct rte_eth_dev *dev)
if (dev->rx_pkt_burst == ixgbe_recv_pkts ||
dev->rx_pkt_burst == ixgbe_recv_pkts_lro_single_alloc ||
dev->rx_pkt_burst == ixgbe_recv_pkts_lro_bulk_alloc ||
+#ifndef RTE_NEXT_ABI
dev->rx_pkt_burst == ixgbe_recv_pkts_bulk_alloc)
+#else
+   dev->rx_pkt_burst == ixgbe_recv_pkts_bulk_alloc ||
+   dev->rx_pkt_burst == ixgbe_recv_pkts_lock ||
+   dev->rx_pkt_burst == ixgbe_recv_pkts_lro_single_alloc_lock ||
+   dev->rx_pkt_burst == ixgbe_recv_pkts_lro_bulk_alloc_lock ||
+   dev->rx_pkt_burst == ixgbe_recv_pkts_bulk_alloc_lock)
+#endif
return ptypes;
return NULL;
 }
diff --git a/drivers/net/ixgbe/ixgbe_ethdev.h b/drivers/net/ixgbe/ixgbe_ethdev.h
index 4ff6338..701107b 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.h
+++ b/drivers/net/ixgbe/ixgbe_ethdev.h
@@ -390,12 +390,32 @@ uint16_t ixgbe_recv_pkts_lro_single_alloc(void *rx_queue,
 uint16_t ixgbe_recv_pkts_lro_bulk_alloc(void *rx_queue,
struct rte_mbuf **rx_pkts, uint16_t nb_pkts);

+uint16_t ixgbe_recv_pkts_lock(void *rx_queue,
+ struct rte_mbuf **rx_pkts,
+ uint16_t nb_pkts);
+uint16_t ixgbe_recv_pkts_bulk_alloc_lock(void *rx_queue,
+struct rte_mbuf **rx_pkts,
+uint16_t nb_pkts);
+uint16_t ixgbe_recv_pkts_lro_single_alloc_lock(void *rx_queue,
+  struct rte_mbuf **rx_pkts,
+  uint16_t nb_pkts);
+uint16_t ixgbe_recv_pkts_lro_bulk_alloc_lock(void *rx_queue,
+struct rte_mbuf **rx_pkts,
+uint16_t nb_pkts);
+
 uint16_t ixgbe_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
uint16_t nb_pkts);

 uint16_t ixgbe_xmit_pkts_simple(void *tx_queue, struct rte_mbuf **tx_pkts,
uint16_t nb_pkts);

+uint16_t ixgbe_xmit_pkts_lock(void *tx_queue,
+ struct rte_mbuf **tx_pkts,
+ uint16_t nb_pkts);
+uint16_t ixgbe_xmit_pkts_simple_lock(void *tx_queue,
+struct rte_mbuf **tx_pkts,
+uint16_t nb_pkts);
+
 int ixgbe_dev_rss_hash_update(struct rte_eth_dev *dev,
  struct rte_eth_rss_conf *rss_conf);

diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
index 9c6eaf2..a45d115 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx.c
@@ -353,6 +353,8 @@ ixgbe_xmit_pkts_simple(void *tx_queue, struct rte_mbuf 
**tx_pkts,
return nb_tx;
 }

+GENERATE_TX_LOCK(ixgbe_xmit_pkts_simple, ixgbe)
+
 static inline void
 ixgbe_set_xmit_ctx(struct ixgbe_tx_queue *txq,
volatile struct ixgbe_adv_tx_context_desc *ctx_txd,
@@ -904,6 +906,8 @@ end_of_tx:
return nb_tx;
 }

+GENERATE_TX_LOCK(ixgbe_xmit_pkts, ixgbe)
+
 /*
  *
  *  RX functions
@@ -1524,6 +1528,8 @@ ixgbe_recv_pkts_bulk_alloc(void *rx_queue, struct 
rte_mbuf **rx_pkts,
return nb_rx;
 }

+GENERATE_RX_LOCK(ixgbe_recv_pkts_bulk_alloc, ixgbe)
+
 uint16_t
 ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
uint16_t nb_pkts)
@@ -1712,6 +1718,8 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
return nb_rx;
 }

+GENE

[dpdk-dev] [PATCH 4/8] ixgbe: implement device reset on VF

2016-06-06 Thread Wenzhuo Lu
Implement the device reset function.
1, Add the fake RX/TX functions.
2, The reset function tries to stop RX/TX by replacing
   the RX/TX functions with the fake ones and taking the
   locks to make sure the regular RX/TX has finished.
3, After RX/TX has stopped, reset the VF port, and then
   release the locks and restore the RX/TX functions.
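
The three steps above can be sketched as a single-threaded model. This is
an illustration under stated assumptions, not the driver code: the queue
lock is a C11 atomic_flag instead of rte_spinlock_t, only the RX side is
shown, the hardware reset is a placeholder, and all demo_ names are
hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

typedef uint16_t (*demo_burst_t)(void *queue, uint16_t nb_pkts);

struct demo_queue {
    atomic_flag lock;             /* stand-in for the per-queue rx_lock */
};

struct demo_dev {
    demo_burst_t rx_burst;        /* what data-path threads call */
    demo_burst_t rx_backup;       /* saved while the reset runs */
    struct demo_queue *queues;
    uint16_t nb_queues;
    unsigned int hw_resets;       /* bookkeeping for this sketch */
    int fake_seen;                /* 1 if the fake was active during reset */
};

uint16_t demo_recv_real(void *queue, uint16_t nb_pkts)
{
    (void)queue;
    return nb_pkts;
}

/* Step 1: the fake burst function never touches the hardware. */
uint16_t demo_recv_fake(void *queue, uint16_t nb_pkts)
{
    (void)queue;
    (void)nb_pkts;
    return 0;
}

static void demo_lock(atomic_flag *l)
{
    while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
        ;                         /* spin until the current burst is done */
}

static void demo_unlock(atomic_flag *l)
{
    atomic_flag_clear_explicit(l, memory_order_release);
}

/* Placeholder for stopping and restarting the port. */
static void demo_hw_reset(struct demo_dev *dev)
{
    dev->fake_seen = (dev->rx_burst == demo_recv_fake);
    dev->hw_resets++;
}

int demo_reset(struct demo_dev *dev)
{
    uint16_t i;

    assert(dev != NULL && dev->queues != NULL);

    /* Step 2a: divert new bursts to the fake function. */
    dev->rx_backup = dev->rx_burst;
    dev->rx_burst = demo_recv_fake;

    /* Step 2b: take every queue lock so in-flight bursts have finished. */
    for (i = 0; i < dev->nb_queues; i++)
        demo_lock(&dev->queues[i].lock);

    /* Step 3: reset, then release the locks and restore the real path. */
    demo_hw_reset(dev);
    for (i = 0; i < dev->nb_queues; i++)
        demo_unlock(&dev->queues[i].lock);
    dev->rx_burst = dev->rx_backup;

    return 0;
}
```

In this model the function pointer is swapped with a plain store, as in the
patch; whether other threads observe the swap promptly is outside what the
sketch demonstrates, which is one reason the queue locks are still taken.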

Signed-off-by: Wenzhuo Lu 
---
 doc/guides/rel_notes/release_16_07.rst |   9 +++
 drivers/net/ixgbe/ixgbe_ethdev.c   | 108 -
 drivers/net/ixgbe/ixgbe_ethdev.h   |  12 +++-
 drivers/net/ixgbe/ixgbe_rxtx.c |  42 -
 4 files changed, 168 insertions(+), 3 deletions(-)

diff --git a/doc/guides/rel_notes/release_16_07.rst 
b/doc/guides/rel_notes/release_16_07.rst
index a761e3c..d36c4b1 100644
--- a/doc/guides/rel_notes/release_16_07.rst
+++ b/doc/guides/rel_notes/release_16_07.rst
@@ -53,6 +53,15 @@ New Features
   VF. To handle this link up/down event, add the mailbox interruption
   support to receive the message.

+* **Added device reset support for ixgbe VF.**
+
+  Added the device reset API. APP can call this API to reset the VF port
+  when it's not working.
+  Based on the mailbox interruption support, when VF receives the control
+  message from PF, it means the PF link state changes, VF uses the reset
+  callback in the message handler to notify the APP. APP needs to call the device
+  reset API to reset the VF port.
+

 Resolved Issues
 ---
diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index fd2682f..1e3520b 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -381,6 +381,8 @@ static int ixgbe_dev_udp_tunnel_port_add(struct rte_eth_dev 
*dev,
 static int ixgbe_dev_udp_tunnel_port_del(struct rte_eth_dev *dev,
 struct rte_eth_udp_tunnel *udp_tunnel);

+static int ixgbevf_dev_reset(struct rte_eth_dev *dev);
+
 /*
  * Define VF Stats MACRO for Non "cleared on read" register
  */
@@ -586,6 +588,7 @@ static const struct eth_dev_ops ixgbevf_eth_dev_ops = {
.reta_query   = ixgbe_dev_rss_reta_query,
.rss_hash_update  = ixgbe_dev_rss_hash_update,
.rss_hash_conf_get= ixgbe_dev_rss_hash_conf_get,
+   .dev_reset= ixgbevf_dev_reset,
 };

 /* store statistics names and its offset in stats structure */
@@ -4060,7 +4063,8 @@ ixgbevf_dev_start(struct rte_eth_dev *dev)
ETH_VLAN_EXTEND_MASK;
ixgbevf_vlan_offload_set(dev, mask);

-   ixgbevf_dev_rxtx_start(dev);
+   if (ixgbevf_dev_rxtx_start(dev))
+   return -1;

/* check and configure queue intr-vector mapping */
if (dev->data->dev_conf.intr_conf.rxq != 0) {
@@ -7193,6 +7197,108 @@ static void ixgbevf_mbx_process(struct rte_eth_dev *dev)
 }

 static int
+ixgbevf_dev_reset(struct rte_eth_dev *dev)
+{
+   struct ixgbe_hw *hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+   struct ixgbe_adapter *adapter =
+   (struct ixgbe_adapter *)dev->data->dev_private;
+   int diag = 0;
+   uint32_t vteiam;
+   uint16_t i;
+   struct ixgbe_rx_queue *rxq;
+   struct ixgbe_tx_queue *txq;
+
+   /* Nothing needs to be done if the device is not started. */
+   if (!dev->data->dev_started)
+   return 0;
+
+   PMD_DRV_LOG(DEBUG, "Link up/down event detected.");
+
+   /**
+* Stop RX/TX by fake functions and locks.
+* Fake functions are used to make RX/TX lock easier.
+*/
+   adapter->rx_backup = dev->rx_pkt_burst;
+   adapter->tx_backup = dev->tx_pkt_burst;
+   dev->rx_pkt_burst = ixgbevf_recv_pkts_fake;
+   dev->tx_pkt_burst = ixgbevf_xmit_pkts_fake;
+
+   if (dev->data->rx_queues)
+   for (i = 0; i < dev->data->nb_rx_queues; i++) {
+   rxq = dev->data->rx_queues[i];
+   rte_spinlock_lock(&rxq->rx_lock);
+   }
+
+   if (dev->data->tx_queues)
+   for (i = 0; i < dev->data->nb_tx_queues; i++) {
+   txq = dev->data->tx_queues[i];
+   rte_spinlock_lock(&txq->tx_lock);
+   }
+
+   /* Perform VF reset. */
+   do {
+   dev->data->dev_started = 0;
+   ixgbevf_dev_stop(dev);
+   if (dev->data->dev_conf.intr_conf.lsc == 0)
+   diag = ixgbe_dev_link_update(dev, 0);
+   if (diag) {
+   PMD_INIT_LOG(INFO, "Ixgbe VF reset: "
+"Failed to update link.");
+   }
+   rte_delay_ms(1000);
+
+   diag = ixgbevf_dev_start(dev);
+   /*If fail to start the device, need to stop/start it again. */
+   if (diag) {
+   PMD_INIT_LOG(ERR, "Ixgbe VF reset: "
+"Failed to start device.");
+ 

[dpdk-dev] [PATCH 5/8] igb: RX/TX with lock on VF

2016-06-06 Thread Wenzhuo Lu
Add RX/TX paths with lock for VF. They are used when
the VF link reset function is needed.
With the lock added to RX/TX, RX/TX can be
stopped. Then we have a chance to reset the VF link.

Please be aware there's a performance drop if the lock
path is chosen.

Signed-off-by: Wenzhuo Lu 
---
 drivers/net/e1000/e1000_ethdev.h | 10 ++
 drivers/net/e1000/igb_ethdev.c   | 14 +++---
 drivers/net/e1000/igb_rxtx.c | 26 +-
 3 files changed, 42 insertions(+), 8 deletions(-)

diff --git a/drivers/net/e1000/e1000_ethdev.h b/drivers/net/e1000/e1000_ethdev.h
index e8bf8da..6a42994 100644
--- a/drivers/net/e1000/e1000_ethdev.h
+++ b/drivers/net/e1000/e1000_ethdev.h
@@ -319,6 +319,16 @@ uint16_t eth_igb_recv_pkts(void *rxq, struct rte_mbuf 
**rx_pkts,
 uint16_t eth_igb_recv_scattered_pkts(void *rxq,
struct rte_mbuf **rx_pkts, uint16_t nb_pkts);

+uint16_t eth_igb_xmit_pkts_lock(void *txq,
+   struct rte_mbuf **tx_pkts,
+   uint16_t nb_pkts);
+uint16_t eth_igb_recv_pkts_lock(void *rxq,
+   struct rte_mbuf **rx_pkts,
+   uint16_t nb_pkts);
+uint16_t eth_igb_recv_scattered_pkts_lock(void *rxq,
+ struct rte_mbuf **rx_pkts,
+ uint16_t nb_pkts);
+
 int eth_igb_rss_hash_update(struct rte_eth_dev *dev,
struct rte_eth_rss_conf *rss_conf);

diff --git a/drivers/net/e1000/igb_ethdev.c b/drivers/net/e1000/igb_ethdev.c
index b0e5e6a..8aad741 100644
--- a/drivers/net/e1000/igb_ethdev.c
+++ b/drivers/net/e1000/igb_ethdev.c
@@ -909,15 +909,17 @@ eth_igbvf_dev_init(struct rte_eth_dev *eth_dev)
PMD_INIT_FUNC_TRACE();

eth_dev->dev_ops = &igbvf_eth_dev_ops;
-   eth_dev->rx_pkt_burst = &eth_igb_recv_pkts;
-   eth_dev->tx_pkt_burst = &eth_igb_xmit_pkts;
+   eth_dev->rx_pkt_burst = RX_LOCK_FUNCTION(eth_dev, eth_igb_recv_pkts);
+   eth_dev->tx_pkt_burst = TX_LOCK_FUNCTION(eth_dev, eth_igb_xmit_pkts);

/* for secondary processes, we don't initialise any further as primary
 * has already done this work. Only check we don't need a different
 * RX function */
if (rte_eal_process_type() != RTE_PROC_PRIMARY){
if (eth_dev->data->scattered_rx)
-   eth_dev->rx_pkt_burst = &eth_igb_recv_scattered_pkts;
+   eth_dev->rx_pkt_burst =
+   RX_LOCK_FUNCTION(eth_dev,
+eth_igb_recv_scattered_pkts);
return 0;
}

@@ -1999,7 +2001,13 @@ eth_igb_supported_ptypes_get(struct rte_eth_dev *dev)
};

if (dev->rx_pkt_burst == eth_igb_recv_pkts ||
+#ifndef RTE_NEXT_ABI
dev->rx_pkt_burst == eth_igb_recv_scattered_pkts)
+#else
+   dev->rx_pkt_burst == eth_igb_recv_scattered_pkts ||
+   dev->rx_pkt_burst == eth_igb_recv_pkts_lock ||
+   dev->rx_pkt_burst == eth_igb_recv_scattered_pkts_lock)
+#endif
return ptypes;
return NULL;
 }
diff --git a/drivers/net/e1000/igb_rxtx.c b/drivers/net/e1000/igb_rxtx.c
index 18aeead..7e97330 100644
--- a/drivers/net/e1000/igb_rxtx.c
+++ b/drivers/net/e1000/igb_rxtx.c
@@ -67,6 +67,7 @@
 #include 
 #include 
 #include 
+#include 

 #include "e1000_logs.h"
 #include "base/e1000_api.h"
@@ -107,6 +108,7 @@ struct igb_rx_queue {
struct igb_rx_entry *sw_ring;   /**< address of RX software ring. */
struct rte_mbuf *pkt_first_seg; /**< First segment of current packet. */
struct rte_mbuf *pkt_last_seg;  /**< Last segment of current packet. */
+   rte_spinlock_t  rx_lock; /**< Lock for packet receiption. */
uint16_tnb_rx_desc; /**< number of RX descriptors. */
uint16_trx_tail;/**< current value of RDT register. */
uint16_tnb_rx_hold; /**< number of held free RX desc. */
@@ -174,6 +176,7 @@ struct igb_tx_queue {
volatile union e1000_adv_tx_desc *tx_ring; /**< TX ring address */
uint64_t   tx_ring_phys_addr; /**< TX ring DMA address. */
struct igb_tx_entry*sw_ring; /**< virtual address of SW ring. */
+   rte_spinlock_t tx_lock; /**< Lock for packet transmission. */
volatile uint32_t  *tdt_reg_addr; /**< Address of TDT register. */
uint32_t   txd_type;  /**< Device-specific TXD type */
uint16_t   nb_tx_desc;/**< number of TX descriptors. */
@@ -615,6 +618,8 @@ eth_igb_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
return nb_tx;
 }

+GENERATE_TX_LOCK(eth_igb_xmit_pkts, igb)
+
 /*
  *
  *  RX functions
@@ -931,6 +936,8 @@ eth_igb_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
return nb_rx;
 }

+GENERA

[dpdk-dev] [PATCH 6/8] igb: implement device reset on VF

2016-06-06 Thread Wenzhuo Lu
Implement the device reset function.
1, Add the fake RX/TX functions.
2, The reset function tries to stop RX/TX by replacing
   the RX/TX functions with the fake ones and taking the
   locks to make sure the regular RX/TX has finished.
3, After RX/TX has stopped, reset the VF port, and then
   release the locks and restore the RX/TX functions.

BTW: The definitions of some structures are moved from the .c
file to the .h file.

Signed-off-by: Wenzhuo Lu 
---
 doc/guides/rel_notes/release_16_07.rst |   2 +-
 drivers/net/e1000/e1000_ethdev.h   | 116 ++
 drivers/net/e1000/igb_ethdev.c | 104 +++
 drivers/net/e1000/igb_rxtx.c   | 128 ++---
 4 files changed, 243 insertions(+), 107 deletions(-)

diff --git a/doc/guides/rel_notes/release_16_07.rst 
b/doc/guides/rel_notes/release_16_07.rst
index d36c4b1..a4c0cc3 100644
--- a/doc/guides/rel_notes/release_16_07.rst
+++ b/doc/guides/rel_notes/release_16_07.rst
@@ -53,7 +53,7 @@ New Features
   VF. To handle this link up/down event, add the mailbox interruption
   support to receive the message.

-* **Added device reset support for ixgbe VF.**
+* **Added device reset support for ixgbe/igb VF.**

   Added the device reset API. APP can call this API to reset the VF port
   when it's not working.
diff --git a/drivers/net/e1000/e1000_ethdev.h b/drivers/net/e1000/e1000_ethdev.h
index 6a42994..4ae03ce 100644
--- a/drivers/net/e1000/e1000_ethdev.h
+++ b/drivers/net/e1000/e1000_ethdev.h
@@ -34,6 +34,7 @@
 #ifndef _E1000_ETHDEV_H_
 #define _E1000_ETHDEV_H_
 #include 
+#include 

 /* need update link, bit flag */
 #define E1000_FLAG_NEED_LINK_UPDATE (uint32_t)(1 << 0)
@@ -261,6 +262,113 @@ struct e1000_adapter {
struct rte_timecounter  systime_tc;
struct rte_timecounter  rx_tstamp_tc;
struct rte_timecounter  tx_tstamp_tc;
+   eth_rx_burst_t rx_backup;
+   eth_tx_burst_t tx_backup;
+};
+
+/**
+ * Structure associated with each descriptor of the RX ring of a RX queue.
+ */
+struct igb_rx_entry {
+   struct rte_mbuf *mbuf; /**< mbuf associated with RX descriptor. */
+};
+
+/**
+ * Structure associated with each descriptor of the TX ring of a TX queue.
+ */
+struct igb_tx_entry {
+   struct rte_mbuf *mbuf; /**< mbuf associated with TX desc, if any. */
+   uint16_t next_id; /**< Index of next descriptor in ring. */
+   uint16_t last_id; /**< Index of last scattered descriptor. */
+};
+
+/**
+ * Hardware context number
+ */
+enum igb_advctx_num {
+   IGB_CTX_0= 0, /**< CTX0*/
+   IGB_CTX_1= 1, /**< CTX1*/
+   IGB_CTX_NUM  = 2, /**< CTX_NUM */
+};
+
+/** Offload features */
+union igb_tx_offload {
+   uint64_t data;
+   struct {
+   uint64_t l3_len:9; /**< L3 (IP) Header Length. */
+   uint64_t l2_len:7; /**< L2 (MAC) Header Length. */
+   uint64_t vlan_tci:16;  /**< VLAN Tag Control Identifier(CPU order). */
+   uint64_t l4_len:8; /**< L4 (TCP/UDP) Header Length. */
+   uint64_t tso_segsz:16; /**< TCP TSO segment size. */
+
+   /* uint64_t unused:8; */
+   };
+};
+
+/**
+ * Strucutre to check if new context need be built
+ */
+struct igb_advctx_info {
+   uint64_t flags;   /**< ol_flags related to context build. */
+   /** tx offload: vlan, tso, l2-l3-l4 lengths. */
+   union igb_tx_offload tx_offload;
+   /** compare mask for tx offload. */
+   union igb_tx_offload tx_offload_mask;
+};
+
+/**
+ * Structure associated with each RX queue.
+ */
+struct igb_rx_queue {
+   struct rte_mempool  *mb_pool;   /**< mbuf pool to populate RX ring. */
+   volatile union e1000_adv_rx_desc *rx_ring; /**< RX ring virtual address. */
+   uint64_trx_ring_phys_addr; /**< RX ring DMA address. */
+   volatile uint32_t   *rdt_reg_addr; /**< RDT register address. */
+   volatile uint32_t   *rdh_reg_addr; /**< RDH register address. */
+   struct igb_rx_entry *sw_ring;   /**< address of RX software ring. */
+   struct rte_mbuf *pkt_first_seg; /**< First segment of current packet. */
+   struct rte_mbuf *pkt_last_seg;  /**< Last segment of current packet. */
+   rte_spinlock_t  rx_lock; /**< Lock for packet receiption. */
+   uint16_tnb_rx_desc; /**< number of RX descriptors. */
+   uint16_trx_tail;/**< current value of RDT register. */
+   uint16_tnb_rx_hold; /**< number of held free RX desc. */
+   uint16_trx_free_thresh; /**< max free RX desc to hold. */
+   uint16_tqueue_id;   /**< RX queue index. */
+   uint16_treg_idx;/**< RX queue register index. */
+   uint8_t port_id;/**< Device port identifier. */
+   uint8_t pthresh;/**< Prefetch threshold register. */
+   uint8_t hthresh;/**< Host threshold register. */
+   uin

[dpdk-dev] [PATCH 7/8] i40e:RX/TX with lock on VF

2016-06-06 Thread Wenzhuo Lu
Add RX/TX paths with locks for the VF. They are used when
link reset on the VF is needed.
With the RX/TX locks in place, RX/TX can be stopped, which
gives us a chance to reset the VF link.

Please be aware that there is a performance drop if the lock
path is chosen.
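
For readers of the archive, the locked path described above can be sketched as follows. This is only an illustration under stated assumptions, not the code from this patch: the demo_* names, the queue layout and the C11 atomic_flag lock are hypothetical stand-ins for rte_spinlock_t and the driver's queue structures, and returning 0 when the lock is busy is an assumed policy.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for rte_spinlock_t, built on a C11 atomic flag. */
typedef struct { atomic_flag locked; } demo_spinlock_t;

static inline void demo_spinlock_init(demo_spinlock_t *l)
{
	atomic_flag_clear(&l->locked);
}

static inline bool demo_spinlock_trylock(demo_spinlock_t *l)
{
	/* test_and_set returns the previous value: false means we took it. */
	return !atomic_flag_test_and_set_explicit(&l->locked,
						  memory_order_acquire);
}

static inline void demo_spinlock_unlock(demo_spinlock_t *l)
{
	atomic_flag_clear_explicit(&l->locked, memory_order_release);
}

/* Hypothetical RX queue: a lock plus a count of packets waiting. */
struct demo_rx_queue {
	demo_spinlock_t rx_lock;
	uint16_t pending;
};

/* The regular (lock-free) burst function: hands out up to nb_pkts. */
static uint16_t
demo_recv_pkts(void *rx_queue, void **rx_pkts, uint16_t nb_pkts)
{
	struct demo_rx_queue *q = rx_queue;
	uint16_t n = q->pending < nb_pkts ? q->pending : nb_pkts;

	(void)rx_pkts; /* packet buffers are not modelled in this sketch */
	q->pending = (uint16_t)(q->pending - n);
	return n;
}

/* Locked variant: if a reset holds the lock, report no packets. */
static uint16_t
demo_recv_pkts_lock(void *rx_queue, void **rx_pkts, uint16_t nb_pkts)
{
	struct demo_rx_queue *q = rx_queue;
	uint16_t n;

	assert(q != NULL);
	if (!demo_spinlock_trylock(&q->rx_lock))
		return 0;
	n = demo_recv_pkts(rx_queue, rx_pkts, nb_pkts);
	demo_spinlock_unlock(&q->rx_lock);
	return n;
}
```

The extra atomic operation on every burst is one plausible source of the performance drop mentioned above.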

Signed-off-by: Zhe Tao 
---
 drivers/net/i40e/i40e_ethdev.c|  4 ++--
 drivers/net/i40e/i40e_ethdev.h|  4 
 drivers/net/i40e/i40e_ethdev_vf.c |  4 ++--
 drivers/net/i40e/i40e_rxtx.c  | 45 +--
 drivers/net/i40e/i40e_rxtx.h  | 30 ++
 5 files changed, 67 insertions(+), 20 deletions(-)

diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 24777d5..1380330 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -764,8 +764,8 @@ eth_i40e_dev_init(struct rte_eth_dev *dev)
PMD_INIT_FUNC_TRACE();

dev->dev_ops = &i40e_eth_dev_ops;
-   dev->rx_pkt_burst = i40e_recv_pkts;
-   dev->tx_pkt_burst = i40e_xmit_pkts;
+   dev->rx_pkt_burst = RX_LOCK_FUNCTION(dev, i40e_recv_pkts);
+   dev->tx_pkt_burst = TX_LOCK_FUNCTION(dev, i40e_xmit_pkts);

/* for secondary processes, we don't initialise any further as primary
 * has already done this work. Only check we don't need a different
diff --git a/drivers/net/i40e/i40e_ethdev.h b/drivers/net/i40e/i40e_ethdev.h
index cfd2399..672d920 100644
--- a/drivers/net/i40e/i40e_ethdev.h
+++ b/drivers/net/i40e/i40e_ethdev.h
@@ -540,6 +540,10 @@ struct i40e_adapter {
struct rte_timecounter systime_tc;
struct rte_timecounter rx_tstamp_tc;
struct rte_timecounter tx_tstamp_tc;
+
+   /* For VF reset backup */
+   eth_rx_burst_t rx_backup;
+   eth_tx_burst_t tx_backup;
 };

 int i40e_dev_switch_queues(struct i40e_pf *pf, bool on);
diff --git a/drivers/net/i40e/i40e_ethdev_vf.c b/drivers/net/i40e/i40e_ethdev_vf.c
index 90682ac..46d8a7c 100644
--- a/drivers/net/i40e/i40e_ethdev_vf.c
+++ b/drivers/net/i40e/i40e_ethdev_vf.c
@@ -1451,8 +1451,8 @@ i40evf_dev_init(struct rte_eth_dev *eth_dev)

/* assign ops func pointer */
eth_dev->dev_ops = &i40evf_eth_dev_ops;
-   eth_dev->rx_pkt_burst = &i40e_recv_pkts;
-   eth_dev->tx_pkt_burst = &i40e_xmit_pkts;
+   eth_dev->rx_pkt_burst = RX_LOCK_FUNCTION(eth_dev, i40e_recv_pkts);
+   eth_dev->tx_pkt_burst = TX_LOCK_FUNCTION(eth_dev, i40e_xmit_pkts);

/*
 * For secondary processes, we don't initialise any further as primary
diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index c833aa3..0a6dcfb 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -79,10 +79,6 @@
PKT_TX_TCP_SEG | \
PKT_TX_OUTER_IP_CKSUM)

-static uint16_t i40e_xmit_pkts_simple(void *tx_queue,
- struct rte_mbuf **tx_pkts,
- uint16_t nb_pkts);
-
 static inline void
 i40e_rxd_to_vlan_tci(struct rte_mbuf *mb, volatile union i40e_rx_desc *rxdp)
 {
@@ -1144,7 +1140,7 @@ rx_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
return 0;
 }

-static uint16_t
+uint16_t
 i40e_recv_pkts_bulk_alloc(void *rx_queue,
  struct rte_mbuf **rx_pkts,
  uint16_t nb_pkts)
@@ -1169,7 +1165,7 @@ i40e_recv_pkts_bulk_alloc(void *rx_queue,
return nb_rx;
 }
 #else
-static uint16_t
+uint16_t
 i40e_recv_pkts_bulk_alloc(void __rte_unused *rx_queue,
  struct rte_mbuf __rte_unused **rx_pkts,
  uint16_t __rte_unused nb_pkts)
@@ -1892,7 +1888,7 @@ tx_xmit_pkts(struct i40e_tx_queue *txq,
return nb_pkts;
 }

-static uint16_t
+uint16_t
 i40e_xmit_pkts_simple(void *tx_queue,
  struct rte_mbuf **tx_pkts,
  uint16_t nb_pkts)
@@ -2121,10 +2117,13 @@ i40e_dev_supported_ptypes_get(struct rte_eth_dev *dev)
};

if (dev->rx_pkt_burst == i40e_recv_pkts ||
+   dev->rx_pkt_burst == i40e_recv_pkts_lock ||
 #ifdef RTE_LIBRTE_I40E_RX_ALLOW_BULK_ALLOC
dev->rx_pkt_burst == i40e_recv_pkts_bulk_alloc ||
+   dev->rx_pkt_burst == i40e_recv_pkts_bulk_alloc_lock ||
 #endif
-   dev->rx_pkt_burst == i40e_recv_scattered_pkts)
+   dev->rx_pkt_burst == i40e_recv_scattered_pkts ||
+   dev->rx_pkt_burst == i40e_recv_scattered_pkts_lock)
return ptypes;
return NULL;
 }
@@ -2648,6 +2647,7 @@ i40e_reset_rx_queue(struct i40e_rx_queue *rxq)

rxq->rxrearm_start = 0;
rxq->rxrearm_nb = 0;
+   rte_spinlock_init(&rxq->rx_lock);
 }

 void
@@ -2704,6 +2704,7 @@ i40e_reset_tx_queue(struct i40e_tx_queue *txq)

txq->last_desc_cleaned = (uint16_t)(txq->nb_tx_desc - 1);
txq->nb_tx_free = (uint16_t)(txq->nb_tx_desc - 1);
+   rte_spinlock_init(&txq->tx_lock);
 }

 /* Init the TX queue i

[dpdk-dev] [PATCH 8/8] i40e: implement device reset on VF

2016-06-06 Thread Wenzhuo Lu
Implement the device reset function.
1, Add the fake RX/TX functions.
2, The reset function tries to stop RX/TX by replacing
   the RX/TX functions with the fake ones and getting the
   locks to make sure the regular RX/TX finished.
3, After the RX/TX stopped, reset the VF port, and then
   release the locks.
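
The three steps above can be sketched roughly as below. This is a single-threaded illustration under stated assumptions, not the code from this patch: all demo_* names are hypothetical, a C11 atomic_flag stands in for the per-queue spinlock, and saving and restoring the burst pointer is an assumption of the sketch. A real driver would also need the burst pointer swap to be visible to the polling cores.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

typedef uint16_t (*demo_burst_t)(void *queue, void **pkts, uint16_t nb_pkts);

struct demo_queue {
	atomic_flag lock; /* stand-in for the per-queue rx_lock */
	uint16_t head;    /* some state that the reset re-initialises */
};

struct demo_dev {
	demo_burst_t rx_burst; /* stand-in for dev->rx_pkt_burst */
	struct demo_queue rxq;
	unsigned int resets;
};

/* Regular burst function: works only while it can take the queue lock. */
static uint16_t
demo_rx_real(void *queue, void **pkts, uint16_t nb_pkts)
{
	struct demo_queue *q = queue;

	(void)pkts;
	if (atomic_flag_test_and_set_explicit(&q->lock, memory_order_acquire))
		return 0;
	q->head = (uint16_t)(q->head + nb_pkts);
	atomic_flag_clear_explicit(&q->lock, memory_order_release);
	return nb_pkts;
}

/* Step 1: the fake burst function does nothing and reports no packets. */
static uint16_t
demo_rx_fake(void *queue, void **pkts, uint16_t nb_pkts)
{
	(void)queue;
	(void)pkts;
	(void)nb_pkts;
	return 0;
}

static void
demo_dev_reset(struct demo_dev *dev)
{
	demo_burst_t backup;

	assert(dev != NULL);
	backup = dev->rx_burst;

	/* Step 2: divert new calls, then wait for any in-flight burst. */
	dev->rx_burst = demo_rx_fake;
	while (atomic_flag_test_and_set_explicit(&dev->rxq.lock,
						 memory_order_acquire))
		;

	/* Step 3: reset the port state, then release the lock. */
	dev->rxq.head = 0;
	dev->resets++;
	atomic_flag_clear_explicit(&dev->rxq.lock, memory_order_release);

	/* Assumption of this sketch: put the saved burst function back. */
	dev->rx_burst = backup;
}
```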

Signed-off-by: Zhe Tao 
---
 doc/guides/rel_notes/release_16_07.rst |   5 ++
 drivers/net/i40e/i40e_ethdev.h |   7 +-
 drivers/net/i40e/i40e_ethdev_vf.c  | 141 +
 drivers/net/i40e/i40e_rxtx.h   |   4 +
 4 files changed, 154 insertions(+), 3 deletions(-)

diff --git a/doc/guides/rel_notes/release_16_07.rst b/doc/guides/rel_notes/release_16_07.rst
index a4c0cc3..f43b867 100644
--- a/doc/guides/rel_notes/release_16_07.rst
+++ b/doc/guides/rel_notes/release_16_07.rst
@@ -62,6 +62,11 @@ New Features
   callback in the message handler to notice the APP. APP need call the device
   reset API to reset the VF port.

+* **Added VF reset support for i40e VF driver.**
+
+  Added a new implementaion to allow i40e VF driver to
+  reset the functionality and state of itself.
+

 Resolved Issues
 ---
diff --git a/drivers/net/i40e/i40e_ethdev.h b/drivers/net/i40e/i40e_ethdev.h
index 672d920..dcd6e0f 100644
--- a/drivers/net/i40e/i40e_ethdev.h
+++ b/drivers/net/i40e/i40e_ethdev.h
@@ -541,9 +541,8 @@ struct i40e_adapter {
struct rte_timecounter rx_tstamp_tc;
struct rte_timecounter tx_tstamp_tc;

-   /* For VF reset backup */
-   eth_rx_burst_t rx_backup;
-   eth_tx_burst_t tx_backup;
+   /* For VF reset */
+   uint8_t reset_number;
 };

 int i40e_dev_switch_queues(struct i40e_pf *pf, bool on);
@@ -597,6 +596,8 @@ void i40e_rxq_info_get(struct rte_eth_dev *dev, uint16_t queue_id,
 void i40e_txq_info_get(struct rte_eth_dev *dev, uint16_t queue_id,
struct rte_eth_txq_info *qinfo);

+void i40evf_emulate_vf_reset(uint8_t port_id);
+
 /* I40E_DEV_PRIVATE_TO */
 #define I40E_DEV_PRIVATE_TO_PF(adapter) \
(&((struct i40e_adapter *)adapter)->pf)
diff --git a/drivers/net/i40e/i40e_ethdev_vf.c b/drivers/net/i40e/i40e_ethdev_vf.c
index 46d8a7c..9fc121b 100644
--- a/drivers/net/i40e/i40e_ethdev_vf.c
+++ b/drivers/net/i40e/i40e_ethdev_vf.c
@@ -157,6 +157,12 @@ i40evf_dev_rx_queue_intr_disable(struct rte_eth_dev *dev, uint16_t queue_id);
 static void i40evf_handle_pf_event(__rte_unused struct rte_eth_dev *dev,
   uint8_t *msg,
   uint16_t msglen);
+static int i40evf_dev_uninit(struct rte_eth_dev *eth_dev);
+static int i40evf_dev_init(struct rte_eth_dev *eth_dev);
+static void i40evf_dev_close(struct rte_eth_dev *dev);
+static int i40evf_dev_start(struct rte_eth_dev *dev);
+static int i40evf_dev_configure(struct rte_eth_dev *dev);
+static int i40evf_handle_vf_reset(struct rte_eth_dev *dev);

 /* Default hash key buffer for RSS */
 static uint32_t rss_key_default[I40E_VFQF_HKEY_MAX_INDEX + 1];
@@ -223,6 +229,7 @@ static const struct eth_dev_ops i40evf_eth_dev_ops = {
.reta_query   = i40evf_dev_rss_reta_query,
.rss_hash_update  = i40evf_dev_rss_hash_update,
.rss_hash_conf_get= i40evf_dev_rss_hash_conf_get,
+   .dev_reset= i40evf_handle_vf_reset
 };

 /*
@@ -1309,6 +1316,140 @@ i40evf_uninit_vf(struct rte_eth_dev *dev)
 }

 static void
+i40e_vf_queue_reset(struct rte_eth_dev *dev)
+{
+   uint16_t i;
+
+   for (i = 0; i < dev->data->nb_rx_queues; i++) {
+   struct i40e_rx_queue *rxq = dev->data->rx_queues[i];
+
+   if (rxq->q_set) {
+   i40e_dev_rx_queue_setup(dev,
+   rxq->queue_id,
+   rxq->nb_rx_desc,
+   rxq->socket_id,
+   &rxq->rxconf,
+   rxq->mp);
+   }
+
+   rxq = dev->data->rx_queues[i];
+   rte_spinlock_trylock(&rxq->rx_lock);
+   }
+   for (i = 0; i < dev->data->nb_tx_queues; i++) {
+   struct i40e_tx_queue *txq = dev->data->tx_queues[i];
+
+   if (txq->q_set) {
+   i40e_dev_tx_queue_setup(dev,
+   txq->queue_id,
+   txq->nb_tx_desc,
+   txq->socket_id,
+   &txq->txconf);
+   }
+
+   txq = dev->data->tx_queues[i];
+   rte_spinlock_trylock(&txq->tx_lock);
+   }
+}
+
+static void
+i40e_vf_reset_dev(struct rte_eth_dev *dev)
+{
+   struct i40e_adapter *adapter =
+   I40E_DEV_PRIVATE_TO_ADAPTER(dev->data->dev_private);
+
+   i40evf_dev_close(dev);
+   PMD_DRV_LOG(DEBUG, "i40evf dev close complete");
+   i

[dpdk-dev] [PATCH] an example for VF link reset

2016-06-06 Thread Wenzhuo Lu
Add a new example to show how to handle the reset event
on the VF when the PF link goes down and comes back up.

PS: This patch set is based on the patch set *support reset
of VF link*.

Wenzhuo Lu (1):
  examples: add a new example for link reset

 MAINTAINERS |   4 +
 doc/guides/sample_app_ug/link_reset.rst | 177 
 examples/link_reset/Makefile|  50 +++
 examples/link_reset/main.c  | 769 
 4 files changed, 1000 insertions(+)
 create mode 100644 doc/guides/sample_app_ug/link_reset.rst
 create mode 100644 examples/link_reset/Makefile
 create mode 100644 examples/link_reset/main.c

-- 
1.9.3



[dpdk-dev] [PATCH] examples: add a new example for link reset

2016-06-06 Thread Wenzhuo Lu
Add a new example to show that when the PF goes down and comes
back up, the VF port can be reset and recover.

Signed-off-by: Wenzhuo Lu 
---
 MAINTAINERS |   4 +
 doc/guides/sample_app_ug/link_reset.rst | 177 
 examples/link_reset/Makefile|  50 +++
 examples/link_reset/main.c  | 769 
 4 files changed, 1000 insertions(+)
 create mode 100644 doc/guides/sample_app_ug/link_reset.rst
 create mode 100644 examples/link_reset/Makefile
 create mode 100644 examples/link_reset/main.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 3e8558f..76879c3 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -650,3 +650,7 @@ F: examples/tep_termination/
 F: examples/vmdq/
 F: examples/vmdq_dcb/
 F: doc/guides/sample_app_ug/vmdq_dcb_forwarding.rst
+
+M: Wenzhuo Lu 
+F: examples/link_reset/
+F: doc/guides/sample_app_ug/link_reset.rst
diff --git a/doc/guides/sample_app_ug/link_reset.rst b/doc/guides/sample_app_ug/link_reset.rst
new file mode 100644
index 000..fecae6d
--- /dev/null
+++ b/doc/guides/sample_app_ug/link_reset.rst
@@ -0,0 +1,177 @@
+..  BSD LICENSE
+Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions
+are met:
+
+* Redistributions of source code must retain the above copyright
+notice, this list of conditions and the following disclaimer.
+* Redistributions in binary form must reproduce the above copyright
+notice, this list of conditions and the following disclaimer in
+the documentation and/or other materials provided with the
+distribution.
+* Neither the name of Intel Corporation nor the names of its
+contributors may be used to endorse or promote products derived
+from this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+Link Reset Sample Application (in Virtualized Environments)
+===
+
+The Link Reset sample application is a simple example of VF traffic recovery
+using the Data Plane Development Kit (DPDK) which also takes advantage of Single
+Root I/O Virtualization (SR-IOV) features in a virtualized environment.
+
+Overview
+
+
+The Link Reset sample application, which should operate in virtualized
+environments, performs L2 forwarding for each packet that is received on an
+RX_PORT.
+This example is extended from the L2 forwarding example. Please reference the
+example of L2 forwarding in virtualized environments for more details and
+explanation about the behavior of forwarding and how to setup the test.
+The purpose of this example is to show when the PF port is down and up, the VF
+port can recover and the traffic can recover too.
+
+Virtual Function Setup Instructions
+~~~
+
+This application can use the virtual function available in the system and
+therefore can be used in a virtual machine without passing through
+the whole Network Device into a guest machine in a virtualized scenario.
+The virtual functions can be enabled in the host machine or the hypervisor
+with the respective physical function driver.
+
+For example, in a Linux* host machine, it is possible to enable a virtual
+function using the following command:
+
+.. code-block:: console
+
+modprobe ixgbe max_vfs=2,2
+
+This command enables two Virtual Functions on each of Physical Function of the
+NIC, with two physical ports in the PCI configuration space.
+It is important to note that enabled Virtual Function 0 and 2 would belong to
+Physical Function 0 and Virtual Function 1 and 3 would belong to Physical
+Function 1, in this case enabling a total of four Virtual Functions.
+
+Compiling the Application
+-
+
+#.  Go to the example directory:
+
+.. code-block:: console
+
+export RTE_SDK=/path/to/rte_sdk
+cd ${RTE_SDK}/examples/link_reset
+
+#.  Set the target (a default target is used if not specified). For example:
+
+.. code-block:: console
+
+export RTE_TARGET=x86_64-native-linuxapp-gcc
+
+   

[dpdk-dev] [PATCH v5 0/6] Virtio-net PMD: QEMU QTest extension for container

2016-06-06 Thread Yuanhan Liu
On Mon, Jun 06, 2016 at 02:10:46PM +0900, Tetsuya Mukawa wrote:
> Hi Yuanhan,
> 
> Sorry for late replying.

Never mind.

> 
> On 2016/06/03 13:17, Yuanhan Liu wrote:
> > On Thu, Jun 02, 2016 at 06:30:18PM +0900, Tetsuya Mukawa wrote:
> >> Hi Yuanhan,
> >>
> >> On 2016/06/02 16:31, Yuanhan Liu wrote:
> >>> But still, I'd ask do we really need 2 virtio for container solutions?
> >>
> >> I appreciate your comments.
> > 
> > No, I appreciate your effort for contributing to DPDK! vhost-pmd stuff
> > is just brilliant!
> > 
> >> Let me have time to discuss it with our team.
> > 
> > I'm wondering could we have one solution only. IMO, the drawback of
> > having two (quite different) solutions might outweighs the benefit
> > it takes. Say, it might just confuse user.
> 
> I agree with this.
> If we have 2 solutions, it would confuse the DPDK users.
> 
> > 
> > OTOH, I'm wondering could you adapt to Jianfeng's solution? If not,
> > what's the missing parts, and could we fix it? I'm thinking having
> > one unified solution will keep ours energy/focus on one thing, making
> > it better and better! Having two just splits the energy; it also
> > introduces extra burden for maintaining.
> 
> Of course, I adopt Jiangeng's solution basically.
> Actually, his solution is almost similar I tried to implement at first.
> 
> I guess here is pros/cons of 2 solutions.
> 
> [Jianfeng's solution]
> - Pros
> Don't need to invoke QEMU process.
> - Cons
> If virtio-net specification is changed, we need to implement it by
> ourselves. Also, LSC interrupt and control queue functions are not
> supported yet.

Jianfeng has made and sent out a patch to enable ctrl queue and
multiple queue support.

For the LSC part, I don't have much of an idea so far. But I'm assuming
it will not take too much effort, either.

> I agree both functions may not be so important, and if we need it
> we can implement them, but we need to pay energy to implement them.
> 
> [My solution]
> - Pros
> Basic principle of my implementation is not to reinvent the wheel.

Yes, that's a good point. However, it's not as hard as we might have
thought at first: the tough part, dequeuing/enqueuing packets
from/to the vring, is actually offloaded to DPDK vhost-user. That means we
only need to re-implement the control path of the virtio-net device, plus
the vhost-user frontend. If you take a detailed look at your patchset as
well as Jianfeng's, you might find that the two patchsets are actually
about the same code size.

> We can use a virtio-net device of QEMU implementation, it means we don't
> need to maintain virtio-net device by ourselves, and we can use all of
> functions supported by QEMU virtio-net device.
> - Cons
> Need to invoke QEMU process.

Another thing is that it makes the usage a bit harder: look at the
long qemu cli options of your example usage. It also has some traps:
say, "--enable-kvm", which is a default option used with QEMU, is
not allowed.

And judging that it actually doesn't take too much effort to implement
a virtio device emulation, I slightly prefer that approach. I guess
something lightweight and easier to use is more important here.

Actually, I foresee another benefit of adding virtio-user device
emulation: we might now be able to add a rte_vhost_dequeue/enqueue_burst()
unit test case. We simply couldn't do it before, since we depended on QEMU
for testing, which is not acceptable for a unit test case. Making it
a unit test case would help us spot any bad changes that would
introduce bugs, easily and automatically.

--yliu

> Anyway, we can choose one of belows.
> 1. Take advantage of invoking less processes.
> 2. Take advantage of maintainability of virtio-net device.
> 
> Honestly, I'm OK if my solution is not merged.
> Thus, it should be decided to let DPDK better.
> 
> What do you think?
> Which is better for DPDK?
> 
> Thanks,
> Tetsuya
> 
> > 
> > --yliu
> > 


[dpdk-dev] [PATCH v2 00/15] i40e base driver update

2016-06-06 Thread Lu, Wenzhuo
Hi,

Acked-by: Wenzhuo Lu 



[dpdk-dev] [PATCH v6 6/7] virtio-user: add new virtual pci driver for virtio

2016-06-06 Thread Yuanhan Liu
On Thu, Jun 02, 2016 at 09:54:36AM +, Jianfeng Tan wrote:
> +
> + desc_addr = (uint64_t)vq->mz->addr;
> + avail_addr = desc_addr + vq->vq_nentries * sizeof(struct vring_desc);
> + used_addr = RTE_ALIGN_CEIL(avail_addr + offsetof(struct vring_avail,
> +  ring[vq->vq_nentries]),
> +VIRTIO_PCI_VRING_ALIGN);
> +
> + dev->vrings[queue_idx].num = vq->vq_nentries;
> + dev->vrings[queue_idx].desc = (void *)desc_addr;
> + dev->vrings[queue_idx].avail = (void *)avail_addr;
> + dev->vrings[queue_idx].used = (void *)used_addr;

That would break the 32-bit build. Please also do more build and function
tests, with and without CONFIG_RTE_VIRTIO_VDEV enabled, to make sure
we will not break anything. I'm sure you will hit a build error without
that option enabled.
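
A minimal sketch of one common way to avoid this kind of breakage, assuming the problem is the direct cast between a pointer and uint64_t (on a 32-bit target the sizes differ, which compilers typically warn about and which becomes an error when warnings are treated as errors). The demo_* helper names are hypothetical; the idea is simply to go through uintptr_t in both directions.

```c
#include <assert.h>
#include <stdint.h>

/* Pointer -> 64-bit value, e.g. for a field that is always 64 bits wide.
 * Going through uintptr_t keeps the cast well-sized on 32-bit targets. */
static inline uint64_t
demo_ptr_to_u64(const void *p)
{
	return (uint64_t)(uintptr_t)p;
}

/* 64-bit value -> pointer. The value must fit in a native pointer. */
static inline void *
demo_u64_to_ptr(uint64_t v)
{
	assert(v <= UINTPTR_MAX);
	return (void *)(uintptr_t)v;
}
```

With helpers like these, the address arithmetic in the quoted hunk could stay in an integer type while the final assignments to the vring pointers remain portable.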

BTW, let's be consistent and use VIRTIO_USER_DEV instead of VDEV
or VIRTIO_VDEV?

Another thing that might be a bit late to ask: how about
removing the vhost-net support? I mean, it's DPDK; if a user sticks
to using DPDK virtio-user, he will stick to using DPDK vhost-user
as well, not vhost-net. So, let's keep it simple
first. And if there is really a need for vhost-net, we can add it
back later, easily. Makes sense?

I also would suggest you do a rebase based on my latest tree.

--yliu


[dpdk-dev] [PATCH v5 0/6] Virtio-net PMD: QEMU QTest extension for container

2016-06-06 Thread Tan, Jianfeng
Hi,


On 6/6/2016 1:10 PM, Tetsuya Mukawa wrote:
> Hi Yuanhan,
>
> Sorry for late replying.
>
> On 2016/06/03 13:17, Yuanhan Liu wrote:
>> On Thu, Jun 02, 2016 at 06:30:18PM +0900, Tetsuya Mukawa wrote:
>>> Hi Yuanhan,
>>>
>>> On 2016/06/02 16:31, Yuanhan Liu wrote:
>>>> But still, I'd ask do we really need 2 virtio for container solutions?
>>> I appreciate your comments.
>> No, I appreciate your effort for contributing to DPDK! vhost-pmd stuff
>> is just brilliant!
>>
>>> Let me have time to discuss it with our team.
>> I'm wondering could we have one solution only. IMO, the drawback of
>> having two (quite different) solutions might outweighs the benefit
>> it takes. Say, it might just confuse user.
> I agree with this.
> If we have 2 solutions, it would confuse the DPDK users.
>
>> OTOH, I'm wondering could you adapt to Jianfeng's solution? If not,
>> what's the missing parts, and could we fix it? I'm thinking having
>> one unified solution will keep ours energy/focus on one thing, making
>> it better and better! Having two just splits the energy; it also
>> introduces extra burden for maintaining.
> Of course, I adopt Jiangeng's solution basically.
> Actually, his solution is almost similar I tried to implement at first.
>
> I guess here is pros/cons of 2 solutions.
>
> [Jianfeng's solution]
> - Pros
> Don't need to invoke QEMU process.
> - Cons
> If virtio-net specification is changed, we need to implement it by
> ourselves.

It will barely introduce any change when the virtio-net specification
changes, as far as I can see. The only part we care about is how desc,
avail and used are laid out in memory, which is a very small part.
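
To make "how desc, avail and used are laid out in memory" concrete, here is a small sketch that mirrors the address calculation in the v6 6/7 hunk quoted elsewhere in this thread, expressed as byte offsets. The demo_* names are hypothetical, and the 4096-byte alignment (intended to mirror VIRTIO_PCI_VRING_ALIGN) and the 16-byte descriptor size are assumptions stated in the comments.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: mirrors VIRTIO_PCI_VRING_ALIGN. */
#define DEMO_VRING_ALIGN 4096u
/* Assumption: sizeof(struct vring_desc), i.e. u64 addr, u32 len,
 * u16 flags, u16 next. */
#define DEMO_DESC_SIZE 16u

struct demo_vring_layout {
	size_t desc_off;  /* descriptor table */
	size_t avail_off; /* available ring */
	size_t used_off;  /* used ring */
};

/* Round v up to the next multiple of align (a power of two). */
static inline size_t
demo_align_ceil(size_t v, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (v + align - 1) & ~(align - 1);
}

static struct demo_vring_layout
demo_vring_compute(uint16_t nentries)
{
	struct demo_vring_layout l;

	l.desc_off = 0;
	/* The available ring follows the descriptor table directly. */
	l.avail_off = l.desc_off + (size_t)nentries * DEMO_DESC_SIZE;
	/* Available ring: u16 flags + u16 idx, then nentries u16 slots;
	 * the used ring starts at the next aligned boundary after that. */
	l.used_off = demo_align_ceil(l.avail_off + 4u + 2u * (size_t)nentries,
				     DEMO_VRING_ALIGN);
	return l;
}
```

For a 256-entry queue this gives the available ring at offset 4096 and the used ring at offset 8192, relative to the start of the descriptor table.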

It's true that my solution now depends heavily on the vhost-user protocol,
which is defined in QEMU. I cannot see a big problem there so far.

>   Also, LSC interrupt and control queue functions are not
> supported yet.
> I agree both functions may not be so important, and if we need it
> we can implement them, but we need to pay energy to implement them.

LSC is really less important than the rxq interrupt (IMO). We don't know how
long it will take for the virtio rxq interrupt to be available in QEMU, but
we can accelerate it if we avoid using QEMU.

Actually, if the vhost backend is vhost-user (the main use case),
current QEMU has limited control queue support, because it needs
support from the vhost-user backend.

One more con of my solution:
- Need to write separate logic to support other virtio devices (say
virtio-scsi); is that easier to do with Tetsuya's solution?

>
> [My solution]
> - Pros
> Basic principle of my implementation is not to reinvent the wheel.
> We can use a virtio-net device of QEMU implementation, it means we don't
> need to maintain virtio-net device by ourselves, and we can use all of
> functions supported by QEMU virtio-net device.
> - Cons
> Need to invoke QEMU process.

Three more possible cons:
a) This solution also needs to maintain the qtest utility, right?
b) There's still an address arrangement restriction, right? Although we can
use "--base-virtaddr=0x4" to relieve this problem, what about when
there are 2 or more devices? (By the way, is there still an address
arrangement requirement for 32-bit systems?)
c) Actually, IMO this solution is sensitive to any virtio spec change
(io port, pci configuration space).

>
>
> Anyway, we can choose one of belows.
> 1. Take advantage of invoking less processes.
> 2. Take advantage of maintainability of virtio-net device.
>
> Honestly, I'm OK if my solution is not merged.
> Thus, it should be decided to let DPDK better.

Yes, agreed.

Thanks,
Jianfeng

>
> What do you think?
> Which is better for DPDK?
>
> Thanks,
> Tetsuya
>
>>  --yliu
>>



[dpdk-dev] [PATCH v6 6/7] virtio-user: add new virtual pci driver for virtio

2016-06-06 Thread Tan, Jianfeng
Hi Yuanhan,


On 6/6/2016 4:01 PM, Yuanhan Liu wrote:
> On Thu, Jun 02, 2016 at 09:54:36AM +, Jianfeng Tan wrote:
>> +
>> +desc_addr = (uint64_t)vq->mz->addr;
>> +avail_addr = desc_addr + vq->vq_nentries * sizeof(struct vring_desc);
>> +used_addr = RTE_ALIGN_CEIL(avail_addr + offsetof(struct vring_avail,
>> + ring[vq->vq_nentries]),
>> +   VIRTIO_PCI_VRING_ALIGN);
>> +
>> +dev->vrings[queue_idx].num = vq->vq_nentries;
>> +dev->vrings[queue_idx].desc = (void *)desc_addr;
>> +dev->vrings[queue_idx].avail = (void *)avail_addr;
>> +dev->vrings[queue_idx].used = (void *)used_addr;
> That would break 32 bit build. please also do more build and function
> test, with and without CONFIG_RTE_VIRTIO_VDEV enabled, to make sure
> we will not break anything. I'm sure you will meet build error without
> that option enabled.

Yes, thanks for pointing this out.

>
> BTW, let's be consistent with using VIRTIO_USER_DEV instead of VDEV
> or VIRTIO_VDEV?

OK.

>
> Another thing that might be a bit late to ask is that how about
> removing the vhost-net support? I mean, it's DPDK; if user stick
> to using DPDK virtio-user, he will stick to using DPDK vhost-user
> as well, but not the vhost-net. So, let's keep it being simple
> first. And if there is really a need for vhost-net, we can add it
> back later, easily. Makes sense?

Yes, it makes sense, because in an initial test I saw low
performance. Or can anyone who is willing to use it comment?

Thanks,
Jianfeng
>
> I also would suggest you do a rebase based on my latest tree.

No problem.

Thanks,
Jianfeng

>
>   --yliu



[dpdk-dev] [PATCH v5 0/6] Virtio-net PMD: QEMU QTest extension for container

2016-06-06 Thread Tetsuya Mukawa
On 2016/06/06 16:21, Yuanhan Liu wrote:
> On Mon, Jun 06, 2016 at 02:10:46PM +0900, Tetsuya Mukawa wrote:
>> Hi Yuanhan,
>>
>> Sorry for late replying.
> 
> Never mind.
> 
>>
>> On 2016/06/03 13:17, Yuanhan Liu wrote:
>>> On Thu, Jun 02, 2016 at 06:30:18PM +0900, Tetsuya Mukawa wrote:
>>>> Hi Yuanhan,
>>>>
>>>> On 2016/06/02 16:31, Yuanhan Liu wrote:
>>>>> But still, I'd ask do we really need 2 virtio for container solutions?
>>>>
>>>> I appreciate your comments.
>>>
>>> No, I appreciate your effort for contributing to DPDK! vhost-pmd stuff
>>> is just brilliant!
>>>
>>>> Let me have time to discuss it with our team.
>>>
>>> I'm wondering could we have one solution only. IMO, the drawback of
>>> having two (quite different) solutions might outweighs the benefit
>>> it takes. Say, it might just confuse user.
>>
>> I agree with this.
>> If we have 2 solutions, it would confuse the DPDK users.
>>
>>>
>>> OTOH, I'm wondering could you adapt to Jianfeng's solution? If not,
>>> what's the missing parts, and could we fix it? I'm thinking having
>>> one unified solution will keep ours energy/focus on one thing, making
>>> it better and better! Having two just splits the energy; it also
>>> introduces extra burden for maintaining.
>>
>> Of course, I adopt Jiangeng's solution basically.
>> Actually, his solution is almost similar I tried to implement at first.
>>
>> I guess here is pros/cons of 2 solutions.
>>
>> [Jianfeng's solution]
>> - Pros
>> Don't need to invoke QEMU process.
>> - Cons
>> If virtio-net specification is changed, we need to implement it by
>> ourselves. Also, LSC interrupt and control queue functions are not
>> supported yet.
> 
> Jianfeng have made and sent out the patch to enable ctrl queue and
> multiple queue support.

Sorry, I hadn't noticed that ctrl queue has already been enabled.

> 
> For the LSC part, no much idea yet so far. But I'm assuming it will
> not take too much effort, either.
> 
>> I agree both functions may not be so important, and if we need it
>> we can implement them, but we need to pay energy to implement them.
>>
>> [My solution]
>> - Pros
>> Basic principle of my implementation is not to reinvent the wheel.
> 
> Yes, that's a good point. However, it's not that hard as we would have
> thought in the first time: the tough part that dequeue/enqueue packets
> from/to vring is actually offloaded to DPDK vhost-user. That means we
> only need re-implement the control path of virtio-net device, plus the
> vhost-user frontend. If you have a detailed look of your patchset as
> well Jianfeng's, you might find that the two patchset are actually with
> same code size. 

Yes, I know this.
So far, the amount of code is almost the same, but in the future we may need
to implement more if the virtio-net specification is revised.

> 
>> We can use a virtio-net device of QEMU implementation, it means we don't
>> need to maintain virtio-net device by ourselves, and we can use all of
>> functions supported by QEMU virtio-net device.
>> - Cons
>> Need to invoke QEMU process.
> 
> Another thing is that it makes the usage a bit harder: look at the
> long qemu cli options of your example usage. It also has some traps,
> say, "--enable-kvm" is not allowed, which is a default option used
> with QEMU.

Probably some kind of shell script would help the users.

> 
> And judging that we actually don't take too much effort to implement
> a virtio device emulation, I'd prefer it slightly. I guess something
> light weight and easier for use is more important here.

This is a very important point.
If so, we won't need much effort when the virtio spec is changed.

> 
> Actually, I have foreseen another benefit of adding virtio-user device
> emulation: we now might be able to add a rte_vhost_dequeue/enqueue_burst()
> unit test case. We simply can't do it before, since we depend on QEMU
> for testing, which is not acceptable for a unit test case. Making it
> be a unit test case would help us spotting any bad changes that would
> introduce bugs easily and automatically.

As you mentioned above, the QEMU process is not involved in
dequeuing/enqueuing.
So I guess we can have a test for rte_vhost_dequeue/enqueue_burst()
regardless of which solution we choose.

>> Anyway, we can choose one of belows.
>> 1. Take advantage of invoking less processes.
>> 2. Take advantage of maintainability of virtio-net device.

If the container usage that DPDK assumes is to invoke hundreds of
containers on one host, we should take Jianfeng's solution.

Also, if implementing new features and maintaining Jianfeng's
virtio-net device are not so hard, we should take his solution.

I guess this is the point we need to consider.
What do you think?

Thanks,
Tetsuya

>>
>> Honestly, I'm OK if my solution is not merged.
>> Thus, it should be decided to let DPDK better.
>>
>> What do you think?
>> Which is better for DPDK?
>>
>> Thanks,
>> Tetsuya
>>
>>>
>>> --yliu
>>>



[dpdk-dev] [PATCH v5 0/6] Virtio-net PMD: QEMU QTest extension for container

2016-06-06 Thread Yuanhan Liu
On Mon, Jun 06, 2016 at 05:33:31PM +0900, Tetsuya Mukawa wrote:
> >> [My solution]
> >> - Pros
> >> Basic principle of my implementation is not to reinvent the wheel.
> > 
> > Yes, that's a good point. However, it's not that hard as we would have
> > thought in the first time: the tough part that dequeue/enqueue packets
> > from/to vring is actually offloaded to DPDK vhost-user. That means we
> > only need re-implement the control path of virtio-net device, plus the
> > vhost-user frontend. If you have a detailed look of your patchset as
> > well Jianfeng's, you might find that the two patchset are actually with
> > same code size. 
> 
> Yes, I know this.
> So far, the amount of code is almost same, but in the future we may need
> to implement more, if virtio-net specification is revised.

It didn't take too much effort to implement from scratch, and I doubt it
will for future revisions. Also, the virtio-net spec is unlikely to be
revised, or to be precise, unlikely to be revised very often. Therefore,
I don't see big issues here.

> >> We can use a virtio-net device of QEMU implementation, it means we don't
> >> need to maintain virtio-net device by ourselves, and we can use all of
> >> functions supported by QEMU virtio-net device.
> >> - Cons
> >> Need to invoke QEMU process.
> > 
> > Another thing is that it makes the usage a bit harder: look at the
> > long qemu cli options of your example usage. It also has some traps,
> > say, "--enable-kvm" is not allowed, which is a default option used
> > with QEMU.
> 
> Probably a kind of shell script will help the users.

Yeah, that would help. But if we have a choice to make it simpler in the
beginning, why not then? :-)

> > 
> > And judging that we actually don't take too much effort to implement
> > a virtio device emulation, I'd prefer it slightly. I guess something
> > light weight and easier for use is more important here.
> 
> This is very important point.
> If so, we don't need much effort when virtio-spec is changed.

I'd assume so.

> > Actually, I have foreseen another benefit of adding virtio-user device
> > emulation: we now might be able to add a rte_vhost_dequeue/enqueue_burst()
> > unit test case. We simply can't do it before, since we depend on QEMU
> > for testing, which is not acceptable for a unit test case. Making it
> > be a unit test case would help us spotting any bad changes that would
> > introduce bugs easily and automatically.
> 
> As you mentioned above, QEMU process is not related with
> dequeuing/enqueuing.
> So I guess we may have a testing for rte_vhost_dequeue/enqueue_burst()
> regardless of choice.

Yes, we don't need the dequeue/enqueue part, but we need the vhost-user
initialization part from QEMU vhost-user. Now that we have vhost-user
frontend from virtio-user, we have no dependency on QEMU any more.

> >> Anyway, we can choose one of belows.
> >> 1. Take advantage of invoking less processes.
> >> 2. Take advantage of maintainability of virtio-net device.
> 
> If container usage that DPDK assumes is to invoke hundreds containers in
> one host,

I barely know about containers, but I would assume that's not rare.

> we should take Jiangfeng's solution.
> 
> Also, if implementing a new feature and maintaining Jiangfeng's
> virtio-net device are not so hard,

As stated, I would assume so.

--yliu

> we should take his solution.
> 
> I guess this is the point we need to consider.
> What do you think?
> 
> Thanks,
> Tetsuya
> 
> >>
> >> Honestly, I'm OK if my solution is not merged.
> >> Thus, it should be decided to let DPDK better.
> >>
> >> What do you think?
> >> Which is better for DPDK?
> >>
> >> Thanks,
> >> Tetsuya
> >>
> >>>
> >>>   --yliu
> >>>


[dpdk-dev] [PATCH] fm10k: fix VF cannot receive broadcast traffic

2016-06-06 Thread Wang Xiao W
When an application tries to set promisc/allmulti mode, fm10k checks whether
a valid glort has been acquired and, if not, returns without doing anything.
A VF acquires its glort info over a long path: VF-to-PF mailbox, then
PF-to-switch mailbox. This can take a long interval that is outside DPDK's
control, so the application may fail to set promisc/allmulti mode on a VF.
In fact, a VF does not need a valid glort value, so this patch skips the
glort check for VF.

Fixes: df02ba864695 ("fm10k: support promiscuous mode")

Signed-off-by: Wang Xiao W 
---
 drivers/net/fm10k/fm10k_ethdev.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/fm10k/fm10k_ethdev.c b/drivers/net/fm10k/fm10k_ethdev.c
index c2d377f..b3aefdb 100644
--- a/drivers/net/fm10k/fm10k_ethdev.c
+++ b/drivers/net/fm10k/fm10k_ethdev.c
@@ -947,7 +947,7 @@ fm10k_dev_promiscuous_enable(struct rte_eth_dev *dev)
 	PMD_INIT_FUNC_TRACE();

 	/* Return if it didn't acquire valid glort range */
-	if (!fm10k_glort_valid(hw))
+	if ((hw->mac.type == fm10k_mac_pf) && !fm10k_glort_valid(hw))
 		return;

 	fm10k_mbx_lock(hw);
@@ -969,7 +969,7 @@ fm10k_dev_promiscuous_disable(struct rte_eth_dev *dev)
 	PMD_INIT_FUNC_TRACE();

 	/* Return if it didn't acquire valid glort range */
-	if (!fm10k_glort_valid(hw))
+	if ((hw->mac.type == fm10k_mac_pf) && !fm10k_glort_valid(hw))
 		return;

 	if (dev->data->all_multicast == 1)
@@ -995,7 +995,7 @@ fm10k_dev_allmulticast_enable(struct rte_eth_dev *dev)
 	PMD_INIT_FUNC_TRACE();

 	/* Return if it didn't acquire valid glort range */
-	if (!fm10k_glort_valid(hw))
+	if ((hw->mac.type == fm10k_mac_pf) && !fm10k_glort_valid(hw))
 		return;

 	/* If promiscuous mode is enabled, it doesn't make sense to enable
@@ -1026,7 +1026,7 @@ fm10k_dev_allmulticast_disable(struct rte_eth_dev *dev)
 	PMD_INIT_FUNC_TRACE();

 	/* Return if it didn't acquire valid glort range */
-	if (!fm10k_glort_valid(hw))
+	if ((hw->mac.type == fm10k_mac_pf) && !fm10k_glort_valid(hw))
 		return;

 	if (dev->data->promiscuous) {
-- 
1.9.3
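
The guard that the patch above changes can be looked at in isolation. The
following is only a minimal sketch with hypothetical stand-in types
(struct hw_sketch, enum mac_type_sketch and should_return_early() are not
fm10k names); it models the condition "PF without a valid glort returns
early, VF never does", nothing more.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the driver types; not the fm10k definitions. */
enum mac_type_sketch { MAC_PF_SKETCH, MAC_VF_SKETCH };

struct hw_sketch {
	enum mac_type_sketch mac_type;
	bool glort_valid;	/* models the result of fm10k_glort_valid(hw) */
};

/*
 * Returns true when the promisc/allmulti handler should return without
 * touching the hardware. Before the patch this was simply !glort_valid;
 * after the patch only a PF is held back by an invalid glort.
 */
static bool should_return_early(const struct hw_sketch *hw)
{
	return hw->mac_type == MAC_PF_SKETCH && !hw->glort_valid;
}
```

With this condition a VF proceeds to the mailbox request even while its
glort is still unknown, which matches the behaviour described in the
commit message.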



[dpdk-dev] RFC: DPDK Long Term Support

2016-06-06 Thread Thomas Monjalon
2016-06-05 14:15, Neil Horman:
> On Fri, Jun 03, 2016 at 03:07:49PM +, Mcnamara, John wrote:
> > Introduction
> > 
> > 
> > This document sets out a proposal for a DPDK Long Term Support release 
> > (LTS).
> > 
> > The purpose of the DPDK LTS will be to maintain a stable release of DPDK 
> > with
> > backported bug fixes over an extended period of time. This will provide
> > downstream consumers of DPDK with a stable target on which to base
> > applications or packages.
[...]
> I'm not opposed to an LTS release, but it seems to be re-solving the issue of
> ABI breakage.  That is to say, there is already a process in place for managing
> ABI changes to the DPDK, which is designed to help ensure that:
> 
> 1) ABI changes are signaled at least 2 releases early
> 2) ABI changes whenever possible are designed such that backward compatibility
> versions can be encoded at the same time with versioning tags

Sorry I don't understand your point.
We are talking about two different things:
1/ ABI care for each new major release
2/ Minor release for bug fixes

I think both may exist.

> Those two mechanism are expressly intended to allow application upgrades of 
> DPDK
> libraries without worrying about ABI breakage.  While LTS releases are a fine
> approach for  some things, they sacrifice upstream efficiency (by creating 
> work
> for backporting teams), while allowing upstream developers more leverage to 
> just
> create ABI breaking changes on a whim, ignoring the existing ABI compatibility
> mechanism

No it was not stated that upstream developers should ignore ABI compatibility.
Do you mean having a stable branch means ABI preservation for the next major
release is less important?

> LTS is a fine process for projects in which API/ABI breakage is either 
> uncommon
> or fairly isolated, but that in my mind doesn't really describe DPDK.

Yes API/ABI breakages are still common in DPDK.
So it's even more important to have some stable branches.


[dpdk-dev] [PATCH v5 0/6] Virtio-net PMD: QEMU QTest extension for container

2016-06-06 Thread Tetsuya Mukawa
On 2016/06/06 17:03, Tan, Jianfeng wrote:
> Hi,
> 
> 
> On 6/6/2016 1:10 PM, Tetsuya Mukawa wrote:
>> Hi Yuanhan,
>>
>> Sorry for late replying.
>>
>> On 2016/06/03 13:17, Yuanhan Liu wrote:
>>> On Thu, Jun 02, 2016 at 06:30:18PM +0900, Tetsuya Mukawa wrote:
>>>> Hi Yuanhan,
>>>>
>>>> On 2016/06/02 16:31, Yuanhan Liu wrote:
>>>>> But still, I'd ask do we really need 2 virtio for container solutions?
>>>> I appreciate your comments.
>>> No, I appreciate your effort for contributing to DPDK! vhost-pmd stuff
>>> is just brilliant!
>>>
>>>> Let me have time to discuss it with our team.
>>> I'm wondering could we have one solution only. IMO, the drawback of
>>> having two (quite different) solutions might outweighs the benefit
>>> it takes. Say, it might just confuse user.
>> I agree with this.
>> If we have 2 solutions, it would confuse the DPDK users.
>>
>>> OTOH, I'm wondering could you adapt to Jianfeng's solution? If not,
>>> what's the missing parts, and could we fix it? I'm thinking having
>>> one unified solution will keep ours energy/focus on one thing, making
>>> it better and better! Having two just splits the energy; it also
>>> introduces extra burden for maintaining.
>> Of course, I adopt Jiangeng's solution basically.
>> Actually, his solution is almost similar I tried to implement at first.
>>
>> I guess here is pros/cons of 2 solutions.
>>
>> [Jianfeng's solution]
>> - Pros
>> Don't need to invoke QEMU process.
>> - Cons
>> If virtio-net specification is changed, we need to implement it by
>> ourselves.
> 
> It will barely introduce any change when virtio-net specification is
> changed as far as I can see. The only part we care is the how desc,
> avail, used distribute on memory, which is a very small part.

That's good news, because we won't need much effort to follow the latest
virtio-net specification.
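
To make "how desc, avail, used distribute on memory" concrete, here is a
small sketch of the split-ring layout arithmetic as I understand it from
the virtio specification. The struct and function names below are made up
for illustration and are not taken from either patchset; the alignment
value is a parameter because it depends on the transport.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative copies of the ring element layouts (16 and 8 bytes). */
struct desc_sketch {
	uint64_t addr;
	uint32_t len;
	uint16_t flags;
	uint16_t next;
};

struct used_elem_sketch {
	uint32_t id;
	uint32_t len;
};

/* The descriptor table starts at offset 0; avail follows it directly. */
static size_t avail_offset_sketch(unsigned int num)
{
	return sizeof(struct desc_sketch) * num;
}

/*
 * avail holds flags, idx, num ring entries and used_event (all 16-bit);
 * the used ring starts at the next 'align' boundary (align: power of 2).
 */
static size_t used_offset_sketch(unsigned int num, size_t align)
{
	size_t end = avail_offset_sketch(num) + sizeof(uint16_t) * (3 + num);

	return (end + align - 1) & ~(align - 1);
}

/* used holds flags, idx, avail_event (16-bit) plus num used elements. */
static size_t ring_size_sketch(unsigned int num, size_t align)
{
	return used_offset_sketch(num, align) + sizeof(uint16_t) * 3 +
	       sizeof(struct used_elem_sketch) * num;
}
```

For a 256-entry ring with 4096-byte alignment this gives the avail ring
at offset 4096, the used ring at offset 8192 and 10246 bytes in total.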

> 
> It's true that my solution now seriously depend on vhost-user protocol,
> which is defined in QEMU. I cannot see a big problem there so far.
> 
>>   Also, LSC interrupt and control queue functions are not
>> supported yet.
>> I agree both functions may not be so important, and if we need it
>> we can implement them, but we need to pay energy to implement them.
> 
> LSC is really less important than rxq interrupt (IMO). We don't know how
> long will rxq interrupt of virtio be available for QEMU, but we can
> accelerate it if we avoid using QEMU.
> 
> Actually, if the vhost backend is vhost-user (the main use case),
> current qemu have limited control queue support, because it needs the
> support from the vhost user backend.
> 
> Add one more con of my solution:
> - Need to write another logic to support other virtio device (say
> virtio-scsi), if it's easier of Tetsuya's solution to do that?
> 

Probably, my solution will make that easier.
My solution has enough facilities to access the io port and PCI
configuration space of a QEMU virtio-scsi device.
So, if you invoke QEMU with virtio-scsi, all you need to do is
change the PCI interface of the current virtio-scsi PMD.
(I am just assuming that we currently have a virtio-scsi PMD.)
If the virtio-scsi PMD works on QEMU, the same code should work with only
the PCI interface changed.

>>
>> [My solution]
>> - Pros
>> Basic principle of my implementation is not to reinvent the wheel.
>> We can use a virtio-net device of QEMU implementation, it means we don't
>> need to maintain virtio-net device by ourselves, and we can use all of
>> functions supported by QEMU virtio-net device.
>> - Cons
>> Need to invoke QEMU process.
> 
> Two more possible cons:
> a) This solution also needs to maintain qtest utility, right?

But the spec of qtest will be more stable than virtio-net.

> b) There's still address arrange restriction, right? Although we can use
> "--base-virtaddr=0x4" to relieve this question, but how about if
> there are 2 or more devices? (By the way, is there still address arrange
> requirement for 32 bit system)

Our solutions are a virtio-net driver, and a vhost-user backend driver
needs to access the memory allocated by the virtio-net driver.
If an application has 2 devices, it means 2 vhost-user backend PMDs need
to access the same application memory, right?
Also, currently each virtio-net device has one QEMU process.
So, I am not sure what the problem will be if we have 2 devices.

BTW, the 44-bit limitation comes from the current QEMU implementation itself.
(Actually, if a modern virtio device is used, we should be able to remove
the restriction.)
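
One plausible reading of the 44-bit limit, stated here as an assumption
and not something confirmed in this thread: the legacy virtio-pci
interface programs the queue address as a 32-bit page frame number in
units of 4096-byte pages, so only addresses below 2^44 can be expressed.
A sketch of that check, with made-up names:

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: legacy queue address = 32-bit PFN of 4096-byte pages. */
#define QUEUE_ADDR_SHIFT_SKETCH 12

/*
 * Returns true if 'addr' is page aligned and its page frame number
 * fits in 32 bits, i.e. the address lies below 2^44.
 */
static bool fits_legacy_queue_pfn(uint64_t addr)
{
	if (addr & ((UINT64_C(1) << QUEUE_ADDR_SHIFT_SKETCH) - 1))
		return false;

	return (addr >> QUEUE_ADDR_SHIFT_SKETCH) <= UINT32_MAX;
}
```

Under this assumption, a base virtual address chosen for the application
has to keep the vring memory inside that range.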

> c) Actually, IMO this solution is sensitive to any virtio spec change
> (io port, pci configuration space).

In this case, the virtio-net PMD itself will need to be fixed.
Then, my implementation will also be fixed in the same way.
The current implementation has only the PCI abstraction that Yuanhan
introduced, so you may think my solution depends on the above things, but
actually, my implementation depends only on how to access the io port and
PCI configuration space. This is what "qtest.h" provides.

Thanks,
Tetsuya

> 
>>
>>

[dpdk-dev] [PATCH v5 0/6] Virtio-net PMD: QEMU QTest extension for container

2016-06-06 Thread Tetsuya Mukawa
On 2016/06/06 17:49, Yuanhan Liu wrote:
> On Mon, Jun 06, 2016 at 05:33:31PM +0900, Tetsuya Mukawa wrote:
>>>> [My solution]
>>>> - Pros
>>>> Basic principle of my implementation is not to reinvent the wheel.
>>>
>>> Yes, that's a good point. However, it's not that hard as we would have
>>> thought in the first time: the tough part that dequeue/enqueue packets
>>> from/to vring is actually offloaded to DPDK vhost-user. That means we
>>> only need re-implement the control path of virtio-net device, plus the
>>> vhost-user frontend. If you have a detailed look of your patchset as
>>> well Jianfeng's, you might find that the two patchset are actually with
>>> same code size. 
>>
>> Yes, I know this.
>> So far, the amount of code is almost same, but in the future we may need
>> to implement more, if virtio-net specification is revised.
> 
> It didn't take too much effort to implement from scratch, I doubt it
> will for future revise. And, virtio-net spec is unlikely revised, or
> to be precisely, unlikely revised quite often. Therefore, I don't see
> big issues here.
> 
>>>> We can use a virtio-net device of QEMU implementation, it means we don't
>>>> need to maintain virtio-net device by ourselves, and we can use all of
>>>> functions supported by QEMU virtio-net device.
>>>> - Cons
>>>> Need to invoke QEMU process.
>>>
>>> Another thing is that it makes the usage a bit harder: look at the
>>> long qemu cli options of your example usage. It also has some traps,
>>> say, "--enable-kvm" is not allowed, which is a default option used
>>> with QEMU.
>>
>> Probably a kind of shell script will help the users.
> 
> Yeah, that would help. But if we have a choice to make it simpler in the
> beginning, why not then? :-)
> 
>>>
>>> And judging that we actually don't take too much effort to implement
>>> a virtio device emulation, I'd prefer it slightly. I guess something
>>> light weight and easier for use is more important here.
>>
>> This is very important point.
>> If so, we don't need much effort when virtio-spec is changed.
> 
> I'd assume so.
> 
>>> Actually, I have foreseen another benefit of adding virtio-user device
>>> emulation: we now might be able to add a rte_vhost_dequeue/enqueue_burst()
>>> unit test case. We simply can't do it before, since we depend on QEMU
>>> for testing, which is not acceptable for a unit test case. Making it
>>> be a unit test case would help us spotting any bad changes that would
>>> introduce bugs easily and automatically.
>>
>> As you mentioned above, QEMU process is not related with
>> dequeuing/enqueuing.
>> So I guess we may have a testing for rte_vhost_dequeue/enqueue_burst()
>> regardless of choice.
> 
> Yes, we don't need the dequeue/enqueue part, but we need the vhost-user
> initialization part from QEMU vhost-user. Now that we have vhost-user
> frontend from virtio-user, we have no dependency on QEMU any more.
> 
>>>> Anyway, we can choose one of belows.
>>>> 1. Take advantage of invoking less processes.
>>>> 2. Take advantage of maintainability of virtio-net device.
>>
>> If container usage that DPDK assumes is to invoke hundreds containers in
>> one host,
> 
> I barely know about container, but I would assume that's not rare.

Hi Yuanhan,

It's great to hear it's not so hard to maintain Jianfeng's virtio-net
device features.

Please let me confirm how we can invoke many DPDK applications in
hundreds of containers.
(Do we have a way to do that? Or will we have one in the future?)

Thanks,
Tetsuya

> 
>> we should take Jiangfeng's solution.
>>
>> Also, if implementing a new feature and maintaining Jiangfeng's
>> virtio-net device are not so hard,
> 
> As stated, I would assume so.




> 
>   --yliu
> 
>> we should take his solution.
>>
>> I guess this is the point we need to consider.
>> What do you think?
>>
>> Thanks,
>> Tetsuya
>>

>>>> Honestly, I'm OK if my solution is not merged.
>>>> Thus, it should be decided to let DPDK better.
>>>>
>>>> What do you think?
>>>> Which is better for DPDK?
>>>>
>>>> Thanks,
>>>> Tetsuya
>>>>
>>>>>
>>>>>   --yliu
>>>>>



[dpdk-dev] [PATCH v5 0/6] Virtio-net PMD: QEMU QTest extension for container

2016-06-06 Thread Yuanhan Liu
On Mon, Jun 06, 2016 at 06:30:00PM +0900, Tetsuya Mukawa wrote:
> On 2016/06/06 17:49, Yuanhan Liu wrote:
> > On Mon, Jun 06, 2016 at 05:33:31PM +0900, Tetsuya Mukawa wrote:
> >>>> [My solution]
> >>>> - Pros
> >>>> Basic principle of my implementation is not to reinvent the wheel.
> >>>
> >>> Yes, that's a good point. However, it's not that hard as we would have
> >>> thought in the first time: the tough part that dequeue/enqueue packets
> >>> from/to vring is actually offloaded to DPDK vhost-user. That means we
> >>> only need re-implement the control path of virtio-net device, plus the
> >>> vhost-user frontend. If you have a detailed look of your patchset as
> >>> well Jianfeng's, you might find that the two patchset are actually with
> >>> same code size. 
> >>
> >> Yes, I know this.
> >> So far, the amount of code is almost same, but in the future we may need
> >> to implement more, if virtio-net specification is revised.
> > 
> > It didn't take too much effort to implement from scratch, I doubt it
> > will for future revise. And, virtio-net spec is unlikely revised, or
> > to be precisely, unlikely revised quite often. Therefore, I don't see
> > big issues here.
> > 
> >>>> We can use a virtio-net device of QEMU implementation, it means we don't
> >>>> need to maintain virtio-net device by ourselves, and we can use all of
> >>>> functions supported by QEMU virtio-net device.
> >>>> - Cons
> >>>> Need to invoke QEMU process.
> >>>
> >>> Another thing is that it makes the usage a bit harder: look at the
> >>> long qemu cli options of your example usage. It also has some traps,
> >>> say, "--enable-kvm" is not allowed, which is a default option used
> >>> with QEMU.
> >>
> >> Probably a kind of shell script will help the users.
> > 
> > Yeah, that would help. But if we have a choice to make it simpler in the
> > beginning, why not then? :-)
> > 
> >>>
> >>> And judging that we actually don't take too much effort to implement
> >>> a virtio device emulation, I'd prefer it slightly. I guess something
> >>> light weight and easier for use is more important here.
> >>
> >> This is very important point.
> >> If so, we don't need much effort when virtio-spec is changed.
> > 
> > I'd assume so.
> > 
> >>> Actually, I have foreseen another benefit of adding virtio-user device
> >>> emulation: we now might be able to add a rte_vhost_dequeue/enqueue_burst()
> >>> unit test case. We simply can't do it before, since we depend on QEMU
> >>> for testing, which is not acceptable for a unit test case. Making it
> >>> be a unit test case would help us spotting any bad changes that would
> >>> introduce bugs easily and automatically.
> >>
> >> As you mentioned above, QEMU process is not related with
> >> dequeuing/enqueuing.
> >> So I guess we may have a testing for rte_vhost_dequeue/enqueue_burst()
> >> regardless of choice.
> > 
> > Yes, we don't need the dequeue/enqueue part, but we need the vhost-user
> > initialization part from QEMU vhost-user. Now that we have vhost-user
> > frontend from virtio-user, we have no dependency on QEMU any more.
> > 
> >>>> Anyway, we can choose one of belows.
> >>>> 1. Take advantage of invoking less processes.
> >>>> 2. Take advantage of maintainability of virtio-net device.
> >>
> >> If container usage that DPDK assumes is to invoke hundreds containers in
> >> one host,
> > 
> > I barely know about container, but I would assume that's not rare.
> 
> Hi Yuanhan,
> 
> It's great to hear it's not so hard to maintain Jiangfeng's virtio-net
> device features.
> 
> Please let me make sure how we can invoke many DPDK applications in
> hundreds containers.
> (Do we have a way to do? Or, will we have it in the future?)

One thing that I have thought of is that we should remove the huge page
dependency of the current usage: huge pages would be a very limited resource.

Note that I don't mean to remove support for huge pages; DPDK supports
them by default and supports them well, after all. What I mean is to make
it work for the non-hugepage cases as well, so that it could fit
the hundreds-of-containers case.

--yliu


[dpdk-dev] [PATCH v5 0/6] Virtio-net PMD: QEMU QTest extension for container

2016-06-06 Thread Tan, Jianfeng
Hi,

On 6/6/2016 5:28 PM, Tetsuya Mukawa wrote:
> On 2016/06/06 17:03, Tan, Jianfeng wrote:
>> Hi,
>>
>>
>> On 6/6/2016 1:10 PM, Tetsuya Mukawa wrote:
>>> Hi Yuanhan,
>>>
>>> Sorry for late replying.
>>>
>>> On 2016/06/03 13:17, Yuanhan Liu wrote:
>>>> On Thu, Jun 02, 2016 at 06:30:18PM +0900, Tetsuya Mukawa wrote:
>>>>> Hi Yuanhan,
>>>>>
>>>>> On 2016/06/02 16:31, Yuanhan Liu wrote:
>>>>>> But still, I'd ask do we really need 2 virtio for container solutions?
>>>>> I appreciate your comments.
>>>> No, I appreciate your effort for contributing to DPDK! vhost-pmd stuff
>>>> is just brilliant!
>>>>
>>>>> Let me have time to discuss it with our team.
>>>> I'm wondering could we have one solution only. IMO, the drawback of
>>>> having two (quite different) solutions might outweighs the benefit
>>>> it takes. Say, it might just confuse user.
>>> I agree with this.
>>> If we have 2 solutions, it would confuse the DPDK users.
>>>
>>>> OTOH, I'm wondering could you adapt to Jianfeng's solution? If not,
>>>> what's the missing parts, and could we fix it? I'm thinking having
>>>> one unified solution will keep ours energy/focus on one thing, making
>>>> it better and better! Having two just splits the energy; it also
>>>> introduces extra burden for maintaining.
>>> Of course, I adopt Jiangeng's solution basically.
>>> Actually, his solution is almost similar I tried to implement at first.
>>>
>>> I guess here is pros/cons of 2 solutions.
>>>
>>> [Jianfeng's solution]
>>> - Pros
>>> Don't need to invoke QEMU process.
>>> - Cons
>>> If virtio-net specification is changed, we need to implement it by
>>> ourselves.
>> It will barely introduce any change when virtio-net specification is
>> changed as far as I can see. The only part we care is the how desc,
>> avail, used distribute on memory, which is a very small part.
> It's a good news, because we don't pay much effort to follow latest
> virtio-net specification.
>
>> It's true that my solution now seriously depend on vhost-user protocol,
>> which is defined in QEMU. I cannot see a big problem there so far.
>>
>>>Also, LSC interrupt and control queue functions are not
>>> supported yet.
>>> I agree both functions may not be so important, and if we need it
>>> we can implement them, but we need to pay energy to implement them.
>> LSC is really less important than rxq interrupt (IMO). We don't know how
>> long will rxq interrupt of virtio be available for QEMU, but we can
>> accelerate it if we avoid using QEMU.
>>
>> Actually, if the vhost backend is vhost-user (the main use case),
>> current qemu have limited control queue support, because it needs the
>> support from the vhost user backend.
>>
>> Add one more con of my solution:
>> - Need to write another logic to support other virtio device (say
>> virtio-scsi), if it's easier of Tetsuya's solution to do that?
>>
> Probably, my solution will be easier to do that.
> My solution has enough facility to access to io port and PCI
> configuration space of virtio-scsi device of QEMU.
> So, if you invoke with QEMU with virtio-scsi, only you need to do is
> changing PCI interface of current virtio-scsi PMD.
> (I just assume currently we have virtio-scsi PMD.)
> If the virtio-scsi PMD works on QEMU, same code should work with only
> changing PCI interface.
>
>>> [My solution]
>>> - Pros
>>> Basic principle of my implementation is not to reinvent the wheel.
>>> We can use a virtio-net device of QEMU implementation, it means we don't
>>> need to maintain virtio-net device by ourselves, and we can use all of
>>> functions supported by QEMU virtio-net device.
>>> - Cons
>>> Need to invoke QEMU process.
>> Two more possible cons:
>> a) This solution also needs to maintain qtest utility, right?
> But the spec of qtest will be more stable than virtio-net.
>
>> b) There's still address arrange restriction, right? Although we can use
>> "--base-virtaddr=0x4" to relieve this question, but how about if
>> there are 2 or more devices? (By the way, is there still address arrange
>> requirement for 32 bit system)
> Our solutions are a virtio-net driver, and a vhost-user backend driver
> needs to access to memory allocated by virtio-net driver.
> If an application has 2 devices, it means 2 vhost-user backend PMD needs
> to access to the same application memory, right?
> Also, currently each virtio-net device has an one QEMU process.
> So, I am not sure what will be problem if we have 2 devices.

OK, my bad. Multiple devices should have just one 
"--base-virtaddr=0x4".

>
> BTW, 44bits limitations comes from current QEMU implementation itself.
> (Actually, if modern virtio device is used, we should be able to remove
> the restriction.)

Good to know.

>
>> c) Actually, IMO this solution is sensitive to any virtio spec change
>> (io port, pci configuration space).
> In this case, virtio-net PMD itself will need to be fixed.
> Then, my implementation will be also fixed with the same way.
> Current implementation has only P

[dpdk-dev] [PATCH v5 0/6] Virtio-net PMD: QEMU QTest extension for container

2016-06-06 Thread Tan, Jianfeng
Hi,


On 6/6/2016 5:30 PM, Tetsuya Mukawa wrote:
> On 2016/06/06 17:49, Yuanhan Liu wrote:
>> On Mon, Jun 06, 2016 at 05:33:31PM +0900, Tetsuya Mukawa wrote:
>>>>> [My solution]
>>>>> - Pros
>>>>> Basic principle of my implementation is not to reinvent the wheel.
>>>> Yes, that's a good point. However, it's not that hard as we would have
>>>> thought in the first time: the tough part that dequeue/enqueue packets
>>>> from/to vring is actually offloaded to DPDK vhost-user. That means we
>>>> only need re-implement the control path of virtio-net device, plus the
>>>> vhost-user frontend. If you have a detailed look of your patchset as
>>>> well Jianfeng's, you might find that the two patchset are actually with
>>>> same code size.
>>> Yes, I know this.
>>> So far, the amount of code is almost same, but in the future we may need
>>> to implement more, if virtio-net specification is revised.
>> It didn't take too much effort to implement from scratch, I doubt it
>> will for future revise. And, virtio-net spec is unlikely revised, or
>> to be precisely, unlikely revised quite often. Therefore, I don't see
>> big issues here.
>>
>>>>> We can use a virtio-net device of QEMU implementation, it means we don't
>>>>> need to maintain virtio-net device by ourselves, and we can use all of
>>>>> functions supported by QEMU virtio-net device.
>>>>> - Cons
>>>>> Need to invoke QEMU process.
>>>> Another thing is that it makes the usage a bit harder: look at the
>>>> long qemu cli options of your example usage. It also has some traps,
>>>> say, "--enable-kvm" is not allowed, which is a default option used
>>>> with QEMU.
>>> Probably a kind of shell script will help the users.
>> Yeah, that would help. But if we have a choice to make it simpler in the
>> beginning, why not then? :-)
>>
>>>> And judging that we actually don't take too much effort to implement
>>>> a virtio device emulation, I'd prefer it slightly. I guess something
>>>> light weight and easier for use is more important here.
>>> This is very important point.
>>> If so, we don't need much effort when virtio-spec is changed.
>> I'd assume so.
>>
>>>> Actually, I have foreseen another benefit of adding virtio-user device
>>>> emulation: we now might be able to add a rte_vhost_dequeue/enqueue_burst()
>>>> unit test case. We simply can't do it before, since we depend on QEMU
>>>> for testing, which is not acceptable for a unit test case. Making it
>>>> be a unit test case would help us spotting any bad changes that would
>>>> introduce bugs easily and automatically.
>>> As you mentioned above, QEMU process is not related with
>>> dequeuing/enqueuing.
>>> So I guess we may have a testing for rte_vhost_dequeue/enqueue_burst()
>>> regardless of choice.
>> Yes, we don't need the dequeue/enqueue part, but we need the vhost-user
>> initialization part from QEMU vhost-user. Now that we have vhost-user
>> frontend from virtio-user, we have no dependency on QEMU any more.
>>
>>>>> Anyway, we can choose one of belows.
>>>>> 1. Take advantage of invoking less processes.
>>>>> 2. Take advantage of maintainability of virtio-net device.
>>> If container usage that DPDK assumes is to invoke hundreds containers in
>>> one host,
>> I barely know about container, but I would assume that's not rare.
> Hi Yuanhan,
>
> It's great to hear it's not so hard to maintain Jiangfeng's virtio-net
> device features.
>
> Please let me make sure how we can invoke many DPDK applications in
> hundreds containers.
> (Do we have a way to do? Or, will we have it in the future?)

Just to add some options here: we cannot say no to that kind of use case.
To have many instances, we can:

(1) add a "cpu share" restriction on each instance, relying on the kernel
to schedule.
(2) enable interrupt mode, so that an instance can go to sleep when it
has no pkts to receive and be woken by the vhost backend when pkts come.

Option 2 is my choice.
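
To illustrate the sleep/wake idea behind option 2, here is a minimal
Linux-only sketch built on eventfd(2) and poll(2). It is not the virtio
PMD interrupt code; kick_sketch() and wait_for_kick_sketch() are made-up
names, and the eventfd is assumed to be created by the caller with
eventfd(0, EFD_NONBLOCK) and shared between the two sides.

```c
#include <poll.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Backend side: signal that packets are available. Returns 0 on success. */
static int kick_sketch(int efd)
{
	uint64_t one = 1;

	return write(efd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/*
 * Receiver side: block without spinning until a kick arrives or the
 * timeout expires. Returns the number of kicks consumed, 0 on timeout,
 * or -1 on error.
 */
static int64_t wait_for_kick_sketch(int efd, int timeout_ms)
{
	struct pollfd pfd = { .fd = efd, .events = POLLIN };
	uint64_t count;
	int rc;

	rc = poll(&pfd, 1, timeout_ms);
	if (rc < 0)
		return -1;
	if (rc == 0)
		return 0;	/* nothing arrived: the caller keeps sleeping */

	/* Reading an eventfd returns the accumulated count and resets it. */
	if (read(efd, &count, sizeof(count)) != (ssize_t)sizeof(count))
		return -1;

	return (int64_t)count;
}
```

An idle instance waiting this way consumes no CPU, which is what makes
running many instances on one host more realistic than busy polling in
every one of them.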

Thanks,
Jianfeng

>
> Thanks,
> Tetsuya




[dpdk-dev] RFC: DPDK Long Term Support

2016-06-06 Thread Yuanhan Liu
On Fri, Jun 03, 2016 at 06:05:15PM +0200, Thomas Monjalon wrote:
> Hi,
> 
> 2016-06-03 15:07, Mcnamara, John:
> > Introduction
> > 
> > 
> > This document sets out a proposal for a DPDK Long Term Support release 
> > (LTS).
> 
> In general, LTS refers to a longer maintenance period than a regular one.
> Here we are talking about doing some maintenance as stable releases first.
> Currently we have no maintenance at all.
> So I suggest to differentiate "stable branches" and "LTS" for some stable 
> branches.
> 
> > The purpose of the DPDK LTS will be to maintain a stable release of DPDK 
> > with
> > backported bug fixes over an extended period of time. This will provide
> > downstream consumers of DPDK with a stable target on which to base
> > applications or packages.
> [...]
> > The proposed maintainer for the LTS is Yuanhan Liu
> > .
> 
> I wonder if Yuanhan is OK to maintain every stable releases which could be
> requested/needed?

I'm okay with that, since I assume the maintenance effort would be small:
mainly picking acked and tested *bug fix* patches.

> Or should we have other committers for the stable releases
> that Yuanhan would not want to maintain himself?
> The Linux model is to let people declare themselves when they want to maintain
> a stable branch.

I have no objection, though, if somebody volunteers as a stable branch
maintainer.

> 
> > The proposed duration of the LTS support is 2 years.
> 
> I think we should discuss the support duration for each release separately.
> 
> > There will only be one LTS branch being maintained at any time. At the end 
> > of
> > the 2 year cycle the maintenance on the previous LTS will be wound down.
> 
> Seems a bit too restrictive.
> Currently, there is no maintenance at all because nobody volunteered.
> If Yuanhan volunteers for a stable branch every 2 years, fine.
> If someone else volunteers for other branches, why not let him do it?
> 
> > The proposed initial LTS version will be DPDK 16.07. The next versions, 
> > based
> > on a 2 year cycle, will be DPDK 18.08, 20.08, etc.
> 
> Let's do a first run with 16.07 and see later what we want to do next.
> How long before its initial release must a stable branch be announced?
> 
> > What changes should be backported
> > ---------------------------------
> > 
> > * Bug fixes that don't break the ABI.
> 
> And API?
> And behaviour (if not clearly documented in the API)?

Agreed, we should not include those changes, either.

> 
> [...]
> > Developers submitting fixes to the mainline should also CC the maintainer so
> > that they can evaluate the patch. A  email address 
> > could be
> > provided for this so that it can be included as a CC in the commit messages
> > and documented in the Code Contribution Guidelines.
> 
> Why?
> We must avoid putting too much restrictions on the contributors.

This was actually requested by me, following the practice of the Linux
kernel community. Here is the thing: the developer normally knows
better than a generic maintainer (assume it's me) whether a patch
applies to a stable branch or not. This is especially true for DPDK,
since we ask the developer to note down the commit that introduced the
bug by adding a "Fixes:" line.

It wouldn't be a burden for an active contributor, as CCing the related
people (including the right mailing list) is a good habit they already
have.  For one-time contributors, it's okay if they don't know about
the rule or don't follow it.

In such cases, I guess we need help from the related subsystem
maintainer: if it's a good bug fix that applies to a stable branch,
and the contributor forgot to add an explicit cc to the stable mailing
list, the subsystem maintainer should forward it or ask him to forward
it to the stable mailing list.

The reason I'm asking is that, as a generic maintainer, I simply
don't have the energy to keep an eye on all patches: keep in mind
that we get thousands of emails per month on the dpdk dev mailing
list; last month's count was 1808.

Doing so would make it possible for one person to maintain several
stable trees.

For more info, you could check linux/Documentation/stable_kernel_rules.txt.

> 
> > Intel will provide validation engineers to test the LTS branch/tree. Tested
> > releases can be marked using a Git tag with an incremented revision number. 
> > For
> > example: 16.07.00_LTS -> 16.07.01_LTS. The testing cadence should be 
> > quarterly
> > but will be best effort only and dependent on available resources.
> 
> Thanks
> It must not be just a tag. There should be an announce and a tarball ready
> to download.

Agreed.

--yliu


[dpdk-dev] [PATCH v3 00/10] Remove string operations from xstats

2016-06-06 Thread David Harton (dharton)
Acked-by: David Harton 

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Remy Horton
> Sent: Monday, May 30, 2016 6:48 AM
> To: dev at dpdk.org; Thomas Monjalon ; Helin 
> Zhang
> ; Wenzhuo Lu ; Jing Chen
> ; Huawei Xie 
> Subject: [dpdk-dev] [PATCH v3 00/10] Remove string operations from xstats
> 
> The current extended ethernet statistics fetching involve doing several
> string operations, which causes performance issues if there are lots of
> statistics and/or network interfaces. This patchset changes the API for
> xstats to use integer identifiers instead of strings and implements
> this new API for the ixgbe, i40e, e1000, fm10k, and virtio drivers.
> 
> --
> 
> v3 changes:
> * Corrected ixgbe vf xstats fetching
> * Added xstats changes to e1000, f10k, and virtio drivers
> * Added cleanup patch that removes now-redundant name field
> * Removed ethtool xstats command
> * Removed unused .xstats_count from eth-dev_ops
> * Changed test-pmd & proc_info to use new API
> * Added documentation update
> * Added missing changes to .map file (affected shared lib builds)
> 
> v2 changes:
> * Fetching xstats count now seperate API function
> * Added #define constants for some magic numbers
> * Fixed bug with virtual function count fetching
> * For non-xstats-supporting drivers, queue stats returned
> * Some refactoring/cleanups
> * Removed index assumption from example
> 
> Remy Horton (10):
>   rte: change xstats to use integer ids
>   drivers/net/ixgbe: change xstats to use integer ids
>   drivers/net/e1000: change xstats to use integer ids
>   drivers/net/fm10k: change xstats to use integer ids
>   drivers/net/i40e: change xstats to use integer ids
>   drivers/net/virtio: change xstats to use integer ids
>   app/test-pmd: change xstats to use integer ids
>   app/proc_info: change xstats to use integer ids
>   remove name field from struct rte_eth_xstats
>   doc: update xstats documentation
> 
>  app/proc_info/main.c| 26 -
>  app/test-pmd/config.c   | 52 +
>  doc/guides/prog_guide/poll_mode_drv.rst | 25 +++--
>  drivers/net/e1000/igb_ethdev.c  | 50 +++--
>  drivers/net/fm10k/fm10k_ethdev.c| 52 ++---
>  drivers/net/i40e/i40e_ethdev.c  | 77 +-
>  drivers/net/i40e/i40e_ethdev_vf.c   | 24 +++-
>  drivers/net/ixgbe/ixgbe_ethdev.c| 98
> -
>  drivers/net/virtio/virtio_ethdev.c  | 60 +---
>  lib/librte_ether/rte_ethdev.c   | 92
> ---
>  lib/librte_ether/rte_ethdev.h   | 44 ++-
>  lib/librte_ether/rte_ether_version.map  |  7 +++
>  12 files changed, 527 insertions(+), 80 deletions(-)
> 
> --
> 2.5.5
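
The idea in the cover letter above (resolve a statistic's name to an
integer id once, then fetch values by id with no string handling) can be
sketched as follows. The types and function names here are hypothetical
and only illustrate the technique; they are not the rte_ethdev
definitions from the patchset.

```c
#include <stdint.h>
#include <string.h>

#define XSTAT_NAME_SIZE 64

/* Hypothetical types for illustration; not the rte_ethdev definitions. */
struct xstat_name {
	char name[XSTAT_NAME_SIZE];
};

struct xstat_value {
	uint64_t id;	/* index into the name table */
	uint64_t value;
};

/* Slow path, run once at setup: resolve a name to its integer id. */
static int
xstat_lookup_id(const struct xstat_name *names, unsigned int n,
		const char *wanted, uint64_t *id)
{
	unsigned int i;

	for (i = 0; i < n; i++) {
		if (strcmp(names[i].name, wanted) == 0) {
			*id = i;
			return 0;
		}
	}
	return -1;
}

/* Fast path, run per poll: no string handling, just an id comparison. */
static int
xstat_get_by_id(const struct xstat_value *vals, unsigned int n,
		uint64_t id, uint64_t *value)
{
	unsigned int i;

	for (i = 0; i < n; i++) {
		if (vals[i].id == id) {
			*value = vals[i].value;
			return 0;
		}
	}
	return -1;
}
```

The string comparison cost is paid once per statistic of interest instead
of once per statistic per poll, which is where the saving comes from when
there are many statistics or many ports.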



[dpdk-dev] RFC: DPDK Long Term Support

2016-06-06 Thread Thomas Monjalon
2016-06-06 19:49, Yuanhan Liu:
> On Fri, Jun 03, 2016 at 06:05:15PM +0200, Thomas Monjalon wrote:
> > 2016-06-03 15:07, Mcnamara, John:
> > > Developers submitting fixes to the mainline should also CC the maintainer 
> > > so
> > > that they can evaluate the patch. A  email address 
> > > could be
> > > provided for this so that it can be included as a CC in the commit 
> > > messages
> > > and documented in the Code Contribution Guidelines.
> > 
> > Why?
> > We must avoid putting too much restrictions on the contributors.
> 
> This is actually requested by me, in a behaviour similar to Linux
> kernel community takes. Here is the thing, the developer normally
> knows better than a generic maintainer (assume it's me) that a patch
> applies to stable branch or not. This is especially true for DPDK,
> since we ask the developer to note down the bug commit by adding a
> fix line.
> 
> It wouldn't be a burden for an active contributor, as CCing to related
> people (including right mailing list) is a good habit they already
> have.  For some one-time contributors, it's okay that they don't know
> and follow it.
> 
> In such case, I guess we need the help from the related subsystem
> maintainer: if it's a good bug fix that applies to stable branch,
> and the contributor forgot to make a explicit cc to stable mailing
> list, the subsystem maintainer should forward or ask him to forward
> to stable mailing list.
> 
> The reason I'm asking is that as a generic maintainer, there is
> simply no such energy to keep an eye on all patches: you have to
> be aware that we have thousands of emails per month from the dpdk dev
> mailing list: the number for last month is 1808.
> 
> Doing so would make it possible for one person to maintain several
> stable trees.
> 
> For more info, you could check linux/Documentation/stable_kernel_rules.txt.

Makes sense to CC stable at dpdk.org list (must be created).

Why put a CC tag in the commit? For automatic processing?
Maybe it is too early to run before walking ;)


[dpdk-dev] RFC: DPDK Long Term Support

2016-06-06 Thread Nirmoy Das

> LTS Version
> 
> 
> The proposed initial LTS version will be DPDK 16.07. The next versions, based
> on a 2 year cycle, will be DPDK 18.08, 20.08, etc.

Hi,

I can see 16.07's release due date is 18th July. Is it possible to know
the timeline for RC versions of dpdk-16.07 ? This might be helpful for
SUSE to decide the supported product(SLE12 SP*/Leap) for dpdk-lts.

regards,
Nirmoy
-- 
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton HRB
21284 (AG Nürnberg) Maxfeldstr. 5
D-90409 Nürnberg / Phone: +49-911-740 18-4


[dpdk-dev] RFC: DPDK Long Term Support

2016-06-06 Thread Neil Horman
On Mon, Jun 06, 2016 at 11:27:29AM +0200, Thomas Monjalon wrote:
> 2016-06-05 14:15, Neil Horman:
> > On Fri, Jun 03, 2016 at 03:07:49PM +, Mcnamara, John wrote:
> > > Introduction
> > > 
> > > 
> > > This document sets out a proposal for a DPDK Long Term Support release 
> > > (LTS).
> > > 
> > > The purpose of the DPDK LTS will be to maintain a stable release of DPDK 
> > > with
> > > backported bug fixes over an extended period of time. This will provide
> > > downstream consumers of DPDK with a stable target on which to base
> > > applications or packages.
> [...]
> > I'm not opposed to an LTS release, but it seems to be re-solving the issue 
> > of
> > ABI breakage.  That is to say, there is alreay a process in place for 
> > managing
> > ABI changes to the DPDK, which is designed to help ensure that:
> > 
> > 1) ABI changes are signaled at least 2 releases early
> > 2) ABI changes whenever possible are designed such that backward 
> > compatibility
> > versions can be encoded at the same time with versioning tags
> 
> Sorry I don't understand your point.
> We are talking about two different things:
> 1/ ABI care for each new major release
> 2/ Minor release for bug fixes
> 
> I think both may exist.
> 
Sure, they can exist together (they being both an ABI backwards compatible HEAD
and a set of LTS releases).  The point I'm trying to make is that if you do your
ABI compatible HEAD well enough, you don't really need an LTS release.

That's not to say that you can't do both, but an LTS release is a significant
workload item, especially given the rapid pace of change in HEAD.  The longer
you maintain an LTS release, the more difficult "minor" bugfixes are to
integrate, especially if you wind up skipping any ABI breaking patches.  I think
it's worth calling attention to that as this approach gets considered.

> > Those two mechanism are expressly intended to allow application upgrades of 
> > DPDK
> > libraries without worrying about ABI breakage.  While LTS releases are a 
> > fine
> > approach for  some things, they sacrifice upstream efficiency (by creating 
> > work
> > for backporting teams), while allowing upstream developers more leverage to 
> > just
> > create ABI breaking changes on a whim, ignoring the existing ABI 
> > compatibility
> > mechanism
> 
> No it was not stated that upstream developers should ignore ABI compatibility.
> Do you mean having a stable branch means ABI preservation for the next major
> release is less important?
> 
I never stated that developers should ignore ABI compatibility, I stated that
creating an LTS release will make it that much easier for developers to do so.

And I think, pragmatically speaking, that is a concern, given that the
existence of an LTS release will make it tempting for developers to simply
follow the deprecation process rather than try to create ABI backward compatible
paths.

Looking at the git history, it seems clear to me that this is already happening.
I'm able to find a multitude of instances in which the deprecation process has
been followed reasonably well, but I can find no instances in which any efforts
have been made for backward compatibility.

> > LTS is a fine process for projects in which API/ABI breakage is either 
> > uncommon
> > or fairly isolated, but that in my mind doesn't really describe DPDK.
> 
> Yes API/ABI breakages are still common in DPDK.
> So it's even more important to have some stable branches.

We seem to be coming to different conclusions based on the same evidence. We
agree that API/ABI changes continue to be frequent occurrences, but my position is
that we already have a process in place to mitigate that, which is simply not
being used (i.e. versioning symbols to provide backward compatible paths),
whereas you seem to be asserting that an LTS model will allow for ABI stability
and bug fixes.

While I don't disagree with that statement (LTS does provide both of those
things if the maintainer does it properly), I'm forced to ask the question:
before we solve this problem in a new way, let's ask why the existing way isn't
being used.  Do developers just not care about backwards compatibility?  Is the
process too hard?  Something else?  I really don't like the idea of abandoning
what currently exists to replace it with something else, without first
addressing why what we have isn't working.
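
For readers unfamiliar with the "backward compatible path" referred to
above, a minimal sketch follows. It assumes a hypothetical library
function table_create() whose argument list changed; the names and
signatures are invented for illustration. In DPDK the old and new
implementations would additionally be bound to versioned symbols via the
macros in rte_compat.h and the library's version map, which this sketch
leaves out.

```c
#include <stdint.h>

/* Hypothetical library function, for illustration only. */

/* Old ABI: callers pass a 16-bit table size. */
int table_create_v1(uint16_t size);

/* New ABI: the size was widened to 32 bits and a flags argument added. */
int table_create_v2(uint32_t size, uint32_t flags);

int
table_create_v2(uint32_t size, uint32_t flags)
{
	if (size == 0 || (flags & ~0x1u) != 0)
		return -1;	/* reject empty tables and unknown flags */
	return 0;
}

/*
 * Backward-compatible path: the old entry point stays in the library as a
 * thin shim that maps the old arguments onto the new implementation.
 */
int
table_create_v1(uint16_t size)
{
	return table_create_v2(size, 0);
}
```

Applications built against the old signature keep working without a
rebuild, at the cost of carrying the shim until the old version is
formally deprecated and removed.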

Neil

> 


[dpdk-dev] RFC: DPDK Long Term Support

2016-06-06 Thread Yuanhan Liu
On Mon, Jun 06, 2016 at 03:31:09PM +0200, Thomas Monjalon wrote:
> 2016-06-06 19:49, Yuanhan Liu:
> > On Fri, Jun 03, 2016 at 06:05:15PM +0200, Thomas Monjalon wrote:
> > > 2016-06-03 15:07, Mcnamara, John:
> > > > Developers submitting fixes to the mainline should also CC the 
> > > > maintainer so
> > > > that they can evaluate the patch. A  email address 
> > > > could be
> > > > provided for this so that it can be included as a CC in the commit 
> > > > messages
> > > > and documented in the Code Contribution Guidelines.
> > > 
> > > Why?
> > > We must avoid putting too much restrictions on the contributors.
> > 
> > This is actually requested by me, in a behaviour similar to Linux
> > kernel community takes. Here is the thing, the developer normally
> > knows better than a generic maintainer (assume it's me) that a patch
> > applies to stable branch or not. This is especially true for DPDK,
> > since we ask the developer to note down the bug commit by adding a
> > fix line.
> > 
> > It wouldn't be a burden for an active contributor, as CCing to related
> > people (including right mailing list) is a good habit they already
> > have.  For some one-time contributors, it's okay that they don't know
> > and follow it.
> > 
> > In such case, I guess we need the help from the related subsystem
> > maintainer: if it's a good bug fix that applies to stable branch,
> > and the contributor forgot to make a explicit cc to stable mailing
> > list, the subsystem maintainer should forward or ask him to forward
> > to stable mailing list.
> > 
> > The reason I'm asking is that as a generic maintainer, there is
> > simply no such energy to keep an eye on all patches: you have to
> > be aware that we have thousands of emails per month from the dpdk dev
> > mailing list: the number for last month is 1808.
> > 
> > Doing so would make it possible for one person to maintain several
> > stable trees.
> > 
> > For more info, you could check linux/Documentation/stable_kernel_rules.txt.
> 
> Makes sense to CC stable at dpdk.org list (must be created).
> 
> Why put a CC tag in the commit? For automatic processing?
> Maybe it is too early to run before walking ;)

It's a trick used a lot in the kernel community. Assume you have made
a patchset in which just one patch fixes a bug, and you would like that
patch to also be cc'ed to the original author who introduced the bug.
You could achieve that by adding him to the cc list from the command
line. However, that way all patches are cc'ed to him. The alternative
is to add a line "Cc: some.one " in the commit log so that he will
get that patch only.

If you look at a small micro-optimization patchset I sent out last
month [0], you will find that I used this trick for the 1st patch,
as it touches the core part of the virtio-net vring operation, and I
hoped to get some comments from the virtio guru/maintainer, Michael.
Therefore, he is cc'ed. However, the 2 other patches in the same
set are basically DPDK vhost-user stuff, so I didn't cc him on those,
to avoid bothering him.

This rule, of course, also applies to the stable branch (for bug
fixing patches in a set). It doesn't matter which way you take if
the patch set consists of a single bug-fixing patch, though.

[0]: http://dpdk.org/ml/archives/dev/2016-May/038246.html

--yliu


[dpdk-dev] RFC: DPDK Long Term Support

2016-06-06 Thread Yuanhan Liu
On Mon, Jun 06, 2016 at 03:44:47PM +0200, Nirmoy Das wrote:
> 
> > LTS Version
> > 
> > 
> > The proposed initial LTS version will be DPDK 16.07. The next versions, 
> > based
> > on a 2 year cycle, will be DPDK 18.08, 20.08, etc.
> 
> Hi,
> 
> I can see 16.07's release due date is 18th July. Is it possible to know
> the timeline for RC versions of dpdk-16.07 ? This might be helpful for
> SUSE to decide the supported product(SLE12 SP*/Leap) for dpdk-lts.

You can get it from http://dpdk.org/dev/roadmap:

16.07
Proposal deadline: May 8
Integration deadline: June 16
Release: July 18

In other words, we're going to have the first RC release in less than 2 weeks.

--yliu


[dpdk-dev] RFC: DPDK Long Term Support

2016-06-06 Thread Thomas Monjalon
2016-06-06 09:47, Neil Horman:
> On Mon, Jun 06, 2016 at 11:27:29AM +0200, Thomas Monjalon wrote:
> > 2016-06-05 14:15, Neil Horman:
> > > On Fri, Jun 03, 2016 at 03:07:49PM +, Mcnamara, John wrote:
> > > > Introduction
> > > > 
> > > > 
> > > > This document sets out a proposal for a DPDK Long Term Support release 
> > > > (LTS).
> > > > 
> > > > The purpose of the DPDK LTS will be to maintain a stable release of 
> > > > DPDK with
> > > > backported bug fixes over an extended period of time. This will provide
> > > > downstream consumers of DPDK with a stable target on which to base
> > > > applications or packages.
> > [...]
> > > I'm not opposed to an LTS release, but it seems to be re-solving the 
> > > issue of
> > > ABI breakage.  That is to say, there is alreay a process in place for 
> > > managing
> > > ABI changes to the DPDK, which is designed to help ensure that:
> > > 
> > > 1) ABI changes are signaled at least 2 releases early
> > > 2) ABI changes whenever possible are designed such that backward 
> > > compatibility
> > > versions can be encoded at the same time with versioning tags
> > 
> > Sorry I don't understand your point.
> > We are talking about two different things:
> > 1/ ABI care for each new major release
> > 2/ Minor release for bug fixes
> > 
> > I think both may exist.
> > 
> Sure, they can exist together (they being both an ABI backwards compatible 
> HEAD
> and a set of LTS releases).  The point I'm trying to make is that if you do 
> your
> ABI compatible HEAD well enough, you don't really need an LTS release.
> 
> Thats not to say that you can't do both, but an LTS release is a significant
> workload item, especially given the rapid pace of change in HEAD.  The longer
> you maintain an LTS release, the more difficult "minor" bugfixes are to
> integrate, especially if you wind up skipping any ABI breaking patches.  I 
> think
> its worth calling attention to that as this approach gets considered.
> 
> > > Those two mechanism are expressly intended to allow application upgrades 
> > > of DPDK
> > > libraries without worrying about ABI breakage.  While LTS releases are a 
> > > fine
> > > approach for  some things, they sacrifice upstream efficiency (by 
> > > creating work
> > > for backporting teams), while allowing upstream developers more leverage 
> > > to just
> > > create ABI breaking changes on a whim, ignoring the existing ABI 
> > > compatibility
> > > mechanism
> > 
> > No it was not stated that upstream developers should ignore ABI 
> > compatibility.
> > Do you mean having a stable branch means ABI preservation for the next major
> > release is less important?
> > 
> I never stated that developers should ignore ABI compatibility, I stated that
> creating an LTS release will make it that much easier for developers to do so.
> 
> And I think, pragmatically speaking, that is a concern.  Given that the
> existance of an LTS release will make it tempting for developers to simply
> follow the deprecation process rather than try to create ABI backward 
> compatible
> paths.
> 
> Looking at the git history, it seems clear to me that this is already 
> happening.
> I'm able to find a multitude of instances in which the deprecation process has
> been followed reasonably well, but I can find no instances in which any 
> efforts
> have been made for backward compatibility.

There were some examples of backward compatibility in hash and lpm libraries.

> > > LTS is a fine process for projects in which API/ABI breakage is either 
> > > uncommon
> > > or fairly isolated, but that in my mind doesn't really describe DPDK.
> > 
> > Yes API/ABI breakages are still common in DPDK.
> > So it's even more important to have some stable branches.
> 
> We seem to be comming to different conclusions based on the same evidence. We
> agree that API/ABI changes continue to be frequent ocurances, but my position 
> is
> that we already have a process in place to mitigate that, which is simply not
> being used (i.e. versioning symbols to provide backward compatible paths),
> whereas you seem to be asserting that an LTS model will allow for ABI 
> stabiilty
> and bug fixes.
> 
> While I don't disagree with that statement (LTS does provide both of those
> things if the maintainer does it properly), I'm forced to ask the question,
> before we solve this problem in a new way, 

The following questions are interesting but please don't assume the stable
branch addresses the same issue as ABI compat.
In each major release, we add some new bugs because of new features, even
if the ABI is kept.
In a minor stable release there are only some bug fixes. So the only way
to have a "bug free" version in a stable environment, is to do some
maintenance in a stable branch.

> lets ask why the existing way isn't
> being used.  Do developers just not care about backwards compatibility?  Is 
> the
> process to hard?  Something else?  I really don't like the idea of abandoning
> what currently exists t

[dpdk-dev] RFC: DPDK Long Term Support

2016-06-06 Thread Thomas Monjalon
2016-06-06 22:14, Yuanhan Liu:
> On Mon, Jun 06, 2016 at 03:31:09PM +0200, Thomas Monjalon wrote:
> > 2016-06-06 19:49, Yuanhan Liu:
> > > On Fri, Jun 03, 2016 at 06:05:15PM +0200, Thomas Monjalon wrote:
> > > > 2016-06-03 15:07, Mcnamara, John:
> > > > > Developers submitting fixes to the mainline should also CC the 
> > > > > maintainer so
> > > > > that they can evaluate the patch. A  email 
> > > > > address could be
> > > > > provided for this so that it can be included as a CC in the commit 
> > > > > messages
> > > > > and documented in the Code Contribution Guidelines.
[...]
> > Why put a CC tag in the commit? For automatic processing?
> > Maybe it is too early to run before walking ;)
> 
> It's a tip/trick used a lot in kernel community. Assume you have made
> a patchset, that just one of them fixes a bug that you hope this patch
> could also be cc'ed to the original author that introduces the bug.
> You could achieve that by adding him to the cc list from cli. However,
> in such way, all patches are cc'ed to him. The alternative is to add
> a line "Cc: some.one " in the commit log so that he will
> get that patch only.
> 
> If you look at a small micro optimization patchset I sent out last
> month [0], you will find that I used this trick for the 1st patch,
> as it touches the core part of virtio-net vring operation, that I
> hope I can get some comments from the virtio guru/maintainer, Michael.
> Therefore, he is cc'ed. However, for the 2 other patches in the same
> set, it's basically DPDK vhost-user stuff, so that I didn't cc him
> to not bother him.
> 
> This rule, of course, also applies to the stable branch (for bug
> fixing patches in a set). It doesn't matter which way you take if
> it's just a patch set of one bug fixing patch though.
> 
> [0]: http://dpdk.org/ml/archives/dev/2016-May/038246.html

OK


[dpdk-dev] [PATCH v8 1/3] mempool: support external mempool operations

2016-06-06 Thread Shreyansh Jain
Hi,

This is more of a question/clarification than a comment. (And I have taken only 
some snippets from original mail to keep it cleaner)


> +MEMPOOL_REGISTER_OPS(ops_mp_mc);
> +MEMPOOL_REGISTER_OPS(ops_sp_sc);
> +MEMPOOL_REGISTER_OPS(ops_mp_sc);
> +MEMPOOL_REGISTER_OPS(ops_sp_mc);




> +	/*
> +	 * Since we have 4 combinations of the SP/SC/MP/MC examine the flags to
> +	 * set the correct index into the table of ops structs.
> +	 */
> +	if (flags & (MEMPOOL_F_SP_PUT | MEMPOOL_F_SC_GET))
> +		rte_mempool_set_ops_byname(mp, "ring_sp_sc");
> +	else if (flags & MEMPOOL_F_SP_PUT)
> +		rte_mempool_set_ops_byname(mp, "ring_sp_mc");
> +	else if (flags & MEMPOOL_F_SC_GET)
> +		rte_mempool_set_ops_byname(mp, "ring_mp_sc");
> +	else
> +		rte_mempool_set_ops_byname(mp, "ring_mp_mc");
> +



[dpdk-dev] [PATCH v8 1/3] mempool: support external mempool operations

2016-06-06 Thread Shreyansh Jain
Hi,

(Apologies for overly-eager email sent on this thread earlier. Will be more 
careful in future).

This is more of a question/clarification than a comment. (And I have taken only 
some snippets from original mail to keep it cleaner)


> +MEMPOOL_REGISTER_OPS(ops_mp_mc);
> +MEMPOOL_REGISTER_OPS(ops_sp_sc);
> +MEMPOOL_REGISTER_OPS(ops_mp_sc);
> +MEMPOOL_REGISTER_OPS(ops_sp_mc);


From the above, what I understand is that multiple packet pool handlers can be 
created.

I have a use-case where the application has multiple pools but only the packet 
pool is hardware backed. Using the hardware for general buffer requirements 
would prove costly.
From what I understand from the patch, selection of the pool is based on the 
flags below.


> +	/*
> +	 * Since we have 4 combinations of the SP/SC/MP/MC examine the flags to
> +	 * set the correct index into the table of ops structs.
> +	 */
> +	if (flags & (MEMPOOL_F_SP_PUT | MEMPOOL_F_SC_GET))
> +		rte_mempool_set_ops_byname(mp, "ring_sp_sc");
> +	else if (flags & MEMPOOL_F_SP_PUT)
> +		rte_mempool_set_ops_byname(mp, "ring_sp_mc");
> +	else if (flags & MEMPOOL_F_SC_GET)
> +		rte_mempool_set_ops_byname(mp, "ring_mp_sc");
> +	else
> +		rte_mempool_set_ops_byname(mp, "ring_mp_mc");
> +

Is there any way I can achieve the above use case of multiple pools which can 
be selected by an application - something like a run-time toggle/flag?
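
For reference, the flag-to-handler mapping in the quoted hunk can be
sketched on its own as below. This is an illustration, not the patch's
code: the flag values are stand-ins and select_ring_ops_name() is a
hypothetical helper. It tests the two bits separately, because the
quoted first condition, "flags & (A | B)", is true when either bit is
set, so an SP-only or SC-only pool would also take the "ring_sp_sc"
branch.

```c
/* Stand-in flag values for illustration; take the real ones from rte_mempool.h. */
#define MEMPOOL_F_SP_PUT 0x0004u
#define MEMPOOL_F_SC_GET 0x0008u

/*
 * Map the SP/SC flags to one of the four ring handler names quoted above.
 * Each bit is tested on its own so that only pools with both flags set
 * are mapped to "ring_sp_sc".
 */
static const char *
select_ring_ops_name(unsigned int flags)
{
	int sp = (flags & MEMPOOL_F_SP_PUT) != 0;
	int sc = (flags & MEMPOOL_F_SC_GET) != 0;

	if (sp && sc)
		return "ring_sp_sc";
	if (sp)
		return "ring_sp_mc";
	if (sc)
		return "ring_mp_sc";
	return "ring_mp_mc";
}
```

Since the handler is chosen by name, a per-pool run-time choice would
amount to passing a different name for the hardware-backed pool; whether
the patch exposes that to applications is the open question here.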

-
Shreyansh


[dpdk-dev] RFC: DPDK Long Term Support

2016-06-06 Thread Neil Horman
On Mon, Jun 06, 2016 at 04:21:11PM +0200, Thomas Monjalon wrote:
> 2016-06-06 09:47, Neil Horman:
> > On Mon, Jun 06, 2016 at 11:27:29AM +0200, Thomas Monjalon wrote:
> > > 2016-06-05 14:15, Neil Horman:
> > > > On Fri, Jun 03, 2016 at 03:07:49PM +, Mcnamara, John wrote:
> > > > > Introduction
> > > > > 
> > > > > 
> > > > > This document sets out a proposal for a DPDK Long Term Support 
> > > > > release (LTS).
> > > > > 
> > > > > The purpose of the DPDK LTS will be to maintain a stable release of 
> > > > > DPDK with
> > > > > backported bug fixes over an extended period of time. This will 
> > > > > provide
> > > > > downstream consumers of DPDK with a stable target on which to base
> > > > > applications or packages.
> > > [...]
> > > > I'm not opposed to an LTS release, but it seems to be re-solving the 
> > > > issue of
> > > > ABI breakage.  That is to say, there is alreay a process in place for 
> > > > managing
> > > > ABI changes to the DPDK, which is designed to help ensure that:
> > > > 
> > > > 1) ABI changes are signaled at least 2 releases early
> > > > 2) ABI changes whenever possible are designed such that backward 
> > > > compatibility
> > > > versions can be encoded at the same time with versioning tags
> > > 
> > > Sorry I don't understand your point.
> > > We are talking about two different things:
> > > 1/ ABI care for each new major release
> > > 2/ Minor release for bug fixes
> > > 
> > > I think both may exist.
> > > 
> > Sure, they can exist together (they being both an ABI backwards compatible 
> > HEAD
> > and a set of LTS releases).  The point I'm trying to make is that if you do 
> > your
> > ABI compatible HEAD well enough, you don't really need an LTS release.
> > 
> > Thats not to say that you can't do both, but an LTS release is a significant
> > workload item, especially given the rapid pace of change in HEAD.  The 
> > longer
> > you maintain an LTS release, the more difficult "minor" bugfixes are to
> > integrate, especially if you wind up skipping any ABI breaking patches.  I 
> > think
> > its worth calling attention to that as this approach gets considered.
> > 
> > > > Those two mechanism are expressly intended to allow application 
> > > > upgrades of DPDK
> > > > libraries without worrying about ABI breakage.  While LTS releases are 
> > > > a fine
> > > > approach for  some things, they sacrifice upstream efficiency (by 
> > > > creating work
> > > > for backporting teams), while allowing upstream developers more 
> > > > leverage to just
> > > > create ABI breaking changes on a whim, ignoring the existing ABI 
> > > > compatibility
> > > > mechanism
> > > 
> > > No it was not stated that upstream developers should ignore ABI 
> > > compatibility.
> > > Do you mean having a stable branch means ABI preservation for the next 
> > > major
> > > release is less important?
> > > 
> > I never stated that developers should ignore ABI compatibility, I stated 
> > that
> > creating an LTS release will make it that much easier for developers to do 
> > so.
> > 
> > And I think, pragmatically speaking, that is a concern.  Given that the
> > existance of an LTS release will make it tempting for developers to simply
> > follow the deprecation process rather than try to create ABI backward 
> > compatible
> > paths.
> > 
> > Looking at the git history, it seems clear to me that this is already 
> > happening.
> > I'm able to find a multitude of instances in which the deprecation process 
> > has
> > been followed reasonably well, but I can find no instances in which any 
> > efforts
> > have been made for backward compatibility.
> 
> There were some examples of backward compatibility in hash and lpm libraries.
> 
Ok, apologies, but you still see my point: there is a relatively small number of
instances of creating backward compatibility among a much larger set of easier
deprecate-and-replace instances.  It's not really having the effect it was
intended to.

> > > > LTS is a fine process for projects in which API/ABI breakage is either 
> > > > uncommon
> > > > or fairly isolated, but that in my mind doesn't really describe DPDK.
> > > 
> > > Yes API/ABI breakages are still common in DPDK.
> > > So it's even more important to have some stable branches.
> > 
> > We seem to be comming to different conclusions based on the same evidence. 
> > We
> > agree that API/ABI changes continue to be frequent ocurances, but my 
> > position is
> > that we already have a process in place to mitigate that, which is simply 
> > not
> > being used (i.e. versioning symbols to provide backward compatible paths),
> > whereas you seem to be asserting that an LTS model will allow for ABI 
> > stabiilty
> > and bug fixes.
> > 
> > While I don't disagree with that statement (LTS does provide both of those
> > things if the maintainer does it properly), I'm forced to ask the question,
> > before we solve this problem in a new way, 
> 
> The following questions are interesting but please don't 

[dpdk-dev] [PATCH v1 2/2] Test cases for rte_memcmp functions

2016-06-06 Thread Ravi Kerur
Zhihong, Thomas,

If there is enough interest within the DPDK community, I can work on adding
support for 'unaligned access' and test cases for it. Please let me know
either way.
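
A minimal sketch of what such unaligned test cases could look like is
given below. It is an assumption about the shape of the tests, not code
from the patch: plain memcmp() stands in for the function under test,
and check_unaligned() / check_all_offsets() are hypothetical helpers.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BUF_SIZE 256
#define MAX_OFFSET 16

/* Function under test; plain memcmp() is used here as a stand-in. */
typedef int (*memcmp_fn)(const void *, const void *, size_t);

/*
 * Compare len bytes starting at deliberately misaligned offsets into two
 * buffers: once with equal contents, once with the last byte of the first
 * buffer larger, once with it smaller. Returns 0 if all three results have
 * the expected sign, -1 otherwise (including on invalid arguments).
 */
static int
check_unaligned(memcmp_fn fn, size_t len, size_t off_a, size_t off_b)
{
	uint8_t a[BUF_SIZE + MAX_OFFSET];
	uint8_t b[BUF_SIZE + MAX_OFFSET];
	size_t i;

	if (len == 0 || len > BUF_SIZE ||
	    off_a > MAX_OFFSET || off_b > MAX_OFFSET)
		return -1;

	for (i = 0; i < len; i++)
		a[off_a + i] = b[off_b + i] = (uint8_t)i;
	a[off_a + len - 1] = b[off_b + len - 1] = 100;

	if (fn(a + off_a, b + off_b, len) != 0)
		return -1;		/* equal buffers must compare equal */

	a[off_a + len - 1] = 101;
	if (fn(a + off_a, b + off_b, len) <= 0)
		return -1;		/* first buffer is greater */

	a[off_a + len - 1] = 99;
	if (fn(a + off_a, b + off_b, len) >= 0)
		return -1;		/* first buffer is smaller */

	return 0;
}

/* Sweep every pair of start offsets for one length. */
static int
check_all_offsets(memcmp_fn fn, size_t len)
{
	size_t oa, ob;

	for (oa = 0; oa <= MAX_OFFSET; oa++)
		for (ob = 0; ob <= MAX_OFFSET; ob++)
			if (check_unaligned(fn, len, oa, ob) != 0)
				return -1;
	return 0;
}
```

Sweeping both offsets independently covers the cases where only one of
the two buffers is misaligned, which is where vectorized implementations
tend to take different code paths.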

Thanks,
Ravi


On Thu, May 26, 2016 at 2:05 AM, Wang, Zhihong 
wrote:

>
>
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Ravi Kerur
> > Sent: Tuesday, March 8, 2016 7:01 AM
> > To: dev at dpdk.org
> > Subject: [dpdk-dev] [PATCH v1 2/2] Test cases for rte_memcmp functions
> >
> > v1:
> > This patch adds test cases for rte_memcmp functions.
> > New rte_memcmp functions can be tested via 'make test'
> > and 'testpmd' utility.
> >
> > Compiled and tested on Ubuntu 14.04(non-NUMA) and
> > 15.10(NUMA) systems.
> [...]
>
> > +/***************************************************************************
> > + * Memcmp function performance test configuration section. Each performance test
> > + * will be performed MEMCMP_ITERATIONS times.
> > + *
> > + * The five arrays below control what tests are performed. Every combination
> > + * from the array entries is tested.
> > + */
> > +#define MEMCMP_ITERATIONS (500 * 500 * 500)
>
>
> Maybe fewer iterations would make the test faster without compromising precision?
>
>
> > +
> > +static size_t memcmp_sizes[] = {
> > + 2, 5, 8, 9, 15, 16, 17, 31, 32, 33, 63, 64, 65, 127, 128,
> > + 129, 191, 192, 193, 255, 256, 257, 319, 320, 321, 383, 384,
> > + 385, 447, 448, 449, 511, 512, 513, 767, 768, 769, 1023, 1024,
> > + 1025, 1522, 1536, 1600, 2048, 2560, 3072, 3584, 4096, 4608,
> > + 5632, 6144, 6656, 7168, 7680, 8192, 16834
> > +};
> > +
> [...]
> > +/*
> > + * Do all performance tests.
> > + */
> > +static int
> > +test_memcmp_perf(void)
> > +{
> > + if (run_all_memcmp_eq_perf_tests() != 0)
> > + return -1;
> > +
> > + if (run_all_memcmp_gt_perf_tests() != 0)
> > + return -1;
> > +
> > + if (run_all_memcmp_lt_perf_tests() != 0)
> > + return -1;
> > +
>
>
> Perhaps unaligned test cases are needed here.
> What do you think?
>
>
> > +
> > + return 0;
> > +}
> > +
> > +static struct test_command memcmp_perf_cmd = {
> > + .command = "memcmp_perf_autotest",
> > + .callback = test_memcmp_perf,
> > +};
> > +REGISTER_TEST_COMMAND(memcmp_perf_cmd);
> > --
> > 1.9.1
>
>
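
The unaligned test cases suggested in the review above could look roughly like
the sketch below. It is only an illustration under stated assumptions: the
rte_memcmp functions are not shown in this thread, so the standard memcmp() is
passed in as a stand-in comparator, and the names check_unaligned, cmp_fn,
MAX_OFFSET and MAX_LEN are hypothetical rather than taken from the patch. The
idea is to start both operands at every byte offset from an 8-byte-aligned base
and to check the equal, greater-than and less-than results for each offset pair.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_OFFSET 8    /* hypothetical: byte offsets 0..7 from an aligned base */
#define MAX_LEN    1024 /* hypothetical upper bound on the compared length */

/* Any comparator with the memcmp() contract can be plugged in here. */
typedef int (*cmp_fn)(const void *, const void *, size_t);

/* The uint64_t member forces the byte arrays to start on an 8-byte boundary. */
union aligned_buf {
	uint64_t align;
	unsigned char bytes[MAX_LEN + MAX_OFFSET];
};

/* Returns 0 if cmp agrees with the memcmp() contract for every offset pair. */
int
check_unaligned(cmp_fn cmp, size_t len)
{
	static union aligned_buf a, b;
	size_t off_a, off_b, i;

	assert(len > 0 && len <= MAX_LEN);

	for (off_a = 0; off_a < MAX_OFFSET; off_a++) {
		for (off_b = 0; off_b < MAX_OFFSET; off_b++) {
			unsigned char *pa = a.bytes + off_a;
			unsigned char *pb = b.bytes + off_b;

			/* Identical contents; values stay below 0x80. */
			for (i = 0; i < len; i++)
				pa[i] = pb[i] = (unsigned char)(i & 0x7f);
			if (cmp(pa, pb, len) != 0)
				return -1;

			/* Make the last byte of pa larger (no wrap-around). */
			pa[len - 1]++;
			if (cmp(pa, pb, len) <= 0)
				return -1;
			if (cmp(pb, pa, len) >= 0)
				return -1;
		}
	}
	return 0;
}
```

Calling check_unaligned(memcmp, len) for the lengths in memcmp_sizes[] would
exercise the stand-in; substituting the function under test for memcmp is left
to whoever picks this up.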


[dpdk-dev] [PATCH v4 01/39] bnxt: new driver for Broadcom NetXtreme-C devices

2016-06-06 Thread Stephen Hurd
From: Ajit Khaparde 

This patch adds the initial skeleton for bnxt driver along with the
nic guide to tie into the build system.
At this point, the driver simply fails init.

v4:
Fix a warning that the document isn't included in any toctree
Also remove a PCI ID added erroneously.

Signed-off-by: Ajit Khaparde 
Reviewed-by: David Christensen 
Signed-off-by: Stephen Hurd 
---
 MAINTAINERS |   5 ++
 config/common_base  |   5 ++
 doc/guides/nics/bnxt.rst|  49 +++
 doc/guides/nics/index.rst   |   1 +
 drivers/net/Makefile|   1 +
 drivers/net/bnxt/Makefile   |  63 ++
 drivers/net/bnxt/bnxt_ethdev.c  | 104 
 drivers/net/bnxt/rte_pmd_bnxt_version.map   |   4 +
 lib/librte_eal/common/include/rte_pci_dev_ids.h |  38 +++--
 mk/rte.app.mk   |   1 +
 10 files changed, 266 insertions(+), 5 deletions(-)
 create mode 100644 doc/guides/nics/bnxt.rst
 create mode 100644 drivers/net/bnxt/Makefile
 create mode 100644 drivers/net/bnxt/bnxt_ethdev.c
 create mode 100644 drivers/net/bnxt/rte_pmd_bnxt_version.map

diff --git a/MAINTAINERS b/MAINTAINERS
index 3e8558f..8892086 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -400,6 +400,11 @@ M: Declan Doherty 
 F: drivers/crypto/null/
 F: doc/guides/cryptodevs/null.rst

+Broadcom BNXT PMD
+M: Stephen Hurd 
+F: drivers/net/bnxt/
+F: doc/guides/nics/bnxt.rst
+

 Packet processing
 -----------------
diff --git a/config/common_base b/config/common_base
index 47c26f6..dc298e9 100644
--- a/config/common_base
+++ b/config/common_base
@@ -245,6 +245,11 @@ CONFIG_RTE_LIBRTE_NFP_PMD=n
 CONFIG_RTE_LIBRTE_NFP_DEBUG=n

 #
+# Compile burst-oriented Broadcom BNXT PMD driver
+#
+CONFIG_RTE_LIBRTE_BNXT_PMD=y
+
+#
 # Compile software PMD backed by SZEDATA2 device
 #
 CONFIG_RTE_LIBRTE_PMD_SZEDATA2=n
diff --git a/doc/guides/nics/bnxt.rst b/doc/guides/nics/bnxt.rst
new file mode 100644
index 000..2669e98
--- /dev/null
+++ b/doc/guides/nics/bnxt.rst
@@ -0,0 +1,49 @@
+..  BSD LICENSE
+Copyright 2016 Broadcom Limited
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions
+are met:
+
+* Redistributions of source code must retain the above copyright
+notice, this list of conditions and the following disclaimer.
+* Redistributions in binary form must reproduce the above copyright
+notice, this list of conditions and the following disclaimer in
+the documentation and/or other materials provided with the
+distribution.
+* Neither the name of Broadcom Limited nor the names of its
+contributors may be used to endorse or promote products derived
+from this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+bnxt poll mode driver library
+=============================
+
+The bnxt poll mode library (**librte_pmd_bnxt**) implements support for
+**Broadcom NetXtreme® C-Series**.  These adapters support Standards-
+compliant 10/25/50Gbps 30MPPS full-duplex throughput.
+
+Information about this family of adapters can be found in the
+`NetXtreme® Brand section
`_
+of the `Broadcom web site `_.
+
+Limitations
+-----------
+
+With the current driver, allocated mbufs must be large enough to hold
+the entire received frame.  If the mbufs are not large enough, the
+packets will be dropped.  This is most limiting when jumbo frames are
+used.
+
+SR-IOV is not supported.
diff --git a/doc/guides/nics/index.rst b/doc/guides/nics/index.rst
index 0b13698..ffe011e 100644
--- a/doc/guides/nics/index.rst
+++ b/doc/guides/nics/index.rst
@@ -36,6 +36,7 @@ Network Interface Controller Drivers
 :numbered:

 overview
+bnxt
 bnx2x
 cxgbe
 e1000em
diff --git a/drivers/net/Makefile b/drivers/net/Makefile
index 6ba7658..3832706 100644
--- a/drivers/net/Makefile
+++ b/drivers/net/Makefile
@@ -45,6 +45,7 @@ DIRS-$(CONFI

[dpdk-dev] [PATCH v4 02/39] bnxt: add HWRM init code

2016-06-06 Thread Stephen Hurd
From: Ajit Khaparde 

Start adding support to use the HWRM API.
Hardware Resource Manager, or HWRM for short, is a set of APIs provided
by the firmware running in the ASIC to manage the various resources.

Initial commit just performs necessary HWRM queries for init, then
fails as before.

The HWRM calls used so far:
bnxt_hwrm_func_qcaps:
Queries device capabilities.

bnxt_hwrm_ver_get:
Gets the firmware version and interface specifications.
Returns an error if the firmware on the device is not
supported by the driver and ensures the response space
is large enough for the largest possible response.

bnxt_hwrm_queue_qportcfg:
Required to get the default queue ID.

v4:
Fix a few issues highlighted by checkpatch.

Signed-off-by: Ajit Khaparde 
Reviewed-by: David Christensen 
Signed-off-by: Stephen Hurd 
---
 drivers/net/bnxt/Makefile  |   1 +
 drivers/net/bnxt/bnxt.h| 114 
 drivers/net/bnxt/bnxt_ethdev.c | 111 
 drivers/net/bnxt/bnxt_hwrm.c   | 324 +++
 drivers/net/bnxt/bnxt_hwrm.h   |  53 ++
 drivers/net/bnxt/hsi_struct_def_dpdk.h | 954 +
 6 files changed, 1557 insertions(+)
 create mode 100644 drivers/net/bnxt/bnxt.h
 create mode 100644 drivers/net/bnxt/bnxt_hwrm.c
 create mode 100644 drivers/net/bnxt/bnxt_hwrm.h
 create mode 100644 drivers/net/bnxt/hsi_struct_def_dpdk.h

diff --git a/drivers/net/bnxt/Makefile b/drivers/net/bnxt/Makefile
index f6333fd..9965597 100644
--- a/drivers/net/bnxt/Makefile
+++ b/drivers/net/bnxt/Makefile
@@ -49,6 +49,7 @@ EXPORT_MAP := rte_pmd_bnxt_version.map
 # all source are stored in SRCS-y
 #
 SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_ethdev.c
+SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_hwrm.c

 #
 # Export include files
diff --git a/drivers/net/bnxt/bnxt.h b/drivers/net/bnxt/bnxt.h
new file mode 100644
index 000..8cb7f5b
--- /dev/null
+++ b/drivers/net/bnxt/bnxt.h
@@ -0,0 +1,114 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) Broadcom Limited.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Broadcom Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _BNXT_H_
+#define _BNXT_H_
+
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+
+struct bnxt_vf_info {
+   uint16_t fw_fid;
+   uint8_t mac_addr[ETHER_ADDR_LEN];
+   uint16_t max_rsscos_ctx;
+   uint16_t max_cp_rings;
+   uint16_t max_tx_rings;
+   uint16_t max_rx_rings;
+   uint16_t max_l2_ctx;
+   uint16_t max_vnics;
+   struct bnxt_pf_info *pf;
+};
+
+struct bnxt_pf_info {
+#define BNXT_FIRST_PF_FID  1
+#define BNXT_MAX_VFS(bp)   (bp->pf.max_vfs)
+#define BNXT_FIRST_VF_FID  128
+#define BNXT_PF_RINGS_USED(bp) bnxt_get_num_queues(bp)
+#define BNXT_PF_RINGS_AVAIL(bp)    (bp->pf.max_cp_rings - BNXT_PF_RINGS_USED(bp))
+   uint32_t fw_fid;
+   uint8_t port_id;
+   uint8_t mac_addr[ETHER_ADDR_LEN];
+   uint16_t max_rsscos_ctx;
+   uint16_t max_cp_rings;
+   uint16_t max_tx_rings;
+   uint16_t max_rx_rings;
+   uint16_t max_l2_ctx;
+   uint16_t max_vnics;
+   uint16_t first_vf_id;
+   uint16_t 

[dpdk-dev] [PATCH v4 03/39] bnxt: add driver register/unregister support

2016-06-06 Thread Stephen Hurd
From: Ajit Khaparde 

Move init() cleanup into uninit() function
Fix .dev_private_size
Add required HWRM calls:
bnxt_hwrm_func_driver_register()
bnxt_hwrm_func_driver_unregister()

v4:
Address review comment regarding removal of bnxt_dev_close_op

Signed-off-by: Ajit Khaparde 
Reviewed-by: David Christensen 
Signed-off-by: Stephen Hurd 
---
 drivers/net/bnxt/bnxt.h|   1 +
 drivers/net/bnxt/bnxt_ethdev.c |  38 -
 drivers/net/bnxt/bnxt_hwrm.c   |  50 ++
 drivers/net/bnxt/bnxt_hwrm.h   |   3 +
 drivers/net/bnxt/hsi_struct_def_dpdk.h | 277 -
 5 files changed, 358 insertions(+), 11 deletions(-)

diff --git a/drivers/net/bnxt/bnxt.h b/drivers/net/bnxt/bnxt.h
index 8cb7f5b..ed057ef 100644
--- a/drivers/net/bnxt/bnxt.h
+++ b/drivers/net/bnxt/bnxt.h
@@ -91,6 +91,7 @@ struct bnxt {
struct rte_pci_device   *pdev;

uint32_t flags;
+#define BNXT_FLAG_REGISTERED   (1 << 0)
 #define BNXT_FLAG_VF   (1 << 1)
 #define BNXT_PF(bp)    (!((bp)->flags & BNXT_FLAG_VF))
 #define BNXT_VF(bp)    ((bp)->flags & BNXT_FLAG_VF)
diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index 8ebd742..26e6447 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -123,7 +123,8 @@ bnxt_dev_init(struct rte_eth_dev *eth_dev)
eth_dev->pci_dev->addr.function < 4) {
RTE_LOG(ERR, PMD, "Function not enabled %x:\n",
eth_dev->pci_dev->addr.function);
-   return -ENOMEM;
+   rc = -ENOMEM;
+   goto error;
}

rte_eth_copy_pci_info(eth_dev, eth_dev->pci_dev);
@@ -146,11 +147,11 @@ bnxt_dev_init(struct rte_eth_dev *eth_dev)
if (rc) {
RTE_LOG(ERR, PMD,
"hwrm resource allocation failure rc: %x\n", rc);
-   goto error;
+   goto error_free;
}
rc = bnxt_hwrm_ver_get(bp);
if (rc)
-   goto error;
+   goto error_free;
bnxt_hwrm_queue_qportcfg(bp);

/* Get the MAX capabilities for this function */
@@ -175,17 +176,38 @@ bnxt_dev_init(struct rte_eth_dev *eth_dev)
memcpy(bp->mac_addr, bp->vf.mac_addr, sizeof(bp->mac_addr));
memcpy(&eth_dev->data->mac_addrs[0], bp->mac_addr, ETHER_ADDR_LEN);

-   return -EPERM;
+   rc = bnxt_hwrm_func_driver_register(bp, 0,
+   bp->pf.vf_req_fwd);
+   if (rc) {
+   RTE_LOG(ERR, PMD,
+   "Failed to register driver");
+   rc = -EBUSY;
+   goto error_free;
+   }
+
+   RTE_LOG(INFO, PMD,
+   DRV_MODULE_NAME " found at mem %" PRIx64 ", node addr %pM\n",
+   eth_dev->pci_dev->mem_resource[0].phys_addr,
+   eth_dev->pci_dev->mem_resource[0].addr);
+
+   return 0;

 error_free:
-   bnxt_dev_close_op(eth_dev);
+   eth_dev->driver->eth_dev_uninit(eth_dev);
 error:
return rc;
 }

 static int
-bnxt_dev_uninit(struct rte_eth_dev *eth_dev __rte_unused) {
-   return 0;
+bnxt_dev_uninit(struct rte_eth_dev *eth_dev) {
+   struct bnxt *bp = eth_dev->data->dev_private;
+   int rc;
+
+   if (eth_dev->data->mac_addrs)
+   rte_free(eth_dev->data->mac_addrs);
+   rc = bnxt_hwrm_func_driver_unregister(bp, 0);
+   bnxt_free_hwrm_resources(bp);
+   return rc;
 }

 static struct eth_driver bnxt_rte_pmd = {
@@ -196,7 +218,7 @@ static struct eth_driver bnxt_rte_pmd = {
},
.eth_dev_init = bnxt_dev_init,
.eth_dev_uninit = bnxt_dev_uninit,
-   .dev_private_size = 32 /* this must be non-zero apparently */,
+   .dev_private_size = sizeof(struct bnxt),
 };

 static int bnxt_rte_pmd_init(const char *name, const char *params __rte_unused)
diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c
index e187121..8aba8cd 100644
--- a/drivers/net/bnxt/bnxt_hwrm.c
+++ b/drivers/net/bnxt/bnxt_hwrm.c
@@ -36,6 +36,7 @@
 #include 
 #include 
 #include 
+#include 

 #include "bnxt.h"
 #include "bnxt_hwrm.h"
@@ -178,6 +179,34 @@ int bnxt_hwrm_func_qcaps(struct bnxt *bp)
return rc;
 }

+int bnxt_hwrm_func_driver_register(struct bnxt *bp, uint32_t flags,
+  uint32_t *vf_req_fwd)
+{
+   int rc;
+   struct hwrm_func_drv_rgtr_input req = {.req_type = 0 };
+   struct hwrm_func_drv_rgtr_output *resp = bp->hwrm_cmd_resp_addr;
+
+   if (bp->flags & BNXT_FLAG_REGISTERED)
+   return 0;
+
+   HWRM_PREP(req, FUNC_DRV_RGTR, -1, resp);
+   req.flags = flags;
+   req.enables = HWRM_FUNC_DRV_RGTR_INPUT_ENABLES_VER;
+   req.ver_maj = RTE_VER_YEAR;
+   req.ver_min = RTE_VER_MONTH;
+   req.ver_upd = RTE_VER_MINOR;
+
+   memcpy(req.vf_req_fwd, vf_req_fwd, sizeof(req.vf_

[dpdk-dev] [PATCH v4 04/39] bnxt: add dev infos get operation

2016-06-06 Thread Stephen Hurd
From: Ajit Khaparde 

Gets device info from the bp structure filled in the init() function.

Signed-off-by: Ajit Khaparde 
Reviewed-by: David Christensen 
Signed-off-by: Stephen Hurd 
---
 drivers/net/bnxt/bnxt.h|  3 ++
 drivers/net/bnxt/bnxt_ethdev.c | 95 ++
 2 files changed, 98 insertions(+)

diff --git a/drivers/net/bnxt/bnxt.h b/drivers/net/bnxt/bnxt.h
index ed057ef..f8707b2 100644
--- a/drivers/net/bnxt/bnxt.h
+++ b/drivers/net/bnxt/bnxt.h
@@ -42,6 +42,9 @@
 #include 
 #include 

+#define BNXT_MAX_MTU   9000
+#define VLAN_TAG_SIZE  4
+
 struct bnxt_vf_info {
uint16_t fw_fid;
uint8_t mac_addr[ETHER_ADDR_LEN];
diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index 26e6447..a8a9912 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -61,10 +61,105 @@ static void bnxt_dev_close_op(struct rte_eth_dev *eth_dev)
 }

 /*
+ * Device configuration and status function
+ */
+
+static void bnxt_dev_info_get_op(struct rte_eth_dev *eth_dev,
+ struct rte_eth_dev_info *dev_info)
+{
+   struct bnxt *bp = (struct bnxt *)eth_dev->data->dev_private;
+   uint16_t max_vnics, i, j, vpool, vrxq;
+
+   /* MAC Specifics */
+   dev_info->max_mac_addrs = MAX_NUM_MAC_ADDR;
+   dev_info->max_hash_mac_addrs = 0;
+
+   /* PF/VF specifics */
+   if (BNXT_PF(bp)) {
+   dev_info->max_rx_queues = bp->pf.max_rx_rings;
+   dev_info->max_tx_queues = bp->pf.max_tx_rings;
+   dev_info->max_vfs = bp->pf.active_vfs;
+   dev_info->reta_size = bp->pf.max_rsscos_ctx;
+   max_vnics = bp->pf.max_vnics;
+   } else {
+   dev_info->max_rx_queues = bp->vf.max_rx_rings;
+   dev_info->max_tx_queues = bp->vf.max_tx_rings;
+   dev_info->reta_size = bp->vf.max_rsscos_ctx;
+   max_vnics = bp->vf.max_vnics;
+   }
+
+   /* Fast path specifics */
+   dev_info->min_rx_bufsize = 1;
+   dev_info->max_rx_pktlen = BNXT_MAX_MTU + ETHER_HDR_LEN + ETHER_CRC_LEN
+ + VLAN_TAG_SIZE;
+   dev_info->rx_offload_capa = 0;
+   dev_info->tx_offload_capa = DEV_TX_OFFLOAD_IPV4_CKSUM |
+   DEV_TX_OFFLOAD_TCP_CKSUM |
+   DEV_TX_OFFLOAD_UDP_CKSUM |
+   DEV_TX_OFFLOAD_TCP_TSO;
+
+   /* *INDENT-OFF* */
+   dev_info->default_rxconf = (struct rte_eth_rxconf) {
+   .rx_thresh = {
+   .pthresh = 8,
+   .hthresh = 8,
+   .wthresh = 0,
+   },
+   .rx_free_thresh = 32,
+   .rx_drop_en = 0,
+   };
+
+   dev_info->default_txconf = (struct rte_eth_txconf) {
+   .tx_thresh = {
+   .pthresh = 32,
+   .hthresh = 0,
+   .wthresh = 0,
+   },
+   .tx_free_thresh = 32,
+   .tx_rs_thresh = 32,
+   .txq_flags = ETH_TXQ_FLAGS_NOMULTSEGS |
+ETH_TXQ_FLAGS_NOOFFLOADS,
+   };
+   /* *INDENT-ON* */
+
+   /*
+* TODO: default_rxconf, default_txconf, rx_desc_lim, and tx_desc_lim
+*   need further investigation.
+*/
+
+   /* VMDq resources */
+   vpool = 64; /* ETH_64_POOLS */
+   vrxq = 128; /* ETH_VMDQ_DCB_NUM_QUEUES */
+   for (i = 0; i < 4; vpool >>= 1, i++) {
+   if (max_vnics > vpool) {
+   for (j = 0; j < 5; vrxq >>= 1, j++) {
+   if (dev_info->max_rx_queues > vrxq) {
+   if (vpool > vrxq)
+   vpool = vrxq;
+   goto found;
+   }
+   }
+   /* Not enough resources to support VMDq */
+   break;
+   }
+   }
+   /* Not enough resources to support VMDq */
+   vpool = 0;
+   vrxq = 0;
+found:
+   dev_info->max_vmdq_pools = vpool;
+   dev_info->vmdq_queue_num = vrxq;
+
+   dev_info->vmdq_pool_base = 0;
+   dev_info->vmdq_queue_base = 0;
+}
+
+/*
  * Initialization
  */

 static struct eth_dev_ops bnxt_dev_ops = {
+   .dev_infos_get = bnxt_dev_info_get_op,
.dev_close = bnxt_dev_close_op,
 };

-- 
1.9.1
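
The VMDq sizing loop in bnxt_dev_info_get_op() above is easier to follow in
isolation. The sketch below restates it as a standalone function, assuming the
same starting values (64 pools, 128 queues) and the same strict greater-than
comparisons as the patch. The struct and function names (vmdq_limits,
pick_vmdq_limits, vmdq_example) are hypothetical and do not appear in the
driver, and the goto is replaced by an early return.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type; the patch writes into rte_eth_dev_info instead. */
struct vmdq_limits {
	uint16_t pools;  /* corresponds to max_vmdq_pools */
	uint16_t queues; /* corresponds to vmdq_queue_num */
};

/*
 * Pick the largest power-of-two pool count (64, 32, 16, 8) strictly below
 * max_vnics, then the largest power-of-two queue count (128 ... 8) strictly
 * below max_rx_queues. Pools are capped at the queue count. If either search
 * fails, both values are reported as 0 (VMDq not supported).
 */
struct vmdq_limits
pick_vmdq_limits(uint16_t max_vnics, uint16_t max_rx_queues)
{
	struct vmdq_limits lim = { 0, 0 };
	uint16_t vpool = 64;  /* ETH_64_POOLS in the patch */
	uint16_t vrxq = 128;  /* ETH_VMDQ_DCB_NUM_QUEUES in the patch */
	int i, j;

	for (i = 0; i < 4; vpool >>= 1, i++) {
		if (max_vnics <= vpool)
			continue; /* not enough VNICs, try half as many pools */
		for (j = 0; j < 5; vrxq >>= 1, j++) {
			if (max_rx_queues > vrxq) {
				lim.pools = vpool > vrxq ? vrxq : vpool;
				lim.queues = vrxq;
				return lim;
			}
		}
		break; /* not enough Rx queues for any candidate */
	}
	return lim;
}

/* Usage example with made-up capability values. */
void
vmdq_example(void)
{
	struct vmdq_limits lim = pick_vmdq_limits(128, 256);

	assert(lim.pools == 64 && lim.queues == 128);
}
```

One consequence of the strict comparisons is that a device reporting exactly
64 VNICs ends up with 32 pools rather than 64; whether that is intended is a
question for the patch authors.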



[dpdk-dev] [PATCH v4 05/39] bnxt: add dev configure operation

2016-06-06 Thread Stephen Hurd
From: Ajit Khaparde 

This patch adds the bnxt_hwrm_port_phy_cfg() HWRM call,
and copies required information into the new struct bnxt_link_info.

v4:
Fixed a few issues identified by checkpatch.

Signed-off-by: Ajit Khaparde 
Reviewed-by: David Christensen 
Signed-off-by: Stephen Hurd 
---
 drivers/net/bnxt/bnxt.h|  32 +++
 drivers/net/bnxt/bnxt_ethdev.c |  24 ++
 drivers/net/bnxt/bnxt_hwrm.c   | 232 +++-
 drivers/net/bnxt/bnxt_hwrm.h   |   1 +
 drivers/net/bnxt/hsi_struct_def_dpdk.h | 470 +
 5 files changed, 758 insertions(+), 1 deletion(-)

diff --git a/drivers/net/bnxt/bnxt.h b/drivers/net/bnxt/bnxt.h
index f8707b2..bfce91e 100644
--- a/drivers/net/bnxt/bnxt.h
+++ b/drivers/net/bnxt/bnxt.h
@@ -81,6 +81,29 @@ struct bnxt_pf_info {
struct bnxt_vf_info *vf;
 };

+/* Max wait time is 10 * 100ms = 1s */
+#define BNXT_LINK_WAIT_CNT 10
+#define BNXT_LINK_WAIT_INTERVAL    100
+struct bnxt_link_info {
+   uint8_t phy_flags;
+   uint8_t mac_type;
+   uint8_t phy_link_status;
+   uint8_t loop_back;
+   uint8_t link_up;
+   uint8_t duplex;
+   uint8_t pause;
+   uint8_t force_pause;
+   uint8_t auto_pause;
+   uint8_t auto_mode;
+#define PHY_VER_LEN    3
+   uint8_t phy_ver[PHY_VER_LEN];
+   uint16_t link_speed;
+   uint16_t support_speeds;
+   uint16_t auto_link_speed;
+   uint16_t auto_link_speed_mask;
+   uint32_t preemphasis;
+};
+
 #define BNXT_COS_QUEUE_COUNT   8
 struct bnxt_cos_queue_info {
uint8_t id;
@@ -99,6 +122,14 @@ struct bnxt {
 #define BNXT_PF(bp)    (!((bp)->flags & BNXT_FLAG_VF))
 #define BNXT_VF(bp)    ((bp)->flags & BNXT_FLAG_VF)

+   unsigned int rx_nr_rings;
+   unsigned int rx_cp_nr_rings;
+   struct bnxt_rx_queue **rx_queues;
+
+   unsigned int tx_nr_rings;
+   unsigned int tx_cp_nr_rings;
+   struct bnxt_tx_queue **tx_queues;
+
 #define MAX_NUM_MAC_ADDR   32
uint8_t mac_addr[ETHER_ADDR_LEN];

@@ -109,6 +140,7 @@ struct bnxt {
uint16_t max_req_len;
uint16_t max_resp_len;

+   struct bnxt_link_info   link_info;
struct bnxt_cos_queue_info  cos_queue[BNXT_COS_QUEUE_COUNT];

struct bnxt_pf_info pf;
diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index a8a9912..b46d2ce 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -154,6 +154,29 @@ found:
dev_info->vmdq_queue_base = 0;
 }

+/* Configure the device based on the configuration provided */
+static int bnxt_dev_configure_op(struct rte_eth_dev *eth_dev)
+{
+   struct bnxt *bp = (struct bnxt *)eth_dev->data->dev_private;
+   int rc;
+
+   bp->rx_queues = (void *)eth_dev->data->rx_queues;
+   bp->tx_queues = (void *)eth_dev->data->tx_queues;
+
+   /* Inherit new configurations */
+   bp->rx_nr_rings = eth_dev->data->nb_rx_queues;
+   bp->tx_nr_rings = eth_dev->data->nb_tx_queues;
+   bp->rx_cp_nr_rings = bp->rx_nr_rings;
+   bp->tx_cp_nr_rings = bp->tx_nr_rings;
+
+   if (eth_dev->data->dev_conf.rxmode.jumbo_frame)
+   eth_dev->data->mtu =
+   eth_dev->data->dev_conf.rxmode.max_rx_pkt_len -
+   ETHER_HDR_LEN - ETHER_CRC_LEN - VLAN_TAG_SIZE;
+   rc = bnxt_set_hwrm_link_config(bp, true);
+   return rc;
+}
+
 /*
  * Initialization
  */
@@ -161,6 +184,7 @@ found:
 static struct eth_dev_ops bnxt_dev_ops = {
.dev_infos_get = bnxt_dev_info_get_op,
.dev_close = bnxt_dev_close_op,
+   .dev_configure = bnxt_dev_configure_op,
 };

 static bool bnxt_vf_pciid(uint16_t id)
diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c
index 8aba8cd..a2d7815 100644
--- a/drivers/net/bnxt/bnxt_hwrm.c
+++ b/drivers/net/bnxt/bnxt_hwrm.c
@@ -83,7 +83,7 @@ static int bnxt_hwrm_send_message_locked(struct bnxt *bp, void *msg,
/* Sanity check on the resp->resp_len */
rte_rmb();
if (resp->resp_len && resp->resp_len <=
-   bp->max_resp_len) {
+   bp->max_resp_len) {
/* Last byte of resp contains the valid key */
valid = (uint8_t *)resp + resp->resp_len - 1;
if (*valid == HWRM_RESP_VALID_KEY)
@@ -314,6 +314,61 @@ int bnxt_hwrm_func_driver_unregister(struct bnxt *bp, uint32_t flags)
return rc;
 }

+static int bnxt_hwrm_port_phy_cfg(struct bnxt *bp, struct bnxt_

[dpdk-dev] [PATCH v4 07/39] bnxt: declare ring structs and free() func

2016-06-06 Thread Stephen Hurd
From: Ajit Khaparde 

Declare ring structures and a ring free() function.
These are generic ring management functions which will be used to create
Tx, Rx and Completion rings in the subsequent patches.

v4:
Address checkpatch warnings.

Signed-off-by: Ajit Khaparde 
Reviewed-by: David Christensen 
Signed-off-by: Stephen Hurd 
---
 drivers/net/bnxt/Makefile|  1 +
 drivers/net/bnxt/bnxt_ring.c | 47 ++
 drivers/net/bnxt/bnxt_ring.h | 92 
 3 files changed, 140 insertions(+)
 create mode 100644 drivers/net/bnxt/bnxt_ring.c
 create mode 100644 drivers/net/bnxt/bnxt_ring.h

diff --git a/drivers/net/bnxt/Makefile b/drivers/net/bnxt/Makefile
index c57afaa..757ea62 100644
--- a/drivers/net/bnxt/Makefile
+++ b/drivers/net/bnxt/Makefile
@@ -50,6 +50,7 @@ EXPORT_MAP := rte_pmd_bnxt_version.map
 #
 SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_ethdev.c
 SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_hwrm.c
+SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_ring.c
 SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_vnic.c

 #
diff --git a/drivers/net/bnxt/bnxt_ring.c b/drivers/net/bnxt/bnxt_ring.c
new file mode 100644
index 000..d3b70cc
--- /dev/null
+++ b/drivers/net/bnxt/bnxt_ring.c
@@ -0,0 +1,47 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) Broadcom Limited.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Broadcom Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include "bnxt.h"
+#include "bnxt_ring.h"
+
+/*
+ * Generic ring handling
+ */
+
+void bnxt_free_ring(struct bnxt_ring_struct *ring)
+{
+   if (ring->vmem_size && *ring->vmem) {
+   memset((char *)*ring->vmem, 0, ring->vmem_size);
+   *ring->vmem = NULL;
+   }
+}
diff --git a/drivers/net/bnxt/bnxt_ring.h b/drivers/net/bnxt/bnxt_ring.h
new file mode 100644
index 000..ebbd759
--- /dev/null
+++ b/drivers/net/bnxt/bnxt_ring.h
@@ -0,0 +1,92 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) Broadcom Limited.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Broadcom Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGEN

[dpdk-dev] [PATCH v4 13/39] bnxt: initial Tx code implementation

2016-06-06 Thread Stephen Hurd
From: Ajit Khaparde 

Initial implementation of tx_pkt_burst for transmit.
Add code to allocate rings to bnxt_ring.c
This allows creation of rings in ASIC, which is used by the Tx function.

v4:
Address review comments and fix issues pointed out by checkpatch.

Signed-off-by: Ajit Khaparde 
Reviewed-by: David Christensen 
Signed-off-by: Stephen Hurd 
---
 drivers/net/bnxt/Makefile  |   1 +
 drivers/net/bnxt/bnxt_cpr.h|   4 +-
 drivers/net/bnxt/bnxt_ethdev.c |   3 +-
 drivers/net/bnxt/bnxt_ring.c   | 145 ++
 drivers/net/bnxt/bnxt_ring.h   |   8 +
 drivers/net/bnxt/bnxt_txq.c|  42 ++-
 drivers/net/bnxt/bnxt_txr.c| 314 
 drivers/net/bnxt/bnxt_txr.h|  71 +
 drivers/net/bnxt/hsi_struct_def_dpdk.h | 512 +
 9 files changed, 1091 insertions(+), 9 deletions(-)
 create mode 100644 drivers/net/bnxt/bnxt_txr.c
 create mode 100644 drivers/net/bnxt/bnxt_txr.h

diff --git a/drivers/net/bnxt/Makefile b/drivers/net/bnxt/Makefile
index f6a04f8..0785681 100644
--- a/drivers/net/bnxt/Makefile
+++ b/drivers/net/bnxt/Makefile
@@ -56,6 +56,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_ring.c
 SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_rxq.c
 SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_stats.c
 SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_txq.c
+SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_txr.c
 SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_vnic.c

 #
diff --git a/drivers/net/bnxt/bnxt_cpr.h b/drivers/net/bnxt/bnxt_cpr.h
index e6333fc..f104281 100644
--- a/drivers/net/bnxt/bnxt_cpr.h
+++ b/drivers/net/bnxt/bnxt_cpr.h
@@ -51,11 +51,11 @@

 #define B_CP_DB_REARM(cpr, raw_cons)   \
(*(uint32_t *)((cpr)->cp_doorbell) = (DB_CP_REARM_FLAGS | \
-   RING_CMP(&cpr->cp_ring_struct, raw_cons)))
+   RING_CMP(cpr->cp_ring_struct, raw_cons)))

 #define B_CP_DIS_DB(cpr, raw_cons) \
(*(uint32_t *)((cpr)->cp_doorbell) = (DB_CP_FLAGS | \
-   RING_CMP(&cpr->cp_ring_struct, raw_cons)))
+   RING_CMP(cpr->cp_ring_struct, raw_cons)))

 struct bnxt_ring_struct;
 struct bnxt_cp_ring_info {
diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index 3453509..4ace543 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -44,6 +44,7 @@
 #include "bnxt_rxq.h"
 #include "bnxt_stats.h"
 #include "bnxt_txq.h"
+#include "bnxt_txr.h"

 #define DRV_MODULE_NAME    "bnxt"
 static const char bnxt_version[] =
@@ -269,7 +270,7 @@ bnxt_dev_init(struct rte_eth_dev *eth_dev)
}
eth_dev->dev_ops = &bnxt_dev_ops;
/* eth_dev->rx_pkt_burst = &bnxt_recv_pkts; */
-   /* eth_dev->tx_pkt_burst = &bnxt_xmit_pkts; */
+   eth_dev->tx_pkt_burst = &bnxt_xmit_pkts;

rc = bnxt_alloc_hwrm_resources(bp);
if (rc) {
diff --git a/drivers/net/bnxt/bnxt_ring.c b/drivers/net/bnxt/bnxt_ring.c
index d3b70cc..be77bbe 100644
--- a/drivers/net/bnxt/bnxt_ring.c
+++ b/drivers/net/bnxt/bnxt_ring.c
@@ -31,8 +31,14 @@
  *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  */

+#include 
+
 #include "bnxt.h"
+#include "bnxt_cpr.h"
 #include "bnxt_ring.h"
+#include "bnxt_txr.h"
+
+#include "hsi_struct_def_dpdk.h"

 /*
  * Generic ring handling
@@ -45,3 +51,142 @@ void bnxt_free_ring(struct bnxt_ring_struct *ring)
*ring->vmem = NULL;
}
 }
+
+/*
+ * Allocates a completion ring with vmem and stats optionally also allocating
+ * a TX and/or RX ring.  Passing NULL as tx_ring_info and/or rx_ring_info
+ * to not allocate them.
+ *
+ * Order in the allocation is:
+ * stats - Always non-zero length
+ * cp vmem - Always zero-length, supported for the bnxt_ring_struct abstraction
+ * tx vmem - Only non-zero length if tx_ring_info is not NULL
+ * rx vmem - Only non-zero length if rx_ring_info is not NULL
+ * cp bd ring - Always non-zero length
+ * tx bd ring - Only non-zero length if tx_ring_info is not NULL
+ * rx bd ring - Only non-zero length if rx_ring_info is not NULL
+ */
+int bnxt_alloc_rings(struct bnxt *bp, uint16_t qidx,
+   struct bnxt_tx_ring_info *tx_ring_info,
+   struct bnxt_rx_ring_info *rx_ring_info,
+   struct bnxt_cp_ring_info *cp_ring_info,
+   const char *suffix)
+{
+   struct bnxt_ring_struct *cp_ring = cp_ring_info->cp_ring_struct;
+   struct bnxt_ring_struct *tx_ring;
+   /* TODO: RX ring */
+   /* struct bnxt_ring_struct *rx_ring; */
+   struct rte_pci_device *pdev = bp->pdev;
+   const struct rte_memzone *mz = NULL;
+   char mz_name[RTE_MEMZONE_NAMESIZE];
+
+   int stats_len = (tx_ring_info || rx_ring_info) ?
+   RTE_CACHE_LINE_ROUNDUP(sizeof(struct ct

[dpdk-dev] [PATCH v4 12/39] bnxt: Add statistics operations

2016-06-06 Thread Stephen Hurd
From: Ajit Khaparde 

Add get and clear statistics operations and the associated HWRM calls.

v4:
Address review comments and fix issues pointed out by checkpatch.

Signed-off-by: Ajit Khaparde 
Reviewed-by: David Christensen 
Signed-off-by: Stephen Hurd 
---
 drivers/net/bnxt/Makefile  |   1 +
 drivers/net/bnxt/bnxt_cpr.c|   1 +
 drivers/net/bnxt/bnxt_cpr.h|   2 -
 drivers/net/bnxt/bnxt_ethdev.c |   3 +
 drivers/net/bnxt/bnxt_hwrm.c   |  49 
 drivers/net/bnxt/bnxt_hwrm.h   |   8 +-
 drivers/net/bnxt/bnxt_rxq.c|   1 +
 drivers/net/bnxt/bnxt_stats.c  | 142 +
 drivers/net/bnxt/bnxt_stats.h  |  44 ++
 drivers/net/bnxt/bnxt_txq.c|   1 +
 drivers/net/bnxt/hsi_struct_def_dpdk.h | 107 +
 11 files changed, 355 insertions(+), 4 deletions(-)
 create mode 100644 drivers/net/bnxt/bnxt_stats.c
 create mode 100644 drivers/net/bnxt/bnxt_stats.h

diff --git a/drivers/net/bnxt/Makefile b/drivers/net/bnxt/Makefile
index 21ed71c..f6a04f8 100644
--- a/drivers/net/bnxt/Makefile
+++ b/drivers/net/bnxt/Makefile
@@ -54,6 +54,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_filter.c
 SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_hwrm.c
 SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_ring.c
 SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_rxq.c
+SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_stats.c
 SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_txq.c
 SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_vnic.c

diff --git a/drivers/net/bnxt/bnxt_cpr.c b/drivers/net/bnxt/bnxt_cpr.c
index ba8c0d4..06a618a 100644
--- a/drivers/net/bnxt/bnxt_cpr.c
+++ b/drivers/net/bnxt/bnxt_cpr.c
@@ -35,6 +35,7 @@
 #include "bnxt_cpr.h"
 #include "bnxt_hwrm.h"
 #include "bnxt_ring.h"
+#include "hsi_struct_def_dpdk.h"

 /*
  * Async event handling
diff --git a/drivers/net/bnxt/bnxt_cpr.h b/drivers/net/bnxt/bnxt_cpr.h
index 878c7c9..e6333fc 100644
--- a/drivers/net/bnxt/bnxt_cpr.h
+++ b/drivers/net/bnxt/bnxt_cpr.h
@@ -34,8 +34,6 @@
 #ifndef _BNXT_CPR_H_
 #define _BNXT_CPR_H_

-#include "hsi_struct_def_dpdk.h"
-
 #define CMP_VALID(cmp, raw_cons, ring) \
(!!(((struct cmpl_base *)(cmp))->info3_v & CMPL_BASE_V) ==  \
 !((raw_cons) & ((ring)->ring_size)))
diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index 7e7d1ab..3453509 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -42,6 +42,7 @@
 #include "bnxt.h"
 #include "bnxt_hwrm.h"
 #include "bnxt_rxq.h"
+#include "bnxt_stats.h"
 #include "bnxt_txq.h"

 #define DRV_MODULE_NAME "bnxt"
@@ -187,6 +188,8 @@ static struct eth_dev_ops bnxt_dev_ops = {
.dev_infos_get = bnxt_dev_info_get_op,
.dev_close = bnxt_dev_close_op,
.dev_configure = bnxt_dev_configure_op,
+   .stats_get = bnxt_stats_get_op,
+   .stats_reset = bnxt_stats_reset_op,
.rx_queue_setup = bnxt_rx_queue_setup_op,
.rx_queue_release = bnxt_rx_queue_release_op,
.tx_queue_setup = bnxt_tx_queue_setup_op,
diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c
index 5d9a991..2574bd0 100644
--- a/drivers/net/bnxt/bnxt_hwrm.c
+++ b/drivers/net/bnxt/bnxt_hwrm.c
@@ -39,8 +39,11 @@
 #include 

 #include "bnxt.h"
+#include "bnxt_cpr.h"
 #include "bnxt_filter.h"
 #include "bnxt_hwrm.h"
+#include "bnxt_rxq.h"
+#include "bnxt_txq.h"
 #include "hsi_struct_def_dpdk.h"

 #define HWRM_CMD_TIMEOUT   2000
@@ -436,10 +439,56 @@ int bnxt_hwrm_queue_qportcfg(struct bnxt *bp)
return rc;
 }

+int bnxt_hwrm_stat_clear(struct bnxt *bp, struct bnxt_cp_ring_info *cpr)
+{
+   int rc = 0;
+   struct hwrm_stat_ctx_clr_stats_input req = {.req_type = 0 };
+   struct hwrm_stat_ctx_clr_stats_output *resp = bp->hwrm_cmd_resp_addr;
+
+   HWRM_PREP(req, STAT_CTX_CLR_STATS, -1, resp);
+
+   if (cpr->hw_stats_ctx_id == (uint32_t)HWRM_NA_SIGNATURE)
+   return rc;
+
+   req.stat_ctx_id = rte_cpu_to_le_16(cpr->hw_stats_ctx_id);
+   req.seq_id = rte_cpu_to_le_16(bp->hwrm_cmd_seq++);
+
+   rc = bnxt_hwrm_send_message(bp, &req, sizeof(req));
+
+   HWRM_CHECK_RESULT;
+
+   return rc;
+}
+
 /*
  * HWRM utility functions
  */

+int bnxt_clear_all_hwrm_stat_ctxs(struct bnxt *bp)
+{
+   unsigned int i;
+   int rc = 0;
+
+   for (i = 0; i < bp->rx_cp_nr_rings + bp->tx_cp_nr_rings; i++) {
+   struct bnxt_tx_queue *txq;
+   struct bnxt_rx_queue *rxq;
+   struct bnxt_cp_ring_info *cpr;
+
+   if (i >= bp->rx_cp_nr_rings) {
+   txq = bp->tx_queues[i - bp->rx_cp_nr_rings];
+   cpr = txq->cp_ring;
+   } else {
+   rxq = bp->rx_queues[i];
+   cpr = rxq->cp_ring;
+   }
+
+   rc = bnxt_hwrm_stat_clear(bp, cpr);
+   if

[dpdk-dev] [PATCH v4 09/39] bnxt: add L2 filter alloc/init/free

2016-06-06 Thread Stephen Hurd
From: Ajit Khaparde 

Add the L2 filter structure and the alloc/init/free functions for
dealing with them.

A filter identifies traffic that matches a set of parameters, such as a
unicast or broadcast MAC address or a VLAN tag, which allows the ASIC
to direct the incoming traffic
to an appropriate VNIC or Rx ring.
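
As a rough illustration of the alloc/free pattern, here is a minimal
sketch of a filter pool kept on a STAILQ free list, loosely modelled on
bnxt_alloc_filter() in this patch. All demo_* names are hypothetical,
and defaulting the address mask to all ones (exact MAC match) is an
assumption, not something taken from the driver.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/queue.h>

#define DEMO_MAC_LEN 6

/* Hypothetical stand-in for struct bnxt_filter_info. */
struct demo_filter {
	STAILQ_ENTRY(demo_filter) next;
	uint8_t l2_addr[DEMO_MAC_LEN];
	uint8_t l2_addr_mask[DEMO_MAC_LEN];
};

STAILQ_HEAD(demo_filter_list, demo_filter);

/* Put every element of a preallocated array on the free list. */
static void demo_init_filters(struct demo_filter_list *free_list,
			      struct demo_filter *pool, unsigned int count)
{
	unsigned int i;

	assert(pool != NULL || count == 0);
	STAILQ_INIT(free_list);
	for (i = 0; i < count; i++)
		STAILQ_INSERT_TAIL(free_list, &pool[i], next);
}

/* Take the first unused filter and default it to a MAC address match. */
static struct demo_filter *demo_alloc_filter(struct demo_filter_list *free_list,
					     const uint8_t *mac)
{
	struct demo_filter *filter = STAILQ_FIRST(free_list);

	if (filter == NULL)
		return NULL;	/* pool exhausted */
	STAILQ_REMOVE_HEAD(free_list, next);

	memcpy(filter->l2_addr, mac, DEMO_MAC_LEN);
	/* Assumed default: compare all address bits. */
	memset(filter->l2_addr_mask, 0xff, DEMO_MAC_LEN);
	return filter;
}

/* Return a filter to the pool so it can be handed out again. */
static void demo_free_filter(struct demo_filter_list *free_list,
			     struct demo_filter *filter)
{
	STAILQ_INSERT_TAIL(free_list, filter, next);
}
```

Because the pool is preallocated, allocation in this sketch never calls
into the memory allocator; running out of filters simply returns NULL.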

v4:
Address review comments and fix issues pointed out by checkpatch.

Signed-off-by: Ajit Khaparde 
Reviewed-by: David Christensen 
Signed-off-by: Stephen Hurd 
---
 drivers/net/bnxt/Makefile  |   1 +
 drivers/net/bnxt/bnxt.h|   3 +
 drivers/net/bnxt/bnxt_filter.c | 175 +
 drivers/net/bnxt/bnxt_filter.h |  74 ++
 drivers/net/bnxt/bnxt_hwrm.c   |  21 ++
 drivers/net/bnxt/bnxt_hwrm.h   |   3 +
 drivers/net/bnxt/hsi_struct_def_dpdk.h | 456 +
 7 files changed, 733 insertions(+)
 create mode 100644 drivers/net/bnxt/bnxt_filter.c
 create mode 100644 drivers/net/bnxt/bnxt_filter.h

diff --git a/drivers/net/bnxt/Makefile b/drivers/net/bnxt/Makefile
index afd1690..b7834b1 100644
--- a/drivers/net/bnxt/Makefile
+++ b/drivers/net/bnxt/Makefile
@@ -50,6 +50,7 @@ EXPORT_MAP := rte_pmd_bnxt_version.map
 #
 SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_cpr.c
 SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_ethdev.c
+SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_filter.c
 SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_hwrm.c
 SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_ring.c
 SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_vnic.c
diff --git a/drivers/net/bnxt/bnxt.h b/drivers/net/bnxt/bnxt.h
index bdd355f..49aa38b 100644
--- a/drivers/net/bnxt/bnxt.h
+++ b/drivers/net/bnxt/bnxt.h
@@ -145,6 +145,9 @@ struct bnxt {
struct bnxt_vnic_info   *vnic_info;
STAILQ_HEAD(, bnxt_vnic_info)   free_vnic_list;

+   struct bnxt_filter_info *filter_info;
+   STAILQ_HEAD(, bnxt_filter_info) free_filter_list;
+
/* VNIC pointer for flow filter (VMDq) pools */
 #define MAX_FF_POOLS   ETH_64_POOLS
STAILQ_HEAD(, bnxt_vnic_info)   ff_pool[MAX_FF_POOLS];
diff --git a/drivers/net/bnxt/bnxt_filter.c b/drivers/net/bnxt/bnxt_filter.c
new file mode 100644
index 000..f03a1dc
--- /dev/null
+++ b/drivers/net/bnxt/bnxt_filter.c
@@ -0,0 +1,175 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) Broadcom Limited.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Broadcom Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include 
+
+#include 
+#include 
+
+#include "bnxt.h"
+#include "bnxt_filter.h"
+#include "bnxt_hwrm.h"
+#include "bnxt_vnic.h"
+#include "hsi_struct_def_dpdk.h"
+
+/*
+ * Filter Functions
+ */
+
+struct bnxt_filter_info *bnxt_alloc_filter(struct bnxt *bp)
+{
+   struct bnxt_filter_info *filter;
+
+   /* Find the 1st unused filter from the free_filter_list pool*/
+   filter = STAILQ_FIRST(&bp->free_filter_list);
+   if (!filter) {
+   RTE_LOG(ERR, PMD, "No more free filter resources\n");
+   return NULL;
+   }
+   STAILQ_REMOVE_HEAD(&bp->free_filter_list, next);
+
+   /* Default to L2 MAC Addr filter */
+   filter->flags = HWRM_CFA_L2_FILTER_ALLOC_INPUT_FLAGS_PATH_RX;
+   filter->enables = HWRM_CFA_L2_FILTER_ALLOC_INPUT_ENABLES_L2_ADDR |
+   HWRM_CFA_L2_FILTER_ALLOC_INPUT_ENABLES_L2_ADDR_MASK;
+   memcpy(filter->l2_addr, bp->eth_dev->data->mac_addrs->addr_bytes,
+  ETHER_AD

[dpdk-dev] [PATCH v4 06/39] bnxt: add vnic functions and structs

2016-06-06 Thread Stephen Hurd
From: Ajit Khaparde 

Add functions to allocate, initialize, and free vnics.

A VNIC represents a virtual interface. It is a resource in the RX path
of the chip and is used to set up various target actions, such as RSS
and MAC filtering, for the physical function in use.

Signed-off-by: Ajit Khaparde 
Reviewed-by: David Christensen 
Signed-off-by: Stephen Hurd 
---
 drivers/net/bnxt/Makefile  |   1 +
 drivers/net/bnxt/bnxt.h|  14 ++
 drivers/net/bnxt/bnxt_vnic.c   | 277 +
 drivers/net/bnxt/bnxt_vnic.h   |  80 ++
 drivers/net/bnxt/hsi_struct_def_dpdk.h |   5 +-
 5 files changed, 376 insertions(+), 1 deletion(-)
 create mode 100644 drivers/net/bnxt/bnxt_vnic.c
 create mode 100644 drivers/net/bnxt/bnxt_vnic.h

diff --git a/drivers/net/bnxt/Makefile b/drivers/net/bnxt/Makefile
index 9965597..c57afaa 100644
--- a/drivers/net/bnxt/Makefile
+++ b/drivers/net/bnxt/Makefile
@@ -50,6 +50,7 @@ EXPORT_MAP := rte_pmd_bnxt_version.map
 #
 SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_ethdev.c
 SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_hwrm.c
+SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_vnic.c

 #
 # Export include files
diff --git a/drivers/net/bnxt/bnxt.h b/drivers/net/bnxt/bnxt.h
index bfce91e..d0f84f4 100644
--- a/drivers/net/bnxt/bnxt.h
+++ b/drivers/net/bnxt/bnxt.h
@@ -45,6 +45,13 @@
 #define BNXT_MAX_MTU   9000
 #define VLAN_TAG_SIZE  4

+enum bnxt_hw_context {
+   HW_CONTEXT_NONE = 0,
+   HW_CONTEXT_IS_RSS   = 1,
+   HW_CONTEXT_IS_COS   = 2,
+   HW_CONTEXT_IS_LB= 3,
+};
+
 struct bnxt_vf_info {
uint16_t fw_fid;
uint8_t mac_addr[ETHER_ADDR_LEN];
@@ -130,6 +137,13 @@ struct bnxt {
unsigned int tx_cp_nr_rings;
struct bnxt_tx_queue **tx_queues;

+   struct bnxt_vnic_info   *vnic_info;
+   STAILQ_HEAD(, bnxt_vnic_info)   free_vnic_list;
+
+   /* VNIC pointer for flow filter (VMDq) pools */
+#define MAX_FF_POOLS   ETH_64_POOLS
+   STAILQ_HEAD(, bnxt_vnic_info)   ff_pool[MAX_FF_POOLS];
+
 #define MAX_NUM_MAC_ADDR   32
uint8_t mac_addr[ETHER_ADDR_LEN];

diff --git a/drivers/net/bnxt/bnxt_vnic.c b/drivers/net/bnxt/bnxt_vnic.c
new file mode 100644
index 000..c04c4c7
--- /dev/null
+++ b/drivers/net/bnxt/bnxt_vnic.c
@@ -0,0 +1,277 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2014-2015 Broadcom Corporation.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Broadcom Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include 
+
+#include 
+#include 
+
+#include "bnxt.h"
+#include "bnxt_vnic.h"
+#include "hsi_struct_def_dpdk.h"
+
+/*
+ * VNIC Functions
+ */
+
+static void prandom_bytes(void *dest_ptr, size_t len)
+{
+   char *dest = (char *)dest_ptr;
+   uint64_t rb;
+
+   while (len) {
+   rb = rte_rand();
+   if (len >= 8) {
+   memcpy(dest, &rb, 8);
+   len -= 8;
+   dest += 8;
+   } else {
+   memcpy(dest, &rb, len);
+   dest += len;
+   len = 0;
+   }
+   }
+}
+
+void bnxt_init_vnics(struct bnxt *bp)
+{
+   struct bnxt_vnic_info *vnic;
+   uint16_t max_vnics;
+   int i, j;
+
+   if (BNXT_PF(bp)) {
+   struct bnxt_pf_info *pf = &bp->pf;
+
+   max_vnics = pf->max_vnic

[dpdk-dev] [PATCH v4 08/39] bnxt: add completion ring support

2016-06-06 Thread Stephen Hurd
From: Ajit Khaparde 

Structures, macros, and functions for working with completion rings
in the driver.

The completion ring is used by the Ethernet controller to report the
status of transmitted and received packets, errors, and status changes
to the host software.
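
The sketch below illustrates the valid-bit test that the CMP_VALID
macro in bnxt_cpr.h performs: an entry counts as new when its valid bit
matches the phase implied by the raw consumer index. The demo_* names,
the bit position, and the producer routine (which stands in for the
device) are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_RING_SIZE 4u	/* must be a power of two */
#define DEMO_V_BIT     0x1u	/* hypothetical position of the valid bit */

struct demo_cmpl {
	uint32_t info3_v;
};

/* Same comparison as CMP_VALID: valid bit versus expected phase. */
static bool demo_cmp_valid(const struct demo_cmpl *cmp, uint32_t raw_cons)
{
	return !!(cmp->info3_v & DEMO_V_BIT) == !(raw_cons & DEMO_RING_SIZE);
}

/* Producer (the device, in a real driver): stamp the phase of this pass. */
static void demo_produce(struct demo_cmpl *ring, uint32_t raw_prod)
{
	struct demo_cmpl *cmp = &ring[raw_prod & (DEMO_RING_SIZE - 1)];

	assert((DEMO_RING_SIZE & (DEMO_RING_SIZE - 1)) == 0);
	if (raw_prod & DEMO_RING_SIZE)
		cmp->info3_v &= ~DEMO_V_BIT;	/* odd pass: valid bit is 0 */
	else
		cmp->info3_v |= DEMO_V_BIT;	/* even pass: valid bit is 1 */
}

/* Consumer: advance *raw_cons over every entry that is ready. */
static unsigned int demo_poll(const struct demo_cmpl *ring, uint32_t *raw_cons)
{
	unsigned int n = 0;

	while (n < DEMO_RING_SIZE &&
	       demo_cmp_valid(&ring[*raw_cons & (DEMO_RING_SIZE - 1)],
			      *raw_cons)) {
		(*raw_cons)++;
		n++;
	}
	return n;
}
```

Since the expected phase flips every time the index wraps, stale
entries from the previous pass are never mistaken for new ones and the
ring does not have to be cleared after it is consumed.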

v4:
Address review comments and fix issues pointed out by checkpatch.

Signed-off-by: Ajit Khaparde 
Reviewed-by: David Christensen 
Signed-off-by: Stephen Hurd 
---
 drivers/net/bnxt/Makefile  |   1 +
 drivers/net/bnxt/bnxt.h|   5 +
 drivers/net/bnxt/bnxt_cpr.c| 140 +++
 drivers/net/bnxt/bnxt_cpr.h|  88 
 drivers/net/bnxt/bnxt_hwrm.c   |  18 +++
 drivers/net/bnxt/bnxt_hwrm.h   |   2 +
 drivers/net/bnxt/hsi_struct_def_dpdk.h | 239 -
 7 files changed, 487 insertions(+), 6 deletions(-)
 create mode 100644 drivers/net/bnxt/bnxt_cpr.c
 create mode 100644 drivers/net/bnxt/bnxt_cpr.h

diff --git a/drivers/net/bnxt/Makefile b/drivers/net/bnxt/Makefile
index 757ea62..afd1690 100644
--- a/drivers/net/bnxt/Makefile
+++ b/drivers/net/bnxt/Makefile
@@ -48,6 +48,7 @@ EXPORT_MAP := rte_pmd_bnxt_version.map
 #
 # all source are stored in SRCS-y
 #
+SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_cpr.c
 SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_ethdev.c
 SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_hwrm.c
 SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_ring.c
diff --git a/drivers/net/bnxt/bnxt.h b/drivers/net/bnxt/bnxt.h
index d0f84f4..bdd355f 100644
--- a/drivers/net/bnxt/bnxt.h
+++ b/drivers/net/bnxt/bnxt.h
@@ -42,6 +42,8 @@
 #include 
 #include 

+#include "bnxt_cpr.h"
+
 #define BNXT_MAX_MTU   9000
 #define VLAN_TAG_SIZE  4

@@ -137,6 +139,9 @@ struct bnxt {
unsigned int tx_cp_nr_rings;
struct bnxt_tx_queue **tx_queues;

+   /* Default completion ring */
+   struct bnxt_cp_ring_info *def_cp_ring;
+
struct bnxt_vnic_info   *vnic_info;
STAILQ_HEAD(, bnxt_vnic_info)   free_vnic_list;

diff --git a/drivers/net/bnxt/bnxt_cpr.c b/drivers/net/bnxt/bnxt_cpr.c
new file mode 100644
index 000..ba8c0d4
--- /dev/null
+++ b/drivers/net/bnxt/bnxt_cpr.c
@@ -0,0 +1,140 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) Broadcom Limited.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Broadcom Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include "bnxt.h"
+#include "bnxt_cpr.h"
+#include "bnxt_hwrm.h"
+#include "bnxt_ring.h"
+
+/*
+ * Async event handling
+ */
+void bnxt_handle_async_event(struct bnxt *bp __rte_unused,
+struct cmpl_base *cmp)
+{
+   struct hwrm_async_event_cmpl *async_cmp =
+   (struct hwrm_async_event_cmpl *)cmp;
+
+   /* TODO: HWRM async events are not defined yet */
+   /* Needs to handle: link events, error events, etc. */
+   switch (async_cmp->event_id) {
+   case 0:
+   /* Assume LINK_CHANGE == 0 */
+   RTE_LOG(INFO, PMD, "Link change event\n");
+
+   /* Can just prompt the update_op routine to do a qcfg
+* instead of doing the actual qcfg
+*/
+   break;
+   case 1:
+   break;
+   default:
+   RTE_LOG(ERR, PMD, "handle_async_event id = 0x%x\n",
+   async_cmp->event_id);
+   break;
+   }
+}
+
+void bnxt_handle_fwd_req(struct bnxt *bp, str

[dpdk-dev] [PATCH v4 11/39] bnxt: add Rx queue create/destroy operations

2016-06-06 Thread Stephen Hurd
From: Ajit Khaparde 

Add the initial Rx queue create/destroy code. It is not functional until
RX ring support is added in subsequent patches.

v4:
Address review comments and fix issues pointed out by checkpatch.

Signed-off-by: Ajit Khaparde 
Reviewed-by: David Christensen 
Signed-off-by: Stephen Hurd 
---
 drivers/net/bnxt/Makefile  |   1 +
 drivers/net/bnxt/bnxt.h|   2 +
 drivers/net/bnxt/bnxt_ethdev.c |   3 +
 drivers/net/bnxt/bnxt_rxq.c| 288 +
 drivers/net/bnxt/bnxt_rxq.h|  74 +
 drivers/net/bnxt/hsi_struct_def_dpdk.h | 121 ++
 6 files changed, 489 insertions(+)
 create mode 100644 drivers/net/bnxt/bnxt_rxq.c
 create mode 100644 drivers/net/bnxt/bnxt_rxq.h

diff --git a/drivers/net/bnxt/Makefile b/drivers/net/bnxt/Makefile
index 13a90b9..21ed71c 100644
--- a/drivers/net/bnxt/Makefile
+++ b/drivers/net/bnxt/Makefile
@@ -53,6 +53,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_ethdev.c
 SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_filter.c
 SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_hwrm.c
 SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_ring.c
+SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_rxq.c
 SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_txq.c
 SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_vnic.c

diff --git a/drivers/net/bnxt/bnxt.h b/drivers/net/bnxt/bnxt.h
index 49aa38b..f7cf9d1 100644
--- a/drivers/net/bnxt/bnxt.h
+++ b/drivers/net/bnxt/bnxt.h
@@ -142,6 +142,8 @@ struct bnxt {
/* Default completion ring */
struct bnxt_cp_ring_info *def_cp_ring;

+   unsigned int nr_vnics;
+
struct bnxt_vnic_info   *vnic_info;
STAILQ_HEAD(, bnxt_vnic_info)   free_vnic_list;

diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index 77a6d92..7e7d1ab 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -41,6 +41,7 @@

 #include "bnxt.h"
 #include "bnxt_hwrm.h"
+#include "bnxt_rxq.h"
 #include "bnxt_txq.h"

 #define DRV_MODULE_NAME "bnxt"
@@ -186,6 +187,8 @@ static struct eth_dev_ops bnxt_dev_ops = {
.dev_infos_get = bnxt_dev_info_get_op,
.dev_close = bnxt_dev_close_op,
.dev_configure = bnxt_dev_configure_op,
+   .rx_queue_setup = bnxt_rx_queue_setup_op,
+   .rx_queue_release = bnxt_rx_queue_release_op,
.tx_queue_setup = bnxt_tx_queue_setup_op,
.tx_queue_release = bnxt_tx_queue_release_op,
 };
diff --git a/drivers/net/bnxt/bnxt_rxq.c b/drivers/net/bnxt/bnxt_rxq.c
new file mode 100644
index 000..11e9466
--- /dev/null
+++ b/drivers/net/bnxt/bnxt_rxq.c
@@ -0,0 +1,288 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) Broadcom Limited.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Broadcom Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include 
+
+#include 
+
+#include "bnxt.h"
+#include "bnxt_filter.h"
+#include "bnxt_hwrm.h"
+#include "bnxt_ring.h"
+#include "bnxt_rxq.h"
+#include "bnxt_vnic.h"
+#include "hsi_struct_def_dpdk.h"
+
+/*
+ * RX Queues
+ */
+
+void bnxt_free_rxq_stats(struct bnxt_rx_queue *rxq)
+{
+   struct bnxt_cp_ring_info *cpr = rxq->cp_ring;
+
+   /* 'Unreserve' rte_memzone */
+   /* N/A */
+
+   if (cpr->hw_stats)
+   cpr->hw_stats = NULL;
+}
+
+int bnxt_mq_rx_configure(struct bnxt *bp)
+{
+   struct rte_eth_conf *dev_conf = &bp->eth_dev->data->dev_conf;
+   unsigned int i, j, nb_q

[dpdk-dev] [PATCH v4 32/39] bnxt: add all multicast enable/disable operations

2016-06-06 Thread Stephen Hurd
From: Ajit Khaparde 

This patch adds dev_ops to enable/disable multicast traffic.

Signed-off-by: Ajit Khaparde 
Reviewed-by: David Christensen 
Signed-off-by: Stephen Hurd 
---
 drivers/net/bnxt/bnxt_ethdev.c | 30 ++
 1 file changed, 30 insertions(+)

diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index 3fce540..d3a624f 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -465,6 +465,34 @@ static void bnxt_promiscuous_disable_op(struct rte_eth_dev *eth_dev)
bnxt_hwrm_cfa_l2_set_rx_mask(bp, vnic);
 }

+static void bnxt_allmulticast_enable_op(struct rte_eth_dev *eth_dev)
+{
+   struct bnxt *bp = (struct bnxt *)eth_dev->data->dev_private;
+   struct bnxt_vnic_info *vnic;
+
+   if (bp->vnic_info == NULL)
+   return;
+
+   vnic = &bp->vnic_info[0];
+
+   vnic->flags |= BNXT_VNIC_INFO_ALLMULTI;
+   bnxt_hwrm_cfa_l2_set_rx_mask(bp, vnic);
+}
+
+static void bnxt_allmulticast_disable_op(struct rte_eth_dev *eth_dev)
+{
+   struct bnxt *bp = (struct bnxt *)eth_dev->data->dev_private;
+   struct bnxt_vnic_info *vnic;
+
+   if (bp->vnic_info == NULL)
+   return;
+
+   vnic = &bp->vnic_info[0];
+
+   vnic->flags &= ~BNXT_VNIC_INFO_ALLMULTI;
+   bnxt_hwrm_cfa_l2_set_rx_mask(bp, vnic);
+}
+
 /*
  * Initialization
  */
@@ -484,6 +512,8 @@ static struct eth_dev_ops bnxt_dev_ops = {
.link_update = bnxt_link_update_op,
.promiscuous_enable = bnxt_promiscuous_enable_op,
.promiscuous_disable = bnxt_promiscuous_disable_op,
+   .allmulticast_enable = bnxt_allmulticast_enable_op,
+   .allmulticast_disable = bnxt_allmulticast_disable_op,
 };

 static bool bnxt_vf_pciid(uint16_t id)
-- 
1.9.1



[dpdk-dev] [PATCH v4 30/39] bnxt: add start/stop/link update operations

2016-06-06 Thread Stephen Hurd
From: Ajit Khaparde 

This patch adds the start, stop and link update dev_ops.
The BNXT driver will now minimally pass traffic with testpmd.
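
For readers following the RSS setup in bnxt_init_chip() in this patch,
here is a minimal sketch of how a redirection table can be filled by
cycling through a list of ring group ids. The demo_* names and the
sentinel value are hypothetical; the assumption that the id list is
terminated by an invalid-id sentinel is inferred from the loop in the
diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_INVALID_RING_ID 0xffffu	/* hypothetical sentinel value */

/*
 * Fill table[0..table_len) by cycling through grp_ids, an array that is
 * terminated by DEMO_INVALID_RING_ID.
 */
static void demo_fill_rss_table(uint16_t *table, size_t table_len,
				const uint16_t *grp_ids)
{
	size_t rss_idx, fw_idx;

	/* At least one usable ring group is required. */
	assert(grp_ids[0] != DEMO_INVALID_RING_ID);

	for (rss_idx = 0, fw_idx = 0; rss_idx < table_len;
	     rss_idx++, fw_idx++) {
		if (grp_ids[fw_idx] == DEMO_INVALID_RING_ID)
			fw_idx = 0;	/* wrap to the first ring group */
		table[rss_idx] = grp_ids[fw_idx];
	}
}
```

With three ring groups and an eight-entry table, the table would repeat
the three ids in order, so hash buckets are spread roughly evenly over
the rings.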

v4:
- Fix issues pointed out by checkpatch.
- Shorten the string passed for reserving memzone
when default completion ring is created.

Signed-off-by: Ajit Khaparde 
Reviewed-by: David Christensen 
Signed-off-by: Stephen Hurd 
---
 drivers/net/bnxt/bnxt_ethdev.c | 269 +
 1 file changed, 269 insertions(+)

diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index 6888363..ac82876 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -40,12 +40,17 @@
 #include 

 #include "bnxt.h"
+#include "bnxt_cpr.h"
+#include "bnxt_filter.h"
 #include "bnxt_hwrm.h"
+#include "bnxt_ring.h"
 #include "bnxt_rxq.h"
 #include "bnxt_rxr.h"
 #include "bnxt_stats.h"
 #include "bnxt_txq.h"
 #include "bnxt_txr.h"
+#include "bnxt_vnic.h"
+#include "hsi_struct_def_dpdk.h"

 #define DRV_MODULE_NAME "bnxt"
 static const char bnxt_version[] =
@@ -65,6 +70,177 @@ static void bnxt_dev_close_op(struct rte_eth_dev *eth_dev)
bnxt_free_hwrm_resources(bp);
 }

+/***/
+
+/*
+ * High level utility functions
+ */
+
+static void bnxt_free_mem(struct bnxt *bp)
+{
+   bnxt_free_filter_mem(bp);
+   bnxt_free_vnic_attributes(bp);
+   bnxt_free_vnic_mem(bp);
+
+   bnxt_free_stats(bp);
+   bnxt_free_tx_rings(bp);
+   bnxt_free_rx_rings(bp);
+   bnxt_free_def_cp_ring(bp);
+}
+
+static int bnxt_alloc_mem(struct bnxt *bp)
+{
+   int rc;
+
+   /* Default completion ring */
+   rc = bnxt_init_def_ring_struct(bp, SOCKET_ID_ANY);
+   if (rc)
+   goto alloc_mem_err;
+
+   rc = bnxt_alloc_rings(bp, 0, NULL, NULL,
+ bp->def_cp_ring, "def_cp");
+   if (rc)
+   goto alloc_mem_err;
+
+   rc = bnxt_alloc_vnic_mem(bp);
+   if (rc)
+   goto alloc_mem_err;
+
+   rc = bnxt_alloc_vnic_attributes(bp);
+   if (rc)
+   goto alloc_mem_err;
+
+   rc = bnxt_alloc_filter_mem(bp);
+   if (rc)
+   goto alloc_mem_err;
+
+   return 0;
+
+alloc_mem_err:
+   bnxt_free_mem(bp);
+   return rc;
+}
+
+static int bnxt_init_chip(struct bnxt *bp)
+{
+   unsigned int i, rss_idx, fw_idx;
+   int rc;
+
+   rc = bnxt_alloc_all_hwrm_stat_ctxs(bp);
+   if (rc) {
+   RTE_LOG(ERR, PMD, "HWRM stat ctx alloc failure rc: %x\n", rc);
+   goto err_out;
+   }
+
+   rc = bnxt_alloc_hwrm_rings(bp);
+   if (rc) {
+   RTE_LOG(ERR, PMD, "HWRM ring alloc failure rc: %x\n", rc);
+   goto err_out;
+   }
+
+   rc = bnxt_alloc_all_hwrm_ring_grps(bp);
+   if (rc) {
+   RTE_LOG(ERR, PMD, "HWRM ring grp alloc failure: %x\n", rc);
+   goto err_out;
+   }
+
+   rc = bnxt_mq_rx_configure(bp);
+   if (rc) {
+   RTE_LOG(ERR, PMD, "MQ mode configure failure rc: %x\n", rc);
+   goto err_out;
+   }
+
+   /* VNIC configuration */
+   for (i = 0; i < bp->nr_vnics; i++) {
+   struct bnxt_vnic_info *vnic = &bp->vnic_info[i];
+
+   rc = bnxt_hwrm_vnic_alloc(bp, vnic);
+   if (rc) {
+   RTE_LOG(ERR, PMD, "HWRM vnic alloc failure rc: %x\n",
+   rc);
+   goto err_out;
+   }
+
+   rc = bnxt_hwrm_vnic_ctx_alloc(bp, vnic);
+   if (rc) {
+   RTE_LOG(ERR, PMD,
+   "HWRM vnic ctx alloc failure rc: %x\n", rc);
+   goto err_out;
+   }
+
+   rc = bnxt_hwrm_vnic_cfg(bp, vnic);
+   if (rc) {
+   RTE_LOG(ERR, PMD, "HWRM vnic cfg failure rc: %x\n", rc);
+   goto err_out;
+   }
+
+   rc = bnxt_set_hwrm_vnic_filters(bp, vnic);
+   if (rc) {
+   RTE_LOG(ERR, PMD, "HWRM vnic filter failure rc: %x\n",
+   rc);
+   goto err_out;
+   }
+   if (vnic->rss_table && vnic->hash_type) {
+   /*
+* Fill the RSS hash & redirection table with
+* ring group ids for all VNICs
+*/
+   for (rss_idx = 0, fw_idx = 0;
+rss_idx < HW_HASH_INDEX_SIZE;
+rss_idx++, fw_idx++) {
+   if (vnic->fw_grp_ids[fw_idx] ==
+   INVALID_HW_RING_ID)
+   fw_idx = 0;
+   vnic->rss_table[rss_idx] =
+   vnic->fw_grp_ids[fw_idx];
+

[dpdk-dev] [PATCH v4 31/39] bnxt: add promiscuous enable/disable operations

2016-06-06 Thread Stephen Hurd
From: Ajit Khaparde 

This patch adds the promiscuous mode enable and disable dev_ops.

v4:
Fix a couple of typos in the commit message.

Signed-off-by: Ajit Khaparde 
Reviewed-by: David Christensen 
Signed-off-by: Stephen Hurd 
---
 drivers/net/bnxt/bnxt_ethdev.c | 30 ++
 1 file changed, 30 insertions(+)

diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index ac82876..3fce540 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -437,6 +437,34 @@ out:
return rc;
 }

+static void bnxt_promiscuous_enable_op(struct rte_eth_dev *eth_dev)
+{
+   struct bnxt *bp = (struct bnxt *)eth_dev->data->dev_private;
+   struct bnxt_vnic_info *vnic;
+
+   if (bp->vnic_info == NULL)
+   return;
+
+   vnic = &bp->vnic_info[0];
+
+   vnic->flags |= BNXT_VNIC_INFO_PROMISC;
+   bnxt_hwrm_cfa_l2_set_rx_mask(bp, vnic);
+}
+
+static void bnxt_promiscuous_disable_op(struct rte_eth_dev *eth_dev)
+{
+   struct bnxt *bp = (struct bnxt *)eth_dev->data->dev_private;
+   struct bnxt_vnic_info *vnic;
+
+   if (bp->vnic_info == NULL)
+   return;
+
+   vnic = &bp->vnic_info[0];
+
+   vnic->flags &= ~BNXT_VNIC_INFO_PROMISC;
+   bnxt_hwrm_cfa_l2_set_rx_mask(bp, vnic);
+}
+
 /*
  * Initialization
  */
@@ -454,6 +482,8 @@ static struct eth_dev_ops bnxt_dev_ops = {
.tx_queue_setup = bnxt_tx_queue_setup_op,
.tx_queue_release = bnxt_tx_queue_release_op,
.link_update = bnxt_link_update_op,
+   .promiscuous_enable = bnxt_promiscuous_enable_op,
+   .promiscuous_disable = bnxt_promiscuous_disable_op,
 };

 static bool bnxt_vf_pciid(uint16_t id)
-- 
1.9.1



[dpdk-dev] [PATCH v4 22/39] bnxt: add API for L2 Rx mask set/clear functions

2016-06-06 Thread Stephen Hurd
From: Ajit Khaparde 

These HWRM APIs allow setting and clearing of Rx masks in L2 context
per VNIC.
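
A minimal sketch of how per-VNIC flags could be translated into an Rx
mask follows. The demo_* names and all numeric values are placeholders,
not the real HWRM definitions. Note one deliberate difference from the
patch: the sketch ORs each optional bit into the mask, whereas
bnxt_hwrm_cfa_l2_set_rx_mask() below assigns the mask in both branches,
so with both flags set only the all-multicast bit is kept there.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder flag and mask values; the real ones are in the driver headers. */
#define DEMO_VNIC_PROMISC   0x1u
#define DEMO_VNIC_ALLMULTI  0x2u

#define DEMO_MASK_MCAST     0x02u
#define DEMO_MASK_ALL_MCAST 0x04u
#define DEMO_MASK_BCAST     0x08u
#define DEMO_MASK_PROMISC   0x10u

/* Build the Rx mask for a VNIC from its software flags. */
static uint32_t demo_rx_mask(uint32_t vnic_flags)
{
	/* Multicast and broadcast are always accepted, as in the patch. */
	uint32_t mask = DEMO_MASK_MCAST | DEMO_MASK_BCAST;

	/* This demo rejects flags it does not know about. */
	assert((vnic_flags & ~(DEMO_VNIC_PROMISC | DEMO_VNIC_ALLMULTI)) == 0);

	if (vnic_flags & DEMO_VNIC_PROMISC)
		mask |= DEMO_MASK_PROMISC;
	if (vnic_flags & DEMO_VNIC_ALLMULTI)
		mask |= DEMO_MASK_ALL_MCAST;
	return mask;
}
```

Clearing the mask corresponds to sending a mask of zero, which is what
bnxt_hwrm_cfa_l2_clear_rx_mask() does in the diff.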

v4:
Address review comments.

Signed-off-by: Ajit Khaparde 
Reviewed-by: David Christensen 
Signed-off-by: Stephen Hurd 
---
 drivers/net/bnxt/bnxt_hwrm.c   |  45 +++
 drivers/net/bnxt/bnxt_hwrm.h   |   3 +
 drivers/net/bnxt/hsi_struct_def_dpdk.h | 135 +
 3 files changed, 183 insertions(+)

diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c
index c43c2da..7f39db0 100644
--- a/drivers/net/bnxt/bnxt_hwrm.c
+++ b/drivers/net/bnxt/bnxt_hwrm.c
@@ -141,6 +141,51 @@ static int bnxt_hwrm_send_message(struct bnxt *bp, void *msg, uint32_t msg_len)
} \
}

+int bnxt_hwrm_cfa_l2_clear_rx_mask(struct bnxt *bp, struct bnxt_vnic_info *vnic)
+{
+   int rc = 0;
+   struct hwrm_cfa_l2_set_rx_mask_input req = {.req_type = 0 };
+   struct hwrm_cfa_l2_set_rx_mask_output *resp = bp->hwrm_cmd_resp_addr;
+
+   HWRM_PREP(req, CFA_L2_SET_RX_MASK, -1, resp);
+   req.vnic_id = rte_cpu_to_le_16(vnic->fw_vnic_id);
+   req.mask = 0;
+
+   rc = bnxt_hwrm_send_message(bp, &req, sizeof(req));
+
+   HWRM_CHECK_RESULT;
+
+   return rc;
+}
+
+int bnxt_hwrm_cfa_l2_set_rx_mask(struct bnxt *bp, struct bnxt_vnic_info *vnic)
+{
+   int rc = 0;
+   struct hwrm_cfa_l2_set_rx_mask_input req = {.req_type = 0 };
+   struct hwrm_cfa_l2_set_rx_mask_output *resp = bp->hwrm_cmd_resp_addr;
+   uint32_t mask = 0;
+
+   HWRM_PREP(req, CFA_L2_SET_RX_MASK, -1, resp);
+   req.vnic_id = rte_cpu_to_le_16(vnic->fw_vnic_id);
+
+   /* FIXME add multicast flag, when multicast adding options is supported
+* by ethtool.
+*/
+   if (vnic->flags & BNXT_VNIC_INFO_PROMISC)
+       mask |= HWRM_CFA_L2_SET_RX_MASK_INPUT_MASK_PROMISCUOUS;
+   if (vnic->flags & BNXT_VNIC_INFO_ALLMULTI)
+       mask |= HWRM_CFA_L2_SET_RX_MASK_INPUT_MASK_ALL_MCAST;
+   req.mask = rte_cpu_to_le_32(HWRM_CFA_L2_SET_RX_MASK_INPUT_MASK_MCAST |
+   HWRM_CFA_L2_SET_RX_MASK_INPUT_MASK_BCAST |
+   mask);
+
+   rc = bnxt_hwrm_send_message(bp, &req, sizeof(req));
+
+   HWRM_CHECK_RESULT;
+
+   return rc;
+}
+
 int bnxt_hwrm_clear_filter(struct bnxt *bp,
   struct bnxt_filter_info *filter)
 {
diff --git a/drivers/net/bnxt/bnxt_hwrm.h b/drivers/net/bnxt/bnxt_hwrm.h
index 7c12c6d..915cf2a 100644
--- a/drivers/net/bnxt/bnxt_hwrm.h
+++ b/drivers/net/bnxt/bnxt_hwrm.h
@@ -42,6 +42,9 @@
 struct bnxt;
 struct bnxt_filter_info;
 struct bnxt_cp_ring_info;
+int bnxt_hwrm_cfa_l2_clear_rx_mask(struct bnxt *bp,
+  struct bnxt_vnic_info *vnic);
+int bnxt_hwrm_cfa_l2_set_rx_mask(struct bnxt *bp, struct bnxt_vnic_info *vnic);
 int bnxt_hwrm_clear_filter(struct bnxt *bp,
   struct bnxt_filter_info *filter);

diff --git a/drivers/net/bnxt/hsi_struct_def_dpdk.h b/drivers/net/bnxt/hsi_struct_def_dpdk.h
index 72d4984..f8f6a3f 100644
--- a/drivers/net/bnxt/hsi_struct_def_dpdk.h
+++ b/drivers/net/bnxt/hsi_struct_def_dpdk.h
@@ -1732,6 +1732,141 @@ struct hwrm_cfa_l2_filter_free_output {
uint8_t valid;
 } __attribute__((packed));

+/* hwrm_cfa_l2_set_rx_mask */
+/* Description: This command will set rx mask of the function. */
+
+/* Input (40 bytes) */
+struct hwrm_cfa_l2_set_rx_mask_input {
+   /*
+* This value indicates what type of request this is. The format for the
+* rest of the command is determined by this field.
+*/
+   uint16_t req_type;
+
+   /*
+* This value indicates what completion ring the request will be
+* optionally completed on. If the value is -1, then no CR completion
+* will be generated. Any other value must be a valid CR ring_id value
+* for this function.
+*/
+   uint16_t cmpl_ring;
+
+   /* This value indicates the command sequence number. */
+   uint16_t seq_id;
+
+   /*
+* Target ID of this command. 0x0 - 0xFFF8 - Used for function ids
+* 0xFFF8 - 0xFFFE - Reserved for internal processors 0x - HWRM
+*/
+   uint16_t target_id;
+
+   /*
+* This is the host address where the response will be written when the
+* request is complete. This area must be 16B aligned and must be
+* cleared to zero before the request is made.
+*/
+   uint64_t resp_addr;
+
+   /* VNIC ID */
+   uint32_t vnic_id;
+
+   /* Reserved for future use. */
+   #define HWRM_CFA_L2_SET_RX_MASK_INPUT_MASK_RESERVED UINT32_C(0x1)
+   /*
+* When this bit is '1', the function is requested to accept multi-cast
+* packets specified by the multicast addr table.
+*/
+   #define HWRM_CFA_L2_SET_RX_MASK_INPUT_MASK_MCASTUINT32_C(0x2)
+   /*
+* When this

[dpdk-dev] [PATCH v4 10/39] bnxt: add Tx queue operations (nonfunctional)

2016-06-06 Thread Stephen Hurd
From: Ajit Khaparde 

Add code to create/destroy TX queues. This still requires support for creating
a TX ring in the ASIC, which will be completed in a future commit.
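
As a rough illustration of the create/destroy pairing described above,
the following sketch keeps a small table of queue pointers. It is not the
driver code: the EX_/ex_ names are hypothetical, and calloc()/free() stand
in for the DPDK allocator so that the example builds on its own.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define EX_MAX_TXQ 4    /* hypothetical table size */

struct ex_tx_queue {
    uint16_t queue_id;
    uint16_t nb_desc;
};

static struct ex_tx_queue *ex_tx_queues[EX_MAX_TXQ];

/* Release one queue slot; safe to call on an empty slot. */
static void ex_tx_queue_release(uint16_t idx)
{
    if (idx >= EX_MAX_TXQ)
        return;
    free(ex_tx_queues[idx]);
    ex_tx_queues[idx] = NULL;
}

/* Set up one queue slot, replacing any queue already stored there. */
static int ex_tx_queue_setup(uint16_t idx, uint16_t nb_desc)
{
    struct ex_tx_queue *txq;

    if (idx >= EX_MAX_TXQ || nb_desc == 0)
        return -EINVAL;

    ex_tx_queue_release(idx);

    txq = calloc(1, sizeof(*txq));
    if (txq == NULL)
        return -ENOMEM;

    txq->queue_id = idx;
    txq->nb_desc = nb_desc;
    ex_tx_queues[idx] = txq;
    return 0;
}
```

Releasing before re-creating means a repeated setup call on the same
index does not leak the earlier allocation.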

v4:
Address review comments and fix issues pointed out by checkpatch.

Signed-off-by: Ajit Khaparde 
Reviewed-by: David Christensen 
Signed-off-by: Stephen Hurd 
---
 drivers/net/bnxt/Makefile  |   1 +
 drivers/net/bnxt/bnxt_ethdev.c |   3 +
 drivers/net/bnxt/bnxt_txq.c| 125 +
 drivers/net/bnxt/bnxt_txq.h|  75 +
 4 files changed, 204 insertions(+)
 create mode 100644 drivers/net/bnxt/bnxt_txq.c
 create mode 100644 drivers/net/bnxt/bnxt_txq.h

diff --git a/drivers/net/bnxt/Makefile b/drivers/net/bnxt/Makefile
index b7834b1..13a90b9 100644
--- a/drivers/net/bnxt/Makefile
+++ b/drivers/net/bnxt/Makefile
@@ -53,6 +53,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_ethdev.c
 SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_filter.c
 SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_hwrm.c
 SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_ring.c
+SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_txq.c
 SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_vnic.c

 #
diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index b46d2ce..77a6d92 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -41,6 +41,7 @@

 #include "bnxt.h"
 #include "bnxt_hwrm.h"
+#include "bnxt_txq.h"

 #define DRV_MODULE_NAME"bnxt"
 static const char bnxt_version[] =
@@ -185,6 +186,8 @@ static struct eth_dev_ops bnxt_dev_ops = {
.dev_infos_get = bnxt_dev_info_get_op,
.dev_close = bnxt_dev_close_op,
.dev_configure = bnxt_dev_configure_op,
+   .tx_queue_setup = bnxt_tx_queue_setup_op,
+   .tx_queue_release = bnxt_tx_queue_release_op,
 };

 static bool bnxt_vf_pciid(uint16_t id)
diff --git a/drivers/net/bnxt/bnxt_txq.c b/drivers/net/bnxt/bnxt_txq.c
new file mode 100644
index 000..cb3dd8e
--- /dev/null
+++ b/drivers/net/bnxt/bnxt_txq.c
@@ -0,0 +1,125 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) Broadcom Limited.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Broadcom Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include 
+
+#include 
+
+#include "bnxt.h"
+#include "bnxt_ring.h"
+#include "bnxt_txq.h"
+
+/*
+ * TX Queues
+ */
+
+void bnxt_free_txq_stats(struct bnxt_tx_queue *txq)
+{
+   struct bnxt_cp_ring_info *cpr = txq->cp_ring;
+
+   /* 'Unreserve' rte_memzone */
+   /* N/A */
+
+   if (cpr->hw_stats)
+       cpr->hw_stats = NULL;
+}
+
+static void bnxt_tx_queue_release_mbufs(struct bnxt_tx_queue *txq __rte_unused)
+{
+   /* TODO: Requires interaction with TX ring */
+}
+
+void bnxt_free_tx_mbufs(struct bnxt *bp)
+{
+   struct bnxt_tx_queue *txq;
+   int i;
+
+   for (i = 0; i < (int)bp->tx_nr_rings; i++) {
+       txq = bp->tx_queues[i];
+       bnxt_tx_queue_release_mbufs(txq);
+   }
+}
+
+void bnxt_tx_queue_release_op(void *tx_queue)
+{
+   struct bnxt_tx_queue *txq = (struct bnxt_tx_queue *)tx_queue;
+
+   if (txq) {
+       /* TODO: Free ring and stats here */
+       rte_free(txq);
+   }
+}
+
+int bnxt_tx_queue_setup_op(struct rte_eth_dev *eth_dev,
+  uint16_t queue_idx,
+  uint16_t nb_desc,
+  unsigned int socket_id,
+  cons

[dpdk-dev] [PATCH v4 26/39] bnxt: add HWRM stat context free function

2016-06-06 Thread Stephen Hurd
From: Ajit Khaparde 

Add function and associated structures and definitions to free
statistics context from the ASIC.
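
A minimal sketch of the "free every allocated context" walk may help
here. It assumes a sentinel id marks an unallocated context and that Rx
completion rings are visited before Tx ones. The ex_ names and the
sentinel value are hypothetical, and the firmware call is replaced by a
stub that always succeeds.

```c
#include <stdint.h>

#define EX_NA_SIGNATURE UINT32_C(0xFFFFFFFF)   /* hypothetical sentinel */

struct ex_cp_ring {
    uint32_t stats_ctx_id;
};

/* Stand-in for the firmware request; always succeeds in this sketch. */
static int ex_stat_ctx_free(struct ex_cp_ring *cpr)
{
    cpr->stats_ctx_id = EX_NA_SIGNATURE;
    return 0;
}

/*
 * Walk the Rx rings, then the Tx rings, and free every context that is
 * still allocated. Returns the number of contexts freed, or the first
 * non-zero error code from the stand-in firmware call.
 */
static int ex_free_all_stat_ctxs(struct ex_cp_ring *rx, unsigned int nr_rx,
                                 struct ex_cp_ring *tx, unsigned int nr_tx)
{
    unsigned int i;
    int freed = 0;

    for (i = 0; i < nr_rx + nr_tx; i++) {
        struct ex_cp_ring *cpr = (i >= nr_rx) ? &tx[i - nr_rx] : &rx[i];
        int rc;

        if (cpr->stats_ctx_id == EX_NA_SIGNATURE)
            continue;   /* never allocated, or already freed */

        rc = ex_stat_ctx_free(cpr);
        if (rc)
            return rc;
        freed++;
    }
    return freed;
}
```

Resetting the id to the sentinel after a successful free is what makes a
second pass over the same rings a no-op.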

v4:
Address review comments and fix issues pointed out by checkpatch.

Signed-off-by: Ajit Khaparde 
Reviewed-by: David Christensen 
Signed-off-by: Stephen Hurd 
---
 drivers/net/bnxt/bnxt_hwrm.c   | 44 ++
 drivers/net/bnxt/bnxt_hwrm.h   |  3 ++
 drivers/net/bnxt/hsi_struct_def_dpdk.h | 81 ++
 3 files changed, 128 insertions(+)

diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c
index 4c4f707..d3e77d5 100644
--- a/drivers/net/bnxt/bnxt_hwrm.c
+++ b/drivers/net/bnxt/bnxt_hwrm.c
@@ -699,6 +699,28 @@ int bnxt_hwrm_stat_ctx_alloc(struct bnxt *bp,
return rc;
 }

+int bnxt_hwrm_stat_ctx_free(struct bnxt *bp,
+   struct bnxt_cp_ring_info *cpr, unsigned int idx)
+{
+   int rc;
+   struct hwrm_stat_ctx_free_input req = {.req_type = 0 };
+   struct hwrm_stat_ctx_free_output *resp = bp->hwrm_cmd_resp_addr;
+
+   HWRM_PREP(req, STAT_CTX_FREE, -1, resp);
+
+   req.stat_ctx_id = rte_cpu_to_le_16(cpr->hw_stats_ctx_id);
+   req.seq_id = rte_cpu_to_le_16(bp->hwrm_cmd_seq++);
+
+   rc = bnxt_hwrm_send_message(bp, &req, sizeof(req));
+
+   HWRM_CHECK_RESULT;
+
+   cpr->hw_stats_ctx_id = HWRM_NA_SIGNATURE;
+   bp->grp_info[idx].fw_stats_ctx = cpr->hw_stats_ctx_id;
+
+   return rc;
+}
+
 int bnxt_hwrm_vnic_alloc(struct bnxt *bp, struct bnxt_vnic_info *vnic)
 {
int rc = 0, i, j;
@@ -875,6 +897,28 @@ int bnxt_clear_all_hwrm_stat_ctxs(struct bnxt *bp)
return 0;
 }

+int bnxt_free_all_hwrm_stat_ctxs(struct bnxt *bp)
+{
+   int rc;
+   unsigned int i;
+   struct bnxt_cp_ring_info *cpr;
+
+   for (i = 0; i < bp->rx_cp_nr_rings + bp->tx_cp_nr_rings; i++) {
+       unsigned int idx = i + 1;
+
+       if (i >= bp->rx_cp_nr_rings)
+           cpr = bp->tx_queues[i - bp->rx_cp_nr_rings]->cp_ring;
+       else
+           cpr = bp->rx_queues[i]->cp_ring;
+       if (cpr->hw_stats_ctx_id != HWRM_NA_SIGNATURE) {
+           rc = bnxt_hwrm_stat_ctx_free(bp, cpr, idx);
+           if (rc)
+               return rc;
+       }
+   }
+   return 0;
+}
+
 int bnxt_alloc_all_hwrm_stat_ctxs(struct bnxt *bp)
 {
unsigned int i;
diff --git a/drivers/net/bnxt/bnxt_hwrm.h b/drivers/net/bnxt/bnxt_hwrm.h
index fb088cf..5665762 100644
--- a/drivers/net/bnxt/bnxt_hwrm.h
+++ b/drivers/net/bnxt/bnxt_hwrm.h
@@ -70,6 +70,8 @@ int bnxt_hwrm_ring_grp_free(struct bnxt *bp, unsigned int idx);
 int bnxt_hwrm_stat_clear(struct bnxt *bp, struct bnxt_cp_ring_info *cpr);
 int bnxt_hwrm_stat_ctx_alloc(struct bnxt *bp,
 struct bnxt_cp_ring_info *cpr, unsigned int idx);
+int bnxt_hwrm_stat_ctx_free(struct bnxt *bp,
+   struct bnxt_cp_ring_info *cpr, unsigned int idx);

 int bnxt_hwrm_ver_get(struct bnxt *bp);

@@ -83,6 +85,7 @@ int bnxt_hwrm_vnic_rss_cfg(struct bnxt *bp,

 int bnxt_alloc_all_hwrm_stat_ctxs(struct bnxt *bp);
 int bnxt_clear_all_hwrm_stat_ctxs(struct bnxt *bp);
+int bnxt_free_all_hwrm_stat_ctxs(struct bnxt *bp);
 int bnxt_free_all_hwrm_ring_grps(struct bnxt *bp);
 int bnxt_alloc_all_hwrm_ring_grps(struct bnxt *bp);
 void bnxt_free_hwrm_resources(struct bnxt *bp);
diff --git a/drivers/net/bnxt/hsi_struct_def_dpdk.h b/drivers/net/bnxt/hsi_struct_def_dpdk.h
index 4e2eb9f..d58295a 100644
--- a/drivers/net/bnxt/hsi_struct_def_dpdk.h
+++ b/drivers/net/bnxt/hsi_struct_def_dpdk.h
@@ -104,6 +104,7 @@ struct ctx_hw_stats64 {
 #define HWRM_CFA_L2_FILTER_CFG (UINT32_C(0x92))
 #define HWRM_CFA_L2_SET_RX_MASK(UINT32_C(0x93))
 #define HWRM_STAT_CTX_ALLOC(UINT32_C(0xb0))
+#define HWRM_STAT_CTX_FREE (UINT32_C(0xb1))
 #define HWRM_STAT_CTX_CLR_STATS(UINT32_C(0xb3))
 #define HWRM_EXEC_FWD_RESP (UINT32_C(0xd0))

@@ -3839,6 +3840,86 @@ struct hwrm_stat_ctx_clr_stats_output {
uint8_t valid;
 } __attribute__((packed));

+/* hwrm_stat_ctx_free */
+/* Description: This command is used to free a stat context. */
+/* Input (24 bytes) */
+
+struct hwrm_stat_ctx_free_input {
+   /*
+* This value indicates what type of request this is. The format for the
+* rest of the command is determined by this field.
+*/
+   uint16_t req_type;
+
+   /*
+* This value indicates what completion ring the request will be
+* optionally completed on. If the value is -1, then no CR completion
+* will be generated. Any other value must be a valid CR ring_id value
+* for this function.
+*/
+   uint16_t cmpl_ring;
+
+   /* This value indicates the command sequence number. */
+   uint16_t seq_id;
+
+   /*
+* Target ID of this command. 0x0 - 0xFFF8 -

[dpdk-dev] [PATCH v4 33/39] bnxt: free memory in close operation

2016-06-06 Thread Stephen Hurd
From: Ajit Khaparde 

This patch adds code to free all resources except the ones corresponding
to HWRM, which are still required to notify the HWRM that the driver is
unloaded (these are freed in uninit()).
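
The split between close and uninit can be pictured with a small state
sketch. This is only an illustration under assumed names (ex_dev,
ex_close, ex_uninit); it models the ordering constraint, not the actual
resources.

```c
#include <stdbool.h>

/* Hypothetical device state, reduced to what the ordering depends on. */
struct ex_dev {
    bool data_path_mem;   /* rings, mbufs, queue memory */
    bool hwrm_channel;    /* command channel to the firmware */
    bool fw_notified;     /* firmware was told the driver is gone */
};

/* Close: drop data-path memory but keep the firmware channel alive. */
static void ex_close(struct ex_dev *dev)
{
    dev->data_path_mem = false;
}

/* Uninit: notify the firmware first, then drop the channel itself. */
static int ex_uninit(struct ex_dev *dev)
{
    if (!dev->hwrm_channel)
        return -1;   /* nothing left to send the notification through */

    dev->fw_notified = true;
    dev->hwrm_channel = false;
    return 0;
}
```

If close released the channel as well, the later unload notification
would have no path to the firmware, which is the reason given above for
deferring that part to uninit().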

Signed-off-by: Ajit Khaparde 
Reviewed-by: David Christensen 
Signed-off-by: Stephen Hurd 
---
 drivers/net/bnxt/bnxt_ethdev.c | 18 ++
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index d3a624f..4254531 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -62,14 +62,6 @@ static struct rte_pci_id bnxt_pci_id_map[] = {
{.device_id = 0},
 };

-static void bnxt_dev_close_op(struct rte_eth_dev *eth_dev)
-{
-   struct bnxt *bp = (struct bnxt *)eth_dev->data->dev_private;
-
-   rte_free(eth_dev->data->mac_addrs);
-   bnxt_free_hwrm_resources(bp);
-}
-
 /***/

 /*
@@ -388,6 +380,16 @@ error:
return rc;
 }

+static void bnxt_dev_close_op(struct rte_eth_dev *eth_dev)
+{
+   struct bnxt *bp = (struct bnxt *)eth_dev->data->dev_private;
+
+   bnxt_free_tx_mbufs(bp);
+   bnxt_free_rx_mbufs(bp);
+   bnxt_free_mem(bp);
+   rte_free(eth_dev->data->mac_addrs);
+}
+
 /* Unload the driver, release resources */
 static void bnxt_dev_stop_op(struct rte_eth_dev *eth_dev)
 {
-- 
1.9.1



[dpdk-dev] [PATCH v4 15/39] bnxt: Code to alloc/free ring

2016-06-06 Thread Stephen Hurd
From: Ajit Khaparde 

Allocate and free the ring and information structures for the TX, RX,
and completion rings. The previous patches provided only top-level
stubs; this patch does the real allocation and freeing of the memory.
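
A minimal sketch of the two-step allocation pattern follows, assuming
calloc()/free() in place of the DPDK allocator and hypothetical ex_
names. Unlike a plain early return, the sketch frees the first object if
the second allocation fails. The ring size is rounded up to a power of
two so that "size - 1" can serve as an index mask, the same idea as
rte_align32pow2().

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

struct ex_ring {
    uint32_t ring_size;
    uint32_t ring_mask;
};

struct ex_cp_ring_info {
    struct ex_ring *ring;
};

/* Round v (>= 1) up to the next power of two. */
static uint32_t ex_align32pow2(uint32_t v)
{
    v--;
    v |= v >> 1;
    v |= v >> 2;
    v |= v >> 4;
    v |= v >> 8;
    v |= v >> 16;
    return v + 1;
}

/* Allocate the info structure, then the ring structure it points to. */
static int ex_init_cp_ring(struct ex_cp_ring_info **out, uint32_t size)
{
    struct ex_cp_ring_info *cpr;
    struct ex_ring *ring;

    cpr = calloc(1, sizeof(*cpr));
    if (cpr == NULL)
        return -ENOMEM;

    ring = calloc(1, sizeof(*ring));
    if (ring == NULL) {
        free(cpr);   /* undo the first step on failure */
        return -ENOMEM;
    }

    ring->ring_size = ex_align32pow2(size);
    ring->ring_mask = ring->ring_size - 1;
    cpr->ring = ring;
    *out = cpr;
    return 0;
}

/* Free in the reverse order of allocation; NULL is accepted. */
static void ex_free_cp_ring(struct ex_cp_ring_info *cpr)
{
    if (cpr == NULL)
        return;
    free(cpr->ring);
    free(cpr);
}
```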

v4:
- Address review comments and fix issues pointed out by checkpatch.
- Change the argument passed to bnxt_alloc_rings.
 Instead of passing bnxt_tx_ring and bnxt_rx_ring,
 shorten them to txr and rxr respectively.
- Add code to free the reserved memzone

Signed-off-by: Ajit Khaparde 
Reviewed-by: David Christensen 
Signed-off-by: Stephen Hurd 
---
 drivers/net/bnxt/bnxt_cpr.c  | 28 +++-
 drivers/net/bnxt/bnxt_cpr.h  |  2 +-
 drivers/net/bnxt/bnxt_ring.c |  4 
 drivers/net/bnxt/bnxt_ring.h |  1 +
 drivers/net/bnxt/bnxt_rxq.c  | 22 +-
 drivers/net/bnxt/bnxt_rxr.c  | 42 ++
 drivers/net/bnxt/bnxt_rxr.h  |  2 +-
 drivers/net/bnxt/bnxt_txq.c  | 28 +---
 drivers/net/bnxt/bnxt_txr.c  | 43 ++-
 drivers/net/bnxt/bnxt_txr.h  |  2 +-
 10 files changed, 129 insertions(+), 45 deletions(-)

diff --git a/drivers/net/bnxt/bnxt_cpr.c b/drivers/net/bnxt/bnxt_cpr.c
index 06a618a..98f3ca2 100644
--- a/drivers/net/bnxt/bnxt_cpr.c
+++ b/drivers/net/bnxt/bnxt_cpr.c
@@ -31,6 +31,8 @@
  *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  */

+#include 
+
 #include "bnxt.h"
 #include "bnxt_cpr.h"
 #include "bnxt_hwrm.h"
@@ -121,21 +123,37 @@ reject:
 void bnxt_free_def_cp_ring(struct bnxt *bp)
 {
struct bnxt_cp_ring_info *cpr = bp->def_cp_ring;
-   struct bnxt_ring_struct *ring = cpr->cp_ring_struct;

-   bnxt_free_ring(ring);
+   bnxt_free_ring(cpr->cp_ring_struct);
+   rte_free(cpr->cp_ring_struct);
+   rte_free(cpr);
 }

 /* For the default completion ring only */
-void bnxt_init_def_ring_struct(struct bnxt *bp)
+int bnxt_init_def_ring_struct(struct bnxt *bp, unsigned int socket_id)
 {
-   struct bnxt_cp_ring_info *cpr = bp->def_cp_ring;
-   struct bnxt_ring_struct *ring = cpr->cp_ring_struct;
+   struct bnxt_cp_ring_info *cpr;
+   struct bnxt_ring_struct *ring;

+   cpr = rte_zmalloc_socket("bnxt_cp_ring",
+sizeof(struct bnxt_cp_ring_info),
+RTE_CACHE_LINE_SIZE, socket_id);
+   if (cpr == NULL)
+   return -ENOMEM;
+   bp->def_cp_ring = cpr;
+
+   ring = rte_zmalloc_socket("bnxt_cp_ring_struct",
+ sizeof(struct bnxt_ring_struct),
+ RTE_CACHE_LINE_SIZE, socket_id);
+   if (ring == NULL)
+   return -ENOMEM;
+   cpr->cp_ring_struct = ring;
ring->bd = (void *)cpr->cp_desc_ring;
ring->bd_dma = cpr->cp_desc_mapping;
ring->ring_size = rte_align32pow2(DEFAULT_CP_RING_SIZE);
ring->ring_mask = ring->ring_size - 1;
ring->vmem_size = 0;
ring->vmem = NULL;
+
+   return 0;
 }
diff --git a/drivers/net/bnxt/bnxt_cpr.h b/drivers/net/bnxt/bnxt_cpr.h
index f104281..3e25a75 100644
--- a/drivers/net/bnxt/bnxt_cpr.h
+++ b/drivers/net/bnxt/bnxt_cpr.h
@@ -79,7 +79,7 @@ struct bnxt_cp_ring_info {

 struct bnxt;
 void bnxt_free_def_cp_ring(struct bnxt *bp);
-void bnxt_init_def_ring_struct(struct bnxt *bp);
+int bnxt_init_def_ring_struct(struct bnxt *bp, unsigned int socket_id);
 void bnxt_handle_async_event(struct bnxt *bp, struct cmpl_base *cmp);
 void bnxt_handle_fwd_req(struct bnxt *bp, struct cmpl_base *cmp);

diff --git a/drivers/net/bnxt/bnxt_ring.c b/drivers/net/bnxt/bnxt_ring.c
index fab0e27..675c5b8 100644
--- a/drivers/net/bnxt/bnxt_ring.c
+++ b/drivers/net/bnxt/bnxt_ring.c
@@ -51,6 +51,7 @@ void bnxt_free_ring(struct bnxt_ring_struct *ring)
memset((char *)*ring->vmem, 0, ring->vmem_size);
*ring->vmem = NULL;
}
+   rte_memzone_free((const struct rte_memzone *)ring->mem_zone);
 }

 /*
@@ -134,6 +135,7 @@ int bnxt_alloc_rings(struct bnxt *bp, uint16_t qidx,
tx_ring_info->tx_desc_ring = (struct tx_bd_long *)tx_ring->bd;
tx_ring->bd_dma = mz->phys_addr + tx_ring_start;
tx_ring_info->tx_desc_mapping = tx_ring->bd_dma;
+   tx_ring->mem_zone = (const void *)mz;

if (!tx_ring->bd)
return -ENOMEM;
@@ -153,6 +155,7 @@ int bnxt_alloc_rings(struct bnxt *bp, uint16_t qidx,
(struct rx_prod_pkt_bd *)rx_ring->bd;
rx_ring->bd_dma = mz->phys_addr + rx_ring_start;
rx_ring_info->rx_desc_mapping = rx_ring->bd_dma;
+   rx_ring->mem_zone = (const void *)mz;

if (!rx_ring->bd)
return -ENOMEM;
@@ -168,6 +171,7 @@ int bnxt_alloc_rings(struct bnxt *bp, uint16_t qidx,
cp_ring->bd_dma = mz->phys_addr + cp_ring_start;
cp_rin

[dpdk-dev] [PATCH v4 18/39] bnxt: add HWRM vnic free function

2016-06-06 Thread Stephen Hurd
From: Ajit Khaparde 

Frees a vnic allocated by vnic_alloc in the previous patch.
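
The free path can be made safe to call twice by treating a sentinel id as
"nothing to free". A minimal sketch of that idea, with hypothetical ex_
names, a hypothetical sentinel value, and a counting stub in place of the
firmware request:

```c
#include <stdint.h>

#define EX_INVALID_ID UINT16_C(0xFFFF)   /* hypothetical sentinel */

struct ex_vnic {
    uint16_t fw_vnic_id;
};

static unsigned int ex_fw_calls;   /* counts stand-in firmware requests */

/* Stand-in for the firmware request; always succeeds in this sketch. */
static int ex_fw_vnic_free(uint16_t id)
{
    (void)id;
    ex_fw_calls++;
    return 0;
}

/* Free a VNIC once; later calls see the sentinel and do nothing. */
static int ex_vnic_free(struct ex_vnic *vnic)
{
    int rc;

    if (vnic->fw_vnic_id == EX_INVALID_ID)
        return 0;

    rc = ex_fw_vnic_free(vnic->fw_vnic_id);
    if (rc)
        return rc;

    vnic->fw_vnic_id = EX_INVALID_ID;
    return 0;
}
```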

Signed-off-by: Ajit Khaparde 
Reviewed-by: David Christensen 
Signed-off-by: Stephen Hurd 
---
 drivers/net/bnxt/bnxt_hwrm.c   | 21 +
 drivers/net/bnxt/bnxt_hwrm.h   |  1 +
 drivers/net/bnxt/hsi_struct_def_dpdk.h | 82 ++
 3 files changed, 104 insertions(+)

diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c
index 77afb81..02fb0c4 100644
--- a/drivers/net/bnxt/bnxt_hwrm.c
+++ b/drivers/net/bnxt/bnxt_hwrm.c
@@ -510,6 +510,27 @@ int bnxt_hwrm_vnic_alloc(struct bnxt *bp, struct bnxt_vnic_info *vnic)
return rc;
 }

+int bnxt_hwrm_vnic_free(struct bnxt *bp, struct bnxt_vnic_info *vnic)
+{
+   int rc = 0;
+   struct hwrm_vnic_free_input req = {.req_type = 0 };
+   struct hwrm_vnic_free_output *resp = bp->hwrm_cmd_resp_addr;
+
+   if (vnic->fw_vnic_id == INVALID_HW_RING_ID)
+       return rc;
+
+   HWRM_PREP(req, VNIC_FREE, -1, resp);
+
+   req.vnic_id = rte_cpu_to_le_16(vnic->fw_vnic_id);
+
+   rc = bnxt_hwrm_send_message(bp, &req, sizeof(req));
+
+   HWRM_CHECK_RESULT;
+
+   vnic->fw_vnic_id = INVALID_HW_RING_ID;
+   return rc;
+}
+
 /*
  * HWRM utility functions
  */
diff --git a/drivers/net/bnxt/bnxt_hwrm.h b/drivers/net/bnxt/bnxt_hwrm.h
index 62dc801..887ad2d 100644
--- a/drivers/net/bnxt/bnxt_hwrm.h
+++ b/drivers/net/bnxt/bnxt_hwrm.h
@@ -59,6 +59,7 @@ int bnxt_hwrm_stat_clear(struct bnxt *bp, struct bnxt_cp_ring_info *cpr);

 int bnxt_hwrm_ver_get(struct bnxt *bp);

+int bnxt_hwrm_vnic_free(struct bnxt *bp, struct bnxt_vnic_info *vnic);
 int bnxt_hwrm_vnic_alloc(struct bnxt *bp, struct bnxt_vnic_info *vnic);

 int bnxt_clear_all_hwrm_stat_ctxs(struct bnxt *bp);
diff --git a/drivers/net/bnxt/hsi_struct_def_dpdk.h b/drivers/net/bnxt/hsi_struct_def_dpdk.h
index eedd368..0771897 100644
--- a/drivers/net/bnxt/hsi_struct_def_dpdk.h
+++ b/drivers/net/bnxt/hsi_struct_def_dpdk.h
@@ -90,6 +90,7 @@ struct ctx_hw_stats64 {
 #define HWRM_PORT_PHY_CFG  (UINT32_C(0x20))
 #define HWRM_QUEUE_QPORTCFG(UINT32_C(0x30))
 #define HWRM_VNIC_ALLOC(UINT32_C(0x40))
+#define HWRM_VNIC_FREE (UINT32_C(0x41))
 #define HWRM_CFA_L2_FILTER_ALLOC   (UINT32_C(0x90))
 #define HWRM_CFA_L2_FILTER_FREE(UINT32_C(0x91))
 #define HWRM_CFA_L2_FILTER_CFG (UINT32_C(0x92))
@@ -3218,6 +3219,87 @@ struct hwrm_vnic_alloc_output {
uint8_t valid;
 } __attribute__((packed));

+/* hwrm_vnic_free */
+/*
+ * Description: Free a VNIC resource. Idle any resources associated with the
+ * VNIC as well as the VNIC. Reset and release all resources associated with the
+ * VNIC.
+ */
+
+/* Input (24 bytes) */
+struct hwrm_vnic_free_input {
+   /*
+* This value indicates what type of request this is. The format for the
+* rest of the command is determined by this field.
+*/
+   uint16_t req_type;
+
+   /*
+* This value indicates what completion ring the request will be
+* optionally completed on. If the value is -1, then no CR completion
+* will be generated. Any other value must be a valid CR ring_id value
+* for this function.
+*/
+   uint16_t cmpl_ring;
+
+   /* This value indicates the command sequence number. */
+   uint16_t seq_id;
+
+   /*
+* Target ID of this command. 0x0 - 0xFFF8 - Used for function ids
+* 0xFFF8 - 0xFFFE - Reserved for internal processors 0x - HWRM
+*/
+   uint16_t target_id;
+
+   /*
+* This is the host address where the response will be written when the
+* request is complete. This area must be 16B aligned and must be
+* cleared to zero before the request is made.
+*/
+   uint64_t resp_addr;
+
+   /* Logical vnic ID */
+   uint32_t vnic_id;
+
+   uint32_t unused_0;
+} __attribute__((packed));
+
+/* Output (16 bytes) */
+struct hwrm_vnic_free_output {
+   /*
+* Pass/Fail or error type Note: receiver to verify the in parameters,
+* and fail the call with an error when appropriate
+*/
+   uint16_t error_code;
+
+   /* This field returns the type of original request. */
+   uint16_t req_type;
+
+   /* This field provides original sequence number of the command. */
+   uint16_t seq_id;
+
+   /*
+* This field is the length of the response in bytes. The last byte of
+* the response is a valid flag that will read as '1' when the command
+* has been completely written to memory.
+*/
+   uint16_t resp_len;
+
+   uint32_t unused_0;
+   uint8_t unused_1;
+   uint8_t unused_2;
+   uint8_t unused_3;
+
+   /*
+* This field is used in Output records to indicate that the output is
+* completely written to RAM. This field should be read as '1' to

[dpdk-dev] [PATCH v4 23/39] bnxt: add HWRM stats context allocation

2016-06-06 Thread Stephen Hurd
From: Ajit Khaparde 

Add HWRM API code to allocate a statistics context in the ASIC.
This API will be called by the previously submitted "add statistics
operations" patch.
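
The allocation loop in this patch walks one flat index over all
completion rings, Rx first and then Tx, and passes "i + 1" as the group
index. A small sketch of that index mapping, using hypothetical ex_
names:

```c
/* Where a flat completion-ring index lands (illustration only). */
struct ex_slot {
    int is_tx;              /* 0 = Rx queue, 1 = Tx queue */
    unsigned int queue;     /* index within the Rx or Tx queue array */
    unsigned int grp_idx;   /* index passed along as "idx" */
};

/* Map flat index i to a queue, given nr_rx Rx completion rings. */
static struct ex_slot ex_map_index(unsigned int i, unsigned int nr_rx)
{
    struct ex_slot s;

    s.is_tx = (i >= nr_rx);
    s.queue = s.is_tx ? i - nr_rx : i;
    s.grp_idx = i + 1;
    return s;
}
```

With two Rx rings, flat index 2 is therefore the first Tx queue and uses
group index 3.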

v4:
Address review comments and fix issues pointed out by checkpatch.

Signed-off-by: Ajit Khaparde 
Reviewed-by: David Christensen 
Signed-off-by: Stephen Hurd 
---
 drivers/net/bnxt/bnxt_hwrm.c   | 52 
 drivers/net/bnxt/bnxt_hwrm.h   |  3 ++
 drivers/net/bnxt/hsi_struct_def_dpdk.h | 89 ++
 3 files changed, 144 insertions(+)

diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c
index 7f39db0..5d0fbf1 100644
--- a/drivers/net/bnxt/bnxt_hwrm.c
+++ b/drivers/net/bnxt/bnxt_hwrm.c
@@ -525,6 +525,31 @@ int bnxt_hwrm_stat_clear(struct bnxt *bp, struct bnxt_cp_ring_info *cpr)
return rc;
 }

+int bnxt_hwrm_stat_ctx_alloc(struct bnxt *bp,
+struct bnxt_cp_ring_info *cpr, unsigned int idx)
+{
+   int rc;
+   struct hwrm_stat_ctx_alloc_input req = {.req_type = 0 };
+   struct hwrm_stat_ctx_alloc_output *resp = bp->hwrm_cmd_resp_addr;
+
+   HWRM_PREP(req, STAT_CTX_ALLOC, -1, resp);
+
+   req.update_period_ms = rte_cpu_to_le_32(1000);
+
+   req.seq_id = rte_cpu_to_le_16(bp->hwrm_cmd_seq++);
+   req.stats_dma_addr =
+   rte_cpu_to_le_64(cpr->hw_stats_map);
+
+   rc = bnxt_hwrm_send_message(bp, &req, sizeof(req));
+
+   HWRM_CHECK_RESULT;
+
+   cpr->hw_stats_ctx_id = rte_le_to_cpu_16(resp->stat_ctx_id);
+   bp->grp_info[idx].fw_stats_ctx = cpr->hw_stats_ctx_id;
+
+   return rc;
+}
+
 int bnxt_hwrm_vnic_alloc(struct bnxt *bp, struct bnxt_vnic_info *vnic)
 {
int rc = 0, i, j;
@@ -701,6 +726,33 @@ int bnxt_clear_all_hwrm_stat_ctxs(struct bnxt *bp)
return 0;
 }

+int bnxt_alloc_all_hwrm_stat_ctxs(struct bnxt *bp)
+{
+   unsigned int i;
+   int rc = 0;
+
+   for (i = 0; i < bp->rx_cp_nr_rings + bp->tx_cp_nr_rings; i++) {
+       struct bnxt_tx_queue *txq;
+       struct bnxt_rx_queue *rxq;
+       struct bnxt_cp_ring_info *cpr;
+       unsigned int idx = i + 1;
+
+       if (i >= bp->rx_cp_nr_rings) {
+           txq = bp->tx_queues[i - bp->rx_cp_nr_rings];
+           cpr = txq->cp_ring;
+       } else {
+           rxq = bp->rx_queues[i];
+           cpr = rxq->cp_ring;
+       }
+
+       rc = bnxt_hwrm_stat_ctx_alloc(bp, cpr, idx);
+
+       if (rc)
+           return rc;
+   }
+   return rc;
+}
+
 void bnxt_free_hwrm_resources(struct bnxt *bp)
 {
/* Release memzone */
diff --git a/drivers/net/bnxt/bnxt_hwrm.h b/drivers/net/bnxt/bnxt_hwrm.h
index 915cf2a..f41361e 100644
--- a/drivers/net/bnxt/bnxt_hwrm.h
+++ b/drivers/net/bnxt/bnxt_hwrm.h
@@ -59,6 +59,8 @@ int bnxt_hwrm_func_driver_unregister(struct bnxt *bp, uint32_t flags);
 int bnxt_hwrm_queue_qportcfg(struct bnxt *bp);

 int bnxt_hwrm_stat_clear(struct bnxt *bp, struct bnxt_cp_ring_info *cpr);
+int bnxt_hwrm_stat_ctx_alloc(struct bnxt *bp,
+struct bnxt_cp_ring_info *cpr, unsigned int idx);

 int bnxt_hwrm_ver_get(struct bnxt *bp);

@@ -70,6 +72,7 @@ int bnxt_hwrm_vnic_free(struct bnxt *bp, struct bnxt_vnic_info *vnic);
 int bnxt_hwrm_vnic_rss_cfg(struct bnxt *bp,
   struct bnxt_vnic_info *vnic);

+int bnxt_alloc_all_hwrm_stat_ctxs(struct bnxt *bp);
 int bnxt_clear_all_hwrm_stat_ctxs(struct bnxt *bp);
 void bnxt_free_hwrm_resources(struct bnxt *bp);
 int bnxt_alloc_hwrm_resources(struct bnxt *bp);
diff --git a/drivers/net/bnxt/hsi_struct_def_dpdk.h b/drivers/net/bnxt/hsi_struct_def_dpdk.h
index f8f6a3f..28362c9 100644
--- a/drivers/net/bnxt/hsi_struct_def_dpdk.h
+++ b/drivers/net/bnxt/hsi_struct_def_dpdk.h
@@ -99,6 +99,7 @@ struct ctx_hw_stats64 {
 #define HWRM_CFA_L2_FILTER_FREE(UINT32_C(0x91))
 #define HWRM_CFA_L2_FILTER_CFG (UINT32_C(0x92))
 #define HWRM_CFA_L2_SET_RX_MASK(UINT32_C(0x93))
+#define HWRM_STAT_CTX_ALLOC(UINT32_C(0xb0))
 #define HWRM_STAT_CTX_CLR_STATS(UINT32_C(0xb3))
 #define HWRM_EXEC_FWD_RESP (UINT32_C(0xd0))

@@ -3183,6 +3184,94 @@ struct hwrm_queue_qportcfg_input {
uint16_t unused_0;
 } __attribute__((packed));

+/* hwrm_stat_ctx_alloc */
+/*
+ * Description: This command allocates and does basic preparation for a stat
+ * context.
+ */
+
+/* Input (32 bytes) */
+struct hwrm_stat_ctx_alloc_input {
+   /*
+* This value indicates what type of request this is. The format for the
+* rest of the command is determined by this field.
+*/
+   uint16_t req_type;
+
+   /*
+* This value indicates what completion ring the request will be
+* optionally completed on. If the value is -1, then no CR completion
+* will be generated. Any other

[dpdk-dev] [PATCH v4 19/39] bnxt: add HWRM vnic configure function

2016-06-06 Thread Stephen Hurd
From: Ajit Khaparde 

A VNIC represents a virtual interface. It is a resource in the RX path
of the chip and is used to set up various target actions, such as RSS
and MAC filtering, for the physical function in use.

This patch configures the properties and actions of the VNIC
allocated by the vnic_alloc function from the previous patch.
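
One of the configured properties is the MRU, which the patch derives from
the MTU by adding the Ethernet header, CRC and one VLAN tag. A worked
sketch of that arithmetic, assuming the usual sizes of 14, 4 and 4 bytes
(the EX_ constants are local stand-ins, not the driver's macros):

```c
#include <stdint.h>

#define EX_ETHER_HDR_LEN 14u   /* dst MAC + src MAC + EtherType */
#define EX_ETHER_CRC_LEN 4u    /* frame check sequence */
#define EX_VLAN_TAG_SIZE 4u    /* one 802.1Q tag (assumed size) */

/* Largest frame the VNIC must accept for a given MTU. */
static uint16_t ex_mru_from_mtu(uint16_t mtu)
{
    return (uint16_t)(mtu + EX_ETHER_HDR_LEN + EX_ETHER_CRC_LEN +
                      EX_VLAN_TAG_SIZE);
}
```

For the default MTU of 1500 this gives 1500 + 14 + 4 + 4 = 1522 bytes.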

Signed-off-by: Ajit Khaparde 
Reviewed-by: David Christensen 
Signed-off-by: Stephen Hurd 
---
 drivers/net/bnxt/bnxt_hwrm.c   |  34 
 drivers/net/bnxt/bnxt_hwrm.h   |   3 +-
 drivers/net/bnxt/hsi_struct_def_dpdk.h | 155 +
 3 files changed, 191 insertions(+), 1 deletion(-)

diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c
index 02fb0c4..be4020f 100644
--- a/drivers/net/bnxt/bnxt_hwrm.c
+++ b/drivers/net/bnxt/bnxt_hwrm.c
@@ -510,6 +510,40 @@ int bnxt_hwrm_vnic_alloc(struct bnxt *bp, struct bnxt_vnic_info *vnic)
return rc;
 }

+int bnxt_hwrm_vnic_cfg(struct bnxt *bp, struct bnxt_vnic_info *vnic)
+{
+   int rc = 0;
+   struct hwrm_vnic_cfg_input req = {.req_type = 0 };
+   struct hwrm_vnic_cfg_output *resp = bp->hwrm_cmd_resp_addr;
+
+   HWRM_PREP(req, VNIC_CFG, -1, resp);
+
+   /* Only RSS support for now TBD: COS & LB */
+   req.enables =
+   rte_cpu_to_le_32(HWRM_VNIC_CFG_INPUT_ENABLES_DFLT_RING_GRP |
+HWRM_VNIC_CFG_INPUT_ENABLES_RSS_RULE |
+HWRM_VNIC_CFG_INPUT_ENABLES_MRU);
+   req.vnic_id = rte_cpu_to_le_16(vnic->fw_vnic_id);
+   req.dflt_ring_grp =
+   rte_cpu_to_le_16(bp->grp_info[vnic->start_grp_id].fw_grp_id);
+   req.rss_rule = rte_cpu_to_le_16(vnic->fw_rss_cos_lb_ctx);
+   req.cos_rule = rte_cpu_to_le_16(0x);
+   req.lb_rule = rte_cpu_to_le_16(0x);
+   req.mru = rte_cpu_to_le_16(bp->eth_dev->data->mtu + ETHER_HDR_LEN +
+  ETHER_CRC_LEN + VLAN_TAG_SIZE);
+   if (vnic->func_default)
+       req.flags = 1;
+   if (vnic->vlan_strip)
+       req.flags |=
+           rte_cpu_to_le_32(HWRM_VNIC_CFG_INPUT_FLAGS_VLAN_STRIP_MODE);
+
+   rc = bnxt_hwrm_send_message(bp, &req, sizeof(req));
+
+   HWRM_CHECK_RESULT;
+
+   return rc;
+}
+
 int bnxt_hwrm_vnic_free(struct bnxt *bp, struct bnxt_vnic_info *vnic)
 {
int rc = 0;
diff --git a/drivers/net/bnxt/bnxt_hwrm.h b/drivers/net/bnxt/bnxt_hwrm.h
index 887ad2d..b5cf090 100644
--- a/drivers/net/bnxt/bnxt_hwrm.h
+++ b/drivers/net/bnxt/bnxt_hwrm.h
@@ -59,8 +59,9 @@ int bnxt_hwrm_stat_clear(struct bnxt *bp, struct bnxt_cp_ring_info *cpr);

 int bnxt_hwrm_ver_get(struct bnxt *bp);

-int bnxt_hwrm_vnic_free(struct bnxt *bp, struct bnxt_vnic_info *vnic);
 int bnxt_hwrm_vnic_alloc(struct bnxt *bp, struct bnxt_vnic_info *vnic);
+int bnxt_hwrm_vnic_cfg(struct bnxt *bp, struct bnxt_vnic_info *vnic);
+int bnxt_hwrm_vnic_free(struct bnxt *bp, struct bnxt_vnic_info *vnic);

 int bnxt_clear_all_hwrm_stat_ctxs(struct bnxt *bp);
 void bnxt_free_hwrm_resources(struct bnxt *bp);
diff --git a/drivers/net/bnxt/hsi_struct_def_dpdk.h b/drivers/net/bnxt/hsi_struct_def_dpdk.h
index 0771897..ef0b37a 100644
--- a/drivers/net/bnxt/hsi_struct_def_dpdk.h
+++ b/drivers/net/bnxt/hsi_struct_def_dpdk.h
@@ -91,6 +91,7 @@ struct ctx_hw_stats64 {
 #define HWRM_QUEUE_QPORTCFG(UINT32_C(0x30))
 #define HWRM_VNIC_ALLOC(UINT32_C(0x40))
 #define HWRM_VNIC_FREE (UINT32_C(0x41))
+#define HWRM_VNIC_CFG  (UINT32_C(0x42))
 #define HWRM_CFA_L2_FILTER_ALLOC   (UINT32_C(0x90))
 #define HWRM_CFA_L2_FILTER_FREE(UINT32_C(0x91))
 #define HWRM_CFA_L2_FILTER_CFG (UINT32_C(0x92))
@@ -3219,6 +3220,160 @@ struct hwrm_vnic_alloc_output {
uint8_t valid;
 } __attribute__((packed));

+/* hwrm_vnic_cfg */
+/* Description: Configure the RX VNIC structure. */
+
+/* Input (40 bytes) */
+struct hwrm_vnic_cfg_input {
+   /*
+* This value indicates what type of request this is. The format for the
+* rest of the command is determined by this field.
+*/
+   uint16_t req_type;
+
+   /*
+* This value indicates what completion ring the request will be
+* optionally completed on. If the value is -1, then no CR completion
+* will be generated. Any other value must be a valid CR ring_id value
+* for this function.
+*/
+   uint16_t cmpl_ring;
+
+   /* This value indicates the command sequence number. */
+   uint16_t seq_id;
+
+   /*
+* Target ID of this command. 0x0 - 0xFFF8 - Used for function ids
+* 0xFFF8 - 0xFFFE - Reserved for internal processors 0x - HWRM
+*/
+   uint16_t target_id;
+
+   /*
+* This is the host address where the response will be written when the
+* request is complete. This area must be 16B aligned and must be
+* cleared to zero befor

[dpdk-dev] [PATCH v4 24/39] bnxt: add HWRM ring alloc/free functions

2016-06-06 Thread Stephen Hurd
From: Ajit Khaparde 

Add HWRM API calls to allocate and free TX, RX and Completion rings
in the hardware along with the associated structs and definitions.

As mentioned earlier, a completion ring is used by the Ethernet
controller to provide the status of transmitted and received packets and to
report errors and status changes to the host software.
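
To make the role of a completion ring concrete, here is a minimal sketch of
how host software might poll one. It is an illustration only: the entry
layout (demo_cmpl), the phase-bit convention and every demo_* name are
assumptions made for this example and are not the bnxt descriptor format
or the code in this patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical completion entry; NOT the bnxt hardware layout. */
struct demo_cmpl {
	uint16_t type;  /* e.g. TX done, RX packet, error, status change */
	uint16_t valid; /* phase bit written by the device */
};

struct demo_cmpl_ring {
	struct demo_cmpl *ring;
	uint32_t size; /* number of entries, must be a power of two */
	uint32_t cons; /* free-running consumer index */
};

/* Assumed convention: the device writes 1 on the first pass, 0 on the
 * second, and keeps toggling on every wrap of the ring. */
static int demo_expected_phase(const struct demo_cmpl_ring *r)
{
	return !((r->cons / r->size) & 1);
}

/* Consume up to max completions; returns how many were consumed.
 * A real driver would also need a read barrier after checking valid. */
size_t demo_poll(struct demo_cmpl_ring *r, uint16_t *types, size_t max)
{
	size_t n = 0;

	while (n < max) {
		struct demo_cmpl *c = &r->ring[r->cons & (r->size - 1)];

		if ((c->valid & 1) != demo_expected_phase(r))
			break; /* entry not yet written by the device */
		types[n++] = c->type;
		r->cons++;
	}
	return n;
}
```

The phase bit lets the consumer tell fresh entries from stale ones without
the device having to clear the ring after each pass.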

v4:
Address review comments.

Signed-off-by: Ajit Khaparde 
Reviewed-by: David Christensen 
Signed-off-by: Stephen Hurd 
---
 drivers/net/bnxt/bnxt_hwrm.c   | 108 
 drivers/net/bnxt/bnxt_hwrm.h   |   7 +
 drivers/net/bnxt/hsi_struct_def_dpdk.h | 305 +
 3 files changed, 420 insertions(+)

diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c
index 5d0fbf1..6152856 100644
--- a/drivers/net/bnxt/bnxt_hwrm.c
+++ b/drivers/net/bnxt/bnxt_hwrm.c
@@ -504,6 +504,114 @@ int bnxt_hwrm_queue_qportcfg(struct bnxt *bp)
return rc;
 }

+int bnxt_hwrm_ring_alloc(struct bnxt *bp,
+struct bnxt_ring_struct *ring,
+uint32_t ring_type, uint32_t map_index,
+uint32_t stats_ctx_id)
+{
+   int rc = 0;
+   struct hwrm_ring_alloc_input req = {.req_type = 0 };
+   struct hwrm_ring_alloc_output *resp = bp->hwrm_cmd_resp_addr;
+
+   HWRM_PREP(req, RING_ALLOC, -1, resp);
+
+   req.enables = rte_cpu_to_le_32(0);
+
+   req.page_tbl_addr = rte_cpu_to_le_64(ring->bd_dma);
+   req.fbo = rte_cpu_to_le_32(0);
+   /* Association of ring index with doorbell index */
+   req.logical_id = rte_cpu_to_le_16(map_index);
+
+   switch (ring_type) {
+   case HWRM_RING_ALLOC_INPUT_RING_TYPE_TX:
+   req.queue_id = bp->cos_queue[0].id;
+   case HWRM_RING_ALLOC_INPUT_RING_TYPE_RX:
+   req.ring_type = ring_type;
+   req.cmpl_ring_id =
+   rte_cpu_to_le_16(bp->grp_info[map_index].cp_fw_ring_id);
+   req.length = rte_cpu_to_le_32(ring->ring_size);
+   req.stat_ctx_id = rte_cpu_to_le_16(stats_ctx_id);
+   req.enables = rte_cpu_to_le_32(rte_le_to_cpu_32(req.enables) |
+   HWRM_RING_ALLOC_INPUT_ENABLES_STAT_CTX_ID_VALID);
+   break;
+   case HWRM_RING_ALLOC_INPUT_RING_TYPE_CMPL:
+   req.ring_type = ring_type;
+   req.int_mode = HWRM_RING_ALLOC_INPUT_INT_MODE_POLL;
+   req.length = rte_cpu_to_le_32(ring->ring_size);
+   break;
+   default:
+   RTE_LOG(ERR, PMD, "hwrm alloc invalid ring type %d\n",
+   ring_type);
+   return -1;
+   }
+
+   rc = bnxt_hwrm_send_message(bp, &req, sizeof(req));
+
+   if (rc || resp->error_code) {
+   if (rc == 0 && resp->error_code)
+   rc = rte_le_to_cpu_16(resp->error_code);
+   switch (ring_type) {
+   case HWRM_RING_ALLOC_INPUT_RING_TYPE_CMPL:
+   RTE_LOG(ERR, PMD,
+   "hwrm_ring_alloc cp failed. rc:%d\n", rc);
+   return rc;
+   case HWRM_RING_ALLOC_INPUT_RING_TYPE_RX:
+   RTE_LOG(ERR, PMD,
+   "hwrm_ring_alloc rx failed. rc:%d\n", rc);
+   return rc;
+   case HWRM_RING_ALLOC_INPUT_RING_TYPE_TX:
+   RTE_LOG(ERR, PMD,
+   "hwrm_ring_alloc tx failed. rc:%d\n", rc);
+   return rc;
+   default:
+   RTE_LOG(ERR, PMD, "Invalid ring. rc:%d\n", rc);
+   return rc;
+   }
+   }
+
+   ring->fw_ring_id = rte_le_to_cpu_16(resp->ring_id);
+   return rc;
+}
+
+int bnxt_hwrm_ring_free(struct bnxt *bp,
+   struct bnxt_ring_struct *ring, uint32_t ring_type)
+{
+   int rc;
+   struct hwrm_ring_free_input req = {.req_type = 0 };
+   struct hwrm_ring_free_output *resp = bp->hwrm_cmd_resp_addr;
+
+   HWRM_PREP(req, RING_FREE, -1, resp);
+
+   req.ring_type = ring_type;
+   req.ring_id = rte_cpu_to_le_16(ring->fw_ring_id);
+
+   rc = bnxt_hwrm_send_message(bp, &req, sizeof(req));
+
+   if (rc || resp->error_code) {
+   if (rc == 0 && resp->error_code)
+   rc = rte_le_to_cpu_16(resp->error_code);
+
+   switch (ring_type) {
+   case HWRM_RING_FREE_INPUT_RING_TYPE_CMPL:
+   RTE_LOG(ERR, PMD, "hwrm_ring_free cp failed. rc:%d\n",
+   rc);
+   return rc;
+   case HWRM_RING_FREE_INPUT_RING_TYPE_RX:
+   RTE_LOG(ERR, PMD, "hwrm_ring_free rx failed. rc:%d\n",
+   rc);
+   return rc;
+   case HWRM_RING_FREE_INPUT_RING_TYPE_TX:
+   RTE_LOG(ERR, PMD, "hwrm_ring_free 

[dpdk-dev] [PATCH v4 14/39] bnxt: initial Rx code implementation

2016-06-06 Thread Stephen Hurd
From: Ajit Khaparde 

Initial implementation of rx_pkt_burst
Add code to allocate rings to bnxt_ring.c

v4:
Use rte_mbuf_raw_alloc instead of the now deprecated
__rte_mbuf_raw_alloc and fix issues pointed out by checkpatch.

Signed-off-by: Ajit Khaparde 
Reviewed-by: David Christensen 
Signed-off-by: Stephen Hurd 
---
 drivers/net/bnxt/Makefile  |   1 +
 drivers/net/bnxt/bnxt_ethdev.c |   3 +-
 drivers/net/bnxt/bnxt_ring.c   |  60 ++---
 drivers/net/bnxt/bnxt_rxq.c|  34 ++-
 drivers/net/bnxt/bnxt_rxr.c| 341 
 drivers/net/bnxt/bnxt_rxr.h|  62 +
 drivers/net/bnxt/hsi_struct_def_dpdk.h | 474 +
 7 files changed, 935 insertions(+), 40 deletions(-)
 create mode 100644 drivers/net/bnxt/bnxt_rxr.c
 create mode 100644 drivers/net/bnxt/bnxt_rxr.h

diff --git a/drivers/net/bnxt/Makefile b/drivers/net/bnxt/Makefile
index 0785681..4d35412 100644
--- a/drivers/net/bnxt/Makefile
+++ b/drivers/net/bnxt/Makefile
@@ -54,6 +54,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_filter.c
 SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_hwrm.c
 SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_ring.c
 SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_rxq.c
+SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_rxr.c
 SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_stats.c
 SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_txq.c
 SRCS-$(CONFIG_RTE_LIBRTE_BNXT_PMD) += bnxt_txr.c
diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index 4ace543..6888363 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -42,6 +42,7 @@
 #include "bnxt.h"
 #include "bnxt_hwrm.h"
 #include "bnxt_rxq.h"
+#include "bnxt_rxr.h"
 #include "bnxt_stats.h"
 #include "bnxt_txq.h"
 #include "bnxt_txr.h"
@@ -269,7 +270,7 @@ bnxt_dev_init(struct rte_eth_dev *eth_dev)
goto error;
}
eth_dev->dev_ops = &bnxt_dev_ops;
-   /* eth_dev->rx_pkt_burst = &bnxt_recv_pkts; */
+   eth_dev->rx_pkt_burst = &bnxt_recv_pkts;
eth_dev->tx_pkt_burst = &bnxt_xmit_pkts;

rc = bnxt_alloc_hwrm_resources(bp);
diff --git a/drivers/net/bnxt/bnxt_ring.c b/drivers/net/bnxt/bnxt_ring.c
index be77bbe..fab0e27 100644
--- a/drivers/net/bnxt/bnxt_ring.c
+++ b/drivers/net/bnxt/bnxt_ring.c
@@ -36,6 +36,7 @@
 #include "bnxt.h"
 #include "bnxt_cpr.h"
 #include "bnxt_ring.h"
+#include "bnxt_rxr.h"
 #include "bnxt_txr.h"

 #include "hsi_struct_def_dpdk.h"
@@ -73,9 +74,8 @@ int bnxt_alloc_rings(struct bnxt *bp, uint16_t qidx,
const char *suffix)
 {
struct bnxt_ring_struct *cp_ring = cp_ring_info->cp_ring_struct;
+   struct bnxt_ring_struct *rx_ring;
struct bnxt_ring_struct *tx_ring;
-   /* TODO: RX ring */
-   /* struct bnxt_ring_struct *rx_ring; */
struct rte_pci_device *pdev = bp->pdev;
const struct rte_memzone *mz = NULL;
char mz_name[RTE_MEMZONE_NAMESIZE];
@@ -92,12 +92,8 @@ int bnxt_alloc_rings(struct bnxt *bp, uint16_t qidx,
tx_ring_struct->vmem_size) : 0;

int rx_vmem_start = tx_vmem_start + tx_vmem_len;
-   /* TODO: RX ring */
-   int rx_vmem_len = 0;
-   /*
-* rx_ring_info ? RTE_CACHE_LINE_ROUNDUP(rx_ring_info->
-* rx_ring_struct->vmem_size) : 0;
-*/
+   int rx_vmem_len = rx_ring_info ? RTE_CACHE_LINE_ROUNDUP(rx_ring_info->
+   rx_ring_struct->vmem_size) : 0;

int cp_ring_start = rx_vmem_start + rx_vmem_len;
int cp_ring_len = RTE_CACHE_LINE_ROUNDUP(cp_ring->ring_size *
@@ -109,13 +105,9 @@ int bnxt_alloc_rings(struct bnxt *bp, uint16_t qidx,
   sizeof(struct tx_bd_long)) : 0;

int rx_ring_start = tx_ring_start + tx_ring_len;
-   /* TODO: RX ring */
-   int rx_ring_len = 0;
-   /*
-* rx_ring_info ?
-* RTE_CACHE_LINE_ROUNDUP(rx_ring_info->rx_ring_struct->ring_size *
-* sizeof(struct rx_prod_pkt_bd)) : 0;
-*/
+   int rx_ring_len =  rx_ring_info ?
+   RTE_CACHE_LINE_ROUNDUP(rx_ring_info->rx_ring_struct->ring_size *
+   sizeof(struct rx_prod_pkt_bd)) : 0;

int total_alloc_len = rx_ring_start + rx_ring_len;

@@ -153,26 +145,24 @@ int bnxt_alloc_rings(struct bnxt *bp, uint16_t qidx,
}
}

-/*
- * if (rx_ring_info) {
- * rx_ring = &rx_ring_info->rx_ring_struct;
- *
- * rx_ring->bd = ((char *)mz->addr + rx_ring_start);
- * rx_ring_info->rx_desc_ring =
- * (struct rx_prod_pkt_bd *)rx_ring->bd;
- * rx_ring->bd_dma = mz->phys_addr + rx_ring_start;
- * rx_ring_info->rx_desc_mapping = rx_ring->bd_dma;
- *
- * if (!rx_ring->bd)
- * return -ENOMEM;
- * if (rx_ring->vmem_size) {
- * rx_ring->vmem =
- * 

[dpdk-dev] [PATCH v4 27/39] bnxt: Add HWRM API to set and clear filters

2016-06-06 Thread Stephen Hurd
From: Ajit Khaparde 

This patch adds code to set and clear L2 filters from the corresponding
VNIC. These filters will determine the characteristics of Rx traffic.

v4:
Separated this code from the previous patch as it had nothing to
do with freeing of statistics context.

Signed-off-by: Ajit Khaparde 
Reviewed-by: David Christensen 
Signed-off-by: Stephen Hurd 
---
 drivers/net/bnxt/bnxt_hwrm.c | 69 
 drivers/net/bnxt/bnxt_hwrm.h |  6 
 2 files changed, 75 insertions(+)

diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c
index d3e77d5..f8e9d20 100644
--- a/drivers/net/bnxt/bnxt_hwrm.c
+++ b/drivers/net/bnxt/bnxt_hwrm.c
@@ -206,6 +206,49 @@ int bnxt_hwrm_clear_filter(struct bnxt *bp,
return 0;
 }

+int bnxt_hwrm_set_filter(struct bnxt *bp,
+struct bnxt_vnic_info *vnic,
+struct bnxt_filter_info *filter)
+{
+   int rc = 0;
+   struct hwrm_cfa_l2_filter_alloc_input req = {.req_type = 0 };
+   struct hwrm_cfa_l2_filter_alloc_output *resp = bp->hwrm_cmd_resp_addr;
+   uint32_t enables = 0;
+
+   HWRM_PREP(req, CFA_L2_FILTER_ALLOC, -1, resp);
+
+   req.flags = rte_cpu_to_le_32(filter->flags);
+
+   enables = filter->enables |
+ HWRM_CFA_L2_FILTER_ALLOC_INPUT_ENABLES_DST_ID;
+   req.dst_id = rte_cpu_to_le_16(vnic->fw_vnic_id);
+
+   if (enables &
+   HWRM_CFA_L2_FILTER_ALLOC_INPUT_ENABLES_L2_ADDR)
+   memcpy(req.l2_addr, filter->l2_addr,
+  ETHER_ADDR_LEN);
+   if (enables &
+   HWRM_CFA_L2_FILTER_ALLOC_INPUT_ENABLES_L2_ADDR_MASK)
+   memcpy(req.l2_addr_mask, filter->l2_addr_mask,
+  ETHER_ADDR_LEN);
+   if (enables &
+   HWRM_CFA_L2_FILTER_ALLOC_INPUT_ENABLES_L2_OVLAN)
+   req.l2_ovlan = filter->l2_ovlan;
+   if (enables &
+   HWRM_CFA_L2_FILTER_ALLOC_INPUT_ENABLES_L2_OVLAN_MASK)
+   req.l2_ovlan_mask = filter->l2_ovlan_mask;
+
+   req.enables = rte_cpu_to_le_32(enables);
+
+   rc = bnxt_hwrm_send_message(bp, &req, sizeof(req));
+
+   HWRM_CHECK_RESULT;
+
+   filter->fw_l2_filter_id = rte_le_to_cpu_64(resp->l2_filter_id);
+
+   return rc;
+}
+
 int bnxt_hwrm_exec_fwd_resp(struct bnxt *bp, void *fwd_cmd)
 {
int rc;
@@ -1016,6 +1059,32 @@ int bnxt_alloc_hwrm_resources(struct bnxt *bp)
return 0;
 }

+int bnxt_clear_hwrm_vnic_filters(struct bnxt *bp, struct bnxt_vnic_info *vnic)
+{
+   struct bnxt_filter_info *filter;
+   int rc = 0;
+
+   STAILQ_FOREACH(filter, &vnic->filter, next) {
+   rc = bnxt_hwrm_clear_filter(bp, filter);
+   if (rc)
+   break;
+   }
+   return rc;
+}
+
+int bnxt_set_hwrm_vnic_filters(struct bnxt *bp, struct bnxt_vnic_info *vnic)
+{
+   struct bnxt_filter_info *filter;
+   int rc = 0;
+
+   STAILQ_FOREACH(filter, &vnic->filter, next) {
+   rc = bnxt_hwrm_set_filter(bp, vnic, filter);
+   if (rc)
+   break;
+   }
+   return rc;
+}
+
 static uint16_t bnxt_parse_eth_link_duplex(uint32_t conf_link_speed)
 {
uint8_t hw_link_duplex = HWRM_PORT_PHY_CFG_INPUT_AUTO_DUPLEX_BOTH;
diff --git a/drivers/net/bnxt/bnxt_hwrm.h b/drivers/net/bnxt/bnxt_hwrm.h
index 5665762..55728df 100644
--- a/drivers/net/bnxt/bnxt_hwrm.h
+++ b/drivers/net/bnxt/bnxt_hwrm.h
@@ -47,6 +47,9 @@ int bnxt_hwrm_cfa_l2_clear_rx_mask(struct bnxt *bp,
 int bnxt_hwrm_cfa_l2_set_rx_mask(struct bnxt *bp, struct bnxt_vnic_info *vnic);
 int bnxt_hwrm_clear_filter(struct bnxt *bp,
   struct bnxt_filter_info *filter);
+int bnxt_hwrm_set_filter(struct bnxt *bp,
+struct bnxt_vnic_info *vnic,
+struct bnxt_filter_info *filter);

 int bnxt_hwrm_exec_fwd_resp(struct bnxt *bp, void *fwd_cmd);

@@ -88,6 +91,9 @@ int bnxt_clear_all_hwrm_stat_ctxs(struct bnxt *bp);
 int bnxt_free_all_hwrm_stat_ctxs(struct bnxt *bp);
 int bnxt_free_all_hwrm_ring_grps(struct bnxt *bp);
 int bnxt_alloc_all_hwrm_ring_grps(struct bnxt *bp);
+int bnxt_set_hwrm_vnic_filters(struct bnxt *bp, struct bnxt_vnic_info *vnic);
+int bnxt_clear_hwrm_vnic_filters(struct bnxt *bp, struct bnxt_vnic_info *vnic);
+void bnxt_free_all_hwrm_resources(struct bnxt *bp);
 void bnxt_free_hwrm_resources(struct bnxt *bp);
 int bnxt_alloc_hwrm_resources(struct bnxt *bp);
 int bnxt_set_hwrm_link_config(struct bnxt *bp, bool link_up);
-- 
1.9.1
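
The two VNIC-level helpers in the patch above walk the VNIC's filter list
and stop at the first failure. The standalone sketch below shows that
pattern with the BSD sys/queue.h macros. The demo_* types and the
demo_apply() operation are hypothetical stand-ins for the driver's filter
structures and HWRM calls.

```c
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical filter record; stands in for struct bnxt_filter_info. */
struct demo_filter {
	int id;
	int applied;
	STAILQ_ENTRY(demo_filter) next;
};

STAILQ_HEAD(demo_filter_list, demo_filter);

/* Hypothetical per-filter operation: fails for negative ids. */
static int demo_apply(struct demo_filter *f)
{
	if (f->id < 0)
		return -1;
	f->applied = 1;
	return 0;
}

/* Apply every filter in list order and stop at the first error. */
int demo_apply_all(struct demo_filter_list *list)
{
	struct demo_filter *f;
	int rc = 0;

	STAILQ_FOREACH(f, list, next) {
		rc = demo_apply(f);
		if (rc)
			break;
	}
	return rc;
}
```

One consequence of stopping early is that filters processed before the
failing one stay applied, so a caller that needs all-or-nothing behaviour
would have to undo them itself.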



[dpdk-dev] [PATCH v4 21/39] bnxt: add HWRM API to configure RSS

2016-06-06 Thread Stephen Hurd
From: Ajit Khaparde 

As mentioned earlier:
A VNIC represents a virtual interface. It is a resource in the RX path
of the chip and is used to set up various target actions, such as RSS
and MAC filtering, for the physical function in use.

The HWRM API defined in this patch will be used to enable RSS
configuration of the VNIC.
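
As a rough illustration of what an RSS configuration controls, the sketch
below spreads packets over receive rings by indexing a redirection table
with the packet hash. The table size (DEMO_RETA_SIZE) and all demo_* names
are assumptions chosen for the example; they do not describe the hardware
tables that this HWRM call programs.

```c
#include <stdint.h>

/* Assumed table size for the example; not the hardware's value. */
#define DEMO_RETA_SIZE 128u

/* Fill the table round-robin so that hashes spread over nb_rings rings. */
void demo_rss_fill(uint16_t table[DEMO_RETA_SIZE], uint16_t nb_rings)
{
	uint32_t i;

	if (nb_rings == 0)
		return; /* nothing to distribute to */
	for (i = 0; i < DEMO_RETA_SIZE; i++)
		table[i] = (uint16_t)(i % nb_rings);
}

/* Pick the destination ring for a packet with the given hash value. */
uint16_t demo_rss_pick(const uint16_t table[DEMO_RETA_SIZE], uint32_t hash)
{
	return table[hash % DEMO_RETA_SIZE];
}
```

Packets of the same flow produce the same hash and therefore land on the
same ring, which keeps per-flow ordering while still spreading load.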

Signed-off-by: Ajit Khaparde 
Reviewed-by: David Christensen 
Signed-off-by: Stephen Hurd 
---
 drivers/net/bnxt/bnxt_hwrm.c   | 24 
 drivers/net/bnxt/bnxt_hwrm.h   |  2 ++
 drivers/net/bnxt/hsi_struct_def_dpdk.h |  1 +
 3 files changed, 27 insertions(+)

diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c
index 87853f6..c43c2da 100644
--- a/drivers/net/bnxt/bnxt_hwrm.c
+++ b/drivers/net/bnxt/bnxt_hwrm.c
@@ -603,6 +603,30 @@ int bnxt_hwrm_vnic_free(struct bnxt *bp, struct bnxt_vnic_info *vnic)
return rc;
 }

+int bnxt_hwrm_vnic_rss_cfg(struct bnxt *bp,
+  struct bnxt_vnic_info *vnic)
+{
+   int rc = 0;
+   struct hwrm_vnic_rss_cfg_input req = {.req_type = 0 };
+   struct hwrm_vnic_rss_cfg_output *resp = bp->hwrm_cmd_resp_addr;
+
+   HWRM_PREP(req, VNIC_RSS_CFG, -1, resp);
+
+   req.hash_type = rte_cpu_to_le_32(vnic->hash_type);
+
+   req.ring_grp_tbl_addr =
+   rte_cpu_to_le_64(vnic->rss_table_dma_addr);
+   req.hash_key_tbl_addr =
+   rte_cpu_to_le_64(vnic->rss_hash_key_dma_addr);
+   req.rss_ctx_idx = rte_cpu_to_le_16(vnic->fw_rss_cos_lb_ctx);
+
+   rc = bnxt_hwrm_send_message(bp, &req, sizeof(req));
+
+   HWRM_CHECK_RESULT;
+
+   return rc;
+}
+
 /*
  * HWRM utility functions
  */
diff --git a/drivers/net/bnxt/bnxt_hwrm.h b/drivers/net/bnxt/bnxt_hwrm.h
index b7f6b20..7c12c6d 100644
--- a/drivers/net/bnxt/bnxt_hwrm.h
+++ b/drivers/net/bnxt/bnxt_hwrm.h
@@ -64,6 +64,8 @@ int bnxt_hwrm_vnic_cfg(struct bnxt *bp, struct bnxt_vnic_info *vnic);
 int bnxt_hwrm_vnic_ctx_alloc(struct bnxt *bp, struct bnxt_vnic_info *vnic);
 int bnxt_hwrm_vnic_ctx_free(struct bnxt *bp, struct bnxt_vnic_info *vnic);
 int bnxt_hwrm_vnic_free(struct bnxt *bp, struct bnxt_vnic_info *vnic);
+int bnxt_hwrm_vnic_rss_cfg(struct bnxt *bp,
+  struct bnxt_vnic_info *vnic);

 int bnxt_clear_all_hwrm_stat_ctxs(struct bnxt *bp);
 void bnxt_free_hwrm_resources(struct bnxt *bp);
diff --git a/drivers/net/bnxt/hsi_struct_def_dpdk.h b/drivers/net/bnxt/hsi_struct_def_dpdk.h
index 6412df2..72d4984 100644
--- a/drivers/net/bnxt/hsi_struct_def_dpdk.h
+++ b/drivers/net/bnxt/hsi_struct_def_dpdk.h
@@ -92,6 +92,7 @@ struct ctx_hw_stats64 {
 #define HWRM_VNIC_ALLOC (UINT32_C(0x40))
 #define HWRM_VNIC_FREE (UINT32_C(0x41))
 #define HWRM_VNIC_CFG  (UINT32_C(0x42))
+#define HWRM_VNIC_RSS_CFG  (UINT32_C(0x46))
 #define HWRM_VNIC_RSS_COS_LB_CTX_ALLOC (UINT32_C(0x70))
 #define HWRM_VNIC_RSS_COS_LB_CTX_FREE  (UINT32_C(0x71))
 #define HWRM_CFA_L2_FILTER_ALLOC   (UINT32_C(0x90))
-- 
1.9.1



[dpdk-dev] [PATCH v4 17/39] bnxt: add HWRM vnic alloc function

2016-06-06 Thread Stephen Hurd
From: Ajit Khaparde 

This requires a group info array in struct bnxt, so add that, save
the max size from the func_qcaps response, and alloc/free in init/uninit.

As mentioned in the previous patch, a VNIC represents a virtual interface.
It is a resource in the RX path of the chip and is used to set up various
target actions, such as RSS and MAC filtering, for the physical function
in use.

Signed-off-by: Ajit Khaparde 
Reviewed-by: David Christensen 
Signed-off-by: Stephen Hurd 
---
 drivers/net/bnxt/bnxt.h|  2 +
 drivers/net/bnxt/bnxt_hwrm.c   | 33 
 drivers/net/bnxt/bnxt_hwrm.h   |  2 +
 drivers/net/bnxt/hsi_struct_def_dpdk.h | 99 ++
 4 files changed, 136 insertions(+)

diff --git a/drivers/net/bnxt/bnxt.h b/drivers/net/bnxt/bnxt.h
index f7cf9d1..df1f771 100644
--- a/drivers/net/bnxt/bnxt.h
+++ b/drivers/net/bnxt/bnxt.h
@@ -141,6 +141,8 @@ struct bnxt {

/* Default completion ring */
struct bnxt_cp_ring_info *def_cp_ring;
+   uint32_t max_ring_grps;
+   struct bnxt_ring_grp_info   *grp_info;

unsigned int nr_vnics;

diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c
index 3400083..77afb81 100644
--- a/drivers/net/bnxt/bnxt_hwrm.c
+++ b/drivers/net/bnxt/bnxt_hwrm.c
@@ -43,7 +43,9 @@
 #include "bnxt_filter.h"
 #include "bnxt_hwrm.h"
 #include "bnxt_rxq.h"
+#include "bnxt_ring.h"
 #include "bnxt_txq.h"
+#include "bnxt_vnic.h"
 #include "hsi_struct_def_dpdk.h"

 #define HWRM_CMD_TIMEOUT   2000
@@ -191,6 +193,7 @@ int bnxt_hwrm_func_qcaps(struct bnxt *bp)

HWRM_CHECK_RESULT;

+   bp->max_ring_grps = rte_le_to_cpu_32(resp->max_hw_ring_grps);
if (BNXT_PF(bp)) {
struct bnxt_pf_info *pf = &bp->pf;

@@ -477,6 +480,36 @@ int bnxt_hwrm_stat_clear(struct bnxt *bp, struct bnxt_cp_ring_info *cpr)
return rc;
 }

+int bnxt_hwrm_vnic_alloc(struct bnxt *bp, struct bnxt_vnic_info *vnic)
+{
+   int rc = 0, i, j;
+   struct hwrm_vnic_alloc_input req = {.req_type = 0 };
+   struct hwrm_vnic_alloc_output *resp = bp->hwrm_cmd_resp_addr;
+
+   /* map ring groups to this vnic */
+   for (i = vnic->start_grp_id, j = 0; i <= vnic->end_grp_id; i++, j++) {
+   if (bp->grp_info[i].fw_grp_id == (uint16_t)HWRM_NA_SIGNATURE) {
+   RTE_LOG(ERR, PMD,
+   "Not enough ring groups avail:%x req:%x\n", j,
+   (vnic->end_grp_id - vnic->start_grp_id) + 1);
+   break;
+   }
+   vnic->fw_grp_ids[j] = bp->grp_info[i].fw_grp_id;
+   }
+
+   vnic->fw_rss_cos_lb_ctx = (uint16_t)HWRM_NA_SIGNATURE;
+   vnic->ctx_is_rss_cos_lb = HW_CONTEXT_NONE;
+
+   HWRM_PREP(req, VNIC_ALLOC, -1, resp);
+
+   rc = bnxt_hwrm_send_message(bp, &req, sizeof(req));
+
+   HWRM_CHECK_RESULT;
+
+   vnic->fw_vnic_id = rte_le_to_cpu_16(resp->vnic_id);
+   return rc;
+}
+
 /*
  * HWRM utility functions
  */
diff --git a/drivers/net/bnxt/bnxt_hwrm.h b/drivers/net/bnxt/bnxt_hwrm.h
index 4fa94aa..62dc801 100644
--- a/drivers/net/bnxt/bnxt_hwrm.h
+++ b/drivers/net/bnxt/bnxt_hwrm.h
@@ -59,6 +59,8 @@ int bnxt_hwrm_stat_clear(struct bnxt *bp, struct bnxt_cp_ring_info *cpr);

 int bnxt_hwrm_ver_get(struct bnxt *bp);

+int bnxt_hwrm_vnic_alloc(struct bnxt *bp, struct bnxt_vnic_info *vnic);
+
 int bnxt_clear_all_hwrm_stat_ctxs(struct bnxt *bp);
 void bnxt_free_hwrm_resources(struct bnxt *bp);
 int bnxt_alloc_hwrm_resources(struct bnxt *bp);
diff --git a/drivers/net/bnxt/hsi_struct_def_dpdk.h b/drivers/net/bnxt/hsi_struct_def_dpdk.h
index 6209368..eedd368 100644
--- a/drivers/net/bnxt/hsi_struct_def_dpdk.h
+++ b/drivers/net/bnxt/hsi_struct_def_dpdk.h
@@ -89,6 +89,7 @@ struct ctx_hw_stats64 {
 #define HWRM_FUNC_DRV_RGTR (UINT32_C(0x1d))
 #define HWRM_PORT_PHY_CFG  (UINT32_C(0x20))
 #define HWRM_QUEUE_QPORTCFG (UINT32_C(0x30))
+#define HWRM_VNIC_ALLOC (UINT32_C(0x40))
 #define HWRM_CFA_L2_FILTER_ALLOC   (UINT32_C(0x90))
 #define HWRM_CFA_L2_FILTER_FREE (UINT32_C(0x91))
 #define HWRM_CFA_L2_FILTER_CFG (UINT32_C(0x92))
@@ -3119,6 +3120,104 @@ struct hwrm_stat_ctx_clr_stats_output {
uint8_t valid;
 } __attribute__((packed));

+/* hwrm_vnic_alloc */
+/*
+ * Description: This VNIC is a resource in the RX side of the chip that is used
+ * to represent a virtual host "interface". # At the time of VNIC allocation or
+ * configuration, the function can specify whether it wants the requested VNIC
+ * to be the default VNIC for the function or not. # If a function requests
+ * allocation of a VNIC for the first time and a VNIC is successfully allocated
+ * by the HWRM, then the HWRM shall make the allocated VNIC as the default VNIC
+ * for that function. # The default VNIC shall be used for the default action
+ 

[dpdk-dev] [PATCH v4 37/39] bnxt: add RSS device operations

2016-06-06 Thread Stephen Hurd
From: Ajit Khaparde 

Add rss_hash_update and rss_hash_conf_get dev_ops
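
Both operations translate between the ethdev hash-function flags and the
firmware hash-type bits, once in each direction. A table-driven sketch of
such a two-way translation follows. The flag values and all demo_* names
are invented for the example and do not match the real ETH_RSS_* or
HWRM_VNIC_RSS_CFG_INPUT_HASH_TYPE_* definitions; the patch itself uses
explicit if statements rather than a table.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag values; the real API and firmware bits differ. */
#define DEMO_API_IPV4     (UINT64_C(1) << 2)
#define DEMO_API_IPV4_TCP (UINT64_C(1) << 4)
#define DEMO_API_IPV6     (UINT64_C(1) << 8)
#define DEMO_HW_IPV4      (UINT32_C(1) << 0)
#define DEMO_HW_TCP_IPV4  (UINT32_C(1) << 1)
#define DEMO_HW_IPV6      (UINT32_C(1) << 3)

struct demo_rss_map {
	uint64_t api; /* application-facing flag */
	uint32_t hw;  /* firmware-facing flag */
};

static const struct demo_rss_map demo_map[] = {
	{ DEMO_API_IPV4,     DEMO_HW_IPV4 },
	{ DEMO_API_IPV4_TCP, DEMO_HW_TCP_IPV4 },
	{ DEMO_API_IPV6,     DEMO_HW_IPV6 },
};
#define DEMO_MAP_LEN (sizeof(demo_map) / sizeof(demo_map[0]))

/* Translate application flags to firmware bits; unknown flags are dropped. */
uint32_t demo_api_to_hw(uint64_t rss_hf)
{
	uint32_t hw = 0;
	size_t i;

	for (i = 0; i < DEMO_MAP_LEN; i++)
		if (rss_hf & demo_map[i].api)
			hw |= demo_map[i].hw;
	return hw;
}

/* Translate firmware bits back to application flags. */
uint64_t demo_hw_to_api(uint32_t hash_type)
{
	uint64_t api = 0;
	size_t i;

	for (i = 0; i < DEMO_MAP_LEN; i++)
		if (hash_type & demo_map[i].hw)
			api |= demo_map[i].api;
	return api;
}
```

Keeping both directions on one table means a newly supported hash type
only has to be added in a single place.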

v4:
Fix issues pointed out by checkpatch.

Signed-off-by: Ajit Khaparde 
Reviewed-by: David Christensen 
Signed-off-by: Stephen Hurd 
---
 drivers/net/bnxt/bnxt_ethdev.c | 121 +
 1 file changed, 121 insertions(+)

diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index b3b76f1..3c7f868 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -62,6 +62,14 @@ static struct rte_pci_id bnxt_pci_id_map[] = {
{.device_id = 0},
 };

+#define BNXT_ETH_RSS_SUPPORT ( \
+   ETH_RSS_IPV4 |  \
+   ETH_RSS_NONFRAG_IPV4_TCP |  \
+   ETH_RSS_NONFRAG_IPV4_UDP |  \
+   ETH_RSS_IPV6 |  \
+   ETH_RSS_NONFRAG_IPV6_TCP |  \
+   ETH_RSS_NONFRAG_IPV6_UDP)
+
 /***/

 /*
@@ -636,6 +644,117 @@ static int bnxt_reta_query_op(struct rte_eth_dev *eth_dev,
return 0;
 }

+static int bnxt_rss_hash_update_op(struct rte_eth_dev *eth_dev,
+  struct rte_eth_rss_conf *rss_conf)
+{
+   struct bnxt *bp = (struct bnxt *)eth_dev->data->dev_private;
+   struct rte_eth_conf *dev_conf = &bp->eth_dev->data->dev_conf;
+   struct bnxt_vnic_info *vnic;
+   uint16_t hash_type = 0;
+   int i;
+
+   /*
+* If the RSS enablement differs from that set by dev_configure,
+* then return -EINVAL
+*/
+   if (dev_conf->rxmode.mq_mode & ETH_MQ_RX_RSS_FLAG) {
+   if (!rss_conf->rss_hf)
+   return -EINVAL;
+   } else {
+   if (rss_conf->rss_hf & BNXT_ETH_RSS_SUPPORT)
+   return -EINVAL;
+   }
+   if (rss_conf->rss_hf & ETH_RSS_IPV4)
+   hash_type |= HWRM_VNIC_RSS_CFG_INPUT_HASH_TYPE_IPV4;
+   if (rss_conf->rss_hf & ETH_RSS_NONFRAG_IPV4_TCP)
+   hash_type |= HWRM_VNIC_RSS_CFG_INPUT_HASH_TYPE_TCP_IPV4;
+   if (rss_conf->rss_hf & ETH_RSS_NONFRAG_IPV4_UDP)
+   hash_type |= HWRM_VNIC_RSS_CFG_INPUT_HASH_TYPE_UDP_IPV4;
+   if (rss_conf->rss_hf & ETH_RSS_IPV6)
+   hash_type |= HWRM_VNIC_RSS_CFG_INPUT_HASH_TYPE_IPV6;
+   if (rss_conf->rss_hf & ETH_RSS_NONFRAG_IPV6_TCP)
+   hash_type |= HWRM_VNIC_RSS_CFG_INPUT_HASH_TYPE_TCP_IPV6;
+   if (rss_conf->rss_hf & ETH_RSS_NONFRAG_IPV6_UDP)
+   hash_type |= HWRM_VNIC_RSS_CFG_INPUT_HASH_TYPE_UDP_IPV6;
+
+   /* Update the RSS VNIC(s) */
+   for (i = 0; i < MAX_FF_POOLS; i++) {
+   STAILQ_FOREACH(vnic, &bp->ff_pool[i], next) {
+   vnic->hash_type = hash_type;
+
+   /*
+* Use the supplied key if the key length is
+* acceptable and the rss_key is not NULL
+*/
+   if (rss_conf->rss_key &&
+   rss_conf->rss_key_len <= HW_HASH_KEY_SIZE)
+   memcpy(vnic->rss_hash_key, rss_conf->rss_key,
+  rss_conf->rss_key_len);
+
+   bnxt_hwrm_vnic_rss_cfg(bp, vnic);
+   }
+   }
+   return 0;
+}
+
+static int bnxt_rss_hash_conf_get_op(struct rte_eth_dev *eth_dev,
+struct rte_eth_rss_conf *rss_conf)
+{
+   struct bnxt *bp = (struct bnxt *)eth_dev->data->dev_private;
+   struct bnxt_vnic_info *vnic = &bp->vnic_info[0];
+   int len;
+   uint32_t hash_types;
+
+   /* RSS configuration is the same for all VNICs */
+   if (vnic && vnic->rss_hash_key) {
+   if (rss_conf->rss_key) {
+   len = rss_conf->rss_key_len <= HW_HASH_KEY_SIZE ?
+ rss_conf->rss_key_len : HW_HASH_KEY_SIZE;
+   memcpy(rss_conf->rss_key, vnic->rss_hash_key, len);
+   }
+
+   hash_types = vnic->hash_type;
+   rss_conf->rss_hf = 0;
+   if (hash_types & HWRM_VNIC_RSS_CFG_INPUT_HASH_TYPE_IPV4) {
+   rss_conf->rss_hf |= ETH_RSS_IPV4;
+   hash_types &= ~HWRM_VNIC_RSS_CFG_INPUT_HASH_TYPE_IPV4;
+   }
+   if (hash_types & HWRM_VNIC_RSS_CFG_INPUT_HASH_TYPE_TCP_IPV4) {
+   rss_conf->rss_hf |= ETH_RSS_NONFRAG_IPV4_TCP;
+   hash_types &=
+   ~HWRM_VNIC_RSS_CFG_INPUT_HASH_TYPE_TCP_IPV4;
+   }
+   if (hash_types & HWRM_VNIC_RSS_CFG_INPUT_HASH_TYPE_UDP_IPV4) {
+   rss_conf->rss_hf |= ETH_RSS_NONFRAG_IPV4_UDP;
+   hash_types &=
+   ~HWRM_VNIC_RSS_CFG_INPUT_HASH_TYPE_UDP_IPV4;
+   }
+   if (hash_types & HWRM_VNIC_RSS_CFG_INPUT_HASH_TYPE_IPV6) {
+   rss_conf->rss_hf |= ETH_RSS_IPV6;
+   h

[dpdk-dev] [PATCH v4 16/39] bnxt: add HWRM function reset command

2016-06-06 Thread Stephen Hurd
From: Ajit Khaparde 

Add bnxt_hwrm_func_reset() function and supporting structs and macros.
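
Every HWRM request in this series starts with the same 16-byte header
(req_type, cmpl_ring, seq_id, target_id, resp_addr), stored little-endian.
The sketch below fills such a header in a byte-order-independent way. It
only illustrates the layout described in the struct comments of this
patch: demo_hwrm_hdr, demo_hwrm_prep() and the conversion helpers are
hypothetical names, and the target value 0xFFFF for "HWRM" is an
assumption. This is not the driver's HWRM_PREP macro.

```c
#include <stdint.h>
#include <string.h>

/* Mirrors the common request header fields; name is hypothetical. */
struct demo_hwrm_hdr {
	uint16_t req_type;
	uint16_t cmpl_ring;
	uint16_t seq_id;
	uint16_t target_id;
	uint64_t resp_addr;
} __attribute__((packed));

/* Return v laid out as little-endian bytes, whatever the host order. */
static uint16_t demo_cpu_to_le16(uint16_t v)
{
	uint8_t b[2] = { (uint8_t)(v & 0xff), (uint8_t)(v >> 8) };
	uint16_t out;

	memcpy(&out, b, sizeof(out));
	return out;
}

static uint64_t demo_cpu_to_le64(uint64_t v)
{
	uint8_t b[8];
	uint64_t out;
	int i;

	for (i = 0; i < 8; i++)
		b[i] = (uint8_t)(v >> (8 * i));
	memcpy(&out, b, sizeof(out));
	return out;
}

/* Fill the common header for a request that completes by polling. */
void demo_hwrm_prep(struct demo_hwrm_hdr *h, uint16_t type, uint16_t seq,
		    uint64_t resp_dma)
{
	memset(h, 0, sizeof(*h));
	h->req_type = demo_cpu_to_le16(type);
	h->cmpl_ring = demo_cpu_to_le16(0xFFFF); /* -1: no CR completion */
	h->seq_id = demo_cpu_to_le16(seq);
	h->target_id = demo_cpu_to_le16(0xFFFF); /* assumed: the HWRM itself */
	h->resp_addr = demo_cpu_to_le64(resp_dma);
}
```

Building the bytes explicitly keeps the sketch correct on both little- and
big-endian hosts, which is the same job rte_cpu_to_le_16() and
rte_cpu_to_le_64() do in the driver code.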

Signed-off-by: Ajit Khaparde 
Reviewed-by: David Christensen 
Signed-off-by: Stephen Hurd 
---
 drivers/net/bnxt/bnxt_hwrm.c   |  17 +
 drivers/net/bnxt/bnxt_hwrm.h   |   1 +
 drivers/net/bnxt/hsi_struct_def_dpdk.h | 129 +
 3 files changed, 147 insertions(+)

diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c
index 2574bd0..3400083 100644
--- a/drivers/net/bnxt/bnxt_hwrm.c
+++ b/drivers/net/bnxt/bnxt_hwrm.c
@@ -221,6 +221,23 @@ int bnxt_hwrm_func_qcaps(struct bnxt *bp)
return rc;
 }

+int bnxt_hwrm_func_reset(struct bnxt *bp)
+{
+   int rc = 0;
+   struct hwrm_func_reset_input req = {.req_type = 0 };
+   struct hwrm_func_reset_output *resp = bp->hwrm_cmd_resp_addr;
+
+   HWRM_PREP(req, FUNC_RESET, -1, resp);
+
+   req.enables = rte_cpu_to_le_32(0);
+
+   rc = bnxt_hwrm_send_message(bp, &req, sizeof(req));
+
+   HWRM_CHECK_RESULT;
+
+   return rc;
+}
+
 int bnxt_hwrm_func_driver_register(struct bnxt *bp, uint32_t flags,
   uint32_t *vf_req_fwd)
 {
diff --git a/drivers/net/bnxt/bnxt_hwrm.h b/drivers/net/bnxt/bnxt_hwrm.h
index 0861417..4fa94aa 100644
--- a/drivers/net/bnxt/bnxt_hwrm.h
+++ b/drivers/net/bnxt/bnxt_hwrm.h
@@ -50,6 +50,7 @@ int bnxt_hwrm_exec_fwd_resp(struct bnxt *bp, void *fwd_cmd);
 int bnxt_hwrm_func_driver_register(struct bnxt *bp, uint32_t flags,
   uint32_t *vf_req_fwd);
 int bnxt_hwrm_func_qcaps(struct bnxt *bp);
+int bnxt_hwrm_func_reset(struct bnxt *bp);
 int bnxt_hwrm_func_driver_unregister(struct bnxt *bp, uint32_t flags);

 int bnxt_hwrm_queue_qportcfg(struct bnxt *bp);
diff --git a/drivers/net/bnxt/hsi_struct_def_dpdk.h b/drivers/net/bnxt/hsi_struct_def_dpdk.h
index 8b30787..6209368 100644
--- a/drivers/net/bnxt/hsi_struct_def_dpdk.h
+++ b/drivers/net/bnxt/hsi_struct_def_dpdk.h
@@ -83,6 +83,7 @@ struct ctx_hw_stats64 {
  * Request types
  */
 #define HWRM_VER_GET   (UINT32_C(0x0))
+#define HWRM_FUNC_RESET (UINT32_C(0x11))
 #define HWRM_FUNC_QCAPS (UINT32_C(0x15))
 #define HWRM_FUNC_DRV_UNRGTR   (UINT32_C(0x1a))
 #define HWRM_FUNC_DRV_RGTR (UINT32_C(0x1d))
@@ -2048,6 +2049,134 @@ struct hwrm_func_qcaps_output {
uint8_t valid;
 } __attribute__((packed));

+/* hwrm_func_reset */
+/*
+ * Description: This command resets a hardware function (PCIe function) and
+ * frees any resources used by the function. This command shall be initiated by
+ * the driver after an FLR has occurred to prepare the function for re-use. This
+ * command may also be initiated by a driver prior to doing its own
+ * configuration. This command puts the function into the reset state. In the
+ * reset state, global and port related features of the chip are not available.
+ */
+/*
+ * Note: This command will reset a function that has already been disabled or
+ * idled. The command returns all the resources owned by the function so a new
+ * driver may allocate and configure resources normally.
+ */
+
+/* Input (24 bytes) */
+struct hwrm_func_reset_input {
+   /*
+* This value indicates what type of request this is. The format for the
+* rest of the command is determined by this field.
+*/
+   uint16_t req_type;
+
+   /*
+* This value indicates what completion ring the request will be
+* optionally completed on. If the value is -1, then no CR completion
+* will be generated. Any other value must be a valid CR ring_id value
+* for this function.
+*/
+   uint16_t cmpl_ring;
+
+   /* This value indicates the command sequence number. */
+   uint16_t seq_id;
+
+   /*
+* Target ID of this command. 0x0 - 0xFFF8 - Used for function ids
+* 0xFFF8 - 0xFFFE - Reserved for internal processors 0xFFFF - HWRM
+*/
+   uint16_t target_id;
+
+   /*
+* This is the host address where the response will be written when the
+* request is complete. This area must be 16B aligned and must be
+* cleared to zero before the request is made.
+*/
+   uint64_t resp_addr;
+
+   /* This bit must be '1' for the vf_id_valid field to be configured. */
+   #define HWRM_FUNC_RESET_INPUT_ENABLES_VF_ID_VALID \
+   UINT32_C(0x1)
+   uint32_t enables;
+
+   /*
+* The ID of the VF that this PF is trying to reset. Only the parent PF
+* shall be allowed to reset a child VF. A parent PF driver shall use
+* this field only when a specific child VF is requested to be reset.
+*/
+   uint16_t vf_id;
+
+   /* This value indicates the level of a function reset. */
+   /*
+* Reset the caller function and its children 

[dpdk-dev] [PATCH v4 36/39] bnxt: add reta update/query operations

2016-06-06 Thread Stephen Hurd
From: Ajit Khaparde 

Add the reta update and reta query dev_ops.

Signed-off-by: Ajit Khaparde 
Reviewed-by: David Christensen 
Signed-off-by: Stephen Hurd 
---
 drivers/net/bnxt/bnxt_ethdev.c | 56 ++
 1 file changed, 56 insertions(+)

diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index b04010c..b3b76f1 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -582,6 +582,60 @@ static void bnxt_allmulticast_disable_op(struct rte_eth_dev *eth_dev)
bnxt_hwrm_cfa_l2_set_rx_mask(bp, vnic);
 }

+static int bnxt_reta_update_op(struct rte_eth_dev *eth_dev,
+   struct rte_eth_rss_reta_entry64 *reta_conf,
+   uint16_t reta_size)
+{
+   struct bnxt *bp = (struct bnxt *)eth_dev->data->dev_private;
+   struct rte_eth_conf *dev_conf = &bp->eth_dev->data->dev_conf;
+   struct bnxt_vnic_info *vnic;
+   int i;
+
+   if (!(dev_conf->rxmode.mq_mode & ETH_MQ_RX_RSS_FLAG))
+   return -EINVAL;
+
+   if (reta_size != HW_HASH_INDEX_SIZE) {
+   RTE_LOG(ERR, PMD, "The configured hash table lookup size "
+   "(%d) must equal the size supported by the hardware "
+   "(%d)\n", reta_size, HW_HASH_INDEX_SIZE);
+   return -EINVAL;
+   }
+   /* Update the RSS VNIC(s) */
+   for (i = 0; i < MAX_FF_POOLS; i++) {
+   STAILQ_FOREACH(vnic, &bp->ff_pool[i], next) {
+   memcpy(vnic->rss_table, reta_conf, reta_size);
+
+   bnxt_hwrm_vnic_rss_cfg(bp, vnic);
+   }
+   }
+   return 0;
+}
+
+static int bnxt_reta_query_op(struct rte_eth_dev *eth_dev,
+ struct rte_eth_rss_reta_entry64 *reta_conf,
+ uint16_t reta_size)
+{
+   struct bnxt *bp = (struct bnxt *)eth_dev->data->dev_private;
+   struct bnxt_vnic_info *vnic = &bp->vnic_info[0];
+
+   /* Retrieve from the default VNIC */
+   if (!vnic)
+   return -EINVAL;
+   if (!vnic->rss_table)
+   return -EINVAL;
+
+   if (reta_size != HW_HASH_INDEX_SIZE) {
+   RTE_LOG(ERR, PMD, "The configured hash table lookup size "
+   "(%d) must equal the size supported by the hardware "
+   "(%d)\n", reta_size, HW_HASH_INDEX_SIZE);
+   return -EINVAL;
+   }
+   /* EW - need to revisit here copying from u64 to u16 */
+   memcpy(reta_conf, vnic->rss_table, reta_size);
+
+   return 0;
+}
+
 /*
  * Initialization
  */
@@ -600,6 +654,8 @@ static struct eth_dev_ops bnxt_dev_ops = {
.rx_queue_release = bnxt_rx_queue_release_op,
.tx_queue_setup = bnxt_tx_queue_setup_op,
.tx_queue_release = bnxt_tx_queue_release_op,
+   .reta_update = bnxt_reta_update_op,
+   .reta_query = bnxt_reta_query_op,
.link_update = bnxt_link_update_op,
.promiscuous_enable = bnxt_promiscuous_enable_op,
.promiscuous_disable = bnxt_promiscuous_disable_op,
-- 
1.9.1
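
The query path in the patch above carries a note that the copy between the
application's reta_conf and the driver's table needs revisiting. One
possible shape for an element-wise copy is sketched below. It assumes the
ethdev layout of one 64-bit mask plus 64 16-bit entries per group.
demo_reta_entry64 is a local mirror of that layout for illustration, not
the DPDK definition, and demo_reta_unpack() is a hypothetical helper that
is not part of this series.

```c
#include <stdint.h>

#define DEMO_RETA_GROUP_SIZE 64

/* Local mirror of the assumed per-group layout; not the DPDK struct. */
struct demo_reta_entry64 {
	uint64_t mask;                       /* bit n set: reta[n] is valid */
	uint16_t reta[DEMO_RETA_GROUP_SIZE]; /* ring index per table slot */
};

/* Copy only the entries selected by each group's mask into a flat table.
 * table must hold reta_size entries; conf must hold
 * ceil(reta_size / DEMO_RETA_GROUP_SIZE) groups. */
void demo_reta_unpack(uint16_t *table, const struct demo_reta_entry64 *conf,
		      uint16_t reta_size)
{
	uint16_t i;

	for (i = 0; i < reta_size; i++) {
		uint16_t grp = i / DEMO_RETA_GROUP_SIZE;
		uint16_t bit = i % DEMO_RETA_GROUP_SIZE;

		if (conf[grp].mask & (UINT64_C(1) << bit))
			table[i] = conf[grp].reta[bit];
	}
}
```

Entries whose mask bit is clear are left untouched, so a caller can update
a subset of the table without rewriting the rest.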



[dpdk-dev] [PATCH v4 35/39] bnxt: add set link up/down operations

2016-06-06 Thread Stephen Hurd
From: Ajit Khaparde 

Adds dev_ops to set link UP or DOWN as appropriate.

Signed-off-by: Ajit Khaparde 
Reviewed-by: David Christensen 
Signed-off-by: Stephen Hurd 
---
 drivers/net/bnxt/bnxt_ethdev.c | 20 
 1 file changed, 20 insertions(+)

diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index 34a5873..b04010c 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -380,6 +380,24 @@ error:
return rc;
 }

+static int bnxt_dev_set_link_up_op(struct rte_eth_dev *eth_dev)
+{
+   struct bnxt *bp = (struct bnxt *)eth_dev->data->dev_private;
+
+   eth_dev->data->dev_link.link_status = 1;
+   bnxt_set_hwrm_link_config(bp, true);
+   return 0;
+}
+
+static int bnxt_dev_set_link_down_op(struct rte_eth_dev *eth_dev)
+{
+   struct bnxt *bp = (struct bnxt *)eth_dev->data->dev_private;
+
+   eth_dev->data->dev_link.link_status = 0;
+   bnxt_set_hwrm_link_config(bp, false);
+   return 0;
+}
+
 static void bnxt_dev_close_op(struct rte_eth_dev *eth_dev)
 {
struct bnxt *bp = (struct bnxt *)eth_dev->data->dev_private;
@@ -574,6 +592,8 @@ static struct eth_dev_ops bnxt_dev_ops = {
.dev_configure = bnxt_dev_configure_op,
.dev_start = bnxt_dev_start_op,
.dev_stop = bnxt_dev_stop_op,
+   .dev_set_link_up = bnxt_dev_set_link_up_op,
+   .dev_set_link_down = bnxt_dev_set_link_down_op,
.stats_get = bnxt_stats_get_op,
.stats_reset = bnxt_stats_reset_op,
.rx_queue_setup = bnxt_rx_queue_setup_op,
-- 
1.9.1
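
Both link ops above update dev_link.link_status before the HWRM call and
always return 0. One possible variation, sketched here with hypothetical
stand-in types (fake_port and fake_set_hw_link() are not part of the
driver), is to update the cached status only once the hardware call has
succeeded and to pass its return code back to the caller.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the driver state; not the bnxt structures. */
struct fake_port {
	int link_status; /* 1 = up, 0 = down, as reported to the application */
	int hw_rc;       /* value the fake firmware call will return */
};

/* Pretend firmware call: returns 0 on success, negative on failure. */
int fake_set_hw_link(struct fake_port *port, bool up)
{
	(void)up;
	return port->hw_rc;
}

/* Update the cached status only after the hardware accepted the request. */
int port_set_link(struct fake_port *port, bool up)
{
	int rc = fake_set_hw_link(port, up);

	if (rc)
		return rc;
	port->link_status = up ? 1 : 0;
	return 0;
}
```

Whether this ordering is preferable depends on how the link state is
refreshed elsewhere in the driver, which this patch does not show.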



[dpdk-dev] [PATCH v4 20/39] bnxt: add API to allow configuration of vnic

2016-06-06 Thread Stephen Hurd
From: Ajit Khaparde 

This patch adds APIs to allow configuration of a VNIC.
The functions allocate and free the COS and Load Balance context
corresponding to the VNIC in the chip.

v4:
Address review comments and fix issues pointed out by checkpatch.

Signed-off-by: Ajit Khaparde 
Reviewed-by: David Christensen 
Signed-off-by: Stephen Hurd 
---
 drivers/net/bnxt/bnxt_hwrm.c   |  38 
 drivers/net/bnxt/bnxt_hwrm.h   |   2 +
 drivers/net/bnxt/hsi_struct_def_dpdk.h | 153 +
 3 files changed, 193 insertions(+)

diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c
index be4020f..87853f6 100644
--- a/drivers/net/bnxt/bnxt_hwrm.c
+++ b/drivers/net/bnxt/bnxt_hwrm.c
@@ -544,6 +544,44 @@ int bnxt_hwrm_vnic_cfg(struct bnxt *bp, struct 
bnxt_vnic_info *vnic)
return rc;
 }

+int bnxt_hwrm_vnic_ctx_alloc(struct bnxt *bp, struct bnxt_vnic_info *vnic)
+{
+   int rc = 0;
+   struct hwrm_vnic_rss_cos_lb_ctx_alloc_input req = {.req_type = 0 };
+   struct hwrm_vnic_rss_cos_lb_ctx_alloc_output *resp =
+   bp->hwrm_cmd_resp_addr;
+
+   HWRM_PREP(req, VNIC_RSS_COS_LB_CTX_ALLOC, -1, resp);
+
+   rc = bnxt_hwrm_send_message(bp, &req, sizeof(req));
+
+   HWRM_CHECK_RESULT;
+
+   vnic->fw_rss_cos_lb_ctx = rte_le_to_cpu_16(resp->rss_cos_lb_ctx_id);
+
+   return rc;
+}
+
+int bnxt_hwrm_vnic_ctx_free(struct bnxt *bp, struct bnxt_vnic_info *vnic)
+{
+   int rc = 0;
+   struct hwrm_vnic_rss_cos_lb_ctx_free_input req = {.req_type = 0 };
+   struct hwrm_vnic_rss_cos_lb_ctx_free_output *resp =
+   bp->hwrm_cmd_resp_addr;
+
+   HWRM_PREP(req, VNIC_RSS_COS_LB_CTX_FREE, -1, resp);
+
+   req.rss_cos_lb_ctx_id = rte_cpu_to_le_16(vnic->fw_rss_cos_lb_ctx);
+
+   rc = bnxt_hwrm_send_message(bp, &req, sizeof(req));
+
+   HWRM_CHECK_RESULT;
+
+   vnic->fw_rss_cos_lb_ctx = INVALID_HW_RING_ID;
+
+   return rc;
+}
+
 int bnxt_hwrm_vnic_free(struct bnxt *bp, struct bnxt_vnic_info *vnic)
 {
int rc = 0;
diff --git a/drivers/net/bnxt/bnxt_hwrm.h b/drivers/net/bnxt/bnxt_hwrm.h
index b5cf090..b7f6b20 100644
--- a/drivers/net/bnxt/bnxt_hwrm.h
+++ b/drivers/net/bnxt/bnxt_hwrm.h
@@ -61,6 +61,8 @@ int bnxt_hwrm_ver_get(struct bnxt *bp);

 int bnxt_hwrm_vnic_alloc(struct bnxt *bp, struct bnxt_vnic_info *vnic);
 int bnxt_hwrm_vnic_cfg(struct bnxt *bp, struct bnxt_vnic_info *vnic);
+int bnxt_hwrm_vnic_ctx_alloc(struct bnxt *bp, struct bnxt_vnic_info *vnic);
+int bnxt_hwrm_vnic_ctx_free(struct bnxt *bp, struct bnxt_vnic_info *vnic);
 int bnxt_hwrm_vnic_free(struct bnxt *bp, struct bnxt_vnic_info *vnic);

 int bnxt_clear_all_hwrm_stat_ctxs(struct bnxt *bp);
diff --git a/drivers/net/bnxt/hsi_struct_def_dpdk.h 
b/drivers/net/bnxt/hsi_struct_def_dpdk.h
index ef0b37a..6412df2 100644
--- a/drivers/net/bnxt/hsi_struct_def_dpdk.h
+++ b/drivers/net/bnxt/hsi_struct_def_dpdk.h
@@ -92,6 +92,8 @@ struct ctx_hw_stats64 {
 #define HWRM_VNIC_ALLOC(UINT32_C(0x40))
 #define HWRM_VNIC_FREE (UINT32_C(0x41))
 #define HWRM_VNIC_CFG  (UINT32_C(0x42))
+#define HWRM_VNIC_RSS_COS_LB_CTX_ALLOC (UINT32_C(0x70))
+#define HWRM_VNIC_RSS_COS_LB_CTX_FREE  (UINT32_C(0x71))
 #define HWRM_CFA_L2_FILTER_ALLOC   (UINT32_C(0x90))
 #define HWRM_CFA_L2_FILTER_FREE(UINT32_C(0x91))
 #define HWRM_CFA_L2_FILTER_CFG (UINT32_C(0x92))
@@ -3576,6 +3578,157 @@ struct hwrm_vnic_rss_cfg_output {
uint8_t valid;
 } __attribute__((packed));

+/* Input (16 bytes) */
+struct hwrm_vnic_rss_cos_lb_ctx_alloc_input {
+   /*
+* This value indicates what type of request this is. The format for the
+* rest of the command is determined by this field.
+*/
+   uint16_t req_type;
+
+   /*
+* This value indicates what completion ring the request will be
+* optionally completed on. If the value is -1, then no CR completion
+* will be generated. Any other value must be a valid CR ring_id value
+* for this function.
+*/
+   uint16_t cmpl_ring;
+
+   /* This value indicates the command sequence number. */
+   uint16_t seq_id;
+
+   /*
+* Target ID of this command. 0x0 - 0xFFF8 - Used for function ids
+* 0xFFF8 - 0xFFFE - Reserved for internal processors 0x - HWRM
+*/
+   uint16_t target_id;
+
+   /*
+* This is the host address where the response will be written when the
+* request is complete. This area must be 16B aligned and must be
+* cleared to zero before the request is made.
+*/
+   uint64_t resp_addr;
+} __attribute__((packed));
+
+/* Output (16 bytes) */
+
+struct hwrm_vnic_rss_cos_lb_ctx_alloc_output {
+   /*
+* Pass/Fail or error type Note: receiver to verify the in parameters,
+* and fail the call with an error wh

[dpdk-dev] [PATCH v4 38/39] bnxt: add flow control operations

2016-06-06 Thread Stephen Hurd
From: Ajit Khaparde 

Add flow_ctrl_get and flow_ctrl_set dev_ops.

Signed-off-by: Ajit Khaparde 
Reviewed-by: David Christensen 
Signed-off-by: Stephen Hurd 
---
 drivers/net/bnxt/bnxt_ethdev.c | 83 ++
 1 file changed, 83 insertions(+)

diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index 3c7f868..406e38a 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -755,6 +755,87 @@ static int bnxt_rss_hash_conf_get_op(struct rte_eth_dev 
*eth_dev,
return 0;
 }

+static int bnxt_flow_ctrl_get_op(struct rte_eth_dev *dev,
+  struct rte_eth_fc_conf *fc_conf)
+{
+   struct bnxt *bp = (struct bnxt *)dev->data->dev_private;
+   struct rte_eth_link link_info;
+   int rc;
+
+   rc = bnxt_get_hwrm_link_config(bp, &link_info);
+   if (rc)
+   return rc;
+
+   memset(fc_conf, 0, sizeof(*fc_conf));
+   if (bp->link_info.auto_pause)
+   fc_conf->autoneg = 1;
+   switch (bp->link_info.pause) {
+   case 0:
+   fc_conf->mode = RTE_FC_NONE;
+   break;
+   case HWRM_PORT_PHY_QCFG_OUTPUT_PAUSE_TX:
+   fc_conf->mode = RTE_FC_TX_PAUSE;
+   break;
+   case HWRM_PORT_PHY_QCFG_OUTPUT_PAUSE_RX:
+   fc_conf->mode = RTE_FC_RX_PAUSE;
+   break;
+   case (HWRM_PORT_PHY_QCFG_OUTPUT_PAUSE_TX |
+   HWRM_PORT_PHY_QCFG_OUTPUT_PAUSE_RX):
+   fc_conf->mode = RTE_FC_FULL;
+   break;
+   }
+   return 0;
+}
+
+static int bnxt_flow_ctrl_set_op(struct rte_eth_dev *dev,
+  struct rte_eth_fc_conf *fc_conf)
+{
+   struct bnxt *bp = (struct bnxt *)dev->data->dev_private;
+
+   switch (fc_conf->mode) {
+   case RTE_FC_NONE:
+   bp->link_info.auto_pause = 0;
+   bp->link_info.force_pause = 0;
+   break;
+   case RTE_FC_RX_PAUSE:
+   if (fc_conf->autoneg) {
+   bp->link_info.auto_pause =
+   HWRM_PORT_PHY_CFG_INPUT_AUTO_PAUSE_RX;
+   bp->link_info.force_pause = 0;
+   } else {
+   bp->link_info.auto_pause = 0;
+   bp->link_info.force_pause =
+   HWRM_PORT_PHY_CFG_INPUT_FORCE_PAUSE_RX;
+   }
+   break;
+   case RTE_FC_TX_PAUSE:
+   if (fc_conf->autoneg) {
+   bp->link_info.auto_pause =
+   HWRM_PORT_PHY_CFG_INPUT_AUTO_PAUSE_TX;
+   bp->link_info.force_pause = 0;
+   } else {
+   bp->link_info.auto_pause = 0;
+   bp->link_info.force_pause =
+   HWRM_PORT_PHY_CFG_INPUT_FORCE_PAUSE_TX;
+   }
+   break;
+   case RTE_FC_FULL:
+   if (fc_conf->autoneg) {
+   bp->link_info.auto_pause =
+   HWRM_PORT_PHY_CFG_INPUT_AUTO_PAUSE_TX |
+   HWRM_PORT_PHY_CFG_INPUT_AUTO_PAUSE_RX;
+   bp->link_info.force_pause = 0;
+   } else {
+   bp->link_info.auto_pause = 0;
+   bp->link_info.force_pause =
+   HWRM_PORT_PHY_CFG_INPUT_FORCE_PAUSE_TX |
+   HWRM_PORT_PHY_CFG_INPUT_FORCE_PAUSE_RX;
+   }
+   break;
+   }
+   return bnxt_set_hwrm_link_config(bp, true);
+}
+
 /*
  * Initialization
  */
@@ -784,6 +865,8 @@ static struct eth_dev_ops bnxt_dev_ops = {
.allmulticast_disable = bnxt_allmulticast_disable_op,
.mac_addr_add = bnxt_mac_addr_add_op,
.mac_addr_remove = bnxt_mac_addr_remove_op,
+   .flow_ctrl_get = bnxt_flow_ctrl_get_op,
+   .flow_ctrl_set = bnxt_flow_ctrl_set_op,
 };

 static bool bnxt_vf_pciid(uint16_t id)
-- 
1.9.1
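
The switch in bnxt_flow_ctrl_get_op() maps the TX/RX pause bits reported
by the firmware onto a flow control mode. The mapping can be sketched in
isolation as below. The bit values and the enum are assumptions made for
this illustration, not the HWRM or ethdev definitions. Masking the input
first means any unexpected extra bits do not skip every case.

```c
#include <stdint.h>

/* Assumed bit values and mode names, for illustration only. */
#define PAUSE_TX 0x1u
#define PAUSE_RX 0x2u

enum fc_mode { FC_NONE, FC_RX_PAUSE, FC_TX_PAUSE, FC_FULL };

/* Translate hardware pause bits into a flow control mode. */
enum fc_mode pause_bits_to_mode(uint8_t pause)
{
	switch (pause & (PAUSE_TX | PAUSE_RX)) {
	case PAUSE_TX:
		return FC_TX_PAUSE;
	case PAUSE_RX:
		return FC_RX_PAUSE;
	case PAUSE_TX | PAUSE_RX:
		return FC_FULL;
	default:
		return FC_NONE;
	}
}

/* Inverse mapping, as needed on the set path. */
uint8_t mode_to_pause_bits(enum fc_mode mode)
{
	switch (mode) {
	case FC_TX_PAUSE:
		return PAUSE_TX;
	case FC_RX_PAUSE:
		return PAUSE_RX;
	case FC_FULL:
		return PAUSE_TX | PAUSE_RX;
	case FC_NONE:
	default:
		return 0;
	}
}
```

The set path in the patch additionally chooses between the auto_pause and
force_pause fields depending on fc_conf->autoneg; that choice is
independent of the bit mapping shown here.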



[dpdk-dev] [PATCH v4 28/39] bnxt: add ring alloc, free and group init

2016-06-06 Thread Stephen Hurd
From: Ajit Khaparde 

Add a function to initialize ring groups, and a function to
allocate and free the rings via HWRM.

This should be the last functionality needed to add start/stop
device operations.

v4:
Address review comment to merge another patch into this to avoid
a compilation issue. Fix issues pointed out by checkpatch.

Signed-off-by: Ajit Khaparde 
Reviewed-by: David Christensen 
Signed-off-by: Stephen Hurd 
---
 drivers/net/bnxt/bnxt_hwrm.c | 112 +++-
 drivers/net/bnxt/bnxt_hwrm.h |   1 +
 drivers/net/bnxt/bnxt_ring.c | 119 +++
 drivers/net/bnxt/bnxt_ring.h |   2 +
 4 files changed, 233 insertions(+), 1 deletion(-)

diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c
index f8e9d20..dc1bce7 100644
--- a/drivers/net/bnxt/bnxt_hwrm.c
+++ b/drivers/net/bnxt/bnxt_hwrm.c
@@ -43,8 +43,10 @@
 #include "bnxt_filter.h"
 #include "bnxt_hwrm.h"
 #include "bnxt_rxq.h"
+#include "bnxt_rxr.h"
 #include "bnxt_ring.h"
 #include "bnxt_txq.h"
+#include "bnxt_txr.h"
 #include "bnxt_vnic.h"
 #include "hsi_struct_def_dpdk.h"

@@ -579,7 +581,11 @@ int bnxt_hwrm_ring_alloc(struct bnxt *bp,
break;
case HWRM_RING_ALLOC_INPUT_RING_TYPE_CMPL:
req.ring_type = ring_type;
-   req.int_mode = HWRM_RING_ALLOC_INPUT_INT_MODE_POLL;
+   /*
+* TODO: Some HWRM versions crash with
+* HWRM_RING_ALLOC_INPUT_INT_MODE_POLL
+*/
+   req.int_mode = HWRM_RING_ALLOC_INPUT_INT_MODE_MSIX;
req.length = rte_cpu_to_le_32(ring->ring_size);
break;
default:
@@ -1012,6 +1018,84 @@ int bnxt_free_all_hwrm_ring_grps(struct bnxt *bp)
return rc;
 }

+static void bnxt_free_cp_ring(struct bnxt *bp,
+ struct bnxt_cp_ring_info *cpr, unsigned int idx)
+{
+   struct bnxt_ring_struct *cp_ring = cpr->cp_ring_struct;
+
+   bnxt_hwrm_ring_free(bp, cp_ring,
+   HWRM_RING_FREE_INPUT_RING_TYPE_CMPL);
+   cp_ring->fw_ring_id = INVALID_HW_RING_ID;
+   bp->grp_info[idx].cp_fw_ring_id = INVALID_HW_RING_ID;
+   memset(cpr->cp_desc_ring, 0, cpr->cp_ring_struct->ring_size *
+   sizeof(*cpr->cp_desc_ring));
+   cpr->cp_raw_cons = 0;
+}
+
+int bnxt_free_all_hwrm_rings(struct bnxt *bp)
+{
+   unsigned int i;
+   int rc = 0;
+
+   for (i = 0; i < bp->tx_cp_nr_rings; i++) {
+   struct bnxt_tx_queue *txq = bp->tx_queues[i];
+   struct bnxt_tx_ring_info *txr = txq->tx_ring;
+   struct bnxt_ring_struct *ring = txr->tx_ring_struct;
+   struct bnxt_cp_ring_info *cpr = txq->cp_ring;
+   unsigned int idx = bp->rx_cp_nr_rings + i + 1;
+
+   if (ring->fw_ring_id != INVALID_HW_RING_ID) {
+   bnxt_hwrm_ring_free(bp, ring,
+   HWRM_RING_FREE_INPUT_RING_TYPE_TX);
+   ring->fw_ring_id = INVALID_HW_RING_ID;
+   memset(txr->tx_desc_ring, 0,
+   txr->tx_ring_struct->ring_size *
+   sizeof(*txr->tx_desc_ring));
+   memset(txr->tx_buf_ring, 0,
+   txr->tx_ring_struct->ring_size *
+   sizeof(*txr->tx_buf_ring));
+   txr->tx_prod = 0;
+   txr->tx_cons = 0;
+   }
+   if (cpr->cp_ring_struct->fw_ring_id != INVALID_HW_RING_ID)
+   bnxt_free_cp_ring(bp, cpr, idx);
+   }
+
+   for (i = 0; i < bp->rx_cp_nr_rings; i++) {
+   struct bnxt_rx_queue *rxq = bp->rx_queues[i];
+   struct bnxt_rx_ring_info *rxr = rxq->rx_ring;
+   struct bnxt_ring_struct *ring = rxr->rx_ring_struct;
+   struct bnxt_cp_ring_info *cpr = rxq->cp_ring;
+   unsigned int idx = i + 1;
+
+   if (ring->fw_ring_id != INVALID_HW_RING_ID) {
+   bnxt_hwrm_ring_free(bp, ring,
+   HWRM_RING_FREE_INPUT_RING_TYPE_RX);
+   ring->fw_ring_id = INVALID_HW_RING_ID;
+   bp->grp_info[idx].rx_fw_ring_id = INVALID_HW_RING_ID;
+   memset(rxr->rx_desc_ring, 0,
+   rxr->rx_ring_struct->ring_size *
+   sizeof(*rxr->rx_desc_ring));
+   memset(rxr->rx_buf_ring, 0,
+   rxr->rx_ring_struct->ring_size *
+   sizeof(*rxr->rx_buf_ring));
+   rxr->rx_prod = 0;
+   }
+   if (cpr->cp_ring_struct->fw_ring_id != INVALID_HW_RING_ID)
+   bnxt_free_cp_ring(bp, cpr, idx);
+   }
+
+   

[dpdk-dev] [PATCH v4 25/39] bnxt: add ring group alloc/free functions

2016-06-06 Thread Stephen Hurd
From: Ajit Khaparde 

Add HWRM API for ring group alloc/free functions, associated structs and
definitions.
This API allocates and does basic preparation for a ring group in ASIC.
A ring group is identified by an index. It consists of Rx ring id,
completion ring id and a statistics context.

v4:
Address issues pointed out by checkpatch.

Signed-off-by: Ajit Khaparde 
Reviewed-by: David Christensen 
Signed-off-by: Stephen Hurd 
---
 drivers/net/bnxt/bnxt_hwrm.c   |  84 +++
 drivers/net/bnxt/bnxt_hwrm.h   |   4 +
 drivers/net/bnxt/hsi_struct_def_dpdk.h | 185 +
 3 files changed, 273 insertions(+)

diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c
index 6152856..4c4f707 100644
--- a/drivers/net/bnxt/bnxt_hwrm.c
+++ b/drivers/net/bnxt/bnxt_hwrm.c
@@ -612,6 +612,47 @@ int bnxt_hwrm_ring_free(struct bnxt *bp,
return 0;
 }

+int bnxt_hwrm_ring_grp_alloc(struct bnxt *bp, unsigned int idx)
+{
+   int rc = 0;
+   struct hwrm_ring_grp_alloc_input req = {.req_type = 0 };
+   struct hwrm_ring_grp_alloc_output *resp = bp->hwrm_cmd_resp_addr;
+
+   HWRM_PREP(req, RING_GRP_ALLOC, -1, resp);
+
+   req.cr = rte_cpu_to_le_16(bp->grp_info[idx].cp_fw_ring_id);
+   req.rr = rte_cpu_to_le_16(bp->grp_info[idx].rx_fw_ring_id);
+   req.ar = rte_cpu_to_le_16(bp->grp_info[idx].ag_fw_ring_id);
+   req.sc = rte_cpu_to_le_16(bp->grp_info[idx].fw_stats_ctx);
+
+   rc = bnxt_hwrm_send_message(bp, &req, sizeof(req));
+
+   HWRM_CHECK_RESULT;
+
+   bp->grp_info[idx].fw_grp_id =
+   rte_le_to_cpu_16(resp->ring_group_id);
+
+   return rc;
+}
+
+int bnxt_hwrm_ring_grp_free(struct bnxt *bp, unsigned int idx)
+{
+   int rc;
+   struct hwrm_ring_grp_free_input req = {.req_type = 0 };
+   struct hwrm_ring_grp_free_output *resp = bp->hwrm_cmd_resp_addr;
+
+   HWRM_PREP(req, RING_GRP_FREE, -1, resp);
+
+   req.ring_group_id = rte_cpu_to_le_16(bp->grp_info[idx].fw_grp_id);
+
+   rc = bnxt_hwrm_send_message(bp, &req, sizeof(req));
+
+   HWRM_CHECK_RESULT;
+
+   bp->grp_info[idx].fw_grp_id = INVALID_HW_RING_ID;
+   return rc;
+}
+
 int bnxt_hwrm_stat_clear(struct bnxt *bp, struct bnxt_cp_ring_info *cpr)
 {
int rc = 0;
@@ -861,6 +902,49 @@ int bnxt_alloc_all_hwrm_stat_ctxs(struct bnxt *bp)
return rc;
 }

+int bnxt_free_all_hwrm_ring_grps(struct bnxt *bp)
+{
+   uint16_t i;
+   int rc = 0;
+
+   for (i = 0; i < bp->rx_cp_nr_rings; i++) {
+   unsigned int idx = i + 1;
+
+   if (bp->grp_info[idx].fw_grp_id == INVALID_HW_RING_ID) {
+   RTE_LOG(ERR, PMD,
+   "Attempt to free invalid ring group %d\n",
+   idx);
+   continue;
+   }
+
+   rc = bnxt_hwrm_ring_grp_free(bp, idx);
+
+   if (rc)
+   return rc;
+   }
+   return rc;
+}
+
+int bnxt_alloc_all_hwrm_ring_grps(struct bnxt *bp)
+{
+   uint16_t i;
+   int rc = 0;
+
+   for (i = 0; i < bp->rx_cp_nr_rings; i++) {
+   unsigned int idx = i + 1;
+
+   if (bp->grp_info[idx].cp_fw_ring_id == INVALID_HW_RING_ID ||
+   bp->grp_info[idx].rx_fw_ring_id == INVALID_HW_RING_ID)
+   continue;
+
+   rc = bnxt_hwrm_ring_grp_alloc(bp, idx);
+
+   if (rc)
+   return rc;
+   }
+   return rc;
+}
+
 void bnxt_free_hwrm_resources(struct bnxt *bp)
 {
/* Release memzone */
diff --git a/drivers/net/bnxt/bnxt_hwrm.h b/drivers/net/bnxt/bnxt_hwrm.h
index e4fc243..fb088cf 100644
--- a/drivers/net/bnxt/bnxt_hwrm.h
+++ b/drivers/net/bnxt/bnxt_hwrm.h
@@ -64,6 +64,8 @@ int bnxt_hwrm_ring_alloc(struct bnxt *bp,
 uint32_t stats_ctx_id);
 int bnxt_hwrm_ring_free(struct bnxt *bp,
struct bnxt_ring_struct *ring, uint32_t ring_type);
+int bnxt_hwrm_ring_grp_alloc(struct bnxt *bp, unsigned int idx);
+int bnxt_hwrm_ring_grp_free(struct bnxt *bp, unsigned int idx);

 int bnxt_hwrm_stat_clear(struct bnxt *bp, struct bnxt_cp_ring_info *cpr);
 int bnxt_hwrm_stat_ctx_alloc(struct bnxt *bp,
@@ -81,6 +83,8 @@ int bnxt_hwrm_vnic_rss_cfg(struct bnxt *bp,

 int bnxt_alloc_all_hwrm_stat_ctxs(struct bnxt *bp);
 int bnxt_clear_all_hwrm_stat_ctxs(struct bnxt *bp);
+int bnxt_free_all_hwrm_ring_grps(struct bnxt *bp);
+int bnxt_alloc_all_hwrm_ring_grps(struct bnxt *bp);
 void bnxt_free_hwrm_resources(struct bnxt *bp);
 int bnxt_alloc_hwrm_resources(struct bnxt *bp);
 int bnxt_set_hwrm_link_config(struct bnxt *bp, bool link_up);
diff --git a/drivers/net/bnxt/hsi_struct_def_dpdk.h 
b/drivers/net/bnxt/hsi_struct_def_dpdk.h
index e6280b6..4e2eb9f 100644
--- a/drivers/net/bnxt/hsi_struct_def_dpdk.h
+++ b/drivers/net/bnxt/hsi_struct_def_dpdk.h
@@ -95,6 +95,8 @@ struct ctx_hw_stats64 {
 #define HWRM_V

[dpdk-dev] [PATCH v4 29/39] bnxt: add HWRM port PHY config call and helpers

2016-06-06 Thread Stephen Hurd
From: Ajit Khaparde 

Add the HWRM call that queries the port's PHY and link configuration,
together with helper functions such as bnxt_get_hwrm_link_config() and
bnxt_parse_hw_link_speed() that parse the reported link state.

v4:
Fix issues pointed out by checkpatch.

Signed-off-by: Ajit Khaparde 
Reviewed-by: David Christensen 
Signed-off-by: Stephen Hurd 
---
 drivers/net/bnxt/bnxt_hwrm.c   | 120 +
 drivers/net/bnxt/bnxt_hwrm.h   |   1 +
 drivers/net/bnxt/hsi_struct_def_dpdk.h | 790 +
 3 files changed, 911 insertions(+)

diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c
index dc1bce7..978e379 100644
--- a/drivers/net/bnxt/bnxt_hwrm.c
+++ b/drivers/net/bnxt/bnxt_hwrm.c
@@ -521,6 +521,43 @@ static int bnxt_hwrm_port_phy_cfg(struct bnxt *bp, struct 
bnxt_link_info *conf)
return rc;
 }

+static int bnxt_hwrm_port_phy_qcfg(struct bnxt *bp,
+  struct bnxt_link_info *link_info)
+{
+   int rc = 0;
+   struct hwrm_port_phy_qcfg_input req = {.req_type = 0};
+   struct hwrm_port_phy_qcfg_output *resp = bp->hwrm_cmd_resp_addr;
+
+   HWRM_PREP(req, PORT_PHY_QCFG, -1, resp);
+
+   rc = bnxt_hwrm_send_message(bp, &req, sizeof(req));
+
+   HWRM_CHECK_RESULT;
+
+   link_info->phy_link_status = resp->link;
+   if (link_info->phy_link_status == HWRM_PORT_PHY_QCFG_OUTPUT_LINK_LINK) {
+   link_info->link_up = 1;
+   link_info->link_speed = rte_le_to_cpu_16(resp->link_speed);
+   } else {
+   link_info->link_up = 0;
+   link_info->link_speed = 0;
+   }
+   link_info->duplex = resp->duplex;
+   link_info->pause = resp->pause;
+   link_info->auto_pause = resp->auto_pause;
+   link_info->force_pause = resp->force_pause;
+   link_info->auto_mode = resp->auto_mode;
+
+   link_info->support_speeds = rte_le_to_cpu_16(resp->support_speeds);
+   link_info->auto_link_speed = rte_le_to_cpu_16(resp->auto_link_speed);
+   link_info->preemphasis = rte_le_to_cpu_32(resp->preemphasis);
+   link_info->phy_ver[0] = resp->phy_maj;
+   link_info->phy_ver[1] = resp->phy_min;
+   link_info->phy_ver[2] = resp->phy_bld;
+
+   return rc;
+}
+
 int bnxt_hwrm_queue_qportcfg(struct bnxt *bp)
 {
int rc = 0;
@@ -1326,6 +1363,89 @@ static uint16_t bnxt_parse_eth_link_speed_mask(uint32_t 
link_speed)
return ret;
 }

+static uint32_t bnxt_parse_hw_link_speed(uint16_t hw_link_speed)
+{
+   uint32_t eth_link_speed = ETH_SPEED_NUM_NONE;
+
+   switch (hw_link_speed) {
+   case HWRM_PORT_PHY_QCFG_OUTPUT_LINK_SPEED_100MB:
+   eth_link_speed = ETH_SPEED_NUM_100M;
+   break;
+   case HWRM_PORT_PHY_QCFG_OUTPUT_LINK_SPEED_1GB:
+   eth_link_speed = ETH_SPEED_NUM_1G;
+   break;
+   case HWRM_PORT_PHY_QCFG_OUTPUT_LINK_SPEED_2_5GB:
+   eth_link_speed = ETH_SPEED_NUM_2_5G;
+   break;
+   case HWRM_PORT_PHY_QCFG_OUTPUT_LINK_SPEED_10GB:
+   eth_link_speed = ETH_SPEED_NUM_10G;
+   break;
+   case HWRM_PORT_PHY_QCFG_OUTPUT_LINK_SPEED_20GB:
+   eth_link_speed = ETH_SPEED_NUM_20G;
+   break;
+   case HWRM_PORT_PHY_QCFG_OUTPUT_LINK_SPEED_25GB:
+   eth_link_speed = ETH_SPEED_NUM_25G;
+   break;
+   case HWRM_PORT_PHY_QCFG_OUTPUT_LINK_SPEED_40GB:
+   eth_link_speed = ETH_SPEED_NUM_40G;
+   break;
+   case HWRM_PORT_PHY_QCFG_OUTPUT_LINK_SPEED_50GB:
+   eth_link_speed = ETH_SPEED_NUM_50G;
+   break;
+   case HWRM_PORT_PHY_QCFG_OUTPUT_LINK_SPEED_2GB:
+   default:
+   RTE_LOG(ERR, PMD, "HWRM link speed %d not defined\n",
+   hw_link_speed);
+   break;
+   }
+   return eth_link_speed;
+}
+
+static uint16_t bnxt_parse_hw_link_duplex(uint16_t hw_link_duplex)
+{
+   uint16_t eth_link_duplex = ETH_LINK_FULL_DUPLEX;
+
+   switch (hw_link_duplex) {
+   case HWRM_PORT_PHY_CFG_INPUT_AUTO_DUPLEX_BOTH:
+   case HWRM_PORT_PHY_CFG_INPUT_AUTO_DUPLEX_FULL:
+   eth_link_duplex = ETH_LINK_FULL_DUPLEX;
+   break;
+   case HWRM_PORT_PHY_CFG_INPUT_AUTO_DUPLEX_HALF:
+   eth_link_duplex = ETH_LINK_HALF_DUPLEX;
+   break;
+   default:
+   RTE_LOG(ERR, PMD, "HWRM link duplex %d not defined\n",
+   hw_link_duplex);
+   break;
+   }
+   return eth_link_duplex;
+}
+
+int bnxt_get_hwrm_link_config(struct bnxt *bp, struct rte_eth_link *link)
+{
+   int rc = 0;
+   struct bnxt_link_info *link_info = &bp->link_info;
+
+   rc = bnxt_hwrm_port_phy_qcfg(bp, link_info);
+   if (rc) {
+   RTE_LOG(ERR, PMD,
+   "Get link config failed with rc %d\n", rc);
+   goto exit;
+   }
+   if (link_info->link_up

[dpdk-dev] [PATCH v4 34/39] bnxt: add MAC address add/remove dev_ops

2016-06-06 Thread Stephen Hurd
From: Ajit Khaparde 

This patch adds dev_ops to add and remove MAC addresses.

v4:
Fix issues pointed out by checkpatch.

Signed-off-by: Ajit Khaparde 
Reviewed-by: David Christensen 
Signed-off-by: Stephen Hurd 
---
 drivers/net/bnxt/bnxt_ethdev.c | 71 ++
 1 file changed, 71 insertions(+)

diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index 4254531..34a5873 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -402,6 +402,75 @@ static void bnxt_dev_stop_op(struct rte_eth_dev *eth_dev)
bnxt_shutdown_nic(bp);
 }

+static void bnxt_mac_addr_remove_op(struct rte_eth_dev *eth_dev,
+   uint32_t index)
+{
+   struct bnxt *bp = (struct bnxt *)eth_dev->data->dev_private;
+   uint64_t pool_mask = eth_dev->data->mac_pool_sel[index];
+   struct bnxt_vnic_info *vnic;
+   struct bnxt_filter_info *filter, *temp_filter;
+   int i;
+
+   /*
+* Loop through all VNICs from the specified filter flow pools to
+* remove the corresponding MAC addr filter
+*/
+   for (i = 0; i < MAX_FF_POOLS; i++) {
+   if (!(pool_mask & (1ULL << i)))
+   continue;
+
+   STAILQ_FOREACH(vnic, &bp->ff_pool[i], next) {
+   filter = STAILQ_FIRST(&vnic->filter);
+   while (filter) {
+   temp_filter = STAILQ_NEXT(filter, next);
+   if (filter->mac_index == index) {
+   STAILQ_REMOVE(&vnic->filter, filter,
+ bnxt_filter_info, next);
+   bnxt_hwrm_clear_filter(bp, filter);
+   filter->mac_index = INVALID_MAC_INDEX;
+   memset(&filter->l2_addr, 0,
+  ETHER_ADDR_LEN);
+   STAILQ_INSERT_TAIL(
+   &bp->free_filter_list,
+   filter, next);
+   }
+   filter = temp_filter;
+   }
+   }
+   }
+}
+
+static void bnxt_mac_addr_add_op(struct rte_eth_dev *eth_dev,
+struct ether_addr *mac_addr,
+uint32_t index, uint32_t pool)
+{
+   struct bnxt *bp = (struct bnxt *)eth_dev->data->dev_private;
+   struct bnxt_vnic_info *vnic = STAILQ_FIRST(&bp->ff_pool[pool]);
+   struct bnxt_filter_info *filter;
+
+   if (!vnic) {
+   RTE_LOG(ERR, PMD, "VNIC not found for pool %d!\n", pool);
+   return;
+   }
+   /* Attach requested MAC address to the new l2_filter */
+   STAILQ_FOREACH(filter, &vnic->filter, next) {
+   if (filter->mac_index == index) {
+   RTE_LOG(ERR, PMD,
+   "MAC addr already existed for pool %d\n", pool);
+   return;
+   }
+   }
+   filter = bnxt_alloc_filter(bp);
+   if (!filter) {
+   RTE_LOG(ERR, PMD, "L2 filter alloc failed\n");
+   return;
+   }
+   STAILQ_INSERT_TAIL(&vnic->filter, filter, next);
+   filter->mac_index = index;
+   memcpy(filter->l2_addr, mac_addr, ETHER_ADDR_LEN);
+   bnxt_hwrm_set_filter(bp, vnic, filter);
+}
+
 static int bnxt_link_update_op(struct rte_eth_dev *eth_dev,
   int wait_to_complete)
 {
@@ -516,6 +585,8 @@ static struct eth_dev_ops bnxt_dev_ops = {
.promiscuous_disable = bnxt_promiscuous_disable_op,
.allmulticast_enable = bnxt_allmulticast_enable_op,
.allmulticast_disable = bnxt_allmulticast_disable_op,
+   .mac_addr_add = bnxt_mac_addr_add_op,
+   .mac_addr_remove = bnxt_mac_addr_remove_op,
 };

 static bool bnxt_vf_pciid(uint16_t id)
-- 
1.9.1
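
bnxt_mac_addr_remove_op() walks each filter list with an explicit
first/next loop rather than STAILQ_FOREACH, because a matching filter is
unlinked and appended to the free list during the walk. A minimal sketch
of that pattern using the <sys/queue.h> macros follows. The struct and
function names are hypothetical and much smaller than the driver's.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/queue.h>

/* Hypothetical, cut-down filter record. */
struct filter {
	uint32_t mac_index;
	STAILQ_ENTRY(filter) next;
};
STAILQ_HEAD(filter_list, filter);

/*
 * Move every filter with the given index from 'active' to 'free_list'.
 * Returns the number of filters moved.
 */
unsigned int release_filters(struct filter_list *active,
			     struct filter_list *free_list, uint32_t index)
{
	struct filter *f = STAILQ_FIRST(active);
	unsigned int moved = 0;

	while (f != NULL) {
		/* Save the successor first: relinking f overwrites its link. */
		struct filter *succ = STAILQ_NEXT(f, next);

		if (f->mac_index == index) {
			STAILQ_REMOVE(active, f, filter, next);
			STAILQ_INSERT_TAIL(free_list, f, next);
			moved++;
		}
		f = succ;
	}
	return moved;
}
```

Note that STAILQ_REMOVE is a linear search on a singly linked tail queue,
so the walk is quadratic in the worst case; for short filter lists that
is unlikely to matter.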



[dpdk-dev] [PATCH v4 39/39] bnxt: Replace bnxt_ring_struct with bnxt_ring

2016-06-06 Thread Stephen Hurd
From: Ajit Khaparde 

As pointed out in the previous round of review, the "_struct" suffix in
the structure name bnxt_ring_struct is redundant.
Replace it with bnxt_ring.

Signed-off-by: Ajit Khaparde 
Reviewed-by: David Christensen 
Signed-off-by: Stephen Hurd 
---
 drivers/net/bnxt/bnxt_cpr.c  |  4 ++--
 drivers/net/bnxt/bnxt_cpr.h  |  4 ++--
 drivers/net/bnxt/bnxt_hwrm.c | 10 +-
 drivers/net/bnxt/bnxt_hwrm.h |  4 ++--
 drivers/net/bnxt/bnxt_ring.c | 20 ++--
 drivers/net/bnxt/bnxt_ring.h |  4 ++--
 drivers/net/bnxt/bnxt_rxr.c  | 10 +-
 drivers/net/bnxt/bnxt_rxr.h  |  2 +-
 drivers/net/bnxt/bnxt_txr.c  |  8 
 drivers/net/bnxt/bnxt_txr.h  |  2 +-
 10 files changed, 34 insertions(+), 34 deletions(-)

diff --git a/drivers/net/bnxt/bnxt_cpr.c b/drivers/net/bnxt/bnxt_cpr.c
index 98f3ca2..f0bfa1f 100644
--- a/drivers/net/bnxt/bnxt_cpr.c
+++ b/drivers/net/bnxt/bnxt_cpr.c
@@ -133,7 +133,7 @@ void bnxt_free_def_cp_ring(struct bnxt *bp)
 int bnxt_init_def_ring_struct(struct bnxt *bp, unsigned int socket_id)
 {
struct bnxt_cp_ring_info *cpr;
-   struct bnxt_ring_struct *ring;
+   struct bnxt_ring *ring;

cpr = rte_zmalloc_socket("bnxt_cp_ring",
 sizeof(struct bnxt_cp_ring_info),
@@ -143,7 +143,7 @@ int bnxt_init_def_ring_struct(struct bnxt *bp, unsigned int 
socket_id)
bp->def_cp_ring = cpr;

ring = rte_zmalloc_socket("bnxt_cp_ring_struct",
- sizeof(struct bnxt_ring_struct),
+ sizeof(struct bnxt_ring),
  RTE_CACHE_LINE_SIZE, socket_id);
if (ring == NULL)
return -ENOMEM;
diff --git a/drivers/net/bnxt/bnxt_cpr.h b/drivers/net/bnxt/bnxt_cpr.h
index 3e25a75..c176f8c 100644
--- a/drivers/net/bnxt/bnxt_cpr.h
+++ b/drivers/net/bnxt/bnxt_cpr.h
@@ -57,7 +57,7 @@
(*(uint32_t *)((cpr)->cp_doorbell) = (DB_CP_FLAGS | \
RING_CMP(cpr->cp_ring_struct, raw_cons)))

-struct bnxt_ring_struct;
+struct bnxt_ring;
 struct bnxt_cp_ring_info {
uint32_tcp_raw_cons;
void*cp_doorbell;
@@ -70,7 +70,7 @@ struct bnxt_cp_ring_info {
phys_addr_t hw_stats_map;
uint32_thw_stats_ctx_id;

-   struct bnxt_ring_struct *cp_ring_struct;
+   struct bnxt_ring*cp_ring_struct;
 };

 #define RX_CMP_L2_ERRORS   \
diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c
index 978e379..5d81a60 100644
--- a/drivers/net/bnxt/bnxt_hwrm.c
+++ b/drivers/net/bnxt/bnxt_hwrm.c
@@ -587,7 +587,7 @@ int bnxt_hwrm_queue_qportcfg(struct bnxt *bp)
 }

 int bnxt_hwrm_ring_alloc(struct bnxt *bp,
-struct bnxt_ring_struct *ring,
+struct bnxt_ring *ring,
 uint32_t ring_type, uint32_t map_index,
 uint32_t stats_ctx_id)
 {
@@ -660,7 +660,7 @@ int bnxt_hwrm_ring_alloc(struct bnxt *bp,
 }

 int bnxt_hwrm_ring_free(struct bnxt *bp,
-   struct bnxt_ring_struct *ring, uint32_t ring_type)
+   struct bnxt_ring *ring, uint32_t ring_type)
 {
int rc;
struct hwrm_ring_free_input req = {.req_type = 0 };
@@ -1058,7 +1058,7 @@ int bnxt_free_all_hwrm_ring_grps(struct bnxt *bp)
 static void bnxt_free_cp_ring(struct bnxt *bp,
  struct bnxt_cp_ring_info *cpr, unsigned int idx)
 {
-   struct bnxt_ring_struct *cp_ring = cpr->cp_ring_struct;
+   struct bnxt_ring *cp_ring = cpr->cp_ring_struct;

bnxt_hwrm_ring_free(bp, cp_ring,
HWRM_RING_FREE_INPUT_RING_TYPE_CMPL);
@@ -1077,7 +1077,7 @@ int bnxt_free_all_hwrm_rings(struct bnxt *bp)
for (i = 0; i < bp->tx_cp_nr_rings; i++) {
struct bnxt_tx_queue *txq = bp->tx_queues[i];
struct bnxt_tx_ring_info *txr = txq->tx_ring;
-   struct bnxt_ring_struct *ring = txr->tx_ring_struct;
+   struct bnxt_ring *ring = txr->tx_ring_struct;
struct bnxt_cp_ring_info *cpr = txq->cp_ring;
unsigned int idx = bp->rx_cp_nr_rings + i + 1;

@@ -1101,7 +1101,7 @@ int bnxt_free_all_hwrm_rings(struct bnxt *bp)
for (i = 0; i < bp->rx_cp_nr_rings; i++) {
struct bnxt_rx_queue *rxq = bp->rx_queues[i];
struct bnxt_rx_ring_info *rxr = rxq->rx_ring;
-   struct bnxt_ring_struct *ring = rxr->rx_ring_struct;
+   struct bnxt_ring *ring = rxr->rx_ring_struct;
struct bnxt_cp_ring_info *cpr = rxq->cp_ring;
unsigned int idx = i + 1;

diff --git a/drivers/net/bnxt/bnxt_hwrm.h b/drivers/net/bnxt/bnxt_hwrm.h
index d1aee1c..7ad5f51 100644
--- a/drivers/net/bnxt/bnxt_hwrm.h
+++ b/drivers/net/bnxt/bnxt_hwrm.h
@@ -62,11 +62,11 @@ int bnxt_hwrm_func_driver_unreg