[dpdk-dev] [PATCH 0/3] Rename field name for RX/TX queue start/stop

2014-07-28 Thread Ouyang, Changchun
Hi Thomas,
I have generated patch v2 to address your comments.
Please see the attachment.
Thanks and regards,
Changchun

-Original Message-
From: Thomas Monjalon [mailto:thomas.monja...@6wind.com] 
Sent: Tuesday, July 22, 2014 5:38 PM
To: Ouyang, Changchun
Cc: dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH 0/3] Rename field name for RX/TX queue start/stop

Hi,

2014-07-22 15:47, Ouyang Changchun:
> This patch series includes 3 things:
> 1) Rename the field from start_rx_per_q to rx_enable_queue in
> struct rte_eth_rxconf, and do the same for TX.
> This patch also updates the description of the fields rx_enable_queue and
> tx_enable_queue.
> 2) Following 1), rename the field from start_rx_per_q to
> rx_enable_queue in struct igb_rx_queue in the ixgbe PMD, and do the same for TX.
> 3) Update its references in the vhost sample.

To keep the change atomic (and not break git bisect), you should submit it as
one patch.
The title would be "ethdev: rename queue enabler field" or something like that.
But the most important thing in such a change is to explain why you are making it.

Thanks
--
Thomas


[dpdk-dev] does vswitchd runs multiple threads when i added dpdk devices

2014-07-28 Thread Srinivas Reddi
Hi,

I have taken the code from
https://github.com/openvswitch/ovs

I have added two dpdk devices

ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
ovs-vsctl add-port br0 dpdk1 -- set Interface dpdk1 type=dpdk


https://github.com/openvswitch/ovs/blob/master/INSTALL.DPDK


The above link says: "[ Once first DPDK port is added to
vswitchd, it creates a Polling thread and
polls dpdk device in continuous loop. Therefore CPU utilization
for that thread is always 100%. ]"


As I understand it, each dpdk device is polled on a different thread.
But in my case vswitchd is running in only a single thread [on core 0]; I
expected it to run on 3 cores.

One thing I want to clarify: does ovs-vswitchd run on a single core only, or
in multiple threads, once dpdk devices are added?
If vswitchd can run in multiple threads once dpdk devices are added, please let
me know how to run it that way.


Thanks,
Srinivas.


[dpdk-dev] [PATCH] virtio: Fix 2 compilation issues in virtio PMD

2014-07-28 Thread Ouyang, Changchun
Hi all,

> -Original Message-
> From: Ouyang, Changchun
> Sent: Thursday, July 24, 2014 12:58 PM
> To: dev at dpdk.org
> Cc: Cao, Waterman; Ouyang, Changchun
> Subject: [PATCH] virtio: Fix 2 compilation issues in virtio PMD
> 
> Fix 2 compilation issues in virtio PMD when dump option is enabled.
> 
> Signed-off-by: Changchun Ouyang 
> ---
>  lib/librte_pmd_virtio/virtio_ethdev.c | 2 +-
>  lib/librte_pmd_virtio/virtqueue.h | 4 ++--
>  2 files changed, 3 insertions(+), 3 deletions(-)
> 

The offending commits that cause these 2 issues are:

commit f37cdfde46a30d93f3dd8a4e01243be8bc0ac142
Author: Stephen Hemminger
Date:   Fri Jun 13 18:06:23 2014 -0700

virtio: remove unused virtqueue name

vq_name is only used when setting up queue, and does not need
to be saved.


commit ce65e697c67ba1a357d806eed05957b3d43f562c
Author: Stephen Hemminger 
Date:   Fri Jun 13 18:06:25 2014 -0700

virtio: simplify the hardware structure

The host_features are never used after negotiation.
The PCI information is unused (and available in rte_pci if needed).

Thanks
Changchun



[dpdk-dev] [PATCH v2 1/6] ethdev: rename macros of packet classification type

2014-07-28 Thread Helin Zhang
For better understanding, 'PCTYPE' which represents
'Packet Classification Type' is used to replace 'RSS'
in the name of shift macros.

Signed-off-by: Helin Zhang 
---
 lib/librte_ether/rte_ethdev.h | 76 +--
 1 file changed, 38 insertions(+), 38 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 50df654..dd36605 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -345,47 +345,47 @@ struct rte_eth_rss_conf {
 #define ETH_RSS_IPV4_UDP_SHIFT 6
 #define ETH_RSS_IPV6_UDP_SHIFT 7
 #define ETH_RSS_IPV6_UDP_EX_SHIFT 8
-/* for 40G only */
-#define ETH_RSS_NONF_IPV4_UDP_SHIFT   31
-#define ETH_RSS_NONF_IPV4_TCP_SHIFT   33
-#define ETH_RSS_NONF_IPV4_SCTP_SHIFT  34
-#define ETH_RSS_NONF_IPV4_OTHER_SHIFT 35
-#define ETH_RSS_FRAG_IPV4_SHIFT   36
-#define ETH_RSS_NONF_IPV6_UDP_SHIFT   41
-#define ETH_RSS_NONF_IPV6_TCP_SHIFT   43
-#define ETH_RSS_NONF_IPV6_SCTP_SHIFT  44
-#define ETH_RSS_NONF_IPV6_OTHER_SHIFT 45
-#define ETH_RSS_FRAG_IPV6_SHIFT   46
-#define ETH_RSS_FCOE_OX_SHIFT 48
-#define ETH_RSS_FCOE_RX_SHIFT 49
-#define ETH_RSS_FCOE_OTHER_SHIFT  50
-#define ETH_RSS_L2_PAYLOAD_SHIFT  63
+/* Packet Classification Type for 40G only */
+#define ETH_PCTYPE_NONF_IPV4_UDP  31
+#define ETH_PCTYPE_NONF_IPV4_TCP  33
+#define ETH_PCTYPE_NONF_IPV4_SCTP 34
+#define ETH_PCTYPE_NONF_IPV4_OTHER 35
+#define ETH_PCTYPE_FRAG_IPV4  36
+#define ETH_PCTYPE_NONF_IPV6_UDP  41
+#define ETH_PCTYPE_NONF_IPV6_TCP  43
+#define ETH_PCTYPE_NONF_IPV6_SCTP 44
+#define ETH_PCTYPE_NONF_IPV6_OTHER 45
+#define ETH_PCTYPE_FRAG_IPV6  46
+#define ETH_PCTYPE_FCOE_OX 48 /* not used */
+#define ETH_PCTYPE_FCOE_RX 49 /* not used */
+#define ETH_PCTYPE_FCOE_OTHER 50 /* not used */
+#define ETH_PCTYPE_L2_PAYLOAD 63

 /* for 1G & 10G */
-#define ETH_RSS_IPV4 ((uint16_t)1 << ETH_RSS_IPV4_SHIFT)
-#define ETH_RSS_IPV4_TCP ((uint16_t)1 << ETH_RSS_IPV4_TCP_SHIFT)
-#define ETH_RSS_IPV6 ((uint16_t)1 << ETH_RSS_IPV6_SHIFT)
-#define ETH_RSS_IPV6_EX ((uint16_t)1 << ETH_RSS_IPV6_EX_SHIFT)
-#define ETH_RSS_IPV6_TCP ((uint16_t)1 << ETH_RSS_IPV6_TCP_SHIFT)
-#define ETH_RSS_IPV6_TCP_EX ((uint16_t)1 << ETH_RSS_IPV6_TCP_EX_SHIFT)
-#define ETH_RSS_IPV4_UDP ((uint16_t)1 << ETH_RSS_IPV4_UDP_SHIFT)
-#define ETH_RSS_IPV6_UDP ((uint16_t)1 << ETH_RSS_IPV6_UDP_SHIFT)
-#define ETH_RSS_IPV6_UDP_EX ((uint16_t)1 << ETH_RSS_IPV6_UDP_EX_SHIFT)
+#define ETH_RSS_IPV4 (1 << ETH_RSS_IPV4_SHIFT)
+#define ETH_RSS_IPV4_TCP (1 << ETH_RSS_IPV4_TCP_SHIFT)
+#define ETH_RSS_IPV6 (1 << ETH_RSS_IPV6_SHIFT)
+#define ETH_RSS_IPV6_EX (1 << ETH_RSS_IPV6_EX_SHIFT)
+#define ETH_RSS_IPV6_TCP (1 << ETH_RSS_IPV6_TCP_SHIFT)
+#define ETH_RSS_IPV6_TCP_EX (1 << ETH_RSS_IPV6_TCP_EX_SHIFT)
+#define ETH_RSS_IPV4_UDP (1 << ETH_RSS_IPV4_UDP_SHIFT)
+#define ETH_RSS_IPV6_UDP (1 << ETH_RSS_IPV6_UDP_SHIFT)
+#define ETH_RSS_IPV6_UDP_EX (1 << ETH_RSS_IPV6_UDP_EX_SHIFT)
 /* for 40G only */
-#define ETH_RSS_NONF_IPV4_UDP   ((uint64_t)1 << ETH_RSS_NONF_IPV4_UDP_SHIFT)
-#define ETH_RSS_NONF_IPV4_TCP   ((uint64_t)1 << ETH_RSS_NONF_IPV4_TCP_SHIFT)
-#define ETH_RSS_NONF_IPV4_SCTP  ((uint64_t)1 << ETH_RSS_NONF_IPV4_SCTP_SHIFT)
-#define ETH_RSS_NONF_IPV4_OTHER ((uint64_t)1 << ETH_RSS_NONF_IPV4_OTHER_SHIFT)
-#define ETH_RSS_FRAG_IPV4   ((uint64_t)1 << ETH_RSS_FRAG_IPV4_SHIFT)
-#define ETH_RSS_NONF_IPV6_UDP   ((uint64_t)1 << ETH_RSS_NONF_IPV6_UDP_SHIFT)
-#define ETH_RSS_NONF_IPV6_TCP   ((uint64_t)1 << ETH_RSS_NONF_IPV6_TCP_SHIFT)
-#define ETH_RSS_NONF_IPV6_SCTP  ((uint64_t)1 << ETH_RSS_NONF_IPV6_SCTP_SHIFT)
-#define ETH_RSS_NONF_IPV6_OTHER ((uint64_t)1 << ETH_RSS_NONF_IPV6_OTHER_SHIFT)
-#define ETH_RSS_FRAG_IPV6   ((uint64_t)1 << ETH_RSS_FRAG_IPV6_SHIFT)
-#define ETH_RSS_FCOE_OX ((uint64_t)1 << ETH_RSS_FCOE_OX_SHIFT) /* not used */
-#define ETH_RSS_FCOE_RX ((uint64_t)1 << ETH_RSS_FCOE_RX_SHIFT) /* not used */
-#define ETH_RSS_FCOE_OTHER  ((uint64_t)1 << ETH_RSS_FCOE_OTHER_SHIFT) /* not used */
-#define ETH_RSS_L2_PAYLOAD  ((uint64_t)1 << ETH_RSS_L2_PAYLOAD_SHIFT)
+#define ETH_RSS_NONF_IPV4_UDP   (1ULL << ETH_PCTYPE_NONF_IPV4_UDP)
+#define ETH_RSS_NONF_IPV4_TCP   (1ULL << ETH_PCTYPE_NONF_IPV4_TCP)
+#define ETH_RS
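
Editorial note on the hunk above: the 40G flags are built with a 64-bit
constant (1ULL) because PCTYPE values run up to 63, and shifting a 32-bit
integer by 32 or more is undefined in C. A minimal sketch of that point,
assuming a hypothetical helper pctype_to_flag() that is not part of the patch;
only the three PCTYPE values are taken from the diff:

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the patch above. */
#define ETH_PCTYPE_NONF_IPV4_UDP 31
#define ETH_PCTYPE_NONF_IPV4_TCP 33
#define ETH_PCTYPE_L2_PAYLOAD    63

/* Hypothetical helper (not in the patch): build a 64-bit flag from a PCTYPE. */
static inline uint64_t
pctype_to_flag(unsigned int pctype)
{
	assert(pctype < 64); /* a wider shift would be undefined behaviour */
	return 1ULL << pctype;
}
```

With a plain int constant, pctype 33 could not be represented at all; with
1ULL it maps to bit 33 of a 64-bit mask.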

[dpdk-dev] [PATCH v2 4/6] i40e: support of 'is_command_supported'

2014-07-28 Thread Helin Zhang
'is_command_supported' is defined for capability discovery.
It checks whether a command (feature) is supported on
a specific type of NIC port. For now i40e supports the eight
commands below. Of course, more commands can be supported later.
 - RTE_CMD_GET_SYM_HASH_ENABLE_PER_PCTYPE
 - RTE_CMD_SET_SYM_HASH_ENABLE_PER_PCTYPE
 - RTE_CMD_GET_SYM_HASH_ENABLE_PER_PORT
 - RTE_CMD_SET_SYM_HASH_ENABLE_PER_PORT
 - RTE_CMD_GET_FILTER_SWAP
 - RTE_CMD_SET_FILTER_SWAP
 - RTE_CMD_GET_HASH_FUNCTION
 - RTE_CMD_SET_HASH_FUNCTION

Signed-off-by: Helin Zhang 
---
 lib/librte_pmd_i40e/i40e_ethdev.c | 28 
 1 file changed, 28 insertions(+)

diff --git a/lib/librte_pmd_i40e/i40e_ethdev.c 
b/lib/librte_pmd_i40e/i40e_ethdev.c
index 4403af4..87a4999 100644
--- a/lib/librte_pmd_i40e/i40e_ethdev.c
+++ b/lib/librte_pmd_i40e/i40e_ethdev.c
@@ -204,6 +204,8 @@ static int i40e_dev_rss_hash_update(struct rte_eth_dev *dev,
struct rte_eth_rss_conf *rss_conf);
 static int i40e_dev_rss_hash_conf_get(struct rte_eth_dev *dev,
  struct rte_eth_rss_conf *rss_conf);
+static int i40e_dev_is_command_supported(struct rte_eth_dev *dev __rte_unused,
+enum rte_eth_command cmd);
 static int i40e_rx_classification_filter_ctl(struct rte_eth_dev *dev,
 enum rte_eth_command cmd,
 void *args);
@@ -252,6 +254,7 @@ static struct eth_dev_ops i40e_eth_dev_ops = {
.reta_query   = i40e_dev_rss_reta_query,
.rss_hash_update  = i40e_dev_rss_hash_update,
.rss_hash_conf_get= i40e_dev_rss_hash_conf_get,
+   .is_command_supported = i40e_dev_is_command_supported,
.rx_classification_filter_ctl = i40e_rx_classification_filter_ctl,
 };

@@ -4292,6 +4295,31 @@ i40e_get_hash_function(struct i40e_hw *hw, enum 
rte_i40e_hash_function *hf)
 }

 static int
+i40e_dev_is_command_supported(struct rte_eth_dev *dev __rte_unused,
+ enum rte_eth_command cmd)
+{
+   uint32_t i;
+   /* Below commands defined in rte_eth_features.h are for i40e only */
+   static const enum rte_eth_command i40e_commands[] = {
+   RTE_CMD_GET_SYM_HASH_ENABLE_PER_PCTYPE,
+   RTE_CMD_SET_SYM_HASH_ENABLE_PER_PCTYPE,
+   RTE_CMD_GET_SYM_HASH_ENABLE_PER_PORT,
+   RTE_CMD_SET_SYM_HASH_ENABLE_PER_PORT,
+   RTE_CMD_GET_FILTER_SWAP,
+   RTE_CMD_SET_FILTER_SWAP,
+   RTE_CMD_GET_HASH_FUNCTION,
+   RTE_CMD_SET_HASH_FUNCTION,
+   };
+
+   for (i = 0; i < RTE_DIM(i40e_commands); i++) {
+   if (i40e_commands[i] == cmd)
+   return 1;
+   }
+
+   return 0;
+}
+
+static int
 i40e_rx_classification_filter_ctl(struct rte_eth_dev *dev,
  enum rte_eth_command cmd,
  void *args)
-- 
1.8.1.4
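
Editorial note: the capability check in this patch is a linear search over a
static table of supported commands. A self-contained sketch of the same
pattern, assuming a hypothetical enum demo_command and macro DEMO_DIM as
stand-ins for rte_eth_command and RTE_DIM (neither name comes from DPDK):

```c
#include <stddef.h>

/* Hypothetical stand-in for enum rte_eth_command from the patch series. */
enum demo_command {
	DEMO_CMD_UNKNOWN = 0,
	DEMO_CMD_GET_HASH_FUNCTION,
	DEMO_CMD_SET_HASH_FUNCTION,
	DEMO_CMD_GET_FILTER_SWAP,
};

/* Hypothetical stand-in for RTE_DIM: number of elements in an array. */
#define DEMO_DIM(a) (sizeof(a) / sizeof((a)[0]))

/* Return 1 if cmd is in the driver's table of supported commands, else 0. */
static int
demo_is_command_supported(enum demo_command cmd)
{
	static const enum demo_command supported[] = {
		DEMO_CMD_GET_HASH_FUNCTION,
		DEMO_CMD_SET_HASH_FUNCTION,
	};
	size_t i;

	for (i = 0; i < DEMO_DIM(supported); i++) {
		if (supported[i] == cmd)
			return 1;
	}

	return 0;
}
```

A caller would check the result before issuing the command, much as the
testpmd patch in this series tests for a result <= 0 and bails out.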



[dpdk-dev] [PATCH v2 5/6] i40e: Initialize hash function during port initialization.

2014-07-28 Thread Helin Zhang
As hash functions are configured in global registers, those
registers will not be reloaded without a global NIC hardware
reset. That means launching a DPDK application will not load
the default configuration of the hash functions. Those registers
need to be initialized during port initialization to make
sure they are all in an expected state.

Signed-off-by: Helin Zhang 
---
 lib/librte_pmd_i40e/i40e_ethdev.c | 71 +++
 1 file changed, 71 insertions(+)

diff --git a/lib/librte_pmd_i40e/i40e_ethdev.c 
b/lib/librte_pmd_i40e/i40e_ethdev.c
index 87a4999..386d864 100644
--- a/lib/librte_pmd_i40e/i40e_ethdev.c
+++ b/lib/librte_pmd_i40e/i40e_ethdev.c
@@ -204,6 +204,7 @@ static int i40e_dev_rss_hash_update(struct rte_eth_dev *dev,
struct rte_eth_rss_conf *rss_conf);
 static int i40e_dev_rss_hash_conf_get(struct rte_eth_dev *dev,
  struct rte_eth_rss_conf *rss_conf);
+static void i40e_init_hash_function(struct i40e_hw *hw);
 static int i40e_dev_is_command_supported(struct rte_eth_dev *dev __rte_unused,
 enum rte_eth_command cmd);
 static int i40e_rx_classification_filter_ctl(struct rte_eth_dev *dev,
@@ -392,6 +393,9 @@ eth_i40e_dev_init(__rte_unused struct eth_driver *eth_drv,
return ret;
}

+   /* Init hash functions */
+   i40e_init_hash_function(hw);
+
/* Initialize the shared code (base driver) */
ret = i40e_init_shared_code(hw);
if (ret) {
@@ -4369,3 +4373,70 @@ i40e_rx_classification_filter_ctl(struct rte_eth_dev 
*dev,

return ret;
 }
+
+/**
+ * Initialize hash functions. It includes,
+ * - set hash function to Toeplitz.
+ * - set the default filter swap configurations.
+ * - disable hash function enable per port.
+ * - disable hash function enable per pctype.
+ * Only global reset can reload the firmware configurations.
+ */
+static void
+i40e_init_hash_function(struct i40e_hw *hw)
+{
+   static struct rte_i40e_filter_swap_info swap_info[] = {
+   {ETH_PCTYPE_NONF_IPV4_UDP,
+   0x1e, 0x36, 0x04, 0x3a, 0x3c, 0x02},
+   {ETH_PCTYPE_NONF_IPV4_TCP,
+   0x1e, 0x36, 0x04, 0x3a, 0x3c, 0x02},
+   {ETH_PCTYPE_NONF_IPV4_SCTP,
+   0x1e, 0x36, 0x04, 0x00, 0x00, 0x00},
+   {ETH_PCTYPE_NONF_IPV4_OTHER,
+   0x1e, 0x36, 0x04, 0x00, 0x00, 0x00},
+   {ETH_PCTYPE_FRAG_IPV4,
+   0x1e, 0x36, 0x04, 0x00, 0x00, 0x00},
+   {ETH_PCTYPE_NONF_IPV6_UDP,
+   0x1a, 0x2a, 0x10, 0x3a, 0x3c, 0x02},
+   {ETH_PCTYPE_NONF_IPV6_TCP,
+   0x1a, 0x2a, 0x10, 0x3a, 0x3c, 0x02},
+   {ETH_PCTYPE_NONF_IPV6_SCTP,
+   0x1a, 0x2a, 0x10, 0x00, 0x00, 0x00},
+   {ETH_PCTYPE_NONF_IPV6_OTHER,
+   0x1a, 0x2a, 0x10, 0x00, 0x00, 0x00},
+   {ETH_PCTYPE_FRAG_IPV6,
+   0x1a, 0x2a, 0x10, 0x00, 0x00, 0x00},
+   {ETH_PCTYPE_L2_PAYLOAD,
+   0x00, 0x00, 0x00, 0x00, 0x00, 0x00},
+   };
+   static struct rte_i40e_sym_hash_enable_info sym_hash_ena_info[] = {
+   {ETH_PCTYPE_NONF_IPV4_UDP, 0},
+   {ETH_PCTYPE_NONF_IPV4_TCP, 0},
+   {ETH_PCTYPE_NONF_IPV4_SCTP, 0},
+   {ETH_PCTYPE_NONF_IPV4_OTHER, 0},
+   {ETH_PCTYPE_FRAG_IPV4, 0},
+   {ETH_PCTYPE_NONF_IPV6_UDP, 0},
+   {ETH_PCTYPE_NONF_IPV6_TCP, 0},
+   {ETH_PCTYPE_NONF_IPV6_SCTP, 0},
+   {ETH_PCTYPE_NONF_IPV6_OTHER, 0},
+   {ETH_PCTYPE_FRAG_IPV6, 0},
+   {ETH_PCTYPE_L2_PAYLOAD, 0},
+   };
+   static enum rte_i40e_hash_function hf = rte_i40e_hash_function_toeplitz;
+   uint32_t i;
+
+   /* set hash function to Toeplitz by default */
+   i40e_set_hash_function(hw, &hf);
+
+   /* initialize filter swap */
+   for (i = 0; i < RTE_DIM(swap_info); i++)
+   i40e_set_filter_swap(hw, &swap_info[i]);
+
+   /* disable all symmetric hash per pctype */
+   for (i = 0; i < RTE_DIM(sym_hash_ena_info); i++)
+   i40e_set_symmetric_hash_enable_per_pctype(hw,
+   &sym_hash_ena_info[i]);
+
+   /* disable symmetric hash per port */
+   i40e_set_symmetric_hash_enable_per_port(hw, 0);
+}
-- 
1.8.1.4



[dpdk-dev] [PATCH v2 2/6] ethdev: add new ops of 'is_command_supported' and 'rx_classification_filter_ctl'

2014-07-28 Thread Helin Zhang
Two ops, 'is_command_supported' and
'rx_classification_filter_ctl', are added. A new header
file, 'rte_eth_features.h', is added.
* 'is_command_supported': It is for capability discovery,
  that is, to check whether a specific feature/command is
  supported on a port.
* 'rx_classification_filter_ctl': It is for configuring
  receive classification filters, e.g. selecting the hash
  function, possibly configuring flow director. It is a
  common API where a lot of commands can be implemented
  for different sub-features, to avoid defining a large
  number of ops for device-specific features.
* 'rte_eth_features.h': It includes all the feature
  commands which can be checked and processed in the above
  two ops. It may also include other commands for future
  implementations.

Signed-off-by: Helin Zhang 
---
 lib/librte_ether/Makefile   |  1 +
 lib/librte_ether/rte_eth_features.h | 73 +
 lib/librte_ether/rte_ethdev.c   | 31 
 lib/librte_ether/rte_ethdev.h   | 55 
 4 files changed, 160 insertions(+)
 create mode 100644 lib/librte_ether/rte_eth_features.h

diff --git a/lib/librte_ether/Makefile b/lib/librte_ether/Makefile
index b310f8b..8089723 100644
--- a/lib/librte_ether/Makefile
+++ b/lib/librte_ether/Makefile
@@ -46,6 +46,7 @@ SRCS-y += rte_ethdev.c
 #
 SYMLINK-y-include += rte_ether.h
 SYMLINK-y-include += rte_ethdev.h
+SYMLINK-y-include += rte_eth_features.h

 # this lib depends upon:
 DEPDIRS-y += lib/librte_eal lib/librte_mempool lib/librte_ring lib/librte_mbuf
diff --git a/lib/librte_ether/rte_eth_features.h 
b/lib/librte_ether/rte_eth_features.h
new file mode 100644
index 000..983d7c6
--- /dev/null
+++ b/lib/librte_ether/rte_eth_features.h
@@ -0,0 +1,73 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_ETH_FEATURES_H_
+#define _RTE_ETH_FEATURES_H_
+
+/**
+ * @file
+ *
+ * Ethernet device specific features
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/* Commands defined for NIC specific features */
+enum rte_eth_command {
+   RTE_CMD_UNKNOWN = 0,
+   /**< Unknown command */
+   RTE_CMD_GET_SYM_HASH_ENABLE_PER_PCTYPE,
+   /**< Get symmetric hash enable per pctype */
+   RTE_CMD_SET_SYM_HASH_ENABLE_PER_PCTYPE,
+   /**< Set symmetric hash enable per pctype */
+   RTE_CMD_GET_SYM_HASH_ENABLE_PER_PORT,
+   /**< Get symmetric hash enable per port */
+   RTE_CMD_SET_SYM_HASH_ENABLE_PER_PORT,
+   /**< Set symmetric hash enable per port */
+   RTE_CMD_GET_FILTER_SWAP,
+   /**< Get filter swap configurations */
+   RTE_CMD_SET_FILTER_SWAP,
+   /**< Set filter swap configurations */
+   RTE_CMD_GET_HASH_FUNCTION,
+   /**< Get hash function */
+   RTE_CMD_SET_HASH_FUNCTION,
+   /**< Set hash function */
+};
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_ETH_FEATURES_H_ */
diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index fd1010a..dfeb804 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -3002,3 +3002,34 @@ rte_eth_dev_get_flex_filter(uint8_t port_id, uint16_t 
index,
return (*dev->dev_ops->get_flex_filter)(dev, index, filter,
rx_queue);
 }
+
+int
+rte_eth_dev_is_command_suppo

[dpdk-dev] [PATCH v2 3/6] i40e: support of 'rx_classification_filter_ctl'

2014-07-28 Thread Helin Zhang
'rx_classification_filter_ctl' is defined as a common API
for receive classification filter features. Eight commands
have been implemented for selecting the hash function
('Toeplitz' or 'Simple XOR') and for configuring symmetric hash
functions. In detail,
RTE_CMD_GET_SYM_HASH_ENABLE_PER_PCTYPE:
 - Get symmetric hash enable configuration per 'PCTYPE'.
RTE_CMD_SET_SYM_HASH_ENABLE_PER_PCTYPE:
 - Set symmetric hash enable configuration per 'PCTYPE'.
RTE_CMD_GET_SYM_HASH_ENABLE_PER_PORT:
 - Get symmetric hash enable configuration per port.
RTE_CMD_SET_SYM_HASH_ENABLE_PER_PORT:
 - Set symmetric hash enable configuration per port.
RTE_CMD_GET_FILTER_SWAP:
 - Get filter swap configurations.
RTE_CMD_SET_FILTER_SWAP:
 - Set filter swap configurations.
RTE_CMD_GET_HASH_FUNCTION:
 - Get current hash function.
RTE_CMD_SET_HASH_FUNCTION:
 - Set hash function of 'Toeplitz' or 'Simple XOR'.
Note that 'PCTYPE' means 'Packet Classification Type'.

Signed-off-by: Helin Zhang 
---
 lib/librte_pmd_i40e/Makefile  |   6 +
 lib/librte_pmd_i40e/i40e_ethdev.c | 385 ++
 lib/librte_pmd_i40e/i40e_ethdev.h |   2 +
 lib/librte_pmd_i40e/rte_i40e.h| 108 +++
 4 files changed, 501 insertions(+)
 create mode 100644 lib/librte_pmd_i40e/rte_i40e.h

diff --git a/lib/librte_pmd_i40e/Makefile b/lib/librte_pmd_i40e/Makefile
index 4b31675..a777a76 100644
--- a/lib/librte_pmd_i40e/Makefile
+++ b/lib/librte_pmd_i40e/Makefile
@@ -87,6 +87,12 @@ SRCS-$(CONFIG_RTE_LIBRTE_I40E_PMD) += i40e_ethdev.c
 SRCS-$(CONFIG_RTE_LIBRTE_I40E_PMD) += i40e_rxtx.c
 SRCS-$(CONFIG_RTE_LIBRTE_I40E_PMD) += i40e_ethdev_vf.c
 SRCS-$(CONFIG_RTE_LIBRTE_I40E_PMD) += i40e_pf.c
+
+#
+# Export include file
+#
+SYMLINK-$(CONFIG_RTE_LIBRTE_I40E_PMD)-include += rte_i40e.h
+
 # this lib depends upon:
 DEPDIRS-$(CONFIG_RTE_LIBRTE_I40E_PMD) += lib/librte_eal lib/librte_ether
 DEPDIRS-$(CONFIG_RTE_LIBRTE_I40E_PMD) += lib/librte_mempool lib/librte_mbuf
diff --git a/lib/librte_pmd_i40e/i40e_ethdev.c 
b/lib/librte_pmd_i40e/i40e_ethdev.c
index 9ed31b5..4403af4 100644
--- a/lib/librte_pmd_i40e/i40e_ethdev.c
+++ b/lib/librte_pmd_i40e/i40e_ethdev.c
@@ -48,6 +48,7 @@
 #include 
 #include 
 #include 
+#include 

 #include "i40e_logs.h"
 #include "i40e/i40e_register_x710_int.h"
@@ -203,6 +204,9 @@ static int i40e_dev_rss_hash_update(struct rte_eth_dev *dev,
struct rte_eth_rss_conf *rss_conf);
 static int i40e_dev_rss_hash_conf_get(struct rte_eth_dev *dev,
  struct rte_eth_rss_conf *rss_conf);
+static int i40e_rx_classification_filter_ctl(struct rte_eth_dev *dev,
+enum rte_eth_command cmd,
+void *args);

 /* Default hash key buffer for RSS */
 static uint32_t rss_key_default[I40E_PFQF_HKEY_MAX_INDEX + 1];
@@ -248,6 +252,7 @@ static struct eth_dev_ops i40e_eth_dev_ops = {
.reta_query   = i40e_dev_rss_reta_query,
.rss_hash_update  = i40e_dev_rss_hash_update,
.rss_hash_conf_get= i40e_dev_rss_hash_conf_get,
+   .rx_classification_filter_ctl = i40e_rx_classification_filter_ctl,
 };

 static struct eth_driver rte_i40e_pmd = {
@@ -3956,3 +3961,383 @@ i40e_pf_config_mq_rx(struct i40e_pf *pf)

return 0;
 }
+
+static int
+i40e_get_filter_swap(struct i40e_hw *hw, struct rte_i40e_filter_swap_info 
*info)
+{
+   uint32_t reg;
+
+   if (!hw || !info) {
+   PMD_DRV_LOG(ERR, "Invalid pointer\n");
+   return -1;
+   }
+
+   switch (info->pctype) {
+   case ETH_PCTYPE_NONF_IPV4_UDP:
+   case ETH_PCTYPE_NONF_IPV4_TCP:
+   case ETH_PCTYPE_NONF_IPV4_SCTP:
+   case ETH_PCTYPE_NONF_IPV4_OTHER:
+   case ETH_PCTYPE_FRAG_IPV4:
+   case ETH_PCTYPE_NONF_IPV6_UDP:
+   case ETH_PCTYPE_NONF_IPV6_TCP:
+   case ETH_PCTYPE_NONF_IPV6_SCTP:
+   case ETH_PCTYPE_NONF_IPV6_OTHER:
+   case ETH_PCTYPE_FRAG_IPV6:
+   case ETH_PCTYPE_L2_PAYLOAD:
+   reg = I40E_READ_REG(hw, I40E_GLQF_SWAP(0, info->pctype));
+   PMD_DRV_LOG(DEBUG, "Value read from I40E_GLQF_SWAP[0,%d]: "
+   "0x%x\n", info->pctype, reg);
+
+   /**
+* The offset and length read from register in word unit,
+* which need to be converted in byte unit before being saved.
+*/
+   info->off0_src0 =
+   (uint8_t)((reg & I40E_GLQF_SWAP_OFF0_SRC0_MASK) >>
+   I40E_GLQF_SWAP_OFF0_SRC0_SHIFT) << 1;
+   info->off0_src1 =
+   (uint8_t)((reg & I40E_GLQF_SWAP_OFF0_SRC1_MASK) >>
+   I40E_GLQF_SWAP_OFF0_SRC1_SHIFT) << 1;
+   info->len0 = (uint8_t)((reg & I40E_GLQF_SWAP_FLEN0_MASK) >>
+   I40E_GLQF_SWAP_FLEN0_SHIFT) << 1;
+   info->off1_sr
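
Editorial note: the comment in the hunk above says the register holds offsets
and lengths in word units, which are converted to bytes before being saved
(hence the "<< 1"). A minimal sketch of that mask/shift/convert step, assuming
a hypothetical field layout; the real masks and shifts are the
I40E_GLQF_SWAP_* definitions used in the patch, whose values are not shown
here:

```c
#include <stdint.h>

/* Hypothetical field layout, for illustration only. */
#define DEMO_OFF0_SRC0_SHIFT 0
#define DEMO_OFF0_SRC0_MASK  (0x3Fu << DEMO_OFF0_SRC0_SHIFT)
#define DEMO_FLEN0_SHIFT     16
#define DEMO_FLEN0_MASK      (0xFu << DEMO_FLEN0_SHIFT)

/* Extract a field held in 16-bit word units and return it in bytes. */
static uint8_t
demo_field_to_bytes(uint32_t reg, uint32_t mask, unsigned int shift)
{
	uint32_t words = (reg & mask) >> shift;

	return (uint8_t)(words << 1); /* 1 word = 2 bytes */
}
```

Under this made-up layout, a register with 0x0F in the offset field and 2 in
the length field yields a byte offset of 0x1e and a byte length of 4.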

[dpdk-dev] [PATCH v2 6/6] app/testpmd: add commands for configuring hash functions

2014-07-28 Thread Helin Zhang
Eight commands are added to configure hash functions.
They are,
 - i40e_get_sym_hash_ena_per_port
   Get symmetric hash enable per port.
 - i40e_set_sym_hash_ena_per_port
   Set symmetric hash enable per port.
 - i40e_get_sym_hash_ena_per_pctype
   Get symmetric hash enable per PCTYPE
   (Packet Classification Type).
 - i40e_set_sym_hash_ena_per_pctype
   Set symmetric hash enable per PCTYPE
   (Packet Classification Type).
 - i40e_get_filter_swap
   Get filter swap configurations.
 - i40e_set_filter_swap
   Set filter swap configurations.
 - i40e_get_hash_function
   Get hash function.
 - i40e_set_hash_function
   Set hash function to 'Toeplitz' or 'Simple XOR'

Signed-off-by: Helin Zhang 
---
 app/test-pmd/cmdline.c | 579 +
 1 file changed, 579 insertions(+)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 345be11..9890400 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -74,6 +74,10 @@
 #include 
 #include 
 #include 
+#include 
+#ifdef RTE_LIBRTE_I40E_PMD
+#include 
+#endif

 #include 
 #include 
@@ -655,6 +659,43 @@ static void cmd_help_long_parsed(void *parsed_result,

"get_flex_filter (port_id) index (idx)\n"
"get info of a flex filter.\n\n"
+
+#ifdef RTE_LIBRTE_I40E_PMD
+   "i40e_get_sym_hash_ena_per_port (port_id)\n"
+   "get symmetric hash enable configuration per port,"
+   " on i40e only\n\n"
+
+   "i40e_set_sym_hash_ena_per_port (port_id)"
+   " (enable|disable)\n"
+   "set symmetric hash enable configuration per port"
+   " to enable or disable, on i40e only\n\n"
+
+   "i40e_get_sym_hash_ena_per_pctype (port_id) (pctype)\n"
+   "get symmetric hash enable configuration per pctype,"
+   " on i40e only\n\n"
+
+   "i40e_set_sym_hash_ena_per_pctype (port_id) (pctype)"
+   " (enable|disable)\n"
+   "set symmetric hash enable configuration per"
+   " pctype to enable or disable, on i40e only\n\n"
+
+   "i40e_get_filter_swap (port_id) (pctype)\n"
+   "get filter swap configurations on i40e,"
+   " on i40e only\n\n"
+
+   "i40e_set_filter_swap (port_id) (pctype) (off0_src0)"
+   " (off0_src1) (len0) (off1_src0) (off1_src1) (len1)\n"
+   "set filter swap configurations, on i40e only\n\n"
+
+   "i40e_get_hash_function (port_id)\n"
+   "get hash function of Toeplitz or Simple XOR,"
+   " on i40e only\n\n"
+
+   "i40e_set_hash_function (port_id)"
+   " (toeplitz|simple_xor)\n"
+   "set the hash function to Toeplitz or Simple XOR,"
+   " on i40e only\n\n"
+#endif /* RTE_LIBRTE_I40E_PMD */
);
}
 }
@@ -7304,6 +7345,534 @@ cmdline_parse_inst_t cmd_get_flex_filter = {
},
 };

+/* *** Classification Filters Control *** */
+#ifdef RTE_LIBRTE_I40E_PMD
+/* *** Get symmetric hash enable per port *** */
+struct cmd_i40e_get_sym_hash_ena_per_port_result {
+   cmdline_fixed_string_t i40e_get_sym_hash_ena_per_port;
+   uint8_t port_id;
+};
+
+static void
+cmd_i40e_get_sym_hash_per_port_parsed(void *parsed_result,
+ __rte_unused struct cmdline *cl,
+ __rte_unused void *data)
+{
+   struct cmd_i40e_get_sym_hash_ena_per_port_result *res = parsed_result;
+   uint8_t enable = 0;
+   int ret;
+
+   if (rte_eth_dev_is_command_supported(res->port_id,
+   RTE_CMD_GET_SYM_HASH_ENABLE_PER_PORT) <= 0) {
+   printf("Command of RTE_CMD_GET_SYM_HASH_ENABLE_PER_PORT "
+   "not supported on port: %d\n", res->port_id);
+   return;
+   }
+
+   ret = rte_eth_dev_rx_classification_filter_ctl(res->port_id,
+   RTE_CMD_GET_SYM_HASH_ENABLE_PER_PORT, &enable);
+   if (ret < 0) {
+   printf("Cannot get symmetric hash enable per port "
+   "on i40e port %u\n", res->port_id);
+   return;
+   }
+
+   printf("Symmetric hash is %s on i40e port %u\n",
+   enable ? "enabled" : "disabled", res->port_id);
+}
+
+cmdline_parse_token_string_t cmd_i40e_get_sym_hash_ena_per_port_all =
+   TOKEN_STRING_INITIALIZER(
+   struct cmd_i40e_get_sym_hash_ena_per_port_result,
+   i40e_get_sym_hash_ena_per_port,
+   "i40e_get_sym_hash_ena_per_port");
+cmdline_parse_token_num_t cmd_i40e_get_sym_hash_ena_per_port_port_id =
+   TOKEN_NUM_INITI

[dpdk-dev] [PATCH v2 0/6] Support configuring hash functions

2014-07-28 Thread Helin Zhang
These patches mainly add support for configuring hash functions.
In detail,
 - It can select Toeplitz or simple XOR hash functions.
 - It can configure symmetric hash functions.
   * Get/set symmetric hash enable per port.
   * Get/set symmetric hash enable per 'PCTYPE'.
   * Get/set filter swap configurations.
 - 'ethdev' level interfaces are added.
   * 'is_command_supported', to check if a feature (command)
 is supported on a port.
   * 'rx_classification_filter_ctl', a common API to execute
 specific command of each feature.
 - Seven commands are implemented in testpmd to support
   testing above.
Note that 'PCTYPE' means 'Packet Classification Type'.

Helin Zhang (6):
  ethdev: rename macros of packet classification type
  ethdev: add new ops of 'is_command_supported' and
'rx_classification_filter_ctl'
  i40e: support of 'rx_classification_filter_ctl'
  i40e: support of 'is_command_supported'
  i40e: Initialize hash function during port initialization.
  app/testpmd: add commands for configuring hash functions

 app/test-pmd/cmdline.c  | 579 
 lib/librte_ether/Makefile   |   1 +
 lib/librte_ether/rte_eth_features.h |  73 +
 lib/librte_ether/rte_ethdev.c   |  31 ++
 lib/librte_ether/rte_ethdev.h   | 131 +---
 lib/librte_pmd_i40e/Makefile|   6 +
 lib/librte_pmd_i40e/i40e_ethdev.c   | 484 ++
 lib/librte_pmd_i40e/i40e_ethdev.h   |   2 +
 lib/librte_pmd_i40e/rte_i40e.h  | 108 +++
 9 files changed, 1377 insertions(+), 38 deletions(-)
 create mode 100644 lib/librte_ether/rte_eth_features.h
 create mode 100644 lib/librte_pmd_i40e/rte_i40e.h

-- 
1.8.1.4
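
Editorial note: "symmetric hash" in this series refers to hashing both
directions of a flow to the same value. The sketch below only illustrates
that property in software, under loud assumptions: struct demo_flow and the
mixing function are hypothetical, the hash is neither Toeplitz nor Simple XOR,
and it is not how the i40e hardware (which uses the filter swap registers)
computes anything.

```c
#include <stdint.h>

/* Hypothetical flow key, for illustration only. */
struct demo_flow {
	uint32_t src_ip;
	uint32_t dst_ip;
	uint16_t src_port;
	uint16_t dst_port;
};

/* Toy mixing step (FNV-1a style); NOT Toeplitz, NOT what the NIC computes. */
static uint32_t
demo_mix(uint32_t h, uint32_t v)
{
	h ^= v;
	h *= 0x01000193u;
	return h;
}

/*
 * Combine each source/destination pair with commutative operations only,
 * so swapping the two endpoints cannot change the result.
 */
static uint32_t
demo_symmetric_hash(const struct demo_flow *f)
{
	uint32_t h = 0x811c9dc5u;

	h = demo_mix(h, f->src_ip ^ f->dst_ip);
	h = demo_mix(h, f->src_ip + f->dst_ip);
	h = demo_mix(h, (uint32_t)(f->src_port ^ f->dst_port));
	h = demo_mix(h, (uint32_t)f->src_port + (uint32_t)f->dst_port);
	return h;
}
```

The practical consequence of such a property is that packets of a connection
in both directions land on the same RX queue when the hash drives queue
selection.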



[dpdk-dev] free a memzone

2014-07-28 Thread Ananyev, Konstantin
Hi Mahdi,

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Mahdi Dashtbozorgi
> Sent: Thursday, July 24, 2014 6:20 AM
> To: dev at dpdk.org
> Subject: Re: [dpdk-dev] free a memzone
> 
> Hi Bruce,
> 
> Thank you for the response. That's a great Idea!
> But I do not understand the last four parameters of this function. (vaddr,
> paddr, pg_num, pg_shift)
> I guess vaddr is the virtual address of the previously allocated mempool,

yes

> paddr is calculated using function call rte_mem_virt2phy(vaddr), am I
> right?

yes

 >what about pg_num and pg_shift? how can I pass them correctly?

From rte_mempool.h:
"* @param pg_num
 *   Number of elements in the paddr array.
 * @param pg_shift
 *   LOG2 of the physical pages size."

If you are using a memzone as externally allocated memory, it will already be
physically contiguous.
So in your case pg_num = MEMPOOL_PG_NUM_DEFAULT, pg_shift =
MEMPOOL_PG_SHIFT_MAX.
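
Editorial note: for the case where memory is not one contiguous memzone and an
explicit page size has to be passed, pg_shift is simply log2 of that page size,
as the header comment says. A minimal sketch, assuming a hypothetical helper
demo_pg_shift() that is not a DPDK function:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: log2 of a power-of-two page size. */
static uint32_t
demo_pg_shift(size_t page_size)
{
	uint32_t shift = 0;

	/* Page sizes are expected to be non-zero powers of two. */
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

	while ((page_size >> shift) != 1)
		shift++;

	return shift;
}
```

For example, 4 KB pages give a shift of 12 and 2 MB hugepages give 21.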

Though, I don't think rte_mempool_xmem_create() will help you in any way.
Again from rte_mempool.h:
"* Creates a new mempool named *name* in memory.
 *
 * This function uses ``memzone_reserve()`` to allocate memory. The
 * pool contains n elements of elt_size. Its size is set to n.
 * Depending on the input parameters, mempool elements can be either allocated
 * together with the mempool header, or an externally provided memory buffer
 * could be used to store mempool objects. In later case, that external
 * memory buffer can consist of set of disjoint physical pages."

So xmem_create would still create a new ring, reserve a new memzone for the
mempool's metadata, etc.
The only difference is that it can use externally allocated memory to store the
mempool elements.

As I understand it, what you need is a sort of mempool_reset(): a function that
would re-init the mempool to its just-created state
(all elements are free, lcore caches are empty, etc).
Right now we don't have such a function, but I suppose something like this
should do
(note that I didn't run or even build it):

if ((mp = rte_mempool_lookup(name)) != NULL) {

	char ring_name[RTE_RING_NAMESIZE];

	/* save mp ring name. */
	memcpy(ring_name, mp->ring->name, sizeof(ring_name));

	/* reset the ring. */
	rte_ring_init(mp->ring, ring_name, rte_align32pow2(mp->size + 1),
		mp->ring->flags);

	/* repopulate mempool and reinit all its elements. */
	mempool_populate(mp, mp->size, 1, rte_pktmbuf_init, NULL);

	/* reset all lcore caches. */
	memset(mp->local_cache, 0, sizeof(mp->local_cache));

	/* reset statistics if needed. */
} else {
	/* create new mempool. */
}

Ideally such a function should live in librte_mempool, of course, but if you
are in a hurry you can probably give it a try.

Note that I assume that no other process, except the failed/restarting
secondary, is using this mempool.
If the primary or some other secondary does, then you first need to stop them
from using this mempool and wait till they finish all their packet processing
activity.

Konstantin

> Best Regards,
> Mahdi.
> 
> 
> On Thu, Jul 24, 2014 at 9:48 AM, Mahdi Dashtbozorgi 
> wrote:
> 
> > Hi Bruce,
> >
> > Thank you for the response. That's a great Idea!
> > But I do not understand the last four parameters of this function. (vaddr,
> > paddr, pg_num, pg_shift)
> > I guess vaddr is the virtual address of the previously allocated mempool,
> > paddr is calculated using function call rte_mem_virt2phy(vaddr), am I
> > right? what about pg_num and pg_shift? how can I pass them correctly?
> >
> > Best Regards,
> > Mahdi.
> >
> >
> > On Wed, Jul 23, 2014 at 11:09 PM, Richardson, Bruce <
> > bruce.richardson at intel.com> wrote:
> >
> >> Rather than freeing the previously allocated memzone, could you not just
> >> re-initialize the mempool using something like rte_mempool_xmem_create?
> >>
> >> > -Original Message-
> >> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Mahdi 
> >> > Dashtbozorgi
> >> > Sent: Wednesday, July 23, 2014 2:05 AM
> >> > To: dev at dpdk.org
> >> > Subject: Re: [dpdk-dev] free a memzone
> >> >
> >> > Hi guys,
> >> >
> >> > Is there any suggestion to free the previously allocated memzone?
> >> > I really need help in this issue.
> >> > Any help is appreciated.
> >> >
> >> > Best Regards,
> >> > Mahdi.
> >> >
> >> >
> >> >
> >> > On Tue, Jul 22, 2014 at 4:03 PM, Mahdi Dashtbozorgi 
> >> > wrote:
> >> >
> >> > > Hi,
> >> > >
> >> > > I have two processes, which uses DPDK multi-process feature to
> >> communicate.
> >> > > Master process captures packets from NIC and put them to a ring
> >> buffer,
> >> > > which is shared between master and slave process.
> >> > > The slave process looks up the shared ring buffer using
> >> rte_ring_lookup
> >> > > function and reads the packets.
> >> > > The slave process needs a memory pool, too. Therefore, it creates a
> >> > > mempool using rte_mempool_create. But If the slave process crashes
> >> during
> >> > > its processing and runs again, rte_mempool_create function fails and
> >> tells
> >> > > that there is a memor

[dpdk-dev] free a memzone

2014-07-28 Thread Mahdi Dashtbozorgi
Hi Konstantin,

Thank you very much. Your solution fixed my problem.
Is there a similar way to reset the memory zone that is used by the
rte_malloc function?
If I use rte_malloc instead of malloc and the application crashes, the
memory zone that rte_malloc used in the previous run is unusable in the
next run of the slave process.

Best Regards,
Mahdi.




On Mon, Jul 28, 2014 at 4:27 PM, Ananyev, Konstantin <
konstantin.ananyev at intel.com> wrote:

> Hi Mahdi,
>

[dpdk-dev] [PATCH] ixgbe: convert sse intrinsics to use __builtin variants

2014-07-28 Thread Neil Horman
On Sat, Jul 26, 2014 at 11:15:01AM +, Ananyev, Konstantin wrote:
> 
> Hi Neil,
>  
> > The ixgbe pmd currently can't be built without enabling sse instructions at
> > compile time.
> 
> Actually it can, all you have to do is set RTE_IXGBE_INC_VECTOR=n in your 
> config.
> 
> > While sse extensions provide better performance, there's no reason
> > that we can't still create builds to run on systems that don't support sse.
> > If we modify the ixgbe code to use the __builtin_shuffle and
> > __builtin_popcountll functions, I've confirmed that the gcc compiler emits
> > the appropriate sse instructions when the provided -march parameter
> > indicates a machine that includes sse support, and emits generic code when
> > sse isn't available.
> 
> I don't think it is ok to blindly replace _mm_shuffle_epi8 with 
> __builtin_shuffle.
> They are not identical.
> I tried your patch on IVB box (gcc 4.8.3, CONFIG_RTE_MACHINE="native").
> The result is - ixgbe_recv_pkts_vec() functionality is broken.
> See below for more details.
> So my vote is NACK.
> Konstantin 
>  
> 1. Code changes:
> uint16_t
> ixgbe_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
>                     uint16_t nb_pkts)
> {
>     ...
>     __m128i shuf_msk;
>     ...
>     /* mask to shuffle from desc. to mbuf */
>     shuf_msk = _mm_set_epi8(
>         7, 6, 5, 4,  /* octet 4~7, 32bits rss */
>         0xFF, 0xFF,  /* skip high 16 bits vlan_macip, zero out */
>         15, 14,      /* octet 14~15, low 16 bits vlan_macip */
>         0xFF, 0xFF,  /* skip high 16 bits pkt_len, zero out */
>         13, 12,      /* octet 12~13, low 16 bits pkt_len */
>         0xFF, 0xFF,  /* skip nb_segs and in_port, zero out */
>         13, 12       /* octet 12~13, 16 bits data_len */
>         );
>     ...
>     for (...) {
>         __m128i descs[RTE_IXGBE_DESCS_PER_LOOP];
>         __m128i pkt_mb1, pkt_mb2, pkt_mb3, pkt_mb4;
>         ...
>         descs[3] = _mm_loadu_si128((__m128i *)(rxdp + 3));
>         ...
> -       pkt_mb4 = _mm_shuffle_epi8(descs[3], shuf_msk);
> +       pkt_mb4 = __builtin_shuffle(descs[3], shuf_msk);
>         ...
> 
> 2. Code generated before the patch (valid one):
> ...
> vmovdqa 0x4978d(%rip),%xmm0   /* load shuf_msk */
> ...
> vmovdqu 0x30(%rdx),%xmm4/* load desc[3] */
> 
> vpshufb %xmm0,%xmm4,%xmm8
> 
> 
> 3. Code generated after the patch applied (broken one):
> ...
> vmovdqu 0x30(%rdx),%xmm3
> ...
> vpunpcklqdq %xmm3,%xmm3,%xmm3   /* !!! ERROR - should be vpshufb  */
> 
> 4. What happens here?
> My understanding:
> 
> GCC treats __m128i as vector of two 64bit integers:
> /lib/gcc/x86_64-redhat-linux/4.8.3/include/emmintrin.h:typedef long long 
> __m128i __attribute__ ((__vector_size__ (16), __may_alias__));
> 
> From https://gcc.gnu.org/onlinedocs/gcc/Vector-Extensions.html:
> "... Vector shuffling is available using functions __builtin_shuffle (vec, 
> mask) and __builtin_shuffle (vec0, vec1, mask). Both functions construct a 
> permutation of elements from one or two vectors and return a vector of the 
> same type as the input vector(s). The mask is an integral vector with the 
> same width (W) and element count (N) as the output vector.
> 
> The elements of the input vectors are numbered in memory ordering of vec0 
> beginning at 0 and vec1 beginning at N. The elements of mask are considered 
> modulo N in the single-operand case and modulo 2*N in the two-operand case."
> 
> For __m128i N = 2, so:
> 
> __m128i x, msk;
> x = __builtin_shuffle(x, msk);
> 
> means:
> 
> index0 = msk[0..63] % 2;
> index1 = msk[64..127] % 2;
> x[0..63] =  x[index0 * 64..index0*64+63];
> x[64..127] =  x[index1 * 64..index1*64+63];
> 
> In ixgbe_recv_pkts_vec() both shuf_msk[0..63] % 2 == 0 and
> shuf_msk[64..127] % 2 == 0.
> So the compiler makes an optimisation:
> pkt_mb4[0..63]  = descs[3] [0..63];
> pkt_mb4[64..127]  = descs[3] [0..63];
> i.e:
> vpunpcklqdq %xmm3,%xmm3,%xmm3
> 
> BTW, changing to
> __builtin_shuffle((__v16qi)descs[3], (__v16qi)shuf_msk);
> wouldn't help either.
> In that case __builtin_shuffle will consider the elements of the mask modulo
> 16, while _mm_shuffle_epi8 (PSHUFB) is expected to zero the destination byte
> if the upper bit in the corresponding mask byte is 1.
> 
OK, sorry for the delayed response. I spent the weekend reading up on the
differences between these instructions, and you're right: I thought those
operations were equivalent, but the zeroing operation differentiates them.
Sorry about that.
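
A scalar model may make the difference easier to see. The sketch below is plain C with no intrinsics, and the three function names are invented for illustration. It models PSHUFB as described above (a mask byte with the upper bit set zeroes the destination byte) and __builtin_shuffle as described in the GCC documentation quoted above (mask elements taken modulo N), for both the 16 x 8-bit and the 2 x 64-bit views of the vector.

```c
#include <stdint.h>
#include <string.h>

/* Model of PSHUFB / _mm_shuffle_epi8: bit 7 of a mask byte zeroes the
 * destination byte, otherwise the low 4 bits select a source byte. */
void pshufb_model(uint8_t dst[16], const uint8_t src[16],
                  const uint8_t mask[16])
{
    uint8_t tmp[16];

    for (int i = 0; i < 16; i++)
        tmp[i] = (mask[i] & 0x80) ? 0 : src[mask[i] & 0x0F];
    memcpy(dst, tmp, sizeof(tmp));   /* tmp keeps dst == src safe */
}

/* Model of __builtin_shuffle on 16 x 8-bit elements: every mask element
 * is taken modulo 16, so nothing is ever zeroed. */
void builtin_shuffle_bytes_model(uint8_t dst[16], const uint8_t src[16],
                                 const uint8_t mask[16])
{
    uint8_t tmp[16];

    for (int i = 0; i < 16; i++)
        tmp[i] = src[mask[i] % 16];
    memcpy(dst, tmp, sizeof(tmp));
}

/* Model of __builtin_shuffle on 2 x 64-bit elements (how GCC sees
 * __m128i): every mask element is taken modulo 2. */
void builtin_shuffle_qwords_model(uint64_t dst[2], const uint64_t src[2],
                                  const uint64_t mask[2])
{
    uint64_t lo = src[mask[0] % 2];
    uint64_t hi = src[mask[1] % 2];

    dst[0] = lo;
    dst[1] = hi;
}
```

With a mask byte of 0xFF, the PSHUFB model writes 0 while the byte-wise __builtin_shuffle model reads source byte 15 (0xFF % 16). With 64-bit mask elements whose values are both even, the 2 x 64-bit model copies lane 0 into both lanes, which matches the vpunpcklqdq seen in the generated code.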

That said, it would be really helpful for distribution packaging to be able to
enable vectorized reception at run time.  A compile-time build variable really
just isn't very helpful.  I'll see if I can rework the patch to allow optional
vectorized packet reception at run time.
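
One common shape for that kind of run-time selection is a function pointer chosen once from a CPU feature check. The sketch below is only an illustration of the idea and not the reworked patch: the rx_burst_fn type, both handler functions and select_rx_burst() are hypothetical names, and the "vector" handler is a stub that reuses the scalar code so the example stays portable. It assumes GCC or a compatible compiler on x86 for __builtin_cpu_supports(); the check is for SSSE3 because PSHUFB is an SSSE3 instruction. On other compilers or architectures the sketch falls back to the scalar path.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical handler signature; the real ixgbe rx functions differ. */
typedef uint32_t (*rx_burst_fn)(const uint8_t *buf, size_t len);

/* Portable path: works on any CPU. */
uint32_t rx_burst_scalar(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;

    for (size_t i = 0; i < len; i++)
        sum += buf[i];
    return sum;
}

/* Stand-in for a vectorized path. In a real driver this would live in a
 * file built with SSE flags; here it reuses the scalar code on purpose. */
uint32_t rx_burst_vec_stub(const uint8_t *buf, size_t len)
{
    return rx_burst_scalar(buf, len);
}

/* Returns 1 if the running CPU reports SSSE3, 0 otherwise or if the
 * compiler/architecture offers no way to ask. */
int cpu_has_ssse3(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    return __builtin_cpu_supports("ssse3") ? 1 : 0;
#else
    return 0;
#endif
}

/* Pick the handler once, at init time, instead of at build time. */
rx_burst_fn select_rx_burst(void)
{
    return cpu_has_ssse3() ? rx_burst_vec_stub : rx_burst_scalar;
}
```

The point of the design is that one binary carries both paths, so a distribution package can be built once and still use the faster path where the hardware allows it.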

Neil



[dpdk-dev] free a memzone

2014-07-28 Thread Ananyev, Konstantin
Hi Mahdi,

> Hi Konstantin,
> Thank you very much. Your solution fixed my problem.
> Is there a solution like this for resetting the memory zone, which is used by
> rte_malloc function?
> Because if I use rte_malloc instead of malloc, in the case of application
> crash, the memory zone, which was used by rte_malloc in the previous run would
> be unusable for the next run of slave process.

Without significant modifications inside librte_eal and/or librte_malloc,
nothing comes to mind.
If that is a big problem for you, it might be possible to change your
whole process model a bit:

In the parent process:
- allocate memory/init run-time structures
L1:
- fork();
- wait till the child terminates
- if it terminated abnormally, then free memory/re-init run-time structures.
Goto L1

Do the actual packet processing inside the child process.
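
The parent/child loop above can be sketched with plain POSIX calls. This is a minimal illustration under stated assumptions, not DPDK code: supervise(), the worker_fn and reinit_fn types, and the two demo callbacks are hypothetical names. The worker callback stands in for the packet processing loop, and the reinit callback stands in for "free memory/re-init run-time structures". EINTR handling is left out for brevity.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

typedef int (*worker_fn)(void);   /* hypothetical: runs in the child  */
typedef void (*reinit_fn)(void);  /* hypothetical: runs in the parent */

/*
 * Fork a child to run worker(); if it terminates abnormally (signal or
 * non-zero exit status), call reinit() in the parent and fork again.
 * Returns the number of restarts on success, or -1 if fork()/waitpid()
 * fails or max_restarts is exceeded.
 */
int supervise(worker_fn worker, reinit_fn reinit, int max_restarts)
{
    int restarts = 0;

    for (;;) {
        int status;
        pid_t pid = fork();

        if (pid < 0)
            return -1;
        if (pid == 0)
            _exit(worker());            /* child: do the real work */

        if (waitpid(pid, &status, 0) < 0)
            return -1;

        if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
            return restarts;            /* clean termination */

        if (restarts == max_restarts)
            return -1;

        reinit();                       /* clean up after the crash */
        restarts++;
    }
}

/* Demo callbacks: the child "crashes" until the parent has re-initialised
 * twice. The counter lives in the parent and is inherited at fork(). */
int demo_reinit_calls;

void demo_reinit(void)
{
    demo_reinit_calls++;
}

int demo_worker(void)
{
    if (demo_reinit_calls < 2)
        raise(SIGKILL);                 /* simulate an abnormal exit */
    return 0;
}
```

Calling supervise(demo_worker, demo_reinit, 5) restarts the child twice before it exits cleanly. Because only the parent allocates and re-initialises the shared structures, a crashed child never leaves them in a state the next child cannot recover from.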

Would that help somehow?
Konstantin


[dpdk-dev] [ovs-discuss] does vswitchd runs multiple threads when i added dpdk devices

2014-07-28 Thread Alex Wang
Hey Srinivas,

Right now, ovs has only one dpdk polling thread.  We are working on
creating multiple polling threads and pinning each polling thread to the
same cpu socket as its dpdk interface.


Thanks,
Alex Wang,


On Mon, Jul 28, 2014 at 9:15 AM, Ben Pfaff  wrote:

> On Mon, Jul 28, 2014 at 07:33:35AM +, Srinivas Reddi wrote:
> > As per my understanding, each dpdk device is polled on a different thread.
> > But in my case vswitchd is running in only a single thread [on core 0];
> > I expected it to run on 3 cores.
> >
> > One thing I want to clarify: does ovs-vswitchd run on a single core only,
> > or on multiple threads, when I add dpdk devices?
> > If vswitchd runs on multiple threads when I add dpdk devices, please let
> > me know how I can run it that way.
>
> It looks like right now OVS has only one dpdk polling thread.