[dpdk-dev] [PATCH] virtio: fix crash if VIRTIO_NET_F_CTRL_VQ is not negotiated

2014-09-12 Thread Ouyang, Changchun
Hi

> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of
> damarion at cisco.com
> Sent: Friday, September 12, 2014 6:25 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH] virtio: fix crash if VIRTIO_NET_F_CTRL_VQ is not
> negotiated
> 
> From: Damjan Marion 
> 
> If VIRTIO_NET_F_CTRL_VQ is not negotiated hw->cvq will be NULL
> 
> Signed-off-by: Damjan Marion 
> ---
>  lib/librte_pmd_virtio/virtio_rxtx.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/librte_pmd_virtio/virtio_rxtx.c
> b/lib/librte_pmd_virtio/virtio_rxtx.c
> index 0b10108..8cb635e 100644
> --- a/lib/librte_pmd_virtio/virtio_rxtx.c
> +++ b/lib/librte_pmd_virtio/virtio_rxtx.c
> @@ -328,8 +328,10 @@ virtio_dev_cq_start(struct rte_eth_dev *dev)
>   struct virtio_hw *hw
>   = VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
> 
> - virtio_dev_vring_start(hw->cvq, VTNET_CQ);
> - VIRTQUEUE_DUMP((struct virtqueue *)hw->cvq);
> + if (hw->cvq) {
> + virtio_dev_vring_start(hw->cvq, VTNET_CQ);
> + VIRTQUEUE_DUMP((struct virtqueue *)hw->cvq);
> + }
>  }
> 
>  void
> --
> 2.1.0

Acked-by: Changchun Ouyang 
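
For readers following the fix, the guard can be sketched outside the driver. This is only an illustration: `toy_vq`, `toy_hw` and the `toy_*` functions are hypothetical stand-ins, not the real virtio PMD structures, and the return value is added here just to make the two paths observable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the real struct virtqueue / struct virtio_hw. */
struct toy_vq {
	int started;
};

struct toy_hw {
	struct toy_vq *cvq;	/* NULL when VIRTIO_NET_F_CTRL_VQ was not negotiated */
};

static void
toy_vring_start(struct toy_vq *vq)
{
	vq->started = 1;
}

/* Start the control queue only if it exists; return 1 if started, 0 if skipped. */
static int
toy_cq_start(struct toy_hw *hw)
{
	if (hw->cvq == NULL)
		return 0;

	toy_vring_start(hw->cvq);
	return 1;
}

/* Usage: the "no control queue" case and the "control queue present" case. */
static void
toy_cq_start_example(void)
{
	struct toy_vq vq = { 0 };
	struct toy_hw without_cq = { NULL };
	struct toy_hw with_cq = { &vq };

	assert(toy_cq_start(&without_cq) == 0);
	assert(toy_cq_start(&with_cq) == 1);
	assert(vq.started == 1);
}
```

Without the NULL check, the first case would dereference a NULL queue pointer, which matches the crash described in the patch.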



[dpdk-dev] [PATCH v3 0/8]Support VxLAN on Fortville

2014-09-12 Thread Jijiang Liu
This patch set adds VxLAN support on Fortville, based on the current mbuf
structure. Once Bruce's new mbuf structure is done, minor follow-up changes
will be needed.

It includes:
 - Support VxLAN packet identification by configuring the tunneling UDP port.
 - Support VxLAN packet filters, which use MAC and VLAN fields to steer packets
   to a queue. The supported filter types are:
   1. Inner MAC and inner VLAN ID
   2. Inner MAC, inner VLAN ID and tenant ID
   3. Inner MAC and tenant ID
   4. Inner MAC
   5. Outer MAC, tenant ID and inner MAC
 - Support VxLAN TX checksum offload, which includes outer L3 (IP), inner L3 (IP)
   and inner L4 (UDP, TCP and SCTP).
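
To make "identification by UDP port" concrete, here is a minimal software sketch of the same decision. It is not what the NIC or this patch set executes; it only assumes the VxLAN header layout from RFC 7348 (8 bytes, "I" flag 0x08 in the first byte, 24-bit VNI in bytes 4 to 6). The function name `vxlan_identify` is made up for illustration, and the VNI corresponds to what the list above calls the tenant ID.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VXLAN_HDR_LEN        8    /* RFC 7348: flags, reserved, VNI, reserved */
#define VXLAN_FLAG_VNI_VALID 0x08 /* "I" bit in the first header byte */

/*
 * Return 0 and store the 24-bit VNI if the UDP destination port (host byte
 * order) equals the configured tunnel port and the header is well formed;
 * return -1 otherwise.
 */
static int
vxlan_identify(uint16_t udp_dst_port, uint16_t tunnel_port,
	       const uint8_t *udp_payload, size_t len, uint32_t *vni)
{
	if (udp_dst_port != tunnel_port)
		return -1;
	if (udp_payload == NULL || vni == NULL || len < VXLAN_HDR_LEN)
		return -1;
	if ((udp_payload[0] & VXLAN_FLAG_VNI_VALID) == 0)
		return -1;

	*vni = ((uint32_t)udp_payload[4] << 16) |
	       ((uint32_t)udp_payload[5] << 8) |
	       (uint32_t)udp_payload[6];
	return 0;
}

/* Usage: the same bytes are VxLAN only on the configured port. */
static void
vxlan_identify_example(void)
{
	const uint8_t hdr[VXLAN_HDR_LEN] = { 0x08, 0, 0, 0, 0x12, 0x34, 0x56, 0 };
	uint32_t vni = 0;

	assert(vxlan_identify(4789, 4789, hdr, sizeof(hdr), &vni) == 0);
	assert(vni == 0x123456);
	assert(vxlan_identify(8472, 4789, hdr, sizeof(hdr), &vni) == -1);
}
```

The point of the configurable port is visible in the last assertion: a packet is only treated as VxLAN when its UDP destination port has been registered.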

Change notes:

 v3)
 * Split the implementation of tunneling UDP port configuration into two
   patches.
 * Introduce a new filter framework in librte_ether. For the implementation
   discussion, please refer to
   http://dpdk.org/ml/archives/dev/2014-September/005179.html; the VxLAN
   tunnel filter implementation is based on it.

jijiangl (8):
  support VxLAN packet identification in librte_ether
  support VxLAN packet identification in librte_pmd_i40e
  test vxlan packet identification
  add new filter framework in librte_ether.
  implement API of VxLAN tunnel filter in librte_pmd_i40e
  test VxLAN packet filter API
  support VxLAN Tx checksum offload
  test VxLAN Tx checksum offload

 app/test-pmd/cmdline.c|  229 -
 app/test-pmd/config.c |6 +-
 app/test-pmd/csumonly.c   |  199 +--
 app/test-pmd/parameters.c |   13 ++
 app/test-pmd/rxonly.c |   49 +
 app/test-pmd/testpmd.c|8 +
 app/test-pmd/testpmd.h|4 +
 config/common_linuxapp|5 +
 lib/librte_ether/Makefile |1 +
 lib/librte_ether/rte_eth_ctrl.h   |  152 ++
 lib/librte_ether/rte_ethdev.c |   95 +
 lib/librte_ether/rte_ethdev.h |  108 ++
 lib/librte_ether/rte_ether.h  |8 +
 lib/librte_mbuf/rte_mbuf.h|4 +
 lib/librte_pmd_i40e/i40e_ethdev.c |  405 -
 lib/librte_pmd_i40e/i40e_ethdev.h |5 +
 lib/librte_pmd_i40e/i40e_rxtx.c   |   58 +-
 17 files changed, 1324 insertions(+), 25 deletions(-)
 create mode 100644 lib/librte_ether/rte_eth_ctrl.h

-- 
1.7.7.6



[dpdk-dev] [PATCH v3 1/8]i40e:support VxLAN packet identification in librte_ether

2014-09-12 Thread Jijiang Liu
Add data structures and APIs in librte_ether to support tunneling UDP port
configuration on i40e.
Currently only VxLAN is implemented, which includes:
 -  VxLAN UDP port initialization
 -  APIs to configure the VxLAN UDP port
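
The API wrappers in this patch follow a validate-then-dispatch pattern: check the port, check every table entry, then call the driver callback if one is registered. A standalone sketch of that pattern follows. All `toy_*` names are hypothetical and the device table is passed explicitly, so this is an illustration of the control flow rather than the librte_ether code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical miniature of the tunnel type enum and tunnel entry. */
enum toy_tunnel_type { TOY_TUNNEL_NONE = 0, TOY_TUNNEL_VXLAN, TOY_TUNNEL_MAX };

struct toy_udp_tunnel {
	uint16_t udp_port;
	uint8_t prot_type;
};

/* Driver callback; NULL means the driver does not implement the operation. */
typedef int (*toy_udp_tunnel_add_t)(const struct toy_udp_tunnel *t,
				    uint8_t count);

struct toy_dev {
	toy_udp_tunnel_add_t udp_tunnel_add;
};

static int
toy_udp_tunnel_add(struct toy_dev *devs, uint8_t nb_devs, uint8_t port_id,
		   const struct toy_udp_tunnel *tunnels, uint8_t count)
{
	uint8_t i;

	if (port_id >= nb_devs)
		return -ENODEV;
	if (tunnels == NULL)
		return -EINVAL;
	/* Reject the whole request if any entry has an unknown type. */
	for (i = 0; i < count; i++)
		if (tunnels[i].prot_type >= TOY_TUNNEL_MAX)
			return -EINVAL;
	if (devs[port_id].udp_tunnel_add == NULL)
		return -ENOTSUP;
	return devs[port_id].udp_tunnel_add(tunnels, count);
}

/* Toy driver: pretend every entry was programmed and report the count. */
static int
toy_driver_add(const struct toy_udp_tunnel *t, uint8_t count)
{
	(void)t;
	return count;
}

static void
toy_udp_tunnel_add_example(void)
{
	struct toy_dev devs[2] = { { toy_driver_add }, { NULL } };
	struct toy_udp_tunnel ok = { 4789, TOY_TUNNEL_VXLAN };
	struct toy_udp_tunnel bad = { 4789, TOY_TUNNEL_MAX };

	assert(toy_udp_tunnel_add(devs, 2, 0, &ok, 1) == 1);
	assert(toy_udp_tunnel_add(devs, 2, 0, &bad, 1) == -EINVAL);
	assert(toy_udp_tunnel_add(devs, 2, 1, &ok, 1) == -ENOTSUP);
	assert(toy_udp_tunnel_add(devs, 2, 2, &ok, 1) == -ENODEV);
}
```

Validating all entries before touching the driver means a bad entry in the middle of the table cannot leave the hardware half configured.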

Signed-off-by: Jijiang Liu 
Acked-by: Helin Zhang 
Acked-by: Jingjing Wu 
Acked-by: Jing Chen 

---
 lib/librte_ether/rte_ethdev.c |   63 ++
 lib/librte_ether/rte_ethdev.h |   76 +
 lib/librte_ether/rte_ether.h  |8 
 3 files changed, 147 insertions(+), 0 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index fd1010a..325edb1 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -1892,6 +1892,69 @@ rte_eth_dev_rss_hash_conf_get(uint8_t port_id,
 }

 int
+rte_eth_dev_udp_tunnel_add(uint8_t port_id,
+  struct rte_eth_udp_tunnel *udp_tunnel,
+  uint8_t count)
+{
+   uint8_t i;
+   struct rte_eth_dev *dev;
+   struct rte_eth_udp_tunnel *tunnel;
+
+   if (port_id >= nb_ports) {
+   PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
+   return -ENODEV;
+   }
+
+   if (udp_tunnel == NULL) {
+   PMD_DEBUG_TRACE("Invalid udp_tunnel parameter\n");
+   return -EINVAL;
+   }
+   tunnel = udp_tunnel;
+
+   for (i = 0; i < count; i++, tunnel++) {
+   if (tunnel->prot_type >= RTE_TUNNEL_TYPE_MAX) {
+   PMD_DEBUG_TRACE("Invalid tunnel type\n");
+   return -EINVAL;
+   }
+   }
+
+   dev = &rte_eth_devices[port_id];
+   FUNC_PTR_OR_ERR_RET(*dev->dev_ops->udp_tunnel_add, -ENOTSUP);
+   return (*dev->dev_ops->udp_tunnel_add)(dev, udp_tunnel, count);
+}
+
+int
+rte_eth_dev_udp_tunnel_delete(uint8_t port_id,
+ struct rte_eth_udp_tunnel *udp_tunnel,
+ uint8_t count)
+{
+   uint8_t i;
+   struct rte_eth_dev *dev;
+   struct rte_eth_udp_tunnel *tunnel;
+
+   if (port_id >= nb_ports) {
+   PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
+   return -ENODEV;
+   }
+   dev = &rte_eth_devices[port_id];
+
+   if (udp_tunnel == NULL) {
+   PMD_DEBUG_TRACE("Invalid udp_tunnel parameter\n");
+   return -EINVAL;
+   }
+   tunnel = udp_tunnel;
+   for (i = 0; i < count; i++, tunnel++) {
+   if (tunnel->prot_type >= RTE_TUNNEL_TYPE_MAX) {
+   PMD_DEBUG_TRACE("Invalid tunnel type\n");
+   return -EINVAL;
+   }
+   }
+
+   FUNC_PTR_OR_ERR_RET(*dev->dev_ops->udp_tunnel_del, -ENOTSUP);
+   return (*dev->dev_ops->udp_tunnel_del)(dev, udp_tunnel, count);
+}
+
+int
 rte_eth_led_on(uint8_t port_id)
 {
struct rte_eth_dev *dev;
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 50df654..74ac313 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -708,6 +708,26 @@ struct rte_fdir_conf {
 };

 /**
+ * UDP tunneling configuration.
+ */
+struct rte_eth_udp_tunnel {
+   uint16_t udp_port;
+   uint8_t prot_type;
+};
+
+/**
+ * Tunneled type.
+ */
+enum rte_eth_tunnel_type {
+   RTE_TUNNEL_TYPE_NONE = 0,
+   RTE_TUNNEL_TYPE_VXLAN,
+   RTE_TUNNEL_TYPE_GENEVE,
+   RTE_TUNNEL_TYPE_TEREDO,
+   RTE_TUNNEL_TYPE_NVGRE,
+   RTE_TUNNEL_TYPE_MAX,
+};
+
+/**
  *  Possible l4type of FDIR filters.
  */
 enum rte_l4type {
@@ -829,6 +849,7 @@ struct rte_intr_conf {
  * configuration settings may be needed.
  */
 struct rte_eth_conf {
+   enum rte_eth_tunnel_type tunnel_type;
uint16_t link_speed;
/**< ETH_LINK_SPEED_10[0|00|000], or 0 for autonegotation */
uint16_t link_duplex;
@@ -1240,6 +1261,17 @@ typedef int (*eth_mirror_rule_reset_t)(struct 
rte_eth_dev *dev,
  uint8_t rule_id);
 /**< @internal Remove a traffic mirroring rule on an Ethernet device */

+typedef int (*eth_udp_tunnel_add_t)(struct rte_eth_dev *dev,
+   struct rte_eth_udp_tunnel *tunnel_udp,
+   uint8_t count);
+/**< @internal Add tunneling UDP info */
+
+typedef int (*eth_udp_tunnel_del_t)(struct rte_eth_dev *dev,
+   struct rte_eth_udp_tunnel *tunnel_udp,
+   uint8_t count);
+/**< @internal Delete tunneling UDP info */
+
+
 #ifdef RTE_NIC_BYPASS

 enum {
@@ -1412,6 +1444,8 @@ struct eth_dev_ops {
    eth_set_vf_rx_t            set_vf_rx;  /**< enable/disable a VF receive */
    eth_set_vf_tx_t            set_vf_tx;  /**< enable/disable a VF transmit */
    eth_set_vf_vlan_filter_t   set_vf_vlan_filter;  /**< Set VF VLAN filter */
+   eth_udp_tunnel_add_t   udp_tunnel_add;
+   eth_udp_tunnel_del_t  

[dpdk-dev] [PATCH v3 2/8]i40e:support VxLAN packet identification in librte_pmd_i40e

2014-09-12 Thread Jijiang Liu
Support tunneling UDP port configuration on i40e in librte_pmd_i40e.
Currently only VxLAN is implemented, which includes:
 -  VxLAN UDP port initialization
 -  Implementation of the APIs to configure the VxLAN UDP port in librte_pmd_i40e
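
The diff below tracks configured ports with a bitmap (`vxlan_bitmap`) next to a port array (`vxlan_ports`). How the add and delete callbacks use them is not visible in this archived copy, so the following is only a guess at the general technique: a bit per slot marks the slot as used. The table size, the `toy_*` names and the error codes are assumptions made for the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define TOY_MAX_UDP_PORTS 16	/* hypothetical table size */

struct toy_pf {
	uint16_t vxlan_bitmap;				/* bit i set: slot i in use */
	uint16_t vxlan_ports[TOY_MAX_UDP_PORTS];
};

/* Return the slot holding 'port', or -1 if it is not configured. */
static int
toy_find_port(const struct toy_pf *pf, uint16_t port)
{
	int i;

	for (i = 0; i < TOY_MAX_UDP_PORTS; i++)
		if ((pf->vxlan_bitmap & (1u << i)) && pf->vxlan_ports[i] == port)
			return i;
	return -1;
}

/* Store 'port' in the first free slot; return the slot or a negative errno. */
static int
toy_add_port(struct toy_pf *pf, uint16_t port)
{
	int i;

	if (toy_find_port(pf, port) >= 0)
		return -EEXIST;
	for (i = 0; i < TOY_MAX_UDP_PORTS; i++) {
		if ((pf->vxlan_bitmap & (1u << i)) == 0) {
			pf->vxlan_ports[i] = port;
			pf->vxlan_bitmap |= (uint16_t)(1u << i);
			return i;
		}
	}
	return -ENOSPC;
}

/* Free the slot holding 'port'; return 0 or -ENOENT. */
static int
toy_del_port(struct toy_pf *pf, uint16_t port)
{
	int i = toy_find_port(pf, port);

	if (i < 0)
		return -ENOENT;
	pf->vxlan_bitmap &= (uint16_t)~(1u << i);
	pf->vxlan_ports[i] = 0;
	return 0;
}

static void
toy_port_table_example(void)
{
	struct toy_pf pf = { 0, { 0 } };

	assert(toy_add_port(&pf, 4789) == 0);
	assert(toy_add_port(&pf, 4789) == -EEXIST);
	assert(toy_add_port(&pf, 4790) == 1);
	assert(toy_del_port(&pf, 4789) == 0);
	assert(toy_add_port(&pf, 8472) == 0);	/* freed slot is reused */
	assert(toy_del_port(&pf, 1234) == -ENOENT);
}
```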

Signed-off-by: Jijiang Liu 
Acked-by: Helin Zhang 
Acked-by: Jingjing Wu 
Acked-by: Jing Chen 

---
 config/common_linuxapp|5 +
 lib/librte_mbuf/rte_mbuf.h|2 +
 lib/librte_pmd_i40e/i40e_ethdev.c |  200 -
 lib/librte_pmd_i40e/i40e_ethdev.h |5 +
 lib/librte_pmd_i40e/i40e_rxtx.c   |   11 ++
 5 files changed, 222 insertions(+), 1 deletions(-)

diff --git a/config/common_linuxapp b/config/common_linuxapp
index 9047975..b5ecf15 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -212,6 +212,11 @@ CONFIG_RTE_LIBRTE_I40E_QUEUE_NUM_PER_VF=4
 CONFIG_RTE_LIBRTE_I40E_ITR_INTERVAL=-1

 #
+# Compile tunneling UDP port support
+#
+CONFIG_RTE_LIBRTE_TUNNEL_UDP_PORT=4789
+
+#
 # Compile burst-oriented VIRTIO PMD driver
 #
 CONFIG_RTE_LIBRTE_VIRTIO_PMD=y
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 2735f37..1832e73 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -594,6 +594,7 @@ static inline void rte_pktmbuf_reset(struct rte_mbuf *m)
m->pkt.in_port = 0xff;

m->ol_flags = 0;
+   m->reserved = 0;
buf_ofs = (RTE_PKTMBUF_HEADROOM <= m->buf_len) ?
RTE_PKTMBUF_HEADROOM : m->buf_len;
m->pkt.data = (char*) m->buf_addr + buf_ofs;
@@ -658,6 +659,7 @@ static inline void rte_pktmbuf_attach(struct rte_mbuf *mi, 
struct rte_mbuf *md)
mi->pkt.pkt_len = mi->pkt.data_len;
mi->pkt.nb_segs = 1;
mi->ol_flags = md->ol_flags;
+   mi->reserved = md->reserved;

__rte_mbuf_sanity_check(mi, RTE_MBUF_PKT, 1);
__rte_mbuf_sanity_check(md, RTE_MBUF_PKT, 0);
diff --git a/lib/librte_pmd_i40e/i40e_ethdev.c 
b/lib/librte_pmd_i40e/i40e_ethdev.c
index 4e65ca4..4234073 100644
--- a/lib/librte_pmd_i40e/i40e_ethdev.c
+++ b/lib/librte_pmd_i40e/i40e_ethdev.c
@@ -189,7 +189,7 @@ static int i40e_res_pool_alloc(struct i40e_res_pool_info 
*pool,
 static int i40e_dev_init_vlan(struct rte_eth_dev *dev);
 static int i40e_veb_release(struct i40e_veb *veb);
 static struct i40e_veb *i40e_veb_setup(struct i40e_pf *pf,
-   struct i40e_vsi *vsi);
+   struct i40e_vsi *vsi);
 static int i40e_pf_config_mq_rx(struct i40e_pf *pf);
 static int i40e_vsi_config_double_vlan(struct i40e_vsi *vsi, int on);
 static inline int i40e_find_all_vlan_for_mac(struct i40e_vsi *vsi,
@@ -205,6 +205,14 @@ static int i40e_dev_rss_hash_update(struct rte_eth_dev 
*dev,
struct rte_eth_rss_conf *rss_conf);
 static int i40e_dev_rss_hash_conf_get(struct rte_eth_dev *dev,
  struct rte_eth_rss_conf *rss_conf);
+static int i40e_dev_udp_tunnel_add(struct rte_eth_dev *dev,
+  struct rte_eth_udp_tunnel *udp_tunnel,
+  uint8_t count);
+static int i40e_dev_udp_tunnel_del(struct rte_eth_dev *dev,
+  struct rte_eth_udp_tunnel *udp_tunnel,
+  uint8_t count);
+static int i40e_pf_config_vxlan(struct i40e_pf *pf);
+

 /* Default hash key buffer for RSS */
 static uint32_t rss_key_default[I40E_PFQF_HKEY_MAX_INDEX + 1];
@@ -256,6 +264,8 @@ static struct eth_dev_ops i40e_eth_dev_ops = {
.reta_query   = i40e_dev_rss_reta_query,
.rss_hash_update  = i40e_dev_rss_hash_update,
.rss_hash_conf_get= i40e_dev_rss_hash_conf_get,
+   .udp_tunnel_add   = i40e_dev_udp_tunnel_add,
+   .udp_tunnel_del   = i40e_dev_udp_tunnel_del,
 };

 static struct eth_driver rte_i40e_pmd = {
@@ -2529,6 +2539,34 @@ i40e_vsi_dump_bw_config(struct i40e_vsi *vsi)
return 0;
 }

+static int
+i40e_vxlan_filters_init(struct i40e_pf *pf)
+{
+   uint8_t filter_index;
+   int ret = 0;
+   struct i40e_hw *hw = I40E_PF_TO_HW(pf);
+
+   if (!(pf->flags & I40E_FLAG_VXLAN))
+   return 0;
+
+   /* Init first entry in tunneling UDP table */
+   ret = i40e_aq_add_udp_tunnel(hw, RTE_LIBRTE_TUNNEL_UDP_PORT,
+   I40E_AQC_TUNNEL_TYPE_VXLAN,
+   &filter_index, NULL);
+   if (ret < 0) {
+   PMD_DRV_LOG(ERR, "Failed to add UDP tunnel port %d "
+   "with index=%d\n", RTE_VXLAN_UDP_PORT,
+filter_index);
+   } else {
+   pf->vxlan_bitmap |= 1;
+   pf->vxlan_ports[0] = RTE_LIBRTE_TUNNEL_UDP_PORT;
+   PMD_DRV_LOG(INFO, "Added UDP tunnel port %d with "
+   "index=%d\n", RTE_VXLAN_UDP_PORT, filter_index);
+   }
+
+   return ret

[dpdk-dev] [PATCH v3 3/8]app/test-pmd:test VxLAN packet identification

2014-09-12 Thread Jijiang Liu
Add commands to test VxLAN packet identification:
 - commands to add/delete a VxLAN UDP port.
 - rxonly mode to receive VxLAN packets.


Signed-off-by: Jijiang Liu 
Acked-by: Helin Zhang 
Acked-by: Jingjing Wu 
Acked-by: Jing Chen 

---
 app/test-pmd/cmdline.c|   78 ++--
 app/test-pmd/parameters.c |   13 +++
 app/test-pmd/rxonly.c |   49 
 app/test-pmd/testpmd.c|8 +
 app/test-pmd/testpmd.h|4 ++
 5 files changed, 148 insertions(+), 4 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index b04a4e8..de910db 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -285,6 +285,12 @@ static void cmd_help_long_parsed(void *parsed_result,
"Set the outer VLAN TPID for Packet Filtering on"
" a port\n\n"

+   "rx_vxlan_port add (udp_port) (port_id)\n"
+   "Add an UDP port for VxLAN packet filter on a 
port\n\n"
+
+   "rx_vxlan_port rm (udp_port) (port_id)\n"
+   "Remove an UDP port for VxLAN packet filter on a 
port\n\n"
+
"tx_vlan set vlan_id (port_id)\n"
"Set hardware insertion of VLAN ID in packets sent"
" on a port.\n\n"
@@ -296,13 +302,17 @@ static void cmd_help_long_parsed(void *parsed_result,
"Disable hardware insertion of a VLAN header in"
" packets sent on a port.\n\n"

-   "tx_checksum set mask (port_id)\n"
+   "tx_checksum set (mask) (port_id)\n"
"Enable hardware insertion of checksum offload with"
-   " the 4-bit mask, 0~0xf, in packets sent on a port.\n"
+   " the 8-bit mask, 0~0xff, in packets sent on a port.\n"
"bit 0 - insert ip   checksum offload if set\n"
"bit 1 - insert udp  checksum offload if set\n"
"bit 2 - insert tcp  checksum offload if set\n"
"bit 3 - insert sctp checksum offload if set\n"
+   "bit 4 - insert inner ip  checksum offload if 
set\n"
+   "bit 5 - insert inner udp checksum offload if 
set\n"
+   "bit 6 - insert inner tcp checksum offload if 
set\n"
+   "bit 7 - insert inner sctp checksum offload if 
set\n"
"Please check the NIC datasheet for HW limits.\n\n"

"set fwd (%s)\n"
@@ -2745,8 +2755,9 @@ cmdline_parse_inst_t cmd_tx_cksum_set = {
.f = cmd_tx_cksum_set_parsed,
.data = NULL,
.help_str = "enable hardware insertion of L3/L4checksum with a given "
-   "mask in packets sent on a port, the bit mapping is given as, Bit 0 for 
ip"
-   "Bit 1 for UDP, Bit 2 for TCP, Bit 3 for SCTP",
+   "mask in packets sent on a port, the bit mapping is given as, Bit 0 for 
ip "
+   "Bit 1 for UDP, Bit 2 for TCP, Bit 3 for SCTP, Bit 4 for inner ip "
+   "Bit 5 for inner UDP, Bit 6 for inner TCP, Bit 7 for inner SCTP",
.tokens = {
(void *)&cmd_tx_cksum_set_tx_cksum,
(void *)&cmd_tx_cksum_set_set,
@@ -6211,6 +6222,64 @@ cmdline_parse_inst_t cmd_vf_rate_limit = {
},
 };

+/* *** CONFIGURE TUNNEL UDP PORT *** */
+struct cmd_tunnel_udp_config {
+   cmdline_fixed_string_t cmd;
+   cmdline_fixed_string_t what;
+   uint16_t udp_port;
+   uint8_t port_id;
+};
+
+static void
+cmd_tunnel_udp_config_parsed(void *parsed_result,
+ __attribute__((unused)) struct cmdline *cl,
+ __attribute__((unused)) void *data)
+{
+   struct cmd_tunnel_udp_config *res = parsed_result;
+   struct rte_eth_udp_tunnel tunnel_udp;
+   int ret;
+
+   tunnel_udp.udp_port = res->udp_port;
+
+   if (!strcmp(res->cmd, "rx_vxlan_port"))
+   tunnel_udp.prot_type = RTE_TUNNEL_TYPE_VXLAN;
+
+   if (!strcmp(res->what, "add"))
+   ret = rte_eth_dev_udp_tunnel_add(res->port_id, &tunnel_udp, 1);
+   else
+   ret = rte_eth_dev_udp_tunnel_delete(res->port_id, &tunnel_udp, 
1);
+
+   if (ret < 0)
+   printf("udp tunneling add error: (%s)\n", strerror(-ret));
+}
+
+cmdline_parse_token_string_t cmd_tunnel_udp_config_cmd =
+   TOKEN_STRING_INITIALIZER(struct cmd_tunnel_udp_config,
+   cmd, "rx_vxlan_port");
+cmdline_parse_token_string_t cmd_tunnel_udp_config_what =
+   TOKEN_STRING_INITIALIZER(struct cmd_tunnel_udp_config,
+   what, "add#rm");
+cmdline_parse_token_num_t cmd_tunnel_udp_config_udp_port =
+   TOKEN_NUM_INITIALIZER(struct cmd_tunnel_u

[dpdk-dev] [PATCH v3 4/8]librte_ether:add a common filter API

2014-09-12 Thread Jijiang Liu
Introduce a new filter framework in librte_ether. For the implementation
discussion, please refer to
http://dpdk.org/ml/archives/dev/2014-September/005179.html; the VxLAN tunnel
filter implementation is based on it.
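
The idea of the framework is a single control entry point taking a filter type, an operation and an opaque argument, with the "none" operation acting as a support query. A reduced sketch of that dispatch follows; the `toy_*` enums, the counter and the error codes are hypothetical and much smaller than the definitions in rte_eth_ctrl.h.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

enum toy_filter_type { TOY_FILTER_NONE = 0, TOY_FILTER_TUNNEL, TOY_FILTER_MAX };
enum toy_filter_op { TOY_OP_NONE = 0, TOY_OP_ADD, TOY_OP_DELETE, TOY_OP_MAX };

struct toy_tunnel_filter {
	uint32_t tenant_id;
	uint16_t queue_id;
};

static int toy_tunnel_filter_count;	/* number of installed toy filters */

/* Per-type handler: TOY_OP_NONE only answers "is this type supported?". */
static int
toy_tunnel_filter_handle(enum toy_filter_op op, void *arg)
{
	if (op == TOY_OP_NONE)
		return 0;
	if (arg == NULL)
		return -EINVAL;

	if (op == TOY_OP_ADD) {
		toy_tunnel_filter_count++;
		return 0;
	}
	if (op == TOY_OP_DELETE) {
		if (toy_tunnel_filter_count == 0)
			return -ENOENT;
		toy_tunnel_filter_count--;
		return 0;
	}
	return -ENOTSUP;
}

/* Single entry point: route by filter type, let the handler decode 'arg'. */
static int
toy_filter_ctrl(enum toy_filter_type type, enum toy_filter_op op, void *arg)
{
	switch (type) {
	case TOY_FILTER_TUNNEL:
		return toy_tunnel_filter_handle(op, arg);
	default:
		return -ENOTSUP;
	}
}

static void
toy_filter_ctrl_example(void)
{
	struct toy_tunnel_filter f = { 100, 1 };

	assert(toy_filter_ctrl(TOY_FILTER_TUNNEL, TOY_OP_NONE, NULL) == 0);
	assert(toy_filter_ctrl(TOY_FILTER_TUNNEL, TOY_OP_ADD, &f) == 0);
	assert(toy_filter_ctrl(TOY_FILTER_TUNNEL, TOY_OP_DELETE, &f) == 0);
	assert(toy_filter_ctrl(TOY_FILTER_TUNNEL, TOY_OP_DELETE, &f) == -ENOENT);
	assert(toy_filter_ctrl(TOY_FILTER_NONE, TOY_OP_ADD, &f) == -ENOTSUP);
}
```

The trade-off of this design is that one driver callback covers all filter kinds, at the cost of a `void *` argument whose real type depends on the filter type.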

Signed-off-by: Jijiang Liu 
Acked-by: Helin Zhang 
Acked-by: Jingjing Wu 

---
 lib/librte_ether/Makefile   |1 +
 lib/librte_ether/rte_eth_ctrl.h |  152 +++
 lib/librte_ether/rte_ethdev.c   |   32 
 lib/librte_ether/rte_ethdev.h   |   56 +++---
 4 files changed, 229 insertions(+), 12 deletions(-)
 create mode 100644 lib/librte_ether/rte_eth_ctrl.h

diff --git a/lib/librte_ether/Makefile b/lib/librte_ether/Makefile
index b310f8b..a461c31 100644
--- a/lib/librte_ether/Makefile
+++ b/lib/librte_ether/Makefile
@@ -46,6 +46,7 @@ SRCS-y += rte_ethdev.c
 #
 SYMLINK-y-include += rte_ether.h
 SYMLINK-y-include += rte_ethdev.h
+SYMLINK-y-include += rte_eth_ctrl.h

 # this lib depends upon:
 DEPDIRS-y += lib/librte_eal lib/librte_mempool lib/librte_ring lib/librte_mbuf
diff --git a/lib/librte_ether/rte_eth_ctrl.h b/lib/librte_ether/rte_eth_ctrl.h
new file mode 100644
index 0000000..3e2bf45
--- /dev/null
+++ b/lib/librte_ether/rte_eth_ctrl.h
@@ -0,0 +1,152 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ *   notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ *   notice, this list of conditions and the following disclaimer in
+ *   the documentation and/or other materials provided with the
+ *   distribution.
+ * * Neither the name of Intel Corporation nor the names of its
+ *   contributors may be used to endorse or promote products derived
+ *   from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_ETH_CTRL_H_
+#define _RTE_ETH_CTRL_H_
+
+/**
+ * @file
+ *
+ * Ethernet device features and related data structures used
+ * by control APIs should be defined in this file.
+ *
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/**
+ * Feature filter types
+ */
+enum rte_filter_type {
+   RTE_ETH_FILTER_NONE = 0,
+   RTE_ETH_FILTER_RSS,
+   RTE_ETH_FILTER_FDIR,
+   RTE_ETH_FILTER_TUNNEL,
+   RTE_ETH_FILTER_MAX,
+};
+
+/**
+ * all generic operations to filters
+ */
+enum rte_filter_op {
+   RTE_ETH_FILTER_OP_NONE = 0,
+   /** used to check whether the type filter is supported */
+   RTE_ETH_FILTER_OP_ADD,  /**< add filter entry */
+   RTE_ETH_FILTER_OP_UPDATE,   /**< update filter entry */
+   RTE_ETH_FILTER_OP_DELETE,   /**< delete filter entry */
+   RTE_ETH_FILTER_OP_GET,  /**< get filter entry */
+   RTE_ETH_FILTER_OP_SET,  /**< configurations */
+   RTE_ETH_FILTER_OP_GET_INFO,
+   /** get information of filter, such as status or statistics */
+   RTE_ETH_FILTER_OP_MAX,
+};
+
+/* *** TUNNEL FILTER DATA DEFINITION *** */
+
+#define ETH_TUNNEL_FILTER_OMAC  0x01
+#define ETH_TUNNEL_FILTER_OIP   0x02
+#define ETH_TUNNEL_FILTER_TENID 0x04
+#define ETH_TUNNEL_FILTER_IMAC  0x08
+#define ETH_TUNNEL_FILTER_IVLAN 0x10
+#define ETH_TUNNEL_FILTER_IIP   0x20
+
+#define RTE_TUNNEL_FLAGS_TO_QUEUE 1
+
+/*
+ * Tunneled filter type
+ */
+enum rte_tunnel_filter_type {
+   RTE_TUNNEL_FILTER_TYPE_NONE = 0,
+   RTE_TUNNEL_FILTER_OIP = ETH_TUNNEL_FILTER_OIP,
+   RTE_TUNNEL_FILTER_IMAC_IVLAN =
+   ETH_TUNNEL_FILTER_IMAC | ETH_TUNNEL_FILTER_IVLAN,
+   RTE_TUNNEL_FILTER_IMAC_IVLAN_TENID =
+   ETH_TUNNEL_FILTER_IMAC | ETH_TUNNEL_FILTER_IVLAN |
+   ETH_TUNNEL_FILTER_TENID,
+   RTE_TUNNEL_FILTER_IMAC_TENID =
+   ETH_TUNNEL_FILTER_IMAC | ETH_TUNNEL_FILTER_TENID,
+   RTE_TUNNEL_FILTER_IMAC = ETH_TUNNEL_FILTER_IMAC,
+   RTE_TUNNEL_FILTER_OMAC_TENID_IMAC =
+ 

[dpdk-dev] [PATCH v3 6/8]app/testpmd:test VxLAN packet filter API

2014-09-12 Thread Jijiang Liu
Add tunnel_filter command in testpmd to test VxLAN packet filter API.
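
The command handler maps the filter_type string to a flag combination with a chain of strcmp() calls. The same mapping can be written as a lookup table, sketched below. The flag values are the ones defined in rte_eth_ctrl.h in patch 4/8; the table, the struct and `filter_type_from_name` are made up for this illustration, and the "omac-imac-tenid" entry is left out because its definition is cut off in the archived copy of that patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Flag values as defined in rte_eth_ctrl.h by patch 4/8. */
#define ETH_TUNNEL_FILTER_TENID 0x04
#define ETH_TUNNEL_FILTER_IMAC  0x08
#define ETH_TUNNEL_FILTER_IVLAN 0x10

/* Hypothetical name-to-flags entry for the sketch. */
struct filter_name {
	const char *name;
	int type;
};

static const struct filter_name filter_names[] = {
	{ "imac-ivlan", ETH_TUNNEL_FILTER_IMAC | ETH_TUNNEL_FILTER_IVLAN },
	{ "imac-ivlan-tenid", ETH_TUNNEL_FILTER_IMAC | ETH_TUNNEL_FILTER_IVLAN |
			      ETH_TUNNEL_FILTER_TENID },
	{ "imac-tenid", ETH_TUNNEL_FILTER_IMAC | ETH_TUNNEL_FILTER_TENID },
	{ "imac", ETH_TUNNEL_FILTER_IMAC },
};

/* Return the flag combination for 'name', or -1 if the name is unknown. */
static int
filter_type_from_name(const char *name)
{
	size_t i;

	if (name == NULL)
		return -1;
	for (i = 0; i < sizeof(filter_names) / sizeof(filter_names[0]); i++)
		if (strcmp(name, filter_names[i].name) == 0)
			return filter_names[i].type;
	return -1;
}

static void
filter_type_from_name_example(void)
{
	assert(filter_type_from_name("imac-ivlan") == 0x18);
	assert(filter_type_from_name("imac-ivlan-tenid") == 0x1c);
	assert(filter_type_from_name("imac-tenid") == 0x0c);
	assert(filter_type_from_name("imac") == 0x08);
	assert(filter_type_from_name("bogus") == -1);
}
```

With a table, the accepted names can also be printed in the help text from the same source, so the two cannot drift apart.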

Signed-off-by: Jijiang Liu 
Acked-by: Helin Zhang 
Acked-by: Jingjing Wu 
Acked-by: Jing Chen 

---
 app/test-pmd/cmdline.c |  153 +++-
 1 files changed, 152 insertions(+), 1 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index de910db..71671b7 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -285,6 +285,14 @@ static void cmd_help_long_parsed(void *parsed_result,
"Set the outer VLAN TPID for Packet Filtering on"
" a port\n\n"

+   "tunnel_filter add (port_id) (outer_mac) (inner_mac) 
(ip_addr) "
+   "(inner_vlan) (tunnel_type) (filter_type) (tenant_id) 
(queue_id)\n"
+   "   add a tunnel filter of a port.\n\n"
+
+   "tunnel_filter rm (port_id) (outer_mac) (inner_mac) 
(ip_addr) "
+   "(inner_vlan) (tunnel_type) (filter_type) (tenant_id) 
(queue_id)\n"
+   "   remove a tunnel filter of a port.\n\n"
+
"rx_vxlan_port add (udp_port) (port_id)\n"
"Add an UDP port for VxLAN packet filter on a 
port\n\n"

@@ -6222,6 +6230,148 @@ cmdline_parse_inst_t cmd_vf_rate_limit = {
},
 };

+/* *** ADD TUNNEL FILTER OF A PORT *** */
+struct cmd_tunnel_filter_result {
+   cmdline_fixed_string_t cmd;
+   cmdline_fixed_string_t what;
+   uint8_t port_id;
+   struct ether_addr outer_mac;
+   struct ether_addr inner_mac;
+   cmdline_ipaddr_t ip_value;
+   uint16_t inner_vlan;
+   cmdline_fixed_string_t tunnel_type;
+   cmdline_fixed_string_t filter_type;
+   uint32_t tenant_id;
+   uint16_t queue_num;
+};
+
+static void
+cmd_tunnel_filter_parsed(void *parsed_result,
+ __attribute__((unused)) struct cmdline *cl,
+ __attribute__((unused)) void *data)
+{
+   struct cmd_tunnel_filter_result *res = parsed_result;
+   struct rte_eth_tunnel_filter_conf tunnel_filter_conf;
+   int ret = 0;
+
+   tunnel_filter_conf.outer_mac = &res->outer_mac;
+   tunnel_filter_conf.inner_mac = &res->inner_mac;
+   tunnel_filter_conf.inner_vlan = res->inner_vlan;
+
+   if (res->ip_value.family == AF_INET) {
+   tunnel_filter_conf.ip_addr.ipv4_addr =
+   res->ip_value.addr.ipv4.s_addr;
+   tunnel_filter_conf.ip_type = RTE_TUNNEL_IPTYPE_IPV4;
+   } else {
+   memcpy(&(tunnel_filter_conf.ip_addr.ipv6_addr),
+   &(res->ip_value.addr.ipv6),
+   sizeof(struct in6_addr));
+   tunnel_filter_conf.ip_type = RTE_TUNNEL_IPTYPE_IPV6;
+   }
+
+   if (!strcmp(res->filter_type, "imac-ivlan"))
+   tunnel_filter_conf.filter_type = RTE_TUNNEL_FILTER_IMAC_IVLAN;
+   else if (!strcmp(res->filter_type, "imac-ivlan-tenid"))
+   tunnel_filter_conf.filter_type =
+   RTE_TUNNEL_FILTER_IMAC_IVLAN_TENID;
+   else if (!strcmp(res->filter_type, "imac-tenid"))
+   tunnel_filter_conf.filter_type = RTE_TUNNEL_FILTER_IMAC_TENID;
+   else if (!strcmp(res->filter_type, "imac"))
+   tunnel_filter_conf.filter_type = RTE_TUNNEL_FILTER_IMAC;
+   else if (!strcmp(res->filter_type, "omac-imac-tenid"))
+   tunnel_filter_conf.filter_type =
+   RTE_TUNNEL_FILTER_OMAC_TENID_IMAC;
+   else {
+   printf("The filter type is not supported.\n");
+   return;
+   }
+
+   tunnel_filter_conf.to_queue = RTE_TUNNEL_FLAGS_TO_QUEUE;
+
+   if (!strcmp(res->tunnel_type, "vxlan"))
+   tunnel_filter_conf.tunnel_type = RTE_TUNNEL_TYPE_VXLAN;
+   else {
+   printf("Only VxLAN is supported now.\n");
+   return;
+   }
+
+   tunnel_filter_conf.tenant_id = res->tenant_id;
+   tunnel_filter_conf.queue_id = res->queue_num;
+   if (!strcmp(res->what, "add"))
+   ret = rte_eth_dev_filter_ctrl(res->port_id,
+   RTE_ETH_FILTER_TUNNEL,
+   RTE_ETH_FILTER_OP_ADD,
+   &tunnel_filter_conf);
+   else
+   ret = rte_eth_dev_filter_ctrl(res->port_id,
+   RTE_ETH_FILTER_TUNNEL,
+   RTE_ETH_FILTER_OP_DELETE,
+   &tunnel_filter_conf);
+   if (ret < 0)
+   printf("cmd_tunnel_filter_parsed error: (%s)\n",
+   strerror(-ret));
+
+}
+cmdline_parse_token_string_t cmd_tunnel_filter_cmd =
+   TOKEN_STRING_INITIALIZER(struct cmd_tunnel_filter_result,
+   cmd, "tunnel_filter");
+cmdline_parse_token_string_t cmd_tunnel_filter_what =

[dpdk-dev] [PATCH v3 7/8]i40e:support VxLAN Tx checksum offload

2014-09-12 Thread Jijiang Liu
Support VxLAN Tx checksum offload, which includes:
  - outer L3 (IP) checksum offload
  - inner L3 (IP) checksum offload
  - inner L4 (UDP, TCP and SCTP) checksum offload
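
To offload the inner checksums, the driver has to tell the NIC how long the outer IP header and the tunnel headers are. In the diff below these are passed as `l3_len >> 2` and `l4tun_len >> 1`, i.e. in 4-byte and 2-byte units. The arithmetic can be checked with a small standalone sketch; the struct and function names are hypothetical, and packing the values into the context descriptor is deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_ETHER_HDR_LEN 14	/* inner Ethernet header */
#define TOY_VLAN_HDR_LEN   4	/* optional inner 802.1Q tag */
#define TOY_UDP_HDR_LEN    8
#define TOY_VXLAN_HDR_LEN  8

struct tunnel_len_fields {
	uint8_t ext_iplen_words;	/* outer IP header length, 4-byte units */
	uint8_t natlen_words;		/* UDP + VxLAN + inner L2, 2-byte units */
};

/* Compute both length fields for a VxLAN packet. */
static struct tunnel_len_fields
vxlan_tunnel_lengths(uint8_t outer_l3_len, int has_inner_vlan)
{
	struct tunnel_len_fields f;
	uint8_t l4tun_len = TOY_UDP_HDR_LEN + TOY_VXLAN_HDR_LEN +
			    TOY_ETHER_HDR_LEN;

	if (has_inner_vlan)
		l4tun_len = (uint8_t)(l4tun_len + TOY_VLAN_HDR_LEN);

	f.ext_iplen_words = (uint8_t)(outer_l3_len >> 2);
	f.natlen_words = (uint8_t)(l4tun_len >> 1);
	return f;
}

static void
vxlan_tunnel_lengths_example(void)
{
	struct tunnel_len_fields f;

	/* 20-byte outer IPv4 header, no inner VLAN: 30 bytes of tunnel headers. */
	f = vxlan_tunnel_lengths(20, 0);
	assert(f.ext_iplen_words == 5);
	assert(f.natlen_words == 15);

	/* The same packet with an inner VLAN tag: 34 bytes of tunnel headers. */
	f = vxlan_tunnel_lengths(20, 1);
	assert(f.natlen_words == 17);
}
```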

Signed-off-by: Jijiang Liu 
Acked-by: Helin Zhang 
Acked-by: Jingjing Wu 
Acked-by: Jing Chen 

---
 lib/librte_mbuf/rte_mbuf.h  |2 +
 lib/librte_pmd_i40e/i40e_rxtx.c |   47 --
 2 files changed, 46 insertions(+), 3 deletions(-)

diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 1832e73..212ac3a 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -97,6 +97,8 @@ struct rte_ctrlmbuf {
 #define PKT_RX_IEEE1588_PTP  0x0200 /**< RX IEEE1588 L2 Ethernet PT Packet. */
 #define PKT_RX_IEEE1588_TMST 0x0400 /**< RX IEEE1588 L2/L4 timestamped 
packet.*/

+#define PKT_TX_VXLAN_CKSUM   0x0001 /**< Checksum of TX VxLAN pkt. computed by NIC. */
+#define PKT_TX_IVLAN_PKT     0x0002 /**< TX packet is a VxLAN packet with an inner VLAN. */
 #define PKT_TX_VLAN_PKT  0x0800 /**< TX packet is a 802.1q VLAN packet. */
 #define PKT_TX_IP_CKSUM  0x1000 /**< IP cksum of TX pkt. computed by NIC. 
*/
 #define PKT_TX_IPV4_CSUM 0x1000 /**< Alias of PKT_TX_IP_CKSUM. */
diff --git a/lib/librte_pmd_i40e/i40e_rxtx.c b/lib/librte_pmd_i40e/i40e_rxtx.c
index a1dce74..a382d74 100644
--- a/lib/librte_pmd_i40e/i40e_rxtx.c
+++ b/lib/librte_pmd_i40e/i40e_rxtx.c
@@ -412,12 +412,16 @@ i40e_rxd_ptype_to_pkt_flags(uint64_t qword)
return ip_ptype_map[ptype];
 }

+#define L4TUN_LEN (sizeof(struct udp_hdr) + sizeof(struct vxlan_hdr)\
++ sizeof(struct ether_hdr))
 static inline void
 i40e_txd_enable_checksum(uint32_t ol_flags,
uint32_t *td_cmd,
uint32_t *td_offset,
uint8_t l2_len,
-   uint8_t l3_len)
+   uint8_t l3_len,
+   uint8_t inner_l3_len,
+   uint32_t *cd_tunneling)
 {
if (!l2_len) {
PMD_DRV_LOG(DEBUG, "L2 length set to 0\n");
@@ -430,6 +434,31 @@ i40e_txd_enable_checksum(uint32_t ol_flags,
return;
}

+   /* VxLAN packet TX checksum offload */
+   if (unlikely(ol_flags & PKT_TX_VXLAN_CKSUM)) {
+   uint8_t l4tun_len;
+
+   /* packet with inner VLAN */
+   if (ol_flags  & PKT_TX_IVLAN_PKT)
+   l4tun_len = L4TUN_LEN + sizeof(struct vlan_hdr);
+   else
+   l4tun_len = L4TUN_LEN;
+
+   if (ol_flags & PKT_TX_IPV4_CSUM)
+   *cd_tunneling |= I40E_TX_CTX_EXT_IP_IPV4;
+   else if (ol_flags & PKT_TX_IPV6)
+   *cd_tunneling |= I40E_TX_CTX_EXT_IP_IPV6;
+
+   /* Now set the ctx descriptor fields */
+   *cd_tunneling |= (l3_len >> 2) <<
+   I40E_TXD_CTX_QW0_EXT_IPLEN_SHIFT |
+   I40E_TXD_CTX_UDP_TUNNELING |
+   (l4tun_len >> 1) <<
+   I40E_TXD_CTX_QW0_NATLEN_SHIFT;
+
+   l3_len = inner_l3_len;
+   }
+
/* Enable L3 checksum offloads */
if (ol_flags & PKT_TX_IPV4_CSUM) {
*td_cmd |= I40E_TX_DESC_CMD_IIPT_IPV4_CSUM;
@@ -1063,6 +1092,9 @@ i40e_calc_context_desc(uint16_t flags)
 {
uint16_t mask = 0;

+   if (flags & PKT_TX_VXLAN_CKSUM)
+   mask |= PKT_TX_VXLAN_CKSUM;
+
 #ifdef RTE_LIBRTE_IEEE1588
mask |= PKT_TX_IEEE1588_TMST;
 #endif
@@ -1082,6 +1114,7 @@ i40e_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, 
uint16_t nb_pkts)
volatile struct i40e_tx_desc *txr;
struct rte_mbuf *tx_pkt;
struct rte_mbuf *m_seg;
+   uint32_t cd_tunneling_params;
uint16_t tx_id;
uint16_t nb_tx;
uint32_t td_cmd;
@@ -1091,6 +1124,7 @@ i40e_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, 
uint16_t nb_pkts)
uint16_t ol_flags;
uint8_t l2_len;
uint8_t l3_len;
+   uint8_t inner_l3_len;
uint16_t nb_used;
uint16_t nb_ctx;
uint16_t tx_last;
@@ -1120,6 +1154,12 @@ i40e_xmit_pkts(void *tx_queue, struct rte_mbuf 
**tx_pkts, uint16_t nb_pkts)
l2_len = tx_pkt->pkt.vlan_macip.f.l2_len;
l3_len = tx_pkt->pkt.vlan_macip.f.l3_len;

+   /**
+* The reserved field in the mbuf is used to store the inner L3
+* header length.
+*/
+   inner_l3_len = tx_pkt->reserved;
+
/* Calculate the number of context descriptors needed. */
nb_ctx = i40e_calc_context_desc(ol_flags);

@@ -1166,15 +1206,16 @@ i40e_xmit_pkts(void *tx_queue, struct rte_mbuf 
**tx_pkts, uint16_t nb_pkts)
td_cmd |= I40E_TX_DESC_CMD_ICRC;

/* Enable checksum offloading */
+   cd_tunneling_params = 0;
i

[dpdk-dev] [PATCH v3 5/8]i40e:implement API of VxLAN packet filter in librte_pmd_i40e

2014-09-12 Thread Jijiang Liu
Implement the VxLAN tunnel filter in librte_pmd_i40e, which includes:
 - adding the i40e_dev_filter_ctrl() function.
 - adding the i40e_dev_tunnel_filter_set() function.

Signed-off-by: Jijiang Liu 
Acked-by: Helin Zhang 
Acked-by: Jingjing Wu 
Acked-by: Jing Chen 

---
 lib/librte_pmd_i40e/i40e_ethdev.c |  205 +
 1 files changed, 205 insertions(+), 0 deletions(-)

diff --git a/lib/librte_pmd_i40e/i40e_ethdev.c 
b/lib/librte_pmd_i40e/i40e_ethdev.c
index 4234073..8adf04b 100644
--- a/lib/librte_pmd_i40e/i40e_ethdev.c
+++ b/lib/librte_pmd_i40e/i40e_ethdev.c
@@ -48,6 +48,7 @@
 #include 
 #include 
 #include 
+#include 

 #include "i40e_logs.h"
 #include "i40e/i40e_register_x710_int.h"
@@ -211,7 +212,13 @@ static int i40e_dev_udp_tunnel_add(struct rte_eth_dev *dev,
 static int i40e_dev_udp_tunnel_del(struct rte_eth_dev *dev,
   struct rte_eth_udp_tunnel *udp_tunnel,
   uint8_t count);
+static int i40e_dev_tunnel_filter_set(struct i40e_pf *pf,
+struct rte_eth_tunnel_filter_conf *tunnel_filter,
+uint8_t add);
 static int i40e_pf_config_vxlan(struct i40e_pf *pf);
+static int i40e_dev_filter_ctrl(struct rte_eth_dev *dev,
+  enum rte_filter_type filter_type,
+  enum rte_filter_op filter_op, void *arg);


 /* Default hash key buffer for RSS */
@@ -266,6 +273,7 @@ static struct eth_dev_ops i40e_eth_dev_ops = {
.rss_hash_conf_get= i40e_dev_rss_hash_conf_get,
.udp_tunnel_add   = i40e_dev_udp_tunnel_add,
.udp_tunnel_del   = i40e_dev_udp_tunnel_del,
+   .filter_ctrl  = i40e_dev_filter_ctrl,
 };

 static struct eth_driver rte_i40e_pmd = {
@@ -4121,6 +4129,110 @@ i40e_dev_rss_hash_conf_get(struct rte_eth_dev *dev,
 }

 static int
+i40e_dev_get_filter_type(enum rte_tunnel_filter_type filter_type,
+   uint16_t *flag)
+{
+   switch (filter_type) {
+   case RTE_TUNNEL_FILTER_IMAC_IVLAN:
+   *flag = I40E_AQC_ADD_CLOUD_FILTER_IMAC_IVLAN;
+   break;
+   case RTE_TUNNEL_FILTER_IMAC_IVLAN_TENID:
+   *flag = I40E_AQC_ADD_CLOUD_FILTER_IMAC_IVLAN_TEN_ID;
+   break;
+   case RTE_TUNNEL_FILTER_IMAC_TENID:
+   *flag = I40E_AQC_ADD_CLOUD_FILTER_IMAC_TEN_ID;
+   break;
+   case RTE_TUNNEL_FILTER_OMAC_TENID_IMAC:
+   *flag = I40E_AQC_ADD_CLOUD_FILTER_OMAC_TEN_ID_IMAC;
+   break;
+   case RTE_TUNNEL_FILTER_IMAC:
+   *flag = I40E_AQC_ADD_CLOUD_FILTER_IMAC;
+   break;
+   default:
+   PMD_DRV_LOG(ERR, "invalid tunnel filter type\n");
+   return -EINVAL;
+   }
+
+   return 0;
+}
+
+static int
+i40e_dev_tunnel_filter_set(struct i40e_pf *pf,
+   struct rte_eth_tunnel_filter_conf *tunnel_filter,
+   uint8_t add)
+{
+   uint16_t ip_type;
+   uint8_t tun_type = 0;
+   int ret = 0;
+   int val;
+   struct i40e_hw *hw = I40E_PF_TO_HW(pf);
+   struct i40e_vsi *vsi = pf->main_vsi;
+   struct i40e_aqc_add_remove_cloud_filters_element_data  *cld_filter;
+   struct i40e_aqc_add_remove_cloud_filters_element_data  *pfilter;
+
+   cld_filter = rte_zmalloc("tunnel_filter",
+   sizeof(struct i40e_aqc_add_remove_cloud_filters_element_data),
+   0);
+
+   if (NULL == cld_filter) {
+   PMD_DRV_LOG(ERR, "Failed to alloc memory.\n");
+   return -EINVAL;
+   }
+   pfilter = cld_filter;
+
+   (void)rte_memcpy(&pfilter->outer_mac, tunnel_filter->outer_mac,
+   sizeof(struct ether_addr));
+   (void)rte_memcpy(&pfilter->inner_mac, tunnel_filter->inner_mac,
+   sizeof(struct ether_addr));
+
+   pfilter->inner_vlan = tunnel_filter->inner_vlan;
+   if (tunnel_filter->ip_type == RTE_TUNNEL_IPTYPE_IPV4) {
+   ip_type = I40E_AQC_ADD_CLOUD_FLAGS_IPV4;
+   (void)rte_memcpy(&pfilter->ipaddr.v4.data,
+   &tunnel_filter->ip_addr,
+   sizeof(pfilter->ipaddr.v4.data));
+   } else {
+   ip_type = I40E_AQC_ADD_CLOUD_FLAGS_IPV6;
+   (void)rte_memcpy(&pfilter->ipaddr.v6.data,
+   &tunnel_filter->ip_addr,
+   sizeof(pfilter->ipaddr.v6.data));
+   }
+
+   /* check tunnel type */
+   switch (tunnel_filter->tunnel_type) {
+   case RTE_TUNNEL_TYPE_VXLAN:
+   tun_type = I40E_AQC_ADD_CLOUD_TNL_TYPE_XVLAN;
+   break;
+   default:
+   /* Other tunnel types is not supported. */
+   PMD_DRV_LOG(ERR, "tunnel type is not supported.\n");
+   rte_free(cld_filter);
+   return -EINV

[dpdk-dev] [PATCH v3 8/8]app/testpmd:test VxLAN Tx checksum offload

2014-09-12 Thread Jijiang Liu
Add test cases in testpmd to test VxLAN Tx Checksum offload, which include
 - IPv4 tunnel and IPv6 tunnel
 - outer L3, inner L3 and L4 checksum offload for Tx side.
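
For readers following the config.c hunk below, here is a minimal standalone sketch of the widened mask handling. It is not the testpmd code: the helper name demo_tx_cksum_set is hypothetical. It only mirrors the arithmetic of the change, where the low 8 bits are now replaced instead of the low 4.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper mirroring the arithmetic in tx_cksum_set():
 * keep the upper 8 bits of the per-port flags and replace the lower
 * 8 bits with the new checksum mask.
 */
static void demo_tx_cksum_set(uint16_t *tx_ol_flags, uint8_t cksum_mask)
{
	assert(tx_ol_flags != NULL);
	*tx_ol_flags = (uint16_t)((*tx_ol_flags & 0xFF00u) | (cksum_mask & 0xFFu));
}
```

With the old 0xFFF0/0xf pair, any mask bit above bit 3 was silently dropped, so the extra bits needed for the tunnel offloads could not be stored.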


Signed-off-by: Jijiang Liu 
Acked-by: Helin Zhang 
Acked-by: Jingjing Wu 
Acked-by: Jing Chen 

---
 app/test-pmd/config.c   |6 +-
 app/test-pmd/csumonly.c |  199 +++
 2 files changed, 188 insertions(+), 17 deletions(-)

diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 606e34a..fe77b89 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -1719,9 +1719,9 @@ tx_cksum_set(portid_t port_id, uint8_t cksum_mask)
uint16_t tx_ol_flags;
if (port_id_is_invalid(port_id))
return;
-   /* Clear last 4 bits and then set L3/4 checksum mask again */
-   tx_ol_flags = (uint16_t) (ports[port_id].tx_ol_flags & 0xFFF0);
-   ports[port_id].tx_ol_flags = (uint16_t) ((cksum_mask & 0xf) | tx_ol_flags);
+   /* Clear last 8 bits and then set L3/4 checksum mask again */
+   tx_ol_flags = (uint16_t) (ports[port_id].tx_ol_flags & 0xFF00);
+   ports[port_id].tx_ol_flags = (uint16_t) ((cksum_mask & 0xff) | tx_ol_flags);
 }

 void
diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
index e5a1f52..3e4a773 100644
--- a/app/test-pmd/csumonly.c
+++ b/app/test-pmd/csumonly.c
@@ -196,7 +196,6 @@ get_ipv6_udptcp_checksum(struct ipv6_hdr *ipv6_hdr, uint16_t *l4_hdr)
return (uint16_t)cksum;
 }

-
 /*
  * Forwarding of packets. Change the checksum field with HW or SW methods
  * The HW/SW method selection depends on the ol_flags on every packet
@@ -209,10 +208,16 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
struct rte_mbuf  *mb;
struct ether_hdr *eth_hdr;
struct ipv4_hdr  *ipv4_hdr;
+   struct ether_hdr *inner_eth_hdr;
+   struct ipv4_hdr  *inner_ipv4_hdr = NULL;
struct ipv6_hdr  *ipv6_hdr;
+   struct ipv6_hdr  *inner_ipv6_hdr = NULL;
struct udp_hdr   *udp_hdr;
+   struct udp_hdr   *inner_udp_hdr;
struct tcp_hdr   *tcp_hdr;
+   struct tcp_hdr   *inner_tcp_hdr;
struct sctp_hdr  *sctp_hdr;
+   struct sctp_hdr  *inner_sctp_hdr;

uint16_t nb_rx;
uint16_t nb_tx;
@@ -221,12 +226,18 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
uint16_t pkt_ol_flags;
uint16_t tx_ol_flags;
uint16_t l4_proto;
+   uint16_t inner_l4_proto = 0;
uint16_t eth_type;
uint8_t  l2_len;
uint8_t  l3_len;
+   uint8_t  inner_l2_len;
+   uint8_t  inner_l3_len = 0;

uint32_t rx_bad_ip_csum;
uint32_t rx_bad_l4_csum;
+   uint8_t  ipv4_tunnel;
+   uint8_t  ipv6_tunnel;
+   uint16_t ptype;

 #ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
uint64_t start_tsc;
@@ -261,8 +272,12 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
mb = pkts_burst[i];
l2_len  = sizeof(struct ether_hdr);
pkt_ol_flags = mb->ol_flags;
+   ptype = mb->reserved;
ol_flags = (uint16_t) (pkt_ol_flags & (~PKT_TX_L4_MASK));

+   ipv4_tunnel = IS_ETH_IPV4_TUNNEL(ptype);
+   ipv6_tunnel = IS_ETH_IPV6_TUNNEL(ptype);
+
eth_hdr = (struct ether_hdr *) mb->pkt.data;
eth_type = rte_be_to_cpu_16(eth_hdr->ether_type);
if (eth_type == ETHER_TYPE_VLAN) {
@@ -295,7 +310,7 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
 *  + ipv4 or ipv6
 *  + udp or tcp or sctp or others
 */
-   if (pkt_ol_flags & PKT_RX_IPV4_HDR) {
+   if (pkt_ol_flags & (PKT_RX_IPV4_HDR | PKT_RX_IPV4_HDR_EXT)) {

/* Do not support ipv4 option field */
l3_len = sizeof(struct ipv4_hdr) ;
@@ -325,17 +340,95 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
if (tx_ol_flags & 0x2) {
/* HW Offload */
ol_flags |= PKT_TX_UDP_CKSUM;
-   /* Pseudo header sum need be set properly */
-   udp_hdr->dgram_cksum = get_ipv4_psd_sum(ipv4_hdr);
+   if (ipv4_tunnel)
+   udp_hdr->dgram_cksum = 0;
+   else
+   /* Pseudo header sum need be set properly */
+   udp_hdr->dgram_cksum =
+   get_ipv4_psd_sum(ipv4_hdr);
}
else {
/* SW Implementation, clear checksum field first */
udp_hdr->dgram_cksum = 0;

[dpdk-dev] [PATCH 0/3] eal affinitize low priority threads to lcore 0

2014-09-12 Thread Richardson, Bruce
> -Original Message-
> From: Hiroshi Shimamoto [mailto:h-shimamoto at ct.jp.nec.com]
> Sent: Friday, September 12, 2014 12:48 AM
> To: Richardson, Bruce; dev at dpdk.org
> Subject: RE: [dpdk-dev] [PATCH 0/3] eal affinitize low priority threads to lcore 0
> 
> Hi Bruce,
> 
> > Subject: [dpdk-dev] [PATCH 0/3] eal affinitize low priority threads to lcore 0
> >
> > This patchset sets things up so that we can affinitize the interrupt,
> > vfio management, and hpet timer management threads to lcore 0, so that
> > they never interfere with data plane threads.
> 
> I don't think this always works well.
> The management threads can float across all cpus on demand, because those
> threads are created before the master thread affinity is set, and the kernel
> scheduler takes care of placing them. We should also isolate the cpus that
> data plane threads are pinned to, so that the management threads cannot run
> on those isolated cpus.

Yes, that was also my understanding of how things would work. However, we have
just had a report of the management threads running on the same core as the
data plane even though that core was isolated. I'll double-check whether this
really is happening and, if it is definitely needed, submit a new patch with a
configurable core assignment.
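
For illustration only, here is a rough sketch of what a configurable assignment could look like on Linux. It is not taken from the patchset: the helper names demo_pin_to_mask and demo_first_allowed_cpu are hypothetical, and the mask is limited to 64 cpus to keep it short. It uses sched_setaffinity() with pid 0, which applies to the calling thread.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>

/* Return the lowest cpu id the calling thread may run on, or -1 on error. */
static int demo_first_allowed_cpu(void)
{
	cpu_set_t set;
	int cpu;

	CPU_ZERO(&set);
	if (sched_getaffinity(0, sizeof(set), &set) != 0)
		return -1;
	for (cpu = 0; cpu < CPU_SETSIZE; cpu++)
		if (CPU_ISSET(cpu, &set))
			return cpu;
	return -1;
}

/*
 * Restrict the calling thread to the cpus set in mask (bit i = cpu i).
 * Returns 0 on success, -1 with errno set on failure.
 */
static int demo_pin_to_mask(unsigned long long mask)
{
	cpu_set_t set;
	int cpu;

	assert(mask != 0); /* an empty mask would leave no cpu to run on */
	CPU_ZERO(&set);
	for (cpu = 0; cpu < 64; cpu++)
		if (mask & (1ULL << cpu))
			CPU_SET(cpu, &set);
	return sched_setaffinity(0, sizeof(set), &set);
}
```

A management thread would call demo_pin_to_mask() with a user-supplied mask at the top of its thread function, instead of being hard-wired to lcore 0.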

/Bruce

> In some cases, the user may run a data plane thread on lcore 0, but with
> this patchset a data plane thread pinned to lcore 0 always shares it with the
> management threads. That doesn't seem good.
> 
> I think this functionality should be conditional.
> How about adding a parameter to specify the mask for the management threads
> instead of statically assigning them to lcore 0?
> 
> thanks,
> Hiroshi
> 
> >
> > Bruce Richardson (3):
> >   eal: add core id param to  eal_thread_set_affinity
> >   eal: increase scope of eal_thread_set_affinity
> >   eal: affinitize low-priority threads to lcore 0
> >
> >  lib/librte_eal/bsdapp/eal/eal_thread.c | 12 ++--
> >  lib/librte_eal/common/include/eal_private.h| 10 ++
> >  lib/librte_eal/linuxapp/eal/eal_interrupts.c   |  5 +
> >  lib/librte_eal/linuxapp/eal/eal_pci_vfio_mp_sync.c |  6 ++
> >  lib/librte_eal/linuxapp/eal/eal_thread.c   | 12 ++--
> >  lib/librte_eal/linuxapp/eal/eal_timer.c|  5 +
> >  6 files changed, 38 insertions(+), 12 deletions(-)
> >
> > --
> > 1.9.3



[dpdk-dev] [PATCH v4 0/5] lib/librte_vhost: user space vhost cuse driver library

2014-09-12 Thread Huawei Xie
This set of patches transforms and refactors vhost example to a user
space vhost cuse library. This library implements a user space vhost
cuse driver, and provides generic APIs for user space ethernet vswitch
to integrate us-vhost for fast packet switching with guest virtio.

Change notes:

 v2) Turn off vhost lib by default

 v3) Fixed checkpatch issues

 v4) Split the patch per Thomas's requirement


Huawei Xie (5):
  mv vhost example to vhost lib directory
  copy the vhost rx/tx functions from main.c to new file vhost_rxtx.c 
  remove main.c main.h
  remove Makefile
  rename virtio-net.h to rte_virtio_net.h as API header file
  vmdq, mac learning and other switch related logics are removed
  zero copy feature isn't generic,and is removed.
  add vhost lib Makefile.
  Add TODOs for found new issues.
  Fix coding style issue which are treated as errors by checkpatch.pl
  add vhost lib support in makefile
  turn off vhost lib by default as it requires fuse development package.

 config/common_linuxapp   |7 +
 examples/vhost/Makefile  |   60 -
 examples/vhost/eventfd_link/Makefile |   39 -
 examples/vhost/eventfd_link/eventfd_link.c   |  205 --
 examples/vhost/eventfd_link/eventfd_link.h   |   79 -
 examples/vhost/libvirt/qemu-wrap.py  |  367 ---
 examples/vhost/main.c| 3722 --
 examples/vhost/main.h|   86 -
 examples/vhost/vhost-net-cdev.c  |  367 ---
 examples/vhost/vhost-net-cdev.h  |   83 -
 examples/vhost/virtio-net.c  | 1165 
 examples/vhost/virtio-net.h  |  161 --
 lib/Makefile |1 +
 lib/librte_vhost/Makefile|   48 +
 lib/librte_vhost/eventfd_link/Makefile   |   39 +
 lib/librte_vhost/eventfd_link/eventfd_link.c |  205 ++
 lib/librte_vhost/eventfd_link/eventfd_link.h |   79 +
 lib/librte_vhost/libvirt/qemu-wrap.py|  367 +++
 lib/librte_vhost/rte_virtio_net.h|  192 ++
 lib/librte_vhost/vhost-net-cdev.c|  362 +++
 lib/librte_vhost/vhost-net-cdev.h|  112 +
 lib/librte_vhost/vhost_rxtx.c|  301 +++
 lib/librte_vhost/virtio-net.c| 1000 +++
 mk/rte.app.mk|5 +
 24 files changed, 2718 insertions(+), 6334 deletions(-)
 delete mode 100644 examples/vhost/Makefile
 delete mode 100644 examples/vhost/eventfd_link/Makefile
 delete mode 100644 examples/vhost/eventfd_link/eventfd_link.c
 delete mode 100644 examples/vhost/eventfd_link/eventfd_link.h
 delete mode 100755 examples/vhost/libvirt/qemu-wrap.py
 delete mode 100644 examples/vhost/main.c
 delete mode 100644 examples/vhost/main.h
 delete mode 100644 examples/vhost/vhost-net-cdev.c
 delete mode 100644 examples/vhost/vhost-net-cdev.h
 delete mode 100644 examples/vhost/virtio-net.c
 delete mode 100644 examples/vhost/virtio-net.h
 create mode 100644 lib/librte_vhost/Makefile
 create mode 100644 lib/librte_vhost/eventfd_link/Makefile
 create mode 100644 lib/librte_vhost/eventfd_link/eventfd_link.c
 create mode 100644 lib/librte_vhost/eventfd_link/eventfd_link.h
 create mode 100755 lib/librte_vhost/libvirt/qemu-wrap.py
 create mode 100644 lib/librte_vhost/rte_virtio_net.h
 create mode 100644 lib/librte_vhost/vhost-net-cdev.c
 create mode 100644 lib/librte_vhost/vhost-net-cdev.h
 create mode 100644 lib/librte_vhost/vhost_rxtx.c
 create mode 100644 lib/librte_vhost/virtio-net.c

-- 
1.8.1.4



[dpdk-dev] [PATCH v4 3/5] lib/librte_vhost: vhost lib refactor

2014-09-12 Thread Huawei Xie
This vhost lib consists of five APIs plus several helper routines
for enabling/disabling features.
1) rte_vhost_driver_register registers the vhost driver with the system.
2) rte_vhost_driver_callback_register registers the callbacks. Callbacks are
called when a virtio device is ready for polling or is de-activated.
3) rte_vhost_driver_session_start is a blocking API that starts the vhost
message handler session.
4) rte_vhost_enqueue_burst and rte_vhost_dequeue_burst enqueue to and dequeue
from the virtio ring.
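
To make the intended call order concrete, here is a small self-contained mock. Everything prefixed demo_ is a hypothetical stand-in, not the library's prototypes. In particular, the callback table layout and the integer device id are assumptions, and the mock session returns immediately where the real session start is described above as blocking.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback table: one hook per device state change. */
struct demo_device_ops {
	int  (*new_device)(int dev_id);     /* device is ready for polling */
	void (*destroy_device)(int dev_id); /* device was de-activated */
};

static char demo_name[32];
static const struct demo_device_ops *demo_ops;
static int demo_seen;   /* devices announced so far */
static int demo_active; /* devices currently being polled */

/* Stand-in for step 1: remember the device name. */
static int demo_driver_register(const char *dev_name)
{
	if (dev_name == NULL || strlen(dev_name) >= sizeof(demo_name))
		return -1;
	strcpy(demo_name, dev_name);
	return 0;
}

/* Stand-in for step 2: store the application's callbacks. */
static int demo_callback_register(const struct demo_device_ops *ops)
{
	assert(ops != NULL);
	demo_ops = ops;
	return 0;
}

/* Stand-in for step 3: fake one device appearing and then going away. */
static int demo_session_start(void)
{
	if (demo_ops == NULL)
		return -1;
	if (demo_ops->new_device(0) == 0)
		demo_ops->destroy_device(0);
	return 0;
}

static int on_new(int dev_id) { (void)dev_id; demo_seen++; demo_active++; return 0; }
static void on_destroy(int dev_id) { (void)dev_id; demo_active--; }

/* Application side: register the driver, register callbacks, start session. */
static int demo_run(void)
{
	static const struct demo_device_ops ops = { on_new, on_destroy };

	if (demo_driver_register("vhost-net") != 0)
		return -1;
	if (demo_callback_register(&ops) != 0)
		return -1;
	return demo_session_start();
}
```

In a real switch, the enqueue/dequeue burst calls would run on the data cores between the new_device and destroy_device callbacks for a given device.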

Modifications include:
VMDQ, MAC learning and other switch-related logic is removed.
The zero copy feature isn't generic currently, so it is removed.
Retry logic is removed.
The above three pieces of logic will be implemented in the example as a
reference.
vhost lib Makefile is added.
Several TODOs are added:
1) allow the application to disable cmpset reserve in rte_vhost_enqueue_burst
in case there is no contention.
2) fix memcpy from mbuf to vring desc when the mbuf is chained and the desc
cannot hold all the data.
3) fix a possible race condition in vhost_set_mem_table: two vqs concurrently
call set_mem_table, which causes the saved mem_temp to be overridden.
The merge-able feature is removed; it will be merged after this patch is applied.
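
As a sketch of what TODO 1) is about, assuming the reserve step works roughly like a compare-and-set on a shared "reserved" index: the code below uses C11 atomics and hypothetical names (demo_vq, demo_reserve); it is not the library code. The contended flag picks between a CAS loop and a plain store for the single-producer case.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced view of a virtqueue: just the two indexes we need. */
struct demo_vq {
	_Atomic uint16_t last_used_idx_res; /* next entry not yet reserved */
	uint16_t avail_idx;                 /* one past the last available entry */
};

/*
 * Reserve up to count entries. Returns how many were reserved and stores
 * the first reserved index in *start. With contended == false the CAS is
 * skipped, which is only safe when a single thread enqueues on this vq.
 */
static uint16_t demo_reserve(struct demo_vq *vq, uint16_t count,
			     bool contended, uint16_t *start)
{
	uint16_t base, n, free_entries;

	assert(vq != NULL && start != NULL);
	base = atomic_load(&vq->last_used_idx_res);
	for (;;) {
		free_entries = (uint16_t)(vq->avail_idx - base);
		n = count < free_entries ? count : free_entries;
		if (n == 0)
			break;
		if (!contended) {
			atomic_store(&vq->last_used_idx_res, (uint16_t)(base + n));
			break;
		}
		/* On failure, base is reloaded with the current value; retry. */
		if (atomic_compare_exchange_weak(&vq->last_used_idx_res, &base,
						 (uint16_t)(base + n)))
			break;
	}
	*start = base;
	return n;
}
```

The uncontended path saves the CAS retry loop, which is the point of letting the application opt out when it knows only one core enqueues to a given queue.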

Signed-off-by: Huawei Xie 
Acked-by: Konstantin Ananyev 
Acked-by: Tommy Long 
---
 lib/librte_vhost/Makefile |  48 
 lib/librte_vhost/rte_virtio_net.h | 179 ---
 lib/librte_vhost/vhost-net-cdev.c |  35 +++---
 lib/librte_vhost/vhost-net-cdev.h |  45 +--
 lib/librte_vhost/vhost_rxtx.c | 157 +---
 lib/librte_vhost/virtio-net.c | 249 +++---
 6 files changed, 341 insertions(+), 372 deletions(-)
 create mode 100644 lib/librte_vhost/Makefile

diff --git a/lib/librte_vhost/Makefile b/lib/librte_vhost/Makefile
new file mode 100644
index 000..6ad706d
--- /dev/null
+++ b/lib/librte_vhost/Makefile
@@ -0,0 +1,48 @@
+#   BSD LICENSE
+#
+#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+#   All rights reserved.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+# * Redistributions of source code must retain the above copyright
+#   notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+#   notice, this list of conditions and the following disclaimer in
+#   the documentation and/or other materials provided with the
+#   distribution.
+# * Neither the name of Intel Corporation nor the names of its
+#   contributors may be used to endorse or promote products derived
+#   from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+include $(RTE_SDK)/mk/rte.vars.mk
+
+# library name
+LIB = librte_vhost.a
+
+CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR) -O3 -D_FILE_OFFSET_BITS=64 -lfuse
+LDFLAGS += -lfuse
+# all source are stored in SRCS-y
+SRCS-$(CONFIG_RTE_LIBRTE_VHOST) := vhost-net-cdev.c virtio-net.c vhost_rxtx.c
+
+# install includes
+SYMLINK-$(CONFIG_RTE_LIBRTE_VHOST)-include += rte_virtio_net.h
+
+# this lib needs eal
+DEPDIRS-$(CONFIG_RTE_LIBRTE_VHOST) += lib/librte_eal lib/librte_mbuf
+
+include $(RTE_SDK)/mk/rte.lib.mk
diff --git a/lib/librte_vhost/rte_virtio_net.h b/lib/librte_vhost/rte_virtio_net.h
index 1a2f0dc..08dc6f4 100644
--- a/lib/librte_vhost/rte_virtio_net.h
+++ b/lib/librte_vhost/rte_virtio_net.h
@@ -34,28 +34,25 @@
 #ifndef _VIRTIO_NET_H_
 #define _VIRTIO_NET_H_

+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+
 /* Used to indicate that the device is running on a data core */
 #define VIRTIO_DEV_RUNNING 1

 /* Backend value set by guest. */
 #define VIRTIO_DEV_STOPPED -1

-#define PAGE_SIZE   4096

 /* Enum for virtqueue management. */
 enum {VIRTIO_RXQ, VIRTIO_TXQ, VIRTIO_QNUM};

-#define BUF_VECTOR_MAX 256
-
-/*
- * Structure contains buffer address, length and descriptor index
- * from vring to do scatter RX.
-*/
-struct buf_vector {
-uint64_t buf_addr;
-uint32_t buf_len;
-uint32_t desc_idx;
-};

 /*
  * Structure contains variables relevant 

[dpdk-dev] [PATCH v4 4/5] coding style issue fix

2014-09-12 Thread Huawei Xie
This vhost lib is based on the old vhost example, and plenty of coding style
issues remain. Those will be fixed once this patch is applied.

Signed-off-by: Huawei Xie 
Acked-by: Konstantin Ananyev 
Acked-by: Tommy Long 
---
 lib/librte_vhost/rte_virtio_net.h |  52 
 lib/librte_vhost/vhost-net-cdev.c | 256 +++---
 lib/librte_vhost/vhost-net-cdev.h |  40 +++---
 lib/librte_vhost/vhost_rxtx.c |  15 ++-
 lib/librte_vhost/virtio-net.c |  88 +++--
 5 files changed, 220 insertions(+), 231 deletions(-)

diff --git a/lib/librte_vhost/rte_virtio_net.h b/lib/librte_vhost/rte_virtio_net.h
index 08dc6f4..82eb993 100644
--- a/lib/librte_vhost/rte_virtio_net.h
+++ b/lib/librte_vhost/rte_virtio_net.h
@@ -43,44 +43,38 @@
 #include 
 #include 

-/* Used to indicate that the device is running on a data core */
-#define VIRTIO_DEV_RUNNING 1
-
-/* Backend value set by guest. */
-#define VIRTIO_DEV_STOPPED -1
-
+#define VIRTIO_DEV_RUNNING 1  /**< Used to indicate that the device is running on a data core. */
+#define VIRTIO_DEV_STOPPED -1 /**< Backend value set by guest. */

 /* Enum for virtqueue management. */
 enum {VIRTIO_RXQ, VIRTIO_TXQ, VIRTIO_QNUM};

-
-/*
- * Structure contains variables relevant to TX/RX virtqueues.
+/**
+ * Structure contains variables relevant to RX/TX virtqueues.
  */
-struct vhost_virtqueue
-{
-   struct vring_desc   *desc;  /* Virtqueue descriptor ring. */
-   struct vring_avail  *avail; /* Virtqueue available ring. */
-   struct vring_used   *used;  /* Virtqueue used ring. */
-   uint32_t size;   /* Size of descriptor ring. */
-   uint32_t backend;/* Backend value to determine if device should started/stopped. */
-   uint16_t vhost_hlen; /* Vhost header length (varies depending on RX merge buffers. */
-   volatile uint16_t   last_used_idx;  /* Last index used on the available ring */
-   volatile uint16_t   last_used_idx_res;  /* Used for multiple devices reserving buffers. */
-   eventfd_t   callfd; /* Currently unused as polling mode is enabled. */
-   eventfd_t   kickfd; /* Used to notify the guest (trigger interrupt). */
+struct vhost_virtqueue {
+   struct vring_desc *desc; /**< descriptor ring. */
+   struct vring_avail *avail;/**< available ring. */
+   struct vring_used *used; /**< used ring. */
+   uint32_t size;  /**< size of descriptor ring. */
+   uint32_t backend;   /**< backend value to determine if device should be started/stopped. */
+   uint16_t vhost_hlen;/**< vhost header length (varies depending on RX merge buffers. */
+   volatile uint16_t last_used_idx; /**< last index used on the available ring. */
+   volatile uint16_t last_used_idx_res; /**< used for multiple devices reserving buffers. */
+   eventfd_t callfd;/**< currently unused as polling mode is enabled. */
+   eventfd_t kickfd;/**< used to notify the guest (trigger interrupt). */
 } __rte_cache_aligned;

-
-/*
- * Information relating to memory regions including offsets to addresses in QEMUs memory file.
+/**
+ * Information relating to memory regions including offsets to
+ * addresses in QEMU memory file.
  */
 struct virtio_memory_regions {
-   uint64_t guest_phys_address; /* Base guest physical address of region. */
-   uint64_t guest_phys_address_end; /* End guest physical address of region. */
-   uint64_t memory_size;/* Size of region. */
-   uint64_t userspace_address;  /* Base userspace address of region. */
-   uint64_t address_offset; /* Offset of region for address translation. */
+   uint64_t guest_phys_address; /**< base guest physical address of region. */
+   uint64_t guest_phys_address_end; /**< end guest physical address of region. */
+   uint64_t memory_size;/**< size of region. */
+   uint64_t userspace_address;  /**< base userspace address of region. */
+   uint64_t address_offset; /**< offset of region for address translation. */
 };


diff --git a/lib/librte_vhost/vhost-net-cdev.c b/lib/librte_vhost/vhost-net-cdev.c
index e73bf23..c3b580a 100644
--- a/lib/librte_vhost/vhost-net-cdev.c
+++ b/lib/librte_vhost/vhost-net-cdev.c
@@ -46,16 +46,16 @@

 #include "vhost-net-cdev.h"

-#define FUSE_OPT_DUMMY "\0\0"
-#define FUSE_OPT_FORE  "-f\0\0"
-#d

[dpdk-dev] [PATCH v4 5/5] lib/librte_vhost: add vhost lib support in makefile

2014-09-12 Thread Huawei Xie
Building the vhost lib requires the fuse development package. It is turned off
by default so as not to break the DPDK build.

Signed-off-by: Huawei Xie 
Acked-by: Konstantin Ananyev 
Acked-by: Tommy Long 
---
 config/common_linuxapp | 7 +++
 lib/Makefile   | 1 +
 mk/rte.app.mk  | 5 +
 3 files changed, 13 insertions(+)

diff --git a/config/common_linuxapp b/config/common_linuxapp
index 9047975..c7c1c83 100644
--- a/config/common_linuxapp
+++ b/config/common_linuxapp
@@ -390,6 +390,13 @@ CONFIG_RTE_KNI_VHOST_DEBUG_RX=n
 CONFIG_RTE_KNI_VHOST_DEBUG_TX=n

 #
+# Compile vhost library
+# fuse, fuse-devel, kernel-modules-extra packages are needed
+#
+CONFIG_RTE_LIBRTE_VHOST=n
+CONFIG_RTE_LIBRTE_VHOST_DEBUG=n
+
+#
 #Compile Xen domain0 support
 #
 CONFIG_RTE_LIBRTE_XEN_DOM0=n
diff --git a/lib/Makefile b/lib/Makefile
index 10c5bb3..007c174 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -60,6 +60,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_METER) += librte_meter
 DIRS-$(CONFIG_RTE_LIBRTE_SCHED) += librte_sched
 DIRS-$(CONFIG_RTE_LIBRTE_KVARGS) += librte_kvargs
 DIRS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) += librte_distributor
+DIRS-$(CONFIG_RTE_LIBRTE_VHOST) += librte_vhost
 DIRS-$(CONFIG_RTE_LIBRTE_PORT) += librte_port
 DIRS-$(CONFIG_RTE_LIBRTE_TABLE) += librte_table
 DIRS-$(CONFIG_RTE_LIBRTE_PIPELINE) += librte_pipeline
diff --git a/mk/rte.app.mk b/mk/rte.app.mk
index 34dff2a..285b65c 100644
--- a/mk/rte.app.mk
+++ b/mk/rte.app.mk
@@ -190,6 +190,11 @@ ifeq ($(CONFIG_RTE_LIBRTE_VIRTIO_PMD),y)
 LDLIBS += -lrte_pmd_virtio_uio
 endif

+ifeq ($(CONFIG_RTE_LIBRTE_VHOST), y)
+LDLIBS += -lrte_vhost
+LDLIBS += -lfuse
+endif
+
 ifeq ($(CONFIG_RTE_LIBRTE_I40E_PMD),y)
 LDLIBS += -lrte_pmd_i40e
 endif
-- 
1.8.1.4



[dpdk-dev] [PATCH v4 2/5] lib/librte_vhost: re-factor vhost lib for subsequent transform

2014-09-12 Thread Huawei Xie
This patch does a simple split of the original vhost example source files.
The vhost rx/tx functions virtio_dev_rx/tx are copied from main.c to the new
file vhost_rxtx.c.
main.c and main.h are removed. A new vhost example patchset will be submitted
later based on these two files.
The Makefile for the old example is removed.
virtio-net.h is renamed to rte_virtio_net.h as the API header file.


Signed-off-by: Huawei Xie 
Acked-by: Konstantin Ananyev 
Acked-by: Tommy Long 
---
 lib/librte_vhost/Makefile |   60 -
 lib/librte_vhost/main.c   | 3722 -
 lib/librte_vhost/main.h   |   86 -
 lib/librte_vhost/rte_virtio_net.h |  161 ++
 lib/librte_vhost/vhost_rxtx.c |  281 +++
 lib/librte_vhost/virtio-net.h |  161 --
 6 files changed, 442 insertions(+), 4029 deletions(-)
 delete mode 100644 lib/librte_vhost/Makefile
 delete mode 100644 lib/librte_vhost/main.c
 delete mode 100644 lib/librte_vhost/main.h
 create mode 100644 lib/librte_vhost/rte_virtio_net.h
 create mode 100644 lib/librte_vhost/vhost_rxtx.c
 delete mode 100644 lib/librte_vhost/virtio-net.h

diff --git a/lib/librte_vhost/Makefile b/lib/librte_vhost/Makefile
deleted file mode 100644
index f45f83f..000
--- a/lib/librte_vhost/Makefile
+++ /dev/null
@@ -1,60 +0,0 @@
-#   BSD LICENSE
-#
-#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
-#   All rights reserved.
-#
-#   Redistribution and use in source and binary forms, with or without
-#   modification, are permitted provided that the following conditions
-#   are met:
-#
-# * Redistributions of source code must retain the above copyright
-#   notice, this list of conditions and the following disclaimer.
-# * Redistributions in binary form must reproduce the above copyright
-#   notice, this list of conditions and the following disclaimer in
-#   the documentation and/or other materials provided with the
-#   distribution.
-# * Neither the name of Intel Corporation nor the names of its
-#   contributors may be used to endorse or promote products derived
-#   from this software without specific prior written permission.
-#
-#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
-#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
-#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
-#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
-#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
-#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
-#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
-#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
-#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
-#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
-#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
-
-ifeq ($(RTE_SDK),)
-$(error "Please define RTE_SDK environment variable")
-endif
-
-# Default target, can be overriden by command line or environment
-RTE_TARGET ?= x86_64-native-linuxapp-gcc
-
-include $(RTE_SDK)/mk/rte.vars.mk
-
-ifneq ($(CONFIG_RTE_EXEC_ENV),"linuxapp")
-$(info This application can only operate in a linuxapp environment, \
-please change the definition of the RTE_TARGET environment variable)
-all:
-else
-
-# binary name
-APP = vhost-switch
-
-# all source are stored in SRCS-y
-#SRCS-y := cusedrv.c loopback-userspace.c
-SRCS-y := main.c virtio-net.c vhost-net-cdev.c
-
-CFLAGS += -O2 -I/usr/local/include -D_FILE_OFFSET_BITS=64 -Wno-unused-parameter
-CFLAGS += $(WERROR_FLAGS)
-LDFLAGS += -lfuse
-
-include $(RTE_SDK)/mk/rte.extapp.mk
-
-endif
diff --git a/lib/librte_vhost/main.c b/lib/librte_vhost/main.c
deleted file mode 100644
index 7d9e6a2..000
--- a/lib/librte_vhost/main.c
+++ /dev/null
@@ -1,3722 +0,0 @@
-/*-
- *   BSD LICENSE
- *
- *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- * * Redistributions of source code must retain the above copyright
- *   notice, this list of conditions and the following disclaimer.
- * * Redistributions in binary form must reproduce the above copyright
- *   notice, this list of conditions and the following disclaimer in
- *   the documentation and/or other materials provided with the
- *   distribution.
- * * Neither the name of Intel Corporation nor the names of its
- *   contributors may be used to endorse or promote products derived
- *   from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES

[dpdk-dev] [PATCH v4 0/5] lib/librte_vhost: user space vhost cuse driver library

2014-09-12 Thread Xie, Huawei
Hi all:
We had generated fixes for plenty of coding style issues in the old vhost
example code, and will regenerate those fixes once this patch is applied.
This patch focuses only on refactoring the vhost example into a library.
Any existing issues will be fixed in separate patches, for example using
structure assignment rather than memcpy, as Stephen mentioned.

Appreciate your comments.

Best Regards
-huawei
> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Huawei Xie
> Sent: Friday, September 12, 2014 6:55 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v4 0/5] lib/librte_vhost: user space vhost cuse
> driver library
> 
> This set of patches transforms and refactors vhost example to a user
> space vhost cuse library. This library implements a user space vhost
> cuse driver, and provides generic APIs for user space ethernet vswitch
> to integrate us-vhost for fast packet switching with guest virtio.
> 
> Change notes:
> 
>  v2) Turn off vhost lib by default
> 
>  v3) Fixed checkpatch issues
> 
>  v4) Split the patch per Thomas's requirement
> 
> 
> Huawei Xie (5):
>   mv vhost example to vhost lib directory
>   copy the vhost rx/tx functions from main.c to new file vhost_rxtx.c
>   remove main.c main.h
>   remove Makefile
>   rename virtio-net.h to rte_virtio_net.h as API header file
>   vmdq, mac learning and other switch related logics are removed
>   zero copy feature isn't generic,and is removed.
>   add vhost lib Makefile.
>   Add TODOs for found new issues.
>   Fix coding style issue which are treated as errors by checkpatch.pl
>   add vhost lib support in makefile
>   turn off vhost lib by default as it requires fuse development package.
> 
>  config/common_linuxapp   |7 +
>  examples/vhost/Makefile  |   60 -
>  examples/vhost/eventfd_link/Makefile |   39 -
>  examples/vhost/eventfd_link/eventfd_link.c   |  205 --
>  examples/vhost/eventfd_link/eventfd_link.h   |   79 -
>  examples/vhost/libvirt/qemu-wrap.py  |  367 ---
>  examples/vhost/main.c| 3722 --
>  examples/vhost/main.h|   86 -
>  examples/vhost/vhost-net-cdev.c  |  367 ---
>  examples/vhost/vhost-net-cdev.h  |   83 -
>  examples/vhost/virtio-net.c  | 1165 
>  examples/vhost/virtio-net.h  |  161 --
>  lib/Makefile |1 +
>  lib/librte_vhost/Makefile|   48 +
>  lib/librte_vhost/eventfd_link/Makefile   |   39 +
>  lib/librte_vhost/eventfd_link/eventfd_link.c |  205 ++
>  lib/librte_vhost/eventfd_link/eventfd_link.h |   79 +
>  lib/librte_vhost/libvirt/qemu-wrap.py|  367 +++
>  lib/librte_vhost/rte_virtio_net.h|  192 ++
>  lib/librte_vhost/vhost-net-cdev.c|  362 +++
>  lib/librte_vhost/vhost-net-cdev.h|  112 +
>  lib/librte_vhost/vhost_rxtx.c|  301 +++
>  lib/librte_vhost/virtio-net.c| 1000 +++
>  mk/rte.app.mk|5 +
>  24 files changed, 2718 insertions(+), 6334 deletions(-)
>  delete mode 100644 examples/vhost/Makefile
>  delete mode 100644 examples/vhost/eventfd_link/Makefile
>  delete mode 100644 examples/vhost/eventfd_link/eventfd_link.c
>  delete mode 100644 examples/vhost/eventfd_link/eventfd_link.h
>  delete mode 100755 examples/vhost/libvirt/qemu-wrap.py
>  delete mode 100644 examples/vhost/main.c
>  delete mode 100644 examples/vhost/main.h
>  delete mode 100644 examples/vhost/vhost-net-cdev.c
>  delete mode 100644 examples/vhost/vhost-net-cdev.h
>  delete mode 100644 examples/vhost/virtio-net.c
>  delete mode 100644 examples/vhost/virtio-net.h
>  create mode 100644 lib/librte_vhost/Makefile
>  create mode 100644 lib/librte_vhost/eventfd_link/Makefile
>  create mode 100644 lib/librte_vhost/eventfd_link/eventfd_link.c
>  create mode 100644 lib/librte_vhost/eventfd_link/eventfd_link.h
>  create mode 100755 lib/librte_vhost/libvirt/qemu-wrap.py
>  create mode 100644 lib/librte_vhost/rte_virtio_net.h
>  create mode 100644 lib/librte_vhost/vhost-net-cdev.c
>  create mode 100644 lib/librte_vhost/vhost-net-cdev.h
>  create mode 100644 lib/librte_vhost/vhost_rxtx.c
>  create mode 100644 lib/librte_vhost/virtio-net.c
> 
> --
> 1.8.1.4



[dpdk-dev] [PATCH v2 00/17] cleanup logs in main PMDs

2014-09-12 Thread Bruce Richardson
On Mon, Sep 01, 2014 at 12:24:23PM +0200, David Marchand wrote:
> Here is a patchset that reworks the log macro in e1000, ixgbe and i40e PMDs.
> The idea behind this is to make it easier to debug some init failures and to
> be sure of the datapath selected in these PMDs (rx / tx handlers selection).
> 
> The PMDs changes involve adding more debug messages in the default build.
> A new eal option has been added to set the default log level, so that you can
> render the eal a little less noisy.
> 
> I did not change the default log level for now, as some eal log messages are
> marked as DEBUG while being interesting (from my point of view).
> I suppose we can change the default log level later once the eal has been
> cleaned up.
> 
> Changes since v2:
> - continue clean up by always using PMD_*_LOG when logging something in
>   PMD (i.e. no more printf, RTE_LOG, DEBUGOUT)
> - introduce PMD_DRV_LOG_RAW macro for use by shared driver code
> - adopt 'second approach': no more \n in PMD_*_LOG callers. This means that we
>   will enforce a 'no \n' policy in logs for PMD.
> 
> -- 
> David Marchand


This patch set looks like a definite improvement despite the extra output on 
startup (which can probably be cleaned up a bit anyway in later patches).

Acked-by: Bruce Richardson 

> 
> David Marchand (17):
>   ixgbe: use the right debug macro
>   ixgbe/base: add a _RAW macro for use by shared code
>   ixgbe: clean log messages
>   ixgbe: always log init messages
>   ixgbe: add a message when forcing scatter mode
>   ixgbe: add log messages when rx bulk mode is not usable
>   i40e: use the right debug macro
>   i40e/base: add a _RAW macro for use by shared code
>   i40e: clean log messages
>   i40e: always log init messages
>   i40e: add log messages when rx bulk mode is not usable
>   e1000: use the right debug macro
>   e1000/base: add a _RAW macro for use by shared code
>   e1000: clean log messages
>   e1000: always log init messages
>   e1000: add a message when forcing scatter mode
>   eal: set log level from command line
> 
>  lib/librte_eal/bsdapp/eal/eal.c|   42 ++
>  .../bsdapp/eal/include/eal_internal_cfg.h  |1 +
>  lib/librte_eal/linuxapp/eal/eal.c  |   44 +-
>  .../linuxapp/eal/include/eal_internal_cfg.h|1 +
>  lib/librte_pmd_e1000/e1000/e1000_osdep.h   |4 +-
>  lib/librte_pmd_e1000/e1000_logs.h  |   18 +-
>  lib/librte_pmd_e1000/em_ethdev.c   |   64 ++-
>  lib/librte_pmd_e1000/em_rxtx.c |  137 +++---
>  lib/librte_pmd_e1000/igb_ethdev.c  |  100 +++--
>  lib/librte_pmd_e1000/igb_pf.c  |5 +-
>  lib/librte_pmd_e1000/igb_rxtx.c|   69 ++--
>  lib/librte_pmd_i40e/i40e/i40e_osdep.h  |8 +-
>  lib/librte_pmd_i40e/i40e_ethdev.c  |  434 ++--
>  lib/librte_pmd_i40e/i40e_ethdev_vf.c   |  168 
>  lib/librte_pmd_i40e/i40e_logs.h|   16 +-
>  lib/librte_pmd_i40e/i40e_pf.c  |   79 ++--
>  lib/librte_pmd_i40e/i40e_rxtx.c|  201 +
>  lib/librte_pmd_ixgbe/ixgbe/ixgbe_osdep.h   |4 +-
>  lib/librte_pmd_ixgbe/ixgbe_82599_bypass.c  |   14 +-
>  lib/librte_pmd_ixgbe/ixgbe_bypass.c|   26 +-
>  lib/librte_pmd_ixgbe/ixgbe_ethdev.c|  177 
>  lib/librte_pmd_ixgbe/ixgbe_fdir.c  |6 +-
>  lib/librte_pmd_ixgbe/ixgbe_logs.h  |   16 +-
>  lib/librte_pmd_ixgbe/ixgbe_pf.c|4 +-
>  lib/librte_pmd_ixgbe/ixgbe_rxtx.c  |  169 +---
>  25 files changed, 979 insertions(+), 828 deletions(-)
> 
> -- 
> 1.7.10.4
> 
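
The "no \n in PMD_*_LOG callers" policy and the run-time log level from the
cover letter above can be sketched roughly as follows. This is only an
illustration, not the code from the patch set: sketch_log(), SKETCH_PMD_LOG
and the level values are hypothetical names, and the message is formatted
into a caller-supplied buffer so the behaviour is easy to check.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical log levels; a higher value means a more verbose message. */
enum sketch_level { SKETCH_ERR = 4, SKETCH_INFO = 7, SKETCH_DEBUG = 8 };

/* Hypothetical run-time threshold, e.g. set from a command-line option. */
static int sketch_log_level = SKETCH_INFO;

/*
 * Format one log line into buf. The newline is appended here, so callers
 * never put "\n" in their format strings. Returns the number of characters
 * written, 0 if the message is filtered out by the current level, or -1 if
 * buf is too small.
 */
static int
sketch_log(char *buf, size_t len, int level, const char *func,
	   const char *fmt, ...)
{
	va_list ap;
	int n, m;

	assert(buf != NULL && fmt != NULL);
	if (level > sketch_log_level)
		return 0;

	n = snprintf(buf, len, "PMD: %s(): ", func);
	if (n < 0 || (size_t)n >= len)
		return -1;

	va_start(ap, fmt);
	m = vsnprintf(buf + n, len - (size_t)n, fmt, ap);
	va_end(ap);

	/* Need room for the message, the trailing '\n' and the NUL. */
	if (m < 0 || (size_t)(n + m) + 1 >= len)
		return -1;

	buf[n + m] = '\n';
	buf[n + m + 1] = '\0';
	return n + m + 1;
}

/* Callers pass the level name and a format string without "\n". */
#define SKETCH_PMD_LOG(buf, len, level, ...) \
	sketch_log(buf, len, SKETCH_ ## level, __func__, __VA_ARGS__)
```

A caller would write something like
SKETCH_PMD_LOG(buf, sizeof(buf), INFO, "port %d up", 3); with the default
threshold above, a DEBUG message is dropped and the macro evaluates to 0.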


[dpdk-dev] [PATCH v4 1/5] lib/librte_vhost: mv vhost example to vhost lib directory for further code re-factoring

2014-09-12 Thread Huawei Xie
This commit creates vhost library directory, and copies vhost example into it.

Signed-off-by: Huawei Xie 
Acked-by: Konstantin Ananyev 
Acked-by: Tommy Long 
---
 examples/vhost/Makefile  |   60 -
 examples/vhost/eventfd_link/Makefile |   39 -
 examples/vhost/eventfd_link/eventfd_link.c   |  205 --
 examples/vhost/eventfd_link/eventfd_link.h   |   79 -
 examples/vhost/libvirt/qemu-wrap.py  |  367 ---
 examples/vhost/main.c| 3722 --
 examples/vhost/main.h|   86 -
 examples/vhost/vhost-net-cdev.c  |  367 ---
 examples/vhost/vhost-net-cdev.h  |   83 -
 examples/vhost/virtio-net.c  | 1165 
 examples/vhost/virtio-net.h  |  161 --
 lib/librte_vhost/Makefile|   60 +
 lib/librte_vhost/eventfd_link/Makefile   |   39 +
 lib/librte_vhost/eventfd_link/eventfd_link.c |  205 ++
 lib/librte_vhost/eventfd_link/eventfd_link.h |   79 +
 lib/librte_vhost/libvirt/qemu-wrap.py|  367 +++
 lib/librte_vhost/main.c  | 3722 ++
 lib/librte_vhost/main.h  |   86 +
 lib/librte_vhost/vhost-net-cdev.c|  367 +++
 lib/librte_vhost/vhost-net-cdev.h|   83 +
 lib/librte_vhost/virtio-net.c| 1165 
 lib/librte_vhost/virtio-net.h|  161 ++
 22 files changed, 6334 insertions(+), 6334 deletions(-)
 delete mode 100644 examples/vhost/Makefile
 delete mode 100644 examples/vhost/eventfd_link/Makefile
 delete mode 100644 examples/vhost/eventfd_link/eventfd_link.c
 delete mode 100644 examples/vhost/eventfd_link/eventfd_link.h
 delete mode 100755 examples/vhost/libvirt/qemu-wrap.py
 delete mode 100644 examples/vhost/main.c
 delete mode 100644 examples/vhost/main.h
 delete mode 100644 examples/vhost/vhost-net-cdev.c
 delete mode 100644 examples/vhost/vhost-net-cdev.h
 delete mode 100644 examples/vhost/virtio-net.c
 delete mode 100644 examples/vhost/virtio-net.h
 create mode 100644 lib/librte_vhost/Makefile
 create mode 100644 lib/librte_vhost/eventfd_link/Makefile
 create mode 100644 lib/librte_vhost/eventfd_link/eventfd_link.c
 create mode 100644 lib/librte_vhost/eventfd_link/eventfd_link.h
 create mode 100755 lib/librte_vhost/libvirt/qemu-wrap.py
 create mode 100644 lib/librte_vhost/main.c
 create mode 100644 lib/librte_vhost/main.h
 create mode 100644 lib/librte_vhost/vhost-net-cdev.c
 create mode 100644 lib/librte_vhost/vhost-net-cdev.h
 create mode 100644 lib/librte_vhost/virtio-net.c
 create mode 100644 lib/librte_vhost/virtio-net.h

diff --git a/examples/vhost/Makefile b/examples/vhost/Makefile
deleted file mode 100644
index f45f83f..000
--- a/examples/vhost/Makefile
+++ /dev/null
@@ -1,60 +0,0 @@
-#   BSD LICENSE
-#
-#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
-#   All rights reserved.
-#
-#   Redistribution and use in source and binary forms, with or without
-#   modification, are permitted provided that the following conditions
-#   are met:
-#
-# * Redistributions of source code must retain the above copyright
-#   notice, this list of conditions and the following disclaimer.
-# * Redistributions in binary form must reproduce the above copyright
-#   notice, this list of conditions and the following disclaimer in
-#   the documentation and/or other materials provided with the
-#   distribution.
-# * Neither the name of Intel Corporation nor the names of its
-#   contributors may be used to endorse or promote products derived
-#   from this software without specific prior written permission.
-#
-#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
-#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
-#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
-#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
-#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
-#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
-#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
-#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
-#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
-#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
-#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
-
-ifeq ($(RTE_SDK),)
-$(error "Please define RTE_SDK environment variable")
-endif
-
-# Default target, can be overriden by command line or environment
-RTE_TARGET ?= x86_64-native-linuxapp-gcc
-
-include $(RTE_SDK)/mk/rte.vars.mk
-
-ifneq ($(CONFIG_RTE_EXEC_ENV),"linuxapp")
-$(info This application can only operate in a linuxapp environment, \
-please change the definition of the RTE_TARGET environment variable)
-all:
-else
-
-# binary name
-APP = vhost-switch
-
-# all sou

[dpdk-dev] [PATCH 07/13] mbuf: use macros only to access the mbuf metadata

2014-09-12 Thread Dumitrescu, Cristian
Bruce, Olivier, 

What is the rationale for removing this field?

We previously agreed we need to provide an easy and standard mechanism for 
applications to extend the mandatory per buffer metadata (mbuf) with optional 
application-dependent metadata. This field just provides a clean way for the 
apps to know where the mandatory metadata ends, i.e. the first 
location in the packet buffer where the app can add its own metadata (of 
course, the app has to manage the headroom space before the first byte of 
packet data). A zero-size field is the standard mechanism that DPDK uses 
extensively in pretty much every library to access memory immediately after a 
header structure.
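
As a rough illustration of the mechanism being discussed, the sketch below
shows a trailing zero-size field next to the cast-based form used by the
macros in the patch. struct sketch_hdr is a hypothetical stand-in for the
mbuf header, and a C99 flexible array member is used here in place of the
GNU zero-length arrays from the original union. Both accessors yield the
same address as long as the structure has no trailing padding, which the
static assertion checks for this particular layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a fixed header such as struct rte_mbuf. */
struct sketch_hdr {
	void *pool;
	void *next;
	uint8_t metadata[];	/* flexible array member: occupies no space */
};

/* For this layout the trailing field starts exactly at the end of the header. */
_Static_assert(offsetof(struct sketch_hdr, metadata) ==
	       sizeof(struct sketch_hdr), "unexpected trailing padding");

/* Style 1: name the end of the header through the trailing field. */
static inline uint8_t *
meta_via_field(struct sketch_hdr *h)
{
	return h->metadata;
}

/* Style 2: no field needed; step one whole header past the start. */
static inline uint8_t *
meta_via_cast(struct sketch_hdr *h)
{
	return (uint8_t *)&h[1];
}

/* Typed access at a byte offset, in the spirit of the metadata macros. */
static inline uint32_t *
meta_u32(struct sketch_hdr *h, size_t offset)
{
	assert(offset % sizeof(uint32_t) == 0);
	return (uint32_t *)(void *)(meta_via_cast(h) + offset);
}
```

Either way, the caller remains responsible for making sure that the memory
after the header really is reserved for metadata.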

The impact of removing this field is that there is no standard way to identify 
where the end of the mandatory metadata is, so each application will have to 
reinvent this. With no clear convention, we will end up with a lot of 
non-standard ways. Every time the format of the mbuf structure is going to be 
changed, this can potentially break applications that use custom metadata, 
while using this simple standard mechanism would prevent this. So why remove 
this?

Having applications define their optional meta-data is a real need. Please take 
a look at the emerging IETF Service Chaining protocols 
(https://datatracker.ietf.org/wg/sfc/documents/), which provide standard 
mechanisms for applications to define their own packet meta-data and share it 
between the elements of the processing pipeline (for Service Chaining, these 
are typically virtual machines scattered amongst the data center).

And, in my opinion, there is no negative impact/cost associated with keeping 
this field.

Regards,
Cristian


-Original Message-
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Richardson, Bruce
Sent: Tuesday, September 9, 2014 10:01 AM
To: Olivier MATZ; dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH 07/13] mbuf: use macros only to access the mbuf 
metadata

> -Original Message-
> From: Olivier MATZ [mailto:olivier.matz at 6wind.com]
> Sent: Monday, September 08, 2014 1:06 PM
> To: Richardson, Bruce; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 07/13] mbuf: use macros only to access the
> mbuf metadata
> 
> Hi Bruce,
> 
> On 09/03/2014 05:49 PM, Bruce Richardson wrote:
> > Removed the explicit zero-sized metadata definition at the end of the
> > mbuf data structure. Updated the metadata macros to take account of this
> > change so that all existing code which uses those macros still works.
> >
> > Signed-off-by: Bruce Richardson 
> > ---
> >  lib/librte_mbuf/rte_mbuf.h | 22 --
> >  1 file changed, 8 insertions(+), 14 deletions(-)
> >
> > diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> > index 5260001..ca66d9a 100644
> > --- a/lib/librte_mbuf/rte_mbuf.h
> > +++ b/lib/librte_mbuf/rte_mbuf.h
> > @@ -166,31 +166,25 @@ struct rte_mbuf {
> > struct rte_mempool *pool; /**< Pool from which mbuf was allocated.
> */
> > struct rte_mbuf *next;/**< Next segment of scattered packet. */
> >
> > -   union {
> > -   uint8_t metadata[0];
> > -   uint16_t metadata16[0];
> > -   uint32_t metadata32[0];
> > -   uint64_t metadata64[0];
> > -   } __rte_cache_aligned;
> >  } __rte_cache_aligned;
> >
> >  #define RTE_MBUF_METADATA_UINT8(mbuf, offset)  \
> > -   (mbuf->metadata[offset])
> > +   (((uint8_t *)&(mbuf)[1])[offset])
> >  #define RTE_MBUF_METADATA_UINT16(mbuf, offset) \
> > -   (mbuf->metadata16[offset/sizeof(uint16_t)])
> > +   (((uint16_t *)&(mbuf)[1])[offset/sizeof(uint16_t)])
> >  #define RTE_MBUF_METADATA_UINT32(mbuf, offset) \
> > -   (mbuf->metadata32[offset/sizeof(uint32_t)])
> > +   (((uint32_t *)&(mbuf)[1])[offset/sizeof(uint32_t)])
> >  #define RTE_MBUF_METADATA_UINT64(mbuf, offset) \
> > -   (mbuf->metadata64[offset/sizeof(uint64_t)])
> > +   (((uint64_t *)&(mbuf)[1])[offset/sizeof(uint64_t)])
> >
> >  #define RTE_MBUF_METADATA_UINT8_PTR(mbuf, offset)  \
> > -   (&mbuf->metadata[offset])
> > +   (&RTE_MBUF_METADATA_UINT8(mbuf, offset))
> >  #define RTE_MBUF_METADATA_UINT16_PTR(mbuf, offset) \
> > -   (&mbuf->metadata16[offset/sizeof(uint16_t)])
> > +   (&RTE_MBUF_METADATA_UINT16(mbuf, offset))
> >  #define RTE_MBUF_METADATA_UINT32_PTR(mbuf, offset) \
> > -   (&mbuf->metadata32[offset/sizeof(uint32_t)])
> > +   (&RTE_MBUF_METADATA_UINT32(mbuf, offset))
> >  #define RTE_MBUF_METADATA_UINT64_PTR(mbuf, offset) \
> > -   (&mbuf->metadata64[offset/sizeof(uint64_t)])
> > +   (&RTE_MBUF_METADATA_UINT64(mbuf, offset))
> >
> >  /**
> >   * Given the buf_addr returns the pointer to corresponding mbuf.
> >
> 
> I think it goes in the good direction. So:
> Acked-by: Olivier Matz 
> 
> Just one question: why not removing RTE_MBUF_METADATA*() macros?
> I'd just provide one macro that gives a (void*) to the first byte
> 

[dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for AF_PACKET-based virtual devices

2014-09-12 Thread John W. Linville
Ping?  Are there objections to this patch from mid-July?

John

On Mon, Jul 14, 2014 at 02:24:50PM -0400, John W. Linville wrote:
> This is a Linux-specific virtual PMD driver backed by an AF_PACKET
> socket.  This implementation uses mmap'ed ring buffers to limit copying
> and user/kernel transitions.  The PACKET_FANOUT_HASH behavior of
> AF_PACKET is used for frame reception.  In the current implementation,
> Tx and Rx queues are always paired, and therefore are always equal
> in number -- changing this would be a Simple Matter Of Programming.
> 
> Interfaces of this type are created with a command line option like
> "--vdev=eth_packet0,iface=...".  There are a number of options available
> as arguments:
> 
>  - Interface is chosen by "iface" (required)
>  - Number of queue pairs set by "qpairs" (optional, default: 1)
>  - AF_PACKET MMAP block size set by "blocksz" (optional, default: 4096)
>  - AF_PACKET MMAP frame size set by "framesz" (optional, default: 2048)
>  - AF_PACKET MMAP frame count set by "framecnt" (optional, default: 512)
> 
> Signed-off-by: John W. Linville 
> ---
> This PMD is intended to provide a means for using DPDK on a broad
> range of hardware without hardware-specific PMDs and (hopefully)
> with better performance than what PCAP offers in Linux.  This might
> be useful as a development platform for DPDK applications when
> DPDK-supported hardware is expensive or unavailable.
> 
> New in v2:
> 
> -- fixup some style issues found by check patch
> -- use if_index as part of fanout group ID
> -- set default number of queue pairs to 1
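
To make the "if_index as part of fanout group ID" item concrete: the
PACKET_FANOUT socket option takes a single integer whose low 16 bits are
the group ID and whose high 16 bits select the mode. The sketch below shows
one way a group ID could be derived and packed. The XOR-based derivation
and both function names are hypothetical, chosen for illustration; the
exact formula used by the patch is not visible in the quoted text.

```c
#include <assert.h>
#include <stdint.h>
#include <linux/if_packet.h>	/* PACKET_FANOUT_HASH */

/*
 * Hypothetical group-ID derivation: mix a per-process value with the
 * interface index so that two interfaces opened by one process do not
 * share a fanout group. Only the low 16 bits are kept.
 */
static uint16_t
sketch_fanout_group(uint32_t process_tag, uint32_t if_index)
{
	return (uint16_t)((process_tag ^ if_index) & 0xffff);
}

/* Pack the group ID (low 16 bits) and the fanout mode (high 16 bits). */
static int
sketch_fanout_arg(uint16_t group_id, uint16_t mode)
{
	assert(mode < 0x8000);	/* keep the result representable as int */
	return (int)(((uint32_t)mode << 16) | group_id);
}
```

Every socket of a queue pair set would pass the same packed value to
setsockopt() with SOL_PACKET and PACKET_FANOUT, so that the kernel spreads
received frames across them by flow hash.
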
> 
>  config/common_bsdapp   |   5 +
>  config/common_linuxapp |   5 +
>  lib/Makefile   |   1 +
>  lib/librte_eal/linuxapp/eal/Makefile   |   1 +
>  lib/librte_pmd_packet/Makefile |  60 +++
>  lib/librte_pmd_packet/rte_eth_packet.c | 826 +
>  lib/librte_pmd_packet/rte_eth_packet.h |  55 +++
>  mk/rte.app.mk  |   4 +
>  8 files changed, 957 insertions(+)
>  create mode 100644 lib/librte_pmd_packet/Makefile
>  create mode 100644 lib/librte_pmd_packet/rte_eth_packet.c
>  create mode 100644 lib/librte_pmd_packet/rte_eth_packet.h
> 
> diff --git a/config/common_bsdapp b/config/common_bsdapp
> index 943dce8f1ede..c317f031278e 100644
> --- a/config/common_bsdapp
> +++ b/config/common_bsdapp
> @@ -226,6 +226,11 @@ CONFIG_RTE_LIBRTE_PMD_PCAP=y
>  CONFIG_RTE_LIBRTE_PMD_BOND=y
>  
>  #
> +# Compile software PMD backed by AF_PACKET sockets (Linux only)
> +#
> +CONFIG_RTE_LIBRTE_PMD_PACKET=n
> +
> +#
>  # Do prefetch of packet data within PMD driver receive function
>  #
>  CONFIG_RTE_PMD_PACKET_PREFETCH=y
> diff --git a/config/common_linuxapp b/config/common_linuxapp
> index 7bf5d80d4e26..f9e7bc3015ec 100644
> --- a/config/common_linuxapp
> +++ b/config/common_linuxapp
> @@ -249,6 +249,11 @@ CONFIG_RTE_LIBRTE_PMD_PCAP=n
>  CONFIG_RTE_LIBRTE_PMD_BOND=y
>  
>  #
> +# Compile software PMD backed by AF_PACKET sockets (Linux only)
> +#
> +CONFIG_RTE_LIBRTE_PMD_PACKET=y
> +
> +#
>  # Compile Xen PMD
>  #
>  CONFIG_RTE_LIBRTE_PMD_XENVIRT=n
> diff --git a/lib/Makefile b/lib/Makefile
> index 10c5bb3045bc..930fadf29898 100644
> --- a/lib/Makefile
> +++ b/lib/Makefile
> @@ -47,6 +47,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_I40E_PMD) += librte_pmd_i40e
>  DIRS-$(CONFIG_RTE_LIBRTE_PMD_BOND) += librte_pmd_bond
>  DIRS-$(CONFIG_RTE_LIBRTE_PMD_RING) += librte_pmd_ring
>  DIRS-$(CONFIG_RTE_LIBRTE_PMD_PCAP) += librte_pmd_pcap
> +DIRS-$(CONFIG_RTE_LIBRTE_PMD_PACKET) += librte_pmd_packet
>  DIRS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += librte_pmd_virtio
>  DIRS-$(CONFIG_RTE_LIBRTE_VMXNET3_PMD) += librte_pmd_vmxnet3
>  DIRS-$(CONFIG_RTE_LIBRTE_PMD_XENVIRT) += librte_pmd_xenvirt
> diff --git a/lib/librte_eal/linuxapp/eal/Makefile 
> b/lib/librte_eal/linuxapp/eal/Makefile
> index 756d6b0c9301..feed24a63272 100644
> --- a/lib/librte_eal/linuxapp/eal/Makefile
> +++ b/lib/librte_eal/linuxapp/eal/Makefile
> @@ -44,6 +44,7 @@ CFLAGS += -I$(RTE_SDK)/lib/librte_ether
>  CFLAGS += -I$(RTE_SDK)/lib/librte_ivshmem
>  CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_ring
>  CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_pcap
> +CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_packet
>  CFLAGS += -I$(RTE_SDK)/lib/librte_pmd_xenvirt
>  CFLAGS += $(WERROR_FLAGS) -O3
>  
> diff --git a/lib/librte_pmd_packet/Makefile b/lib/librte_pmd_packet/Makefile
> new file mode 100644
> index ..e1266fb992cd
> --- /dev/null
> +++ b/lib/librte_pmd_packet/Makefile
> @@ -0,0 +1,60 @@
> +#   BSD LICENSE
> +#
> +#   Copyright(c) 2014 John W. Linville 
> +#   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
> +#   Copyright(c) 2014 6WIND S.A.
> +#   All rights reserved.
> +#
> +#   Redistribution and use in source and binary forms, with or without
> +#   modification, are permitted provided that the following conditions
> +#   are met:
> +#
> +# * Redistributions of source code must retain the above copyright
> +#   n

[dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for AF_PACKET-based virtual devices

2014-09-12 Thread Zhou, Danny
I am concerned about the performance cost of the many packet copies involved.
Specifically, on the Rx side, the kernel NIC driver copies each packet into an
skb, af_packet then copies it into the AF_PACKET buffer that is mapped to user
space, and the packet is then copied again into a DPDK mbuf. The Tx side
likewise needs 3 copies. So in a simple DPDK L2/L3 forwarding benchmark, each
packet is copied 6 times, which has a significant negative impact on
performance. We had a bifurcated driver prototype that can do zero-copy and
achieve native DPDK performance, but it depends on base driver and AF_PACKET
code changes in the kernel; John R will be presenting it at the upcoming Linux
Plumbers Conference. Once the kernel adopts it, the relevant PMD will be
submitted to dpdk.org.

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of John W. Linville
> Sent: Saturday, September 13, 2014 2:05 AM
> To: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for 
> AF_PACKET-based virtual devices
> 
> Ping?  Are there objections to this patch from mid-July?
> 
> John
> 

[dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for AF_PACKET-based virtual devices

2014-09-12 Thread John W. Linville
On Fri, Sep 12, 2014 at 06:31:08PM +, Zhou, Danny wrote:
> I am concerned about its performance caused by too many
> memcpy(). Specifically, on Rx side, kernel NIC driver needs to copy
> packets to skb, then af_packet copies packets to AF_PACKET buffer
> which are mapped to user space, and then those packets to be copied
> to DPDK mbuf. In addition, 3 copies needed on Tx side. So to run a
> simple DPDK L2/L3 forwarding benchmark, each packet needs 6 packet
> copies which brings significant negative performance impact. We
> had a bifurcated driver prototype that can do zero-copy and achieve
> native DPDK performance, but it depends on base driver and AF_PACKET
> code changes in kernel, John R will be presenting it in coming Linux
> Plumbers Conference. Once kernel adopts it, the relevant PMD will be
> submitted to dpdk.org.

Admittedly, this is not as good a performer as most of the existing
PMDs.  It serves a different purpose, after all.  FWIW, you did
previously indicate that it performed better than the pcap-based PMD.

I look forward to seeing the changes you mention -- they sound very
exciting.  But, they will still require both networking core and
driver changes in the kernel.  And as I understand things today,
the userland code will still need at least some knowledge of specific
devices and how they layout their packet descriptors, etc.  So while
those changes sound very promising, they will still have certain
drawbacks in common with the current situation.

It seems like the changes you mention will still need some sort of
AF_PACKET-based PMD driver.  Have you implemented that completely
separate from the code I already posted?  Or did you add that work
on top of mine?

John

> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of John W. Linville
> > Sent: Saturday, September 13, 2014 2:05 AM
> > To: dev at dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for 
> > AF_PACKET-based virtual devices
> > 
> > Ping?  Are there objections to this patch from mid-July?
> > 
> > John
> > 

[dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for AF_PACKET-based virtual devices

2014-09-12 Thread Zhou, Danny
> -Original Message-
> From: John W. Linville [mailto:linville at tuxdriver.com]
> Sent: Saturday, September 13, 2014 2:54 AM
> To: Zhou, Danny
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for 
> AF_PACKET-based virtual devices
> 
> On Fri, Sep 12, 2014 at 06:31:08PM +, Zhou, Danny wrote:
> > I am concerned about its performance caused by too many
> > memcpy(). Specifically, on Rx side, kernel NIC driver needs to copy
> > packets to skb, then af_packet copies packets to AF_PACKET buffer
> > which are mapped to user space, and then those packets to be copied
> > to DPDK mbuf. In addition, 3 copies needed on Tx side. So to run a
> > simple DPDK L2/L3 forwarding benchmark, each packet needs 6 packet
> > copies which brings significant negative performance impact. We
> > had a bifurcated driver prototype that can do zero-copy and achieve
> > native DPDK performance, but it depends on base driver and AF_PACKET
> > code changes in kernel, John R will be presenting it in coming Linux
> > Plumbers Conference. Once kernel adopts it, the relevant PMD will be
> > submitted to dpdk.org.
> 
> Admittedly, this is not as good a performer as most of the existing
> PMDs.  It serves a different purpose, afterall.  FWIW, you did
> previously indicate that it performed better than the pcap-based PMD.

Yes, slightly higher, but it makes no big difference.

> I look forward to seeing the changes you mention -- they sound very
> exciting.  But, they will still require both networking core and
> driver changes in the kernel.  And as I understand things today,
> the userland code will still need at least some knowledge of specific
> devices and how they layout their packet descriptors, etc.  So while
> those changes sound very promising, they will still have certain
> drawbacks in common with the current situation.

Yes, we would like DPDK performance optimization techniques such as huge
pages, efficient rx/tx routines that manipulate device-specific packet
descriptors, and the polling model to remain usable. We have to trade off
performance against commonality. But we believe it will be much easier to
develop a DPDK PMD for non-Intel NICs than to port entire kernel drivers to
DPDK.

> It seems like the changes you mention will still need some sort of
> AF_PACKET-based PMD driver.  Have you implemented that completely
> separate from the code I already posted?  Or did you add that work
> on top of mine?
> 

For the userland code, it certainly uses some of your code related to raw
sockets, but highly modified. A layer will be added to the eth_dev library to
do device probing and support the new socket options.

> John
> 
> > > -Original Message-
> > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of John W. Linville
> > > Sent: Saturday, September 13, 2014 2:05 AM
> > > To: dev at dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH v2] librte_pmd_packet: add PMD for 
> > > AF_PACKET-based virtual devices
> > >
> > > Ping?  Are there objections to this patch from mid-July?
> > >
> > > John
> > >
> > > On Mon, Jul 14, 2014 at 02:24:50PM -0400, John W. Linville wrote:
> > > > This is a Linux-specific virtual PMD driver backed by an AF_PACKET
> > > > socket.  This implementation uses mmap'ed ring buffers to limit copying
> > > > and user/kernel transitions.  The PACKET_FANOUT_HASH behavior of
> > > > AF_PACKET is used for frame reception.  In the current implementation,
> > > > Tx and Rx queues are always paired, and therefore are always equal
> > > > in number -- changing this would be a Simple Matter Of Programming.
> > > >
> > > > Interfaces of this type are created with a command line option like
> > > > "--vdev=eth_packet0,iface=...".  There are a number of options availabe
> > > > as arguments:
> > > >
> > > >  - Interface is chosen by "iface" (required)
> > > >  - Number of queue pairs set by "qpairs" (optional, default: 1)
> > > >  - AF_PACKET MMAP block size set by "blocksz" (optional, default: 4096)
> > > >  - AF_PACKET MMAP frame size set by "framesz" (optional, default: 2048)
> > > >  - AF_PACKET MMAP frame count set by "framecnt" (optional, default: 512)
> > > >
> > > > Signed-off-by: John W. Linville 
> > > > ---
> > > > This PMD is intended to provide a means for using DPDK on a broad
> > > > range of hardware without hardware-specific PMDs and (hopefully)
> > > > with better performance than what PCAP offers in Linux.  This might
> > > > be useful as a development platform for DPDK applications when
> > > > DPDK-supported hardware is expensive or unavailable.
> > > >
> > > > New in v2:
> > > >
> > > > -- fixup some style issues found by check patch
> > > > -- use if_index as part of fanout group ID
> > > > -- set default number of queue pairs to 1
> > > >
> > > >  config/common_bsdapp   |   5 +
> > > >  config/common_linuxapp |   5 +
> > > >  lib/Makefile   |   1 +
> > > >  lib/librte_eal/linuxapp/eal/Makefile   |   1 +
>
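
A note on the defaults quoted above: "blocksz", "framesz" and "framecnt" together fix the geometry of each mmap'ed ring. The sketch below works through that arithmetic. It is not taken from the patch: struct ring_geometry and both helpers are hypothetical names, and the only rule assumed is the kernel's requirement that the frame count equal frames-per-block times the block count (the kernel applies further alignment checks that are not modelled here).

```c
#include <stdio.h>

/* Hypothetical container mirroring the four fields of the kernel's
 * struct tpacket_req (block size/count, frame size/count). */
struct ring_geometry {
	unsigned int block_size;
	unsigned int block_nr;
	unsigned int frame_size;
	unsigned int frame_nr;
};

/* Derive the block count from the three user-visible options.
 * Returns 0 on success, -1 if the values cannot describe a ring. */
static int
ring_geometry_init(struct ring_geometry *g, unsigned int blocksz,
		   unsigned int framesz, unsigned int framecnt)
{
	unsigned int frames_per_block;

	if (framesz == 0 || blocksz < framesz)
		return -1;

	frames_per_block = blocksz / framesz;
	if (framecnt == 0 || framecnt % frames_per_block != 0)
		return -1;

	g->block_size = blocksz;
	g->frame_size = framesz;
	g->frame_nr = framecnt;
	g->block_nr = framecnt / frames_per_block;
	return 0;
}

/* Bytes of memory one ring would occupy when mapped. */
static unsigned long
ring_geometry_bytes(const struct ring_geometry *g)
{
	return (unsigned long)g->block_size * g->block_nr;
}
```

With the defaults (4096, 2048, 512) this gives 2 frames per block, 256 blocks and 1 MiB per ring.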

[dpdk-dev] [PATCH 07/13] mbuf: use macros only to access the mbuf metadata

2014-09-12 Thread Olivier MATZ
Hello Cristian,

> What is the reason to remove this field? Please explain the
> rationale of removing this field.

The rationale is explained in
http://dpdk.org/ml/archives/dev/2014-September/005232.html

"The format of the metadata is up to the application".

The type of data the application stores after the mbuf does not
need to be defined in the mbuf. These macros limit the types of
metadata to uint8, uint16, uint32 and uint64. What should I do
if I need a void * or a struct foo? Should we add a macro for
each possible type?

> We previously agreed we need to provide an easy and standard
> mechanism for applications to extend the mandatory per buffer
> metadata (mbuf) with optional application-dependent
> metadata.

Defining a structure in the application which does not pollute
the rte_mbuf structure is "easy and standard(TM)" too.

> This field just provides a clean way for the apps to
> know where is the end of the mandatory metadata, i.e. the first
> location in the packet buffer where the app can add its own
> metadata (of course, the app has to manage the headroom space
> before the first byte of packet data). A zero-size field is the
> standard mechanism that DPDK uses extensively in pretty much
> every library to access memory immediately after a header
> structure.

Having the following is clean too:

struct metadata {
 ...
};

struct app_mbuf {
 struct rte_mbuf mbuf;
 struct metadata metadata;
};

There is no need to define anything in the mbuf structure.
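
To make the layout above concrete, here is a minimal sketch. It uses a stand-in struct fake_mbuf so that it compiles without DPDK; the metadata fields and the app_mbuf_metadata() accessor are hypothetical names chosen for illustration, not part of any DPDK API. It only shows that a wrapper struct with the mbuf as its first member gives typed access to arbitrary metadata, including pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct rte_mbuf (assumption: the real layout does not
 * matter here, only that it is the first member of the wrapper). */
struct fake_mbuf {
	void *buf_addr;
	uint32_t pkt_len;
	uint16_t data_len;
};

/* Application-private metadata: any type is allowed, not only uintN_t. */
struct metadata {
	void *flow_ctx;
	uint32_t tenant_id;
	uint16_t in_port;
};

struct app_mbuf {
	struct fake_mbuf mbuf;     /* must stay the first member */
	struct metadata metadata;  /* lives right after the mbuf header */
};

/* The cast in the accessor is only valid while mbuf is at offset 0. */
static_assert(offsetof(struct app_mbuf, mbuf) == 0,
	      "mbuf must be the first member of app_mbuf");

/* Hypothetical accessor: recover the metadata from an mbuf pointer that
 * is known to be embedded in a struct app_mbuf. */
static inline struct metadata *
app_mbuf_metadata(struct fake_mbuf *m)
{
	return &((struct app_mbuf *)m)->metadata;
}
```

The application still has to make sure every pool element is large enough to hold struct app_mbuf; how that is arranged with the mempool is not shown here.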

> 
> The impact of removing this field is that there is no standard
> way to identify where the end of the mandatory metadata is, so
> each application will have to reinvent this. With no clear
> convention, we will end up with a lot of non-standard ways. Every
> time the format of the mbuf structure is going to be changed,
> this can potentially break applications that use custom metadata,
> while using this simple standard mechanism would prevent this. So
> why remove this?

Wow. Five occurrences of "standard" so far. Could you give a
reference to the standard you're referring to? :)

Our application defines private metadata in mbufs in the way described
above, and we have never changed that since we started supporting the
dpdk. So I don't understand why you say that each time the mbuf is
reformatted it breaks the application.

> Having applications define their optional meta-data is a real
> need. 

Sure. This patch does not prevent this at all. You can continue
to do exactly the same, but in the application concerned, not
in the generic mbuf structure.

> Please take a look at the Service Chaining IETF emerging
> protocols (https://datatracker.ietf.org/wg/sfc/documents/), which
> provide standard mechanisms for applications to define their own

Six :)

I'm not sure these documents define the way to extend a packet
structure with metadata in a C program. Again, Bruce's patch does
not prevent you from doing what you need; it just moves it to the
proper place.

> packet meta-data and share it between the elements of the
> processing pipeline (for Service Chaining, these are typically
> virtual machines scattered amongst the data center).
> 
> And, in my opinion, there is no negative impact/cost associated
> with keeping this field.

To summarize what I think:

- this patch does not prevent you from doing what you want to do
- removing the macros helps to keep the mbuf structure shorter and
  more comprehensible
- the previous approach does not scale because it would require
  a macro per type


> --
> Intel Shannon Limited
> Registered in Ireland
> Registered Office: Collinstown Industrial Park, Leixlip, County Kildare
> Registered Number: 308263
> Business address: Dromore House, East Park, Shannon, Co. Clare
> 
> This e-mail and any attachments may contain confidential material for the 
> sole use of the intended recipient(s). Any review or distribution by others 
> is strictly prohibited. If you are not the intended recipient, please contact 
> the sender and delete all copies.

This is a public mailing list, so this disclaimer is invalid.

Regards,
Olivier



[dpdk-dev] 1.7.1 testpmd hangs system

2014-09-12 Thread David S. Roth
All,

When I execute testpmd as shown below, it runs to the prompt apparently 
without error but the system becomes unresponsive and I am forced to reboot.

Any suggestions as to why this is happening?

Thanks in advance,
David

root at dpdk-26:~/src/C/dpdk-1.7.1/x86_64-native-linuxapp-gcc# app/testpmd 
-c f -n 4 -- --portmask=0x1 --nb-cores=2 -i
EAL: Detected lcore 0 as core 0 on socket 0
EAL: Detected lcore 1 as core 1 on socket 0
[...]
EAL: Detected lcore 29 as core 5 on socket 1
EAL: Detected lcore 30 as core 6 on socket 1
EAL: Detected lcore 31 as core 7 on socket 1
EAL: Support maximum 64 logical core(s) by configuration.
EAL: Detected 32 lcore(s)
EAL: Setting up memory...
EAL: Ask a virtual area of 0x1 bytes
EAL: Virtual area found at 0x7f6b8000 (size = 0x1)
EAL: Ask a virtual area of 0x1 bytes
EAL: Virtual area found at 0x7f6a4000 (size = 0x1)
EAL: Requesting 4 pages of size 1024MB from socket 0
EAL: Requesting 4 pages of size 1024MB from socket 1
EAL: TSC frequency is ~260 KHz
EAL: Master core 0 is ready (tid=23fac840)
EAL: Core 3 is ready (tid=210bf700)
EAL: Core 2 is ready (tid=218c0700)
EAL: Core 1 is ready (tid=220c1700)
EAL: PCI device :05:00.0 on NUMA socket 0
EAL:   probe driver: 8086:10f8 rte_ixgbe_pmd
EAL:   :05:00.0 not managed by VFIO driver, skipping
EAL:   PCI memory mapped at 0x7f6d23e6b000
EAL:   PCI memory mapped at 0x7f6d23e67000
EAL: PCI device :05:00.1 on NUMA socket 0
EAL:   probe driver: 8086:10f8 rte_ixgbe_pmd
EAL:   :05:00.1 not managed by VFIO driver, skipping
EAL:   PCI memory mapped at 0x7f6d207bf000
EAL:   PCI memory mapped at 0x7f6d23e63000
previous number of forwarding ports 2 - changed to number of configured 
ports 1
Interactive-mode selected
Configuring Port 0 (socket 0)
Port 0: 38:EA:A7:8B:34:50
Configuring Port 1 (socket 0)
Port 1: 38:EA:A7:8B:34:51
Checking link statuses...
Port 0 Link Up - speed 1 Mbps - full-duplex
Port 1 Link Up - speed 1 Mbps - full-duplex
Done
testpmd>