[dpdk-dev] [PATCH v5 3/3] ethdev: fix wrong error return refere to API definition

2014-10-27 Thread Liang, Cunming
Ok, I'll roll back to v4.

> -Original Message-
> From: Ananyev, Konstantin
> Sent: Friday, October 24, 2014 7:05 PM
> To: dev at dpdk.org
> Cc: nhorman at tuxdriver.com; Richardson, Bruce; De Lara Guarch, Pablo; Liang,
> Cunming
> Subject: RE: [PATCH v5 3/3] ethdev: fix wrong error return refere to API 
> definition
> 
> 
> 
> > -Original Message-
> > From: y at ecsmtp.sh.intel.com [mailto:y at ecsmtp.sh.intel.com]
> > Sent: Friday, October 24, 2014 6:55 AM
> > To: dev at dpdk.org
> > Cc: nhorman at tuxdriver.com; Richardson, Bruce; Ananyev, Konstantin; De 
> > Lara
> Guarch, Pablo; Liang, Cunming
> > Subject: [PATCH v5 3/3] ethdev: fix wrong error return refere to API 
> > definition
> >
> > From: Cunming Liang 
> >
> > Per definition, rte_eth_rx_burst/rte_eth_tx_burst/rte_eth_rx_queue_count
> > return the packet count.
> > When RTE_LIBRTE_ETHDEV_DEBUG is turned on, the retval of FUNC_PTR_OR_ERR_RET
> > was set to -ENOTSUP.
> > This is confusing.
> > The patch now always returns 0, whether there are no packets or an error occurred.
> > Meanwhile, rte_errno is set in such checks.
> >
> > Signed-off-by: Cunming Liang 
> > ---
> >  lib/librte_ether/rte_ethdev.c |   10 +++---
> >  1 files changed, 7 insertions(+), 3 deletions(-)
> >
> > diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> > index 50f10d9..6675f28 100644
> > --- a/lib/librte_ether/rte_ethdev.c
> > +++ b/lib/librte_ether/rte_ethdev.c
> > @@ -81,12 +81,14 @@
> >  /* Macros for checking for restricting functions to primary instance only 
> > */
> >  #define PROC_PRIMARY_OR_ERR_RET(retval) do { \
> > if (rte_eal_process_type() != RTE_PROC_PRIMARY) { \
> > +   rte_errno = -E_RTE_SECONDARY;   \
> > PMD_DEBUG_TRACE("Cannot run in secondary processes\n"); \
> > return (retval); \
> > } \
> >  } while(0)
> >  #define PROC_PRIMARY_OR_RET() do { \
> > if (rte_eal_process_type() != RTE_PROC_PRIMARY) { \
> > +   rte_errno = -E_RTE_SECONDARY;   \
> > PMD_DEBUG_TRACE("Cannot run in secondary processes\n"); \
> > return; \
> > } \
> > @@ -95,12 +97,14 @@
> >  /* Macros to check for invlaid function pointers in dev_ops structure */
> >  #define FUNC_PTR_OR_ERR_RET(func, retval) do { \
> > if ((func) == NULL) { \
> > +   rte_errno = -ENOTSUP; \
> > PMD_DEBUG_TRACE("Function not supported\n"); \
> > return (retval); \
> > } \
> >  } while(0)
> >  #define FUNC_PTR_OR_RET(func) do { \
> > if ((func) == NULL) { \
> > +   rte_errno = -ENOTSUP; \
> > PMD_DEBUG_TRACE("Function not supported\n"); \
> > return; \
> > } \
> > @@ -2530,7 +2534,7 @@ rte_eth_rx_burst(uint8_t port_id, uint16_t
> queue_id,
> > return 0;
> > }
> > dev = &rte_eth_devices[port_id];
> > -   FUNC_PTR_OR_ERR_RET(*dev->rx_pkt_burst, -ENOTSUP);
> > +   FUNC_PTR_OR_ERR_RET(*dev->rx_pkt_burst, 0);
> > if (queue_id >= dev->data->nb_rx_queues) {
> > PMD_DEBUG_TRACE("Invalid RX queue_id=%d\n", queue_id);
> > return 0;
> > @@ -2551,7 +2555,7 @@ rte_eth_tx_burst(uint8_t port_id, uint16_t
> queue_id,
> > }
> > dev = &rte_eth_devices[port_id];
> >
> > -   FUNC_PTR_OR_ERR_RET(*dev->tx_pkt_burst, -ENOTSUP);
> > +   FUNC_PTR_OR_ERR_RET(*dev->tx_pkt_burst, 0);
> > if (queue_id >= dev->data->nb_tx_queues) {
> > PMD_DEBUG_TRACE("Invalid TX queue_id=%d\n", queue_id);
> > return 0;
> > @@ -2570,7 +2574,7 @@ rte_eth_rx_queue_count(uint8_t port_id, uint16_t
> queue_id)
> > return 0;
> > }
> > dev = &rte_eth_devices[port_id];
> > -   FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_count, -ENOTSUP);
> > +   FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_count, 0);
> > return (*dev->dev_ops->rx_queue_count)(dev, queue_id);
> >  }
> 
> There are a few things that worry me with that approach:
> 
> 1. Different behaviour of rte_eth_rx_burst/rte_eth_tx_burst with
> RTE_LIBRTE_ETHDEV_DEBUG switched on vs. off.
> So an application might need to differentiate its code depending on the
> RTE_LIBRTE_ETHDEV_DEBUG value.
> 
> 2. Even when RTE_LIBRTE_ETHDEV_DEBUG is on, the behaviour of rte_eth_rx_burst/
> rte_eth_tx_burst will be inconsistent:
> it sets rte_errno if dev->rx_pkt_burst == NULL, but doesn't do the same for
> other error conditions, i.e. when port_id or queue_id is invalid.
> 
> 3. By modifying FUNC_PTR_OR_ERR_RET() to set rte_errno, we make the behaviour of
> other rte_ethdev functions inconsistent too:
> now for some error conditions they set rte_errno, for others they don't.
> 
> So if it were me, I'd just:
> - leave FUNC_PTR_OR_*_RET unmodified.
> - change rte_eth_rx_burst/tx_burst for RTE_LIBRTE_ETHDEV_DEBUG to something
> like:
> 
> - FUNC_PTR_OR_ERR_RET(*dev->rx_pkt_burst, -ENOTSUP);
> + FUNC_PTR_OR_ERR_RET(*dev->rx_pkt_burst, 0);
> 
> I think, that just error logging is enough here.
> 
> Konstantin
> 
> >

[dpdk-dev] [PATCH v6 0/3] app/test: unit test to measure cycles per packet

2014-10-27 Thread Cunming Liang
v6 update:
# leave FUNC_PTR_OR_*_RET unmodified

v5 update:
# fix the confusing retval in some rte_ethdev APIs

v4 ignore

v3 update:
# Code refined according to the feedback.
  1. add ether_format_addr to rte_ether.h
  2. fix typo in code comments.
  3. change %lu to %PRIu64, fixing a 32-bit target compilation error
# merge 2 small incremental patches to the first one.
  The whole unit test as a single patch in [PATCH v3 2/2]
# rebase code to the latest master

v2 update:
Rebase code to the latest master branch.

It provides a unit test to measure cycles/packet in NIC loopback mode.
It simply gives the average IO cycles used per packet without test equipment.
When doing the test, make sure the link is UP.

Two stream control modes are supported: continuous and burst.
The former keeps forwarding the injected packets until a certain packet
count is reached.
The latter stops when all the injected packets are received.
In burst stream mode, two situations are measured: with and without desc. cache
conflict.
By default, it runs in continuous stream mode to measure the whole rxtx path.

Usage Example:
1. Run unit test app in interactive mode
app/test -c f -n 4 -- -i
2. Set the stream control mode; the default is continuous
set_rxtx_sc [continuous|poll_before_xmit|poll_after_xmit]
3. If continuous stream is chosen, there are two more options to configure
3.1 choose the rx/tx pair; the default is vector
set_rxtx_mode [vector|scalar|full|hybrid]
Note: To get accurate scalar fast results, please choose 'vector' or 'hybrid'
without INC_VEC=y in the config
3.2 choose the area of measurement; the default is rxtx
set_rxtx_anchor [rxtx|rxonly|txonly]
4. Run and wait for the result
pmd_perf_autotest

For those who simply want to see how many cycles each packet costs:
compile DPDK, run 'app/test', and type 'pmd_perf_autotest'; that's it.
Nothing else needs to be configured.
Use the other options when you understand them and want to measure more.


BTW, [1/3] is the same patch as the one below.
http://dpdk.org/dev/patchwork/patch/817


Cunming Liang (3):
  app/test: allow to create packets in different sizes
  app/test: measure the cost of rx/tx routines by cycle number
  ethdev: fix wrong error return refere to API definition

 app/test/Makefile   |1 +
 app/test/commands.c |  111 +
 app/test/packet_burst_generator.c   |   26 +-
 app/test/packet_burst_generator.h   |   11 +-
 app/test/test.h |6 +
 app/test/test_link_bonding.c|   39 +-
 app/test/test_pmd_perf.c|  922 +++
 lib/librte_ether/rte_ethdev.c   |6 +-
 lib/librte_ether/rte_ether.h|   25 +
 lib/librte_pmd_ixgbe/ixgbe_ethdev.c |6 +
 10 files changed, 1117 insertions(+), 36 deletions(-)
 create mode 100644 app/test/test_pmd_perf.c

-- 
1.7.4.1



[dpdk-dev] [PATCH v6 2/3] app/test: measure the cost of rx/tx routines by cycle number

2014-10-27 Thread Cunming Liang
The unit test can be used to measure cycles per packet in different rx/tx
routines.
The NIC works in loopback mode, so it doesn't require test equipment to measure
throughput.
As a result, the unit test reports the average cycles consumed per packet.
When doing the test, make sure the link is UP.

Usage Example:
1. Run unit test app in interactive mode
app/test -c f -n 4 -- -i
2. Run and wait for the result
pmd_perf_autotest

There's an option to choose the rx/tx pair; the default is vector.
set_rxtx_mode [vector|scalar|full|hybrid]
Note: To get accurate scalar fast results, please choose 'vector' or 'hybrid' without
INC_VEC=y in the config

It supports measuring standalone rx or tx.
Usage Example:
Choose rx or tx standalone; the default is both
set_rxtx_anchor [rxtx|rxonly|txonly]

It also supports measuring standalone RX burst cycles.
In this mode, it won't re-send the received packets.
It currently measures two situations: poll before/after xmit (with or without
desc. cache conflict).
Usage Example:
Set the stream control mode; the default is continuous
set_rxtx_sc [continuous|poll_before_xmit|poll_after_xmit]

Signed-off-by: Cunming Liang 
Acked-by: Bruce Richardson 
---
 app/test/Makefile   |1 +
 app/test/commands.c |  111 +
 app/test/test.h |6 +
 app/test/test_pmd_perf.c|  922 +++
 lib/librte_ether/rte_ether.h|   25 +
 lib/librte_pmd_ixgbe/ixgbe_ethdev.c |6 +
 6 files changed, 1071 insertions(+), 0 deletions(-)
 create mode 100644 app/test/test_pmd_perf.c

diff --git a/app/test/Makefile b/app/test/Makefile
index 6af6d76..ebfa0ba 100644
--- a/app/test/Makefile
+++ b/app/test/Makefile
@@ -56,6 +56,7 @@ SRCS-y += test_memzone.c

 SRCS-y += test_ring.c
 SRCS-y += test_ring_perf.c
+SRCS-y += test_pmd_perf.c

 ifeq ($(CONFIG_RTE_LIBRTE_TABLE),y)
 SRCS-y += test_table.c
diff --git a/app/test/commands.c b/app/test/commands.c
index a9e36b1..92a17ed 100644
--- a/app/test/commands.c
+++ b/app/test/commands.c
@@ -310,12 +310,123 @@ cmdline_parse_inst_t cmd_quit = {

 //

+struct cmd_set_rxtx_result {
+   cmdline_fixed_string_t set;
+   cmdline_fixed_string_t mode;
+};
+
+static void cmd_set_rxtx_parsed(void *parsed_result, struct cmdline *cl,
+   __attribute__((unused)) void *data)
+{
+   struct cmd_set_rxtx_result *res = parsed_result;
+   if (test_set_rxtx_conf(res->mode) < 0)
+   cmdline_printf(cl, "Cannot find such mode\n");
+}
+
+cmdline_parse_token_string_t cmd_set_rxtx_set =
+   TOKEN_STRING_INITIALIZER(struct cmd_set_rxtx_result, set,
+"set_rxtx_mode");
+
+cmdline_parse_token_string_t cmd_set_rxtx_mode =
+   TOKEN_STRING_INITIALIZER(struct cmd_set_rxtx_result, mode, NULL);
+
+cmdline_parse_inst_t cmd_set_rxtx = {
+   .f = cmd_set_rxtx_parsed,  /* function to call */
+   .data = NULL,  /* 2nd arg of func */
+   .help_str = "set rxtx routine: "
+   "set_rxtx ",
+   .tokens = {/* token list, NULL terminated */
+   (void *)&cmd_set_rxtx_set,
+   (void *)&cmd_set_rxtx_mode,
+   NULL,
+   },
+};
+
+//
+
+struct cmd_set_rxtx_anchor {
+   cmdline_fixed_string_t set;
+   cmdline_fixed_string_t type;
+};
+
+static void
+cmd_set_rxtx_anchor_parsed(void *parsed_result,
+  struct cmdline *cl,
+  __attribute__((unused)) void *data)
+{
+   struct cmd_set_rxtx_anchor *res = parsed_result;
+   if (test_set_rxtx_anchor(res->type) < 0)
+   cmdline_printf(cl, "Cannot find such anchor\n");
+}
+
+cmdline_parse_token_string_t cmd_set_rxtx_anchor_set =
+   TOKEN_STRING_INITIALIZER(struct cmd_set_rxtx_anchor, set,
+"set_rxtx_anchor");
+
+cmdline_parse_token_string_t cmd_set_rxtx_anchor_type =
+   TOKEN_STRING_INITIALIZER(struct cmd_set_rxtx_anchor, type, NULL);
+
+cmdline_parse_inst_t cmd_set_rxtx_anchor = {
+   .f = cmd_set_rxtx_anchor_parsed,  /* function to call */
+   .data = NULL,  /* 2nd arg of func */
+   .help_str = "set rxtx anchor: "
+   "set_rxtx_anchor ",
+   .tokens = {/* token list, NULL terminated */
+   (void *)&cmd_set_rxtx_anchor_set,
+   (void *)&cmd_set_rxtx_anchor_type,
+   NULL,
+   },
+};
+
+//
+
+/* for stream control */
+struct cmd_set_rxtx_sc {
+   cmdline_fixed_string_t set;
+   cmdline_fixed_string_t type;
+};
+
+static void
+cmd_set_rxtx_sc_parsed(void *parsed_result,
+  struct cmdline *cl,
+  __attribute__((unused)) void *data)
+{
+   struct cmd_set_rxtx_sc *res = parsed_result;
+   if (test_set_rxtx_sc(res->type) < 0)
+   cmdline_printf(cl, "Cannot find such stream control\n");
+}
+
+cmdline_parse_toke

[dpdk-dev] [PATCH v6 1/3] app/test: allow to create packets in different sizes

2014-10-27 Thread Cunming Liang
Add support to allow the packet burst generator to create packets of different
sizes.

Signed-off-by: Cunming Liang 
Acked-by: Declan Doherty 
---
 app/test/packet_burst_generator.c |   26 
 app/test/packet_burst_generator.h |   11 +++--
 app/test/test_link_bonding.c  |   39 
 3 files changed, 43 insertions(+), 33 deletions(-)

diff --git a/app/test/packet_burst_generator.c 
b/app/test/packet_burst_generator.c
index 9e747a4..017139b 100644
--- a/app/test/packet_burst_generator.c
+++ b/app/test/packet_burst_generator.c
@@ -191,20 +191,12 @@ initialize_ipv4_header(struct ipv4_hdr *ip_hdr, uint32_t 
src_addr,
  */
 #define RTE_MAX_SEGS_PER_PKT 255 /**< pkt.nb_segs is a 8-bit unsigned char. */

-#define TXONLY_DEF_PACKET_LEN 64
-#define TXONLY_DEF_PACKET_LEN_128 128
-
-uint16_t tx_pkt_length = TXONLY_DEF_PACKET_LEN;
-uint16_t tx_pkt_seg_lengths[RTE_MAX_SEGS_PER_PKT] = {
-   TXONLY_DEF_PACKET_LEN_128,
-};
-
-uint8_t  tx_pkt_nb_segs = 1;
-
 int
 generate_packet_burst(struct rte_mempool *mp, struct rte_mbuf **pkts_burst,
-   struct ether_hdr *eth_hdr, uint8_t vlan_enabled, void *ip_hdr,
-   uint8_t ipv4, struct udp_hdr *udp_hdr, int nb_pkt_per_burst)
+ struct ether_hdr *eth_hdr, uint8_t vlan_enabled,
+ void *ip_hdr, uint8_t ipv4, struct udp_hdr *udp_hdr,
+ int nb_pkt_per_burst, uint8_t pkt_len,
+ uint8_t nb_pkt_segs)
 {
int i, nb_pkt = 0;
size_t eth_hdr_size;
@@ -221,9 +213,9 @@ nomore_mbuf:
break;
}

-   pkt->data_len = tx_pkt_seg_lengths[0];
+   pkt->data_len = pkt_len;
pkt_seg = pkt;
-   for (i = 1; i < tx_pkt_nb_segs; i++) {
+   for (i = 1; i < nb_pkt_segs; i++) {
pkt_seg->next = rte_pktmbuf_alloc(mp);
if (pkt_seg->next == NULL) {
pkt->nb_segs = i;
@@ -231,7 +223,7 @@ nomore_mbuf:
goto nomore_mbuf;
}
pkt_seg = pkt_seg->next;
-   pkt_seg->data_len = tx_pkt_seg_lengths[i];
+   pkt_seg->data_len = pkt_len;
}
pkt_seg->next = NULL; /* Last segment of packet. */

@@ -259,8 +251,8 @@ nomore_mbuf:
 * Complete first mbuf of packet and append it to the
 * burst of packets to be transmitted.
 */
-   pkt->nb_segs = tx_pkt_nb_segs;
-   pkt->pkt_len = tx_pkt_length;
+   pkt->nb_segs = nb_pkt_segs;
+   pkt->pkt_len = pkt_len;
pkt->l2_len = eth_hdr_size;

if (ipv4) {
diff --git a/app/test/packet_burst_generator.h 
b/app/test/packet_burst_generator.h
index 5b3cd6c..fe992ac 100644
--- a/app/test/packet_burst_generator.h
+++ b/app/test/packet_burst_generator.h
@@ -47,10 +47,13 @@ extern "C" {
 #define IPV4_ADDR(a, b, c, d)(((a & 0xff) << 24) | ((b & 0xff) << 16) | \
((c & 0xff) << 8) | (d & 0xff))

+#define PACKET_BURST_GEN_PKT_LEN 60
+#define PACKET_BURST_GEN_PKT_LEN_128 128

 void
 initialize_eth_header(struct ether_hdr *eth_hdr, struct ether_addr *src_mac,
-   struct ether_addr *dst_mac, uint8_t vlan_enabled, uint16_t 
van_id);
+ struct ether_addr *dst_mac, uint8_t vlan_enabled,
+ uint16_t van_id);

 uint16_t
 initialize_udp_header(struct udp_hdr *udp_hdr, uint16_t src_port,
@@ -67,8 +70,10 @@ initialize_ipv4_header(struct ipv4_hdr *ip_hdr, uint32_t 
src_addr,

 int
 generate_packet_burst(struct rte_mempool *mp, struct rte_mbuf **pkts_burst,
-   struct ether_hdr *eth_hdr, uint8_t vlan_enabled, void *ip_hdr,
-   uint8_t ipv4, struct udp_hdr *udp_hdr, int nb_pkt_per_burst);
+ struct ether_hdr *eth_hdr, uint8_t vlan_enabled,
+ void *ip_hdr, uint8_t ipv4, struct udp_hdr *udp_hdr,
+ int nb_pkt_per_burst, uint8_t pkt_len,
+ uint8_t nb_pkt_segs);

 #ifdef __cplusplus
 }
diff --git a/app/test/test_link_bonding.c b/app/test/test_link_bonding.c
index 214d2a2..d407e4f 100644
--- a/app/test/test_link_bonding.c
+++ b/app/test/test_link_bonding.c
@@ -1192,9 +1192,12 @@ generate_test_burst(struct rte_mbuf **pkts_burst, 
uint16_t burst_size,
}

/* Generate burst of packets to transmit */
-   generated_burst_size = generate_packet_burst(test_params->mbuf_pool,
-   pkts_burst, test_params->pkt_eth_hdr, vlan, ip_hdr, 
ipv4,
-   test_params->pkt_udp_hdr, burst_size);
+   generated_burst_size =
+   generate_packet_burst(test_params->mbuf_pool,
+ pkts_burst, test_params->pkt_eth_hdr,
+

[dpdk-dev] [PATCH v6 3/3] ethdev: fix wrong error return refere to API definition

2014-10-27 Thread Cunming Liang
Per definition, rte_eth_rx_burst/rte_eth_tx_burst/rte_eth_rx_queue_count
return the packet count.
When RTE_LIBRTE_ETHDEV_DEBUG is turned on, the retval of FUNC_PTR_OR_ERR_RET was
set to -ENOTSUP.
This is confusing.
The patch now always returns 0, whether there are no packets or an error occurred.

Signed-off-by: Cunming Liang 
---
 lib/librte_ether/rte_ethdev.c |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 50f10d9..922a0c6 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -2530,7 +2530,7 @@ rte_eth_rx_burst(uint8_t port_id, uint16_t queue_id,
return 0;
}
dev = &rte_eth_devices[port_id];
-   FUNC_PTR_OR_ERR_RET(*dev->rx_pkt_burst, -ENOTSUP);
+   FUNC_PTR_OR_ERR_RET(*dev->rx_pkt_burst, 0);
if (queue_id >= dev->data->nb_rx_queues) {
PMD_DEBUG_TRACE("Invalid RX queue_id=%d\n", queue_id);
return 0;
@@ -2551,7 +2551,7 @@ rte_eth_tx_burst(uint8_t port_id, uint16_t queue_id,
}
dev = &rte_eth_devices[port_id];

-   FUNC_PTR_OR_ERR_RET(*dev->tx_pkt_burst, -ENOTSUP);
+   FUNC_PTR_OR_ERR_RET(*dev->tx_pkt_burst, 0);
if (queue_id >= dev->data->nb_tx_queues) {
PMD_DEBUG_TRACE("Invalid TX queue_id=%d\n", queue_id);
return 0;
@@ -2570,7 +2570,7 @@ rte_eth_rx_queue_count(uint8_t port_id, uint16_t queue_id)
return 0;
}
dev = &rte_eth_devices[port_id];
-   FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_count, -ENOTSUP);
+   FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_count, 0);
return (*dev->dev_ops->rx_queue_count)(dev, queue_id);
 }

-- 
1.7.4.1



[dpdk-dev] [PATCH v6 0/3] app/test: unit test to measure cycles per packet

2014-10-27 Thread Liu, Yong
Tested-by: Yong Liu 

- Tested Commit: 1ab07743b21b785a71fa334641ab58e779532600
- OS: Fedora20 3.15.8-200.fc20.x86_64
- GCC: gcc version 4.8.3 20140624
- CPU: Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz
- NIC: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection 
[8086:10fb]
- Default x86_64-native-linuxapp-gcc configuration
- Total 2 cases, 2 passed, 0 failed

- Case: Continuous Mode Performance
  Description: Measure continuous mode cycles/packet in NIC loopback mode
  Command / instruction:
Start sample test application.
./app/test/test -n 1 -c 
Set stream control mode to continuous
RTE>>set_rxtx_sc continuous
Choose rx/tx pair between vector|scalar|full|hybrid
RTE>>set_rxtx_mode vector
Choose the area of measurement
RTE>>set_rxtx_anchor rxtx
Start pmd performance measurement
RTE>>pmd_perf_autotest
  Expected test result:
Test result is OK and output cycle number for each packet.

- Case: Burst Mode Performance
  Description: Measure burst mode cycles/packet in NIC loopback mode
  Command / instruction:
Start sample test application.
./app/test/test -n 1 -c 
Set stream control mode to poll_before_xmit or poll_after_xmit. 
RTE>>set_rxtx_sc poll_before_xmit
Start pmd performance measurement
RTE>>pmd_perf_autotest
  Expected test result:
Test result is OK and output cycle number for each packet.

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Cunming Liang
> Sent: Monday, October 27, 2014 9:20 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v6 0/3] app/test: unit test to measure cycles per
> packet



[dpdk-dev] [PATCH v8 00/10] Support VxLAN on Fortville

2014-10-27 Thread Jijiang Liu
The patch set supports VxLAN on Fortville based on the latest rte_mbuf structure.

It includes:
 - Support VxLAN packet identification by configuring UDP tunneling port.
 - Support VxLAN packet filters. It uses MAC and VLAN to point
   to a queue. The filter types supported are listed below:
   1. Inner MAC and Inner VLAN ID
   2. Inner MAC address, inner VLAN ID and tenant ID.
   3. Inner MAC and tenant ID
   4. Inner MAC address
   5. Outer MAC address, tenant ID and inner MAC
 - Support VxLAN TX checksum offload, which includes outer L3 (IP), inner L3 (IP)
and inner L4 (UDP, TCP and SCTP)

Change notes:

 v8)  * Fix the redundant "PKT_RX" prefix and the missing comma in
pkt_rx_flag_names[] in the rxonly.c file.

Jijiang Liu (10):
  change rte_mbuf structures 
  add data structures of UDP tunneling 
  add VxLAN packet identification API in librte_ether
  support VxLAN packet identification in i40e
  test VxLAN packet identification in testpmd.
  add data structures of tunneling filter in rte_eth_ctrl.h
  implement the API of VxLAN packet filter in i40e
  test VxLAN packet filter
  support VxLAN Tx checksum offload in i40e
  test VxLAN Tx checksum offload


 app/test-pmd/cmdline.c|  228 +-
 app/test-pmd/config.c |6 +-
 app/test-pmd/csumonly.c   |  194 --
 app/test-pmd/rxonly.c |   50 ++-
 lib/librte_ether/rte_eth_ctrl.h   |   61 +++
 lib/librte_ether/rte_ethdev.c |   52 ++
 lib/librte_ether/rte_ethdev.h |   54 ++
 lib/librte_ether/rte_ether.h  |   13 ++
 lib/librte_mbuf/rte_mbuf.h|   28 +++-
 lib/librte_pmd_i40e/i40e_ethdev.c |  331 -
 lib/librte_pmd_i40e/i40e_ethdev.h |8 +-
 lib/librte_pmd_i40e/i40e_rxtx.c   |  151 +++--
 12 files changed, 1096 insertions(+), 80 deletions(-)

-- 
1.7.7.6



[dpdk-dev] [PATCH v8 01/10] librte_mbuf:the rte_mbuf structure changes

2014-10-27 Thread Jijiang Liu
Replace the "reserved2" field with the "packet_type" field and add the 
"inner_l2_l3_len" field in the rte_mbuf structure.
The "packet_type" field is used to indicate the ordinary packet format and also
tunneling packet formats such as IP in IP, IP in GRE, MAC in GRE and MAC in UDP.
The "inner_l2_len" and "inner_l3_len" fields are added in the second cache
line; together they use 2 bytes for TX offloading of tunnels.

Signed-off-by: Jijiang Liu 
---
 lib/librte_mbuf/rte_mbuf.h |   25 -
 1 files changed, 24 insertions(+), 1 deletions(-)

diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index ddadc21..497d88b 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -163,7 +163,14 @@ struct rte_mbuf {

/* remaining bytes are set on RX when pulling packet from descriptor */
MARKER rx_descriptor_fields1;
-   uint16_t reserved2;   /**< Unused field. Required for padding */
+
+   /**
+* The packet type, which is used to indicate ordinary packet and also
+* tunneled packet format, i.e. each number is represented a type of
+* packet.
+*/
+   uint16_t packet_type;
+
uint16_t data_len;/**< Amount of data in segment buffer. */
uint32_t pkt_len; /**< Total pkt len: sum of all segments. */
uint16_t vlan_tci;/**< VLAN Tag Control Identifier (CPU order) 
*/
@@ -196,6 +203,18 @@ struct rte_mbuf {
uint16_t l2_len:7;  /**< L2 (MAC) Header Length. */
};
};
+
+   /* fields for TX offloading of tunnels */
+   union {
+   uint16_t inner_l2_l3_len;
+   /**< combined inner l2/l3 lengths as single var */
+   struct {
+   uint16_t inner_l3_len:9;
+   /**< inner L3 (IP) Header Length. */
+   uint16_t inner_l2_len:7;
+   /**< inner L2 (MAC) Header Length. */
+   };
+   };
 } __rte_cache_aligned;

 /**
@@ -546,11 +565,13 @@ static inline void rte_pktmbuf_reset(struct rte_mbuf *m)
m->next = NULL;
m->pkt_len = 0;
m->l2_l3_len = 0;
+   m->inner_l2_l3_len = 0;
m->vlan_tci = 0;
m->nb_segs = 1;
m->port = 0xff;

m->ol_flags = 0;
+   m->packet_type = 0;
m->data_off = (RTE_PKTMBUF_HEADROOM <= m->buf_len) ?
RTE_PKTMBUF_HEADROOM : m->buf_len;

@@ -614,12 +635,14 @@ static inline void rte_pktmbuf_attach(struct rte_mbuf 
*mi, struct rte_mbuf *md)
mi->port = md->port;
mi->vlan_tci = md->vlan_tci;
mi->l2_l3_len = md->l2_l3_len;
+   mi->inner_l2_l3_len = md->inner_l2_l3_len;
mi->hash = md->hash;

mi->next = NULL;
mi->pkt_len = mi->data_len;
mi->nb_segs = 1;
mi->ol_flags = md->ol_flags;
+   mi->packet_type = md->packet_type;

__rte_mbuf_sanity_check(mi, 1);
__rte_mbuf_sanity_check(md, 0);
-- 
1.7.7.6



[dpdk-dev] [PATCH v8 02/10] librte_ether:add the basic data structures of VxLAN

2014-10-27 Thread Jijiang Liu
Add definitions of the basic VxLAN data structures.

Signed-off-by: Jijiang Liu 
---
 lib/librte_ether/rte_eth_ctrl.h |   12 
 lib/librte_ether/rte_ethdev.h   |8 
 lib/librte_ether/rte_ether.h|   13 +
 3 files changed, 33 insertions(+), 0 deletions(-)

diff --git a/lib/librte_ether/rte_eth_ctrl.h b/lib/librte_ether/rte_eth_ctrl.h
index df21ac6..9a90d19 100644
--- a/lib/librte_ether/rte_eth_ctrl.h
+++ b/lib/librte_ether/rte_eth_ctrl.h
@@ -71,6 +71,18 @@ enum rte_filter_op {
RTE_ETH_FILTER_OP_MAX
 };

+/**
+ * Tunneled type.
+ */
+enum rte_eth_tunnel_type {
+   RTE_TUNNEL_TYPE_NONE = 0,
+   RTE_TUNNEL_TYPE_VXLAN,
+   RTE_TUNNEL_TYPE_GENEVE,
+   RTE_TUNNEL_TYPE_TEREDO,
+   RTE_TUNNEL_TYPE_NVGRE,
+   RTE_TUNNEL_TYPE_MAX,
+};
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index b69a6af..46a5568 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -710,6 +710,14 @@ struct rte_fdir_conf {
 };

 /**
+ * UDP tunneling configuration.
+ */
+struct rte_eth_udp_tunnel {
+   uint16_t udp_port;
+   uint8_t prot_type;
+};
+
+/**
  *  Possible l4type of FDIR filters.
  */
 enum rte_l4type {
diff --git a/lib/librte_ether/rte_ether.h b/lib/librte_ether/rte_ether.h
index 2e08f23..100cc52 100644
--- a/lib/librte_ether/rte_ether.h
+++ b/lib/librte_ether/rte_ether.h
@@ -286,6 +286,16 @@ struct vlan_hdr {
uint16_t eth_proto;/**< Ethernet type of encapsulated frame. */
 } __attribute__((__packed__));

+/**
+ * VXLAN protocol header.
+ * Contains the 8-bit flag, 24-bit VXLAN Network Identifier and
+ * Reserved fields (24 bits and 8 bits)
+ */
+struct vxlan_hdr {
+   uint32_t vx_flags; /**< flag (8) + Reserved (24). */
+   uint32_t vx_vni;   /**< VNI (24) + Reserved (8). */
+} __attribute__((__packed__));
+
 /* Ethernet frame types */
 #define ETHER_TYPE_IPv4 0x0800 /**< IPv4 Protocol. */
 #define ETHER_TYPE_IPv6 0x86DD /**< IPv6 Protocol. */
@@ -294,6 +304,9 @@ struct vlan_hdr {
 #define ETHER_TYPE_VLAN 0x8100 /**< IEEE 802.1Q VLAN tagging. */
 #define ETHER_TYPE_1588 0x88F7 /**< IEEE 802.1AS 1588 Precise Time Protocol. */

+#define ETHER_VXLAN_HLEN (sizeof(struct udp_hdr) + sizeof(struct vxlan_hdr))
+/**< VxLAN tunnel header length. */
+
 #ifdef __cplusplus
 }
 #endif
-- 
1.7.7.6



[dpdk-dev] [PATCH v8 03/10] librte_ether:add VxLAN packet identification API

2014-10-27 Thread Jijiang Liu
There are "some" destination UDP port numbers that have unique meanings.
In terms of VxLAN, "IANA has assigned the value 4789 for the VXLAN UDP port, 
and this value SHOULD be used by default as the destination UDP port. Some 
early implementations of VXLAN have used other values for the destination port. 
To enable interoperability with these implementations, the destination port 
SHOULD be configurable."

Add two APIs in librte_ether for supporting UDP tunneling port configuration on 
i40e.
Currently, only VxLAN is implemented in this patch set.

Signed-off-by: Jijiang Liu 
---
 lib/librte_ether/rte_ethdev.c |   52 +
 lib/librte_ether/rte_ethdev.h |   46 
 2 files changed, 98 insertions(+), 0 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 50f10d9..ff1c769 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -2038,6 +2038,58 @@ rte_eth_dev_rss_hash_conf_get(uint8_t port_id,
 }

 int
+rte_eth_dev_udp_tunnel_add(uint8_t port_id,
+  struct rte_eth_udp_tunnel *udp_tunnel)
+{
+   struct rte_eth_dev *dev;
+
+   if (port_id >= nb_ports) {
+   PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
+   return -ENODEV;
+   }
+
+   if (udp_tunnel == NULL) {
+   PMD_DEBUG_TRACE("Invalid udp_tunnel parameter\n");
+   return -EINVAL;
+   }
+
+   if (udp_tunnel->prot_type >= RTE_TUNNEL_TYPE_MAX) {
+   PMD_DEBUG_TRACE("Invalid tunnel type\n");
+   return -EINVAL;
+   }
+
+   dev = &rte_eth_devices[port_id];
+   FUNC_PTR_OR_ERR_RET(*dev->dev_ops->udp_tunnel_add, -ENOTSUP);
+   return (*dev->dev_ops->udp_tunnel_add)(dev, udp_tunnel);
+}
+
+int
+rte_eth_dev_udp_tunnel_delete(uint8_t port_id,
+ struct rte_eth_udp_tunnel *udp_tunnel)
+{
+   struct rte_eth_dev *dev;
+
+   if (port_id >= nb_ports) {
+   PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id);
+   return -ENODEV;
+   }
+   dev = &rte_eth_devices[port_id];
+
+   if (udp_tunnel == NULL) {
+   PMD_DEBUG_TRACE("Invalid udp_tunnel parametr\n");
+   return -EINVAL;
+   }
+
+   if (udp_tunnel->prot_type >= RTE_TUNNEL_TYPE_MAX) {
+   PMD_DEBUG_TRACE("Invalid tunnel type\n");
+   return -EINVAL;
+   }
+
+   FUNC_PTR_OR_ERR_RET(*dev->dev_ops->udp_tunnel_del, -ENOTSUP);
+   return (*dev->dev_ops->udp_tunnel_del)(dev, udp_tunnel);
+}
+
+int
 rte_eth_led_on(uint8_t port_id)
 {
struct rte_eth_dev *dev;
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 46a5568..8bf274d 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1274,6 +1274,15 @@ typedef int (*eth_mirror_rule_reset_t)(struct 
rte_eth_dev *dev,
  uint8_t rule_id);
 /**< @internal Remove a traffic mirroring rule on an Ethernet device */

+typedef int (*eth_udp_tunnel_add_t)(struct rte_eth_dev *dev,
+   struct rte_eth_udp_tunnel *tunnel_udp);
+/**< @internal Add tunneling UDP info */
+
+typedef int (*eth_udp_tunnel_del_t)(struct rte_eth_dev *dev,
+   struct rte_eth_udp_tunnel *tunnel_udp);
+/**< @internal Delete tunneling UDP info */
+
 #ifdef RTE_NIC_BYPASS

 enum {
@@ -1454,6 +1463,8 @@ struct eth_dev_ops {
 eth_set_vf_rx_t            set_vf_rx;            /**< enable/disable a VF receive */
 eth_set_vf_tx_t            set_vf_tx;            /**< enable/disable a VF transmit */
 eth_set_vf_vlan_filter_t   set_vf_vlan_filter;   /**< Set VF VLAN filter */
+eth_udp_tunnel_add_t       udp_tunnel_add;
+eth_udp_tunnel_del_t       udp_tunnel_del;
 eth_set_queue_rate_limit_t set_queue_rate_limit; /**< Set queue rate limit */
 eth_set_vf_rate_limit_t    set_vf_rate_limit;    /**< Set VF rate limit */

@@ -3350,6 +3361,41 @@ int
 rte_eth_dev_rss_hash_conf_get(uint8_t port_id,
  struct rte_eth_rss_conf *rss_conf);

+ /**
+ * Add a UDP tunneling port of an Ethernet device for identifying a specific
+ * type of tunneling packet by its UDP port number.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param tunnel_udp
+ *   UDP tunneling configuration.
+ *
+ * @return
+ *   - (0) if successful.
+ *   - (-ENODEV) if port identifier is invalid.
+ *   - (-ENOTSUP) if hardware doesn't support tunnel type.
+ */
+int
+rte_eth_dev_udp_tunnel_add(uint8_t port_id,
+  struct rte_eth_udp_tunnel *tunnel_udp);
+
+ /**
+ * Delete the UDP tunneling port configuration of an Ethernet device.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param tunnel_udp
+ *   UDP tunneling configuration.
+ *
+ * @return
+ *   - (0) if successful.
+ *   - (-ENODEV) if port identifier is invalid.

[dpdk-dev] [PATCH v8 04/10] i40e:support VxLAN packet identification in i40e

2014-10-27 Thread Jijiang Liu
Implement the configuration API of VxLAN destination UDP port in librte_pmd_i40e,
and add new Rx offload flags for supporting VxLAN packet offload.

Signed-off-by: Jijiang Liu 
---
 lib/librte_mbuf/rte_mbuf.h|2 +
 lib/librte_pmd_i40e/i40e_ethdev.c |  157 +
 lib/librte_pmd_i40e/i40e_ethdev.h |8 ++-
 lib/librte_pmd_i40e/i40e_rxtx.c   |  105 +---
 4 files changed, 223 insertions(+), 49 deletions(-)

diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 497d88b..9af3bd9 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -91,6 +91,8 @@ extern "C" {
 #define PKT_RX_IPV6_HDR_EXT  (1ULL << 8)  /**< RX packet with extended IPv6 header. */
 #define PKT_RX_IEEE1588_PTP  (1ULL << 9)  /**< RX IEEE1588 L2 Ethernet PT Packet. */
 #define PKT_RX_IEEE1588_TMST (1ULL << 10) /**< RX IEEE1588 L2/L4 timestamped packet. */
+#define PKT_RX_TUNNEL_IPV4_HDR (1ULL << 11) /**< RX tunnel packet with IPv4 header. */
+#define PKT_RX_TUNNEL_IPV6_HDR (1ULL << 12) /**< RX tunnel packet with IPv6 header. */

 #define PKT_TX_VLAN_PKT  (1ULL << 55) /**< TX packet is a 802.1q VLAN packet. */
 #define PKT_TX_IP_CKSUM  (1ULL << 54) /**< IP cksum of TX pkt. computed by NIC. */
diff --git a/lib/librte_pmd_i40e/i40e_ethdev.c b/lib/librte_pmd_i40e/i40e_ethdev.c
index 3b75f0f..eb643e5 100644
--- a/lib/librte_pmd_i40e/i40e_ethdev.c
+++ b/lib/librte_pmd_i40e/i40e_ethdev.c
@@ -186,6 +186,10 @@ static int i40e_dev_rss_hash_update(struct rte_eth_dev *dev,
struct rte_eth_rss_conf *rss_conf);
 static int i40e_dev_rss_hash_conf_get(struct rte_eth_dev *dev,
  struct rte_eth_rss_conf *rss_conf);
+static int i40e_dev_udp_tunnel_add(struct rte_eth_dev *dev,
+   struct rte_eth_udp_tunnel *udp_tunnel);
+static int i40e_dev_udp_tunnel_del(struct rte_eth_dev *dev,
+   struct rte_eth_udp_tunnel *udp_tunnel);
 static int i40e_dev_filter_ctrl(struct rte_eth_dev *dev,
enum rte_filter_type filter_type,
enum rte_filter_op filter_op,
@@ -241,6 +245,8 @@ static struct eth_dev_ops i40e_eth_dev_ops = {
.reta_query   = i40e_dev_rss_reta_query,
.rss_hash_update  = i40e_dev_rss_hash_update,
.rss_hash_conf_get= i40e_dev_rss_hash_conf_get,
+   .udp_tunnel_add   = i40e_dev_udp_tunnel_add,
+   .udp_tunnel_del   = i40e_dev_udp_tunnel_del,
.filter_ctrl  = i40e_dev_filter_ctrl,
 };

@@ -4092,6 +4098,157 @@ i40e_dev_rss_hash_conf_get(struct rte_eth_dev *dev,
return 0;
 }

+static int
+i40e_get_vxlan_port_idx(struct i40e_pf *pf, uint16_t port)
+{
+   uint8_t i;
+
+   for (i = 0; i < I40E_MAX_PF_UDP_OFFLOAD_PORTS; i++) {
+   if (pf->vxlan_ports[i] == port)
+   return i;
+   }
+
+   return -1;
+}
+
+static int
+i40e_add_vxlan_port(struct i40e_pf *pf, uint16_t port)
+{
+   int  idx, ret;
+   uint8_t filter_idx;
+   struct i40e_hw *hw = I40E_PF_TO_HW(pf);
+
+   idx = i40e_get_vxlan_port_idx(pf, port);
+
+   /* Check if port already exists */
+   if (idx >= 0) {
+   PMD_DRV_LOG(ERR, "Port %d already offloaded\n", port);
+   return -EINVAL;
+   }
+
+   /* Now check if there is space to add the new port */
+   idx = i40e_get_vxlan_port_idx(pf, 0);
+   if (idx < 0) {
+   PMD_DRV_LOG(ERR, "Maximum number of UDP ports reached, "
+   "not adding port %d\n", port);
+   return -ENOSPC;
+   }
+
+   ret =  i40e_aq_add_udp_tunnel(hw, port, I40E_AQC_TUNNEL_TYPE_VXLAN,
+   &filter_idx, NULL);
+   if (ret < 0) {
+   PMD_DRV_LOG(ERR, "Failed to add VxLAN UDP port %d\n", port);
+   return -1;
+   }
+
+   PMD_DRV_LOG(INFO, "Added UDP port %d with AQ command, filter index %d\n",
+   port, filter_idx);
+
+   /* New port: add it and mark its index in the bitmap */
+   pf->vxlan_ports[idx] = port;
+   pf->vxlan_bitmap |= (1 << idx);
+
+   if (!(pf->flags & I40E_FLAG_VXLAN))
+   pf->flags |= I40E_FLAG_VXLAN;
+
+   return 0;
+}
+
+static int
+i40e_del_vxlan_port(struct i40e_pf *pf, uint16_t port)
+{
+   int idx;
+   struct i40e_hw *hw = I40E_PF_TO_HW(pf);
+
+   if (!(pf->flags & I40E_FLAG_VXLAN)) {
+   PMD_DRV_LOG(ERR, "VxLAN UDP port was not configured.\n");
+   return -EINVAL;
+   }
+
+   idx = i40e_get_vxlan_port_idx(pf, port);
+
+   if (idx < 0) {
+   PMD_DRV_LOG(ERR, "Port %d doesn't exist\n", port);
+   return -EINVAL;
+   }
+
+   if (i40e_aq_del_udp_tunnel(hw, idx, NULL) < 0) {
+   PMD_DRV_

[dpdk-dev] [PATCH v8 05/10] app/test-pmd:test VxLAN packet identification

2014-10-27 Thread Jijiang Liu
Add two commands to test VxLAN packet identification.
The test steps are as follows:
 1> use commands to add/delete VxLAN UDP port.
 2> use rxonly mode to receive VxLAN packet.

Signed-off-by: Jijiang Liu 
---
 app/test-pmd/cmdline.c |   65 
 app/test-pmd/rxonly.c  |   55 +++-
 2 files changed, 118 insertions(+), 2 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 0b972f9..4d7b4d1 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -285,6 +285,12 @@ static void cmd_help_long_parsed(void *parsed_result,
"Set the outer VLAN TPID for Packet Filtering on"
" a port\n\n"

+   "rx_vxlan_port add (udp_port) (port_id)\n"
+   "Add a UDP port for VxLAN packet filter on a port\n\n"
+
+   "rx_vxlan_port rm (udp_port) (port_id)\n"
+   "Remove a UDP port for VxLAN packet filter on a port\n\n"
+
"tx_vlan set vlan_id (port_id)\n"
"Set hardware insertion of VLAN ID in packets sent"
" on a port.\n\n"
@@ -6225,6 +6231,64 @@ cmdline_parse_inst_t cmd_vf_rate_limit = {
},
 };

+/* *** CONFIGURE TUNNEL UDP PORT *** */
+struct cmd_tunnel_udp_config {
+   cmdline_fixed_string_t cmd;
+   cmdline_fixed_string_t what;
+   uint16_t udp_port;
+   uint8_t port_id;
+};
+
+static void
+cmd_tunnel_udp_config_parsed(void *parsed_result,
+ __attribute__((unused)) struct cmdline *cl,
+ __attribute__((unused)) void *data)
+{
+   struct cmd_tunnel_udp_config *res = parsed_result;
+   struct rte_eth_udp_tunnel tunnel_udp;
+   int ret;
+
+   tunnel_udp.udp_port = res->udp_port;
+
+   if (!strcmp(res->cmd, "rx_vxlan_port"))
+   tunnel_udp.prot_type = RTE_TUNNEL_TYPE_VXLAN;
+
+   if (!strcmp(res->what, "add"))
+   ret = rte_eth_dev_udp_tunnel_add(res->port_id, &tunnel_udp);
+   else
+   ret = rte_eth_dev_udp_tunnel_delete(res->port_id, &tunnel_udp);
+
+   if (ret < 0)
+   printf("udp tunneling port add/rm error: (%s)\n", strerror(-ret));
+}
+
+cmdline_parse_token_string_t cmd_tunnel_udp_config_cmd =
+   TOKEN_STRING_INITIALIZER(struct cmd_tunnel_udp_config,
+   cmd, "rx_vxlan_port");
+cmdline_parse_token_string_t cmd_tunnel_udp_config_what =
+   TOKEN_STRING_INITIALIZER(struct cmd_tunnel_udp_config,
+   what, "add#rm");
+cmdline_parse_token_num_t cmd_tunnel_udp_config_udp_port =
+   TOKEN_NUM_INITIALIZER(struct cmd_tunnel_udp_config,
+   udp_port, UINT16);
+cmdline_parse_token_num_t cmd_tunnel_udp_config_port_id =
+   TOKEN_NUM_INITIALIZER(struct cmd_tunnel_udp_config,
+   port_id, UINT8);
+
+cmdline_parse_inst_t cmd_tunnel_udp_config = {
+   .f = cmd_tunnel_udp_config_parsed,
+   .data = (void *)0,
+   .help_str = "add/rm a tunneling UDP port filter: "
+   "rx_vxlan_port add udp_port port_id",
+   .tokens = {
+   (void *)&cmd_tunnel_udp_config_cmd,
+   (void *)&cmd_tunnel_udp_config_what,
+   (void *)&cmd_tunnel_udp_config_udp_port,
+   (void *)&cmd_tunnel_udp_config_port_id,
+   NULL,
+   },
+};
+
 /* *** CONFIGURE VM MIRROR VLAN/POOL RULE *** */
 struct cmd_set_mirror_mask_result {
cmdline_fixed_string_t set;
@@ -7518,6 +7582,7 @@ cmdline_parse_ctx_t main_ctx[] = {
(cmdline_parse_inst_t *)&cmd_vf_rxvlan_filter,
(cmdline_parse_inst_t *)&cmd_queue_rate_limit,
(cmdline_parse_inst_t *)&cmd_vf_rate_limit,
+   (cmdline_parse_inst_t *)&cmd_tunnel_udp_config,
(cmdline_parse_inst_t *)&cmd_set_mirror_mask,
(cmdline_parse_inst_t *)&cmd_set_mirror_link,
(cmdline_parse_inst_t *)&cmd_reset_mirror_rule,
diff --git a/app/test-pmd/rxonly.c b/app/test-pmd/rxonly.c
index 98c788b..d3be62e 100644
--- a/app/test-pmd/rxonly.c
+++ b/app/test-pmd/rxonly.c
@@ -66,10 +66,12 @@
 #include 
 #include 
 #include 
+#include 
+#include 

 #include "testpmd.h"

-#define MAX_PKT_RX_FLAGS 11
+#define MAX_PKT_RX_FLAGS 13
 static const char *pkt_rx_flag_names[MAX_PKT_RX_FLAGS] = {
"VLAN_PKT",
"RSS_HASH",
@@ -84,6 +86,9 @@ static const char *pkt_rx_flag_names[MAX_PKT_RX_FLAGS] = {

"IEEE1588_PTP",
"IEEE1588_TMST",
+
+   "TUNNEL_IPV4_HDR",
+   "TUNNEL_IPV6_HDR",
 };

 static inline void
@@ -111,7 +116,9 @@ pkt_burst_receive(struct fwd_stream *fs)
uint16_t eth_type;
uint64_t ol_flags;
uint16_t nb_rx;
-   uint16_t i;
+   uint16_t i, packet_type;
+   uint64_t is_encapsulation;
+
 #ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
uint64_t start_tsc;

[dpdk-dev] [PATCH v8 06/10] librte_ether:add data structures of VxLAN filter

2014-10-27 Thread Jijiang Liu
Add definitions of the data structures for tunneling packet filters in the
rte_eth_ctrl.h file.

Signed-off-by: Jijiang Liu 
---
 lib/librte_ether/rte_eth_ctrl.h |   49 +++
 1 files changed, 49 insertions(+), 0 deletions(-)

diff --git a/lib/librte_ether/rte_eth_ctrl.h b/lib/librte_ether/rte_eth_ctrl.h
index 9a90d19..b4ab731 100644
--- a/lib/librte_ether/rte_eth_ctrl.h
+++ b/lib/librte_ether/rte_eth_ctrl.h
@@ -51,6 +51,7 @@ extern "C" {
  */
 enum rte_filter_type {
RTE_ETH_FILTER_NONE = 0,
+   RTE_ETH_FILTER_TUNNEL,
RTE_ETH_FILTER_MAX
 };

@@ -83,6 +84,54 @@ enum rte_eth_tunnel_type {
RTE_TUNNEL_TYPE_MAX,
 };

+/**
+ * filter type of tunneling packet
+ */
+#define ETH_TUNNEL_FILTER_OMAC  0x01 /**< filter by outer MAC addr */
+#define ETH_TUNNEL_FILTER_OIP   0x02 /**< filter by outer IP Addr */
+#define ETH_TUNNEL_FILTER_TENID 0x04 /**< filter by tenant ID */
+#define ETH_TUNNEL_FILTER_IMAC  0x08 /**< filter by inner MAC addr */
+#define ETH_TUNNEL_FILTER_IVLAN 0x10 /**< filter by inner VLAN ID */
+#define ETH_TUNNEL_FILTER_IIP   0x20 /**< filter by inner IP addr */
+
+#define RTE_TUNNEL_FILTER_IMAC_IVLAN (ETH_TUNNEL_FILTER_IMAC | \
+   ETH_TUNNEL_FILTER_IVLAN)
+#define RTE_TUNNEL_FILTER_IMAC_IVLAN_TENID (ETH_TUNNEL_FILTER_IMAC | \
+   ETH_TUNNEL_FILTER_IVLAN | \
+   ETH_TUNNEL_FILTER_TENID)
+#define RTE_TUNNEL_FILTER_IMAC_TENID (ETH_TUNNEL_FILTER_IMAC | \
+   ETH_TUNNEL_FILTER_TENID)
+#define RTE_TUNNEL_FILTER_OMAC_TENID_IMAC (ETH_TUNNEL_FILTER_OMAC | \
+   ETH_TUNNEL_FILTER_TENID | \
+   ETH_TUNNEL_FILTER_IMAC)
+
+/**
+ *  Select IPv4 or IPv6 for tunnel filters.
+ */
+enum rte_tunnel_iptype {
+   RTE_TUNNEL_IPTYPE_IPV4 = 0, /**< IPv4. */
+   RTE_TUNNEL_IPTYPE_IPV6, /**< IPv6. */
+};
+
+/**
+ * Tunneling Packet filter configuration.
+ */
+struct rte_eth_tunnel_filter_conf {
+   struct ether_addr *outer_mac;  /**< Outer MAC address filter. */
+   struct ether_addr *inner_mac;  /**< Inner MAC address filter. */
+   uint16_t inner_vlan;   /**< Inner VLAN filter. */
+   enum rte_tunnel_iptype ip_type; /**< IP address type. */
+   union {
+   uint32_t ipv4_addr;/**< IPv4 source address to match. */
+   uint32_t ipv6_addr[4]; /**< IPv6 source address to match. */
+   } ip_addr; /**< IPv4/IPv6 source address to match (union of above). */
+
+   uint16_t filter_type;   /**< Filter type. */
+   enum rte_eth_tunnel_type tunnel_type; /**< Tunnel Type. */
+   uint32_t tenant_id; /**< Tenant number. */
+   uint16_t queue_id;  /**< Queue number. */
+};
+
 #ifdef __cplusplus
 }
 #endif
-- 
1.7.7.6



[dpdk-dev] [PATCH v8 07/10] i40e:implement the API of VxLAN filter in librte_pmd_i40e

2014-10-27 Thread Jijiang Liu
The filter types supported are listed below for VxLAN:
   1. Inner MAC and Inner VLAN ID.
   2. Inner MAC address, inner VLAN ID and tenant ID.
   3. Inner MAC and tenant ID.
   4. Inner MAC address.
   5. Outer MAC address, tenant ID and inner MAC address.

Signed-off-by: Jijiang Liu 
---
 lib/librte_pmd_i40e/i40e_ethdev.c |  174 -
 1 files changed, 172 insertions(+), 2 deletions(-)

diff --git a/lib/librte_pmd_i40e/i40e_ethdev.c b/lib/librte_pmd_i40e/i40e_ethdev.c
index eb643e5..be83268 100644
--- a/lib/librte_pmd_i40e/i40e_ethdev.c
+++ b/lib/librte_pmd_i40e/i40e_ethdev.c
@@ -48,6 +48,7 @@
 #include 
 #include 
 #include 
+#include 

 #include "i40e_logs.h"
 #include "i40e/i40e_register_x710_int.h"
@@ -4099,6 +4100,108 @@ i40e_dev_rss_hash_conf_get(struct rte_eth_dev *dev,
 }

 static int
+i40e_dev_get_filter_type(uint16_t filter_type, uint16_t *flag)
+{
+   switch (filter_type) {
+   case RTE_TUNNEL_FILTER_IMAC_IVLAN:
+   *flag = I40E_AQC_ADD_CLOUD_FILTER_IMAC_IVLAN;
+   break;
+   case RTE_TUNNEL_FILTER_IMAC_IVLAN_TENID:
+   *flag = I40E_AQC_ADD_CLOUD_FILTER_IMAC_IVLAN_TEN_ID;
+   break;
+   case RTE_TUNNEL_FILTER_IMAC_TENID:
+   *flag = I40E_AQC_ADD_CLOUD_FILTER_IMAC_TEN_ID;
+   break;
+   case RTE_TUNNEL_FILTER_OMAC_TENID_IMAC:
+   *flag = I40E_AQC_ADD_CLOUD_FILTER_OMAC_TEN_ID_IMAC;
+   break;
+   case ETH_TUNNEL_FILTER_IMAC:
+   *flag = I40E_AQC_ADD_CLOUD_FILTER_IMAC;
+   break;
+   default:
+   PMD_DRV_LOG(ERR, "invalid tunnel filter type\n");
+   return -EINVAL;
+   }
+
+   return 0;
+}
+
+static int
+i40e_dev_tunnel_filter_set(struct i40e_pf *pf,
+   struct rte_eth_tunnel_filter_conf *tunnel_filter,
+   uint8_t add)
+{
+   uint16_t ip_type;
+   uint8_t tun_type = 0;
+   int val, ret = 0;
+   struct i40e_hw *hw = I40E_PF_TO_HW(pf);
+   struct i40e_vsi *vsi = pf->main_vsi;
+   struct i40e_aqc_add_remove_cloud_filters_element_data  *cld_filter;
+   struct i40e_aqc_add_remove_cloud_filters_element_data  *pfilter;
+
+   cld_filter = rte_zmalloc("tunnel_filter",
+   sizeof(struct i40e_aqc_add_remove_cloud_filters_element_data),
+   0);
+
+   if (NULL == cld_filter) {
+   PMD_DRV_LOG(ERR, "Failed to alloc memory.\n");
+   return -EINVAL;
+   }
+   pfilter = cld_filter;
+
+   (void)rte_memcpy(&pfilter->outer_mac, tunnel_filter->outer_mac,
+   sizeof(struct ether_addr));
+   (void)rte_memcpy(&pfilter->inner_mac, tunnel_filter->inner_mac,
+   sizeof(struct ether_addr));
+
+   pfilter->inner_vlan = tunnel_filter->inner_vlan;
+   if (tunnel_filter->ip_type == RTE_TUNNEL_IPTYPE_IPV4) {
+   ip_type = I40E_AQC_ADD_CLOUD_FLAGS_IPV4;
+   (void)rte_memcpy(&pfilter->ipaddr.v4.data,
+   &tunnel_filter->ip_addr,
+   sizeof(pfilter->ipaddr.v4.data));
+   } else {
+   ip_type = I40E_AQC_ADD_CLOUD_FLAGS_IPV6;
+   (void)rte_memcpy(&pfilter->ipaddr.v6.data,
+   &tunnel_filter->ip_addr,
+   sizeof(pfilter->ipaddr.v6.data));
+   }
+
+   /* check tunneled type */
+   switch (tunnel_filter->tunnel_type) {
+   case RTE_TUNNEL_TYPE_VXLAN:
+   tun_type = I40E_AQC_ADD_CLOUD_TNL_TYPE_XVLAN;
+   break;
+   default:
+   /* Other tunnel types are not supported. */
+   PMD_DRV_LOG(ERR, "tunnel type is not supported.\n");
+   rte_free(cld_filter);
+   return -EINVAL;
+   }
+
+   val = i40e_dev_get_filter_type(tunnel_filter->filter_type,
+   &pfilter->flags);
+   if (val < 0) {
+   rte_free(cld_filter);
+   return -EINVAL;
+   }
+
+   pfilter->flags |= I40E_AQC_ADD_CLOUD_FLAGS_TO_QUEUE | ip_type |
+   (tun_type << I40E_AQC_ADD_CLOUD_TNL_TYPE_SHIFT);
+   pfilter->tenant_id = tunnel_filter->tenant_id;
+   pfilter->queue_number = tunnel_filter->queue_id;
+
+   if (add)
+   ret = i40e_aq_add_cloud_filters(hw, vsi->seid, cld_filter, 1);
+   else
+   ret = i40e_aq_remove_cloud_filters(hw, vsi->seid,
+   cld_filter, 1);
+
+   rte_free(cld_filter);
+   return ret;
+}
+
+static int
 i40e_get_vxlan_port_idx(struct i40e_pf *pf, uint16_t port)
 {
uint8_t i;
@@ -4286,6 +4389,72 @@ i40e_pf_config_rss(struct i40e_pf *pf)
 }

 static int
+i40e_tunnel_filter_param_check(struct i40e_pf *pf,
+   struct rte_eth_tunnel_filter_conf *filter)
+{
+   if (pf == NULL || filter == NULL) {
+   

[dpdk-dev] [PATCH v8 08/10] app/testpmd:test VxLAN packet filter

2014-10-27 Thread Jijiang Liu
Add the "tunnel_filter" command in testpmd to test the API of VxLAN packet 
filter.

Signed-off-by: Jijiang Liu 
---
 app/test-pmd/cmdline.c |  150 
 1 files changed, 150 insertions(+), 0 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 4d7b4d1..da5d272 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -285,6 +285,14 @@ static void cmd_help_long_parsed(void *parsed_result,
"Set the outer VLAN TPID for Packet Filtering on"
" a port\n\n"

+   "tunnel_filter add (port_id) (outer_mac) (inner_mac) (ip_addr) "
+   "(inner_vlan) (tunnel_type) (filter_type) (tenant_id) (queue_id)\n"
+   "   add a tunnel filter on a port.\n\n"
+
+   "tunnel_filter rm (port_id) (outer_mac) (inner_mac) (ip_addr) "
+   "(inner_vlan) (tunnel_type) (filter_type) (tenant_id) (queue_id)\n"
+   "   remove a tunnel filter from a port.\n\n"
+
"rx_vxlan_port add (udp_port) (port_id)\n"
"Add a UDP port for VxLAN packet filter on a port\n\n"

@@ -6231,6 +6239,147 @@ cmdline_parse_inst_t cmd_vf_rate_limit = {
},
 };

+/* *** ADD TUNNEL FILTER OF A PORT *** */
+struct cmd_tunnel_filter_result {
+   cmdline_fixed_string_t cmd;
+   cmdline_fixed_string_t what;
+   uint8_t port_id;
+   struct ether_addr outer_mac;
+   struct ether_addr inner_mac;
+   cmdline_ipaddr_t ip_value;
+   uint16_t inner_vlan;
+   cmdline_fixed_string_t tunnel_type;
+   cmdline_fixed_string_t filter_type;
+   uint32_t tenant_id;
+   uint16_t queue_num;
+};
+
+static void
+cmd_tunnel_filter_parsed(void *parsed_result,
+ __attribute__((unused)) struct cmdline *cl,
+ __attribute__((unused)) void *data)
+{
+   struct cmd_tunnel_filter_result *res = parsed_result;
+   struct rte_eth_tunnel_filter_conf tunnel_filter_conf;
+   int ret = 0;
+
+   tunnel_filter_conf.outer_mac = &res->outer_mac;
+   tunnel_filter_conf.inner_mac = &res->inner_mac;
+   tunnel_filter_conf.inner_vlan = res->inner_vlan;
+
+   if (res->ip_value.family == AF_INET) {
+   tunnel_filter_conf.ip_addr.ipv4_addr =
+   res->ip_value.addr.ipv4.s_addr;
+   tunnel_filter_conf.ip_type = RTE_TUNNEL_IPTYPE_IPV4;
+   } else {
+   memcpy(&(tunnel_filter_conf.ip_addr.ipv6_addr),
+   &(res->ip_value.addr.ipv6),
+   sizeof(struct in6_addr));
+   tunnel_filter_conf.ip_type = RTE_TUNNEL_IPTYPE_IPV6;
+   }
+
+   if (!strcmp(res->filter_type, "imac-ivlan"))
+   tunnel_filter_conf.filter_type = RTE_TUNNEL_FILTER_IMAC_IVLAN;
+   else if (!strcmp(res->filter_type, "imac-ivlan-tenid"))
+   tunnel_filter_conf.filter_type =
+   RTE_TUNNEL_FILTER_IMAC_IVLAN_TENID;
+   else if (!strcmp(res->filter_type, "imac-tenid"))
+   tunnel_filter_conf.filter_type = RTE_TUNNEL_FILTER_IMAC_TENID;
+   else if (!strcmp(res->filter_type, "imac"))
+   tunnel_filter_conf.filter_type = ETH_TUNNEL_FILTER_IMAC;
+   else if (!strcmp(res->filter_type, "omac-imac-tenid"))
+   tunnel_filter_conf.filter_type =
+   RTE_TUNNEL_FILTER_OMAC_TENID_IMAC;
+   else {
+   printf("The filter type is not supported.\n");
+   return;
+   }
+
+   if (!strcmp(res->tunnel_type, "vxlan"))
+   tunnel_filter_conf.tunnel_type = RTE_TUNNEL_TYPE_VXLAN;
+   else {
+   printf("Only VxLAN is supported now.\n");
+   return;
+   }
+
+   tunnel_filter_conf.tenant_id = res->tenant_id;
+   tunnel_filter_conf.queue_id = res->queue_num;
+   if (!strcmp(res->what, "add"))
+   ret = rte_eth_dev_filter_ctrl(res->port_id,
+   RTE_ETH_FILTER_TUNNEL,
+   RTE_ETH_FILTER_ADD,
+   &tunnel_filter_conf);
+   else
+   ret = rte_eth_dev_filter_ctrl(res->port_id,
+   RTE_ETH_FILTER_TUNNEL,
+   RTE_ETH_FILTER_DELETE,
+   &tunnel_filter_conf);
+   if (ret < 0)
+   printf("cmd_tunnel_filter_parsed error: (%s)\n",
+   strerror(-ret));
+
+}
+cmdline_parse_token_string_t cmd_tunnel_filter_cmd =
+   TOKEN_STRING_INITIALIZER(struct cmd_tunnel_filter_result,
+   cmd, "tunnel_filter");
+cmdline_parse_token_string_t cmd_tunnel_filter_what =
+   TOKEN_STRING_INITIALIZER(struct cmd_tunnel_filter_result,
+   what, "add#rm");
+cmdline_parse_token_num_t cmd_tunn

[dpdk-dev] [PATCH v8 09/10] i40e:support VxLAN Tx checksum offload

2014-10-27 Thread Jijiang Liu
Support VxLAN Tx checksum offload, which include
  - outer L3(IP) checksum offload
  - inner L3(IP) checksum offload
  - inner L4(UDP, TCP and SCTP) checksum offload

Signed-off-by: Jijiang Liu 
---
 lib/librte_mbuf/rte_mbuf.h  |1 +
 lib/librte_pmd_i40e/i40e_rxtx.c |   46 +-
 2 files changed, 41 insertions(+), 6 deletions(-)

diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 9af3bd9..a86dedf 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -96,6 +96,7 @@ extern "C" {

 #define PKT_TX_VLAN_PKT  (1ULL << 55) /**< TX packet is a 802.1q VLAN packet. */
 #define PKT_TX_IP_CKSUM  (1ULL << 54) /**< IP cksum of TX pkt. computed by NIC. */
+#define PKT_TX_VXLAN_CKSUM   (1ULL << 50) /**< TX checksum of VxLAN computed by NIC */
 #define PKT_TX_IPV4_CSUM PKT_TX_IP_CKSUM /**< Alias of PKT_TX_IP_CKSUM. */
 #define PKT_TX_IPV4  PKT_RX_IPV4_HDR /**< IPv4 with no IP checksum offload. */
 #define PKT_TX_IPV6  PKT_RX_IPV6_HDR /**< IPv6 packet */
diff --git a/lib/librte_pmd_i40e/i40e_rxtx.c b/lib/librte_pmd_i40e/i40e_rxtx.c
--- a/lib/librte_pmd_i40e/i40e_rxtx.c
+++ b/lib/librte_pmd_i40e/i40e_rxtx.c
@@ -411,11 +411,14 @@ i40e_rxd_ptype_to_pkt_flags(uint64_t qword)
 }

 static inline void
-i40e_txd_enable_checksum(uint32_t ol_flags,
+i40e_txd_enable_checksum(uint64_t ol_flags,
uint32_t *td_cmd,
uint32_t *td_offset,
uint8_t l2_len,
-   uint8_t l3_len)
+   uint16_t l3_len,
+   uint8_t inner_l2_len,
+   uint16_t inner_l3_len,
+   uint32_t *cd_tunneling)
 {
if (!l2_len) {
PMD_DRV_LOG(DEBUG, "L2 length set to 0");
@@ -428,6 +431,27 @@ i40e_txd_enable_checksum(uint32_t ol_flags,
return;
}

+   /* VxLAN packet TX checksum offload */
+   if (unlikely(ol_flags & PKT_TX_VXLAN_CKSUM)) {
+   uint8_t l4tun_len;
+
+   l4tun_len = ETHER_VXLAN_HLEN + inner_l2_len;
+
+   if (ol_flags & PKT_TX_IPV4_CSUM)
+   *cd_tunneling |= I40E_TX_CTX_EXT_IP_IPV4;
+   else if (ol_flags & PKT_TX_IPV6)
+   *cd_tunneling |= I40E_TX_CTX_EXT_IP_IPV6;
+
+   /* Now set the ctx descriptor fields */
+   *cd_tunneling |= (l3_len >> 2) <<
+   I40E_TXD_CTX_QW0_EXT_IPLEN_SHIFT |
+   I40E_TXD_CTX_UDP_TUNNELING |
+   (l4tun_len >> 1) <<
+   I40E_TXD_CTX_QW0_NATLEN_SHIFT;
+
+   l3_len = inner_l3_len;
+   }
+
/* Enable L3 checksum offloads */
if (ol_flags & PKT_TX_IPV4_CSUM) {
*td_cmd |= I40E_TX_DESC_CMD_IIPT_IPV4_CSUM;
@@ -1077,7 +1101,10 @@ i40e_recv_scattered_pkts(void *rx_queue,
 static inline uint16_t
 i40e_calc_context_desc(uint64_t flags)
 {
-   uint16_t mask = 0;
+   uint64_t mask = 0ULL;
+
+   if (flags & PKT_TX_VXLAN_CKSUM)
+   mask |= PKT_TX_VXLAN_CKSUM;

 #ifdef RTE_LIBRTE_IEEE1588
mask |= PKT_TX_IEEE1588_TMST;
@@ -1098,6 +1125,7 @@ i40e_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
volatile struct i40e_tx_desc *txr;
struct rte_mbuf *tx_pkt;
struct rte_mbuf *m_seg;
+   uint32_t cd_tunneling_params;
uint16_t tx_id;
uint16_t nb_tx;
uint32_t td_cmd;
@@ -1106,7 +1134,9 @@ i40e_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
uint32_t td_tag;
uint64_t ol_flags;
uint8_t l2_len;
-   uint8_t l3_len;
+   uint16_t l3_len;
+   uint8_t inner_l2_len;
+   uint16_t inner_l3_len;
uint16_t nb_used;
uint16_t nb_ctx;
uint16_t tx_last;
@@ -1134,7 +1164,9 @@ i40e_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)

ol_flags = tx_pkt->ol_flags;
l2_len = tx_pkt->l2_len;
+   inner_l2_len = tx_pkt->inner_l2_len;
l3_len = tx_pkt->l3_len;
+   inner_l3_len = tx_pkt->inner_l3_len;

/* Calculate the number of context descriptors needed. */
nb_ctx = i40e_calc_context_desc(ol_flags);
@@ -1182,15 +1214,17 @@ i40e_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
td_cmd |= I40E_TX_DESC_CMD_ICRC;

/* Enable checksum offloading */
+   cd_tunneling_params = 0;
i40e_txd_enable_checksum(ol_flags, &td_cmd, &td_offset,
-   l2_len, l3_len);
+   l2_len, l3_len, inner_l2_len,
+   inner_l3_len,
+

[dpdk-dev] [PATCH v8 10/10] app/testpmd:test VxLAN Tx checksum offload

2014-10-27 Thread Jijiang Liu
Add test cases in testpmd to test VxLAN Tx Checksum offload, which include
 - IPv4 and IPv6 packet
 - outer L3, inner L3 and L4 checksum offload for Tx side.

Signed-off-by: Jijiang Liu 
---
 app/test-pmd/cmdline.c  |   13 ++-
 app/test-pmd/config.c   |6 +-
 app/test-pmd/csumonly.c |  194 +++
 3 files changed, 192 insertions(+), 21 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index da5d272..757c399 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -310,13 +310,17 @@ static void cmd_help_long_parsed(void *parsed_result,
"Disable hardware insertion of a VLAN header in"
" packets sent on a port.\n\n"

-   "tx_checksum set mask (port_id)\n"
+   "tx_checksum set (mask) (port_id)\n"
"Enable hardware insertion of checksum offload with"
-   " the 4-bit mask, 0~0xf, in packets sent on a port.\n"
+   " the 8-bit mask, 0~0xff, in packets sent on a port.\n"
"bit 0 - insert ip   checksum offload if set\n"
"bit 1 - insert udp  checksum offload if set\n"
"bit 2 - insert tcp  checksum offload if set\n"
"bit 3 - insert sctp checksum offload if set\n"
+   "bit 4 - insert inner ip   checksum offload if set\n"
+   "bit 5 - insert inner udp  checksum offload if set\n"
+   "bit 6 - insert inner tcp  checksum offload if set\n"
+   "bit 7 - insert inner sctp checksum offload if set\n"
"Please check the NIC datasheet for HW limits.\n\n"

"set fwd (%s)\n"
@@ -2763,8 +2767,9 @@ cmdline_parse_inst_t cmd_tx_cksum_set = {
.f = cmd_tx_cksum_set_parsed,
.data = NULL,
.help_str = "enable hardware insertion of L3/L4checksum with a given "
-   "mask in packets sent on a port, the bit mapping is given as, Bit 0 for ip"
-   "Bit 1 for UDP, Bit 2 for TCP, Bit 3 for SCTP",
+   "mask in packets sent on a port, the bit mapping is given as, Bit 0 for ip, "
+   "Bit 1 for UDP, Bit 2 for TCP, Bit 3 for SCTP, Bit 4 for inner ip, "
+   "Bit 5 for inner UDP, Bit 6 for inner TCP, Bit 7 for inner SCTP",
.tokens = {
(void *)&cmd_tx_cksum_set_tx_cksum,
(void *)&cmd_tx_cksum_set_set,
diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 2a1b93f..9bc08f4 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -1753,9 +1753,9 @@ tx_cksum_set(portid_t port_id, uint64_t ol_flags)
uint64_t tx_ol_flags;
if (port_id_is_invalid(port_id))
return;
-   /* Clear last 4 bits and then set L3/4 checksum mask again */
-   tx_ol_flags = ports[port_id].tx_ol_flags & (~0x0Full);
-   ports[port_id].tx_ol_flags = ((ol_flags & 0xf) | tx_ol_flags);
+   /* Clear last 8 bits and then set L3/4 checksum mask again */
+   tx_ol_flags = ports[port_id].tx_ol_flags & (~0x0FFull);
+   ports[port_id].tx_ol_flags = ((ol_flags & 0xff) | tx_ol_flags);
 }

 void
diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
index fcc4876..3967476 100644
--- a/app/test-pmd/csumonly.c
+++ b/app/test-pmd/csumonly.c
@@ -209,10 +209,16 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
struct rte_mbuf  *mb;
struct ether_hdr *eth_hdr;
struct ipv4_hdr  *ipv4_hdr;
+   struct ether_hdr *inner_eth_hdr;
+   struct ipv4_hdr  *inner_ipv4_hdr = NULL;
struct ipv6_hdr  *ipv6_hdr;
+   struct ipv6_hdr  *inner_ipv6_hdr = NULL;
struct udp_hdr   *udp_hdr;
+   struct udp_hdr   *inner_udp_hdr;
struct tcp_hdr   *tcp_hdr;
+   struct tcp_hdr   *inner_tcp_hdr;
struct sctp_hdr  *sctp_hdr;
+   struct sctp_hdr  *inner_sctp_hdr;

uint16_t nb_rx;
uint16_t nb_tx;
@@ -221,12 +227,17 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
uint64_t pkt_ol_flags;
uint64_t tx_ol_flags;
uint16_t l4_proto;
+   uint16_t inner_l4_proto = 0;
uint16_t eth_type;
uint8_t  l2_len;
uint8_t  l3_len;
+   uint8_t  inner_l2_len = 0;
+   uint8_t  inner_l3_len = 0;

uint32_t rx_bad_ip_csum;
uint32_t rx_bad_l4_csum;
+   uint8_t  ipv4_tunnel;
+   uint8_t  ipv6_tunnel;

 #ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
uint64_t start_tsc;
@@ -262,7 +273,10 @@ pkt_burst_checksum_forward(struct fwd_stream *fs)
l2_len  = sizeof(struct ether_hdr);
pkt_ol_flags = mb->ol_flags;
ol_flags = (pkt_ol_flags & (~PKT_TX_L4_MASK));
-
+   ipv4_tunnel = (pkt_ol_flags & PKT_RX_TUNNEL_IPV4_HDR) ?
+   1 : 0;
+

[dpdk-dev] [PATCH v8 00/10] Support VxLAN on Fortville

2014-10-27 Thread Liu, Yong
Tested-by: Yong Liu 

- Tested Commit: 455d09e54b92a4626e178b020fe9c23e43ede3f7
- OS: Fedora20 3.15.8-200.fc20.x86_64
- GCC: gcc version 4.8.3 20140624
- CPU: Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz
- NIC: Intel Corporation Ethernet Controller XL710 for 40GbE QSFP+ [8086:1583]
- Default x86_64-native-linuxapp-gcc configuration
- Total 6 cases, 6 passed, 0 failed

- Case: vxlan_ipv4_detect
  Description: Check testpmd can receive and detect vxlan packet 
  Command / instruction:
Start testpmd with vxlan enabled and rss disabled
testpmd -c  -n 4 -- -i --tunnel-type=1 --disable-rss --rxq=4 --txq=4 --nb-cores=8 --nb-ports=2
Enable VxLAN on both ports and UDP dport setting to 4789
testpmd>rx_vxlan_port add 4789 0
testpmd>rx_vxlan_port add 4789 1
Set forward type to rxonly and enable detail log output
testpmd>set fwd rxonly
testpmd>set verbose 1
testpmd>start
Send packets with udp/tcp/sctp inner L4 data
  Expected test result:
testpmd can receive the vxlan packet with different inner L4 data and detect whether the packet is a vxlan packet

- Case: vxlan_ipv6_detect
  Description: Check testpmd can receive and detect ipv6 vxlan packet
  Command / instruction:
Start testpmd with vxlan enabled and rss disabled
testpmd -c  -n 4 -- -i --tunnel-type=1 --disable-rss --rxq=4 --txq=4 --nb-cores=8 --nb-ports=2
Enable VxLAN on both ports and UDP dport setting to 4789
testpmd>rx_vxlan_port add 4789 0
testpmd>rx_vxlan_port add 4789 1
Set forward type to rxonly and enable detail log output
testpmd>set fwd rxonly
testpmd>set verbose 1
testpmd>start
Send vxlan packets with outer IPv6 header and inner IPv6 header.
  Expected test result:
testpmd can receive the vxlan packet with different inner L4 data and detect whether the packet is an IPv6 vxlan packet

- Case: vxlan_ipv4_checksum_offload
  Description: Check testpmd can offload vxlan checksum and forward the packet
  Command / instruction:
Start testpmd with vxlan enabled and rss disabled.
testpmd -c  -n 4 -- -i --tunnel-type=1 --disable-rss --rxq=4 --txq=4 --nb-cores=8 --nb-ports=2
Enable VxLAN on both ports and UDP dport setting to 4789
testpmd>rx_vxlan_port add 4789 0
testpmd>rx_vxlan_port add 4789 1
Set csum packet forwarding mode and enable verbose log.
testpmd>set fwd csum
testpmd>set verbose 1
testpmd>start
Enable outer IP,UDP,TCP,SCTP and inner IP,UDP checksum offload when inner L4 protocol is UDP.
testpmd>tx_checksum set 0 0xf3
Enable outer IP,UDP,TCP,SCTP and inner IP,TCP,SCTP checksum offload when inner L4 protocol is TCP or SCTP.
testpmd>tx_checksum set 0 0xfd
Send IPv4 VxLAN packets with an invalid outer/inner L3 or L4 checksum.
  Expected test result:
testpmd can forward the VxLAN packets with corrected checksums, and the
checksum error counter is incremented.

- Case: vxlan_ipv6_checksum_offload
  Description: Check that testpmd can offload the IPv6 VxLAN checksum and forward
the packet
  Command / instruction:
Start testpmd with vxlan enabled and rss disabled.
testpmd -c  -n 4 -- -i --tunnel-type=1 --disable-rss 
--rxq=4 --txq=4 --nb-cores=8 --nb-ports=2
Enable VxLAN on both ports and set the UDP destination port to 4789
testpmd>rx_vxlan_port add 4789 0
testpmd>rx_vxlan_port add 4789 1
Set csum packet forwarding mode and enable verbose log.
testpmd>set fwd csum
testpmd>set verbose 1
testpmd>start
Enable outer IP, UDP, TCP, SCTP and inner IP, UDP checksum offload when
the inner L4 protocol is UDP.
testpmd>tx_checksum set 0 0xf3
Enable outer IP, UDP, TCP, SCTP and inner IP, TCP, SCTP checksum offload
when the inner L4 protocol is TCP or SCTP.
testpmd>tx_checksum set 0 0xfd
Send IPv6 VxLAN packets with an invalid outer/inner L3 or L4 checksum.
  Expected test result:
testpmd can forward the VxLAN packets with corrected checksums, and the
checksum error counter is incremented.


- Case: tunnel_filter
  Description: Check that the FVL VxLAN tunnel filter function works with testpmd.
  Command / instruction:
Start testpmd with vxlan enabled and rss disabled.
testpmd -c  -n 4 -- -i --tunnel-type=1 --disable-rss 
--rxq=4 --txq=4 --nb-cores=8 --nb-ports=2
Enable VxLAN on both ports and set the UDP destination port to 4789
testpmd>rx_vxlan_port add 4789 0
testpmd>rx_vxlan_port add 4789 1
Set rxonly forwarding mode and enable verbose log.
testpmd>set fwd rxonly
testpmd>set verbose 1

[dpdk-dev] [PATCH v8 00/10] Support VxLAN on Fortville

2014-10-27 Thread Zhang, Helin
Acked-by: Helin Zhang 

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jijiang Liu
> Sent: Monday, October 27, 2014 10:13 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v8 00/10] Support VxLAN on Fortville
> 
> The patch set supports VxLAN on Fortville based on latest rte_mbuf structure.
> 
> It includes:
>  - Support VxLAN packet identification by configuring UDP tunneling port.
>  - Support VxLAN packet filters. It uses MAC and VLAN to point
>to a queue. The filter types supported are listed below:
>1. Inner MAC and Inner VLAN ID
>2. Inner MAC address, inner VLAN ID and tenant ID.
>3. Inner MAC and tenant ID
>4. Inner MAC address
>5. Outer MAC address, tenant ID and inner MAC
>  - Support VxLAN TX checksum offload, which includes outer L3 (IP), inner L3 (IP)
> and inner L4 (UDP, TCP and SCTP)
> 
> Change notes:
> 
>  v8)  * Fix the issue of redundant "PKT_RX" and the comma missing in the
> pkt_rx_flag_names[] in the rxonly.c file.
> 
> Jijiang Liu (10):
>   change rte_mbuf structures
>   add data structures of UDP tunneling
>   add VxLAN packet identification API in librte_ether
>   support VxLAN packet identification in i40e
>   test VxLAN packet identification in testpmd.
>   add data structures of tunneling filter in rte_eth_ctrl.h
>   implement the API of VxLAN packet filter in i40e
>   test VxLAN packet filter
>   support VxLAN Tx checksum offload in i40e
>   test VxLAN Tx checksum offload
> 
> 
>  app/test-pmd/cmdline.c|  228 +-
>  app/test-pmd/config.c |6 +-
>  app/test-pmd/csumonly.c   |  194 --
>  app/test-pmd/rxonly.c |   50 ++-
>  lib/librte_ether/rte_eth_ctrl.h   |   61 +++
>  lib/librte_ether/rte_ethdev.c |   52 ++
>  lib/librte_ether/rte_ethdev.h |   54 ++
>  lib/librte_ether/rte_ether.h  |   13 ++
>  lib/librte_mbuf/rte_mbuf.h|   28 +++-
>  lib/librte_pmd_i40e/i40e_ethdev.c |  331
> -
>  lib/librte_pmd_i40e/i40e_ethdev.h |8 +-
>  lib/librte_pmd_i40e/i40e_rxtx.c   |  151 +++--
>  12 files changed, 1096 insertions(+), 80 deletions(-)
> 
> --
> 1.7.7.6



[dpdk-dev] [PATCH v2 0/5] Support virtio multicast feature

2014-10-27 Thread Ouyang Changchun
 - V1 change:
This patch series supports the multicast feature in virtio and vhost.
The vhost backend enables promiscuous mode and configures
ETH_VMDQ_ACCEPT_BROADCAST and ETH_VMDQ_ACCEPT_MULTICAST in the VMDQ offload
register to receive multicast and broadcast packets.
The virtio frontend provides the functionality of enabling and disabling
multicast and promiscuous mode.

 -V2 change:
Rework the patches based on the new vhost library and the new vhost application.

Changchun Ouyang (5):
  Add RX mode in VMDQ config and set the register PFVML2FLT for IXGBE
PMD; this makes VMDQ accept broadcast and multicast packets.
  Set VM offload register according to VMDQ config for IGB PMD to
support broadcast and multicast packets.
  To let US-vHOST accept and forward broadcast and multicast packets:
Add promiscuous option into command line; set VMDQ RX mode into:
ETH_VMDQ_ACCEPT_BROADCAST|ETH_VMDQ_ACCEPT_MULTICAST.
  Add new API in virtio for supporting promiscuous and allmulticast
enable and disable.
  Specify rx_mode as 0 for 2 other samples: vmdq and vhost-xen.

 examples/vhost/main.c | 25 +++--
 examples/vhost_xen/main.c |  1 +
 examples/vmdq/main.c  |  1 +
 lib/librte_ether/rte_ethdev.h |  1 +
 lib/librte_pmd_e1000/igb_rxtx.c   | 20 +++
 lib/librte_pmd_ixgbe/ixgbe_rxtx.c | 16 ++
 lib/librte_pmd_virtio/virtio_ethdev.c | 98 ++-
 lib/librte_vhost/virtio-net.c |  4 +-
 8 files changed, 161 insertions(+), 5 deletions(-)

-- 
1.8.4.2






[dpdk-dev] [PATCH v2 1/5] ethdev: Add new config field to config VMDQ offload register

2014-10-27 Thread Ouyang Changchun
This patch adds a new rx_mode field to the VMDQ config and sets the PFVML2FLT
register for the IXGBE PMD; this makes VMDQ receive multicast and broadcast
packets.

Signed-off-by: Changchun Ouyang 
---
 lib/librte_ether/rte_ethdev.h |  1 +
 lib/librte_pmd_ixgbe/ixgbe_rxtx.c | 16 
 2 files changed, 17 insertions(+)

diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index b69a6af..5f5a35b 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -577,6 +577,7 @@ struct rte_eth_vmdq_rx_conf {
uint8_t default_pool; /**< The default pool, if applicable */
uint8_t enable_loop_back; /**< Enable VT loop back */
uint8_t nb_pool_maps; /**< We can have up to 64 filters/mappings */
+   uint32_t rx_mode; /**< RX mode for vmdq */
struct {
uint16_t vlan_id; /**< The vlan id of the received frame */
uint64_t pools;   /**< Bitmask of pools for packet rx */
diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c 
b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
index 123b8b3..7c72815 100644
--- a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
+++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
@@ -3121,6 +3121,7 @@ ixgbe_vmdq_rx_hw_configure(struct rte_eth_dev *dev)
struct ixgbe_hw *hw;
enum rte_eth_nb_pools num_pools;
uint32_t mrqc, vt_ctl, vlanctrl;
+   uint32_t vmolr = 0;
int i;

PMD_INIT_FUNC_TRACE();
@@ -3143,6 +3144,21 @@ ixgbe_vmdq_rx_hw_configure(struct rte_eth_dev *dev)

IXGBE_WRITE_REG(hw, IXGBE_VT_CTL, vt_ctl);

+   for (i = 0; i < (int)num_pools; i++) {
+   if (cfg->rx_mode & ETH_VMDQ_ACCEPT_UNTAG)
+   vmolr |= IXGBE_VMOLR_AUPE;
+   if (cfg->rx_mode & ETH_VMDQ_ACCEPT_HASH_MC)
+   vmolr |= IXGBE_VMOLR_ROMPE;
+   if (cfg->rx_mode & ETH_VMDQ_ACCEPT_HASH_UC)
+   vmolr |= IXGBE_VMOLR_ROPE;
+   if (cfg->rx_mode & ETH_VMDQ_ACCEPT_BROADCAST)
+   vmolr |= IXGBE_VMOLR_BAM;
+   if (cfg->rx_mode & ETH_VMDQ_ACCEPT_MULTICAST)
+   vmolr |= IXGBE_VMOLR_MPE;
+
+   IXGBE_WRITE_REG(hw, IXGBE_VMOLR(i), vmolr);
+   }
+
/* VLNCTRL: enable vlan filtering and allow all vlan tags through */
vlanctrl = IXGBE_READ_REG(hw, IXGBE_VLNCTRL);
vlanctrl |= IXGBE_VLNCTRL_VFE ; /* enable vlan filters */
-- 
1.8.4.2



[dpdk-dev] [PATCH v2 2/5] e1000: config VMDQ offload register to receive multicast packet

2014-10-27 Thread Ouyang Changchun
This patch sets the VM offload register according to the VMDQ config for the
e1000 PMD, to support multicast and broadcast packets.

Signed-off-by: Changchun Ouyang 
---
 lib/librte_pmd_e1000/igb_rxtx.c | 20 
 1 file changed, 20 insertions(+)

diff --git a/lib/librte_pmd_e1000/igb_rxtx.c b/lib/librte_pmd_e1000/igb_rxtx.c
index f09c525..0dca7b7 100644
--- a/lib/librte_pmd_e1000/igb_rxtx.c
+++ b/lib/librte_pmd_e1000/igb_rxtx.c
@@ -1779,6 +1779,26 @@ igb_vmdq_rx_hw_configure(struct rte_eth_dev *dev)
vt_ctl |= E1000_VT_CTL_IGNORE_MAC;
E1000_WRITE_REG(hw, E1000_VT_CTL, vt_ctl);

+   for (i = 0; i < E1000_VMOLR_SIZE; i++) {
+   vmolr = E1000_READ_REG(hw, E1000_VMOLR(i));
+   vmolr &= ~(E1000_VMOLR_AUPE | E1000_VMOLR_ROMPE |
+   E1000_VMOLR_ROPE | E1000_VMOLR_BAM |
+   E1000_VMOLR_MPME);
+
+   if (cfg->rx_mode & ETH_VMDQ_ACCEPT_UNTAG)
+   vmolr |= E1000_VMOLR_AUPE;
+   if (cfg->rx_mode & ETH_VMDQ_ACCEPT_HASH_MC)
+   vmolr |= E1000_VMOLR_ROMPE;
+   if (cfg->rx_mode & ETH_VMDQ_ACCEPT_HASH_UC)
+   vmolr |= E1000_VMOLR_ROPE;
+   if (cfg->rx_mode & ETH_VMDQ_ACCEPT_BROADCAST)
+   vmolr |= E1000_VMOLR_BAM;
+   if (cfg->rx_mode & ETH_VMDQ_ACCEPT_MULTICAST)
+   vmolr |= E1000_VMOLR_MPME;
+
+   E1000_WRITE_REG(hw, E1000_VMOLR(i), vmolr);
+   }
+
/*
 * VMOLR: set STRVLAN as 1 if IGMAC in VTCTL is set as 1
 * Both 82576 and 82580 support it
-- 
1.8.4.2



[dpdk-dev] [PATCH v2 3/5] vhost: enable promisc mode and config VMDQ offload register for multicast feature

2014-10-27 Thread Ouyang Changchun
This patch lets vhost receive and forward multicast and broadcast packets:
it adds a promiscuous option to the command line and sets the VMDQ RX mode to
ETH_VMDQ_ACCEPT_BROADCAST|ETH_VMDQ_ACCEPT_MULTICAST if promiscuous mode is on.

Signed-off-by: Changchun Ouyang 
---
 examples/vhost/main.c | 25 ++---
 lib/librte_vhost/virtio-net.c |  4 +++-
 2 files changed, 25 insertions(+), 4 deletions(-)

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index 291128e..c4947f7 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -161,6 +161,9 @@
 /* mask of enabled ports */
 static uint32_t enabled_port_mask = 0;

+/* Ports set in promiscuous mode off by default. */
+static uint32_t promiscuous_on;
+
 /*Number of switching cores enabled*/
 static uint32_t num_switching_cores = 0;

@@ -274,6 +277,7 @@ static struct rte_eth_conf vmdq_conf_default = {
.enable_default_pool = 0,
.default_pool = 0,
.nb_pool_maps = 0,
+   .rx_mode = 0,
.pool_map = {{0, 0},},
},
},
@@ -364,13 +368,15 @@ static inline int
 get_eth_conf(struct rte_eth_conf *eth_conf, uint32_t num_devices)
 {
struct rte_eth_vmdq_rx_conf conf;
+   struct rte_eth_vmdq_rx_conf *def_conf =
+   &vmdq_conf_default.rx_adv_conf.vmdq_rx_conf;
unsigned i;

memset(&conf, 0, sizeof(conf));
conf.nb_queue_pools = (enum rte_eth_nb_pools)num_devices;
conf.nb_pool_maps = num_devices;
-   conf.enable_loop_back =
-   vmdq_conf_default.rx_adv_conf.vmdq_rx_conf.enable_loop_back;
+   conf.enable_loop_back = def_conf->enable_loop_back;
+   conf.rx_mode = def_conf->rx_mode;

for (i = 0; i < conf.nb_pool_maps; i++) {
conf.pool_map[i].vlan_id = vlan_tags[ i ];
@@ -468,6 +474,9 @@ port_init(uint8_t port)
return retval;
}

+   if (promiscuous_on)
+   rte_eth_promiscuous_enable(port);
+
rte_eth_macaddr_get(port, &vmdq_ports_eth_addr[port]);
RTE_LOG(INFO, VHOST_PORT, "Max virtio devices supported: %u\n", 
num_devices);
RTE_LOG(INFO, VHOST_PORT, "Port %u MAC: %02"PRIx8" %02"PRIx8" %02"PRIx8
@@ -598,7 +607,8 @@ us_vhost_parse_args(int argc, char **argv)
};

/* Parse command line */
-   while ((opt = getopt_long(argc, argv, "p:",long_option, &option_index)) 
!= EOF) {
+   while ((opt = getopt_long(argc, argv, "p:P",
+   long_option, &option_index)) != EOF) {
switch (opt) {
/* Portmask */
case 'p':
@@ -610,6 +620,15 @@ us_vhost_parse_args(int argc, char **argv)
}
break;

+   case 'P':
+   promiscuous_on = 1;
+   vmdq_conf_default.rx_adv_conf.vmdq_rx_conf.rx_mode =
+   ETH_VMDQ_ACCEPT_BROADCAST |
+   ETH_VMDQ_ACCEPT_MULTICAST;
+   rte_vhost_feature_enable(1ULL << VIRTIO_NET_F_CTRL_RX);
+
+   break;
+
case 0:
/* Enable/disable vm2vm comms. */
if (!strncmp(long_option[option_index].name, "vm2vm",
diff --git a/lib/librte_vhost/virtio-net.c b/lib/librte_vhost/virtio-net.c
index 27ba175..744156c 100644
--- a/lib/librte_vhost/virtio-net.c
+++ b/lib/librte_vhost/virtio-net.c
@@ -68,7 +68,9 @@ static struct virtio_net_device_ops const *notify_ops;
 static struct virtio_net_config_ll *ll_root;

 /* Features supported by this application. RX merge buffers are enabled by 
default. */
-#define VHOST_SUPPORTED_FEATURES (1ULL << VIRTIO_NET_F_MRG_RXBUF)
+#define VHOST_SUPPORTED_FEATURES ((1ULL << VIRTIO_NET_F_MRG_RXBUF) | \
+   (1ULL << VIRTIO_NET_F_CTRL_RX))
+
 static uint64_t VHOST_FEATURES = VHOST_SUPPORTED_FEATURES;

 /* Line size for reading maps file. */
-- 
1.8.4.2



[dpdk-dev] [PATCH v2 4/5] virtio: New API to enable/disable multicast and promisc mode

2014-10-27 Thread Ouyang Changchun
This patch adds new APIs in virtio to support enabling and disabling
promiscuous and allmulticast mode.

Signed-off-by: Changchun Ouyang 
---
 lib/librte_pmd_virtio/virtio_ethdev.c | 98 ++-
 1 file changed, 97 insertions(+), 1 deletion(-)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c 
b/lib/librte_pmd_virtio/virtio_ethdev.c
index 19930c0..acffa9e 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -66,6 +66,10 @@ static int eth_virtio_dev_init(struct eth_driver *eth_drv,
 static int  virtio_dev_configure(struct rte_eth_dev *dev);
 static int  virtio_dev_start(struct rte_eth_dev *dev);
 static void virtio_dev_stop(struct rte_eth_dev *dev);
+static void virtio_dev_promiscuous_enable(struct rte_eth_dev *dev);
+static void virtio_dev_promiscuous_disable(struct rte_eth_dev *dev);
+static void virtio_dev_allmulticast_enable(struct rte_eth_dev *dev);
+static void virtio_dev_allmulticast_disable(struct rte_eth_dev *dev);
 static void virtio_dev_info_get(struct rte_eth_dev *dev,
struct rte_eth_dev_info *dev_info);
 static int virtio_dev_link_update(struct rte_eth_dev *dev,
@@ -403,6 +407,94 @@ virtio_dev_close(struct rte_eth_dev *dev)
virtio_dev_stop(dev);
 }

+static void
+virtio_dev_promiscuous_enable(struct rte_eth_dev *dev)
+{
+   struct virtio_hw *hw
+   = VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+   struct virtio_pmd_ctrl ctrl;
+   int dlen[1];
+   int ret;
+
+   ctrl.hdr.class = VIRTIO_NET_CTRL_RX;
+   ctrl.hdr.cmd = VIRTIO_NET_CTRL_RX_PROMISC;
+   ctrl.data[0] = 1;
+   dlen[0] = 1;
+
+   ret = virtio_send_command(hw->cvq, &ctrl, dlen, 1);
+
+   if (ret) {
+   PMD_INIT_LOG(ERR, "Promisc enabling but send command "
+ "failed, this is too late now...\n");
+   }
+}
+
+static void
+virtio_dev_promiscuous_disable(struct rte_eth_dev *dev)
+{
+   struct virtio_hw *hw
+   = VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+   struct virtio_pmd_ctrl ctrl;
+   int dlen[1];
+   int ret;
+
+   ctrl.hdr.class = VIRTIO_NET_CTRL_RX;
+   ctrl.hdr.cmd = VIRTIO_NET_CTRL_RX_PROMISC;
+   ctrl.data[0] = 0;
+   dlen[0] = 1;
+
+   ret = virtio_send_command(hw->cvq, &ctrl, dlen, 1);
+
+   if (ret) {
+   PMD_INIT_LOG(ERR, "Promisc disabling but send command "
+ "failed, this is too late now...\n");
+   }
+}
+
+static void
+virtio_dev_allmulticast_enable(struct rte_eth_dev *dev)
+{
+   struct virtio_hw *hw
+   = VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+   struct virtio_pmd_ctrl ctrl;
+   int dlen[1];
+   int ret;
+
+   ctrl.hdr.class = VIRTIO_NET_CTRL_RX;
+   ctrl.hdr.cmd = VIRTIO_NET_CTRL_RX_ALLMULTI;
+   ctrl.data[0] = 1;
+   dlen[0] = 1;
+
+   ret = virtio_send_command(hw->cvq, &ctrl, dlen, 1);
+
+   if (ret) {
+   PMD_INIT_LOG(ERR, "Allmulticast enabling but send command "
+ "failed, this is too late now...\n");
+   }
+}
+
+static void
+virtio_dev_allmulticast_disable(struct rte_eth_dev *dev)
+{
+   struct virtio_hw *hw
+   = VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+   struct virtio_pmd_ctrl ctrl;
+   int dlen[1];
+   int ret;
+
+   ctrl.hdr.class = VIRTIO_NET_CTRL_RX;
+   ctrl.hdr.cmd = VIRTIO_NET_CTRL_RX_ALLMULTI;
+   ctrl.data[0] = 0;
+   dlen[0] = 1;
+
+   ret = virtio_send_command(hw->cvq, &ctrl, dlen, 1);
+
+   if (ret) {
+   PMD_INIT_LOG(ERR, "Allmulticast disabling but send command "
+ "failed, this is too late now...\n");
+   }
+}
+
 /*
  * dev_ops for virtio, bare necessities for basic operation
  */
@@ -411,6 +503,10 @@ static struct eth_dev_ops virtio_eth_dev_ops = {
.dev_start   = virtio_dev_start,
.dev_stop= virtio_dev_stop,
.dev_close   = virtio_dev_close,
+   .promiscuous_enable  = virtio_dev_promiscuous_enable,
+   .promiscuous_disable = virtio_dev_promiscuous_disable,
+   .allmulticast_enable = virtio_dev_allmulticast_enable,
+   .allmulticast_disable= virtio_dev_allmulticast_disable,

.dev_infos_get   = virtio_dev_info_get,
.stats_get   = virtio_dev_stats_get,
@@ -561,7 +657,7 @@ virtio_negotiate_features(struct virtio_hw *hw)
 {
uint32_t host_features, mask;

-   mask = VIRTIO_NET_F_CTRL_RX | VIRTIO_NET_F_CTRL_VLAN;
+   mask = VIRTIO_NET_F_CTRL_VLAN;
mask |= VIRTIO_NET_F_CSUM | VIRTIO_NET_F_GUEST_CSUM;

/* TSO and LRO are only available when their corresponding
-- 
1.8.4.2



[dpdk-dev] [PATCH v2 5/5] examples/vmdq: set default value to rx mode

2014-10-27 Thread Ouyang Changchun
This patch specifies rx_mode as 0 for 2 samples, vmdq and vhost-xen,
because the multicast feature is currently not available in either sample.

Signed-off-by: Changchun Ouyang 
---
 examples/vhost_xen/main.c | 1 +
 examples/vmdq/main.c  | 1 +
 2 files changed, 2 insertions(+)

diff --git a/examples/vhost_xen/main.c b/examples/vhost_xen/main.c
index 0160492..3182733 100644
--- a/examples/vhost_xen/main.c
+++ b/examples/vhost_xen/main.c
@@ -166,6 +166,7 @@ static const struct rte_eth_conf vmdq_conf_default = {
.enable_default_pool = 0,
.default_pool = 0,
.nb_pool_maps = 0,
+   .rx_mode = 0,
.pool_map = {{0, 0},},
},
},
diff --git a/examples/vmdq/main.c b/examples/vmdq/main.c
index c51e2fb..077a20e 100644
--- a/examples/vmdq/main.c
+++ b/examples/vmdq/main.c
@@ -122,6 +122,7 @@ static const struct rte_eth_conf vmdq_conf_default = {
.enable_default_pool = 0,
.default_pool = 0,
.nb_pool_maps = 0,
+   .rx_mode = 0,
.pool_map = {{0, 0},},
},
},
-- 
1.8.4.2



[dpdk-dev] [PATCH v2 0/5] Support virtio multicast feature

2014-10-27 Thread Ouyang, Changchun
Please ignore this duplicated one.
The mail server seems to have some issue; I cancelled the sending the first
two times but it was still sent out. :-( Sorry for that.
Changchun

> -Original Message-
> From: Ouyang, Changchun
> Sent: Monday, October 27, 2014 11:39 AM
> To: dev at dpdk.org
> Cc: Cao, Waterman; Ouyang, Changchun
> Subject: [PATCH v2 0/5] Support virtio multicast feature
> 
>  - V1 change:
> This patch series support multicast feature in virtio and vhost.
> The vhost backend enables the promiscuous mode and config
> ETH_VMDQ_ACCEPT_BROADCAST and ETH_VMDQ_ACCEPT_MULTICAST in
> VMDQ offload register to receive the multicast and broadcast packets.
> The virtio frontend provides the functionality of enabling and disabling the
> multicast and promiscuous mode.
> 
>  -V2 change:
> Rework the patch basing on new vhost library and new vhost application.
> 
> Changchun Ouyang (5):
>   Add RX mode in VMDQ config and set the register PFVML2FLT for IXGBE
> PMD; this makes VMDQ accept broadcast and multicast packets.
>   Set VM offload register according to VMDQ config for IGB PMD to
> support broadcast and multicast packets.
>   To let US-vHOST accept and forward broadcast and multicast packets:
> Add promiscuous option into command line; set VMDQ RX mode into:
> ETH_VMDQ_ACCEPT_BROADCAST|ETH_VMDQ_ACCEPT_MULTICAST.
>   Add new API in virtio for supporting promiscuous and allmulticast
> enable and disable.
>   Specify rx_mode as 0 for 2 other samples: vmdq and vhost-xen.
> 
>  examples/vhost/main.c | 25 +++--
>  examples/vhost_xen/main.c |  1 +
>  examples/vmdq/main.c  |  1 +
>  lib/librte_ether/rte_ethdev.h |  1 +
>  lib/librte_pmd_e1000/igb_rxtx.c   | 20 +++
>  lib/librte_pmd_ixgbe/ixgbe_rxtx.c | 16 ++
>  lib/librte_pmd_virtio/virtio_ethdev.c | 98
> ++-
>  lib/librte_vhost/virtio-net.c |  4 +-
>  8 files changed, 161 insertions(+), 5 deletions(-)
> 
> --
> 1.8.4.2



[dpdk-dev] [PATCH] vhost: Check descriptor number for vector Rx

2014-10-27 Thread Thomas Monjalon
2014-10-25 00:48, Ouyang, Changchun:
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> > 2014-10-24 16:38, Ouyang Changchun:
> > > For zero copy, it needs to check whether the RX descriptor number meets
> > > the minimum requirement when using the vector PMD Rx function, and give
> > > the user more hints if it fails to meet that requirement.
> > [...]
> > > --- a/examples/vhost/main.c
> > > +++ b/examples/vhost/main.c
> > > @@ -131,6 +131,10 @@
> > >  #define RTE_TEST_RX_DESC_DEFAULT_ZCP 32   /* legacy: 32, DPDK virt FE: 
> > > 128. */
> > >  #define RTE_TEST_TX_DESC_DEFAULT_ZCP 64   /* legacy: 64, DPDK virt FE: 
> > > 64.  */
> > >
> > > +#ifdef RTE_IXGBE_INC_VECTOR
> > > +#define VPMD_RX_BURST 32
> > > +#endif
> > > +
> > >  /* Get first 4 bytes in mbuf headroom. */  #define
> > > MBUF_HEADROOM_UINT32(mbuf) (*(uint32_t *)((uint8_t *)(mbuf) \
> > >   + sizeof(struct rte_mbuf)))
> > > @@ -792,6 +796,19 @@ us_vhost_parse_args(int argc, char **argv)
> > >   return -1;
> > >   }
> > >
> > > +#ifdef RTE_IXGBE_INC_VECTOR
> > > + if ((zero_copy == 1) && (num_rx_descriptor <= VPMD_RX_BURST)) {
> > > + RTE_LOG(INFO, VHOST_PORT,
> > > + "The RX desc num: %d is too small for PMD to work\n"
> > > + "properly, please enlarge it to bigger than %d if\n"
> > > + "possible by the option: '--rx-desc-num '\n"
> > > + "One alternative is disabling RTE_IXGBE_INC_VECTOR\n"
> > > + "in config file and rebuild the libraries.\n",
> > > + num_rx_descriptor, VPMD_RX_BURST);
> > > + return -1;
> > > + }
> > > +#endif
> > > +
> > >   return 0;
> > >  }
> > 
> > I feel there is a design problem here.
> > An application shouldn't have to care about the underlying driver.
> 
> For most other applications, the descriptor numbers are set big enough
> (1024 or so),
> so there is no need to check the descriptor number at the early stage of
> running.
> 
> But vhost zero copy (note that vhost one copy also has 1024 descriptors)
> has a default descriptor number of 32.
> Why use 32?
> Because the vhost zero copy implementation (working as backend) needs to
> support DPDK-based apps which use the PMD virtio-net driver,
> and also needs to support Linux legacy virtio-net based applications.
> In the Linux legacy virtio-net case, on one side qemu hard-codes the total
> virtio descriptor size to 256;
> on the other side, legacy virtio uses half of them for virtio headers, so
> only the other half, i.e. 128 descriptors, is available as real buffers.
> 
> In PMD mode, all HW descriptors need their DMA addresses filled in the rx
> initialization stage, otherwise exceptions are likely in the rx process.
> Based on that, we need the really limited virtio buffers to fully fill all
> hw descriptor DMA addresses;
> in other words, the available virtio descriptor size determines the number
> of mbufs and hw descriptors in the zero copy case.
> 
> Tuning found that 32 is the suitable value for vhost zero copy to work
> properly in the legacy Linux virtio case.
> Another factor reducing the value to 32 is that the mempool uses a ring to
> accommodate the mbufs, which costs one entry to flag the ring head/tail,
> and there are some other overheads such as temporary mbufs (RX_BURST of
> them) during rx.
> Note that the descriptor number needs to be a power of 2.
> 
> Why does the change occur at this moment?
> Recently the default rx function was changed to the vector RX function,
> while previously it used non-vector (scalar) mode Rx.
> The vector RX function needs more than 32 descriptors to work properly,
> but scalar mode RX doesn't have this limitation.
> 
> As the RX function is changeable (you can use vector or non-vector mode)
> and the descriptor number can also be changed,
> the vhost app checks whether they match to make sure everything can work
> normally, and gives some hints if they don't match.
> 
> Hope the above makes it a bit clearer. :-)

Thank you for your explanation.
Your fix shows that driver and application are tightly linked.
It's a design issue. As I said:
"An application shouldn't have to care about the underlying driver."
I didn't dig enough in vhost to suggest a good fix but I'm sure
someone could have an idea.

-- 
Thomas


[dpdk-dev] [PATCH] librte_cmdline: FreeBSD Fix oveflow when size of command result structure is greater than BUFSIZ

2014-10-27 Thread Olivier MATZ
Hello Alan,

On 10/20/2014 05:26 PM, Carew, Alan wrote:
> A comment on my own patch.
> 
> Making the size of result_buf consistent across each OS and keeping it as 
> large
> as the Linux BUFSIZ(8192) doesn't really address the core issue.
> 
> In the event that a user of librte_cmdline creates a custom context with a
> result structure > 8192 bytes then this problem will occur again, though 
> somewhat unlikely, as the minimum number of the largest type would be 64 x 
> cmdline_fixed_string_t types within a result structure, at its current size.
> 
> There is no checking of overflow, I would be tempted to add a runtime check in
> cmdline_parse()/match_inst(), however I would be more comfortable with a build
> time check for this type of problem.
> 
> Due to the opaque handling of user defined contexts there is no obvious way to
> do this at build time.
> 
> Thoughts?

Indeed, your patch does not address the core issue of the problem,
although it's already an improvement over the current situation.

Your issue was already fixed in the latest libcmdline library by
this patch (which also includes the replacement of BUFSIZ):
http://git.droids-corp.org/?p=libcmdline.git;a=commitdiff;h=b1d5b169352e57df3fc14c51ffad4b83f3e5613f

I'm pretty sure it won't apply smoothly on the dpdk command line
library but it can probably be adapted. Ideally, the latest libcmdline
library should be [cleaned first and] merged in dpdk.org.

Regards,
Olivier


[dpdk-dev] [PATCH v8 00/10] Support VxLAN on Fortville

2014-10-27 Thread Thomas Monjalon
2014-10-27 02:41, Zhang, Helin:
> > The patch set supports VxLAN on Fortville based on latest rte_mbuf 
> > structure.
> > 
> > It includes:
> >  - Support VxLAN packet identification by configuring UDP tunneling port.
> >  - Support VxLAN packet filters. It uses MAC and VLAN to point
> >to a queue. The filter types supported are listed below:
> >1. Inner MAC and Inner VLAN ID
> >2. Inner MAC address, inner VLAN ID and tenant ID.
> >3. Inner MAC and tenant ID
> >4. Inner MAC address
> >5. Outer MAC address, tenant ID and inner MAC
> >  - Support VxLAN TX checksum offload, which include outer L3(IP), inner 
> > L3(IP)
> > and inner L4(UDP,TCP and SCTP)
> > 
> > Change notes:
> > 
> >  v8)  * Fix the issue of redundant "PKT_RX" and the comma missing in the
> > pkt_rx_flag_names[] in the rxonly.c file.
> > 
> > Jijiang Liu (10):
> >   change rte_mbuf structures
> >   add data structures of UDP tunneling
> >   add VxLAN packet identification API in librte_ether
> >   support VxLAN packet identification in i40e
> >   test VxLAN packet identification in testpmd.
> >   add data structures of tunneling filter in rte_eth_ctrl.h
> >   implement the API of VxLAN packet filter in i40e
> >   test VxLAN packet filter
> >   support VxLAN Tx checksum offload in i40e
> >   test VxLAN Tx checksum offload
> 
> Acked-by: Helin Zhang 

Applied

I fixed logs which had \n despite recent log rework.
I think there is also a wording error: you are writing VxLAN with x lowercase
but standard is writing it all uppercase: VXLAN. Do you agree?

Thanks
-- 
Thomas


[dpdk-dev] [PATCH] vhost: Check descriptor number for vector Rx

2014-10-27 Thread Ouyang, Changchun
Hi Thomas,

> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Monday, October 27, 2014 4:46 PM
> To: Ouyang, Changchun
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] vhost: Check descriptor number for vector
> Rx
> 
> 2014-10-25 00:48, Ouyang, Changchun:
> > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> > > 2014-10-24 16:38, Ouyang Changchun:
> > > > For zero copy, it need check whether RX descriptor num meets the
> > > > least requirement when using vector PMD Rx function, and give user
> > > > more hints if it fails to meet the least requirement.
> > > [...]
> > > > --- a/examples/vhost/main.c
> > > > +++ b/examples/vhost/main.c
> > > > @@ -131,6 +131,10 @@
> > > >  #define RTE_TEST_RX_DESC_DEFAULT_ZCP 32   /* legacy: 32, DPDK virt
> FE: 128. */
> > > >  #define RTE_TEST_TX_DESC_DEFAULT_ZCP 64   /* legacy: 64, DPDK virt
> FE: 64.  */
> > > >
> > > > +#ifdef RTE_IXGBE_INC_VECTOR
> > > > +#define VPMD_RX_BURST 32
> > > > +#endif
> > > > +
> > > >  /* Get first 4 bytes in mbuf headroom. */  #define
> > > > MBUF_HEADROOM_UINT32(mbuf) (*(uint32_t *)((uint8_t *)(mbuf) \
> > > > + sizeof(struct rte_mbuf)))
> > > > @@ -792,6 +796,19 @@ us_vhost_parse_args(int argc, char **argv)
> > > > return -1;
> > > > }
> > > >
> > > > +#ifdef RTE_IXGBE_INC_VECTOR
> > > > +   if ((zero_copy == 1) && (num_rx_descriptor <= VPMD_RX_BURST)) {
> > > > +   RTE_LOG(INFO, VHOST_PORT,
> > > > +   "The RX desc num: %d is too small for PMD to
> work\n"
> > > > +   "properly, please enlarge it to bigger than %d 
> > > > if\n"
> > > > +   "possible by the option: '--rx-desc-num
> '\n"
> > > > +   "One alternative is disabling
> RTE_IXGBE_INC_VECTOR\n"
> > > > +   "in config file and rebuild the libraries.\n",
> > > > +   num_rx_descriptor, VPMD_RX_BURST);
> > > > +   return -1;
> > > > +   }
> > > > +#endif
> > > > +
> > > > return 0;
> > > >  }
> > >
> > > I feel there is a design problem here.
> > > An application shouldn't have to care about the underlying driver.
> >
> > For most other applications, the descriptor numbers are set big enough
> > (1024 or so), so there is no need to check the descriptor
> > number at the early stage of running.
> >
> > But vhost zero copy (note that vhost one copy has a descriptor
> > number of 1024) has a default descriptor number of 32.
> > Why use 32?
> > Because the vhost zero copy implementation (working as a backend) needs to
> > support DPDK-based apps which use the PMD virtio-net driver, and also
> > Linux legacy virtio-net based applications.
> > In the Linux legacy virtio-net case, on one side QEMU hard-codes the total
> > virtio descriptor size to 256; on the other side, legacy virtio uses half of
> > them for virtio headers, and then only the other half, i.e.
> > 128 descriptors, is available to use as real buffers.
> >
> > In PMD mode, all HW descriptors need to have their DMA addresses filled in
> > at the RX init stage, otherwise exceptions will probably occur in the RX process.
> > Based on that, the really limited number of virtio buffers must fully fill
> > all HW descriptor DMA addresses; in other words, the available virtio
> > descriptor count determines the total mbuf count and HW descriptor count
> > in the zero copy case.
> >
> > Tuning found that 32 is the suitable value for vhost zero copy to work
> > properly in the legacy Linux virtio case.
> > Another factor reducing the value to 32 is that the mempool uses a ring to
> > accommodate the mbufs, which costs one entry to flag the ring head/tail, and
> > there are some other overheads like temporary mbufs (sized RX_BURST) during RX.
> > Note that the descriptor number must be a power of 2.
> >
> > Why does the change occur at this moment?
> > Recently the default RX function was changed to the vector RX function,
> > while previously it used non-vector (scalar) mode RX. The vector RX
> > function needs more than 32 descriptors to work properly, but scalar mode RX
> > doesn't have this limitation.
> >
> > As the RX function is changeable (you can use vector or non-vector mode),
> > and the descriptor number can also be changed,
> > the vhost app checks here whether they match, to make sure everything can
> > work normally, and gives some hints if they don't match.
> >
> > Hope the above could make it a bit clearer. :-)
> 
> Thank you for your explanation.
> Your fix shows that driver and application are tightly linked.
> It's a design issue. As I said:
> "An application shouldn't have to care about the underlying driver."
> I didn't dig enough in vhost to suggest a good fix but I'm sure someone could
> have an idea.
>
Agree with you, there is some coupling between the app and the driver, but that's
due to a few things:
1. QEMU hard-codes the total vring size;
2. The PMD driver needs to fill the DMA address of every HW descriptor at RX init.

[dpdk-dev] [PATCH v3 3/8] i40e: support of setting hash lookup table size

2014-10-27 Thread Thomas Monjalon
2014-10-22 19:53, Helin Zhang:
> --- a/lib/librte_ether/rte_ethdev.h
> +++ b/lib/librte_ether/rte_ethdev.h
> @@ -430,6 +430,9 @@ struct rte_eth_rss_conf {
>  /* Definitions used for redirection table entry size */
>  #define ETH_RSS_RETA_NUM_ENTRIES 128
>  #define ETH_RSS_RETA_MAX_QUEUE   16
> +#define ETH_RSS_RETA_SIZE_64  64
> +#define ETH_RSS_RETA_SIZE_128 128
> +#define ETH_RSS_RETA_SIZE_512 512

You didn't answer to my previous comment on this.
I think these definitions are useless. 64 is 64.

-- 
Thomas


[dpdk-dev] [PATCH v3 7/8] ethdev: support of multiple sizes of redirection table

2014-10-27 Thread Thomas Monjalon
2014-10-22 19:53, Helin Zhang:
> +#define RTE_BIT_WIDTH_64 (CHAR_BIT * sizeof(uint64_t))

How can it be different from 64?
Using 64 would be simpler to understand than RTE_BIT_WIDTH_64.

> + uint8_t reta[RTE_BIT_WIDTH_64]; /**< 64 redirection table entries. */

Even your comment is saying that it's 64.

-- 
Thomas


[dpdk-dev] [PATCH v8 00/10] Support VxLAN on Fortville

2014-10-27 Thread Liu, Jijiang


> -Original Message-
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Monday, October 27, 2014 9:46 PM
> To: Liu, Jijiang
> Cc: dev at dpdk.org; Zhang, Helin
> Subject: Re: [dpdk-dev] [PATCH v8 00/10] Support VxLAN on Fortville
> 
> 2014-10-27 02:41, Zhang, Helin:
> > > The patch set supports VxLAN on Fortville based on latest rte_mbuf 
> > > structure.
> > >
> > > It includes:
> > >  - Support VxLAN packet identification by configuring UDP tunneling port.
> > >  - Support VxLAN packet filters. It uses MAC and VLAN to point
> > >to a queue. The filter types supported are listed below:
> > >1. Inner MAC and Inner VLAN ID
> > >2. Inner MAC address, inner VLAN ID and tenant ID.
> > >3. Inner MAC and tenant ID
> > >4. Inner MAC address
> > >5. Outer MAC address, tenant ID and inner MAC
> > >  - Support VxLAN TX checksum offload, which include outer L3(IP),
> > > inner L3(IP) and inner L4(UDP,TCP and SCTP)
> > >
> > > Change notes:
> > >
> > >  v8)  * Fix the issue of redundant "PKT_RX" and the comma missing in
> > > the pkt_rx_flag_names[] in the rxonly.c file.
> > >
> > > Jijiang Liu (10):
> > >   change rte_mbuf structures
> > >   add data structures of UDP tunneling
> > >   add VxLAN packet identification API in librte_ether
> > >   support VxLAN packet identification in i40e
> > >   test VxLAN packet identification in testpmd.
> > >   add data structures of tunneling filter in rte_eth_ctrl.h
> > >   implement the API of VxLAN packet filter in i40e
> > >   test VxLAN packet filter
> > >   support VxLAN Tx checksum offload in i40e
> > >   test VxLAN Tx checksum offload
> >
> > Acked-by: Helin Zhang 
> 
> Applied
> 
> I fixed logs which had \n despite recent log rework.
> I think there is also a wording error: you are writing VxLAN with x lowercase 
> but
> standard is writing it all uppercase: VXLAN. Do you agree?
Virtual eXtensible Local Area Network (VXLAN)

Agree.
> Thanks
> --
> Thomas


[dpdk-dev] [PATCH v8 00/10] Support VxLAN on Fortville

2014-10-27 Thread Thomas Monjalon
2014-10-27 14:34, Liu, Jijiang:
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> > I think there is also a wording error: you are writing VxLAN with x 
> > lowercase but
> > standard is writing it all uppercase: VXLAN. Do you agree?
> Virtual eXtensible Local Area Network (VXLAN)
> 
> Agree.

Fixed

-- 
Thomas


[dpdk-dev] [PATCH v4 00/21] Support flow director programming on Fortville

2014-10-27 Thread Thomas Monjalon
2014-10-22 09:01, Jingjing Wu:
> The patch set supports flow director on fortville.
> It includes:
>  - set up/tear down fortville resources to support flow director, such as 
> queue and vsi.
>  - support operation to add or delete 8 flow types of the flow director 
> filters, they are ipv4, tcpv4, udpv4, sctpv4, ipv6, tcpv6, udpv6, sctpv6.
>  - support flushing flow director table (all filters).
>  - support operation to get flow director information.
>  - match status statistics, FD_ID report.
>  - support operation to configure flexible payload and its mask
>  - support flexible payload involved in comparison and flex bytes report.
> 
> v2 changes:
>  - create real fdir vsi and assign queue 0 pair to it.
>  - check filter status report on the rx queue 0
>  
> v3 changes:
>  - redefine filter APIs to support multi-kind filters
>  - support sctpv4 and sctpv6 type flows
>  - support flexible payload involved in comparison 
>  
> v4 changes:
>  - strip the filter APIs definitions from this patch set
>  - extend mbuf field to support flex bytes report
>  - fix typos

The previous version was acked by Chen Jing D(Mark) and Helin Zhang.
Have they reviewed the v4?
I won't review either the i40e or testpmd parts in detail.
I prefer focusing on the API (mbuf and ethdev) for my review.

-- 
Thomas


[dpdk-dev] [PATCH v6 3/3] ethdev: fix wrong error return refere to API definition

2014-10-27 Thread Ananyev, Konstantin

> From: Liang, Cunming
> Sent: Monday, October 27, 2014 1:20 AM
> To: dev at dpdk.org
> Cc: nhorman at tuxdriver.com; Ananyev, Konstantin; Richardson, Bruce; De Lara 
> Guarch, Pablo; Liang, Cunming
> Subject: [PATCH v6 3/3] ethdev: fix wrong error return refere to API 
> definition
> 
> Per definition, rte_eth_rx_burst/rte_eth_tx_burst/rte_eth_rx_queue_count 
> return the packet number.
> When RTE_LIBRTE_ETHDEV_DEBUG is turned on, the retval of FUNC_PTR_OR_ERR_RET 
> was set to -ENOTSUP, which is confusing.
> The patch always returns 0, whether there are no packets or there is an error.
> 
> Signed-off-by: Cunming Liang 
> ---
>  lib/librte_ether/rte_ethdev.c |6 +++---
>  1 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> index 50f10d9..922a0c6 100644
> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> @@ -2530,7 +2530,7 @@ rte_eth_rx_burst(uint8_t port_id, uint16_t queue_id,
>   return 0;
>   }
>   dev = &rte_eth_devices[port_id];
> - FUNC_PTR_OR_ERR_RET(*dev->rx_pkt_burst, -ENOTSUP);
> + FUNC_PTR_OR_ERR_RET(*dev->rx_pkt_burst, 0);
>   if (queue_id >= dev->data->nb_rx_queues) {
>   PMD_DEBUG_TRACE("Invalid RX queue_id=%d\n", queue_id);
>   return 0;
> @@ -2551,7 +2551,7 @@ rte_eth_tx_burst(uint8_t port_id, uint16_t queue_id,
>   }
>   dev = &rte_eth_devices[port_id];
> 
> - FUNC_PTR_OR_ERR_RET(*dev->tx_pkt_burst, -ENOTSUP);
> + FUNC_PTR_OR_ERR_RET(*dev->tx_pkt_burst, 0);
>   if (queue_id >= dev->data->nb_tx_queues) {
>   PMD_DEBUG_TRACE("Invalid TX queue_id=%d\n", queue_id);
>   return 0;
> @@ -2570,7 +2570,7 @@ rte_eth_rx_queue_count(uint8_t port_id, uint16_t 
> queue_id)
>   return 0;
>   }
>   dev = &rte_eth_devices[port_id];
> - FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_count, -ENOTSUP);
> + FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_count, 0);
>   return (*dev->dev_ops->rx_queue_count)(dev, queue_id);
>  }
> 
> --
> 1.7.4.1

Acked-by: Konstantin Ananyev 



[dpdk-dev] [PATCH v4 04/21] ethdev: define structures for adding/deleting flow director

2014-10-27 Thread Thomas Monjalon
2014-10-22 09:01, Jingjing Wu:
> +/**
> + * A structure used to define the input for IPV4 UDP flow
> + */
> +struct rte_eth_udpv4_flow {
> + uint32_t src_ip;  /**< IPv4 source address to match. */
> + uint32_t dst_ip;  /**< IPv4 destination address to match. */
> + uint16_t src_port;/**< UDP Source port to match. */
> + uint16_t dst_port;/**< UDP Destination port to match. */
> +};
> +
> +/**
> + * A structure used to define the input for IPV4 TCP flow
> + */
> +struct rte_eth_tcpv4_flow {
> + uint32_t src_ip;  /**< IPv4 source address to match. */
> + uint32_t dst_ip;  /**< IPv4 destination address to match. */
> + uint16_t src_port;/**< TCP Source port to match. */
> + uint16_t dst_port;/**< TCP Destination port to match. */
> +};
> +
> +/**
> + * A structure used to define the input for IPV4 SCTP flow
> + */
> +struct rte_eth_sctpv4_flow {
> + uint32_t src_ip;  /**< IPv4 source address to match. */
> + uint32_t dst_ip;  /**< IPv4 destination address to match. */
> + uint32_t verify_tag;  /**< verify tag to match */
> +};
> +
> +/**
> + * A structure used to define the input for IPV4 flow
> + */
> +struct rte_eth_ipv4_flow {
> + uint32_t src_ip;  /**< IPv4 source address to match. */
> + uint32_t dst_ip;  /**< IPv4 destination address to match. */
> +};

Why not define only one structure?
struct rte_eth_ipv4_flow {
uint32_t src_ip;
uint32_t dst_ip;
uint16_t src_port;
uint16_t dst_port;
uint32_t sctp_tag;
};

I think the same structure could be used for many filters (not only
flow director).

> +#define RTE_ETH_FDIR_MAX_FLEXWORD_LEN  8
> +/**
> + * A structure used to contain extend input of flow
> + */
> +struct rte_eth_fdir_flow_ext {
> + uint16_t vlan_tci;
> + uint8_t num_flexwords; /**< number of flexwords */
> + uint16_t flexwords[RTE_ETH_FDIR_MAX_FLEXWORD_LEN];
> + uint16_t dest_id;  /**< destination vsi or pool id*/
> +};

Flexword should be explained.

> +/**
> + * A structure used to define the input for an flow director filter entry

typo: for *a* flow director

> + */
> +struct rte_eth_fdir_input {
> + enum rte_eth_flow_type flow_type;  /**< type of flow */
> + union rte_eth_fdir_flow flow;  /**< specific flow structure */
> + struct rte_eth_fdir_flow_ext flow_ext; /**< specific flow info */
> +};

I don't understand the logic behind flow/flow_ext.
Why is flow_ext not merged into flow?

> +/**
> + * Flow director report status
> + */
> +enum rte_eth_fdir_status {
> + RTE_ETH_FDIR_NO_REPORT_STATUS = 0, /**< no report FDIR. */
> + RTE_ETH_FDIR_REPORT_FD_ID, /**< only report FD ID. */
> + RTE_ETH_FDIR_REPORT_FD_ID_FLEX_4,  /**< report FD ID and 4 flex bytes. 
> */
> + RTE_ETH_FDIR_REPORT_FLEX_8,/**< report 8 flex bytes. */
> +};

The names and explanations are cryptic.
Is FD redundant with FDIR?

> +/**
> + * A structure used to define an action when match FDIR packet filter.
> + */
> +struct rte_eth_fdir_action {
> + uint16_t rx_queue;/**< queue assigned to if fdir match. */
> + uint16_t cnt_idx; /**< statistic counter index */

what is the action of "statistic counter index"?

> + uint8_t  drop;/**< accept or reject */
> + uint8_t  flex_off;/**< offset used define words to report */

still difficult to understand the flex logic

> + enum rte_eth_fdir_status report_status;  /**< status report option */
> +};

> +/**
> + * A structure used to define the flow director filter entry by filter_ctl 
> API
> + * to support RTE_ETH_FILTER_FDIR with RTE_ETH_FILTER_ADD and
> + * RTE_ETH_FILTER_DELETE operations.
> + */
> +struct rte_eth_fdir_filter {
> + uint32_t soft_id;   /**< id */

Should the application handle the id numbering?
Why is it soft_id instead of id?

> + struct rte_eth_fdir_input input;/**< input set */
> + struct rte_eth_fdir_action action;  /**< action taken when match */
> +};

It's really a hard job to define a clear and easy to use API.
It would be really interesting to have more people involved in this discussion.
Thanks
-- 
Thomas


[dpdk-dev] Why do we need iommu=pt?

2014-10-27 Thread Shivapriya Hiremath
Hi Danny,

Your reply was very helpful in understanding the impact. Can you please
tell us whether you saw any performance impact on DPDK when iommu=on?

-Shivapriya

On Wed, Oct 22, 2014 at 8:21 AM, Zhou, Danny  wrote:

> Echoing Cunming: we did not see an obvious performance impact when iommu=pt
> is used, regardless of whether
> igb_uio or VFIO is used.
>
> Alex,
> The map and unmap operation for each egress/ingress packet is done by HW
> rather than SW, so
> the performance impact on DPDK should be minimal in my mind. If it actually
> impacts perf, say on a 100G NIC,
> I am sure it will be resolved in the next generation of Intel silicon. We
> will be performing some performance
> tests with iommu=on to see any performance degradation. I cannot share
> the detailed performance
> result here on the community, but I could tell whether it really brings a
> negative performance impact to DPDK.
> Please stay tuned.
>
> Alex,
>
> > -Original Message-
> > From: Liang, Cunming
> > Sent: Wednesday, October 22, 2014 4:53 PM
> > To: alex; Zhou, Danny
> > Cc: dev at dpdk.org
> > Subject: RE: [dpdk-dev] Why do we need iommu=pt?
> >
> > I think it's a good point to use dma_addr rather than phys_addr.
> > Without an IOMMU, their values are the same.
> > With an IOMMU, the dma_addr value equals the IOVA.
> > That alone is not enough for DPDK to work with the IOMMU not in
> > pass-through mode.
> >
> > We know each IOVA belongs to one IOMMU domain,
> > and each device can attach to one domain.
> > It means the IOVA is coupled to a domain/device.
> >
> > Looking back at the DPDK descriptor ring, it's all right: it is already
> > coupled to a device.
> > But the mbuf mempool is, in most cases, shared by multiple ports.
> > So to keep that model, all those ports/devices need to be put into the
> > same IOMMU domain.
> > The mempool is then attached to a specific domain, not just the device.
> > At that point, the IOMMU domain is no longer transparent to DPDK.
> > VFIO provides the verbs to control domains, but we would still need a
> > library to manage such domains together with mempools.
> >
> > All that overhead just makes DPDK work with the IOMMU in the host, but
> > remember pt always works.
> > The isolation of devices is mainly a security concern.
> > If it's not necessary, pt is definitely a good choice without a
> > performance impact.
> >
> > A self-implemented PMD that uses the kernel DMA interface to set up
> > its mappings appropriately
> > doesn't require "iommu=pt". The default option "iommu=on" also works.
> >
> > > -Original Message-
> > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of alex
> > > Sent: Wednesday, October 22, 2014 3:36 PM
> > > To: Zhou, Danny
> > > Cc: dev at dpdk.org
> > > Subject: Re: [dpdk-dev] Why do we need iommu=pt?
> > >
> > > Shiva.
> > > The cost of disabling iommu=pt when intel_iommu=on is dire: DPDK won't
> > > work, as the RX/TX descriptors will be useless.
> > > Any DMA access by the device will be dropped as no DMA mapping will
> > > exist.
> > >
> > > Danny.
> > > The IOMMU hurts performance in kernel drivers, which perform a map and
> > > unmap
> > > operation for each egress/ingress packet.
> > > The cost of unmapping under strict protection limits a 10Gb+ NIC to
> > > 3Gb
> > > with the CPU maxed out at 100%. DPDK apps shouldn't feel any difference
> > > IFF the
> > > RX descriptors contain IOVAs and not the real physical addresses which
> > > are used
> > > currently.
> > >
> > >
> > > On Tue, Oct 21, 2014 at 10:10 PM, Zhou, Danny 
> wrote:
> > >
> > > > IMHO, whether memory protection with an IOMMU is needed really
> > > > depends on how you use
> > > > and deploy your DPDK based applications. Telco network middle boxes,
> > > > which adopt
> > > > a "closed model" solution to achieve extremely high performance, have
> > > > the entire
> > > > system, including HW and software in kernel and userspace, controlled
> > > > by Telco vendors and
> > > > assumed trustable, so
> > > > memory protection is not so important. By contrast, Datacenters
> > > > generally adopt an "open model"
> > > > solution that allows running user space applications (e.g. tenant
> > > > applications
> > > > and VMs) which could
> > > > directly access the NIC and the DMA engine inside the NIC using a
> > > > modified DPDK PMD;
> > > > these are not trustable,
> > > > as they can potentially DMA to/from arbitrary memory regions using
> > > > physical addresses, so an IOMMU
> > > > is needed to provide strict memory protection, at the cost of a
> > > > negative
> > > > performance impact.
> > > >
> > > > So if you want to seek high performance, disable the IOMMU in the
> > > > BIOS or OS. And
> > > > if security is a major
> > > > concern, turn it on and trade off between performance and security.
> > > > But I
> > > > do NOT think it comes with
> > > > an extremely high performance cost according to our performance
> > > > measurement, though it is probably true
> > > > for a 100G NIC.
> > > >
> > > > > -Original Message-
> > > > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Shivapriya
> Hiremath
> > > > > Sent: Wednesday, October 22, 2014 12:54 AM
> > > 

[dpdk-dev] Why do we need iommu=pt?

2014-10-27 Thread Zhou, Danny
Shivapriya,

It is still ongoing, as we need to run it on different Xeon server
platforms like Sandy Bridge and Ivy Bridge. I will post summary results
here once they are ready.

-Danny

From: Shivapriya Hiremath [mailto:shivpr...@gmail.com]
Sent: Tuesday, October 28, 2014 1:28 AM
To: Zhou, Danny
Cc: dev at dpdk.org
Subject: Re: [dpdk-dev] Why do we need iommu=pt?

Hi Danny,

Your reply was very helpful in understanding the impact. Can you please tell us 
whether you saw any performance impact on DPDK when iommu=on?

-Shivapriya

On Wed, Oct 22, 2014 at 8:21 AM, Zhou, Danny  wrote:
Echoing Cunming: we did not see an obvious performance impact when iommu=pt is 
used, regardless of whether
igb_uio or VFIO is used.

Alex,
The map and unmap operation for each egress/ingress packet is done by HW rather 
than SW, so
the performance impact on DPDK should be minimal in my mind. If it actually 
impacts perf, say on a 100G NIC,
I am sure it will be resolved in the next generation of Intel silicon. We will be 
performing some performance
tests with iommu=on to see any performance degradation. I cannot share the 
detailed performance
result here on the community, but I could tell whether it really brings a negative 
performance impact to DPDK.
Please stay tuned.

Alex,

> -Original Message-
> From: Liang, Cunming
> Sent: Wednesday, October 22, 2014 4:53 PM
> To: alex; Zhou, Danny
> Cc: dev at dpdk.org
> Subject: RE: [dpdk-dev] Why do we need iommu=pt?
>
> I think it's a good point to use dma_addr rather than phys_addr.
> Without iommu, the value of them are the same.
> With iommu, the dma_addr value equal to the iova.
> It's not all for DPDK working with iommu but not pass through.
>
> We know each iova belongs to one iommu domain.
> And each device can attach to one domain.
> It means the iova will have coupling relationship with domain/device.
>
> Looking back to DPDK descriptor ring, it's all right, already coupling with 
> device.
> But if for mbuf mempool, in most cases, it's shared by multiple ports.
> So if keeping the way, all those ports/device need to put into the same iommu 
> domain.
> And the mempool has attach to specific domain, but not just the device.
> On this time, iommu domain no longer be transparent in DPDK.
> Vfio provide the verbs to control domain, we still need library to manager 
> such domain with mempool.
>
> All that overhead just make DPDK works with iommu in host, but remember pt 
> always works.
> The isolation of devices mainly for security concern.
> If it's not necessary, pt definitely is a good choice without performance 
> impact.
>
> For those self-implemented PMD using the DMA kernel interface to set up its 
> mappings appropriately.
> It don't require "iommu=pt". The default option "iommu=on" also works.
>
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] 
> > On Behalf Of alex
> > Sent: Wednesday, October 22, 2014 3:36 PM
> > To: Zhou, Danny
> > Cc: dev at dpdk.org
> > Subject: Re: [dpdk-dev] Why do we need iommu=pt?
> >
> > Shiva.
> > The cost of disabling iommu=pt when intel_iommu=on is dire: DPDK won't work
> > as the RX/TX descriptors will be useless.
> > Any DMA access by the device will be dropped as no DMA mapping will exist.
> >
> > Danny.
> > The IOMMU hurts performance in kernel drivers, which perform a map and unmap
> > operation for each egress/ingress packet.
> > The costs of unmapping when under strict protection limit a +10Gb to 3Gb
> > with cpu maxed out at 100%. DPDK apps shouldn't feel any difference IFF the
> > rx descriptors contain iova and not real physical addresses which are used
> > currently.
> >
> >
> > On Tue, Oct 21, 2014 at 10:10 PM, Zhou, Danny  wrote:
> >
> > > IMHO, if memory protection with IOMMU is needed or not really depends on
> > > how you use
> > > and deploy your DPDK based applications. For Telco network middle boxes,
> > > which adopts
> > > a "close model" solution to achieve extremely high performance, the entire
> > > system including
> > > HW, software in kernel and userspace are controlled by Telco vendors and
> > > assumed trustable, so
> > > memory protection is not so important. While for Datacenters, which
> > > generally adopts a "open model"
> > > solution allows running user space applications(e.g. tenant applications
> > > and VMs) which could
> > > direct access NIC and DMA engine inside the NIC using modified DPDK PMD
> > > are not trustable
> > > as they can potentially DMA to/from arbitrary memory regions using
> > > physical addresses, so IOMMU
> > > is needed to provide strict memory protection, at the cost of negative
> > > performance impact.
> > >
> > > So if you want to seek high performance, disable IOMMU in BIOS or OS. And
> > > if security is a major
> > > concern, turn it on and trade off between performance and security. But I
> > > do NOT think it comes with
> > > an extremely high performance cost according to our performance
> > > measurement, though it is probably true for a 100G NIC.

[dpdk-dev] ethtool and igb/ixgbe (kni)

2014-10-27 Thread Kevin Wilson
Poke!
Can anybody advise on this question?
Kevin

On Fri, Oct 24, 2014 at 12:54 PM, Kevin Wilson  wrote:
> Hi,
>
> I am looking in the file hierarchy of dpdk, and I see that under
> /dpdk-1.7.1/lib/librte_eal/linuxapp/kni/ethtool
> we have:
> igb  ixgbe  README
>
> My question is: why are igb and ixgbe on this path, under ethtool?
> Are they related
> to ethtool in any way?
>
>
> The README does not explain it.
>
> Regards,
> Kevin


[dpdk-dev] [PATCH v3 3/8] i40e: support of setting hash lookup table size

2014-10-27 Thread Matthew Hall
On Mon, Oct 27, 2014 at 03:13:39PM +0100, Thomas Monjalon wrote:
> You didn't answer to my previous comment on this.
> I think these definitions are useless. 64 is 64.

Putting labels on constants gives them meaning as well as a numeric 
value. Not doing so is the antipattern referred to as "magic numbers".

A maintenance programmer or community member will have a difficult time 
figuring out the lost context when grepping through the code.

Matthew.


[dpdk-dev] [PATCH v3 3/8] i40e: support of setting hash lookup table size

2014-10-27 Thread Thomas Monjalon
2014-10-27 13:21, Matthew Hall:
> On Mon, Oct 27, 2014 at 03:13:39PM +0100, Thomas Monjalon wrote:
> > You didn't answer to my previous comment on this.
> > I think these definitions are useless. 64 is 64.
> 
> Putting labels on the constants gives meaning to them as well as a numeric 
> value. Not doing so is an antipattern referred to as "magic numbers" 
> antipattern.

Are you kidding Matthew?
I'm referring to these constants:
> +#define ETH_RSS_RETA_SIZE_64  64
> +#define ETH_RSS_RETA_SIZE_128 128
> +#define ETH_RSS_RETA_SIZE_512 512

A plain RETA_SIZE would have a meaning, but RETA_SIZE_64 does not.
We could also define RETA_SIZE_32 or RETA_SIZE_33...

> A maintainence programmer or community member will have a difficult time 
> figuring out lost context when grepping through the code.
> 
> Matthew.