[dpdk-dev] [PATCH v2 1/4] app/test: unit test for rx and tx cycles/packet

2014-10-24 Thread Liang, Cunming
It's reasonable to me.
I'll make a patch for rte_ethdev.c.

> -Original Message-
> From: Richardson, Bruce
> Sent: Wednesday, October 22, 2014 11:10 PM
> To: Ananyev, Konstantin; Neil Horman; Liang, Cunming
> Cc: dev at dpdk.org
> Subject: RE: [dpdk-dev] [PATCH v2 1/4] app/test: unit test for rx and tx
> cycles/packet
> 
> 
> 
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Ananyev, Konstantin
> > Sent: Wednesday, October 22, 2014 3:53 PM
> > To: Neil Horman; Liang, Cunming
> > Cc: dev at dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v2 1/4] app/test: unit test for rx and tx
> > cycles/packet
> >
> >
> >
> > > -Original Message-
> > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Neil Horman
> > > Sent: Wednesday, October 22, 2014 3:03 PM
> > > To: Liang, Cunming
> > > Cc: dev at dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH v2 1/4] app/test: unit test for rx and tx
> > cycles/packet
> > >
> > > On Tue, Oct 21, 2014 at 01:17:01PM +, Liang, Cunming wrote:
> > > >
> > > >
> > > > > -Original Message-
> > > > > From: Neil Horman [mailto:nhorman at tuxdriver.com]
> > > > > Sent: Tuesday, October 21, 2014 6:33 PM
> > > > > To: Liang, Cunming
> > > > > Cc: dev at dpdk.org
> > > > > Subject: Re: [dpdk-dev] [PATCH v2 1/4] app/test: unit test for rx and 
> > > > > tx
> > > > > cycles/packet
> > > > >
> > > > > On Sun, Oct 12, 2014 at 11:10:39AM +, Liang, Cunming wrote:
> > > > > > Hi Neil,
> > > > > >
> > > > > > > I very much appreciate your comments.
> > > > > > > I have added inline replies and will send v3 as soon as we are aligned.
> > > > > >
> > > > > > BRs,
> > > > > > Liang Cunming
> > > > > >
> > > > > > > -Original Message-
> > > > > > > From: Neil Horman [mailto:nhorman at tuxdriver.com]
> > > > > > > Sent: Saturday, October 11, 2014 1:52 AM
> > > > > > > To: Liang, Cunming
> > > > > > > Cc: dev at dpdk.org
> > > > > > > Subject: Re: [dpdk-dev] [PATCH v2 1/4] app/test: unit test for rx 
> > > > > > > and tx
> > > > > cycles/packet
> > > > > > >
> <...snip...>
> > > > > > >
> > > > > > > > +   printf("Force Stop!\n");
> > > > > > > > +   stop = 1;
> > > > > > > > +   }
> > > > > > > > +   if (signum == SIGUSR2)
> > > > > > > > +   stats_display(0);
> > > > > > > > +}
> > > > > > > > +/* main processing loop */
> > > > > > > > +static int
> > > > > > > > +main_loop(__rte_unused void *args)
> > > > > > > > +{
> > > > > > > > +#define PACKET_SIZE 64
> > > > > > > > +#define FRAME_GAP 12
> > > > > > > > +#define MAC_PREAMBLE 8
> > > > > > > > +   struct rte_mbuf *pkts_burst[MAX_PKT_BURST];
> > > > > > > > +   unsigned lcore_id;
> > > > > > > > +   unsigned i, portid, nb_rx = 0, nb_tx = 0;
> > > > > > > > +   struct lcore_conf *conf;
> > > > > > > > +   uint64_t prev_tsc, cur_tsc;
> > > > > > > > +   int pkt_per_port;
> > > > > > > > +   uint64_t packets_per_second, total_packets;
> > > > > > > > +
> > > > > > > > +   lcore_id = rte_lcore_id();
> > > > > > > > +   conf = &lcore_conf[lcore_id];
> > > > > > > > +   if (conf->status != LCORE_USED)
> > > > > > > > +   return 0;
> > > > > > > > +
> > > > > > > > +   pkt_per_port =  MAX_TRAFIC_BURST / conf->nb_ports;
> > > > > > > > +
> > > > > > > > +   int idx = 0;
> > > > > > > > +   for (i = 0; i < conf->nb_ports; i++) {
> > > > > > > > +   int num = pkt_per_port;
> > > > > > > > +   portid = conf->portlist[i];
> > > > > > > > +   printf("inject %d packet to port %d\n", num, 
> > > > > > > > portid);
> > > > > > > > +   while (num) {
> > > > > > > > +   nb_tx = RTE_MIN(MAX_PKT_BURST, num);
> > > > > > > > +   nb_tx = rte_eth_tx_burst(portid, 0,
> > > > > > > > +   &tx_burst[idx], 
> > > > > > > > nb_tx);
> > > > > > > > +   num -= nb_tx;
> > > > > > > > +   idx += nb_tx;
> > > > > > > > +   }
> > > > > > > > +   }
> > > > > > > > +   printf("Total packets inject to prime ports = %u\n", 
> > > > > > > > idx);
> > > > > > > > +
> > > > > > > > +   packets_per_second = (link_mbps * 1000 * 1000) /
> > > > > > > > +   +((PACKET_SIZE + FRAME_GAP + MAC_PREAMBLE) *
> > CHAR_BIT);
> > > > > > > > +   printf("Each port will do %"PRIu64" packets per 
> > > > > > > > second\n",
> > > > > > > > +   +packets_per_second);
> > > > > > > > +
> > > > > > > > +   total_packets = RTE_TEST_DURATION * conf->nb_ports *
> > > > > > > packets_per_second;
> > > > > > > > +   printf("Test will stop after at least %"PRIu64" packets
> > received\n",
> > > > > > > > +   + total_packets);
> > > > > > > > +
> > > > > > > > +   prev_tsc = rte_rdtsc();
> > > > > > > > +
> > > > > > > > +   while (likely(!stop)) {
> > > > > > > > +   for (i = 0; i < conf->nb_port

[dpdk-dev] EAL : Input/output error on DPDK 1.7.1

2014-10-24 Thread Masaru Oki
Hi,
I got the same result in a VMware Workstation environment.
At least in my environment, the INTX toggle check does not work with the
VMware E1000 Ethernet device.
Please try the attached patch.

2014-10-17 3:04 GMT+09:00 Raghav K :
> Hey,
> I observe continuous burst of I/O Errors, as indicated below, with the 
> testpmd application with DPDK 1.7.1.This seems to originate from 
> eal_intr_process_interrupts() function. I seemed to have setup the DPDK 
> prerequisites alright.
> Another recent post seemed to suggest moving back to 1.7.0, however I would 
> like to persist with 1.7.1.
> Any help/pointers in resolving this would be greatly appreciated.
> Much thanks,Raghav
> root at sys6-vm6:/home/rghv/dpdk/dpdk-1.7.1/x86_64-native-linuxapp-gcc/app# 
> ./testpmd -c 0xf -n3 -- -i --nb-cores=3 --nb-ports=2
> EAL: Error reading from file descriptor 21: Input/output error
> EAL: Error reading from file descriptor 21: Input/output error
> EAL: Error reading from file descriptor 21: Input/output error
> EAL: Error reading from file descriptor 21: Input/output error
> EAL: Error reading from file descriptor 21: Input/output error
> EAL: Error reading from file descriptor 21: Input/output error
> EAL: Error reading from file descriptor 21: Input/output error
> EAL: Error reading from file descriptor 21: Input/output error
> EAL: Error reading from file descriptor 21: Input/output error
> EAL: Error reading from file descriptor 21: Input/output error
> EAL: Error reading from file descriptor 21: Input/output error
> EAL: Error reading from file descriptor 21: Input/output error
> EAL: Error reading from file descriptor 21: Input/output error
> EAL: Error reading from file descriptor 21: Input/output error
> EAL: Error reading from file descriptor 21: Input/output error
> EAL: Error reading from file descriptor 21: Input/output error
> 
> root at sys6-vm6:/home/rghv/dpdk/dpdk-1.7.1# ./tools/dpdk_nic_bind.py --status
> Network devices using DPDK-compatible driver
> :02:01.0 '82545EM Gigabit Ethernet Controller (Copper)' drv=igb_uio unused=e1000
> :02:02.0 '82545EM Gigabit Ethernet Controller (Copper)' drv=igb_uio unused=e1000
>
> Network devices using kernel driver
> :02:00.0 '82545EM Gigabit Ethernet Controller (Copper)' if=eth0 drv=e1000 unused=igb_uio *Active*
> :02:03.0 '82545EM Gigabit Ethernet Controller (Copper)' if=eth3 drv=e1000 unused=igb_uio
> :02:05.0 '82545EM Gigabit Ethernet Controller (Copper)' if=eth4 drv=e1000 unused=igb_uio
> :02:06.0 '82545EM Gigabit Ethernet Controller (Copper)' if=eth5 drv=e1000 unused=igb_uio
>
> Other network devices
-- next part --
diff --git a/lib/librte_eal/linuxapp/igb_uio/igb_uio.c 
b/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
index d1ca26e..c46a00f 100644
--- a/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
+++ b/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
@@ -505,14 +505,11 @@ igbuio_pci_probe(struct pci_dev *dev, const struct 
pci_device_id *id)
}
/* fall back to INTX */
case RTE_INTR_MODE_LEGACY:
-   if (pci_intx_mask_supported(dev)) {
-   dev_dbg(&dev->dev, "using INTX");
-   udev->info.irq_flags = IRQF_SHARED;
-   udev->info.irq = dev->irq;
-   udev->mode = RTE_INTR_MODE_LEGACY;
-   break;
-   }
-   dev_notice(&dev->dev, "PCI INTX mask not supported\n");
+   dev_dbg(&dev->dev, "using INTX");
+   udev->info.irq_flags = IRQF_SHARED;
+   udev->info.irq = dev->irq;
+   udev->mode = RTE_INTR_MODE_LEGACY;
+   break;
/* fall back to no IRQ */
case RTE_INTR_MODE_NONE:
udev->mode = RTE_INTR_MODE_NONE;
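
For readers following the fall-through logic, here is a small user-space model of the mode selection this diff touches. It is only a sketch: the enum, the function name and its parameters are invented for illustration, the case preceding RTE_INTR_MODE_LEGACY is assumed to be MSI-X (the diff context does not show it), and intx_mask_ok stands in for the result of pci_intx_mask_supported().

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the RTE_INTR_MODE_* values in igb_uio.c. */
enum intr_mode { INTR_MODE_NONE, INTR_MODE_LEGACY, INTR_MODE_MSIX };

/*
 * Model of the probe-time selection.
 *   check_intx_mask = true  models the code before the patch,
 *   check_intx_mask = false models the code after the patch.
 */
static enum intr_mode
select_intr_mode(enum intr_mode requested, bool msix_ok,
		 bool intx_mask_ok, bool check_intx_mask)
{
	switch (requested) {
	case INTR_MODE_MSIX:
		if (msix_ok)
			return INTR_MODE_MSIX;
		/* fall through to INTX */
	case INTR_MODE_LEGACY:
		if (!check_intx_mask || intx_mask_ok)
			return INTR_MODE_LEGACY;
		/* fall through to no IRQ */
	case INTR_MODE_NONE:
	default:
		return INTR_MODE_NONE;
	}
}
```

In this model, a device whose INTX mask check fails ends up with no IRQ before the patch and with legacy INTX after it, which is the behavioural change the diff makes.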


[dpdk-dev] [PATCH v4 5/8] test app: adding support for generating variable sized packet bursts

2014-10-24 Thread Liang, Cunming


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Declan Doherty
> Sent: Tuesday, September 30, 2014 5:58 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v4 5/8] test app: adding support for generating
> variable sized packet bursts
> 
> 
> Signed-off-by: Declan Doherty 
> ---
>  app/test/packet_burst_generator.c | 25 -
>  app/test/packet_burst_generator.h |  6 +-
>  app/test/test_link_bonding.c  | 14 +-
>  3 files changed, 22 insertions(+), 23 deletions(-)
> 
> diff --git a/app/test/packet_burst_generator.c
> b/app/test/packet_burst_generator.c
> index 9e747a4..b2824dc 100644
> --- a/app/test/packet_burst_generator.c
> +++ b/app/test/packet_burst_generator.c
> @@ -74,8 +74,7 @@ static inline void
>  copy_buf_to_pkt(void *buf, unsigned len, struct rte_mbuf *pkt, unsigned 
> offset)
>  {
>   if (offset + len <= pkt->data_len) {
> - rte_memcpy(rte_pktmbuf_mtod(pkt, char *) + offset,
> - buf, (size_t) len);
> + rte_memcpy(rte_pktmbuf_mtod(pkt, char *) + offset, buf,
> (size_t) len);
>   return;
>   }
>   copy_buf_to_pkt_segs(buf, len, pkt, offset);
> @@ -191,20 +190,12 @@ initialize_ipv4_header(struct ipv4_hdr *ip_hdr, uint32_t
> src_addr,
>   */
>  #define RTE_MAX_SEGS_PER_PKT 255 /**< pkt.nb_segs is a 8-bit unsigned char.
> */
> 
> -#define TXONLY_DEF_PACKET_LEN 64
> -#define TXONLY_DEF_PACKET_LEN_128 128
> -
> -uint16_t tx_pkt_length = TXONLY_DEF_PACKET_LEN;
> -uint16_t tx_pkt_seg_lengths[RTE_MAX_SEGS_PER_PKT] = {
> - TXONLY_DEF_PACKET_LEN_128,
> -};
> -
> -uint8_t  tx_pkt_nb_segs = 1;
> 
>  int
>  generate_packet_burst(struct rte_mempool *mp, struct rte_mbuf **pkts_burst,
>   struct ether_hdr *eth_hdr, uint8_t vlan_enabled, void *ip_hdr,
> - uint8_t ipv4, struct udp_hdr *udp_hdr, int nb_pkt_per_burst)
> + uint8_t ipv4, struct udp_hdr *udp_hdr, int nb_pkt_per_burst,
> + uint8_t pkt_len, uint8_t nb_pkt_segs)
>  {
>   int i, nb_pkt = 0;
>   size_t eth_hdr_size;
> @@ -221,9 +212,9 @@ nomore_mbuf:
>   break;
>   }
> 
> - pkt->data_len = tx_pkt_seg_lengths[0];
> + pkt->data_len = pkt_len;
>   pkt_seg = pkt;
> - for (i = 1; i < tx_pkt_nb_segs; i++) {
> + for (i = 1; i < nb_pkt_segs; i++) {
>   pkt_seg->next = rte_pktmbuf_alloc(mp);
>   if (pkt_seg->next == NULL) {
>   pkt->nb_segs = i;
> @@ -231,7 +222,7 @@ nomore_mbuf:
>   goto nomore_mbuf;
>   }
>   pkt_seg = pkt_seg->next;
> - pkt_seg->data_len = tx_pkt_seg_lengths[i];
> + pkt_seg->data_len = pkt_len;
>   }
>   pkt_seg->next = NULL; /* Last segment of packet. */
> 
> @@ -259,8 +250,8 @@ nomore_mbuf:
>* Complete first mbuf of packet and append it to the
>* burst of packets to be transmitted.
>*/
> - pkt->nb_segs = tx_pkt_nb_segs;
> - pkt->pkt_len = tx_pkt_length;
> + pkt->nb_segs = nb_pkt_segs;
> + pkt->pkt_len = pkt_len;
>   pkt->l2_len = eth_hdr_size;
> 
>   if (ipv4) {
> diff --git a/app/test/packet_burst_generator.h
> b/app/test/packet_burst_generator.h
> index 5b3cd6c..f86589e 100644
> --- a/app/test/packet_burst_generator.h
> +++ b/app/test/packet_burst_generator.h
> @@ -47,6 +47,9 @@ extern "C" {
>  #define IPV4_ADDR(a, b, c, d)(((a & 0xff) << 24) | ((b & 0xff) << 16) | \
>   ((c & 0xff) << 8) | (d & 0xff))
> 
> +#define PACKET_BURST_GEN_PKT_LEN 60
> +#define PACKET_BURST_GEN_PKT_LEN_128 128
> +
> 
>  void
>  initialize_eth_header(struct ether_hdr *eth_hdr, struct ether_addr *src_mac,
> @@ -68,7 +71,8 @@ initialize_ipv4_header(struct ipv4_hdr *ip_hdr, uint32_t
> src_addr,
>  int
>  generate_packet_burst(struct rte_mempool *mp, struct rte_mbuf **pkts_burst,
>   struct ether_hdr *eth_hdr, uint8_t vlan_enabled, void *ip_hdr,
> - uint8_t ipv4, struct udp_hdr *udp_hdr, int nb_pkt_per_burst);
> + uint8_t ipv4, struct udp_hdr *udp_hdr, int nb_pkt_per_burst,
> + uint8_t pkt_len, uint8_t nb_pkt_segs);
> 
>  #ifdef __cplusplus
>  }
> diff --git a/app/test/test_link_bonding.c b/app/test/test_link_bonding.c
> index 1a847eb..50355a3 100644
> --- a/app/test/test_link_bonding.c
> +++ b/app/test/test_link_bonding.c
> @@ -1338,7 +1338,8 @@ generate_test_burst(struct rte_mbuf **pkts_burst,
> uint16_t burst_size,
>   /* Generate burst of packets to transmit */
>   generated_burst_size = generate_packet_burst(test_params-
> >mbuf_pool,
>   pkts_burst, test_params->pkt_eth_hdr, vlan, ip_hdr,
> ipv4,
> - test_params->pkt_udp_hdr, bu

[dpdk-dev] [PATCH v4 0/3] app/test: unit test to measure cycles per packet

2014-10-24 Thread Cunming Liang
BTW, [1/3] is the same patch as the one below.
http://dpdk.org/dev/patchwork/patch/817

v4 update:
# fix the confusing return values of some rte_ethdev APIs

v3 update:
# Refine the code according to the feedback.
  1. add ether_format_addr to rte_ether.h
  2. fix typo in code comments.
  3. %lu to %PRIu64, fixing a compilation error on 32-bit targets
# merge 2 small incremental patches into the first one.
  The whole unit test is a single patch in [PATCH v3 2/2]
# rebase code to the latest master

v2 update:
Rebase code to the latest master branch.

It provides a unit test to measure cycles/packet in NIC loopback mode.
It simply gives the average I/O cycles used per packet, without test equipment.
When running the test, make sure the link is UP.

Two stream control modes are supported: continuous and burst.
The former keeps forwarding the injected packets until a certain packet
count is reached.
The latter stops when all the injected packets have been received.
In burst mode, two situations are currently measured: with and without
descriptor cache conflict.
By default, it runs in continuous stream mode to measure the whole rxtx path.

Usage Example:
1. Run the unit test app in interactive mode
    app/test -c f -n 4 -- -i
2. Set the stream control mode; the default is continuous
    set_rxtx_sc [continuous|poll_before_xmit|poll_after_xmit]
3. If the continuous stream is chosen, two more options can be configured
  3.1 Choose the rx/tx pair; the default is vector
      set_rxtx_mode [vector|scalar|full|hybrid]
      Note: To get accurate scalar fast results, please choose 'vector' or
      'hybrid' without INC_VEC=y in config
  3.2 Choose the area of measurement; the default is rxtx
      set_rxtx_anchor [rxtx|rxonly|txonly]
4. Run and wait for the result
    pmd_perf_autotest

For those who simply want to see how many cycles each packet costs:
compile DPDK, run 'app/test', and type 'pmd_perf_autotest'. That's it.
Nothing else needs to be configured.
Use the other options once you understand them and want to measure more.
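
As a rough illustration of what "average cycles per packet" means here, the sketch below accumulates the cycles spent around each burst call and divides by the packets handled. All names are hypothetical and this is not the code in test_pmd_perf.c; in DPDK the start/end timestamps would typically come from rte_rdtsc(), but the helpers take them as plain arguments so the arithmetic can be followed on its own.

```c
#include <stdint.h>

/* Hypothetical accumulator for one measurement run. */
struct perf_stats {
	uint64_t cycles;  /* cycles spent inside the measured calls */
	uint64_t packets; /* packets handled by those calls */
};

/* Account one burst call that ran from start_tsc to end_tsc. */
static void
perf_account(struct perf_stats *s, uint64_t start_tsc, uint64_t end_tsc,
	     uint16_t nb_pkts)
{
	s->cycles += end_tsc - start_tsc;
	s->packets += nb_pkts;
}

/* Average cycles per packet; 0 if nothing was handled. */
static uint64_t
perf_cycles_per_packet(const struct perf_stats *s)
{
	return s->packets ? s->cycles / s->packets : 0;
}
```

Measuring only rx or only tx (the set_rxtx_anchor option) would, in this model, amount to taking the timestamps around just one of the two burst calls.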



Cunming Liang (3):
  app/test: allow to create packets in different sizes
  app/test: measure the cost of rx/tx routines by cycle number
  ethdev: fix wrong error return refer to API definition

 app/test/Makefile   |1 +
 app/test/commands.c |  111 +
 app/test/packet_burst_generator.c   |   26 +-
 app/test/packet_burst_generator.h   |   11 +-
 app/test/test.h |6 +
 app/test/test_link_bonding.c|   39 +-
 app/test/test_pmd_perf.c|  922 +++
 lib/librte_ether/rte_ethdev.c   |6 +-
 lib/librte_ether/rte_ether.h|   25 +
 lib/librte_pmd_ixgbe/ixgbe_ethdev.c |6 +
 10 files changed, 1117 insertions(+), 36 deletions(-)
 create mode 100644 app/test/test_pmd_perf.c

-- 
1.7.4.1



[dpdk-dev] [PATCH v4 1/3] app/test: allow to create packets in different sizes

2014-10-24 Thread Cunming Liang
Add support to allow the packet burst generator to create packets of different
sizes.

Signed-off-by: Cunming Liang 
Acked-by: Declan Doherty 
---
 app/test/packet_burst_generator.c |   26 
 app/test/packet_burst_generator.h |   11 +++--
 app/test/test_link_bonding.c  |   39 
 3 files changed, 43 insertions(+), 33 deletions(-)

diff --git a/app/test/packet_burst_generator.c 
b/app/test/packet_burst_generator.c
index 9e747a4..017139b 100644
--- a/app/test/packet_burst_generator.c
+++ b/app/test/packet_burst_generator.c
@@ -191,20 +191,12 @@ initialize_ipv4_header(struct ipv4_hdr *ip_hdr, uint32_t 
src_addr,
  */
 #define RTE_MAX_SEGS_PER_PKT 255 /**< pkt.nb_segs is a 8-bit unsigned char. */

-#define TXONLY_DEF_PACKET_LEN 64
-#define TXONLY_DEF_PACKET_LEN_128 128
-
-uint16_t tx_pkt_length = TXONLY_DEF_PACKET_LEN;
-uint16_t tx_pkt_seg_lengths[RTE_MAX_SEGS_PER_PKT] = {
-   TXONLY_DEF_PACKET_LEN_128,
-};
-
-uint8_t  tx_pkt_nb_segs = 1;
-
 int
 generate_packet_burst(struct rte_mempool *mp, struct rte_mbuf **pkts_burst,
-   struct ether_hdr *eth_hdr, uint8_t vlan_enabled, void *ip_hdr,
-   uint8_t ipv4, struct udp_hdr *udp_hdr, int nb_pkt_per_burst)
+ struct ether_hdr *eth_hdr, uint8_t vlan_enabled,
+ void *ip_hdr, uint8_t ipv4, struct udp_hdr *udp_hdr,
+ int nb_pkt_per_burst, uint8_t pkt_len,
+ uint8_t nb_pkt_segs)
 {
int i, nb_pkt = 0;
size_t eth_hdr_size;
@@ -221,9 +213,9 @@ nomore_mbuf:
break;
}

-   pkt->data_len = tx_pkt_seg_lengths[0];
+   pkt->data_len = pkt_len;
pkt_seg = pkt;
-   for (i = 1; i < tx_pkt_nb_segs; i++) {
+   for (i = 1; i < nb_pkt_segs; i++) {
pkt_seg->next = rte_pktmbuf_alloc(mp);
if (pkt_seg->next == NULL) {
pkt->nb_segs = i;
@@ -231,7 +223,7 @@ nomore_mbuf:
goto nomore_mbuf;
}
pkt_seg = pkt_seg->next;
-   pkt_seg->data_len = tx_pkt_seg_lengths[i];
+   pkt_seg->data_len = pkt_len;
}
pkt_seg->next = NULL; /* Last segment of packet. */

@@ -259,8 +251,8 @@ nomore_mbuf:
 * Complete first mbuf of packet and append it to the
 * burst of packets to be transmitted.
 */
-   pkt->nb_segs = tx_pkt_nb_segs;
-   pkt->pkt_len = tx_pkt_length;
+   pkt->nb_segs = nb_pkt_segs;
+   pkt->pkt_len = pkt_len;
pkt->l2_len = eth_hdr_size;

if (ipv4) {
diff --git a/app/test/packet_burst_generator.h 
b/app/test/packet_burst_generator.h
index 5b3cd6c..fe992ac 100644
--- a/app/test/packet_burst_generator.h
+++ b/app/test/packet_burst_generator.h
@@ -47,10 +47,13 @@ extern "C" {
 #define IPV4_ADDR(a, b, c, d)(((a & 0xff) << 24) | ((b & 0xff) << 16) | \
((c & 0xff) << 8) | (d & 0xff))

+#define PACKET_BURST_GEN_PKT_LEN 60
+#define PACKET_BURST_GEN_PKT_LEN_128 128

 void
 initialize_eth_header(struct ether_hdr *eth_hdr, struct ether_addr *src_mac,
-   struct ether_addr *dst_mac, uint8_t vlan_enabled, uint16_t 
van_id);
+ struct ether_addr *dst_mac, uint8_t vlan_enabled,
+ uint16_t van_id);

 uint16_t
 initialize_udp_header(struct udp_hdr *udp_hdr, uint16_t src_port,
@@ -67,8 +70,10 @@ initialize_ipv4_header(struct ipv4_hdr *ip_hdr, uint32_t 
src_addr,

 int
 generate_packet_burst(struct rte_mempool *mp, struct rte_mbuf **pkts_burst,
-   struct ether_hdr *eth_hdr, uint8_t vlan_enabled, void *ip_hdr,
-   uint8_t ipv4, struct udp_hdr *udp_hdr, int nb_pkt_per_burst);
+ struct ether_hdr *eth_hdr, uint8_t vlan_enabled,
+ void *ip_hdr, uint8_t ipv4, struct udp_hdr *udp_hdr,
+ int nb_pkt_per_burst, uint8_t pkt_len,
+ uint8_t nb_pkt_segs);

 #ifdef __cplusplus
 }
diff --git a/app/test/test_link_bonding.c b/app/test/test_link_bonding.c
index 214d2a2..d407e4f 100644
--- a/app/test/test_link_bonding.c
+++ b/app/test/test_link_bonding.c
@@ -1192,9 +1192,12 @@ generate_test_burst(struct rte_mbuf **pkts_burst, 
uint16_t burst_size,
}

/* Generate burst of packets to transmit */
-   generated_burst_size = generate_packet_burst(test_params->mbuf_pool,
-   pkts_burst, test_params->pkt_eth_hdr, vlan, ip_hdr, 
ipv4,
-   test_params->pkt_udp_hdr, burst_size);
+   generated_burst_size =
+   generate_packet_burst(test_params->mbuf_pool,
+ pkts_burst, test_params->pkt_eth_hdr,
+

[dpdk-dev] [PATCH v4 3/3] ethdev: fix wrong error return refer to API definition

2014-10-24 Thread Cunming Liang
Per definition, rte_eth_rx_burst/rte_eth_tx_burst/rte_eth_rx_queue_count
return the number of packets.
When RTE_LIBRTE_ETHDEV_DEBUG is turned on, the retval of FUNC_PTR_OR_ERR_RET was
set to -ENOTSUP.
This is confusing.
The patch always returns 0, whether there is no packet or there is an error.
It also sets errno in these checks.

Signed-off-by: Cunming Liang 
---
 lib/librte_ether/rte_ethdev.c |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 50f10d9..922a0c6 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -2530,7 +2530,7 @@ rte_eth_rx_burst(uint8_t port_id, uint16_t queue_id,
return 0;
}
dev = &rte_eth_devices[port_id];
-   FUNC_PTR_OR_ERR_RET(*dev->rx_pkt_burst, -ENOTSUP);
+   FUNC_PTR_OR_ERR_RET(*dev->rx_pkt_burst, 0);
if (queue_id >= dev->data->nb_rx_queues) {
PMD_DEBUG_TRACE("Invalid RX queue_id=%d\n", queue_id);
return 0;
@@ -2551,7 +2551,7 @@ rte_eth_tx_burst(uint8_t port_id, uint16_t queue_id,
}
dev = &rte_eth_devices[port_id];

-   FUNC_PTR_OR_ERR_RET(*dev->tx_pkt_burst, -ENOTSUP);
+   FUNC_PTR_OR_ERR_RET(*dev->tx_pkt_burst, 0);
if (queue_id >= dev->data->nb_tx_queues) {
PMD_DEBUG_TRACE("Invalid TX queue_id=%d\n", queue_id);
return 0;
@@ -2570,7 +2570,7 @@ rte_eth_rx_queue_count(uint8_t port_id, uint16_t queue_id)
return 0;
}
dev = &rte_eth_devices[port_id];
-   FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_count, -ENOTSUP);
+   FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_count, 0);
return (*dev->dev_ops->rx_queue_count)(dev, queue_id);
 }

-- 
1.7.4.1
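
To see why a negative errno value is confusing as the result of a packet-count function, consider the sketch below. It assumes a uint16_t return type, as the burst functions use; both function names are hypothetical stand-ins and not the rte_ethdev code.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a burst function that reports "not supported"
 * by returning -ENOTSUP through an unsigned packet count.
 */
static uint16_t
burst_returning_errno(void)
{
	/* The negative value wraps around to a large positive count. */
	return (uint16_t)-ENOTSUP;
}

/*
 * Hypothetical stand-in for the approach described in the commit message:
 * report zero packets and leave the reason in errno.
 */
static uint16_t
burst_returning_zero(void)
{
	errno = ENOTSUP;
	return 0;
}
```

A caller that treats the first result as a packet count sees 65536 - ENOTSUP packets (65441 on Linux, where ENOTSUP is 95) instead of an error, whereas the second variant is indistinguishable from "no packets" unless errno is inspected.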



[dpdk-dev] [PATCH v4 2/3] app/test: measure the cost of rx/tx routines by cycle number

2014-10-24 Thread Cunming Liang
The unit test can be used to measure cycles per packet in different rx/tx
routines.
The NIC works in loopback mode, so no test equipment is required to measure
throughput.
As a result, the unit test shows the average cycles consumed per packet.
When running the test, make sure the link is UP.

Usage Example:
1. Run the unit test app in interactive mode
    app/test -c f -n 4 -- -i
2. Run and wait for the result
    pmd_perf_autotest

There is an option to choose the rx/tx pair; the default is vector.
    set_rxtx_mode [vector|scalar|full|hybrid]
Note: To get accurate scalar fast results, please choose 'vector' or 'hybrid'
without INC_VEC=y in config

It supports measuring rx or tx standalone.
Usage Example:
Choose rx or tx standalone; the default is both
    set_rxtx_anchor [rxtx|rxonly|txonly]

It also supports measuring standalone RX burst cycles.
In this mode, it does not re-send the received packets.
It currently measures two situations: polling before or after xmit (with or
without descriptor cache conflict).
Usage Example:
Set the stream control mode; the default is continuous
    set_rxtx_sc [continuous|poll_before_xmit|poll_after_xmit]

Signed-off-by: Cunming Liang 
Acked-by: Bruce Richardson 
---
 app/test/Makefile   |1 +
 app/test/commands.c |  111 +
 app/test/test.h |6 +
 app/test/test_pmd_perf.c|  922 +++
 lib/librte_ether/rte_ether.h|   25 +
 lib/librte_pmd_ixgbe/ixgbe_ethdev.c |6 +
 6 files changed, 1071 insertions(+), 0 deletions(-)
 create mode 100644 app/test/test_pmd_perf.c

diff --git a/app/test/Makefile b/app/test/Makefile
index 6af6d76..ebfa0ba 100644
--- a/app/test/Makefile
+++ b/app/test/Makefile
@@ -56,6 +56,7 @@ SRCS-y += test_memzone.c

 SRCS-y += test_ring.c
 SRCS-y += test_ring_perf.c
+SRCS-y += test_pmd_perf.c

 ifeq ($(CONFIG_RTE_LIBRTE_TABLE),y)
 SRCS-y += test_table.c
diff --git a/app/test/commands.c b/app/test/commands.c
index a9e36b1..92a17ed 100644
--- a/app/test/commands.c
+++ b/app/test/commands.c
@@ -310,12 +310,123 @@ cmdline_parse_inst_t cmd_quit = {

 //

+struct cmd_set_rxtx_result {
+   cmdline_fixed_string_t set;
+   cmdline_fixed_string_t mode;
+};
+
+static void cmd_set_rxtx_parsed(void *parsed_result, struct cmdline *cl,
+   __attribute__((unused)) void *data)
+{
+   struct cmd_set_rxtx_result *res = parsed_result;
+   if (test_set_rxtx_conf(res->mode) < 0)
+   cmdline_printf(cl, "Cannot find such mode\n");
+}
+
+cmdline_parse_token_string_t cmd_set_rxtx_set =
+   TOKEN_STRING_INITIALIZER(struct cmd_set_rxtx_result, set,
+"set_rxtx_mode");
+
+cmdline_parse_token_string_t cmd_set_rxtx_mode =
+   TOKEN_STRING_INITIALIZER(struct cmd_set_rxtx_result, mode, NULL);
+
+cmdline_parse_inst_t cmd_set_rxtx = {
+   .f = cmd_set_rxtx_parsed,  /* function to call */
+   .data = NULL,  /* 2nd arg of func */
+   .help_str = "set rxtx routine: "
+   "set_rxtx ",
+   .tokens = {/* token list, NULL terminated */
+   (void *)&cmd_set_rxtx_set,
+   (void *)&cmd_set_rxtx_mode,
+   NULL,
+   },
+};
+
+//
+
+struct cmd_set_rxtx_anchor {
+   cmdline_fixed_string_t set;
+   cmdline_fixed_string_t type;
+};
+
+static void
+cmd_set_rxtx_anchor_parsed(void *parsed_result,
+  struct cmdline *cl,
+  __attribute__((unused)) void *data)
+{
+   struct cmd_set_rxtx_anchor *res = parsed_result;
+   if (test_set_rxtx_anchor(res->type) < 0)
+   cmdline_printf(cl, "Cannot find such anchor\n");
+}
+
+cmdline_parse_token_string_t cmd_set_rxtx_anchor_set =
+   TOKEN_STRING_INITIALIZER(struct cmd_set_rxtx_anchor, set,
+"set_rxtx_anchor");
+
+cmdline_parse_token_string_t cmd_set_rxtx_anchor_type =
+   TOKEN_STRING_INITIALIZER(struct cmd_set_rxtx_anchor, type, NULL);
+
+cmdline_parse_inst_t cmd_set_rxtx_anchor = {
+   .f = cmd_set_rxtx_anchor_parsed,  /* function to call */
+   .data = NULL,  /* 2nd arg of func */
+   .help_str = "set rxtx anchor: "
+   "set_rxtx_anchor ",
+   .tokens = {/* token list, NULL terminated */
+   (void *)&cmd_set_rxtx_anchor_set,
+   (void *)&cmd_set_rxtx_anchor_type,
+   NULL,
+   },
+};
+
+//
+
+/* for stream control */
+struct cmd_set_rxtx_sc {
+   cmdline_fixed_string_t set;
+   cmdline_fixed_string_t type;
+};
+
+static void
+cmd_set_rxtx_sc_parsed(void *parsed_result,
+  struct cmdline *cl,
+  __attribute__((unused)) void *data)
+{
+   struct cmd_set_rxtx_sc *res = parsed_result;
+   if (test_set_rxtx_sc(res->type) < 0)
+   cmdline_printf(cl, "Cannot find such stream control\n");
+}
+
+cmdline_parse_toke
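
The command handlers above pass the parsed string to functions such as test_set_rxtx_conf() and report an error when the result is negative. Those functions live in test_pmd_perf.c and are not shown in this excerpt, so the following is only a guess at the general shape of such a name lookup; the table, the variable and the function name are all hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Mode names taken from the usage text: set_rxtx_mode [vector|scalar|full|hybrid] */
static const char *const rxtx_mode_names[] = {
	"vector", "scalar", "full", "hybrid",
};

/* Index into rxtx_mode_names; "vector" is the documented default. */
static int current_rxtx_mode;

/*
 * Hypothetical lookup: select the mode whose name matches.
 * Returns 0 on success and -1 if the name is unknown, which is the
 * convention the "< 0" checks in the command handlers rely on.
 */
static int
set_rxtx_mode_by_name(const char *name)
{
	size_t i;

	if (name == NULL)
		return -1;

	for (i = 0; i < sizeof(rxtx_mode_names) / sizeof(rxtx_mode_names[0]); i++) {
		if (strcmp(name, rxtx_mode_names[i]) == 0) {
			current_rxtx_mode = (int)i;
			return 0;
		}
	}
	return -1; /* leave the current mode unchanged */
}
```

An unknown name leaves the previously selected mode in place, so a mistyped command cannot silently change what is being measured.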

[dpdk-dev] [PATCH v5 0/3] app/test: unit test to measure cycles per packet

2014-10-24 Thread Cunming Liang
BTW, [1/3] is the same patch as the one below.
http://dpdk.org/dev/patchwork/patch/817

v5 update:
# fix the confusing return values of some rte_ethdev APIs

v4 ignore

v3 update:
# Refine the code according to the feedback.
  1. add ether_format_addr to rte_ether.h
  2. fix typo in code comments.
  3. %lu to %PRIu64, fixing a compilation error on 32-bit targets
# merge 2 small incremental patches into the first one.
  The whole unit test is a single patch in [PATCH v3 2/2]
# rebase code to the latest master

v2 update:
Rebase code to the latest master branch.

It provides a unit test to measure cycles/packet in NIC loopback mode.
It simply gives the average I/O cycles used per packet, without test equipment.
When running the test, make sure the link is UP.

Two stream control modes are supported: continuous and burst.
The former keeps forwarding the injected packets until a certain packet
count is reached.
The latter stops when all the injected packets have been received.
In burst mode, two situations are currently measured: with and without
descriptor cache conflict.
By default, it runs in continuous stream mode to measure the whole rxtx path.

Usage Example:
1. Run the unit test app in interactive mode
    app/test -c f -n 4 -- -i
2. Set the stream control mode; the default is continuous
    set_rxtx_sc [continuous|poll_before_xmit|poll_after_xmit]
3. If the continuous stream is chosen, two more options can be configured
  3.1 Choose the rx/tx pair; the default is vector
      set_rxtx_mode [vector|scalar|full|hybrid]
      Note: To get accurate scalar fast results, please choose 'vector' or
      'hybrid' without INC_VEC=y in config
  3.2 Choose the area of measurement; the default is rxtx
      set_rxtx_anchor [rxtx|rxonly|txonly]
4. Run and wait for the result
    pmd_perf_autotest

For those who simply want to see how many cycles each packet costs:
compile DPDK, run 'app/test', and type 'pmd_perf_autotest'. That's it.
Nothing else needs to be configured.
Use the other options once you understand them and want to measure more.



Cunming Liang (3):
  app/test: allow to create packets in different sizes
  app/test: measure the cost of rx/tx routines by cycle number
  ethdev: fix wrong error return refer to API definition

 app/test/Makefile   |1 +
 app/test/commands.c |  111 +
 app/test/packet_burst_generator.c   |   26 +-
 app/test/packet_burst_generator.h   |   11 +-
 app/test/test.h |6 +
 app/test/test_link_bonding.c|   39 +-
 app/test/test_pmd_perf.c|  922 +++
 lib/librte_ether/rte_ethdev.c   |   10 +-
 lib/librte_ether/rte_ether.h|   25 +
 lib/librte_pmd_ixgbe/ixgbe_ethdev.c |6 +
 10 files changed, 1121 insertions(+), 36 deletions(-)
 create mode 100644 app/test/test_pmd_perf.c

-- 
1.7.4.1



[dpdk-dev] [PATCH v5 1/3] app/test: allow to create packets in different sizes

2014-10-24 Thread Cunming Liang
Add support to allow the packet burst generator to create packets of different
sizes.

Signed-off-by: Cunming Liang 
Acked-by: Declan Doherty 
---
 app/test/packet_burst_generator.c |   26 
 app/test/packet_burst_generator.h |   11 +++--
 app/test/test_link_bonding.c  |   39 
 3 files changed, 43 insertions(+), 33 deletions(-)

diff --git a/app/test/packet_burst_generator.c 
b/app/test/packet_burst_generator.c
index 9e747a4..017139b 100644
--- a/app/test/packet_burst_generator.c
+++ b/app/test/packet_burst_generator.c
@@ -191,20 +191,12 @@ initialize_ipv4_header(struct ipv4_hdr *ip_hdr, uint32_t 
src_addr,
  */
 #define RTE_MAX_SEGS_PER_PKT 255 /**< pkt.nb_segs is a 8-bit unsigned char. */

-#define TXONLY_DEF_PACKET_LEN 64
-#define TXONLY_DEF_PACKET_LEN_128 128
-
-uint16_t tx_pkt_length = TXONLY_DEF_PACKET_LEN;
-uint16_t tx_pkt_seg_lengths[RTE_MAX_SEGS_PER_PKT] = {
-   TXONLY_DEF_PACKET_LEN_128,
-};
-
-uint8_t  tx_pkt_nb_segs = 1;
-
 int
 generate_packet_burst(struct rte_mempool *mp, struct rte_mbuf **pkts_burst,
-   struct ether_hdr *eth_hdr, uint8_t vlan_enabled, void *ip_hdr,
-   uint8_t ipv4, struct udp_hdr *udp_hdr, int nb_pkt_per_burst)
+ struct ether_hdr *eth_hdr, uint8_t vlan_enabled,
+ void *ip_hdr, uint8_t ipv4, struct udp_hdr *udp_hdr,
+ int nb_pkt_per_burst, uint8_t pkt_len,
+ uint8_t nb_pkt_segs)
 {
int i, nb_pkt = 0;
size_t eth_hdr_size;
@@ -221,9 +213,9 @@ nomore_mbuf:
break;
}

-   pkt->data_len = tx_pkt_seg_lengths[0];
+   pkt->data_len = pkt_len;
pkt_seg = pkt;
-   for (i = 1; i < tx_pkt_nb_segs; i++) {
+   for (i = 1; i < nb_pkt_segs; i++) {
pkt_seg->next = rte_pktmbuf_alloc(mp);
if (pkt_seg->next == NULL) {
pkt->nb_segs = i;
@@ -231,7 +223,7 @@ nomore_mbuf:
goto nomore_mbuf;
}
pkt_seg = pkt_seg->next;
-   pkt_seg->data_len = tx_pkt_seg_lengths[i];
+   pkt_seg->data_len = pkt_len;
}
pkt_seg->next = NULL; /* Last segment of packet. */

@@ -259,8 +251,8 @@ nomore_mbuf:
 * Complete first mbuf of packet and append it to the
 * burst of packets to be transmitted.
 */
-   pkt->nb_segs = tx_pkt_nb_segs;
-   pkt->pkt_len = tx_pkt_length;
+   pkt->nb_segs = nb_pkt_segs;
+   pkt->pkt_len = pkt_len;
pkt->l2_len = eth_hdr_size;

if (ipv4) {
diff --git a/app/test/packet_burst_generator.h 
b/app/test/packet_burst_generator.h
index 5b3cd6c..fe992ac 100644
--- a/app/test/packet_burst_generator.h
+++ b/app/test/packet_burst_generator.h
@@ -47,10 +47,13 @@ extern "C" {
 #define IPV4_ADDR(a, b, c, d)(((a & 0xff) << 24) | ((b & 0xff) << 16) | \
((c & 0xff) << 8) | (d & 0xff))

+#define PACKET_BURST_GEN_PKT_LEN 60
+#define PACKET_BURST_GEN_PKT_LEN_128 128

 void
 initialize_eth_header(struct ether_hdr *eth_hdr, struct ether_addr *src_mac,
-   struct ether_addr *dst_mac, uint8_t vlan_enabled, uint16_t 
van_id);
+ struct ether_addr *dst_mac, uint8_t vlan_enabled,
+ uint16_t van_id);

 uint16_t
 initialize_udp_header(struct udp_hdr *udp_hdr, uint16_t src_port,
@@ -67,8 +70,10 @@ initialize_ipv4_header(struct ipv4_hdr *ip_hdr, uint32_t 
src_addr,

 int
 generate_packet_burst(struct rte_mempool *mp, struct rte_mbuf **pkts_burst,
-   struct ether_hdr *eth_hdr, uint8_t vlan_enabled, void *ip_hdr,
-   uint8_t ipv4, struct udp_hdr *udp_hdr, int nb_pkt_per_burst);
+ struct ether_hdr *eth_hdr, uint8_t vlan_enabled,
+ void *ip_hdr, uint8_t ipv4, struct udp_hdr *udp_hdr,
+ int nb_pkt_per_burst, uint8_t pkt_len,
+ uint8_t nb_pkt_segs);

 #ifdef __cplusplus
 }
diff --git a/app/test/test_link_bonding.c b/app/test/test_link_bonding.c
index 214d2a2..d407e4f 100644
--- a/app/test/test_link_bonding.c
+++ b/app/test/test_link_bonding.c
@@ -1192,9 +1192,12 @@ generate_test_burst(struct rte_mbuf **pkts_burst, 
uint16_t burst_size,
}

/* Generate burst of packets to transmit */
-   generated_burst_size = generate_packet_burst(test_params->mbuf_pool,
-   pkts_burst, test_params->pkt_eth_hdr, vlan, ip_hdr, 
ipv4,
-   test_params->pkt_udp_hdr, burst_size);
+   generated_burst_size =
+   generate_packet_burst(test_params->mbuf_pool,
+ pkts_burst, test_params->pkt_eth_hdr,
+

[dpdk-dev] [PATCH v5 2/3] app/test: measure the cost of rx/tx routines by cycle number

2014-10-24 Thread Cunming Liang
The unit test can be used to measure cycles per packet in different rx/tx
routines.
The NIC works in loopback mode, so no test equipment is required to measure
throughput.
As a result, the unit test reports the average number of cycles consumed per packet.
When doing the test, make sure the link is UP.
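
For readers who want to see the arithmetic behind "average cycles per packet",
here is a minimal sketch. It is not taken from test_pmd_perf.c: the struct and
function names (cycle_stats, cycle_stats_add, cycle_stats_avg) are hypothetical,
and the start/end values stand in for whatever cycle counter the test reads
around each rx/tx burst call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical accumulator, for illustration only. */
struct cycle_stats {
	uint64_t cycles;   /* cycles spent inside the measured calls */
	uint64_t packets;  /* packets handled by those calls */
};

/* Account for one measured burst: 'start' and 'end' are counter readings
 * taken immediately before and after the burst call. */
void
cycle_stats_add(struct cycle_stats *s, uint64_t start, uint64_t end,
		uint16_t nb_pkts)
{
	assert(s != NULL);
	s->cycles += end - start;
	s->packets += nb_pkts;
}

/* Average cycles per packet; 0 if nothing has been measured yet. */
uint64_t
cycle_stats_avg(const struct cycle_stats *s)
{
	assert(s != NULL);
	return s->packets ? s->cycles / s->packets : 0;
}
```

Accumulating totals and dividing once at the end avoids averaging per-burst
averages, which would over-weight small bursts.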

Usage Example:
1. Run unit test app in interactive mode
app/test -c f -n 4 -- -i
2. Run and wait for the result
pmd_perf_autotest

There is an option to choose the rx/tx pair; the default is vector.
set_rxtx_mode [vector|scalar|full|hybrid]
Note: To get accurate scalar fast results, please choose 'vector' or 'hybrid' without
INC_VEC=y in config

It also supports measuring rx or tx standalone.
Usage Example:
Choose rx or tx standalone, default is both
set_rxtx_anchor [rxtx|rxonly|txonly]

It also supports measuring standalone RX burst cycles.
In this mode, received packets are not repeatedly re-sent.
It currently measures two situations: poll before/after xmit (with or without
desc. cache conflict).
Usage Example:
Set stream control mode, by default is continuous
set_rxtx_sc [continuous|poll_before_xmit|poll_after_xmit]

Signed-off-by: Cunming Liang 
Acked-by: Bruce Richardson 
---
 app/test/Makefile   |1 +
 app/test/commands.c |  111 +
 app/test/test.h |6 +
 app/test/test_pmd_perf.c|  922 +++
 lib/librte_ether/rte_ether.h|   25 +
 lib/librte_pmd_ixgbe/ixgbe_ethdev.c |6 +
 6 files changed, 1071 insertions(+), 0 deletions(-)
 create mode 100644 app/test/test_pmd_perf.c

diff --git a/app/test/Makefile b/app/test/Makefile
index 6af6d76..ebfa0ba 100644
--- a/app/test/Makefile
+++ b/app/test/Makefile
@@ -56,6 +56,7 @@ SRCS-y += test_memzone.c

 SRCS-y += test_ring.c
 SRCS-y += test_ring_perf.c
+SRCS-y += test_pmd_perf.c

 ifeq ($(CONFIG_RTE_LIBRTE_TABLE),y)
 SRCS-y += test_table.c
diff --git a/app/test/commands.c b/app/test/commands.c
index a9e36b1..92a17ed 100644
--- a/app/test/commands.c
+++ b/app/test/commands.c
@@ -310,12 +310,123 @@ cmdline_parse_inst_t cmd_quit = {

 //

+struct cmd_set_rxtx_result {
+   cmdline_fixed_string_t set;
+   cmdline_fixed_string_t mode;
+};
+
+static void cmd_set_rxtx_parsed(void *parsed_result, struct cmdline *cl,
+   __attribute__((unused)) void *data)
+{
+   struct cmd_set_rxtx_result *res = parsed_result;
+   if (test_set_rxtx_conf(res->mode) < 0)
+   cmdline_printf(cl, "Cannot find such mode\n");
+}
+
+cmdline_parse_token_string_t cmd_set_rxtx_set =
+   TOKEN_STRING_INITIALIZER(struct cmd_set_rxtx_result, set,
+"set_rxtx_mode");
+
+cmdline_parse_token_string_t cmd_set_rxtx_mode =
+   TOKEN_STRING_INITIALIZER(struct cmd_set_rxtx_result, mode, NULL);
+
+cmdline_parse_inst_t cmd_set_rxtx = {
+   .f = cmd_set_rxtx_parsed,  /* function to call */
+   .data = NULL,  /* 2nd arg of func */
+   .help_str = "set rxtx routine: "
+   "set_rxtx ",
+   .tokens = {/* token list, NULL terminated */
+   (void *)&cmd_set_rxtx_set,
+   (void *)&cmd_set_rxtx_mode,
+   NULL,
+   },
+};
+
+//
+
+struct cmd_set_rxtx_anchor {
+   cmdline_fixed_string_t set;
+   cmdline_fixed_string_t type;
+};
+
+static void
+cmd_set_rxtx_anchor_parsed(void *parsed_result,
+  struct cmdline *cl,
+  __attribute__((unused)) void *data)
+{
+   struct cmd_set_rxtx_anchor *res = parsed_result;
+   if (test_set_rxtx_anchor(res->type) < 0)
+   cmdline_printf(cl, "Cannot find such anchor\n");
+}
+
+cmdline_parse_token_string_t cmd_set_rxtx_anchor_set =
+   TOKEN_STRING_INITIALIZER(struct cmd_set_rxtx_anchor, set,
+"set_rxtx_anchor");
+
+cmdline_parse_token_string_t cmd_set_rxtx_anchor_type =
+   TOKEN_STRING_INITIALIZER(struct cmd_set_rxtx_anchor, type, NULL);
+
+cmdline_parse_inst_t cmd_set_rxtx_anchor = {
+   .f = cmd_set_rxtx_anchor_parsed,  /* function to call */
+   .data = NULL,  /* 2nd arg of func */
+   .help_str = "set rxtx anchor: "
+   "set_rxtx_anchor ",
+   .tokens = {/* token list, NULL terminated */
+   (void *)&cmd_set_rxtx_anchor_set,
+   (void *)&cmd_set_rxtx_anchor_type,
+   NULL,
+   },
+};
+
+//
+
+/* for stream control */
+struct cmd_set_rxtx_sc {
+   cmdline_fixed_string_t set;
+   cmdline_fixed_string_t type;
+};
+
+static void
+cmd_set_rxtx_sc_parsed(void *parsed_result,
+  struct cmdline *cl,
+  __attribute__((unused)) void *data)
+{
+   struct cmd_set_rxtx_sc *res = parsed_result;
+   if (test_set_rxtx_sc(res->type) < 0)
+   cmdline_printf(cl, "Cannot find such stream control\n");
+}
+
+cmdline_parse_toke

[dpdk-dev] [PATCH v5 3/3] ethdev: fix wrong error return refer to API definition

2014-10-24 Thread Cunming Liang
Per the API definition, rte_eth_rx_burst/rte_eth_tx_burst/rte_eth_rx_queue_count
return the number of packets.
When RTE_LIBRTE_ETHDEV_DEBUG is turned on, the retval of FUNC_PTR_OR_ERR_RET was set to
-ENOTSUP.
This is confusing.
The patch always returns 0, whether there are no packets or there is an error.
Meanwhile, it sets rte_errno in these checks.
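
To make the convention concrete, here is a small standalone sketch of "return 0
packets, report the reason out of band". It does not use the DPDK API: demo_errno
stands in for rte_errno, and demo_burst_fn, demo_stub_burst and demo_rx_burst are
hypothetical names. The negative errno value mirrors what the diff below stores.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical burst callback type, standing in for dev->rx_pkt_burst. */
typedef uint16_t (*demo_burst_fn)(void **pkts, uint16_t nb_pkts);

/* Stand-in for rte_errno (an assumption; not the real per-lcore variable). */
int demo_errno;

/* Trivial callback that pretends every requested packet was received. */
uint16_t
demo_stub_burst(void **pkts, uint16_t nb_pkts)
{
	(void)pkts;
	return nb_pkts;
}

/* The return value is always a packet count. A missing callback yields 0
 * packets and records the reason in demo_errno instead of returning a
 * negative number through an unsigned return type. */
uint16_t
demo_rx_burst(demo_burst_fn fn, void **pkts, uint16_t nb_pkts)
{
	assert(nb_pkts == 0 || pkts != NULL);
	if (fn == NULL) {
		demo_errno = -ENOTSUP;
		return 0;
	}
	return fn(pkts, nb_pkts);
}
```

A caller that needs to tell "no packets" from "not supported" would clear
demo_errno before the call and inspect it only when the call returns 0.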

Signed-off-by: Cunming Liang 
---
 lib/librte_ether/rte_ethdev.c |   10 +++---
 1 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 50f10d9..6675f28 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -81,12 +81,14 @@
 /* Macros for checking for restricting functions to primary instance only */
 #define PROC_PRIMARY_OR_ERR_RET(retval) do { \
if (rte_eal_process_type() != RTE_PROC_PRIMARY) { \
+   rte_errno = -E_RTE_SECONDARY;   \
PMD_DEBUG_TRACE("Cannot run in secondary processes\n"); \
return (retval); \
} \
 } while(0)
 #define PROC_PRIMARY_OR_RET() do { \
if (rte_eal_process_type() != RTE_PROC_PRIMARY) { \
+   rte_errno = -E_RTE_SECONDARY;   \
PMD_DEBUG_TRACE("Cannot run in secondary processes\n"); \
return; \
} \
@@ -95,12 +97,14 @@
 /* Macros to check for invlaid function pointers in dev_ops structure */
 #define FUNC_PTR_OR_ERR_RET(func, retval) do { \
if ((func) == NULL) { \
+   rte_errno = -ENOTSUP; \
PMD_DEBUG_TRACE("Function not supported\n"); \
return (retval); \
} \
 } while(0)
 #define FUNC_PTR_OR_RET(func) do { \
if ((func) == NULL) { \
+   rte_errno = -ENOTSUP; \
PMD_DEBUG_TRACE("Function not supported\n"); \
return; \
} \
@@ -2530,7 +2534,7 @@ rte_eth_rx_burst(uint8_t port_id, uint16_t queue_id,
return 0;
}
dev = &rte_eth_devices[port_id];
-   FUNC_PTR_OR_ERR_RET(*dev->rx_pkt_burst, -ENOTSUP);
+   FUNC_PTR_OR_ERR_RET(*dev->rx_pkt_burst, 0);
if (queue_id >= dev->data->nb_rx_queues) {
PMD_DEBUG_TRACE("Invalid RX queue_id=%d\n", queue_id);
return 0;
@@ -2551,7 +2555,7 @@ rte_eth_tx_burst(uint8_t port_id, uint16_t queue_id,
}
dev = &rte_eth_devices[port_id];

-   FUNC_PTR_OR_ERR_RET(*dev->tx_pkt_burst, -ENOTSUP);
+   FUNC_PTR_OR_ERR_RET(*dev->tx_pkt_burst, 0);
if (queue_id >= dev->data->nb_tx_queues) {
PMD_DEBUG_TRACE("Invalid TX queue_id=%d\n", queue_id);
return 0;
@@ -2570,7 +2574,7 @@ rte_eth_rx_queue_count(uint8_t port_id, uint16_t queue_id)
return 0;
}
dev = &rte_eth_devices[port_id];
-   FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_count, -ENOTSUP);
+   FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_count, 0);
return (*dev->dev_ops->rx_queue_count)(dev, queue_id);
 }

-- 
1.7.4.1



[dpdk-dev] [PATCH v4 0/3] app/test: unit test to measure cycles per packet

2014-10-24 Thread Liang, Cunming
Sorry, just ignore this version.

> -Original Message-
> From: Liang, Cunming
> Sent: Friday, October 24, 2014 1:40 PM
> To: dev at dpdk.org
> Cc: nhorman at tuxdriver.com; Richardson, Bruce; Ananyev, Konstantin; De Lara
> Guarch, Pablo; Liang, Cunming
> Subject: [PATCH v4 0/3] app/test: unit test to measure cycles per packet
> Importance: High
> 
> BTW, [1/3] is the same patch as below one.
> http://dpdk.org/dev/patchwork/patch/817
> 
> v4 update:
> # fix the confusing of retval in some API of rte_ethdev
> 
> v3 update:
> # Codes refine according to the feedback.
>   1. add ether_format_addr to rte_ether.h
>   2. fix typo in code comments.
>   3. %lu to %PRIu64, fixing 32-bit targets compilation err
> # merge 2 small incremental patches to the first one.
>   The whole unit test as a single patch in [PATCH v3 2/2]
> # rebase code to the latest master
> 
> v2 update:
> Rebase code to the latest master branch.
> 
> It provides unit test to measure cycles/packet in NIC loopback mode.
> It simply gives the average cycles of IO used per packet without test 
> equipment.
> When doing the test, make sure the link is UP.
> 
> There's two stream control mode support, one is continues, another is burst.
> The former continues to forward the injected packets until reaching a certain
> amount of number.
> The latter one stop when all the injected packets are received.
> In burst stream, now measure two situations, with or without desc. cache 
> conflict.
> By default, it runs in continues stream mode to measure the whole rxtx.
> 
> Usage Example:
> 1. Run unit test app in interactive mode
> app/test -c f -n 4 -- -i
> 2. Set stream control mode, by default is continuous
> set_rxtx_sc [continuous|poll_before_xmit|poll_after_xmit]
> 3. If choose continuous stream, there are another two options can configure
> 3.1 choose rx/tx pair, default is vector
> set_rxtx_mode [vector|scalar|full|hybrid]
> Note: To get acurate scalar fast, plz choose 'vector' or 'hybrid' 
> without
> INC_VEC=y in config
> 3.2 choose the area of masurement, default is rxtx
> set_rxtx_anchor [rxtx|rxonly|txonly]
> 4. Run and wait for the result
> pmd_perf_autotest
> 
> For who simply just want to see how much cycles cost per packet.
> Compile DPDK, Run 'app/test', and type 'pmd_perf_autotest', that's it.
> Nothing else needs to configure.
> Using other options when you understand and what to measures more.
> 
> 
> *** BLURB HERE ***
> 
> Cunming Liang (3):
>   app/test: allow to create packets in different sizes
>   app/test: measure the cost of rx/tx routines by cycle number
>   ethdev: fix wrong error return refer to API definition
> 
>  app/test/Makefile   |1 +
>  app/test/commands.c |  111 +
>  app/test/packet_burst_generator.c   |   26 +-
>  app/test/packet_burst_generator.h   |   11 +-
>  app/test/test.h |6 +
>  app/test/test_link_bonding.c|   39 +-
>  app/test/test_pmd_perf.c|  922
> +++
>  lib/librte_ether/rte_ethdev.c   |6 +-
>  lib/librte_ether/rte_ether.h|   25 +
>  lib/librte_pmd_ixgbe/ixgbe_ethdev.c |6 +
>  10 files changed, 1117 insertions(+), 36 deletions(-)
>  create mode 100644 app/test/test_pmd_perf.c
> 
> --
> 1.7.4.1



[dpdk-dev] [PATCH] eal: replace strict_strtoul with kstrtoul

2014-10-24 Thread Jincheng Miao
As of upstream kernel commit 3db2e9cd, the strict_strto* family of functions
has been removed, so we should use kstrtoul directly instead.
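
As a rough illustration of the parsing contract these sysfs store callbacks rely
on (0 on success, -EINVAL on malformed input, -ERANGE on overflow, one optional
trailing newline), here is a userspace approximation built on strtoul. The name
strict_parse_ulong is hypothetical and this is not the kernel implementation;
it only sketches the behaviour as I understand it.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Userspace approximation of a strict unsigned-long parser.
 * Unlike plain strtoul(), it rejects leading whitespace, a minus sign,
 * and any trailing characters other than a single newline. */
int
strict_parse_ulong(const char *s, unsigned int base, unsigned long *res)
{
	char *end;
	unsigned long val;

	assert(s != NULL && res != NULL);

	/* strtoul() would silently accept these; a strict parser must not. */
	if (isspace((unsigned char)s[0]) || s[0] == '-')
		return -EINVAL;

	errno = 0;
	val = strtoul(s, &end, (int)base);
	if (end == s)
		return -EINVAL;      /* no digits consumed */
	if (errno == ERANGE)
		return -ERANGE;      /* value does not fit */
	if (*end == '\n')
		end++;               /* tolerate one trailing newline */
	if (*end != '\0')
		return -EINVAL;      /* trailing garbage */

	*res = val;
	return 0;
}
```

The trailing-newline allowance matters for sysfs attributes, since
"echo 2048 > file" delivers "2048\n" to the store callback.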

Signed-off-by: Jincheng Miao 
---
 lib/librte_eal/linuxapp/igb_uio/igb_uio.c   | 4 ++--
 lib/librte_eal/linuxapp/kni/kni_vhost.c | 2 +-
 lib/librte_eal/linuxapp/xen_dom0/dom0_mm_misc.c | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/lib/librte_eal/linuxapp/igb_uio/igb_uio.c 
b/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
index d1ca26e..47ff2f3 100644
--- a/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
+++ b/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
@@ -83,7 +83,7 @@ store_max_vfs(struct device *dev, struct device_attribute 
*attr,
unsigned long max_vfs;
struct pci_dev *pdev = container_of(dev, struct pci_dev, dev);

-   if (0 != strict_strtoul(buf, 0, &max_vfs))
+   if (0 != kstrtoul(buf, 0, &max_vfs))
return -EINVAL;

if (0 == max_vfs)
@@ -174,7 +174,7 @@ store_max_read_request_size(struct device *dev,
unsigned long size = 0;
int ret;

-   if (strict_strtoul(buf, 0, &size) != 0)
+   if (0 != kstrtoul(buf, 0, &size))
return -EINVAL;

ret = pcie_set_readrq(pci_dev, (int)size);
diff --git a/lib/librte_eal/linuxapp/kni/kni_vhost.c 
b/lib/librte_eal/linuxapp/kni/kni_vhost.c
index fe512c2..ba0c1ac 100644
--- a/lib/librte_eal/linuxapp/kni/kni_vhost.c
+++ b/lib/librte_eal/linuxapp/kni/kni_vhost.c
@@ -739,7 +739,7 @@ set_sock_en(struct device *dev, struct device_attribute 
*attr,
unsigned long en;
int err = 0;

-   if (0 != strict_strtoul(buf, 0, &en))
+   if (0 != kstrtoul(buf, 0, &en))
return -EINVAL;

if (en)
diff --git a/lib/librte_eal/linuxapp/xen_dom0/dom0_mm_misc.c 
b/lib/librte_eal/linuxapp/xen_dom0/dom0_mm_misc.c
index dfb271d..8a3727d 100644
--- a/lib/librte_eal/linuxapp/xen_dom0/dom0_mm_misc.c
+++ b/lib/librte_eal/linuxapp/xen_dom0/dom0_mm_misc.c
@@ -123,7 +123,7 @@ store_memsize(struct device *dev, struct device_attribute 
*attr,
int err = 0;
unsigned long mem_size;

-   if (0 != strict_strtoul(buf, 0, &mem_size))
+   if (0 != kstrtoul(buf, 0, &mem_size))
return  -EINVAL;

mutex_lock(&dom0_dev.data_lock);
-- 
1.9.3



[dpdk-dev] [PATCH] doc: fix a typo

2014-10-24 Thread Jincheng Miao
Signed-off-by: Jincheng Miao 
---
 doc/guides/linux_gsg/sys_reqs.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/guides/linux_gsg/sys_reqs.rst 
b/doc/guides/linux_gsg/sys_reqs.rst
index 6a03f54..c14411e 100755
--- a/doc/guides/linux_gsg/sys_reqs.rst
+++ b/doc/guides/linux_gsg/sys_reqs.rst
@@ -267,7 +267,7 @@ Use the following command (assuming that 2048 MB is 
required):

 .. code-block:: console

-echo 2048 /sys/kernel/mm/dom0-mm/memsize-mB/memsize
+echo 2048 > /sys/kernel/mm/dom0-mm/memsize-mB/memsize

 The user can also check how much memory has already been used:

-- 
1.9.3



[dpdk-dev] [PATCH] kni: fix building on Ubuntu-hybrids

2014-10-24 Thread Thomas Monjalon
2014-10-23 16:39, Alexander Guy:
> In the case where a userspace reports itself as Ubuntu, but the
> kernel isn't providing the expected version signature interface,
> turn off Ubuntu specializations.
> 
> This situation happens often enough in development environments,
> and with multi-distribution build servers (e.g. chroot, containers).
[...]
> -ifeq ($(shell lsb_release -si 2>/dev/null),Ubuntu)
> +ifeq ($(shell test -f /proc/version_signature && lsb_release -si 
> 2>/dev/null),Ubuntu)

Could you please explain what the file /proc/version_signature is and why
it can be used as a check for an Ubuntu kernel?

Thanks
-- 
Thomas


[dpdk-dev] [PATCH v2 0/4] support VF MAC filter on Fortville

2014-10-24 Thread Jijiang Liu
The patch set enhances the configurability of the MAC filter and supports VF MAC
filtering on Fortville.

It mainly includes:
 - The following filter types are configurable:
   1. Perfect match of MAC address
   2. Perfect match of MAC address and VLAN ID
   3. Hash match of MAC address
   4. Hash match of MAC address and perfect match of VLAN ID
 - Support perfect and hash match of unicast and multicast MAC addresses for VFs
on i40e
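
The four filter types above can be pictured as a small name-to-enum table. The
sketch below is standalone and illustrative: the DEMO_* enum mirrors the values
of enum rte_mac_filter_type from patch 1/4, the strings are the keywords used by
the testpmd command in patch 4/4, and demo_parse_filter_type is a hypothetical
helper, not part of the series.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Mirrors the four filter types listed above (illustrative copy). */
enum demo_mac_filter_type {
	DEMO_MAC_PERFECT_MATCH = 1,   /* exact match of MAC addr */
	DEMO_MACVLAN_PERFECT_MATCH,   /* exact match of MAC addr and VLAN ID */
	DEMO_MAC_HASH_MATCH,          /* hash match of MAC addr */
	DEMO_MACVLAN_HASH_MATCH,      /* hash MAC addr, exact VLAN ID */
};

/* Map a command-line keyword to a filter type.
 * Returns 0 on success, -1 if the keyword is not recognised. */
int
demo_parse_filter_type(const char *name, enum demo_mac_filter_type *out)
{
	static const struct {
		const char *name;
		enum demo_mac_filter_type type;
	} map[] = {
		{ "exact-mac",      DEMO_MAC_PERFECT_MATCH },
		{ "exact-mac-vlan", DEMO_MACVLAN_PERFECT_MATCH },
		{ "hashmac",        DEMO_MAC_HASH_MATCH },
		{ "hashmac-vlan",   DEMO_MACVLAN_HASH_MATCH },
	};
	size_t i;

	assert(name != NULL && out != NULL);
	for (i = 0; i < sizeof(map) / sizeof(map[0]); i++) {
		if (strcmp(name, map[i].name) == 0) {
			*out = map[i].type;
			return 0;
		}
	}
	return -1;
}
```

Returning an explicit error for an unknown keyword avoids passing a zeroed,
undefined filter type down to the driver.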

 v2 updates:
  * Integrate the v1 patch set into the new filter framework.
  * Optimize MAC filter data structures in rte_eth_ctrl.h file.

jijiangl (4):
  Expand data structures of MAC filter in rte_eth_ctrl.h file.
  Expand MAC filter implementation in i40e.
  Support VF MAC filter in i40e.
  Test VF MAC filter in testpmd 

 app/test-pmd/cmdline.c|  119 +++-
 lib/librte_ether/rte_eth_ctrl.h   |   23 +++
 lib/librte_pmd_i40e/i40e_ethdev.c |  283 -
 lib/librte_pmd_i40e/i40e_ethdev.h |   18 ++-
 lib/librte_pmd_i40e/i40e_pf.c |7 +-
 5 files changed, 404 insertions(+), 46 deletions(-)

-- 
1.7.7.6



[dpdk-dev] [PATCH v2 1/4] librte_ether:extend MAC filter data structures

2014-10-24 Thread Jijiang Liu
Add the data definitions for the MAC filter enhancement in the rte_eth_ctrl.h file.

Signed-off-by: Jijiang Liu 
---
 lib/librte_ether/rte_eth_ctrl.h |   23 +++
 1 files changed, 23 insertions(+), 0 deletions(-)

diff --git a/lib/librte_ether/rte_eth_ctrl.h b/lib/librte_ether/rte_eth_ctrl.h
index df21ac6..699ed2e 100644
--- a/lib/librte_ether/rte_eth_ctrl.h
+++ b/lib/librte_ether/rte_eth_ctrl.h
@@ -51,6 +51,7 @@ extern "C" {
  */
 enum rte_filter_type {
RTE_ETH_FILTER_NONE = 0,
+   RTE_ETH_FILTER_MACVLAN,
RTE_ETH_FILTER_MAX
 };

@@ -71,6 +72,28 @@ enum rte_filter_op {
RTE_ETH_FILTER_OP_MAX
 };

+/**
+ * MAC filter type
+ */
+enum rte_mac_filter_type {
+   RTE_MAC_PERFECT_MATCH = 1, /**< exact match of MAC addr. */
+   RTE_MACVLAN_PERFECT_MATCH,
+   /**< exact match of MAC addr and VLAN ID. */
+   RTE_MAC_HASH_MATCH, /**< hash match of MAC addr. */
+   RTE_MACVLAN_HASH_MATCH,
+   /**< hash match of MAC addr and exact match of VLAN ID. */
+};
+
+/**
+ * MAC filter info
+ */
+struct rte_eth_mac_filter {
+   uint8_t is_vf; /**< 1 for VF, 0 for port dev */
+   uint16_t dst_id; /**

[dpdk-dev] [PATCH v2 2/4] i40e:expand MAC filter implementation in i40e

2014-10-24 Thread Jijiang Liu
This patch mainly optimizes the i40e_add_macvlan_filters() and
i40e_remove_macvlan_filters() functions so that the filter type can be
configured. Other related MAC filter code is changed to use the new
data structures.

Signed-off-by: Jijiang Liu 
---
 lib/librte_pmd_i40e/i40e_ethdev.c |  165 
 lib/librte_pmd_i40e/i40e_ethdev.h |   18 -
 lib/librte_pmd_i40e/i40e_pf.c |7 ++-
 3 files changed, 149 insertions(+), 41 deletions(-)

diff --git a/lib/librte_pmd_i40e/i40e_ethdev.c 
b/lib/librte_pmd_i40e/i40e_ethdev.c
index 3b75f0f..5fae0e1 100644
--- a/lib/librte_pmd_i40e/i40e_ethdev.c
+++ b/lib/librte_pmd_i40e/i40e_ethdev.c
@@ -1529,6 +1529,7 @@ i40e_macaddr_add(struct rte_eth_dev *dev,
 {
struct i40e_pf *pf = I40E_DEV_PRIVATE_TO_PF(dev->data->dev_private);
struct i40e_hw *hw = I40E_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+   struct i40e_mac_filter_info mac_filter;
struct i40e_vsi *vsi = pf->main_vsi;
struct ether_addr old_mac;
int ret;
@@ -1554,8 +1555,10 @@ i40e_macaddr_add(struct rte_eth_dev *dev,
(void)rte_memcpy(&old_mac, hw->mac.addr, ETHER_ADDR_LEN);
(void)rte_memcpy(hw->mac.addr, mac_addr->addr_bytes,
ETHER_ADDR_LEN);
+   (void)rte_memcpy(&mac_filter.mac_addr, mac_addr, ETHER_ADDR_LEN);
+   mac_filter.filter_type = RTE_MACVLAN_PERFECT_MATCH;

-   ret = i40e_vsi_add_mac(vsi, mac_addr);
+   ret = i40e_vsi_add_mac(vsi, &mac_filter);
if (ret != I40E_SUCCESS) {
PMD_DRV_LOG(ERR, "Failed to add MACVLAN filter");
return;
@@ -2472,6 +2475,7 @@ i40e_update_default_filter_setting(struct i40e_vsi *vsi)
 {
struct i40e_hw *hw = I40E_VSI_TO_HW(vsi);
struct i40e_aqc_remove_macvlan_element_data def_filter;
+   struct i40e_mac_filter_info filter;
int ret;

if (vsi->type != I40E_VSI_MAIN)
@@ -2485,6 +2489,7 @@ i40e_update_default_filter_setting(struct i40e_vsi *vsi)
ret = i40e_aq_remove_macvlan(hw, vsi->seid, &def_filter, 1, NULL);
if (ret != I40E_SUCCESS) {
struct i40e_mac_filter *f;
+   struct ether_addr *mac;

PMD_DRV_LOG(WARNING, "Cannot remove the default "
"macvlan filter");
@@ -2494,15 +2499,18 @@ i40e_update_default_filter_setting(struct i40e_vsi *vsi)
PMD_DRV_LOG(ERR, "failed to allocate memory");
return I40E_ERR_NO_MEMORY;
}
-   (void)rte_memcpy(&f->macaddr.addr_bytes, hw->mac.perm_addr,
+   mac = &f->mac_info.mac_addr;
+   (void)rte_memcpy(&mac->addr_bytes, hw->mac.perm_addr,
ETH_ADDR_LEN);
TAILQ_INSERT_TAIL(&vsi->mac_list, f, next);
vsi->mac_num++;

return ret;
}
-
-   return i40e_vsi_add_mac(vsi, (struct ether_addr *)(hw->mac.perm_addr));
+   (void)rte_memcpy(&filter.mac_addr,
+   (struct ether_addr *)(hw->mac.perm_addr), ETH_ADDR_LEN);
+   filter.filter_type = RTE_MACVLAN_PERFECT_MATCH;
+   return i40e_vsi_add_mac(vsi, &filter);
 }

 static int
@@ -2556,6 +2564,7 @@ i40e_vsi_setup(struct i40e_pf *pf,
 {
struct i40e_hw *hw = I40E_PF_TO_HW(pf);
struct i40e_vsi *vsi;
+   struct i40e_mac_filter_info filter;
int ret;
struct i40e_vsi_context ctxt;
struct ether_addr broadcast =
@@ -2766,7 +2775,10 @@ i40e_vsi_setup(struct i40e_pf *pf,
}

/* MAC/VLAN configuration */
-   ret = i40e_vsi_add_mac(vsi, &broadcast);
+   (void)rte_memcpy(&filter.mac_addr, &broadcast, ETHER_ADDR_LEN);
+   filter.filter_type = RTE_MACVLAN_PERFECT_MATCH;
+
+   ret = i40e_vsi_add_mac(vsi, &filter);
if (ret != I40E_SUCCESS) {
PMD_DRV_LOG(ERR, "Failed to add MACVLAN filter");
goto fail_msix_alloc;
@@ -3467,6 +3479,7 @@ i40e_add_macvlan_filters(struct i40e_vsi *vsi,
 {
int ele_num, ele_buff_size;
int num, actual_num, i;
+   uint16_t flags;
int ret = I40E_SUCCESS;
struct i40e_hw *hw = I40E_VSI_TO_HW(vsi);
struct i40e_aqc_add_macvlan_element_data *req_list;
@@ -3492,9 +3505,31 @@ i40e_add_macvlan_filters(struct i40e_vsi *vsi,
&filter[num + i].macaddr, ETH_ADDR_LEN);
req_list[i].vlan_tag =
rte_cpu_to_le_16(filter[num + i].vlan_id);
-   req_list[i].flags = rte_cpu_to_le_16(\
-   I40E_AQC_MACVLAN_ADD_PERFECT_MATCH);
+
+   switch (filter[num + i].filter_type) {
+   case RTE_MAC_PERFECT_MATCH:
+   flags = I40E_AQC_MACVLAN_ADD_PERFECT_MATCH |
+   I40E_AQC_MACVLAN_ADD_IGNORE_VLAN;
+

[dpdk-dev] [PATCH v2 4/4] app/testpmd:test VF MAC filter

2014-10-24 Thread Jijiang Liu
Add a test command in testpmd to test the VF MAC filter feature.

Signed-off-by: Jijiang Liu 
---
 app/test-pmd/cmdline.c |  119 ++-
 1 files changed, 116 insertions(+), 3 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 0b972f9..baa968b 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -351,9 +351,14 @@ static void cmd_help_long_parsed(void *parsed_result,
"e.g., 'set stat_qmap rx 0 2 5' sets rx queue 2"
" on port 0 to mapping 5.\n\n"

-   "set port (port_id) vf (vf_id) rx|tx on|off \n"
+   "set port (port_id) vf (vf_id) rx|tx on|off\n"
"Enable/Disable a VF receive/tranmit from a 
port\n\n"

+   "set port (port_id) vf (vf_id) (mac_addr)"
+   " (exact-mac#exact-mac-vlan#hashmac|hashmac-vlan) 
on|off\n"
+   "   Add/Remove unicast or multicast MAC addr filter"
+   " for a VF.\n\n"
+
"set port (port_id) vf (vf_id) rxmode (AUPE|ROPE|BAM"
"|MPE) (on|off)\n"
"AUPE:accepts untagged VLAN;"
@@ -5809,6 +5814,112 @@ cmdline_parse_inst_t cmd_set_uc_all_hash_filter = {
},
 };

+/* *** CONFIGURE MACVLAN FILTER FOR VF(s) *** */
+struct cmd_set_vf_macvlan_filter {
+   cmdline_fixed_string_t set;
+   cmdline_fixed_string_t port;
+   uint8_t port_id;
+   cmdline_fixed_string_t vf;
+   uint8_t vf_id;
+   struct ether_addr address;
+   cmdline_fixed_string_t filter_type;
+   cmdline_fixed_string_t mode;
+};
+
+static void
+cmd_set_vf_macvlan_parsed(void *parsed_result,
+  __attribute__((unused)) struct cmdline *cl,
+  __attribute__((unused)) void *data)
+{
+   int is_on, ret = 0;
+   struct cmd_set_vf_macvlan_filter *res = parsed_result;
+   struct rte_eth_mac_filter filter;
+
+   memset(&filter, 0, sizeof(struct rte_eth_mac_filter));
+
+   (void)rte_memcpy(&filter.mac_addr, &res->address, ETHER_ADDR_LEN);
+
+   /* set VF MAC filter */
+   filter.is_vf = 1;
+
+   /* set VF ID */
+   filter.dst_id = res->vf_id;
+
+   if (!strcmp(res->filter_type, "exact-mac"))
+   filter.filter_type = RTE_MAC_PERFECT_MATCH;
+   else if (!strcmp(res->filter_type, "exact-mac-vlan"))
+   filter.filter_type = RTE_MACVLAN_PERFECT_MATCH;
+   else if (!strcmp(res->filter_type, "hashmac"))
+   filter.filter_type = RTE_MAC_HASH_MATCH;
+   else if (!strcmp(res->filter_type, "hashmac-vlan"))
+   filter.filter_type = RTE_MACVLAN_HASH_MATCH;
+
+   is_on = (strcmp(res->mode, "on") == 0) ? 1 : 0;
+
+   if (is_on)
+   ret = rte_eth_dev_filter_ctrl(res->port_id,
+   RTE_ETH_FILTER_MACVLAN,
+   RTE_ETH_FILTER_ADD,
+&filter);
+   else
+   ret = rte_eth_dev_filter_ctrl(res->port_id,
+   RTE_ETH_FILTER_MACVLAN,
+   RTE_ETH_FILTER_DELETE,
+   &filter);
+
+   if (ret < 0)
+   printf("bad set MAC hash parameter, return code = %d\n", ret);
+
+}
+
+cmdline_parse_token_string_t cmd_set_vf_macvlan_set =
+   TOKEN_STRING_INITIALIZER(struct cmd_set_vf_macvlan_filter,
+set, "set");
+cmdline_parse_token_string_t cmd_set_vf_macvlan_port =
+   TOKEN_STRING_INITIALIZER(struct cmd_set_vf_macvlan_filter,
+port, "port");
+cmdline_parse_token_num_t cmd_set_vf_macvlan_portid =
+   TOKEN_NUM_INITIALIZER(struct cmd_set_vf_macvlan_filter,
+ port_id, UINT8);
+cmdline_parse_token_string_t cmd_set_vf_macvlan_vf =
+   TOKEN_STRING_INITIALIZER(struct cmd_set_vf_macvlan_filter,
+vf, "vf");
+cmdline_parse_token_num_t cmd_set_vf_macvlan_vf_id =
+   TOKEN_NUM_INITIALIZER(struct cmd_set_vf_macvlan_filter,
+   vf_id, UINT8);
+cmdline_parse_token_etheraddr_t cmd_set_vf_macvlan_mac =
+   TOKEN_ETHERADDR_INITIALIZER(struct cmd_set_vf_macvlan_filter,
+   address);
+cmdline_parse_token_string_t cmd_set_vf_macvlan_filter_type =
+   TOKEN_STRING_INITIALIZER(struct cmd_set_vf_macvlan_filter,
+   filter_type, "exact-mac#exact-mac-vlan"
+   "#hashmac#hashmac-vlan");
+cmdline_parse_token_string_t cmd_set_vf_macvlan_mode =
+   TOKEN_STRING_INITIALIZER(struct cmd_set_vf_macvlan_filter,
+mode, "on#off");
+
+cmdline_parse_inst_t cmd_set_vf_macvlan_filter = {
+   .f = cmd_set_vf_macvlan_parsed,
+   .data = N

[dpdk-dev] [PATCH v2 3/4] i40e:add VF MAC filter

2014-10-24 Thread Jijiang Liu
It mainly adds the i40e_vf_mac_filter_set() function to support perfect match and
hash match of MAC address and VLAN ID for VFs.

Signed-off-by: Jijiang Liu 
---
 lib/librte_pmd_i40e/i40e_ethdev.c |  118 -
 1 files changed, 116 insertions(+), 2 deletions(-)

diff --git a/lib/librte_pmd_i40e/i40e_ethdev.c 
b/lib/librte_pmd_i40e/i40e_ethdev.c
index 5fae0e1..f9e3aa8 100644
--- a/lib/librte_pmd_i40e/i40e_ethdev.c
+++ b/lib/librte_pmd_i40e/i40e_ethdev.c
@@ -1605,6 +1605,119 @@ i40e_macaddr_remove(struct rte_eth_dev *dev, uint32_t 
index)
memset(&pf->dev_addr, 0, sizeof(struct ether_addr));
 }

+/* Set perfect match or hash match of MAC and VLAN for a VF */
+static int
+i40e_vf_mac_filter_set(struct i40e_pf *pf,
+struct rte_eth_mac_filter *filter,
+bool add)
+{
+   struct i40e_hw *hw;
+   struct i40e_mac_filter_info mac_filter;
+   struct ether_addr old_mac;
+   struct ether_addr *new_mac;
+   struct i40e_pf_vf *vf = NULL;
+   uint16_t vf_id;
+   int ret;
+
+   if (pf == NULL) {
+   PMD_DRV_LOG(ERR, "Invalid PF argument\n");
+   return -EINVAL;
+   }
+   hw = I40E_PF_TO_HW(pf);
+
+   if (filter == NULL) {
+   PMD_DRV_LOG(ERR, "Invalid mac filter argument\n");
+   return -EINVAL;
+   }
+
+   new_mac = &filter->mac_addr;
+
+   if (is_zero_ether_addr(new_mac)) {
+   PMD_DRV_LOG(ERR, "Invalid ethernet address\n");
+   return -EINVAL;
+   }
+
+   vf_id = filter->dst_id;
+
+   if (vf_id > pf->vf_num - 1 || !pf->vfs) {
+   PMD_DRV_LOG(ERR, "Invalid argument\n");
+   return -EINVAL;
+   }
+   vf = &pf->vfs[vf_id];
+
+   if (add && is_same_ether_addr(new_mac, &(pf->dev_addr))) {
+   PMD_DRV_LOG(INFO, "Ignore adding permanent MAC address\n");
+   return -EINVAL;
+   }
+
+   if (add) {
+   (void)rte_memcpy(&old_mac, hw->mac.addr, ETHER_ADDR_LEN);
+   (void)rte_memcpy(hw->mac.addr, new_mac->addr_bytes,
+   ETHER_ADDR_LEN);
+   (void)rte_memcpy(&mac_filter.mac_addr, &filter->mac_addr,
+ETHER_ADDR_LEN);
+
+   mac_filter.filter_type = filter->filter_type;
+   ret = i40e_vsi_add_mac(vf->vsi, &mac_filter);
+   if (ret != I40E_SUCCESS) {
+   PMD_DRV_LOG(ERR, "Failed to add MAC filter\n");
+   return -1;
+   }
+   ether_addr_copy(new_mac, &pf->dev_addr);
+   } else {
+   (void)rte_memcpy(hw->mac.addr, hw->mac.perm_addr,
+   ETHER_ADDR_LEN);
+   ret = i40e_vsi_delete_mac(vf->vsi, &filter->mac_addr);
+   if (ret != I40E_SUCCESS) {
+   PMD_DRV_LOG(ERR, "Failed to delete MAC filter\n");
+   return -1;
+   }
+
+   /* Clear device address as it has been removed */
+   if (is_same_ether_addr(&(pf->dev_addr), new_mac))
+   memset(&pf->dev_addr, 0, sizeof(struct ether_addr));
+   }
+
+   return 0;
+}
+
+/* MAC filter handle */
+static int
+i40e_mac_filter_handle(struct rte_eth_dev *dev, enum rte_filter_op filter_op,
+   void *arg)
+{
+   struct i40e_pf *pf = I40E_DEV_PRIVATE_TO_PF(dev->data->dev_private);
+   struct rte_eth_mac_filter *filter;
+   struct i40e_hw *hw = I40E_PF_TO_HW(pf);
+   int ret = I40E_NOT_SUPPORTED;
+
+   filter = (struct rte_eth_mac_filter *)(arg);
+
+   switch (filter_op) {
+   case RTE_ETH_FILTER_NONE:
+   ret = I40E_SUCCESS;
+   break;
+   case RTE_ETH_FILTER_ADD:
+   i40e_pf_disable_irq0(hw);
+   if (filter->is_vf)
+   ret = i40e_vf_mac_filter_set(pf, filter, 1);
+   i40e_pf_enable_irq0(hw);
+   break;
+   case RTE_ETH_FILTER_DELETE:
+   i40e_pf_disable_irq0(hw);
+   if (filter->is_vf)
+   ret = i40e_vf_mac_filter_set(pf, filter, 0);
+   i40e_pf_enable_irq0(hw);
+   break;
+   default:
+   PMD_DRV_LOG(ERR, "unknown operation %u\n", filter_op);
+   ret = I40E_ERR_PARAM;
+   break;
+   }
+
+   return ret;
+}
+
 static int
 i40e_dev_rss_reta_update(struct rte_eth_dev *dev,
 struct rte_eth_rss_reta *reta_conf)
@@ -4243,13 +4356,14 @@ i40e_dev_filter_ctrl(struct rte_eth_dev *dev,
 void *arg)
 {
int ret = 0;
-   (void)filter_op;
-   (void)arg;

if (dev == NULL)
return -EINVAL;

switch (filter_type) {
+   case RTE_ETH_FILTER_MACVLAN:
+   ret = i40e_mac_filter_handle(dev, filter_op, arg);
+break;
de

[dpdk-dev] [PATCH 0/3] Vhost app removes dependency of REFCNT

2014-10-24 Thread Ouyang Changchun
To remove the dependency on RTE_MBUF_REFCNT for vhost zero copy,
the mbuf needs to introduce EXTERNAL_MBUF (in ol_flags) to indicate that it
is attached to an external buffer, say, from guest space. The external
buffer must not be freed when the mbuf itself is freed in the host. In
addition, the RX functions in the PMDs need to make sure they do not
overwrite this flag when filling ol_flags from descriptors to the mbuf.

Changchun Ouyang (3):
  mbuf uses EXTERNAL_MBUF in ol_flags to indicate that it is an external
    buffer. When freeing such an mbuf, only the mbuf itself needs to be
    put back into the mempool; the attached external buffer is not
    freed. The user/caller needs to take care of detaching and freeing
    the external buffer.
  Every PMD RX function needs to keep the EXTERNAL_MBUF flag in
    mbuf.ol_flags and must not overwrite it when filling ol_flags from
    the descriptor to the mbuf. Otherwise, it will probably crash when
    freeing an mbuf and trying to free its attached external
    buffer, say, from guest space.
  vhost zero copy removes the dependency on the REFCNT macro by using
    the EXTERNAL_MBUF flag in mbuf.ol_flags to indicate that it is an
    external buffer from the guest.

 examples/vhost/main.c | 19 +--
 lib/librte_mbuf/rte_mbuf.h|  5 -
 lib/librte_pmd_e1000/igb_rxtx.c   |  5 +++--
 lib/librte_pmd_i40e/i40e_rxtx.c   |  8 +---
 lib/librte_pmd_ixgbe/ixgbe_rxtx.c |  8 +---
 lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c | 12 
 6 files changed, 30 insertions(+), 27 deletions(-)

-- 
1.8.4.2



[dpdk-dev] [PATCH 2/3] pmd: RX function need keep EXTERNAL_MBUF flag

2014-10-24 Thread Ouyang Changchun
Every PMD RX function needs to keep the EXTERNAL_MBUF flag
in mbuf.ol_flags and must not overwrite it when filling ol_flags from the
descriptor to the mbuf. Otherwise, it will probably crash when freeing an mbuf
and trying to free its attached external buffer, say, from guest space.
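
The idea is a plain mask-and-merge on the flags word. The standalone sketch
below is illustrative only: struct demo_mbuf, demo_set_rx_flags, the bit chosen
for DEMO_EXTERNAL_MBUF and the 64-bit flag width are all assumptions made for
the example, not the definitions from rte_mbuf.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit position, chosen only for this example. */
#define DEMO_EXTERNAL_MBUF (1ULL << 55)

/* Minimal stand-in for an mbuf; only the flags word matters here. */
struct demo_mbuf {
	uint64_t ol_flags;
};

/* Store the flags decoded from an RX descriptor while keeping the
 * "external buffer" bit that was set before the mbuf was given to
 * the NIC. A plain assignment would lose that bit. */
void
demo_set_rx_flags(struct demo_mbuf *m, uint64_t pkt_flags)
{
	assert(m != NULL);
	m->ol_flags = pkt_flags | (m->ol_flags & DEMO_EXTERNAL_MBUF);
}
```

All other bits are still replaced by the descriptor-derived flags, so stale RX
flags from a previous use of the mbuf do not leak through.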

Signed-off-by: Changchun Ouyang 
---
 lib/librte_pmd_e1000/igb_rxtx.c   |  5 +++--
 lib/librte_pmd_i40e/i40e_rxtx.c   |  8 +---
 lib/librte_pmd_ixgbe/ixgbe_rxtx.c |  8 +---
 lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c | 12 
 4 files changed, 21 insertions(+), 12 deletions(-)

diff --git a/lib/librte_pmd_e1000/igb_rxtx.c b/lib/librte_pmd_e1000/igb_rxtx.c
index f09c525..4123310 100644
--- a/lib/librte_pmd_e1000/igb_rxtx.c
+++ b/lib/librte_pmd_e1000/igb_rxtx.c
@@ -786,7 +786,7 @@ eth_igb_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
pkt_flags = rx_desc_hlen_type_rss_to_pkt_flags(hlen_type_rss);
pkt_flags = pkt_flags | rx_desc_status_to_pkt_flags(staterr);
pkt_flags = pkt_flags | rx_desc_error_to_pkt_flags(staterr);
-   rxm->ol_flags = pkt_flags;
+   rxm->ol_flags = pkt_flags | (rxm->ol_flags & EXTERNAL_MBUF);

/*
 * Store the mbuf address into the next entry of the array
@@ -1020,7 +1020,8 @@ eth_igb_recv_scattered_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
pkt_flags = rx_desc_hlen_type_rss_to_pkt_flags(hlen_type_rss);
pkt_flags = pkt_flags | rx_desc_status_to_pkt_flags(staterr);
pkt_flags = pkt_flags | rx_desc_error_to_pkt_flags(staterr);
-   first_seg->ol_flags = pkt_flags;
+   first_seg->ol_flags = pkt_flags |
+   (first_seg->ol_flags & EXTERNAL_MBUF);

/* Prefetch data of first segment, if configured to do so. */
rte_packet_prefetch((char *)first_seg->buf_addr +
diff --git a/lib/librte_pmd_i40e/i40e_rxtx.c b/lib/librte_pmd_i40e/i40e_rxtx.c
index 2b53677..68c3695 100644
--- a/lib/librte_pmd_i40e/i40e_rxtx.c
+++ b/lib/librte_pmd_i40e/i40e_rxtx.c
@@ -637,7 +637,8 @@ i40e_rx_scan_hw_ring(struct i40e_rx_queue *rxq)
pkt_flags = i40e_rxd_status_to_pkt_flags(qword1);
pkt_flags |= i40e_rxd_error_to_pkt_flags(qword1);
pkt_flags |= i40e_rxd_ptype_to_pkt_flags(qword1);
-   mb->ol_flags = pkt_flags;
+   mb->ol_flags = pkt_flags |
+   (mb->ol_flags & EXTERNAL_MBUF);
if (pkt_flags & PKT_RX_RSS_HASH)
mb->hash.rss = rte_le_to_cpu_32(\
rxdp->wb.qword0.hi_dword.rss);
@@ -873,7 +874,7 @@ i40e_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
pkt_flags = i40e_rxd_status_to_pkt_flags(qword1);
pkt_flags |= i40e_rxd_error_to_pkt_flags(qword1);
pkt_flags |= i40e_rxd_ptype_to_pkt_flags(qword1);
-   rxm->ol_flags = pkt_flags;
+   rxm->ol_flags = pkt_flags | (rxm->ol_flags & EXTERNAL_MBUF);
if (pkt_flags & PKT_RX_RSS_HASH)
rxm->hash.rss =
rte_le_to_cpu_32(rxd.wb.qword0.hi_dword.rss);
@@ -1027,7 +1028,8 @@ i40e_recv_scattered_pkts(void *rx_queue,
pkt_flags = i40e_rxd_status_to_pkt_flags(qword1);
pkt_flags |= i40e_rxd_error_to_pkt_flags(qword1);
pkt_flags |= i40e_rxd_ptype_to_pkt_flags(qword1);
-   first_seg->ol_flags = pkt_flags;
+   first_seg->ol_flags = pkt_flags |
+   (first_seg->ol_flags & EXTERNAL_MBUF);
if (pkt_flags & PKT_RX_RSS_HASH)
rxm->hash.rss =
rte_le_to_cpu_32(rxd.wb.qword0.hi_dword.rss);
diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
index 1aefe5c..77e8689 100644
--- a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
+++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
@@ -949,7 +949,8 @@ ixgbe_rx_scan_hw_ring(struct igb_rx_queue *rxq)
/* reuse status field from scan list */
pkt_flags |= rx_desc_status_to_pkt_flags(s[j]);
pkt_flags |= rx_desc_error_to_pkt_flags(s[j]);
-   mb->ol_flags = pkt_flags;
+   mb->ol_flags = pkt_flags |
+   (mb->ol_flags & EXTERNAL_MBUF);

if (likely(pkt_flags & PKT_RX_RSS_HASH))
mb->hash.rss = rxdp[j].wb.lower.hi_dword.rss;
@@ -1271,7 +1272,7 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
pkt_flags = rx_desc_hlen_type_rss_to_pkt_flags(hlen_type_rss);
pkt_flags = pkt_flags | rx_desc_status_to_pkt_flags(staterr);
pkt_flags = pkt_flags | rx_desc

[dpdk-dev] [PATCH 1/3] mbuf: Use EXTERNAL_MBUF to indicate external buffer

2014-10-24 Thread Ouyang Changchun
An mbuf uses EXTERNAL_MBUF in ol_flags to indicate that it is attached to an
external buffer. When such an mbuf is freed, only the mbuf itself is put
back into the mempool; the attached external buffer is not freed. The user/caller
needs to take care of detaching and freeing the external buffer.

Signed-off-by: Changchun Ouyang 
---
 lib/librte_mbuf/rte_mbuf.h | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index ddadc21..8cee8fa 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -114,6 +114,9 @@ extern "C" {
 /* Bit 51 - IEEE1588*/
 #define PKT_TX_IEEE1588_TMST (1ULL << 51) /**< TX IEEE1588 packet to timestamp. */

+/* Bit 62 - Indicate it is external buffer */
+#define EXTERNAL_MBUF        (1ULL << 62) /**< External buffer. */
+
 /* Use final bit of flags to indicate a control mbuf */
 #define CTRL_MBUF_FLAG   (1ULL << 63) /**< Mbuf contains control data */

@@ -670,7 +673,7 @@ __rte_pktmbuf_prefree_seg(struct rte_mbuf *m)
 *  - detach mbuf
 *  - free attached mbuf segment
 */
-   if (unlikely (md != m)) {
+   if (unlikely((md != m) && !(m->ol_flags & EXTERNAL_MBUF))) {
rte_pktmbuf_detach(m);
if (rte_mbuf_refcnt_update(md, -1) == 0)
__rte_mbuf_raw_free(md);
-- 
1.8.4.2
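
The free-path condition changed by this patch can be sketched in isolation. This is a minimal sketch under stated assumptions: `sketch_mbuf`, `SKETCH_EXTERNAL_MBUF` and `sketch_should_release_direct` are hypothetical stand-ins and not DPDK definitions, and the `direct` pointer plays the role of what the patch calls `md`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the flag added by the patch (same bit position). */
#define SKETCH_EXTERNAL_MBUF (1ULL << 62)

/* Hypothetical, heavily simplified mbuf; not the real struct rte_mbuf. */
struct sketch_mbuf {
	uint64_t ol_flags;
	/* mbuf that owns the data buffer; points to itself when nothing is attached */
	struct sketch_mbuf *direct;
};

/*
 * Mirrors the patched condition:
 *   (md != m) && !(m->ol_flags & EXTERNAL_MBUF)
 * Returns true when the free path should detach m and drop a reference
 * on the direct mbuf. For an external buffer it returns false, so the
 * buffer is left for the caller to detach and free.
 */
static bool
sketch_should_release_direct(const struct sketch_mbuf *m)
{
	return m->direct != m && !(m->ol_flags & SKETCH_EXTERNAL_MBUF);
}
```

With the flag set, the function reports that only the mbuf itself should go back to the pool, which is the behaviour the commit message describes.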



[dpdk-dev] [PATCH 3/3] vhost: Removes dependency on REFCNT for zero copy

2014-10-24 Thread Ouyang Changchun
Vhost zero copy removes the dependency on macro REFCNT
by using EXTERNAL_MBUF flag in mbuf.ol_flags to indicate
it is an external buffer from guest.

Signed-off-by: Changchun Ouyang 
---
 examples/vhost/main.c | 19 +--
 1 file changed, 5 insertions(+), 14 deletions(-)

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index fa0ad0c..e3b1884 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -713,19 +713,6 @@ us_vhost_parse_args(int argc, char **argv)
return -1;
} else
zero_copy = ret;
-
-   if (zero_copy) {
-#ifdef RTE_MBUF_REFCNT
-   RTE_LOG(ERR, VHOST_CONFIG, "Before running "
-   "zero copy vhost APP, please "
-   "disable RTE_MBUF_REFCNT\n"
-   "in config file and then rebuild DPDK "
-   "core lib!\n"
-   "Otherwise please disable zero copy "
-   "flag in command line!\n");
-   return -1;
-#endif
-   }
}

/* Specify the descriptor number on RX. */
@@ -1453,6 +1440,7 @@ attach_rxmbuf_zcp(struct virtio_net *dev)
mbuf->buf_physaddr = phys_addr - RTE_PKTMBUF_HEADROOM;
mbuf->data_len = desc->len;
MBUF_HEADROOM_UINT32(mbuf) = (uint32_t)desc_idx;
+   mbuf->ol_flags |= EXTERNAL_MBUF;

LOG_DEBUG(VHOST_DATA,
"(%"PRIu64") in attach_rxmbuf_zcp: res base idx:%d, "
@@ -1489,6 +1477,8 @@ static inline void pktmbuf_detach_zcp(struct rte_mbuf *m)
m->data_off = buf_ofs;

m->data_len = 0;
+
+   m->ol_flags &= ~EXTERNAL_MBUF;
 }

 /*
@@ -1805,8 +1795,9 @@ virtio_tx_route_zcp(struct virtio_net *dev, struct rte_mbuf *m,
mbuf->data_off = m->data_off;
mbuf->buf_physaddr = m->buf_physaddr;
mbuf->buf_addr = m->buf_addr;
+   mbuf->ol_flags |= EXTERNAL_MBUF;
}
-   mbuf->ol_flags = PKT_TX_VLAN_PKT;
+   mbuf->ol_flags |= PKT_TX_VLAN_PKT;
mbuf->vlan_tci = vlan_tag;
mbuf->l2_len = sizeof(struct ether_hdr);
mbuf->l3_len = sizeof(struct ipv4_hdr);
-- 
1.8.4.2



[dpdk-dev] [dpdk-announce] DPDK Features for Q1 2015

2014-10-24 Thread O'driscoll, Tim
> From: Matthew Hall [mailto:mhall at mhcomputing.net]
> 
> On Wed, Oct 22, 2014 at 01:48:36PM +, O'driscoll, Tim wrote:
> > Single Virtio Driver: Merge existing Virtio drivers into a single
> > implementation, incorporating the best features from each of the
> > existing drivers.
> 
> Specifically, in the virtio-net case above, I have discovered, and Sergio at
> Intel just reproduced today, that neither virtio PMD works at all inside of
> VirtualBox. One can't init, and the other gets into an infinite loop. But yet
> it's claiming support for VBox on the DPDK Supported NICs page though it
> doesn't seem it ever could have worked.

At the moment, within Intel we test with KVM, Xen and ESXi. We've never tested 
with VirtualBox. So, maybe this is an error on the Supported NICs page, or 
maybe somebody else is testing that configuration.

> So I'd like to request an initiative alongside any virtio-net and/or vmxnet3
> type of changes, to make some kind of a Virtualization Test Lab, where we
> support VMWare ESXi, QEMU, Xen, VBox, and the other popular VM
> systems.
> 
> Otherwise it's hard for us community / app developers to make the DPDK
> available to end users in simple, elegant ways, such as packaging it into
> Vagrant VM's, Amazon AMI's etc. which are prebaked and ready-to-run.

Expanding the scope of virtualization testing is a good idea, especially given 
industry trends like NFV. We're in the process of getting our DPDK Test Suite 
ready to push to dpdk.org soon. The hope is that others will use it to validate 
changes they're making to DPDK, and contribute test cases so that we can build 
up a more comprehensive set over time.

One area where this does need further work is in virtualization. At the moment, 
our virtualization tests are manual, so they won't be included in the initial 
DPDK Test Suite release. We will look into automating our current 
virtualization tests and adding these to the test suite in future.

> Another thing which would help in this area would be additional
> improvements to the NUMA / socket / core / number of NICs / number of
> queues autodetections. To write a single app which can run on a virtual card,
> a hardware card without RSS available, and a hardware card with RSS
> available, in a thread-safe, flow-safe way, is somewhat complex at the
> present time.
> 
> I'm running into this in the VM based environments because most VNIC's
> don't have RSS and it complicates the process of keeping consistent state of
> the flows among the cores.

This is interesting. Do you have more details on what you're thinking here, 
that perhaps could be used as the basis for an RFC?


Tim


[dpdk-dev] [PATCH] vhost: Check descriptor number for vector Rx

2014-10-24 Thread Ouyang Changchun
For zero copy, check whether the RX descriptor number meets the
minimum requirement when the vector PMD Rx function is used, and give the user
more hints if it does not.

Signed-off-by: Changchun Ouyang 
---
 examples/vhost/main.c | 17 +
 1 file changed, 17 insertions(+)

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index 291128e..87ab854 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -131,6 +131,10 @@
 #define RTE_TEST_RX_DESC_DEFAULT_ZCP 32   /* legacy: 32, DPDK virt FE: 128. */
 #define RTE_TEST_TX_DESC_DEFAULT_ZCP 64   /* legacy: 64, DPDK virt FE: 64.  */

+#ifdef RTE_IXGBE_INC_VECTOR
+#define VPMD_RX_BURST 32
+#endif
+
 /* Get first 4 bytes in mbuf headroom. */
 #define MBUF_HEADROOM_UINT32(mbuf) (*(uint32_t *)((uint8_t *)(mbuf) \
+ sizeof(struct rte_mbuf)))
@@ -792,6 +796,19 @@ us_vhost_parse_args(int argc, char **argv)
return -1;
}

+#ifdef RTE_IXGBE_INC_VECTOR
+   if ((zero_copy == 1) && (num_rx_descriptor <= VPMD_RX_BURST)) {
+   RTE_LOG(INFO, VHOST_PORT,
+   "The RX desc num: %d is too small for PMD to work\n"
+   "properly, please enlarge it to bigger than %d if\n"
+   "possible by the option: '--rx-desc-num '\n"
+   "One alternative is disabling RTE_IXGBE_INC_VECTOR\n"
+   "in config file and rebuild the libraries.\n",
+   num_rx_descriptor, VPMD_RX_BURST);
+   return -1;
+   }
+#endif
+
return 0;
 }

-- 
1.8.4.2



[dpdk-dev] EAL : Input/output error on DPDK 1.7.1

2014-10-24 Thread Stephen Hemminger
INTX is badly emulated in VMware; the disable logic doesn't work.
I thought the DPDK API detected when the link state interrupt would not work.
But of course the application needs to check that before enabling link state.

On Fri, Oct 24, 2014 at 8:52 AM, Masaru Oki wrote:

> Hi,
> I got the same result in a VMware Workstation environment.
> At least in my environment, the INTX toggle check does not work with the VMware
> E1000 Ethernet.
> Please try the attached patch.
>
> 2014-10-17 3:04 GMT+09:00 Raghav K :
> > Hey,
> > I observe continuous burst of I/O Errors, as indicated below, with the
> > testpmd application with DPDK 1.7.1. This seems to originate from
> > eal_intr_process_interrupts() function. I seemed to have setup the DPDK
> > prerequisites alright.
> > Another recent post seemed to suggest moving back to 1.7.0, however I
> > would like to persist with 1.7.1.
> > Any help/pointers in resolving this would be greatly appreciated.
> > Much thanks,
> > Raghav
> >
> > root at sys6-vm6:/home/rghv/dpdk/dpdk-1.7.1/x86_64-native-linuxapp-gcc/app# ./testpmd -c 0xf -n3 -- -i --nb-cores=3 --nb-ports=2
> > EAL: Error reading from file descriptor 21: Input/output error
> > EAL: Error reading from file descriptor 21: Input/output error
> > EAL: Error reading from file descriptor 21: Input/output error
> > EAL: Error reading from file descriptor 21: Input/output error
> > EAL: Error reading from file descriptor 21: Input/output error
> > EAL: Error reading from file descriptor 21: Input/output error
> > EAL: Error reading from file descriptor 21: Input/output error
> > EAL: Error reading from file descriptor 21: Input/output error
> > EAL: Error reading from file descriptor 21: Input/output error
> > EAL: Error reading from file descriptor 21: Input/output error
> > EAL: Error reading from file descriptor 21: Input/output error
> > EAL: Error reading from file descriptor 21: Input/output error
> > EAL: Error reading from file descriptor 21: Input/output error
> > EAL: Error reading from file descriptor 21: Input/output error
> > EAL: Error reading from file descriptor 21: Input/output error
> > EAL: Error reading from file descriptor 21: Input/output error
> >
> > root at sys6-vm6:/home/rghv/dpdk/dpdk-1.7.1# ./tools/dpdk_nic_bind.py --status
> >
> > Network devices using DPDK-compatible driver
> > :02:01.0 '82545EM Gigabit Ethernet Controller (Copper)' drv=igb_uio unused=e1000
> > :02:02.0 '82545EM Gigabit Ethernet Controller (Copper)' drv=igb_uio unused=e1000
> >
> > Network devices using kernel driver
> > :02:00.0 '82545EM Gigabit Ethernet Controller (Copper)' if=eth0 drv=e1000 unused=igb_uio *Active*
> > :02:03.0 '82545EM Gigabit Ethernet Controller (Copper)' if=eth3 drv=e1000 unused=igb_uio
> > :02:05.0 '82545EM Gigabit Ethernet Controller (Copper)' if=eth4 drv=e1000 unused=igb_uio
> > :02:06.0 '82545EM Gigabit Ethernet Controller (Copper)' if=eth5 drv=e1000 unused=igb_uio
> >
> > Other network devices
>


[dpdk-dev] DPDK Community Conference Call - Friday 31st October

2014-10-24 Thread O'driscoll, Tim
We're planning to hold our first community conference call on Friday 31st 
October. It's impossible to find a time that suits everybody, so we've chosen 
to do this in the afternoon/evening in Europe, which is the morning in the USA. 
This does unfortunately limit participation from PRC, Japan and other parts of 
the world. Here's the time and date in a variety of time zones:

Dublin (Ireland)                     Friday, October 31, 2014 at 4:00:00 PM     GMT  UTC
Paris (France)                       Friday, October 31, 2014 at 5:00:00 PM     CET  UTC+1 hour
San Francisco (U.S.A. - California)  Friday, October 31, 2014 at 9:00:00 AM     PDT  UTC-7 hours
New York (U.S.A. - New York)         Friday, October 31, 2014 at 12:00:00 Noon  EDT  UTC-4 hours
Tel Aviv (Israel)                    Friday, October 31, 2014 at 6:00:00 PM     IST  UTC+2 hours
Moscow (Russia)                      Friday, October 31, 2014 at 7:00:00 PM     MSK  UTC+3 hours


Audio bridge details are:
France:  +33 1588 77298
Germany: +49 8999 143191
Israel:  +972 2589 6577
Russia:  +7 495 641 4663
UK:      +44 1793 402663
USA:     +1 916 356 2663

Bridge: 5
Conference ID: 1264677285

If anybody needs an access number for another country, let me know.


Agenda:
Discuss feature list for DPDK 2.0 (Q1 2015).
Suggestions for topics for future calls.


Thanks,
Tim


[dpdk-dev] [PATCH] vhost: Check descriptor number for vector Rx

2014-10-24 Thread Thomas Monjalon
Hi Changchun,

2014-10-24 16:38, Ouyang Changchun:
> For zero copy, it need check whether RX descriptor num meets the 
> least requirement when using vector PMD Rx function, and give user 
> more hints if it fails to meet the least requirement.
[...]
> --- a/examples/vhost/main.c
> +++ b/examples/vhost/main.c
> @@ -131,6 +131,10 @@
>  #define RTE_TEST_RX_DESC_DEFAULT_ZCP 32   /* legacy: 32, DPDK virt FE: 128. */
>  #define RTE_TEST_TX_DESC_DEFAULT_ZCP 64   /* legacy: 64, DPDK virt FE: 64.  */
>  
> +#ifdef RTE_IXGBE_INC_VECTOR
> +#define VPMD_RX_BURST 32
> +#endif
> +
>  /* Get first 4 bytes in mbuf headroom. */
>  #define MBUF_HEADROOM_UINT32(mbuf) (*(uint32_t *)((uint8_t *)(mbuf) \
>   + sizeof(struct rte_mbuf)))
> @@ -792,6 +796,19 @@ us_vhost_parse_args(int argc, char **argv)
>   return -1;
>   }
>  
> +#ifdef RTE_IXGBE_INC_VECTOR
> + if ((zero_copy == 1) && (num_rx_descriptor <= VPMD_RX_BURST)) {
> + RTE_LOG(INFO, VHOST_PORT,
> + "The RX desc num: %d is too small for PMD to work\n"
> + "properly, please enlarge it to bigger than %d if\n"
> + "possible by the option: '--rx-desc-num '\n"
> + "One alternative is disabling RTE_IXGBE_INC_VECTOR\n"
> + "in config file and rebuild the libraries.\n",
> + num_rx_descriptor, VPMD_RX_BURST);
> + return -1;
> + }
> +#endif
> +
>   return 0;
>  }

I feel there is a design problem here.
An application shouldn't have to care about the underlying driver.

-- 
Thomas


[dpdk-dev] [PATCH 0/3] Vhost app removes dependency of REFCNT

2014-10-24 Thread Thomas Monjalon
2014-10-24 16:10, Ouyang Changchun:
> To remove the dependency of RTE_MBUF_REFCNT for vhost zero copy,
> the mbuf need introduce EXTERNAL_MBUF(in ol_flags) to indicate it
> attaches to an external buffer, say, from guest space. And don't
> free the external buffer when freeing the mbuf itself in host, in
> addition, RX function in PMD need make sure not overwrite this flag
> when filling ol_flags from descriptors to mbuf.

So you are replacing refcnt by something else which requires special
handling in drivers.
I feel this is not the right design.
Why do you want to remove refcnt dependency?

-- 
Thomas


[dpdk-dev] [dpdk-announce] DPDK Features for Q1 2015

2014-10-24 Thread Thomas Monjalon
2014-10-24 08:10, O'driscoll, Tim:
> > From: Matthew Hall [mailto:mhall at mhcomputing.net]
> > Specifically, in the virtio-net case above, I have discovered, and Sergio
> > at Intel just reproduced today, that neither virtio PMD works at all inside
> > of VirtualBox. One can't init, and the other gets into an infinite loop. But
> > yet it's claiming support for VBox on the DPDK Supported NICs page though it
> > doesn't seem it ever could have worked.
> 
> At the moment, within Intel we test with KVM, Xen and ESXi. We've never
> tested with VirtualBox. So, maybe this is an error on the Supported NICs
> page, or maybe somebody else is testing that configuration.

I'm the author of this page. I think I've written VirtualBox to show where
virtio is implemented. You interpreted this as "supported environment", so
I'm removing it.
Thanks for testing and reporting.

-- 
Thomas


[dpdk-dev] [PATCH 1/2] ixgbe: remove static qualifier for thread safety

2014-10-24 Thread Bruce Richardson
On Thu, Oct 23, 2014 at 08:43:39AM +0900, Masaru Oki wrote:
> Hi,
> 
> in this code, pointer of local variable (mb_def) is returned by your changes.
> mb_def should be static for each thread.

Actually, no. A copy is made of 8 bytes of the mb_def variable and stored as 
an mbuf initializer inside the rxq structure. No use of the memory occupied 
by mb_def is made outside of the function, so the value does not need to be 
static.
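
To illustrate the point, here is a minimal sketch of copying a value out of an automatic (non-static) template. It assumes hypothetical, simplified types: `sketch_rearm`, `sketch_rxq`, `sketch_rxq_setup` and the headroom value are illustrative and are not the ixgbe driver's actual definitions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_HEADROOM 128 /* assumed value, for illustration only */

/* Hypothetical 8-byte group of mbuf fields that gets re-initialised per packet. */
struct sketch_rearm {
	uint16_t data_off;
	uint16_t refcnt;
	uint8_t nb_segs;
	uint8_t port;
	uint16_t reserved;
};

/* Hypothetical RX queue that keeps only a copy of those 8 bytes. */
struct sketch_rxq {
	uint64_t mbuf_initializer;
};

static int
sketch_rxq_setup(struct sketch_rxq *rxq, uint8_t port_id)
{
	/* Automatic storage: every caller (thread) gets a private template. */
	struct sketch_rearm tmpl = {
		.data_off = SKETCH_HEADROOM,
		.refcnt = 1,
		.nb_segs = 1,
		.port = port_id,
	};

	assert(sizeof(tmpl) == sizeof(rxq->mbuf_initializer));
	/* The bytes are copied by value; no pointer to tmpl outlives this call. */
	memcpy(&rxq->mbuf_initializer, &tmpl, sizeof(rxq->mbuf_initializer));
	return 0;
}
```

Because only the copied value is kept, two threads setting up different queues at the same time cannot overwrite each other's template.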

/Bruce
> 
> 2014-10-22 19:55 GMT+09:00 Bruce Richardson :
> > Remove the "static" prefix to the template mbuf variable in
> > ixgbe_rxq_vec_setup function. This will then allow different
> > threads to initialize different RX queues at the same time,
> > without one overwriting the other's data.
> >
> > Signed-off-by: Bruce Richardson 
> > ---
> >  lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c b/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
> > index a0d3d78..e813e43 100644
> > --- a/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
> > +++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
> > @@ -730,7 +730,7 @@ static struct ixgbe_txq_ops vec_txq_ops = {
> >  int
> >  ixgbe_rxq_vec_setup(struct igb_rx_queue *rxq)
> >  {
> > -   static struct rte_mbuf mb_def = {
> > +   struct rte_mbuf mb_def = {
> > .nb_segs = 1,
> > .data_off = RTE_PKTMBUF_HEADROOM,
> >  #ifdef RTE_MBUF_REFCNT
> > --
> > 1.9.3
> >


[dpdk-dev] [PATCH 2/3] pmd: RX function need keep EXTERNAL_MBUF flag

2014-10-24 Thread Ananyev, Konstantin
Hi Changchun,

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Ouyang Changchun
> Sent: Friday, October 24, 2014 9:10 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH 2/3] pmd: RX function need keep EXTERNAL_MBUF flag
> 
> Every pmd RX function need keep the EXTERNAL_MBUF flag
> in mbuf.ol_flags, and can't overwrite it when filling ol_flags from
> descriptor to mbuf, otherwise, it probably cause to crash when freeing a mbuf
> and trying to freeing its attached external buffer, say, from guest space.
> 

I don't really like the idea of putting:
mb->ol_flags = pkt_flags | (mb->ol_flags & EXTERNAL_MBUF);
in each and every PMD from now on...
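
For reference, the masking pattern in question can be sketched as a stand-alone helper. The names `SKETCH_EXTERNAL_MBUF` and `sketch_rx_fill_flags` are hypothetical and only illustrate the bit arithmetic; they are not part of any PMD.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the flag from patch 1/3 (same bit position). */
#define SKETCH_EXTERNAL_MBUF (1ULL << 62)

/*
 * Build the new ol_flags value from the descriptor-derived pkt_flags
 * while carrying over only the external-buffer bit from the old value.
 * Every other bit of old_flags is discarded.
 */
static uint64_t
sketch_rx_fill_flags(uint64_t old_flags, uint64_t pkt_flags)
{
	return pkt_flags | (old_flags & SKETCH_EXTERNAL_MBUF);
}
```

Each RX path would need this extra read-modify-write on ol_flags, which is the per-driver burden being objected to.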

On the other hand, it is probably not very good that RX functions update the whole
ol_flags, not only the RX-related part.
I wonder whether we can reserve the low 32 bits of ol_flags for RX, and the high 32 bits for TX and
generic stuff.
So our ol_flags would look something like this:

union {
    uint64_t ol_raw_flags;
    struct {
        uint32_t rx;
        uint32_t gen_tx;
    } ol_flags;
};

And make all PMD RX functions operate on the rx part of the flags only:
mb->ol_flags.rx = pkt_flags;
?
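
A minimal, compilable sketch of that split, assuming a hypothetical `sketch_mbuf` wrapper and a hypothetical `SKETCH_GEN_EXTERNAL` bit in the generic half (neither exists in DPDK). Note that which half of `ol_raw_flags` the `rx` member overlays depends on byte order; it is the low 32 bits only on little-endian targets.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical generic/TX-side flag, chosen for illustration. */
#define SKETCH_GEN_EXTERNAL (1u << 30)

/* Hypothetical mbuf carrying only the proposed flags layout. */
struct sketch_mbuf {
	union {
		uint64_t ol_raw_flags;
		struct {
			uint32_t rx;     /* written by PMD RX functions */
			uint32_t gen_tx; /* TX and generic flags, untouched by RX */
		} ol_flags;
	};
};

/* An RX path stores its flags with a plain write to the rx half only. */
static void
sketch_rx_set_flags(struct sketch_mbuf *m, uint32_t pkt_flags)
{
	m->ol_flags.rx = pkt_flags;
}
```

With this layout the RX write needs no masking, and anything kept in `gen_tx` survives it.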

Konstantin

> Signed-off-by: Changchun Ouyang 
> ---
>  lib/librte_pmd_e1000/igb_rxtx.c   |  5 +++--
>  lib/librte_pmd_i40e/i40e_rxtx.c   |  8 +---
>  lib/librte_pmd_ixgbe/ixgbe_rxtx.c |  8 +---
>  lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c | 12 
>  4 files changed, 21 insertions(+), 12 deletions(-)
> 
> diff --git a/lib/librte_pmd_e1000/igb_rxtx.c b/lib/librte_pmd_e1000/igb_rxtx.c
> index f09c525..4123310 100644
> --- a/lib/librte_pmd_e1000/igb_rxtx.c
> +++ b/lib/librte_pmd_e1000/igb_rxtx.c
> @@ -786,7 +786,7 @@ eth_igb_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
>   pkt_flags = rx_desc_hlen_type_rss_to_pkt_flags(hlen_type_rss);
>   pkt_flags = pkt_flags | rx_desc_status_to_pkt_flags(staterr);
>   pkt_flags = pkt_flags | rx_desc_error_to_pkt_flags(staterr);
> - rxm->ol_flags = pkt_flags;
> + rxm->ol_flags = pkt_flags | (rxm->ol_flags & EXTERNAL_MBUF);
> 
>   /*
>* Store the mbuf address into the next entry of the array
> @@ -1020,7 +1020,8 @@ eth_igb_recv_scattered_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
>   pkt_flags = rx_desc_hlen_type_rss_to_pkt_flags(hlen_type_rss);
>   pkt_flags = pkt_flags | rx_desc_status_to_pkt_flags(staterr);
>   pkt_flags = pkt_flags | rx_desc_error_to_pkt_flags(staterr);
> - first_seg->ol_flags = pkt_flags;
> + first_seg->ol_flags = pkt_flags |
> + (first_seg->ol_flags & EXTERNAL_MBUF);
> 
>   /* Prefetch data of first segment, if configured to do so. */
>   rte_packet_prefetch((char *)first_seg->buf_addr +
> diff --git a/lib/librte_pmd_i40e/i40e_rxtx.c b/lib/librte_pmd_i40e/i40e_rxtx.c
> index 2b53677..68c3695 100644
> --- a/lib/librte_pmd_i40e/i40e_rxtx.c
> +++ b/lib/librte_pmd_i40e/i40e_rxtx.c
> @@ -637,7 +637,8 @@ i40e_rx_scan_hw_ring(struct i40e_rx_queue *rxq)
>   pkt_flags = i40e_rxd_status_to_pkt_flags(qword1);
>   pkt_flags |= i40e_rxd_error_to_pkt_flags(qword1);
>   pkt_flags |= i40e_rxd_ptype_to_pkt_flags(qword1);
> - mb->ol_flags = pkt_flags;
> + mb->ol_flags = pkt_flags |
> + (mb->ol_flags & EXTERNAL_MBUF);
>   if (pkt_flags & PKT_RX_RSS_HASH)
>   mb->hash.rss = rte_le_to_cpu_32(\
>   rxdp->wb.qword0.hi_dword.rss);
> @@ -873,7 +874,7 @@ i40e_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
>   pkt_flags = i40e_rxd_status_to_pkt_flags(qword1);
>   pkt_flags |= i40e_rxd_error_to_pkt_flags(qword1);
>   pkt_flags |= i40e_rxd_ptype_to_pkt_flags(qword1);
> - rxm->ol_flags = pkt_flags;
> + rxm->ol_flags = pkt_flags | (rxm->ol_flags & EXTERNAL_MBUF);
>   if (pkt_flags & PKT_RX_RSS_HASH)
>   rxm->hash.rss =
>   rte_le_to_cpu_32(rxd.wb.qword0.hi_dword.rss);
> @@ -1027,7 +1028,8 @@ i40e_recv_scattered_pkts(void *rx_queue,
>   pkt_flags = i40e_rxd_status_to_pkt_flags(qword1);
>   pkt_flags |= i40e_rxd_error_to_pkt_flags(qword1);
>   pkt_flags |= i40e_rxd_ptype_to_pkt_flags(qword1);
> - first_seg->ol_flags = pkt_flags;
> + first_seg->ol_flags = pkt_flags |
> + (first_seg->ol_flags & EXTERNAL_MBUF);
>   if (pkt_flags & PKT_RX_RSS_HASH)
>   rxm->hash.rss =
>   rte_le_to_cpu_32(rxd.wb.qword0.hi_dword.rss);
> diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c 
> b/lib/librte_pmd_ixgbe/ixgbe_rxtx

[dpdk-dev] [PATCH 0/3] Vhost app removes dependency of REFCNT

2014-10-24 Thread Bruce Richardson
On Fri, Oct 24, 2014 at 11:47:46AM +0200, Thomas Monjalon wrote:
> 2014-10-24 16:10, Ouyang Changchun:
> > To remove the dependency of RTE_MBUF_REFCNT for vhost zero copy,
> > the mbuf need introduce EXTERNAL_MBUF(in ol_flags) to indicate it
> > attaches to an external buffer, say, from guest space. And don't
> > free the external buffer when freeing the mbuf itself in host, in
> > addition, RX function in PMD need make sure not overwrite this flag
> > when filling ol_flags from descriptors to mbuf.
> 
> So you are replacing refcnt by something else which requires special
> handling in drivers.
> I feel this is not the right design.
> Why do you want to remove refcnt dependency?
>
Ignoring the implementation of the patchset for now - as I haven't reviewed 
it in depth yet, I think the removal of the dependency on REFCNT in this 
vhost code is a good thing.  This is the only place in DPDK which depends on 
the REFCNT being *disabled*.  We have lots of things which rely on 
having a reference count enabled in the mbuf, and lots and lots of #ifdefs 
in the code to work around the possibility of it being disabled. If we can 
remove the need for the reference count to be disabled here we can look to 
do some major cleanup, by removing completely the option to disable the 
reference counting.

Regards,
/Bruce


[dpdk-dev] ethtool and igb/ixgbe (kni)

2014-10-24 Thread Kevin Wilson
Hi,

I am looking in the file hierarchy of dpdk, and I see that under
/dpdk-1.7.1/lib/librte_eal/linuxapp/kni/ethtool
we have:
igb  ixgbe  README

My question is: why are igb and ixgbe on this path, under ethtool?
Are they related to ethtool in any way?


The README does not explain it.

Regards,
Kevin


[dpdk-dev] [PATCH 1/2] ixgbe: remove static qualifier for thread safety

2014-10-24 Thread Masaru Oki
Oh, sorry, you are right.  I had missed the first * for the copy.
Thank you.

2014-10-24 19:34 GMT+09:00 Bruce Richardson :
> On Thu, Oct 23, 2014 at 08:43:39AM +0900, Masaru Oki wrote:
>> Hi,
>>
>> in this code, pointer of local variable (mb_def) is returned by your changes.
>> mb_def should be static for each thread.
>
> Actually, no. A copy is made of 8 bytes of the mb_def variable and stored as
> an mbuf initializer inside the rxq structure. No use of the memory occupied
> by mb_def is made outside of the function, so the value does not need to be
> static.
>
> /Bruce
>>
>> 2014-10-22 19:55 GMT+09:00 Bruce Richardson :
>> > Remove the "static" prefix to the template mbuf variable in
>> > ixgbe_rxq_vec_setup function. This will then allow different
>> > threads to initialize different RX queues at the same time,
>> > without one overwriting the other's data.
>> >
>> > Signed-off-by: Bruce Richardson 
>> > ---
>> >  lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c | 2 +-
>> >  1 file changed, 1 insertion(+), 1 deletion(-)
>> >
>> > diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c b/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
>> > index a0d3d78..e813e43 100644
>> > --- a/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
>> > +++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
>> > @@ -730,7 +730,7 @@ static struct ixgbe_txq_ops vec_txq_ops = {
>> >  int
>> >  ixgbe_rxq_vec_setup(struct igb_rx_queue *rxq)
>> >  {
>> > -   static struct rte_mbuf mb_def = {
>> > +   struct rte_mbuf mb_def = {
>> > .nb_segs = 1,
>> > .data_off = RTE_PKTMBUF_HEADROOM,
>> >  #ifdef RTE_MBUF_REFCNT
>> > --
>> > 1.9.3
>> >


[dpdk-dev] [PATCH v5 3/3] ethdev: fix wrong error return refere to API definition

2014-10-24 Thread Ananyev, Konstantin


> -Original Message-
> From: y at ecsmtp.sh.intel.com [mailto:y at ecsmtp.sh.intel.com]
> Sent: Friday, October 24, 2014 6:55 AM
> To: dev at dpdk.org
> Cc: nhorman at tuxdriver.com; Richardson, Bruce; Ananyev, Konstantin; De Lara 
> Guarch, Pablo; Liang, Cunming
> Subject: [PATCH v5 3/3] ethdev: fix wrong error return refere to API 
> definition
> 
> From: Cunming Liang 
> 
> Per definition, rte_eth_rx_burst/rte_eth_tx_burst/rte_eth_rx_queue_count 
> returns the packet number.
> When RTE_LIBRTE_ETHDEV_DEBUG turns on, retval of FUNC_PTR_OR_ERR_RTE was set 
> to -ENOTSUP.
> It makes confusing.
> The patch always return 0 no matter no packet or there's error.
> Meanwhile set errno in such kind of checking.
> 
> Signed-off-by: Cunming Liang 
> ---
>  lib/librte_ether/rte_ethdev.c |   10 +++---
>  1 files changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
> index 50f10d9..6675f28 100644
> --- a/lib/librte_ether/rte_ethdev.c
> +++ b/lib/librte_ether/rte_ethdev.c
> @@ -81,12 +81,14 @@
>  /* Macros for checking for restricting functions to primary instance only */
>  #define PROC_PRIMARY_OR_ERR_RET(retval) do { \
>   if (rte_eal_process_type() != RTE_PROC_PRIMARY) { \
> + rte_errno = -E_RTE_SECONDARY;   \
>   PMD_DEBUG_TRACE("Cannot run in secondary processes\n"); \
>   return (retval); \
>   } \
>  } while(0)
>  #define PROC_PRIMARY_OR_RET() do { \
>   if (rte_eal_process_type() != RTE_PROC_PRIMARY) { \
> + rte_errno = -E_RTE_SECONDARY;   \
>   PMD_DEBUG_TRACE("Cannot run in secondary processes\n"); \
>   return; \
>   } \
> @@ -95,12 +97,14 @@
>  /* Macros to check for invlaid function pointers in dev_ops structure */
>  #define FUNC_PTR_OR_ERR_RET(func, retval) do { \
>   if ((func) == NULL) { \
> + rte_errno = -ENOTSUP; \
>   PMD_DEBUG_TRACE("Function not supported\n"); \
>   return (retval); \
>   } \
>  } while(0)
>  #define FUNC_PTR_OR_RET(func) do { \
>   if ((func) == NULL) { \
> + rte_errno = -ENOTSUP; \
>   PMD_DEBUG_TRACE("Function not supported\n"); \
>   return; \
>   } \
> @@ -2530,7 +2534,7 @@ rte_eth_rx_burst(uint8_t port_id, uint16_t queue_id,
>   return 0;
>   }
>   dev = &rte_eth_devices[port_id];
> - FUNC_PTR_OR_ERR_RET(*dev->rx_pkt_burst, -ENOTSUP);
> + FUNC_PTR_OR_ERR_RET(*dev->rx_pkt_burst, 0);
>   if (queue_id >= dev->data->nb_rx_queues) {
>   PMD_DEBUG_TRACE("Invalid RX queue_id=%d\n", queue_id);
>   return 0;
> @@ -2551,7 +2555,7 @@ rte_eth_tx_burst(uint8_t port_id, uint16_t queue_id,
>   }
>   dev = &rte_eth_devices[port_id];
> 
> - FUNC_PTR_OR_ERR_RET(*dev->tx_pkt_burst, -ENOTSUP);
> + FUNC_PTR_OR_ERR_RET(*dev->tx_pkt_burst, 0);
>   if (queue_id >= dev->data->nb_tx_queues) {
>   PMD_DEBUG_TRACE("Invalid TX queue_id=%d\n", queue_id);
>   return 0;
> @@ -2570,7 +2574,7 @@ rte_eth_rx_queue_count(uint8_t port_id, uint16_t queue_id)
>   return 0;
>   }
>   dev = &rte_eth_devices[port_id];
> - FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_count, -ENOTSUP);
> + FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_count, 0);
>   return (*dev->dev_ops->rx_queue_count)(dev, queue_id);
>  }

There are a few things that worry me with that approach:

1. Different behaviour of rte_eth_rx_burst/rte_eth_tx_burst with
RTE_LIBRTE_ETHDEV_DEBUG switched on/off.
So an application might need to differentiate its code depending on the
RTE_LIBRTE_ETHDEV_DEBUG value.

2. Even when RTE_LIBRTE_ETHDEV_DEBUG is on, the behaviour of rte_eth_rx_burst/
rte_eth_tx_burst will be inconsistent:
it sets rte_errno if dev->rx_pkt_burst == NULL, but doesn't do the same for
other error conditions,
such as when port_id or queue_id is invalid.

3. By modifying FUNC_PTR_OR_ERR_RET() to set rte_errno, we make the behaviour of other
rte_ethdev functions inconsistent too:
now for some error conditions they do set rte_errno, for others they don't.

So if it were me, I would just:
- leave FUNC_PTR_OR_*_RET unmodified.
- change rte_eth_rx_burst/tx_burst for RTE_LIBRTE_ETHDEV_DEBUG to something like:

- FUNC_PTR_OR_ERR_RET(*dev->rx_pkt_burst, -ENOTSUP);
+ FUNC_PTR_OR_ERR_RET(*dev->rx_pkt_burst, 0);

I think that just error logging is enough here.
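
The suggested behaviour can be sketched outside of DPDK as follows. This is only an illustration under assumptions: `sketch_dev`, `sketch_rx_burst`, `sketch_full_burst` and the `SKETCH_` macro are hypothetical stand-ins, and logging goes to stderr rather than through PMD_DEBUG_TRACE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical burst callback type, loosely modelled on a PMD RX function. */
typedef uint16_t (*sketch_rx_burst_t)(void *rxq, void **pkts, uint16_t nb_pkts);

/* Hypothetical device holding just the callback and its queue. */
struct sketch_dev {
	sketch_rx_burst_t rx_pkt_burst;
	void *rxq;
};

/* Log and return retval when the callback is missing; no errno is set. */
#define SKETCH_FUNC_PTR_OR_ERR_RET(func, retval) do { \
	if ((func) == NULL) { \
		fprintf(stderr, "Function not supported\n"); \
		return (retval); \
	} \
} while (0)

/* Always returns a packet count: 0 on a missing callback, never a negative code. */
static uint16_t
sketch_rx_burst(struct sketch_dev *dev, void **pkts, uint16_t nb_pkts)
{
	SKETCH_FUNC_PTR_OR_ERR_RET(dev->rx_pkt_burst, 0);
	return dev->rx_pkt_burst(dev->rxq, pkts, nb_pkts);
}

/* Toy driver callback that pretends every requested slot was filled. */
static uint16_t
sketch_full_burst(void *rxq, void **pkts, uint16_t nb_pkts)
{
	(void)rxq;
	(void)pkts;
	return nb_pkts;
}
```

The return type is unsigned, so returning -ENOTSUP here would be read by the caller as a very large packet count, which is why 0 is the safer value.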

Konstantin

> 
> --
> 1.7.4.1



[dpdk-dev] Cannot mmap device resource in DPDK 1.7.0 multi-process/multi-thread

2014-10-24 Thread Mario Gianni
Hi all, I have had a problem since I updated to version 1.7.0.
I have a multi-process, multi-threaded application.
In my application I first launch a master process, then I launch a secondary
process with multiple threads in it.
When the number of lcores reserved for the secondary process exceeds a
certain number (e.g. 4), I get an error in rte_eal_init() in the secondary
process when it tries to map PCI memory:

EAL: pci_map_resource(): cannot mmap(12, 0x72e96000, 0x80, 0x1000): Success (0x7559b000)
EAL: Cannot mmap device resource
EAL: Error - exiting with code: 1
Cause: Requested device :01:00.0 cannot be used

Can you help me?


[dpdk-dev] Cannot mmap device resource in DPDK 1.7.0 multi-process/multi-thread

2014-10-24 Thread Bruce Richardson
On Fri, Oct 24, 2014 at 01:21:08PM +0200, Mario Gianni wrote:
> Hi all, I have a problem since I updated to 1.7.0 version,
> I got a multi-process, multi-threaded application,
> In my application first I launch a master process, then I launch a secondary 
> process with multiple threads in it
> Well, when the number of lcores reserved for the secondary process exceeds a 
> certain number (eg. 4) i got an error in rte_eal_init() on the secondary 
> process when it tries to map PCI memory:
>  
> EAL: pci_map_resource(): cannot mmap(12, 0x72e96000, 0x80, 0x1000): 
> Success (0x7559b000)
> EAL: Cannot mmap device resource
> EAL: Error - exiting with code: 1
> Cause: Requested device :01:00.0 cannot be used
> 
> Can you help me?

This could be because the additional memory/stack space used by the pthreads 
for the cores in the secondary process is overlapping the space used in the 
primary process for hugepage or device memory. You could perhaps try adding 
a few cores to the primary process's coremask (and not using those cores) 
and see if it helps things. 
Alternatively, there is a --base-virtaddr parameter that can be passed to the 
primary process to adjust the starting address at which it maps 
memory. Look at where it starts mapping memory right now, then 
try hinting it to map the pages at a slightly higher or lower address, 
and see if that helps.

/Bruce
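
As a rough aid for the "look at where it maps memory" step, the sketch below checks whether a candidate address range collides with anything already mapped in a process. It assumes Linux and the /proc/<pid>/maps text format, where each line starts with "<start>-<end>" in hex; range_is_free() is a hypothetical helper, not part of DPDK.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Return 1 if [start, start + len) overlaps no mapping listed in `maps`,
 * 0 if it overlaps one. `maps` is text in /proc/<pid>/maps format, where
 * each line begins with "<start>-<end>" in hexadecimal.
 */
static int
range_is_free(const char *maps, uint64_t start, uint64_t len)
{
	uint64_t end = start + len;
	const char *line = maps;

	while (line != NULL && *line != '\0') {
		uint64_t lo, hi;

		if (sscanf(line, "%" SCNx64 "-%" SCNx64, &lo, &hi) == 2 &&
		    start < hi && lo < end)
			return 0; /* overlap with an existing mapping */

		/* advance to the start of the next line, if any */
		line = strchr(line, '\n');
		if (line != NULL)
			line++;
	}
	return 1;
}
```

To use it, read /proc/self/maps into a buffer in the secondary process before rte_eal_init() and test the range the primary reported. It only tells you about the moment of the check, so treat the answer as a hint.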


[dpdk-dev] [PATCH 2/3] pmd: RX function need keep EXTERNAL_MBUF flag

2014-10-24 Thread Bruce Richardson
On Fri, Oct 24, 2014 at 10:46:06AM +, Ananyev, Konstantin wrote:
> Hi Changchun,
> 
> > -----Original Message-----
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Ouyang Changchun
> > Sent: Friday, October 24, 2014 9:10 AM
> > To: dev at dpdk.org
> > Subject: [dpdk-dev] [PATCH 2/3] pmd: RX function need keep EXTERNAL_MBUF 
> > flag
> > 
> > Every pmd RX function need keep the EXTERNAL_MBUF flag
> > in mbuf.ol_flags, and can't overwrite it when filling ol_flags from
> > descriptor to mbuf, otherwise, it probably cause to crash when freeing a 
> > mbuf
> > and trying to freeing its attached external buffer, say, from guest space.
> > 
> 
> Don't really like the idea to put:
> mb->ol_flags = pkt_flags | (mb->ol_flags & EXTERNAL_MBUF); 
> in each and every PMD from now on...
> 
> From other side, it is probably not very good that RX functions update whole 
> ol_flags, not only RX related part.
> Wonder can we reserve low 32bits of ol_flags for RX, and high 32bits for TX 
> and generic stuff.
> So our ol_flags will look something like that:
> 
> union {
>   uint64_t ol_raw_flags;
>   struct {
>   uint32_t rx;
>   uint32_t gen_tx;
>   } ol_flags
> };
> 
> And make all PMD RX functions to operate on rx part of the flags only:
> mb->ol_flags.rx = pkt_flags;
> ?
> 
> Konstantin
>
I would tend to agree with this. Changchun, did you get to assess the 
performance impact of making this change to the PMDs? I suspect that making 
the changes to each PMD would impact performance, while Konstantin's 
suggestion should eliminate that impact.
The downside there is that we are limiting the flexibility we have in 
expanding beyond 32 RX flags and 24 TX flags. :-(

/Bruce
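
To make the split-flags idea quoted above concrete, here is a small standalone sketch. The union and the TOY_EXTERNAL_MBUF bit are hypothetical and are not the real rte_mbuf layout. Which half ends up in the low 32 bits of the raw value depends on byte order; with rx declared first it is the low half on little-endian machines.

```c
#include <stdint.h>

/* Hypothetical layout following the proposal; not the actual rte_mbuf definition. */
union toy_ol_flags {
	uint64_t raw;
	struct {
		uint32_t rx;     /* written by PMD RX functions only */
		uint32_t gen_tx; /* TX and generic flags, e.g. an external-buffer bit */
	} part;
};

#define TOY_EXTERNAL_MBUF (1u << 0) /* assumed bit position within gen_tx */

/* RX path: overwrite only the RX half, leaving gen_tx untouched. */
static void
toy_rx_set_flags(union toy_ol_flags *f, uint32_t pkt_flags)
{
	f->part.rx = pkt_flags;
}
```

The RX store stays a plain 32-bit write with no read-modify-write, which is why it should avoid the cost of masking ol_flags in every PMD.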



[dpdk-dev] Cannot mmap device resource in DPDK 1.7.0 multi-process/multi-thread

2014-10-24 Thread Mario Gianni
Hi Bruce,
thank you for your answer. Adding cores to the primary mask didn't help;
what helped instead was manually passing the --base-virtaddr parameter, set to
the first Virtual Area value that EAL finds when it starts the primary
process.

Honestly I don't understand why it works this way. In the experimental phase
this could serve as a workaround, but in the final program I have to automate this
process. Do you have any suggestions?
For example, is there a way to find the virtual area before starting the primary
process?

Mario


Sent: Friday, October 24, 2014 at 2:08 PM
From: "Bruce Richardson"
To: "Mario Gianni"
Cc: dev at dpdk.org
Subject: Re: [dpdk-dev] Cannot mmap device resource in DPDK 1.7.0
multi-process/multi-thread
On Fri, Oct 24, 2014 at 01:21:08PM +0200, Mario Gianni wrote:
> Hi all, I have a problem since I updated to 1.7.0 version,
> I got a multi-process, multi-threaded application,
> In my application first I launch a master process, then I launch a secondary 
> process with multiple threads in it
> Well, when the number of lcores reserved for the secondary process exceeds a 
> certain number (eg. 4) i got an error in rte_eal_init() on the secondary 
> process when it tries to map PCI memory:
>
> EAL: pci_map_resource(): cannot mmap(12, 0x72e96000, 0x80, 0x1000): 
> Success (0x7559b000)
> EAL: Cannot mmap device resource
> EAL: Error - exiting with code: 1
> Cause: Requested device :01:00.0 cannot be used
>
> Can you help me?

This could be because the additional memory/stack space used by the pthreads
for the cores in the secondary process is overlapping the space used in the
primary process for hugepage or device memory. You could perhaps try adding
a few cores to the primary process's coremask (and not using those cores)
and see if it helps things.
Alternatively there is a base-virtaddr parameter that can be passed to the
primary process to try and adjust the starting address for it mapping
memory. If you look at where it starts mapping memory right now, and then
try hinting to it to maps the pages at a slightly higher or lower address
and see if it helps.

/Bruce


[dpdk-dev] Possible bug in eal_pci pci_scan_one

2014-10-24 Thread Stephen Hemminger
On Mon, 6 Oct 2014 02:13:44 -0700
Matthew Hall  wrote:

> Hi Guys,
> 
> I'm doing my development on kind of a cheap machine with no NUMA support... 
> but several years ago I used DPDK to build a NUMA box that could do 40 gbits 
> bidirectional L4-L7 stateful traffic replay.
> 
> So given the past experiences I had before, I wanted to clean the code up so 
> it'd work well if some crazy guy tried my code on one of these huge boxes, 
> too, but then I ran into some weird issues.
> 
> 1) When I call rte_eth_dev_socket_id() I get back -1. But the call can return 
> -1 if the port_id is bogus or if pci_scan_one didn't get a numa_node (because 
> you're on a non-NUMA box for example).
> 
> int rte_eth_dev_socket_id(uint8_t port_id)
> {
> if (port_id >= nb_ports)
> return -1;
> return rte_eth_devices[port_id].pci_dev->numa_node;
> }
> 
> So you couldn't tell the difference between non-NUMA or a bad port value, etc.
> 
> 2) The code's behavior and comments disagree with one another. In the 
> pci_scan_one function, there's this code:
> 
> /* get numa node */
> snprintf(filename, sizeof(filename), "%s/numa_node",
>  dirname);
> if (access(filename, R_OK) != 0) {
> /* if no NUMA support just set node to 0 */
> dev->numa_node = -1;
> } else {
> if (eal_parse_sysfs_value(filename, &tmp) < 0) {
> free(dev);
> return -1;
> }
> dev->numa_node = tmp;
> }
> 
> It says, just use NUMA node 0 if there is no NUMA support. But then proceeds 
> to set the value to -1 in disagreement with the comment, and also stomping on 
> the other meaning for -1 in the higher function rte_eth_dev_socket_id.
> 
> 3) In conclusion, it seems like some stuff is missing... first there needs to 
> be a function that will tell you the number of NUMA nodes present on the box 
> so you can create the right number of mbuf_pools, but I couldn't find that 
> function.
> 
> Then if you have the function, you can do some magic and shuffle the NICs 
> around to get them hooked to a core on the same NUMA, and the mbuf_pool on 
> the same NUMA.
> 
> When NUMA is not present, can we return 0 instead of -1, or return a specific 
> error code that the client can use to know he should just use Socket 0? Right 
> now I can't tell apart any potential errors or weird values from correct 
> values.
> 
> 4) I'm willing to help make and test some patches... but first I want to 
> understand what is happening with these funny functions before doing things 
> blindly.
> 
> Thanks,
> Matthew.

The code is fairly consistent in returning -1 both when there is no NUMA socket
and when the port value is bogus. It is interpreted as SOCKET_ID_ANY in several places.
The examples mostly check for -1 and use socket 0 as a fallback.
Probably not worth introducing more return values and breaking existing
applications.
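
The fallback pattern mentioned above can be sketched as follows. This is a standalone illustration: TOY_SOCKET_ID_ANY is defined locally with the -1 value discussed in this thread, and toy_socket_or_default() is a hypothetical helper name.

```c
/* Local copy of the "any socket" value (-1) so the sketch stays standalone. */
#define TOY_SOCKET_ID_ANY (-1)

/*
 * Map the value returned by a socket-id query to a usable socket index:
 * -1 (no NUMA information, or a bad port) falls back to socket 0.
 */
static unsigned int
toy_socket_or_default(int socket_id)
{
	if (socket_id <= TOY_SOCKET_ID_ANY)
		return 0;
	return (unsigned int)socket_id;
}
```

An application would pass the result of rte_eth_dev_socket_id() through such a helper before choosing which per-socket mbuf pool to use. It still cannot tell a bad port from a non-NUMA box, so port validity has to be checked separately.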


[dpdk-dev] Cannot mmap device resource in DPDK 1.7.0 multi-process/multi-thread

2014-10-24 Thread Bruce Richardson
On Fri, Oct 24, 2014 at 03:04:26PM +0200, Mario Gianni wrote:
> Hi Bruce, 
> thank you for your answer, adding cores to the primary mask didn't help, 
> instead it helped manually passing the --base-virtaddr parameter, setting it 
> to the first value of Virtual Area that EAL finds when it starts the primary 
> process.
>
> Honestly I don't understand why it works in this way, in the experimental 
> phase this could be a patch, but in the final program I have to automate this 
> process, do you have any suggestions?
> For example is there a way to find the virtual area before starting the 
> primary process?
>
> Mario

In multi-process, there is a requirement that we can map the hugepage memory 
and the NIC BARs to the same virtual addresses in both processes. Mostly 
this works ok, but occasionally it needs help due to the memory regions 
being chosen in the primary process being used by something else 
pre-eal_init in the secondary process. Anything from additional threads, to 
having an additional shared library linked in can affect the amount of 
memory used by the secondary process and therefore affect the chances that 
we won't be able to get an exact mapping. As far as I know there is no way 
to pre-compute how much memory a given process will use, or what memory 
regions will be free in it, by the time rte_eal_init() is called.

If you just need multiple processes, which don't need to be individually 
spawned, then perhaps consider using fork() to spawn the processes, since 
that will guarantee you identical mappings without issues. The downside, 
obviously, is that all processes then have to use the same binary, 
something not required for DPDK multi-process support.

/Bruce
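
A small standalone demonstration of the fork() point, assuming a POSIX/Linux system; it does not use DPDK, and toy_fork_shares_mapping() is a hypothetical name. A shared anonymous page stands in for hugepage memory: the child inherits it at the same virtual address, so a pointer stored before the fork stays valid in both processes.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS under strict -std= modes */
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Map a shared anonymous page, fork, and let the child write through the very
 * same pointer value. Returns 0 if the parent then sees the child's write
 * through its own mapping, -1 on any failure.
 */
static int
toy_fork_shares_mapping(void)
{
	const char msg[] = "hello";
	int status;
	pid_t pid;
	char *page = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
			  MAP_SHARED | MAP_ANONYMOUS, -1, 0);

	if (page == MAP_FAILED)
		return -1;

	pid = fork();
	if (pid < 0) {
		munmap(page, 4096);
		return -1;
	}
	if (pid == 0) {
		/* Child: the mapping is inherited at the same virtual address. */
		memcpy(page, msg, sizeof(msg));
		_exit(0);
	}

	if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status) ||
	    WEXITSTATUS(status) != 0) {
		munmap(page, 4096);
		return -1;
	}
	status = (memcmp(page, msg, sizeof(msg)) == 0) ? 0 : -1;
	munmap(page, 4096);
	return status;
}
```

Separately started processes get no such guarantee, which is why the secondary has to succeed in re-creating each mapping at the primary's addresses.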

>
> 
> Sent: Friday, October 24, 2014 at 2:08 PM
> From: "Bruce Richardson"
> To: "Mario Gianni"
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] Cannot mmap device resource in DPDK 1.7.0 
> multi-process/multi-thread
> On Fri, Oct 24, 2014 at 01:21:08PM +0200, Mario Gianni wrote:
> > Hi all, I have a problem since I updated to 1.7.0 version,
> > I got a multi-process, multi-threaded application,
> > In my application first I launch a master process, then I launch a 
> > secondary process with multiple threads in it
> > Well, when the number of lcores reserved for the secondary process exceeds 
> > a certain number (eg. 4) i got an error in rte_eal_init() on the secondary 
> > process when it tries to map PCI memory:
> >
> > EAL: pci_map_resource(): cannot mmap(12, 0x72e96000, 0x80, 0x1000): 
> > Success (0x7559b000)
> > EAL: Cannot mmap device resource
> > EAL: Error - exiting with code: 1
> > Cause: Requested device :01:00.0 cannot be used
> >
> > Can you help me?
> 
> This could be because the additional memory/stack space used by the pthreads
> for the cores in the secondary process is overlapping the space used in the
> primary process for hugepage or device memory. You could perhaps try adding
> a few cores to the primary process's coremask (and not using those cores)
> and see if it helps things.
> Alternatively there is a base-virtaddr parameter that can be passed to the
> primary process to try and adjust the starting address for it mapping
> memory. If you look at where it starts mapping memory right now, and then
> try hinting to it to maps the pages at a slightly higher or lower address
> and see if it helps.
> 
> /Bruce


[dpdk-dev] Cannot mmap device resource in DPDK 1.7.0 multi-process/multi-thread

2014-10-24 Thread Mario Gianni
So you are telling me that to implement multi-process I would be better off
using the l2fwd_fork example instead of client_server_mp.
In fact, if I use client_server_mp with a lot of mp_client threads, it gives
me the error.
If I use the l2fwd_fork example instead, it doesn't.

One more question at this point:
assuming that I use l2fwd_fork, when I launch the secondary process, how do I
assign the lcore coremask associated with that process?

Mario


Sent: Friday, October 24, 2014 at 3:39 PM
From: "Bruce Richardson"
To: "Mario Gianni"
Cc: dev at dpdk.org
Subject: Re: [dpdk-dev] Cannot mmap device resource in DPDK 1.7.0
multi-process/multi-thread
On Fri, Oct 24, 2014 at 03:04:26PM +0200, Mario Gianni wrote:
> Hi Bruce,
> thank you for your answer, adding cores to the primary mask didn't help, 
> instead it helped manually passing the --base-virtaddr parameter, setting it 
> to the first value of Virtual Area that EAL finds when it starts the primary 
> process.
>
> Honestly I don't understand why it works in this way, in the experimental 
> phase this could be a patch, but in the final program I have to automate this 
> process, do you have any suggestions?
> For example is there a way to find the virtual area before starting the 
> primary process?
>
> Mario

In multi-process, there is a requirement that we can map the hugepage memory
and the NIC BARs to the same virtual addresses in both processes. Mostly
this works ok, but occasionally it needs help due to the memory regions
being chosen in the primary process being used by something else
pre-eal_init in the secondary process. Anything from additional threads, to
having an additional shared library linked in can affect the amount of
memory used by the secondary process and therefore affect the chances that
we won't be able to get an exact mapping. As far as I know there is no way
to pre-compute how much memory a given process will use, or what memory
regions will be free in it, by the time rte_eal_init() is called.

If you just need multiple processes, which don't need to be individually
spawned, then perhaps consider using fork() to spawn the processes, since
that will guarantee you identical mappings without issues. The downside
obviously is that you need to have all processes use the same binary,
something not required for DPDK multi-process support.

/Bruce

>
>
> Sent: Friday, October 24, 2014 at 2:08 PM
> From: "Bruce Richardson"
> To: "Mario Gianni"
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] Cannot mmap device resource in DPDK 1.7.0 
> multi-process/multi-thread
> On Fri, Oct 24, 2014 at 01:21:08PM +0200, Mario Gianni wrote:
> > Hi all, I have a problem since I updated to 1.7.0 version,
> > I got a multi-process, multi-threaded application,
> > In my application first I launch a master process, then I launch a 
> > secondary process with multiple threads in it
> > Well, when the number of lcores reserved for the secondary process exceeds 
> > a certain number (eg. 4) i got an error in rte_eal_init() on the secondary 
> > process when it tries to map PCI memory:
> >
> > EAL: pci_map_resource(): cannot mmap(12, 0x72e96000, 0x80, 0x1000): 
> > Success (0x7559b000)
> > EAL: Cannot mmap device resource
> > EAL: Error - exiting with code: 1
> > Cause: Requested device :01:00.0 cannot be used
> >
> > Can you help me?
>
> This could be because the additional memory/stack space used by the pthreads
> for the cores in the secondary process is overlapping the space used in the
> primary process for hugepage or device memory. You could perhaps try adding
> a few cores to the primary process's coremask (and not using those cores)
> and see if it helps things.
> Alternatively there is a base-virtaddr parameter that can be passed to the
> primary process to try and adjust the starting address for it mapping
> memory. If you look at where it starts mapping memory right now, and then
> try hinting to it to maps the pages at a slightly higher or lower address
> and see if it helps.
>
> /Bruce


[dpdk-dev] DPDK Community Conference Call - Friday 31st October

2014-10-24 Thread Michael Marchetti


> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of O'driscoll, Tim
> Sent: Friday, October 24, 2014 5:22 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] DPDK Community Conference Call - Friday 31st October
> 
> We're planning to hold our first community conference call on Friday 31st
> October. It's impossible to find a time that suits everybody, so we've chosen
> to do this in the afternoon/evening in Europe, which is the morning in the
> USA. This does unfortunately limit participation from PRC, Japan and other
> parts of the world. Here's the time and date in a variety of time zones:
> 
> Dublin (Ireland)                     Friday, October 31, 2014 at 4:00:00 PM GMT UTC
> Paris (France)                       Friday, October 31, 2014 at 5:00:00 PM CET UTC+1 hour
> San Francisco (U.S.A. - California)  Friday, October 31, 2014 at 9:00:00 AM PDT UTC-7 hours
> New York (U.S.A. - New York)         Friday, October 31, 2014 at 12:00:00 Noon EDT UTC-4 hours
> Tel Aviv (Israel)                    Friday, October 31, 2014 at 6:00:00 PM IST UTC+2 hours
> Moscow (Russia)                      Friday, October 31, 2014 at 7:00:00 PM MSK UTC+3 hours
> 
> 
> Audio bridge details are:
> France:   +33 1588 77298
> Germany:  +49 8999 143191
> Israel:   +972 2589 6577
> Russia:   +7 495 641 4663
> UK:   +44 1793 402663
> USA:  +1 916 356 2663
> 
> Bridge: 5
> Conference ID: 1264677285
> 
> If anybody needs an access number for another country, let me know.

Can you provide a number for Canada?  thanks, Mike.


> 
> 
> Agenda:
> Discuss feature list for DPDK 2.0 (Q1 2015).
> Suggestions for topics for future calls.
> 
> 
> Thanks,
> Tim


[dpdk-dev] DPDK Community Conference Call - Friday 31st October

2014-10-24 Thread O'driscoll, Tim
> From: Michael Marchetti [mailto:mmarchetti at sandvine.com]
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of O'driscoll, Tim
> > Audio bridge details are:
> > France: +33 1588 77298
> > Germany:+49 8999 143191
> > Israel: +972 2589 6577
> > Russia: +7 495 641 4663
> > UK: +44 1793 402663
> > USA:+1 916 356 2663
> >
> > Bridge: 5
> > Conference ID: 1264677285
> >
> > If anybody needs an access number for another country, let me know.
> 
> Can you provide a number for Canada?  thanks, Mike.

No problem. The USA number above should work, or else you can use 
+1-888-875-9370.


Tim


[dpdk-dev] [PATCH 2/3] pmd: RX function need keep EXTERNAL_MBUF flag

2014-10-24 Thread Bruce Richardson
On Fri, Oct 24, 2014 at 01:34:58PM +0100, Bruce Richardson wrote:
> On Fri, Oct 24, 2014 at 10:46:06AM +, Ananyev, Konstantin wrote:
> > Hi Changchun,
> > 
> > > -----Original Message-----
> > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Ouyang Changchun
> > > Sent: Friday, October 24, 2014 9:10 AM
> > > To: dev at dpdk.org
> > > Subject: [dpdk-dev] [PATCH 2/3] pmd: RX function need keep EXTERNAL_MBUF 
> > > flag
> > > 
> > > Every pmd RX function need keep the EXTERNAL_MBUF flag
> > > in mbuf.ol_flags, and can't overwrite it when filling ol_flags from
> > > descriptor to mbuf, otherwise, it probably cause to crash when freeing a 
> > > mbuf
> > > and trying to freeing its attached external buffer, say, from guest space.
> > > 
> > 
> > Don't really like the idea to put:
> > mb->ol_flags = pkt_flags | (mb->ol_flags & EXTERNAL_MBUF); 
> > in each and every PMD from now on...
> > 
> > From other side, it is probably not very good that RX functions update 
> > whole ol_flags, not only RX related part.
> > Wonder can we reserve low 32bits of ol_flags for RX, and high 32bits for TX 
> > and generic stuff.
> > So our ol_flags will look something like that:
> > 
> > union {
> > uint64_t ol_raw_flags;
> > struct {
> > uint32_t rx;
> > uint32_t gen_tx;
> > } ol_flags
> > };
> > 
> > And make all PMD RX functions to operate on rx part of the flags only:
> > mb->ol_flags.rx = pkt_flags;
> > ?
> > 
> > Konstantin
> >
> I would tend to agree with this. Changchun, did you get to assess the 
> performance impact of making this change to the PMDs? I suspect that making 
> the changes to each PMD would impact performance, while Konstantin's 
> suggestion should eliminate that impact.
> The downside there is that we are limiting the flexibility we have in 
> expanding beyond 32 RX flags and 24 TX flags. :-(
> 
> /Bruce
> 

How about switching things around in terms of the flag? Instead of having to 
manage a flag across the board to indicate if an mbuf is pointing to 
external memory, I think we should use the flag to indicate that an mbuf is 
attached to the memory space of another mbuf.

My reasons for suggesting this are:
1. Mbufs pointing to externally managed memory are not really the problem to 
be dealt with on free, since they can be handled the same as mbufs with the 
data pointer pointing internally, it's mbufs attached to other mbufs which 
are - so that's what we need to track using a flag.
2. Setting the flag to indicate an indirect mbuf should have no impact on 
the driver, as an mbuf that has just been allocated from mempool cannot be 
an indirect one.
3. The only place we would need to worry about such a flag is in the attach, 
detach and free mbuf functions - and on free we would simply need to replace 
the existing check for "md != m" with a new check for the new flag. It would 
be a contained change.

Thoughts?
/Bruce
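
A toy model of the proposal above, to show how contained the change would be. Everything here is hypothetical: the struct is a cut-down stand-in for rte_mbuf, and TOY_IND_ATTACHED_MBUF is an assumed flag name and bit. The flag is touched only in attach and detach, and the free path tests it instead of comparing md != m.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_IND_ATTACHED_MBUF (1ULL << 62) /* assumed flag name and bit */

struct toy_mbuf {
	void *buf_addr;
	uint64_t ol_flags;
	uint16_t refcnt;
};

/* Attach `mi` to the data buffer of direct mbuf `md`. */
static void
toy_attach(struct toy_mbuf *mi, struct toy_mbuf *md)
{
	md->refcnt++;
	mi->buf_addr = md->buf_addr;
	mi->ol_flags |= TOY_IND_ATTACHED_MBUF;
}

/* Detach `mi` from `md`, pointing it back at its own buffer `own_buf`. */
static void
toy_detach(struct toy_mbuf *mi, struct toy_mbuf *md, void *own_buf)
{
	md->refcnt--;
	mi->buf_addr = own_buf;
	mi->ol_flags &= ~TOY_IND_ATTACHED_MBUF;
}

/* What the free path would test in place of the address comparison. */
static int
toy_is_indirect(const struct toy_mbuf *m)
{
	return (m->ol_flags & TOY_IND_ATTACHED_MBUF) != 0;
}
```

An mbuf fresh from the mempool never has the flag set, so a driver RX path that overwrites ol_flags on such an mbuf loses nothing.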


[dpdk-dev] [PATCH 2/3] pmd: RX function need keep EXTERNAL_MBUF flag

2014-10-24 Thread Ananyev, Konstantin


> -----Original Message-----
> From: Richardson, Bruce
> Sent: Friday, October 24, 2014 4:43 PM
> To: Ananyev, Konstantin
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 2/3] pmd: RX function need keep EXTERNAL_MBUF 
> flag
> 
> On Fri, Oct 24, 2014 at 01:34:58PM +0100, Bruce Richardson wrote:
> > On Fri, Oct 24, 2014 at 10:46:06AM +, Ananyev, Konstantin wrote:
> > > Hi Changchun,
> > >
> > > > -----Original Message-----
> > > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Ouyang Changchun
> > > > Sent: Friday, October 24, 2014 9:10 AM
> > > > To: dev at dpdk.org
> > > > Subject: [dpdk-dev] [PATCH 2/3] pmd: RX function need keep 
> > > > EXTERNAL_MBUF flag
> > > >
> > > > Every pmd RX function need keep the EXTERNAL_MBUF flag
> > > > in mbuf.ol_flags, and can't overwrite it when filling ol_flags from
> > > > descriptor to mbuf, otherwise, it probably cause to crash when freeing 
> > > > a mbuf
> > > > and trying to freeing its attached external buffer, say, from guest 
> > > > space.
> > > >
> > >
> > > Don't really like the idea to put:
> > > mb->ol_flags = pkt_flags | (mb->ol_flags & EXTERNAL_MBUF);
> > > in each and every PMD from now on...
> > >
> > > From other side, it is probably not very good that RX functions update 
> > > whole ol_flags, not only RX related part.
> > > Wonder can we reserve low 32bits of ol_flags for RX, and high 32bits for 
> > > TX and generic stuff.
> > > So our ol_flags will look something like that:
> > >
> > > union {
> > >   uint64_t ol_raw_flags;
> > >   struct {
> > >   uint32_t rx;
> > >   uint32_t gen_tx;
> > >   } ol_flags
> > > };
> > >
> > > And make all PMD RX functions to operate on rx part of the flags only:
> > > mb->ol_flags.rx = pkt_flags;
> > > ?
> > >
> > > Konstantin
> > >
> > I would tend to agree with this. Changchun, did you get to assess the
> > performance impact of making this change to the PMDs? I suspect that making
> > the changes to each PMD would impact performance, while Konstantin's
> > suggestion should eliminate that impact.
> > The downside there is that we are limiting the flexibility we have in
> > expanding beyond 32 RX flags and 24 TX flags. :-(
> >
> > /Bruce
> >
> 
> How about switching things about in terms of the flag. Instead of having to
> manage a flag across the board to indicate if an mbuf is pointing to
> external memory, I think we should use the flag to indicate that an mbuf is
> attached to the memory space of another mbuf.
> 
> My reasons for suggesting this are:
> 1. Mbufs pointing to externally managed memory are not really the problem to
> be dealt with on free, since they can be handled the same as mbufs with the
> data pointer pointing internally, it's mbufs attached to other mbufs which
> are - so that's what we need to track using a flag.
> 2. Setting the flag to indicate an indirect mbuf should have no impact on
> the driver, as an mbuf that has just been allocated from mempool cannot be
> an indirect one.
> 3. The only place we would need to worry about such a flag is in the attach,
> detach and free mbuf functions - and on free we would simply need to replace
> the existing check for "md != m" with a new check for the new flag. It would
> be a contained change.
> 

Sounds good to me.
That's definitely much better than my proposal.
Plus, if we stop relying on:

  md = RTE_MBUF_FROM_BADDR(m->buf_addr);
  if (unlikely (md != m)) {

that will allow us to set buf_addr to some other valid offset inside the mbuf,
which would fix an old problem with extra mbuf metadata (userdata) stored in the 
packet's headroom.

Konstantin
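
To illustrate why the address-based check gets in the way, here is a toy version of it. The struct and macro are simplified stand-ins that assume the data buffer sits immediately after the header; they are not the real rte_mbuf definitions. As soon as buf_addr is moved to any other offset, a perfectly direct mbuf is misclassified as indirect.

```c
#include <stddef.h>

struct toy_mbuf {
	void *buf_addr; /* start of the data buffer */
	/* the buffer is assumed to sit immediately after this header */
};

/* Recover the owning header from a buffer address, assuming header-then-buffer layout. */
#define TOY_MBUF_FROM_BADDR(ba) (((struct toy_mbuf *)(ba)) - 1)

/* Address-based test: "direct" only if buf_addr points right after the header. */
static int
toy_looks_indirect(const struct toy_mbuf *m)
{
	return TOY_MBUF_FROM_BADDR(m->buf_addr) != m;
}
```

For example, reserving 16 bytes of per-packet metadata ahead of the data by bumping buf_addr makes toy_looks_indirect() return 1 for a direct mbuf. A flag-based check has no such dependency on where buf_addr points.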

> Thoughts?
> /Bruce


[dpdk-dev] [PATCH] kni: fix building on Ubuntu-hybrids

2014-10-24 Thread Alexander Guy

On Oct 24, 2014, at 12:35 AM, Thomas Monjalon wrote:
> 
> Please, could explain what is the file /proc/version_signature and why
> it can be a check for Ubuntu kernel?

Ubuntu provides /proc/version_signature to help with determining kernel 
lineage; it doesn't exist in upstream kernels:

https://wiki.ubuntu.com/Kernel/FAQ#Kernel.2BAC8-FAQ.2BAC8-GeneralVersionRunning.How_can_we_determine_the_version_of_the_running_kernel.3F

Commit a09b359d started gathering version information via version_signature in 
order to enable certain Ubuntu-specific kernel workarounds. If you have a 
kernel without this information (e.g. upstream Linux v3.13 with an Ubuntu 
userspace), kni fails to build:

  CC [M]  /home/alexander/dpdk/build/build/lib/librte_eal/linuxapp/kni/e1000_82575.o
In file included from /home/alexander/dpdk/lib/librte_eal/linuxapp/kni/ethtool/igb/e1000_osdep.h:41:0,
                 from /home/alexander/dpdk/lib/librte_eal/linuxapp/kni/ethtool/igb/e1000_hw.h:31,
                 from /home/alexander/dpdk/lib/librte_eal/linuxapp/kni/ethtool/igb/e1000_api.h:31,
                 from /home/alexander/dpdk/build/build/lib/librte_eal/linuxapp/kni/e1000_82575.c:38:
/home/alexander/dpdk/lib/librte_eal/linuxapp/kni/ethtool/igb/kcompat.h:3864:8: error: macro "UBUNTU_KERNEL_VERSION" requires 5 arguments, but only 1 given
/home/alexander/dpdk/lib/librte_eal/linuxapp/kni/ethtool/igb/kcompat.h:3864:8: error: "UBUNTU_KERNEL_VERSION" is not defined [-Werror=undef]

My logic for the change is: if the build system is running in an environment 
that looks like Ubuntu, but can't gather enough information to know if it 
should enable the kernel workarounds, it's safe to not try to enable them at 
all.

Thanks.


Alexander



[dpdk-dev] [dpdk-announce] DPDK Features for Q1 2015

2014-10-24 Thread Matthew Hall
On Fri, Oct 24, 2014 at 08:10:40AM +, O'driscoll, Tim wrote:
> At the moment, within Intel we test with KVM, Xen and ESXi. We've never 
> tested with VirtualBox. So, maybe this is an error on the Supported NICs 
> page, or maybe somebody else is testing that configuration.

So, one of the most popular ways developers test out new code these days is 
using Vagrant or Docker. Vagrant by default creates machines using VirtualBox. 
VirtualBox runs on nearly everything out there (Linux, Windows, OS X, and 
more). Docker uses Linux LXC so it isn't multiplatform. There is a system 
called CoreOS which is still under development. It requires bare-metal w/ 
custom Linux on top.

https://www.vagrantup.com/
https://www.docker.com/
https://coreos.com/

As an open source DPDK app developer who previously used it successfully in 
some commercial big-iron projects, I'm now trying to drive 
adoption of the technology among security programmers. I'm doing it because I 
think DPDK is better than everything else I've seen for packet processing.

So it would help to drive adoption if there were a multiplatform 
virtualization environment that worked with the best performing DPDK drivers, 
so I could make it easy for developers to download, install, and run, so 
they'll get excited and learn more about all the great work you guys did and 
use it to build more DPDK apps.

I don't care if it's VBox necessarily. But we should support at least 1 
end-developer-friendly Virtualization environment so I can make it easy to 
deploy and run an app and get people excited to work with the DPDK. Low 
barrier to entry is important.

> One area where this does need further work is in virtualization. At the 
> moment, our virtualization tests are manual, so they won't be included in 
> the initial DPDK Test Suite release. We will look into automating our 
> current virtualization tests and adding these to the test suite in future.

Sounds good. Then we could help you make it work and keep it working on more 
platforms.

> > Another thing which would help in this area would be additional
> > improvements to the NUMA / socket / core / number of NICs / number of
> > queues autodetections. To write a single app which can run on a virtual
> > card, a hardware card without RSS available, and a hardware card with RSS
> > available, in a thread-safe, flow-safe way, is somewhat complex at the
> > present time.
> > 
> > I'm running into this in the VM based environments because most VNIC's
> > don't have RSS and it complicates the process of keeping consistent state of
> > the flows among the cores.
> 
> This is interesting. Do you have more details on what you're thinking here, 
> that perhaps could be used as the basis for an RFC?

It's something I am still trying to figure out how to deal with actually, 
hence all my virtio-net questions and PCI bus questions I've been hounding 
about on the list the last few weeks. It would be good if you had a contact 
for the virtual DPDK at Intel or 6WIND who could help me figure out the 
solution pattern.

I think it might involve making an app or some DPDK helper code which has 
something like this algorithm:

At load-time, app autodetects if RSS is available or not, and if NUMA is 
present or not.

If RSS is available, and NUMA is not available, enable RSS and create 1 RX 
queue for each lcore.

If RSS is available, and NUMA is available, find the NUMA socket of the NIC, 
and make 1 RX queue for each connected lcore on that NUMA socket.

If RSS is not available, and NUMA is not available, then configure the 
distributor framework. (I have never used it, so I am not sure if this part is 
right.) Create 1 load balancer on the master lcore that does RX from all NICs, 
and hashes and distributes packets to every other lcore.

If RSS is not available, and NUMA is available, then configure the distributor 
framework. (Again, this might not be right.) Create 1 load balancer on the first 
lcore on each socket that does RX from all NUMA-connected NICs, and hashes 
and distributes packets to the other NUMA-connected lcores.
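
The four cases above reduce to a two-input decision, sketched here in plain C. The enum and function names are hypothetical, and detecting RSS and NUMA support is left out; this covers only the selection step.

```c
#include <stdbool.h>

enum toy_rx_plan {
	TOY_RSS_QUEUE_PER_LCORE,       /* RSS, no NUMA: 1 RX queue per lcore */
	TOY_RSS_QUEUE_PER_LOCAL_LCORE, /* RSS, NUMA: 1 RX queue per lcore on the NIC's socket */
	TOY_SW_BALANCER_ON_MASTER,     /* no RSS, no NUMA: 1 balancer on the master lcore */
	TOY_SW_BALANCER_PER_SOCKET     /* no RSS, NUMA: 1 balancer on the first lcore of each socket */
};

/* Pick an RX layout from the two capabilities detected at load time. */
static enum toy_rx_plan
toy_choose_rx_plan(bool rss_available, bool numa_available)
{
	if (rss_available)
		return numa_available ? TOY_RSS_QUEUE_PER_LOCAL_LCORE
				      : TOY_RSS_QUEUE_PER_LCORE;
	return numa_available ? TOY_SW_BALANCER_PER_SOCKET
			      : TOY_SW_BALANCER_ON_MASTER;
}
```

Keeping the choice in one function would let the per-plan setup code (queue creation, distributor configuration) be written and tested separately.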

> Tim

Thanks,
Matthew.


[dpdk-dev] [dpdk-announce] DPDK Features for Q1 2015

2014-10-24 Thread Matthew Hall
On Fri, Oct 24, 2014 at 12:10:20PM +0200, Thomas Monjalon wrote:
> I'm the author of this page. I think I've written VirtualBox to show where 
> virtio is implemented. You interpreted this as "supported environment", so 
> I'm removing it. Thanks for testing and reporting.

Of course, I'm very sorry to see VirtualBox go, but happy to have accurate 
documentation.

Thanks Thomas.

Matthew.


[dpdk-dev] Possible bug in eal_pci pci_scan_one

2014-10-24 Thread Matthew Hall
On Fri, Oct 24, 2014 at 06:36:29PM +0530, Stephen Hemminger wrote:
> The code is fairly consistent in returning -1 for cases of not a NUMA socket,
> bogus port value. It is interpreted as SOCKET_ID_ANY in several places.
> The examples mostly check for -1 and use socket 0 as a fallback.
> Probably not worth introducing more return values and breaking existing
> applications.

OK. So I'll make a patch to correct the comment which was wrong.
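
For reference, the fallback pattern Stephen describes looks roughly like the 
sketch below. EXAMPLE_SOCKET_ID_ANY and socket_or_default are hypothetical 
names used only for illustration; the sketch assumes the -1 sentinel 
described above and does not call any DPDK function.

```c
#include <assert.h>

/* Stand-in for the -1 "no NUMA socket / bogus port" return value. */
#define EXAMPLE_SOCKET_ID_ANY (-1)

/* The sentinel must never collide with a real (non-negative) socket id. */
static_assert(EXAMPLE_SOCKET_ID_ANY < 0, "sentinel must be negative");

/* Use socket 0 when the lookup reported no specific socket. */
static int socket_or_default(int reported_socket)
{
	return reported_socket == EXAMPLE_SOCKET_ID_ANY ? 0 : reported_socket;
}
```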

Matthew.


[dpdk-dev] rte_acl test-acl app

2014-10-24 Thread Erik Ziegenbalg
Hi everyone,

I am having trouble getting a packet classification to succeed
using the rte_acl test app. My rules.acl and trace.acl files are as
follows:

rules.acl:
@192.168.0.0/24 192.168.0.0/24 400 : 500 1000 : 2000 6/0xff

trace.acl:
192.168.0.5 192.168.0.9 450 1002 0x06

However, the result always comes up as 4294967295 (0xffffffff). I have
dug through the code quite a bit to see what is going on, but I am
not sure where I went wrong.

Any help on how the rte_acl_classify function works would be much
appreciated. I understand that the data for rte_acl_classify is a
uint32_t ** and I double checked to make sure I'm passing along proper
values. Is 0xffffffff the expected result? If so, I am getting the same
for packets that should not match.
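
To make the intended match explicit, here is a minimal sketch in plain C of 
what the rule above expresses as a 5-tuple check. It does not use rte_acl at 
all; the struct layouts and helper names (example_rule, example_pkt, 
rule_matches) are hypothetical, and it assumes the rule fields are source 
prefix, destination prefix, source port range, destination port range and 
protocol/mask, in that order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layouts for illustration; not the rte_acl structures. */
struct example_rule {
	uint32_t src, dst;         /* IPv4 prefixes, host byte order */
	uint8_t src_len, dst_len;  /* prefix lengths, 0..32 */
	uint16_t sport_lo, sport_hi;
	uint16_t dport_lo, dport_hi;
	uint8_t proto, proto_mask;
};

struct example_pkt {
	uint32_t src, dst;
	uint16_t sport, dport;
	uint8_t proto;
};

/* Build a host-order IPv4 address from its four octets. */
#define EXAMPLE_IPV4(a, b, c, d) \
	(((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
	 ((uint32_t)(c) << 8) | (uint32_t)(d))

/* Netmask for a prefix length; a shift by 32 is undefined, so 0 is special. */
static uint32_t prefix_mask(uint8_t len)
{
	assert(len <= 32);
	return len == 0 ? 0 : (uint32_t)(UINT32_MAX << (32 - len));
}

/* True when every field of the packet falls inside the rule. */
static bool rule_matches(const struct example_rule *r,
		const struct example_pkt *p)
{
	uint32_t sm = prefix_mask(r->src_len);
	uint32_t dm = prefix_mask(r->dst_len);

	return (p->src & sm) == (r->src & sm) &&
		(p->dst & dm) == (r->dst & dm) &&
		p->sport >= r->sport_lo && p->sport <= r->sport_hi &&
		p->dport >= r->dport_lo && p->dport <= r->dport_hi &&
		(p->proto & r->proto_mask) == (r->proto & r->proto_mask);
}
```

Under these assumptions the trace line above satisfies the rule, so a match 
is the outcome I expect. Note also that 4294967295 is UINT32_MAX, the 
all-ones 32-bit value.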

Thank you,
Erik Ziegenbalg

