[dpdk-dev] [PATCH v2 1/4] app/test: unit test for rx and tx cycles/packet
It's reasonable to me. I'll make a patch for rte_ethdev.c.

> -----Original Message-----
> From: Richardson, Bruce
> Sent: Wednesday, October 22, 2014 11:10 PM
> To: Ananyev, Konstantin; Neil Horman; Liang, Cunming
> Cc: dev at dpdk.org
> Subject: RE: [dpdk-dev] [PATCH v2 1/4] app/test: unit test for rx and tx cycles/packet
>
> > -----Original Message-----
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Ananyev, Konstantin
> > Sent: Wednesday, October 22, 2014 3:53 PM
> > To: Neil Horman; Liang, Cunming
> > Cc: dev at dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v2 1/4] app/test: unit test for rx and tx cycles/packet
> >
> > > -----Original Message-----
> > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Neil Horman
> > > Sent: Wednesday, October 22, 2014 3:03 PM
> > > To: Liang, Cunming
> > > Cc: dev at dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH v2 1/4] app/test: unit test for rx and tx cycles/packet
> > >
> > > On Tue, Oct 21, 2014 at 01:17:01PM +, Liang, Cunming wrote:
> > > >
> > > > > -----Original Message-----
> > > > > From: Neil Horman [mailto:nhorman at tuxdriver.com]
> > > > > Sent: Tuesday, October 21, 2014 6:33 PM
> > > > > To: Liang, Cunming
> > > > > Cc: dev at dpdk.org
> > > > > Subject: Re: [dpdk-dev] [PATCH v2 1/4] app/test: unit test for rx and tx cycles/packet
> > > > >
> > > > > On Sun, Oct 12, 2014 at 11:10:39AM +, Liang, Cunming wrote:
> > > > > > Hi Neil,
> > > > > >
> > > > > > I really appreciate your comments.
> > > > > > I have added inline replies and will send v3 as soon as we are aligned.
> > > > > >
> > > > > > BRs,
> > > > > > Liang Cunming
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Neil Horman [mailto:nhorman at tuxdriver.com]
> > > > > > > Sent: Saturday, October 11, 2014 1:52 AM
> > > > > > > To: Liang, Cunming
> > > > > > > Cc: dev at dpdk.org
> > > > > > > Subject: Re: [dpdk-dev] [PATCH v2 1/4] app/test: unit test for rx and tx cycles/packet
> > > > > > >
> > > > > > > > <...snip...>
> > > > > > > >
> > > > > > > > +		printf("Force Stop!\n");
> > > > > > > > +		stop = 1;
> > > > > > > > +	}
> > > > > > > > +	if (signum == SIGUSR2)
> > > > > > > > +		stats_display(0);
> > > > > > > > +}
> > > > > > > > +/* main processing loop */
> > > > > > > > +static int
> > > > > > > > +main_loop(__rte_unused void *args)
> > > > > > > > +{
> > > > > > > > +#define PACKET_SIZE 64
> > > > > > > > +#define FRAME_GAP 12
> > > > > > > > +#define MAC_PREAMBLE 8
> > > > > > > > +	struct rte_mbuf *pkts_burst[MAX_PKT_BURST];
> > > > > > > > +	unsigned lcore_id;
> > > > > > > > +	unsigned i, portid, nb_rx = 0, nb_tx = 0;
> > > > > > > > +	struct lcore_conf *conf;
> > > > > > > > +	uint64_t prev_tsc, cur_tsc;
> > > > > > > > +	int pkt_per_port;
> > > > > > > > +	uint64_t packets_per_second, total_packets;
> > > > > > > > +
> > > > > > > > +	lcore_id = rte_lcore_id();
> > > > > > > > +	conf = &lcore_conf[lcore_id];
> > > > > > > > +	if (conf->status != LCORE_USED)
> > > > > > > > +		return 0;
> > > > > > > > +
> > > > > > > > +	pkt_per_port = MAX_TRAFIC_BURST / conf->nb_ports;
> > > > > > > > +
> > > > > > > > +	int idx = 0;
> > > > > > > > +	for (i = 0; i < conf->nb_ports; i++) {
> > > > > > > > +		int num = pkt_per_port;
> > > > > > > > +		portid = conf->portlist[i];
> > > > > > > > +		printf("inject %d packet to port %d\n", num, portid);
> > > > > > > > +		while (num) {
> > > > > > > > +			nb_tx = RTE_MIN(MAX_PKT_BURST, num);
> > > > > > > > +			nb_tx = rte_eth_tx_burst(portid, 0,
> > > > > > > > +						&tx_burst[idx], nb_tx);
> > > > > > > > +			num -= nb_tx;
> > > > > > > > +			idx += nb_tx;
> > > > > > > > +		}
> > > > > > > > +	}
> > > > > > > > +	printf("Total packets inject to prime ports = %u\n", idx);
> > > > > > > > +
> > > > > > > > +	packets_per_second = (link_mbps * 1000 * 1000) /
> > > > > > > > +		+((PACKET_SIZE + FRAME_GAP + MAC_PREAMBLE) * CHAR_BIT);
> > > > > > > > +	printf("Each port will do %"PRIu64" packets per second\n",
> > > > > > > > +		+packets_per_second);
> > > > > > > > +
> > > > > > > > +	total_packets = RTE_TEST_DURATION * conf->nb_ports * packets_per_second;
> > > > > > > > +	printf("Test will stop after at least %"PRIu64" packets received\n",
> > > > > > > > +		+ total_packets);
> > > > > > > > +
> > > > > > > > +	prev_tsc = rte_rdtsc();
> > > > > > > > +
> > > > > > > > +	while (likely(!stop)) {
> > > > > > > > +		for (i = 0; i < conf->nb_port
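The packets_per_second expression in the quoted main_loop() is ordinary line-rate arithmetic, and a standalone version may make it easier to check. The sketch below is only an illustration: the function name and the SKETCH_ prefixed constants are made up here (they mirror PACKET_SIZE, FRAME_GAP and MAC_PREAMBLE from the quoted patch), and link_mbps is assumed to be the link speed in Mbit/s.

```c
#include <limits.h>
#include <stdint.h>

/* Values copied from the quoted patch; the SKETCH_ names are hypothetical. */
#define SKETCH_PACKET_SIZE  64	/* bytes per frame */
#define SKETCH_FRAME_GAP    12	/* inter-frame gap, in bytes */
#define SKETCH_MAC_PREAMBLE 8	/* preamble, in bytes */

/*
 * Packets per second that one port can carry at line rate:
 * link speed in bit/s divided by the bits each packet occupies on the wire.
 */
static uint64_t
sketch_packets_per_second(uint64_t link_mbps)
{
	uint64_t bits_per_packet =
		(SKETCH_PACKET_SIZE + SKETCH_FRAME_GAP + SKETCH_MAC_PREAMBLE) *
		CHAR_BIT;

	return (link_mbps * 1000 * 1000) / bits_per_packet;
}
```

With CHAR_BIT equal to 8, each 64-byte packet occupies 84 bytes (672 bits) on the wire, so sketch_packets_per_second(10000) gives 14880952 for a 10 Gbit/s link and sketch_packets_per_second(1000) gives 1488095 for 1 Gbit/s.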
[dpdk-dev] EAL : Input/output error on DPDK 1.7.1
Hi,

I got the same result in a VMware Workstation environment.
At least in my environment, the INTX toggle check does not work with the VMware E1000 Ethernet device.
Please try the attached patch.

2014-10-17 3:04 GMT+09:00 Raghav K :
> Hey,
> I observe a continuous burst of I/O errors, as indicated below, with the
> testpmd application on DPDK 1.7.1. This seems to originate from the
> eal_intr_process_interrupts() function. I seem to have set up the DPDK
> prerequisites correctly.
> Another recent post seemed to suggest moving back to 1.7.0; however, I would
> like to persist with 1.7.1.
> Any help/pointers in resolving this would be greatly appreciated.
> Much thanks,
> Raghav
>
> root at sys6-vm6:/home/rghv/dpdk/dpdk-1.7.1/x86_64-native-linuxapp-gcc/app# ./testpmd -c 0xf -n3 -- -i --nb-cores=3 --nb-ports=2
> EAL: Error reading from file descriptor 21: Input/output error
> EAL: Error reading from file descriptor 21: Input/output error
> EAL: Error reading from file descriptor 21: Input/output error
> EAL: Error reading from file descriptor 21: Input/output error
> EAL: Error reading from file descriptor 21: Input/output error
> EAL: Error reading from file descriptor 21: Input/output error
> EAL: Error reading from file descriptor 21: Input/output error
> EAL: Error reading from file descriptor 21: Input/output error
> EAL: Error reading from file descriptor 21: Input/output error
> EAL: Error reading from file descriptor 21: Input/output error
> EAL: Error reading from file descriptor 21: Input/output error
> EAL: Error reading from file descriptor 21: Input/output error
> EAL: Error reading from file descriptor 21: Input/output error
> EAL: Error reading from file descriptor 21: Input/output error
> EAL: Error reading from file descriptor 21: Input/output error
> EAL: Error reading from file descriptor 21: Input/output error
>
> root at sys6-vm6:/home/rghv/dpdk/dpdk-1.7.1# ./tools/dpdk_nic_bind.py --status
>
> Network devices using DPDK-compatible driver
> :02:01.0 '82545EM Gigabit Ethernet Controller (Copper)' drv=igb_uio unused=e1000
> :02:02.0 '82545EM Gigabit Ethernet Controller (Copper)' drv=igb_uio unused=e1000
>
> Network devices using kernel driver
> :02:00.0 '82545EM Gigabit Ethernet Controller (Copper)' if=eth0 drv=e1000 unused=igb_uio *Active*
> :02:03.0 '82545EM Gigabit Ethernet Controller (Copper)' if=eth3 drv=e1000 unused=igb_uio
> :02:05.0 '82545EM Gigabit Ethernet Controller (Copper)' if=eth4 drv=e1000 unused=igb_uio
> :02:06.0 '82545EM Gigabit Ethernet Controller (Copper)' if=eth5 drv=e1000 unused=igb_uio
>
> Other network devices

-- next part --
diff --git a/lib/librte_eal/linuxapp/igb_uio/igb_uio.c b/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
index d1ca26e..c46a00f 100644
--- a/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
+++ b/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
@@ -505,14 +505,11 @@ igbuio_pci_probe(struct pci_dev *dev, const struct pci_device_id *id)
 		}
 		/* fall back to INTX */
 	case RTE_INTR_MODE_LEGACY:
-		if (pci_intx_mask_supported(dev)) {
-			dev_dbg(&dev->dev, "using INTX");
-			udev->info.irq_flags = IRQF_SHARED;
-			udev->info.irq = dev->irq;
-			udev->mode = RTE_INTR_MODE_LEGACY;
-			break;
-		}
-		dev_notice(&dev->dev, "PCI INTX mask not supported\n");
+		dev_dbg(&dev->dev, "using INTX");
+		udev->info.irq_flags = IRQF_SHARED;
+		udev->info.irq = dev->irq;
+		udev->mode = RTE_INTR_MODE_LEGACY;
+		break;
 		/* fall back to no IRQ */
 	case RTE_INTR_MODE_NONE:
 		udev->mode = RTE_INTR_MODE_NONE;
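For readers who do not have igb_uio.c open, the effect of the patch on the fall-through logic can be modelled in user space roughly as follows. This is a sketch under assumptions, not the driver code: the enum, the function and its parameters are hypothetical, and treating the case above RTE_INTR_MODE_LEGACY as MSI-X is a guess, since the hunk does not show it. The require_intx_check flag stands for the pci_intx_mask_supported() test that the patch removes.

```c
#include <stdbool.h>

/* Hypothetical user-space model of the interrupt modes; not the driver's enum. */
enum sketch_intr_mode {
	SKETCH_INTR_NONE,
	SKETCH_INTR_LEGACY,
	SKETCH_INTR_MSIX	/* assumed to be the case above LEGACY */
};

/*
 * Pick an interrupt mode, falling through to the next weaker mode when the
 * requested one is unavailable. With require_intx_check set to false the
 * behaviour matches the patched code: LEGACY is taken unconditionally.
 */
static enum sketch_intr_mode
sketch_pick_mode(enum sketch_intr_mode requested, bool msix_ok,
		 bool intx_check_ok, bool require_intx_check)
{
	switch (requested) {
	case SKETCH_INTR_MSIX:
		if (msix_ok)
			return SKETCH_INTR_MSIX;
		/* fall through to INTX */
	case SKETCH_INTR_LEGACY:
		if (!require_intx_check || intx_check_ok)
			return SKETCH_INTR_LEGACY;
		/* fall through to no IRQ */
	case SKETCH_INTR_NONE:
	default:
		return SKETCH_INTR_NONE;
	}
}
```

On a device where the INTX check fails, as reported above for the VMware E1000, the unpatched behaviour (require_intx_check true) ends in SKETCH_INTR_NONE, while the patched behaviour (require_intx_check false) stays in SKETCH_INTR_LEGACY.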
[dpdk-dev] [PATCH v4 5/8] test app: adding support for generating variable sized packet bursts
> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Declan Doherty
> Sent: Tuesday, September 30, 2014 5:58 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v4 5/8] test app: adding support for generating variable sized packet bursts
>
> Signed-off-by: Declan Doherty
> ---
>  app/test/packet_burst_generator.c | 25 -
>  app/test/packet_burst_generator.h |  6 +-
>  app/test/test_link_bonding.c      | 14 +-
>  3 files changed, 22 insertions(+), 23 deletions(-)
>
> diff --git a/app/test/packet_burst_generator.c b/app/test/packet_burst_generator.c
> index 9e747a4..b2824dc 100644
> --- a/app/test/packet_burst_generator.c
> +++ b/app/test/packet_burst_generator.c
> @@ -74,8 +74,7 @@ static inline void
>  copy_buf_to_pkt(void *buf, unsigned len, struct rte_mbuf *pkt, unsigned offset)
>  {
>  	if (offset + len <= pkt->data_len) {
> -		rte_memcpy(rte_pktmbuf_mtod(pkt, char *) + offset,
> -				buf, (size_t) len);
> +		rte_memcpy(rte_pktmbuf_mtod(pkt, char *) + offset, buf, (size_t) len);
>  		return;
>  	}
>  	copy_buf_to_pkt_segs(buf, len, pkt, offset);
> @@ -191,20 +190,12 @@ initialize_ipv4_header(struct ipv4_hdr *ip_hdr, uint32_t src_addr,
>   */
>  #define RTE_MAX_SEGS_PER_PKT 255 /**< pkt.nb_segs is a 8-bit unsigned char. */
>
> -#define TXONLY_DEF_PACKET_LEN 64
> -#define TXONLY_DEF_PACKET_LEN_128 128
> -
> -uint16_t tx_pkt_length = TXONLY_DEF_PACKET_LEN;
> -uint16_t tx_pkt_seg_lengths[RTE_MAX_SEGS_PER_PKT] = {
> -		TXONLY_DEF_PACKET_LEN_128,
> -};
> -
> -uint8_t tx_pkt_nb_segs = 1;
>
>  int
>  generate_packet_burst(struct rte_mempool *mp, struct rte_mbuf **pkts_burst,
>  		struct ether_hdr *eth_hdr, uint8_t vlan_enabled, void *ip_hdr,
> -		uint8_t ipv4, struct udp_hdr *udp_hdr, int nb_pkt_per_burst)
> +		uint8_t ipv4, struct udp_hdr *udp_hdr, int nb_pkt_per_burst,
> +		uint8_t pkt_len, uint8_t nb_pkt_segs)
>  {
>  	int i, nb_pkt = 0;
>  	size_t eth_hdr_size;
> @@ -221,9 +212,9 @@ nomore_mbuf:
>  			break;
>  		}
>
> -		pkt->data_len = tx_pkt_seg_lengths[0];
> +		pkt->data_len = pkt_len;
>  		pkt_seg = pkt;
> -		for (i = 1; i < tx_pkt_nb_segs; i++) {
> +		for (i = 1; i < nb_pkt_segs; i++) {
>  			pkt_seg->next = rte_pktmbuf_alloc(mp);
>  			if (pkt_seg->next == NULL) {
>  				pkt->nb_segs = i;
> @@ -231,7 +222,7 @@ nomore_mbuf:
>  				goto nomore_mbuf;
>  			}
>  			pkt_seg = pkt_seg->next;
> -			pkt_seg->data_len = tx_pkt_seg_lengths[i];
> +			pkt_seg->data_len = pkt_len;
>  		}
>  		pkt_seg->next = NULL; /* Last segment of packet. */
>
> @@ -259,8 +250,8 @@ nomore_mbuf:
>  		 * Complete first mbuf of packet and append it to the
>  		 * burst of packets to be transmitted.
>  		 */
> -		pkt->nb_segs = tx_pkt_nb_segs;
> -		pkt->pkt_len = tx_pkt_length;
> +		pkt->nb_segs = nb_pkt_segs;
> +		pkt->pkt_len = pkt_len;
>  		pkt->l2_len = eth_hdr_size;
>
>  		if (ipv4) {
> diff --git a/app/test/packet_burst_generator.h b/app/test/packet_burst_generator.h
> index 5b3cd6c..f86589e 100644
> --- a/app/test/packet_burst_generator.h
> +++ b/app/test/packet_burst_generator.h
> @@ -47,6 +47,9 @@ extern "C" {
>  #define IPV4_ADDR(a, b, c, d)(((a & 0xff) << 24) | ((b & 0xff) << 16) | \
>  		((c & 0xff) << 8) | (d & 0xff))
>
> +#define PACKET_BURST_GEN_PKT_LEN 60
> +#define PACKET_BURST_GEN_PKT_LEN_128 128
> +
>
>  void
>  initialize_eth_header(struct ether_hdr *eth_hdr, struct ether_addr *src_mac,
> @@ -68,7 +71,8 @@ initialize_ipv4_header(struct ipv4_hdr *ip_hdr, uint32_t src_addr,
>  int
>  generate_packet_burst(struct rte_mempool *mp, struct rte_mbuf **pkts_burst,
>  		struct ether_hdr *eth_hdr, uint8_t vlan_enabled, void *ip_hdr,
> -		uint8_t ipv4, struct udp_hdr *udp_hdr, int nb_pkt_per_burst);
> +		uint8_t ipv4, struct udp_hdr *udp_hdr, int nb_pkt_per_burst,
> +		uint8_t pkt_len, uint8_t nb_pkt_segs);
>
>  #ifdef __cplusplus
>  }
> diff --git a/app/test/test_link_bonding.c b/app/test/test_link_bonding.c
> index 1a847eb..50355a3 100644
> --- a/app/test/test_link_bonding.c
> +++ b/app/test/test_link_bonding.c
> @@ -1338,7 +1338,8 @@ generate_test_burst(struct rte_mbuf **pkts_burst, uint16_t burst_size,
>  	/* Generate burst of packets to transmit */
>  	generated_burst_size = generate_packet_burst(test_params->mbuf_pool,
>  			pkts_burst, test_params->pkt_eth_hdr, vlan, ip_hdr, ipv4,
> -			test_params->pkt_udp_hdr, bu
[dpdk-dev] [PATCH v4 0/3] app/test: unit test to measure cycles per packet
BTW, [1/3] is the same patch as the one below.
http://dpdk.org/dev/patchwork/patch/817

v4 update:
# Fix the confusing return values of some rte_ethdev APIs

v3 update:
# Code refined according to the feedback.
  1. Add ether_format_addr to rte_ether.h
  2. Fix typos in code comments.
  3. Change %lu to %PRIu64, fixing a compilation error on 32-bit targets
# Merge 2 small incremental patches into the first one.
  The whole unit test is now a single patch, [PATCH v3 2/2]
# Rebase code to the latest master

v2 update:
Rebase code to the latest master branch.

It provides a unit test to measure cycles/packet in NIC loopback mode.
It simply gives the average IO cycles used per packet, without test equipment.
When doing the test, make sure the link is UP.

Two stream control modes are supported: continuous and burst.
The former keeps forwarding the injected packets until a certain number of packets is reached.
The latter stops when all the injected packets have been received.
In burst stream mode, two situations are currently measured: with and without desc. cache conflict.
By default, it runs in continuous stream mode to measure the whole rxtx.

Usage Example:
1. Run the unit test app in interactive mode
    app/test -c f -n 4 -- -i
2. Set the stream control mode; the default is continuous
    set_rxtx_sc [continuous|poll_before_xmit|poll_after_xmit]
3. If the continuous stream is chosen, two more options can be configured
  3.1 Choose the rx/tx pair; the default is vector
      set_rxtx_mode [vector|scalar|full|hybrid]
      Note: To get accurate scalar fast results, please choose 'vector' or 'hybrid' without INC_VEC=y in config
  3.2 Choose the area of measurement; the default is rxtx
      set_rxtx_anchor [rxtx|rxonly|txonly]
4. Run and wait for the result
    pmd_perf_autotest

For those who simply want to see how many cycles each packet costs:
compile DPDK, run 'app/test', and type 'pmd_perf_autotest'. That's it; nothing else needs to be configured.
Use the other options once you understand them and want to measure more.

Cunming Liang (3):
  app/test: allow to create packets in different sizes
  app/test: measure the cost of rx/tx routines by cycle number
  ethdev: fix wrong error return refer to API definition

 app/test/Makefile                   |    1 +
 app/test/commands.c                 |  111 +
 app/test/packet_burst_generator.c   |   26 +-
 app/test/packet_burst_generator.h   |   11 +-
 app/test/test.h                     |    6 +
 app/test/test_link_bonding.c        |   39 +-
 app/test/test_pmd_perf.c            |  922 +++
 lib/librte_ether/rte_ethdev.c       |    6 +-
 lib/librte_ether/rte_ether.h        |   25 +
 lib/librte_pmd_ixgbe/ixgbe_ethdev.c |    6 +
 10 files changed, 1117 insertions(+), 36 deletions(-)
 create mode 100644 app/test/test_pmd_perf.c

-- 
1.7.4.1
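The figure the test reports is an average, so the underlying arithmetic is simply elapsed timestamp-counter cycles divided by the number of packets handled. A minimal sketch of that calculation follows, assuming the two counter readings are taken around the measured region; the function name and parameters are hypothetical and are not taken from test_pmd_perf.c.

```c
#include <stdint.h>

/*
 * Average cycles spent per packet over a measured interval.
 * start_tsc and end_tsc are assumed to be timestamp-counter readings taken
 * before and after the measured region; nb_pkts is the number of packets
 * handled in between.
 */
static double
sketch_cycles_per_packet(uint64_t start_tsc, uint64_t end_tsc,
			 uint64_t nb_pkts)
{
	if (nb_pkts == 0)
		return 0.0;	/* nothing was measured, avoid dividing by zero */

	return (double)(end_tsc - start_tsc) / (double)nb_pkts;
}
```

For example, 64000 elapsed cycles over 640 packets works out to 100 cycles per packet.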
[dpdk-dev] [PATCH v4 1/3] app/test: allow to create packets in different sizes
Add support to allow the packet burst generator to create packets in different sizes.

Signed-off-by: Cunming Liang
Acked-by: Declan Doherty
---
 app/test/packet_burst_generator.c |   26
 app/test/packet_burst_generator.h |   11 +++--
 app/test/test_link_bonding.c      |   39
 3 files changed, 43 insertions(+), 33 deletions(-)

diff --git a/app/test/packet_burst_generator.c b/app/test/packet_burst_generator.c
index 9e747a4..017139b 100644
--- a/app/test/packet_burst_generator.c
+++ b/app/test/packet_burst_generator.c
@@ -191,20 +191,12 @@ initialize_ipv4_header(struct ipv4_hdr *ip_hdr, uint32_t src_addr,
  */
 #define RTE_MAX_SEGS_PER_PKT 255 /**< pkt.nb_segs is a 8-bit unsigned char. */
 
-#define TXONLY_DEF_PACKET_LEN 64
-#define TXONLY_DEF_PACKET_LEN_128 128
-
-uint16_t tx_pkt_length = TXONLY_DEF_PACKET_LEN;
-uint16_t tx_pkt_seg_lengths[RTE_MAX_SEGS_PER_PKT] = {
-		TXONLY_DEF_PACKET_LEN_128,
-};
-
-uint8_t tx_pkt_nb_segs = 1;
-
 int
 generate_packet_burst(struct rte_mempool *mp, struct rte_mbuf **pkts_burst,
-		struct ether_hdr *eth_hdr, uint8_t vlan_enabled, void *ip_hdr,
-		uint8_t ipv4, struct udp_hdr *udp_hdr, int nb_pkt_per_burst)
+		struct ether_hdr *eth_hdr, uint8_t vlan_enabled,
+		void *ip_hdr, uint8_t ipv4, struct udp_hdr *udp_hdr,
+		int nb_pkt_per_burst, uint8_t pkt_len,
+		uint8_t nb_pkt_segs)
 {
 	int i, nb_pkt = 0;
 	size_t eth_hdr_size;
@@ -221,9 +213,9 @@ nomore_mbuf:
 			break;
 		}
 
-		pkt->data_len = tx_pkt_seg_lengths[0];
+		pkt->data_len = pkt_len;
 		pkt_seg = pkt;
-		for (i = 1; i < tx_pkt_nb_segs; i++) {
+		for (i = 1; i < nb_pkt_segs; i++) {
 			pkt_seg->next = rte_pktmbuf_alloc(mp);
 			if (pkt_seg->next == NULL) {
 				pkt->nb_segs = i;
@@ -231,7 +223,7 @@ nomore_mbuf:
 				goto nomore_mbuf;
 			}
 			pkt_seg = pkt_seg->next;
-			pkt_seg->data_len = tx_pkt_seg_lengths[i];
+			pkt_seg->data_len = pkt_len;
 		}
 		pkt_seg->next = NULL; /* Last segment of packet. */
 
@@ -259,8 +251,8 @@ nomore_mbuf:
 		 * Complete first mbuf of packet and append it to the
 		 * burst of packets to be transmitted.
 		 */
-		pkt->nb_segs = tx_pkt_nb_segs;
-		pkt->pkt_len = tx_pkt_length;
+		pkt->nb_segs = nb_pkt_segs;
+		pkt->pkt_len = pkt_len;
 		pkt->l2_len = eth_hdr_size;
 
 		if (ipv4) {
diff --git a/app/test/packet_burst_generator.h b/app/test/packet_burst_generator.h
index 5b3cd6c..fe992ac 100644
--- a/app/test/packet_burst_generator.h
+++ b/app/test/packet_burst_generator.h
@@ -47,10 +47,13 @@ extern "C" {
 #define IPV4_ADDR(a, b, c, d)(((a & 0xff) << 24) | ((b & 0xff) << 16) | \
 		((c & 0xff) << 8) | (d & 0xff))
 
+#define PACKET_BURST_GEN_PKT_LEN 60
+#define PACKET_BURST_GEN_PKT_LEN_128 128
 
 void
 initialize_eth_header(struct ether_hdr *eth_hdr, struct ether_addr *src_mac,
-		struct ether_addr *dst_mac, uint8_t vlan_enabled, uint16_t van_id);
+		struct ether_addr *dst_mac, uint8_t vlan_enabled,
+		uint16_t van_id);
 
 uint16_t
 initialize_udp_header(struct udp_hdr *udp_hdr, uint16_t src_port,
@@ -67,8 +70,10 @@ initialize_ipv4_header(struct ipv4_hdr *ip_hdr, uint32_t src_addr,
 
 int
 generate_packet_burst(struct rte_mempool *mp, struct rte_mbuf **pkts_burst,
-		struct ether_hdr *eth_hdr, uint8_t vlan_enabled, void *ip_hdr,
-		uint8_t ipv4, struct udp_hdr *udp_hdr, int nb_pkt_per_burst);
+		struct ether_hdr *eth_hdr, uint8_t vlan_enabled,
+		void *ip_hdr, uint8_t ipv4, struct udp_hdr *udp_hdr,
+		int nb_pkt_per_burst, uint8_t pkt_len,
+		uint8_t nb_pkt_segs);
 
 #ifdef __cplusplus
 }
diff --git a/app/test/test_link_bonding.c b/app/test/test_link_bonding.c
index 214d2a2..d407e4f 100644
--- a/app/test/test_link_bonding.c
+++ b/app/test/test_link_bonding.c
@@ -1192,9 +1192,12 @@ generate_test_burst(struct rte_mbuf **pkts_burst, uint16_t burst_size,
 	}
 
 	/* Generate burst of packets to transmit */
-	generated_burst_size = generate_packet_burst(test_params->mbuf_pool,
-			pkts_burst, test_params->pkt_eth_hdr, vlan, ip_hdr, ipv4,
-			test_params->pkt_udp_hdr, burst_size);
+	generated_burst_size =
+		generate_packet_burst(test_params->mbuf_pool,
+			pkts_burst, test_params->pkt_eth_hdr,
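To illustrate what the two new parameters control, here is a small user-space model of the segment chain built inside generate_packet_burst(): every segment receives the same data_len, and nb_pkt_segs decides how many segments are linked. It is only a sketch; struct sketch_seg and the helper names are hypothetical stand-ins for rte_mbuf and the mempool calls, and calloc()/free() replace rte_pktmbuf_alloc()/rte_pktmbuf_free().

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for an mbuf segment; not the rte_mbuf layout. */
struct sketch_seg {
	struct sketch_seg *next;
	uint16_t data_len;
};

/* Free every segment of a chain. */
static void
sketch_free_chain(struct sketch_seg *seg)
{
	while (seg != NULL) {
		struct sketch_seg *next = seg->next;

		free(seg);
		seg = next;
	}
}

/*
 * Build a chain of nb_segs segments, each carrying seg_len bytes, the way
 * the patched loop assigns pkt_len to every segment. Returns NULL when
 * nb_segs is 0 or an allocation fails.
 */
static struct sketch_seg *
sketch_build_chain(uint8_t seg_len, uint8_t nb_segs)
{
	struct sketch_seg *head = NULL, *tail = NULL;
	unsigned int i;

	for (i = 0; i < nb_segs; i++) {
		struct sketch_seg *seg = calloc(1, sizeof(*seg));

		if (seg == NULL) {
			sketch_free_chain(head);
			return NULL;
		}
		seg->data_len = seg_len;
		if (tail != NULL)
			tail->next = seg;
		else
			head = seg;
		tail = seg;
	}
	return head;
}

/* Sum of data_len over all segments of a chain. */
static uint32_t
sketch_chain_bytes(const struct sketch_seg *seg)
{
	uint32_t total = 0;

	for (; seg != NULL; seg = seg->next)
		total += seg->data_len;
	return total;
}
```

In this model a single 60-byte segment carries 60 bytes, and a chain of three 128-byte segments carries 384 bytes in total. Because both parameters are uint8_t in the patch, neither value can exceed 255.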
[dpdk-dev] [PATCH v4 3/3] ethdev: fix wrong error return refer to API definition
Per definition, rte_eth_rx_burst/rte_eth_tx_burst/rte_eth_rx_queue_count return the number of packets.
When RTE_LIBRTE_ETHDEV_DEBUG is turned on, the retval of FUNC_PTR_OR_ERR_RET was set to -ENOTSUP, which is confusing.
The patch always returns 0, whether there is no packet or there is an error.
Meanwhile, errno is set in this kind of check.

Signed-off-by: Cunming Liang
---
 lib/librte_ether/rte_ethdev.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 50f10d9..922a0c6 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -2530,7 +2530,7 @@ rte_eth_rx_burst(uint8_t port_id, uint16_t queue_id,
 		return 0;
 	}
 	dev = &rte_eth_devices[port_id];
-	FUNC_PTR_OR_ERR_RET(*dev->rx_pkt_burst, -ENOTSUP);
+	FUNC_PTR_OR_ERR_RET(*dev->rx_pkt_burst, 0);
 	if (queue_id >= dev->data->nb_rx_queues) {
 		PMD_DEBUG_TRACE("Invalid RX queue_id=%d\n", queue_id);
 		return 0;
@@ -2551,7 +2551,7 @@ rte_eth_tx_burst(uint8_t port_id, uint16_t queue_id,
 	}
 
 	dev = &rte_eth_devices[port_id];
-	FUNC_PTR_OR_ERR_RET(*dev->tx_pkt_burst, -ENOTSUP);
+	FUNC_PTR_OR_ERR_RET(*dev->tx_pkt_burst, 0);
 	if (queue_id >= dev->data->nb_tx_queues) {
 		PMD_DEBUG_TRACE("Invalid TX queue_id=%d\n", queue_id);
 		return 0;
@@ -2570,7 +2570,7 @@ rte_eth_rx_queue_count(uint8_t port_id, uint16_t queue_id)
 		return 0;
 	}
 	dev = &rte_eth_devices[port_id];
-	FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_count, -ENOTSUP);
+	FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_count, 0);
 	return (*dev->dev_ops->rx_queue_count)(dev, queue_id);
 }
 
-- 
1.7.4.1
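The reason a negative error code is confusing here is that the burst functions return an unsigned packet count, so -ENOTSUP would reach the caller as a large positive number. The following self-contained sketch shows the convention the commit message describes (return 0 and report the reason through errno). Everything in it is a mock: the SKETCH_/sketch_ names are hypothetical, the macro is not DPDK's FUNC_PTR_OR_ERR_RET definition, and setting errno inside the macro is an assumption about how the check could be written.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the debug-only check macro; not DPDK's definition. */
#define SKETCH_FUNC_PTR_OR_RET(func, retval) do {	\
	if ((func) == NULL) {				\
		errno = ENOTSUP;			\
		return (retval);			\
	}						\
} while (0)

typedef uint16_t (*sketch_rx_burst_t)(void **pkts, uint16_t nb_pkts);

/* Mock device holding only the rx burst callback. */
struct sketch_dev {
	sketch_rx_burst_t rx_pkt_burst;
};

/* Mock driver callback that pretends every requested packet was received. */
static uint16_t
sketch_stub_rx(void **pkts, uint16_t nb_pkts)
{
	(void)pkts;
	return nb_pkts;
}

/*
 * The return type is an unsigned packet count, so an error is reported as
 * "0 packets" plus errno rather than as a negative value.
 */
static uint16_t
sketch_rx_burst(struct sketch_dev *dev, void **pkts, uint16_t nb_pkts)
{
	SKETCH_FUNC_PTR_OR_RET(dev->rx_pkt_burst, 0);
	return dev->rx_pkt_burst(pkts, nb_pkts);
}
```

A caller that gets 0 back can inspect errno to tell "no packets available" from "operation not supported", whereas a returned (uint16_t)(-ENOTSUP) would look like a valid, very large packet count.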
[dpdk-dev] [PATCH v4 2/3] app/test: measure the cost of rx/tx routines by cycle number
The unit test can be used to measure cycles per packet in different rx/tx routines.
The NIC works in loopback mode, so no test equipment is required to measure throughput.
As a result, the unit test shows the average number of cycles consumed per packet.
When doing the test, make sure the link is UP.

Usage Example:
1. Run the unit test app in interactive mode
    app/test -c f -n 4 -- -i
2. Run and wait for the result
    pmd_perf_autotest

There is an option to choose the rx/tx pair; the default is vector.
    set_rxtx_mode [vector|scalar|full|hybrid]
Note: To get accurate scalar fast results, please choose 'vector' or 'hybrid' without INC_VEC=y in config

It supports measuring rx or tx standalone.
Usage Example:
Choose rx or tx standalone; the default is both
    set_rxtx_anchor [rxtx|rxonly|txonly]

It also supports measuring standalone RX burst cycles.
In this mode, received packets are not re-sent repeatedly.
Currently it measures two situations: poll before/after xmit (with or without desc. cache conflict)
Usage Example:
Set the stream control mode; the default is continuous
    set_rxtx_sc [continuous|poll_before_xmit|poll_after_xmit]

Signed-off-by: Cunming Liang
Acked-by: Bruce Richardson
---
 app/test/Makefile                   |    1 +
 app/test/commands.c                 |  111 +
 app/test/test.h                     |    6 +
 app/test/test_pmd_perf.c            |  922 +++
 lib/librte_ether/rte_ether.h        |   25 +
 lib/librte_pmd_ixgbe/ixgbe_ethdev.c |    6 +
 6 files changed, 1071 insertions(+), 0 deletions(-)
 create mode 100644 app/test/test_pmd_perf.c

diff --git a/app/test/Makefile b/app/test/Makefile
index 6af6d76..ebfa0ba 100644
--- a/app/test/Makefile
+++ b/app/test/Makefile
@@ -56,6 +56,7 @@ SRCS-y += test_memzone.c
 
 SRCS-y += test_ring.c
 SRCS-y += test_ring_perf.c
+SRCS-y += test_pmd_perf.c
 
 ifeq ($(CONFIG_RTE_LIBRTE_TABLE),y)
 SRCS-y += test_table.c
diff --git a/app/test/commands.c b/app/test/commands.c
index a9e36b1..92a17ed 100644
--- a/app/test/commands.c
+++ b/app/test/commands.c
@@ -310,12 +310,123 @@ cmdline_parse_inst_t cmd_quit = {
 
 //
 
+struct cmd_set_rxtx_result {
+	cmdline_fixed_string_t set;
+	cmdline_fixed_string_t mode;
+};
+
+static void cmd_set_rxtx_parsed(void *parsed_result, struct cmdline *cl,
+				__attribute__((unused)) void *data)
+{
+	struct cmd_set_rxtx_result *res = parsed_result;
+	if (test_set_rxtx_conf(res->mode) < 0)
+		cmdline_printf(cl, "Cannot find such mode\n");
+}
+
+cmdline_parse_token_string_t cmd_set_rxtx_set =
+	TOKEN_STRING_INITIALIZER(struct cmd_set_rxtx_result, set,
+				 "set_rxtx_mode");
+
+cmdline_parse_token_string_t cmd_set_rxtx_mode =
+	TOKEN_STRING_INITIALIZER(struct cmd_set_rxtx_result, mode, NULL);
+
+cmdline_parse_inst_t cmd_set_rxtx = {
+	.f = cmd_set_rxtx_parsed,  /* function to call */
+	.data = NULL,      /* 2nd arg of func */
+	.help_str = "set rxtx routine: "
+			"set_rxtx ",
+	.tokens = {        /* token list, NULL terminated */
+		(void *)&cmd_set_rxtx_set,
+		(void *)&cmd_set_rxtx_mode,
+		NULL,
+	},
+};
+
+//
+
+struct cmd_set_rxtx_anchor {
+	cmdline_fixed_string_t set;
+	cmdline_fixed_string_t type;
+};
+
+static void
+cmd_set_rxtx_anchor_parsed(void *parsed_result,
+			   struct cmdline *cl,
+			   __attribute__((unused)) void *data)
+{
+	struct cmd_set_rxtx_anchor *res = parsed_result;
+	if (test_set_rxtx_anchor(res->type) < 0)
+		cmdline_printf(cl, "Cannot find such anchor\n");
+}
+
+cmdline_parse_token_string_t cmd_set_rxtx_anchor_set =
+	TOKEN_STRING_INITIALIZER(struct cmd_set_rxtx_anchor, set,
+				 "set_rxtx_anchor");
+
+cmdline_parse_token_string_t cmd_set_rxtx_anchor_type =
+	TOKEN_STRING_INITIALIZER(struct cmd_set_rxtx_anchor, type, NULL);
+
+cmdline_parse_inst_t cmd_set_rxtx_anchor = {
+	.f = cmd_set_rxtx_anchor_parsed,  /* function to call */
+	.data = NULL,      /* 2nd arg of func */
+	.help_str = "set rxtx anchor: "
+			"set_rxtx_anchor ",
+	.tokens = {        /* token list, NULL terminated */
+		(void *)&cmd_set_rxtx_anchor_set,
+		(void *)&cmd_set_rxtx_anchor_type,
+		NULL,
+	},
+};
+
+//
+
+/* for stream control */
+struct cmd_set_rxtx_sc {
+	cmdline_fixed_string_t set;
+	cmdline_fixed_string_t type;
+};
+
+static void
+cmd_set_rxtx_sc_parsed(void *parsed_result,
+			   struct cmdline *cl,
+			   __attribute__((unused)) void *data)
+{
+	struct cmd_set_rxtx_sc *res = parsed_result;
+	if (test_set_rxtx_sc(res->type) < 0)
+		cmdline_printf(cl, "Cannot find such stream control\n");
+}
+
+cmdline_parse_toke
[dpdk-dev] [PATCH v5 0/3] app/test: unit test to measure cycles per packet
BTW, [1/3] is the same patch as the one below.
http://dpdk.org/dev/patchwork/patch/817

v5 update:
# Fix the confusing return values of some rte_ethdev APIs

v4 ignore

v3 update:
# Code refined according to the feedback.
  1. Add ether_format_addr to rte_ether.h
  2. Fix typos in code comments.
  3. Change %lu to %PRIu64, fixing a compilation error on 32-bit targets
# Merge 2 small incremental patches into the first one.
  The whole unit test is now a single patch, [PATCH v3 2/2]
# Rebase code to the latest master

v2 update:
Rebase code to the latest master branch.

It provides a unit test to measure cycles/packet in NIC loopback mode.
It simply gives the average IO cycles used per packet, without test equipment.
When doing the test, make sure the link is UP.

Two stream control modes are supported: continuous and burst.
The former keeps forwarding the injected packets until a certain number of packets is reached.
The latter stops when all the injected packets have been received.
In burst stream mode, two situations are currently measured: with and without desc. cache conflict.
By default, it runs in continuous stream mode to measure the whole rxtx.

Usage Example:
1. Run the unit test app in interactive mode
    app/test -c f -n 4 -- -i
2. Set the stream control mode; the default is continuous
    set_rxtx_sc [continuous|poll_before_xmit|poll_after_xmit]
3. If the continuous stream is chosen, two more options can be configured
  3.1 Choose the rx/tx pair; the default is vector
      set_rxtx_mode [vector|scalar|full|hybrid]
      Note: To get accurate scalar fast results, please choose 'vector' or 'hybrid' without INC_VEC=y in config
  3.2 Choose the area of measurement; the default is rxtx
      set_rxtx_anchor [rxtx|rxonly|txonly]
4. Run and wait for the result
    pmd_perf_autotest

For those who simply want to see how many cycles each packet costs:
compile DPDK, run 'app/test', and type 'pmd_perf_autotest'. That's it; nothing else needs to be configured.
Use the other options once you understand them and want to measure more.

Cunming Liang (3):
  app/test: allow to create packets in different sizes
  app/test: measure the cost of rx/tx routines by cycle number
  ethdev: fix wrong error return refer to API definition

 app/test/Makefile                   |    1 +
 app/test/commands.c                 |  111 +
 app/test/packet_burst_generator.c   |   26 +-
 app/test/packet_burst_generator.h   |   11 +-
 app/test/test.h                     |    6 +
 app/test/test_link_bonding.c        |   39 +-
 app/test/test_pmd_perf.c            |  922 +++
 lib/librte_ether/rte_ethdev.c       |   10 +-
 lib/librte_ether/rte_ether.h        |   25 +
 lib/librte_pmd_ixgbe/ixgbe_ethdev.c |    6 +
 10 files changed, 1121 insertions(+), 36 deletions(-)
 create mode 100644 app/test/test_pmd_perf.c

-- 
1.7.4.1
[dpdk-dev] [PATCH v5 1/3] app/test: allow to create packets in different sizes
Add support to allow the packet burst generator to create packets in different sizes.

Signed-off-by: Cunming Liang
Acked-by: Declan Doherty
---
 app/test/packet_burst_generator.c |   26
 app/test/packet_burst_generator.h |   11 +++--
 app/test/test_link_bonding.c      |   39
 3 files changed, 43 insertions(+), 33 deletions(-)

diff --git a/app/test/packet_burst_generator.c b/app/test/packet_burst_generator.c
index 9e747a4..017139b 100644
--- a/app/test/packet_burst_generator.c
+++ b/app/test/packet_burst_generator.c
@@ -191,20 +191,12 @@ initialize_ipv4_header(struct ipv4_hdr *ip_hdr, uint32_t src_addr,
  */
 #define RTE_MAX_SEGS_PER_PKT 255 /**< pkt.nb_segs is a 8-bit unsigned char. */
 
-#define TXONLY_DEF_PACKET_LEN 64
-#define TXONLY_DEF_PACKET_LEN_128 128
-
-uint16_t tx_pkt_length = TXONLY_DEF_PACKET_LEN;
-uint16_t tx_pkt_seg_lengths[RTE_MAX_SEGS_PER_PKT] = {
-		TXONLY_DEF_PACKET_LEN_128,
-};
-
-uint8_t tx_pkt_nb_segs = 1;
-
 int
 generate_packet_burst(struct rte_mempool *mp, struct rte_mbuf **pkts_burst,
-		struct ether_hdr *eth_hdr, uint8_t vlan_enabled, void *ip_hdr,
-		uint8_t ipv4, struct udp_hdr *udp_hdr, int nb_pkt_per_burst)
+		struct ether_hdr *eth_hdr, uint8_t vlan_enabled,
+		void *ip_hdr, uint8_t ipv4, struct udp_hdr *udp_hdr,
+		int nb_pkt_per_burst, uint8_t pkt_len,
+		uint8_t nb_pkt_segs)
 {
 	int i, nb_pkt = 0;
 	size_t eth_hdr_size;
@@ -221,9 +213,9 @@ nomore_mbuf:
 			break;
 		}
 
-		pkt->data_len = tx_pkt_seg_lengths[0];
+		pkt->data_len = pkt_len;
 		pkt_seg = pkt;
-		for (i = 1; i < tx_pkt_nb_segs; i++) {
+		for (i = 1; i < nb_pkt_segs; i++) {
 			pkt_seg->next = rte_pktmbuf_alloc(mp);
 			if (pkt_seg->next == NULL) {
 				pkt->nb_segs = i;
@@ -231,7 +223,7 @@ nomore_mbuf:
 				goto nomore_mbuf;
 			}
 			pkt_seg = pkt_seg->next;
-			pkt_seg->data_len = tx_pkt_seg_lengths[i];
+			pkt_seg->data_len = pkt_len;
 		}
 		pkt_seg->next = NULL; /* Last segment of packet. */
 
@@ -259,8 +251,8 @@ nomore_mbuf:
 		 * Complete first mbuf of packet and append it to the
 		 * burst of packets to be transmitted.
 		 */
-		pkt->nb_segs = tx_pkt_nb_segs;
-		pkt->pkt_len = tx_pkt_length;
+		pkt->nb_segs = nb_pkt_segs;
+		pkt->pkt_len = pkt_len;
 		pkt->l2_len = eth_hdr_size;
 
 		if (ipv4) {
diff --git a/app/test/packet_burst_generator.h b/app/test/packet_burst_generator.h
index 5b3cd6c..fe992ac 100644
--- a/app/test/packet_burst_generator.h
+++ b/app/test/packet_burst_generator.h
@@ -47,10 +47,13 @@ extern "C" {
 #define IPV4_ADDR(a, b, c, d)(((a & 0xff) << 24) | ((b & 0xff) << 16) | \
 		((c & 0xff) << 8) | (d & 0xff))
 
+#define PACKET_BURST_GEN_PKT_LEN 60
+#define PACKET_BURST_GEN_PKT_LEN_128 128
 
 void
 initialize_eth_header(struct ether_hdr *eth_hdr, struct ether_addr *src_mac,
-		struct ether_addr *dst_mac, uint8_t vlan_enabled, uint16_t van_id);
+		struct ether_addr *dst_mac, uint8_t vlan_enabled,
+		uint16_t van_id);
 
 uint16_t
 initialize_udp_header(struct udp_hdr *udp_hdr, uint16_t src_port,
@@ -67,8 +70,10 @@ initialize_ipv4_header(struct ipv4_hdr *ip_hdr, uint32_t src_addr,
 
 int
 generate_packet_burst(struct rte_mempool *mp, struct rte_mbuf **pkts_burst,
-		struct ether_hdr *eth_hdr, uint8_t vlan_enabled, void *ip_hdr,
-		uint8_t ipv4, struct udp_hdr *udp_hdr, int nb_pkt_per_burst);
+		struct ether_hdr *eth_hdr, uint8_t vlan_enabled,
+		void *ip_hdr, uint8_t ipv4, struct udp_hdr *udp_hdr,
+		int nb_pkt_per_burst, uint8_t pkt_len,
+		uint8_t nb_pkt_segs);
 
 #ifdef __cplusplus
 }
diff --git a/app/test/test_link_bonding.c b/app/test/test_link_bonding.c
index 214d2a2..d407e4f 100644
--- a/app/test/test_link_bonding.c
+++ b/app/test/test_link_bonding.c
@@ -1192,9 +1192,12 @@ generate_test_burst(struct rte_mbuf **pkts_burst, uint16_t burst_size,
 	}
 
 	/* Generate burst of packets to transmit */
-	generated_burst_size = generate_packet_burst(test_params->mbuf_pool,
-			pkts_burst, test_params->pkt_eth_hdr, vlan, ip_hdr, ipv4,
-			test_params->pkt_udp_hdr, burst_size);
+	generated_burst_size =
+		generate_packet_burst(test_params->mbuf_pool,
+			pkts_burst, test_params->pkt_eth_hdr,
[dpdk-dev] [PATCH v5 2/3] app/test: measure the cost of rx/tx routines by cycle number
The unit test measures cycles per packet for the different rx/tx routines.
The NIC works in loopback mode, so no test equipment is required to measure throughput.
As a result, the unit test reports the average number of cycles consumed per packet.
When doing the test, make sure the link is UP.

Usage Example:
1. Run unit test app in interactive mode
    app/test -c f -n 4 -- -i
2. Run and wait for the result
    pmd_perf_autotest

There is an option to choose the rx/tx pair; the default is vector.
    set_rxtx_mode [vector|scalar|full|hybrid]
Note: To get accurate scalar fast results, please choose 'vector' or 'hybrid' without INC_VEC=y in the config.

It supports measuring rx or tx standalone.
Usage Example:
Choose rx or tx standalone, default is both
    set_rxtx_anchor [rxtx|rxonly|txonly]

It also supports measuring standalone RX burst cycles.
In this mode, received packets are not re-sent repeatedly.
It currently measures two situations: poll before xmit and poll after xmit (with or without desc. cache conflict).
Usage Example:
Set stream control mode, by default is continuous
    set_rxtx_sc [continuous|poll_before_xmit|poll_after_xmit]

Signed-off-by: Cunming Liang
Acked-by: Bruce Richardson
---
 app/test/Makefile                   |    1 +
 app/test/commands.c                 |  111 +
 app/test/test.h                     |    6 +
 app/test/test_pmd_perf.c            |  922 +++
 lib/librte_ether/rte_ether.h        |   25 +
 lib/librte_pmd_ixgbe/ixgbe_ethdev.c |    6 +
 6 files changed, 1071 insertions(+), 0 deletions(-)
 create mode 100644 app/test/test_pmd_perf.c

diff --git a/app/test/Makefile b/app/test/Makefile
index 6af6d76..ebfa0ba 100644
--- a/app/test/Makefile
+++ b/app/test/Makefile
@@ -56,6 +56,7 @@ SRCS-y += test_memzone.c
 
 SRCS-y += test_ring.c
 SRCS-y += test_ring_perf.c
+SRCS-y += test_pmd_perf.c
 
 ifeq ($(CONFIG_RTE_LIBRTE_TABLE),y)
 SRCS-y += test_table.c
diff --git a/app/test/commands.c b/app/test/commands.c
index a9e36b1..92a17ed 100644
--- a/app/test/commands.c
+++ b/app/test/commands.c
@@ -310,12 +310,123 @@ cmdline_parse_inst_t cmd_quit = {
 
 //
 
+struct cmd_set_rxtx_result {
+	cmdline_fixed_string_t set;
+	cmdline_fixed_string_t mode;
+};
+
+static void cmd_set_rxtx_parsed(void *parsed_result, struct cmdline *cl,
+				__attribute__((unused)) void *data)
+{
+	struct cmd_set_rxtx_result *res = parsed_result;
+	if (test_set_rxtx_conf(res->mode) < 0)
+		cmdline_printf(cl, "Cannot find such mode\n");
+}
+
+cmdline_parse_token_string_t cmd_set_rxtx_set =
+	TOKEN_STRING_INITIALIZER(struct cmd_set_rxtx_result, set,
+				 "set_rxtx_mode");
+
+cmdline_parse_token_string_t cmd_set_rxtx_mode =
+	TOKEN_STRING_INITIALIZER(struct cmd_set_rxtx_result, mode, NULL);
+
+cmdline_parse_inst_t cmd_set_rxtx = {
+	.f = cmd_set_rxtx_parsed,  /* function to call */
+	.data = NULL,      /* 2nd arg of func */
+	.help_str = "set rxtx routine: "
+			"set_rxtx ",
+	.tokens = {        /* token list, NULL terminated */
+		(void *)&cmd_set_rxtx_set,
+		(void *)&cmd_set_rxtx_mode,
+		NULL,
+	},
+};
+
+//
+
+struct cmd_set_rxtx_anchor {
+	cmdline_fixed_string_t set;
+	cmdline_fixed_string_t type;
+};
+
+static void
+cmd_set_rxtx_anchor_parsed(void *parsed_result,
+			   struct cmdline *cl,
+			   __attribute__((unused)) void *data)
+{
+	struct cmd_set_rxtx_anchor *res = parsed_result;
+	if (test_set_rxtx_anchor(res->type) < 0)
+		cmdline_printf(cl, "Cannot find such anchor\n");
+}
+
+cmdline_parse_token_string_t cmd_set_rxtx_anchor_set =
+	TOKEN_STRING_INITIALIZER(struct cmd_set_rxtx_anchor, set,
+				 "set_rxtx_anchor");
+
+cmdline_parse_token_string_t cmd_set_rxtx_anchor_type =
+	TOKEN_STRING_INITIALIZER(struct cmd_set_rxtx_anchor, type, NULL);
+
+cmdline_parse_inst_t cmd_set_rxtx_anchor = {
+	.f = cmd_set_rxtx_anchor_parsed,  /* function to call */
+	.data = NULL,      /* 2nd arg of func */
+	.help_str = "set rxtx anchor: "
+			"set_rxtx_anchor ",
+	.tokens = {        /* token list, NULL terminated */
+		(void *)&cmd_set_rxtx_anchor_set,
+		(void *)&cmd_set_rxtx_anchor_type,
+		NULL,
+	},
+};
+
+//
+
+/* for stream control */
+struct cmd_set_rxtx_sc {
+	cmdline_fixed_string_t set;
+	cmdline_fixed_string_t type;
+};
+
+static void
+cmd_set_rxtx_sc_parsed(void *parsed_result,
+		       struct cmdline *cl,
+		       __attribute__((unused)) void *data)
+{
+	struct cmd_set_rxtx_sc *res = parsed_result;
+	if (test_set_rxtx_sc(res->type) < 0)
+		cmdline_printf(cl, "Cannot find such stream control\n");
+}
+
+cmdline_parse_toke
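For readers following along, the arithmetic behind "average cycles per packet" can be sketched as below. This is only an illustration, not code from the patch: the frame-size constants are the ones quoted earlier in the thread from v2, and the function names `line_rate_pps` and `cycles_per_packet` are made up for this sketch. How v5 of test_pmd_perf.c actually computes its figures is not shown in the excerpt above.

```c
#include <assert.h>
#include <stdint.h>

/* Per-frame sizes on the wire, as quoted from v2 earlier in the thread. */
#define PACKET_SIZE  64  /* bytes per frame */
#define FRAME_GAP    12  /* bytes, inter-frame gap */
#define MAC_PREAMBLE 8   /* bytes, preamble */

/* Theoretical maximum packets per second for a link speed given in bit/s. */
uint64_t
line_rate_pps(uint64_t link_bps)
{
	uint64_t bits_per_frame = (PACKET_SIZE + FRAME_GAP + MAC_PREAMBLE) * 8;

	return link_bps / bits_per_frame;
}

/* Average cycles spent per packet over one measurement window. */
uint64_t
cycles_per_packet(uint64_t start_tsc, uint64_t end_tsc, uint64_t packets)
{
	assert(end_tsc >= start_tsc);
	if (packets == 0)
		return 0; /* nothing measured; avoid dividing by zero */
	return (end_tsc - start_tsc) / packets;
}
```

With these constants each 64-byte frame occupies 84 bytes on the wire, so a 10 Gbit/s link tops out a little under 14.9 million packets per second.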
[dpdk-dev] [PATCH v5 3/3] ethdev: fix wrong error return refer to API definition
By definition, rte_eth_rx_burst, rte_eth_tx_burst and rte_eth_rx_queue_count return a packet count.
When RTE_LIBRTE_ETHDEV_DEBUG is turned on, the retval passed to FUNC_PTR_OR_ERR_RET was -ENOTSUP, which is confusing for functions that return a count.
With this patch they always return 0, whether there was no packet or an error occurred.
Meanwhile, rte_errno is set in these checks.

Signed-off-by: Cunming Liang
---
 lib/librte_ether/rte_ethdev.c | 10 +++---
 1 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 50f10d9..6675f28 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -81,12 +81,14 @@
 /* Macros for checking for restricting functions to primary instance only */
 #define PROC_PRIMARY_OR_ERR_RET(retval) do { \
 	if (rte_eal_process_type() != RTE_PROC_PRIMARY) { \
+		rte_errno = -E_RTE_SECONDARY; \
 		PMD_DEBUG_TRACE("Cannot run in secondary processes\n"); \
 		return (retval); \
 	} \
 } while(0)
 #define PROC_PRIMARY_OR_RET() do { \
 	if (rte_eal_process_type() != RTE_PROC_PRIMARY) { \
+		rte_errno = -E_RTE_SECONDARY; \
 		PMD_DEBUG_TRACE("Cannot run in secondary processes\n"); \
 		return; \
 	} \
@@ -95,12 +97,14 @@
 /* Macros to check for invlaid function pointers in dev_ops structure */
 #define FUNC_PTR_OR_ERR_RET(func, retval) do { \
 	if ((func) == NULL) { \
+		rte_errno = -ENOTSUP; \
 		PMD_DEBUG_TRACE("Function not supported\n"); \
 		return (retval); \
 	} \
 } while(0)
 #define FUNC_PTR_OR_RET(func) do { \
 	if ((func) == NULL) { \
+		rte_errno = -ENOTSUP; \
 		PMD_DEBUG_TRACE("Function not supported\n"); \
 		return; \
 	} \
@@ -2530,7 +2534,7 @@ rte_eth_rx_burst(uint8_t port_id, uint16_t queue_id,
 		return 0;
 	}
 	dev = &rte_eth_devices[port_id];
-	FUNC_PTR_OR_ERR_RET(*dev->rx_pkt_burst, -ENOTSUP);
+	FUNC_PTR_OR_ERR_RET(*dev->rx_pkt_burst, 0);
 	if (queue_id >= dev->data->nb_rx_queues) {
 		PMD_DEBUG_TRACE("Invalid RX queue_id=%d\n", queue_id);
 		return 0;
@@ -2551,7 +2555,7 @@ rte_eth_tx_burst(uint8_t port_id, uint16_t queue_id,
 	}
 
 	dev = &rte_eth_devices[port_id];
-	FUNC_PTR_OR_ERR_RET(*dev->tx_pkt_burst, -ENOTSUP);
+	FUNC_PTR_OR_ERR_RET(*dev->tx_pkt_burst, 0);
 	if (queue_id >= dev->data->nb_tx_queues) {
 		PMD_DEBUG_TRACE("Invalid TX queue_id=%d\n", queue_id);
 		return 0;
@@ -2570,7 +2574,7 @@ rte_eth_rx_queue_count(uint8_t port_id, uint16_t queue_id)
 		return 0;
 	}
 	dev = &rte_eth_devices[port_id];
-	FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_count, -ENOTSUP);
+	FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_count, 0);
 	return (*dev->dev_ops->rx_queue_count)(dev, queue_id);
 }
 
-- 
1.7.4.1
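The pattern this patch moves to (always return a count, report the reason out of band) might look roughly like the standalone sketch below. It does not use DPDK: `sketch_errno` stands in for rte_errno, and `struct sketch_dev`, `sketch_rx_burst` and `sketch_rx_four` are hypothetical names. Resetting the error variable at the start of each call is an addition of this sketch, not something the patch does.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for rte_errno; a plain global, purely for illustration. */
int sketch_errno;

/* Hypothetical device with an optional receive callback. */
struct sketch_dev {
	uint16_t (*rx_burst)(void **pkts, uint16_t n);
};

/*
 * Always returns a packet count. A missing callback yields 0 and records
 * the reason in sketch_errno (as a negative value, the way the patch writes it).
 */
uint16_t
sketch_rx_burst(const struct sketch_dev *dev, void **pkts, uint16_t n)
{
	assert(dev != NULL);
	sketch_errno = 0; /* assumption of this sketch: clear stale errors */
	if (dev->rx_burst == NULL) {
		sketch_errno = -ENOTSUP;
		return 0;
	}
	return dev->rx_burst(pkts, n);
}

/* Example callback that pretends to receive at most 4 packets. */
uint16_t
sketch_rx_four(void **pkts, uint16_t n)
{
	(void)pkts;
	return n < 4 ? n : 4;
}
```

A caller that sees 0 can then check the error variable to tell "no packets" apart from "not supported", instead of misreading a negative value squeezed into an unsigned count.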
[dpdk-dev] [PATCH v4 0/3] app/test: unit test to measure cycles per packet
Sorry, just ignore this version.

> -----Original Message-----
> From: Liang, Cunming
> Sent: Friday, October 24, 2014 1:40 PM
> To: dev at dpdk.org
> Cc: nhorman at tuxdriver.com; Richardson, Bruce; Ananyev, Konstantin; De Lara
> Guarch, Pablo; Liang, Cunming
> Subject: [PATCH v4 0/3] app/test: unit test to measure cycles per packet
> Importance: High
>
> BTW, [1/3] is the same patch as the one below.
> http://dpdk.org/dev/patchwork/patch/817
>
> v4 update:
> # fix the confusing retval in some APIs of rte_ethdev
>
> v3 update:
> # Code refined according to the feedback.
>   1. add ether_format_addr to rte_ether.h
>   2. fix typo in code comments.
>   3. %lu to %PRIu64, fixing 32-bit targets compilation err
> # merge 2 small incremental patches to the first one.
>   The whole unit test as a single patch in [PATCH v3 2/2]
> # rebase code to the latest master
>
> v2 update:
> Rebase code to the latest master branch.
>
> It provides a unit test to measure cycles/packet in NIC loopback mode.
> It simply gives the average cycles of IO used per packet without test
> equipment.
> When doing the test, make sure the link is UP.
>
> Two stream control modes are supported: continuous and burst.
> The former keeps forwarding the injected packets until a certain
> number is reached.
> The latter stops when all the injected packets are received.
> In burst stream, it now measures two situations, with or without desc. cache
> conflict.
> By default, it runs in continuous stream mode to measure the whole rxtx.
>
> Usage Example:
> 1. Run unit test app in interactive mode
>     app/test -c f -n 4 -- -i
> 2. Set stream control mode, by default is continuous
>     set_rxtx_sc [continuous|poll_before_xmit|poll_after_xmit]
> 3. If continuous stream is chosen, two more options can be configured
>     3.1 choose rx/tx pair, default is vector
>         set_rxtx_mode [vector|scalar|full|hybrid]
>         Note: To get accurate scalar fast results, please choose 'vector' or
>         'hybrid' without INC_VEC=y in config
>     3.2 choose the area of measurement, default is rxtx
>         set_rxtx_anchor [rxtx|rxonly|txonly]
> 4. Run and wait for the result
>     pmd_perf_autotest
>
> For those who simply want to see how many cycles each packet costs:
> compile DPDK, run 'app/test', and type 'pmd_perf_autotest', that's it.
> Nothing else needs to be configured.
> Use the other options once you understand them and want to measure more.
>
>
> *** BLURB HERE ***
>
> Cunming Liang (3):
>   app/test: allow to create packets in different sizes
>   app/test: measure the cost of rx/tx routines by cycle number
>   ethdev: fix wrong error return refer to API definition
>
>  app/test/Makefile                   |    1 +
>  app/test/commands.c                 |  111 +
>  app/test/packet_burst_generator.c   |   26 +-
>  app/test/packet_burst_generator.h   |   11 +-
>  app/test/test.h                     |    6 +
>  app/test/test_link_bonding.c        |   39 +-
>  app/test/test_pmd_perf.c            |  922 +++
>  lib/librte_ether/rte_ethdev.c       |    6 +-
>  lib/librte_ether/rte_ether.h        |   25 +
>  lib/librte_pmd_ixgbe/ixgbe_ethdev.c |    6 +
>  10 files changed, 1117 insertions(+), 36 deletions(-)
>  create mode 100644 app/test/test_pmd_perf.c
>
> --
> 1.7.4.1
[dpdk-dev] [PATCH] eal: replace strict_strtoul with kstrtoul
Since upstream kernel commit 3db2e9cd, the strict_strto* family of functions
has been removed, so we should use kstrtoul directly instead.

Signed-off-by: Jincheng Miao
---
 lib/librte_eal/linuxapp/igb_uio/igb_uio.c       | 4 ++--
 lib/librte_eal/linuxapp/kni/kni_vhost.c         | 2 +-
 lib/librte_eal/linuxapp/xen_dom0/dom0_mm_misc.c | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/lib/librte_eal/linuxapp/igb_uio/igb_uio.c b/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
index d1ca26e..47ff2f3 100644
--- a/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
+++ b/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
@@ -83,7 +83,7 @@ store_max_vfs(struct device *dev, struct device_attribute *attr,
 	unsigned long max_vfs;
 	struct pci_dev *pdev = container_of(dev, struct pci_dev, dev);
 
-	if (0 != strict_strtoul(buf, 0, &max_vfs))
+	if (0 != kstrtoul(buf, 0, &max_vfs))
 		return -EINVAL;
 
 	if (0 == max_vfs)
@@ -174,7 +174,7 @@ store_max_read_request_size(struct device *dev,
 	unsigned long size = 0;
 	int ret;
 
-	if (strict_strtoul(buf, 0, &size) != 0)
+	if (0 != kstrtoul(buf, 0, &size))
 		return -EINVAL;
 
 	ret = pcie_set_readrq(pci_dev, (int)size);
diff --git a/lib/librte_eal/linuxapp/kni/kni_vhost.c b/lib/librte_eal/linuxapp/kni/kni_vhost.c
index fe512c2..ba0c1ac 100644
--- a/lib/librte_eal/linuxapp/kni/kni_vhost.c
+++ b/lib/librte_eal/linuxapp/kni/kni_vhost.c
@@ -739,7 +739,7 @@ set_sock_en(struct device *dev, struct device_attribute *attr,
 	unsigned long en;
 	int err = 0;
 
-	if (0 != strict_strtoul(buf, 0, &en))
+	if (0 != kstrtoul(buf, 0, &en))
 		return -EINVAL;
 
 	if (en)
diff --git a/lib/librte_eal/linuxapp/xen_dom0/dom0_mm_misc.c b/lib/librte_eal/linuxapp/xen_dom0/dom0_mm_misc.c
index dfb271d..8a3727d 100644
--- a/lib/librte_eal/linuxapp/xen_dom0/dom0_mm_misc.c
+++ b/lib/librte_eal/linuxapp/xen_dom0/dom0_mm_misc.c
@@ -123,7 +123,7 @@ store_memsize(struct device *dev, struct device_attribute *attr,
 	int err = 0;
 	unsigned long mem_size;
 
-	if (0 != strict_strtoul(buf, 0, &mem_size))
+	if (0 != kstrtoul(buf, 0, &mem_size))
 		return -EINVAL;
 
 	mutex_lock(&dom0_dev.data_lock);
-- 
1.9.3
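As a side note for anyone unfamiliar with the kernel helper: kstrtoul() returns 0 on success and a negative errno on failure, and it tolerates a single trailing newline, which suits sysfs store handlers like the ones touched here. A rough userspace analogue, built on strtoul() and using the hypothetical name `parse_ulong_strict`, could look like this. It is a sketch of the strict-parse idea only, not the kernel implementation.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Parse the whole string as an unsigned long.
 * Returns 0 on success, -EINVAL on malformed input, -ERANGE on overflow.
 */
int
parse_ulong_strict(const char *s, int base, unsigned long *out)
{
	char *end;
	unsigned long v;

	/* strtoul() would skip whitespace and accept '-'; reject both up front. */
	if (s == NULL || out == NULL || *s == '-' || isspace((unsigned char)*s))
		return -EINVAL;

	errno = 0;
	v = strtoul(s, &end, base);
	if (end == s)
		return -EINVAL; /* no digits were consumed */
	if (errno == ERANGE)
		return -ERANGE;
	if (*end == '\n')
		end++; /* tolerate one trailing newline, e.g. from "echo" */
	if (*end != '\0')
		return -EINVAL; /* trailing garbage */

	*out = v;
	return 0;
}
```

Passing base 0, as the call sites in the patch do, lets the prefix select decimal, octal or hexadecimal.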
[dpdk-dev] [PATCH] doc: fix a typo
Signed-off-by: Jincheng Miao
---
 doc/guides/linux_gsg/sys_reqs.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/guides/linux_gsg/sys_reqs.rst b/doc/guides/linux_gsg/sys_reqs.rst
index 6a03f54..c14411e 100755
--- a/doc/guides/linux_gsg/sys_reqs.rst
+++ b/doc/guides/linux_gsg/sys_reqs.rst
@@ -267,7 +267,7 @@ Use the following command (assuming that 2048 MB is required):
 
 .. code-block:: console
 
-    echo 2048 /sys/kernel/mm/dom0-mm/memsize-mB/memsize
+    echo 2048 > /sys/kernel/mm/dom0-mm/memsize-mB/memsize
 
 The user can also check how much memory has already been used:
 
-- 
1.9.3
[dpdk-dev] [PATCH] kni: fix building on Ubuntu-hybrids
2014-10-23 16:39, Alexander Guy:
> In the case where a userspace reports itself as Ubuntu, but the
> kernel isn't providing the expected version signature interface,
> turn off Ubuntu specializations.
>
> This situation happens often enough in development environments,
> and with multi-distribution build servers (e.g. chroot, containers).
[...]
> -ifeq ($(shell lsb_release -si 2>/dev/null),Ubuntu)
> +ifeq ($(shell test -f /proc/version_signature && lsb_release -si 2>/dev/null),Ubuntu)

Could you please explain what the file /proc/version_signature is
and why it can be used as a check for an Ubuntu kernel?

Thanks
-- 
Thomas
[dpdk-dev] [PATCH v2 0/4] support VF MAC filter on Fortville
The patch set enhances the configurability of the MAC filter and adds VF MAC filter support on Fortville.
It mainly includes:
 - The following filter types are configurable:
   1. Perfect match of MAC address
   2. Perfect match of MAC address and VLAN ID
   3. Hash match of MAC address
   4. Hash match of MAC address and perfect match of VLAN ID
 - Support for perfect and hash match of unicast and multicast MAC addresses for VF on i40e

v2 updates:
 * Integrate the v1 patch set into the new filter framework.
 * Optimize MAC filter data structures in rte_eth_ctrl.h file.

jijiangl (4):
  Expand data structures of MAC filter in rte_eth_ctrl.h file.
  Expand MAC filter implementation in i40e.
  Support VF MAC filter in i40e.
  Test VF MAC filter in testpmd

 app/test-pmd/cmdline.c            | 119 +++-
 lib/librte_ether/rte_eth_ctrl.h   |  23 +++
 lib/librte_pmd_i40e/i40e_ethdev.c | 283 -
 lib/librte_pmd_i40e/i40e_ethdev.h |  18 ++-
 lib/librte_pmd_i40e/i40e_pf.c     |   7 +-
 5 files changed, 404 insertions(+), 46 deletions(-)

-- 
1.7.7.6
[dpdk-dev] [PATCH v2 1/4] librte_ether:extend MAC filter data structures
Add the data definitions for the MAC filter enhancement in the rte_eth_ctrl.h file.

Signed-off-by: Jijiang Liu
---
 lib/librte_ether/rte_eth_ctrl.h | 23 +++
 1 files changed, 23 insertions(+), 0 deletions(-)

diff --git a/lib/librte_ether/rte_eth_ctrl.h b/lib/librte_ether/rte_eth_ctrl.h
index df21ac6..699ed2e 100644
--- a/lib/librte_ether/rte_eth_ctrl.h
+++ b/lib/librte_ether/rte_eth_ctrl.h
@@ -51,6 +51,7 @@ extern "C" {
  */
 enum rte_filter_type {
 	RTE_ETH_FILTER_NONE = 0,
+	RTE_ETH_FILTER_MACVLAN,
 	RTE_ETH_FILTER_MAX
 };
 
@@ -71,6 +72,28 @@ enum rte_filter_op {
 	RTE_ETH_FILTER_OP_MAX
 };
 
+/**
+ * MAC filter type
+ */
+enum rte_mac_filter_type {
+	RTE_MAC_PERFECT_MATCH = 1, /**< exact match of MAC addr. */
+	RTE_MACVLAN_PERFECT_MATCH,
+	/**< exact match of MAC addr and VLAN ID. */
+	RTE_MAC_HASH_MATCH, /**< hash match of MAC addr. */
+	RTE_MACVLAN_HASH_MATCH,
+	/**< hash match of MAC addr and exact match of VLAN ID. */
+};
+
+/**
+ * MAC filter info
+ */
+struct rte_eth_mac_filter {
+	uint8_t is_vf; /**< 1 for VF, 0 for port dev */
+	uint16_t dst_id; /**
[dpdk-dev] [PATCH v2 2/4] i40e:expand MAC filter implementation in i40e
This patch mainly optimizes the i40e_add_macvlan_filters() and the i40e_remove_macvlan_filters() functions in order that we are able to provide filter type configuration. And another relevant MAC filter codes are changed based on new data structures. Signed-off-by: Jijiang Liu --- lib/librte_pmd_i40e/i40e_ethdev.c | 165 lib/librte_pmd_i40e/i40e_ethdev.h | 18 - lib/librte_pmd_i40e/i40e_pf.c |7 ++- 3 files changed, 149 insertions(+), 41 deletions(-) diff --git a/lib/librte_pmd_i40e/i40e_ethdev.c b/lib/librte_pmd_i40e/i40e_ethdev.c index 3b75f0f..5fae0e1 100644 --- a/lib/librte_pmd_i40e/i40e_ethdev.c +++ b/lib/librte_pmd_i40e/i40e_ethdev.c @@ -1529,6 +1529,7 @@ i40e_macaddr_add(struct rte_eth_dev *dev, { struct i40e_pf *pf = I40E_DEV_PRIVATE_TO_PF(dev->data->dev_private); struct i40e_hw *hw = I40E_DEV_PRIVATE_TO_HW(dev->data->dev_private); + struct i40e_mac_filter_info mac_filter; struct i40e_vsi *vsi = pf->main_vsi; struct ether_addr old_mac; int ret; @@ -1554,8 +1555,10 @@ i40e_macaddr_add(struct rte_eth_dev *dev, (void)rte_memcpy(&old_mac, hw->mac.addr, ETHER_ADDR_LEN); (void)rte_memcpy(hw->mac.addr, mac_addr->addr_bytes, ETHER_ADDR_LEN); + (void)rte_memcpy(&mac_filter.mac_addr, mac_addr, ETHER_ADDR_LEN); + mac_filter.filter_type = RTE_MACVLAN_PERFECT_MATCH; - ret = i40e_vsi_add_mac(vsi, mac_addr); + ret = i40e_vsi_add_mac(vsi, &mac_filter); if (ret != I40E_SUCCESS) { PMD_DRV_LOG(ERR, "Failed to add MACVLAN filter"); return; @@ -2472,6 +2475,7 @@ i40e_update_default_filter_setting(struct i40e_vsi *vsi) { struct i40e_hw *hw = I40E_VSI_TO_HW(vsi); struct i40e_aqc_remove_macvlan_element_data def_filter; + struct i40e_mac_filter_info filter; int ret; if (vsi->type != I40E_VSI_MAIN) @@ -2485,6 +2489,7 @@ i40e_update_default_filter_setting(struct i40e_vsi *vsi) ret = i40e_aq_remove_macvlan(hw, vsi->seid, &def_filter, 1, NULL); if (ret != I40E_SUCCESS) { struct i40e_mac_filter *f; + struct ether_addr *mac; PMD_DRV_LOG(WARNING, "Cannot remove the default " "macvlan 
filter"); @@ -2494,15 +2499,18 @@ i40e_update_default_filter_setting(struct i40e_vsi *vsi) PMD_DRV_LOG(ERR, "failed to allocate memory"); return I40E_ERR_NO_MEMORY; } - (void)rte_memcpy(&f->macaddr.addr_bytes, hw->mac.perm_addr, + mac = &f->mac_info.mac_addr; + (void)rte_memcpy(&mac->addr_bytes, hw->mac.perm_addr, ETH_ADDR_LEN); TAILQ_INSERT_TAIL(&vsi->mac_list, f, next); vsi->mac_num++; return ret; } - - return i40e_vsi_add_mac(vsi, (struct ether_addr *)(hw->mac.perm_addr)); + (void)rte_memcpy(&filter.mac_addr, + (struct ether_addr *)(hw->mac.perm_addr), ETH_ADDR_LEN); + filter.filter_type = RTE_MACVLAN_PERFECT_MATCH; + return i40e_vsi_add_mac(vsi, &filter); } static int @@ -2556,6 +2564,7 @@ i40e_vsi_setup(struct i40e_pf *pf, { struct i40e_hw *hw = I40E_PF_TO_HW(pf); struct i40e_vsi *vsi; + struct i40e_mac_filter_info filter; int ret; struct i40e_vsi_context ctxt; struct ether_addr broadcast = @@ -2766,7 +2775,10 @@ i40e_vsi_setup(struct i40e_pf *pf, } /* MAC/VLAN configuration */ - ret = i40e_vsi_add_mac(vsi, &broadcast); + (void)rte_memcpy(&filter.mac_addr, &broadcast, ETHER_ADDR_LEN); + filter.filter_type = RTE_MACVLAN_PERFECT_MATCH; + + ret = i40e_vsi_add_mac(vsi, &filter); if (ret != I40E_SUCCESS) { PMD_DRV_LOG(ERR, "Failed to add MACVLAN filter"); goto fail_msix_alloc; @@ -3467,6 +3479,7 @@ i40e_add_macvlan_filters(struct i40e_vsi *vsi, { int ele_num, ele_buff_size; int num, actual_num, i; + uint16_t flags; int ret = I40E_SUCCESS; struct i40e_hw *hw = I40E_VSI_TO_HW(vsi); struct i40e_aqc_add_macvlan_element_data *req_list; @@ -3492,9 +3505,31 @@ i40e_add_macvlan_filters(struct i40e_vsi *vsi, &filter[num + i].macaddr, ETH_ADDR_LEN); req_list[i].vlan_tag = rte_cpu_to_le_16(filter[num + i].vlan_id); - req_list[i].flags = rte_cpu_to_le_16(\ - I40E_AQC_MACVLAN_ADD_PERFECT_MATCH); + + switch (filter[num + i].filter_type) { + case RTE_MAC_PERFECT_MATCH: + flags = I40E_AQC_MACVLAN_ADD_PERFECT_MATCH | + I40E_AQC_MACVLAN_ADD_IGNORE_VLAN; +
[dpdk-dev] [PATCH v2 4/4] app/testpmd:test VF MAC filter
Add a test command in testpmd to test VF MAC filter feature. Signed-off-by: Jijiang Liu --- app/test-pmd/cmdline.c | 119 ++- 1 files changed, 116 insertions(+), 3 deletions(-) diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c index 0b972f9..baa968b 100644 --- a/app/test-pmd/cmdline.c +++ b/app/test-pmd/cmdline.c @@ -351,9 +351,14 @@ static void cmd_help_long_parsed(void *parsed_result, "e.g., 'set stat_qmap rx 0 2 5' sets rx queue 2" " on port 0 to mapping 5.\n\n" - "set port (port_id) vf (vf_id) rx|tx on|off \n" + "set port (port_id) vf (vf_id) rx|tx on|off\n" "Enable/Disable a VF receive/tranmit from a port\n\n" + "set port (port_id) vf (vf_id) (mac_addr)" + " (exact-mac#exact-mac-vlan#hashmac|hashmac-vlan) on|off\n" + " Add/Remove unicast or multicast MAC addr filter" + " for a VF.\n\n" + "set port (port_id) vf (vf_id) rxmode (AUPE|ROPE|BAM" "|MPE) (on|off)\n" "AUPE:accepts untagged VLAN;" @@ -5809,6 +5814,112 @@ cmdline_parse_inst_t cmd_set_uc_all_hash_filter = { }, }; +/* *** CONFIGURE MACVLAN FILTER FOR VF(s) *** */ +struct cmd_set_vf_macvlan_filter { + cmdline_fixed_string_t set; + cmdline_fixed_string_t port; + uint8_t port_id; + cmdline_fixed_string_t vf; + uint8_t vf_id; + struct ether_addr address; + cmdline_fixed_string_t filter_type; + cmdline_fixed_string_t mode; +}; + +static void +cmd_set_vf_macvlan_parsed(void *parsed_result, + __attribute__((unused)) struct cmdline *cl, + __attribute__((unused)) void *data) +{ + int is_on, ret = 0; + struct cmd_set_vf_macvlan_filter *res = parsed_result; + struct rte_eth_mac_filter filter; + + memset(&filter, 0, sizeof(struct rte_eth_mac_filter)); + + (void)rte_memcpy(&filter.mac_addr, &res->address, ETHER_ADDR_LEN); + + /* set VF MAC filter */ + filter.is_vf = 1; + + /* set VF ID */ + filter.dst_id = res->vf_id; + + if (!strcmp(res->filter_type, "exact-mac")) + filter.filter_type = RTE_MAC_PERFECT_MATCH; + else if (!strcmp(res->filter_type, "exact-mac-vlan")) + filter.filter_type = 
RTE_MACVLAN_PERFECT_MATCH; + else if (!strcmp(res->filter_type, "hashmac")) + filter.filter_type = RTE_MAC_HASH_MATCH; + else if (!strcmp(res->filter_type, "hashmac-vlan")) + filter.filter_type = RTE_MACVLAN_HASH_MATCH; + + is_on = (strcmp(res->mode, "on") == 0) ? 1 : 0; + + if (is_on) + ret = rte_eth_dev_filter_ctrl(res->port_id, + RTE_ETH_FILTER_MACVLAN, + RTE_ETH_FILTER_ADD, +&filter); + else + ret = rte_eth_dev_filter_ctrl(res->port_id, + RTE_ETH_FILTER_MACVLAN, + RTE_ETH_FILTER_DELETE, + &filter); + + if (ret < 0) + printf("bad set MAC hash parameter, return code = %d\n", ret); + +} + +cmdline_parse_token_string_t cmd_set_vf_macvlan_set = + TOKEN_STRING_INITIALIZER(struct cmd_set_vf_macvlan_filter, +set, "set"); +cmdline_parse_token_string_t cmd_set_vf_macvlan_port = + TOKEN_STRING_INITIALIZER(struct cmd_set_vf_macvlan_filter, +port, "port"); +cmdline_parse_token_num_t cmd_set_vf_macvlan_portid = + TOKEN_NUM_INITIALIZER(struct cmd_set_vf_macvlan_filter, + port_id, UINT8); +cmdline_parse_token_string_t cmd_set_vf_macvlan_vf = + TOKEN_STRING_INITIALIZER(struct cmd_set_vf_macvlan_filter, +vf, "vf"); +cmdline_parse_token_num_t cmd_set_vf_macvlan_vf_id = + TOKEN_NUM_INITIALIZER(struct cmd_set_vf_macvlan_filter, + vf_id, UINT8); +cmdline_parse_token_etheraddr_t cmd_set_vf_macvlan_mac = + TOKEN_ETHERADDR_INITIALIZER(struct cmd_set_vf_macvlan_filter, + address); +cmdline_parse_token_string_t cmd_set_vf_macvlan_filter_type = + TOKEN_STRING_INITIALIZER(struct cmd_set_vf_macvlan_filter, + filter_type, "exact-mac#exact-mac-vlan" + "#hashmac#hashmac-vlan"); +cmdline_parse_token_string_t cmd_set_vf_macvlan_mode = + TOKEN_STRING_INITIALIZER(struct cmd_set_vf_macvlan_filter, +mode, "on#off"); + +cmdline_parse_inst_t cmd_set_vf_macvlan_filter = { + .f = cmd_set_vf_macvlan_parsed, + .data = N
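The strcmp chain in cmd_set_vf_macvlan_parsed() maps the command-line keyword to a filter type. One possible table-driven variant is sketched below as a standalone illustration; the enum is a local mirror of the one proposed in patch 1/4 rather than the DPDK header, and `sketch_parse_filter_type` is a hypothetical helper. Unlike the patch, it reports an unknown keyword explicitly, which is a design choice of this sketch.

```c
#include <stddef.h>
#include <string.h>

/* Local mirror of the enum proposed in patch 1/4; not the DPDK header. */
enum sketch_mac_filter_type {
	SKETCH_MAC_PERFECT_MATCH = 1,
	SKETCH_MACVLAN_PERFECT_MATCH,
	SKETCH_MAC_HASH_MATCH,
	SKETCH_MACVLAN_HASH_MATCH,
};

/* Map a testpmd keyword to a filter type; returns -1 if unknown. */
int
sketch_parse_filter_type(const char *name)
{
	static const struct {
		const char *name;
		int type;
	} map[] = {
		{ "exact-mac",      SKETCH_MAC_PERFECT_MATCH },
		{ "exact-mac-vlan", SKETCH_MACVLAN_PERFECT_MATCH },
		{ "hashmac",        SKETCH_MAC_HASH_MATCH },
		{ "hashmac-vlan",   SKETCH_MACVLAN_HASH_MATCH },
	};
	size_t i;

	if (name == NULL)
		return -1;
	for (i = 0; i < sizeof(map) / sizeof(map[0]); i++) {
		if (strcmp(name, map[i].name) == 0)
			return map[i].type;
	}
	return -1;
}
```

In testpmd the token initializer already restricts the accepted keywords, so the -1 path would mostly matter if the same mapping were reused outside the cmdline parser.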
[dpdk-dev] [PATCH v2 3/4] i40e:add VF MAC filter
It mainly add i40e_vf_mac_filter_set() function to support perfect match and hash match of MAC address and VLAN ID for VF. Signed-off-by: Jijiang Liu --- lib/librte_pmd_i40e/i40e_ethdev.c | 118 - 1 files changed, 116 insertions(+), 2 deletions(-) diff --git a/lib/librte_pmd_i40e/i40e_ethdev.c b/lib/librte_pmd_i40e/i40e_ethdev.c index 5fae0e1..f9e3aa8 100644 --- a/lib/librte_pmd_i40e/i40e_ethdev.c +++ b/lib/librte_pmd_i40e/i40e_ethdev.c @@ -1605,6 +1605,119 @@ i40e_macaddr_remove(struct rte_eth_dev *dev, uint32_t index) memset(&pf->dev_addr, 0, sizeof(struct ether_addr)); } +/* Set perfect match or hash match of MAC and VLAN for a VF */ +static int +i40e_vf_mac_filter_set(struct i40e_pf *pf, +struct rte_eth_mac_filter *filter, +bool add) +{ + struct i40e_hw *hw; + struct i40e_mac_filter_info mac_filter; + struct ether_addr old_mac; + struct ether_addr *new_mac; + struct i40e_pf_vf *vf = NULL; + uint16_t vf_id; + int ret; + + if (pf == NULL) { + PMD_DRV_LOG(ERR, "Invalid PF argument\n"); + return -EINVAL; + } + hw = I40E_PF_TO_HW(pf); + + if (filter == NULL) { + PMD_DRV_LOG(ERR, "Invalid mac filter argument\n"); + return -EINVAL; + } + + new_mac = &filter->mac_addr; + + if (is_zero_ether_addr(new_mac)) { + PMD_DRV_LOG(ERR, "Invalid ethernet address\n"); + return -EINVAL; + } + + vf_id = filter->dst_id; + + if (vf_id > pf->vf_num - 1 || !pf->vfs) { + PMD_DRV_LOG(ERR, "Invalid argument\n"); + return -EINVAL; + } + vf = &pf->vfs[vf_id]; + + if (add && is_same_ether_addr(new_mac, &(pf->dev_addr))) { + PMD_DRV_LOG(INFO, "Ignore adding permanent MAC address\n"); + return -EINVAL; + } + + if (add) { + (void)rte_memcpy(&old_mac, hw->mac.addr, ETHER_ADDR_LEN); + (void)rte_memcpy(hw->mac.addr, new_mac->addr_bytes, + ETHER_ADDR_LEN); + (void)rte_memcpy(&mac_filter.mac_addr, &filter->mac_addr, +ETHER_ADDR_LEN); + + mac_filter.filter_type = filter->filter_type; + ret = i40e_vsi_add_mac(vf->vsi, &mac_filter); + if (ret != I40E_SUCCESS) { + PMD_DRV_LOG(ERR, "Failed to add MAC 
filter\n"); + return -1; + } + ether_addr_copy(new_mac, &pf->dev_addr); + } else { + (void)rte_memcpy(hw->mac.addr, hw->mac.perm_addr, + ETHER_ADDR_LEN); + ret = i40e_vsi_delete_mac(vf->vsi, &filter->mac_addr); + if (ret != I40E_SUCCESS) { + PMD_DRV_LOG(ERR, "Failed to delete MAC filter\n"); + return -1; + } + + /* Clear device address as it has been removed */ + if (is_same_ether_addr(&(pf->dev_addr), new_mac)) + memset(&pf->dev_addr, 0, sizeof(struct ether_addr)); + } + + return 0; +} + +/* MAC filter handle */ +static int +i40e_mac_filter_handle(struct rte_eth_dev *dev, enum rte_filter_op filter_op, + void *arg) +{ + struct i40e_pf *pf = I40E_DEV_PRIVATE_TO_PF(dev->data->dev_private); + struct rte_eth_mac_filter *filter; + struct i40e_hw *hw = I40E_PF_TO_HW(pf); + int ret = I40E_NOT_SUPPORTED; + + filter = (struct rte_eth_mac_filter *)(arg); + + switch (filter_op) { + case RTE_ETH_FILTER_NONE: + ret = I40E_SUCCESS; + break; + case RTE_ETH_FILTER_ADD: + i40e_pf_disable_irq0(hw); + if (filter->is_vf) + ret = i40e_vf_mac_filter_set(pf, filter, 1); + i40e_pf_enable_irq0(hw); + break; + case RTE_ETH_FILTER_DELETE: + i40e_pf_disable_irq0(hw); + if (filter->is_vf) + ret = i40e_vf_mac_filter_set(pf, filter, 0); + i40e_pf_enable_irq0(hw); + break; + default: + PMD_DRV_LOG(ERR, "unknown operation %u\n", filter_op); + ret = I40E_ERR_PARAM; + break; + } + + return ret; +} + static int i40e_dev_rss_reta_update(struct rte_eth_dev *dev, struct rte_eth_rss_reta *reta_conf) @@ -4243,13 +4356,14 @@ i40e_dev_filter_ctrl(struct rte_eth_dev *dev, void *arg) { int ret = 0; - (void)filter_op; - (void)arg; if (dev == NULL) return -EINVAL; switch (filter_type) { + case RTE_ETH_FILTER_MACVLAN: + ret = i40e_mac_filter_handle(dev, filter_op, arg); +break; de
[dpdk-dev] [PATCH 0/3] Vhost app removes dependency of REFCNT
To remove the dependency of vhost zero copy on RTE_MBUF_REFCNT, the mbuf needs
a new EXTERNAL_MBUF flag (in ol_flags) to indicate that it is attached to an
external buffer, say, one from guest space. The external buffer must not be
freed when the mbuf itself is freed in the host. In addition, the RX functions
in the PMDs need to make sure they do not overwrite this flag when filling
ol_flags from the descriptors into the mbuf.

Changchun Ouyang (3):
  mbuf uses EXTERNAL_MBUF in ol_flags to indicate it is an external buffer;
    when freeing such an mbuf, only the mbuf itself is put back into the
    mempool and the attached external buffer is not freed. The user/caller
    needs to take care of detaching and freeing the external buffer.
  Every pmd RX function needs to keep the EXTERNAL_MBUF flag in
    mbuf.ol_flags and must not overwrite it when filling ol_flags from the
    descriptor into the mbuf; otherwise it will probably crash when freeing
    an mbuf and trying to free its attached external buffer, say, from
    guest space.
  vhost zero copy removes the dependency on macro REFCNT by using the
    EXTERNAL_MBUF flag in mbuf.ol_flags to indicate it is an external
    buffer from guest.

 examples/vhost/main.c                 | 19 +--
 lib/librte_mbuf/rte_mbuf.h            |  5 -
 lib/librte_pmd_e1000/igb_rxtx.c       |  5 +++--
 lib/librte_pmd_i40e/i40e_rxtx.c       |  8 +---
 lib/librte_pmd_ixgbe/ixgbe_rxtx.c     |  8 +---
 lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c | 12
 6 files changed, 30 insertions(+), 27 deletions(-)

-- 
1.8.4.2
[dpdk-dev] [PATCH 2/3] pmd: RX function need keep EXTERNAL_MBUF flag
Every pmd RX function need keep the EXTERNAL_MBUF flag in mbuf.ol_flags, and can't overwrite it when filling ol_flags from descriptor to mbuf, otherwise, it probably cause to crash when freeing a mbuf and trying to freeing its attached external buffer, say, from guest space. Signed-off-by: Changchun Ouyang --- lib/librte_pmd_e1000/igb_rxtx.c | 5 +++-- lib/librte_pmd_i40e/i40e_rxtx.c | 8 +--- lib/librte_pmd_ixgbe/ixgbe_rxtx.c | 8 +--- lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c | 12 4 files changed, 21 insertions(+), 12 deletions(-) diff --git a/lib/librte_pmd_e1000/igb_rxtx.c b/lib/librte_pmd_e1000/igb_rxtx.c index f09c525..4123310 100644 --- a/lib/librte_pmd_e1000/igb_rxtx.c +++ b/lib/librte_pmd_e1000/igb_rxtx.c @@ -786,7 +786,7 @@ eth_igb_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, pkt_flags = rx_desc_hlen_type_rss_to_pkt_flags(hlen_type_rss); pkt_flags = pkt_flags | rx_desc_status_to_pkt_flags(staterr); pkt_flags = pkt_flags | rx_desc_error_to_pkt_flags(staterr); - rxm->ol_flags = pkt_flags; + rxm->ol_flags = pkt_flags | (rxm->ol_flags & EXTERNAL_MBUF); /* * Store the mbuf address into the next entry of the array @@ -1020,7 +1020,8 @@ eth_igb_recv_scattered_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, pkt_flags = rx_desc_hlen_type_rss_to_pkt_flags(hlen_type_rss); pkt_flags = pkt_flags | rx_desc_status_to_pkt_flags(staterr); pkt_flags = pkt_flags | rx_desc_error_to_pkt_flags(staterr); - first_seg->ol_flags = pkt_flags; + first_seg->ol_flags = pkt_flags | + (first_seg->ol_flags & EXTERNAL_MBUF); /* Prefetch data of first segment, if configured to do so. 
*/ rte_packet_prefetch((char *)first_seg->buf_addr + diff --git a/lib/librte_pmd_i40e/i40e_rxtx.c b/lib/librte_pmd_i40e/i40e_rxtx.c index 2b53677..68c3695 100644 --- a/lib/librte_pmd_i40e/i40e_rxtx.c +++ b/lib/librte_pmd_i40e/i40e_rxtx.c @@ -637,7 +637,8 @@ i40e_rx_scan_hw_ring(struct i40e_rx_queue *rxq) pkt_flags = i40e_rxd_status_to_pkt_flags(qword1); pkt_flags |= i40e_rxd_error_to_pkt_flags(qword1); pkt_flags |= i40e_rxd_ptype_to_pkt_flags(qword1); - mb->ol_flags = pkt_flags; + mb->ol_flags = pkt_flags | + (mb->ol_flags & EXTERNAL_MBUF); if (pkt_flags & PKT_RX_RSS_HASH) mb->hash.rss = rte_le_to_cpu_32(\ rxdp->wb.qword0.hi_dword.rss); @@ -873,7 +874,7 @@ i40e_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts) pkt_flags = i40e_rxd_status_to_pkt_flags(qword1); pkt_flags |= i40e_rxd_error_to_pkt_flags(qword1); pkt_flags |= i40e_rxd_ptype_to_pkt_flags(qword1); - rxm->ol_flags = pkt_flags; + rxm->ol_flags = pkt_flags | (rxm->ol_flags & EXTERNAL_MBUF); if (pkt_flags & PKT_RX_RSS_HASH) rxm->hash.rss = rte_le_to_cpu_32(rxd.wb.qword0.hi_dword.rss); @@ -1027,7 +1028,8 @@ i40e_recv_scattered_pkts(void *rx_queue, pkt_flags = i40e_rxd_status_to_pkt_flags(qword1); pkt_flags |= i40e_rxd_error_to_pkt_flags(qword1); pkt_flags |= i40e_rxd_ptype_to_pkt_flags(qword1); - first_seg->ol_flags = pkt_flags; + first_seg->ol_flags = pkt_flags | + (first_seg->ol_flags & EXTERNAL_MBUF); if (pkt_flags & PKT_RX_RSS_HASH) rxm->hash.rss = rte_le_to_cpu_32(rxd.wb.qword0.hi_dword.rss); diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c index 1aefe5c..77e8689 100644 --- a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c +++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c @@ -949,7 +949,8 @@ ixgbe_rx_scan_hw_ring(struct igb_rx_queue *rxq) /* reuse status field from scan list */ pkt_flags |= rx_desc_status_to_pkt_flags(s[j]); pkt_flags |= rx_desc_error_to_pkt_flags(s[j]); - mb->ol_flags = pkt_flags; + mb->ol_flags = pkt_flags | + (mb->ol_flags & EXTERNAL_MBUF); if 
(likely(pkt_flags & PKT_RX_RSS_HASH)) mb->hash.rss = rxdp[j].wb.lower.hi_dword.rss; @@ -1271,7 +1272,7 @@ ixgbe_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, pkt_flags = rx_desc_hlen_type_rss_to_pkt_flags(hlen_type_rss); pkt_flags = pkt_flags | rx_desc_status_to_pkt_flags(staterr); pkt_flags = pkt_flags | rx_desc
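The change repeated across the RX paths above boils down to one masking step: take the flags derived from the descriptor and carry over only the EXTERNAL_MBUF bit from the mbuf's previous ol_flags. A minimal standalone sketch follows; the bit position is taken from patch 1/3 of this series, while `SKETCH_EXTERNAL_MBUF` and `sketch_merge_rx_flags` are illustrative names that do not exist in DPDK.

```c
#include <stdint.h>

/* Bit 62, as proposed for EXTERNAL_MBUF in patch 1/3 of this series. */
#define SKETCH_EXTERNAL_MBUF (1ULL << 62)

/*
 * Build the new ol_flags value for a received mbuf: everything comes from
 * the descriptor, except the external-buffer bit, which is preserved.
 */
uint64_t
sketch_merge_rx_flags(uint64_t old_flags, uint64_t desc_flags)
{
	return desc_flags | (old_flags & SKETCH_EXTERNAL_MBUF);
}
```

All other stale bits in the old value are dropped, which matches the `pkt_flags | (rxm->ol_flags & EXTERNAL_MBUF)` form used in the diff.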
[dpdk-dev] [PATCH 1/3] mbuf: Use EXTERNAL_MBUF to indicate external buffer
mbuf uses EXTERNAL_MBUF in ol_flags to indicate that it is attached to an
external buffer. When such an mbuf is freed, only the mbuf itself is put back
into the mempool; the attached external buffer is not freed. The user/caller
needs to take care of detaching and freeing the external buffer.

Signed-off-by: Changchun Ouyang
---
 lib/librte_mbuf/rte_mbuf.h | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index ddadc21..8cee8fa 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -114,6 +114,9 @@ extern "C" {
 /* Bit 51 - IEEE1588*/
 #define PKT_TX_IEEE1588_TMST (1ULL << 51) /**< TX IEEE1588 packet to timestamp. */
 
+/* Bit 62 - Indicate it is external buffer */
+#define EXTERNAL_MBUF (1ULL << 62) /**< External buffer. */
+
 /* Use final bit of flags to indicate a control mbuf */
 #define CTRL_MBUF_FLAG (1ULL << 63) /**< Mbuf contains control data */
 
@@ -670,7 +673,7 @@ __rte_pktmbuf_prefree_seg(struct rte_mbuf *m)
 		 *  - detach mbuf
 		 *  - free attached mbuf segment
 		 */
-		if (unlikely (md != m)) {
+		if (unlikely((md != m) && !(m->ol_flags & EXTERNAL_MBUF))) {
 			rte_pktmbuf_detach(m);
 			if (rte_mbuf_refcnt_update(md, -1) == 0)
 				__rte_mbuf_raw_free(md);
-- 
1.8.4.2
[dpdk-dev] [PATCH 3/3] vhost: Removes dependency on REFCNT for zero copy
Vhost zero copy removes the dependency on macro REFCNT by using EXTERNAL_MBUF flag in mbuf.ol_flags to indicate it is an external buffer from guest. Signed-off-by: Changchun Ouyang --- examples/vhost/main.c | 19 +-- 1 file changed, 5 insertions(+), 14 deletions(-) diff --git a/examples/vhost/main.c b/examples/vhost/main.c index fa0ad0c..e3b1884 100644 --- a/examples/vhost/main.c +++ b/examples/vhost/main.c @@ -713,19 +713,6 @@ us_vhost_parse_args(int argc, char **argv) return -1; } else zero_copy = ret; - - if (zero_copy) { -#ifdef RTE_MBUF_REFCNT - RTE_LOG(ERR, VHOST_CONFIG, "Before running " - "zero copy vhost APP, please " - "disable RTE_MBUF_REFCNT\n" - "in config file and then rebuild DPDK " - "core lib!\n" - "Otherwise please disable zero copy " - "flag in command line!\n"); - return -1; -#endif - } } /* Specify the descriptor number on RX. */ @@ -1453,6 +1440,7 @@ attach_rxmbuf_zcp(struct virtio_net *dev) mbuf->buf_physaddr = phys_addr - RTE_PKTMBUF_HEADROOM; mbuf->data_len = desc->len; MBUF_HEADROOM_UINT32(mbuf) = (uint32_t)desc_idx; + mbuf->ol_flags |= EXTERNAL_MBUF; LOG_DEBUG(VHOST_DATA, "(%"PRIu64") in attach_rxmbuf_zcp: res base idx:%d, " @@ -1489,6 +1477,8 @@ static inline void pktmbuf_detach_zcp(struct rte_mbuf *m) m->data_off = buf_ofs; m->data_len = 0; + + m->ol_flags &= ~EXTERNAL_MBUF; } /* @@ -1805,8 +1795,9 @@ virtio_tx_route_zcp(struct virtio_net *dev, struct rte_mbuf *m, mbuf->data_off = m->data_off; mbuf->buf_physaddr = m->buf_physaddr; mbuf->buf_addr = m->buf_addr; + mbuf->ol_flags |= EXTERNAL_MBUF; } - mbuf->ol_flags = PKT_TX_VLAN_PKT; + mbuf->ol_flags |= PKT_TX_VLAN_PKT; mbuf->vlan_tci = vlan_tag; mbuf->l2_len = sizeof(struct ether_hdr); mbuf->l3_len = sizeof(struct ipv4_hdr); -- 1.8.4.2
[dpdk-dev] [dpdk-announce] DPDK Features for Q1 2015
> From: Matthew Hall [mailto:mhall at mhcomputing.net] > > On Wed, Oct 22, 2014 at 01:48:36PM +, O'driscoll, Tim wrote: > > Single Virtio Driver: Merge existing Virtio drivers into a single > > implementation, incorporating the best features from each of the > > existing drivers. > > Specifically, in the virtio-net case above, I have discovered, and Sergio at > Intel > just reproduced today, that neither virtio PMD works at all inside of > VirtualBox. One can't init, and the other gets into an infinite loop. But yet > it's > claiming support for VBox on the DPDK Supported NICs page though it > doesn't seem it ever could have worked. At the moment, within Intel we test with KVM, Xen and ESXi. We've never tested with VirtualBox. So, maybe this is an error on the Supported NICs page, or maybe somebody else is testing that configuration. > So I'd like to request an initiative alongside any virtio-net and/or vmxnet3 > type of changes, to make some kind of a Virtualization Test Lab, where we > support VMWare ESXi, QEMU, Xen, VBox, and the other popular VM > systems. > > Otherwise it's hard for us community / app developers to make the DPDK > available to end users in simple, elegant ways, such as packaging it into > Vagrant VM's, Amazon AMI's etc. which are prebaked and ready-to-run. Expanding the scope of virtualization testing is a good idea, especially given industry trends like NFV. We're in the process of getting our DPDK Test Suite ready to push to dpdk.org soon. The hope is that others will use it to validate changes they're making to DPDK, and contribute test cases so that we can build up a more comprehensive set over time. One area where this does need further work is in virtualization. At the moment, our virtualization tests are manual, so they won't be included in the initial DPDK Test Suite release. We will look into automating our current virtualization tests and adding these to the test suite in future. 
> Another thing which would help in this area would be additional > improvements to the NUMA / socket / core / number of NICs / number of > queues autodetections. To write a single app which can run on a virtual card, > a hardware card without RSS available, and a hardware card with RSS > available, in a thread-safe, flow-safe way, is somewhat complex at the > present time. > > I'm running into this in the VM based environments because most VNIC's > don't have RSS and it complicates the process of keeping consistent state of > the flows among the cores. This is interesting. Do you have more details on what you're thinking here, that perhaps could be used as the basis for an RFC? Tim
[dpdk-dev] [PATCH] vhost: Check descriptor number for vector Rx
For zero copy, it need check whether RX descriptor num meets the least requirement when using vector PMD Rx function, and give user more hints if it fails to meet the least requirement. Signed-off-by: Changchun Ouyang --- examples/vhost/main.c | 17 + 1 file changed, 17 insertions(+) diff --git a/examples/vhost/main.c b/examples/vhost/main.c index 291128e..87ab854 100644 --- a/examples/vhost/main.c +++ b/examples/vhost/main.c @@ -131,6 +131,10 @@ #define RTE_TEST_RX_DESC_DEFAULT_ZCP 32 /* legacy: 32, DPDK virt FE: 128. */ #define RTE_TEST_TX_DESC_DEFAULT_ZCP 64 /* legacy: 64, DPDK virt FE: 64. */ +#ifdef RTE_IXGBE_INC_VECTOR +#define VPMD_RX_BURST 32 +#endif + /* Get first 4 bytes in mbuf headroom. */ #define MBUF_HEADROOM_UINT32(mbuf) (*(uint32_t *)((uint8_t *)(mbuf) \ + sizeof(struct rte_mbuf))) @@ -792,6 +796,19 @@ us_vhost_parse_args(int argc, char **argv) return -1; } +#ifdef RTE_IXGBE_INC_VECTOR + if ((zero_copy == 1) && (num_rx_descriptor <= VPMD_RX_BURST)) { + RTE_LOG(INFO, VHOST_PORT, + "The RX desc num: %d is too small for PMD to work\n" + "properly, please enlarge it to bigger than %d if\n" + "possible by the option: '--rx-desc-num '\n" + "One alternative is disabling RTE_IXGBE_INC_VECTOR\n" + "in config file and rebuild the libraries.\n", + num_rx_descriptor, VPMD_RX_BURST); + return -1; + } +#endif + return 0; } -- 1.8.4.2
[dpdk-dev] EAL : Input/output error on DPDK 1.7.1
INTX is badly emulated in VMWare; the disable logic doesn't work. I thought the DPDK API detected when link state interrupt would not work. But of course the application needs to check that before enabling link state On Fri, Oct 24, 2014 at 8:52 AM, Masaru Oki wrote: > Hi, > I got same result in VMware Workstation environment. > At least in my environment, INTX toggle check is not work with VMware > E1000 Ethernet. > Please try attached patch. > > 2014-10-17 3:04 GMT+09:00 Raghav K : > > Hey, > > I observe continuous burst of I/O Errors, as indicated below, with the > testpmd application with DPDK 1.7.1.This seems to originate from > eal_intr_process_interrupts() function. I seemed to have setup the DPDK > prerequisites alright. > > Another recent post seemed to suggest moving back to 1.7.0, however I > would like to persist with 1.7.1. > > Any help/pointers in resolving this would be greatly appreciated. > > Much thanks,Raghav > > root at sys6-vm6:/home/rghv/dpdk/dpdk-1.7.1/x86_64-native-linuxapp-gcc/app# > ./testpmd -c 0xf -n3 -- -i --nb-cores=3 --nb-ports=2 > > EAL: Error reading from file descriptor 21: Input/output errorEAL: Error > reading from file descriptor 21: Input/output errorEAL: Error reading from > file descriptor 21: Input/output errorEAL: Error reading from file > descriptor 21: Input/output errorEAL: Error reading from file descriptor > 21: Input/output errorEAL: Error reading from file descriptor 21: > Input/output errorEAL: Error reading from file descriptor 21: Input/output > errorEAL: Error reading from file descriptor 21: Input/output errorEAL: > Error reading from file descriptor 21: Input/output errorEAL: Error reading > from file descriptor 21: Input/output errorEAL: Error reading from file > descriptor 21: Input/output errorEAL: Error reading from file descriptor > 21: Input/output errorEAL: Error reading from file descriptor 21: > Input/output errorEAL: Error reading from file descriptor 21: Input/output > errorEAL: Error reading from 
file descriptor 21: Input/output errorEAL: > Error reading from file descriptor 21: Input/output error > > > > root at sys6-vm6:/home/rghv/dpdk/dpdk-1.7.1# ./tools/dpdk_nic_bind.py > --status > > Network devices using DPDK-compatible > driver:02:01.0 '82545EM > Gigabit Ethernet Controller (Copper)' drv=igb_uio unused=e1000:02:02.0 > '82545EM Gigabit Ethernet Controller (Copper)' drv=igb_uio unused=e1000 > > Network devices using kernel > driver===:02:00.0 '82545EM Gigabit > Ethernet Controller (Copper)' if=eth0 drv=e1000 unused=igb_uio > *Active*:02:03.0 '82545EM Gigabit Ethernet Controller (Copper)' if=eth3 > drv=e1000 unused=igb_uio :02:05.0 '82545EM Gigabit Ethernet Controller > (Copper)' if=eth4 drv=e1000 unused=igb_uio :02:06.0 '82545EM Gigabit > Ethernet Controller (Copper)' if=eth5 drv=e1000 unused=igb_uio > > Other network devices= >
[dpdk-dev] DPDK Community Conference Call - Friday 31st October
We're planning to hold our first community conference call on Friday 31st October. It's impossible to find a time that suits everybody, so we've chosen to do this in the afternoon/evening in Europe, which is the morning in the USA. This does unfortunately limit participation from PRC, Japan and other parts of the world.

Here's the time and date in a variety of time zones:

Dublin (Ireland)                     Friday, October 31, 2014 at 4:00:00 PM     GMT  UTC
Paris (France)                       Friday, October 31, 2014 at 5:00:00 PM     CET  UTC+1 hour
San Francisco (U.S.A. - California)  Friday, October 31, 2014 at 9:00:00 AM     PDT  UTC-7 hours
New York (U.S.A. - New York)         Friday, October 31, 2014 at 12:00:00 Noon  EDT  UTC-4 hours
Tel Aviv (Israel)                    Friday, October 31, 2014 at 6:00:00 PM     IST  UTC+2 hours
Moscow (Russia)                      Friday, October 31, 2014 at 7:00:00 PM     MSK  UTC+3 hours

Audio bridge details are:

France:   +33 1588 77298
Germany:  +49 8999 143191
Israel:   +972 2589 6577
Russia:   +7 495 641 4663
UK:       +44 1793 402663
USA:      +1 916 356 2663

Bridge: 5
Conference ID: 1264677285

If anybody needs an access number for another country, let me know.

Agenda:
Discuss feature list for DPDK 2.0 (Q1 2015).
Suggestions for topics for future calls.

Thanks,
Tim
[dpdk-dev] [PATCH] vhost: Check descriptor number for vector Rx
Hi Changchun, 2014-10-24 16:38, Ouyang Changchun: > For zero copy, it need check whether RX descriptor num meets the > least requirement when using vector PMD Rx function, and give user > more hints if it fails to meet the least requirement. [...] > --- a/examples/vhost/main.c > +++ b/examples/vhost/main.c > @@ -131,6 +131,10 @@ > #define RTE_TEST_RX_DESC_DEFAULT_ZCP 32 /* legacy: 32, DPDK virt FE: 128. > */ > #define RTE_TEST_TX_DESC_DEFAULT_ZCP 64 /* legacy: 64, DPDK virt FE: 64. > */ > > +#ifdef RTE_IXGBE_INC_VECTOR > +#define VPMD_RX_BURST 32 > +#endif > + > /* Get first 4 bytes in mbuf headroom. */ > #define MBUF_HEADROOM_UINT32(mbuf) (*(uint32_t *)((uint8_t *)(mbuf) \ > + sizeof(struct rte_mbuf))) > @@ -792,6 +796,19 @@ us_vhost_parse_args(int argc, char **argv) > return -1; > } > > +#ifdef RTE_IXGBE_INC_VECTOR > + if ((zero_copy == 1) && (num_rx_descriptor <= VPMD_RX_BURST)) { > + RTE_LOG(INFO, VHOST_PORT, > + "The RX desc num: %d is too small for PMD to work\n" > + "properly, please enlarge it to bigger than %d if\n" > + "possible by the option: '--rx-desc-num '\n" > + "One alternative is disabling RTE_IXGBE_INC_VECTOR\n" > + "in config file and rebuild the libraries.\n", > + num_rx_descriptor, VPMD_RX_BURST); > + return -1; > + } > +#endif > + > return 0; > } I feel there is a design problem here. An application shouldn't have to care about the underlying driver. -- Thomas
[dpdk-dev] [PATCH 0/3] Vhost app removes dependency of REFCNT
2014-10-24 16:10, Ouyang Changchun: > To remove the dependency of RTE_MBUF_REFCNT for vhost zero copy, > the mbuf need introduce EXTERNAL_MBUF(in ol_flags) to indicate it > attaches to an external buffer, say, from guest space. And don't > free the external buffer when freeing the mbuf itself in host, in > addition, RX function in PMD need make sure not overwrite this flag > when filling ol_flags from descriptors to mbuf. So you are replacing refcnt by something else which requires special handling in drivers. I feel this is not the right design. Why do you want to remove refcnt dependency? -- Thomas
[dpdk-dev] [dpdk-announce] DPDK Features for Q1 2015
2014-10-24 08:10, O'driscoll, Tim: > > From: Matthew Hall [mailto:mhall at mhcomputing.net] > > Specifically, in the virtio-net case above, I have discovered, and Sergio > > at Intel > > just reproduced today, that neither virtio PMD works at all inside of > > VirtualBox. One can't init, and the other gets into an infinite loop. But > > yet it's > > claiming support for VBox on the DPDK Supported NICs page though it > > doesn't seem it ever could have worked. > > At the moment, within Intel we test with KVM, Xen and ESXi. We've never > tested with VirtualBox. So, maybe this is an error on the Supported NICs > page, or maybe somebody else is testing that configuration. I'm the author of this page. I think I've written VirtualBox to show where virtio is implemented. You interpreted this as "supported environment", so I'm removing it. Thanks for testing and reporting. -- Thomas
[dpdk-dev] [PATCH 1/2] ixgbe: remove static qualifier for thread safety
On Thu, Oct 23, 2014 at 08:43:39AM +0900, Masaru Oki wrote: > Hi, > > in this code, pointer of local variable (mb_def) is returned by your changes. > mb_def should be static for each thread. Actually, no. A copy is made of 8 bytes of the mb_def variable and stored as an mbuf initializer inside the rxq structure. No use of the memory occupied by mb_def is made outside of the function, so the value does not need to be static. /Bruce > > 2014-10-22 19:55 GMT+09:00 Bruce Richardson : > > Remove the "static" prefix to the template mbuf variable in > > ixgbe_rxq_vec_setup function. This will then allow different > > threads to initialize different RX queues at the same time, > > without one overwriting the other's data. > > > > Signed-off-by: Bruce Richardson > > --- > > lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c > > b/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c > > index a0d3d78..e813e43 100644 > > --- a/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c > > +++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c > > @@ -730,7 +730,7 @@ static struct ixgbe_txq_ops vec_txq_ops = { > > int > > ixgbe_rxq_vec_setup(struct igb_rx_queue *rxq) > > { > > - static struct rte_mbuf mb_def = { > > + struct rte_mbuf mb_def = { > > .nb_segs = 1, > > .data_off = RTE_PKTMBUF_HEADROOM, > > #ifdef RTE_MBUF_REFCNT > > -- > > 1.9.3 > >
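The point about the copy can be illustrated with a small sketch: a local (non-static) template is filled in, 8 bytes of it are copied into the queue structure, and nothing refers to the local after the function returns. All names below (`toy_template`, `toy_rxq`, `toy_rxq_setup`) and the headroom value of 128 are assumptions for illustration and do not reproduce the real ixgbe structures.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 8-byte template standing in for the rearm fields of an mbuf. */
struct toy_template {
	uint16_t data_off;
	uint16_t refcnt;
	uint8_t nb_segs;
	uint8_t port;
	uint16_t pad;
};

/* Hypothetical RX queue that keeps only a copy of the template bytes. */
struct toy_rxq {
	uint64_t mbuf_initializer;
};

static void toy_rxq_setup(struct toy_rxq *rxq, uint8_t port)
{
	/* Local, per-call template: each caller (thread) gets its own copy on
	 * its own stack, so concurrent setups cannot overwrite each other. */
	struct toy_template tmpl = {
		.data_off = 128, /* assumed headroom value */
		.refcnt = 1,
		.nb_segs = 1,
		.port = port,
	};

	assert(rxq != NULL);
	assert(sizeof(tmpl) == sizeof(rxq->mbuf_initializer));
	/* Copy the bytes by value; no pointer to 'tmpl' is kept. */
	memcpy(&rxq->mbuf_initializer, &tmpl, sizeof(rxq->mbuf_initializer));
}
```

Because the queue stores the value rather than the address, the template can safely live on the stack, and two threads setting up different queues never share state.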
[dpdk-dev] [PATCH 2/3] pmd: RX function need keep EXTERNAL_MBUF flag
Hi Changchun, > -Original Message- > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Ouyang Changchun > Sent: Friday, October 24, 2014 9:10 AM > To: dev at dpdk.org > Subject: [dpdk-dev] [PATCH 2/3] pmd: RX function need keep EXTERNAL_MBUF flag > > Every pmd RX function need keep the EXTERNAL_MBUF flag > in mbuf.ol_flags, and can't overwrite it when filling ol_flags from > descriptor to mbuf, otherwise, it probably cause to crash when freeing a mbuf > and trying to freeing its attached external buffer, say, from guest space. > I don't really like the idea of putting: mb->ol_flags = pkt_flags | (mb->ol_flags & EXTERNAL_MBUF); in each and every PMD from now on... On the other hand, it is probably not very good that RX functions update the whole ol_flags, not only the RX-related part. I wonder whether we can reserve the low 32 bits of ol_flags for RX, and the high 32 bits for TX and generic stuff. So our ol_flags would look something like this: union { uint64_t ol_raw_flags; struct { uint32_t rx; uint32_t gen_tx; } ol_flags }; And make all PMD RX functions operate on the rx part of the flags only: mb->ol_flags.rx = pkt_flags; ? 
Konstantin > Signed-off-by: Changchun Ouyang > --- > lib/librte_pmd_e1000/igb_rxtx.c | 5 +++-- > lib/librte_pmd_i40e/i40e_rxtx.c | 8 +--- > lib/librte_pmd_ixgbe/ixgbe_rxtx.c | 8 +--- > lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c | 12 > 4 files changed, 21 insertions(+), 12 deletions(-) > > diff --git a/lib/librte_pmd_e1000/igb_rxtx.c b/lib/librte_pmd_e1000/igb_rxtx.c > index f09c525..4123310 100644 > --- a/lib/librte_pmd_e1000/igb_rxtx.c > +++ b/lib/librte_pmd_e1000/igb_rxtx.c > @@ -786,7 +786,7 @@ eth_igb_recv_pkts(void *rx_queue, struct rte_mbuf > **rx_pkts, > pkt_flags = rx_desc_hlen_type_rss_to_pkt_flags(hlen_type_rss); > pkt_flags = pkt_flags | rx_desc_status_to_pkt_flags(staterr); > pkt_flags = pkt_flags | rx_desc_error_to_pkt_flags(staterr); > - rxm->ol_flags = pkt_flags; > + rxm->ol_flags = pkt_flags | (rxm->ol_flags & EXTERNAL_MBUF); > > /* >* Store the mbuf address into the next entry of the array > @@ -1020,7 +1020,8 @@ eth_igb_recv_scattered_pkts(void *rx_queue, struct > rte_mbuf **rx_pkts, > pkt_flags = rx_desc_hlen_type_rss_to_pkt_flags(hlen_type_rss); > pkt_flags = pkt_flags | rx_desc_status_to_pkt_flags(staterr); > pkt_flags = pkt_flags | rx_desc_error_to_pkt_flags(staterr); > - first_seg->ol_flags = pkt_flags; > + first_seg->ol_flags = pkt_flags | > + (first_seg->ol_flags & EXTERNAL_MBUF); > > /* Prefetch data of first segment, if configured to do so. 
*/ > rte_packet_prefetch((char *)first_seg->buf_addr + > diff --git a/lib/librte_pmd_i40e/i40e_rxtx.c b/lib/librte_pmd_i40e/i40e_rxtx.c > index 2b53677..68c3695 100644 > --- a/lib/librte_pmd_i40e/i40e_rxtx.c > +++ b/lib/librte_pmd_i40e/i40e_rxtx.c > @@ -637,7 +637,8 @@ i40e_rx_scan_hw_ring(struct i40e_rx_queue *rxq) > pkt_flags = i40e_rxd_status_to_pkt_flags(qword1); > pkt_flags |= i40e_rxd_error_to_pkt_flags(qword1); > pkt_flags |= i40e_rxd_ptype_to_pkt_flags(qword1); > - mb->ol_flags = pkt_flags; > + mb->ol_flags = pkt_flags | > + (mb->ol_flags & EXTERNAL_MBUF); > if (pkt_flags & PKT_RX_RSS_HASH) > mb->hash.rss = rte_le_to_cpu_32(\ > rxdp->wb.qword0.hi_dword.rss); > @@ -873,7 +874,7 @@ i40e_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, > uint16_t nb_pkts) > pkt_flags = i40e_rxd_status_to_pkt_flags(qword1); > pkt_flags |= i40e_rxd_error_to_pkt_flags(qword1); > pkt_flags |= i40e_rxd_ptype_to_pkt_flags(qword1); > - rxm->ol_flags = pkt_flags; > + rxm->ol_flags = pkt_flags | (rxm->ol_flags & EXTERNAL_MBUF); > if (pkt_flags & PKT_RX_RSS_HASH) > rxm->hash.rss = > rte_le_to_cpu_32(rxd.wb.qword0.hi_dword.rss); > @@ -1027,7 +1028,8 @@ i40e_recv_scattered_pkts(void *rx_queue, > pkt_flags = i40e_rxd_status_to_pkt_flags(qword1); > pkt_flags |= i40e_rxd_error_to_pkt_flags(qword1); > pkt_flags |= i40e_rxd_ptype_to_pkt_flags(qword1); > - first_seg->ol_flags = pkt_flags; > + first_seg->ol_flags = pkt_flags | > + (first_seg->ol_flags & EXTERNAL_MBUF); > if (pkt_flags & PKT_RX_RSS_HASH) > rxm->hash.rss = > rte_le_to_cpu_32(rxd.wb.qword0.hi_dword.rss); > diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c > b/lib/librte_pmd_ixgbe/ixgbe_rxtx
[dpdk-dev] [PATCH 0/3] Vhost app removes dependency of REFCNT
On Fri, Oct 24, 2014 at 11:47:46AM +0200, Thomas Monjalon wrote: > 2014-10-24 16:10, Ouyang Changchun: > > To remove the dependency of RTE_MBUF_REFCNT for vhost zero copy, > > the mbuf need introduce EXTERNAL_MBUF(in ol_flags) to indicate it > > attaches to an external buffer, say, from guest space. And don't > > free the external buffer when freeing the mbuf itself in host, in > > addition, RX function in PMD need make sure not overwrite this flag > > when filling ol_flags from descriptors to mbuf. > > So you are replacing refcnt by something else which requires special > handling in drivers. > I feel this is not the right design. > Why do you want to remove refcnt dependency? > Ignoring the implementation of the patchset for now, as I haven't reviewed it in depth yet, I think the removal of the dependency on REFCNT in this vhost code is a good thing. This is the only place in DPDK which depends on the REFCNT being *disabled*. We have lots of things which rely on having a reference count enabled in the mbuf, and lots and lots of #ifdefs in the code to work around the possibility of it being disabled. If we can remove the need for the reference count to be disabled here, we can look to do some major cleanup by completely removing the option to disable reference counting. Regards, /Bruce
[dpdk-dev] ethtool and igb/ixgbe (kni)
Hi, I am looking at the DPDK file hierarchy, and I see that under /dpdk-1.7.1/lib/librte_eal/linuxapp/kni/ethtool we have: igb ixgbe README My question is: why are igb and ixgbe on this path, under ethtool? Are they related to ethtool in any way? The README does not explain it. Regards, Kevin
[dpdk-dev] [PATCH 1/2] ixgbe: remove static qualifier for thread safety
Oh, sorry, you are right. I had missed first * for copy. thank you. 2014-10-24 19:34 GMT+09:00 Bruce Richardson : > On Thu, Oct 23, 2014 at 08:43:39AM +0900, Masaru Oki wrote: >> Hi, >> >> in this code, pointer of local variable (mb_def) is returned by your changes. >> mb_def should be static for each thread. > > Actually, no. A copy is made of 8 bytes of the mb_def variable and stored as > an mbuf initializer inside the rxq structure. No use of the memory occupied > by mb_def is made outside of the function, so the value does not need to be > static. > > /Bruce >> >> 2014-10-22 19:55 GMT+09:00 Bruce Richardson : >> > Remove the "static" prefix to the template mbuf variable in >> > ixgbe_rxq_vec_setup function. This will then allow different >> > threads to initialize different RX queues at the same time, >> > without one overwriting the other's data. >> > >> > Signed-off-by: Bruce Richardson >> > --- >> > lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c | 2 +- >> > 1 file changed, 1 insertion(+), 1 deletion(-) >> > >> > diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c >> > b/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c >> > index a0d3d78..e813e43 100644 >> > --- a/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c >> > +++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c >> > @@ -730,7 +730,7 @@ static struct ixgbe_txq_ops vec_txq_ops = { >> > int >> > ixgbe_rxq_vec_setup(struct igb_rx_queue *rxq) >> > { >> > - static struct rte_mbuf mb_def = { >> > + struct rte_mbuf mb_def = { >> > .nb_segs = 1, >> > .data_off = RTE_PKTMBUF_HEADROOM, >> > #ifdef RTE_MBUF_REFCNT >> > -- >> > 1.9.3 >> >
[dpdk-dev] [PATCH v5 3/3] ethdev: fix wrong error return refere to API definition
> -Original Message- > From: y at ecsmtp.sh.intel.com [mailto:y at ecsmtp.sh.intel.com] > Sent: Friday, October 24, 2014 6:55 AM > To: dev at dpdk.org > Cc: nhorman at tuxdriver.com; Richardson, Bruce; Ananyev, Konstantin; De Lara > Guarch, Pablo; Liang, Cunming > Subject: [PATCH v5 3/3] ethdev: fix wrong error return refere to API > definition > > From: Cunming Liang > > Per definition, rte_eth_rx_burst/rte_eth_tx_burst/rte_eth_rx_queue_count > returns the packet number. > When RTE_LIBRTE_ETHDEV_DEBUG turns on, retval of FUNC_PTR_OR_ERR_RTE was set > to -ENOTSUP. > It makes confusing. > The patch always return 0 no matter no packet or there's error. > Meanwhile set errno in such kind of checking. > > Signed-off-by: Cunming Liang > --- > lib/librte_ether/rte_ethdev.c | 10 +++--- > 1 files changed, 7 insertions(+), 3 deletions(-) > > diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c > index 50f10d9..6675f28 100644 > --- a/lib/librte_ether/rte_ethdev.c > +++ b/lib/librte_ether/rte_ethdev.c > @@ -81,12 +81,14 @@ > /* Macros for checking for restricting functions to primary instance only */ > #define PROC_PRIMARY_OR_ERR_RET(retval) do { \ > if (rte_eal_process_type() != RTE_PROC_PRIMARY) { \ > + rte_errno = -E_RTE_SECONDARY; \ > PMD_DEBUG_TRACE("Cannot run in secondary processes\n"); \ > return (retval); \ > } \ > } while(0) > #define PROC_PRIMARY_OR_RET() do { \ > if (rte_eal_process_type() != RTE_PROC_PRIMARY) { \ > + rte_errno = -E_RTE_SECONDARY; \ > PMD_DEBUG_TRACE("Cannot run in secondary processes\n"); \ > return; \ > } \ > @@ -95,12 +97,14 @@ > /* Macros to check for invlaid function pointers in dev_ops structure */ > #define FUNC_PTR_OR_ERR_RET(func, retval) do { \ > if ((func) == NULL) { \ > + rte_errno = -ENOTSUP; \ > PMD_DEBUG_TRACE("Function not supported\n"); \ > return (retval); \ > } \ > } while(0) > #define FUNC_PTR_OR_RET(func) do { \ > if ((func) == NULL) { \ > + rte_errno = -ENOTSUP; \ > PMD_DEBUG_TRACE("Function not 
supported\n"); \ > return; \ > } \ > @@ -2530,7 +2534,7 @@ rte_eth_rx_burst(uint8_t port_id, uint16_t queue_id, > return 0; > } > dev = &rte_eth_devices[port_id]; > - FUNC_PTR_OR_ERR_RET(*dev->rx_pkt_burst, -ENOTSUP); > + FUNC_PTR_OR_ERR_RET(*dev->rx_pkt_burst, 0); > if (queue_id >= dev->data->nb_rx_queues) { > PMD_DEBUG_TRACE("Invalid RX queue_id=%d\n", queue_id); > return 0; > @@ -2551,7 +2555,7 @@ rte_eth_tx_burst(uint8_t port_id, uint16_t queue_id, > } > dev = &rte_eth_devices[port_id]; > > - FUNC_PTR_OR_ERR_RET(*dev->tx_pkt_burst, -ENOTSUP); > + FUNC_PTR_OR_ERR_RET(*dev->tx_pkt_burst, 0); > if (queue_id >= dev->data->nb_tx_queues) { > PMD_DEBUG_TRACE("Invalid TX queue_id=%d\n", queue_id); > return 0; > @@ -2570,7 +2574,7 @@ rte_eth_rx_queue_count(uint8_t port_id, uint16_t > queue_id) > return 0; > } > dev = &rte_eth_devices[port_id]; > - FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_count, -ENOTSUP); > + FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_count, 0); > return (*dev->dev_ops->rx_queue_count)(dev, queue_id); > } There are few things that worry me with that approach: 1. Different behaviour of rte_eth_rx_burst/rte_eth_tx_burst for RTE_LIBRTE_ETHDEV_DEBUG switched on/off. So application might need to differentiate its code depending on RTE_LIBRTE_ETHDEV_DEBUG value. 2. Even for RTE_LIBRTE_ETHDEV_DEBUG is on the behaviour of rte_eth_rx_burst/ rte_eth_tx_burst will be inconsistent: It sets rte_errno if dev->rx_pkt_burst == NULL, but doesn't do the same for other error conditions: When port_id or queue_id is invalid. 3. Modifying FUNC_PTR_OR_ERR_RET() to set rte_errno, we make behaviour of other rte_ethdev functions inconsistent too: Now for some error conditions they do set rte_errno, for others they don't. So if it would be me, I'll just: - leave FUNC_PTR_OR_*_RET unmodified. 
- change rte_eth_rx_burst/tx_burst for RTE_LIBRTE_ETHDEV_DEBUG to something like: - FUNC_PTR_OR_ERR_RET(*dev->rx_pkt_burst, -ENOTSUP); + FUNC_PTR_OR_ERR_RET(*dev->rx_pkt_burst, 0); I think that just error logging is enough here. Konstantin > > -- > 1.7.4.1
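As a side note on why returning -ENOTSUP from a burst function is confusing: the burst functions return an unsigned packet count, so a negative errno value wraps to a large positive number. The sketch below shows this with hypothetical wrappers (`toy_rx_burst_old`, `toy_rx_burst_new` and the `toy_burst_fn` type are illustration-only names, not the rte_ethdev API).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical driver callback type: returns the number of packets received. */
typedef uint16_t (*toy_burst_fn)(void **pkts, uint16_t nb_pkts);

/* Old behaviour in debug builds: the error code is squeezed into an unsigned
 * count, so the caller sees a huge "number of packets". */
static uint16_t toy_rx_burst_old(toy_burst_fn fn, void **pkts, uint16_t nb_pkts)
{
	if (fn == NULL)
		return (uint16_t)-ENOTSUP; /* wraps to a large positive value */
	return fn(pkts, nb_pkts);
}

/* Suggested behaviour: report zero packets (and rely on the debug log). */
static uint16_t toy_rx_burst_new(toy_burst_fn fn, void **pkts, uint16_t nb_pkts)
{
	if (fn == NULL)
		return 0;
	return fn(pkts, nb_pkts);
}
```

A caller that loops over the returned count would walk far past the end of its packet array in the old case, which is why returning 0 is the safer convention for these functions.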
[dpdk-dev] Cannot mmap device resource in DPDK 1.7.0 multi-process/multi-thread
Hi all, I have had a problem since I updated to version 1.7.0. I have a multi-process, multi-threaded application. In my application I first launch a master process, then I launch a secondary process with multiple threads in it. When the number of lcores reserved for the secondary process exceeds a certain number (e.g. 4), I get an error in rte_eal_init() on the secondary process when it tries to map PCI memory:

EAL: pci_map_resource(): cannot mmap(12, 0x72e96000, 0x80, 0x1000): Success (0x7559b000)
EAL: Cannot mmap device resource
EAL: Error - exiting with code: 1
Cause: Requested device :01:00.0 cannot be used

Can you help me?
[dpdk-dev] Cannot mmap device resource in DPDK 1.7.0 multi-process/multi-thread
On Fri, Oct 24, 2014 at 01:21:08PM +0200, Mario Gianni wrote: > Hi all, I have a problem since I updated to 1.7.0 version, > I got a multi-process, multi-threaded application, > In my application first I launch a master process, then I launch a secondary > process with multiple threads in it > Well, when the number of lcores reserved for the secondary process exceeds a > certain number (eg. 4) i got an error in rte_eal_init() on the secondary > process when it tries to map PCI memory: > > EAL: pci_map_resource(): cannot mmap(12, 0x72e96000, 0x80, 0x1000): > Success (0x7559b000) > EAL: Cannot mmap device resource > EAL: Error - exiting with code: 1 > Cause: Requested device :01:00.0 cannot be used > > Can you help me? This could be because the additional memory/stack space used by the pthreads for the cores in the secondary process is overlapping the space used in the primary process for hugepage or device memory. You could perhaps try adding a few cores to the primary process's coremask (and not using those cores) and see if it helps things. Alternatively, there is a base-virtaddr parameter that can be passed to the primary process to adjust the starting address at which it maps memory. Look at where it starts mapping memory right now, then try hinting to it to map the pages at a slightly higher or lower address and see if that helps. /Bruce
[dpdk-dev] [PATCH 2/3] pmd: RX function need keep EXTERNAL_MBUF flag
On Fri, Oct 24, 2014 at 10:46:06AM +, Ananyev, Konstantin wrote: > Hi Changchun, > > > -Original Message- > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Ouyang Changchun > > Sent: Friday, October 24, 2014 9:10 AM > > To: dev at dpdk.org > > Subject: [dpdk-dev] [PATCH 2/3] pmd: RX function need keep EXTERNAL_MBUF > > flag > > > > Every pmd RX function need keep the EXTERNAL_MBUF flag > > in mbuf.ol_flags, and can't overwrite it when filling ol_flags from > > descriptor to mbuf, otherwise, it probably cause to crash when freeing a > > mbuf > > and trying to freeing its attached external buffer, say, from guest space. > > > > Don't really like the idea to put: > mb->ol_flags = pkt_flags | (mb->ol_flags & EXTERNAL_MBUF); > in each and every PMD from now on... > > From other side, it is probably not very good that RX functions update whole > ol_flags, not only RX related part. > Wonder can we reserve low 32bits of ol_flags for RX, and high 32bits for TX > and generic stuff. > So our ol_flags will look something like that: > > union { > uint64_t ol_raw_flags; > struct { > uint32_t rx; > uint32_t gen_tx; > } ol_flags > }; > > And make all PMD RX functions to operate on rx part of the flags only: > mb->ol_flags.rx = pkt_flags; > ? > > Konstantin > I would tend to agree with this. Changchun, did you get to assess the performance impact of making this change to the PMDs? I suspect that making the changes to each PMD would impact performance, while Konstantin's suggestion should eliminate that impact. The downside there is that we are limiting the flexibility we have in expanding beyond 32 RX flags and 24 TX flags. :-( /Bruce
[dpdk-dev] Cannot mmap device resource in DPDK 1.7.0 multi-process/multi-thread
Hi Bruce, thank you for your answer. Adding cores to the primary mask didn't help; what did help was manually passing the --base-virtaddr parameter, set to the first Virtual Area value that EAL finds when it starts the primary process. Honestly I don't understand why it works this way. In the experimental phase this could be a workaround, but in the final program I have to automate this process. Do you have any suggestions? For example, is there a way to find the virtual area before starting the primary process? Mario Sent: Friday, October 24, 2014 at 2:08 PM From: "Bruce Richardson" To: "Mario Gianni" Cc: dev at dpdk.org Subject: Re: [dpdk-dev] Cannot mmap device resource in DPDK 1.7.0 multi-process/multi-thread On Fri, Oct 24, 2014 at 01:21:08PM +0200, Mario Gianni wrote: > Hi all, I have a problem since I updated to 1.7.0 version, > I got a multi-process, multi-threaded application, > In my application first I launch a master process, then I launch a secondary > process with multiple threads in it > Well, when the number of lcores reserved for the secondary process exceeds a > certain number (eg. 4) i got an error in rte_eal_init() on the secondary > process when it tries to map PCI memory: > > EAL: pci_map_resource(): cannot mmap(12, 0x72e96000, 0x80, 0x1000): > Success (0x7559b000) > EAL: Cannot mmap device resource > EAL: Error - exiting with code: 1 > Cause: Requested device :01:00.0 cannot be used > > Can you help me? This could be because the additional memory/stack space used by the pthreads for the cores in the secondary process is overlapping the space used in the primary process for hugepage or device memory. You could perhaps try adding a few cores to the primary process's coremask (and not using those cores) and see if it helps things. Alternatively there is a base-virtaddr parameter that can be passed to the primary process to try and adjust the starting address for it mapping memory. 
If you look at where it starts mapping memory right now, and then try hinting to it to maps the pages at a slightly higher or lower address and see if it helps. /Bruce
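One plausible reading of the "cannot mmap(...): Success (...)" line in the quoted log is that mmap() itself succeeded (so errno was 0) but returned a different address from the one requested, because something else already occupied that range. The sketch below is not DPDK code. It only illustrates the underlying behaviour with plain POSIX calls: without MAP_FIXED, the address argument is a hint, and an occupied hint gets relocated. The function name hint_was_relocated is made up for this illustration.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/*
 * Returns 1 if a non-fixed mmap() at an already occupied hint address
 * succeeds but lands somewhere else, 0 if it lands on the hint, and
 * -1 if either mmap() call fails.
 */
int hint_was_relocated(void)
{
    const size_t len = 4096;
    int relocated;

    /* Stand-in for memory the process already uses (e.g. a thread stack). */
    void *busy = mmap(NULL, len, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (busy == MAP_FAILED)
        return -1;

    /* Ask for the same address without MAP_FIXED: it is only a hint. */
    void *got = mmap(busy, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (got == MAP_FAILED) {
        munmap(busy, len);
        return -1;
    }

    /* The call succeeded, yet the result differs from the request. */
    relocated = (got != busy);

    munmap(got, len);
    munmap(busy, len);
    return relocated;
}
```

A secondary process that needs the mapping at exactly the primary's address has to treat this "successful but relocated" outcome as a failure, which would explain why moving the primary's base address with --base-virtaddr helps.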
[dpdk-dev] Possible bug in eal_pci pci_scan_one
On Mon, 6 Oct 2014 02:13:44 -0700 Matthew Hall wrote:

> Hi Guys,
>
> I'm doing my development on kind of a cheap machine with no NUMA support...
> but several years ago I used DPDK to build a NUMA box that could do 40 gbits
> bidirectional L4-L7 stateful traffic replay.
>
> So given the past experiences I had before, I wanted to clean the code up so
> it'd work well if some crazy guy tried my code on one of these huge boxes,
> too, but then I ran into some weird issues.
>
> 1) When I call rte_eth_dev_socket_id() I get back -1. But the call can return
> -1 if the port_id is bogus or if pci_scan_one didn't get a numa_node (because
> you're on a non-NUMA box for example).
>
> int rte_eth_dev_socket_id(uint8_t port_id)
> {
>         if (port_id >= nb_ports)
>                 return -1;
>         return rte_eth_devices[port_id].pci_dev->numa_node;
> }
>
> So you couldn't tell the difference between non-NUMA or a bad port value, etc.
>
> 2) The code's behavior and comments disagree with one another. In the
> pci_scan_one function, there's this code:
>
>         /* get numa node */
>         snprintf(filename, sizeof(filename), "%s/numa_node",
>                  dirname);
>         if (access(filename, R_OK) != 0) {
>                 /* if no NUMA support just set node to 0 */
>                 dev->numa_node = -1;
>         } else {
>                 if (eal_parse_sysfs_value(filename, &tmp) < 0) {
>                         free(dev);
>                         return -1;
>                 }
>                 dev->numa_node = tmp;
>         }
>
> It says, just use NUMA node 0 if there is no NUMA support. But then proceeds
> to set the value to -1 in disagreement with the comment, and also stomping on
> the other meaning for -1 in the higher function rte_eth_dev_socket_id.
>
> 3) In conclusion, it seems like some stuff is missing... first there needs to
> be a function that will tell you the number of NUMA nodes present on the box
> so you can create the right number of mbuf_pools, but I couldn't find that
> function.
>
> Then if you have the function, you can do some magic and shuffle the NICs
> around to get them hooked to a core on the same NUMA, and the mbuf_pool on
> the same NUMA.
>
> When NUMA is not present, can we return 0 instead of -1, or return a specific
> error code that the client can use to know he should just use Socket 0? Right
> now I can't tell apart any potential errors or weird values from correct
> values.
>
> 4) I'm willing to help make and test some patches... but first I want to
> understand what is happening with these funny functions before doing things
> blindly.
>
> Thanks,
> Matthew.

The code is fairly consistent in returning -1 both when there is no NUMA socket and when the port value is bogus. It is interpreted as SOCKET_ID_ANY in several places. The examples mostly check for -1 and use socket 0 as a fallback. It is probably not worth introducing more return values and breaking existing applications.
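The fallback described in the reply above (check for -1, then use socket 0) can be sketched as a small helper. This is a self-contained illustration, not DPDK code: EX_SOCKET_ID_ANY and ex_socket_or_default are hypothetical names, with the value -1 taken from the discussion. Since -1 is also what a bogus port yields, the helper assumes the caller has already validated the port id separately.

```c
/* Hypothetical stand-in for the "any socket" value (-1) discussed above. */
#define EX_SOCKET_ID_ANY (-1)

/*
 * Map "no NUMA information" (-1) to a caller-chosen default socket,
 * and pass real socket ids through unchanged.
 */
int ex_socket_or_default(int socket_id, int default_socket)
{
    if (socket_id == EX_SOCKET_ID_ANY)
        return default_socket;
    return socket_id;
}
```

An application would pass the result to whatever per-socket allocation it performs, for example choosing which mbuf pool to use for a port.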
[dpdk-dev] Cannot mmap device resource in DPDK 1.7.0 multi-process/multi-thread
On Fri, Oct 24, 2014 at 03:04:26PM +0200, Mario Gianni wrote:
> Hi Bruce,
> thank you for your answer, adding cores to the primary mask didn't help,
> instead it helped manually passing the --base-virtaddr parameter, setting it
> to the first value of Virtual Area that EAL finds when it starts the primary
> process.
>
> Honestly I don't understand why it works in this way, in the experimental
> phase this could be a patch, but in the final program I have to automate this
> process, do you have any suggestions?
> For example is there a way to find the virtual area before starting the
> primary process?
>
> Mario

In multi-process, there is a requirement that we can map the hugepage memory and the NIC BARs to the same virtual addresses in both processes. Mostly this works OK, but occasionally it needs help, because the memory regions chosen in the primary process are already in use by something else in the secondary process before rte_eal_init() runs. Anything from additional threads to an additional shared library being linked in can affect the amount of memory used by the secondary process, and therefore the chances that we won't be able to get an exact mapping.

As far as I know, there is no way to pre-compute how much memory a given process will use, or what memory regions will be free in it, by the time rte_eal_init() is called. If you just need multiple processes that don't need to be individually spawned, then perhaps consider using fork() to spawn the processes, since that will guarantee you identical mappings without issues. The downside, obviously, is that all processes must use the same binary, something not required for DPDK multi-process support.

/Bruce

>
> Sent: Friday, October 24, 2014 at 2:08 PM
> From: "Bruce Richardson"
> To: "Mario Gianni"
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] Cannot mmap device resource in DPDK 1.7.0
> multi-process/multi-thread
> On Fri, Oct 24, 2014 at 01:21:08PM +0200, Mario Gianni wrote:
> > Hi all, I have a problem since I updated to 1.7.0 version,
> > I got a multi-process, multi-threaded application,
> > In my application first I launch a master process, then I launch a
> > secondary process with multiple threads in it
> > Well, when the number of lcores reserved for the secondary process exceeds
> > a certain number (eg. 4) I got an error in rte_eal_init() on the secondary
> > process when it tries to map PCI memory:
> >
> > EAL: pci_map_resource(): cannot mmap(12, 0x72e96000, 0x80, 0x1000):
> > Success (0x7559b000)
> > EAL: Cannot mmap device resource
> > EAL: Error - exiting with code: 1
> > Cause: Requested device :01:00.0 cannot be used
> >
> > Can you help me?
>
> This could be because the additional memory/stack space used by the pthreads
> for the cores in the secondary process is overlapping the space used in the
> primary process for hugepage or device memory. You could perhaps try adding
> a few cores to the primary process's coremask (and not using those cores)
> and see if it helps things.
> Alternatively there is a base-virtaddr parameter that can be passed to the
> primary process to try to adjust the starting address at which it maps
> memory. Look at where it starts mapping memory right now, then try hinting
> to it to map the pages at a slightly higher or lower address, and see if it
> helps.
>
> /Bruce
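The fork() suggestion above rests on the fact that a child created by fork() inherits the parent's address space layout, so any mapping set up before the fork is visible at the same virtual address in both processes. The following is a minimal POSIX sketch of that property and does not use DPDK. The function name and the marker values are made up for the illustration, and an anonymous shared mapping stands in for hugepage or device memory.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns 1 if a forked child can use a shared mapping through the very
 * same pointer as the parent, 0 if not, and -1 on a system call failure.
 */
int child_sees_same_mapping(void)
{
    const size_t len = 4096;
    int status;
    int ok;

    /* Stand-in for memory the primary maps before spawning workers. */
    uint32_t *shared = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (shared == MAP_FAILED)
        return -1;

    shared[0] = 0xC0FFEEu;          /* written by the parent before fork() */
    shared[1] = 0;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(shared, len);
        return -1;
    }
    if (pid == 0) {
        /* Child: same pointer value, no remapping step required. */
        int seen = (shared[0] == 0xC0FFEEu);
        shared[1] = 0xBEEFu;        /* visible to the parent via MAP_SHARED */
        _exit(seen ? 0 : 1);
    }

    if (waitpid(pid, &status, 0) != pid) {
        munmap(shared, len);
        return -1;
    }

    ok = WIFEXITED(status) && WEXITSTATUS(status) == 0 &&
         shared[1] == 0xBEEFu;
    munmap(shared, len);
    return ok;
}
```

The trade-off named in the message still applies: this only works when every process runs the same binary, because the layout is inherited rather than negotiated.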
[dpdk-dev] Cannot mmap device resource in DPDK 1.7.0 multi-process/multi-thread
So you are telling me that, in order to implement multi-process, I would be better off using the l2fwd_fork example instead of client_server_mp.

In fact, if I use client_server_mp with a lot of mp_client threads it gives me the error. If instead I use the l2fwd_fork example, it doesn't give me the error.

One more question at this point: assuming that I use l2fwd_fork, when I launch the secondary process, how do I assign the lcore coremask associated with that process?

Mario

Sent: Friday, October 24, 2014 at 3:39 PM
From: "Bruce Richardson"
To: "Mario Gianni"
Cc: dev at dpdk.org
Subject: Re: [dpdk-dev] Cannot mmap device resource in DPDK 1.7.0 multi-process/multi-thread

On Fri, Oct 24, 2014 at 03:04:26PM +0200, Mario Gianni wrote:
> Hi Bruce,
> thank you for your answer, adding cores to the primary mask didn't help,
> instead it helped manually passing the --base-virtaddr parameter, setting it
> to the first value of Virtual Area that EAL finds when it starts the primary
> process.
>
> Honestly I don't understand why it works in this way, in the experimental
> phase this could be a patch, but in the final program I have to automate this
> process, do you have any suggestions?
> For example is there a way to find the virtual area before starting the
> primary process?
>
> Mario

In multi-process, there is a requirement that we can map the hugepage memory and the NIC BARs to the same virtual addresses in both processes. Mostly this works OK, but occasionally it needs help, because the memory regions chosen in the primary process are already in use by something else in the secondary process before rte_eal_init() runs. Anything from additional threads to an additional shared library being linked in can affect the amount of memory used by the secondary process, and therefore the chances that we won't be able to get an exact mapping.

As far as I know, there is no way to pre-compute how much memory a given process will use, or what memory regions will be free in it, by the time rte_eal_init() is called. If you just need multiple processes that don't need to be individually spawned, then perhaps consider using fork() to spawn the processes, since that will guarantee you identical mappings without issues. The downside, obviously, is that all processes must use the same binary, something not required for DPDK multi-process support.

/Bruce

>
> Sent: Friday, October 24, 2014 at 2:08 PM
> From: "Bruce Richardson"
> To: "Mario Gianni"
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] Cannot mmap device resource in DPDK 1.7.0
> multi-process/multi-thread
> On Fri, Oct 24, 2014 at 01:21:08PM +0200, Mario Gianni wrote:
> > Hi all, I have a problem since I updated to 1.7.0 version,
> > I got a multi-process, multi-threaded application,
> > In my application first I launch a master process, then I launch a
> > secondary process with multiple threads in it
> > Well, when the number of lcores reserved for the secondary process exceeds
> > a certain number (eg. 4) I got an error in rte_eal_init() on the secondary
> > process when it tries to map PCI memory:
> >
> > EAL: pci_map_resource(): cannot mmap(12, 0x72e96000, 0x80, 0x1000):
> > Success (0x7559b000)
> > EAL: Cannot mmap device resource
> > EAL: Error - exiting with code: 1
> > Cause: Requested device :01:00.0 cannot be used
> >
> > Can you help me?
>
> This could be because the additional memory/stack space used by the pthreads
> for the cores in the secondary process is overlapping the space used in the
> primary process for hugepage or device memory. You could perhaps try adding
> a few cores to the primary process's coremask (and not using those cores)
> and see if it helps things.
> Alternatively there is a base-virtaddr parameter that can be passed to the
> primary process to try to adjust the starting address at which it maps
> memory. Look at where it starts mapping memory right now, then try hinting
> to it to map the pages at a slightly higher or lower address, and see if it
> helps.
>
> /Bruce
[dpdk-dev] DPDK Community Conference Call - Friday 31st October
> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of O'driscoll, Tim
> Sent: Friday, October 24, 2014 5:22 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] DPDK Community Conference Call - Friday 31st October
>
> We're planning to hold our first community conference call on Friday 31st
> October. It's impossible to find a time that suits everybody, so we've chosen
> to do this in the afternoon/evening in Europe, which is the morning in the
> USA. This does unfortunately limit participation from PRC, Japan and other
> parts of the world. Here's the time and date in a variety of time zones:
>
> Dublin (Ireland)                     Friday, October 31, 2014 at 4:00:00 PM     GMT  UTC
> Paris (France)                       Friday, October 31, 2014 at 5:00:00 PM     CET  UTC+1 hour
> San Francisco (U.S.A. - California)  Friday, October 31, 2014 at 9:00:00 AM     PDT  UTC-7 hours
> New York (U.S.A. - New York)         Friday, October 31, 2014 at 12:00:00 Noon  EDT  UTC-4 hours
> Tel Aviv (Israel)                    Friday, October 31, 2014 at 6:00:00 PM     IST  UTC+2 hours
> Moscow (Russia)                      Friday, October 31, 2014 at 7:00:00 PM     MSK  UTC+3 hours
>
> Audio bridge details are:
> France:  +33 1588 77298
> Germany: +49 8999 143191
> Israel:  +972 2589 6577
> Russia:  +7 495 641 4663
> UK:      +44 1793 402663
> USA:     +1 916 356 2663
>
> Bridge: 5
> Conference ID: 1264677285
>
> If anybody needs an access number for another country, let me know.

Can you provide a number for Canada?

thanks,
Mike.

> Agenda:
> Discuss feature list for DPDK 2.0 (Q1 2015).
> Suggestions for topics for future calls.
>
> Thanks,
> Tim
[dpdk-dev] DPDK Community Conference Call - Friday 31st October
> From: Michael Marchetti [mailto:mmarchetti at sandvine.com]
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of O'driscoll, Tim
> > Audio bridge details are:
> > France:  +33 1588 77298
> > Germany: +49 8999 143191
> > Israel:  +972 2589 6577
> > Russia:  +7 495 641 4663
> > UK:      +44 1793 402663
> > USA:     +1 916 356 2663
> >
> > Bridge: 5
> > Conference ID: 1264677285
> >
> > If anybody needs an access number for another country, let me know.
>
> Can you provide a number for Canada? thanks, Mike.

No problem. The USA number above should work, or else you can use +1-888-875-9370.

Tim
[dpdk-dev] [PATCH 2/3] pmd: RX function need keep EXTERNAL_MBUF flag
On Fri, Oct 24, 2014 at 01:34:58PM +0100, Bruce Richardson wrote:
> On Fri, Oct 24, 2014 at 10:46:06AM +, Ananyev, Konstantin wrote:
> > Hi Changchun,
> >
> > > -Original Message-
> > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Ouyang Changchun
> > > Sent: Friday, October 24, 2014 9:10 AM
> > > To: dev at dpdk.org
> > > Subject: [dpdk-dev] [PATCH 2/3] pmd: RX function need keep EXTERNAL_MBUF
> > > flag
> > >
> > > Every pmd RX function need keep the EXTERNAL_MBUF flag
> > > in mbuf.ol_flags, and can't overwrite it when filling ol_flags from
> > > descriptor to mbuf, otherwise, it probably cause to crash when freeing a
> > > mbuf and trying to freeing its attached external buffer, say, from guest
> > > space.
> > >
> >
> > Don't really like the idea to put:
> > mb->ol_flags = pkt_flags | (mb->ol_flags & EXTERNAL_MBUF);
> > in each and every PMD from now on...
> >
> > From other side, it is probably not very good that RX functions update
> > whole ol_flags, not only RX related part.
> > Wonder can we reserve low 32bits of ol_flags for RX, and high 32bits for TX
> > and generic stuff.
> > So our ol_flags will look something like that:
> >
> > union {
> >         uint64_t ol_raw_flags;
> >         struct {
> >                 uint32_t rx;
> >                 uint32_t gen_tx;
> >         } ol_flags;
> > };
> >
> > And make all PMD RX functions to operate on rx part of the flags only:
> > mb->ol_flags.rx = pkt_flags;
> > ?
> >
> > Konstantin
> >
> I would tend to agree with this. Changchun, did you get to assess the
> performance impact of making this change to the PMDs? I suspect that making
> the changes to each PMD would impact performance, while Konstantin's
> suggestion should eliminate that impact.
> The downside there is that we are limiting the flexibility we have in
> expanding beyond 32 RX flags and 24 TX flags. :-(
>
> /Bruce
>

How about switching things about in terms of the flag? Instead of having to manage a flag across the board to indicate whether an mbuf is pointing to external memory, I think we should use the flag to indicate that an mbuf is attached to the memory space of another mbuf.

My reasons for suggesting this are:
1. Mbufs pointing to externally managed memory are not really the problem to be dealt with on free, since they can be handled the same as mbufs whose data pointer points internally. It's mbufs attached to other mbufs which are the problem, so that's what we need to track using a flag.
2. Setting the flag to indicate an indirect mbuf should have no impact on the driver, as an mbuf that has just been allocated from a mempool cannot be an indirect one.
3. The only place we would need to worry about such a flag is in the attach, detach and free mbuf functions, and on free we would simply need to replace the existing check for "md != m" with a check for the new flag. It would be a contained change.

Thoughts?
/Bruce
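The proposal above (mark mbufs that are attached to another mbuf, and test that mark on free) can be sketched with a toy structure. Nothing here is the real rte_mbuf API: struct ex_mbuf, the EX_IND_ATTACHED bit position, and all function names are hypothetical, and reference counting is left out to keep the focus on the flag.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bit: "this mbuf borrows another mbuf's buffer". */
#define EX_IND_ATTACHED (1ULL << 62)

/* Toy stand-in for an mbuf, with only the fields the sketch needs. */
struct ex_mbuf {
    void *buf_addr;
    uint64_t ol_flags;
};

/* Attach: mi starts using md's buffer and is marked indirect. */
void ex_attach(struct ex_mbuf *mi, const struct ex_mbuf *md)
{
    mi->buf_addr = md->buf_addr;
    mi->ol_flags |= EX_IND_ATTACHED;
}

/* Detach: mi goes back to its own buffer and the mark is cleared. */
void ex_detach(struct ex_mbuf *mi, void *own_buf)
{
    mi->buf_addr = own_buf;
    mi->ol_flags &= ~EX_IND_ATTACHED;
}

/* The free path would test this instead of comparing "md != m". */
int ex_is_indirect(const struct ex_mbuf *m)
{
    return (m->ol_flags & EX_IND_ATTACHED) != 0;
}

/* Walk one attach/detach cycle; returns 1 if the flag tracks it. */
int ex_flag_tracks_attach(void)
{
    char direct_buf[64], own_buf[64];
    struct ex_mbuf md = { direct_buf, 0 };
    struct ex_mbuf mi = { own_buf, 0 };

    if (ex_is_indirect(&mi))
        return 0;
    ex_attach(&mi, &md);
    if (!ex_is_indirect(&mi) || mi.buf_addr != (void *)direct_buf)
        return 0;
    ex_detach(&mi, own_buf);
    return !ex_is_indirect(&mi) && mi.buf_addr == (void *)own_buf;
}
```

Because the flag is only ever set in the attach path, an RX function that overwrites ol_flags on a freshly allocated mbuf cannot lose it, which is point 2 of the message.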
[dpdk-dev] [PATCH 2/3] pmd: RX function need keep EXTERNAL_MBUF flag
> -Original Message-
> From: Richardson, Bruce
> Sent: Friday, October 24, 2014 4:43 PM
> To: Ananyev, Konstantin
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 2/3] pmd: RX function need keep EXTERNAL_MBUF
> flag
>
> On Fri, Oct 24, 2014 at 01:34:58PM +0100, Bruce Richardson wrote:
> > On Fri, Oct 24, 2014 at 10:46:06AM +, Ananyev, Konstantin wrote:
> > > Hi Changchun,
> > >
> > > > -Original Message-
> > > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Ouyang Changchun
> > > > Sent: Friday, October 24, 2014 9:10 AM
> > > > To: dev at dpdk.org
> > > > Subject: [dpdk-dev] [PATCH 2/3] pmd: RX function need keep
> > > > EXTERNAL_MBUF flag
> > > >
> > > > Every pmd RX function need keep the EXTERNAL_MBUF flag
> > > > in mbuf.ol_flags, and can't overwrite it when filling ol_flags from
> > > > descriptor to mbuf, otherwise, it probably cause to crash when freeing
> > > > a mbuf and trying to freeing its attached external buffer, say, from
> > > > guest space.
> > > >
> > >
> > > Don't really like the idea to put:
> > > mb->ol_flags = pkt_flags | (mb->ol_flags & EXTERNAL_MBUF);
> > > in each and every PMD from now on...
> > >
> > > From other side, it is probably not very good that RX functions update
> > > whole ol_flags, not only RX related part.
> > > Wonder can we reserve low 32bits of ol_flags for RX, and high 32bits for
> > > TX and generic stuff.
> > > So our ol_flags will look something like that:
> > >
> > > union {
> > >         uint64_t ol_raw_flags;
> > >         struct {
> > >                 uint32_t rx;
> > >                 uint32_t gen_tx;
> > >         } ol_flags;
> > > };
> > >
> > > And make all PMD RX functions to operate on rx part of the flags only:
> > > mb->ol_flags.rx = pkt_flags;
> > > ?
> > >
> > > Konstantin
> > >
> > I would tend to agree with this. Changchun, did you get to assess the
> > performance impact of making this change to the PMDs? I suspect that making
> > the changes to each PMD would impact performance, while Konstantin's
> > suggestion should eliminate that impact.
> > The downside there is that we are limiting the flexibility we have in
> > expanding beyond 32 RX flags and 24 TX flags. :-(
> >
> > /Bruce
> >
>
> How about switching things about in terms of the flag? Instead of having to
> manage a flag across the board to indicate whether an mbuf is pointing to
> external memory, I think we should use the flag to indicate that an mbuf is
> attached to the memory space of another mbuf.
>
> My reasons for suggesting this are:
> 1. Mbufs pointing to externally managed memory are not really the problem to
> be dealt with on free, since they can be handled the same as mbufs whose
> data pointer points internally. It's mbufs attached to other mbufs which
> are the problem, so that's what we need to track using a flag.
> 2. Setting the flag to indicate an indirect mbuf should have no impact on
> the driver, as an mbuf that has just been allocated from a mempool cannot be
> an indirect one.
> 3. The only place we would need to worry about such a flag is in the attach,
> detach and free mbuf functions, and on free we would simply need to replace
> the existing check for "md != m" with a check for the new flag. It would
> be a contained change.
>

Sounds good to me. That's definitely much better than my proposal.
Plus, if we stop relying on:

        md = RTE_MBUF_FROM_BADDR(m->buf_addr);
        if (unlikely (md != m)) {

that will allow us to set buf_addr to some other valid offset inside the mbuf, which fixes an old problem with extra mbuf metadata (userdata) stored in the packet's headroom.

Konstantin

> Thoughts?
> /Bruce
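For comparison, the two earlier options quoted in this thread can be written as pure functions on a 64-bit flags word: the patch's approach of carrying one bit across the RX overwrite, and the idea of letting RX touch only the low 32 bits. This is a sketch under assumptions. The EX_EXTERNAL_MBUF bit position is invented, and masks are used in place of the quoted union so that the result does not depend on byte order.

```c
#include <stdint.h>

/* Hypothetical bit position for the EXTERNAL_MBUF flag from the patch. */
#define EX_EXTERNAL_MBUF (1ULL << 55)

/* Low 32 bits reserved for RX flags, as in the quoted suggestion. */
#define EX_RX_MASK 0x00000000FFFFFFFFULL

/* Patch approach: overwrite everything except the one preserved bit. */
uint64_t ex_keep_external(uint64_t old_flags, uint64_t pkt_flags)
{
    return pkt_flags | (old_flags & EX_EXTERNAL_MBUF);
}

/* Split approach: RX replaces only its own half of the word. */
uint64_t ex_set_rx_half(uint64_t old_flags, uint32_t rx_flags)
{
    return (old_flags & ~EX_RX_MASK) | (uint64_t)rx_flags;
}
```

Both keep the high-order bit alive across an RX update. The first costs an extra read-modify-write in every PMD, while the second confines RX flags to 32 bits, which is the flexibility concern raised in the thread.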
[dpdk-dev] [PATCH] kni: fix building on Ubuntu-hybrids
On Oct 24, 2014, at 12:35 AM, Thomas Monjalon wrote:
>
> Please, could you explain what the file /proc/version_signature is and why
> it can be a check for an Ubuntu kernel?

Ubuntu provides /proc/version_signature to help with determining kernel lineage; it doesn't exist in upstream kernels:

https://wiki.ubuntu.com/Kernel/FAQ#Kernel.2BAC8-FAQ.2BAC8-GeneralVersionRunning.How_can_we_determine_the_version_of_the_running_kernel.3F

Commit a09b359d started gathering version information via version_signature in order to enable certain Ubuntu-specific kernel workarounds. If you have a kernel without this information (e.g. upstream Linux v3.13 with an Ubuntu userspace), kni fails to build:

CC [M] /home/alexander/dpdk/build/build/lib/librte_eal/linuxapp/kni/e1000_82575.o
In file included from /home/alexander/dpdk/lib/librte_eal/linuxapp/kni/ethtool/igb/e1000_osdep.h:41:0,
                 from /home/alexander/dpdk/lib/librte_eal/linuxapp/kni/ethtool/igb/e1000_hw.h:31,
                 from /home/alexander/dpdk/lib/librte_eal/linuxapp/kni/ethtool/igb/e1000_api.h:31,
                 from /home/alexander/dpdk/build/build/lib/librte_eal/linuxapp/kni/e1000_82575.c:38:
/home/alexander/dpdk/lib/librte_eal/linuxapp/kni/ethtool/igb/kcompat.h:3864:8: error: macro "UBUNTU_KERNEL_VERSION" requires 5 arguments, but only 1 given
/home/alexander/dpdk/lib/librte_eal/linuxapp/kni/ethtool/igb/kcompat.h:3864:8: error: "UBUNTU_KERNEL_VERSION" is not defined [-Werror=undef]

My logic for the change is: if the build system is running in an environment that looks like Ubuntu, but can't gather enough information to know whether it should enable the kernel workarounds, it's safe not to try to enable them at all.

Thanks.
Alexander
[dpdk-dev] [dpdk-announce] DPDK Features for Q1 2015
On Fri, Oct 24, 2014 at 08:10:40AM +, O'driscoll, Tim wrote:
> At the moment, within Intel we test with KVM, Xen and ESXi. We've never
> tested with VirtualBox. So, maybe this is an error on the Supported NICs
> page, or maybe somebody else is testing that configuration.

So, one of the most popular ways developers test out new code these days is using Vagrant or Docker. Vagrant by default creates machines using VirtualBox. VirtualBox runs on nearly everything out there (Linux, Windows, OS X, and more). Docker uses Linux LXC so it isn't multiplatform. There is a system called CoreOS which is still under development. It requires bare-metal w/ custom Linux on top.

https://www.vagrantup.com/
https://www.docker.com/
https://coreos.com/

As an open source DPDK app developer who previously used it successfully in some commercial big-iron projects, I'm now trying to drive adoption of the technology among security programmers. I'm doing it because I think DPDK is better than everything else I've seen for packet processing.

So it would help to drive adoption if there were a multiplatform virtualization environment that worked with the best performing DPDK drivers, so I could make it easy for developers to download, install, and run, so they'll get excited and learn more about all the great work you guys did and use it to build more DPDK apps.

I don't care if it's VBox necessarily. But we should support at least 1 end-developer-friendly virtualization environment so I can make it easy to deploy and run an app and get people excited to work with the DPDK. Low barrier to entry is important.

> One area where this does need further work is in virtualization. At the
> moment, our virtualization tests are manual, so they won't be included in
> the initial DPDK Test Suite release. We will look into automating our
> current virtualization tests and adding these to the test suite in future.

Sounds good. Then we could help you make it work and keep it working on more platforms.

> > Another thing which would help in this area would be additional
> > improvements to the NUMA / socket / core / number of NICs / number of
> > queues autodetections. To write a single app which can run on a virtual
> > card, a hardware card without RSS available, and a hardware card with RSS
> > available, in a thread-safe, flow-safe way, is somewhat complex at the
> > present time.
> >
> > I'm running into this in the VM based environments because most VNIC's
> > don't have RSS and it complicates the process of keeping consistent state of
> > the flows among the cores.
>
> This is interesting. Do you have more details on what you're thinking here,
> that perhaps could be used as the basis for an RFC?

It's something I am still trying to figure out how to deal with actually, hence all the virtio-net and PCI bus questions I've been hounding the list with over the last few weeks. It would be good if you had a contact for the virtual DPDK at Intel or 6WIND who could help me figure out the solution pattern.

I think it might involve making an app or some DPDK helper code which has something like this algorithm:

At load-time, the app autodetects whether RSS is available, and whether NUMA is present.

If RSS is available, and NUMA is not available, enable RSS and create 1 RX queue for each lcore.

If RSS is available, and NUMA is available, find the NUMA socket of the NIC, and make 1 RX queue for each connected lcore on that NUMA socket.

If RSS is not available, and NUMA is not available, then configure the distributor framework. (I never used it so I am not sure if this part is right.) Create 1 Load Balance on the master lcore that does RX from all NICs, and hashes up and distributes packets to every other lcore.

If RSS is not available, and NUMA is available, then configure the distributor framework. (Again this might not be right.) Create 1 Load Balance on the first lcore on each socket that does RX from all NUMA connected NICs, and hashes up and distributes packets to other NUMA connected lcores.

> Tim

Thanks,
Matthew.
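The four cases in the algorithm above reduce to a two-input decision, which can be written down as a small table-like function. This is only a sketch of the selection step: the enum and function names are hypothetical, and detecting RSS or NUMA and configuring queues or the distributor are left out.

```c
#include <stdbool.h>

/* Hypothetical labels for the four setups described in the message. */
enum ex_rx_strategy {
    EX_RSS_QUEUE_PER_LCORE,        /* RSS, no NUMA: 1 RX queue per lcore */
    EX_RSS_QUEUE_PER_LOCAL_LCORE,  /* RSS + NUMA: queues on the NIC's socket */
    EX_DISTRIBUTOR_ON_MASTER,      /* no RSS, no NUMA: master lcore balances */
    EX_DISTRIBUTOR_PER_SOCKET      /* no RSS + NUMA: one balancer per socket */
};

/* Pick a setup from the two load-time autodetection results. */
enum ex_rx_strategy ex_choose_rx_strategy(bool rss_available,
                                          bool numa_available)
{
    if (rss_available)
        return numa_available ? EX_RSS_QUEUE_PER_LOCAL_LCORE
                              : EX_RSS_QUEUE_PER_LCORE;
    return numa_available ? EX_DISTRIBUTOR_PER_SOCKET
                          : EX_DISTRIBUTOR_ON_MASTER;
}
```

Keeping the decision separate from the setup code would let the same application binary run on a virtual NIC without RSS and on NUMA hardware with RSS, which is the goal stated in the message.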
[dpdk-dev] [dpdk-announce] DPDK Features for Q1 2015
On Fri, Oct 24, 2014 at 12:10:20PM +0200, Thomas Monjalon wrote: > I'm the author of this page. I think I've written VirtualBox to show where > virtio is implemented. You interpreted this as "supported environment", so > I'm removing it. Thanks for testing and reporting. Of course, I'm very sorry to see VirtualBox go, but happy to have accurate documentation. Thanks Thomas. Matthew.
[dpdk-dev] Possible bug in eal_pci pci_scan_one
On Fri, Oct 24, 2014 at 06:36:29PM +0530, Stephen Hemminger wrote: > The code is fairly consistent in returning -1 for cases of not a NUMA socket, > bogus port value. It is interpreted as SOCKET_ID_ANY in several places. > The examples mostly check for -1 and use socket 0 as a fallback. > Probably not worth introducing more return values and breaking existing > applications. OK. So I'll make a patch to correct the comment which was wrong. Matthew.
[dpdk-dev] rte_acl test-acl app
Hi everyone,

I am having trouble successfully performing a packet classification using the rte_acl test app. I have my rules.acl and trace.acl files as follows:

rules.acl:
@192.168.0.0/24 192.168.0.0/24 400 : 500 1000 : 2000 6/0xff

trace.acl:
192.168.0.5 192.168.0.9 450 1002 0x06

However, the result always comes up as 4294967295 (0xffffffff). I have dug through the code quite a bit to follow and see what is going on, but I am not sure where I went wrong. Any help on how the rte_acl_classify function works would be much appreciated. I understand that the data for rte_acl_classify is a uint32_t **, and I double checked to make sure I'm passing along proper values. Is 0xffffffff the expected result? If so, I am getting the same for packets that should not match.

Thank you,
Erik Ziegenbalg