[dpdk-dev] [PATCH v2 0/5] Virtio-net PMD: QEMU QTest extension for container
The patches will work on below patch series. - [PATCH v2 0/5] virtio support for container - [PATCH 0/4] rework ioport access for virtio [Changes] v2 changes: - Rebase on above patch seiries. - Rebase on master - Add "--qtest-virtio" EAL option. - Fixes in qtest.c - Fix error handling for the case qtest connection is closed. - Use eventfd for interrupt messaging. - Use linux header for PCI register definitions. - Fix qtest_raw_send/recv to handle error correctly. - Fix bit mask of PCI_CONFIG_ADDR. - Describe memory and ioport usage of qtest guest in qtest.c - Remove loop that is for finding PCI devices. [Abstraction] Normally, virtio-net PMD only works on VM, because there is no virtio-net device on host. This patches extend virtio-net PMD to be able to work on host as virtual PMD. But we didn't implement virtio-net device as a part of virtio-net PMD. To prepare virtio-net device for the PMD, start QEMU process with special QTest mode, then connect it from virtio-net PMD through unix domain socket. The PMD can connect to anywhere QEMU virtio-net device can. For example, the PMD can connects to vhost-net kernel module and vhost-user backend application. Similar to virtio-net PMD on QEMU, application memory that uses virtio-net PMD will be shared between vhost backend application. But vhost backend application memory will not be shared. Main target of this PMD is container like docker, rkt, lxc and etc. We can isolate related processes(virtio-net PMD process, QEMU and vhost-user backend process) by container. But, to communicate through unix domain socket, shared directory will be needed. [How to use] Please use QEMU-2.5.1, or above. (So far, QEMU-2.5.1 hasn't been released yet, so please checkout master from QEMU repository) - Compile Set "CONFIG_RTE_VIRTIO_VDEV_QTEST=y" in config/common_linux. Then compile it. - Start QEMU like below. $ qemu-system-x86_64 \ -machine pc-i440fx-1.4,accel=qtest \ -display none -qtest-log /dev/null \ -qtest unix:/tmp/socket,server \ -netdev type=tap,script=/etc/qemu-ifup,id=net0,queues=1 \ -device virtio-net-pci,netdev=net0,mq=on,disable-modern=false,addr=3 \ -chardev socket,id=chr1,path=/tmp/ivshmem,server \ -device ivshmem,size=1G,chardev=chr1,vectors=1,addr=4 - Start DPDK application like below $ testpmd -c f -n 1 -m 1024 --no-pci --single-file --qtest-virtio \ --vdev="eth_qtest_virtio0,qtest=/tmp/socket,ivshmem=/tmp/ivshmem"\ -- --disable-hw-vlan --txqflags=0xf00 -i (*1) Please Specify same memory size in QEMU and DPDK command line. (*2) Should use qemu-2.5.1, or above. (*3) QEMU process is needed per port. (*4) virtio-1.0 device are only supported. (*5) The vhost backends like vhost-net and vhost-user can be specified. (*6) In most cases, just using above command is enough, but you can also specify other QEMU virtio-net options. (*7) Only checked "pc-i440fx-1.4" machine, but may work with other machines. It depends on a machine has piix3 south bridge. If the machine doesn't have, virtio-net PMD cannot receive status changed interrupts. (*8) Should not add "--enable-kvm" to QEMU command line. [Detailed Description] - virtio-net device implementation The PMD uses QEMU virtio-net device. To do that, QEMU QTest functionality is used. QTest is a test framework of QEMU devices. It allows us to implement a device driver outside of QEMU. With QTest, we can implement DPDK application and virtio-net PMD as standalone process on host. When QEMU is invoked as QTest mode, any guest code will not run. To know more about QTest, see below. http://wiki.qemu.org/Features/QTest - probing devices QTest provides a unix domain socket. Through this socket, driver process can access to I/O port and memory of QEMU virtual machine. The PMD will send I/O port accesses to probe pci devices. If we can find virtio-net and ivshmem device, initialize the devices. Also, I/O port accesses of virtio-net PMD will be sent through socket, and virtio-net PMD can initialize vitio-net device on QEMU correctly. - ivshmem device to share memory To share memory that virtio-net PMD process uses, ivshmem device will be used. Because ivshmem device can only handle one file descriptor, shared memory should be consist of one file. To allocate such a memory, EAL has new option called "--single-file". Also, the hugepages should be mapped between "1 << 31" to "1 << 44". To map like above, EAL has one more new option called "-qtest-virtio". While initializing ivshmem device, we can set BAR(Base Address Register). It represents which memory QEMU vcpu can access to this shared memory. We will specify host virtual address of shared memory as this address. It is very useful because we don't need to apply patch to QEMU to calculate address offset. (For example, if virtio-net PMD process will allocate memory from shared mem
[dpdk-dev] [PATCH v2 1/5] virtio: Retrieve driver name from eth_dev
Currently, virtio_dev_info_get() retrieves driver name from pci_drv. If the driver is virtual PMD, pci_drv will be invalid. So retrieves the name from eth_dev. Signed-off-by: Tetsuya Mukawa --- drivers/net/virtio/virtio_ethdev.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c index b790fd0..1c8c955 100644 --- a/drivers/net/virtio/virtio_ethdev.c +++ b/drivers/net/virtio/virtio_ethdev.c @@ -1438,7 +1438,7 @@ virtio_dev_info_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info) { struct virtio_hw *hw = dev->data->dev_private; - dev_info->driver_name = dev->driver->pci_drv.name; + dev_info->driver_name = dev->data->drv_name; dev_info->max_rx_queues = (uint16_t)hw->max_rx_queues; dev_info->max_tx_queues = (uint16_t)hw->max_tx_queues; dev_info->min_rx_bufsize = VIRTIO_MIN_RX_BUFSIZE; -- 2.1.4
[dpdk-dev] [PATCH v2 2/5] EAL: Add new EAL "--qtest-virtio" option
To work with qtest virtio-net PMD, virtual address that maps hugepages should be between (1 << 31) to (1 << 44). This patch adds one more option to map like this. Also all hugepages should consists of one file. Because of this, the option will work only when '--single-file' option is specified. Signed-off-by: Tetsuya Mukawa --- lib/librte_eal/common/eal_common_options.c | 10 lib/librte_eal/common/eal_internal_cfg.h | 1 + lib/librte_eal/common/eal_options.h| 2 + lib/librte_eal/linuxapp/eal/eal_memory.c | 81 +- 4 files changed, 93 insertions(+), 1 deletion(-) diff --git a/lib/librte_eal/common/eal_common_options.c b/lib/librte_eal/common/eal_common_options.c index 65bccbd..34c8bd1 100644 --- a/lib/librte_eal/common/eal_common_options.c +++ b/lib/librte_eal/common/eal_common_options.c @@ -96,6 +96,7 @@ eal_long_options[] = { {OPT_VMWARE_TSC_MAP,0, NULL, OPT_VMWARE_TSC_MAP_NUM }, {OPT_XEN_DOM0, 0, NULL, OPT_XEN_DOM0_NUM }, {OPT_SINGLE_FILE, 0, NULL, OPT_SINGLE_FILE_NUM }, + {OPT_QTEST_VIRTIO, 0, NULL, OPT_QTEST_VIRTIO_NUM }, {0, 0, NULL, 0} }; @@ -902,6 +903,10 @@ eal_parse_common_option(int opt, const char *optarg, conf->single_file = 1; break; + case OPT_QTEST_VIRTIO_NUM: + conf->qtest_virtio = 1; + break; + /* don't know what to do, leave this to caller */ default: return 1; @@ -971,6 +976,11 @@ eal_check_common_options(struct internal_config *internal_cfg) "be specified together with --"OPT_SINGLE_FILE"\n"); return -1; } + if (internal_cfg->qtest_virtio && !internal_cfg->single_file) { + RTE_LOG(ERR, EAL, "Option --"OPT_QTEST_VIRTIO" cannot " + "be specified without --"OPT_SINGLE_FILE"\n"); + return -1; + } if (internal_cfg->no_hugetlbfs && internal_cfg->hugepage_unlink) { RTE_LOG(ERR, EAL, "Option --"OPT_HUGE_UNLINK" cannot " diff --git a/lib/librte_eal/common/eal_internal_cfg.h b/lib/librte_eal/common/eal_internal_cfg.h index 9117ed9..7f3df39 100644 --- a/lib/librte_eal/common/eal_internal_cfg.h +++ b/lib/librte_eal/common/eal_internal_cfg.h @@ -71,6 +71,7 @@ struct internal_config { volatile unsigned no_hpet;/**< true to disable HPET */ volatile unsigned vmware_tsc_map; /**< true to use VMware TSC mapping * instead of native TSC */ + volatile unsigned qtest_virtio;/**< mmap hugepages to fit qtest virtio PMD */ volatile unsigned no_shconf; /**< true if there is no shared config */ volatile unsigned create_uio_dev; /**< true to create /dev/uioX devices */ volatile enum rte_proc_type_t process_type; /**< multi-process proc type */ diff --git a/lib/librte_eal/common/eal_options.h b/lib/librte_eal/common/eal_options.h index e5da14a..b33a3c3 100644 --- a/lib/librte_eal/common/eal_options.h +++ b/lib/librte_eal/common/eal_options.h @@ -85,6 +85,8 @@ enum { OPT_XEN_DOM0_NUM, #define OPT_SINGLE_FILE "single-file" OPT_SINGLE_FILE_NUM, +#define OPT_QTEST_VIRTIO "qtest-virtio" + OPT_QTEST_VIRTIO_NUM, OPT_LONG_MAX_NUM }; diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c index a6b3616..677d6a7 100644 --- a/lib/librte_eal/linuxapp/eal/eal_memory.c +++ b/lib/librte_eal/linuxapp/eal/eal_memory.c @@ -1092,6 +1092,73 @@ calc_num_pages_per_socket(uint64_t * memory, } /* + * Find memory space that fits qtest virtio-net PMD. + */ +static void * +rte_eal_get_free_region(uint64_t alloc_size, uint64_t pagesz) +{ + uint64_t start, end, next_start; + uint64_t high_limit, low_limit; + char buf[1024], *p; + FILE *fp; + void *addr = NULL; + + /* all hugepages should be mapped between below values */ + low_limit = 1UL << 31; + high_limit = 1UL << 44; + + /* allocation size should be aligned by page size */ + if (alloc_size != RTE_ALIGN_CEIL(alloc_size, pagesz)) { + rte_panic("Invalid allocation size 0x%lx\n", alloc_size); + return NULL; + } + + /* +* address should be aligned by allocation size because +* BAR register requiers such an address +*/ + low_limit = RTE_ALIGN_CEIL(low_limit, alloc_size); + high_limit = RTE_ALIGN_FLOOR(high_limit, alloc_size); + + fp = fopen("/proc/self/maps", "r"); + if (fp == NULL) { + rte_panic("Cannot open /proc/self/maps\n"); + return NULL; + } + + next_start = 0; + do { + start = next_start; + + if ((p = fgets(buf, sizeof(buf), fp))
[dpdk-dev] [PATCH v2 3/5] vhost: Add a function to check virtio device type
The patch adds below function to cleanup virtio code. - virtio_dev_check() Signed-off-by: Tetsuya Mukawa --- drivers/net/virtio/virtio_ethdev.c | 52 ++ drivers/net/virtio/virtio_ethdev.h | 32 +++ 2 files changed, 57 insertions(+), 27 deletions(-) diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c index 1c8c955..c3e877a 100644 --- a/drivers/net/virtio/virtio_ethdev.c +++ b/drivers/net/virtio/virtio_ethdev.c @@ -371,7 +371,7 @@ int virtio_dev_queue_setup(struct rte_eth_dev *dev, vq->mz = mz; vq->vq_ring_virt_mem = mz->addr; - if (dev->dev_type == RTE_ETH_DEV_PCI) { + if (virtio_dev_check(dev, RTE_ETH_DEV_PCI, NULL, 0)) { vq->vq_ring_mem = mz->phys_addr; /* Virtio PCI device VIRTIO_PCI_QUEUE_PF register is 32bit, @@ -429,7 +429,7 @@ int virtio_dev_queue_setup(struct rte_eth_dev *dev, vq->virtio_net_hdr_vaddr = mz->addr; memset(vq->virtio_net_hdr_vaddr, 0, hdr_size); - if (dev->dev_type == RTE_ETH_DEV_PCI) + if (virtio_dev_check(dev, RTE_ETH_DEV_PCI, NULL, 0)) vq->virtio_net_hdr_mem = mz->phys_addr; #ifdef RTE_VIRTIO_VDEV else @@ -439,7 +439,7 @@ int virtio_dev_queue_setup(struct rte_eth_dev *dev, hw->vtpci_ops->setup_queue(hw, vq); - if (dev->dev_type == RTE_ETH_DEV_PCI) + if (virtio_dev_check(dev, RTE_ETH_DEV_PCI, NULL, 0)) vq->offset = offsetof(struct rte_mbuf, buf_physaddr); #ifdef RTE_VIRTIO_VDEV else @@ -490,15 +490,13 @@ static void virtio_dev_close(struct rte_eth_dev *dev) { struct virtio_hw *hw = dev->data->dev_private; - struct rte_pci_device *pci_dev = dev->pci_dev; PMD_INIT_LOG(DEBUG, "virtio_dev_close"); /* reset the NIC */ - if (dev->dev_type == RTE_ETH_DEV_PCI) { - if (pci_dev->driver->drv_flags & RTE_PCI_DRV_INTR_LSC) - vtpci_irq_config(hw, VIRTIO_MSI_NO_VECTOR); - } + if (virtio_dev_check(dev, RTE_ETH_DEV_PCI, NULL, RTE_PCI_DRV_INTR_LSC)) + vtpci_irq_config(hw, VIRTIO_MSI_NO_VECTOR); + vtpci_reset(hw); hw->started = 0; virtio_dev_free_mbufs(dev); @@ -1001,7 +999,7 @@ virtio_interrupt_handler(__rte_unused struct rte_intr_handle *handle, isr = vtpci_isr(hw); PMD_DRV_LOG(INFO, "interrupt status = %#x", isr); - if (dev->dev_type == RTE_ETH_DEV_PCI) + if (virtio_dev_check(dev, RTE_ETH_DEV_PCI, NULL, 0)) if (rte_intr_enable(&dev->pci_dev->intr_handle) < 0) PMD_DRV_LOG(ERR, "interrupt enable failed"); @@ -1056,9 +1054,10 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev) pci_dev = eth_dev->pci_dev; - if (eth_dev->dev_type == RTE_ETH_DEV_PCI) + if (virtio_dev_check(eth_dev, RTE_ETH_DEV_PCI, NULL, 0)) { if (vtpci_init(pci_dev, hw) < 0) return -1; + } /* Reset the device although not necessary at startup */ vtpci_reset(hw); @@ -1072,7 +1071,7 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev) return -1; /* If host does not support status then disable LSC */ - if (eth_dev->dev_type == RTE_ETH_DEV_PCI) { + if (virtio_dev_check(eth_dev, RTE_ETH_DEV_PCI, NULL, 0)) { if (!vtpci_with_feature(hw, VIRTIO_NET_F_STATUS)) pci_dev->driver->drv_flags &= ~RTE_PCI_DRV_INTR_LSC; @@ -1154,13 +1153,14 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev) PMD_INIT_LOG(DEBUG, "hw->max_rx_queues=%d hw->max_tx_queues=%d", hw->max_rx_queues, hw->max_tx_queues); - if (eth_dev->dev_type == RTE_ETH_DEV_PCI) { + if (virtio_dev_check(eth_dev, RTE_ETH_DEV_PCI, NULL, 0)) { PMD_INIT_LOG(DEBUG, "port %d vendorID=0x%x deviceID=0x%x", eth_dev->data->port_id, pci_dev->id.vendor_id, pci_dev->id.device_id); /* Setup interrupt callback */ - if (pci_dev->driver->drv_flags & RTE_PCI_DRV_INTR_LSC) + if (virtio_dev_check(eth_dev, RTE_ETH_DEV_PCI, + NULL, RTE_PCI_DRV_INTR_LSC)) rte_intr_callback_register(&pci_dev->intr_handle, virtio_interrupt_handler, eth_dev); @@ -1197,11 +1197,11 @@ eth_virtio_dev_uninit(struct rte_eth_dev *eth_dev) eth_dev->data->mac_addrs = NULL; /* reset interrupt callback */ - if (eth_dev->dev_type == RTE_ETH_DEV_PCI) - if (pci_dev->driver->drv_flags & RTE_PCI_DRV_INTR_LSC) - rte_intr_callback_unregister(&pci_dev->intr_handle, -vir
[dpdk-dev] [PATCH v2 4/5] virtio: Add support for qtest virtio-net PMD
The patch adds a new virtio-net PMD configuration that allows the PMD to work on host as if the PMD is in VM. Here is new configuration for virtio-net PMD. - CONFIG_RTE_VIRTIO_VDEV_QTEST To use this mode, EAL needs map all hugepages as one file. Also the file should be mapped between (1 << 31) and (1 << 44). To allocate like above, add "--single-file" and "--qtest-virtio" option to application command line. To prepare virtio-net device on host, the users need to invoke QEMU process in special qtest mode. This mode is mainly used for testing QEMU devices from outer process. In this mode, no guest runs. Here is QEMU command line. $ qemu-system-x86_64 \ -machine pc-i440fx-1.4,accel=qtest \ -display none -qtest-log /dev/null \ -qtest unix:/tmp/socket,server \ -netdev type=tap,script=/etc/qemu-ifup,id=net0,queues=1 \ -device virtio-net-pci,netdev=net0,mq=on,disable-modern=false,addr=3 \ -chardev socket,id=chr1,path=/tmp/ivshmem,server \ -device ivshmem,size=1G,chardev=chr1,vectors=1,addr=4 * Should use qemu-2.5.1, or above. * QEMU process is needed per port. * virtio-1.0 device are only supported. * The vhost backends like vhost-net and vhost-user can be specified. * In most cases, just using above command is enough, but you can also specify other QEMU virtio-net options. * Only checked "pc-i440fx-1.4" machine, but may work with other machines. It depends on a machine has piix3 south bridge. If the machine doesn't have, virtio-net PMD cannot receive status changed interrupts. * Should not add "--enable-kvm" to QEMU command line. After invoking QEMU, the PMD can connect to QEMU process using unix domain sockets. Over these sockets, virtio-net, ivshmem and piix3 device in QEMU are probed by the PMD. Here is example of command line. $ testpmd -c f -n 1 -m 1024 --no-pci --single-file --qtest-virtio \ --vdev="eth_qtest_virtio0,qtest=/tmp/socket,ivshmem=/tmp/ivshmem"\ -- --disable-hw-vlan --txqflags=0xf00 -i Please specify same unix domain sockets and memory size in both QEMU and DPDK command lines like above. The share memory size should be power of 2, because ivshmem only accepts such memory size. Signed-off-by: Tetsuya Mukawa --- config/common_linuxapp |1 + drivers/net/virtio/Makefile|4 + drivers/net/virtio/qtest.c | 1342 drivers/net/virtio/qtest.h | 65 ++ drivers/net/virtio/virtio_ethdev.c | 383 +- drivers/net/virtio/virtio_pci.c| 364 +- drivers/net/virtio/virtio_pci.h|5 +- 7 files changed, 2122 insertions(+), 42 deletions(-) create mode 100644 drivers/net/virtio/qtest.c create mode 100644 drivers/net/virtio/qtest.h diff --git a/config/common_linuxapp b/config/common_linuxapp index f76e162..7cbf50d 100644 --- a/config/common_linuxapp +++ b/config/common_linuxapp @@ -539,3 +539,4 @@ CONFIG_RTE_TEST_PMD_RECORD_BURST_STATS=n # Enable virtio support for container # CONFIG_RTE_VIRTIO_VDEV=y +CONFIG_RTE_VIRTIO_VDEV_QTEST=y diff --git a/drivers/net/virtio/Makefile b/drivers/net/virtio/Makefile index ef920f9..6c11378 100644 --- a/drivers/net/virtio/Makefile +++ b/drivers/net/virtio/Makefile @@ -56,6 +56,10 @@ ifeq ($(CONFIG_RTE_VIRTIO_VDEV),y) SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += vhost_embedded.c endif +ifeq ($(CONFIG_RTE_VIRTIO_VDEV_QTEST),y) + SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += qtest.c +endif + # this lib depends upon: DEPDIRS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += lib/librte_eal lib/librte_ether DEPDIRS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += lib/librte_mempool lib/librte_mbuf diff --git a/drivers/net/virtio/qtest.c b/drivers/net/virtio/qtest.c new file mode 100644 index 000..418214f --- /dev/null +++ b/drivers/net/virtio/qtest.c @@ -0,0 +1,1342 @@ +/*- + * BSD LICENSE + * + * Copyright(c) 2016 IGEL Co., Ltd. All rights reserved. + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of IGEL Co., Ltd. nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL
[dpdk-dev] [PATCH v2 5/5] docs: add release note for qtest virtio container support
Signed-off-by: Tetsuya Mukawa --- doc/guides/rel_notes/release_2_3.rst | 3 +++ 1 file changed, 3 insertions(+) diff --git a/doc/guides/rel_notes/release_2_3.rst b/doc/guides/rel_notes/release_2_3.rst index 1e7d51d..ab3baf9 100644 --- a/doc/guides/rel_notes/release_2_3.rst +++ b/doc/guides/rel_notes/release_2_3.rst @@ -43,6 +43,9 @@ This section should contain new features added in this release. Sample format: Add a new virtual device, named eth_cvio, to support virtio for containers. +* **Virtio support for containers using QEMU qtest mode.** + Add a new virtual device, named eth_qtest_virtio, to support virtio for containers + using QEMU qtest mode. Resolved Issues --- -- 2.1.4
[dpdk-dev] [PATCH v3] driver/net/mpipe: fix the crash/hung issue when testpmd quits
Fixes: the hung/crash issue when quitting testpmd under high traffic rate. The following issue were found and fixed. 1. edesc->size is not initialized properly in mpipe_do_xmit() and could cause buffer leak or corruption when HW buffer return is used. 2. Check the 'idesc.be' error bit in mpipe_recv_flush() to make sure buffer is valid before releasing it. This is to avoid issues when running out of buffers. 3. priv->rx_buffers counter is not accurate when HW buffer return is used. Remove this counter to simplify the code. Signed-off-by: Liming Sun Acked-by: Zhigang Lu --- drivers/net/mpipe/mpipe_tilegx.c | 46 ++-- 1 file changed, 16 insertions(+), 30 deletions(-) diff --git a/drivers/net/mpipe/mpipe_tilegx.c b/drivers/net/mpipe/mpipe_tilegx.c index 8d006fa..4cb54c3 100644 --- a/drivers/net/mpipe/mpipe_tilegx.c +++ b/drivers/net/mpipe/mpipe_tilegx.c @@ -134,7 +134,6 @@ struct mpipe_dev_priv { struct rte_mempool *rx_mpool; /* mpool used by the rx queues. */ unsigned rx_offset; /* Receive head room. */ unsigned rx_size_code; /* mPIPE rx buffer size code. */ - unsigned rx_buffers;/* receive buffers on stack. */ int is_xaui:1, /* Is this an xgbe or gbe? */ initialized:1, /* Initialized port? */ running:1; /* Running port? */ @@ -529,7 +528,6 @@ mpipe_recv_fill_stack(struct mpipe_dev_priv *priv, int count) mpipe_recv_push(priv, mbuf); } - priv->rx_buffers += count; PMD_DEBUG_RX("%s: Filled %d/%d buffers\n", mpipe_name(priv), i, count); } @@ -539,10 +537,9 @@ mpipe_recv_flush_stack(struct mpipe_dev_priv *priv) const int offset = priv->rx_offset & ~RTE_MEMPOOL_ALIGN_MASK; uint8_t in_port = priv->port_id; struct rte_mbuf *mbuf; - unsigned count; void *va; - for (count = 0; count < priv->rx_buffers; count++) { + while (1) { va = gxio_mpipe_pop_buffer(priv->context, priv->stack); if (!va) break; @@ -561,10 +558,6 @@ mpipe_recv_flush_stack(struct mpipe_dev_priv *priv) __rte_mbuf_raw_free(mbuf); } - - PMD_DEBUG_RX("%s: Returned %d/%d buffers\n", -mpipe_name(priv), count, priv->rx_buffers); - priv->rx_buffers -= count; } static void @@ -1246,31 +1239,23 @@ mpipe_recv_flush(struct mpipe_dev_priv *priv) gxio_mpipe_iqueue_t *iqueue; gxio_mpipe_idesc_t idesc; struct rte_mbuf *mbuf; - int retries = 0; unsigned queue; - do { - mpipe_recv_flush_stack(priv); - - /* Flush packets sitting in recv queues. */ - for (queue = 0; queue < priv->nb_rx_queues; queue++) { - rx_queue = mpipe_rx_queue(priv, queue); - iqueue = &rx_queue->iqueue; - while (gxio_mpipe_iqueue_try_get(iqueue, &idesc) >= 0) { - mbuf = mpipe_recv_mbuf(priv, &idesc, in_port); - rte_pktmbuf_free(mbuf); - priv->rx_buffers--; - } - rte_free(rx_queue->rx_ring_mem); - } - } while (retries++ < 10 && priv->rx_buffers); + /* Release packets on the buffer stack. */ + mpipe_recv_flush_stack(priv); - if (priv->rx_buffers) { - RTE_LOG(ERR, PMD, "%s: Leaked %d receive buffers.\n", - mpipe_name(priv), priv->rx_buffers); - } else { - PMD_DEBUG_RX("%s: Returned all receive buffers.\n", -mpipe_name(priv)); + /* Flush packets sitting in recv queues. */ + for (queue = 0; queue < priv->nb_rx_queues; queue++) { + rx_queue = mpipe_rx_queue(priv, queue); + iqueue = &rx_queue->iqueue; + while (gxio_mpipe_iqueue_try_get(iqueue, &idesc) >= 0) { + /* Skip idesc with the 'buffer error' bit set. */ + if (idesc.be) + continue; + mbuf = mpipe_recv_mbuf(priv, &idesc, in_port); + rte_pktmbuf_free(mbuf); + } + rte_free(rx_queue->rx_ring_mem); } } @@ -1339,6 +1324,7 @@ mpipe_do_xmit(struct mpipe_tx_queue *tx_queue, struct rte_mbuf **tx_pkts, .xfer_size = rte_pktmbuf_data_len(mbuf), .bound = next ? 0 : 1, .stack_idx = mpipe_mbuf_stack_index(priv, mbuf), + .size = priv->rx_size_code, } }; if (mpipe_local.mbuf_push_debt[port_id] > 0) { mpipe_local.mbuf_push_debt[port_id]--; -- 1.8.3.
[dpdk-dev] [PATCH 3/9] drivers: no more pdev drivers
On Tue, Feb 9, 2016 at 6:05 PM, Jan Viktorin wrote: > Maybe, a better subject? > > drivers: init pdev drivers in constructors Why not, I will try to find a best one, and if I can't, I will go with this. > On Fri, 29 Jan 2016 15:08:30 +0100 > David Marchand wrote: >> -static int >> -rte_qat_pmd_init(const char *name __rte_unused, const char *params >> __rte_unused) >> +/* Driver registration */ >> +static void __attribute__((constructor, used)) >> +rte_qat_pmd_init(void) >> { >> PMD_INIT_FUNC_TRACE(); >> rte_eal_pci_register(&rte_qat_pmd.pci_drv); >> - return 0; >> } >> - >> -static struct rte_driver pmd_qat_drv = { >> - .type = PMD_PDEV, >> - .init = rte_qat_pmd_init, >> -}; >> - >> -PMD_REGISTER_DRIVER(pmd_qat_drv); > > What about introducing a macro for this? > > RTE_REGISTER_PCI_DRIVER(rte_qad_pmd); Yes. -- David Marchand
[dpdk-dev] [PATCH 3/9] drivers: no more pdev drivers
On Wed, Feb 10, 2016 at 9:51 AM, David Marchand wrote: > On Tue, Feb 9, 2016 at 6:05 PM, Jan Viktorin > wrote: >> What about introducing a macro for this? >> >> RTE_REGISTER_PCI_DRIVER(rte_qad_pmd); > > Yes. The only problem here, is that rte_qad_pmd is a crypto structure (same problem with ethdev), so I can't just pass it to eal. I can't just pass the pci driver, for the cases where multiple drivers are registered in a single file (look at ixgbe driver). So, how about : In rte_pci.h : #define RTE_EAL_PCI_REGISTER(name, d)\ void pciinitfn_ ##name(void);\ void __attribute__((constructor, used)) pciinitfn_ ##name(void)\ {\ rte_eal_pci_register(d);\ } Then, in qat driver : RTE_EAL_PCI_REGISTER(qat, &rte_qat_pmd.pci_drv); -- David Marchand
[dpdk-dev] [PATCH v2 1/3] driver/net/mpipe: support native build on tilegx platform.
On Tue, Feb 09, 2016 at 11:47:55PM +0100, Thomas Monjalon wrote: > 2016-02-09 21:15, Liming Sun: > > Looks like this patch serie has been merged into dpdk-next-net/rel_16_04. > > What would be the usual way to submit changes for new comments? Would it be > > incremental changes (new commit) based on previous one? Thanks. > > Good question. > I think it's better if Bruce drops or reverts the commits from dpdk-next-net > to let you re-submit a better new version. > Bruce, do you agree? Unless there is something actually broken - that was previously working - by this patchset I'd rather not revert it. This patch was sitting acked for a month which is a reasonable time for comments before applying it. Allowing people to step up post-apply and look for patches being reverted is not something we want to encourage IMHO. There are already too many reviews being done at the last minute, and allowing reverts may make that situation worse, while applying acked patches within a reasonable time - irrespective of whether people subsequently find issues with them - should encourage earlier reviews, and makes it easier on contributors. Therefore I'd rather see any additional enhancements or changes done as incremental patches on top of this set. Regards, /Bruce
[dpdk-dev] [PATCH v2 1/3] driver/net/mpipe: support native build on tilegx platform.
2016-02-10 09:49, Bruce Richardson: > On Tue, Feb 09, 2016 at 11:47:55PM +0100, Thomas Monjalon wrote: > > 2016-02-09 21:15, Liming Sun: > > > Looks like this patch serie has been merged into dpdk-next-net/rel_16_04. > > > What would be the usual way to submit changes for new comments? Would it > > > be incremental changes (new commit) based on previous one? Thanks. > > > > Good question. > > I think it's better if Bruce drops or reverts the commits from dpdk-next-net > > to let you re-submit a better new version. > > Bruce, do you agree? > > Unless there is something actually broken - that was previously working - by > this patchset I'd rather not revert it. This patch was sitting acked for a > month > which is a reasonable time for comments before applying it. Allowing people to > step up post-apply and look for patches being reverted is not something we > want > to encourage IMHO. There are already too many reviews being done at the last > minute, and allowing reverts may make that situation worse, while applying > acked > patches within a reasonable time - irrespective of whether people subsequently > find issues with them - should encourage earlier reviews, and makes it easier > on > contributors. Yes you are right. > Therefore I'd rather see any additional enhancements or changes > done as incremental patches on top of this set. > > Regards, > /Bruce >
[dpdk-dev] Regarding mbuf allocation/free in secondary process
On Tue, Feb 09, 2016 at 11:43:19PM -0800, Saravana Kumar wrote: > Hi DPDK community, > > > > I'd like to have DPDK NIC IO operations in (primary) process and > execution logic in (secondary) processes. > Primary process pushes NIC Rx mbufs to Secondary process through S/W ring > > Seconary process allocates mbuf for Tx path and pushes down to Primary > process for NIC Tx > > > I have few doubts here: > > 1. If Secondary process dies because of SIGKILL then how can the mbufs > allocated in Secondary process can be freed. >If it is normal signals like SIGINT/SIGTERM then we can be catch > those and free in those respective signal handlers If a process terminates abnormally then the buffers being used by that process may well be leaked. The solution you propose of catching signals will certainly help as you want to try and ensure that a process always frees all its buffers properly on termination. > > 2. Secondary process needs to poll on the S/W ring. This can consume 100% cpu. >Is there a way to avoid polling in secondary process for Rx path Not using DPDK software rings, no. You'd have to use some kernel constructs such as fifo's/named pipes to do blocking reads on those. However, the overhead of using such structures can be severe making them unusable for many packet processing applications. An alternative might be to use some small sleep calls i.e. nanosleep between polls of the SW ring in cases where traffic rates are low. That will reduce your cpu usage. /Bruce
[dpdk-dev] [PATCH 0/2] bonding fixes
These patches fix segmentation faults which were occurring when slave devices were detached before being removed from the bonding device. The slave devices must now be removed from the bonding device before they can be detached. The bonding device cannot be detached now until all slave devices have been removed from it. Bernare Iremonger (2): bonding: fix detach of bonded device bonding: fix detach of bonded slave devices drivers/net/bonding/rte_eth_bond_api.c | 40 +++--- lib/librte_ether/rte_ethdev.c | 8 +-- lib/librte_ether/rte_ethdev.h | 4 +++- 3 files changed, 26 insertions(+), 26 deletions(-) -- 2.6.3
[dpdk-dev] [PATCH 2/2] bonding: fix detach of bonded slave devices
Ensure that a bonded slave device is not detached, until it is removed from the bonded device. Fixes: 2efb58cbab6e ("bond: new link bonding library") Fixes: a45b288ef21a ("bond: support link status polling") Fixes: 494adb7f63f2 ("ethdev: add device fields from PCI layer") Fixes: b1fb53a39d88 ("ethdev: remove some PCI specific handling") Signed-off-by: Bernard Iremonger --- drivers/net/bonding/rte_eth_bond_api.c | 33 +++-- lib/librte_ether/rte_ethdev.c | 8 ++-- lib/librte_ether/rte_ethdev.h | 4 +++- 3 files changed, 20 insertions(+), 25 deletions(-) diff --git a/drivers/net/bonding/rte_eth_bond_api.c b/drivers/net/bonding/rte_eth_bond_api.c index 484a6f3..a0995ec 100644 --- a/drivers/net/bonding/rte_eth_bond_api.c +++ b/drivers/net/bonding/rte_eth_bond_api.c @@ -314,38 +314,23 @@ __eth_bond_slave_add_lock_free(uint8_t bonded_port_id, uint8_t slave_port_id) { struct rte_eth_dev *bonded_eth_dev, *slave_eth_dev; struct bond_dev_private *internals; - struct bond_dev_private *temp_internals; struct rte_eth_link link_props; struct rte_eth_dev_info dev_info; - int i, j; - if (valid_slave_port_id(slave_port_id) != 0) return -1; bonded_eth_dev = &rte_eth_devices[bonded_port_id]; internals = bonded_eth_dev->data->dev_private; - /* Verify that new slave device is not already a slave of another -* bonded device */ - for (i = rte_eth_dev_count()-1; i >= 0; i--) { - if (check_for_bonded_ethdev(&rte_eth_devices[i]) == 0) { - temp_internals = rte_eth_devices[i].data->dev_private; - - for (j = 0; j < temp_internals->slave_count; j++) { - /* Device already a slave of a bonded device */ - if (temp_internals->slaves[j].port_id == slave_port_id) { - RTE_BOND_LOG(ERR, "Slave port %d is already a slave", - slave_port_id); - return -1; - } - } - } - } - slave_eth_dev = &rte_eth_devices[slave_port_id]; + if (slave_eth_dev->data->dev_flags & RTE_ETH_DEV_BONDED_SLAVE) { + RTE_BOND_LOG(ERR, "Slave device is already a slave of a bonded device"); + return -1; + } /* Add slave details to bonded device */ + slave_eth_dev->data->dev_flags |= RTE_ETH_DEV_BONDED_SLAVE; slave_add(internals, slave_eth_dev); rte_eth_dev_info_get(slave_port_id, &dev_info); @@ -385,6 +370,7 @@ __eth_bond_slave_add_lock_free(uint8_t bonded_port_id, uint8_t slave_port_id) if (internals->link_props_set) { if (link_properties_valid(&(bonded_eth_dev->data->dev_link), &(slave_eth_dev->data->dev_link))) { + slave_eth_dev->data->dev_flags &= (~RTE_ETH_DEV_BONDED_SLAVE); RTE_BOND_LOG(ERR, "Slave port %d link speed/duplex not supported", slave_port_id); @@ -416,6 +402,7 @@ __eth_bond_slave_add_lock_free(uint8_t bonded_port_id, uint8_t slave_port_id) if (bonded_eth_dev->data->dev_started) { if (slave_configure(bonded_eth_dev, slave_eth_dev) != 0) { + slave_eth_dev->data->dev_flags &= (~RTE_ETH_DEV_BONDED_SLAVE); RTE_BOND_LOG(ERR, "rte_bond_slaves_configure: port=%d", slave_port_id); return -1; @@ -468,7 +455,7 @@ __eth_bond_slave_remove_lock_free(uint8_t bonded_port_id, uint8_t slave_port_id) { struct rte_eth_dev *bonded_eth_dev; struct bond_dev_private *internals; - + struct rte_eth_dev *slave_eth_dev; int i, slave_idx; if (valid_slave_port_id(slave_port_id) != 0) @@ -508,7 +495,9 @@ __eth_bond_slave_remove_lock_free(uint8_t bonded_port_id, uint8_t slave_port_id) mac_address_set(&rte_eth_devices[slave_port_id], &(internals->slaves[slave_idx].persisted_mac_addr)); - slave_remove(internals, &rte_eth_devices[slave_port_id]); + slave_eth_dev = &rte_eth_devices[slave_port_id]; + slave_remove(internals, slave_eth_dev); + slave_eth_dev->data->dev_flags &= (~RTE_ETH_DEV_BONDED_SLAVE); /* first slave in the active list will be the primary by default, * otherwise use first device in list */ diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c index 756b234..a6e83c1 100644 --- a/lib/librte_ether/rte_ethdev.c +++ b/lib/librte_ether/rte_ethdev.c @@ -1,7 +1,7 @@ /*- * BSD LICE
[dpdk-dev] [PATCH 1/2] bonding: fix detach of bonded device
Check that the bonded device has no slaves before detaching it. Fixes: 8d30fe7fa737 ("bonding: support port hotplug") Signed-off-by: Bernard Iremonger --- drivers/net/bonding/rte_eth_bond_api.c | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/drivers/net/bonding/rte_eth_bond_api.c b/drivers/net/bonding/rte_eth_bond_api.c index 8a000c8..484a6f3 100644 --- a/drivers/net/bonding/rte_eth_bond_api.c +++ b/drivers/net/bonding/rte_eth_bond_api.c @@ -1,7 +1,7 @@ /*- * BSD LICENSE * - * Copyright(c) 2010-2015 Intel Corporation. All rights reserved. + * Copyright(c) 2010-2016 Intel Corporation. All rights reserved. * All rights reserved. * * Redistribution and use in source and binary forms, with or without @@ -277,6 +277,7 @@ int rte_eth_bond_free(const char *name) { struct rte_eth_dev *eth_dev = NULL; + struct bond_dev_private *internals; /* now free all data allocation - for eth_dev structure, * dummy pci driver and internal (private) data @@ -287,6 +288,10 @@ rte_eth_bond_free(const char *name) if (eth_dev == NULL) return -ENODEV; + internals = eth_dev->data->dev_private; + if (internals->slave_count != 0) + return -EBUSY; + if (eth_dev->data->dev_started == 1) { bond_ethdev_stop(eth_dev); bond_ethdev_close(eth_dev); -- 2.6.3
[dpdk-dev] Regarding mbuf allocation/free in secondary process
Thanks for your response.. Sara On Wed, Feb 10, 2016 at 2:01 AM, Bruce Richardson < bruce.richardson at intel.com> wrote: > On Tue, Feb 09, 2016 at 11:43:19PM -0800, Saravana Kumar wrote: > > Hi DPDK community, > > > > > > > > I'd like to have DPDK NIC IO operations in (primary) process and > > execution logic in (secondary) processes. > > Primary process pushes NIC Rx mbufs to Secondary process through S/W ring > > > > Seconary process allocates mbuf for Tx path and pushes down to Primary > > process for NIC Tx > > > > > > I have few doubts here: > > > > 1. If Secondary process dies because of SIGKILL then how can the mbufs > > allocated in Secondary process can be freed. > >If it is normal signals like SIGINT/SIGTERM then we can be catch > > those and free in those respective signal handlers > > If a process terminates abnormally then the buffers being used by that > process > may well be leaked. The solution you propose of catching signals will > certainly > help as you want to try and ensure that a process always frees all its > buffers > properly on termination. > > > > > 2. Secondary process needs to poll on the S/W ring. This can consume > 100% cpu. > >Is there a way to avoid polling in secondary process for Rx path > > Not using DPDK software rings, no. You'd have to use some kernel > constructs such as > fifo's/named pipes to do blocking reads on those. However, the overhead of > using > such structures can be severe making them unusable for many packet > processing > applications. An alternative might be to use some small sleep calls i.e. > nanosleep > between polls of the SW ring in cases where traffic rates are low. That > will > reduce your cpu usage. > > /Bruce > >
[dpdk-dev] [PATCH 3/9] drivers: no more pdev drivers
On Wed, 10 Feb 2016 10:27:14 +0100 David Marchand wrote: > On Wed, Feb 10, 2016 at 9:51 AM, David Marchand > wrote: > > On Tue, Feb 9, 2016 at 6:05 PM, Jan Viktorin > > wrote: > >> What about introducing a macro for this? > >> > >> RTE_REGISTER_PCI_DRIVER(rte_qad_pmd); > > > > Yes. > > The only problem here, is that rte_qad_pmd is a crypto structure (same > problem with ethdev), so I can't just pass it to eal. > I can't just pass the pci driver, for the cases where multiple drivers > are registered in a single file (look at ixgbe driver). > > So, how about : > > In rte_pci.h : > > #define RTE_EAL_PCI_REGISTER(name, d)\ > void pciinitfn_ ##name(void);\ > void __attribute__((constructor, used)) pciinitfn_ ##name(void)\ > {\ > rte_eal_pci_register(d);\ I meant rte_eal_pci_register(&(d)->pci_drv);\ Perhaps, my assumption that a PCI driver is always referred as pci_drv is wrong... > } > > Then, in qat driver : > RTE_EAL_PCI_REGISTER(qat, &rte_qat_pmd.pci_drv); > However, I think that this is a detail. Passing rte_pci_driver is good. Regards Jan -- Jan Viktorin E-mail: Viktorin at RehiveTech.com System Architect Web:www.RehiveTech.com RehiveTech Brno, Czech Republic
[dpdk-dev] i40e: cannot change mtu to enable jumbo frame
On 02/09/2016 08:05 PM, Zhu, Heqing wrote: > Helin is still in Chinese New Year Vacation. Will the below command option > help ? > > 4.5.9. port config - max-pkt-len > Set the maximum packet length: > > testpmd> port config all max-pkt-len (value) > This is equivalent to the --max-pkt-len command-line option. > > > -Original Message- > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Julien Meunier > Sent: Tuesday, February 9, 2016 9:36 AM > To: Zhang, Helin ; dev at dpdk.org > Subject: [dpdk-dev] i40e: cannot change mtu to enable jumbo frame > > Hello Helin, > > I tried to send jumbo frames to a i40e card. However, I observed that all > frames are dropped. Moreover, set_mtu function is not implemented on i40e PMD. > > > testpmd --log-level 8 --huge-dir=/mnt/huge -n 4 -l 2,18 --socket-mem > 1024,1024 -w :02:00.0 -w :02:00.2 -- -i --nb-cores=1 > --nb-ports=2 --total-num-mbufs=65536 > > = > Configuration > = > > +---+ +-+ > | | | | > | tgen | | | > | +--+ port 0 | > | | | | > | | | | > | | | | > | | | | > | +--+ port 1 | > | | | | > +---+ +-+ > > DPDK: DPDK-v2.2 > > == > MTU = 1500 > == > Packet sent from a tgen > > p = Ether / IP / UDP / Raw(MTU + HDR(Ethernet)- HDR(IP) - HDR(UDP)) > > len(p) = 1514 > > testpmd> start > PMD: i40e_rxd_to_vlan_tci(): Mbuf vlan_tci: 0, vlan_tci_outer: 0 > testpmd> stop > Telling cores to stop... > Waiting for lcores to finish... > PMD: i40e_update_vsi_stats(): *** VSI[13] stats start *** > PMD: i40e_update_vsi_stats(): rx_bytes:1518 > PMD: i40e_update_vsi_stats(): rx_unicast: 1 > PMD: i40e_update_vsi_stats(): *** VSI[13] stats end *** > PMD: i40e_dev_stats_get(): *** PF stats start *** > PMD: i40e_dev_stats_get(): rx_bytes:1514 > PMD: i40e_dev_stats_get(): rx_unicast: 1 > PMD: i40e_dev_stats_get(): rx_unknown_protocol: 1 > PMD: i40e_dev_stats_get(): rx_size_1522: 1 > PMD: i40e_dev_stats_get(): *** PF stats end *** > > -- Forward statistics for port 0 -- > RX-packets: 1 RX-dropped: 0 RX-total: 1 > TX-packets: 0 TX-dropped: 0 TX-total: 0 > > > PMD: i40e_update_vsi_stats(): *** VSI[14] stats start *** > PMD: i40e_update_vsi_stats(): tx_bytes:1514 > PMD: i40e_update_vsi_stats(): tx_unicast: 1 > PMD: i40e_update_vsi_stats(): *** VSI[14] stats end *** > PMD: i40e_dev_stats_get(): *** PF stats start *** > PMD: i40e_dev_stats_get(): tx_bytes:1514 > PMD: i40e_dev_stats_get(): tx_unicast: 1 > PMD: i40e_dev_stats_get(): tx_size_1522: 1 > PMD: i40e_dev_stats_get(): *** PF stats end *** > > -- Forward statistics for port 1 -- > RX-packets: 0 RX-dropped: 0 RX-total: 0 > TX-packets: 1 TX-dropped: 0 TX-total: 1 > > > > + Accumulated forward statistics for all ports+ > RX-packets: 1 RX-dropped: 0 RX-total: 1 > TX-packets: 1 TX-dropped: 0 TX-total: 1 > > > > => OK > > == > MTU = 1600 > == > Packet sent > > p = Ether / IP / UDP / Raw(MTU + HDR(Ethernet)- HDR(IP) - HDR(UDP)) > > len(p) = 1614 > > testpmd> port config mtu 0 1600 > rte_eth_dev_set_mtu: Function not supported Set MTU failed. diag=-95 > testpmd> port config mtu 1 1600 > rte_eth_dev_set_mtu: Function not supported Set MTU failed. diag=-95 > testpmd> start > testpmd> stop > Telling cores to stop... > Waiting for lcores to finish... > PMD: i40e_update_vsi_stats(): *** VSI[13] stats start *** > PMD: i40e_update_vsi_stats(): rx_bytes:1618 > PMD: i40e_update_vsi_stats(): rx_unicast: 1 > PMD: i40e_update_vsi_stats(): *** VSI[13] stats end *** > PMD: i40e_dev_stats_get(): *** PF stats start *** > PMD: i40e_dev_stats_get(): rx_bytes:1614 > PMD: i40e_dev_stats_get(): rx_unicast: 1 > PMD: i40e_dev_stats_get(): rx_unknown_protocol: 1 > PMD: i40e_dev_stats_get(): rx_size_big: 1 > PMD: i40e_dev_stats_get(): *** PF stats end *** > > -- Forward statistics for port 0 -- > RX-packets: 1 RX-dropped: 0 RX-total: 1 > TX-packets: 0 TX-dropped: 0 TX-total: 0 > > > PMD: i4
[dpdk-dev] [dpdk-users] DPDK i40evf problem in receving packet
Hi Saurabh, Can you tell me what card you are uing *Regards:* Zain ul Abideen *Disclaimer**: The information contained in this e-mail and any attachments is confidential; it is intended only for use of the individual or entity named above. If the reader of this message is not the intended recipient, you are notified that any dissemination, distribution or use of this information is strictly prohibited. If you have received this communication in error, please delete it and email confirmation to the sender. Thank You.* On Wed, Feb 10, 2016 at 6:30 AM, Saurabh Mishra wrote: > Hi Qian -- > > Any suggestions? This is bit urgent. > > /Saurabh > > On Sat, Feb 6, 2016 at 9:22 AM, Saurabh Mishra > wrote: > > > Hi Qian -- > > > > > > Here's the data from Host: > > > > [root at oscompute3 ~]# ethtool -i p3p1 > > > > driver: i40e > > > > version: 1.0.11-k > > > > firmware-version: f4.40 a1.4 n04.53 e80001dc0 > > > > bus-info: :04:00.0 > > > > supports-statistics: yes > > > > supports-test: yes > > > > supports-eeprom-access: yes > > > > supports-register-dump: yes > > > > supports-priv-flags: no > > > > [root at oscompute3 ~]# ethtool -i p3p2 > > > > driver: i40e > > > > version: 1.0.11-k > > > > firmware-version: f4.40 a1.4 n04.53 e80001dc0 > > > > bus-info: :04:00.1 > > > > supports-statistics: yes > > > > supports-test: yes > > > > supports-eeprom-access: yes > > > > supports-register-dump: yes > > > > supports-priv-flags: no > > > > [root at oscompute3 ~]# > > > > EthApp> drvinfo > > > > Port 0 driver: rte_i40evf_pmd (ver: RTE 2.2.0) > > > > Port 1 driver: rte_i40evf_pmd (ver: RTE 2.2.0) > > > > EthApp> > > > > On Fri, Feb 5, 2016 at 4:59 PM, Xu, Qian Q wrote: > > > >> What's your current firmware info, can u run ethtool -i port_interface > to > >> check? > >> > >> Thanks > >> Qian > >> > >> -Original Message- > >> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Saurabh Mishra > >> Sent: Saturday, February 06, 2016 6:33 AM > >> To: dev at dpdk.org; users at dpdk.org > >> Subject: [dpdk-dev] DPDK i40evf problem in receving packet > >> > >> Hi, > >> > >> I'm seeing two problems: > >> > >> 1) when use our kernel '3.10.88-8.0.0.0.6', we only receive first > >> packet but not subsequent ones at all after that. However, when I use > >> centos7.0, then l2fwd is able to receive all the packets. > >> > >> 2) I've also seen that on centos7.0, symmetric_mp itself is not working. > >> dev start fails with 280 error. > >> > >> i40evf is giving us lot of headache. The i40evf kernel driver works > >> without any problem. Host is a centos7 KVM. I've already upgraded > firmware > >> to latest. > >> > >> [root at localhost ~]# uname -a > >> > >> Linux localhost.localdomain 3.10.0-327.4.5.el7.x86_64 #1 SMP Mon Jan 25 > >> 22:07:14 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux > >> > >> [root at localhost ~]# > >> > >> [root at localhost ~]# ./symmetric_mp fakeelf --file-prefix=virtaddr -c2 > >> -m2048 -n1 --base-virtaddr=0x7fa0 --proc-type=primary -- -p 3 > >> --num-procs=1 --proc-id=0 > >> > >> EAL: Detected lcore 0 as core 0 on socket 0 > >> > >> EAL: Detected lcore 1 as core 0 on socket 0 > >> > >> EAL: Support maximum 128 logical core(s) by configuration. > >> > >> EAL: Detected 2 lcore(s) > >> > >> EAL: No free hugepages reported in hugepages-1048576kB > >> > >> EAL: VFIO modules not all loaded, skip VFIO support... > >> > >> EAL: Setting up physically contiguous memory... > >> > >> EAL: Ask a virtual area of 0x2a80 bytes > >> > >> EAL: Virtual area found at 0x7fa0 (size = 0x2a80) > >> > >> EAL: Ask a virtual area of 0x20 bytes > >> > >> EAL: Virtual area found at 0x7fa02a80 (size = 0x20) > >> > >> EAL: Ask a virtual area of 0x5400 bytes > >> > >> EAL: Virtual area found at 0x7fa02aa0 (size = 0x5400) > >> > >> EAL: Ask a virtual area of 0x40 bytes > >> > >> EAL: Virtual area found at 0x7fa07ea0 (size = 0x40) > >> > >> EAL: Ask a virtual area of 0x20 bytes > >> > >> EAL: Virtual area found at 0x7fa07ee0 (size = 0x20) > >> > >> EAL: Ask a virtual area of 0x20 bytes > >> > >> EAL: Virtual area found at 0x7fa07f00 (size = 0x20) > >> > >> EAL: Ask a virtual area of 0x60 bytes > >> > >> EAL: Virtual area found at 0x7fa07f20 (size = 0x60) > >> > >> EAL: Ask a virtual area of 0x20 bytes > >> > >> EAL: Virtual area found at 0x7fa07f80 (size = 0x20) > >> > >> EAL: Ask a virtual area of 0x20 bytes > >> > >> EAL: Virtual area found at 0x7fa07fa0 (size = 0x20) > >> > >> EAL: Ask a virtual area of 0x40 bytes > >> > >> EAL: Virtual area found at 0x7fa07fc0 (size = 0x40) > >> > >> EAL: Requesting 1024 pages of size 2MB from socket 0 > >> > >> EAL: TSC frequency is ~260 KHz > >> > >> EAL: WARNING: cpu flags constant_tsc=yes nonstop_tsc=no -> using > >> unreliable clock cycles ! > >> > >> EAL: Master lcore 1 is ready (tid=3e55900;cpuset=[1]) > >> > >> EAL: PCI device :00:09.0 on
[dpdk-dev] [PATCH 6/9] eal: initialize vdevs right next to pci devices
On Fri, 29 Jan 2016 15:08:33 +0100 David Marchand wrote: > This way, the resources probing happens in a common place. > > Signed-off-by: David Marchand > --- > lib/librte_eal/bsdapp/eal/eal.c | 7 +++ > lib/librte_eal/common/include/rte_dev.h | 2 +- > lib/librte_eal/linuxapp/eal/eal.c | 7 +++ > 3 files changed, 15 insertions(+), 1 deletion(-) > > diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c > index a34e61d..b557a9f 100644 > --- a/lib/librte_eal/bsdapp/eal/eal.c > +++ b/lib/librte_eal/bsdapp/eal/eal.c > @@ -582,8 +582,10 @@ rte_eal_init(int argc, char **argv) > rte_config.master_lcore, thread_id, cpuset, > ret == 0 ? "" : "..."); > > +#ifndef RTE_NEXT_ABI > if (rte_eal_dev_init() < 0) > rte_panic("Cannot init pmd devices\n"); > +#endif > > RTE_LCORE_FOREACH_SLAVE(i) { > > @@ -617,6 +619,11 @@ rte_eal_init(int argc, char **argv) > rte_eal_mp_remote_launch(sync_func, NULL, SKIP_MASTER); > rte_eal_mp_wait_lcore(); > > +#ifdef RTE_NEXT_ABI > + if (rte_eal_dev_init() < 0) > + rte_panic("Cannot probe vdev devices\n"); > +#endif > + > /* Probe & Initialize PCI devices */ > if (rte_eal_pci_probe()) > rte_panic("Cannot probe PCI\n"); > diff --git a/lib/librte_eal/common/include/rte_dev.h > b/lib/librte_eal/common/include/rte_dev.h > index 88c1a19..df69e28 100644 > --- a/lib/librte_eal/common/include/rte_dev.h > +++ b/lib/librte_eal/common/include/rte_dev.h > @@ -156,7 +156,7 @@ void rte_eal_driver_register(struct rte_driver *driver); > void rte_eal_driver_unregister(struct rte_driver *driver); > > /** > - * Initalize all the registered drivers in this process > + * Scan all devargs and attach to drivers if available > */ > int rte_eal_dev_init(void); Move this to a separate commit? Is it just a forgotten doc comment? > > diff --git a/lib/librte_eal/linuxapp/eal/eal.c > b/lib/librte_eal/linuxapp/eal/eal.c > index 62241ee..95313af 100644 > --- a/lib/librte_eal/linuxapp/eal/eal.c > +++ b/lib/librte_eal/linuxapp/eal/eal.c > @@ -834,8 +834,10 @@ rte_eal_init(int argc, char **argv) > rte_config.master_lcore, (int)thread_id, cpuset, > ret == 0 ? "" : "..."); > > +#ifndef RTE_NEXT_ABI > if (rte_eal_dev_init() < 0) > rte_panic("Cannot init pmd devices\n"); > +#endif > > RTE_LCORE_FOREACH_SLAVE(i) { > > @@ -873,6 +875,11 @@ rte_eal_init(int argc, char **argv) > rte_eal_mp_remote_launch(sync_func, NULL, SKIP_MASTER); > rte_eal_mp_wait_lcore(); > > +#ifdef RTE_NEXT_ABI > + if (rte_eal_dev_init() < 0) > + rte_panic("Cannot probe vdev devices\n"); > +#endif > + > /* Probe & Initialize PCI devices */ > if (rte_eal_pci_probe()) > rte_panic("Cannot probe PCI\n"); I cannot see the point why it is enclosed in the RTE_NEXT_ABI. Is it such a serious breakage? Regards Jan
[dpdk-dev] [PATCH 7/9] pci: add a helper for device name
On Fri, 29 Jan 2016 15:08:34 +0100 David Marchand wrote: > eal is a better place than ethdev for naming resources. > Add a helper here, and make use of it in ethdev hotplug code. > > Signed-off-by: David Marchand > --- > lib/librte_eal/common/include/rte_pci.h | 28 > lib/librte_ether/rte_ethdev.c | 22 ++ > 2 files changed, 30 insertions(+), 20 deletions(-) > > diff --git a/lib/librte_eal/common/include/rte_pci.h > b/lib/librte_eal/common/include/rte_pci.h > index 334c12e..9edd5f5 100644 > --- a/lib/librte_eal/common/include/rte_pci.h > +++ b/lib/librte_eal/common/include/rte_pci.h > @@ -309,6 +309,34 @@ eal_parse_pci_DomBDF(const char *input, struct > rte_pci_addr *dev_addr) > } > #undef GET_PCIADDR_FIELD > > +/** > + * Utility function to write a pci device name, this device name can later be > + * used to retrieve the corresponding rte_pci_addr using above functions. > + * > + * @param addr > + * The PCI Bus-Device-Function address > + * @param output > + * The output buffer string > + * @param size > + * The output buffer size > + * @return > + * 0 on success, negative on error. > + */ > +static inline int > +eal_pci_device_name(const struct rte_pci_addr *addr, > + char *output, int size) "size_t size" (or unsigned int) seems to be better to me. > +{ > + int ret; > + > + ret = snprintf(output, size, PCI_PRI_FMT, > +addr->domain, addr->bus, > +addr->devid, addr->function); > + if (ret < 0 || ret >= size) > + return -1; The return value is not checked (below). I think, such functions are usually missing error checking as nobody expects them to fail. Wouldn't it be better to panic here? > + > + return 0; > +} > + > /* Compare two PCI device addresses. */ > /** > * Utility function to compare two PCI device addresses. > diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c > index 17e4f4d..5ba7479 100644 > --- a/lib/librte_ether/rte_ethdev.c > +++ b/lib/librte_ether/rte_ethdev.c > @@ -214,20 +214,6 @@ rte_eth_dev_allocate(const char *name, enum > rte_eth_dev_type type) > return eth_dev; > } > > -static int > -rte_eth_dev_create_unique_device_name(char *name, size_t size, > - struct rte_pci_device *pci_dev) > -{ > - int ret; > - > - ret = snprintf(name, size, "%d:%d.%d", > - pci_dev->addr.bus, pci_dev->addr.devid, > - pci_dev->addr.function); > - if (ret < 0) > - return ret; > - return 0; > -} > - > int > rte_eth_dev_release_port(struct rte_eth_dev *eth_dev) > { > @@ -251,9 +237,7 @@ rte_eth_dev_pci_probe(struct rte_pci_driver *pci_drv, > > eth_drv = (struct eth_driver *)pci_drv; > > - /* Create unique Ethernet device name using PCI address */ > - rte_eth_dev_create_unique_device_name(ethdev_name, > - sizeof(ethdev_name), pci_dev); > + eal_pci_device_name(&pci_dev->addr, ethdev_name, sizeof(ethdev_name)); > > eth_dev = rte_eth_dev_allocate(ethdev_name, RTE_ETH_DEV_PCI); > if (eth_dev == NULL) > @@ -304,9 +288,7 @@ rte_eth_dev_pci_remove(struct rte_pci_device *pci_dev) > if (pci_dev == NULL) > return -EINVAL; > > - /* Create unique Ethernet device name using PCI address */ > - rte_eth_dev_create_unique_device_name(ethdev_name, > - sizeof(ethdev_name), pci_dev); > + eal_pci_device_name(&pci_dev->addr, ethdev_name, sizeof(ethdev_name)); > > eth_dev = rte_eth_dev_allocated(ethdev_name); > if (eth_dev == NULL) -- Jan Viktorin E-mail: Viktorin at RehiveTech.com System Architect Web:www.RehiveTech.com RehiveTech Brno, Czech Republic
[dpdk-dev] [dpdk-users] DPDK i40evf problem in receving packet
Formatting note: Please avoid top posting (inline replies are preferred), and do not include such disclaimer, not relevant on a mailing list. Thanks 2016-02-10 16:00, Muhammad Zain-ul-Abideen: > Hi Saurabh, > Can you tell me what card you are uing > > > *Regards:* > Zain ul Abideen > > > *Disclaimer**: The information contained in this e-mail and any attachments > is confidential; it is intended only for use of the individual or entity > named above. If the reader of this message is not the intended recipient, > you are notified that any dissemination, distribution or use of this > information is strictly prohibited. If you have received this communication > in error, please delete it and email confirmation to the sender. Thank You.* > > > On Wed, Feb 10, 2016 at 6:30 AM, Saurabh Mishra > wrote: > > > Hi Qian -- > > > > Any suggestions? This is bit urgent. > > > > /Saurabh > > > > On Sat, Feb 6, 2016 at 9:22 AM, Saurabh Mishra > > wrote: > > > > > Hi Qian -- > > > > > > > > > Here's the data from Host: > > > > > > [root at oscompute3 ~]# ethtool -i p3p1 > > > > > > driver: i40e > > > > > > version: 1.0.11-k > > > > > > firmware-version: f4.40 a1.4 n04.53 e80001dc0 > > > > > > bus-info: :04:00.0 > > > > > > supports-statistics: yes > > > > > > supports-test: yes > > > > > > supports-eeprom-access: yes > > > > > > supports-register-dump: yes > > > > > > supports-priv-flags: no > > > > > > [root at oscompute3 ~]# ethtool -i p3p2 > > > > > > driver: i40e > > > > > > version: 1.0.11-k > > > > > > firmware-version: f4.40 a1.4 n04.53 e80001dc0 > > > > > > bus-info: :04:00.1 > > > > > > supports-statistics: yes > > > > > > supports-test: yes > > > > > > supports-eeprom-access: yes > > > > > > supports-register-dump: yes > > > > > > supports-priv-flags: no > > > > > > [root at oscompute3 ~]# > > > > > > EthApp> drvinfo > > > > > > Port 0 driver: rte_i40evf_pmd (ver: RTE 2.2.0) > > > > > > Port 1 driver: rte_i40evf_pmd (ver: RTE 2.2.0) > > > > > > EthApp> > > > > > > On Fri, Feb 5, 2016 at 4:59 PM, Xu, Qian Q wrote: > > > > > >> What's your current firmware info, can u run ethtool -i port_interface > > to > > >> check? > > >> > > >> Thanks > > >> Qian > > >> > > >> -Original Message- > > >> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Saurabh Mishra > > >> Sent: Saturday, February 06, 2016 6:33 AM > > >> To: dev at dpdk.org; users at dpdk.org > > >> Subject: [dpdk-dev] DPDK i40evf problem in receving packet > > >> > > >> Hi, > > >> > > >> I'm seeing two problems: > > >> > > >> 1) when use our kernel '3.10.88-8.0.0.0.6', we only receive first > > >> packet but not subsequent ones at all after that. However, when I use > > >> centos7.0, then l2fwd is able to receive all the packets. > > >> > > >> 2) I've also seen that on centos7.0, symmetric_mp itself is not working. > > >> dev start fails with 280 error. > > >> > > >> i40evf is giving us lot of headache. The i40evf kernel driver works > > >> without any problem. Host is a centos7 KVM. I've already upgraded > > firmware > > >> to latest. > > >> > > >> [root at localhost ~]# uname -a > > >> > > >> Linux localhost.localdomain 3.10.0-327.4.5.el7.x86_64 #1 SMP Mon Jan 25 > > >> 22:07:14 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux > > >> > > >> [root at localhost ~]# > > >> > > >> [root at localhost ~]# ./symmetric_mp fakeelf --file-prefix=virtaddr -c2 > > >> -m2048 -n1 --base-virtaddr=0x7fa0 --proc-type=primary -- -p 3 > > >> --num-procs=1 --proc-id=0 > > >> > > >> EAL: Detected lcore 0 as core 0 on socket 0 > > >> > > >> EAL: Detected lcore 1 as core 0 on socket 0 > > >> > > >> EAL: Support maximum 128 logical core(s) by configuration. > > >> > > >> EAL: Detected 2 lcore(s) > > >> > > >> EAL: No free hugepages reported in hugepages-1048576kB > > >> > > >> EAL: VFIO modules not all loaded, skip VFIO support... > > >> > > >> EAL: Setting up physically contiguous memory... > > >> > > >> EAL: Ask a virtual area of 0x2a80 bytes > > >> > > >> EAL: Virtual area found at 0x7fa0 (size = 0x2a80) > > >> > > >> EAL: Ask a virtual area of 0x20 bytes > > >> > > >> EAL: Virtual area found at 0x7fa02a80 (size = 0x20) > > >> > > >> EAL: Ask a virtual area of 0x5400 bytes > > >> > > >> EAL: Virtual area found at 0x7fa02aa0 (size = 0x5400) > > >> > > >> EAL: Ask a virtual area of 0x40 bytes > > >> > > >> EAL: Virtual area found at 0x7fa07ea0 (size = 0x40) > > >> > > >> EAL: Ask a virtual area of 0x20 bytes > > >> > > >> EAL: Virtual area found at 0x7fa07ee0 (size = 0x20) > > >> > > >> EAL: Ask a virtual area of 0x20 bytes > > >> > > >> EAL: Virtual area found at 0x7fa07f00 (size = 0x20) > > >> > > >> EAL: Ask a virtual area of 0x60 bytes > > >> > > >> EAL: Virtual area found at 0x7fa07f20 (size = 0x60) > > >> > > >> EAL: Ask a virtual area of 0x20 bytes > > >> > > >> EAL: Virtual area found at 0x7fa07f80 (size = 0x20) > > >> > > >> EAL: Ask a virtu
[dpdk-dev] [PATCH 8/9] pci: add a helper to refresh a device
On Fri, 29 Jan 2016 15:08:35 +0100 David Marchand wrote: > It will be used mainly for hotplug code. > > Signed-off-by: David Marchand > --- > lib/librte_eal/bsdapp/eal/eal_pci.c | 49 > +++ > lib/librte_eal/common/eal_private.h | 13 ++ > lib/librte_eal/linuxapp/eal/eal_pci.c | 13 ++ > 3 files changed, 75 insertions(+) > > diff --git a/lib/librte_eal/bsdapp/eal/eal_pci.c > b/lib/librte_eal/bsdapp/eal/eal_pci.c > index 4584522..5dd89e3 100644 > --- a/lib/librte_eal/bsdapp/eal/eal_pci.c > +++ b/lib/librte_eal/bsdapp/eal/eal_pci.c > @@ -396,6 +396,55 @@ error: > return -1; > } > > +int > +pci_refresh_device(const struct rte_pci_addr *addr) What about pci_reload_device or pci_reload_device_info? I don't mind too much, only the word 'refresh' reminds me other associations. > +{ > + int fd; > + struct pci_conf matches[2]; > + struct pci_match_conf match = { > + .pc_sel = { > + .pc_domain = addr->domain, > + .pc_bus = addr->bus, > + .pc_dev = addr->devid, > + .pc_func = addr->function, > + }, > + }; > + struct pci_conf_io conf_io = { > + .pat_buf_len = 0, > + .num_patterns = 1, > + .patterns = { &match }, > + .match_buf_len = sizeof(matches), > + .matches = &matches[0], > + }; > + > + fd = open("/dev/pci", O_RDONLY); Just courious who provides this special file... is a DPDK-specific thing? I haven't noticed it anywhere in Linux. > + if (fd < 0) { > + RTE_LOG(ERR, EAL, "%s(): error opening /dev/pci\n", __func__); > + goto error; If you write: return -1; then you can... > + } > + > + if (ioctl(fd, PCIOCGETCONF, &conf_io) < 0) { > + RTE_LOG(ERR, EAL, "%s(): error with ioctl on /dev/pci: %s\n", > + __func__, strerror(errno)); > + goto error; > + } > + > + if (conf_io.num_matches != 1) > + goto error; > + > + if (pci_scan_one(fd, &matches[0]) < 0) > + goto error; > + > + close(fd); > + > + return 0; > + > +error: ...remove this if: > + if (fd >= 0) > + close(fd); Or, do you consider it more stable in the orignal way? > + return -1; > +} > + > /* Read PCI config space. */ > int rte_eal_pci_read_config(const struct rte_pci_device *dev, > void *buf, size_t len, off_t offset) > diff --git a/lib/librte_eal/common/eal_private.h > b/lib/librte_eal/common/eal_private.h > index 072e672..ed1903f 100644 > --- a/lib/librte_eal/common/eal_private.h > +++ b/lib/librte_eal/common/eal_private.h > @@ -155,6 +155,19 @@ struct rte_pci_driver; > struct rte_pci_device; > > /** > + * Refresh a pci device object by asking the kernel for the latest > information. > + * > + * This function is private to EAL. > + * > + * @param addr > + * The PCI Bus-Device-Function address to look for > + * @return > + * - 0 on success. > + * - negative on error. I don't know whether this is a convention in DPDK, anyway, I don't like to restrict errors to just negatives. You cannot write if ((err = pci_refresh_device(...)) /* < 0 */) { handle_error(err); } as the check for < 0 is required (easy to be avoided). > + */ > +int pci_refresh_device(const struct rte_pci_addr *addr); > + > +/** > * Unbind kernel driver for this device > * > * This function is private to EAL. > diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c > b/lib/librte_eal/linuxapp/eal/eal_pci.c > index a354f76..4fe8b60 100644 > --- a/lib/librte_eal/linuxapp/eal/eal_pci.c > +++ b/lib/librte_eal/linuxapp/eal/eal_pci.c > @@ -393,6 +393,19 @@ pci_scan_one(const char *dirname, uint16_t domain, > uint8_t bus, > return 0; > } > > +int > +pci_refresh_device(const struct rte_pci_addr *addr) > +{ > + char filename[PATH_MAX]; > + > + snprintf(filename, sizeof(filename), "%s/" PCI_PRI_FMT, > + SYSFS_PCI_DEVICES, addr->domain, addr->bus, addr->devid, > + addr->function); > + > + return pci_scan_one(filename, addr->domain, addr->bus, addr->devid, > + addr->function); > +} > + > /* > * split up a pci address into its constituent parts. > */ -- Jan Viktorin E-mail: Viktorin at RehiveTech.com System Architect Web:www.RehiveTech.com RehiveTech Brno, Czech Republic
[dpdk-dev] [PATCH 3/9] drivers: no more pdev drivers
On Wed, Feb 10, 2016 at 11:20 AM, Jan Viktorin wrote: > On Wed, 10 Feb 2016 10:27:14 +0100 > David Marchand wrote: >> #define RTE_EAL_PCI_REGISTER(name, d)\ >> void pciinitfn_ ##name(void);\ >> void __attribute__((constructor, used)) pciinitfn_ ##name(void)\ >> {\ >> rte_eal_pci_register(d);\ > > I meant > rte_eal_pci_register(&(d)->pci_drv);\ > > Perhaps, my assumption that a PCI driver is always referred as pci_drv is > wrong... Well, I suppose we will always have a pci driver embedded in some other internal pmd structure. So we can always expect it to be called pci_drv ... Btw, for drivers like mlx or virtio that need to do some more stuff in their constructor (the iopl stuff for virtio is the most interesting case, since the pci register happens only if iopl succeeded), we might need some RTE_MODULE_INIT for those. But in such a case, I think having RTE_MODULE_INIT in all pmds would make more sense, and RTE_EAL_PCI_REGISTER looks unneeded ? -- David Marchand
[dpdk-dev] [PATCH 6/9] eal: initialize vdevs right next to pci devices
On Wed, Feb 10, 2016 at 12:05 PM, Jan Viktorin wrote: > On Fri, 29 Jan 2016 15:08:33 +0100 > David Marchand wrote: >> --- a/lib/librte_eal/common/include/rte_dev.h >> +++ b/lib/librte_eal/common/include/rte_dev.h >> @@ -156,7 +156,7 @@ void rte_eal_driver_register(struct rte_driver *driver); >> void rte_eal_driver_unregister(struct rte_driver *driver); >> >> /** >> - * Initalize all the registered drivers in this process >> + * Scan all devargs and attach to drivers if available >> */ >> int rte_eal_dev_init(void); > > Move this to a separate commit? Is it just a forgotten doc comment? Should be in previous commit, yes. >> >> diff --git a/lib/librte_eal/linuxapp/eal/eal.c >> b/lib/librte_eal/linuxapp/eal/eal.c >> index 62241ee..95313af 100644 >> --- a/lib/librte_eal/linuxapp/eal/eal.c >> +++ b/lib/librte_eal/linuxapp/eal/eal.c >> @@ -834,8 +834,10 @@ rte_eal_init(int argc, char **argv) >> rte_config.master_lcore, (int)thread_id, cpuset, >> ret == 0 ? "" : "..."); >> >> +#ifndef RTE_NEXT_ABI >> if (rte_eal_dev_init() < 0) >> rte_panic("Cannot init pmd devices\n"); >> +#endif >> >> RTE_LCORE_FOREACH_SLAVE(i) { >> >> @@ -873,6 +875,11 @@ rte_eal_init(int argc, char **argv) >> rte_eal_mp_remote_launch(sync_func, NULL, SKIP_MASTER); >> rte_eal_mp_wait_lcore(); >> >> +#ifdef RTE_NEXT_ABI >> + if (rte_eal_dev_init() < 0) >> + rte_panic("Cannot probe vdev devices\n"); >> +#endif >> + >> /* Probe & Initialize PCI devices */ >> if (rte_eal_pci_probe()) >> rte_panic("Cannot probe PCI\n"); > > I cannot see the point why it is enclosed in the RTE_NEXT_ABI. Is it > such a serious breakage? No. We can make this move unconditional. Thanks. -- David Marchand
[dpdk-dev] [PATCH 8/9] pci: add a helper to refresh a device
On Wed, Feb 10, 2016 at 12:23 PM, Jan Viktorin wrote: > On Fri, 29 Jan 2016 15:08:35 +0100 > David Marchand wrote: > >> It will be used mainly for hotplug code. >> >> Signed-off-by: David Marchand >> --- >> lib/librte_eal/bsdapp/eal/eal_pci.c | 49 >> +++ >> lib/librte_eal/common/eal_private.h | 13 ++ >> lib/librte_eal/linuxapp/eal/eal_pci.c | 13 ++ >> 3 files changed, 75 insertions(+) >> >> diff --git a/lib/librte_eal/bsdapp/eal/eal_pci.c >> b/lib/librte_eal/bsdapp/eal/eal_pci.c >> index 4584522..5dd89e3 100644 >> --- a/lib/librte_eal/bsdapp/eal/eal_pci.c >> +++ b/lib/librte_eal/bsdapp/eal/eal_pci.c >> @@ -396,6 +396,55 @@ error: >> return -1; >> } >> >> +int >> +pci_refresh_device(const struct rte_pci_addr *addr) > > What about pci_reload_device or pci_reload_device_info? I don't mind > too much, only the word 'refresh' reminds me other associations. Or maybe pci_update_device ? I added pci_add_device in my other pci patchset, so update sounds better to me. >> +{ >> + int fd; >> + struct pci_conf matches[2]; >> + struct pci_match_conf match = { >> + .pc_sel = { >> + .pc_domain = addr->domain, >> + .pc_bus = addr->bus, >> + .pc_dev = addr->devid, >> + .pc_func = addr->function, >> + }, >> + }; >> + struct pci_conf_io conf_io = { >> + .pat_buf_len = 0, >> + .num_patterns = 1, >> + .patterns = { &match }, >> + .match_buf_len = sizeof(matches), >> + .matches = &matches[0], >> + }; >> + >> + fd = open("/dev/pci", O_RDONLY); > > Just courious who provides this special file... is a DPDK-specific > thing? I haven't noticed it anywhere in Linux. I don't know, just took the bsd pci code and plugged myself in it. So for me this is a special bsd device. This is mainly copy/paste. Look at rte_eal_pci_scan() from lib/librte_eal/bsdapp/eal/eal_pci.c. > >> + if (fd < 0) { >> + RTE_LOG(ERR, EAL, "%s(): error opening /dev/pci\n", __func__); >> + goto error; > > If you write: > return -1; > > then you can... > >> + } >> + >> + if (ioctl(fd, PCIOCGETCONF, &conf_io) < 0) { >> + RTE_LOG(ERR, EAL, "%s(): error with ioctl on /dev/pci: %s\n", >> + __func__, strerror(errno)); >> + goto error; >> + } >> + >> + if (conf_io.num_matches != 1) >> + goto error; >> + >> + if (pci_scan_one(fd, &matches[0]) < 0) >> + goto error; >> + >> + close(fd); >> + >> + return 0; >> + >> +error: > > ...remove this if: > >> + if (fd >= 0) >> + close(fd); > > Or, do you consider it more stable in the orignal way? Well, as said above, this is copy/paste code. But, anyway, when I write functions with goto statements, I prefer having a minimal number of return statements, matter of taste. Another way is to add two label error_close: error: but this is a bit overkill here. >> + return -1; >> +} >> + >> /* Read PCI config space. */ >> int rte_eal_pci_read_config(const struct rte_pci_device *dev, >> void *buf, size_t len, off_t offset) >> diff --git a/lib/librte_eal/common/eal_private.h >> b/lib/librte_eal/common/eal_private.h >> index 072e672..ed1903f 100644 >> --- a/lib/librte_eal/common/eal_private.h >> +++ b/lib/librte_eal/common/eal_private.h >> @@ -155,6 +155,19 @@ struct rte_pci_driver; >> struct rte_pci_device; >> >> /** >> + * Refresh a pci device object by asking the kernel for the latest >> information. >> + * >> + * This function is private to EAL. >> + * >> + * @param addr >> + * The PCI Bus-Device-Function address to look for >> + * @return >> + * - 0 on success. >> + * - negative on error. > > I don't know whether this is a convention in DPDK, anyway, I don't > like to restrict errors to just negatives. You cannot write > > if ((err = pci_refresh_device(...)) /* < 0 */) { > handle_error(err); > } > > as the check for < 0 is required (easy to be avoided). It is a remnant of a lot of code in eal that tries to have 0 for success, < 0 for errors, > 0 for special cases. -- David Marchand
[dpdk-dev] [PATCH 7/9] pci: add a helper for device name
On Wed, Feb 10, 2016 at 12:10 PM, Jan Viktorin wrote: > On Fri, 29 Jan 2016 15:08:34 +0100 > David Marchand wrote: > >> eal is a better place than ethdev for naming resources. >> Add a helper here, and make use of it in ethdev hotplug code. >> >> Signed-off-by: David Marchand >> --- >> lib/librte_eal/common/include/rte_pci.h | 28 >> lib/librte_ether/rte_ethdev.c | 22 ++ >> 2 files changed, 30 insertions(+), 20 deletions(-) >> >> diff --git a/lib/librte_eal/common/include/rte_pci.h >> b/lib/librte_eal/common/include/rte_pci.h >> index 334c12e..9edd5f5 100644 >> --- a/lib/librte_eal/common/include/rte_pci.h >> +++ b/lib/librte_eal/common/include/rte_pci.h >> @@ -309,6 +309,34 @@ eal_parse_pci_DomBDF(const char *input, struct >> rte_pci_addr *dev_addr) >> } >> #undef GET_PCIADDR_FIELD >> >> +/** >> + * Utility function to write a pci device name, this device name can later >> be >> + * used to retrieve the corresponding rte_pci_addr using above functions. >> + * >> + * @param addr >> + * The PCI Bus-Device-Function address >> + * @param output >> + * The output buffer string >> + * @param size >> + * The output buffer size >> + * @return >> + * 0 on success, negative on error. >> + */ >> +static inline int >> +eal_pci_device_name(const struct rte_pci_addr *addr, >> + char *output, int size) > > "size_t size" (or unsigned int) seems to be better to me. Yes. > >> +{ >> + int ret; >> + >> + ret = snprintf(output, size, PCI_PRI_FMT, >> +addr->domain, addr->bus, >> +addr->devid, addr->function); >> + if (ret < 0 || ret >= size) >> + return -1; > > The return value is not checked (below). I think, such functions > are usually missing error checking as nobody expects them to fail. > Wouldn't it be better to panic here? assert or panic, yes. -- David Marchand
[dpdk-dev] [PATCH v2] hash: fix CRC32c computation
Hi Didier, > -Original Message- > From: Didier Pallard [mailto:didier.pallard at 6wind.com] > Sent: Tuesday, February 09, 2016 9:34 AM > To: dev at dpdk.org; Richardson, Bruce; De Lara Guarch, Pablo > Cc: jean-mickael.guerin at 6wind.com; thomas.monjalon at 6wind.com > Subject: [PATCH v2] hash: fix CRC32c computation > > As demonstrated by the following code, CRC32c computation is not valid > when buffer length is not a multiple of 4 bytes: > (Output obtained by code below) > > CRC of 1 NULL bytes expected: 0x527d5351 > soft: 527d5351 > rte accelerated: 48674bc7 > rte soft: 48674bc7 > CRC of 2 NULL bytes expected: 0xf16177d2 > soft: f16177d2 > rte accelerated: 48674bc7 > rte soft: 48674bc7 > CRC of 2x1 NULL bytes expected: 0xf16177d2 > soft: f16177d2 > rte accelerated: 8c28b28a > rte soft: 8c28b28a > CRC of 3 NULL bytes expected: 0x6064a37a > soft: 6064a37a > rte accelerated: 48674bc7 > rte soft: 48674bc7 > CRC of 4 NULL bytes expected: 0x48674bc7 > soft: 48674bc7 > rte accelerated: 48674bc7 > rte soft: 48674bc7 > > Values returned by rte_hash_crc functions does not match the one > computed by a trivial crc32c implementation. > > ARM code is not tested. > > code showing the problem: > > uint8_t null_test[32] = {0}; > > static uint32_t crc32c_trivial(uint8_t *buffer, uint32_t length, uint32_t crc) > { > uint32_t i, j; > for (i = 0; i < length; ++i) > { > crc = crc ^ buffer[i]; > for (j = 0; j < 8; j++) > crc = (crc >> 1) ^ 0x8000 ^ ((~crc & 1) * 0x82f63b78); > } > return crc; > } > > void hash_test(void); > void hash_test(void) > { > printf("CRC of 1 nul byte expected: 0x527d5351\n"); > printf("soft: %08x\n", crc32c_trivial(null_test, 1, 0)); > rte_hash_crc_init_alg(); > printf("rte accelerated: %08x\n", ~rte_hash_crc(null_test, 1, > 0x)); > rte_hash_crc_set_alg(CRC32_SW); > printf("rte soft: %08x\n", ~rte_hash_crc(null_test, 1, 0x)); > > printf("CRC of 2 nul bytes expected: 0xf16177d2\n"); > printf("soft: %08x\n", crc32c_trivial(null_test, 2, 0)); > rte_hash_crc_init_alg(); > printf("rte accelerated: %08x\n", ~rte_hash_crc(null_test, 2, > 0x)); > rte_hash_crc_set_alg(CRC32_SW); > printf("rte soft: %08x\n", ~rte_hash_crc(null_test, 2, 0x)); > > printf("CRC of 2x1 nul bytes expected: 0xf16177d2\n"); > printf("soft: %08x\n", crc32c_trivial(null_test, 1, > crc32c_trivial(null_test, 1, 0))); > rte_hash_crc_init_alg(); > printf("rte accelerated: %08x\n", ~rte_hash_crc(null_test, 1, > rte_hash_crc(null_test, 1, 0x))); > rte_hash_crc_set_alg(CRC32_SW); > printf("rte soft: %08x\n", ~rte_hash_crc(null_test, 1, > rte_hash_crc(null_test, 1, 0x))); > > printf("CRC of 3 nul bytes expected: 0x6064a37a\n"); > printf("soft: %08x\n", crc32c_trivial(null_test, 3, 0)); > rte_hash_crc_init_alg(); > printf("rte accelerated: %08x\n", ~rte_hash_crc(null_test, 3, > 0x)); > rte_hash_crc_set_alg(CRC32_SW); > printf("rte soft: %08x\n", ~rte_hash_crc(null_test, 3, 0x)); > > printf("CRC of 4 nul bytes expected: 0x48674bc7\n"); > printf("soft: %08x\n", crc32c_trivial(null_test, 4, 0)); > rte_hash_crc_init_alg(); > printf("rte accelerated: %08x\n", ~rte_hash_crc(null_test, 4, > 0x)); > rte_hash_crc_set_alg(CRC32_SW); > printf("rte soft: %08x\n", ~rte_hash_crc(null_test, 4, 0x)); > } > > Signed-off-by: Didier Pallard > Acked-by: David Marchand It compiles fine now, thanks! Could you add also tests for not multiple of 4 bytes keys in test_hash_functions.c, so we make sure from now on that it works (and you can demonstrate that your fix works)? You could send a patchset with those new tests first and then the fix. Also, a note in release notes would be welcome :) Thanks! Pablo
[dpdk-dev] [PATCH 8/9] pci: add a helper to refresh a device
On Wed, 10 Feb 2016 13:00:50 +0100 David Marchand wrote: > On Wed, Feb 10, 2016 at 12:23 PM, Jan Viktorin > wrote: > > On Fri, 29 Jan 2016 15:08:35 +0100 > > David Marchand wrote: > > > >> It will be used mainly for hotplug code. > >> > >> Signed-off-by: David Marchand > >> --- > >> lib/librte_eal/bsdapp/eal/eal_pci.c | 49 > >> +++ > >> lib/librte_eal/common/eal_private.h | 13 ++ > >> lib/librte_eal/linuxapp/eal/eal_pci.c | 13 ++ > >> 3 files changed, 75 insertions(+) > >> > >> diff --git a/lib/librte_eal/bsdapp/eal/eal_pci.c > >> b/lib/librte_eal/bsdapp/eal/eal_pci.c > >> index 4584522..5dd89e3 100644 > >> --- a/lib/librte_eal/bsdapp/eal/eal_pci.c > >> +++ b/lib/librte_eal/bsdapp/eal/eal_pci.c > >> @@ -396,6 +396,55 @@ error: > >> return -1; > >> } > >> > >> +int > >> +pci_refresh_device(const struct rte_pci_addr *addr) > > > > What about pci_reload_device or pci_reload_device_info? I don't mind > > too much, only the word 'refresh' reminds me other associations. > > Or maybe pci_update_device ? > I added pci_add_device in my other pci patchset, so update sounds better to > me. > OK. > > >> +{ > >> + int fd; > >> + struct pci_conf matches[2]; > >> + struct pci_match_conf match = { > >> + .pc_sel = { > >> + .pc_domain = addr->domain, > >> + .pc_bus = addr->bus, > >> + .pc_dev = addr->devid, > >> + .pc_func = addr->function, > >> + }, > >> + }; > >> + struct pci_conf_io conf_io = { > >> + .pat_buf_len = 0, > >> + .num_patterns = 1, > >> + .patterns = { &match }, > >> + .match_buf_len = sizeof(matches), > >> + .matches = &matches[0], > >> + }; > >> + > >> + fd = open("/dev/pci", O_RDONLY); > > > > Just courious who provides this special file... is a DPDK-specific > > thing? I haven't noticed it anywhere in Linux. > > I don't know, just took the bsd pci code and plugged myself in it. > So for me this is a special bsd device. > > This is mainly copy/paste. > Look at rte_eal_pci_scan() from lib/librte_eal/bsdapp/eal/eal_pci.c. BSD... I didn't notice. That's the answer. > > > > >> + if (fd < 0) { > >> + RTE_LOG(ERR, EAL, "%s(): error opening /dev/pci\n", > >> __func__); > >> + goto error; > > > > If you write: > > return -1; > > > > then you can... > > > >> + } > >> + > >> + if (ioctl(fd, PCIOCGETCONF, &conf_io) < 0) { > >> + RTE_LOG(ERR, EAL, "%s(): error with ioctl on /dev/pci: %s\n", > >> + __func__, strerror(errno)); > >> + goto error; > >> + } > >> + > >> + if (conf_io.num_matches != 1) > >> + goto error; > >> + > >> + if (pci_scan_one(fd, &matches[0]) < 0) > >> + goto error; > >> + > >> + close(fd); > >> + > >> + return 0; > >> + > >> +error: > > > > ...remove this if: > > > >> + if (fd >= 0) > >> + close(fd); > > > > Or, do you consider it more stable in the orignal way? > > Well, as said above, this is copy/paste code. > But, anyway, when I write functions with goto statements, I prefer > having a minimal number of return statements, matter of taste. > Another way is to add two label error_close: error: but this is a bit > overkill here. All of them are OK. As for me, I prefer to not hide simple returns. > > > >> + return -1; > >> +} > >> + > >> /* Read PCI config space. */ > >> int rte_eal_pci_read_config(const struct rte_pci_device *dev, > >> void *buf, size_t len, off_t offset) > >> diff --git a/lib/librte_eal/common/eal_private.h > >> b/lib/librte_eal/common/eal_private.h > >> index 072e672..ed1903f 100644 > >> --- a/lib/librte_eal/common/eal_private.h > >> +++ b/lib/librte_eal/common/eal_private.h > >> @@ -155,6 +155,19 @@ struct rte_pci_driver; > >> struct rte_pci_device; > >> > >> /** > >> + * Refresh a pci device object by asking the kernel for the latest > >> information. > >> + * > >> + * This function is private to EAL. > >> + * > >> + * @param addr > >> + * The PCI Bus-Device-Function address to look for > >> + * @return > >> + * - 0 on success. > >> + * - negative on error. > > > > I don't know whether this is a convention in DPDK, anyway, I don't > > like to restrict errors to just negatives. You cannot write > > > > if ((err = pci_refresh_device(...)) /* < 0 */) { > > handle_error(err); > > } > > > > as the check for < 0 is required (easy to be avoided). > > It is a remnant of a lot of code in eal that tries to have 0 for > success, < 0 for errors, > 0 for special cases. > OK, makes sense. > -- Jan Viktorin E-mail: Viktorin at RehiveTech.com System Architect Web:www.RehiveTech.com RehiveTech Brno, Czech Republic
[dpdk-dev] [PATCH 3/9] drivers: no more pdev drivers
On Wed, 10 Feb 2016 12:38:20 +0100 David Marchand wrote: > On Wed, Feb 10, 2016 at 11:20 AM, Jan Viktorin > wrote: > > On Wed, 10 Feb 2016 10:27:14 +0100 > > David Marchand wrote: > >> #define RTE_EAL_PCI_REGISTER(name, d)\ > >> void pciinitfn_ ##name(void);\ > >> void __attribute__((constructor, used)) pciinitfn_ ##name(void)\ > >> {\ > >> rte_eal_pci_register(d);\ > > > > I meant > > rte_eal_pci_register(&(d)->pci_drv);\ > > > > Perhaps, my assumption that a PCI driver is always referred as pci_drv is > > wrong... > > Well, I suppose we will always have a pci driver embedded in some > other internal pmd structure. > So we can always expect it to be called pci_drv ... > > Btw, for drivers like mlx or virtio that need to do some more stuff in > their constructor (the iopl stuff for virtio is the most interesting > case, since the pci register happens only if iopl succeeded), we might > need some RTE_MODULE_INIT for those. > > But in such a case, I think having RTE_MODULE_INIT in all pmds would > make more sense, and RTE_EAL_PCI_REGISTER looks unneeded ? > The Linux Kernel solves this by allowing both ways. You can do it by hand and for most cases you can use just a simple oneliner. So, we can have the RTE_(EAL_)MODULE_INIT for the general case. And then eg. RTE_EAL_PCI_MODULE_INIT that makes the most common glue code to register a single driver and who finally calls the RTE_EAL_MODULE_INIT. Regards Jan
[dpdk-dev] [PATCH v4 0/6] vmxnet3 TSO, tx cksum offload and cleanups
On Tue, Jan 12, 2016 at 08:56:34PM -0800, Stephen Hemminger wrote: > On Tue, 12 Jan 2016 18:08:31 -0800 > Yong Wang wrote: > > > v4: > > * moved cleanups to separate patches > > * correctly handled multi-seg pkts with data ring used > > > > v3: > > * fixed comments from Stephen > > * added performance number for tx data ring > > > > v2: > > * fixed some logging issues when debug option turned on > > * updated the txq_flags check in vmxnet3_dev_tx_queue_setup() > > > > This patchset adds TCP/UDP checksum offload and TSO to vmxnet3 PMD. > > One of the use cases is to support STT. It also restores the tx > > data ring feature that was removed from a previous patch. > > > > Yong Wang (6): > > vmxnet3: fix typos and remove unused struct > > vmxnet3: restore tx data ring support > > vmxnet3: cleanup txNumDeferred usage > > vmxnet3: add tx l4 cksum offload > > vmxnet3: add TSO support > > vmxnet3: announce device offload capability > > > > doc/guides/rel_notes/release_2_3.rst| 11 +++ > > drivers/net/vmxnet3/base/includeCheck.h | 39 > > drivers/net/vmxnet3/base/vmxnet3_defs.h | 9 +- > > drivers/net/vmxnet3/vmxnet3_ethdev.c| 16 +++- > > drivers/net/vmxnet3/vmxnet3_ring.h | 13 --- > > drivers/net/vmxnet3/vmxnet3_rxtx.c | 160 > > +--- > > 6 files changed, 151 insertions(+), 97 deletions(-) > > delete mode 100644 drivers/net/vmxnet3/base/includeCheck.h > > > > Looks good. The only thing maybe worth adding would be some more checks > int the vmxnet3_dev_configure for unsupported offload bits, etc. > > Acked-by: Stephen Hemminger Applied to dpdk-next-net/rel_16_04 Thanks, Bruce
[dpdk-dev] i40e: cannot change mtu to enable jumbo frame
> > On 02/09/2016 08:05 PM, Zhu, Heqing wrote: > > Helin is still in Chinese New Year Vacation. Will the below command option > > help ? > > > > 4.5.9. port config - max-pkt-len > > Set the maximum packet length: > > > > testpmd> port config all max-pkt-len (value) > > This is equivalent to the --max-pkt-len command-line option. > > > > > > -Original Message- > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Julien Meunier > > Sent: Tuesday, February 9, 2016 9:36 AM > > To: Zhang, Helin ; dev at dpdk.org > > Subject: [dpdk-dev] i40e: cannot change mtu to enable jumbo frame > > > > Hello Helin, > > > > I tried to send jumbo frames to a i40e card. However, I observed that all > > frames are dropped. Moreover, set_mtu function is not > implemented on i40e PMD. > > > > > testpmd --log-level 8 --huge-dir=/mnt/huge -n 4 -l 2,18 --socket-mem > > 1024,1024 -w :02:00.0 -w :02:00.2 -- -i --nb-cores=1 > > --nb-ports=2 --total-num-mbufs=65536 > > > > = > > Configuration > > = > > > > +---+ +-+ > > | | | | > > | tgen | | | > > | +--+ port 0 | > > | | | | > > | | | | > > | | | | > > | | | | > > | +--+ port 1 | > > | | | | > > +---+ +-+ > > > > DPDK: DPDK-v2.2 > > > > == > > MTU = 1500 > > == > > Packet sent from a tgen > > > p = Ether / IP / UDP / Raw(MTU + HDR(Ethernet)- HDR(IP) - HDR(UDP)) > > > len(p) = 1514 > > > > testpmd> start > > PMD: i40e_rxd_to_vlan_tci(): Mbuf vlan_tci: 0, vlan_tci_outer: 0 > > testpmd> stop > > Telling cores to stop... > > Waiting for lcores to finish... > > PMD: i40e_update_vsi_stats(): *** VSI[13] stats start *** > > PMD: i40e_update_vsi_stats(): rx_bytes:1518 > > PMD: i40e_update_vsi_stats(): rx_unicast: 1 > > PMD: i40e_update_vsi_stats(): *** VSI[13] stats end *** > > PMD: i40e_dev_stats_get(): *** PF stats start *** > > PMD: i40e_dev_stats_get(): rx_bytes:1514 > > PMD: i40e_dev_stats_get(): rx_unicast: 1 > > PMD: i40e_dev_stats_get(): rx_unknown_protocol: 1 > > PMD: i40e_dev_stats_get(): rx_size_1522: 1 > > PMD: i40e_dev_stats_get(): *** PF stats end *** > > > > -- Forward statistics for port 0 -- > > RX-packets: 1 RX-dropped: 0 RX-total: 1 > > TX-packets: 0 TX-dropped: 0 TX-total: 0 > > > > > > PMD: i40e_update_vsi_stats(): *** VSI[14] stats start *** > > PMD: i40e_update_vsi_stats(): tx_bytes:1514 > > PMD: i40e_update_vsi_stats(): tx_unicast: 1 > > PMD: i40e_update_vsi_stats(): *** VSI[14] stats end *** > > PMD: i40e_dev_stats_get(): *** PF stats start *** > > PMD: i40e_dev_stats_get(): tx_bytes:1514 > > PMD: i40e_dev_stats_get(): tx_unicast: 1 > > PMD: i40e_dev_stats_get(): tx_size_1522: 1 > > PMD: i40e_dev_stats_get(): *** PF stats end *** > > > > -- Forward statistics for port 1 -- > > RX-packets: 0 RX-dropped: 0 RX-total: 0 > > TX-packets: 1 TX-dropped: 0 TX-total: 1 > > > > > > > > + Accumulated forward statistics for all ports+ > > RX-packets: 1 RX-dropped: 0 RX-total: 1 > > TX-packets: 1 TX-dropped: 0 TX-total: 1 > > > > > > > > => OK > > > > == > > MTU = 1600 > > == > > Packet sent > > > p = Ether / IP / UDP / Raw(MTU + HDR(Ethernet)- HDR(IP) - HDR(UDP)) > > > len(p) = 1614 > > > > testpmd> port config mtu 0 1600 > > rte_eth_dev_set_mtu: Function not supported Set MTU failed. diag=-95 > > testpmd> port config mtu 1 1600 > > rte_eth_dev_set_mtu: Function not supported Set MTU failed. diag=-95 > > testpmd> start > > testpmd> stop > > Telling cores to stop... > > Waiting for lcores to finish... > > PMD: i40e_update_vsi_stats(): *** VSI[13] stats start *** > > PMD: i40e_update_vsi_stats(): rx_bytes:1618 > > PMD: i40e_update_vsi_stats(): rx_unicast: 1 > > PMD: i40e_update_vsi_stats(): *** VSI[13] stats end *** > > PMD: i40e_dev_stats_get(): *** PF stats start *** > > PMD: i40e_dev_stats_get(): rx_bytes:1614 > > PMD: i40e_dev_stats_get(): rx_unicast: 1 > > PMD: i40e_dev_stats_get(): rx_unknown_protocol: 1 > > PMD: i40e_dev_stats_get(): rx_size_big: 1 > > PMD: i40e_dev_stats_get(): *** PF stats end *** > > > > -- Forward statistics for port
[dpdk-dev] [PATCH] virtio: fix crashes in virtio stats functions
On Thu, Dec 24, 2015 at 11:48:41AM +0800, Yuanhan Liu wrote: > On Wed, Dec 23, 2015 at 09:45:19AM +, Bernard Iremonger wrote: > > This initialisation of nb_rx_queues and nb_tx_queues has been removed > > from eth_virtio_dev_init. > > > > The nb_rx_queues and nb_tx_queues were being initialised in > > eth_virtio_dev_init > > before the tx_queues and rx_queues arrays were allocated. > > > > The arrays are allocated when the ethdev port is configured and the > > nb_tx_queues and nb_rx_queues are initialised. > > > > If any of the following functions were called before the ethdev > > port was configured there was a segmentation fault because > > rx_queues and tx_queues were NULL: > > > > rte_eth_stats_get > > rte_eth_stats_reset > > rte_eth_xstats_get > > rte_eth_xstats_reset > > > > Fixes: 823ad647950a ("virtio: support multiple queues") > > Signed-off-by: Bernard Iremonger > > Acked-by: Yuanhan Liu > > Thanks. > > --yliu Applied to dpdk-next-net/rel_16_04 /Bruce
[dpdk-dev] [PATCH 2/3] version: adjust printing for new version scheme
> -Original Message- > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Thomas Monjalon > Sent: Monday, December 28, 2015 10:25 PM > To: Richardson, Bruce > Cc: dev at dpdk.org > Subject: Re: [dpdk-dev] [PATCH 2/3] version: adjust printing for new > version scheme > > 2015-12-21 13:26, Bruce Richardson: > > Since we are now using a year-month numbering scheme, adjust the > > printing of the version to always use 2-digits for YY.MM format. Hi, In the interest of getting the version patch in and the release notes renamed I'm going to drop this part of the patchset and resubmit 1/3 and 3/3. If we want to rework the version output we can do it in a separate patch. I am going to maintain the zero padding of the month so that it appear as 16.04 and not 16.4 in "make showversion" and rte_version(). > Shouldn't we take the opportunity to update RTE_VER_PREFIX from "RTE" to > "DPDK"? I'm also going to make this change "en passant" since it makes sense and is in the same file. The version outputs now look like this: $ make showversion 16.04.0-rc0 RTE>> version_autotest Version string: 'DPDK 16.04.0-rc0' Note, if the s/RTE/DPDK/ change breaks anyone's parsing of output for tests or otherwise, let me know. John. --
[dpdk-dev] [PATCH v2 2/2] doc: rename release notes 2.3 to 16.04
From: "Richardson, Bruce" Updated release documentation to reflect new numbering scheme. Signed-off-by: Bruce Richardson Signed-off-by: John McNamara --- doc/guides/rel_notes/index.rst | 2 +- doc/guides/rel_notes/release_16_04.rst | 138 + doc/guides/rel_notes/release_2_3.rst | 138 - 3 files changed, 139 insertions(+), 139 deletions(-) create mode 100644 doc/guides/rel_notes/release_16_04.rst delete mode 100644 doc/guides/rel_notes/release_2_3.rst diff --git a/doc/guides/rel_notes/index.rst b/doc/guides/rel_notes/index.rst index 29013cf..84317b8 100644 --- a/doc/guides/rel_notes/index.rst +++ b/doc/guides/rel_notes/index.rst @@ -36,7 +36,7 @@ Release Notes :numbered: rel_description -release_2_3 +release_16_04 release_2_2 release_2_1 release_2_0 diff --git a/doc/guides/rel_notes/release_16_04.rst b/doc/guides/rel_notes/release_16_04.rst new file mode 100644 index 000..27fc624 --- /dev/null +++ b/doc/guides/rel_notes/release_16_04.rst @@ -0,0 +1,138 @@ +DPDK Release 16.04 +== + + +**Read this first** + +The text below explains how to update the release notes. + +Use proper spelling, capitalization and punctuation in all sections. + +Variable and config names should be quoted as fixed width text: ``LIKE_THIS``. + +Build the docs and view the output file to ensure the changes are correct:: + + make doc-guides-html + + firefox build/doc/html/guides/rel_notes/release_16_04.html + + +New Features + + +This section should contain new features added in this release. Sample format: + +* **Add a title in the past tense with a full stop.** + + Add a short 1-2 sentence description in the past tense. The description + should be enough to allow someone scanning the release notes to understand + the new feature. + + If the feature adds a lot of sub-features you can use a bullet list like this. + + * Added feature foo to do something. + * Enhanced feature bar to do something else. + + Refer to the previous release notes for examples. + +* **Virtio 1.0.** + + Enabled virtio 1.0 support for virtio pmd driver. + + +Resolved Issues +--- + +This section should contain bug fixes added to the relevant sections. Sample format: + +* **code/section Fixed issue in the past tense with a full stop.** + + Add a short 1-2 sentence description of the resolved issue in the past tense. + The title should contain the code/lib section like a commit message. + Add the entries in alphabetic order in the relevant sections below. + + +EAL +~~~ + + +Drivers +~~~ + + +Libraries +~ + + +Examples + + + +Other +~ + + +Known Issues + + +This section should contain new known issues in this release. Sample format: + +* **Add title in present tense with full stop.** + + Add a short 1-2 sentence description of the known issue in the present + tense. Add information on any known workarounds. + + +API Changes +--- + +This section should contain API changes. Sample format: + +* Add a short 1-2 sentence description of the API change. Use fixed width + quotes for ``rte_function_names`` or ``rte_struct_names``. Use the past tense. + + +ABI Changes +--- + +* Add a short 1-2 sentence description of the ABI change that was announced in + the previous releases and made in this release. Use fixed width quotes for + ``rte_function_names`` or ``rte_struct_names``. Use the past tense. + + +Shared Library Versions +--- + +Update any library version updated in this release and prepend with a ``+`` sign. + +The libraries prepended with a plus sign were incremented in this version. + +.. code-block:: diff + + libethdev.so.2 + librte_acl.so.2 + librte_cfgfile.so.2 + librte_cmdline.so.1 + librte_distributor.so.1 + librte_eal.so.2 + librte_hash.so.2 + librte_ip_frag.so.1 + librte_ivshmem.so.1 + librte_jobstats.so.1 + librte_kni.so.2 + librte_kvargs.so.1 + librte_lpm.so.2 + librte_mbuf.so.2 + librte_mempool.so.1 + librte_meter.so.1 + librte_pipeline.so.2 + librte_pmd_bond.so.1 + librte_pmd_ring.so.2 + librte_port.so.2 + librte_power.so.1 + librte_reorder.so.1 + librte_ring.so.1 + librte_sched.so.1 + librte_table.so.2 + librte_timer.so.1 + librte_vhost.so.2 diff --git a/doc/guides/rel_notes/release_2_3.rst b/doc/guides/rel_notes/release_2_3.rst deleted file mode 100644 index 7945694..000 --- a/doc/guides/rel_notes/release_2_3.rst +++ /dev/null @@ -1,138 +0,0 @@ -DPDK Release 2.3 - - - -**Read this first** - -The text below explains how to update the release notes. - -Use proper spelling, capitalization and punctuation in all sections. - -Variable and config names should be quoted as fixed width text: ``LIKE_THIS``. - -Build the docs and view the output file to ensure the changes are correct:: -
[dpdk-dev] [PATCH v2 1/2] version: switch to year/month version numbers
From: "Richardson, Bruce" As discussed on list, switch numbering scheme to be based on year/month. Release 2.3 then becomes 16.04. Ref: http://dpdk.org/ml/archives/dev/2015-December/030336.html Also, added zero padding to the month so that it appear as 16.04 and not 16.4 in "make showversion" and rte_version(). Signed-off-by: Bruce Richardson Signed-off-by: John McNamara --- lib/librte_eal/common/include/rte_version.h | 10 +- mk/rte.sdkconfig.mk | 2 +- 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/lib/librte_eal/common/include/rte_version.h b/lib/librte_eal/common/include/rte_version.h index 6b1890e..4784e67 100644 --- a/lib/librte_eal/common/include/rte_version.h +++ b/lib/librte_eal/common/include/rte_version.h @@ -50,17 +50,17 @@ extern "C" { /** * String that appears before the version number */ -#define RTE_VER_PREFIX "RTE" +#define RTE_VER_PREFIX "DPDK" /** * Major version number i.e. the x in x.y.z */ -#define RTE_VER_MAJOR 2 +#define RTE_VER_MAJOR 16 /** * Minor version number i.e. the y in x.y.z */ -#define RTE_VER_MINOR 3 +#define RTE_VER_MINOR 4 /** * Patch level number i.e. the z in x.y.z @@ -105,13 +105,13 @@ rte_version(void) if (version[0] != 0) return version; if (strlen(RTE_VER_SUFFIX) == 0) - snprintf(version, sizeof(version), "%s %d.%d.%d", + snprintf(version, sizeof(version), "%s %d.%02d.%d", RTE_VER_PREFIX, RTE_VER_MAJOR, RTE_VER_MINOR, RTE_VER_PATCH_LEVEL); else - snprintf(version, sizeof(version), "%s %d.%d.%d%s%d", + snprintf(version, sizeof(version), "%s %d.%02d.%d%s%d", RTE_VER_PREFIX, RTE_VER_MAJOR, RTE_VER_MINOR, diff --git a/mk/rte.sdkconfig.mk b/mk/rte.sdkconfig.mk index f8d95b1..564a5c3 100644 --- a/mk/rte.sdkconfig.mk +++ b/mk/rte.sdkconfig.mk @@ -35,7 +35,7 @@ showversion: $$(sed -rne 's,^#define RTE_VER_[A-Z_]*[[:space:]]+([0-9]+).*,\1,p' \ -e 's,^#define RTE_VER_SUFFIX[[:space:]]+"(.*)",\1,p' \ $(RTE_SRCDIR)/lib/librte_eal/common/include/rte_version.h) ;\ - printf '%d.%d.%d' "$$1" "$$2" "$$3"; \ + printf '%d.%02d.%d' "$$1" "$$2" "$$3"; \ if [ -z "$$5" ]; then echo; \ else printf '%s' "$$4"; \ if [ $$5 -lt 16 ] ; then echo $$5; \ -- 2.5.0
[dpdk-dev] [PATCH] hash: fix CRC32c computation
Hi, Small typo. > -Original Message- > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Didier Pallard > Sent: Tuesday, December 22, 2015 9:35 AM > To: dev at dpdk.org; Richardson, Bruce ; De > Lara > Guarch, Pablo > Subject: [dpdk-dev] [PATCH] hash: fix CRC32c computation > diff --git a/lib/librte_hash/rte_hash_crc.h b/lib/librte_hash/rte_hash_crc.h > index > 78a34b7..485f8a2 100644 > --- a/lib/librte_hash/rte_hash_crc.h > +++ b/lib/librte_hash/rte_hash_crc.h > + * Use single crc32 instruction to perform a hash on a byte value. "on a byte value" should be ===> "on 2 bytes". > + * Fall back to software crc32 implementation in case SSE4.2 is > + * not supported > + * > + * @param data > + * Data to perform hash on. > + * @param init_val > + * Value to initialise hash generator. > + * @return > + * 32bit calculated hash value. > + */ > +static inline uint32_t > +rte_hash_crc_2byte(uint16_t data, uint32_t init_val) { #if defined > +RTE_ARCH_I686 || defined RTE_ARCH_X86_64 > + if (likely(crc32_alg & CRC32_SSE42)) > + return crc32c_sse42_u16(data, init_val); #endif > + > + return crc32c_2bytes(data, init_val); > +} > + Thanks, Reshma
[dpdk-dev] [PATCH v2 2/2] driver/net/mpipe: fix the crash/hung issue when testpmd quits
I split the change and resubmitted it as v3 http://dpdk.org/dev/patchwork/patch/10456/ It includes one of the two fixes. The other one is optional for now and could be submitted separately later. Thanks, Liming -Original Message- From: Bruce Richardson [mailto:bruce.richard...@intel.com] Sent: Tuesday, February 09, 2016 10:59 AM To: Liming Sun Cc: dev at dpdk.org Subject: Re: [dpdk-dev] [PATCH v2 2/2] driver/net/mpipe: fix the crash/hung issue when testpmd quits On Fri, Jan 08, 2016 at 09:39:08AM -0500, Liming Sun wrote: > 1. Fixed the compiling issue of the ethtool example on tilegx >platform. > 2. Fixed the hung/crash issue when quitting testpmd under high >traffic rate. The buffer error bit needs to be checked before >processing the idesc and releasing the buffer. Code logic is >also simplified. > Again, with two issues being solved, this looks like two patches. Can you also describe exactly the causes of the individual issues in each patch and how the patch fixes them. Please also include a fixes line for each patch as described here: http://dpdk.org/doc/guides/contributing/patches.html#commit-messages-body Thanks, /Bruce
[dpdk-dev] [PATCH v3] remove extra parentheses in return statement
2016-01-27 21:58, Huawei Xie: > v3 changes: > remove other extra parentheses in 'return (logical expressions)' > which checkpatch doesn't report as error > remove extra parentheses in return statement which crosses > multiple line > fix the document > > v2 changes: > add missed commit message in v1 > > fix the error reported by checkpatch: > "ERROR: return is not a function, parentheses are not required" > > remove parentheses in return like: > "return (logical expressions)" > > remove parentheses in return a function like: > "return (rte_mempool_lookup(...))" > > Fixes: 6307b909b8e0 ("lib: remove extra parenthesis after return") > > Signed-off-by: Huawei Xie > --- > 95 files changed, 436 insertions(+), 436 deletions(-) Wow 95 files changed! good work Applied, thanks
[dpdk-dev] [PATCH v2 1/2] driver/net/mpipe: add rte_vect.h and enable CONFIG_RTE_LIBRTE_LPM
Thanks Bruce. I split the changes as suggested and resubmitted as below. http://dpdk.org/dev/patchwork/patch/10454/ http://dpdk.org/dev/patchwork/patch/10455/ - Liming -Original Message- From: Bruce Richardson [mailto:bruce.richard...@intel.com] Sent: Tuesday, February 09, 2016 10:56 AM To: Liming Sun Cc: dev at dpdk.org Subject: Re: [dpdk-dev] [PATCH v2 1/2] driver/net/mpipe: add rte_vect.h and enable CONFIG_RTE_LIBRTE_LPM On Fri, Jan 08, 2016 at 09:39:07AM -0500, Liming Sun wrote: > rte_vect.h was missing earlier thus LPM was disabled and l3fwd is not > able to compile. This commit implements the vector api and enable LPM > in the tilegx configuration by default. It also includes a minor > optimization to use __insn_fetchadd4() instead of > rte_atomic32_xxx() in mpipe_dp_enter/mpipe_dp_exit to avoid the > unnecessary memory fence. This looks like it should be two patches to me. One patch to add the missing dependency and get lpm to work. The second patch should then contain the driver optimization. Do you agree? /Bruce PS: the commit title prefix for the first patch should probably be "eal/tile" rather than mpipe, since it's not directly affecting the mpipe driver. > > Signed-off-by: Liming Sun > Acked-by: Zhigang Lu > --- > config/defconfig_tile-tilegx-linuxapp-gcc | 2 +- > drivers/net/mpipe/mpipe_tilegx.c | 18 +++-- > lib/librte_eal/common/include/arch/tile/rte_vect.h | 93 > ++ > 3 files changed, 107 insertions(+), 6 deletions(-) create mode > 100644 lib/librte_eal/common/include/arch/tile/rte_vect.h > [snip] >+ mbuf->next = NULL; > > PMD_DEBUG_RX("%s: RX mbuf %p, buffer %p, buf_addr %p, size %d\n", >mpipe_name(priv), mbuf, va, mbuf->buf_addr, size); diff > --git > a/lib/librte_eal/common/include/arch/tile/rte_vect.h > b/lib/librte_eal/common/include/arch/tile/rte_vect.h > new file mode 100644 > index 000..32d768a > --- /dev/null > +++ b/lib/librte_eal/common/include/arch/tile/rte_vect.h > @@ -0,0 +1,93 @@ > +/* > + * BSD LICENSE > + * > + * Copyright (C) EZchip Semiconductor Ltd. 2015. Maybe update the copyright year? > + * > + * Redistribution and use in source and binary forms, with or without > + * modification, are permitted provided that the following conditions [snip]
[dpdk-dev] [PATCH] doc: drop old naming of the project
2016-02-09 17:20, Bruce Richardson: > On Mon, Feb 08, 2016 at 02:39:02PM +0100, Thomas Monjalon wrote: > > It was requested by Intel, more than one year ago, to replace the name > > "Intel DPDK" by "DPDK". > > Some references to the old name were still in some docs and code comments, > > leading to confusion. > > > > Fixes: ac8ada004c12 ("doc: remove Intel references from release notes") > > > > Signed-off-by: Thomas Monjalon > > Acked-by: Bruce Richardson Applied
[dpdk-dev] Cannot run DPDK applications in more than one Container simultaneously
Hi, I have tried the stops mentioned in the mail thread below.. http://dpdk.org/ml/archives/dev/2014-July/004031.html But I could not run the same applcation in two different containers. It seems once I start the application on the first container, the /mnt/huge filesystem cannot be locked from the second container.. until I stop the app on the first one, and 'umount /mnt/dev' on it.. Kind of like seen in the mail thread above.. Can anyone point to me any material on how to achieve the same? Thanks in advance -Pushpasis
[dpdk-dev] [PATCH] virtio: fix rx ring descriptor starvation
On Tue, Jan 05, 2016 at 07:13:04AM +, Xie, Huawei wrote: > On 12/17/2015 7:18 PM, Tom Kiely wrote: > > > > > > On 11/25/2015 05:32 PM, Xie, Huawei wrote: > >> On 11/13/2015 5:33 PM, Tom Kiely wrote: > >>> If all rx descriptors are processed while transient > >>> mbuf exhaustion is present, the rx ring ends up with > >>> no available descriptors. Thus no packets are received > >>> on that ring. Since descriptor refill is performed post > >>> rx descriptor processing, in this case no refill is > >>> ever subsequently performed resulting in permanent rx > >>> traffic drop. > >>> > >>> Signed-off-by: Tom Kiely > >>> --- > >>> drivers/net/virtio/virtio_rxtx.c |6 -- > >>> 1 file changed, 4 insertions(+), 2 deletions(-) > >>> > >>> diff --git a/drivers/net/virtio/virtio_rxtx.c > >>> b/drivers/net/virtio/virtio_rxtx.c > >>> index 5770fa2..a95e234 100644 > >>> --- a/drivers/net/virtio/virtio_rxtx.c > >>> +++ b/drivers/net/virtio/virtio_rxtx.c > >>> @@ -586,7 +586,8 @@ virtio_recv_pkts(void *rx_queue, struct rte_mbuf > >>> **rx_pkts, uint16_t nb_pkts) > >>> if (likely(num > DESC_PER_CACHELINE)) > >>> num = num - ((rxvq->vq_used_cons_idx + num) % > >>> DESC_PER_CACHELINE); > >>> -if (num == 0) > >>> +/* Refill free descriptors even if no pkts recvd */ > >>> +if (num == 0 && virtqueue_full(rxvq)) > >> Should the return condition be that no used buffers and we have avail > >> descs in avail ring, i.e, > >> num == 0 && rxvq->vq_free_cnt != rxvq->vq_nentries > >> > >> rather than > >> num == 0 && rxvq->vq_free_cnt == 0 > > Yes we could do that but I don't see a good reason to wait until the > > vq_free_cnt == vq_nentries > > before attempting the refill. The existing code will attempt refill > > even if only 1 packet was received > > and the free count is small. To me it seems safer to extend that to > > try refill even if no packet was received > > but the free count is non-zero. > The existing code attempt to refill only if 1 packet was received. > > If we want to refill even no packet was received, then the strict > condition should be > num == 0 && rxvq->vq_free_cnt != rxvq->vq_nentries > > The safer condition, what you want to use, should be > num == 0 && !virtqueue_full(...) > rather than > num == 0 && virtqueue_full(...) > > We could simplify things a bit, just remove this check, if the following > receiving code already takes care of the "num == 0" condition. > > I find virtqueue_full is confusing, maybe we could change it to some > other meaningful name. > > > > >Tom > > Ping. Tom and Huawei, what is the status of this patch? Will there be a V2? /Bruce
[dpdk-dev] [PATCH v2 1/2] version: switch to year/month version numbers
2016-02-10 14:33, John McNamara: > /** > * Major version number i.e. the x in x.y.z > */ > -#define RTE_VER_MAJOR 2 > +#define RTE_VER_MAJOR 16 > > /** > * Minor version number i.e. the y in x.y.z > */ > -#define RTE_VER_MINOR 3 > +#define RTE_VER_MINOR 4 > I liked the idea of Bruce of renaming these constants to *_YEAR and *_MONTH. I also think RTE_VER_PATCH_LEVEL should be RTE_VER_MINOR. And maybe that RTE_VER_PATCH_RELEASE should be RTE_VER_RELEASE. It shouldn't hurt applications as they must use RTE_VERSION and rte_version(). Opinions?
[dpdk-dev] [PATCH v2 ] ixgbe: speed up transmit
On Fri, Dec 11, 2015 at 06:52:36PM +, Ananyev, Konstantin wrote: > > -Original Message- > > From: Stephen Hemminger [mailto:shemming at brocade.com] > > Sent: Friday, December 11, 2015 4:48 PM > > To: dev at dpdk.org; Zhang, Helin; Ananyev, Konstantin > > Subject: Re: [PATCH v2 ] ixgbe: speed up transmit > > > > On Fri, 13 Nov 2015 08:10:13 -0800 > > Stephen Hemminger wrote: > > > > > From: Stephen Hemminger > > > > > > The freeing of mbuf's in ixgbe is one of the observable hot spots > > > under load. Optimize it by doing bulk free of mbufs using code similar > > > to i40e and fm10k. > > > > > > Drop the no longer needed micro-optimization for the no refcount flag. > > > > > > Signed-off-by: Stephen Hemminger > > > > How come this patch got no review or comments? > > It gets a visible performance gain of up to 10% in some cases. > > > > I understand maintainers are busy with internal work, but they need > > to read mailing list as well. > > Yep, I missed it somehow. > BTW, by some reason I couldn't find it patchwork > Was it submitted too late for 2.2 timeframe? > Anyway, about the patch itself: it looks good to me and indeed it provides > 10%+ > performance improvement. > So here is my ACK for it. > Konstantin Applied to dpdk-next-net/rel_16_04, with edited title to clarify that it's the non-vectorized tx code that is being sped up, and not all ixgbe transmit paths. Thanks, /Bruce
[dpdk-dev] i40e: cannot change mtu to enable jumbo frame
> -Original Message- > From: Julien Meunier [mailto:julien.meunier at 6wind.com] > Sent: Wednesday, February 10, 2016 12:36 AM > To: Zhang, Helin ; dev at dpdk.org > Subject: i40e: cannot change mtu to enable jumbo frame > > Hello Helin, > > I tried to send jumbo frames to a i40e card. However, I observed that all > frames > are dropped. Moreover, set_mtu function is not implemented on i40e PMD. > > > testpmd --log-level 8 --huge-dir=/mnt/huge -n 4 -l 2,18 --socket-mem > 1024,1024 -w :02:00.0 -w :02:00.2 -- -i --nb-cores=1 > --nb-ports=2 --total-num-mbufs=65536 > > = > Configuration > = > > +---+ +-+ > | | | | > | tgen | | | > | +--+ port 0 | > | | | | > | | | | > | | | | > | | | | > | +--+ port 1 | > | | | | > +---+ +-+ > > DPDK: DPDK-v2.2 > > == > MTU = 1500 > == > Packet sent from a tgen > > p = Ether / IP / UDP / Raw(MTU + HDR(Ethernet)- HDR(IP) - HDR(UDP)) > > len(p) = 1514 > > testpmd> start > PMD: i40e_rxd_to_vlan_tci(): Mbuf vlan_tci: 0, vlan_tci_outer: 0 > testpmd> stop > Telling cores to stop... > Waiting for lcores to finish... > PMD: i40e_update_vsi_stats(): *** VSI[13] stats start *** > PMD: i40e_update_vsi_stats(): rx_bytes:1518 > PMD: i40e_update_vsi_stats(): rx_unicast: 1 > PMD: i40e_update_vsi_stats(): *** VSI[13] stats end *** > PMD: i40e_dev_stats_get(): *** PF stats start *** > PMD: i40e_dev_stats_get(): rx_bytes:1514 > PMD: i40e_dev_stats_get(): rx_unicast: 1 > PMD: i40e_dev_stats_get(): rx_unknown_protocol: 1 > PMD: i40e_dev_stats_get(): rx_size_1522: 1 > PMD: i40e_dev_stats_get(): *** PF stats end *** > >-- Forward statistics for port 0 -- >RX-packets: 1 RX-dropped: 0 RX-total: 1 >TX-packets: 0 TX-dropped: 0 TX-total: 0 > > > PMD: i40e_update_vsi_stats(): *** VSI[14] stats start *** > PMD: i40e_update_vsi_stats(): tx_bytes:1514 > PMD: i40e_update_vsi_stats(): tx_unicast: 1 > PMD: i40e_update_vsi_stats(): *** VSI[14] stats end *** > PMD: i40e_dev_stats_get(): *** PF stats start *** > PMD: i40e_dev_stats_get(): tx_bytes:1514 > PMD: i40e_dev_stats_get(): tx_unicast: 1 > PMD: i40e_dev_stats_get(): tx_size_1522: 1 > PMD: i40e_dev_stats_get(): *** PF stats end *** > >-- Forward statistics for port 1 -- >RX-packets: 0 RX-dropped: 0 RX-total: 0 >TX-packets: 1 TX-dropped: 0 TX-total: 1 > > > >+ Accumulated forward statistics for all ports+ >RX-packets: 1 RX-dropped: 0 RX-total: 1 >TX-packets: 1 TX-dropped: 0 TX-total: 1 > > > > > => OK > > == > MTU = 1600 > == > Packet sent > > p = Ether / IP / UDP / Raw(MTU + HDR(Ethernet)- HDR(IP) - HDR(UDP)) > > len(p) = 1614 > > testpmd> port config mtu 0 1600 > rte_eth_dev_set_mtu: Function not supported Set MTU failed. diag=-95 > testpmd> port config mtu 1 1600 > rte_eth_dev_set_mtu: Function not supported Set MTU failed. diag=-95 > testpmd> start > testpmd> stop > Telling cores to stop... > Waiting for lcores to finish... > PMD: i40e_update_vsi_stats(): *** VSI[13] stats start *** > PMD: i40e_update_vsi_stats(): rx_bytes:1618 > PMD: i40e_update_vsi_stats(): rx_unicast: 1 > PMD: i40e_update_vsi_stats(): *** VSI[13] stats end *** > PMD: i40e_dev_stats_get(): *** PF stats start *** > PMD: i40e_dev_stats_get(): rx_bytes:1614 > PMD: i40e_dev_stats_get(): rx_unicast: 1 > PMD: i40e_dev_stats_get(): rx_unknown_protocol: 1 > PMD: i40e_dev_stats_get(): rx_size_big: 1 > PMD: i40e_dev_stats_get(): *** PF stats end *** > >-- Forward statistics for port 0 -- >RX-packets: 1 RX-dropped: 0 RX-total: 1 >TX-packets: 0 TX-dropped: 0 TX-total: 0 > > > PMD: i40e_update_vsi_stats(): *** VSI[14] stats start *** > PMD: i40e_update_vsi_stats(): tx_bytes:0 > PMD: i40e_update_vsi_stats(): tx_unicast: 0 > PMD: i40e_update_vsi_stats(): *** VSI[14] stats end *** > PMD: i40e_dev_stats_get(): *** PF stats start *** > PMD: i40e_dev_stats_get(): tx_bytes:0 > PM
[dpdk-dev] [PATCH v2 1/2] version: switch to year/month version numbers
On Wed, Feb 10, 2016 at 04:11:40PM +0100, Thomas Monjalon wrote: > 2016-02-10 14:33, John McNamara: > > /** > > * Major version number i.e. the x in x.y.z > > */ > > -#define RTE_VER_MAJOR 2 > > +#define RTE_VER_MAJOR 16 > > > > /** > > * Minor version number i.e. the y in x.y.z > > */ > > -#define RTE_VER_MINOR 3 > > +#define RTE_VER_MINOR 4 > > > > I liked the idea of Bruce of renaming these constants to > *_YEAR and *_MONTH. > I also think RTE_VER_PATCH_LEVEL should be RTE_VER_MINOR. > And maybe that RTE_VER_PATCH_RELEASE should be RTE_VER_RELEASE. > > It shouldn't hurt applications as they must use RTE_VERSION > and rte_version(). > > Opinions? Well, I always tend to like Bruce's ideas too, so +1 from me :-) /Bruce
[dpdk-dev] [PATCH] virtio: fix virtio_net_hdr desc pointing to the same buffer
On Mon, Dec 14, 2015 at 01:58:34PM +, Xie, Huawei wrote: > > > > -Original Message- > > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com] > > Sent: Monday, December 14, 2015 9:45 PM > > To: Xie, Huawei > > Cc: Yuanhan Liu; dev at dpdk.org > > Subject: Re: [dpdk-dev] [PATCH] virtio: fix virtio_net_hdr desc > > pointing to the same buffer > > > > 2015-12-14 13:38, Xie, Huawei: > Np. There is no issue either apply this patch or delay it to 2.3. > Patch now applied to dpdk-next-net/rel_16_04. /Bruce
[dpdk-dev] [PATCH v2] af_packet: make the device detachable
Hi Bruce, >The use of "deinitialization" sounds awkward Thank you for your interest. I called it deinitialization in opposition to an initialization of a device. As I'm not a native English speaker, I trust your opinion and I'll try to rephrase this. Hi Bernard, >What parameters do you use with --vdev option in testpmd I launch testpmd like this: # ./testpmd -c f -n 4 --vdev 'eth_af_packet0,iface=eth2' -- -i --port-topology=chained Then I can see that af_packet starts: PMD: Initializing pmd_af_packet for eth_af_packet0 PMD: eth_af_packet0: AF_PACKET MMAP parameters: PMD: eth_af_packet0:block size 4096 PMD: eth_af_packet0:block count 256 PMD: eth_af_packet0:frame size 2048 PMD: eth_af_packet0:frame count 512 PMD: eth_af_packet0: creating AF_PACKET-backed ethdev on numa socket When I get to the CLI, I do as follows: Checking link statuses... Port 0 Link Up - speed 1 Mbps - full-duplex Done testpmd> port stop 0 Stopping ports... Checking link statuses... Port 0 Link Down Done testpmd> port close 0 Closing ports... Done testpmd> port detach 0 Detaching a port... PMD: Closing AF_PACKET ethdev on numa socket 0 Port 'eth_af_packet0' is detached. Now total ports is 0 Done testpmd> In opposition, without the patch detach is impossible: testpmd> port detach 0 Detaching a port... EAL: Driver, cannot detach the device testpmd> Bernard, Bruce, I have a question, if I may. Do you know what is the reason that rte_pmd_af_packet_devinit() is the only non-static device initialization function among all the dpdk drivers? There's even a comment in the rte_eth_af_packet.h: /** * For use by the EAL only. Called as part of EAL init to set up any dummy NICs * configured on command line. */ int rte_pmd_af_packet_devinit(const char *name, const char *params); Despite the comment above, I cannot see this function being called directly anywhere. Is there any reason it is implemented this way? Or should I change the definition to static, as it should be called via proper API functions? Thank you for your time, Wojtek 2016-02-09 17:37 GMT+01:00 Bruce Richardson : > On Tue, Feb 09, 2016 at 05:09:06PM +0100, Wojciech Zmuda wrote: >> Implement rte_pmd_af_packet_devuninit() exposed through struct >> rte_driver.uninit() and set dev_flags to RTE_ETH_DEV_DETACHABLE, >> to allow af_packet device deinitialization with API function >> rte_eth_dev_detach(). This fixes memory leak by freeing memory >> allocated during initialization. >> During device initialization copy device name to ethdev->data to make >> it compatible with rte_eth_dev_allocated(). >> >> Signed-off-by: Wojciech Zmuda >> --- >> v2: >> * Fixed typo and a comment. >> * Added feature to the 2.3 release notes. >> * Free memory allocated for rx and tx queues. >> >> doc/guides/rel_notes/release_2_3.rst | 4 >> drivers/net/af_packet/rte_eth_af_packet.c | 37 >> ++- >> 2 files changed, 40 insertions(+), 1 deletion(-) >> >> diff --git a/doc/guides/rel_notes/release_2_3.rst >> b/doc/guides/rel_notes/release_2_3.rst >> index 7945694..4694646 100644 >> --- a/doc/guides/rel_notes/release_2_3.rst >> +++ b/doc/guides/rel_notes/release_2_3.rst >> @@ -39,6 +39,10 @@ This section should contain new features added in this >> release. Sample format: >> >>Enabled virtio 1.0 support for virtio pmd driver. >> >> +* **Added af_packet driver deinitialization function.** >> + >> + Implemented rte_pmd_af_packet_devuninit() exposed through struct >> + rte_driver.uninit() to allow af_packet device deinitialization with API >> function. >> > > The use of "deinitialization" sounds awkward, and the overall text maybe > could be > made less technical. Maybe talk about "allowing dynamic removal" of af_packet > devices [or even hotplug of them]? > > /Bruce
[dpdk-dev] [PATCH] ixgbe: Discard SRIOV transparent vlan packet headers.
On Thu, Dec 17, 2015 at 10:30:42AM +, Tom Kiely wrote: > Sorry for the delay in replying to this thread. I was on vacation for the > last 3 days. Please see inline for my comments. > > On 12/15/2015 02:37 PM, Ananyev, Konstantin wrote: > > > >>-Original Message- > >>From: Stephen Hemminger [mailto:stephen at networkplumber.org] > >>Sent: Monday, December 14, 2015 9:35 PM > >>To: Ananyev, Konstantin > >>Cc: Zhang, Helin; dev at dpdk.org; Tom Kiely > >>Subject: Re: [PATCH] ixgbe: Discard SRIOV transparent vlan packet headers. > >> > >>On Mon, 14 Dec 2015 19:57:10 + > >>"Ananyev, Konstantin" wrote: > >> > >>> > -Original Message- > From: Stephen Hemminger [mailto:stephen at networkplumber.org] > Sent: Monday, December 14, 2015 7:25 PM > To: Ananyev, Konstantin > Cc: Zhang, Helin; dev at dpdk.org; Tom Kiely > Subject: Re: [PATCH] ixgbe: Discard SRIOV transparent vlan packet headers. > > On Mon, 14 Dec 2015 19:12:26 + > "Ananyev, Konstantin" wrote: > > > > >>From: Stephen Hemminger [mailto:stephen at networkplumber.org] > >>Sent: Friday, December 11, 2015 4:59 PM > >>To: Zhang, Helin; Ananyev, Konstantin > >>Cc: dev at dpdk.org; Tom Kiely; Stephen Hemminger > >>Subject: [PATCH] ixgbe: Discard SRIOV transparent vlan packet headers. > >> > >>From: Tom Kiely > >> > >>SRIOV VFs support "transparent" vlans. Traffic from/to a VM > >>associated with a VF is tagged/untagged with the specified > >>vlan in a manner intended to be totally transparent to the VM. > >> > >>The vlan is specified by "ip link set vf vlan ". > >>The VM is not configured for any vlan on the VF and the VM > >>should never see these transparent vlan headers for that reason. > >> > >>However, in practice these vlan headers are being received by > >>the VM which discards the packets as that vlan is unknown to it. > >>The Linux kernel explicitly discards such vlan headers but DPDK > >>does not. > >>This patch mirrors the kernel behaviour for SRIOV VFs only > > > >I have few concerns about that approach: > > > >1. I don't think vlan_tci info should *always* be stripped by vf RX > >routine. > >There could be configurations when that information might be needed by > >upper layer. > >Let say VF can be member of 2 or more VLANs and upper layer would like > >to have that information > >for further processing. > >Or special mirror VF, that does traffic snnoping, or something else. > >2. Proposed implementation would introduce a slowdown for all VF RX > >routines. > >3. From the description it seems like the aim is to clear VLAN > >information for the RX packet. > >Though the patch actually clears VLAN info only for the RX packet whose > >VLAN tag is not present inside SW copy of VFTA table. > >Which makes no much point to me: > >If VLAN is not present in HW VFTA table, then packet with that VLAN tag > >will be discarded by HW anyway. > >If it is present inside VFTA table (both SW & HW), then VLAN information > >would be preserved with and without the patch. > > > >If you need to clear VLAN information, why not to do it on the upper > >layer - inside your application itself? > >Either create some sort of wrapper around rx_burst(), or setup an RX > >call-back for your VF device. > > > >Konstantin > > The aim is to get SRIOV to work when the transparent VLAN tag feature is > used. > Please talk to the Linux driver team. Similar code exists there in > ixgbevf_process_skb_fields. > >>> > >>>Ah ok, I realised what you are trying to achieve now: > >>>You setup HW VFTA[] from the PF, so from VF point of view SW copy of the > >>>VFTA[] remains unset. > >>>So HW will pass VLAN packet in, but then SW will clear VLAN tag. > >>>Ok, that clears #3 above, but I think #1,2 still remain. > >>On the host, what configured is a vlan tag per VF per guest > >> > >>Tom had more info in the original mail. > >> > >>http://permalink.gmane.org/gmane.comp.networking.dpdk.devel/28932 > >> > The other option is have a copy of all the receive logic which is only > used by VF code. > >>>Why that's the only option? > >>>Why can't you clear that VLAN information above the PMD layer? > >>>Keep/obtain a copy of VFTA[] somewhere on the upper layer, > >>>and do actual clear after rx_burst() returns? > >>>Konstantin > >>The problem is that the guest is supposed to not see the VLAN tags (it has > >>no reason to), > >>but the hardware leaves a VLAN tag on there. > >Yes, I understand what you are trying to achieve. > > What I am trying to say: > >1. VLAN tag removing shouldn't be forced for all VFs. > >I think there are scenarios where existing behaviour (keeping vlan_tci and > >ol_flags intact) are what people need. > >One example would be mirror VF doing other VFs traffic snoopi
[dpdk-dev] [PATCH] i40e: fix inverted check for ETH_TXQ_FLAGS_NOREFCOUNT
On Wed, Dec 23, 2015 at 08:42:26AM +, Zhang, Helin wrote: > > > > -Original Message- > > From: Rich Lane [mailto:rich.lane at bigswitch.com] > > Sent: Wednesday, December 23, 2015 4:08 PM > > To: dev at dpdk.org > > Cc: Zhang, Helin > > Subject: [PATCH] i40e: fix inverted check for ETH_TXQ_FLAGS_NOREFCOUNT > > > > The no-refcount path was being taken without the application opting in to > > it. > > > > Reported-by: Mike Stolarchuk > > Signed-off-by: Rich Lane > Acked-by: Helin Zhang > > Thanks for the good catch! Applied to dpdk-next-net/rel_16_04 Thanks, /Bruce
[dpdk-dev] [PATCH] remove redundant __func__ in PMD_INIT_LOG and PMD_RX_LOG
On Mon, Dec 28, 2015 at 09:39:44AM -0800, Stephen Hemminger wrote: > On Mon, 28 Dec 2015 01:28:28 +0800 > Huawei Xie wrote: > > > > > Signed-off-by: Huawei Xie > > --- > > drivers/net/virtio/virtio_ethdev.c | 12 +--- > > drivers/net/vmxnet3/vmxnet3_ethdev.c | 6 +++--- > > drivers/net/vmxnet3/vmxnet3_rxtx.c | 2 +- > > 3 files changed, 9 insertions(+), 11 deletions(-) > > > > Looks good, probably worth auditing the code for extra newlines as well at > some point. > > Acked-by: Stephen Hemminger Patch split into 2 - one for virtio and one for vmxnet3 - and applied to dpdk-next-net/rel_16_04 /Bruce
[dpdk-dev] [PATCH] cryptodev: add capabilities discovery mechanism
Hi Declan, > -Original Message- > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Declan Doherty > Sent: Saturday, January 30, 2016 1:11 PM > To: dev at dpdk.org > Subject: [dpdk-dev] [PATCH] cryptodev: add capabilities discovery mechanism > > This patch add a mechanism for discovery of crypto device features and > supported crypto operations and algorithms. It also provides a method for a > crypto PMD to publish any data range limitations it may have for the > operations > and algorithms it supports. > > The parameter feature_flags added to rte_cryptodev struct is used to capture > features such as operations supported (symmetric crypto, operation chaining > etc) as well parameter such as whether the device is hardware accelerated or > uses SIMD instructions. > > The capabilities parameter allows a PMD to define an array of supported > operations with any limitation which that implementation may have. > > Finally the rte_cryptodev_info struct has been extended to allow retrieval of > these parameter using the existing rte_cryptodev_info_get() API. > > Signed-off-by: Declan Doherty There's a lot of good data in the Capabilities struct, but is there a use-case for it? Is it likely an application will call this API and validate everything? Or is it more likely it would just check for algorithm support and if it then uses an unsupported block_size, key_size, etc it will get an invalid param from the PMD and handle this? Secondly, instead of spreading capabilities across 2 data elements (feature bitmask and op capabilities struct), see suggestion below to roll all into a single struct. //snip// > + > +/** > + * Symmetric Crypto Capability > + */ > +struct rte_cryptodev_symmetric_capability { > + enum rte_crypto_xform_type type; > + /**< Transform type : Authentication / Cipher */ > + It would be more helpful to call this xform_type > + union { > + struct { > + enum rte_crypto_auth_algorithm algo; > + /**< authentication algorithm */ > + uint16_t block; > + /**< algorithm block size */ > + uint16_t key; > + /**< Key size supported */ > + uint16_t digest; > + /**< Maximum digest size supported */ > + struct { > + uint16_t min; /**< minimum aad size */ > + uint16_t max; /**< maximum aad size */ > + uint16_t increment; > + /**< if a range of sizes are supported, > + * this parameter is used to indicate > + * increments in byte size that are supported > + * between the minimum and maximum */ > + } aad; > + /**< Additional authentication data size range */ > + } auth; > + /**< Symmetric Authentication transform capabilities */ > + struct { > + enum rte_crypto_cipher_algorithm algo; > + /**< cipher algorithm */ > + uint16_t block_size; > + /**< algorithm block size */ > + struct { > + uint16_t min; /**< minimum key size */ > + uint16_t max; /**< maximum key size */ > + uint16_t increment; > + /**< if a range of sizes are supported, > + * this parameter is used to indicate > + * increments in byte size that are supported > + * between the minimum and maximum */ > + } key; > + /**< cipher key size range */ > + struct { > + uint16_t min; /**< minimum iv size */ > + uint16_t max; /**< maximum iv size */ > + uint16_t increment; > + /**< if a range of sizes are supported, > + * this parameter is used to indicate > + * increments in byte size that are supported > + * between the minimum and maximum */ > + } iv; > + /**< Initialisation vector data size range */ > + } cipher; > + /**< Symmetric Cipher transform capabilities */ > + }; > +}; > + > +/** Structure used to capture a capability of a crypto device */ struct > +rte_cryptodev_capabilities { > + enum rte_crypto_op_type op; > + /**< Operation type */ > + > + union { > + struct rte_cryptodev_symmetric_capability sym; > + /**< Symmetric operation capability parameters */ > + }; > +}; > + > +/** Macro
[dpdk-dev] [PATCH v2 1/2] version: switch to year/month version numbers
> -Original Message- > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com] > Sent: Wednesday, February 10, 2016 3:12 PM > To: Mcnamara, John > Cc: dev at dpdk.org; Richardson, Bruce > Subject: Re: [PATCH v2 1/2] version: switch to year/month version numbers > > ... > > I liked the idea of Bruce of renaming these constants to *_YEAR and > *_MONTH. > I also think RTE_VER_PATCH_LEVEL should be RTE_VER_MINOR. > And maybe that RTE_VER_PATCH_RELEASE should be RTE_VER_RELEASE. > > It shouldn't hurt applications as they must use RTE_VERSION and > rte_version(). > > Opinions? That seems like a good change. I'll resubmit. John. --
[dpdk-dev] [PATCH v2] mempool: reduce rte_mempool structure size
Hi Keith, Thank you for adding the RTE_NEXT_ABI. I think this is the way described in the process. Your changes will be available in next version (16.4) for people compiling with RTE_NEXT_ABI=y, and in 16.7 without option (I'm just surprised that RTE_NEXT_ABI=y in default configs...). I think a deprecation notice should also be added in this commit in doc/guides/rel_notes/deprecation.rst. Please also find comments below. On 02/09/2016 06:30 PM, Keith Wiles wrote: > diff --git a/config/defconfig_x86_64-native-linuxapp-gcc > b/config/defconfig_x86_64-native-linuxapp-gcc > index 60baf5b..02e9ace 100644 > --- a/config/defconfig_x86_64-native-linuxapp-gcc > +++ b/config/defconfig_x86_64-native-linuxapp-gcc > @@ -40,3 +40,8 @@ CONFIG_RTE_ARCH_64=y > > CONFIG_RTE_TOOLCHAIN="gcc" > CONFIG_RTE_TOOLCHAIN_GCC=y > +CONFIG_RTE_BUILD_SHARED_LIB=y > +CONFIG_RTE_NEXT_ABI=n > +CONFIG_RTE_EAL_IGB_UIO=n > +CONFIG_RTE_LIBRTE_KNI=n > +CONFIG_RTE_KNI_KMOD=n I think this should not be part of the patch. > @@ -672,6 +704,24 @@ rte_mempool_count(const struct rte_mempool *mp) > static unsigned > rte_mempool_dump_cache(FILE *f, const struct rte_mempool *mp) > { > +#ifdef RTE_NEXT_ABI > + unsigned lcore_id; > + unsigned count = 0; > + unsigned cache_count; > + > + fprintf(f, " cache infos:\n"); > + fprintf(f, "cache_size=%"PRIu32"\n", mp->cache_size); > + if (mp->cache_size == 0) > + return count; > + > + for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) { > + cache_count = mp->local_cache[lcore_id].len; > + fprintf(f, "cache_count[%u]=%u\n", lcore_id, cache_count); > + count += cache_count; > + } > + fprintf(f, "total_cache_count=%u\n", count); > + return count; > +#else > #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0 > unsigned lcore_id; > unsigned count = 0; I think in this case we could avoid to duplicate the code without beeing unclear by using the proper #ifdefs: #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0 || defined(RTE_NEXT_ABI) /* common code */ #ifdef RTE_NEXT_ABI if (mp->cache_size == 0) return count; #endif /* common code */ #else ... #endif > @@ -755,6 +806,26 @@ mempool_audit_cookies(const struct rte_mempool *mp) > #define mempool_audit_cookies(mp) do {} while(0) > #endif > > +#ifdef RTE_NEXT_ABI > +/* check cookies before and after objects */ > +static void > +mempool_audit_cache(const struct rte_mempool *mp) > +{ > + /* check cache size consistency */ > + unsigned lcore_id; > + > + if (mp->cache_size == 0) > + return; > + > + for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) { > + if (mp->local_cache[lcore_id].len > mp->cache_flushthresh) { > + RTE_LOG(CRIT, MEMPOOL, "badness on cache[%u]\n", > + lcore_id); > + rte_panic("MEMPOOL: invalid cache len\n"); > + } > + } > +} > +#else same here > diff --git a/lib/librte_mempool/rte_mempool.h > b/lib/librte_mempool/rte_mempool.h > index 6e2390a..fc9b595 100644 > --- a/lib/librte_mempool/rte_mempool.h > +++ b/lib/librte_mempool/rte_mempool.h > @@ -95,6 +95,19 @@ struct rte_mempool_debug_stats { > } __rte_cache_aligned; > #endif > > +#ifdef RTE_NEXT_ABI > +/** > + * A structure that stores a per-core object cache. > + */ > +struct rte_mempool_cache { > + unsigned len; /**< Cache len */ > + /* > + * Cache is allocated to this size to allow it to overflow in certain > + * cases to avoid needless emptying of cache. > + */ > + void *objs[RTE_MEMPOOL_CACHE_MAX_SIZE * 3]; /**< Cache objects */ > +} __rte_cache_aligned; > +#else same here > @@ -755,19 +793,25 @@ static inline void __attribute__((always_inline)) > __mempool_put_bulk(struct rte_mempool *mp, void * const *obj_table, > unsigned n, int is_mp) > { > +#ifndef RTE_NEXT_ABI /* Note: ifndef */ > #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0 > +#endif /* RTE_NEXT_ABI */ > struct rte_mempool_cache *cache; > uint32_t index; > void **cache_objs; > unsigned lcore_id = rte_lcore_id(); > uint32_t cache_size = mp->cache_size; > uint32_t flushthresh = mp->cache_flushthresh; > +#ifndef RTE_NEXT_ABI /* Note: ifndef */ > #endif /* RTE_MEMPOOL_CACHE_MAX_SIZE > 0 */ > +#endif /* RTE_NEXT_ABI */ this looks strange... I think it does not work properly. Why not #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0 || defined(RTE_NEXT_ABI) > /* increment stat now, adding in mempool always success */ > __MEMPOOL_STAT_ADD(mp, put, n); > > +#ifndef RTE_NEXT_ABI /* Note: ifndef */ > #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0 > +#endif /* RTE_NEXT_ABI */ > /* cache is not enabled or single producer or non-EAL thread */ > if (unlikely(cache_size == 0 || is_mp == 0 || >lcore_id >= RTE_MAX_LCORE)) > @@ -802,7 +846,9 @@ __mempool_put_bulk(struct rte_mempool *
[dpdk-dev] [PATCH v3 1/2] version: switch to year/month version numbers
From: "Richardson, Bruce" As discussed on list, switch numbering scheme to be based on year/month. Release 2.3 then becomes 16.04. Ref: http://dpdk.org/ml/archives/dev/2015-December/030336.html Also, added zero padding to the month so that it appear as 16.04 and not 16.4 in "make showversion" and rte_version(). Signed-off-by: Bruce Richardson Signed-off-by: John McNamara --- drivers/net/nfp/nfp_net.c | 2 -- lib/librte_eal/common/include/rte_version.h | 42 ++--- mk/rte.sdkconfig.mk | 2 +- 3 files changed, 22 insertions(+), 24 deletions(-) diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c index ffa5730..fd4dd39 100644 --- a/drivers/net/nfp/nfp_net.c +++ b/drivers/net/nfp/nfp_net.c @@ -1070,9 +1070,7 @@ nfp_net_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info) }; dev_info->reta_size = NFP_NET_CFG_RSS_ITBL_SZ; -#if RTE_VER_MAJOR == 2 && RTE_VER_MINOR >= 1 dev_info->hash_key_size = NFP_NET_CFG_RSS_KEY_SZ; -#endif } static uint32_t diff --git a/lib/librte_eal/common/include/rte_version.h b/lib/librte_eal/common/include/rte_version.h index 0baf130..779f17f 100644 --- a/lib/librte_eal/common/include/rte_version.h +++ b/lib/librte_eal/common/include/rte_version.h @@ -50,22 +50,22 @@ extern "C" { /** * String that appears before the version number */ -#define RTE_VER_PREFIX "RTE" +#define RTE_VER_PREFIX "DPDK" /** - * Major version number i.e. the x in x.y.z + * Major version/year number i.e. the yy in yy.mm.z */ -#define RTE_VER_MAJOR 2 +#define RTE_VER_YEAR 16 /** - * Minor version number i.e. the y in x.y.z + * Minor version/month number i.e. the mm in yy.mm.z */ -#define RTE_VER_MINOR 3 +#define RTE_VER_MONTH 4 /** - * Patch level number i.e. the z in x.y.z + * Patch level number i.e. the z in yy.mm.z */ -#define RTE_VER_PATCH_LEVEL 0 +#define RTE_VER_MINOR 0 /** * Extra string to be appended to version number @@ -77,7 +77,7 @@ extern "C" { * 0-15 = release candidates * 16 = release */ -#define RTE_VER_PATCH_RELEASE 0 +#define RTE_VER_RELEASE 0 /** * Macro to compute a version number usable for comparisons @@ -88,10 +88,10 @@ extern "C" { * All version numbers in one to compare with RTE_VERSION_NUM() */ #define RTE_VERSION RTE_VERSION_NUM( \ - RTE_VER_MAJOR, \ + RTE_VER_YEAR, \ + RTE_VER_MONTH, \ RTE_VER_MINOR, \ - RTE_VER_PATCH_LEVEL, \ - RTE_VER_PATCH_RELEASE) + RTE_VER_RELEASE) /** * Function returning version string @@ -105,21 +105,21 @@ rte_version(void) if (version[0] != 0) return version; if (strlen(RTE_VER_SUFFIX) == 0) - snprintf(version, sizeof(version), "%s %d.%d.%d", + snprintf(version, sizeof(version), "%s %d.%02d.%d", RTE_VER_PREFIX, - RTE_VER_MAJOR, - RTE_VER_MINOR, - RTE_VER_PATCH_LEVEL); + RTE_VER_YEAR, + RTE_VER_MONTH, + RTE_VER_MINOR); else - snprintf(version, sizeof(version), "%s %d.%d.%d%s%d", + snprintf(version, sizeof(version), "%s %d.%02d.%d%s%d", RTE_VER_PREFIX, - RTE_VER_MAJOR, + RTE_VER_YEAR, + RTE_VER_MONTH, RTE_VER_MINOR, - RTE_VER_PATCH_LEVEL, RTE_VER_SUFFIX, - RTE_VER_PATCH_RELEASE < 16 ? - RTE_VER_PATCH_RELEASE : - RTE_VER_PATCH_RELEASE - 16); + RTE_VER_RELEASE < 16 ? + RTE_VER_RELEASE : + RTE_VER_RELEASE - 16); return version; } diff --git a/mk/rte.sdkconfig.mk b/mk/rte.sdkconfig.mk index f8d95b1..564a5c3 100644 --- a/mk/rte.sdkconfig.mk +++ b/mk/rte.sdkconfig.mk @@ -35,7 +35,7 @@ showversion: $$(sed -rne 's,^#define RTE_VER_[A-Z_]*[[:space:]]+([0-9]+).*,\1,p' \ -e 's,^#define RTE_VER_SUFFIX[[:space:]]+"(.*)",\1,p' \ $(RTE_SRCDIR)/lib/librte_eal/common/include/rte_version.h) ;\ - printf '%d.%d.%d' "$$1" "$$2" "$$3"; \ + printf '%d.%02d.%d' "$$1" "$$2" "$$3"; \ if [ -z "$$5" ]; then echo; \ else printf '%s' "$$4"; \ if [ $$5 -lt 16 ] ; then echo $$5; \ -- 2.5.0
[dpdk-dev] [PATCH v3 2/2] doc: rename release notes 2.3 to 16.04
From: "Richardson, Bruce" Updated release documentation to reflect new numbering scheme. Signed-off-by: Bruce Richardson Signed-off-by: John McNamara --- doc/guides/rel_notes/index.rst | 2 +- doc/guides/rel_notes/release_16_04.rst | 138 + doc/guides/rel_notes/release_2_3.rst | 138 - 3 files changed, 139 insertions(+), 139 deletions(-) create mode 100644 doc/guides/rel_notes/release_16_04.rst delete mode 100644 doc/guides/rel_notes/release_2_3.rst diff --git a/doc/guides/rel_notes/index.rst b/doc/guides/rel_notes/index.rst index 29013cf..84317b8 100644 --- a/doc/guides/rel_notes/index.rst +++ b/doc/guides/rel_notes/index.rst @@ -36,7 +36,7 @@ Release Notes :numbered: rel_description -release_2_3 +release_16_04 release_2_2 release_2_1 release_2_0 diff --git a/doc/guides/rel_notes/release_16_04.rst b/doc/guides/rel_notes/release_16_04.rst new file mode 100644 index 000..27fc624 --- /dev/null +++ b/doc/guides/rel_notes/release_16_04.rst @@ -0,0 +1,138 @@ +DPDK Release 16.04 +== + + +**Read this first** + +The text below explains how to update the release notes. + +Use proper spelling, capitalization and punctuation in all sections. + +Variable and config names should be quoted as fixed width text: ``LIKE_THIS``. + +Build the docs and view the output file to ensure the changes are correct:: + + make doc-guides-html + + firefox build/doc/html/guides/rel_notes/release_16_04.html + + +New Features + + +This section should contain new features added in this release. Sample format: + +* **Add a title in the past tense with a full stop.** + + Add a short 1-2 sentence description in the past tense. The description + should be enough to allow someone scanning the release notes to understand + the new feature. + + If the feature adds a lot of sub-features you can use a bullet list like this. + + * Added feature foo to do something. + * Enhanced feature bar to do something else. + + Refer to the previous release notes for examples. + +* **Virtio 1.0.** + + Enabled virtio 1.0 support for virtio pmd driver. + + +Resolved Issues +--- + +This section should contain bug fixes added to the relevant sections. Sample format: + +* **code/section Fixed issue in the past tense with a full stop.** + + Add a short 1-2 sentence description of the resolved issue in the past tense. + The title should contain the code/lib section like a commit message. + Add the entries in alphabetic order in the relevant sections below. + + +EAL +~~~ + + +Drivers +~~~ + + +Libraries +~ + + +Examples + + + +Other +~ + + +Known Issues + + +This section should contain new known issues in this release. Sample format: + +* **Add title in present tense with full stop.** + + Add a short 1-2 sentence description of the known issue in the present + tense. Add information on any known workarounds. + + +API Changes +--- + +This section should contain API changes. Sample format: + +* Add a short 1-2 sentence description of the API change. Use fixed width + quotes for ``rte_function_names`` or ``rte_struct_names``. Use the past tense. + + +ABI Changes +--- + +* Add a short 1-2 sentence description of the ABI change that was announced in + the previous releases and made in this release. Use fixed width quotes for + ``rte_function_names`` or ``rte_struct_names``. Use the past tense. + + +Shared Library Versions +--- + +Update any library version updated in this release and prepend with a ``+`` sign. + +The libraries prepended with a plus sign were incremented in this version. + +.. code-block:: diff + + libethdev.so.2 + librte_acl.so.2 + librte_cfgfile.so.2 + librte_cmdline.so.1 + librte_distributor.so.1 + librte_eal.so.2 + librte_hash.so.2 + librte_ip_frag.so.1 + librte_ivshmem.so.1 + librte_jobstats.so.1 + librte_kni.so.2 + librte_kvargs.so.1 + librte_lpm.so.2 + librte_mbuf.so.2 + librte_mempool.so.1 + librte_meter.so.1 + librte_pipeline.so.2 + librte_pmd_bond.so.1 + librte_pmd_ring.so.2 + librte_port.so.2 + librte_power.so.1 + librte_reorder.so.1 + librte_ring.so.1 + librte_sched.so.1 + librte_table.so.2 + librte_timer.so.1 + librte_vhost.so.2 diff --git a/doc/guides/rel_notes/release_2_3.rst b/doc/guides/rel_notes/release_2_3.rst deleted file mode 100644 index 7945694..000 --- a/doc/guides/rel_notes/release_2_3.rst +++ /dev/null @@ -1,138 +0,0 @@ -DPDK Release 2.3 - - - -**Read this first** - -The text below explains how to update the release notes. - -Use proper spelling, capitalization and punctuation in all sections. - -Variable and config names should be quoted as fixed width text: ``LIKE_THIS``. - -Build the docs and view the output file to ensure the changes are correct:: -
[dpdk-dev] [PATCH v3 1/2] version: switch to year/month version numbers
> -Original Message- > From: Mcnamara, John > Sent: Wednesday, February 10, 2016 5:02 PM > To: dev at dpdk.org > Cc: thomas.monjalon at 6wind.com; Richardson, Bruce > ; Mcnamara, John > Subject: [PATCH v3 1/2] version: switch to year/month version numbers > diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c index > ffa5730..fd4dd39 100644 > --- a/drivers/net/nfp/nfp_net.c > +++ b/drivers/net/nfp/nfp_net.c > @@ -1070,9 +1070,7 @@ nfp_net_infos_get(struct rte_eth_dev *dev, struct > rte_eth_dev_info *dev_info) > }; > > dev_info->reta_size = NFP_NET_CFG_RSS_ITBL_SZ; >-#if RTE_VER_MAJOR == 2 && RTE_VER_MINOR >= 1 > dev_info->hash_key_size = NFP_NET_CFG_RSS_KEY_SZ; >-#endif } > Hi Alejandro, Please note the above change to the nfp_net.c code due to the name change in the version macro names. I was originally going to change it to this: #if RTE_VER_YEAR == 2 && RTE_VER_MONTH >= 1 dev_info->hash_key_size = NFP_NET_CFG_RSS_KEY_SZ; #endif Or better: #if RTE_VERSION >= 0x0201 dev_info->hash_key_size = NFP_NET_CFG_RSS_KEY_SZ; #endif However, since the code was added in DPDK 2.2 the version check doesn't seem necessary, so I removed it. If you need it back in for some reason let me know. John
[dpdk-dev] [PATCH] ixgbe: fix whitespace
On Wed, Jan 13, 2016 at 05:23:22AM +, Zhang, Helin wrote: > > > -Original Message- > From: Stephen Hemminger [mailto:stephen at networkplumber.org] > Sent: Wednesday, January 13, 2016 12:54 PM > To: Zhang, Helin ; Ananyev, Konstantin > > Cc: dev at dpdk.org; Stephen Hemminger > Subject: [PATCH] ixgbe: fix whitespace > > Normally, I ignore random minor whitespace issues, but this driver seems to > have some authors that don't understand the conventions of putting whitespace > after keywords. > > Signed-off-by: Stephen Hemminger > Acked-by: Helin Zhang > Pushed to dpdk-next-net/rel_16_04, including an additional 11 other whitespace fixes called out by checkpatch. /Bruce
[dpdk-dev] [PATCH v3 1/2] version: switch to year/month version numbers
2016-02-10 17:11, Mcnamara, John: > Hi Alejandro, > > Please note the above change to the nfp_net.c code due to the name change > in the version macro names. > > > I was originally going to change it to this: > > #if RTE_VER_YEAR == 2 && RTE_VER_MONTH >= 1 > dev_info->hash_key_size = NFP_NET_CFG_RSS_KEY_SZ; > #endif > > Or better: > > #if RTE_VERSION >= 0x0201 > dev_info->hash_key_size = NFP_NET_CFG_RSS_KEY_SZ; > #endif Or better: #if RTE_VERSION >= RTE_VERSION_NUM(2,1,0,0) > However, since the code was added in DPDK 2.2 the version check doesn't seem > necessary, so I removed it. If you need it back in for some reason let me > know. A version check is needed for external applications or external drivers. It should not exist in the git tree.
[dpdk-dev] [PATCH v2] mempool: reduce rte_mempool structure size
>Hi Keith, > >Thank you for adding the RTE_NEXT_ABI. I think this is the way >described in the process. Your changes will be available in next >version (16.4) for people compiling with RTE_NEXT_ABI=y, and in >16.7 without option (I'm just surprised that RTE_NEXT_ABI=y in >default configs...). > >I think a deprecation notice should also be added in this commit >in doc/guides/rel_notes/deprecation.rst. Will add the text. > >Please also find comments below. > >On 02/09/2016 06:30 PM, Keith Wiles wrote: > >> diff --git a/config/defconfig_x86_64-native-linuxapp-gcc >> b/config/defconfig_x86_64-native-linuxapp-gcc >> index 60baf5b..02e9ace 100644 >> --- a/config/defconfig_x86_64-native-linuxapp-gcc >> +++ b/config/defconfig_x86_64-native-linuxapp-gcc >> @@ -40,3 +40,8 @@ CONFIG_RTE_ARCH_64=y >> >> CONFIG_RTE_TOOLCHAIN="gcc" >> CONFIG_RTE_TOOLCHAIN_GCC=y >> +CONFIG_RTE_BUILD_SHARED_LIB=y >> +CONFIG_RTE_NEXT_ABI=n >> +CONFIG_RTE_EAL_IGB_UIO=n >> +CONFIG_RTE_LIBRTE_KNI=n >> +CONFIG_RTE_KNI_KMOD=n Hmm, not sure where this came from, but will remove it. > >I think this should not be part of the patch. > >> @@ -672,6 +704,24 @@ rte_mempool_count(const struct rte_mempool *mp) >> static unsigned >> rte_mempool_dump_cache(FILE *f, const struct rte_mempool *mp) >> { >> +#ifdef RTE_NEXT_ABI >> +unsigned lcore_id; >> +unsigned count = 0; >> +unsigned cache_count; >> + >> +fprintf(f, " cache infos:\n"); >> +fprintf(f, "cache_size=%"PRIu32"\n", mp->cache_size); >> +if (mp->cache_size == 0) >> +return count; >> + >> +for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) { >> +cache_count = mp->local_cache[lcore_id].len; >> +fprintf(f, "cache_count[%u]=%u\n", lcore_id, cache_count); >> +count += cache_count; >> +} >> +fprintf(f, "total_cache_count=%u\n", count); >> +return count; >> +#else >> #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0 >> unsigned lcore_id; >> unsigned count = 0; > >I think in this case we could avoid to duplicate the code without >beeing unclear by using the proper #ifdefs: I was struggling with how it should be done. I like to see clear ifdefs and be able to see the complete code for a given case. In these cases I wanted to make it simple to remove the code quickly by just deleting lines instead of editing lines. I will follow your suggestion. > >#if RTE_MEMPOOL_CACHE_MAX_SIZE > 0 || defined(RTE_NEXT_ABI) > /* common code */ >#ifdef RTE_NEXT_ABI > if (mp->cache_size == 0) > return count; >#endif > /* common code */ >#else >... >#endif > > >> @@ -755,6 +806,26 @@ mempool_audit_cookies(const struct rte_mempool *mp) >> #define mempool_audit_cookies(mp) do {} while(0) >> #endif >> >> +#ifdef RTE_NEXT_ABI >> +/* check cookies before and after objects */ >> +static void >> +mempool_audit_cache(const struct rte_mempool *mp) >> +{ >> +/* check cache size consistency */ >> +unsigned lcore_id; >> + >> +if (mp->cache_size == 0) >> +return; >> + >> +for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) { >> +if (mp->local_cache[lcore_id].len > mp->cache_flushthresh) { >> +RTE_LOG(CRIT, MEMPOOL, "badness on cache[%u]\n", >> +lcore_id); >> +rte_panic("MEMPOOL: invalid cache len\n"); >> +} >> +} >> +} >> +#else > >same here > >> diff --git a/lib/librte_mempool/rte_mempool.h >> b/lib/librte_mempool/rte_mempool.h >> index 6e2390a..fc9b595 100644 >> --- a/lib/librte_mempool/rte_mempool.h >> +++ b/lib/librte_mempool/rte_mempool.h >> @@ -95,6 +95,19 @@ struct rte_mempool_debug_stats { >> } __rte_cache_aligned; >> #endif >> >> +#ifdef RTE_NEXT_ABI >> +/** >> + * A structure that stores a per-core object cache. >> + */ >> +struct rte_mempool_cache { >> +unsigned len; /**< Cache len */ >> +/* >> + * Cache is allocated to this size to allow it to overflow in certain >> + * cases to avoid needless emptying of cache. >> + */ >> +void *objs[RTE_MEMPOOL_CACHE_MAX_SIZE * 3]; /**< Cache objects */ >> +} __rte_cache_aligned; >> +#else > >same here > > > >> @@ -755,19 +793,25 @@ static inline void __attribute__((always_inline)) >> __mempool_put_bulk(struct rte_mempool *mp, void * const *obj_table, >> unsigned n, int is_mp) >> { >> +#ifndef RTE_NEXT_ABI/* Note: ifndef */ >> #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0 >> +#endif /* RTE_NEXT_ABI */ >> struct rte_mempool_cache *cache; >> uint32_t index; >> void **cache_objs; >> unsigned lcore_id = rte_lcore_id(); >> uint32_t cache_size = mp->cache_size; >> uint32_t flushthresh = mp->cache_flushthresh; >> +#ifndef RTE_NEXT_ABI/* Note: ifndef */ >> #endif /* RTE_MEMPOOL_CACHE_MAX_SIZE > 0 */ >> +#endif /* RTE_NEXT_ABI */ > >this looks strange... I think it does not work properly. >Why not >#if RTE_MEMPOOL_CACHE_MAX_SIZE > 0 || define
[dpdk-dev] Regarding mbuf allocation/free in secondary process
Sara: The use of polling is to provide high throughput. At high bandwidths, interrupt processing is a great contributor to latency and can greatly decrease available bandwidth. Polling eliminates the overhead of interrupts but it will consume an entire CPU. That is why the DPDK library makes it easy to put each thread on a separate core. The downside is that for low bandwidth applications, you end up using a lot of CPU resources spinning in loops waiting for input. Using the techniques Bruce mentioned will allow you to code a low-bandwidth, low CPU utilization app, but it will not be able to handle high bandwidth. You will have to choose one or the other. Unfortunately, computer hardware has a limited number of interrupts that it can handle per second, and those limits make it impractical to handle bandwidths much greater than 1 Gb/s (i.e 10, 40, or 100 Gb/s) using interrupts to handle I/O. Lawrence This one time (02/10/2016 05:01 AM), at band camp, Bruce Richardson wrote: > On Tue, Feb 09, 2016 at 11:43:19PM -0800, Saravana Kumar wrote: >> Hi DPDK community, >> >> >> >> I'd like to have DPDK NIC IO operations in (primary) process and >> execution logic in (secondary) processes. >> Primary process pushes NIC Rx mbufs to Secondary process through S/W ring >> >> Seconary process allocates mbuf for Tx path and pushes down to Primary >> process for NIC Tx >> >> >> I have few doubts here: >> >> 1. If Secondary process dies because of SIGKILL then how can the mbufs >> allocated in Secondary process can be freed. >> If it is normal signals like SIGINT/SIGTERM then we can be catch >> those and free in those respective signal handlers > If a process terminates abnormally then the buffers being used by that process > may well be leaked. The solution you propose of catching signals will > certainly > help as you want to try and ensure that a process always frees all its buffers > properly on termination. > >> 2. Secondary process needs to poll on the S/W ring. This can consume 100% >> cpu. >> Is there a way to avoid polling in secondary process for Rx path > Not using DPDK software rings, no. You'd have to use some kernel constructs > such as > fifo's/named pipes to do blocking reads on those. However, the overhead of > using > such structures can be severe making them unusable for many packet processing > applications. An alternative might be to use some small sleep calls i.e. > nanosleep > between polls of the SW ring in cases where traffic rates are low. That will > reduce your cpu usage. > > /Bruce > > -- Lawrence MacIntyre macintyrelp at ornl.gov Oak Ridge National Laboratory 865.574.7401 Cyber Space and Information Intelligence Research Group
[dpdk-dev] [PATCH v2] mempool: reduce rte_mempool structure size
>>Hi Keith, >> >>Thank you for adding the RTE_NEXT_ABI. I think this is the way >>described in the process. Your changes will be available in next >>version (16.4) for people compiling with RTE_NEXT_ABI=y, and in >>16.7 without option (I'm just surprised that RTE_NEXT_ABI=y in >>default configs...). >> >>I think a deprecation notice should also be added in this commit >>in doc/guides/rel_notes/deprecation.rst. > >Will add the text. >> >>Please also find comments below. >> >>On 02/09/2016 06:30 PM, Keith Wiles wrote: >> >>> diff --git a/config/defconfig_x86_64-native-linuxapp-gcc >>> b/config/defconfig_x86_64-native-linuxapp-gcc >>> index 60baf5b..02e9ace 100644 >>> --- a/config/defconfig_x86_64-native-linuxapp-gcc >>> +++ b/config/defconfig_x86_64-native-linuxapp-gcc >>> @@ -40,3 +40,8 @@ CONFIG_RTE_ARCH_64=y >>> >>> CONFIG_RTE_TOOLCHAIN="gcc" >>> CONFIG_RTE_TOOLCHAIN_GCC=y >>> +CONFIG_RTE_BUILD_SHARED_LIB=y >>> +CONFIG_RTE_NEXT_ABI=n >>> +CONFIG_RTE_EAL_IGB_UIO=n >>> +CONFIG_RTE_LIBRTE_KNI=n >>> +CONFIG_RTE_KNI_KMOD=n > >Hmm, not sure where this came from, but will remove it. I think this was from the ABI-Checker I ran and the tool should leave the repo in its original state. >> >>I think this should not be part of the patch. >> >>> @@ -672,6 +704,24 @@ rte_mempool_count(const struct rte_mempool *mp) >>> static unsigned >>> rte_mempool_dump_cache(FILE *f, const struct rte_mempool *mp) >>> { >>> +#ifdef RTE_NEXT_ABI >>> + unsigned lcore_id; >>> + unsigned count = 0; >>> + unsigned cache_count; >>> + >>> + fprintf(f, " cache infos:\n"); >>> + fprintf(f, "cache_size=%"PRIu32"\n", mp->cache_size); >>> + if (mp->cache_size == 0) >>> + return count; >>> + >>> + for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) { >>> + cache_count = mp->local_cache[lcore_id].len; >>> + fprintf(f, "cache_count[%u]=%u\n", lcore_id, cache_count); >>> + count += cache_count; >>> + } >>> + fprintf(f, "total_cache_count=%u\n", count); >>> + return count; >>> +#else >>> #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0 >>> unsigned lcore_id; >>> unsigned count = 0; >> >>I think in this case we could avoid to duplicate the code without >>beeing unclear by using the proper #ifdefs: > >I was struggling with how it should be done. I like to see clear ifdefs and be >able to see the complete code for a given case. In these cases I wanted to >make it simple to remove the code quickly by just deleting lines instead of >editing lines. I will follow your suggestion. >> >>#if RTE_MEMPOOL_CACHE_MAX_SIZE > 0 || defined(RTE_NEXT_ABI) >> /* common code */ >>#ifdef RTE_NEXT_ABI >> if (mp->cache_size == 0) >> return count; >>#endif >> /* common code */ >>#else >>... >>#endif >> >> >>> @@ -755,6 +806,26 @@ mempool_audit_cookies(const struct rte_mempool *mp) >>> #define mempool_audit_cookies(mp) do {} while(0) >>> #endif >>> >>> +#ifdef RTE_NEXT_ABI >>> +/* check cookies before and after objects */ >>> +static void >>> +mempool_audit_cache(const struct rte_mempool *mp) >>> +{ >>> + /* check cache size consistency */ >>> + unsigned lcore_id; >>> + >>> + if (mp->cache_size == 0) >>> + return; >>> + >>> + for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) { >>> + if (mp->local_cache[lcore_id].len > mp->cache_flushthresh) { >>> + RTE_LOG(CRIT, MEMPOOL, "badness on cache[%u]\n", >>> + lcore_id); >>> + rte_panic("MEMPOOL: invalid cache len\n"); >>> + } >>> + } >>> +} >>> +#else >> >>same here >> >>> diff --git a/lib/librte_mempool/rte_mempool.h >>> b/lib/librte_mempool/rte_mempool.h >>> index 6e2390a..fc9b595 100644 >>> --- a/lib/librte_mempool/rte_mempool.h >>> +++ b/lib/librte_mempool/rte_mempool.h >>> @@ -95,6 +95,19 @@ struct rte_mempool_debug_stats { >>> } __rte_cache_aligned; >>> #endif >>> >>> +#ifdef RTE_NEXT_ABI >>> +/** >>> + * A structure that stores a per-core object cache. >>> + */ >>> +struct rte_mempool_cache { >>> + unsigned len; /**< Cache len */ >>> + /* >>> +* Cache is allocated to this size to allow it to overflow in certain >>> +* cases to avoid needless emptying of cache. >>> +*/ >>> + void *objs[RTE_MEMPOOL_CACHE_MAX_SIZE * 3]; /**< Cache objects */ >>> +} __rte_cache_aligned; >>> +#else >> >>same here >> >> >> >>> @@ -755,19 +793,25 @@ static inline void __attribute__((always_inline)) >>> __mempool_put_bulk(struct rte_mempool *mp, void * const *obj_table, >>> unsigned n, int is_mp) >>> { >>> +#ifndef RTE_NEXT_ABI /* Note: ifndef */ >>> #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0 >>> +#endif /* RTE_NEXT_ABI */ >>> struct rte_mempool_cache *cache; >>> uint32_t index; >>> void **cache_objs; >>> unsigned lcore_id = rte_lcore_id(); >>> uint32_t cache_size = mp->cache_size; >>> uint32_t flushthresh = mp->cache_flushthresh; >>> +#ifndef RTE_NEXT_ABI /*
[dpdk-dev] [PATCH v2] mempool: reduce rte_mempool structure size
2016-02-10 18:01, Wiles, Keith: > >>> --- a/config/defconfig_x86_64-native-linuxapp-gcc > >>> +++ b/config/defconfig_x86_64-native-linuxapp-gcc > >>> @@ -40,3 +40,8 @@ CONFIG_RTE_ARCH_64=y > >>> > >>> CONFIG_RTE_TOOLCHAIN="gcc" > >>> CONFIG_RTE_TOOLCHAIN_GCC=y > >>> +CONFIG_RTE_BUILD_SHARED_LIB=y > >>> +CONFIG_RTE_NEXT_ABI=n > >>> +CONFIG_RTE_EAL_IGB_UIO=n > >>> +CONFIG_RTE_LIBRTE_KNI=n > >>> +CONFIG_RTE_KNI_KMOD=n > > > >Hmm, not sure where this came from, but will remove it. > > I think this was from the ABI-Checker I ran and the tool should leave the > repo in its original state. Yes you're right. The ABI checker modify the defconfig instead of modifying the generated .config file. Anyone for a patch?
[dpdk-dev] [PATCH v2] af_packet: make the device detachable
On Wed, Feb 10, 2016 at 04:42:53PM +0100, Wojciech ?muda wrote: > Bernard, Bruce, I have a question, if I may. Do you know what is the > reason that rte_pmd_af_packet_devinit() is the only non-static device > initialization function among all the dpdk drivers? There's even a > comment in the rte_eth_af_packet.h: > > /** > * For use by the EAL only. Called as part of EAL init to set up any dummy > NICs > * configured on command line. > */ > int rte_pmd_af_packet_devinit(const char *name, const char *params); > > Despite the comment above, I cannot see this function being called > directly anywhere. Is there any reason it is implemented this way? Or > should I change the definition to static, as it should be called via > proper API functions? The af_packet driver structure was essentially copied from the pcap driver. Way in the past there was the EAL initialization of the "virtual" drivers, and the original af_packet code hooked into that. Somewhere along the way the driver initialization code changed and I guess the EAL initialization bits disappeared. The comment above rte_pmd_af_packet_devinit was overlooked. Long story, short -- I think you can remove the comment and add the static modifier and everything will be fine. John -- John W. LinvilleSomeday the world will need a hero, and you linville at tuxdriver.com might be all we have. Be ready.
[dpdk-dev] [PATCH v2] mempool: reduce rte_mempool structure size
>Hi Keith, > >Thank you for adding the RTE_NEXT_ABI. I think this is the way >described in the process. Your changes will be available in next >version (16.4) for people compiling with RTE_NEXT_ABI=y, and in >16.7 without option (I'm just surprised that RTE_NEXT_ABI=y in >default configs...). > >I think a deprecation notice should also be added in this commit >in doc/guides/rel_notes/deprecation.rst. > >Please also find comments below. > >On 02/09/2016 06:30 PM, Keith Wiles wrote: > >> diff --git a/config/defconfig_x86_64-native-linuxapp-gcc >> b/config/defconfig_x86_64-native-linuxapp-gcc >> index 60baf5b..02e9ace 100644 >> --- a/config/defconfig_x86_64-native-linuxapp-gcc >> +++ b/config/defconfig_x86_64-native-linuxapp-gcc >> @@ -40,3 +40,8 @@ CONFIG_RTE_ARCH_64=y >> >> CONFIG_RTE_TOOLCHAIN="gcc" >> CONFIG_RTE_TOOLCHAIN_GCC=y >> +CONFIG_RTE_BUILD_SHARED_LIB=y >> +CONFIG_RTE_NEXT_ABI=n >> +CONFIG_RTE_EAL_IGB_UIO=n >> +CONFIG_RTE_LIBRTE_KNI=n >> +CONFIG_RTE_KNI_KMOD=n > >I think this should not be part of the patch. > >> @@ -672,6 +704,24 @@ rte_mempool_count(const struct rte_mempool *mp) >> static unsigned >> rte_mempool_dump_cache(FILE *f, const struct rte_mempool *mp) >> { >> +#ifdef RTE_NEXT_ABI >> +unsigned lcore_id; >> +unsigned count = 0; >> +unsigned cache_count; >> + >> +fprintf(f, " cache infos:\n"); >> +fprintf(f, "cache_size=%"PRIu32"\n", mp->cache_size); >> +if (mp->cache_size == 0) >> +return count; >> + >> +for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) { >> +cache_count = mp->local_cache[lcore_id].len; >> +fprintf(f, "cache_count[%u]=%u\n", lcore_id, cache_count); >> +count += cache_count; >> +} >> +fprintf(f, "total_cache_count=%u\n", count); >> +return count; >> +#else >> #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0 >> unsigned lcore_id; >> unsigned count = 0; > >I think in this case we could avoid to duplicate the code without >beeing unclear by using the proper #ifdefs: > >#if RTE_MEMPOOL_CACHE_MAX_SIZE > 0 || defined(RTE_NEXT_ABI) > /* common code */ >#ifdef RTE_NEXT_ABI > if (mp->cache_size == 0) > return count; >#endif > /* common code */ >#else >... >#endif Started looking at this change and the problem is I want to remove the ifdef RTE_MEMPOOL.. As well as the #else/#endif code. If I do as you suggest it does not appear to be clearer when someone goes back to remove the code, they may think the #ifdef RTE_MEMPOOL/#else/#endif are still required. > > >> @@ -755,6 +806,26 @@ mempool_audit_cookies(const struct rte_mempool *mp) >> #define mempool_audit_cookies(mp) do {} while(0) >> #endif >> >> +#ifdef RTE_NEXT_ABI >> +/* check cookies before and after objects */ >> +static void >> +mempool_audit_cache(const struct rte_mempool *mp) >> +{ >> +/* check cache size consistency */ >> +unsigned lcore_id; >> + >> +if (mp->cache_size == 0) >> +return; >> + >> +for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) { >> +if (mp->local_cache[lcore_id].len > mp->cache_flushthresh) { >> +RTE_LOG(CRIT, MEMPOOL, "badness on cache[%u]\n", >> +lcore_id); >> +rte_panic("MEMPOOL: invalid cache len\n"); >> +} >> +} >> +} >> +#else > >same here The same comment here. > >> diff --git a/lib/librte_mempool/rte_mempool.h >> b/lib/librte_mempool/rte_mempool.h >> index 6e2390a..fc9b595 100644 >> --- a/lib/librte_mempool/rte_mempool.h >> +++ b/lib/librte_mempool/rte_mempool.h >> @@ -95,6 +95,19 @@ struct rte_mempool_debug_stats { >> } __rte_cache_aligned; >> #endif >> >> +#ifdef RTE_NEXT_ABI >> +/** >> + * A structure that stores a per-core object cache. >> + */ >> +struct rte_mempool_cache { >> +unsigned len; /**< Cache len */ >> +/* >> + * Cache is allocated to this size to allow it to overflow in certain >> + * cases to avoid needless emptying of cache. >> + */ >> +void *objs[RTE_MEMPOOL_CACHE_MAX_SIZE * 3]; /**< Cache objects */ >> +} __rte_cache_aligned; >> +#else > >same here Same for this one. > > > >> @@ -755,19 +793,25 @@ static inline void __attribute__((always_inline)) >> __mempool_put_bulk(struct rte_mempool *mp, void * const *obj_table, >> unsigned n, int is_mp) >> { >> +#ifndef RTE_NEXT_ABI/* Note: ifndef */ >> #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0 >> +#endif /* RTE_NEXT_ABI */ >> struct rte_mempool_cache *cache; >> uint32_t index; >> void **cache_objs; >> unsigned lcore_id = rte_lcore_id(); >> uint32_t cache_size = mp->cache_size; >> uint32_t flushthresh = mp->cache_flushthresh; >> +#ifndef RTE_NEXT_ABI/* Note: ifndef */ >> #endif /* RTE_MEMPOOL_CACHE_MAX_SIZE > 0 */ >> +#endif /* RTE_NEXT_ABI */ > >this looks strange... I think it does not work properly. >Why not >#if RTE_MEMPOOL_CACHE_MAX_SIZE > 0 || defined(RTE_NEXT_AB
[dpdk-dev] [PATCH v3] vhost: fix leak of fds and mmaps
The common vhost code only supported a single mmap per device. vhost-user worked around this by saving the address/length/fd of each mmap after the end of the rte_virtio_memory struct. This only works if the vhost-user code frees dev->mem, since the common code is unaware of the extra info. The VHOST_USER_RESET_OWNER message is one situation where the common code frees dev->mem and leaks the fds and mappings. This happens every time I shut down a VM. The new code calls back into the implementation (vhost-user or vhost-cuse) to clean up these resources. The vhost-cuse changes are only compile tested. Signed-off-by: Rich Lane Acked-by: Yuanhan Liu --- v2->v3: - Rename "impl" to "backend". v1->v2: - Call into vhost-user/vhost-cuse to free mmaps. lib/librte_vhost/vhost-net.h | 6 ++ lib/librte_vhost/vhost_cuse/virtio-net-cdev.c | 12 lib/librte_vhost/vhost_user/vhost-net-user.c | 1 - lib/librte_vhost/vhost_user/virtio-net-user.c | 25 ++--- lib/librte_vhost/vhost_user/virtio-net-user.h | 1 - lib/librte_vhost/virtio-net.c | 8 +--- 6 files changed, 29 insertions(+), 24 deletions(-) diff --git a/lib/librte_vhost/vhost-net.h b/lib/librte_vhost/vhost-net.h index c69b60b..affbd1a 100644 --- a/lib/librte_vhost/vhost-net.h +++ b/lib/librte_vhost/vhost-net.h @@ -115,4 +115,10 @@ struct vhost_net_device_ops { struct vhost_net_device_ops const *get_virtio_net_callbacks(void); + +/* + * Backend-specific cleanup. Defined by vhost-cuse and vhost-user. + */ +void vhost_backend_cleanup(struct virtio_net *dev); + #endif /* _VHOST_NET_CDEV_H_ */ diff --git a/lib/librte_vhost/vhost_cuse/virtio-net-cdev.c b/lib/librte_vhost/vhost_cuse/virtio-net-cdev.c index ae2c3fa..374c884 100644 --- a/lib/librte_vhost/vhost_cuse/virtio-net-cdev.c +++ b/lib/librte_vhost/vhost_cuse/virtio-net-cdev.c @@ -421,3 +421,15 @@ int cuse_set_backend(struct vhost_device_ctx ctx, struct vhost_vring_file *file) return ops->set_backend(ctx, file); } + +void +vhost_backend_cleanup(struct virtio_net *dev) +{ + /* Unmap QEMU memory file if mapped. */ + if (dev->mem) { + munmap((void *)(uintptr_t)dev->mem->mapped_address, + (size_t)dev->mem->mapped_size); + free(dev->mem); + dev->mem = NULL; + } +} diff --git a/lib/librte_vhost/vhost_user/vhost-net-user.c b/lib/librte_vhost/vhost_user/vhost-net-user.c index 8b7a448..336efba 100644 --- a/lib/librte_vhost/vhost_user/vhost-net-user.c +++ b/lib/librte_vhost/vhost_user/vhost-net-user.c @@ -347,7 +347,6 @@ vserver_message_handler(int connfd, void *dat, int *remove) close(connfd); *remove = 1; free(cfd_ctx); - user_destroy_device(ctx); ops->destroy_device(ctx); return; diff --git a/lib/librte_vhost/vhost_user/virtio-net-user.c b/lib/librte_vhost/vhost_user/virtio-net-user.c index 2934d1c..993ed71 100644 --- a/lib/librte_vhost/vhost_user/virtio-net-user.c +++ b/lib/librte_vhost/vhost_user/virtio-net-user.c @@ -339,21 +339,6 @@ user_set_vring_enable(struct vhost_device_ctx ctx, } void -user_destroy_device(struct vhost_device_ctx ctx) -{ - struct virtio_net *dev = get_device(ctx); - - if (dev && (dev->flags & VIRTIO_DEV_RUNNING)) - notify_ops->destroy_device(dev); - - if (dev && dev->mem) { - free_mem_region(dev); - free(dev->mem); - dev->mem = NULL; - } -} - -void user_set_protocol_features(struct vhost_device_ctx ctx, uint64_t protocol_features) { @@ -365,3 +350,13 @@ user_set_protocol_features(struct vhost_device_ctx ctx, dev->protocol_features = protocol_features; } + +void +vhost_backend_cleanup(struct virtio_net *dev) +{ + if (dev->mem) { + free_mem_region(dev); + free(dev->mem); + dev->mem = NULL; + } +} diff --git a/lib/librte_vhost/vhost_user/virtio-net-user.h b/lib/librte_vhost/vhost_user/virtio-net-user.h index b82108d..1140ee1 100644 --- a/lib/librte_vhost/vhost_user/virtio-net-user.h +++ b/lib/librte_vhost/vhost_user/virtio-net-user.h @@ -55,5 +55,4 @@ int user_get_vring_base(struct vhost_device_ctx, struct vhost_vring_state *); int user_set_vring_enable(struct vhost_device_ctx ctx, struct vhost_vring_state *state); -void user_destroy_device(struct vhost_device_ctx); #endif diff --git a/lib/librte_vhost/virtio-net.c b/lib/librte_vhost/virtio-net.c index de78a0f..cf2560e 100644 --- a/lib/librte_vhost/virtio-net.c +++ b/lib/librte_vhost/virtio-net.c @@ -199,13 +199,7 @@ cleanup_device(struct virtio_net *dev, int destroy) { uint32_t i; - /* Unmap QEMU memory file if mapped. */ - if (dev->mem) { - munmap((void *)(uintptr_t)dev->mem->mapped_address, - (size_t)dev->mem->mapp
[dpdk-dev] [PATCH v3] cfgfile: support looking up sections by index
This is useful when sections have duplicate names. Signed-off-by: Rich Lane --- v2->v3 - Added check for index < 0. v1->v2: - Added new symbol to version script. lib/librte_cfgfile/rte_cfgfile.c | 16 lib/librte_cfgfile/rte_cfgfile.h | 23 +++ lib/librte_cfgfile/rte_cfgfile_version.map | 6 ++ 3 files changed, 45 insertions(+) diff --git a/lib/librte_cfgfile/rte_cfgfile.c b/lib/librte_cfgfile/rte_cfgfile.c index a677dad..eba0713 100644 --- a/lib/librte_cfgfile/rte_cfgfile.c +++ b/lib/librte_cfgfile/rte_cfgfile.c @@ -333,6 +333,22 @@ rte_cfgfile_section_entries(struct rte_cfgfile *cfg, const char *sectionname, return i; } +int +rte_cfgfile_section_entries_by_index(struct rte_cfgfile *cfg, int index, + struct rte_cfgfile_entry *entries, int max_entries) +{ + int i; + const struct rte_cfgfile_section *sect; + + if (index < 0 || index >= cfg->num_sections) + return -1; + + sect = cfg->sections[index]; + for (i = 0; i < max_entries && i < sect->num_entries; i++) + entries[i] = *sect->entries[i]; + return i; +} + const char * rte_cfgfile_get_entry(struct rte_cfgfile *cfg, const char *sectionname, const char *entryname) diff --git a/lib/librte_cfgfile/rte_cfgfile.h b/lib/librte_cfgfile/rte_cfgfile.h index d443782..8e69971 100644 --- a/lib/librte_cfgfile/rte_cfgfile.h +++ b/lib/librte_cfgfile/rte_cfgfile.h @@ -155,6 +155,29 @@ int rte_cfgfile_section_entries(struct rte_cfgfile *cfg, struct rte_cfgfile_entry *entries, int max_entries); +/** Get section entries as key-value pairs +* +* The index of a section is the same as the index of its name in the +* result of rte_cfgfile_sections. This API can be used when there are +* multiple sections with the same name. +* +* @param cfg +* Config file +* @param index +* Section index +* @param entries +* Pre-allocated array of at least max_entries entries where the section +* entries are stored as key-value pair after successful invocation +* @param max_entries +* Maximum number of section entries to be stored in entries array +* @return +* Number of entries populated on success, negative error code otherwise +*/ +int rte_cfgfile_section_entries_by_index(struct rte_cfgfile *cfg, + int index, + struct rte_cfgfile_entry *entries, + int max_entries); + /** Get value of the named entry in named config file section * * @param cfg diff --git a/lib/librte_cfgfile/rte_cfgfile_version.map b/lib/librte_cfgfile/rte_cfgfile_version.map index bf6c6fd..f6a27a9 100644 --- a/lib/librte_cfgfile/rte_cfgfile_version.map +++ b/lib/librte_cfgfile/rte_cfgfile_version.map @@ -13,3 +13,9 @@ DPDK_2.0 { local: *; }; + +DPDK_2.3 { + global: + + rte_cfgfile_section_entries_by_index; +} DPDK_2.0; -- 1.9.1
[dpdk-dev] [PATCH v2] cfgfile: support looking up sections by index
On Tue, Feb 2, 2016 at 7:10 AM, Dumitrescu, Cristian < cristian.dumitrescu at intel.com> wrote: > > > > -Original Message- > > From: Rich Lane [mailto:rich.lane at bigswitch.com] > > Sent: Tuesday, January 19, 2016 4:42 AM > > To: dev at dpdk.org > > Cc: Dumitrescu, Cristian ; Panu > Matilainen > > > > Subject: [PATCH v2] cfgfile: support looking up sections by index > > > > This is useful when sections have duplicate names. > > Hi Rich, > > You are right, I can see this is needed to allow multiple sections with > identical name in the same configuration file. When sections with the same > name are not allowed, then this is not needed, as the current API is > sufficient. > > To me, having several sections with the same name does not look like a > good idea, in fact it might look like an application design flaw, as > differentiating between sections with the same name can only done based on > the position of the section in the configuration file, which is an error > prone option. Some examples: > 1. While maintaining a large config file, keeping a specific section at a > fixed position quickly becomes a pain, and shifting the position of the > section up or down can completely change the application behavior; > 2. Using basic pre-processors (like CPP or M4) or scripts, the master > configuration file can include other configuration files, with the > inclusion of each decided at run-time based on application command line > parameters, so the position of certain sections is actually not known until > run-time. > > Can you provide some examples when having multiple sections with the same > name is a key requirement? > My application uses a config file to assign RX/TX queues to cores. Ports and cores have names (e.g. "core.fwd0"), but there's no reason to name queues. The order of the queue sections is unimportant. > A straight forward workaround to having multiple sections with the same > name is to add a number to the name of each such section. Using the current > API, all the sections with the same prefix name can be read gracefully. As > an example, the ip_pipeline application allows multiple sections with the > same name prefix but a different number prefix: > PIPELINE0, PIPELINE1, ... > LINK0, LINK1, ... > MEMPOOL0, MEMPOOL1, ... > RXQ0.0, RXQ0.1, RXQ1.0, ... > TXQ0.0, TXQ0.1, TXQ1.0, ... > > Is there a reason why this approach is not acceptable for your application? > It is less usable. The person writing the config file must generate those suffixes which are irrelevant to what they're trying to express. The sole effect is to add another source of errors. Additionally, this API allows reading a config file in linear instead of quadratic time (wrt number of sections). While my config files aren't long enough to make this a requirement it certainly seems desirable.
[dpdk-dev] [PATCH] vhost: remove vhost_net_device_ops
The indirection is unnecessary because there is only one implementation of the vhost common code. Removing it makes the code more readable. Signed-off-by: Rich Lane --- examples/vhost_xen/virtio-net.h | 2 - lib/librte_vhost/vhost-net.h | 40 +--- lib/librte_vhost/vhost_cuse/vhost-net-cdev.c | 27 lib/librte_vhost/vhost_cuse/virtio-net-cdev.c | 4 +- lib/librte_vhost/vhost_user/vhost-net-user.c | 23 +++ lib/librte_vhost/vhost_user/virtio-net-user.c | 6 +- lib/librte_vhost/virtio-net.c | 92 --- 7 files changed, 70 insertions(+), 124 deletions(-) diff --git a/examples/vhost_xen/virtio-net.h b/examples/vhost_xen/virtio-net.h index c8c5a7a..ab69726 100644 --- a/examples/vhost_xen/virtio-net.h +++ b/examples/vhost_xen/virtio-net.h @@ -110,6 +110,4 @@ struct virtio_net_device_ops { void (* destroy_device) (volatile struct virtio_net *); /* Remove device. */ }; -struct vhost_net_device_ops const * get_virtio_net_callbacks(void); - #endif diff --git a/lib/librte_vhost/vhost-net.h b/lib/librte_vhost/vhost-net.h index c69b60b..81274df 100644 --- a/lib/librte_vhost/vhost-net.h +++ b/lib/librte_vhost/vhost-net.h @@ -43,8 +43,6 @@ #include "rte_virtio_net.h" -extern struct vhost_net_device_ops const *ops; - /* Macros for printing using RTE_LOG */ #define RTE_LOGTYPE_VHOST_CONFIG RTE_LOGTYPE_USER1 #define RTE_LOGTYPE_VHOST_DATA RTE_LOGTYPE_USER1 @@ -85,34 +83,26 @@ struct vhost_device_ctx { uint64_tfh; /* Populated with fi->fh to track the device index. */ }; -/* - * Structure contains function pointers to be defined in virtio-net.c. These - * functions are called in CUSE context and are used to configure devices. - */ -struct vhost_net_device_ops { - int (*new_device)(struct vhost_device_ctx); - void (*destroy_device)(struct vhost_device_ctx); +int vhost_new_device(struct vhost_device_ctx); +void vhost_destroy_device(struct vhost_device_ctx); - void (*set_ifname)(struct vhost_device_ctx, - const char *if_name, unsigned int if_len); +void vhost_set_ifname(struct vhost_device_ctx, + const char *if_name, unsigned int if_len); - int (*get_features)(struct vhost_device_ctx, uint64_t *); - int (*set_features)(struct vhost_device_ctx, uint64_t *); +int vhost_get_features(struct vhost_device_ctx, uint64_t *); +int vhost_set_features(struct vhost_device_ctx, uint64_t *); - int (*set_vring_num)(struct vhost_device_ctx, struct vhost_vring_state *); - int (*set_vring_addr)(struct vhost_device_ctx, struct vhost_vring_addr *); - int (*set_vring_base)(struct vhost_device_ctx, struct vhost_vring_state *); - int (*get_vring_base)(struct vhost_device_ctx, uint32_t, struct vhost_vring_state *); +int vhost_set_vring_num(struct vhost_device_ctx, struct vhost_vring_state *); +int vhost_set_vring_addr(struct vhost_device_ctx, struct vhost_vring_addr *); +int vhost_set_vring_base(struct vhost_device_ctx, struct vhost_vring_state *); +int vhost_get_vring_base(struct vhost_device_ctx, uint32_t, struct vhost_vring_state *); - int (*set_vring_kick)(struct vhost_device_ctx, struct vhost_vring_file *); - int (*set_vring_call)(struct vhost_device_ctx, struct vhost_vring_file *); +int vhost_set_vring_kick(struct vhost_device_ctx, struct vhost_vring_file *); +int vhost_set_vring_call(struct vhost_device_ctx, struct vhost_vring_file *); - int (*set_backend)(struct vhost_device_ctx, struct vhost_vring_file *); - - int (*set_owner)(struct vhost_device_ctx); - int (*reset_owner)(struct vhost_device_ctx); -}; +int vhost_set_backend(struct vhost_device_ctx, struct vhost_vring_file *); +int vhost_set_owner(struct vhost_device_ctx); +int vhost_reset_owner(struct vhost_device_ctx); -struct vhost_net_device_ops const *get_virtio_net_callbacks(void); #endif /* _VHOST_NET_CDEV_H_ */ diff --git a/lib/librte_vhost/vhost_cuse/vhost-net-cdev.c b/lib/librte_vhost/vhost_cuse/vhost-net-cdev.c index ae7ad8d..c613e68 100644 --- a/lib/librte_vhost/vhost_cuse/vhost-net-cdev.c +++ b/lib/librte_vhost/vhost_cuse/vhost-net-cdev.c @@ -58,7 +58,6 @@ static const char cuse_device_name[] = "/dev/cuse"; static const char default_cdev[] = "vhost-net"; static struct fuse_session *session; -struct vhost_net_device_ops const *ops; /* * Returns vhost_device_ctx from given fuse_req_t. The index is populated later @@ -86,7 +85,7 @@ vhost_net_open(fuse_req_t req, struct fuse_file_info *fi) struct vhost_device_ctx ctx = fuse_req_to_vhost_ctx(req, fi); int err = 0; - err = ops->new_device(ctx); + err = vhost_new_device(ctx); if (err == -1) { fuse_reply_err(req, EPERM); return; @@ -108,7 +107,7 @@ vhost_net_release(fuse_req_t req, struct fuse_file_info *fi) int err = 0; struct vhost_device_ctx ctx = fuse_req_to_vhost_ctx(req, fi); - o
[dpdk-dev] [PATCH v2] mempool: reduce rte_mempool structure size
Hi Keith, On 02/10/2016 07:35 PM, Wiles, Keith wrote: >>> @@ -672,6 +704,24 @@ rte_mempool_count(const struct rte_mempool *mp) >>> static unsigned >>> rte_mempool_dump_cache(FILE *f, const struct rte_mempool *mp) >>> { >>> +#ifdef RTE_NEXT_ABI >>> + unsigned lcore_id; >>> + unsigned count = 0; >>> + unsigned cache_count; >>> + >>> + fprintf(f, " cache infos:\n"); >>> + fprintf(f, "cache_size=%"PRIu32"\n", mp->cache_size); >>> + if (mp->cache_size == 0) >>> + return count; >>> + >>> + for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) { >>> + cache_count = mp->local_cache[lcore_id].len; >>> + fprintf(f, "cache_count[%u]=%u\n", lcore_id, cache_count); >>> + count += cache_count; >>> + } >>> + fprintf(f, "total_cache_count=%u\n", count); >>> + return count; >>> +#else >>> #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0 >>> unsigned lcore_id; >>> unsigned count = 0; >> >> I think in this case we could avoid to duplicate the code without >> beeing unclear by using the proper #ifdefs: >> >> #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0 || defined(RTE_NEXT_ABI) >> /* common code */ >> #ifdef RTE_NEXT_ABI >> if (mp->cache_size == 0) >> return count; >> #endif >> /* common code */ >> #else >> ... >> #endif > > Started looking at this change and the problem is I want to remove the ifdef > RTE_MEMPOOL.. As well as the #else/#endif code. If I do as you suggest it > does not appear to be clearer when someone goes back to remove the code, they > may think the #ifdef RTE_MEMPOOL/#else/#endif are still required. OK, makes sense. >>> @@ -755,19 +793,25 @@ static inline void __attribute__((always_inline)) >>> __mempool_put_bulk(struct rte_mempool *mp, void * const *obj_table, >>> unsigned n, int is_mp) >>> { >>> +#ifndef RTE_NEXT_ABI /* Note: ifndef */ >>> #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0 >>> +#endif /* RTE_NEXT_ABI */ >>> struct rte_mempool_cache *cache; >>> uint32_t index; >>> void **cache_objs; >>> unsigned lcore_id = rte_lcore_id(); >>> uint32_t cache_size = mp->cache_size; >>> uint32_t flushthresh = mp->cache_flushthresh; >>> +#ifndef RTE_NEXT_ABI /* Note: ifndef */ >>> #endif /* RTE_MEMPOOL_CACHE_MAX_SIZE > 0 */ >>> +#endif /* RTE_NEXT_ABI */ >> >> this looks strange... I think it does not work properly. >> Why not >> #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0 || defined(RTE_NEXT_ABI) > > Yes I agree the ifndef looks strange, but it should work as we want to remove > the #ifdef RTE_MEMPOOL/#endif lines. This was the reason for the comment that > it was a ifndef. It's not only strange, it's also probably not what you want to do: #ifndef RTE_NEXT_ABI/* Note: ifndef */ #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0 #endif /* RTE_NEXT_ABI */ ... Here, the #endif corresponds to the second #if, not the first #ifdef. Regards, Olivier
[dpdk-dev] [PATCH v3] mempool: reduce rte_mempool structure size
The rte_mempool structure is changed, which will cause an ABI change for this structure. Providing backward compat is not reasonable here as this structure is used in multiple defines/inlines. Allow mempool cache support to be dynamic depending on if the mempool being created needs cache support. Saves about 1.5M of memory used by the rte_mempool structure. Allocating small mempools which do not require cache can consume larges amounts of memory if you have a number of these mempools. Signed-off-by: Keith Wiles --- * Patch v3 fix up the ifdefs to correct some problems in removing ifdef lines. Added the ABI deprecation notice to the document file. * Patch v2 to add some comments and setup for RTE_NEXT_ABI changes. app/test/test_mempool.c | 5 +++ doc/guides/rel_notes/deprecation.rst | 7 +++ lib/librte_mempool/rte_mempool.c | 82 +--- lib/librte_mempool/rte_mempool.h | 46 4 files changed, 127 insertions(+), 13 deletions(-) diff --git a/app/test/test_mempool.c b/app/test/test_mempool.c index f0f823b..f3fba50 100644 --- a/app/test/test_mempool.c +++ b/app/test/test_mempool.c @@ -122,8 +122,13 @@ test_mempool_basic(void) return -1; printf("get private data\n"); +#ifdef RTE_NEXT_ABI + if (rte_mempool_get_priv(mp) != (char *)mp + + MEMPOOL_HEADER_SIZE(mp, mp->pg_num, mp->cache_size)) +#else if (rte_mempool_get_priv(mp) != (char*) mp + MEMPOOL_HEADER_SIZE(mp, mp->pg_num)) +#endif return -1; printf("get physical address of an object\n"); diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst index e94d4a2..1b9d25e 100644 --- a/doc/guides/rel_notes/deprecation.rst +++ b/doc/guides/rel_notes/deprecation.rst @@ -49,3 +49,10 @@ Deprecation Notices commands (such as RETA update in testpmd). This should impact CMDLINE_PARSE_RESULT_BUFSIZE, STR_TOKEN_SIZE and RDLINE_BUF_SIZE. It should be integrated in release 2.3. + +* ABI change is planned for the rte_mempool structure to allow mempool + cache support to be dynamic depending on the mempool being created + needing cache support. Saves about 1.5M of memory per rte_mempool structure + by removing the per lcore cache memory. Change will occur after DPDK 16.04 + release. + diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c index aff5f6d..5f21eaa 100644 --- a/lib/librte_mempool/rte_mempool.c +++ b/lib/librte_mempool/rte_mempool.c @@ -452,12 +452,17 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size, /* compilation-time checks */ RTE_BUILD_BUG_ON((sizeof(struct rte_mempool) & RTE_CACHE_LINE_MASK) != 0); +#ifdef RTE_NEXT_ABI + RTE_BUILD_BUG_ON((sizeof(struct rte_mempool_cache) & + RTE_CACHE_LINE_MASK) != 0); +#else #if RTE_MEMPOOL_CACHE_MAX_SIZE > 0 RTE_BUILD_BUG_ON((sizeof(struct rte_mempool_cache) & RTE_CACHE_LINE_MASK) != 0); RTE_BUILD_BUG_ON((offsetof(struct rte_mempool, local_cache) & RTE_CACHE_LINE_MASK) != 0); #endif +#endif /* RTE_NEXT_ABI */ #ifdef RTE_LIBRTE_MEMPOOL_DEBUG RTE_BUILD_BUG_ON((sizeof(struct rte_mempool_debug_stats) & RTE_CACHE_LINE_MASK) != 0); @@ -527,9 +532,8 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size, */ int head = sizeof(struct rte_mempool); int new_size = (private_data_size + head) % page_size; - if (new_size) { + if (new_size) private_data_size += page_size - new_size; - } } /* try to allocate tailq entry */ @@ -544,7 +548,12 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size, * store mempool objects. Otherwise reserve a memzone that is large * enough to hold mempool header and metadata plus mempool objects. */ +#ifdef RTE_NEXT_ABI + mempool_size = MEMPOOL_HEADER_SIZE(mp, pg_num, cache_size); + mempool_size += private_data_size; +#else mempool_size = MEMPOOL_HEADER_SIZE(mp, pg_num) + private_data_size; +#endif /* RTE_NEXT_ABI */ mempool_size = RTE_ALIGN_CEIL(mempool_size, RTE_MEMPOOL_ALIGN); if (vaddr == NULL) mempool_size += (size_t)objsz.total_size * n; @@ -598,9 +607,22 @@ rte_mempool_xmem_create(const char *name, unsigned n, unsigned elt_size, mp->cache_flushthresh = CALC_CACHE_FLUSHTHRESH(cache_size); mp->private_data_size = private_data_size; +#ifdef RTE_NEXT_ABI + /* +* local_cache pointer is set even if cache_size is zero. +* The local_cache points to just past the elt_pa[] array. +*/ + mp->local_cache = (struct rte_mempool_cache *) +
[dpdk-dev] [dpdk-users] DPDK i40evf problem in receving packet
Hi, *i40e card is :* 00:09.0 *Ether*net controller: Intel Corporation XL710/X710 Virtual Function (rev 01) 00:0a.0 *Ether*net controller: Intel Corporation XL710/X710 Virtual Function (rev 01) firmware-version: f4.40 a1.4 n04.53 e80001dc0 Is there any extra verbose/debug in DPDK which we can enable to find out why RX does not receive any packet after first one? Thanks, /Saurabh On Wed, Feb 10, 2016 at 3:15 AM, Thomas Monjalon wrote: > Formatting note: > Please avoid top posting (inline replies are preferred), > and do not include such disclaimer, not relevant on a mailing list. > Thanks > > 2016-02-10 16:00, Muhammad Zain-ul-Abideen: > > Hi Saurabh, > > Can you tell me what card you are uing > > > > > > *Regards:* > > Zain ul Abideen > > > > > > *Disclaimer**: The information contained in this e-mail and any > attachments > > is confidential; it is intended only for use of the individual or entity > > named above. If the reader of this message is not the intended recipient, > > you are notified that any dissemination, distribution or use of this > > information is strictly prohibited. If you have received this > communication > > in error, please delete it and email confirmation to the sender. Thank > You.* > > > > > > On Wed, Feb 10, 2016 at 6:30 AM, Saurabh Mishra > > > wrote: > > > > > Hi Qian -- > > > > > > Any suggestions? This is bit urgent. > > > > > > /Saurabh > > > > > > On Sat, Feb 6, 2016 at 9:22 AM, Saurabh Mishra < > saurabh.globe at gmail.com> > > > wrote: > > > > > > > Hi Qian -- > > > > > > > > > > > > Here's the data from Host: > > > > > > > > [root at oscompute3 ~]# ethtool -i p3p1 > > > > > > > > driver: i40e > > > > > > > > version: 1.0.11-k > > > > > > > > firmware-version: f4.40 a1.4 n04.53 e80001dc0 > > > > > > > > bus-info: :04:00.0 > > > > > > > > supports-statistics: yes > > > > > > > > supports-test: yes > > > > > > > > supports-eeprom-access: yes > > > > > > > > supports-register-dump: yes > > > > > > > > supports-priv-flags: no > > > > > > > > [root at oscompute3 ~]# ethtool -i p3p2 > > > > > > > > driver: i40e > > > > > > > > version: 1.0.11-k > > > > > > > > firmware-version: f4.40 a1.4 n04.53 e80001dc0 > > > > > > > > bus-info: :04:00.1 > > > > > > > > supports-statistics: yes > > > > > > > > supports-test: yes > > > > > > > > supports-eeprom-access: yes > > > > > > > > supports-register-dump: yes > > > > > > > > supports-priv-flags: no > > > > > > > > [root at oscompute3 ~]# > > > > > > > > EthApp> drvinfo > > > > > > > > Port 0 driver: rte_i40evf_pmd (ver: RTE 2.2.0) > > > > > > > > Port 1 driver: rte_i40evf_pmd (ver: RTE 2.2.0) > > > > > > > > EthApp> > > > > > > > > On Fri, Feb 5, 2016 at 4:59 PM, Xu, Qian Q > wrote: > > > > > > > >> What's your current firmware info, can u run ethtool -i > port_interface > > > to > > > >> check? > > > >> > > > >> Thanks > > > >> Qian > > > >> > > > >> -Original Message- > > > >> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Saurabh Mishra > > > >> Sent: Saturday, February 06, 2016 6:33 AM > > > >> To: dev at dpdk.org; users at dpdk.org > > > >> Subject: [dpdk-dev] DPDK i40evf problem in receving packet > > > >> > > > >> Hi, > > > >> > > > >> I'm seeing two problems: > > > >> > > > >> 1) when use our kernel '3.10.88-8.0.0.0.6', we only receive first > > > >> packet but not subsequent ones at all after that. However, when I > use > > > >> centos7.0, then l2fwd is able to receive all the packets. > > > >> > > > >> 2) I've also seen that on centos7.0, symmetric_mp itself is not > working. > > > >> dev start fails with 280 error. > > > >> > > > >> i40evf is giving us lot of headache. The i40evf kernel driver works > > > >> without any problem. Host is a centos7 KVM. I've already upgraded > > > firmware > > > >> to latest. > > > >> > > > >> [root at localhost ~]# uname -a > > > >> > > > >> Linux localhost.localdomain 3.10.0-327.4.5.el7.x86_64 #1 SMP Mon > Jan 25 > > > >> 22:07:14 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux > > > >> > > > >> [root at localhost ~]# > > > >> > > > >> [root at localhost ~]# ./symmetric_mp fakeelf --file-prefix=virtaddr > -c2 > > > >> -m2048 -n1 --base-virtaddr=0x7fa0 --proc-type=primary -- -p > 3 > > > >> --num-procs=1 --proc-id=0 > > > >> > > > >> EAL: Detected lcore 0 as core 0 on socket 0 > > > >> > > > >> EAL: Detected lcore 1 as core 0 on socket 0 > > > >> > > > >> EAL: Support maximum 128 logical core(s) by configuration. > > > >> > > > >> EAL: Detected 2 lcore(s) > > > >> > > > >> EAL: No free hugepages reported in hugepages-1048576kB > > > >> > > > >> EAL: VFIO modules not all loaded, skip VFIO support... > > > >> > > > >> EAL: Setting up physically contiguous memory... > > > >> > > > >> EAL: Ask a virtual area of 0x2a80 bytes > > > >> > > > >> EAL: Virtual area found at 0x7fa0 (size = 0x2a80) > > > >> > > > >> EAL: Ask a virtual area of 0x20 bytes > > > >> > > > >> EAL: Virtual area found at 0x7fa02a80 (size = 0x20) > > > >> > > > >>
[dpdk-dev] [PATCH v3 1/2] version: switch to year/month version numbers
2016-02-10 17:02, John McNamara: > From: "Richardson, Bruce" > > As discussed on list, switch numbering scheme to be based on year/month. > Release 2.3 then becomes 16.04. > > Ref: http://dpdk.org/ml/archives/dev/2015-December/030336.html > > Also, added zero padding to the month so that it appear as 16.04 and > not 16.4 in "make showversion" and rte_version(). > > Signed-off-by: Bruce Richardson > Signed-off-by: John McNamara Acked-by: Thomas Monjalon Series applied, thanks
[dpdk-dev] [PATCH] config: remove useless explicit includes of generated header
2016-02-08 15:38, Thomas Monjalon: > The file rte_config.h is automatically generated and included. > No need to #include it. > > The example performance-thread needs a makefile fix to avoid > overwriting the default cflags. > > Signed-off-by: Thomas Monjalon Applied
[dpdk-dev] DPDK (and rte_*alloc family) friendly Valgrind
Hello all, I created a set of patches for Valgrind that add support for the rte_*alloc family of functions. We use it for memcheck (I added support for other all the other Valgrind tools like cachegrind as well, but it's less tested), and find it extremely useful, since the vanilla version cannot intercept and report leaks cause by rte_*alloc functions from librte_malloc. While at FOSDEM last week I mentioned this to Mark Gray and Kevin Traynor after their presentations (sorry it took a while to remember to push this up :-) ), and they thought it would be useful for other DPDK-based applications developers, so I've uploaded it to Github [1]. I've also sent the patches upstream a while ago [2]. It works with both with a statically and dynamically linked DPDK library. To use it with a statically linked DPDK, pass the following parameter to Valgrind: --soname-synonyms=somalloc=NONE To use it with a dynamically linked DPDK no additional parameter is needed, but make sure that the SONAME matches either "lib*dpdk.so*" if building a single .so or "librte_malloc.so*" if building each library individually. If it doesn't match either of these regexp, then you can manually patch the file include/pub_tool_redir.h and rebuild. Please feel free to provide comments, feedback, or patches! All bug reports will be promptly redirected to /dev/null :-) -- Kind regards, Luca Boccassi Brocade Communications Systems [1] https://github.com/bluca/valgrind-dpdk [2] https://bugs.kde.org/show_bug.cgi?id=350405
[dpdk-dev] [PATCH] Correcting upstream kernel version in driver
Fixing the version of the kernel required in the QAT documentation. Signed-off-by: John Griffin --- doc/guides/cryptodevs/qat.rst | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/doc/guides/cryptodevs/qat.rst b/doc/guides/cryptodevs/qat.rst index 1901842..23402b4 100644 --- a/doc/guides/cryptodevs/qat.rst +++ b/doc/guides/cryptodevs/qat.rst @@ -33,7 +33,7 @@ Quick Assist Crypto Poll Mode Driver The QAT PMD provides poll mode crypto driver support for **Intel QuickAssist Technology DH895xxC** hardware accelerator. -The QAT PMD has been tested on Fedora 21 64-bit with gcc and on the 4.3 +The QAT PMD has been tested on Fedora 21 64-bit with gcc and on the 4.4 kernel.org Linux kernel. @@ -73,9 +73,9 @@ Installation To use the DPDK QAT PMD an SRIOV-enabled QAT kernel driver is required. The VF devices exposed by this driver will be used by QAT PMD. -If you are running on kernel 4.3 or greater, see instructions for +If you are running on kernel 4.4 or greater, see instructions for `Installation using kernel.org driver`_ below. If you are on a kernel earlier -than 4.3, see `Installation using 01.org QAT driver`_. +than 4.4, see `Installation using 01.org QAT driver`_. Installation using 01.org QAT driver @@ -157,13 +157,13 @@ If the build or install fails due to mismatching kernel sources you may need to Installation using kernel.org driver -Assuming you are running on at least a 4.3 kernel, you can use the stock kernel.org QAT +Assuming you are running on at least a 4.4 kernel, you can use the stock kernel.org QAT driver to start the QAT hardware. The steps below assume you are: * Running DPDK on a platform with one ``DH895xCC`` device. -* On a kernel at least version 4.3. +* On a kernel at least version 4.4. In BIOS ensure that SRIOV is enabled and VT-d is disabled. @@ -190,7 +190,7 @@ Using the sysfs, enable the VFs:: echo 32 > /sys/bus/pci/drivers/dh895xcc/\:03\:00.0/sriov_numvfs -If you get an error, it's likely you're using a QAT kernel driver earlier than kernel 4.3. +If you get an error, it's likely you're using a QAT kernel driver earlier than kernel 4.4. To verify that the VFs are available for use - use ``lspci -d:443`` to confirm the bdf of the 32 VF devices are available per ``DH895xCC`` device. -- 2.1.0
[dpdk-dev] [PATCH] virtio: prettify log messages
On Thu, 11 Feb 2016 00:45:34 +0100 Vincent JARDIN wrote: > PMD_TX_LOG() looks better with a \n > > Signed-off-by: Vincent JARDIN NAK. Yes the messages, are messed up, but this is not the right way to fix it. The logging wrappers are inconsistent in virtio. PMD_INIT_LOG adds newline, but RX/TX/DRV do not. I would rather the macros were aligned with ixgbe which always adds newline for all the PMD_XXX_LOG() macros. And then remove all extra newlines in virtio code.
[dpdk-dev] thoughts on DPDK after a few days of reading sources
Hello, The Ubuntu distribution is looking at supporting DPDK in the 'main' component of the archive. As part of this process I spent a few days reading the DPDK sources to gauge if we can support it or not. I've taken some notes while reading the sources; I'm sharing them in the hopes that it's useful: on the one hand my fresh eyes may spot things that you've overlooked, on the other hand your familiarity with the code means that you're better suited to judge what I've found. Most of the code was very good; I am however concerned about the frequent memory allocations that use simple integer arithmetic when deciding how much to allocate without checking for integer overflows. Here's the bug tracking the Main Inclusion Request: https://bugs.launchpad.net/ubuntu/+source/dpdk/+bug/1492186 Here's the portions of the notes that I think may be most interesting to DPDK devs. It's in rough priority order so feel free to stop reading when you're bored: === Affects main codebase and should be investigated quickly: - shellcheck reports extensive cases of forgotten quotes to prevent word splitting or globbing, potentially unused variables, error-prone printf formatting. The scripts that are going to be used at runtime should be fixed: - ./debian/dpdk-init - ./debian/dpdk.init - ./tools/setup.sh ? (Hard to tell) - ./drivers/net/cxgbe/cxgbe_ethdev.c eth_cxgbe_dev_init() memory leak in out_free_adapter: that doesn't free adapter - ./drivers/net/virtio/virtio_ethdev.c virtio_set_multiple_queues() calls virtio_send_command(), which performs: memcpy(vq->virtio_net_hdr_mz->addr, ctrl, sizeof(struct virtio_pmd_ctrl)); This copies a potentially huge amount of uninitialized data into ->addr because the struct virtio_pmd_ctrl ctrl was not zeroed before being passed. How much of this data leaves the system? Does this require a CVE? - ./lib/librte_eal/common/rte_malloc.c memory allocation routines don't check for integer overflow errors: - rte_calloc_socket() - rte_calloc() - ./lib/librte_vhost/vhost_user/virtio-net-user.c user_set_mem_table() doesn't perform integer overflow checks before calling calloc() - ./lib/librte_vhost/vhost_cuse/virtio-net-cdev.c cuse_set_mem_table() doesn't perform integer overflow checks before calling calloc() - ./lib/librte_eal/linuxapp/eal/eal_xen_memory.c rte_xen_dom0_memory_init() If vma_addr = xen_get_virtual_area(&vma_len, RTE_PGSIZE_2M); fails, vma_len is reset to RTE_PGSIZE_2M instead of seginfo[memseg_idx].size -- is this a bug? - ./lib/librte_eal/linuxapp/eal/eal_memory.c create_shared_memory() creates the hugetlb file with mode 0666 rather than 0600 -- is this a bug? Does this require a CVE? - ./lib/librte_eal/common/include/arch/ppc_64/rte_cpuflags.h rte_cpu_get_features() leaks auxv_fd - ./lib/librte_eal/common/include/arch/arm/rte_cpuflags_32.h rte_cpu_get_features() leaks auxv_fd - ./lib/librte_eal/common/include/arch/arm/rte_cpuflags_64.h rte_cpu_get_features() leaks auxv_fd Affects main codebase: - Assorted false-positives and style issues reported by cppcheck - Slightly dangerous convention of memcpy(dest, source, sizeof(source)) is used extensively; while all the instances I investigated were correct, it's still more prone to mistakes under maintenance than memcpy(dest, source, sizeof(dest)). - extensive use of *malloc() wrappers that perform multiplication to determine the size to allocate; all the cases I've seen used values that should be constrained by system configurations but the habit is dangerous compared to use of *calloc() wrappers that handle integer overflow safely. - ./app/test/test_malloc.c test_reordered_free_per_lcore() has incorrect calls to is_memory_overlap() -- p2 is 16000 bytes long, not 1000 bytes long, and this size difference is not reflected in the calls. - ./app/test/test_malloc.c test_realloc() leaks ptr1, may leak ptr9 via error path - ./app/test/test_malloc.c test_realloc() the test with error "Unexpected - ptr4 != ptr3" doesn't feel like it tests an actual promise from the API - ./lib/librte_pipeline/rte_pipeline.c rte_pipeline_table_create(), rte_pipeline_port_in_create(), rte_pipeline_port_out_create(), duplicate the array of function pointers via memcpy() rather than copying a pointer to static tables -- this may represent an easy way to save memory and improve cache hit ratios as well as potentially allow storing the tables in static memory rather than on the heap, reducing the value of these structs in potential exploits. - ixgbe driver in the package is very different from the driver in the Linux kernel -- when bugs in one are found, who is in charge of copying the fixes between the two code bases? - ./lib/librte_eal/linuxapp/kni/ethtool/ixgbe/ixgbe_ethtool.c ixgbe_get_strings() takes a buffer to write into but not the buffer length; sprintf() calls may overflow the buffer if it isn't large enough. It looks like ethtool_get_strings() may use fix
[dpdk-dev] thoughts on DPDK after a few days of reading sources
On Wed, Feb 10, 2016 at 07:05:40PM -0800, Seth Arnold wrote: > - ixgbe driver in the package is very different from the driver in the > Linux kernel -- when bugs in one are found, who is in charge of copying > the fixes between the two code bases? It's not the Linux driver. It's from BSD because DPDK uses mbufs. The MAINTAINERS file / Intel team refreshes it w/ shared code from their upstream BSD support. Matthew.