[dpdk-dev] [PATCH] [VIRTIO] Support multiple queues feature in DPDK based virtio-net frontend.
This patch supports the multiple queues feature in the DPDK based virtio-net frontend. It first gets the max queue number of virtio-net from the virtio PCI configuration and then sends a command to negotiate the queue number with the backend; when receiving and transmitting packets, the negotiated virtio-net queues can serve them. To utilize this feature, the backend also needs to support the multiple queues feature and enable it. Signed-off-by: Ouyang Changchun --- lib/librte_pmd_virtio/virtio_ethdev.c | 326 -- lib/librte_pmd_virtio/virtio_ethdev.h | 10 +- lib/librte_pmd_virtio/virtio_pci.h| 4 +- lib/librte_pmd_virtio/virtio_rxtx.c | 79 +--- lib/librte_pmd_virtio/virtqueue.h | 61 +-- 5 files changed, 388 insertions(+), 92 deletions(-) diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c index c6a1df5..a3616ea 100644 --- a/lib/librte_pmd_virtio/virtio_ethdev.c +++ b/lib/librte_pmd_virtio/virtio_ethdev.c @@ -80,6 +80,9 @@ static void virtio_dev_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats * static void virtio_dev_stats_reset(struct rte_eth_dev *dev); static void virtio_dev_free_mbufs(struct rte_eth_dev *dev); +static int virtio_dev_queue_stats_mapping_set(__rte_unused struct rte_eth_dev *eth_dev, +__rte_unused uint16_t queue_id, __rte_unused uint8_t stat_idx, __rte_unused uint8_t is_rx); + /* * The set of PCI devices this driver supports */ @@ -91,6 +94,130 @@ static struct rte_pci_id pci_id_virtio_map[] = { { .vendor_id = 0, /* sentinel */ }, }; +static int +virtio_send_command(struct virtqueue* vq, struct virtio_pmd_ctrl* ctrl, + int* dlen, int pkt_num) +{ + uint32_t head = vq->vq_desc_head_idx, i; + int k, sum = 0; + virtio_net_ctrl_ack status = ~0; + struct virtio_pmd_ctrl result; + + ctrl->status = status; + + if (!vq->hw->cvq) { + PMD_INIT_LOG(ERR, "%s(): Control queue is " +"not supported by this device.\n", __func__); + return -1; + } + + PMD_INIT_LOG(DEBUG, "vq->vq_desc_head_idx = %d, status = %d, vq->hw->cvq = %p \n" + "vq = %p \n", 
vq->vq_desc_head_idx, status, vq->hw->cvq, vq); + + if ((vq->vq_free_cnt < ((uint32_t)pkt_num + 2)) || (pkt_num < 1)) { + return -1; + } + + memcpy(vq->virtio_net_hdr_mz->addr, ctrl, sizeof(struct virtio_pmd_ctrl)); + + /* +* Format is enforced in qemu code: +* One TX packet for header; +* At least one TX packet per argument; +* One RX packet for ACK. +*/ + vq->vq_ring.desc[head].flags = VRING_DESC_F_NEXT; + vq->vq_ring.desc[head].addr = vq->virtio_net_hdr_mz->phys_addr; + vq->vq_ring.desc[head].len = sizeof(struct virtio_net_ctrl_hdr); + vq->vq_free_cnt--; + i = vq->vq_ring.desc[head].next; + + for (k = 0; k < pkt_num; k++) { + vq->vq_ring.desc[i].flags = VRING_DESC_F_NEXT; + vq->vq_ring.desc[i].addr = vq->virtio_net_hdr_mz->phys_addr + + sizeof(struct virtio_net_ctrl_hdr) + sizeof(ctrl->status) + sizeof(uint8_t)*sum; + vq->vq_ring.desc[i].len = dlen[k]; + sum += dlen[k]; + vq->vq_free_cnt--; + i = vq->vq_ring.desc[i].next; + } + + vq->vq_ring.desc[i].flags = VRING_DESC_F_WRITE; + vq->vq_ring.desc[i].addr = vq->virtio_net_hdr_mz->phys_addr + sizeof(struct virtio_net_ctrl_hdr); + vq->vq_ring.desc[i].len = sizeof(ctrl->status); + vq->vq_free_cnt--; + + vq->vq_desc_head_idx = vq->vq_ring.desc[i].next; + + vq_update_avail_ring(vq, head); + vq_update_avail_idx(vq); + + PMD_INIT_LOG(DEBUG, "vq->vq_queue_index = %d \n", vq->vq_queue_index); + + virtqueue_notify(vq); + + while (vq->vq_used_cons_idx == vq->vq_ring.used->idx) { + usleep(100); + } + + while (vq->vq_used_cons_idx != vq->vq_ring.used->idx) { + uint32_t idx, desc_idx, used_idx; + struct vring_used_elem *uep; + + rmb(); + + used_idx = (uint32_t)(vq->vq_used_cons_idx & (vq->vq_nentries - 1)); + uep = &vq->vq_ring.used->ring[used_idx]; + idx = (uint32_t) uep->id; + desc_idx = idx; + + while (vq->vq_ring.desc[desc_idx].flags & VRING_DESC_F_NEXT) { + desc_idx = vq->vq_ring.desc[desc_idx].next; + vq->vq_free_cnt++; + } + + vq->vq_ring.desc[desc_idx].next = vq->vq_desc_head_idx; + vq->vq_desc_head_idx = idx; + + 
vq->vq_used_cons_
[dpdk-dev] [PATCH 0/3] [PMD] [VHOST] *** Support zero copy RX/TX in user space vhost ***
Short summary:
* Add API to support queue start and stop functionality for RX/TX, and implement it in the IXGBE PMD;
* Enable hardware loopback functionality in VMDQ mode;
* Implement mbuf metadata macros to facilitate referring to space in the mbuf headroom;
* Support user space vhost zero copy RX/TX, which removes packet copying between host and guest in RX/TX.

Ouyang Changchun (3):
  1. It contains the following 2 parts:
     a) Add API to support queue start and stop functionality for RX/TX, and implement it in the IXGBE PMD;
     b) Enable hardware loopback functionality in VMDQ mode;
  2. Implement mbuf metadata macros to facilitate referring to space in the mbuf headroom;
  3. Support user space vhost zero copy, which removes packet copying between host and guest in RX/TX.

 examples/vhost/main.c                    | 1405 ++
 examples/vhost/virtio-net.c              |  120 ++-
 examples/vhost/virtio-net.h              |   15 +-
 lib/librte_eal/linuxapp/eal/eal_memory.c |    2 +-
 lib/librte_ether/rte_ethdev.c            |  104 +++
 lib/librte_ether/rte_ethdev.h            |   80 ++
 lib/librte_mbuf/rte_mbuf.h               |   17 +
 lib/librte_pmd_ixgbe/ixgbe_ethdev.c      |    4 +
 lib/librte_pmd_ixgbe/ixgbe_ethdev.h      |    8 +
 lib/librte_pmd_ixgbe/ixgbe_rxtx.c        |  233 -
 lib/librte_pmd_ixgbe/ixgbe_rxtx.h        |    6 +
 11 files changed, 1800 insertions(+), 194 deletions(-)

--
1.9.0
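The per-queue start/stop API summarized above can be pictured with a small self-contained sketch. Everything named `demo_*` below is a hypothetical stand-in and not a DPDK structure or function; the sketch only assumes the general pattern of validating the port, checking the queue index against the matching queue count, and dispatching through an optional driver callback.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_PORTS  2
#define DEMO_MAX_QUEUES 8

struct demo_dev;

/* Optional per-queue callbacks a driver may provide (hypothetical). */
struct demo_ops {
	int (*rx_queue_start)(struct demo_dev *dev, uint16_t queue_id);
	int (*tx_queue_start)(struct demo_dev *dev, uint16_t queue_id);
};

/* Hypothetical device record; the real ethdev structures are richer. */
struct demo_dev {
	uint16_t nb_rx_queues;
	uint16_t nb_tx_queues;
	const struct demo_ops *ops;
	uint8_t rx_running[DEMO_MAX_QUEUES];
	uint8_t tx_running[DEMO_MAX_QUEUES];
};

static struct demo_dev demo_devices[DEMO_MAX_PORTS];
static uint8_t demo_nb_ports;

/* Toy driver callbacks: only mark the queue as running. */
static int demo_drv_rx_start(struct demo_dev *dev, uint16_t q)
{
	assert(q < DEMO_MAX_QUEUES);
	dev->rx_running[q] = 1;
	return 0;
}

static int demo_drv_tx_start(struct demo_dev *dev, uint16_t q)
{
	assert(q < DEMO_MAX_QUEUES);
	dev->tx_running[q] = 1;
	return 0;
}

const struct demo_ops demo_full_ops = {
	.rx_queue_start = demo_drv_rx_start,
	.tx_queue_start = demo_drv_tx_start,
};

/* Register a port with the given queue counts and callbacks. */
int demo_port_setup(uint8_t port_id, uint16_t nb_rx, uint16_t nb_tx,
		    const struct demo_ops *ops)
{
	if (port_id >= DEMO_MAX_PORTS ||
	    nb_rx > DEMO_MAX_QUEUES || nb_tx > DEMO_MAX_QUEUES)
		return -EINVAL;
	demo_devices[port_id] = (struct demo_dev){
		.nb_rx_queues = nb_rx, .nb_tx_queues = nb_tx, .ops = ops,
	};
	if (port_id >= demo_nb_ports)
		demo_nb_ports = (uint8_t)(port_id + 1);
	return 0;
}

/* Start one RX queue: validate port, validate index, then dispatch. */
int demo_rx_queue_start(uint8_t port_id, uint16_t rx_queue_id)
{
	struct demo_dev *dev;

	if (port_id >= demo_nb_ports)
		return -EINVAL;
	dev = &demo_devices[port_id];
	if (rx_queue_id >= dev->nb_rx_queues)
		return -EINVAL;
	if (dev->ops == NULL || dev->ops->rx_queue_start == NULL)
		return -ENOTSUP;
	return dev->ops->rx_queue_start(dev, rx_queue_id);
}

/* Start one TX queue: the bound is the TX queue count, not the RX one. */
int demo_tx_queue_start(uint8_t port_id, uint16_t tx_queue_id)
{
	struct demo_dev *dev;

	if (port_id >= demo_nb_ports)
		return -EINVAL;
	dev = &demo_devices[port_id];
	if (tx_queue_id >= dev->nb_tx_queues)
		return -EINVAL;
	if (dev->ops == NULL || dev->ops->tx_queue_start == NULL)
		return -ENOTSUP;
	return dev->ops->tx_queue_start(dev, tx_queue_id);
}
```

A stop path would presumably mirror the start path with the same three checks. Keeping the callbacks optional lets a driver that lacks per-queue control report "not supported" instead of failing silently.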
[dpdk-dev] [PATCH 3/3] [PMD] [VHOST] Support zero copy RX/TX in user space vhost
Support user space vhost zero copy. It removes packet copying between host and guest in RX/TX. It introduces an extra ring to store the detached mbufs. At the initialization stage all mbufs are put into this ring; when a guest starts, vhost gets the available buffer addresses allocated by the guest for RX and translates them into host space addresses, then attaches them to mbufs and puts the attached mbufs into the mempool. Queue starting and DMA refilling get mbufs from the mempool and use them to set the DMA addresses. For TX, it gets the buffer addresses of available packets to be transmitted from the guest and translates them to host space addresses, then attaches them to mbufs and puts them into the TX queues. After TX finishes, it pulls the mbufs out of the mempool, detaches them and puts them back into the extra ring. Signed-off-by: Ouyang Changchun --- examples/vhost/main.c | 1405 ++- examples/vhost/virtio-net.c | 120 +++- examples/vhost/virtio-net.h | 15 +- 3 files changed, 1383 insertions(+), 157 deletions(-) diff --git a/examples/vhost/main.c b/examples/vhost/main.c index 816a71a..21704f1 100644 --- a/examples/vhost/main.c +++ b/examples/vhost/main.c @@ -48,6 +48,7 @@ #include #include #include +#include #include "main.h" #include "virtio-net.h" @@ -70,6 +71,14 @@ #define MBUF_SIZE (2048 + sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM) /* + * No frame data buffer allocated from host are required for zero copy implementation, + * guest will allocate the frame data buffer, and vhost directly use it. + */ +#define VIRTIO_DESCRIPTOR_LEN_ZCP 1518 +#define MBUF_SIZE_ZCP (VIRTIO_DESCRIPTOR_LEN_ZCP + sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM) +#define MBUF_CACHE_SIZE_ZCP 0 + +/* * RX and TX Prefetch, Host, and Write-back threshold values should be * carefully set for optimal performance. 
Consult the network * controller's datasheet and supporting DPDK documentation for guidance @@ -108,6 +117,21 @@ #define RTE_TEST_RX_DESC_DEFAULT 1024 #define RTE_TEST_TX_DESC_DEFAULT 512 +/* + * Need refine these 2 macros for legacy and DPDK based front end: + * Max vring avail descriptor/entries from guest - MAX_PKT_BURST + * And then adjust power 2. + */ +/* + * For legacy front end, 128 descriptors, + * half for virtio header, another half for mbuf. + */ +#define RTE_TEST_RX_DESC_DEFAULT_ZCP 32 /* legacy: 32, DPDK virt FE: 128. */ +#define RTE_TEST_TX_DESC_DEFAULT_ZCP 64 /* legacy: 64, DPDK virt FE: 64. */ + +/* true if x is a power of 2 */ +#define POWEROF2(x) ((((x)-1) & (x)) == 0) + #define INVALID_PORT_ID 0xFF /* Max number of devices. Limited by vmdq. */ @@ -138,8 +162,39 @@ static uint32_t num_switching_cores = 0; static uint32_t num_queues = 0; uint32_t num_devices = 0; +/* Enable zero copy, pkts buffer will directly dma to hw descriptor, disabled on default*/ +static uint32_t zero_copy = 0; + +/* number of descriptors to apply*/ +static uint32_t num_rx_descriptor = RTE_TEST_RX_DESC_DEFAULT_ZCP; +static uint32_t num_tx_descriptor = RTE_TEST_TX_DESC_DEFAULT_ZCP; + +/* max ring descriptor, ixgbe, i40e, e1000 all are 4096. */ +#define MAX_RING_DESC 4096 + +struct vpool { + struct rte_mempool * pool; + struct rte_ring * ring; + uint32_t buf_size; +} vpool_array[MAX_QUEUES+MAX_QUEUES]; + /* Enable VM2VM communications. If this is disabled then the MAC address compare is skipped. */ -static uint32_t enable_vm2vm = 1; +typedef enum { + VM2VM_DISABLED = 0, + VM2VM_SOFTWARE = 1, + VM2VM_HARDWARE = 2, + VM2VM_LAST +} vm2vm_type; +static vm2vm_type vm2vm_mode = VM2VM_SOFTWARE; + +/* The type of host physical address translated from guest physical address. */ +typedef enum { + PHYS_ADDR_CONTINUOUS = 0, + PHYS_ADDR_CROSS_SUBREG = 1, + PHYS_ADDR_INVALID = 2, + PHYS_ADDR_LAST +} hpa_type; + /* Enable stats. */ static uint32_t enable_stats = 0; /* Enable retries on RX. 
*/ @@ -159,7 +214,7 @@ static uint32_t dev_index = 0; extern uint64_t VHOST_FEATURES; /* Default configuration for rx and tx thresholds etc. */ -static const struct rte_eth_rxconf rx_conf_default = { +static struct rte_eth_rxconf rx_conf_default = { .rx_thresh = { .pthresh = RX_PTHRESH, .hthresh = RX_HTHRESH, @@ -173,7 +228,7 @@ static const struct rte_eth_rxconf rx_conf_default = { * Controller and the DPDK ixgbe/igb PMD. Consider using other values for other * network controllers and/or network drivers. */ -static const struct rte_eth_txconf tx_conf_default = { +static struct rte_eth_txconf tx_conf_default = { .tx_thresh = { .pthresh = TX_PTHRESH, .hthresh = TX_HTHRESH, @@ -184,7 +239,7 @@ static const struct rte_eth_txconf tx_conf_default = { }; /* empty vmdq configuration structure. Filled in programatically */ -static const struct rte_eth_conf vmdq_conf_default =
[dpdk-dev] [PATCH 2/3] [PMD] [VHOST] Support zero copy RX/TX in user space vhost
Implement mbuf metadata macros to facilitate referring to space in the mbuf headroom. Signed-off-by: Ouyang Changchun --- lib/librte_mbuf/rte_mbuf.h | 17 + 1 file changed, 17 insertions(+) diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h index edffc2c..baf3ca4 100644 --- a/lib/librte_mbuf/rte_mbuf.h +++ b/lib/librte_mbuf/rte_mbuf.h @@ -201,8 +201,25 @@ struct rte_mbuf { struct rte_ctrlmbuf ctrl; struct rte_pktmbuf pkt; }; + + union { + uint8_t metadata[0]; + uint16_t metadata16[0]; + uint32_t metadata32[0]; + uint64_t metadata64[0]; + }; } __rte_cache_aligned; +#define RTE_MBUF_METADATA_UINT8(mbuf, offset) (mbuf->metadata[offset]) +#define RTE_MBUF_METADATA_UINT16(mbuf, offset) (mbuf->metadata16[offset/sizeof(uint16_t)]) +#define RTE_MBUF_METADATA_UINT32(mbuf, offset) (mbuf->metadata32[offset/sizeof(uint32_t)]) +#define RTE_MBUF_METADATA_UINT64(mbuf, offset) (mbuf->metadata64[offset/sizeof(uint64_t)]) + +#define RTE_MBUF_METADATA_UINT8_PTR(mbuf, offset) (&mbuf->metadata[offset]) +#define RTE_MBUF_METADATA_UINT16_PTR(mbuf, offset) (&mbuf->metadata16[offset/sizeof(uint16_t)]) +#define RTE_MBUF_METADATA_UINT32_PTR(mbuf, offset) (&mbuf->metadata32[offset/sizeof(uint32_t)]) +#define RTE_MBUF_METADATA_UINT64_PTR(mbuf, offset) (&mbuf->metadata64[offset/sizeof(uint64_t)]) + /** * Given the buf_addr returns the pointer to corresponding mbuf. */ -- 1.9.0
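As a rough illustration of what such headroom accessors provide, the sketch below stores and reads a 32-bit value in the bytes that directly follow a buffer header. The `demo_mbuf` struct, `DEMO_HEADROOM` and the `demo_metadata_*` helpers are hypothetical and are not part of rte_mbuf.h. The sketch also swaps the zero-length arrays used in the patch for memcpy-based access, so that it stays within standard C and makes no alignment assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_HEADROOM 128 /* bytes reserved right after the header */

/* Hypothetical buffer header; the real struct rte_mbuf is much larger. */
struct demo_mbuf {
	void    *buf_addr;
	uint16_t data_len;
};

/* Allocate header plus headroom as one block, zero-initialized. */
struct demo_mbuf *demo_mbuf_alloc(void)
{
	struct demo_mbuf *m = calloc(1, sizeof(*m) + DEMO_HEADROOM);

	if (m != NULL)
		m->buf_addr = (uint8_t *)m + sizeof(*m);
	return m;
}

/* Pointer to the byte at 'offset' (in bytes) inside the headroom. */
static inline uint8_t *demo_metadata_ptr(struct demo_mbuf *m, size_t offset)
{
	assert(offset < DEMO_HEADROOM);
	return (uint8_t *)m + sizeof(*m) + offset;
}

/* Store a 32-bit value at a byte offset in the headroom. */
void demo_metadata_set_u32(struct demo_mbuf *m, size_t offset, uint32_t v)
{
	assert(offset + sizeof(v) <= DEMO_HEADROOM);
	memcpy(demo_metadata_ptr(m, offset), &v, sizeof(v));
}

/* Load a 32-bit value from a byte offset in the headroom. */
uint32_t demo_metadata_get_u32(struct demo_mbuf *m, size_t offset)
{
	uint32_t v;

	assert(offset + sizeof(v) <= DEMO_HEADROOM);
	memcpy(&v, demo_metadata_ptr(m, offset), sizeof(v));
	return v;
}
```

Such accessors let code attach a small per-packet value, for example a descriptor index at offset 0, without adding a field to the header itself.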
[dpdk-dev] [PATCH 1/3] [PMD] [VHOST] Support zero copy RX/TX in user space vhost
1. Add API to support queue start and stop functionality for RX/TX, and implement them in IXGBE PMD; 2. Enable hardware loopback functionality in VMDQ mode; Signed-off-by: Ouyang Changchun --- lib/librte_eal/linuxapp/eal/eal_memory.c | 2 +- lib/librte_ether/rte_ethdev.c| 104 ++ lib/librte_ether/rte_ethdev.h| 80 +++ lib/librte_pmd_ixgbe/ixgbe_ethdev.c | 4 + lib/librte_pmd_ixgbe/ixgbe_ethdev.h | 8 ++ lib/librte_pmd_ixgbe/ixgbe_rxtx.c| 233 ++- lib/librte_pmd_ixgbe/ixgbe_rxtx.h| 6 + 7 files changed, 400 insertions(+), 37 deletions(-) diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c index 69ad63e..dd10e15 100644 --- a/lib/librte_eal/linuxapp/eal/eal_memory.c +++ b/lib/librte_eal/linuxapp/eal/eal_memory.c @@ -134,6 +134,7 @@ rte_mem_virt2phy(const void *virtaddr) uint64_t page, physaddr; unsigned long virt_pfn; int page_size; + off_t offset; /* standard page size */ page_size = getpagesize(); @@ -145,7 +146,6 @@ rte_mem_virt2phy(const void *virtaddr) return RTE_BAD_PHYS_ADDR; } - off_t offset; virt_pfn = (unsigned long)virtaddr / page_size; offset = sizeof(uint64_t) * virt_pfn; if (lseek(fd, offset, SEEK_SET) == (off_t) -1) { diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c index ec411db..7faeeff 100644 --- a/lib/librte_ether/rte_ethdev.c +++ b/lib/librte_ether/rte_ethdev.c @@ -293,6 +293,110 @@ rte_eth_dev_rx_queue_config(struct rte_eth_dev *dev, uint16_t nb_queues) return (0); } +int +rte_eth_dev_rx_queue_start(uint8_t port_id, uint16_t rx_queue_id) +{ + struct rte_eth_dev *dev; + + /* This function is only safe when called from the primary process +* in a multi-process setup*/ + PROC_PRIMARY_OR_ERR_RET(-E_RTE_SECONDARY); + + if (port_id >= nb_ports) { + PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id); + return (-EINVAL); + } + + dev = &rte_eth_devices[port_id]; + if (rx_queue_id >= dev->data->nb_rx_queues) { + PMD_DEBUG_TRACE("Invalid RX queue_id=%d\n", rx_queue_id); + return (-EINVAL); + } + + 
FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_start, -ENOTSUP); + + return dev->dev_ops->rx_queue_start(dev, rx_queue_id); + +} + +int +rte_eth_dev_rx_queue_stop(uint8_t port_id, uint16_t rx_queue_id) +{ + struct rte_eth_dev *dev; + + /* This function is only safe when called from the primary process +* in a multi-process setup*/ + PROC_PRIMARY_OR_ERR_RET(-E_RTE_SECONDARY); + + if (port_id >= nb_ports) { + PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id); + return (-EINVAL); + } + + dev = &rte_eth_devices[port_id]; + if (rx_queue_id >= dev->data->nb_rx_queues) { + PMD_DEBUG_TRACE("Invalid RX queue_id=%d\n", rx_queue_id); + return (-EINVAL); + } + + FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_stop, -ENOTSUP); + + return dev->dev_ops->rx_queue_stop(dev, rx_queue_id); + +} + +int +rte_eth_dev_tx_queue_start(uint8_t port_id, uint16_t tx_queue_id) +{ + struct rte_eth_dev *dev; + + /* This function is only safe when called from the primary process +* in a multi-process setup*/ + PROC_PRIMARY_OR_ERR_RET(-E_RTE_SECONDARY); + + if (port_id >= nb_ports) { + PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id); + return (-EINVAL); + } + + dev = &rte_eth_devices[port_id]; + if (tx_queue_id >= dev->data->nb_rx_queues) { + PMD_DEBUG_TRACE("Invalid RX queue_id=%d\n", tx_queue_id); + return (-EINVAL); + } + + FUNC_PTR_OR_ERR_RET(*dev->dev_ops->tx_queue_start, -ENOTSUP); + + return dev->dev_ops->tx_queue_start(dev, tx_queue_id); + +} + +int +rte_eth_dev_tx_queue_stop(uint8_t port_id, uint16_t tx_queue_id) +{ + struct rte_eth_dev *dev; + + /* This function is only safe when called from the primary process +* in a multi-process setup*/ + PROC_PRIMARY_OR_ERR_RET(-E_RTE_SECONDARY); + + if (port_id >= nb_ports) { + PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id); + return (-EINVAL); + } + + dev = &rte_eth_devices[port_id]; + if (tx_queue_id >= dev->data->nb_rx_queues) { + PMD_DEBUG_TRACE("Invalid RX queue_id=%d\n", tx_queue_id); + return (-EINVAL); + } + + 
FUNC_PTR_OR_ERR_RET(*dev->dev_ops->tx_queue_stop, -ENOTSUP); + + return dev->dev_ops->tx_queue_stop(dev, tx_que
[dpdk-dev] [PATCH] [PMD] [VHOST] Revert unnecessary definition and fix wrong referring in user space vhost zero copy patches
1. Revert the metadata macro definitions for referring to headroom space in mbuf; 2. Fix the TX queue start/stop functions, which wrongly checked the queue index against the number of RX queues. Signed-off-by: Ouyang Changchun --- examples/vhost/main.c | 15 +-- lib/librte_ether/rte_ethdev.c | 8 lib/librte_mbuf/rte_mbuf.h| 17 - 3 files changed, 13 insertions(+), 27 deletions(-) diff --git a/examples/vhost/main.c b/examples/vhost/main.c index 21704f1..674608c 100644 --- a/examples/vhost/main.c +++ b/examples/vhost/main.c @@ -129,6 +129,9 @@ #define RTE_TEST_RX_DESC_DEFAULT_ZCP 32 /* legacy: 32, DPDK virt FE: 128. */ #define RTE_TEST_TX_DESC_DEFAULT_ZCP 64 /* legacy: 64, DPDK virt FE: 64. */ +/* Get first 4 bytes in mbuf headroom. */ +#define MBUF_HEADROOM_UINT32(mbuf) (*(uint32_t*)((uint8_t*)(mbuf) + sizeof(struct rte_mbuf))) + /* true if x is a power of 2 */ #define POWEROF2(x) ((((x)-1) & (x)) == 0) @@ -1638,7 +1641,7 @@ attach_rxmbuf_zcp(struct virtio_net *dev) mbuf->pkt.data = (void*)(uintptr_t)(buff_addr); mbuf->buf_physaddr = phys_addr - RTE_PKTMBUF_HEADROOM; mbuf->pkt.data_len = desc->len; - RTE_MBUF_METADATA_UINT32(mbuf, 0) = (uint32_t)desc_idx; + MBUF_HEADROOM_UINT32(mbuf) = (uint32_t)desc_idx; LOG_DEBUG(DATA, "(%"PRIu64") in attach_rxmbuf_zcp: res base idx:%d, descriptor idx:%d\n", dev->device_fh, res_base_idx, desc_idx); @@ -1700,7 +1703,7 @@ txmbuf_clean_zcp(struct virtio_net* dev, struct vpool* vpool) rte_ring_sp_enqueue(vpool->ring, mbuf); /* Update used index buffer information. */ - vq->used->ring[used_idx].id = RTE_MBUF_METADATA_UINT32(mbuf, 0); + vq->used->ring[used_idx].id = MBUF_HEADROOM_UINT32(mbuf); vq->used->ring[used_idx].len = 0; used_idx = (used_idx + 1) & (vq->size - 1); @@ -1788,7 +1791,7 @@ virtio_dev_rx_zcp(struct virtio_net *dev, struct rte_mbuf **pkts, uint32_t count /* Retrieve all of the head indexes first to avoid caching issues. 
*/ for (head_idx = 0; head_idx < count; head_idx++) - head[head_idx] = RTE_MBUF_METADATA_UINT32((pkts[head_idx]), 0); + head[head_idx] = MBUF_HEADROOM_UINT32(pkts[head_idx]); /*Prefetch descriptor index. */ rte_prefetch0(&vq->desc[head[packet_success]]); @@ -1799,7 +1802,7 @@ virtio_dev_rx_zcp(struct virtio_net *dev, struct rte_mbuf **pkts, uint32_t count buff = pkts[packet_success]; LOG_DEBUG(DATA, "(%"PRIu64") in dev_rx_zcp: update the used idx for pkt[%d] descriptor idx: %d\n", - dev->device_fh, packet_success, RTE_MBUF_METADATA_UINT32(buff, 0)); + dev->device_fh, packet_success, MBUF_HEADROOM_UINT32(buff)); PRINT_PACKET(dev, (uintptr_t)(((uint64_t)(uintptr_t)buff->buf_addr) + RTE_PKTMBUF_HEADROOM), rte_pktmbuf_data_len(buff), 0); @@ -1901,7 +1904,7 @@ virtio_tx_route_zcp(struct virtio_net* dev, struct rte_mbuf *m, uint32_t desc_id if (unlikely(dev_ll->dev->device_fh == dev->device_fh)) { LOG_DEBUG(DATA, "(%"PRIu64") TX: Source and destination MAC addresses are the same. Dropping packet.\n", dev_ll->dev->device_fh); - RTE_MBUF_METADATA_UINT32(mbuf, 0) = (uint32_t)desc_idx; + MBUF_HEADROOM_UINT32(mbuf) = (uint32_t)desc_idx; __rte_mbuf_raw_free(mbuf); return ; } @@ -1936,7 +1939,7 @@ virtio_tx_route_zcp(struct virtio_net* dev, struct rte_mbuf *m, uint32_t desc_id mbuf->pkt.vlan_macip.f.vlan_tci = vlan_tag; mbuf->pkt.vlan_macip.f.l2_len = sizeof(struct ether_hdr); mbuf->pkt.vlan_macip.f.l3_len = sizeof(struct ipv4_hdr); - RTE_MBUF_METADATA_UINT32(mbuf, 0) = (uint32_t)desc_idx; + MBUF_HEADROOM_UINT32(mbuf) = (uint32_t)desc_idx; tx_q->m_table[len] = mbuf; len++; diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c index 7faeeff..0008755 100644 --- a/lib/librte_ether/rte_ethdev.c +++ b/lib/librte_ether/rte_ethdev.c @@ -360,8 +360,8 @@ rte_eth_dev_tx_queue_start(uint8_t port_id, uint16_t tx_queue_id) } dev = &rte_eth_devices[port_id]; - if (tx_queue_id >= dev->data->nb_rx_queues) { - PMD_DEBUG_TRACE("Invalid RX queue_id=%d\n", tx_queue_id); + if 
(tx_queue_id >= dev->data->nb_tx_queues) { + PMD_DEBUG_TRACE("Inva
[dpdk-dev] [PATCH] [PMD] [VHOST] Revert unnecessary definition and fix wrong referring in user space vhost zero copy patches
Hi Thomas,

Fine, I will do it. One more question. You have comments as follows:

  The title was "[PATCH 0/3] [PMD] [VHOST] *** Support zero copy RX/TX in user space vhost ***"
  It should be "[PATCH v2 0/3] Support zero copy RX/TX in user space vhost"

So "[PMD] [VHOST]" should be removed from the title of the cover letter, right? And each separate patch could use a prefix such as "ixgbe:" or "examples/vhost:" instead of "[PMD] [VHOST]". Is that right?

Thanks
Changchun

-----Original Message-----
From: Thomas Monjalon [mailto:thomas.monja...@6wind.com]
Sent: Tuesday, May 20, 2014 12:00 AM
To: Ouyang, Changchun
Cc: dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH] [PMD] [VHOST] Revert unnecessary definition and fix wrong referring in user space vhost zero copy patches

Hi Changchun,

2014-05-19 23:09, Ouyang Changchun:
> 1. Revert the change of metadata macro definition for referring to
> headroom space in mbuf; 2. Fix wrongly referring to RX queues number
> in TX queues start/stop function.
>
> Signed-off-by: Ouyang Changchun

You are fixing commits which are not yet applied.
Please merge and re-send the whole series by suffixing with "v2".

The title was "[PATCH 0/3] [PMD] [VHOST] *** Support zero copy RX/TX in user space vhost ***"
It should be "[PATCH v2 0/3] Support zero copy RX/TX in user space vhost"

Other notes:
- please split API and ixgbe changes
- set a significant title to each patch
- use prefixes like "ethdev:", "ixgbe:" or "examples/vhost:"

In general, this page is a good help: http://dpdk.org/dev#send

Thanks
--
Thomas
[dpdk-dev] [PATCH v2 0/3] Support zero copy RX/TX in user space vhost
This patch series supports user space vhost zero copy. It removes packet copying between host and guest in RX/TX, and it introduces an extra ring to store the detached mbufs. At the initialization stage all mbufs are put into this ring; when a guest starts, vhost gets the available buffer addresses allocated by the guest for RX and translates them into host space addresses, then attaches them to mbufs and puts the attached mbufs into the mempool. Queue starting and DMA refilling get mbufs from the mempool and use them to set the DMA addresses. For TX, it gets the buffer addresses of available packets to be transmitted from the guest and translates them to host space addresses, then attaches them to mbufs and puts them into the TX queues. After TX finishes, it pulls the mbufs out of the mempool, detaches them and puts them back into the extra ring.

This patch series also implements queue start and stop functionality in the IXGBE PMD, and enables hardware loopback for VMDQ mode in the IXGBE PMD.

Ouyang Changchun (3):
  Add API to support queue start and stop functionality for RX/TX.
  Implement queue start and stop functionality in IXGBE PMD; Enable hardware loopback for VMDQ mode in IXGBE PMD.
  Support user space vhost zero copy, it removes packets copying between host and guest in RX/TX.

 examples/vhost/main.c                    | 1410 ++
 examples/vhost/virtio-net.c              |  120 ++-
 examples/vhost/virtio-net.h              |   15 +-
 lib/librte_eal/linuxapp/eal/eal_memory.c |    2 +-
 lib/librte_ether/rte_ethdev.c            |  104 +++
 lib/librte_ether/rte_ethdev.h            |   80 ++
 lib/librte_pmd_ixgbe/ixgbe_ethdev.c      |    4 +
 lib/librte_pmd_ixgbe/ixgbe_ethdev.h      |    8 +
 lib/librte_pmd_ixgbe/ixgbe_rxtx.c        |  233 -
 lib/librte_pmd_ixgbe/ixgbe_rxtx.h        |    6 +
 10 files changed, 1787 insertions(+), 195 deletions(-)

--
1.9.0
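The buffer lifecycle described in this cover letter can be sketched with a toy model. All names below (`demo_buf`, `demo_ring`, `demo_region`, `demo_attach`, `demo_detach`) are hypothetical and are not taken from the patch; the sketch further assumes a single contiguous guest memory region and a single-threaded caller, which the real vhost code does not.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_RING_SIZE 4 /* must be a power of 2 */

/* Hypothetical packet buffer handle: no data of its own while detached. */
struct demo_buf {
	uint64_t host_addr; /* 0 while detached */
	uint32_t desc_idx;  /* guest descriptor this buffer is attached to */
};

/* Extra ring holding the detached buffers (single-threaded toy). */
struct demo_ring {
	struct demo_buf *slots[DEMO_RING_SIZE];
	uint32_t head; /* next slot to dequeue */
	uint32_t tail; /* next slot to enqueue */
};

uint32_t demo_ring_count(const struct demo_ring *r)
{
	return r->tail - r->head;
}

int demo_ring_enqueue(struct demo_ring *r, struct demo_buf *b)
{
	if (demo_ring_count(r) == DEMO_RING_SIZE)
		return -1;
	r->slots[r->tail & (DEMO_RING_SIZE - 1)] = b;
	r->tail++;
	return 0;
}

struct demo_buf *demo_ring_dequeue(struct demo_ring *r)
{
	struct demo_buf *b;

	if (demo_ring_count(r) == 0)
		return NULL;
	b = r->slots[r->head & (DEMO_RING_SIZE - 1)];
	r->head++;
	return b;
}

/* One contiguous guest region mapped into host address space (assumed). */
struct demo_region {
	uint64_t guest_base;
	uint64_t host_base;
	uint64_t size;
};

/* Translate a guest address to a host address; 0 means "not mapped". */
uint64_t demo_guest_to_host(const struct demo_region *reg, uint64_t gaddr)
{
	if (gaddr < reg->guest_base || gaddr - reg->guest_base >= reg->size)
		return 0;
	return reg->host_base + (gaddr - reg->guest_base);
}

/* Take a detached buffer from the ring and point it at guest memory. */
struct demo_buf *demo_attach(struct demo_ring *r, const struct demo_region *reg,
			     uint64_t gaddr, uint32_t desc_idx)
{
	uint64_t haddr = demo_guest_to_host(reg, gaddr);
	struct demo_buf *b;

	if (haddr == 0)
		return NULL;
	b = demo_ring_dequeue(r);
	if (b == NULL)
		return NULL;
	b->host_addr = haddr;
	b->desc_idx = desc_idx;
	return b;
}

/* Detach the buffer, return it to the ring, and report its descriptor. */
uint32_t demo_detach(struct demo_ring *r, struct demo_buf *b)
{
	uint32_t idx = b->desc_idx;
	int rc;

	b->host_addr = 0;
	rc = demo_ring_enqueue(r, b);
	assert(rc == 0);
	(void)rc;
	return idx;
}
```

In this model no packet data is ever copied: only the address stored in the buffer handle changes, which is the point of the zero copy design.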
[dpdk-dev] [PATCH v2 1/3] ethdev: Add API to support queue start and stop functionality for RX/TX.
This patch adds an API to support queue start and stop functionality for RX/TX. It allows RX and TX queues to be started or stopped one by one, instead of starting and stopping all of them at the same time. Signed-off-by: Ouyang Changchun --- lib/librte_eal/linuxapp/eal/eal_memory.c | 2 +- lib/librte_ether/rte_ethdev.c| 104 +++ lib/librte_ether/rte_ethdev.h| 80 3 files changed, 185 insertions(+), 1 deletion(-) diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c index 69ad63e..dd10e15 100644 --- a/lib/librte_eal/linuxapp/eal/eal_memory.c +++ b/lib/librte_eal/linuxapp/eal/eal_memory.c @@ -134,6 +134,7 @@ rte_mem_virt2phy(const void *virtaddr) uint64_t page, physaddr; unsigned long virt_pfn; int page_size; + off_t offset; /* standard page size */ page_size = getpagesize(); @@ -145,7 +146,6 @@ rte_mem_virt2phy(const void *virtaddr) return RTE_BAD_PHYS_ADDR; } - off_t offset; virt_pfn = (unsigned long)virtaddr / page_size; offset = sizeof(uint64_t) * virt_pfn; if (lseek(fd, offset, SEEK_SET) == (off_t) -1) { diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c index ec411db..0008755 100644 --- a/lib/librte_ether/rte_ethdev.c +++ b/lib/librte_ether/rte_ethdev.c @@ -293,6 +293,110 @@ rte_eth_dev_rx_queue_config(struct rte_eth_dev *dev, uint16_t nb_queues) return (0); } +int +rte_eth_dev_rx_queue_start(uint8_t port_id, uint16_t rx_queue_id) +{ + struct rte_eth_dev *dev; + + /* This function is only safe when called from the primary process +* in a multi-process setup*/ + PROC_PRIMARY_OR_ERR_RET(-E_RTE_SECONDARY); + + if (port_id >= nb_ports) { + PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id); + return (-EINVAL); + } + + dev = &rte_eth_devices[port_id]; + if (rx_queue_id >= dev->data->nb_rx_queues) { + PMD_DEBUG_TRACE("Invalid RX queue_id=%d\n", rx_queue_id); + return (-EINVAL); + } + + FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_start, -ENOTSUP); + + return dev->dev_ops->rx_queue_start(dev, rx_queue_id); + +} + 
+int +rte_eth_dev_rx_queue_stop(uint8_t port_id, uint16_t rx_queue_id) +{ + struct rte_eth_dev *dev; + + /* This function is only safe when called from the primary process +* in a multi-process setup*/ + PROC_PRIMARY_OR_ERR_RET(-E_RTE_SECONDARY); + + if (port_id >= nb_ports) { + PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id); + return (-EINVAL); + } + + dev = &rte_eth_devices[port_id]; + if (rx_queue_id >= dev->data->nb_rx_queues) { + PMD_DEBUG_TRACE("Invalid RX queue_id=%d\n", rx_queue_id); + return (-EINVAL); + } + + FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_stop, -ENOTSUP); + + return dev->dev_ops->rx_queue_stop(dev, rx_queue_id); + +} + +int +rte_eth_dev_tx_queue_start(uint8_t port_id, uint16_t tx_queue_id) +{ + struct rte_eth_dev *dev; + + /* This function is only safe when called from the primary process +* in a multi-process setup*/ + PROC_PRIMARY_OR_ERR_RET(-E_RTE_SECONDARY); + + if (port_id >= nb_ports) { + PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id); + return (-EINVAL); + } + + dev = &rte_eth_devices[port_id]; + if (tx_queue_id >= dev->data->nb_tx_queues) { + PMD_DEBUG_TRACE("Invalid TX queue_id=%d\n", tx_queue_id); + return (-EINVAL); + } + + FUNC_PTR_OR_ERR_RET(*dev->dev_ops->tx_queue_start, -ENOTSUP); + + return dev->dev_ops->tx_queue_start(dev, tx_queue_id); + +} + +int +rte_eth_dev_tx_queue_stop(uint8_t port_id, uint16_t tx_queue_id) +{ + struct rte_eth_dev *dev; + + /* This function is only safe when called from the primary process +* in a multi-process setup*/ + PROC_PRIMARY_OR_ERR_RET(-E_RTE_SECONDARY); + + if (port_id >= nb_ports) { + PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id); + return (-EINVAL); + } + + dev = &rte_eth_devices[port_id]; + if (tx_queue_id >= dev->data->nb_tx_queues) { + PMD_DEBUG_TRACE("Invalid TX queue_id=%d\n", tx_queue_id); + return (-EINVAL); + } + + FUNC_PTR_OR_ERR_RET(*dev->dev_ops->tx_queue_stop, -ENOTSUP); + + return dev->dev_ops->tx_queue_stop(dev, tx_queue_id); + +} + static int 
rte_eth_dev_tx_queue_config(struct rte_eth_dev *dev, uint16_t nb_queues) { diff --git a/lib/librte_ether/rte_ethdev.h b/li
[dpdk-dev] [PATCH v2 3/3] examples/vhost: Support user space vhost zero copy
This patch supports user space vhost zero copy. It removes packet copying between host and guest in RX/TX. It introduces an extra ring to store the detached mbufs. At the initialization stage all mbufs are put into this ring; when a guest starts, vhost gets the available buffer addresses allocated by the guest for RX and translates them into host space addresses, then attaches them to mbufs and puts the attached mbufs into the mempool. Queue starting and DMA refilling get mbufs from the mempool and use them to set the DMA addresses. For TX, it gets the buffer addresses of available packets to be transmitted from the guest and translates them to host space addresses, then attaches them to mbufs and puts them into the TX queues. After TX finishes, it pulls the mbufs out of the mempool, detaches them and puts them back into the extra ring. Signed-off-by: Ouyang Changchun --- examples/vhost/main.c | 1410 ++- examples/vhost/virtio-net.c | 120 +++- examples/vhost/virtio-net.h | 15 +- 3 files changed, 1387 insertions(+), 158 deletions(-) diff --git a/examples/vhost/main.c b/examples/vhost/main.c index 816a71a..674608c 100644 --- a/examples/vhost/main.c +++ b/examples/vhost/main.c @@ -48,6 +48,7 @@ #include #include #include +#include #include "main.h" #include "virtio-net.h" @@ -70,6 +71,14 @@ #define MBUF_SIZE (2048 + sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM) /* + * No frame data buffer allocated from host are required for zero copy implementation, + * guest will allocate the frame data buffer, and vhost directly use it. + */ +#define VIRTIO_DESCRIPTOR_LEN_ZCP 1518 +#define MBUF_SIZE_ZCP (VIRTIO_DESCRIPTOR_LEN_ZCP + sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM) +#define MBUF_CACHE_SIZE_ZCP 0 + +/* * RX and TX Prefetch, Host, and Write-back threshold values should be * carefully set for optimal performance. 
Consult the network * controller's datasheet and supporting DPDK documentation for guidance @@ -108,6 +117,24 @@ #define RTE_TEST_RX_DESC_DEFAULT 1024 #define RTE_TEST_TX_DESC_DEFAULT 512 +/* + * Need refine these 2 macros for legacy and DPDK based front end: + * Max vring avail descriptor/entries from guest - MAX_PKT_BURST + * And then adjust power 2. + */ +/* + * For legacy front end, 128 descriptors, + * half for virtio header, another half for mbuf. + */ +#define RTE_TEST_RX_DESC_DEFAULT_ZCP 32 /* legacy: 32, DPDK virt FE: 128. */ +#define RTE_TEST_TX_DESC_DEFAULT_ZCP 64 /* legacy: 64, DPDK virt FE: 64. */ + +/* Get first 4 bytes in mbuf headroom. */ +#define MBUF_HEADROOM_UINT32(mbuf) (*(uint32_t*)((uint8_t*)(mbuf) + sizeof(struct rte_mbuf))) + +/* true if x is a power of 2 */ +#define POWEROF2(x) ((((x)-1) & (x)) == 0) + #define INVALID_PORT_ID 0xFF /* Max number of devices. Limited by vmdq. */ @@ -138,8 +165,39 @@ static uint32_t num_switching_cores = 0; static uint32_t num_queues = 0; uint32_t num_devices = 0; +/* Enable zero copy, pkts buffer will directly dma to hw descriptor, disabled on default*/ +static uint32_t zero_copy = 0; + +/* number of descriptors to apply*/ +static uint32_t num_rx_descriptor = RTE_TEST_RX_DESC_DEFAULT_ZCP; +static uint32_t num_tx_descriptor = RTE_TEST_TX_DESC_DEFAULT_ZCP; + +/* max ring descriptor, ixgbe, i40e, e1000 all are 4096. */ +#define MAX_RING_DESC 4096 + +struct vpool { + struct rte_mempool * pool; + struct rte_ring * ring; + uint32_t buf_size; +} vpool_array[MAX_QUEUES+MAX_QUEUES]; + /* Enable VM2VM communications. If this is disabled then the MAC address compare is skipped. */ -static uint32_t enable_vm2vm = 1; +typedef enum { + VM2VM_DISABLED = 0, + VM2VM_SOFTWARE = 1, + VM2VM_HARDWARE = 2, + VM2VM_LAST +} vm2vm_type; +static vm2vm_type vm2vm_mode = VM2VM_SOFTWARE; + +/* The type of host physical address translated from guest physical address. 
*/ +typedef enum { + PHYS_ADDR_CONTINUOUS = 0, + PHYS_ADDR_CROSS_SUBREG = 1, + PHYS_ADDR_INVALID = 2, + PHYS_ADDR_LAST +} hpa_type; + /* Enable stats. */ static uint32_t enable_stats = 0; /* Enable retries on RX. */ @@ -159,7 +217,7 @@ static uint32_t dev_index = 0; extern uint64_t VHOST_FEATURES; /* Default configuration for rx and tx thresholds etc. */ -static const struct rte_eth_rxconf rx_conf_default = { +static struct rte_eth_rxconf rx_conf_default = { .rx_thresh = { .pthresh = RX_PTHRESH, .hthresh = RX_HTHRESH, @@ -173,7 +231,7 @@ static const struct rte_eth_rxconf rx_conf_default = { * Controller and the DPDK ixgbe/igb PMD. Consider using other values for other * network controllers and/or network drivers. */ -static const struct rte_eth_txconf tx_conf_default = { +static struct rte_eth_txconf tx_conf_default = { .tx_thresh = { .pthresh = TX_PTHRESH, .hthresh = TX_HTHRESH, @@ -184,7 +242,7 @@ static const struct rte_eth_txconf tx_conf_de
[dpdk-dev] [PATCH v2 2/3] ixgbe: Implement queue start and stop functionality in IXGBE PMD
This patch implements queue start and stop functionality in IXGBE PMD; it also enables hardware loopback for VMDQ mode in IXGBE PMD. Signed-off-by: Ouyang Changchun --- lib/librte_pmd_ixgbe/ixgbe_ethdev.c | 4 + lib/librte_pmd_ixgbe/ixgbe_ethdev.h | 8 ++ lib/librte_pmd_ixgbe/ixgbe_rxtx.c | 233 ++-- lib/librte_pmd_ixgbe/ixgbe_rxtx.h | 6 + 4 files changed, 215 insertions(+), 36 deletions(-) diff --git a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c index 49ff0d1..62a6d77 100644 --- a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c +++ b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c @@ -275,6 +275,10 @@ static struct eth_dev_ops ixgbe_eth_dev_ops = { .vlan_tpid_set= ixgbe_vlan_tpid_set, .vlan_offload_set = ixgbe_vlan_offload_set, .vlan_strip_queue_set = ixgbe_vlan_strip_queue_set, + .rx_queue_start = ixgbe_dev_rx_queue_start, + .rx_queue_stop= ixgbe_dev_rx_queue_stop, + .tx_queue_start = ixgbe_dev_tx_queue_start, + .tx_queue_stop= ixgbe_dev_tx_queue_stop, .rx_queue_setup = ixgbe_dev_rx_queue_setup, .rx_queue_release = ixgbe_dev_rx_queue_release, .rx_queue_count = ixgbe_dev_rx_queue_count, diff --git a/lib/librte_pmd_ixgbe/ixgbe_ethdev.h b/lib/librte_pmd_ixgbe/ixgbe_ethdev.h index 7c6139b..ae52c8e 100644 --- a/lib/librte_pmd_ixgbe/ixgbe_ethdev.h +++ b/lib/librte_pmd_ixgbe/ixgbe_ethdev.h @@ -245,6 +245,14 @@ void ixgbe_dev_tx_init(struct rte_eth_dev *dev); void ixgbe_dev_rxtx_start(struct rte_eth_dev *dev); +int ixgbe_dev_rx_queue_start(struct rte_eth_dev *dev, uint16_t rx_queue_id); + +int ixgbe_dev_rx_queue_stop(struct rte_eth_dev *dev, uint16_t rx_queue_id); + +int ixgbe_dev_tx_queue_start(struct rte_eth_dev *dev, uint16_t tx_queue_id); + +int ixgbe_dev_tx_queue_stop(struct rte_eth_dev *dev, uint16_t tx_queue_id); + int ixgbevf_dev_rx_init(struct rte_eth_dev *dev); void ixgbevf_dev_tx_init(struct rte_eth_dev *dev); diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c index 55414b9..2a98051 100644 --- 
a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c +++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c @@ -1588,7 +1588,7 @@ ixgbe_recv_scattered_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, * descriptors should meet the following condition: * (num_ring_desc * sizeof(rx/tx descriptor)) % 128 == 0 */ -#define IXGBE_MIN_RING_DESC 64 +#define IXGBE_MIN_RING_DESC 32 #define IXGBE_MAX_RING_DESC 4096 /* @@ -1836,6 +1836,7 @@ ixgbe_dev_tx_queue_setup(struct rte_eth_dev *dev, txq->port_id = dev->data->port_id; txq->txq_flags = tx_conf->txq_flags; txq->ops = &def_txq_ops; + txq->start_tx_per_q= tx_conf->start_tx_per_q; /* * Modification to set VFTDT for virtual function if vf is detected @@ -2078,6 +2079,7 @@ ixgbe_dev_rx_queue_setup(struct rte_eth_dev *dev, rxq->crc_len = (uint8_t) ((dev->data->dev_conf.rxmode.hw_strip_crc) ? 0 : ETHER_CRC_LEN); rxq->drop_en = rx_conf->rx_drop_en; + rxq->start_rx_per_q= rx_conf->start_rx_per_q; /* * Allocate RX ring hardware descriptors. A memzone large enough to @@ -3025,6 +3027,14 @@ ixgbe_vmdq_rx_hw_configure(struct rte_eth_dev *dev) } + /* PFDMA Tx General Switch Control Enables VMDQ loopback */ + if (cfg->enable_loop_back){ + IXGBE_WRITE_REG(hw, IXGBE_PFDTXGSWC, IXGBE_PFDTXGSWC_VT_LBEN); + for(i = 0; i < RTE_IXGBE_VMTXSW_REGISTER_COUNT; i++) { + IXGBE_WRITE_REG(hw, IXGBE_VMTXSW(i), UINT32_MAX); + } + } + IXGBE_WRITE_FLUSH(hw); } @@ -3234,7 +3244,6 @@ ixgbe_dev_rx_init(struct rte_eth_dev *dev) uint32_t rxcsum; uint16_t buf_size; uint16_t i; - int ret; PMD_INIT_FUNC_TRACE(); hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private); @@ -3289,11 +3298,6 @@ ixgbe_dev_rx_init(struct rte_eth_dev *dev) for (i = 0; i < dev->data->nb_rx_queues; i++) { rxq = dev->data->rx_queues[i]; - /* Allocate buffers for descriptor rings */ - ret = ixgbe_alloc_rx_queue_mbufs(rxq); - if (ret) - return ret; - /* * Reset crc_len in case it was changed after queue setup by a * call to configure. 
@@ -3500,10 +3504,8 @@ ixgbe_dev_rxtx_start(struct rte_eth_dev *dev) struct igb_rx_queue *rxq; uint32_t txdctl; uint32_t dmatxctl; - uint32_t rxdctl; uint32_t rxctrl; uint16_t i; - int poll_ms; PMD_INIT_FUNC_TRACE(); hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private); @@ -3526,55 +3528,214 @@ ixgbe_dev_rxtx_start(struct rte_eth_dev *dev)
[dpdk-dev] [PATCH 2/3] ixgbe: Implement the functionality of administrative link up and down in IXGBE PMD
This patch implements the functionality of administrative link up and down in IXGBE PMD. Signed-off-by: Ouyang Changchun --- lib/librte_pmd_ixgbe/ixgbe_ethdev.c | 58 + 1 file changed, 58 insertions(+) diff --git a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c index 76f09af..b6ffad0 100644 --- a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c +++ b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c @@ -98,6 +98,8 @@ static int eth_ixgbe_dev_init(struct eth_driver *eth_drv, static int ixgbe_dev_configure(struct rte_eth_dev *dev); static int ixgbe_dev_start(struct rte_eth_dev *dev); static void ixgbe_dev_stop(struct rte_eth_dev *dev); +static int ixgbe_dev_admin_link_up(struct rte_eth_dev *dev); +static int ixgbe_dev_admin_link_down(struct rte_eth_dev *dev); static void ixgbe_dev_close(struct rte_eth_dev *dev); static void ixgbe_dev_promiscuous_enable(struct rte_eth_dev *dev); static void ixgbe_dev_promiscuous_disable(struct rte_eth_dev *dev); @@ -263,6 +265,8 @@ static struct eth_dev_ops ixgbe_eth_dev_ops = { .dev_configure= ixgbe_dev_configure, .dev_start= ixgbe_dev_start, .dev_stop = ixgbe_dev_stop, + .dev_admin_link_up= ixgbe_dev_admin_link_up, + .dev_admin_link_down = ixgbe_dev_admin_link_down, .dev_close= ixgbe_dev_close, .promiscuous_enable = ixgbe_dev_promiscuous_enable, .promiscuous_disable = ixgbe_dev_promiscuous_disable, @@ -1487,6 +1491,60 @@ ixgbe_dev_stop(struct rte_eth_dev *dev) } /* + * Link up device administratively: enable tx laser. 
+ */ +static int +ixgbe_dev_admin_link_up(struct rte_eth_dev *dev) +{ + struct ixgbe_hw *hw = + IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private); + if (hw->mac.type == ixgbe_mac_82599EB) { +#ifdef RTE_NIC_BYPASS + if (hw->device_id == IXGBE_DEV_ID_82599_BYPASS) { + /* Not suported in bypass mode */ + PMD_INIT_LOG(ERR, "\nAdmin link up is not supported by device id 0x%x\n", +hw->device_id); + return -ENOTSUP; + } +#endif + /* Turn on the laser */ + ixgbe_enable_tx_laser(hw); + return 0; + } + + PMD_INIT_LOG(ERR, "\nAdmin link up is not supported by device id 0x%x\n", +hw->device_id); + return -ENOTSUP; +} + +/* + * Link down device administratively: disable tx laser. + */ +static int +ixgbe_dev_admin_link_down(struct rte_eth_dev *dev) +{ + struct ixgbe_hw *hw = + IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private); + if (hw->mac.type == ixgbe_mac_82599EB) { +#ifdef RTE_NIC_BYPASS + if (hw->device_id == IXGBE_DEV_ID_82599_BYPASS) { + /* Not suported in bypass mode */ + PMD_INIT_LOG(ERR, "\nAdmin link down is not supported by device id 0x%x\n", +hw->device_id); + return -ENOTSUP; + } +#endif + /* Turn off the laser */ + ixgbe_disable_tx_laser(hw); + return 0; + } + + PMD_INIT_LOG(ERR, "\nAdmin link down is not supported by device id 0x%x\n", +hw->device_id); + return -ENOTSUP; +} + +/* * Reest and stop device. */ static void -- 1.9.0
[dpdk-dev] [PATCH 0/3] Support administrative link up and link down
This patch series contains the following 3 items:
1. Add API to support administrative link up and down.
2. Implement the functionality of administrative link up and down in IXGBE PMD.
3. Add command in testpmd to test the functionality of administrative link up and down of PMD.

Ouyang Changchun (3):
  Add API for supporting administrative link up and down.
  Implement the functionality of administrative link up and down in IXGBE PMD.
  Add command line to test the functionality of administrative link up and down of PMD in testpmd.

 app/test-pmd/cmdline.c | 78 +
 app/test-pmd/testpmd.c | 14 +++
 app/test-pmd/testpmd.h | 2 +
 lib/librte_ether/rte_ethdev.c | 38 ++
 lib/librte_ether/rte_ethdev.h | 34
 lib/librte_pmd_ixgbe/ixgbe_ethdev.c | 58 +++
 6 files changed, 224 insertions(+)

--
1.9.0
[dpdk-dev] [PATCH 1/3] ether: Add API to support administrative link up and down
This patch adds API to support administrative link up and down. Signed-off-by: Ouyang Changchun --- lib/librte_ether/rte_ethdev.c | 38 ++ lib/librte_ether/rte_ethdev.h | 34 ++ 2 files changed, 72 insertions(+) diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c index 0ddedfb..06a0896 100644 --- a/lib/librte_ether/rte_ethdev.c +++ b/lib/librte_ether/rte_ethdev.c @@ -796,6 +796,44 @@ rte_eth_dev_stop(uint8_t port_id) (*dev->dev_ops->dev_stop)(dev); } +int +rte_eth_dev_admin_link_up(uint8_t port_id) +{ + struct rte_eth_dev *dev; + + /* This function is only safe when called from the primary process +* in a multi-process setup*/ + PROC_PRIMARY_OR_ERR_RET(-E_RTE_SECONDARY); + + if (port_id >= nb_ports) { + PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id); + return (-EINVAL); + } + dev = &rte_eth_devices[port_id]; + + FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_admin_link_up, -ENOTSUP); + return (*dev->dev_ops->dev_admin_link_up)(dev); +} + +int +rte_eth_dev_admin_link_down(uint8_t port_id) +{ + struct rte_eth_dev *dev; + + /* This function is only safe when called from the primary process +* in a multi-process setup*/ + PROC_PRIMARY_OR_ERR_RET(-E_RTE_SECONDARY); + + if (port_id >= nb_ports) { + PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id); + return (-EINVAL); + } + dev = &rte_eth_devices[port_id]; + + FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_admin_link_down, -ENOTSUP); + return (*dev->dev_ops->dev_admin_link_down)(dev); +} + void rte_eth_dev_close(uint8_t port_id) { diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h index 2be6e4f..d33ff93 100644 --- a/lib/librte_ether/rte_ethdev.h +++ b/lib/librte_ether/rte_ethdev.h @@ -891,6 +891,12 @@ typedef int (*eth_dev_start_t)(struct rte_eth_dev *dev); typedef void (*eth_dev_stop_t)(struct rte_eth_dev *dev); /**< @internal Function used to stop a configured Ethernet device.
*/ +typedef int (*eth_dev_admin_link_up_t)(struct rte_eth_dev *dev); +/**< @internal Function used to link up a configured Ethernet device administratively. */ + +typedef int (*eth_dev_admin_link_down_t)(struct rte_eth_dev *dev); +/**< @internal Function used to link down a configured Ethernet device administratively. */ + typedef void (*eth_dev_close_t)(struct rte_eth_dev *dev); /**< @internal Function used to close a configured Ethernet device. */ @@ -1223,6 +1229,8 @@ struct eth_dev_ops { eth_dev_configure_tdev_configure; /**< Configure device. */ eth_dev_start_tdev_start; /**< Start device. */ eth_dev_stop_t dev_stop; /**< Stop device. */ + eth_dev_admin_link_up_tdev_admin_link_up; /**< Device link up administratively. */ + eth_dev_admin_link_down_t dev_admin_link_down; /**< device link down admininstratively. */ eth_dev_close_tdev_close; /**< Close device. */ eth_promiscuous_enable_t promiscuous_enable; /**< Promiscuous ON. */ eth_promiscuous_disable_t promiscuous_disable;/**< Promiscuous OFF. */ @@ -1696,6 +1704,32 @@ extern int rte_eth_dev_start(uint8_t port_id); */ extern void rte_eth_dev_stop(uint8_t port_id); + +/** + * Link up an Ethernet device administratively. + * + * The administrative device link up will re-enable the device rx/tx functionality + * after it is previously administrative device linked down. + * + * @param port_id + * The port identifier of the Ethernet device. + * @return + * - 0: Success, Ethernet device linked up administratively. + * - <0: Error code of the driver device link up function. + */ +extern int rte_eth_dev_admin_link_up(uint8_t port_id); + +/** + * Link down an Ethernet device administratively. + * The device rx/tx functionality will be disabled if success, + * and it can be re-enabled with a call to + * rte_eth_dev_admin_link_up() + * + * @param port_id + * The port identifier of the Ethernet device. + */ +extern int rte_eth_dev_admin_link_down(uint8_t port_id); + /** * Close an Ethernet device. 
The device cannot be restarted! * -- 1.9.0
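For readers who want to see the dispatch pattern of this patch in isolation, the following is a minimal, self-contained sketch. It does not use DPDK: every toy_* name is hypothetical, and the two macros PROC_PRIMARY_OR_ERR_RET and FUNC_PTR_OR_ERR_RET are replaced by plain checks. It only illustrates the idea visible in the diff: validate the port id, return -ENOTSUP when the driver did not fill in the callback, otherwise forward to the driver.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct toy_dev;
typedef int (*toy_link_op_t)(struct toy_dev *dev);

/* Per-driver callback table; a NULL entry means "not supported". */
struct toy_dev_ops {
	toy_link_op_t admin_link_up;
	toy_link_op_t admin_link_down;
};

struct toy_dev {
	const struct toy_dev_ops *ops;
	int tx_enabled; /* stands in for the TX laser state */
};

#define TOY_MAX_PORTS 2
static struct toy_dev toy_devices[TOY_MAX_PORTS];
static unsigned int toy_nb_ports;

/* Hypothetical "physical" driver: toggles the TX state. */
static int toy_laser_on(struct toy_dev *dev)  { dev->tx_enabled = 1; return 0; }
static int toy_laser_off(struct toy_dev *dev) { dev->tx_enabled = 0; return 0; }

static const struct toy_dev_ops toy_physical_ops = {
	.admin_link_up = toy_laser_on,
	.admin_link_down = toy_laser_off,
};

/* Hypothetical "virtual" driver: leaves both callbacks unset. */
static const struct toy_dev_ops toy_virtual_ops = { NULL, NULL };

/* Register port 0 as physical and port 1 as virtual. */
static void toy_setup(void)
{
	toy_devices[0].ops = &toy_physical_ops;
	toy_devices[0].tx_enabled = 1;
	toy_devices[1].ops = &toy_virtual_ops;
	toy_devices[1].tx_enabled = 1;
	toy_nb_ports = TOY_MAX_PORTS;
}

/* Generic layer: check the port, then dispatch or report -ENOTSUP. */
static int toy_admin_link_up(unsigned int port_id)
{
	struct toy_dev *dev;

	if (port_id >= toy_nb_ports)
		return -EINVAL;
	dev = &toy_devices[port_id];
	if (dev->ops == NULL || dev->ops->admin_link_up == NULL)
		return -ENOTSUP;
	return dev->ops->admin_link_up(dev);
}

static int toy_admin_link_down(unsigned int port_id)
{
	struct toy_dev *dev;

	if (port_id >= toy_nb_ports)
		return -EINVAL;
	dev = &toy_devices[port_id];
	if (dev->ops == NULL || dev->ops->admin_link_down == NULL)
		return -ENOTSUP;
	return dev->ops->admin_link_down(dev);
}
```

In this sketch a driver that does not implement the operation needs no code at all: the generic layer reports -ENOTSUP on its behalf, which is presumably why the series only touches the IXGBE PMD.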
[dpdk-dev] [PATCH 3/3] testpmd: Add commands to test administrative link up and down of PMD
This patch adds commands to test the functionality of administrative link up and down of PMD in testpmd. Signed-off-by: Ouyang Changchun --- app/test-pmd/cmdline.c | 78 ++ app/test-pmd/testpmd.c | 14 + app/test-pmd/testpmd.h | 2 ++ 3 files changed, 94 insertions(+) diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c index 6030192..9dcf475 100644 --- a/app/test-pmd/cmdline.c +++ b/app/test-pmd/cmdline.c @@ -3844,6 +3844,82 @@ cmdline_parse_inst_t cmd_start_tx_first = { }, }; +/* *** LINK UP ADMINISTRATIVELY *** */ +struct cmd_admin_link_up_result { + cmdline_fixed_string_t admin; + cmdline_fixed_string_t link_up; + cmdline_fixed_string_t port; + uint8_t port_id; +}; + +cmdline_parse_token_string_t cmd_admin_link_up_admin = + TOKEN_STRING_INITIALIZER(struct cmd_admin_link_up_result, admin, "admin"); +cmdline_parse_token_string_t cmd_admin_link_up_link_up = + TOKEN_STRING_INITIALIZER(struct cmd_admin_link_up_result, link_up, "link-up"); +cmdline_parse_token_string_t cmd_admin_link_up_port = + TOKEN_STRING_INITIALIZER(struct cmd_admin_link_up_result, port, "port"); +cmdline_parse_token_num_t cmd_admin_link_up_port_id = + TOKEN_NUM_INITIALIZER(struct cmd_admin_link_up_result, port_id, UINT8); + +static void cmd_admin_link_up_parsed(__attribute__((unused)) void *parsed_result, +__attribute__((unused)) struct cmdline *cl, +__attribute__((unused)) void *data) +{ + struct cmd_admin_link_up_result *res = parsed_result; + dev_admin_link_up(res->port_id); +} + +cmdline_parse_inst_t cmd_admin_link_up = { + .f = cmd_admin_link_up_parsed, + .data = NULL, + .help_str = "admin link-up port (port id)", + .tokens = { + (void *)&cmd_admin_link_up_admin, + (void *)&cmd_admin_link_up_link_up, + (void *)&cmd_admin_link_up_port, + (void *)&cmd_admin_link_up_port_id, + NULL, + }, +}; + +/* *** LINK DOWN ADMINISTRATIVELY *** */ +struct cmd_admin_link_down_result { + cmdline_fixed_string_t admin; + cmdline_fixed_string_t link_down; + cmdline_fixed_string_t port; + uint8_t port_id; 
+}; + +cmdline_parse_token_string_t cmd_admin_link_down_admin = + TOKEN_STRING_INITIALIZER(struct cmd_admin_link_down_result, admin, "admin"); +cmdline_parse_token_string_t cmd_admin_link_down_link_down = + TOKEN_STRING_INITIALIZER(struct cmd_admin_link_down_result, link_down, "link-down"); +cmdline_parse_token_string_t cmd_admin_link_down_port = + TOKEN_STRING_INITIALIZER(struct cmd_admin_link_down_result, port, "port"); +cmdline_parse_token_num_t cmd_admin_link_down_port_id = + TOKEN_NUM_INITIALIZER(struct cmd_admin_link_down_result, port_id, UINT8); + +static void cmd_admin_link_down_parsed(__attribute__((unused)) void *parsed_result, +__attribute__((unused)) struct cmdline *cl, +__attribute__((unused)) void *data) +{ + struct cmd_admin_link_down_result *res = parsed_result; + dev_admin_link_down(res->port_id); +} + +cmdline_parse_inst_t cmd_admin_link_down = { + .f = cmd_admin_link_down_parsed, + .data = NULL, + .help_str = "admin link-down port (port id)", + .tokens = { + (void *)&cmd_admin_link_down_admin, + (void *)&cmd_admin_link_down_link_down, + (void *)&cmd_admin_link_down_port, + (void *)&cmd_admin_link_down_port_id, + NULL, + }, +}; + /* *** SHOW CFG *** */ struct cmd_showcfg_result { cmdline_fixed_string_t show; @@ -6055,6 +6131,8 @@ cmdline_parse_ctx_t main_ctx[] = { (cmdline_parse_inst_t *)&cmd_showcfg, (cmdline_parse_inst_t *)&cmd_start, (cmdline_parse_inst_t *)&cmd_start_tx_first, + (cmdline_parse_inst_t *)&cmd_admin_link_up, + (cmdline_parse_inst_t *)&cmd_admin_link_down, (cmdline_parse_inst_t *)&cmd_reset, (cmdline_parse_inst_t *)&cmd_set_numbers, (cmdline_parse_inst_t *)&cmd_set_txpkts, diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c index bc38305..9e9997f 100644 --- a/app/test-pmd/testpmd.c +++ b/app/test-pmd/testpmd.c @@ -1208,6 +1208,20 @@ stop_packet_forwarding(void) test_done = 1; } +void +dev_admin_link_up(portid_t pid) +{ + if (rte_eth_dev_admin_link_up((uint8_t)pid) < 0) + printf("\nAdmin link up fail.\n"); +} + +void 
+dev_admin_link_down(portid_t pid) +{ + if (rte_eth_dev_admin_link_down((uint8_t)pid) < 0) + printf("\nAdmin link down fail.\n"); +} + static int all_ports_started(void)
[dpdk-dev] [PATCH 0/3] Support administrative link up and link down
Hi Ivan,

It is a long story, but in short: some customers need to repeatedly start (rte_dev_start) and stop (rte_dev_stop) a port for RX and TX. They find that after several start/stop cycles, RX and TX no longer work well even though the port is started, and the packet error count increases.

To resolve this growing error count, and to keep the port working after repeated start and stop, we need a new API. After discussion, we settled on these 2 APIs: admin link up and admin link down.

Is there any difference between using "dev_link_start/stop" and "dev_link_up/down"? To me, admin_link_up/down is better than dev_link_start/stop. If most people think we need to change the name, it is OK to rename it.

I don't think we need it in non-physical PMDs, so there is no implementation in the virtio PMD.

Thanks
Changchun

-----Original Message-----
From: Ivan Boule [mailto:ivan.bo...@6wind.com]
Sent: Thursday, May 22, 2014 9:17 PM
To: Ouyang, Changchun; dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH 0/3] Support administrative link up and link down

On 05/22/2014 08:11 AM, Ouyang Changchun wrote:
> This patch series contain the following 3 items:
> 1. Add API to support administrative link up and down.
> 2. Implement the functionality of administrative link up and down in IXGBE PMD.
> 3. Add command in testpmd to test the functionality of administrative link up and down of PMD.
>
> Ouyang Changchun (3):
>   Add API for supporting administrative link up and down.
>   Implement the functionality of administrative link up and down in IXGBE PMD.
>   Add command line to test the functionality of administrative link up and down of PMD in testpmd.
>
>  app/test-pmd/cmdline.c | 78 +
>  app/test-pmd/testpmd.c | 14 +++
>  app/test-pmd/testpmd.h | 2 +
>  lib/librte_ether/rte_ethdev.c | 38 ++
>  lib/librte_ether/rte_ethdev.h | 34
>  lib/librte_pmd_ixgbe/ixgbe_ethdev.c | 58 +++
>  6 files changed, 224 insertions(+)
>

Hi Changchun,

The 2 functions "rte_eth_dev_admin_link_up" and "rte_eth_dev_admin_link_down" don't have an equivalent in the Linux kernel, thus I am wondering what their effective usage is from a network application perspective. Could you briefly explain in which use cases these functions can be used?

By the way, it is not easy to infer the exact semantics of these 2 functions from their names. In particular, I do not see what the term "admin" brings to the understanding of their role. If it is to suggest that these functions are intended to force the link to a state different from its initial [self-detected] state, then the term "force" would be more appropriate.

Otherwise, if these functions eventually appear to be mandatory, I suggest renaming them "rte_eth_dev_link_start" and "rte_eth_dev_link_stop" respectively, and applying the same naming conventions in the 2 other patches.

It might also be worth documenting in the comment section of the prototype of these 2 functions whether it makes sense or not to support a notion of link that can be dynamically started or stopped in non-physical PMDs (vmxnet3, virtio, etc).

Regards,
Ivan

--
Ivan Boule
6WIND Development Engineer
[dpdk-dev] [PATCH] [PMD] [VHOST] Revert unnecessary definition and fix wrong referring in user space vhost zero copy patches
Hi Thomas,

Thanks very much for your guidance!

Best regards,
Changchun

-----Original Message-----
From: Thomas Monjalon [mailto:thomas.monja...@6wind.com]
Sent: Thursday, May 22, 2014 11:29 PM
To: Ouyang, Changchun
Cc: dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH] [PMD] [VHOST] Revert unnecessary definition and fix wrong referring in user space vhost zero copy patches

Hi Changchun,

Please, it is preferred to answer below the question.

2014-05-20 01:14, Ouyang, Changchun:
> So "[PMD] [VHOST]" in the title should be removed in the cover letter,
> right? And in each separate patch letter, it could use "ixgbe:" or
> "examples/vhost:", instead of "[PMD] [VHOST]" Is it right?

Yes, you did right in the v2.

Thanks
--
Thomas

> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Tuesday, May 20, 2014 12:00 AM
> To: Ouyang, Changchun
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] [PMD] [VHOST] Revert unnecessary definition
> and fix wrong referring in user space vhost zero copy patches
>
> Hi Changchun,
>
> 2014-05-19 23:09, Ouyang Changchun:
> > 1. Revert the change of metadata macro definition for referring to
> > headroom space in mbuf; 2. Fix wrongly referring to RX queues number
> > in TX queues start/stop function.
> >
> > Signed-off-by: Ouyang Changchun
>
> You are fixing commits which are not yet applied.
> Please merge and re-send the whole series by suffixing with "v2".
>
> The title was "[PATCH 0/3] [PMD] [VHOST] *** Support zero copy RX/TX in user
> space vhost ***" It should be "[PATCH v2 0/3] Support zero copy RX/TX in
> user space vhost"
>
> Other notes:
> - please split API and ixgbe changes
> - set a significant title to each patch
> - use prefixes like "ethdev:", "ixgbe:" or "examples/vhost:"
>
> In general, this page is a good help:
> http://dpdk.org/dev#send
>
> Thanks
> --
> Thomas
[dpdk-dev] [PATCH 0/3] Support setting TX rate for queue and VF
This patch series contains the following 3 items:
1. Add API to support setting TX rate for a queue or a VF.
2. Implement the functionality of setting TX rate for queue or VF in IXGBE PMD.
3. Add commands in testpmd to test the functionality of setting TX rate for queue or VF.

Ouyang Changchun (3):
  Add API to support set TX rate for a queue and VF.
  Implement the functionality of setting TX rate for queue or VF in IXGBE PMD.
  Add commands in testpmd to test the functionality of setting TX rate for queue or VF.

 app/test-pmd/cmdline.c | 153
 app/test-pmd/config.c | 44 +++
 app/test-pmd/testpmd.h | 2 +
 lib/librte_ether/rte_ethdev.c | 63 +++
 lib/librte_ether/rte_ethdev.h | 51
 lib/librte_pmd_ixgbe/ixgbe_ethdev.c | 110 ++
 lib/librte_pmd_ixgbe/ixgbe_ethdev.h | 10 ++-
 7 files changed, 432 insertions(+), 1 deletion(-)

--
1.9.0
[dpdk-dev] [PATCH 2/3] ixgbe: Implement the functionality of setting TX rate for queue or VF in IXGBE PMD
This patch implements the functionality of setting TX rate for queue or VF in IXGBE PMD. Signed-off-by: Ouyang Changchun --- lib/librte_pmd_ixgbe/ixgbe_ethdev.c | 110 lib/librte_pmd_ixgbe/ixgbe_ethdev.h | 10 +++- 2 files changed, 119 insertions(+), 1 deletion(-) diff --git a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c index c9b5fe4..7a61ab0 100644 --- a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c +++ b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c @@ -87,6 +87,8 @@ #define IXGBE_LINK_UP_CHECK_TIMEOUT 1000 /* ms */ #define IXGBE_VMDQ_NUM_UC_MAC 4096 /* Maximum nb. of UC MAC addr. */ +#define IXGBE_MMW_SIZE_DEFAULT0x4 +#define IXGBE_MMW_SIZE_JUMBO_FRAME0x14 #define IXGBEVF_PMD_NAME "rte_ixgbevf_pmd" /* PMD name */ @@ -182,6 +184,9 @@ static int ixgbe_mirror_rule_set(struct rte_eth_dev *dev, static int ixgbe_mirror_rule_reset(struct rte_eth_dev *dev, uint8_t rule_id); +static int ixgbe_set_queue_rate_limit(struct rte_eth_dev *dev, uint16_t queue_idx, uint16_t tx_rate); +static int ixgbe_set_vf_rate_limit(struct rte_eth_dev *dev, uint16_t vf, uint16_t tx_rate, uint64_t q_msk); + /* * Define VF Stats MACRO for Non "cleared on read" register */ @@ -280,6 +285,8 @@ static struct eth_dev_ops ixgbe_eth_dev_ops = { .set_vf_rx= ixgbe_set_pool_rx, .set_vf_tx= ixgbe_set_pool_tx, .set_vf_vlan_filter = ixgbe_set_pool_vlan_filter, + .set_queue_rate_limit = ixgbe_set_queue_rate_limit, + .set_vf_rate_limit= ixgbe_set_vf_rate_limit, .fdir_add_signature_filter= ixgbe_fdir_add_signature_filter, .fdir_update_signature_filter = ixgbe_fdir_update_signature_filter, .fdir_remove_signature_filter = ixgbe_fdir_remove_signature_filter, @@ -1288,10 +1295,13 @@ ixgbe_dev_start(struct rte_eth_dev *dev) { struct ixgbe_hw *hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private); + struct ixgbe_vf_info *vfinfo = + *IXGBE_DEV_PRIVATE_TO_P_VFDATA(dev->data->dev_private); int err, link_up = 0, negotiate = 0; uint32_t speed = 0; int mask = 0; int status; + uint16_t vf, idx; 
PMD_INIT_FUNC_TRACE(); @@ -1408,6 +1418,15 @@ skip_link_setup: goto error; } + /* Restore vf rate limit */ + if (vfinfo != NULL) { + for (vf = 0; vf < dev->pci_dev->max_vfs; vf++) + for (idx = 0; idx < IXGBE_MAX_QUEUE_NUM_PER_VF; idx++) + if (vfinfo[vf].tx_rate[idx] != 0) + ixgbe_set_vf_rate_limit(dev, vf, + vfinfo[vf].tx_rate[idx], 1 << idx); + } + ixgbe_restore_statistics_mapping(dev); return (0); @@ -3062,6 +3081,97 @@ ixgbe_mirror_rule_reset(struct rte_eth_dev *dev, uint8_t rule_id) return 0; } +static int ixgbe_set_queue_rate_limit(struct rte_eth_dev *dev, uint16_t queue_idx, uint16_t tx_rate) +{ + struct ixgbe_hw *hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private); + uint32_t rf_dec, rf_int; + uint32_t bcnrc_val; + uint16_t link_speed = dev->data->dev_link.link_speed; + + if (queue_idx >= hw->mac.max_tx_queues) + return -EINVAL; + + if (tx_rate != 0) { + /* Calculate the rate factor values to set */ + rf_int = (uint32_t)link_speed / (uint32_t)tx_rate; + rf_dec = (uint32_t)link_speed % (uint32_t)tx_rate; + rf_dec = (rf_dec << IXGBE_RTTBCNRC_RF_INT_SHIFT) / tx_rate; + + bcnrc_val = IXGBE_RTTBCNRC_RS_ENA; + bcnrc_val |= ((rf_int << IXGBE_RTTBCNRC_RF_INT_SHIFT) & + IXGBE_RTTBCNRC_RF_INT_MASK_M); + bcnrc_val |= (rf_dec & IXGBE_RTTBCNRC_RF_DEC_MASK); + } else { + bcnrc_val = 0; + } + + /* +* Set global transmit compensation time to the MMW_SIZE in RTTBCNRM +* register. MMW_SIZE=0x014 if 9728-byte jumbo is supported, otherwise set as 0x4. 
+*/ + if ((dev->data->dev_conf.rxmode.jumbo_frame == 1) && + (dev->data->dev_conf.rxmode.max_rx_pkt_len >= IXGBE_MAX_JUMBO_FRAME_SIZE)) + IXGBE_WRITE_REG(hw, IXGBE_RTTBCNRM, IXGBE_MMW_SIZE_JUMBO_FRAME); + else + IXGBE_WRITE_REG(hw, IXGBE_RTTBCNRM, IXGBE_MMW_SIZE_DEFAULT); + + /* Set RTTBCNRC of queue X */ + IXGBE_WRITE_REG(hw, IXGBE_RTTDQSEL, queue_idx); + IXGBE_WRITE_REG(hw, IXGBE_RTTBCNRC, bcnrc_val); + IXGBE_WRITE_FLUSH(hw); + + return 0; +} + +static int ixgbe_set_vf_rate_limit(struct rte_eth_dev *dev, uint16_t vf, uint16_t tx_rate, uint64_t q_msk) +{ + struct ixgbe_hw *hw = IXGBE_DEV_PRI
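The rate factor arithmetic in ixgbe_set_queue_rate_limit() above is a fixed-point division: the integer part of link_speed / tx_rate goes into the upper field and the fractional part, scaled by 2^RF_INT_SHIFT, into the lower field. The following standalone sketch reproduces only that arithmetic. The SK_* constants are assumptions made for this illustration (a 14-bit fractional field, a 10-bit integer field above it, and an enable bit at bit 31); they are not taken from the patch, so check them against the ixgbe register definitions before relying on them.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field layout, local to this sketch (not copied from the patch). */
#define SK_RF_INT_SHIFT 14u
#define SK_RF_DEC_MASK  0x00003FFFu
#define SK_RF_INT_MASK  (SK_RF_DEC_MASK << SK_RF_INT_SHIFT)
#define SK_RS_ENA       0x80000000u

/*
 * Build the rate-limiter register value for a queue.
 * link_speed and tx_rate are in the same unit (e.g. Mbps).
 * A tx_rate of 0 disables the limiter, as in the patch.
 */
static uint32_t sketch_rate_factor(uint16_t link_speed, uint16_t tx_rate)
{
	uint32_t rf_int, rf_dec;

	if (tx_rate == 0)
		return 0;

	/* Integer part of link_speed / tx_rate. */
	rf_int = (uint32_t)link_speed / (uint32_t)tx_rate;
	/* Fractional part: remainder scaled to 14 fractional bits. */
	rf_dec = (uint32_t)link_speed % (uint32_t)tx_rate;
	rf_dec = (rf_dec << SK_RF_INT_SHIFT) / tx_rate;

	return SK_RS_ENA |
	       ((rf_int << SK_RF_INT_SHIFT) & SK_RF_INT_MASK) |
	       (rf_dec & SK_RF_DEC_MASK);
}
```

For example, a 10000 Mbps link limited to 4000 Mbps gives a factor of 2.5: rf_int is 2 and rf_dec is 0.5 * 2^14 = 8192, so under the assumed layout the value is 0x8000A000.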
[dpdk-dev] [PATCH 1/3] ether: Add API to support setting TX rate for queue and VF
This patch adds API to support setting TX rate for a queue and a VF. Signed-off-by: Ouyang Changchun --- lib/librte_ether/rte_ethdev.c | 63 +++ lib/librte_ether/rte_ethdev.h | 51 +++ 2 files changed, 114 insertions(+) diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c index a5727dd..ff3a9b6 100644 --- a/lib/librte_ether/rte_ethdev.c +++ b/lib/librte_ether/rte_ethdev.c @@ -1913,6 +1913,69 @@ rte_eth_dev_set_vf_vlan_filter(uint8_t port_id, uint16_t vlan_id, vf_mask,vlan_on); } +int rte_eth_set_queue_rate_limit(uint8_t port_id, uint16_t queue_idx, uint16_t tx_rate) +{ + struct rte_eth_dev *dev; + struct rte_eth_dev_info dev_info; + struct rte_eth_link link; + + if (port_id >= nb_ports) { + PMD_DEBUG_TRACE("set queue rate limit:invalid port id=%d\n", port_id); + return (-ENODEV); + } + + dev = &rte_eth_devices[port_id]; + rte_eth_dev_info_get(port_id, &dev_info); + link = dev->data->dev_link; + + if (queue_idx > dev_info.max_tx_queues) { + PMD_DEBUG_TRACE("set queue rate limit:port %d: invalid queue id=%d\n", port_id, queue_idx); + return (-EINVAL); + } + + if(tx_rate > link.link_speed) { + PMD_DEBUG_TRACE("set queue rate limit:invalid tx_rate=%d, bigger than link speed= %d\n", + tx_rate, link_speed); + return (-EINVAL); + } + + FUNC_PTR_OR_ERR_RET(*dev->dev_ops->set_queue_rate_limit, -ENOTSUP); + return (*dev->dev_ops->set_queue_rate_limit)(dev, queue_idx, tx_rate); +} + +int rte_eth_set_vf_rate_limit(uint8_t port_id, uint16_t vf, uint16_t tx_rate, uint64_t q_msk) +{ + struct rte_eth_dev *dev; + struct rte_eth_dev_info dev_info; + struct rte_eth_link link; + + if(q_msk == 0) + return 0; + + if (port_id >= nb_ports) { + PMD_DEBUG_TRACE("set VF rate limit:invalid port id=%d\n", port_id); + return (-ENODEV); + } + + dev = &rte_eth_devices[port_id]; + rte_eth_dev_info_get(port_id, &dev_info); + link = dev->data->dev_link; + + if (vf > dev_info.max_vfs) { + PMD_DEBUG_TRACE("set VF rate limit:port %d: invalid vf id=%d\n", port_id, vf); + return 
(-EINVAL); + } + + if(tx_rate > link.link_speed) { + PMD_DEBUG_TRACE("set VF rate limit:invalid tx_rate=%d, bigger than link speed= %d\n", + tx_rate, link_speed); + return (-EINVAL); + } + + FUNC_PTR_OR_ERR_RET(*dev->dev_ops->set_vf_rate_limit, -ENOTSUP); + return (*dev->dev_ops->set_vf_rate_limit)(dev, vf, tx_rate, q_msk); +} + int rte_eth_mirror_rule_set(uint8_t port_id, struct rte_eth_vmdq_mirror_conf *mirror_conf, diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h index d5ea46b..445d40a 100644 --- a/lib/librte_ether/rte_ethdev.h +++ b/lib/librte_ether/rte_ethdev.h @@ -1012,6 +1012,17 @@ typedef int (*eth_set_vf_vlan_filter_t)(struct rte_eth_dev *dev, uint8_t vlan_on); /**< @internal Set VF VLAN pool filter */ +typedef int (*eth_set_queue_rate_limit_t)(struct rte_eth_dev *dev, + uint16_t queue_idx, + uint16_t tx_rate); +/**< @internal Set queue TX rate */ + +typedef int (*eth_set_vf_rate_limit_t)(struct rte_eth_dev *dev, + uint16_t vf, + uint16_t tx_rate, + uint64_t q_msk); +/**< @internal Set VF TX rate */ + typedef int (*eth_mirror_rule_set_t)(struct rte_eth_dev *dev, struct rte_eth_vmdq_mirror_conf *mirror_conf, uint8_t rule_id, @@ -1119,6 +1130,8 @@ struct eth_dev_ops { eth_set_vf_rx_tset_vf_rx; /**< enable/disable a VF receive */ eth_set_vf_tx_tset_vf_tx; /**< enable/disable a VF transmit */ eth_set_vf_vlan_filter_t set_vf_vlan_filter; /**< Set VF VLAN filter */ + eth_set_queue_rate_limit_t set_queue_rate_limit; /**< Set queue rate limit */ + eth_set_vf_rate_limit_tset_vf_rate_limit; /**< Set VF rate limit */ /** Add a signature filter. */ fdir_add_signature_filter_t fdir_add_signature_filter; @@ -2561,6 +2574,44 @@ int rte_eth_mirror_rule_reset(uint8_t port_id, uint8_t rule_id); /** + * Set the rate limitation for a queue on an Ethernet device. + * + * @param port_id + * The port identifier of the Ethernet device. +
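The argument checks that rte_eth_set_queue_rate_limit() performs before calling into the driver can be sketched on their own as follows. This is not the patch's code: struct sketch_port_info and sketch_check_queue_rate() are hypothetical names, and the sketch makes two deliberate choices that differ from the hunk above. It rejects queue_idx == max_tx_queues (the hunk uses ">", which would let that index through, assuming queue ids are zero-based), and it reads the link speed from a struct field (the hunk's trace calls pass a bare link_speed, which does not appear to be declared in those functions).

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical summary of what the API layer knows about a port. */
struct sketch_port_info {
	uint16_t max_tx_queues; /* number of TX queues, ids are 0..max-1 */
	uint16_t link_speed;    /* current link speed, same unit as tx_rate */
};

/*
 * Return 0 if the request is acceptable, -EINVAL otherwise.
 * A tx_rate equal to the link speed is allowed; anything above is not.
 */
static int sketch_check_queue_rate(const struct sketch_port_info *info,
				   uint16_t queue_idx, uint16_t tx_rate)
{
	if (queue_idx >= info->max_tx_queues)
		return -EINVAL;
	if (tx_rate > info->link_speed)
		return -EINVAL;
	return 0;
}
```

The same shape would apply to the VF variant, with the VF index checked against the number of VFs instead of the number of queues.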
[dpdk-dev] [PATCH 3/3] testpmd: Add commands to test the functionality of setting TX rate for queue or VF
This patch adds commands in testpmd to test the functionality of setting TX rate for queue or VF. Signed-off-by: Ouyang Changchun --- app/test-pmd/cmdline.c | 153 + app/test-pmd/config.c | 44 ++ app/test-pmd/testpmd.h | 2 + 3 files changed, 199 insertions(+) diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c index b3824f9..f85a275 100644 --- a/app/test-pmd/cmdline.c +++ b/app/test-pmd/cmdline.c @@ -343,6 +343,12 @@ static void cmd_help_long_parsed(void *parsed_result, "MPE:accepts all multicast packets\n\n" "Enable/Disable a VF receive mode of a port\n\n" + "set port (port_id) queue (queue_id) rate (rate_num) \n" + "Set rate limit for a queue of a port\n\n" + + "set port (port_id) vf (vf_id) rate (rate_num) queue_mask (queue_mask_value)\n" + "Set rate limit for queues in VF of a port\n\n" + "set port (port_id) mirror-rule (rule_id)" "(pool-mirror|vlan-mirror)\n" " (poolmask|vlanid[,vlanid]*) dst-pool (pool_id) (on|off)\n" @@ -4790,6 +4796,151 @@ cmdline_parse_inst_t cmd_vf_rxvlan_filter = { }, }; +/* *** SET RATE LIMIT FOR A QUEUE OF A PORT *** */ +struct cmd_queue_rate_limit_result { + cmdline_fixed_string_t set; + cmdline_fixed_string_t port; + uint8_t port_num; + cmdline_fixed_string_t queue; + uint8_t queue_num; + cmdline_fixed_string_t rate; + uint16_t rate_num; +}; + +static void cmd_queue_rate_limit_parsed(void *parsed_result, + __attribute__((unused)) struct cmdline *cl, + __attribute__((unused)) void *data) +{ + struct cmd_queue_rate_limit_result *res = parsed_result; + int ret = 0; + + if ((strcmp(res->set, "set") == 0) && (strcmp(res->port, "port") == 0) + && (strcmp(res->queue, "queue") == 0) && (strcmp(res->rate, "rate") == 0)) + ret = set_queue_rate_limit(res->port_num, res->queue_num, + res->rate_num); + if(ret < 0) + printf("queue_rate_limit_cmd error: (%s)\n", strerror(-ret)); + +} + +cmdline_parse_token_string_t cmd_queue_rate_limit_set = + TOKEN_STRING_INITIALIZER(struct cmd_queue_rate_limit_result, + set,"set"); 
+cmdline_parse_token_string_t cmd_queue_rate_limit_port = + TOKEN_STRING_INITIALIZER(struct cmd_queue_rate_limit_result, + port,"port"); +cmdline_parse_token_num_t cmd_queue_rate_limit_portnum = + TOKEN_NUM_INITIALIZER(struct cmd_queue_rate_limit_result, + port_num, UINT8); +cmdline_parse_token_string_t cmd_queue_rate_limit_queue = + TOKEN_STRING_INITIALIZER(struct cmd_queue_rate_limit_result, + queue,"queue"); +cmdline_parse_token_num_t cmd_queue_rate_limit_queuenum = + TOKEN_NUM_INITIALIZER(struct cmd_queue_rate_limit_result, + queue_num, UINT8); +cmdline_parse_token_string_t cmd_queue_rate_limit_rate = + TOKEN_STRING_INITIALIZER(struct cmd_queue_rate_limit_result, + rate,"rate"); +cmdline_parse_token_num_t cmd_queue_rate_limit_ratenum = + TOKEN_NUM_INITIALIZER(struct cmd_queue_rate_limit_result, + rate_num, UINT16); + +cmdline_parse_inst_t cmd_queue_rate_limit = { + .f = cmd_queue_rate_limit_parsed, + .data = (void *)0, + .help_str = "set port X queue Y rate Z:(X = port number," + "Y = queue number,Z = rate number)set rate limit for a queue on port X", + .tokens = { + (void *)&cmd_queue_rate_limit_set, + (void *)&cmd_queue_rate_limit_port, + (void *)&cmd_queue_rate_limit_portnum, + (void *)&cmd_queue_rate_limit_queue, + (void *)&cmd_queue_rate_limit_queuenum, + (void *)&cmd_queue_rate_limit_rate, + (void *)&cmd_queue_rate_limit_ratenum, + NULL, + }, +}; + + +/* *** SET RATE LIMIT FOR A VF OF A PORT *** */ +struct cmd_vf_rate_limit_result { + cmdline_fixed_string_t set; + cmdline_fixed_string_t port; + uint8_t port_num; + cmdline_fixed_string_t vf; + uint8_t vf_num; + cmdline_fixed_string_t rate; + uint16_t rate_num; + cmdline_fixed_string_t q_msk; + uint64_t q_msk_val; +}; + +static void cmd_vf_rate_limit_parsed(void *parsed_result, + __attribute__((unused)) st
[dpdk-dev] [PATCH 0/3] Support administrative link up and link down
Hi Ivan,

To some extent, I agree with you. But the customer hopes DPDK can provide an interface like "ifconfig up" and "ifconfig down" in Linux. They can invoke such an interface in the user application to stop and start the device repeatedly, and make sure RX and TX work fine after each start. I think it is not necessary to really start and stop the device each time; it is enough to start and stop the RX and TX functions, so the straightforward method is to enable and disable the TX laser in ixgbe.

At the ether level, however, we need a more generic API name, hence rte_eth_dev_admin_link_up/down; enable_tx_laser is not suitable there. Enabling and disabling the TX laser is just the way ixgbe fulfills administrative link up and link down. Maybe Fortville and future-generation NICs will use other ways to fulfill admin_link_up/down.

Thanks and regards,
Changchun

-----Original Message-----
From: Ivan Boule [mailto:ivan.bo...@6wind.com]
Sent: Thursday, May 22, 2014 11:31 PM
To: Ouyang, Changchun; dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH 0/3] Support administrative link up and link down

Hi Changchun,

On 05/22/2014 04:44 PM, Ouyang, Changchun wrote:
> Hi Ivan
> For this one, it seems long story for that...
> In short,
> Some customer have such kind of requirement, they want to repeatedly
> start(rte_dev_start) and stop(rte_dev_stop) the port for RX and TX,
> but they find after several times start and stop, the RX and TX can't work
> well even the port starts, and the packets error number increase.
>
> To resolve this error number increase issue, and let port work fine
> even after repeatedly start and stop, We need a new API to do it, after
> discussing, we have these 2 API, admin link up and admin link down.

If I understand well, this "feature" is not needed by itself, but only as a work-around to address issues when repeatedly invoking the functions ixgbe_dev_stop and ixgbe_dev_start.
Do such issues appear when performing the same operations with the Linux kernel driver?

Anyway, I suppose that such functions have to be automatically invoked by the same code of the network application that invokes the functions ixgbe_dev_stop and ixgbe_dev_start (said differently, there is no need for manual assistance!)

In that case, would it not be possible - and highly preferable - to directly invoke the functions ixgbe_disable_tx_laser and, then, ixgbe_enable_tx_laser from the appropriate step during the execution of the function ixgbe_dev_start(), waiting for some appropriate delays between the two operations, if so needed?

Regards,
Ivan

>
> Any difference if use " dev_link_start/stop" or " dev_link_up/down"?
> to me, admin_link_up/down is better than dev_link_start/stop,
>
> If most people think we need change the name, it is ok to rename it.
>
> I don't think we need it in non-physical PMDs. So no implementation in virtio
> PMD.
>
> Thanks
> Changchun
>
>
> -----Original Message-----
> From: Ivan Boule [mailto:ivan.boule at 6wind.com]
> Sent: Thursday, May 22, 2014 9:17 PM
> To: Ouyang, Changchun; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 0/3] Support administrative link up and
> link down
>
> On 05/22/2014 08:11 AM, Ouyang Changchun wrote:
>> This patch series contain the following 3 items:
>> 1. Add API to support administrative link up and down.
>> 2. Implement the functionality of administrative link up and down in IXGBE
>> PMD.
>> 3. Add command in testpmd to test the functionality of administrative link
>> up and down of PMD.
>> ...
> Hi Changchun,
>
> The 2 functions "rte_eth_dev_admin_link_up" and "rte_eth_dev_admin_link_down"
> don't have an equivalent in the Linux kernel, thus I am wondering what is
> their effective usage from a network application perspective.
> Could you briefly explain in which use case these functions can be used for?
>
> By the way, it's not completely evident to infer the exact semantics of these
> 2 functions from their name.
> In particular, I do not see what the term "admin" brings to the understanding
> of their role. If it is to suggest that these functions are intended to force
> the link to a different state of its initial [self-detected] state, then the
> term "force" would be more appropriate.
>
> Otherwise, if eventually these functions appear to be mandatory, I suggest to
> rename them "rte_eth_dev_link_start" and "rte_eth_dev_link_stop"
> respectively, and to apply the same naming conventions in the 2 other patches.
>
> It might also be worth documenting in the comment section of the prototype of
> these 2 functions whether it makes sense or not to support a notion of link
> that can be dynamically started or stopped in non-physical PMDs (vmxnet3,
> virtio, etc).

-- 
Ivan Boule
6WIND Development Engineer
[dpdk-dev] [PATCH v2] virtio: Support multiple queues feature in DPDK based virtio-net frontend
This patch supports multiple queues feature in DPDK based virtio-net frontend. It firstly gets max queue number of virtio-net from virtio PCI configuration and then send command to negotiate the queue number with backend; When receiving and transmitting packets, it negotiates multiple virtio-net queues which serve RX/TX; To utilize this feature, the backend also need support multiple queues feature and enable it. It also fixes some patch style issues. Signed-off-by: Ouyang Changchun --- lib/librte_pmd_virtio/virtio_ethdev.c | 326 -- lib/librte_pmd_virtio/virtio_ethdev.h | 10 +- lib/librte_pmd_virtio/virtio_pci.h| 4 +- lib/librte_pmd_virtio/virtio_rxtx.c | 72 ++-- lib/librte_pmd_virtio/virtqueue.h | 60 +-- 5 files changed, 384 insertions(+), 88 deletions(-) diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c index 49e236b..79693f4 100644 --- a/lib/librte_pmd_virtio/virtio_ethdev.c +++ b/lib/librte_pmd_virtio/virtio_ethdev.c @@ -81,6 +81,9 @@ static void virtio_dev_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats * static void virtio_dev_stats_reset(struct rte_eth_dev *dev); static void virtio_dev_free_mbufs(struct rte_eth_dev *dev); +static int virtio_dev_queue_stats_mapping_set(__rte_unused struct rte_eth_dev *eth_dev, +__rte_unused uint16_t queue_id, __rte_unused uint8_t stat_idx, __rte_unused uint8_t is_rx); + /* * The set of PCI devices this driver supports */ @@ -92,6 +95,130 @@ static struct rte_pci_id pci_id_virtio_map[] = { { .vendor_id = 0, /* sentinel */ }, }; +static int +virtio_send_command(struct virtqueue* vq, struct virtio_pmd_ctrl* ctrl, + int* dlen, int pkt_num) +{ + uint32_t head = vq->vq_desc_head_idx, i; + int k, sum = 0; + virtio_net_ctrl_ack status = ~0; + struct virtio_pmd_ctrl result; + + ctrl->status = status; + + if (!vq->hw->cvq) { + PMD_INIT_LOG(ERR, "%s(): Control queue is " +"not supported by this device.\n", __func__); + return -1; + } + + PMD_INIT_LOG(DEBUG, "vq->vq_desc_head_idx = %d, 
status = %d, vq->hw->cvq = %p \n" + "vq = %p \n", vq->vq_desc_head_idx, status, vq->hw->cvq, vq); + + if ((vq->vq_free_cnt < ((uint32_t)pkt_num + 2)) || (pkt_num < 1)) { + return -1; + } + + memcpy(vq->virtio_net_hdr_mz->addr, ctrl, sizeof(struct virtio_pmd_ctrl)); + + /* +* Format is enforced in qemu code: +* One TX packet for header; +* At least one TX packet per argument; +* One RX packet for ACK. +*/ + vq->vq_ring.desc[head].flags = VRING_DESC_F_NEXT; + vq->vq_ring.desc[head].addr = vq->virtio_net_hdr_mz->phys_addr; + vq->vq_ring.desc[head].len = sizeof(struct virtio_net_ctrl_hdr); + vq->vq_free_cnt--; + i = vq->vq_ring.desc[head].next; + + for (k = 0; k < pkt_num; k++) { + vq->vq_ring.desc[i].flags = VRING_DESC_F_NEXT; + vq->vq_ring.desc[i].addr = vq->virtio_net_hdr_mz->phys_addr + + sizeof(struct virtio_net_ctrl_hdr) + sizeof(ctrl->status) + sizeof(uint8_t)*sum; + vq->vq_ring.desc[i].len = dlen[k]; + sum += dlen[k]; + vq->vq_free_cnt--; + i = vq->vq_ring.desc[i].next; + } + + vq->vq_ring.desc[i].flags = VRING_DESC_F_WRITE; + vq->vq_ring.desc[i].addr = vq->virtio_net_hdr_mz->phys_addr + sizeof(struct virtio_net_ctrl_hdr); + vq->vq_ring.desc[i].len = sizeof(ctrl->status); + vq->vq_free_cnt--; + + vq->vq_desc_head_idx = vq->vq_ring.desc[i].next; + + vq_update_avail_ring(vq, head); + vq_update_avail_idx(vq); + + PMD_INIT_LOG(DEBUG, "vq->vq_queue_index = %d \n", vq->vq_queue_index); + + virtqueue_notify(vq); + + while (vq->vq_used_cons_idx == vq->vq_ring.used->idx) { + usleep(100); + } + + while (vq->vq_used_cons_idx != vq->vq_ring.used->idx) { + uint32_t idx, desc_idx, used_idx; + struct vring_used_elem *uep; + + rmb(); + + used_idx = (uint32_t)(vq->vq_used_cons_idx & (vq->vq_nentries - 1)); + uep = &vq->vq_ring.used->ring[used_idx]; + idx = (uint32_t) uep->id; + desc_idx = idx; + + while (vq->vq_ring.desc[desc_idx].flags & VRING_DESC_F_NEXT) { + desc_idx = vq->vq_ring.desc[desc_idx].next; + vq->vq_free_cnt++; + } + + vq->vq_ring.desc[desc_idx].next = 
vq->vq_desc_head_idx; + vq->vq_desc_head
[dpdk-dev] [PATCH v2 1/3] ether: Add API to support setting TX rate for queue and VF
This patch adds API to support setting TX rate for a queue and a VF.

Signed-off-by: Ouyang Changchun
---
 lib/librte_ether/rte_ethdev.c | 71 +++
 lib/librte_ether/rte_ethdev.h | 51 +++
 2 files changed, 122 insertions(+)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index a5727dd..1ea61e1 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -1913,6 +1913,77 @@ rte_eth_dev_set_vf_vlan_filter(uint8_t port_id, uint16_t vlan_id,
 						   vf_mask,vlan_on);
 }
 
+int rte_eth_set_queue_rate_limit(uint8_t port_id, uint16_t queue_idx,
+					uint16_t tx_rate)
+{
+	struct rte_eth_dev *dev;
+	struct rte_eth_dev_info dev_info;
+	struct rte_eth_link link;
+
+	if (port_id >= nb_ports) {
+		PMD_DEBUG_TRACE("set queue rate limit:invalid port id=%d\n",
+				port_id);
+		return -ENODEV;
+	}
+
+	dev = &rte_eth_devices[port_id];
+	rte_eth_dev_info_get(port_id, &dev_info);
+	link = dev->data->dev_link;
+
+	if (queue_idx >= dev_info.max_tx_queues) {
+		PMD_DEBUG_TRACE("set queue rate limit:port %d: "
+				"invalid queue id=%d\n", port_id, queue_idx);
+		return -EINVAL;
+	}
+
+	if (tx_rate > link.link_speed) {
+		PMD_DEBUG_TRACE("set queue rate limit:invalid tx_rate=%d, "
+				"bigger than link speed= %d\n",
+				tx_rate, link.link_speed);
+		return -EINVAL;
+	}
+
+	FUNC_PTR_OR_ERR_RET(*dev->dev_ops->set_queue_rate_limit, -ENOTSUP);
+	return (*dev->dev_ops->set_queue_rate_limit)(dev, queue_idx, tx_rate);
+}
+
+int rte_eth_set_vf_rate_limit(uint8_t port_id, uint16_t vf, uint16_t tx_rate,
+				uint64_t q_msk)
+{
+	struct rte_eth_dev *dev;
+	struct rte_eth_dev_info dev_info;
+	struct rte_eth_link link;
+
+	if (q_msk == 0)
+		return 0;
+
+	if (port_id >= nb_ports) {
+		PMD_DEBUG_TRACE("set VF rate limit:invalid port id=%d\n",
+				port_id);
+		return -ENODEV;
+	}
+
+	dev = &rte_eth_devices[port_id];
+	rte_eth_dev_info_get(port_id, &dev_info);
+	link = dev->data->dev_link;
+
+	if (vf >= dev_info.max_vfs) {
+		PMD_DEBUG_TRACE("set VF rate limit:port %d: "
+				"invalid vf id=%d\n", port_id, vf);
+		return -EINVAL;
+	}
+
+	if (tx_rate > link.link_speed) {
+		PMD_DEBUG_TRACE("set VF rate limit:invalid tx_rate=%d, "
+				"bigger than link speed= %d\n",
+				tx_rate, link.link_speed);
+		return -EINVAL;
+	}
+
+	FUNC_PTR_OR_ERR_RET(*dev->dev_ops->set_vf_rate_limit, -ENOTSUP);
+	return (*dev->dev_ops->set_vf_rate_limit)(dev, vf, tx_rate, q_msk);
+}
+
 int
 rte_eth_mirror_rule_set(uint8_t port_id,
 			struct rte_eth_vmdq_mirror_conf *mirror_conf,
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index d5ea46b..445d40a 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1012,6 +1012,17 @@ typedef int (*eth_set_vf_vlan_filter_t)(struct rte_eth_dev *dev,
 				  uint8_t vlan_on);
 /**< @internal Set VF VLAN pool filter */
 
+typedef int (*eth_set_queue_rate_limit_t)(struct rte_eth_dev *dev,
+				uint16_t queue_idx,
+				uint16_t tx_rate);
+/**< @internal Set queue TX rate */
+
+typedef int (*eth_set_vf_rate_limit_t)(struct rte_eth_dev *dev,
+				uint16_t vf,
+				uint16_t tx_rate,
+				uint64_t q_msk);
+/**< @internal Set VF TX rate */
+
 typedef int (*eth_mirror_rule_set_t)(struct rte_eth_dev *dev,
 				  struct rte_eth_vmdq_mirror_conf *mirror_conf,
 				  uint8_t rule_id,
@@ -1119,6 +1130,8 @@ struct eth_dev_ops {
 	eth_set_vf_rx_t            set_vf_rx;  /**< enable/disable a VF receive */
 	eth_set_vf_tx_t            set_vf_tx;  /**< enable/disable a VF transmit */
 	eth_set_vf_vlan_filter_t   set_vf_vlan_filter;  /**< Set VF VLAN filter */
+	eth_set_queue_rate_limit_t set_queue_rate_limit;  /**< Set queue rate limit */
+	eth_set_vf_rate_limit_t    set_vf_rate_limit;  /**< Set VF rate limit */
 
 	/** Add a signature filter. */
 	fdir_add_signature_filter_t f
[dpdk-dev] [PATCH v2 2/3] ixgbe: Implement the functionality of setting TX rate for queue or VF in IXGBE PMD
This patch implements the functionality of setting TX rate for queue or VF in IXGBE PMD. Signed-off-by: Ouyang Changchun --- lib/librte_pmd_ixgbe/ixgbe_ethdev.c | 122 lib/librte_pmd_ixgbe/ixgbe_ethdev.h | 13 +++- 2 files changed, 132 insertions(+), 3 deletions(-) diff --git a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c index c9b5fe4..643477a 100644 --- a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c +++ b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c @@ -87,6 +87,8 @@ #define IXGBE_LINK_UP_CHECK_TIMEOUT 1000 /* ms */ #define IXGBE_VMDQ_NUM_UC_MAC 4096 /* Maximum nb. of UC MAC addr. */ +#define IXGBE_MMW_SIZE_DEFAULT0x4 +#define IXGBE_MMW_SIZE_JUMBO_FRAME0x14 #define IXGBEVF_PMD_NAME "rte_ixgbevf_pmd" /* PMD name */ @@ -182,6 +184,10 @@ static int ixgbe_mirror_rule_set(struct rte_eth_dev *dev, static int ixgbe_mirror_rule_reset(struct rte_eth_dev *dev, uint8_t rule_id); +static int ixgbe_set_queue_rate_limit(struct rte_eth_dev *dev, + uint16_t queue_idx, uint16_t tx_rate); +static int ixgbe_set_vf_rate_limit(struct rte_eth_dev *dev, uint16_t vf, + uint16_t tx_rate, uint64_t q_msk); /* * Define VF Stats MACRO for Non "cleared on read" register */ @@ -280,6 +286,8 @@ static struct eth_dev_ops ixgbe_eth_dev_ops = { .set_vf_rx= ixgbe_set_pool_rx, .set_vf_tx= ixgbe_set_pool_tx, .set_vf_vlan_filter = ixgbe_set_pool_vlan_filter, + .set_queue_rate_limit = ixgbe_set_queue_rate_limit, + .set_vf_rate_limit= ixgbe_set_vf_rate_limit, .fdir_add_signature_filter= ixgbe_fdir_add_signature_filter, .fdir_update_signature_filter = ixgbe_fdir_update_signature_filter, .fdir_remove_signature_filter = ixgbe_fdir_remove_signature_filter, @@ -1288,10 +1296,13 @@ ixgbe_dev_start(struct rte_eth_dev *dev) { struct ixgbe_hw *hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private); + struct ixgbe_vf_info *vfinfo = + *IXGBE_DEV_PRIVATE_TO_P_VFDATA(dev->data->dev_private); int err, link_up = 0, negotiate = 0; uint32_t speed = 0; int mask = 0; int status; + uint16_t vf, idx; 
PMD_INIT_FUNC_TRACE(); @@ -1408,6 +1419,16 @@ skip_link_setup: goto error; } + /* Restore vf rate limit */ + if (vfinfo != NULL) { + for (vf = 0; vf < dev->pci_dev->max_vfs; vf++) + for (idx = 0; idx < IXGBE_MAX_QUEUE_NUM_PER_VF; idx++) + if (vfinfo[vf].tx_rate[idx] != 0) + ixgbe_set_vf_rate_limit(dev, vf, + vfinfo[vf].tx_rate[idx], + 1 << idx); + } + ixgbe_restore_statistics_mapping(dev); return (0); @@ -3062,6 +3083,107 @@ ixgbe_mirror_rule_reset(struct rte_eth_dev *dev, uint8_t rule_id) return 0; } +static int ixgbe_set_queue_rate_limit(struct rte_eth_dev *dev, + uint16_t queue_idx, uint16_t tx_rate) +{ + struct ixgbe_hw *hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private); + uint32_t rf_dec, rf_int; + uint32_t bcnrc_val; + uint16_t link_speed = dev->data->dev_link.link_speed; + + if (queue_idx >= hw->mac.max_tx_queues) + return -EINVAL; + + if (tx_rate != 0) { + /* Calculate the rate factor values to set */ + rf_int = (uint32_t)link_speed / (uint32_t)tx_rate; + rf_dec = (uint32_t)link_speed % (uint32_t)tx_rate; + rf_dec = (rf_dec << IXGBE_RTTBCNRC_RF_INT_SHIFT) / tx_rate; + + bcnrc_val = IXGBE_RTTBCNRC_RS_ENA; + bcnrc_val |= ((rf_int << IXGBE_RTTBCNRC_RF_INT_SHIFT) & + IXGBE_RTTBCNRC_RF_INT_MASK_M); + bcnrc_val |= (rf_dec & IXGBE_RTTBCNRC_RF_DEC_MASK); + } else { + bcnrc_val = 0; + } + + /* +* Set global transmit compensation time to the MMW_SIZE in RTTBCNRM +* register. MMW_SIZE=0x014 if 9728-byte jumbo is supported, otherwise +* set as 0x4. +*/ + if ((dev->data->dev_conf.rxmode.jumbo_frame == 1) && + (dev->data->dev_conf.rxmode.max_rx_pkt_len >= + IXGBE_MAX_JUMBO_FRAME_SIZE)) + IXGBE_WRITE_REG(hw, IXGBE_RTTBCNRM, + IXGBE_MMW_SIZE_JUMBO_FRAME); + else + IXGBE_WRITE_REG(hw, IXGBE_RTTBCNRM, + IXGBE_MMW_SIZE_DEFAULT); + + /* Set RTTBCNRC of queue X */ + IXGBE_WRITE_REG(hw, IXGBE_RTTDQSEL, queue_idx); + IXGBE_WRITE_REG(hw, IXGBE_RTTBCNRC, bcnrc_val); + IXGBE_WRITE_FLUSH(hw); + +
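For readers following the rate-factor arithmetic in ixgbe_set_queue_rate_limit() above, here is a minimal stand-alone sketch of the same calculation. The register writes are left out, and the constants below (RF_INT_SHIFT, RF_DEC_MASK, RF_INT_MASK, RS_ENA) are assumed stand-ins for the IXGBE_RTTBCNRC_* macros; the authoritative values are in the ixgbe base driver headers. The helper name rate_to_bcnrc is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed stand-ins for the IXGBE_RTTBCNRC_* macros used in the patch. */
#define RF_INT_SHIFT 14u
#define RF_DEC_MASK  0x00003FFFu
#define RF_INT_MASK  (RF_DEC_MASK << RF_INT_SHIFT)
#define RS_ENA       0x80000000u

/*
 * Hypothetical helper: compute the value the patch writes to RTTBCNRC.
 * link_speed and tx_rate use the same unit (Mbps in the ethdev layer).
 * tx_rate == 0 yields 0, which disables the limiter for the queue.
 */
static uint32_t
rate_to_bcnrc(uint16_t link_speed, uint16_t tx_rate)
{
	uint32_t rf_int, rf_dec;

	if (tx_rate == 0)
		return 0;

	/* The ethdev layer rejects tx_rate > link_speed before this point. */
	assert(tx_rate <= link_speed);

	/* Rate factor = link_speed / tx_rate, as integer and fractional parts. */
	rf_int = (uint32_t)link_speed / tx_rate;
	rf_dec = (uint32_t)link_speed % tx_rate;
	rf_dec = (rf_dec << RF_INT_SHIFT) / tx_rate;

	return RS_ENA | ((rf_int << RF_INT_SHIFT) & RF_INT_MASK) |
	       (rf_dec & RF_DEC_MASK);
}
```

With these assumed constants, limiting a queue to 4000 on a 10000 link gives an integer part of 2 and a fractional part of 0.5 in 14-bit fixed point.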
[dpdk-dev] [PATCH v2 3/3] testpmd: Add commands to test the functionality of setting TX rate for queue or VF
This patch adds commands in testpmd to test the functionality of setting TX rate for queue or VF. Signed-off-by: Ouyang Changchun --- app/test-pmd/cmdline.c | 159 - app/test-pmd/config.c | 47 +++ app/test-pmd/testpmd.h | 3 + 3 files changed, 208 insertions(+), 1 deletion(-) diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c index b3824f9..83b2665 100644 --- a/app/test-pmd/cmdline.c +++ b/app/test-pmd/cmdline.c @@ -342,7 +342,14 @@ static void cmd_help_long_parsed(void *parsed_result, "BAM:accepts broadcast packets;" "MPE:accepts all multicast packets\n\n" "Enable/Disable a VF receive mode of a port\n\n" - + + "set port (port_id) queue (queue_id) rate (rate_num)\n" + "Set rate limit for a queue of a port\n\n" + + "set port (port_id) vf (vf_id) rate (rate_num) " + "queue_mask (queue_mask_value)\n" + "Set rate limit for queues in VF of a port\n\n" + "set port (port_id) mirror-rule (rule_id)" "(pool-mirror|vlan-mirror)\n" " (poolmask|vlanid[,vlanid]*) dst-pool (pool_id) (on|off)\n" @@ -4790,6 +4797,154 @@ cmdline_parse_inst_t cmd_vf_rxvlan_filter = { }, }; +/* *** SET RATE LIMIT FOR A QUEUE OF A PORT *** */ +struct cmd_queue_rate_limit_result { + cmdline_fixed_string_t set; + cmdline_fixed_string_t port; + uint8_t port_num; + cmdline_fixed_string_t queue; + uint8_t queue_num; + cmdline_fixed_string_t rate; + uint16_t rate_num; +}; + +static void cmd_queue_rate_limit_parsed(void *parsed_result, + __attribute__((unused)) struct cmdline *cl, + __attribute__((unused)) void *data) +{ + struct cmd_queue_rate_limit_result *res = parsed_result; + int ret = 0; + + if ((strcmp(res->set, "set") == 0) && (strcmp(res->port, "port") == 0) + && (strcmp(res->queue, "queue") == 0) + && (strcmp(res->rate, "rate") == 0)) + ret = set_queue_rate_limit(res->port_num, res->queue_num, + res->rate_num); + if (ret < 0) + printf("queue_rate_limit_cmd error: (%s)\n", strerror(-ret)); + +} + +cmdline_parse_token_string_t cmd_queue_rate_limit_set = + TOKEN_STRING_INITIALIZER(struct 
cmd_queue_rate_limit_result, + set, "set"); +cmdline_parse_token_string_t cmd_queue_rate_limit_port = + TOKEN_STRING_INITIALIZER(struct cmd_queue_rate_limit_result, + port, "port"); +cmdline_parse_token_num_t cmd_queue_rate_limit_portnum = + TOKEN_NUM_INITIALIZER(struct cmd_queue_rate_limit_result, + port_num, UINT8); +cmdline_parse_token_string_t cmd_queue_rate_limit_queue = + TOKEN_STRING_INITIALIZER(struct cmd_queue_rate_limit_result, + queue, "queue"); +cmdline_parse_token_num_t cmd_queue_rate_limit_queuenum = + TOKEN_NUM_INITIALIZER(struct cmd_queue_rate_limit_result, + queue_num, UINT8); +cmdline_parse_token_string_t cmd_queue_rate_limit_rate = + TOKEN_STRING_INITIALIZER(struct cmd_queue_rate_limit_result, + rate, "rate"); +cmdline_parse_token_num_t cmd_queue_rate_limit_ratenum = + TOKEN_NUM_INITIALIZER(struct cmd_queue_rate_limit_result, + rate_num, UINT16); + +cmdline_parse_inst_t cmd_queue_rate_limit = { + .f = cmd_queue_rate_limit_parsed, + .data = (void *)0, + .help_str = "set port X queue Y rate Z:(X = port number," + "Y = queue number,Z = rate number)set rate limit for a queue on port X", + .tokens = { + (void *)&cmd_queue_rate_limit_set, + (void *)&cmd_queue_rate_limit_port, + (void *)&cmd_queue_rate_limit_portnum, + (void *)&cmd_queue_rate_limit_queue, + (void *)&cmd_queue_rate_limit_queuenum, + (void *)&cmd_queue_rate_limit_rate, + (void *)&cmd_queue_rate_limit_ratenum, + NULL, + }, +}; + + +/* *** SET RATE LIMIT FOR A VF OF A PORT *** */ +struct cmd_vf_rate_limit_result { + cmdline_fixed_string_t set; + cmdline_fixed_string_t port; + uint8_t port_num; + cmdline_fixed_string_t vf; + uint8_t vf_num; + cmdline_fixed_string_t rate; + uint16_t rate
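As a rough illustration of the grammar the new testpmd command accepts ("set port X queue Y rate Z"), the sketch below parses such a line with plain sscanf(). This is not how testpmd does it (the patch uses the librte_cmdline token tables shown above); the struct and function names are hypothetical, and only the field widths (8-bit port and queue, 16-bit rate) are taken from the patch.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical result struct; field widths follow cmd_queue_rate_limit_result. */
struct queue_rate_cmd {
	uint8_t port_id;
	uint8_t queue_id;
	uint16_t rate;
};

/*
 * Parse "set port X queue Y rate Z".
 * Returns 0 on success, -1 if the line does not match, has trailing
 * non-blank text, or a value does not fit its field. Whitespace handling
 * is as lenient as sscanf() makes it.
 */
static int
parse_queue_rate_cmd(const char *line, struct queue_rate_cmd *out)
{
	unsigned int port, queue, rate;
	char trailing;

	if (sscanf(line, "set port %u queue %u rate %u %c",
		   &port, &queue, &rate, &trailing) != 3)
		return -1;

	if (port > UINT8_MAX || queue > UINT8_MAX || rate > UINT16_MAX)
		return -1;

	out->port_id = (uint8_t)port;
	out->queue_id = (uint8_t)queue;
	out->rate = (uint16_t)rate;
	return 0;
}
```

A caller would hand the parsed fields to something like set_queue_rate_limit(port, queue, rate), as the real command handler does.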
[dpdk-dev] [PATCH v2 0/3] Support setting TX rate for queue and VF
This v2 patch fixes some errors and warnings reported by checkpatch.pl.

This patch series contains the following 3 items:
1. Add API to support setting TX rate for a queue or a VF.
2. Implement the functionality of setting TX rate for queue or VF in IXGBE PMD.
3. Add commands in testpmd to test the functionality of setting TX rate for queue or VF.

Ouyang Changchun (3):
  Add API to support set TX rate for a queue and VF.
  Implement the functionality of setting TX rate for queue or VF in IXGBE PMD.
  Add commands in testpmd to test the functionality of setting TX rate for queue or VF.

 app/test-pmd/cmdline.c              | 159 +++-
 app/test-pmd/config.c               |  47 +++
 app/test-pmd/testpmd.h              |   3 +
 lib/librte_ether/rte_ethdev.c       |  71
 lib/librte_ether/rte_ethdev.h       |  51
 lib/librte_pmd_ixgbe/ixgbe_ethdev.c | 122 +++
 lib/librte_pmd_ixgbe/ixgbe_ethdev.h |  13 ++-
 7 files changed, 462 insertions(+), 4 deletions(-)

-- 
1.9.0
[dpdk-dev] [PATCH v3] virtio: Support multiple queues feature in DPDK based virtio-net frontend.
This v3 patch continues fixing some errors and warnings reported by checkpatch.pl. This patch supports multiple queues feature in DPDK based virtio-net frontend. It firstly gets max queue number of virtio-net from virtio PCI configuration and then send command to negotiate the queue number with backend; When receiving and transmitting packets, it negotiates multiple virtio-net queues which serve RX/TX; To utilize this feature, the backend also need support multiple queues feature and enable it. Signed-off-by: Ouyang Changchun --- lib/librte_pmd_virtio/virtio_ethdev.c | 377 -- lib/librte_pmd_virtio/virtio_ethdev.h | 40 ++-- lib/librte_pmd_virtio/virtio_pci.h| 4 +- lib/librte_pmd_virtio/virtio_rxtx.c | 92 +++-- lib/librte_pmd_virtio/virtqueue.h | 61 -- 5 files changed, 458 insertions(+), 116 deletions(-) diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c index 49e236b..c2b4dfb 100644 --- a/lib/librte_pmd_virtio/virtio_ethdev.c +++ b/lib/librte_pmd_virtio/virtio_ethdev.c @@ -81,6 +81,12 @@ static void virtio_dev_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats * static void virtio_dev_stats_reset(struct rte_eth_dev *dev); static void virtio_dev_free_mbufs(struct rte_eth_dev *dev); +static int virtio_dev_queue_stats_mapping_set( + __rte_unused struct rte_eth_dev *eth_dev, + __rte_unused uint16_t queue_id, + __rte_unused uint8_t stat_idx, + __rte_unused uint8_t is_rx); + /* * The set of PCI devices this driver supports */ @@ -92,6 +98,135 @@ static struct rte_pci_id pci_id_virtio_map[] = { { .vendor_id = 0, /* sentinel */ }, }; +static int +virtio_send_command(struct virtqueue *vq, struct virtio_pmd_ctrl *ctrl, + int *dlen, int pkt_num) +{ + uint32_t head = vq->vq_desc_head_idx, i; + int k, sum = 0; + virtio_net_ctrl_ack status = ~0; + struct virtio_pmd_ctrl result; + + ctrl->status = status; + + if (!vq->hw->cvq) { + PMD_INIT_LOG(ERR, "%s(): Control queue is " + "not supported by this device.\n", __func__); + return -1; + 
} + + PMD_INIT_LOG(DEBUG, "vq->vq_desc_head_idx = %d, status = %d, " + "vq->hw->cvq = %p vq = %p\n", + vq->vq_desc_head_idx, status, vq->hw->cvq, vq); + + if ((vq->vq_free_cnt < ((uint32_t)pkt_num + 2)) || (pkt_num < 1)) + return -1; + + memcpy(vq->virtio_net_hdr_mz->addr, ctrl, + sizeof(struct virtio_pmd_ctrl)); + + /* +* Format is enforced in qemu code: +* One TX packet for header; +* At least one TX packet per argument; +* One RX packet for ACK. +*/ + vq->vq_ring.desc[head].flags = VRING_DESC_F_NEXT; + vq->vq_ring.desc[head].addr = vq->virtio_net_hdr_mz->phys_addr; + vq->vq_ring.desc[head].len = sizeof(struct virtio_net_ctrl_hdr); + vq->vq_free_cnt--; + i = vq->vq_ring.desc[head].next; + + for (k = 0; k < pkt_num; k++) { + vq->vq_ring.desc[i].flags = VRING_DESC_F_NEXT; + vq->vq_ring.desc[i].addr = vq->virtio_net_hdr_mz->phys_addr + + sizeof(struct virtio_net_ctrl_hdr) + + sizeof(ctrl->status) + sizeof(uint8_t)*sum; + vq->vq_ring.desc[i].len = dlen[k]; + sum += dlen[k]; + vq->vq_free_cnt--; + i = vq->vq_ring.desc[i].next; + } + + vq->vq_ring.desc[i].flags = VRING_DESC_F_WRITE; + vq->vq_ring.desc[i].addr = vq->virtio_net_hdr_mz->phys_addr + + sizeof(struct virtio_net_ctrl_hdr); + vq->vq_ring.desc[i].len = sizeof(ctrl->status); + vq->vq_free_cnt--; + + vq->vq_desc_head_idx = vq->vq_ring.desc[i].next; + + vq_update_avail_ring(vq, head); + vq_update_avail_idx(vq); + + PMD_INIT_LOG(DEBUG, "vq->vq_queue_index = %d\n", vq->vq_queue_index); + + virtqueue_notify(vq); + + while (vq->vq_used_cons_idx == vq->vq_ring.used->idx) + usleep(100); + + while (vq->vq_used_cons_idx != vq->vq_ring.used->idx) { + uint32_t idx, desc_idx, used_idx; + struct vring_used_elem *uep; + + rmb(); + + used_idx = (uint32_t)(vq->vq_used_cons_idx + & (vq->vq_nentries - 1)); + uep = &vq->vq_ring.used->ring[used_idx]; + idx = (uint32_t) uep->id; + desc_idx = idx; + + while (vq->vq_ring.desc[desc_idx].flags & VRING_DESC_F_NEXT) { + desc_idx = vq->vq_ring.desc[desc_idx].next; + v
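The descriptor chain that virtio_send_command() builds (one header descriptor, one descriptor per argument, one device-writable descriptor for the ACK) can be sketched without any ring state as below. This only plans offsets into the shared control buffer in the way the patch lays them out (header, then status byte, then argument data); struct desc_plan and plan_ctrl_chain are hypothetical names, and the two-byte header mirrors virtio_net_ctrl_hdr as an assumption.

```c
#include <stddef.h>
#include <stdint.h>

#define DESC_F_NEXT  1u /* chain continues (VRING_DESC_F_NEXT) */
#define DESC_F_WRITE 2u /* device writes this buffer (VRING_DESC_F_WRITE) */

/* Assumed to mirror virtio_net_ctrl_hdr: class byte plus command byte. */
struct ctrl_hdr {
	uint8_t cls;
	uint8_t cmd;
};
typedef uint8_t ctrl_ack;

/* One planned descriptor: offset into the control buffer, length, flags. */
struct desc_plan {
	uint32_t offset;
	uint32_t len;
	uint16_t flags;
};

/*
 * Plan a control command chain of nb_args + 2 descriptors.
 * Returns the number of descriptors planned, or -1 if there are no
 * arguments or plan[] is too small (the patch makes the same check
 * against vq_free_cnt).
 */
static int
plan_ctrl_chain(const int *dlen, int nb_args, struct desc_plan *plan,
		int plan_cap)
{
	uint32_t sum = 0;
	int n = 0, k;

	if (nb_args < 1 || plan_cap < nb_args + 2)
		return -1;

	/* Header, readable by the device. */
	plan[n++] = (struct desc_plan){ 0, sizeof(struct ctrl_hdr), DESC_F_NEXT };

	/* Arguments sit after the header and the status byte. */
	for (k = 0; k < nb_args; k++) {
		plan[n++] = (struct desc_plan){
			(uint32_t)(sizeof(struct ctrl_hdr) + sizeof(ctrl_ack)) + sum,
			(uint32_t)dlen[k], DESC_F_NEXT };
		sum += (uint32_t)dlen[k];
	}

	/* Status byte, written by the device; it ends the chain. */
	plan[n++] = (struct desc_plan){ sizeof(struct ctrl_hdr),
					sizeof(ctrl_ack), DESC_F_WRITE };
	return n;
}
```

For a multiqueue negotiation command carrying a single two-byte argument, this plans three descriptors, which is why the patch requires at least pkt_num + 2 free descriptors.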
[dpdk-dev] [PATCH v2 0/3] Support zero copy RX/TX in user space vhost
Yes, I will send out a v3 patch to replace the v2 patch.

Thanks
Changchun

-----Original Message-----
From: Thomas Monjalon [mailto:thomas.monja...@6wind.com]
Sent: Wednesday, May 28, 2014 7:02 AM
To: Ouyang, Changchun
Cc: dev at dpdk.org
Subject: Re: [PATCH v2 0/3] Support zero copy RX/TX in user space vhost

Hi,

checkpatch.pl is reporting some errors and I think some of them should be avoided.
Please check it.

Thanks
-- 
Thomas
[dpdk-dev] [PATCH 0/3] Support administrative link up and link down
Hi Ivan,

Thanks very much for your detailed response on this issue. I think your recommendation makes sense, and I will update the naming and re-send the patch for link up and link down.

Best regards,
Changchun

-----Original Message-----
From: Ivan Boule [mailto:ivan.bo...@6wind.com]
Sent: Friday, May 23, 2014 5:25 PM
To: Ouyang, Changchun; dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH 0/3] Support administrative link up and link down

On 05/23/2014 04:08 AM, Ouyang, Changchun wrote:
> Hi Ivan
>
> To some extent, I also agree with you.
> But customer hope DPDK can provide an interface like "ifconfig up" and
> "ifconfig down" in linux, They can invoke such an interface in user
> application to repeated stop and start dev frequently, and Make sure
> RX and TX work fine after each start, I think it is not necessary to
> do really device start and stop at Each time, just need start and stop RX and
> TX function, so the straightforward method is to enable and disable tx laser
> in ixgbe.
> But in the ether level we need a more generic api name, here is
> rte_eth_dev_admin_link_up/down, while enable_tx_laser is not suitable, Enable
> and disable tx laser is a way in ixgbe to fulfill the administrative link up
> and link down.
> maybe Fortville and future generation NIC will use other ways to fulfill the
> admin_link_up/down.
>

Hi Changchun,

I do not understand what your customer effectively needs.
First of all, if I understand well, your customer's application does not really need to invoke the DPDK functions "eth_dev_stop" and "eth_dev_start" for addressing its problem, for instance to reconfigure RX/TX queues of the port.

When considering the implementation in the ixgbe PMD of the function "rte_eth_dev_admin_link_down", its only visible effects from the DPDK application perspective are that no input packet can be received anymore, and output packets cannot be transmitted (once having filled the TX queues).
Conversely, the only visible effect of the "rte_eth_dev_admin_link_up" function is that input packets are received again, and that output packets can be successfully transmitted.

In fact, by disabling the TX laser on an ixgbe port, the only interesting effect of the function "rte_eth_dev_admin_link_down" is that it notifies the peer system of a hardware link DOWN event (with no physical link unplug on the peer side).
Conversely, by enabling the TX laser on an ixgbe port, the only interesting effect of the function "rte_eth_dev_admin_link_up" is that it notifies the peer system of a hardware link UP event.

Are those the actions that your customer's application actually needs to perform?
If so, then this certainly deserves a real operational use case that is worth describing in the patch log.
This would help DPDK PMD implementors to understand what such functions can be used for, and to decide whether they actually need to be supported by the PMD.

Assuming that these 2 functions need to be provided to address the issue described above, I do not think that the word "admin" brings anything for understanding their role.
In fact, the word "admin" rather suggests a pure "software" down/up setting, instead of a physical one.
Naming these 2 functions "rte_eth_dev_set_link_down" and "rte_eth_dev_set_link_up" better describes their expected effect.

Regards,
Ivan

>
> On 05/22/2014 04:44 PM, Ouyang, Changchun wrote:
>> Hi Ivan
>> For this one, it seems long story for that...
>> In short,
>> Some customer have such kind of requirement, they want to repeatedly
>> start(rte_dev_start) and stop(rte_dev_stop) the port for RX and TX,
>> but they find after several times start and stop, the RX and TX can't work
>> well even the port starts, and the packets error number increase.
>>
>> To resolve this error number increase issue, and let port work fine
>> even after repeatedly start and stop, We need a new API to do it, after
>> discussing, we have these 2 API, admin link up and admin link down.
>
> If I understand well, this "feature" is not needed by itself, but only as a
> work-around to address issues when repeatedly invoking the functions
> ixgbe_dev_stop and ixgbe_dev_start.
> Do such issues appear when performing the same operations with the Linux
> kernel driver?
>
> Anyway, I suppose that such functions have to be automatically invoked
> by the same code of the network application that invokes the functions
> ixgbe_dev_stop and ixgbe_dev_start (said differently, there is no need
> for a manual assistance !)
>
> In that case, would not it be possible - and highly preferable - to directly
> invoke the functions ixgbe_disable_tx_laser and, then, ixgbe_enable_
[dpdk-dev] [PATCH v2 2/3] ixgbe: Implement the functionality of setting link up and down in IXGBE PMD
Please ignore the previous v1 patch, just apply this v2 patch. This patch implements the functionality of setting link up and down in IXGBE PMD. It is implemented by enabling or disabling TX laser. Signed-off-by: Ouyang Changchun --- lib/librte_pmd_ixgbe/ixgbe_ethdev.c | 63 + 1 file changed, 63 insertions(+) diff --git a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c index c9b5fe4..8f9c97a 100644 --- a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c +++ b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c @@ -97,6 +97,8 @@ static int eth_ixgbe_dev_init(struct eth_driver *eth_drv, static int ixgbe_dev_configure(struct rte_eth_dev *dev); static int ixgbe_dev_start(struct rte_eth_dev *dev); static void ixgbe_dev_stop(struct rte_eth_dev *dev); +static int ixgbe_dev_set_link_up(struct rte_eth_dev *dev); +static int ixgbe_dev_set_link_down(struct rte_eth_dev *dev); static void ixgbe_dev_close(struct rte_eth_dev *dev); static void ixgbe_dev_promiscuous_enable(struct rte_eth_dev *dev); static void ixgbe_dev_promiscuous_disable(struct rte_eth_dev *dev); @@ -246,6 +248,8 @@ static struct eth_dev_ops ixgbe_eth_dev_ops = { .dev_configure= ixgbe_dev_configure, .dev_start= ixgbe_dev_start, .dev_stop = ixgbe_dev_stop, + .dev_set_link_up= ixgbe_dev_set_link_up, + .dev_set_link_down = ixgbe_dev_set_link_down, .dev_close= ixgbe_dev_close, .promiscuous_enable = ixgbe_dev_promiscuous_enable, .promiscuous_disable = ixgbe_dev_promiscuous_disable, @@ -1458,6 +1462,65 @@ ixgbe_dev_stop(struct rte_eth_dev *dev) } /* + * Set device link up: enable tx laser. 
+ */ +static int +ixgbe_dev_set_link_up(struct rte_eth_dev *dev) +{ + struct ixgbe_hw *hw = + IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private); + if (hw->mac.type == ixgbe_mac_82599EB) { +#ifdef RTE_NIC_BYPASS + if (hw->device_id == IXGBE_DEV_ID_82599_BYPASS) { + /* Not suported in bypass mode */ + PMD_INIT_LOG(ERR, + "\nSet link up is not supported " + "by device id 0x%x\n", + hw->device_id); + return -ENOTSUP; + } +#endif + /* Turn on the laser */ + ixgbe_enable_tx_laser(hw); + return 0; + } + + PMD_INIT_LOG(ERR, "\nSet link up is not supported by device id 0x%x\n", + hw->device_id); + return -ENOTSUP; +} + +/* + * Set device link down: disable tx laser. + */ +static int +ixgbe_dev_set_link_down(struct rte_eth_dev *dev) +{ + struct ixgbe_hw *hw = + IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private); + if (hw->mac.type == ixgbe_mac_82599EB) { +#ifdef RTE_NIC_BYPASS + if (hw->device_id == IXGBE_DEV_ID_82599_BYPASS) { + /* Not suported in bypass mode */ + PMD_INIT_LOG(ERR, + "\nSet link down is not supported " + "by device id 0x%x\n", +hw->device_id); + return -ENOTSUP; + } +#endif + /* Turn off the laser */ + ixgbe_disable_tx_laser(hw); + return 0; + } + + PMD_INIT_LOG(ERR, + "\nSet link down is not supported by device id 0x%x\n", +hw->device_id); + return -ENOTSUP; +} + +/* * Reest and stop device. */ static void -- 1.9.0
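The two driver callbacks in this patch share one decision path: only the 82599EB MAC is accepted, the bypass adapter is rejected, and otherwise the TX laser is switched. The condensed sketch below restates that path in standalone C. All names in it (`sketch_hw`, `sketch_set_link`, `SKETCH_DEV_ID_BYPASS` and its value) are hypothetical stand-ins, not the real ixgbe types or device ids, and the boolean field merely models the effect of `ixgbe_enable_tx_laser()` / `ixgbe_disable_tx_laser()`.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the hardware state; not the real struct ixgbe_hw. */
enum sketch_mac_type { SKETCH_MAC_82598EB, SKETCH_MAC_82599EB };

struct sketch_hw {
	enum sketch_mac_type mac_type;
	uint16_t device_id;
	bool tx_laser_on; /* models the state changed by the laser helpers */
};

/* Placeholder value standing in for the bypass adapter's device id. */
#define SKETCH_DEV_ID_BYPASS 0x0001

/*
 * Set the link up (up == true) or down (up == false) by switching the
 * modelled TX laser. Returns 0 on success and -ENOTSUP when the MAC type
 * or the device id does not allow the operation.
 */
static int
sketch_set_link(struct sketch_hw *hw, bool up)
{
	/* Only the 82599EB MAC is handled, as in the patch. */
	if (hw->mac_type != SKETCH_MAC_82599EB)
		return -ENOTSUP;

	/* The bypass adapter is rejected (the patch does this under RTE_NIC_BYPASS). */
	if (hw->device_id == SKETCH_DEV_ID_BYPASS)
		return -ENOTSUP;

	hw->tx_laser_on = up;
	return 0;
}
```

Folding both directions into one helper is only a presentation choice for this sketch; the patch keeps two separate callbacks because `struct eth_dev_ops` has one slot per direction.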
[dpdk-dev] [PATCH v2 3/3] testpmd: Add commands to test link up and down of PMD
Please ignore previous patch v1, and just apply this patch v2. This patch adds commands to test the functionality of setting link up and down of PMD in testpmd. Signed-off-by: Ouyang Changchun --- app/test-pmd/cmdline.c | 81 ++ app/test-pmd/testpmd.c | 14 + app/test-pmd/testpmd.h | 2 ++ 3 files changed, 97 insertions(+) diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c index b3824f9..29bf5b5 100644 --- a/app/test-pmd/cmdline.c +++ b/app/test-pmd/cmdline.c @@ -3780,6 +3780,85 @@ cmdline_parse_inst_t cmd_start_tx_first = { }, }; +/* *** SET LINK UP *** */ +struct cmd_set_link_up_result { + cmdline_fixed_string_t set; + cmdline_fixed_string_t link_up; + cmdline_fixed_string_t port; + uint8_t port_id; +}; + +cmdline_parse_token_string_t cmd_set_link_up_set = + TOKEN_STRING_INITIALIZER(struct cmd_set_link_up_result, set, "set"); +cmdline_parse_token_string_t cmd_set_link_up_link_up = + TOKEN_STRING_INITIALIZER(struct cmd_set_link_up_result, link_up, + "link-up"); +cmdline_parse_token_string_t cmd_set_link_up_port = + TOKEN_STRING_INITIALIZER(struct cmd_set_link_up_result, port, "port"); +cmdline_parse_token_num_t cmd_set_link_up_port_id = + TOKEN_NUM_INITIALIZER(struct cmd_set_link_up_result, port_id, UINT8); + +static void cmd_set_link_up_parsed(__attribute__((unused)) void *parsed_result, +__attribute__((unused)) struct cmdline *cl, +__attribute__((unused)) void *data) +{ + struct cmd_set_link_up_result *res = parsed_result; + dev_set_link_up(res->port_id); +} + +cmdline_parse_inst_t cmd_set_link_up = { + .f = cmd_set_link_up_parsed, + .data = NULL, + .help_str = "set link-up port (port id)", + .tokens = { + (void *)&cmd_set_link_up_set, + (void *)&cmd_set_link_up_link_up, + (void *)&cmd_set_link_up_port, + (void *)&cmd_set_link_up_port_id, + NULL, + }, +}; + +/* *** SET LINK DOWN *** */ +struct cmd_set_link_down_result { + cmdline_fixed_string_t set; + cmdline_fixed_string_t link_down; + cmdline_fixed_string_t port; + uint8_t port_id; +}; + 
+cmdline_parse_token_string_t cmd_set_link_down_set = + TOKEN_STRING_INITIALIZER(struct cmd_set_link_down_result, set, "set"); +cmdline_parse_token_string_t cmd_set_link_down_link_down = + TOKEN_STRING_INITIALIZER(struct cmd_set_link_down_result, link_down, + "link-down"); +cmdline_parse_token_string_t cmd_set_link_down_port = + TOKEN_STRING_INITIALIZER(struct cmd_set_link_down_result, port, "port"); +cmdline_parse_token_num_t cmd_set_link_down_port_id = + TOKEN_NUM_INITIALIZER(struct cmd_set_link_down_result, port_id, UINT8); + +static void cmd_set_link_down_parsed( + __attribute__((unused)) void *parsed_result, + __attribute__((unused)) struct cmdline *cl, + __attribute__((unused)) void *data) +{ + struct cmd_set_link_down_result *res = parsed_result; + dev_set_link_down(res->port_id); +} + +cmdline_parse_inst_t cmd_set_link_down = { + .f = cmd_set_link_down_parsed, + .data = NULL, + .help_str = "set link-down port (port id)", + .tokens = { + (void *)&cmd_set_link_down_set, + (void *)&cmd_set_link_down_link_down, + (void *)&cmd_set_link_down_port, + (void *)&cmd_set_link_down_port_id, + NULL, + }, +}; + /* *** SHOW CFG *** */ struct cmd_showcfg_result { cmdline_fixed_string_t show; @@ -5164,6 +5243,8 @@ cmdline_parse_ctx_t main_ctx[] = { (cmdline_parse_inst_t *)&cmd_showcfg, (cmdline_parse_inst_t *)&cmd_start, (cmdline_parse_inst_t *)&cmd_start_tx_first, + (cmdline_parse_inst_t *)&cmd_set_link_up, + (cmdline_parse_inst_t *)&cmd_set_link_down, (cmdline_parse_inst_t *)&cmd_reset, (cmdline_parse_inst_t *)&cmd_set_numbers, (cmdline_parse_inst_t *)&cmd_set_txpkts, diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c index bc38305..8f20fda 100644 --- a/app/test-pmd/testpmd.c +++ b/app/test-pmd/testpmd.c @@ -1208,6 +1208,20 @@ stop_packet_forwarding(void) test_done = 1; } +void +dev_set_link_up(portid_t pid) +{ + if (rte_eth_dev_set_link_up((uint8_t)pid) < 0) + printf("\nSet link up fail.\n"); +} + +void +dev_set_link_down(portid_t pid) +{ + if 
(rte_eth_dev_set_link_down((uint8_t)pid) < 0) + printf("\nSet link down fail.\n"); +} + static int all_ports_started(void)
[dpdk-dev] [PATCH v2 0/3] Support setting link up and link down
Please ignore the previous patch series with the subject
"Support administrative link up and link down".
This v2 patch series replaces it.

This patch series contains the following 3 items:
1. Add an API to support setting link up and down. It can be used to
   repeatedly stop and restart RX/TX of a port without re-allocating
   resources for the port and re-configuring the port.
2. Implement the functionality of setting link up and down in the IXGBE PMD.
3. Add commands in testpmd to test the functionality of setting link up and
   down of a PMD.

Ouyang Changchun (3):
  Add API to support set link up and link down.
  Implement the functionality of setting link up and link down in IXGBE PMD.
  Add command line to test the functionality of setting link up and link
    down in testpmd.

 app/test-pmd/cmdline.c              | 81 +
 app/test-pmd/testpmd.c              | 14 +++
 app/test-pmd/testpmd.h              |  2 +
 lib/librte_ether/rte_ethdev.c       | 38 +
 lib/librte_ether/rte_ethdev.h       | 34
 lib/librte_pmd_ixgbe/ixgbe_ethdev.c | 63 +
 6 files changed, 232 insertions(+)

--
1.9.0
[dpdk-dev] [PATCH v2 1/3] ether: Add API to support set link up and link down
Please ignore previous v1 patch, just use this v2 patch. This patch adds API to support the functionality of setting link up and down. It can be used to repeatedly stop and restart RX/TX of a port without re-allocating resources for the port and re-configuring the port. Signed-off-by: Ouyang Changchun --- lib/librte_ether/rte_ethdev.c | 38 ++ lib/librte_ether/rte_ethdev.h | 34 ++ 2 files changed, 72 insertions(+) diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c index a5727dd..97e3f9d 100644 --- a/lib/librte_ether/rte_ethdev.c +++ b/lib/librte_ether/rte_ethdev.c @@ -691,6 +691,44 @@ rte_eth_dev_stop(uint8_t port_id) (*dev->dev_ops->dev_stop)(dev); } +int +rte_eth_dev_set_link_up(uint8_t port_id) +{ + struct rte_eth_dev *dev; + + /* This function is only safe when called from the primary process +* in a multi-process setup*/ + PROC_PRIMARY_OR_ERR_RET(-E_RTE_SECONDARY); + + if (port_id >= nb_ports) { + PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id); + return -EINVAL; + } + dev = &rte_eth_devices[port_id]; + + FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_set_link_up, -ENOTSUP); + return (*dev->dev_ops->dev_set_link_up)(dev); +} + +int +rte_eth_dev_set_link_down(uint8_t port_id) +{ + struct rte_eth_dev *dev; + + /* This function is only safe when called from the primary process +* in a multi-process setup*/ + PROC_PRIMARY_OR_ERR_RET(-E_RTE_SECONDARY); + + if (port_id >= nb_ports) { + PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id); + return -EINVAL; + } + dev = &rte_eth_devices[port_id]; + + FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_set_link_down, -ENOTSUP); + return (*dev->dev_ops->dev_set_link_down)(dev); +} + void rte_eth_dev_close(uint8_t port_id) { diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h index d5ea46b..84f2e9f 100644 --- a/lib/librte_ether/rte_ethdev.h +++ b/lib/librte_ether/rte_ethdev.h @@ -823,6 +823,12 @@ typedef int (*eth_dev_start_t)(struct rte_eth_dev *dev); typedef void (*eth_dev_stop_t)(struct 
rte_eth_dev *dev); /**< @internal Function used to stop a configured Ethernet device. */ +typedef int (*eth_dev_set_link_up_t)(struct rte_eth_dev *dev); +/**< @internal Function used to link up a configured Ethernet device. */ + +typedef int (*eth_dev_set_link_down_t)(struct rte_eth_dev *dev); +/**< @internal Function used to link down a configured Ethernet device. */ + typedef void (*eth_dev_close_t)(struct rte_eth_dev *dev); /**< @internal Function used to close a configured Ethernet device. */ @@ -1084,6 +1090,8 @@ struct eth_dev_ops { eth_dev_configure_tdev_configure; /**< Configure device. */ eth_dev_start_tdev_start; /**< Start device. */ eth_dev_stop_t dev_stop; /**< Stop device. */ + eth_dev_set_link_up_t dev_set_link_up; /**< Device link up. */ + eth_dev_set_link_down_tdev_set_link_down; /**< Device link down. */ eth_dev_close_tdev_close; /**< Close device. */ eth_promiscuous_enable_t promiscuous_enable; /**< Promiscuous ON. */ eth_promiscuous_disable_t promiscuous_disable;/**< Promiscuous OFF. */ @@ -1475,6 +1483,32 @@ extern int rte_eth_dev_start(uint8_t port_id); */ extern void rte_eth_dev_stop(uint8_t port_id); + +/** + * Link up an Ethernet device. + * + * Set device link up will re-enable the device rx/tx + * functionality after it is previously set device linked down. + * + * @param port_id + * The port identifier of the Ethernet device. + * @return + * - 0: Success, Ethernet device linked up. + * - <0: Error code of the driver device link up function. + */ +extern int rte_eth_dev_set_link_up(uint8_t port_id); + +/** + * Link down an Ethernet device. + * The device rx/tx functionality will be disabled if success, + * and it can be re-enabled with a call to + * rte_eth_dev_set_link_up() + * + * @param port_id + * The port identifier of the Ethernet device. + */ +extern int rte_eth_dev_set_link_down(uint8_t port_id); + /** * Close an Ethernet device. The device cannot be restarted! * -- 1.9.0
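The generic wrappers added by this patch follow the usual ethdev dispatch pattern: validate the port id, check that the driver filled in the callback, then call it. The standalone sketch below illustrates that pattern without DPDK headers. Every identifier in it (`sketch_dev`, `sketch_dev_ops`, `sketch_set_link`, `sketch_setup`, and the example driver callbacks) is a hypothetical stand-in, and the primary-process check performed by `PROC_PRIMARY_OR_ERR_RET` in the patch is deliberately left out.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the ethdev structures; not the real DPDK types. */
struct sketch_dev;

typedef int (*sketch_link_op_t)(struct sketch_dev *dev);

struct sketch_dev_ops {
	sketch_link_op_t dev_set_link_up;   /* NULL when the driver lacks support */
	sketch_link_op_t dev_set_link_down; /* NULL when the driver lacks support */
};

struct sketch_dev {
	const struct sketch_dev_ops *dev_ops;
	int link_up; /* state toggled by the example driver below */
};

#define SKETCH_MAX_PORTS 4
static struct sketch_dev sketch_devices[SKETCH_MAX_PORTS];
static uint8_t sketch_nb_ports;

/* Generic wrapper: validate the port, then dispatch to the driver callback. */
static int
sketch_set_link(uint8_t port_id, int up)
{
	struct sketch_dev *dev;
	sketch_link_op_t op;

	if (port_id >= sketch_nb_ports)
		return -EINVAL;

	dev = &sketch_devices[port_id];
	op = up ? dev->dev_ops->dev_set_link_up
		: dev->dev_ops->dev_set_link_down;
	if (op == NULL)
		return -ENOTSUP; /* driver did not provide the callback */

	return op(dev);
}

/* Example driver callbacks that simply record the requested state. */
static int drv_link_up(struct sketch_dev *dev) { dev->link_up = 1; return 0; }
static int drv_link_down(struct sketch_dev *dev) { dev->link_up = 0; return 0; }

static const struct sketch_dev_ops full_ops = { drv_link_up, drv_link_down };
static const struct sketch_dev_ops no_link_ops = { NULL, NULL };

/* Register two ports: port 0 supports link control, port 1 does not. */
static void
sketch_setup(void)
{
	sketch_devices[0].dev_ops = &full_ops;
	sketch_devices[0].link_up = 1;
	sketch_devices[1].dev_ops = &no_link_ops;
	sketch_devices[1].link_up = 1;
	sketch_nb_ports = 2;
}
```

With this layout a caller can tell "port does not exist" (`-EINVAL`) apart from "driver has no link control" (`-ENOTSUP`), which is what lets an application fall back to a full stop/start cycle on PMDs that leave the callbacks unset.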
[dpdk-dev] [PATCH v3 2/3] ixgbe: Implement queue start and stop functionality in IXGBE PMD
Please ignore previous patch v1 and v2, only need this patch v3 for the queue start and stop functionality. This patch implements queue start and stop functionality in IXGBE PMD; it also enable hardware loopback for VMDQ mode in IXGBE PMD. Signed-off-by: Ouyang Changchun Tested-by: Waterman Cao This patch passed L2 Forward , L3 Forward testing base on commit: 57f0ba5f8b8588dfa6ffcd001447ef6337afa6cd. See test environment information as the following: Fedora 19 , Linux Kernel 3.9.0, GCC 4.8.2 X68_64, Intel Xeon processor E5-2600 and E5-2600 v2 family --- lib/librte_pmd_ixgbe/ixgbe_ethdev.c | 4 + lib/librte_pmd_ixgbe/ixgbe_ethdev.h | 8 ++ lib/librte_pmd_ixgbe/ixgbe_rxtx.c | 239 ++-- lib/librte_pmd_ixgbe/ixgbe_rxtx.h | 6 + 4 files changed, 220 insertions(+), 37 deletions(-) diff --git a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c index c9b5fe4..3dcff78 100644 --- a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c +++ b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c @@ -260,6 +260,10 @@ static struct eth_dev_ops ixgbe_eth_dev_ops = { .vlan_tpid_set= ixgbe_vlan_tpid_set, .vlan_offload_set = ixgbe_vlan_offload_set, .vlan_strip_queue_set = ixgbe_vlan_strip_queue_set, + .rx_queue_start = ixgbe_dev_rx_queue_start, + .rx_queue_stop= ixgbe_dev_rx_queue_stop, + .tx_queue_start = ixgbe_dev_tx_queue_start, + .tx_queue_stop= ixgbe_dev_tx_queue_stop, .rx_queue_setup = ixgbe_dev_rx_queue_setup, .rx_queue_release = ixgbe_dev_rx_queue_release, .rx_queue_count = ixgbe_dev_rx_queue_count, diff --git a/lib/librte_pmd_ixgbe/ixgbe_ethdev.h b/lib/librte_pmd_ixgbe/ixgbe_ethdev.h index 9d7e93f..1471942 100644 --- a/lib/librte_pmd_ixgbe/ixgbe_ethdev.h +++ b/lib/librte_pmd_ixgbe/ixgbe_ethdev.h @@ -212,6 +212,14 @@ void ixgbe_dev_tx_init(struct rte_eth_dev *dev); void ixgbe_dev_rxtx_start(struct rte_eth_dev *dev); +int ixgbe_dev_rx_queue_start(struct rte_eth_dev *dev, uint16_t rx_queue_id); + +int ixgbe_dev_rx_queue_stop(struct rte_eth_dev *dev, uint16_t rx_queue_id); + +int 
ixgbe_dev_tx_queue_start(struct rte_eth_dev *dev, uint16_t tx_queue_id); + +int ixgbe_dev_tx_queue_stop(struct rte_eth_dev *dev, uint16_t tx_queue_id); + int ixgbevf_dev_rx_init(struct rte_eth_dev *dev); void ixgbevf_dev_tx_init(struct rte_eth_dev *dev); diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c index 37d02aa..54ca010 100644 --- a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c +++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c @@ -1588,7 +1588,7 @@ ixgbe_recv_scattered_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, * descriptors should meet the following condition: * (num_ring_desc * sizeof(rx/tx descriptor)) % 128 == 0 */ -#define IXGBE_MIN_RING_DESC 64 +#define IXGBE_MIN_RING_DESC 32 #define IXGBE_MAX_RING_DESC 4096 /* @@ -1836,6 +1836,7 @@ ixgbe_dev_tx_queue_setup(struct rte_eth_dev *dev, txq->port_id = dev->data->port_id; txq->txq_flags = tx_conf->txq_flags; txq->ops = &def_txq_ops; + txq->start_tx_per_q = tx_conf->start_tx_per_q; /* * Modification to set VFTDT for virtual function if vf is detected @@ -2078,6 +2079,7 @@ ixgbe_dev_rx_queue_setup(struct rte_eth_dev *dev, rxq->crc_len = (uint8_t) ((dev->data->dev_conf.rxmode.hw_strip_crc) ? 0 : ETHER_CRC_LEN); rxq->drop_en = rx_conf->rx_drop_en; + rxq->start_rx_per_q = rx_conf->start_rx_per_q; /* * Allocate RX ring hardware descriptors. 
A memzone large enough to @@ -3025,6 +3027,13 @@ ixgbe_vmdq_rx_hw_configure(struct rte_eth_dev *dev) } + /* PFDMA Tx General Switch Control Enables VMDQ loopback */ + if (cfg->enable_loop_back) { + IXGBE_WRITE_REG(hw, IXGBE_PFDTXGSWC, IXGBE_PFDTXGSWC_VT_LBEN); + for (i = 0; i < RTE_IXGBE_VMTXSW_REGISTER_COUNT; i++) + IXGBE_WRITE_REG(hw, IXGBE_VMTXSW(i), UINT32_MAX); + } + IXGBE_WRITE_FLUSH(hw); } @@ -3234,7 +3243,6 @@ ixgbe_dev_rx_init(struct rte_eth_dev *dev) uint32_t rxcsum; uint16_t buf_size; uint16_t i; - int ret; PMD_INIT_FUNC_TRACE(); hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private); @@ -3289,11 +3297,6 @@ ixgbe_dev_rx_init(struct rte_eth_dev *dev) for (i = 0; i < dev->data->nb_rx_queues; i++) { rxq = dev->data->rx_queues[i]; - /* Allocate buffers for descriptor rings */ - ret = ixgbe_alloc_rx_queue_mbufs(rxq); - if (ret) - return ret; - /* * Reset crc_len in case it was changed after queue setup by a * call to configure. @@ -3500,10 +3503,8 @@ ix
[dpdk-dev] [PATCH v3 3/3] examples/vhost: Support user space vhost zero copy
Please ignore previous patch v1 and v2, only need this patch v3 for us vhost zero copy. This patch supports user space vhost zero copy. It removes packets copying between host and guest in RX/TX. It introduces an extra ring to store the detached mbufs. At initialization stage all mbufs will put into this ring; when one guest starts, vhost gets the available buffer address allocated by guest for RX and translates them into host space addresses, then attaches them to mbufs and puts the attached mbufs into mempool. Queue starting and DMA refilling will get mbufs from mempool and use them to set the DMA addresses. For TX, it gets the buffer addresses of available packets to be transmitted from guest and translates them to host space addresses, then attaches them to mbufs and puts them to TX queues. After TX finishes, it pulls mbufs out from mempool, detaches them and puts them back into the extra ring. Signed-off-by: Ouyang Changchun Tested-by: Waterman Cao This patch passed L2 Forward , L3 Forward testing base on commit: 57f0ba5f8b8588dfa6ffcd001447ef6337afa6cd. See test environment information as the following: Fedora 19 , Linux Kernel 3.9.0, GCC 4.8.2 X68_64, Intel Xeon processor E5-2600 and E5-2600 v2 family --- examples/vhost/main.c | 1476 +-- examples/vhost/virtio-net.c | 186 +- examples/vhost/virtio-net.h | 23 +- 3 files changed, 1623 insertions(+), 62 deletions(-) diff --git a/examples/vhost/main.c b/examples/vhost/main.c index b86d57d..e91 100644 --- a/examples/vhost/main.c +++ b/examples/vhost/main.c @@ -48,6 +48,7 @@ #include #include #include +#include #include "main.h" #include "virtio-net.h" @@ -70,6 +71,16 @@ #define MBUF_SIZE (2048 + sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM) /* + * No frame data buffer allocated from host are required for zero copy + * implementation, guest will allocate the frame data buffer, and vhost + * directly use it. 
+ */ +#define VIRTIO_DESCRIPTOR_LEN_ZCP 1518 +#define MBUF_SIZE_ZCP (VIRTIO_DESCRIPTOR_LEN_ZCP + sizeof(struct rte_mbuf) \ + + RTE_PKTMBUF_HEADROOM) +#define MBUF_CACHE_SIZE_ZCP 0 + +/* * RX and TX Prefetch, Host, and Write-back threshold values should be * carefully set for optimal performance. Consult the network * controller's datasheet and supporting DPDK documentation for guidance @@ -108,6 +119,25 @@ #define RTE_TEST_RX_DESC_DEFAULT 1024 #define RTE_TEST_TX_DESC_DEFAULT 512 +/* + * Need refine these 2 macros for legacy and DPDK based front end: + * Max vring avail descriptor/entries from guest - MAX_PKT_BURST + * And then adjust power 2. + */ +/* + * For legacy front end, 128 descriptors, + * half for virtio header, another half for mbuf. + */ +#define RTE_TEST_RX_DESC_DEFAULT_ZCP 32 /* legacy: 32, DPDK virt FE: 128. */ +#define RTE_TEST_TX_DESC_DEFAULT_ZCP 64 /* legacy: 64, DPDK virt FE: 64. */ + +/* Get first 4 bytes in mbuf headroom. */ +#define MBUF_HEADROOM_UINT32(mbuf) (*(uint32_t *)((uint8_t *)(mbuf) \ + + sizeof(struct rte_mbuf))) + +/* true if x is a power of 2 */ +#define POWEROF2(x) ((((x)-1) & (x)) == 0) + #define INVALID_PORT_ID 0xFF /* Max number of devices. Limited by vmdq. */ @@ -138,8 +168,42 @@ static uint32_t num_switching_cores = 0; static uint32_t num_queues = 0; uint32_t num_devices = 0; +/* + * Enable zero copy, pkts buffer will directly dma to hw descriptor, + * disabled on default. + */ +static uint32_t zero_copy; + +/* number of descriptors to apply*/ +static uint32_t num_rx_descriptor = RTE_TEST_RX_DESC_DEFAULT_ZCP; +static uint32_t num_tx_descriptor = RTE_TEST_TX_DESC_DEFAULT_ZCP; + +/* max ring descriptor, ixgbe, i40e, e1000 all are 4096. */ +#define MAX_RING_DESC 4096 + +struct vpool { + struct rte_mempool *pool; + struct rte_ring *ring; + uint32_t buf_size; +} vpool_array[MAX_QUEUES+MAX_QUEUES]; + /* Enable VM2VM communications. If this is disabled then the MAC address compare is skipped.
*/ -static uint32_t enable_vm2vm = 1; +typedef enum { + VM2VM_DISABLED = 0, + VM2VM_SOFTWARE = 1, + VM2VM_HARDWARE = 2, + VM2VM_LAST +} vm2vm_type; +static vm2vm_type vm2vm_mode = VM2VM_SOFTWARE; + +/* The type of host physical address translated from guest physical address. */ +typedef enum { + PHYS_ADDR_CONTINUOUS = 0, + PHYS_ADDR_CROSS_SUBREG = 1, + PHYS_ADDR_INVALID = 2, + PHYS_ADDR_LAST +} hpa_type; + /* Enable stats. */ static uint32_t enable_stats = 0; /* Enable retries on RX. */ @@ -159,7 +223,7 @@ static uint32_t dev_index = 0; extern uint64_t VHOST_FEATURES; /* Default configuration for rx and tx thresholds etc. */ -static const struct rte_eth_rxconf rx_conf_default = { +static struct rte_eth_rxconf rx_conf_default = { .rx_thresh = { .pthresh = RX_PTHRESH, .hthresh = RX_HTHRESH, @@ -173,7 +237,7 @@ static const struct rte_eth_rxconf rx_conf_defau
[dpdk-dev] [PATCH v3 1/3] ethdev: Add API to support queue start and stop functionality for RX/TX.
Please ignore previous patch v1, v2, just need apply this patch v3 for new API code changes. This patch adds API to support queue start and stop functionality for RX/TX. It allows RX and TX queue is started or stopped one by one, instead of starting and stopping all of them at the same time. Signed-off-by: Ouyang Changchun Tested-by: Waterman Cao This patch passed L2 Forward , L3 Forward testing base on commit: 57f0ba5f8b8588dfa6ffcd001447ef6337afa6cd. See test environment information as the following: Fedora 19 , Linux Kernel 3.9.0, GCC 4.8.2 X68_64, Intel Xeon processor E5-2600 and E5-2600 v2 family --- lib/librte_eal/linuxapp/eal/eal_memory.c | 2 +- lib/librte_ether/rte_ethdev.c| 104 +++ lib/librte_ether/rte_ethdev.h| 80 3 files changed, 185 insertions(+), 1 deletion(-) diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c index 5a10a80..8d1edd9 100644 --- a/lib/librte_eal/linuxapp/eal/eal_memory.c +++ b/lib/librte_eal/linuxapp/eal/eal_memory.c @@ -134,6 +134,7 @@ rte_mem_virt2phy(const void *virtaddr) uint64_t page, physaddr; unsigned long virt_pfn; int page_size; + off_t offset; /* standard page size */ page_size = getpagesize(); @@ -145,7 +146,6 @@ rte_mem_virt2phy(const void *virtaddr) return RTE_BAD_PHYS_ADDR; } - off_t offset; virt_pfn = (unsigned long)virtaddr / page_size; offset = sizeof(uint64_t) * virt_pfn; if (lseek(fd, offset, SEEK_SET) == (off_t) -1) { diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c index a5727dd..df7cb07 100644 --- a/lib/librte_ether/rte_ethdev.c +++ b/lib/librte_ether/rte_ethdev.c @@ -292,6 +292,110 @@ rte_eth_dev_rx_queue_config(struct rte_eth_dev *dev, uint16_t nb_queues) return (0); } +int +rte_eth_dev_rx_queue_start(uint8_t port_id, uint16_t rx_queue_id) +{ + struct rte_eth_dev *dev; + + /* This function is only safe when called from the primary process +* in a multi-process setup*/ + PROC_PRIMARY_OR_ERR_RET(-E_RTE_SECONDARY); + + if (port_id >= nb_ports) { 
+ PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id); + return -EINVAL; + } + + dev = &rte_eth_devices[port_id]; + if (rx_queue_id >= dev->data->nb_rx_queues) { + PMD_DEBUG_TRACE("Invalid RX queue_id=%d\n", rx_queue_id); + return -EINVAL; + } + + FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_start, -ENOTSUP); + + return dev->dev_ops->rx_queue_start(dev, rx_queue_id); + +} + +int +rte_eth_dev_rx_queue_stop(uint8_t port_id, uint16_t rx_queue_id) +{ + struct rte_eth_dev *dev; + + /* This function is only safe when called from the primary process +* in a multi-process setup*/ + PROC_PRIMARY_OR_ERR_RET(-E_RTE_SECONDARY); + + if (port_id >= nb_ports) { + PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id); + return -EINVAL; + } + + dev = &rte_eth_devices[port_id]; + if (rx_queue_id >= dev->data->nb_rx_queues) { + PMD_DEBUG_TRACE("Invalid RX queue_id=%d\n", rx_queue_id); + return -EINVAL; + } + + FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_stop, -ENOTSUP); + + return dev->dev_ops->rx_queue_stop(dev, rx_queue_id); + +} + +int +rte_eth_dev_tx_queue_start(uint8_t port_id, uint16_t tx_queue_id) +{ + struct rte_eth_dev *dev; + + /* This function is only safe when called from the primary process +* in a multi-process setup*/ + PROC_PRIMARY_OR_ERR_RET(-E_RTE_SECONDARY); + + if (port_id >= nb_ports) { + PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id); + return -EINVAL; + } + + dev = &rte_eth_devices[port_id]; + if (tx_queue_id >= dev->data->nb_tx_queues) { + PMD_DEBUG_TRACE("Invalid TX queue_id=%d\n", tx_queue_id); + return -EINVAL; + } + + FUNC_PTR_OR_ERR_RET(*dev->dev_ops->tx_queue_start, -ENOTSUP); + + return dev->dev_ops->tx_queue_start(dev, tx_queue_id); + +} + +int +rte_eth_dev_tx_queue_stop(uint8_t port_id, uint16_t tx_queue_id) +{ + struct rte_eth_dev *dev; + + /* This function is only safe when called from the primary process +* in a multi-process setup*/ + PROC_PRIMARY_OR_ERR_RET(-E_RTE_SECONDARY); + + if (port_id >= nb_ports) { + PMD_DEBUG_TRACE("Invalid port_id=%d\n", 
port_id); + return -EINVAL; + } + + dev = &rte_eth_devices[port_id]; + if (tx_queue_id >= dev->data->nb_tx_queues) { + PMD_DEBUG_TRACE("Invalid
[dpdk-dev] [PATCH v3 0/3] Support zero copy RX/TX in user space vhost
This patch v3 fixes some errors and warnings reported by checkpatch.pl.
Please ignore the previous 2 patches (v1 and v2) and apply only this v3 patch
for zero copy RX/TX in user space vhost.

This patch series supports user space vhost zero copy. It removes packet
copying between host and guest in RX/TX, and it introduces an extra ring to
store the detached mbufs. At the initialization stage all mbufs are put into
this ring; when one guest starts, vhost gets the available buffer addresses
allocated by the guest for RX and translates them into host space addresses,
then attaches them to mbufs and puts the attached mbufs into the mempool.

Queue starting and DMA refilling get mbufs from the mempool and use them to
set the DMA addresses.

For TX, it gets the buffer addresses of available packets to be transmitted
from the guest and translates them to host space addresses, then attaches
them to mbufs and puts them into the TX queues. After TX finishes, it pulls
mbufs out of the mempool, detaches them and puts them back into the extra
ring.

This patch series also implements queue start and stop functionality in the
IXGBE PMD, and enables hardware loopback for VMDQ mode in the IXGBE PMD.

Ouyang Changchun (3):
  Add API to support queue start and stop functionality for RX/TX.
  Implement queue start and stop functionality in IXGBE PMD; Enable
    hardware loopback for VMDQ mode in IXGBE PMD.
  Support user space vhost zero copy, it removes packets copying between
    host and guest in RX/TX.

 examples/vhost/main.c                    | 1476 --
 examples/vhost/virtio-net.c              |  186 +++-
 examples/vhost/virtio-net.h              |   23 +-
 lib/librte_eal/linuxapp/eal/eal_memory.c |    2 +-
 lib/librte_ether/rte_ethdev.c            |  104 +++
 lib/librte_ether/rte_ethdev.h            |   80 ++
 lib/librte_pmd_ixgbe/ixgbe_ethdev.c      |    4 +
 lib/librte_pmd_ixgbe/ixgbe_ethdev.h      |    8 +
 lib/librte_pmd_ixgbe/ixgbe_rxtx.c        |  239 -
 lib/librte_pmd_ixgbe/ixgbe_rxtx.h        |    6 +
 10 files changed, 2028 insertions(+), 100 deletions(-)

--
1.9.0
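The mbuf life cycle in this cover letter (detached mbufs wait in an extra ring, get attached to a guest buffer, and return to the ring once TX completes) can be illustrated with a small standalone model. This is only a sketch under simplifying assumptions: the types and functions (`sketch_mbuf`, `sketch_ring`, `sketch_attach`, `sketch_detach`) are hypothetical and do not use the real `rte_mbuf`, `rte_ring` or `rte_mempool` APIs; the mempool stage and the guest-to-host address translation are omitted; and the ring is single-threaded.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_RING_SIZE 8 /* must be a power of two for the index mask */

/* Hypothetical mbuf model: buf_addr is NULL while the mbuf is detached. */
struct sketch_mbuf {
	void *buf_addr;
};

/* Minimal single-threaded ring holding the detached mbufs. */
struct sketch_ring {
	struct sketch_mbuf *slots[SKETCH_RING_SIZE];
	uint32_t head; /* next slot to dequeue */
	uint32_t tail; /* next slot to enqueue */
};

static int
ring_enqueue(struct sketch_ring *r, struct sketch_mbuf *m)
{
	if (r->tail - r->head == SKETCH_RING_SIZE)
		return -1; /* ring is full */
	r->slots[r->tail++ & (SKETCH_RING_SIZE - 1)] = m;
	return 0;
}

static struct sketch_mbuf *
ring_dequeue(struct sketch_ring *r)
{
	if (r->tail == r->head)
		return NULL; /* ring is empty */
	return r->slots[r->head++ & (SKETCH_RING_SIZE - 1)];
}

/*
 * Take a detached mbuf from the ring and point it at a guest buffer whose
 * address is assumed to be already translated to host space.
 */
static struct sketch_mbuf *
sketch_attach(struct sketch_ring *detached, void *host_addr)
{
	struct sketch_mbuf *m = ring_dequeue(detached);

	if (m != NULL)
		m->buf_addr = host_addr;
	return m;
}

/* After TX completes: drop the buffer reference and recycle the mbuf. */
static int
sketch_detach(struct sketch_ring *detached, struct sketch_mbuf *m)
{
	m->buf_addr = NULL;
	return ring_enqueue(detached, m);
}
```

The point of the model is that the mbuf itself never owns packet memory: it only borrows a guest buffer between attach and detach, which is why no copy between host and guest is needed.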
[dpdk-dev] [PATCH v4 1/2] virtio: Cleanup the existing codes in virtio-net PMD
This patch cleanups some coding style issue, and fixes some errors and warnings reported by checkpatch.pl. Signed-off-by: Ouyang Changchun Tested-by: Waterman Cao This patch passed Testpmd testing base on commit: 57f0ba5f8b8588dfa6ffcd001447ef6337afa6cd. See test environment information as the following: Fedora 20, Linux kernel 3.13.9, GCC 4.8.2 X86_64, Intel Xeon processor E5-2600 and E5-2600 v2 family --- lib/librte_pmd_virtio/virtio_ethdev.c | 68 +++ lib/librte_pmd_virtio/virtio_ethdev.h | 30 lib/librte_pmd_virtio/virtio_rxtx.c | 43 -- lib/librte_pmd_virtio/virtqueue.h | 26 -- 4 files changed, 100 insertions(+), 67 deletions(-) diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c index 49e236b..685bf90 100644 --- a/lib/librte_pmd_virtio/virtio_ethdev.c +++ b/lib/librte_pmd_virtio/virtio_ethdev.c @@ -134,7 +134,7 @@ int virtio_dev_queue_setup(struct rte_eth_dev *dev, if (queue_type == VTNET_RQ) { rte_snprintf(vq_name, sizeof(vq_name), "port%d_rvq%d", - dev->data->port_id, queue_idx); + dev->data->port_id, queue_idx); vq = rte_zmalloc(vq_name, sizeof(struct virtqueue) + vq_size * sizeof(struct vq_desc_extra), CACHE_LINE_SIZE); memcpy(vq->vq_name, vq_name, sizeof(vq->vq_name)); @@ -146,8 +146,8 @@ int virtio_dev_queue_setup(struct rte_eth_dev *dev, memcpy(vq->vq_name, vq_name, sizeof(vq->vq_name)); } else if(queue_type == VTNET_CQ) { rte_snprintf(vq_name, sizeof(vq_name), "port%d_cvq", - dev->data->port_id); vq = rte_zmalloc(vq_name, sizeof(struct virtqueue), + dev->data->port_id); CACHE_LINE_SIZE); memcpy(vq->vq_name, vq_name, sizeof(vq->vq_name)); } @@ -155,6 +155,7 @@ int virtio_dev_queue_setup(struct rte_eth_dev *dev, PMD_INIT_LOG(ERR, "%s: Can not allocate virtqueue\n", __func__); return (-ENOMEM); } + vq->hw = hw; vq->port_id = dev->data->port_id; vq->queue_id = queue_idx; @@ -171,11 +172,12 @@ int virtio_dev_queue_setup(struct rte_eth_dev *dev, PMD_INIT_LOG(DEBUG, "vring_size: %d, rounded_vring_size: %d\n", size, 
vq->vq_ring_size); mz = rte_memzone_reserve_aligned(vq_name, vq->vq_ring_size, - socket_id, 0, VIRTIO_PCI_VRING_ALIGN); + socket_id, 0, VIRTIO_PCI_VRING_ALIGN); if (mz == NULL) { rte_free(vq); return (-ENOMEM); } + /* * Virtio PCI device VIRTIO_PCI_QUEUE_PF register is 32bit, * and only accepts 32 bit page frame number. @@ -186,6 +188,7 @@ int virtio_dev_queue_setup(struct rte_eth_dev *dev, rte_free(vq); return (-ENOMEM); } + memset(mz->addr, 0, sizeof(mz->len)); vq->mz = mz; vq->vq_ring_mem = mz->phys_addr; @@ -197,8 +200,8 @@ int virtio_dev_queue_setup(struct rte_eth_dev *dev, if (queue_type == VTNET_TQ) { /* - * For each xmit packet, allocate a virtio_net_hdr - */ +* For each xmit packet, allocate a virtio_net_hdr +*/ rte_snprintf(vq_name, sizeof(vq_name), "port%d_tvq%d_hdrzone", dev->data->port_id, queue_idx); vq->virtio_net_hdr_mz = rte_memzone_reserve_aligned(vq_name, @@ -206,10 +209,12 @@ int virtio_dev_queue_setup(struct rte_eth_dev *dev, socket_id, 0, CACHE_LINE_SIZE); if (vq->virtio_net_hdr_mz == NULL) { rte_free(vq); - return (-ENOMEM); + return -ENOMEM; } - vq->virtio_net_hdr_mem = (void *)(uintptr_t)vq->virtio_net_hdr_mz->phys_addr; - memset(vq->virtio_net_hdr_mz->addr, 0, vq_size * sizeof(struct virtio_net_hdr)); + vq->virtio_net_hdr_mem = + (void *)(uintptr_t)vq->virtio_net_hdr_mz->phys_addr; + memset(vq->virtio_net_hdr_mz->addr, 0, + vq_size * sizeof(struct virtio_net_hdr)); } else if (queue_type == VTNET_CQ) { /* Allocate a page for control vq command, data and status */ rte_snprintf(vq_name, sizeof(vq_name), "port%d_cvq_hdrzone", @@ -218,9 +223,10 @@ int virtio_dev_queue_setup(struct rte_eth_dev *dev, PAGE_SIZE, socket_id, 0, CACHE_LINE_SIZE); if (vq->virtio_net_hdr_mz == NULL) {
[dpdk-dev] [PATCH v4 2/2] virtio: Support multiple queues feature in DPDK based virtio-net frontend
This patch supports multiple queues feature in DPDK based virtio-net frontend. It firstly gets max queue number of virtio-net from virtio PCI configuration and then send command to negotiate the queue number with backend; When receiving and transmitting packets, it negotiates multiple virtio-net queues which serve RX/TX; To utilize this feature, the backend also need support multiple queues feature and enable it. Signed-off-by: Ouyang Changchun Tested-by: Waterman Cao This patch passed Testpmd testing base on commit: 57f0ba5f8b8588dfa6ffcd001447ef6337afa6cd. See test environment information as the following: Fedora 20, Linux kernel 3.13.9, GCC 4.8.2 X86_64, Intel Xeon processor E5-2600 and E5-2600 v2 family --- lib/librte_pmd_virtio/virtio_ethdev.c | 309 ++ lib/librte_pmd_virtio/virtio_ethdev.h | 10 +- lib/librte_pmd_virtio/virtio_pci.h| 4 +- lib/librte_pmd_virtio/virtio_rxtx.c | 49 -- lib/librte_pmd_virtio/virtqueue.h | 35 +++- 5 files changed, 358 insertions(+), 49 deletions(-) diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c index 685bf90..c2b4dfb 100644 --- a/lib/librte_pmd_virtio/virtio_ethdev.c +++ b/lib/librte_pmd_virtio/virtio_ethdev.c @@ -81,6 +81,12 @@ static void virtio_dev_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats * static void virtio_dev_stats_reset(struct rte_eth_dev *dev); static void virtio_dev_free_mbufs(struct rte_eth_dev *dev); +static int virtio_dev_queue_stats_mapping_set( + __rte_unused struct rte_eth_dev *eth_dev, + __rte_unused uint16_t queue_id, + __rte_unused uint8_t stat_idx, + __rte_unused uint8_t is_rx); + /* * The set of PCI devices this driver supports */ @@ -92,6 +98,135 @@ static struct rte_pci_id pci_id_virtio_map[] = { { .vendor_id = 0, /* sentinel */ }, }; +static int +virtio_send_command(struct virtqueue *vq, struct virtio_pmd_ctrl *ctrl, + int *dlen, int pkt_num) +{ + uint32_t head = vq->vq_desc_head_idx, i; + int k, sum = 0; + virtio_net_ctrl_ack status = ~0; + struct 
virtio_pmd_ctrl result; + + ctrl->status = status; + + if (!vq->hw->cvq) { + PMD_INIT_LOG(ERR, "%s(): Control queue is " + "not supported by this device.\n", __func__); + return -1; + } + + PMD_INIT_LOG(DEBUG, "vq->vq_desc_head_idx = %d, status = %d, " + "vq->hw->cvq = %p vq = %p\n", + vq->vq_desc_head_idx, status, vq->hw->cvq, vq); + + if ((vq->vq_free_cnt < ((uint32_t)pkt_num + 2)) || (pkt_num < 1)) + return -1; + + memcpy(vq->virtio_net_hdr_mz->addr, ctrl, + sizeof(struct virtio_pmd_ctrl)); + + /* +* Format is enforced in qemu code: +* One TX packet for header; +* At least one TX packet per argument; +* One RX packet for ACK. +*/ + vq->vq_ring.desc[head].flags = VRING_DESC_F_NEXT; + vq->vq_ring.desc[head].addr = vq->virtio_net_hdr_mz->phys_addr; + vq->vq_ring.desc[head].len = sizeof(struct virtio_net_ctrl_hdr); + vq->vq_free_cnt--; + i = vq->vq_ring.desc[head].next; + + for (k = 0; k < pkt_num; k++) { + vq->vq_ring.desc[i].flags = VRING_DESC_F_NEXT; + vq->vq_ring.desc[i].addr = vq->virtio_net_hdr_mz->phys_addr + + sizeof(struct virtio_net_ctrl_hdr) + + sizeof(ctrl->status) + sizeof(uint8_t)*sum; + vq->vq_ring.desc[i].len = dlen[k]; + sum += dlen[k]; + vq->vq_free_cnt--; + i = vq->vq_ring.desc[i].next; + } + + vq->vq_ring.desc[i].flags = VRING_DESC_F_WRITE; + vq->vq_ring.desc[i].addr = vq->virtio_net_hdr_mz->phys_addr + + sizeof(struct virtio_net_ctrl_hdr); + vq->vq_ring.desc[i].len = sizeof(ctrl->status); + vq->vq_free_cnt--; + + vq->vq_desc_head_idx = vq->vq_ring.desc[i].next; + + vq_update_avail_ring(vq, head); + vq_update_avail_idx(vq); + + PMD_INIT_LOG(DEBUG, "vq->vq_queue_index = %d\n", vq->vq_queue_index); + + virtqueue_notify(vq); + + while (vq->vq_used_cons_idx == vq->vq_ring.used->idx) + usleep(100); + + while (vq->vq_used_cons_idx != vq->vq_ring.used->idx) { + uint32_t idx, desc_idx, used_idx; + struct vring_used_elem *uep; + + rmb(); + + used_idx = (uint32_t)(vq->vq_used_cons_idx + & (vq->vq_nentries - 1)); + uep = &vq->vq_ring.used->ring[used_idx]; + 
idx = (uint32_t) uep->id; + desc_idx = idx; + +
[dpdk-dev] [PATCH v4 0/2] Support multiple queues feature in DPDK based virtio-net frontend
This v4 patch series replaces the previous v1, v2 and v3 patches for the virtio-net multiple queues feature. Please apply this v4 series and ignore the previous patches. It splits the previous single patch into the following 2 patches to make review easier: Cleanup the existing codes in virtio-net PMD; Support multiple queues feature in DPDK based virtio-net frontend. In sum, this series supports the multiple queues feature in the DPDK based virtio-net frontend. It first gets the maximum queue number of virtio-net from the virtio PCI configuration and then sends a command to negotiate the queue number with the backend. When receiving and transmitting packets, the negotiated virtio-net queues serve RX/TX. To use this feature, the backend also needs to support the multiple queues feature and have it enabled. Ouyang Changchun (2): Cleanup the existing codes in virtio-net PMD. Support multiple queues feature in DPDK based virtio-net frontend. lib/librte_pmd_virtio/virtio_ethdev.c | 377 -- lib/librte_pmd_virtio/virtio_ethdev.h | 40 ++-- lib/librte_pmd_virtio/virtio_pci.h| 4 +- lib/librte_pmd_virtio/virtio_rxtx.c | 92 +++-- lib/librte_pmd_virtio/virtqueue.h | 61 -- 5 files changed, 458 insertions(+), 116 deletions(-) -- 1.9.0
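The negotiation step described in the cover letter can be sketched roughly as below. This is a minimal illustration, not the code from the patch: the four constants follow the multiqueue control class in the virtio specification, while `struct ex_mq_request`, its field names and `ex_build_mq_request()` are hypothetical names introduced here. The request is only built when the wanted number of queue pairs fits both the specification range and the maximum read from the device configuration.

```c
#include <stdint.h>

/* Multiqueue control class values, as given in the virtio specification. */
#define VIRTIO_NET_CTRL_MQ               4
#define VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET  0
#define VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MIN  1
#define VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX  0x8000

/* Hypothetical, simplified request layout used only for this sketch. */
struct ex_mq_request {
	uint8_t  class_id;         /* control class */
	uint8_t  cmd;              /* command within the class */
	uint16_t virtqueue_pairs;  /* number of RX/TX queue pairs requested */
};

/*
 * Fill in a "set queue pairs" request.
 * Returns 0 on success, -1 if 'wanted' is outside the allowed range or
 * larger than 'dev_max' (the maximum advertised by the device).
 */
static int
ex_build_mq_request(uint16_t wanted, uint16_t dev_max,
		    struct ex_mq_request *req)
{
	if (wanted < VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MIN ||
	    wanted > VIRTIO_NET_CTRL_MQ_VQ_PAIRS_MAX ||
	    wanted > dev_max)
		return -1;

	req->class_id = VIRTIO_NET_CTRL_MQ;
	req->cmd = VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET;
	req->virtqueue_pairs = wanted;
	return 0;
}
```

In the real driver the request would then be placed on the control queue and the backend's ACK checked before the extra queues are used.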
[dpdk-dev] [PATCH] librte_vhost: Fix the path test issue
Commit aec8283d47d4e4366b6 fixed a compilation issue but introduced a runtime issue: host_memory_map() exits early when it should not. In some cases 'path' is NULL but 'resolved_path' still holds a usable path; the function should then continue rather than exit. Signed-off-by: Changchun Ouyang --- lib/librte_vhost/virtio-net.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/lib/librte_vhost/virtio-net.c b/lib/librte_vhost/virtio-net.c index 8015dd8..3fa1274 100644 --- a/lib/librte_vhost/virtio-net.c +++ b/lib/librte_vhost/virtio-net.c @@ -237,7 +237,7 @@ host_memory_map(struct virtio_net *dev, struct virtio_memory *mem, snprintf(memfile, PATH_MAX, "/proc/%u/fd/%s", pid, dptr->d_name); path = realpath(memfile, resolved_path); - if (path == NULL) { + if ((path == NULL) && (strlen(resolved_path) == 0)) { RTE_LOG(ERR, VHOST_CONFIG, "(%"PRIu64") Failed to resolve fd directory\n", dev->device_fh); -- 1.8.4.2
[dpdk-dev] [PATCH v3 0/2] Fix packet length issue
This patch set fixes the packet length issue in the vhost app, and improves the code by extracting a function to replace code duplicated in the one copy and zero copy TX functions. -v3 change: Extract a function to replace code duplicated in the one copy and zero copy TX functions. -v2 change: Update the data length by adding the offset in the first segment instead of the last segment. -v1 change: Update the packet length by adding the offset; use a macro to replace the constant. Changchun Ouyang (2): Fix packet length issue in vhost. Extract a function to replace duplicated codes in vhost. examples/vhost/main.c | 137 ++ 1 file changed, 61 insertions(+), 76 deletions(-) -- 1.8.4.2
[dpdk-dev] [PATCH v3 2/2] vhost: Remove duplicated codes
Extract a function to replace duplicated codes in one copy and zero copy TX function. Signed-off-by: Changchun Ouyang --- examples/vhost/main.c | 139 +- 1 file changed, 58 insertions(+), 81 deletions(-) diff --git a/examples/vhost/main.c b/examples/vhost/main.c index 5ca8dce..2916313 100644 --- a/examples/vhost/main.c +++ b/examples/vhost/main.c @@ -1040,6 +1040,57 @@ virtio_tx_local(struct vhost_dev *vdev, struct rte_mbuf *m) } /* + * Check if the destination MAC of a packet is one local VM, + * and get its vlan tag, and offset if it is. + */ +static inline int __attribute__((always_inline)) +find_local_dest(struct virtio_net *dev, struct rte_mbuf *m, + uint32_t *offset, uint16_t *vlan_tag) +{ + struct virtio_net_data_ll *dev_ll = ll_root_used; + struct ether_hdr *pkt_hdr = rte_pktmbuf_mtod(m, struct ether_hdr *); + + while (dev_ll != NULL) { + if ((dev_ll->vdev->ready == DEVICE_RX) + && ether_addr_cmp(&(pkt_hdr->d_addr), + &dev_ll->vdev->mac_address)) { + /* +* Drop the packet if the TX packet is +* destined for the TX device. +*/ + if (dev_ll->vdev->dev->device_fh == dev->device_fh) { + LOG_DEBUG(VHOST_DATA, + "(%"PRIu64") TX: Source and destination" + " MAC addresses are the same. Dropping " + "packet.\n", + dev_ll->vdev->dev->device_fh); + return -1; + } + + /* +* HW vlan strip will reduce the packet length +* by minus length of vlan tag, so need restore +* the packet length by plus it. +*/ + *offset = VLAN_HLEN; + *vlan_tag = + (uint16_t) + vlan_tags[(uint16_t)dev_ll->vdev->dev->device_fh]; + + LOG_DEBUG(VHOST_DATA, + "(%"PRIu64") TX: pkt to local VM device id:" + "(%"PRIu64") vlan tag: %d.\n", + dev->device_fh, dev_ll->vdev->dev->device_fh, + vlan_tag); + + break; + } + dev_ll = dev_ll->next; + } + return 0; +} + +/* * This function routes the TX packet to the correct interface. This may be a local device * or the physical port. 
*/ @@ -1050,8 +1101,6 @@ virtio_tx_route(struct vhost_dev *vdev, struct rte_mbuf *m, uint16_t vlan_tag) struct rte_mbuf **m_table; unsigned len, ret, offset = 0; const uint16_t lcore_id = rte_lcore_id(); - struct virtio_net_data_ll *dev_ll = ll_root_used; - struct ether_hdr *pkt_hdr = rte_pktmbuf_mtod(m, struct ether_hdr *); struct virtio_net *dev = vdev->dev; /*check if destination is local VM*/ @@ -1061,43 +1110,9 @@ virtio_tx_route(struct vhost_dev *vdev, struct rte_mbuf *m, uint16_t vlan_tag) } if (vm2vm_mode == VM2VM_HARDWARE) { - while (dev_ll != NULL) { - if ((dev_ll->vdev->ready == DEVICE_RX) - && ether_addr_cmp(&(pkt_hdr->d_addr), - &dev_ll->vdev->mac_address)) { - /* -* Drop the packet if the TX packet is -* destined for the TX device. -*/ - if (dev_ll->vdev->dev->device_fh == dev->device_fh) { - LOG_DEBUG(VHOST_DATA, - "(%"PRIu64") TX: Source and destination" - " MAC addresses are the same. Dropping " - "packet.\n", - dev_ll->vdev->dev->device_fh); - rte_pktmbuf_free(m); - return; - } - - /* -* HW vlan strip will reduce the packet length -* by minus length of vlan tag, so need restore -* the packet length by plus it. -*/ - offset = VLAN_HLEN; - vlan_tag = - (uint16_t) - vlan_tags[(uint16_t)dev_ll->vdev->dev->device_fh]; - - LOG_DEBUG(VHOST_DATA, - "(%"PRIu64") TX: pkt to local VM device id:" -
[dpdk-dev] [PATCH v3 1/2] vhost: Fix packet length issue
HW vlan strip reduces the packet length by the length of the vlan tag, so the packet length needs to be restored by adding it back. Signed-off-by: Changchun Ouyang --- examples/vhost/main.c | 12 ++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/examples/vhost/main.c b/examples/vhost/main.c index 57ef464..5ca8dce 100644 --- a/examples/vhost/main.c +++ b/examples/vhost/main.c @@ -1078,7 +1078,13 @@ virtio_tx_route(struct vhost_dev *vdev, struct rte_mbuf *m, uint16_t vlan_tag) rte_pktmbuf_free(m); return; } - offset = 4; + + /* +* HW vlan strip will reduce the packet length +* by minus length of vlan tag, so need restore +* the packet length by plus it. +*/ + offset = VLAN_HLEN; vlan_tag = (uint16_t) vlan_tags[(uint16_t)dev_ll->vdev->dev->device_fh]; @@ -1102,8 +1108,10 @@ virtio_tx_route(struct vhost_dev *vdev, struct rte_mbuf *m, uint16_t vlan_tag) len = tx_q->len; m->ol_flags = PKT_TX_VLAN_PKT; - /*FIXME: offset*/ + m->data_len += offset; + m->pkt_len += offset; + m->vlan_tci = vlan_tag; tx_q->m_table[len] = m; -- 1.8.4.2
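The length fix in the patch above can be illustrated in isolation. This is a minimal sketch that assumes a stand-in structure: `struct ex_mbuf` and `ex_restore_vlan_len()` are hypothetical names, not the real `rte_mbuf` API, and only carry the three fields the patch touches. `VLAN_HLEN` is taken as 4 because the patch replaces the literal `offset = 4` with it.

```c
#include <stdint.h>

#define VLAN_HLEN 4  /* the patch replaces the constant 4 with this macro */

/* Hypothetical stand-in holding only the fields used by the patch. */
struct ex_mbuf {
	uint16_t data_len;  /* bytes of data in this segment */
	uint32_t pkt_len;   /* total bytes in the packet */
	uint16_t vlan_tci;  /* vlan tag to be inserted on TX */
};

/*
 * Grow the first segment and the whole packet by 'offset' bytes and record
 * the tag, mirroring the three assignments added by the patch.
 */
static void
ex_restore_vlan_len(struct ex_mbuf *m, uint32_t offset, uint16_t vlan_tag)
{
	m->data_len = (uint16_t)(m->data_len + offset);
	m->pkt_len += offset;
	m->vlan_tci = vlan_tag;
}
```

With `offset` left at 0 (the non-local case) the lengths are unchanged, which matches how `virtio_tx_route()` initialises `offset`.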
[dpdk-dev] [PATCH v4 1/3] vhost: Fix packet length issue
HW vlan strip reduces the packet length by the length of the vlan tag, so the packet length needs to be restored by adding it back. Signed-off-by: Changchun Ouyang --- examples/vhost/main.c | 12 ++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/examples/vhost/main.c b/examples/vhost/main.c index 57ef464..5ca8dce 100644 --- a/examples/vhost/main.c +++ b/examples/vhost/main.c @@ -1078,7 +1078,13 @@ virtio_tx_route(struct vhost_dev *vdev, struct rte_mbuf *m, uint16_t vlan_tag) rte_pktmbuf_free(m); return; } - offset = 4; + + /* +* HW vlan strip will reduce the packet length +* by minus length of vlan tag, so need restore +* the packet length by plus it. +*/ + offset = VLAN_HLEN; vlan_tag = (uint16_t) vlan_tags[(uint16_t)dev_ll->vdev->dev->device_fh]; @@ -1102,8 +1108,10 @@ virtio_tx_route(struct vhost_dev *vdev, struct rte_mbuf *m, uint16_t vlan_tag) len = tx_q->len; m->ol_flags = PKT_TX_VLAN_PKT; - /*FIXME: offset*/ + m->data_len += offset; + m->pkt_len += offset; + m->vlan_tci = vlan_tag; tx_q->m_table[len] = m; -- 1.8.4.2
[dpdk-dev] [PATCH v4 0/3] Fix packet length issue
This patch set fixes the packet length issue in the vhost app, and improves the code by extracting a function to replace code duplicated in the one copy and zero copy TX functions. -v4 change: Check the offset value and whether the extra bytes inside the packet buffer cross a page boundary. -v3 change: Extract a function to replace code duplicated in the one copy and zero copy TX functions. -v2 change: Update the data length by adding the offset in the first segment instead of the last segment. -v1 change: Update the packet length by adding the offset; use a macro to replace the constant. Changchun Ouyang (3): Fix packet length issue in vhost. Extract a function to replace duplicated codes in vhost. Check offset value in vhost examples/vhost/main.c | 142 +++--- 1 file changed, 65 insertions(+), 77 deletions(-) -- 1.8.4.2
[dpdk-dev] [PATCH v4 2/3] vhost: Remove duplicated codes
Extract a function to replace duplicated codes in one copy and zero copy TX function. Signed-off-by: Changchun Ouyang --- examples/vhost/main.c | 139 +- 1 file changed, 58 insertions(+), 81 deletions(-) diff --git a/examples/vhost/main.c b/examples/vhost/main.c index 5ca8dce..2916313 100644 --- a/examples/vhost/main.c +++ b/examples/vhost/main.c @@ -1040,6 +1040,57 @@ virtio_tx_local(struct vhost_dev *vdev, struct rte_mbuf *m) } /* + * Check if the destination MAC of a packet is one local VM, + * and get its vlan tag, and offset if it is. + */ +static inline int __attribute__((always_inline)) +find_local_dest(struct virtio_net *dev, struct rte_mbuf *m, + uint32_t *offset, uint16_t *vlan_tag) +{ + struct virtio_net_data_ll *dev_ll = ll_root_used; + struct ether_hdr *pkt_hdr = rte_pktmbuf_mtod(m, struct ether_hdr *); + + while (dev_ll != NULL) { + if ((dev_ll->vdev->ready == DEVICE_RX) + && ether_addr_cmp(&(pkt_hdr->d_addr), + &dev_ll->vdev->mac_address)) { + /* +* Drop the packet if the TX packet is +* destined for the TX device. +*/ + if (dev_ll->vdev->dev->device_fh == dev->device_fh) { + LOG_DEBUG(VHOST_DATA, + "(%"PRIu64") TX: Source and destination" + " MAC addresses are the same. Dropping " + "packet.\n", + dev_ll->vdev->dev->device_fh); + return -1; + } + + /* +* HW vlan strip will reduce the packet length +* by minus length of vlan tag, so need restore +* the packet length by plus it. +*/ + *offset = VLAN_HLEN; + *vlan_tag = + (uint16_t) + vlan_tags[(uint16_t)dev_ll->vdev->dev->device_fh]; + + LOG_DEBUG(VHOST_DATA, + "(%"PRIu64") TX: pkt to local VM device id:" + "(%"PRIu64") vlan tag: %d.\n", + dev->device_fh, dev_ll->vdev->dev->device_fh, + vlan_tag); + + break; + } + dev_ll = dev_ll->next; + } + return 0; +} + +/* * This function routes the TX packet to the correct interface. This may be a local device * or the physical port. 
*/ @@ -1050,8 +1101,6 @@ virtio_tx_route(struct vhost_dev *vdev, struct rte_mbuf *m, uint16_t vlan_tag) struct rte_mbuf **m_table; unsigned len, ret, offset = 0; const uint16_t lcore_id = rte_lcore_id(); - struct virtio_net_data_ll *dev_ll = ll_root_used; - struct ether_hdr *pkt_hdr = rte_pktmbuf_mtod(m, struct ether_hdr *); struct virtio_net *dev = vdev->dev; /*check if destination is local VM*/ @@ -1061,43 +1110,9 @@ virtio_tx_route(struct vhost_dev *vdev, struct rte_mbuf *m, uint16_t vlan_tag) } if (vm2vm_mode == VM2VM_HARDWARE) { - while (dev_ll != NULL) { - if ((dev_ll->vdev->ready == DEVICE_RX) - && ether_addr_cmp(&(pkt_hdr->d_addr), - &dev_ll->vdev->mac_address)) { - /* -* Drop the packet if the TX packet is -* destined for the TX device. -*/ - if (dev_ll->vdev->dev->device_fh == dev->device_fh) { - LOG_DEBUG(VHOST_DATA, - "(%"PRIu64") TX: Source and destination" - " MAC addresses are the same. Dropping " - "packet.\n", - dev_ll->vdev->dev->device_fh); - rte_pktmbuf_free(m); - return; - } - - /* -* HW vlan strip will reduce the packet length -* by minus length of vlan tag, so need restore -* the packet length by plus it. -*/ - offset = VLAN_HLEN; - vlan_tag = - (uint16_t) - vlan_tags[(uint16_t)dev_ll->vdev->dev->device_fh]; - - LOG_DEBUG(VHOST_DATA, - "(%"PRIu64") TX: pkt to local VM device id:" -
[dpdk-dev] [PATCH v4 3/3] vhost: Check offset value
This patch checks the packet length offset value, and checks if the extra bytes inside buffer cross page boundary. Signed-off-by: Changchun Ouyang --- examples/vhost/main.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/examples/vhost/main.c b/examples/vhost/main.c index 2916313..a93f7a0 100644 --- a/examples/vhost/main.c +++ b/examples/vhost/main.c @@ -1110,7 +1110,8 @@ virtio_tx_route(struct vhost_dev *vdev, struct rte_mbuf *m, uint16_t vlan_tag) } if (vm2vm_mode == VM2VM_HARDWARE) { - if (find_local_dest(dev, m, &offset, &vlan_tag) != 0) { + if (find_local_dest(dev, m, &offset, &vlan_tag) != 0 || + offset > rte_pktmbuf_tailroom(m)) { rte_pktmbuf_free(m); return; } @@ -1896,7 +1897,9 @@ virtio_dev_tx_zcp(struct virtio_net *dev) /* Buffer address translation. */ buff_addr = gpa_to_vva(dev, desc->addr); - phys_addr = gpa_to_hpa(vdev, desc->addr, desc->len, &addr_type); + /* Need check extra VLAN_HLEN size for inserting VLAN tag */ + phys_addr = gpa_to_hpa(vdev, desc->addr, desc->len + VLAN_HLEN, + &addr_type); if (likely(packet_success < (free_entries - 1))) /* Prefetch descriptor index. */ -- 1.8.4.2
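The tailroom check added to `virtio_tx_route()` can be pictured with a small stand-alone sketch. It assumes a single-segment buffer described by a hypothetical `struct ex_buf`; `ex_tailroom()` and `ex_can_extend()` are illustrative helpers, not DPDK functions. The idea is that the extra `offset` bytes may only be added to the length when that much unused space remains after the data.

```c
#include <stdint.h>

/* Hypothetical single-segment buffer description for this sketch. */
struct ex_buf {
	uint16_t buf_len;   /* total size of the buffer */
	uint16_t data_off;  /* headroom: bytes before the data */
	uint16_t data_len;  /* bytes of data currently stored */
};

/* Unused bytes after the end of the data. */
static uint16_t
ex_tailroom(const struct ex_buf *b)
{
	return (uint16_t)(b->buf_len - b->data_off - b->data_len);
}

/* Return 1 when 'offset' extra bytes fit in the tailroom, 0 otherwise. */
static int
ex_can_extend(const struct ex_buf *b, uint32_t offset)
{
	return offset <= ex_tailroom(b);
}
```

In the patch, a packet that fails the equivalent check is freed instead of being queued with a length that runs past its buffer.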
[dpdk-dev] [PATCH v4 3/3] vhost: Check offset value
Agree with Thomas! Small patches are easier to understand. Merging and mixing things together is not a good thing. Changchun > -Original Message- > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com] > Sent: Thursday, November 6, 2014 1:01 AM > To: Xie, Huawei > Cc: dev at dpdk.org; Ouyang, Changchun > Subject: Re: [dpdk-dev] [PATCH v4 3/3] vhost: Check offset value > > 2014-11-05 16:52, Xie, Huawei: > > Why don't we merge 1,2,3 patches? > > Because it's simpler to understand small patches with a dedicated > explanation in the commit log of each patch. > Why do you want to merge them? > > -- > Thomas
[dpdk-dev] [PATCH] librte_vhost: Fix the path test issue
Hi Huawei, Thanks for the comments. My response is as follows. > -Original Message- > From: Xie, Huawei > Sent: Thursday, November 6, 2014 10:39 AM > To: Ouyang, Changchun; dev at dpdk.org > Subject: RE: [dpdk-dev] [PATCH] librte_vhost: Fix the path test issue > > > path = realpath(memfile, resolved_path); > > - if (path == NULL) { > > + if ((path == NULL) && (strlen(resolved_path) == 0)) { > > RTE_LOG(ERR, VHOST_CONFIG, > > "(%"PRIu64") Failed to resolve fd directory\n", > > dev->device_fh); > Changchun: > For some strange file, according to API description, we shouldn't check > resolved_path as it is undefined. > To make the loop go on, we could use "continue" when we detect path is > NULL. > > RETURN VALUE >If there is no error, realpath() returns a pointer to the > resolved_path. > >Otherwise it returns a NULL pointer, and the contents of the array > resolved_path are undefined, and errno is set to indicate the error. After investigating this issue, I found that using continue doesn't work. The reason is that procmap.fname itself is "/dev/hugepages/qemu_back_mem.pc.ram.zxfqLq". It is not a normal path, so in this case path is NULL while resolved_path is /dev/hugepages/qemu_back_mem.pc.ram.zxfqLq. If 'continue' is used, procmap.fname can never be matched in the directory list, and the app will exit after reporting: Failed to find memory file for pid. So I have to keep it. Thanks again Changchun
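The return-value rule quoted from the realpath() manual page can be shown with a short sketch. `ex_resolve_path()` is a hypothetical helper written for this illustration only. It follows the documented contract strictly: on a NULL return it reports errno and never reads the output buffer. It does not address the hugepage file case described in the reply above, where a different strategy would be needed.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/*
 * Resolve 'in' into 'out' (capacity 'out_len').
 * Returns 0 on success or a negative errno value on failure. When
 * realpath() fails, the scratch buffer is treated as undefined and is
 * not inspected.
 */
static int
ex_resolve_path(const char *in, char *out, size_t out_len)
{
	char buf[PATH_MAX];

	assert(in != NULL && out != NULL);

	if (realpath(in, buf) == NULL)
		return -errno;

	if (strlen(buf) >= out_len)
		return -ENAMETOOLONG;

	strcpy(out, buf);
	return 0;
}
```

A caller iterating over /proc/PID/fd entries could use the negative return to skip or to abort, depending on which behaviour is wanted.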
[dpdk-dev] [PATCH v4 0/5] Support virtio multicast feature
-V1 change: This patch series supports the multicast feature in virtio and vhost. The vhost backend enables promiscuous mode and configures ETH_VMDQ_ACCEPT_BROADCAST and ETH_VMDQ_ACCEPT_MULTICAST in the VMDQ offload register to receive multicast and broadcast packets. The virtio frontend provides the functionality of enabling and disabling the multicast and promiscuous modes. -V2 change: Rework the patch based on the new vhost library and new vhost application. -V3 change: Rework the patch to address comments, split commits. -V4 change: Refine code comments and patch titles, factorize code, and resolve conflicts. Changchun Ouyang (5): ethdev: Add vmdq rx mode igb: Config VM offload register ixgbe: Configure Rx mode for VMDQ virtio: Support promiscuous and allmulticast vhost: Enable promisc mode and multicast examples/vhost/main.c | 24 -- lib/librte_ether/rte_ethdev.h | 1 + lib/librte_pmd_e1000/igb_rxtx.c | 20 lib/librte_pmd_ixgbe/ixgbe_ethdev.c | 31 lib/librte_pmd_ixgbe/ixgbe_ethdev.h | 1 + lib/librte_pmd_ixgbe/ixgbe_rxtx.c | 6 +++ lib/librte_pmd_virtio/virtio_ethdev.c | 90 ++- lib/librte_vhost/virtio-net.c | 3 +- 8 files changed, 161 insertions(+), 15 deletions(-) -- 1.8.4.2
[dpdk-dev] [PATCH v4 1/5] ethdev: Add vmdq rx mode
Add a vmdq rx mode field to the rx config struct; it holds flags from ETH_VMDQ_ACCEPT_*. Signed-off-by: Changchun Ouyang --- lib/librte_ether/rte_ethdev.h | 1 + 1 file changed, 1 insertion(+) diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h index 7e4c998..c29525b 100644 --- a/lib/librte_ether/rte_ethdev.h +++ b/lib/librte_ether/rte_ethdev.h @@ -593,6 +593,7 @@ struct rte_eth_vmdq_rx_conf { uint8_t default_pool; /**< The default pool, if applicable */ uint8_t enable_loop_back; /**< Enable VT loop back */ uint8_t nb_pool_maps; /**< We can have up to 64 filters/mappings */ + uint32_t rx_mode; /**< Flags from ETH_VMDQ_ACCEPT_* */ struct { uint16_t vlan_id; /**< The vlan id of the received frame */ uint64_t pools; /**< Bitmask of pools for packet rx */ -- 1.8.4.2
[dpdk-dev] [PATCH v4 2/5] igb: Config VM offload register
Configure the VM offload register in the igb PMD so that it can receive broadcast and multicast packets. Signed-off-by: Changchun Ouyang --- lib/librte_pmd_e1000/igb_rxtx.c | 20 1 file changed, 20 insertions(+) diff --git a/lib/librte_pmd_e1000/igb_rxtx.c b/lib/librte_pmd_e1000/igb_rxtx.c index f09c525..0dca7b7 100644 --- a/lib/librte_pmd_e1000/igb_rxtx.c +++ b/lib/librte_pmd_e1000/igb_rxtx.c @@ -1779,6 +1779,26 @@ igb_vmdq_rx_hw_configure(struct rte_eth_dev *dev) vt_ctl |= E1000_VT_CTL_IGNORE_MAC; E1000_WRITE_REG(hw, E1000_VT_CTL, vt_ctl); + for (i = 0; i < E1000_VMOLR_SIZE; i++) { + vmolr = E1000_READ_REG(hw, E1000_VMOLR(i)); + vmolr &= ~(E1000_VMOLR_AUPE | E1000_VMOLR_ROMPE | + E1000_VMOLR_ROPE | E1000_VMOLR_BAM | + E1000_VMOLR_MPME); + + if (cfg->rx_mode & ETH_VMDQ_ACCEPT_UNTAG) + vmolr |= E1000_VMOLR_AUPE; + if (cfg->rx_mode & ETH_VMDQ_ACCEPT_HASH_MC) + vmolr |= E1000_VMOLR_ROMPE; + if (cfg->rx_mode & ETH_VMDQ_ACCEPT_HASH_UC) + vmolr |= E1000_VMOLR_ROPE; + if (cfg->rx_mode & ETH_VMDQ_ACCEPT_BROADCAST) + vmolr |= E1000_VMOLR_BAM; + if (cfg->rx_mode & ETH_VMDQ_ACCEPT_MULTICAST) + vmolr |= E1000_VMOLR_MPME; + + E1000_WRITE_REG(hw, E1000_VMOLR(i), vmolr); + } + /* * VMOLR: set STRVLAN as 1 if IGMAC in VTCTL is set as 1 * Both 82576 and 82580 support it -- 1.8.4.2
[dpdk-dev] [PATCH v4 3/5] ixgbe: Configure Rx mode for VMDQ
Config PFVML2FLT register in ixgbe PMD to enable it receive broadcast and multicast packets; also factorize the common logic with ixgbe_set_pool_rx_mode. Signed-off-by: Changchun Ouyang --- lib/librte_pmd_ixgbe/ixgbe_ethdev.c | 31 +-- lib/librte_pmd_ixgbe/ixgbe_ethdev.h | 1 + lib/librte_pmd_ixgbe/ixgbe_rxtx.c | 6 ++ 3 files changed, 28 insertions(+), 10 deletions(-) diff --git a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c index 9c73a30..fb7ed3d 100644 --- a/lib/librte_pmd_ixgbe/ixgbe_ethdev.c +++ b/lib/librte_pmd_ixgbe/ixgbe_ethdev.c @@ -3123,6 +3123,26 @@ ixgbe_uc_all_hash_table_set(struct rte_eth_dev *dev, uint8_t on) return 0; } + +uint32_t +ixgbe_convert_vm_rx_mask_to_val(uint16_t rx_mask, uint32_t orig_val) +{ + uint32_t new_val = orig_val; + + if (rx_mask & ETH_VMDQ_ACCEPT_UNTAG) + new_val |= IXGBE_VMOLR_AUPE; + if (rx_mask & ETH_VMDQ_ACCEPT_HASH_MC) + new_val |= IXGBE_VMOLR_ROMPE; + if (rx_mask & ETH_VMDQ_ACCEPT_HASH_UC) + new_val |= IXGBE_VMOLR_ROPE; + if (rx_mask & ETH_VMDQ_ACCEPT_BROADCAST) + new_val |= IXGBE_VMOLR_BAM; + if (rx_mask & ETH_VMDQ_ACCEPT_MULTICAST) + new_val |= IXGBE_VMOLR_MPE; + + return new_val; +} + static int ixgbe_set_pool_rx_mode(struct rte_eth_dev *dev, uint16_t pool, uint16_t rx_mask, uint8_t on) @@ -3141,16 +3161,7 @@ ixgbe_set_pool_rx_mode(struct rte_eth_dev *dev, uint16_t pool, if (ixgbe_vmdq_mode_check(hw) < 0) return (-ENOTSUP); - if (rx_mask & ETH_VMDQ_ACCEPT_UNTAG ) - val |= IXGBE_VMOLR_AUPE; - if (rx_mask & ETH_VMDQ_ACCEPT_HASH_MC ) - val |= IXGBE_VMOLR_ROMPE; - if (rx_mask & ETH_VMDQ_ACCEPT_HASH_UC) - val |= IXGBE_VMOLR_ROPE; - if (rx_mask & ETH_VMDQ_ACCEPT_BROADCAST) - val |= IXGBE_VMOLR_BAM; - if (rx_mask & ETH_VMDQ_ACCEPT_MULTICAST) - val |= IXGBE_VMOLR_MPE; + val = ixgbe_convert_vm_rx_mask_to_val(rx_mask, val); if (on) vmolr |= val; diff --git a/lib/librte_pmd_ixgbe/ixgbe_ethdev.h b/lib/librte_pmd_ixgbe/ixgbe_ethdev.h index a5159e5..ca99170 100644 --- a/lib/librte_pmd_ixgbe/ixgbe_ethdev.h 
+++ b/lib/librte_pmd_ixgbe/ixgbe_ethdev.h @@ -340,4 +340,5 @@ void ixgbe_pf_mbx_process(struct rte_eth_dev *eth_dev); int ixgbe_pf_host_configure(struct rte_eth_dev *eth_dev); +uint32_t ixgbe_convert_vm_rx_mask_to_val(uint16_t rx_mask, uint32_t orig_val); #endif /* _IXGBE_ETHDEV_H_ */ diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c index 3a5a8ff..f9b3fe3 100644 --- a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c +++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c @@ -3123,6 +3123,7 @@ ixgbe_vmdq_rx_hw_configure(struct rte_eth_dev *dev) struct ixgbe_hw *hw; enum rte_eth_nb_pools num_pools; uint32_t mrqc, vt_ctl, vlanctrl; + uint32_t vmolr = 0; int i; PMD_INIT_FUNC_TRACE(); @@ -3145,6 +3146,11 @@ ixgbe_vmdq_rx_hw_configure(struct rte_eth_dev *dev) IXGBE_WRITE_REG(hw, IXGBE_VT_CTL, vt_ctl); + for (i = 0; i < (int)num_pools; i++) { + vmolr = ixgbe_convert_vm_rx_mask_to_val(cfg->rx_mode, vmolr); + IXGBE_WRITE_REG(hw, IXGBE_VMOLR(i), vmolr); + } + /* VLNCTRL: enable vlan filtering and allow all vlan tags through */ vlanctrl = IXGBE_READ_REG(hw, IXGBE_VLNCTRL); vlanctrl |= IXGBE_VLNCTRL_VFE ; /* enable vlan filters */ -- 1.8.4.2
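The flag translation factored out into ixgbe_convert_vm_rx_mask_to_val() can be tried outside the driver with a sketch like the following. Every `EX_` constant is a hypothetical placeholder with an arbitrary bit position, not the real ETH_VMDQ_ACCEPT_* or IXGBE_VMOLR_* value, and `ex_rx_mask_to_val()` only mirrors the shape of the function in the patch: each accept flag set in the mask ORs one register bit into the starting value.

```c
#include <stdint.h>

/* Placeholder accept flags (stand-ins for ETH_VMDQ_ACCEPT_*). */
#define EX_ACCEPT_UNTAG      0x0001u
#define EX_ACCEPT_HASH_MC    0x0002u
#define EX_ACCEPT_HASH_UC    0x0004u
#define EX_ACCEPT_BROADCAST  0x0008u
#define EX_ACCEPT_MULTICAST  0x0010u

/* Placeholder register bits (stand-ins for the VMOLR bits). */
#define EX_REG_AUPE   (1u << 24)
#define EX_REG_ROMPE  (1u << 25)
#define EX_REG_ROPE   (1u << 26)
#define EX_REG_BAM    (1u << 27)
#define EX_REG_MPE    (1u << 28)

/* OR the register bit for every accept flag present in 'rx_mask'. */
static uint32_t
ex_rx_mask_to_val(uint16_t rx_mask, uint32_t orig_val)
{
	uint32_t new_val = orig_val;

	if (rx_mask & EX_ACCEPT_UNTAG)
		new_val |= EX_REG_AUPE;
	if (rx_mask & EX_ACCEPT_HASH_MC)
		new_val |= EX_REG_ROMPE;
	if (rx_mask & EX_ACCEPT_HASH_UC)
		new_val |= EX_REG_ROPE;
	if (rx_mask & EX_ACCEPT_BROADCAST)
		new_val |= EX_REG_BAM;
	if (rx_mask & EX_ACCEPT_MULTICAST)
		new_val |= EX_REG_MPE;

	return new_val;
}
```

Because the helper only sets bits, a caller that needs to clear a previously enabled mode has to mask those bits out first, as the igb patch in this series does before applying the flags.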
[dpdk-dev] [PATCH v4 5/5] vhost: Enable promisc mode and multicast
This is to enable user space vhost receiving and forwarding broadcast and multicast packets: Use new option in command line to enable promisc mode; Enable 2 bits in VMDQ RX mode: ETH_VMDQ_ACCEPT_BROADCAST and ETH_VMDQ_ACCEPT_MULTICAST. Signed-off-by: Changchun Ouyang --- examples/vhost/main.c | 24 +--- lib/librte_vhost/virtio-net.c | 3 ++- 2 files changed, 23 insertions(+), 4 deletions(-) diff --git a/examples/vhost/main.c b/examples/vhost/main.c index a93f7a0..1f1edbe 100644 --- a/examples/vhost/main.c +++ b/examples/vhost/main.c @@ -161,6 +161,9 @@ /* mask of enabled ports */ static uint32_t enabled_port_mask = 0; +/* Promiscuous mode */ +static uint32_t promiscuous; + /*Number of switching cores enabled*/ static uint32_t num_switching_cores = 0; @@ -364,13 +367,15 @@ static inline int get_eth_conf(struct rte_eth_conf *eth_conf, uint32_t num_devices) { struct rte_eth_vmdq_rx_conf conf; + struct rte_eth_vmdq_rx_conf *def_conf = + &vmdq_conf_default.rx_adv_conf.vmdq_rx_conf; unsigned i; memset(&conf, 0, sizeof(conf)); conf.nb_queue_pools = (enum rte_eth_nb_pools)num_devices; conf.nb_pool_maps = num_devices; - conf.enable_loop_back = - vmdq_conf_default.rx_adv_conf.vmdq_rx_conf.enable_loop_back; + conf.enable_loop_back = def_conf->enable_loop_back; + conf.rx_mode = def_conf->rx_mode; for (i = 0; i < conf.nb_pool_maps; i++) { conf.pool_map[i].vlan_id = vlan_tags[ i ]; @@ -468,6 +473,9 @@ port_init(uint8_t port) return retval; } + if (promiscuous) + rte_eth_promiscuous_enable(port); + rte_eth_macaddr_get(port, &vmdq_ports_eth_addr[port]); RTE_LOG(INFO, VHOST_PORT, "Max virtio devices supported: %u\n", num_devices); RTE_LOG(INFO, VHOST_PORT, "Port %u MAC: %02"PRIx8" %02"PRIx8" %02"PRIx8 @@ -598,7 +606,8 @@ us_vhost_parse_args(int argc, char **argv) }; /* Parse command line */ - while ((opt = getopt_long(argc, argv, "p:",long_option, &option_index)) != EOF) { + while ((opt = getopt_long(argc, argv, "p:P", + long_option, &option_index)) != EOF) { switch (opt) { /* 
Portmask */ case 'p': @@ -610,6 +619,15 @@ us_vhost_parse_args(int argc, char **argv) } break; + case 'P': + promiscuous = 1; + vmdq_conf_default.rx_adv_conf.vmdq_rx_conf.rx_mode = + ETH_VMDQ_ACCEPT_BROADCAST | + ETH_VMDQ_ACCEPT_MULTICAST; + rte_vhost_feature_enable(1ULL << VIRTIO_NET_F_CTRL_RX); + + break; + case 0: /* Enable/disable vm2vm comms. */ if (!strncmp(long_option[option_index].name, "vm2vm", diff --git a/lib/librte_vhost/virtio-net.c b/lib/librte_vhost/virtio-net.c index 6d8de09..852b6d1 100644 --- a/lib/librte_vhost/virtio-net.c +++ b/lib/librte_vhost/virtio-net.c @@ -68,7 +68,8 @@ static struct virtio_net_device_ops const *notify_ops; static struct virtio_net_config_ll *ll_root; /* Features supported by this lib. */ -#define VHOST_SUPPORTED_FEATURES (1ULL << VIRTIO_NET_F_MRG_RXBUF) +#define VHOST_SUPPORTED_FEATURES ((1ULL << VIRTIO_NET_F_MRG_RXBUF) | \ + (1ULL << VIRTIO_NET_F_CTRL_RX)) static uint64_t VHOST_FEATURES = VHOST_SUPPORTED_FEATURES; /* Line size for reading maps file. */ -- 1.8.4.2
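The option handling added in this patch ("p:P" in the getopt_long() call) can be exercised on its own with a reduced sketch. `struct ex_opts` and `ex_parse_args()` are hypothetical; the empty long-option table and the hexadecimal parsing of the port mask are simplifying assumptions made for the illustration, and the real example application does more work per option.

```c
#include <getopt.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical result structure for this sketch. */
struct ex_opts {
	uint32_t port_mask;    /* value given with -p, parsed as hex */
	int      promiscuous;  /* set to 1 when -P is present */
};

/* Parse "-p MASK" and "-P". Returns 0 on success, -1 on an unknown option. */
static int
ex_parse_args(int argc, char **argv, struct ex_opts *out)
{
	static const struct option long_opts[] = { { NULL, 0, NULL, 0 } };
	int opt;

	out->port_mask = 0;
	out->promiscuous = 0;
	optind = 1;  /* restart scanning so the helper can be called again */
	opterr = 0;  /* keep getopt quiet; errors are reported by return value */

	while ((opt = getopt_long(argc, argv, "p:P", long_opts, NULL)) != -1) {
		switch (opt) {
		case 'p':
			out->port_mask = (uint32_t)strtoul(optarg, NULL, 16);
			break;
		case 'P':
			out->promiscuous = 1;
			break;
		default:
			return -1;
		}
	}
	return 0;
}
```

The colon after `p` means the option takes an argument, while `P` is a plain switch, which is why the patch only had to append a single letter to the option string.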
[dpdk-dev] [PATCH v4 4/5] virtio: Support promiscuous and allmulticast
Add codes for supporting promiscuous and allmulticast enable and disable. Signed-off-by: Changchun Ouyang --- lib/librte_pmd_virtio/virtio_ethdev.c | 90 ++- 1 file changed, 89 insertions(+), 1 deletion(-) diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c index 19930c0..c009f2a 100644 --- a/lib/librte_pmd_virtio/virtio_ethdev.c +++ b/lib/librte_pmd_virtio/virtio_ethdev.c @@ -66,6 +66,10 @@ static int eth_virtio_dev_init(struct eth_driver *eth_drv, static int virtio_dev_configure(struct rte_eth_dev *dev); static int virtio_dev_start(struct rte_eth_dev *dev); static void virtio_dev_stop(struct rte_eth_dev *dev); +static void virtio_dev_promiscuous_enable(struct rte_eth_dev *dev); +static void virtio_dev_promiscuous_disable(struct rte_eth_dev *dev); +static void virtio_dev_allmulticast_enable(struct rte_eth_dev *dev); +static void virtio_dev_allmulticast_disable(struct rte_eth_dev *dev); static void virtio_dev_info_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info); static int virtio_dev_link_update(struct rte_eth_dev *dev, @@ -403,6 +407,86 @@ virtio_dev_close(struct rte_eth_dev *dev) virtio_dev_stop(dev); } +static void +virtio_dev_promiscuous_enable(struct rte_eth_dev *dev) +{ + struct virtio_hw *hw + = VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private); + struct virtio_pmd_ctrl ctrl; + int dlen[1]; + int ret; + + ctrl.hdr.class = VIRTIO_NET_CTRL_RX; + ctrl.hdr.cmd = VIRTIO_NET_CTRL_RX_PROMISC; + ctrl.data[0] = 1; + dlen[0] = 1; + + ret = virtio_send_command(hw->cvq, &ctrl, dlen, 1); + + if (ret) + PMD_INIT_LOG(ERR, "Failed to enable promisc"); +} + +static void +virtio_dev_promiscuous_disable(struct rte_eth_dev *dev) +{ + struct virtio_hw *hw + = VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private); + struct virtio_pmd_ctrl ctrl; + int dlen[1]; + int ret; + + ctrl.hdr.class = VIRTIO_NET_CTRL_RX; + ctrl.hdr.cmd = VIRTIO_NET_CTRL_RX_PROMISC; + ctrl.data[0] = 0; + dlen[0] = 1; + + ret = virtio_send_command(hw->cvq, 
&ctrl, dlen, 1); + + if (ret) + PMD_INIT_LOG(ERR, "Failed to disable promisc"); +} + +static void +virtio_dev_allmulticast_enable(struct rte_eth_dev *dev) +{ + struct virtio_hw *hw + = VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private); + struct virtio_pmd_ctrl ctrl; + int dlen[1]; + int ret; + + ctrl.hdr.class = VIRTIO_NET_CTRL_RX; + ctrl.hdr.cmd = VIRTIO_NET_CTRL_RX_ALLMULTI; + ctrl.data[0] = 1; + dlen[0] = 1; + + ret = virtio_send_command(hw->cvq, &ctrl, dlen, 1); + + if (ret) + PMD_INIT_LOG(ERR, "Failed to enable allmulticast"); +} + +static void +virtio_dev_allmulticast_disable(struct rte_eth_dev *dev) +{ + struct virtio_hw *hw + = VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private); + struct virtio_pmd_ctrl ctrl; + int dlen[1]; + int ret; + + ctrl.hdr.class = VIRTIO_NET_CTRL_RX; + ctrl.hdr.cmd = VIRTIO_NET_CTRL_RX_ALLMULTI; + ctrl.data[0] = 0; + dlen[0] = 1; + + ret = virtio_send_command(hw->cvq, &ctrl, dlen, 1); + + if (ret) + PMD_INIT_LOG(ERR, "Failed to disable allmulticast"); +} + /* * dev_ops for virtio, bare necessities for basic operation */ @@ -411,6 +495,10 @@ static struct eth_dev_ops virtio_eth_dev_ops = { .dev_start = virtio_dev_start, .dev_stop= virtio_dev_stop, .dev_close = virtio_dev_close, + .promiscuous_enable = virtio_dev_promiscuous_enable, + .promiscuous_disable = virtio_dev_promiscuous_disable, + .allmulticast_enable = virtio_dev_allmulticast_enable, + .allmulticast_disable= virtio_dev_allmulticast_disable, .dev_infos_get = virtio_dev_info_get, .stats_get = virtio_dev_stats_get, @@ -561,7 +649,7 @@ virtio_negotiate_features(struct virtio_hw *hw) { uint32_t host_features, mask; - mask = VIRTIO_NET_F_CTRL_RX | VIRTIO_NET_F_CTRL_VLAN; + mask = VIRTIO_NET_F_CTRL_VLAN; mask |= VIRTIO_NET_F_CSUM | VIRTIO_NET_F_GUEST_CSUM; /* TSO and LRO are only available when their corresponding -- 1.8.4.2
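The four callbacks in this patch differ only in the command and in the on/off byte, so the request they build can be sketched with a single helper. This is an illustration under stated assumptions: the three constants follow the RX mode control class in the virtio specification, while `struct ex_ctrl`, its field names and `ex_build_rx_mode()` are hypothetical and simpler than the driver's `struct virtio_pmd_ctrl`.

```c
#include <stdint.h>
#include <string.h>

/* RX mode control class values, as given in the virtio specification. */
#define VIRTIO_NET_CTRL_RX           0
#define VIRTIO_NET_CTRL_RX_PROMISC   0
#define VIRTIO_NET_CTRL_RX_ALLMULTI  1

/* Hypothetical, simplified control message for this sketch. */
struct ex_ctrl_hdr {
	uint8_t class_id;
	uint8_t cmd;
};

struct ex_ctrl {
	struct ex_ctrl_hdr hdr;
	uint8_t status;
	uint8_t data[8];
};

/*
 * Build a promiscuous or allmulticast request.
 * Returns the number of argument descriptors (always 1 here),
 * or -1 for an unsupported command.
 */
static int
ex_build_rx_mode(struct ex_ctrl *ctrl, int *dlen, uint8_t cmd, int on)
{
	if (cmd != VIRTIO_NET_CTRL_RX_PROMISC &&
	    cmd != VIRTIO_NET_CTRL_RX_ALLMULTI)
		return -1;

	memset(ctrl, 0, sizeof(*ctrl));
	ctrl->hdr.class_id = VIRTIO_NET_CTRL_RX;
	ctrl->hdr.cmd = cmd;
	ctrl->data[0] = on ? 1 : 0;  /* single byte payload: enable or disable */
	dlen[0] = 1;                 /* length of that payload in bytes */
	return 1;
}
```

In the driver, the returned count and `dlen` array correspond to the last two arguments passed to virtio_send_command().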
[dpdk-dev] [PATCH v3 1/5] ethdev: add vmdq rx mode
Hi Thomas, > -Original Message- > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com] > Sent: Thursday, November 6, 2014 9:56 PM > To: Ouyang, Changchun > Cc: dev at dpdk.org > Subject: Re: [dpdk-dev] [PATCH v3 1/5] ethdev: add vmdq rx mode > > 2014-10-31 13:19, Ouyang Changchun: > > --- a/lib/librte_ether/rte_ethdev.h > > +++ b/lib/librte_ether/rte_ethdev.h > > @@ -577,6 +577,7 @@ struct rte_eth_vmdq_rx_conf { > > uint8_t default_pool; /**< The default pool, if applicable */ > > uint8_t enable_loop_back; /**< Enable VT loop back */ > > uint8_t nb_pool_maps; /**< We can have up to 64 filters/mappings > */ > > + uint32_t rx_mode; /**< RX mode for vmdq */ > > You are adding the field rx_mode in struct rte_eth_vmdq_rx_conf. > So the comment "RX mode for vmdq" is not really informative :) It would be > more interesting to explain which kind of value this field must contain. > Something like "flags from ETH_VMDQ_ACCEPT_*". > Thanks for your comments, I will update it. Changchun
[dpdk-dev] [PATCH v4 0/5] Support virtio multicast feature
Hi Thomas,

Thanks very much for applying this patch!

> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Wednesday, November 12, 2014 7:17 AM
> To: Ouyang, Changchun
> Cc: dev at dpdk.org; Xie, Huawei
> Subject: Re: [dpdk-dev] [PATCH v4 0/5] Support virtio multicast feature
>
> > -V1 change:
> > This patch series supports the multicast feature in virtio and vhost.
> > The vhost backend enables promiscuous mode and configures
> > ETH_VMDQ_ACCEPT_BROADCAST and ETH_VMDQ_ACCEPT_MULTICAST in the
> > VMDQ offload register to receive multicast and broadcast packets.
> > The virtio frontend provides the functionality of enabling and
> > disabling the multicast and promiscuous mode.
> >
> > -V2 change:
> > Rework the patch based on the new vhost library and new vhost application.
> >
> > -V3 change:
> > Rework the patch for comments, split commits.
> >
> > -V4 change:
> > Rework for refining code comments and patch titles, factorizing code, and
> > resolving conflicts.
> >
> > Changchun Ouyang (5):
> >   ethdev: Add vmdq rx mode
> >   igb: Config VM offload register
> >   ixgbe: Configure Rx mode for VMDQ
> >   virtio: Support promiscuous and allmulticast
> >   vhost: Enable promisc mode and multicast
>
> I reviewed only the first 3 commits.
> The virtio and vhost commits seem to have been reviewed by Huawei.
> Next time, a clear acked-by would be preferable.
> Please note that it is the role of developers to request reviews when
> needed. Reviews are not always spontaneous :)

Yes, I have asked several people more than three times to review and ack this patch.
Everyone has a tight schedule with their own patch rework, doc writing,
next-feature planning, etc., so the acks were delayed.

Thanks again and regards,
Changchun
[dpdk-dev] One pkt in mbuf chain - virtio pmd driver
This feature is still under development; scatter and mergeable RX will be
supported in a subsequent virtio patch.

Thanks
Changchun

> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Xie, Huawei
> Sent: Thursday, August 7, 2014 3:07 PM
> To: Czaus, Tomasz; dev at dpdk.org
> Subject: Re: [dpdk-dev] One pkt in mbuf chain - virtio pmd driver
>
> Hi Tomasz:
> This is a known issue in user space vhost. It will be fixed in a subsequent
> patch once the vhost lib is applied.
>
> BR.
> -Huawei
> > -----Original Message-----
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Czaus, Tomasz
> > Sent: Thursday, August 07, 2014 2:20 PM
> > To: dev at dpdk.org
> > Subject: [dpdk-dev] One pkt in mbuf chain - virtio pmd driver
> >
> > Hello,
> >
> > Does the virtio pmd driver support the scenario where a frame is carried
> > in an mbuf chain, that is, all headers (eth/ipv4/tcp) are located in the
> > first mbuf and the user data is located in the next mbuf? I have asked the
> > same question on the dpdk-ovs mailing list; here is the thread with more
> > details:
> >
> > https://lists.01.org/pipermail/dpdk-ovs/2014-August/001557.html
> >
> > Best Regards,
> > Tomasz Czaus
[dpdk-dev] [PATCH v3] virtio: Support mergeable buffer in virtio pmd
v3 change:
- Investigate the comments from Huawei and fix one potential issue of wrong offset to
  the number of descriptor in buffer; also fix other tiny comments.

v2 change:
- Resolve conflicts with the tip code;
- And resolve 2 issues:
   -- fix mbuf leak when discard an uncompleted packet.
   -- refine pkt.data to point to actual payload data start point.

v1 change:
- This patch supports mergeable buffer feature in DPDK based virtio PMD, which can
  receive jumbo frame with larger size, like 3K, 4K or even 9K.

Signed-off-by: Changchun Ouyang
Acked-by: Huawei Xie
---
 lib/librte_pmd_virtio/virtio_ethdev.c |  20 +--
 lib/librte_pmd_virtio/virtio_ethdev.h |   3 +
 lib/librte_pmd_virtio/virtio_rxtx.c   | 221 +-
 3 files changed, 207 insertions(+), 37 deletions(-)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index b9f5529..535d798 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -337,7 +337,7 @@ int virtio_dev_queue_setup(struct rte_eth_dev *dev,
 		snprintf(vq_name, sizeof(vq_name), "port%d_tvq%d_hdrzone",
 			dev->data->port_id, queue_idx);
 		vq->virtio_net_hdr_mz = rte_memzone_reserve_aligned(vq_name,
-			vq_size * sizeof(struct virtio_net_hdr),
+			vq_size * hw->vtnet_hdr_size,
 			socket_id, 0, CACHE_LINE_SIZE);
 		if (vq->virtio_net_hdr_mz == NULL) {
 			rte_free(vq);
@@ -346,7 +346,7 @@ int virtio_dev_queue_setup(struct rte_eth_dev *dev,
 		vq->virtio_net_hdr_mem =
 			vq->virtio_net_hdr_mz->phys_addr;
 		memset(vq->virtio_net_hdr_mz->addr, 0,
-			vq_size * sizeof(struct virtio_net_hdr));
+			vq_size * hw->vtnet_hdr_size);
 	} else if (queue_type == VTNET_CQ) {
 		/* Allocate a page for control vq command, data and status */
 		snprintf(vq_name, sizeof(vq_name), "port%d_cvq_hdrzone",
@@ -571,9 +571,6 @@ virtio_negotiate_features(struct virtio_hw *hw)
 	mask |= VIRTIO_NET_F_GUEST_TSO4 | VIRTIO_NET_F_GUEST_TSO6 | VIRTIO_NET_F_GUEST_ECN;
 	mask |= VTNET_LRO_FEATURES;
 
-	/* rx_mbuf should not be in multiple merged segments */
-	mask |= VIRTIO_NET_F_MRG_RXBUF;
-
 	/* not negotiating INDIRECT descriptor table support */
 	mask |= VIRTIO_RING_F_INDIRECT_DESC;
 
@@ -746,7 +743,6 @@ eth_virtio_dev_init(__rte_unused struct eth_driver *eth_drv,
 	}
 
 	eth_dev->dev_ops = &virtio_eth_dev_ops;
-	eth_dev->rx_pkt_burst = &virtio_recv_pkts;
 	eth_dev->tx_pkt_burst = &virtio_xmit_pkts;
 
 	if (rte_eal_process_type() == RTE_PROC_SECONDARY)
@@ -801,10 +797,13 @@ eth_virtio_dev_init(__rte_unused struct eth_driver *eth_drv,
 	virtio_negotiate_features(hw);
 
 	/* Setting up rx_header size for the device */
-	if (vtpci_with_feature(hw, VIRTIO_NET_F_MRG_RXBUF))
+	if (vtpci_with_feature(hw, VIRTIO_NET_F_MRG_RXBUF)) {
+		eth_dev->rx_pkt_burst = &virtio_recv_mergeable_pkts;
 		hw->vtnet_hdr_size = sizeof(struct virtio_net_hdr_mrg_rxbuf);
-	else
+	} else {
+		eth_dev->rx_pkt_burst = &virtio_recv_pkts;
 		hw->vtnet_hdr_size = sizeof(struct virtio_net_hdr);
+	}
 
 	/* Allocate memory for storing MAC addresses */
 	eth_dev->data->mac_addrs = rte_zmalloc("virtio", ETHER_ADDR_LEN, 0);
@@ -1009,7 +1008,7 @@ static void virtio_dev_free_mbufs(struct rte_eth_dev *dev)
 
 		while ((buf = (struct rte_mbuf *)virtqueue_detatch_unused(
 					dev->data->rx_queues[i])) != NULL) {
-			rte_pktmbuf_free_seg(buf);
+			rte_pktmbuf_free(buf);
 			mbuf_num++;
 		}
 
@@ -1028,7 +1027,8 @@ static void virtio_dev_free_mbufs(struct rte_eth_dev *dev)
 		mbuf_num = 0;
 		while ((buf = (struct rte_mbuf *)virtqueue_detatch_unused(
 					dev->data->tx_queues[i])) != NULL) {
-			rte_pktmbuf_free_seg(buf);
+			rte_pktmbuf_free(buf);
+
 			mbuf_num++;
 		}
 
diff --git a/lib/librte_pmd_virtio/virtio_ethdev.h b/lib/librte_pmd_virtio/virtio_ethdev.h
index 858e644..d2e1eed 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.h
+++ b/lib/librte_pmd_virtio/virtio_ethdev.h
@@ -104,6 +104,9 @@ int virtio_dev_tx_queue_setup(struct rte_eth_dev *dev, uint16_t tx_queue_id,
 uint16_t virtio_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
 		uint16_t nb_pkts);
 
+uint16_t virtio_recv_mergeable_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
+		uint16_t nb_pkts);
+
 uint16_t virtio_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
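For readers following the thread, the header-size selection made in eth_virtio_dev_init() above can be looked at in isolation. The following is only a minimal sketch, not the driver code: the two header layouts and the feature bit number follow the virtio specification as far as I know, while sketch_vtnet_hdr_size() and sketch_hdr_zone_bytes() are hypothetical helpers introduced here for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Feature bit number for mergeable RX buffers (virtio specification). */
#define VIRTIO_NET_F_MRG_RXBUF 15

/* Basic per-packet header placed in front of every frame. */
struct virtio_net_hdr {
	uint8_t  flags;
	uint8_t  gso_type;
	uint16_t hdr_len;
	uint16_t gso_size;
	uint16_t csum_start;
	uint16_t csum_offset;
};

/* Header used once mergeable RX buffers have been negotiated. */
struct virtio_net_hdr_mrg_rxbuf {
	struct virtio_net_hdr hdr;
	uint16_t num_buffers; /* number of buffers merged for this packet */
};

/* Hypothetical helper: pick the header size from the negotiated features. */
static size_t
sketch_vtnet_hdr_size(uint32_t negotiated_features)
{
	if (negotiated_features & (1u << VIRTIO_NET_F_MRG_RXBUF))
		return sizeof(struct virtio_net_hdr_mrg_rxbuf);
	return sizeof(struct virtio_net_hdr);
}

/* Hypothetical helper: bytes needed for a per-queue header zone. */
static size_t
sketch_hdr_zone_bytes(uint16_t vq_size, uint32_t negotiated_features)
{
	return (size_t)vq_size * sketch_vtnet_hdr_size(negotiated_features);
}
```

This is why the memzone reservation in the patch switches from sizeof(struct virtio_net_hdr) to hw->vtnet_hdr_size: with the mergeable header, a zone sized for the basic header would be too small.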
[dpdk-dev] [PATCH] examples/vhost: Support jumbo frame in user space vhost
This patch supports the mergeable RX feature and thus supports jumbo frame RX and TX
in user space vhost (as virtio backend).

On RX, it secures enough room from the vring to accommodate one complete scattered
packet which is received by the PMD from the physical port, and then copies data from
the mbuf to the vring buffer, possibly across a few vring entries and descriptors.

On TX, it gets a jumbo frame, possibly described by a few vring descriptors which
are chained together with the 'NEXT' flag, and then copies them into one
scattered packet and transmits it to the physical port through the PMD.

Signed-off-by: Changchun Ouyang
Acked-by: Huawei Xie
---
 examples/vhost/main.c       | 726
 examples/vhost/virtio-net.h |  14 +
 2 files changed, 687 insertions(+), 53 deletions(-)

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index 193aa25..7d9e6a2 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -106,6 +106,8 @@
 #define BURST_RX_WAIT_US 15	/* Defines how long we wait between retries on RX */
 #define BURST_RX_RETRIES 4	/* Number of retries on RX. */
 
+#define JUMBO_FRAME_MAX_SIZE    0x2600
+
 /* State of virtio device. */
 #define DEVICE_MAC_LEARNING 0
 #define DEVICE_RX 1
@@ -676,8 +678,12 @@ us_vhost_parse_args(int argc, char **argv)
 					us_vhost_usage(prgname);
 					return -1;
 				} else {
-					if (ret)
+					if (ret) {
+						vmdq_conf_default.rxmode.jumbo_frame = 1;
+						vmdq_conf_default.rxmode.max_rx_pkt_len
+							= JUMBO_FRAME_MAX_SIZE;
 						VHOST_FEATURES = (1ULL << VIRTIO_NET_F_MRG_RXBUF);
+					}
 				}
 			}
 
@@ -797,6 +803,14 @@ us_vhost_parse_args(int argc, char **argv)
 		return -1;
 	}
 
+	if ((zero_copy == 1) && (vmdq_conf_default.rxmode.jumbo_frame == 1)) {
+		RTE_LOG(INFO, VHOST_PORT,
+			"Vhost zero copy doesn't support jumbo frame,"
+			"please specify '--mergeable 0' to disable the "
+			"mergeable feature.\n");
+		return -1;
+	}
+
 	return 0;
 }
 
@@ -916,7 +930,7 @@ gpa_to_hpa(struct virtio_net *dev, uint64_t guest_pa,
  * This function adds buffers to the virtio devices RX virtqueue. Buffers can
  * be received from the physical port or from another virtio device. A packet
  * count is returned to indicate the number of packets that were succesfully
- * added to the RX queue.
+ * added to the RX queue. This function works when mergeable is disabled.
  */
 static inline uint32_t __attribute__((always_inline))
 virtio_dev_rx(struct virtio_net *dev, struct rte_mbuf **pkts, uint32_t count)
@@ -930,7 +944,6 @@ virtio_dev_rx(struct virtio_net *dev, struct rte_mbuf **pkts, uint32_t count)
 	uint64_t buff_hdr_addr = 0;
 	uint32_t head[MAX_PKT_BURST], packet_len = 0;
 	uint32_t head_idx, packet_success = 0;
-	uint32_t mergeable, mrg_count = 0;
 	uint32_t retry = 0;
 	uint16_t avail_idx, res_cur_idx;
 	uint16_t res_base_idx, res_end_idx;
@@ -940,6 +953,7 @@ virtio_dev_rx(struct virtio_net *dev, struct rte_mbuf **pkts, uint32_t count)
 	LOG_DEBUG(VHOST_DATA, "(%"PRIu64") virtio_dev_rx()\n", dev->device_fh);
 	vq = dev->virtqueue[VIRTIO_RXQ];
 	count = (count > MAX_PKT_BURST) ? MAX_PKT_BURST : count;
+
 	/* As many data cores may want access to available buffers, they need to be reserved. */
 	do {
 		res_base_idx = vq->last_used_idx_res;
@@ -976,9 +990,6 @@ virtio_dev_rx(struct virtio_net *dev, struct rte_mbuf **pkts, uint32_t count)
 	/* Prefetch available ring to retrieve indexes. */
 	rte_prefetch0(&vq->avail->ring[res_cur_idx & (vq->size - 1)]);
 
-	/* Check if the VIRTIO_NET_F_MRG_RXBUF feature is enabled. */
-	mergeable = dev->features & (1 << VIRTIO_NET_F_MRG_RXBUF);
-
 	/* Retrieve all of the head indexes first to avoid caching issues. */
 	for (head_idx = 0; head_idx < count; head_idx++)
 		head[head_idx] = vq->avail->ring[(res_cur_idx + head_idx) & (vq->size - 1)];
@@ -997,56 +1008,44 @@ virtio_dev_rx(struct virtio_net *dev, struct rte_mbuf **pkts, uint32_t count)
 		/* Prefetch buffer address. */
 		rte_prefetch0((void*)(uintptr_t)buff_addr);
 
-		if (mergeable && (mrg_count != 0)) {
-			desc->len = packet_len = rte_pktmbuf_data_len(buff);
-		} else {
-			/* Copy virtio_hdr to packet and increment buffer address */
-			buf
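The TX side described in this patch relies on following descriptors that are chained with the 'NEXT' flag. As a rough illustration only (this is not the code from examples/vhost/main.c), the sketch below walks such a chain and sums the lengths. The descriptor layout and VRING_DESC_F_NEXT follow the virtio ring definition as I understand it; sketch_chain_len() is a hypothetical helper, and the loop guard against malformed chains is my own assumption about what a careful backend would want.

```c
#include <stdint.h>

#define VRING_DESC_F_NEXT 1 /* this descriptor continues via 'next' */

/* One vring descriptor (layout per the virtio ring definition). */
struct vring_desc {
	uint64_t addr;  /* guest-physical buffer address */
	uint32_t len;   /* buffer length in bytes */
	uint16_t flags; /* VRING_DESC_F_* */
	uint16_t next;  /* index of the next descriptor in the chain */
};

/*
 * Hypothetical helper: return the total number of bytes described by the
 * chain starting at 'head' and store the number of descriptors visited in
 * *nr_desc. Returns 0 if an index is out of range or the chain is longer
 * than the ring (which would indicate a loop).
 */
static uint32_t
sketch_chain_len(const struct vring_desc *ring, uint16_t ring_size,
		 uint16_t head, uint16_t *nr_desc)
{
	uint32_t total = 0;
	uint16_t idx = head;
	uint16_t seen = 0;

	while (seen < ring_size) {
		if (idx >= ring_size)
			return 0;
		total += ring[idx].len;
		seen++;
		if (!(ring[idx].flags & VRING_DESC_F_NEXT)) {
			*nr_desc = seen;
			return total;
		}
		idx = ring[idx].next;
	}
	return 0; /* more links than ring entries: treat as malformed */
}
```

A backend that gathers a jumbo frame would copy each visited buffer into the next mbuf segment instead of only summing lengths; the traversal itself stays the same.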
[dpdk-dev] [PATCH v3] virtio: Support mergeable buffer in virtio pmd
Hi all,

Any comments for this patch? And what's the status for merging it into mainline?

Thanks in advance
Changchun

> -----Original Message-----
> From: Ouyang, Changchun
> Sent: Thursday, August 14, 2014 4:55 PM
> To: dev at dpdk.org
> Cc: Cao, Waterman; Ouyang, Changchun
> Subject: [PATCH v3] virtio: Support mergeable buffer in virtio pmd
[dpdk-dev] [PATCH] examples/vhost: Support jumbo frame in user space vhost
Hi all,

Any comments for this patch? And what's the status for merging it into mainline?

Thanks in advance
Changchun

> -----Original Message-----
> From: Ouyang, Changchun
> Sent: Friday, August 15, 2014 12:58 PM
> To: dev at dpdk.org
> Cc: Cao, Waterman; Ouyang, Changchun
> Subject: [PATCH] examples/vhost: Support jumbo frame in user space vhost
[dpdk-dev] [PATCH 0/5] Support virtio multicast feature
This patch series supports the multicast feature in virtio and vhost.
The vhost backend enables promiscuous mode and configures
ETH_VMDQ_ACCEPT_BROADCAST and ETH_VMDQ_ACCEPT_MULTICAST in the VMDQ offload
register to receive multicast and broadcast packets.
The virtio frontend provides the functionality of enabling and disabling
the multicast and promiscuous modes.

Changchun Ouyang (2):
  Set VM offload register according to VMDQ config for IGB PMD to support
    broadcast and multicast packets.
  Add new API in virtio for supporting promiscuous and allmulticast enable
    and disable.

Ouyang Changchun (3):
  Add RX mode in VMDQ config and set the register PFVML2FLT for IXGBE PMD;
    this makes VMDQ accept broadcast and multicast packets.
  To let US-vHOST accept and forward broadcast and multicast packets:
    Add promiscuous option into command line; set VMDQ RX mode into:
    ETH_VMDQ_ACCEPT_BROADCAST|ETH_VMDQ_ACCEPT_MULTICAST.
  Specify rx_mode as 0 for 2 other samples: vmdq and vhost-xen.

 examples/vhost/main.c                 | 27 --
 examples/vhost_xen/main.c             |  1 +
 examples/vmdq/main.c                  |  1 +
 lib/librte_ether/rte_ethdev.h         |  1 +
 lib/librte_pmd_e1000/igb_rxtx.c       | 20 +++
 lib/librte_pmd_ixgbe/ixgbe_rxtx.c     | 16 ++
 lib/librte_pmd_virtio/virtio_ethdev.c | 98 ++-
 7 files changed, 159 insertions(+), 5 deletions(-)

-- 
1.8.4.2
[dpdk-dev] [PATCH 2/5] e1000: config VMDQ offload register to receive multicast packet
This patch sets the VM offload register according to the VMDQ config for the e1000 PMD
to support multicast and broadcast packets.

Signed-off-by: Changchun Ouyang
Acked-by: Huawei Xie
Acked-by: Cunming Liang
---
 lib/librte_pmd_e1000/igb_rxtx.c | 20 
 1 file changed, 20 insertions(+)

diff --git a/lib/librte_pmd_e1000/igb_rxtx.c b/lib/librte_pmd_e1000/igb_rxtx.c
index 977c4a2..51b1206 100644
--- a/lib/librte_pmd_e1000/igb_rxtx.c
+++ b/lib/librte_pmd_e1000/igb_rxtx.c
@@ -1768,6 +1768,26 @@ igb_vmdq_rx_hw_configure(struct rte_eth_dev *dev)
 	vt_ctl |= E1000_VT_CTL_IGNORE_MAC;
 	E1000_WRITE_REG(hw, E1000_VT_CTL, vt_ctl);
 
+	for (i = 0; i < E1000_VMOLR_SIZE; i++) {
+		vmolr = E1000_READ_REG(hw, E1000_VMOLR(i));
+		vmolr &= ~(E1000_VMOLR_AUPE | E1000_VMOLR_ROMPE |
+			   E1000_VMOLR_ROPE | E1000_VMOLR_BAM |
+			   E1000_VMOLR_MPME);
+
+		if (cfg->rx_mode & ETH_VMDQ_ACCEPT_UNTAG)
+			vmolr |= E1000_VMOLR_AUPE;
+		if (cfg->rx_mode & ETH_VMDQ_ACCEPT_HASH_MC)
+			vmolr |= E1000_VMOLR_ROMPE;
+		if (cfg->rx_mode & ETH_VMDQ_ACCEPT_HASH_UC)
+			vmolr |= E1000_VMOLR_ROPE;
+		if (cfg->rx_mode & ETH_VMDQ_ACCEPT_BROADCAST)
+			vmolr |= E1000_VMOLR_BAM;
+		if (cfg->rx_mode & ETH_VMDQ_ACCEPT_MULTICAST)
+			vmolr |= E1000_VMOLR_MPME;
+
+		E1000_WRITE_REG(hw, E1000_VMOLR(i), vmolr);
+	}
+
 	/*
 	 * VMOLR: set STRVLAN as 1 if IGMAC in VTCTL is set as 1
 	 * Both 82576 and 82580 support it
-- 
1.8.4.2
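The loop in this patch is a read-modify-write: clear the five acceptance bits, then set the ones requested by rx_mode, leaving every other bit of the register alone. The sketch below shows that mapping as a small table-driven function so it can be tried without hardware. Everything prefixed SKETCH_ or sketch_ is hypothetical: the flag values are what I believe ETH_VMDQ_ACCEPT_* to be, and the register bit positions are placeholders, not the real E1000_VMOLR_* definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed rx_mode flag values, named after ETH_VMDQ_ACCEPT_*. */
#define SKETCH_ACCEPT_UNTAG     0x0001u
#define SKETCH_ACCEPT_HASH_MC   0x0002u
#define SKETCH_ACCEPT_HASH_UC   0x0004u
#define SKETCH_ACCEPT_BROADCAST 0x0008u
#define SKETCH_ACCEPT_MULTICAST 0x0010u

/* Placeholder register bits; NOT the real E1000_VMOLR_* positions. */
#define SKETCH_VMOLR_AUPE  (1u << 0)
#define SKETCH_VMOLR_ROMPE (1u << 1)
#define SKETCH_VMOLR_ROPE  (1u << 2)
#define SKETCH_VMOLR_BAM   (1u << 3)
#define SKETCH_VMOLR_MPME  (1u << 4)

/* One row per rx_mode flag and the register bit it controls. */
static const struct {
	uint32_t flag;
	uint32_t bit;
} sketch_map[] = {
	{ SKETCH_ACCEPT_UNTAG,     SKETCH_VMOLR_AUPE  },
	{ SKETCH_ACCEPT_HASH_MC,   SKETCH_VMOLR_ROMPE },
	{ SKETCH_ACCEPT_HASH_UC,   SKETCH_VMOLR_ROPE  },
	{ SKETCH_ACCEPT_BROADCAST, SKETCH_VMOLR_BAM   },
	{ SKETCH_ACCEPT_MULTICAST, SKETCH_VMOLR_MPME  },
};

/*
 * Hypothetical helper: compute the new register value from the current
 * value and the requested rx_mode. Bits outside the table are preserved.
 */
static uint32_t
sketch_vmolr_apply(uint32_t vmolr, uint32_t rx_mode)
{
	size_t i;

	for (i = 0; i < sizeof(sketch_map) / sizeof(sketch_map[0]); i++) {
		vmolr &= ~sketch_map[i].bit;
		if (rx_mode & sketch_map[i].flag)
			vmolr |= sketch_map[i].bit;
	}
	return vmolr;
}
```

In a driver, the value passed in would come from a register read and the result would be written back, once per pool.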
[dpdk-dev] [PATCH 3/5] examples/vhost: enable promisc mode and config VMDQ offload register for multicast feature
This patch lets vhost receive and forward multicast and broadcast packets:
it adds a promiscuous option to the command line and sets the VMDQ RX mode to
ETH_VMDQ_ACCEPT_BROADCAST|ETH_VMDQ_ACCEPT_MULTICAST if promisc mode is on.

Signed-off-by: Changchun Ouyang
Acked-by: Huawei Xie
Acked-by: Cunming Liang
---
 examples/vhost/main.c | 27 +++
 1 file changed, 23 insertions(+), 4 deletions(-)

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index 193aa25..4acc7b8 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -161,6 +161,9 @@
 /* mask of enabled ports */
 static uint32_t enabled_port_mask = 0;
 
+/* Ports set in promiscuous mode off by default. */
+static uint32_t promiscuous_on;
+
 /*Number of switching cores enabled*/
 static uint32_t num_switching_cores = 0;
 
@@ -278,6 +281,7 @@ static struct rte_eth_conf vmdq_conf_default = {
 			.enable_default_pool = 0,
 			.default_pool = 0,
 			.nb_pool_maps = 0,
+			.rx_mode = 0,
 			.pool_map = {{0, 0},},
 		},
 	},
@@ -368,13 +372,15 @@ static inline int
 get_eth_conf(struct rte_eth_conf *eth_conf, uint32_t num_devices)
 {
 	struct rte_eth_vmdq_rx_conf conf;
+	struct rte_eth_vmdq_rx_conf *def_conf =
+		&vmdq_conf_default.rx_adv_conf.vmdq_rx_conf;
 	unsigned i;
 
 	memset(&conf, 0, sizeof(conf));
 	conf.nb_queue_pools = (enum rte_eth_nb_pools)num_devices;
 	conf.nb_pool_maps = num_devices;
-	conf.enable_loop_back =
-		vmdq_conf_default.rx_adv_conf.vmdq_rx_conf.enable_loop_back;
+	conf.enable_loop_back = def_conf->enable_loop_back;
+	conf.rx_mode = def_conf->rx_mode;
 
 	for (i = 0; i < conf.nb_pool_maps; i++) {
 		conf.pool_map[i].vlan_id = vlan_tags[ i ];
@@ -472,6 +478,9 @@ port_init(uint8_t port)
 		return retval;
 	}
 
+	if (promiscuous_on)
+		rte_eth_promiscuous_enable(port);
+
 	rte_eth_macaddr_get(port, &vmdq_ports_eth_addr[port]);
 	RTE_LOG(INFO, VHOST_PORT, "Max virtio devices supported: %u\n", num_devices);
 	RTE_LOG(INFO, VHOST_PORT, "Port %u MAC: %02"PRIx8" %02"PRIx8" %02"PRIx8
@@ -604,7 +613,8 @@ us_vhost_parse_args(int argc, char **argv)
 	};
 
 	/* Parse command line */
-	while ((opt = getopt_long(argc, argv, "p:",long_option, &option_index)) != EOF) {
+	while ((opt = getopt_long(argc, argv, "p:P",
+			long_option, &option_index)) != EOF) {
 		switch (opt) {
 		/* Portmask */
 		case 'p':
@@ -616,6 +626,15 @@ us_vhost_parse_args(int argc, char **argv)
 			}
 			break;
 
+		case 'P':
+			promiscuous_on = 1;
+			vmdq_conf_default.rx_adv_conf.vmdq_rx_conf.rx_mode =
+				ETH_VMDQ_ACCEPT_BROADCAST |
+				ETH_VMDQ_ACCEPT_MULTICAST;
+			VHOST_FEATURES |= (1ULL << VIRTIO_NET_F_CTRL_RX);
+
+			break;
+
 		case 0:
 			/* Enable/disable vm2vm comms. */
 			if (!strncmp(long_option[option_index].name, "vm2vm",
@@ -677,7 +696,7 @@ us_vhost_parse_args(int argc, char **argv)
 					return -1;
 				} else {
 					if (ret)
-						VHOST_FEATURES = (1ULL << VIRTIO_NET_F_MRG_RXBUF);
+						VHOST_FEATURES |= (1ULL << VIRTIO_NET_F_MRG_RXBUF);
 				}
 			}
 
-- 
1.8.4.2
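The last hunk above changes the mergeable option from `=` to `|=`. With two options now contributing feature bits, a plain assignment in one branch would discard the bit set by the other. A small sketch of the accumulation, assuming the feature bit numbers from the virtio specification; sketch_features() is a hypothetical stand-in for the option handling, not code from the sample application.

```c
#include <stdint.h>

/* Feature bit numbers as given in the virtio specification. */
#define VIRTIO_NET_F_MRG_RXBUF 15
#define VIRTIO_NET_F_CTRL_RX   18

/*
 * Hypothetical helper: start from a base feature mask and OR in one bit
 * per enabled option, so that neither option overwrites the other.
 */
static uint64_t
sketch_features(uint64_t base, int promiscuous, int mergeable)
{
	uint64_t features = base;

	if (promiscuous)
		features |= 1ULL << VIRTIO_NET_F_CTRL_RX;
	if (mergeable)
		features |= 1ULL << VIRTIO_NET_F_MRG_RXBUF;
	return features;
}
```

With assignment instead of OR in the mergeable branch, passing both options would leave only the MRG_RXBUF bit set, and the guest would never be offered CTRL_RX.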
[dpdk-dev] [PATCH 5/5] examples/vmdq: set default value to rx mode
This patch specifies rx_mode as 0 for 2 samples, vmdq and vhost-xen,
because the multicast feature is not currently available for either sample.

Signed-off-by: Changchun Ouyang
Acked-by: Huawei Xie
Acked-by: Cunming Liang
---
 examples/vhost_xen/main.c | 1 +
 examples/vmdq/main.c      | 1 +
 2 files changed, 2 insertions(+)

diff --git a/examples/vhost_xen/main.c b/examples/vhost_xen/main.c
index b275747..d451272 100644
--- a/examples/vhost_xen/main.c
+++ b/examples/vhost_xen/main.c
@@ -191,6 +191,7 @@ static const struct rte_eth_conf vmdq_conf_default = {
 			.enable_default_pool = 0,
 			.default_pool = 0,
 			.nb_pool_maps = 0,
+			.rx_mode = 0,
 			.pool_map = {{0, 0},},
 		},
 	},
diff --git a/examples/vmdq/main.c b/examples/vmdq/main.c
index 35df234..0cfd963 100644
--- a/examples/vmdq/main.c
+++ b/examples/vmdq/main.c
@@ -172,6 +172,7 @@ static const struct rte_eth_conf vmdq_conf_default = {
 			.enable_default_pool = 0,
 			.default_pool = 0,
 			.nb_pool_maps = 0,
+			.rx_mode = 0,
 			.pool_map = {{0, 0},},
 		},
 	},
-- 
1.8.4.2
[dpdk-dev] [PATCH 4/5] virtio: New API to enable/disable multicast and promisc mode
This patch adds a new API in virtio to support enabling and disabling
promiscuous and allmulticast modes.

Signed-off-by: Changchun Ouyang
Acked-by: Huawei Xie
Acked-by: Cunming Liang
---
 lib/librte_pmd_virtio/virtio_ethdev.c | 98 ++-
 1 file changed, 97 insertions(+), 1 deletion(-)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index 6293ac6..c7f874a 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -66,6 +66,10 @@ static int eth_virtio_dev_init(struct eth_driver *eth_drv,
 static int virtio_dev_configure(struct rte_eth_dev *dev);
 static int virtio_dev_start(struct rte_eth_dev *dev);
 static void virtio_dev_stop(struct rte_eth_dev *dev);
+static void virtio_dev_promiscuous_enable(struct rte_eth_dev *dev);
+static void virtio_dev_promiscuous_disable(struct rte_eth_dev *dev);
+static void virtio_dev_allmulticast_enable(struct rte_eth_dev *dev);
+static void virtio_dev_allmulticast_disable(struct rte_eth_dev *dev);
 static void virtio_dev_info_get(struct rte_eth_dev *dev,
 				struct rte_eth_dev_info *dev_info);
 static int virtio_dev_link_update(struct rte_eth_dev *dev,
@@ -403,6 +407,94 @@ virtio_dev_close(struct rte_eth_dev *dev)
 	virtio_dev_stop(dev);
 }
 
+static void
+virtio_dev_promiscuous_enable(struct rte_eth_dev *dev)
+{
+	struct virtio_hw *hw
+		= VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+	struct virtio_pmd_ctrl ctrl;
+	int dlen[1];
+	int ret;
+
+	ctrl.hdr.class = VIRTIO_NET_CTRL_RX;
+	ctrl.hdr.cmd = VIRTIO_NET_CTRL_RX_PROMISC;
+	ctrl.data[0] = 1;
+	dlen[0] = 1;
+
+	ret = virtio_send_command(hw->cvq, &ctrl, dlen, 1);
+
+	if (ret) {
+		PMD_INIT_LOG(ERR, "Promisc enabling but send command "
+			      "failed, this is too late now...\n");
+	}
+}
+
+static void
+virtio_dev_promiscuous_disable(struct rte_eth_dev *dev)
+{
+	struct virtio_hw *hw
+		= VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+	struct virtio_pmd_ctrl ctrl;
+	int dlen[1];
+	int ret;
+
+	ctrl.hdr.class = VIRTIO_NET_CTRL_RX;
+	ctrl.hdr.cmd = VIRTIO_NET_CTRL_RX_PROMISC;
+	ctrl.data[0] = 0;
+	dlen[0] = 1;
+
+	ret = virtio_send_command(hw->cvq, &ctrl, dlen, 1);
+
+	if (ret) {
+		PMD_INIT_LOG(ERR, "Promisc disabling but send command "
+			      "failed, this is too late now...\n");
+	}
+}
+
+static void
+virtio_dev_allmulticast_enable(struct rte_eth_dev *dev)
+{
+	struct virtio_hw *hw
+		= VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+	struct virtio_pmd_ctrl ctrl;
+	int dlen[1];
+	int ret;
+
+	ctrl.hdr.class = VIRTIO_NET_CTRL_RX;
+	ctrl.hdr.cmd = VIRTIO_NET_CTRL_RX_ALLMULTI;
+	ctrl.data[0] = 1;
+	dlen[0] = 1;
+
+	ret = virtio_send_command(hw->cvq, &ctrl, dlen, 1);
+
+	if (ret) {
+		PMD_INIT_LOG(ERR, "Allmulticast enabling but send command "
+			      "failed, this is too late now...\n");
+	}
+}
+
+static void
+virtio_dev_allmulticast_disable(struct rte_eth_dev *dev)
+{
+	struct virtio_hw *hw
+		= VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+	struct virtio_pmd_ctrl ctrl;
+	int dlen[1];
+	int ret;
+
+	ctrl.hdr.class = VIRTIO_NET_CTRL_RX;
+	ctrl.hdr.cmd = VIRTIO_NET_CTRL_RX_ALLMULTI;
+	ctrl.data[0] = 0;
+	dlen[0] = 1;
+
+	ret = virtio_send_command(hw->cvq, &ctrl, dlen, 1);
+
+	if (ret) {
+		PMD_INIT_LOG(ERR, "Allmulticast disabling but send command "
+			      "failed, this is too late now...\n");
+	}
+}
+
 /*
  * dev_ops for virtio, bare necessities for basic operation
  */
@@ -411,6 +503,10 @@ static struct eth_dev_ops virtio_eth_dev_ops = {
 	.dev_start               = virtio_dev_start,
 	.dev_stop                = virtio_dev_stop,
 	.dev_close               = virtio_dev_close,
+	.promiscuous_enable      = virtio_dev_promiscuous_enable,
+	.promiscuous_disable     = virtio_dev_promiscuous_disable,
+	.allmulticast_enable     = virtio_dev_allmulticast_enable,
+	.allmulticast_disable    = virtio_dev_allmulticast_disable,
 
 	.dev_infos_get           = virtio_dev_info_get,
 	.stats_get               = virtio_dev_stats_get,
@@ -561,7 +657,7 @@ virtio_negotiate_features(struct virtio_hw *hw)
 {
 	uint32_t host_features, mask;
 
-	mask = VIRTIO_NET_F_CTRL_RX | VIRTIO_NET_F_CTRL_VLAN;
+	mask = VIRTIO_NET_F_CTRL_VLAN;
 	mask |= VIRTIO_NET_F_CSUM | VIRTIO_NET_F_GUEST_CSUM;
 
 	/* TSO and LRO are only available when their corresponding
-- 
1.8.4.2
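The four callbacks in this patch differ only in the command byte and the on/off value, so one possible refactoring is a single helper that builds the control message. The sketch below is an illustration under assumptions, not the driver code: the command numbers follow the virtio specification as far as I know, while struct sketch_ctrl, sketch_send_command() (a stub that only records the last message) and sketch_set_rx_filter() are hypothetical stand-ins for virtio_pmd_ctrl and virtio_send_command().

```c
#include <stdint.h>
#include <string.h>

/* Control class and commands as numbered in the virtio specification. */
#define VIRTIO_NET_CTRL_RX          0
#define VIRTIO_NET_CTRL_RX_PROMISC  0
#define VIRTIO_NET_CTRL_RX_ALLMULTI 1

/* Hypothetical, simplified control message. */
struct sketch_ctrl {
	uint8_t class_id;
	uint8_t cmd;
	uint8_t data[8];
};

/* Test stub state: the last message "sent" and the result to report. */
static struct sketch_ctrl sketch_last_cmd;
static int sketch_send_result;

/* Hypothetical stand-in for the control-queue send; records the message. */
static int
sketch_send_command(const struct sketch_ctrl *ctrl, const int *dlen,
		    int pkt_num)
{
	(void)dlen;
	(void)pkt_num;
	sketch_last_cmd = *ctrl;
	return sketch_send_result;
}

/* Shared helper: build one RX-class command with a 1-byte on/off payload. */
static int
sketch_set_rx_filter(uint8_t cmd, uint8_t on)
{
	struct sketch_ctrl ctrl;
	int dlen[1] = { 1 };

	memset(&ctrl, 0, sizeof(ctrl));
	ctrl.class_id = VIRTIO_NET_CTRL_RX;
	ctrl.cmd = cmd;
	ctrl.data[0] = on;
	return sketch_send_command(&ctrl, dlen, 1);
}

/* Thin wrappers, one per ethdev callback. */
static int
sketch_promiscuous_enable(void)
{
	return sketch_set_rx_filter(VIRTIO_NET_CTRL_RX_PROMISC, 1);
}

static int
sketch_allmulticast_disable(void)
{
	return sketch_set_rx_filter(VIRTIO_NET_CTRL_RX_ALLMULTI, 0);
}
```

Keeping the message construction in one place also means each wrapper can log its own failure text without repeating the setup code.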
[dpdk-dev] [PATCH 1/5] ethdev: Add new config field to config VMDQ offload register
This patch adds a new rx_mode field to the VMDQ configuration and programs the
PFVML2FLT register in the IXGBE PMD, so that VMDQ pools can receive multicast
and broadcast packets.

Signed-off-by: Changchun Ouyang
Acked-by: Huawei Xie
Acked-by: Cunming Liang
---
 lib/librte_ether/rte_ethdev.h     |  1 +
 lib/librte_pmd_ixgbe/ixgbe_rxtx.c | 16 ++++++++++++++++
 2 files changed, 17 insertions(+)

diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 50df654..f44dd2d 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -575,6 +575,7 @@ struct rte_eth_vmdq_rx_conf {
 	uint8_t default_pool; /**< The default pool, if applicable */
 	uint8_t enable_loop_back; /**< Enable VT loop back */
 	uint8_t nb_pool_maps; /**< We can have up to 64 filters/mappings */
+	uint32_t rx_mode; /**< RX mode for vmdq */
 	struct {
 		uint16_t vlan_id; /**< The vlan id of the received frame */
 		uint64_t pools; /**< Bitmask of pools for packet rx */
diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
index dfc2076..9efdbfb 100644
--- a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
+++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
@@ -3084,6 +3084,7 @@ ixgbe_vmdq_rx_hw_configure(struct rte_eth_dev *dev)
 	struct ixgbe_hw *hw;
 	enum rte_eth_nb_pools num_pools;
 	uint32_t mrqc, vt_ctl, vlanctrl;
+	uint32_t vmolr = 0;
 	int i;
 
 	PMD_INIT_FUNC_TRACE();
@@ -3106,6 +3107,21 @@ ixgbe_vmdq_rx_hw_configure(struct rte_eth_dev *dev)
 
 	IXGBE_WRITE_REG(hw, IXGBE_VT_CTL, vt_ctl);
 
+	for (i = 0; i < (int)num_pools; i++) {
+		if (cfg->rx_mode & ETH_VMDQ_ACCEPT_UNTAG)
+			vmolr |= IXGBE_VMOLR_AUPE;
+		if (cfg->rx_mode & ETH_VMDQ_ACCEPT_HASH_MC)
+			vmolr |= IXGBE_VMOLR_ROMPE;
+		if (cfg->rx_mode & ETH_VMDQ_ACCEPT_HASH_UC)
+			vmolr |= IXGBE_VMOLR_ROPE;
+		if (cfg->rx_mode & ETH_VMDQ_ACCEPT_BROADCAST)
+			vmolr |= IXGBE_VMOLR_BAM;
+		if (cfg->rx_mode & ETH_VMDQ_ACCEPT_MULTICAST)
+			vmolr |= IXGBE_VMOLR_MPE;
+
+		IXGBE_WRITE_REG(hw, IXGBE_VMOLR(i), vmolr);
+	}
+
 	/* VLNCTRL: enable vlan filtering and allow all vlan tags through */
 	vlanctrl = IXGBE_READ_REG(hw, IXGBE_VLNCTRL);
 	vlanctrl |= IXGBE_VLNCTRL_VFE ; /* enable vlan filters */
--
1.8.4.2
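The rx_mode to VMOLR translation in the patch above does not depend on the pool index, so the mask could be computed once and then written to every pool. A minimal, table-driven sketch of that mapping follows. All EX_* names and their numeric values are assumptions made for illustration only; the real flag and register-bit definitions live in rte_ethdev.h and the ixgbe base code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for ETH_VMDQ_ACCEPT_* (values are illustrative). */
#define EX_ACCEPT_UNTAG     0x0001u
#define EX_ACCEPT_HASH_MC   0x0002u
#define EX_ACCEPT_HASH_UC   0x0004u
#define EX_ACCEPT_BROADCAST 0x0008u
#define EX_ACCEPT_MULTICAST 0x0010u

/* Hypothetical stand-ins for IXGBE_VMOLR_* bits (values are illustrative). */
#define EX_VMOLR_AUPE  0x01000000u
#define EX_VMOLR_ROMPE 0x02000000u
#define EX_VMOLR_ROPE  0x04000000u
#define EX_VMOLR_BAM   0x08000000u
#define EX_VMOLR_MPE   0x10000000u

/* Translate the API-level rx_mode flags into a per-pool filter mask. */
static uint32_t ex_rx_mode_to_vmolr(uint32_t rx_mode)
{
	static const struct {
		uint32_t flag; /* bit in rx_mode */
		uint32_t bit;  /* bit to set in the register value */
	} map[] = {
		{ EX_ACCEPT_UNTAG,     EX_VMOLR_AUPE  },
		{ EX_ACCEPT_HASH_MC,   EX_VMOLR_ROMPE },
		{ EX_ACCEPT_HASH_UC,   EX_VMOLR_ROPE  },
		{ EX_ACCEPT_BROADCAST, EX_VMOLR_BAM   },
		{ EX_ACCEPT_MULTICAST, EX_VMOLR_MPE   },
	};
	uint32_t vmolr = 0;
	size_t i;

	for (i = 0; i < sizeof(map) / sizeof(map[0]); i++)
		if (rx_mode & map[i].flag)
			vmolr |= map[i].bit;
	return vmolr;
}
```

With such a helper the per-pool loop would only contain the register write. Note that this sketch, like the patch, builds the value from zero rather than reading the register first.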
[dpdk-dev] [PATCH 4/5] virtio: New API to enable/disable multicast and promisc mode
Hi Stephen, My response below. Thanks Changchun > -Original Message- > From: Stephen Hemminger [mailto:stephen at networkplumber.org] > Sent: Tuesday, August 26, 2014 8:13 AM > To: Ouyang, Changchun > Cc: dev at dpdk.org > Subject: Re: [dpdk-dev] [PATCH 4/5] virtio: New API to enable/disable > multicast and promisc mode > > On Mon, 25 Aug 2014 10:09:31 +0800 > Ouyang Changchun wrote: > > > This patch adds new API in virtio for supporting promiscuous and > allmulticast enabling and disabling. > > > > Signed-off-by: Changchun Ouyang > > Acked-by: Huawei Xie > > Acked-by: Cunming Liang > > > > --- > > lib/librte_pmd_virtio/virtio_ethdev.c | 98 > > ++- > > 1 file changed, 97 insertions(+), 1 deletion(-) > > > > diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c > > b/lib/librte_pmd_virtio/virtio_ethdev.c > > index 6293ac6..c7f874a 100644 > > --- a/lib/librte_pmd_virtio/virtio_ethdev.c > > +++ b/lib/librte_pmd_virtio/virtio_ethdev.c > > @@ -66,6 +66,10 @@ static int eth_virtio_dev_init(struct eth_driver > > *eth_drv, static int virtio_dev_configure(struct rte_eth_dev *dev); > > static int virtio_dev_start(struct rte_eth_dev *dev); static void > > virtio_dev_stop(struct rte_eth_dev *dev); > > +static void virtio_dev_promiscuous_enable(struct rte_eth_dev *dev); > > +static void virtio_dev_promiscuous_disable(struct rte_eth_dev *dev); > > +static void virtio_dev_allmulticast_enable(struct rte_eth_dev *dev); > > +static void virtio_dev_allmulticast_disable(struct rte_eth_dev *dev); > > static void virtio_dev_info_get(struct rte_eth_dev *dev, > > struct rte_eth_dev_info *dev_info); static > int > > virtio_dev_link_update(struct rte_eth_dev *dev, @@ -403,6 +407,94 @@ > > virtio_dev_close(struct rte_eth_dev *dev) > > virtio_dev_stop(dev); > > } > > > > +static void > > +virtio_dev_promiscuous_enable(struct rte_eth_dev *dev) { > > + struct virtio_hw *hw > > + = VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private); > > + struct virtio_pmd_ctrl ctrl; > > + int dlen[1]; > > + int ret; 
> > + > > + ctrl.hdr.class = VIRTIO_NET_CTRL_RX; > > + ctrl.hdr.cmd = VIRTIO_NET_CTRL_RX_PROMISC; > > + ctrl.data[0] = 1; > > + dlen[0] = 1; > > + > > + ret = virtio_send_command(hw->cvq, &ctrl, dlen, 1); > > + > > + if (ret) { > > + PMD_INIT_LOG(ERR, "Promisc enabling but send command " > > + "failed, this is too late now...\n"); > > + } > > +} > > + > > +static void > > +virtio_dev_promiscuous_disable(struct rte_eth_dev *dev) { > > + struct virtio_hw *hw > > + = VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private); > > + struct virtio_pmd_ctrl ctrl; > > + int dlen[1]; > > + int ret; > > + > > + ctrl.hdr.class = VIRTIO_NET_CTRL_RX; > > + ctrl.hdr.cmd = VIRTIO_NET_CTRL_RX_PROMISC; > > + ctrl.data[0] = 0; > > + dlen[0] = 1; > > + > > + ret = virtio_send_command(hw->cvq, &ctrl, dlen, 1); > > + > > + if (ret) { > > + PMD_INIT_LOG(ERR, "Promisc disabling but send command " > > + "failed, this is too late now...\n"); > > + } > > +} > > + > > +static void > > +virtio_dev_allmulticast_enable(struct rte_eth_dev *dev) { > > + struct virtio_hw *hw > > + = VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private); > > + struct virtio_pmd_ctrl ctrl; > > + int dlen[1]; > > + int ret; > > + > > + ctrl.hdr.class = VIRTIO_NET_CTRL_RX; > > + ctrl.hdr.cmd = VIRTIO_NET_CTRL_RX_ALLMULTI; > > + ctrl.data[0] = 1; > > + dlen[0] = 1; > > + > > + ret = virtio_send_command(hw->cvq, &ctrl, dlen, 1); > > + > > + if (ret) { > > + PMD_INIT_LOG(ERR, "Promisc enabling but send command " > > + "failed, this is too late now...\n"); > > + } > > +} > > + > > +static void > > +virtio_dev_allmulticast_disable(struct rte_eth_dev *dev) { > > + struct virtio_hw *hw > > + = VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private); > > + struct virtio_pmd_ctrl ctrl; > > + int dlen[1]; > > + int ret; > > + > > + ctrl.hdr.class = VIRTIO_NET_CTRL_RX; > > + ctrl.hdr.cmd = VIRTIO_NET_CTRL_RX_ALLMULTI; > > + ctrl.data[0] = 0; > >
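The four functions quoted above differ only in the command code and the on/off byte, so the request could be built by one shared helper. The sketch below shows that idea on a simplified control structure. The ex_* types and names are hypothetical and are not the PMD's own definitions; the class and command numbers follow the virtio specification's VIRTIO_NET_CTRL_RX values.

```c
#include <stdint.h>
#include <string.h>

/* Values as defined for the virtio-net control virtqueue RX class. */
#define EX_CTRL_RX          0
#define EX_CTRL_RX_PROMISC  0
#define EX_CTRL_RX_ALLMULTI 1

/* Simplified, hypothetical stand-in for struct virtio_pmd_ctrl. */
struct ex_ctrl {
	uint8_t class_id; /* command class */
	uint8_t cmd;      /* command within the class */
	uint8_t status;   /* filled in by the backend */
	uint8_t data[8];  /* command payload */
};

/*
 * Fill in an RX-mode request. Returns the number of data segments,
 * which is what the caller would pass as pkt_num to the send routine.
 */
static int ex_build_rx_mode(struct ex_ctrl *ctrl, int dlen[1],
			    uint8_t cmd, int on)
{
	memset(ctrl, 0, sizeof(*ctrl));
	ctrl->class_id = EX_CTRL_RX;
	ctrl->cmd = cmd;
	ctrl->data[0] = on ? 1 : 0; /* one-byte on/off payload */
	dlen[0] = 1;
	return 1;
}
```

Each enable/disable entry point would then call the helper with its own command and flag, and could log a message that names the right mode when the send fails.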
[dpdk-dev] [RFC 07/10] virtio: remove unnecessary adapter structure
Acked-by: Changchun Ouyang

> -----Original Message-----
> From: Stephen Hemminger [mailto:stephen at networkplumber.org]
> Sent: Tuesday, August 26, 2014 10:08 AM
> To: Ouyang, Changchun
> Cc: dev at dpdk.org
> Subject: [RFC 07/10] virtio: remove unnecessary adapter structure
>
> Cleanup virtio code by eliminating unnecessary nesting of virtio hardware
> structure inside adapter structure.
> Also allows removing unneeded macro, making code clearer.
>
> ---
>  lib/librte_pmd_virtio/virtio_ethdev.c | 31 +++
>  lib/librte_pmd_virtio/virtio_ethdev.h |9 -
>  lib/librte_pmd_virtio/virtio_rxtx.c |3 +--
>  3 files changed, 12 insertions(+), 31 deletions(-)
>
[dpdk-dev] [RFC 10/10] virtio: add support for promiscious and multicast
This patch is very similar to my previous patch:
[PATCH 4/5] virtio: New API to enable/disable multicast and promisc mode
so I suggest applying only one of the two.

Thanks
Changchun

> -----Original Message-----
> From: Stephen Hemminger [mailto:stephen at networkplumber.org]
> Sent: Tuesday, August 26, 2014 10:08 AM
> To: Ouyang, Changchun
> Cc: dev at dpdk.org; Stephen Hemminger
> Subject: [RFC 10/10] virtio: add support for promiscious and multicast
>
> Implement standard virtio controls for enabling and disabling promiscious
> and multicast.
>
> Signed-off-by: Stephen Hemminger
>
> --- a/lib/librte_pmd_virtio/virtio_ethdev.c	2014-08-25 19:00:16.754586819 -0700
> +++ b/lib/librte_pmd_virtio/virtio_ethdev.c	2014-08-25 19:02:48.019397658 -0700
[dpdk-dev] [RFC 01/10] virtio: rearrange resource initialization
Acked-by: Changchun Ouyang > -Original Message- > From: Stephen Hemminger [mailto:stephen at networkplumber.org] > Sent: Tuesday, August 26, 2014 10:08 AM > To: Ouyang, Changchun > Cc: dev at dpdk.org > Subject: [RFC 01/10] virtio: rearrange resource initialization > > For clarity make the setup of PCI resources for Linux into a function rather > than block of code #ifdef'd in middle of dev_init. > > --- > lib/librte_pmd_virtio/virtio_ethdev.c | 76 +++-- > - > 1 file changed, 43 insertions(+), 33 deletions(-) > > --- a/lib/librte_pmd_virtio/virtio_ethdev.c 2014-08-25 > 19:00:03.622515574 -0700 > +++ b/lib/librte_pmd_virtio/virtio_ethdev.c 2014-08-25 > 19:00:03.622515574 -0700 > @@ -706,6 +706,41 @@ virtio_has_msix(const struct rte_pci_add > > return (d != NULL); > } > + > +/* Extract I/O port numbers from sysfs */ static int > +virtio_resource_init(struct rte_pci_device *pci_dev) { > + char dirname[PATH_MAX]; > + char filename[PATH_MAX]; > + unsigned long start, size; > + > + if (get_uio_dev(&pci_dev->addr, dirname, sizeof(dirname)) < 0) > + return -1; > + > + /* get portio size */ > + snprintf(filename, sizeof(filename), > + "%s/portio/port0/size", dirname); > + if (parse_sysfs_value(filename, &size) < 0) { > + PMD_INIT_LOG(ERR, "%s(): cannot parse size", > + __func__); > + return -1; > + } > + > + /* get portio start */ > + snprintf(filename, sizeof(filename), > + "%s/portio/port0/start", dirname); > + if (parse_sysfs_value(filename, &start) < 0) { > + PMD_INIT_LOG(ERR, "%s(): cannot parse portio start", > + __func__); > + return -1; > + } > + pci_dev->mem_resource[0].addr = (void *)(uintptr_t)start; > + pci_dev->mem_resource[0].len = (uint64_t)size; > + PMD_INIT_LOG(DEBUG, > + "PCI Port IO found start=0x%lx with size=0x%lx", > + start, size); > + return 0; > +} > #else > static int > virtio_has_msix(const struct rte_pci_addr *loc __rte_unused) @@ -713,6 > +748,12 @@ virtio_has_msix(const struct rte_pci_add > /* nic_uio does not enable interrupts, return 0 
(false). */ > return 0; > } > + > +static int virtio_resource_init(struct rte_pci_device *pci_dev > +__rte_unused) { > + /* no setup required */ > + return 0; > +} > #endif > > /* > @@ -749,40 +790,9 @@ eth_virtio_dev_init(__rte_unused struct > return 0; > > pci_dev = eth_dev->pci_dev; > + if (virtio_resource_init(pci_dev) < 0) > + return -1; > > -#ifdef RTE_EXEC_ENV_LINUXAPP > - { > - char dirname[PATH_MAX]; > - char filename[PATH_MAX]; > - unsigned long start, size; > - > - if (get_uio_dev(&pci_dev->addr, dirname, sizeof(dirname)) > < 0) > - return -1; > - > - /* get portio size */ > - snprintf(filename, sizeof(filename), > - "%s/portio/port0/size", dirname); > - if (parse_sysfs_value(filename, &size) < 0) { > - PMD_INIT_LOG(ERR, "%s(): cannot parse size", > - __func__); > - return -1; > - } > - > - /* get portio start */ > - snprintf(filename, sizeof(filename), > - "%s/portio/port0/start", dirname); > - if (parse_sysfs_value(filename, &start) < 0) { > - PMD_INIT_LOG(ERR, "%s(): cannot parse portio > start", > - __func__); > - return -1; > - } > - pci_dev->mem_resource[0].addr = (void *)(uintptr_t)start; > - pci_dev->mem_resource[0].len = (uint64_t)size; > - PMD_INIT_LOG(DEBUG, > - "PCI Port IO found start=0x%lx with size=0x%lx", > - start, size); > - } > -#endif > hw->use_msix = virtio_has_msix(&pci_dev->addr); > hw->io_base = (uint32_t)(uintptr_t)pci_dev- > >mem_resource[0].addr; >
[dpdk-dev] [RFC 06/10] virtio: use software vlan stripping
Hi Stephen,

Would you please describe the usage scenario for front-end RX VLAN stripping and TX VLAN insertion?
In our current implementation, the backend strips the VLAN tag on RX and inserts the VLAN tag on TX.

Thanks
Changchun

> -----Original Message-----
> From: Stephen Hemminger [mailto:stephen at networkplumber.org]
> Sent: Tuesday, August 26, 2014 10:08 AM
> To: Ouyang, Changchun
> Cc: dev at dpdk.org; Stephen Hemminger
> Subject: [RFC 06/10] virtio: use software vlan stripping
>
> Implement VLAN stripping in software. This allows application to be device
> independent.
>
> Signed-off-by: Stephen Hemminger
>
>
> ---
>  lib/librte_pmd_virtio/virtio_ethdev.c |2 ++
>  lib/librte_pmd_virtio/virtio_pci.h|1 +
>  lib/librte_pmd_virtio/virtio_rxtx.c | 20 ++--
>  3 files changed, 21 insertions(+), 2 deletions(-)
>
> --- a/lib/librte_pmd_virtio/virtio_ethdev.c	2014-08-25 19:00:07.574537243 -0700
> +++ b/lib/librte_pmd_virtio/virtio_ethdev.c	2014-08-25 19:00:07.574537243 -0700
> @@ -976,6 +976,8 @@ virtio_dev_configure(struct rte_eth_dev
>  		return (-EINVAL);
>  	}
>
> +	hw->vlan_strip = rxmode->hw_vlan_strip;
> +
>  	ret = vtpci_irq_config(hw, 0);
>  	if (ret != 0)
>  		PMD_DRV_LOG(ERR, "failed to set config vector");
> --- a/lib/librte_pmd_virtio/virtio_pci.h	2014-08-25 19:00:07.574537243 -0700
> +++ b/lib/librte_pmd_virtio/virtio_pci.h	2014-08-25 19:00:07.574537243 -0700
> @@ -168,6 +168,7 @@ struct virtio_hw {
>  	uint32_t	max_tx_queues;
>  	uint32_t	max_rx_queues;
>  	uint16_t	vtnet_hdr_size;
> +	uint8_t	vlan_strip;
>  	uint8_t	use_msix;
>  	uint8_t	mac_addr[ETHER_ADDR_LEN];
>  };
> --- a/lib/librte_pmd_virtio/virtio_rxtx.c	2014-08-25 19:00:07.574537243 -0700
> +++ b/lib/librte_pmd_virtio/virtio_rxtx.c	2014-08-25 19:00:07.574537243 -0700
> @@ -49,6 +49,7 @@
>  #include
>  #include
>  #include
> +#include
>
>  #include "virtio_logs.h"
>  #include "virtio_ethdev.h"
> @@ -406,8 +407,8 @@ virtio_dev_tx_queue_setup(struct rte_eth
>
>  	PMD_INIT_FUNC_TRACE();
>
> -	if ((tx_conf->txq_flags & ETH_TXQ_FLAGS_NOOFFLOADS)
> -	    != ETH_TXQ_FLAGS_NOOFFLOADS) {
> +	if ((tx_conf->txq_flags & ETH_TXQ_FLAGS_NOXSUMS)
> +	    != ETH_TXQ_FLAGS_NOXSUMS) {
>  		PMD_INIT_LOG(ERR, "TX checksum offload not supported\n");
>  		return -EINVAL;
>  	}
> @@ -444,6 +445,7 @@ uint16_t
>  virtio_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
>  {
>  	struct virtqueue *rxvq = rx_queue;
> +	struct virtio_hw *hw = rxvq->hw;
>  	struct rte_mbuf *rxm, *new_mbuf;
>  	uint16_t nb_used, num, nb_rx = 0;
>  	uint32_t len[VIRTIO_MBUF_BURST_SZ];
> @@ -487,6 +489,9 @@ virtio_recv_pkts(void *rx_queue, struct
>  		rxm->pkt.pkt_len = (uint32_t)(len[i] - hdr_size);
>  		rxm->pkt.data_len = (uint16_t)(len[i] - hdr_size);
>
> +		if (hw->vlan_strip)
> +			rte_vlan_strip(rxm);
> +
>  		VIRTIO_DUMP_PACKET(rxm, rxm->pkt.data_len);
>
>  		rx_pkts[nb_rx++] = rxm;
> @@ -711,6 +716,17 @@ virtio_xmit_pkts(void *tx_queue, struct
>
>  		if (tx_pkts[nb_tx]->pkt.nb_segs <= txvq->vq_free_cnt) {
>  			txm = tx_pkts[nb_tx];
> +
> +			/* Do VLAN tag insertion */
> +			if (txm->ol_flags & PKT_TX_VLAN_PKT) {
> +				error = rte_vlan_insert(txm);
> +				if (unlikely(error)) {
> +					rte_pktmbuf_free(txm);
> +					++nb_tx;
> +					continue;
> +				}
> +			}
> +
>  			/* Enqueue Packet buffers */
>  			error = virtqueue_enqueue_xmit(txvq, txm);
>  			if (unlikely(error)) {
[dpdk-dev] [RFC 08/10] virtio: remove redundant vq_alignment
Acked-by: Changchun Ouyang

> -----Original Message-----
> From: Stephen Hemminger [mailto:stephen at networkplumber.org]
> Sent: Tuesday, August 26, 2014 10:08 AM
> To: Ouyang, Changchun
> Cc: dev at dpdk.org; Stephen Hemminger
> Subject: [RFC 08/10] virtio: remove redundant vq_alignment
>
> Since vq_alignment is constant (always 4K), it does not need to be part of the
> vring struct.
>
> Signed-off-by: Stephen Hemminger
>
> ---
>  lib/librte_pmd_virtio/virtio_ethdev.c |1 -
>  lib/librte_pmd_virtio/virtio_rxtx.c |2 +-
>  lib/librte_pmd_virtio/virtqueue.h |3 +--
>  3 files changed, 2 insertions(+), 4 deletions(-)
>
> --- a/lib/librte_pmd_virtio/virtio_ethdev.c	2014-08-25 19:00:09.918550097 -0700
> +++ b/lib/librte_pmd_virtio/virtio_ethdev.c	2014-08-25 19:00:09.918550097 -0700
> @@ -290,7 +290,6 @@ int virtio_dev_queue_setup(struct rte_et
>  	vq->port_id = dev->data->port_id;
>  	vq->queue_id = queue_idx;
>  	vq->vq_queue_index = vtpci_queue_idx;
> -	vq->vq_alignment = VIRTIO_PCI_VRING_ALIGN;
>  	vq->vq_nentries = vq_size;
>  	vq->vq_free_cnt = vq_size;
>
> --- a/lib/librte_pmd_virtio/virtio_rxtx.c	2014-08-25 19:00:09.918550097 -0700
> +++ b/lib/librte_pmd_virtio/virtio_rxtx.c	2014-08-25 19:00:09.918550097 -0700
> @@ -258,7 +258,7 @@ virtio_dev_vring_start(struct virtqueue
>  	 * Reinitialise since virtio port might have been stopped and restarted
>  	 */
>  	memset(vq->vq_ring_virt_mem, 0, vq->vq_ring_size);
> -	vring_init(vr, size, ring_mem, vq->vq_alignment);
> +	vring_init(vr, size, ring_mem, VIRTIO_PCI_VRING_ALIGN);
>  	vq->vq_used_cons_idx = 0;
>  	vq->vq_desc_head_idx = 0;
>  	vq->vq_avail_idx = 0;
> --- a/lib/librte_pmd_virtio/virtqueue.h	2014-08-25 19:00:09.918550097 -0700
> +++ b/lib/librte_pmd_virtio/virtqueue.h	2014-08-25 19:00:09.918550097 -0700
> @@ -139,8 +139,7 @@ struct virtqueue {
>  	uint8_t port_id; /**< Device port identifier. */
>
>  	void *vq_ring_virt_mem; /**< linear address of vring*/
> -	int vq_alignment;
> -	int vq_ring_size;
> +	unsigned int vq_ring_size;
>  	phys_addr_t vq_ring_mem; /**< physical address of vring */
>
>  	struct vring vq_ring; /**< vring keeping desc, used and avail */
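To show where the fixed 4K alignment enters the ring layout, here is a minimal sketch of the legacy (split) vring size calculation: the descriptor table and available ring are padded up to the alignment, and the used ring follows. The ex_* names are hypothetical stand-ins and are not the PMD's own definitions; the structure sizes follow the legacy virtio ring layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed to mirror VIRTIO_PCI_VRING_ALIGN (4K) from the patch description. */
#define EX_VRING_ALIGN 4096u

/* Hypothetical stand-ins with the legacy ring element layouts. */
struct ex_vring_desc {
	uint64_t addr;
	uint32_t len;
	uint16_t flags;
	uint16_t next;
};

struct ex_vring_used_elem {
	uint32_t id;
	uint32_t len;
};

/* Bytes needed for a ring with num entries; align must be a power of two. */
static size_t ex_vring_size(unsigned int num, size_t align)
{
	/* descriptor table + avail ring (flags, idx, ring[num], used_event) */
	size_t size = sizeof(struct ex_vring_desc) * num
		    + sizeof(uint16_t) * (3 + num);

	/* the used ring starts on the next alignment boundary */
	size = (size + align - 1) & ~(align - 1);

	/* used ring (flags, idx, avail_event) + ring[num] */
	size += sizeof(uint16_t) * 3
	      + sizeof(struct ex_vring_used_elem) * num;
	return size;
}
```

Because the alignment is a compile-time constant, callers can pass it directly, which is the simplification the patch makes by dropping the per-queue vq_alignment field.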
[dpdk-dev] [RFC] lib/librte_vhost: qemu vhost-user support into DPDK vhost library
Do we have a performance comparison between the two implementations?

Thanks
Changchun

-----Original Message-----
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Xie, Huawei
Sent: Tuesday, August 26, 2014 7:06 PM
To: dev at dpdk.org
Subject: Re: [dpdk-dev] [RFC] lib/librte_vhost: qemu vhost-user support into DPDK vhost library

Hi all:
We are implementing qemu official vhost-user interface into DPDK vhost library, so there would be two coexisting implementations for user space vhost backend.
Pro and cons in my mind:
Existing solution:
	Pros: works with qemu version before 2.1
	Cons: depends on eventfd proxy kernel module and extra maintenance effort
Qemu vhost-user:
	Pros: qemu official us-vhost interface
	Cons: only available after qemu 2.1

BR.
huawei
[dpdk-dev] [RFC] lib/librte_vhost: qemu vhost-user support into DPDK vhost library
Hi Tetsuya,

Thanks for your response.
I agree with you: the performance should be the same, as the data path (RX/TX) is not affected.
The two implementations differ only in the virtio device creation and destruction stages.

Regards,
Changchun

> -----Original Message-----
> From: Tetsuya.Mukawa [mailto:mukawa at igel.co.jp]
> Sent: Wednesday, August 27, 2014 12:39 PM
> To: Ouyang, Changchun; dev at dpdk.org
> Cc: Xie, Huawei; Katsuya MATSUBARA; nakajima.yoshihiro at lab.ntt.co.jp;
> Hitoshi Masutani
> Subject: Re: [dpdk-dev] [RFC] lib/librte_vhost: qemu vhost-user support into
> DPDK vhost library
>
>
> (2014/08/27 9:43), Ouyang, Changchun wrote:
> > Do we have performance comparison between both implementation?
> Hi Changchun,
>
> If DPDK applications are running on both guest and host side, the
> performance should be almost same, because while transmitting data virt
> queues are accessed by virtio-net PMD and libvhost. In libvhost, the existing
> vhost implementation and a vhost-user implementation will shares or uses
> same code to access virt queues. So I guess the performance will be almost
> same.
>
> Thanks,
> Tetsuya
>
>
> > Thanks
> > Changchun
> >
> >
> > -----Original Message-----
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Xie, Huawei
> > Sent: Tuesday, August 26, 2014 7:06 PM
> > To: dev at dpdk.org
> > Subject: Re: [dpdk-dev] [RFC] lib/librte_vhost: qemu vhost-user
> > support into DPDK vhost library
> >
> > Hi all:
> > We are implementing qemu official vhost-user interface into DPDK vhost
> > library, so there would be two coexisting implementations for user space
> > vhost backend.
> > Pro and cons in my mind:
> > Existing solution:
> > Pros: works with qemu version before 2.1; Cons: depends on eventfd
> > proxy kernel module and extra maintenance effort
> > Qemu vhost-user:
> > Pros: qemu official us-vhost interface; Cons: only available after
> > qemu 2.1
> >
> > BR.
> > huawei
[dpdk-dev] [RFC 06/10] virtio: use software vlan stripping
> -----Original Message-----
> From: Stephen Hemminger [mailto:stephen at networkplumber.org]
> Sent: Wednesday, August 27, 2014 12:24 AM
> To: Ouyang, Changchun
> Cc: dev at dpdk.org
> Subject: Re: [RFC 06/10] virtio: use software vlan stripping
>
> On Tue, 26 Aug 2014 08:37:11 +0000
> "Ouyang, Changchun" wrote:
>
> > Hi Stephen,
> >
> > Would you please describe the use scenario for the front end rx vlan strip
> > and tx vlan insertion?
> > In our current implementation, backend will strip vlan tag for RX, and
> > insert vlan tag for TX.
> >
> > Thanks
> > Changchun
>
> First, we don't have to do software VLAN strip on our backend if we do this.
> And this way we can always use VLAN insert on transmit. Otherwise you
> have to introduce special case because there is no DPDK API to determine if
> device does or does not do VLAN handling.
>

How does the virtio frontend tell the backend whether it has the software VLAN strip feature or not?
There seems to be no feature bit to negotiate it.

Thanks
Changchun
[dpdk-dev] [RFC] lib/librte_vhost: qemu vhost-user support into DPDK vhost library
> -----Original Message-----
> From: Tetsuya.Mukawa [mailto:mukawa at igel.co.jp]
> Sent: Wednesday, August 27, 2014 1:28 PM
> To: Ouyang, Changchun; dev at dpdk.org
> Cc: Xie, Huawei; Katsuya MATSUBARA; nakajima.yoshihiro at lab.ntt.co.jp;
> Hitoshi Masutani
> Subject: Re: [dpdk-dev] [RFC] lib/librte_vhost: qemu vhost-user support into
> DPDK vhost library
>
> Hi Changchun,
>
> (2014/08/27 14:01), Ouyang, Changchun wrote:
> > Agree with you, the performance should be same as the data path
> > (RX/TX) is not affected, The difference between implementation only
> > exists in the virtio device creation and destroy stage.
> Yes, I agree. Also There may be the difference, if a virtio-net driver on a
> guest isn't poll mode like a virtio-net device driver in the kernel. In the case,
> existing vhost implementation uses the eventfd kernel module, and vhost-
> user implementation uses eventfd to kick the driver. So I guess there will be
> the difference.
>
> Anyway, about device creation and destruction, the difference will come
> from transmission speed between unix domain socket and CUSE. I am not
> sure which is faster.

Yes. It does not matter which one is faster for virtio device creation and destruction, as neither is in the data path.

> Thanks,
> Tetsuya
>
> >
> > Regards,
> > Changchun
> >
> >> -----Original Message-----
> >> From: Tetsuya.Mukawa [mailto:mukawa at igel.co.jp]
> >> Sent: Wednesday, August 27, 2014 12:39 PM
> >> To: Ouyang, Changchun; dev at dpdk.org
> >> Cc: Xie, Huawei; Katsuya MATSUBARA; nakajima.yoshihiro at lab.ntt.co.jp;
> >> Hitoshi Masutani
> >> Subject: Re: [dpdk-dev] [RFC] lib/librte_vhost: qemu vhost-user
> >> support into DPDK vhost library
> >>
> >>
> >> (2014/08/27 9:43), Ouyang, Changchun wrote:
> >>> Do we have performance comparison between both implementation?
> >> Hi Changchun,
> >>
> >> If DPDK applications are running on both guest and host side, the
> >> performance should be almost same, because while transmitting data
> >> virt queues are accessed by virtio-net PMD and libvhost. In libvhost,
> >> the existing vhost implementation and a vhost-user implementation
> >> will shares or uses same code to access virt queues. So I guess the
> >> performance will be almost same.
> >>
> >> Thanks,
> >> Tetsuya
> >>
> >>
> >>> Thanks
> >>> Changchun
> >>>
> >>>
> >>> -----Original Message-----
> >>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Xie, Huawei
> >>> Sent: Tuesday, August 26, 2014 7:06 PM
> >>> To: dev at dpdk.org
> >>> Subject: Re: [dpdk-dev] [RFC] lib/librte_vhost: qemu vhost-user
> >>> support into DPDK vhost library
> >>>
> >>> Hi all:
> >>> We are implementing qemu official vhost-user interface into DPDK
> >>> vhost library, so there would be two coexisting implementations for
> >>> user space vhost backend.
> >>> Pro and cons in my mind:
> >>> Existing solution:
> >>> Pros: works with qemu version before 2.1; Cons: depends on eventfd
> >>> proxy kernel module and extra maintenance effort
> >>> Qemu vhost-user:
> >>> Pros: qemu official us-vhost interface; Cons: only available after
> >>> qemu 2.1
> >>> BR.
> >>> huawei
[dpdk-dev] virtio merging - no UIO
Hi Vincent,

Thanks for highlighting this. We will consider how to resolve it.

regards
Changchun

> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Vincent JARDIN
> Sent: Tuesday, December 2, 2014 4:23 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] virtio merging - no UIO
>
> From today's call, I'd like to highlight that virtio-net-pmd (said code B - from
> 6WIND) does not require UIO; it was required for some security reasons of
> the guest Linux OS:
>    http://dpdk.org/browse/virtio-net-pmd/tree/virtio_user.c#n1494
>
> Thank you,
>    Vincent
[dpdk-dev] [RFC PATCH 01/17] virtio: Rearrange resource initialization
For clarity make the setup of PCI resources for Linux into a function rather than block of code #ifdef'd in middle of dev_init. Signed-off-by: Changchun Ouyang Signed-off-by: Stephen Hemminger --- lib/librte_pmd_virtio/virtio_ethdev.c | 76 --- 1 file changed, 43 insertions(+), 33 deletions(-) diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c index c009f2a..6c31598 100644 --- a/lib/librte_pmd_virtio/virtio_ethdev.c +++ b/lib/librte_pmd_virtio/virtio_ethdev.c @@ -794,6 +794,41 @@ virtio_has_msix(const struct rte_pci_addr *loc) return (d != NULL); } + +/* Extract I/O port numbers from sysfs */ +static int virtio_resource_init(struct rte_pci_device *pci_dev) +{ + char dirname[PATH_MAX]; + char filename[PATH_MAX]; + unsigned long start, size; + + if (get_uio_dev(&pci_dev->addr, dirname, sizeof(dirname)) < 0) + return -1; + + /* get portio size */ + snprintf(filename, sizeof(filename), +"%s/portio/port0/size", dirname); + if (parse_sysfs_value(filename, &size) < 0) { + PMD_INIT_LOG(ERR, "%s(): cannot parse size", +__func__); + return -1; + } + + /* get portio start */ + snprintf(filename, sizeof(filename), +"%s/portio/port0/start", dirname); + if (parse_sysfs_value(filename, &start) < 0) { + PMD_INIT_LOG(ERR, "%s(): cannot parse portio start", +__func__); + return -1; + } + pci_dev->mem_resource[0].addr = (void *)(uintptr_t)start; + pci_dev->mem_resource[0].len = (uint64_t)size; + PMD_INIT_LOG(DEBUG, +"PCI Port IO found start=0x%lx with size=0x%lx", +start, size); + return 0; +} #else static int virtio_has_msix(const struct rte_pci_addr *loc __rte_unused) @@ -801,6 +836,12 @@ virtio_has_msix(const struct rte_pci_addr *loc __rte_unused) /* nic_uio does not enable interrupts, return 0 (false). 
*/ return 0; } + +static int virtio_resource_init(struct rte_pci_device *pci_dev __rte_unused) +{ + /* no setup required */ + return 0; +} #endif /* @@ -831,40 +872,9 @@ eth_virtio_dev_init(__rte_unused struct eth_driver *eth_drv, return 0; pci_dev = eth_dev->pci_dev; + if (virtio_resource_init(pci_dev) < 0) + return -1; -#ifdef RTE_EXEC_ENV_LINUXAPP - { - char dirname[PATH_MAX]; - char filename[PATH_MAX]; - unsigned long start, size; - - if (get_uio_dev(&pci_dev->addr, dirname, sizeof(dirname)) < 0) - return -1; - - /* get portio size */ - snprintf(filename, sizeof(filename), -"%s/portio/port0/size", dirname); - if (parse_sysfs_value(filename, &size) < 0) { - PMD_INIT_LOG(ERR, "%s(): cannot parse size", -__func__); - return -1; - } - - /* get portio start */ - snprintf(filename, sizeof(filename), -"%s/portio/port0/start", dirname); - if (parse_sysfs_value(filename, &start) < 0) { - PMD_INIT_LOG(ERR, "%s(): cannot parse portio start", -__func__); - return -1; - } - pci_dev->mem_resource[0].addr = (void *)(uintptr_t)start; - pci_dev->mem_resource[0].len = (uint64_t)size; - PMD_INIT_LOG(DEBUG, -"PCI Port IO found start=0x%lx with size=0x%lx", -start, size); - } -#endif hw->use_msix = virtio_has_msix(&pci_dev->addr); hw->io_base = (uint32_t)(uintptr_t)pci_dev->mem_resource[0].addr; -- 1.8.4.2
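The new virtio_resource_init() relies on a helper that turns the text of a sysfs attribute (for example the content of portio/port0/start) into an unsigned long. A minimal sketch of that parsing step is shown below, working on an in-memory string instead of a file. ex_parse_value() is a hypothetical stand-in and is not the EAL's parse_sysfs_value().

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Parse a sysfs-style value such as "0xc040\n" or "32\n".
 * Returns 0 on success and stores the result in *val, -1 on error.
 */
static int ex_parse_value(const char *text, unsigned long *val)
{
	char *end = NULL;
	unsigned long v;

	errno = 0;
	v = strtoul(text, &end, 0); /* base 0 accepts both 0x... and decimal */

	/* reject overflow, empty input and trailing garbage */
	if (errno != 0 || end == text || (*end != '\0' && *end != '\n'))
		return -1;

	*val = v;
	return 0;
}
```

The caller would run this once for the port I/O size and once for the start address, and fail device initialisation if either call returns an error, as the patch does.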
[dpdk-dev] [RFC PATCH 05/17] ether: Add soft vlan encap/decap functions
It is helpful to allow device drivers that don't support hardware
VLAN stripping to emulate this in software. This allows applications
to be device independent.

Avoid discarding shared mbufs: rte_vlan_insert() makes a copy of any
packet to be tagged that has a reference count > 1.

Signed-off-by: Changchun Ouyang
Signed-off-by: Stephen Hemminger
---
 lib/librte_ether/rte_ether.h | 76
 1 file changed, 76 insertions(+)

diff --git a/lib/librte_ether/rte_ether.h b/lib/librte_ether/rte_ether.h
index 187608d..3b6ab4b 100644
--- a/lib/librte_ether/rte_ether.h
+++ b/lib/librte_ether/rte_ether.h
@@ -49,6 +49,8 @@ extern "C" {
 
 #include
 #include
+#include
+#include
 
 #define ETHER_ADDR_LEN 6 /**< Length of Ethernet address. */
 #define ETHER_TYPE_LEN 2 /**< Length of Ethernet type field. */
@@ -332,6 +334,80 @@ struct vxlan_hdr {
 #define ETHER_VXLAN_HLEN (sizeof(struct udp_hdr) + sizeof(struct vxlan_hdr))
 /**< VXLAN tunnel header length. */
 
+/**
+ * Extract VLAN tag information into mbuf
+ *
+ * Software version of VLAN stripping
+ *
+ * @param m
+ *   The packet mbuf.
+ * @return
+ *   - 0: Success
+ *   - -1: not a vlan packet
+ */
+static inline int rte_vlan_strip(struct rte_mbuf *m)
+{
+	struct ether_hdr *eh
+		 = rte_pktmbuf_mtod(m, struct ether_hdr *);
+
+	if (eh->ether_type != rte_cpu_to_be_16(ETHER_TYPE_VLAN))
+		return -1;
+
+	struct vlan_hdr *vh = (struct vlan_hdr *)(eh + 1);
+	m->ol_flags |= PKT_RX_VLAN_PKT;
+	m->vlan_tci = rte_be_to_cpu_16(vh->vlan_tci);
+
+	/* Copy ether header over rather than moving whole packet */
+	memmove(rte_pktmbuf_adj(m, sizeof(struct vlan_hdr)),
+		eh, 2 * ETHER_ADDR_LEN);
+
+	return 0;
+}
+
+/**
+ * Insert VLAN tag into mbuf.
+ *
+ * Software version of VLAN unstripping
+ *
+ * @param m
+ *   The packet mbuf.
+ * @return
+ *   - 0: On success
+ *   -ENOMEM: mbuf is shared and could not be cloned
+ *   -ENOSPC: not enough headroom in mbuf
+ */
+static inline int rte_vlan_insert(struct rte_mbuf **m)
+{
+	struct ether_hdr *oh, *nh;
+	struct vlan_hdr *vh;
+
+#ifdef RTE_MBUF_REFCNT
+	/* Can't insert header if mbuf is shared */
+	if (rte_mbuf_refcnt_read(*m) > 1) {
+		struct rte_mbuf *copy;
+
+		copy = rte_pktmbuf_clone(*m, (*m)->pool);
+		if (unlikely(copy == NULL))
+			return -ENOMEM;
+		rte_pktmbuf_free(*m);
+		*m = copy;
+	}
+#endif
+	oh = rte_pktmbuf_mtod(*m, struct ether_hdr *);
+	nh = (struct ether_hdr *)
+		rte_pktmbuf_prepend(*m, sizeof(struct vlan_hdr));
+	if (nh == NULL)
+		return -ENOSPC;
+
+	memmove(nh, oh, 2 * ETHER_ADDR_LEN);
+	nh->ether_type = rte_cpu_to_be_16(ETHER_TYPE_VLAN);
+
+	vh = (struct vlan_hdr *) (nh + 1);
+	vh->vlan_tci = rte_cpu_to_be_16((*m)->vlan_tci);
+
+	return 0;
+}
+
 #ifdef __cplusplus
 }
 #endif
--
1.8.4.2
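The header shuffling in the two functions above can be illustrated on a plain byte buffer, without mbufs. Stripping moves the two MAC addresses forward over the 4-byte tag, and inserting moves them back into 4 bytes of headroom and writes the tag behind them. The sketch below assumes a contiguous frame with at least 4 bytes of headroom before it; all ex_* names are hypothetical and are not part of rte_ether.h.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EX_ETHER_ADDR_LEN  6
#define EX_VLAN_HLEN       4      /* TPID (2 bytes) + TCI (2 bytes) */
#define EX_ETHER_TYPE_VLAN 0x8100

/*
 * Strip an 802.1Q tag in place. On success the untagged frame starts
 * EX_VLAN_HLEN bytes after 'frame'; that offset is returned and the TCI
 * is stored in host byte order. Returns -1 if the frame is not tagged.
 */
static int ex_vlan_strip(uint8_t *frame, size_t len, uint16_t *tci)
{
	uint16_t type;

	if (len < 2 * EX_ETHER_ADDR_LEN + EX_VLAN_HLEN + 2)
		return -1;

	/* the ethertype is big-endian on the wire */
	type = (uint16_t)((frame[12] << 8) | frame[13]);
	if (type != EX_ETHER_TYPE_VLAN)
		return -1;

	*tci = (uint16_t)((frame[14] << 8) | frame[15]);

	/* move only the two MAC addresses, not the whole packet */
	memmove(frame + EX_VLAN_HLEN, frame, 2 * EX_ETHER_ADDR_LEN);
	return EX_VLAN_HLEN;
}

/*
 * Insert an 802.1Q tag. The caller must guarantee EX_VLAN_HLEN bytes of
 * writable headroom before 'frame'. Returns the new start of the frame.
 */
static uint8_t *ex_vlan_insert(uint8_t *frame, uint16_t tci)
{
	uint8_t *nh = frame - EX_VLAN_HLEN;

	memmove(nh, frame, 2 * EX_ETHER_ADDR_LEN);
	nh[12] = (uint8_t)(EX_ETHER_TYPE_VLAN >> 8);
	nh[13] = (uint8_t)(EX_ETHER_TYPE_VLAN & 0xff);
	nh[14] = (uint8_t)(tci >> 8);
	nh[15] = (uint8_t)(tci & 0xff);
	/* the original ethertype now sits at nh[16..17], untouched */
	return nh;
}
```

The explicit byte-wise reads and writes keep the wire format big-endian regardless of the host, which is the same concern the byte-order conversion macros address in the mbuf versions.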
[dpdk-dev] [RFC PATCH 07/17] virtio: Remove unnecessary adapter structure
Cleanup virtio code by eliminating unnecessary nesting of virtio hardware structure inside adapter structure. Also allows removing unneeded macro, making code clearer. Signed-off-by: Changchun Ouyang Signed-off-by: Stephen Hemminger --- lib/librte_pmd_virtio/virtio_ethdev.c | 43 --- lib/librte_pmd_virtio/virtio_ethdev.h | 9 lib/librte_pmd_virtio/virtio_rxtx.c | 3 +-- 3 files changed, 16 insertions(+), 39 deletions(-) diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c index 829838c..c89614d 100644 --- a/lib/librte_pmd_virtio/virtio_ethdev.c +++ b/lib/librte_pmd_virtio/virtio_ethdev.c @@ -207,8 +207,7 @@ virtio_send_command(struct virtqueue *vq, struct virtio_pmd_ctrl *ctrl, static int virtio_set_multiple_queues(struct rte_eth_dev *dev, uint16_t nb_queues) { - struct virtio_hw *hw - = VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private); + struct virtio_hw *hw = dev->data->dev_private; struct virtio_pmd_ctrl ctrl; int dlen[1]; int ret; @@ -242,8 +241,7 @@ int virtio_dev_queue_setup(struct rte_eth_dev *dev, const struct rte_memzone *mz; uint16_t vq_size; int size; - struct virtio_hw *hw = - VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private); + struct virtio_hw *hw = dev->data->dev_private; struct virtqueue *vq = NULL; /* Write the virtqueue index to the Queue Select Field */ @@ -383,8 +381,7 @@ virtio_dev_cq_queue_setup(struct rte_eth_dev *dev, uint16_t vtpci_queue_idx, struct virtqueue *vq; uint16_t nb_desc = 0; int ret; - struct virtio_hw *hw = - VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private); + struct virtio_hw *hw = dev->data->dev_private; PMD_INIT_FUNC_TRACE(); ret = virtio_dev_queue_setup(dev, VTNET_CQ, VTNET_SQ_CQ_QUEUE_IDX, @@ -410,8 +407,7 @@ virtio_dev_close(struct rte_eth_dev *dev) static void virtio_dev_promiscuous_enable(struct rte_eth_dev *dev) { - struct virtio_hw *hw - = VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private); + struct virtio_hw *hw = dev->data->dev_private; struct virtio_pmd_ctrl ctrl; int dlen[1]; int 
ret; @@ -430,8 +426,7 @@ virtio_dev_promiscuous_enable(struct rte_eth_dev *dev) static void virtio_dev_promiscuous_disable(struct rte_eth_dev *dev) { - struct virtio_hw *hw - = VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private); + struct virtio_hw *hw = dev->data->dev_private; struct virtio_pmd_ctrl ctrl; int dlen[1]; int ret; @@ -450,8 +445,7 @@ virtio_dev_promiscuous_disable(struct rte_eth_dev *dev) static void virtio_dev_allmulticast_enable(struct rte_eth_dev *dev) { - struct virtio_hw *hw - = VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private); + struct virtio_hw *hw = dev->data->dev_private; struct virtio_pmd_ctrl ctrl; int dlen[1]; int ret; @@ -470,8 +464,7 @@ virtio_dev_allmulticast_enable(struct rte_eth_dev *dev) static void virtio_dev_allmulticast_disable(struct rte_eth_dev *dev) { - struct virtio_hw *hw - = VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private); + struct virtio_hw *hw = dev->data->dev_private; struct virtio_pmd_ctrl ctrl; int dlen[1]; int ret; @@ -853,8 +846,7 @@ virtio_interrupt_handler(__rte_unused struct rte_intr_handle *handle, void *param) { struct rte_eth_dev *dev = param; - struct virtio_hw *hw = - VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private); + struct virtio_hw *hw = dev->data->dev_private; uint8_t isr; /* Read interrupt status which clears interrupt */ @@ -880,12 +872,11 @@ static int eth_virtio_dev_init(__rte_unused struct eth_driver *eth_drv, struct rte_eth_dev *eth_dev) { + struct virtio_hw *hw = eth_dev->data->dev_private; struct virtio_net_config *config; struct virtio_net_config local_config; uint32_t offset_conf = sizeof(config->mac); struct rte_pci_device *pci_dev; - struct virtio_hw *hw = - VIRTIO_DEV_PRIVATE_TO_HW(eth_dev->data->dev_private); if (RTE_PKTMBUF_HEADROOM < sizeof(struct virtio_net_hdr)) { PMD_INIT_LOG(ERR, @@ -1010,7 +1001,7 @@ static struct eth_driver rte_virtio_pmd = { .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC, }, .eth_dev_init = eth_virtio_dev_init, - .dev_private_size = sizeof(struct 
virtio_adapter), + .dev_private_size = sizeof(struct virtio_hw), }; /* @@ -1053,8 +1044,7 @@ static int virtio_dev_configure(struct rte_eth_dev *dev) { const struct rte_eth_rxmode *rxmode = &dev->data->dev_conf.rxmode; - struct virtio_hw *hw = - VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private); +
[dpdk-dev] [RFC PATCH 02/17] virtio: Use weaker barriers
The DPDK driver only has to deal with the case of running on PCI and with SMP. In this case, the code can use the weaker barriers instead of using hard (fence) barriers. This will help performance. The rationale is explained in Linux kernel virtio_ring.h. To make it clearer that this is a virtio thing and not some generic barrier, prefix the barrier calls with virtio_. Add missing (and needed) barrier between updating ring data structure and notifying host. Signed-off-by: Changchun Ouyang Signed-off-by: Stephen Hemminger --- lib/librte_pmd_virtio/virtio_ethdev.c | 2 +- lib/librte_pmd_virtio/virtio_rxtx.c | 8 +--- lib/librte_pmd_virtio/virtqueue.h | 19 ++- 3 files changed, 20 insertions(+), 9 deletions(-) diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c index 6c31598..78018f9 100644 --- a/lib/librte_pmd_virtio/virtio_ethdev.c +++ b/lib/librte_pmd_virtio/virtio_ethdev.c @@ -175,7 +175,7 @@ virtio_send_command(struct virtqueue *vq, struct virtio_pmd_ctrl *ctrl, uint32_t idx, desc_idx, used_idx; struct vring_used_elem *uep; - rmb(); + virtio_rmb(); used_idx = (uint32_t)(vq->vq_used_cons_idx & (vq->vq_nentries - 1)); diff --git a/lib/librte_pmd_virtio/virtio_rxtx.c b/lib/librte_pmd_virtio/virtio_rxtx.c index 3f6bad2..f878c62 100644 --- a/lib/librte_pmd_virtio/virtio_rxtx.c +++ b/lib/librte_pmd_virtio/virtio_rxtx.c @@ -456,7 +456,7 @@ virtio_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts) nb_used = VIRTQUEUE_NUSED(rxvq); - rmb(); + virtio_rmb(); num = (uint16_t)(likely(nb_used <= nb_pkts) ? nb_used : nb_pkts); num = (uint16_t)(likely(num <= VIRTIO_MBUF_BURST_SZ) ? 
num : VIRTIO_MBUF_BURST_SZ); @@ -516,6 +516,7 @@ virtio_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts) } if (likely(nb_enqueued)) { + virtio_wmb(); if (unlikely(virtqueue_kick_prepare(rxvq))) { virtqueue_notify(rxvq); PMD_RX_LOG(DEBUG, "Notified\n"); @@ -547,7 +548,7 @@ virtio_recv_mergeable_pkts(void *rx_queue, nb_used = VIRTQUEUE_NUSED(rxvq); - rmb(); + virtio_rmb(); if (nb_used == 0) return 0; @@ -694,7 +695,7 @@ virtio_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts) PMD_TX_LOG(DEBUG, "%d packets to xmit", nb_pkts); nb_used = VIRTQUEUE_NUSED(txvq); - rmb(); + virtio_rmb(); num = (uint16_t)(likely(nb_used < VIRTIO_MBUF_BURST_SZ) ? nb_used : VIRTIO_MBUF_BURST_SZ); @@ -735,6 +736,7 @@ virtio_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts) } } vq_update_avail_idx(txvq); + virtio_wmb(); txvq->packets += nb_tx; diff --git a/lib/librte_pmd_virtio/virtqueue.h b/lib/librte_pmd_virtio/virtqueue.h index fdee054..f6ad98d 100644 --- a/lib/librte_pmd_virtio/virtqueue.h +++ b/lib/librte_pmd_virtio/virtqueue.h @@ -46,9 +46,18 @@ #include "virtio_ring.h" #include "virtio_logs.h" -#define mb() rte_mb() -#define wmb() rte_wmb() -#define rmb() rte_rmb() +/* + * Per virtio_config.h in Linux. + * For virtio_pci on SMP, we don't need to order with respect to MMIO + * accesses through relaxed memory I/O windows, so smp_mb() et al are + * sufficient. 
+ * + * This driver is for virtio_pci on SMP and therefore can assume + * weaker (compiler barriers) + */ +#define virtio_mb()rte_mb() +#define virtio_rmb() rte_compiler_barrier() +#define virtio_wmb() rte_compiler_barrier() #ifdef RTE_PMD_PACKET_PREFETCH #define rte_packet_prefetch(p) rte_prefetch1(p) @@ -225,7 +234,7 @@ virtqueue_full(const struct virtqueue *vq) static inline void vq_update_avail_idx(struct virtqueue *vq) { - rte_compiler_barrier(); + virtio_rmb(); vq->vq_ring.avail->idx = vq->vq_avail_idx; } @@ -255,7 +264,7 @@ static inline void virtqueue_notify(struct virtqueue *vq) { /* -* Ensure updated avail->idx is visible to host. mb() necessary? +* Ensure updated avail->idx is visible to host. * For virtio on IA, the notificaiton is through io port operation * which is a serialization instruction itself. */ -- 1.8.4.2
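The difference between a hard fence and the weaker barrier used above can be sketched with a toy single-producer ring. This is a minimal illustration, not code from the patch: `compiler_barrier()`, `struct toy_ring` and both helper functions are hypothetical names, the inline-asm form assumes GCC or Clang, and relying on a compiler barrier alone assumes a strongly ordered CPU such as x86, which is the case the commit message describes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in in the spirit of rte_compiler_barrier(): it only
 * stops the compiler from moving memory accesses across this point and
 * emits no CPU fence instruction (GCC/Clang syntax assumed). */
#define compiler_barrier() __asm__ volatile("" ::: "memory")

#define TOY_RING_SIZE 8 /* power of two, illustrative only */

struct toy_ring {
	uint16_t slots[TOY_RING_SIZE];
	volatile uint16_t idx; /* index published by the producer */
};

/* Producer: fill the slot first, then publish the new index. */
static inline void
toy_ring_publish(struct toy_ring *r, uint16_t value)
{
	uint16_t next;

	assert(r != NULL);
	next = r->idx;
	r->slots[next & (TOY_RING_SIZE - 1)] = value;
	compiler_barrier(); /* keep the slot write ahead of the index write */
	r->idx = (uint16_t)(next + 1);
}

/* Consumer: read the index first, then the slot it covers.
 * Returns 1 and stores the value in *out if an entry was available. */
static inline int
toy_ring_consume(struct toy_ring *r, uint16_t *cons, uint16_t *out)
{
	assert(r != NULL && cons != NULL && out != NULL);
	if (*cons == r->idx)
		return 0;
	compiler_barrier(); /* keep the index read ahead of the slot read */
	*out = r->slots[*cons & (TOY_RING_SIZE - 1)];
	(*cons)++;
	return 1;
}
```

On a weakly ordered architecture the two barrier sites would need real store and load fences, which is why the patch keeps the choice behind the virtio_ prefixed macros.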
[dpdk-dev] [RFC PATCH 04/17] virtio: Add support for Link State interrupt
Virtio has link state interrupt which can be used. Signed-off-by: Changchun Ouyang Signed-off-by: Stephen Hemminger --- lib/librte_pmd_virtio/virtio_ethdev.c | 78 +++ lib/librte_pmd_virtio/virtio_pci.c| 22 ++ lib/librte_pmd_virtio/virtio_pci.h| 4 ++ 3 files changed, 86 insertions(+), 18 deletions(-) diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c index 4bff0fe..d37f2e9 100644 --- a/lib/librte_pmd_virtio/virtio_ethdev.c +++ b/lib/librte_pmd_virtio/virtio_ethdev.c @@ -845,6 +845,34 @@ static int virtio_resource_init(struct rte_pci_device *pci_dev __rte_unused) #endif /* + * Process Virtio Config changed interrupt and call the callback + * if link state changed. + */ +static void +virtio_interrupt_handler(__rte_unused struct rte_intr_handle *handle, +void *param) +{ + struct rte_eth_dev *dev = param; + struct virtio_hw *hw = + VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private); + uint8_t isr; + + /* Read interrupt status which clears interrupt */ + isr = vtpci_isr(hw); + PMD_DRV_LOG(INFO, "interrupt status = %#x", isr); + + if (rte_intr_enable(&dev->pci_dev->intr_handle) < 0) + PMD_DRV_LOG(ERR, "interrupt enable failed"); + + if (isr & VIRTIO_PCI_ISR_CONFIG) { + if (virtio_dev_link_update(dev, 0) == 0) + _rte_eth_dev_callback_process(dev, + RTE_ETH_EVENT_INTR_LSC); + } + +} + +/* * This function is based on probe() function in virtio_pci.c * It returns 0 on success. 
*/ @@ -968,6 +996,10 @@ eth_virtio_dev_init(__rte_unused struct eth_driver *eth_drv, PMD_INIT_LOG(DEBUG, "port %d vendorID=0x%x deviceID=0x%x", eth_dev->data->port_id, pci_dev->id.vendor_id, pci_dev->id.device_id); + + /* Setup interrupt callback */ + rte_intr_callback_register(&pci_dev->intr_handle, + virtio_interrupt_handler, eth_dev); return 0; } @@ -975,7 +1007,7 @@ static struct eth_driver rte_virtio_pmd = { { .name = "rte_virtio_pmd", .id_table = pci_id_virtio_map, - .drv_flags = RTE_PCI_DRV_NEED_MAPPING, + .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC, }, .eth_dev_init = eth_virtio_dev_init, .dev_private_size = sizeof(struct virtio_adapter), @@ -1021,6 +1053,9 @@ static int virtio_dev_configure(struct rte_eth_dev *dev) { const struct rte_eth_rxmode *rxmode = &dev->data->dev_conf.rxmode; + struct virtio_hw *hw = + VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private); + int ret; PMD_INIT_LOG(DEBUG, "configure"); @@ -1029,7 +1064,11 @@ virtio_dev_configure(struct rte_eth_dev *dev) return (-EINVAL); } - return 0; + ret = vtpci_irq_config(hw, 0); + if (ret != 0) + PMD_DRV_LOG(ERR, "failed to set config vector"); + + return ret; } @@ -1037,7 +1076,6 @@ static int virtio_dev_start(struct rte_eth_dev *dev) { uint16_t nb_queues, i; - uint16_t status; struct virtio_hw *hw = VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private); @@ -1052,18 +1090,22 @@ virtio_dev_start(struct rte_eth_dev *dev) /* Do final configuration before rx/tx engine starts */ virtio_dev_rxtx_start(dev); - /* Check VIRTIO_NET_F_STATUS for link status*/ - if (vtpci_with_feature(hw, VIRTIO_NET_F_STATUS)) { - vtpci_read_dev_config(hw, - offsetof(struct virtio_net_config, status), - &status, sizeof(status)); - if ((status & VIRTIO_NET_S_LINK_UP) == 0) - PMD_INIT_LOG(ERR, "Port: %d Link is DOWN", -dev->data->port_id); - else - PMD_INIT_LOG(DEBUG, "Port: %d Link is UP", -dev->data->port_id); + /* check if lsc interrupt feature is enabled */ + if (dev->data->dev_conf.intr_conf.lsc) { + if 
(!vtpci_with_feature(hw, VIRTIO_NET_F_STATUS)) { + PMD_DRV_LOG(ERR, "link status not supported by host"); + return -ENOTSUP; + } + + if (rte_intr_enable(&dev->pci_dev->intr_handle) < 0) { + PMD_DRV_LOG(ERR, "interrupt enable failed"); + return -EIO; + } } + + /* Initialize Link state */ + virtio_dev_link_update(dev, 0); + vtpci_reinit_complete(hw); /*Notify the backend @@ -1145,6 +1187,7 @@ virtio_dev_stop(struct rte_eth_dev *dev) VIRTIO_DEV_PRIVATE_TO_HW(dev->data->dev_private); /* reset the
[dpdk-dev] [RFC PATCH 06/17] virtio: Use software vlan stripping
Implement VLAN stripping in software. This allows application to be device independent. Signed-off-by: Changchun Ouyang Signed-off-by: Stephen Hemminger --- lib/librte_ether/rte_ethdev.h | 3 +++ lib/librte_pmd_virtio/virtio_ethdev.c | 2 ++ lib/librte_pmd_virtio/virtio_pci.h| 1 + lib/librte_pmd_virtio/virtio_rxtx.c | 20 ++-- 4 files changed, 24 insertions(+), 2 deletions(-) diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h index f66805d..07d55b8 100644 --- a/lib/librte_ether/rte_ethdev.h +++ b/lib/librte_ether/rte_ethdev.h @@ -643,6 +643,9 @@ struct rte_eth_rxconf { #define ETH_TXQ_FLAGS_NOOFFLOADS \ (ETH_TXQ_FLAGS_NOVLANOFFL | ETH_TXQ_FLAGS_NOXSUMSCTP | \ ETH_TXQ_FLAGS_NOXSUMUDP | ETH_TXQ_FLAGS_NOXSUMTCP) +#define ETH_TXQ_FLAGS_NOXSUMS \ + (ETH_TXQ_FLAGS_NOXSUMSCTP | ETH_TXQ_FLAGS_NOXSUMUDP | \ +ETH_TXQ_FLAGS_NOXSUMTCP) /** * A structure used to configure a TX ring of an Ethernet port. */ diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c index d37f2e9..829838c 100644 --- a/lib/librte_pmd_virtio/virtio_ethdev.c +++ b/lib/librte_pmd_virtio/virtio_ethdev.c @@ -1064,6 +1064,8 @@ virtio_dev_configure(struct rte_eth_dev *dev) return (-EINVAL); } + hw->vlan_strip = rxmode->hw_vlan_strip; + ret = vtpci_irq_config(hw, 0); if (ret != 0) PMD_DRV_LOG(ERR, "failed to set config vector"); diff --git a/lib/librte_pmd_virtio/virtio_pci.h b/lib/librte_pmd_virtio/virtio_pci.h index 6998737..6d93fac 100644 --- a/lib/librte_pmd_virtio/virtio_pci.h +++ b/lib/librte_pmd_virtio/virtio_pci.h @@ -168,6 +168,7 @@ struct virtio_hw { uint32_tmax_tx_queues; uint32_tmax_rx_queues; uint16_tvtnet_hdr_size; + uint8_t vlan_strip; uint8_t use_msix; uint8_t mac_addr[ETHER_ADDR_LEN]; }; diff --git a/lib/librte_pmd_virtio/virtio_rxtx.c b/lib/librte_pmd_virtio/virtio_rxtx.c index f878c62..a5756e1 100644 --- a/lib/librte_pmd_virtio/virtio_rxtx.c +++ b/lib/librte_pmd_virtio/virtio_rxtx.c @@ -49,6 +49,7 @@ #include #include #include 
+#include #include "virtio_logs.h" #include "virtio_ethdev.h" @@ -408,8 +409,8 @@ virtio_dev_tx_queue_setup(struct rte_eth_dev *dev, PMD_INIT_FUNC_TRACE(); - if ((tx_conf->txq_flags & ETH_TXQ_FLAGS_NOOFFLOADS) - != ETH_TXQ_FLAGS_NOOFFLOADS) { + if ((tx_conf->txq_flags & ETH_TXQ_FLAGS_NOXSUMS) + != ETH_TXQ_FLAGS_NOXSUMS) { PMD_INIT_LOG(ERR, "TX checksum offload not supported\n"); return -EINVAL; } @@ -446,6 +447,7 @@ uint16_t virtio_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts) { struct virtqueue *rxvq = rx_queue; + struct virtio_hw *hw = rxvq->hw; struct rte_mbuf *rxm, *new_mbuf; uint16_t nb_used, num, nb_rx = 0; uint32_t len[VIRTIO_MBUF_BURST_SZ]; @@ -489,6 +491,9 @@ virtio_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts) rxm->pkt_len = (uint32_t)(len[i] - hdr_size); rxm->data_len = (uint16_t)(len[i] - hdr_size); + if (hw->vlan_strip) + rte_vlan_strip(rxm); + VIRTIO_DUMP_PACKET(rxm, rxm->data_len); rx_pkts[nb_rx++] = rxm; @@ -717,6 +722,17 @@ virtio_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts) */ if (likely(need <= 0)) { txm = tx_pkts[nb_tx]; + + /* Do VLAN tag insertion */ + if (txm->ol_flags & PKT_TX_VLAN_PKT) { + error = rte_vlan_insert(&txm); + if (unlikely(error)) { + rte_pktmbuf_free(txm); + ++nb_tx; + continue; + } + } + /* Enqueue Packet buffers */ error = virtqueue_enqueue_xmit(txvq, txm); if (unlikely(error)) { -- 1.8.4.2
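What stripping a VLAN tag in software involves can be sketched on a raw Ethernet frame. This is a hedged illustration only: `soft_vlan_strip()` and its constants are hypothetical names, the function works on a plain byte buffer instead of an mbuf, and it is not the `rte_vlan_strip()` implementation referenced by the patch. For simplicity it moves the payload down over the tag; an mbuf-based version could instead move the MAC addresses up and adjust the data offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_ETH_ADDRS_LEN 12     /* destination + source MAC */
#define TOY_VLAN_TAG_LEN  4      /* TPID (2 bytes) + TCI (2 bytes) */
#define TOY_TPID_8021Q    0x8100 /* 802.1Q tag protocol identifier */

/*
 * Remove an 802.1Q tag from a frame in place.
 * Returns the new frame length and stores the tag control information in
 * *tci. Untagged or too-short frames are left alone and len is returned.
 */
static size_t
soft_vlan_strip(uint8_t *frame, size_t len, uint16_t *tci)
{
	uint16_t tpid;

	assert(frame != NULL && tci != NULL);

	/* Need both MACs, the tag, and the inner EtherType. */
	if (len < TOY_ETH_ADDRS_LEN + TOY_VLAN_TAG_LEN + 2)
		return len;

	tpid = (uint16_t)((frame[12] << 8) | frame[13]);
	if (tpid != TOY_TPID_8021Q)
		return len;

	*tci = (uint16_t)((frame[14] << 8) | frame[15]);

	/* Slide the inner EtherType and payload over the 4-byte tag. */
	memmove(frame + TOY_ETH_ADDRS_LEN,
		frame + TOY_ETH_ADDRS_LEN + TOY_VLAN_TAG_LEN,
		len - TOY_ETH_ADDRS_LEN - TOY_VLAN_TAG_LEN);

	return len - TOY_VLAN_TAG_LEN;
}
```

Doing this in the receive path, as the patch does when hw->vlan_strip is set, lets an application request VLAN stripping without caring whether the device can do it.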
[dpdk-dev] [RFC PATCH 10/17] virtio: Make vtpci_get_status local
Make vtpci_get_status a local function as it is used in one file.

Signed-off-by: Changchun Ouyang
Signed-off-by: Stephen Hemminger
---
 lib/librte_pmd_virtio/virtio_pci.c | 4 +++-
 lib/librte_pmd_virtio/virtio_pci.h | 2 --
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/librte_pmd_virtio/virtio_pci.c b/lib/librte_pmd_virtio/virtio_pci.c
index b099e4f..2245bec 100644
--- a/lib/librte_pmd_virtio/virtio_pci.c
+++ b/lib/librte_pmd_virtio/virtio_pci.c
@@ -35,6 +35,8 @@
 #include "virtio_pci.h"
 #include "virtio_logs.h"
 
+static uint8_t vtpci_get_status(struct virtio_hw *);
+
 void
 vtpci_read_dev_config(struct virtio_hw *hw, uint64_t offset,
 		void *dst, int length)
@@ -113,7 +115,7 @@ vtpci_reinit_complete(struct virtio_hw *hw)
 	vtpci_set_status(hw, VIRTIO_CONFIG_STATUS_DRIVER_OK);
 }
 
-uint8_t
+static uint8_t
 vtpci_get_status(struct virtio_hw *hw)
 {
 	return VIRTIO_READ_REG_1(hw, VIRTIO_PCI_STATUS);
diff --git a/lib/librte_pmd_virtio/virtio_pci.h b/lib/librte_pmd_virtio/virtio_pci.h
index 0a4b578..64d9c34 100644
--- a/lib/librte_pmd_virtio/virtio_pci.h
+++ b/lib/librte_pmd_virtio/virtio_pci.h
@@ -255,8 +255,6 @@ void vtpci_reset(struct virtio_hw *);
 
 void vtpci_reinit_complete(struct virtio_hw *);
 
-uint8_t vtpci_get_status(struct virtio_hw *);
-
 void vtpci_set_status(struct virtio_hw *, uint8_t);
 
 uint32_t vtpci_negotiate_features(struct virtio_hw *, uint32_t);
-- 
1.8.4.2
[dpdk-dev] [RFC PATCH 09/17] virtio: Fix how states are handled during initialization
Change order of initialiazation to match Linux kernel. Don't blow away control queue by doing reset when stopped. Calling dev_stop then dev_start would not work. Dev_stop was calling virtio reset and that would clear all queues and clear all feature negotiation. Resolved by only doing reset on device removal. Signed-off-by: Changchun Ouyang Signed-off-by: Stephen Hemminger --- lib/librte_pmd_virtio/virtio_ethdev.c | 58 --- lib/librte_pmd_virtio/virtio_pci.c| 10 ++ lib/librte_pmd_virtio/virtio_pci.h| 3 +- 3 files changed, 37 insertions(+), 34 deletions(-) diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c index b7f65b9..a07f4ca 100644 --- a/lib/librte_pmd_virtio/virtio_ethdev.c +++ b/lib/librte_pmd_virtio/virtio_ethdev.c @@ -398,9 +398,14 @@ virtio_dev_cq_queue_setup(struct rte_eth_dev *dev, uint16_t vtpci_queue_idx, static void virtio_dev_close(struct rte_eth_dev *dev) { + struct virtio_hw *hw = dev->data->dev_private; + PMD_INIT_LOG(DEBUG, "virtio_dev_close"); - virtio_dev_stop(dev); + /* reset the NIC */ + vtpci_irq_config(hw, VIRTIO_MSI_NO_VECTOR); + vtpci_reset(hw); + virtio_dev_free_mbufs(dev); } static void @@ -889,6 +894,9 @@ eth_virtio_dev_init(__rte_unused struct eth_driver *eth_drv, if (rte_eal_process_type() == RTE_PROC_SECONDARY) return 0; + /* Tell the host we've noticed this device. */ + vtpci_set_status(hw, VIRTIO_CONFIG_STATUS_ACK); + pci_dev = eth_dev->pci_dev; if (virtio_resource_init(pci_dev) < 0) return -1; @@ -899,9 +907,6 @@ eth_virtio_dev_init(__rte_unused struct eth_driver *eth_drv, /* Reset the device although not necessary at startup */ vtpci_reset(hw); - /* Tell the host we've noticed this device. */ - vtpci_set_status(hw, VIRTIO_CONFIG_STATUS_ACK); - /* Tell the host we've known how to drive the device. 
*/ vtpci_set_status(hw, VIRTIO_CONFIG_STATUS_DRIVER); virtio_negotiate_features(hw); @@ -990,6 +995,9 @@ eth_virtio_dev_init(__rte_unused struct eth_driver *eth_drv, /* Setup interrupt callback */ rte_intr_callback_register(&pci_dev->intr_handle, virtio_interrupt_handler, eth_dev); + + virtio_dev_cq_start(eth_dev); + return 0; } @@ -1044,7 +1052,6 @@ virtio_dev_configure(struct rte_eth_dev *dev) { const struct rte_eth_rxmode *rxmode = &dev->data->dev_conf.rxmode; struct virtio_hw *hw = dev->data->dev_private; - int ret; PMD_INIT_LOG(DEBUG, "configure"); @@ -1055,11 +1062,12 @@ virtio_dev_configure(struct rte_eth_dev *dev) hw->vlan_strip = rxmode->hw_vlan_strip; - ret = vtpci_irq_config(hw, 0); - if (ret != 0) + if (vtpci_irq_config(hw, 0) == VIRTIO_MSI_NO_VECTOR) { PMD_DRV_LOG(ERR, "failed to set config vector"); + return -EBUSY; + } - return ret; + return 0; } @@ -1069,17 +1077,6 @@ virtio_dev_start(struct rte_eth_dev *dev) uint16_t nb_queues, i; struct virtio_hw *hw = dev->data->dev_private; - /* Tell the host we've noticed this device. */ - vtpci_set_status(hw, VIRTIO_CONFIG_STATUS_ACK); - - /* Tell the host we've known how to drive the device. */ - vtpci_set_status(hw, VIRTIO_CONFIG_STATUS_DRIVER); - - virtio_dev_cq_start(dev); - - /* Do final configuration before rx/tx engine starts */ - virtio_dev_rxtx_start(dev); - /* check if lsc interrupt feature is enabled */ if (dev->data->dev_conf.intr_conf.lsc) { if (!vtpci_with_feature(hw, VIRTIO_NET_F_STATUS)) { @@ -1096,8 +1093,16 @@ virtio_dev_start(struct rte_eth_dev *dev) /* Initialize Link state */ virtio_dev_link_update(dev, 0); + /* On restart after stop do not touch queues */ + if (hw->started) + return 0; + vtpci_reinit_complete(hw); + /* Do final configuration before rx/tx engine starts */ + virtio_dev_rxtx_start(dev); + hw->started = 1; + /*Notify the backend *Otherwise the tap backend might already stop its queue due to fullness. 
*vhost backend will have no chance to be waked up @@ -1168,17 +1173,20 @@ static void virtio_dev_free_mbufs(struct rte_eth_dev *dev) } /* - * Stop device: disable rx and tx functions to allow for reconfiguring. + * Stop device: disable interrupt and mark link down */ static void virtio_dev_stop(struct rte_eth_dev *dev) { - struct virtio_hw *hw = dev->data->dev_private; + struct rte_eth_link link; - /* reset the NIC */ - vtpci_irq_config(hw, 0); - vtpci_reset(hw); - virtio_dev_free_mbufs(dev); + PMD_INIT_LOG(DEBUG, "stop"); + + if (dev->data->dev_conf.intr_conf.lsc) + rte_intr_disable(&dev->pci_dev-
[dpdk-dev] [RFC PATCH 08/17] virtio: Remove redundant vq_alignment
Since vq_alignment is constant (always 4K), it does not need to be part of the virtqueue struct.

Signed-off-by: Changchun Ouyang
Signed-off-by: Stephen Hemminger
---
 lib/librte_pmd_virtio/virtio_ethdev.c | 1 -
 lib/librte_pmd_virtio/virtio_rxtx.c | 2 +-
 lib/librte_pmd_virtio/virtqueue.h | 3 +--
 3 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index c89614d..b7f65b9 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -294,7 +294,6 @@ int virtio_dev_queue_setup(struct rte_eth_dev *dev,
 	vq->port_id = dev->data->port_id;
 	vq->queue_id = queue_idx;
 	vq->vq_queue_index = vtpci_queue_idx;
-	vq->vq_alignment = VIRTIO_PCI_VRING_ALIGN;
 	vq->vq_nentries = vq_size;
 	vq->vq_free_cnt = vq_size;
 
diff --git a/lib/librte_pmd_virtio/virtio_rxtx.c b/lib/librte_pmd_virtio/virtio_rxtx.c
index 73ad3ac..b44f091 100644
--- a/lib/librte_pmd_virtio/virtio_rxtx.c
+++ b/lib/librte_pmd_virtio/virtio_rxtx.c
@@ -258,7 +258,7 @@ virtio_dev_vring_start(struct virtqueue *vq, int queue_type)
 	 * Reinitialise since virtio port might have been stopped and restarted
 	 */
 	memset(vq->vq_ring_virt_mem, 0, vq->vq_ring_size);
-	vring_init(vr, size, ring_mem, vq->vq_alignment);
+	vring_init(vr, size, ring_mem, VIRTIO_PCI_VRING_ALIGN);
 	vq->vq_used_cons_idx = 0;
 	vq->vq_desc_head_idx = 0;
 	vq->vq_avail_idx = 0;
diff --git a/lib/librte_pmd_virtio/virtqueue.h b/lib/librte_pmd_virtio/virtqueue.h
index f6ad98d..5b8a255 100644
--- a/lib/librte_pmd_virtio/virtqueue.h
+++ b/lib/librte_pmd_virtio/virtqueue.h
@@ -138,8 +138,7 @@ struct virtqueue {
 	uint8_t port_id; /**< Device port identifier. */
 
 	void *vq_ring_virt_mem; /**< linear address of vring*/
-	int vq_alignment;
-	int vq_ring_size;
+	unsigned int vq_ring_size;
 	phys_addr_t vq_ring_mem; /**< physical address of vring */
 
 	struct vring vq_ring; /**< vring keeping desc, used and avail */
-- 
1.8.4.2
[dpdk-dev] [RFC PATCH 12/17] virtio: Move allocation before initialization
If allocation fails, we don't want to leave the virtio device stuck in the middle of the initialization sequence.

Signed-off-by: Changchun Ouyang
Signed-off-by: Stephen Hemminger
---
 lib/librte_pmd_virtio/virtio_ethdev.c | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index c17cac8..13feda5 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -890,6 +890,15 @@ eth_virtio_dev_init(__rte_unused struct eth_driver *eth_drv,
 	if (rte_eal_process_type() == RTE_PROC_SECONDARY)
 		return 0;
 
+	/* Allocate memory for storing MAC addresses */
+	eth_dev->data->mac_addrs = rte_zmalloc("virtio", ETHER_ADDR_LEN, 0);
+	if (eth_dev->data->mac_addrs == NULL) {
+		PMD_INIT_LOG(ERR,
+			"Failed to allocate %d bytes needed to store MAC addresses",
+			ETHER_ADDR_LEN);
+		return -ENOMEM;
+	}
+
 	/* Tell the host we've noticed this device. */
 	vtpci_set_status(hw, VIRTIO_CONFIG_STATUS_ACK);
 
@@ -916,15 +925,6 @@ eth_virtio_dev_init(__rte_unused struct eth_driver *eth_drv,
 		hw->vtnet_hdr_size = sizeof(struct virtio_net_hdr);
 	}
 
-	/* Allocate memory for storing MAC addresses */
-	eth_dev->data->mac_addrs = rte_zmalloc("virtio", ETHER_ADDR_LEN, 0);
-	if (eth_dev->data->mac_addrs == NULL) {
-		PMD_INIT_LOG(ERR,
-			"Failed to allocate %d bytes needed to store MAC addresses",
-			ETHER_ADDR_LEN);
-		return -ENOMEM;
-	}
-
 	/* Copy the permanent MAC address to: virtio_hw */
 	virtio_get_hwaddr(hw);
 	ether_addr_copy((struct ether_addr *) hw->mac_addr,
-- 
1.8.4.2
[dpdk-dev] [RFC PATCH 03/17] virtio: Allow starting with link down
Starting the driver with the link down should be OK, as it is with every other driver, so just allow it.

Signed-off-by: Changchun Ouyang
Signed-off-by: Stephen Hemminger
---
 lib/librte_pmd_virtio/virtio_ethdev.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index 78018f9..4bff0fe 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -1057,14 +1057,12 @@ virtio_dev_start(struct rte_eth_dev *dev)
 		vtpci_read_dev_config(hw,
 				offsetof(struct virtio_net_config, status),
 				&status, sizeof(status));
-		if ((status & VIRTIO_NET_S_LINK_UP) == 0) {
+		if ((status & VIRTIO_NET_S_LINK_UP) == 0)
 			PMD_INIT_LOG(ERR, "Port: %d Link is DOWN",
 					dev->data->port_id);
-			return -EIO;
-		} else {
+		else
 			PMD_INIT_LOG(DEBUG, "Port: %d Link is UP",
 					dev->data->port_id);
-		}
 	}
 
 	vtpci_reinit_complete(hw);
-- 
1.8.4.2
[dpdk-dev] [RFC PATCH 11/17] virtio: Check for packet headroom at compile time
Better to check at compile time than fail at runtime.

Signed-off-by: Changchun Ouyang
Signed-off-by: Stephen Hemminger
---
 lib/librte_pmd_virtio/virtio_ethdev.c | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index a07f4ca..c17cac8 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -882,11 +882,7 @@ eth_virtio_dev_init(__rte_unused struct eth_driver *eth_drv,
 	uint32_t offset_conf = sizeof(config->mac);
 	struct rte_pci_device *pci_dev;
 
-	if (RTE_PKTMBUF_HEADROOM < sizeof(struct virtio_net_hdr)) {
-		PMD_INIT_LOG(ERR,
-			"MBUF HEADROOM should be enough to hold virtio net hdr\n");
-		return -1;
-	}
+	RTE_BUILD_BUG_ON(RTE_PKTMBUF_HEADROOM < sizeof(struct virtio_net_hdr));
 
 	eth_dev->dev_ops = &virtio_eth_dev_ops;
 	eth_dev->tx_pkt_burst = &virtio_xmit_pkts;
-- 
1.8.4.2
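How a build-time check of this kind can work is sketched below. This is an illustration under stated assumptions, not the DPDK definition: `TOY_BUILD_BUG_ON`, `TOY_PKT_HEADROOM` and `struct toy_net_hdr` are hypothetical, and the macro uses the common negative-array-size trick, which makes compilation fail when the condition is true.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical build-time check: the array size becomes -1, and the build
 * breaks, whenever cond evaluates to true at compile time. */
#define TOY_BUILD_BUG_ON(cond) ((void)sizeof(char[1 - 2 * !!(cond)]))

#define TOY_PKT_HEADROOM 128 /* illustrative headroom size in bytes */

/* Illustrative 10-byte header laid out like a basic virtio-net header. */
struct toy_net_hdr {
	uint8_t  flags;
	uint8_t  gso_type;
	uint16_t hdr_len;
	uint16_t gso_size;
	uint16_t csum_start;
	uint16_t csum_offset;
};

/* Bytes of headroom left after the header has been prepended. */
static size_t
toy_headroom_left(void)
{
	size_t margin;

	/* Compile-time: no code is generated, a violation cannot be built. */
	TOY_BUILD_BUG_ON(TOY_PKT_HEADROOM < sizeof(struct toy_net_hdr));

	margin = TOY_PKT_HEADROOM - sizeof(struct toy_net_hdr);

	/* Runtime check, shown only for contrast with the line above. */
	assert(margin < TOY_PKT_HEADROOM);

	return margin;
}
```

Changing `TOY_PKT_HEADROOM` to a value smaller than the header makes the file fail to compile, so the misconfiguration is caught before any device is initialized.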
[dpdk-dev] [RFC PATCH 13/17] virtio: Add support for vlan filtering
Virtio supports vlan filtering. Signed-off-by: Changchun Ouyang Signed-off-by: Stephen Hemminger --- lib/librte_pmd_virtio/virtio_ethdev.c | 31 +-- 1 file changed, 29 insertions(+), 2 deletions(-) diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c index 13feda5..ec5a51e 100644 --- a/lib/librte_pmd_virtio/virtio_ethdev.c +++ b/lib/librte_pmd_virtio/virtio_ethdev.c @@ -84,6 +84,8 @@ static void virtio_dev_tx_queue_release(__rte_unused void *txq); static void virtio_dev_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats); static void virtio_dev_stats_reset(struct rte_eth_dev *dev); static void virtio_dev_free_mbufs(struct rte_eth_dev *dev); +static int virtio_vlan_filter_set(struct rte_eth_dev *dev, + uint16_t vlan_id, int on); static int virtio_dev_queue_stats_mapping_set( __rte_unused struct rte_eth_dev *eth_dev, @@ -511,6 +513,7 @@ static struct eth_dev_ops virtio_eth_dev_ops = { .tx_queue_release= virtio_dev_tx_queue_release, /* collect stats per queue */ .queue_stats_mapping_set = virtio_dev_queue_stats_mapping_set, + .vlan_filter_set = virtio_vlan_filter_set, }; static inline int @@ -640,14 +643,31 @@ virtio_get_hwaddr(struct virtio_hw *hw) } } +static int +virtio_vlan_filter_set(struct rte_eth_dev *dev, uint16_t vlan_id, int on) +{ + struct virtio_hw *hw = dev->data->dev_private; + struct virtio_pmd_ctrl ctrl; + int len; + + if (!vtpci_with_feature(hw, VIRTIO_NET_F_CTRL_VLAN)) + return -ENOTSUP; + + ctrl.hdr.class = VIRTIO_NET_CTRL_VLAN; + ctrl.hdr.cmd = on ? 
VIRTIO_NET_CTRL_VLAN_ADD : VIRTIO_NET_CTRL_VLAN_DEL; + memcpy(ctrl.data, &vlan_id, sizeof(vlan_id)); + len = sizeof(vlan_id); + + return virtio_send_command(hw->cvq, &ctrl, &len, 1); +} static void virtio_negotiate_features(struct virtio_hw *hw) { uint32_t host_features, mask; - mask = VIRTIO_NET_F_CTRL_VLAN; - mask |= VIRTIO_NET_F_CSUM | VIRTIO_NET_F_GUEST_CSUM; + /* checksum offload not implemented */ + mask = VIRTIO_NET_F_CSUM | VIRTIO_NET_F_GUEST_CSUM; /* TSO and LRO are only available when their corresponding * checksum offload feature is also negotiated. @@ -1058,6 +1078,13 @@ virtio_dev_configure(struct rte_eth_dev *dev) hw->vlan_strip = rxmode->hw_vlan_strip; + if (rxmode->hw_vlan_filter + && !vtpci_with_feature(hw, VIRTIO_NET_F_CTRL_VLAN)) { + PMD_DRV_LOG(NOTICE, + "vlan filtering not available on this host"); + return -ENOTSUP; + } + if (vtpci_irq_config(hw, 0) == VIRTIO_MSI_NO_VECTOR) { PMD_DRV_LOG(ERR, "failed to set config vector"); return -EBUSY; -- 1.8.4.2
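The shape of the control message built in virtio_vlan_filter_set() can be sketched in isolation. The struct layout, `toy_vlan_cmd()` and the `TOY_` constants below are hypothetical stand-ins, not the driver's types; the numeric values are assumed to match the virtio-net control-queue definitions (VLAN class 2, add 0, delete 1), and nothing is actually sent to a device here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed to mirror VIRTIO_NET_CTRL_VLAN and its ADD/DEL commands. */
#define TOY_CTRL_CLASS_VLAN 2
#define TOY_CTRL_VLAN_ADD   0
#define TOY_CTRL_VLAN_DEL   1

/* Hypothetical, simplified control message: header, status, payload. */
struct toy_ctrl_hdr {
	uint8_t class_id;
	uint8_t cmd;
};

struct toy_ctrl {
	struct toy_ctrl_hdr hdr;
	uint8_t status;   /* written by the host when the command completes */
	uint8_t data[16]; /* command-specific payload */
};

/*
 * Fill in a VLAN filter command for vlan_id.
 * on != 0 adds the VLAN to the filter, on == 0 removes it.
 * Returns the payload length that would be handed to the control queue.
 */
static int
toy_vlan_cmd(struct toy_ctrl *ctrl, uint16_t vlan_id, int on)
{
	assert(ctrl != NULL);

	ctrl->hdr.class_id = TOY_CTRL_CLASS_VLAN;
	ctrl->hdr.cmd = on ? TOY_CTRL_VLAN_ADD : TOY_CTRL_VLAN_DEL;
	ctrl->status = 0xff; /* "not acknowledged yet" marker */
	memcpy(ctrl->data, &vlan_id, sizeof(vlan_id));

	return (int)sizeof(vlan_id);
}
```

In the patch the equivalent message is passed to virtio_send_command() on hw->cvq, and only after checking that VIRTIO_NET_F_CTRL_VLAN was negotiated.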
[dpdk-dev] [RFC PATCH 15/17] virtio: Add ability to set MAC address
Special handling is needed to set the default MAC address.

Signed-off-by: Changchun Ouyang
Signed-off-by: Stephen Hemminger
---
 lib/librte_ether/rte_ethdev.h | 5 +
 lib/librte_pmd_virtio/virtio_ethdev.c | 24
 2 files changed, 29 insertions(+)

diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 07d55b8..cbe3fdf 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1249,6 +1249,10 @@ typedef void (*eth_mac_addr_add_t)(struct rte_eth_dev *dev,
 				  uint32_t vmdq);
 /**< @internal Set a MAC address into Receive Address Address Register */
 
+typedef void (*eth_mac_addr_set_t)(struct rte_eth_dev *dev,
+				  struct ether_addr *mac_addr);
+/**< @internal Set a MAC address into Receive Address Address Register */
+
 typedef int (*eth_uc_hash_table_set_t)(struct rte_eth_dev *dev,
 				  struct ether_addr *mac_addr,
 				  uint8_t on);
@@ -1482,6 +1486,7 @@ struct eth_dev_ops {
 	priority_flow_ctrl_set_t priority_flow_ctrl_set; /**< Setup priority flow control.*/
 	eth_mac_addr_remove_t mac_addr_remove; /**< Remove MAC address */
 	eth_mac_addr_add_t mac_addr_add; /**< Add a MAC address */
+	eth_mac_addr_set_t mac_addr_set; /**< Set a MAC address */
 	eth_uc_hash_table_set_t uc_hash_table_set; /**< Set Unicast Table Array */
 	eth_uc_all_hash_table_set_t uc_all_hash_table_set; /**< Set Unicast hash bitmap */
 	eth_mirror_rule_set_t mirror_rule_set; /**< Add a traffic mirror rule.*/
diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index e469ac2..c5f21c1 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -90,6 +90,8 @@ static void virtio_mac_addr_add(struct rte_eth_dev *dev,
 				struct ether_addr *mac_addr,
 				uint32_t index, uint32_t vmdq __rte_unused);
 static void virtio_mac_addr_remove(struct rte_eth_dev *dev, uint32_t index);
+static void virtio_mac_addr_set(struct rte_eth_dev *dev,
+				struct ether_addr *mac_addr);
 
 static int virtio_dev_queue_stats_mapping_set(
 	__rte_unused struct rte_eth_dev *eth_dev,
@@ -518,6 +520,7 @@ static struct eth_dev_ops virtio_eth_dev_ops = {
 	.vlan_filter_set = virtio_vlan_filter_set,
 	.mac_addr_add = virtio_mac_addr_add,
 	.mac_addr_remove = virtio_mac_addr_remove,
+	.mac_addr_set = virtio_mac_addr_set,
 };
 
 static inline int
@@ -733,6 +736,27 @@ virtio_mac_addr_remove(struct rte_eth_dev *dev, uint32_t index)
 	virtio_mac_table_set(hw, uc, mc);
 }
 
+static void
+virtio_mac_addr_set(struct rte_eth_dev *dev, struct ether_addr *mac_addr)
+{
+	struct virtio_hw *hw = dev->data->dev_private;
+
+	memcpy(hw->mac_addr, mac_addr, ETHER_ADDR_LEN);
+
+	/* Use atomic update if available */
+	if (vtpci_with_feature(hw, VIRTIO_NET_F_CTRL_MAC_ADDR)) {
+		struct virtio_pmd_ctrl ctrl;
+		int len = ETHER_ADDR_LEN;
+
+		ctrl.hdr.class = VIRTIO_NET_CTRL_MAC;
+		ctrl.hdr.cmd = VIRTIO_NET_CTRL_MAC_ADDR_SET;
+
+		memcpy(ctrl.data, mac_addr, ETHER_ADDR_LEN);
+		virtio_send_command(hw->cvq, &ctrl, &len, 1);
+	} else if (vtpci_with_feature(hw, VIRTIO_NET_F_MAC))
+		virtio_set_hwaddr(hw);
+}
+
 static int
 virtio_vlan_filter_set(struct rte_eth_dev *dev, uint16_t vlan_id, int on)
 {
-- 
1.8.4.2
[dpdk-dev] [RFC PATCH 00/17] Single virtio implementation
This is RFC patch for single virtio implementation. Why we need single virtio? As we know currently there are at least 3 virtio PMD driver implementations: A) lib/librte_pmd_virtio(refer as virtio A); B) virtio_net_pmd by 6wind(refer as virtio B); C) virtio by Brocade/vyatta(refer as virtio C); Integrating 3 implementations into one could reduce the maintaining cost and time, in other hand, user don't need practice their application on 3 variant one by one to see which one is the best for them; What's the status? Currently virtio A has covered most features of virtio B, we could regard they have similar behavior as virtio driver. But there are some differences between virtio A and virtio C, so it need integrate features/codes from virtio C into virtio A. This patch set bases on two original RFC patch sets from Stephen Hemminger[stephen at networkplumber.org] Refer to [http://dpdk.org/ml/archives/dev/2014-August/004845.html ] for the original one. This patch set also resolves some conflict with latest codes and removed duplicated codes. What this patch set contains: === 1) virtio: Rearrange resource initialization, it extracts a function to setup PCI resources; 2) virtio: Use weaker barriers, as DPDK driver only has to deal with the case of running on PCI and with SMP, In this case, the code can use the weaker barriers instead of using hard (fence) barriers. 
This may help performance a bit; 3) virtio: Allow starting with link down, other driver has similar behavior; 4) virtio: Add support for Link State interrupt; 5) ether: Add soft vlan encap/decap functions, it helps if HW don't support vlan strip; 6) virtio: Use software vlan stripping; 7) virtio: Remove unnecessary adapter structure; 8) virtio: Remove redundant vq_alignment, as vq alignment is always 4K, so use constant when needed; 9) virtio: Fix how states are handled during initialization, this is to match Linux kernel; 10) virtio: Make vtpci_get_status a local function as it is used in one file; 11) virtio: Check for packet headroom at compile time; 12) virtio: Move allocation before initialization to avoid being stuck in middle of virtio init; 13) virtio: Add support for vlan filtering; 14) virtio: Add support for multiple mac addresses; 15) virtio: Add ability to set MAC address; 16) virtio: Free mbuf's with threshold, this makes its behavior more like ixgbe; 17) virtio: Use port IO to get PCI resource for security reasons and match virtio-net-pmd. Any feedback and comments for this RFC are welcome. Changchun Ouyang (17): virtio: Rearrange resource initialization virtio: Use weaker barriers virtio: Allow starting with link down virtio: Add support for Link State interrupt ether: Add soft vlan encap/decap functions virtio: Use software vlan stripping virtio: Remove unnecessary adapter structure virtio: Remove redundant vq_alignment virtio: Fix how states are handled during initialization virtio: Make vtpci_get_status local virtio: Check for packet headroom at compile time virtio: Move allocation before initialization virtio: Add support for vlan filtering virtio: Add suport for multiple mac addresses virtio: Add ability to set MAC address virtio: Free mbuf's with threshold virtio: Use port IO to get PCI resource. 
 lib/librte_eal/common/include/rte_pci.h |   2 +
 lib/librte_eal/linuxapp/eal/eal_pci.c   |   3 +-
 lib/librte_ether/rte_ethdev.h           |   8 +
 lib/librte_ether/rte_ether.h            |  76 +
 lib/librte_pmd_virtio/virtio_ethdev.c   | 479
 lib/librte_pmd_virtio/virtio_ethdev.h   |  12 +-
 lib/librte_pmd_virtio/virtio_pci.c      |  20 +-
 lib/librte_pmd_virtio/virtio_pci.h      |   8 +-
 lib/librte_pmd_virtio/virtio_rxtx.c     | 101 +--
 lib/librte_pmd_virtio/virtqueue.h       |  59 +++-
 10 files changed, 614 insertions(+), 154 deletions(-)

--
1.8.4.2
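For readers unfamiliar with item 5 (software vlan decap), the idea can be sketched roughly as below. This is only an illustration on a raw Ethernet frame buffer, not the function added to rte_ether.h by this series; the name sk_soft_vlan_strip and the SK_* constants are made up for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SK_ETH_ADDR_LEN 6      /* bytes per MAC address */
#define SK_VLAN_TAG_LEN 4      /* TPID (2 bytes) + TCI (2 bytes) */
#define SK_TPID_8021Q   0x8100 /* 802.1Q tag protocol identifier */

/*
 * Hypothetical helper: remove one 802.1Q tag from a frame in place.
 * On success returns 0, stores the tag control information in *tci
 * and shrinks *len by 4. Returns -1 if the frame is too short or
 * carries no 802.1Q tag.
 */
static int
sk_soft_vlan_strip(uint8_t *frame, size_t *len, uint16_t *tci)
{
	const size_t tag_off = 2 * SK_ETH_ADDR_LEN; /* tag follows dst + src MAC */
	uint16_t tpid;

	/* need dst, src, tag and the inner ethertype */
	if (*len < tag_off + SK_VLAN_TAG_LEN + 2)
		return -1;

	tpid = (uint16_t)((frame[tag_off] << 8) | frame[tag_off + 1]);
	if (tpid != SK_TPID_8021Q)
		return -1;

	*tci = (uint16_t)((frame[tag_off + 2] << 8) | frame[tag_off + 3]);

	/* slide the inner ethertype and payload over the tag */
	memmove(frame + tag_off, frame + tag_off + SK_VLAN_TAG_LEN,
		*len - tag_off - SK_VLAN_TAG_LEN);
	*len -= SK_VLAN_TAG_LEN;
	return 0;
}
```

A driver doing this on receive would then report the extracted TCI to the application instead of leaving the tag in the packet data.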
[dpdk-dev] [RFC PATCH 14/17] virtio: Add support for multiple mac addresses
Virtio supports multiple MAC addresses.

Signed-off-by: Changchun Ouyang
Signed-off-by: Stephen Hemminger
---
 lib/librte_pmd_virtio/virtio_ethdev.c | 94 ++-
 lib/librte_pmd_virtio/virtio_ethdev.h |  3 +-
 lib/librte_pmd_virtio/virtqueue.h     | 34 -
 3 files changed, 127 insertions(+), 4 deletions(-)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index ec5a51e..e469ac2 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -86,6 +86,10 @@ static void virtio_dev_stats_reset(struct rte_eth_dev *dev);
 static void virtio_dev_free_mbufs(struct rte_eth_dev *dev);
 static int virtio_vlan_filter_set(struct rte_eth_dev *dev,
 				uint16_t vlan_id, int on);
+static void virtio_mac_addr_add(struct rte_eth_dev *dev,
+				struct ether_addr *mac_addr,
+				uint32_t index, uint32_t vmdq __rte_unused);
+static void virtio_mac_addr_remove(struct rte_eth_dev *dev, uint32_t index);
 
 static int virtio_dev_queue_stats_mapping_set(
 	__rte_unused struct rte_eth_dev *eth_dev,
@@ -503,8 +507,6 @@ static struct eth_dev_ops virtio_eth_dev_ops = {
 	.stats_get = virtio_dev_stats_get,
 	.stats_reset = virtio_dev_stats_reset,
 	.link_update = virtio_dev_link_update,
-	.mac_addr_add= NULL,
-	.mac_addr_remove = NULL,
 	.rx_queue_setup = virtio_dev_rx_queue_setup,
 	/* meaningfull only to multiple queue */
 	.rx_queue_release= virtio_dev_rx_queue_release,
@@ -514,6 +516,8 @@ static struct eth_dev_ops virtio_eth_dev_ops = {
 	/* collect stats per queue */
 	.queue_stats_mapping_set = virtio_dev_queue_stats_mapping_set,
 	.vlan_filter_set = virtio_vlan_filter_set,
+	.mac_addr_add= virtio_mac_addr_add,
+	.mac_addr_remove = virtio_mac_addr_remove,
 };
 
 static inline int
@@ -644,6 +648,92 @@ virtio_get_hwaddr(struct virtio_hw *hw)
 }
 
 static int
+virtio_mac_table_set(struct virtio_hw *hw,
+		     const struct virtio_net_ctrl_mac *uc,
+		     const struct virtio_net_ctrl_mac *mc)
+{
+	struct virtio_pmd_ctrl ctrl;
+	int err, len[2];
+
+	ctrl.hdr.class = VIRTIO_NET_CTRL_MAC;
+	ctrl.hdr.cmd = VIRTIO_NET_CTRL_MAC_TABLE_SET;
+
+	len[0] = uc->entries * ETHER_ADDR_LEN + sizeof(uc->entries);
+	memcpy(ctrl.data, uc, len[0]);
+
+	len[1] = mc->entries * ETHER_ADDR_LEN + sizeof(mc->entries);
+	memcpy(ctrl.data + len[0], mc, len[1]);
+
+	err = virtio_send_command(hw->cvq, &ctrl, len, 2);
+	if (err != 0)
+		PMD_DRV_LOG(NOTICE, "mac table set failed: %d", err);
+
+	return err;
+}
+
+static void
+virtio_mac_addr_add(struct rte_eth_dev *dev, struct ether_addr *mac_addr,
+		    uint32_t index, uint32_t vmdq __rte_unused)
+{
+	struct virtio_hw *hw = dev->data->dev_private;
+	const struct ether_addr *addrs = dev->data->mac_addrs;
+	unsigned int i;
+	struct virtio_net_ctrl_mac *uc, *mc;
+
+	if (index >= VIRTIO_MAX_MAC_ADDRS) {
+		PMD_DRV_LOG(ERR, "mac address index %u out of range", index);
+		return;
+	}
+
+	uc = alloca(VIRTIO_MAX_MAC_ADDRS * ETHER_ADDR_LEN + sizeof(uc->entries));
+	uc->entries = 0;
+	mc = alloca(VIRTIO_MAX_MAC_ADDRS * ETHER_ADDR_LEN + sizeof(mc->entries));
+	mc->entries = 0;
+
+	for (i = 0; i < VIRTIO_MAX_MAC_ADDRS; i++) {
+		const struct ether_addr *addr
+			= (i == index) ? mac_addr : addrs + i;
+		struct virtio_net_ctrl_mac *tbl
+			= is_multicast_ether_addr(addr) ? mc : uc;
+
+		memcpy(&tbl->macs[tbl->entries++], addr, ETHER_ADDR_LEN);
+	}
+
+	virtio_mac_table_set(hw, uc, mc);
+}
+
+static void
+virtio_mac_addr_remove(struct rte_eth_dev *dev, uint32_t index)
+{
+	struct virtio_hw *hw = dev->data->dev_private;
+	struct ether_addr *addrs = dev->data->mac_addrs;
+	struct virtio_net_ctrl_mac *uc, *mc;
+	unsigned int i;
+
+	if (index >= VIRTIO_MAX_MAC_ADDRS) {
+		PMD_DRV_LOG(ERR, "mac address index %u out of range", index);
+		return;
+	}
+
+	uc = alloca(VIRTIO_MAX_MAC_ADDRS * ETHER_ADDR_LEN + sizeof(uc->entries));
+	uc->entries = 0;
+	mc = alloca(VIRTIO_MAX_MAC_ADDRS * ETHER_ADDR_LEN + sizeof(mc->entries));
+	mc->entries = 0;
+
+	for (i = 0; i < VIRTIO_MAX_MAC_ADDRS; i++) {
+		struct virtio_net_ctrl_mac *tbl;
+
+		if (i == index || is_zero_ether_addr(addrs + i))
+			continue;
+
+		tbl = is_multicast_ether_addr(addrs + i) ? mc : uc;
+		memcpy(&tbl->macs[tbl->entries++], addrs + i, ETHER_
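To make the table-building step in this patch easier to follow, here is a small standalone sketch of the same idea: walk the address array, skip empty slots, and sort each address into a unicast or a multicast table by the group bit of its first octet. It does not use the DPDK headers; struct sk_mac_table, SK_MAX_ADDRS and the sk_* helpers are names invented for this example, and the capacity of 64 is an assumption.

```c
#include <stdint.h>
#include <string.h>

#define SK_ADDR_LEN  6
#define SK_MAX_ADDRS 64 /* assumed table capacity for the sketch */

/* Same shape as a virtio MAC table: entry count followed by addresses. */
struct sk_mac_table {
	uint32_t entries;
	uint8_t macs[SK_MAX_ADDRS][SK_ADDR_LEN];
};

/* Multicast addresses have the least significant bit of octet 0 set. */
static int
sk_is_multicast(const uint8_t *addr)
{
	return addr[0] & 0x01;
}

static int
sk_is_zero(const uint8_t *addr)
{
	static const uint8_t zero[SK_ADDR_LEN];

	return memcmp(addr, zero, SK_ADDR_LEN) == 0;
}

/*
 * Split n addresses (n * SK_ADDR_LEN bytes at addrs) into unicast and
 * multicast tables, skipping all-zero (unused) slots.
 */
static void
sk_split_mac_tables(const uint8_t *addrs, unsigned int n,
		    struct sk_mac_table *uc, struct sk_mac_table *mc)
{
	unsigned int i;

	if (n > SK_MAX_ADDRS)
		n = SK_MAX_ADDRS;

	uc->entries = 0;
	mc->entries = 0;

	for (i = 0; i < n; i++) {
		const uint8_t *addr = addrs + i * SK_ADDR_LEN;
		struct sk_mac_table *tbl;

		if (sk_is_zero(addr))
			continue;

		tbl = sk_is_multicast(addr) ? mc : uc;
		memcpy(tbl->macs[tbl->entries++], addr, SK_ADDR_LEN);
	}
}

/* Bytes sent for one table: the count field plus the used entries. */
static size_t
sk_mac_table_len(const struct sk_mac_table *tbl)
{
	return sizeof(tbl->entries) + (size_t)tbl->entries * SK_ADDR_LEN;
}
```

The length helper mirrors the len[0]/len[1] computation in virtio_mac_table_set(): only the used entries are copied into the control message, not the whole table.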
[dpdk-dev] [RFC PATCH 17/17] virtio: Use port IO to get PCI resource.
Make virtio not require UIO, for security reasons; this matches 6WIND's virtio-net-pmd.

Signed-off-by: Changchun Ouyang
---
 lib/librte_eal/common/include/rte_pci.h |  2 +
 lib/librte_eal/linuxapp/eal/eal_pci.c   |  3 +-
 lib/librte_pmd_virtio/virtio_ethdev.c   | 75 -
 3 files changed, 77 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_pci.h b/lib/librte_eal/common/include/rte_pci.h
index 66ed793..2021b3b 100644
--- a/lib/librte_eal/common/include/rte_pci.h
+++ b/lib/librte_eal/common/include/rte_pci.h
@@ -193,6 +193,8 @@ struct rte_pci_driver {
 
 /** Device needs PCI BAR mapping (done with either IGB_UIO or VFIO) */
 #define RTE_PCI_DRV_NEED_MAPPING 0x0001
+/** Device needs port IO(done with /proc/ioports) */
+#define RTE_PCI_DRV_IO_PORT 0x0002
 /** Device driver must be registered several times until failure - deprecated */
 #pragma GCC poison RTE_PCI_DRV_MULTIPLE
 /** Device needs to be unbound even if no module is provided */
diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c b/lib/librte_eal/linuxapp/eal/eal_pci.c
index b5f5410..dd60793 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
@@ -573,7 +573,8 @@ rte_eal_pci_probe_one_driver(struct rte_pci_driver *dr, struct rte_pci_device *d
 #endif
 		/* map resources for devices that use igb_uio */
 		ret = pci_map_device(dev);
-		if (ret != 0)
+		if ((ret != 0) &&
+			((dr->drv_flags & RTE_PCI_DRV_IO_PORT) == 0))
 			return ret;
 	} else if (dr->drv_flags & RTE_PCI_DRV_FORCE_UNBIND &&
 	           rte_eal_process_type() == RTE_PROC_PRIMARY) {
diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index 1ec29e1..4490a06 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -961,6 +961,69 @@ static int virtio_resource_init(struct rte_pci_device *pci_dev)
 		     start, size);
 	return 0;
 }
+
+/* Extract I/O port numbers from proc/ioports */
+static int virtio_resource_init_by_ioport(struct rte_pci_device *pci_dev)
+{
+	uint16_t start, end;
+	int size;
+	FILE* fp;
+	char* line = NULL;
+	char pci_id[16];
+	int found = 0;
+	size_t linesz;
+
+	snprintf(pci_id, sizeof(pci_id), PCI_PRI_FMT,
+		 pci_dev->addr.domain,
+		 pci_dev->addr.bus,
+		 pci_dev->addr.devid,
+		 pci_dev->addr.function);
+
+	fp = fopen("/proc/ioports", "r");
+	if (fp == NULL) {
+		PMD_INIT_LOG(ERR, "%s(): can't open ioports", __func__);
+		return -1;
+	}
+
+	while (getdelim(&line, &linesz, '\n', fp) > 0) {
+		char* ptr = line;
+		char* left;
+		int n;
+
+		n = strcspn(ptr, ":");
+		ptr[n]= 0;
+		left = &ptr[n+1];
+
+		while (*left && isspace(*left))
+			left++;
+
+		if (!strncmp(left, pci_id, strlen(pci_id))) {
+			found = 1;
+
+			while (*ptr && isspace(*ptr))
+				ptr++;
+
+			sscanf(ptr, "%04hx-%04hx", &start, &end);
+			size = end - start + 1;
+
+			break;
+		}
+	}
+
+	free(line);
+	fclose(fp);
+
+	if (!found)
+		return -1;
+
+	pci_dev->mem_resource[0].addr = (void *)(uintptr_t)(uint32_t)start;
+	pci_dev->mem_resource[0].len = (uint64_t)size;
+	PMD_INIT_LOG(DEBUG,
+		     "PCI Port IO found start=0x%lx with size=0x%lx",
+		     start, size);
+	return 0;
+}
+
 #else
 static int virtio_has_msix(const struct rte_pci_addr *loc __rte_unused)
@@ -974,6 +1037,12 @@ static int virtio_resource_init(struct rte_pci_device *pci_dev __rte_unused)
 	/* no setup required */
 	return 0;
 }
+
+static int virtio_resource_init_by_ioport(struct rte_pci_device *pci_dev)
+{
+	/* no setup required */
+	return 0;
+}
 #endif
 
 /*
@@ -1039,7 +1108,8 @@ eth_virtio_dev_init(__rte_unused struct eth_driver *eth_drv,
 
 	pci_dev = eth_dev->pci_dev;
 	if (virtio_resource_init(pci_dev) < 0)
-		return -1;
+		if (virtio_resource_init_by_ioport(pci_dev) < 0)
+			return -1;
 
 	hw->use_msix = virtio_has_msix(&pci_dev->addr);
 	hw->io_base = (uint32_t)(uintptr_t)pci_dev->mem_resource[0].addr;
@@ -1136,7 +1206,8 @@ static struct eth_driver rte_virtio_pmd = {
 	{
 		.name = "rte_virtio_pmd",
 		.id_table = pci_id_virtio_map,
-		.drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
+		.drv_flag
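The /proc/ioports lookup in this patch boils down to matching one text line of the form "  c000-c03f : 0000:00:03.0" against the device's PCI address and reading the hex range in front of the colon. A standalone sketch of that single step is below. The helper name sk_parse_ioports_line is made up for illustration and, unlike the patch, it works on a const string and checks the sscanf() result.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: parse one /proc/ioports line.
 * Returns 1 and fills *start / *end when the owner field after the
 * first ':' begins with pci_id (e.g. "0000:00:03.0"), 0 otherwise.
 */
static int
sk_parse_ioports_line(const char *line, const char *pci_id,
		      unsigned int *start, unsigned int *end)
{
	const char *owner = strchr(line, ':');
	unsigned int s, e;

	if (owner == NULL)
		return 0;

	/* skip the ':' and the blanks before the owner name */
	owner++;
	while (*owner && isspace((unsigned char)*owner))
		owner++;

	if (strncmp(owner, pci_id, strlen(pci_id)) != 0)
		return 0;

	/* leading blank in the format skips the line's indentation */
	if (sscanf(line, " %x-%x", &s, &e) != 2 || e < s)
		return 0;

	*start = s;
	*end = e;
	return 1;
}
```

The size of the region is then end - start + 1, as computed in virtio_resource_init_by_ioport().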
[dpdk-dev] [RFC PATCH 16/17] virtio: Free mbuf's with threshold
This makes the virtio driver work like ixgbe. Transmit buffers are
held until a transmit threshold is reached. The previous behavior
was to hold mbufs until the ring entry was reused, which caused
more memory usage than needed.

Signed-off-by: Changchun Ouyang
Signed-off-by: Stephen Hemminger
---
 lib/librte_pmd_virtio/virtio_ethdev.c |  7 ++--
 lib/librte_pmd_virtio/virtio_rxtx.c   | 70 +--
 lib/librte_pmd_virtio/virtqueue.h     |  3 +-
 3 files changed, 64 insertions(+), 16 deletions(-)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
index c5f21c1..1ec29e1 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -176,15 +176,16 @@ virtio_send_command(struct virtqueue *vq, struct virtio_pmd_ctrl *ctrl,
 
 	virtqueue_notify(vq);
 
-	while (vq->vq_used_cons_idx == vq->vq_ring.used->idx)
+	rte_rmb();
+	while (vq->vq_used_cons_idx == vq->vq_ring.used->idx) {
+		rte_rmb();
 		usleep(100);
+	}
 
 	while (vq->vq_used_cons_idx != vq->vq_ring.used->idx) {
 		uint32_t idx, desc_idx, used_idx;
 		struct vring_used_elem *uep;
 
-		virtio_rmb();
-
 		used_idx = (uint32_t)(vq->vq_used_cons_idx
 				& (vq->vq_nentries - 1));
 		uep = &vq->vq_ring.used->ring[used_idx];
diff --git a/lib/librte_pmd_virtio/virtio_rxtx.c b/lib/librte_pmd_virtio/virtio_rxtx.c
index b44f091..26c0a1d 100644
--- a/lib/librte_pmd_virtio/virtio_rxtx.c
+++ b/lib/librte_pmd_virtio/virtio_rxtx.c
@@ -129,9 +129,15 @@ virtqueue_dequeue_burst_rx(struct virtqueue *vq, struct rte_mbuf **rx_pkts,
 	return i;
 }
 
+#ifndef DEFAULT_TX_FREE_THRESH
+#define DEFAULT_TX_FREE_THRESH 32
+#endif
+
+/* Cleanup from completed transmits. */
 static void
-virtqueue_dequeue_pkt_tx(struct virtqueue *vq)
+virtio_xmit_cleanup(struct virtqueue *vq, uint16_t num)
 {
+#if 0
 	struct vring_used_elem *uep;
 	uint16_t used_idx, desc_idx;
 
@@ -140,6 +146,25 @@ virtqueue_dequeue_pkt_tx(struct virtqueue *vq)
 	desc_idx = (uint16_t) uep->id;
 	vq->vq_used_cons_idx++;
 	vq_ring_free_chain(vq, desc_idx);
+#endif
+	uint16_t i, used_idx, desc_idx;
+	for (i = 0; i < num ; i++) {
+		struct vring_used_elem *uep;
+		struct vq_desc_extra *dxp;
+
+		used_idx = (uint16_t)(vq->vq_used_cons_idx & (vq->vq_nentries - 1));
+		uep = &vq->vq_ring.used->ring[used_idx];
+		dxp = &vq->vq_descx[used_idx];
+
+		desc_idx = (uint16_t) uep->id;
+		vq->vq_used_cons_idx++;
+		vq_ring_free_chain(vq, desc_idx);
+
+		if (dxp->cookie != NULL) {
+			rte_pktmbuf_free(dxp->cookie);
+			dxp->cookie = NULL;
+		}
+	}
 }
 
 
@@ -203,8 +228,10 @@ virtqueue_enqueue_xmit(struct virtqueue *txvq, struct rte_mbuf *cookie)
 
 	idx = head_idx;
 	dxp = &txvq->vq_descx[idx];
+#if 0
 	if (dxp->cookie != NULL)
 		rte_pktmbuf_free(dxp->cookie);
+#endif
 	dxp->cookie = (void *)cookie;
 	dxp->ndescs = needed;
 
@@ -404,6 +431,7 @@ virtio_dev_tx_queue_setup(struct rte_eth_dev *dev,
 {
 	uint8_t vtpci_queue_idx = 2 * queue_idx + VTNET_SQ_TQ_QUEUE_IDX;
 	struct virtqueue *vq;
+	uint16_t tx_free_thresh;
 	int ret;
 
 	PMD_INIT_FUNC_TRACE();
@@ -421,6 +449,21 @@ virtio_dev_tx_queue_setup(struct rte_eth_dev *dev,
 		return ret;
 	}
 
+	tx_free_thresh = tx_conf->tx_free_thresh;
+	if (tx_free_thresh == 0)
+		tx_free_thresh = RTE_MIN(vq->vq_nentries / 4, DEFAULT_TX_FREE_THRESH);
+
+	if (tx_free_thresh >= (vq->vq_nentries - 3)) {
+		RTE_LOG(ERR, PMD, "tx_free_thresh must be less than the "
+			"number of TX entries minus 3 (%u)."
+			" (tx_free_thresh=%u port=%u queue=%u)\n",
+			vq->vq_nentries - 3,
+			tx_free_thresh, dev->data->port_id, queue_idx);
+		return -EINVAL;
+	}
+
+	vq->vq_free_thresh = tx_free_thresh;
+
 	dev->data->tx_queues[queue_idx] = vq;
 	return 0;
 }
@@ -688,11 +731,9 @@ virtio_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 {
 	struct virtqueue *txvq = tx_queue;
 	struct rte_mbuf *txm;
-	uint16_t nb_used, nb_tx, num;
+	uint16_t nb_used, nb_tx;
 	int error;
 
-	nb_tx = 0;
-
 	if (unlikely(nb_pkts < 1))
 		return nb_pkts;
 
@@ -700,21 +741,26 @@ virtio_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 	nb_used = VIRTQUEUE_NUSED(txvq);
 
 	virtio_rmb();
+	if (likely(nb_used > txvq->vq_free_thresh))
+		virtio_xmit_cleanup(txvq, nb_used);
 
-	num = (uint16_t)(likely(nb
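The threshold selection done in virtio_dev_tx_queue_setup() above can be tried out in isolation with a sketch like the following. It only restates the rule from the patch (a request of 0 means "use min(ring size / 4, 32)", and the result must stay below ring size minus 3); sk_pick_tx_free_thresh and SK_DEFAULT_TX_FREE_THRESH are names invented for the example, and the extra guard for rings smaller than 3 entries is an addition of the sketch.

```c
#include <stdint.h>

#define SK_DEFAULT_TX_FREE_THRESH 32 /* same default value as in the patch */

/*
 * Hypothetical helper: pick the TX free threshold for a ring with
 * nentries descriptors. requested == 0 selects the default.
 * Returns the threshold, or -1 if it does not leave at least
 * 3 entries of slack in the ring.
 */
static int
sk_pick_tx_free_thresh(uint16_t nentries, uint16_t requested)
{
	uint16_t thresh = requested;

	if (thresh == 0) {
		uint16_t quarter = (uint16_t)(nentries / 4);

		thresh = quarter < SK_DEFAULT_TX_FREE_THRESH ?
			quarter : SK_DEFAULT_TX_FREE_THRESH;
	}

	/* must be strictly less than (nentries - 3) */
	if (nentries < 3 || thresh >= nentries - 3)
		return -1;

	return thresh;
}
```

With a 256-entry ring and no explicit request this yields 32, so completed transmits are reclaimed in batches once more than 32 used entries have accumulated.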
[dpdk-dev] [RFC PATCH 00/17] Single virtio implementation
Hi Thomas,

> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> Sent: Monday, December 8, 2014 5:31 PM
> To: Ouyang, Changchun
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [RFC PATCH 00/17] Single virtio implementation
>
> Hi Changchun,
>
> 2014-12-08 14:21, Ouyang Changchun:
> > This patch set bases on two original RFC patch sets from Stephen
> > Hemminger[stephen at networkplumber.org]
> > Refer to [http://dpdk.org/ml/archives/dev/2014-August/004845.html ] for
> > the original one.
> > This patch set also resolves some conflict with latest codes and removed
> > duplicated codes.
>
> As you sent the patches, you appear as the author.
> But I guess Stephen should be the author for some of them.
> Please check who has contributed the most in each patch to decide.

You are right, most of the patches originate from Stephen's patch set, except for the last one.
To be honest, I am OK with whoever is the author of this patch set :-).
We could co-own the single virtio feature if you all agree, and I think we couldn't finish
such a feature without collaboration among us. This is why I tried to communicate with most of you
to collect more feedback, suggestions and comments on this feature.
I very much appreciate all kinds of feedback and suggestions here, especially the patch set from Stephen.

Regarding your request, how could we make this patch set look more like Stephen is the author?
Currently I add Stephen to the Signed-off-by list in each patch (I got Stephen's agreement before doing this :-)).
Do I need to send the whole patch set to Stephen and let Stephen send it out to dpdk.org? Or is there any better solution?
If you have a better suggestion, I assume it works for all subsequent RFC and normal patch sets.

Any other suggestions are welcome.

Thanks
Changchun