[dpdk-dev] [PATCH 0/3] Rename field name for RX/TX queue start/stop
Hi Thomas,

I have generated patch v2 to resolve this according to your comments. Please see the attachment.

Thanks and regards,
Changchun

-----Original Message-----
From: Thomas Monjalon [mailto:thomas.monja...@6wind.com]
Sent: Tuesday, July 22, 2014 5:38 PM
To: Ouyang, Changchun
Cc: dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH 0/3] Rename field name for RX/TX queue start/stop

Hi,

2014-07-22 15:47, Ouyang Changchun:
> This patch series include 3 things:
> 1) Rename the field name from start_rx_per_q to rx_enable_queue in
>    struct rte_eth_rxconf, and do same thing for TX.
>    This patch also update description for field rx_enable_queue and
>    tx_enable_queue.
> 2) According to 1), update field name from start_rx_per_q to
>    rx_enable_queue in struct igb_rx_queue in ixgbe PMD, do same thing for TX.
> 3) Update its reference in sample vhost.

In order to be atomic (and not break git bisect), you should submit it in one patch.
The title would be "ethdev: rename queue enabler field" or something like that.
But the most important thing in such a change is to explain why you make it.

Thanks
--
Thomas
[dpdk-dev] does vswitchd runs multiple threads when i added dpdk devices
Hi,

I have taken the code from https://github.com/openvswitch/ovs and added two DPDK devices:

ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
ovs-vsctl add-port br0 dpdk1 -- set Interface dpdk1 type=dpdk

https://github.com/openvswitch/ovs/blob/master/INSTALL.DPDK says:

"Once first DPDK port is added to vswitchd, it creates a Polling thread and polls dpdk device in continuous loop. Therefore CPU utilization for that thread is always 100%."

As I understand it, each DPDK device is polled on a different thread. But in my case vswitchd is running in only a single thread (on core 0), whereas I expected it to run on 3 cores.

What I want to clarify is: does ovs-vswitchd run on a single core only, or on multiple threads, once DPDK devices are added? If it runs on multiple threads when DPDK devices are added, please let me know how I can set that up.

Thanks,
Srinivas.
[dpdk-dev] [PATCH] virtio: Fix 2 compilation issues in virtio PMD
Hi all,

> -----Original Message-----
> From: Ouyang, Changchun
> Sent: Thursday, July 24, 2014 12:58 PM
> To: dev at dpdk.org
> Cc: Cao, Waterman; Ouyang, Changchun
> Subject: [PATCH] virtio: Fix 2 compilation issues in virtio PMD
>
> Fix 2 compilation issues in virtio PMD when dump option is enabled.
>
> Signed-off-by: Changchun Ouyang
> ---
>  lib/librte_pmd_virtio/virtio_ethdev.c | 2 +-
>  lib/librte_pmd_virtio/virtqueue.h     | 4 ++--
>  2 files changed, 3 insertions(+), 3 deletions(-)
>

The offending commits that cause these 2 issues are:

commit f37cdfde46a30d93f3dd8a4e01243be8bc0ac142
Author: Stephen Hemminger
Date:   Fri Jun 13 18:06:23 2014 -0700

    virtio: remove unused virtqueue name

    vq_name is only used when setting up queue, and does not need to be saved.

commit ce65e697c67ba1a357d806eed05957b3d43f562c
Author: Stephen Hemminger
Date:   Fri Jun 13 18:06:25 2014 -0700

    virtio: simplify the hardware structure

    The host_features are never used after negotiation.
    The PCI information is unused (and available in rte_pci if needed).

Thanks
Changchun
[dpdk-dev] [PATCH v2 1/6] ethdev: rename macros of packet classification type
For better understanding, 'PCTYPE' which represents 'Packet Classification Type' is used to replace 'RSS' in the name of shift macros.

Signed-off-by: Helin Zhang
---
 lib/librte_ether/rte_ethdev.h | 76 +--
 1 file changed, 38 insertions(+), 38 deletions(-)

diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 50df654..dd36605 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -345,47 +345,47 @@ struct rte_eth_rss_conf {
 #define ETH_RSS_IPV4_UDP_SHIFT            6
 #define ETH_RSS_IPV6_UDP_SHIFT            7
 #define ETH_RSS_IPV6_UDP_EX_SHIFT         8
-/* for 40G only */
-#define ETH_RSS_NONF_IPV4_UDP_SHIFT       31
-#define ETH_RSS_NONF_IPV4_TCP_SHIFT       33
-#define ETH_RSS_NONF_IPV4_SCTP_SHIFT      34
-#define ETH_RSS_NONF_IPV4_OTHER_SHIFT     35
-#define ETH_RSS_FRAG_IPV4_SHIFT           36
-#define ETH_RSS_NONF_IPV6_UDP_SHIFT       41
-#define ETH_RSS_NONF_IPV6_TCP_SHIFT       43
-#define ETH_RSS_NONF_IPV6_SCTP_SHIFT      44
-#define ETH_RSS_NONF_IPV6_OTHER_SHIFT     45
-#define ETH_RSS_FRAG_IPV6_SHIFT           46
-#define ETH_RSS_FCOE_OX_SHIFT             48
-#define ETH_RSS_FCOE_RX_SHIFT             49
-#define ETH_RSS_FCOE_OTHER_SHIFT          50
-#define ETH_RSS_L2_PAYLOAD_SHIFT          63
+/* Packet Classification Type for 40G only */
+#define ETH_PCTYPE_NONF_IPV4_UDP          31
+#define ETH_PCTYPE_NONF_IPV4_TCP          33
+#define ETH_PCTYPE_NONF_IPV4_SCTP         34
+#define ETH_PCTYPE_NONF_IPV4_OTHER        35
+#define ETH_PCTYPE_FRAG_IPV4              36
+#define ETH_PCTYPE_NONF_IPV6_UDP          41
+#define ETH_PCTYPE_NONF_IPV6_TCP          43
+#define ETH_PCTYPE_NONF_IPV6_SCTP         44
+#define ETH_PCTYPE_NONF_IPV6_OTHER        45
+#define ETH_PCTYPE_FRAG_IPV6              46
+#define ETH_PCTYPE_FCOE_OX                48 /* not used */
+#define ETH_PCTYPE_FCOE_RX                49 /* not used */
+#define ETH_PCTYPE_FCOE_OTHER             50 /* not used */
+#define ETH_PCTYPE_L2_PAYLOAD             63
 /* for 1G & 10G */
-#define ETH_RSS_IPV4                ((uint16_t)1 << ETH_RSS_IPV4_SHIFT)
-#define ETH_RSS_IPV4_TCP            ((uint16_t)1 << ETH_RSS_IPV4_TCP_SHIFT)
-#define ETH_RSS_IPV6                ((uint16_t)1 << ETH_RSS_IPV6_SHIFT)
-#define ETH_RSS_IPV6_EX             ((uint16_t)1 << ETH_RSS_IPV6_EX_SHIFT)
-#define ETH_RSS_IPV6_TCP            ((uint16_t)1 << ETH_RSS_IPV6_TCP_SHIFT)
-#define ETH_RSS_IPV6_TCP_EX         ((uint16_t)1 << ETH_RSS_IPV6_TCP_EX_SHIFT)
-#define ETH_RSS_IPV4_UDP            ((uint16_t)1 << ETH_RSS_IPV4_UDP_SHIFT)
-#define ETH_RSS_IPV6_UDP            ((uint16_t)1 << ETH_RSS_IPV6_UDP_SHIFT)
-#define ETH_RSS_IPV6_UDP_EX         ((uint16_t)1 << ETH_RSS_IPV6_UDP_EX_SHIFT)
+#define ETH_RSS_IPV4                (1 << ETH_RSS_IPV4_SHIFT)
+#define ETH_RSS_IPV4_TCP            (1 << ETH_RSS_IPV4_TCP_SHIFT)
+#define ETH_RSS_IPV6                (1 << ETH_RSS_IPV6_SHIFT)
+#define ETH_RSS_IPV6_EX             (1 << ETH_RSS_IPV6_EX_SHIFT)
+#define ETH_RSS_IPV6_TCP            (1 << ETH_RSS_IPV6_TCP_SHIFT)
+#define ETH_RSS_IPV6_TCP_EX         (1 << ETH_RSS_IPV6_TCP_EX_SHIFT)
+#define ETH_RSS_IPV4_UDP            (1 << ETH_RSS_IPV4_UDP_SHIFT)
+#define ETH_RSS_IPV6_UDP            (1 << ETH_RSS_IPV6_UDP_SHIFT)
+#define ETH_RSS_IPV6_UDP_EX         (1 << ETH_RSS_IPV6_UDP_EX_SHIFT)
 /* for 40G only */
-#define ETH_RSS_NONF_IPV4_UDP       ((uint64_t)1 << ETH_RSS_NONF_IPV4_UDP_SHIFT)
-#define ETH_RSS_NONF_IPV4_TCP       ((uint64_t)1 << ETH_RSS_NONF_IPV4_TCP_SHIFT)
-#define ETH_RSS_NONF_IPV4_SCTP      ((uint64_t)1 << ETH_RSS_NONF_IPV4_SCTP_SHIFT)
-#define ETH_RSS_NONF_IPV4_OTHER     ((uint64_t)1 << ETH_RSS_NONF_IPV4_OTHER_SHIFT)
-#define ETH_RSS_FRAG_IPV4           ((uint64_t)1 << ETH_RSS_FRAG_IPV4_SHIFT)
-#define ETH_RSS_NONF_IPV6_UDP       ((uint64_t)1 << ETH_RSS_NONF_IPV6_UDP_SHIFT)
-#define ETH_RSS_NONF_IPV6_TCP       ((uint64_t)1 << ETH_RSS_NONF_IPV6_TCP_SHIFT)
-#define ETH_RSS_NONF_IPV6_SCTP      ((uint64_t)1 << ETH_RSS_NONF_IPV6_SCTP_SHIFT)
-#define ETH_RSS_NONF_IPV6_OTHER     ((uint64_t)1 << ETH_RSS_NONF_IPV6_OTHER_SHIFT)
-#define ETH_RSS_FRAG_IPV6           ((uint64_t)1 << ETH_RSS_FRAG_IPV6_SHIFT)
-#define ETH_RSS_FCOE_OX             ((uint64_t)1 << ETH_RSS_FCOE_OX_SHIFT) /* not used */
-#define ETH_RSS_FCOE_RX             ((uint64_t)1 << ETH_RSS_FCOE_RX_SHIFT) /* not used */
-#define ETH_RSS_FCOE_OTHER          ((uint64_t)1 << ETH_RSS_FCOE_OTHER_SHIFT) /* not used */
-#define ETH_RSS_L2_PAYLOAD          ((uint64_t)1 << ETH_RSS_L2_PAYLOAD_SHIFT)
+#define ETH_RSS_NONF_IPV4_UDP       (1ULL << ETH_PCTYPE_NONF_IPV4_UDP)
+#define ETH_RSS_NONF_IPV4_TCP       (1ULL << ETH_PCTYPE_NONF_IPV4_TCP)
+#define ETH_RS
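A note on the `1ULL` operand used for the 40G-only flags: the PCTYPE bit positions in this patch go up to 63, so the flag has to be built from a 64-bit operand, while the 1G/10G shift values shown are single digits and fit a plain int. The following is a minimal sketch under that reading, not code from the patch. The macro values are copied from the diff above; the helpers `rss_flag()` and `rss_flag_is_set()` are hypothetical and are not part of ethdev.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the patch above. */
#define ETH_RSS_IPV6_UDP_EX_SHIFT 8
#define ETH_PCTYPE_NONF_IPV4_TCP  33
#define ETH_PCTYPE_L2_PAYLOAD     63

#define ETH_RSS_IPV6_UDP_EX   (1 << ETH_RSS_IPV6_UDP_EX_SHIFT)
#define ETH_RSS_NONF_IPV4_TCP (1ULL << ETH_PCTYPE_NONF_IPV4_TCP)
#define ETH_RSS_L2_PAYLOAD    (1ULL << ETH_PCTYPE_L2_PAYLOAD)

/* Hypothetical helper: build a 64-bit flag from a PCTYPE bit position. */
static inline uint64_t
rss_flag(unsigned int pctype)
{
	return 1ULL << pctype; /* 64-bit operand, valid for positions 0..63 */
}

/* Hypothetical helper: test whether a flag set enables a given PCTYPE. */
static inline int
rss_flag_is_set(uint64_t rss_hf, unsigned int pctype)
{
	return (rss_hf & rss_flag(pctype)) != 0;
}
```

A caller would OR the flags into a 64-bit variable; storing them in anything narrower would silently drop the 40G-only bits.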
[dpdk-dev] [PATCH v2 4/6] i40e: support of 'is_command_supported'
'is_command_supported' is defined for capability discovery: it checks whether a command (feature) is supported on a specific type of NIC port. i40e currently supports the eight commands below. Of course, more commands can be supported later.
- RTE_CMD_GET_SYM_HASH_ENABLE_PER_PCTYPE
- RTE_CMD_SET_SYM_HASH_ENABLE_PER_PCTYPE
- RTE_CMD_GET_SYM_HASH_ENABLE_PER_PORT
- RTE_CMD_SET_SYM_HASH_ENABLE_PER_PORT
- RTE_CMD_GET_FILTER_SWAP
- RTE_CMD_SET_FILTER_SWAP
- RTE_CMD_GET_HASH_FUNCTION
- RTE_CMD_SET_HASH_FUNCTION

Signed-off-by: Helin Zhang
---
 lib/librte_pmd_i40e/i40e_ethdev.c | 28
 1 file changed, 28 insertions(+)

diff --git a/lib/librte_pmd_i40e/i40e_ethdev.c b/lib/librte_pmd_i40e/i40e_ethdev.c
index 4403af4..87a4999 100644
--- a/lib/librte_pmd_i40e/i40e_ethdev.c
+++ b/lib/librte_pmd_i40e/i40e_ethdev.c
@@ -204,6 +204,8 @@ static int i40e_dev_rss_hash_update(struct rte_eth_dev *dev,
 				    struct rte_eth_rss_conf *rss_conf);
 static int i40e_dev_rss_hash_conf_get(struct rte_eth_dev *dev,
 				      struct rte_eth_rss_conf *rss_conf);
+static int i40e_dev_is_command_supported(struct rte_eth_dev *dev __rte_unused,
+					 enum rte_eth_command cmd);
 static int i40e_rx_classification_filter_ctl(struct rte_eth_dev *dev,
 					     enum rte_eth_command cmd,
 					     void *args);
@@ -252,6 +254,7 @@ static struct eth_dev_ops i40e_eth_dev_ops = {
 	.reta_query                   = i40e_dev_rss_reta_query,
 	.rss_hash_update              = i40e_dev_rss_hash_update,
 	.rss_hash_conf_get            = i40e_dev_rss_hash_conf_get,
+	.is_command_supported         = i40e_dev_is_command_supported,
 	.rx_classification_filter_ctl = i40e_rx_classification_filter_ctl,
 };

@@ -4292,6 +4295,31 @@ i40e_get_hash_function(struct i40e_hw *hw, enum rte_i40e_hash_function *hf)
 }

 static int
+i40e_dev_is_command_supported(struct rte_eth_dev *dev __rte_unused,
+			      enum rte_eth_command cmd)
+{
+	uint32_t i;
+	/* Below commands defined in rte_eth_features.h are for i40e only */
+	static const enum rte_eth_command i40e_commands[] = {
+		RTE_CMD_GET_SYM_HASH_ENABLE_PER_PCTYPE,
+		RTE_CMD_SET_SYM_HASH_ENABLE_PER_PCTYPE,
+		RTE_CMD_GET_SYM_HASH_ENABLE_PER_PORT,
+		RTE_CMD_SET_SYM_HASH_ENABLE_PER_PORT,
+		RTE_CMD_GET_FILTER_SWAP,
+		RTE_CMD_SET_FILTER_SWAP,
+		RTE_CMD_GET_HASH_FUNCTION,
+		RTE_CMD_SET_HASH_FUNCTION,
+	};
+
+	for (i = 0; i < RTE_DIM(i40e_commands); i++) {
+		if (i40e_commands[i] == cmd)
+			return 1;
+	}
+
+	return 0;
+}
+
+static int
 i40e_rx_classification_filter_ctl(struct rte_eth_dev *dev,
 				  enum rte_eth_command cmd,
 				  void *args)
--
1.8.1.4
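The lookup in this patch is a linear scan over a static table of supported commands. For readers who want to try the pattern outside the driver, here is a minimal standalone sketch of the same idea. Everything in it is hypothetical: `enum demo_command`, `DEMO_DIM` and `demo_is_command_supported()` only stand in for `enum rte_eth_command`, `RTE_DIM` and the i40e function, and the command names are made up.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for enum rte_eth_command. */
enum demo_command {
	DEMO_CMD_UNKNOWN = 0,
	DEMO_CMD_GET_HASH_FUNCTION,
	DEMO_CMD_SET_HASH_FUNCTION,
	DEMO_CMD_GET_OTHER_FEATURE, /* deliberately left out of the table */
};

/* Hypothetical stand-in for RTE_DIM: number of elements in an array. */
#define DEMO_DIM(a) (sizeof(a) / sizeof((a)[0]))

/* Return 1 if the command is listed in the driver's table, 0 otherwise. */
static int
demo_is_command_supported(enum demo_command cmd)
{
	static const enum demo_command supported[] = {
		DEMO_CMD_GET_HASH_FUNCTION,
		DEMO_CMD_SET_HASH_FUNCTION,
	};
	size_t i;

	for (i = 0; i < DEMO_DIM(supported); i++) {
		if (supported[i] == cmd)
			return 1;
	}

	return 0;
}
```

With only a handful of commands per driver the scan is cheap, and adding a command later is a one-line change to the table.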
[dpdk-dev] [PATCH v2 5/6] i40e: Initialize hash function during port initialization.
As the hash functions are configured in global registers, those registers will not be reloaded unless there is a global NIC hardware reset. That means launching a DPDK application will not load the default configuration of the hash functions. Those registers therefore need to be initialized during port initialization to make sure they are all in an expected state.

Signed-off-by: Helin Zhang
---
 lib/librte_pmd_i40e/i40e_ethdev.c | 71 +++
 1 file changed, 71 insertions(+)

diff --git a/lib/librte_pmd_i40e/i40e_ethdev.c b/lib/librte_pmd_i40e/i40e_ethdev.c
index 87a4999..386d864 100644
--- a/lib/librte_pmd_i40e/i40e_ethdev.c
+++ b/lib/librte_pmd_i40e/i40e_ethdev.c
@@ -204,6 +204,7 @@ static int i40e_dev_rss_hash_update(struct rte_eth_dev *dev,
 				    struct rte_eth_rss_conf *rss_conf);
 static int i40e_dev_rss_hash_conf_get(struct rte_eth_dev *dev,
 				      struct rte_eth_rss_conf *rss_conf);
+static void i40e_init_hash_function(struct i40e_hw *hw);
 static int i40e_dev_is_command_supported(struct rte_eth_dev *dev __rte_unused,
 					 enum rte_eth_command cmd);
 static int i40e_rx_classification_filter_ctl(struct rte_eth_dev *dev,
@@ -392,6 +393,9 @@ eth_i40e_dev_init(__rte_unused struct eth_driver *eth_drv,
 		return ret;
 	}

+	/* Init hash functions */
+	i40e_init_hash_function(hw);
+
 	/* Initialize the shared code (base driver) */
 	ret = i40e_init_shared_code(hw);
 	if (ret) {
@@ -4369,3 +4373,70 @@ i40e_rx_classification_filter_ctl(struct rte_eth_dev *dev,
 	return ret;
 }
+
+/**
+ * Initialize hash functions. It includes,
+ * - set hash function to Toeplitz.
+ * - set the default filter swap configurations.
+ * - disable hash function enable per port.
+ * - disable hash function enable per pctype.
+ * Only global reset can reload the firmware configurations.
+ */
+static void
+i40e_init_hash_function(struct i40e_hw *hw)
+{
+	static struct rte_i40e_filter_swap_info swap_info[] = {
+		{ETH_PCTYPE_NONF_IPV4_UDP,
+			0x1e, 0x36, 0x04, 0x3a, 0x3c, 0x02},
+		{ETH_PCTYPE_NONF_IPV4_TCP,
+			0x1e, 0x36, 0x04, 0x3a, 0x3c, 0x02},
+		{ETH_PCTYPE_NONF_IPV4_SCTP,
+			0x1e, 0x36, 0x04, 0x00, 0x00, 0x00},
+		{ETH_PCTYPE_NONF_IPV4_OTHER,
+			0x1e, 0x36, 0x04, 0x00, 0x00, 0x00},
+		{ETH_PCTYPE_FRAG_IPV4,
+			0x1e, 0x36, 0x04, 0x00, 0x00, 0x00},
+		{ETH_PCTYPE_NONF_IPV6_UDP,
+			0x1a, 0x2a, 0x10, 0x3a, 0x3c, 0x02},
+		{ETH_PCTYPE_NONF_IPV6_TCP,
+			0x1a, 0x2a, 0x10, 0x3a, 0x3c, 0x02},
+		{ETH_PCTYPE_NONF_IPV6_SCTP,
+			0x1a, 0x2a, 0x10, 0x00, 0x00, 0x00},
+		{ETH_PCTYPE_NONF_IPV6_OTHER,
+			0x1a, 0x2a, 0x10, 0x00, 0x00, 0x00},
+		{ETH_PCTYPE_FRAG_IPV6,
+			0x1a, 0x2a, 0x10, 0x00, 0x00, 0x00},
+		{ETH_PCTYPE_L2_PAYLOAD,
+			0x00, 0x00, 0x00, 0x00, 0x00, 0x00},
+	};
+	static struct rte_i40e_sym_hash_enable_info sym_hash_ena_info[] = {
+		{ETH_PCTYPE_NONF_IPV4_UDP, 0},
+		{ETH_PCTYPE_NONF_IPV4_TCP, 0},
+		{ETH_PCTYPE_NONF_IPV4_SCTP, 0},
+		{ETH_PCTYPE_NONF_IPV4_OTHER, 0},
+		{ETH_PCTYPE_FRAG_IPV4, 0},
+		{ETH_PCTYPE_NONF_IPV6_UDP, 0},
+		{ETH_PCTYPE_NONF_IPV6_TCP, 0},
+		{ETH_PCTYPE_NONF_IPV6_SCTP, 0},
+		{ETH_PCTYPE_NONF_IPV6_OTHER, 0},
+		{ETH_PCTYPE_FRAG_IPV6, 0},
+		{ETH_PCTYPE_L2_PAYLOAD, 0},
+	};
+	static enum rte_i40e_hash_function hf = rte_i40e_hash_function_toeplitz;
+	uint32_t i;
+
+	/* set hash function to Toeplitz by default */
+	i40e_set_hash_function(hw, &hf);
+
+	/* initialize filter swap */
+	for (i = 0; i < RTE_DIM(swap_info); i++)
+		i40e_set_filter_swap(hw, &swap_info[i]);
+
+	/* disable all symmetric hash per pctype */
+	for (i = 0; i < RTE_DIM(sym_hash_ena_info); i++)
+		i40e_set_symmetric_hash_enable_per_pctype(hw,
+						&sym_hash_ena_info[i]);
+
+	/* disable symmetric hash per port */
+	i40e_set_symmetric_hash_enable_per_port(hw, 0);
+}
--
1.8.1.4
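For context on the term used throughout this series: a symmetric hash gives the same value for both directions of a flow, so that A->B and B->A traffic hash alike. The sketch below is a toy software illustration of that property only. It makes no claim about how the i40e registers or the filter swap fields achieve this in hardware, the hash is neither Toeplitz nor Simple XOR, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Toy, order-sensitive hash of a (src, dst) pair. Not a real NIC hash. */
static uint32_t
toy_hash(uint32_t src, uint32_t dst)
{
	return src * 31u + dst;
}

/*
 * Symmetric variant: put the two fields into a canonical order before
 * hashing, so swapping src and dst cannot change the result.
 */
static uint32_t
toy_symmetric_hash(uint32_t src, uint32_t dst)
{
	uint32_t lo = src < dst ? src : dst;
	uint32_t hi = src < dst ? dst : src;

	return toy_hash(lo, hi);
}
```

The plain version distinguishes the two directions of a flow, while the symmetric version does not, which is the behaviour the per-port and per-pctype enable switches in this series turn on or off.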
[dpdk-dev] [PATCH v2 2/6] ethdev: add new ops of 'is_command_supported' and 'rx_classification_filter_ctl'
Two ops of 'is_command_supported' and 'rx_classification_filter_ctl' are added. New header file of 'rte_eth_features.h' is added.
* 'is_command_supported': It is for capability discovery, that is to check if specific feature/command is supported on a port.
* 'rx_classification_filter_ctl': It is for receive classification filter configuring. e.g. selecting hash function, possibly configuring flow director. It is a common API where a lot of commands can be implemented for different sub features, to avoid defining quite a lot of ops for device specific features.
* 'rte_eth_features.h': It includes all the feature commands which can be checked and processed in above two ops. Also it may include other commands for future implementations.

Signed-off-by: Helin Zhang
---
 lib/librte_ether/Makefile           |  1 +
 lib/librte_ether/rte_eth_features.h | 73 +
 lib/librte_ether/rte_ethdev.c       | 31
 lib/librte_ether/rte_ethdev.h       | 55
 4 files changed, 160 insertions(+)
 create mode 100644 lib/librte_ether/rte_eth_features.h

diff --git a/lib/librte_ether/Makefile b/lib/librte_ether/Makefile
index b310f8b..8089723 100644
--- a/lib/librte_ether/Makefile
+++ b/lib/librte_ether/Makefile
@@ -46,6 +46,7 @@ SRCS-y += rte_ethdev.c
 #
 SYMLINK-y-include += rte_ether.h
 SYMLINK-y-include += rte_ethdev.h
+SYMLINK-y-include += rte_eth_features.h

 # this lib depends upon:
 DEPDIRS-y += lib/librte_eal lib/librte_mempool lib/librte_ring lib/librte_mbuf
diff --git a/lib/librte_ether/rte_eth_features.h b/lib/librte_ether/rte_eth_features.h
new file mode 100644
index 000..983d7c6
--- /dev/null
+++ b/lib/librte_ether/rte_eth_features.h
@@ -0,0 +1,73 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_ETH_FEATURES_H_
+#define _RTE_ETH_FEATURES_H_
+
+/**
+ * @file
+ *
+ * Ethernet device specific features
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/* Commands defined for NIC specific features */
+enum rte_eth_command {
+	RTE_CMD_UNKNOWN = 0,
+	/**< Unknown command */
+	RTE_CMD_GET_SYM_HASH_ENABLE_PER_PCTYPE,
+	/**< Get symmetric hash enable per pctype */
+	RTE_CMD_SET_SYM_HASH_ENABLE_PER_PCTYPE,
+	/**< Set symmetric hash enable per pctype */
+	RTE_CMD_GET_SYM_HASH_ENABLE_PER_PORT,
+	/**< Get symmetric hash enable per port */
+	RTE_CMD_SET_SYM_HASH_ENABLE_PER_PORT,
+	/**< Set symmetric hash enable per port */
+	RTE_CMD_GET_FILTER_SWAP,
+	/**< Get filter swap configurations */
+	RTE_CMD_SET_FILTER_SWAP,
+	/**< Set filter swap configurations */
+	RTE_CMD_GET_HASH_FUNCTION,
+	/**< Get hash function */
+	RTE_CMD_SET_HASH_FUNCTION,
+	/**< Set hash function */
+};
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_ETH_FEATURES_H_ */
diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index fd1010a..dfeb804 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -3002,3 +3002,34 @@ rte_eth_dev_get_flex_filter(uint8_t port_id, uint16_t index,
 	return (*dev->dev_ops->get_flex_filter)(dev, index, filter,
 						rx_queue);
 }
+
+int
+rte_eth_dev_is_command_suppo
[dpdk-dev] [PATCH v2 3/6] i40e: support of 'rx_classification_filter_ctl'
'rx_classification_filter_ctl' was defined as a common API for receive classification filter features. Eight commands have been implemented for selecting the hash function ('Toeplitz' or 'Simple XOR') and for configuring symmetric hash functions. In detail,

RTE_CMD_GET_SYM_HASH_ENABLE_PER_PCTYPE:
- Get symmetric hash enable configuration per 'PCTYPE'.
RTE_CMD_SET_SYM_HASH_ENABLE_PER_PCTYPE:
- Set symmetric hash enable configuration per 'PCTYPE'.
RTE_CMD_GET_SYM_HASH_ENABLE_PER_PORT:
- Get symmetric hash enable configuration per port.
RTE_CMD_SET_SYM_HASH_ENABLE_PER_PORT:
- Set symmetric hash enable configuration per port.
RTE_CMD_GET_FILTER_SWAP:
- Get filter swap configurations.
RTE_CMD_SET_FILTER_SWAP:
- Set filter swap configurations.
RTE_CMD_GET_HASH_FUNCTION:
- Get current hash function.
RTE_CMD_SET_HASH_FUNCTION:
- Set hash function of 'Toeplitz' or 'Simple XOR'.

Note that 'PCTYPE' means 'Packet Classification Type'.

Signed-off-by: Helin Zhang
---
 lib/librte_pmd_i40e/Makefile      |   6 +
 lib/librte_pmd_i40e/i40e_ethdev.c | 385 ++
 lib/librte_pmd_i40e/i40e_ethdev.h |   2 +
 lib/librte_pmd_i40e/rte_i40e.h    | 108 +++
 4 files changed, 501 insertions(+)
 create mode 100644 lib/librte_pmd_i40e/rte_i40e.h

diff --git a/lib/librte_pmd_i40e/Makefile b/lib/librte_pmd_i40e/Makefile
index 4b31675..a777a76 100644
--- a/lib/librte_pmd_i40e/Makefile
+++ b/lib/librte_pmd_i40e/Makefile
@@ -87,6 +87,12 @@ SRCS-$(CONFIG_RTE_LIBRTE_I40E_PMD) += i40e_ethdev.c
 SRCS-$(CONFIG_RTE_LIBRTE_I40E_PMD) += i40e_rxtx.c
 SRCS-$(CONFIG_RTE_LIBRTE_I40E_PMD) += i40e_ethdev_vf.c
 SRCS-$(CONFIG_RTE_LIBRTE_I40E_PMD) += i40e_pf.c
+
+#
+# Export include file
+#
+SYMLINK-$(CONFIG_RTE_LIBRTE_I40E_PMD)-include += rte_i40e.h
+
 # this lib depends upon:
 DEPDIRS-$(CONFIG_RTE_LIBRTE_I40E_PMD) += lib/librte_eal lib/librte_ether
 DEPDIRS-$(CONFIG_RTE_LIBRTE_I40E_PMD) += lib/librte_mempool lib/librte_mbuf
diff --git a/lib/librte_pmd_i40e/i40e_ethdev.c b/lib/librte_pmd_i40e/i40e_ethdev.c
index 9ed31b5..4403af4 100644
--- a/lib/librte_pmd_i40e/i40e_ethdev.c
+++ b/lib/librte_pmd_i40e/i40e_ethdev.c
@@ -48,6 +48,7 @@
 #include
 #include
 #include
+#include

 #include "i40e_logs.h"
 #include "i40e/i40e_register_x710_int.h"
@@ -203,6 +204,9 @@ static int i40e_dev_rss_hash_update(struct rte_eth_dev *dev,
 				    struct rte_eth_rss_conf *rss_conf);
 static int i40e_dev_rss_hash_conf_get(struct rte_eth_dev *dev,
 				      struct rte_eth_rss_conf *rss_conf);
+static int i40e_rx_classification_filter_ctl(struct rte_eth_dev *dev,
+					     enum rte_eth_command cmd,
+					     void *args);

 /* Default hash key buffer for RSS */
 static uint32_t rss_key_default[I40E_PFQF_HKEY_MAX_INDEX + 1];
@@ -248,6 +252,7 @@ static struct eth_dev_ops i40e_eth_dev_ops = {
 	.reta_query                   = i40e_dev_rss_reta_query,
 	.rss_hash_update              = i40e_dev_rss_hash_update,
 	.rss_hash_conf_get            = i40e_dev_rss_hash_conf_get,
+	.rx_classification_filter_ctl = i40e_rx_classification_filter_ctl,
 };

 static struct eth_driver rte_i40e_pmd = {
@@ -3956,3 +3961,383 @@ i40e_pf_config_mq_rx(struct i40e_pf *pf)

 	return 0;
 }
+
+static int
+i40e_get_filter_swap(struct i40e_hw *hw, struct rte_i40e_filter_swap_info *info)
+{
+	uint32_t reg;
+
+	if (!hw || !info) {
+		PMD_DRV_LOG(ERR, "Invalid pointer\n");
+		return -1;
+	}
+
+	switch (info->pctype) {
+	case ETH_PCTYPE_NONF_IPV4_UDP:
+	case ETH_PCTYPE_NONF_IPV4_TCP:
+	case ETH_PCTYPE_NONF_IPV4_SCTP:
+	case ETH_PCTYPE_NONF_IPV4_OTHER:
+	case ETH_PCTYPE_FRAG_IPV4:
+	case ETH_PCTYPE_NONF_IPV6_UDP:
+	case ETH_PCTYPE_NONF_IPV6_TCP:
+	case ETH_PCTYPE_NONF_IPV6_SCTP:
+	case ETH_PCTYPE_NONF_IPV6_OTHER:
+	case ETH_PCTYPE_FRAG_IPV6:
+	case ETH_PCTYPE_L2_PAYLOAD:
+		reg = I40E_READ_REG(hw, I40E_GLQF_SWAP(0, info->pctype));
+		PMD_DRV_LOG(DEBUG, "Value read from I40E_GLQF_SWAP[0,%d]: "
+					"0x%x\n", info->pctype, reg);
+
+		/**
+		 * The offset and length read from register in word unit,
+		 * which need to be converted in byte unit before being saved.
+		 */
+		info->off0_src0 =
+			(uint8_t)((reg & I40E_GLQF_SWAP_OFF0_SRC0_MASK) >>
+					I40E_GLQF_SWAP_OFF0_SRC0_SHIFT) << 1;
+		info->off0_src1 =
+			(uint8_t)((reg & I40E_GLQF_SWAP_OFF0_SRC1_MASK) >>
+					I40E_GLQF_SWAP_OFF0_SRC1_SHIFT) << 1;
+		info->len0 = (uint8_t)((reg & I40E_GLQF_SWAP_FLEN0_MASK) >>
+					I40E_GLQF_SWAP_FLEN0_SHIFT) << 1;
+		info->off1_sr
[dpdk-dev] [PATCH v2 6/6] app/testpmd: add commands for configuring hash functions
Eight commands are added to configure hash functions. They are,
- i40e_get_sym_hash_ena_per_port
  Get symmetric hash enable per port.
- i40e_set_sym_hash_ena_per_port
  Set symmetric hash enable per port.
- i40e_get_sym_hash_ena_per_pctype
  Get symmetric hash enable per PCTYPE (Packet Classification Type).
- i40e_set_sym_hash_ena_per_pctype
  Set symmetric hash enable per PCTYPE (Packet Classification Type).
- i40e_get_filter_swap
  Get filter swap configurations.
- i40e_set_filter_swap
  Set filter swap configurations.
- i40e_get_hash_function
  Get hash function.
- i40e_set_hash_function
  Set hash function to 'Toeplitz' or 'Simple XOR'

Signed-off-by: Helin Zhang
---
 app/test-pmd/cmdline.c | 579 +
 1 file changed, 579 insertions(+)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 345be11..9890400 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -74,6 +74,10 @@
 #include
 #include
 #include
+#include
+#ifdef RTE_LIBRTE_I40E_PMD
+#include
+#endif
 #include
 #include

@@ -655,6 +659,43 @@ static void cmd_help_long_parsed(void *parsed_result,
 			"get_flex_filter (port_id) index (idx)\n"
 			"get info of a flex filter.\n\n"
+
+#ifdef RTE_LIBRTE_I40E_PMD
+			"i40e_get_sym_hash_ena_per_port (port_id)\n"
+			"get symmetric hash enable configuration per port,"
+			" on i40e only\n\n"
+
+			"i40e_set_sym_hash_ena_per_port (port_id)"
+			" (enable|disable)\n"
+			"set symmetric hash enable configuration per port"
+			" to enable or disable, on i40e only\n\n"
+
+			"i40e_get_sym_hash_ena_per_pctype (port_id) (pctype)\n"
+			"get symmetric hash enable configuration per port,"
+			" on i40e only\n\n"
+
+			"i40e_set_sym_hash_ena_per_pctype (port_id) (pctype)"
+			" (enable|disable)\n"
+			"set symmetric hash enable configuration per"
+			" pctype to enable or disable, on i40e only\n\n"
+
+			"i40e_get_filter_swap (port_id) (pctype)\n"
+			"get filter swap configurations on i40e,"
+			" on i40e only\n\n"
+
+			"i40e_set_filter_swap (port_id) (pctype) (off0_src0)"
+			" (off0_src1) (len0) (off1_src0) (off1_src1) (len1)\n"
+			"set filter swap configurations, on i40e only\n\n"
+
+			"i40e_get_hash_function (port_id)\n"
+			"get hash function of Toeplitz or Simple XOR,"
+			" on i40e only\n\n"
+
+			"i40e_set_hash_function (port_id)"
+			" (toeplitz|simple_xor)\n"
+			"set the hash function to Toeplitz or Simple XOR,"
+			" on i40e only\n\n"
+#endif /* RTE_LIBRTE_I40E_PMD */
 		);
 	}
 }
@@ -7304,6 +7345,534 @@ cmdline_parse_inst_t cmd_get_flex_filter = {
 	},
 };

+/* *** Classification Filters Control *** */
+#ifdef RTE_LIBRTE_I40E_PMD
+/* *** Get symmetric hash enable per port *** */
+struct cmd_i40e_get_sym_hash_ena_per_port_result {
+	cmdline_fixed_string_t i40e_get_sym_hash_ena_per_port;
+	uint8_t port_id;
+};
+
+static void
+cmd_i40e_get_sym_hash_per_port_parsed(void *parsed_result,
+				__rte_unused struct cmdline *cl,
+				__rte_unused void *data)
+{
+	struct cmd_i40e_get_sym_hash_ena_per_port_result *res = parsed_result;
+	uint8_t enable = 0;
+	int ret;
+
+	if (rte_eth_dev_is_command_supported(res->port_id,
+		RTE_CMD_GET_SYM_HASH_ENABLE_PER_PORT) <= 0) {
+		printf("Command of RTE_CMD_GET_SYM_HASH_ENABLE_PER_PORT "
+			"not supported on port: %d\n", res->port_id);
+		return;
+	}
+
+	ret = rte_eth_dev_rx_classification_filter_ctl(res->port_id,
+		RTE_CMD_GET_SYM_HASH_ENABLE_PER_PORT, &enable);
+	if (ret < 0) {
+		printf("Cannot get symmetric hash enable per port "
+			"on i40e port %u\n", res->port_id);
+		return;
+	}
+
+	printf("Symmetric hash is %s on i40e port %u\n",
+		enable ? "enabled" : "disabled", res->port_id);
+}
+
+cmdline_parse_token_string_t cmd_i40e_get_sym_hash_ena_per_port_all =
+	TOKEN_STRING_INITIALIZER(
+		struct cmd_i40e_get_sym_hash_ena_per_port_result,
+		i40e_get_sym_hash_ena_per_port,
+		"i40e_get_sym_hash_ena_per_port");
+cmdline_parse_token_num_t cmd_i40e_get_sym_hash_ena_per_port_port_id =
+	TOKEN_NUM_INITI
[dpdk-dev] [PATCH v2 0/6] Support configuring hash functions
These patches mainly support configuring hash functions. In detail,
- It can select Toeplitz or simple XOR hash functions.
- It can configure symmetric hash functions.
  * Get/set symmetric hash enable per port.
  * Get/set symmetric hash enable per 'PCTYPE'.
  * Get/set filter swap configurations.
- 'ethdev' level interfaces are added.
  * 'is_command_supported', to check if a feature (command) is supported on a port.
  * 'rx_classification_filter_ctl', a common API to execute specific command of each feature.
- Eight commands are implemented in testpmd to support testing above.

Note that 'PCTYPE' means 'Packet Classification Type'.

Helin Zhang (6):
  ethdev: rename macros of packet classification type
  ethdev: add new ops of 'is_command_supported' and 'rx_classification_filter_ctl'
  i40e: support of 'rx_classification_filter_ctl'
  i40e: support of 'is_command_supported'
  i40e: Initialize hash function during port initialization.
  app/testpmd: add commands for configuring hash functions

 app/test-pmd/cmdline.c              | 579
 lib/librte_ether/Makefile           |   1 +
 lib/librte_ether/rte_eth_features.h |  73 +
 lib/librte_ether/rte_ethdev.c       |  31 ++
 lib/librte_ether/rte_ethdev.h       | 131 +---
 lib/librte_pmd_i40e/Makefile        |   6 +
 lib/librte_pmd_i40e/i40e_ethdev.c   | 484 ++
 lib/librte_pmd_i40e/i40e_ethdev.h   |   2 +
 lib/librte_pmd_i40e/rte_i40e.h      | 108 +++
 9 files changed, 1377 insertions(+), 38 deletions(-)
 create mode 100644 lib/librte_ether/rte_eth_features.h
 create mode 100644 lib/librte_pmd_i40e/rte_i40e.h

--
1.8.1.4
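The two ethdev-level interfaces described in this cover letter follow a common pattern: one op answers "is this command supported?", and one generic control op takes a command plus an opaque argument. The sketch below shows that pattern in isolation, under stated assumptions. All names (`sketch_cmd`, `sketch_dev`, `sketch_filter_ctl()` and so on) are hypothetical and are not the ethdev API, and folding the capability check into the generic wrapper is a choice made for this sketch; in the series itself the caller (testpmd) performs the check before calling the control op.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical command set, standing in for enum rte_eth_command. */
enum sketch_cmd {
	SKETCH_CMD_UNKNOWN = 0,
	SKETCH_CMD_GET_HASH_FUNCTION,
	SKETCH_CMD_SET_HASH_FUNCTION,
};

struct sketch_dev;

/* Per-driver ops table: one capability op, one generic control op. */
struct sketch_dev_ops {
	int (*is_command_supported)(struct sketch_dev *dev, enum sketch_cmd cmd);
	int (*filter_ctl)(struct sketch_dev *dev, enum sketch_cmd cmd, void *args);
};

struct sketch_dev {
	const struct sketch_dev_ops *ops;
	uint8_t hash_function; /* stand-in for a hardware register */
};

/* Driver side: report which commands this device implements. */
static int
drv_is_command_supported(struct sketch_dev *dev, enum sketch_cmd cmd)
{
	(void)dev;
	return cmd == SKETCH_CMD_GET_HASH_FUNCTION ||
		cmd == SKETCH_CMD_SET_HASH_FUNCTION;
}

/* Driver side: one entry point, dispatching on the command. */
static int
drv_filter_ctl(struct sketch_dev *dev, enum sketch_cmd cmd, void *args)
{
	uint8_t *hf = args;

	if (hf == NULL)
		return -EINVAL;

	switch (cmd) {
	case SKETCH_CMD_GET_HASH_FUNCTION:
		*hf = dev->hash_function;
		return 0;
	case SKETCH_CMD_SET_HASH_FUNCTION:
		dev->hash_function = *hf;
		return 0;
	default:
		return -ENOTSUP;
	}
}

static const struct sketch_dev_ops sketch_ops = {
	.is_command_supported = drv_is_command_supported,
	.filter_ctl = drv_filter_ctl,
};

/* Generic layer: refuse unsupported commands, then dispatch to the driver. */
static int
sketch_filter_ctl(struct sketch_dev *dev, enum sketch_cmd cmd, void *args)
{
	if (dev->ops->is_command_supported == NULL ||
			dev->ops->filter_ctl == NULL)
		return -ENOTSUP;
	if (dev->ops->is_command_supported(dev, cmd) <= 0)
		return -ENOTSUP;

	return dev->ops->filter_ctl(dev, cmd, args);
}
```

The point of the single control op is that a new device-specific feature only adds an enum value and a case in the driver, rather than a new function pointer in the shared ops table.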
[dpdk-dev] free a memzone
Hi Mahdi,

> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Mahdi Dashtbozorgi
> Sent: Thursday, July 24, 2014 6:20 AM
> To: dev at dpdk.org
> Subject: Re: [dpdk-dev] free a memzone
>
> Hi Bruce,
>
> Thank you for the response. That's a great Idea!
> But I do not understand the last four parameters of this function. (vaddr,
> paddr, pg_num, pg_shift)
> I guess vaddr is the virtual address of the previously allocated mempool,

yes

> paddr is calculated using function call rte_mem_virt2phy(vaddr), am I
> right?

yes

> what about pg_num and pg_shift? how can I pass them correctly?

From rte_mempool.h:
"* @param pg_num
 *   Number of elements in the paddr array.
 * @param pg_shift
 *   LOG2 of the physical pages size."

If you are using a memzone as externally allocated memory, it will already be physically contiguous.
So in your case pg_num = MEMPOOL_PG_NUM_DEFAULT, pg_shift = MEMPOOL_PG_SHIFT_MAX.

Though, I don't think rte_mempool_xmem_create() will help you in any way.
Again from rte_mempool.h:
"* Creates a new mempool named *name* in memory.
 *
 * This function uses ``memzone_reserve()`` to allocate memory. The
 * pool contains n elements of elt_size. Its size is set to n.
 * Depending on the input parameters, mempool elements can be either allocated
 * together with the mempool header, or an externally provided memory buffer
 * could be used to store mempool objects. In later case, that external
 * memory buffer can consist of set of disjoint phyiscal pages."

So xmem_create would still create a new ring, reserve a new memzone for the mempool's metadata, etc.
The only difference is that it can use externally allocated memory to store the mempool elements.

As I understand it, what you need is a sort of mempool_reset(): a function that would re-init the mempool to its just-created state (all elements are free, lcore caches are empty, etc).
Right now we don't have such a function, but I suppose something like this should do (note that I didn't run or even build it):

    if ((mp = rte_mempool_lookup(name)) != NULL) {
        char ring_name[RTE_RING_NAMESIZE];

        /* save mp ring name. */
        memcpy(ring_name, mp->ring->name, sizeof(ring_name));

        /* reset the ring. */
        rte_ring_init(mp->ring, ring_name, rte_align32pow2(mp->size + 1), mp->ring->flags);

        /* repopulate mempool and reinit all its elements. */
        mempool_populate(mp, mp->size, 1, rte_pktmbuf_init, NULL);

        /* reset all lcore caches. */
        memset(mp->local_cache, 0, sizeof(mp->local_cache));

        /* reset statistics if needed. */
    } else {
        /* create new mempool. */
    }

Ideally such a function should be in librte_mempool of course, but if you are in a hurry you can probably give it a try.
Note that I assume that no other process, except the failed/restarting secondary, is using this mempool.
If the primary or some other secondary does, then first you need to stop them using this mempool and wait till they finish all their packet processing activity.

Konstantin

> Best Regards,
> Mahdi.
>
>
> On Thu, Jul 24, 2014 at 9:48 AM, Mahdi Dashtbozorgi wrote:
>
> > Hi Bruce,
> >
> > Thank you for the response. That's a great Idea!
> > But I do not understand the last four parameters of this function. (vaddr,
> > paddr, pg_num, pg_shift)
> > I guess vaddr is the virtual address of the previously allocated mempool,
> > paddr is calculated using function call rte_mem_virt2phy(vaddr), am I
> > right? what about pg_num and pg_shift? how can I pass them correctly?
> >
> > Best Regards,
> > Mahdi.
> >
> >
> > On Wed, Jul 23, 2014 at 11:09 PM, Richardson, Bruce <
> > bruce.richardson at intel.com> wrote:
> >
> >> Rather than freeing the previously allocated memzone, could you not just
> >> re-initialize the mempool using something like rte_mempool_xmem_create?
> >>
> >> > -----Original Message-----
> >> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Mahdi
> >> > Dashtbozorgi
> >> > Sent: Wednesday, July 23, 2014 2:05 AM
> >> > To: dev at dpdk.org
> >> > Subject: Re: [dpdk-dev] free a memzone
> >> >
> >> > Hi guys,
> >> >
> >> > Is there any suggestion to free the previously allocated memzone?
> >> > I really need help in this issue.
> >> > Any help is appreciated.
> >> >
> >> > Best Regards,
> >> > Mahdi.
> >> >
> >> >
> >> >
> >> > On Tue, Jul 22, 2014 at 4:03 PM, Mahdi Dashtbozorgi
> >> > wrote:
> >> >
> >> > > Hi,
> >> > >
> >> > > I have two processes, which uses DPDK multi-process feature to communicate.
> >> > > Master process captures packets from NIC and put them to a ring buffer,
> >> > > which is shared between master and slave process.
> >> > > The slave process looks up the shared ring buffer using rte_ring_lookup
> >> > > function and reads the packets.
> >> > > The slave process needs a memory pool, too. Therefore, it creates a
> >> > > mempool using rte_mempool_create. But If the slave process crashes during
> >> > > its processing and runs again, rte_mempool_create function fails and tells
> >> > > that there is a memor
[dpdk-dev] free a memzone
Hi Konstantin,

Thank you very much. Your solution fixed my problem.
Is there a solution like this for resetting the memory zone, which is used by rte_malloc function?
Because if I use rte_malloc instead of malloc, in the case of application crash, the memory zone, which was used by rte_malloc in the previous run would be unusable for the next run of slave process.

Best Regards,
Mahdi.
[dpdk-dev] [PATCH] ixgbe: convert sse intrinsics to use __builtin variants
On Sat, Jul 26, 2014 at 11:15:01AM +, Ananyev, Konstantin wrote:
>
> Hi Neil,
>
> > The ixgbe pmd currently can't be built without enabling sse instructions at
> > compile time.
>
> Actually it can, all you have to do is set RTE_IXGBE_INC_VECTOR=n in your
> config.
>
> > While sse extensions provide better performance, there's no reason
> > that we can't still create builds to run on systems that don't support sse.
> > If we modify the ixgbe code to use the __builtin_shuffle and
> > __builtin_popcountll functions, I've confirmed that the gcc compiler emits
> > the appropriate sse instructions when the provided -march parameter
> > indicates a machine that includes sse support, and emits generic code when
> > sse isn't available.
>
> I don't think it is ok to blindly replace _mm_shuffle_epi8 with
> __builtin_shuffle.
> They are not identical.
> I tried your patch on IVB box (gcc 4.8.3, CONFIG_RTE_MACHINE="native").
> The result is - ixgbe_recv_pkts_vec() functionality is broken.
> See below for more details.
> So my vote is NACK.
> Konstantin
>
> 1. Code changes:
> uint16_t
> ixgbe_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
>                     uint16_t nb_pkts)
> {
>     ...
>     __m128i shuf_msk;
>     ...
>     /* mask to shuffle from desc. to mbuf */
>     shuf_msk = _mm_set_epi8(
>         7, 6, 5, 4,  /* octet 4~7, 32bits rss */
>         0xFF, 0xFF,  /* skip high 16 bits vlan_macip, zero out */
>         15, 14,      /* octet 14~15, low 16 bits vlan_macip */
>         0xFF, 0xFF,  /* skip high 16 bits pkt_len, zero out */
>         13, 12,      /* octet 12~13, low 16 bits pkt_len */
>         0xFF, 0xFF,  /* skip nb_segs and in_port, zero out */
>         13, 12       /* octet 12~13, 16 bits data_len */
>         );
>     ...
>     for (...) {
>         __m128i descs[RTE_IXGBE_DESCS_PER_LOOP];
>         __m128i pkt_mb1, pkt_mb2, pkt_mb3, pkt_mb4;
>         ...
>         descs[3] = _mm_loadu_si128((__m128i *)(rxdp + 3));
>         ...
> -       pkt_mb4 = _mm_shuffle_epi8(descs[3], shuf_msk);
> +       pkt_mb4 = __builtin_shuffle(descs[3], shuf_msk);
>         ...
>
> 2. Code generated before the patch (valid one):
> ...
> vmovdqa 0x4978d(%rip),%xmm0   /* load shuf_msk */
> ...
> vmovdqu 0x30(%rdx),%xmm4      /* load descs[3] */
>
> vpshufb %xmm0,%xmm4,%xmm8
>
> 3. Code generated after the patch applied (broken one):
> ...
> vmovdqu 0x30(%rdx),%xmm3
> ...
> vpunpcklqdq %xmm3,%xmm3,%xmm3 /* !!! ERROR - should be vpshufb */
>
> 4. What happens here?
> My understanding:
>
> GCC treats __m128i as vector of two 64bit integers:
> /lib/gcc/x86_64-redhat-linux/4.8.3/include/emmintrin.h:typedef long long
> __m128i __attribute__ ((__vector_size__ (16), __may_alias__));
>
> From https://gcc.gnu.org/onlinedocs/gcc/Vector-Extensions.html:
> "... Vector shuffling is available using functions __builtin_shuffle (vec,
> mask) and __builtin_shuffle (vec0, vec1, mask). Both functions construct a
> permutation of elements from one or two vectors and return a vector of the
> same type as the input vector(s). The mask is an integral vector with the
> same width (W) and element count (N) as the output vector.
>
> The elements of the input vectors are numbered in memory ordering of vec0
> beginning at 0 and vec1 beginning at N. The elements of mask are considered
> modulo N in the single-operand case and modulo 2*N in the two-operand case."
>
> For __m128i N = 2, so:
>
> __m128i x, msk;
> x = __builtin_shuffle(x, msk);
>
> means:
>
> index0 = msk[0..63] % 2;
> index1 = msk[64..127] % 2;
> x[0..63] = x[index0 * 64..index0*64+63];
> x[64..127] = x[index1 * 64..index1*64+63];
>
> In ixgbe_recv_pkts_vec() shuf_msk[0..63] % 2 == 0 and
> shuf_msk[64..127] % 2 == 0.
> So compiler makes optimisation:
> pkt_mb4[0..63] = descs[3] [0..63];
> pkt_mb4[64..127] = descs[3] [0..63];
> i.e:
> vpunpcklqdq %xmm3,%xmm3,%xmm3
>
> BTW, changing to
> __builtin_shuffle((__v16qi) descs[3], (__v16qi)shuf_msk);
> wouldn't help either.
> In that case __builtin_shuffle will consider elements of mask modulo 16.
> While _mm_shuffle_epi8 (PSHUFB) is expected to zero destination byte if upper
> bit in the corresponding mask byte is 1.
>

Ok, sorry for the delayed response. I spent the weekend reading up on the
differences between these instructions, and you're right: I thought those
operations were equivalent, but the zeroing operation differentiates them.
Sorry about that.

That said, it would be really helpful for distribution packaging to be able to
enable vectorized reception at run time. A compile time build variable really
just isn't very helpful. I'll see if I can rework the patch to allow optional
vectorized packet reception at run time.

Neil
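The difference discussed in this thread can be reproduced without any SSE hardware. The following is a minimal scalar sketch, not code from the ixgbe PMD: it emulates the PSHUFB rule (top mask bit zeroes the byte) next to the "index modulo N" rule that the GCC documentation quoted above gives for __builtin_shuffle. The function names and the shuf_msk_bytes array are made up for illustration; the array is the shuf_msk from the quoted code written out in memory order (byte 0 first), assuming the usual _mm_set_epi8 argument order of highest byte first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* PSHUFB (_mm_shuffle_epi8) rule: a mask byte with its top bit set
 * zeroes the output byte; otherwise the low 4 bits pick a source byte. */
static void shuffle_pshufb(const uint8_t src[16], const uint8_t mask[16],
                           uint8_t dst[16])
{
    for (size_t i = 0; i < 16; i++)
        dst[i] = (mask[i] & 0x80) ? 0 : src[mask[i] & 0x0F];
}

/* "Mask element modulo N" rule for a 16-element byte vector:
 * every output byte is copied from the source, nothing is zeroed. */
static void shuffle_modulo16(const uint8_t src[16], const uint8_t mask[16],
                             uint8_t dst[16])
{
    for (size_t i = 0; i < 16; i++)
        dst[i] = src[mask[i] % 16];
}

/* The same rule for a vector of two 64-bit elements (N = 2),
 * which is how GCC sees a plain __m128i. */
static void shuffle_modulo2(const uint64_t src[2], const uint64_t mask[2],
                            uint64_t dst[2])
{
    for (size_t i = 0; i < 2; i++)
        dst[i] = src[mask[i] % 2];
}

/* Read 8 bytes as a little-endian 64-bit value (x86 memory order). */
static uint64_t load_le64(const uint8_t *p)
{
    uint64_t v = 0;

    assert(p != NULL);
    for (int i = 7; i >= 0; i--)
        v = (v << 8) | p[i];
    return v;
}

/* shuf_msk from the quoted code, byte 0 first (illustrative copy). */
static const uint8_t shuf_msk_bytes[16] = {
    12, 13, 0xFF, 0xFF,   /* data_len, then zeroed nb_segs/in_port */
    12, 13, 0xFF, 0xFF,   /* pkt_len low 16 bits, high 16 zeroed   */
    14, 15, 0xFF, 0xFF,   /* vlan_macip low 16 bits, high zeroed   */
    4, 5, 6, 7            /* 32-bit rss                            */
};
```

With this mask, shuffle_pshufb() zeroes bytes 2, 3, 6, 7, 10 and 11, while shuffle_modulo16() fills them with source byte 15 (0xFF % 16). Both 64-bit halves of the mask are even, so shuffle_modulo2() copies element 0 into both outputs, which matches the vpunpcklqdq observation in the quoted analysis.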
[dpdk-dev] free a memzone
Hi Mahdi,

> Hi Konstantin,
> Thank you very much. Your solution fixed my problem.
> Is there a solution like this for resetting the memory zone, which is used by
> rte_malloc function?
> Because if I use rte_malloc instead of malloc, in the case of application
> crash, the memory zone, which was used by rte_malloc in the previous run would
> be unusable for the next run of slave process.

Without significant modification inside librte_eal and/or librte_malloc - nothing comes on top of my head.
If that is such a big problem to you, might be it is possible to change your whole process model a bit:

In the parent process:
- allocate memory/init run-time structures
L1:
- fork();
- wait till child terminates
- if it terminated abnormally, then free memory/re-init run-time structures. Goto L1

Do actual packet processing inside child process.
Would that help somehow?

Konstantin
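The parent/child model suggested in this reply (fork, wait, re-init on abnormal termination, fork again) could look roughly like the sketch below. It uses only POSIX fork() and waitpid(); nothing in it is DPDK API. The names supervise, worker_fn, reinit_fn and the demo_* helpers are hypothetical, the restart limit is an added assumption, and any non-zero exit status is treated as abnormal, which may be stricter than intended in the original suggestion.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical callback types, not part of DPDK. */
typedef int (*worker_fn)(int attempt);  /* runs in the child            */
typedef void (*reinit_fn)(void);        /* runs in the parent on a crash */

/*
 * Fork a child that runs worker(). If the child does not exit cleanly,
 * call reinit() in the parent and fork again (the "Goto L1" step).
 * Returns the number of restarts that were needed, or -1 on a fork/wait
 * error or when max_restarts is exhausted.
 */
static int supervise(worker_fn worker, reinit_fn reinit, int max_restarts)
{
    assert(worker != NULL && reinit != NULL);

    for (int attempt = 0; attempt <= max_restarts; attempt++) {
        pid_t pid = fork();

        if (pid < 0)
            return -1;
        if (pid == 0)
            _exit(worker(attempt));  /* child: do the packet processing */

        int status;
        if (waitpid(pid, &status, 0) < 0)
            return -1;
        if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
            return attempt;          /* clean exit, nothing to repair */

        reinit();                    /* free memory / re-init structures */
    }
    return -1;
}

/* Demo worker: crashes on the first attempt, succeeds on the second. */
static int demo_worker(int attempt)
{
    if (attempt == 0)
        abort();
    return 0;
}

/* Demo re-init hook: only counts how often it was called. */
static int demo_reinit_calls;

static void demo_reinit(void)
{
    demo_reinit_calls++;
}
```

Calling supervise(demo_worker, demo_reinit, 3) forks twice: the first child aborts, the parent runs demo_reinit() once, and the second child exits cleanly. In a real application the reinit hook is where a mempool reset such as the one sketched earlier in this thread would go; whether hugepage mappings and other EAL state behave well across fork() is not addressed in the thread and would need checking.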
[dpdk-dev] [ovs-discuss] does vswitchd runs multiple threads when i added dpdk devices
Hey Srinivas,

Right now, ovs has only one dpdk polling thread. We are working on creating multiple polling threads and pinning polling threads to the same cpu socket as the dpdk interface.

Thanks,
Alex Wang

On Mon, Jul 28, 2014 at 9:15 AM, Ben Pfaff wrote:

> On Mon, Jul 28, 2014 at 07:33:35AM +, Srinivas Reddi wrote:
> > As per my understanding each dpdk device is polled on different thread.
> > But in my case vswitchd is running in only single thread [on core 0], I
> > expected to run on 3 cores ..
> >
> > One thing I want to clarify that .. does ovs-vswitchd runs on single
> > core only .. or multiple threads .. when I added dpdk devices.
> > If vswitchd runs on multiple threads, when I added dpdk devices .. pls
> > let me know how can I run.
>
> It looks like right now OVS has only one dpdk polling thread.