[dpdk-dev] [PATCH v2 0/6] Support configuring hash functions
> -----Original Message-----
> From: Zhan, Zhaochen
> Sent: Thursday, July 31, 2014 10:49 AM
> To: Zhang, Helin; dev at dpdk.org
> Subject: RE: [dpdk-dev] [PATCH v2 0/6] Support configuring hash functions
>
> > These patches mainly support configuring hash functions.
> > In detail,
> >  - It can select Toeplitz or simple XOR hash functions.
> >  - It can configure symmetric hash functions.
> >    * Get/set symmetric hash enable per port.
> >    * Get/set symmetric hash enable per 'PCTYPE'.
> >    * Get/set filter swap configurations.
> >  - 'ethdev' level interfaces are added.
> >    * 'is_command_supported', to check if a feature (command)
> >      is supported on a port.
> >    * 'rx_classification_filter_ctl', a common API to execute
> >      specific command of each feature.
> >  - Seven commands are implemented in testpmd to support
> >    testing above.
> > Note that 'PCTYPE' means 'Packet Classification Type'.
> >
> > Helin Zhang (6):
> >   ethdev: rename macros of packet classification type
> >   ethdev: add new ops of 'is_command_supported' and
> >     'rx_classification_filter_ctl'
> >   i40e: support of 'rx_classification_filter_ctl'
> >   i40e: support of 'is_command_supported'
> >   i40e: Initialize hash function during port initialization.
> >   app/testpmd: add commands for configuring hash functions
>
> Tested-by: Zhaochen Zhan
>
> I have tested this patch on fedora20 with Fortville NIC.
> The Toeplitz, simple XOR and symmetric hash functions all work well for
> IP/UDP packets, both IPv4 and IPv6, in testpmd with RSS support.

Hi Thomas,

Would you please help to apply it as soon as possible? Another developer here is now developing a feature (Flow Director) that depends on these patches.

Thank you very much!

Regards,
Helin
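For readers who have not met the Toeplitz option mentioned in the cover letter, the following is a minimal software sketch of a Toeplitz hash. It is not taken from the patch set and says nothing about how the Fortville hardware computes it; the function name toeplitz_hash and the requirement that the key be at least len + 4 bytes long are assumptions of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Software Toeplitz hash (illustrative sketch, hypothetical helper name).
 * For every set bit of the input, taken most significant bit first, the
 * 32-bit key window aligned with that bit is XORed into the result.
 * The key must provide at least len + 4 bytes.
 */
static uint32_t
toeplitz_hash(const uint8_t *key, const uint8_t *data, size_t len)
{
	uint32_t window;
	uint32_t hash = 0;
	size_t i;
	int bit;

	assert(key != NULL && data != NULL);

	/* The first window is the leading 32 bits of the key. */
	window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
		 ((uint32_t)key[2] << 8) | (uint32_t)key[3];

	for (i = 0; i < len; i++) {
		for (bit = 7; bit >= 0; bit--) {
			if (data[i] & (1u << bit))
				hash ^= window;
			/* Slide the window left and pull in the next key bit. */
			window = (window << 1) | ((key[i + 4] >> bit) & 1u);
		}
	}
	return hash;
}
```

A caller would typically pass the concatenated header fields (for example source address, destination address and ports) as data, together with the RSS key configured on the port.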
[dpdk-dev] [PATCH 0/6] Support flow director programming on fortville
The patch set supports flow director programming on fortville. It includes:
 - reserve i40e resources for flow director, such as queue and vsi.
 - support the new ethdev API rx_classification_filter_ctl for all the configuration or queries for receive classification filters.
 - support programming 6 flow types for the flow director filters, which are called PCTYPE in fortville: ipv4, tcpv4, udpv4, ipv6, tcpv6, udpv6.
 - support flushing flow director table (all filters).
 - support match statistics and FD ID report.
 - also fix the macro conflict between rte_ip.h and netinet/in.h.

jingjing.wu (6):
  i40e: flow director resource reserve and initialize on i40e
  lib/librte_net: fix the Marco conflict between rte_ip.h and netinet/in.h
  ethdev: define new ethdev API rx_classification_filter_ctl
  i40e: function implement in i40e for flow director filter programming
  app/test-pmd: add commands and config functions for i40e flow director support
  i40e: support FD ID report and match counter for i40e flow director

 app/test-pmd/cmdline.c              | 665
 app/test-pmd/config.c               | 54 ++-
 app/test-pmd/testpmd.c              | 22 ++
 app/test-pmd/testpmd.h              | 57
 lib/librte_ether/Makefile           | 3 +-
 lib/librte_ether/rte_eth_features.h | 64
 lib/librte_ether/rte_ethdev.c       | 19 +-
 lib/librte_ether/rte_ethdev.h       | 108 +++---
 lib/librte_net/rte_ip.h             | 5 +-
 lib/librte_pmd_i40e/Makefile        | 5 +
 lib/librte_pmd_i40e/i40e_ethdev.c   | 98 +-
 lib/librte_pmd_i40e/i40e_ethdev.h   | 32 +-
 lib/librte_pmd_i40e/i40e_fdir.c     | 355 +++
 lib/librte_pmd_i40e/i40e_rxtx.c     | 176 +-
 lib/librte_pmd_i40e/rte_i40e.h      | 125 +++
 15 files changed, 1727 insertions(+), 61 deletions(-)
 create mode 100644 lib/librte_ether/rte_eth_features.h
 create mode 100644 lib/librte_pmd_i40e/i40e_fdir.c
 create mode 100644 lib/librte_pmd_i40e/rte_i40e.h

-- 
1.8.1.4
[dpdk-dev] [PATCH 1/6] i40e: flow director resource reserve and initialize on i40e
flow director resource reserve and initialize on i40e, it includes - queue initialization and switch on and vsi creation during setup - queue vsi for flow director release during close Signed-off-by: jingjing.wu --- lib/librte_pmd_i40e/Makefile | 1 + lib/librte_pmd_i40e/i40e_ethdev.c | 40 +-- lib/librte_pmd_i40e/i40e_ethdev.h | 22 ++- lib/librte_pmd_i40e/i40e_fdir.c | 135 ++ lib/librte_pmd_i40e/i40e_rxtx.c | 127 +++ 5 files changed, 318 insertions(+), 7 deletions(-) create mode 100644 lib/librte_pmd_i40e/i40e_fdir.c diff --git a/lib/librte_pmd_i40e/Makefile b/lib/librte_pmd_i40e/Makefile index 4b31675..6537654 100644 --- a/lib/librte_pmd_i40e/Makefile +++ b/lib/librte_pmd_i40e/Makefile @@ -87,6 +87,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_I40E_PMD) += i40e_ethdev.c SRCS-$(CONFIG_RTE_LIBRTE_I40E_PMD) += i40e_rxtx.c SRCS-$(CONFIG_RTE_LIBRTE_I40E_PMD) += i40e_ethdev_vf.c SRCS-$(CONFIG_RTE_LIBRTE_I40E_PMD) += i40e_pf.c +SRCS-$(CONFIG_RTE_LIBRTE_I40E_PMD) += i40e_fdir.c # this lib depends upon: DEPDIRS-$(CONFIG_RTE_LIBRTE_I40E_PMD) += lib/librte_eal lib/librte_ether DEPDIRS-$(CONFIG_RTE_LIBRTE_I40E_PMD) += lib/librte_mempool lib/librte_mbuf diff --git a/lib/librte_pmd_i40e/i40e_ethdev.c b/lib/librte_pmd_i40e/i40e_ethdev.c index 9ed31b5..47125c7 100644 --- a/lib/librte_pmd_i40e/i40e_ethdev.c +++ b/lib/librte_pmd_i40e/i40e_ethdev.c @@ -516,6 +516,7 @@ eth_i40e_dev_init(__rte_unused struct eth_driver *eth_drv, err_setup_pf_switch: rte_free(pf->main_vsi); + i40e_fdir_teardown(pf); err_get_mac_addr: err_configure_lan_hmc: (void)i40e_shutdown_lan_hmc(hw); @@ -728,6 +729,7 @@ i40e_dev_close(struct rte_eth_dev *dev) /* release all the existing VSIs and VEBs */ i40e_vsi_release(pf->main_vsi); + i40e_fdir_teardown(pf); /* shutdown the adminq */ i40e_aq_queue_shutdown(hw, true); @@ -2262,7 +2264,11 @@ i40e_vsi_release(struct i40e_vsi *vsi) TAILQ_FOREACH(f, &vsi->mac_list, next) rte_free(f); - if (vsi->type != I40E_VSI_MAIN) { + /* +* For FDIR vsi, there is not actual HW VSI needed, no need 
to +* call adminq and removing fromtailq. +*/ + if (vsi->type != I40E_VSI_MAIN && vsi->type != I40E_VSI_FDIR) { /* Remove vsi from parent's sibling list */ if (vsi->parent_vsi == NULL || vsi->parent_vsi->veb == NULL) { PMD_DRV_LOG(ERR, "VSI's parent VSI is NULL\n"); @@ -2379,7 +2385,8 @@ i40e_vsi_setup(struct i40e_pf *pf, struct ether_addr broadcast = {.addr_bytes = {0xff, 0xff, 0xff, 0xff, 0xff, 0xff}}; - if (type != I40E_VSI_MAIN && uplink_vsi == NULL) { + if (type != I40E_VSI_MAIN && type != I40E_VSI_FDIR && + uplink_vsi == NULL) { PMD_DRV_LOG(ERR, "VSI setup failed, " "VSI link shouldn't be NULL\n"); return NULL; @@ -2392,7 +2399,8 @@ i40e_vsi_setup(struct i40e_pf *pf, } /* If uplink vsi didn't setup VEB, create one first */ - if (type != I40E_VSI_MAIN && uplink_vsi->veb == NULL) { + if (type != I40E_VSI_MAIN && type != I40E_VSI_FDIR && + uplink_vsi->veb == NULL) { uplink_vsi->veb = i40e_veb_setup(pf, uplink_vsi); if (NULL == uplink_vsi->veb) { @@ -2420,6 +2428,9 @@ i40e_vsi_setup(struct i40e_pf *pf, case I40E_VSI_SRIOV : vsi->nb_qps = pf->vf_nb_qps; break; + case I40E_VSI_FDIR: + vsi->nb_qps = pf->fdir_nb_qps; + break; default: goto fail_mem; } @@ -2432,7 +2443,7 @@ i40e_vsi_setup(struct i40e_pf *pf, vsi->base_queue = ret; /* VF has MSIX interrupt in VF range, don't allocate here */ - if (type != I40E_VSI_SRIOV) { + if (type != I40E_VSI_SRIOV && type != I40E_VSI_FDIR) { ret = i40e_res_pool_alloc(&pf->msix_pool, 1); if (ret < 0) { PMD_DRV_LOG(ERR, "VSI %d get heap failed %d", vsi->seid, ret); @@ -2561,8 +2572,16 @@ i40e_vsi_setup(struct i40e_pf *pf, * Since VSI is not created yet, only configure parameter, * will add vsi below. */ - } - else { + } else if (type == I40E_VSI_FDIR) { + vsi->info.valid_sections = 0; + vsi->seid = 0; + vsi->vsi_id = 0; + /* +* No actual HW VSI needed, will return here without +* calling adminq and adding to tailq. 
+*/ + return vsi; + } else { PMD_DRV_LOG(ERR, "VSI: Not support other type VSI yet\n"); goto fail_msix_alloc; } @@ -2749,6 +2768,13 @@ i40e_pf_setup(struct i40e_pf *pf) return ret; } +
[dpdk-dev] [PATCH 2/6] lib/librte_net: fix the Marco conflict
Fix the macro conflict between rte_ip.h and netinet/in.h

Signed-off-by: jingjing.wu
---
 lib/librte_net/rte_ip.h | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/lib/librte_net/rte_ip.h b/lib/librte_net/rte_ip.h
index e3f65c1..0f0b3b0 100644
--- a/lib/librte_net/rte_ip.h
+++ b/lib/librte_net/rte_ip.h
@@ -116,6 +116,8 @@ struct ipv4_hdr {
 
 #define IPV4_HDR_OFFSET_UNITS 8
 
+#ifndef _NETINET_IN_H
+#ifndef _NETINET_IN_H_
 /* IPv4 protocols */
 #define IPPROTO_IP 0 /**< dummy for IP */
 #define IPPROTO_HOPOPTS 0 /**< IP6 hop-by-hop options */
@@ -226,7 +228,8 @@ struct ipv4_hdr {
 #define IPPROTO_DIVERT 254 /**< divert pseudo-protocol */
 #define IPPROTO_RAW 255 /**< raw IP packet */
 #define IPPROTO_MAX 256 /**< maximum protocol number */
-
+#endif
+#endif
 /*
  * IPv4 address types
  */
-- 
1.8.1.4
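The patch above keys on the include-guard macros of netinet/in.h (presumably _NETINET_IN_H for glibc and _NETINET_IN_H_ for BSD-style headers). As a point of comparison only, and not what the patch does, a per-symbol guard can be sketched as follows. The helper proto_is_udp is made up for the example. Either approach assumes the system header has been seen before the fallback definitions.

```c
#include <netinet/in.h>	/* system header, may already provide IPPROTO_* */

/*
 * Per-symbol guards: each constant is defined only when no earlier header
 * supplied a macro of that name. This is an alternative to the
 * include-guard test used in the patch, shown purely for illustration.
 */
#ifndef IPPROTO_UDP
#define IPPROTO_UDP 17	/**< user datagram protocol */
#endif

#ifndef IPPROTO_RAW
#define IPPROTO_RAW 255	/**< raw IP packet */
#endif

/* Hypothetical helper, only here so the constants are used. */
static int
proto_is_udp(int proto)
{
	return proto == IPPROTO_UDP;
}
```

The per-symbol form costs more lines for a long list such as the one in rte_ip.h, which may be why the patch guards the whole block at once.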
[dpdk-dev] [PATCH 3/6] ethdev: define new ethdev API rx_classification_filter_ctl
support a new ethdev API rx_classification_filter_ctl for all the configuration or queries for receive classification filters. this patch supports commands the API used below: RTE_CMD_FDIR_RULE_ADD RTE_CMD_FDIR_RULE_DEL RTE_CMD_FDIR_FLUSH RTE_CMD_FDIR_INFO_GET Signed-off-by: jingjing.wu --- lib/librte_ether/Makefile | 3 +- lib/librte_ether/rte_eth_features.h | 64 + lib/librte_ether/rte_ethdev.c | 19 ++- lib/librte_ether/rte_ethdev.h | 108 +++- 4 files changed, 154 insertions(+), 40 deletions(-) create mode 100644 lib/librte_ether/rte_eth_features.h diff --git a/lib/librte_ether/Makefile b/lib/librte_ether/Makefile index b310f8b..03dec8a 100644 --- a/lib/librte_ether/Makefile +++ b/lib/librte_ether/Makefile @@ -46,8 +46,9 @@ SRCS-y += rte_ethdev.c # SYMLINK-y-include += rte_ether.h SYMLINK-y-include += rte_ethdev.h +SYMLINK-y-include += rte_eth_features.h # this lib depends upon: -DEPDIRS-y += lib/librte_eal lib/librte_mempool lib/librte_ring lib/librte_mbuf +DEPDIRS-y += lib/librte_eal lib/librte_mempool lib/librte_ring lib/librte_mbuf lib/librte_net include $(RTE_SDK)/mk/rte.lib.mk diff --git a/lib/librte_ether/rte_eth_features.h b/lib/librte_ether/rte_eth_features.h new file mode 100644 index 000..ba5eccb --- /dev/null +++ b/lib/librte_ether/rte_eth_features.h @@ -0,0 +1,64 @@ +/*- + * BSD LICENSE + * + * Copyright(c) 2010-2014 Intel Corporation. All rights reserved. + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. 
+ * * Neither the name of Intel Corporation nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. + * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#ifndef _RTE_ETH_FEATURES_H_ +#define _RTE_ETH_FEATURES_H_ + +/** + * @file + * + * Ethernet device specific features + */ + +#ifdef __cplusplus +extern "C" { +#endif + +/* Commands defined for NIC specific features */ +enum rte_eth_command { + RTE_CMD_UNKNOWN = 0, + RTE_CMD_FDIR_RULE_ADD, + /**< Command to add a new FDIR filter rule. */ + RTE_CMD_FDIR_RULE_DEL, + /**< Command to delete a FDIR filter rule. */ + RTE_CMD_FDIR_FLUSH, + /**< Command to clear all FDIR filter rules. */ + RTE_CMD_FDIR_INFO_GET, + /**< Command to get FDIR information. 
*/ +}; + +#ifdef __cplusplus +} +#endif + +#endif /* _RTE_ETH_FEATURES_H_ */ diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c index fd1010a..10a08de 100644 --- a/lib/librte_ether/rte_ethdev.c +++ b/lib/librte_ether/rte_ethdev.c @@ -41,7 +41,6 @@ #include #include #include -#include #include #include @@ -66,6 +65,7 @@ #include #include #include +#include #include "rte_ether.h" #include "rte_ethdev.h" @@ -3002,3 +3002,20 @@ rte_eth_dev_get_flex_filter(uint8_t port_id, uint16_t index, return (*dev->dev_ops->get_flex_filter)(dev, index, filter, rx_queue); } + +int +rte_eth_dev_rx_classification_filter_ctl(uint8_t port_id, +enum rte_eth_command cmd, +void *args) +{ + struct rte_eth_dev *dev; + + if (port_id >= nb_ports) { + PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id); + return -ENODEV; + } + dev = &rte_eth_devices[port_id]; + FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_classification_filter_ctl, + -ENOTSUP); + return (*dev->dev_ops->r
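The wrapper added to rte_ethdev.c follows a common pattern: validate the port, check that the driver filled in the operation, then forward the command and its opaque argument. The sketch below reproduces that pattern with stand-in types so it can be compiled without DPDK. Every name in it (filter_cmd, dev_ops, dev_filter_ctl, toy_filter_ctl and so on) is hypothetical, and the error codes of the toy driver are a choice of this sketch, not of the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Stand-ins for the command enum and device structures (hypothetical). */
enum filter_cmd {
	CMD_UNKNOWN = 0,
	CMD_RULE_ADD,
	CMD_RULE_DEL,
	CMD_FLUSH,
	CMD_INFO_GET,
};

struct dev;
typedef int (*filter_ctl_t)(struct dev *dev, enum filter_cmd cmd, void *args);

struct dev_ops {
	filter_ctl_t filter_ctl;	/* NULL when the driver lacks support */
};

struct dev {
	const struct dev_ops *ops;
	unsigned flushes;		/* counts flush commands, for the demo */
};

#define MAX_PORTS 2
static struct dev devices[MAX_PORTS];
static unsigned nb_ports;

/* Generic layer: check the port and the op, then forward the command. */
static int
dev_filter_ctl(unsigned port_id, enum filter_cmd cmd, void *args)
{
	struct dev *dev;

	if (port_id >= nb_ports)
		return -ENODEV;
	dev = &devices[port_id];
	if (dev->ops == NULL || dev->ops->filter_ctl == NULL)
		return -ENOTSUP;
	return dev->ops->filter_ctl(dev, cmd, args);
}

/* Toy driver: commands that carry data reject a NULL argument. */
static int
toy_filter_ctl(struct dev *dev, enum filter_cmd cmd, void *args)
{
	assert(dev != NULL);

	switch (cmd) {
	case CMD_RULE_ADD:
	case CMD_RULE_DEL:
	case CMD_INFO_GET:
		return (args == NULL) ? -EINVAL : 0;
	case CMD_FLUSH:
		dev->flushes++;
		return 0;
	default:
		return -EINVAL;
	}
}

static const struct dev_ops toy_ops = { .filter_ctl = toy_filter_ctl };
```

Because the argument is a void pointer, the caller and the driver must agree out of band on the structure behind each command; the sketch only checks for NULL.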
[dpdk-dev] [PATCH 4/6] i40e: function implement in i40e for flow director filter programming
support the API ops defined in ethdev, the behavior according to each command: RTE_CMD_FDIR_RULE_ADD: add a new FDIR filter rule. RTE_CMD_FDIR_RULE_DEL: delete a FDIR filter rule. RTE_CMD_FDIR_FLUSH : clear all FDIR filter rules. RTE_CMD_FDIR_INFO_GET: get FDIR information. Signed-off-by: jingjing.wu --- lib/librte_pmd_i40e/Makefile | 4 + lib/librte_pmd_i40e/i40e_ethdev.c | 53 + lib/librte_pmd_i40e/i40e_ethdev.h | 10 ++ lib/librte_pmd_i40e/i40e_fdir.c | 220 ++ lib/librte_pmd_i40e/rte_i40e.h| 125 ++ 5 files changed, 412 insertions(+) create mode 100644 lib/librte_pmd_i40e/rte_i40e.h diff --git a/lib/librte_pmd_i40e/Makefile b/lib/librte_pmd_i40e/Makefile index 6537654..3da20c5 100644 --- a/lib/librte_pmd_i40e/Makefile +++ b/lib/librte_pmd_i40e/Makefile @@ -88,6 +88,10 @@ SRCS-$(CONFIG_RTE_LIBRTE_I40E_PMD) += i40e_rxtx.c SRCS-$(CONFIG_RTE_LIBRTE_I40E_PMD) += i40e_ethdev_vf.c SRCS-$(CONFIG_RTE_LIBRTE_I40E_PMD) += i40e_pf.c SRCS-$(CONFIG_RTE_LIBRTE_I40E_PMD) += i40e_fdir.c + +# install this header file +SYMLINK-$(CONFIG_RTE_LIBRTE_I40E_PMD)-include := rte_i40e.h + # this lib depends upon: DEPDIRS-$(CONFIG_RTE_LIBRTE_I40E_PMD) += lib/librte_eal lib/librte_ether DEPDIRS-$(CONFIG_RTE_LIBRTE_I40E_PMD) += lib/librte_mempool lib/librte_mbuf diff --git a/lib/librte_pmd_i40e/i40e_ethdev.c b/lib/librte_pmd_i40e/i40e_ethdev.c index 47125c7..c4637be 100644 --- a/lib/librte_pmd_i40e/i40e_ethdev.c +++ b/lib/librte_pmd_i40e/i40e_ethdev.c @@ -48,6 +48,7 @@ #include #include #include +#include #include "i40e_logs.h" #include "i40e/i40e_register_x710_int.h" @@ -203,6 +204,9 @@ static int i40e_dev_rss_hash_update(struct rte_eth_dev *dev, struct rte_eth_rss_conf *rss_conf); static int i40e_dev_rss_hash_conf_get(struct rte_eth_dev *dev, struct rte_eth_rss_conf *rss_conf); +static int i40e_rx_classification_filter_ctl(struct rte_eth_dev *dev, +enum rte_eth_command cmd, +void *args); /* Default hash key buffer for RSS */ static uint32_t rss_key_default[I40E_PFQF_HKEY_MAX_INDEX + 1]; @@ 
-248,6 +252,7 @@ static struct eth_dev_ops i40e_eth_dev_ops = { .reta_query = i40e_dev_rss_reta_query, .rss_hash_update = i40e_dev_rss_hash_update, .rss_hash_conf_get= i40e_dev_rss_hash_conf_get, + .rx_classification_filter_ctl = i40e_rx_classification_filter_ctl, }; static struct eth_driver rte_i40e_pmd = { @@ -3984,3 +3989,51 @@ i40e_pf_config_mq_rx(struct i40e_pf *pf) return 0; } + +static int +i40e_rx_classification_filter_ctl(struct rte_eth_dev *dev, + enum rte_eth_command cmd, + void *args) +{ + struct i40e_pf *pf = I40E_DEV_PRIVATE_TO_PF(dev->data->dev_private); + struct rte_i40e_fdir_entry *fdir_entry; + struct rte_i40e_fdir_info *fdir_info; + int ret = I40E_SUCCESS; + + switch (cmd) { + case RTE_CMD_FDIR_RULE_ADD: + if (args == NULL) + return I40E_ERR_PARAM; + fdir_entry = (struct rte_i40e_fdir_entry *)args; + ret = i40e_fdir_filter_programming(pf, + fdir_entry->soft_id, + &fdir_entry->input, + &fdir_entry->action, + TRUE); + break; + case RTE_CMD_FDIR_RULE_DEL: + if (args == NULL) + return I40E_ERR_PARAM; + fdir_entry = (struct rte_i40e_fdir_entry *)args; + ret = i40e_fdir_filter_programming(pf, + fdir_entry->soft_id, + &fdir_entry->input, + &fdir_entry->action, + FALSE); + break; + case RTE_CMD_FDIR_INFO_GET: + if (args == NULL) + return I40E_ERR_PARAM; + fdir_info = (struct rte_i40e_fdir_info *)args; + i40e_fdir_info_get(dev, fdir_info); + break; + case RTE_CMD_FDIR_FLUSH: + ret = i40e_fdir_flush(pf); + break; + default: + PMD_DRV_LOG(ERR, "unknown command type %u\n", cmd); + ret = I40E_ERR_PARAM; + break; + } + return ret; +} diff --git a/lib/librte_pmd_i40e/i40e_ethdev.h b/lib/librte_pmd_i40e/i40e_ethdev.h index c2c7fa9..7755f5a 100644 --- a/lib/librte_pmd_i40e/i40e_ethdev.h +++ b/lib/librte_pmd_i40e/i40e_ethdev.h @@ -34,6 +34,8 @@ #ifndef _I40E_ETHDEV_H_ #define _I40E_ETHDEV_H_ +#include "rte_i40e.h" + #define I40E_AQ_LEN 32 #define I40E_AQ_BUF_SZ4096 /* Number of queues per TC should be one of 1, 2, 4, 8, 16, 32, 64 */ @@ -332,6 +334,1
[dpdk-dev] [PATCH 5/6] app/test-pmd: add commands and config functions for i40e flow director support
add structure definition to construct programming packet. add commands to programming 6 flow types for the flow director filters, which is called PCTYPE in fortville: ipv4, tcpv4, udpv4, ipv6, tcpv6, udpv6 add command to support flushing flow director table Signed-off-by: jingjing.wu --- app/test-pmd/cmdline.c | 665 + app/test-pmd/config.c | 54 +++- app/test-pmd/testpmd.c | 22 ++ app/test-pmd/testpmd.h | 57 + 4 files changed, 786 insertions(+), 12 deletions(-) diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c index 345be11..bf7e45c 100644 --- a/app/test-pmd/cmdline.c +++ b/app/test-pmd/cmdline.c @@ -74,6 +74,14 @@ #include #include #include +#include +#include +#include +#include +#include +#ifdef RTE_LIBRTE_I40E_PMD +#include +#endif /* RTE_LIBRTE_I40E_PMD */ #include #include @@ -655,6 +663,25 @@ static void cmd_help_long_parsed(void *parsed_result, "get_flex_filter (port_id) index (idx)\n" "get info of a flex filter.\n\n" + +#ifdef RTE_LIBRTE_I40E_PMD + "i40e_flow_director_filter (port_id) (add|del)" + " flow (ip4|ip6) src (src_ip_address) dst (dst_ip_address)" + " flexwords (flexwords_value) (drop|fwd)" + " queue (queue_id) fd_id (fd_id_value)\n" + "Add/Del a IP type flow director filter for i40e NIC.\n\n" + + "i40e_flow_director_filter (port_id) (add|del)" + " flow (udp4|tcp4|udp6|tcp6)" + " src (src_ip_address) (src_port)" + " dst (dst_ip_address) (dst_port)" + " flexwords (flexwords_value) (drop|fwd)" + " queue (queue_id) fd_id (fd_id_value)\n" + "Add/Del a UDP/TCP type flow director filter for i40e NIC.\n\n" + + "i40e_flush_flow_diretor (port_id)\n" + "Flush all flow director entries of a device on i40e NIC.\n\n" +#endif /* RTE_LIBRTE_I40E_PMD */ ); } } @@ -7304,6 +7331,639 @@ cmdline_parse_inst_t cmd_get_flex_filter = { }, }; +/* *** Classification Filters Control *** */ +#ifdef RTE_LIBRTE_I40E_PMD +/* *** deal with i40e flow director filter *** */ +struct cmd_i40e_flow_director_result { + cmdline_fixed_string_t flow_director_filter; + uint8_t 
port_id; + cmdline_fixed_string_t ops; + cmdline_fixed_string_t flow; + cmdline_fixed_string_t flow_type; + cmdline_fixed_string_t src; + cmdline_ipaddr_t ip_src; + uint16_t port_src; + cmdline_fixed_string_t dst; + cmdline_ipaddr_t ip_dst; + uint16_t port_dst; + cmdline_fixed_string_t flexwords; + cmdline_fixed_string_t flexwords_value; + cmdline_fixed_string_t drop; + cmdline_fixed_string_t queue; + uint16_t queue_id; + cmdline_fixed_string_t fd_id; + uint32_t fd_id_value; +}; + +static inline int +parse_flexwords(const char *q_arg, uint16_t *flexwords) +{ +#define MAX_NUM_WORD 8 + char s[256]; + const char *p, *p0 = q_arg; + char *end; + unsigned long int_fld[MAX_NUM_WORD]; + char *str_fld[MAX_NUM_WORD]; + int i; + unsigned size; + int num_words = -1; + + p = strchr(p0, '('); + if (p == NULL) + return -1; + ++p; + p0 = strchr(p, ')'); + if (p0 == NULL) + return -1; + + size = p0 - p; + if (size >= sizeof(s)) + return -1; + + snprintf(s, sizeof(s), "%.*s", size, p); + num_words = rte_strsplit(s, sizeof(s), str_fld, MAX_NUM_WORD, ','); + if (num_words < 0 || num_words > MAX_NUM_WORD) + return -1; + for (i = 0; i < num_words; i++) { + errno = 0; + int_fld[i] = strtoul(str_fld[i], &end, 0); + if (errno != 0 || end == str_fld[i] || int_fld[i] > UINT16_MAX) + return -1; + flexwords[i] = rte_cpu_to_be_16((uint16_t)int_fld[i]); + } + return num_words; +} + +static inline struct rte_mbuf * +tx_mbuf_alloc(struct rte_mempool *mp) +{ + struct rte_mbuf *m; + + m = __rte_mbuf_raw_alloc(mp); + __rte_mbuf_sanity_check_raw(m, RTE_MBUF_PKT, 0); + return m; +} + +static inline void +rte_i40e_fdir_construct_ip4_input(struct ipv4_other_flow *flow, + unsigned char *raw_pkt) +{ + struct ether_hdr *ether; + struct ipv4_hdr *ip; + unsigned char *payload; + + ether = (struct ether_hdr *)raw_pkt; + ip = (struct ipv4_hdr *)(raw_pkt + sizeof(struct ether_hdr)); + payload = raw_pkt + sizeof(struct ether_hdr) + sizeof(struct ipv4_hdr); + + ether->ether_type = 
rte_cpu_to_be_16(ETHER_TYPE_IPv4); + ip->version_ihl = I40E_FDIR_IP_DEFAULT_VE
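The parse_flexwords() helper in this patch turns an argument such as "(0x1,0x2)" into an array of big-endian 16-bit words. A standalone sketch of the same idea, using only the standard C library instead of rte_strsplit() and rte_cpu_to_be_16(), could look like this. The name parse_words is made up, and the sketch is slightly stricter than the patch in that it rejects trailing characters after a number.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_NUM_WORD 8

/*
 * Parse "(w0,w1,...)" into at most MAX_NUM_WORD 16-bit words stored in
 * big-endian byte order. Returns the number of words, or -1 on error.
 * Hypothetical standalone variant of the helper in the patch.
 */
static int
parse_words(const char *arg, uint16_t *words)
{
	char buf[256];
	const char *open, *close;
	char *tok, *comma, *end;
	unsigned long value;
	uint8_t be[2];
	size_t size;
	int n = 0;

	assert(arg != NULL && words != NULL);

	open = strchr(arg, '(');
	if (open == NULL)
		return -1;
	open++;
	close = strchr(open, ')');
	if (close == NULL)
		return -1;

	size = (size_t)(close - open);
	if (size >= sizeof(buf))
		return -1;
	memcpy(buf, open, size);
	buf[size] = '\0';

	for (tok = buf; tok != NULL; tok = comma ? comma + 1 : NULL) {
		/* Cut the current token at the next comma, if any. */
		comma = strchr(tok, ',');
		if (comma != NULL)
			*comma = '\0';
		if (n == MAX_NUM_WORD)
			return -1;

		errno = 0;
		value = strtoul(tok, &end, 0);
		if (errno != 0 || end == tok || *end != '\0' ||
		    value > UINT16_MAX)
			return -1;

		/* Store the word most significant byte first. */
		be[0] = (uint8_t)(value >> 8);
		be[1] = (uint8_t)(value & 0xff);
		memcpy(&words[n], be, sizeof(be));
		n++;
	}
	return n;
}
```

Copying the bytes explicitly keeps the sketch independent of host byte order and of any byte-swap helper.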
[dpdk-dev] [PATCH 6/6] i40e: support FD ID report and match counter for i40e flow director
support to get the fdir_match counter support to set the FDIR flag and FD_ID reported in mbuf Signed-off-by: jingjing.wu --- lib/librte_pmd_i40e/i40e_ethdev.c | 5 lib/librte_pmd_i40e/i40e_rxtx.c | 49 ++- 2 files changed, 53 insertions(+), 1 deletion(-) diff --git a/lib/librte_pmd_i40e/i40e_ethdev.c b/lib/librte_pmd_i40e/i40e_ethdev.c index c4637be..c518375 100644 --- a/lib/librte_pmd_i40e/i40e_ethdev.c +++ b/lib/librte_pmd_i40e/i40e_ethdev.c @@ -1112,6 +1112,9 @@ i40e_dev_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats) I40E_GLPRT_PTC9522L(hw->port), pf->offset_loaded, &os->tx_size_big, &ns->tx_size_big); + i40e_stat_update_32(hw, I40E_GLQF_PCNT(pf->fdir.match_counter_index), + pf->offset_loaded, + &os->fd_sb_match, &ns->fd_sb_match); /* GLPRT_MSPDC not supported */ /* GLPRT_XEC not supported */ @@ -1125,6 +1128,7 @@ i40e_dev_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats) stats->obytes = ns->eth.tx_bytes; stats->oerrors = ns->eth.tx_errors; stats->imcasts = ns->eth.rx_multicast; + stats->fdirmatch = ns->fd_sb_match; if (pf->main_vsi) i40e_update_vsi_stats(pf->main_vsi); @@ -1190,6 +1194,7 @@ i40e_dev_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *stats) printf("mac_short_packet_dropped: %lu\n", ns->mac_short_packet_dropped); printf("checksum_error: %lu\n", ns->checksum_error); + printf("fdir_match: %lu\n", ns->fd_sb_match); printf("* PF stats end \n"); #endif /* RTE_LIBRTE_I40E_DEBUG_DRIVER */ } diff --git a/lib/librte_pmd_i40e/i40e_rxtx.c b/lib/librte_pmd_i40e/i40e_rxtx.c index ad60d20..759fd75 100644 --- a/lib/librte_pmd_i40e/i40e_rxtx.c +++ b/lib/librte_pmd_i40e/i40e_rxtx.c @@ -110,6 +110,10 @@ i40e_rxd_status_to_pkt_flags(uint64_t qword) I40E_RX_DESC_FLTSTAT_RSS_HASH) == I40E_RX_DESC_FLTSTAT_RSS_HASH) ? PKT_RX_RSS_HASH : 0); + /* Check if FDIR Match */ + flags |= (uint16_t)(qword & (1 << I40E_RX_DESC_STATUS_FLM_SHIFT) ? 
+ PKT_RX_FDIR : 0); + return flags; } @@ -626,7 +630,22 @@ i40e_rx_scan_hw_ring(struct i40e_rx_queue *rxq) mb->ol_flags = pkt_flags; if (pkt_flags & PKT_RX_RSS_HASH) mb->pkt.hash.rss = rte_le_to_cpu_32(\ - rxdp->wb.qword0.hi_dword.rss); + rxdp[j].wb.qword0.hi_dword.rss); + if (pkt_flags & PKT_RX_FDIR) { +#ifdef RTE_LIBRTE_I40E_16BYTE_RX_DESC + if (((qword1 >> I40E_RX_DESC_STATUS_FLTSTAT_SHIFT) & + I40E_RX_DESC_FLTSTAT_RSS_HASH) == + I40E_RX_DESC_FLTSTAT_RSV_FD_ID) + mb->pkt.hash.fdir.id = (uint16_t) + rte_le_to_cpu_32(rxdp[j].wb.qword0.hi_dword.fd); +#else + if (((rxdp[j].wb.qword2.ext_status >> + I40E_RX_DESC_EXT_STATUS_FLEXBH_SHIFT) & + 0x03) == 0x01) + mb->pkt.hash.fdir.id = (uint16_t) + rte_le_to_cpu_32(rxdp[j].wb.qword3.hi_dword.fd_id); +#endif + } } for (j = 0; j < I40E_LOOK_AHEAD; j++) @@ -864,6 +883,20 @@ i40e_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts) if (pkt_flags & PKT_RX_RSS_HASH) rxm->pkt.hash.rss = rte_le_to_cpu_32(rxd.wb.qword0.hi_dword.rss); + if (pkt_flags & PKT_RX_FDIR) { +#ifdef RTE_LIBRTE_I40E_16BYTE_RX_DESC + if (((qword1 >> I40E_RX_DESC_STATUS_FLTSTAT_SHIFT) & + I40E_RX_DESC_FLTSTAT_RSS_HASH) == + I40E_RX_DESC_FLTSTAT_RSV_FD_ID) + rxm->pkt.hash.fdir.id = (uint16_t) + rte_le_to_cpu_32(rxd.wb.qword0.hi_dword.fd); +#else + if (((rxd.wb.qword2.ext_status >> I40E_RX_DESC_EXT_STATUS_FLEXBH_SHIFT) & + 0x03) == 0x01) + rxm->pkt.hash.fdir.id = (uint16_t) + rte_le_to_cpu_32(rxd.wb.qword3.hi_dword.fd_id); +#endif + } rx_pkts[nb_rx++] = rxm; } @@ -1017,6 +1050,20
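The receive path in this patch tests a single status bit of the descriptor qword to decide whether the packet matched a flow director filter. The sketch below isolates that test. The bit position 11 and the flag value are assumptions made for the example, not values confirmed by the patch text, and the names are hypothetical. The sketch uses a 64-bit constant for the mask so the shift stays well defined for any bit position of a 64-bit qword.

```c
#include <stdint.h>

/* Assumed values, for illustration only. */
#define DESC_STATUS_FLM_SHIFT 11	/* assumed position of the match bit */
#define PKT_FLAG_FDIR 0x0004		/* made-up flag value */

/* Translate the descriptor status qword into a packet flag. */
static uint16_t
status_to_fdir_flag(uint64_t qword)
{
	const uint64_t mask = UINT64_C(1) << DESC_STATUS_FLM_SHIFT;

	return (qword & mask) ? PKT_FLAG_FDIR : 0;
}
```

With a plain int constant, as in `1 << SHIFT`, the expression is fine for low bit positions but would overflow for shifts of 31 or more.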
[dpdk-dev] VMDq + DCB: 128 Tx queues
Yes, if you want TX configured in DCB mode, 128 TX queues are needed. The testpmd code has an example of how to use 128 RX queues and 128 TX queues simultaneously in VMDq+DCB mode: see the get_eth_dcb_conf() function in testpmd.c.

BRs,
Jijiang Liu

-----Original Message-----
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Sunil Bojanapally
Sent: Friday, August 01, 2014 1:13 PM
To: dev at dpdk.org
Subject: [dpdk-dev] VMDq + DCB: 128 Tx queues

Hi team,

As per the DPDK programming guide, VMDq+DCB configures each Ethernet port with 16 pools of 8 queues each, which means each port has 128 Rx and Tx queues. The question is: in order to have end-to-end QoS support, should the port be configured with 128 Rx as well as 128 Tx queues?

Note: I am considering the same port for Rx and Tx.

Thanks,
Sunil
[dpdk-dev] [PATCH 0/2] dpdk: Allow for dynamic enablement of some isolated features
On 31/07/2014 22:32, John W. Linville wrote:
>> BTW: what is FESCO?
> Fedora Engineering Steering Committee
>
> Neil and I have already felt the hot breath of FESCO on our necks
> regarding the Fedora DPDK package...
>
I do confirm and feel that we should go step by step. Having multiple libraries, as Bruce suggests, could be an option; I like this idea. Another one could be a new ELF format (like executables on Mac or Windows) that supports multiple binaries, each optimized for a different CPU. I am not aware of such an option with the Linux loader. But here, as a DPDK community, we cannot push it. Anyone at Fedora?
[dpdk-dev] VLAN based Packet Processing
Hi team,

When we transmit VLAN tagged packets using rte_eth_tx_burst(), does the DPDK library enqueue the packets to different CoS queues based upon QoS priority, provided the ports are configured in DCB mode?

Similarly, on reception with rte_eth_rx_burst(), does it poll all the CoS queues and dequeue the packets based on the priority of the queue?

Finally, I want to know what scheduling is employed to dequeue the packets from different QoS queues.

Thanks in advance,
Sunil
[dpdk-dev] [PATCH 1/2] igb_uio: fix compability on old kernel
2014-07-25 10:36, Stephen Hemminger:
> Add more compatibility wrappers, and split out all the wrapper
> code to a separate file. Builds on Debian Squeeze (2.6.32) which
> is oldest version of kernel current DPDK supports.
>
> Signed-off-by: Stephen Hemminger

Acked-by: Thomas Monjalon

Applied for version 1.7.1.

There are still some compilation issues with RHEL:
include/linux/pci.h:1572: note: previous declaration of 'pci_num_vf' was here
include/linux/pci.h:868: note: previous declaration of 'pci_intx_mask_supported' was here
include/linux/pci.h:869: note: previous declaration of 'pci_check_and_mask_intx' was here

Some ifdefs are missing but I don't want to dig into RHEL kernel headers to find what is the first RHEL release to support these functions.

By the way, if someone knows an easy method to get all RHEL kernel headers or to know the release where a symbol appeared, it would be very useful.

-- 
Thomas
[dpdk-dev] [PATCH 2/2] igb_uio: handle no IRQ fallback
2014-07-25 10:37, Stephen Hemminger:
> Fix a couple of issues with my earlier igb_uio stuff:
> 1. With MSI (like MSI-X) actual IRQ number is not known until
>    after the pci_enable_msi() is done.
> 2. If INTX fails, fall back to running without IRQ.
>    This allows usermode PCI to recover and run without IRQ
>    for cases where PCI INTX support is broken (aka VMWare).
>
> Signed-off-by: Stephen Hemminger

Acked-by: Thomas Monjalon

Applied for version 1.7.1.

Thanks
-- 
Thomas
[dpdk-dev] [PATCH] kni: fixed compilation error on Ubuntu 14.04 LTS (kernel 3.13.0-30.54)
2014-07-24 16:31, Buriez, Patrice:
> > Why not this simpler form?
> > $(shell lsb_release -si 2>/dev/null)
>
> I didn't want "make" to stop on error or to display a warning if lsb_release
> is not available on other distributions.
> I must admit that I focused on identifying the exact 5-tuple
> UBUNTU_KERNEL_CODE that was triggering the compilation error.
> Then I tried to keep the Makefile as simple and readable as possible, and
> took no shortcut.
> If your simpler form works the same, then indeed it's nicer than mine. ;-)
>
> > > +MODULE_CFLAGS += -DUBUNTU_RELEASE_CODE=$(subst .,,$(shell lsb_release -sr))
> >
> > Or you can use | tr -d . instead of subst and keep the flow from left to
> > right.
>
> Agreed. I seldom use tr and didn't figure out that it would perfectly fit
> here. Thanks!
>
> > > +UBUNTU_KERNEL_CODE := $(shell cut -d' ' -f2 /proc/version_signature |cut -d- -f1,2)
> >                                                                         ^
> > space missing here
>
> I usually pipe into the next command with no space in between. So that's
> somewhat on purpose.
> Is your comment cosmetic or about readability?
> Or are there situations that would fail, unless the space is provided between
> the pipe and the command?

Yes, only cosmetic.

> > > +UBUNTU_KERNEL_CODE := $(subst -,$(comma),$(UBUNTU_KERNEL_CODE))
> > > +UBUNTU_KERNEL_CODE := $(subst .,$(comma),$(UBUNTU_KERNEL_CODE))
> >
> > Would be simpler with | tr -d .-
>
> Agreed again (with the $(comma) from your next email, and without the -d in
> order to actually translate, not delete. ;-)

Yes "tr .- $(comma)" :)

> Again, I mainly focused on extracting and transmitting the 5-tuple
> UBUNTU_KERNEL_CODE from shell to compiler.
> I agree this can be rewritten in nicer ways, but it works, and hopefully does
> not break compilation on other distributions.

Acked and applied with above modifications.

Thanks
-- 
Thomas
[dpdk-dev] [PATCH 0/2] dpdk: Allow for dynamic enablement of some isolated features
On Thu, Jul 31, 2014 at 01:19:50PM -0700, Bruce Richardson wrote: > On Thu, Jul 31, 2014 at 03:01:17PM -0400, Neil Horman wrote: > > On Thu, Jul 31, 2014 at 11:36:32AM -0700, Bruce Richardson wrote: > > > > > > I think a good first step here that I can't see anyone objecting to is > > > to enable the ixgbe driver to use the vector code path for a generic > > > x86_64 build. I've run a quick test here, and changing "_mm_popcnt_u64" > > > to "__builtin_popcountll" [and the include from nmmintrin to tmmintrin] > > > allows a compile for machine type default, and testpmd can still forward > > > packets at a good rate (roughly perf down about 10% vs native compile on > > > SNB). > > > The ACL is a tougher nut to crack, but anyone see any issues with that > > > two-line change to ixgbe_rxtx_vec.c? [Neil, since you started the patch > > > set thread, do you want to submit an official patch here, or would you > > > prefer I > > > do so?] > > > > > > > I'm happy to do so, Though 10% performance degradation vs. using the sse4.2 > > instructions in that path seems significant, isn't it? Given that > > performance > > delta, it seems like it would still be preferable to have a path that used > > the > > sse4.2 instructions when they're available. Or am I misreading what you > > mean > > when you say down 10% > > > > Neil > > > Ok, I did a little bit more testing here. Using the vector pmd compiled > for generic x86_64 and using __builtin_popcountll is approx 35% faster > for packet IO than the existing fast-path functions. It is also 7% (a > bit lower than ~10% as I originally stated) slower than the existing > native-compiled vpmd on a Sandy Bridge platform. > > I then ran an extra test, using EXTRA_CFLAGS='-msse4.2' to turn on the > extra instructions. The ~7% performance drop went to ~3%, so we would > gain a little more with using SSE4.2, but compared to the gain from > having the vector driver at all, it's not that much. 
[I don't have a > system handy with AVX2 support to see what boosts might come from > compiling with that instruction set enabled.] > > Because of this, I'd take the ~35% speed boost for now, and try and find > what would be the best general way to solve this problem across all > libraries. Also, I think that anyone who needs that extra 4% performance > probably wants the other 3% too, and so will compile up the code from > source using the "native" compilation target. :-) >
Wait a moment, I'm not entirely sure what you did here. I understand that you replaced the _mm_popcnt_u64 call in the ixgbe pmd vector receive path with __builtin_popcountll, which is good, but ixgbe also uses the _mm_shuffle_epi8 intrinsic, which is only available with sse4.2 from what I can see. Did you replace those calls with a __builtin_shuffle variant? Otherwise, how did you get the pmd to build? I'm asking because this is what I tried in the first pass, and Konstantin gave some pretty convincing evidence that this was an unworkable solution:
http://dpdk.org/ml/archives/dev/2014-July/004443.html

Neil
[dpdk-dev] [PATCH 0/2] dpdk: Allow for dynamic enablement of some isolated features
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Neil Horman > Sent: Friday, August 01, 2014 2:37 PM > To: Richardson, Bruce > Cc: dev at dpdk.org > Subject: Re: [dpdk-dev] [PATCH 0/2] dpdk: Allow for dynamic enablement of > some isolated features > > On Thu, Jul 31, 2014 at 01:19:50PM -0700, Bruce Richardson wrote: > > On Thu, Jul 31, 2014 at 03:01:17PM -0400, Neil Horman wrote: > > > On Thu, Jul 31, 2014 at 11:36:32AM -0700, Bruce Richardson wrote: > > > > > > > > I think a good first step here that I can't see anyone objecting to is > > > > to enable the ixgbe driver to use the vector code path for a generic > > > > x86_64 build. I've run a quick test here, and changing "_mm_popcnt_u64" > > > > to "__builtin_popcountll" [and the include from nmmintrin to tmmintrin] > > > > allows a compile for machine type default, and testpmd can still forward > > > > packets at a good rate (roughly perf down about 10% vs native compile on > > > > SNB). > > > > The ACL is a tougher nut to crack, but anyone see any issues with that > > > > two-line change to ixgbe_rxtx_vec.c? [Neil, since you started the patch > > > > set thread, do you want to submit an official patch here, or would you > > > > prefer I > > > > do so?] > > > > > > > > > > I'm happy to do so, Though 10% performance degradation vs. using the > > > sse4.2 > > > instructions in that path seems significant, isn't it? Given that > > > performance > > > delta, it seems like it would still be preferable to have a path that > > > used the > > > sse4.2 instructions when they're available. Or am I misreading what you > > > mean > > > when you say down 10% > > > > > > Neil > > > > > Ok, I did a little bit more testing here. Using the vector pmd compiled > > for generic x86_64 and using __builtin_popcountll is approx 35% faster > > for packet IO than the existing fast-path functions. 
It is also 7% (a > > bit lower than ~10% as I originally stated) slower than the existing > > native-compiled vpmd on a Sandy Bridge platform. > > > > I then ran an extra test, using EXTRA_CFLAGS='-msse4.2' to turn on the > > extra instructions. The ~7% performance drop went to ~3%, so we would > > gain a little more with using SSE4.2, but compared to the gain from > > having the vector driver at all, it's not that much. [I don't have a > > system handy with AVX2 support to see what boosts might come from > > compiling with that instruction set enabled.] > > > > Because of this, I'd take the ~35% speed boost for now, and try and find > > what would be the best general way to solve this problem across all > > libraries. Also, I think that anyone who needs that extra 4% performance > > probably wants the other 3% too, and so will compile up the code from > > source using the "native" compilation target. :-) > > > > > Wait a moment, I'm not entirely sure what you did here. I understand that you > replaced the _mm_popcnt_u64 call in the ixgbe pmd vector receive path with > __builtin_popcnt, which is good, but ixgbe also uses the __mm_shuffle_epi8 > intrinsic which is only available with sse4.2 from what I can see. did you > replace those calls with a __builtin_shuffle variant? Otherwise, how did you > get the pmd to build? I'm asking because this is what I tried in the first > pass > and Konstantin gave some pretty convicing evidence that this was an unworkable > solution: > http://dpdk.org/ml/archives/dev/2014-July/004443.html >
I think that _mm_shuffle_epi8 (PSHUFB) is available starting from SSSE3. So I presume there is no need for a replacement.

Konstantin
[dpdk-dev] [PATCH 0/2] dpdk: Allow for dynamic enablement of some isolated features
On Fri, Aug 01, 2014 at 10:46:33AM +0200, Vincent JARDIN wrote:
> On 31/07/2014 22:32, John W. Linville wrote:
> >>BTW: what is FESCO?
> >Fedora Engineering Steering Committee
> >
> >Neil and I have already felt the hot breath of FESCO on our necks
> >regarding the Fedora DPDK package...
>
> I do confirm and feel that we should go step by step.
>
> Having multiple library as Bruce suggest could be an option, I like this
> idea.
>
It's not an option (reasons described further down in the thread).

> Another one could be to get a new ELF format (like executables on Mac or
> Windows) that allows to support multiple binaries optimized for each CPUs. I
> am not aware of such options with Linux loader. But here, as a DPDK
> community, we cannot push it. Any one at Fedora?
>
This is definitely not an option, at least not without significant justification or need. What you're asking for here is the development of an entirely new binary file format, the kernel and glibc support to interpret and execute it, and the compiler tooling to emit code in that format. That's a huge undertaking, and it's not going to be done just because a single library would like to ship multiple binaries optimized for different cpu variants within the same family. That's a multi-year effort, and not something I'm prepared to even consider undertaking.

Neil
[dpdk-dev] [PATCH 0/2] dpdk: Allow for dynamic enablement of some isolated features
On 8/1/2014 6:56 AM, Ananyev, Konstantin wrote: >> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Neil Horman >> Sent: Friday, August 01, 2014 2:37 PM >> To: Richardson, Bruce >> Cc: dev at dpdk.org >> Subject: Re: [dpdk-dev] [PATCH 0/2] dpdk: Allow for dynamic enablement of >> some isolated features >> >> On Thu, Jul 31, 2014 at 01:19:50PM -0700, Bruce Richardson wrote: >>> On Thu, Jul 31, 2014 at 03:01:17PM -0400, Neil Horman wrote: On Thu, Jul 31, 2014 at 11:36:32AM -0700, Bruce Richardson wrote: > I think a good first step here that I can't see anyone objecting to is > to enable the ixgbe driver to use the vector code path for a generic > x86_64 build. I've run a quick test here, and changing "_mm_popcnt_u64" > to "__builtin_popcountll" [and the include from nmmintrin to tmmintrin] > allows a compile for machine type default, and testpmd can still forward > packets at a good rate (roughly perf down about 10% vs native compile on > SNB). > The ACL is a tougher nut to crack, but anyone see any issues with that > two-line change to ixgbe_rxtx_vec.c? [Neil, since you started the patch > set thread, do you want to submit an official patch here, or would you > prefer I > do so?] > I'm happy to do so, Though 10% performance degradation vs. using the sse4.2 instructions in that path seems significant, isn't it? Given that performance delta, it seems like it would still be preferable to have a path that used the sse4.2 instructions when they're available. Or am I misreading what you mean when you say down 10% Neil >>> Ok, I did a little bit more testing here. Using the vector pmd compiled >>> for generic x86_64 and using __builtin_popcountll is approx 35% faster >>> for packet IO than the existing fast-path functions. It is also 7% (a >>> bit lower than ~10% as I originally stated) slower than the existing >>> native-compiled vpmd on a Sandy Bridge platform. >>> >>> I then ran an extra test, using EXTRA_CFLAGS='-msse4.2' to turn on the >>> extra instructions. 
The ~7% performance drop went to ~3%, so we would >>> gain a little more with using SSE4.2, but compared to the gain from >>> having the vector driver at all, it's not that much. [I don't have a >>> system handy with AVX2 support to see what boosts might come from >>> compiling with that instruction set enabled.] >>> >>> Because of this, I'd take the ~35% speed boost for now, and try and find >>> what would be the best general way to solve this problem across all >>> libraries. Also, I think that anyone who needs that extra 4% performance >>> probably wants the other 3% too, and so will compile up the code from >>> source using the "native" compilation target. :-)

So if I read this right, the gain from the scalar fast path to the new "generic" vector implementation is 35%? That is a bit higher than anticipated, but great!! One caution: we should probably get a performance read on Atom cores before we remove the scalar fast path completely.

>> Wait a moment, I'm not entirely sure what you did here. I understand that you
>> replaced the _mm_popcnt_u64 call in the ixgbe pmd vector receive path with
>> __builtin_popcnt, which is good, but ixgbe also uses the __mm_shuffle_epi8
>> intrinsic which is only available with sse4.2 from what I can see. did you
>> replace those calls with a __builtin_shuffle variant? Otherwise, how did you
>> get the pmd to build? I'm asking because this is what I tried in the first pass
>> and Konstantin gave some pretty convicing evidence that this was an unworkable
>> solution:
>> http://dpdk.org/ml/archives/dev/2014-July/004443.html
>>
> I think that _mm_shuffle_epi8 (PSHUFB) is available starting from SSE3.
> So I presume, there is no need for replacement.
> Konstantin

The change is really to keep the _mm_shuffle_epi8 and replace the _mm_popcnt_u64 with the builtin variant. That should allow compilation all the way up from SSSE3.
[dpdk-dev] [PATCH 0/2] dpdk: Allow for dynamic enablement of some isolated features
On Fri, Aug 01, 2014 at 01:56:24PM +, Ananyev, Konstantin wrote: > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Neil Horman > > Sent: Friday, August 01, 2014 2:37 PM > > To: Richardson, Bruce > > Cc: dev at dpdk.org > > Subject: Re: [dpdk-dev] [PATCH 0/2] dpdk: Allow for dynamic enablement of > > some isolated features > > > > On Thu, Jul 31, 2014 at 01:19:50PM -0700, Bruce Richardson wrote: > > > On Thu, Jul 31, 2014 at 03:01:17PM -0400, Neil Horman wrote: > > > > On Thu, Jul 31, 2014 at 11:36:32AM -0700, Bruce Richardson wrote: > > > > > > > > > > I think a good first step here that I can't see anyone objecting to is > > > > > to enable the ixgbe driver to use the vector code path for a generic > > > > > x86_64 build. I've run a quick test here, and changing > > > > > "_mm_popcnt_u64" > > > > > to "__builtin_popcountll" [and the include from nmmintrin to > > > > > tmmintrin] > > > > > allows a compile for machine type default, and testpmd can still > > > > > forward > > > > > packets at a good rate (roughly perf down about 10% vs native compile > > > > > on > > > > > SNB). > > > > > The ACL is a tougher nut to crack, but anyone see any issues with that > > > > > two-line change to ixgbe_rxtx_vec.c? [Neil, since you started the > > > > > patch > > > > > set thread, do you want to submit an official patch here, or would > > > > > you prefer I > > > > > do so?] > > > > > > > > > > > > > I'm happy to do so, Though 10% performance degradation vs. using the > > > > sse4.2 > > > > instructions in that path seems significant, isn't it? Given that > > > > performance > > > > delta, it seems like it would still be preferable to have a path that > > > > used the > > > > sse4.2 instructions when they're available. Or am I misreading what > > > > you mean > > > > when you say down 10% > > > > > > > > Neil > > > > > > > Ok, I did a little bit more testing here. 
Using the vector pmd compiled > > > for generic x86_64 and using __builtin_popcountll is approx 35% faster > > > for packet IO than the existing fast-path functions. It is also 7% (a > > > bit lower than ~10% as I originally stated) slower than the existing > > > native-compiled vpmd on a Sandy Bridge platform. > > > > > > I then ran an extra test, using EXTRA_CFLAGS='-msse4.2' to turn on the > > > extra instructions. The ~7% performance drop went to ~3%, so we would > > > gain a little more with using SSE4.2, but compared to the gain from > > > having the vector driver at all, it's not that much. [I don't have a > > > system handy with AVX2 support to see what boosts might come from > > > compiling with that instruction set enabled.] > > > > > > Because of this, I'd take the ~35% speed boost for now, and try and find > > > what would be the best general way to solve this problem across all > > > libraries. Also, I think that anyone who needs that extra 4% performance > > > probably wants the other 3% too, and so will compile up the code from > > > source using the "native" compilation target. :-) > > > > > > > > > Wait a moment, I'm not entirely sure what you did here. I understand that > > you > > replaced the _mm_popcnt_u64 call in the ixgbe pmd vector receive path with > > __builtin_popcnt, which is good, but ixgbe also uses the __mm_shuffle_epi8 > > intrinsic which is only available with sse4.2 from what I can see. did you > > replace those calls with a __builtin_shuffle variant? Otherwise, how did > > you > > get the pmd to build? I'm asking because this is what I tried in the first > > pass > > and Konstantin gave some pretty convicing evidence that this was an > > unworkable > > solution: > > http://dpdk.org/ml/archives/dev/2014-July/004443.html > > > > I think that _mm_shuffle_epi8 (PSHUFB) is available starting from SSE3. > So I presume, there is no need for replacement.
Ah, I see, it's just because we're using the nmmintrin.h header.
We need to replace the popcount instruction and change the include header to tmmintrin.h to avoid the #error from the failed sse4.2 check.

Thanks!
Neil

> Konstantin
[dpdk-dev] [PATCH] virtio: Fix 2 compilation issues in virtio PMD
2014-07-24 12:57, Ouyang Changchun: > Fix 2 compilation issues in virtio PMD when dump option is enabled. > > Signed-off-by: Changchun Ouyang Acked-by: Thomas Monjalon Applied for version 1.7.1. Thanks -- Thomas
[dpdk-dev] [PATCH] vmxnet3: initialize receive mode correctly
2014-07-25 10:50, Stephen Hemminger:
> The driver must listen to broadcast packets, like other devices.
> Otherwise protocols like ARP won't work!
>
> Signed-off-by: Stephen Hemminger

> -	vmxnet3_dev_set_rxmode(hw, VMXNET3_RXM_UCAST | VMXNET3_RXM_ALL_MULTI, 1);
> +	vmxnet3_dev_set_rxmode(hw, VMXNET3_RXM_UCAST | VMXNET3_RXM_BCAST, 1);

It's also removing multicast at init. No comment so I assume everybody agrees.

Acked-by: Thomas Monjalon

Applied for version 1.7.1.

Thanks
-- 
Thomas
[dpdk-dev] [PATCH 0/2] dpdk: Allow for dynamic enablement of some isolated features
On 01/08/2014 16:06, Neil Horman wrote:
> Thats a multi year effort, and not something I'm prepared to even
> consider undertaking.

Sorry: I am not pushing you, it was just an open comment. I do agree that it is a multi-year effort to get it agreed on by a wide community. The DPDK community cannot manage it.
[dpdk-dev] [PATCH 0/2] dpdk: Allow for dynamic enablement of some isolated features
On Thu, Jul 31, 2014 at 01:25:06PM -0700, Bruce Richardson wrote: > On Thu, Jul 31, 2014 at 04:10:18PM -0400, Neil Horman wrote: > > On Thu, Jul 31, 2014 at 11:36:32AM -0700, Bruce Richardson wrote: > > > Thu, Jul 31, 2014 at 02:10:32PM -0400, Neil Horman wrote: > > > > On Thu, Jul 31, 2014 at 10:32:28AM -0400, Neil Horman wrote: > > > > > On Thu, Jul 31, 2014 at 03:26:45PM +0200, Thomas Monjalon wrote: > > > > > > 2014-07-31 09:13, Neil Horman: > > > > > > > On Wed, Jul 30, 2014 at 02:09:20PM -0700, Bruce Richardson wrote: > > > > > > > > On Wed, Jul 30, 2014 at 03:28:44PM -0400, Neil Horman wrote: > > > > > > > > > On Wed, Jul 30, 2014 at 11:59:03AM -0700, Bruce Richardson > > > > > > > > > wrote: > > > > > > > > > > On Tue, Jul 29, 2014 at 04:24:24PM -0400, Neil Horman wrote: > > > > > > > > > > > Hey all- > > > > > > With regards to the general approach for runtime detection of software > > > functions, I wonder if something like this can be handled by the > > > packaging system? Is it possible to ship out a set of shared libs > > > compiled up for different instruction sets, and then at rpm install > > > time, symlink the appropriate library? This would push the whole issue > > > of detection of code paths outside of code, work across all our > > > libraries and ensure each user got the best performance they could get > > > form a binary? > > > Has something like this been done before? The building of all the > > > libraries could be scripted easy enough, just do multiple builds using > > > different EXTRA_CFLAGS each time, and move and rename the .so's after > > > each run. > > > > > > > Sorry, I missed this in my last reply. > > > > In answer to your question, the short version is that such a thing is > > roughly > > possible from a packaging standpoint, but completely unworkable from a > > distribution standpoint. 
We could certainly build the dpdk multiple times > > and > > rename all the shared objects to some variant name representative of the > > optimzations we build in for certain cpu flags, but then we woudl be > > shipping X > > versions of the dpdk, and any appilcation (say OVS that made use of the dpdk > > would need to provide a version linked against each variant to be useful > > when > > making a product, and each end user would need to manually select (or run a > > script to select) which variant is most optimized for the system at hand. > > Its > > just not a reasonable way to package a library. > > Sorry, perhaps I was not clear, having the user have to select the > appropriate library was not what I was suggesting. Instead, I was > suggesting that the rpm install "librte_pmd_ixgbe.so.generic", > "librte_pmd_ixgbe.so.sse42" and "librte_pmd_ixgbe.so.avx". Then the rpm > post-install script would look at the cpuflags in cpuinfo and then > symlink librte_pmd_ixgbe.so to the best-match version. That way the user > only has to link against "librte_pmd_ixgbe.so" and depending on the > system its run on, the loader will automatically resolve the symbols > from the appropriate instruction-set specific .so file. >
This is an absolute packaging nightmare; it will potentially break all sorts of corner cases and support processes. To cite a few examples:

1) Upgrade support - What if the minimum cpu requirements for dpdk are advanced at some point in the future? The above strategy has no way to know that a given update has more advanced requirements than a previous update, and when the update is installed, the previously linked library for the old base will disappear, leaving broken applications behind.

2) Debugging - It's going to be near impossible to support an application built with a package put together this way, because you'll never be sure as to which version of the library was running when the crash occurred.
You can figure it out for certain, but for support/development people to need to remember to figure this out is going to be a major turn-off for them, and the result will be that they simply won't use the dpdk. It's anathema to the expectations of linux user space.

3) QA - Building multiple versions of a library means needing to QA multiple versions of a library. If you have to have 4 builds to support different levels of optimization, you've created a 4x increase in the amount of testing you need to do to ensure consistent behavior. You need to be aware of how many different builds are available in the single rpm at all times, and find systems on which to QA which will ensure that all of the builds get tested (as they are, in fact, unique builds). While you may not hit all code paths in a single build, you will at least test all the common paths.

The bottom line is that distribution packaging is all about consistency and commonality. If you install something for an arch on multiple systems, it's the same thing on each system, and it works in the same way, all the time. This strategy breaks that. That's why we do run time checks for things.

Neil
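As a rough illustration of the run time check approach mentioned above, here is a minimal sketch that picks a popcount implementation once at startup. It assumes GCC or Clang on x86 (for `__builtin_cpu_init`, `__builtin_cpu_supports` and the `target` attribute); the names `popcount_fn`, `popcount_baseline`, `popcount_hw` and `select_popcount` are made up for this example and are not DPDK APIs.

```c
#include <stdint.h>

/* Hypothetical function-pointer type for the selected implementation. */
typedef unsigned (*popcount_fn)(uint64_t);

/* Baseline build: without -mpopcnt the compiler expands this
 * builtin to a software routine that runs on any x86_64 CPU. */
static unsigned popcount_baseline(uint64_t v)
{
	return (unsigned)__builtin_popcountll(v);
}

#if defined(__x86_64__) || defined(__i386__)
/* Same source, but this one function is compiled with POPCNT enabled,
 * so it must only be called after the CPU check below succeeds. */
static __attribute__((target("popcnt"))) unsigned popcount_hw(uint64_t v)
{
	return (unsigned)__builtin_popcountll(v);
}
#endif

/* Run time check: one binary, the best code path chosen per machine. */
static popcount_fn select_popcount(void)
{
#if defined(__x86_64__) || defined(__i386__)
	__builtin_cpu_init();
	if (__builtin_cpu_supports("popcnt"))
		return popcount_hw;
#endif
	return popcount_baseline;
}
```

An application would call `select_popcount()` once during init and keep the returned pointer, so the same package behaves identically from the packager's point of view while still using the faster instruction where it exists.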
[dpdk-dev] [PATCH 0/2] dpdk: Allow for dynamic enablement of some isolated features
On Fri, Aug 01, 2014 at 04:57:56PM +0200, Vincent JARDIN wrote:
> On 01/08/2014 16:06, Neil Horman wrote:
> >Thats a multi year effort, and not something I'm prepared to even
> >consider undertaking.
>
> Sorry: I am not pushing you, it was just an open comment. I do agree that it
> is a multi year effort to get it down into a wide "agreed" community. DPDK
> community cannot manage it.
>
I understand, but I still don't think it makes sense to do. The fat ELF format was originally intended to allow transparent movement between major architectures. While it certainly could be used to migrate between a smaller granularity within the same arch family, it seems like a lousy tradeoff to me, especially as the fanout for variants increases. Right now I think there are at least 4 variants within the dpdk (sse3, sse4.2, avx, and avx512). That's a 4x increase in binary size. While that might make sense for one highly optimized application, it doesn't seem like it would make sense as a general purpose utility, as the disk space / speedup tradeoff is not super great.

Neil
[dpdk-dev] [PATCH 0/2] link bonding unit test fix
2014-07-22 10:57, De Lara Guarch, Pablo:
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Declan Doherty
> >
> > fix for link bonding unit tests which are failing due to a change introduced in
> > rte_eth_dev_configure, which now explicitly checks if the pmd supports link status
> > change interrupts.
> >
> > Also adds some common test macros to the test application for unit testing
> >
> > Declan Doherty (2):
> >   test app:
> >   bond: unit test suite fix
>
> Acked-by: Pablo de Lara

Applied for version 1.7.1.

> I would add that it may be a good idea to implement some of the changes made
> in this unit test in other ones (use the new macros).

Yes, it's cleaner with macros.
It would be a good idea to translate existing unit tests like Declan did.

Thanks
-- 
Thomas
[dpdk-dev] [PATCH] vmxnet3: initialize receive mode correctly
On Fri, 01 Aug 2014 16:50:06 +0200 Thomas Monjalon wrote: > 2014-07-25 10:50, Stephen Hemminger: > > The driver must listen to broadcast packets, like other devices. > > Otherwise protocols like ARP won't work! > > > > Signed-off-by: Stephen Hemminger > > > - vmxnet3_dev_set_rxmode(hw, VMXNET3_RXM_UCAST | VMXNET3_RXM_ALL_MULTI, > > 1); > > + vmxnet3_dev_set_rxmode(hw, VMXNET3_RXM_UCAST | VMXNET3_RXM_BCAST, 1); > > It's also removing multicast at init. No comment so I assume everybody agrees. Just following what initial value for bare metal drivers is.
[dpdk-dev] dpdk-1.7.0 bug report
2014-07-23 14:24, David Binderman:
> [dpdk-1.7.0/lib/librte_ether/rte_ether.h:208]: (style) Expression '(X & 0x2) == 0x1' is always false.
>
> Source code is
>
> return ((ea->addr_bytes[0] & ETHER_LOCAL_ADMIN_ADDR) == 1);
>
> but
>
> #define ETHER_LOCAL_ADMIN_ADDR 0x02 /**< Locally assigned Eth. address. */

It's now fixed:
http://dpdk.org/browse/dpdk/commit/?id=030df0102ce762360

Thanks
-- 
Thomas
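For readers following along, the reported problem can be reproduced in isolation. The sketch below is only an illustration and is not taken from the referenced commit: `struct eth_addr_example` and both helper names are hypothetical stand-ins, and only the `ETHER_LOCAL_ADMIN_ADDR` value comes from the report.

```c
#include <stdint.h>

#define ETHER_LOCAL_ADMIN_ADDR 0x02 /* locally assigned bit, as quoted in the report */

/* Hypothetical stand-in for the DPDK address struct, so this compiles alone. */
struct eth_addr_example {
	uint8_t addr_bytes[6];
};

/* Reported form: (x & 0x02) is either 0x00 or 0x02, so "== 1" is never true. */
static inline int is_local_admin_buggy(const struct eth_addr_example *ea)
{
	return (ea->addr_bytes[0] & ETHER_LOCAL_ADMIN_ADDR) == 1;
}

/* One possible correction: test whether the masked bit is set at all. */
static inline int is_local_admin_fixed(const struct eth_addr_example *ea)
{
	return (ea->addr_bytes[0] & ETHER_LOCAL_ADMIN_ADDR) != 0;
}
```

With a first octet of 0x02 the buggy helper still returns 0, which is exactly what the static checker flagged.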
[dpdk-dev] VMDq + DCB: 128 Tx queues
Thanks Liu for the pointer to the function implemented in testpmd.c.

I just want to know, for the configured RX pools, what scheduling method is used in polling the queues.

-Sunil

On Fri, Aug 01, 2014 at 1:11 PM, Liu, Jijiang <jijiang.liu at intel.com> wrote:

Yes, if you want TX to be configured in DCB mode, 128 TX queues are needed. In the testpmd code, there is an example of how to use 128 RX queues and 128 TX queues simultaneously in vmdq+dcb mode.
The example function is get_eth_dcb_conf() in the testpmd.c file.

BRs,
Jijiang Liu

-----Original Message-----
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Sunil Bojanapally
Sent: Friday, August 01, 2014 1:13 PM
To: dev at dpdk.org
Subject: [dpdk-dev] VMDq + DCB: 128 Tx queues

Hi team,

As per the DPDK programming guide, VMDq+DCB will configure each Ethernet port with 16 pools of 8 queues each, which means each port will have 128 Rx and Tx queues.

The question is: in order to have end-to-end QoS support, should the port be configured with 128 Rx as well as 128 Tx queues?

Note: I am considering the same port for Rx and Tx.

Thanks,
Sunil
[dpdk-dev] [PATCH] l3fwd improve grouping by destination port a bit
2014-07-22 17:04, Konstantin Ananyev:
> Latest changes introduced a small degradation for the corner case
> when each input packet is destined to a different port.
> For the test case where 1 core manages 4 ports and the packet stream looks like:
> IPV4_DSTPORT0, IPV4_DSTPORT1, IPV4_DSTPORT3, IPV4_DSTPORT4, IPV4_DSTPORT0, ...
> the non-optimised code path outperforms the optimised one by 2-3%.
> These changes are supposed to close that gap.
> From my testing: for the case described above, the optimised code path now
> produces the same numbers as the non-optimised one.
> For other test cases the numbers remain about the same.
>
> Signed-off-by: Konstantin Ananyev

There was no comment about this patch so it's now applied for version 1.7.1.

Thanks
-- 
Thomas
[dpdk-dev] [PATCH] ixgbe: Reduce compilation to only require sse3 intrinsics
ixgbe was failing to build in the default configuration because it required
sse4.2 intrinsics, and the default config doesn't support more than sse3.
Modify the pmd so that only sse3 intrinsics are pulled in and used.

Signed-off-by: Neil Horman
CC: "Konstantin Ananyev"
CC: Bruce Richardson
CC: Thomas Monjalon
---
 lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c b/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
index 09e19a3..fe39ca2 100644
--- a/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
+++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
@@ -38,7 +38,7 @@
 #include "ixgbe_ethdev.h"
 #include "ixgbe_rxtx.h"
 
-#include <nmmintrin.h>
+#include <tmmintrin.h>
 
 #ifndef __INTEL_COMPILER
 #pragma GCC diagnostic ignored "-Wcast-qual"
@@ -338,7 +338,7 @@ ixgbe_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
 			pkt_mb1);
 
 		/* C.4 calc avaialbe number of desc */
-		var = _mm_popcnt_u64(_mm_cvtsi128_si64(staterr));
+		var = __builtin_popcountll(_mm_cvtsi128_si64(staterr));
 		nb_pkts_recd += var;
 		if (likely(var != RTE_IXGBE_DESCS_PER_LOOP))
 			break;
-- 
1.8.3.1
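To make the popcount substitution in this patch concrete, here is a small standalone sketch of counting set bits in a 64-bit word without relying on the SSE4.2 intrinsic. The helper names `popcount64` and `popcount64_portable` are invented for the example; `__builtin_popcountll` is the GCC/Clang builtin used in the patch, and the bit-clearing loop is just one common fallback for other compilers, not something the patch itself contains.

```c
#include <stdint.h>

/* Portable fallback: clear the lowest set bit until none remain. */
static inline unsigned popcount64_portable(uint64_t v)
{
	unsigned n = 0;

	while (v) {
		v &= v - 1; /* drops exactly one set bit per iteration */
		n++;
	}
	return n;
}

/* Prefer the compiler builtin where available; it needs no special
 * -m flags and the compiler picks an instruction sequence that is
 * valid for whatever target the file is being built for. */
static inline unsigned popcount64(uint64_t v)
{
#if defined(__GNUC__) || defined(__clang__)
	return (unsigned)__builtin_popcountll(v);
#else
	return popcount64_portable(v);
#endif
}
```

When the file is built with `-msse4.2` or `-mpopcnt`, the builtin normally becomes a single POPCNT instruction, which fits with the small gap Bruce measured between the generic and the SSE4.2 builds.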
[dpdk-dev] [PATCH] kni: fix missing backslash in Makefile
With GNU Make 3.81 on Ubuntu 14.04, I get:

lib/librte_eal/linuxapp/kni/Makefile:49: *** unterminated call to function `shell': missing `)'. Stop.

Signed-off-by: Julien Cretin
---
 lib/librte_eal/linuxapp/kni/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_eal/linuxapp/kni/Makefile b/lib/librte_eal/linuxapp/kni/Makefile
index 4b5d873..2799191 100644
--- a/lib/librte_eal/linuxapp/kni/Makefile
+++ b/lib/librte_eal/linuxapp/kni/Makefile
@@ -46,7 +46,7 @@ MODULE_CFLAGS += -Wall -Werror
 
 ifeq ($(shell lsb_release -si 2>/dev/null),Ubuntu)
 MODULE_CFLAGS += -DUBUNTU_RELEASE_CODE=$(shell lsb_release -sr | tr -d .)
-UBUNTU_KERNEL_CODE := $(shell cut -d' ' -f2 /proc/version_signature |
+UBUNTU_KERNEL_CODE := $(shell cut -d' ' -f2 /proc/version_signature | \
 cut -d- -f1,2 | tr .- $(comma))
 MODULE_CFLAGS += -D"UBUNTU_KERNEL_CODE=UBUNTU_KERNEL_VERSION($(UBUNTU_KERNEL_CODE))"
 endif
-- 
1.9.1
[dpdk-dev] VMDq + DCB: 128 Tx queues
> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Sunil Bojanapally
> Sent: Friday, August 01, 2014 9:13 AM
> To: Liu, Jijiang
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] VMDq + DCB: 128 Tx queues
>
> Thanks Liu for the pointer to function implemented in testpmd.c
>
> Just want to know in RX configured pools, what scheduling method is used in
> polling the queues.

How the queues are polled in vmdq+dcb mode is entirely up to the application, as the queue id passed into the rx_burst function refers directly to one of the hardware queues to be read. There is no behind-the-scenes magic or prioritization of packets being done; the app knows best how it wants the packets to be read and processed.

Regards,
/Bruce

>
> -Sunil
>
> On Fri, Aug 01, 2014 at 1:11 PM, Liu, Jijiang <jijiang.liu at intel.com> wrote:
>
> Yes, if you hope TX is configured DCB mode, and 128 TX queues are needed. In
> testpmd codes, there is an example how to use 128 RX queue and 128 TX queue
> simultaneously in vmdq+dcb mode.
> The example function is get_eth_dcb_conf() in testpmd.c file.
>
> BRs,
> Jijiang Liu
>
> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Sunil Bojanapally
> Sent: Friday, August 01, 2014 1:13 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] VMDq + DCB: 128 Tx queues
>
> Hi team,
>
> As per dpdk programming guide on VMDq+DCB will configure each Ethernet port
> to 16 pools with 8 queues each. Which means per port will have 128 Rx & Tx
> queues.
>
> The question is, in order to have end 2 end QoS support the port should get
> configured with 128 Rx as well as 128 Tx queues ?
>
> Note: I am considering same port for Rx & Tx.
>
> Thanks,
> Sunil
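Since the polling order is left to the application, one simple choice is a plain round-robin pass over all 128 queues. The sketch below only illustrates that idea and deliberately does not link against DPDK: `rx_burst_fn` is a hypothetical callback with roughly the shape of an rx-burst call (port, queue, buffer array, maximum count), and `fake_rx` is a stub that exists purely so the example can run on its own.

```c
#include <stdint.h>

#define NB_POOLS        16
#define QUEUES_PER_POOL 8
#define NB_QUEUES       (NB_POOLS * QUEUES_PER_POOL) /* 128 queues per port */
#define BURST           32

/* Hypothetical stand-in for an rx-burst style call: returns how many
 * packets were read from the given (port, queue) into pkts[]. */
typedef uint16_t (*rx_burst_fn)(uint16_t port, uint16_t queue,
				void **pkts, uint16_t max);

/* One round-robin pass over every hardware queue of a port.
 * per_queue[] accumulates packet counts; the total is returned. */
static unsigned poll_all_queues(rx_burst_fn rx, uint16_t port,
				unsigned per_queue[NB_QUEUES])
{
	void *pkts[BURST];
	unsigned total = 0;

	for (uint16_t q = 0; q < NB_QUEUES; q++) {
		uint16_t n = rx(port, q, pkts, BURST);

		/* A real application would process pkts[0..n-1] here. */
		per_queue[q] += n;
		total += n;
	}
	return total;
}

/* Demo stub only: pretends the first queue of each pool holds one packet. */
static uint16_t fake_rx(uint16_t port, uint16_t queue,
			void **pkts, uint16_t max)
{
	(void)port;
	(void)pkts;
	(void)max;
	return (queue % QUEUES_PER_POOL == 0) ? 1 : 0;
}
```

An application that cares about priorities could instead visit the queues of higher traffic classes more often, or assign groups of queues to different cores; nothing in the receive call itself imposes an order.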
[dpdk-dev] [PATCH] ixgbe: Reduce compilation to only require sse3 intrinsics
> -----Original Message-----
> From: Neil Horman [mailto:nhorman at tuxdriver.com]
> Sent: Friday, August 01, 2014 9:49 AM
> To: dev at dpdk.org
> Cc: Neil Horman; Ananyev, Konstantin; Richardson, Bruce; Thomas Monjalon
> Subject: [PATCH] ixgbe: Reduce compilation to only require sse3 intrinsics
>
> ixgbe was failing to build in the default configuration because it required
> sse4.2 intrinsics, and the default config doesn't support more than sse3.
> Modify the pmd so that only sse3 intrinsics are pulled in and used.
>
> Signed-off-by: Neil Horman
> CC: "Konstantin Ananyev"
> CC: Bruce Richardson
> CC: Thomas Monjalon
> ---
>  lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
> b/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
> index 09e19a3..fe39ca2 100644
> --- a/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
> +++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
> @@ -38,7 +38,7 @@
>  #include "ixgbe_ethdev.h"
>  #include "ixgbe_rxtx.h"
>
> -#include <nmmintrin.h>
> +#include <tmmintrin.h>
>
>  #ifndef __INTEL_COMPILER
>  #pragma GCC diagnostic ignored "-Wcast-qual"
> @@ -338,7 +338,7 @@ ixgbe_recv_pkts_vec(void *rx_queue, struct rte_mbuf
> **rx_pkts,
>  			pkt_mb1);
>
>  		/* C.4 calc avaialbe number of desc */
> -		var = _mm_popcnt_u64(_mm_cvtsi128_si64(staterr));
> +		var = __builtin_popcountll(_mm_cvtsi128_si64(staterr));
>  		nb_pkts_recd += var;
>  		if (likely(var != RTE_IXGBE_DESCS_PER_LOOP))
>  			break;
> --
> 1.8.3.1

Acked-by: Bruce Richardson
[dpdk-dev] Debugging EAL PCI / Driver Init
Hello, I am running into a problem where Eth driver init works fine in a sample app and finds my NICs, and the NICs appear in rte_eal_pci_dump(stdout) but they don't show up in rte_eth_dev_count() even after rte_eal_pci_probe() is called the same as the sample apps, so my app won't boot. I have a lot of experience using the older versions of the DPDK where you had to call the PMD init functions manually but no experience with the later versions where the DPDK is supposed to init the PMDs itself automatically. What do I have to do to dump the most possible debug output on why the driver list for my PCI devices always seems empty? Any places I should look to see the issue? Maybe I didn't link it together with the right DPDK libs? I used the combined DPDK static lib libintel_dpdk.a to make things simpler as I had seen recommended in various places. Thanks, Matthew.
[dpdk-dev] [PATCH 0/2] dpdk: Allow for dynamic enablement of some isolated features
On Fri, Aug 01, 2014 at 11:06:29AM -0400, Neil Horman wrote:
> On Thu, Jul 31, 2014 at 01:25:06PM -0700, Bruce Richardson wrote:
> > On Thu, Jul 31, 2014 at 04:10:18PM -0400, Neil Horman wrote:
> > > On Thu, Jul 31, 2014 at 11:36:32AM -0700, Bruce Richardson wrote:
> > > > Thu, Jul 31, 2014 at 02:10:32PM -0400, Neil Horman wrote:
> > > > > On Thu, Jul 31, 2014 at 10:32:28AM -0400, Neil Horman wrote:
> > > > > > On Thu, Jul 31, 2014 at 03:26:45PM +0200, Thomas Monjalon wrote:
> > > > > > > 2014-07-31 09:13, Neil Horman:
> > > > > > > > On Wed, Jul 30, 2014 at 02:09:20PM -0700, Bruce Richardson wrote:
> > > > > > > > > On Wed, Jul 30, 2014 at 03:28:44PM -0400, Neil Horman wrote:
> > > > > > > > > > On Wed, Jul 30, 2014 at 11:59:03AM -0700, Bruce Richardson wrote:
> > > > > > > > > > > On Tue, Jul 29, 2014 at 04:24:24PM -0400, Neil Horman wrote:
> > > > > > > > > > > > Hey all-
> > > >
> > > > With regards to the general approach for runtime detection of
> > > > software functions, I wonder if something like this can be handled
> > > > by the packaging system? Is it possible to ship out a set of shared
> > > > libs compiled up for different instruction sets, and then at rpm
> > > > install time, symlink the appropriate library? This would push the
> > > > whole issue of detection of code paths outside of code, work across
> > > > all our libraries and ensure each user got the best performance
> > > > they could get from a binary?
> > > > Has something like this been done before? The building of all the
> > > > libraries could be scripted easily enough, just do multiple builds
> > > > using different EXTRA_CFLAGS each time, and move and rename the
> > > > .so's after each run.
> > >
> > > Sorry, I missed this in my last reply.
> > >
> > > In answer to your question, the short version is that such a thing
> > > is roughly possible from a packaging standpoint, but completely
> > > unworkable from a distribution standpoint. We could certainly build
> > > the dpdk multiple times and rename all the shared objects to some
> > > variant name representative of the optimizations we build in for
> > > certain cpu flags, but then we would be shipping X versions of the
> > > dpdk, and any application (say OVS) that made use of the dpdk would
> > > need to provide a version linked against each variant to be useful
> > > when making a product, and each end user would need to manually
> > > select (or run a script to select) which variant is most optimized
> > > for the system at hand. It's just not a reasonable way to package a
> > > library.
> >
> > Sorry, perhaps I was not clear, having the user have to select the
> > appropriate library was not what I was suggesting. Instead, I was
> > suggesting that the rpm install "librte_pmd_ixgbe.so.generic",
> > "librte_pmd_ixgbe.so.sse42" and "librte_pmd_ixgbe.so.avx". Then the rpm
> > post-install script would look at the cpuflags in cpuinfo and then
> > symlink librte_pmd_ixgbe.so to the best-match version. That way the user
> > only has to link against "librte_pmd_ixgbe.so" and depending on the
> > system it's run on, the loader will automatically resolve the symbols
> > from the appropriate instruction-set specific .so file.
> >
>
> This is an absolute packaging nightmare, it will potentially break all
> sorts of corner cases, and support processes. To cite a few examples:
>
> 1) Upgrade support - What if the minimum cpu requirements for dpdk are
> advanced at some point in the future? The above strategy has no way to
> know that a given update has more advanced requirements than a previous
> update, and when the update is installed, the previously linked library
> for the old base will disappear, leaving broken applications behind.

Firstly, I didn't know we could actually specify minimum cpu requirements for packaging, that is something that could be useful :-)
Secondly, what is the normal case for handling something like this, where an upgrade has enhanced requirements compared to the previous version? Presumably you either need to prevent the upgrade from happening or else accept a broken app. Can the same mechanism not also be used to prevent upgrades using a multi-lib scheme?

>
> 2) Debugging - It's going to be near impossible to support an application
> built with a package put together this way, because you'll never be sure
> as to which version of the library was running when the crash occurred.
> You can figure it out for certain, but for support/development people to
> need to remember to figure this out is going to be a major turn off for
> them, and the result will be that they simply won't use the dpdk. It's
> anathema to the expectations of linux user space.

Sorry, I just don't see this as being any harder to support than multiple code paths for the same functionality. In fact, it will surely make debugging easier, since you only have
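The install-time selection proposed in the quoted text (look at the cpu flags in cpuinfo, then symlink the best-match library) can be sketched roughly as follows. This is only an illustration under stated assumptions: the helper names has_flag() and pick_variant() are hypothetical, the variant suffixes are the ones named in the thread, and the flag tokens "avx" and "sse4_2" are the spellings Linux uses on the "flags" line of /proc/cpuinfo. It is not taken from any DPDK patch or rpm scriptlet.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return 1 if `flag` appears as a whole
 * whitespace-separated token in `flags` (so "avx" does not match "avx2").
 */
static int has_flag(const char *flags, const char *flag)
{
    size_t n = strlen(flag);
    const char *p = flags;

    while ((p = strstr(p, flag)) != NULL) {
        int starts = (p == flags) || p[-1] == ' ' || p[-1] == '\t';
        int ends = p[n] == '\0' || p[n] == ' ' || p[n] == '\n';

        if (starts && ends)
            return 1;
        p += n; /* keep scanning after this partial match */
    }
    return 0;
}

/*
 * Hypothetical helper: map a cpuinfo "flags" string to the suffix of the
 * library variant a post-install script would symlink, best match first.
 */
static const char *pick_variant(const char *flags)
{
    if (has_flag(flags, "avx"))
        return "avx";
    if (has_flag(flags, "sse4_2"))
        return "sse42";
    return "generic";
}
```

A post-install step would then link librte_pmd_ixgbe.so to "librte_pmd_ixgbe.so." followed by the returned suffix. The objections raised in the thread (upgrades and debugging) apply to that symlink step, not to the flag matching itself.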
[dpdk-dev] [PATCH] kni: fix missing backslash in Makefile
Hi Julien,

2014-08-01 18:56, Julien Cretin:
> With GNU Make 3.81 on Ubuntu 14.04, I get:
> lib/librte_eal/linuxapp/kni/Makefile:49: *** unterminated call to function
> `shell': missing `)'. Stop.
>
> Signed-off-by: Julien Cretin
> -UBUNTU_KERNEL_CODE := $(shell cut -d' ' -f2 /proc/version_signature |
> +UBUNTU_KERNEL_CODE := $(shell cut -d' ' -f2 /proc/version_signature | \
>  cut -d- -f1,2 | tr .- $(comma))

My fault, I forgot the backslash when splitting the line.

Acked-by: Thomas Monjalon

Applied. Thanks for the quick test and report.

--
Thomas
[dpdk-dev] [PATCH 0/2] dpdk: Allow for dynamic enablement of some isolated features
On Fri, Aug 01, 2014 at 12:22:22PM -0700, Bruce Richardson wrote:
> On Fri, Aug 01, 2014 at 11:06:29AM -0400, Neil Horman wrote:
> > On Thu, Jul 31, 2014 at 01:25:06PM -0700, Bruce Richardson wrote:
> > > On Thu, Jul 31, 2014 at 04:10:18PM -0400, Neil Horman wrote:
> > > > On Thu, Jul 31, 2014 at 11:36:32AM -0700, Bruce Richardson wrote:
> > > > > Thu, Jul 31, 2014 at 02:10:32PM -0400, Neil Horman wrote:
> > > > > > On Thu, Jul 31, 2014 at 10:32:28AM -0400, Neil Horman wrote:
> > > > > > > On Thu, Jul 31, 2014 at 03:26:45PM +0200, Thomas Monjalon wrote:
> > > > > > > > 2014-07-31 09:13, Neil Horman:
> > > > > > > > > On Wed, Jul 30, 2014 at 02:09:20PM -0700, Bruce Richardson wrote:
> > > > > > > > > > On Wed, Jul 30, 2014 at 03:28:44PM -0400, Neil Horman wrote:
> > > > > > > > > > > On Wed, Jul 30, 2014 at 11:59:03AM -0700, Bruce Richardson wrote:
> > > > > > > > > > > > On Tue, Jul 29, 2014 at 04:24:24PM -0400, Neil Horman wrote:
> > > > > > > > > > > > > Hey all-
> > > > >
> > > > > With regards to the general approach for runtime detection of
> > > > > software functions, I wonder if something like this can be handled
> > > > > by the packaging system? Is it possible to ship out a set of
> > > > > shared libs compiled up for different instruction sets, and then
> > > > > at rpm install time, symlink the appropriate library? This would
> > > > > push the whole issue of detection of code paths outside of code,
> > > > > work across all our libraries and ensure each user got the best
> > > > > performance they could get from a binary?
> > > > > Has something like this been done before? The building of all the
> > > > > libraries could be scripted easily enough, just do multiple builds
> > > > > using different EXTRA_CFLAGS each time, and move and rename the
> > > > > .so's after each run.
> > > >
> > > > Sorry, I missed this in my last reply.
> > > >
> > > > In answer to your question, the short version is that such a thing
> > > > is roughly possible from a packaging standpoint, but completely
> > > > unworkable from a distribution standpoint. We could certainly build
> > > > the dpdk multiple times and rename all the shared objects to some
> > > > variant name representative of the optimizations we build in for
> > > > certain cpu flags, but then we would be shipping X versions of the
> > > > dpdk, and any application (say OVS) that made use of the dpdk would
> > > > need to provide a version linked against each variant to be useful
> > > > when making a product, and each end user would need to manually
> > > > select (or run a script to select) which variant is most optimized
> > > > for the system at hand. It's just not a reasonable way to package a
> > > > library.
> > >
> > > Sorry, perhaps I was not clear, having the user have to select the
> > > appropriate library was not what I was suggesting. Instead, I was
> > > suggesting that the rpm install "librte_pmd_ixgbe.so.generic",
> > > "librte_pmd_ixgbe.so.sse42" and "librte_pmd_ixgbe.so.avx". Then the
> > > rpm post-install script would look at the cpuflags in cpuinfo and
> > > then symlink librte_pmd_ixgbe.so to the best-match version. That way
> > > the user only has to link against "librte_pmd_ixgbe.so" and depending
> > > on the system it's run on, the loader will automatically resolve the
> > > symbols from the appropriate instruction-set specific .so file.
> > >
> >
> > This is an absolute packaging nightmare, it will potentially break all
> > sorts of corner cases, and support processes. To cite a few examples:
> >
> > 1) Upgrade support - What if the minimum cpu requirements for dpdk are
> > advanced at some point in the future? The above strategy has no way to
> > know that a given update has more advanced requirements than a previous
> > update, and when the update is installed, the previously linked library
> > for the old base will disappear, leaving broken applications behind.
>
> Firstly, I didn't know we could actually specify minimum cpu
> requirements for packaging, that is something that could be useful :-)

You misread my comment :). I didn't say we could specify minimum cpu requirements at packaging (you can't, beyond general arch), I said "what if the dpdk's cpu requirements were raised?". Completely different thing. Currently the default, lowest common denominator system that dpdk appears to build for is core2 (as listed in the old default config). What if at some point you raise those requirements and decide that SSE4.2 really is required to achieve maximum performance? Using the above strategy, any system that doesn't meet the new requirements will silently break on such an update. That's not acceptable.

> Secondly, what is the normal case for handling something like this,
> where an upgrade has enhanced requirements compared to the previous
> version? Presumably you either need
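For contrast with install-time library selection, the runtime detection that this thread's subject refers to can be sketched as a function pointer chosen once at startup. This is a minimal sketch, not the code from the patch set: sum_generic(), sum_sse42() and select_sum() are hypothetical names, the "SSE4.2" variant is a placeholder with the same body as the generic one, and the CPU check relies on the GCC/Clang builtins __builtin_cpu_init() and __builtin_cpu_supports(), which are assumed to be available only on x86 builds.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t (*sum_fn)(const uint8_t *buf, size_t len);

/* Baseline code path that runs on any CPU the binary targets. */
static uint32_t sum_generic(const uint8_t *buf, size_t len)
{
    uint32_t s = 0;

    for (size_t i = 0; i < len; i++)
        s += buf[i];
    return s;
}

/*
 * Placeholder for an instruction-set specific code path. A real variant
 * would use SSE4.2 instructions; here it only has to return the same
 * result as the generic path.
 */
static uint32_t sum_sse42(const uint8_t *buf, size_t len)
{
    return sum_generic(buf, len);
}

/* Choose the code path once, based on what the running CPU reports. */
static sum_fn select_sum(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("sse4.2"))
        return sum_sse42;
#endif
    return sum_generic;
}
```

With this shape there is a single library file on disk, so an update that adds a faster path cannot leave an older system without a usable fallback; the cost is that both code paths live in the same binary.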
[dpdk-dev] DPDK memory mechanism
Hello, everybody,

I am new to DPDK and have a question about it. Is the "Mbuf Pool" pinned to avoid being swapped out? I checked the source code and found there is an API called "rte_mem_lock_page", but it seems this API is never called. Am I missing something?

Thanks,
wenji
[dpdk-dev] [PATCH 0/2] dpdk: Allow for dynamic enablement of some isolated features
On Fri, Aug 01, 2014 at 04:43:52PM -0400, Neil Horman wrote:
> On Fri, Aug 01, 2014 at 12:22:22PM -0700, Bruce Richardson wrote:
> > On Fri, Aug 01, 2014 at 11:06:29AM -0400, Neil Horman wrote:
> > > On Thu, Jul 31, 2014 at 01:25:06PM -0700, Bruce Richardson wrote:
> > > > On Thu, Jul 31, 2014 at 04:10:18PM -0400, Neil Horman wrote:
> > > > > On Thu, Jul 31, 2014 at 11:36:32AM -0700, Bruce Richardson wrote:
> > > > > > Thu, Jul 31, 2014 at 02:10:32PM -0400, Neil Horman wrote:
> > > > > > > On Thu, Jul 31, 2014 at 10:32:28AM -0400, Neil Horman wrote:
> > > > > > > > On Thu, Jul 31, 2014 at 03:26:45PM +0200, Thomas Monjalon wrote:
> > > > > > > > > 2014-07-31 09:13, Neil Horman:
> > > > > > > > > > On Wed, Jul 30, 2014 at 02:09:20PM -0700, Bruce Richardson wrote:
> > > > > > > > > > > On Wed, Jul 30, 2014 at 03:28:44PM -0400, Neil Horman wrote:
> > > > > > > > > > > > On Wed, Jul 30, 2014 at 11:59:03AM -0700, Bruce Richardson wrote:
> > > > > > > > > > > > > On Tue, Jul 29, 2014 at 04:24:24PM -0400, Neil Horman wrote:
> > > > > > > > > > > > > > Hey all-
> > > > > >
> > > > > > With regards to the general approach for runtime detection of
> > > > > > software functions, I wonder if something like this can be
> > > > > > handled by the packaging system? Is it possible to ship out a
> > > > > > set of shared libs compiled up for different instruction sets,
> > > > > > and then at rpm install time, symlink the appropriate library?
> > > > > > This would push the whole issue of detection of code paths
> > > > > > outside of code, work across all our libraries and ensure each
> > > > > > user got the best performance they could get from a binary?
> > > > > > Has something like this been done before? The building of all
> > > > > > the libraries could be scripted easily enough, just do multiple
> > > > > > builds using different EXTRA_CFLAGS each time, and move and
> > > > > > rename the .so's after each run.
> > > > >
> > > > > Sorry, I missed this in my last reply.
> > > > >
> > > > > In answer to your question, the short version is that such a
> > > > > thing is roughly possible from a packaging standpoint, but
> > > > > completely unworkable from a distribution standpoint. We could
> > > > > certainly build the dpdk multiple times and rename all the shared
> > > > > objects to some variant name representative of the optimizations
> > > > > we build in for certain cpu flags, but then we would be shipping
> > > > > X versions of the dpdk, and any application (say OVS) that made
> > > > > use of the dpdk would need to provide a version linked against
> > > > > each variant to be useful when making a product, and each end
> > > > > user would need to manually select (or run a script to select)
> > > > > which variant is most optimized for the system at hand. It's just
> > > > > not a reasonable way to package a library.
> > > >
> > > > Sorry, perhaps I was not clear, having the user have to select the
> > > > appropriate library was not what I was suggesting. Instead, I was
> > > > suggesting that the rpm install "librte_pmd_ixgbe.so.generic",
> > > > "librte_pmd_ixgbe.so.sse42" and "librte_pmd_ixgbe.so.avx". Then the
> > > > rpm post-install script would look at the cpuflags in cpuinfo and
> > > > then symlink librte_pmd_ixgbe.so to the best-match version. That
> > > > way the user only has to link against "librte_pmd_ixgbe.so" and
> > > > depending on the system it's run on, the loader will automatically
> > > > resolve the symbols from the appropriate instruction-set specific
> > > > .so file.
> > > >
> > >
> > > This is an absolute packaging nightmare, it will potentially break
> > > all sorts of corner cases, and support processes. To cite a few
> > > examples:
> > >
> > > 1) Upgrade support - What if the minimum cpu requirements for dpdk
> > > are advanced at some point in the future? The above strategy has no
> > > way to know that a given update has more advanced requirements than a
> > > previous update, and when the update is installed, the previously
> > > linked library for the old base will disappear, leaving broken
> > > applications behind.
> >
> > Firstly, I didn't know we could actually specify minimum cpu
> > requirements for packaging, that is something that could be useful :-)
> You misread my comment :). I didn't say we could specify minimum cpu
> requirements at packaging (you can't, beyond general arch), I said "what
> if the dpdk's cpu requirements were raised?". Completely different thing.
> Currently the default, lowest common denominator system that dpdk appears
> to build for is core2 (as listed in the old default config). What if at
> some point you raise those requirements and decide that SSE4.2 really is
> required to achieve m
[dpdk-dev] VMDq + DCB: 128 Tx queues
Hi team,

As per the DPDK programming guide, VMDq+DCB configures each Ethernet port with 16 pools of 8 queues each, which means each port will have 128 Rx and 128 Tx queues.

The question is: in order to have end-to-end QoS support, should the port be configured with 128 Rx as well as 128 Tx queues?

Note: I am considering the same port for Rx and Tx.

Thanks,
Sunil
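The queue count in the question follows directly from the pool layout, which can be written out as a small sketch. The figures of 16 pools and 8 queues per pool come from the message above; the helper names and the contiguous "pool * 8 + traffic class" numbering are assumptions made for illustration and should be checked against the NIC datasheet and the driver before relying on them.

```c
/* Figures quoted in the message: 16 pools with 8 queues each. */
enum { NUM_POOLS = 16, QUEUES_PER_POOL = 8 };

/* Total queues per direction on one port: 16 * 8 = 128. */
static unsigned int total_queues(void)
{
    return NUM_POOLS * QUEUES_PER_POOL;
}

/*
 * Hypothetical numbering: assumes each pool owns a contiguous block of
 * queues, indexed by traffic class within the pool.
 */
static unsigned int queue_index(unsigned int pool, unsigned int tc)
{
    return pool * QUEUES_PER_POOL + tc;
}
```

Under that assumed numbering, pool 0 would own queues 0 to 7 and pool 15 would own queues 120 to 127.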
[dpdk-dev] [PATCH] i40e: support autoneg or force link speed
- i40e force link up/down
- i40e autoneg/force speed

Signed-off-by: Cunming Liang
Acked-by: Helin Zhang
Acked-by: Chen Jing D(Mark)
Tested-by: Xu HuilongX
---
 app/test-pmd/cmdline.c            |  17 +++--
 lib/librte_pmd_i40e/i40e_ethdev.c | 139 ++
 2 files changed, 150 insertions(+), 6 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 345be11..0abc233 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -527,7 +527,8 @@ static void cmd_help_long_parsed(void *parsed_result,
 			"port close (port_id|all)\n"
 			"Close all ports or port_id.\n\n"
 
-			"port config (port_id|all) speed (10|100|1000|10000|auto)"
+			"port config (port_id|all)"
+			" speed (10|100|1000|10000|40000|auto)"
 			" duplex (half|full|auto)\n"
 			"Set speed and duplex for all ports or port_id\n\n"
 
@@ -801,7 +802,9 @@ cmd_config_speed_all_parsed(void *parsed_result,
 	else if (!strcmp(res->value1, "1000"))
 		link_speed = ETH_LINK_SPEED_1000;
 	else if (!strcmp(res->value1, "10000"))
-		link_speed = ETH_LINK_SPEED_10000;
+		link_speed = ETH_LINK_SPEED_10G;
+	else if (!strcmp(res->value1, "40000"))
+		link_speed = ETH_LINK_SPEED_40G;
 	else if (!strcmp(res->value1, "auto"))
 		link_speed = ETH_LINK_SPEED_AUTONEG;
 	else {
@@ -839,7 +842,7 @@ cmdline_parse_token_string_t cmd_config_speed_all_item1 =
 	TOKEN_STRING_INITIALIZER(struct cmd_config_speed_all, item1, "speed");
 cmdline_parse_token_string_t cmd_config_speed_all_value1 =
 	TOKEN_STRING_INITIALIZER(struct cmd_config_speed_all, value1,
-				"10#100#1000#10000#auto");
+				"10#100#1000#10000#40000#auto");
 cmdline_parse_token_string_t cmd_config_speed_all_item2 =
 	TOKEN_STRING_INITIALIZER(struct cmd_config_speed_all, item2, "duplex");
 cmdline_parse_token_string_t cmd_config_speed_all_value2 =
@@ -849,7 +852,7 @@ cmdline_parse_token_string_t cmd_config_speed_all_value2 =
 cmdline_parse_inst_t cmd_config_speed_all = {
 	.f = cmd_config_speed_all_parsed,
 	.data = NULL,
-	.help_str = "port config all speed 10|100|1000|10000|auto duplex "
+	.help_str = "port config all speed 10|100|1000|10000|40000|auto duplex "
 			"half|full|auto",
 	.tokens = {
 		(void *)&cmd_config_speed_all_port,
@@ -901,6 +904,8 @@ cmd_config_speed_specific_parsed(void *parsed_result,
 		link_speed = ETH_LINK_SPEED_1000;
 	else if (!strcmp(res->value1, "10000"))
 		link_speed = ETH_LINK_SPEED_10000;
+	else if (!strcmp(res->value1, "40000"))
+		link_speed = ETH_LINK_SPEED_40G;
 	else if (!strcmp(res->value1, "auto"))
 		link_speed = ETH_LINK_SPEED_AUTONEG;
 	else {
@@ -939,7 +944,7 @@ cmdline_parse_token_string_t cmd_config_speed_specific_item1 =
 				"speed");
 cmdline_parse_token_string_t cmd_config_speed_specific_value1 =
 	TOKEN_STRING_INITIALIZER(struct cmd_config_speed_specific, value1,
-				"10#100#1000#10000#auto");
+				"10#100#1000#10000#40000#auto");
 cmdline_parse_token_string_t cmd_config_speed_specific_item2 =
 	TOKEN_STRING_INITIALIZER(struct cmd_config_speed_specific, item2,
 				"duplex");
@@ -950,7 +955,7 @@ cmdline_parse_token_string_t cmd_config_speed_specific_value2 =
 cmdline_parse_inst_t cmd_config_speed_specific = {
 	.f = cmd_config_speed_specific_parsed,
 	.data = NULL,
-	.help_str = "port config X speed 10|100|1000|10000|auto duplex "
+	.help_str = "port config X speed 10|100|1000|10000|40000|auto duplex "
 			"half|full|auto",
 	.tokens = {
 		(void *)&cmd_config_speed_specific_port,
diff --git a/lib/librte_pmd_i40e/i40e_ethdev.c b/lib/librte_pmd_i40e/i40e_ethdev.c
index 9ed31b5..fe4c78e 100644
--- a/lib/librte_pmd_i40e/i40e_ethdev.c
+++ b/lib/librte_pmd_i40e/i40e_ethdev.c
@@ -128,6 +128,8 @@ static void i40e_dev_promiscuous_enable(struct rte_eth_dev *dev);
 static void i40e_dev_promiscuous_disable(struct rte_eth_dev *dev);
 static void i40e_dev_allmulticast_enable(struct rte_eth_dev *dev);
 static void i40e_dev_allmulticast_disable(struct rte_eth_dev *dev);
+static int i40e_dev_set_link_up(struct rte_eth_dev *dev);
+static int i40e_dev_set_link_down(struct rte_eth_dev *dev);
 static void i40e_dev_stats_get(struct rte_eth_dev *dev, struct rte_eth_st