[dpdk-dev] [PATCH 0/5]support filter of unicast and multicast MAC address for VF on Fortville
Hi Thomas,

Any comments on this patch set?

Thanks
Jijiang Liu

> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Liu, Yong
> Sent: Thursday, September 25, 2014 4:18 PM
> To: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 0/5]support filter of unicast and multicast MAC
> address for VF on Fortville
>
> Tested-by: Liu Yong
>
> This patch set has been tested by Intel.
> Please see the following information:
>
> Host:
> OS : Fedora 20 x86_64
> Kernel : 3.11.10-301
> GCC : 4.8.3
> CPU : Intel Xeon CPU E5-2680 v2 @ 2.80GHz
> NIC : 2*40G (8086:1583)
> Qemu : 1.6.2
> libvirt : 1.1.3
>
> Guest:
> OS : Fedora 20 x86_64
> Kernel : 3.11.10-301
> GCC : 4.8.3
>
> We verified perfect and hash match filter of unicast and multicast MAC address
> for VF work normally on FVL.
>
> > -----Original Message-----
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jijiang Liu
> > Sent: Tuesday, September 23, 2014 11:30 AM
> > To: dev at dpdk.org
> > Subject: [dpdk-dev] [PATCH 0/5]support filter of unicast and multicast
> > MAC address for VF on Fortville
> >
> > The patch set enhances MACVLAN filter configurability and supports
> > perfect and hash match filter of unicast and multicast MAC address for
> > VF on Fortville.
> >
> > It mainly includes:
> > - Use new filter mechanism discussed at
> >   http://dpdk.org/ml/archives/dev/2014-September/005179.html.
> > - Enhance MACVLAN filter to be configurable. Now the following
> >   options are configurable:
> >   1. Perfect match of MAC address
> >   2. Perfect match of MAC address and VLAN ID
> >   3. Hash match of MAC address
> >   4. Hash match of MAC address and perfect match of VLAN ID
> >   5. To Queue: use MAC and VLAN to point to a queue
> > - Support perfect and hash match of unicast and multicast MAC address
> >   for VF for i40e
> >
> > jijiangl (5):
> >   Use new filter framework
> >   Add new definations for MACVLAN filter enhancement in rte_eth_ctrl.h file
> >   Change parameters of MAC/VLAN filter to be configurable
> >   Add VF MACVLAN filter handle for i40e
> >   Test VF MACVLAN filter for i40e
> >
> >  app/test-pmd/cmdline.c            | 115 +-
> >  lib/librte_ether/Makefile         |   1 +
> >  lib/librte_ether/rte_eth_ctrl.h   | 104
> >  lib/librte_ether/rte_ethdev.c     |  33
> >  lib/librte_ether/rte_ethdev.h     |  48 ++-
> >  lib/librte_pmd_i40e/i40e_ethdev.c | 321 -
> >  lib/librte_pmd_i40e/i40e_ethdev.h |  18 ++-
> >  lib/librte_pmd_i40e/i40e_pf.c     |   7 +-
> >  8 files changed, 601 insertions(+), 46 deletions(-)
> >  create mode 100644 lib/librte_ether/rte_eth_ctrl.h
> >
> > --
> > 1.7.7.6
[dpdk-dev] [PATCH 09/12] Remove iopl operation for IBM Power architecture
OK. I'll update the patches. Thanks for your comments!

Best Regards!
--
Chao Zhu

From: "Ananyev, Konstantin"
To: Cyril Chemparathy , Chao CH Zhu/China/IBM at IBMCN, "dev at dpdk.org"
Date: 2014/10/07 22:45
Subject: RE: [dpdk-dev] [PATCH 09/12] Remove iopl operation for IBM Power architecture

> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Cyril Chemparathy
> Sent: Monday, October 06, 2014 11:04 PM
> To: Chao Zhu; dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 09/12] Remove iopl operation for IBM Power architecture
>
> On 9/26/2014 2:36 AM, Chao Zhu wrote:
> > The iopl() call is mostly for the i386 architecture; it does not exist
> > on the Power architecture. This patch modifies rte_eal_iopl_init() to
> > return -1 on Power. This means rte_config.flags will not contain the
> > EAL_FLG_HIGH_IOPL flag on IBM Power architecture.
>
> Since iopl() is an x86-only thing, shouldn't the code be conditional on
> defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686) instead of below?
>
> Better still, should we maybe break out an architecture specific init
> function? This function could set iopl on x86, and possibly do other
> lowlevel init things on other architectures...

Yep, that sounds like a good way to me too.

> > Signed-off-by: Chao Zhu
> > ---
> >  lib/librte_eal/linuxapp/eal/eal.c | 11 +++++++++++
> >  1 files changed, 11 insertions(+), 0 deletions(-)
> >
> > diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
> > index 4869e7c..8cc1f21 100644
> > --- a/lib/librte_eal/linuxapp/eal/eal.c
> > +++ b/lib/librte_eal/linuxapp/eal/eal.c
> > @@ -50,7 +50,10 @@
> >  #include
> >  #include
> >  #include
> > +/* Power architecture doesn't have this header file */
> > +#ifndef RTE_ARCH_PPC_64
> >  #include
> > +#endif
> >
> >  #include
> >  #include
> > @@ -1019,11 +1022,19 @@ rte_eal_mcfg_complete(void)
> >
> >  /*
> >   * Request iopl privilege for all RPL, returns 0 on success
> > + *
> > + * Power architecture doesn't have iopl function, so this function
> > + * return -1 on Power architecture, because this function is only used
> > + * in rte_eal_init to add EAL_FLG_HIGH_IOPL to rte_config.flags.
> >   */
> >  static int
> >  rte_eal_iopl_init(void)
> >  {
> > +#ifndef RTE_ARCH_PPC_64
> >  	return iopl(HIGHEST_RPL);
> > +#else
> > +	return -1;
> > +#endif
> >  }
> >
> >  /* Launch threads, called at application init(). */
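
Cyril's suggestion above (make the iopl() call conditional on x86 rather than excluding Power) could look roughly like the sketch below. It is only an illustration under stated assumptions: it tests the compiler's predefined macros instead of the RTE_ARCH_* build flags so that it compiles outside a DPDK tree, and the names ARCH_HAS_IOPL and arch_io_privilege_init() are hypothetical, not part of any posted patch.

```c
#if defined(__linux__) && (defined(__x86_64__) || defined(__i386__))
#include <sys/io.h>		/* declares iopl() on x86 Linux only */
#define ARCH_HAS_IOPL 1		/* hypothetical feature macro */
#else
#define ARCH_HAS_IOPL 0
#endif

/* Highest x86 I/O privilege level; assumed to match HIGHEST_RPL in eal.c */
#define HIGHEST_RPL 3

/*
 * Hypothetical architecture-specific init hook.
 * On x86 it requests the I/O privilege level and returns 0 on success
 * or -1 on failure (for example when not running as root).
 * On every other architecture it returns -1, so a caller would not set
 * EAL_FLG_HIGH_IOPL.
 */
static int
arch_io_privilege_init(void)
{
#if ARCH_HAS_IOPL
	return iopl(HIGHEST_RPL);
#else
	return -1;
#endif
}
```

Compared with `#ifndef RTE_ARCH_PPC_64`, a positive test for x86 means a further architecture needs no extra exclusion here, which seems to be the point of the review comment.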
[dpdk-dev] [PATCH 0/7] Patches to split architecture specific operations from DPDK
David,

I'll update the patches according to your comments. Thanks!

Best Regards!
--
Chao Zhu

From: David Marchand
To: Chao CH Zhu/China/IBM at IBMCN
Cc: "dev at dpdk.org"
Date: 2014/10/03 21:21
Subject: Re: [dpdk-dev] [PATCH 0/7] Patches to split architecture specific operations from DPDK

Hello Chao,

On Fri, Sep 26, 2014 at 11:33 AM, Chao Zhu wrote:
> This set of patches splits the x86 architecture specific operations from
> DPDK and puts them in the arch directories of the i686 and x86_64
> architectures. This will make the adoption of DPDK much easier on other
> computer architectures. For a new architecture, just add an architecture
> specific directory and the necessary build configuration files, then DPDK
> can support it.

Here is a different approach for the headers splitting.

If we are going to support multiple architectures, the best would be to have a
specific header for each arch which implements a common API (no need for any
_arch suffix).
These headers would be located in lib/librte_eal/common/include/arch/$arch/
rather than lib/librte_eal/common/include/$arch/arch/ (which looks odd to me).
Makefiles can add some -I for dpdk to build itself (and we can remove those
symlinks from the makefiles).
Makefiles only install the specific headers in RTE_SDK/include for use by
applications.

For common code and documentation, we can add a "generic" directory in
lib/librte_eal/common/include (or "arch-generic", or "shared" ... any better
idea ?).
DPDK makefiles install the generic headers in RTE_SDK/include/generic.
arch headers (like rte_atomic.h) include the generic one ().
These generic headers can be implemented using compiler intrinsics when
possible. They also include the doxygen stuff in a single place.

This would look something like this, for rte_atomic.h :

- in DPDK sources
$ ls lib/librte_eal/common/include/*/rte_atomic.h
lib/librte_eal/common/include/i686/rte_atomic.h
lib/librte_eal/common/include/x86_64/rte_atomic.h
lib/librte_eal/common/include/generic/rte_atomic.h

- in installed RTE_SDK
$ ls RTE_SDK/include/{,*/}rte_atomic.h
RTE_SDK/include/rte_atomic.h
RTE_SDK/include/generic/rte_atomic.h

Comments ?

I am only focusing on the first patchset at the moment, but if we can find
consensus here, a respin of the two patchsets would be great.

Thanks.

--
David Marchand
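
As a rough illustration of the remark that generic headers "can be implemented using compiler intrinsics when possible", the sketch below shows what a fallback in a generic atomic header might contain. It assumes a GCC or Clang compiler (the `__sync_*` builtins); the `generic_atomic32_*` names are made up for this example and are not the rte_atomic.h API.

```c
#include <stdint.h>

/* Hypothetical counter type for the sketch; not the DPDK definition. */
typedef struct {
	volatile int32_t cnt;
} generic_atomic32_t;

/* Plain store: initialisation is assumed to happen before sharing. */
static inline void
generic_atomic32_init(generic_atomic32_t *v)
{
	v->cnt = 0;
}

/* Atomically add inc to the counter and return the new value. */
static inline int32_t
generic_atomic32_add_return(generic_atomic32_t *v, int32_t inc)
{
	return __sync_add_and_fetch(&v->cnt, inc);
}

/*
 * Atomic compare-and-set: if *dst equals exp, write src.
 * Returns non-zero when the swap happened, 0 otherwise.
 */
static inline int
generic_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
{
	return __sync_bool_compare_and_swap(dst, exp, src);
}
```

An arch-specific header could then include the generic one and override only the functions for which hand-written assembly is worthwhile.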
[dpdk-dev] vmxnet3 pmd dev restart
Hi Rashmin

I have tried the memset change, but I am still facing the problem I pointed
out earlier. After restart, packets are not being received in
vmxnet3_recv_pkts().
I have also observed a PANIC in vmxnet3_tq_tx_complete() after a couple of
stop and start operations.

PANIC in vmxnet3_tq_tx_complete():
EOP desc does not point to a valid mbuf
15: [/lib64/libc.so.6(clone+0x6d) [0x7fd60354c52d]]
1: [/mswitch/bin/sos.shumway.elf(rte_dump_stack+0x23) [0x463313]]
2: [/mswitch/bin/sos.shumway.elf(__rte_panic+0xc1) [0x447ae8]]
3: [/mswitch/bin/sos.shumway.elf(vmxnet3_xmit_pkts+0x382) [0x4f4f22]]

Thanks
Navakanth

On Fri, Oct 10, 2014 at 8:39 AM, Cao, Waterman wrote:
> Hi Rashmin,
>
> We found a similar issue when we start/stop the vmnet dev several times (> 3 times).
> A kernel panic happens, and sometimes the kernel dumps core.
> Let me know if you want to submit a patch to fix it.
>
> Thanks
> Waterman
>
> -----Original Message-----
>>From: Patel, Rashmin N
>>Sent: Friday, October 10, 2014 6:07 AM
>>To: Navakanth M; stephen at networkplumber.org; Cao, Waterman
>>Cc: dev at dpdk.org
>>Subject: RE: vmxnet3 pmd dev restart
>>
>>I just quickly looked into the code and instead of releasing memory or simply
>>setting it to NULL (patch:
>>http://thread.gmane.org/gmane.comp.networking.dpdk.devel/4683), you can zero
>>it out and it should work perfectly, you can give it a quick try.
>>
>>//rte_free(ring->buf_info);
>>memset(ring->buf_info, 0x0, ring->size*sizeof(vmxnet3_buf_info_t));
>>
>>This will not free the memory from heap but just wipe it out to 0x0, provided
>>that we freed all the mbuf(s) pointed to by each buf_info->m pointer. Hence you
>>won't need to reallocate it when you start the device after this stop.
>>
>>Thanks,
>>Rashmin
>>
>>-----Original Message-----
>>From: Navakanth M [mailto:navakanthdev at gmail.com]
>>Sent: Wednesday, October 08, 2014 10:11 PM
>>To: stephen at networkplumber.org; Patel, Rashmin N; Cao, Waterman
>>Cc: dev at dpdk.org
>>Subject: Re: vmxnet3 pmd dev restart
>>
>>I had tried with Stephen's patch, but after stop is done and when we call
>>start it crashed at vmxnet3_dev_start()->
>>vmxnet3_dev_rxtx_init()->vmxnet3_post_rx_bufs() as buf_info is freed and is
>>not allocated again. buf_info is allocated in
>>vmxnet3_dev_rx_queue_setup(), which would be called once at initialization
>>only.
>>I also tried not freeing buf_info in stop, but then I see a different issue
>>after start: packets are not received due to the check
>>while (rcd->gen == rxq->comp_ring.gen) { in vmxnet3_recv_pkts()
>>
>>Waterman, have you had a chance to test stop and start of the vmnet dev? If so,
>>did you notice any issue similar to this?
>>
>>Thanks
>>Navakanth
>>
>>On Thu, Oct 9, 2014 at 12:46 AM, Patel, Rashmin N wrote:
>>> Yes I had a local copy working with a couple of lines fix. But someone else,
>>> I think Stephen, added a fix patch for the same, and I assume if it's been
>>> merged, it should be working, so I did not follow up later.
>>>
>>> I don't have a VMware setup handy at the moment but I think Waterman would have
>>> more information about testing that patch if he has found any issue with it.
>>>
>>> Thanks,
>>> Rashmin
>>>
>>> -----Original Message-----
>>> From: Navakanth M [mailto:navakanthdev at gmail.com]
>>> Sent: Wednesday, October 08, 2014 4:14 AM
>>> To: dev at dpdk.org; Patel, Rashmin N
>>> Subject: Re: vmxnet3 pmd dev restart
>>>
>>> Hi Rashmin
>>>
>>> I have come across your reply in the following post that you have worked on
>>> this problem and would submit the patch for it.
>>> Can you please share information on the changes you worked on or the patch log
>>> if you had submitted any for it?
>>> http://thread.gmane.org/gmane.comp.networking.dpdk.devel/4683
>>>
>>> Thanks
>>> Navakanth
>>>
>>> On Tue, Sep 30, 2014 at 1:44 PM, Navakanth M wrote:
>>>> Hi
>>>>
>>>> I am using DPDKv1.7.0 running on Vmware Esxi 5.1 and am trying to reset
>>>> the port which uses pmd_vmnet3 library functions with the function calls
>>>> below.
>>>> rte_eth_dev_stop
>>>> rte_eth_dev_start
>>>>
>>>> Doing this, I face a panic at rte_free(ring->buf_info) in
>>>> Vmxnet3_cmd_ring_release().
>>>> I have gone through the following thread but the patch mentioned didn't
>>>> help; rather it crashed in the start function while accessing buf_info in
>>>> vmxnet3_post_rx_bufs. I see this buf_info is allocated in the queue setup
>>>> functions, which are called at initialization.
>>>> http://thread.gmane.org/gmane.comp.networking.dpdk.devel/4683
>>>> I tried not freeing it, and then rx packets are not received due to a
>>>> mismatch in while (rcd->gen == rxq->comp_ring.gen) in vmxnet3_recv_pkts()
>>>>
>>>> To reset the device port, is what I am doing the right way?
>>>> Or do I have to call vmxnet3_dev_tx_queue_setup()
>>>> vmxnet3_dev_rx_queue_setup() once stop is called? I have checked recent
>>>> patches and threads but did not get much information on this.
>>>>
>>>> Thanks
>>>> Navakanth
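
The memset suggestion discussed in this thread can be pictured with the stand-alone sketch below. It uses mock types (`mock_buf_info`, `mock_cmd_ring`) and plain malloc/free in place of the real vmxnet3 structures and mbuf calls, so it is only an illustration of the idea: on stop, release each buffer but keep the buf_info table allocated and zeroed so that a later start can reuse it. Note that, as reported above, this change alone did not restore packet reception in the poster's setup.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Mock stand-ins for vmxnet3_buf_info_t and the command ring. */
struct mock_buf_info {
	void *m;			/* would be a struct rte_mbuf * in the driver */
};

struct mock_cmd_ring {
	struct mock_buf_info *buf_info;
	uint32_t size;
};

/* Queue setup: called once at initialization, allocates the table. */
static int
mock_ring_alloc(struct mock_cmd_ring *ring, uint32_t size)
{
	ring->buf_info = calloc(size, sizeof(*ring->buf_info));
	ring->size = (ring->buf_info != NULL) ? size : 0;
	return (ring->buf_info != NULL) ? 0 : -1;
}

/*
 * Stop path: free every buffer still referenced, then wipe the table
 * instead of freeing it, so start does not touch freed memory.
 */
static void
mock_ring_release(struct mock_cmd_ring *ring)
{
	uint32_t i;

	for (i = 0; i < ring->size; i++)
		free(ring->buf_info[i].m);	/* stand-in for freeing the mbuf */

	memset(ring->buf_info, 0, ring->size * sizeof(*ring->buf_info));
}
```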
[dpdk-dev] DPDK - VIRTIO performance problems
Hi,

> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Matthew Hall
> Sent: Sunday, October 12, 2014 9:18 PM
> To: Yan Freedland
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] DPDK - VIRTIO performance problems
>
> On Sun, Oct 12, 2014 at 12:37:37PM +, Yan Freedland wrote:
> > Every ~2min traffic stopped completely and then immediately came back.
> > This happened in a periodic fashion.
>
> To me it sounds like it could be similar to what I've seen when I ran out of
> mbuf's or ran out of RX / TX descriptor entries. It could be worth checking
> the error counters on the interfaces with DPDK and Linux OS / ethtool to see
> what might be incrementing during the failed time periods.
>

I haven't met this issue before, and I am not sure whether the following patch
will fix it. Please try it.
http://dpdk.org/dev/patchwork/patch/779/

By the way, what kind of backend did you use? User space vhost, or another
backend?

Thanks
Changchun
[dpdk-dev] [PATCH v2] virtio: Update max RX packet length
Update max RX packet length since the virtio PMD has the capability of
receiving and transmitting jumbo frames.
The following patch provides the above capability:
[dpdk-dev,v3] virtio: Support mergeable buffer in virtio pmd
Submitter Ouyang Changchun
Date Aug. 14, 2014, 8:54 a.m.
Message ID <1408006475-17606-1-git-send-email-changchun.ouyang at intel.com>
Permalink http://dpdk.org/dev/patchwork/patch/159/

Signed-off-by: Changchun Ouyang
Tested-by: Jingguo Fu
---
 lib/librte_pmd_virtio/virtio_ethdev.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_pmd_virtio/virtio_ethdev.h b/lib/librte_pmd_virtio/virtio_ethdev.h
index d2e1eed..1da3c62 100644
--- a/lib/librte_pmd_virtio/virtio_ethdev.h
+++ b/lib/librte_pmd_virtio/virtio_ethdev.h
@@ -53,7 +53,7 @@
 #define VIRTIO_MAX_TX_QUEUES 128
 #define VIRTIO_MAX_MAC_ADDRS 1
 #define VIRTIO_MIN_RX_BUFSIZE 64
-#define VIRTIO_MAX_RX_PKTLEN 1518
+#define VIRTIO_MAX_RX_PKTLEN 9728
 /* Features desired/implemented by this driver. */
 #define VTNET_FEATURES \
--
1.8.4.2
[dpdk-dev] [PATCH v4 1/7] ethdev: add more annotations
Add more annotations about packet classification type.

Signed-off-by: Helin Zhang
---
 lib/librte_ether/rte_ethdev.h | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 13be711..1948594 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -334,7 +334,10 @@ struct rte_eth_rss_conf {
 	uint64_t rss_hf; /**< Hash functions to apply - see below. */
 };

-/* Supported RSS offloads */
+/*
+ * Supported RSS offloads, below '_SHIFT' can also be used to represent
+ * the 'Packet Classification type (pctype)'.
+ */
 /* for 1G & 10G */
 #define ETH_RSS_IPV4_SHIFT 0
 #define ETH_RSS_IPV4_TCP_SHIFT 1
--
1.8.1.4
[dpdk-dev] [PATCH v4 2/7] ethdev: add interfaces and relevant for filter control
To support flexible filter control, 'rte_eth_dev_filter_ctrl()' and 'rte_eth_dev_filter_supported()' are added. In addition, filter types and operations are defined in a newly added header file. Signed-off-by: Helin Zhang --- lib/librte_ether/Makefile | 1 + lib/librte_ether/rte_eth_ctrl.h | 80 + lib/librte_ether/rte_ethdev.c | 32 + lib/librte_ether/rte_ethdev.h | 48 + 4 files changed, 161 insertions(+) create mode 100644 lib/librte_ether/rte_eth_ctrl.h diff --git a/lib/librte_ether/Makefile b/lib/librte_ether/Makefile index b310f8b..a461c31 100644 --- a/lib/librte_ether/Makefile +++ b/lib/librte_ether/Makefile @@ -46,6 +46,7 @@ SRCS-y += rte_ethdev.c # SYMLINK-y-include += rte_ether.h SYMLINK-y-include += rte_ethdev.h +SYMLINK-y-include += rte_eth_ctrl.h # this lib depends upon: DEPDIRS-y += lib/librte_eal lib/librte_mempool lib/librte_ring lib/librte_mbuf diff --git a/lib/librte_ether/rte_eth_ctrl.h b/lib/librte_ether/rte_eth_ctrl.h new file mode 100644 index 000..aaea075 --- /dev/null +++ b/lib/librte_ether/rte_eth_ctrl.h @@ -0,0 +1,80 @@ +/*- + * BSD LICENSE + * + * Copyright(c) 2010-2014 Intel Corporation. All rights reserved. + * All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * + * * Redistributions of source code must retain the above copyright + * notice, this list of conditions and the following disclaimer. + * * Redistributions in binary form must reproduce the above copyright + * notice, this list of conditions and the following disclaimer in + * the documentation and/or other materials provided with the + * distribution. + * * Neither the name of Intel Corporation nor the names of its + * contributors may be used to endorse or promote products derived + * from this software without specific prior written permission. 
+ * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +#ifndef _RTE_ETH_CTRL_H_ +#define _RTE_ETH_CTRL_H_ + +/** + * @file + * + * Ethernet device features and related data structures used + * by control APIs should be defined in this file. + * + */ + +#ifdef __cplusplus +extern "C" { +#endif + +/** + * Feature filter types + */ +enum rte_filter_type { + RTE_ETH_FILTER_NONE = 0, + RTE_ETH_FILTER_HASH, + RTE_ETH_FILTER_FDIR, + RTE_ETH_FILTER_TUNNEL, + RTE_ETH_FILTER_MAX, +}; + +/** + * All generic operations to filters + */ +enum rte_filter_op { + /**< used to check whether the type filter is supported */ + RTE_ETH_FILTER_OP_NONE = 0, + RTE_ETH_FILTER_OP_ADD, /**< add filter entry */ + RTE_ETH_FILTER_OP_UPDATE, /**< update filter entry */ + RTE_ETH_FILTER_OP_DELETE, /**< delete filter entry */ + RTE_ETH_FILTER_OP_GET, /**< get filter entry */ + RTE_ETH_FILTER_OP_SET, /**< configurations */ + /**< get information of filter, such as status or statistics */ + RTE_ETH_FILTER_OP_GET_INFO, + RTE_ETH_FILTER_OP_MAX, +}; + +#ifdef __cplusplus +} +#endif + +#endif /* _RTE_ETH_CTRL_H_ */ diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c index 1659340..1edc816 100644 --- a/lib/librte_ether/rte_ethdev.c +++ 
b/lib/librte_ether/rte_ethdev.c @@ -3148,3 +3148,35 @@ rte_eth_dev_get_flex_filter(uint8_t port_id, uint16_t index, return (*dev->dev_ops->get_flex_filter)(dev, index, filter, rx_queue); } + +int +rte_eth_dev_filter_supported(uint8_t port_id, enum rte_filter_type filter_type) +{ + struct rte_eth_dev *dev; + + if (port_id >= nb_ports) { + PMD_DEBUG_TRACE("Invalid port_id=%d\n", port_id); + return -ENODEV; + } + + dev = &rte_eth_devices[port_id]; + FUNC_PTR_OR_ERR_RET(*dev->dev_ops->filter_ctrl, -ENOTSUP); + return (*dev->dev_ops->filter_ctrl)(dev, filter_type, + RTE_ETH_FILTER_OP_NONE, NULL); +} + +int +rte_eth_dev_filter_ctrl(uint8_t port_id, enum rte_
[dpdk-dev] [PATCH v4 4/7] i40e: add hash filter control implementation
Hash filter control has been implemented for i40e. It includes getting/setting - hash function type - symmetric hash enable per pctype (packet classification type) - symmetric hash enable per port - filter swap configuration Signed-off-by: Helin Zhang --- lib/librte_pmd_i40e/i40e_ethdev.c | 402 ++ 1 file changed, 402 insertions(+) diff --git a/lib/librte_pmd_i40e/i40e_ethdev.c b/lib/librte_pmd_i40e/i40e_ethdev.c index 46c43a7..60b619b 100644 --- a/lib/librte_pmd_i40e/i40e_ethdev.c +++ b/lib/librte_pmd_i40e/i40e_ethdev.c @@ -216,6 +216,10 @@ static int i40e_dev_rss_hash_update(struct rte_eth_dev *dev, struct rte_eth_rss_conf *rss_conf); static int i40e_dev_rss_hash_conf_get(struct rte_eth_dev *dev, struct rte_eth_rss_conf *rss_conf); +static int i40e_dev_filter_ctrl(struct rte_eth_dev *dev, + enum rte_filter_type filter_type, + enum rte_filter_op filter_op, + void *arg); /* Default hash key buffer for RSS */ static uint32_t rss_key_default[I40E_PFQF_HKEY_MAX_INDEX + 1]; @@ -267,6 +271,7 @@ static struct eth_dev_ops i40e_eth_dev_ops = { .reta_query = i40e_dev_rss_reta_query, .rss_hash_update = i40e_dev_rss_hash_update, .rss_hash_conf_get= i40e_dev_rss_hash_conf_get, + .filter_ctrl = i40e_dev_filter_ctrl, }; static struct eth_driver rte_i40e_pmd = { @@ -4162,3 +4167,400 @@ i40e_pf_config_mq_rx(struct i40e_pf *pf) return 0; } + +/* Get the symmetric hash enable configurations per PCTYPE */ +static int +i40e_get_symmetric_hash_enable_per_pctype(struct i40e_hw *hw, + struct rte_eth_sym_hash_ena_info *info) +{ + uint32_t reg; + + switch (info->pctype) { + case ETH_RSS_NONF_IPV4_UDP_SHIFT: + case ETH_RSS_NONF_IPV4_TCP_SHIFT: + case ETH_RSS_NONF_IPV4_SCTP_SHIFT: + case ETH_RSS_NONF_IPV4_OTHER_SHIFT: + case ETH_RSS_FRAG_IPV4_SHIFT: + case ETH_RSS_NONF_IPV6_UDP_SHIFT: + case ETH_RSS_NONF_IPV6_TCP_SHIFT: + case ETH_RSS_NONF_IPV6_SCTP_SHIFT: + case ETH_RSS_NONF_IPV6_OTHER_SHIFT: + case ETH_RSS_FRAG_IPV6_SHIFT: + case ETH_RSS_L2_PAYLOAD_SHIFT: + reg = I40E_READ_REG(hw, 
I40E_GLQF_HSYM(info->pctype)); + info->enable = reg & I40E_GLQF_HSYM_SYMH_ENA_MASK ? 1 : 0; + break; + default: + PMD_DRV_LOG(ERR, "PCTYPE[%u] not supported", info->pctype); + return -EINVAL; + } + + return 0; +} + +/* Set the symmetric hash enable configurations per PCTYPE */ +static int +i40e_set_symmetric_hash_enable_per_pctype(struct i40e_hw *hw, + const struct rte_eth_sym_hash_ena_info *info) +{ + uint32_t reg; + + switch (info->pctype) { + case ETH_RSS_NONF_IPV4_UDP_SHIFT: + case ETH_RSS_NONF_IPV4_TCP_SHIFT: + case ETH_RSS_NONF_IPV4_SCTP_SHIFT: + case ETH_RSS_NONF_IPV4_OTHER_SHIFT: + case ETH_RSS_FRAG_IPV4_SHIFT: + case ETH_RSS_NONF_IPV6_UDP_SHIFT: + case ETH_RSS_NONF_IPV6_TCP_SHIFT: + case ETH_RSS_NONF_IPV6_SCTP_SHIFT: + case ETH_RSS_NONF_IPV6_OTHER_SHIFT: + case ETH_RSS_FRAG_IPV6_SHIFT: + case ETH_RSS_L2_PAYLOAD_SHIFT: + reg = info->enable ? I40E_GLQF_HSYM_SYMH_ENA_MASK : 0; + I40E_WRITE_REG(hw, I40E_GLQF_HSYM(info->pctype), reg); + I40E_WRITE_FLUSH(hw); + break; + default: + PMD_DRV_LOG(ERR, "PCTYPE[%u] not supported", info->pctype); + return -EINVAL; + } + + return 0; +} + +/* Get the symmetric hash enable configurations per port */ +static void +i40e_get_symmetric_hash_enable_per_port(struct i40e_hw *hw, uint8_t *enable) +{ + uint32_t reg = I40E_READ_REG(hw, I40E_PRTQF_CTL_0); + + *enable = reg & I40E_PRTQF_CTL_0_HSYM_ENA_MASK ? 1 : 0; +} + +/* Set the symmetric hash enable configurations per port */ +static void +i40e_set_symmetric_hash_enable_per_port(struct i40e_hw *hw, uint8_t enable) +{ + uint32_t reg = I40E_READ_REG(hw, I40E_PRTQF_CTL_0); + + if (enable > 0) { + if (reg & I40E_PRTQF_CTL_0_HSYM_ENA_MASK) { + PMD_DRV_LOG(INFO, "Symmetric hash has already " + "been enabled"); + return; + } + reg |= I40E_PRTQF_CTL_0_HSYM_ENA_MASK; + } else { + if (!(reg & I40E_PRTQF_CTL_0_HSYM_ENA_MASK)) { + PMD_DRV_LOG(INFO, "Symmetric hash has already " + "been disabled"); + return; + } + reg &= ~I40E_PRTQF_CTL_0_HSYM_ENA_MASK; + } + I40E_WRITE_REG(hw, I40E_P
[dpdk-dev] [PATCH v4 0/7] Support configuring hash functions
These patches mainly support configuring hash functions.
In detail,
 - It can get or set hash functions.
 - It can configure symmetric hash functions.
   * Get/set symmetric hash enable per port.
   * Get/set symmetric hash enable per 'PCTYPE'.
   * Get/set filter swap configurations.
 - 'ethdev' level interfaces are added.
   * 'rte_eth_dev_filter_supported', to check if a filter control is
     supported on a port.
   * 'rte_eth_dev_filter_ctrl', a common API to execute specific filter
     control.
 - Eight commands have been implemented in testpmd to support testing
   the above.
   * get_sym_hash_ena_per_port
   * set_sym_hash_ena_per_port
   * get_sym_hash_ena_per_pctype
   * set_sym_hash_ena_per_pctype
   * get_filter_swap
   * set_filter_swap
   * get_hash_function
   * set_hash_function

Note that 'PCTYPE' means 'Packet Classification Type'.

v4 changes:
 * Fixed a bug in testpmd to support 'set_sym_hash_ena_per_port'.

Helin Zhang (7):
  ethdev: add more annotations
  ethdev: add interfaces and relevant for filter control
  ethdev: add structures and enum for hash filter control
  i40e: add hash filter control implementation
  i40e: add hardware initialization
  i40e: Use constant random hash keys
  app/testpmd: add commands to support hash filter control

 app/test-pmd/cmdline.c            | 566 ++
 lib/librte_ether/Makefile         |   1 +
 lib/librte_ether/rte_eth_ctrl.h   | 154 +++
 lib/librte_ether/rte_ethdev.c     |  32 +++
 lib/librte_ether/rte_ethdev.h     |  53 +++-
 lib/librte_pmd_i40e/i40e_ethdev.c | 492 -
 6 files changed, 1291 insertions(+), 7 deletions(-)
 create mode 100644 lib/librte_ether/rte_eth_ctrl.h

--
1.8.1.4
[dpdk-dev] [PATCH v4 6/7] i40e: Use constant random hash keys
To simplify the code and remove the race condition, use prepared constant
random hash keys instead of generating the hash keys at runtime.

Signed-off-by: Helin Zhang
---
 lib/librte_pmd_i40e/i40e_ethdev.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/lib/librte_pmd_i40e/i40e_ethdev.c b/lib/librte_pmd_i40e/i40e_ethdev.c
index ce80f27..95132d5 100644
--- a/lib/librte_pmd_i40e/i40e_ethdev.c
+++ b/lib/librte_pmd_i40e/i40e_ethdev.c
@@ -222,9 +222,6 @@ static int i40e_dev_filter_ctrl(struct rte_eth_dev *dev,
 				void *arg);
 static void i40e_hw_init(struct i40e_hw *hw);

-/* Default hash key buffer for RSS */
-static uint32_t rss_key_default[I40E_PFQF_HKEY_MAX_INDEX + 1];
-
 static struct rte_pci_id pci_id_i40e_map[] = {
 #define RTE_PCI_DEV_ID_DECL_I40E(vend, dev) {RTE_PCI_DEVICE(vend, dev)},
 #include "rte_pci_dev_ids.h"
@@ -4144,9 +4141,12 @@ i40e_pf_config_rss(struct i40e_pf *pf)
 	}
 	if (rss_conf.rss_key == NULL || rss_conf.rss_key_len <
 		(I40E_PFQF_HKEY_MAX_INDEX + 1) * sizeof(uint32_t)) {
-		/* Calculate the default hash key */
-		for (i = 0; i <= I40E_PFQF_HKEY_MAX_INDEX; i++)
-			rss_key_default[i] = (uint32_t)rte_rand();
+		/* Random default keys */
+		static uint32_t rss_key_default[] = {0x6b793944,
+			0x23504cb5, 0x5bea75b6, 0x309f4f12, 0x3dc0a2b8,
+			0x024ddcdf, 0x339b8ca0, 0x4c4af64a, 0x34fac605,
+			0x55d85839, 0x3a58997d, 0x2ec938e1, 0x66031581};
+
 		rss_conf.rss_key = (uint8_t *)rss_key_default;
 		rss_conf.rss_key_len = (I40E_PFQF_HKEY_MAX_INDEX + 1) *
 							sizeof(uint32_t);
--
1.8.1.4
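
The fallback logic in this patch can be exercised outside the driver with a small sketch like the one below. The key words are copied from the patch; everything else is an assumption made so the example compiles on its own: `HKEY_MAX_INDEX` is set to 12 only because the patch lists 13 words, and `struct rss_conf_sketch` and `rss_conf_apply_default()` are hypothetical names, not the i40e or ethdev definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed from the 13 words in the patch; stands in for I40E_PFQF_HKEY_MAX_INDEX. */
#define HKEY_MAX_INDEX 12

/* Constant default key: no runtime generation, so no write race between ports. */
static const uint32_t rss_key_default[HKEY_MAX_INDEX + 1] = {
	0x6b793944, 0x23504cb5, 0x5bea75b6, 0x309f4f12, 0x3dc0a2b8,
	0x024ddcdf, 0x339b8ca0, 0x4c4af64a, 0x34fac605, 0x55d85839,
	0x3a58997d, 0x2ec938e1, 0x66031581,
};

/* Hypothetical, reduced view of an RSS configuration. */
struct rss_conf_sketch {
	const uint8_t *rss_key;	/* user key, or NULL to request the default */
	size_t rss_key_len;	/* key length in bytes */
};

/* Use the constant default when no key, or a too-short key, was supplied. */
static void
rss_conf_apply_default(struct rss_conf_sketch *conf)
{
	if (conf->rss_key == NULL ||
	    conf->rss_key_len < sizeof(rss_key_default)) {
		conf->rss_key = (const uint8_t *)rss_key_default;
		conf->rss_key_len = sizeof(rss_key_default);
	}
}
```

One consequence worth keeping in mind is that every port started without a user key would hash with the same, publicly known key.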
[dpdk-dev] [PATCH v4 5/7] i40e: add hardware initialization
As global registers will be reset only after a whole chip reset, those registers might not be in an initial state after each launching a physical port. The hardware initialization is added to put specific global registers into an initial state. Signed-off-by: Helin Zhang --- lib/librte_pmd_i40e/i40e_ethdev.c | 78 +++ 1 file changed, 78 insertions(+) diff --git a/lib/librte_pmd_i40e/i40e_ethdev.c b/lib/librte_pmd_i40e/i40e_ethdev.c index 60b619b..ce80f27 100644 --- a/lib/librte_pmd_i40e/i40e_ethdev.c +++ b/lib/librte_pmd_i40e/i40e_ethdev.c @@ -220,6 +220,7 @@ static int i40e_dev_filter_ctrl(struct rte_eth_dev *dev, enum rte_filter_type filter_type, enum rte_filter_op filter_op, void *arg); +static void i40e_hw_init(struct i40e_hw *hw); /* Default hash key buffer for RSS */ static uint32_t rss_key_default[I40E_PFQF_HKEY_MAX_INDEX + 1]; @@ -401,6 +402,9 @@ eth_i40e_dev_init(__rte_unused struct eth_driver *eth_drv, /* Make sure all is clean before doing PF reset */ i40e_clear_hw(hw); + /* Initialize the hardware */ + i40e_hw_init(hw); + /* Reset here to make sure all is clean for each PF */ ret = i40e_pf_reset(hw); if (ret) { @@ -4564,3 +4568,77 @@ i40e_dev_filter_ctrl(struct rte_eth_dev *dev, return ret; } + +/* Initialization for hash function */ +static void +i40e_hash_function_hw_init(struct i40e_hw *hw) +{ + uint32_t i; + const struct rte_eth_sym_hash_ena_info sym_hash_ena_info[] = { + {ETH_RSS_NONF_IPV4_UDP_SHIFT, 0}, + {ETH_RSS_NONF_IPV4_TCP_SHIFT, 0}, + {ETH_RSS_NONF_IPV4_SCTP_SHIFT, 0}, + {ETH_RSS_NONF_IPV4_OTHER_SHIFT, 0}, + {ETH_RSS_FRAG_IPV4_SHIFT, 0}, + {ETH_RSS_NONF_IPV6_UDP_SHIFT, 0}, + {ETH_RSS_NONF_IPV6_TCP_SHIFT, 0}, + {ETH_RSS_NONF_IPV6_SCTP_SHIFT, 0}, + {ETH_RSS_NONF_IPV6_OTHER_SHIFT, 0}, + {ETH_RSS_FRAG_IPV6_SHIFT, 0}, + {ETH_RSS_L2_PAYLOAD_SHIFT, 0}, + }; + const struct rte_eth_filter_swap_info swap_info[] = { + {ETH_RSS_NONF_IPV4_UDP_SHIFT, + 0x1e, 0x36, 0x04, 0x3a, 0x3c, 0x02}, + {ETH_RSS_NONF_IPV4_TCP_SHIFT, + 0x1e, 0x36, 0x04, 0x3a, 0x3c, 
0x02}, + {ETH_RSS_NONF_IPV4_SCTP_SHIFT, + 0x1e, 0x36, 0x04, 0x00, 0x00, 0x00}, + {ETH_RSS_NONF_IPV4_OTHER_SHIFT, + 0x1e, 0x36, 0x04, 0x00, 0x00, 0x00}, + {ETH_RSS_FRAG_IPV4_SHIFT, + 0x1e, 0x36, 0x04, 0x00, 0x00, 0x00}, + {ETH_RSS_NONF_IPV6_UDP_SHIFT, + 0x1a, 0x2a, 0x10, 0x3a, 0x3c, 0x02}, + {ETH_RSS_NONF_IPV6_TCP_SHIFT, + 0x1a, 0x2a, 0x10, 0x3a, 0x3c, 0x02}, + {ETH_RSS_NONF_IPV6_SCTP_SHIFT, + 0x1a, 0x2a, 0x10, 0x00, 0x00, 0x00}, + {ETH_RSS_NONF_IPV6_OTHER_SHIFT, + 0x1a, 0x2a, 0x10, 0x00, 0x00, 0x00}, + {ETH_RSS_FRAG_IPV6_SHIFT, + 0x1a, 0x2a, 0x10, 0x00, 0x00, 0x00}, + {ETH_RSS_L2_PAYLOAD_SHIFT, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00}, + }; + + /* Disable symmetric hash per PCTYPE */ + for (i = 0; i < RTE_DIM(sym_hash_ena_info); i++) + i40e_set_symmetric_hash_enable_per_pctype(hw, + &sym_hash_ena_info[i]); + + /* Disable symmetric hash per port */ + i40e_set_symmetric_hash_enable_per_port(hw, 0); + + /* Initialize filter swap */ + for (i = 0; i < RTE_DIM(swap_info); i++) + i40e_set_filter_swap(hw, &swap_info[i]); + + /* Set hash function to Toeplitz by default */ + i40e_set_hash_function(hw, RTE_ETH_HASH_FUNCTION_TOEPLITZ); +} + +/* + * As global registers wouldn't be reset unless a global hardware reset, + * hardware initialization is needed to put those registers into an + * expected initial state. + */ +static void +i40e_hw_init(struct i40e_hw *hw) +{ + /* clear the PF Queue Filter control register */ + I40E_WRITE_REG(hw, I40E_PFQF_CTL_0, 0); + + /* Initialize hardware for hash function */ + i40e_hash_function_hw_init(hw); +} -- 1.8.1.4
[dpdk-dev] [PATCH v4 3/7] ethdev: add structures and enum for hash filter control
Structures and enum are added in rte_eth_ctrl.h to support hash filter control. Signed-off-by: Helin Zhang --- lib/librte_ether/rte_eth_ctrl.h | 74 + 1 file changed, 74 insertions(+) diff --git a/lib/librte_ether/rte_eth_ctrl.h b/lib/librte_ether/rte_eth_ctrl.h index aaea075..10197fc 100644 --- a/lib/librte_ether/rte_eth_ctrl.h +++ b/lib/librte_ether/rte_eth_ctrl.h @@ -73,6 +73,80 @@ enum rte_filter_op { RTE_ETH_FILTER_OP_MAX, }; +/** + * Hash filter information types. + */ +enum rte_eth_hash_filter_info_type { + RTE_ETH_HASH_FILTER_INFO_TYPE_UNKNOWN = 0, + RTE_ETH_HASH_FILTER_INFO_TYPE_SYM_HASH_ENA_PER_PCTYPE, + RTE_ETH_HASH_FILTER_INFO_TYPE_SYM_HASH_ENA_PER_PORT, + RTE_ETH_HASH_FILTER_INFO_TYPE_FILTER_SWAP, + RTE_ETH_HASH_FILTER_INFO_TYPE_HASH_FUNCTION, + RTE_ETH_HASH_FILTER_INFO_TYPE_MAX, +}; + +/** + * Hash function types. + */ +enum rte_eth_hash_function { + RTE_ETH_HASH_FUNCTION_UNKNOWN = 0, + RTE_ETH_HASH_FUNCTION_TOEPLITZ, + RTE_ETH_HASH_FUNCTION_SIMPLE_XOR, + RTE_ETH_HASH_FUNCTION_MAX, +}; + +/** + * A structure used to set or get symmetric hash enable information, to support + * 'RTE_ETH_FILTER_HASH', 'RTE_ETH_FILTER_OP_GET/RTE_ETH_FILTER_OP_SET', with + * information type 'RTE_ETH_HASH_FILTER_INFO_TYPE_SYM_HASH_ENA_PER_PCTYPE'. + */ +struct rte_eth_sym_hash_ena_info { + /**< packet classification type, defined in rte_ethdev.h */ + uint8_t pctype; + uint8_t enable; /**< enable or disable flag */ +}; + +/** + * A structure used to set or get filter swap information, to support + * 'RTE_ETH_FILTER_HASH', 'RTE_ETH_FILTER_OP_GET/RTE_ETH_FILTER_OP_SET', + * with information type 'RTE_ETH_HASH_FILTER_INFO_TYPE_FILTER_SWAP'. + */ +struct rte_eth_filter_swap_info { + /**< Packet classification type, defined in rte_ethdev.h */ + uint8_t pctype; + /**< Offset of the 1st field of the 1st couple to be swapped. */ + uint8_t off0_src0; + /**< Offset of the 2nd field of the 1st couple to be swapped. */ + uint8_t off0_src1; + /**< Field length of the first couple. 
*/ + uint8_t len0; + /**< Offset of the 1st field of the 2nd couple to be swapped. */ + uint8_t off1_src0; + /**< Offset of the 2nd field of the 2nd couple to be swapped. */ + uint8_t off1_src1; + /**< Field length of the second couple. */ + uint8_t len1; +}; + +/** + * A structure used to set or get hash filter information, to support filter + * type of 'RTE_ETH_FILTER_HASH' and its operations. + */ +struct rte_eth_hash_filter_info { + enum rte_eth_hash_filter_info_type info_type; /**< Information type. */ + /**< Details of hash filter infomation */ + union { + /* For RTE_ETH_HASH_FILTER_INFO_TYPE_SYM_HASH_ENA_PER_PCTYPE */ + struct rte_eth_sym_hash_ena_info sym_hash_ena; + /* For RTE_ETH_HASH_FILTER_INFO_TYPE_FILTER_SWAP */ + struct rte_eth_filter_swap_info filter_swap; + /* For RTE_ETH_HASH_FILTER_INFO_TYPE_SYM_HASH_ENA_PER_PORT */ + uint8_t enable; + /* For RTE_ETH_HASH_FILTER_INFO_TYPE_HASH_FUNCTION */ + enum rte_eth_hash_function hash_function; + } info; +}; + #ifdef __cplusplus } #endif -- 1.8.1.4
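A minimal sketch of how an application might fill `struct rte_eth_hash_filter_info` before handing it to the filter control call. The enum and struct definitions are copied from the patch above only so that the sketch compiles on its own; the two `make_*` helpers are hypothetical and not part of the patch. In a real application the definitions would presumably come from rte_eth_ctrl.h.

```c
#include <stdint.h>
#include <string.h>

/* Definitions copied from the patch so this sketch is self-contained. */
enum rte_eth_hash_filter_info_type {
	RTE_ETH_HASH_FILTER_INFO_TYPE_UNKNOWN = 0,
	RTE_ETH_HASH_FILTER_INFO_TYPE_SYM_HASH_ENA_PER_PCTYPE,
	RTE_ETH_HASH_FILTER_INFO_TYPE_SYM_HASH_ENA_PER_PORT,
	RTE_ETH_HASH_FILTER_INFO_TYPE_FILTER_SWAP,
	RTE_ETH_HASH_FILTER_INFO_TYPE_HASH_FUNCTION,
	RTE_ETH_HASH_FILTER_INFO_TYPE_MAX,
};

enum rte_eth_hash_function {
	RTE_ETH_HASH_FUNCTION_UNKNOWN = 0,
	RTE_ETH_HASH_FUNCTION_TOEPLITZ,
	RTE_ETH_HASH_FUNCTION_SIMPLE_XOR,
	RTE_ETH_HASH_FUNCTION_MAX,
};

struct rte_eth_sym_hash_ena_info {
	uint8_t pctype; /* packet classification type */
	uint8_t enable; /* enable or disable flag */
};

struct rte_eth_filter_swap_info {
	uint8_t pctype;
	uint8_t off0_src0;
	uint8_t off0_src1;
	uint8_t len0;
	uint8_t off1_src0;
	uint8_t off1_src1;
	uint8_t len1;
};

struct rte_eth_hash_filter_info {
	enum rte_eth_hash_filter_info_type info_type;
	union {
		struct rte_eth_sym_hash_ena_info sym_hash_ena;
		struct rte_eth_filter_swap_info filter_swap;
		uint8_t enable;
		enum rte_eth_hash_function hash_function;
	} info;
};

/* Hypothetical helper: build a request that selects the hash function. */
static struct rte_eth_hash_filter_info
make_hash_function_request(enum rte_eth_hash_function fn)
{
	struct rte_eth_hash_filter_info info;

	memset(&info, 0, sizeof(info));
	info.info_type = RTE_ETH_HASH_FILTER_INFO_TYPE_HASH_FUNCTION;
	info.info.hash_function = fn; /* union member matching info_type */
	return info;
}

/* Hypothetical helper: build a per-pctype symmetric hash enable request. */
static struct rte_eth_hash_filter_info
make_sym_hash_pctype_request(uint8_t pctype, int enable)
{
	struct rte_eth_hash_filter_info info;

	memset(&info, 0, sizeof(info));
	info.info_type = RTE_ETH_HASH_FILTER_INFO_TYPE_SYM_HASH_ENA_PER_PCTYPE;
	info.info.sym_hash_ena.pctype = pctype;
	info.info.sym_hash_ena.enable = enable ? 1 : 0;
	return info;
}
```

The point of the sketch is that info_type selects which union member is meaningful, so only that member is written after zeroing the whole structure. The resulting structure would then be passed as the last argument of rte_eth_dev_filter_ctrl() with RTE_ETH_FILTER_HASH and RTE_ETH_FILTER_OP_SET, as the structure comments in the patch describe.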
[dpdk-dev] [PATCH v4 7/7] app/testpmd: add commands to support hash filter control
To demonstrate the hash filter control, commands are added. They are - get_sym_hash_ena_per_port - set_sym_hash_ena_per_port - get_sym_hash_ena_per_pctype - set_sym_hash_ena_per_pctype - get_filter_swap - set_filter_swap - get_hash_function - set_hash_function v4 changes: * Fixed a bug in testpmd for 'set_sym_hash_ena_per_port'. Signed-off-by: Helin Zhang --- app/test-pmd/cmdline.c | 566 + 1 file changed, 566 insertions(+) diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c index 0b972f9..bea88d1 100644 --- a/app/test-pmd/cmdline.c +++ b/app/test-pmd/cmdline.c @@ -74,6 +74,7 @@ #include #include #include +#include #include #include @@ -660,6 +661,35 @@ static void cmd_help_long_parsed(void *parsed_result, "get_flex_filter (port_id) index (idx)\n" "get info of a flex filter.\n\n" + + "get_sym_hash_ena_per_port (port_id)\n" + "get symmetric hash enable configuration per port.\n\n" + + "set_sym_hash_ena_per_port (port_id)" + " (enable|disable)\n" + "set symmetric hash enable configuration per port" + " to enable or disable.\n\n" + + "get_sym_hash_ena_per_pctype (port_id) (pctype)\n" + "get symmetric hash enable configuration per port\n\n" + + "set_sym_hash_ena_per_pctype (port_id) (pctype)" + " (enable|disable)\n" + "set symmetric hash enable configuration per" + " pctype to enable or disable.\n\n" + + "get_filter_swap (port_id) (pctype)\n" + "get filter swap configurations.\n\n" + + "set_filter_swap (port_id) (pctype) (off0_src0) (off0_src1)" + " (len0) (off1_src0) (off1_src1) (len1)\n" + "set filter swap configurations.\n\n" + + "get_hash_function (port_id)\n" + "get hash function of Toeplitz or Simple XOR.\n\n" + + "set_hash_function (port_id) (toeplitz|simple_xor)\n" + "set the hash function to Toeplitz or Simple XOR.\n\n" ); } } @@ -7415,6 +7445,534 @@ cmdline_parse_inst_t cmd_get_flex_filter = { }, }; +/* *** Classification Filters Control *** */ + +/* *** Get symmetric hash enable per port *** */ +struct cmd_get_sym_hash_ena_per_port_result { + 
cmdline_fixed_string_t get_sym_hash_ena_per_port; + uint8_t port_id; +}; + +static void +cmd_get_sym_hash_per_port_parsed(void *parsed_result, +__rte_unused struct cmdline *cl, +__rte_unused void *data) +{ + struct cmd_get_sym_hash_ena_per_port_result *res = parsed_result; + struct rte_eth_hash_filter_info info; + int ret; + + if (rte_eth_dev_filter_supported(res->port_id, + RTE_ETH_FILTER_HASH) < 0) { + printf("RTE_ETH_FILTER_HASH not supported on port: %d\n", + res->port_id); + return; + } + + memset(&info, 0, sizeof(info)); + info.info_type = RTE_ETH_HASH_FILTER_INFO_TYPE_SYM_HASH_ENA_PER_PORT; + ret = rte_eth_dev_filter_ctrl(res->port_id, RTE_ETH_FILTER_HASH, + RTE_ETH_FILTER_OP_GET, &info); + if (ret < 0) { + printf("Cannot get symmetric hash enable per port " + "on port %u\n", res->port_id); + return; + } + + printf("Symmetric hash is %s on port %u\n", info.info.enable ? + "enabled" : "disabled", res->port_id); +} + +cmdline_parse_token_string_t cmd_get_sym_hash_ena_per_port_all = + TOKEN_STRING_INITIALIZER(struct cmd_get_sym_hash_ena_per_port_result, + get_sym_hash_ena_per_port, "get_sym_hash_ena_per_port"); +cmdline_parse_token_num_t cmd_get_sym_hash_ena_per_port_port_id = + TOKEN_NUM_INITIALIZER(struct cmd_get_sym_hash_ena_per_port_result, + port_id, UINT8); + +cmdline_parse_inst_t cmd_get_sym_hash_ena_per_port = { + .f = cmd_get_sym_hash_per_port_parsed, + .data = NULL, + .help_str = "get_sym_hash_ena_per_port port_id", + .tokens = { + (void *)&cmd_get_sym_hash_ena_per_port_all, + (void *)&cmd_get_sym_hash_ena_per_port_port_id, + NULL, + }, +}; + +/* *** Set symmetric hash enable per port *** */ +struct cmd_set_sym_hash_ena_per_port_result { + cmdline_fixed_string_t set_sym_hash_ena_per_port; + cmdline_fixed_string_t enable; + uint8_t port_id; +}; + +static void +cmd_set_sym_hash_per_port_parsed(
[dpdk-dev] [PATCH v4 00/10] VM Power Management
Patch name: VM Power Management
Brief description: Verify VM power management in virtualized environments
Test Flag: Tested-by
Tester name: yong.liu at intel.com

Test environment:
OS: Fedora20 3.11.10-301.fc20.x86_64
GCC: gcc version 4.8.3 20140911
CPU: Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz
NIC: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection [8086:10fb]

Test Tool Chain information:
Qemu: 1.6.1
libvirt: 1.1.3
Guest OS: Fedora20 3.11.10-301.fc20.x86_64
Guest GCC: gcc version 4.8.3 20140624

Commit ID: 72d3e7ad3183f42f8b9fb3bb1c12b3e1b39eef39

Detailed Testing information
DPDK SW Configuration: Default x86_64-native-linuxapp-gcc configuration
Test Result Summary: Total 7 cases, 7 passed, 0 failed

Test Case - name: VM Power Management Channel
Test Case - Description: Check that the VM power management communication channels can be connected successfully
Test Case - command / instruction:
  Create folder in system temporary filesystem for power monitor socket
    mkdir -p /tmp/powermonitor
  Configure VM XML and pin VCPUs to specified CPUs 5
  Configure VM XML to set up virtio serial ports
  Run power-manager monitor in Host
    ./build/vm_power_mgr -c 0x3 -n 4
  Start up VM and run guest_vm_power_mgr
    guest_vm_power_mgr -c 0x1f -n 4 -- -i
  Add VM in host and check that vm_power_mgr can get the frequency normally
    vmpower> add_vm
    vmpower> add_channels all
    vmpower> get_cpu_freq
  Check that the vcpu/cpu mapping can be detected normally
    vmpower> show_vm
Test Case - expected test result: VM power management communication channels can be connected successfully and the host can get VM core information

Test Case - name: VM Power Management Numa
Test Case - Description: Check that VM power management supports managing cores in different sockets
Test Case - command / instruction:
  Get core and socket information by cpu_layout
    ./tools/cpu_layout.py
  Configure VM XML to pin VCPUs on Socket1:
  Repeat Case1
  Check that the vcpu/cpu mapping can be detected normally
    vmpower> show_vm
Test Case - expected test result: VM power management communication channels can be connected successfully and show correct VM core information

Test Case - name: VM scale cpu frequency down
Test Case - Description: Check that VM power management lets a VM scale the frequency of its own cores down
Test Case - command / instruction:
  Set up VM power management environment
  Send cpu frequency down hints to Host
    vmpower(guest)> set_cpu_freq 0 down
  Verify the frequency of the physical CPU has been scaled down correctly
    vmpower> get_cpu_freq 1
    Core 1 frequency: 270
  Check that other CPUs' frequency is not affected by the actions above
  Check that the other VMs work fine (if they use different CPUs)
  Repeat the above actions several times
Test Case - expected test result: Frequency of the VM's core can be scaled down normally

Test Case - name: VM scale cpu frequency up
Test Case - Description: Check that VM power management lets a VM scale the frequency of its own cores up
Test Case - command / instruction:
  Set up VM power management environment
  Send cpu frequency up hints to Host
    vmpower(guest)> set_cpu_freq 0 up
  Verify the frequency o
[dpdk-dev] DPDK - VIRTIO performance problems
Hi

I found that what blocked me was actually "nf_conntrack", so enlarging the maximum there solved the issue.

Thanks
Yan

-----Original Message-----
From: Ouyang, Changchun [mailto:changchun.ouy...@intel.com]
Sent: Monday, October 13, 2014 6:10 AM
To: Matthew Hall; Yan Freedland
Cc: dev at dpdk.org; Ouyang, Changchun
Subject: RE: [dpdk-dev] DPDK - VIRTIO performance problems

Hi,

> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Matthew Hall
> Sent: Sunday, October 12, 2014 9:18 PM
> To: Yan Freedland
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] DPDK - VIRTIO performance problems
>
> On Sun, Oct 12, 2014 at 12:37:37PM +, Yan Freedland wrote:
> > Every ~2min traffic stopped completely and then immediately came back.
> > This happened in a periodic fashion.
>
> To me it sounds like it could be similar to what I've seen when I ran
> out of mbufs or ran out of RX / TX descriptor entries. It could be
> worth checking the error counters on the interfaces with DPDK and
> Linux OS / ethtool to see what might be incrementing during the failed
> time periods.
>

I haven't met this issue before, and I am not sure whether the following patch will fix it. Please try it.
http://dpdk.org/dev/patchwork/patch/779/

By the way, what kind of backend did you use? User space vhost, or another backend?

Thanks
Changchun
[dpdk-dev] [PATCH] i40e: fix of compile error
It fixes the compile error as below on gcc version 4.3.4.

cc1: error: unrecognized command line option "-Wno-unused-but-set-variable"

Signed-off-by: Helin Zhang
---
 lib/librte_pmd_i40e/Makefile | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/lib/librte_pmd_i40e/Makefile b/lib/librte_pmd_i40e/Makefile
index 4b31675..bd3428f 100644
--- a/lib/librte_pmd_i40e/Makefile
+++ b/lib/librte_pmd_i40e/Makefile
@@ -55,8 +55,7 @@ CFLAGS_BASE_DRIVER += -Wno-missing-field-initializers
 CFLAGS_BASE_DRIVER += -Wno-pointer-to-int-cast
 CFLAGS_BASE_DRIVER += -Wno-format-nonliteral
 else
-CFLAGS_BASE_DRIVER = -Wno-unused-but-set-variable
-CFLAGS_BASE_DRIVER += -Wno-sign-compare
+CFLAGS_BASE_DRIVER = -Wno-sign-compare
 CFLAGS_BASE_DRIVER += -Wno-unused-value
 CFLAGS_BASE_DRIVER += -Wno-unused-parameter
 CFLAGS_BASE_DRIVER += -Wno-strict-aliasing
@@ -65,6 +64,11 @@ CFLAGS_BASE_DRIVER += -Wno-missing-field-initializers
 CFLAGS_BASE_DRIVER += -Wno-pointer-to-int-cast
 CFLAGS_BASE_DRIVER += -Wno-format-nonliteral
 CFLAGS_BASE_DRIVER += -Wno-format-security
+
+ifeq ($(shell test $(GCC_MAJOR_VERSION) -ge 4 -a $(GCC_MINOR_VERSION) -ge 4 && echo 1), 1)
+CFLAGS_BASE_DRIVER += -Wno-unused-but-set-variable
+endif
+
 CFLAGS_i40e_lan_hmc.o += -Wno-error
 endif
 OBJS_BASE_DRIVER=$(patsubst %.c,%.o,$(notdir $(wildcard $(RTE_SDK)/lib/librte_pmd_i40e/i40e/*.c)))
--
1.8.1.4
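The guard in this patch tests the major and the minor number independently (major >= 4 and minor >= 4), so a compiler version such as 5.0 would not pass it. A minimal sketch, written in C rather than make syntax and using a hypothetical helper name `gcc_at_least`, of comparing the (major, minor) pair as a whole instead:

```c
#include <stdbool.h>

/*
 * Hypothetical helper, not part of DPDK: returns true when the version
 * (major, minor) is greater than or equal to (req_major, req_minor).
 * The minor number is only consulted when the major numbers are equal.
 */
static bool gcc_at_least(int major, int minor, int req_major, int req_minor)
{
	if (major != req_major)
		return major > req_major;
	return minor >= req_minor;
}
```

With this ordering, 4.3 is rejected and 4.4, 4.8 and 5.0 are all accepted for a required version of 4.4. The same idea could be expressed in the Makefile, but the exact make syntax is left out here because it depends on how GCC_MAJOR_VERSION and GCC_MINOR_VERSION are defined.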
[dpdk-dev] Issues running Ethtool on KNI interfaces
Hi,

Hypervisor: KVM
VM: Linux OS with 2.6.32 kernel
VM settings: 8 vCPUs, 8192 MB of memory, CPU configuration: copy host CPU config (SandyBridge), manually set CPU topology: Sockets=2, Cores=4, Threads=1
10 Gigabit passthrough interfaces attached to VM: Intel X520

DPDK settings:
DPDK version: 1.6R2

echo 512 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
rmmod ixgbe
insmod igb_uio.ko
mkdir /mnt/hugepages
mount -t hugetlbfs nodev /mnt/hugepages
#./igb_uio_bind.py --status
#./igb_uio_bind.py --bind=igb_uio 03:01.0
#./igb_uio_bind.py --bind=igb_uio 03:02.0
#./igb_uio_bind.py --status
modprobe hwmon
insmod rte_kni.ko
./kni -c 0x1E -n 2 --socket-mem 512 -- -p 0x3 --config="(0,1,3),(1,2,4)" &

With the KNI process running in the background, two KNI interfaces called vEth0 and vEth1 are created, which correspond to the IGB-UIO interfaces attached to userspace. The issue is that ethtool doesn't work with the KNI interfaces. Is there anything that I am missing? How can I get ethtool to work with KNI interfaces?

Also, if I do not use the --socket-mem or -m EAL option, the KNI application consumes all the huge pages (it doesn't matter whether 512 or 4096) assigned in the very first step. Is that expected behavior? Can I get KNI to work without assigning any hugepages?

Also, I have attached screenshots of the KNI log file. Please take a look.

Thanks,
Desh

[Attachments scrubbed by the list archive: KNI-SS4.png, KNI-SS1.png, KNI-SS2.png, KNI-SS3.png]
[dpdk-dev] [PATCH] i40e: fix of compile error
Hi Helin, It still has errors: You can get access to 10.239.129.2 with root/tester. /root/zzz/dpdk is the latest dpdk code, /root/zzz/error is the latest code with the patch appled. cc1: warnings being treated as errors /root/zzz/error/app/test/test_prefetch.c:65: error: 'testfn_prefetch_cmd' defined but not used make[5]: *** [test_prefetch.o] Error 1 make[5]: *** Waiting for unfinished jobs CC test_table.o cc1: warnings being treated as errors /root/zzz/error/app/test/test_byteorder.c:99: error: 'testfn_byteorder_cmd' defined but not used make[5]: *** [test_byteorder.o] Error 1 cc1: warnings being treated as errors /root/zzz/error/app/test/test_pci.c:203: error: 'testfn_pci_cmd' defined but not used make[5]: *** [test_pci.o] Error 1 cc1: warnings being treated as errors /root/zzz/error/app/test/test_memory.c:92: error: 'testfn_memory_cmd' defined but not used cc1: warnings being treated as errors /root/zzz/error/app/test/test_cycles.c:96: error: 'testfn_cycles_cmd' defined but not used cc1: warnings being treated as errors /root/zzz/error/app/test/test_spinlock.c:341: error: 'testfn_spinlock_cmd' defined but not used make[5]: *** [test_cycles.o] Error 1 make[5]: *** [test_memory.o] Error 1 make[5]: *** [test_spinlock.o] Error 1 LD dump_cfg cc1: warnings being treated as errors /root/zzz/error/app/test/test_per_lcore.c:144: error: 'testfn_per_lcore_cmd' defined but not used make[5]: *** [test_per_lcore.o] Error 1 cc1: warnings being treated as errors /root/zzz/error/app/test/test_atomic.c:382: error: 'testfn_atomic_cmd' defined but not used cc1: warnings being treated as errors /root/zzz/error/app/test/test_ring_perf.c:421: error: 'testfn_ring_perf_cmd' defined but not used make[5]: *** [test_atomic.o] Error 1 make[5]: *** [test_ring_perf.o] Error 1 cc1: warnings being treated as errors /root/zzz/error/app/test/test_memzone.c:1052: error: 'testfn_memzone_cmd' defined but not used make[5]: *** [test_memzone.o] Error 1 cc1: warnings being treated as errors 
/root/zzz/error/app/test/test_malloc.c:1053: error: 'testfn_malloc_cmd' defined but not used make[5]: *** [test_malloc.o] Error 1 cc1: warnings being treated as errors /root/zzz/error/app/test/test_ring.c:1400: error: 'testfn_ring_cmd' defined but not used make[5]: *** [test_ring.o] Error 1 cc1: warnings being treated as errors /root/zzz/error/app/test/test_table.c:211: error: 'testfn_table_cmd' defined but not used make[5]: LD testacl *** [test_table.o] Error 1 INSTALL-APP cmdline_test INSTALL-MAP cmdline_test.map make[4]: *** [test] Error 2 make[4]: *** Waiting for unfinished jobs INSTALL-MAP dump_cfg.map INSTALL-APP dump_cfg INSTALL-APP testacl INSTALL-MAP testacl.map LD testpipeline INSTALL-APP testpipeline INSTALL-MAP testpipeline.map LD testpmd INSTALL-APP testpmd INSTALL-MAP testpmd.map make[3]: *** [app] Error 2 make[2]: *** [all] Error 2 make[1]: *** [x86_64-native-linuxapp-gcc_install] Error 2 make: *** [install] Error 2 > -Original Message- > From: Zhang, Helin > Sent: Monday, October 13, 2014 3:18 PM > To: dev at dpdk.org > Cc: Zhan, Zhaochen; Cao, Waterman; Zhang, Helin > Subject: [PATCH] i40e: fix of compile error > > It fixes the compile error as below on gcc version 4.3.4. 
> cc1: error: unrecognized command line option "-Wno-unused-but-set- > variable" > > Signed-off-by: Helin Zhang > --- > lib/librte_pmd_i40e/Makefile | 8 ++-- > 1 file changed, 6 insertions(+), 2 deletions(-) > > diff --git a/lib/librte_pmd_i40e/Makefile b/lib/librte_pmd_i40e/Makefile > index 4b31675..bd3428f 100644 > --- a/lib/librte_pmd_i40e/Makefile > +++ b/lib/librte_pmd_i40e/Makefile > @@ -55,8 +55,7 @@ CFLAGS_BASE_DRIVER += -Wno-missing-field-initializers > CFLAGS_BASE_DRIVER += -Wno-pointer-to-int-cast > CFLAGS_BASE_DRIVER += -Wno-format-nonliteral > else > -CFLAGS_BASE_DRIVER = -Wno-unused-but-set-variable > -CFLAGS_BASE_DRIVER += -Wno-sign-compare > +CFLAGS_BASE_DRIVER = -Wno-sign-compare > CFLAGS_BASE_DRIVER += -Wno-unused-value > CFLAGS_BASE_DRIVER += -Wno-unused-parameter > CFLAGS_BASE_DRIVER += -Wno-strict-aliasing > @@ -65,6 +64,11 @@ CFLAGS_BASE_DRIVER += -Wno-missing-field-initializers > CFLAGS_BASE_DRIVER += -Wno-pointer-to-int-cast > CFLAGS_BASE_DRIVER += -Wno-format-nonliteral > CFLAGS_BASE_DRIVER += -Wno-format-security > + > +ifeq ($(shell test $(GCC_MAJOR_VERSION) -ge 4 -a $(GCC_MINOR_VERSION) - > ge 4 && echo 1), 1) > +CFLAGS_BASE_DRIVER += -Wno-unused-but-set-variable > +endif > + > CFLAGS_i40e_lan_hmc.o +
[dpdk-dev] Issues running Ethtool on KNI interfaces
Hi Desh You tried to use ethtool for KNI interfaces in VM, right? I don't think it is supported in VM. Currently it just supports ethtool for KNI interfaces in host for some igb and ixgbe NICs. Regards, Helin > -Original Message- > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Deshvanth Mirle > Jayaprakash (dmirleja) > Sent: Monday, October 13, 2014 3:21 PM > To: dev at dpdk.org > Subject: [dpdk-dev] Issues running Ethtool on KNI interfaces > > Hi, > > Hypervisor: KVM > > VM: Linux OS with 2.6.32 Kernel > > VM Settings: > > 8 vCPUs, 8192 MB of memory, CPU Configuration: Copy Host CPU Config > (SandyBridge), Manually set CPU topology: Sockets=2, Cores=4, Threads=1 > > 10Gigi Passthrough Interfaces attached to VM: Intel X520 > > > DPDK settings: > > > DPDK Version: 1.6R2 > > > echo 512 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages > > rmmod ixgbe > > insmod igb_uio.ko > > mkdir /mnt/hugepages > > mount -t hugetlbfs nodev /mnt/hugepages > > #./igb_uio_bind.py --status > > #./igb_uio_bind.py --bind=igb_uio 03:01.0 > > #./igb_uio_bind.py --bind=igb_uio 03:02.0 > > #./igb_uio_bind.py --status > > modprobe hwmon > > insmod rte_kni.ko > > ./kni -c 0x1E -n 2 --socket-mem 512 -- -p 0x3 --config="(0,1,3),(1,2,4)" & > > > Having KNI process running in the background, creates two KNI interfaces > called vEth0 and vEth1 which corresponds to IGB-UIO interfaces attached to > the Userspace. The issue is that the Ethtool doesn't work with KNI interfaces. > Is there anything that I am missing, How can I get Ethtool to work with KNI > interfaces? Also, If I do not use --socket-mem OR -m EAL option, KNI > application > consumes all the huge pages(It doesn't mater 512/4096) assigned in the very > first step.Is that expected behavior? Can I get KNI to work without assigning > any Hugepages? > > > > Also, I have attached screen shots of KNI log file. Please take a look. > > > Thanks, > > Desh
[dpdk-dev] Issues running Ethtool on KNI interfaces
Thanks Helin, have been trying this for some time, Is there any other way I can pass IOCTLs to IGB-UIO interfaces. Can I use /dev/uio0 and /dev/uio1 ? Can I use IOCTLS on these references in Kernel to gather interface statistics, set MTU or bring up/down UIO interfaces. /Desh On 10/13/14 12:34 AM, "Zhang, Helin" wrote: >Hi Desh > >You tried to use ethtool for KNI interfaces in VM, right? I don't think >it is supported in VM. >Currently it just supports ethtool for KNI interfaces in host for some >igb and ixgbe NICs. > >Regards, >Helin > >> -Original Message- >> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Deshvanth Mirle >> Jayaprakash (dmirleja) >> Sent: Monday, October 13, 2014 3:21 PM >> To: dev at dpdk.org >> Subject: [dpdk-dev] Issues running Ethtool on KNI interfaces >> >> Hi, >> >> Hypervisor: KVM >> >> VM: Linux OS with 2.6.32 Kernel >> >> VM Settings: >> >> 8 vCPUs, 8192 MB of memory, CPU Configuration: Copy Host CPU Config >> (SandyBridge), Manually set CPU topology: Sockets=2, Cores=4, Threads=1 >> >> 10Gigi Passthrough Interfaces attached to VM: Intel X520 >> >> >> DPDK settings: >> >> >> DPDK Version: 1.6R2 >> >> >> echo 512 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages >> >> rmmod ixgbe >> >> insmod igb_uio.ko >> >> mkdir /mnt/hugepages >> >> mount -t hugetlbfs nodev /mnt/hugepages >> >> #./igb_uio_bind.py --status >> >> #./igb_uio_bind.py --bind=igb_uio 03:01.0 >> >> #./igb_uio_bind.py --bind=igb_uio 03:02.0 >> >> #./igb_uio_bind.py --status >> >> modprobe hwmon >> >> insmod rte_kni.ko >> >> ./kni -c 0x1E -n 2 --socket-mem 512 -- -p 0x3 >>--config="(0,1,3),(1,2,4)" & >> >> >> Having KNI process running in the background, creates two KNI interfaces >> called vEth0 and vEth1 which corresponds to IGB-UIO interfaces attached >>to >> the Userspace. The issue is that the Ethtool doesn't work with KNI >>interfaces. >> Is there anything that I am missing, How can I get Ethtool to work with >>KNI >> interfaces? 
Also, If I do not use --socket-mem OR -m EAL option, KNI >>application >> consumes all the huge pages(It doesn't mater 512/4096) assigned in the >>very >> first step.Is that expected behavior? Can I get KNI to work without >>assigning >> any Hugepages? >> >> >> >> Also, I have attached screen shots of KNI log file. Please take a look. >> >> >> Thanks, >> >> Desh
[dpdk-dev] [PATCH] vhost: Fix the vhost broken issue
The vhost sample is broken by the following commit:

commit 08b563ffb19d8baf59dd84200f25bc85031d18a7
Author: Olivier Matz
Date: Thu Sep 11 14:15:35 2014 +0100

    mbuf: replace data pointer by an offset

It leads to a segmentation fault in vhost when binding a virtio device MAC address to its corresponding VMDq pool by executing the command line 'start tx-first' in test-pmd on the guest.

This patch fixes that issue.

Signed-off-by: Changchun Ouyang
---
 examples/vhost/main.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index 9cf8e20..a6db607 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -1782,6 +1782,7 @@ virtio_dev_tx(struct virtio_net* dev, struct rte_mempool *mbuf_pool)
 		/* Setup dummy mbuf. This is copied to a real mbuf if transmitted out the physical port. */
 		m.data_len = desc->len;
 		m.pkt_len = desc->len;
+		m.buf_addr = (void *)(uintptr_t)buff_addr;
 		m.data_off = 0;
 
 		PRINT_PACKET(dev, (uintptr_t)buff_addr, desc->len, 0);
--
1.8.4.2
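A minimal sketch of why the one-line fix matters, assuming that after the referenced mbuf commit the packet data address is derived as buffer start plus offset. `struct mini_mbuf` and `mini_mbuf_data` are hypothetical, cut-down stand-ins for illustration and are not the real rte_mbuf definitions.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct rte_mbuf; only the fields used here. */
struct mini_mbuf {
	void *buf_addr;    /* start of the data buffer */
	uint16_t data_off; /* offset of the packet data from buf_addr */
	uint16_t data_len; /* amount of packet data in this buffer */
};

/*
 * Resolve the packet data pointer the way an offset-based mbuf has to:
 * the result is only meaningful if buf_addr was set beforehand.
 */
static void *mini_mbuf_data(const struct mini_mbuf *m)
{
	return (char *)m->buf_addr + m->data_off;
}
```

In this model a dummy mbuf that sets data_off but leaves buf_addr unset resolves to an invalid address, which would be consistent with the segmentation fault described in the patch; setting buf_addr to the descriptor buffer makes the computed pointer valid again.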
[dpdk-dev] Issues running Ethtool on KNI interfaces
Hi Desh Actually KNI provide a path to exchange info/actions between user space and kernel space. You can read kni example application and KNI kernel module and KNI library for more details. It already supports setting MTU, link up/down the port, etc. Regards, Helin > -Original Message- > From: Deshvanth Mirle Jayaprakash (dmirleja) [mailto:dmirleja at cisco.com] > Sent: Monday, October 13, 2014 3:39 PM > To: Zhang, Helin; dev at dpdk.org > Subject: Re: Issues running Ethtool on KNI interfaces > > Thanks Helin, have been trying this for some time, Is there any other way I > can > pass IOCTLs to IGB-UIO interfaces. Can I use /dev/uio0 and /dev/uio1 ? Can I > use IOCTLS on these references in Kernel to gather interface statistics, set > MTU or bring up/down UIO interfaces. > > /Desh > > On 10/13/14 12:34 AM, "Zhang, Helin" wrote: > > >Hi Desh > > > >You tried to use ethtool for KNI interfaces in VM, right? I don't think > >it is supported in VM. > >Currently it just supports ethtool for KNI interfaces in host for some > >igb and ixgbe NICs. 
> > > >Regards, > >Helin > > > >> -Original Message- > >> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Deshvanth Mirle > >> Jayaprakash (dmirleja) > >> Sent: Monday, October 13, 2014 3:21 PM > >> To: dev at dpdk.org > >> Subject: [dpdk-dev] Issues running Ethtool on KNI interfaces > >> > >> Hi, > >> > >> Hypervisor: KVM > >> > >> VM: Linux OS with 2.6.32 Kernel > >> > >> VM Settings: > >> > >> 8 vCPUs, 8192 MB of memory, CPU Configuration: Copy Host CPU Config > >> (SandyBridge), Manually set CPU topology: Sockets=2, Cores=4, > >> Threads=1 > >> > >> 10Gigi Passthrough Interfaces attached to VM: Intel X520 > >> > >> > >> DPDK settings: > >> > >> > >> DPDK Version: 1.6R2 > >> > >> > >> echo 512 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages > >> > >> rmmod ixgbe > >> > >> insmod igb_uio.ko > >> > >> mkdir /mnt/hugepages > >> > >> mount -t hugetlbfs nodev /mnt/hugepages > >> > >> #./igb_uio_bind.py --status > >> > >> #./igb_uio_bind.py --bind=igb_uio 03:01.0 > >> > >> #./igb_uio_bind.py --bind=igb_uio 03:02.0 > >> > >> #./igb_uio_bind.py --status > >> > >> modprobe hwmon > >> > >> insmod rte_kni.ko > >> > >> ./kni -c 0x1E -n 2 --socket-mem 512 -- -p 0x3 > >>--config="(0,1,3),(1,2,4)" & > >> > >> > >> Having KNI process running in the background, creates two KNI > >>interfaces called vEth0 and vEth1 which corresponds to IGB-UIO > >>interfaces attached to the Userspace. The issue is that the Ethtool > >>doesn't work with KNI interfaces. > >> Is there anything that I am missing, How can I get Ethtool to work > >>with KNI interfaces? Also, If I do not use --socket-mem OR -m EAL > >>option, KNI application consumes all the huge pages(It doesn't mater > >>512/4096) assigned in the very first step.Is that expected behavior? > >>Can I get KNI to work without assigning any Hugepages? > >> > >> > >> > >> Also, I have attached screen shots of KNI log file. Please take a look. > >> > >> > >> Thanks, > >> > >> Desh
[dpdk-dev] [PATCH] vhost: Fix the vhost broken issue
Hi Thomas, If HuaweiXie's patch set for vhost library and new vhost sample could be applied into dpdk.org very soon, Then this patch could be depressed/superseded, I think his patch can fix this issue. Otherwise, this patch could be high priority as the vhost is broken in the tip code due to recent commit related to mbuf change. Thanks and regards, Changchun > -Original Message- > From: Ouyang, Changchun > Sent: Monday, October 13, 2014 3:40 PM > To: dev at dpdk.org > Cc: Cao, Waterman; Ouyang, Changchun > Subject: [PATCH] vhost: Fix the vhost broken issue > > As the vhost sample is broken by the following commit, > commit 08b563ffb19d8baf59dd84200f25bc85031d18a7 > Author: Olivier Matz > Date: Thu Sep 11 14:15:35 2014 +0100 > mbuf: replace data pointer by an offset > > It leads to segment fault error in vhost when binding a virtio device MAC > address to its corresponding VMDq pool by executing command line 'start tx- > first' in test-pmd on guest. > > This patch fixes that issue. > > Signed-off-by: Changchun Ouyang > --- > examples/vhost/main.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/examples/vhost/main.c b/examples/vhost/main.c index > 9cf8e20..a6db607 100644 > --- a/examples/vhost/main.c > +++ b/examples/vhost/main.c > @@ -1782,6 +1782,7 @@ virtio_dev_tx(struct virtio_net* dev, struct > rte_mempool *mbuf_pool) > /* Setup dummy mbuf. This is copied to a real mbuf if > transmitted out the physical port. */ > m.data_len = desc->len; > m.pkt_len = desc->len; > + m.buf_addr = (void *)(uintptr_t)buff_addr; > m.data_off = 0; > > PRINT_PACKET(dev, (uintptr_t)buff_addr, desc->len, 0); > -- > 1.8.4.2
[dpdk-dev] Aligned RX data.
Hi All, Is there a way to create a mempool such that all mbufs are aligned to X. lets say X is 512. Thanks. On Sat, Oct 11, 2014 at 5:04 PM, Alex Markuze wrote: > O.k, And how would I do that? > I'm guessing there is something I can control in rte_pktmbuf_pool_init? > I would appreciate If you could spare a word or two in the matter. > > On Tue, Oct 7, 2014 at 7:11 PM, Ananyev, Konstantin < > konstantin.ananyev at intel.com> wrote: > >> >> >> > -Original Message- >> > From: Ananyev, Konstantin >> > Sent: Tuesday, October 07, 2014 5:03 PM >> > To: Ananyev, Konstantin >> > Subject: FW: [dpdk-dev] Aligned RX data. >> > >> > >> > >> > From: Alex Markuze [mailto:alex at weka.io] >> > Sent: Tuesday, October 07, 2014 4:52 PM >> > To: Ananyev, Konstantin >> > Cc: dev at dpdk.org >> > Subject: Re: [dpdk-dev] Aligned RX data. >> > >> > RTE_PKTMBUF_HEADROOM defines the headroom >> >> Yes. >> >> >this would be true only if the buff_start was aligned to 512 which is >> not. >> >> As I said: " Make sure that your all your mbufs are aligned by 512". >> >> Konstantin >> >> > >> > On Tue, Oct 7, 2014 at 1:05 PM, Ananyev, Konstantin < >> konstantin.ananyev at intel.com> wrote: >> > >> > >> > > -Original Message- >> > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Alex Markuze >> > > Sent: Tuesday, October 07, 2014 10:40 AM >> > > To: dev at dpdk.org >> > > Subject: [dpdk-dev] Aligned RX data. >> > > >> > > Hi , I'm trying to receive aligned packets from the wire. >> > > Meaning that for all received packets the pkt.data is always aligned >> to >> > > (512 -H). >> > > >> > > Looking at the pmds of ixgbe/vmxnet I see that the pmds call >> > > __rte_mbuf_raw_alloc and set the rx descriptor with a >> > > RTE_MBUF_DATA_DMA_ADDR_DEFAULT >> > > Instead of the more appropriate RTE_MBUF_DATA_DMA_ADDR. >> > > >> > > Do I need to modify each pmd I'm using to be able to receive aligned >> data? 
>> > Make sure that your all your mbufs are aligned by 512 and set in your >> config RTE_PKTMBUF_HEADROOM=512-H? >> > >> > >> > > Or have I missed something? >> > > >> > > Thanks >> >> >
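The arithmetic behind the suggestion in this thread can be sketched as follows, assuming H is read as the length of the headers that precede the payload. If the buffer start is aligned to 512 and the headroom is 512 - H, the first byte after the headers lands on a 512-byte boundary. The names `RX_ALIGN`, `headroom_for_aligned_payload` and `payload_is_aligned` are hypothetical and do not exist in DPDK.

```c
#include <stdint.h>

#define RX_ALIGN 512u /* alignment requested in the thread */

/*
 * Headroom that places the byte following hdr_len bytes of headers on an
 * RX_ALIGN boundary, assuming the buffer start itself is RX_ALIGN-aligned.
 */
static uint32_t headroom_for_aligned_payload(uint32_t hdr_len)
{
	return (RX_ALIGN - (hdr_len % RX_ALIGN)) % RX_ALIGN;
}

/* Check whether the payload address is aligned for a given buffer start. */
static int payload_is_aligned(uintptr_t buf_start, uint32_t headroom,
			      uint32_t hdr_len)
{
	return ((buf_start + headroom + hdr_len) % RX_ALIGN) == 0;
}
```

For a 14-byte header the headroom comes out as 498. The check only succeeds when the buffer start is itself a multiple of 512, which is the objection raised in the thread: the headroom setting alone is not enough unless the mempool elements are also aligned.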
[dpdk-dev] [PATCH v3] KNI: use a memzone pool for KNI alloc/release
This patch implements the KNI memzone pool in order to prevent memzone exhaustion when allocating/deallocating KNI interfaces. It adds a new API call, rte_kni_init(max_kni_ifaces) that shall be called before any call to rte_kni_alloc() if KNI is used. v2: Moved KNI fd opening to rte_kni_init(). Revised style. v3: Adapted examples/kni to rte_kni_init(). Signed-off-by: Marc Sune --- examples/kni/main.c |3 + lib/librte_kni/rte_kni.c | 315 +- lib/librte_kni/rte_kni.h | 18 +++ 3 files changed, 277 insertions(+), 59 deletions(-) diff --git a/examples/kni/main.c b/examples/kni/main.c index cb17b43..f998b02 100644 --- a/examples/kni/main.c +++ b/examples/kni/main.c @@ -872,6 +872,9 @@ main(int argc, char** argv) rte_exit(EXIT_FAILURE, "Configured invalid " "port ID %u\n", i); + /* Initialize KNI subsystem */ + rte_kni_init(nb_sys_ports); + /* Initialise each port */ for (port = 0; port < nb_sys_ports; port++) { /* Skip ports that are not enabled */ diff --git a/lib/librte_kni/rte_kni.c b/lib/librte_kni/rte_kni.c index 76feef4..e339ef0 100644 --- a/lib/librte_kni/rte_kni.c +++ b/lib/librte_kni/rte_kni.c @@ -40,6 +40,7 @@ #include #include +#include #include #include #include @@ -58,7 +59,7 @@ #define KNI_REQUEST_MBUF_NUM_MAX 32 -#define KNI_MZ_CHECK(mz) do { if (mz) goto fail; } while (0) +#define KNI_MEM_CHECK(cond) do { if (cond) goto kni_fail; } while (0) /** * KNI context @@ -66,6 +67,7 @@ struct rte_kni { char name[RTE_KNI_NAMESIZE];/**< KNI interface name */ uint16_t group_id; /**< Group ID of KNI devices */ + uint32_t slot_id; /**< KNI pool slot ID */ struct rte_mempool *pktmbuf_pool; /**< pkt mbuf mempool */ unsigned mbuf_size; /**< mbuf size */ @@ -88,10 +90,48 @@ enum kni_ops_status { KNI_REQ_REGISTERED, }; +/** + * KNI memzone pool slot + */ +struct rte_kni_memzone_slot{ + uint32_t id; + uint8_t in_use : 1;/**< slot in use */ + + /* Memzones */ + const struct rte_memzone *m_ctx; /**< KNI ctx */ + const struct rte_memzone *m_tx_q; /**< TX queue */ + const struct 
rte_memzone *m_rx_q;            /**< RX queue */
+	const struct rte_memzone *m_alloc_q;   /**< Allocated mbufs queue */
+	const struct rte_memzone *m_free_q;    /**< To be freed mbufs queue */
+	const struct rte_memzone *m_req_q;     /**< Request queue */
+	const struct rte_memzone *m_resp_q;    /**< Response queue */
+	const struct rte_memzone *m_sync_addr;
+
+	/* Free linked list */
+	struct rte_kni_memzone_slot *next;     /**< Next slot link.list */
+};
+
+/**
+ * KNI memzone pool
+ */
+struct rte_kni_memzone_pool{
+	uint8_t initialized : 1;            /**< Global KNI pool init flag */
+
+	uint32_t max_ifaces;                /**< Max. num of KNI ifaces */
+	struct rte_kni_memzone_slot *slots;        /**< Pool slots */
+	rte_spinlock_t mutex;               /**< alloc/relase mutex */
+
+	/* Free memzone slots linked-list */
+	struct rte_kni_memzone_slot *free;         /**< First empty slot */
+	struct rte_kni_memzone_slot *free_tail;    /**< Last empty slot */
+};
+
+
 static void kni_free_mbufs(struct rte_kni *kni);
 static void kni_allocate_mbufs(struct rte_kni *kni);

 static volatile int kni_fd = -1;
+static struct rte_kni_memzone_pool kni_memzone_pool = {0};

 static const struct rte_memzone *
 kni_memzone_reserve(const char *name, size_t len, int socket_id,
@@ -105,6 +145,161 @@ kni_memzone_reserve(const char *name, size_t len, int socket_id,
 	return mz;
 }

+/* Pool mgmt */
+static struct rte_kni_memzone_slot*
+kni_memzone_pool_alloc(void)
+{
+	struct rte_kni_memzone_slot* slot;
+
+	rte_spinlock_lock(&kni_memzone_pool.mutex);
+
+	if(!kni_memzone_pool.free) {
+		rte_spinlock_unlock(&kni_memzone_pool.mutex);
+		return NULL;
+	}
+
+	slot = kni_memzone_pool.free;
+	kni_memzone_pool.free = slot->next;
+
+	if(!kni_memzone_pool.free)
+		kni_memzone_pool.free_tail = NULL;
+
+	rte_spinlock_unlock(&kni_memzone_pool.mutex);
+
+	return slot;
+}
+
+static void
+kni_memzone_pool_release(struct rte_kni_memzone_slot* slot)
+{
+	rte_spinlock_lock(&kni_memzone_pool.mutex);
+
+	if(kni_memzone_pool.free)
+		kni_memzone_pool.free_tail->next = slot;
+	else
+		kni_memzone_pool.free = slot;
+
+	kni_memzone_pool.free_tail = slot;
+	slot->next = NULL;
+
+	rte_spinlock_unlock(&kni_memzone_pool.mutex);
+}
+
+
+/* Shall be called before any allocation happens */
+void
+rte_kni_init(unsigned int max_kni_ifaces)
+{
+
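For readers following the pool management in the patch above, here is a minimal sketch of the same head/tail free-list idea outside of DPDK. It is not the patch's code: the names (slot, slot_pool, pool_init, pool_alloc, pool_release) are hypothetical and the spinlock is left out for brevity. It only illustrates why keeping a tail pointer makes release O(1) and causes released slots to be reused last.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for rte_kni_memzone_slot / rte_kni_memzone_pool. */
struct slot {
	unsigned int id;
	struct slot *next;      /* next free slot, NULL if last */
};

struct slot_pool {
	struct slot *free;      /* first free slot */
	struct slot *free_tail; /* last free slot */
};

/* Chain all n slots into the free list, in index order. */
void
pool_init(struct slot_pool *p, struct slot *slots, unsigned int n)
{
	unsigned int i;

	assert(n > 0);
	for (i = 0; i < n; i++) {
		slots[i].id = i;
		slots[i].next = (i + 1 < n) ? &slots[i + 1] : NULL;
	}
	p->free = &slots[0];
	p->free_tail = &slots[n - 1];
}

/* Pop from the head; returns NULL when the pool is exhausted. */
struct slot *
pool_alloc(struct slot_pool *p)
{
	struct slot *s = p->free;

	if (s == NULL)
		return NULL;
	p->free = s->next;
	if (p->free == NULL)
		p->free_tail = NULL;
	return s;
}

/* Append at the tail, so a released slot is handed out again last. */
void
pool_release(struct slot_pool *p, struct slot *s)
{
	if (p->free != NULL)
		p->free_tail->next = s;
	else
		p->free = s;
	p->free_tail = s;
	s->next = NULL;
}
```

In a multi-threaded caller, each of the three functions would need to run under the pool lock, as the patch does with rte_spinlock_lock()/rte_spinlock_unlock().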
[dpdk-dev] [PATCH 1/5] vmxnet3: Fix VLAN Rx stripping
On Sun, 12 Oct 2014 23:23:05 -0700 Yong Wang wrote:

> Shouldn't reset vlan_tci to 0 if a valid VLAN tag is stripped.
>
> Signed-off-by: Yong Wang

Since vlan_tci is initialized to zero by the rte_pktmbuf layer, the driver shouldn't be messing with it.
[dpdk-dev] [PATCH v4 4/7] i40e: add hash filter control implementation
Hi Helin,

Should we define packet classification types separately, rather than reusing the bit shifts of the RSS register as pctypes?
Packet classification is a global index table which is used by the RSS Hash Enable registers, not vice versa.

For example, there is no packet classification type named "ETH_RSS_NONF_IPV4_UDP_SHIFT" in Table 7-15 of the XL710 Datasheet; it is "PCTYPE_NONF_IPV4_UDP". So

	switch (info->pctype) {
	case PCTYPE_NONF_IPV4_UDP:
	case PCTYPE_NONF_IPV4_TCP:
	...

is cleaner and more readable to me than

	switch (info->pctype) {
	case ETH_RSS_NONF_IPV4_UDP_SHIFT:
	case ETH_RSS_NONF_IPV4_TCP_SHIFT:

We can redefine ETH_RSS_* using PCTYPE_* though:

	#define ETH_RSS_NONF_IPV4_UDP_SHIFT PCTYPE_NONF_IPV4_UDP

and so on.

Regards,
Andrey

-----Original Message-----
From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Helin Zhang
Sent: Monday, October 13, 2014 7:13 AM
To: dev at dpdk.org
Subject: [dpdk-dev] [PATCH v4 4/7] i40e: add hash filter control implementation

Hash filter control has been implemented for i40e. It includes getting/setting
- hash function type
- symmetric hash enable per pctype (packet classification type)
- symmetric hash enable per port
- filter swap configuration

Signed-off-by: Helin Zhang
---
 lib/librte_pmd_i40e/i40e_ethdev.c | 402 ++
 1 file changed, 402 insertions(+)

diff --git a/lib/librte_pmd_i40e/i40e_ethdev.c b/lib/librte_pmd_i40e/i40e_ethdev.c
index 46c43a7..60b619b 100644
--- a/lib/librte_pmd_i40e/i40e_ethdev.c
+++ b/lib/librte_pmd_i40e/i40e_ethdev.c
@@ -216,6 +216,10 @@ static int i40e_dev_rss_hash_update(struct rte_eth_dev *dev,
 			struct rte_eth_rss_conf *rss_conf);
 static int i40e_dev_rss_hash_conf_get(struct rte_eth_dev *dev,
 			struct rte_eth_rss_conf *rss_conf);
+static int i40e_dev_filter_ctrl(struct rte_eth_dev *dev,
+			enum rte_filter_type filter_type,
+			enum rte_filter_op filter_op,
+			void *arg);

 /* Default hash key buffer for RSS */
 static uint32_t rss_key_default[I40E_PFQF_HKEY_MAX_INDEX + 1];
@@ -267,6 +271,7 @@ static struct eth_dev_ops i40e_eth_dev_ops = {
 	.reta_query = i40e_dev_rss_reta_query,
 	.rss_hash_update = i40e_dev_rss_hash_update,
 	.rss_hash_conf_get= i40e_dev_rss_hash_conf_get,
+	.filter_ctrl = i40e_dev_filter_ctrl,
 };

 static struct eth_driver rte_i40e_pmd = {
@@ -4162,3 +4167,400 @@ i40e_pf_config_mq_rx(struct i40e_pf *pf)
 	return 0;
 }
+
+/* Get the symmetric hash enable configurations per PCTYPE */
+static int
+i40e_get_symmetric_hash_enable_per_pctype(struct i40e_hw *hw,
+			struct rte_eth_sym_hash_ena_info *info)
+{
+	uint32_t reg;
+
+	switch (info->pctype) {
+	case ETH_RSS_NONF_IPV4_UDP_SHIFT:
+	case ETH_RSS_NONF_IPV4_TCP_SHIFT:
+	case ETH_RSS_NONF_IPV4_SCTP_SHIFT:
+	case ETH_RSS_NONF_IPV4_OTHER_SHIFT:
+	case ETH_RSS_FRAG_IPV4_SHIFT:
+	case ETH_RSS_NONF_IPV6_UDP_SHIFT:
+	case ETH_RSS_NONF_IPV6_TCP_SHIFT:
+	case ETH_RSS_NONF_IPV6_SCTP_SHIFT:
+	case ETH_RSS_NONF_IPV6_OTHER_SHIFT:
+	case ETH_RSS_FRAG_IPV6_SHIFT:
+	case ETH_RSS_L2_PAYLOAD_SHIFT:
+		reg = I40E_READ_REG(hw, I40E_GLQF_HSYM(info->pctype));
+		info->enable = reg & I40E_GLQF_HSYM_SYMH_ENA_MASK ? 1 : 0;
+		break;
+	default:
+		PMD_DRV_LOG(ERR, "PCTYPE[%u] not supported", info->pctype);
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+/* Set the symmetric hash enable configurations per PCTYPE */
+static int
+i40e_set_symmetric_hash_enable_per_pctype(struct i40e_hw *hw,
+			const struct rte_eth_sym_hash_ena_info *info)
+{
+	uint32_t reg;
+
+	switch (info->pctype) {
+	case ETH_RSS_NONF_IPV4_UDP_SHIFT:
+	case ETH_RSS_NONF_IPV4_TCP_SHIFT:
+	case ETH_RSS_NONF_IPV4_SCTP_SHIFT:
+	case ETH_RSS_NONF_IPV4_OTHER_SHIFT:
+	case ETH_RSS_FRAG_IPV4_SHIFT:
+	case ETH_RSS_NONF_IPV6_UDP_SHIFT:
+	case ETH_RSS_NONF_IPV6_TCP_SHIFT:
+	case ETH_RSS_NONF_IPV6_SCTP_SHIFT:
+	case ETH_RSS_NONF_IPV6_OTHER_SHIFT:
+	case ETH_RSS_FRAG_IPV6_SHIFT:
+	case ETH_RSS_L2_PAYLOAD_SHIFT:
+		reg = info->enable ? I40E_GLQF_HSYM_SYMH_ENA_MASK : 0;
+		I40E_WRITE_REG(hw, I40E_GLQF_HSYM(info->pctype), reg);
+		I40E_WRITE_FLUSH(hw);
+		break;
+	default:
+		PMD_DRV_LOG(ERR, "PCTYPE[%u] not supported", info->pctype);
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+/* Get the symmetric hash enable configurations per port */
+static void
+i40e_get_symmetric_hash_enable_per_port(struct i40e_hw *hw, uint8_t *enable)
+{
+	uint32_t reg = I40E_READ_REG(hw, I40E_PRT
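The suggestion in the review comment above (PCTYPE as the primary definition, with the RSS shifts derived from it) could be sketched roughly as follows. Everything here is hypothetical: the enum name, the two helpers, and in particular the numeric values, which are assumptions that must be verified against the PCTYPE table in the XL710 datasheet before any real use.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical PCTYPE definitions. The values are assumptions to be
 * checked against the XL710 datasheet, not taken from the patch. */
enum pctype_sketch {
	PCTYPE_NONF_IPV4_UDP   = 31,
	PCTYPE_NONF_IPV4_TCP   = 33,
	PCTYPE_NONF_IPV4_SCTP  = 34,
	PCTYPE_NONF_IPV4_OTHER = 35,
	PCTYPE_FRAG_IPV4       = 36,
	PCTYPE_NONF_IPV6_UDP   = 41,
	PCTYPE_NONF_IPV6_TCP   = 43,
	PCTYPE_NONF_IPV6_SCTP  = 44,
	PCTYPE_NONF_IPV6_OTHER = 45,
	PCTYPE_FRAG_IPV6       = 46,
	PCTYPE_L2_PAYLOAD      = 63
};

/* RSS shifts expressed in terms of PCTYPEs, not the other way round. */
#define ETH_RSS_NONF_IPV4_UDP_SHIFT PCTYPE_NONF_IPV4_UDP
#define ETH_RSS_NONF_IPV4_TCP_SHIFT PCTYPE_NONF_IPV4_TCP

/* Returns 1 if the value names one of the PCTYPEs listed above. */
int
pctype_is_supported(unsigned int pctype)
{
	switch (pctype) {
	case PCTYPE_NONF_IPV4_UDP:
	case PCTYPE_NONF_IPV4_TCP:
	case PCTYPE_NONF_IPV4_SCTP:
	case PCTYPE_NONF_IPV4_OTHER:
	case PCTYPE_FRAG_IPV4:
	case PCTYPE_NONF_IPV6_UDP:
	case PCTYPE_NONF_IPV6_TCP:
	case PCTYPE_NONF_IPV6_SCTP:
	case PCTYPE_NONF_IPV6_OTHER:
	case PCTYPE_FRAG_IPV6:
	case PCTYPE_L2_PAYLOAD:
		return 1;
	default:
		return 0;
	}
}

/* Bit in a 64-bit hash-enable mask that corresponds to a PCTYPE index. */
uint64_t
pctype_to_hash_ena_bit(unsigned int pctype)
{
	assert(pctype < 64);
	return 1ULL << pctype;
}
```

With this layout the switch statements in the patch would read `case PCTYPE_NONF_IPV4_UDP:` and so on, while existing users of the ETH_RSS_*_SHIFT names keep compiling.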
[dpdk-dev] Aligned RX data.
> -----Original Message-----
> From: Ananyev, Konstantin
> Sent: Monday, October 13, 2014 12:30 PM
> To: Ananyev, Konstantin
> Subject: FW: [dpdk-dev] Aligned RX data.
>
> From: Alex Markuze [mailto:alex at weka.io]
> Sent: Monday, October 13, 2014 9:47 AM
> To: Ananyev, Konstantin
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] Aligned RX data.
>
> Hi All,
> Is there a way to create a mempool such that all mbufs are aligned to X. lets say X is 512.
>
> Thanks.
>

For example, something like this:

struct rte_mempool *
mempool_xz1_create(uint32_t elt_num, int32_t socket_id)
{
	struct rte_mempool *mp;
	const struct rte_memzone *mz;
	struct rte_mempool_objsz obj_sz;
	uint32_t flags, elt_size, total_size;
	size_t sz;
	phys_addr_t pa;
	void *va;

	/* mp element header_size==64B, trailer_size==0. */
	flags = MEMPOOL_F_NO_SPREAD;

	/* to make total element size of mp 2K. */
	elt_size = 2048 - 64;

	total_size = rte_mempool_calc_obj_size(elt_size, flags, &obj_sz);
	sz = elt_num * total_size + 512;

	if ((mz = rte_memzone_reserve_aligned("xz1_obj", sz, socket_id,
			0, 512)) == NULL)
		return (NULL);

	va = (char *)mz->addr + 512 - obj_sz.header_size;
	pa = mz->phys_addr + 512 - obj_sz.header_size;

	mp = rte_mempool_xmem_create("xz1", elt_num, elt_size,
		256, sizeof(struct rte_pktmbuf_pool_private),
		rte_pktmbuf_pool_init, NULL,
		rte_pktmbuf_init, NULL,
		socket_id, flags, va, &pa,
		MEMPOOL_PG_NUM_DEFAULT, MEMPOOL_PG_SHIFT_MAX);

	return (mp);
}

Each mbuf will be aligned on a 512B boundary and have 1856B (2K - 64B header - 128B mbuf) of buffer space.

An alternative way is to provide your own element constructor instead of rte_pktmbuf_init() for mempool_create, and inside it align buf_addr and buf_physaddr.
Though in that case you have to set RTE_MBUF_REFCNT=n in your config.
That's why I'd say it is not recommended.

Konstantin

>
> On Sat, Oct 11, 2014 at 5:04 PM, Alex Markuze wrote:
> O.k, And how would I do that?
> I'm guessing there is something I can control in rte_pktmbuf_pool_init?
> I would appreciate If you could spare a word or two in the matter.
>
> On Tue, Oct 7, 2014 at 7:11 PM, Ananyev, Konstantin <konstantin.ananyev at intel.com> wrote:
>
> > -----Original Message-----
> > From: Ananyev, Konstantin
> > Sent: Tuesday, October 07, 2014 5:03 PM
> > To: Ananyev, Konstantin
> > Subject: FW: [dpdk-dev] Aligned RX data.
> >
> > From: Alex Markuze [mailto:alex at weka.io]
> > Sent: Tuesday, October 07, 2014 4:52 PM
> > To: Ananyev, Konstantin
> > Cc: dev at dpdk.org
> > Subject: Re: [dpdk-dev] Aligned RX data.
> >
> > RTE_PKTMBUF_HEADROOM defines the headroom
>
> Yes.
>
> >this would be true only if the buff_start was aligned to 512 which is not.
>
> As I said: "Make sure that your all your mbufs are aligned by 512".
>
> Konstantin
>
> >
> > On Tue, Oct 7, 2014 at 1:05 PM, Ananyev, Konstantin <konstantin.ananyev at intel.com> wrote:
> >
> > > -----Original Message-----
> > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Alex Markuze
> > > Sent: Tuesday, October 07, 2014 10:40 AM
> > > To: dev at dpdk.org
> > > Subject: [dpdk-dev] Aligned RX data.
> > >
> > > Hi , I'm trying to receive aligned packets from the wire.
> > > Meaning that for all received packets the pkt.data is always aligned to
> > > (512 -H).
> > >
> > > Looking at the pmds of ixgbe/vmxnet I see that the pmds call
> > > __rte_mbuf_raw_alloc and set the rx descriptor with a
> > > RTE_MBUF_DATA_DMA_ADDR_DEFAULT
> > > Instead of the more appropriate RTE_MBUF_DATA_DMA_ADDR.
> > >
> > > Do I need to modify each pmd I'm using to be able to receive aligned data?
> >
> > Make sure that your all your mbufs are aligned by 512 and set in your config RTE_PKTMBUF_HEADROOM=512-H?
> >
> > > Or have I missed something?
> > >
> > > Thanks
>
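The address arithmetic behind the mempool example in this message can be checked without DPDK. The sketch below is an illustration only: obj_addr() is a hypothetical helper, and the three sizes are the ones assumed in the example (512B alignment, 64B element header, 2K total element size).

```c
#include <assert.h>
#include <stdint.h>

#define ALIGN      512u   /* requested mbuf alignment */
#define HDR_SIZE   64u    /* assumed mempool element header size */
#define TOTAL_SIZE 2048u  /* assumed total element size (header + object) */

/*
 * Address of the i-th object (the mbuf itself) when the memzone base
 * is ALIGN-aligned and the pool starts at base + ALIGN - HDR_SIZE.
 */
uintptr_t
obj_addr(uintptr_t zone_base, uint32_t i)
{
	uintptr_t first_elt;

	assert(zone_base % ALIGN == 0);
	/* Same offset as "va = mz->addr + 512 - obj_sz.header_size". */
	first_elt = zone_base + ALIGN - HDR_SIZE;
	/* Skip i whole elements, then this element's own header. */
	return first_elt + (uintptr_t)i * TOTAL_SIZE + HDR_SIZE;
}
```

The first object lands at base + 512, and because the total element size (2048) is a multiple of 512, every following object stays on a 512B boundary. That is also why the example reserves 512 extra bytes in the memzone.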
[dpdk-dev] [PATCH v4 0/7] Support configuring hash functions
> These patches mainly support configuring hash functions.

Tested-by: Zhaochen Zhan

This patch has been verified on three kinds of Fortville NICs.

Base commit: 23fcffe8ffaccf8a2901050e7daa4979597141ed
CPU: Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz
OS: Linux fc20 3.11.10-301.fc20.x86_64
GCC: 4.8.2
NIC: 4*10G(fortville_eagle), 2*40G(fortville_spirit), 1*40G(fortville_spirit_single)

===============================================================
Fortville RSS full support - Support configuring hash functions
===============================================================

This document provides the test plan for the Fortville feature "Support configuring hash functions".

Prerequisites
-------------

2x Intel(r) 82599 (Niantic) NICs (2x 10GbE full duplex optical ports per NIC)
1x Fortville_eagle NIC (4x 10G)
1x Fortville_spirit NIC (2x 40G)
2x Fortville_spirit_single NIC (1x 40G)

The four ports of the 82599 connect to the Fortville_eagle;
the two ports of Fortville_spirit connect to Fortville_spirit_single.
The three kinds of Fortville NICs are the target NICs. The connected NICs can send packets to these three NICs using scapy.

Network Traffic
---------------

The RSS feature is designed to improve networking performance by load balancing the packets received from a NIC port to multiple NIC RX queues, with each queue handled by a different logical core.

#1. The received packet is parsed into the header fields used by the hash operation (such as IP addresses, TCP port, etc.)

#2. A hash calculation is performed. Fortville supports four hash functions: Toeplitz, simple XOR, and their symmetric variants.

#3. The seven (128-entry table) or nine (512-entry table) LSBs of the hash result are used as an index into the 128/512-entry 'redirection table'. Each entry provides a 4-bit RSS output index.

#4. There are four test cases, one for each of the four hash functions.

Test Case: test_toeplitz
========================

Testpmd configuration - 16 RX/TX queues per port
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

#1. Set up testpmd with Fortville NICs::

  ./testpmd -c f -n %d -- -i --coremask=0xe --rxq=16 --txq=16

#2. Reta configuration. 128 reta entries configuration::

  testpmd command: port config 0 rss reta (hash_index,queue_id)

#3. Set the PMD forwarding mode to only receive packets::

  testpmd command: set fwd rxonly

#4. RSS received packet type configuration. Two received packet types configuration::

  testpmd command: port config 0 rss ip/udp

#5. Verbose configuration::

  testpmd command: set verbose 8

#6. Set the hash function. Symmetric or not, port and packet type can be chosen::

  set_hash_function 0 toeplitz

#7. Start packet receive::

  testpmd command: start

Tester configuration
~~~~~~~~~~~~~~~~~~~~

#1. Set up scapy.

#2. Send packets of different types (ipv4, ipv4 with tcp, ipv4 with udp, ipv6, ipv6 with tcp, ipv6 with udp)::

  sendp([Ether(dst="90:e2:ba:36:99:3c")/IP(src="192.168.0.4", dst="192.168.0.5")], iface="eth3")

Test result
-----------

testpmd prints the hash value and the actual queue of every packet.

#1. Calculate the queue id: take hash value % 128 (or 512), then look that index up in the redirection table to get the theoretical queue id.

#2. Compare the theoretical queue id with the actual queue id.

Test Case: test_toeplitz_symmetric
==================================

The same as the above steps, but pay attention to "set hash function", which should use::

  set_hash_function 0 toeplitz
  set_sym_hash_ena_per_port 0 enable
  set_sym_hash_ena_per_pctype 0 35 enable

And send packets of the same flow in both directions::

  sendp([Ether(dst="90:e2:ba:36:99:3c")/IP(src="192.168.0.4", dst="192.168.0.5")], iface="eth3")
  sendp([Ether(dst="90:e2:ba:36:99:3c")/IP(src="192.168.0.5", dst="192.168.0.4")], iface="eth3")

The hash value and queue should be the same for these two packets.

Test Case: test_simple
======================

The same as the above two test cases. Just pay attention to set the hash function to "simple xor".

Test Case: test_simple_symmetric
================================

The same as the above two test cases. Just pay attention to set the hash function to "simple xor".
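The "theoretical queue id" check in the test plan can be sketched as below. This is only an illustration under stated assumptions: rss_queue() and toy_symmetric_hash() are hypothetical helpers, the redirection table is modelled as a plain byte array, and the toy hash is NOT the Toeplitz or simple XOR function implemented by the hardware; it merely shows the property the symmetric test cases look for.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Theoretical RX queue for a packet: the low bits of the RSS hash index
 * the redirection table (RETA), and the entry holds the queue id.
 * reta_size must be a power of two (128 or 512 in the test plan).
 */
uint8_t
rss_queue(uint32_t hash, const uint8_t *reta, uint32_t reta_size)
{
	assert(reta_size != 0 && (reta_size & (reta_size - 1)) == 0);
	return reta[hash & (reta_size - 1)];   /* same as hash % reta_size */
}

/*
 * Toy symmetric hash: swapping source and destination gives the same
 * value, so both directions of a flow map to the same queue.
 */
uint32_t
toy_symmetric_hash(uint32_t src_ip, uint32_t dst_ip)
{
	return src_ip ^ dst_ip;
}
```

In the test, the hash printed by testpmd for each packet would be fed to a lookup like rss_queue() and the result compared with the queue testpmd actually reports.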
[dpdk-dev] [PATCH] i40e: fix of compile error
> It fixes the compile error as below on gcc version 4.3.4.
> cc1: error: unrecognized command line option "-Wno-unused-but-set-variable"
>
> Signed-off-by: Helin Zhang

Tested-by: Zhaochen Zhan

This patch has been verified on SUSE with gcc 4.3.4. It fixes the compile error related to i40e, but DPDK still has an error in "app/test/test_prefetch.c" with gcc 4.3.4.

Base commit: 23fcffe8ffaccf8a2901050e7daa4979597141ed
CPU: Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz
OS: SUSE 11, 3.0.13-0.5-default
GCC: 4.3.4
[dpdk-dev] why no API to free a ring?
Hi,

The rte_ring_create() API can be used to create a ring, so why is there no API to free it?

--
Best Regards,
zimeiw
[dpdk-dev] [PATCH v2 1/4] app/test: unit test for rx and tx cycles/packet
> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Liang, Cunming
> Sent: Sunday, October 12, 2014 12:11 PM
> To: Neil Horman
> Cc: dev at dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v2 1/4] app/test: unit test for rx and tx cycles/packet
>
> Hi Neil,
>
> I very much appreciate your comments.
> I have added inline replies and will send v3 asap once we are aligned.
>
> BRs,
> Liang Cunming
>
> > -----Original Message-----
> > From: Neil Horman [mailto:nhorman at tuxdriver.com]
> > Sent: Saturday, October 11, 2014 1:52 AM
> > To: Liang, Cunming
> > Cc: dev at dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v2 1/4] app/test: unit test for rx and tx cycles/packet
> >
> > On Fri, Oct 10, 2014 at 08:29:58PM +0800, Cunming Liang wrote:
> > > It provides unit test to measure cycles/packet in NIC loopback mode.
> > > It simply gives the average cycles of IO used per packet without test equipment.
> > > When doing the test, make sure the link is UP.
> > >
> > > Usage Example:
> > > 1. Run unit test app in interactive mode
> > >     app/test -c f -n 4 -- -i
> > > 2. Run and wait for the result
> > >     pmd_perf_autotest
> > >
> > > There's option to choose rx/tx pair, default is vector.
> > >     set_rxtx_mode [vector|scalar|full|hybrid]
> > > Note: To get accurate scalar fast, please choose 'vector' or 'hybrid' without INC_VEC=y in config
> > >
> > > Signed-off-by: Cunming Liang
> > > Acked-by: Bruce Richardson
> >
> > Notes inline
> >
> > > ---
> > >  app/test/Makefile                   |   1 +
> > >  app/test/commands.c                 |  38 +++
> > >  app/test/packet_burst_generator.c   |   4 +-
> > >  app/test/test.h                     |   4 +
> > >  app/test/test_pmd_perf.c            | 626 +++
> > >  lib/librte_pmd_ixgbe/ixgbe_ethdev.c |   6 +
> > >  6 files changed, 677 insertions(+), 2 deletions(-)
> > >  create mode 100644 app/test/test_pmd_perf.c
> > >
> > > diff --git a/app/test/Makefile b/app/test/Makefile
> > > index 6af6d76..ebfa0ba 100644
> > > --- a/app/test/Makefile
> > > +++ b/app/test/Makefile
> > > @@ -56,6 +56,7 @@ SRCS-y += test_memzone.c
> > >
> > >  SRCS-y += test_ring.c
> > >  SRCS-y += test_ring_perf.c
> > > +SRCS-y += test_pmd_perf.c
> > >
> > >  ifeq ($(CONFIG_RTE_LIBRTE_TABLE),y)
> > >  SRCS-y += test_table.c
> > > diff --git a/app/test/commands.c b/app/test/commands.c
> > > index a9e36b1..f1e746e 100644
> > > --- a/app/test/commands.c
> > > +++ b/app/test/commands.c
> > > @@ -310,12 +310,50 @@ cmdline_parse_inst_t cmd_quit = {
> > >
> > > +#define NB_ETHPORTS_USED (1)
> > > +#define NB_SOCKETS (2)
> > > +#define MEMPOOL_CACHE_SIZE 250
> > > +#define MBUF_SIZE (2048 + sizeof(struct rte_mbuf) + RTE_PKTMBUF_HEADROOM)
> >
> > Don't you want to size this in accordance with the amount of data your sending
> > (64 Bytes as noted above)?
>
> [Liang, Cunming] The case is designed to measure small packet IO cost with normal mbuf size.
> Even if the size is decreased, it won't gain significant cycles.
>
> > > +static void
> > > +print_ethaddr(const char *name, const struct ether_addr *eth_addr)
> > > +{
> > > +	printf("%s%02X:%02X:%02X:%02X:%02X:%02X", name,
> > > +		eth_addr->addr_bytes[0],
> > > +		eth_addr->addr_bytes[1],
> > > +		eth_addr->addr_bytes[2],
> > > +		eth_addr->addr_bytes[3],
> > > +		eth_addr->addr_bytes[4],
> > > +		eth_addr->addr_bytes[5]);
> > > +}
> > > +
> >
> > This was copied from print_ethaddr. Seems like a good candidate for a common
> > function in rte_ether.h
>
> [Liang, Cunming] Agree with you, some of the samples now use the same copy.
> I'll rework it, adding 'ether_format_addr' in rte_ether.h only to format the 48-bit address output,
> and leaving other prints for application customization.
>
> > > +}
> > > +
> > > +static void
> > > +signal_handler(int signum)
> > > +{
> > > +	/* When we receive a USR1 signal, print stats */
> >
> > I think you mean SIGUSR2, below, SIGUSR1 tears the test down and exits the
> > program
>
> [Liang, Cunming] Thanks, it's a typo.
>
> > > +	if (signum == SIGUSR1) {
> >
> > SIGINT instead. That's the common practice.
>
> [Liang, Cunming] I understand your opinion.
> The reasons I'm not using SIGINT instead are:
> 1. We unset ISIG in c_lflag of the terminal, so CTRL+C won't trigger SIGINT in the interactive command line.
>    A signal always has to be sent explicitly, no matter whether it is SIGUSR1 or SIGINT.
> 2. By SIGINT semantics, the process is expected to terminate.
>    Here I expect to force stop this case, but stay alive in the command line.
>    After it has stopped, it can run again or start to run other test cases.
>    So I keep SIGINT and SIGUSR1 with different behavior.
> 3. It should be rarely used.
>    Only on an exceptional timeout do I leave this backdoor for automated test control.
>    For manual tests, we can easily force kill the process.
>
> > > +		printf("Force Stop!\n");
> > > +		st
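The common helper proposed in the reply above (format only the 48-bit address, leave the rest of the printing to the application) might look roughly like the sketch below. The struct and function names carry a _sketch suffix on purpose: they are hypothetical stand-ins, not the rte_ether.h definitions, and the final signature in DPDK may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for DPDK's struct ether_addr (six address bytes). */
struct ether_addr_sketch {
	uint8_t addr_bytes[6];
};

/*
 * Format a 48-bit MAC address as "XX:XX:XX:XX:XX:XX" into buf.
 * Returns what snprintf() returns: the number of characters that
 * would be written (17), excluding the terminating NUL.
 */
int
ether_format_addr_sketch(char *buf, size_t size,
		const struct ether_addr_sketch *eth_addr)
{
	assert(buf != NULL && eth_addr != NULL);
	return snprintf(buf, size, "%02X:%02X:%02X:%02X:%02X:%02X",
		eth_addr->addr_bytes[0],
		eth_addr->addr_bytes[1],
		eth_addr->addr_bytes[2],
		eth_addr->addr_bytes[3],
		eth_addr->addr_bytes[4],
		eth_addr->addr_bytes[5]);
}
```

An application-level print_ethaddr(name, addr) then reduces to formatting into a small stack buffer (18 bytes is enough) and printing the name followed by that buffer.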
[dpdk-dev] [PATCH 1/3] pmd: add new flag to indicate TX TSO operation on the packet
From: Miroslaw Walukiewicz

Transmission of TCP packets can be accelerated by HW Transmit Segmentation Offload (TSO).
With TSO, packets of up to 64K can be transmitted.

When this flag is set, the PMD driver will enable TCP segmentation.

The new field tso_segsz is added to indicate the length of each TCP TSO segment.

Signed-off-by: Mirek Walukiewicz
---
 lib/librte_mbuf/rte_mbuf.h |   5 +
 1 file changed, 5 insertions(+)

diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index ddadc21..63cbc36 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -117,6 +117,9 @@ extern "C" {
 /* Use final bit of flags to indicate a control mbuf */
 #define CTRL_MBUF_FLAG (1ULL << 63) /**< Mbuf contains control data */

+/* Bit 50 - TSO (TCP Transmit Segmenation Offload) */
+#define PKT_TX_TCP_TSO (1ULL << 50) /**< Mbuf needs TSO enabling */
+
 /**
  * Bit Mask to indicate what bits required for building TX context
  */
@@ -196,6 +199,8 @@ struct rte_mbuf {
 			uint16_t l2_len:7; /**< L2 (MAC) Header Length. */
 		};
 	};
+	/* field to support TSO segment size */
+	uint32_t tso_segsz;
 } __rte_cache_aligned;

 /**
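As a rough illustration of how an application would use the new flag and field, the sketch below sets them on a toy structure and computes how many wire segments a given payload turns into. Only PKT_TX_TCP_TSO and its bit position come from the patch; struct toy_mbuf and both helpers are hypothetical and stand in for the real rte_mbuf handling.

```c
#include <assert.h>
#include <stdint.h>

#define PKT_TX_TCP_TSO (1ULL << 50)  /* bit position taken from the patch */

/* Hypothetical, cut-down stand-in for struct rte_mbuf. */
struct toy_mbuf {
	uint64_t ol_flags;
	uint32_t pkt_len;
	uint32_t tso_segsz;
};

/* Ask for TSO on this packet with the given segment size (MSS). */
void
toy_request_tso(struct toy_mbuf *m, uint32_t mss)
{
	assert(mss > 0);
	m->ol_flags |= PKT_TX_TCP_TSO;
	m->tso_segsz = mss;
}

/* Number of TCP segments produced for paylen bytes of TCP payload. */
uint32_t
tso_nb_segs(uint32_t paylen, uint32_t tso_segsz)
{
	assert(tso_segsz > 0);
	return (paylen + tso_segsz - 1) / tso_segsz;   /* round up */
}
```

For example, 64000 bytes of payload with a 1460-byte segment size are split into 44 segments: 43 full ones and a final one of 1220 bytes.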
[dpdk-dev] [PATCH 3/3] pmd i40e: Enable Transmit Segmentation Offload for TCP traffic
From: Miroslaw Walukiewicz

The patch enables the TSO HW feature for the i40e PMD driver.
The feature is reported by rte_eth_dev_info_get() if enabled.

Signed-off-by: Mirek Walukiewicz
---
 lib/librte_pmd_i40e/i40e_ethdev.c |   1 +
 lib/librte_pmd_i40e/i40e_rxtx.c   |  56 ++---
 2 files changed, 53 insertions(+), 4 deletions(-)

diff --git a/lib/librte_pmd_i40e/i40e_ethdev.c b/lib/librte_pmd_i40e/i40e_ethdev.c
index 46c43a7..01b21eb 100644
--- a/lib/librte_pmd_i40e/i40e_ethdev.c
+++ b/lib/librte_pmd_i40e/i40e_ethdev.c
@@ -1399,6 +1399,7 @@ i40e_dev_info_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
 		DEV_TX_OFFLOAD_IPV4_CKSUM |
 		DEV_TX_OFFLOAD_UDP_CKSUM |
 		DEV_TX_OFFLOAD_TCP_CKSUM |
+		DEV_TX_OFFLOAD_TCP_TSO |
 		DEV_TX_OFFLOAD_SCTP_CKSUM;

 	dev_info->default_rxconf = (struct rte_eth_rxconf) {
diff --git a/lib/librte_pmd_i40e/i40e_rxtx.c b/lib/librte_pmd_i40e/i40e_rxtx.c
index 2b53677..bc7af2b 100644
--- a/lib/librte_pmd_i40e/i40e_rxtx.c
+++ b/lib/librte_pmd_i40e/i40e_rxtx.c
@@ -50,6 +50,8 @@
 #include
 #include
 #include
+#include
+#include

 #include "i40e_logs.h"
 #include "i40e/i40e_prototype.h"
@@ -440,6 +442,11 @@ i40e_txd_enable_checksum(uint32_t ol_flags,
 		*td_offset |= (l3_len >> 2) << I40E_TX_DESC_LENGTH_IPLEN_SHIFT;
 	}

+	if (ol_flags & PKT_TX_TCP_TSO) {
+		*td_cmd |= I40E_TX_DESC_CMD_L4T_EOFT_TCP;
+		/* td offset will be set next */
+		return;
+	}
 	/* Enable L4 checksum offloads */
 	switch (ol_flags & PKT_TX_L4_MASK) {
 	case PKT_TX_TCP_CKSUM:
@@ -1073,12 +1080,46 @@ i40e_calc_context_desc(uint64_t flags)
 #ifdef RTE_LIBRTE_IEEE1588
 	mask |= PKT_TX_IEEE1588_TMST;
 #endif
+	/* need for context descriptor when TSO enabled */
+	mask |= PKT_TX_TCP_TSO;
 	if (flags & mask)
 		return 1;

 	return 0;
 }

+/* set i40e TSO context descriptor */
+static inline uint64_t
+i40e_set_tso_ctx(struct rte_mbuf *mbuf, uint8_t l2_len, uint8_t l3_len, uint32_t *td_offset)
+{
+	uint64_t ctx_desc;
+	struct ipv4_hdr *ip;
+	struct tcp_hdr *th;
+	uint32_t tcp_hlen;
+	uint32_t hdrlen;
+	uint32_t paylen;
+
+	/* set mss */
+	ip = (struct ipv4_hdr *) (rte_pktmbuf_mtod(mbuf, unsigned char *) + l2_len);
+	ip->hdr_checksum = 0;
+	ip->total_length = 0;
+	th = (struct tcp_hdr *)((caddr_t)ip + l3_len);
+	th->cksum = rte_in_pseudo(ip->src_addr, ip->dst_addr, I40E_HTONS(IPPROTO_TCP));
+	tcp_hlen = (th->data_off >> 4) << 2;
+	*td_offset |= (tcp_hlen >> 2) <<
+		I40E_TX_DESC_LENGTH_L4_FC_LEN_SHIFT;
+	hdrlen = l2_len + l3_len + tcp_hlen;
+	paylen = mbuf->pkt_len - hdrlen;
+
+	ctx_desc = ((uint64_t)mbuf->tso_segsz <<
+		I40E_TXD_CTX_QW1_MSS_SHIFT) |
+		((uint64_t)paylen << I40E_TXD_CTX_QW1_TSO_LEN_SHIFT) |
+		((uint64_t)I40E_TX_CTX_DESC_TSO <<
+		I40E_TXD_CTX_QW1_CMD_SHIFT);
+
+	return ctx_desc;
+}
+
 uint16_t
 i40e_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 {
@@ -1192,12 +1233,19 @@ i40e_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 				rte_pktmbuf_free_seg(txe->mbuf);
 				txe->mbuf = NULL;
 			}
+			/* TSO enabled means no timestamp */
+			if (ol_flags & PKT_TX_TCP_TSO) {
+				cd_type_cmd_tso_mss |=
+					i40e_set_tso_ctx(tx_pkt, l2_len, l3_len, &td_offset);
+			}
+			else {
 #ifdef RTE_LIBRTE_IEEE1588
-			if (ol_flags & PKT_TX_IEEE1588_TMST)
-				cd_type_cmd_tso_mss |=
-					((uint64_t)I40E_TX_CTX_DESC_TSYN <<
-					I40E_TXD_CTX_QW1_CMD_SHIFT);
+				if (ol_flags & PKT_TX_IEEE1588_TMST)
+					cd_type_cmd_tso_mss |=
+						((uint64_t)I40E_TX_CTX_DESC_TSYN <<
+						I40E_TXD_CTX_QW1_CMD_SHIFT);
 #endif
+			}
 			ctx_txd->tunneling_params =
 				rte_cpu_to_le_32(cd_tunneling_params);
 			ctx_txd->l2tag2 = rte_cpu_to_le_16(cd_l2tag2);
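The length arithmetic in i40e_set_tso_ctx() can be checked in isolation. The two helpers below are hypothetical and only mirror the expressions from the patch: the upper four bits of the TCP data_off byte give the header length in 32-bit words, and the TSO payload length is the packet length minus the L2, L3 and TCP headers.

```c
#include <assert.h>
#include <stdint.h>

/* TCP header length in bytes; same as "(th->data_off >> 4) << 2". */
uint32_t
tcp_hdr_len(uint8_t data_off)
{
	return (uint32_t)(data_off >> 4) << 2;
}

/* Payload length that the TSO context descriptor has to describe. */
uint32_t
tso_payload_len(uint32_t pkt_len, uint32_t l2_len, uint32_t l3_len,
		uint8_t data_off)
{
	uint32_t hdrlen = l2_len + l3_len + tcp_hdr_len(data_off);

	/* A packet shorter than its own headers would underflow paylen. */
	assert(pkt_len >= hdrlen);
	return pkt_len - hdrlen;
}
```

For a TCP header without options, data_off is 0x50, which gives 20 bytes; the descriptor field written via td_offset then holds 20 >> 2 = 5 words.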
[dpdk-dev] [PATCH 2/3] pmd: add new header containing TCP offload specific definitions
From: Miroslaw Walukiewicz

Add a function for computing the initial TCP header checksum.
The file is common to both the i40e and ixgbe PMD drivers.

Signed-off-by: Mirek Walukiewicz
---
 lib/librte_net/Makefile      |   3 +
 lib/librte_net/rte_tcp_off.h | 122 ++
 2 files changed, 124 insertions(+), 1 deletion(-)
 create mode 100644 lib/librte_net/rte_tcp_off.h

diff --git a/lib/librte_net/Makefile b/lib/librte_net/Makefile
index ad2e482..83e76d1 100644
--- a/lib/librte_net/Makefile
+++ b/lib/librte_net/Makefile
@@ -34,7 +34,8 @@ include $(RTE_SDK)/mk/rte.vars.mk
 CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR) -O3

 # install includes
-SYMLINK-$(CONFIG_RTE_LIBRTE_NET)-include := rte_ip.h rte_tcp.h rte_udp.h rte_sctp.h rte_icmp.h rte_arp.h
+SYMLINK-$(CONFIG_RTE_LIBRTE_NET)-include := rte_ip.h rte_tcp.h rte_udp.h rte_sctp.h rte_icmp.h rte_arp.h \
+rte_tcp_off.h

 include $(RTE_SDK)/mk/rte.install.mk

diff --git a/lib/librte_net/rte_tcp_off.h b/lib/librte_net/rte_tcp_off.h
new file mode 100644
index 000..6143396
--- /dev/null
+++ b/lib/librte_net/rte_tcp_off.h
@@ -0,0 +1,122 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2014 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ * + * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + */ + +/* + * Copyright (c) 1982, 1986, 1990, 1993 + * The Regents of the University of California. All rights reserved. + * + * Redistribution and use in source and binary forms, with or without + * modification, are permitted provided that the following conditions + * are met: + * 1. Redistributions of source code must retain the above copyright + *notice, this list of conditions and the following disclaimer. + * 2. Redistributions in binary form must reproduce the above copyright + *notice, this list of conditions and the following disclaimer in the + *documentation and/or other materials provided with the distribution. + * 3. All advertising materials mentioning features or use of this software + *must display the following acknowledgement: + * This product includes software developed by the University of + * California, Berkeley and its contributors. + * 4. Neither the name of the University nor the names of its contributors + *may be used to endorse or promote products derived from this software + *without specific prior written permission. 
+ * + * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE + * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + * + * @(#)in.h8.3 (Berkeley) 1/3/94 + * $FreeBSD: src/sys/netinet/in.h,v 1.82 2003/10/25 09:37:10 ume Exp $ + */ + +#ifndef _RTE_TCP_OFF_H_ +#define _RTE_TCP_OFF_H_ + +/** + * @file + * + * TCP offload -related defines + */ + +#include + +#ifdef __
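The header is cut off before the function body, so the following is only a sketch of what a BSD-style pseudo-header sum generally looks like, not the content of rte_tcp_off.h. The name in_pseudo_sketch() is hypothetical. The idea is to add the source address, destination address and protocol word, then fold the carries back into 16 bits; the result is left uninverted so that hardware can continue the one's-complement sum over each segment it generates.

```c
#include <assert.h>
#include <stdint.h>

/*
 * One's-complement sum of the pseudo-header words, folded to 16 bits.
 * All three inputs are expected in network byte order, as they appear
 * in the packet. The result is NOT inverted.
 */
uint16_t
in_pseudo_sketch(uint32_t src_addr, uint32_t dst_addr, uint32_t proto)
{
	uint64_t sum = (uint64_t)src_addr + dst_addr + proto;

	/* Fold the carries back in until the sum fits in 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);

	assert(sum <= 0xffff);
	return (uint16_t)sum;
}
```

Note that the TCP length is deliberately absent here, which matches the call in the i40e patch where only the two addresses and the protocol are passed.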
[dpdk-dev] Aligned RX data.
Very helpful, thanks a lot. It sure does seem to do the trick. On Mon, Oct 13, 2014 at 2:43 PM, Ananyev, Konstantin < konstantin.ananyev at intel.com> wrote: > > > > -Original Message- > > From: Ananyev, Konstantin > > Sent: Monday, October 13, 2014 12:30 PM > > To: Ananyev, Konstantin > > Subject: FW: [dpdk-dev] Aligned RX data. > > > > > > > > From: Alex Markuze [mailto:alex at weka.io] > > Sent: Monday, October 13, 2014 9:47 AM > > To: Ananyev, Konstantin > > Cc: dev at dpdk.org > > Subject: Re: [dpdk-dev] Aligned RX data. > > > > Hi All, > > Is there a way to create a mempool such that all mbufs are aligned to X. > lets say X is 512. > > > > Thanks. > > > > For example something like that: > > struct rte_mempool * > mempool_xz1_create(uint32_t elt_num, int32_t socket_id) > { > struct rte_mempool *mp; > const struct rte_memzone *mz; > struct rte_mempool_objsz obj_sz; > uint32_t flags, elt_size, total_size; > size_t sz; > phys_addr_t pa; > void *va; > > /* mp element header_size==64B, trailer_size==0. */ > flags = MEMPOOL_F_NO_SPREAD; > > /* to make total element size of mp 2K. */ > elt_size = 2048 - 64; > > total_size = rte_mempool_calc_obj_size(elt_size, flags, &obj_sz); > sz = elt_num * total_size + 512; > > if ((mz = rte_memzone_reserve_aligned("xz1_obj", sz, socket_id, > 0, 512)) == NULL) > return (NULL); > > va = (char *)mz->addr + 512 - obj_sz.header_size; > pa = mz->phys_addr + 512 - obj_sz.header_size; > > mp = rte_mempool_xmem_create("xz1", elt_num, elt_size, > 256, sizeof(struct rte_pktmbuf_pool_private), > rte_pktmbuf_pool_init, NULL, > rte_pktmbuf_init, NULL, > socket_id, flags, va, &pa, > MEMPOOL_PG_NUM_DEFAULT, MEMPOOL_PG_SHIFT_MAX); > > return (mp); > } > > Each mbuf will be aligned on 512B boundary and 1856 (2K - 64B header - > 128B mbuf). > > Alternative way - is to provide your own element constructor instead of > rte_pktmbuf_init() for mempool_create. > And inside it align buf_addr and buf_physaddr. 
> Though in that case you have to set RTE_MBUF_REFCNT=n in your config. > That's why I'd say it is a not recommended. > > Konstantin > > > > > On Sat, Oct 11, 2014 at 5:04 PM, Alex Markuze wrote: > > O.k, And how would I do that? > > I'm guessing there is something I can control in rte_pktmbuf_pool_init? > > I would appreciate If you could spare a word or two in the matter. > > > > On Tue, Oct 7, 2014 at 7:11 PM, Ananyev, Konstantin < > konstantin.ananyev at intel.com> wrote: > > > > > > > -Original Message- > > > From: Ananyev, Konstantin > > > Sent: Tuesday, October 07, 2014 5:03 PM > > > To: Ananyev, Konstantin > > > Subject: FW: [dpdk-dev] Aligned RX data. > > > > > > > > > > > > From: Alex Markuze [mailto:alex at weka.io] > > > Sent: Tuesday, October 07, 2014 4:52 PM > > > To: Ananyev, Konstantin > > > Cc: dev at dpdk.org > > > Subject: Re: [dpdk-dev] Aligned RX data. > > > > > > RTE_PKTMBUF_HEADROOM defines the headroom > > > > Yes. > > > > >this would be true only if the buff_start was aligned to 512 which is > not. > > > > As I said: " Make sure that your all your mbufs are aligned by 512". > > > > Konstantin > > > > > > > > On Tue, Oct 7, 2014 at 1:05 PM, Ananyev, Konstantin < > konstantin.ananyev at intel.com> wrote: > > > > > > > > > > -Original Message- > > > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Alex Markuze > > > > Sent: Tuesday, October 07, 2014 10:40 AM > > > > To: dev at dpdk.org > > > > Subject: [dpdk-dev] Aligned RX data. > > > > > > > > Hi , I'm trying to receive aligned packets from the wire. > > > > Meaning that for all received packets the pkt.data is always aligned > to > > > > (512 -H). > > > > > > > > Looking at the pmds of ixgbe/vmxnet I see that the pmds call > > > > __rte_mbuf_raw_alloc and set the rx descriptor with a > > > > RTE_MBUF_DATA_DMA_ADDR_DEFAULT > > > > Instead of the more appropriate RTE_MBUF_DATA_DMA_ADDR. > > > > > > > > Do I need to modify each pmd I'm using to be able to receive aligned > data? 
> > > Make sure that your all your mbufs are aligned by 512 and set in your > config RTE_PKTMBUF_HEADROOM=512-H? > > > > > > > > > > Or have I missed something? > > > > > > > > Thanks > > > >
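The headroom advice in this thread comes down to simple address arithmetic, sketched below with hypothetical helpers. It assumes the packet data starts at buf_start + headroom and that H is the number of header bytes in front of the part of the packet that should end up aligned; neither helper is a DPDK API.

```c
#include <assert.h>
#include <stdint.h>

/* Headroom that places the byte following an hdr_len-byte header on an
 * align-byte boundary, provided the buffer start is itself aligned. */
uint32_t
headroom_for(uint32_t align, uint32_t hdr_len)
{
	assert(align != 0 && hdr_len <= align);
	return align - hdr_len;
}

/* Address of the payload that follows the header. */
uintptr_t
payload_addr(uintptr_t buf_start, uint32_t headroom, uint32_t hdr_len)
{
	return buf_start + headroom + hdr_len;
}
```

With H = 54 (Ethernet + IPv4 + TCP without options) the headroom is 458. The payload is 512B-aligned only when buf_start is, which is the point made in the replies: setting the headroom alone is not enough if the buffers themselves are not aligned.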
[dpdk-dev] [PATCH v4 4/8] bond: free mbufs if transmission fails in bonding tx_burst functions
> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Declan Doherty
> Sent: Tuesday, September 30, 2014 10:58 AM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v4 4/8] bond: free mbufs if transmission fails in bonding tx_burst functions
>
>
> Signed-off-by: Declan Doherty
> ---
>  app/test/test_link_bonding.c           | 393 -
>  app/test/virtual_pmd.c                 |  80 +--
>  app/test/virtual_pmd.h                 |   7 +
>  lib/librte_pmd_bond/rte_eth_bond_pmd.c |  83 +--
>  4 files changed, 525 insertions(+), 38 deletions(-)
>
> diff --git a/app/test/test_link_bonding.c b/app/test/test_link_bonding.c
> index cce32ed..1a847eb 100644
> --- a/app/test/test_link_bonding.c
> +++ b/app/test/test_link_bonding.c
> @@ -663,6 +663,9 @@ enable_bonded_slaves(void)
>  	int i;
>
>  	for (i = 0; i < test_params->bonded_slave_count; i++) {
> +		virtual_ethdev_tx_burst_fn_set_success(test_params->slave_port_ids[i],
> +				1);
> +
>  		virtual_ethdev_simulate_link_status_interrupt(
>  				test_params->slave_port_ids[i], 1);
>  	}
> @@ -1413,6 +1416,135 @@ test_roundrobin_tx_burst(void)
>  }
>
>  static int
> +verify_mbufs_ref_count(struct rte_mbuf **mbufs, int nb_mbufs, int val)
> +{
> +	int i, refcnt;
> +
> +	for (i = 0; i < nb_mbufs; i++) {
> +		refcnt = rte_mbuf_refcnt_read(mbufs[i]);
> +		TEST_ASSERT_EQUAL(refcnt, val,
> +			"mbuf ref count (%d)is not the expected value (%d)",
> +			refcnt, val);
> +	}
> +	return 0;
> +}
> +
> +
> +static void
> +free_mbufs(struct rte_mbuf **mbufs, int nb_mbufs)
> +{
> +	int i;
> +
> +	for (i = 0; i < nb_mbufs; i++)
> +		rte_pktmbuf_free(mbufs[i]);
> +}
> +
> +#define TEST_RR_SLAVE_TX_FAIL_SLAVE_COUNT (2)
> +#define TEST_RR_SLAVE_TX_FAIL_BURST_SIZE (64)
> +#define TEST_RR_SLAVE_TX_FAIL_PACKETS_COUNT (22)
> +#define TEST_RR_SLAVE_TX_FAIL_FAILING_SLAVE_IDX (1)
> +
> +static int
> +test_roundrobin_tx_burst_slave_tx_fail(void)
> +{
> +	struct rte_mbuf *pkt_burst[MAX_PKT_BURST];
> +	struct rte_mbuf *expected_tx_fail_pkts[MAX_PKT_BURST];
> +
> +	struct rte_eth_stats port_stats;
> +
> +	int i, first_fail_idx, tx_count;
> +
> +	TEST_ASSERT_SUCCESS(initialize_bonded_device_with_slaves(
> +			BONDING_MODE_ROUND_ROBIN, 0,
> +			TEST_RR_SLAVE_TX_FAIL_SLAVE_COUNT, 1),
> +			"Failed to intialise bonded device");
> +
> +	/* Generate test bursts of packets to transmit */
> +	TEST_ASSERT_EQUAL(generate_test_burst(pkt_burst,
> +			TEST_RR_SLAVE_TX_FAIL_BURST_SIZE, 0, 1, 0, 0, 0),
> +			TEST_RR_SLAVE_TX_FAIL_BURST_SIZE,
> +			"Failed to generate test packet burst");
> +
> +	/* Copy references to packets which we expect not to be transmitted */
> +	first_fail_idx = (TEST_RR_SLAVE_TX_FAIL_BURST_SIZE -
> +			(TEST_RR_SLAVE_TX_FAIL_PACKETS_COUNT *
> +			TEST_RR_SLAVE_TX_FAIL_SLAVE_COUNT)) +
> +			TEST_RR_SLAVE_TX_FAIL_FAILING_SLAVE_IDX;
> +
> +	for (i = 0; i < TEST_RR_SLAVE_TX_FAIL_PACKETS_COUNT; i++) {
> +		expected_tx_fail_pkts[i] = pkt_burst[first_fail_idx +
> +				(i * TEST_RR_SLAVE_TX_FAIL_SLAVE_COUNT)];
> +	}
> +
> +	/* Set virtual slave to only fail transmission of
> +	 * TEST_RR_SLAVE_TX_FAIL_PACKETS_COUNT packets in burst */
> +	virtual_ethdev_tx_burst_fn_set_success(
> +			test_params->slave_port_ids[TEST_RR_SLAVE_TX_FAIL_FAILING_SLAVE_IDX],
> +			0);
> +
> +	virtual_ethdev_tx_burst_fn_set_tx_pkt_fail_count(
> +			test_params->slave_port_ids[TEST_RR_SLAVE_TX_FAIL_FAILING_SLAVE_IDX],
> +			TEST_RR_SLAVE_TX_FAIL_PACKETS_COUNT);
> +
> +	tx_count = rte_eth_tx_burst(test_params->bonded_port_id, 0, pkt_burst,
> +			TEST_RR_SLAVE_TX_FAIL_BURST_SIZE);
> +
> +	TEST_ASSERT_EQUAL(tx_count, TEST_RR_SLAVE_TX_FAIL_BURST_SIZE -
> +			TEST_RR_SLAVE_TX_FAIL_PACKETS_COUNT,
> +			"Transmitted (%d) an unexpected (%d) number of packets", tx_count,
> +			TEST_RR_SLAVE_TX_FAIL_BURST_SIZE -
> +			TEST_RR_SLAVE_TX_FAIL_PACKETS_COUNT);
> +
> +	/* Verify that failed packet are expected failed packets */
> +	for (i = 0; i < TEST_RR_SLAVE_TX_FAIL_PACKETS_COUNT; i++) {
> +		TEST_ASSERT_EQUAL(expected_tx_fail_pkts[i], pkt_burst[i + tx_count],
> +				"expected mbuf (%d) pointer %p not expected pointer %p",
+ i, expected_tx_fail_pkts[i], pkt_burst[i + > tx_count]); > + } > + > + /* Verify bonded port tx
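The index arithmetic behind expected_tx_fail_pkts in the test above can be checked in isolation. This is a minimal standalone sketch, not part of the patch. It assumes that round-robin distribution starts at slave 0 and that the failing slave drops the tail of its share, which is what the first_fail_idx formula implies. The macro names are shortened stand-ins for the TEST_RR_SLAVE_TX_FAIL_* values, and expected_fail_indices() is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/* Values mirror the TEST_RR_SLAVE_TX_FAIL_* macros in the quoted patch. */
#define BURST_SIZE        64
#define SLAVE_COUNT       2
#define FAIL_PACKET_COUNT 22
#define FAILING_SLAVE_IDX 1

/*
 * Hypothetical helper: fill out[] with the burst indices that are expected
 * to fail when slave FAILING_SLAVE_IDX rejects the last FAIL_PACKET_COUNT
 * packets of its share. Returns the number of indices written.
 */
static size_t
expected_fail_indices(int *out, size_t out_len)
{
	/* Same formula as first_fail_idx in the test: (64 - 22 * 2) + 1 */
	int first_fail_idx = (BURST_SIZE - FAIL_PACKET_COUNT * SLAVE_COUNT) +
			FAILING_SLAVE_IDX;
	size_t i;

	assert(out != NULL && out_len >= FAIL_PACKET_COUNT);

	/* Every SLAVE_COUNT-th packet from first_fail_idx goes to the
	 * failing slave. */
	for (i = 0; i < FAIL_PACKET_COUNT; i++)
		out[i] = first_fail_idx + (int)i * SLAVE_COUNT;

	return FAIL_PACKET_COUNT;
}
```

With these values the first expected failure is burst index 21 and the last is 63, so the failing packets are the odd indices in the tail of the burst.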
[dpdk-dev] [PATCH v3 20/20] app/test-pmd: add test command to configure flexible masks
> -Original Message- > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jingjing Wu > Sent: Friday, September 26, 2014 7:04 AM > To: dev at dpdk.org > Subject: [dpdk-dev] [PATCH v3 20/20] app/test-pmd: add test command to > configure flexible masks > > add test command to configure flexible masks for each flow type > > Signed-off-by: Jingjing Wu > Acked-by: Chen Jing D(Mark) > Acked-by: Helin Zhang > --- > app/test-pmd/cmdline.c | 173 > + > 1 file changed, 173 insertions(+) > > diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c > index da77752..073b929 100644 > --- a/app/test-pmd/cmdline.c > +++ b/app/test-pmd/cmdline.c > @@ -690,6 +690,11 @@ static void cmd_help_long_parsed(void > *parsed_result, > "flow_director_flex_payload (port_id)" > " (l2|l3|l4) (config)\n" > "configure flexible payload selection.\n\n" > + > + "flow_director_flex_mask (port_id)" > + " flow > (ether|ip4|tcp4|udp4|sctp4|ip6|tcp6|udp6|sctp6)" > + " words_mask (words) (word_mask_list)" > + "configure masks of flexible payload.\n\n" > ); > } > } > @@ -8046,6 +8051,173 @@ cmdline_parse_inst_t > cmd_set_flow_director_flex_payload = { > }, > }; > > +/* *** deal with flow director mask on flexible payload *** */ > +struct cmd_flow_director_flex_mask_result { > + cmdline_fixed_string_t flow_director_flexmask; > + uint8_t port_id; > + cmdline_fixed_string_t flow; > + cmdline_fixed_string_t flow_type; > + cmdline_fixed_string_t words_mask; > + uint8_t words; > + cmdline_fixed_string_t word_mask_list; > +}; > + > +static inline int > +parse_word_masks_cfg(const char *q_arg, > + struct rte_eth_fdir_flex_masks *masks) > +{ > + char s[256]; > + const char *p, *p0 = q_arg; > + char *end; > + enum fieldnames { > + FLD_OFFSET = 0, > + FLD_MASK, > + _NUM_FLD > + }; > + unsigned long int_fld[_NUM_FLD]; > + char *str_fld[_NUM_FLD]; > + int i; > + unsigned size; > + > + masks->nb_field = 0; > + p = strchr(p0, '('); > + while (p != NULL) { > + ++p; > + p0 = strchr(p, ')'); > + if (p0 == NULL) > + 
return -1; > + > + size = p0 - p; > + if (size >= sizeof(s)) > + return -1; > + > + snprintf(s, sizeof(s), "%.*s", size, p); > + if (rte_strsplit(s, sizeof(s), str_fld, _NUM_FLD, ',') != > _NUM_FLD) > + return -1; > + for (i = 0; i < _NUM_FLD; i++) { > + errno = 0; > + int_fld[i] = strtoul(str_fld[i], &end, 0); > + if (errno != 0 || end == str_fld[i] || int_fld[i] > > UINT16_MAX) > + return -1; > + } > + masks->field[masks->nb_field].offset = > + (uint16_t)int_fld[FLD_OFFSET]; > + masks->field[masks->nb_field].bitmask = > + ~(uint16_t)int_fld[FLD_MASK]; > + masks->nb_field++; > + if (masks->nb_field > 2) { > + printf("exceeded max number of fields: %hu\n", > + masks->nb_field); masks->nb_field is an uint8_t, so you should change from %hu to %hhu or %PRIu8. > + return -1; > + } > + p = strchr(p0, '('); > + } > + return 0; > +} > + > +static void > +cmd_flow_director_flex_mask_parsed(void *parsed_result, > + __attribute__((unused)) struct cmdline *cl, > + __attribute__((unused)) void *data) > +{ > + struct cmd_flow_director_flex_mask_result *res = parsed_result; > + struct rte_eth_fdir_flex_masks *flex_masks; > + struct rte_eth_fdir_cfg fdir_cfg; > + int ret = 0; > + int cfg_size = 2 * sizeof(struct rte_eth_flex_mask) + > + offsetof(struct rte_eth_fdir_flex_masks, field); > + > + ret = rte_eth_dev_filter_supported(res->port_id, > RTE_ETH_FILTER_FDIR); > + if (ret < 0) { > + printf("flow director is not supported on port %u.\n", > + res->port_id); > + return; > + } > + > + memset(&fdir_cfg, 0, sizeof(struct rte_eth_fdir_cfg)); > + > + flex_masks = (struct rte_eth_fdir_flex_masks > *)rte_zmalloc_socket("CLI", > + cfg_size, CACHE_LINE_SIZE, rte_socket_id()); > + > + if (flex_masks == NULL) { > + printf("fail to malloc memory to configure flexi masks\n"); > + return; > + } > + > + if (!strcmp(res->flow_type, "ip4")) > + flex_masks->flow_type = > RTE_ETH_FLOW_TYPE_IPV4_OTHER; > + else if (!strcmp(res->flow_type, "udp4")) > +
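The format-specifier comment above (nb_field is a uint8_t, so use %hhu or PRIu8 rather than %hu) can be illustrated with a small standalone sketch. It is not taken from the patch: format_field_overflow() is a hypothetical helper that only reproduces the message text, using the PRIu8 macro from <inttypes.h>.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: format the "exceeded max number of fields" message
 * for a uint8_t counter. PRIu8 expands to the conversion specifier that
 * matches uint8_t, so the format string and the argument type agree.
 * Returns the value returned by snprintf().
 */
static int
format_field_overflow(char *buf, size_t len, uint8_t nb_field)
{
	assert(buf != NULL && len > 0);

	return snprintf(buf, len,
			"exceeded max number of fields: %" PRIu8 "\n",
			nb_field);
}
```

In practice %hu usually prints the right number for a uint8_t because of the default argument promotions, but the specifier does not match the declared type, which is what the review comment points out.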
[dpdk-dev] [PATCH v3 0/6] Update libs build process
Are there any more comments on this patch set?

Thanks,
Sergio

> -----Original Message-----
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Sergio Gonzalez Monroy
> Sent: Thursday, October 9, 2014 2:05 PM
> To: dev at dpdk.org
> Subject: [dpdk-dev] [PATCH v3 0/6] Update libs build process
>
> As per the proposal, this patch set does:
> - Remove CONFIG_RTE_BUILD_COMBINE_LIBS as a configuration option.
> - For static library, build a single/combined library.
> - For shared libraries, build both individual/separated and single/combined
>   libraries.
> - Link apps only against single/combined libs.
> - Include external shared libs dependencies when building shared libraries.
>
> v3:
> - Split some of the patches for easier review
> - Improve patches descriptions
>
> Sergio Gonzalez Monroy (6):
>   Link combined shared library using CC
>   Link apps only against single/combined library
>   Remove CONFIG_RTE_BUILD_COMBINE_LIBS and related
>   Update library build process
>   Avoid duplicated code
>   Link apps/DSOs against EXECENV_LDLIBS with --as-needed
>
>  config/common_bsdapp   |   3 +-
>  config/common_linuxapp |   3 +-
>  mk/rte.app.mk          | 164 ++---
>  mk/rte.lib.mk          |  81 ++--
>  mk/rte.sdkbuild.mk     |   2 +-
>  mk/rte.sharelib.mk     |  54
>  mk/rte.vars.mk         |   4 --
>  7 files changed, 54 insertions(+), 257 deletions(-)
>
> --
> 1.9.3
[dpdk-dev] [PATCH] Pass verbose flag to kernel module
> -Original Message- > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Sergio Gonzalez > Monroy > Sent: Monday, October 06, 2014 5:09 PM > To: dev at dpdk.org > Subject: [dpdk-dev] [PATCH] Pass verbose flag to kernel module > > --- > mk/rte.module.mk | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/mk/rte.module.mk b/mk/rte.module.mk > index c4ca3fd..bd3c596 100644 > --- a/mk/rte.module.mk > +++ b/mk/rte.module.mk > @@ -78,7 +78,7 @@ build: _postbuild > $(MODULE).ko: $(SRCS_LINKS) > @if [ ! -f $(notdir Makefile) ]; then ln -nfs $(SRCDIR)/Makefile . ; fi > @$(MAKE) -C $(RTE_KERNELDIR) M=$(CURDIR) > O=$(RTE_KERNELDIR) \ > - CROSS_COMPILE=$(CROSS) > + V=$(if $(V),1,0) CROSS_COMPILE=$(CROSS) > > # install module in $(RTE_OUTPUT)/kmod > $(RTE_OUTPUT)/kmod/$(MODULE).ko: $(MODULE).ko > -- > 1.9.3 Acked-by: Pablo de Lara
[dpdk-dev] [PATCH v2] Pass CC option when building kernel modules
> -Original Message- > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Sergio Gonzalez > Monroy > Sent: Thursday, October 09, 2014 11:09 AM > To: dev at dpdk.org > Subject: [dpdk-dev] [PATCH v2] Pass CC option when building kernel > modules > > At least on kernels 3.15 or newer, wrong compiler flags are set when building > kernel modules. > > v2: > Remove unnecessary reset of CC when being included from the kernel. > > Signed-off-by: Sergio Gonzalez Monroy > Acked-by: Pablo de Lara > --- > mk/rte.module.mk | 2 +- > mk/toolchain/clang/rte.vars.mk | 5 + > mk/toolchain/gcc/rte.vars.mk | 1 + > mk/toolchain/icc/rte.vars.mk | 5 + > 4 files changed, 4 insertions(+), 9 deletions(-) > > diff --git a/mk/rte.module.mk b/mk/rte.module.mk > index c4ca3fd..41c0d0f 100644 > --- a/mk/rte.module.mk > +++ b/mk/rte.module.mk > @@ -78,7 +78,7 @@ build: _postbuild > $(MODULE).ko: $(SRCS_LINKS) > @if [ ! -f $(notdir Makefile) ]; then ln -nfs $(SRCDIR)/Makefile . ; fi > @$(MAKE) -C $(RTE_KERNELDIR) M=$(CURDIR) > O=$(RTE_KERNELDIR) \ > - CROSS_COMPILE=$(CROSS) > + CC=$(KERNELCC) CROSS_COMPILE=$(CROSS) > > # install module in $(RTE_OUTPUT)/kmod > $(RTE_OUTPUT)/kmod/$(MODULE).ko: $(MODULE).ko > diff --git a/mk/toolchain/clang/rte.vars.mk b/mk/toolchain/clang/rte.vars.mk > index ee4f451..40cb389 100644 > --- a/mk/toolchain/clang/rte.vars.mk > +++ b/mk/toolchain/clang/rte.vars.mk > @@ -38,11 +38,8 @@ > # - define TOOLCHAIN_ASFLAGS variable (overriden by cmdline value) > # > > -ifeq ($(KERNELRELEASE),) > CC= $(CROSS)clang > -else > -CC= $(CROSS)gcc > -endif > +KERNELCC = $(CROSS)gcc > CPP = $(CROSS)cpp > # for now, we don't use as but nasm. > # AS = $(CROSS)as > diff --git a/mk/toolchain/gcc/rte.vars.mk b/mk/toolchain/gcc/rte.vars.mk > index 262ebdf..993eb26 100644 > --- a/mk/toolchain/gcc/rte.vars.mk > +++ b/mk/toolchain/gcc/rte.vars.mk > @@ -39,6 +39,7 @@ > # > > CC= $(CROSS)gcc > +KERNELCC = $(CROSS)gcc > CPP = $(CROSS)cpp > # for now, we don't use as but nasm. 
> # AS = $(CROSS)as > diff --git a/mk/toolchain/icc/rte.vars.mk b/mk/toolchain/icc/rte.vars.mk > index 612370d..f03a2a2 100644 > --- a/mk/toolchain/icc/rte.vars.mk > +++ b/mk/toolchain/icc/rte.vars.mk > @@ -41,11 +41,8 @@ > # Warning: we do not use CROSS environment variable as icc is mainly a > # x86->x86 compiler > > -ifeq ($(KERNELRELEASE),) > CC= icc > -else > -CC= gcc > -endif > +KERNELCC = gcc > CPP = cpp > AS= nasm > AR= ar > -- > 1.9.3
[dpdk-dev] [PATCH v5 2/8]i40e:support VxLAN packet identification in librte_pmd_i40e
> -Original Message- > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jijiang Liu > Sent: Saturday, October 11, 2014 6:55 AM > To: dev at dpdk.org > Subject: [dpdk-dev] [PATCH v5 2/8]i40e:support VxLAN packet identification > in librte_pmd_i40e > > Support tunneling UDP port configuration on i40e in librte_pmd_i40e. > Currently, only VxLAN is implemented, which include > - VxLAN UDP port initialization > - Implement the APIs to configure VxLAN UDP port in librte_pmd_i40e. > > Signed-off-by: Jijiang Liu > Acked-by: Helin Zhang > Acked-by: Jingjing Wu > Acked-by: Jing Chen > > --- > config/common_linuxapp|5 + > lib/librte_mbuf/rte_mbuf.h|4 +- > lib/librte_pmd_i40e/i40e_ethdev.c | 200 > - > lib/librte_pmd_i40e/i40e_ethdev.h |5 + > lib/librte_pmd_i40e/i40e_rxtx.c |9 ++ > 5 files changed, 221 insertions(+), 2 deletions(-) > > diff --git a/config/common_linuxapp b/config/common_linuxapp > index 4713eb4..185cb0f 100644 > --- a/config/common_linuxapp > +++ b/config/common_linuxapp > @@ -210,6 +210,11 @@ > CONFIG_RTE_LIBRTE_I40E_QUEUE_NUM_PER_VF=4 > CONFIG_RTE_LIBRTE_I40E_ITR_INTERVAL=-1 > > # > +# Compile tunneling UDP port support > +# > +CONFIG_RTE_LIBRTE_TUNNEL_UDP_PORT=4789 > + > +# Should you add also this option in the common_bsdapp? > # Compile burst-oriented VIRTIO PMD driver > # > CONFIG_RTE_LIBRTE_VIRTIO_PMD=y > diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h > index ddadc21..0984650 100644 > --- a/lib/librte_mbuf/rte_mbuf.h > +++ b/lib/librte_mbuf/rte_mbuf.h > @@ -163,7 +163,7 @@ struct rte_mbuf { > > /* remaining bytes are set on RX when pulling packet from descriptor > */ > MARKER rx_descriptor_fields1; > - uint16_t reserved2; /**< Unused field. Required for padding */ > + uint16_t packet_type; /**< Packet type, which indicates packet > format */ > uint16_t data_len;/**< Amount of data in segment buffer. */ > uint32_t pkt_len; /**< Total pkt len: sum of all segments. 
*/ > uint16_t vlan_tci;/**< VLAN Tag Control Identifier (CPU order) > */ > @@ -551,6 +551,7 @@ static inline void rte_pktmbuf_reset(struct rte_mbuf > *m) > m->port = 0xff; > > m->ol_flags = 0; > + m->packet_type = 0; > m->data_off = (RTE_PKTMBUF_HEADROOM <= m->buf_len) ? > RTE_PKTMBUF_HEADROOM : m->buf_len; > > @@ -620,6 +621,7 @@ static inline void rte_pktmbuf_attach(struct rte_mbuf > *mi, struct rte_mbuf *md) > mi->pkt_len = mi->data_len; > mi->nb_segs = 1; > mi->ol_flags = md->ol_flags; > + mi->packet_type = md->packet_type; > > __rte_mbuf_sanity_check(mi, 1); > __rte_mbuf_sanity_check(md, 0); > diff --git a/lib/librte_pmd_i40e/i40e_ethdev.c > b/lib/librte_pmd_i40e/i40e_ethdev.c > index 52f01a5..3ebb8e2 100644 > --- a/lib/librte_pmd_i40e/i40e_ethdev.c > +++ b/lib/librte_pmd_i40e/i40e_ethdev.c > @@ -189,7 +189,7 @@ static int i40e_res_pool_alloc(struct > i40e_res_pool_info *pool, > static int i40e_dev_init_vlan(struct rte_eth_dev *dev); > static int i40e_veb_release(struct i40e_veb *veb); > static struct i40e_veb *i40e_veb_setup(struct i40e_pf *pf, > - struct i40e_vsi *vsi); > + struct i40e_vsi *vsi); > static int i40e_pf_config_mq_rx(struct i40e_pf *pf); > static int i40e_vsi_config_double_vlan(struct i40e_vsi *vsi, int on); > static inline int i40e_find_all_vlan_for_mac(struct i40e_vsi *vsi, > @@ -205,6 +205,14 @@ static int i40e_dev_rss_hash_update(struct > rte_eth_dev *dev, > struct rte_eth_rss_conf *rss_conf); > static int i40e_dev_rss_hash_conf_get(struct rte_eth_dev *dev, > struct rte_eth_rss_conf *rss_conf); > +static int i40e_dev_udp_tunnel_add(struct rte_eth_dev *dev, > +struct rte_eth_udp_tunnel *udp_tunnel, > +uint8_t count); > +static int i40e_dev_udp_tunnel_del(struct rte_eth_dev *dev, > +struct rte_eth_udp_tunnel *udp_tunnel, > +uint8_t count); > +static int i40e_pf_config_vxlan(struct i40e_pf *pf); > + > > /* Default hash key buffer for RSS */ > static uint32_t rss_key_default[I40E_PFQF_HKEY_MAX_INDEX + 1]; > @@ -256,6 +264,8 @@ static struct 
eth_dev_ops i40e_eth_dev_ops = { > .reta_query = i40e_dev_rss_reta_query, > .rss_hash_update = i40e_dev_rss_hash_update, > .rss_hash_conf_get= i40e_dev_rss_hash_conf_get, > + .udp_tunnel_add = i40e_dev_udp_tunnel_add, > + .udp_tunnel_del = i40e_dev_udp_tunnel_del, > }; > > static struct eth_driver rte_i40e_pmd = { > @@ -2532,6 +2542,34 @@ i40e_vsi_dump_bw_config(struct i40e_vsi *vsi) > return 0; > } > >
[dpdk-dev] [PATCH 4/6] i40e: add VMDQ support
> -Original Message- > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Chen Jing D(Mark) > Sent: Tuesday, September 23, 2014 2:14 PM > To: dev at dpdk.org > Subject: [dpdk-dev] [PATCH 4/6] i40e: add VMDQ support > > From: "Chen Jing D(Mark)" > > The change includes several parts: > 1. Get maximum number of VMDQ pools supported in dev_init. > 2. Fill VMDQ info in i40e_dev_info_get. > 3. Setup VMDQ pools in i40e_dev_configure. > 4. i40e_vsi_setup change to support creation of VMDQ VSI. > > Signed-off-by: Chen Jing D(Mark) > Acked-by: Konstantin Ananyev > Acked-by: Jingjing Wu > Acked-by: Jijiang Liu > Acked-by: Huawei Xie > --- > config/common_linuxapp|1 + > lib/librte_pmd_i40e/i40e_ethdev.c | 237 > - > lib/librte_pmd_i40e/i40e_ethdev.h | 17 +++- > 3 files changed, 225 insertions(+), 30 deletions(-) > > diff --git a/config/common_linuxapp b/config/common_linuxapp > index 5bee910..d0bb3f7 100644 > --- a/config/common_linuxapp > +++ b/config/common_linuxapp > @@ -208,6 +208,7 @@ > CONFIG_RTE_LIBRTE_I40E_RX_ALLOW_BULK_ALLOC=y > CONFIG_RTE_LIBRTE_I40E_ALLOW_UNSUPPORTED_SFP=n > CONFIG_RTE_LIBRTE_I40E_16BYTE_RX_DESC=n > CONFIG_RTE_LIBRTE_I40E_QUEUE_NUM_PER_VF=4 > +CONFIG_RTE_LIBRTE_I40E_QUEUE_NUM_PER_VM=4 Should we include this option in config_bsdapp as well? 
> # interval up to 8160 us, aligned to 2 (or default value) > CONFIG_RTE_LIBRTE_I40E_ITR_INTERVAL=-1 > > diff --git a/lib/librte_pmd_i40e/i40e_ethdev.c > b/lib/librte_pmd_i40e/i40e_ethdev.c > index a00d6ca..a267c96 100644 > --- a/lib/librte_pmd_i40e/i40e_ethdev.c > +++ b/lib/librte_pmd_i40e/i40e_ethdev.c > @@ -168,6 +168,7 @@ static int i40e_get_cap(struct i40e_hw *hw); > static int i40e_pf_parameter_init(struct rte_eth_dev *dev); > static int i40e_pf_setup(struct i40e_pf *pf); > static int i40e_vsi_init(struct i40e_vsi *vsi); > +static int i40e_vmdq_setup(struct rte_eth_dev *dev); > static void i40e_stat_update_32(struct i40e_hw *hw, uint32_t reg, > bool offset_loaded, uint64_t *offset, uint64_t *stat); > static void i40e_stat_update_48(struct i40e_hw *hw, > @@ -269,21 +270,11 @@ static struct eth_driver rte_i40e_pmd = { > }; > > static inline int > -i40e_prev_power_of_2(int n) > +i40e_align_floor(int n) > { > - int p = n; > - > - --p; > - p |= p >> 1; > - p |= p >> 2; > - p |= p >> 4; > - p |= p >> 8; > - p |= p >> 16; > - if (p == (n - 1)) > - return n; > - p >>= 1; > - > - return ++p; > + if (n == 0) > + return 0; > + return (1 << (sizeof(n) * CHAR_BIT - 1 - __builtin_clz(n))); > } > > static inline int > @@ -500,7 +491,7 @@ eth_i40e_dev_init(__rte_unused struct eth_driver > *eth_drv, > if (!dev->data->mac_addrs) { > PMD_INIT_LOG(ERR, "Failed to allocated memory " > "for storing mac address"); > - goto err_get_mac_addr; > + goto err_mac_alloc; > } > ether_addr_copy((struct ether_addr *)hw->mac.perm_addr, > &dev->data->mac_addrs[0]); > @@ -521,8 +512,9 @@ eth_i40e_dev_init(__rte_unused struct eth_driver > *eth_drv, > > return 0; > > +err_mac_alloc: > + i40e_vsi_release(pf->main_vsi); > err_setup_pf_switch: > - rte_free(pf->main_vsi); > err_get_mac_addr: > err_configure_lan_hmc: > (void)i40e_shutdown_lan_hmc(hw); > @@ -541,6 +533,27 @@ err_get_capabilities: > static int > i40e_dev_configure(struct rte_eth_dev *dev) > { > + int ret; > + enum rte_eth_rx_mq_mode 
mq_mode = dev->data- > >dev_conf.rxmode.mq_mode; > + > + /* VMDQ setup. > + * Needs to move VMDQ setting out of i40e_pf_config_mq_rx() as > VMDQ and > + * RSS setting have different requirements. > + * General PMD driver call sequence are NIC init, configure, > + * rx/tx_queue_setup and dev_start. In rx/tx_queue_setup() > function, it > + * will try to lookup the VSI that specific queue belongs to if VMDQ > + * applicable. So, VMDQ setting has to be done before > + * rx/tx_queue_setup(). This function is good to place vmdq_setup. > + * For RSS setting, it will try to calculate actual configured RX queue > + * number, which will be available after rx_queue_setup(). > dev_start() > + * function is good to place RSS setup. > + */ > + if (mq_mode & ETH_MQ_RX_VMDQ_FLAG) { > + ret = i40e_vmdq_setup(dev); > + if (ret) > + return ret; > + } > + > return i40e_dev_init_vlan(dev); > } > > @@ -1389,6 +1402,16 @@ i40e_dev_info_get(struct rte_eth_dev *dev, > struct rte_eth_dev_info *dev_info) > DEV_TX_OFFLOAD_UDP_CKSUM | > DEV_TX_OFFLOAD_TCP_CKSUM | > DEV_TX_OFFLOAD_SCTP_CKSUM; > + > + if (pf->flags | I40E_FLAG_VMDQ) { > + dev_info->max_vmdq_pools = pf->m
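The i40e_align_floor() helper introduced in the diff above rounds a value down to a power of two using a count-leading-zeros builtin. The following is a minimal standalone sketch of that technique, checked against a plain loop. The names align_floor_clz() and align_floor_loop() are hypothetical, non-negative input is an assumption of the sketch, and __builtin_clz is a GCC/Clang builtin, so this will not build with other compilers as written.

```c
#include <assert.h>
#include <limits.h>

/*
 * Largest power of two that is <= n, or 0 when n == 0.
 * Same expression as in the quoted patch: the position of the highest set
 * bit is (bit width - 1 - leading zeros).
 */
static inline int
align_floor_clz(int n)
{
	assert(n >= 0);
	if (n == 0)
		return 0; /* __builtin_clz(0) is undefined, so handle it here */
	return 1 << ((int)(sizeof(n) * CHAR_BIT) - 1 -
			__builtin_clz((unsigned int)n));
}

/* Reference version without the builtin, used only for cross-checking. */
static inline int
align_floor_loop(int n)
{
	int p = 1;

	assert(n >= 0);
	if (n == 0)
		return 0;
	/* Double p while the doubled value would still be <= n. */
	while (p <= n / 2)
		p <<= 1;
	return p;
}
```

For example, an input of 5 gives 4 and an input of 64 gives 64. Comparing against n / 2 in the loop version avoids overflowing p for inputs close to INT_MAX.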
[dpdk-dev] vmxnet3 pmd dev restart
Waterman/Navakanth, we've got some patches posted for the same by Yong Wang at VMware. I haven't got the chance to look at it but if you can validate it, it'd be great.

Thanks,
Rashmin

-----Original Message-----
From: Navakanth M [mailto:navakanth...@gmail.com]
Sent: Sunday, October 12, 2014 8:07 PM
To: Patel, Rashmin N; Cao, Waterman
Cc: stephen at networkplumber.org; dev at dpdk.org; Jiajia, SunX
Subject: Re: vmxnet3 pmd dev restart

Hi Rashmin

I have tried the memset change but still I am facing the problem which I pointed out earlier. After restart, packets are not being received in vmxnet3_recv_pkts().
I have also observed PANIC in vmxnet3_tq_tx_complete() after couple of stop and start operations.

PANIC in vmxnet3_tq_tx_complete():
EOP desc does not point to a valid mbuf
15: [/lib64/libc.so.6(clone+0x6d) [0x7fd60354c52d]]
1: [/mswitch/bin/sos.shumway.elf(rte_dump_stack+0x23) [0x463313]]
2: [/mswitch/bin/sos.shumway.elf(__rte_panic+0xc1) [0x447ae8]]
3: [/mswitch/bin/sos.shumway.elf(vmxnet3_xmit_pkts+0x382) [0x4f4f22]]

Thanks
Navakanth

On Fri, Oct 10, 2014 at 8:39 AM, Cao, Waterman wrote:
> Hi Rashmin,
>
> We found similar issue when we start/stop vmnet dev several time. (> 3 times)
> It happens kernel panic, and sometimes kernel will occur core dump.
> Let me know if you want to submit patch to fix it.
>
> Thanks
> Waterman
>
> -----Original Message-----
>> From: Patel, Rashmin N
>> Sent: Friday, October 10, 2014 6:07 AM
>> To: Navakanth M; stephen at networkplumber.org; Cao, Waterman
>> Cc: dev at dpdk.org
>> Subject: RE: vmxnet3 pmd dev restart
>>
>> I just quickly looked into the code and instead of releasing memory or simply
>> set it to NULL (patch:
>> http://thread.gmane.org/gmane.comp.networking.dpdk.devel/4683), you can zero
>> it out and it should work perfectly, you can give it a quick try.
>>
>> //rte_free(ring->buf_info);
>> memset(ring->buf_info, 0x0, ring->size*sizeof(vmxnet3_buf_info_t));
>>
>> This will not free the memory from heap but just wipe it out to 0x0, provided
>> that we freed all the mbuf(s) pointed by each buf_info->m pointers. Hence you
>> won't need to reallocate it when you start device after this stop.
>>
>> Thanks,
>> Rashmin
>>
>> -----Original Message-----
>> From: Navakanth M [mailto:navakanthdev at gmail.com]
>> Sent: Wednesday, October 08, 2014 10:11 PM
>> To: stephen at networkplumber.org; Patel, Rashmin N; Cao, Waterman
>> Cc: dev at dpdk.org
>> Subject: Re: vmxnet3 pmd dev restart
>>
>> I had tried with Stephen's patch but after stop is done and when we
>> call start it crashed at vmxnet3_dev_start()->
>> vmxnet3_dev_rxtx_init()->vmxnet3_post_rx_bufs() as buf_info is freed
>> and is not allocated again. buf_info is allocated in
>> vmxnet3_dev_rx_queue_setup() which would be called once at the initialization
>> only.
>> I also tried not freeing buf_info in stop but then i see different
>> issue after start, packets are not received due to check while
>> (rcd->gen == rxq->comp_ring.gen) { in vmxnet3_recv_pkts()
>>
>> Waterman, Have you got chance to test stop and start of vmnet dev if so did
>> you notice any issue similar to this?
>>
>> Thanks
>> Navakanth
>>
>> On Thu, Oct 9, 2014 at 12:46 AM, Patel, Rashmin N wrote:
>>> Yes I had a local copy working with couple of lines fix. But someone else,
>>> I think Stephen added a fix patch for the same, and I assume if it's been
>>> merged, should be working, so did not follow up later.
>>>
>>> I don't have a VMware setup handy at moment but I think Waterman would have
>>> more information about testing that patch if he has found any issue with it.
>>>
>>> Thanks,
>>> Rashmin
>>>
>>> -----Original Message-----
>>> From: Navakanth M [mailto:navakanthdev at gmail.com]
>>> Sent: Wednesday, October 08, 2014 4:14 AM
>>> To: dev at dpdk.org; Patel, Rashmin N
>>> Subject: Re: vmxnet3 pmd dev restart
>>>
>>> Hi Rashmin
>>>
>>> I have come across your reply in following post that you have worked on
>>> this problem and would submit the patch for it.
>>> Can you please share information on the changes you worked on or patch log
>>> if you had submitted any for it?
>>> http://thread.gmane.org/gmane.comp.networking.dpdk.devel/4683
>>>
>>> Thanks
>>> Navakanth
>>>
>>> On Tue, Sep 30, 2014 at 1:44 PM, Navakanth M wrote:
>>>> Hi
>>>>
>>>> I am using DPDKv1.7.0 running on Vmware Esxi 5.1 and am trying to reset
>>>> the port which uses pmd_vmnet3 library functions from below function calls.
>>>> rte_eth_dev_stop
>>>> rte_eth_dev_start
>>>> Doing this, i face panic while rte_free(ring->buf_info) in
>>>> Vmxnet3_cmd_ring_release().
>>>> I have gone through following thread but the patch mentioned didn't help
>>>> rather it crashed in start function while accessing buf_info in
>>>> vmxnet3_post_rx_bufs. I see this buf_info is allocated in queue setup
>>>> functions which are called at initialization.
>>>> http://thread.gmane.org/gmane.comp.networking.dpdk.devel/4683
>>>> I tried not freeing it and then
[dpdk-dev] [PATCH 1/5] vmxnet3: Fix VLAN Rx stripping
Are you referring to the patch as a whole, or is your comment about the reset of vlan_tci on the "else" (no vlan tags stripped) path? I am not sure I get your comments here. This patch simply fixes a bug on the rx vlan stripping path (where a valid stripped vlan_tci is unconditionally overwritten later on the rx path in the original vmxnet3 pmd driver). All the other pmd drivers are doing the same thing in terms of translating descriptor status to rte_mbuf flags for vlan stripping.

From: Stephen Hemminger
Sent: Monday, October 13, 2014 2:31 AM
To: Yong Wang
Cc: dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH 1/5] vmxnet3: Fix VLAN Rx stripping

On Sun, 12 Oct 2014 23:23:05 -0700
Yong Wang wrote:

> Shouldn't reset vlan_tci to 0 if a valid VLAN tag is stripped.
>
> Signed-off-by: Yong Wang

Since vlan_tci is initialized to zero by rte_pktmbuf layer, the driver shouldn't be messing with it.
[dpdk-dev] [PATCH v6 00/25] user space vhost library
Hi Huawei,

2014-10-09 02:54, Huawei Xie:
> This set of patches transforms and refactors vhost example to a user
> space vhost library.
> This library implements a user space vhost cuse driver, and provides
> generic APIs for user space ethernet vSwitch to integrate us-vhost for
> fast packet switching with guest virtio.
>
> vhost lib consists of five APIs plus several helper routines
> for feature disable/enable.
> 1) rte_vhost_driver_register initializes vhost driver.
> 2) rte_vhost_driver_callback_register registers the callbacks.
>    Callbacks are called from vhost driver when a virtio device is ready
>    to be added to the data processing core or is de-activated by guest.
> 3) rte_vhost_driver_session_start, a blocking API to start vhost
>    message handler session loop.
> 4) rte_vhost_enqueue_burst and rte_vhost_dequeue_burst for
>    enqueue/dequeue packets to/from virtio ring respectively.
>
> v2) turn off vhost lib by default
>
> v3) fixed checkpatch issues
>
> v4) split the patch per thomas' requirement
>
> v5) fine granularity split of the patch
>     regenerate patches based on latest commit
>     this patchset removes vhost example patches, which will be
>     submitted later.
>
> Huawei Xie (25):
>   move src files from examples/vhost to lib/librte_vhost
>   rename main.c to vhost_rxtx.c and virtio-net.h to rte_vhost_net.h
>   keep virtio_dev_(merge_)rx, copy_from_mbuf_to_vring and
>     virtio_dev_merge_tx; remove anything else in vhost_rxtx.c
>   remove mac learning, mac/vlan, VMDQ and other switching related logic
>   remove host memory region region related logic
>   remove retry
>   patch virtio_dev_merge_tx to make it return packets to app
>   patch vhost_dev_merge_tx about buf_size
>   add queue_id parameter to vhost rx/tx functions
>   define PACKET_BURST
>   rte_vhost_en/dequeue_burst API
>   move virtio_net_config_ll strcture to virtio_net.c
>   remove index
>   call get_virtio_net_callbacks to get the ops in register_cuse_device
>   rte_vhost_driver_register and rte_vhost_session_start API
>   rte_vhost_callback_register API
>   add debug print
>   define VHOST_SUPPORTED_FEATURES
>   header file cleanup
>   static fix
>   add priv field in virtio_net to store application specific context
>   coding style fixes
>   add TODO/FIXME
>   add vhost support in Makefile

Thanks for your hard work.
There are still few things to clean in this patch splitting
but I've did it to apply them.
I won't describe all the changes I've done, you can check them in the
git repository. In short, some split or merge were needed, some lines were
removed and re-added later, build dependencies were not correct and doc
generation was missing.
You did the big work by really splitting all these stuff. Working on small
commits was far easier. Thanks

Applied

Now you can add the new example.
I hope we'll have more reviews and cleanup now that the first version of
this library is integrated.

--
Thomas
[dpdk-dev] Unable to bring up VF interface at guest when using DPDK PMD driver on host
I create 7 Virtual Functions on 82599 using DPDK PMD PF ixgbe driver on a host as stated in the DPDK programming guide:

modprobe uio
insmod ./build/kmod/igb_uio.ko
./pci_unbind.py -b igb_uio :02:00.1
echo 7 > /sys/bus/pci/devices/\:02\:00.1/max_vfs

Then I assign a VF to a KVM guest using "-device pci-assign,host=02:10.1".

When I log in to the guest, I see the interface in the output of "ip addr", but the interface is down. I try to bring it up, but I can't:

[DPDK-1.6.0]# ifconfig eth1 up
SIOCSIFFLAGS: Network is down

On the host, I do configure the port up (using ifconfig) before creating the virtual functions.

On the guest, when I invoke rte_eal_init() with the port, I get this MAC error:

PMD: The MAC address is not valid.
The most likely cause of this error is that the VM host has not assigned a valid MAC address to this VF device.
Please consult the DPDK Release Notes (FAQ section) for a possible solution to this problem.

Any help is appreciated.

Regards,
Shian
[dpdk-dev] [PATCH v4 00/10] VM Power Management
Hi Alan,

2014-10-12 20:36, Alan Carew:
> The following patches add two DPDK sample applications and an alternate
> implementation of librte_power for use in virtualized environments.
> The idea is to provide librte_power functionality from within a VM to address
> the lack of MSRs to facilitate frequency changes from within a VM.
> It is ideally suited for Haswell which provides per core frequency scaling.
>
> The current librte_power affects frequency changes via the acpi-cpufreq
> 'userspace' power governor, accessed via sysfs.

Something was preventing me from looking deeper in this big codebase,
but I didn't know what sounds weird.
Now I realize: the real problem is that virtualization transparency is
broken for power management. So the right thing to do is to fix it in KVM.
I think all this patchset is a huge workaround.

Did you try to fix it with Qemu/KVM?

--
Thomas
[dpdk-dev] [PATCH 0/5] vmxnet3 pmd fixes/improvement
Hi,

2014-10-12 23:23, Yong Wang:
> This patch series include various fixes and improvement to the
> vmxnet3 pmd driver.
>
> Yong Wang (5):
>   vmxnet3: Fix VLAN Rx stripping
>   vmxnet3: Add VLAN Tx offload
>   vmxnet3: Fix dev stop/restart bug
>   vmxnet3: Add rx pkt check offloads
>   vmxnet3: Some perf improvement on the rx path

Please, could you describe what the performance gain is for these patches?
Benchmark numbers would be appreciated.

Thanks
--
Thomas
[dpdk-dev] [PATCH v6 00/25] user space vhost library
> -Original Message- > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com] > Sent: Monday, October 13, 2014 12:52 PM > To: Xie, Huawei > Cc: dev at dpdk.org > Subject: Re: [dpdk-dev] [PATCH v6 00/25] user space vhost library > > Hi Huawei, > > 2014-10-09 02:54, Huawei Xie: > > This set of patches transforms and refactors vhost example to a user > > space vhost library. > > This library implements a user space vhost cuse driver, and provides > > generic APIs for user space ethernet vSwitch to integrate us-vhost for > > fast packet switching with guest virtio. > > > > vhost lib consists of five APIs plus several helper routines > > for feature disable/enable. > > 1) rte_vhost_driver_register initializes vhost driver. > > 2) rte_vhost_driver_callback_register registers the callbacks. > > Callbacks are called from vhost driver when a virtio device is ready > > to be added to the data processing core or is de-activated by guest. > > 3) rte_vhost_driver_session_start, a blocking API to start vhost > > message handler session loop. > > 4) rte_vhost_enqueue_burst and rte_vhost_dequeue_burst for > > enqueue/dequeue packets to/from virtio ring respectively. > > > > v2) turn off vhost lib by default > > > > v3) fixed checkpatch issues > > > > v4) split the patch per thomas' requirement > > > > v5) fine granularity split of the patch > > regenerate patches based on latest commit > > this patchset removes vhost example patches, which will be > > submitted later. 
> > > > Huawei Xie (25): > > move src files from examples/vhost to lib/librte_vhost > > rename main.c to vhost_rxtx.c and virtio-net.h to rte_vhost_net.h > > keep virtio_dev_(merge_)rx, copy_from_mbuf_to_vring and > virtio_dev_merge_tx; remove anything else in vhost_rxtx.c > > remove mac learning, mac/vlan, VMDQ and other switching related logic > > remove host memory region region related logic > > remove retry > > patch virtio_dev_merge_tx to make it return packets to app > > patch vhost_dev_merge_tx about buf_size > > add queue_id parameter to vhost rx/tx functions > > define PACKET_BURST > > rte_vhost_en/dequeue_burst API > > move virtio_net_config_ll strcture to virtio_net.c > > remove index > > call get_virtio_net_callbacks to get the ops in register_cuse_device > > rte_vhost_driver_register and rte_vhost_session_start API > > rte_vhost_callback_register API > > add debug print > > define VHOST_SUPPORTED_FEATURES > > header file cleanup > > static fix > > add priv field in virtio_net to store application specific context > > coding style fixes > > add TODO/FIXME > > add vhost support in Makefile > > Thanks for your hard work. > There are still few things to clean in this patch splitting > but I've did it to apply them. > I won't describe all the changes I've done, you can check them in the > git repository. In short, some split or merge were needed, some lines were > removed and re-added later, build dependencies were not correct and doc > generation was missing. > You did the big work by really splitting all these stuff. Working on small > commits was far easier. Thanks > > Applied > > Now you can add the new example. > I hope we'll have more reviews and cleanup now that the first version of > this library is integrated. Thanks! Would check and submit the example patch soon. Yes, this is the first step, next we would have code cleanup, bug fix, performance tuning, interfaces refine, new qemu us-vhost support, etc. > > -- > Thomas
[dpdk-dev] [PATCH 0/5] vmxnet3 pmd fixes/improvement
Only the last one is performance related, and it merely gives hints to
the compiler to hopefully make branch prediction more efficient. It also
moves a constant assignment out of the pkt polling loop.

We did a performance evaluation on a Nehalem box with 4 cores at 2.8GHz
x 2 sockets. On the DPDK side, it's running some l3 forwarding apps in a
VM on ESXi with one core assigned for polling. The client side is
pktgen/dpdk, pumping 64B tcp packets at line rate. Before the patch, we
are seeing ~900K PPS with 65% cpu of a core used for DPDK. After the
patch, we are seeing the same pkt rate with only 45% of a core used. CPU
usage is collected factoring out the idle loop cost. The packet rate is a
result of the mode we used for vmxnet3 (pure emulation mode running the
default number of hypervisor contexts).

I can add this info to the review request.

Yong

From: Thomas Monjalon
Sent: Monday, October 13, 2014 1:29 PM
To: Yong Wang
Cc: dev at dpdk.org
Subject: Re: [dpdk-dev] [PATCH 0/5] vmxnet3 pmd fixes/improvement

Hi,

2014-10-12 23:23, Yong Wang:
> This patch series includes various fixes and improvements to the
> vmxnet3 pmd driver.
>
> Yong Wang (5):
>   vmxnet3: Fix VLAN Rx stripping
>   vmxnet3: Add VLAN Tx offload
>   vmxnet3: Fix dev stop/restart bug
>   vmxnet3: Add rx pkt check offloads
>   vmxnet3: Some perf improvement on the rx path

Please, could you describe the performance gain for these patches?
Benchmark numbers would be appreciated.

Thanks
-- 
Thomas
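
One rough way to read the figures reported in this message is as a per-packet cycle cost. The sketch below is only an illustration: the helper names are hypothetical, and it assumes the polling core runs at the quoted 2.8GHz and that CPU utilization maps linearly to cycles spent, neither of which the message states outright.

```c
#include <assert.h>

/*
 * Approximate cycles spent per packet.
 * Assumption: cpu_fraction of a core at core_hz is spent entirely on
 * packet processing (idle loop cost already factored out).
 */
static double cycles_per_packet(double core_hz, double cpu_fraction, double pps)
{
    assert(pps > 0.0);
    return core_hz * cpu_fraction / pps;
}

/* Relative reduction in per-packet cost between two measurements. */
static double relative_saving(double before, double after)
{
    assert(before > 0.0);
    return (before - after) / before;
}
```

Under those assumptions, `cycles_per_packet(2.8e9, 0.65, 900e3)` comes to roughly 2022 cycles per packet before the patch and `cycles_per_packet(2.8e9, 0.45, 900e3)` to 1400 after it, a saving of about 31% at an unchanged packet rate.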
[dpdk-dev] Dynamic port/pipe QoS configuration
Hi,

We are trying to provide QoS support for one of our clients using
rte_sched. In our implementation we are treating each pipe as a customer,
so we can have a maximum of 4096 customers per sub-port. Customers
(pipes) can be added, deleted or modified dynamically, and each customer
can have a different profile. Currently we are using DPDK v1.6.

Can I modify a pipe profile at run time using rte_sched_pipe_config()?

Our plan is to have the initial configs as below (similar to the examples
in DPDK).

[1] Specify port params at the initialization of the port, as below:

    static struct rte_sched_port_params port_param = {
        :
        :
        .n_subports_per_port = 1,
        .n_pipes_per_subport = 4096,
        .qsize = {64, 64, 64, 64},
        .pipe_profiles = pipe_profile,
        .n_pipe_profiles = 1,
    };

[2] Subport params:

    static struct rte_sched_subport_params subport_param[] = {
        {
            .tb_rate = Link speed (1G/10G..) divided by 8 (bits),
            .tb_size = 100,
            .tc_rate = {Same as tb_rate, Same as tb_rate,
                        Same as tb_rate, Same as tb_rate},
            .tc_period = 10,
        },
    };

[3] Pipe profile:

    static struct rte_sched_pipe_params pipe_profile[] = {
        { /* Profile #0 */
            .tb_rate = Link speed (1G/10G..) divided by 8 (bits)/4096
                       (maximum number of pipes),
            .tb_size = 100,
            .tc_rate = {pipe's tb_rate, pipe's tb_rate,
                        pipe's tb_rate, pipe's tb_rate},
            .tc_period = 40,
            .wrr_weights = {16, 4, 2, 1, 16, 4, 2, 1,
                            16, 4, 2, 1, 16, 4, 2, 1},
        },
    };

Our plan here is to initialize every pipe with the default profile and
then modify each pipe based on user configuration.

My questions are:

[a] Can I modify a pipe profile at run time using
    rte_sched_pipe_config()? (question repeated)

If I can modify at pipe level:

[b] Can we have different profiles for different pipes, with one default
    profile at initialization?

[c] Can we modify port level params without deleting the port, using
    rte_sched_port_config()?

Please provide your valuable comments. Thanks in advance.

-- 
Regards,
Satish Babu
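
For illustration, the rate placeholders in [2] and [3] can be worked out as plain arithmetic. The sketch below is not DPDK code: the helper names are hypothetical, and it assumes, as the message describes, that the subport rate is the link speed in bytes per second and that each of the 4096 pipes gets an equal share of it.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a link speed in bits per second to bytes per second. */
static uint64_t link_bytes_per_sec(uint64_t link_bps)
{
    return link_bps / 8;
}

/*
 * Equal share of the subport rate for each pipe.
 * Integer division: any remainder is left unallocated.
 */
static uint64_t pipe_share_bytes_per_sec(uint64_t subport_rate, uint32_t n_pipes)
{
    assert(n_pipes > 0);
    return subport_rate / n_pipes;
}
```

With a 10G link this gives a subport rate of 1250000000 bytes per second and a per-pipe share of 305175 bytes per second; with a 1G link the per-pipe share is 30517 bytes per second.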
[dpdk-dev] virtio UIO / PMD issues in default Ubuntu Cloud Images
Hello,

I was working to get my open source project running in a VirtualBox
Vagrant VM powered by an Ubuntu Cloud image the last few days to make my
project and DPDK more developer friendly with a prebuilt environment.
During this I was fixing the ugly hardcodings I'd used to hack it
together on a desktop.

In the process I found a few things I was confused about.

1) Ubuntu doesn't include the UIO module by default in the Cloud
image... only the virtio-net. I was wondering if we had anybody in good
contact with their kernel group to lobby for inclusion of the UIO modules
in the stock Cloud kernel, without having to grab linux-generic and
linux-headers-generic first.

2) The directions for activating virtio-net based interfaces seem
out-of-date. They refer to the PMD as rte_virtio_net_pmd, when the PMD
calls itself rte_virtio_pmd in my copy of DPDK 1.7.1 w/ my clang
compilation patches added.

I am getting some odd errors when I'm trying to load my app:

EAL: PCI device :00:08.0 on NUMA socket -1
EAL: probe driver: 1af4:1000 rte_virtio_pmd
EAL: :00:08.0 not managed by UIO driver, skipping

3) If I try to bind the device to the igb_uio driver even though it seems
like the wrong thing to do, just for testing, this is what happens (NOTE:
unbound the 00:08.0 device from the kernel to show this):

$ sudo tools/dpdk_nic_bind.py --status

Network devices using DPDK-compatible driver

Network devices using kernel driver
===
:00:03.0 'Virtio network device' if= drv=virtio-pci unused=igb_uio

Other network devices
=
:00:08.0 'Virtio network device' unused=igb_uio

$ sudo tools/dpdk_nic_bind.py -b igb_uio 00:08.0
Error: bind failed for :00:08.0 - Cannot bind to driver igb_uio

vagrant@vagrant-ubuntu-trusty-64:/vagrant/external/dpdk$ dmesg | tail
[ 1766.445609] igb_uio: Use MSIX interrupt by default
[ 1824.602075] igb_uio: probe of :00:08.0 failed with error -2
[ 1824.602742] igb_uio: probe of :00:08.0 failed with error -2

4) I found some old email threads asking about this:

http://comments.gmane.org/gmane.comp.networking.dpdk.devel/1357
(there are some others as well but this seemed closest)

But the only thing present in that thread seemed to be irritated replies
which didn't really explain the different virtio PMD's out there and how
they work, and didn't explain which ones were in-tree and out-of-tree
either.

So let me ask this again, when somebody wrote "virtio-net or virtio-net +
uio (QEMU, VirtualBox)" into the supported page
(http://dpdk.org/doc/nics), who tested this to prove it worked? How did
they get it to work on VirtualBox? The last reply stating "you have the
source code" didn't really explain how they proved this stuff ever worked
in the first place.

Thanks,
Matthew.
[dpdk-dev] Possible bug in eal_pci pci_scan_one
Hi,

Did anybody get a chance to look at what might be going on in this weird
NUMA bug?

I could use some help to understand how you're supposed to make code that
will work right on both NUMA and non-NUMA. Otherwise it's hard to make a
bulletproof DPDK based app that will be able to reliably init on single
socket, dual socket non-NUMA, and dual socket NUMA boxes.

Thanks,
Matthew.

On Mon, Oct 06, 2014 at 02:13:44AM -0700, Matthew Hall wrote:
> Hi Guys,
>
> I'm doing my development on kind of a cheap machine with no NUMA
> support... but several years ago I used DPDK to build a NUMA box that
> could do 40 gbits bidirectional L4-L7 stateful traffic replay.
>
> So given the past experiences I had before, I wanted to clean the code
> up so it'd work well if some crazy guy tried my code on one of these
> huge boxes, too, but then I ran into some weird issues.
>
> 1) When I call rte_eth_dev_socket_id() I get back -1. But the call can
> return -1 if the port_id is bogus or if pci_scan_one didn't get a
> numa_node (because you're on a non-NUMA box for example).
>
> int rte_eth_dev_socket_id(uint8_t port_id)
> {
>         if (port_id >= nb_ports)
>                 return -1;
>         return rte_eth_devices[port_id].pci_dev->numa_node;
> }
>
> So you couldn't tell the difference between non-NUMA or a bad port
> value, etc.
>
> 2) The code's behavior and comments disagree with one another. In the
> pci_scan_one function, there's this code:
>
>         /* get numa node */
>         snprintf(filename, sizeof(filename), "%s/numa_node",
>                  dirname);
>         if (access(filename, R_OK) != 0) {
>                 /* if no NUMA support just set node to 0 */
>                 dev->numa_node = -1;
>         } else {
>                 if (eal_parse_sysfs_value(filename, &tmp) < 0) {
>                         free(dev);
>                         return -1;
>                 }
>                 dev->numa_node = tmp;
>         }
>
> It says, just use NUMA node 0 if there is no NUMA support. But then
> proceeds to set the value to -1 in disagreement with the comment, and
> also stomping on the other meaning for -1 in the higher function
> rte_eth_dev_socket_id.
>
> 3) In conclusion, it seems like some stuff is missing... first there
> needs to be a function that will tell you the number of NUMA nodes
> present on the box so you can create the right number of mbuf_pools,
> but I couldn't find that function.
>
> Then if you have the function, you can do some magic and shuffle the
> NICs around to get them hooked to a core on the same NUMA, and the
> mbuf_pool on the same NUMA.
>
> When NUMA is not present, can we return 0 instead of -1, or return a
> specific error code that the client can use to know he should just use
> Socket 0? Right now I can't tell apart any potential errors or weird
> values from correct values.
>
> 4) I'm willing to help make and test some patches... but first I want
> to understand what is happening with these funny functions before doing
> things blindly.
>
> Thanks,
> Matthew.
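
The ambiguity described in points 1) to 3) could be handled by keeping the "bad port" and "no NUMA information" cases apart. The following is only a sketch of that idea, not DPDK code: `resolve_socket` and the `socket_lookup` codes are hypothetical names, `port_id`/`nb_ports` stand in for the bounds check quoted above, and `raw_node` stands in for the `numa_node` value recorded at PCI scan time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result codes; these are not part of DPDK. */
enum socket_lookup {
    SOCKET_BAD_PORT = -1, /* port_id out of range, no socket reported */
    SOCKET_OK       = 0,  /* raw_node was a real NUMA node */
    SOCKET_NO_NUMA  = 1   /* no NUMA info, fell back to socket 0 */
};

/*
 * Decide which socket to allocate on.
 * raw_node is assumed to be -1 when sysfs exposed no numa_node file.
 */
static enum socket_lookup
resolve_socket(unsigned port_id, unsigned nb_ports, int raw_node, int *socket_out)
{
    assert(socket_out != NULL);

    if (port_id >= nb_ports)
        return SOCKET_BAD_PORT;

    if (raw_node < 0) {
        *socket_out = 0; /* non-NUMA box: use socket 0 */
        return SOCKET_NO_NUMA;
    }

    *socket_out = raw_node;
    return SOCKET_OK;
}
```

With a separate status code, a caller can treat an out-of-range port as an error while still creating its mbuf pool on socket 0 when the box simply has no NUMA information.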
[dpdk-dev] virtio UIO / PMD issues in default Ubuntu Cloud Images
Another problem regarding virtio-net-pmd. When I tried using
virtio-net-pmd, it compiles fine but then also hits a weird error during
the EAL init process:

EAL: open shared lib /vagrant/external/virtio-net-pmd/librte_pmd_virtio.so
EAL: /vagrant/external/virtio-net-pmd/librte_pmd_virtio.so: undefined symbol: per_lcore__lcore_id

It doesn't seem to have a link dependency against any DPDK library that
might contain such a symbol, either:

$ ldd librte_pmd_virtio.so
linux-vdso.so.1 => (0x7fffd61fc000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x7fa2d971f000)
/lib64/ld-linux-x86-64.so.2 (0x7fa2d9d0)

$ nm librte_pmd_virtio.so | fgrep -i lcore
U per_lcore__lcore_id
14ee t rte_lcore_id

man nm says this means: "U" The symbol is undefined.

At present I am using the common setup, static DPDK w/ COMBINE_LIBS. The
directions don't state that a shared lib DPDK is required, and if it is
required, that increases complexity and reduces performance so it'd be
better not to be forced to require this unless there's a good reason.

Instead I think it would be good if this driver didn't have to be
external so it would just work right by default as part of the normal
DPDK... right now it seems like it's a lot harder to set it up than it
really needs to be.

Another weird issue... when I tried to compile a DPDK shared lib using
clang I got this really, really weird error:

/usr/bin/ld: test: hidden symbol `mknod' in /usr/lib/x86_64-linux-gnu/libc_nonshared.a(mknod.oS) is referenced by DSO
/usr/bin/ld: final link failed: Bad value

Did anybody ever see something like this before?

Thanks,
Matthew.

On Mon, Oct 13, 2014 at 10:45:23PM -0700, Matthew Hall wrote:
> Hello,
>
> I was working to get my open source project running in a VirtualBox
> Vagrant VM powered by an Ubuntu Cloud image the last few days to make
> my project and DPDK more developer friendly with a prebuilt
> environment. During this I was fixing the ugly hardcodings I'd used to
> hack it together on a desktop.
>
> In the process I found a few things I was confused about.
>
> 1) Ubuntu doesn't include the UIO module by default in the Cloud
> image... only the virtio-net. I was wondering if we had anybody in good
> contact with their kernel group to lobby for inclusion of the UIO
> modules in the stock Cloud kernel, without having to grab linux-generic
> and linux-headers-generic first.
>
> 2) The directions for activating virtio-net based interfaces seem
> out-of-date. They refer to the PMD as rte_virtio_net_pmd, when the PMD
> calls itself rte_virtio_pmd in my copy of DPDK 1.7.1 w/ my clang
> compilation patches added.
>
> I am getting some odd errors when I'm trying to load my app:
>
> EAL: PCI device :00:08.0 on NUMA socket -1
> EAL: probe driver: 1af4:1000 rte_virtio_pmd
> EAL: :00:08.0 not managed by UIO driver, skipping
>
> 3) If I try to bind the device to the igb_uio driver even though it
> seems like the wrong thing to do, just for testing, this is what
> happens (NOTE: unbound the 00:08.0 device from the kernel to show
> this):
>
> $ sudo tools/dpdk_nic_bind.py --status
>
> Network devices using DPDK-compatible driver
>
> Network devices using kernel driver
> ===
> :00:03.0 'Virtio network device' if= drv=virtio-pci unused=igb_uio
>
> Other network devices
> =
> :00:08.0 'Virtio network device' unused=igb_uio
>
> $ sudo tools/dpdk_nic_bind.py -b igb_uio 00:08.0
> Error: bind failed for :00:08.0 - Cannot bind to driver igb_uio
>
> vagrant@vagrant-ubuntu-trusty-64:/vagrant/external/dpdk$ dmesg | tail
> [ 1766.445609] igb_uio: Use MSIX interrupt by default
> [ 1824.602075] igb_uio: probe of :00:08.0 failed with error -2
> [ 1824.602742] igb_uio: probe of :00:08.0 failed with error -2
>
> 4) I found some old email threads asking about this:
>
> http://comments.gmane.org/gmane.comp.networking.dpdk.devel/1357
> (there are some others as well but this seemed closest)
>
> But the only thing present in that thread seemed to be irritated
> replies which didn't really explain the different virtio PMD's out
> there and how they work, and didn't explain which ones were in-tree and
> out-of-tree either.
>
> So let me ask this again, when somebody wrote "virtio-net or virtio-net
> + uio (QEMU, VirtualBox)" into the supported page
> (http://dpdk.org/doc/nics), who tested this to prove it worked? How did
> they get it to work on VirtualBox? The last reply stating "you have the
> source code" didn't really explain how they proved this stuff ever
> worked in the first place.
>
> Thanks,
> Matthew.
[dpdk-dev] virtio UIO / PMD issues in default Ubuntu Cloud Images
On Mon, Oct 13, 2014 at 11:03:53PM -0700, Matthew Hall wrote:
> Another weird issue... when I tried to compile a DPDK shared lib using
> clang I got this really, really weird error:
>
> /usr/bin/ld: test: hidden symbol `mknod' in
> /usr/lib/x86_64-linux-gnu/libc_nonshared.a(mknod.oS) is referenced by DSO
> /usr/bin/ld: final link failed: Bad value

Note: this specific error seems to be a bug in the behavior of DPDK
compilation when the following two options are enabled simultaneously:

CONFIG_RTE_BUILD_SHARED_LIB=y
CONFIG_RTE_BUILD_COMBINE_LIBS=y

I think this is a pretty serious problem for anybody who's packaging or
distributing a complete DPDK, because compiling both the static and
dynamic DPDKs at the same time is going to fail with this weird error.

Matthew.
[dpdk-dev] virtio UIO / PMD issues in default Ubuntu Cloud Images
On Mon, Oct 13, 2014 at 11:03:53PM -0700, Matthew Hall wrote:
> Another problem regarding virtio-net-pmd. When I tried using
> virtio-net-pmd, it compiles fine but then hits a weird error also
> during EAL init process:
>
> EAL: open shared lib /vagrant/external/virtio-net-pmd/librte_pmd_virtio.so
> EAL: /vagrant/external/virtio-net-pmd/librte_pmd_virtio.so: undefined symbol:
> per_lcore__lcore_id
>
> It doesn't seem to have a link dependency against any DPDK library that
> might contain such a symbol, either:
>
> $ ldd librte_pmd_virtio.so
> linux-vdso.so.1 => (0x7fffd61fc000)
> libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x7fa2d971f000)
> /lib64/ld-linux-x86-64.so.2 (0x7fa2d9d0)
>
> $ nm librte_pmd_virtio.so | fgrep -i lcore
> U per_lcore__lcore_id
> 14ee t rte_lcore_id
>
> man nm says this means: "U" The symbol is undefined.
>
> At present I am using the common setup, static DPDK w/ COMBINE_LIBS.
> The directions don't state that a shared lib DPDK is required, and if
> it is required, that increases complexity and reduces performance so
> it'd be better not to be forced to require this unless there's a good
> reason.

The PMD seems to load if DPDK is built w/ shared libraries:

EAL: open shared lib /vagrant/external/virtio-net-pmd/librte_pmd_virtio.so
librte_pmd_virtio version 1.2
Copyright 2013-2014 6WIND S.A.
provided without warranty.

However, the documentation doesn't state that this is required... and
requiring it is very irritating for developers.

So I really hope we could just incorporate this PMD into the DPDK itself,
or fix whatever bugs make it not work with a static DPDK, or perhaps
allow it to be built as a static lib and linked into one's app during
compile so there's no need to load it as a plugin.

At minimum the documentation should clearly state the precise steps
needed to get this to work, so it doesn't just blow up on the user with a
very weird series of errors as seen above.

Matthew.