Re: [dpdk-dev] [RFC] eventdev: add event adapter for ethernet Rx queues
-Original Message- > Date: Wed, 24 May 2017 10:00:22 +0530 > From: "Rao, Nikhil" > To: Jerin Jacob , Gage Eads > > CC: dev@dpdk.org, tho...@monjalon.net, bruce.richard...@intel.com, > harry.van.haa...@intel.com, hemant.agra...@nxp.com, nipun.gu...@nxp.com, > narender.vang...@intel.com, nikhil@intel.com > Subject: Re: [RFC] eventdev: add event adapter for ethernet Rx queues > User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64; rv:38.0) Gecko/20100101 > Thunderbird/38.7.2 > > Hi Jerin, Hi Nikhil, > > Comments inline. > > Also, another function needed is > bool rte_event_eth_rx_adapter_multithread_capable(void). > > This would be used to set the "multithread_capable" service core > configuration parameter. OK. I was thinking like, in order to effectively use adapter scheme, it should use ops scheme like rte_flow or rte_tm[1] where the same API can be can be used for both HW and SW. If we see, Both RFC[2], We have a lot of similarities. I think, We can base the eth_rx_adapter model based on your SW requirement RFC and introduce capability wherever it is not applicable for HW or vice versa. See below as a example[3]. Can you take of the same in v1 of this series? if you don't have the bandwidth then I can try. Let me know. Thoughts? [1] http://dpdk.org/dev/patchwork/patch/25275/ http://dpdk.org/dev/patchwork/patch/25276/ [2] http://dpdk.org/ml/archives/dev/2017-May/065341.html /* adapter has inbuilt port, no need to create producer port */ #define RTE_EVENT_ETHDEV_CAP_INBUILT_PORT (1ULL << 0) /* adapter does not need service function */ #define RTE_EVENT_ETHDEV_CAP_NO_SERVICE_FUNC (1ULL << 1) struct rte_event_eth_rx_adap_info { char name[32]; uint32_t adapter_cap; /**< Ethdev RX adapter capabilities(RTE_EVENT_ETHDEV_CAP_)*/ } struct rte_event_eth_rx_adap_cfg { uint8_t rx_event_port_id; /**< Event port identifier, the adapter enqueues mbuf events to this * port, Ignored when RTE_EVENT_ETHDEV_CAP_INBUILT_PORT */ } struct rte_eth_rx_event_adapter_queue_config { uint32_t rx_queue_flags; /**< Flags for handling received packets */ uint16_t servicing_weight; /**< Relative polling frequency of ethernet receive queue, if this * is set to zero, the Rx queue is interrupt driven * Ignored if RTE_EVENT_ETHDEV_CAP_NO_SERVICE_FUNC set */ struct rte_event ev; /**< * The values from the following event fields will be used when * enqueuing mbuf events: * - event_queue_id: Targeted event queue ID for received packets. * - event_priority: Event priority of packets from this Rx queue in * the event queue relative to other events. * - sched_type: Scheduling type for packets from this Rx queue. * - flow_id: If the RTE_ETH_RX_EVENT_ADAPTER_QUEUE_FLOW_ID_VALID bit * is set in rx_queue_flags, this flow_id is used for all * packets received from this queue. Otherwise the flow ID * is set to the RSS hash. */ }; int rte_event_eth_rx_adapter_create(uint8_t id, uint8_t dev_id, uint8_t eth_port_id); int rte_event_eth_rx_adapter_get_info(uint8_t id, struct rte_event_eth_rx_adap_info *info); int rte_event_eth_rx_adapter_configure(uint8_t id, struct rte_event_eth_rx_adap_config *cfg); int rte_event_eth_rx_adapter_queue_add(uint8_t id, int32_t rx_queue_id, const struct rte_eth_rx_event_adapter_queue_config *config); int rte_event_eth_rx_adapter_queue_del(uint8_t id, int32_t rx_queue_id) int rte_event_eth_rx_adapter_run(); int rte_event_eth_rx_adapter_free(uint8_t id); > > Thanks, > Nikhil > > On 5/11/2017 10:08 PM, Jerin Jacob wrote: > > -Original Message- > >> Date: Tue, 9 May 2017 15:38:46 -0500 > >> From: Gage Eads > >> To: dev@dpdk.org > >> CC: nikhil@intel.com, jerin.ja...@caviumnetworks.com, > >> tho...@monjalon.net, bruce.richard...@intel.com, > >> harry.van.haa...@intel.com, hemant.agra...@nxp.com, nipun.gu...@nxp.com, > >> narender.vang...@intel.com > >> Subject: [RFC] eventdev: add event adapter for ethernet Rx queues > >> X-Mailer: git-send-email 2.7.4 > >> > >> From: Nikhil Rao > > > > Hi Nikhil and Gage, > > > > Thanks for the RFC. A few questions and comments below. > > Looks like SW has more constraints on event producer side, after we > > finalize on this RFC(I guess only a few minor changes are only required). > > I will align other[1] RFC based on _your_ RFC as we need to > > converge on name space and we can't duplicate configs like struct > > rte_event_dev_producer_conf etc > > > > [1] > > http://dpdk.org/ml/archives/dev/2017-May/065341.html > > > > > > > >> + */ > >> + > >> +#ifdef __cplusplus > >> +extern "C" { > >> +#endif > >> + > >> +#include > >> +#include > >> +#include > >> + > >> +/* struct rte_eth_rx_event_adapter_queue_config flags definitions */ > >> +#define RTE_ETH_
[dpdk-dev] [PATCH] doc: fix typos in virtio howto guide
Signed-off-by: Yong Wang --- doc/guides/howto/virtio_user_as_exceptional_path.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/doc/guides/howto/virtio_user_as_exceptional_path.rst b/doc/guides/howto/virtio_user_as_exceptional_path.rst index 0bbcd3f..3f99fe8 100644 --- a/doc/guides/howto/virtio_user_as_exceptional_path.rst +++ b/doc/guides/howto/virtio_user_as_exceptional_path.rst @@ -54,7 +54,7 @@ solution is very promising in: * Performance similar to KNI, this solution would use one or more kthreads to -send/receive packets from user space DPDK applications, which has little +send/receive packets to/from user space DPDK applications, which has little impact on user space polling thread (except that it might enter into kernel space to wake up those kthreads if necessary). @@ -94,7 +94,7 @@ compiling the kernel and those kernel modules should be inserted. This is used to negotiate VIRTIO_NET_F_GUEST_TSO4 and VIRTIO_NET_F_GUEST_TSO6 feature so that large packets from kernel can be -transmitted DPDK application and further TSOed by physical NIC. +transmitted to DPDK application and further TSOed by physical NIC. * ``--enable-rx-cksum`` -- 1.8.3.1
Re: [dpdk-dev] [PATCH 01/38] eal: add support for 24 40 and 48 bit operations
Hello Adrien, On Friday 16 June 2017 04:04 PM, Adrien Mazarguil wrote: Hi Shreyansh, On Fri, Jun 16, 2017 at 09:21:35AM +, Shreyansh Jain wrote: Hi Bruce, -Original Message- From: Bruce Richardson [mailto:bruce.richard...@intel.com] Sent: Friday, June 16, 2017 2:27 PM To: Shreyansh Jain Cc: dev@dpdk.org; ferruh.yi...@intel.com; Hemant Agrawal Subject: Re: [dpdk-dev] [PATCH 01/38] eal: add support for 24 40 and 48 bit operations On Fri, Jun 16, 2017 at 11:10:31AM +0530, Shreyansh Jain wrote: From: Hemant Agrawal Bit Swap and LE<=>BE conversions for 23, 40 and 48 bit width Signed-off-by: Hemant Agrawal --- .../common/include/generic/rte_byteorder.h | 78 ++ 1 file changed, 78 insertions(+) Are these really common enough for inclusion in an generic EAL file? Would they be better being driver specific, so that we don't end up with lots of extra byte-swap routines for each possible size used by a driver. Reasoning was to keep all bit/byte swap at a single place and if it is useful for others. From DPAA perspective, these macro can be anywhere. In case someone else too has use of this (now or in near-future), probably then we can consider this in EAL. Else, if I don't get much responses in a few days, I will shift them to DPAA driver in next version of this series. While I'm not against exposing exotic byte swapping functions, they are not completely safe and I'm not sure they should be part of public header files on that basis. Problem is their storage size is larger than the number of bytes they deal with, which raises the question: are filler bytes prepended or appended to the converted value? How about input values in non-native order? Answering that is not so easy as it depends on the use case. We actually had a similar issue when defining VXLAN's VNI field for rte_flow, which is 24-bit in network order. Take rte_constant_bswap48() for instance, assuming input value is little-endian, output is supposed to be big-endian. While the shifts are correct, filler bytes are not in the right place for a big-endian system, and the resulting value stored on uint64_t cannot be used as-is. Again, that depends on the use case, it could be correct if the resulting value was to be used as is on a little-endian system. I understand what you have stated - the application or any user needs to be context aware about what they are using and the side-effect of such conversions. I think the only safe way to deal with that is by defining specific types of the proper size, e.g.: typedef uint8_t uint48_t[6]; These are cumbersome and cannot be used like normal integers though. With such types, byte-swapping functions become meaningless. Since these are supposed to be rather simple functions, I'm not sure handling/documenting all this complexity in rte_byteorder.h makes sense. I have no issues moving these into DPAA specific code. Hemant added them in generic just in case they would be of use to others. - Shreyansh
Re: [dpdk-dev] [PATCH] eal: add Bus log type
Hello Thomas, On Wednesday 24 May 2017 11:08 AM, Shreyansh Jain wrote: Signed-off-by: Shreyansh Jain --- This was missed while adding the rte_bus code. (But, this is not a fix) Also, I couldn't find any maintainer listed for common/include/* code in MAINTAINERS, so, sending 'to' dev@ list. lib/librte_eal/common/include/rte_log.h | 1 + 1 file changed, 1 insertion(+) diff --git a/lib/librte_eal/common/include/rte_log.h b/lib/librte_eal/common/include/rte_log.h index 3419138..1e45e87 100644 --- a/lib/librte_eal/common/include/rte_log.h +++ b/lib/librte_eal/common/include/rte_log.h @@ -87,6 +87,7 @@ extern struct rte_logs rte_logs; #define RTE_LOGTYPE_CRYPTODEV 17 /**< Log related to cryptodev. */ #define RTE_LOGTYPE_EFD 18 /**< Log related to EFD. */ #define RTE_LOGTYPE_EVENTDEV 19 /**< Log related to eventdev. */ +#define RTE_LOGTYPE_BUS 20 /**< Log related to Bus. */ /* these log types can be used in an application */ #define RTE_LOGTYPE_USER1 24 /**< User-defined log type 1. */ Just wanted to ping - is there something wrong/unacceptable about this? - Shreyansh
Re: [dpdk-dev] [PATCH v5 0/2] Balanced allocation of hugepages
On 6/14/2017 11:41 AM, Ilya Maximets wrote: On 08.06.2017 18:44, Sergio Gonzalez Monroy wrote: On 08/06/2017 13:14, Bruce Richardson wrote: On Thu, Jun 08, 2017 at 02:21:58PM +0300, Ilya Maximets wrote: Hi everyone, I just want to clarify current status of these patches. As I understand, moving to the new build system (for example, meson+ninja as proposed[1] by Bruce) is a very long process. But we have issues with imbalanced memory allocation now, and IMHO it's better to fix them in a near future. Latest version (v5) of balanced allocation patches adds linbuma as general unconditional dependency which conflicts with the current DPDK policies. So, there are 2 option: 1. Return back config option RTE_LIBRTE_EAL_NUMA_AWARE_HUGEPAGES from the first version of the patch and disable it by default. 2. Keep patch as it is now and make everyone install libnuma for successful build. I have no preferences about above options. I'm asking your opinions. Bruce, Sergio, Thomas, what do you think? [1] http://dpdk.org/ml/archives/dev/2017-June/067428.html Best regards, Ilya Maximets. I would be ok with having libnuma as a dependency, so I think I'd prefer option 2 to 1, assuming libnuma is available in all major Linux distros. /Bruce +1 on option 2 (current patch and libnuma as DPDK dependency). Sergio Ok. In this case I'm waiting for review. And someone need to install libnuma development package in automatic build test environment. Otherwise there will be constant compilation test failures like this: http://dpdk.org/ml/archives/test-report/2017-June/021437.html Best regards, Ilya Maximets. +1 for option 1 It will be a issue and undesired dependency for SoCs, not supporting NUMA architecture. It can be added to the config, who desired to use it by default.
Re: [dpdk-dev] [PATCH] lpm: fix index of tbl8
On Mon, Jun 19, 2017 at 12:14:38PM +0800, Wei Dai wrote: > From v20 to v1604, number of tlb8 can be up to 1<<24, > (uint8_t) or (uint16_t) may truncate the number of > index of tlb8 in v1604 and cause wrong number. > > Fixes: dc81ebbacaeb ("lpm: extend IPv4 next hop field") > Cc: sta...@dpdk.org > > Signed-off-by: Wei Dai Acked-by: Bruce Richardson
Re: [dpdk-dev] [PATCH 0/2] Allow application set mempool handle
On 6/1/2017 1:35 PM, Santosh Shukla wrote: Some platform can have two different NICs for example external PCI Intel 40G card and Integrated NIC like vNIC/octeontx/dpaa2. Both NICs like to use their preferred pool e.g. external PCI card/ vNIC's preferred pool would be the ring based pool and octeontx/dpaa2 preferred would be ext-mempools. Right now, Framework doesn't support such case. Only one pool can be used across two different NIC's. For that, user has to statically set CONFIG_RTE_MEMPOOL_DEFAULT_OPS=. So proposing two approaches: Patch 1) Introducing eal option --pkt-mempool= Patch 2) Introducing ethdev API called _get_preferred_pool(), where PMD driver gets a chance to advertise their pool capability to the application. And based on that hint- application creates pools for that driver. The idea is good. it will help the vendors with hw mempool support. On a similar line, I also submitted a patch to check the existence of a mempool instance. http://dpdk.org/dev/patchwork/patch/15877/ Option 1) requires manual knowledge of underlying NIC and different commands for different machines. Option 2) this will help more as it allows the application to take decision autonomously. In addition to it, we can also extend the overall MEMPOOL_OPS support. 3) currently we support defining only one "RTE_MBUF_DEFAULT_MEMPOOL_OPS" this can be supported to publish a priority list of MEMPOOL_OPS in config. if one is not available, application can try the next one in priority list as supported by the platform. 4) we can also try something, where the existing application can also be supported. - default mempool is configured as alias. This is with empty ops. - based on the mempool detections on the bus, the bus configure the mempool ops internally with the actual ones. Santosh Shukla (2): eal: Introducing option to set mempool handle ether/ethdev: Allow pmd to advertise preferred pool capability lib/librte_eal/bsdapp/eal/eal.c | 9 +++ lib/librte_eal/bsdapp/eal/rte_eal_version.map | 7 + lib/librte_eal/common/eal_common_options.c | 3 +++ lib/librte_eal/common/eal_internal_cfg.h| 2 ++ lib/librte_eal/common/eal_options.h | 2 ++ lib/librte_eal/common/include/rte_eal.h | 9 +++ lib/librte_eal/linuxapp/eal/eal.c | 36 + lib/librte_eal/linuxapp/eal/rte_eal_version.map | 7 + lib/librte_ether/rte_ethdev.c | 16 +++ lib/librte_ether/rte_ethdev.h | 21 +++ lib/librte_ether/rte_ether_version.map | 7 + lib/librte_mbuf/rte_mbuf.c | 8 -- 12 files changed, 125 insertions(+), 2 deletions(-)
Re: [dpdk-dev] [PATCH 0/2] Allow application set mempool handle
-Original Message- > Date: Mon, 19 Jun 2017 17:22:46 +0530 > From: Hemant Agrawal > To: Santosh Shukla , > olivier.m...@6wind.com, dev@dpdk.org > CC: jerin.ja...@caviumnetworks.com > Subject: Re: [PATCH 0/2] Allow application set mempool handle > User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:45.0) Gecko/20100101 > Thunderbird/45.8.0 > > On 6/1/2017 1:35 PM, Santosh Shukla wrote: > > Some platform can have two different NICs for example external PCI Intel > > 40G card and Integrated NIC like vNIC/octeontx/dpaa2. > > > > Both NICs like to use their preferred pool e.g. external PCI card/ vNIC's > > preferred pool would be the ring based pool and octeontx/dpaa2 preferred > > would > > be ext-mempools. > > Right now, Framework doesn't support such case. Only one pool can be > > used across two different NIC's. For that, user has to statically set > > CONFIG_RTE_MEMPOOL_DEFAULT_OPS=. > > > > So proposing two approaches: > > Patch 1) Introducing eal option --pkt-mempool= > > Patch 2) Introducing ethdev API called _get_preferred_pool(), where PMD > > driver > > gets a chance to advertise their pool capability to the application. And > > based > > on that hint- application creates pools for that driver. > > > The idea is good. it will help the vendors with hw mempool support. > > On a similar line, I also submitted a patch to check the existence of a > mempool instance. > http://dpdk.org/dev/patchwork/patch/15877/ > > Option 1) requires manual knowledge of underlying NIC and different commands > for different machines. > > Option 2) this will help more as it allows the application to take decision > autonomously. > > In addition to it, we can also extend the overall MEMPOOL_OPS support. > 3) currently we support defining only one "RTE_MBUF_DEFAULT_MEMPOOL_OPS" > this can be supported to publish a priority list of MEMPOOL_OPS in > config. if one is not available, application can try the next one in > priority list as supported by the platform. > > 4) we can also try something, where the existing application can also be > supported. > - default mempool is configured as alias. This is with empty ops. > - based on the mempool detections on the bus, the bus configure the > mempool ops internally with the actual ones. What if both HW are on PCIe bus(That the case for us)? Any scheme to fix that? > > > > Santosh Shukla (2): > > eal: Introducing option to set mempool handle > > ether/ethdev: Allow pmd to advertise preferred pool capability > > > > lib/librte_eal/bsdapp/eal/eal.c | 9 +++ > > lib/librte_eal/bsdapp/eal/rte_eal_version.map | 7 + > > lib/librte_eal/common/eal_common_options.c | 3 +++ > > lib/librte_eal/common/eal_internal_cfg.h| 2 ++ > > lib/librte_eal/common/eal_options.h | 2 ++ > > lib/librte_eal/common/include/rte_eal.h | 9 +++ > > lib/librte_eal/linuxapp/eal/eal.c | 36 > > + > > lib/librte_eal/linuxapp/eal/rte_eal_version.map | 7 + > > lib/librte_ether/rte_ethdev.c | 16 +++ > > lib/librte_ether/rte_ethdev.h | 21 +++ > > lib/librte_ether/rte_ether_version.map | 7 + > > lib/librte_mbuf/rte_mbuf.c | 8 -- > > 12 files changed, 125 insertions(+), 2 deletions(-) > > > >
[dpdk-dev] next techboard meeting
The next meeting of the techboard will happen on IRC #dpdk-board, at 3pm UTC, this Wednesday 21th of June. The agenda is updated here: https://annuel.framapad.org/p/r.0c3cc4d1e011214183872a98f6b5c7db
Re: [dpdk-dev] [PATCH] maintainers: claim responsability for xen
Thanks Jianfeng for giving new ideas. There is not much activity on Xen side. Is there someone working on DPDK+Xen? Any news? The technical board requested to re-consider Xen support in DPDK. It will be discussed in the next techboard meeting: https://annuel.framapad.org/p/r.0c3cc4d1e011214183872a98f6b5c7db 11/05/2017 13:41, Tan, Jianfeng: > Hi Thomas and all, > > Apologize for being an unqualified maintainer. > > > -Original Message- > From: Thomas Monjalon [mailto:tho...@monjalon.net] > > > Ping > > > > The Xen dom0 support in DPDK seems dead. > > > > Reminder: > > Last time we talked about, it was because of a severe bug which is not > > fixed yet: > > http://dpdk.org/ml/archives/dev/2016-July/044207.html > > For this bug, we removed the userspace memset(0) and suppose it has been done > by kernel, however, xen0 uses __get_free_pages() kernel API to map hugepages > and reseve memseg, I think it makes sense to zero the hugepage for xen0 in > rte_dom0_mm kernel module (instead of some special code for xen0 in > userspace) to keep aligned behavior. > > > http://dpdk.org/ml/archives/dev/2016-July/044376.html > > It does not make any sense to upstream a netfront PMD before we have a > netback PMD, as the legacy netback driver would be the bottleneck. Anyone has > plan on this? And a question mark keeps in my mind that is it a must to > implement netback in dom0? > > From another perspective, instead of using netfront/netback, we can also use > virtio/vhost as the device model; however, xl tool in xen only supports > vhost-kernel backend instead of vhost-user backend. So anyone has plan to > enhance xl tool so that we can accelerate dom0 just using vswitch like > OVS-DPDK? > > A third solution is to use xenvirtio as the frontend, and vhost_xen as the > backend. This solution is to use virtio ring on grant table mechanism of xen. > Honestly, I don't even know if it still work now. And to make it more usable, > better to upstream vhost_xen inside popular vswitch like OVS-DPDK. > > > The request (9 months ago) was to give more time for feedbacks: > > http://dpdk.org/ml/archives/dev/2016-July/044847.html > > Apologize again that I volunteer to maintain these files, but spend very few > time on this. > > Thanks, > Jianfeng
Re: [dpdk-dev] [RFC] proposal of allowing personal/project repos on DPDK.org
Hi, 01/06/2017 07:07, Tiwei Bie: > We'd like to make a proposal of making DPDK.org allow hosting > some personal/project repos, which could be very useful when > someone wants to try some experimental projects in DPDK. It has been discussed during the last technical board meeting. The board was not convinced by this idea, but would like to better understand what is the problem to be solved with this proposal? If there is something to solve, we could decide one these proposals: 1/ let users free to choose their forge 2/ advise for gitlab 3/ advise for github (and fight to recover the name DPDK) 4/ pay a sysadmin to host and secure our own forge in dpdk.org domain
[dpdk-dev] [PATCH] doc: Minor typo in documentation
Signed-off-by: Harrison McCullough --- doc/guides/linux_gsg/enable_func.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/doc/guides/linux_gsg/enable_func.rst b/doc/guides/linux_gsg/enable_func.rst index 04e066c..15f53b1 100644 --- a/doc/guides/linux_gsg/enable_func.rst +++ b/doc/guides/linux_gsg/enable_func.rst @@ -35,8 +35,8 @@ Enabling Additional Functionality .. _High_Precision_Event_Timer: -High Precision Event Timer HPET) Functionality --- +High Precision Event Timer (HPET) Functionality +--- BIOS Support -- 1.9.1
Re: [dpdk-dev] [PATCH 01/38] eal: add support for 24 40 and 48 bit operations
> On Jun 19, 2017, at 6:00 AM, Shreyansh Jain wrote: > > Hello Adrien, > > On Friday 16 June 2017 04:04 PM, Adrien Mazarguil wrote: >> Hi Shreyansh, >> On Fri, Jun 16, 2017 at 09:21:35AM +, Shreyansh Jain wrote: >>> Hi Bruce, >>> -Original Message- From: Bruce Richardson [mailto:bruce.richard...@intel.com] Sent: Friday, June 16, 2017 2:27 PM To: Shreyansh Jain Cc: dev@dpdk.org; ferruh.yi...@intel.com; Hemant Agrawal Subject: Re: [dpdk-dev] [PATCH 01/38] eal: add support for 24 40 and 48 bit operations On Fri, Jun 16, 2017 at 11:10:31AM +0530, Shreyansh Jain wrote: > From: Hemant Agrawal > > Bit Swap and LE<=>BE conversions for 23, 40 and 48 bit width > > Signed-off-by: Hemant Agrawal > --- > .../common/include/generic/rte_byteorder.h | 78 ++ > 1 file changed, 78 insertions(+) > Are these really common enough for inclusion in an generic EAL file? Would they be better being driver specific, so that we don't end up with lots of extra byte-swap routines for each possible size used by a driver. >>> Reasoning was to keep all bit/byte swap at a single place and if it is >>> useful for others. >>> >>> From DPAA perspective, these macro can be anywhere. In case someone else too >>> has use of this (now or in near-future), probably then we can consider this >>> in EAL. >>> Else, if I don't get much responses in a few days, I will shift them to >>> DPAA driver in next version of this series. >> While I'm not against exposing exotic byte swapping functions, they are not >> completely safe and I'm not sure they should be part of public header files >> on that basis. >> Problem is their storage size is larger than the number of bytes they deal >> with, which raises the question: are filler bytes prepended or appended to >> the converted value? How about input values in non-native order? Answering >> that is not so easy as it depends on the use case. We actually had a similar >> issue when defining VXLAN's VNI field for rte_flow, which is 24-bit in >> network order. >> Take rte_constant_bswap48() for instance, assuming input value is >> little-endian, output is supposed to be big-endian. While the shifts are >> correct, filler bytes are not in the right place for a big-endian system, >> and the resulting value stored on uint64_t cannot be used as-is. Again, that >> depends on the use case, it could be correct if the resulting value was to >> be used as is on a little-endian system. > > I understand what you have stated - the application or any user needs to be > context aware about what they are using and the side-effect of such > conversions. > >> I think the only safe way to deal with that is by defining specific types of >> the proper size, e.g.: >> typedef uint8_t uint48_t[6]; >> These are cumbersome and cannot be used like normal integers though. With >> such types, byte-swapping functions become meaningless. >> Since these are supposed to be rather simple functions, I'm not sure >> handling/documenting all this complexity in rte_byteorder.h makes sense. > > I have no issues moving these into DPAA specific code. Hemant added them in > generic just in case they would be of use to others. > > - > Shreyansh These are all static inline functions, so no real code increase unless used and having them in one spot is the best place IMO. Regards, Keith
Re: [dpdk-dev] [PATCH v5 1/3] lib: add Generic Receive Offload API framework
On 6/18/2017 3:21 PM, Jiayu Hu wrote: Generic Receive Offload (GRO) is a widely used SW-based offloading technique to reduce per-packet processing overhead. It gains performance by reassembling small packets into large ones. This patchset is to support GRO in DPDK. To support GRO, this patch implements a GRO API framework. To enable more flexibility to applications, DPDK GRO is implemented as a user library. Applications explicitly use the GRO library to merge small packets into large ones. DPDK GRO provides two reassembly modes. One is called lightweigth mode, the other is called heavyweight mode. If applications want merge packets in a simple way, they can use lightweigth mode. If applications need more fine-grained controls, they can choose heavyweigth mode. rte_gro_reassemble_burst is the main reassembly API which is used in lightweigth mode and processes N packets at a time. For applications, performing GRO in lightweigth mode is simple. They just need to invoke rte_gro_reassemble_burst. Applications can get GROed packets as soon as rte_gro_reassemble_burst returns. rte_gro_reassemble is the main reassembly API which is used in lightweigth mode and processes one packet at a time. For applications, performing GRO in heavyweigth mode is relatively complicated. Before performing GRO, applications need to create a GRO table by rte_gro_tbl_create. Then they can use rte_gro_reassemble to process packets one by one. The processed packets are in the GRO table. If applications want to get them, applications need to manually flush them by flush APIs. For these two APIs, I suppose they will try best to reassemble the packets according to the supported GRO engine. So we need to call all GRO engines according to the ptype of this packet. And this framework should be implemented in this file. In DPDK GRO, different GRO types define own reassembly tables. When create a GRO table, it keeps the reassembly tables of desired GRO types. To process one packet, we search for the corresponding reassembly table according to the packet type first. Then search for the reassembly table to find an existed packet to merge. If find, chain the two packets together. If not find, insert the packet into the reassembly table. If the packet is with wrong checksum, or is fragmented etc., error happens. The reassebly function will stop processing the packet. Signed-off-by: Jiayu Hu --- config/common_base | 5 ++ lib/Makefile | 1 + lib/librte_gro/Makefile | 50 +++ lib/librte_gro/rte_gro.c | 126 lib/librte_gro/rte_gro.h | 213 +++ mk/rte.app.mk| 1 + 6 files changed, 396 insertions(+) create mode 100644 lib/librte_gro/Makefile create mode 100644 lib/librte_gro/rte_gro.c create mode 100644 lib/librte_gro/rte_gro.h If we expose some APIs, we always add a rte_vhost_version.map file in that directory. diff --git a/config/common_base b/config/common_base index f6aafd1..167f5ef 100644 --- a/config/common_base +++ b/config/common_base @@ -712,6 +712,11 @@ CONFIG_RTE_LIBRTE_VHOST_DEBUG=n CONFIG_RTE_LIBRTE_PMD_VHOST=n # +# Compile GRO library +# +CONFIG_RTE_LIBRTE_GRO=y + +# #Compile Xen domain0 support # CONFIG_RTE_LIBRTE_XEN_DOM0=n diff --git a/lib/Makefile b/lib/Makefile index 07e1fd0..e253053 100644 --- a/lib/Makefile +++ b/lib/Makefile @@ -106,6 +106,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_REORDER) += librte_reorder DEPDIRS-librte_reorder := librte_eal librte_mempool librte_mbuf DIRS-$(CONFIG_RTE_LIBRTE_PDUMP) += librte_pdump DEPDIRS-librte_pdump := librte_eal librte_mempool librte_mbuf librte_ether +DIRS-$(CONFIG_RTE_LIBRTE_GRO) += librte_gro ifeq ($(CONFIG_RTE_EXEC_ENV_LINUXAPP),y) DIRS-$(CONFIG_RTE_LIBRTE_KNI) += librte_kni diff --git a/lib/librte_gro/Makefile b/lib/librte_gro/Makefile new file mode 100644 index 000..9f4063a --- /dev/null +++ b/lib/librte_gro/Makefile @@ -0,0 +1,50 @@ +# BSD LICENSE +# +# Copyright(c) 2016-2017 Intel Corporation. All rights reserved. +# All rights reserved. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions +# are met: +# +# * Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. +# * Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in +# the documentation and/or other materials provided with the +# distribution. +# * Neither the name of Intel Corporation nor the names of its +# contributors may be used to endorse or promote products derived +# from this software without specific prior written permission. +# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTI
Re: [dpdk-dev] [PATCH v5 2/3] lib/gro: add TCP/IPv4 GRO support
On 6/18/2017 3:21 PM, Jiayu Hu wrote: In this patch, we introduce six APIs to support TCP/IPv4 GRO. Those functions are not used outside of this library. Don't make it as extern visible. - gro_tcp_tbl_create: create a TCP reassembly table, which is used to merge packets. Will tcp6 shares the same function with tcp4? If no, please rename it to gro_tcp4_tlb_create - gro_tcp_tbl_destroy: free memory space of a TCP reassembly table. - gro_tcp_tbl_flush: flush packets in the TCP reassembly table. - gro_tcp_tbl_timeout_flush: flush timeout packets in the TCP reassembly table. - gro_tcp4_reassemble: merge an inputted packet. - gro_tcp4_tbl_cksum_update: update TCP and IPv4 header checksums for all merged packets in the TCP reassembly table. In TCP GRO, we use a table structure, called TCP reassembly table, to reassemble packets. Both TCP/IPv4 and TCP/IPv6 GRO use the same table structure. A TCP reassembly table includes a flow array and a item array, where the flow array is used to record flow information and the item array is used to record packets information. Each element in the flow array records the information of one flow, which includes two parts: - key: the criteria of the same flow. If packets have the same key value, they belong to the same flow. - start_index: the index of the first incoming packet of this flow in the item array. With start_index, we can locate the first incoming packet of this flow. Each element in the item array records one packet information. It mainly includes two parts: - pkt: packet address - next_pkt_index: index of the next packet of the same flow in the item array. All packets of the same flow are chained by next_pkt_index. With next_pkt_index, we can locate all packets of the same flow one by one. To process an incoming packet, we need three steps: a. check if the packet should be processed. Packets with the following properties won't be processed: - packets without data; - packets with wrong checksums; Why do we care to check this kind of error? Can we just suppose the applications have already dropped the packets with wrong cksum? - fragmented packets. IP fragmented? I don't think we need to check it here either. It's the application's responsibility to call librte_ip_frag firstly to reassemble IP-fragmented packets, and then call this gro library to merge TCP packets. And this procedure should be shown in an example for other users to refer. b. traverse the flow array to find a flow which the packet belongs to. If not find, insert a new flow and store the packet into the item array. You do not store the packet now. "store the packet into the item array" -> "then go to step c". c. locate the first packet of this flow in the item array via start_index. Then traverse all packets of this flow one by one via next_pkt_index. If find one packet to merge with the incoming packet, merge them but without updating checksums. If not, allocate one item in the item array to store the incoming packet and update next_pkt_index value. For better performance, we don't update header checksums once two packets are merged. The header checksums are updated only when packets are flushed from TCP reassembly tables. Why do we care to recalculate the L4 checksum when flushing? How about Just keeping the wrong cksum, and letting the applications to handle that? Signed-off-by: Jiayu Hu --- lib/librte_gro/Makefile | 1 + lib/librte_gro/rte_gro.c | 154 +++-- lib/librte_gro/rte_gro.h | 34 +-- lib/librte_gro/rte_gro_tcp.c | 527 +++ lib/librte_gro/rte_gro_tcp.h | 210 + 5 files changed, 895 insertions(+), 31 deletions(-) create mode 100644 lib/librte_gro/rte_gro_tcp.c create mode 100644 lib/librte_gro/rte_gro_tcp.h diff --git a/lib/librte_gro/Makefile b/lib/librte_gro/Makefile index 9f4063a..3495dfc 100644 --- a/lib/librte_gro/Makefile +++ b/lib/librte_gro/Makefile @@ -43,6 +43,7 @@ LIBABIVER := 1 # source files SRCS-$(CONFIG_RTE_LIBRTE_GRO) += rte_gro.c +SRCS-$(CONFIG_RTE_LIBRTE_GRO) += rte_gro_tcp.c Again, if it's just for tcp4, please use the name rte_gro_tcp4.c. # install this header file SYMLINK-$(CONFIG_RTE_LIBRTE_GRO)-include += rte_gro.h diff --git a/lib/librte_gro/rte_gro.c b/lib/librte_gro/rte_gro.c index 1bc53a2..2620ef6 100644 --- a/lib/librte_gro/rte_gro.c +++ b/lib/librte_gro/rte_gro.c @@ -32,11 +32,17 @@ #include #include +#include +#include +#include #include "rte_gro.h" +#include "rte_gro_tcp.h" -static gro_tbl_create_fn tbl_create_functions[GRO_TYPE_MAX_NB]; -static gro_tbl_destroy_fn tbl_destroy_functions[GRO_TYPE_MAX_NB]; +static gro_tbl_create_fn tbl_create_functions[GRO_TYPE_MAX_NB] = { + gro_tcp_tbl_create, NULL}; +static gro_tbl_destroy_fn tbl_destroy_functions[GRO_TYPE_MAX_NB] = { +
Re: [dpdk-dev] [PATCH v5 1/3] lib: add Generic Receive Offload API framework
On Sun, 18 Jun 2017 15:21:07 +0800 Jiayu Hu wrote: > +/** > + * This is the main reassembly API used in lightweight mode, which > + * merges numbers of packets at a time. After it returns, applications > + * can get GROed packets immediately. Applications don't need to > + * flush packets manually. In lightweight mode, applications just need > + * to tell the reassembly API what rules should be applied when merge > + * packets. Therefore, applications can perform GRO in very a simple > + * way. > + * > + * To process one packet, we find its corresponding reassembly table > + * according to the packet type. Then search for the reassembly table > + * to find one packet to merge. If find, chain the two packets together. > + * If not find, insert the inputted packet into the reassembly table. > + * Besides, to merge two packets is to chain them together. No > + * memory copy is needed. Before rte_gro_reassemble_burst returns, > + * header checksums of merged packets are re-calculated. > + * > + * @param pkts > + * a pointer array which points to the packets to reassemble. After > + * GRO, it is also used to keep GROed packets. > + * @param nb_pkts > + * the number of packets to reassemble. > + * @param param > + * Applications use param to tell rte_gro_reassemble_burst what rules > + * are demanded. > + * @return > + * the number of packets after GROed. > + */ > +uint16_t rte_gro_reassemble_burst(struct rte_mbuf **pkts __rte_unused, > + const uint16_t nb_pkts __rte_unused, > + const struct rte_gro_param param __rte_unused); I think the __rte_unused attribute should be on the function definition, not on the prototype. I think GCC ignores it on function prototypes.
[dpdk-dev] [PATCH] ether: add support for vtune task tracing
From: Ilia Kurakin The patch adds tracing of loop iterations that yielded no packets in a DPDK application. It is using ITT task API: https://software.intel.com/en-us/node/544206 We suppose the flow of using this tracing would assume the user has ITT lib and header on his machine and re-build DPDK with additional make parameters: make EXTRA_CFLAGS=-I EXTRA_LDLIBS="-L -littnotify" Signed-off-by: Ilia Kurakin --- config/common_base | 1 + lib/librte_ether/Makefile | 1 + lib/librte_ether/rte_eth_itt.h | 69 ++ lib/librte_ether/rte_ethdev.c | 7 + lib/librte_ether/rte_ethdev.h | 26 5 files changed, 104 insertions(+) create mode 100644 lib/librte_ether/rte_eth_itt.h diff --git a/config/common_base b/config/common_base index f6aafd1..60d8b63 100644 --- a/config/common_base +++ b/config/common_base @@ -135,6 +135,7 @@ CONFIG_RTE_MAX_QUEUES_PER_PORT=1024 CONFIG_RTE_LIBRTE_IEEE1588=n CONFIG_RTE_ETHDEV_QUEUE_STAT_CNTRS=16 CONFIG_RTE_ETHDEV_RXTX_CALLBACKS=y +CONFIG_RTE_ETHDEV_TRACE_WASTED_RX_ITERATIONS=n # # Turn off Tx preparation stage diff --git a/lib/librte_ether/Makefile b/lib/librte_ether/Makefile index 93fdde1..c10153a 100644 --- a/lib/librte_ether/Makefile +++ b/lib/librte_ether/Makefile @@ -56,5 +56,6 @@ SYMLINK-y-include += rte_eth_ctrl.h SYMLINK-y-include += rte_dev_info.h SYMLINK-y-include += rte_flow.h SYMLINK-y-include += rte_flow_driver.h +SYMLINK-${CONFIG_RTE_ETHDEV_TRACE_WASTED_RX_ITERATIONS}-include += rte_eth_itt.h include $(RTE_SDK)/mk/rte.lib.mk diff --git a/lib/librte_ether/rte_eth_itt.h b/lib/librte_ether/rte_eth_itt.h new file mode 100644 index 000..e7984fb --- /dev/null +++ b/lib/librte_ether/rte_eth_itt.h @@ -0,0 +1,69 @@ +#ifndef _RTE_ETH_ITT_H_ +#define _RTE_ETH_ITT_H_ + +#include +#include + +#define ITT_MAX_NAME_LEN (100) + +/** + * Auxiliary ITT structure belonging to port and using to: + * - track queue state to determine whether it is wasting loop iterations + * - begin or end ITT task using task domain and name + */ +struct rte_eth_itt_aux_data { + /** +* ITT domains for each queue. +*/ + __itt_domain *wasted_iteration_itt_domains[RTE_MAX_QUEUES_PER_PORT]; + /** +* ITT task names for each queue. +*/ + __itt_string_handle *wasted_iteration_itt_handles[RTE_MAX_QUEUES_PER_PORT]; + /** +* Flags indicating the queues state. Possible values: +* 1 - queue is wasting iterations, 0 - otherwise. +*/ + uint8_t queue_is_wasting_iterations[RTE_MAX_QUEUES_PER_PORT]; +}; + +/** + * The pool of *rte_eth_itt_aux_data* structures. + */ +struct rte_eth_itt_aux_data itt_aux_data[RTE_MAX_ETHPORTS]; + +/** + * Initialization of rte_eth_itt_aux_data for a given port. + * This function must be invoked when ethernet device is being configured. + * Result will be stored in the global array *itt_aux_data*. + * + * @param port_id + * The port identifier of the Ethernet device. + * @param port_name + * The name of the Ethernet device. + * @param queue_num + * The number of queues on specified port. + */ +static inline void +rte_eth_init_itt(uint8_t port_id, char *port_name, uint8_t queue_num) { + uint16_t q_id; + for (q_id = 0; q_id < queue_num; ++q_id) { + char domain_name[ITT_MAX_NAME_LEN]; + snprintf(domain_name, sizeof(domain_name), + "RXBurst.WastedIterations.Port_%s.Queue_%d", + port_name, q_id); + itt_aux_data[port_id].wasted_iteration_itt_domains[q_id] + = __itt_domain_create(domain_name); + + char task_name[ITT_MAX_NAME_LEN]; + snprintf(task_name, sizeof(task_name), + "port id: %d; queue id: %d", + port_id, q_id); + itt_aux_data[port_id].wasted_iteration_itt_handles[q_id] + = __itt_string_handle_create(task_name); + + itt_aux_data[port_id].queue_is_wasting_iterations[q_id] = 0; + } +} + +#endif diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c index 81a45c0..9e5ac01 100644 --- a/lib/librte_ether/rte_ethdev.c +++ b/lib/librte_ether/rte_ethdev.c @@ -818,6 +818,13 @@ rte_eth_dev_configure(uint8_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q, return diag; } +#ifdef RTE_ETHDEV_TRACE_WASTED_RX_ITERATIONS + /** + * See rte_eth_itt.h to find comments on code below. + */ + rte_eth_init_itt(port_id, dev->data->name, nb_rx_q); +#endif + return 0; } diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h index f6e6c74..4ba90d2 100644 --- a/lib/librte_ether/rte_ethdev.h +++ b/lib/librte_ether/rte_ethdev.h @@ -186,6 +186,10 @@ extern "C" { #include "rte_eth_ctrl.h" #include "rte_dev_info.h" +#ifdef RTE_ETHDEV_TRACE_WASTED_RX_ITERATIONS +#
Re: [dpdk-dev] [git pull] virtio changes for 17.08-rc1
16/06/2017 14:16, Yuanhan Liu: > Please consider pulling following virtio changes for 17.08-rc1 at > git://dpdk.org/next/dpdk-next-virtiomaster Pulled, thanks
Re: [dpdk-dev] [PATCH v2] test/test_mbuf: remove mempool global var
08/06/2017 16:28, Santosh Shukla: > Let test_mbuf alloc and free mempool. > > Cc: sta...@dpdk.org > Signed-off-by: Santosh Shukla Why Cc stable? Is it fixing something?
Re: [dpdk-dev] [PATCH] eal: add Bus log type
19/06/2017 13:03, Shreyansh Jain: > > --- a/lib/librte_eal/common/include/rte_log.h > > +++ b/lib/librte_eal/common/include/rte_log.h > > @@ -87,6 +87,7 @@ extern struct rte_logs rte_logs; > > #define RTE_LOGTYPE_CRYPTODEV 17 /**< Log related to cryptodev. */ > > #define RTE_LOGTYPE_EFD 18 /**< Log related to EFD. */ > > #define RTE_LOGTYPE_EVENTDEV 19 /**< Log related to eventdev. */ > > +#define RTE_LOGTYPE_BUS 20 /**< Log related to Bus. */ > > Just wanted to ping - is there something wrong/unacceptable about this? Sorry for not replying before. The strange thing is that it is not used. At least, the existing bus logs should be converted. I've spotted few lines in drivers/bus/fslmc/. And more importantly, we must switch to the new dynamic types using rte_log_register().
Re: [dpdk-dev] [PATCH] vfio: allow to map other memory regions
Hi, Some comments below 24/05/2017 13:17, Pawel Wodkowski: > Currently it is not possible to use memory that is not owned by DPDK to > perform DMA. This scenarion might be used in vhost applications (like > SPDK) where guest send its own memory table. To fill this gap provide > API to allow registering arbitrary address in VFIO container. > > Signed-off-by: Pawel Wodkowski > --- > lib/librte_eal/linuxapp/eal/Makefile| 3 + > lib/librte_eal/linuxapp/eal/eal_vfio.c | 142 > +--- > lib/librte_eal/linuxapp/eal/eal_vfio.h | 10 ++ > lib/librte_eal/linuxapp/eal/include/rte_iommu.h | 78 + > lib/librte_eal/linuxapp/eal/rte_eal_version.map | 8 ++ > 5 files changed, 224 insertions(+), 17 deletions(-) > create mode 100644 lib/librte_eal/linuxapp/eal/include/rte_iommu.h VFIO is not referenced in the doxygen of these functions. Could we use this API for something else than VFIO? Any API should be declared in common directory, even if it is not implemented for FreeBSD (returning -ENOTSUP).
Re: [dpdk-dev] [PATCH v3] app/testpmd: add parameter to start forwarding TX first
15/06/2017 14:05, De Lara Guarch, Pablo: > > Add parameter to start forwarding sending first > > a burst of packets, which is useful when testing > > a loopback connection. > > > > This was already implemented as an internal command, > > but adding it as a parameter is interesting, as it > > allows the user to test a loopback connection without > > entering in the internal command line. > > > > Signed-off-by: Pablo de Lara > > --- > > --- a/doc/guides/testpmd_app_ug/run_app.rst > > +++ b/doc/guides/testpmd_app_ug/run_app.rst > > @@ -188,6 +188,14 @@ The commandline options are: > > > > Start forwarding on initialization. > > > > +* ``--tx-first`` > > + > > +Start forwarding, after sending a burst of packets first. > > + > > +.. Note:: > > + > > + This flag should be only used in non-interactive mode. I don't really understand the benefit of this option. Why is it better than echo start tx_first | testpmd -i ?
Re: [dpdk-dev] [PATCH v3] app/testpmd: print statistics periodically
15/06/2017 03:48, Pablo de Lara: > +print_stats(void) > +{ > + uint8_t i; > + const char clr[] = { 27, '[', '2', 'J', '\0' }; > + const char topLeft[] = { 27, '[', '1', ';', '1', 'H', '\0' }; Spotted a CamelCase ;) Is there a punishment planned in the contributor's guide? > + if (stats_period != 0) { > + uint64_t prev_tsc = 0, cur_tsc, timer_tsc = 0; > + uint64_t timer_period; > + > + /* Convert to number of cycles */ > + timer_period = stats_period * rte_get_timer_hz(); > + > + while (1) { > + cur_tsc = rte_rdtsc(); > + timer_tsc += cur_tsc - prev_tsc; Please forget (Intel) TSC wording and prefer the more generic rte_get_timer_cycles() function.
Re: [dpdk-dev] [PATCH] lpm: fix index of tbl8
19/06/2017 13:25, Bruce Richardson: > On Mon, Jun 19, 2017 at 12:14:38PM +0800, Wei Dai wrote: > > From v20 to v1604, number of tlb8 can be up to 1<<24, > > (uint8_t) or (uint16_t) may truncate the number of > > index of tlb8 in v1604 and cause wrong number. > > > > Fixes: dc81ebbacaeb ("lpm: extend IPv4 next hop field") > > Cc: sta...@dpdk.org > > > > Signed-off-by: Wei Dai > > Acked-by: Bruce Richardson Applied, thanks
Re: [dpdk-dev] [PATCH] app/testpmd: always build VF and MACsec functions
> -Original Message- > From: Thomas Monjalon [mailto:tho...@monjalon.net] > Sent: Thursday, June 15, 2017 7:02 PM > To: Wu, Jingjing > Cc: Lu, Wenzhuo ; dev@dpdk.org > Subject: [PATCH] app/testpmd: always build VF and MACsec functions > > These functions are supported only on ixgbe. > However, they should appear in the help and returns an error > if the function is not supported or not enabled. > > Signed-off-by: Thomas Monjalon Acked-by: Jingjing Wu
Re: [dpdk-dev] [PATCH v4 4/4] app/testpmd: add isolated mode parameter
> +/* > * Avoids to check link status when starting/stopping a port. > */ > uint8_t no_link_check = 0; /* check by default */ > @@ -1422,6 +1427,15 @@ static void eth_event_callback(uint8_t port_id, > if (port->need_reconfig > 0) { > port->need_reconfig = 0; > > + if (isolated_mode) { > + int ret = port_flow_isolate(pi, 1); > + if (ret) { > + printf("Failed to apply isolated" > +" mode on port %d\n", pi); > + return -1; > + } > + } > + Should it block the app startup if isolated-mode setting fails?
Re: [dpdk-dev] [PATCH v5 1/3] lib: add Generic Receive Offload API framework
On Mon, Jun 19, 2017 at 08:55:00AM -0700, Stephen Hemminger wrote: > On Sun, 18 Jun 2017 15:21:07 +0800 > Jiayu Hu wrote: > > > +/** > > + * This is the main reassembly API used in lightweight mode, which > > + * merges numbers of packets at a time. After it returns, applications > > + * can get GROed packets immediately. Applications don't need to > > + * flush packets manually. In lightweight mode, applications just need > > + * to tell the reassembly API what rules should be applied when merge > > + * packets. Therefore, applications can perform GRO in very a simple > > + * way. > > + * > > + * To process one packet, we find its corresponding reassembly table > > + * according to the packet type. Then search for the reassembly table > > + * to find one packet to merge. If find, chain the two packets together. > > + * If not find, insert the inputted packet into the reassembly table. > > + * Besides, to merge two packets is to chain them together. No > > + * memory copy is needed. Before rte_gro_reassemble_burst returns, > > + * header checksums of merged packets are re-calculated. > > + * > > + * @param pkts > > + * a pointer array which points to the packets to reassemble. After > > + * GRO, it is also used to keep GROed packets. > > + * @param nb_pkts > > + * the number of packets to reassemble. > > + * @param param > > + * Applications use param to tell rte_gro_reassemble_burst what rules > > + * are demanded. > > + * @return > > + * the number of packets after GROed. > > + */ > > +uint16_t rte_gro_reassemble_burst(struct rte_mbuf **pkts __rte_unused, > > + const uint16_t nb_pkts __rte_unused, > > + const struct rte_gro_param param __rte_unused); > > I think the __rte_unused attribute should be on the function definition, > not on the prototype. I think GCC ignores it on function prototypes. Thanks. I will modify it in next patch. BRs, Jiayu
Re: [dpdk-dev] [PATCH 3/3] app/testpmd: changed example to parse options from cfg file
> + > +#ifdef RTE_LIBRTE_CFGFILE > +/* Load config file path from command line */ static char * > +cfgfile_load_path(int argc, char **argv) { > + int i; > + > + for (i = 0; i < argc; i++) { > + if (!strcmp("--cfgfile-path", argv[i])) { > + if (i < argc) > + return strdup(argv[i+1]); > + } > + } > + return 0; > +} > +#endif > + It is a little confused. Is the cfgfile-path is application's argument or EAL's? Where is it supposed to be located? > int > main(int argc, char** argv) > { > int diag; > uint8_t port_id; > + struct rte_cfgfile *cfg = NULL; > + char *config_file = NULL; > > signal(SIGINT, signal_handler); > signal(SIGTERM, signal_handler); > > +#ifdef RTE_LIBRTE_CFGFILE > + /* load --cfgfile-path argument from argv */ > + config_file = cfgfile_load_path(argc, argv); > + > + if (config_file) { > + printf("Info: found cfgfile-path \"%s\"\n", config_file); > + } else { > + printf("Info: not found cfgfile-path parameter " > + "(searching for cfgfile " > + "in default directory)\n"); > + config_file = strdup("config.ini"); > + } > + cfg = rte_cfgfile_load(config_file, CFG_FLAG_EMPTY_VALUES); > + if (cfg == NULL) > + printf("Info: Valid cfgfile not found\n"); > + rte_eal_configure(cfg); > +#endif > + Does it mean if RTE_LIBRTE_CFGFILE is defined, then the arguments are passed by Cfgfile, if no cfgfile will use config.ini by default? How about the legacy command lines? I think even cfgfile is good, we still need to keep the command line way. > diag = rte_eal_init(argc, argv); > if (diag < 0) > rte_panic("Cannot init EAL\n"); > @@ -2289,7 +2329,16 @@ uint8_t port_is_bonding_slave(portid_t slave_pid) > argc -= diag; > argv += diag; > if (argc > 1) > - launch_args_parse(argc, argv); > + launch_args_parse(argc, argv, cfg); > + The argc and argv have been overwritten by cfgfile, right? Does it make sense to Pass cfg to launch_args_parse? And you also need to update the testpmd doc to describe this new use. Thanks Jingjing
Re: [dpdk-dev] [PATCH 2/2] app/testpmd: add support for different aggregation mode in IEEE802.3ad bonding
> -Original Message- > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Daniel Mrzyglod > Sent: Friday, May 26, 2017 10:35 PM > To: Doherty, Declan > Cc: dev@dpdk.org; Mrzyglod, DanielX T > Subject: [dpdk-dev] [PATCH 2/2] app/testpmd: add support for different > aggregation mode in IEEE802.3ad bonding > > This patch add support for different aggregator modes in similar manner that > is > provided in linux kernel. > > testpmd> set bonding agg_mode show bonding config > testpmd> > > Signed-off-by: Daniel Mrzyglod Here are my comments: 1. title looks too long 2. New commands or update need to be added to help display located at cmd_help_long_parsed. 3. You also need to update the testpmd doc. Thanks Jingjing
Re: [dpdk-dev] [PATCH] net/i40e: fix param check to avoid division by 0
> -Original Message- > From: Yong Wang [mailto:wang.yon...@zte.com.cn] > Sent: Monday, June 12, 2017 5:07 PM > To: Zhang, Helin ; Wu, Jingjing > Cc: dev@dpdk.org; Yong Wang > Subject: [PATCH] net/i40e: fix param check to avoid division by 0 > > In function i40e_vsi_config_tc_queue_mapping(), if 'enabled_tcmap' is > 0, 'total_tc' might be 0. Then 'total_tc' might be used in a division > by 0 in "qpnum_per_tc = i40e_align_floor(vsi->nb_qps / total_tc)". Fix > it by adding a check to parameter 'enabled_tcmap'. > > Signed-off-by: Yong Wang > --- > drivers/net/i40e/i40e_ethdev.c | 5 + > 1 file changed, 5 insertions(+) > > diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c > index c18a93b..d41b213 100644 > --- a/drivers/net/i40e/i40e_ethdev.c > +++ b/drivers/net/i40e/i40e_ethdev.c > @@ -4257,6 +4257,11 @@ enum i40e_status_code > int i, total_tc = 0; > uint16_t qpnum_per_tc, bsf, qp_idx; > > + if (enabled_tcmap == 0) { > + PMD_DRV_LOG(ERR, "Error! enabled tcmap shouldn't be 0"); > + return I40E_ERR_PARAM; > + } > + > ret = validate_tcmap_parameter(vsi, enabled_tcmap); > if (ret != I40E_SUCCESS) > return ret; > -- Thanks for the fix. Could you add the check in func validate_tcmap_parameter? Thanks Jingjing
Re: [dpdk-dev] [PATCH v5 2/3] lib/gro: add TCP/IPv4 GRO support
Hi Jianfeng, On Mon, Jun 19, 2017 at 11:43:20PM +0800, Tan, Jianfeng wrote: > > > On 6/18/2017 3:21 PM, Jiayu Hu wrote: > > In this patch, we introduce six APIs to support TCP/IPv4 GRO. > > Those functions are not used outside of this library. Don't make it as > extern visible. But they are called by functions in rte_gro.c, which are in the different file. If we define these functions with static, how can they be called by other functions in the different file? > > > - gro_tcp_tbl_create: create a TCP reassembly table, which is used to > > merge packets. > > Will tcp6 shares the same function with tcp4? If no, please rename it to > gro_tcp4_tlb_create In TCP GRO design, TCP4 and TCP6 will share a same table structure, but they will have different reassembly function. Therefore, I use gro_tcp_tlb_create instead of gro_tcp4_tlb_create here. > > > - gro_tcp_tbl_destroy: free memory space of a TCP reassembly table. > > - gro_tcp_tbl_flush: flush packets in the TCP reassembly table. > > - gro_tcp_tbl_timeout_flush: flush timeout packets in the TCP > > reassembly table. > > - gro_tcp4_reassemble: merge an inputted packet. > > - gro_tcp4_tbl_cksum_update: update TCP and IPv4 header checksums for > > all merged packets in the TCP reassembly table. > > > > In TCP GRO, we use a table structure, called TCP reassembly table, to > > reassemble packets. Both TCP/IPv4 and TCP/IPv6 GRO use the same table > > structure. A TCP reassembly table includes a flow array and a item array, > > where the flow array is used to record flow information and the item > > array is used to record packets information. > > > > Each element in the flow array records the information of one flow, > > which includes two parts: > > - key: the criteria of the same flow. If packets have the same key > > value, they belong to the same flow. > > - start_index: the index of the first incoming packet of this flow in > > the item array. With start_index, we can locate the first incoming > > packet of this flow. > > Each element in the item array records one packet information. It mainly > > includes two parts: > > - pkt: packet address > > - next_pkt_index: index of the next packet of the same flow in the item > > array. All packets of the same flow are chained by next_pkt_index. > > With next_pkt_index, we can locate all packets of the same flow > > one by one. > > > > To process an incoming packet, we need three steps: > > a. check if the packet should be processed. Packets with the following > > properties won't be processed: > > - packets without data; > > - packets with wrong checksums; > > Why do we care to check this kind of error? Can we just suppose the > applications have already dropped the packets with wrong cksum? Indeed, if we assume all inputted packets are correct, we can avoid checksum checking overhead. But as a library, I think a more flexible way is to enable applications to tell GRO API if checksum checking is needed. For example, we can add a flag to struct rte_gro_tbl and struct rte_gro_param, which indicates if the checksum checking is needed. If applications set this flag, reassembly function won't check packet checksum. Otherwise, we check the checksum. How do you think? > > > - fragmented packets. > > IP fragmented? I don't think we need to check it here either. It's the > application's responsibility to call librte_ip_frag firstly to reassemble > IP-fragmented packets, and then call this gro library to merge TCP packets. > And this procedure should be shown in an example for other users to refer. > > > b. traverse the flow array to find a flow which the packet belongs to. > > If not find, insert a new flow and store the packet into the item > > array. > > You do not store the packet now. "store the packet into the item array" -> > "then go to step c". Thanks, I will update it in next patch. > > > c. locate the first packet of this flow in the item array via > > start_index. Then traverse all packets of this flow one by one via > > next_pkt_index. If find one packet to merge with the incoming packet, > > merge them but without updating checksums. If not, allocate one item > > in the item array to store the incoming packet and update > > next_pkt_index value. > > > > For better performance, we don't update header checksums once two > > packets are merged. The header checksums are updated only when packets > > are flushed from TCP reassembly tables. > > Why do we care to recalculate the L4 checksum when flushing? How about Just > keeping the wrong cksum, and letting the applications to handle that? Not all applications want GROed packets with wrong checksum. So I think a more reasonable way is to give a flag to applications to tell GRO API if they need calculate checksum when flush them from GRO table. How do you think? > > > > > > Signed-off-by: Jiayu Hu > > --- > > lib/librte_gro/Makefile |
[dpdk-dev] [PATCH v3] test: add delay time in test alarm
Because accuracy of timing to the microsecond is not guaranteed in rte_eal_alarm_set, this function will not be called before the requested time, but may be called a period of time afterwards which can not be calculated. In order to ensure test alarm running success, this patch added the delay time before check the flag. Signed-off-by: Qiming Yang --- v2 changes: * fixed coding style problems v3 changes: * replaced the numeric by macro --- --- test/test/test_alarm.c | 9 +++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/test/test/test_alarm.c b/test/test/test_alarm.c index ecb2f6d..40f55b5 100644 --- a/test/test/test_alarm.c +++ b/test/test/test_alarm.c @@ -47,6 +47,7 @@ #define RTE_TEST_ALARM_TIMEOUT 10 /* ms */ #define RTE_TEST_CHECK_PERIOD 3 /* ms */ +#define RTE_TEST_MAX_REPEAT20 static volatile int flag; @@ -96,6 +97,7 @@ static int test_multi_alarms(void) { int rm_count = 0; + int count = 0; cb_count.cnt = 0; printf("Expect 6 callbacks in order...\n"); @@ -169,7 +171,10 @@ test_multi_alarms(void) printf("Error, cancelling head-of-list leads to premature callback\n"); return -1; } - rte_delay_ms(10); + + while (flag != 2 && count++ < RTE_TEST_MAX_REPEAT) + rte_delay_ms(10); + if (flag != 2) { printf("Error - expected callback not called\n"); rte_eal_alarm_cancel(test_remove_in_callback, (void *)-1); @@ -212,7 +217,7 @@ test_alarm(void) printf("fail to set alarm callback\n"); return -1; } - while (flag == 0 && count ++ < 6) + while (flag == 0 && count++ < RTE_TEST_MAX_REPEAT) rte_delay_ms(RTE_TEST_CHECK_PERIOD); if (flag == 0){ -- 2.7.4
Re: [dpdk-dev] [PATCH v2] test/test_mbuf: remove mempool global var
On Tuesday 20 June 2017 02:07 AM, Thomas Monjalon wrote: > 08/06/2017 16:28, Santosh Shukla: >> Let test_mbuf alloc and free mempool. >> >> Cc: sta...@dpdk.org >> Signed-off-by: Santosh Shukla > Why Cc stable? > Is it fixing something? > w/o this fix, application can't run more than once. Reason: Static allocation of resources and exiting w/o freeing so leak. Patch makes app resource handling dynamic and Now user could run test more than once and app exits gracefully. thats why Cc: stable a need (IHMO). Thanks.
[dpdk-dev] [PATCH] net/mlx5: fix TSO segment size
In case on multi segment packet, the TSO segment size was taken from the last segment. This may lead to incorrect values in case not all segments are initialized with the field. Fixing it by taking the value from the first segment. Fixes: 3f13f8c23a7c ("net/mlx5: support hardware TSO") Cc: sta...@dpdk.org Signed-off-by: Shahaf Shuler Acked-by: Yongseok Koh --- drivers/net/mlx5/mlx5_rxtx.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c index cade625f9..70314b393 100644 --- a/drivers/net/mlx5/mlx5_rxtx.c +++ b/drivers/net/mlx5/mlx5_rxtx.c @@ -527,6 +527,7 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n) uint16_t ehdr; uint8_t cs_flags = 0; uint64_t tso = 0; + uint16_t tso_segsz = 0; #ifdef MLX5_PMD_SOFT_COUNTERS uint32_t total_length = 0; #endif @@ -622,6 +623,7 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n) tso_header_sz = buf->l2_len + vlan_sz + buf->l3_len + buf->l4_len; + tso_segsz = buf->tso_segsz; if (is_tunneled && txq->tunnel_en) { tso_header_sz += buf->outer_l2_len + @@ -821,7 +823,7 @@ mlx5_tx_burst(void *dpdk_txq, struct rte_mbuf **pkts, uint16_t pkts_n) }; wqe->eseg = (rte_v128u32_t){ 0, - cs_flags | (htons(buf->tso_segsz) << 16), + cs_flags | (htons(tso_segsz) << 16), 0, (ehdr << 16) | htons(tso_header_sz), }; -- 2.12.0
Re: [dpdk-dev] [RFC] proposal of allowing personal/project repos on DPDK.org
On Mon, Jun 19, 2017 at 03:29:29PM +0200, Thomas Monjalon wrote: > Hi, > > 01/06/2017 07:07, Tiwei Bie: > > We'd like to make a proposal of making DPDK.org allow hosting > > some personal/project repos, which could be very useful when > > someone wants to try some experimental projects in DPDK. > > It has been discussed during the last technical board meeting. > The board was not convinced by this idea, but would like to better > understand what is the problem to be solved with this proposal? > > If there is something to solve, we could decide one these proposals: > 1/ let users free to choose their forge > 2/ advise for gitlab > 3/ advise for github (and fight to recover the name DPDK) > 4/ pay a sysadmin to host and secure our own forge in dpdk.org domain > Anyway, I don't think it's a bad idea to get the name DPDK back on github if possible. Any thoughts? Best regards, Tiwei Bie
Re: [dpdk-dev] [PATCH] app/testpmd: always build VF and MACsec functions
On Thu, Jun 15, 2017 at 01:02:21PM +0200, Thomas Monjalon wrote: > These functions are supported only on ixgbe. [...] > @@ -3013,15 +3013,21 @@ set_vf_traffic(portid_t port_id, uint8_t is_rx, > uint16_t vf, uint8_t on) > > if (diag == 0) > return; > - if(is_rx) > + if(is_rx) { It's better to also fix this coding style issue. Best regards, Tiwei Bie > printf("rte_pmd_ixgbe_set_vf_rx for port_id=%d failed " > - "diag=%d\n", port_id, diag); > - else > + "diag=%d\n", port_id, diag); > + return; > + } else { > printf("rte_pmd_ixgbe_set_vf_tx for port_id=%d failed " > - "diag=%d\n", port_id, diag); > - > -} > + "diag=%d\n", port_id, diag); > + return; > + } > #endif > + printf("VF %s setting not supported for port %d\n", > + is_rx ? "Rx" : "Tx", port_id); > + RTE_SET_USED(vf); > + RTE_SET_USED(on); > +} > > int > set_queue_rate_limit(portid_t port_id, uint16_t queue_idx, uint16_t rate) > -- > 2.13.1 >