[dpdk-dev] [PATCH v2] doc: Malicious Driver Detection not supported by ixgbe
Hi Bruce, > -Original Message- > From: Richardson, Bruce > Sent: Friday, February 26, 2016 10:41 PM > To: Lu, Wenzhuo > Cc: dev at dpdk.org > Subject: Re: [dpdk-dev] [PATCH v2] doc: Malicious Driver Detection not > supported by ixgbe > > On Fri, Feb 26, 2016 at 12:48:37PM +0800, Wenzhuo Lu wrote: > > Announce that Malicious Driver Detection is not supported. > > > > V2: > > *Rework the words. > > > > Signed-off-by: Wenzhuo Lu > > Hi Wenzhuo, > > just for future reference, please put the V2,v3 etc. updates below the cut > line "-- > -" so that they can be auto-stripped when applying the patch. > > /Bruce Got it. Thanks for the reminder :) > > > --- > > doc/guides/nics/ixgbe.rst | 20 > > doc/guides/rel_notes/release_16_04.rst | 23 +++ > > 2 files changed, 43 insertions(+) > > >
[dpdk-dev] [PATCH v3 1/3] fm10k: enable FTAG based forwarding
> -Original Message- > From: Richardson, Bruce > Sent: Saturday, February 27, 2016 12:33 AM > To: David Marchand > Cc: Wang, Xiao W ; dev at dpdk.org > Subject: Re: [dpdk-dev] [PATCH v3 1/3] fm10k: enable FTAG based forwarding > > On Fri, Feb 26, 2016 at 04:00:49PM +0100, David Marchand wrote: > > On Fri, Feb 26, 2016 at 3:48 PM, Bruce Richardson > > wrote: > > > On Fri, Feb 26, 2016 at 09:24:06AM +, Wang, Xiao W wrote: > > >> Hi, > > >> > > Thanks for the discussion, Thomas, do you have any suggestions? > > >> > > > >> > I don't understand why you say this feature is specific to fm10k. > > >> > Can we imagine another NIC having this capability? > > >> > > >> As you know, fm10k has a switch logic between the Mac and Phy, > > >> every packets Sent out from the host will be switched inside the > > >> NIC, other NICs don't have a switch inside, and the FTAG feature is > > >> related > to the switch function. > > >> > > >> As introduced in the second patch: > > >> The FM10K family of NICs support the addition of a Fabric Tag > > >> (FTAG) to carry special information. The FTAG is placed at the > > >> beginning of the frame, it contains information such as where the > > >> packet comes from and goes, and the vlan tag. In FTAG based > > >> forwarding mode, the switch logic forwards packets according to glort > (global resource tag) information, rather than the mac and vlan table. > > >> So this is a feature specific to fm10k. > > > > > > If it is fm10k specific, how about just adding a public function to > > > the fm10k driver to turn it on. The user app will be non-portable > > > across NICs, but that's the price of using nic-specific features. > > > > What about using a devargs ? > > Something like : > > -w :xx:xx.x,enable_ftag=1 > > > > The application still needs to know about this to enable it, but that > > sounds better to me. > > The only issue is that it can't work with hotplug at the moment. > > > Seems a good enough solution to me. Xiao, any other thoughts? > > /Bruce I also agree with the devargs solution, in this way, the build time config can be removed and we don't need to add extra fields into ethdev. I'll rework the patch, thanks for the suggestions. Best Regards, Xiao
[dpdk-dev] [PATCH] eal: make resource initialization more robust
Hi Thomas, On 2/29/2016 5:12 AM, Thomas Monjalon wrote: > Hi, > > 2016-01-29 19:22, Jianfeng Tan: >> Current issue: DPDK is not that friendly to container environment, which >> caused by that it pre-alloc resource like cores and hugepages. But there >> are this or that resource limitations, for examples, cgroup, rlimit, >> cpuset, etc. >> >> For cores, this patch makes use of pthread_getaffinity_np to further >> narrow down detected cores before parsing coremask (-c), corelist (-l), >> and coremap (--lcores). >> >> For hugepages, this patch adds a recover mechanism to the case that >> there are no that many hugepages can be used. It relys on a mem access >> to fault-in hugepages, and if fails with SIGBUS, recover to previously >> saved stack environment with siglongjmp(). > They are some interesting ideas. > However, I am not sure a library should try to be so smart silently. > It needs more feedback to decide wether it can be the default behaviour > or an option. > > Please send coremask and hugepage mapping as separate patches as they > are totally different and may be integrated separately. Good advise, thanks! I'll do it. And one more thing FYI, coremask using pthread_getaffinity_np() may have issue on some Linux versions or distros: it excludes isolcpus. This is reported by Sergio Gonzalez Monroy , and I'm still working it out. Thanks, Jianfeng > > Thanks
[dpdk-dev] [PATCH v3 00/18] fm10k: update shared code
Tested-by: Heng Ding -Original Message- From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Wang Xiao W Sent: Friday, February 19, 2016 7:07 PM To: Chen, Jing D Cc: dev at dpdk.org Subject: [dpdk-dev] [PATCH v3 00/18] fm10k: update shared code v3: * Fixed checkpatch.pl warning about long commit message. * Fixed the issue of compile failure about part of patches applied. * Split the misc-small-fixes patch into several patches. v2: * Put the two extra fix patches ahead of the base code patches. This patch set has passed regression test. Wang Xiao W (18): fm10k: use default mailbox message handler for PF fm10k/base: correct typecast in fm10k_update_xc_addr_pf fm10k/base: cleanup namespace pollution fm10k/base: use bitshift for itr_scale fm10k/base: reset max_queues on init_hw_vf failure fm10k/base: document ITR scale workaround in VF TDLEN register fm10k/base: cleanup lines over 80 characters fm10k/base: cleanup useless else fm10k/base: use BIT macro instead of open-coded bit-shifting of 1 fm10k/base: do not use CamelCase fm10k/base: use memcpy for mac addr copy fm10k/base: allow removal of is_slot_appropriate function fm10k/base: consistently use VLAN ID when referencing vid variables fm10k/base: imporve comment per upstream review changes fm10k/base: fix TLV structures alignment fm10k/base: move constants to the right of binary operators fm10k/base: minor cleanups fm10k/base: remove unused struct element drivers/net/fm10k/base/fm10k_api.c | 2 + drivers/net/fm10k/base/fm10k_api.h | 2 + drivers/net/fm10k/base/fm10k_mbx.c | 63 +++- drivers/net/fm10k/base/fm10k_mbx.h | 11 +-- drivers/net/fm10k/base/fm10k_osdep.h | 32 ++ drivers/net/fm10k/base/fm10k_pf.c| 88 + drivers/net/fm10k/base/fm10k_pf.h| 18 ++-- drivers/net/fm10k/base/fm10k_tlv.c | 40 drivers/net/fm10k/base/fm10k_tlv.h | 9 +- drivers/net/fm10k/base/fm10k_type.h | 182 +++ drivers/net/fm10k/base/fm10k_vf.c| 32 -- drivers/net/fm10k/fm10k_ethdev.c | 41 +++- 12 files changed, 222 insertions(+), 298 deletions(-) -- 1.9.3
[dpdk-dev] [PATCH v4 2/4] eal: set kdrv to RTE_KDRV_NONE if kernel driver isn't managing the device
? 2/27/2016 1:47 AM, Xie, Huawei ??: > Use RTE_KDRV_NONE to indicate that kernel driver(including UIO/VFIO) > isn't manipulating the device. Thomas, could you kindly help change manipulating->managing? I have changed others per Panu's suggestion but missed this.
[dpdk-dev] [PATCH] virtio: don't count broadcast packets in multicast packets counter
On Fri, Feb 26, 2016 at 06:01:23PM +0300, Igor Ryzhov wrote: > Signed-off-by: Igor Ryzhov Acked-by: Yuanhan Liu --yliu
[dpdk-dev] [PATCH v1] virtio: Use cpuflag for vector api
On Fri, Feb 26, 2016 at 02:21:02PM +0530, Santosh Shukla wrote: > Check cpuflag macro before using vectored api. > -virtio_recv_pkts_vec() uses _sse3__ simd instruction for now so added > cpuflag. > - Also wrap other vectored freind api ie.. > 1) virtqueue_enqueue_recv_refill_simple > 2) virtio_rxq_vec_setup > ... > diff --git a/drivers/net/virtio/virtio_rxtx_simple.c > b/drivers/net/virtio/virtio_rxtx_simple.c > index 3a1de9d..be51d7c 100644 > --- a/drivers/net/virtio/virtio_rxtx_simple.c > +++ b/drivers/net/virtio/virtio_rxtx_simple.c Hmm, why not wrapping the whole file, instead of just few functions? Or maybe better, do a compile time check at the Makefile, something like: if has_CPUFLAG_xxx SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_rxtx_simple.c endif --yliu
[dpdk-dev] ACL memory allocation failures
> > Thanks Konstantin. > > Previous allocation error was coming with 1024 huge pages of 2 MB size. > > After increasing the huge pages to 2048, I was able to add another > ~140 rules [IPv4 rule data--> with src, dst IP address & port, next header ] > more, ie., 950 rules were added. That's strange according to your log, all you need is ~13MB of hugepage memory: ACL: allocation of 12966528 bytes on socket 0 for ipv4_acl_table1 Wonder what consumed rest of 4GB? >> We are creating mem pools (for DPDK compatible 3 ports) for packet >> processing. Again do you re-build your table after every rule you add? If so, then it seems a bit strange approach (and definitely not the fastest one). >>Yes, we are rebuilding the rules every time and is due to 2 reasons: >>1. Our application, gives full list of rules every time you add new rule. >>2. There is no way to delete a specific rule in the trie. Is there any way to >>delete a specific ACL rule? What you can do instead: create context; add all your rules into it; build; > > Logically it did not increase number of rules [expected 2*817, but only 950 > were added]. Is it really using huge pages memory only? > > From the code it looks like heap memory. [ ret = > malloc_heap_alloc(&mcfg->malloc_heaps[i], type, size, 0, align == 0 ? > 1 : align, 0) ] As I can see from the log it fails at GEN phase, when trying to allocate hugepages for RT table. At lib/librte_acl/acl_gen.c:509 rte_acl_gen(struct rte_acl_ctx *ctx, struct rte_acl_trie *trie, struct rte_acl_bld_trie *node_bld_trie, uint32_t num_tries, uint32_t num_categories, uint32_t data_index_sz, size_t max_size) { ... mem = rte_zmalloc_socket(ctx->name, total_size, RTE_CACHE_LINE_SIZE, ctx->socket_id); if (mem == NULL) { RTE_LOG(ERR, ACL, "allocation of %zu bytes on socket %d for %s failed\n", total_size, ctx->socket_id, ctx->name); return -ENOMEM; } Konstantin > > > -Original Message- > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Rapelly, Varun > > Sent: Friday, February 26, 2016 10:28 AM > > To: dev at dpdk.org > > Subject: Re: [dpdk-dev] ACL memory allocation failures > > > > Hi All, > > > > When I'm trying to configure some 5000+ ACL rules with different > > source IP addresses, getting ACL memory allocation failure. I'm using DPDK > > 2.1. > > > > [root at ACLISSUE log_2015_10_26_08_19_42]# vim np.log match > > nodes/bytes > > used: 816/104448 > > total: 12940832 bytes > > ACL: Build phase for ACL "ipv4_acl_table2": > > memory consumed: 947913495 > > ACL: trie 0: number of rules: 816 > > ACL: allocation of 12966528 bytes on socket 0 for ipv4_acl_table1 > > failed > > ACL: Build phase for ACL "ipv4_acl_table1": > > memory consumed: 947913495 > > ACL: trie 0: number of rules: 817 > > EAL: Error - exiting with code: 1 > > Cause: Failed to build ACL trie > > > > Again sourced the ACL config file. After adding around 77 again the same > > error came. > > > > total: 14912784 bytes > > ACL: Build phase for ACL "ipv4_acl_table1": > > memory consumed: 1040188260 > > ACL: trie 0: number of rules: 893 > > ACL: allocation of 14938480 bytes on socket 0 for ipv4_acl_table2 > > failed > > You are running out of hugepages memory. > > > ACL: Build phase for ACL "ipv4_acl_table2": > > memory consumed: 1040188260 > > ACL: trie 0: number of rules: 894 > > EAL: Error - exiting with code: 1 > > Cause: Failed to build ACL trie > > > > Where to increase the memory to avoid this issue? > > Refer to: > http://dpdk.org/doc/guides/linux_gsg/sys_reqs.html#running-dpdk-applic > ations > Section 2.3.2 > > Konstantin
[dpdk-dev] [PATCH v1] virtio: Use cpuflag for vector api
On 2/29/2016 12:26 PM, Yuanhan Liu wrote: > On Fri, Feb 26, 2016 at 02:21:02PM +0530, Santosh Shukla wrote: >> Check cpuflag macro before using vectored api. >> -virtio_recv_pkts_vec() uses _sse3__ simd instruction for now so added >> cpuflag. >> - Also wrap other vectored freind api ie.. >> 1) virtqueue_enqueue_recv_refill_simple >> 2) virtio_rxq_vec_setup >> > ... >> diff --git a/drivers/net/virtio/virtio_rxtx_simple.c >> b/drivers/net/virtio/virtio_rxtx_simple.c >> index 3a1de9d..be51d7c 100644 >> --- a/drivers/net/virtio/virtio_rxtx_simple.c >> +++ b/drivers/net/virtio/virtio_rxtx_simple.c > Hmm, why not wrapping the whole file, instead of just few functions? > > Or maybe better, do a compile time check at the Makefile, something > like: > > if has_CPUFLAG_xxx > SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_rxtx_simple.c > endif > > > --yliu > For next release, we could consider providing arch level framework for different arch optimizations. It is more complicated for rte_memcpy. For the time being, except the small issue, ok with the temporary solution using CPUFLAG.
[dpdk-dev] [PATCH] i40e: remove redundant compiler warning disablers
This patch caused build error with i686-native-linuxapp-gcc (gcc version is 4.8.3) > > i686-native-linuxapp-gcc compile error info: > > > > INSTALL-LIB librte_pmd_vmxnet3_uio.a > > /root/dpdk/drivers/net/i40e/base/i40e_common.c: In function > > ?i40e_aq_set_lldp_mib?: > > /root/dpdk/drivers/net/i40e/base/i40e_common.c:3772:32: error: cast > > from pointer to integer of different size [-Werror=pointer-to-int-cast] > > cmd->address_high = CPU_TO_LE32(I40E_HI_WORD((u64)buff)); > > ^ > > /root/dpdk/drivers/net/i40e/base/i40e_common.c:3773:30: error: cast > > from pointer to integer of different size [-Werror=pointer-to-int-cast] > > cmd->address_low = CPU_TO_LE32(I40E_LO_DWORD((u64)buff)); > > ^ > > /root/dpdk/drivers/net/i40e/base/i40e_common.c: In function > > ?i40e_aq_set_arp_proxy_config?: > > /root/dpdk/drivers/net/i40e/base/i40e_common.c:5817:33: error: cast > > from pointer to integer of different size [-Werror=pointer-to-int-cast] > > cmd->address_high = CPU_TO_LE32(I40E_HI_DWORD((u64)proxy_config)); > > ^ > > /root/dpdk/drivers/net/i40e/base/i40e_common.c:5818:30: error: cast > > from pointer to integer of different size [-Werror=pointer-to-int-cast] > > cmd->address_low = CPU_TO_LE32(I40E_LO_DWORD((u64)proxy_config)); > > ^ > > /root/dpdk/drivers/net/i40e/base/i40e_common.c: In function > > ?i40e_aq_set_ns_proxy_table_entry?: > > /root/dpdk/drivers/net/i40e/base/i40e_common.c:5852:14: error: cast > > from pointer to integer of different size [-Werror=pointer-to-int-cast] > >CPU_TO_LE32(I40E_HI_DWORD((u64)ns_proxy_table_entry)); > > ^ > > /root/dpdk/drivers/net/i40e/base/i40e_common.c:5854:12: error: cast > > from pointer to integer of different size [-Werror=pointer-to-int-cast] > >CPU_TO_LE32(I40E_LO_DWORD((u64)ns_proxy_table_entry)); > > ^ > > /root/dpdk/drivers/net/i40e/base/i40e_common.c: In function > > ?i40e_aq_set_clear_wol_filter?: > > /root/dpdk/drivers/net/i40e/base/i40e_common.c:5914:33: error: cast > > from pointer to integer of different size [-Werror=pointer-to-int-cast] > > cmd->address_high = CPU_TO_LE32(I40E_HI_DWORD((u64)filter)); > > ^ > > /root/dpdk/drivers/net/i40e/base/i40e_common.c:5915:30: error: cast > > from pointer to integer of different size [-Werror=pointer-to-int-cast] > > cmd->address_low = CPU_TO_LE32(I40E_LO_DWORD((u64)filter)); > > ^ > > cc1: all warnings being treated as errors > > make[6]: *** [i40e_common.o] Error 1 > > make[5]: *** [i40e] Error 2 > > make[5]: *** Waiting for unfinished jobs > > INSTALL-LIB librte_pmd_ixgbe.a > > AR librte_pmd_e1000.a > > INSTALL-LIB librte_pmd_e1000.a > > make[4]: *** [net] Error 2 > > make[3]: *** [drivers] Error 2 > > make[2]: *** [all] Error 2 > > make[1]: *** [pre_install] Error 2 > > make: *** [install] Error 2 -Original Message- From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Panu Matilainen Sent: Monday, December 7, 2015 8:37 PM To: dev at dpdk.org Subject: [dpdk-dev] [PATCH] i40e: remove redundant compiler warning disablers These may have been required at some point but current i40e base driver compiles cleanly without them, at least with clang 3.7.0 and gcc 5.1.1. Signed-off-by: Panu Matilainen --- drivers/net/i40e/Makefile | 13 - 1 file changed, 13 deletions(-) diff --git a/drivers/net/i40e/Makefile b/drivers/net/i40e/Makefile index 033ee4a..4ffaf0d 100644 --- a/drivers/net/i40e/Makefile +++ b/drivers/net/i40e/Makefile @@ -53,23 +53,10 @@ CFLAGS_BASE_DRIVER = -wd593 -wd188 else ifeq ($(CC), clang) CFLAGS_BASE_DRIVER += -Wno-sign-compare CFLAGS_BASE_DRIVER += -Wno-unused-value -CFLAGS_BASE_DRIVER += -Wno-unused-parameter -CFLAGS_BASE_DRIVER += -Wno-strict-aliasing -CFLAGS_BASE_DRIVER += -Wno-format -CFLAGS_BASE_DRIVER += -Wno-missing-field-initializers -CFLAGS_BASE_DRIVER += -Wno-pointer-to-int-cast -CFLAGS_BASE_DRIVER += -Wno-format-nonliteral CFLAGS_BASE_DRIVER += -Wno-unused-variable else CFLAGS_BASE_DRIVER = -Wno-sign-compare CFLAGS_BASE_DRIVER += -Wno-unused-value -CFLAGS_BASE_DRIVER += -Wno-unused-parameter -CFLAGS_BASE_DRIVER += -Wno-strict-aliasing -CFLAGS_BASE_DRIVER += -Wno-format -CFLAGS_BASE_DRIVER += -Wno-missing-field-initializers -CFLAGS_BASE_DRIVER += -Wno-pointer-to-int-cast -CFLAGS_BASE_DRIVER += -Wno-format-nonliteral -CFLAGS_BASE_DRIVER += -Wno-format-security CFLAGS_BASE_DRIVER += -Wno-unused-variable ifeq ($(shell test $(GCC_VERSION) -ge 44 && echo 1), 1) -- 2.5.0
[dpdk-dev] [PATCH] mk: add makefile extention support
2016-02-28 21:47, Wiles, Keith: > >Hi, > > > >2016-02-09 11:35, Keith Wiles: > >> Adding support to the build system to allow for Makefile.XXX > >> extention to a subtree, which already has Makefiles. These > >> Makefiles could be from the autotools and others places. Using > >> the Makefile extention RTE_MKFILE_SUFFIX in a makefile subtree > >> using 'export RTE_MKFILE_SUFFIX=.XXX' to use Makefile.XXX in > >> that subtree. > >> > >> The main reason I needed this feature was to integrate a autotool > >> open source projects with DPDK and keep the original Makefiles. > > > >Sorry I fail to understand why it is needed. > >Are you trying to add autotool in DPDK? I don't think it is a good approach. > >The DPDK must provide a pkgconfig interface to be integrated anywhere. > > I was not trying to add autotools to DPDK. On a number of times I wanted to > integrate a open source project(s) with DPDK and use DPDK?s build system, but > because the open source project already contained Makefile files you can not > use DPDK build system without modify or moving the original Makefile files. > Using this method I can just add a exported variable and supply my own > Makefile.XXX files. > > One case was building FreeBSD source, but I did not want to modify FreeBSD > Makefiles (or reply on previous built Makefiles as they would not work on > Linux anyway) as I was pulling the source down from freebsd.org repo. Using a > patch to add the Makefiles with a different suffix allows me to build FreeBSD > using DPDK, without having to modify or own the FreeBSD source. I have had > this problem a number of times with open source code I did not want to > modify, but just build within DPDK build system and adding the support for a > different suffix to DPDK provided a clean way. The change does not effect the > correct build system and just allows someone to define a new suffix for a > given subtree in the code. Why would you like to have another project inside the DPDK files tree? If you want to integrate the lib inside an existing project, the solution is pkgconfig.
[dpdk-dev] [PATCH v4 2/4] eal: set kdrv to RTE_KDRV_NONE if kernel driver isn't managing the device
On Fri, Feb 26, 2016 at 2:53 AM, Huawei Xie wrote: > v4 changes: > reword the commit message. When we mention kernel driver, emphasizes > that it includes UIO/VFIO. Annotations should not be part of the commitlog itself. > Use RTE_KDRV_NONE to indicate that kernel driver(including UIO/VFIO) > isn't manipulating the device. missing space before ( > Signed-off-by: Huawei Xie > Acked-by: Yuanhan Liu Thought I already acked this. Anyway, Acked-by: David Marchand -- David Marchand
[dpdk-dev] [PATCH v4 1/4] eal: make the comment more accurate
On Fri, Feb 26, 2016 at 2:53 AM, Huawei Xie wrote: > positive return of rte_eal_pci_probe_one_driver means the driver doesn't > support > the device. > > Signed-off-by: Huawei Xie > Acked-by: Yuanhan Liu Acked-by: David Marchand -- David Marchand
[dpdk-dev] [PATCH v4 2/4] eal: set kdrv to RTE_KDRV_NONE if kernel driver isn't managing the device
On 2/29/2016 4:47 PM, David Marchand wrote: > On Fri, Feb 26, 2016 at 2:53 AM, Huawei Xie wrote: >> v4 changes: >> reword the commit message. When we mention kernel driver, emphasizes >> that it includes UIO/VFIO. > Annotations should not be part of the commitlog itself. Do you mean that "rewording the commit message" should not appear in the commit message itself? Those version changes will not appear in the commit log when applied, right? So i added this so that reviewers know that i have changed the commit message otherwise they don't need to waste their time reviewing the commit message again. Is it that even if i send a new patch version with only the changes to the commit message , i needn't mention this? > >> Use RTE_KDRV_NONE to indicate that kernel driver(including UIO/VFIO) >> isn't manipulating the device. > missing space before ( Thomas, could you help change this? > >> Signed-off-by: Huawei Xie >> Acked-by: Yuanhan Liu > Thought I already acked this. > Anyway, > Acked-by: David Marchand > >
[dpdk-dev] [PATCH v4 3/4] eal: call pci_ioport_map when kernel driver isn't managing the device
On Fri, Feb 26, 2016 at 2:53 AM, Huawei Xie wrote: > Call rte_eal_pci_ioport_map only if driver type is RTE_KDRV_NONE, which > means kernel driver(including UIO/VFIO) isn't managing the device. I suppose you meant 'Call pci_ioport_map when the pci device is not bound to a kernel driver'. If you keep on with your choice of words, at least put a space before the (. > other minor changes: > * use RTE_ARCH_X86 for pci ioport map This is a trivial change, but this should not be here. > * rework rte_eal_pci_ioport_map a bit Well, not sure this comment helps the review, and anyway, why did you need to change this ? Your modification should be the smallest possible. > Signed-off-by: Huawei Xie Let aside these nits. Acked-by: David Marchand -- David Marchand
[dpdk-dev] [PATCH v4 2/4] eal: set kdrv to RTE_KDRV_NONE if kernel driver isn't managing the device
On Mon, Feb 29, 2016 at 10:00 AM, Xie, Huawei wrote: > On 2/29/2016 4:47 PM, David Marchand wrote: >> On Fri, Feb 26, 2016 at 2:53 AM, Huawei Xie wrote: >>> v4 changes: >>> reword the commit message. When we mention kernel driver, emphasizes >>> that it includes UIO/VFIO. >> Annotations should not be part of the commitlog itself. > > Do you mean that "rewording the commit message" should not appear in the > commit message itself? Those version changes will not appear in the > commit log when applied, right? So i added this so that reviewers know Try to apply it. http://dpdk.org/dev : "Annotations take place after the 3 dashes and should explicit what has changed since the previous version.". -- David Marchand
[dpdk-dev] [PATCH v3 3/3] keepalive: add rte_keepalive_xstats_get()
Hi, There is a compilation error for 32-bit arch: 2016-02-22 11:26, Harry van Haaren: > + for (i = 0; i < nstats; i++) > + printf("%s\t%lu\n", xstats[i].name, xstats[i].value); examples/l2fwd-keepalive/main.c:206:10: error: format ?%lu? expects argument of type ?long unsigned int?, but argument 3 has type ?uint64_t {aka long long unsigned int}? Please keep acks when re-sending. Thanks
[dpdk-dev] [PATCH v3 0/4] Use common Linux tools to control DPDK ports
On 26/02/2016 14:10, Ferruh Yigit wrote: > Ferruh Yigit (4): >lib/librte_ethtool: move librte_ethtool form examples to lib folder >kcp: add kernel control path kernel module >rte_ctrl_if: add control interface library >examples/ethtool: add control interface support to the application Acked-by: Remy Horton
[dpdk-dev] [PATCH 1/3] kcp: add kernel control path kernel module
On 02/28/2016 10:16 PM, Ferruh Yigit wrote: > On 2/28/2016 3:34 PM, Avi Kivity wrote: >> On 01/27/2016 06:24 PM, Ferruh Yigit wrote: >>> This kernel module is based on KNI module, but this one is stripped >>> version of it and only for control messages, no data transfer >>> functionality provided. >>> >>> This Linux kernel module helps userspace application create virtual >>> interfaces and when a control command issued into that virtual >>> interface, module pushes the command to the userspace and gets the >>> response back for the caller application. >>> >>> The Linux tools like ethtool/ifconfig/ip can be used on virtual >>> interfaces but not ones for related data, like tcpdump. >>> >>> In long term this patch intends to replace the KNI and KNI will be >>> depreciated. >> Instead of adding yet another out-of-tree kernel module, why not extend >> the existing in-tree tap driver? This will make everyone's life easier. >> >> Since tap also supports data transfer, an application can also forward >> packets not intended to it to the kernel, and forward packets from the >> kernel through the device. >> > Hi Avi, > > KDP (Kernel Data Path) does what you have described, it is implemented > as PMD and it benefits from tap driver to data transfer through the > kernel. It also support custom kernel module for better performance. > > For KCP (Kernel Control Path), network driver forwards control commands > to the userspace driver, I doubt this is something wanted for tun/tap > driver, so extending tun/tap driver like this can be hard to upstream. Have you tried asking? Maybe if you explain it they will be open to the extension. Certainly it will be better to have KCP and KDP use the same kernel interface name; so we'll need to either add data path support to kcp (causing duplication with tap), or add control path support to tap. I think the latter is preferable. > We are investigating about adding a native support to Linux kernel for > KCP, but there is no task started for this right now, any support is > welcome. > >
[dpdk-dev] [PATCH v3 1/1] jobstats: added function abort for job
2016-02-16 13:19, Zhang, Roy Fan: > > On 12/02/2016 16:04, Marcin Kerlin wrote: > > This patch adds new function rte_jobstats_abort. It marks *job* as finished > > and > > time of this work will be add to management time instead of execution time. > > This > > function should be used instead of rte_jobstats_finish if condition occurs, > > condition is defined by the application for example when receiving n>0 > > packets. > > Example of usage is added to the example l2fwd-jobstats. At maximum load > > do-while > > loop inside Idle job will be execute once because one or more jobs waiting > > to be > > executed, so this time should not be include as the execution time by > > calling > > rte_jobstats_abort(). > > > > v2: > > * removed redundant field > > v3: > > * added an example of using > > > > Signed-off-by: Marcin Kerlin [...] > > --- a/lib/librte_jobstats/rte_jobstats_version.map > > +++ b/lib/librte_jobstats/rte_jobstats_version.map > > @@ -17,3 +17,10 @@ DPDK_2.0 { > > > > local: *; > > }; > > + > > +DPDK_2.3 { updated to 16.04 > > + global: > > + > > + rte_jobstats_abort; > > + > > +} DPDK_2.0; > > Acked-by : Fan Zhang Applied, thanks
[dpdk-dev] [PATCH v6] cfgfile: support looking up sections by index
> > This is useful when sections have duplicate names. > > > > Signed-off-by: Rich Lane > > --- > > v5->v6: > > - Reordered sectionname argument in comment. > > Acked-by: Cristian Dumitrescu > > Thanks, Rich! Applied, thanks
[dpdk-dev] [PATCH v3] examples/l3fwd: exact-match rework
Current implementation of Exact-Match uses different execution path than for LPM. Unifying them allows to reuse big part of LPM code and sightly increase performance of Exact-Match. Main changes: - * Packet classification stage is separated from the rest of path for both LPM and EM. * Packet processing, modifying and transmit part is the same for LPM and EM and mostly based on the current LPM implementation. * Shared code is moved to the common file "l3fwd_sse.h". * While sequential packet classification in EM path, seems to be faster than using multi hash lookup, used before, it is used by default. Old implementation is moved to the file l3fwd_em_hlm_sse.h and can be enabled with HASH_LOOKUP_MULTI global define in compilation time. This patch depends of Ravi Kerur's "Modify and modularize l3fwd code" and should be applied after it. Changes in v3: - fixed error: unused function 'l3fwd_em_simple_forward'. This function is used only in l3fwd_em_no_opt_send_packets, and after moving it to new header file l3fwd_em.h in Ravi's patch, also should be moved there. Changes in v2: - patch rebase to be applicable on top of "Modify and modularize l3fwd code" v3 Signed-off-by: Tomasz Kulasek Acked-by: Konstantin Ananyev --- examples/l3fwd/l3fwd.h|8 + examples/l3fwd/l3fwd_em.c | 80 +- examples/l3fwd/l3fwd_em.h | 68 + examples/l3fwd/l3fwd_em_hlm_sse.h | 341 + examples/l3fwd/l3fwd_em_sse.h | 447 +++- examples/l3fwd/l3fwd_lpm.c| 15 +- examples/l3fwd/l3fwd_lpm_sse.h| 507 - examples/l3fwd/l3fwd_sse.h| 501 8 files changed, 1011 insertions(+), 956 deletions(-) create mode 100644 examples/l3fwd/l3fwd_em_hlm_sse.h create mode 100644 examples/l3fwd/l3fwd_sse.h diff --git a/examples/l3fwd/l3fwd.h b/examples/l3fwd/l3fwd.h index f450269..da6d369 100644 --- a/examples/l3fwd/l3fwd.h +++ b/examples/l3fwd/l3fwd.h @@ -53,6 +53,14 @@ /* Configure how many packets ahead to prefetch, when reading packets */ #define PREFETCH_OFFSET 3 +/* Used to mark destination port as 'invalid'. */ +#defineBAD_PORT ((uint16_t)-1) + +#define FWDSTEP4 + +/* replace first 12B of the ethernet header. */ +#defineMASK_ETH 0x3f + /* Hash parameters. */ #ifdef RTE_ARCH_X86_64 /* default to 4 million hash entries (approx) */ diff --git a/examples/l3fwd/l3fwd_em.c b/examples/l3fwd/l3fwd_em.c index ace06cf..f6a65d8 100644 --- a/examples/l3fwd/l3fwd_em.c +++ b/examples/l3fwd/l3fwd_em.c @@ -300,81 +300,17 @@ em_get_ipv6_dst_port(void *ipv6_hdr, uint8_t portid, void *lookup_struct) return (uint8_t)((ret < 0) ? portid : ipv6_l3fwd_out_if[ret]); } -static inline __attribute__((always_inline)) void -l3fwd_em_simple_forward(struct rte_mbuf *m, uint8_t portid, - struct lcore_conf *qconf) -{ - struct ether_hdr *eth_hdr; - struct ipv4_hdr *ipv4_hdr; - uint8_t dst_port; - - eth_hdr = rte_pktmbuf_mtod(m, struct ether_hdr *); - - if (RTE_ETH_IS_IPV4_HDR(m->packet_type)) { - /* Handle IPv4 headers.*/ - ipv4_hdr = rte_pktmbuf_mtod_offset(m, struct ipv4_hdr *, - sizeof(struct ether_hdr)); - -#ifdef DO_RFC_1812_CHECKS - /* Check to make sure the packet is valid (RFC1812) */ - if (is_valid_ipv4_pkt(ipv4_hdr, m->pkt_len) < 0) { - rte_pktmbuf_free(m); - return; - } -#endif -dst_port = em_get_ipv4_dst_port(ipv4_hdr, portid, - qconf->ipv4_lookup_struct); - - if (dst_port >= RTE_MAX_ETHPORTS || - (enabled_port_mask & 1 << dst_port) == 0) - dst_port = portid; - -#ifdef DO_RFC_1812_CHECKS - /* Update time to live and header checksum */ - --(ipv4_hdr->time_to_live); - ++(ipv4_hdr->hdr_checksum); -#endif - /* dst addr */ - *(uint64_t *)ð_hdr->d_addr = dest_eth_addr[dst_port]; - - /* src addr */ - ether_addr_copy(&ports_eth_addr[dst_port], ð_hdr->s_addr); - - send_single_packet(qconf, m, dst_port); - } else if (RTE_ETH_IS_IPV6_HDR(m->packet_type)) { - /* Handle IPv6 headers.*/ - struct ipv6_hdr *ipv6_hdr; - - ipv6_hdr = rte_pktmbuf_mtod_offset(m, struct ipv6_hdr *, - sizeof(struct ether_hdr)); - - dst_port = em_get_ipv6_dst_port(ipv6_hdr, portid, - qconf->ipv6_lookup_struct); - - if (dst_port >= RTE_MAX_ETHPORTS || - (enabled_port_mask & 1 << dst_port) == 0) - dst_
[dpdk-dev] [PATCH 1/3] kcp: add kernel control path kernel module
On 2/29/2016 9:43 AM, Avi Kivity wrote: > On 02/28/2016 10:16 PM, Ferruh Yigit wrote: >> On 2/28/2016 3:34 PM, Avi Kivity wrote: >>> On 01/27/2016 06:24 PM, Ferruh Yigit wrote: This kernel module is based on KNI module, but this one is stripped version of it and only for control messages, no data transfer functionality provided. This Linux kernel module helps userspace application create virtual interfaces and when a control command issued into that virtual interface, module pushes the command to the userspace and gets the response back for the caller application. The Linux tools like ethtool/ifconfig/ip can be used on virtual interfaces but not ones for related data, like tcpdump. In long term this patch intends to replace the KNI and KNI will be depreciated. >>> Instead of adding yet another out-of-tree kernel module, why not extend >>> the existing in-tree tap driver? This will make everyone's life easier. >>> >>> Since tap also supports data transfer, an application can also forward >>> packets not intended to it to the kernel, and forward packets from the >>> kernel through the device. >>> >> Hi Avi, >> >> KDP (Kernel Data Path) does what you have described, it is implemented >> as PMD and it benefits from tap driver to data transfer through the >> kernel. It also support custom kernel module for better performance. >> >> For KCP (Kernel Control Path), network driver forwards control commands >> to the userspace driver, I doubt this is something wanted for tun/tap >> driver, so extending tun/tap driver like this can be hard to upstream. > > Have you tried asking? Maybe if you explain it they will be open to the > extension. > Not communicated but tun/tap already doing something different. For KCP, created interface is map of the DPDK port. All data interface shows coming from DPDK port. For example if you get stats information with ifconfig, the values you observe are DPDK port statistics -not statistics of data between userspace and kernelspace, statistics of data forwarded between DPDK ports. If you down the interface, DPDK port stopped, etc... If you extend the tun/tap, it won't be map of the DPDK port, and if you get statistics information from that interface, what do you expect to see, the data transferred between kernel and userspace, or underlying DPDK port forwarding statistics? Extending tun/tap in a way we want, forwarding all control commands to userspace, will break the current tun/tap, this doesn't looks like a valid option to me. For data path, using tun/tap is OK and we are already doing it, for the control path I believe we need a new driver. > Certainly it will be better to have KCP and KDP use the same kernel > interface name; so we'll need to either add data path support to kcp > (causing duplication with tap), or add control path support to tap. I > think the latter is preferable. > Why it is better to have same interface? Anyone who is not interested with kernel data path may want to control DPDK ports using common tools, or want to get some basic information and stats using ethtool or ifconfig. Why we need to bind two different functionality together? >> We are investigating about adding a native support to Linux kernel for >> KCP, but there is no task started for this right now, any support is >> welcome. >> >> >
[dpdk-dev] [PATCH v3] examples/l3fwd: exact-match rework
2016-02-29 11:33, Tomasz Kulasek: > Current implementation of Exact-Match uses different execution path than > for LPM. Unifying them allows to reuse big part of LPM code and sightly > increase performance of Exact-Match. > > Main changes: > - > * Packet classification stage is separated from the rest of path for both > LPM and EM. > * Packet processing, modifying and transmit part is the same for LPM and EM > and mostly based on the current LPM implementation. > * Shared code is moved to the common file "l3fwd_sse.h". > * While sequential packet classification in EM path, seems to be faster > than using multi hash lookup, used before, it is used by default. Old > implementation is moved to the file l3fwd_em_hlm_sse.h and can be enabled > with HASH_LOOKUP_MULTI global define in compilation time. > > This patch depends of Ravi Kerur's "Modify and modularize l3fwd code" and > should be applied after it. > > Changes in v3: > - fixed error: unused function 'l3fwd_em_simple_forward'. This function is >used only in l3fwd_em_no_opt_send_packets, and after moving it to new >header file l3fwd_em.h in Ravi's patch, also should be moved there. > > Changes in v2: > - patch rebase to be applicable on top of "Modify and modularize l3fwd >code" v3 > > Signed-off-by: Tomasz Kulasek > Acked-by: Konstantin Ananyev Applied, thanks
[dpdk-dev] [PATCH v6 1/2] mbuf: provide rte_pktmbuf_alloc_bulk API
On 02/24/2016 03:23 PM, Ananyev, Konstantin wrote: > Hi Panu, > >> -Original Message- >> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Panu Matilainen >> Sent: Wednesday, February 24, 2016 12:12 PM >> To: Xie, Huawei; Olivier MATZ; dev at dpdk.org >> Cc: dprovan at bivio.net >> Subject: Re: [dpdk-dev] [PATCH v6 1/2] mbuf: provide rte_pktmbuf_alloc_bulk >> API >> >> On 02/23/2016 07:35 AM, Xie, Huawei wrote: >>> On 2/22/2016 10:52 PM, Xie, Huawei wrote: On 2/4/2016 1:24 AM, Olivier MATZ wrote: > Hi, > > On 01/27/2016 02:56 PM, Panu Matilainen wrote: >> Since rte_pktmbuf_alloc_bulk() is an inline function, it is not part of >> the library ABI and should not be listed in the version map. >> >> I assume its inline for performance reasons, but then you lose the >> benefits of dynamic linking such as ability to fix bugs and/or improve >> itby just updating the library. Since the point of having a bulk API is >> to improve performance by reducing the number of calls required, does it >> really have to be inline? As in, have you actually measured the >> difference between inline and non-inline and decided its worth all the >> downsides? > Agree with Panu. It would be interesting to compare the performance > between inline and non inline to decide whether inlining it or not. Will update after i gathered more data. inline could show obvious performance difference in some cases. >>> >>> Panu and Oliver: >>> I write a simple benchmark. This benchmark run 10M rounds, in each round >>> 8 mbufs are allocated through bulk API, and then freed. >>> These are the CPU cycles measured(Intel(R) Xeon(R) CPU E5-2680 0 @ >>> 2.70GHz, CPU isolated, timer interrupt disabled, rcu offloaded). >>> Btw, i have removed some exceptional data, the frequency of which is >>> like 1/10. Sometimes observed user usage suddenly disappeared, no clue >>> what happened. >>> >>> With 8 mbufs allocated, there is about 6% performance increase using inline. >> [...] >>> >>> With 16 mbufs allocated, we could still observe obvious performance >>> difference, though only 1%-2% >>> >> [...] >>> >>> With 32/64 mbufs allocated, the deviation of the data itself would hide >>> the performance difference. >>> So we prefer using inline for performance. >> >> At least I was more after real-world performance in a real-world >> use-case rather than CPU cycles in a microbenchmark, we know function >> calls have a cost but the benefits tend to outweight the cons. >> >> Inline functions have their place and they're far less evil in project >> internal use, but in library public API they are BAD and should be ... >> well, not banned because there are exceptions to every rule, but highly >> discouraged. > > Why is that? For all the reasons static linking is bad, and what's worse it forces the static linking badness into dynamically linked builds. If there's a bug (security or otherwise) in a library, a distro wants to supply an updated package which fixes that bug and be done with it. But if that bug is in an inlined code, supplying an update is not enough, you also need to recompile everything using that code, and somehow inform customers possibly using that code that they need to not only update the library but to recompile their apps as well. That is precisely the reason distros go to great lenghts to avoid *any* statically linked apps and libs in the distro, completely regardless of the performance overhead. In addition, inlined code complicates ABI compatibility issues because some of the code is one the "wrong" side, and worse, it bypasses all the other ABI compatibility safeguards like soname and symbol versioning. Like said, inlined code is fine for internal consumption, but incredibly bad for public interfaces. And of course, the more complicated a function is, greater the potential of needing bugfixes. Mind you, none of this is magically specific to this particular function. Except in the sense that bulk operations offer a better way of performance improvements than just inlining everything. > As you can see right now we have all mbuf alloc/free routines as static > inline. > And I think we would like to keep it like that. > So why that particular function should be different? Because there's much less need to have it inlined since the function call overhead is "amortized" by the fact its doing bulk operations. "We always did it that way" is not a very good reason :) > After all that function is nothing more than a wrapper > around rte_mempool_get_bulk() unrolled by 4 loop {rte_pktmbuf_reset()} > So unless mempool get/put API would change, I can hardly see there could be > any ABI > breakages in future. > About 'real world' performance gain - it was a 'real world' performance > problem, > that we tried to solve by introducing that function: > http://dpdk.org/ml/archives/dev/2015-May/017633.html > > And according to the user feedback, it do
[dpdk-dev] [PATCH 0/6] external mempool manager
On 2/19/2016 1:25 PM, Olivier MATZ wrote: > Hi, > > On 02/16/2016 03:48 PM, David Hunt wrote: >> Hi list. >> >> Here's the v2 version of a proposed patch for an external mempool manager > Just to notice the "v2" is missing in the title, it would help > to have it for next versions of the series. > Thanks, Olivier, I will ensure it's in the next patchset. Regards, David.
[dpdk-dev] [PATCH 1/3] kcp: add kernel control path kernel module
On 02/29/2016 12:43 PM, Ferruh Yigit wrote: > On 2/29/2016 9:43 AM, Avi Kivity wrote: >> On 02/28/2016 10:16 PM, Ferruh Yigit wrote: >>> On 2/28/2016 3:34 PM, Avi Kivity wrote: On 01/27/2016 06:24 PM, Ferruh Yigit wrote: > This kernel module is based on KNI module, but this one is stripped > version of it and only for control messages, no data transfer > functionality provided. > > This Linux kernel module helps userspace application create virtual > interfaces and when a control command issued into that virtual > interface, module pushes the command to the userspace and gets the > response back for the caller application. > > The Linux tools like ethtool/ifconfig/ip can be used on virtual > interfaces but not ones for related data, like tcpdump. > > In long term this patch intends to replace the KNI and KNI will be > depreciated. Instead of adding yet another out-of-tree kernel module, why not extend the existing in-tree tap driver? This will make everyone's life easier. Since tap also supports data transfer, an application can also forward packets not intended to it to the kernel, and forward packets from the kernel through the device. >>> Hi Avi, >>> >>> KDP (Kernel Data Path) does what you have described, it is implemented >>> as PMD and it benefits from tap driver to data transfer through the >>> kernel. It also support custom kernel module for better performance. >>> >>> For KCP (Kernel Control Path), network driver forwards control commands >>> to the userspace driver, I doubt this is something wanted for tun/tap >>> driver, so extending tun/tap driver like this can be hard to upstream. >> Have you tried asking? Maybe if you explain it they will be open to the >> extension. >> > Not communicated but tun/tap already doing something different. > For KCP, created interface is map of the DPDK port. All data interface > shows coming from DPDK port. For example if you get stats information > with ifconfig, the values you observe are DPDK port statistics -not > statistics of data between userspace and kernelspace, statistics of data > forwarded between DPDK ports. If you down the interface, DPDK port > stopped, etc... > > If you extend the tun/tap, it won't be map of the DPDK port, and if you > get statistics information from that interface, what do you expect to > see, the data transferred between kernel and userspace, or underlying > DPDK port forwarding statistics? Good point. But you really have to involve netdev on this, or you'll live out-of-tree forever. > Extending tun/tap in a way we want, forwarding all control commands to > userspace, will break the current tun/tap, this doesn't looks like a > valid option to me. It's possible to enhance it while preserving backwards compatibility, by enabling a feature flag (statistics from userspace). > For data path, using tun/tap is OK and we are already doing it, for the > control path I believe we need a new driver. > >> Certainly it will be better to have KCP and KDP use the same kernel >> interface name; so we'll need to either add data path support to kcp >> (causing duplication with tap), or add control path support to tap. I >> think the latter is preferable. >> > Why it is better to have same interface? Anyone who is not interested > with kernel data path may want to control DPDK ports using common tools, > or want to get some basic information and stats using ethtool or > ifconfig. Why we need to bind two different functionality together? Having two interfaces will be confusing for the user. If I wish to firewall data packets coming from the dpdk port, do I set firewall rules on dpdk0 or tap0? I don't think it matters whether you extend tap, or add a data path to kcp, but if you want to upstream it, it needs to be blessed by netdev. > >>> We are investigating about adding a native support to Linux kernel for >>> KCP, but there is no task started for this right now, any support is >>> welcome. >>> >>>
[dpdk-dev] [PATCH 2/6] mempool: add stack (lifo) based external mempool handler
On 2/19/2016 1:31 PM, Olivier MATZ wrote: > Hi David, > > On 02/16/2016 03:48 PM, David Hunt wrote: >> adds a simple stack based mempool handler >> >> Signed-off-by: David Hunt >> --- >> lib/librte_mempool/Makefile| 2 +- >> lib/librte_mempool/rte_mempool.c | 4 +- >> lib/librte_mempool/rte_mempool.h | 1 + >> lib/librte_mempool/rte_mempool_stack.c | 164 >> + >> 4 files changed, 169 insertions(+), 2 deletions(-) >> create mode 100644 lib/librte_mempool/rte_mempool_stack.c >> > I don't get what is the purpose of this handler. Is it an example > or is it something that could be useful for dpdk applications? > > If it's an example, we should find a way to put the code outside > the librte_mempool library, maybe in the test program. I see there > is also a "custom handler". Do we really need to have both? They are both example handlers. I agree that we could reduce down to one, and since the 'custom' handler has autotests, I would suggest we keep that one. The next question is where it should live. I agree that it's not ideal to have example code living in the same directory as the mempool library, but they are an integral part of the library itself. How about creating a handlers sub-directory? We could then keep all additional and sample handlers in there, away from the built-in handlers. Also, seeing as the handler code is intended to be part of the library, I think moving it out to the examples directory may confuse matters further. Regards, David.
[dpdk-dev] [PATCH 1/3] kcp: add kernel control path kernel module
Hi, I totally agree with Avi's comments. This topic is really important for the future of DPDK. So I think we must give some time to continue the discussion and have netdev involved in the choices done. As a consequence, these series should not be merged in the release 16.04. Thanks for continuing the work. 2016-02-29 12:58, Avi Kivity: > On 02/29/2016 12:43 PM, Ferruh Yigit wrote: > > On 2/29/2016 9:43 AM, Avi Kivity wrote: > >> On 02/28/2016 10:16 PM, Ferruh Yigit wrote: > >>> On 2/28/2016 3:34 PM, Avi Kivity wrote: > On 01/27/2016 06:24 PM, Ferruh Yigit wrote: > > This kernel module is based on KNI module, but this one is stripped > > version of it and only for control messages, no data transfer > > functionality provided. > > > > This Linux kernel module helps userspace application create virtual > > interfaces and when a control command issued into that virtual > > interface, module pushes the command to the userspace and gets the > > response back for the caller application. > > > > The Linux tools like ethtool/ifconfig/ip can be used on virtual > > interfaces but not ones for related data, like tcpdump. > > > > In long term this patch intends to replace the KNI and KNI will be > > depreciated. > Instead of adding yet another out-of-tree kernel module, why not extend > the existing in-tree tap driver? This will make everyone's life easier. > > Since tap also supports data transfer, an application can also forward > packets not intended to it to the kernel, and forward packets from the > kernel through the device. > > >>> Hi Avi, > >>> > >>> KDP (Kernel Data Path) does what you have described, it is implemented > >>> as PMD and it benefits from tap driver to data transfer through the > >>> kernel. It also support custom kernel module for better performance. > >>> > >>> For KCP (Kernel Control Path), network driver forwards control commands > >>> to the userspace driver, I doubt this is something wanted for tun/tap > >>> driver, so extending tun/tap driver like this can be hard to upstream. > >> Have you tried asking? Maybe if you explain it they will be open to the > >> extension. > >> > > Not communicated but tun/tap already doing something different. > > For KCP, created interface is map of the DPDK port. All data interface > > shows coming from DPDK port. For example if you get stats information > > with ifconfig, the values you observe are DPDK port statistics -not > > statistics of data between userspace and kernelspace, statistics of data > > forwarded between DPDK ports. If you down the interface, DPDK port > > stopped, etc... > > > > If you extend the tun/tap, it won't be map of the DPDK port, and if you > > get statistics information from that interface, what do you expect to > > see, the data transferred between kernel and userspace, or underlying > > DPDK port forwarding statistics? > > Good point. But you really have to involve netdev on this, or you'll > live out-of-tree forever. +1 > > Extending tun/tap in a way we want, forwarding all control commands to > > userspace, will break the current tun/tap, this doesn't looks like a > > valid option to me. > > It's possible to enhance it while preserving backwards compatibility, by > enabling a feature flag (statistics from userspace). +1 > > For data path, using tun/tap is OK and we are already doing it, for the > > control path I believe we need a new driver. > > > >> Certainly it will be better to have KCP and KDP use the same kernel > >> interface name; so we'll need to either add data path support to kcp > >> (causing duplication with tap), or add control path support to tap. I > >> think the latter is preferable. > >> > > Why it is better to have same interface? Anyone who is not interested > > with kernel data path may want to control DPDK ports using common tools, > > or want to get some basic information and stats using ethtool or > > ifconfig. Why we need to bind two different functionality together? > > Having two interfaces will be confusing for the user. If I wish to > firewall data packets coming from the dpdk port, do I set firewall rules > on dpdk0 or tap0? +1 > I don't think it matters whether you extend tap, or add a data path to > kcp, but if you want to upstream it, it needs to be blessed by netdev. +1 > >>> We are investigating about adding a native support to Linux kernel for > >>> KCP, but there is no task started for this right now, any support is > >>> welcome.
[dpdk-dev] [PATCH 1/6] mempool: add external mempool manager support
On 2/19/2016 1:30 PM, Olivier MATZ wrote: > Hi David, > > On 02/16/2016 03:48 PM, David Hunt wrote: >> Adds the new rte_mempool_create_ext api and callback mechanism for >> external mempool handlers >> >> Modifies the existing rte_mempool_create to set up the handler_idx to >> the relevant mempool handler based on the handler name: >> ring_sp_sc >> ring_mp_mc >> ring_sp_mc >> ring_mp_sc >> >> v2: merges the duplicated code in rte_mempool_xmem_create and >> rte_mempool_create_ext into one common function. The old functions >> now call the new common function with the relevant parameters. >> >> Signed-off-by: David Hunt > I think the refactoring of rte_mempool_create() (adding of > mempool_create()) should go in another commit. It will make the > patches much easier to read. > > Also, I'm sorry but it seems that several comments or question I've made > in http://dpdk.org/ml/archives/dev/2016-February/032706.html are > not addressed. > > Examples: > - putting some part of the patch in separate commits > - meaning of "rt_pool" > - put_pool_bulk unclear comment > - should we also have get_pool_bulk stats? > - missing _MEMPOOL_STAT_ADD() in mempool_bulk() > - why internal in rte_mempool_internal.h? > - why default in rte_mempool_default.c? > - remaining references to stack handler (in a comment) > - ...? > > As you know, doing a proper code review takes a lot of time. If I > have to re-check all of my previous comments, it will take even > more. I'm not saying all my comments require a code change, but in case > you don't agree, please at least explain your opinion so we can debate > on the list. > Hi Olivier, Sincerest apologies. I had intended in coming back around to your original comments after refactoring the code. I will do that now. I did take them into consideration, but I see now that I need to do further work, such as a clearer name for rt_pool, etc. I will respond to your original email. Thanks David.
[dpdk-dev] [PATCH 1/3] kcp: add kernel control path kernel module
On 2/29/2016 10:58 AM, Avi Kivity wrote: > > > On 02/29/2016 12:43 PM, Ferruh Yigit wrote: >> On 2/29/2016 9:43 AM, Avi Kivity wrote: >>> On 02/28/2016 10:16 PM, Ferruh Yigit wrote: On 2/28/2016 3:34 PM, Avi Kivity wrote: > On 01/27/2016 06:24 PM, Ferruh Yigit wrote: >> This kernel module is based on KNI module, but this one is stripped >> version of it and only for control messages, no data transfer >> functionality provided. >> >> This Linux kernel module helps userspace application create virtual >> interfaces and when a control command issued into that virtual >> interface, module pushes the command to the userspace and gets the >> response back for the caller application. >> >> The Linux tools like ethtool/ifconfig/ip can be used on virtual >> interfaces but not ones for related data, like tcpdump. >> >> In long term this patch intends to replace the KNI and KNI will be >> depreciated. > Instead of adding yet another out-of-tree kernel module, why not > extend > the existing in-tree tap driver? This will make everyone's life > easier. > > Since tap also supports data transfer, an application can also forward > packets not intended to it to the kernel, and forward packets from the > kernel through the device. > Hi Avi, KDP (Kernel Data Path) does what you have described, it is implemented as PMD and it benefits from tap driver to data transfer through the kernel. It also support custom kernel module for better performance. For KCP (Kernel Control Path), network driver forwards control commands to the userspace driver, I doubt this is something wanted for tun/tap driver, so extending tun/tap driver like this can be hard to upstream. >>> Have you tried asking? Maybe if you explain it they will be open to the >>> extension. >>> >> Not communicated but tun/tap already doing something different. >> For KCP, created interface is map of the DPDK port. All data interface >> shows coming from DPDK port. For example if you get stats information >> with ifconfig, the values you observe are DPDK port statistics -not >> statistics of data between userspace and kernelspace, statistics of data >> forwarded between DPDK ports. If you down the interface, DPDK port >> stopped, etc... >> >> If you extend the tun/tap, it won't be map of the DPDK port, and if you >> get statistics information from that interface, what do you expect to >> see, the data transferred between kernel and userspace, or underlying >> DPDK port forwarding statistics? > > Good point. But you really have to involve netdev on this, or you'll > live out-of-tree forever. > Why do we need to touch netdev? A simple network driver, similar to kcp, can be solution. This driver implements all net_device_ops and ethtool_ops in a way to forward everything to the userspace via netlink. All needs to know about userspace driver is it's unique id. Any userspace application, not only DPDK drivers, can listen the netlink messages and response to the requests come to itself. This kind of driver is not big or complicated, kcp already does %90 of what described above. >> Extending tun/tap in a way we want, forwarding all control commands to >> userspace, will break the current tun/tap, this doesn't looks like a >> valid option to me. > > It's possible to enhance it while preserving backwards compatibility, by > enabling a feature flag (statistics from userspace). > >> For data path, using tun/tap is OK and we are already doing it, for the >> control path I believe we need a new driver. >> >>> Certainly it will be better to have KCP and KDP use the same kernel >>> interface name; so we'll need to either add data path support to kcp >>> (causing duplication with tap), or add control path support to tap. I >>> think the latter is preferable. >>> >> Why it is better to have same interface? Anyone who is not interested >> with kernel data path may want to control DPDK ports using common tools, >> or want to get some basic information and stats using ethtool or >> ifconfig. Why we need to bind two different functionality together? > > Having two interfaces will be confusing for the user. If I wish to > firewall data packets coming from the dpdk port, do I set firewall rules > on dpdk0 or tap0? > Agreed that it is confusing to have two interfaces. I think if user wants to use both data and control paths, a way can be found to end up with single interface, using module params or something else. Two different drivers for data and control not conflict with each other and can cooperate. But to work on this first both KCP and KDP should go in. > I don't think it matters whether you extend tap, or add a data path to > kcp, but if you want to upstream it, it needs to be blessed by netdev. > I still think not good idea to merge these two, because they may be used independently, but we can improve how they work together.
[dpdk-dev] [PATCH] doc/nic: add ixgbe statistics on read frequency
This patch adds a note to the ixgbe PMD guide, stating the minimum time that statistics must be polled from the hardware in order to avoid register values becoming saturated and "sticking" to the max value. Signed-off-by: Harry van Haaren --- doc/guides/nics/ixgbe.rst | 24 1 file changed, 24 insertions(+) diff --git a/doc/guides/nics/ixgbe.rst b/doc/guides/nics/ixgbe.rst index 8cae299..c8085a8 100644 --- a/doc/guides/nics/ixgbe.rst +++ b/doc/guides/nics/ixgbe.rst @@ -178,3 +178,27 @@ load_balancer As in the case of l3fwd, set configure port_conf.rxmode.hw_ip_checksum=0 to enable vPMD. In addition, for improved performance, use -bsz "(32,32),(64,64),(32,32)" in load_balancer to avoid using the default burst size of 144. + +Statistics +-- + +The statistics of ixgbe hardware must be polled regularly in order for it to +remain consistent. Running a DPDK application without polling the statistcs will +cause registers on hardware to count to thier maxiumum value, and "stick" at +that value. + +In order to avoid statistic registers every reaching thier maxiumum value, +read the statistics from the hardware using ``rte_eth_stats_get()`` or +``rte_eth_xstats_get()``. + +The maxiumum time between statistics polls that ensures consistent results can +be calculated as follows: + +.. code-block:: c + + max_read_interval = UINT_MAX / max_packets_per_second + max_read_interval = 4294967295 / 14880952 + max_read_interval = 288.6218096127183 (seconds) + max_read_interval = ~4 mins 48 sec. + +In order to ensure valid results, it is recommended to poll every 4 minutes. -- 2.5.0
[dpdk-dev] [PATCH v5 01/11] ethdev: add API to query packet type filling info
On 02/26/2016 09:34 AM, Jianfeng Tan wrote: > Add a new API rte_eth_dev_get_ptype_info to query whether/what packet > type can be filled by given pmd rx burst function. > > Signed-off-by: Jianfeng Tan > --- > lib/librte_ether/rte_ethdev.c | 26 ++ > lib/librte_ether/rte_ethdev.h | 26 ++ > 2 files changed, 52 insertions(+) > [...] > diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h > index 16da821..16f32a0 100644 > --- a/lib/librte_ether/rte_ethdev.h > +++ b/lib/librte_ether/rte_ethdev.h > @@ -1021,6 +1021,9 @@ typedef void (*eth_dev_infos_get_t)(struct rte_eth_dev > *dev, > struct rte_eth_dev_info *dev_info); > /**< @internal Get specific informations of an Ethernet device. */ > > +typedef const uint32_t *(*eth_dev_ptype_info_get_t)(struct rte_eth_dev *dev); > +/**< @internal Get ptype info of eth_rx_burst_t. */ > + > typedef int (*eth_queue_start_t)(struct rte_eth_dev *dev, > uint16_t queue_id); > /**< @internal Start rx and tx of a queue of an Ethernet device. */ > @@ -1347,6 +1350,7 @@ struct eth_dev_ops { > eth_queue_stats_mapping_set_t queue_stats_mapping_set; > /**< Configure per queue stat counter mapping. */ > eth_dev_infos_get_tdev_infos_get; /**< Get device info. */ > + eth_dev_ptype_info_get_t dev_ptype_info_get; /** Get ptype info */ > mtu_set_t mtu_set; /**< Set MTU. */ > vlan_filter_set_t vlan_filter_set; /**< Filter VLAN Setup. */ > vlan_tpid_set_tvlan_tpid_set; /**< Outer VLAN TPID > Setup. */ > @@ -2268,6 +2272,28 @@ void rte_eth_macaddr_get(uint8_t port_id, struct > ether_addr *mac_addr); Technically this is an ABI break but its marked internal and I guess it falls into the "drivers only" territory similar to what was discussed in this thead: http://dpdk.org/ml/archives/dev/2016-January/032348.html so its probably ok. > void rte_eth_dev_info_get(uint8_t port_id, struct rte_eth_dev_info > *dev_info); > > /** > + * Retrieve the packet type information of an Ethernet device. > + * > + * @param port_id > + * The port identifier of the Ethernet device. > + * @param ptype_mask > + * A hint of what kind of packet type which the caller is interested in. > + * @param ptypes > + * An array pointer to store adequent packet types, allocated by caller. > + * @param num > + * Size of the array pointed by param ptypes. > + * @return > + * - (>0) Number of ptypes supported. If it exceeds param num, exceeding > + * packet types will not be filled in the given array. > + * - (0 or -ENOTSUP) if PMD does not fill the specified ptype. > + * - (-ENODEV) if *port_id* invalid. > + */ > +extern int rte_eth_dev_get_ptype_info(uint8_t port_id, > + uint32_t ptype_mask, > + uint32_t *ptypes, > + int num); > + > +/** >* Retrieve the MTU of an Ethernet device. >* >* @param port_id > "extern" is redundant in headers. We just saw a round of removing them (commit dd34ff1f0e03b2c5e4a97e9fbcba5c8238aac573), lets not add them back :) More importantly, to export a function you need to add an entry for it in rte_ether_version.map. - Panu -
[dpdk-dev] [PATCH 1/3] kcp: add kernel control path kernel module
On 2/29/2016 11:06 AM, Thomas Monjalon wrote: > Hi, > I totally agree with Avi's comments. > This topic is really important for the future of DPDK. > So I think we must give some time to continue the discussion > and have netdev involved in the choices done. > As a consequence, these series should not be merged in the release 16.04. > Thanks for continuing the work. > Hi Thomas, It is great to have some discussion and feedbacks. But I doubt not merging in this release will help to have more discussion. It is better to have them in this release and let people experiment it, this gives more chance to better discussion. These features are replacement of KNI, and KNI is not intended to be removed in this release, so who are using KNI as solution can continue to use KNI and can test KCP/KDP, so that we can get more feedbacks. Thanks, ferruh
[dpdk-dev] [PATCH 1/3] kcp: add kernel control path kernel module
On 02/29/2016 01:27 PM, Ferruh Yigit wrote: > On 2/29/2016 10:58 AM, Avi Kivity wrote: >> >> On 02/29/2016 12:43 PM, Ferruh Yigit wrote: >>> On 2/29/2016 9:43 AM, Avi Kivity wrote: On 02/28/2016 10:16 PM, Ferruh Yigit wrote: > On 2/28/2016 3:34 PM, Avi Kivity wrote: >> On 01/27/2016 06:24 PM, Ferruh Yigit wrote: >>> This kernel module is based on KNI module, but this one is stripped >>> version of it and only for control messages, no data transfer >>> functionality provided. >>> >>> This Linux kernel module helps userspace application create virtual >>> interfaces and when a control command issued into that virtual >>> interface, module pushes the command to the userspace and gets the >>> response back for the caller application. >>> >>> The Linux tools like ethtool/ifconfig/ip can be used on virtual >>> interfaces but not ones for related data, like tcpdump. >>> >>> In long term this patch intends to replace the KNI and KNI will be >>> depreciated. >> Instead of adding yet another out-of-tree kernel module, why not >> extend >> the existing in-tree tap driver? This will make everyone's life >> easier. >> >> Since tap also supports data transfer, an application can also forward >> packets not intended to it to the kernel, and forward packets from the >> kernel through the device. >> > Hi Avi, > > KDP (Kernel Data Path) does what you have described, it is implemented > as PMD and it benefits from tap driver to data transfer through the > kernel. It also support custom kernel module for better performance. > > For KCP (Kernel Control Path), network driver forwards control commands > to the userspace driver, I doubt this is something wanted for tun/tap > driver, so extending tun/tap driver like this can be hard to upstream. Have you tried asking? Maybe if you explain it they will be open to the extension. >>> Not communicated but tun/tap already doing something different. >>> For KCP, created interface is map of the DPDK port. All data interface >>> shows coming from DPDK port. For example if you get stats information >>> with ifconfig, the values you observe are DPDK port statistics -not >>> statistics of data between userspace and kernelspace, statistics of data >>> forwarded between DPDK ports. If you down the interface, DPDK port >>> stopped, etc... >>> >>> If you extend the tun/tap, it won't be map of the DPDK port, and if you >>> get statistics information from that interface, what do you expect to >>> see, the data transferred between kernel and userspace, or underlying >>> DPDK port forwarding statistics? >> Good point. But you really have to involve netdev on this, or you'll >> live out-of-tree forever. >> > Why do we need to touch netdev? By netdev, I meant the mailing list. If you don't touch it, your driver will remain out-of-tree forever. > A simple network driver, similar to kcp, can be solution. > > This driver implements all net_device_ops and ethtool_ops in a way to > forward everything to the userspace via netlink. All needs to know about > userspace driver is it's unique id. Any userspace application, not only > DPDK drivers, can listen the netlink messages and response to the > requests come to itself. > > This kind of driver is not big or complicated, kcp already does %90 of > what described above. I am not arguing against kcp. It fulfills an important need. This is my argument: 1. having multiple interfaces for the control and data path is bad for the user 2. therefore, we need to either add tap functionality to kcp, or add kcp functionality to tap 3. netdev@ is more likely (IMO) to accept additional functionality to tap than a new driver, but the only way to know is to engage with them > >>> Extending tun/tap in a way we want, forwarding all control commands to >>> userspace, will break the current tun/tap, this doesn't looks like a >>> valid option to me. >> It's possible to enhance it while preserving backwards compatibility, by >> enabling a feature flag (statistics from userspace). >> >>> For data path, using tun/tap is OK and we are already doing it, for the >>> control path I believe we need a new driver. >>> Certainly it will be better to have KCP and KDP use the same kernel interface name; so we'll need to either add data path support to kcp (causing duplication with tap), or add control path support to tap. I think the latter is preferable. >>> Why it is better to have same interface? Anyone who is not interested >>> with kernel data path may want to control DPDK ports using common tools, >>> or want to get some basic information and stats using ethtool or >>> ifconfig. Why we need to bind two different functionality together? >> Having two interfaces will be confusing for the user. If I wish to >> firewall data packets coming from the dpdk port, do I set firewall rules >> on dpdk0
[dpdk-dev] ACL memory allocation failures
Thanks Konstantin. Few more questions in line: > > Previous allocation error was coming with 1024 huge pages of 2 MB size. > > After increasing the huge pages to 2048, I was able to add another > ~140 rules [IPv4 rule data--> with src, dst IP address & port, next header ] > more, ie., 950 rules were added. That's strange according to your log, all you need is ~13MB of hugepage memory: ACL: allocation of 12966528 bytes on socket 0 for ipv4_acl_table1 Wonder what consumed rest of 4GB? >> We are creating mem pools (for DPDK compatible 3 ports) for packet >> processing. >>> And there are no free huge pages available after our DPDK app >>> initialization. Again do you re-build your table after every rule you add? If so, then it seems a bit strange approach (and definitely not the fastest one). >>Yes, we are rebuilding the rules every time and is due to 2 reasons: >>1. Our application, gives full list of rules every time you add new rule. >>2. There is no way to delete a specific rule in the trie. Is there any way to >>delete a specific ACL rule? What you can do instead: create context; add all your rules into it; build; >>> By following the same approach (what I explained above, rebuilding the ACL >>> trie everytime), can we fix this memory allocation issue? >>>If yes, please provide me some pointers to modify the code. > > Logically it did not increase number of rules [expected 2*817, but only 950 > were added]. Is it really using huge pages memory only? > > From the code it looks like heap memory. [ ret = > malloc_heap_alloc(&mcfg->malloc_heaps[i], type, size, 0, align == 0 ? > 1 : align, 0) ] As I can see from the log it fails at GEN phase, when trying to allocate hugepages for RT table. At lib/librte_acl/acl_gen.c:509 rte_acl_gen(struct rte_acl_ctx *ctx, struct rte_acl_trie *trie, struct rte_acl_bld_trie *node_bld_trie, uint32_t num_tries, uint32_t num_categories, uint32_t data_index_sz, size_t max_size) { ... mem = rte_zmalloc_socket(ctx->name, total_size, RTE_CACHE_LINE_SIZE, ctx->socket_id); if (mem == NULL) { RTE_LOG(ERR, ACL, "allocation of %zu bytes on socket %d for %s failed\n", total_size, ctx->socket_id, ctx->name); return -ENOMEM; } >>> Is there any way to reserve some particular amount of huge page memory for >>> ACL trie (in eal_init())? Konstantin > > > -Original Message- > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Rapelly, Varun > > Sent: Friday, February 26, 2016 10:28 AM > > To: dev at dpdk.org > > Subject: Re: [dpdk-dev] ACL memory allocation failures > > > > Hi All, > > > > When I'm trying to configure some 5000+ ACL rules with different > > source IP addresses, getting ACL memory allocation failure. I'm using DPDK > > 2.1. > > > > [root at ACLISSUE log_2015_10_26_08_19_42]# vim np.log match > > nodes/bytes > > used: 816/104448 > > total: 12940832 bytes > > ACL: Build phase for ACL "ipv4_acl_table2": > > memory consumed: 947913495 > > ACL: trie 0: number of rules: 816 > > ACL: allocation of 12966528 bytes on socket 0 for ipv4_acl_table1 > > failed > > ACL: Build phase for ACL "ipv4_acl_table1": > > memory consumed: 947913495 > > ACL: trie 0: number of rules: 817 > > EAL: Error - exiting with code: 1 > > Cause: Failed to build ACL trie > > > > Again sourced the ACL config file. After adding around 77 again the same > > error came. > > > > total: 14912784 bytes > > ACL: Build phase for ACL "ipv4_acl_table1": > > memory consumed: 1040188260 > > ACL: trie 0: number of rules: 893 > > ACL: allocation of 14938480 bytes on socket 0 for ipv4_acl_table2 > > failed > > You are running out of hugepages memory. > > > ACL: Build phase for ACL "ipv4_acl_table2": > > memory consumed: 1040188260 > > ACL: trie 0: number of rules: 894 > > EAL: Error - exiting with code: 1 > > Cause: Failed to build ACL trie > > > > Where to increase the memory to avoid this issue? > > Refer to: > http://dpdk.org/doc/guides/linux_gsg/sys_reqs.html#running-dpdk-applic > ations > Section 2.3.2 > > Konstantin
[dpdk-dev] [PATCH v2] doc/nic: add ixgbe statistics on read frequency
This patch adds a note to the ixgbe PMD guide, stating the minimum time that statistics must be polled from the hardware in order to avoid register values becoming saturated and "sticking" to the max value. Reported-by: Jerry Zhang Tested-by: Marcin Kerlin Signed-off-by: Harry van Haaren --- v2: Add reported-by and tested-by doc/guides/nics/ixgbe.rst | 24 1 file changed, 24 insertions(+) diff --git a/doc/guides/nics/ixgbe.rst b/doc/guides/nics/ixgbe.rst index 8cae299..c8085a8 100644 --- a/doc/guides/nics/ixgbe.rst +++ b/doc/guides/nics/ixgbe.rst @@ -178,3 +178,27 @@ load_balancer As in the case of l3fwd, set configure port_conf.rxmode.hw_ip_checksum=0 to enable vPMD. In addition, for improved performance, use -bsz "(32,32),(64,64),(32,32)" in load_balancer to avoid using the default burst size of 144. + +Statistics +-- + +The statistics of ixgbe hardware must be polled regularly in order for it to +remain consistent. Running a DPDK application without polling the statistcs will +cause registers on hardware to count to thier maxiumum value, and "stick" at +that value. + +In order to avoid statistic registers every reaching thier maxiumum value, +read the statistics from the hardware using ``rte_eth_stats_get()`` or +``rte_eth_xstats_get()``. + +The maxiumum time between statistics polls that ensures consistent results can +be calculated as follows: + +.. code-block:: c + + max_read_interval = UINT_MAX / max_packets_per_second + max_read_interval = 4294967295 / 14880952 + max_read_interval = 288.6218096127183 (seconds) + max_read_interval = ~4 mins 48 sec. + +In order to ensure valid results, it is recommended to poll every 4 minutes. -- 2.5.0
[dpdk-dev] [PATCH v1] virtio: Use cpuflag for vector api
On Mon, Feb 29, 2016 at 9:57 AM, Yuanhan Liu wrote: > On Fri, Feb 26, 2016 at 02:21:02PM +0530, Santosh Shukla wrote: >> Check cpuflag macro before using vectored api. >> -virtio_recv_pkts_vec() uses _sse3__ simd instruction for now so added >> cpuflag. >> - Also wrap other vectored freind api ie.. >> 1) virtqueue_enqueue_recv_refill_simple >> 2) virtio_rxq_vec_setup >> > ... >> diff --git a/drivers/net/virtio/virtio_rxtx_simple.c >> b/drivers/net/virtio/virtio_rxtx_simple.c >> index 3a1de9d..be51d7c 100644 >> --- a/drivers/net/virtio/virtio_rxtx_simple.c >> +++ b/drivers/net/virtio/virtio_rxtx_simple.c > > Hmm, why not wrapping the whole file, instead of just few functions? > Better to refactor code and make arch specific. Current implementation is temporary. > Or maybe better, do a compile time check at the Makefile, something > like: > > if has_CPUFLAG_xxx > SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_rxtx_simple.c > endif > Tried this approach but end up with link error, If I try to fix below link error then I will be ending up writing similar code, linker error snap: /work/santosh/thunder/nfs/dpdk/arm64-thunderx-linuxapp-gcc/lib/librte_pmd_virtio.a(virtio_rxtx.o): In function `virtio_dev_rxtx_start': virtio_rxtx.c:(.text+0x168c): undefined reference to `virtqueue_enqueue_recv_refill_simple' /work/santosh/thunder/nfs/dpdk/arm64-thunderx-linuxapp-gcc/lib/librte_pmd_virtio.a(virtio_rxtx.o): In function `virtio_dev_rx_queue_setup': virtio_rxtx.c:(.text+0x2364): undefined reference to `virtio_rxq_vec_setup' /work/santosh/thunder/nfs/dpdk/arm64-thunderx-linuxapp-gcc/lib/librte_pmd_virtio.a(virtio_rxtx.o): In function `virtio_dev_tx_queue_setup': virtio_rxtx.c:(.text+0x2460): undefined reference to `virtio_xmit_pkts_simple' virtio_rxtx.c:(.text+0x2464): undefined reference to `virtio_recv_pkts_vec' virtio_rxtx.c:(.text+0x2468): undefined reference to `virtio_xmit_pkts_simple' virtio_rxtx.c:(.text+0x246c): undefined reference to `virtio_recv_pkts_vec' collect2: error: ld returned 1 exit status make[5]: *** [test] Error 1 make[4]: *** [test] Error 2 make[3]: *** [app] Error 2 > > --yliu
[dpdk-dev] [PATCH v2] virtio: Use cpuflag for vector api
Check cpuflag macro before using vectored api. -virtio_recv_pkts_vec() uses _sse3__ simd instruction for now so added cpuflag. - Also wrap other vectored freind api ie.. 1) virtqueue_enqueue_recv_refill_simple 2) virtio_rxq_vec_setup - removed VIRTIO_PMD=n from armv7/v8 config. todo: 1) Move virtio_recv_pkts_vec() implementation to drivers/virtio/virtio_vec_.h file. 2) Remove use_simple_rxtx flag, so that virtio/virtio_vec_.h files to provide vectored/non-vectored rx/tx apis. Signed-off-by: Santosh Shukla --- - v2: Removed VIRTIO_PMD=n from arm v7/v8 - v1: This is a rework of patch [1]. Note: This patch will let non-x86 arch to use virtio pmd. [1] http://dpdk.org/dev/patchwork/patch/10429/ config/defconfig_arm-armv7a-linuxapp-gcc |1 - config/defconfig_arm64-armv8a-linuxapp-gcc |1 - drivers/net/virtio/virtio_rxtx.c | 16 +++- drivers/net/virtio/virtio_rxtx.h |2 ++ drivers/net/virtio/virtio_rxtx_simple.c| 11 ++- 5 files changed, 27 insertions(+), 4 deletions(-) diff --git a/config/defconfig_arm-armv7a-linuxapp-gcc b/config/defconfig_arm-armv7a-linuxapp-gcc index cbebd64..4bfdfad 100644 --- a/config/defconfig_arm-armv7a-linuxapp-gcc +++ b/config/defconfig_arm-armv7a-linuxapp-gcc @@ -70,7 +70,6 @@ CONFIG_RTE_LIBRTE_I40E_PMD=n CONFIG_RTE_LIBRTE_IXGBE_PMD=n CONFIG_RTE_LIBRTE_MLX4_PMD=n CONFIG_RTE_LIBRTE_MPIPE_PMD=n -CONFIG_RTE_LIBRTE_VIRTIO_PMD=n CONFIG_RTE_LIBRTE_VMXNET3_PMD=n CONFIG_RTE_LIBRTE_PMD_XENVIRT=n CONFIG_RTE_LIBRTE_PMD_BNX2X=n diff --git a/config/defconfig_arm64-armv8a-linuxapp-gcc b/config/defconfig_arm64-armv8a-linuxapp-gcc index eacd01c..f6f5d18 100644 --- a/config/defconfig_arm64-armv8a-linuxapp-gcc +++ b/config/defconfig_arm64-armv8a-linuxapp-gcc @@ -44,7 +44,6 @@ CONFIG_RTE_TOOLCHAIN="gcc" CONFIG_RTE_TOOLCHAIN_GCC=y CONFIG_RTE_IXGBE_INC_VECTOR=n -CONFIG_RTE_LIBRTE_VIRTIO_PMD=n CONFIG_RTE_LIBRTE_IVSHMEM=n CONFIG_RTE_LIBRTE_FM10K_PMD=n CONFIG_RTE_LIBRTE_I40E_PMD=n diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c index 41a1366..ec0b8de 100644 --- a/drivers/net/virtio/virtio_rxtx.c +++ b/drivers/net/virtio/virtio_rxtx.c @@ -67,7 +67,9 @@ #define VIRTIO_SIMPLE_FLAGS ((uint32_t)ETH_TXQ_FLAGS_NOMULTSEGS | \ ETH_TXQ_FLAGS_NOOFFLOADS) +#ifdef RTE_MACHINE_CPUFLAG_SSSE3 static int use_simple_rxtx; +#endif static void vq_ring_free_chain(struct virtqueue *vq, uint16_t desc_idx) @@ -307,12 +309,13 @@ virtio_dev_vring_start(struct virtqueue *vq, int queue_type) nbufs = 0; error = ENOSPC; +#ifdef RTE_MACHINE_CPUFLAG_SSSE3 if (use_simple_rxtx) for (i = 0; i < vq->vq_nentries; i++) { vq->vq_ring.avail->ring[i] = i; vq->vq_ring.desc[i].flags = VRING_DESC_F_WRITE; } - +#endif memset(&vq->fake_mbuf, 0, sizeof(vq->fake_mbuf)); for (i = 0; i < RTE_PMD_VIRTIO_RX_MAX_BURST; i++) vq->sw_ring[vq->vq_nentries + i] = &vq->fake_mbuf; @@ -325,9 +328,11 @@ virtio_dev_vring_start(struct virtqueue *vq, int queue_type) /** * Enqueue allocated buffers* ***/ +#ifdef RTE_MACHINE_CPUFLAG_SSSE3 if (use_simple_rxtx) error = virtqueue_enqueue_recv_refill_simple(vq, m); else +#endif error = virtqueue_enqueue_recv_refill(vq, m); if (error) { rte_pktmbuf_free(m); @@ -340,6 +345,7 @@ virtio_dev_vring_start(struct virtqueue *vq, int queue_type) PMD_INIT_LOG(DEBUG, "Allocated %d bufs", nbufs); } else if (queue_type == VTNET_TQ) { +#ifdef RTE_MACHINE_CPUFLAG_SSSE3 if (use_simple_rxtx) { int mid_idx = vq->vq_nentries >> 1; for (i = 0; i < mid_idx; i++) { @@ -357,6 +363,7 @@ virtio_dev_vring_start(struct virtqueue *vq, int queue_type) for (i = mid_idx; i < vq->vq_nentries; i++) vq->vq_ring.avail->ring[i] = i; } +#endif } } @@ -423,7 +430,9 @@ virtio_dev_rx_queue_setup(struct rte_eth_dev *dev, dev->data->rx_queues[queue_idx] = vq; +#ifdef RTE_MACHINE_CPUFLAG_SSSE3 virtio_rxq_vec_setup(vq); +#endif return 0; } @@ -449,7 +458,10 @@ virtio_dev_tx_queue_setup(struct rte_eth_dev *dev, const struct rte_eth_txconf *tx_conf) { uint8_t vtpci_queue_idx = 2 * queue_idx + VTNET_SQ_TQ_QUEUE_IDX; + +#ifdef RTE_MACHINE_CPUFLAG_SSSE3 struct virtio_hw *hw = dev->data->dev_private; +#endif struct virtqueue *vq; uint16_t tx_fre
[dpdk-dev] [PATCH v2] doc/nic: add ixgbe statistics on read frequency
> -Original Message- > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Harry van Haaren > Sent: Monday, February 29, 2016 1:17 PM > To: Mcnamara, John > Cc: dev at dpdk.org > Subject: [dpdk-dev] [PATCH v2] doc/nic: add ixgbe statistics on read > frequency > > This patch adds a note to the ixgbe PMD guide, stating the minimum time > that statistics must be polled from the hardware in order to avoid register > values becoming saturated and "sticking" to the max value. > > Reported-by: Jerry Zhang > Tested-by: Marcin Kerlin > Signed-off-by: Harry van Haaren Acked-by: Marcin Kerlin
[dpdk-dev] [PATCH v4 4/4] virtio: return 1 to tell the upper layer we don't take over this device
On Fri, Feb 26, 2016 at 7:23 AM, Huawei Xie wrote: > v4 changes: > Rebase as io port map is moved to eal. > Only fall back to PORT IO when there isn't any kernel driver (including Pl. mention that fallback behaviour applicable to x86 arch only.. However this patch fixes one problem in non-x86 arch issue, Example: VM has 8 virtio interface and 2 i/f attached out of 8, so in default case - after 2nd interface, ioport try to program 3..8 ports, result to failure, lead to exit dpdk application. Patch fixes this problem for non-x86 arch, test on arm64 platform. > VFIO/UIO) managing the device. Before v4, we fall back to PORT IO even if > VFIO/UIO fails. > Reword the commit message. > > v3 changes: > Change log message to tell user that the virtio device is skipped > due to it is managed by kernel driver, instead of asking user to > unbind it from kernel driver. > > v2 changes: > Remove unnecessary assignment of NULL to dev->data->mac_addrs. > Ajust one comment's position. > > virtio PMD could use IO port to configure the virtio device without > using UIO/VFIO driver in legacy mode. > > There are two issues with the previous implementation: > 1) virtio PMD will take over the virtio device(s) blindly even if not > intended for DPDK. > 2) driver conflict between virtio PMD and virtio-net kernel driver. > > This patch checks if there is kernel driver other than UIO/VFIO managing > the virtio device before using port IO. > > If legacy_virtio_resource_init fails and kernel driver other than > VFIO/UIO is managing the device, return 1 to tell the upper layer we > don't take over this device. > For all other IO port mapping errors, return -1. > > Note than if VFIO/UIO fails, now we don't fall back to port IO. > > Fixes: da978dfdc43b ("virtio: use port IO to get PCI resource") > > Signed-off-by: Huawei Xie > --- > drivers/net/virtio/virtio_ethdev.c | 9 +++-- > drivers/net/virtio/virtio_pci.c| 15 ++- > 2 files changed, 21 insertions(+), 3 deletions(-) > > diff --git a/drivers/net/virtio/virtio_ethdev.c > b/drivers/net/virtio/virtio_ethdev.c > index caa970c..8601080 100644 > --- a/drivers/net/virtio/virtio_ethdev.c > +++ b/drivers/net/virtio/virtio_ethdev.c > @@ -1,4 +1,5 @@ > /*- > + > * BSD LICENSE > * > * Copyright(c) 2010-2015 Intel Corporation. All rights reserved. > @@ -1015,6 +1016,7 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev) > struct virtio_net_config *config; > struct virtio_net_config local_config; > struct rte_pci_device *pci_dev; > + int ret; > > RTE_BUILD_BUG_ON(RTE_PKTMBUF_HEADROOM < sizeof(struct > virtio_net_hdr)); > > @@ -1037,8 +1039,11 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev) > > pci_dev = eth_dev->pci_dev; > > - if (vtpci_init(pci_dev, hw) < 0) > - return -1; > + ret = vtpci_init(pci_dev, hw); > + if (ret) { > + rte_free(eth_dev->data->mac_addrs); > + return ret; > + } > > /* Reset the device although not necessary at startup */ > vtpci_reset(hw); > diff --git a/drivers/net/virtio/virtio_pci.c b/drivers/net/virtio/virtio_pci.c > index 85fbe88..f159b2a 100644 > --- a/drivers/net/virtio/virtio_pci.c > +++ b/drivers/net/virtio/virtio_pci.c > @@ -622,6 +622,13 @@ next: > return 0; > } > > +/* > + * Return -1: > + * if there is error mapping with VFIO/UIO. > + * if port map error when driver type is KDRV_NONE. > + * Return 1 if kernel driver is managing the device. > + * Return 0 on success. > + */ > int > vtpci_init(struct rte_pci_device *dev, struct virtio_hw *hw) > { > @@ -641,8 +648,14 @@ vtpci_init(struct rte_pci_device *dev, struct virtio_hw > *hw) > } > > PMD_INIT_LOG(INFO, "trying with legacy virtio pci."); > - if (legacy_virtio_resource_init(dev, hw) < 0) > + if (legacy_virtio_resource_init(dev, hw) < 0) { > + if (dev->kdrv == RTE_KDRV_UNKNOWN) { > + PMD_INIT_LOG(INFO, > + "skip kernel managed virtio device."); > + return 1; > + } > return -1; > + } > > hw->vtpci_ops = &legacy_ops; > hw->use_msix = legacy_virtio_has_msix(&dev->addr); Tested-by: Santosh Shukla Acked-by: Santosh Shukla > -- > 1.8.1.4 >
[dpdk-dev] [PATCH] mk: add makefile extention support
>2016-02-28 21:47, Wiles, Keith: >> >Hi, >> > >> >2016-02-09 11:35, Keith Wiles: >> >> Adding support to the build system to allow for Makefile.XXX >> >> extention to a subtree, which already has Makefiles. These >> >> Makefiles could be from the autotools and others places. Using >> >> the Makefile extention RTE_MKFILE_SUFFIX in a makefile subtree >> >> using 'export RTE_MKFILE_SUFFIX=.XXX' to use Makefile.XXX in >> >> that subtree. >> >> >> >> The main reason I needed this feature was to integrate a autotool >> >> open source projects with DPDK and keep the original Makefiles. >> > >> >Sorry I fail to understand why it is needed. >> >Are you trying to add autotool in DPDK? I don't think it is a good approach. >> >The DPDK must provide a pkgconfig interface to be integrated anywhere. >> >> I was not trying to add autotools to DPDK. On a number of times I wanted to >> integrate a open source project(s) with DPDK and use DPDK?s build system, >> but because the open source project already contained Makefile files you can >> not use DPDK build system without modify or moving the original Makefile >> files. Using this method I can just add a exported variable and supply my >> own Makefile.XXX files. >> >> One case was building FreeBSD source, but I did not want to modify FreeBSD >> Makefiles (or reply on previous built Makefiles as they would not work on >> Linux anyway) as I was pulling the source down from freebsd.org repo. Using >> a patch to add the Makefiles with a different suffix allows me to build >> FreeBSD using DPDK, without having to modify or own the FreeBSD source. I >> have had this problem a number of times with open source code I did not want >> to modify, but just build within DPDK build system and adding the support >> for a different suffix to DPDK provided a clean way. The change does not >> effect the correct build system and just allows someone to define a new >> suffix for a given subtree in the code. > >Why would you like to have another project inside the DPDK files tree? >If you want to integrate the lib inside an existing project, the solution >is pkgconfig. The goal for me was to use DPDK build system for that project, instead of using autotools or some other makefile system. In the case of FreeBSD code, the FreeBSD build system requires FreeBSD tools to be built as the ?make? and the Makefiles are very different on a Linux machine. > Regards, Keith
[dpdk-dev] Issue with configuring iproute using netdpcmd and running opendp
Hi, I am trying to configure the iproute using netdpcmd(from dpdk-odp repository), but it is failing. Kindly help to resolve this issue. root at ICSCHELAP1003:/home/hadmin/Mari/dpdk-odp/netdp_cmd# ./build/netdpcmd EAL: Detected lcore 0 as core 0 on socket 0 EAL: Detected lcore 1 as core 1 on socket 0 EAL: Detected lcore 2 as core 0 on socket 0 EAL: Detected lcore 3 as core 1 on socket 0 EAL: Support maximum 128 logical core(s) by configuration. EAL: Detected 4 lcore(s) EAL: Setting up physically contiguous memory... EAL: Analysing 64 files EAL: Mapped segment 0 of size 0x20 EAL: Mapped segment 1 of size 0x40 EAL: Mapped segment 2 of size 0x60 EAL: Mapped segment 3 of size 0x20 EAL: Mapped segment 4 of size 0x20 EAL: Mapped segment 5 of size 0x20 EAL: Mapped segment 6 of size 0x20 EAL: Mapped segment 7 of size 0x20 EAL: Mapped segment 8 of size 0x20 EAL: Mapped segment 9 of size 0x20 EAL: Mapped segment 10 of size 0x40 EAL: Mapped segment 11 of size 0x40 EAL: Mapped segment 12 of size 0x40 EAL: Mapped segment 13 of size 0x20 EAL: Mapped segment 14 of size 0x220 EAL: Mapped segment 15 of size 0x100 EAL: Mapped segment 16 of size 0x20 EAL: Mapped segment 17 of size 0x40 EAL: Mapped segment 18 of size 0x200 EAL: memzone_reserve_aligned_thread_unsafe(): memzone already exists RING: Cannot reserve memory EAL: TSC frequency is ~1895612 KHz EAL: Master lcore 0 is ready (tid=f7fdc940;cpuset=[0]) Lookup ring(NETDP_CTRL_PRI_2_SEC) failed PANIC in main(): Cannot init ring 5: [./build/netdpcmd() [0x42c223]] 4: [/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x77105ec5]] 3: [./build/netdpcmd() [0x42ab3c]] 2: [./build/netdpcmd(__rte_panic+0xc9) [0x424f31]] 1: [./build/netdpcmd(rte_dump_stack+0x28) [0x495128]] Aborted Also, i am getting the below error while running opendp. root at ICSCHELAP1003:/home/hadmin/Mari/dpdk-odp/opendp# ./build/opendp -c 0x1 -n 1 -- -p 0x1 --config="(0,0,0)" EAL: Detected lcore 0 as core 0 on socket 0 EAL: Detected lcore 1 as core 1 on socket 0 EAL: Detected lcore 2 as core 0 on socket 0 EAL: Detected lcore 3 as core 1 on socket 0 EAL: Support maximum 128 logical core(s) by configuration. EAL: Detected 4 lcore(s) EAL: VFIO modules not all loaded, skip VFIO support... EAL: Setting up physically contiguous memory... EAL: Ask a virtual area of 0x20 bytes EAL: Virtual area found at 0x76e0 (size = 0x20) EAL: Ask a virtual area of 0x40 bytes EAL: Virtual area found at 0x7680 (size = 0x40) EAL: Ask a virtual area of 0x60 bytes EAL: Virtual area found at 0x7600 (size = 0x60) EAL: Ask a virtual area of 0x20 bytes EAL: Virtual area found at 0x75c0 (size = 0x20) EAL: Ask a virtual area of 0x20 bytes EAL: Virtual area found at 0x7580 (size = 0x20) EAL: Ask a virtual area of 0x20 bytes EAL: Virtual area found at 0x7540 (size = 0x20) EAL: Ask a virtual area of 0x20 bytes EAL: Virtual area found at 0x7500 (size = 0x20) EAL: Ask a virtual area of 0x20 bytes EAL: Virtual area found at 0x74c0 (size = 0x20) EAL: Ask a virtual area of 0x20 bytes EAL: Virtual area found at 0x7480 (size = 0x20) EAL: Ask a virtual area of 0x20 bytes EAL: Virtual area found at 0x7440 (size = 0x20) EAL: Ask a virtual area of 0x40 bytes EAL: Virtual area found at 0x73e0 (size = 0x40) EAL: Ask a virtual area of 0x40 bytes EAL: Virtual area found at 0x7380 (size = 0x40) EAL: Ask a virtual area of 0x40 bytes EAL: Virtual area found at 0x7320 (size = 0x40) EAL: Ask a virtual area of 0x20 bytes EAL: Virtual area found at 0x72e0 (size = 0x20) EAL: Ask a virtual area of 0x220 bytes EAL: Virtual area found at 0x70a0 (size = 0x220) EAL: Ask a virtual area of 0x100 bytes EAL: Virtual area found at 0x7fffef80 (size = 0x100) EAL: Ask a virtual area of 0x20 bytes EAL: Virtual area found at 0x7fffef40 (size = 0x20) EAL: Ask a virtual area of 0x40 bytes EAL: Virtual area found at 0x7fffeee0 (size = 0x40) EAL: Ask a virtual area of 0x200 bytes EAL: Virtual area found at 0x7fffecc0 (size = 0x200) EAL: Requesting 64 pages of size 2MB from socket 0 EAL: TSC frequency is ~1895612 KHz EAL: Master lcore 0 is ready (tid=f7fdc980;cpuset=[0]) param nb 1 ports 0 port id 0 port 0 is not present on the board EAL: Error - exiting with code: 1 Cause: check_port_config failed Below is my ifconfig, do i need to configure anything before running opendp ? root at ICSCHELAP1003:/home/hadmin/Mari/dpdk-odp/opendp# ifconfig eth13 Link encap:Ethernet HWaddr 34:e6:d7:2b:89:60 inet addr:172.27.10.27 Bcast:172.27.10.255 Mask:255.255.255.0 inet6 addr: fe80::36e6:d7ff:fe2b:8960/64 Scope:Link UP BROADCAST RUNNING PROMISC MULTICAST MTU:1500 Metric:1
[dpdk-dev] [PATCH 1/3] kcp: add kernel control path kernel module
On Mon, Feb 29, 2016 at 5:06 AM, Thomas Monjalon wrote: > Hi, > I totally agree with Avi's comments. > This topic is really important for the future of DPDK. > So I think we must give some time to continue the discussion > and have netdev involved in the choices done. > As a consequence, these series should not be merged in the release 16.04. > Thanks for continuing the work. > I know you guys are very interested in getting rid of the out-of-tree drivers, but please do not block incremental improvements to DPDK in the meantime. Ferruh's patch improves the usability of KNI. Don't throw out good and useful enhancements just because it isn't where you want to be in the end. I'd like to see these be merged. Jay
[dpdk-dev] [PATCH 1/3] kcp: add kernel control path kernel module
On 2/29/2016 11:39 AM, Avi Kivity wrote: > > > On 02/29/2016 01:27 PM, Ferruh Yigit wrote: >> On 2/29/2016 10:58 AM, Avi Kivity wrote: >>> >>> On 02/29/2016 12:43 PM, Ferruh Yigit wrote: On 2/29/2016 9:43 AM, Avi Kivity wrote: > On 02/28/2016 10:16 PM, Ferruh Yigit wrote: >> On 2/28/2016 3:34 PM, Avi Kivity wrote: >>> On 01/27/2016 06:24 PM, Ferruh Yigit wrote: This kernel module is based on KNI module, but this one is stripped version of it and only for control messages, no data transfer functionality provided. This Linux kernel module helps userspace application create virtual interfaces and when a control command issued into that virtual interface, module pushes the command to the userspace and gets the response back for the caller application. The Linux tools like ethtool/ifconfig/ip can be used on virtual interfaces but not ones for related data, like tcpdump. In long term this patch intends to replace the KNI and KNI will be depreciated. >>> Instead of adding yet another out-of-tree kernel module, why not >>> extend >>> the existing in-tree tap driver? This will make everyone's life >>> easier. >>> >>> Since tap also supports data transfer, an application can also >>> forward >>> packets not intended to it to the kernel, and forward packets >>> from the >>> kernel through the device. >>> >> Hi Avi, >> >> KDP (Kernel Data Path) does what you have described, it is >> implemented >> as PMD and it benefits from tap driver to data transfer through the >> kernel. It also support custom kernel module for better performance. >> >> For KCP (Kernel Control Path), network driver forwards control >> commands >> to the userspace driver, I doubt this is something wanted for tun/tap >> driver, so extending tun/tap driver like this can be hard to >> upstream. > Have you tried asking? Maybe if you explain it they will be open > to the > extension. > Not communicated but tun/tap already doing something different. For KCP, created interface is map of the DPDK port. All data interface shows coming from DPDK port. For example if you get stats information with ifconfig, the values you observe are DPDK port statistics -not statistics of data between userspace and kernelspace, statistics of data forwarded between DPDK ports. If you down the interface, DPDK port stopped, etc... If you extend the tun/tap, it won't be map of the DPDK port, and if you get statistics information from that interface, what do you expect to see, the data transferred between kernel and userspace, or underlying DPDK port forwarding statistics? >>> Good point. But you really have to involve netdev on this, or you'll >>> live out-of-tree forever. >>> >> Why do we need to touch netdev? > > By netdev, I meant the mailing list. If you don't touch it, your driver > will remain out-of-tree forever. > Sorry, I thought you are suggesting updating netdev (struct net_device) for this. >> A simple network driver, similar to kcp, can be solution. >> >> This driver implements all net_device_ops and ethtool_ops in a way to >> forward everything to the userspace via netlink. All needs to know about >> userspace driver is it's unique id. Any userspace application, not only >> DPDK drivers, can listen the netlink messages and response to the >> requests come to itself. >> >> This kind of driver is not big or complicated, kcp already does %90 of >> what described above. > > I am not arguing against kcp. It fulfills an important need. This is > my argument: > > 1. having multiple interfaces for the control and data path is bad for > the user > 2. therefore, we need to either add tap functionality to kcp, or add kcp > functionality to tap > 3. netdev@ is more likely (IMO) to accept additional functionality to > tap than a new driver, but the only way to know is to engage with them > Agreed an incremental update to the tap can be easier to get in, but this is not really working for us, as explained above. The concern of having two separate interfaces can be solved without merging data and control path. I believe this is not a showstopper for the functionality and can be the incremental improvement. >> Extending tun/tap in a way we want, forwarding all control commands to userspace, will break the current tun/tap, this doesn't looks like a valid option to me. >>> It's possible to enhance it while preserving backwards compatibility, by >>> enabling a feature flag (statistics from userspace). >>> For data path, using tun/tap is OK and we are already doing it, for the control path I believe we need a new driver. > Certainly it will be better to have KCP and KDP use the same kernel > interface name; so we'll need
[dpdk-dev] [PATCH v2 1/2] librte_pipeline: add support for packet redirection at action handlers
Currently, there is no mechanism that allows the pipeline ports (in/out) and table action handlers to override the default forwarding decision (as previously configured per input port or in the table entry). Therefore, new pipeline API functions have been added which allows action handlers to hijack packets and remove them from the pipeline processing, and then either drop them or send them out of the pipeline on any output port. The port (in/out) and table action handler prototypes have been changed for making use of these new API functions. This feature will be helpful to implement functions such as exception handling (e.g. TTL =0), load balancing etc. Signed-off-by: Jasvinder Singh Acked-by: Cristian Dumitrescu --- v2: * rebased on master doc/guides/rel_notes/deprecation.rst | 5 - doc/guides/rel_notes/release_16_04.rst | 6 +- lib/librte_pipeline/Makefile | 4 +- lib/librte_pipeline/rte_pipeline.c | 461 ++- lib/librte_pipeline/rte_pipeline.h | 98 +++--- lib/librte_pipeline/rte_pipeline_version.map | 8 + 6 files changed, 308 insertions(+), 274 deletions(-) diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst index e94d4a2..1a7d660 100644 --- a/doc/guides/rel_notes/deprecation.rst +++ b/doc/guides/rel_notes/deprecation.rst @@ -40,11 +40,6 @@ Deprecation Notices * The scheduler statistics structure will change to allow keeping track of RED actions. -* librte_pipeline: The prototype for the pipeline input port, output port - and table action handlers will be updated: - the pipeline parameter will be added, the packets mask parameter will be - either removed (for input port action handler) or made input-only. - * ABI changes are planned in cmdline buffer size to allow the use of long commands (such as RETA update in testpmd). This should impact CMDLINE_PARSE_RESULT_BUFSIZE, STR_TOKEN_SIZE and RDLINE_BUF_SIZE. diff --git a/doc/guides/rel_notes/release_16_04.rst b/doc/guides/rel_notes/release_16_04.rst index e2219d0..bbfd248 100644 --- a/doc/guides/rel_notes/release_16_04.rst +++ b/doc/guides/rel_notes/release_16_04.rst @@ -118,6 +118,10 @@ ABI Changes the previous releases and made in this release. Use fixed width quotes for ``rte_function_names`` or ``rte_struct_names``. Use the past tense. +* librte_pipeline: The prototype for the pipeline input port, output port + and table action handlers are updated:the pipeline parameter is added, + the packets mask parameter has been either removed or made input-only. + Shared Library Versions --- @@ -144,7 +148,7 @@ The libraries prepended with a plus sign were incremented in this version. librte_mbuf.so.2 librte_mempool.so.1 librte_meter.so.1 - librte_pipeline.so.2 + + librte_pipeline.so.3 librte_pmd_bond.so.1 librte_pmd_ring.so.2 librte_port.so.2 diff --git a/lib/librte_pipeline/Makefile b/lib/librte_pipeline/Makefile index 1166d3c..822fd41 100644 --- a/lib/librte_pipeline/Makefile +++ b/lib/librte_pipeline/Makefile @@ -1,6 +1,6 @@ # BSD LICENSE # -# Copyright(c) 2010-2014 Intel Corporation. All rights reserved. +# Copyright(c) 2010-2016 Intel Corporation. All rights reserved. # All rights reserved. # # Redistribution and use in source and binary forms, with or without @@ -41,7 +41,7 @@ CFLAGS += $(WERROR_FLAGS) EXPORT_MAP := rte_pipeline_version.map -LIBABIVER := 2 +LIBABIVER := 3 # # all source are stored in SRCS-y diff --git a/lib/librte_pipeline/rte_pipeline.c b/lib/librte_pipeline/rte_pipeline.c index d625fd2..87f7634 100644 --- a/lib/librte_pipeline/rte_pipeline.c +++ b/lib/librte_pipeline/rte_pipeline.c @@ -1,7 +1,7 @@ /*- * BSD LICENSE * - * Copyright(c) 2010-2014 Intel Corporation. All rights reserved. + * Copyright(c) 2010-2016 Intel Corporation. All rights reserved. * All rights reserved. * * Redistribution and use in source and binary forms, with or without @@ -49,14 +49,30 @@ #define RTE_TABLE_INVALID UINT32_MAX #ifdef RTE_PIPELINE_STATS_COLLECT -#define RTE_PIPELINE_STATS_ADD(counter, val) \ - ({ (counter) += (val); }) -#define RTE_PIPELINE_STATS_ADD_M(counter, mask) \ - ({ (counter) += __builtin_popcountll(mask); }) +#define RTE_PIPELINE_STATS_AH_DROP_WRITE(p, mask) \ + ({ (p)->n_pkts_ah_drop = __builtin_popcountll(mask); }) + +#define RTE_PIPELINE_STATS_AH_DROP_READ(p, counter)\ + ({ (counter) += (p)->n_pkts_ah_drop; (p)->n_pkts_ah_drop = 0; }) + +#define RTE_PIPELINE_STATS_TABLE_DROP0(p) \ + ({ (p)->pkts_drop_mask = (p)->action_mask0[RTE_PIPELINE_ACTION_DROP]; }) + +#define RTE_PIPELINE_STATS_TABLE_DROP1(p, counter) \ +({ \ + uint64_t mask = (p)->action_mask0[RTE_
[dpdk-dev] [PATCH v2 2/2] modify action handlers in test_pipeline and ip_pipeline
Changes are made to the ports and table action handlers defined in app/test_pipeline and ip_pipeline sample application. Signed-off-by: Jasvinder Singh Acked-by: Cristian Dumitrescu --- app/test-pipeline/pipeline_acl.c | 3 +- app/test-pipeline/pipeline_hash.c | 3 +- app/test-pipeline/pipeline_lpm.c | 3 +- app/test-pipeline/pipeline_lpm_ipv6.c | 3 +- app/test-pipeline/pipeline_stub.c | 3 +- .../ip_pipeline/pipeline/pipeline_actions_common.h | 47 +- .../ip_pipeline/pipeline/pipeline_firewall_be.c| 3 +- .../pipeline/pipeline_flow_actions_be.c| 3 +- .../pipeline/pipeline_flow_classification_be.c | 3 +- .../ip_pipeline/pipeline/pipeline_passthrough_be.c | 3 +- .../ip_pipeline/pipeline/pipeline_routing_be.c | 3 +- 11 files changed, 37 insertions(+), 40 deletions(-) diff --git a/app/test-pipeline/pipeline_acl.c b/app/test-pipeline/pipeline_acl.c index f163e55..22d5f36 100644 --- a/app/test-pipeline/pipeline_acl.c +++ b/app/test-pipeline/pipeline_acl.c @@ -1,7 +1,7 @@ /*- * BSD LICENSE * - * Copyright(c) 2010-2014 Intel Corporation. All rights reserved. + * Copyright(c) 2010-2016 Intel Corporation. All rights reserved. * All rights reserved. * * Redistribution and use in source and binary forms, with or without @@ -159,7 +159,6 @@ app_main_loop_worker_pipeline_acl(void) { .ops = &rte_port_ring_writer_ops, .arg_create = (void *) &port_ring_params, .f_action = NULL, - .f_action_bulk = NULL, .arg_ah = NULL, }; diff --git a/app/test-pipeline/pipeline_hash.c b/app/test-pipeline/pipeline_hash.c index 8b888d7..f8aac0d 100644 --- a/app/test-pipeline/pipeline_hash.c +++ b/app/test-pipeline/pipeline_hash.c @@ -1,7 +1,7 @@ /*- * BSD LICENSE * - * Copyright(c) 2010-2014 Intel Corporation. All rights reserved. + * Copyright(c) 2010-2016 Intel Corporation. All rights reserved. * All rights reserved. * * Redistribution and use in source and binary forms, with or without @@ -140,7 +140,6 @@ app_main_loop_worker_pipeline_hash(void) { .ops = &rte_port_ring_writer_ops, .arg_create = (void *) &port_ring_params, .f_action = NULL, - .f_action_bulk = NULL, .arg_ah = NULL, }; diff --git a/app/test-pipeline/pipeline_lpm.c b/app/test-pipeline/pipeline_lpm.c index 2d7bc01..916abd4 100644 --- a/app/test-pipeline/pipeline_lpm.c +++ b/app/test-pipeline/pipeline_lpm.c @@ -1,7 +1,7 @@ /*- * BSD LICENSE * - * Copyright(c) 2010-2014 Intel Corporation. All rights reserved. + * Copyright(c) 2010-2016 Intel Corporation. All rights reserved. * All rights reserved. * * Redistribution and use in source and binary forms, with or without @@ -99,7 +99,6 @@ app_main_loop_worker_pipeline_lpm(void) { .ops = &rte_port_ring_writer_ops, .arg_create = (void *) &port_ring_params, .f_action = NULL, - .f_action_bulk = NULL, .arg_ah = NULL, }; diff --git a/app/test-pipeline/pipeline_lpm_ipv6.c b/app/test-pipeline/pipeline_lpm_ipv6.c index c895b62..3352e89 100644 --- a/app/test-pipeline/pipeline_lpm_ipv6.c +++ b/app/test-pipeline/pipeline_lpm_ipv6.c @@ -1,7 +1,7 @@ /*- * BSD LICENSE * - * Copyright(c) 2010-2014 Intel Corporation. All rights reserved. + * Copyright(c) 2010-2016 Intel Corporation. All rights reserved. * All rights reserved. * * Redistribution and use in source and binary forms, with or without @@ -100,7 +100,6 @@ app_main_loop_worker_pipeline_lpm_ipv6(void) { .ops = &rte_port_ring_writer_ops, .arg_create = (void *) &port_ring_params, .f_action = NULL, - .f_action_bulk = NULL, .arg_ah = NULL, }; diff --git a/app/test-pipeline/pipeline_stub.c b/app/test-pipeline/pipeline_stub.c index 0ad6f9b..ba710ca 100644 --- a/app/test-pipeline/pipeline_stub.c +++ b/app/test-pipeline/pipeline_stub.c @@ -1,7 +1,7 @@ /*- * BSD LICENSE * - * Copyright(c) 2010-2014 Intel Corporation. All rights reserved. + * Copyright(c) 2010-2016 Intel Corporation. All rights reserved. * All rights reserved. * * Redistribution and use in source and binary forms, with or without @@ -94,7 +94,6 @@ app_main_loop_worker_pipeline_stub(void) { .ops = &rte_port_ring_writer_ops, .arg_create = (void *) &port_ring_params, .f_action = NULL, - .f_action_bulk = NULL, .arg_ah = NULL, }
[dpdk-dev] [PATCH 1/3] kcp: add kernel control path kernel module
On 2/29/2016 11:35 AM, Ferruh Yigit wrote: > On 2/29/2016 11:06 AM, Thomas Monjalon wrote: >> Hi, >> I totally agree with Avi's comments. >> This topic is really important for the future of DPDK. >> So I think we must give some time to continue the discussion >> and have netdev involved in the choices done. >> As a consequence, these series should not be merged in the release 16.04. >> Thanks for continuing the work. >> > Hi Thomas, > > It is great to have some discussion and feedbacks. > But I doubt not merging in this release will help to have more discussion. > > It is better to have them in this release and let people experiment it, > this gives more chance to better discussion. > > These features are replacement of KNI, and KNI is not intended to be > removed in this release, so who are using KNI as solution can continue > to use KNI and can test KCP/KDP, so that we can get more feedbacks. > One more thing, overall reason of working on KCP/KDP is reduce KNI maintenance cost, and add more features, not to add more maintenance cost. The most maintenance cost of KNI is because of Linux network drivers in it, which KCP removes them, so there is an improvement. Although it is not as good as removing them completely, KCP/KDP is one step closer to be upstreamed than existing KNI. Thanks, ferruh
[dpdk-dev] [PATCH] log: add missing symbol
2016-01-27 10:35, Thomas Monjalon: > 2015-12-16 16:38, Stephen Hemminger: > > rte_get_log_type and rte_get_log_level functions has been avaliable > > for many versions. But they are missing from the shared library map > > and therefore do not get exported correctly. > > > > Signed-off-by: Stephen Hemminger > > --- > > lib/librte_eal/linuxapp/eal/rte_eal_version.map | 2 ++ > > 1 file changed, 2 insertions(+) > > Why only in linuxapp? > > > diff --git a/lib/librte_eal/linuxapp/eal/rte_eal_version.map > > b/lib/librte_eal/linuxapp/eal/rte_eal_version.map > > index cbe175f..51a241c 100644 > > --- a/lib/librte_eal/linuxapp/eal/rte_eal_version.map > > +++ b/lib/librte_eal/linuxapp/eal/rte_eal_version.map > > @@ -93,7 +93,9 @@ DPDK_2.0 { > > rte_realloc; > > rte_set_application_usage_hook; > > rte_set_log_level; > > + rte_get_log_level; > > rte_set_log_type; > > + rte_get_log_type; > > We try to keep an alphabetical order :) Reordered, updated in bsdapp/ and Applied, thanks
[dpdk-dev] [PATCH 1/1] arm: set CONFIG_RTE_ARCH_STRICT_ALIGN=y for armv7 target
2015-12-09 16:16, Jan Viktorin: > This patch reduces number of warnings from 53 to 40. It removes the usual > false > positives utilizing unaligned_uint*_t data types. > > Signed-off-by: Jan Viktorin Applied, thanks Jan, what is the problem with the other ARM alignment warnings? Can they be fixed?
[dpdk-dev] [PATCH 1/3] kcp: add kernel control path kernel module
On 02/29/2016 01:35 PM, Ferruh Yigit wrote: > On 2/29/2016 11:06 AM, Thomas Monjalon wrote: >> Hi, >> I totally agree with Avi's comments. >> This topic is really important for the future of DPDK. >> So I think we must give some time to continue the discussion >> and have netdev involved in the choices done. >> As a consequence, these series should not be merged in the release 16.04. >> Thanks for continuing the work. >> > Hi Thomas, > > It is great to have some discussion and feedbacks. > But I doubt not merging in this release will help to have more discussion. > > It is better to have them in this release and let people experiment it, > this gives more chance to better discussion. > > These features are replacement of KNI, and KNI is not intended to be > removed in this release, so who are using KNI as solution can continue > to use KNI and can test KCP/KDP, so that we can get more feedbacks. So make the work available from a separate git repo and make it easy for people to experiment with it. Code doesn't have to be in a release for the sake of experimenting, and removing code is much harder than not adding it in the first place, witness KNI. - Panu -
[dpdk-dev] [PATCH 1/3] kcp: add kernel control path kernel module
2016-02-29 17:19, Panu Matilainen: > On 02/29/2016 01:35 PM, Ferruh Yigit wrote: > > On 2/29/2016 11:06 AM, Thomas Monjalon wrote: > >> Hi, > >> I totally agree with Avi's comments. > >> This topic is really important for the future of DPDK. > >> So I think we must give some time to continue the discussion > >> and have netdev involved in the choices done. > >> As a consequence, these series should not be merged in the release 16.04. > >> Thanks for continuing the work. > >> > > Hi Thomas, > > > > It is great to have some discussion and feedbacks. > > But I doubt not merging in this release will help to have more discussion. > > > > It is better to have them in this release and let people experiment it, > > this gives more chance to better discussion. > > > > These features are replacement of KNI, and KNI is not intended to be > > removed in this release, so who are using KNI as solution can continue > > to use KNI and can test KCP/KDP, so that we can get more feedbacks. > > So make the work available from a separate git repo and make it easy for > people to experiment with it. Code doesn't have to be in a release for > the sake of experimenting, and removing code is much harder than not > adding it in the first place, witness KNI. Good idea. What about a -next tree to experiment on kernel interactions?
[dpdk-dev] [dpdk-dev,v2] Clean up rte_memcpy.h file
> -Original Message- > From: Ravi Kerur [mailto:rkerur at gmail.com] > Sent: Saturday, February 27, 2016 10:06 PM > To: Wang, Zhihong > Cc: dev at dpdk.org > Subject: Re: [dpdk-dev,v2] Clean up rte_memcpy.h file > > > > On Wed, Jan 27, 2016 at 8:18 PM, Zhihong Wang > wrote: > > Remove unnecessary type casting in functions. > > > > Tested on Ubuntu (14.04 x86_64) with "make test". > > "make test" results match the results with baseline. > > "Memcpy perf" results match the results with baseline. > > > > Signed-off-by: Ravi Kerur > > Acked-by: Stephen Hemminger > > > > --- > > .../common/include/arch/x86/rte_memcpy.h? ? ? ? ? ?| 340 +++--- > --- > >? 1 file changed, 175 insertions(+), 165 deletions(-) > > > > diff --git a/lib/librte_eal/common/include/arch/x86/rte_memcpy.h > b/lib/librte_eal/common/include/arch/x86/rte_memcpy.h > > index 6a57426..839d4ec 100644 > > --- a/lib/librte_eal/common/include/arch/x86/rte_memcpy.h > > +++ b/lib/librte_eal/common/include/arch/x86/rte_memcpy.h > > [...] > > >? /** > > @@ -150,13 +150,16 @@ rte_mov64blocks(uint8_t *dst, const uint8_t *src, > size_t n) > >? ? ? ?__m256i ymm0, ymm1; > > > >? ? ? ?while (n >= 64) { > > -? ? ? ? ? ? ?ymm0 = _mm256_loadu_si256((const __m256i *)((const uint8_t > *)src + 0 * 32)); > > + > > +? ? ? ? ? ? ?ymm0 = _mm256_loadu_si256((const __m256i *)(src + 0 * 32)); > > +? ? ? ? ? ? ?ymm1 = _mm256_loadu_si256((const __m256i *)(src + 1 * 32)); > > + > > +? ? ? ? ? ? ?_mm256_storeu_si256((__m256i *)(dst + 0 * 32), ymm0); > > +? ? ? ? ? ? ?_mm256_storeu_si256((__m256i *)(dst + 1 * 32), ymm1); > > + > > Any particular reason to change the order of the statements here? :) > Overall this patch looks good. > > I checked the code changes, initial code had moving ?addresses (src and dst) > and > decrement counter scattered between store and load instructions. I changed it > to > loads, followed by stores and handle address/counters increment/decrement > without changing functionality. > It's definitely okay to do this. Actually changing it or not won't affect the final output at all since gcc will optimize it while generating code. It's C code we're writing after all. But personally I prefer to keep the original order just as a comment that what's needed in the future should be calculated ASAP, and different kinds (CPU port) of instructions should be mixed together. :) Could you please rebase this patch since there has been some changes already? > >? ? ? ? ? ? ? ?n -= 64; > > -? ? ? ? ? ? ?ymm1 = _mm256_loadu_si256((const __m256i *)((const uint8_t > *)src + 1 * 32)); > > -? ? ? ? ? ? ?src = (const uint8_t *)src + 64; > > -? ? ? ? ? ? ?_mm256_storeu_si256((__m256i *)((uint8_t *)dst + 0 * 32), > ymm0); > > -? ? ? ? ? ? ?_mm256_storeu_si256((__m256i *)((uint8_t *)dst + 1 * 32), > ymm1); > > -? ? ? ? ? ? ?dst = (uint8_t *)dst + 64; > > +? ? ? ? ? ? ?src = src + 64; > > +? ? ? ? ? ? ?dst = dst + 64; > >? ? ? ?} > >? } > >
[dpdk-dev] [PATCH 1/1] arm: set CONFIG_RTE_ARCH_STRICT_ALIGN=y for armv7 target
On Mon, 29 Feb 2016 16:14:58 +0100 Thomas Monjalon wrote: > 2015-12-09 16:16, Jan Viktorin: > > This patch reduces number of warnings from 53 to 40. It removes the usual > > false > > positives utilizing unaligned_uint*_t data types. > > > > Signed-off-by: Jan Viktorin > > Applied, thanks > > Jan, what is the problem with the other ARM alignment warnings? > Can they be fixed? This is the full list of warnings I can see on the current origin/master for ARMv7 (42 occurences) including examples (+10 more). The origin of all of them is: cast increases required alignment of target type [-Wcast-align] After skimming through the list, you can see that they are mostly casts to uint32_t * or something similar. I believe that all of them are OK. However, I don't know how to persuade GCC to not be angry... Probably, we can add some explicit alignment of certain structures. app/test/test_thash.c 116 rte_convert_rss_key((uint32_t *)&default_rss_key, 117 (uint32_t *)rss_key_be, RTE_DIM(default_rss_key)); build/include/test_thash.h 179 *((uint32_t *)targ->v6.src_addr + i) = 180 rte_be_to_cpu_32(*((const uint32_t *)orig->src_addr + i)); 181 *((uint32_t *)targ->v6.dst_addr + i) = 182 rte_be_to_cpu_32(*((const uint32_t *)orig->dst_addr + i)); 207 ret ^= rte_cpu_to_be_32(((const uint32_t *)rss_key)[j]) << i | 208 (uint32_t)((uint64_t)(rte_cpu_to_be_32(((const uint32_t *)rss_key)[j + 1])) >> 238 ret ^= ((const uint32_t *)rss_key)[j] << i | 239 (uint32_t)((uint64_t)(((const uint32_t *)rss_key)[j + 1]) >> (32 - i)); examples-sdk/usr/local/share/dpdk/arm-armv7a-linuxapp-gcc/include/rte_mbuf.h 1617 ((t)((char *)(m)->buf_addr + (m)->data_off + (o))) examples/l3fwd-acl/main.c 1074 next = (struct rte_acl_rule *)(route_rules + 1079 next = (struct rte_acl_rule *)(acl_rules + 1115 *pacl_base = (struct rte_acl_rule *)acl_rules; 1117 *proute_base = (struct rte_acl_rule *)route_rules; netmap_user.h 65 #define NETMAP_IF(b, o) (struct netmap_if *)((char *)(b) + (o)) 68 ((struct netmap_ring *)((char *)(nifp) + \ 72 ((struct netmap_ring *)((char *)(nifp) + \ examples/vhost/main.c 121 #define MBUF_HEADROOM_UINT32(mbuf) (*(uint32_t *)((uint8_t *)(mbuf) \ 945 return ((*(uint64_t *)ea ^ *(uint64_t *)eb) & MAC_ADDR_CMP) == 0; lib/librte_acl/acl_gen.c 391 qtrp = (uint32_t *)node->transitions; lib/librte_acl/acl_run.h 46 (*((const int32_t *)((prm)[(idx)].data + *(prm)[idx].data_index++))) lib/librte_eal/linuxapp/eal/eal_interrupts.c 150 irq_set = (struct vfio_irq_set *) irq_set_buf; 156 fd_ptr = (int *) &irq_set->data; 196 irq_set = (struct vfio_irq_set *) irq_set_buf; 239 irq_set = (struct vfio_irq_set *) irq_set_buf; 245 fd_ptr = (int *) &irq_set->data; 267 irq_set = (struct vfio_irq_set *) irq_set_buf; 293 irq_set = (struct vfio_irq_set *) irq_set_buf; 304 fd_ptr = (int *) &irq_set->data; 330 irq_set = (struct vfio_irq_set *) irq_set_buf; lib/librte_eal/linuxapp/eal/eal_pci_vfio_mp_sync.c 176 chdr = (struct cmsghdr *) chdr_buf; 209 chdr = (struct cmsghdr *) chdr_buf; 595 k = (struct rte_hash_key *) ((char *)keys + 615 k = (struct rte_hash_key *) ((char *)keys + 726 k = (struct rte_hash_key *) ((char *)keys + 749 k = (struct rte_hash_key *) ((char *)keys + 841 k = (struct rte_hash_key *) ((char *)keys + 864 k = (struct rte_hash_key *) ((char *)keys + 959 *key_slot = (const struct rte_hash_key *) ((const char *)keys + 1233 next_key = (struct rte_hash_key *) ((char *)h->key_store + lib/librte_sched/rte_bitmap.h 262 bmp = (struct rte_bitmap *) mem; 264 bmp->array1 = (uint64_t *) &mem[array1_byte_offset]; 266 bmp->array2 = (uint64_t *) &mem[array2_byte_offset]; lib/librte_sched/rte_sched.c 684 port->subport = (struct rte_sched_subport *) 687 port->pipe = (struct rte_sched_pipe *) 690 port->queue = (struct rte_sched_queue *) 693 port->queue_extra = (struct rte_sched_queue_extra *) 696 port->pipe_profiles = (struct rte_sched_pipe_profile *) 701 port->queue_array = (struct rte_mbuf **) lib/librte_vhost/vhost_user/virtio-net-user.c 433 rarp = (struct ether_arp *)(eth_hdr + 1); 527 ifr = (struct ifreq *)ifc.ifc_buf; Regards Jan
[dpdk-dev] [PATCH] vhost: broadcast RARP pkt by injecting it to receiving mbuf array
2016-02-22 22:36, Yuanhan Liu: > The wrong mac table lead all the packets to the VM go to the "ovsbr0" > in the end, which ends up with all packets being lost, until the guest > send a ARP quest (or reply) to refresh the mac learning table. > > Jianfeng then came up with a solution I have thought of firstly but NAKed > by myself, concerning it has potential issues [0]. The solution is as title > stated: broadcast the RARP packet by injecting it to the receiving mbuf > arrays at rte_vhost_dequeue_burst(). The re-bring of that idea made me > think it twice; it looked like a false concern to me then. And I had done > a rough verification: it worked as expected. > > [0]: http://dpdk.org/ml/archives/dev/2016-February/033527.html > > Another note is that while preparing this version, I found that DPDK has > some ARP related structures and macros defined. So, use them instead of > the one from standard header files here. > > Cc: Thibaut Collet > Suggested-by: Jianfeng Tan > Signed-off-by: Yuanhan Liu Applied, thanks
[dpdk-dev] [PATCH v3 0/2] cryptodev API changes
On 26/02/16 17:30, Declan Doherty wrote: > This patch set separates the symmetric crypto operations from generic > operations > and then modifies the cryptodev burst API to accept bursts of rte_crypto_op > rather than rte_mbufs. > > V3: > - Addresses V2 comments > - Rebased for head > > Declan Doherty (1): >cryptodev: change burst API to be crypto op oriented > > Fiona Trahe (1): >cryptodev: API tidy and changes to support future extensions > > MAINTAINERS| 6 +- > app/test/test_cryptodev.c | 894 > +++-- > app/test/test_cryptodev.h | 9 +- > app/test/test_cryptodev_perf.c | 270 --- > config/common_bsdapp | 8 - > config/common_linuxapp | 8 - > doc/api/doxy-api-index.md | 1 - > drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c | 199 ++--- > drivers/crypto/aesni_mb/rte_aesni_mb_pmd_ops.c | 18 +- > drivers/crypto/aesni_mb/rte_aesni_mb_pmd_private.h | 6 +- > drivers/crypto/qat/qat_crypto.c| 150 ++-- > drivers/crypto/qat/qat_crypto.h| 14 +- > drivers/crypto/qat/rte_qat_cryptodev.c | 8 +- > examples/l2fwd-crypto/main.c | 300 --- > lib/Makefile | 1 - > lib/librte_cryptodev/Makefile | 1 + > lib/librte_cryptodev/rte_crypto.h | 822 --- > lib/librte_cryptodev/rte_crypto_sym.h | 642 +++ > lib/librte_cryptodev/rte_cryptodev.c | 115 ++- > lib/librte_cryptodev/rte_cryptodev.h | 185 ++--- > lib/librte_cryptodev/rte_cryptodev_pmd.h | 32 +- > lib/librte_cryptodev/rte_cryptodev_version.map | 3 +- > lib/librte_mbuf/rte_mbuf.h | 6 - > lib/librte_mbuf_offload/Makefile | 52 -- > lib/librte_mbuf_offload/rte_mbuf_offload.c | 100 --- > lib/librte_mbuf_offload/rte_mbuf_offload.h | 310 --- > .../rte_mbuf_offload_version.map | 7 - > 27 files changed, 2146 insertions(+), 2021 deletions(-) > create mode 100644 lib/librte_cryptodev/rte_crypto_sym.h > delete mode 100644 lib/librte_mbuf_offload/Makefile > delete mode 100644 lib/librte_mbuf_offload/rte_mbuf_offload.c > delete mode 100644 lib/librte_mbuf_offload/rte_mbuf_offload.h > delete mode 100644 lib/librte_mbuf_offload/rte_mbuf_offload_version.map > self NAK. There is an issue with mis-merged code in __rte_crypto_op_raw_bulk_alloc function in rte_crypto.h
[dpdk-dev] [PATCH 1/3] kcp: add kernel control path kernel module
On 02/29/2016 05:27 PM, Thomas Monjalon wrote: > 2016-02-29 17:19, Panu Matilainen: >> On 02/29/2016 01:35 PM, Ferruh Yigit wrote: >>> On 2/29/2016 11:06 AM, Thomas Monjalon wrote: Hi, I totally agree with Avi's comments. This topic is really important for the future of DPDK. So I think we must give some time to continue the discussion and have netdev involved in the choices done. As a consequence, these series should not be merged in the release 16.04. Thanks for continuing the work. >>> Hi Thomas, >>> >>> It is great to have some discussion and feedbacks. >>> But I doubt not merging in this release will help to have more discussion. >>> >>> It is better to have them in this release and let people experiment it, >>> this gives more chance to better discussion. >>> >>> These features are replacement of KNI, and KNI is not intended to be >>> removed in this release, so who are using KNI as solution can continue >>> to use KNI and can test KCP/KDP, so that we can get more feedbacks. >> >> So make the work available from a separate git repo and make it easy for >> people to experiment with it. Code doesn't have to be in a release for >> the sake of experimenting, and removing code is much harder than not >> adding it in the first place, witness KNI. > > Good idea. > What about a -next tree to experiment on kernel interactions? Here's another, related but more radical (and rather unbaked) idea: Move all the kernel modules and their associated libraries (thinking of KNI here) to a separate repo with perhaps more relaxed rules, but OTOH require upstream kernel support for any features to be included in dpdk itself. Carrot-and-stick of sorts :) - Panu -
[dpdk-dev] [PATCH v2 0/7] vhost rxtx refactor
Hi Yuanhan 2016-02-18 21:49, Yuanhan Liu: > Here is a patchset for refactoring vhost rxtx code, mainly for > improving readability. This series requires to be rebased. And maybe you could check also the series about numa_realloc. Thanks
[dpdk-dev] VIRTIO interface with DPDK in Guest VM not receiving packets
Hi Ajeet, We already tried setting up dpdk with vhostuser as a network attachment option and what we have following observations, 1st, it's not like we need to turn promisc mode enabled for host to guest communication to happen. We can turn them up with specific dpdk application arguments if we need. 2nd, is when we tried cirros 0.3.0, it's found that the cirros 0.3.0 is having a bug which clearly ignores DHCP responses. If you try with cirros 0.3.1 then it will work i guess. 3rd, is Cirros doesn?t get ip ? root cause ? br-dpdk needs to be configured with ip and up. Thanks & Regards Abhijeet Karve =-=-= Notice: The information contained in this e-mail message and/or attachments to it may contain confidential or privileged information. If you are not the intended recipient, any dissemination, use, review, distribution, printing or copying of the information contained in this e-mail message and/or attachments to it are strictly prohibited. If you have received this communication in error, please notify us by reply e-mail or telephone and immediately and permanently delete the message and any attachments. Thank you
[dpdk-dev] [PATCH v2 0/5] add dpdk packet capture support for tcpdump
Hi, > -Original Message- > From: Pavel Fedin [mailto:p.fedin at samsung.com] > Sent: Wednesday, February 24, 2016 3:05 PM > To: Pattan, Reshma > Cc: dev at dpdk.org > Subject: RE: [dpdk-dev] [PATCH v2 0/5] add dpdk packet capture support for > tcpdump > > Hello! > > > > 2. What if i don't want separate RX and TX streams either? It only > > > prevents me from seeing the complete picture. > > > > Do you mean not to have separate pcap files for tx and rx? If so, I > > would prefer to keep this as it is. > > I mean - add an option not to have separate files. OK, I will make changes in v3. > > > > 3. vhostuser ports are missing. Perhaps not really related to this > > > patchset, i just don't know how much code "server" part of vhostuser > > > shares with normal PMDs, but anyway, ability to dump them too would be > nice to have. > > > > > > > I think this can be done in future i.e. when vhost as PMD is > > available. But as of now vhost is library. > > I expected "server"-side vhost to be the same as "client" part (AKA virtio), > just > use another mechanism for exchanging control information (via socket). Is it > not > true? I suppose, driving queues from both sides should be quite symmetric. > At this stage of release adding these changes is difficult as I don't have knowledge on vhost. But at the same if anyone from committee would like to make these enhancements are welcome. Thanks, Reshma
[dpdk-dev] [PATCH v6 1/2] mbuf: provide rte_pktmbuf_alloc_bulk API
2016-02-29 12:51, Panu Matilainen: > On 02/24/2016 03:23 PM, Ananyev, Konstantin wrote: > > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Panu Matilainen > >> On 02/23/2016 07:35 AM, Xie, Huawei wrote: > >>> On 2/22/2016 10:52 PM, Xie, Huawei wrote: > On 2/4/2016 1:24 AM, Olivier MATZ wrote: > > On 01/27/2016 02:56 PM, Panu Matilainen wrote: > >> Since rte_pktmbuf_alloc_bulk() is an inline function, it is not part of > >> the library ABI and should not be listed in the version map. > >> > >> I assume its inline for performance reasons, but then you lose the > >> benefits of dynamic linking such as ability to fix bugs and/or improve > >> itby just updating the library. Since the point of having a bulk API is > >> to improve performance by reducing the number of calls required, does > >> it > >> really have to be inline? As in, have you actually measured the > >> difference between inline and non-inline and decided its worth all the > >> downsides? > > Agree with Panu. It would be interesting to compare the performance > > between inline and non inline to decide whether inlining it or not. > Will update after i gathered more data. inline could show obvious > performance difference in some cases. > >>> > >>> Panu and Oliver: > >>> I write a simple benchmark. This benchmark run 10M rounds, in each round > >>> 8 mbufs are allocated through bulk API, and then freed. > >>> These are the CPU cycles measured(Intel(R) Xeon(R) CPU E5-2680 0 @ > >>> 2.70GHz, CPU isolated, timer interrupt disabled, rcu offloaded). > >>> Btw, i have removed some exceptional data, the frequency of which is > >>> like 1/10. Sometimes observed user usage suddenly disappeared, no clue > >>> what happened. > >>> > >>> With 8 mbufs allocated, there is about 6% performance increase using > >>> inline. > >> [...] > >>> > >>> With 16 mbufs allocated, we could still observe obvious performance > >>> difference, though only 1%-2% > >>> > >> [...] > >>> > >>> With 32/64 mbufs allocated, the deviation of the data itself would hide > >>> the performance difference. > >>> So we prefer using inline for performance. > >> > >> At least I was more after real-world performance in a real-world > >> use-case rather than CPU cycles in a microbenchmark, we know function > >> calls have a cost but the benefits tend to outweight the cons. > >> > >> Inline functions have their place and they're far less evil in project > >> internal use, but in library public API they are BAD and should be ... > >> well, not banned because there are exceptions to every rule, but highly > >> discouraged. > > > > Why is that? > > For all the reasons static linking is bad, and what's worse it forces > the static linking badness into dynamically linked builds. > > If there's a bug (security or otherwise) in a library, a distro wants to > supply an updated package which fixes that bug and be done with it. But > if that bug is in an inlined code, supplying an update is not enough, > you also need to recompile everything using that code, and somehow > inform customers possibly using that code that they need to not only > update the library but to recompile their apps as well. That is > precisely the reason distros go to great lenghts to avoid *any* > statically linked apps and libs in the distro, completely regardless of > the performance overhead. > > In addition, inlined code complicates ABI compatibility issues because > some of the code is one the "wrong" side, and worse, it bypasses all the > other ABI compatibility safeguards like soname and symbol versioning. > > Like said, inlined code is fine for internal consumption, but incredibly > bad for public interfaces. And of course, the more complicated a > function is, greater the potential of needing bugfixes. > > Mind you, none of this is magically specific to this particular > function. Except in the sense that bulk operations offer a better way of > performance improvements than just inlining everything. > > > As you can see right now we have all mbuf alloc/free routines as static > > inline. > > And I think we would like to keep it like that. > > So why that particular function should be different? > > Because there's much less need to have it inlined since the function > call overhead is "amortized" by the fact its doing bulk operations. "We > always did it that way" is not a very good reason :) > > > After all that function is nothing more than a wrapper > > around rte_mempool_get_bulk() unrolled by 4 loop {rte_pktmbuf_reset()} > > So unless mempool get/put API would change, I can hardly see there could be > > any ABI > > breakages in future. > > About 'real world' performance gain - it was a 'real world' performance > > problem, > > that we tried to solve by introducing that function: > > http://dpdk.org/ml/archives/dev/2015-May/017633.html > > > > And according to the user feedback, it does help: > > http://dpdk.or
[dpdk-dev] VIRTIO interface with DPDK in Guest VM not receiving packets
May I kindly ask you to remove this footer from your emails? Thanks > =-=-= > Notice: The information contained in this e-mail > message and/or attachments to it may contain > confidential or privileged information. If you are > not the intended recipient, any dissemination, use, > review, distribution, printing or copying of the > information contained in this e-mail message > and/or attachments to it are strictly prohibited. If > you have received this communication in error, > please notify us by reply e-mail or telephone and > immediately and permanently delete the message > and any attachments. Thank you
[dpdk-dev] [PATCH v7] mbuf: provide rte_pktmbuf_alloc_bulk API
2016-02-28 20:44, Huawei Xie: > v7 changes: > rte_pktmbuf_alloc_bulk isn't exported as API, so shouldn't be listed in > version map > > v6 changes: > reflect the changes in release notes and library version map file > revise our duff's code style a bit to make it more readable > > v5 changes: > add comment about duff's device and our variant implementation > > v3 changes: > move while after case 0 > add context about duff's device and why we use while loop in the commit > message > > v2 changes: > unroll the loop a bit to help the performance > > rte_pktmbuf_alloc_bulk allocates a bulk of packet mbufs. > > There is related thread about this bulk API. > http://dpdk.org/dev/patchwork/patch/4718/ > Thanks to Konstantin's loop unrolling. > > Attached the wiki page about duff's device. It explains the performance > optimization through loop unwinding, and also the most dramatic use of > case label fall-through. > https://en.wikipedia.org/wiki/Duff%27s_device > > In this implementation, while() loop is used because we could not assume > count is strictly positive. Using while() loop saves one line of check. > > Signed-off-by: Gerald Rogers > Signed-off-by: Huawei Xie > Acked-by: Konstantin Ananyev > Acked-by: Olivier Matz Applied, thanks
[dpdk-dev] [PATCH v5 00/11] Add API to get packet type info
> -Original Message- > From: Tan, Jianfeng > Sent: Friday, February 26, 2016 7:34 AM > To: dev at dpdk.org > Cc: Zhang, Helin; Ananyev, Konstantin; nelio.laranjeiro at 6wind.com; > adrien.mazarguil at 6wind.com; rahul.lakkireddy at chelsio.com; > Tan, Jianfeng > Subject: [PATCH v5 00/11] Add API to get packet type info > > To achieve this, a new function pointer, dev_ptype_info_get, is added > into struct eth_dev_ops. For those devices who do not implement it, it > means it will not provide any ptype info. > > v5: > - Exclude l3fwd change from this series, as a separated one. > - Fix malposition of mlx4 code in mlx5 commit introduced in v4. > > v4: > - Change how to use this API: to previously agreement reached in mail. > > v3: > - Change how to use this API: api to allocate mem for storing ptype > array; and caller to free the mem. > - Change how to return back ptypes from PMDs: return a pointer to > corresponding static const array of supported ptypes, terminated > by RTE_PTYPE_UNKNOWN. > - Fix l3fwd parse_packet_type() when EXACT_MATCH is enabled. > - Fix l3fwd memory leak when calling the API. > > v2: > - Move ptype_mask filter function from each PMDs into ether layer. > - Add ixgbe vPMD's ptype info. > - Fix code style issues. > > Signed-off-by: Jianfeng Tan > Acked-by: Konstantin Ananyev > -- > 2.1.4
[dpdk-dev] [PATCH v4 0/2] cryptodev API changes
This patch set separates the symmetric crypto operations from generic operations and then modifies the cryptodev burst API to accept bursts of rte_crypto_op rather than rte_mbufs. V4: - Fixes for issues introduced in __rte_crypto_op_raw_bulk_alloc in V3 patcheset. - Typo fix in cached attribute on rte_crypto_op structure. V3: - Addresses V2 comments - Rebased for head Declan Doherty (1): cryptodev: change burst API to be crypto op oriented Fiona Trahe (1): cryptodev: API tidy and changes to support future extensions MAINTAINERS| 6 +- app/test/test_cryptodev.c | 894 +++-- app/test/test_cryptodev.h | 9 +- app/test/test_cryptodev_perf.c | 270 --- config/common_bsdapp | 8 - config/common_linuxapp | 8 - doc/api/doxy-api-index.md | 1 - drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c | 199 ++--- drivers/crypto/aesni_mb/rte_aesni_mb_pmd_ops.c | 18 +- drivers/crypto/aesni_mb/rte_aesni_mb_pmd_private.h | 6 +- drivers/crypto/qat/qat_crypto.c| 150 ++-- drivers/crypto/qat/qat_crypto.h| 14 +- drivers/crypto/qat/rte_qat_cryptodev.c | 8 +- examples/l2fwd-crypto/main.c | 300 --- lib/Makefile | 1 - lib/librte_cryptodev/Makefile | 1 + lib/librte_cryptodev/rte_crypto.h | 819 +++ lib/librte_cryptodev/rte_crypto_sym.h | 642 +++ lib/librte_cryptodev/rte_cryptodev.c | 115 ++- lib/librte_cryptodev/rte_cryptodev.h | 185 ++--- lib/librte_cryptodev/rte_cryptodev_pmd.h | 32 +- lib/librte_cryptodev/rte_cryptodev_version.map | 3 +- lib/librte_mbuf/rte_mbuf.h | 6 - lib/librte_mbuf_offload/Makefile | 52 -- lib/librte_mbuf_offload/rte_mbuf_offload.c | 100 --- lib/librte_mbuf_offload/rte_mbuf_offload.h | 310 --- .../rte_mbuf_offload_version.map | 7 - 27 files changed, 2143 insertions(+), 2021 deletions(-) create mode 100644 lib/librte_cryptodev/rte_crypto_sym.h delete mode 100644 lib/librte_mbuf_offload/Makefile delete mode 100644 lib/librte_mbuf_offload/rte_mbuf_offload.c delete mode 100644 lib/librte_mbuf_offload/rte_mbuf_offload.h delete mode 100644 lib/librte_mbuf_offload/rte_mbuf_offload_version.map -- 2.5.0
[dpdk-dev] [PATCH v4 1/2] cryptodev: API tidy and changes to support future extensions
From: Fiona Trahe This patch splits symmetric specific definitions and functions away from the common crypto APIs to facilitate the future extension and expansion of the cryptodev framework, in order to allow asymmetric crypto operations to be introduced at a later date, as well as to clean the logical structure of the public includes. The patch also introduces the _sym prefix to symmetric specific structure and functions to improve clarity in the API. Signed-off-by: Fiona Trahe Signed-off-by: Declan Doherty --- app/test/test_cryptodev.c | 164 +++--- app/test/test_cryptodev_perf.c | 79 +-- drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c | 44 +- drivers/crypto/aesni_mb/rte_aesni_mb_pmd_ops.c | 6 +- drivers/crypto/aesni_mb/rte_aesni_mb_pmd_private.h | 4 +- drivers/crypto/qat/qat_crypto.c| 51 +- drivers/crypto/qat/qat_crypto.h| 10 +- drivers/crypto/qat/rte_qat_cryptodev.c | 8 +- examples/l2fwd-crypto/main.c | 33 +- lib/librte_cryptodev/Makefile | 1 + lib/librte_cryptodev/rte_crypto.h | 563 +-- lib/librte_cryptodev/rte_crypto_sym.h | 613 + lib/librte_cryptodev/rte_cryptodev.c | 39 +- lib/librte_cryptodev/rte_cryptodev.h | 80 ++- lib/librte_cryptodev/rte_cryptodev_pmd.h | 32 +- lib/librte_mbuf_offload/rte_mbuf_offload.h | 22 +- 16 files changed, 912 insertions(+), 837 deletions(-) create mode 100644 lib/librte_cryptodev/rte_crypto_sym.h diff --git a/app/test/test_cryptodev.c b/app/test/test_cryptodev.c index 62f8fb0..951b443 100644 --- a/app/test/test_cryptodev.c +++ b/app/test/test_cryptodev.c @@ -1,7 +1,7 @@ /*- * BSD LICENSE * - * Copyright(c) 2015 Intel Corporation. All rights reserved. + * Copyright(c) 2015-2016 Intel Corporation. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions @@ -57,13 +57,13 @@ struct crypto_testsuite_params { }; struct crypto_unittest_params { - struct rte_crypto_xform cipher_xform; - struct rte_crypto_xform auth_xform; + struct rte_crypto_sym_xform cipher_xform; + struct rte_crypto_sym_xform auth_xform; - struct rte_cryptodev_session *sess; + struct rte_cryptodev_sym_session *sess; struct rte_mbuf_offload *ol; - struct rte_crypto_op *op; + struct rte_crypto_sym_op *op; struct rte_mbuf *obuf, *ibuf; @@ -78,7 +78,7 @@ test_AES_CBC_HMAC_SHA512_decrypt_create_session_params( struct crypto_unittest_params *ut_params); static int -test_AES_CBC_HMAC_SHA512_decrypt_perform(struct rte_cryptodev_session *sess, +test_AES_CBC_HMAC_SHA512_decrypt_perform(struct rte_cryptodev_sym_session *sess, struct crypto_unittest_params *ut_params, struct crypto_testsuite_params *ts_param); @@ -165,7 +165,8 @@ testsuite_setup(void) ts_params->mbuf_ol_pool = rte_pktmbuf_offload_pool_create( "MBUF_OFFLOAD_POOL", NUM_MBUFS, MBUF_CACHE_SIZE, - DEFAULT_NUM_XFORMS * sizeof(struct rte_crypto_xform), + DEFAULT_NUM_XFORMS * + sizeof(struct rte_crypto_sym_xform), rte_socket_id()); if (ts_params->mbuf_ol_pool == NULL) { RTE_LOG(ERR, USER1, "Can't create CRYPTO_OP_POOL\n"); @@ -220,7 +221,7 @@ testsuite_setup(void) ts_params->conf.nb_queue_pairs = info.max_nb_queue_pairs; ts_params->conf.socket_id = SOCKET_ID_ANY; - ts_params->conf.session_mp.nb_objs = info.max_nb_sessions; + ts_params->conf.session_mp.nb_objs = info.sym.max_nb_sessions; TEST_ASSERT_SUCCESS(rte_cryptodev_configure(dev_id, &ts_params->conf), @@ -275,7 +276,7 @@ ut_setup(void) ts_params->conf.nb_queue_pairs = DEFAULT_NUM_QPS_PER_QAT_DEVICE; ts_params->conf.socket_id = SOCKET_ID_ANY; ts_params->conf.session_mp.nb_objs = - (gbl_cryptodev_type == RTE_CRYPTODEV_QAT_PMD) ? + (gbl_cryptodev_type == RTE_CRYPTODEV_QAT_SYM_PMD) ? DEFAULT_NUM_OPS_INFLIGHT : DEFAULT_NUM_OPS_INFLIGHT; @@ -319,7 +320,7 @@ ut_teardown(void) /* free crypto session structure */ if (ut_params->sess) { - rte_cryptodev_session_free(ts_params->valid_devs[0], + rte_cryptodev_sym_session_free(ts_params->valid_devs[0], ut_params->sess); ut_params->sess = NULL; } @@ -464,7 +465,7 @@ test_queue_pair_descriptor_setup(
[dpdk-dev] [PATCH v4 2/2] cryptodev: change burst API to be crypto op oriented
This patch modifies the crypto burst enqueue/dequeue APIs to operate on bursts rte_crypto_op's rather than the current implementation which operates on rte_mbuf bursts, this simplifies the burst processing in the crypto PMDs and the use of crypto operations in general. The changes also continues the separatation of the symmetric operation parameters from the more general operation parameters, this will simplify the integration of asymmetric crypto operations in the future. As well as the changes to the crypto APIs this patch adds functions for managing rte_crypto_op pools to the cryptodev API. It modifies the existing PMDs, unit tests and sample application to work with the modified APIs and finally removes the now unused rte_mbuf_offload library. Signed-off-by: Declan Doherty --- MAINTAINERS| 6 +- app/test/test_cryptodev.c | 804 +++-- app/test/test_cryptodev.h | 9 +- app/test/test_cryptodev_perf.c | 253 +++ config/common_bsdapp | 8 - config/common_linuxapp | 8 - doc/api/doxy-api-index.md | 1 - drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c | 171 +++-- drivers/crypto/aesni_mb/rte_aesni_mb_pmd_ops.c | 12 +- drivers/crypto/aesni_mb/rte_aesni_mb_pmd_private.h | 2 +- drivers/crypto/qat/qat_crypto.c| 123 ++-- drivers/crypto/qat/qat_crypto.h| 12 +- drivers/crypto/qat/rte_qat_cryptodev.c | 4 +- examples/l2fwd-crypto/main.c | 283 lib/Makefile | 1 - lib/librte_cryptodev/rte_crypto.h | 364 +- lib/librte_cryptodev/rte_crypto_sym.h | 379 +- lib/librte_cryptodev/rte_cryptodev.c | 76 ++ lib/librte_cryptodev/rte_cryptodev.h | 109 ++- lib/librte_cryptodev/rte_cryptodev_version.map | 3 +- lib/librte_mbuf/rte_mbuf.h | 6 - lib/librte_mbuf_offload/Makefile | 52 -- lib/librte_mbuf_offload/rte_mbuf_offload.c | 100 --- lib/librte_mbuf_offload/rte_mbuf_offload.h | 310 .../rte_mbuf_offload_version.map | 7 - 25 files changed, 1575 insertions(+), 1528 deletions(-) delete mode 100644 lib/librte_mbuf_offload/Makefile delete mode 100644 lib/librte_mbuf_offload/rte_mbuf_offload.c delete mode 100644 lib/librte_mbuf_offload/rte_mbuf_offload.h delete mode 100644 lib/librte_mbuf_offload/rte_mbuf_offload_version.map diff --git a/MAINTAINERS b/MAINTAINERS index 628bc05..ad6b45e 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -222,16 +222,12 @@ F: lib/librte_mbuf/ F: doc/guides/prog_guide/mbuf_lib.rst F: app/test/test_mbuf.c -Packet buffer offload - EXPERIMENTAL -M: Declan Doherty -F: lib/librte_mbuf_offload/ - Ethernet API M: Thomas Monjalon F: lib/librte_ether/ F: scripts/test-null.sh -Crypto API - EXPERIMENTAL +Crypto API M: Declan Doherty F: lib/librte_cryptodev/ F: app/test/test_cryptodev* diff --git a/app/test/test_cryptodev.c b/app/test/test_cryptodev.c index 951b443..208fc14 100644 --- a/app/test/test_cryptodev.c +++ b/app/test/test_cryptodev.c @@ -35,7 +35,6 @@ #include #include #include -#include #include #include @@ -48,7 +47,7 @@ static enum rte_cryptodev_type gbl_cryptodev_type; struct crypto_testsuite_params { struct rte_mempool *mbuf_pool; - struct rte_mempool *mbuf_ol_pool; + struct rte_mempool *op_mpool; struct rte_cryptodev_config conf; struct rte_cryptodev_qp_conf qp_conf; @@ -62,8 +61,7 @@ struct crypto_unittest_params { struct rte_cryptodev_sym_session *sess; - struct rte_mbuf_offload *ol; - struct rte_crypto_sym_op *op; + struct rte_crypto_op *op; struct rte_mbuf *obuf, *ibuf; @@ -104,7 +102,7 @@ setup_test_string(struct rte_mempool *mpool, return m; } -#if HEX_DUMP +#ifdef HEX_DUMP static void hexdump_mbuf_data(FILE *f, const char *title, struct rte_mbuf *m) { @@ -112,27 +110,29 @@ hexdump_mbuf_data(FILE *f, const char *title, struct rte_mbuf *m) } #endif -static struct rte_mbuf * -process_crypto_request(uint8_t dev_id, struct rte_mbuf *ibuf) +static struct rte_crypto_op * +process_crypto_request(uint8_t dev_id, struct rte_crypto_op *op) { - struct rte_mbuf *obuf = NULL; -#if HEX_DUMP +#ifdef HEX_DUMP hexdump_mbuf_data(stdout, "Enqueued Packet", ibuf); #endif - if (rte_cryptodev_enqueue_burst(dev_id, 0, &ibuf, 1) != 1) { + if (rte_cryptodev_enqueue_burst(dev_id, 0, &op, 1) != 1) { printf("Error sending packet for encryption"); return NULL; } - while (rte_cryptodev_dequeue_burst(dev_id, 0, &obuf, 1) == 0) + + op = NULL; + + while (rte_cryptodev_dequeue_burst(dev_i
[dpdk-dev] [PATCH v5 00/11] Add API to get packet type info
On Mon, Feb 29, 2016 at 04:54:19PM +, Ananyev, Konstantin wrote: > > > > -Original Message- > > From: Tan, Jianfeng > > Sent: Friday, February 26, 2016 7:34 AM > > To: dev at dpdk.org > > Cc: Zhang, Helin; Ananyev, Konstantin; nelio.laranjeiro at 6wind.com; > > adrien.mazarguil at 6wind.com; rahul.lakkireddy at chelsio.com; > > Tan, Jianfeng > > Subject: [PATCH v5 00/11] Add API to get packet type info > > > > To achieve this, a new function pointer, dev_ptype_info_get, is added > > into struct eth_dev_ops. For those devices who do not implement it, it > > means it will not provide any ptype info. > > > > v5: > > - Exclude l3fwd change from this series, as a separated one. > > - Fix malposition of mlx4 code in mlx5 commit introduced in v4. > > > > v4: > > - Change how to use this API: to previously agreement reached in mail. > > > > v3: > > - Change how to use this API: api to allocate mem for storing ptype > > array; and caller to free the mem. > > - Change how to return back ptypes from PMDs: return a pointer to > > corresponding static const array of supported ptypes, terminated > > by RTE_PTYPE_UNKNOWN. > > - Fix l3fwd parse_packet_type() when EXACT_MATCH is enabled. > > - Fix l3fwd memory leak when calling the API. > > > > v2: > > - Move ptype_mask filter function from each PMDs into ether layer. > > - Add ixgbe vPMD's ptype info. > > - Fix code style issues. > > > > Signed-off-by: Jianfeng Tan > > > > Acked-by: Konstantin Ananyev Fine for me as well. Acked-by: Adrien Mazarguil -- Adrien Mazarguil 6WIND
[dpdk-dev] [PATCH 0/3 v2] ixgbe fixes
> -Original Message- > From: Iremonger, Bernard > Sent: Friday, February 26, 2016 2:49 PM > To: dev at dpdk.org > Cc: Ananyev, Konstantin; Zhang, Helin; Iremonger, Bernard > Subject: [PATCH 0/3 v2] ixgbe fixes > > This patch set implements the following: > Removes code which was duplicated in eth_ixgbevf_dev_init(). > Adds more information to the error message in ixgbe_check_mq_mode(). > Allows the MAC address of the VF to be set to zero. > > Changes in v2: > Do not overwrite the VF perm_add with zero. > > Bernard Iremonger (3): > ixgbe: cleanup eth_ixgbevf_dev_uninit > ixgbe: add more information to the error message > ixgbe: fix setting of VF MAC address > > drivers/net/ixgbe/ixgbe_ethdev.c | 29 + > drivers/net/ixgbe/ixgbe_pf.c | 7 --- > 2 files changed, 17 insertions(+), 19 deletions(-) > > -- Acked-by: Konstantin Ananyev > 2.6.3
[dpdk-dev] [PATCH 1/1] arm: set CONFIG_RTE_ARCH_STRICT_ALIGN=y for armv7 target
On Mon, 29 Feb 2016 16:55:38 +0100 Jan Viktorin wrote: > On Mon, 29 Feb 2016 16:14:58 +0100 > Thomas Monjalon wrote: > > > 2015-12-09 16:16, Jan Viktorin: > > > This patch reduces number of warnings from 53 to 40. It removes the usual > > > false > > > positives utilizing unaligned_uint*_t data types. > > > > > > Signed-off-by: Jan Viktorin > > > > Applied, thanks > > > > Jan, what is the problem with the other ARM alignment warnings? > > Can they be fixed? > > This is the full list of warnings I can see on the current origin/master > for ARMv7 (42 occurences) including examples (+10 more). The origin of > all of them is: > > cast increases required alignment of target type [-Wcast-align] > > After skimming through the list, you can see that they are mostly casts > to uint32_t * or something similar. I believe that all of them are OK. > However, I don't know how to persuade GCC to not be angry... > > Probably, we can add some explicit alignment of certain structures. > [snip] > > lib/librte_vhost/vhost_user/virtio-net-user.c > 433 rarp = (struct ether_arp *)(eth_hdr + 1); > 527 ifr = (struct ifreq *)ifc.ifc_buf; Fixed recently in http://dpdk.org/browse/dpdk/commit/?id=bb66588304632a7e4a043d2921d06709d40f9ed4 > > Regards > Jan
[dpdk-dev] [PATCH v4 0/2] cryptodev API changes
> -Original Message- > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Declan Doherty > Sent: Monday, February 29, 2016 4:52 PM > To: dev at dpdk.org > Subject: [dpdk-dev] [PATCH v4 0/2] cryptodev API changes > > This patch set separates the symmetric crypto operations from generic > operations and then modifies the cryptodev burst API to accept bursts of > rte_crypto_op rather than rte_mbufs. > > V4: > - Fixes for issues introduced in __rte_crypto_op_raw_bulk_alloc in V3 > patcheset. > - Typo fix in cached attribute on rte_crypto_op structure. > > V3: > - Addresses V2 comments > - Rebased for head > > > Declan Doherty (1): > cryptodev: change burst API to be crypto op oriented > > Fiona Trahe (1): > cryptodev: API tidy and changes to support future extensions > > MAINTAINERS| 6 +- > app/test/test_cryptodev.c | 894 > +++-- > app/test/test_cryptodev.h | 9 +- > app/test/test_cryptodev_perf.c | 270 --- > config/common_bsdapp | 8 - > config/common_linuxapp | 8 - > doc/api/doxy-api-index.md | 1 - > drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c | 199 ++--- > drivers/crypto/aesni_mb/rte_aesni_mb_pmd_ops.c | 18 +- > drivers/crypto/aesni_mb/rte_aesni_mb_pmd_private.h | 6 +- > drivers/crypto/qat/qat_crypto.c| 150 ++-- > drivers/crypto/qat/qat_crypto.h| 14 +- > drivers/crypto/qat/rte_qat_cryptodev.c | 8 +- > examples/l2fwd-crypto/main.c | 300 --- > lib/Makefile | 1 - > lib/librte_cryptodev/Makefile | 1 + > lib/librte_cryptodev/rte_crypto.h | 819 +++ > lib/librte_cryptodev/rte_crypto_sym.h | 642 +++ > lib/librte_cryptodev/rte_cryptodev.c | 115 ++- > lib/librte_cryptodev/rte_cryptodev.h | 185 ++--- > lib/librte_cryptodev/rte_cryptodev_pmd.h | 32 +- > lib/librte_cryptodev/rte_cryptodev_version.map | 3 +- > lib/librte_mbuf/rte_mbuf.h | 6 - > lib/librte_mbuf_offload/Makefile | 52 -- > lib/librte_mbuf_offload/rte_mbuf_offload.c | 100 --- > lib/librte_mbuf_offload/rte_mbuf_offload.h | 310 --- > .../rte_mbuf_offload_version.map | 7 - > 27 files changed, 2143 insertions(+), 2021 deletions(-) create mode 100644 > lib/librte_cryptodev/rte_crypto_sym.h > delete mode 100644 lib/librte_mbuf_offload/Makefile delete mode 100644 > lib/librte_mbuf_offload/rte_mbuf_offload.c > delete mode 100644 lib/librte_mbuf_offload/rte_mbuf_offload.h > delete mode 100644 lib/librte_mbuf_offload/rte_mbuf_offload_version.map > > -- > 2.5.0 Series Acked-by: Fiona Trahe
[dpdk-dev] [PATCH] Adding maintainers for Intel QAT PMD
On 05/02/16 16:36, Fiona Trahe wrote: > Signed-off-by: Fiona Trahe Acked-by: John Griffin
[dpdk-dev] [PATCH v3] af_packet: make the device detachable
Hi Bernard, > Does making rte_pmd_af_packet_devinit local result in an ABI breakage? If someone uses it in their app, they'll be forced to change it. However, as this function is not intentionally public and there is API to create devices that finally calls rte_pmd_af_packet_devinit(), I'm not sure if any special caution is needed here. > Should the DPDK_2.0 structure be kept and a DPDK_2.3 structure added? Should it be just `DPDK_2.3 { local: *} DPDK_2.0`? Doesn't inheritance of DPDK_2.0 make the symbol also global in 2.3? > A deprecation notice may need to be added to the > doc/guides/rel_notes/deprecation.rst file. As far as I understand, deprecation.rst is used to announce something will be removed in the future release. Changes already done should be moved from deprecation.rst to the release's .rst file. At least, this is what I see in commit logs. If this change should be announced in deprecation.rst, does this mean there should be another patch in the future (after 2.3 release?) making this function static? And that future patch will add DPDK_2.3 structure in the map file? Thank you for your time, Wojtek
[dpdk-dev] [PATCH] Adding maintainers for Intel QAT PMD
On 05/02/16 16:36, Fiona Trahe wrote: > Signed-off-by: Fiona Trahe Acked-by: John Griffin Acked-by: Deepak Kumar Jain
[dpdk-dev] [PATCH v2] I217 and I218 changes
v2: Incorporate Wenzhou's comments Compiled and tested (via testpmd) on Ubuntu 14.04 on target x86_64-native-linuxapp-gcc Compiled for target x86_64-native-linuxapp-clang v1: Modified driver and eal code to recognize and support I217 and I218 Intel NICs. Compiled and tested (via testpmd) on Ubuntu 14.04 for target x86_64-native-linuxapp-gcc Compiled for target x86_64-native-linuxapp-clang Signed-off-by: Ravi Kerur --- drivers/net/e1000/base/e1000_osdep.h| 26 +++- drivers/net/e1000/em_ethdev.c | 32 + lib/librte_eal/common/include/rte_pci_dev_ids.h | 9 +++ 3 files changed, 61 insertions(+), 6 deletions(-) diff --git a/drivers/net/e1000/base/e1000_osdep.h b/drivers/net/e1000/base/e1000_osdep.h index b2c76e3..47a1948 100644 --- a/drivers/net/e1000/base/e1000_osdep.h +++ b/drivers/net/e1000/base/e1000_osdep.h @@ -96,21 +96,35 @@ typedef int bool; #define E1000_PCI_REG(reg) (*((volatile uint32_t *)(reg))) +#define E1000_PCI_REG16(reg) (*((volatile uint16_t *)(reg))) + #define E1000_PCI_REG_WRITE(reg, value) do { \ E1000_PCI_REG((reg)) = (rte_cpu_to_le_32(value)); \ } while (0) +#define E1000_PCI_REG_WRITE16(reg, value) do { \ + E1000_PCI_REG16((reg)) = (rte_cpu_to_le_16(value)); \ +} while (0) + #define E1000_PCI_REG_ADDR(hw, reg) \ ((volatile uint32_t *)((char *)(hw)->hw_addr + (reg))) #define E1000_PCI_REG_ARRAY_ADDR(hw, reg, index) \ E1000_PCI_REG_ADDR((hw), (reg) + ((index) << 2)) -static inline uint32_t e1000_read_addr(volatile void* addr) +#define E1000_PCI_REG_FLASH_ADDR(hw, reg) \ + ((volatile uint32_t *)((char *)(hw)->flash_address + (reg))) + +static inline uint32_t e1000_read_addr(volatile void *addr) { return rte_le_to_cpu_32(E1000_PCI_REG(addr)); } +static inline uint16_t e1000_read_addr16(volatile void *addr) +{ + return rte_le_to_cpu_16(E1000_PCI_REG16(addr)); +} + /* Necessary defines */ #define E1000_MRQC_ENABLE_MASK 0x0007 #define E1000_MRQC_RSS_FIELD_IPV6_EX 0x0008 @@ -155,20 +169,20 @@ static inline uint32_t e1000_read_addr(volatile void* addr) E1000_WRITE_REG(hw, reg, value) /* - * Not implemented. + * Tested on I217/I218 chipset. */ #define E1000_READ_FLASH_REG(hw, reg) \ - (E1000_ACCESS_PANIC(E1000_READ_FLASH_REG, hw, reg, 0), 0) + e1000_read_addr(E1000_PCI_REG_FLASH_ADDR((hw), (reg))) #define E1000_READ_FLASH_REG16(hw, reg) \ - (E1000_ACCESS_PANIC(E1000_READ_FLASH_REG16, hw, reg, 0), 0) + e1000_read_addr16(E1000_PCI_REG_FLASH_ADDR((hw), (reg))) #define E1000_WRITE_FLASH_REG(hw, reg, value) \ - E1000_ACCESS_PANIC(E1000_WRITE_FLASH_REG, hw, reg, value) + E1000_PCI_REG_WRITE(E1000_PCI_REG_FLASH_ADDR((hw), (reg)), (value)) #define E1000_WRITE_FLASH_REG16(hw, reg, value) \ - E1000_ACCESS_PANIC(E1000_WRITE_FLASH_REG16, hw, reg, value) + E1000_PCI_REG_WRITE16(E1000_PCI_REG_FLASH_ADDR((hw), (reg)), (value)) #define STATIC static diff --git a/drivers/net/e1000/em_ethdev.c b/drivers/net/e1000/em_ethdev.c index 4a843fe..a8c26ed 100644 --- a/drivers/net/e1000/em_ethdev.c +++ b/drivers/net/e1000/em_ethdev.c @@ -231,6 +231,32 @@ rte_em_dev_atomic_write_link_status(struct rte_eth_dev *dev, return 0; } +/** + * eth_em_dev_is_ich8 - Check for ICH8 device + * @hw: pointer to the HW structure + * + * return TRUE for ICH8, otherwise FALSE + **/ +static bool +eth_em_dev_is_ich8(struct e1000_hw *hw) +{ + DEBUGFUNC("eth_em_dev_is_ich8"); + + switch (hw->device_id) { + case E1000_DEV_ID_PCH_LPT_I217_LM: + case E1000_DEV_ID_PCH_LPT_I217_V: + case E1000_DEV_ID_PCH_LPTLP_I218_LM: + case E1000_DEV_ID_PCH_LPTLP_I218_V: + case E1000_DEV_ID_PCH_I218_V2: + case E1000_DEV_ID_PCH_I218_LM2: + case E1000_DEV_ID_PCH_I218_V3: + case E1000_DEV_ID_PCH_I218_LM3: + return 1; + default: + return 0; + } +} + static int eth_em_dev_init(struct rte_eth_dev *eth_dev) { @@ -265,6 +291,8 @@ eth_em_dev_init(struct rte_eth_dev *eth_dev) adapter->stopped = 0; /* For ICH8 support we'll need to map the flash memory BAR */ + if (eth_em_dev_is_ich8(hw)) + hw->flash_address = (void *)pci_dev->mem_resource[1].addr; if (e1000_setup_init_funcs(hw, TRUE) != E1000_SUCCESS || em_hw_init(hw) != 0) { @@ -490,6 +518,7 @@ em_set_pba(struct e1000_hw *hw) break; case e1000_pchlan: case e1000_pch2lan: + case e1000_pch_lpt: pba = E1000_PBA_26K; break; default: @@ -798,6 +827,8 @@ em_hardware_init(struct e1000_hw *hw) hw->fc.low_water = 0x5048; hw->fc.pause_time = 0
[dpdk-dev] [PATCH 1/3] kcp: add kernel control path kernel module
On Wed, 27 Jan 2016 16:24:07 + Ferruh Yigit wrote: > +static int > +kcp_ioctl_release(unsigned int ioctl_num, unsigned long ioctl_param) > +{ > + int ret = -EINVAL; > + struct kcp_dev *dev; > + struct kcp_dev *n; > + char name[RTE_KCP_NAMESIZE]; > + unsigned int instance = ioctl_param; > + > + snprintf(name, RTE_KCP_NAMESIZE, "dpdk%u", instance); > + > + down_write(&kcp_list_lock); Some observations about how acceptable this will to upstream kernel developers. ioctl's are the lease favored form of API. You chose the worst possible mutual exclusion read/write semaphores. Read/write is slower than simpler primtives, and semaphores were replaced for almost all usage models by mutexes (about 4 years ago). Looks like you copied the out of date kernel API's used by KNI.
[dpdk-dev] [PATCH v3 2/4] kcp: add kernel control path kernel module
On Fri, 26 Feb 2016 14:10:39 + Ferruh Yigit wrote: > +#define KCP_ERR(args...) printk(KERN_ERR "KCP: " args) > +#define KCP_INFO(args...) printk(KERN_INFO "KCP: " args) > + > +#ifdef RTE_KCP_KO_DEBUG > +#define KCP_DBG(args...) printk(KERN_DEBUG "KCP: " args) > +#else > +#define KCP_DBG(args...) > +#endif These macros will not make netdev developers happy. Use standard printk macros, and if you want prefix, use pr_fmt #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt