[dpdk-dev] No probed ethernet devices with shared library
On 05/18/2015 12:55 AM, Stuart Andrews wrote: > Hello, > > I've been trying to create an app which uses the DPDK shared library and > therefore I have > > CONFIG_RTE_BUILD_SHARED_LIB=y > > However, when I try to run 'test-pmd' I get > > EAL: No probed ethernet devices > > This is strange because when I compile DPDK with > CONFIG_RTE_BUILD_SHARED_LIB=n and run 'test-pmd' everything works fine. > > I'm using the IGB UIO module on a x86_64 Ubuntu OS running on a vm and I > set everything up according to the documentation. > > Any help would be appreciated. When building as shared library, all the drivers are dynamically loadable plugins instead of the big pile o' everything you get when statically linking. For now, you need to manually load any drivers you need with the EAL -d option, eg if you use virtio NIC in the VM you'd add this to testpmd: -d librte_pmd_virtio_uio.so And yes its cumbersome. Doing something about it has been on my todo for a while now, just been busy with other stuff. - Panu -
[dpdk-dev] [PATCH] vhost: make vhost lockless enqueue configurable
On 04/29/2015 02:29 PM, Huawei Xie wrote: > vhost enabled vSwitch could have their own thread-safe vring enqueue policy. > Add the RTE_LIBRTE_VHOST_LOCKLESS_ENQ macro for vhost lockless enqueue. > Turn it off by default. > > Signed-off-by: Huawei Xie > --- > config/common_linuxapp| 1 + > lib/librte_vhost/vhost_rxtx.c | 24 +++- > 2 files changed, 24 insertions(+), 1 deletion(-) > > diff --git a/config/common_linuxapp b/config/common_linuxapp > index 0078dc9..7f59499 100644 > --- a/config/common_linuxapp > +++ b/config/common_linuxapp > @@ -421,6 +421,7 @@ CONFIG_RTE_KNI_VHOST_DEBUG_TX=n > # > CONFIG_RTE_LIBRTE_VHOST=n > CONFIG_RTE_LIBRTE_VHOST_USER=y > +CONFIG_RTE_LIBRTE_VHOST_LOCKLESS_ENQ=n > CONFIG_RTE_LIBRTE_VHOST_DEBUG=n > > # > diff --git a/lib/librte_vhost/vhost_rxtx.c b/lib/librte_vhost/vhost_rxtx.c > index 510ffe8..475be6e 100644 > --- a/lib/librte_vhost/vhost_rxtx.c > +++ b/lib/librte_vhost/vhost_rxtx.c > @@ -80,7 +80,11 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id, >* they need to be reserved. >*/ > do { > +#ifdef RTE_LIBRTE_VHOST_LOCKESS_ENQ > res_base_idx = vq->last_used_idx_res; > +#else > + res_base_idx = vq->last_used_idx; > +#endif These things should be runtime configurable, not build options. Please do not assume everybody builds DPDK separately for each and every application that might ever be. - Panu -
[dpdk-dev] Did we reduce unnecessary linkage too well?
On 09/30/2016 01:15 PM, Bruce Richardson wrote: > On Thu, Sep 29, 2016 at 09:26:48AM +0200, Christian Ehrhardt wrote: >> On Thu, Sep 29, 2016 at 9:20 AM, Panu Matilainen >> wrote: >> >>> >>> Yup. Set CONFIG_RTE_EAL_PMD_PATH to the path where your PMDs are >>> installed. Note that since the plugin autoloader in DPDK doesn't make >>> assumptions about names, it'll try to load *everything* in that path, so >>> you don't want it pointing to eg /usr/lib directly. > > Is this something we should look to change? To me having some sort of > naming convention might not be a bad thing, so that we can point it at generic > folders. Plugins for program/library X are nearly always in a sub-directory of their own, outside the linker path because ... well, they're plugins and not something you should link to, and having them in separate directories makes it possible to have multiple versions co-exist on the system by simply placing the plugins into a versioned directory. That's why the current plugin autoloader is the way it is - it's the de-facto standard for dealing with plugins. The DPDK case is a bit convoluted since some of the alleged plugins also provide library APIs, so at least those DSOs need to be present in linker paths. Plugins usually also lack soname version, because that doesn't make much sense for plugins, libraries with APIs are different there too. Anyway, naming conventions are flimsy and fall apart in situations that the directory approach easily handles. So I dont see a point in changing that. What *would* be good is creating and populating that directory from DPDK "make install" step automatically, at least when RTE_EAL_PMD_PATH is set. - Panu - > > /Bruce >
[dpdk-dev] [PATCH] dpdk_procinfo: check for primary process
On 09/06/2016 08:12 PM, Maryam Tahhan wrote: > Add a check to see if the primary process is running and exit gracefully if it > is not. > > Suggested-by: Patrick Kutch > Signed-off-by: Maryam Tahhan > --- > app/proc_info/main.c | 5 + > 1 file changed, 5 insertions(+) > > diff --git a/app/proc_info/main.c b/app/proc_info/main.c > index 6dc0bbb..ddc8cf8 100644 > --- a/app/proc_info/main.c > +++ b/app/proc_info/main.c > @@ -329,6 +329,11 @@ main(int argc, char **argv) > argc -= ret; > argv += (ret - 3); > > +if (!rte_eal_primary_proc_alive(NULL)) { > +rte_exit(EXIT_FAILURE, "NO PRIMARY DPDK PROCESS IS > RUNNING\n"); I don't think there'a a need to YELL THAT MESSAGE. - Panu -
[dpdk-dev] [PATCH v3 12/15] ether: extract function eth_dev_get_intr_handle
On 09/15/2016 05:05 PM, Thomas Monjalon wrote: > 2016-09-15 14:02, Hunt, David: >> On 9/9/2016 9:43 AM, Shreyansh Jain wrote: >>> +static inline >>> +struct rte_intr_handle *eth_dev_get_intr_handle(struct rte_eth_dev *dev) >>> +{ >>> + if (dev->pci_dev) { >>> + return &dev->pci_dev->intr_handle; >>> + } >>> + >>> + RTE_VERIFY(0); >> >> Rather than RTE_VERIFY(0), might I suggest using rte_panic with a more >> relevant error message? > > RTE_ASSERT is preferred. > We must stop adding some rte_panic calls except for debug. +1 It wouldn't hurt to make that a hard rule. - Panu -
[dpdk-dev] [PATCH v3 02/15] eal/soc: add rte_eal_soc_register/unregister logic
On 09/15/2016 05:09 PM, Thomas Monjalon wrote: > 2016-09-15 15:09, Jan Viktorin: >> On Thu, 15 Sep 2016 14:00:25 +0100 >> "Hunt, David" wrote: >> new file mode 100644 index 000..56135ed --- /dev/null +++ b/lib/librte_eal/common/eal_common_soc.c @@ -0,0 +1,56 @@ +/*- + * BSD LICENSE + * + * Copyright(c) 2016 RehiveTech. All rights reserved. + * All rights reserved. >>> >>> Duplicate "All rights reserved" >> >> This is present in many source files in DPDK... I don't know why. >> >> lib/librte_eal/common/eal_common_pci.c >> lib/librte_eal/common/eal_common_dev.c >> ... > > It would deserve a dedicated thread to discuss legal sense of these things. > I'm not a lawyer but I think "All rights reserved." has no real sense. > From a layman (such as myself) perspective it indeed seems totally ludicrous in the context of this particular license :) Whether it makes more sense to lawyers I wouldn't know, but as for the background: it's present in both 2- and 3-clause BSD licenses so *one* of them is probably best left alone. According to https://fedoraproject.org/wiki/Licensing:BSD, in the 3-clause BSD license "All rights reserved" is on a line of its own. In the other variants it follows the copyright holder. So that's probably where the duplicates originate from. - Panu -
[dpdk-dev] [PATCH 3/3] drivers/net:build support for new tap device driver
On 09/15/2016 05:10 PM, Keith Wiles wrote: > Signed-off-by: Keith Wiles > --- > config/common_linuxapp | 3 +++ > drivers/net/Makefile | 1 + > mk/rte.app.mk | 1 + > 3 files changed, 5 insertions(+) > > diff --git a/config/common_linuxapp b/config/common_linuxapp > index 2483dfa..704c01c 100644 > --- a/config/common_linuxapp > +++ b/config/common_linuxapp > @@ -44,3 +44,6 @@ CONFIG_RTE_LIBRTE_PMD_VHOST=y > CONFIG_RTE_LIBRTE_PMD_AF_PACKET=y > CONFIG_RTE_LIBRTE_POWER=y > CONFIG_RTE_VIRTIO_USER=y > +CONFIG_RTE_LIBRTE_PMD_TAP=y > +CONFIG_RTE_PMD_TAP_MAX_QUEUES=32 > + > diff --git a/drivers/net/Makefile b/drivers/net/Makefile > index bc93230..b4afa98 100644 > --- a/drivers/net/Makefile > +++ b/drivers/net/Makefile > @@ -55,6 +55,7 @@ DIRS-$(CONFIG_RTE_LIBRTE_THUNDERX_NICVF_PMD) += thunderx > DIRS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio > DIRS-$(CONFIG_RTE_LIBRTE_VMXNET3_PMD) += vmxnet3 > DIRS-$(CONFIG_RTE_LIBRTE_PMD_XENVIRT) += xenvirt > +DIRS-$(CONFIG_RTE_LIBRTE_PMD_TAP) += tap > > ifeq ($(CONFIG_RTE_LIBRTE_VHOST),y) > DIRS-$(CONFIG_RTE_LIBRTE_PMD_VHOST) += vhost > diff --git a/mk/rte.app.mk b/mk/rte.app.mk > index 1a0095b..bd1d10f 100644 > --- a/mk/rte.app.mk > +++ b/mk/rte.app.mk > @@ -129,6 +129,7 @@ ifeq ($(CONFIG_RTE_LIBRTE_VHOST),y) > _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_VHOST) += -lrte_pmd_vhost > endif # $(CONFIG_RTE_LIBRTE_VHOST) > _LDLIBS-$(CONFIG_RTE_LIBRTE_VMXNET3_PMD)+= -lrte_pmd_vmxnet3_uio > +_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_TAP)+= -lrte_pmd_tap > > ifeq ($(CONFIG_RTE_LIBRTE_CRYPTODEV),y) > _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_AESNI_MB) += -lrte_pmd_aesni_mb > Splitting the Makefile and config changes into a separate patch makes no sense at all in this case. Just do it in the patch introducing the driver. And actually, ditto for documentation. - Panu -
[dpdk-dev] [PATCH] net/mlx: fix compile errors with ignore pedantic pragma
- Original Message - > On Tue, Sep 20, 2016 at 02:51:27PM +0200, Adrien Mazarguil wrote: > > On Mon, Sep 19, 2016 at 04:26:05PM +0100, Bruce Richardson wrote: > > > On Mon, Sep 19, 2016 at 04:59:59PM +0200, Adrien Mazarguil wrote: > > > > Hi Bruce, > > > > > > > > On Mon, Sep 19, 2016 at 03:36:54PM +0100, Bruce Richardson wrote: > > > > > With recent gcc versions, e.g. gcc 6.1, compilation of mlx drivers > > > > > with > > > > > debug enabled produces lots of errors complaining that "pedantic" is > > > > > not a warning level that can be ignored. > > > > > > > > > > error: ?-pedantic? is not an option that controls warnings > > > > > [-Werror=pragmas] > > > > > #pragma GCC diagnostic ignored "-pedantic" > > > > > ^~~ > > > > > > > > > > These errors can be removed by changing the "-pedantic" to > > > > > "-Wpedantic". > > > > > > > > Nice to have a workaround, I thought they did not keep the option at > > > > all. > > > > However after testing: > > > > > > > > - It does not seem to work with GCC 4.6 and older, they prefer > > > > -pedantic: > > > > "warning: unknown option after `#pragma GCC diagnostic' kind". > > > > > > > > - GCC 4.9 (possibly 5.x as well) does not care, can use either > > > > -pedantic or > > > > -Wpedantic. > > > > > > > > - GCC 6 can only supports -Wpedantic. > > > > > > > > Note we're working toward removing the need for these #pragma in the > > > > first > > > > place as soon as possible, however in the meantime I fear that checking > > > > the > > > > GCC version is necessary. > > > > > > > Depends on how old of GCC version we need to support. From the release > > > notes > > > it appears that -Wpedantic was introduced in GCC 4.8 (3 1/2 years ago). > > > > > > https://gcc.gnu.org/gcc-4.8/changes.html > > > > > > Do we need to support compilation on gcc versions older than this? > > > > I'm all for upgrading so I do not really mind if we stop caring about older > > GCC versions (especially considering this problem only occurs in debugging > > mode which is seldom used by non-developers). The version check is > > necessary > > if we want to keep full compatibility with at least: > > > > - RHEL <= 6.x > > - Debian <= 7.x > > - Ubuntu <= 13.04 > > > > Works for me either way, thus: > > > > Acked-by: Adrien Mazarguil > > > Any objections to dropping of support for debug settings for these OS's? No objections on dropping RHEL <= 6 support, we never did DPDK on those old versions anyway. As for the others, I've no particular opinion but certainly no objections either. - Panu - > > /Bruce >
[dpdk-dev] Did we reduce unnecessary linkage too well?
On 09/29/2016 09:58 AM, Christian Ehrhardt wrote: > Hi, > I was finally getting to more deeply re-validate Openvswitch 2.6 together > with DPDK 16.07. And I think I found a whiplash of our effort to reduce > unnecessary hard linkage. > > Trying to avoid cross-posting, picking DPDK list and the main involved > people on TO/CC. > > TL;DR: > - pmd drivers are no more "auto"-loaded > - adding -d ...so to all consuming applications feels obnoxious > - do we really have to intentionally overlink some? > - I hope I just overlook something trivial to fix this. Yup. Set CONFIG_RTE_EAL_PMD_PATH to the path where your PMDs are installed. Note that since the plugin autoloader in DPDK doesn't make assumptions about names, it'll try to load *everything* in that path, so you don't want it pointing to eg /usr/lib directly. What we have on Fedora and RHEL is a /usr/lib(64)/dpdk-pmds/ directory with symlinks to the actual pmds which reside in /usr/lib(64) because some of them provide actual API. - Panu -
[dpdk-dev] [PATCH v4 03/12] vhost: update version map file
On 08/12/2015 11:02 AM, Ouyang Changchun wrote: > From: Changchun Ouyang > > it is added in v4. > > Signed-off-by: Changchun Ouyang > --- > lib/librte_vhost/rte_vhost_version.map | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/lib/librte_vhost/rte_vhost_version.map > b/lib/librte_vhost/rte_vhost_version.map > index 3d8709e..0bb1c0f 100644 > --- a/lib/librte_vhost/rte_vhost_version.map > +++ b/lib/librte_vhost/rte_vhost_version.map > @@ -18,5 +18,5 @@ DPDK_2.1 { > global: > > rte_vhost_driver_unregister; > - > + rte_vhost_qp_num_get; > } DPDK_2.0; > Version map needs to be updated along with the actual code (in this case, the function is added in the second patch of the series). Otherwise there will be at least one commit where shared library configuration will be incorrect and might not be buildable at all. - Panu -
[dpdk-dev] [PATCH] mk: fix the combined library problems by replacing it with a linker script
On 11/30/2015 06:41 PM, Stephen Hemminger wrote: > On Mon, 30 Nov 2015 10:03:43 -0500 > Neil Horman wrote: > >> On Wed, Nov 25, 2015 at 08:08:37AM -0800, Stephen Hemminger wrote: >>> On Wed, 25 Nov 2015 10:38:48 +0200 >>> Panu Matilainen wrote: >>> >>>> On 11/25/2015 12:46 AM, Stephen Hemminger wrote: >>>>> On Tue, 24 Nov 2015 16:31:17 +0200 >>>>> Panu Matilainen wrote: >>>>> >>>>>> The physically linked-together combined library has been an increasing >>>>>> source of problems, as was predicted when library and symbol versioning >>>>>> was introduced. Replace the complex and fragile construction with a >>>>>> simple linker script which achieves the same without all the problems, >>>>>> remove the related kludges from eg mlx drivers. >>>>>> >>>>>> Since creating the linker script is practically zero cost, remove the >>>>>> config option and just create it always. >>>>>> >>>>>> Based on a patch by Sergio Gonzales Monroy, linker script approach >>>>>> initially suggested by Neil Horman. >>>>>> >>>>>> Suggested-by: Sergio Gonzalez Monroy >>>>> intel.com> >>>>>> Suggested-by: Neil Horman >>>>>> Signed-off-by: Panu Matilainen >>>>> >>>>> But it now means distros have to ship 20 libraries which seems like >>>>> a step back. >>>> >>>> That's how Fedora and RHEL are shipping it already and nobody has so >>>> much as noticed anything strange, much less complained about it. 20 >>>> libraries is but a drop in the ocean on a average distro. But more to >>>> the point, distros will prefer 50 working libraries over one that doesn't. >>>> >>>> The combined library as it is simply is no longer a viable option. >>>> Besides just being broken (witness the strange hacks people are coming >>>> up with to work around issues in it) its ugly because it basically gives >>>> the middle finger to all the effort going into version compatibility, >>>> and its also big. Few projects will use every library in DPDK, but with >>>> the combined library they're forced to lug the 800 pound gorilla along >>>> needlessly. >>>> >>>>- Panu - >>>> >>> >>> Fixing the combined library took less than an hour for us. >> How did you fix the versioning issue? >> >> Neil > > This is what I did. > Also decided to keep shared library version == major DPDK version > to avoid confusion. > > > mk: fix when building combined shared library > > The DPDK mk file does not set shared object name or version > information as required by Debian. > > Signed-off-by: Stephen Hemminger > > --- a/mk/rte.sharelib.mk > +++ b/mk/rte.sharelib.mk > @@ -51,10 +51,10 @@ ifeq ($(LINK_USING_CC),1) > # Override the definition of LD here, since we're linking with CC > LD := $(CC) $(CPU_CFLAGS) > O_TO_S = $(LD) $(call linkerprefix,$(CPU_LDFLAGS)) \ > - -shared $(OBJS) -o $(RTE_OUTPUT)/lib/$(LIB_ONE) > + -shared $(OBJS) -Wl,-soname,$(LIB_ONE).$(RTE_LIBVERS) -o > $(RTE_OUTPUT)/lib/$(LIB_ONE) > else > O_TO_S = $(LD) $(CPU_LDFLAGS) \ > - -shared $(OBJS) -o $(RTE_OUTPUT)/lib/$(LIB_ONE) > + -shared $(OBJS) -soname $(LIB_ONE).$(RTE_LIBVERS) -o > $(RTE_OUTPUT)/lib/$(LIB_ONE) > endif > > O_TO_S_STR = $(subst ','\'',$(O_TO_S)) #'# fix syntax highlight > --- a/mk/rte.vars.mk > +++ b/mk/rte.vars.mk > @@ -74,8 +74,10 @@ ifneq ($(BUILDING_RTE_SDK),) > endif > > RTE_LIBNAME := $(CONFIG_RTE_LIBNAME:"%"=%) > +RTE_LIBVERS := $(CONFIG_RTE_LIBVERS:"%"=%) > ifeq ($(RTE_LIBNAME),) > RTE_LIBNAME := intel_dpdk > +RTE_LIBVERS := 2 > endif > > # RTE_TARGET is deducted from config when we are building the SDK. > Adding a soname and a semi-arbitrary version does not fix the fundamental problems: Since the library lumps together everything in DPDK, you'd have to bump its version whenever any of the individual libraries bumps its version to have the version mean anything. DPDK 2.0 and 2.1 are supposedly binary compatible but 2.2 certainly is not, and beyond that who knows. That in turn forces all apps to be rebuild whenever one of the libraries changes version, whether those apps use that particular library or not. The combined library doesn't have symbol versioning, so besides the better version compatibility tracking it loses other benefits like limited symbol visibility. Not to mention the extra complexity in makefiles to support it, the increasing amount of duct-tape required to hold it together. And still eg the MLX pmds declare the configuration not supported at all. - Panu -
[dpdk-dev] 2.3 Roadmap
On 12/01/2015 12:03 PM, Bruce Richardson wrote: > On Mon, Nov 30, 2015 at 05:16:55PM -0800, Stephen Hemminger wrote: >> On Mon, 30 Nov 2015 22:53:50 + >> Kyle Larose wrote: >> >>> Hi Tim, >>> >>> On Mon, Nov 30, 2015 at 3:50 PM, O'Driscoll, Tim >> intel.com> wrote: >>> Tcpdump Support: Support for tcpdump will be added to DPDK. This will improve usability and debugging of DPDK applications. >>> >>> I'm curious about the proposed tcpdump support. Is there a concrete plan >>> for this, or is that still being looked into? Sandvine is interested in >>> contributing to this effort. Anything we can do to help? >>> >>> Thanks, >>> >>> Kyle >> >> We discussed an Ovscon doing a simple example of how to have a thread use >> named pipe >> support (already in tcpdump and wireshark). More complex solutions require >> changes to >> libpcap and application interaction. > > Our current thinking is to use kni to mirror packets into the kernel itself, > so that all standard linux capture tools can then be used. The problem with that (unless I'm missing something here) is that KNI requires using out-of-tree kernel modules which makes it pretty much a non-option for distros. - Panu -
[dpdk-dev] [PATCH] mk: bump minimum march in default machine
On 12/01/2015 04:26 PM, Christian Ehrhardt wrote: > While playing with building 2.2-rc2 I found that our usual way didn't work > anymore. > We usually configured "make config T=x86_64-native-linuxapp-gcc" but then > set CONFIG_RTE_MACHINE="default" to get something like the "lowest acceptable > build" but with that wide CPU copatibility. > > I found that with DPDK 2.2 this fails with issues like: > In file included from > /usr/lib/gcc/x86_64-linux-gnu/5/include/immintrin.h:37:0, > from dpdk-2.2.0-rc2/lib/librte_sched/rte_sched.c:56: > dpdk-2.2.0-rc2/lib/librte_sched/rte_sched.c: In function > ?grinder_pipe_exists?: > /usr/lib/gcc/x86_64-linux-gnu/5/include/smmintrin.h:67:1: error: inlining > failed in call to always_inline ?_mm_testz_si128?: target specific option > mismatch > _mm_testz_si128 (__m128i __M, __m128i __V) > ^ > This is a hard need on newer SSE4.x features which are not given with > march=core2. > > So if nehalem (the next march level which has SSE4.x) is the new minimum let > us > set this in the default machine config. > > Signed-off-by: Christian Ehrhardt > --- > > [diffstat] > rte.vars.mk |2 +- >1 file changed, 1 insertion(+), 1 deletion(-) > > [diff] > diff --git a/mk/machine/default/rte.vars.mk b/mk/machine/default/rte.vars.mk > index 53c6af6..170d880 100644 > --- a/mk/machine/default/rte.vars.mk > +++ b/mk/machine/default/rte.vars.mk > @@ -55,4 +55,4 @@ > # CPU_LDFLAGS = > # CPU_ASFLAGS = > > -MACHINE_CFLAGS += -march=core2 > +MACHINE_CFLAGS += -march=nehalem > You can just disable CONFIG_RTE_SCHED_VECTOR instead. Also see http://dpdk.org/ml/archives/dev/2015-November/029067.html - Panu -
[dpdk-dev] 2.3 Roadmap
On 12/01/2015 04:48 PM, Vincent JARDIN wrote: > On 01/12/2015 15:27, Panu Matilainen wrote: >> The problem with that (unless I'm missing something here) is that KNI >> requires using out-of-tree kernel modules which makes it pretty much a >> non-option for distros. > > It works fine with some distros. I do not think it should be an argument. Its not a question of *working*, its that out-of-tree kernel modules are considered unsupportable by the kernel people. So relying on KNI would make the otherwise important and desireable tcpdump feature non-existent on at least Fedora and RHEL where such modules are practically outright banned by distro policies. - Panu -
[dpdk-dev] [PATCH 00/10] standard make install
On 12/02/2015 05:57 AM, Thomas Monjalon wrote: > Following the recent discussions, this is a proposal to have a standard > installation process while keeping compatibility with most of the old > behaviours. > Thank you Mario and Bruce for having submitted other proposals. > I hope there will be a strong consensus for this one. Mm, can't help it but this situation reminds me of https://imgs.xkcd.com/comics/standards.png That aside, a bigger problem is that it doesn't seem to work. make clean make config T=x86_64-native-linuxapp-gcc make make install DESTDIR=/tmp/dpdk-root ...results in this: [pmatilai at sopuli dpdk]$ make DESTDIR=/tmp/dpdk-root install == Installing /tmp/dpdk-root/usr/local/ make[3]: Nothing to be done for 'install-kmod'. tar: include: Cannot stat: No such file or directory tar: Exiting with failure status due to previous errors cp: cannot stat ?./.config?: No such file or directory /srv/work/repos/dpdk/mk/rte.sdkinstall.mk:122: recipe for target 'install-sdk' failed make[3]: *** [install-sdk] Error 1 /srv/work/repos/dpdk/mk/rte.sdkroot.mk:104: recipe for target 'install-sdk' failed make[2]: *** [install-sdk] Error 2 /srv/work/repos/dpdk/mk/rte.sdkinstall.mk:93: recipe for target 'install' failed make[1]: *** [install] Error 2 /srv/work/repos/dpdk/mk/rte.sdkroot.mk:102: recipe for target 'install' failed make: *** [install] Error 2 [pmatilai at sopuli dpdk]$ The failure appears to be install-sdk failing since invoking it alone results in similar errors. install-runtime appears to do something but it mainly installs sources to various directories in DESTDIR, eg: [pmatilai at sopuli dpdk]$ find /tmp/dpdk-root/ /tmp/dpdk-root/ /tmp/dpdk-root/usr /tmp/dpdk-root/usr/local /tmp/dpdk-root/usr/local/lib /tmp/dpdk-root/usr/local/lib/librte_mempool /tmp/dpdk-root/usr/local/lib/librte_mempool/rte_dom0_mempool.c /tmp/dpdk-root/usr/local/lib/librte_mempool/rte_mempool.c /tmp/dpdk-root/usr/local/lib/librte_mempool/Makefile /tmp/dpdk-root/usr/local/lib/librte_mempool/rte_mempool_version.map /tmp/dpdk-root/usr/local/lib/librte_mempool/rte_mempool.h [...] /tmp/dpdk-root/usr/local/bin/test-pmd /tmp/dpdk-root/usr/local/bin/test-pmd/testpmd.h /tmp/dpdk-root/usr/local/bin/test-pmd/icmpecho.c /tmp/dpdk-root/usr/local/bin/test-pmd/parameters.c /tmp/dpdk-root/usr/local/bin/test-pmd/macswap.c /tmp/dpdk-root/usr/local/bin/test-pmd/csumonly.c /tmp/dpdk-root/usr/local/bin/test-pmd/macfwd.c [...] install-kmod doesn't seem to do anything at all: [pmatilai at sopuli dpdk]$ rm -rf /tmp/dpdk-root/ [pmatilai at sopuli dpdk]$ ls build/kmod/ igb_uio.ko rte_kni.ko [pmatilai at sopuli dpdk]$ make DESTDIR=/tmp/dpdk-root install-kmod make[1]: Nothing to be done for 'install-kmod'. [pmatilai at sopuli dpdk]$ find /tmp/dpdk-root/ find: ?/tmp/dpdk-root/?: No such file or directory [pmatilai at sopuli dpdk]$ - Panu -
[dpdk-dev] [PATCH v8 00/11] Add installation rules for dpdk files.
On 12/01/2015 09:39 PM, Mario Carrillo wrote: > DPDK package lacks of a mechanism to install libraries, headers > applications, kernel modules and sdk files to a file system tree. > This patch set allows to install files based on the next > proposal: > http://www.freedesktop.org/software/systemd/man/file-hierarchy.html > > v8: > > When "make install" is invoked if "T" variable is defined, > the installation process will have the current > behaviour, else "install-fhs" rule will be called. > > Using rules support is possible to do the next steps: > > make config T= > make > make > > Modify the makefile target to specify the files > that will be installed using a rule: > > * make install-bin (install app files)(dafault path > bindir=$(exec_prefix)/bin). > > * make install-headers (install headers)(dafault path > includedir=$(prefix)/include/dpdk). > > * make install-lib (install libraries)(dafault path > libdir=$(exec_prefix)/lib). > > * make install-doc (install documentation)(dafault path > docdir=$(datarootdir)/doc/dpdk). > > * make install-mod (install modules)(dafault path if RTE_EXEC_ENV=linuxapp > then > kerneldir=/lib/modules/$(uname -r)/extra/drivers/dpdk else > kerneldir=/boot/modules). > > * make install-sdk (install headers, makefiles, scripts,examples and > config files) (default path sdkdir=$(datadir)/share/dpdk). > > * make install-fhs (install libraries, modules, app files, tools and > documentation). > > * make install (if T is defined current behaviour, else it will call > install-fhs rule). > > The following defaults apply: > > prefix=/usr/local > exec_prefix=$(prefix) > datarootdir=$(prefix)/share > > All path variables can be overridden and all targets can use the "DESTDIR" > variable. > > Furthermore this information is added to documentation. Overall, does what it promises. One point I just realized from comparing with Thomas' variant is that this by default installs documentation sources, ie the raw .rst files and does not include any "compiled" formats even if they exist. It might be better to leave docs out by default as Thomas' version does. One way of achieving that is only install docs if $(RTE_OUTPUT)/doc, and only install anything in that directory. That way you have to request doc generation specifically with "make doc" first (which has quite some build-dependencies so you might not always wnat it), and only the compiled docs get installed. Or something like that. - Panu -
[dpdk-dev] [PATCH 00/10] standard make install
On 12/02/2015 11:25 AM, Thomas Monjalon wrote: > 2015-12-02 09:44, Panu Matilainen: >> That aside, a bigger problem is that it doesn't seem to work. >> >> make clean >> make config T=x86_64-native-linuxapp-gcc >> make >> make install DESTDIR=/tmp/dpdk-root > > Oh, I forgot to test the simple case where O= is not specified! > > It should be fixed with this change: > Okay, that helped a bunch :) Now that I can actually test it, seems mostly ok to me. As for the rest, I'll comment on the specific patches. - Panu -
[dpdk-dev] [PATCH 06/10] mk: install kernel modules
On 12/02/2015 05:57 AM, Thomas Monjalon wrote: > Add kernel modules to "make install". > Nothing is done if there is no kernel module compiled. > > On native Linux, this path is suggested: > kerneldir=/lib/modules/$(uname -r)/extra/dpdk > > Suggested-by: Mario Carrillo > Signed-off-by: Thomas Monjalon > --- > mk/rte.sdkinstall.mk | 8 > 1 file changed, 8 insertions(+) > > diff --git a/mk/rte.sdkinstall.mk b/mk/rte.sdkinstall.mk > index 5585974..46253ff 100644 > --- a/mk/rte.sdkinstall.mk > +++ b/mk/rte.sdkinstall.mk > @@ -36,6 +36,7 @@ BUILD_DIR := $O > > prefix ?= /usr/local > exec_prefix ?= $(prefix) > +kerneldir ?= $(exec_prefix)/kmod > bindir ?= $(exec_prefix)/bin > libdir ?= $(exec_prefix)/lib > includedir ?= $(prefix)/include/dpdk > @@ -89,6 +90,7 @@ ifeq '$(DESTDIR)$(if $T,,+)' '' > else > @echo == Installing $(DESTDIR)$(prefix)/ > $(Q)$(MAKE) O=$(BUILD_DIR) install-runtime > + $(Q)$(MAKE) O=$(BUILD_DIR) install-kmod > $(Q)$(MAKE) O=$(BUILD_DIR) install-sdk > @echo Installation in $(DESTDIR)$(prefix)/ complete > endif > @@ -105,6 +107,12 @@ install-runtime: > $(Q)$(call rte_mkdir, $(DESTDIR)$(datadir)) > $(Q)cp -a $(RTE_SDK)/tools $(DESTDIR)$(datadir) > > +install-kmod: > +ifneq '$(wildcard $O/kmod/*)' '' > + $(Q)$(call rte_mkdir, $(DESTDIR)$(kerneldir)) > + $(Q)cp -a $O/kmod/* $(DESTDIR)$(kerneldir) > +endif > + > install-sdk: > $(Q)$(call rte_mkdir, $(DESTDIR)$(includedir)) > $(Q)tar -chf - -C $O include | \ > This by default installs the modules to /usr/local/kmod/ with no kernel version etc. That's so broken that it'd be better not to install them at all. So either get the kerneldir right (the correct path is known on Linux and surely BSD too) or dont install them at all unless kerneldir is manually specified. For Linux, it should default to /lib/modules//extra/dpdk on Linux, where is the version those modules were built against (which might or might not have anything to do with uname -r output). - Panu -
[dpdk-dev] [PATCH 07/10] mk: install binding tool in sbin directory
On 12/02/2015 05:57 AM, Thomas Monjalon wrote: > sbin/dpdk_nic_bind is a symbolic link to tools/dpdk_nic_bind.py > where some python objects may be generated. > > Signed-off-by: Thomas Monjalon > --- > mk/rte.sdkinstall.mk | 4 > 1 file changed, 4 insertions(+) > > diff --git a/mk/rte.sdkinstall.mk b/mk/rte.sdkinstall.mk > index 46253ff..d6df30c 100644 > --- a/mk/rte.sdkinstall.mk > +++ b/mk/rte.sdkinstall.mk > @@ -38,6 +38,7 @@ prefix ?= /usr/local > exec_prefix ?= $(prefix) > kerneldir ?= $(exec_prefix)/kmod > bindir ?= $(exec_prefix)/bin > +sbindir ?= $(exec_prefix)/sbin > libdir ?= $(exec_prefix)/lib > includedir ?= $(prefix)/include/dpdk > datarootdir ?= $(prefix)/share > @@ -106,6 +107,9 @@ install-runtime: > --keep-newer-files --warning=no-ignore-newer > $(Q)$(call rte_mkdir, $(DESTDIR)$(datadir)) > $(Q)cp -a $(RTE_SDK)/tools $(DESTDIR)$(datadir) > + $(Q)$(call rte_mkdir, $(DESTDIR)$(sbindir)) > + $(Q)$(call rte_symlink,$(DESTDIR)$(datadir)/dpdk_nic_bind.py, \ > +$(DESTDIR)$(sbindir)/dpdk_nic_bind) > > install-kmod: > ifneq '$(wildcard $O/kmod/*)' '' > This symlink is broken, it expects dpdk_nic_bind.py to reside in $(datadir) root when it actually is in $(datadir)/tools/ Other than that, getting rid of the .py suffix is a nice touch. - Panu -
[dpdk-dev] [PATCH 03/10] mk: install a standard cutomizable tree
On 12/02/2015 05:57 AM, Thomas Monjalon wrote: > The rule "install" follows these conventions: > https://www.gnu.org/prep/standards/html_node/Directory-Variables.html > https://www.gnu.org/prep/standards/html_node/DESTDIR.html > > The variable sdkdir has been added to the more standards ones, > to configure the directory used with RTE_SDK when using the DPDK makefiles > to build an application. > > The old installed tree was static and always had .config, includes and > libs in a RTE_TARGET subdirectory. There is no such directory anymore in > an installed SDK. So the top directory is checked. > But RTE_TARGET can still be used, especially to build an app with a > compiled but not installed SDK. > That's why both cases are looked for RTE_SDK_BIN. > > The default prefix /usr/local is empty in the T= case which is > used only for a local install. > It is still possible to build DPDK with the "install T=" rule without > specifying any DESTDIR. In such case there is no install, as before. > > The old usage of an installed SDK is: > make -C examples/helloworld RTE_SDK=$(readlink -m $DESTDIR) \ > RTE_TARGET=x86_64-native-linuxapp-gcc > RTE_TARGET can be specified but is useless now with an installed SDK. > The RTE_SDK directory must now point to a different path depending of > the installation. > > Signed-off-by: Thomas Monjalon > --- [...] > @@ -32,10 +33,30 @@ > # Build directory is given with O= > O ?= . > > +prefix ?= /usr/local > +exec_prefix ?= $(prefix) > +bindir ?= $(exec_prefix)/bin > +libdir ?= $(exec_prefix)/lib > +includedir ?= $(prefix)/include/dpdk > +datarootdir ?= $(prefix)/share > +datadir ?= $(datarootdir)/dpdk > +sdkdir ?= $(datadir) > + > +# The install directories may be staged in DESTDIR [...] > + @echo == Installing $(DESTDIR)$(prefix)/ > + $(Q)$(call rte_mkdir, $(DESTDIR)$(libdir)) > + $(Q)cp -a $(BUILD_DIR)/lib/* $(DESTDIR)$(libdir) > + $(Q)$(call rte_mkdir, $(DESTDIR)$(bindir)) > + $(Q)tar -cf - -C $(BUILD_DIR) app --exclude 'app/*.map' \ > + --exclude 'app/cmdline*' --exclude app/test \ > + --exclude app/testacl --exclude app/testpipeline | \ > + tar -xf - -C $(DESTDIR)$(bindir) --strip-components=1 \ > + --keep-newer-files --warning=no-ignore-newer > + $(Q)$(call rte_mkdir, $(DESTDIR)$(datadir)) > + $(Q)cp -a $(RTE_SDK)/tools $(DESTDIR)$(datadir) > + $(Q)$(call rte_mkdir, $(DESTDIR)$(includedir)) > + $(Q)tar -chf - -C $(BUILD_DIR) include | \ > + tar -xf - -C $(DESTDIR)$(includedir) --strip-components=1 \ > + --keep-newer-files --warning=no-ignore-newer > + $(Q)$(call rte_mkdir,$(DESTDIR)$(sdkdir)) > + $(Q)cp -a $(BUILD_DIR)/.config $(DESTDIR)$(sdkdir) > + $(Q)cp -a $(RTE_SDK)/{mk,scripts} $(DESTDIR)$(sdkdir) > + $(Q)$(call rte_symlink, $(DESTDIR)$(includedir), > $(DESTDIR)$(sdkdir)/include) > + $(Q)$(call rte_symlink, $(DESTDIR)$(libdir), > $(DESTDIR)$(sdkdir)/lib) > + @echo Installation in $(DESTDIR)$(prefix)/ complete > +endif $(prefix)/share is supposed to be shareable across different architectures. Most of the content here is, but at least the lib symlink and .config file are not. One option is to install .config and the symlinks within $(sdkdir)/$(T) directories, then it can be shared across architectures because each lives in their own directory. Another possibility is moving the whole sdk directory into a subdir in $(libdir), but that misses the opportunity to share across architectures (whether anybody actually cares is a whole other question :) $(sdkdir)/lib -> $(libdir) symlink seems reasonable when installing to an empty staging root, but on a real-world installation it'd point to /usr/lib(something) which has hundreds or thousands of other unrelated libraries. My memory is hazy on details but I think this caused an actual problem with something because I ended up $(sdkdir)/lib an actual directory populated with symlinks to the individual DPDK libraries. - Panu -
[dpdk-dev] [PATCH 03/10] mk: install a standard cutomizable tree
On 12/02/2015 01:25 PM, Thomas Monjalon wrote: > 2015-12-02 12:27, Panu Matilainen: >> On 12/02/2015 05:57 AM, Thomas Monjalon wrote: >>> The old installed tree was static and always had .config, includes and >>> libs in a RTE_TARGET subdirectory. There is no such directory anymore in >>> an installed SDK. So the top directory is checked. >>> But RTE_TARGET can still be used, especially to build an app with a >>> compiled but not installed SDK. >>> That's why both cases are looked for RTE_SDK_BIN. > [...] >>> The old usage of an installed SDK is: >>> make -C examples/helloworld RTE_SDK=$(readlink -m $DESTDIR) \ >>>RTE_TARGET=x86_64-native-linuxapp-gcc >>> RTE_TARGET can be specified but is useless now with an installed SDK. >>> The RTE_SDK directory must now point to a different path depending of >>> the installation. > [...] >>> + $(Q)$(call rte_mkdir,$(DESTDIR)$(sdkdir)) >>> + $(Q)cp -a $(BUILD_DIR)/.config $(DESTDIR)$(sdkdir) >>> + $(Q)cp -a $(RTE_SDK)/{mk,scripts} $(DESTDIR)$(sdkdir) >>> + $(Q)$(call rte_symlink, $(DESTDIR)$(includedir), >>> $(DESTDIR)$(sdkdir)/include) >>> + $(Q)$(call rte_symlink, $(DESTDIR)$(libdir), >>> $(DESTDIR)$(sdkdir)/lib) >> >> $(prefix)/share is supposed to be shareable across different >> architectures. Most of the content here is, but at least the lib symlink >> and .config file are not. > > The case you want to address is multilib 32/x32/64, right? That, plus modern Debian/Ubuntu supports multiarch, not just -lib. And then there's the pedantic side, ie to be in line with the FHS definition: http://www.pathname.com/fhs/pub/fhs-2.3.html#USRSHAREARCHITECTUREINDEPENDENTDATA > >> One option is to install .config and the symlinks within $(sdkdir)/$(T) >> directories, then it can be shared across architectures because each >> lives in their own directory. Another possibility is moving the whole >> sdk directory into a subdir in $(libdir), but that misses the >> opportunity to share across architectures (whether anybody actually >> cares is a whole other question :) > > Yes, I tried to remove the use of RTE_TARGET when building an example. > But we can keep it with a subdirectory in $(sdkdir). Just realized my suggestion $(sdkdir)/$(T) would not cut it because if T= is specified then this installation method wont be invoked at all :D So yeah, RTE_TARGET. Or perhaps just RTE_ARCH. Dunno if there's actual added value to having the whole target string there, but I wont mind either. > >> $(sdkdir)/lib -> $(libdir) symlink seems reasonable when installing to >> an empty staging root, but on a real-world installation it'd point to >> /usr/lib(something) which has hundreds or thousands of other unrelated >> libraries. My memory is hazy on details but I think this caused an >> actual problem with something because I ended up $(sdkdir)/lib an actual >> directory populated with symlinks to the individual DPDK libraries. > > I don't see the problem. > I suggest to keep it and see how to fix it if an issue is raised. The problem probably had to do with something external, like compiling OVS or pktgen, but ... this is too hand-wavy to worry about right now. Just wanted to mention it because I dont think I added the extra complexity in packaging just for fun. - Panu -
[dpdk-dev] [PATCH 03/10] mk: install a standard cutomizable tree
On 12/02/2015 03:05 PM, Thomas Monjalon wrote: > 2015-12-02 14:54, Panu Matilainen: >> On 12/02/2015 01:25 PM, Thomas Monjalon wrote: >>> 2015-12-02 12:27, Panu Matilainen: >>>> $(prefix)/share is supposed to be shareable across different >>>> architectures. Most of the content here is, but at least the lib symlink >>>> and .config file are not. >>> >>> The case you want to address is multilib 32/x32/64, right? >> >> That, plus modern Debian/Ubuntu supports multiarch, not just -lib. > > We do not support completely different platforms (e.g. ARM and x86) > with only one include directory. At the moment, only variants (32/64) > live together. Actually even the variants will run into problems because eg rte_config.h will differ between 32- and 64-bit. But that's a problem for another day, this is hardly the most pressing of issues :) > >>>> One option is to install .config and the symlinks within $(sdkdir)/$(T) >>>> directories, then it can be shared across architectures because each >>>> lives in their own directory. Another possibility is moving the whole >>>> sdk directory into a subdir in $(libdir), but that misses the >>>> opportunity to share across architectures (whether anybody actually >>>> cares is a whole other question :) >>> >>> Yes, I tried to remove the use of RTE_TARGET when building an example. >>> But we can keep it with a subdirectory in $(sdkdir). >> >> Just realized my suggestion $(sdkdir)/$(T) would not cut it because if >> T= is specified then this installation method wont be invoked at all :D > > I don't understand what you mean. > In my patchset, the installation is the same (except some default values) > with and without T=. Hmm, must've misuderstood/mixed up with something Marios patches do. Never mind, I was just mumbling out loud anyhow. > >> So yeah, RTE_TARGET. Or perhaps just RTE_ARCH. Dunno if there's actual >> added value to having the whole target string there, but I wont mind either. > > RTE_TARGET is a safe choice for future. > Nod. - Panu -
[dpdk-dev] [PATCH 1/4] vhost: handle VHOST_USER_SET_LOG_BASE request
On 12/02/2015 05:43 AM, Yuanhan Liu wrote: > VHOST_USER_SET_LOG_BASE request is used to tell the backend (dpdk > vhost-user) where we should log dirty pages, and how big the log > buffer is. > > This request introduces a new payload: > > typedef struct VhostUserLog { > uint64_t mmap_size; > uint64_t mmap_offset; > } VhostUserLog; > > Also, a fd is delivered from QEMU by ancillary data. > > With those info given, an area of memory is mmaped, assigned > to dev->log_base, for logging dirty pages. > > Signed-off-by: Yuanhan Liu > --- > lib/librte_vhost/rte_virtio_net.h | 2 ++ > lib/librte_vhost/vhost_user/vhost-net-user.c | 7 - > lib/librte_vhost/vhost_user/vhost-net-user.h | 6 > lib/librte_vhost/vhost_user/virtio-net-user.c | 44 > +++ > lib/librte_vhost/vhost_user/virtio-net-user.h | 1 + > 5 files changed, 59 insertions(+), 1 deletion(-) > > diff --git a/lib/librte_vhost/rte_virtio_net.h > b/lib/librte_vhost/rte_virtio_net.h > index 5687452..416dac2 100644 > --- a/lib/librte_vhost/rte_virtio_net.h > +++ b/lib/librte_vhost/rte_virtio_net.h > @@ -127,6 +127,8 @@ struct virtio_net { > #define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ) > charifname[IF_NAME_SZ]; /**< Name of the tap > device or socket path. */ > uint32_tvirt_qp_nb; /**< number of queue pair we > have allocated */ > + uint64_tlog_size; /**< Size of log area */ > + uint8_t *log_base; /**< Where dirty pages are > logged */ > void*priv; /**< private context */ > struct vhost_virtqueue *virtqueue[VHOST_MAX_QUEUE_PAIRS * 2]; /**< > Contains all virtqueue information. */ > } __rte_cache_aligned; This (and other changes in patch 2 breaks the librte_vhost ABI again, so you'd need to at least add a deprecation note to 2.2 to be able to do it in 2.3 at all according to the ABI policy. Perhaps a better option would be adding some padding to the structs now for 2.2 since the vhost ABI is broken there anyway. That would at least give a chance to keep it compatible from 2.2 to 2.3. - Panu -
[dpdk-dev] [PATCH 1/4] vhost: handle VHOST_USER_SET_LOG_BASE request
On 12/02/2015 04:31 PM, Yuanhan Liu wrote: > On Wed, Dec 02, 2015 at 03:53:45PM +0200, Panu Matilainen wrote: >> On 12/02/2015 05:43 AM, Yuanhan Liu wrote: >>> VHOST_USER_SET_LOG_BASE request is used to tell the backend (dpdk >>> vhost-user) where we should log dirty pages, and how big the log >>> buffer is. >>> >>> This request introduces a new payload: >>> >>> typedef struct VhostUserLog { >>> uint64_t mmap_size; >>> uint64_t mmap_offset; >>> } VhostUserLog; >>> >>> Also, a fd is delivered from QEMU by ancillary data. >>> >>> With those info given, an area of memory is mmaped, assigned >>> to dev->log_base, for logging dirty pages. >>> >>> Signed-off-by: Yuanhan Liu >>> --- >>> lib/librte_vhost/rte_virtio_net.h | 2 ++ >>> lib/librte_vhost/vhost_user/vhost-net-user.c | 7 - >>> lib/librte_vhost/vhost_user/vhost-net-user.h | 6 >>> lib/librte_vhost/vhost_user/virtio-net-user.c | 44 >>> +++ >>> lib/librte_vhost/vhost_user/virtio-net-user.h | 1 + >>> 5 files changed, 59 insertions(+), 1 deletion(-) >>> >>> diff --git a/lib/librte_vhost/rte_virtio_net.h >>> b/lib/librte_vhost/rte_virtio_net.h >>> index 5687452..416dac2 100644 >>> --- a/lib/librte_vhost/rte_virtio_net.h >>> +++ b/lib/librte_vhost/rte_virtio_net.h >>> @@ -127,6 +127,8 @@ struct virtio_net { >>> #define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ) >>> charifname[IF_NAME_SZ]; /**< Name of the tap >>> device or socket path. */ >>> uint32_tvirt_qp_nb; /**< number of queue pair we >>> have allocated */ >>> + uint64_tlog_size; /**< Size of log area */ >>> + uint8_t *log_base; /**< Where dirty pages are >>> logged */ >>> void*priv; /**< private context */ >>> struct vhost_virtqueue *virtqueue[VHOST_MAX_QUEUE_PAIRS * 2]; /**< >>> Contains all virtqueue information. */ >>> } __rte_cache_aligned; >> >> This (and other changes in patch 2 breaks the librte_vhost ABI >> again, so you'd need to at least add a deprecation note to 2.2 to be >> able to do it in 2.3 at all according to the ABI policy. > > I was thinking that adding a new field (instead of renaming it or > removing it) isn't an ABI break. So, I was wrong? Adding or removing a field in the middle of a public struct is always an ABI break. Adding to the end often is too, but not always. Renaming a field is an API break but not an ABI break - the compiler cares but the cpu does not. >> >> Perhaps a better option would be adding some padding to the structs >> now for 2.2 since the vhost ABI is broken there anyway. That would >> at least give a chance to keep it compatible from 2.2 to 2.3. > > It will not be compatible, unless we add exact same fields (not > something like uint8_t pad[xx]). Otherwise, the pad field renaming > is also an ABI break, right? There's no ABI (or API) break in changing reserved unused fields to something else, as long as care is taken with sizes and alignment. In any case padding is best added to the end of a struct to minimize risks and keep things simple. - Panu - > > Thomas, should I write an ABI deprecation note? Can I make it for > v2.2 release If I make one tomorrow? (Sorry that I'm not awared > of that it would be an ABI break). > > --yliu >
[dpdk-dev] [PATCH] scripts: support any legal git revisions as abi validation range
In addition to git tags, support validating abi between any legal gitrevisions(7) syntaxes, such as "validate-abi.sh . -1 " "validate-abi.sh master mybrach " etc in addition to validating between tags. Makes it easier to run the validator for in-development work. Signed-off-by: Panu Matilainen --- scripts/validate-abi.sh | 26 -- 1 file changed, 16 insertions(+), 10 deletions(-) diff --git a/scripts/validate-abi.sh b/scripts/validate-abi.sh index 4476433..0e3ccd7 100755 --- a/scripts/validate-abi.sh +++ b/scripts/validate-abi.sh @@ -43,16 +43,15 @@ log() { } validate_tags() { - git tag -l | grep -q "$TAG1" - if [ $? -ne 0 ] + + if [ -z "$HASH1" ] then - echo "$TAG1 is invalid" + echo "invalid revision: $TAG1" return fi - git tag -l | grep -q "$TAG2" - if [ $? -ne 0 ] + if [ -z "$HASH2" ] then - echo "$TAG2 is invalid" + echo "invalid revision: $TAG2" return fi } @@ -112,6 +111,9 @@ then cleanup_and_exit 1 fi +HASH1=$(git show -s --format=%H "$TAG1" -- 2> /dev/null) +HASH2=$(git show -s --format=%H "$TAG2" -- 2> /dev/null) + # Make sure our tags exist res=$(validate_tags) if [ -n "$res" ] @@ -120,6 +122,10 @@ then cleanup_and_exit 1 fi +# Make hashes available in output for non-local reference +TAG1="$TAG1 ($HASH1)" +TAG2="$TAG2 ($HASH2)" + ABICHECK=`which abi-compliance-checker 2>/dev/null` if [ $? -ne 0 ] then @@ -152,7 +158,7 @@ cd $(dirname $0)/.. log "INFO" "Checking out version $TAG1 of the dpdk" # Move to the old version of the tree -git checkout $TAG1 +git checkout $HASH1 # Make sure we configure SHARED libraries # Also turn off IGB and KNI as those require kernel headers to build @@ -185,7 +191,7 @@ cd $TARGET/lib log "INFO" "COLLECTING ABI INFORMATION FOR $TAG1" for i in `ls *.so` do - $ABIDUMP $i -o $ABI_DIR/$i-ABI-0.dump -lver $TAG1 + $ABIDUMP $i -o $ABI_DIR/$i-ABI-0.dump -lver $HASH1 done cd ../.. @@ -194,7 +200,7 @@ git clean -f -d git reset --hard # Move to the new version of the tree log "INFO" "Checking out version $TAG2 of the dpdk" -git checkout $TAG2 +git checkout $HASH2 # Make sure we configure SHARED libraries # Also turn off IGB and KNI as those require kernel headers to build @@ -220,7 +226,7 @@ cd $TARGET/lib log "INFO" "COLLECTING ABI INFORMATION FOR $TAG2" for i in `ls *.so` do - $ABIDUMP $i -o $ABI_DIR/$i-ABI-1.dump -lver $TAG2 + $ABIDUMP $i -o $ABI_DIR/$i-ABI-1.dump -lver $HASH2 done cd ../.. -- 2.5.0
[dpdk-dev] [PATCH 1/4] vhost: handle VHOST_USER_SET_LOG_BASE request
On 12/02/2015 05:09 PM, Yuanhan Liu wrote: > On Wed, Dec 02, 2015 at 04:48:14PM +0200, Panu Matilainen wrote: > ... >>>>> diff --git a/lib/librte_vhost/rte_virtio_net.h >>>>> b/lib/librte_vhost/rte_virtio_net.h >>>>> index 5687452..416dac2 100644 >>>>> --- a/lib/librte_vhost/rte_virtio_net.h >>>>> +++ b/lib/librte_vhost/rte_virtio_net.h >>>>> @@ -127,6 +127,8 @@ struct virtio_net { >>>>> #define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ) >>>>> charifname[IF_NAME_SZ]; /**< Name of >>>>> the tap device or socket path. */ >>>>> uint32_tvirt_qp_nb; /**< number of queue >>>>> pair we have allocated */ >>>>> + uint64_tlog_size; /**< Size of log area */ >>>>> + uint8_t *log_base; /**< Where dirty pages are >>>>> logged */ >>>>> void*priv; /**< private context */ >>>>> struct vhost_virtqueue *virtqueue[VHOST_MAX_QUEUE_PAIRS * 2]; >>>>> /**< Contains all virtqueue information. */ >>>>> } __rte_cache_aligned; >>>> >>>> This (and other changes in patch 2 breaks the librte_vhost ABI >>>> again, so you'd need to at least add a deprecation note to 2.2 to be >>>> able to do it in 2.3 at all according to the ABI policy. >>> >>> I was thinking that adding a new field (instead of renaming it or >>> removing it) isn't an ABI break. So, I was wrong? >> >> Adding or removing a field in the middle of a public struct is >> always an ABI break. Adding to the end often is too, but not always. >> Renaming a field is an API break but not an ABI break - the compiler >> cares but the cpu does not. > > Good to know. Thanks. > >> >>>> >>>> Perhaps a better option would be adding some padding to the structs >>>> now for 2.2 since the vhost ABI is broken there anyway. That would >>>> at least give a chance to keep it compatible from 2.2 to 2.3. >>> >>> It will not be compatible, unless we add exact same fields (not >>> something like uint8_t pad[xx]). Otherwise, the pad field renaming >>> is also an ABI break, right? >> >> There's no ABI (or API) break in changing reserved unused fields to >> something else, as long as care is taken with sizes and alignment. > > as long as we don't reference the reserved unused fields? That would be the definition of an unused field I think :) Call it "reserved" if you want, it doesn't really matter as long as its clear its something you shouldn't be using. > >> In any case padding is best added to the end of a struct to minimize >> risks and keep things simple. > > The thing is that isn't it a bit aweful to (always) add pads to > the end of a struct, especially when you don't know how many > need to be padded? Then you pad for what you think you need, plus a bit extra, and maybe some more for others who might want to extend it. What is a reasonable amount needs deciding case by case - if a struct is alloced in the millions then be (very) conservative, but if there are one or 50 such structs within an app lifetime then who cares if its bit larger? And yeah padding may be annoying, but that's pretty much the only option in a project where most of the structs are out in the open. - Panu - > > --yliu >
[dpdk-dev] [PATCH v4 0/2] Add support for driver directories
On 12/03/2015 04:26 AM, Thomas Monjalon wrote: > 2015-12-02 18:07, Stephen Hemminger: >> On Thu, 12 Nov 2015 16:52:32 +0100 >> Thomas Monjalon wrote: >> >>>>> This mini-series adds support for driver directory concept >>>>> based on idea by Thomas Monjalon back in February: >>>>> http://dpdk.org/ml/archives/dev/2015-February/013285.html >>>>> >>>>> In the process FreeBSD also gains plugin support (but untested). >>>>> >>>>> v4: - introduce error-early behavior for invalid plugin paths >>>>> - support directories via the existing -d option instead of adding >>>>> new >>>>> >>>>> v3: - merge the first commits >>>>> >>>>> v2: - move code to eal/common >>>>> - add bsd support >>>>> >>>>> Panu Matilainen (2): >>>>>eal: move plugin loading to eal/common >>>>>eal: add support for driver directory concept >>>> >>>> >>>> checkpatch complains for some indent problem (Thomas, can you fix this ?), >>>> but the rest looks good to me. >>>> >>>> Acked-by: David Marchand >>>> >>>> Thanks Panu. >>> >>> Applied, thanks >> >> This patch introduces a new issue reported by Coverity. >> >> The root cause of the problem is that you are checking that it s a directory >> first with stat >> then calling dlopen(). I malicious entity could get between the stat and the >> dlopen. > > I think it is a false positive. > The aim of loading every files in the directory is out of a security scope > IMHO. > Yes its a false positive. The security aspect relates to world-writable directories and even in there the problem is usually "test for existence before creation", this is neither (if somebody routinely loads their critical device drivers from /tmp on a system they have bigger problems than this) If somebody changes a file to a directory or vice versa then the consecutive readdir() or dlopen() on that entry will just fail, end of story. And if somebody has the permission to change entries in that directory they dont have to bother with trying to time their changes between stat() and dlopen(). Sure it could just call dlopen() on everything and if it fails try readdir() on it. Matter of style, I dislike blindly stumbling and crashing when I can simply take a look to see whether its a door, a window or a wall :) - Panu -
[dpdk-dev] [PATCH v2] mk: fix compile error and ABI versioning for combined shared library
On 12/03/2015 03:22 AM, Ferruh Yigit wrote: > Fixes following error (observed when versioning macros used): >LD libdpdk.so >/usr/bin/ld: /root/dpdk/build/lib/libdpdk.so: version node not found >for symbol @DPDK_x.y > > Also resulting combined library contains symbol version information: > $ readelf -a build/lib/libdpdk.so | grep rte_eal_ | grep @ | head > <...>GLOBAL DEFAULT 12 rte_eal_alarm_set@@DPDK_2.0 > <...>GLOBAL DEFAULT 12 rte_eal_pci_write_config@@DPDK_2.1 > <...>GLOBAL DEFAULT 12 rte_eal_remote_launch@@DPDK_2.0 > ... > > Versioning fixed by merging all version scripts into one automatically and > feeding it to final library. > > Signed-off-by: Ferruh Yigit > --- > drivers/net/Makefile | 3 +++ > lib/Makefile | 3 +++ > mk/rte.sdkbuild.mk| 2 +- > mk/rte.sharelib.mk| 3 +++ > scripts/merge_maps.sh | 29 + > 5 files changed, 39 insertions(+), 1 deletion(-) > create mode 100755 scripts/merge_maps.sh > > diff --git a/drivers/net/Makefile b/drivers/net/Makefile > index cddcd57..d3c865b 100644 > --- a/drivers/net/Makefile > +++ b/drivers/net/Makefile > @@ -51,5 +51,8 @@ DIRS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio > DIRS-$(CONFIG_RTE_LIBRTE_VMXNET3_PMD) += vmxnet3 > DIRS-$(CONFIG_RTE_LIBRTE_PMD_XENVIRT) += xenvirt > > +ifeq ($(COMBINED_BUILD),1) > include $(RTE_SDK)/mk/rte.sharelib.mk > +endif > + > include $(RTE_SDK)/mk/rte.subdir.mk > diff --git a/lib/Makefile b/lib/Makefile > index ef172ea..d0f7fb8 100644 > --- a/lib/Makefile > +++ b/lib/Makefile > @@ -64,5 +64,8 @@ DIRS-$(CONFIG_RTE_LIBRTE_KNI) += librte_kni > DIRS-$(CONFIG_RTE_LIBRTE_IVSHMEM) += librte_ivshmem > endif > > +ifeq ($(COMBINED_BUILD),1) > include $(RTE_SDK)/mk/rte.sharelib.mk > +endif > + > include $(RTE_SDK)/mk/rte.subdir.mk > diff --git a/mk/rte.sdkbuild.mk b/mk/rte.sdkbuild.mk > index 38ec7bd..d4e3abf 100644 > --- a/mk/rte.sdkbuild.mk > +++ b/mk/rte.sdkbuild.mk > @@ -94,7 +94,7 @@ $(ROOTDIRS-y): > @echo "== Build $@" > $(Q)$(MAKE) S=$@ -f $(RTE_SRCDIR)/$@/Makefile -C $(BUILDDIR)/$@ all > @if [ $@ = drivers -a $(CONFIG_RTE_BUILD_COMBINE_LIBS) = y ]; then \ > - $(MAKE) -f $(RTE_SDK)/lib/Makefile sharelib; \ > + COMBINED_BUILD=1 $(MAKE) -f $(RTE_SDK)/lib/Makefile sharelib; \ > fi > > %_clean: > diff --git a/mk/rte.sharelib.mk b/mk/rte.sharelib.mk > index 7bb7219..76ead09 100644 > --- a/mk/rte.sharelib.mk > +++ b/mk/rte.sharelib.mk > @@ -40,6 +40,8 @@ LIB_ONE := lib$(RTE_LIBNAME).so > else > LIB_ONE := lib$(RTE_LIBNAME).a > endif > +COMBINED_MAP=$(BUILDDIR)/lib/libdpdk.map > +CPU_LDFLAGS += --version-script=$(COMBINED_MAP) > endif > > .PHONY:sharelib > @@ -79,6 +81,7 @@ ifeq ($(CONFIG_RTE_BUILD_COMBINE_LIBS),y) > ifeq ($(CONFIG_RTE_BUILD_SHARED_LIB),y) > $(LIB_ONE): FORCE > @[ -d $(dir $@) ] || mkdir -p $(dir $@) > + @$(SRCDIR)/scripts/merge_maps.sh > $(COMBINED_MAP) > $(O_TO_S_DO) > else > $(LIB_ONE): FORCE > diff --git a/scripts/merge_maps.sh b/scripts/merge_maps.sh > new file mode 100755 > index 000..bc40dc8 > --- /dev/null > +++ b/scripts/merge_maps.sh > @@ -0,0 +1,29 @@ > +#!/bin/sh > + > +FILES=$(find $RTE_SDK -name "*.map" | grep -v build) > +SYMBOLS=$(grep -h "{" $FILES | sort -u | sed 's/{//') > + > +first=0 > +prev_sym="none" > + > +for s in $SYMBOLS; do > + echo "$s {" > + echo "global:" > + echo "" > + for f in $FILES; do > + sed -n "/$s {/,/}/p" $f | sed '/^$/d' | grep -v global | grep > -v local | sed '1d' | sed '$d' > + done | sort -u > + echo "" > + if [ $first -eq 0 ]; then > + first=1; > + echo "local: *;"; > + fi > + if [ "$prev_sym" == "none" ]; then > + echo "};"; > + prev_sym=$s; > + else > + echo "} $prev_sym;"; > + prev_sym=$s; > + fi > + echo "" > +done > I'd still rather see the combined library replaced with a linker script, but as long as it is there then +1 for this: with symbol versioning in place, applications linked to it more likely refuse to start than randomly crash when ABI changes, internal symbols are hidden etc. And doesn't require manual updating of two maps since its all scripted. - Panu -
[dpdk-dev] [PATCH v2 10/12] mk: install examples
On 12/03/2015 07:02 AM, Thomas Monjalon wrote: > The examples are part of the installed documentation. > > Signed-off-by: Thomas Monjalon > --- > mk/rte.sdkinstall.mk | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/mk/rte.sdkinstall.mk b/mk/rte.sdkinstall.mk > index 902a933..13fa819 100644 > --- a/mk/rte.sdkinstall.mk > +++ b/mk/rte.sdkinstall.mk > @@ -154,3 +154,4 @@ ifneq ($(wildcard $O/doc/*/*/*pdf),) > $(Q)$(call rte_mkdir, $(DESTDIR)$(docdir)/guides) > $(Q)cp -a $O/doc/*/*/*pdf $(DESTDIR)$(docdir)/guides > endif > + $(Q)cp -a $(RTE_SDK)/examples $(DESTDIR)$(datadir) > If examples are considered documentation (and I agree on that), then shouldn't they be installed in $(docdir) instead? - Panu -
[dpdk-dev] [PATCH v2 00/12] standard make install
On 12/03/2015 07:01 AM, Thomas Monjalon wrote: > Following the recent discussions, this is a proposal to have a standard > installation process while keeping compatibility with most of the old > behaviours. > > v2 changes: > - fix default build dir > - RTE_TARGET subdir in $(sdkdir). > - better kerneldir defaults > - fix dpdk_nic_bind symlink > - always install doc if generated > - doc > - pkg/dpdk.spec > Except for the minor nit about examples location (one could bikeshed on things like these forever), seems fine to me and quick-n-dirty conversion of my own spec didn't reveal any nasty surprises. It also appears more comprehensive and integrated with other workflows than the competing patches so FWIW, you have my ACK :) - Panu -
[dpdk-dev] [PATCH v2 10/12] mk: install examples
On 12/03/2015 03:32 PM, Thomas Monjalon wrote: > 2015-12-03 15:19, Panu Matilainen: >> On 12/03/2015 07:02 AM, Thomas Monjalon wrote: >>> The examples are part of the installed documentation. >>> >>> Signed-off-by: Thomas Monjalon >>> --- >>>mk/rte.sdkinstall.mk | 1 + >>>1 file changed, 1 insertion(+) >>> >>> diff --git a/mk/rte.sdkinstall.mk b/mk/rte.sdkinstall.mk >>> index 902a933..13fa819 100644 >>> --- a/mk/rte.sdkinstall.mk >>> +++ b/mk/rte.sdkinstall.mk >>> @@ -154,3 +154,4 @@ ifneq ($(wildcard $O/doc/*/*/*pdf),) >>> $(Q)$(call rte_mkdir, $(DESTDIR)$(docdir)/guides) >>> $(Q)cp -a $O/doc/*/*/*pdf $(DESTDIR)$(docdir)/guides >>>endif >>> + $(Q)cp -a $(RTE_SDK)/examples $(DESTDIR)$(datadir) >>> >> >> If examples are considered documentation (and I agree on that), then >> shouldn't they be installed in $(docdir) instead? > > I was hesitating. I think it's strange to install some code in > /usr/share/doc/. > It's not really important and may be changed easily at any time. > Installing source code anywhere at all seems a bit strange, being in doc seems like the least-worst alternative to me :) But like said, no big deal. - Panu -
[dpdk-dev] [PATCH] scripts: support any legal git revisions as abi validation range
On 12/03/2015 02:14 PM, Mcnamara, John wrote: >> -Original Message- >> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Panu Matilainen >> Sent: Wednesday, December 2, 2015 4:51 PM >> To: dev at dpdk.org >> Subject: [dpdk-dev] [PATCH] scripts: support any legal git revisions as >> abi validation range >> >> In addition to git tags, support validating abi between any legal >> gitrevisions(7) syntaxes, such as "validate-abi.sh . -1 " >> "validate-abi.sh master mybrach " etc in addition to validating >> between tags. Makes it easier to run the validator for in-development >> work. > > Hi Panu, > > +1 for this. > > You might also change the ABI validation section of the docs to go along > with this. Something like the patch below. If not I'll submit it > afterwards. Good points, including changing the usage message to REV instead of TAG. I'll send an improved version based on this, thanks. > > Also, if someone has some bandwidth it would be good to add an option > to pass -j with an optional number to "make" in the script. Can do, although I'm still waiting fo my previous, semi-related validate-abi patches from September to be applied... - Panu -
[dpdk-dev] [PATCH] scripts: support any legal git revisions as abi validation range
On 12/03/2015 03:28 PM, Thomas Monjalon wrote: > 2015-12-03 12:14, Mcnamara, John: >> Also, if someone has some bandwidth it would be good to add an option >> to pass -j with an optional number to "make" in the script. > > We can use -j without any number: > "make will not limit the number of jobs that can run simultaneously". > It is a not so bad default. > Hmm, my memory associates an unlimited -j to make with sudden death by fork-bomb. But that was a long time ago and on a different codebase (the kernel maybe), with DPDK on current hardware it doesn't seem that bad at all. OTOH we can also simply ask the system for a reasonable value, eg $ /usr/bin/getconf _NPROCESSORS_ONLN - Panu -
[dpdk-dev] [PATCH v2] scripts: support any legal git revisions as abi validation range
In addition to git tags, support validating abi between any legal gitrevisions(7) syntaxes, such as "validate-abi.sh -1 . " "validate-abi.sh master mybranch " etc in addition to validating between tags. Makes it easier to run the validator for in-development work. Signed-off-by: Panu Matilainen Acked-by: Neil Horman --- v2 changes: - update usage and error messages to match new behavior - update documentation too (as suggested by John McNamara) doc/guides/contributing/versioning.rst | 20 --- scripts/validate-abi.sh| 36 -- 2 files changed, 34 insertions(+), 22 deletions(-) diff --git a/doc/guides/contributing/versioning.rst b/doc/guides/contributing/versioning.rst index 653c7d0..015ebb7 100644 --- a/doc/guides/contributing/versioning.rst +++ b/doc/guides/contributing/versioning.rst @@ -468,16 +468,22 @@ utilities which can be installed via a package manager. For example:: The syntax of the ``validate-abi.sh`` utility is:: - ./scripts/validate-abi.sh + ./scripts/validate-abi.sh -Where ``TAG1`` and ``TAG2`` are valid git tags on the local repo and target is -the usual DPDK compilation target. +Where ``REV1`` and ``REV2`` are valid gitrevisions(7) +https://www.kernel.org/pub/software/scm/git/docs/gitrevisions.html +on the local repo and target is the usual DPDK compilation target. -For example to test the current committed HEAD against a previous release tag -we could add a temporary tag and run the utility as follows:: +For example: - git tag MY_TEMP_TAG - ./scripts/validate-abi.sh v2.0.0 MY_TEMP_TAG x86_64-native-linuxapp-gcc + # Check between the previous and latest commit: + ./scripts/validate-abi.sh HEAD~1 HEAD x86_64-native-linuxapp-gcc + + # Check between two tags: + ./scripts/validate-abi.sh v2.0.0 v2.1.0 x86_64-native-linuxapp-gcc + + # Check between git master and local topic-branch "vhost-hacking": + ./scripts/validate-abi.sh master vhost-hacking x86_64-native-linuxapp-gcc After the validation script completes (it can take a while since it need to compile both tags) it will create compatibility reports in the diff --git a/scripts/validate-abi.sh b/scripts/validate-abi.sh index 4476433..e49c425 100755 --- a/scripts/validate-abi.sh +++ b/scripts/validate-abi.sh @@ -33,7 +33,7 @@ TARGET=$3 ABI_DIR=`mktemp -d -p /tmp ABI.XX` usage() { - echo "$0 " + echo "$0 " } log() { @@ -43,16 +43,15 @@ log() { } validate_tags() { - git tag -l | grep -q "$TAG1" - if [ $? -ne 0 ] + + if [ -z "$HASH1" ] then - echo "$TAG1 is invalid" + echo "invalid revision: $TAG1" return fi - git tag -l | grep -q "$TAG2" - if [ $? -ne 0 ] + if [ -z "$HASH2" ] then - echo "$TAG2 is invalid" + echo "invalid revision: $TAG2" return fi } @@ -60,12 +59,12 @@ validate_tags() { validate_args() { if [ -z "$TAG1" ] then - echo "Must Specify TAG1" + echo "Must Specify REV1" return fi if [ -z "$TAG2" ] then - echo "Must Specify TAG2" + echo "Must Specify REV2" return fi if [ -z "$TARGET" ] @@ -112,6 +111,9 @@ then cleanup_and_exit 1 fi +HASH1=$(git show -s --format=%H "$TAG1" -- 2> /dev/null) +HASH2=$(git show -s --format=%H "$TAG2" -- 2> /dev/null) + # Make sure our tags exist res=$(validate_tags) if [ -n "$res" ] @@ -120,6 +122,10 @@ then cleanup_and_exit 1 fi +# Make hashes available in output for non-local reference +TAG1="$TAG1 ($HASH1)" +TAG2="$TAG2 ($HASH2)" + ABICHECK=`which abi-compliance-checker 2>/dev/null` if [ $? -ne 0 ] then @@ -135,8 +141,8 @@ then fi log "INFO" "We're going to check and make sure that applications built" -log "INFO" "against DPDK DSOs from tag $TAG1 will still run when executed" -log "INFO" "against DPDK DSOs built from tag $TAG2." +log "INFO" "against DPDK DSOs from version $TAG1 will still run when executed" +log "INFO" "against DPDK DSOs built from version $TAG2." log "INFO" "" # Check to make sure we have a clean tree @@ -152,7 +158,7 @@ cd $(dirname $0)/.. log "INFO" "Checking out version $TAG1 of the dpdk" # Move to the old version of the tree -git checkout $TAG1 +git checkout $HASH1 # Make sure we configure SHARED libraries # Also turn off IGB and KNI as those require kernel headers to build @@ -185,7 +191,7 @@ cd $TARGET/lib log &qu
[dpdk-dev] [announce] driverctl: utility for persistent alternative driver binding
Hi all, While this is not directly related to DPDK or OVS, it is potentially useful for users of both, so excuse me for cross-posting. Quoting from the project README (for the full text see http://laiskiainen.org/git/?p=driverctl.git;a=blob_plain;f=README) > driverctl is a tool for manipulating and inspecting the system > device driver choices. > > Devices are normally assigned to their sole designated kernel driver > by default. However in some situations it may be desireable to > override that default, for example to try an older driver to > work around a regression in a driver or to try an experimental > alternative driver. Another common use-case is pass-through > drivers and driver stubs to allow userspace to drive the device, > such as in case of virtualization. > > driverctl integrates with udev to support overriding > driver selection for both cold- and hotplugged devices from the > moment of discovery, but can also change already assigned drivers, > assuming they are not in use by the system. The driver overrides > created by driverctl are persistent across system reboots > by default. > > Usage > - > > Find devices currently driven by ixgbe driver: > > # driverctl -v list-devices | grep ixgbe > :01:00.0 ixgbe (Ethernet 10G 4P X520/I350 rNDC) > :01:00.1 ixgbe (Ethernet 10G 4P X520/I350 rNDC) > > Change them to use the vfio-pci driver: > # driverctl set-override :01:00.0 vfio-pci > # driverctl set-override :01:00.1 vfio-pci > > Find devices with driver overrides: > # driverctl -v list-devices|grep \\* > :01:00.0 vfio-pci [*] (Ethernet 10G 4P X520/I350 rNDC) > :01:00.1 vfio-pci [*] (Ethernet 10G 4P X520/I350 rNDC) > > Remove the override from slot :01:00.1: > # driverctl unset-override :01:00.1 DPDK of course has its own dpdk_nic_bind(.py) tool for this purpose, the main differences to driverctl are: - driverctl bindings are persistent across system boots - driverctl bindings take place immediately on cold- and hotplug - driverctl is a generic tool not limited to network adapters - dpdk_nic_bind being a special purpose tool has many more sanity checks for its supported use-cases - dpdk_nic_bind supports binding multiple NICs at once The project currently lives at http://laiskiainen.org/git/?p=driverctl.git Feedback, patches etc are welcome. - Panu -
[dpdk-dev] [PATCH 1/4] vhost: handle VHOST_USER_SET_LOG_BASE request
On 12/07/2015 01:07 AM, Thomas Monjalon wrote: > 2015-12-02 15:53, Panu Matilainen: >> This (and other changes in patch 2 breaks the librte_vhost ABI again, so >> you'd need to at least add a deprecation note to 2.2 to be able to do it >> in 2.3 at all according to the ABI policy. >> >> Perhaps a better option would be adding some padding to the structs now >> for 2.2 since the vhost ABI is broken there anyway. That would at least >> give a chance to keep it compatible from 2.2 to 2.3. > > Please could you point where the vhost ABI is broken in 2.2? > The vhost ABI break was announced for DPDK 2.2 in commit 3c848bd7b1c6f4f681b833322a748fdefbb5fb2d: > commit 3c848bd7b1c6f4f681b833322a748fdefbb5fb2d > Author: Ouyang Changchun > Date: Tue Jun 16 09:38:43 2015 +0800 > > doc: announce ABI changes for vhost-user multiple queues > > It announces the planned ABI changes for vhost-user multiple > queues feature on v2.2. > > Signed-off-by: Changchun Ouyang > Acked-by: Neil Horman So the ABI process was properly followed, except for actually bumping LIBABIVER. Bumping LIBABIVER is mentioned in doc/guides/contributing/versioning.rst but it doesn't specify *when* this should be done, eg should the first patch breaking the ABI bump it or should it done be shortly before the next stable release, or something else. As it is, it seems a bit too easy to simply forget. - Panu -
[dpdk-dev] [PATCH v2] mk: pass EXTRA_CFLAGS to AUTO_CPUFLAGS to enable local modifications
On 12/04/2015 08:53 PM, Thomas Monjalon wrote: We have encountered a CPU where the AES-NI instruction set is disabled due to export restrictions. Since the build machine and target machine is different, using -native configs doesn't work, and on this CPU, the application refuses to run due to the AES CPU flags being amiss. The patch passes EXTRA_CFLAGS to the figure-out-cpu-flags helper, which allows us to add -mno-aes to the compile flags and resolve this problem. Signed-off-by: Simon Kagstrom >> >> Acked-by: Olivier Matz > > Applied, thanks > This causes some complications on Fedora/RHEL due to fairly complex interactions with -Werror, -Wall and -Wformat-security mixup between upstream- and distro default compiler flags. More specifically, when EXTRA_CFLAGS contains warning flag manipulation this patch can cause mismatch between other options that are okay elsewhere in dpdk make. A simple fix is to pass WERROR_FLAGS to AUTO_CPUFLAGS too to counter this, ie diff --git a/mk/rte.cpuflags.mk b/mk/rte.cpuflags.mk index c6bb8de..28f203b 100644 --- a/mk/rte.cpuflags.mk +++ b/mk/rte.cpuflags.mk @@ -33,7 +33,7 @@ # used to set the RTE_CPUFLAG_* environment variables giving details # of what instruction sets the target cpu supports. -AUTO_CPUFLAGS := $(shell $(CC) $(MACHINE_CFLAGS) $(EXTRA_CFLAGS) -dM -E - < /dev/null) +AUTO_CPUFLAGS := $(shell $(CC) $(MACHINE_CFLAGS) $(WERROR_FLAGS) $(EXTRA_CFLAGS) -dM -E - < /dev/null) # adding flags to CPUFLAGS I can send an official patch if this seems acceptable. - Panu -
[dpdk-dev] [PATCH v4] ip_pipeline: add flow actions pipeline
On 12/07/2015 03:17 AM, Thomas Monjalon wrote: > 2015-11-18 17:09, Fan Zhang: >> Flow actions pipeline is an extension of flow-classification pipeline. >> Some of the operations of flow classification pipeline such as traffic >> metering/marking(for e.g. Single Rate Three Color Marker (srTCM), Two >> Rate Three Color Marker trTCM)), policer can be performed separately in >> flow action pipeline to avoid excessive computational burden on the CPU >> core running the flow-classification pipeline. The Flow action pipeline >> implements various function such as traffic metering, policer, stats. >> Traffic mettering can configured as per the required context, for >> examples- per user, per traffic class or both. These contexts can be >> applied by specifying parameters in configuration file as shown below; >> >> [PIPELINE1] >> type = FLOW_ACTIONS >> core = 1 >> pktq_in = RXQ0.0 RXQ1.0 RXQ2.0 RXQ3.0 >> pktq_out = TXQ0.0 TXQ1.0 TXQ2.0 TXQ3.0 >> n_flows = 65536 >> n_meters_per_flow = 1 >> flow_id_offset = 158 >> ip_hdr_offset = 142 >> color_offset = 64 >> >> The entries of flow and dscp tables of flow actions pipeline can be >> modified through command-line interface. The commands to add or delete >> entries to the flow table, DSCP(differentiated services code point) >> table and for statistics collection, etc have been included. The key >> functions such as Traffic Metering/marking and policer functions have >> been implemented as flow-table action handler. >> >> Signed-off-by: Jasvinder Singh >> Signed-off-by: Fan Zhang >> Acked-by: Cristian Dumitrescu > > Applied, thanks > The patch tries to include pipeline_flow_actions.h which doesn't exist, making the ip_pipeline example unbuildable. Seems like a case of forgotten "git add" when creating the patch... - Panu -
[dpdk-dev] [PATCH] mk: fix external shared library dependencies of drivers, round 2
Similar to commit 5f9115e58cc6f304ff4ade694cf5823d32887d1a, but for qat and mpipe drivers. The former did not exist when the previous patch was sent and latter I just missed. Fixes: 5f9115e58cc6 ("mk: fix shared library dependencies of drivers") Signed-off-by: Panu Matilainen --- drivers/crypto/qat/Makefile | 1 + drivers/net/mpipe/Makefile | 1 + mk/rte.app.mk | 8 +--- 3 files changed, 7 insertions(+), 3 deletions(-) diff --git a/drivers/crypto/qat/Makefile b/drivers/crypto/qat/Makefile index e027ff9..258c2d5 100644 --- a/drivers/crypto/qat/Makefile +++ b/drivers/crypto/qat/Makefile @@ -41,6 +41,7 @@ CFLAGS += $(WERROR_FLAGS) # external library include paths CFLAGS += -I$(SRCDIR)/qat_adf +LDLIBS += -lcrypto # library source files SRCS-$(CONFIG_RTE_LIBRTE_PMD_QAT) += qat_crypto.c diff --git a/drivers/net/mpipe/Makefile b/drivers/net/mpipe/Makefile index 552b303..654d191 100644 --- a/drivers/net/mpipe/Makefile +++ b/drivers/net/mpipe/Makefile @@ -32,6 +32,7 @@ include $(RTE_SDK)/mk/rte.vars.mk LIB = librte_pmd_mpipe.a CFLAGS += $(WERROR_FLAGS) -O3 +LDLIBS += -gxio EXPORT_MAP := rte_pmd_mpipe_version.map diff --git a/mk/rte.app.mk b/mk/rte.app.mk index 90ec33d..856cac0 100644 --- a/mk/rte.app.mk +++ b/mk/rte.app.mk @@ -108,6 +108,9 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += -libverbs _LDLIBS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += -libverbs _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_SZEDATA2) += -lsze2 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_XENVIRT)+= -lxenstore +# QAT PMD has a dependency on libcrypto (from openssl) for calculating HMAC precomputes +_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_QAT)+= -lcrypto +_LDLIBS-$(CONFIG_RTE_LIBRTE_MPIPE_PMD) += -lgxio endif # CONFIG_RTE_BUILD_COMBINE_LIBS or not CONFIG_RTE_BUILD_SHARED_LIBS _LDLIBS-y += --start-group @@ -144,14 +147,13 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_E1000_PMD) += -lrte_pmd_e1000 _LDLIBS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += -lrte_pmd_mlx4 _LDLIBS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += -lrte_pmd_mlx5 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_SZEDATA2) += -lrte_pmd_szedata2 -_LDLIBS-$(CONFIG_RTE_LIBRTE_MPIPE_PMD) += -lrte_pmd_mpipe -lgxio +_LDLIBS-$(CONFIG_RTE_LIBRTE_MPIPE_PMD) += -lrte_pmd_mpipe _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_RING) += -lrte_pmd_ring _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_PCAP) += -lrte_pmd_pcap _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_AF_PACKET) += -lrte_pmd_af_packet _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_NULL) += -lrte_pmd_null -# QAT PMD has a dependency on libcrypto (from openssl) for calculating HMAC precomputes -_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_QAT)+= -lrte_pmd_qat -lcrypto +_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_QAT)+= -lrte_pmd_qat # AESNI MULTI BUFFER is dependent on the IPSec_MB library _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_AESNI_MB) += -lrte_pmd_aesni_mb -- 2.5.0
[dpdk-dev] [PATCH v2] mk: fix external shared library dependencies of drivers, round 2
Similar to commit 5f9115e58cc6f304ff4ade694cf5823d32887d1a, but for qat and mpipe drivers. The former did not exist when the previous patch was sent and latter I just missed. Fixes: 5f9115e58cc6 ("mk: fix shared library dependencies of drivers") Signed-off-by: Panu Matilainen --- v2: - typo/copy-paste error -gxio -> -lgxio drivers/crypto/qat/Makefile | 1 + drivers/net/mpipe/Makefile | 1 + mk/rte.app.mk | 8 +--- 3 files changed, 7 insertions(+), 3 deletions(-) diff --git a/drivers/crypto/qat/Makefile b/drivers/crypto/qat/Makefile index e027ff9..258c2d5 100644 --- a/drivers/crypto/qat/Makefile +++ b/drivers/crypto/qat/Makefile @@ -41,6 +41,7 @@ CFLAGS += $(WERROR_FLAGS) # external library include paths CFLAGS += -I$(SRCDIR)/qat_adf +LDLIBS += -lcrypto # library source files SRCS-$(CONFIG_RTE_LIBRTE_PMD_QAT) += qat_crypto.c diff --git a/drivers/net/mpipe/Makefile b/drivers/net/mpipe/Makefile index 552b303..46f046d 100644 --- a/drivers/net/mpipe/Makefile +++ b/drivers/net/mpipe/Makefile @@ -32,6 +32,7 @@ include $(RTE_SDK)/mk/rte.vars.mk LIB = librte_pmd_mpipe.a CFLAGS += $(WERROR_FLAGS) -O3 +LDLIBS += -lgxio EXPORT_MAP := rte_pmd_mpipe_version.map diff --git a/mk/rte.app.mk b/mk/rte.app.mk index 90ec33d..856cac0 100644 --- a/mk/rte.app.mk +++ b/mk/rte.app.mk @@ -108,6 +108,9 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += -libverbs _LDLIBS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += -libverbs _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_SZEDATA2) += -lsze2 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_XENVIRT)+= -lxenstore +# QAT PMD has a dependency on libcrypto (from openssl) for calculating HMAC precomputes +_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_QAT)+= -lcrypto +_LDLIBS-$(CONFIG_RTE_LIBRTE_MPIPE_PMD) += -lgxio endif # CONFIG_RTE_BUILD_COMBINE_LIBS or not CONFIG_RTE_BUILD_SHARED_LIBS _LDLIBS-y += --start-group @@ -144,14 +147,13 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_E1000_PMD) += -lrte_pmd_e1000 _LDLIBS-$(CONFIG_RTE_LIBRTE_MLX4_PMD) += -lrte_pmd_mlx4 _LDLIBS-$(CONFIG_RTE_LIBRTE_MLX5_PMD) += -lrte_pmd_mlx5 _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_SZEDATA2) += -lrte_pmd_szedata2 -_LDLIBS-$(CONFIG_RTE_LIBRTE_MPIPE_PMD) += -lrte_pmd_mpipe -lgxio +_LDLIBS-$(CONFIG_RTE_LIBRTE_MPIPE_PMD) += -lrte_pmd_mpipe _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_RING) += -lrte_pmd_ring _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_PCAP) += -lrte_pmd_pcap _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_AF_PACKET) += -lrte_pmd_af_packet _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_NULL) += -lrte_pmd_null -# QAT PMD has a dependency on libcrypto (from openssl) for calculating HMAC precomputes -_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_QAT)+= -lrte_pmd_qat -lcrypto +_LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_QAT)+= -lrte_pmd_qat # AESNI MULTI BUFFER is dependent on the IPSec_MB library _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_AESNI_MB) += -lrte_pmd_aesni_mb -- 2.5.0
[dpdk-dev] [PATCH 1/4] vhost: handle VHOST_USER_SET_LOG_BASE request
On 12/07/2015 01:28 PM, Thomas Monjalon wrote: > 2015-12-07 08:29, Panu Matilainen: >> On 12/07/2015 01:07 AM, Thomas Monjalon wrote: >>> 2015-12-02 15:53, Panu Matilainen: >>>> This (and other changes in patch 2 breaks the librte_vhost ABI again, so >>>> you'd need to at least add a deprecation note to 2.2 to be able to do it >>>> in 2.3 at all according to the ABI policy. >>>> >>>> Perhaps a better option would be adding some padding to the structs now >>>> for 2.2 since the vhost ABI is broken there anyway. That would at least >>>> give a chance to keep it compatible from 2.2 to 2.3. >>> >>> Please could you point where the vhost ABI is broken in 2.2? >>> >> >> The vhost ABI break was announced for DPDK 2.2 in commit >> 3c848bd7b1c6f4f681b833322a748fdefbb5fb2d: > [...] >> So the ABI process was properly followed, except for actually bumping >> LIBABIVER. Bumping LIBABIVER is mentioned in >> doc/guides/contributing/versioning.rst but it doesn't specify *when* >> this should be done, eg should the first patch breaking the ABI bump it >> or should it done be shortly before the next stable release, or >> something else. As it is, it seems a bit too easy to simply forget. > > I thought it was not needed to explicitly say that commits must be atomic > and we do not have to wait to do the required changes. The "problem" is that during a development cycle, an ABI could be broken several times but LIBABIVER should only be bumped once. So ABI breaking commits will often not be atomic wrt LIBABIVER, no matter which way its done. For example libtool recommendation is that library versions are updated only just before public releases: https://www.gnu.org/software/libtool/manual/html_node/Updating-version-info.html#Updating-version-info - Panu - > In this case, I've missed it when reviewing the vhost patches breaking the > ABI. >
[dpdk-dev] [PATCH] i40e: remove redundant compiler warning disablers
These may have been required at some point but current i40e base driver compiles cleanly without them, at least with clang 3.7.0 and gcc 5.1.1. Signed-off-by: Panu Matilainen --- drivers/net/i40e/Makefile | 13 - 1 file changed, 13 deletions(-) diff --git a/drivers/net/i40e/Makefile b/drivers/net/i40e/Makefile index 033ee4a..4ffaf0d 100644 --- a/drivers/net/i40e/Makefile +++ b/drivers/net/i40e/Makefile @@ -53,23 +53,10 @@ CFLAGS_BASE_DRIVER = -wd593 -wd188 else ifeq ($(CC), clang) CFLAGS_BASE_DRIVER += -Wno-sign-compare CFLAGS_BASE_DRIVER += -Wno-unused-value -CFLAGS_BASE_DRIVER += -Wno-unused-parameter -CFLAGS_BASE_DRIVER += -Wno-strict-aliasing -CFLAGS_BASE_DRIVER += -Wno-format -CFLAGS_BASE_DRIVER += -Wno-missing-field-initializers -CFLAGS_BASE_DRIVER += -Wno-pointer-to-int-cast -CFLAGS_BASE_DRIVER += -Wno-format-nonliteral CFLAGS_BASE_DRIVER += -Wno-unused-variable else CFLAGS_BASE_DRIVER = -Wno-sign-compare CFLAGS_BASE_DRIVER += -Wno-unused-value -CFLAGS_BASE_DRIVER += -Wno-unused-parameter -CFLAGS_BASE_DRIVER += -Wno-strict-aliasing -CFLAGS_BASE_DRIVER += -Wno-format -CFLAGS_BASE_DRIVER += -Wno-missing-field-initializers -CFLAGS_BASE_DRIVER += -Wno-pointer-to-int-cast -CFLAGS_BASE_DRIVER += -Wno-format-nonliteral -CFLAGS_BASE_DRIVER += -Wno-format-security CFLAGS_BASE_DRIVER += -Wno-unused-variable ifeq ($(shell test $(GCC_VERSION) -ge 44 && echo 1), 1) -- 2.5.0
[dpdk-dev] [PATCH] mk: fix warning spew when EXTRA_CFLAGS specifies warning flags
Starting with commit 9aa2053c6e81493b23346ff4e387903560de5c81 EXTRA_CFLAGS is sometimes being passed to the compiler without WERROR_FLAGS which can cause spurious warnings by the dozen, for example with when compiling with EXTRA_CFLAGS="-Wformat-security": cc1: warning: -Wformat-security ignored without -Wformat [-Wformat-security] Passing WERROR_FLAGS to AUTO_CPU helper makes the warning flag usage consistent throughout the codebase, silencing the warnings. Fixes: 9aa2053c6e81 ("mk: influence CPU flags with user input") Signed-off-by: Panu Matilainen --- mk/rte.cpuflags.mk | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mk/rte.cpuflags.mk b/mk/rte.cpuflags.mk index c6bb8de..28f203b 100644 --- a/mk/rte.cpuflags.mk +++ b/mk/rte.cpuflags.mk @@ -33,7 +33,7 @@ # used to set the RTE_CPUFLAG_* environment variables giving details # of what instruction sets the target cpu supports. -AUTO_CPUFLAGS := $(shell $(CC) $(MACHINE_CFLAGS) $(EXTRA_CFLAGS) -dM -E - < /dev/null) +AUTO_CPUFLAGS := $(shell $(CC) $(MACHINE_CFLAGS) $(WERROR_FLAGS) $(EXTRA_CFLAGS) -dM -E - < /dev/null) # adding flags to CPUFLAGS -- 2.5.0
[dpdk-dev] [PATCH v2] scripts: support any legal git revisions as abi validation range
On 12/03/2015 04:05 PM, Panu Matilainen wrote: > In addition to git tags, support validating abi between any legal > gitrevisions(7) syntaxes, such as "validate-abi.sh -1 . " > "validate-abi.sh master mybranch " etc in addition to > validating between tags. Makes it easier to run the validator > for in-development work. > > Signed-off-by: Panu Matilainen > Acked-by: Neil Horman > --- > > v2 changes: > - update usage and error messages to match new behavior > - update documentation too (as suggested by John McNamara) > I started wondering why this didn't get applied along with the other abi-validator changes and noticed this is sitting in patchwork in "changes requested" state, which doesn't seem right: v2 added the requested documentation. The discussion around this patch did spur another request (ability to pass parallel build flags to make) but that's entirely unrelated, so it shouldn't hold up this one. I've no intention of sending a v3 of this patch because AFAIK there's nothing to fix and the make-flags thing does not belong here, but resetting the state to "new" by myself feels like cheating or something :) So what's the correct action here? There's preciously little documentation about expected patchwork workflow and such. - Panu -
[dpdk-dev] [PATCH v2] scripts: support any legal git revisions as abi validation range
On 12/07/2015 04:32 PM, Thomas Monjalon wrote: > 2015-12-07 16:09, Panu Matilainen: >> On 12/03/2015 04:05 PM, Panu Matilainen wrote: >>> In addition to git tags, support validating abi between any legal >>> gitrevisions(7) syntaxes, such as "validate-abi.sh -1 . " >>> "validate-abi.sh master mybranch " etc in addition to >>> validating between tags. Makes it easier to run the validator >>> for in-development work. >>> >>> Signed-off-by: Panu Matilainen >>> Acked-by: Neil Horman >>> --- >>> >>> v2 changes: >>> - update usage and error messages to match new behavior >>> - update documentation too (as suggested by John McNamara) >>> >> >> I started wondering why this didn't get applied along with the other >> abi-validator changes and noticed this is sitting in patchwork in >> "changes requested" state, which doesn't seem right: v2 added the >> requested documentation. > > It seems to be an error. > >> The discussion around this patch did spur another request (ability to >> pass parallel build flags to make) but that's entirely unrelated, so it >> shouldn't hold up this one. > > Yes > >> I've no intention of sending a v3 of this patch because AFAIK there's >> nothing to fix and the make-flags thing does not belong here, but >> resetting the state to "new" by myself feels like cheating or something >> :) So what's the correct action here? There's preciously little >> documentation about expected patchwork workflow and such. > > It's not cheating. > Changing patchwork status and send such an email looks to be the right thing > to do. Ok, done. Thanks for clarifying. > > Yes maybe we can improve the contributing guide. Perhaps this could be used as a base, or referred to (assuming of course the info is rasonably applicaple to dpdk too)? https://sourceware.org/glibc/wiki/Patch%20Review%20Workflow - Panu - > Thanks >
[dpdk-dev] [PATCH 1/4] vhost: handle VHOST_USER_SET_LOG_BASE request
On 12/07/2015 03:55 PM, Thomas Monjalon wrote: > 2015-12-07 13:41, Panu Matilainen: >> On 12/07/2015 01:28 PM, Thomas Monjalon wrote: >>> 2015-12-07 08:29, Panu Matilainen: >>>> On 12/07/2015 01:07 AM, Thomas Monjalon wrote: >>>>> 2015-12-02 15:53, Panu Matilainen: >>>> The vhost ABI break was announced for DPDK 2.2 in commit >>>> 3c848bd7b1c6f4f681b833322a748fdefbb5fb2d: >>> [...] >>>> So the ABI process was properly followed, except for actually bumping >>>> LIBABIVER. Bumping LIBABIVER is mentioned in >>>> doc/guides/contributing/versioning.rst but it doesn't specify *when* >>>> this should be done, eg should the first patch breaking the ABI bump it >>>> or should it done be shortly before the next stable release, or >>>> something else. As it is, it seems a bit too easy to simply forget. >>> >>> I thought it was not needed to explicitly say that commits must be atomic >>> and we do not have to wait to do the required changes. Heh, now that I look more carefully... it IS documented, line 38 of contributing/versioning.rst: > ABI versions are set at the time of major release labeling, and the > ABI may change multiple times, without warning, between the last > release label and the HEAD label of the git tree. >> The "problem" is that during a development cycle, an ABI could be broken >> several times but LIBABIVER should only be bumped once. So ABI breaking >> commits will often not be atomic wrt LIBABIVER, no matter which way its >> done. > > If the ABI version has already been changed, there should be a merge conflict. > I think it's better to manage a conflict than forget to update the version. What I'm thinking of is something that would tie LIBABIVER to the deprecation announcement in a way that could be easily checked (programmatically and manually). As it is now, its quite non-trivial to figure what LIBABIVER *should* be for a given library at a given point - you need to dig up deprecation.rst history and Makefile history and whatnot, and its all quite error-prone. >> For example libtool recommendation is that library versions are updated >> only just before public releases: >> https://www.gnu.org/software/libtool/manual/html_node/Updating-version-info.html#Updating-version-info > > Interesting link. It makes me think that we do not manage ABI break when > downgrading the library (case of only new API keeping the ABI number). Hmm, not quite sure what you mean here, but full libtool-style versioning is not really needed with symbol versioning. - Panu - > >>> In this case, I've missed it when reviewing the vhost patches breaking the >>> ABI. >
[dpdk-dev] [PATCH v3 2/2] eal/linux: Add support for handling built-in kernel modules
On 12/07/2015 10:55 PM, Stephen Hemminger wrote: > On Mon, 7 Dec 2015 19:36:05 +0100 > Kamil Rytarowski wrote: > >> +/* Check if there is sysfs mounted */ >> +if (stat("/sys/module", &st) != 0) { >> +RTE_LOG(DEBUG, EAL, "Open /sys/module failed: %s\n", >> +strerror(errno)); >> return -1; >> } > > This check is useless. > If /sys/module does not exist then /sys/module/XXX won't exist either. Yes, but non-mounted sysfs is an error whereas /sys/module/XXX is merely an existence test, and the current sole caller in pci_vfio_enable() even bothers checking for the difference. So its perhaps a bit academic but its not incorrect. At any rate, the debug messages are incorrect/misleading. It's certainly not trying to *open* these directories so it should not claim to do so. - Panu -
[dpdk-dev] [PATCH] mk: fix external shared library dependencies of libraries
Similar to commit 5f9115e58cc6f304ff4ade694cf5823d32887d1a etc, but for libraries. Requiring applications to know about library internal details like dependencies to external helper libraries is a limitation of static linkage, shared libraries should always know their own dependencies for sane operation. Linking with the combined library (whether shared or not) still requires knowing the internal dependencies, and intra-dpdk dependencies are also not currently recorded. Signed-off-by: Panu Matilainen --- Note: I haven't tested on FreeBSD, it should be straightforward but it wouldn't hurt if Bruce / Sergio can sanity-check that side... lib/librte_eal/bsdapp/eal/Makefile | 4 lib/librte_eal/linuxapp/eal/Makefile | 6 ++ lib/librte_sched/Makefile| 3 +++ lib/librte_vhost/Makefile| 2 ++ mk/rte.app.mk| 20 5 files changed, 23 insertions(+), 12 deletions(-) diff --git a/lib/librte_eal/bsdapp/eal/Makefile b/lib/librte_eal/bsdapp/eal/Makefile index 65b293f..d7e06fd 100644 --- a/lib/librte_eal/bsdapp/eal/Makefile +++ b/lib/librte_eal/bsdapp/eal/Makefile @@ -42,6 +42,10 @@ CFLAGS += -I$(RTE_SDK)/lib/librte_ring CFLAGS += -I$(RTE_SDK)/lib/librte_mempool CFLAGS += $(WERROR_FLAGS) -O3 +LDLIBS += -lpthread +LDLIBS += -lexecinfo +LDLIBS += -lgcc_s + EXPORT_MAP := rte_eal_version.map LIBABIVER := 2 diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile index 26eced5..2a0fa2b 100644 --- a/lib/librte_eal/linuxapp/eal/Makefile +++ b/lib/librte_eal/linuxapp/eal/Makefile @@ -47,6 +47,12 @@ CFLAGS += -I$(RTE_SDK)/lib/librte_mempool CFLAGS += -I$(RTE_SDK)/lib/librte_ivshmem CFLAGS += $(WERROR_FLAGS) -O3 +LDLIBS += -lpthread +LDLIBS += -ldl +LDLIBS += -lrt +LDLIBS += -lm +LDLIBS += -lgcc_s + # specific to linuxapp exec-env SRCS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) := eal.c SRCS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += eal_hugepage_info.c diff --git a/lib/librte_sched/Makefile b/lib/librte_sched/Makefile index b1cb285..4d631f6 100644 --- a/lib/librte_sched/Makefile +++ b/lib/librte_sched/Makefile @@ -41,6 +41,9 @@ CFLAGS += $(WERROR_FLAGS) CFLAGS_rte_red.o := -D_GNU_SOURCE +LDLIBS += -lm +LDLIBS += -lrt + EXPORT_MAP := rte_sched_version.map LIBABIVER := 1 diff --git a/lib/librte_vhost/Makefile b/lib/librte_vhost/Makefile index 6681f22..a3bdca4 100644 --- a/lib/librte_vhost/Makefile +++ b/lib/librte_vhost/Makefile @@ -44,10 +44,12 @@ CFLAGS += -I vhost_user else CFLAGS += -I vhost_cuse -lfuse LDFLAGS += -lfuse +LDLIBS += -lfuse endif ifeq ($(CONFIG_RTE_LIBRTE_VHOST_NUMA),y) LDFLAGS += -lnuma +LDLIBS += -lnuma endif # all source are stored in SRCS-y diff --git a/mk/rte.app.mk b/mk/rte.app.mk index 8ecab41..4ecaa6c 100644 --- a/mk/rte.app.mk +++ b/mk/rte.app.mk @@ -81,23 +81,11 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_LPM)+= -lrte_lpm _LDLIBS-$(CONFIG_RTE_LIBRTE_POWER) += -lrte_power _LDLIBS-$(CONFIG_RTE_LIBRTE_ACL)+= -lrte_acl _LDLIBS-$(CONFIG_RTE_LIBRTE_METER) += -lrte_meter - _LDLIBS-$(CONFIG_RTE_LIBRTE_SCHED) += -lrte_sched -_LDLIBS-$(CONFIG_RTE_LIBRTE_SCHED) += -lm -_LDLIBS-$(CONFIG_RTE_LIBRTE_SCHED) += -lrt - _LDLIBS-$(CONFIG_RTE_LIBRTE_VHOST) += -lrte_vhost endif # ! CONFIG_RTE_BUILD_COMBINE_LIBS -ifeq ($(CONFIG_RTE_LIBRTE_VHOST_NUMA),y) -_LDLIBS-$(CONFIG_RTE_LIBRTE_VHOST) += -lnuma -endif - -ifeq ($(CONFIG_RTE_LIBRTE_VHOST_USER),n) -_LDLIBS-$(CONFIG_RTE_LIBRTE_VHOST) += -lfuse -endif - # The static libraries do not know their dependencies. # The combined library fails also to store this information. # So linking with static or combined library requires explicit dependencies. @@ -111,6 +99,14 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_XENVIRT)+= -lxenstore _LDLIBS-$(CONFIG_RTE_LIBRTE_MPIPE_PMD) += -lgxio # QAT PMD has a dependency on libcrypto (from openssl) for calculating HMAC precomputes _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_QAT)+= -lcrypto +_LDLIBS-$(CONFIG_RTE_LIBRTE_SCHED) += -lm +_LDLIBS-$(CONFIG_RTE_LIBRTE_SCHED) += -lrt +ifeq ($(CONFIG_RTE_LIBRTE_VHOST_NUMA),y) +_LDLIBS-$(CONFIG_RTE_LIBRTE_VHOST) += -lnuma +endif +ifeq ($(CONFIG_RTE_LIBRTE_VHOST_USER),n) +_LDLIBS-$(CONFIG_RTE_LIBRTE_VHOST) += -lfuse +endif endif # CONFIG_RTE_BUILD_COMBINE_LIBS or not CONFIG_RTE_BUILD_SHARED_LIBS _LDLIBS-y += --start-group -- 2.5.0
[dpdk-dev] [PATCH] mk: fix external shared library dependencies of libraries
On 12/08/2015 12:11 PM, Thomas Monjalon wrote: > Hi Panu, > > 2015-12-08 10:30, Panu Matilainen: >> --- a/lib/librte_vhost/Makefile >> +++ b/lib/librte_vhost/Makefile >> @@ -44,10 +44,12 @@ CFLAGS += -I vhost_user >> else >> CFLAGS += -I vhost_cuse -lfuse >> LDFLAGS += -lfuse >> +LDLIBS += -lfuse >> endif >> >> ifeq ($(CONFIG_RTE_LIBRTE_VHOST_NUMA),y) >> LDFLAGS += -lnuma >> +LDLIBS += -lnuma >> endif > > It looks weird to have to declare the dependencies both in > LDFLAGS and LDLIBS. What is the reason? > Can we improve it? I'd say its just an artifact of the dpdk build system evolution and surely we can improve it, but I'd leave it post 2.2 to avoid breaking anything now. I'm planning further work in this area and one of the things on my TODO list is to look into the LDFLAGS/LDLIBS duplication. Technically, LDLIBS should only contain the libraries to link, and all the others directives (such as linker path etc) should go to LDFLAGS. - Panu -
[dpdk-dev] [PATCH] mk: fix external shared library dependencies of libraries
On 12/08/2015 01:19 PM, Panu Matilainen wrote: > On 12/08/2015 12:11 PM, Thomas Monjalon wrote: >> Hi Panu, >> >> 2015-12-08 10:30, Panu Matilainen: >>> --- a/lib/librte_vhost/Makefile >>> +++ b/lib/librte_vhost/Makefile >>> @@ -44,10 +44,12 @@ CFLAGS += -I vhost_user >>> else >>> CFLAGS += -I vhost_cuse -lfuse >>> LDFLAGS += -lfuse >>> +LDLIBS += -lfuse >>> endif >>> >>> ifeq ($(CONFIG_RTE_LIBRTE_VHOST_NUMA),y) >>> LDFLAGS += -lnuma >>> +LDLIBS += -lnuma >>> endif >> >> It looks weird to have to declare the dependencies both in >> LDFLAGS and LDLIBS. What is the reason? >> Can we improve it? > > I'd say its just an artifact of the dpdk build system evolution and > surely we can improve it, but I'd leave it post 2.2 to avoid breaking > anything now. Actually, scratch that. That librte_vhost has used LDFLAGS instead of LDLIBS is likely just a mistake that happens to work, but there should be no reason for it. I'll send a v2 with that changed, and while at it, remove the bogus -lfuse from vhost_cuse CFLAGS too. - Panu -
[dpdk-dev] [PATCH v2] mk: fix external shared library dependencies of libraries
Similar to commit 5f9115e58cc6f304ff4ade694cf5823d32887d1a etc, but for libraries. Clean up librte_vhost CFLAGS/LDFLAGS/LDLIBS confusion while at it. Requiring applications to know about library internal details like dependencies to external helper libraries is a limitation of static linkage, shared libraries should always know their own dependencies for sane operation. Linking with the combined library (whether shared or not) still requires knowing the internal dependencies, and intra-dpdk dependencies are also not currently recorded. Signed-off-by: Panu Matilainen --- v2: - clean up librte_vhost CFLAGS/LDFLAGS/LDLIBS confusion while at it lib/librte_eal/bsdapp/eal/Makefile | 4 lib/librte_eal/linuxapp/eal/Makefile | 6 ++ lib/librte_sched/Makefile| 3 +++ lib/librte_vhost/Makefile| 6 +++--- mk/rte.app.mk| 20 5 files changed, 24 insertions(+), 15 deletions(-) diff --git a/lib/librte_eal/bsdapp/eal/Makefile b/lib/librte_eal/bsdapp/eal/Makefile index 65b293f..d7e06fd 100644 --- a/lib/librte_eal/bsdapp/eal/Makefile +++ b/lib/librte_eal/bsdapp/eal/Makefile @@ -42,6 +42,10 @@ CFLAGS += -I$(RTE_SDK)/lib/librte_ring CFLAGS += -I$(RTE_SDK)/lib/librte_mempool CFLAGS += $(WERROR_FLAGS) -O3 +LDLIBS += -lpthread +LDLIBS += -lexecinfo +LDLIBS += -lgcc_s + EXPORT_MAP := rte_eal_version.map LIBABIVER := 2 diff --git a/lib/librte_eal/linuxapp/eal/Makefile b/lib/librte_eal/linuxapp/eal/Makefile index 26eced5..2a0fa2b 100644 --- a/lib/librte_eal/linuxapp/eal/Makefile +++ b/lib/librte_eal/linuxapp/eal/Makefile @@ -47,6 +47,12 @@ CFLAGS += -I$(RTE_SDK)/lib/librte_mempool CFLAGS += -I$(RTE_SDK)/lib/librte_ivshmem CFLAGS += $(WERROR_FLAGS) -O3 +LDLIBS += -lpthread +LDLIBS += -ldl +LDLIBS += -lrt +LDLIBS += -lm +LDLIBS += -lgcc_s + # specific to linuxapp exec-env SRCS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) := eal.c SRCS-$(CONFIG_RTE_LIBRTE_EAL_LINUXAPP) += eal_hugepage_info.c diff --git a/lib/librte_sched/Makefile b/lib/librte_sched/Makefile index b1cb285..4d631f6 100644 --- a/lib/librte_sched/Makefile +++ b/lib/librte_sched/Makefile @@ -41,6 +41,9 @@ CFLAGS += $(WERROR_FLAGS) CFLAGS_rte_red.o := -D_GNU_SOURCE +LDLIBS += -lm +LDLIBS += -lrt + EXPORT_MAP := rte_sched_version.map LIBABIVER := 1 diff --git a/lib/librte_vhost/Makefile b/lib/librte_vhost/Makefile index 6681f22..4aecc69 100644 --- a/lib/librte_vhost/Makefile +++ b/lib/librte_vhost/Makefile @@ -42,12 +42,12 @@ CFLAGS += $(WERROR_FLAGS) -I$(SRCDIR) -O3 -D_FILE_OFFSET_BITS=64 ifeq ($(CONFIG_RTE_LIBRTE_VHOST_USER),y) CFLAGS += -I vhost_user else -CFLAGS += -I vhost_cuse -lfuse -LDFLAGS += -lfuse +CFLAGS += -I vhost_cuse +LDLIBS += -lfuse endif ifeq ($(CONFIG_RTE_LIBRTE_VHOST_NUMA),y) -LDFLAGS += -lnuma +LDLIBS += -lnuma endif # all source are stored in SRCS-y diff --git a/mk/rte.app.mk b/mk/rte.app.mk index 8ecab41..4ecaa6c 100644 --- a/mk/rte.app.mk +++ b/mk/rte.app.mk @@ -81,23 +81,11 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_LPM)+= -lrte_lpm _LDLIBS-$(CONFIG_RTE_LIBRTE_POWER) += -lrte_power _LDLIBS-$(CONFIG_RTE_LIBRTE_ACL)+= -lrte_acl _LDLIBS-$(CONFIG_RTE_LIBRTE_METER) += -lrte_meter - _LDLIBS-$(CONFIG_RTE_LIBRTE_SCHED) += -lrte_sched -_LDLIBS-$(CONFIG_RTE_LIBRTE_SCHED) += -lm -_LDLIBS-$(CONFIG_RTE_LIBRTE_SCHED) += -lrt - _LDLIBS-$(CONFIG_RTE_LIBRTE_VHOST) += -lrte_vhost endif # ! CONFIG_RTE_BUILD_COMBINE_LIBS -ifeq ($(CONFIG_RTE_LIBRTE_VHOST_NUMA),y) -_LDLIBS-$(CONFIG_RTE_LIBRTE_VHOST) += -lnuma -endif - -ifeq ($(CONFIG_RTE_LIBRTE_VHOST_USER),n) -_LDLIBS-$(CONFIG_RTE_LIBRTE_VHOST) += -lfuse -endif - # The static libraries do not know their dependencies. # The combined library fails also to store this information. # So linking with static or combined library requires explicit dependencies. @@ -111,6 +99,14 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_XENVIRT)+= -lxenstore _LDLIBS-$(CONFIG_RTE_LIBRTE_MPIPE_PMD) += -lgxio # QAT PMD has a dependency on libcrypto (from openssl) for calculating HMAC precomputes _LDLIBS-$(CONFIG_RTE_LIBRTE_PMD_QAT)+= -lcrypto +_LDLIBS-$(CONFIG_RTE_LIBRTE_SCHED) += -lm +_LDLIBS-$(CONFIG_RTE_LIBRTE_SCHED) += -lrt +ifeq ($(CONFIG_RTE_LIBRTE_VHOST_NUMA),y) +_LDLIBS-$(CONFIG_RTE_LIBRTE_VHOST) += -lnuma +endif +ifeq ($(CONFIG_RTE_LIBRTE_VHOST_USER),n) +_LDLIBS-$(CONFIG_RTE_LIBRTE_VHOST) += -lfuse +endif endif # CONFIG_RTE_BUILD_COMBINE_LIBS or not CONFIG_RTE_BUILD_SHARED_LIBS _LDLIBS-y += --start-group -- 2.5.0
[dpdk-dev] [ovs-discuss] [announce] driverctl: utility for persistent alternative driver binding
On 12/04/2015 05:44 PM, Gray, Mark D wrote: > I welcome this initiative, one question below: > >> -Original Message- >> From: discuss [mailto:discuss-bounces at openvswitch.org] On Behalf Of Panu >> Matilainen >> Sent: Friday, December 4, 2015 10:54 AM >> To: dev at dpdk.org; users at dpdk.org; dev at openvswitch.org; >> discuss at openvswitch.org >> Subject: [ovs-discuss] [announce] driverctl: utility for persistent >> alternative >> driver binding >> >> Hi all, >> >> While this is not directly related to DPDK or OVS, it is potentially >> useful for users of both, so excuse me for cross-posting. >> >> Quoting from the project README (for the full text see >> http://laiskiainen.org/git/?p=driverctl.git;a=blob_plain;f=README) >> >> > driverctl is a tool for manipulating and inspecting the system >> > device driver choices. >> > >> > Devices are normally assigned to their sole designated kernel driver >> > by default. However in some situations it may be desireable to >> > override that default, for example to try an older driver to >> > work around a regression in a driver or to try an experimental >> > alternative driver. Another common use-case is pass-through >> > drivers and driver stubs to allow userspace to drive the device, >> > such as in case of virtualization. >> > >> > driverctl integrates with udev to support overriding >> > driver selection for both cold- and hotplugged devices from the >> > moment of discovery, but can also change already assigned drivers, >> > assuming they are not in use by the system. The driver overrides >> > created by driverctl are persistent across system reboots >> > by default. >> > >> > Usage >> > - >> > >> > Find devices currently driven by ixgbe driver: >> > >> > # driverctl -v list-devices | grep ixgbe >> > :01:00.0 ixgbe (Ethernet 10G 4P X520/I350 rNDC) >> > :01:00.1 ixgbe (Ethernet 10G 4P X520/I350 rNDC) >> > >> > Change them to use the vfio-pci driver: >> > # driverctl set-override :01:00.0 vfio-pci >> > # driverctl set-override :01:00.1 vfio-pci >> > >> > Find devices with driver overrides: >> > # driverctl -v list-devices|grep \\* >> > :01:00.0 vfio-pci [*] (Ethernet 10G 4P X520/I350 rNDC) >> > :01:00.1 vfio-pci [*] (Ethernet 10G 4P X520/I350 rNDC) >> > >> > Remove the override from slot :01:00.1: >> > # driverctl unset-override :01:00.1 >> >> DPDK of course has its own dpdk_nic_bind(.py) tool for this purpose, the >> main differences to driverctl are: >> - driverctl bindings are persistent across system boots > > [Gray, Mark D] This is great! > > Will this integrate with, for example in Red Hat-based systems, > /etc/sysconfig/network-scripts/ifcfg-X? In DPDK, could we then > potentially reference devices by that (arbitrary) name? driverctl is not specific to NICs so network-scripts integration is out of scope. That aside, maybe I'm missing something but I'm not sure what there is to integrate with since DPDK ports are ultimately application specific. For OVS I've sent a patch to support managing OVS DPDK ports via network-scripts: http://openvswitch.org/pipermail/dev/2015-December/062850.html . Panu -
[dpdk-dev] [PATCH v4 2/2] eal/linux: Add support for handling built-in kernel modules
On 12/08/2015 05:33 PM, Kamil Rytarowski wrote: > Currently rte_eal_check_module() detects Linux kernel modules via reading > /proc/modules. Built-in ones aren't listed there and therefore they are not > being found by the script. > > Add support for checking built-in modules with parsing the sysfs files > > This commit obsoletes the /proc/modules parsing approach. > > Signed-off-by: Kamil Rytarowski > Signed-off-by: David Marchand > --- > lib/librte_eal/linuxapp/eal/eal.c | 34 -- > 1 file changed, 20 insertions(+), 14 deletions(-) > > diff --git a/lib/librte_eal/linuxapp/eal/eal.c > b/lib/librte_eal/linuxapp/eal/eal.c > index 635ec36..92482a0 100644 > --- a/lib/librte_eal/linuxapp/eal/eal.c > +++ b/lib/librte_eal/linuxapp/eal/eal.c > @@ -901,27 +901,33 @@ int rte_eal_has_hugepages(void) > int > rte_eal_check_module(const char *module_name) > { > - char mod_name[30]; /* Any module names can be longer than 30 bytes? */ > - int ret = 0; > + char sysfs_mod_name[PATH_MAX]; > + struct stat st; > int n; > > if (NULL == module_name) > return -1; > > - FILE *fd = fopen("/proc/modules", "r"); > - if (NULL == fd) { > - RTE_LOG(ERR, EAL, "Open /proc/modules failed!" > - " error %i (%s)\n", errno, strerror(errno)); > + /* Check if there is sysfs mounted */ > + if (stat("/sys/module", &st) != 0) { > + RTE_LOG(DEBUG, EAL, "sysfs is not mounted! error %i (%s)\n", > + errno, strerror(errno)); > return -1; > } > - while (!feof(fd)) { > - n = fscanf(fd, "%29s %*[^\n]", mod_name); > - if ((n == 1) && !strcmp(mod_name, module_name)) { > - ret = 1; > - break; > - } > + > + /* A module might be built-in, therefore try sysfs */ > + n = snprintf(sysfs_mod_name, PATH_MAX, "/sys/module/%s", module_name); > + if (n < 0 || n > PATH_MAX) { > + RTE_LOG(DEBUG, EAL, "Could not format module path\n"); > + return -1; > } > - fclose(fd); > > - return ret; > + if (stat(sysfs_mod_name, &st) != 0) { > + RTE_LOG(DEBUG, EAL, "Open %s failed! error %i (%s)\n", > + sysfs_mod_name, errno, strerror(errno)); > + return 0; > + } Like with /sys/module, its not trying to *open* sysfs_mod_name directory either so it shouldn't claim to do so. I did use plural on purpose when I said "the debug messages are incorrect/misleading. It's certainly not trying to *open* these directories so it should not claim to do so" in my previous mail :) - Panu -
[dpdk-dev] [PATCH v2] mk: fix external shared library dependencies of libraries
On 12/08/2015 06:28 PM, Sergio Gonzalez Monroy wrote: > On 08/12/2015 11:47, Panu Matilainen wrote: >> Similar to commit 5f9115e58cc6f304ff4ade694cf5823d32887d1a etc, but >> for libraries. Clean up librte_vhost CFLAGS/LDFLAGS/LDLIBS confusion >> while at it. >> >> Requiring applications to know about library internal details like >> dependencies to external helper libraries is a limitation of >> static linkage, shared libraries should always know their own >> dependencies for sane operation. >> >> Linking with the combined library (whether shared or not) still >> requires knowing the internal dependencies, and intra-dpdk >> dependencies are also not currently recorded. >> >> Signed-off-by: Panu Matilainen >> --- >> >> v2: >> - clean up librte_vhost CFLAGS/LDFLAGS/LDLIBS confusion while at it >> >> > Hi Panu, > > Patch itself looks good but there is a small side effect on BSD that > results > in app/test not linking because of missing -lm. > Linuxapp links with -lm by default (EXECENV_LDLIBS), but BSD does not. Oh, those LIBRTE_SCHED entries were in a different if-block from the others... Hmm, interesting. Without this patch, on Linux -lm gets added twice which actually causes a build failure on Fedora rawhide (related to some libmvec related changes it seems). > Should we just add -lm to EXECENV_LDLIBS for BSD too instead of > adding it on each app/example that uses librte_sched ? Linking should be based on usage, not convenience or such... but there's no explanation why -lm is added everywhere in Linux: commit 6da94b7a92d9706c1a4fb23a9cf54f49e6019af2 Author: Intel Date: Wed Sep 18 12:00:00 2013 +0200 mk: link with libm Signed-off-by: Intel Certainly librte_sched should link to -lm and in static builds, all its users, but beyond that I suppose it needs closer investigation of what (if anything else) actually needs it. I think we better leave it alone for 2.2, but the librte_vhost part should be safe. I can send another version with just that if it has a chance to make it to 2.2, otherwise lets postpone it to 2.3. - Panu - - Panu - - Panu -
[dpdk-dev] [PATCH v2] examples/vhost: reduce number of hugepages needed
On 12/10/2015 04:50 PM, Ananyev, Konstantin wrote: > > >> -Original Message- >> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Bernard Iremonger >> Sent: Thursday, December 10, 2015 1:53 PM >> To: dev at dpdk.org >> Subject: [dpdk-dev] [PATCH v2] examples/vhost: reduce number of hugepages >> needed >> >> Change MAX_QUEUES from 512 to 128 to reduce the number of hugepages >> required by the vhost-switch program. >> >> Changes in v2: >> remove comment added before #define MAX_QUEUES in v1 patch. >> >> Signed-off-by: Bernard Iremonger >> Acked-by: Yuanhan Liu > > > Wasn't it increased a while ago, because someone complained that > 128 queues might not be enough on FVL? > From git log I can see that it was first increased from 128 to 256, > then from 256 to 512. > The reason mentioned - HW that has bigger number of queues. > Isn't it not the case anymore? > If yes, why? > > BTW, shouldn't it be then at least: > > +#ifndef MAX_QUEUES > +#define MAX_QUEUES 128 > +#endif > > So people can just do -D MAX_QUEUES=X at build time if they like(need) to. Being subject to constant changes back and forth suggests this really should be a runtime tunable rather than build time constant. - Panu -
[dpdk-dev] [PATCH] doc: announce API change for rte_ether.h
On 12/11/2015 01:27 AM, Stephen Hemminger wrote: > Plan to change to version of struct ether_addr in > DPDK 2.3. The change in DPDK source is trivial but it will impact > source compatablilty therefore notification is necessary. > > Signed-off-by: Stephen Hemminger > --- > doc/guides/rel_notes/deprecation.rst | 5 + > 1 file changed, 5 insertions(+) > > diff --git a/doc/guides/rel_notes/deprecation.rst > b/doc/guides/rel_notes/deprecation.rst > index 1c7ab01..8ecb990 100644 > --- a/doc/guides/rel_notes/deprecation.rst > +++ b/doc/guides/rel_notes/deprecation.rst > @@ -19,3 +19,8 @@ Deprecation Notices > and table action handlers will be updated: > the pipeline parameter will be added, the packets mask parameter will be > either removed (for input port action handler) or made input-only. > + > +* librte_ether: The structure ether_addr in DPDK will be replaced > + by using the standard header file . The structure > + size will be the same (no ABI impact), but the structure field name > + will change from addr_bytes[] to ether_addr_octet[]. > I hope there is some other reason/benefit besides getting rid of a three-line custom struct definition. It may be a trivial s/addr_bytes/ether_addr_octet/ change but it touches a lot of places all over the DPDK codebase alone, and for 3rd party developers such (at least seemingly) gratuitous renames are really irritating. - Panu -
[dpdk-dev] [PATCH] scripts: fix relpath.sh output when $prefix is set in environment
When relpath.sh is called from install target with prefix set, eg "make install DESTDIR=/tmp/dpdk-root prefix=/usr", the prefix from the environment leaks to relpath.sh internal helper variable and causes incorrect symlinks to be generated in sdk $(targetdir): include -> /usr../../../include/dpdk lib -> /usr../../../lib Initialize the local variable to empty to avoid side-effects from environment. Signed-off-by: Panu Matilainen --- scripts/relpath.sh | 2 ++ 1 file changed, 2 insertions(+) diff --git a/scripts/relpath.sh b/scripts/relpath.sh index 7d2f48f..4ff4671 100755 --- a/scripts/relpath.sh +++ b/scripts/relpath.sh @@ -61,6 +61,8 @@ right2=${REL2#*/} prev_right2=$REL2 prev_left2= +prefix= + while [ "${right1}" != "" -a "${right2}" != "" ]; do if [ "$left1" != "$left2" ]; then -- 2.5.0
[dpdk-dev] [PATCH 2/2] ethdev: remove old flow director symbols
On 12/15/2015 12:47 PM, Thomas Monjalon wrote: > The API has been removed but the symbols were still declared in the map. > > Fixes: a421b86a4a02 ("ethdev: remove old flow director API") > > Signed-off-by: Thomas Monjalon > --- > lib/librte_ether/rte_ether_version.map | 8 > 1 file changed, 8 deletions(-) > > diff --git a/lib/librte_ether/rte_ether_version.map > b/lib/librte_ether/rte_ether_version.map > index 17a11c7..d8db24d 100644 > --- a/lib/librte_ether/rte_ether_version.map > +++ b/lib/librte_ether/rte_ether_version.map > @@ -27,14 +27,6 @@ DPDK_2.2 { > rte_eth_dev_count; > rte_eth_dev_default_mac_addr_set; > rte_eth_dev_detach; > - rte_eth_dev_fdir_add_perfect_filter; > - rte_eth_dev_fdir_add_signature_filter; > - rte_eth_dev_fdir_get_infos; > - rte_eth_dev_fdir_remove_perfect_filter; > - rte_eth_dev_fdir_remove_signature_filter; > - rte_eth_dev_fdir_set_masks; > - rte_eth_dev_fdir_update_perfect_filter; > - rte_eth_dev_fdir_update_signature_filter; > rte_eth_dev_filter_ctrl; > rte_eth_dev_filter_supported; > rte_eth_dev_flow_ctrl_get; > Good spotting. What did you use find these and the ones in eal? Just thinking this seems like something that could and should be automated. - Panu -
[dpdk-dev] [PATCH] scripts: fix abi-validator regression when revision is a tag
Commit 9cbae2aa64eb managed to break the only previously supported case where a tag is used as a revision, due to git show output differing between tags and other objects. The hash is on the last line of the output in both cases though so just grab that. Fixes: 9cbae2aa64eb ("scripts: support any git revisions as ABI validation range") Signed-off-by: Panu Matilainen --- scripts/validate-abi.sh | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/scripts/validate-abi.sh b/scripts/validate-abi.sh index 8d7be24..c36ad61 100755 --- a/scripts/validate-abi.sh +++ b/scripts/validate-abi.sh @@ -121,8 +121,8 @@ then cleanup_and_exit 1 fi -HASH1=$(git show -s --format=%H "$TAG1" -- 2> /dev/null) -HASH2=$(git show -s --format=%H "$TAG2" -- 2> /dev/null) +HASH1=$(git show -s --format=%H "$TAG1" -- 2> /dev/null | tail -1) +HASH2=$(git show -s --format=%H "$TAG2" -- 2> /dev/null | tail -1) # Make sure our tags exist res=$(validate_tags) -- 2.5.0
[dpdk-dev] [PATCH v2 4/6] bond mode 4: allow external state machine
On 02/19/2016 09:17 PM, Eric Kinzie wrote: > From: Eric Kinzie > > Provide functions to allow an external 802.3ad state machine to transmit > and recieve LACPDUs and to set the collection/distribution flags on > slave interfaces. > > Signed-off-by: Eric Kinzie > Signed-off-by: Stephen Hemminger > Acked-by: Declan Doherty [...] > diff --git a/drivers/net/bonding/rte_eth_bond_8023ad.h > b/drivers/net/bonding/rte_eth_bond_8023ad.h > index ebd0e93..8cfa3d3 100644 > --- a/drivers/net/bonding/rte_eth_bond_8023ad.h > +++ b/drivers/net/bonding/rte_eth_bond_8023ad.h > @@ -64,6 +64,8 @@ extern "C" { > #define MARKER_TLV_TYPE_INFO0x01 > #define MARKER_TLV_TYPE_RESP0x02 > > +typedef void (*rte_eth_bond_8023ad_ext_slowrx_fn)(uint8_t slave_id, struct > rte_mbuf *lacp_pkt); > + > enum rte_bond_8023ad_selection { > UNSELECTED, > STANDBY, > @@ -157,6 +159,7 @@ struct rte_eth_bond_8023ad_conf { > uint32_t tx_period_ms; > uint32_t rx_marker_period_ms; > uint32_t update_timeout_ms; > + rte_eth_bond_8023ad_ext_slowrx_fn slowrx_cb; > }; This still is a likely an ABI break, previously discussed around here: http://dpdk.org/ml/archives/dev/2015-November/027321.html It might not be embedded anywhere in DPDK codebase, but there's no telling what others might have done with it (have an array of them, embed in other structs etc). Also ultimately ABI compatibility goes both ways: when the library soname does not change then an application is free to assume both downgrading and upgrading are safe. In this case, upgrading *might* be okay, downgrading certainly is not. So by that measure it definitely is an ABI break. [...] > diff --git a/drivers/net/bonding/rte_eth_bond_version.map > b/drivers/net/bonding/rte_eth_bond_version.map > index 22bd920..33d73ff 100644 > --- a/drivers/net/bonding/rte_eth_bond_version.map > +++ b/drivers/net/bonding/rte_eth_bond_version.map > @@ -27,3 +27,9 @@ DPDK_2.1 { > rte_eth_bond_free; > > } DPDK_2.0; > + > +DPDK_2.2 { > + rte_eth_bond_8023ad_ext_collect; > + rte_eth_bond_8023ad_ext_distrib; > + rte_eth_bond_8023ad_ext_slowtx; > +} DPDK_2.1; > These symbols are not part of DPDK 2.2, the version here is wrong. Technically it would not actually matter much but better not to confuse things unnecessarily. - Panu -
[dpdk-dev] [PATCH] config: remove duplicate configuration information
On 02/22/2016 06:02 PM, Wiles, Keith wrote: >> Hi Keith, >> >> What makes a param common? >> >> e.g. cryptodev QAT PMD is supported in linux, but currently not supported >> in bsd. >> So typically I disable it in the bsd file and enable it in the linux file. >> >> Couldn't the same apply to any other parameter, i.e. there may be users who >> want to have differences in config for different OSs? >> >> So why not just leave as is and give users the option to choose? > > The problem is the major configs are all common, in this design we have the > common_base all configs are placed then as you stated they are disable in the > common_OS files. Plus some are enabled/disabled in the deconfig_XXX files as > well. > > The goal is to move all of the configs into one file then we do not have to > keep updating all of the common_OS files, but only enable/disable that option. > > I have common_osxapp that I want to add later to build and run DPDK on OS X, > which is another place to have these same configs. Later we may add another > OS too, which means more copies :-) > My +1 for eliminating config redundancy. In addition to improving overall sanity, having the common options in a common file makes the few actually OS-dependent items stand out, which is only a good thing. - Panu -
[dpdk-dev] including rte.app.mk from a Makefile.am
On 02/24/2016 04:24 AM, Stefan Puiu wrote: > Hi, > > I'm working on a Linux project that uses the DPDK and (unfornately, > IMO) automake; so we have a Makefile.am where we include rte.extapp.mk > and rte.vars.mk from the DPDK, add LDLIBS to the linker > > However, I've tried building against DPDK 2.2 and I'm getting linker > errors about options like '--no-as-needed', '--whole-archive' etc not > being recognized. Basically, we use libtool to link the binary, which > behind the scenes ends up calling gcc to link the binary, and gcc > doesn't know how to read linker options - they need to be prefixed > with '-Wl,..'. I've traced this to this part of rte.app.mk: > > === DPDK 1.7.1 > ifeq ($(LINK_USING_CC),1) > LDLIBS := $(call linkerprefix,$(LDLIBS)) > LDFLAGS := $(call linkerprefix,$(LDFLAGS)) > > === DPDK 2.2 (since DPDK 1.8, AFAICT) > ifeq ($(LINK_USING_CC),1) > O_TO_EXE = $(CC) $(CFLAGS) $(LDFLAGS_$(@)) \ > -Wl,-Map=$(@).map,--cref -o $@ $(OBJS-y) $(call > linkerprefix,$(LDFLAGS)) \ > $(EXTRA_LDFLAGS) $(call linkerprefix,$(LDLIBS)) > > Notice on 1.7.1 LDFLAGS gets the -Wl, prefix if linking with gcc; for > 2.2, that doesn't happen anymore - note O_TO_EXE calls linkerprefix > explicitly for LDLIBS and LDFLAGS. > > The change that removed the LDLIBS/LDFLAGS setting is 3c6a14f6, which > ironically says "mk: fix link with CC" in the title. > > I've tried working around this, but apparently automake doesn't give > you too much control of what you can do; overriding LDFLAGS with > $(call linkerprefix,$(LDFLAGS) in Makefile.am doesn't work. Since > LDFLAGS is treated as a user variable by automake, it's tricky to > override it. > > Now my question is: is this supposed to work? Is there any point in > trying to use the mk files from my outside project? I would say no, especially when the rest of your buildsystem is around automake (or cmake or...). Pktgen relies on the dpdk make infrastructure but even that gets into all sorts of trouble with it. > I noticed dpdk-ovs > doesn't seem to bother with that, and just builds one library to link > against. I guess it's useful to pick up the defines that the DPDK was > built against, so inline functions in headers are properly picked up. > Are there people using the DPDK from projects using automake? > > IMO, It would be nice if you could extract the CPPFLAGS/LDFLAGS etc > from the DPDK without including the mk files - maybe by running > something like 'make showvars' or something like that in the DPDK dir. > Then external projects could integrate those in their build system > without too much extra baggage. It would be nice yes, but instead of some custom make-thing, I'd prefer a pkg-config file for the purpose. Adding a pkg-config file has been suggested in the past a few times but nobody has stepped up to do it. - Panu - > Thanks, > Stefan. >
[dpdk-dev] [PATCH] mk: fix the combined library problems by replacing it with a linker script
On 02/23/2016 10:07 PM, Thomas Monjalon wrote: > Hi, > > I'm reviving this old thread. Thanks. > My understanding is that everybody prefer the linker script > than the current combined library which had neither symbol versioning > nor library dependency informations. Yeah it seemed to me most (if not everybody) had converged on the side of the linker script approach. > > Comments below: > > 2015-11-24 16:31, Panu Matilainen: >> The physically linked-together combined library has been an increasing >> source of problems, as was predicted when library and symbol versioning >> was introduced. Replace the complex and fragile construction with a >> simple linker script which achieves the same without all the problems, >> remove the related kludges from eg mlx drivers. >> >> Since creating the linker script is practically zero cost, remove the >> config option and just create it always. > [...] >> --- /dev/null >> +++ b/mk/rte.combinedlib.mk >> @@ -0,0 +1,57 @@ >> +# BSD LICENSE >> +# >> +# Copyright(c) 2010-2015 Intel Corporation. All rights reserved. >> +# All rights reserved. >> +# >> +# Redistribution and use in source and binary forms, with or without >> +# modification, are permitted provided that the following conditions >> +# are met: >> +# >> +# * Redistributions of source code must retain the above copyright >> +# notice, this list of conditions and the following disclaimer. >> +# * Redistributions in binary form must reproduce the above copyright >> +# notice, this list of conditions and the following disclaimer in >> +# the documentation and/or other materials provided with the >> +# distribution. >> +# * Neither the name of Intel Corporation nor the names of its >> +# contributors may be used to endorse or promote products derived >> +# from this software without specific prior written permission. > > Why this header, Panu? > I think you should write your own copyright, and assume the linker script ;) Its just inherited from the original patch by Sergio. As he's the actual author here, it didn't seem appropriate for me to remove it. > > It needs to be rebased and some docs comments must be removed or updated. > I'll send a v2. > Thanks, - Panu -
[dpdk-dev] [PATCH v6 1/2] mbuf: provide rte_pktmbuf_alloc_bulk API
On 02/23/2016 07:35 AM, Xie, Huawei wrote: > On 2/22/2016 10:52 PM, Xie, Huawei wrote: >> On 2/4/2016 1:24 AM, Olivier MATZ wrote: >>> Hi, >>> >>> On 01/27/2016 02:56 PM, Panu Matilainen wrote: >>>> Since rte_pktmbuf_alloc_bulk() is an inline function, it is not part of >>>> the library ABI and should not be listed in the version map. >>>> >>>> I assume its inline for performance reasons, but then you lose the >>>> benefits of dynamic linking such as ability to fix bugs and/or improve >>>> itby just updating the library. Since the point of having a bulk API is >>>> to improve performance by reducing the number of calls required, does it >>>> really have to be inline? As in, have you actually measured the >>>> difference between inline and non-inline and decided its worth all the >>>> downsides? >>> Agree with Panu. It would be interesting to compare the performance >>> between inline and non inline to decide whether inlining it or not. >> Will update after i gathered more data. inline could show obvious >> performance difference in some cases. > > Panu and Oliver: > I write a simple benchmark. This benchmark run 10M rounds, in each round > 8 mbufs are allocated through bulk API, and then freed. > These are the CPU cycles measured(Intel(R) Xeon(R) CPU E5-2680 0 @ > 2.70GHz, CPU isolated, timer interrupt disabled, rcu offloaded). > Btw, i have removed some exceptional data, the frequency of which is > like 1/10. Sometimes observed user usage suddenly disappeared, no clue > what happened. > > With 8 mbufs allocated, there is about 6% performance increase using inline. [...] > > With 16 mbufs allocated, we could still observe obvious performance > difference, though only 1%-2% > [...] > > With 32/64 mbufs allocated, the deviation of the data itself would hide > the performance difference. > So we prefer using inline for performance. At least I was more after real-world performance in a real-world use-case rather than CPU cycles in a microbenchmark, we know function calls have a cost but the benefits tend to outweight the cons. Inline functions have their place and they're far less evil in project internal use, but in library public API they are BAD and should be ... well, not banned because there are exceptions to every rule, but highly discouraged. - Panu -
[dpdk-dev] [PATCH v6 1/2] mbuf: provide rte_pktmbuf_alloc_bulk API
On 02/24/2016 03:23 PM, Ananyev, Konstantin wrote: > Hi Panu, > >> -Original Message- >> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Panu Matilainen >> Sent: Wednesday, February 24, 2016 12:12 PM >> To: Xie, Huawei; Olivier MATZ; dev at dpdk.org >> Cc: dprovan at bivio.net >> Subject: Re: [dpdk-dev] [PATCH v6 1/2] mbuf: provide rte_pktmbuf_alloc_bulk >> API >> >> On 02/23/2016 07:35 AM, Xie, Huawei wrote: >>> On 2/22/2016 10:52 PM, Xie, Huawei wrote: >>>> On 2/4/2016 1:24 AM, Olivier MATZ wrote: >>>>> Hi, >>>>> >>>>> On 01/27/2016 02:56 PM, Panu Matilainen wrote: >>>>>> Since rte_pktmbuf_alloc_bulk() is an inline function, it is not part of >>>>>> the library ABI and should not be listed in the version map. >>>>>> >>>>>> I assume its inline for performance reasons, but then you lose the >>>>>> benefits of dynamic linking such as ability to fix bugs and/or improve >>>>>> itby just updating the library. Since the point of having a bulk API is >>>>>> to improve performance by reducing the number of calls required, does it >>>>>> really have to be inline? As in, have you actually measured the >>>>>> difference between inline and non-inline and decided its worth all the >>>>>> downsides? >>>>> Agree with Panu. It would be interesting to compare the performance >>>>> between inline and non inline to decide whether inlining it or not. >>>> Will update after i gathered more data. inline could show obvious >>>> performance difference in some cases. >>> >>> Panu and Oliver: >>> I write a simple benchmark. This benchmark run 10M rounds, in each round >>> 8 mbufs are allocated through bulk API, and then freed. >>> These are the CPU cycles measured(Intel(R) Xeon(R) CPU E5-2680 0 @ >>> 2.70GHz, CPU isolated, timer interrupt disabled, rcu offloaded). >>> Btw, i have removed some exceptional data, the frequency of which is >>> like 1/10. Sometimes observed user usage suddenly disappeared, no clue >>> what happened. >>> >>> With 8 mbufs allocated, there is about 6% performance increase using inline. >> [...] >>> >>> With 16 mbufs allocated, we could still observe obvious performance >>> difference, though only 1%-2% >>> >> [...] >>> >>> With 32/64 mbufs allocated, the deviation of the data itself would hide >>> the performance difference. >>> So we prefer using inline for performance. >> >> At least I was more after real-world performance in a real-world >> use-case rather than CPU cycles in a microbenchmark, we know function >> calls have a cost but the benefits tend to outweight the cons. >> >> Inline functions have their place and they're far less evil in project >> internal use, but in library public API they are BAD and should be ... >> well, not banned because there are exceptions to every rule, but highly >> discouraged. > > Why is that? For all the reasons static linking is bad, and what's worse it forces the static linking badness into dynamically linked builds. If there's a bug (security or otherwise) in a library, a distro wants to supply an updated package which fixes that bug and be done with it. But if that bug is in an inlined code, supplying an update is not enough, you also need to recompile everything using that code, and somehow inform customers possibly using that code that they need to not only update the library but to recompile their apps as well. That is precisely the reason distros go to great lenghts to avoid *any* statically linked apps and libs in the distro, completely regardless of the performance overhead. In addition, inlined code complicates ABI compatibility issues because some of the code is one the "wrong" side, and worse, it bypasses all the other ABI compatibility safeguards like soname and symbol versioning. Like said, inlined code is fine for internal consumption, but incredibly bad for public interfaces. And of course, the more complicated a function is, greater the potential of needing bugfixes. Mind you, none of this is magically specific to this particular function. Except in the sense that bulk operations offer a better way of performance improvements than just inlining everything. > As you can see right now we have all mbuf alloc/free routines as static > inline. > And I think we would like to keep it like that. > So why that particular function should be different? Because there's much less need to
[dpdk-dev] [PATCH v5 01/11] ethdev: add API to query packet type filling info
On 02/26/2016 09:34 AM, Jianfeng Tan wrote: > Add a new API rte_eth_dev_get_ptype_info to query whether/what packet > type can be filled by given pmd rx burst function. > > Signed-off-by: Jianfeng Tan > --- > lib/librte_ether/rte_ethdev.c | 26 ++ > lib/librte_ether/rte_ethdev.h | 26 ++ > 2 files changed, 52 insertions(+) > [...] > diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h > index 16da821..16f32a0 100644 > --- a/lib/librte_ether/rte_ethdev.h > +++ b/lib/librte_ether/rte_ethdev.h > @@ -1021,6 +1021,9 @@ typedef void (*eth_dev_infos_get_t)(struct rte_eth_dev > *dev, > struct rte_eth_dev_info *dev_info); > /**< @internal Get specific informations of an Ethernet device. */ > > +typedef const uint32_t *(*eth_dev_ptype_info_get_t)(struct rte_eth_dev *dev); > +/**< @internal Get ptype info of eth_rx_burst_t. */ > + > typedef int (*eth_queue_start_t)(struct rte_eth_dev *dev, > uint16_t queue_id); > /**< @internal Start rx and tx of a queue of an Ethernet device. */ > @@ -1347,6 +1350,7 @@ struct eth_dev_ops { > eth_queue_stats_mapping_set_t queue_stats_mapping_set; > /**< Configure per queue stat counter mapping. */ > eth_dev_infos_get_tdev_infos_get; /**< Get device info. */ > + eth_dev_ptype_info_get_t dev_ptype_info_get; /** Get ptype info */ > mtu_set_t mtu_set; /**< Set MTU. */ > vlan_filter_set_t vlan_filter_set; /**< Filter VLAN Setup. */ > vlan_tpid_set_tvlan_tpid_set; /**< Outer VLAN TPID > Setup. */ > @@ -2268,6 +2272,28 @@ void rte_eth_macaddr_get(uint8_t port_id, struct > ether_addr *mac_addr); Technically this is an ABI break but its marked internal and I guess it falls into the "drivers only" territory similar to what was discussed in this thead: http://dpdk.org/ml/archives/dev/2016-January/032348.html so its probably ok. > void rte_eth_dev_info_get(uint8_t port_id, struct rte_eth_dev_info > *dev_info); > > /** > + * Retrieve the packet type information of an Ethernet device. > + * > + * @param port_id > + * The port identifier of the Ethernet device. > + * @param ptype_mask > + * A hint of what kind of packet type which the caller is interested in. > + * @param ptypes > + * An array pointer to store adequent packet types, allocated by caller. > + * @param num > + * Size of the array pointed by param ptypes. > + * @return > + * - (>0) Number of ptypes supported. If it exceeds param num, exceeding > + * packet types will not be filled in the given array. > + * - (0 or -ENOTSUP) if PMD does not fill the specified ptype. > + * - (-ENODEV) if *port_id* invalid. > + */ > +extern int rte_eth_dev_get_ptype_info(uint8_t port_id, > + uint32_t ptype_mask, > + uint32_t *ptypes, > + int num); > + > +/** >* Retrieve the MTU of an Ethernet device. >* >* @param port_id > "extern" is redundant in headers. We just saw a round of removing them (commit dd34ff1f0e03b2c5e4a97e9fbcba5c8238aac573), lets not add them back :) More importantly, to export a function you need to add an entry for it in rte_ether_version.map. - Panu -
[dpdk-dev] [PATCH 1/3] kcp: add kernel control path kernel module
On 02/29/2016 01:35 PM, Ferruh Yigit wrote: > On 2/29/2016 11:06 AM, Thomas Monjalon wrote: >> Hi, >> I totally agree with Avi's comments. >> This topic is really important for the future of DPDK. >> So I think we must give some time to continue the discussion >> and have netdev involved in the choices done. >> As a consequence, these series should not be merged in the release 16.04. >> Thanks for continuing the work. >> > Hi Thomas, > > It is great to have some discussion and feedbacks. > But I doubt not merging in this release will help to have more discussion. > > It is better to have them in this release and let people experiment it, > this gives more chance to better discussion. > > These features are replacement of KNI, and KNI is not intended to be > removed in this release, so who are using KNI as solution can continue > to use KNI and can test KCP/KDP, so that we can get more feedbacks. So make the work available from a separate git repo and make it easy for people to experiment with it. Code doesn't have to be in a release for the sake of experimenting, and removing code is much harder than not adding it in the first place, witness KNI. - Panu -
[dpdk-dev] [PATCH 1/3] kcp: add kernel control path kernel module
On 02/29/2016 05:27 PM, Thomas Monjalon wrote: > 2016-02-29 17:19, Panu Matilainen: >> On 02/29/2016 01:35 PM, Ferruh Yigit wrote: >>> On 2/29/2016 11:06 AM, Thomas Monjalon wrote: >>>> Hi, >>>> I totally agree with Avi's comments. >>>> This topic is really important for the future of DPDK. >>>> So I think we must give some time to continue the discussion >>>> and have netdev involved in the choices done. >>>> As a consequence, these series should not be merged in the release 16.04. >>>> Thanks for continuing the work. >>>> >>> Hi Thomas, >>> >>> It is great to have some discussion and feedbacks. >>> But I doubt not merging in this release will help to have more discussion. >>> >>> It is better to have them in this release and let people experiment it, >>> this gives more chance to better discussion. >>> >>> These features are replacement of KNI, and KNI is not intended to be >>> removed in this release, so who are using KNI as solution can continue >>> to use KNI and can test KCP/KDP, so that we can get more feedbacks. >> >> So make the work available from a separate git repo and make it easy for >> people to experiment with it. Code doesn't have to be in a release for >> the sake of experimenting, and removing code is much harder than not >> adding it in the first place, witness KNI. > > Good idea. > What about a -next tree to experiment on kernel interactions? Here's another, related but more radical (and rather unbaked) idea: Move all the kernel modules and their associated libraries (thinking of KNI here) to a separate repo with perhaps more relaxed rules, but OTOH require upstream kernel support for any features to be included in dpdk itself. Carrot-and-stick of sorts :) - Panu -
[dpdk-dev] [PATCH 4/4] virtio: check if any kernel driver is manipulating the device
On 01/04/2016 11:02 AM, Xie, Huawei wrote: > On 12/25/2015 6:33 PM, Xie, Huawei wrote: >> virtio PMD could use IO port to configure the virtio device without >> using uio driver. >> >> There are two issues with previous implementation: >> 1) virtio PMD will take over each virtio device blindly even if some >> are not intended for DPDK. >> 2) driver conflict between virtio PMD and virtio-net kernel driver. >> >> This patch checks if there is any kernel driver manipulating the virtio >> device before virtio PMD uses IO port to configure the device. >> >> Fixes: da978dfdc43b ("virtio: use port IO to get PCI resource") >> >> Signed-off-by: Huawei Xie >> --- >> drivers/net/virtio/virtio_ethdev.c | 7 +++ >> 1 file changed, 7 insertions(+) >> >> diff --git a/drivers/net/virtio/virtio_ethdev.c >> b/drivers/net/virtio/virtio_ethdev.c >> index 00015ef..504346a 100644 >> --- a/drivers/net/virtio/virtio_ethdev.c >> +++ b/drivers/net/virtio/virtio_ethdev.c >> @@ -1138,6 +1138,13 @@ static int virtio_resource_init_by_ioports(struct >> rte_pci_device *pci_dev) >> int found = 0; >> size_t linesz; >> >> +if (pci_dev->kdrv != RTE_KDRV_NONE) { >> +PMD_INIT_LOG(ERR, > Better change ERR to INFO and revise the message followed, since user > might not want to use this device for DPDK. Indeed. The whole point of this exercise is to have a clear way of telling DPDK which virtio devices it should (and should not) use, so it should just act accordingly and shut up. >> +"%s(): kernel driver is manipulating this device." \ >> +" Please unbind the kernel driver.", __func__); I'd suggest just dropping the whole message, DPDK doesn't log such messages for any other devices either. That, or make it a generic debug-level log in pci_scan_one(). - Panu -
[dpdk-dev] [PATCH v2 4/4] virtio: check if any kernel driver is manipulating the virtio device
On 01/03/2016 07:56 PM, Huawei Xie wrote: > v2 changes: > change LOG level from ERR to INFO > > virtio PMD could use IO port to configure the virtio device without > using uio driver. > > There are two issues with previous implementation: > 1) virtio PMD will take over each virtio device blindly even if some > are not intended for DPDK. > 2) driver conflict between virtio PMD and virtio-net kernel driver. > > This patch checks if there is any kernel driver manipulating the virtio > device before virtio PMD uses IO port to configure the device. > > Fixes: da978dfdc43b ("virtio: use port IO to get PCI resource") > > Signed-off-by: Huawei Xie > --- > drivers/net/virtio/virtio_ethdev.c | 7 +++ > 1 file changed, 7 insertions(+) > > diff --git a/drivers/net/virtio/virtio_ethdev.c > b/drivers/net/virtio/virtio_ethdev.c > index e815acd..7a50dac 100644 > --- a/drivers/net/virtio/virtio_ethdev.c > +++ b/drivers/net/virtio/virtio_ethdev.c > @@ -1138,6 +1138,13 @@ static int virtio_resource_init_by_ioports(struct > rte_pci_device *pci_dev) > int found = 0; > size_t linesz; > > + if (pci_dev->kdrv != RTE_KDRV_NONE) { > + PMD_INIT_LOG(INFO, > + "kernel driver is manipulating this device." \ > + " Please unbind the kernel driver."); At the very least this message needs to be changed. Like said earlier, I think the message could just as well be dropped entirely, but at least it should be something to the tune of "ignoring kernel owned device" instead of asking the user to break their configuration. - Panu -
[dpdk-dev] [PATCH v2 1/3] cmdline: increase command line buffer
On 01/12/2016 12:49 PM, Nelio Laranjeiro wrote: > Allow long command lines in testpmd (like flow director with IPv6, ...). > > Signed-off-by: John McNamara > Signed-off-by: Nelio Laranjeiro > --- > doc/guides/rel_notes/deprecation.rst | 5 - > lib/librte_cmdline/cmdline_rdline.h | 2 +- > 2 files changed, 1 insertion(+), 6 deletions(-) > > diff --git a/doc/guides/rel_notes/deprecation.rst > b/doc/guides/rel_notes/deprecation.rst > index e94d4a2..9cb288c 100644 > --- a/doc/guides/rel_notes/deprecation.rst > +++ b/doc/guides/rel_notes/deprecation.rst > @@ -44,8 +44,3 @@ Deprecation Notices > and table action handlers will be updated: > the pipeline parameter will be added, the packets mask parameter will be > either removed (for input port action handler) or made input-only. > - > -* ABI changes are planned in cmdline buffer size to allow the use of long > - commands (such as RETA update in testpmd). This should impact > - CMDLINE_PARSE_RESULT_BUFSIZE, STR_TOKEN_SIZE and RDLINE_BUF_SIZE. > - It should be integrated in release 2.3. > diff --git a/lib/librte_cmdline/cmdline_rdline.h > b/lib/librte_cmdline/cmdline_rdline.h > index b9aad9b..72e2dad 100644 > --- a/lib/librte_cmdline/cmdline_rdline.h > +++ b/lib/librte_cmdline/cmdline_rdline.h > @@ -93,7 +93,7 @@ extern "C" { > #endif > > /* configuration */ > -#define RDLINE_BUF_SIZE 256 > +#define RDLINE_BUF_SIZE 512 > #define RDLINE_PROMPT_SIZE 32 > #define RDLINE_VT100_BUF_SIZE 8 > #define RDLINE_HISTORY_BUF_SIZE BUFSIZ Having to break a library ABI for a change like this is a bit ridiculous. I didn't try it so could be wrong, but based on a quick look, struct rdline could easily be made opaque to consumers by just adding functions for allocating and freeing it. - Panu -
[dpdk-dev] [PATCH] pci: Add the class_id support in pci probe
On 01/13/2016 01:55 PM, Bruce Richardson wrote: > On Thu, Dec 31, 2015 at 09:12:14AM -0800, Stephen Hemminger wrote: >> On Tue, 29 Dec 2015 10:53:26 +0800 >> Ziye Yang wrote: >> >>> This patch is used to add the class_id support >>> for pci_probe since some devices need the class_info >>> (class_code, subclass_code, programming_interface) >>> >>> Signed-off-by: Ziye Yang >> >> Since rte_pci is exposed to application this breaks the ABI. > > But applications are not going to be defining rte_pci_ids values internally, > are > they? That is for drivers to use. Is this really an ABI breakage for > applications that we > need to be concerned about? There might not be applications using it but drivers are ABI consumers too - think of 3rd party drivers and such. - Panu -
[dpdk-dev] [PATCH v2 1/3] cmdline: increase command line buffer
On 01/15/2016 10:44 AM, N?lio Laranjeiro wrote: > On Tue, Jan 12, 2016 at 02:46:07PM +0200, Panu Matilainen wrote: >> On 01/12/2016 12:49 PM, Nelio Laranjeiro wrote: >>> Allow long command lines in testpmd (like flow director with IPv6, ...). >>> >>> Signed-off-by: John McNamara >>> Signed-off-by: Nelio Laranjeiro >>> --- >>> doc/guides/rel_notes/deprecation.rst | 5 - >>> lib/librte_cmdline/cmdline_rdline.h | 2 +- >>> 2 files changed, 1 insertion(+), 6 deletions(-) >>> >>> diff --git a/doc/guides/rel_notes/deprecation.rst >>> b/doc/guides/rel_notes/deprecation.rst >>> index e94d4a2..9cb288c 100644 >>> --- a/doc/guides/rel_notes/deprecation.rst >>> +++ b/doc/guides/rel_notes/deprecation.rst >>> @@ -44,8 +44,3 @@ Deprecation Notices >>> and table action handlers will be updated: >>> the pipeline parameter will be added, the packets mask parameter will be >>> either removed (for input port action handler) or made input-only. >>> - >>> -* ABI changes are planned in cmdline buffer size to allow the use of long >>> - commands (such as RETA update in testpmd). This should impact >>> - CMDLINE_PARSE_RESULT_BUFSIZE, STR_TOKEN_SIZE and RDLINE_BUF_SIZE. >>> - It should be integrated in release 2.3. >>> diff --git a/lib/librte_cmdline/cmdline_rdline.h >>> b/lib/librte_cmdline/cmdline_rdline.h >>> index b9aad9b..72e2dad 100644 >>> --- a/lib/librte_cmdline/cmdline_rdline.h >>> +++ b/lib/librte_cmdline/cmdline_rdline.h >>> @@ -93,7 +93,7 @@ extern "C" { >>> #endif >>> >>> /* configuration */ >>> -#define RDLINE_BUF_SIZE 256 >>> +#define RDLINE_BUF_SIZE 512 >>> #define RDLINE_PROMPT_SIZE 32 >>> #define RDLINE_VT100_BUF_SIZE 8 >>> #define RDLINE_HISTORY_BUF_SIZE BUFSIZ >> >> Having to break a library ABI for a change like this is a bit ridiculous. > > Sure, but John McNamara needed it to handle flow director with IPv6[1]. > > For my part, I was needing it to manipulate the RETA table, but as I > wrote in the cover letter, it ends by breaking other commands. > Olivier Matz, has proposed another way to handle long commands lines[2], > it could be a good idea to go on this direction. > > For RETA situation, we already discussed on a new API, but for now, I > do not have time for it (and as it is another ABI breakage it could only > be done for 16.07 or 2.4)[3]. > > If this patch is no more needed we can just drop it, for that I would > like to have the point of view from John. Note that I was not objecting to the patch as such, I can easily see 256 characters not being enough for commandline buffer. I was merely noting that having to break an ABI to increase an effectively internal buffer size is a sign of a, um, less-than-optimal library design. Apologies if I wasn't clear about that, - Panu -
[dpdk-dev] [PATCH] cfgfile: support looking up sections by index
On 01/17/2016 05:58 AM, Rich Lane wrote: > This is useful when sections have duplicate names. > > Signed-off-by: Rich Lane > --- > lib/librte_cfgfile/rte_cfgfile.c | 16 > lib/librte_cfgfile/rte_cfgfile.h | 23 +++ > 2 files changed, 39 insertions(+) > This is missing the corresponding entry to lib/librte_cfgfile/rte_cfgfile_version.map - Panu -
[dpdk-dev] [RFC 0/3] Use common Linux tools to control DPDK ports
On 01/19/2016 11:59 AM, Ferruh Yigit wrote: > On Mon, Jan 18, 2016 at 11:20:02AM -0500, Aaron Conole wrote: >> Ferruh Yigit writes: >>> This work is to make DPDK ports more visible and to enable using common >>> Linux tools to configure DPDK ports. >> >> This is a good goal. Only question - why use an additional kernel module >> to do this? Is it _JUST_ for ethtool support? > > Kernel module used to create/destroy Linux net_devices, and module has a > simple > driver for that device which only handles control messages by passing them > into > userspace. > > To represent DPDK ports as Linux net_devices we need kernel support. > >> I think the other stuff >> can be accomplished using netlink sockets + messages, no? > > Netlink sockets just used to communicate kernel-space - user-space, this is > not > why we need a kernel module, for example this communication is implemented in > original KNI as part of FIFO. > >> The only >> trepidation I would have with something like this is the support from >> major vendors - out of tree modules are not generally supportable. Might >> be good to get some of the ethtool commands as netlink messages as well, >> then it is supportable with no 3rd party kernel modules. > > Yes, there is a out of three module problem for some distros, but > unfortunately > we are not able to find a solution for this case without an external kernel > module. > > This patch is still an RFC and if we receive suggested solution without a > kernel > module, we can work on it together. If it has to be in the kernel then you need to find a design that is upstreamable. Out of tree kernel modules are not a solution, they're a problem that people are working on eliminating. - Panu -
[dpdk-dev] [PKTGEN] fixing weird termio issues that complicate debugging
On 01/20/2016 08:32 AM, Matthew Hall wrote: > Hello, > > Since the pktgen code is reindented I am finding time to read through it > and experiment and see if I can get it working. > > I have issues with the init process of pktgen. It is difficult to debug > it because the init code does a lot of very scary stuff to the terminal > control / TTY device at inconvenient times in an inconvenient order, and > in the process damages the debug output and damages the screen of your > GDB without doing weird things to run GDB on a different TTY. > > Of course I am willing to contribute patches and not just complain, but > first I need some help to follow what is going on. > > Here is the problematic call-flow with some explanation what went wrong > trying it on some community machines outside of its original environment: > > 1) it calls printf("\n%s %s\n", wr_copyright_msg(), wr_powered_by()); > which dumps tons of weird boilerplate of licenses, copyrights, code > creator, etc. > > It is open source and everybody that matters already knows who coded it, > so is this stuff really that important? This gets in the way when you > are trying to work on it and I just have to comment it out. > > 2) it calls wr_scrn_setw and tinkers with the windows size very early in > the init which can make your terminal weird > > 3) it calls rte_eal_init which produces a lot of nice debug output, > which is fine > > 4) it calls pktgen_init_screen, which calls wr_scrn_init, which calls > wr_scrn_erase which destroys the valuable debug output just created in > (c) which is a bad thing > > 5) it calls wr_print_copyright and dumps more boilerplate I am not sure > is needed > > 6) it logs some helpful messages about the port / descriptor settings > which is fine > > 7) it calls the pktgen_config_ports function which can crash in ways you > need the destroyed debug output to fix. > > For example in my case that function crashes here: > > if (pktgen.nb_ports == 0) > pktgen_log_panic("*** Did not find any ports to use ***"); > > 8) Later it makes a logo and a splash screen (wr_log, wr_splash_screen). > Is this stuff really needed? This is a ton of output for just starting > up some test program. > > To fix this debug problem I propose some changes which I am happy to > help develop: > > 1) decide what of this output we really need here and greatly simplify > how much gets printed out > > 2) move wr_scrn_setw right before pktgen_init_screen and after > rte_eal_init to prevent damaging that output > > 3) consider how wr_scrn_init is called in pktgen_init_screen, because it > calls wr_scrn_erase which damages output > > 4) I think that pktgen_config_ports should be called before all this > weird screen init stuff, so that if it fails you can actually see what > happened there. > > One other random topic... on the long lines of code it looks like there > are some gigantic tab-indents pushing things off to the right still. One > example, maybe there are others or another setting which is needed to > fix all of these: > > info->seq_pkt = (pkt_seq_t *)rte_zmalloc_socket(buff, > (sizeof(pkt_seq_t) * NUM_TOTAL_PKTS), > > RTE_CACHE_LINE_SIZE, > rte_socket_id()); > > Thoughts? Just that I'm in violent agreement about the splash screens and all. Unfortunately the license explicitly forbids removal of the copyright messages (http://dpdk.org/browse/apps/pktgen-dpdk/tree/LICENSE#n18): -- # 4) The screens displayed by the application must contain the copyright notice as defined # above and can not be removed without specific prior written permission. -- Keith, any chance you could work out the details with Wind River to get the ridiculous startup messages straightened out? I dont think anybody would mind a line or two "copyright by..." kind of printf() in there if that's what it takes, but the current screen after screen after screen copyrights and advertisements are obnoxious to the point of driving potential users away. - Panu - > Matthew Hall
[dpdk-dev] [PATCH v2] cfgfile: support looking up sections by index
On 01/19/2016 06:41 AM, Rich Lane wrote: > This is useful when sections have duplicate names. > > Signed-off-by: Rich Lane > --- > v1->v2: > - Added new symbol to version script. > > lib/librte_cfgfile/rte_cfgfile.c | 16 > lib/librte_cfgfile/rte_cfgfile.h | 23 +++ > lib/librte_cfgfile/rte_cfgfile_version.map | 6 ++ > 3 files changed, 45 insertions(+) > > diff --git a/lib/librte_cfgfile/rte_cfgfile.c > b/lib/librte_cfgfile/rte_cfgfile.c > index a677dad..0bb40a4 100644 > --- a/lib/librte_cfgfile/rte_cfgfile.c > +++ b/lib/librte_cfgfile/rte_cfgfile.c > @@ -333,6 +333,22 @@ rte_cfgfile_section_entries(struct rte_cfgfile *cfg, > const char *sectionname, > return i; > } > > +int > +rte_cfgfile_section_entries_by_index(struct rte_cfgfile *cfg, int index, > + struct rte_cfgfile_entry *entries, int max_entries) > +{ > + int i; > + const struct rte_cfgfile_section *sect; > + > + if (index >= cfg->num_sections) > + return -1; > + > + sect = cfg->sections[index]; Since index is a signed int, I think you should check for < 0 as well in the above. Sorry for not noticing/mentioning that on the first round, I wasn't so much reviewing the code as just skimming through for general API/ABI issues. Other than that, looks ok to me. - Panu -
[dpdk-dev] [PKTGEN] fixing weird termio issues that complicate debugging
On 01/20/2016 06:26 PM, Wiles, Keith wrote: > On 1/20/16, 12:32 AM, "dev on behalf of Matthew Hall" dpdk.org on behalf of mhall at mhcomputing.net> wrote: > >> Hello, >> >> Since the pktgen code is reindented I am finding time to read through it >> and experiment and see if I can get it working. >> >> I have issues with the init process of pktgen. It is difficult to debug >> it because the init code does a lot of very scary stuff to the terminal >> control / TTY device at inconvenient times in an inconvenient order, and >> in the process damages the debug output and damages the screen of your >> GDB without doing weird things to run GDB on a different TTY. >> >> Of course I am willing to contribute patches and not just complain, but >> first I need some help to follow what is going on. >> >> Here is the problematic call-flow with some explanation what went wrong >> trying it on some community machines outside of its original environment: >> >> 1) it calls printf("\n%s %s\n", wr_copyright_msg(), wr_powered_by()); >> which dumps tons of weird boilerplate of licenses, copyrights, code >> creator, etc. >> >> It is open source and everybody that matters already knows who coded it, >> so is this stuff really that important? This gets in the way when you >> are trying to work on it and I just have to comment it out. > > One problem is a number of people wanted to steal the code and use in > a paid application, so the copyright is some what a requirement. In that case, why is it under a BSD'ish license instead of something like GPL that's designed to prevent it in the first place? Might be too late to change it by now, just wondering. > As you may know I do a lot of debugging on Pktgen and I feel they are > a nuisance. I can try to see if we can clean up these messages, but > do not hold your breath on getting them to be removed. It would make a world of difference if it just printed the copyright etc in a couple of lines during startup, instead of taking over the entire screen for several seconds. This is a whole lot like those anti-piracy ad campaigns on DVDs which you cant skip, so all the *legitimate* users are forced to suffer through them but all the bad guys just rip it out of their copies. DRM that ends up hurting the legitimate users the most is never a good idea. - Panu -
[dpdk-dev] [PATCH v3] Patch introducing API to read/write Intel Architecture Model Specific Registers (MSR)...
On 01/21/2016 10:18 AM, Wojciech Andralojc wrote: > Patch reworked. > > Signed-off-by: Wojciech Andralojc > --- > lib/librte_eal/common/include/arch/x86/rte_msr.h | 88 + > lib/librte_eal/linuxapp/eal/Makefile | 1 + > lib/librte_eal/linuxapp/eal/arch/x86/rte_msr.c | 116 > +++ > 3 files changed, 205 insertions(+) > create mode 100644 lib/librte_eal/common/include/arch/x86/rte_msr.h > create mode 100644 lib/librte_eal/linuxapp/eal/arch/x86/rte_msr.c This creates a new arch-specific public API, with rte_msr.h installed as a public header and implementation in the library (as opposed to inline), and so the new functions would have to be added to rte_eal_version.map. However that is a bit of a problem since it only exists on IA architectures, so it'd mean dummy entries in the version map for all other architectures. All the other arch-specific APIs are inline code so this is the first of its kind. Jerin Jacob suggested [1] adding these as internal (inline) functions which to me looks like a more sensible approach, arch-specific APIs tend to be problematic. [1] http://dpdk.org/ml/archives/dev/2016-January/031095.html - Panu -
[dpdk-dev] [PKTGEN] fixing weird termio issues that complicate debugging
On 01/21/2016 05:03 PM, Wiles, Keith wrote: > On 1/21/16, 2:46 AM, "Panu Matilainen" wrote: > >> On 01/20/2016 06:26 PM, Wiles, Keith wrote: >>> On 1/20/16, 12:32 AM, "dev on behalf of Matthew Hall" >> dpdk.org on behalf of mhall at mhcomputing.net> wrote: >>> >>>> Hello, >>>> >>>> Since the pktgen code is reindented I am finding time to read through it >>>> and experiment and see if I can get it working. >>>> >>>> I have issues with the init process of pktgen. It is difficult to debug >>>> it because the init code does a lot of very scary stuff to the terminal >>>> control / TTY device at inconvenient times in an inconvenient order, and >>>> in the process damages the debug output and damages the screen of your >>>> GDB without doing weird things to run GDB on a different TTY. >>>> >>>> Of course I am willing to contribute patches and not just complain, but >>>> first I need some help to follow what is going on. >>>> >>>> Here is the problematic call-flow with some explanation what went wrong >>>> trying it on some community machines outside of its original environment: >>>> >>>> 1) it calls printf("\n%s %s\n", wr_copyright_msg(), wr_powered_by()); >>>> which dumps tons of weird boilerplate of licenses, copyrights, code >>>> creator, etc. >>>> >>>> It is open source and everybody that matters already knows who coded it, >>>> so is this stuff really that important? This gets in the way when you >>>> are trying to work on it and I just have to comment it out. >>> >>> One problem is a number of people wanted to steal the code and use in >>> a paid application, so the copyright is some what a requirement. >> >> In that case, why is it under a BSD'ish license instead of something >> like GPL that's designed to prevent it in the first place? Might be too >> late to change it by now, just wondering. > > DPDK is BSD, so you can not use a GPL application with DPDK (I think) Well I sure hope the license is not chosen by that assumption. Why assume when you can trivially check, eg: http://www.gnu.org/licenses/license-list.html DPDK itself is under the very lax 3-clause BSD license which is compatible with the GPL. The 4-clause "advertising license" used by pktgen is not. But its not the license I'm complaining about. > anyway I can try to speed you the screens, but does it really matter > as these are only at startup and I normally leave pktgen running for > long periods of time. The extra time at the start does not seem to > be a big issue, right? We wouldn't be discussing this if it was not an issue. It is offensive enough to turn away both users and contributors, and merely speeding up a bit is not going to make it a whole lot better. As I (and now others as well) already suggested changing it to a one line printout is what would make worlds of difference while still complying with the license AFAICT. The license text requires printing out the copyright notice, it does not say anything about rendering it in full-screen ascii-art, or printing out the entire license. - Panu -
[dpdk-dev] [PATCH v3] Patch introducing API to read/write Intel Architecture Model Specific Registers (MSR)...
On 01/21/2016 12:51 PM, Ananyev, Konstantin wrote: > Hi Panu, > >> -Original Message- >> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Panu Matilainen >> Sent: Thursday, January 21, 2016 10:39 AM >> To: Andralojc, WojciechX >> Cc: dev at dpdk.org >> Subject: Re: [dpdk-dev] [PATCH v3] Patch introducing API to read/write Intel >> Architecture Model Specific Registers (MSR)... >> >> On 01/21/2016 10:18 AM, Wojciech Andralojc wrote: >>> Patch reworked. >>> >>> Signed-off-by: Wojciech Andralojc >>> --- >>>lib/librte_eal/common/include/arch/x86/rte_msr.h | 88 + >>>lib/librte_eal/linuxapp/eal/Makefile | 1 + >>>lib/librte_eal/linuxapp/eal/arch/x86/rte_msr.c | 116 >>> +++ >>>3 files changed, 205 insertions(+) >>>create mode 100644 lib/librte_eal/common/include/arch/x86/rte_msr.h >>>create mode 100644 lib/librte_eal/linuxapp/eal/arch/x86/rte_msr.c >> >> This creates a new arch-specific public API, with rte_msr.h installed as >> a public header and implementation in the library (as opposed to >> inline), and so the new functions would have to be added to >> rte_eal_version.map. >> >> However that is a bit of a problem since it only exists on IA >> architectures, so it'd mean dummy entries in the version map for all >> other architectures. All the other arch-specific APIs are inline code so >> this is the first of its kind. > > My thought was: > 1. implementation is linux specific (as I know not supposed to work under > freebsd). > 2. they are not supposed to be used at run-time cide-path, so no need to be > inlined. Speed is not the only interesting attribute of inlining, inlined code also effectively escapes the library ABI so it does not need versioning / exporting. > 3. As I understand we plan to have a library that will use these functions > anyway. Is this library going to be a generic or specific to Intel CPUs? Also, if there are no other uses for the MSR API at the moment, perhaps the best place for this code would be within that library anyway? > About dummy entries in the .map file: if we'll create a 'weak' generic > implementation, > that would just return an error - would it solve the issue? Sure it'd solve the issue of dummy entries in .map but people seemed opposed to having a highly arch-specific API exposed to all architectures. - Panu -
[dpdk-dev] [PATCH V1 1/1] jobstats: added function abort for job
On 01/26/2016 06:15 PM, Marcin Kerlin wrote: > This patch adds new function rte_jobstats_abort. It marks *job* as finished > and time of this work will be add to management time instead of execution > time. > This function should be used instead of rte_jobstats_finish if condition > occure, > condition is defined by the application for example when receiving n>0 > packets. > > Signed-off-by: Marcin Kerlin > --- > lib/librte_jobstats/rte_jobstats.c | 22 ++ > lib/librte_jobstats/rte_jobstats.h | 17 + > lib/librte_jobstats/rte_jobstats_version.map | 7 +++ > 3 files changed, 46 insertions(+) > [...] > diff --git a/lib/librte_jobstats/rte_jobstats.h > b/lib/librte_jobstats/rte_jobstats.h > index de6a89a..9995319 100644 > --- a/lib/librte_jobstats/rte_jobstats.h > +++ b/lib/librte_jobstats/rte_jobstats.h > @@ -90,6 +90,9 @@ struct rte_jobstats { > uint64_t exec_cnt; > /**< Execute count. */ > > + uint64_t last_job_time; > + /**< Last job time */ > + > char name[RTE_JOBSTATS_NAMESIZE]; > /**< Name of this job */ > AFAICS this is an ABI break and as such, needs to be preannounced, see http://dpdk.org/doc/guides/contributing/versioning.html For 2.3 it'd need to be a CONFIG_RTE_NEXT_ABI feature. - Panu -
[dpdk-dev] [PATCH v6 1/2] mbuf: provide rte_pktmbuf_alloc_bulk API
On 01/26/2016 07:03 PM, Huawei Xie wrote: > v6 changes: > reflect the changes in release notes and library version map file > revise our duff's code style a bit to make it more readable > > v5 changes: > add comment about duff's device and our variant implementation > > v3 changes: > move while after case 0 > add context about duff's device and why we use while loop in the commit > message > > v2 changes: > unroll the loop a bit to help the performance > > rte_pktmbuf_alloc_bulk allocates a bulk of packet mbufs. > > There is related thread about this bulk API. > http://dpdk.org/dev/patchwork/patch/4718/ > Thanks to Konstantin's loop unrolling. > > Attached the wiki page about duff's device. It explains the performance > optimization through loop unwinding, and also the most dramatic use of > case label fall-through. > https://en.wikipedia.org/wiki/Duff%27s_device > > In our implementation, we use while() loop rather than do{} while() loop > because we could not assume count is strictly positive. Using while() > loop saves one line of check if count is zero. > > Signed-off-by: Gerald Rogers > Signed-off-by: Huawei Xie > Acked-by: Konstantin Ananyev > --- > doc/guides/rel_notes/release_2_3.rst | 3 ++ > lib/librte_mbuf/rte_mbuf.h | 55 > > lib/librte_mbuf/rte_mbuf_version.map | 7 + > 3 files changed, 65 insertions(+) > > diff --git a/doc/guides/rel_notes/release_2_3.rst > b/doc/guides/rel_notes/release_2_3.rst > index 99de186..a52cba3 100644 > --- a/doc/guides/rel_notes/release_2_3.rst > +++ b/doc/guides/rel_notes/release_2_3.rst > @@ -4,6 +4,9 @@ DPDK Release 2.3 > New Features > > > +* **Enable bulk allocation of mbufs. ** > + A new function ``rte_pktmbuf_alloc_bulk()`` has been added to allow the > user > + to allocate a bulk of mbufs. > > Resolved Issues > --- > diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h > index f234ac9..b2ed479 100644 > --- a/lib/librte_mbuf/rte_mbuf.h > +++ b/lib/librte_mbuf/rte_mbuf.h > @@ -1336,6 +1336,61 @@ static inline struct rte_mbuf > *rte_pktmbuf_alloc(struct rte_mempool *mp) > } > > /** > + * Allocate a bulk of mbufs, initialize refcnt and reset the fields to > default > + * values. > + * > + * @param pool > + *The mempool from which mbufs are allocated. > + * @param mbufs > + *Array of pointers to mbufs > + * @param count > + *Array size > + * @return > + * - 0: Success > + */ > +static inline int rte_pktmbuf_alloc_bulk(struct rte_mempool *pool, > + struct rte_mbuf **mbufs, unsigned count) > +{ > + unsigned idx = 0; > + int rc; > + > + rc = rte_mempool_get_bulk(pool, (void **)mbufs, count); > + if (unlikely(rc)) > + return rc; > + > + /* To understand duff's device on loop unwinding optimization, see > + * https://en.wikipedia.org/wiki/Duff's_device. > + * Here while() loop is used rather than do() while{} to avoid extra > + * check if count is zero. > + */ > + switch (count % 4) { > + case 0: > + while (idx != count) { > + RTE_MBUF_ASSERT(rte_mbuf_refcnt_read(mbufs[idx]) == 0); > + rte_mbuf_refcnt_set(mbufs[idx], 1); > + rte_pktmbuf_reset(mbufs[idx]); > + idx++; > + case 3: > + RTE_MBUF_ASSERT(rte_mbuf_refcnt_read(mbufs[idx]) == 0); > + rte_mbuf_refcnt_set(mbufs[idx], 1); > + rte_pktmbuf_reset(mbufs[idx]); > + idx++; > + case 2: > + RTE_MBUF_ASSERT(rte_mbuf_refcnt_read(mbufs[idx]) == 0); > + rte_mbuf_refcnt_set(mbufs[idx], 1); > + rte_pktmbuf_reset(mbufs[idx]); > + idx++; > + case 1: > + RTE_MBUF_ASSERT(rte_mbuf_refcnt_read(mbufs[idx]) == 0); > + rte_mbuf_refcnt_set(mbufs[idx], 1); > + rte_pktmbuf_reset(mbufs[idx]); > + idx++; > + } > + } > + return 0; > +} > + > +/** >* Attach packet mbuf to another packet mbuf. >* >* After attachment we refer the mbuf we attached as 'indirect', > diff --git a/lib/librte_mbuf/rte_mbuf_version.map > b/lib/librte_mbuf/rte_mbuf_version.map > index e10f6bd..257c65a 100644 > --- a/lib/librte_mbuf/rte_mbuf_version.map > +++ b/lib/librte_mbuf/rte_mbuf_version.map > @@ -18,3 +18,10 @@ DPDK_2.1 { > rte_pktmbuf_pool_create; > > } DPDK_2.0; > + > +DPDK_2.3 { > + global: > + > + rte_pktmbuf_alloc_bulk; > + > +} DPDK_2.1; > Since rte_pktmbuf_alloc_bulk() is an inline function, it is not part of the library ABI and should not be listed in the version map. I assume its inline for performance reasons, but then you lose the benefits of dynamic linking such as ability to fix bugs and/or improve itby just updating the library. Since the point of
[dpdk-dev] [PATCH V1 1/1] jobstats: added function abort for job
On 01/27/2016 05:57 PM, Jastrzebski, MichalX K wrote: >> -Original Message- >> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Panu Matilainen >> Sent: Wednesday, January 27, 2016 2:38 PM >> To: Kerlin, MarcinX ; dev at dpdk.org >> Subject: Re: [dpdk-dev] [PATCH V1 1/1] jobstats: added function abort for job >> >> On 01/26/2016 06:15 PM, Marcin Kerlin wrote: >>> This patch adds new function rte_jobstats_abort. It marks *job* as finished >>> and time of this work will be add to management time instead of execution >> time. >>> This function should be used instead of rte_jobstats_finish if condition >> occure, >>> condition is defined by the application for example when receiving n>0 >> packets. >>> >>> Signed-off-by: Marcin Kerlin >>> --- >>>lib/librte_jobstats/rte_jobstats.c | 22 ++ >>>lib/librte_jobstats/rte_jobstats.h | 17 + >>>lib/librte_jobstats/rte_jobstats_version.map | 7 +++ >>>3 files changed, 46 insertions(+) >>> >> [...] >>> diff --git a/lib/librte_jobstats/rte_jobstats.h >> b/lib/librte_jobstats/rte_jobstats.h >>> index de6a89a..9995319 100644 >>> --- a/lib/librte_jobstats/rte_jobstats.h >>> +++ b/lib/librte_jobstats/rte_jobstats.h >>> @@ -90,6 +90,9 @@ struct rte_jobstats { >>> uint64_t exec_cnt; >>> /**< Execute count. */ >>> >>> + uint64_t last_job_time; >>> + /**< Last job time */ >>> + >>> char name[RTE_JOBSTATS_NAMESIZE]; >>> /**< Name of this job */ >>> >> >> AFAICS this is an ABI break and as such, needs to be preannounced, see >> http://dpdk.org/doc/guides/contributing/versioning.html >> For 2.3 it'd need to be a CONFIG_RTE_NEXT_ABI feature. >> >> - Panu - > > Hi Panu, > Thanks for Your notice. This last_job_time field is actually not necessary > here > and will be removed from this structure. Right, makes sense. You can always add it later on when there's a more pressing need to break the ABI. - Panu -
[dpdk-dev] [PATCH v2] fix checkpatch errors
On 01/28/2016 10:38 AM, Xie, Huawei wrote: > On 1/28/2016 4:06 PM, Thomas Monjalon wrote: >> 2016-01-28 03:09, Xie, Huawei: >>> On 1/28/2016 2:17 AM, Thomas Monjalon wrote: 2016-01-27 01:26, Huawei Xie: > v2 changes: > add missed commit message in v1 > > fix the error reported by checkpatch: > "ERROR: return is not a function, parentheses are not required" > > also removed other extra parentheses like: > "return val == 0" > "return (rte_mempool_lookup(...))" How these examples are differents from above checkpatch error? >>> Don't get it. >> Me too ;) >> I don't understand which paren you removed in "return val == 0" >> and why you say "also removed other...", meaning it is different >> from the checkpatch error. > > Got you. I thought your example means DPDK examples. > I mean i also removed paren in "return (val == 0)". But checkpatch > doesn't report "return (logical expression)" as error. I think it is > also not necessary, so removed some of them. That is why i listed them > seperately. > So perhaps there's a reason checkpatch doesn't report it as an error? At least I find the parentheses to increase readability in case of logical expressions, for example return val == 0; return (val == 0); The parentheses kinda force you to notice there's something special going on and its not val that's returned. This "note there's something special here" of course only works if parentheses are not sprinkled around everywhere. - Panu -
[dpdk-dev] [PATCH v3 4/4] virtio: check if kernel driver is manipulating the virtio device
On 01/27/2016 05:21 PM, Huawei Xie wrote: > v3 changes: > change log message to tell user that the virtio device is skipped > due to it is managed by kernel driver, instead of asking user to > unbind it from kernel driver. > > v2 changes: > change LOG level from ERR to INFO > > virtio PMD could use IO port to configure the virtio device without > using uio driver(vfio-noniommu mode should work as well). > > There are two issues with previous implementation: > 1) virtio PMD will take over each virtio device blindly even if some > are not intended for DPDK. > 2) driver conflict between virtio PMD and virtio-net kernel driver. > > This patch checks if there is any kernel driver manipulating the virtio > device before virtio PMD uses IO port to configure the device. > > Fixes: da978dfdc43b ("virtio: use port IO to get PCI resource") > > Signed-off-by: Huawei Xie > --- > drivers/net/virtio/virtio_ethdev.c | 5 + > 1 file changed, 5 insertions(+) > > diff --git a/drivers/net/virtio/virtio_ethdev.c > b/drivers/net/virtio/virtio_ethdev.c > index e815acd..ea1874a 100644 > --- a/drivers/net/virtio/virtio_ethdev.c > +++ b/drivers/net/virtio/virtio_ethdev.c > @@ -1138,6 +1138,11 @@ static int virtio_resource_init_by_ioports(struct > rte_pci_device *pci_dev) > int found = 0; > size_t linesz; > > + if (pci_dev->kdrv != RTE_KDRV_NONE) { > + PMD_INIT_LOG(INFO, "skip kernel managed virtio device."); > + return -1; > + } > + > snprintf(pci_id, sizeof(pci_id), PCI_PRI_FMT, >pci_dev->addr.domain, >pci_dev->addr.bus, > "Manage" is a good term for this, much better than "manipulate" used in the subject of this patch and patch 2/4. "Check if kernel is manipulating foo" sounds like something that is happening right now, as in "wait until kernel has stopped fiddling with it and then do our own stuff while its quiet", managed makes is clear its about the overall state instead. Not asking you to submit v4 just because of that, but if the need arises for other reasons it'd be nice to fix it as well, otherwise perhaps Thomas can adjust it while committing? - Panu -
[dpdk-dev] [PATCH v6 1/2] tools: Add support for handling built-in kernel modules
On 01/28/2016 01:17 PM, Kamil Rytarowski wrote: > > > W dniu 26.01.2016 o 16:23, Thomas Monjalon pisze: >> 2016-01-20 10:48, krytarowski at caviumnetworks.com: >>> --- a/tools/dpdk_nic_bind.py >>> +++ b/tools/dpdk_nic_bind.py >>> -for line in loaded_mods: >>> +try: >>> +# Get list of syfs modules, some of them might be builtin >>> and merge with mods >> Please could you explain this comment? >> Is it remaining from previous versions of the patch? > > Yes. It might be changed to: > # Get list of sysfs modules (both built-in and dynamically loaded) > >> [...] >>> +# special case for vfio_pci (module is named vfio-pci, >>> +# but its .ko is named vfio_pci) >> Isn't it common to have dash replaced by underscore for kernel modules? >> > > I retained the logic for special case of vfio-pci. At the moment > (according to my knowledge) there are no other DPDK modules with this > name replacement. > > I checked few example Linux modules and if a module is named with dash, > it's being replaced to underscore. The modprobe(8) tool can accept both > names as interchangeable (with dash and underscore). > > Would you like to make it a general rule and replace all dashes with > underscores? It would be nice to behave the same as modprobe wrt dash and underscore, yes. - Panu - > Thank you
[dpdk-dev] [PATCH 3/4] lib/librte_port: add packet dumping to PCAP file support in sink port
On 01/27/2016 07:39 PM, Fan Zhang wrote: > Originally, sink ports in librte_port releases received mbufs back to > mempool. This patch adds optional packet dumping to PCAP feature in sink > port: the packets will be dumped to user defined PCAP file for storage or > debugging. The user may also choose the sink port's activity: either it > continuously dump the packets to the file, or stops at certain dumping > > This feature shares same CONFIG_RTE_PORT_PCAP compiler option as source > port PCAP file support feature. Users can enable or disable this feature > by setting CONFIG_RTE_PORT_PCAP compiler option "y" or "n". > > Signed-off-by: Fan Zhang > Acked-by: Cristian Dumitrescu > --- > lib/librte_port/rte_port_source_sink.c | 268 > +++-- > lib/librte_port/rte_port_source_sink.h | 11 +- > 2 files changed, 263 insertions(+), 16 deletions(-) > [...] > +#ifdef RTE_PORT_PCAP > + > +/** > + * Open PCAP file for dumping packets to the file later > + * > + * @param port > + * Handle to sink port > + * @param p > + * Sink port parameter > + * @return > + * 0 on SUCCESS > + * error code otherwise > + */ [...] > + > +#else > + > +static int > +pcap_sink_open(struct rte_port_sink *port, > + __rte_unused struct rte_port_sink_params *p) > +{ > + port->dumper = NULL; > + port->max_pkts = 0; > + port->pkt_index = 0; > + port->dump_finish = 0; > + > + return 0; > +} Shouldn't this just return -ENOTSUP instead of success when the pcap feature is not built in? > + > +static void > +pcap_sink_dump_pkt(__rte_unused struct rte_port_sink *port, > + __rte_unused struct rte_mbuf *mbuf) {} > + > +static void > +pcap_sink_flush_pkt(__rte_unused void *dumper) {} > + > +static void > +pcap_sink_close(__rte_unused void *dumper) {} > + > +#endif > + > static void * > rte_port_sink_create(__rte_unused void *params, int socket_id) > { > struct rte_port_sink *port; > + struct rte_port_sink_params *p = params; > + int status; > > /* Memory allocation */ > port = rte_zmalloc_socket("PORT", sizeof(*port), > @@ -360,6 +532,19 @@ rte_port_sink_create(__rte_unused void *params, int > socket_id) > return NULL; > } > > + /* Try to open PCAP file for dumping, if possible */ > + status = pcap_sink_open(port, p); > + if (status < 0) { > + RTE_LOG(ERR, PORT, "%s: Failed to enable PCAP support " > + "support\n", __func__); > + rte_free(port); > + port = NULL; > + } else { > + if (port->dumper != NULL) > + RTE_LOG(INFO, PORT, "Ready to dump packets to file " > + "%s\n", p->file_name); > + } > + > return port; > } > > @@ -369,6 +554,8 @@ rte_port_sink_tx(void *port, struct rte_mbuf *pkt) > __rte_unused struct rte_port_sink *p = (struct rte_port_sink *) port; > > RTE_PORT_SINK_STATS_PKTS_IN_ADD(p, 1); > + if (p->dumper != NULL) > + pcap_sink_dump_pkt(p, pkt); > rte_pktmbuf_free(pkt); > RTE_PORT_SINK_STATS_PKTS_DROP_ADD(p, 1); > > @@ -387,21 +574,44 @@ rte_port_sink_tx_bulk(void *port, struct rte_mbuf > **pkts, > > RTE_PORT_SINK_STATS_PKTS_IN_ADD(p, n_pkts); > RTE_PORT_SINK_STATS_PKTS_DROP_ADD(p, n_pkts); > - for (i = 0; i < n_pkts; i++) { > - struct rte_mbuf *pkt = pkts[i]; > - > - rte_pktmbuf_free(pkt); > + if (p->dumper) { > + for (i = 0; i < n_pkts; i++) { > + struct rte_mbuf *pkt = pkts[i]; > + > + pcap_sink_dump_pkt(p, pkt); > + rte_pktmbuf_free(pkt); > + } > + } else { > + for (i = 0; i < n_pkts; i++) { > + struct rte_mbuf *pkt = pkts[i]; > + > + rte_pktmbuf_free(pkt); > + } > } > } else { > - for ( ; pkts_mask; ) { > - uint32_t pkt_index = __builtin_ctzll(pkts_mask); > - uint64_t pkt_mask = 1LLU << pkt_index; > - struct rte_mbuf *pkt = pkts[pkt_index]; > - > - RTE_PORT_SINK_STATS_PKTS_IN_ADD(p, 1); > - RTE_PORT_SINK_STATS_PKTS_DROP_ADD(p, 1); > - rte_pktmbuf_free(pkt); > - pkts_mask &= ~pkt_mask; > + if (p->dumper) { > + for ( ; pkts_mask; ) { > + uint32_t pkt_index = __builtin_ctzll(pkts_mask); > + uint64_t pkt_mask = 1LLU << pkt_index; > + struct rte_mbuf *pkt = pkts[pkt_index]; > + > + RTE_PORT_SINK_STATS_PKTS_IN_ADD(p, 1); > + RTE_PORT_SINK_STATS_PKTS
[dpdk-dev] [PATCH 1/4] lib/librte_port: add PCAP file support to source port
On 01/27/2016 07:39 PM, Fan Zhang wrote: > Originally, source ports in librte_port is an input port used as packet > generator. Similar to Linux kernel /dev/zero character device, it > generates null packets. This patch adds optional PCAP file support to > source port: instead of sending NULL packets, the source port generates > packets copied from a PCAP file. To increase the performance, the packets > in the file are loaded to memory initially, and copied to mbufs in circular > manner. Users can enable or disable this feature by setting > CONFIG_RTE_PORT_PCAP compiler option "y" or "n". > > Signed-off-by: Fan Zhang > Acked-by: Cristian Dumitrescu > --- > config/common_bsdapp | 1 + > config/common_linuxapp | 1 + > lib/librte_port/Makefile | 4 + > lib/librte_port/rte_port_source_sink.c | 190 > + > lib/librte_port/rte_port_source_sink.h | 7 ++ > mk/rte.app.mk | 1 + > 6 files changed, 204 insertions(+) > [...] > +#ifdef RTE_PORT_PCAP > + > +/** > + * Load PCAP file, allocate and copy packets in the file to memory > + * > + * @param p > + * Parameters for source port > + * @param port > + * Handle to source port > + * @param socket_id > + * Socket id where the memory is created > + * @return > + * 0 on SUCCESS > + * error code otherwise > + */ > +static int > +pcap_source_load(struct rte_port_source_params *p, > + struct rte_port_source *port, > + int socket_id) > +{ [...] > +#else > +static int > +pcap_source_load(__rte_unused struct rte_port_source_params *p, > + struct rte_port_source *port, > + __rte_unused int socket_id) > +{ > + port->pkt_buff = NULL; > + port->pkt_len = NULL; > + port->pkts = NULL; > + port->pkt_index = 0; > + > + return 0; > +} > +#endif Same as in patch 3/4, shouldn't this return -ENOTSUP when pcap support is not built in, instead of success? [...] > diff --git a/lib/librte_port/rte_port_source_sink.h > b/lib/librte_port/rte_port_source_sink.h > index 0f9be79..6f39bec 100644 > --- a/lib/librte_port/rte_port_source_sink.h > +++ b/lib/librte_port/rte_port_source_sink.h > @@ -53,6 +53,13 @@ extern "C" { > struct rte_port_source_params { > /** Pre-initialized buffer pool */ > struct rte_mempool *mempool; > + /** The full path of the pcap file to read packets from */ > + char *file_name; > + /** The number of bytes to be read from each packet in the > + * pcap file. If this value is 0, the whole packet is read; > + * if it is bigger than packet size, the generated packets > + * will contain the whole packet */ > + uint32_t n_bytes_per_pkt; > }; This is a likely ABI-break. It "only" appends to the struct, which might in some cases be okay but only when there's no sensible use for the struct within arrays or embedded in structs. The ip_pipeline example for example embeds struct rte_port_source_params within another struct which is could be thought of as an indication that other applications might be doing this as well. An ABI break for librte_port has not been announced for 2.3 so you'd need to announce the intent to do so in 2.4 now, and then either wait till post 2.3 or wrap this in CONFIG_RTE_NEXT_ABI. - Panu -
[dpdk-dev] [PATCH] pci: Add the class_id support in pci probe
On 01/28/2016 11:38 PM, Thomas Monjalon wrote: > 2016-01-13 14:22, Panu Matilainen: >> On 01/13/2016 01:55 PM, Bruce Richardson wrote: >>> On Thu, Dec 31, 2015 at 09:12:14AM -0800, Stephen Hemminger wrote: >>>> On Tue, 29 Dec 2015 10:53:26 +0800 >>>> Ziye Yang wrote: >>>> >>>>> This patch is used to add the class_id support >>>>> for pci_probe since some devices need the class_info >>>>> (class_code, subclass_code, programming_interface) >>>>> >>>>> Signed-off-by: Ziye Yang >>>> >>>> Since rte_pci is exposed to application this breaks the ABI. >>> >>> But applications are not going to be defining rte_pci_ids values >>> internally, are >>> they? That is for drivers to use. Is this really an ABI breakage for >>> applications that we >>> need to be concerned about? >> >> There might not be applications using it but drivers are ABI consumers >> too - think of 3rd party drivers and such. > > Drivers are not ABI consumers in the sense that ABI means > Application Binary Interface. > We are talking about drivers interface here. > When establishing the ABI policy we were discussing about applications only. Generally speaking an ABI is an interface between two program (or software if you like) modules, its not specific to "applications". Looking at http://dpdk.org/doc/guides/contributing/versioning.html I see it does only talk about applications, but an ABI consumer can also be another library. A driver calling rte_malloc() is just as much librte_eal ABI consumer as anything else. Now, I understand that drivers use and need interface(s) that applications have no use for or simply cannot use, and those interfaces could be subject to different policies. As an extreme example, the Linux kernel has two isolated ABIs, one is the userland system call interface which is guaranteed to stay forever and the other is kernel module interface, guarantees nothing at all. In DPDK the difference is far muddier than that since all the interfaces live in common, versioned userland DSOs. So if there are two different interfaces following two different policies, it's all the more important to clearly document them. One simple way could be using a different prefix than rte_. > I agree we must allow 3rd party drivers but there is no good reason > to try to upgrade DPDK without upgrading/porting the external drivers. > If someone does not want to release its driver and keep upgrading DPDK, > it is acceptable IMHO to force an upgrade of its driver. Note that I've no particular sympathy for 3rd party drivers as such. What I *do* care about is that breakage is made explicit, as in drivers built for an incompatible version refuse to load at all, instead of silently corrupting memory etc. - Panu -
[dpdk-dev] [PATCH] pci: Add the class_id support in pci probe
On 01/29/2016 11:34 AM, Thomas Monjalon wrote: > 2016-01-29 11:21, Panu Matilainen: >> On 01/28/2016 11:38 PM, Thomas Monjalon wrote: >>> 2016-01-13 14:22, Panu Matilainen: >>>> On 01/13/2016 01:55 PM, Bruce Richardson wrote: >>>>> On Thu, Dec 31, 2015 at 09:12:14AM -0800, Stephen Hemminger wrote: >>>>>> On Tue, 29 Dec 2015 10:53:26 +0800 >>>>>> Ziye Yang wrote: >>>>>> >>>>>>> This patch is used to add the class_id support >>>>>>> for pci_probe since some devices need the class_info >>>>>>> (class_code, subclass_code, programming_interface) >>>>>>> >>>>>>> Signed-off-by: Ziye Yang >>>>>> >>>>>> Since rte_pci is exposed to application this breaks the ABI. >>>>> >>>>> But applications are not going to be defining rte_pci_ids values >>>>> internally, are >>>>> they? That is for drivers to use. Is this really an ABI breakage for >>>>> applications that we >>>>> need to be concerned about? >>>> >>>> There might not be applications using it but drivers are ABI consumers >>>> too - think of 3rd party drivers and such. >>> >>> Drivers are not ABI consumers in the sense that ABI means >>> Application Binary Interface. >>> We are talking about drivers interface here. >>> When establishing the ABI policy we were discussing about applications only. >> >> Generally speaking an ABI is an interface between two program (or >> software if you like) modules, its not specific to "applications". >> Looking at http://dpdk.org/doc/guides/contributing/versioning.html I see >> it does only talk about applications, but an ABI consumer can also be >> another library. A driver calling rte_malloc() is just as much >> librte_eal ABI consumer as anything else. >> >> Now, I understand that drivers use and need interface(s) that >> applications have no use for or simply cannot use, and those interfaces >> could be subject to different policies. As an extreme example, the Linux >> kernel has two isolated ABIs, one is the userland system call interface >> which is guaranteed to stay forever and the other is kernel module >> interface, guarantees nothing at all. >> >> In DPDK the difference is far muddier than that since all the interfaces >> live in common, versioned userland DSOs. So if there are two different >> interfaces following two different policies, it's all the more important >> to clearly document them. One simple way could be using a different >> prefix than rte_. > > Good suggestion. Or we can simply have different files with a clear notice > in their headers and in the versioning doc. > It was well split in rte_cryptodev_pmd.h Using separate headers is also good. Optimally both? :) >>> I agree we must allow 3rd party drivers but there is no good reason >>> to try to upgrade DPDK without upgrading/porting the external drivers. >>> If someone does not want to release its driver and keep upgrading DPDK, >>> it is acceptable IMHO to force an upgrade of its driver. >> >> Note that I've no particular sympathy for 3rd party drivers as such. >> What I *do* care about is that breakage is made explicit, as in drivers >> built for an incompatible version refuse to load at all, instead of >> silently corrupting memory etc. > > OK I agree. Cool, the rest is just details then. > Anyway the ABI versionning does not cover the structure changes. > What about making a DPDK version check when registering a driver? > So a binary driver would be clearly bound to a DPDK version. That's one possibility. Another way to achieve essentially the same is to make rte_eal_driver_register() symbol version follow the DPDK version, in which case a driver built for another version will fail at dlopen() already. > And we should change or remove the .so version which never change for > most of drivers. Yup, so-versioning DSOs which do not offer any ABI is kinda pointless. - Panu -
[dpdk-dev] [PATCH] mem: skip memory locking on failure
On 06/13/2016 01:26 PM, Olivier Matz wrote: > Since recently [1], it is not possible to run the dpdk with user > (non-root) privileges and the --no-huge option. This is because the eal > layer tries to lock the memory. Using locked memory is mandatory for > physical devices because they reference physical addresses. > > But a user may want to start the dpdk without locked memory, because he > does not have the permission to do so, and/or does not have this need. > > Moreover, the option --no-huge is still not functional today since the > physical memory address is not properly filled, so the initial patch is > not really useful. > > This commit fixes this issue by retrying the mmap() wihout the > MAP_LOCKED flag if the first mmap() failed. > > [1] http://www.dpdk.org/ml/archives/dev/2016-May/039404.html > > Fixes: 593a084afc2b ("mem: lock pages when not using hugepages") > Reported-by: Panu Matilainen > Signed-off-by: Olivier Matz > --- > lib/librte_eal/linuxapp/eal/eal_memory.c | 9 + > 1 file changed, 9 insertions(+) > > diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c > b/lib/librte_eal/linuxapp/eal/eal_memory.c > index 79d1d2d..08692d1 100644 > --- a/lib/librte_eal/linuxapp/eal/eal_memory.c > +++ b/lib/librte_eal/linuxapp/eal/eal_memory.c > @@ -1075,6 +1075,15 @@ rte_eal_hugepage_init(void) > if (internal_config.no_hugetlbfs) { > addr = mmap(NULL, internal_config.memory, PROT_READ | > PROT_WRITE, > MAP_LOCKED | MAP_PRIVATE | MAP_ANONYMOUS, 0, 0); > + /* retry without MAP_LOCKED */ > + if (addr == MAP_FAILED && errno == EAGAIN) { > + addr = mmap(NULL, internal_config.memory, > + PROT_READ | PROT_WRITE, > + MAP_PRIVATE | MAP_ANONYMOUS, 0, 0); > + if (addr != MAP_FAILED) > + RTE_LOG(NOTICE, EAL, > + "Cannot lock memory: don't use physical > devices\n"); > + } > if (addr == MAP_FAILED) { > RTE_LOG(ERR, EAL, "%s: mmap() failed: %s\n", __func__, > strerror(errno)); > I'm not really that familiar with dpdk memory usage, but gut feeling says such a thing needs to be explicit - either you explicitly ask for memory that doesn't need to be locked, or this simply fails with no retries. Or something like that. But "maybe I did, maybe I didn't" doesn't seem like very good API semantics to me :) Are there actual plans to make --no-huge work with real devices? If not then documenting --no-huge to imply unlocked memory is one option I guess. - Panu - - Panu -
[dpdk-dev] [PATCH v2 1/1] eal: fix resource leak of mapped memory
On 06/15/2016 12:35 PM, Kerlin, MarcinX wrote: > Hi Sergio, > > Thanks for tips, I removed double casting and I leave (void *) casting > because pointer hp is const and compiler returns warnings. If hp is something that needs freeing then it shouldn't be const in the first place. Please fix that instead. - Panu -
[dpdk-dev] random pkt generator PMD
On 06/15/2016 02:10 PM, Yerden Zhumabekov wrote: > > > On 15.06.2016 16:43, Dumitrescu, Cristian wrote: >> >>> -Original Message- >>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Yerden >>> Zhumabekov >>> Sent: Wednesday, June 15, 2016 10:44 AM >>> To: dev at dpdk.org >>> Subject: [dpdk-dev] random pkt generator PMD >>> >>> Hello everybody, >>> >>> DPDK already got a number of PMDs for various eth devices, it even has >>> PMD emulations for backends such as pcap, sw rings etc. >>> >>> I've been thinking about the idea of having PMD which would generate >>> mbufs on the fly in some randomized fashion. This would serve goals >>> like, for example: >>> >>> 1) running tests for applications with network processing capabilities >>> without additional software packet generators; >>> 2) making performance measurements with no hw inteference; >>> 3) ability to run without root privileges, --no-pci, --no-huge, for CI >>> build, so on. >>> >>> Maybe there's no such need, and these goals may be achieved by other >>> means and this idea is flawed? Any thoughts? >> How about a Perl/Python script to generate a PCAP file with random >> packets and then feed the PCAP file to the PCAP PMD? >> >> Random can mean different requirements for different >> users/application, I think it is difficult to fit this under a simple >> generic API. Customizing the script for different requirements if a >> far better option in my opinion. > > AFAIK, the thing about pcap pmd is that one needs to rewind pcap file > once pcap pmd reaches its end. It requires additional (non-generic) > handling in app code. So add a loop-mode to pcap pmd? - Panu -
[dpdk-dev] random pkt generator PMD
On 06/15/2016 03:14 PM, Yerden Zhumabekov wrote: > > > On 15.06.2016 17:25, Panu Matilainen wrote: >> On 06/15/2016 02:10 PM, Yerden Zhumabekov wrote: >>> >>> >>> On 15.06.2016 16:43, Dumitrescu, Cristian wrote: >>>> >>>>> >>>>> Hello everybody, >>>>> >>>>> DPDK already got a number of PMDs for various eth devices, it even has >>>>> PMD emulations for backends such as pcap, sw rings etc. >>>>> >>>>> I've been thinking about the idea of having PMD which would generate >>>>> mbufs on the fly in some randomized fashion. This would serve goals >>>>> like, for example: >>>>> >>>>> 1) running tests for applications with network processing capabilities >>>>> without additional software packet generators; >>>>> 2) making performance measurements with no hw inteference; >>>>> 3) ability to run without root privileges, --no-pci, --no-huge, for CI >>>>> build, so on. >>>>> >>>>> Maybe there's no such need, and these goals may be achieved by other >>>>> means and this idea is flawed? Any thoughts? >>>> How about a Perl/Python script to generate a PCAP file with random >>>> packets and then feed the PCAP file to the PCAP PMD? >>>> >>>> Random can mean different requirements for different >>>> users/application, I think it is difficult to fit this under a simple >>>> generic API. Customizing the script for different requirements if a >>>> far better option in my opinion. >>> >>> AFAIK, the thing about pcap pmd is that one needs to rewind pcap file >>> once pcap pmd reaches its end. It requires additional (non-generic) >>> handling in app code. >> >> So add a loop-mode to pcap pmd? > > It would be nice to have an option like "...,rewind=1,...". As Cristian points out in http://dpdk.org/ml/archives/dev/2016-June/041589.html, the current pmd behavior of stopping is the odd man out in the pmd crowd. Rather than whether to rewind or not, I'd make the number of loops configurable, defaulting to forever and 1 being the equal to current behavior. - Panu -
[dpdk-dev] [PATCH v4] eal: out-of-bounds write
On 06/15/2016 04:25 PM, Slawomir Mrozowicz wrote: > Overrunning array mcfg->memseg of 256 44-byte elements > at element index 257 using index j. > Fixed by add condition with message information. > > Fixes: af75078fece3 ("first public release") > Coverity ID 13282 > > Signed-off-by: Slawomir Mrozowicz > --- > lib/librte_eal/linuxapp/eal/eal_memory.c | 9 + > 1 file changed, 9 insertions(+) > > diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c > b/lib/librte_eal/linuxapp/eal/eal_memory.c > index 5b9132c..19753b1 100644 > --- a/lib/librte_eal/linuxapp/eal/eal_memory.c > +++ b/lib/librte_eal/linuxapp/eal/eal_memory.c > @@ -1301,6 +1301,15 @@ rte_eal_hugepage_init(void) > break; > } > > + if (j >= RTE_MAX_MEMSEG) { > + RTE_LOG(ERR, EAL, > + "Failed: all memsegs used by ivshmem.\n" > + "Current %d is not enough.\n" > + "Please either increase the RTE_MAX_MEMSEG\n", > + RTE_MAX_MEMSEG); > + return -ENOMEM; > + } The error message is either incomplete or not coherent: "please either increase..." or what? Also no need for that "Failed:" because its already prefixed by "Error:". I'm not sure how helpful it is to have an error message suggest increasing a value that requires recomplication, but maybe something more in the lines of: ("All memory segments exhausted by IVSHMEM. Try recompiling with larger RTE_MAX_MEMSEG than current %d?", RTE_MAX_MEMSEG) - Panu -
[dpdk-dev] [PATCHv7 1/6] pmdinfogen: Add buildtools and pmdinfogen utility
On 06/09/2016 08:46 PM, Neil Horman wrote: > pmdinfogen is a tool used to parse object files and build json strings for > use in later determining hardware support in a dso or application binary. > pmdinfo looks for the non-exported symbol names this_pmd_name and > this_pmd_tbl (where n is a integer counter). It records the name of > each of these tuples, using the later to find the symbolic name of the > pci_table for physical devices that the object supports. With this > information, it outputs a C file with a single line of the form: > > static char *_driver_info[] __attribute__((used)) = " \ > PMD_DRIVER_INFO="; > > Where is the arbitrary name of the pmd, and is the > json encoded string that hold relevant pmd information, including the pmd > name, type and optional array of pci device/vendor ids that the driver > supports. > > This c file is suitable for compiling to object code, then relocatably > linking into the parent file from which the C was generated. This creates > an entry in the string table of the object that can inform a later tool > about hardware support. > > Signed-off-by: Neil Horman > CC: Bruce Richardson > CC: Thomas Monjalon > CC: Stephen Hemminger > CC: Panu Matilainen > --- Unlike earlier versions, pmdinfogen ends up installed in bindir during "make install". Is that intentional, or just a side-effect from using rte.hostapp.mk? If its intentional it probably should be prefixed with dpdk_ like the other tools. - Panu -
[dpdk-dev] [PATCHv7 5/6] pmdinfo.py: Add tool to query binaries for hw and other support information
On 06/09/2016 08:47 PM, Neil Horman wrote: > This tool searches for the primer sting PMD_DRIVER_INFO= in any ELF binary, > and, if found parses the remainder of the string as a json encoded string, > outputting the results in either a human readable or raw, script parseable > format > > Note that, in the case of dynamically linked applications, pmdinfo.py will > scan for implicitly linked PMDs by searching the specified binaries > .dynamic section for DT_NEEDED entries that contain the substring > librte_pmd. The DT_RUNPATH, LD_LIBRARY_PATH, /usr/lib and /lib are > searched for these libraries, in that order > > If a file is specified with no path, it is assumed to be a PMD DSO, and the > LD_LIBRARY_PATH, /usr/lib[64]/ and /lib[64] is searched for it > > Currently the tool can output data in 3 formats: > > a) raw, suitable for scripting, where the raw JSON strings are dumped out > b) table format (default) where hex pci ids are dumped in a table format > c) pretty, where a user supplied pci.ids file is used to print out vendor > and device strings > > Signed-off-by: Neil Horman > CC: Bruce Richardson > CC: Thomas Monjalon > CC: Stephen Hemminger > CC: Panu Matilainen > --- > mk/rte.sdkinstall.mk | 2 + > tools/pmdinfo.py | 629 > +++ > 2 files changed, 631 insertions(+) > create mode 100755 tools/pmdinfo.py > > diff --git a/mk/rte.sdkinstall.mk b/mk/rte.sdkinstall.mk > index 68e56b6..dc36df5 100644 > --- a/mk/rte.sdkinstall.mk > +++ b/mk/rte.sdkinstall.mk > @@ -126,6 +126,8 @@ install-runtime: > $(Q)$(call rte_mkdir, $(DESTDIR)$(sbindir)) > $(Q)$(call rte_symlink,$(DESTDIR)$(datadir)/tools/dpdk_nic_bind.py, > \ > $(DESTDIR)$(sbindir)/dpdk_nic_bind) > + $(Q)$(call rte_symlink,$(DESTDIR)$(datadir)/tools/pmdinfo.py, \ > +$(DESTDIR)$(bindir)/dpdk-pmdinfo) The symlink should be with underscore instead of dash for consistency with all the other tools, ie dpdk_pmdinfo. Neil, I already gave you an ack on the series as per the functionality, feel free to include that in any future versions of the patch series. Minor nits like these are ... well, minor nits from my POV at least. - Panu -
[dpdk-dev] [PATCHv7 1/6] pmdinfogen: Add buildtools and pmdinfogen utility
On 06/16/2016 04:33 PM, Neil Horman wrote: > On Thu, Jun 16, 2016 at 03:29:57PM +0300, Panu Matilainen wrote: >> On 06/09/2016 08:46 PM, Neil Horman wrote: >>> pmdinfogen is a tool used to parse object files and build json strings for >>> use in later determining hardware support in a dso or application binary. >>> pmdinfo looks for the non-exported symbol names this_pmd_name and >>> this_pmd_tbl (where n is a integer counter). It records the name of >>> each of these tuples, using the later to find the symbolic name of the >>> pci_table for physical devices that the object supports. With this >>> information, it outputs a C file with a single line of the form: >>> >>> static char *_driver_info[] __attribute__((used)) = " \ >>> PMD_DRIVER_INFO="; >>> >>> Where is the arbitrary name of the pmd, and is the >>> json encoded string that hold relevant pmd information, including the pmd >>> name, type and optional array of pci device/vendor ids that the driver >>> supports. >>> >>> This c file is suitable for compiling to object code, then relocatably >>> linking into the parent file from which the C was generated. This creates >>> an entry in the string table of the object that can inform a later tool >>> about hardware support. >>> >>> Signed-off-by: Neil Horman >>> CC: Bruce Richardson >>> CC: Thomas Monjalon >>> CC: Stephen Hemminger >>> CC: Panu Matilainen >>> --- >> >> Unlike earlier versions, pmdinfogen ends up installed in bindir during "make >> install". Is that intentional, or just a side-effect from using >> rte.hostapp.mk? If its intentional it probably should be prefixed with dpdk_ >> like the other tools. >> > Im not sure what the answer is here. As you can see, Thomas and I argued at > length over which makefile to use, and I gave up, so I suppose you can call it > intentional. Being in bindir makes a reasonable amount of sense I suppose, as > 3rd party developers can use it during their independent driver development. Right, it'd be useful for 3rd party driver developer, so lets consider it intentional :) > I'm not sure I agree with prefixing it though. Given that the hostapp.mk file > installs everything there, and nothing that previously used that make file > had a > dpdk_ prefix that I can tell, I'm not sure why this would. pmdinfogen seems > like a pretty unique name, and I know of no other project that uses the term > pmd > to describe anything. I agree about "pmd" being fairly unique as is, but if pmdinfo is dpdk_ prefixed then this should be too, or neither should be prefixed. I dont personally care which way, but it should be consistent. - Panu - > > Neil > >> - Panu - >> >>
[dpdk-dev] [PATCH] dropping librte_ivshmem - was log: deprecate history dump
On 06/10/2016 12:26 AM, Thomas Monjalon wrote: > Looking a bit more into librte_ivshmem, the documentation says we need > a Qemu patch but the URL doesn't exist anymore: > https://01.org/packet-processing/intel%C2%AE-ovdk > -> 404 Oops, we couldn't find that page > > I've never understood why we should keep this wart and now I'm going > to be upset. Good :) > To sum up the situation, eal depends on ivshmem which depends on > ring/mempool which depends... on eal. > The truth is that eal should not depends on librte_ivshmem. > And the option CONFIG_RTE_LIBRTE_IVSHMEM should not exist. > > There are 3 parts to distinguish: > > 1/ The librte_ivshmem API to export some data structures from host. > No real problem here. > > 2/ The scan of the ivshmem devices in the guest init. > It should be handled as any other PCI device with an appropriate driver. > The scan is done by rte_eal_pci_init. > > 3/ The automatic mapped allocation of DPDK objects in the guest. > It should not be done in EAL. > An ivshmem driver would be called by rte_eal_dev_init. > It would check where are the shared DPDK structures, as currently done > with the IVSHMEM_MAGIC (0x0BADC0DE), and do the appropriate allocations. > Thus only the driver would depend on ring and mempool. > > The last step of the ivshmem cleanup will be to remove the memory hack > RTE_EAL_SINGLE_FILE_SEGMENTS. Then CONFIG_RTE_LIBRTE_IVSHMEM could be > removed. > > So this is my proposal: > Someone start working on the above cleanup now, otherwise the whole > rte_ivshmem feature will be deprecated in 16.07 and removed in 16.11. > We already talked about the rte_ivshmem design issues several times > and nobody declared using it. +1 (more like +100) to that. In addition to the technical mess in EAL, there are quite some eyebrow-raisers related to IVSHMEM: That it all starts with "you'll need to build a special version of qemu" with this special patch from the 'net, a patch which doesn't even exist anymore, is a complete non-starter. Such a situation can occur during early development, but its been years by now. Dependencies to non-upstreamed features in other projects are not a healthy sign. Regardless of whether the patch has been integrated to qemu upstream or not, the situation is quite telling: nobody cares enough to have updated the information. I found a copy of the patch from my laptop, and as far as I can tell, the patch has never been proposed upstream, much less applied. Certainly the patch would not come even close to applying to current qemu. And apparently IVSHMEM is unmaintained in qemu upstream too (according to MAINTAINERS). On DPDK side, that the most obvious (to me at least) user of memnic PMD has been unmaintained for two years no, and allowed to fall off the edge of the world (witness http://dpdk.org/browse/memnic/) is also quite telling. Just deprecate it already. If somebody shows up with actual patches to clean it all up, the deprecation can be lifted of course, but cleaning up this abandonware seems like waste of engineering resources to me. - Panu -