[dpdk-dev] [PATCH] mk: fix the combined library problems by replacing it with a linker script
On Tue, Dec 01, 2015 at 02:21:02PM +0200, Panu Matilainen wrote:
> Adding a soname and a semi-arbitrary version does not fix the fundamental
> problems:
>
> Since the library lumps together everything in DPDK, you'd have to bump its
> version whenever any of the individual libraries bumps its version to have
> the version mean anything. DPDK 2.0 and 2.1 are supposedly binary compatible
> but 2.2 certainly is not, and beyond that who knows.
>
> That in turn forces all apps to be rebuilt whenever one of the libraries
> changes version, whether those apps use that particular library or not.

If we bundle all the libraries together into one package, then in distributions we have to rebuild anyway when any of the libraries changes version since a dependent package can't just depend on any later version, because we don't know in advance what ABI breaks might occur.

It's also trivial to do rebuilds in a distribution. I'd prefer to see ABI versioning done right to avoid the pain that might occur there. Rebuilding dependent packages is on the other hand straightforward.

> The combined library doesn't have symbol versioning, so besides the better
> version compatibility tracking it loses other benefits like limited symbol
> visibility.

The combined library *should* have symbol versioning, which I've brought up before. This isn't a reason to not have a combined library; it is a reason to fix the combined library.

Why is limited symbol visibility a benefit in this case?

> Not to mention the extra complexity in makefiles to support it, the
> increasing amount of duct-tape required to hold it together. And still eg
> the MLX pmds declare the configuration not supported at all.

I'd argue that this is because the build system is unnecessarily complex currently. A library consumer should just be able to #include and link with -ldpdk. It should not have a build system or custom flags imposed on it by one of the libraries it uses.

Robie
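To make the version-bump rule quoted above concrete, here is a minimal Python sketch of it: the combined library's sover has to move whenever any component library's version moves. The function name and the version numbers are assumptions for illustration only; nothing here reflects how the DPDK build actually assigns versions.

```python
def combined_sover(previous_combined, old_versions, new_versions):
    """Return the sover a combined library would need after a release.

    old_versions / new_versions map component library names to their
    individual ABI versions. Any difference (a bump, an added library,
    or a removed one) is treated as an ABI change of the combined library.
    """
    all_libs = old_versions.keys() | new_versions.keys()
    changed = [lib for lib in all_libs
               if old_versions.get(lib) != new_versions.get(lib)]
    return previous_combined + 1 if changed else previous_combined


# Illustrative numbers only: one component bumps, so the combined sover bumps.
before = {"librte_eal": 1, "librte_mbuf": 1}
after = {"librte_eal": 1, "librte_mbuf": 2}
print(combined_sover(0, before, after))
```

Under this rule every consumer of the combined library sees a new soname, including consumers that never used the component that changed, which is the rebuild cost both sides of the thread are weighing.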
[dpdk-dev] [PATCH] mk: fix the combined library problems by replacing it with a linker script
Re-sending this unsigned since the ML rejected my signed email.

-1 from Ubuntu without further discussion since it will break us. Please don't commit this patch yet.

I don't understand why we must have the complexity of so many shared libraries. From a distribution packaging perspective, all I see is that this multiplies potential work by twenty times and makes it awkward to work with without special tooling (which then needs maintaining).

Before I go into details, it would be nice if someone could please explain why DPDK has to be "special" in needing to do this? I don't understand why DPDK must be different to every other userspace library out there.

If DPDK has a good need to be different then that's fair enough. But I feel that if DPDK is deviating from the norm then we need to frame the discussion from the perspective of "why DPDK must be different", rather than having me trying to explain why the norm is the right way to do it.

On Tue, Nov 24, 2015 at 04:31:17PM +0200, Panu Matilainen wrote:
> That's how Fedora and RHEL are shipping it already and nobody has so much
> as noticed anything strange, much less complained about it. 20 libraries
> is but a drop in the ocean on an average distro.

No, it is 20 times the work from the perspective of DPDK package maintenance. Let me explain why.

In Debian and Ubuntu, we manage a library transition (an ABI bump in a library together with all dependencies moving to use the new ABI) by concurrently packaging both the old and new libraries at once.

This works well with the norm for libraries. We ship one binary package per soname, with the major version as part of the package name. This allows a system to have two (or more) ABIs installed simultaneously. For a library transition, we just package the new version and then that can land and work concurrently as we then individually update every dependent (library-consuming) package.
This works because of conventions around sonames, which DPDK breaks unless we treat it as twenty different libraries, which changes our work from easy to painful.

Usually a library transition is managed by hand by the package maintainer. It's not taxing because it's straightforward and well understood. Update and upload the new ABI source package, then find all reverse dependencies and sort them out, recursively.

But if the maintainer must do it twenty times, then it becomes taxing and prone to error. And if the reverse dependency tree differs depending on the split library used by library consumers, then it gets far more complex to follow.

Admittedly we could tool around this to make it easier, but that's extra work (both initially and in maintenance) and prone to error (because we'd only be doing it for DPDK).

Packaging a library is usually virtually a no-op in Debian and Ubuntu nowadays. Our tooling does it all for us. But packaging DPDK is far from this currently because of all this added complexity. From my perspective this is unnecessary and makes no sense. We could do all kinds of things to work around it (that's what packaging is about) but then we'd have to maintain that specialness and I don't see why it must be awkward like this instead of just doing it the same way as every other library.

> The combined library as it is simply is no longer a viable option.
> Besides just being broken (witness the strange hacks people are coming
> up with to work around issues in it) it's ugly because it basically gives
> the middle finger to all the effort going into version compatibility,
> and it's also big. Few projects will use every library in DPDK, but with
> the combined library they're forced to lug the 800 pound gorilla along
> needlessly.

It's broken because it's broken upstream, and that's what we should fix. Why is it not viable? How does it give the middle finger to effort going into version compatibility?
Doing it the right way like every other userspace library is what *gives us* version compatibility because then distributions can straightforwardly install multiple ABI versions at once.

Finally, I fail to see any "lug the 800 pound gorilla along" saving. We (Ubuntu and Fedora) are both shipping all the libraries in one package, whether split or combined, so they are all being lugged onto disk anyway. Whether split or combined, there is no saving there. And memory is hardly saved either because the kernel will just page in and out what is needed in both cases. So how does this proposed change give us any saving at all?

If distributions are expected to ship everything lumped together in one package, then we don't get any benefit of having the library split up.

I did bring this up on this list[1] and my understanding of the outcome then was that it would be fine for us to use the combined library, and in time we could better define its ABI. Thus I'm not happy that you're proposing to change tack on this, both because I'm far from convinced it's a good idea for the project and wider ecosystem and also because it creates
[dpdk-dev] [PATCH] mk: fix the combined library problems by replacing it with a linker script
e reason as PAM). If this isn't what you mean, please can you find me a counter-example? Given a soname you can find the binary package that provides it at https://www.debian.org/distrib/packages under "Search the contents of packages". I suggest you set the distribution to "testing" to find more current sonames.

Christian points out to me that libc6 does ship multiple sonames in a single package, but I think it's acceptable to consider this to be a special case that DPDK cannot really look to as an example. We don't normally co-install multiple ABI versions of libc because a major ABI bump in libc is extremely rare, and when we do it's a very special case that is handled as a major distribution-wide project.

In answer to "You must already have a solution to this", we do. Our solution is to produce one binary package per soname. My point is that in the case of DPDK, this creates extra unnecessary work. Alternatively, we could treat DPDK packaging as the same sort of gargantuan task that packaging GNOME and KDE are, but without a good reason to split libraries this would be an artificial and unnecessary burden placed on packagers by DPDK upstream, which is why I am against upstream doing this.

> > Packaging a library is usually virtually a no-op in Debian and Ubuntu
> > nowadays. Our tooling does it all for us. But packaging DPDK is far from
> > this currently because of all this added complexity. From my perspective
> > this is unnecessary and makes no sense. We could do all kinds of things
> > to work around it (that's what packaging is about) but then we'd have to
> > maintain that specialness and I don't see why it must be awkward like
> > this instead of just doing it the same way as every other library.
> >
> > > The combined library as it is simply is no longer a viable option.
> > > Besides just being broken (witness the strange hacks people are coming
> > > up with to work around issues in it) it's ugly because it basically gives
> > > the middle finger to all the effort going into version compatibility,
> > > and it's also big. Few projects will use every library in DPDK, but with
> > > the combined library they're forced to lug the 800 pound gorilla along
> > > needlessly.
> >
> > It's broken because it's broken upstream, and that's what we should fix.
> > Why is it not viable? How does it give the middle finger to effort going
> > into version compatibility?
> Because each individual library has a version script that gets applied during
> link to version symbols properly. Those scripts don't get applied when building
> the combined library.

So this is just an upstream bug that needs resolving in the combined library case? Then I appreciate Ferruh Yigit's efforts in fixing this bug upstream. Thank you Ferruh Yigit.

> > Doing it the right way like every other
> > userspace library is what *gives us* version compatibility because then
> > distributions can straightforwardly install multiple ABI versions at
> > once.
> Again, Not at all uncommon. Your packaging methodology is the issue here,
> not the fact that there are multiple libraries.

No, our packaging methodology is sound as I hope I've explained well enough above. The real issue is the yet-to-be-justified decision to split libraries creating unnecessary packaging work given that we wish to ship shared libraries properly rather than bundling all the sonames together (which defeats the point of split libraries in the first place).

> > Finally, I fail to see any "lug the 800 pound gorilla along" saving. We
> > (Ubuntu and Fedora) are both shipping all the libraries in one package,
> > whether split or combined, so they are all being lugged onto disk
> > anyway. Whether split or combined, there is no saving there.
> > And memory
> > is hardly saved either because the kernel will just page in and out what
> > is needed in both cases. So how does this proposed change give us any
> > saving at all?
> >
> Not true, initialization constructors for PMD's at the very least mean that every
> pmd will get paged in whether you want it or not using the combined library.
> Individual libraries let you dynamically load them (via dlopen). I think the
> same is true of several other facets of dpdk.

What's the objective impact of this? Can you quantify your claimed saving? How does it compare to, say, the extra IOPS required in loading multiple shared libraries and the extra pages that they could consume? Are these things at all significant in an issue someone will face in the real world?

On Tue, Dec 01, 2015 at 08:30:43AM -0500, Neil Horman wrote:
> On Tue, Dec 01, 2015 at 12:36:15PM +, Robie Bas
[dpdk-dev] libdpdk upstream changes for ecosystem best practices
Hi,

We're looking at packaging DPDK in Ubuntu. We'd like to discuss upstream changes to better integrate DPDK into Linux distributions. Here's a summary of what we need:

1) Define one library ABI (soname and sover) that we can use instead of the split build.
2) Fix #includes so we don't have to include config.h
3) Put headers into /usr/include/dpdk instead of /usr/include

You can see our current packaging progress at https://git.launchpad.net/~ubuntu-server/dpdk/log/?h=ubuntu-wily and a test PPA at https://launchpad.net/~smb/+archive/ubuntu/dpdk/

First, it would be easier for us to ship a single binary package that ships a single shared library to cover all of DPDK that library consumers might need, rather than having it split up as you do. I understand the build system is capable of doing this already, but what we don't have is a well defined soname and sover (currently parameterized in the build) for ABI compatibility purposes. As a binary distribution, this is something that we'd expect upstream to define, since normally we expect to achieve binary compatibility across all distributions at this level in the stack. So I have the following requests:

So that we can get DPDK packaging into Ubuntu immediately, please could we agree to define (and burn) libdpdk.so.0 to be the ABI that builds with upstream release 2.0.0 when built with the native-linuxapp-gcc template options plus the following changes:

CONFIG_RTE_MACHINE="default"
CONFIG_RTE_APP_TEST=n
CONFIG_LIBRTE_VHOST=y
CONFIG_RTE_EAL_IGB_UIO=n
CONFIG_RTE_LIBRTE_KNI=n
CONFIG_RTE_BUILD_COMBINE_LIBS=y
CONFIG_RTE_BUILD_SHARED_LIB=y
CONFIG_RTE_LIBNAME="dpdk"

The combined library would be placed into /usr/lib/$(ARCH)-linux-gnu/ where it can be found without modification to the library search path.
We want to ship it like this in Ubuntu anyway, but I'd prefer upstream to have defined it as such since then we'll have a proper definition of the ABI that can be shared across distributions and other consumers any time ABI compatibility is expected.

Though not strictly part of a shared library ABI, I also propose some build-related upstream changes at API level below, that I'd like to also ship in the initial Ubuntu packaging of the header files. Clearly you cannot make this change in an existing release, but I propose that you do this for your next release so all library consumers will see a consistent and standard API interface. If you agree to this, then I'd also like to ship the Ubuntu package with patches to do the same thing in your current release.

Right now, I understand that library consumers need to either:

1) use the upstream-provided build system (.mk files etc); or
2) otherwise make sure to include rte_config.h by specifying it as an extra CPPFLAGS parameter, as the upstream API documentation does not require its inclusion in source files.

This is problematic because somebody writing against multiple libraries should just expect to #include the API-defined headers and link simply with -l for the build to work. It is common to have a config.h type file generated at build time, but in this case I'd expect it to be conditionally included automatically as part of the API, for example by #include'ing it in any file the API _does_ define that library users must include.

To fix this, I propose to #include in every header file that library users may #include according to the API.

That brings me to paths. To avoid polluting the /usr/include namespace, I'd expect either a single /usr/include/dpdk.h, or everything inside /usr/include/dpdk/, or both.
Then library consumers would #include combinations of and as required, our packaging could install into these directories without stealing any other part of the shared filesystem namespace, and library users wouldn't have to be concerned about paths, configuration or build systems. This would then match every other shared library we package.

Does this sound reasonable to you? Is this a change you will accept?

Thanks,

Robie
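For anyone wanting to reproduce the build configuration requested in this message, the following is a minimal Python sketch that applies the listed overrides to a config template. The helper name `apply_overrides` and the two-line sample template are assumptions for illustration; the real native-linuxapp-gcc template contains many more options, and this is not how the DPDK build system itself processes its configuration.

```python
# Overrides exactly as listed in the message above.
UBUNTU_OVERRIDES = {
    "CONFIG_RTE_MACHINE": '"default"',
    "CONFIG_RTE_APP_TEST": "n",
    "CONFIG_LIBRTE_VHOST": "y",
    "CONFIG_RTE_EAL_IGB_UIO": "n",
    "CONFIG_RTE_LIBRTE_KNI": "n",
    "CONFIG_RTE_BUILD_COMBINE_LIBS": "y",
    "CONFIG_RTE_BUILD_SHARED_LIB": "y",
    "CONFIG_RTE_LIBNAME": '"dpdk"',
}


def apply_overrides(template, overrides):
    """Return template text with KEY=value lines replaced by the overrides.

    Keys that do not appear in the template are appended at the end.
    Comment lines and unrelated options pass through unchanged.
    """
    remaining = dict(overrides)
    out = []
    for line in template.splitlines():
        key, sep, _ = line.partition("=")
        if sep and key in remaining:
            out.append(f"{key}={remaining.pop(key)}")
        else:
            out.append(line)
    out.extend(f"{key}={value}" for key, value in remaining.items())
    return "\n".join(out) + "\n"


# Hypothetical two-line template, not the real defconfig contents.
sample_template = "CONFIG_RTE_BUILD_SHARED_LIB=n\nCONFIG_RTE_APP_TEST=y\n"
print(apply_overrides(sample_template, UBUNTU_OVERRIDES))
```

Keeping the overrides in one place makes it easier to state precisely which build produced a given ABI, which is the point of listing them in the request.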
[dpdk-dev] libdpdk upstream changes for ecosystem best practices
Hi Thomas,

On Wed, Sep 02, 2015 at 04:18:33PM +0200, Thomas Monjalon wrote:
> > First, it would be easier for us to ship a single binary package that
> > ships a single shared library to cover all of DPDK that library
> > consumers might need, rather than having it split up as you do. I
> > understand the build system is capable of doing this already, but what
> > we don't have is a well defined soname and sover (currently
> > parameterized in the build) for ABI compatibility purposes. As a binary
>
> No it is now fixed:
> http://dpdk.org/browse/dpdk/commit/?id=c3ce2ad3548

It's great that the name "dpdk" is pinned down - thanks. But we need to define the sover also, and make sure it is bumped when the ABI changes. AIUI the build currently produces no sover - is this correct?

We'll use a sover of 0 in our packaging for now, unless you object. Then we'll be able to move up to whatever you do when it is well-defined.

> > So that we can get DPDK packaging into Ubuntu immediately, please could
> > we agree to define (and burn) libdpdk.so.0 to be the ABI that builds
> > with upstream release 2.0.0 when built with the native-linuxapp-gcc
> > template options plus the following changes:
> > CONFIG_RTE_MACHINE="default"
> > CONFIG_RTE_APP_TEST=n
> > CONFIG_LIBRTE_VHOST=y
> > CONFIG_RTE_EAL_IGB_UIO=n
> > CONFIG_RTE_LIBRTE_KNI=n
> > CONFIG_RTE_BUILD_COMBINE_LIBS=y
> > CONFIG_RTE_BUILD_SHARED_LIB=y
>
> I feel this configuration is the responsibility of the distribution.
> What do you expect to have in the source project?

I just wanted to make it clear what we were doing in case changing build configuration parameters resulted in a different ABI. If this isn't the case, then that's fine - it is solely the concern of the distribution as to what build parameters we pick.

> > The combined library would be placed into /usr/lib/$(ARCH)-linux-gnu/
> > where it can be found without modification to the library search path.
> > We want to ship it like this in Ubuntu anyway, but I'd prefer upstream
> > to have defined it as such since then we'll have a proper definition of
> > the ABI that can be shared across distributions and other consumers any
> > time ABI compatibility is expected.
>
> You mean you target ABI compatibility between Linux distributions?
> But other libraries could have different versions so you would be lucky
> to have a binary application finding the same dependencies.

In theory we do get ABI compatibility between distributions. Finding the dependencies is a separate issue; but if the right binaries were installed, there would be no conflicts in finding shared libraries across binaries from different distributions if the ABI is managed right.

But that isn't directly our target. It's still useful to us to have this done right. It makes ABI transitions in the distribution (coordinating updates to libraries and their consumers concurrently) possible without breaking things in the middle. It means that when we talk to upstreams (both libraries and their consumers) then we're speaking the same language as other distributions, and patches apply to them all without each distribution having to kludge things independently. And it gives us options when different library consumers require different ABI versions since we can concurrently install two different ABIs of the same library (although we prefer to avoid that).

> > Though not strictly part of a shared library ABI, I also propose some
> > build-related upstream changes at API level below, that I'd like to also
> > ship in the initial Ubuntu packaging of the header files. Clearly you
> > cannot make this change in an existing release, but I propose that you
> > do this for your next release so all library consumers will see a
> > consistent and standard API interface. If you agree to this, then I'd
> > also like to ship the Ubuntu package with patches to do the same thing
> > in your current release.
>
> Yes cleanup patches are welcome :)

I'm arranging to have someone work on these with you upstream and send you patches, thanks.

Robie
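As a footnote on why a well-defined sover matters for co-installing two ABIs, here is a minimal Python sketch of the usual Debian naming convention, in which the binary package name is derived from the soname. The function name is an assumption, `libexample2` is a made-up library, and the sketch only handles simple `name.so.N` sonames; real package names are chosen by the maintainer and can differ.

```python
import re


def debian_package_name(soname):
    """Map a soname such as 'libdpdk.so.0' to a Debian-style package name.

    Convention: library name followed by the sover, with a hyphen
    inserted when the library name itself already ends in a digit.
    """
    match = re.fullmatch(r"(?P<name>.+)\.so\.(?P<sover>\d+)", soname)
    if match is None:
        raise ValueError(f"not a simple versioned soname: {soname!r}")
    name = match["name"].lower().replace("_", "-")
    separator = "-" if name[-1].isdigit() else ""
    return f"{name}{separator}{match['sover']}"


for soname in ("libdpdk.so.0", "libdpdk.so.1", "libexample2.so.3"):
    print(soname, "->", debian_package_name(soname))
```

Because libdpdk.so.0 and a later libdpdk.so.1 would map to different package names and different file names, both could be installed side by side while consumers move over one at a time.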