On Wed, Apr 1, 2020 at 5:14 AM Samuel Bernardo <samuelbernardo.m...@gmail.com> wrote:
> Hi Robin,
> On 4/1/20 6:36 AM, Robin H. Johnson wrote:
>
> Normally we don't bundle dependencies, avoiding that problem entirely.
> The Go eclasses however are badly designed, committed against protest by
> paid corporate interests, and serve only to facilitate large-scale
> copyright infringement and security vulnerabilities. If you're looking
> for a consistent explanation of how they're supposed to work with the
> rest of Gentoo, you won't find one.
>
> mjo: Can you please substantiate your claims?
>
> It would have been nice to have heard your concerns during February, any
> of one the three times that William and I posted the go-module.eclass
> EGO_SUM development work for review on this mailing list. I don't see a
> single email from you during that entire period.
>
> The EGO_SUM support explicitly ensured that upstream distfiles (for each
> dependency) remained absolutely as upstream provided them, without
> merging the distfiles together or altering their content in way (I admit
> that the exact naming of the distfiles changed, because it was terrible,
> v0.0.0-20190311183353-d8887717615a.zip for example).
>
> Forgive my noobishness in this matter that let Alec to comment over my own
> statement.
>
> Alec pointed out some very important issues in go development that break
> copyright infringement and security vulnerabilities, but I'm sure that is
> not related to the good work done in go-module.eclass to surpass all go
> mess. npm is worst and I take from go-module as a good pattern to apply
> also into there.
>

I am antarus, not mjo (but more on that below!)

I don't believe bundling presents many challenges with regard to copyright infringement. As a package maintainer you should know the licenses used in your packages, and you are required to reflect all of them in the LICENSE ebuild variable. Obviously this becomes more work with a bundle, because a bundle includes more code.
In the golang ecosystem there is a tool to help maintainers do this (https://packages.gentoo.org/packages/dev-go/golicense). I get that with bundling we cannot reuse the license work from previous packages, because dependencies are not shared in a bundled environment, but I expect the golicense tool to have good coverage in practice. If the tool does the work, sharing the work becomes moot.

I think licensing can be more challenging in other bundling scenarios where tooling is not provided; but note that this is not significantly different from the unbundled scenario in terms of license discovery. If I am packaging a new program (A) and it depends on (B,C,D), I have two options. I can either package [A,B,C,D] (the normal Gentoo way) or I can package [A] (with B,C,D bundled). Working out the combined set of LICENSE values is the same effort in both cases. The benefit of multiple packages is that future users of B,C,D can re-use the license discovery work, and that isn't nothing.

> Going back to my overlay use case, will go-modules download all modules to
> distfiles directory? The naming convention will assure that there will be
> no modules repetition?
> What about eclean-dist, will it work as expected for those modules
> dependencies?
>
> I think some of this answers would worth mention in documentation.
>
> Sorry for anything I wrongly stated and thank you very much for your help,
>
> Samuel
>

I've chosen this part to write my treatise on packaging, but rest assured it's mostly intended as a response to mgorny and mjo; not specifically in response to you.

The very long answer is that Gentoo was designed around a paradigm of programs written primarily in C. In C programs you have the ability to link to libraries which offer APIs and, in the ideal case, each API is offered via a unique SONAME[0]. Upstream packages were written and built in this way (with dynamic linking).
So in the case of package A, which uses libraries B, C, and D, the result in many distributions is 4 packages (A,B,C,D), and users who want A will get B, C, and D installed. This in fact was a major selling point of package managers at the time, because finding these dependencies by hand and building and merging them all was painful.

Many applications break this trend; I don't think golang or nodejs are particularly new (python and ruby have had (pip, venv) and rubygems[1] for years, for example, which are similar bundling paradigms.) The struggle for packagers and distribution managers comes when upstream decides "my software should be installed via a bundling solution (golang, node, pip, rubygems, and so on)": we are left to decide whether to map this to the ebuild paradigm (no bundling of dependencies) or omit ebuilds entirely. In the former case we are often left working at odds with upstream (who are confused by our decomposition of their application) and in the latter case, users often use the bundle anyway (e.g. they install the packages by hand or use the ruby gems or whatever.)

I assert this is somewhat of a false choice. Bundling isn't all bad and we can learn from past mistakes[2] to try to avoid problems.

Another challenge is that bundling systems (bundler, pip, venv, golang, etc.) often pin specific versions, commits, or tags. This is fine when bundling (because each bundle carries its own version of a dependency) but when you are trying to share a system-wide package between N packages, you either need to SLOT the dependencies or have a looser dependency specification. The fine-grained nature of the upstream dependency specification can make this challenging[3].

Unbundling also made it easier for system operators to operate a system, and you see this often in the security space. A security notice will come out saying "foo-X-Y-Z is vulnerable, move to foo-Y.1." So operators want to know "do I have foo-x-y-z installed?"
When every package is in the package manager this is a trivial question. When software is bundled inside of a package, this visibility is lost. I haven't seen any tooling for Gentoo that addresses this problem.

In addition to the above, bundling can present exciting resource challenges for some deployments. Imagine a common dep (CommonFoo-x-y-z) has a security problem, so we must upgrade to CommonFoo-y-z. In the scenario where CommonFoo is a dynamically linked package we can recompile it once[4] and consumers will just use the new dynamic shared object. In a bundling scenario, we will be forced to rebuild[5] all consumers. This can take a lot of time and resources depending on the deployment. Is the deployment using a build farm? A binary packages host? How many disparate platforms are in use?

Which is to say many people in Gentoo dislike bundling for various reasons, many of them legitimate. I wish to present a narrative where bundling is an engineering trade-off, rather than a decision that is settled engineering law. This doesn't mean Gentoo needs to support all the bundling (clearly most people don't want to), but not supporting it means that many packages will not be in Gentoo at all (because unbundling is too costly), and so you end up at this exciting discussion which happens every couple of years.

-A

[0] I understand this is not always true in practice, but let's assume spherical cows momentarily.
[1] Gentoo has a rubygems-fakegem eclass that makes it pretty streamlined to make an ebuild for a particular gem, but of course if my application depends on 20 gems I still need to make 20 ebuilds in this scheme and merge them all. Rubygems-fakegem is still pretty good though!
[2] On Windows, https://en.wikipedia.org/wiki/DLL_Hell was common. In the .NET ecosystem, assemblies addressed some of these problems.
[3] Similar to DLL hell but more generically: https://en.wikipedia.org/wiki/Dependency_hell
[4] Practice, of course, leads to all kinds of weird edge cases where upgrading your shared lib causes dependencies to break for various reasons, which is one reason why application authors like to bundle: their application ends up being perceived as more reliable and less finicky.
[5] The number of package rebuilds in Gentoo is a fairly common complaint, from my personal observation. Obviously binary packages make this problem worse (not better.) I dunno if it's something the community should put more effort into or not though; my expectation is that rebuilds are common and making them more common is not a strategic problem; but I'm also not compiling on some single core atom, so what do I know, eh? :)
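
To make the visibility question above ("do I have foo-x-y-z installed?") concrete for bundled Go packages, here is a minimal sketch in Python. It is not existing Gentoo tooling. It assumes you can get at the go.sum-style lines a package was built from (for example an upstream go.sum file, or entries in the same "module version" shape with the hash dropped); the function names, the example.com module paths, and the placeholder hashes are all made up for illustration.

```python
from collections import defaultdict

GO_MOD_SUFFIX = "/go.mod"


def bundled_modules(lines):
    """Map module path -> set of versions named in go.sum-style lines.

    Each line is expected to look like "<module> <version> [<hash>]".
    The hash field is optional and ignored here.
    """
    found = defaultdict(set)
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        module, version = fields[0], fields[1]
        # go.sum carries a "<version>/go.mod" entry next to the archive entry;
        # fold both onto the plain version string.
        if version.endswith(GO_MOD_SUFFIX):
            version = version[: -len(GO_MOD_SUFFIX)]
        found[module].add(version)
    return dict(found)


def is_bundled(lines, module, version):
    """Return True if the given module version appears in the lines.

    Caveat: a go.mod-only entry means the module was part of the dependency
    graph, not necessarily that its code was compiled into the binary.
    """
    return version in bundled_modules(lines).get(module, set())


# Hypothetical input; the hashes are placeholders, not real checksums.
SAMPLE = [
    "example.com/foo v1.2.3 h1:PLACEHOLDER=",
    "example.com/foo v1.2.3/go.mod h1:PLACEHOLDER=",
    "example.com/bar v0.4.0/go.mod h1:PLACEHOLDER=",
    "",
]

if __name__ == "__main__":
    print(is_bundled(SAMPLE, "example.com/foo", "v1.2.3"))
```

Running something like this over every installed package's dependency list would answer the operator's question per package, at the cost of maintaining that list outside the package manager's own dependency graph, which is exactly the trade-off described above.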