On Wed, Apr 1, 2020 at 5:14 AM Samuel Bernardo <
samuelbernardo.m...@gmail.com> wrote:

> Hi Robin,
> On 4/1/20 6:36 AM, Robin H. Johnson wrote:
>
> Normally we don't bundle dependencies, avoiding that problem entirely.
> The Go eclasses however are badly designed, committed against protest by
> paid corporate interests, and serve only to facilitate large-scale
> copyright infringement and security vulnerabilities. If you're looking
> for a consistent explanation of how they're supposed to work with the
> rest of Gentoo, you won't find one.
>
> mjo: Can you please substantiate your claims?
>
> It would have been nice to have heard your concerns during February, any
> one of the three times that William and I posted the go-module.eclass
> EGO_SUM development work for review on this mailing list. I don't see a
> single email from you during that entire period.
>
> The EGO_SUM support explicitly ensured that upstream distfiles (for each
> dependency) remained absolutely as upstream provided them, without
> merging the distfiles together or altering their content in any way (I admit
> that the exact naming of the distfiles changed, because it was terrible,
> v0.0.0-20190311183353-d8887717615a.zip for example).
>
> Forgive my noobishness in this matter, which led Alec to comment on my own
> statement.
>
> Alec pointed out some very important issues in Go development that lead to
> copyright infringement and security vulnerabilities, but I'm sure that is
> not related to the good work done in go-module.eclass to overcome the Go
> mess. npm is worse, and I take go-module as a good pattern to apply there
> as well.
>
I am antarus, not mjo (but more on that below!). I don't believe bundling
presents many challenges with regard to copyright infringement. As a
package maintainer you should know the licenses used in your packages. You
are required to reflect any licenses used in the LICENSE ebuild variable.
Obviously this becomes more work if you are using a bundle due to the fact
that bundling will include more code. In the golang ecosystem there is a
tool to help maintainers do this (
https://packages.gentoo.org/packages/dev-go/golicense). I get that with
bundling we cannot share the work from previous packages because packages
are not shared in a bundled environment but I expect the golicense tool to
have good coverage in practice. If the tool does the work, sharing the work
becomes moot.

I think licensing can be more challenging in other bundling scenarios where
tooling is not provided; but note that this is not significantly different
from the unbundled scenario in terms of license discovery. If I am
packaging a new program (A) and it depends on (B,C,D) I have two options. I
can either package [A,B,C,D] (normal gentoo way) or I can package [A] (with
B,C,D bundled). Working out the combined set of licenses takes the same
effort in both cases. The benefit of the multiple packages is that future
users of B,C,D can re-use the license discovery work and that isn't nothing.
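
To make the comparison concrete, here is a minimal Python sketch. The package
names and license assignments are made up for illustration, and the functions
are hypothetical helpers, not part of Portage or golicense. It only shows that
the licenses to discover are the same either way; what differs is where they
are recorded.

```python
# Hypothetical license data; the packages and their licenses are invented.
LICENSES = {
    "A": {"Apache-2.0"},
    "B": {"MIT"},
    "C": {"BSD", "MIT"},
    "D": {"BSD-2"},
}


def bundled_license(root, bundled_deps):
    """LICENSE string for a single ebuild that ships root plus its bundled deps."""
    names = set(LICENSES[root])
    for dep in bundled_deps:
        names |= LICENSES[dep]
    return " ".join(sorted(names))


def unbundled_licenses(packages):
    """One LICENSE string per ebuild when every package is packaged separately."""
    return {pkg: " ".join(sorted(LICENSES[pkg])) for pkg in packages}


def discovered(license_strings):
    """All distinct license names mentioned across a set of LICENSE strings."""
    found = set()
    for value in license_strings:
        found |= set(value.split())
    return found
```

In the unbundled case the entries for B, C, and D can be reused by the next
package that depends on them; in the bundled case the same discovery is
repeated per consumer unless a tool does it for you.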

> Going back to my overlay use case, will go-modules download all modules to
> distfiles directory? The naming convention will assure that there will be
> no modules repetition?
>
> What about eclean-dist, will it work as expected for those modules
> dependencies?
>
> I think some of these answers would be worth mentioning in the documentation.
>
> Sorry for anything I wrongly stated and thank you very much for your help,
>
> Samuel
>
I've chosen this part to write my treatise on packaging, but rest assured
it's mostly intended as a response to mgorny and mjo; not specifically in
response to you.

The very long answer is that Gentoo was designed around a paradigm of
programs written primarily in C. In C programs you have the ability to link
to libraries which offer APIs and in the ideal case, each API is offered
via a unique SONAME[0]. Upstream packages were written and built in this
way (with dynamic linking). So in the case of package A, that uses
libraries B, C, and D; the result in many distributions is 4 packages
(A,B,C,D) and users who want A will get B, C, and D installed. This in fact
was a major selling point of package managers at the time because finding
these dependencies by hand and building and merging them all was painful.

Many applications break this trend; I don't think golang or nodejs are
particularly new (python and ruby have had (pip, venv) and rubygems[1] for
years, for example, which are similar bundling paradigms.) The struggle as
packagers and distribution managers is when upstream decides "my software
should be installed via a bundling solution (golang, node, pip, rubygems,
and so on)" we are left to decide whether to map this to the ebuild
paradigm (no bundling of dependencies) or omit ebuilds entirely. In the
former case we are often left working at odds with upstream (who are
confused by our decomposition of their application) and in the latter case,
users often use the bundle anyway (e.g. they install the packages by hand
or use the ruby gems or whatever.) I assert this is somewhat of a false
choice. Bundling isn't all bad and we can learn from past mistakes[2] to
try to avoid problems.

Another challenge with bundling is that often bundling systems (bundler,
pip, venv, golang, etc.) specify specific versions, commits, or tags. This
is fine when bundling (because each bundle has its own version of a
dependency in the bundle) but when you are trying to share a system wide
package between N packages, you either need to SLOT the dependencies or
have a looser dependency specification. The fine-grained nature of the
upstream dependency specification can make this challenging[3].
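
A small Python sketch of the pinning problem, assuming each consumer pins an
exact version of each dependency. The package names, versions, and helper
functions are hypothetical. Any dependency that is pinned to more than one
version would need SLOTs (or a loosened specification) to be shared system
wide.

```python
from collections import defaultdict

# Hypothetical exact pins: consumer -> {dependency: version it was built against}.
PINS = {
    "app-one": {"libfoo": "1.2.0", "libbar": "0.9.1"},
    "app-two": {"libfoo": "1.4.3", "libbar": "0.9.1"},
    "app-three": {"libfoo": "1.2.0"},
}


def versions_needed(pins):
    """Collect every distinct version requested for each dependency."""
    needed = defaultdict(set)
    for deps in pins.values():
        for dep, version in deps.items():
            needed[dep].add(version)
    return dict(needed)


def needs_slotting(pins):
    """Dependencies that cannot be satisfied by one shared version."""
    return sorted(
        dep for dep, versions in versions_needed(pins).items() if len(versions) > 1
    )
```

With the sample data, libbar can be shared as a single package while libfoo
would need two SLOTs, or consumers willing to accept a looser version range.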

Unbundling then made it easier for system operators to operate a system;
and you see this often in the security space. A security notice will come
out saying "foo-X-Y-Z is vulnerable, move to foo-Y.1." So operators want to
know "do I have foo-x-y-z installed?" When every package is in the package
manager this is a trivial question. When software is bundled inside of a
package, this visibility is lost. I haven't seen any tooling for Gentoo that
addresses this problem.
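
As a rough sketch of what such tooling might look like, assume each installed
package recorded its bundled modules as "module version" lines in the style of
go.sum. The package names, module paths, and the idea that this data is
available per installed package are all assumptions for illustration; nothing
here is an existing Gentoo tool.

```python
# Hypothetical per-package manifests of bundled modules (go.sum-style lines).
BUNDLED = {
    "app-misc/alpha-1.0": """
        github.com/example/foo v1.2.3
        github.com/example/foo v1.2.3/go.mod
        github.com/example/bar v0.4.0
    """,
    "app-misc/beta-2.1": """
        github.com/example/foo v1.3.0
    """,
}

GO_MOD_SUFFIX = "/go.mod"


def parse_manifest(text):
    """Return the set of (module, version) pairs listed in one manifest."""
    modules = set()
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        module, version = fields[0], fields[1]
        # go.sum carries a second "<version>/go.mod" line per module; fold it in.
        if version.endswith(GO_MOD_SUFFIX):
            version = version[: -len(GO_MOD_SUFFIX)]
        modules.add((module, version))
    return modules


def packages_bundling(module, version, bundled):
    """Answer "do I have module@version installed?" across bundled packages."""
    return sorted(
        pkg for pkg, text in bundled.items()
        if (module, version) in parse_manifest(text)
    )
```

For example, `packages_bundling("github.com/example/foo", "v1.2.3", BUNDLED)`
reports only the alpha package, which is the visibility an operator gets for
free when every dependency is its own package.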

In addition to the above, bundling can present exciting resource challenges
for some deployments. Imagine a common dep (CommonFoo-x-y-z) has a security
problem, so we must upgrade to CommonFoo-y-z. In the scenario where
CommonFoo is a dynamically linked package we can recompile it once[4] and
new consumers will just use the new dynamic shared object. In a bundling
scenario, we will be forced to rebuild[5] all consumers. This can take a
lot of time and resources depending on the deployment. Is the deployment
using a build farm? A binary packages host? How many disparate platforms
are in use?
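
A back-of-the-envelope Python sketch of that cost, under the simplifying
assumption that a dynamically linked fix rebuilds only the library while a
bundled fix rebuilds every consumer, once per platform. The consumer and
platform names are hypothetical.

```python
# Hypothetical deployment data: who uses the dependency, and on which platforms.
CONSUMERS = {"CommonFoo": ["app-a", "app-b", "app-c"]}
PLATFORMS = ["amd64", "arm64"]


def builds_after_fix(dep, consumers, platforms, bundled):
    """List the (package, platform) builds needed after a fix lands in dep."""
    if bundled:
        # Every consumer carries its own copy, so each one must be rebuilt.
        targets = sorted(consumers.get(dep, []))
    else:
        # Shared object: rebuild the library once per platform.
        targets = [dep]
    return [(pkg, platform) for pkg in targets for platform in platforms]
```

With three consumers and two platforms this is 2 builds in the shared case and
6 in the bundled case; the gap grows with the number of consumers.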

Which is to say many people in Gentoo dislike bundling for various reasons;
many of them legitimate. I wish to present a narrative where bundling is an
engineering trade-off, rather than a decision that is settled engineering
law. This doesn't mean Gentoo needs to support all the bundling (clearly
most people don't want to) but not supporting it means that many packages
will not be in Gentoo at all (because unbundling is too costly) and so you
end up at this exciting discussion which happens every couple of years.

-A

[0] I understand this is not always true in practice, but let's assume
spherical cows momentarily.
[1] Gentoo has a rubygems-fakegem eclass that makes it pretty streamlined
to make an ebuild for a particular gem, but of course if my application
depends on 20 gems I still need to make 20 ebuilds in this scheme and merge
them all. Rubygems-fakegem is still pretty good though!
[2] On Windows, https://en.wikipedia.org/wiki/DLL_Hell was common. In the
.NET ecosystem, assemblies addressed some of these problems.
[3] Similar to DLL hell but more generically:
https://en.wikipedia.org/wiki/Dependency_hell
[4] Practice, of course, leads to all kinds of weird edge cases where
upgrading your shared lib causes dependencies to break for various reasons;
which is one reason why application authors like to bundle; because their
application ends up being perceived as more reliable and less finicky.
[5] The number of package rebuilds in Gentoo is a fairly common complaint,
from my personal observation. Obviously binary packages make this problem
worse (not better.) I dunno if it's something the community should put more
effort into or not though; my expectation is that rebuilds are common and
making them more common is not a strategic problem; but I'm also not
compiling on some single core atom, so what do I know, eh? :)
