On 30/05/2023 18.52, Florian Schmaus wrote:
> 
> I am thankful that the council considered my request to vote on the
> topic. However, the council decided not to vote on this in its last
> session and to return the issue to the mailing lists.
> 
> Some see the requirement of some limitations as necessity it comes to
> reinstating EGO_SUM. Unfortunately, I could not see specific numbers
> mentioned since June 2022 in the three EGO_SUM threads [1, 2, 3] I am
> aware of.
> 
> To prevent harm from Gentoo, we should reach an agreement that everyone
> can live with. To achieve a consensus, and since I can not rule out that
> I missed a post that includes specific numbers, please share your ideas
> on how EGO_SUM could be reinstated in ::gentoo by replying to this mail.

I still want to ask: why should it be enabled in ::gentoo? I'm trying to
understand the reason. For overlays I agree that it should be allowed,
but I don't see any benefit to its existence in ::gentoo. My reason for
that difference: Gentoo devs have access to ~devspace, where they can
host vendor tarballs.

Currently the best solution *per package* is to speak with upstream and
ask them to add a CI workflow which creates a source tarball that
includes the `vendor` dir. This is the best way, and I'm doing that with
multiple upstreams of some random Go packages in ::gentoo. But I know
the disadvantage: you have to speak with upstream, explain why, and get
it added to their release process. This is the best long-run solution,
but it takes more effort.
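
As a rough illustration of what such a release step could do, here is a
minimal Python sketch. It assumes the Go toolchain is installed so that
`go mod vendor` can populate the `vendor` dir. The function names and
the `-vendor.tar.xz` archive naming are placeholders I made up, not an
existing convention of any upstream.

```python
import subprocess
import tarfile
from pathlib import Path


def run_go_mod_vendor(project_dir: Path) -> None:
    """Populate project_dir/vendor using the Go toolchain (must be installed)."""
    subprocess.run(["go", "mod", "vendor"], cwd=project_dir, check=True)


def make_vendor_tarball(project_dir: Path, name: str, version: str) -> Path:
    """Pack project_dir (including vendor/) into <name>-<version>-vendor.tar.xz.

    The archive name and layout are hypothetical; adapt them to whatever
    upstream already uses for its release artifacts.
    """
    if not (project_dir / "vendor").is_dir():
        raise FileNotFoundError("no vendor/ directory, run `go mod vendor` first")
    # Write the archive next to the project so it does not include itself.
    out = project_dir.parent / f"{name}-{version}-vendor.tar.xz"
    with tarfile.open(out, "w:xz") as tar:
        # Store everything under a top-level "<name>-<version>/" directory.
        tar.add(project_dir, arcname=f"{name}-{version}")
    return out
```

A release job would call `run_go_mod_vendor()` on a clean checkout of
the tag and then upload the file returned by `make_vendor_tarball()`.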

> Having EGO_SUM would significantly increase the security of Gentoo's
> users (amongst other benefits).

While technically correct, we return to the same issue of "confidence"
in the dev (a dev can add malicious code into an ebuild). Yes, hiding
malicious code inside a vendor tarball is easier, and robbat2
demonstrated that it works.

How can we solve it? One weird idea I have is to build the vendor
tarball out of multiple tarballs (one per bundled package) and include a
hash for each of them inside the vendor tarball. I think you can compare
the hashes stored in the `go.sum` file in the source code with the ones
from the tarball (this claim needs verification). As a result, I think
we can verify it offline.
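
To make the idea a bit more concrete, here is a minimal Python sketch of
such an offline comparison. The `h1_hash()` function follows my
understanding of Go's `dirhash.Hash1` (the "h1:" scheme used in
`go.sum`), so it should be checked against the Go implementation before
relying on it. The "bundled" mapping, which stands for the hashes
recorded inside the vendor tarball, is a hypothetical format.

```python
import base64
import hashlib


def h1_hash(files: dict) -> str:
    """Compute a Go-style "h1:" hash over a {path: content_bytes} mapping.

    Assumption: this mirrors dirhash.Hash1 from golang.org/x/mod, i.e. a
    SHA-256 over sorted lines of "<sha256 hex of file>  <path>\n", then
    base64. Paths are expected as "module@version/relative/path".
    """
    summary = hashlib.sha256()
    for path in sorted(files):
        digest = hashlib.sha256(files[path]).hexdigest()
        summary.update(f"{digest}  {path}\n".encode())
    return "h1:" + base64.b64encode(summary.digest()).decode()


def parse_go_sum(text: str) -> dict:
    """Map (module, version) -> hash, skipping the ".../go.mod" entries."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3 or fields[1].endswith("/go.mod"):
            continue
        module, version, digest = fields
        result[(module, version)] = digest
    return result


def find_mismatches(go_sum: dict, bundled: dict) -> list:
    """Return bundled modules whose hash is missing from or differs from go.sum."""
    return sorted(key for key, digest in bundled.items() if go_sum.get(key) != digest)
```

An empty list from `find_mismatches()` would mean every bundled module
matches what upstream's `go.sum` expects, without any network access.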

> Personally, I do not see that we currently need any form of limitation
> to reinstate EGO_SUM. I substantiated this with data based on a two-year
> history analysis of gentoo.git. The summary is that the
> - size increase of ::gentoo is unproblematic for users
> - additional sync delta of ::gentoo is unproblematic for users
> - higher rate of gentoo.git's increase is unproblematic for developers
> when we reinstate EGO_SUM in ::gentoo.

Why "unproblematic"? Where I live I have quite a high RTT, meaning each
download takes a long time to start before it fetches at a good speed.
Fetching a lot of small files is really bad for me (even from a mirror
in the same country, sigh). Big deltas also hit the git packs hard and
put a higher load on a lot of places.
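
A back-of-the-envelope model shows why the number of files matters more
than the total size on a high-RTT link. This is only a sketch: it
assumes sequential downloads over fresh connections and a fixed number
of round trips before data starts flowing, and all numbers below are
made up for illustration, not measurements.

```python
def fetch_time(n_files, total_bytes, rtt_s, bandwidth_bps, round_trips_per_file=3):
    """Estimate seconds needed to fetch n_files one after another.

    round_trips_per_file is an assumed per-download setup cost (connection
    setup plus request); real clients may reuse connections or fetch in
    parallel, which lowers it.
    """
    setup = n_files * round_trips_per_file * rtt_s
    transfer = total_bytes * 8 / bandwidth_bps
    return setup + transfer


# Illustrative numbers: 50 MB in total, 300 ms RTT, 50 Mbit/s link.
many_small = fetch_time(300, 50_000_000, 0.3, 50_000_000)
one_big = fetch_time(1, 50_000_000, 0.3, 50_000_000)
```

With these assumed numbers the 300 small files take roughly 278 seconds,
while a single tarball of the same total size takes roughly 9 seconds.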

Thinking about the infra side, I remember stories of the issues when
go.pkg was doing a full `git clone` (not a shallow copy) of the whole
gentoo.git repository. Now imagine we allow the huge and frequent deltas
of Go modules to go in; imagine how fast we get to a huge full
repository. Yes, we now blacklist this stupid failure of go.pkg, but it
might happen with another service. Full git clones aren't that rare.

Also note that Go packages tend to update frequently (because of all the
bundling and security issues). The fact that you don't see a lot of
updates in ::gentoo is because many of them are maintained by less
active developers (not to offend anyone, it is fine to skip bumps where
that makes sense, and it is not my place to criticize!).

Also please remember the issue of scale. Look at the number of packages
under dev-python. There are a lot of tools written in Go as well.

> Therefore, we could (and IMHO should) simply un-deprecate EGO_SUM.
> However, I would review this decision once the number of Go packages has
> doubled or in two years (whatever comes first).
> 
> Many share the concerns of an EGO_SUM-less world. I know that some seek
> a compromise by reinstating EGO_SUM with some limitations. The ::gentoo
> repository is able to handle packages (at least) up to the range of 2 to
> 1.5 MiB total package-directory size. Therefore I propose a limit in
> that range.

My solution is as follows:

1. Undeprecate EGO_SUM in the eclass.
2. Forbid its usage in ::gentoo (enforced by pkgcheck at error level, so
it will fail CI and we can see the misuse). Overlays are allowed to use
it.
3. The maintainer starts talks with upstream to add a release workflow
that creates a vendored source tarball, in the hope that it succeeds.
"Start early, profit in the future". I see this flow as similar to
"always try to upstream patches".
4. Until upstream adds it, use vendor tarballs in ::gentoo.
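
The rule in step 2 could look roughly like the following Python sketch.
It is a standalone illustration and does not use the pkgcheck API; a
real check would be written as a proper pkgcheck check. The function
names and the error message are hypothetical.

```python
import re

# Matches an assignment such as `EGO_SUM=(` or `EGO_SUM+=(` at the start
# of a line; commented-out lines do not match because of the leading `#`.
EGO_SUM_RE = re.compile(r"^\s*EGO_SUM\+?=", re.MULTILINE)


def uses_ego_sum(ebuild_text: str) -> bool:
    """Return True if the ebuild text assigns EGO_SUM."""
    return EGO_SUM_RE.search(ebuild_text) is not None


def check_ebuild(repo_name: str, ebuild_text: str) -> list:
    """Return error messages; only ::gentoo forbids EGO_SUM, overlays pass."""
    if repo_name == "gentoo" and uses_ego_sum(ebuild_text):
        return ["EGO_SUM is not allowed in ::gentoo, use a vendor tarball"]
    return []
```

Any non-empty result for an ebuild in ::gentoo would fail CI, while the
same ebuild in an overlay produces no error.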

I also think many devs agree with this solution, but I can't speak for
them, so I'd be happy if devs could at least reply briefly with their
agreement or disagreement.

> - Flow
> 
> 
> 1: https://www.mail-archive.com/gentoo-dev@lists.gentoo.org/msg95175.html
> 2: https://www.mail-archive.com/gentoo-dev@lists.gentoo.org/msg95279.html
> 3: https://www.mail-archive.com/gentoo-dev@lists.gentoo.org/msg97310.html

I must say this conversation around EGO_SUM makes me a little sad
because of the long time it takes, and too often it feels like it
derails in bad directions (I mean less helpful ones). I think we should
go the way Flow did: suggest concrete action items (something easier for
the Council / all devs to vote on).

Also, sorry that this mail jumps around a little. It is quite hard for
me to write long mails in English, so if some paragraphs are less
coherent, I'll be happy to explain them more :)

-- 
Arthur Zamarin
arthur...@gentoo.org
Gentoo Linux developer (Python, pkgcore stack, Arch Teams, GURU)
