Re: [gentoo-dev] Trustless Infrastructure

Rich Freeman Mon, 02 Jul 2018 09:06:52 -0700

Overall I very much like the concept, but I might propose a few tweaks
(just quoting the stuff that might benefit from adjustment):

On Mon, Jul 2, 2018 at 11:36 AM Jason A. Donenfeld <zx...@gentoo.org> wrote:
>
> - Sign every file in the portage tree so that it has a corresponding
> .asc. Repoman will need support for this.
> - Ensure the naming scheme of portage files is sufficiently strict, so
> that renaming or re-parenting signed files doesn't result in RCE. [*]
> - Distribute said .asc files with rsync per usual.

This has two issues:

1.  It requires changes to the repos/infra/etc to work, which makes it
painful to pilot and grow organically.
2.  It is 99% redundant with the git signatures we already have.

Why not build this off of git signatures?  This could be done directly
by syncing via git, or by having a tool that extracts the git
signatures and stores the metadata in the repo (ideally done by infra
before mirroring, but it could be done after the fact as well).  Git
is just content-hashed, so you should be able to verify the git
content hashes against a repo synced outside of git, as long as the
files aren't modified (obviously this means accounting for stuff like
metadata that infra adds after the fact).
You still need a solution for metadata in your original proposal
anyway.
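
To illustrate the content-hash check, here is a minimal sketch in
Python.  It relies on the fact that a git blob ID is the SHA-1 of the
header "blob <size>\0" followed by the file contents.  The function
names are made up for this example, and where the expected hashes come
from (for instance the output of `git ls-tree -r` on a commit whose
signature was verified) is an assumption, not something this code
handles.

```python
import hashlib
from pathlib import Path


def git_blob_sha1(data: bytes) -> str:
    """Return the ID git would assign to a blob with these contents."""
    # Git hashes a small header plus the raw bytes, not the bytes alone.
    header = b"blob %d\x00" % len(data)
    return hashlib.sha1(header + data).hexdigest()


def file_matches_blob(path: str, expected_sha1: str) -> bool:
    """Check a file synced outside of git (e.g. via rsync) against a blob ID.

    expected_sha1 is assumed to come from a tree that was already
    verified through a signed commit.
    """
    return git_blob_sha1(Path(path).read_bytes()) == expected_sha1
```

Any file that infra modifies or generates after the commit would fail
this check, which is why the added metadata needs separate handling.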

The only downside I see to git signatures is how far back to go to
check history.  With the .asc solution you'd remove the signatures
when you remove the files they pertain to.  With git there is no
trivial way to know when to stop going back with the signature
verification since every signature applies to a mix of both current
and subsequently-removed files, with the percentage of each slowly
shifting as you go back further in time.  That said, if you just track
the last known-good sync you could just check the subsequent
signatures, which would be very efficient (probably more efficient
than checking all of rsync unless you sync very infrequently).
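
A rough sketch of the "check only what came after the last known-good
sync" idea, in Python.  It shells out to `git rev-list` and
`git verify-commit`, which are real git commands; the function names,
and the assumption that the caller has stored the last known-good
commit somewhere, are mine.  Note that `git verify-commit` only checks
against whatever keys are in the local keyring, so deciding which keys
to trust is a separate problem.

```python
import subprocess
from typing import Callable, List


def run_git(repo: str, *args: str) -> subprocess.CompletedProcess:
    """Run a git command inside repo and capture its output."""
    return subprocess.run(
        ["git", "-C", repo, *args], capture_output=True, text=True
    )


def unverified_commits(
    repo: str, last_good: str, run: Callable = run_git
) -> List[str]:
    """List commits after last_good whose signature does not verify."""
    # Only commits reachable from HEAD but not from last_good are examined.
    listing = run(repo, "rev-list", f"{last_good}..HEAD")
    if listing.returncode != 0:
        raise RuntimeError("could not list commits since last known-good sync")
    failed = []
    for sha in listing.stdout.split():
        # verify-commit exits non-zero for a missing or bad signature.
        if run(repo, "verify-commit", sha).returncode != 0:
            failed.append(sha)
    return failed
```

An empty result would mean every commit since the last known-good sync
carried a signature that verified, at which point the stored commit
could be advanced to the new HEAD.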

> - Never rsync into the /usr/portage directory, but rather into an
> unused shadow directory, and only copy files from the shadow directory
> into /usr/portage after verification succeeds. (The fact that those
> files are visible to portage prior to verification and following a
> failed verification is a shameful oversight of the current system.)

I certainly agree that /usr/portage being usable if it fails
verification is a major weakness right now.

Other alternatives to your proposal include:

1.  Store state somewhere that portage checks.  It is invalidated
before starting a sync, and set back to "secure" after verification.
2.  Store a last-known-good hash if using git signature checking.
Portage would check the current tree state against this in all
operations.
3.  Have portage check signatures on all files it accesses at the time
of access.  This would make portage safe to use even in a compromised
tree.
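
Option 1 could look something like the following Python sketch.  The
state file name, the "invalid"/"secure" strings, and the `sync` and
`verify` callables are all hypothetical placeholders; this is not how
portage works today, just an illustration of the invalidate-then-set
ordering.

```python
import os
from pathlib import Path
from typing import Callable

STATE_FILE = "sync-state"  # hypothetical file name


def write_state(tree: Path, state: str) -> None:
    """Atomically record the tree state so readers never see a partial write."""
    tmp = tree / (STATE_FILE + ".tmp")
    tmp.write_text(state + "\n")
    os.replace(tmp, tree / STATE_FILE)


def tree_is_trusted(tree: Path) -> bool:
    """What portage would check before using the tree."""
    try:
        return (tree / STATE_FILE).read_text().strip() == "secure"
    except FileNotFoundError:
        return False  # never verified, so not trusted


def guarded_sync(
    tree: Path,
    sync: Callable[[Path], None],
    verify: Callable[[Path], bool],
) -> bool:
    """Invalidate the state, sync, and mark secure only if verification passes."""
    write_state(tree, "invalid")  # invalidated before the sync starts
    sync(tree)                    # if this raises, the state stays invalid
    if verify(tree):
        write_state(tree, "secure")
        return True
    return False
```

The point of the ordering is that an interrupted or failed sync leaves
the tree marked invalid, so nothing is copied and portage simply
refuses to use it.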

Options 1 and 2 especially are going to be more efficient than copying
files at the filesystem level from a scratch location.  Also, all
three options would be compatible with git syncing, while trying to
copy a git repo after the sync would probably be messier (though still
possible).

But, I have no objection to your original proposal either - I'd prefer
it to what we have today at least for rsync.

In general I do advocate giving serious consideration to the benefits
of syncing via git.  If you sync frequently (which most Gentoo users
probably do, and which we generally advocate), then it tends to be a
lot more efficient than rsync.  It naturally tracks changes over time
as well, so it fits in very well with merging untrusted changes into a
known-good tree, as only the changes need to be verified.

The main downside to git signature checking is sha1.  It baffles me
that nobody has bothered to fix this, especially since I'd think it
would be pretty simple to do.  Just designate new tree/blob/parent
record types that use the new hash - like tree256/blob256/parent256.
Git would use the appropriate hash when following references, so you
could have continuity in a repository with older sha1 commits and
newer sha256 ones.  Obviously the newer repos would be incompatible
with older versions of git, but anything like this would be phased in,
and updating git isn't particularly painful.  Projects that care about
the security could consider rebasing the entire thing, but that would
of course discard history.  Presumably you could even do a merge where
one branch of the merge is the original untouched sha1 commits, and
the other branch is the rebased sha256 commits, and the merge ties
them together into a forward-going sha256 history.

-- 
Rich
