bug#75552: Non-committers can't keep authenticated forks updated

Liliana Marie Prikler Thu, 16 Jan 2025 06:35:27 -0800

On Thursday, 16.01.2025 at 13:10 +0000, 45mg wrote:
> As the 'Authenticate Your Git Checkouts'
> blog post [9] pointed out, we wouldn't need `guix git authenticate`
> if we were willing to delegate our security to a trusted third party,
> like all the open source projects that sport those "fancy “✅
> verified” badges as found on GitLab and on GitHub" do. We shouldn't
> force anyone hosting a fork to do so as well.
I mean, you can host your own fork and use the fancy “✅ verified” badge
of your host as a source of trust – it just won't be checked by `guix
pull', that's all.  If you do do that, I'd recommend using a file://
URI with a local checkout for your channel, so that you can verify that
little check mark on your own (then you only need to trust your own
file system).


> > 
> > I think you're making this more complicated than it needs to be. 
> > checkout, authenticate, rebase*, merge* ought to have you covered.
> > 
> > * you can authenticate after these if you're paranoid 
> 
> The complexity is due to the requirements of not bumping the channel
> introduction (to avoid the increased attack surface from having to
> keep obtaining the updated one, as I discussed earlier), keeping fork
> history intact (to avoid force pulls), keeping upstream history
> intact, and being able to authenticate both upstream and fork
> commits. If you can think of a simpler method that meets these
> requirements, I'd love to hear it.
Guix committers are more than happy to use work trees and rebases,
which simplify this a lot – again, it's as simple as pull,
authenticate, rebase.

W.r.t. keeping history intact, we had the following exchange on IRC
yesterday.

<Rutherther>    lfam: that's interesting that there is really a merge
commit, for example if I remember right, the core updates merge few
months ago happened by directly appending the commits instead of a
merge commit
<lfam>  Yes, there are two ways to do it (rebase and merge) and it's a
matter of taste
<lfam>  Of course there is a correct choice, as with all questions of
taste ;)
<Rutherther>    I personally prefer a merge commit, since it has two
parents, you can track where the previous master pointed to
<lfam>  And I prefer a rebase. But ultimately it's up to whoever is in
front of the keyboard
<lilyp> FWIW, a rebase is cleaner, but requires that only one person
signs off commits on any given branch (or else you're signing commits
that someone else signed before and have to update the trailer… not
ideal)
<lfam>  It doesn't matter who signs the commits as long as they are
authorized. That's the security model we have

So yeah, even for (branch-)local work at least some committers prefer
rebasing.

> > No, it wouldn't.  You would rebase those changes on top of what you
> > already have on those respective branches.
> 
> It looks like at least one of us is misunderstanding rebasing. Could
> be me, so I'm consulting the relevant chapter from the Pro Git book
> [11] for a refresher.
> 
> Let's imagine that the first example given there represents our fork
> of Guix, where the 'experiment' branch marks the beginning of our
> fork (and its channel introduction) and the 'master' branch tracks
> upstream Guix.  After `git rebase master`, the commit that used to be
> C4 is gone, and now C4' takes its place. It may contain the same
> changes, but it's a different commit - so it (and any commits that
> it's the parent of) has a different hash. So the channel introduction
> has changed, and so has the entire history of the `experiment`
> branch. So we need to force-pull.
Yes, that's one variant – the one where you need to keep bumping your
channel introductions.  The other direction would be to rebase Guix
changes on top of your local branch.  This keeps your channel
introduction as-is.
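
To make the two directions concrete, here is a toy sketch in Python.  It
assumes a deliberately simplified hash (parent ids plus message only);
real Git also hashes the tree, author, committer and timestamps, and
all names below are made up for illustration.  The point it shows is
only that a commit's id depends on its parents, so whichever side gets
replayed is the side whose ids change.

```python
import hashlib

def commit_id(parents, message):
    """Toy commit id: depends only on the parent ids and the message."""
    data = "\n".join(list(parents) + [message]).encode()
    return hashlib.sha1(data).hexdigest()

def build_chain(base, messages):
    """Replay MESSAGES in order on top of BASE; return the new commit ids."""
    ids, parent = [], base
    for message in messages:
        parent = commit_id([parent] if parent else [], message)
        ids.append(parent)
    return ids

root = commit_id([], "common ancestor")

# The fork starts at ROOT; its first commit plays the role of the
# channel introduction.
fork = build_chain(root, ["fork introduction", "fork change"])
intro = fork[0]

# Meanwhile upstream has moved on.
upstream_messages = ["upstream 1", "upstream 2"]
upstream = build_chain(root, upstream_messages)

# Variant 1: rebase the fork on top of upstream.  The fork commits are
# replayed, so the introduction gets a new id.
fork_rebased = build_chain(upstream[-1], ["fork introduction", "fork change"])

# Variant 2: replay the upstream commits on top of the fork.  The fork
# history, including the introduction, is left untouched; the replayed
# upstream commits are the ones that get new ids.
upstream_on_fork = build_chain(fork[-1], upstream_messages)
combined = fork + upstream_on_fork

print("introduction changed in variant 1:", fork_rebased[0] != intro)
print("introduction kept in variant 2:   ", combined[0] == intro)
```

Note the cost of the second variant in this model: the replayed
upstream commits no longer carry the ids they have upstream.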

> > > 
> […]
> This led me to think of an attack that's possible with my design - if
> I want to screw with anyone `guix pull`ing from my fork, I can merge
> upstream into my fork branch, add a bunch of malicious commits, and
> then make the upstream branch ref point to the latest such commit.
> Now anyone pulling from my fork will receive the malicious commits as
> part of upstream's history - since no commit hashes needed to change,
> the pull is a regular fast-forward one, with no indication that
> anything is amiss. Authentication will succeed since the malicious
> merge commit has our fork as its (first) parent, and that parent has
> the primary introduction as its most recent ancestor.
> 
> The takeaway here is that anyone authorized via the primary
> introduction can fake new upstream commits.
Care to state how designating one introduction as "primary" adds to
security here?  

> So why bother with additional introductions at all, then? Because as
> far as I can tell, they are still the only solution mentioned so far
> that satisfies the requirements I mentioned earlier:
> > not bumping the channel introduction (to avoid the increased attack
> > surface from having to keep obtaining the updated one, as I
> > discussed earlier), keeping fork history intact (to avoid force
> > pulls), keeping upstream history intact, and being able to
> > authenticate both upstream and fork commits
> ...and yes, you do have to trust the fork maintainer to not
> deliberately mess those things up. But that's nothing new - even in
> the existing design, we have to trust everyone who can make trusted
> commits not to mess things up on purpose.
You are trading one attack surface for another.  Again, all is fine
while you only have to trust yourself, but weakening an invariant is
weakening an invariant (:

> So what does all of this mean for the statement of my design?
> Well, it means that we need to stop thinking in terms of 'which
> branch can be merged into which?' and more in terms of 'which merge
> commits can be authenticated?'. And the answer to that question, with
> my design, would be:
> 
> 1. Any merge commit signed with a key in the intersection of its
>    parents' .guix_authorizations. (Standard authorization invariant.)
> 
> 2. Any merge commit that doesn't meet the above conditions, but has a
>    parent whose most recent ancestor is the primary introduction, and
>    is signed by a key in the .guix_authorizations of that parent. (My
>    weakened authorization invariant.)
That's a pretty long way of saying "Any merge commit signed with a key
in one of its parents' .guix_authorizations."  It is (by your design)
trivial to have the "primary introduction" under your control.
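
A small Python sketch of why I read it that way.  This is a toy model
under stated assumptions, not Guix's actual code: authorizations are
plain sets of key names, the key names are invented, and "primary" is
just a flag that whoever runs the authentication gets to assign.

```python
from itertools import product

def standard_ok(signer, auths):
    """Rule 1: the signer is authorized in every parent (intersection)."""
    return all(signer in auth for auth in auths)

def weakened_ok(signer, auths, primary_flags):
    """Rule 2: the signer is authorized in some parent flagged as primary."""
    return any(flag and signer in auth
               for auth, flag in zip(auths, primary_flags))

def design_ok(signer, auths, primary_flags):
    """A merge commit is accepted if rule 1 or rule 2 holds."""
    return (standard_ok(signer, auths)
            or weakened_ok(signer, auths, primary_flags))

def accepts_under_some_labelling(signer, auths):
    """Can the merge be accepted for SOME choice of which parents are primary?"""
    return any(design_ok(signer, auths, flags)
               for flags in product([False, True], repeat=len(auths)))

def any_parent_ok(signer, auths):
    """The short version: the signer is authorized in one of the parents."""
    return any(signer in auth for auth in auths)

# Hypothetical merge: first parent is the fork, second is upstream.
auths = [{"mallory"}, {"alice", "bob"}]

print(standard_ok("mallory", auths))                 # rule 1 alone rejects
print(design_ok("mallory", auths, (True, False)))    # fork flagged primary
for signer in ("mallory", "alice", "eve"):
    print(signer,
          accepts_under_some_labelling(signer, auths),
          any_parent_ok(signer, auths))
```

As long as the person authenticating picks the primary introduction,
the last two columns always agree.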

> Finally, let me restate the conditions for authenticating merge
> commits, taking this into account:
> 
> --8<---------------cut here---------------start------------->8---
> For commits that have multiple parents - ie. merge commits - we
> weaken the invariant as follows:
> 
> 1. If all parents have the primary introduction as their most recent
>    ancestor, then the invariant holds as usual.
>    
> 2. If one or more parents has the primary introduction as its most
>    recent ancestor (call these the 'primary parents'), and the rest
>    have any of the additional introductions, then the merge commit is
>    authenticated if and only if it's signed by a key authorized in 
>    all of the primary parents.
>    
> 3. If all parents have the same additional introduction as their most
>    recent ancestor, then the invariant holds as usual.
>    
> 4. If none of the parents have the primary introduction as their most
>    recent ancestor, nor do they have the same additional
>    introduction, then the merge commit cannot be authenticated.
> --8<---------------cut here---------------end--------------->8---
> 
> I merged 2a. into 2., and removed 2b.
> 
> Now let me try to respond to you:
> 
> > Yeah, I think this scheme will still end up in [4].  As pointed out
> > in [8], "primary" is just a convention that we can't rely on.
> 
> Not really. As I discussed, [8] points out that /merge order/ is the
> convention that we can't rely on. Introductions can be deliberately
> specified as primary or additional, whether via command-line flags or
> separate sections in .git/config.
> 
> > So let's just talk about the idea of widening one channel
> > introduction to any number of channel introductions – we can always
> > store a mapping of HEAD → first authenticated commit and then
> > assert that this set is a subset of what we declare as
> > introductions.  (This mapping will also make authentication as
> > efficient as it currently is, since we don't need to reauthenticate
> > everything all the time.)
> 
> I'm not sure what you mean. What do you mean by "mapping of HEAD →
> first authenticated commit"? Does this perhaps mean 'all commits
> between the latest one and the first authenticated commit'?
Little refresher: Guix stores a list of already authenticated commits
so as to not redo this work all over again.  If we were to allow
multiple introductions, we would also need to find the first
authenticated commit among them to match against the channel
introductions.
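
Roughly what I have in mind, as a Python sketch.  Everything here is
an assumption for illustration: the commit graph is a plain dict,
signature checks are left out entirely, and the cache layout is
invented rather than taken from Guix.

```python
def first_authenticated(head, parents, introductions):
    """Walk back from HEAD and return the introductions that were reached.

    Raise ValueError if some path reaches a root commit without passing
    through a declared introduction.
    """
    reached, seen, stack = set(), set(), [head]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        if commit in introductions:
            reached.add(commit)   # history before an introduction is not examined
            continue
        if not parents.get(commit):
            raise ValueError(f"{commit} does not descend from any introduction")
        stack.extend(parents[commit])
    return reached

# Toy history: upstream is u0 <- I <- u1 <- u2, the fork branches off
# at u1 with its own introduction J, and m merges the two.
parents = {
    "u0": [], "I": ["u0"], "u1": ["I"], "u2": ["u1"],
    "J": ["u1"], "f1": ["J"], "m": ["f1", "u2"],
}

# Remember, per HEAD, which introductions authentication started from.
cache = {"m": first_authenticated("m", parents, {"I", "J"})}

# Later runs only need a subset check against what is declared now.
print(cache["m"] <= {"I", "J"})   # both introductions still declared
print(cache["m"] <= {"I"})        # J no longer declared: redo the work
```

With a single introduction the cached set has one element and the
check is what we have today; with several you have to carry the whole
set around.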

> What does "assert that this set is a subset of what we declare as
> introductions" mean?
Let's say that you work on branches B, C, and D with "primary"
introductions I, J, and K.  If you want to merge C into B, you need to
remember that B has I as its primary introduction, C has J, and so on.

> > Is this good enough?  No: an attacker could easily add their own
> > introduction and call it a day.  In fact, this scheme is even worse
> > than what was exploited in [4], because they never need commit
> > access to the Guix repo to do so.  Ahh, but wait!  `guix pull` on
> > the user's side uses their clean set of channels for
> > authentication.  Those only have upstream Guix… unless you actually
> > pull your own fork or manage an attack as outlined below (in which
> > case you do need commit access for some amount of time).
> 
> I should point out - my design does not require us to distribute any
> introductions besides Guix's existing one, nor will it provide any
> mechanism to automatically 'install' someone else's introduction.
Yes it will, per `%default-guix-channel'.

> An introduction is a tuple of (introductory commit, key that signs
> it) that you specify as arguments to `guix git authenticate`. An
> attacker would have to convince the entire Guix community to specify
> their (the attacker's) own introduction on the command line (or
> directly add it into .git/config). And given that there is no reason
> to ever do so unless you're using someone's fork... that's a hard
> sell.
Well, another hard sell would be introducing a feature to `guix git
authenticate` that must not ever be used in Guix itself.  Now, since
you are already soft-forking Guix, you can obviously add this to your
own guix command, but do beware the dragons you're summoning with it.

Cheers
