Re: Security review of tag2upload

Antoine Beaupré Wed, 12 Jun 2024 10:13:00 -0700

On 2024-06-11 18:39:04, Russ Allbery wrote:
> Hi all,
>
> Below is the security review that I did of the tag2upload design.


Hi Russ, and thank you so much for taking the time to do this excellent
work. It's really comforting to think that we have an actual
professional look at our stuff, and I think we should do this more
often and systematically. :)

> I am not a neutral party, in the sense that I think tag2upload is a good
> idea and should be deployed.  However, I do these types of security
> reviews professionally, and I tried to approach this review the same way
> that I would approach a major work project that needed a security review
> to ensure we weren't deploying something with security issues.  I
> encourage any Debian community member with security expertise to check my
> work; with security reviews, the more eyes, the better.

For context, I am *not* a security professional (anymore?)... I don't
have formal security training, but I did work for Debian LTS for a while
and got involved in security audits and uploads in the past.

So, I guess I offer you my poorly adjusted eyeballs...

I guess I'm not a neutral party either, because I have Opinions, but
then no one is neutral here: we'd need someone completely disinterested
in Debian to provide us with an opinion if we wanted that, and even
*that* wouldn't be quite neutral. ;)

> I will also post this review on my web site, probably later tonight if I
> have time.

I didn't find that post, btw... No big deal of course!

Anyways, here are my comments!

[...]

> ## Threat model
>
> I evaluated both the existing source package upload architecture and the
> tag2upload architecture against the following threats:
>
> - Someone not in the keyring uploads a malicious source package, possibly
>   via a sponsor.
>   
> - Someone in the keyring (either a Debian Developer or a Debian Maintainer
>   for a package) uploads a malicious source package but makes it appear
>   that the package was uploaded by someone else in the keyring.
>
> - An attacker compromises the system a Debian uploader uses to build
>   source packages and uses that access to inject malicious code into a
>   source package.
>
> - Someone with administrative access to the archive processing machinery
>   (DAK, the archive signing key, or similar infrastructure) uploads a
>   malicious source package.
>   
> - Someone with administrative access to the tag2upload server or its
>   signing key uploads a malicious source package.
>   
> - Someone with administrative access to Salsa uploads a malicious source
>   package.
>
> In each case, I looked at prevention, detection, and tracing.
>
> Neither the existing upload mechanism nor tag2upload attempt to prevent or
> detect (as opposed to trace) the upload of a malicious source package by
> someone in full possession of a key in the keyring, so this threat is not
> considered in this document, although tracing for this threat is
> discussed briefly.

I'm actually curious as to why that is treated as a separate
possibility, because it kind of overlaps with the second threat
("someone uploads a malicious package appearing to come from someone
else")...

For me, that case and the "xz-utils" case are actually quite pressing
matters. They don't quite keep me up at night, but they're the kind of
threats I do worry about and that we should address head on. That said,
*maybe* this is not the right vector to address them...

> ## Brief architecture summary

[...]

> ### tag2upload
>
> tag2upload replaces the first step of this upload process with the
> following:
>
> 1. The uploader pushes a signed tag in a specific format to Salsa. For
>    non-native packages, this may reference an upstream tree in the same
>    Git repository by commit ID, which will be used to create the `orig`
>    tar file if needed.
>
> 2. Salsa notifies a web hook on a secure project-maintained system that a
>    new tag of interest has been pushed.
>
> 3. That system (with internal privilege separation) retrieves the Git tag
>    and corresponding commit, verifies the signature and tag metadata, and
>    verifies that the signer is in the relevant keyring.
>
> 4. Inside a VM or schroot, that system retrieves the Git tree and upstream
>    source tree if applicable, constructs or retrieves the `orig` tar file,
>    and constructs the Debian source package and source package control
>    file.  This VM or schroot in essence operates as a source package
>    buildd.
>
> 5. The tag2upload server adds control header fields specifying the Git
>    object ID and the identity string and fingerprint of the uploader,
>    signs the resulting source package control file, constructs an upload
>    changes control file, signs it, and creates and signs another Git tag
>    reflecting any additional Git commits that were required to put the
>    repository into a canonical format (the "dgit view").
>
> 6. The tag2upload server pushes the original Git tag, its referenced tree,
>    and the additional "dgit view" tag to the publicly-accessible
>    dgit-repos Git server as a permanent archive.
>
> 7. The tag2upload server uploads the signed source package to the normal
>    archive incoming queue.
>
> Subsequent processing of the upload happens identically to the existing
> upload system.

Nice. I didn't realize Salsa was involved in tag2upload. It answers part
of my question in the other thread ("can I keep using Salsa?"), so
that's actually quite nice!

Thanks for that neat overview!
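
To check my reading of step 3 (signature and keyring checks happen
before anything untrusted gets built), here is a minimal Python sketch.
Everything in it is hypothetical: the `TagEvent` fields, the
`accept_tag` function and the fingerprints are made up for illustration
and are not tag2upload's actual code or data model. The OpenPGP
verification itself is assumed to happen elsewhere and is represented
here by a boolean.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class TagEvent:
    """Hypothetical summary of a pushed tag after OpenPGP processing."""
    tag_name: str
    signer_fingerprint: str
    signature_valid: bool  # outcome of a signature check done elsewhere


def accept_tag(event: TagEvent, keyring: frozenset[str]) -> tuple[bool, str]:
    """Return (accepted, reason); reject before any source tree is touched."""
    if not event.signature_valid:
        return False, "bad signature"
    if event.signer_fingerprint not in keyring:
        return False, "signer not in keyring"
    return True, "ok"


# Made-up fingerprints, for illustration only.
keyring = frozenset({"AAAA1111"})
good = TagEvent("debian/1.0-1", "AAAA1111", True)
stranger = TagEvent("debian/1.0-1", "BBBB2222", True)
forged = TagEvent("debian/1.0-1", "AAAA1111", False)

for event in (good, stranger, forged):
    print(event.signer_fingerprint, accept_tag(event, keyring))
```

Only the first event would move on to the source package construction
step; the other two are dropped before any tree is fetched.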

> ## Analysis

[...]

> #### tag2upload server
>
> The new tag2upload server architecture introduces a new type of build
> sandboxing that is similar but not identical to buildds (source package
> construction requires sufficient network access to Salsa, for example,
> while buildds can be cut off from the network completely) and new code
> that has to parse untrusted input.
>
> The sandboxing design of the tag2upload server does a good job of reducing
> that risk. Signatures are checked early, so only attackers able to create
> a valid OpenPGP signature with a key in the keyring can attack the most
> security-sensitive part of the system. The signing key is isolated from
> both the component that processes incoming requests from Salsa and the
> component that constructs the source package, only interacting with them
> via a restricted protocol.

This is more a question to the dgit people, but what kind of hardening
do we have on the tag2upload server? I think dak has the cryptographic
keys in a HSM (Hardware Security Module) to prevent a threat actor from
grabbing those keys for offline attacks...

Is there something similar (HSM or YubiKey) on the tag2upload server? If
not, why?

> The best way to detect whether the tag2upload server has been compromised
> would be to independently verify its output via a reproducible source
> package construction system that starts from the same inputs, namely a
> signed Git tag on a Salsa repository. This could be as simple as an
> independent tag2upload server, or could involve auditing or independent
> reimplementation of the steps the tag2upload server performs.
>
> We don't have reproducible source package builds today, so this is not a
> regression. We currently blindly trust whatever the uploader uploads, and
> the tag2upload proposal does not make that risk worse, merely shifts it to
> central infrastructure. I therefore don't consider reproducible source
> builds to be a security prerequisite for adoption of the tag2upload
> proposal. It is, however, obvious follow-on work that would improve
> detection of some classes of attacks.

Does tag2upload make reproducible source packages harder?

[...]

> #### Replacing the upstream tree
>
> The attack: Construct a benign and malicious Git tree pair containing only
> the upstream source. Reference the benign tree in a source package and get
> that source package signed by a sponsor to trigger tag2upload processing.
> Race the tag2upload server by deleting the upstream tag and commit ID and
> then pushing the malicious Git repository as a new commit with the same
> commit ID.
>
> The upstream tag name is present in the signed tag metadata, but since
> that tag itself is not required to be signed, the attacker can move it at
> will. The upstream tag therefore provides no protection against this
> attack apart from a small detection risk. Authentication of the upstream
> tree comes only from the inclusion of its commit ID in the tag metadata.

I just wanted to state that this is a really nice attack, good modeling
there, I didn't think of that one!
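
For anyone wondering why the commit ID alone can authenticate the
upstream tree: a Git object ID is a hash over the object's type, size
and content, so swapping content under the same ID requires a hash
collision. A minimal sketch, assuming a repository using Git's SHA-1
object format; `git_object_id` is a name I made up, not a Git API.

```python
import hashlib


def git_object_id(obj_type: str, content: bytes) -> str:
    """Compute the SHA-1 object ID Git assigns to an object.

    Git hashes a header of the form "<type> <size>" plus a NUL byte,
    followed by the raw object content.
    """
    header = f"{obj_type} {len(content)}".encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# Should match: printf 'hello world\n' | git hash-object --stdin
print(git_object_id("blob", b"hello world\n"))
```

Trees and commits are hashed the same way, and a commit object contains
the ID of its tree, so a signed tag that names the commit ID pins the
whole upstream tree unless the attacker can produce a collision.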

> I suspect (but am not certain) that this attack would normally be
> prevented by the Salsa Git service. The benign tree already existed in the
> same repository with the referenced commit ID (presumed to be checked by
> the sponsor during review), and even if references to that object are
> deleted via branch deletion, I believe Git will reject the push of the
> malicious commit ID until the old objects have been garbage-collected.
> This presumably will take long enough that the tag2upload process will
> fail because the upstream commit is missing.

Yeah, that's a reasonable assumption, but I believe those jobs are run
on a schedule on GitLab servers, so an attacker could *time* their
attack just so, to make sure the old tree gets GC'd just in time. It's a
heck of a race to win though, especially since you need to time it on
the other side as well...

> This attack could be done by someone with administrative access to Salsa,
> and thus in a position to force an immediate garbage collection of the
> unreferenced objects so that the tree underlying the upstream commit ID
> can be replaced. Administrative access to Salsa would also make it trivial
> to win the race against the tag2upload server. This attack is less prone
> to detection than moving the tag to a different Salsa repository.

That too, of course...

> There is a variation on this attack where the attacker deletes the Git tag
> and tree that it references, pushes a colliding tree, and then repushes
> the Git tag. I believe this has essentially the same properties as the
> above attack.

... and probably just easier.

[...]

I wonder if you've considered the "we need to revoke access for a
compromised/hostile developer" threat model. Right now, we have a
relatively centralized model here (modulo DAM, dak, debian-keyring), and
we're introducing a new component... How does tag2upload manage keys,
and does it introduce additional response time or other issues when
revoking access for retiring or removed developers?

> ## Conclusions

[...]

> I believe widespread adoption of tag2upload would represent a security
> improvement for Debian. The availability of a more secure source package
> construction system outweighs, in my opinion, the small additional risks
> it would introduce. I do not believe it introduces any significant
> security regressions.

I agree with this assessment.

> Were tag2upload adopted, I would recommend some follow-on work:
>
> - Verify that there are securely-archived backups of the dgit-repos Git
>   server, since they contain useful information for tracing any discovered
>   malicious packages.

Having uploads in Git brings a whole set of interesting properties and
tools we could leverage there as well, to ensure the integrity of the
dgit repository itself. When Tor transitioned from gitolite to GitLab,
one of the concerns was exactly that kind of problem space where we're
not sure we want to trust GitLab with our code. So I did a significant
amount of work researching Git integrity solutions, and my findings are
documented here:

https://gitlab.torproject.org/tpo/tpa/team/-/wikis/howto/gitlab#git-repository-integrity-solutions

The work the kernel.org folks have been doing on publishing a
transparency log for the kernel git repository might be particularly
relevant here.
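
To give a flavor of what such a log buys us, here is a toy hash-chained
log of ref updates in Python. This is a generic hash-chain sketch under
my own assumptions, not the kernel.org design and not anything
dgit-repos does today; the entry fields and function names are made up
for illustration.

```python
from __future__ import annotations

import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, ref: str, object_id: str) -> str:
    """Hash an entry together with the hash of the entry before it."""
    payload = json.dumps(
        {"prev": prev_hash, "ref": ref, "oid": object_id}, sort_keys=True
    )
    return hashlib.sha256(payload.encode()).hexdigest()


def append_entry(log: list[dict], ref: str, object_id: str) -> None:
    """Append a ref update, chaining it to the current head of the log."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"prev": prev, "ref": ref, "oid": object_id,
                "hash": entry_hash(prev, ref, object_id)})


def verify_log(log: list[dict]) -> bool:
    """Recompute every link; any rewritten entry breaks the chain."""
    prev = GENESIS
    for entry in log:
        expected = entry_hash(prev, entry["ref"], entry["oid"])
        if entry["prev"] != prev or entry["hash"] != expected:
            return False
        prev = entry["hash"]
    return True


log = []
append_entry(log, "refs/tags/debian/1.0-1", "a" * 40)  # made-up object IDs
append_entry(log, "refs/tags/debian/1.0-2", "b" * 40)
print(verify_log(log))   # chain is intact
log[0]["oid"] = "c" * 40  # simulate a silently rewritten history entry
print(verify_log(log))   # tampering is detected
```

The point is only that anyone holding a copy of the latest hash can
detect a rewritten history; a real deployment would also need the log
head to be signed and published somewhere the Git server cannot alter.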

A.
-- 
May your trails be crooked, winding, lonesome, dangerous, leading to
the most amazing view. May your mountains rise into and above the
clouds.
                        - Edward Abbey
