Scott Kitterman writes ("Re: [RFC] General Resolution to deploy tag2upload"):
> On Tuesday, June 11, 2024 6:25:02 PM EDT Sean Whitton wrote:
> > - it improves the traceability and auditability of our source-only
> >   uploads, in ways that are particular salient in the wake of xz-utils.
> 
> As I understand it, Debian was affected by the xz-utils hack, in
> part, because some artifacts were inserted into an upstream tarball
> that were not represented in the upstream git.  Please explain how
> use of tag2upload is relevant to this scenario?  I'm afraid I don't
> follow.

Disclaimer: I don't know precisely what the Debian xz maintainer's
workflow is.

tag2upload, like dgit, ensures and insists that the git tree you are
uploading corresponds precisely [1] to the generated source package.

If you base your Debian git maintainer branch on the upstream git (as
you should) and there is a discrepancy between the contents of the
upstream git branch and the .orig.tar.gz you're using, the upload
will fail.
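
To make the kind of check concrete, here is a minimal sketch in Python.
It is not how tag2upload or dgit actually implement it; it only
illustrates the idea of comparing a checked-out upstream git tree with
an .orig.tar.gz.  The function names, the choice of SHA-256, and the
assumption that the tarball has a single top-level directory are all
hypothetical choices made for this illustration.

```python
import hashlib
import tarfile
from pathlib import Path


def tree_hashes(root):
    """Map relative path -> SHA-256 for each regular file under root.

    Assumption: root is a checkout of the upstream git branch; the .git
    directory itself is skipped.
    """
    root = Path(root)
    hashes = {}
    for path in sorted(root.rglob("*")):
        rel = path.relative_to(root)
        if ".git" in rel.parts or not path.is_file():
            continue
        hashes[rel.as_posix()] = hashlib.sha256(path.read_bytes()).hexdigest()
    return hashes


def tarball_hashes(tarball, strip_components=1):
    """Map relative path -> SHA-256 for each regular file in the tarball.

    Assumption: the tarball has one top-level directory (e.g. pkg-1.0/),
    which is stripped so that paths line up with the git tree.
    """
    hashes = {}
    with tarfile.open(tarball) as tar:
        for member in tar:
            if not member.isfile():
                continue  # this sketch ignores symlinks, directories, etc.
            parts = Path(member.name).parts[strip_components:]
            if not parts:
                continue
            data = tar.extractfile(member).read()
            hashes["/".join(parts)] = hashlib.sha256(data).hexdigest()
    return hashes


def discrepancies(git_tree, tarball):
    """Report files that differ between the git checkout and the tarball."""
    git = tree_hashes(git_tree)
    tar = tarball_hashes(tarball)
    common = set(git) & set(tar)
    return {
        "only_in_tarball": sorted(set(tar) - set(git)),
        "only_in_git": sorted(set(git) - set(tar)),
        "content_differs": sorted(p for p in common if git[p] != tar[p]),
    }
```

Anything reported under "only_in_tarball" or "content_differs" is
material that reached the tarball without being represented in git,
which is the situation described above.  A real tool would more likely
compare against git tree objects than against a working directory, and
would have to deal with symlinks, file modes and the like.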

In the xz case, if the .orig.tar.gz is upstream's, that would have
detected the attack.  More realistically, since the attacker was
targeting Debian, they would instead have had to put all of the
malicious code into the git repository, which is possible, but riskier
- so it makes the attack harder, or easier to detect, but doesn't rule
it out.

There are some caveats to this.

I believe some maintainers maintain an "upstream tarball imports"
branch, which has upstream git as its ancestor, but whose tree
contents are the upstream tarballs.  They then base the Debian branch
on that.  That workflow is vulnerable to "random stuff" in the
tarballs.

It would also be possible to create a debian/patches/ patch [2]
representing the difference between git and the tarball.  There are
various tools in Debian that might make such a patch, including (I
think) dpkg-source, gbp and perhaps dgit, depending on the workflow
and options in use.

There are probably other workflows that have similar weaknesses.
I wouldn't recommend any of them.


Stepping back a bit, the underlying theme is (obviously) that the
upstream tarball wasn't great, in this case.

In Debian we have historically had a strong culture of wanting to use
upstream release tarballs.  That made a lot of sense 20-30 years ago
when almost all free software projects released tarballs, and
considered them primary, and the VCS situation was a total mess.

Nowadays, for most projects, the upstream developers work in git.  So
git is the source code.  Upstream provides tarballs via some
semi-automated process, but it's not what they work with.  I.e., the
tarballs are an intermediate build product.

In Debian we are supposed to use the source code.  We should be using
the same thing as upstream.

There are other reasons why tarballs can be worse, besides the risk
that they are maliciously modified.  Often tarballs contain prebuilt stuff
of various kinds.  In Debian we usually want to build everything from
source.  That's much easier to get right if we start from the actual
source!

Ian.


[1] Modulo "patches-applied" vs "patches-unapplied" and some other
fiddly details which aren't relevant to this discussion.

[2] Assuming a gbp workflow and `3.0 (quilt)`, for the moment.

-- 
Ian Jackson <ijack...@chiark.greenend.org.uk>   These opinions are my own.  

Pronouns: they/he.  If I emailed you from @fyvzl.net or @evade.org.uk,
that is a private address which bypasses my fierce spamfilter.
