Scott Kitterman writes ("Re: [RFC] General Resolution to deploy tag2upload"):
> On Tuesday, June 11, 2024 6:25:02 PM EDT Sean Whitton wrote:
> > - it improves the traceability and auditability of our source-only
> >   uploads, in ways that are particular salient in the wake of xz-utils.
>
> As I understand it, Debian was affected by the xz-utils hack, in
> part, because some artifacts were inserted into an upstream tarball
> that were not represented in the upstream git.  Please explain how
> use of tag2upload is relevant to this scenario?  I'm afraid I don't
> follow.

Disclaimer: I don't know the Debian xz maintainer's workflow precisely.

tag2upload, like dgit, ensures and insists that the git tree you are
uploading corresponds precisely [1] to the generated source package.

If you base your Debian git maintainer branch on the upstream git (as
you should) and there is a discrepancy between the contents of the
upstream git branch and the .orig.tar.gz you're using, the upload will
fail.

In the xz case, if the .orig.tar.gz was upstream's, that would have
detected the attack.  More realistically, since the attacker was
targeting Debian, they would instead have had to put all of the
malicious code into the git repository, which is possible, but
riskier - so it makes the attack harder, or easier to detect, but
doesn't rule it out.

There are some caveats to this.

I believe some maintainers maintain an "upstream tarball imports"
branch, which has upstream git as its ancestor, but whose tree
contents are the upstream tarballs.  They then base the Debian branch
on that.  That workflow is vulnerable to "random stuff" in the
tarballs.

It would also be possible to create a debian/patches/ patch [2]
representing the difference between git and the tarball.  There are
various tools in Debian that might make such a patch, including
(I think) dpkg-source, gbp and perhaps dgit, depending on the
workflow, options and so on.

There are probably other workflows that have similar weaknesses.
I wouldn't recommend any of them.

Stepping back a bit, the underlying theme is (obviously) that the
upstream tarball wasn't great, in this case.

In Debian we have historically had a strong culture of wanting to use
upstream release tarballs.  That made a lot of sense 20-30 years ago
when almost all free software projects released tarballs, and
considered them primary, and the VCS situation was a total mess.

Nowadays, for most projects, the upstream developers work in git.
So git is the source code.
Upstream provides tarballs via some semi-automated process, but
that's not what they work with.  I.e., the tarballs are an
intermediate build product.  In Debian we are supposed to use the
source code.  We should be using the same thing as upstream.

There are other reasons why tarballs can be worse, besides the
possibility that they have been maliciously modified.  Often tarballs
contain prebuilt stuff of various kinds.  In Debian we usually want to
build everything from source.  That's much easier to get right if we
start from the actual source!

Ian.

[1] Modulo "patches-applied" vs "patches-unapplied" and some other
fiddly details which aren't relevant to this discussion.

[2] Assuming a gbp workflow and `3.0 (quilt)`, for the moment.

-- 
Ian Jackson <ijack...@chiark.greenend.org.uk>   These opinions are my own.

Pronouns: they/he.  If I emailed you from @fyvzl.net or @evade.org.uk,
that is a private address which bypasses my fierce spamfilter.