(Trimming the cc list a bit to keep from deluging people who aren't actively participating in this part of the conversation.)
Ansgar 🙀 <ans...@43-1.org> writes:
> On Wed, 2024-06-19 at 16:06 -0700, Russ Allbery wrote:
>> I appreciate that you're trying really hard to find a way to represent
>> the Git tree directly in a source package so that no build step is
>> required and building the tree of files locally is trivial. I
>> understand that you truly think that this accomplishes the same goal
>> with some acceptable lossage around the edges. But it doesn't; it's
>> missing the point. The point is that there are a wide variety of
>> potential transformations between the Git tree and the source package
>> required to accommodate the range of Git workflows used in Debian today
>> *and tomorrow*.

> Let us talk about *today*. How many packages would not be possible to
> upload via tag2upload if one required a signature covering content of
> packages? Is it 0.1%? Is it 90%?

If we're talking about *today*, the answer is 100%, because what you're
describing requires new code in both dak and tag2upload that no one has
written. But I understand that you're not really talking about today;
you're asking about a hypothetical future world in which someone does
that work.

I personally do not have those numbers. I know there are a huge variety
of workflows, mostly from previous debian-devel discussions, which gives
me an appreciation for the scope of the problem that tag2upload solves
but doesn't give me numbers. I checked on trends.debian.net to see if by
chance it was trying to collect workflow data, but the closest thing to
relevant is graphs showing the overwhelming popularity of 3.0 (quilt) as
a source package format.

I can say that for the packages I maintain personally, 100% of them
would, at some point over time, not have been possible to upload this
way. As mentioned previously, I frequently have reasons to carry a
Debian-specific patch for some period of time (which is a file that's
generated at source package build time), and for the reasons that Philip
Hands already explained, I don't want to have to think, with each
package upload, about whether that upload will be blocked if I use
tag2upload.

If one only looks at the most recent version of each of my packages at a
specific point in time, my guess is that about 50% of those packages
could not be uploaded and the other 50% would currently work, because
they currently don't have any Debian-specific changes to upstream and
therefore building the *.debian.tar.xz is fairly trivial and involves
only files already present in the Git head. I haven't counted, though,
so consider this number rather rough; I could be off by 10-20% in either
direction.

> For *tomorrow* we might change things in the future. Some things like
> arbitrary code execution at .dsc construction time are fairly useful
> after all (required for some workflows, even when it might not change
> the files ending in the source package).

If you also agree that having a source package build step is useful, why
are you opposed to it? What do you think would be different about
tomorrow that we don't know today? I don't understand what you're trying
to accomplish.

Also, I think you're talking about yet a different style of build system
than what tag2upload does, but just to be explicit: I believe tag2upload
source package builds do not involve arbitrary code execution, if by
that you mean running code shipped in the Git repository being built.
(The tag2upload and dpkg-source code is not "arbitrary" in the normal
security sense of that term.)

Normally, arbitrary code execution can happen with source package builds
due to running the clean target, which can be arbitrary code specified
in debian/rules. That shouldn't be necessary in tag2upload since it's
starting from a clean Git checkout.
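
To make that last point concrete, here is a rough sketch, in Python, of
a build along those lines. This is emphatically not the actual
tag2upload or dgit code; the function name is mine, and the handling of
the upstream .orig tarball for 3.0 (quilt) is omitted. It is only meant
to show the shape of a source package build that starts from a pristine
export of a Git tag and never executes anything from debian/rules:

    import io
    import os
    import subprocess
    import tarfile

    def build_source_from_tag(repo, tag, workdir):
        """Build a source package from a pristine export of a Git tag."""
        exportdir = os.path.join(workdir, "pkg")
        os.makedirs(exportdir)

        # "git archive" writes exactly the committed tree, so there is no
        # build detritus for a clean target to remove.
        tree = subprocess.run(
            ["git", "-C", repo, "archive", "--format=tar", tag],
            check=True, capture_output=True,
        ).stdout
        tarfile.open(fileobj=io.BytesIO(tree)).extractall(exportdir)

        # dpkg-source itself runs no code shipped in the package; contrast
        # with "dpkg-buildpackage -S", which runs "debian/rules clean"
        # before calling dpkg-source.
        subprocess.run(["dpkg-source", "-b", "pkg"], cwd=workdir, check=True)

The point is only that nothing in that sequence gives the repository
being packaged an opportunity to run code on the machine doing the
build.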
I'm not sure I'm a fan of truly arbitrary code execution during the
source package build, since avoiding that is a very nice bit of security
hardening for the source package builder that makes it way, way more
difficult for a malicious source package to somehow compromise the
builder. But that's hardening, not an essential element of the design,
so I could be convinced.

> Pretty much all changes in Debian (say systemd, usrmerge, ...) happened
> incrementally. Why should that not be appropriate here? (Or was a slow
> move wrong in retrospect and we should have decided to support only
> systemd and drop sysvinit in 2014, and to move to only usrmerge also
> several releases earlier?)

I don't think you thought through this analogy before you mailed it. :)
Surely the differences between a system that would only be used by
Debian package uploaders and is strictly optional, and major
architectural changes to Debian *user* systems that are either huge
changes in how they administer their systems (in the first case) or
irreversible changes to their systems (in the second case), are too
obvious to get into.

Also, I do feel like I have to point out that you have not, in this
thread, been asking for an incremental deployment. You have been asking
for someone to completely rewrite tag2upload along different principles,
using brand new features of dak that no one has yet written, in order to
get a system that does a fraction of what it currently can do, has
severe flexibility restrictions, and would be, to be quite frank, a
tottering pile of workarounds and hacks. You're asking for the sort of
muddled and incoherent design that, were someone advocating it anywhere
else in Debian, you would be grumbling about as poor architecture. And
you would be correct.

The straightforward, flexible, and adaptable way to glue together the
wild variety of preferred Git packaging workflows with a stable and
useful source package format is to put a build step in between those two
things. This is not some sort of novel or radical idea. It is literally
what we have been doing in Debian for as long as I have been involved in
the project.

tag2upload doesn't change that property. It just allows me, and others
who want to use it, to move that step off of potentially untrustworthy
or unsuitable local hardware onto a Debian project system that, like
binary buildds, we can secure and incrementally improve. And, as a
bonus, it regularizes the construction and gives us hooks for doing
useful additional verifications in the future.

> Why should there be a standardized Debian source package in the end
> (where tag2upload might try to build quilt patches) when nobody but a
> machine is supposed to use them?

This is such a mystifying question to me that I can't reverse-engineer
what you are trying to ask. There are innumerable consumers of source
packages in Debian: analysis tools, linting tools, tools that gather
together all of the Debian-specific patches so that upstreams can see
how we're changing their software, reporting, auditing, etc.
tag2upload's generation of normal, semantically correct 3.0 (quilt)
packages lets that entire ecosystem of downstream tools continue to
work. In some cases, using it (or dgit) makes them work better, by
providing more or higher-quality metadata. This was the case for my own
personal packages.

If you want to change the source package format in the archives, by (for
example) deprecating 3.0 (quilt) in favor of something else, I think
that's at least arguably within your remit as an FTP team delegate.
Figuring out all of the things that would break and all the assumptions
that would change, and fixing the entire ecosystem of tools so that you
can make such a change, is a massive amount of work, and I don't envy
you the task. Nor would I make any assumptions about how long it would
take you to complete it.

In the meantime, tag2upload works correctly with the ecosystem of tools
that we have now, with the minor caveat that tools that care
specifically about the identity of the package uploader will need to
parse some fields out of *.dsc or *.changes instead of taking that
information only from the OpenPGP signature.
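
For example, a tool that today looks at who made the OpenPGP signature
on an upload would instead read the relevant fields out of the *.changes
file itself. A sketch, using the python-debian library (the file name
here is made up):

    from debian.deb822 import Changes

    # Read uploader-related fields from the .changes file rather than
    # inferring the uploader from the signature on it.
    with open("foo_1.2-1_source.changes") as f:
        changes = Changes(f)

    print(changes.get("Changed-By"))   # person named in the changelog entry
    print(changes.get("Maintainer"))   # maintainer recorded for the package

That's the sort of minor adjustment I mean by the caveat above.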
> Also, if they are standardized why are there options for the maintainer
> to control how the source package gets constructed? (Like an option to
> control how many patches end up in d/patches.)

Because that's a *semantic* decision, and dgit or tag2upload cannot
always guess the correct semantics. Some maintainers choose not to
maintain changes to upstream as separable commits that can be
meaningfully turned into a patch series, and in that case they need to
tell dgit or tag2upload not to attempt to make sense of the changes and
instead just build a monolithic patch. The tools can do a lot, but they
can't recover information that the package maintainer has intentionally
discarded.

-- 
Russ Allbery (r...@debian.org)              <https://www.eyrie.org/~eagle/>