On Sat, 05 Aug 2023 at 17:06:27 +0200, Lucas Nussbaum wrote:
> Should we give up on requiring a 'clean' target that works? After all,
> when 17% of packages are failing, it means that many maintainers don't
> depend on it in their workflow.

I think it's somewhat inevitable that code paths that aren't frequently
exercised don't work. If a majority of maintainers are doing all of
their builds with git-buildpackage, or dgit --clean=git, or something
basically equivalent to one of those, then `debian/rules clean` will
never actually be run against a built tree. For teams with a strongly
preferred workflow (like the Perl, Python and GNOME teams consistently
using git-buildpackage), this seems particularly likely.

I think we need to think about what benefit this Policy requirement
brings us, and whether it's worth the cost, before treating it as
important: the higher we choose to make the cost of fixing this class
of bug, the higher the bar should be for treating it as a real bug at
all, or treating it as RC.

For me, the main purpose of `debian/rules clean` is being able to do
incremental builds while debugging something - but if I want to do
incremental builds, it's quite likely that I'll also be using
`debuild -b -nc` to make the builds genuinely incremental (and then a
fully clean build from first principles at the end, to verify that
whatever issue I'm debugging is really fixed).

Having looked at some of the packages with my name on in
all_failing.txt.dd-list, many of them are a simple matter of built
files being created by the build (either upstream or downstream) or by
running automated tests, but not deleted. Those are easily fixed
(several fixed in git already).

libsdl2 has generated files in its upstream source tarball which are
re-generated with different content during the build (mostly Autotools
files, but also include/SDL_config.h and include/SDL_revision.h), and
I'm confident that it's not the only one: many upstreams want to do
this for the convenience of people building their package on OSs whose
development tools are less tightly integrated or less scriptable than
ours (especially Windows).
In many cases the most pragmatic way to deal with that is to delete the
file during clean and ignore the resulting warnings from dpkg-source. I
certainly don't want to require maintainers to invent machinery for
saving and restoring upstream's versions of generated files like
dh-autoreconf does, because that seems like busy-work that just adds to
the complexity of our builds without making Debian any better.

One way to streamline dealing with these generated files would be to
normalize repacking of upstream source releases to exclude them, and
make it easier to have source packages that genuinely only contain what
we consider to be source.

At the moment, devref §6.8.8.2 strongly discourages repacking tarballs
to exclude DFSG-but-unnecessary files (including generated files, as
well as source/build files only needed on Windows or macOS or
whatever[1]), and Lintian strongly encourages adding a +dfsg or +ds
suffix to any repacked tarball, which makes it less straightforward to
track upstream's versioning. Is it time for us to reconsider those
recommendations?

For many upstreams (for example Autotools-based projects, and any
project like GTK that includes pre-generated documentation in source
releases), we can get "more source-like" upstream source releases by
repacking our own tarball based on upstream VCS tags than we would get
by using their official source release artifacts. For other upstreams,
Files-Excluded can be used to delete generated or unneeded files.

A side benefit of normalizing repacking upstream source releases would
be that maintainers are no longer expected to check the diff between
those files in the old and new version, and no longer required to track
the copyright and licensing status of the files that get excluded,
which can significantly speed up the process of importing new upstream
versions in some cases.
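
As a rough illustration of what a Files-Excluded-style repack would
keep and drop, here is a minimal Python sketch. It is only a sketch
under stated assumptions: the function name split_excluded, the file
list and the patterns are all hypothetical, and Python's fnmatch is
used as a stand-in for whatever pattern matching the real tooling does,
so edge cases may well differ.

```python
import fnmatch


def split_excluded(paths, patterns):
    """Partition paths into (kept, excluded) lists.

    Hypothetical helper: a path counts as excluded if the path itself,
    or any of its parent directories, matches one of the glob patterns.
    fnmatch is only an approximation of real Files-Excluded matching.
    """
    kept, excluded = [], []
    for path in paths:
        parts = path.split("/")
        # "a/b/c" -> ["a", "a/b", "a/b/c"], so that excluding a
        # directory also excludes everything underneath it.
        candidates = ["/".join(parts[:i]) for i in range(1, len(parts) + 1)]
        if any(fnmatch.fnmatchcase(c, pat)
               for c in candidates for pat in patterns):
            excluded.append(path)
        else:
            kept.append(path)
    return kept, excluded


# Made-up upstream file list and exclusion patterns, for illustration only.
upstream_files = [
    "configure",
    "configure.ac",
    "docs/html/index.html",
    "src/main.c",
    "win32/project.sln",
]
patterns = ["configure", "docs/html", "win32/*"]

kept, excluded = split_excluded(upstream_files, patterns)
print("kept:", kept)
print("excluded:", excluded)
```

With that made-up input, the generated configure script, the
pre-generated HTML and the Windows-only project file are dropped, while
configure.ac and src/main.c are kept.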

The major disadvantage of repacking upstream source is that if upstream
makes signed releases or publishes checksums, our repacked source
cannot be validated against that information; but if we can easily
re-download upstream source, then it would be possible for any
interested developer to verify that the repacked tarball matches what
upstream released, minus the files listed in Files-Excluded.

Devref §6.8.8.2 also says that "it is common for Debian users who need
to build software for non-Debian platforms to fetch the source from a
Debian mirror rather than trying to locate a canonical upstream
distribution point", but I'm not convinced that's true any more. Our
volunteers' time is our most limited resource, so if we can use that
time more efficiently by no longer catering to possibly-hypothetical
users who are building our source code on non-Debian platforms, then
that might be a worthwhile tradeoff.

    smcv

[1] I mentioned Windows/macOS, but devref actually talks about MS-DOS,
    which is perhaps an indication of how old that text is