On 2025-02-15 12:33:16 +0100 (+0100), Daniel Gröber wrote:
[...]
> FYI: If all upstream wants is git metadata I like to introduce them to the
> wonderful, but obscure, git `export-subst` feature. See git-attributes(1).
>
> Works with forges, git-archive and everything.
>
> Example:
> https://github.com/YosysHQ/yosys/commit/222e7a2da345f01980d9261c40c5d50eced4f9ab
> though this was later improved by others
> https://github.com/YosysHQ/yosys/commit/9d15f1d6ac4a9ff2e1f87cda8c366659027fb76f
>
> If that's not enough can you point us to what this upstream is doing exactly?

I'm not that particular upstream, but with my upstream hat for other
projects on, there's a lot of data in a Git repository that our
source tarball build process relies on but which isn't strictly
files in the checked-out worktree: names and addresses of all commit
authors, most recent tag in the checkout's history and number of
commits following it, presence of certain footer lines in commit
messages after the most recent tag, tags in the checkout's history
matching a specific pattern which come immediately after the
introduction of each file in a certain directory... these are things
easily queried from a (non-shallow) clone of the repository but
which aren't simple string substitutions.

There are several reasons for this complexity: First and foremost,
when these projects started, Git was still fairly new on the scene,
and most distros preferred or even required source tarballs for
packaging. Second, the projects' maintainers were burned on multiple
occasions by mistakes where metadata duplicated from Git and
committed into the file tree ended up out of sync or straddling
release points, so they developed ways to avoid the duplication and
risk of divergence by extracting that data from Git at dist build
time. Third, because the projects needed to deal with heavy volumes
of development activity from many contributors in parallel, they
relied on a distributed parallel approval model with the fewest
possible coordination chokepoints, so they needed to support
independent features merging in arbitrary order with things like
release notes sorted out automatically whenever a release got
tagged.

I would argue that our source tarballs don't exactly "differ" from
what's in Git; they include content which isn't solely represented
by the worktree files in a corresponding checkout, but is still data
extracted from the corresponding Git repository state.

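To give a rough idea of the kind of queries listed above, here is a
minimal Python sketch. It is an illustration only, not our actual
build tooling: the helper names and the "Example-Footer" key are
made up for the example, and the footer matching is a plain line
scan rather than a proper trailer parser. It assumes a full
(non-shallow) clone with at least one tag reachable from HEAD.

```python
import subprocess


def git(*args, repo="."):
    """Run a git command inside `repo` and return its stdout as text."""
    result = subprocess.run(
        ["git", "-C", repo, *args],
        check=True, capture_output=True, text=True,
    )
    return result.stdout


def parse_describe(described):
    """Split `git describe --tags --long` output into its three parts.

    The output looks like <tag>-<count>-g<sha>; the tag itself may
    contain hyphens, so split from the right.
    """
    tag, count, sha = described.strip().rsplit("-", 2)
    if sha.startswith("g"):
        sha = sha[1:]
    return tag, int(count), sha


def parse_authors(log_output):
    """Return the sorted, de-duplicated 'Name <email>' lines."""
    return sorted({line.strip() for line in log_output.splitlines() if line.strip()})


def find_footers(log_output, key):
    """Return the values of every 'Key: value' line in the given messages."""
    prefix = key.lower() + ":"
    return [
        line[len(prefix):].strip()
        for line in log_output.splitlines()
        if line.lower().startswith(prefix)
    ]


def collect_metadata(repo=".", footer_key="Example-Footer"):
    """Gather the metadata a dist build might need (hypothetical helper)."""
    tag, count, sha = parse_describe(git("describe", "--tags", "--long", repo=repo))
    authors = parse_authors(git("log", "--format=%aN <%aE>", repo=repo))
    # Only commit messages after the most recent tag are inspected.
    messages = git("log", "--format=%B", f"{tag}..HEAD", repo=repo)
    return {
        "latest_tag": tag,
        "commits_since_tag": count,
        "commit": sha,
        "authors": authors,
        "footers": find_footers(messages, footer_key),
    }


if __name__ == "__main__":
    print(collect_metadata("."))
```

None of this works against a bare file tree dump, and in a shallow
clone the tag lookup either fails or yields misleading answers,
which is the point being made above.
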
Downstream distros can choose to use our official signed release
source tarballs, or run the tarball build process themselves from a
full checkout of our Git repositories, but just naively dumping the
file tree from a git checkout or even using a shallow clone is
inadequate and we expressly do not support those workflows (if
someone insists on doing that, it's on them to make it work, and to
check that they're not omitting things the copyright license
references such as a generated authors file).

Unfortunately, package maintainers sometimes like to insist that
upstream projects' workflows are "wrong" because the choices they've
made differ from how other projects might choose to develop
software, but communities are unique; they often face different
challenges that need their own solutions, or aren't willing to
compromise by adopting partial solutions popular elsewhere just to
conform.
-- 
Jeremy Stanley