Zbigniew Jędrzejewski-Szmek venit, vidit, dixit 2025-03-31 12:53:54:
> tl;dr: change the Packaging Guidelines to recommend the raw "git
> archive" or equivalent over the upstream tarball produced using
> "make dist".
>
> This is inspired by the discussion in "Reproducible Builds" mailing list,
> in particular [1].
>
> Background: upstreams use version control for their projects, but in
> packaging we use a "tarball". Nowadays, this tarball is often the output
> of "git archive". But it is also common for upstream to use "make dist"
> (in case of autoconf) or 'python setup.py sdist' (python), etc.
> In particular, when we download from github/gitlab/…, the archive is
> often autogenerated by the forge upon request, equivalent to 'git archive'.
>
> In general, those "upstream tarballs" include the results of some
> local processing, for example translating a configure.ac source into a
> configure script, using local autoconf macros. Those preprocessed
> scripts can become outdated, and in fact we often run 'autoreconf' in
> %build to "refresh". In the "xz debacle", an upstream tarball was used
> to smuggle rogue payload that wasn't checked into git. Finally, those
> "upstream tarballs" are generally not reproducible because they depend
> on the build environment. So there are good reasons to start with the
> "raw" tarball and build everything from that.

All of these are very good reasons to choose a different "source of truth".

> Our packaging guidelines don't say much about which tarball to use.
> I propose to make two changes:
> 1. Say that "raw" tarballs SHOULD be used, rather than the preprocessed type,
> when both are available from upstream, and there isn't a particular
> reason to use the latter.
> 2. Update
> https://docs.fedoraproject.org/en-US/packaging-guidelines/SourceURL/#_using_revision_control
> from svn to git and just use an link to a autogenerated github tarball.
>
> This is only "SHOULD", because sometimes the git tarball is too large
> or has other deficiencies. Another reason is that the "upstream
> tarball" may be signed, and that'd be preferred to the unsigned "raw"
> archive. But those should be rare exceptions.

Signed tarballs should not be rare exceptions. Indeed, we require
signatures to be checked when upstream provides them. Deliberately
choosing an unsigned tarball over a signed one circumvents that check.

That also raises the question of why an autogenerated tarball should be
more trustworthy: in the xz case, the person was able to publish a
release tarball and could just as well have committed to the source tree.

Do we give up on signatures? Do we switch to signed commits (and trust
the forge's tarball creation for that commit)?

Let me also mention the case where we have to clean sources (remove
proprietary material) before uploading them to the look-aside cache. We
should document how to do so in the spec. Ideally, one could:
- get the original sources
- check upstream's signature
- apply the checked-in clean script (which creates a tarball)
- check that the result matches the look-aside hash in "sources".
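The last step could look roughly like the sketch below. It is only an illustration and rests on assumptions: that "sources" uses the BSD-style "SHA512 (name) = digest" lines, and that the clean script is deterministic, so that re-running it yields a byte-identical tarball. The function names are made up for this example; downloading, the signature check and the clean script itself are not covered.

```python
import hashlib
import re
from pathlib import Path

# Assumed format of a line in the dist-git "sources" file:
#   SHA512 (foo-1.0-clean.tar.xz) = <hex digest>
SOURCES_LINE = re.compile(
    r"^(?P<algo>[A-Z0-9]+) \((?P<name>.+)\) = (?P<digest>[0-9a-f]+)$"
)


def parse_sources(text):
    """Return {filename: (algorithm, hexdigest)} for every recognised line."""
    entries = {}
    for line in text.splitlines():
        match = SOURCES_LINE.match(line.strip())
        if match:
            entries[match["name"]] = (match["algo"].lower(), match["digest"])
    return entries


def file_digest(path, algo):
    """Hash a file in chunks so large tarballs need not fit in memory."""
    digest = hashlib.new(algo)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sources(tarball, sources_text):
    """True if the tarball's digest equals the one recorded in "sources"."""
    entries = parse_sources(sources_text)
    name = Path(tarball).name
    if name not in entries:
        return False
    algo, expected = entries[name]
    return file_digest(tarball, algo) == expected
```

One would call matches_sources() with the tarball produced by the clean script and the contents of the "sources" file, and fail the check on a False result. If the clean script is not deterministic (timestamps, file ordering, compression settings), the comparison will fail even when the content is fine, so that would need to be addressed first.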
Cheers
Michael