Static security tooling that restricts the use of binary files during
the build, check, and install phases could help here, as an alternative
to git archive, in mitigating xz-style attacks (e.g., no binary files
may be used during the build, they may only be copied during install
as long as they are not executable, no build output may change during
checks, etc.). Would a tool like that leave the challenges that
maintainer-provided tarballs present to reproducible builds as the
main argument for this proposal?
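
To make the idea concrete, here is a rough sketch of the simplest piece
of such a tool: listing the binary files in an unpacked source tree so
that a policy could then be applied to them. This is only an
illustration, not an existing tool. The names looks_binary and
find_binary_files are made up, and the "NUL byte in the first 8 KiB"
test is just a common heuristic, so it will miss some binary files and
flag some harmless ones.

```python
import os


def looks_binary(path, probe_size=8192):
    """Heuristic (an assumption): a NUL byte near the start means binary."""
    with open(path, "rb") as fh:
        return b"\0" in fh.read(probe_size)


def find_binary_files(root):
    """Return sorted paths, relative to root, of files that look binary."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip version-control metadata; it is not part of the sources.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Ignore symlinks and special files; only inspect regular files.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            if looks_binary(path):
                hits.append(os.path.relpath(path, root))
    return sorted(hits)
```

A build wrapper could call find_binary_files() on the unpacked sources
before %build and refuse to continue, or record the list and check
later that none of those files were read or executed.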
On 3/31/25 3:53 AM, Zbigniew Jędrzejewski-Szmek wrote:
tl;dr: change the Packaging Guidelines to recommend the raw "git
archive" or equivalent over the upstream tarball produced using
"make dist".
This is inspired by the discussion on the "Reproducible Builds" mailing list,
in particular [1].
Background: upstreams use version control for their projects, but in
packaging we use a "tarball". Nowadays, this tarball is often the output
of "git archive". But it is also common for upstream to use "make dist"
(in case of autoconf) or 'python setup.py sdist' (python), etc.
In particular, when we download from github/gitlab/…, the archive is
often autogenerated by the forge upon request, equivalent to 'git archive'.
In general, those "upstream tarballs" include the results of some
local processing, for example translating a configure.ac source into a
configure script, using local autoconf macros. Those preprocessed
scripts can become outdated, and in fact we often run 'autoreconf' in
%build to "refresh". In the "xz debacle", an upstream tarball was used
to smuggle in a rogue payload that wasn't checked into git. Finally, those
"upstream tarballs" are generally not reproducible because they depend
on the build environment. So there are good reasons to start with the
"raw" tarball and build everything from that.
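
As an illustration of the difference between the two kinds of tarball,
the following sketch lists files that are present in an upstream
tarball but not tracked in git at a given ref. It is only a sketch: the
function names are made up, and it assumes that the tarball unpacks
into a single top-level directory (e.g. proj-1.0/) and that a git
checkout of the same release is available locally.

```python
import subprocess
import tarfile


def git_tracked_files(repo_dir, ref):
    """Return the set of paths tracked in repo_dir at ref (e.g. a tag)."""
    out = subprocess.run(
        ["git", "-C", repo_dir, "ls-tree", "-r", "--name-only", "-z", ref],
        check=True,
        capture_output=True,
    ).stdout
    # -z separates the paths with NUL bytes and disables path quoting.
    return {p.decode("utf-8", "surrogateescape") for p in out.split(b"\0") if p}


def tarball_files(tar_path):
    """Return regular-file paths in the tarball, minus the top-level dir."""
    with tarfile.open(tar_path) as tar:
        names = [m.name for m in tar.getmembers() if m.isfile()]
    # Assumption: every member lives under one top-level directory.
    return {n.split("/", 1)[1] for n in names if "/" in n}


def untracked_in_tarball(tar_path, tracked):
    """Return sorted tarball paths that are not in the tracked set."""
    return sorted(tarball_files(tar_path) - set(tracked))
```

For an autotools project the result would typically include generated
files such as configure and Makefile.in; anything beyond what the
build system is expected to generate deserves a closer look.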
Our packaging guidelines don't say much about which tarball to use.
I propose to make two changes:
1. Say that "raw" tarballs SHOULD be used, rather than the preprocessed type,
when both are available from upstream, and there isn't a particular
reason to use the latter.
2. Update
https://docs.fedoraproject.org/en-US/packaging-guidelines/SourceURL/#_using_revision_control
from svn to git and just use a link to an autogenerated github tarball.
This is only "SHOULD", because sometimes the git tarball is too large
or has other deficiencies. Another reason is that the "upstream
tarball" may be signed, and that'd be preferred to the unsigned "raw"
archive. But those should be rare exceptions.
There is also a whole category of projects like Rust, PyPI, and Maven,
where we download the tarball from a language distribution website,
not from upstream directly. I'm NOT proposing that we stop using
those. Instead, later on, I would add the requirement that those
bundles must build reproducibly from the git checkout. But I'm leaving
this out of the current proposal for simplicity.
[1]
https://lists.reproducible-builds.org/pipermail/rb-general/2025-March/003694.html
Zbyszek