On Mon, Mar 31, 2025 at 2:04 PM Daniel P. Berrangé <berra...@redhat.com> wrote:
>
> On Mon, Mar 31, 2025 at 01:39:57PM +0200, Vitaly Zaitsev via devel wrote:
> > On 31/03/2025 12:53, Zbigniew Jędrzejewski-Szmek wrote:
> > > This is inspired by the discussion in "Reproducible Builds" mailing list,
> > > in particular [1].
> >
> > But auto-generated Git archives are not reproducible.
>
> The git archive hash may not be stable, but the contents of the archive
> are expected to be stable, provided git history was not tampered with.
>
> When it comes to reproducibility we should not be verifying the tarball
> hash. Instead we should be proving that the content of the archive Fedora
> stores, is an accurate representation of the git content at the given
> tag/commit
>
> I don't think we're well setup for that - we don't want to be parsing
> URLs to try to identify if the URL points to a particular git repo
> tag or commit. We have the forgemeta macros, which record the info as
> %global statement, but they're not mandatory, and also when we parse a
> spec, this data is already expanded.
>
> We're drifted into our current way of doing things because it was the
> least effort to achieve with Fedora's historical lookaside cache
> bit-bucket.
>
> If we're thinking about provenance more generally, not just the RPM
> reproducibility, then perhaps the 'sources' file should have been
> adapted to be more explicit about what we're storing. It could record
> the full git repository location, tag/commit hash, list of globs of
> files to strip. rhpkg could include commands for downloading, and
> later verifying tarball contents against git hashes, and for auto
> repacking of tarballs, and various other tarball management tasks.
> Potentially tarball contents verification against the git repo would
> happen as a gating CI task on every build.

Or could we maybe kindly stop overdesigning?

Currently: we store the URL and hash of the bespoke archive, verified
before unpacking.

It's entirely possible to: store the URL and hash of the forge-generated
archive contents, verified *after* unpacking.

That gives all the guarantees Zbigniew is after (verify correspondence to
git repo contents provided we trust the forge) without a single new
field, command or anything like this. If you make unpacking a fallback,
you don't even need to store that one bit for the old/new way.
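
To make "hash of the archive contents, verified after unpacking" concrete, here is a minimal sketch of what such a check could look like. It is only an illustration: the function names (tree_digest, verify_unpacked) and the digest scheme (SHA-256 over sorted relative paths plus per-file SHA-256) are assumptions made up for this mail, not anything rpmbuild, fedpkg or the lookaside cache implements today. It ignores mtimes, ownership, file modes, empty directories and compression settings, so two forge-generated tarballs of the same tree should produce the same value even when the tarball bytes differ.

```python
import hashlib
import os


def tree_digest(root: str) -> str:
    """Return a SHA-256 digest over the paths and contents below root.

    Hypothetical scheme: only relative paths, symlink targets and file
    bytes are hashed, so archive-level metadata cannot change the result.
    """
    entries = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            # Keep files and symlinks; plain directories carry no content.
            if os.path.islink(full) or not os.path.isdir(full):
                rel = os.path.relpath(full, root).replace(os.sep, "/")
                entries.append((rel, full))

    outer = hashlib.sha256()
    # Sort so the result does not depend on directory iteration order.
    for rel, full in sorted(entries):
        inner = hashlib.sha256()
        if os.path.islink(full):
            inner.update(b"link:" + os.fsencode(os.readlink(full)))
        else:
            inner.update(b"file:")
            with open(full, "rb") as handle:
                for chunk in iter(lambda: handle.read(1 << 16), b""):
                    inner.update(chunk)
        # NUL cannot appear in a path, so it is a safe field separator.
        outer.update(os.fsencode(rel) + b"\0")
        outer.update(inner.hexdigest().encode("ascii") + b"\0")
    return outer.hexdigest()


def verify_unpacked(root: str, expected: str) -> bool:
    """Compare the digest of an unpacked tree against the stored value."""
    return tree_digest(root) == expected
```

The stored value would then be the output of tree_digest() on a known-good unpack, and the check would run right after the sources are unpacked instead of on the tarball itself.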