On Mon, Mar 31, 2025 at 2:04 PM Daniel P. Berrangé <berra...@redhat.com> wrote:
>
> On Mon, Mar 31, 2025 at 01:39:57PM +0200, Vitaly Zaitsev via devel wrote:
> > On 31/03/2025 12:53, Zbigniew Jędrzejewski-Szmek wrote:
> > > This is inspired by the discussion in "Reproducible Builds" mailing list,
> > > in particular [1].
> >
> > But auto-generated Git archives are not reproducible.
>
> The git archive hash may not be stable, but the contents of the archive
> are expected to be stable, provided git history was not tampered with.
>
> When it comes to reproducibility we should not be verifying the tarball
> hash. Instead we should be proving that the content of the archive Fedora
> stores is an accurate representation of the git content at the given
> tag/commit.
>
> I don't think we're well set up for that - we don't want to be parsing
> URLs to try to identify if the URL points to a particular git repo
> tag or commit. We have the forgemeta macros, which record the info as
> %global statements, but they're not mandatory, and also when we parse a
> spec, this data is already expanded.
>
> We've drifted into our current way of doing things because it was the
> least effort to achieve with Fedora's historical lookaside cache
> bit-bucket.
>
> If we're thinking about provenance more generally, not just the RPM
> reproducibility, then perhaps the 'sources' file should have been
> adapted to be more explicit about what we're storing. It could record
> the full git repository location, tag/commit hash, list of globs of
> files to strip. rhpkg could include commands for downloading, and
> later verifying tarball contents against git hashes, and for auto
> repacking of tarballs, and various other tarball management tasks.
> Potentially tarball contents verification against the git repo would
> happen as a gating CI task on every build.

Or could we maybe kindly stop overdesigning?

Currently: we store the URL and hash of the bespoke archive, verified
before unpacking.

It's entirely possible to store the URL and the hash of the
forge-generated archive's contents, verified *after* unpacking.
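
A minimal sketch of what a hash of the unpacked contents could look like,
assuming SHA-256 and a scheme that hashes only relative paths and file
bytes. The name tree_digest and the exact byte layout are made up for
illustration; nothing like this exists in Fedora tooling today as far as
I know. Symlinks and file modes are ignored to keep it short.

```python
import hashlib
from pathlib import Path


def tree_digest(root: str) -> str:
    """Return a SHA-256 digest over the regular files below ``root``.

    Only relative paths and file bytes are hashed, so timestamps,
    ownership and the compression settings of the original archive
    have no influence on the result.
    """
    base = Path(root)
    digest = hashlib.sha256()
    # Sort by POSIX-style relative path so traversal order is deterministic.
    files = sorted(
        (p.relative_to(base).as_posix(), p)
        for p in base.rglob("*")
        if p.is_file() and not p.is_symlink()
    )
    for rel, path in files:
        name = rel.encode("utf-8")
        data = path.read_bytes()
        # Length-prefix each field so bytes cannot shift between a
        # file name and its contents without changing the digest.
        digest.update(len(name).to_bytes(8, "big"))
        digest.update(name)
        digest.update(len(data).to_bytes(8, "big"))
        digest.update(data)
    return digest.hexdigest()
```

Two archives of the same tag that differ only in gzip or tar metadata
would unpack to trees with the same digest under this scheme.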

That gives all the guarantees Zbigniew is after
(verifying correspondence to the git repo contents, provided we trust
the forge) without a single new field, command or anything of the sort.
If unpacking and checking the contents is the fallback for when the
archive hash does not match, you don't even need to store the one bit
that says whether a hash is of the old or the new kind.
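
Roughly, the fallback could work like the sketch below. This is only an
illustration under my own assumptions: SHA-256, plain tar input, and
hypothetical helper names (archive_digest, contents_digest,
verify_source). It reads the tar members in memory instead of unpacking
to disk, which is a simplification.

```python
import hashlib
import tarfile


def archive_digest(path: str) -> str:
    """SHA-256 of the archive file itself (the current kind of check)."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def contents_digest(path: str) -> str:
    """SHA-256 over member names and file bytes, ignoring tar metadata."""
    digest = hashlib.sha256()
    with tarfile.open(path) as tar:
        # Regular files only, in name order, so member order and
        # timestamps inside the archive do not matter.
        members = sorted(
            (m for m in tar.getmembers() if m.isfile()),
            key=lambda m: m.name,
        )
        for member in members:
            name = member.name.encode("utf-8")
            data = tar.extractfile(member).read()
            digest.update(len(name).to_bytes(8, "big") + name)
            digest.update(len(data).to_bytes(8, "big") + data)
    return digest.hexdigest()


def verify_source(path: str, expected: str) -> str:
    """Return which check matched, or raise ValueError if neither did."""
    # Old way first: the stored hash is of the archive bytes.
    if archive_digest(path) == expected:
        return "archive"
    # Fallback: the stored hash is of the archive contents.
    if contents_digest(path) == expected:
        return "contents"
    raise ValueError(f"{path}: neither archive nor contents match {expected}")
```

The stored value stays a single hash string; which kind it is falls out
of which check succeeds.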

-- 
_______________________________________________
devel mailing list -- devel@lists.fedoraproject.org