On 2024-12-04, Simon Josefsson wrote:
> It doesn't strike me as an ideal situation that when rebuilding, let's say,
> the coreutils distributed for trixie+X we may need to download groff
> from say trixie+X-1, or even a groff that wasn't in any stable release.
> This happens if we reproduce packages using build dependencies from
> Built-Using:
>
> Shouldn't we try to reproduce the coreutils binary package in trixie+X
> by rebuilding it using the version of Build-Depends that are in
> trixie+X?

The short of it, no... at least, not anytime soon. :)


> If there is a reproducibility difference between these two approaches,
> isn't that something that should be fixed?

If we rebuild with a different toolchain, I expect to get different
results.

You might get lucky and get reproducibility with small variations in the
toolchain, but in general, it seems unreasonable to expect the same
outputs from different inputs.


We have been doing CI builds against the current toolchain for about 10
years now, where we do two builds in subtly different environments and
compare the results:

  https://tests.reproducible-builds.org/debian/reproducible.html

That at least can test for regressions in the toolchain and discover
reproducibility issues that might be harder to find.

In the vast majority of cases, you need to perform two builds using the
same toolchain to have expectations of reproducibility.
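To make the "build twice, compare" idea concrete, here is a minimal
sketch of the comparison step only. It is not the code the CI
infrastructure actually runs; the function names, the directory layout
and the "*.deb" pattern are assumptions chosen for illustration. It
hashes the artifacts from two build directories and reports which ones
match bit-for-bit:

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def compare_builds(dir_a, dir_b, pattern="*.deb"):
    """Compare artifacts from two builds (hypothetical helper).

    Returns a dict mapping artifact file name to one of
    "reproducible", "differs" or "missing".
    """
    a = {p.name: sha256_of(p) for p in Path(dir_a).glob(pattern)}
    b = {p.name: sha256_of(p) for p in Path(dir_b).glob(pattern)}
    result = {}
    for name in sorted(a.keys() | b.keys()):
        if name not in a or name not in b:
            # Artifact produced by only one of the two builds.
            result[name] = "missing"
        elif a[name] == b[name]:
            result[name] = "reproducible"
        else:
            result[name] = "differs"
    return result
```

A matching hash only tells you the two builds agree; when they differ,
something like diffoscope is still needed to find out why.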


> Compare this with building gcc; it first builds a copy of gcc using the
> system compiler (comparable to the rebuilderd effort) and then rebuilds
> itself using the newly built gcc and comparing the results (no effort on
> this underway today for Debian).

There are some intentional exceptions, such as Mes, which work very hard
to produce bit-for-bit identical results even if the starting toolchain
differs, but I would not expect that of most software most of the time.


> Opinions may differ, but having a state where 100% of Debian trixie+X is
> reproducible only if you have access to potentially a large fraction of
> packages ever published by Debian doesn't sound ideal to me.

There was an effort to archive only the packages relevant to current
trixie, bookworm, etc. which is a *much* smaller set of packages. Since
snapshot.debian.org started working again, this has been at least paused
for the moment, but I think long-term something like this would be an
important safety net to provide some redundancy in case of future
problems with snapshot.debian.org.

Something that would make that much easier for any given release is if
the entire release was rebuilt at least once (ideally several times) in
the development cycle with most packages built against a narrow set of
the toolchain (e.g. rebuild all of the build essential set, and then
rebuild everything from there on out). That would help reduce the
numerous permutations of a given compiler down to a smaller set of
versions, at least.
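As a rough way to see how many toolchain permutations a release
currently depends on, one could count the distinct versions recorded in
the Installed-Build-Depends field of the .buildinfo files. The sketch
below is only an illustration: the helper names are made up, it assumes
each dependency sits on its own continuation line in the form
"name (= version),", and it works on the text of the files rather than
fetching them from anywhere:

```python
import re
from collections import defaultdict


def installed_build_depends(buildinfo_text):
    """Parse Installed-Build-Depends into {package: version}.

    Assumes one "name (= version)," entry per continuation line.
    """
    deps = {}
    in_field = False
    for line in buildinfo_text.splitlines():
        if line.startswith("Installed-Build-Depends:"):
            in_field = True
            continue
        if in_field:
            # A line without leading whitespace starts the next field.
            if not line.startswith((" ", "\t")):
                break
            m = re.match(r"\s*([^\s(,]+)\s*\(=\s*([^)]+)\)", line)
            if m:
                deps[m.group(1)] = m.group(2).strip()
    return deps


def versions_in_use(buildinfo_texts, packages):
    """Collect the distinct versions of selected packages that were
    used across a set of builds (hypothetical helper)."""
    seen = defaultdict(set)
    for text in buildinfo_texts:
        deps = installed_build_depends(text)
        for pkg in packages:
            if pkg in deps:
                seen[pkg].add(deps[pkg])
    return dict(seen)
```

If a full rebuild brought each of those sets down to one or two
versions, the set of packages that needs archiving for a given release
would shrink accordingly.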


> One goal with all this is that we can identify what source code was
> used to build the software we use, and be able to audit that.  It is
> less work to audit all of trixie+X source code than to audit all of
> trixie+X PLUS all required build-dependencies going back to the
> beginning of time, which may include no longer available packages (for
> legal or technical reasons).

Agreed that this is unfortunate... though practically speaking, I fear
this may be a necessity.


> So what I'm suggesting is that it would be useful to have a reproducible
> rebuild effort that publish diffoscope output comparing what we publish
> with a new rebuild of the package using the latest build-dependency
> versions.  And ultimately what should go into policy is that packages
> built this way have to be reproducible.

I suspect this is an extremely complicated engineering challenge, and we
have a hard enough time getting to 100% reproducible with current
expectations... that could be raising the bar at least into the
stratosphere, if not past the confines of our humble solar system... :)


live well,
  vagrant

