On Tue, Apr 15, 2025 at 07:55:08PM +0200, Santiago Vila wrote:
> On 15/4/25 at 14:52, Eric Blake wrote:
> > I'm not sure the exact process Debian uses to do downstream builds,
> > but my guess is that it involves a downstream git repository for their
> > patches to be applied on top of the upstream tarball - and it is the
> > very act of applying those patches from git which can alter the mtime
> > of $PACKAGE.texi, which in turn changes the date that
> > build-aux/mdate-sh installs into version.texi, which then breaks
> > reproducible builds.
> 
> Debian uses the original tarball plus a series of patches in quilt format
> (from another small tarball containing only debian/* files), which are applied
> in a given order before the build.
> 
> So the procedure is not exactly how you describe, but the mechanism by which
> reproducibility may be lost is very similar.

Out of curiosity, do the Debian patches alter doc/m4.texi (and thus
its mtime)?

> 
> > And on that front, I could argue that anyone not
> > building directly from the tarball, and who therefore triggers a
> > different timestamp as part of their downstream process, should be
> > fixing their downstream process to address reproducibility, rather
> > than patching upstream to rip out dates just to make downstream's life
> > easier.
> 
> Maybe a different timestamp is a change too small to regenerate the date
> which is used in the manual as the date in which the manual is published.

mdate-sh's granularity is one day, so any downstream process that
touches mtime more than 24 hours after the original tarball was made
runs into the reproducibility issue: the downstream manual ends up
with a different date than the original upstream tarball.  That is not
necessarily a problem if downstream WANTS a later date, unless
downstream also needs to get the SAME date each time its process is
re-run, even when more than 24 hours have elapsed between runs.
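
To make the one-day granularity concrete, here is a minimal sketch.  It
is NOT the real build-aux/mdate-sh (which derives the date differently);
doc_date is a hypothetical helper, and it assumes GNU date (-r FILE) and
GNU touch (-d STRING):

```shell
#!/bin/sh
# Hypothetical helper, not build-aux/mdate-sh: print FILE's mtime at
# one-day granularity as "DAY MONTH YEAR", in UTC and the C locale.
# Assumes GNU date, where -r FILE reads the file's modification time.
doc_date() {
  LC_ALL=C TZ=UTC0 date -r "$1" '+%e %B %Y' | sed 's/^ *//'
}

# Demo: two mtimes on the same day give the same string; the next day differs.
tmp=$(mktemp)
touch -d '2025-04-15 08:00:00 UTC' "$tmp"; doc_date "$tmp"
touch -d '2025-04-15 20:00:00 UTC' "$tmp"; doc_date "$tmp"
touch -d '2025-04-16 08:00:00 UTC' "$tmp"; doc_date "$tmp"
rm -f "$tmp"
```

A patch step that merely rewrites the file on the same day is invisible
at this granularity; one that runs on a later day is not.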

> 
> Ideally, the date would be chosen and stored in a static file as part of
> your upstream release process, i.e. when you are about to release m4-1.4.20,
> and we should not change such file unless we really meant a different time.

And that ideal is _supposed_ to be met: automake generates the file
doc/version.texi to contain the mtime of doc/$PACKAGE.texi, AND
includes that generated file in the tarball.  If the mtime is
preserved, then the tarball SHOULD be producing the same datestamp for
all subsequent builds of the documentation from that tarball.  So I am
now questioning why Debian ever needed a patch to rip UPDATED out of
the manual in the name of reproducibility.
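
One way to check whether an unpacked (and possibly patched) tree would
trigger a rebuild is to compare mtimes the way make does.  This is only
a sketch: is_stale is a hypothetical helper, and the assumption that
the relevant comparison is doc/m4.texi against doc/stamp-vti reflects my
understanding of the rule automake emits, not something verified here:

```shell
#!/bin/sh
# Hypothetical helper: succeed if SRC is newer than STAMP, i.e. a
# make-style rule "STAMP: SRC" would fire and regenerate its output.
# Uses only POSIX find -newer.
is_stale() {
  [ -n "$(find "$1" -newer "$2" 2>/dev/null)" ]
}

# Possible use in an unpacked tree (paths are assumptions):
#   is_stale doc/m4.texi doc/stamp-vti && echo "version.texi would be rebuilt"
```

If this reports nothing on a freshly unpacked tarball but does after the
patch series is applied, the patches are what moved the mtime.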

But since mtime is fragile, it is still worth discussing ways to make
reproducibility more robust, so that a downstream patch to m4.texi
applied at a later date does not cause version.texi to be rebuilt with
a different date.  In the short term, I may install a one-off solution
in m4's configure.ac (Simon's hack to force the mtime of
doc/$PACKAGE.texi to the last git commit as part of configure seems
nice); but in the long term, a more generic patch to automake and
mdate-sh, usable by ALL packages affected by the same problem, would
be the better outcome.
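
For illustration only, pinning a file's mtime to a commit time could
look roughly like the following.  This is not Simon's actual patch;
pin_mtime is a hypothetical helper, and it assumes GNU touch (which
accepts "@SECONDS" as a date) and, for the commented usage, a git
checkout in which the file is tracked:

```shell
#!/bin/sh
# Hypothetical helper: set FILE's mtime to EPOCH (seconds since
# 1970-01-01 UTC).  Assumes GNU touch, which accepts "@SECONDS".
pin_mtime() {
  touch -d "@$2" "$1"
}

# Possible use from a git checkout (assumes git and a tracked file):
#   epoch=$(git log -1 --format=%ct -- doc/m4.texi)
#   pin_mtime doc/m4.texi "$epoch"
```

Because the timestamp comes from the commit rather than from whenever
the file was last written, re-running the step on a later day yields
the same mtime, and thus the same date in the manual.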

> 
> (Is this the same as "maintainer mode" or maybe those two things
> could be independent?)

I think they are independent; maintainer mode controls how many
generated files are cleaned, and tends to get in the way if you don't
normally check generated files into version control.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org

